The sum-rank metric is a hybrid between the Hamming metric and the rank metric and is suitable for error correction in multishot network coding and distributed storage, as well as for the design of quantum-resistant cryptosystems. In this work, we consider the construction and decoding of folded linearized Reed-Solomon (FLRS) codes, which are shown to be maximum sum-rank distance (MSRD) codes for appropriate parameter choices. We derive an efficient interpolation-based decoding algorithm for FLRS codes that can be used as a list decoder or as a probabilistic unique decoder. The proposed decoding scheme can correct sum-rank errors beyond the unique decoding radius with a computational complexity that is quadratic in the length of the unfolded code. We show how the error-correction capability can be optimized for high-rate codes by an alternative choice of interpolation points. We derive a heuristic upper bound on the decoding failure probability of the probabilistic unique decoder and verify its tightness by Monte Carlo simulations. Further, we study the construction and decoding of folded skew Reed-Solomon codes in the skew metric. To the best of our knowledge, FLRS codes are the first MSRD codes with different block sizes that come along with an efficient decoding algorithm.
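To fix ideas about the metric itself (not about the FLRS decoder), the following minimal sketch computes a sum-rank weight over GF(2): the error is split into blocks of possibly different sizes, each block is viewed as a matrix over the base field, and the weight is the sum of the blocks' ranks. The block values below are purely illustrative.

```python
import numpy as np

def rank_gf2(mat):
    """Rank of a 0/1 matrix over GF(2) via Gaussian elimination."""
    m = np.array(mat, dtype=np.uint8) % 2
    rank, rows, cols = 0, m.shape[0], m.shape[1]
    for col in range(cols):
        pivot = next((r for r in range(rank, rows) if m[r, col]), None)
        if pivot is None:
            continue
        m[[rank, pivot]] = m[[pivot, rank]]        # swap pivot row into place
        for r in range(rows):
            if r != rank and m[r, col]:
                m[r] ^= m[rank]                    # eliminate the column entry
        rank += 1
    return rank

def sum_rank_weight(blocks):
    """Sum-rank weight: the sum of the GF(2)-ranks of the individual blocks."""
    return sum(rank_gf2(b) for b in blocks)

# Two blocks of different sizes (block sizes need not be equal):
blocks = [
    [[1, 0, 1], [0, 1, 1], [1, 1, 0]],   # rank 2 over GF(2)
    [[1, 1], [1, 1]],                    # rank 1
]
print(sum_rank_weight(blocks))           # -> 3
```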
We present a novel extension of the traditional neural network approach to classification tasks, referred to as variational classification (VC). By incorporating latent variable modeling, akin to the relationship between variational autoencoders and traditional autoencoders, we derive a training objective based on the evidence lower bound (ELBO), optimized using an adversarial approach. Our VC model allows for greater flexibility in design choices, in particular the use of class-conditional latent priors in place of the implicit assumptions made by off-the-shelf softmax classifiers. Empirical evaluation on image and text classification datasets demonstrates the effectiveness of our approach in terms of maintaining prediction accuracy while improving other desirable properties such as calibration and adversarial robustness, even when applied to out-of-domain data.
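For reference, the generic evidence lower bound to which the autoencoder analogy refers is recalled below; the abstract does not spell out VC's exact objective, so this is only the standard VAE form with generic symbols $x$, $z$, $q_\phi$, $p_\theta$, and the VC variant replaces the unconditional prior $p(z)$ with a class-conditional one, as stated above.

\[
  \log p_\theta(x) \;\ge\; \mathbb{E}_{q_\phi(z \mid x)}\!\left[\log p_\theta(x \mid z)\right] \;-\; \mathrm{KL}\!\left(q_\phi(z \mid x)\,\|\,p(z)\right).
\]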
What are the relations between the edge weights and the topology in real-world graphs? Given only the topology of a graph, how can we assign realistic weights to its edges based on these relations? Several attempts have been made at edge-weight prediction, where some unknown edge weights are predicted while most edge weights are known. There are also existing works on generating both the topology and the edge weights of weighted graphs. In contrast, we are interested in generating edge weights that are realistic in a macroscopic scope, merely from the topology, which is unexplored and challenging. To this end, we explore and exploit the patterns involving edge weights and topology in real-world graphs. Specifically, we divide each graph into layers, where each layer consists of the edges with weights of at least a threshold. We observe consistent and surprising patterns appearing in multiple layers: the similarity between being adjacent and having high weights, and the nearly-linear growth of the fraction of edges having high weights with the number of common neighbors. We also observe a power-law pattern that connects the layers. Based on these observations, we propose PEAR, an algorithm that assigns realistic edge weights to a given topology. The algorithm relies on only two parameters, preserves all the observed patterns, and produces more realistic weights than baseline methods that use more parameters.
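As a concrete reading of the layer construction and of the second observed pattern, the sketch below (a toy graph; names and weights are made up) builds a weight-threshold layer and measures the fraction of high-weight edges as a function of the number of common neighbors of the endpoints. It only illustrates the quantities being measured, not the PEAR algorithm.

```python
# Toy weighted graph: edge -> weight (purely illustrative, not real data).
weights = {("a", "b"): 5, ("a", "c"): 3, ("b", "c"): 4,
           ("b", "d"): 1, ("c", "d"): 2, ("d", "e"): 1}

adj = {}
for (u, v) in weights:
    adj.setdefault(u, set()).add(v)
    adj.setdefault(v, set()).add(u)

def layer(threshold):
    """A layer: the set of edges whose weight is at least the threshold."""
    return {e for e, w in weights.items() if w >= threshold}

def frac_heavy_by_common_neighbors(threshold):
    """Fraction of edges that are 'heavy' (weight >= threshold), grouped by the
    number of common neighbors of the edge's endpoints."""
    buckets = {}
    for (u, v), w in weights.items():
        cn = len(adj[u] & adj[v])
        total, heavy = buckets.get(cn, (0, 0))
        buckets[cn] = (total + 1, heavy + (w >= threshold))
    return {cn: heavy / total for cn, (total, heavy) in buckets.items()}

print(len(layer(3)))                      # size of the weight >= 3 layer -> 3
print(frac_heavy_by_common_neighbors(3))  # -> {1: 0.5, 2: 1.0, 0: 0.0}
```

In this toy example the fraction of heavy edges indeed grows with the number of common neighbors (0.0, 0.5, 1.0 for 0, 1, 2 common neighbors), mirroring the pattern described above.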
Background & Objective: Biomedical text data are increasingly available for research. Tokenization is an initial step in many biomedical text mining pipelines. Tokenization is the process of parsing an input biomedical sentence (represented as a digital character sequence) into a discrete set of word/token symbols, which convey focused semantic/syntactic meaning. The objective of this study is to explore variation in tokenizer outputs when applied across a series of challenging biomedical sentences. Method: Diaz [2015] introduces 24 challenging example biomedical sentences for comparing tokenizer performance. In this study, we descriptively explore variation in the outputs of eight tokenizers applied to each example biomedical sentence. The tokenizers compared in this study are the NLTK white space tokenizer, the NLTK Penn Treebank tokenizer, the spaCy and SciSpacy tokenizers, the Stanza/Stanza-Craft tokenizers, the UDPipe tokenizer, and R-tokenizers. Results: For many examples, tokenizers performed similarly effectively; however, for certain examples, there was meaningful variation in the returned outputs. The white space tokenizer often performed differently than the other tokenizers. We observed performance similarities for tokenizers implementing rule-based systems (e.g., pattern matching and regular expressions) and tokenizers implementing neural architectures for token classification. Oftentimes, the challenging tokens resulting in the greatest variation in outputs are those words that convey substantive and focused biomedical/clinical meaning (e.g., x-ray, IL-10, TCR/CD3, CD4+ CD8+, and (Ca2+)-regulated). Conclusion: When state-of-the-art, open-source tokenizers from Python and R were applied to a series of challenging biomedical example sentences, we observed subtle variation in the returned outputs.
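A minimal way to reproduce this kind of side-by-side comparison is sketched below for two of the rule-based NLTK tokenizers named above; the sentence is a hypothetical composite built from the challenging tokens quoted in the Results, not one of the 24 sentences from Diaz [2015].

```python
from nltk.tokenize import WhitespaceTokenizer, TreebankWordTokenizer

# Illustrative sentence containing several of the difficult token types.
sentence = "IL-10 and (Ca2+)-regulated signalling in CD4+ CD8+ cells was assessed by x-ray."

tokenizers = {
    "whitespace": WhitespaceTokenizer(),    # splits on whitespace only
    "treebank":   TreebankWordTokenizer(),  # Penn Treebank rules
}

for name, tok in tokenizers.items():
    print(f"{name:>10}: {tok.tokenize(sentence)}")
```

The whitespace tokenizer keeps strings such as "(Ca2+)-regulated" intact, while rule-based tokenizers typically separate the surrounding punctuation, which is exactly the kind of output variation examined in this study.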
Few-weight codes over finite chain rings are associated with combinatorial objects such as strongly regular graphs (SRGs), strongly walk-regular graphs (SWRGs) and finite geometries, and are also widely used in data storage systems and secret sharing schemes. The first objective of this paper is to characterize all possible parameters of Plotkin-optimal two-homogeneous weight regular projective codes over finite chain rings, as well as their weight distributions. We show the existence of codes with these parameters by constructing an infinite family of two-homogeneous weight codes. The Gray images of these codes have the same weight distribution as that of the two-weight codes of type SU1 in the sense of Calderbank and Kantor (Bull Lond Math Soc 18: 97-122, 1986). Further, combining our constructions with some known results, we also obtain three-homogeneous weight regular projective codes over finite chain rings. Finally, we study applications of our constructed codes in secret sharing schemes and graph theory. In particular, infinite families of SRGs and SWRGs with non-trivial parameters are obtained.
We present Modular Polynomial (MP) Codes for Secure Distributed Matrix Multiplication (SDMM). The construction is based on the observation that one can decode certain proper subsets of the coefficients of a polynomial with fewer evaluations than are necessary to interpolate the entire polynomial. We also present Generalized Gap Additive Secure Polynomial (GGASP) codes. Both MP and GGASP codes are shown experimentally to perform favorably in terms of recovery threshold when compared to other comparable polynomial codes for SDMM that use the grid partition. Both MP and GGASP codes achieve the recovery threshold of Entangled Polynomial Codes for robustness against stragglers, but MP codes can decode below this recovery threshold depending on which set of worker nodes fails. The decoding complexity of MP codes is shown to be lower than that of other approaches in the literature, because the user is not required to interpolate an entire polynomial.
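The observation underlying MP codes can be illustrated in isolation: coefficients of a polynomial $f$ that are alone in their residue class of exponents modulo $s$ are already determined by $s$ evaluations (at the $s$-th roots of unity), i.e., by $f \bmod (x^s - 1)$, even though interpolating all of $f$ would need $\deg f + 1$ evaluations. The sketch below works over the complex numbers for simplicity (SDMM schemes operate over finite fields) and uses a toy polynomial, not the MP-code encoding polynomial.

```python
import numpy as np

coeffs = {0: 3.0, 5: -2.0, 7: 1.0}   # nonzero coefficients a_j at exponents j; deg f = 7
s = 4                                # number of evaluations, fewer than deg f + 1 = 8

# Evaluate f at the s-th roots of unity.
omega = np.exp(2j * np.pi / s)
evals = [sum(a * omega ** (t * j) for j, a in coeffs.items()) for t in range(s)]

# Interpolating f mod (x^s - 1) from these s values yields, for each residue
# class r, the sum of the coefficients whose exponent is congruent to r mod s.
residue_sums = np.fft.fft(np.array(evals)) / s
print(np.round(residue_sums.real, 6))   # -> [ 3. -2.  0.  1.]
# Exponents 5 and 7 are alone in classes 1 and 3, so a_5 and a_7 are recovered
# exactly from only 4 evaluations.
```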
We propose a self-stabilizing leader election (SS-LE) protocol on ring networks in the population protocol model. Given rough knowledge $\psi = \lceil \log n \rceil + O(1)$ of the population size $n$, the proposed protocol lets the population reach a safe configuration within $O(n^2 \log n)$ steps with high probability, starting from any configuration. Thereafter, the population keeps the unique leader forever. Since no protocol solves SS-LE in $o(n^2)$ steps with high probability, the convergence time is near-optimal: the gap is only an $O(\log n)$ multiplicative factor. The protocol uses only $\mathrm{polylog}(n)$ states. There are two state-of-the-art algorithms in the current literature that solve SS-LE on ring networks: the first uses a polynomial number of states and solves SS-LE in $O(n^2)$ steps, whereas the second requires exponential time but uses only a constant number of states. Our proposed algorithm provides an excellent middle ground between these two.
Characterizing how neural network depth, width, and dataset size jointly impact model quality is a central problem in deep learning theory. Here we give a complete solution in the special case of linear networks with output dimension one, trained using zero-noise Bayesian inference with Gaussian weight priors and mean squared error as the negative log-likelihood. For any training dataset, network depth, and hidden layer widths, we find non-asymptotic expressions for the predictive posterior and Bayesian model evidence in terms of Meijer-G functions, a class of meromorphic special functions of a single complex variable. Through novel asymptotic expansions of these Meijer-G functions, a rich new picture of the joint role of depth, width, and dataset size emerges. We show that linear networks make provably optimal predictions at infinite depth: the posterior of infinitely deep linear networks with data-agnostic priors is the same as that of shallow networks with evidence-maximizing data-dependent priors. This yields a principled reason to prefer deeper networks when priors are forced to be data-agnostic. Moreover, we show that with data-agnostic priors, Bayesian model evidence in wide linear networks is maximized at infinite depth, elucidating the salutary role of increased depth for model selection. Underpinning our results is a novel emergent notion of effective depth, given by the number of hidden layers times the number of data points divided by the network width; this determines the structure of the posterior in the large-data limit.
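The effective-depth heuristic in the final sentence admits a one-line worked instance; the symbols $L$ (number of hidden layers), $P$ (number of data points), and $N$ (network width) are our own labels for the quantities named above, and the numerical values are purely illustrative.

\[
  \lambda_{\mathrm{eff}} \;=\; \frac{L \, P}{N}, \qquad \text{e.g. } L = 10,\; P = 1000,\; N = 500 \;\Longrightarrow\; \lambda_{\mathrm{eff}} = \frac{10 \cdot 1000}{500} = 20 .
\]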
We study fair mechanisms for the (asymmetric) one-sided allocation problem with m items and n multi-unit demand agents with additive, unit-sum valuations. The symmetric case (m=n), the one-sided matching problem, has been studied extensively for the class of unit demand agents, in particular with respect to the folklore Random Priority mechanism and the Probabilistic Serial mechanism, introduced by Bogomolnaia and Moulin. Under the assumption of unit-sum valuation functions, Christodoulou et al. proved that the price of anarchy is $\Theta(\sqrt{n})$ in the one-sided matching problem for both the Random Priority and Probabilistic Serial mechanisms. Whilst both Random Priority and Probabilistic Serial are ordinal mechanisms, these approximation guarantees are the best possible even for the broader class of cardinal mechanisms. To extend these results to the general setting, there are two technical obstacles. One, asymmetry ($m\neq n$) is problematic, especially when the number of items is much greater than the number of agents. Two, it is necessary to study multi-unit demand agents rather than simply unit demand agents. Our approach is to study a cardinal variant of Probabilistic Serial, which we call Cardinal Probabilistic Serial. We present structural theorems for this mechanism and use them to obtain bounds on the price of anarchy. Our first main result is an upper bound of $O(\sqrt{n}\cdot \log m)$ on the price of anarchy for the asymmetric one-sided allocation problem with multi-unit demand agents. This upper bound applies to Probabilistic Serial as well, and there is a complementary lower bound of $\Omega(\sqrt{n})$ for any fair mechanism. Our second main result is that the price of anarchy degrades with the number of items. Specifically, a logarithmic dependence on the number of items is necessary for both mechanisms.
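For context, a minimal sketch of the classic Probabilistic Serial (simultaneous-eating) mechanism of Bogomolnaia and Moulin is given below for the symmetric unit-demand case with complete preference orders; the Cardinal Probabilistic Serial variant studied in this work differs, so this is only intended to fix ideas about the baseline mechanism.

```python
def probabilistic_serial(prefs, tol=1e-9):
    """prefs[a]: agent a's preference order over items, best item first.
    Returns alloc[a][item] = probability that agent a receives the item."""
    items = {i for order in prefs.values() for i in order}
    remaining = {i: 1.0 for i in items}                 # unit supply per item
    alloc = {a: {i: 0.0 for i in items} for a in prefs}
    while any(r > tol for r in remaining.values()):
        # Every agent simultaneously eats its favourite item with supply left.
        eating = {}
        for a in prefs:
            fav = next((i for i in prefs[a] if remaining[i] > tol), None)
            if fav is not None:
                eating[a] = fav
        if not eating:
            break
        rates = {}
        for i in eating.values():
            rates[i] = rates.get(i, 0) + 1
        # Advance time until the first currently-eaten item is exhausted.
        dt = min(remaining[i] / rates[i] for i in rates)
        for a, i in eating.items():
            alloc[a][i] += dt
            remaining[i] -= dt
    return alloc

prefs = {"Ann": ["x", "y", "z"], "Bob": ["x", "z", "y"], "Cat": ["y", "x", "z"]}
print(probabilistic_serial(prefs))
# Ann: x 0.5, y 0.25, z 0.25;  Bob: x 0.5, z 0.5;  Cat: y 0.75, z 0.25
```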
We consider the well-studied Robust $(k, z)$-Clustering problem, which generalizes the classic $k$-Median, $k$-Means, and $k$-Center problems. Given a constant $z\ge 1$, the input to Robust $(k, z)$-Clustering is a set $P$ of $n$ weighted points in a metric space $(M,\delta)$ and a positive integer $k$. Further, each point belongs to one (or more) of $m$ different groups $S_1,S_2,\ldots,S_m$. Our goal is to find a set $X$ of $k$ centers such that $\max_{i \in [m]} \sum_{p \in S_i} w(p) \delta(p,X)^z$ is minimized. This problem arises in the domains of robust optimization [Anthony, Goyal, Gupta, Nagarajan, Math. Oper. Res. 2010] and algorithmic fairness. For polynomial-time computation, an approximation factor of $O(\log m/\log\log m)$ is known [Makarychev, Vakilian, COLT $2021$], which is tight under a plausible complexity assumption even in line metrics. For FPT time, there is a $(3^z+\epsilon)$-approximation algorithm, which is tight under Gap-ETH [Goyal, Jaiswal, Inf. Proc. Letters, 2023]. Motivated by the tight lower bounds for general discrete metrics, we focus on \emph{geometric} spaces such as the (discrete) high-dimensional Euclidean setting and metrics of low doubling dimension, which play an important role in data analysis applications. First, for a universal constant $\eta_0 >0.0006$, we devise a $3^z(1-\eta_{0})$-factor FPT approximation algorithm for discrete high-dimensional Euclidean spaces, thereby bypassing the lower bound for general metrics. We complement this result by showing that even the special case of $k$-Center in dimension $\Theta(\log n)$ is $(\sqrt{3/2}- o(1))$-hard to approximate for FPT algorithms. Finally, we complete the FPT approximation landscape by designing an FPT $(1+\epsilon)$-approximation scheme (EPAS) for metrics of sub-logarithmic doubling dimension.
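The objective above can be evaluated directly on a small instance; the sketch below uses a toy one-dimensional (line-metric) example with made-up points, weights, groups, and centers, purely to make the min-max structure concrete.

```python
def robust_cost(points, weights, groups, centers, z):
    """Robust (k, z)-Clustering objective: the maximum over groups S_i of
    sum_{p in S_i} w(p) * dist(p, X)^z, where dist(p, X) is the distance
    from p to its closest center in X (here: points on the real line)."""
    def dist(p):
        return min(abs(points[p] - c) for c in centers)
    return max(sum(weights[p] * dist(p) ** z for p in S) for S in groups)

points  = {"a": 0.0, "b": 1.0, "c": 4.0, "d": 9.0}
weights = {"a": 1.0, "b": 2.0, "c": 1.0, "d": 1.0}
groups  = [{"a", "b"}, {"c", "d"}]        # a point may belong to several groups
print(robust_cost(points, weights, groups, centers=[1.0, 8.0], z=2))  # -> 10.0
```

With a single group containing all points this reduces to the $k$-Means ($z=2$) or $k$-Median ($z=1$) objective, and with singleton groups and unit weights it reduces to a $k$-Center-type objective, matching the generalization stated above.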
We introduce new techniques for the parameterized verification of disjunctive timed networks (DTNs), i.e., networks of timed automata (TAs) that communicate via location guards, which enable a transition only if at least one process is in a given location. This computational model has been considered in the literature before; example applications are gossiping clock synchronization protocols and planning problems. We address the minimum-time reachability problem (minreach) in DTNs and show how to solve it efficiently with a novel zone-graph algorithm. We further show that solving minreach allows us to construct a summary TA that captures exactly the possible behaviors of a single TA within a DTN of arbitrary size. The combination of these two results enables the parameterized verification of DTNs while avoiding the construction of the exponential-size cutoff system required by existing results. We have implemented our techniques, and experiments show their practicality.