The symbol-pair read channel was first proposed by Cassuto and Blaum; later, Yaakobi et al. generalized it to the $b$-symbol read channel. These channels are motivated by the limitations of the reading process in high-density data storage systems. One main task in $b$-symbol coding theory is to determine the $b$-symbol weight hierarchy of codes. In this paper, we study the $b$-symbol weight hierarchy of the Kasami codes, which are well known for their applications in constructing sequences with optimal correlation magnitudes. The complete symbol-pair weight distribution of the Kasami codes is determined.
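For readers unfamiliar with the notion, the following minimal sketch computes the standard $b$-symbol weight of a vector (the textbook definition of Cassuto-Blaum for $b=2$ and its $b$-symbol generalization; the function name is ours, and this is an illustration rather than code from the paper):

```python
def b_symbol_weight(x, b=2):
    """b-symbol weight of a vector x over an alphabet with zero element 0.

    The b-symbol read vector of x = (x_0, ..., x_{n-1}) consists of the n
    cyclically consecutive b-tuples (x_i, ..., x_{i+b-1}); its weight is
    the number of tuples that are not all-zero.  b = 2 gives the
    symbol-pair weight.
    """
    n = len(x)
    return sum(
        any(x[(i + j) % n] != 0 for j in range(b))
        for i in range(n)
    )

# Example: the binary vector 1000100 has Hamming weight 2 but
# symbol-pair weight 4 (nonzero pairs at positions 0, 3, 4, and 6).
print(b_symbol_weight([1, 0, 0, 0, 1, 0, 0], b=2))  # -> 4
```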
Reed-Muller (RM) codes are known for their good minimum distance. One can use their structure to construct polar-like codes with good distance properties by choosing the information set as the rows of the polarization matrix with the highest Hamming weight, instead of the most reliable synthetic channels. However, the information length options of RM codes are quite limited due to their specific structure. In this work, we present sufficient conditions for increasing the information length by at least one bit for some underlying RM codes, in order to obtain pre-transformed polar-like codes with the same minimum distance as lower-rate codes. Moreover, our findings are combined with the method presented in [1] to further reduce the number of minimum-weight codewords. Numerical results show that the designed codes perform close to the meta-converse bound at short blocklengths and better than polarization-adjusted convolutional (PAC) codes with the same parameters.
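The RM-like design rule mentioned above can be made concrete with a short sketch (our illustration, not the paper's construction): build the polarization matrix $G_N = F^{\otimes n}$ with the standard kernel $F$ and select the $K$ rows of largest Hamming weight as the information set.

```python
import numpy as np

# Standard polar kernel; G_N = F^{\otimes n} has N = 2^n rows.
F = np.array([[1, 0], [1, 1]], dtype=int)

def polarization_matrix(n):
    G = np.array([[1]], dtype=int)
    for _ in range(n):
        G = np.kron(G, F)   # entries stay 0/1 (Kronecker products only)
    return G

def rm_like_info_set(n, K):
    """K rows of largest Hamming weight (ties broken by row index)."""
    weights = polarization_matrix(n).sum(axis=1)
    order = sorted(range(2**n), key=lambda i: (-weights[i], i))
    return sorted(order[:K])

# For N = 8, the four heaviest rows (weights 8, 4, 4, 4) are picked,
# in contrast to a reliability-based polar information set.
print(rm_like_info_set(3, 4))  # -> [3, 5, 6, 7]
```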
Tensors, i.e., multilinear functions, are a fundamental building block of machine learning algorithms. To train on large datasets, it is common practice to distribute the computation amongst workers. However, stragglers and other faults can severely impact the performance and overall training time. A novel strategy to mitigate these failures is coded computation. We introduce a new metric of analysis, the typical recovery threshold, which focuses on the most likely event, and we provide a novel construction of distributed coded tensor operations that is optimal with respect to this measure. We show that our general framework encompasses many other computational schemes and metrics as special cases. In particular, we prove that the recovery threshold and the tensor rank are recovered as special cases of the typical recovery threshold when the probability of noise, i.e., of a fault, equals zero, thereby providing, serendipitously, a noisy generalization of noiseless computation. Far from being purely theoretical constructions, these definitions lead us to practical random code constructions, namely locally random p-adic alloy codes, which are optimal with respect to these measures. Experiments conducted on Amazon EC2 establish that these codes are faster and more numerically stable than many benchmark computation schemes in practice, as the theory predicts.
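The recovery-threshold idea itself is easy to demonstrate with a generic random linear scheme (illustration only; the paper's locally random p-adic alloy codes are a different, more refined construction): split $A$ into $m$ blocks, hand each of $P$ workers a random combination, and decode $AB$ from any $m$ returned results.

```python
import numpy as np

rng = np.random.default_rng(0)
m, P = 3, 5                        # blocks, workers; threshold is m
A = rng.standard_normal((6, 4))
B = rng.standard_normal((4, 2))
blocks = np.split(A, m)            # m row blocks of A

G = rng.standard_normal((P, m))    # random encoding matrix
encoded = [sum(G[p, i] * blocks[i] for i in range(m)) for p in range(P)]

# Each worker computes its coded product; suppose workers {0, 2, 4} return.
results = {p: encoded[p] @ B for p in (0, 2, 4)}
idx = sorted(results)
Ginv = np.linalg.inv(G[idx, :])    # any m rows are invertible w.h.p.
decoded = np.vstack([
    sum(Ginv[i, j] * results[idx[j]] for j in range(m)) for i in range(m)
])
assert np.allclose(decoded, A @ B)  # A @ B recovered from m of P workers
```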
A multilevel coded modulation scheme is studied that uses binary polar codes and Honda-Yamamoto probabilistic shaping. The scheme is shown to achieve the capacity of discrete memoryless channels whose input alphabet size is a power of two. The performance of finite-length implementations is compared to polar-coded probabilistic amplitude shaping and constant-composition distribution matching.
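As a minimal sketch of the multilevel-coding idea behind such a scheme (component polar codes and the Honda-Yamamoto shaping step are omitted; the labeling rule below is a common choice, not necessarily the paper's): $m$ binary codewords, one per bit level, jointly address $2^m$-ASK symbols.

```python
import numpy as np

def map_levels_to_ask(levels):
    """Map m bit-level codewords (rows) to 2^m-ASK amplitudes (columns)."""
    levels = np.asarray(levels)                  # shape (m, n), entries 0/1
    m, _ = levels.shape
    idx = sum(levels[j] << j for j in range(m))  # natural binary labeling
    return 2 * idx - (2**m - 1)                  # odd amplitudes in [-(2^m-1), 2^m-1]

# Three bit levels -> 8-ASK; in a multilevel scheme each row would be a
# separate (polar) codeword, decoded level by level at the receiver.
c = np.array([[0, 1, 1, 0],
              [1, 0, 1, 0],
              [0, 1, 0, 1]])
print(map_levels_to_ask(c))  # -> [-3  3 -1  1]
```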
In this paper, we focus on modules over a finite chain ring $\mathcal{R}$ of size $q^s$. We compute the density of free modules of $\mathcal{R}^n$, treating the asymptotics in $n$, $q$, and $s$ separately. In particular, we focus on two cases: one where the length of the module is fixed and one where the rank of the module is fixed. In both cases, the density results can be bounded via the Andrews-Gordon identities. We also study the asymptotic behaviour of modules generated by random matrices over $\mathcal{R}$. Since linear codes over $\mathcal{R}$ are submodules of $\mathcal{R}^n$, we obtain direct implications for coding theory. For example, we show that random codes achieve the Gilbert-Varshamov bound with high probability.
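For context, the classical $q$-ary Gilbert-Varshamov bound referred to here (stated in its standard form over $\mathbb{F}_q$; the paper's statement over $\mathcal{R}$ may differ in detail) guarantees the existence of a code $C \subseteq \mathbb{F}_q^n$ with minimum distance at least $d$ and size
\[
|C| \;\ge\; \frac{q^n}{\sum_{j=0}^{d-1}\binom{n}{j}(q-1)^j},
\]
so "achieving the bound with high probability" means that random codes attain the corresponding rate $1 - h_q(d/n) + o(1)$, where $h_q$ denotes the $q$-ary entropy function.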
We establish the capacity of a class of communication channels introduced in [1]. The $n$-letter input from a finite alphabet is passed through a discrete memoryless channel (DMC) $P_{Z|X}$, and then the output $n$-letter sequence is uniformly permuted. We show that the maximal communication rate (normalized by $\log n$) equals $\frac{1}{2}(\operatorname{rank}(P_{Z|X})-1)$ whenever $P_{Z|X}$ is strictly positive. This is done by establishing a converse bound that matches the achievability result of [1]. The two main ingredients of our proof are (1) a sharp bound on the entropy of a uniformly sampled vector from a type class observed through a DMC, and (2) a covering $\epsilon$-net of the probability simplex with the Kullback-Leibler divergence as the metric. In addition to strictly positive DMCs, we also find the noisy permutation capacity for $q$-ary erasure channels, the Z-channel, and others.
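As a worked instance of this formula (our illustration, using only the result stated above): any strictly positive binary-input, binary-output channel, e.g. a BSC with crossover probability $p \in (0, 1/2)$, has $\operatorname{rank}(P_{Z|X}) = 2$, so the number of messages $M_n$ that can be communicated reliably satisfies
\[
\lim_{n \to \infty} \frac{\log M_n}{\log n} \;=\; \frac{1}{2}\,(2 - 1) \;=\; \frac{1}{2},
\]
while a strictly positive $q$-ary channel with full-rank transition matrix yields the rate $(q-1)/2$.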
Models defined by moment conditions are at the center of structural econometric estimation, but economic theory is mostly agnostic about moment selection. While a large pool of valid moments can potentially improve estimation efficiency, even a few invalid ones may undermine consistency. This paper investigates the empirical likelihood estimation of such moment-defined models in high-dimensional settings. We propose a penalized empirical likelihood (PEL) estimator and establish its oracle property with consistent detection of invalid moments. The PEL estimator is asymptotically normal, and a projected PEL procedure further eliminates its asymptotic bias and provides a more accurate normal approximation to the finite-sample behavior. Simulation exercises demonstrate the excellent numerical performance of these methods in estimation and inference.
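To fix ideas, a generic penalized empirical likelihood criterion of this type (our schematic form; the paper's exact penalty and parameterization may differ) maximizes
\[
\ell_p(\theta) \;=\; \max_{w \in \Delta_n:\; \sum_{i=1}^n w_i\, g(Z_i,\theta) = 0} \;\sum_{i=1}^n \log(n w_i) \;-\; n \sum_{j} P_{\lambda}\!\big(|\theta_j|\big),
\]
where $g(Z_i,\theta)$ collects the moment functions, $\Delta_n$ is the probability simplex, and $P_\lambda$ is a sparsity-inducing penalty; in moment-selection versions, auxiliary parameters are attached to suspect moments so that a nonzero estimate flags an invalid moment.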
It has been known for decades that computer-based systems cannot be understood without a concept of modularization and decomposition. We suggest a universal, expressive, intuitively attractive composition operator for Petri nets, combined with a refinement concept and an algebraic representation of nets and their composition. Case studies show, by example, how large systems can be composed from tiny net snippets. In the future, more field studies are needed to better understand the consequences of the proposed ideas in the real world.
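A toy sketch of composition by gluing (our simplification only; the paper's operator comes with labels, refinement, and an algebra that this omits): two net snippets are joined by identifying elements that share a name, so the common place `buffer` becomes their interface.

```python
def compose(n1, n2):
    """Compose two nets by gluing equally named places/transitions."""
    return {
        'places':      n1['places'] | n2['places'],
        'transitions': n1['transitions'] | n2['transitions'],
        'arcs':        n1['arcs'] | n2['arcs'],   # arcs as (source, target)
    }

producer = {'places': {'ready', 'buffer'},
            'transitions': {'produce'},
            'arcs': {('ready', 'produce'), ('produce', 'buffer')}}
consumer = {'places': {'buffer', 'done'},
            'transitions': {'consume'},
            'arcs': {('buffer', 'consume'), ('consume', 'done')}}

# 'buffer' appears once in the composition, linking the two snippets.
print(compose(producer, consumer))
```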
Graph Neural Networks (GNNs) draw their strength from explicitly modeling the topological information of structured data. However, existing GNNs have limited capability to capture hierarchical graph representations, which play an important role in graph classification. In this paper, we propose the hierarchical graph capsule network (HGCN), which can jointly learn node embeddings and extract graph hierarchies. Specifically, disentangled graph capsules are established by identifying heterogeneous factors underlying each node, so that their instantiation parameters represent different properties of the same entity. To learn the hierarchical representation, HGCN characterizes the part-whole relationship between lower-level capsules (parts) and higher-level capsules (wholes) by explicitly considering the structural information among the parts. Experimental studies demonstrate the effectiveness of HGCN and the contribution of each component.
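A minimal sketch of disentangled node capsules (our reading of the idea, not the authors' code): $K$ linear heads split each node embedding into $K$ capsules, and the standard squash nonlinearity makes each capsule's length encode the presence of one latent factor.

```python
import torch

class DisentangledCapsules(torch.nn.Module):
    def __init__(self, in_dim, num_capsules, capsule_dim):
        super().__init__()
        # One projection head per latent factor underlying a node.
        self.heads = torch.nn.ModuleList(
            [torch.nn.Linear(in_dim, capsule_dim) for _ in range(num_capsules)]
        )

    @staticmethod
    def squash(s, eps=1e-8):
        # Capsule nonlinearity: vector length in (0, 1) encodes confidence.
        norm2 = (s * s).sum(dim=-1, keepdim=True)
        return (norm2 / (1.0 + norm2)) * s / torch.sqrt(norm2 + eps)

    def forward(self, x):                    # x: (num_nodes, in_dim)
        caps = torch.stack([h(x) for h in self.heads], dim=1)
        return self.squash(caps)             # (num_nodes, K, capsule_dim)

caps = DisentangledCapsules(16, num_capsules=4, capsule_dim=8)
print(caps(torch.randn(5, 16)).shape)        # torch.Size([5, 4, 8])
```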
When federated learning is adopted among competitive agents with siloed datasets, the agents are self-interested and participate only if they are fairly rewarded. To encourage the adoption of federated learning, this paper follows the management principle that more contributions should lead to more rewards. We propose a novel hierarchically fair federated learning (HFFL) framework, under which agents are rewarded in proportion to their pre-negotiated contribution levels; HFFL+ extends this framework to accommodate heterogeneous models. Theoretical analysis and empirical evaluation on several datasets confirm the efficacy of our frameworks in upholding fairness and thus facilitating federated learning in competitive settings.
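The proportional reward rule is simple to state in code (toy illustration only; HFFL also specifies how contribution levels are negotiated and enforced): each agent's payoff share equals its pre-negotiated level divided by the total, so contributing more can never pay less.

```python
def hffl_rewards(contribution_levels, total_reward):
    """Split total_reward in proportion to pre-negotiated contribution levels."""
    total = sum(contribution_levels.values())
    return {agent: total_reward * level / total
            for agent, level in contribution_levels.items()}

# Agent names and numbers are hypothetical.
print(hffl_rewards({'agent_a': 3, 'agent_b': 2, 'agent_c': 1}, total_reward=60.0))
# -> {'agent_a': 30.0, 'agent_b': 20.0, 'agent_c': 10.0}
```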
Named entity recognition (NER) models are typically based on a bidirectional LSTM (BiLSTM) architecture. Their sequential nature and single-sentence input prevent the full utilization of global information from a larger scope, not only the entire sentence but also the entire document (dataset). In this paper, we address these two deficiencies and propose a model augmented with a hierarchical contextualized representation: a sentence-level representation and a document-level representation. At the sentence level, we take the different contributions of words within a single sentence into consideration to enhance the sentence representation learned from an independent BiLSTM via a label-embedding attention mechanism. At the document level, a key-value memory network is adopted to record document-aware information for each unique word, sensitive to the similarity of its contexts. The two levels of hierarchical contextualized representations are fused with each input token embedding and the corresponding BiLSTM hidden state, respectively. Experimental results on three benchmark NER datasets (the CoNLL-2003 and OntoNotes 5.0 English datasets and the CoNLL-2002 Spanish dataset) show that our model establishes new state-of-the-art results.
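A minimal sketch of one form of label-embedding attention (our simplification, not the paper's exact mechanism): word-label affinities give each word an importance score, and the sentence representation is the resulting attention-weighted sum of the BiLSTM hidden states.

```python
import torch

def sentence_rep(hidden, label_emb):
    """hidden: (n_words, d) BiLSTM states; label_emb: (n_labels, d)."""
    affinity = hidden @ label_emb.t()            # (n_words, n_labels)
    scores = affinity.max(dim=1).values          # best-matching label per word
    alpha = torch.softmax(scores, dim=0)         # attention over words
    return (alpha.unsqueeze(1) * hidden).sum(0)  # (d,) sentence vector

h = torch.randn(7, 32)       # states for a 7-word sentence (dims are ours)
labels = torch.randn(9, 32)  # embeddings for 9 hypothetical NER labels
print(sentence_rep(h, labels).shape)  # torch.Size([32])
```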