Polar codes are normally designed based on the reliability of the sub-channels in the polarized vector channel. There are various methods with diverse complexity and accuracy to evaluate the reliability of the sub-channels. However, designing polar codes solely based on the sub-channel reliability may result in poor Hamming distance properties. In this work, we propose a different approach to designing the information set for polar codes and PAC codes, where the objective is to reduce the number of minimum-weight codewords (a.k.a. the error coefficient) of a code designed for maximum reliability. This approach is based on the coset-wise characterization of the rows of the polar transform $\mathbf{G}_N$ involved in the formation of the minimum-weight codewords. Our analysis capitalizes on the properties of the polar transform based on its row and column indices. The numerical results show that the designed codes outperform PAC codes and CRC-Polar codes at practical block error rates of $10^{-2}$ to $10^{-3}$. Furthermore, a by-product of the combinatorial properties analyzed in this paper is an alternative enumeration method of the minimum-weight codewords.
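To make the notions of row index and error coefficient concrete, here is a minimal Python sketch. It assumes $\mathbf{G}_N$ is the $n$-fold Kronecker power of the $2\times 2$ kernel with rows $(1,0)$ and $(1,1)$ in natural row order (no bit-reversal), and it uses a small hypothetical information set chosen only for illustration. The count is done by brute force over all messages, which is not the enumeration method of the paper and is feasible only for tiny codes.

```python
from itertools import product


def polar_transform(n):
    """Return G_N (N = 2**n) as a list of 0/1 rows, natural order (assumption).

    Entry (i, j) of the n-fold Kronecker power of [[1, 0], [1, 1]] is 1
    exactly when the binary digits of j are a subset of those of i.
    """
    size = 1 << n
    return [[1 if (j & i) == j else 0 for j in range(size)] for i in range(size)]


def min_weight_count(G, info_set):
    """Brute-force the minimum weight and the number of codewords attaining it."""
    length = len(G)
    best, count = None, 0
    for msg in product([0, 1], repeat=len(info_set)):
        if not any(msg):
            continue  # skip the all-zero codeword
        codeword = [0] * length
        for bit, row in zip(msg, info_set):
            if bit:
                codeword = [c ^ g for c, g in zip(codeword, G[row])]
        weight = sum(codeword)
        if best is None or weight < best:
            best, count = weight, 1
        elif weight == best:
            count += 1
    return best, count


G8 = polar_transform(3)
# Row i has Hamming weight 2**(number of ones in the binary expansion of i).
row_weights = [sum(row) for row in G8]
# Hypothetical information set for N = 8, K = 4 (illustration only).
d_min, error_coefficient = min_weight_count(G8, [3, 5, 6, 7])
```

Swapping one index of the information set for another and re-running `min_weight_count` shows how the error coefficient depends on which rows are selected, which is the quantity the design approach above aims to reduce.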
Let $W$ be a binary-input memoryless symmetric (BMS) channel with Shannon capacity $I(W)$ and fix any $\alpha > 0$. We construct, for any sufficiently small $\delta > 0$, binary linear codes of block length $O(1/\delta^{2+\alpha})$ and rate $I(W)-\delta$ that enable reliable communication on $W$ with quasi-linear time encoding and decoding. Shannon's noisy coding theorem established the \emph{existence} of such codes (without efficient constructions or decoding) with block length $O(1/\delta^2)$. This quadratic dependence on the gap $\delta$ to capacity is known to be best possible. Our result thus yields a constructive version of Shannon's theorem with near-optimal convergence to capacity as a function of the block length. This resolves a central theoretical challenge associated with the attainment of Shannon capacity. Previously such a result was only known for the erasure channel. Our codes are a variant of Ar{\i}kan's polar codes based on multiple carefully constructed local kernels, one for each intermediate channel that arises in the decoding. A crucial ingredient in the analysis is a strong converse of the noisy coding theorem when communicating using random linear codes on arbitrary BMS channels. Our converse theorem shows extreme unpredictability of even a single message bit for random coding at rates slightly above capacity.
We study quasi-Monte Carlo (QMC) integration of smooth functions defined over the multi-dimensional unit cube. Inspired by a recent work of Pan and Owen, we study a new construction-free median QMC rule which can exploit the smoothness and the weights of function spaces adaptively. For weighted Korobov spaces, we draw a sample of $r$ independent generating vectors of rank-1 lattice rules, compute the integral estimate for each, and approximate the true integral by the median of these $r$ estimates. For weighted Sobolev spaces, we use the same approach but with the rank-1 lattice rules replaced by high-order polynomial lattice rules. A major advantage over the existing approaches is that we do not need to construct good generating vectors by a computer search algorithm, while our median QMC rule achieves almost the optimal worst-case error rate for the respective function space with any smoothness and weights, with a probability that converges to 1 exponentially fast as $r$ increases. Numerical experiments illustrate and support our theoretical findings.
This work presents a fast successive-cancellation list flip (Fast-SCLF) decoding algorithm for polar codes that addresses the high latency issue associated with the successive-cancellation list flip (SCLF) decoding algorithm. We first propose a bit-flipping strategy tailored to the state-of-the-art fast successive-cancellation list (FSCL) decoding that avoids tree-traversal in the binary tree representation of SCLF, thus reducing the latency of the decoding process. We then derive a parameterized path selection error model to accurately estimate the bit index at which the correct decoding path is eliminated from the initial FSCL decoding. The trainable parameter is optimized online based on an efficient supervised learning framework. Simulation results show that for a polar code of length 512 with 256 information bits, with similar error-correction performance and memory consumption, the proposed Fast-SCLF decoder reduces the average decoding latency of the SCLF decoder with the same list size by up to $73.4\%$ at a frame error rate of $10^{-4}$, while incurring a maximum computational complexity overhead of $27.6\%$. For the same polar code and at practical signal-to-noise ratios, the proposed decoder with list size 4 reduces the average complexity and decoding latency of the FSCL decoder with list size 32 (FSCL-32) by $89.3\%$ and $43.7\%$, respectively, while also reducing the memory consumption of FSCL-32 by $83.2\%$. These improvements come at the cost of a $0.07$ dB degradation in error-correction performance compared with FSCL-32.
A novel recursive list decoding (RLD) algorithm for Reed-Muller (RM) codes based on successive permutations (SP) of the codeword is presented. An SP scheme that performs maximum likelihood decoding on a subset of the symmetry group of RM codes is first proposed to carefully select a good codeword permutation on the fly. Then, the proposed SP technique is applied to an improved RLD algorithm that initializes different decoding paths with random codeword permutations, which are sampled from the full symmetry group of RM codes. Finally, an efficient latency reduction scheme is introduced that virtually preserves the error-correction performance of the proposed decoder. Simulation results demonstrate that for the RM code of size $256$ with $163$ information bits, the proposed decoder reduces the computational complexity by $39\%$, the decoding latency by $36\%$, and the memory requirement by $74\%$ relative to the state-of-the-art RLD algorithm with list size $64$ that also uses permutations from the full symmetry group of RM codes, while incurring an error-correction performance degradation of only $0.1$ dB at the target frame error rate of $10^{-3}$.
We consider a time-varying first-order autoregressive model with irregular innovations, where we assume that the coefficient function is H\"{o}lder continuous. To estimate this function, we use a quasi-maximum likelihood based approach. A precise control of this method demands a delicate analysis of extremes of certain weakly dependent processes, our main result being a concentration inequality for such quantities. Based on our analysis, upper and matching minimax lower bounds are derived, showing the optimality of our estimators. Unlike the regular case, the information theoretic complexity depends both on the smoothness and an additional shape parameter, characterizing the irregularity of the underlying distribution. The results and ideas for the proofs are very different from classical and more recent methods in connection with statistics and inference for locally stationary processes.
Consider a set $P$ of $n$ points in $\mathbb{R}^d$. In the discrete median line segment problem, the objective is to find a line segment bounded by a pair of points in $P$ such that the sum of the Euclidean distances from $P$ to the line segment is minimized. In the continuous median line segment problem, a real number $\ell>0$ is given, and the goal is to locate a line segment of length $\ell$ in $\mathbb{R}^d$ such that the sum of the Euclidean distances between $P$ and the line segment is minimized. To begin with, we show how to compute $(1+\epsilon\Delta)$- and $(1+\epsilon)$-approximations to a discrete median line segment in time $O(n\epsilon^{-2d}\log n)$ and $O(n^2\epsilon^{-d})$, respectively, where $\Delta$ is the spread of line segments spanned by pairs of points. While developing our algorithms, by using the principle of pair decomposition, we derive new data structures that allow us to quickly approximate the sum of the distances from a set of points to a given line segment or point. To our knowledge, our utilization of pair decompositions for solving minsum facility location problems is the first of its kind -- it is versatile and easily implementable. Furthermore, we prove that it is impossible to construct a continuous median line segment for $n\geq3$ non-collinear points in the plane by using only ruler and compass. In view of this, we present an $O(n^d\epsilon^{-d})$-time algorithm for approximating a continuous median line segment in $\mathbb{R}^d$ within a factor of $1+\epsilon$. The algorithm is based upon generalizing the point-segment pair decomposition from the discrete to the continuous domain. Last but not least, we give a $(1+\epsilon)$-approximation algorithm, whose time complexity is sub-quadratic in $n$, for solving the constrained median line segment problem in $\mathbb{R}^2$ where an endpoint or the slope of the median line segment is given as input.
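For reference, the objective of the discrete problem can be evaluated exactly by brute force. The Python sketch below is a naive $O(n^3)$ baseline that tries every pair of points and sums point-to-segment distances. It is not the pair-decomposition algorithm of the paper, and the function names are hypothetical.

```python
import math
from itertools import combinations


def point_segment_distance(p, a, b):
    """Euclidean distance from point p to the segment with endpoints a and b."""
    ab = [bi - ai for ai, bi in zip(a, b)]
    ap = [pi - ai for ai, pi in zip(a, p)]
    denom = sum(c * c for c in ab)
    if denom == 0.0:
        return math.dist(p, a)  # degenerate segment: a single point
    # Parameter of the orthogonal projection, clamped to the segment.
    t = sum(u * v for u, v in zip(ap, ab)) / denom
    t = max(0.0, min(1.0, t))
    closest = [ai + t * c for ai, c in zip(a, ab)]
    return math.dist(p, closest)


def segment_cost(points, a, b):
    """Sum of distances from all points to the segment ab."""
    return sum(point_segment_distance(p, a, b) for p in points)


def discrete_median_segment(points):
    """Exhaustive search over all pairs of input points (O(n^3) time)."""
    return min(
        combinations(points, 2),
        key=lambda pair: segment_cost(points, pair[0], pair[1]),
    )


points = [(0.0, 0.0), (4.0, 0.0), (2.0, 1.0)]
best_a, best_b = discrete_median_segment(points)
best_cost = segment_cost(points, best_a, best_b)
```

The inner sum over all points is the quantity that the data structures described above approximate quickly; replacing `segment_cost` with such an approximate query is where the running-time savings would come from.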
Given input-output pairs of an elliptic partial differential equation (PDE) in three dimensions, we derive the first theoretically-rigorous scheme for learning the associated Green's function $G$. By exploiting the hierarchical low-rank structure of $G$, we show that one can construct an approximant to $G$ that converges almost surely and achieves a relative error of $\mathcal{O}(\Gamma_\epsilon^{-1/2}\log^3(1/\epsilon)\epsilon)$ using at most $\mathcal{O}(\epsilon^{-6}\log^4(1/\epsilon))$ input-output training pairs with high probability, for any $0<\epsilon<1$. The quantity $0<\Gamma_\epsilon\leq 1$ characterizes the quality of the training dataset. Along the way, we extend the randomized singular value decomposition algorithm for learning matrices to Hilbert--Schmidt operators and characterize the quality of covariance kernels for PDE learning.
We establish a complete classification of binary group codes with complementary duals for a finite group and explicitly determine the number of linear complementary dual (LCD) cyclic group codes by using cyclotomic cosets. The dimension and the minimum distance for LCD group codes are explored. Finally, we find a connection between LCD MDS group codes and maximal ideals.
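As a small illustration of how cyclotomic cosets enter such counts, the Python sketch below computes the $q$-cyclotomic cosets modulo $n$ and counts LCD cyclic codes of length $n$ with $\gcd(n,q)=1$. It relies on the classical criterion that such a cyclic code is LCD exactly when its generator polynomial is self-reciprocal, that is, when its defining set is closed under negation modulo $n$. This is an assumption about the cyclic special case only and is not claimed to be the counting formula derived in the paper. The function names are hypothetical.

```python
import math


def cyclotomic_cosets(n, q=2):
    """Return the q-cyclotomic cosets modulo n as sorted lists."""
    if math.gcd(n, q) != 1:
        raise ValueError("n and q must be coprime")
    seen = set()
    cosets = []
    for start in range(n):
        if start in seen:
            continue
        coset = set()
        x = start
        while x not in coset:
            coset.add(x)
            x = (x * q) % n
        seen |= coset
        cosets.append(sorted(coset))
    return cosets


def count_lcd_cyclic_codes(n, q=2):
    """Count defining sets closed under negation (trivial codes included).

    A coset C is self-reciprocal if -C = C. The remaining cosets come in
    pairs {C, -C}, and each self-reciprocal coset or pair is either in the
    defining set or not.
    """
    cosets = cyclotomic_cosets(n, q)
    self_reciprocal = sum(1 for c in cosets if (-c[0]) % n in c)
    pairs = (len(cosets) - self_reciprocal) // 2
    return 2 ** (self_reciprocal + pairs)


cosets_7 = cyclotomic_cosets(7)
lcd_count_7 = count_lcd_cyclic_codes(7)
lcd_count_9 = count_lcd_cyclic_codes(9)
```

For $n=7$ the cosets $\{1,2,4\}$ and $\{3,5,6\}$ are swapped by negation, so they must be included or excluded together, while $\{0\}$ is free. The count includes the zero code and the full space.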
A toric code, introduced by Hansen to extend the Reed-Solomon code as a $k$-dimensional subspace of $\mathbb{F}_q^n$, is determined by a toric variety or its associated integral convex polytope $P \subseteq [0,q-2]^n$, where $k=|P \cap \mathbb{Z}^n|$ (the number of integer lattice points of $P$). There are two relevant parameters that determine the quality of a code: the information rate, which measures how much information is contained in a single bit of each codeword; and the relative minimum distance, which measures how many errors can be corrected relative to how many bits each codeword has. Soprunov and Soprunova defined a good infinite family of codes to be a sequence of codes of unbounded polytope dimension such that neither the corresponding information rates nor relative minimum distances go to 0 in the limit. We examine different ways of constructing families of codes by considering polytope operations such as the join and direct sum. In doing so, we give conditions under which no good family can exist and strong evidence that there is no such good family of codes.
For neural networks (NNs) with rectified linear unit (ReLU) or binary activation functions, we show that their training can be accomplished in a reduced parameter space. Specifically, the weights in each neuron can be trained on the unit sphere, as opposed to the entire space, and the threshold can be trained in a bounded interval, as opposed to the real line. We show that the NNs in the reduced parameter space are mathematically equivalent to the standard NNs with parameters in the whole space. The reduced parameter space should facilitate the optimization procedure for network training, as the search space becomes (much) smaller. We demonstrate the improved training performance using numerical examples.
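One standard way to see why ReLU weights can be restricted to the unit sphere is positive homogeneity: $\mathrm{ReLU}(w\cdot x - b) = \|w\|\,\mathrm{ReLU}(u\cdot x - b/\|w\|)$ with $u = w/\|w\|$, so the norm can be absorbed into the outgoing weight. The Python sketch below checks this identity numerically. It is an illustration under that assumption, not necessarily the argument used in the paper, and it does not demonstrate the bounded interval for the threshold. The function names are hypothetical.

```python
import math


def relu(t):
    return max(0.0, t)


def neuron(w, b, x):
    """Single ReLU neuron with weights w and threshold b."""
    return relu(sum(wi * xi for wi, xi in zip(w, x)) - b)


def reduce_neuron(w, b):
    """Return (unit-norm weights, rescaled threshold, scale for the output)."""
    norm = math.sqrt(sum(wi * wi for wi in w))
    if norm == 0.0:
        raise ValueError("zero weight vector cannot be normalized")
    return [wi / norm for wi in w], b / norm, norm


w, b = [3.0, -4.0], 2.0
u, c, scale = reduce_neuron(w, b)
samples = [[1.0, 0.0], [0.0, 1.0], [2.0, -1.0], [-0.5, 0.25]]
# Largest discrepancy between the original neuron and its normalized form.
max_gap = max(abs(neuron(w, b, x) - scale * neuron(u, c, x)) for x in samples)
```

In a full network, `scale` would multiply the weight connecting this neuron to the next layer, so the network output is unchanged while the trainable weight vector `u` lives on the unit sphere.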