Let $\mathbb{F}_q$ be a finite field of size $q$ and $\mathbb{F}_q^*$ the set of non-zero elements of $\mathbb{F}_q$. In this paper, we study a class of twisted generalized Reed-Solomon code $C_\ell(D, k, \eta, \vec{v})\subset \mathbb{F}_q^n$ generated by the following matrix \[ \left(\begin{array}{cccc} v_{1} & v_{2} & \cdots & v_{n} \\ v_{1} \alpha_{1} & v_{2} \alpha_{2} & \cdots & v_{n} \alpha_{n} \\ \vdots & \vdots & \ddots & \vdots \\ v_{1} \alpha_{1}^{\ell-1} & v_{2} \alpha_{2}^{\ell-1} & \cdots & v_{n} \alpha_{n}^{\ell-1} \\ v_{1} \alpha_{1}^{\ell+1} & v_{2} \alpha_{2}^{\ell+1} & \cdots & v_{n} \alpha_{n}^{\ell+1} \\ \vdots & \vdots & \ddots & \vdots \\ v_{1} \alpha_{1}^{k-1} & v_{2} \alpha_{2}^{k-1} & \cdots & v_{n} \alpha_{n}^{k-1} \\ v_{1}\left(\alpha_{1}^{\ell}+\eta\alpha_{1}^{q-{2}}\right) & v_{2}\left(\alpha_{2}^{\ell}+ \eta \alpha_{2}^{q-2}\right) &\cdots & v_{n}\left(\alpha_{n}^{\ell}+\eta\alpha_{n}^{q-2}\right) \end{array}\right) \] where $0\leq \ell\leq k-1,$ the evaluation set $D=\{\alpha_{1},\alpha_{2},\cdots, \alpha_{n}\}\subseteq \mathbb{F}_q^*$, scaling vector $\vec{v}=(v_1,v_2,\cdots,v_n)\in (\mathbb{F}_q^*)^n$ and $\eta\in\mathbb{F}_q^*$. The minimum distance and dual code of $C_\ell(D, k, \eta, \vec{v})$ will be determined. For the special case $\ell=k-1,$ a sufficient and necessary condition for $C_{k-1}(D, k, \eta, \vec{v})$ to be self-dual will be given. We will also show that the code is MDS or near-MDS. Moreover, a complete classification when the code is near-MDS or MDS will be presented.
By defining two important terms called basic perturbation vectors and obtaining their linear bounds, we obtain the linear componentwise perturbation bounds for unitary factors and upper triangular factors of the generalized Schur decomposition. The perturbation bounds for the diagonal elements of the upper triangular factors and the generalized invariant subspace are also derived. From the former, we present an upper bound and a condition number of the generalized eigenvalue. Furthermore, with numerical iterative method, the nonlinear componentwise perturbation bounds of the generalized Schur decomposition are also provided. Numerical examples are given to test the obtained bounds. Among them, we compare our upper bound and condition number of the generalized eigenvalue with their counterparts given in the literature. Numerical results show that they are very close to each other but our results don't contain the information on the left and right generalized eigenvectors.
Certain simplicial complexes are used to construct a subset $D$ of $\mathbb{F}_{2^n}^m$ and $D$, in turn, defines the linear code $C_{D}$ over $\mathbb{F}_{2^n}$ that consists of $(v\cdot d)_{d\in D}$ for $v\in \mathbb{F}_{2^n}^m$. Here we deal with the case $n=3$, that is, when $C_{D}$ is an octanary code. We establish a relation between $C_{D}$ and its binary subfield code $C_{D}^{(2)}$ with the help of a generator matrix. For a given length and dimension, a code is called distance optimal if it has the highest possible distance. With respect to the Griesmer bound, five infinite families of distance optimal codes are obtained, and sufficient conditions for certain linear codes to be minimal are established.
SVD (singular value decomposition) is one of the basic tools of machine learning, allowing to optimize basis for a given matrix. However, sometimes we have a set of matrices $\{A_k\}_k$ instead, and would like to optimize a single common basis for them: find orthogonal matrices $U$, $V$, such that $\{U^T A_k V\}$ set of matrices is somehow simpler. For example DCT-II is orthonormal basis of functions commonly used in image/video compression - as discussed here, this kind of basis can be quickly automatically optimized for a given dataset. While also discussed gradient descent optimization might be computationally costly, there is proposed CSVD (common SVD): fast general approach based on SVD. Specifically, we choose $U$ as built of eigenvectors of $\sum_i (w_k)^q (A_k A_k^T)^p$ and $V$ of $\sum_k (w_k)^q (A_k^T A_k)^p$, where $w_k$ are their weights, $p,q>0$ are some chosen powers e.g. 1/2, optionally with normalization e.g. $A \to A - rc^T$ where $r_i=\sum_j A_{ij}, c_j =\sum_i A_{ij}$.
Weighted automata are a generalization of nondeterministic automata that associate a weight drawn from a semiring $K$ with every transition and every state. Their behaviours can be formalized either as weighted language equivalence or weighted bisimulation. In this paper we explore the properties of weighted automata in the framework of coalgebras over (i) the category $\mathsf{SMod}$ of semimodules over a semiring $K$ and $K$-linear maps, and (ii) the category $\mathsf{Set}$ of sets and maps. We show that the behavioural equivalences defined by the corresponding final coalgebras in these two cases characterize weighted language equivalence and weighted bisimulation, respectively. These results extend earlier work by Bonchi et al. using the category $\mathsf{Vect}$ of vector spaces and linear maps as the underlying model for weighted automata with weights drawn from a field $K$. The key step in our work is generalizing the notions of linear relation and linear bisimulation of Boreale from vector spaces to semimodules using the concept of the kernel of a $K$-linear map in the sense of universal algebra. We also provide an abstract procedure for forward partition refinement for computing weighted language equivalence. Since for weighted automata defined over semirings the problem is undecidable in general, it is guaranteed to halt only in special cases. We provide sufficient conditions for the termination of our procedure. Although the results are similar to those of Bonchi et al., many of our proofs are new, especially those about the coalgebra in $\mathsf{SMod}$ characterizing weighted language equivalence.
Symbol-pair codes introduced by Cassuto and Blaum in 2010 are designed to protect against the pair errors in symbol-pair read channels. One of the central themes in symbol-error correction is the construction of maximal distance separable (MDS) symbol-pair codes that possess the largest possible pair-error correcting performance. In this paper, we construct more general generator polynomials for two classes of MDS symbol-pair codes with code length $lp$. Based on repeated-root cyclic codes, we derive all MDS symbol-pair codes of length $3p$, when the degree of the generator polynomials is no more than 10. We also give two new classes of (almost maximal distance separable) AMDS symbol-pair codes with the length $lp$ or $4p$ by virtue of repeated-root cyclic codes. For length $3p$, we derive all AMDS symbol-pair codes, when the degree of the generator polynomials is less than 10. The main results are obtained by determining the solutions of certain equations over finite fields.
Generalized pair weights of linear codes are generalizations of minimum symbol-pair weights, which were introduced by Liu and Pan \cite{LP} recently. Generalized pair weights can be used to characterize the ability of protecting information in the symbol-pair read wire-tap channels of type II. In this paper, we introduce the notion of generalized $b$-symbol weights of linear codes over finite fields, which is a generalization of generalized Hamming weights and generalized pair weights. We obtain some basic properties and bounds of generalized $b$-symbol weights which are called Singleton-like bounds for generalized $b$-symbol weights. As examples, we calculate generalized weight matrices for simplex codes and Hamming codes. We provide a necessary and sufficient condition for a linear code to be a $b$-symbol MDS code by using the generator matrix and the parity check matrix of this linear code. Finally, a necessary and sufficient condition of a linear isomorphism preserving $b$-symbol weights between two linear codes is obtained. As a corollary, we get the classical MacWilliams extension theorem when $b=1$.
The lossless compression of a single source $X^n$ was recently shown to be achievable with a notion of strong locality; any $X_i$ can be decoded from a {\emph{constant}} number of compressed bits, with a vanishing in $n$ probability of error. In contrast with the single source setup, we show that for two separately encoded sources $(X^n,Y^n)$, lossless compression and strong locality is generally not possible. More precisely, we show that for the class of "confusable" sources strong locality cannot be achieved whenever one of the sources is compressed below its entropy. In this case, irrespectively of $n$, the probability of error of decoding any $(X_i,Y_i)$ is lower bounded by $2^{-O(d_{\mathrm{loc}})}$, where $d_{\mathrm{loc}}$ denotes the number of compressed bits accessed by the local decoder. Conversely, if the source is not confusable, strong locality is possible even if one of the sources is compressed below its entropy. Results extend to any number of sources.
The minimum energy path (MEP) describes the mechanism of reaction, and the energy barrier along the path can be used to calculate the reaction rate in thermal systems. The nudged elastic band (NEB) method is one of the most commonly used schemes to compute MEPs numerically. It approximates an MEP by a discrete set of configuration images, where the discretization size determines both computational cost and accuracy of the simulations. In this paper, we consider a discrete MEP to be a stationary state of the NEB method and prove an optimal convergence rate of the discrete MEP with respect to the number of images. Numerical simulations for the transitions of some several proto-typical model systems are performed to support the theory.
Universal coding of integers~(UCI) is a class of variable-length code, such that the ratio of the expected codeword length to $\max\{1,H(P)\}$ is within a constant factor, where $H(P)$ is the Shannon entropy of the decreasing probability distribution $P$. However, if we consider the ratio of the expected codeword length to $H(P)$, the ratio tends to infinity by using UCI, when $H(P)$ tends to zero. To solve this issue, this paper introduces a class of codes, termed generalized universal coding of integers~(GUCI), such that the ratio of the expected codeword length to $H(P)$ is within a constant factor $K$. First, the definition of GUCI is proposed and the coding structure of GUCI is introduced. Next, we propose a class of GUCI $\mathcal{C}$ to achieve the expansion factor $K_{\mathcal{C}}=2$ and show that the optimal GUCI is in the range $1\leq K_{\mathcal{C}}^{*}\leq 2$. Then, by comparing UCI and GUCI, we show that when the entropy is very large or $P(0)$ is not large, there are also cases where the average codeword length of GUCI is shorter. Finally, the asymptotically optimal GUCI is presented.
In structure learning, the output is generally a structure that is used as supervision information to achieve good performance. Considering the interpretation of deep learning models has raised extended attention these years, it will be beneficial if we can learn an interpretable structure from deep learning models. In this paper, we focus on Recurrent Neural Networks (RNNs) whose inner mechanism is still not clearly understood. We find that Finite State Automaton (FSA) that processes sequential data has more interpretable inner mechanism and can be learned from RNNs as the interpretable structure. We propose two methods to learn FSA from RNN based on two different clustering methods. We first give the graphical illustration of FSA for human beings to follow, which shows the interpretability. From the FSA's point of view, we then analyze how the performance of RNNs are affected by the number of gates, as well as the semantic meaning behind the transition of numerical hidden states. Our results suggest that RNNs with simple gated structure such as Minimal Gated Unit (MGU) is more desirable and the transitions in FSA leading to specific classification result are associated with corresponding words which are understandable by human beings.