We consider $t$-Lee-error-correcting codes of length $n$ over the residue ring $\mathbb{Z}_m := \mathbb{Z}/m\mathbb{Z}$ and determine upper and lower bounds on the number of such codes. We use two different methods, namely estimating isolated nodes in bipartite graphs and the graph container method. The former gives density results for codes of fixed size, and the latter for codes of any size. This confirms some recent density results for linear Lee metric codes and provides new density results for nonlinear codes. To apply a variant of the graph container algorithm, we also investigate some geometric properties of balls in the Lee metric.
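As a concrete aside (not part of the paper's methods), the Lee metric and the resulting error-correction radius can be illustrated with a short sketch: the Lee weight of a symbol $a \in \mathbb{Z}_m$ is $\min(a, m-a)$, and a code corrects $t = \lfloor (d_{\min}-1)/2 \rfloor$ Lee errors, where $d_{\min}$ is its minimum Lee distance. The toy code below is arbitrary.

```python
# Minimal sketch (not from the paper): Lee weight, Lee distance over Z_m,
# and the error-correction radius t = floor((d_min - 1)/2) of a small code.
from itertools import combinations

def lee_weight(a: int, m: int) -> int:
    """Lee weight of a symbol a in Z_m: min(a mod m, m - a mod m)."""
    a %= m
    return min(a, m - a)

def lee_distance(x, y, m: int) -> int:
    """Lee distance between two vectors over Z_m (sum of symbol-wise Lee weights)."""
    return sum(lee_weight(xi - yi, m) for xi, yi in zip(x, y))

def correction_radius(code, m: int) -> int:
    """Largest t such that the code is t-Lee-error-correcting: t = floor((d_min - 1)/2)."""
    d_min = min(lee_distance(c1, c2, m) for c1, c2 in combinations(code, 2))
    return (d_min - 1) // 2

# Toy example over Z_5, length 3: minimum Lee distance 4, hence t = 1.
code = [(0, 0, 0), (1, 2, 3), (3, 4, 1)]
print(correction_radius(code, 5))
```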
Most link prediction methods return estimates of the connection probability of missing edges in a graph. Such output can be used to rank the missing edges from most to least likely to be a true edge, but it does not directly provide a classification into true and non-existent edges. In this work, we consider the problem of identifying a set of true edges with control of the false discovery rate (FDR). We propose a novel method based on high-level ideas from the literature on conformal inference. The graph structure induces intricate dependence in the data, which we carefully take into account, since this makes the setup different from the usual conformal inference setting, where exchangeability is assumed. The FDR control is demonstrated empirically on both simulated and real data.
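The paper's procedure is built specifically to handle the graph-induced dependence; purely as a hedged, generic illustration of the high-level idea (conformal-style p-values followed by a Benjamini-Hochberg selection), and not the authors' method, one could proceed as in the sketch below. All scores and parameters are hypothetical.

```python
# Hedged illustration (not the paper's method): BH selection at level alpha
# applied to conformal-style p-values built from link-prediction scores.
import numpy as np

def conformal_pvalues(cal_scores, test_scores):
    """p-value of each test score against calibration scores from known non-edges.

    Larger scores are assumed to indicate 'more likely a true edge'."""
    cal = np.asarray(cal_scores)
    test = np.asarray(test_scores)
    # p_i = (1 + #{calibration scores >= test score}) / (n_cal + 1)
    return (1 + (cal[None, :] >= test[:, None]).sum(axis=1)) / (len(cal) + 1)

def bh_select(pvals, alpha=0.1):
    """Benjamini-Hochberg: indices of rejected hypotheses (declared true edges)."""
    p = np.asarray(pvals)
    order = np.argsort(p)
    thresholds = alpha * np.arange(1, len(p) + 1) / len(p)
    below = p[order] <= thresholds
    if not below.any():
        return np.array([], dtype=int)
    k = np.nonzero(below)[0].max()
    return order[: k + 1]

# Hypothetical scores: calibration from known non-edges, test from candidate pairs.
rng = np.random.default_rng(0)
cal_scores = rng.normal(0.0, 1.0, size=500)
test_scores = np.concatenate([rng.normal(2.0, 1.0, 30), rng.normal(0.0, 1.0, 170)])
selected = bh_select(conformal_pvalues(cal_scores, test_scores), alpha=0.1)
print(len(selected), "candidate edges declared true")
```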
The gradient discretisation method (GDM) -- a generic framework encompassing many numerical methods -- is studied for a general stochastic Stefan problem with multiplicative noise. The convergence of the numerical solutions is proved by a compactness method using discrete functional analysis tools, the Skorohod theorem and the martingale representation theorem. The generic convergence results established in the GDM framework apply to a range of numerical methods, including, for example, mass-lumped finite elements, as well as some finite volume methods, mimetic methods and lowest-order virtual element methods. The theoretical results are complemented by numerical tests based on two methods that fit into the GDM framework.
Recently, the Meta-Auto-Decoder (MAD) was proposed as a novel reduced order model (ROM) for solving parametric partial differential equations (PDEs), and the best possible performance of this method can be quantified by the decoder width. This paper provides a theoretical analysis of the decoder width. The solution sets of several parametric PDEs are examined, and upper bounds on the corresponding decoder widths are estimated. In addition to elliptic and parabolic equations on a fixed domain, we investigate advection equations, which present challenges for classical linear ROMs, as well as elliptic equations with the shape of the computational domain as a variable PDE parameter. The resulting fast decay rates of the decoder widths indicate the promising potential of MAD for addressing these problems.
We study a class of stochastic semilinear damped wave equations driven by additive Wiener noise. Owing to the damping term, under appropriate conditions on the nonlinearity, the solution admits a unique invariant distribution. We apply semi-discrete and fully discrete methods to approximate this invariant distribution, using a spectral Galerkin method for spatial discretization and an exponential Euler integrator for temporal discretization. We prove that the considered numerical schemes also admit unique invariant distributions, and we establish error estimates between the approximate and exact invariant distributions, identifying the orders of convergence. To the best of our knowledge, this is the first result in the literature concerning the numerical approximation of invariant distributions for stochastic damped wave equations.
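To fix ideas (this is only a hedged sketch and not the paper's exact scheme), the two discretizations can be combined as follows in the linear case $f \equiv 0$: project onto sine modes, then advance each mode's position-velocity pair with one common variant of the exponential Euler step. The damping parameter, noise amplitudes and step size below are arbitrary.

```python
# Hedged sketch (not the paper's exact scheme): spectral Galerkin in space plus an
# exponential Euler step in time for a *linear* damped wave equation with additive
# noise, u_tt + gamma*u_t = u_xx + dW on (0, pi) with Dirichlet BC.  One variant only.
import numpy as np
from scipy.linalg import expm

gamma, N, dt, n_steps = 1.0, 32, 1e-2, 1000
rng = np.random.default_rng(1)

# Per-mode 2x2 drift matrices A_k = [[0, 1], [-k^2, -gamma]] and their exponentials.
ks = np.arange(1, N + 1)
expA = np.stack([expm(dt * np.array([[0.0, 1.0], [-float(k**2), -gamma]])) for k in ks])

# State per mode: (position coefficient u_k, velocity coefficient v_k).
X = np.zeros((N, 2))
noise_scale = 1.0 / ks  # assumed trace-class noise: smaller amplitude on high modes

for _ in range(n_steps):
    dW = noise_scale * rng.normal(0.0, np.sqrt(dt), size=N)
    # Exponential Euler (one common variant): push state plus velocity noise through e^{A dt}.
    X = np.einsum("kij,kj->ki", expA, X + np.stack([np.zeros(N), dW], axis=1))

print("mean squared modal amplitude after", n_steps, "steps:", float(np.mean(X**2)))
```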
We consider the problem of testing the martingale difference hypothesis for univariate strictly stationary time series by implementing a novel test for conditional mean independence based on the concept of martingale difference divergence. The martingale difference divergence function measures the degree to which a variable is conditionally mean dependent on its past values: in particular, it does so by computing the regularized norm of the covariance between the current value of the variable and the characteristic function of its past values. In this paper, we use this concept, along with the theoretical framework of the generalized spectral density, to construct a Ljung-Box type test for the martingale difference hypothesis. In addition to the results obtained by implementing the test statistic, we also establish some asymptotic results for martingale difference divergence in the time series framework.
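For orientation (a hedged sketch, not the authors' implementation), the squared martingale difference divergence of a series against its lag-$j$ past can be estimated with the plug-in version of the population identity $\mathrm{MDD}^2(Y \mid X) = -\mathbb{E}\big[(Y-\mu_Y)(Y'-\mu_Y)\,|X-X'|\big]$:

```python
# Hedged sketch (not the authors' code): plug-in estimator of the squared
# martingale difference divergence MDD^2(Y_t | Y_{t-j}) for a univariate series,
# following the identity MDD^2(Y|X) = -E[(Y - mu)(Y' - mu)|X - X'|].
import numpy as np

def mdd_sq(y, x):
    """V-statistic estimate of MDD^2(y | x) for univariate samples y, x."""
    y = np.asarray(y, dtype=float)
    x = np.asarray(x, dtype=float)
    yc = y - y.mean()
    dist = np.abs(x[:, None] - x[None, :])          # |X_j - X_k|
    return -np.mean(np.outer(yc, yc) * dist)        # -(1/n^2) * double sum

def mdd_sq_at_lag(series, lag):
    """MDD^2 of the series against its own values lag steps in the past."""
    s = np.asarray(series, dtype=float)
    return mdd_sq(s[lag:], s[:-lag])

# Toy example: an AR(1) series is conditionally mean dependent on its past,
# so its lag-1 MDD^2 should be visibly larger than that of i.i.d. noise.
rng = np.random.default_rng(2)
eps = rng.normal(size=2000)
ar1 = np.empty_like(eps)
ar1[0] = eps[0]
for t in range(1, len(eps)):
    ar1[t] = 0.6 * ar1[t - 1] + eps[t]
print(mdd_sq_at_lag(ar1, 1), mdd_sq_at_lag(eps, 1))
```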
In the permutation inversion problem, the task is to find the preimage of some challenge value, given oracle access to the permutation. This is a fundamental problem in query complexity, and appears in many contexts, particularly cryptography. In this work, we examine the setting in which the oracle allows for quantum queries to both the forward and the inverse direction of the permutation -- except that the challenge value cannot be submitted to the latter. Within that setting, we consider two options for the inversion algorithm: whether it can get quantum advice about the permutation, and whether it must produce the entire preimage (search) or only the first bit (decision). We prove several theorems connecting the hardness of the resulting variations of the inversion problem, and establish a number of lower bounds. Our results indicate that, perhaps surprisingly, the inversion problem does not become significantly easier when the adversary is granted oracle access to the inverse, provided it cannot query the challenge itself.
This paper presents a novel approach to Bayesian nonparametric spectral analysis of stationary multivariate time series. Starting from a parametric vector-autoregressive model, we adjust the parametric likelihood nonparametrically in the frequency domain to account for potential deviations from the parametric assumptions. We show mutual contiguity of the nonparametrically corrected likelihood, the multivariate Whittle likelihood approximation and the exact likelihood for Gaussian time series. A multivariate extension of the nonparametric Bernstein-Dirichlet process prior for univariate spectral densities to the space of Hermitian positive definite spectral density matrices is specified directly on the correction matrices. An infinite series representation of this prior is then used to develop a Markov chain Monte Carlo algorithm to sample from the posterior distribution. The code is made publicly available for ease of use and reproducibility. With this novel approach we generalize the multivariate Whittle-likelihood-based method of Meier et al. (2020) and extend the nonparametrically corrected likelihood for univariate stationary time series of Kirch et al. (2019) to the multivariate case. We demonstrate that the nonparametrically corrected likelihood combines the efficiency of a parametric model with the robustness of a nonparametric one. Its numerical accuracy is illustrated in a comprehensive simulation study. We illustrate its practical advantages with a spectral analysis of two environmental data sets: a bivariate time series of the Southern Oscillation Index and fish recruitment, and wind speed data at six locations in California.
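Since the corrected likelihood is anchored to the multivariate Whittle approximation, a short sketch of the Whittle log-likelihood itself may help fix notation; this is a generic textbook version (up to an additive constant and a particular periodogram normalization), not the paper's corrected likelihood.

```python
# Hedged sketch (not the paper's corrected likelihood): the multivariate Whittle
# log-likelihood, up to an additive constant, for a zero-mean d-variate series X
# (shape n x d) given a candidate spectral density matrix f(lambda_k) per frequency.
import numpy as np

def periodogram(X):
    """Matrix-valued periodogram I(lambda_k) at Fourier frequencies 2*pi*k/n, k = 1..m."""
    n, d = X.shape
    F = np.fft.fft(X, axis=0)                     # d(lambda_k) = sum_t X_t e^{-i lambda_k t}
    m = (n - 1) // 2
    D = F[1 : m + 1]                              # drop frequency zero (and Nyquist)
    return np.einsum("ki,kj->kij", D, D.conj()) / (2 * np.pi * n)

def whittle_loglik(X, f):
    """Whittle log-likelihood (up to a constant); f has shape (m, d, d), Hermitian PD."""
    I = periodogram(X)
    finv = np.linalg.inv(f)
    _, logdet = np.linalg.slogdet(f)
    quad = np.einsum("kij,kji->k", finv, I).real  # tr(f(lambda_k)^{-1} I(lambda_k))
    return -np.sum(logdet + quad)

# Toy check with white noise: the true spectral density is Sigma / (2*pi) at every frequency.
rng = np.random.default_rng(3)
X = rng.normal(size=(512, 2))
m = (X.shape[0] - 1) // 2
f_white = np.broadcast_to(np.eye(2) / (2 * np.pi), (m, 2, 2)).copy()
print(whittle_loglik(X, f_white))
```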
Error-correcting codes play an important role in data storage, transmission and cryptography, particularly in the post-quantum era. Hermitian matrices over finite fields, equipped with the rank metric, have the potential to offer enhanced security with greater efficiency in encryption and decryption. One crucial tool for evaluating the error-correcting capabilities of a code is its weight distribution, and the MacWilliams Theorem has long been used to recover this structure for new codes from their known duals. Earlier papers have developed the MacWilliams Theorem for certain classes of matrices in the form of a functional transformation, derived using $q$-algebra, character theory and generalised Krawtchouk polynomials, which is easy to apply and also allows moments of the weight distribution to be found. In this paper, recent work by Kai-Uwe Schmidt on the properties of codes based on Hermitian matrices, such as bounds on their size and the eigenvalues of their association scheme, is extended by introducing a negative-$q$ algebra to establish a MacWilliams Theorem in this form, together with some of its associated moments. The similarities between this approach and that of the paper on the skew-rank metric by Friedlander et al. are emphasised to facilitate future generalisation to any translation scheme.
The optimal branch number of MDS matrices makes them a preferred choice for designing diffusion layers in many block ciphers and hash functions. However, in lightweight cryptography, Near-MDS (NMDS) matrices with sub-optimal branch numbers offer a better balance between security and efficiency as a diffusion layer, compared to MDS matrices. In this paper, we study NMDS matrices, exploring their construction in both recursive and nonrecursive settings. We provide several theoretical results and explore the hardware efficiency of constructing NMDS matrices. Additionally, we compare the results for NMDS and MDS matrices whenever possible. For the recursive approach, we study DLS matrices and provide some theoretical results on their use; some of these results are used to restrict the search space of DLS matrices. We also show that, over a field of characteristic 2, any sparse matrix of order $n\geq 4$ with a fixed XOR value of 1 cannot be NMDS when raised to a power $k\leq n$. Following that, we use generalized DLS (GDLS) matrices to provide some lightweight recursive NMDS matrices of several orders that perform better than the existing matrices in terms of hardware cost or the number of iterations. For the nonrecursive construction of NMDS matrices, we study various structures, such as circulant and left-circulant matrices, and their generalizations, Toeplitz and Hankel matrices. In addition, we prove that Toeplitz matrices of order $n>4$ cannot be simultaneously NMDS and involutory over a field of characteristic 2. Finally, we use GDLS matrices to provide some lightweight NMDS matrices that can be computed in one clock cycle. The proposed nonrecursive NMDS matrices of orders 4, 5, 6, 7, and 8 can be implemented with 24, 50, 65, 96, and 108 XORs over $\mathbb{F}_{2^4}$, respectively.
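As a hedged illustration of the quantity separating MDS from NMDS matrices (branch number $n+1$ versus $n$), the differential branch number of a small matrix over $\mathbb{F}_{2^4}$ can be computed by brute force; the example matrix below is arbitrary and not taken from the paper.

```python
# Hedged illustration (matrix entries are arbitrary, not from the paper): computing
# the differential branch number of a 4x4 matrix over GF(2^4) by brute force.
# MDS matrices of order n attain branch number n+1; NMDS matrices attain n.
from itertools import product

IRRED = 0b10011  # x^4 + x + 1, one common irreducible polynomial for GF(2^4)

def gf16_mul(a, b):
    """Carry-less multiplication in GF(2^4) with reduction modulo x^4 + x + 1."""
    r = 0
    while b:
        if b & 1:
            r ^= a
        b >>= 1
        a <<= 1
        if a & 0b10000:
            a ^= IRRED
    return r

def mat_vec(M, x):
    """Matrix-vector product over GF(2^4): addition is XOR."""
    out = []
    for row in M:
        acc = 0
        for m_ij, x_j in zip(row, x):
            acc ^= gf16_mul(m_ij, x_j)
        out.append(acc)
    return out

def branch_number(M):
    """min over nonzero x of wt(x) + wt(Mx), where wt counts nonzero coordinates."""
    n = len(M)
    best = 2 * n + 1
    for x in product(range(16), repeat=n):
        if any(x):
            y = mat_vec(M, x)
            best = min(best, sum(v != 0 for v in x) + sum(v != 0 for v in y))
    return best

# Arbitrary example matrix (hypothetical, for demonstration only).
M = [[1, 1, 4, 9], [4, 1, 1, 9], [9, 4, 1, 1], [1, 9, 4, 1]]
print(branch_number(M))  # 5 (= n+1) would indicate MDS, 4 (= n) would indicate NMDS
```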
With the rapid increase of large-scale, real-world datasets, it becomes critical to address the problem of long-tailed data distribution (i.e., a few classes account for most of the data, while most classes are under-represented). Existing solutions typically adopt class re-balancing strategies such as re-sampling and re-weighting based on the number of observations for each class. In this work, we argue that as the number of samples increases, the additional benefit of a newly added data point will diminish. We introduce a novel theoretical framework to measure data overlap by associating with each sample a small neighboring region rather than a single point. The effective number of samples is defined as the volume of samples and can be calculated by a simple formula $(1-\beta^{n})/(1-\beta)$, where $n$ is the number of samples and $\beta \in [0,1)$ is a hyperparameter. We design a re-weighting scheme that uses the effective number of samples for each class to re-balance the loss, thereby yielding a class-balanced loss. Comprehensive experiments are conducted on artificially induced long-tailed CIFAR datasets and large-scale datasets including ImageNet and iNaturalist. Our results show that when trained with the proposed class-balanced loss, the network is able to achieve significant performance gains on long-tailed datasets.
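A minimal sketch of the resulting re-weighting, using only the effective-number formula quoted above; the normalization of the weights (so that they sum to the number of classes) is an assumption here.

```python
# Minimal sketch of class-balanced weights from the effective number of samples
# E_n = (1 - beta^n) / (1 - beta); weights are proportional to 1 / E_n per class.
# The normalization (weights summing to the number of classes) is an assumption.
import numpy as np

def class_balanced_weights(samples_per_class, beta=0.999):
    n = np.asarray(samples_per_class, dtype=float)
    effective_num = (1.0 - np.power(beta, n)) / (1.0 - beta)
    weights = 1.0 / effective_num
    return weights * len(n) / weights.sum()

# Example: a long-tailed class distribution gives rare classes larger weights.
print(class_balanced_weights([5000, 500, 50, 5], beta=0.999))
```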