A code of length $n$ is said to be (combinatorially) $(\rho,L)$-list decodable if the Hamming ball of radius $\rho n$ around any vector in the ambient space does not contain more than $L$ codewords. We study a recently introduced class of higher order MDS codes, which are closely related (via duality) to codes that achieve a generalized Singleton bound for list decodability. For some $\ell\geq 1$, higher order MDS codes of length $n$, dimension $k$, and order $\ell$ are denoted as $(n,k)$-MDS($\ell$) codes. We present a number of results on the structure of these codes, identifying the `extend-ability' of their parameters in various scenarios. Specifically, for some parameter regimes, we identify conditions under which $(n_1,k_1)$-MDS($\ell_1$) codes can be obtained from $(n_2,k_2)$-MDS($\ell_2$) codes, via various techniques. We believe that these results will aid in efficient constructions of higher order MDS codes. We also obtain a new field size upper bound for the existence of such codes, which arguably improves over the best known existing bound, in some parameter regimes.
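The combinatorial definition of $(\rho,L)$-list decodability stated above can be checked directly on toy codes. The following is a minimal brute-force sketch, not a construction from the paper: the function names `hamming` and `is_list_decodable` and the alphabet-size argument `q` are assumptions made for illustration, and the exhaustive search over all $q^n$ centers is feasible only for very small $n$.

```python
from fractions import Fraction
from itertools import product


def hamming(u, v):
    """Hamming distance between two equal-length tuples."""
    return sum(a != b for a, b in zip(u, v))


def is_list_decodable(code, q, rho, L):
    """Return True if no Hamming ball of radius rho*n contains more than L codewords.

    code: list of tuples over the alphabet {0, ..., q-1} (hypothetical toy input).
    rho:  relative radius; a Fraction avoids floating-point rounding at the boundary.
    """
    n = len(code[0])
    radius = rho * n
    # Enumerate every vector of the ambient space as a candidate ball center.
    for center in product(range(q), repeat=n):
        inside = sum(1 for c in code if hamming(center, c) <= radius)
        if inside > L:
            return False
    return True


# Binary repetition code of length 3: unique decoding up to radius 1,
# but a ball of radius 2 can contain both codewords.
repetition = [(0, 0, 0), (1, 1, 1)]
print(is_list_decodable(repetition, 2, Fraction(1, 3), 1))
print(is_list_decodable(repetition, 2, Fraction(2, 3), 1))
```

For the length-3 repetition code, a ball of radius 1 never holds both codewords, while a ball of radius 2 around a vector such as $(0,0,1)$ does, so the list size must grow to 2 at that radius.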
The Byzantine consensus problem involves $n$ processes, out of which $t < n$ could be faulty and behave arbitrarily. Three properties characterize consensus: (1) termination, requiring correct (non-faulty) processes to eventually reach a decision, (2) agreement, preventing them from deciding different values, and (3) validity, precluding ``unreasonable'' decisions. But what is a reasonable decision? Strong validity, a classical property, stipulates that, if all correct processes propose the same value, only that value can be decided. Weak validity, another established property, stipulates that, if all processes are correct and they propose the same value, that value must be decided. The space of possible validity properties is vast. However, their impact on consensus remains unclear. This paper addresses the question of which validity properties allow Byzantine consensus to be solvable with partial synchrony, and at what cost. First, we determine necessary and sufficient conditions for a validity property to make the consensus problem solvable; we say that such validity properties are solvable. Notably, we prove that, if $n \leq 3t$, all solvable validity properties are trivial (there exists an always-admissible decision). Furthermore, we show that, with any non-trivial (and solvable) validity property, consensus requires $\Omega(t^2)$ messages. This extends the seminal Dolev-Reischuk bound, originally proven for strong validity, to all non-trivial validity properties. Lastly, we give a general Byzantine consensus algorithm, which we call Universal, for any solvable (and non-trivial) validity property. Importantly, Universal incurs $O(n^2)$ message complexity. Thus, together with our lower bound, Universal implies a fundamental result in partial synchrony: with $t \in \Omega(n)$, the message complexity of all (non-trivial) consensus variants is $\Theta(n^2)$.
Model specification searches and modifications are commonly employed in covariance structure analysis (CSA) or structural equation modeling (SEM) to improve the goodness-of-fit. However, these practices can be susceptible to capitalizing on chance, as a model that fits one sample may not generalize to another sample from the same population. This paper introduces the improved Lagrange Multipliers (LM) test, which provides a reliable method for conducting a thorough model specification search and effectively identifying missing parameters. By leveraging the stepwise bootstrap method in the standard LM and Wald tests, our data-driven approach enhances the accuracy of parameter identification. The results from Monte Carlo simulations and two empirical applications in political science demonstrate the effectiveness of the improved LM test, particularly when dealing with small sample sizes and models with large degrees of freedom. This approach contributes to better statistical fit and addresses the issue of capitalization on chance in model specification.
We consider the problem of testing for the martingale difference hypothesis for univariate strictly stationary time series by implementing a novel test for conditional mean independence based on the concept of martingale difference divergence. The martingale difference divergence function allows us to measure the degree to which a certain variable is conditionally mean dependent upon its past values: in particular, it does so by computing the regularized norm of the covariance between the current value of the variable and the characteristic function of its past values. In this paper, we make use of such a concept, along with the theoretical framework of generalized spectral density, to construct a Ljung-Box type test for the martingale difference hypothesis. In addition to the results obtained with the implementation of the test statistic, we proceed to show some asymptotics for martingale difference divergence in the time series framework.
The analysis of survey data is a frequently arising issue in clinical trials, particularly when capturing quantities which are difficult to measure using, e.g., a technical device or a biochemical procedure. Typical examples are questionnaires about patients' well-being, pain, anxiety, quality of life or consent to an intervention. Data is captured on a discrete scale containing only a limited (usually three to ten) number of possible answers, of which the respondent has to pick the answer that best fits their personal opinion on the question. This data is generally located on an ordinal scale as answers can usually be arranged in an increasing order, e.g., "bad", "neutral", "good" for well-being or "none", "mild", "moderate", "severe" for pain. Since responses are often stored numerically for data processing purposes, analysis of survey data using ordinary linear regression (OLR) models seems to be natural. However, OLR assumptions are often not met as linear regression requires a constant variability of the response variable and can yield predictions outside the range of response categories. Moreover, in doing so, one only gains insights about the mean response which might, depending on the response distribution, not be very representative. In contrast, ordinal regression models are able to provide probability estimates for all response categories and thus yield information about the full response scale rather than just the mean. Although these methods are well described in the literature, they seem to be rarely applied to biomedical or survey data. In this paper, we give a concise overview of the fundamentals of ordinal models, present applications to a real data set, outline the usage of state-of-the-art software for this purpose, and point out strengths, limitations, and typical pitfalls. This article is a companion work to a current vignette-based structured interview study in paediatric anaesthesia.
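To illustrate how an ordinal model yields a probability for every response category rather than only a mean, the sketch below assumes a cumulative logit (proportional odds) model, $P(Y \leq j \mid x) = \mathrm{logistic}(\theta_j - \eta)$ with linear predictor $\eta$. The abstract above does not name a specific model, so this choice, the function name `category_probabilities`, and the threshold values are all assumptions made for illustration.

```python
import math


def category_probabilities(thresholds, eta):
    """Category probabilities under an assumed cumulative logit model.

    thresholds: increasing cut points theta_1 < ... < theta_{J-1} (made-up values below).
    eta:        linear predictor x'beta for one respondent.
    Returns a list of J probabilities, one per ordered response category.
    """
    # Cumulative probabilities P(Y <= j); the last category closes the scale at 1.
    cumulative = [1.0 / (1.0 + math.exp(-(t - eta))) for t in thresholds] + [1.0]
    probs = []
    previous = 0.0
    for c in cumulative:
        probs.append(c - previous)  # P(Y = j) = P(Y <= j) - P(Y <= j - 1)
        previous = c
    return probs


# Four pain categories ("none", "mild", "moderate", "severe") need three cut points.
probs = category_probabilities([-1.0, 0.0, 1.0], eta=0.5)
print([round(p, 3) for p in probs])
```

Unlike a linear regression fit to numerically coded answers, the output always stays within the response scale: each entry is a probability and the entries sum to one.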
This paper introduces discrete-holomorphic Perfectly Matched Layers (PMLs) specifically designed for high-order finite difference (FD) discretizations of the scalar wave equation. In contrast to standard PDE-based PMLs, the proposed method achieves the remarkable outcome of completely eliminating numerical reflections at the PML interface, in practice achieving errors at the level of machine precision. Our approach builds upon the ideas put forth in a recent publication [Journal of Computational Physics 381 (2019): 91-109] expanding the scope from the standard second-order FD method to arbitrary high-order schemes. This generalization uses additional localized PML variables to accommodate the larger stencils employed. We establish that the numerical solutions generated by our proposed schemes exhibit an exponential decay rate as they propagate within the PML domain. To showcase the effectiveness of our method, we present a variety of numerical examples, including waveguide problems. These examples highlight the importance of employing high-order schemes to effectively address and minimize undesired numerical dispersion errors, emphasizing the practical advantages and applicability of our approach.
The optimal branch number of MDS matrices makes them a preferred choice for designing diffusion layers in many block ciphers and hash functions. However, in lightweight cryptography, Near-MDS (NMDS) matrices with sub-optimal branch numbers offer a better balance between security and efficiency as a diffusion layer, compared to MDS matrices. In this paper, we study NMDS matrices, exploring their construction in both recursive and nonrecursive settings. We provide several theoretical results and explore the hardware efficiency of the construction of NMDS matrices. Additionally, we make comparisons between the results of NMDS and MDS matrices whenever possible. For the recursive approach, we study the DLS matrices and provide some theoretical results on their use. Some of the results are used to restrict the search space of the DLS matrices. We also show that over a field of characteristic 2, any sparse matrix of order $n\geq 4$ with fixed XOR value of 1 cannot be an NMDS when raised to a power of $k\leq n$. Following that, we use the generalized DLS (GDLS) matrices to provide some lightweight recursive NMDS matrices of several orders that perform better than the existing matrices in terms of hardware cost or the number of iterations. For the nonrecursive construction of NMDS matrices, we study various structures, such as circulant and left-circulant matrices, and their generalizations: Toeplitz and Hankel matrices. In addition, we prove that Toeplitz matrices of order $n>4$ cannot be simultaneously NMDS and involutory over a field of characteristic 2. Finally, we use GDLS matrices to provide some lightweight NMDS matrices that can be computed in one clock cycle. The proposed nonrecursive NMDS matrices of orders 4, 5, 6, 7, and 8 can be implemented with 24, 50, 65, 96, and 108 XORs over $\mathbb{F}_{2^4}$, respectively.
Graph Neural Networks (GNNs) have been successfully used in many problems involving graph-structured data, achieving state-of-the-art performance. GNNs typically employ a message-passing scheme, in which every node aggregates information from its neighbors using a permutation-invariant aggregation function. Standard well-examined choices such as the mean or sum aggregation functions have limited capabilities, as they are not able to capture interactions among neighbors. In this work, we formalize these interactions using an information-theoretic framework that notably includes synergistic information. Driven by this definition, we introduce the Graph Ordering Attention (GOAT) layer, a novel GNN component that captures interactions between nodes in a neighborhood. This is achieved by learning local node orderings via an attention mechanism and processing the ordered representations using a recurrent neural network aggregator. This design allows us to make use of a permutation-sensitive aggregator while maintaining the permutation-equivariance of the proposed GOAT layer. The GOAT model demonstrates its increased performance in modeling graph metrics that capture complex information, such as the betweenness centrality and the effective size of a node. In practical use-cases, its superior modeling capability is confirmed through its success in several real-world node classification benchmarks.
We describe ACE0, a lightweight platform for evaluating the suitability and viability of AI methods for behaviour discovery in multi-agent simulations. Specifically, ACE0 was designed to explore AI methods for multi-agent simulations used in operations research studies related to new technologies such as autonomous aircraft. Simulation environments used in production are often high-fidelity and complex, require significant domain knowledge, and as a result have high R&D costs. Minimal and lightweight simulation environments can help researchers and engineers evaluate the viability of new AI technologies for behaviour discovery in a more agile and potentially cost-effective manner. In this paper we describe the motivation for the development of ACE0. We provide a technical overview of the system architecture, describe a case study of behaviour discovery in the aerospace domain, and provide a qualitative evaluation of the system. The evaluation includes a brief description of collaborative research projects with academic partners, exploring different AI behaviour discovery methods.
With the rapid increase of large-scale, real-world datasets, it becomes critical to address the problem of long-tailed data distribution (i.e., a few classes account for most of the data, while most classes are under-represented). Existing solutions typically adopt class re-balancing strategies such as re-sampling and re-weighting based on the number of observations for each class. In this work, we argue that as the number of samples increases, the additional benefit of a newly added data point will diminish. We introduce a novel theoretical framework to measure data overlap by associating with each sample a small neighboring region rather than a single point. The effective number of samples is defined as the volume of samples and can be calculated by a simple formula $(1-\beta^{n})/(1-\beta)$, where $n$ is the number of samples and $\beta \in [0,1)$ is a hyperparameter. We design a re-weighting scheme that uses the effective number of samples for each class to re-balance the loss, thereby yielding a class-balanced loss. Comprehensive experiments are conducted on artificially induced long-tailed CIFAR datasets and large-scale datasets including ImageNet and iNaturalist. Our results show that when trained with the proposed class-balanced loss, the network is able to achieve significant performance gains on long-tailed datasets.
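The effective-number formula $(1-\beta^{n})/(1-\beta)$ above can be turned into per-class loss weights in a few lines. This is a minimal sketch rather than the authors' implementation: the function names are hypothetical, and normalizing the weights so that they sum to the number of classes is an assumption made here for readability.

```python
def effective_number(n, beta):
    """Effective number of samples (1 - beta**n) / (1 - beta) for beta in [0, 1)."""
    if not 0.0 <= beta < 1.0:
        raise ValueError("beta must lie in [0, 1)")
    if n < 1:
        raise ValueError("n must be a positive sample count")
    return (1.0 - beta ** n) / (1.0 - beta)


def class_balanced_weights(counts, beta):
    """Weights inversely proportional to each class's effective number of samples.

    The rescaling so that weights sum to len(counts) is an illustrative assumption.
    """
    inverse = [1.0 / effective_number(n, beta) for n in counts]
    scale = len(counts) / sum(inverse)
    return [w * scale for w in inverse]


# A head class with 5000 samples and a tail class with 20 samples (made-up counts).
for beta in (0.0, 0.99, 0.9999):
    print(beta, class_balanced_weights([5000, 20], beta))
```

With $\beta = 0$ every class has effective number 1, so the weights are uniform and no re-balancing occurs; as $\beta$ approaches 1 the effective number approaches $n$, and the weights approach inverse class frequency.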
In this paper, we propose the joint learning of attention and recurrent neural network (RNN) models for multi-label classification. While approaches based on the use of either model exist (e.g., for the task of image captioning), training such existing network architectures typically requires pre-defined label sequences. For multi-label classification, it would be desirable to have a robust inference process, so that the prediction error would not propagate and thus affect the performance. Our proposed model uniquely integrates attention and Long Short-Term Memory (LSTM) models, which not only addresses the above problem but also allows one to identify visual objects of interest with varying sizes without prior knowledge of a particular label ordering. More importantly, label co-occurrence information can be jointly exploited by our LSTM model. Finally, by advancing the technique of beam search, prediction of multiple labels can be efficiently achieved by our proposed network model.