Partisan gerrymandering, i.e., manipulation of electoral district boundaries for political advantage, is one of the major challenges to election integrity in modern day democracies. Yet most of the existing methods for detecting partisan gerrymandering are narrowly tailored toward fully contested two-party elections, and fail if there are more parties or if the number of candidates per district varies (as is the case in many plurality-based electoral systems outside the United States). We propose two methods, based on nonparametric statistical learning, that are able to deal with such cases. The use of multiple methods makes the proposed solution robust against violation of their respective assumptions. We then test the proposed methods against real-life data from national and subnational elections in 17 countries employing the FPTP system.
The utilization of finite field multipliers is pervasive in contemporary digital systems, with hardware implementation for bit parallel operation often necessitating millions of logic gates. However, various digital design issues, whether inherent or stemming from soft errors, can result in gate malfunction, ultimately can cause gates to malfunction, which in turn results in incorrect multiplier outputs. Thus, to prevent susceptibility to error, it is imperative to employ a reliable finite field multiplier implementation that boasts a robust fault detection capability. In order to achieve the best fault detection performance for finite field detection performance for finite field multipliers while maintaining a low-complexity implementation, this study proposes a novel fault detection scheme for a recent bit-parallel polynomial basis over GF(2m). The primary concept behind the proposed approach is centered on the implementation of an efficient BCH decoder that utilize Berlekamp-Rumsey-Solomon (BRS) algorithm and Chien-search method to effectively locate errors with minimal delay. The results of our synthesis indicate that our proposed error detection and correction architecture for a 45-bit multiplier with 5-bit errors achieves a 37% and 49% reduction in critical path delay compared to existing designs. Furthermore, a 45-bit multiplicand with five errors has hardware complexity that is only 80%, which is significantly less complex than the most advanced BCH-based fault recognition techniques, such as TMR, Hamming's single error correction, and LDPC-based methods for finite field multiplication which is desirable in constrained applications, such as smart cards, IoT devices, and implantable medical devices.
We consider estimation of parameters defined as linear functionals of solutions to linear inverse problems. Any such parameter admits a doubly robust representation that depends on the solution to a dual linear inverse problem, where the dual solution can be thought as a generalization of the inverse propensity function. We provide the first source condition double robust inference method that ensures asymptotic normality around the parameter of interest as long as either the primal or the dual inverse problem is sufficiently well-posed, without knowledge of which inverse problem is the more well-posed one. Our result is enabled by novel guarantees for iterated Tikhonov regularized adversarial estimators for linear inverse problems, over general hypothesis spaces, which are developments of independent interest.
The dictionary learning problem can be viewed as a data-driven process to learn a suitable transformation so that data is sparsely represented directly from example data. In this paper, we examine the problem of learning a dictionary that is invariant under a pre-specified group of transformations. Natural settings include Cryo-EM, multi-object tracking, synchronization, pose estimation, etc. We specifically study this problem under the lens of mathematical representation theory. Leveraging the power of non-abelian Fourier analysis for functions over compact groups, we prescribe an algorithmic recipe for learning dictionaries that obey such invariances. We relate the dictionary learning problem in the physical domain, which is naturally modelled as being infinite dimensional, with the associated computational problem, which is necessarily finite dimensional. We establish that the dictionary learning problem can be effectively understood as an optimization instance over certain matrix orbitopes having a particular block-diagonal structure governed by the irreducible representations of the group of symmetries. This perspective enables us to introduce a band-limiting procedure which obtains dimensionality reduction in applications. We provide guarantees for our computational ansatz to provide a desirable dictionary learning outcome. We apply our paradigm to investigate the dictionary learning problem for the groups SO(2) and SO(3). While the SO(2)-orbitope admits an exact spectrahedral description, substantially less is understood about the SO(3)-orbitope. We describe a tractable spectrahedral outer approximation of the SO(3)-orbitope, and contribute an alternating minimization paradigm to perform optimization in this setting. We provide numerical experiments to highlight the efficacy of our approach in learning SO(3)-invariant dictionaries, both on synthetic and on real world data.
In this paper, we study the lattice linearity of multiplication and modulo operations. We demonstrate that these operations are lattice linear and the parallel processing algorithms that we study for both these operations are able to exploit the lattice linearity of their respective problems. This implies that these algorithms can be implemented in asynchronous environments, where the nodes are allowed to read old information from each other. These algorithms also exhibit snap-stabilizing properties, i.e., starting from an arbitrary state, the sequence of state transitions made by the system strictly follows its specification.
Antimicrobial resistance is becoming a major threat to public health throughout the world. Researchers are attempting to contrast it by developing both new antibiotics and patient-specific treatments. In the second case, whole-genome sequencing has had a huge impact in two ways: first, it is becoming cheaper and faster to perform whole-genome sequencing, and this makes it competitive with respect to standard phenotypic tests; second, it is possible to statistically associate the phenotypic patterns of resistance to specific mutations in the genome. Therefore, it is now possible to develop catalogues of genomic variants associated with resistance to specific antibiotics, in order to improve prediction of resistance and suggest treatments. It is essential to have robust methods for identifying mutations associated to resistance and continuously updating the available catalogues. This work proposes a general method to study minimal inhibitory concentration (MIC) distributions and to identify clusters of strains showing different levels of resistance to antimicrobials. Once the clusters are identified and strains allocated to each of them, it is possible to perform regression method to identify with high statistical power the mutations associated with resistance. The method is applied to a new 96-well microtiter plate used for testing M. Tuberculosis.
The standard photometric stereo model makes several assumptions that are rarely verified in experimental datasets. In particular, the observed object should behave as a Lambertian reflector and the light sources should be positioned at an infinite distance from it, along a known direction. Even when Lambert's law is approximately fulfilled, an accurate assessment of the relative position between the light source and the target is often unavailable in real situations. The Hayakawa procedure is a computational method for estimating such information directly from the data images. It occasionally breaks down when some of the available images deviate too much from ideality. Indeed, in narrow shooting scenarios, typical, e.g., of archaeological excavation sites, it may be impossible to position a flashlight at a sufficient distance from the observed surface. It is then necessary to understand if a given dataset is reliable and which images should be selected to better reconstruct the target. In this paper, we propose some algorithms to perform this task and explore their effectiveness.
Perfect synchronization in distributed machine learning problems is inefficient and even impossible due to the existence of latency, package losses and stragglers. We propose a Robust Fully-Asynchronous Stochastic Gradient Tracking method (R-FAST), where each device performs local computation and communication at its own pace without any form of synchronization. Different from existing asynchronous distributed algorithms, R-FAST can eliminate the impact of data heterogeneity across devices and allow for packet losses by employing a robust gradient tracking strategy that relies on properly designed auxiliary variables for tracking and buffering the overall gradient vector. More importantly, the proposed method utilizes two spanning-tree graphs for communication so long as both share at least one common root, enabling flexible designs in communication architectures. We show that R-FAST converges in expectation to a neighborhood of the optimum with a geometric rate for smooth and strongly convex objectives; and to a stationary point with a sublinear rate for general non-convex settings. Extensive experiments demonstrate that R-FAST runs 1.5-2 times faster than synchronous benchmark algorithms, such as Ring-AllReduce and D-PSGD, while still achieving comparable accuracy, and outperforms existing asynchronous SOTA algorithms, such as AD-PSGD and OSGP, especially in the presence of stragglers.
In many modern statistical problems, the limited available data must be used both to develop the hypotheses to test, and to test these hypotheses-that is, both for exploratory and confirmatory data analysis. Reusing the same dataset for both exploration and testing can lead to massive selection bias, leading to many false discoveries. Selective inference is a framework that allows for performing valid inference even when the same data is reused for exploration and testing. In this work, we are interested in the problem of selective inference for data clustering, where a clustering procedure is used to hypothesize a separation of the data points into a collection of subgroups, and we then wish to test whether these data-dependent clusters in fact represent meaningful differences within the data. Recent work by Gao et al. [2022] provides a framework for doing selective inference for this setting, where a hierarchical clustering algorithm is used for producing the cluster assignments, which was then extended to k-means clustering by Chen and Witten [2022]. Both these works rely on assuming a known covariance structure for the data, but in practice, the noise level needs to be estimated-and this is particularly challenging when the true cluster structure is unknown. In our work, we extend this work to the setting of noise with unknown variance, and provide a selective inference method for this more general setting. Empirical results show that our new method is better able to maintain high power while controlling Type I error when the true noise level is unknown.
It has been observed by several authors that well-known periodization strategies like tent or Chebychev transforms lead to remarkable results for the recovery of multivariate functions from few samples. So far, theoretical guarantees are missing. The goal of this paper is twofold. On the one hand, we give such guarantees and briefly describe the difficulties of the involved proof. On the other hand, we combine these periodization strategies with recent novel constructive methods for the efficient subsampling of finite frames in $\mathbb{C}$. As a result we are able to reconstruct non-periodic multivariate functions from very few samples. The used sampling nodes are the result of a two-step procedure. Firstly, a random draw with respect to the Chebychev measure provides an initial node set. A further sparsification technique selects a significantly smaller subset of these nodes with equal approximation properties. This set of sampling nodes scales linearly in the dimension of the subspace on which we project and works universally for the whole class of functions. The method is based on principles developed by Batson, Spielman, and Srivastava and can be numerically implemented. Samples on these nodes are then used in a (plain) least-squares sampling recovery step on a suitable hyperbolic cross subspace of functions resulting in a near-optimal behavior of the sampling error. Numerical experiments indicate the applicability of our results.
In 1954, Alston S. Householder published Principles of Numerical Analysis, one of the first modern treatments on matrix decomposition that favored a (block) LU decomposition-the factorization of a matrix into the product of lower and upper triangular matrices. And now, matrix decomposition has become a core technology in machine learning, largely due to the development of the back propagation algorithm in fitting a neural network. The sole aim of this survey is to give a self-contained introduction to concepts and mathematical tools in numerical linear algebra and matrix analysis in order to seamlessly introduce matrix decomposition techniques and their applications in subsequent sections. However, we clearly realize our inability to cover all the useful and interesting results concerning matrix decomposition and given the paucity of scope to present this discussion, e.g., the separated analysis of the Euclidean space, Hermitian space, Hilbert space, and things in the complex domain. We refer the reader to literature in the field of linear algebra for a more detailed introduction to the related fields.