In this paper, we establish the central limit theorem (CLT) for linear spectral statistics (LSS) of large-dimensional sample covariance matrices when the population covariance matrices are not uniformly bounded. This constitutes a nontrivial extension of the Bai-Silverstein theorem (BST) (Ann Probab 32(1):553--605, 2004), which has strongly influenced the development of high-dimensional statistics, especially applications of random matrix theory to statistics. Recently there has been a growing realization that the assumption of uniform boundedness of the population covariance matrices in BST is not satisfied in some fields, such as economics, where the variances of principal components may diverge as the dimension tends to infinity. Therefore, in this paper, we aim to eliminate this obstacle to the application of BST. Our new CLT accommodates spiked eigenvalues, which may either be bounded or tend to infinity. A distinguishing feature of our result is that the variance in the new CLT depends on both the spiked eigenvalues and the bulk eigenvalues, with the dominant contribution determined by the divergence rate of the largest spiked eigenvalue. The new CLT for LSS is then applied to test the hypothesis that the population covariance matrix is the identity matrix or a generalized spiked model. The asymptotic distributions of the corrected likelihood ratio test statistic and the corrected Nagao's trace test statistic are derived under the alternative hypothesis. Moreover, we provide power comparisons between the two LSS-based tests and Roy's largest root test under certain hypotheses. In particular, we demonstrate that, except when the number of spikes equals 1, the LSS-based tests may exhibit higher power than Roy's largest root test in certain scenarios.
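For concreteness (standard, uncorrected forms only; the corrected statistics in the paper adjust the centering and scaling): a linear spectral statistic of the sample covariance matrix $S_n$ with eigenvalues $\lambda_1,\dots,\lambda_p$ is $\sum_{i=1}^{p} f(\lambda_i)$, and the two tests above correspond to the choices
\[
f_{\mathrm{LRT}}(x) = x - \log x - 1, \qquad L_n = \operatorname{tr} S_n - \log\det S_n - p,
\]
\[
f_{\mathrm{Nagao}}(x) = (x-1)^2, \qquad T_n = \operatorname{tr}\!\big[(S_n - I_p)^2\big],
\]
whose limiting behavior under the null and alternative hypotheses is governed by the CLT for LSS.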
Many real-world datasets are represented as tensors, i.e., multi-dimensional arrays of numerical values. Storing them without compression often requires substantial space, which grows exponentially with the order. While many tensor compression algorithms are available, most of them rely on strong assumptions about the data, such as its order, sparsity, rank, and smoothness. In this work, we propose TENSORCODEC, a lossy compression algorithm for general tensors that does not rely on strong assumptions about the input data. TENSORCODEC incorporates three key ideas. The first idea is Neural Tensor-Train Decomposition (NTTD), where we integrate a recurrent neural network into Tensor-Train Decomposition to enhance its expressive power and alleviate the limitations imposed by the low-rank assumption. The second idea is to fold the input tensor into a higher-order tensor to reduce the space required by NTTD. Finally, the mode indices of the input tensor are reordered to reveal patterns that NTTD can exploit for improved approximation. Our analysis and experiments on 8 real-world datasets demonstrate that TENSORCODEC is (a) Concise: it gives up to 7.38x more compact compression than the best competitor with similar reconstruction error, (b) Accurate: given the same budget for compressed size, it yields up to 3.33x more accurate reconstruction than the best competitor, and (c) Scalable: its empirical compression time is linear in the number of tensor entries, and it reconstructs each entry in logarithmic time. Our code and datasets are available at //github.com/kbrother/TensorCodec.
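A minimal sketch of the neural tensor-train idea (not the authors' TENSORCODEC implementation; the class name, hyperparameters, and architecture details here are hypothetical): each mode index is embedded, a GRU processes the index sequence, and its hidden states are mapped to tensor-train core slices whose chained product reconstructs a single entry.

import torch
import torch.nn as nn

class NeuralTT(nn.Module):
    # Toy neural tensor-train: maps an index tuple (i_1, ..., i_d) to a scalar entry.
    def __init__(self, dims, rank=4, hidden=32):
        super().__init__()
        self.rank = rank
        self.embed = nn.ModuleList([nn.Embedding(n, hidden) for n in dims])
        self.gru = nn.GRU(hidden, hidden, batch_first=True)
        self.to_core = nn.Linear(hidden, rank * rank)    # hidden state -> r x r core slice

    def forward(self, idx):                              # idx: (batch, d) integer indices
        x = torch.stack([emb(idx[:, j]) for j, emb in enumerate(self.embed)], dim=1)
        h, _ = self.gru(x)                               # (batch, d, hidden)
        cores = self.to_core(h).view(idx.size(0), idx.size(1), self.rank, self.rank)
        v = cores.new_ones(idx.size(0), 1, self.rank)    # boundary row vector
        for j in range(idx.size(1)):
            v = torch.bmm(v, cores[:, j])                # chain the per-mode core slices
        return v.squeeze(1).sum(dim=-1)                  # contract with an all-ones boundary

# Toy usage: reconstruct a batch of entries of a (100, 80, 60) tensor;
# in practice the model would be fit to observed entries by minimizing squared error.
model = NeuralTT(dims=(100, 80, 60))
idx = torch.stack([torch.randint(0, n, (16,)) for n in (100, 80, 60)], dim=1)
values = model(idx)                                      # reconstructed entries, shape (16,)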
In this paper, we develop a general theory for adaptive nonparametric estimation of the mean function of a non-stationary and nonlinear time series model using deep neural networks (DNNs). We first consider two types of DNN estimators, non-penalized and sparse-penalized DNN estimators, and establish their generalization error bounds for general non-stationary time series. We then derive minimax lower bounds for estimating mean functions belonging to a wide class of nonlinear autoregressive (AR) models that include nonlinear generalized additive AR, single index, and threshold AR models. Building upon the results, we show that the sparse-penalized DNN estimator is adaptive and attains the minimax optimal rates up to a poly-logarithmic factor for many nonlinear AR models. Through numerical simulations, we demonstrate the usefulness of the DNN methods for estimating nonlinear AR models with intrinsic low-dimensional structures and discontinuous or rough mean functions, which is consistent with our theory.
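As a rough illustration of the sparse-penalized estimator (a sketch only; the paper's penalty, network class, and theoretical conditions are more delicate), one can fit a feed-forward network to lagged values with an l1 penalty on its weights:

import torch
import torch.nn as nn

def fit_sparse_dnn_ar(y, lags=3, hidden=64, lam=1e-3, epochs=500):
    # Fit y_t ~ f(y_{t-1}, ..., y_{t-lags}) with an l1 (sparsity) penalty on the weights.
    X = torch.stack([y[i:len(y) - lags + i] for i in range(lags)], dim=1)  # lagged design
    target = y[lags:]
    net = nn.Sequential(nn.Linear(lags, hidden), nn.ReLU(),
                        nn.Linear(hidden, hidden), nn.ReLU(),
                        nn.Linear(hidden, 1))
    opt = torch.optim.Adam(net.parameters(), lr=1e-3)
    for _ in range(epochs):
        opt.zero_grad()
        pred = net(X).squeeze(-1)
        l1 = sum(p.abs().sum() for p in net.parameters())   # sparsity penalty
        loss = ((pred - target) ** 2).mean() + lam * l1
        loss.backward()
        opt.step()
    return net

# Toy usage on a simulated threshold AR(1) series.
torch.manual_seed(0)
y = torch.zeros(500)
for t in range(1, 500):
    y[t] = (0.7 * y[t - 1] if y[t - 1] > 0 else -0.4 * y[t - 1]) + 0.1 * torch.randn(1).item()
net = fit_sparse_dnn_ar(y)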
Inference principles are postulated within statistics; they are not usually derived from any underlying physical constraints on real-world observers. An exception to this rule arises in the context of partially observable information engines, where decision making can be based solely on physical arguments. An inference principle can be derived from minimization of the lower bound on average dissipation [Phys. Rev. Lett., 124(5), 050601], which is achievable with a quasi-static process, and thermodynamically rational decision strategies can be computed algorithmically with the resulting approach. Here, we use this approach to study an example of binary decision making under uncertainty that is very simple, yet just interesting enough to be non-trivial: observations are either entirely uninformative, or they carry complete certainty about the variable that needs to be known for successful energy harvesting. Solutions found algorithmically can be expressed in terms of parameterized soft partitions of the observable space. This allows for their interpretation, as well as for the analytical calculation of all quantities that characterize the decision problem and the thermodynamically rational strategies.
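To fix ideas on the last point (a generic definition only, not the specific parameterization used in the paper): a soft partition of the observable space $\mathcal{O}$ into decision cells $m = 1,\dots,K$ assigns to every observation $o$ conditional probabilities
\[
q(m \mid o) \ge 0, \qquad \sum_{m=1}^{K} q(m \mid o) = 1 \quad \text{for all } o \in \mathcal{O},
\]
with hard partitions recovered when each $q(\cdot \mid o)$ is degenerate; a parameterized family then describes these probabilities through a small number of tunable parameters.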
In this paper, we evaluate the portability of the SYCL programming model on some of the latest CPUs and GPUs from a wide range of vendors, utilizing the two main compilers: DPC++ and hipSYCL/OpenSYCL. Both compilers currently support GPUs from all three major vendors; we evaluate performance on the Intel(R) Data Center GPU Max 1100, the NVIDIA A100 GPU, and the AMD MI250X GPU. Support on CPUs is currently less established: DPC++ only supports x86 CPUs through OpenCL, whereas OpenSYCL has an OpenMP backend capable of targeting all modern CPUs. We benchmark the Intel Xeon Platinum 8360Y Processor (Ice Lake), the AMD EPYC 9V33X (Genoa-X), and the Ampere Altra platforms. We study a range of primarily bandwidth-bound applications implemented using the OPS and OP2 DSLs, evaluate different formulations in SYCL, and contrast their performance to "native" programming approaches where available (CUDA/HIP/OpenMP). On GPU architectures, SYCL on average even slightly outperforms the native approaches, while on CPUs it falls behind, highlighting a continued need for improving CPU performance. While SYCL does not solve all the challenges of performance portability (e.g., needing different algorithms on different hardware), it does provide a single programming model and ecosystem to target most current HPC architectures productively.
One of the fundamental results in quantum foundations is the Kochen-Specker (KS) theorem, which states that any theory whose predictions agree with quantum mechanics must be contextual, i.e., a quantum observation cannot be understood as revealing a pre-existing value. The theorem hinges on the existence of a mathematical object called a KS vector system. While many KS vector systems are known, the problem of finding the minimum KS vector system in three dimensions has remained stubbornly open for over 55 years. In this paper, we present a new method, based on a combination of a Boolean satisfiability (SAT) solver and a computer algebra system (CAS), to address this problem. Our approach shows that a KS system in three dimensions must contain at least 24 vectors. Our SAT+CAS method is over 35,000 times faster than the prior CAS-based searches at deriving the previously known lower bound of 22 vectors. More importantly, we provide the first computer-verifiable proof certificate of a lower bound in the KS problem, with a proof of size 41.6 TiB for order 23. The increase in efficiency stems from the fact that we are able to exploit the powerful combinatorial search-with-learning capabilities of SAT solvers together with the CAS-based isomorph-free exhaustive method of orderly generation of graphs. To the best of our knowledge, our work is the first application of a SAT+CAS method to a problem in the realm of quantum foundations and the first lower bound in the minimum Kochen-Specker problem with a computer-verifiable proof certificate.
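To illustrate the SAT side of the approach (a toy sketch, not the authors' encoding; the pysat dependency and the exact clause set are assumptions): a set of vectors in three dimensions fails to admit a Kochen-Specker {0,1}-coloring exactly when the formula below is unsatisfiable, with one Boolean variable per vector, a clause forcing at least one 1 in every orthogonal triple, and a clause forbidding two 1s on any orthogonal pair.

from itertools import combinations
from pysat.solvers import Glucose3   # assumed dependency: the python-sat package

def ks_colorable(orth_pairs, orth_triples):
    # Return True iff a {0,1}-coloring exists (so the vector set is NOT a KS system).
    solver = Glucose3()
    # Variable v+1 is True  <=>  vector v is assigned the value 1.
    for a, b in orth_pairs:                     # orthogonal vectors: at most one gets 1
        solver.add_clause([-(a + 1), -(b + 1)])
    for triple in orth_triples:                 # orthogonal triple (basis): at least one gets 1
        solver.add_clause([v + 1 for v in triple])
    return solver.solve()

# Toy usage: three mutually orthogonal vectors forming a single basis.
pairs = list(combinations(range(3), 2))
print(ks_colorable(pairs, [(0, 1, 2)]))         # True: this tiny set is trivially colorable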
In this paper, we explore zero- and few-shot generalization for fact verification (FV), which aims to generalize an FV model trained on well-resourced domains (e.g., Wikipedia) to low-resourced domains that lack human annotations. To this end, we first construct a benchmark dataset collection that contains 11 FV datasets representing 6 domains. We conduct an empirical analysis of generalization across these FV datasets, finding that current models generalize poorly. Our analysis reveals that several factors affect generalization, including dataset size, length of evidence, and the type of claims. Finally, we show that two directions of work improve generalization: 1) incorporating domain knowledge via pretraining on specialized domains, and 2) automatically generating training data via claim generation.
In this paper, we consider experiments involving both discrete and continuous factors under general parametric statistical models. To search for optimal designs under the D-criterion, we propose a new algorithm, called the ForLion algorithm, which performs an exhaustive search in a design space with discrete and continuous factors while maintaining high efficiency and a small number of design points. Its optimality is guaranteed by the general equivalence theorem. We show its advantage using a real-life experiment under multinomial logistic models, and we further specialize the algorithm to generalized linear models, where model-specific formulae and iterative steps yield improved efficiency.
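For intuition on the D-criterion and the equivalence-theorem check (shown here for a simple linear model; the paper works with multinomial logistic and generalized linear models, where the information matrix also depends on unknown parameters):

import numpy as np

def d_criterion(F, w):
    # log det of the information matrix M = sum_i w_i f(x_i) f(x_i)^T.
    M = F.T @ (w[:, None] * F)
    return np.linalg.slogdet(M)[1]

def sensitivity(F, w, f_x):
    # General-equivalence-theorem check: d(x, xi) = f(x)^T M^{-1} f(x) <= p at a D-optimal design.
    M = F.T @ (w[:, None] * F)
    return f_x @ np.linalg.solve(M, f_x)

# Toy design: intercept + one continuous factor on [-1, 1] + one binary factor.
pts = np.array([[-1.0, 0.0], [1.0, 0.0], [-1.0, 1.0], [1.0, 1.0]])
F = np.column_stack([np.ones(len(pts)), pts])          # model matrix rows f(x) = (1, x1, x2)
w = np.full(len(pts), 0.25)                            # equal design weights
print(d_criterion(F, w))
print(sensitivity(F, w, np.array([1.0, -1.0, 1.0])))   # should not exceed p = 3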
In this paper, we explore a continuous modeling approach for deep-learning-based speech enhancement, focusing on the denoising process. We use a state variable to indicate the stage of the denoising process: the starting state is noisy speech and the ending state is clean speech, and the noise component in the state variable decreases as the state index advances, until it reaches zero. During training, a UNet-like neural network learns to estimate every state variable sampled from the continuous denoising process. At test time, we feed the network a controlling factor, ranging from zero to one, as an embedding, allowing us to control the level of noise reduction. This approach enables controllable speech enhancement and is adaptable to various application scenarios. Experimental results indicate that preserving a small amount of noise in the clean target benefits speech enhancement, as evidenced by improvements in both objective speech measures and automatic speech recognition performance.
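A minimal sketch of how such a training state might be constructed (an illustration only; the network architecture, sampling scheme, and embedding used in the paper are not reproduced here): with clean speech s and additive noise n, the state at index t in [0, 1] keeps a (1 - t) fraction of the noise, so t = 0 is the noisy mixture and t = 1 is clean speech.

import numpy as np

def denoising_state(clean, noise, t):
    # State variable at index t in [0, 1]: residual noise shrinks linearly to zero.
    return clean + (1.0 - t) * noise

# Toy example: a sine "utterance" corrupted by white noise.
rng = np.random.default_rng(0)
clean = np.sin(2 * np.pi * 5 * np.linspace(0, 1, 16000))
noise = 0.3 * rng.standard_normal(16000)
t = rng.uniform()                           # sampled state index, also used as a conditioning input
target = denoising_state(clean, noise, t)   # regression target for the UNet-like model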
In this paper, we combine the Smolyak technique for multi-dimensional interpolation with the Filon-Clenshaw-Curtis (FCC) rule for one-dimensional oscillatory integration to obtain a new Filon-Clenshaw-Curtis-Smolyak (FCCS) rule for oscillatory integrals with linear phase over the $d$-dimensional cube $[-1,1]^d$. By combining stability and convergence estimates for the FCC rule with error estimates for the Smolyak interpolation operator, we obtain an error estimate for the FCCS rule, consisting of a Smolyak-type error estimate multiplied by a factor that decreases like $\mathcal{O}(k^{-\tilde{d}})$, where $k$ is the wavenumber and $\tilde{d}$ is the number of oscillatory dimensions. If all dimensions are oscillatory, a higher negative power of $k$ appears in the estimate. As an application, we consider the forward problem of uncertainty quantification (UQ) for a one-space-dimensional Helmholtz problem with wavenumber $k$ and a random heterogeneous refractive index, depending in an affine way on $d$ i.i.d. uniform random variables. After applying a classical hybrid numerical-asymptotic approximation, expectations of functionals of the solution of this problem can be formulated as a sum of oscillatory integrals over $[-1,1]^d$, which we compute using the FCCS rule. We give numerical results for the FCCS rule and the UQ algorithm showing that accuracy improves when both $k$ and the order of the rule increase. We also give results for dimension-adaptive sparse grid FCCS quadrature showing its efficiency as dimension increases.
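For orientation (standard definitions, stated in simplified form): in one dimension the FCC rule replaces $f$ by its degree-$N$ interpolant $Q_N f$ at the Clenshaw-Curtis points $\cos(j\pi/N)$, $j=0,\dots,N$, and integrates the oscillatory factor exactly,
\[
\int_{-1}^{1} f(x)\, e^{\mathrm{i}kx}\, \mathrm{d}x \;\approx\; \int_{-1}^{1} (Q_N f)(x)\, e^{\mathrm{i}kx}\, \mathrm{d}x,
\]
and, roughly speaking, the FCCS rule combines such one-dimensional rules across the oscillatory dimensions (with standard Clenshaw-Curtis-type rules in the remaining dimensions) via the Smolyak construction over $[-1,1]^d$.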
Recently, graph neural networks have been gaining a lot of attention for simulating dynamical systems, since their inductive nature leads to zero-shot generalizability. Similarly, physics-informed inductive biases in deep-learning frameworks have been shown to give superior performance in learning the dynamics of physical systems. There is a growing volume of literature that attempts to combine these two approaches. Here, we evaluate the performance of thirteen different graph neural networks, namely, Hamiltonian and Lagrangian graph neural networks, graph neural ODEs, and their variants with explicit constraints and different architectures. We briefly explain the theoretical formulation, highlighting the similarities and differences in the inductive biases and graph architectures of these systems. We evaluate these models on spring, pendulum, gravitational, and 3D deformable solid systems to compare their performance in terms of rollout error, conserved quantities such as energy and momentum, and generalizability to unseen system sizes. Our study demonstrates that GNNs with additional inductive biases, such as explicit constraints and decoupling of kinetic and potential energies, exhibit significantly enhanced performance. Further, all the physics-informed GNNs exhibit zero-shot generalizability to system sizes an order of magnitude larger than the training system, thus providing a promising route to simulate large-scale realistic systems.
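As a pointer to the kind of inductive bias involved (a minimal, non-graph Hamiltonian-network sketch, not one of the thirteen benchmarked models; all names are illustrative): a network predicts a scalar Hamiltonian H(q, p), and the dynamics are obtained from Hamilton's equations via automatic differentiation.

import torch
import torch.nn as nn

class HamiltonianNet(nn.Module):
    # Learn a scalar H(q, p); dynamics follow dq/dt = dH/dp, dp/dt = -dH/dq.
    def __init__(self, dim, hidden=64):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(2 * dim, hidden), nn.Tanh(),
                                 nn.Linear(hidden, 1))

    def time_derivatives(self, q, p):
        q = q.requires_grad_(True)
        p = p.requires_grad_(True)
        H = self.net(torch.cat([q, p], dim=-1)).sum()
        # create_graph=True allows training through the predicted derivatives.
        dHdq, dHdp = torch.autograd.grad(H, (q, p), create_graph=True)
        return dHdp, -dHdq                     # (dq/dt, dp/dt)

# Toy rollout for a 1-D system with an untrained network.
model = HamiltonianNet(dim=1)
q, p = torch.tensor([[1.0]]), torch.tensor([[0.0]])
dt = 0.01
for _ in range(10):
    dq, dp = model.time_derivatives(q, p)
    q = (q + dt * dq).detach()                 # simple explicit Euler step
    p = (p + dt * dp).detach()                 # (symplectic integrators are preferable in practice)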