This paper investigates the performance of a likelihood ratio test in combination with a polynomial subspace projection approach to detect weak transient signals in broadband array data. Based on previous empirical evidence that a likelihood ratio test is advantageously applied in a lower-dimensional subspace, we present analysis that highlights how the polynomial subspace projection whitens a crucial part of the signals, enabling a detector to operate with a shortened temporal window. This reduction in temporal correlation, together with a spatial compaction of the data, also leads to both computational and numerical advantages over a likelihood ratio test that is directly applied to the array data. The results of our analysis are illustrated by examples and simulations.
The meteoric rise in power and popularity of machine learning models dependent on valuable training data has reignited a basic tension between the power of running a program locally and the risk of exposing details of that program to the user. At the same time, fundamental properties of quantum states offer new solutions to data and program security that can require strikingly few quantum resources to exploit, and offer advantages outside of mere computational run time. In this work, we demonstrate such a solution with quantum one-time tokens. A quantum one-time token is a quantum state that permits a certain program to be evaluated exactly once. One-time security guarantees, roughly, that the token cannot be used to evaluate the program more than once. We propose a scheme for building quantum one-time tokens for any randomized classical program, which include generative AI models. We prove that the scheme satisfies an interesting definition of one-time security as long as outputs of the classical algorithm have high enough min-entropy, in a black box model. Importantly, the classical program being protected does not need to be implemented coherently on a quantum computer. In fact, the size and complexity of the quantum one-time token is independent of the program being protected, and additional quantum resources serve only to increase the security of the protocol. Due to this flexibility in adjusting the security, we believe that our proposal is parsimonious enough to serve as a promising candidate for a near-term useful demonstration of quantum computing in either the NISQ or early fault tolerant regime.
We study the computational complexity of computing Bayes-Nash equilibria in first-price auctions with discrete value distributions and discrete bidding space, under general subjective beliefs. It is known that such auctions do not always have pure equilibria. In this paper, we prove that the problem of deciding their existence is NP-complete, even for approximate equilibria. On the other hand, it can be shown that mixed equilibria are guaranteed to exist; however, their computational complexity has not been studied before. We establish the PPAD-completeness of computing a mixed equilibrium and we complement this by an efficient algorithm for finding symmetric approximate equilibria in the special case of iid priors. En route to these results, we develop a computational equivalence framework between continuous and discrete first-price auctions, which can be of independent interest, and which allows us to transfer existing positive and negative results from one setting to the other. Finally, we show that correlated equilibria of the auction can be computed in polynomial time.
This paper delves into the continuous post-training optimization methods for small language models, and proposes a continuous post-training alignment data construction method for small language models. The core of this method is based on the data guidance of large models, optimizing the diversity and accuracy of alignment data. In addition, to verify the effectiveness of the methods in this paper, we used Qwen2-0.5B-Instruct model as the baseline model for small language models, using the alignment dataset constructed by our proposed method, we trained and compared several groups of experiments, including SFT (Supervised Fine Tuning) post-training experiment and KTO (Kahneman Tversky optimization) post-training experiment, as well as SFT-KTO two-stage post-training experiment and model weight fusion experiment. Finally, we evaluated and analyzed the performance of post-training models, and confirmed that the continuous post-training optimization method proposed by us can significantly improve the performance of small language models.
This paper introduces the first generalization and adaptation benchmark using machine learning for evaluating out-of-distribution performance of electromyography (EMG) classification algorithms. The ability of an EMG classifier to handle inputs drawn from a different distribution than the training distribution is critical for real-world deployment as a control interface. By predicting the user's intended gesture using EMG signals, we can create a wearable solution to control assistive technologies, such as computers, prosthetics, and mobile manipulator robots. This new out-of-distribution benchmark consists of two major tasks that have utility for building robust and adaptable control interfaces: 1) intersubject classification and 2) adaptation using train-test splits for time-series. This benchmark spans nine datasets--the largest collection of EMG datasets in a benchmark. Among these, a new dataset is introduced, featuring a novel, easy-to-wear high-density EMG wearable for data collection. The lack of open-source benchmarks has made comparing accuracy results between papers challenging for the EMG research community. This new benchmark provides researchers with a valuable resource for analyzing practical measures of out-of-distribution performance for EMG datasets. Our code and data from our new dataset can be found at emgbench.github.io.
Low-rank adaptation (LoRA) offers an efficient alternative to full-weight adaptation in federated fine-tuning of language models, significantly reducing computational costs. By adjusting ranks for each client, federated LoRA enables flexible resource allocation. However, we observe that heterogeneous ranks among clients lead to unstable performance. Our analysis attributes this instability to the conventional zero-padding aggregation strategy, which dilutes information from high-rank clients during model aggregation. To address this issue, we propose a replication-based padding strategy that better retains valuable information from clients with high-quality data. Empirically, this approach accelerates convergence and enhances the global model's predictive performance.
This study investigates the potential of WebAssembly as a more secure and efficient alternative to Linux containers for executing untrusted code in cloud computing with Kubernetes. Specifically, it evaluates the security and performance implications of this shift. Security analyses demonstrate that both Linux containers and WebAssembly have attack surfaces when executing untrusted code, but WebAssembly presents a reduced attack surface due to an additional layer of isolation. The performance analysis further reveals that while WebAssembly introduces overhead, particularly in startup times, it could be negligible in long-running computations. However, WebAssembly enhances the core principle of containerization, offering better security through isolation and platform-agnostic portability compared to Linux containers. This research demonstrates that WebAssembly is not a silver bullet for all security concerns or performance requirements in a Kubernetes environment, but typical attacks are less likely to succeed and the performance loss is relatively small.
This paper proposes a distributed pseudo-likelihood method (DPL) to conveniently identify the community structure of large-scale networks. Specifically, we first propose a block-wise splitting method to divide large-scale network data into several subnetworks and distribute them among multiple workers. For simplicity, we assume the classical stochastic block model. Then, the DPL algorithm is iteratively implemented for the distributed optimization of the sum of the local pseudo-likelihood functions. At each iteration, the worker updates its local community labels and communicates with the master. The master then broadcasts the combined estimator to each worker for the new iterative steps. Based on the distributed system, DPL significantly reduces the computational complexity of the traditional pseudo-likelihood method using a single machine. Furthermore, to ensure statistical accuracy, we theoretically discuss the requirements of the worker sample size. Moreover, we extend the DPL method to estimate degree-corrected stochastic block models. The superior performance of the proposed distributed algorithm is demonstrated through extensive numerical studies and real data analysis.
Continual learning with deep neural networks presents challenges distinct from both the fixed-dataset and convex continual learning regimes. One such challenge is plasticity loss, wherein a neural network trained in an online fashion displays a degraded ability to fit new tasks. This problem has been extensively studied in both supervised learning and off-policy reinforcement learning (RL), where a number of remedies have been proposed. Still, plasticity loss has received less attention in the on-policy deep RL setting. Here we perform an extensive set of experiments examining plasticity loss and a variety of mitigation methods in on-policy deep RL. We demonstrate that plasticity loss is pervasive under domain shift in this regime, and that a number of methods developed to resolve it in other settings fail, sometimes even performing worse than applying no intervention at all. In contrast, we find that a class of ``regenerative'' methods are able to consistently mitigate plasticity loss in a variety of contexts, including in gridworld tasks and more challenging environments like Montezuma's Revenge and ProcGen.
This paper presents a simple yet efficient method for statistical inference of tensor linear forms using incomplete and noisy observations. Under the Tucker low-rank tensor model and the missing-at-random assumption, we utilize an appropriate initial estimate along with a debiasing technique followed by a one-step power iteration to construct an asymptotically normal test statistic. This method is suitable for various statistical inference tasks, including constructing confidence intervals, inference under heteroskedastic and sub-exponential noise, and simultaneous testing. We demonstrate that the estimator achieves the Cram\'er-Rao lower bound on Riemannian manifolds, indicating its optimality in uncertainty quantification. We comprehensively examine the statistical-to-computational gaps and investigate the impact of initialization on the minimal conditions regarding sample size and signal-to-noise ratio required for accurate inference. Our findings show that with independent initialization, statistically optimal sample sizes and signal-to-noise ratios are sufficient for accurate inference. Conversely, if only dependent initialization is available, computationally optimal sample sizes and signal-to-noise ratio conditions still guarantee asymptotic normality without the need for data-splitting. We present the phase transition between computational and statistical limits. Numerical simulation results align with the theoretical findings.
We report assumption-free bounds for any contrast between the probabilities of the potential outcome under exposure and non-exposure when the confounders are missing not at random. We assume that the missingness mechanism is outcome-independent. We also report a sensitivity analysis method to complement our bounds.