Noise is usually regarded as adversarial to extract the effective dynamics from time series, such that the conventional data-driven approaches usually aim at learning the dynamics by mitigating the noisy effect. However, noise can have a functional role of driving transitions between stable states underlying many natural and engineered stochastic dynamics. To capture such stochastic transitions from data, we find that leveraging a machine learning model, reservoir computing as a type of recurrent neural network, can learn noise-induced transitions. We develop a concise training protocol for tuning hyperparameters, with a focus on a pivotal hyperparameter controlling the time scale of the reservoir dynamics. The trained model generates accurate statistics of transition time and the number of transitions. The approach is applicable to a wide class of systems, including a bistable system under a double-well potential, with either white noise or colored noise. It is also aware of the asymmetry of the double-well potential, the rotational dynamics caused by non-detailed balance, and transitions in multi-stable systems. For the experimental data of protein folding, it learns the transition time between folded states, providing a possibility of predicting transition statistics from a small dataset. The results demonstrate the capability of machine-learning methods in capturing noise-induced phenomena.
The goal of this note is to explain the reconciliation problem for continuous-variable quantum key distribution protocols with a discrete modulation. Such modulation formats are attractive since they significantly simplify experimental implementations compared to protocols with a Gaussian modulation. Previous security proofs that relied crucially on the Gaussian distribution of the input states are rendered inapplicable, and new proofs based on the entropy accumulation theorem have emerged. Unfortunately, these proofs are not compatible with existing reconciliation procedures, and necessitate a reevaluation of the reconciliation problem. We argue that this problem is nontrivial and deserves further attention. In particular, assuming it can be solved with optimal efficiency leads to overly optimistic predictions for the performance of the key distribution protocol, in particular for long distances.
We address speech enhancement based on variational autoencoders, which involves learning a speech prior distribution in the time-frequency (TF) domain. A zero-mean complex-valued Gaussian distribution is usually assumed for the generative model, where the speech information is encoded in the variance as a function of a latent variable. In contrast to this commonly used approach, we propose a weighted variance generative model, where the contribution of each spectrogram time-frame in parameter learning is weighted. We impose a Gamma prior distribution on the weights, which would effectively lead to a Student's t-distribution instead of Gaussian for speech generative modeling. We develop efficient training and speech enhancement algorithms based on the proposed generative model. Our experimental results on spectrogram auto-encoding and speech enhancement demonstrate the effectiveness and robustness of the proposed approach compared to the standard unweighted variance model.
Simulation-based inference has been popular for amortized Bayesian computation. It is typical to have more than one posterior approximation, from different inference algorithms, different architectures, or simply the randomness of initialization and stochastic gradients. With a provable asymptotic guarantee, we present a general stacking framework to make use of all available posterior approximations. Our stacking method is able to combine densities, simulation draws, confidence intervals, and moments, and address the overall precision, calibration, coverage, and bias at the same time. We illustrate our method on several benchmark simulations and a challenging cosmological inference task.
We present a streamlined and simplified exponential lower bound on the length of proofs in intuitionistic implicational logic, adapted to Gordeev and Haeusler's dag-like natural deduction.
Quantum adversarial machine learning is an emerging field that studies the vulnerability of quantum learning systems against adversarial perturbations and develops possible defense strategies. Quantum universal adversarial perturbations are small perturbations, which can make different input samples into adversarial examples that may deceive a given quantum classifier. This is a field that was rarely looked into but worthwhile investigating because universal perturbations might simplify malicious attacks to a large extent, causing unexpected devastation to quantum machine learning models. In this paper, we take a step forward and explore the quantum universal perturbations in the context of heterogeneous classification tasks. In particular, we find that quantum classifiers that achieve almost state-of-the-art accuracy on two different classification tasks can be both conclusively deceived by one carefully-crafted universal perturbation. This result is explicitly demonstrated with well-designed quantum continual learning models with elastic weight consolidation method to avoid catastrophic forgetting, as well as real-life heterogeneous datasets from hand-written digits and medical MRI images. Our results provide a simple and efficient way to generate universal perturbations on heterogeneous classification tasks and thus would provide valuable guidance for future quantum learning technologies.
Model averaging is a useful and robust method for dealing with model uncertainty in statistical analysis. Often, it is useful to consider data subset selection at the same time, in which model selection criteria are used to compare models across different subsets of the data. Two different criteria have been proposed in the literature for how the data subsets should be weighted. We compare the two criteria closely in a unified treatment based on the Kullback-Leibler divergence, and conclude that one of them is subtly flawed and will tend to yield larger uncertainties due to loss of information. Analytical and numerical examples are provided.
Feature attribution is a fundamental task in both machine learning and data analysis, which involves determining the contribution of individual features or variables to a model's output. This process helps identify the most important features for predicting an outcome. The history of feature attribution methods can be traced back to General Additive Models (GAMs), which extend linear regression models by incorporating non-linear relationships between dependent and independent variables. In recent years, gradient-based methods and surrogate models have been applied to unravel complex Artificial Intelligence (AI) systems, but these methods have limitations. GAMs tend to achieve lower accuracy, gradient-based methods can be difficult to interpret, and surrogate models often suffer from stability and fidelity issues. Furthermore, most existing methods do not consider users' contexts, which can significantly influence their preferences. To address these limitations and advance the current state-of-the-art, we define a novel feature attribution framework called Context-Aware Feature Attribution Through Argumentation (CA-FATA). Our framework harnesses the power of argumentation by treating each feature as an argument that can either support, attack or neutralize a prediction. Additionally, CA-FATA formulates feature attribution as an argumentation procedure, and each computation has explicit semantics, which makes it inherently interpretable. CA-FATA also easily integrates side information, such as users' contexts, resulting in more accurate predictions.
Domain-specific terminology extraction is an important task in text analysis. A term in a corpus is said to be "bursty" when its occurrences are concentrated in few out of many documents. Being content rich, bursty terms are highly suited for subject matter characterization, and serve as natural candidates for identifying with technical terminology. Multiple measures of term burstiness have been proposed in the literature. However, the statistical significance testing paradigm has remained underexplored in text analysis, including in relation to term burstiness. To test these waters, we propose as our main contribution a multinomial language model-based exact test of statistical significance for term burstiness. Due to its prohibitive computational cost, we advance a heuristic formula designed to serve as a proxy for test P-values. As a complementary theoretical contribution, we derive a previously unreported relationship connecting the inverse document frequency and inverse collection frequency (two foundational quantities in text analysis) under the multinomial language model. The relation is used in the evaluation of our heuristic. Using the GENIA Term corpus benchmark, we compare our approach against established methods, demonstrating our heuristic's potential in identifying domain-specific technical terms. We hope this demonstration of statistical significance testing in text analysis serves as a springboard for future research.
Boundary value problems involving elliptic PDEs such as the Laplace and the Helmholtz equations are ubiquitous in physics and engineering. Many such problems have alternative formulations as integral equations that are mathematically more tractable than their PDE counterparts. However, the integral equation formulation poses a challenge in solving the dense linear systems that arise upon discretization. In cases where iterative methods converge rapidly, existing methods that draw on fast summation schemes such as the Fast Multipole Method are highly efficient and well established. More recently, linear complexity direct solvers that sidestep convergence issues by directly computing an invertible factorization have been developed. However, storage and compute costs are high, which limits their ability to solve large-scale problems in practice. In this work, we introduce a distributed-memory parallel algorithm based on an existing direct solver named ``strong recursive skeletonization factorization.'' The analysis of its parallel scalability applies generally to a class of existing methods that exploit the so-called strong admissibility. Specifically, we apply low-rank compression to certain off-diagonal matrix blocks in a way that minimizes data movement. Given a compression tolerance, our method constructs an approximate factorization of a discretized integral operator (dense matrix), which can be used to solve linear systems efficiently in parallel. Compared to iterative algorithms, our method is particularly suitable for problems involving ill-conditioned matrices or multiple right-hand sides. Large-scale numerical experiments are presented to demonstrate the performance of our implementation using the Julia language.
Stochastic sampling techniques are ubiquitous in real-time rendering, where performance constraints force the use of low sample counts, leading to noisy intermediate results. To remove this noise, the post-processing step of temporal and spatial denoising is an integral part of the real-time graphics pipeline. The main insight presented in this paper is that we can optimize the samples used in stochastic sampling such that the post-processing error is minimized. The core of our method is an analytical loss function which measures post-filtering error for a class of integrands - multidimensional Heaviside functions. These integrands are an approximation of the discontinuous functions commonly found in rendering. Our analysis applies to arbitrary spatial and spatiotemporal filters, scalar and vector sample values, and uniform and non-uniform probability distributions. We show that the spectrum of Monte Carlo noise resulting from our sampling method is adapted to the shape of the filter, resulting in less noisy final images. We demonstrate improvements over state-of-the-art sampling methods in three representative rendering tasks: ambient occlusion, volumetric ray-marching, and color image dithering. Common use noise textures, and noise generation code is available at //github.com/electronicarts/fastnoise.