Quantum computing has recently emerged as a transformative technology. Yet, its promised advantages rely on efficiently translating quantum operations into viable physical realizations. In this work, we use generative machine learning models, specifically denoising diffusion models (DMs), to facilitate this transformation. Leveraging text-conditioning, we steer the model to produce desired quantum operations within gate-based quantum circuits. Notably, DMs allow to sidestep during training the exponential overhead inherent in the classical simulation of quantum dynamics -- a consistent bottleneck in preceding ML techniques. We demonstrate the model's capabilities across two tasks: entanglement generation and unitary compilation. The model excels at generating new circuits and supports typical DM extensions such as masking and editing to, for instance, align the circuit generation to the constraints of the targeted quantum device. Given their flexibility and generalization abilities, we envision DMs as pivotal in quantum circuit synthesis, enhancing both practical applications but also insights into theoretical quantum computation.
We investigate the proof complexity of systems based on positive branching programs, i.e. non-deterministic branching programs (NBPs) where, for any 0-transition between two nodes, there is also a 1-transition. Positive NBPs compute monotone Boolean functions, just like negation-free circuits or formulas, but constitute a positive version of (non-uniform) NL, rather than P or NC1, respectively. The proof complexity of NBPs was investigated in previous work by Buss, Das and Knop, using extension variables to represent the dag-structure, over a language of (non-deterministic) decision trees, yielding the system eLNDT. Our system eLNDT+ is obtained by restricting their systems to a positive syntax, similarly to how the 'monotone sequent calculus' MLK is obtained from the usual sequent calculus LK by restricting to negation-free formulas. Our main result is that eLNDT+ polynomially simulates eLNDT over positive sequents. Our proof method is inspired by a similar result for MLK by Atserias, Galesi and Pudl\'ak, that was recently improved to a bona fide polynomial simulation via works of Je\v{r}\'abek and Buss, Kabanets, Kolokolova and Kouck\'y. Along the way we formalise several properties of counting functions within eLNDT+ by polynomial-size proofs and, as a case study, give explicit polynomial-size poofs of the propositional pigeonhole principle.
We address the problem of the best uniform approximation of a continuous function on a convex domain. The approximation is by linear combinations of a finite system of functions (not necessarily Chebyshev) under arbitrary linear constraints. By modifying the concept of alternance and of the Remez iterative procedure we present a method, which demonstrates its efficiency in numerical problems. The linear rate of convergence is proved under some favourable assumptions. A special attention is paid to systems of complex exponents, Gaussian functions, lacunar algebraic and trigonometric polynomials. Applications to signal processing, linear ODE, switching dynamical systems, and to Markov-Bernstein type inequalities are considered.
Collecting large quantities of high-quality data can be prohibitively expensive or impractical, and a bottleneck in machine learning. One may instead augment a small set of $n$ data points from the target distribution with data from more accessible sources, e.g. data collected under different circumstances or synthesized by generative models. We refer to such data as `surrogate data.' We introduce a weighted empirical risk minimization (ERM) approach for integrating surrogate data into training. We analyze mathematically this method under several classical statistical models, and validate our findings empirically on datasets from different domains. Our main findings are: $(i)$ Integrating surrogate data can significantly reduce the test error on the original distribution. Surprisingly, this can happen even when the surrogate data is unrelated to the original ones. We trace back this behavior to the classical Stein's paradox. $(ii)$ In order to reap the benefit of surrogate data, it is crucial to use optimally weighted ERM. $(iii)$ The test error of models trained on mixtures of real and surrogate data is approximately described by a scaling law. This scaling law can be used to predict the optimal weighting scheme, and to choose the amount of surrogate data to add.
An essential problem in statistics and machine learning is the estimation of expectations involving PDFs with intractable normalizing constants. The self-normalized importance sampling (SNIS) estimator, which normalizes the IS weights, has become the standard approach due to its simplicity. However, the SNIS has been shown to exhibit high variance in challenging estimation problems, e.g, involving rare events or posterior predictive distributions in Bayesian statistics. Further, most of the state-of-the-art adaptive importance sampling (AIS) methods adapt the proposal as if the weights had not been normalized. In this paper, we propose a framework that considers the original task as estimation of a ratio of two integrals. In our new formulation, we obtain samples from a joint proposal distribution in an extended space, with two of its marginals playing the role of proposals used to estimate each integral. Importantly, the framework allows us to induce and control a dependency between both estimators. We propose a construction of the joint proposal that decomposes in two (multivariate) marginals and a coupling. This leads to a two-stage framework suitable to be integrated with existing or new AIS and/or variational inference (VI) algorithms. The marginals are adapted in the first stage, while the coupling can be chosen and adapted in the second stage. We show in several examples the benefits of the proposed methodology, including an application to Bayesian prediction with misspecified models.
User experience on mobile devices is constrained by limited battery capacity and processing power, but 6G technology advancements are diving rapidly into mobile technical evolution. Mobile edge computing (MEC) offers a solution, offloading computationally intensive tasks to edge cloud servers, reducing battery drain compared to local processing. The upcoming integrated sensing and communication in mobile communication may improve the trajectory prediction and processing delays. This study proposes a greedy resource allocation optimization strategy for multi-user networks to minimize aggregate energy usage. Numerical results show potential improvement at 33\% for every 1000 iteration. Addressing prediction model division and velocity accuracy issues is crucial for better results. A plan for further improvement and achieving objectives is outlined for the upcoming work phase.
We propose a Fast Fourier Transform based Periodic Interpolation Method (FFT-PIM), a flexible and computationally efficient approach for computing the scalar potential given by a superposition sum in a unit cell of an infinitely periodic array. Under the same umbrella, FFT-PIM allows computing the potential for 1D, 2D, and 3D periodicities for dynamic and static problems, including problems with and without a periodic phase shift. The computational complexity of the FFT-PIM is of $O(N \log N)$ for $N$ spatially coinciding sources and observer points. The FFT-PIM uses rapidly converging series representations of the Green's function serving as a kernel in the superposition sum. Based on these representations, the FFT-PIM splits the potential into its near-zone component, which includes a small number of images surrounding the unit cell of interest, and far-zone component, which includes the rest of an infinite number of images. The far-zone component is evaluated by projecting the non-uniform sources onto a sparse uniform grid, performing superposition sums on this sparse grid, and interpolating the potential from the uniform grid to the non-uniform observation points. The near-zone component is evaluated using an FFT-based method, which is adapted to efficiently handle non-uniform source-observer distributions within the periodic unit cell. The FFT-PIM can be used for a broad range of applications, such as periodic problems involving integral equations in computational electromagnetic and acoustic, micromagnetic solvers, and density functional theory solvers.
In the study of extremes, the presence of asymptotic independence signifies that extreme events across multiple variables are probably less likely to occur together. Although well-understood in a bivariate context, the concept remains relatively unexplored when addressing the nuances of joint occurrence of extremes in higher dimensions. In this paper, we propose a notion of mutual asymptotic independence to capture the behavior of joint extremes in dimensions larger than two and contrast it with the classical notion of (pairwise) asymptotic independence. Furthermore, we define $k$-wise asymptotic independence which lies in between pairwise and mutual asymptotic independence. The concepts are compared using examples of Archimedean, Gaussian and Marshall-Olkin copulas among others. Notably, for the popular Gaussian copula, we provide explicit conditions on the correlation matrix for mutual asymptotic independence to hold; moreover, we are able to compute exact tail orders for various tail events.
This paper considers both the least squares and quasi-maximum likelihood estimation for the recently proposed scalable ARMA model, a parametric infinite-order vector AR model, and their asymptotic normality is also established. It makes feasible the inference on this computationally efficient model, especially for economic and financial time series. An efficient block coordinate descent algorithm is further introduced to search for estimates, and a Bayesian information criterion with selection consistency is suggested for model selection. Simulation experiments are conducted to illustrate their finite sample performance, and a real application on six macroeconomic indicators illustrates the usefulness of the proposed methodology.
This paper studies the influence of probabilism and non-determinism on some quantitative aspect X of the execution of a system modeled as a Markov decision process (MDP). To this end, the novel notion of demonic variance is introduced: For a random variable X in an MDP M, it is defined as 1/2 times the maximal expected squared distance of the values of X in two independent execution of M in which also the non-deterministic choices are resolved independently by two distinct schedulers. It is shown that the demonic variance is between 1 and 2 times as large as the maximal variance of X in M that can be achieved by a single scheduler. This allows defining a non-determinism score for M and X measuring how strongly the difference of X in two executions of M can be influenced by the non-deterministic choices. Properties of MDPs M with extremal values of the non-determinism score are established. Further, the algorithmic problems of computing the maximal variance and the demonic variance are investigated for two random variables, namely weighted reachability and accumulated rewards. In the process, also the structure of schedulers maximizing the variance and of scheduler pairs realizing the demonic variance is analyzed.
The remarkable practical success of deep learning has revealed some major surprises from a theoretical perspective. In particular, simple gradient methods easily find near-optimal solutions to non-convex optimization problems, and despite giving a near-perfect fit to training data without any explicit effort to control model complexity, these methods exhibit excellent predictive accuracy. We conjecture that specific principles underlie these phenomena: that overparametrization allows gradient methods to find interpolating solutions, that these methods implicitly impose regularization, and that overparametrization leads to benign overfitting. We survey recent theoretical progress that provides examples illustrating these principles in simpler settings. We first review classical uniform convergence results and why they fall short of explaining aspects of the behavior of deep learning methods. We give examples of implicit regularization in simple settings, where gradient methods lead to minimal norm functions that perfectly fit the training data. Then we review prediction methods that exhibit benign overfitting, focusing on regression problems with quadratic loss. For these methods, we can decompose the prediction rule into a simple component that is useful for prediction and a spiky component that is useful for overfitting but, in a favorable setting, does not harm prediction accuracy. We focus specifically on the linear regime for neural networks, where the network can be approximated by a linear model. In this regime, we demonstrate the success of gradient flow, and we consider benign overfitting with two-layer networks, giving an exact asymptotic analysis that precisely demonstrates the impact of overparametrization. We conclude by highlighting the key challenges that arise in extending these insights to realistic deep learning settings.