Complexity theory typically focuses on the difficulty of solving computational problems with classical inputs and outputs, even when a quantum computer is used. In the quantum world, it is natural to consider a different notion of complexity, namely the complexity of synthesizing quantum states. We investigate a state-synthesis counterpart of the class NP, referred to as stateQMA, which is concerned with preparing certain quantum states through a polynomial-time quantum verifier with the aid of a single quantum message from an all-powerful but untrusted prover. This is a subclass of the class stateQIP, recently introduced by Rosenthal and Yuen (ITCS 2022), which permits polynomially many interactions between the prover and the verifier. Our main results include error reduction for this class and for its variants with an exponentially small gap or bounded space, as well as relationships between this class and other fundamental state-synthesis classes, namely stateBQP (states generated by uniform polynomial-time quantum circuits) and statePSPACE (states generated by space-uniform polynomial-space quantum circuits). Furthermore, we establish that the family of UQMA witnesses, arguably among the most natural candidate state families, is in stateQMA. We also demonstrate that stateQCMA achieves perfect completeness.
In the context of finite-sum minimization, variance reduction techniques are widely used to improve the performance of state-of-the-art stochastic gradient methods; their practical impact is clear, as are their theoretical properties. Stochastic proximal point algorithms have been studied as an alternative to stochastic gradient algorithms because they are more stable with respect to the choice of the stepsize, but a proper variance-reduced version has been missing. In this work, we propose the first study of variance reduction techniques for stochastic proximal point algorithms. We introduce stochastic proximal versions of SVRG, SAGA, and some of their variants for smooth and convex functions, and we provide several convergence results for the iterates and the objective function values. In addition, under the Polyak-{\L}ojasiewicz (PL) condition, we obtain linear convergence rates for both the iterates and the function values. Our numerical experiments demonstrate the advantages of the proximal variance-reduced methods over their gradient counterparts, especially regarding stability with respect to the choice of the stepsize.
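To make the mechanism concrete, here is a minimal sketch (our illustration under assumed names and parameters, not the paper's exact algorithm) of an SVRG-style variance-reduced stochastic proximal point method for a least-squares finite sum. Each inner step solves the proximal subproblem of a single component plus the SVRG linear correction term exactly; for a rank-one quadratic this has a closed form via the Sherman-Morrison formula.

```python
# Sketch: SVRG-style variance reduction inside a stochastic proximal point
# method for F(x) = (1/2n) * sum_i (a_i^T x - b_i)^2. Each inner step solves
#   min_x f_i(x) + <grad F(x_tilde) - grad f_i(x_tilde), x> + ||x - x_k||^2/(2*gamma)
# exactly; Sherman-Morrison gives the solution in closed form.
import numpy as np

rng = np.random.default_rng(0)
n, d = 200, 10
A = rng.standard_normal((n, d))
b = A @ rng.standard_normal(d) + 0.1 * rng.standard_normal(n)

def full_grad(x):
    return A.T @ (A @ x - b) / n

def grad_i(x, i):
    return A[i] * (A[i] @ x - b[i])

x, gamma = np.zeros(d), 0.5   # prox steps tolerate a large stepsize
for epoch in range(30):
    x_tilde, g_tilde = x.copy(), full_grad(x)          # SVRG snapshot
    for _ in range(n):
        i = rng.integers(n)
        a = A[i]
        g = g_tilde - grad_i(x_tilde, i)               # SVRG correction
        # Solve (I + gamma * a a^T) x_new = x + gamma * (a * b_i - g).
        v = x + gamma * (a * b[i] - g)
        x = v - gamma * a * (a @ v) / (1.0 + gamma * (a @ a))
    print(epoch, 0.5 * np.mean((A @ x - b) ** 2))      # objective decreases
```

At the minimizer the correction cancels the component gradient exactly, so the variance of the update vanishes; this is what permits constant stepsizes and, under the PL condition, linear rates.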
This study presents a novel high-order numerical method for solving the two-dimensional time-fractional convection-diffusion (TFCD) equation, where the time-fractional derivative is taken in the Caputo sense. The problem exhibits a weak singularity at the initial time ($t=0$), which is handled by discretizing the time-fractional derivative with Alikhanov's high-order L2-1$_\sigma$ formula on a non-uniform fitted mesh. A high-order two-dimensional compact operator approximates the spatial derivatives, and the alternating direction implicit (ADI) approach solves the resulting system of equations by decomposing the two-dimensional problem into two separate one-dimensional problems. A comprehensive theoretical analysis, covering both stability and convergence, shows that the method converges with order $\mathcal O\left(N_t^{-\min\{3-\alpha,\theta\alpha,1+2\alpha,2+\alpha\}}+h_x^4+h_y^4\right)$, where $\alpha\in(0,1)$ is the order of the fractional derivative, $N_t$ is the temporal discretization parameter, $h_x$ and $h_y$ are the spatial mesh widths, and the parameter $\theta$ governs the construction of the fitted mesh.
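For intuition about the fitted mesh, the standard device for such weakly singular problems is a graded temporal mesh that clusters points near $t=0$. A minimal sketch follows (the grading law $t_j = T(j/N_t)^\theta$ is the usual choice in this literature; treating it as the paper's exact construction is an assumption on our part):

```python
# Sketch: a graded temporal mesh t_j = T * (j / Nt)**theta. Choosing theta > 1
# concentrates grid points near t = 0, compensating for the weak singularity
# of the solution at the initial time.
import numpy as np

def graded_mesh(T, Nt, theta):
    j = np.arange(Nt + 1)
    return T * (j / Nt) ** theta

print(graded_mesh(1.0, 10, 1.0)[:4])  # theta = 1: uniform spacing
print(graded_mesh(1.0, 10, 3.0)[:4])  # theta = 3: much finer steps near t = 0
```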
This paper investigates the density convergence of a fully discrete finite difference method applied to the stochastic Cahn--Hilliard equation driven by multiplicative space-time white noise. The main difficulty lies in controlling the drift coefficient, which is neither globally Lipschitz nor one-sided Lipschitz. To handle this difficulty, we propose a novel localization argument and derive the strong convergence rate of the numerical solution, which allows us to estimate the total variation distance between the exact and numerical solutions. This, together with the existence of the density of the numerical solution, yields convergence of the densities in $L^1(\mathbb{R})$. Our results give a partial positive answer to the open problem raised in [J. Cui and J. Hong, J. Differential Equations (2020)] on numerically computing the density of the exact solution.
Interpreting a seemingly simple function word like "or", "behind", or "more" can require logical, numerical, and relational reasoning. How are such words learned by children? Prior acquisition theories have often relied on positing a foundation of innate knowledge. Yet recent neural-network-based visual question answering models apparently can learn to use function words as part of answering questions about complex visual scenes. In this paper, we study what these models learn about function words, in the hope of better understanding how the meanings of these words can be learned by both models and children. We show that recurrent models trained on visually grounded language learn gradient semantics for function words requiring spatial and numerical reasoning. Furthermore, we find that these models can learn the meanings of the logical connectives "and" and "or" without any prior knowledge of logical reasoning, and we present early evidence that they can develop the ability to reason about alternative expressions when interpreting language. Finally, we show that word learning difficulty depends on the frequency of words in the models' input. Our findings offer evidence that the meanings of function words can be learned in a visually grounded context by non-symbolic, general statistical learning algorithms, without any prior knowledge of linguistic meaning.
The languages of mathematical physics and modelling are endowed with a rich "grammar of dimensions" that common abstractions in programming languages fail to represent. We propose a dependently typed domain-specific language (embedded in Idris) that captures this grammar. We apply it to explain basic notions of dimensional analysis and Buckingham's Pi theorem. We argue that the language makes mathematical physics more accessible to computer scientists, and functional programming more palatable to modelers and physicists.
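As a reminder of what Buckingham's Pi theorem delivers, consider the textbook pendulum example (ours, not taken from the paper): the period $T$, length $L$, mass $m$, and gravitational acceleration $g$ involve four variables and three independent base dimensions, so the theorem guarantees exactly one dimensionless group,
\[
\Pi = T\sqrt{\frac{g}{L}}, \qquad \text{hence} \qquad T = C\sqrt{\frac{L}{g}}
\]
for some dimensionless constant $C$. The mass $m$ cannot appear, because no other variable carries the mass dimension; a type system that tracks dimensions can enforce exactly this kind of constraint.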
We discuss applications of exact structures and relative homological algebra to the study of invariants of multiparameter persistence modules. This paper is mostly expository, but it contains a pair of novel results. Over finite posets, classical arguments about the relative projective modules of an exact structure make use of Auslander-Reiten theory. Our first result establishes a new adjunction that allows us to "lift" these arguments to certain infinite posets over which Auslander-Reiten theory is not available. We give several examples of this lifting, in particular highlighting the non-existence and existence of resolutions by upsets when working with finitely presentable representations of the plane and of the closure of the positive quadrant, respectively. We then restrict our attention to finite posets. In this setting, we discuss the relationship between the global dimension of an exact structure and the representation dimension of the incidence algebra of the poset. We conclude with our second novel contribution: an explicit description of the irreducible morphisms between relative projective modules for several exact structures that have appeared previously in the literature.
To enhance solution accuracy and training efficiency in neural network approximations of partial differential equations, partitioned neural networks can be used as a solution surrogate instead of a single large and deep neural network defined on the whole problem domain. In such a partitioned approach, suitable interface conditions or subdomain boundary conditions are combined to obtain a convergent approximate solution. However, there has been no rigorous study of the convergence or of parallel-computing enhancements for the partitioned neural network approach. In this paper, we propose iterative algorithms, based on classical additive Schwarz domain decomposition methods, to address these issues. Numerical results are included to show the performance of the proposed iterative algorithms.
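To convey the flavor of such an iterative algorithm, here is a minimal sketch on a 1D toy problem (our illustration; the architecture, loss weights, and interface placement are all assumptions, and the paper's algorithms target the general setting). Two networks on overlapping subdomains of $[0,1]$ approximate the Poisson problem $-u''=\pi^2\sin(\pi x)$ with $u(0)=u(1)=0$; each outer iteration trains each network against interface data taken from the other network's previous iterate, in the spirit of additive Schwarz, so the subdomain problems could be trained in parallel.

```python
# Sketch: additive-Schwarz-style outer iteration for two overlapping
# subdomain networks solving -u'' = pi^2 sin(pi x), u(0) = u(1) = 0
# (exact solution: sin(pi x)).
import math
import torch

torch.manual_seed(0)

def mlp():
    return torch.nn.Sequential(
        torch.nn.Linear(1, 32), torch.nn.Tanh(),
        torch.nn.Linear(32, 32), torch.nn.Tanh(),
        torch.nn.Linear(32, 1))

def residual(net, x):
    # PDE residual -u''(x) - pi^2 sin(pi x) via automatic differentiation.
    x = x.requires_grad_(True)
    u = net(x)
    du = torch.autograd.grad(u.sum(), x, create_graph=True)[0]
    d2u = torch.autograd.grad(du.sum(), x, create_graph=True)[0]
    return -d2u - math.pi ** 2 * torch.sin(math.pi * x)

def train_subdomain(net, a, b, bc, inner_steps=200):
    # Fit `net` on [a, b] with Dirichlet data `bc` at the two endpoints.
    opt = torch.optim.Adam(net.parameters(), lr=1e-3)
    xs = torch.linspace(a, b, 64).unsqueeze(1)
    ends = torch.tensor([[a], [b]])
    target = torch.tensor([[bc[0]], [bc[1]]])
    for _ in range(inner_steps):
        opt.zero_grad()
        loss = residual(net, xs).pow(2).mean() \
             + 10.0 * (net(ends) - target).pow(2).mean()
        loss.backward()
        opt.step()

net1, net2 = mlp(), mlp()   # overlapping subdomains [0, 0.6] and [0.4, 1]
for k in range(20):         # outer Schwarz iterations
    # Interface values come from the *previous* iterates (Jacobi-style),
    # so both subdomain problems could run in parallel.
    with torch.no_grad():
        g1 = net2(torch.tensor([[0.6]])).item()
        g2 = net1(torch.tensor([[0.4]])).item()
    train_subdomain(net1, 0.0, 0.6, (0.0, g1))
    train_subdomain(net2, 0.4, 1.0, (g2, 0.0))

print(net1(torch.tensor([[0.5]])).item())  # should approach sin(pi/2) = 1
```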
Long-span bridges are subjected to a multitude of dynamic excitations during their lifespan. To account for their effects on the structural system, several load models are used during design to simulate the conditions the structure is likely to experience. These models are based on different simplifying assumptions and are generally guided by parameters that are stochastically identified from measurement data, making their outputs inherently uncertain. This paper presents a probabilistic physics-informed machine-learning framework, based on Gaussian process regression, for reconstructing dynamic forces from measured deflections, velocities, or accelerations. The model can work with incomplete and contaminated data and offers a natural regularization approach to account for noise in the measurement system. The framework is demonstrated through an aerodynamic analysis of the Great Belt East Bridge: the aerodynamic response is calculated numerically with the quasi-steady model, and the underlying forces are reconstructed from sparse and noisy measurements. The results indicate good agreement between the applied and predicted dynamic loads and can be extended to compute global responses and the resulting internal forces. Uses of the developed framework include validation of design models and assumptions, as well as prognosis of responses to assist in damage detection and structural health monitoring.
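To illustrate the core idea on the simplest possible system (a single-degree-of-freedom oscillator of our own choosing, not the paper's bridge model), the sketch below smooths noisy displacement measurements with a Gaussian process, where the white-noise kernel term plays the role of the natural noise regularization, and then recovers the force by inverting the equation of motion $f = m\ddot{u} + c\dot{u} + ku$:

```python
# Sketch: reconstructing a dynamic force from noisy displacements with GP
# regression on an SDOF oscillator m*u'' + c*u' + k*u = f(t).
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, WhiteKernel

m, c, k = 1.0, 0.4, 25.0                 # assumed SDOF parameters
t = np.linspace(0.0, 10.0, 200)
dt = t[1] - t[0]
f_true = np.sin(2.0 * t)                 # assumed harmonic load

# Synthetic response by semi-implicit Euler integration.
u = np.zeros_like(t)
v = np.zeros_like(t)
for i in range(len(t) - 1):
    acc = (f_true[i] - c * v[i] - k * u[i]) / m
    v[i + 1] = v[i] + dt * acc
    u[i + 1] = u[i] + dt * v[i + 1]
u_meas = u + 0.002 * np.random.default_rng(0).standard_normal(len(t))

# GP fit: the WhiteKernel term absorbs the measurement noise.
gp = GaussianProcessRegressor(RBF(1.0) + WhiteKernel(1e-4), normalize_y=True)
gp.fit(t.reshape(-1, 1), u_meas)
u_hat = gp.predict(t.reshape(-1, 1))

# Differentiate the smoothed signal and invert the equation of motion.
v_hat = np.gradient(u_hat, dt)
a_hat = np.gradient(v_hat, dt)
f_hat = m * a_hat + c * v_hat + k * u_hat
print(np.max(np.abs(f_hat[20:-20] - f_true[20:-20])))  # reconstruction error
```

In the paper's setting the same ingredients appear at scale: a GP prior over responses, noise handled through the covariance, and a physical model linking the measured responses to the forces being reconstructed.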
We derive information-theoretic generalization bounds for supervised learning algorithms based on the information contained in predictions rather than in the output of the training algorithm. These bounds improve on existing information-theoretic bounds, are applicable to a wider range of algorithms, and address two key challenges: (a) they give meaningful results for deterministic algorithms and (b) they are significantly easier to estimate. We show experimentally that the proposed bounds closely follow the generalization gap in practical scenarios for deep learning.
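For context, the prototypical bound of this kind (due to Xu and Raginsky) controls the expected generalization gap of an algorithm with output $W$ trained on an $n$-point sample $S$ via the mutual information between the two: for a $\sigma$-sub-Gaussian loss,
\[
\bigl|\mathbb{E}[\operatorname{gen}(S, W)]\bigr| \le \sqrt{\frac{2\sigma^2}{n}\, I(W; S)}.
\]
This quantity can be infinite for deterministic algorithms over continuous parameter spaces, and $I(W;S)$ is notoriously hard to estimate for deep networks; measuring the information carried by the predictions instead is what addresses both challenges.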
Deep learning is usually described as an experiment-driven field and is under continual criticism for lacking theoretical foundations. This problem has been partially addressed by a large body of literature, which has so far not been well organized. This paper reviews and organizes the recent advances in deep learning theory. The literature is categorized into six groups: (1) complexity- and capacity-based approaches for analyzing the generalizability of deep learning; (2) stochastic differential equations and their dynamical systems for modelling stochastic gradient descent and its variants, which characterize the optimization and generalization of deep learning, partially inspired by Bayesian inference; (3) the geometrical structures of the loss landscape that drive the trajectories of the dynamical systems; (4) the roles of over-parameterization of deep neural networks from both positive and negative perspectives; (5) theoretical foundations of several special structures in network architectures; and (6) the increasingly intensive concerns about ethics and security and their relationships with generalizability.