Path integrals with complex actions are encountered in many physical systems, ranging from spin- or mass-imbalanced atomic gases and graphene to quantum chromodynamics at finite density and the non-equilibrium evolution of quantum systems. Many computational approaches have been developed to tackle the sign problem that emerges for complex actions. Among these, complex Langevin dynamics has the appeal of general applicability. One of its key challenges is the potential convergence of the dynamics to unphysical fixed points. The statistical sampling process at such a fixed point is not based on the physical action and hence leads to wrong predictions. Moreover, this failure is hard to detect because the sampling process is defined only implicitly. In the present work we set up a general approach based on a Markov chain Monte Carlo scheme in an extended state space. Within this approach we derive an explicit real sampling process for generalized complex Langevin dynamics. Subject to a set of constraints, this sampling process is the physical one. These constraints originate from the detailed-balance equations satisfied by the Monte Carlo scheme. This allows us to re-derive complex Langevin dynamics from a new perspective and establishes a framework for the explicit construction of new sampling schemes for complex actions.
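As a concrete illustration of the dynamics in question, the following minimal sketch runs complex Langevin for the exactly solvable Gaussian action $S(z) = \sigma z^2/2$ with complex $\sigma$, for which the scheme converges to the physical result $\langle z^2\rangle = 1/\sigma$. All parameters are illustrative, and the extended-state-space construction of the paper is not reproduced here.

```python
import numpy as np

# Toy complex Langevin run for the Gaussian action S(z) = 0.5*sigma*z^2 with
# complex sigma; the exact expectation is <z^2> = 1/sigma. The field is
# complexified and evolved with the analytically continued drift plus real noise.
rng = np.random.default_rng(0)
sigma = 1.0 + 1.0j
eps, n_steps = 1e-3, 200_000
z = 0.0 + 0.0j

samples = []
for step in range(n_steps):
    z += -eps * sigma * z + np.sqrt(2 * eps) * rng.standard_normal()  # drift = -dS/dz
    if step > n_steps // 10:                                          # discard burn-in
        samples.append(z * z)

print("CL estimate of <z^2>:", np.mean(samples))   # ~ 0.5 - 0.5j
print("exact value:         ", 1 / sigma)
```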
Quantum-classical molecular dynamics, as a partial classical limit of the full quantum Schr\"odinger equation, is a widely used framework for quantum molecular dynamics. The underlying equations are nonlinear in nature, containing a quantum part (representing the electrons) and a classical part (representing the nuclei). An accurate simulation of the wave function typically requires a time step comparable to the rescaled Planck constant $h$, resulting in a formidable cost when $h\ll 1$. Using semiclassical analysis, we prove an additive observable error bound for Schwartz observables under the proposed time-splitting schemes; the bound decreases as $h$ becomes smaller. Furthermore, we establish a uniform-in-$h$ observable error bound, which allows an $\mathcal{O}(1)$ time step to accurately capture physical observables regardless of the size of $h$. Numerical results verify our estimates.
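To fix ideas, here is a minimal Strang time-splitting sketch for the semiclassical Schr\"odinger equation $i h \psi_t = -(h^2/2)\psi_{xx} + V(x)\psi$ on a periodic grid. It shows only the splitting ingredient; the paper's schemes treat the coupled quantum-classical system, and the potential, grid, and step sizes below are assumed toy values.

```python
import numpy as np

# Strang splitting: half potential step, full kinetic step in Fourier space,
# half potential step, for the semiclassical Schroedinger equation.
h = 0.05                                 # rescaled Planck constant
N, L = 512, 2 * np.pi
x = np.linspace(0, L, N, endpoint=False)
k = 2 * np.pi * np.fft.fftfreq(N, d=L / N)
V = 1 - np.cos(x)                        # smooth periodic toy potential
dt, n_steps = 0.01, 100

psi = (np.pi * h) ** -0.25 * np.exp(-((x - np.pi) ** 2) / (2 * h))  # wave packet
for _ in range(n_steps):
    psi = np.exp(-0.5j * dt * V / h) * psi                            # half potential
    psi = np.fft.ifft(np.exp(-0.5j * dt * h * k**2) * np.fft.fft(psi))  # kinetic
    psi = np.exp(-0.5j * dt * V / h) * psi                            # half potential

print("mass:", np.sum(np.abs(psi) ** 2) * (L / N))  # conserved up to spectral error
```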
This paper studies first-order bulk-surface splitting methods for (semi-linear) parabolic partial differential equations with dynamic boundary conditions. The proposed Lie splitting scheme is based on a reformulation of the problem as a coupled partial differential-algebraic equation system, i.e., the boundary conditions are treated as a second dynamic equation coupled to the bulk problem. The splitting approach is combined with bulk-surface finite elements and an implicit Euler discretization of the two subsystems. We prove first-order convergence of the resulting fully discrete scheme under a weak CFL condition of the form $\tau \leq c h$ for some constant $c>0$. The convergence is also illustrated numerically using dynamic boundary conditions of Allen-Cahn type.
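The splitting structure can be sketched at the ODE level: one implicit Euler substep for a "bulk" subsystem followed by one for a "surface" subsystem. The right-hand sides below are placeholders chosen for illustration (a stiff linear part and a cubic Allen-Cahn-type nonlinearity), not the paper's bulk-surface finite element discretization.

```python
import numpy as np
from scipy.optimize import fsolve

# First-order Lie splitting with implicit Euler substeps for u' = f_bulk(u) + f_surf(u):
# advance the bulk subsystem, then the surface subsystem, each over one step tau.
def lie_step(u, tau, f_bulk, f_surf):
    u_half = fsolve(lambda v: v - u - tau * f_bulk(v), u)          # bulk substep
    return fsolve(lambda v: v - u_half - tau * f_surf(v), u_half)  # surface substep

f_bulk = lambda v: -50.0 * v       # stiff linear "bulk" part (assumed example)
f_surf = lambda v: v - v**3        # Allen-Cahn-type "surface" nonlinearity

u, tau = np.array([0.9]), 0.01
for _ in range(200):
    u = lie_step(u, tau, f_bulk, f_surf)
print(u)
```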
SARS-CoV-2, like any other virus, continues to mutate as it spreads, according to an evolutionary process. Unlike any other virus, however, the number of currently available SARS-CoV-2 sequences in public databases such as GISAID is already in the millions. This amount of data has the potential to uncover the evolutionary dynamics of a virus like never before. However, a million sequences is already several orders of magnitude beyond what can be processed by traditional methods for reconstructing a virus's evolutionary history, such as those that build a phylogenetic tree. Hence, new and scalable methods must be devised to make use of the ever-increasing number of viral sequences being collected. Since identifying variants is an important part of understanding the evolution of a virus, in this paper we propose an approach based on clustering sequences to identify the current major SARS-CoV-2 variants. Using $k$-mer-based feature-vector generation and efficient feature-selection methods, our approach is effective at identifying variants, as well as efficient and scalable to millions of sequences. Such a clustering method allows us to show the relative proportion of each variant over time, giving the rate of spread of each variant in different locations -- something which is important for vaccine development and distribution. We also compute the importance of each amino-acid position of the spike protein in identifying a given variant in terms of information gain. Positions of high variant-specific importance tend to agree with those reported by the USA's Centers for Disease Control and Prevention (CDC), further validating our approach.
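The $k$-mer featurization underlying such a pipeline is simple to state: count all length-$k$ substrings over a fixed alphabet, yielding a fixed-length numeric vector that any clustering algorithm can consume. The sketch below assumes $k=3$, the 20-letter amino-acid alphabet, and a toy spike-protein fragment.

```python
from itertools import product
from collections import Counter

# Map a protein sequence to a fixed-length k-mer count vector.
def kmer_vector(seq, k=3, alphabet="ACDEFGHIKLMNPQRSTVWY"):
    index = {"".join(p): i for i, p in enumerate(product(alphabet, repeat=k))}
    vec = [0] * len(index)                      # 20^k entries for k-mer counts
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    for kmer, c in counts.items():
        if kmer in index:                       # skip k-mers with unknown symbols
            vec[index[kmer]] = c
    return vec

# Toy fragment of the spike protein; real inputs are full sequences.
print(sum(kmer_vector("MFVFLVLLPLVSSQCVNL")))   # total number of k-mers counted
```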
We examine a variety of numerical methods that arise when considering dynamical systems in the context of physics-based simulations of deformable objects. Such problems arise in various applications, including animation, robotics, control, and fabrication. The goals and merits of suitable numerical algorithms for these applications differ from those of typical numerical analysis research in dynamical systems. Here the mathematical model is not fixed a priori but must be adjusted as necessary to capture the desired behaviour, with an emphasis on effectively producing lively animations of objects with complex geometries. Results are often judged by how realistic they appear to observers (the "eye-norm") as well as by the efficacy of the numerical procedures employed. And yet, we show that with an adjusted viewpoint, numerical analysis and applied mathematics can contribute significantly to the development and analysis of appropriate methods in a variety of areas, including finite element methods, stiff and highly oscillatory ODEs, model reduction, and constrained optimization.
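As one small example of the flavor of methods involved, semi-implicit (symplectic) Euler is a staple of physics-based animation: it updates velocity with the current force and then position with the new velocity, trading accuracy for robust, lively behaviour. The damped mass-spring parameters below are illustrative only and are not taken from the paper.

```python
# Semi-implicit Euler for a damped mass-spring system x'' = (-k*x - c*x')/m.
def step(x, v, k=100.0, c=0.5, m=1.0, dt=1e-2):
    a = (-k * x - c * v) / m   # spring plus damping force
    v = v + dt * a             # velocity first: this ordering gives the stability
    x = x + dt * v             # then position, using the updated velocity
    return x, v

x, v = 1.0, 0.0
for _ in range(1000):
    x, v = step(x, v)
print(x, v)                    # decays toward the rest state x = 0
```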
Securing the necessary resources for edge computing processes via effective resource trading has become a critical technique for supporting computation-intensive mobile applications. Conventional onsite spot trading could facilitate this paradigm with proper incentives, but it incurs excessive decision-making latency and energy consumption, and further leads to underutilization of dynamic resources. Motivated by this, we propose a hybrid market unifying futures and spot trading to facilitate resource trading between an edge server (the seller) and multiple smart devices (the buyers): some buyers are encouraged to sign a forward contract with the seller in advance, while the remaining buyers compete for the available resources through spot trading. Specifically, overbooking is adopted to achieve substantial utilization and profit advantages in the face of dynamic resource demands. By integrating overbooking into the futures market, mutually beneficial and risk-tolerable forward contracts with an appropriate overbooking rate can be reached by analyzing historical statistics of future resource demand and communication quality; the contract terms are determined by an alternating-optimization-based negotiation scheme. In addition, the spot trading problem is studied under uniform and differential pricing rules, for which two bilateral negotiation schemes are proposed that address both non-convex optimization and knapsack problems. Experimental results demonstrate that the proposed mechanism achieves mutually beneficial utilities for the players while outperforming baseline methods on critical indicators such as decision-making latency and resource usage.
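A back-of-envelope Monte Carlo illustrates why overbooking helps when demand is dynamic: if some contract holders do not show up, selling more contracts than capacity raises expected utilization while keeping overload risk controllable. All numbers below (capacity, show-up probability, overbooking levels) are assumptions for illustration; this is not the paper's contract model.

```python
import numpy as np

# Expected utilization and overload probability under overbooking, estimated
# by simulating random show-ups among forward-contract holders.
rng = np.random.default_rng(1)
capacity, p_show = 100, 0.8
for n_contracts in (100, 110, 120):
    shows = rng.binomial(n_contracts, p_show, size=100_000)
    util = np.minimum(shows, capacity).mean() / capacity
    overload = (shows > capacity).mean()
    print(f"contracts={n_contracts}: utilization={util:.3f}, P(overload)={overload:.3f}")
```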
One of the well-known challenges in optimal experimental design is how to efficiently estimate the nested integrations of the expected information gain. The Gaussian approximation and the associated importance sampling have been shown to be effective at reducing the numerical cost. However, they may fail due to non-negligible biases and numerical instabilities. We develop a new approach to compute the expected information gain when the posterior distribution is multimodal -- a situation previously ignored by methods aiming to accelerate the nested numerical integrations. Specifically, the posterior distribution is approximated by a mixture distribution constructed from multiple runs of a global search for the modes and weighted local Laplace approximations. For any prescribed probability of capturing all the modes, we provide an estimate of the required number of search runs, which is dimension independent. We show that the novel global-local multimodal approach can be significantly more accurate and more efficient than existing approaches, especially when the number of modes is large. The method can be applied to the design of experiments with both calibrated and uncalibrated observation noises.
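For orientation, the plain double-loop (nested) Monte Carlo estimator of the expected information gain, $\mathrm{EIG} = \mathbb{E}_{\theta,y}[\log p(y\mid\theta) - \log p(y)]$, is sketched below for a toy scalar model $y = \theta + \varepsilon$ with $\theta \sim N(0,1)$ and $\varepsilon \sim N(0,\sigma^2)$. This is the baseline whose cost the mixture/Laplace approach is designed to reduce; the model and sample sizes are assumptions.

```python
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(0)
N, M, sigma = 2000, 2000, 0.5

theta = rng.standard_normal(N)                   # outer prior draws
y = theta + sigma * rng.standard_normal(N)       # simulated observations
log_lik = norm.logpdf(y, loc=theta, scale=sigma)

theta_in = rng.standard_normal(M)                # inner draws for the evidence
log_ev = np.log(np.mean(
    norm.pdf(y[:, None], loc=theta_in[None, :], scale=sigma), axis=1))

print("nested MC EIG:", np.mean(log_lik - log_ev))
print("exact EIG:    ", 0.5 * np.log(1 + 1 / sigma**2))  # closed form for this model
```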
Complex-valued neural networks have attracted increasing attention in recent years, yet it remains an open question whether they offer advantages over real-valued networks. This work takes a step in this direction by introducing the \emph{complex-reaction network}, a fully connected feed-forward architecture. We prove the universal approximation property for complex-reaction networks, and show that a class of radial functions can be approximated by a complex-reaction network with a polynomial number of parameters, whereas real-valued networks need at least exponentially many parameters to reach the same approximation level. For empirical risk minimization, our theoretical results show that the critical point set of complex-reaction networks is a proper subset of that of real-valued networks, which may offer some insight into why optimal solutions can be found more easily for complex-reaction networks.
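To make the computational setting concrete, the sketch below shows a generic complex-valued fully connected layer with a modReLU-style activation. It illustrates complex-valued computation in general, not the paper's specific complex-reaction architecture, which we do not reproduce here.

```python
import numpy as np

# One complex-valued dense layer: a complex affine map followed by a
# modReLU-like nonlinearity that thresholds the modulus and keeps the phase.
def complex_dense(x, W, b, thresh=0.5):
    z = W @ x + b
    return z * np.maximum(0.0, np.abs(z) - thresh) / (np.abs(z) + 1e-12)

rng = np.random.default_rng(0)
d_in, d_out = 4, 3
W = (rng.standard_normal((d_out, d_in))
     + 1j * rng.standard_normal((d_out, d_in))) / np.sqrt(d_in)
b = np.zeros(d_out, dtype=complex)
x = rng.standard_normal(d_in) + 1j * rng.standard_normal(d_in)
print(complex_dense(x, W, b))
```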
We propose a very fast approximate Markov chain Monte Carlo (MCMC) sampling framework that is applicable to a large class of sparse Bayesian inference problems, where for several models the computational cost per iteration is of order $O(ns)$, with $n$ the sample size and $s$ the underlying sparsity of the model. This cost can be further reduced by data sub-sampling when stochastic gradient Langevin dynamics is employed. The algorithm is an extension of the asynchronous Gibbs sampler of Johnson et al. (2013), but it can be viewed from a statistical perspective as a form of Bayesian iterated sure independence screening (Fan et al. (2009)). We show that in high-dimensional linear regression problems, the Markov chain generated by the proposed algorithm admits an invariant distribution that correctly recovers the main signal with high probability under some statistical assumptions. Furthermore, we show that its mixing time is at most linear in the number of regressors. We illustrate the algorithm with several models.
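The sub-sampling ingredient can be shown in isolation: a stochastic gradient Langevin dynamics (SGLD) update for Bayesian linear regression estimates the log-posterior gradient on a random minibatch, so the per-iteration cost scales with the batch rather than with $n$. The sketch below is the dense version of this ingredient only (the sparsity mechanism is omitted), with assumed sizes and step size.

```python
import numpy as np

rng = np.random.default_rng(0)
n, p, batch = 10_000, 20, 100
X = rng.standard_normal((n, p))
beta_true = np.zeros(p)
beta_true[:3] = [2.0, -1.0, 1.5]
y = X @ beta_true + rng.standard_normal(n)

beta, eps = np.zeros(p), 1e-5
for t in range(5000):
    idx = rng.choice(n, batch, replace=False)
    # minibatch likelihood gradient (rescaled by n/batch) plus N(0,1) prior gradient
    grad = (n / batch) * X[idx].T @ (y[idx] - X[idx] @ beta) - beta
    beta += 0.5 * eps * grad + np.sqrt(eps) * rng.standard_normal(p)

print(np.round(beta, 2))   # concentrates near beta_true
```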
Identifying algorithms for computationally efficient unsupervised training of large language models is an important and active area of research. In this work, we develop and study a straightforward, dynamic always-sparse pre-training approach for the BERT language-modeling task, which leverages periodic compression steps based on magnitude pruning followed by random parameter re-allocation. This approach enables us to achieve Pareto improvements in terms of the number of floating-point operations (FLOPs) over statically sparse and dense models across a broad range of network sizes. Furthermore, we demonstrate that training remains FLOP-efficient when using coarse-grained block sparsity, making the approach particularly promising for efficient execution on modern hardware accelerators.
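One such compression step, in the spirit described above, looks as follows: prune the smallest-magnitude active weights, then re-allocate the freed parameters to randomly chosen zero positions so the total sparsity stays constant. The matrix shape, 50% sparsity level, and drop fraction are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

def prune_and_regrow(W, mask, drop_frac=0.1):
    active = np.flatnonzero(mask)
    k = int(drop_frac * active.size)
    drop = active[np.argsort(np.abs(W.ravel()[active]))[:k]]   # magnitude pruning
    mask.ravel()[drop] = False
    grow = rng.choice(np.flatnonzero(~mask.ravel()), k, replace=False)
    mask.ravel()[grow] = True                                  # random re-allocation
    W.ravel()[grow] = 0.0                                      # regrown weights start at zero
    return W * mask, mask

W = rng.standard_normal((64, 64))
mask = rng.random((64, 64)) < 0.5        # start 50% sparse
W, mask = prune_and_regrow(W * mask, mask)
print("sparsity:", 1 - mask.mean())      # unchanged by the update
```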
Answering complex questions is a time-consuming activity for humans that requires reasoning and integration of information. Recent work on reading comprehension has made headway in answering simple questions, but tackling complex questions remains an ongoing research challenge. Conversely, semantic parsers have been successful at handling compositionality, but only when the information resides in a target knowledge base. In this paper, we present a novel framework for answering broad and complex questions, assuming that simple questions can be answered using a search engine and a reading comprehension model. We propose to decompose complex questions into a sequence of simple questions and to compute the final answer from the sequence of answers, as sketched below. To demonstrate the viability of our approach, we create a new dataset of complex questions, ComplexWebQuestions, and present a model that decomposes questions and interacts with the web to compute an answer. We empirically demonstrate that question decomposition improves performance from 20.8 precision@1 to 27.5 precision@1 on this new dataset.
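The decompose-then-recompose idea can be sketched schematically: split a composition question into two sub-questions, answer each with a black-box simple-QA function, and chain the answers. Here `simple_qa` is a placeholder standing in for the search-engine-plus-reading-comprehension pipeline, backed by a toy lookup table.

```python
# Placeholder for the simple-question answerer; the real system issues a web
# search and runs a reading comprehension model over the results.
def simple_qa(question):
    toy_kb = {"Who directed Inception?": "Christopher Nolan",
              "Where was Christopher Nolan born?": "London"}
    return toy_kb.get(question, "unknown")

# Answer a two-hop composition question by chaining simple answers.
def answer_composition(first_question, second_template):
    intermediate = simple_qa(first_question)               # first hop
    return simple_qa(second_template.format(intermediate)) # second hop

print(answer_composition("Who directed Inception?",
                         "Where was {} born?"))            # -> London
```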