In this article, we derive fast and robust parallel-in-time preconditioned iterative methods for the all-at-once linear systems arising upon discretization of time-dependent PDEs. The discretization we employ is based on a Runge--Kutta method in time, for which the development of parallel solvers is an emerging research area in the literature of numerical methods for time-dependent PDEs. By making use of classical theory of block matrices, one is able to derive a preconditioner for the systems considered. The block structure of the preconditioner allows for parallelism in the time variable, as long as one is able to provide an optimal solver for the system of the stages of the method. We thus propose a preconditioner for the latter system based on a singular value decomposition (SVD) of the (real) Runge--Kutta matrix $A_{\mathrm{RK}} = U \Sigma V^\top$. Supposing $A_{\mathrm{RK}}$ is invertible, we prove that the spectrum of the system for the stages preconditioned by our SVD-based preconditioner is contained within the right-half of the unit circle, under suitable assumptions on the matrix $U^\top V$ (the assumptions are well posed due to the polar decomposition of $A_{\mathrm{RK}}$). We show the numerical efficiency of our SVD-based preconditioner by solving the system of the stages arising from the discretization of the heat equation and the Stokes equations, with sequential time-stepping. Finally, we provide numerical results of the all-at-once approach for both problems, showing the speed-up achieved on a parallel architecture.
SeDuMi and SDPT3 are two solvers for solving Semi-definite Programming (SDP) or Linear Matrix Inequality (LMI) problems. A computational performance comparison of these two are undertaken in this paper regarding the Stability of Continuous-time Linear Systems. The comparison mainly focuses on computational times and memory requirements for different scales of problems. To implement and compare the two solvers on a set of well-posed problems, we employ YALMIP, a widely used toolbox for modeling and optimization in MATLAB. The primary goal of this study is to provide an empirical assessment of the relative computational efficiency of SeDuMi and SDPT3 under varying problem conditions. Our evaluation indicates that SDPT3 performs much better in large-scale, high-precision calculations.
The numerical evaluation of statistics plays a crucial role in statistical physics and its applied fields. It is possible to evaluate the statistics for a stochastic differential equation with Gaussian white noise via the corresponding backward Kolmogorov equation. The important notice is that there is no need to obtain the solution of the backward Kolmogorov equation on the whole domain; it is enough to evaluate a value of the solution at a certain point that corresponds to the initial coordinate for the stochastic differential equation. For this aim, an algorithm based on combinatorics has recently been developed. In this paper, we discuss a higher-order approximation of resolvent, and an algorithm based on a second-order approximation is proposed. The proposed algorithm shows a second-order convergence. Furthermore, the convergence property of the naive algorithms naturally leads to extrapolation methods; they work well to calculate a more accurate value with fewer computational costs. The proposed method is demonstrated with the Ornstein-Uhlenbeck process and the noisy van der Pol system.
Prompt-tuning is an emerging strategy to adapt large language models (LLM) to downstream tasks by learning a (soft-)prompt parameter from data. Despite its success in LLMs, there is limited theoretical understanding of the power of prompt-tuning and the role of the attention mechanism in prompting. In this work, we explore prompt-tuning for one-layer attention architectures and study contextual mixture-models where each input token belongs to a context-relevant or -irrelevant set. We isolate the role of prompt-tuning through a self-contained prompt-attention model. Our contributions are as follows: (1) We show that softmax-prompt-attention is provably more expressive than softmax-self-attention and linear-prompt-attention under our contextual data model. (2) We analyze the initial trajectory of gradient descent and show that it learns the prompt and prediction head with near-optimal sample complexity and demonstrate how prompt can provably attend to sparse context-relevant tokens. (3) Assuming a known prompt but an unknown prediction head, we characterize the exact finite sample performance of prompt-attention which reveals the fundamental performance limits and the precise benefit of the context information. We also provide experiments that verify our theoretical insights on real datasets and demonstrate how prompt-tuning enables the model to attend to context-relevant information.
The discrete $\alpha$-neighbor $p$-center problem (d-$\alpha$-$p$CP) is an emerging variant of the classical $p$-center problem which recently got attention in literature. In this problem, we are given a discrete set of points and we need to locate $p$ facilities on these points in such a way that the maximum distance between each point where no facility is located and its $\alpha$-closest facility is minimized. The only existing algorithms in literature for solving the d-$\alpha$-$p$CP are approximation algorithms and two recently proposed heuristics. In this work, we present two integer programming formulations for the d-$\alpha$-$p$CP, together with lifting of inequalities, valid inequalities, inequalities that do not change the optimal objective function value and variable fixing procedures. We provide theoretical results on the strength of the formulations and convergence results for the lower bounds obtained after applying the lifting procedures or the variable fixing procedures in an iterative fashion. Based on our formulations and theoretical results, we develop branch-and-cut (B&C) algorithms, which are further enhanced with a starting heuristic and a primal heuristic. We evaluate the effectiveness of our B&C algorithms using instances from literature. Our algorithms are able to solve 116 out of 194 instances from literature to proven optimality, with a runtime of under a minute for most of them. By doing so, we also provide improved solution values for 116 instances.
We prove a central limit theorem for the Horvitz-Thompson estimator based on the Gram-Schmidt Walk (GSW) design, recently developed in Harshaw et al.(2022). In particular, we consider the version of the GSW design which uses randomized pivot order, thereby answering an open question raised in the same article. We deduce this under minimal and global assumptions involving only the problem parameters such as the (sum) potential outcome vector and the covariate matrix. As an interesting consequence of our analysis we also obtain the precise limiting variance of the estimator in terms of these parameters which is smaller than the previously known upper bound. The main ingredients are a simplified skeletal process approximating the GSW design and concentration phenomena for random matrices obtained from random sampling using the Stein's method for exchangeable pairs.
Partially observable Markov decision processes (POMDPs) provide a flexible representation for real-world decision and control problems. However, POMDPs are notoriously difficult to solve, especially when the state and observation spaces are continuous or hybrid, which is often the case for physical systems. While recent online sampling-based POMDP algorithms that plan with observation likelihood weighting have shown practical effectiveness, a general theory characterizing the approximation error of the particle filtering techniques that these algorithms use has not previously been proposed. Our main contribution is bounding the error between any POMDP and its corresponding finite sample particle belief MDP (PB-MDP) approximation. This fundamental bridge between PB-MDPs and POMDPs allows us to adapt any sampling-based MDP algorithm to a POMDP by solving the corresponding particle belief MDP, thereby extending the convergence guarantees of the MDP algorithm to the POMDP. Practically, this is implemented by using the particle filter belief transition model as the generative model for the MDP solver. While this requires access to the observation density model from the POMDP, it only increases the transition sampling complexity of the MDP solver by a factor of $\mathcal{O}(C)$, where $C$ is the number of particles. Thus, when combined with sparse sampling MDP algorithms, this approach can yield algorithms for POMDPs that have no direct theoretical dependence on the size of the state and observation spaces. In addition to our theoretical contribution, we perform five numerical experiments on benchmark POMDPs to demonstrate that a simple MDP algorithm adapted using PB-MDP approximation, Sparse-PFT, achieves performance competitive with other leading continuous observation POMDP solvers.
In this article we develop a high order accurate method to solve the incompressible boundary layer equations in a provably stable manner.~We first derive continuous energy estimates,~and then proceed to the discrete setting.~We formulate the discrete approximation using high-order finite difference methods on summation-by-parts form and implement the boundary conditions weakly using the simultaneous approximation term method.~By applying the discrete energy method and imitating the continuous analysis,~the discrete estimate that resembles the continuous counterpart is obtained proving stability.~We also show that these newly derived boundary conditions removes the singularities associated with the null-space of the nonlinear discrete spatial operator.~Numerical experiments that verifies the high-order accuracy of the scheme and coincides with the theoretical results are presented.~The numerical results are compared with the well-known Blasius similarity solution as well as that resulting from the solution of the incompressible Navier Stokes equations.
We propose an isogeometric solver for Poisson problems that combines i) low-rank tensor techniques to approximate the unknown solution and the system matrix, as a sum of a few terms having Kronecker product structure, ii) a Truncated Preconditioned Conjugate Gradient solver to keep the rank of the iterates low, and iii) a novel low-rank preconditioner, based on the Fast Diagonalization method where the eigenvector multiplication is approximated by the Fast Fourier Transform. Although the proposed strategy is written in arbitrary dimension, we focus on the three-dimensional case and adopt the Tucker format for low-rank tensor representation, which is well suited in low dimension. We show in numerical tests that this choice guarantees significant memory saving compared to the full tensor representation. We also extend and test the proposed strategy to linear elasticity problems.
Stein Variational Gradient Descent (SVGD) is a nonparametric particle-based deterministic sampling algorithm. Despite its wide usage, understanding the theoretical properties of SVGD has remained a challenging problem. For sampling from a Gaussian target, the SVGD dynamics with a bilinear kernel will remain Gaussian as long as the initializer is Gaussian. Inspired by this fact, we undertake a detailed theoretical study of the Gaussian-SVGD, i.e., SVGD projected to the family of Gaussian distributions via the bilinear kernel, or equivalently Gaussian variational inference (GVI) with SVGD. We present a complete picture by considering both the mean-field PDE and discrete particle systems. When the target is strongly log-concave, the mean-field Gaussian-SVGD dynamics is proven to converge linearly to the Gaussian distribution closest to the target in KL divergence. In the finite-particle setting, there is both uniform in time convergence to the mean-field limit and linear convergence in time to the equilibrium if the target is Gaussian. In the general case, we propose a density-based and a particle-based implementation of the Gaussian-SVGD, and show that several recent algorithms for GVI, proposed from different perspectives, emerge as special cases of our unified framework. Interestingly, one of the new particle-based instance from this framework empirically outperforms existing approaches. Our results make concrete contributions towards obtaining a deeper understanding of both SVGD and GVI.
This work is concerned with the analysis of a space-time finite element discontinuous Galerkin method on polytopal meshes (XT-PolydG) for the numerical discretization of wave propagation in coupled poroelastic-elastic media. The mathematical model consists of the low-frequency Biot's equations in the poroelastic medium and the elastodynamics equation for the elastic one. To realize the coupling, suitable transmission conditions on the interface between the two domains are (weakly) embedded in the formulation. The proposed PolydG discretization in space is then coupled with a dG time integration scheme, resulting in a full space-time dG discretization. We present the stability analysis for both the continuous and the semidiscrete formulations, and we derive error estimates for the semidiscrete formulation in a suitable energy norm. The method is applied to a wide set of numerical test cases to verify the theoretical bounds. Examples of physical interest are also presented to investigate the capability of the proposed method in relevant geophysical scenarios.