In this paper we consider the generalized inverse iteration for computing ground states of the Gross-Pitaevskii eigenvector problem (GPE). For that we prove explicit linear convergence rates that depend on the maximum eigenvalue in magnitude of a weighted linear eigenvalue problem. Furthermore, we show that this eigenvalue can be bounded by the first spectral gap of a linearized Gross-Pitaevskii operator, recovering the same rates as for linear eigenvector problems. With this we establish the first local convergence result for the basic inverse iteration for the GPE without damping. We also show how our findings directly generalize to extended inverse iterations, such as the Gradient Flow Discrete Normalized (GFDN) proposed in [W. Bao, Q. Du, SIAM J. Sci. Comput., 25 (2004)] or the damped inverse iteration suggested in [P. Henning, D. Peterseim, SIAM J. Numer. Anal., 53 (2020)]. Our analysis also reveals why the inverse iteration for the GPE does not react favourably to spectral shifts. This empirical observation can now be explained with a blow-up of a weighting function that crucially contributes to the convergence rates. Our findings are illustrated by numerical experiments.
A well-known boundary observability inequality for the elasticity system establishes that the energy of the system can be estimated from the solution on a sufficiently large part of the boundary for a sufficiently large time. This inequality is relevant in different contexts as the exact boundary controllability, boundary stabilization, or some inverse source problems. Here we show that a corresponding boundary observability inequality for the spectral collocation approximation of the linear elasticity system in a d-dimensional cube also holds, uniformly with respect to the discretization parameter. This property is essential to prove that natural numerical approaches to the previous problems based on replacing the elasticity system by collocation discretization will give successful approximations of the continuous counterparts.
A Low-rank Spectral Optimization Problem (LSOP) minimizes a linear objective subject to multiple two-sided linear matrix inequalities intersected with a low-rank and spectral constrained domain set. Although solving LSOP is, in general, NP-hard, its partial convexification (i.e., replacing the domain set by its convex hull) termed "LSOP-R," is often tractable and yields a high-quality solution. This motivates us to study the strength of LSOP-R. Specifically, we derive rank bounds for any extreme point of the feasible set of LSOP-R and prove their tightness for the domain sets with different matrix spaces. The proposed rank bounds recover two well-known results in the literature from a fresh angle and also allow us to derive sufficient conditions under which the relaxation LSOP-R is equivalent to the original LSOP. To effectively solve LSOP-R, we develop a column generation algorithm with a vector-based convex pricing oracle, coupled with a rank-reduction algorithm, which ensures the output solution satisfies the theoretical rank bound. Finally, we numerically verify the strength of the LSOP-R and the efficacy of the proposed algorithms.
We present a randomized, inverse-free algorithm for producing an approximate diagonalization of any $n \times n$ matrix pencil $(A,B)$. The bulk of the algorithm rests on a randomized divide-and-conquer eigensolver for the generalized eigenvalue problem originally proposed by Ballard, Demmel, and Dumitriu [Technical Report 2010]. We demonstrate that this divide-and-conquer approach can be formulated to succeed with high probability as long as the input pencil is sufficiently well-behaved, which is accomplished by generalizing the recent pseudospectral shattering work of Banks, Garza-Vargas, Kulkarni, and Srivastava [Foundations of Computational Mathematics 2022]. In particular, we show that perturbing and scaling $(A,B)$ regularizes its pseudospectra, allowing divide-and-conquer to run over a simple random grid and in turn producing an accurate diagonalization of $(A,B)$ in the backward error sense. The main result of the paper states the existence of a randomized algorithm that with high probability (and in exact arithmetic) produces invertible $S,T$ and diagonal $D$ such that $||A - SDT^{-1}||_2 \leq \varepsilon$ and $||B - SIT^{-1}||_2 \leq \varepsilon$ in at most $O \left( \log(n) \log^2 \left( \frac{n}{\varepsilon} \right) T_{\text{MM}}(n) \right)$ operations, where $T_{\text{MM}}(n)$ is the asymptotic complexity of matrix multiplication. This not only provides a new set of guarantees for highly parallel generalized eigenvalue solvers but also establishes nearly matrix multiplication time as an upper bound on the complexity of exact arithmetic matrix pencil diagonalization.
Digital twins hold substantial promise in many applications, but rigorous procedures for assessing their accuracy are essential for their widespread deployment in safety-critical settings. By formulating this task within the framework of causal inference, we show that attempts to certify the correctness of a twin using real-world observational data are unsound unless potentially tenuous assumptions are made about the data-generating process. To avoid these assumptions, we propose an assessment strategy that instead aims to find cases where the twin is not correct, and present a general-purpose statistical procedure for doing so that may be used across a wide variety of applications and twin models. Our approach yields reliable and actionable information about the twin under minimal assumptions about the twin and the real-world process of interest. We demonstrate the effectiveness of our methodology via a large-scale case study involving sepsis modelling within the Pulse Physiology Engine, which we assess using the MIMIC-III dataset of ICU patients.
The Plackett--Luce model is a popular approach for ranking data analysis, where a utility vector is employed to determine the probability of each outcome based on Luce's choice axiom. In this paper, we investigate the asymptotic theory of utility vector estimation by maximizing different types of likelihood, such as the full-, marginal-, and quasi-likelihood. We provide a rank-matching interpretation for the estimating equations of these estimators and analyze their asymptotic behavior as the number of items being compared tends to infinity. In particular, we establish the uniform consistency of these estimators under conditions characterized by the topology of the underlying comparison graph sequence and demonstrate that the proposed conditions are sharp for common sampling scenarios such as the nonuniform random hypergraph model and the hypergraph stochastic block model; we also obtain the asymptotic normality of these estimators and discuss the trade-off between statistical efficiency and computational complexity for practical uncertainty quantification. Both results allow for nonuniform and inhomogeneous comparison graphs with varying edge sizes and different asymptotic orders of edge probabilities. We verify our theoretical findings by conducting detailed numerical experiments.
Comprehending how the brain interacts with the external world through generated neural signals is crucial for determining its working mechanism, treating brain diseases, and understanding intelligence. Although many theoretical models have been proposed, they have thus far been difficult to integrate and develop. In this study, we were inspired in part by grid cells in creating a more general and robust grid module and constructing an interactive and self-reinforcing cognitive system together with Bayesian reasoning, an approach called space-division and exploration-exploitation with grid-feedback (Grid-SD2E). Here, a grid module can be used as an interaction medium between the outside world and a system, as well as a self-reinforcement medium within the system. The space-division and exploration-exploitation (SD2E) receives the 0/1 signals of a grid through its space-division (SD) module. The system described in this paper is also a theoretical model derived from experiments conducted by other researchers and our experience on neural decoding. Herein, we analyse the rationality of the system based on the existing theories in both neuroscience and cognitive science, and attempt to propose special and general rules to explain the different interactions between people and between people and the external world. What's more, based on this model, the smallest computing unit is extracted, which is analogous to a single neuron in the brain.
This paper explores variants of the subspace iteration algorithm for computing approximate invariant subspaces. The standard subspace iteration approach is revisited and new variants that exploit gradient-type techniques combined with a Grassmann manifold viewpoint are developed. A gradient method as well as a conjugate gradient technique are described. Convergence of the gradient-based algorithm is analyzed and a few numerical experiments are reported, indicating that the proposed algorithms are sometimes superior to a standard Chebyshev-based subspace iteration when compared in terms of number of matrix vector products, but do not require estimating optimal parameters. An important contribution of this paper to achieve this good performance is the accurate and efficient implementation of an exact line search. In addition, new convergence proofs are presented for the non-accelerated gradient method that includes a locally exponential convergence if started in a $\mathcal{O(\sqrt{\delta})}$ neighbourhood of the dominant subspace with spectral gap $\delta$.
This paper considers a single-trajectory system identification problem for linear systems under general nonlinear and/or time-varying policies with i.i.d. random excitation noises. The problem is motivated by safe learning-based control for constrained linear systems, where the safe policies during the learning process are usually nonlinear and time-varying for satisfying the state and input constraints. In this paper, we provide a non-asymptotic error bound for least square estimation when the data trajectory is generated by any nonlinear and/or time-varying policies as long as the generated state and action trajectories are bounded. This significantly generalizes the existing non-asymptotic guarantees for linear system identification, which usually consider i.i.d. random inputs or linear policies. Interestingly, our error bound is consistent with that for linear policies with respect to the dependence on the trajectory length, system dimensions, and excitation levels. Lastly, we demonstrate the applications of our results by safe learning with robust model predictive control and provide numerical analysis.
Two numerical schemes are proposed and investigated for the Yang--Mills equations, which can be seen as a nonlinear generalisation of the Maxwell equations set on Lie algebra-valued functions, with similarities to certain formulations of General Relativity. Both schemes are built on the Discrete de Rham (DDR) method, and inherit from its main features: an arbitrary order of accuracy, and applicability to generic polyhedral meshes. They make use of the complex property of the DDR, together with a Lagrange-multiplier approach, to preserve, at the discrete level, a nonlinear constraint associated with the Yang--Mills equations. We also show that the schemes satisfy a discrete energy dissipation (the dissipation coming solely from the implicit time stepping). Issues around the practical implementations of the schemes are discussed; in particular, the assembly of the local contributions in a way that minimises the price we pay in dealing with nonlinear terms, in conjunction with the tensorisation coming from the Lie algebra. Numerical tests are provided using a manufactured solution, and show that both schemes display a convergence in $L^2$-norm of the potential and electrical fields in $\mathcal O(h^{k+1})$ (provided that the time step is of that order), where $k$ is the polynomial degree chosen for the DDR complex. We also numerically demonstrate the preservation of the constraint.
This paper focuses on the expected difference in borrower's repayment when there is a change in the lender's credit decisions. Classical estimators overlook the confounding effects and hence the estimation error can be magnificent. As such, we propose another approach to construct the estimators such that the error can be greatly reduced. The proposed estimators are shown to be unbiased, consistent, and robust through a combination of theoretical analysis and numerical testing. Moreover, we compare the power of estimating the causal quantities between the classical estimators and the proposed estimators. The comparison is tested across a wide range of models, including linear regression models, tree-based models, and neural network-based models, under different simulated datasets that exhibit different levels of causality, different degrees of nonlinearity, and different distributional properties. Most importantly, we apply our approaches to a large observational dataset provided by a global technology firm that operates in both the e-commerce and the lending business. We find that the relative reduction of estimation error is strikingly substantial if the causal effects are accounted for correctly.