In this article, we propose a reduced basis method for parametrized non-symmetric eigenvalue problems arising in the loading pattern optimization of a nuclear core in neutronics. To this end, we derive a posteriori error estimates for the eigenvalue and the left and right eigenvectors. The practical computation of these estimators requires the estimation of a constant called the prefactor, which we can express as the spectral norm of some operator. We provide some elements of theoretical analysis which illustrate the link between the expression of the prefactor we obtain here and its well-known expression in the case of symmetric eigenvalue problems, either using the notion of numerical range of the operator, or via a perturbative analysis. Lastly, we propose a practical method to estimate this prefactor, which yields interesting numerical results on actual test cases. We provide detailed numerical simulations on two-dimensional examples including a multigroup neutron diffusion equation.
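To make the roles of the left and right eigenvectors concrete, here is a minimal sketch for a small non-symmetric matrix. It is not the reduced basis method or the error estimator of the article: it uses plain power iteration on $A$ and $A^T$, a two-sided Rayleigh quotient for the eigenvalue, and residual norms of the kind that a posteriori estimators are typically built from. The function names and the 2x2 test matrix are hypothetical.

```python
def matvec(A, x):
    """Dense matrix-vector product for a list-of-rows matrix."""
    return [sum(a * xj for a, xj in zip(row, x)) for row in A]

def transpose(A):
    return [list(col) for col in zip(*A)]

def dot(x, y):
    return sum(xi * yi for xi, yi in zip(x, y))

def power_iteration(A, iters=100):
    """Approximate the dominant eigenvector, normalized in the max norm."""
    x = [1.0] + [0.0] * (len(A) - 1)
    for _ in range(iters):
        y = matvec(A, x)
        scale = max(abs(c) for c in y)
        x = [c / scale for c in y]
    return x

def dominant_eigentriple(A, iters=100):
    """Return (lam, v, u, res_right, res_left) for the dominant eigenvalue.

    v is a right eigenvector (A v = lam v) and u a left eigenvector
    (A^T u = lam u). The eigenvalue is the two-sided Rayleigh quotient
    u^T A v / u^T v. Assumes a real, simple, strictly dominant eigenvalue.
    """
    At = transpose(A)
    v = power_iteration(A, iters)
    u = power_iteration(At, iters)
    lam = dot(u, matvec(A, v)) / dot(u, v)
    # Max-norm residuals of the right and left eigenproblems.
    res_right = max(abs(a - lam * b) for a, b in zip(matvec(A, v), v))
    res_left = max(abs(a - lam * b) for a, b in zip(matvec(At, u), u))
    return lam, v, u, res_right, res_left

# Hypothetical non-symmetric test matrix with eigenvalues 5 and 2.
lam, v, u, res_right, res_left = dominant_eigentriple([[4.0, 1.0], [2.0, 3.0]])
```

For this matrix the right and left eigenvectors differ (proportional to $(1,1)$ and $(2,1)$ respectively), which is the feature that distinguishes the non-symmetric case from the symmetric one.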
One of the central quantities of probabilistic seismic risk assessment studies is the fragility curve, which represents the probability of failure of a mechanical structure conditional on a scalar measure derived from the seismic ground motion. Estimating such curves is a difficult task because, for most structures of interest, few data are available. For this reason, many methods in the literature rely on a parametric log-normal model. Bayesian approaches allow for efficient learning of the model parameters. However, the choice of the prior distribution has a non-negligible influence on the posterior distribution, and therefore on any resulting estimate. We propose a thorough study of this parametric Bayesian estimation problem when the data are binary (i.e., the data indicate the state of the structure, failure or non-failure). Using the reference prior theory as a support, we suggest an objective approach for the prior choice. This approach leads to the Jeffreys prior, which is explicitly derived for this problem for the first time. The posterior distribution is proven to be proper (i.e., it integrates to unity) with the Jeffreys prior and improper with some classical priors from the literature. The posterior distribution with the Jeffreys prior is also shown to vanish at the boundaries of the parameter domain, so sampling from the posterior distribution of the parameters does not produce anomalously small or large values, and hence does not produce degenerate fragility curves such as unit step functions. The numerical results on three different case studies illustrate these theoretical predictions.
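As an illustration of the parametric log-normal model with binary data, the following sketch evaluates a fragility curve and the corresponding Bernoulli log-likelihood. The parametrization by a median `alpha` and a log-standard deviation `beta`, as well as the function names, are assumptions made for this example; the abstract does not fix a notation, and the Jeffreys prior itself is not reproduced here.

```python
import math

def fragility(a, alpha, beta):
    """Log-normal fragility curve: P(failure | intensity a).

    Assumed form: Phi(ln(a / alpha) / beta), with Phi the standard
    normal CDF, alpha > 0 the median and beta > 0 the log-std.
    """
    z = math.log(a / alpha) / beta
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def log_likelihood(observations, alpha, beta, eps=1e-12):
    """Bernoulli log-likelihood of binary data [(a_i, failed_i), ...].

    Probabilities are clipped to [eps, 1 - eps] to avoid log(0).
    """
    total = 0.0
    for a, failed in observations:
        p = min(max(fragility(a, alpha, beta), eps), 1.0 - eps)
        total += math.log(p) if failed else math.log(1.0 - p)
    return total

# Hypothetical data: (intensity measure, failure indicator).
data = [(0.2, False), (0.4, False), (0.6, True), (0.9, True)]
ll = log_likelihood(data, alpha=0.5, beta=0.3)
```

In this parametrization, letting `beta` tend to zero turns the curve into a unit step at `alpha`, which is the kind of degenerate fragility curve mentioned above.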
A standard approach to solve ordinary differential equations, when they describe dynamical systems, is to adopt a Runge-Kutta or related scheme. Such schemes, however, are not applicable to the large class of equations which do not constitute dynamical systems. First, several physical systems are governed by integro-differential equations with memory terms, where the time derivative of a state variable at a given time depends on all past states of the system. Second, there are equations whose solutions do not have a well-defined Taylor series expansion. The Maxey-Riley-Gatignol equation, which describes the dynamics of an inertial particle in nonuniform and unsteady flow, displays both challenges. We use it as a test bed to address the questions we raise, but our method may be applied to all equations of this class. We show that the Maxey-Riley-Gatignol equation can be embedded into an extended Markovian system which is constructed by introducing a new dynamical co-evolving state variable that encodes memory of past states. We develop a Runge-Kutta algorithm for the resultant Markovian system. The form of the kernels involved in deriving the Runge-Kutta scheme necessitates the use of an expansion in powers of $t^{1/2}$. Our approach naturally inherits the benefits of standard time-integrators, namely a constant memory storage cost, a linear growth of operational effort with simulation time, and the ability to restart a simulation with the final state as the new initial condition.
This paper presents the error analysis of numerical methods on graded meshes for stochastic Volterra equations with weakly singular kernels. We first prove a novel regularity estimate for the exact solution by analyzing the associated convolution structure. This reveals that the exact solution exhibits an initial singularity in the sense that its H\"older continuity exponent on any neighborhood of $t=0$ is lower than that on every compact subset of $(0,T]$. Motivated by the initial singularity, we then construct the Euler--Maruyama method, fast Euler--Maruyama method, and Milstein method based on graded meshes. By establishing their pointwise-in-time error estimates, we give the grading exponents of meshes to attain the optimal uniform-in-time convergence orders, where the convergence orders improve those of the uniform mesh case. Numerical experiments are finally reported to confirm the sharpness of theoretical findings.
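A graded mesh clusters time points near $t=0$, where the solution is least regular. The sketch below assumes the common form $t_n = T (n/N)^r$ with grading exponent $r \ge 1$; the abstract does not state the mesh formula, so this form and the function name are assumptions, and the time-stepping schemes themselves are not reproduced.

```python
def graded_mesh(T, N, r):
    """Return the N + 1 points t_n = T * (n / N) ** r on [0, T].

    r = 1 gives the uniform mesh; r > 1 refines the mesh near t = 0.
    """
    return [T * (n / N) ** r for n in range(N + 1)]

mesh = graded_mesh(T=1.0, N=4, r=2)
steps = [b - a for a, b in zip(mesh, mesh[1:])]
```

With $r > 1$ the step sizes increase with $n$, so the smallest steps sit next to the initial singularity.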
We propose a new randomized method for solving systems of nonlinear equations, which can find sparse solutions or solutions under certain simple constraints. The scheme only takes gradients of component functions and uses Bregman projections onto the solution space of a Newton equation. In the special case of Euclidean projections, the method is known as the nonlinear Kaczmarz method. Furthermore, if the component functions are nonnegative, we are in the setting of optimization under the interpolation assumption and the method reduces to SGD with the recently proposed stochastic Polyak step size. For general Bregman projections, our method is a stochastic mirror descent with a novel adaptive step size. We prove that in the convex setting each iteration of our method results in a smaller Bregman distance to exact solutions as compared to the standard Polyak step. Our generalization to Bregman projections comes at the price that a convex one-dimensional optimization problem needs to be solved in each iteration. This can typically be done with globalized Newton iterations. Convergence is proved in two classical settings of nonlinearity: for convex nonnegative functions and locally for functions which fulfill the tangential cone condition. Finally, we show examples in which the proposed method outperforms similar methods with the same memory requirements.
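The Euclidean special case can be sketched in a few lines. The code below implements only the nonlinear Kaczmarz update $x \leftarrow x - \frac{f_i(x)}{\|\nabla f_i(x)\|^2} \nabla f_i(x)$ for a randomly chosen component $f_i$, not the Bregman variant proposed in the abstract. The function name, the uniform sampling of components, and the two-equation test system are hypothetical choices for illustration.

```python
import random

def nonlinear_kaczmarz(funcs, grads, x0, iters=200, seed=0):
    """Randomized nonlinear Kaczmarz iteration for f_i(x) = 0, i = 1..m.

    Each step projects x (in the Euclidean sense) onto the solution set
    of the linearized equation f_i(x) + <grad f_i(x), y - x> = 0.
    """
    rng = random.Random(seed)
    x = list(x0)
    for _ in range(iters):
        i = rng.randrange(len(funcs))        # sample one equation uniformly
        fi = funcs[i](x)
        g = grads[i](x)
        g2 = sum(gj * gj for gj in g)
        if g2 == 0.0:                        # no descent direction available
            continue
        step = fi / g2                       # Polyak-type step size
        x = [xj - step * gj for xj, gj in zip(x, g)]
    return x

# Hypothetical system: unit circle intersected with the line x0 = x1.
funcs = [lambda x: x[0] ** 2 + x[1] ** 2 - 1.0,
         lambda x: x[0] - x[1]]
grads = [lambda x: [2.0 * x[0], 2.0 * x[1]],
         lambda x: [1.0, -1.0]]
solution = nonlinear_kaczmarz(funcs, grads, x0=[1.0, 0.5])
```

Started from `[1.0, 0.5]`, the iterates approach the intersection point with both coordinates equal to $1/\sqrt{2}$. The step `fi / g2` coincides with the stochastic Polyak step size when each $f_i$ is nonnegative with minimal value zero.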
We study the problem of testing identity of a collection of unknown quantum states given sample access to this collection, each state appearing with some known probability. We show that for a collection of $d$-dimensional quantum states of cardinality $N$, the sample complexity is $O(\sqrt{N}d/\epsilon^2)$, with a matching lower bound, up to a multiplicative constant. The test is obtained by estimating the mean squared Hilbert-Schmidt distance between the states, thanks to a suitable generalization of the estimator of the Hilbert-Schmidt distance between two unknown states by B\u{a}descu, O'Donnell, and Wright (//dl.acm.org/doi/10.1145/3313276.3316344).
We present a multigrid algorithm to efficiently solve the large saddle-point systems of equations that typically arise in PDE-constrained optimization under uncertainty. The algorithm is based on a collective smoother that at each iteration sweeps over the nodes of the computational mesh, and solves a reduced saddle-point system whose size depends on the number $N$ of samples used to discretize the probability space. We show that this reduced system can be solved with optimal $O(N)$ complexity. We test the multigrid method on three problems: a linear-quadratic problem for which the multigrid method is used to solve directly the linear optimality system; a nonsmooth problem with box constraints and $L^1$-norm penalization on the control, in which the multigrid scheme is used within a semismooth Newton iteration; a risk-averse problem with the smoothed CVaR risk measure where the multigrid method is called within a preconditioned Newton iteration. In all cases, the multigrid algorithm exhibits very good performance and robustness with respect to all parameters of interest.
In this paper, we make what is, to the best of our knowledge, the first attempt at studying parametric semilinear elliptic eigenvalue problems with a parametric coefficient and some power-type nonlinearities. The parametric coefficient is assumed to have an affine dependence on the countably many parameters with an appropriate class of sequences of functions. We obtain an upper bound for the mixed derivatives of the ground eigenpairs that has the same form as the one recently obtained for the linear eigenvalue problem. The three most essential ingredients for this estimate are the parametric analyticity of the ground eigenpairs, the uniform boundedness of the ground eigenpairs, and the uniform positive differences between ground eigenvalues of linear operators. All three ingredients need new techniques and a careful investigation of the nonlinear eigenvalue problem, which are presented in this paper. As an application, considering each parameter as a uniformly distributed random variable, we estimate the expectation of the eigenpairs using a randomly shifted quasi-Monte Carlo lattice rule and establish a dimension-independent error bound.
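A randomly shifted rank-1 lattice rule approximates an expectation over $[0,1]^s$ by averaging the integrand over the points $\{k z / N + \Delta\}$, $k = 0, \dots, N-1$, where $z$ is a generating vector, $\Delta$ is a uniform random shift, and braces denote the componentwise fractional part. The sketch below is a generic version of this rule, not the construction analysed in the paper: the function name, the two-dimensional Fibonacci generating vector $(1, 377)$ with $N = 610$, and the toy integrand are assumptions, and no eigenvalue problem is solved.

```python
import math
import random

def shifted_lattice_estimate(f, z, n_points, n_shifts=8, seed=0):
    """Estimate the integral of f over the unit cube with a randomly
    shifted rank-1 lattice rule.

    Returns (mean, stderr): the average over independent random shifts
    and the standard error computed from the spread across shifts.
    """
    rng = random.Random(seed)
    dim = len(z)
    estimates = []
    for _ in range(n_shifts):
        shift = [rng.random() for _ in range(dim)]
        total = 0.0
        for k in range(n_points):
            # k-th lattice point, shifted and wrapped back into [0, 1)^dim.
            point = [(k * zj / n_points + dj) % 1.0 for zj, dj in zip(z, shift)]
            total += f(point)
        estimates.append(total / n_points)
    mean = sum(estimates) / n_shifts
    var = sum((q - mean) ** 2 for q in estimates) / (n_shifts - 1)
    return mean, math.sqrt(var / n_shifts)

# Toy integrand with exact integral 1/4 over the unit square.
mean, stderr = shifted_lattice_estimate(lambda y: y[0] * y[1], (1, 377), 610)
```

Averaging over several independent shifts keeps the estimator unbiased and provides a practical error indicator through the standard error across shifts.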
We investigate a class of parametric elliptic eigenvalue problems with homogeneous essential boundary conditions where the coefficients (and hence the solution $u$) may depend on a parameter $y$. For the efficient approximate evaluation of parameter sensitivities of the first eigenpairs on the entire parameter space we propose and analyse Gevrey class and analytic regularity of the solution with respect to the parameters. This is made possible by a novel proof technique which we introduce and demonstrate in this paper. Our regularity result has immediate implications for convergence of various numerical schemes for parametric elliptic eigenvalue problems, in particular, for elliptic eigenvalue problems with infinitely many parameters arising from elliptic differential operators with random coefficients.
We derive information-theoretic generalization bounds for supervised learning algorithms based on the information contained in predictions rather than in the output of the training algorithm. These bounds improve over the existing information-theoretic bounds, are applicable to a wider range of algorithms, and solve two key challenges: (a) they give meaningful results for deterministic algorithms and (b) they are significantly easier to estimate. We show experimentally that the proposed bounds closely follow the generalization gap in practical scenarios for deep learning.
When and why can a neural network be successfully trained? This article provides an overview of optimization algorithms and theory for training neural networks. First, we discuss the issue of gradient explosion/vanishing and the more general issue of undesirable spectrum, and then discuss practical solutions including careful initialization and normalization methods. Second, we review generic optimization methods used in training neural networks, such as SGD, adaptive gradient methods and distributed methods, and theoretical results for these algorithms. Third, we review existing research on the global issues of neural network training, including results on bad local minima, mode connectivity, lottery ticket hypothesis and infinite-width analysis.