We describe a `discretize-then-relax' strategy to globally minimize integral functionals over functions $u$ in a Sobolev space satisfying prescribed Dirichlet boundary conditions. The strategy applies whenever the integral functional depends polynomially on $u$ and its derivatives, even if it is nonconvex. The `discretize' step uses a bounded finite-element scheme to approximate the integral minimization problem with a convergent hierarchy of polynomial optimization problems over a compact feasible set, indexed by the decreasing size $h$ of the finite-element mesh. The `relax' step employs sparse moment-SOS relaxations to approximate each polynomial optimization problem with a hierarchy of convex semidefinite programs, indexed by an increasing relaxation order $\omega$. We prove that, as $\omega\to\infty$ and $h\to 0$, solutions of these semidefinite programs provide approximate minimizers that converge in $L^p$ to the global minimizer of the original integral functional, provided it is unique. We also report computational experiments showing that our numerical strategy works well even when the technical conditions required by our theoretical analysis are not satisfied.
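As a toy illustration of the `relax' step (our sketch, not the paper's implementation; it assumes the cvxpy package with an SDP-capable solver such as SCS), the first moment-SOS relaxation of a univariate quartic reduces to a single small semidefinite program: maximize $\gamma$ such that $p - \gamma$ is a sum of squares.

```python
# Minimal sketch (not the paper's method): first moment-SOS relaxation for
# globally minimizing the univariate quartic p(x) = x^4 - 3x^2 + x.
import cvxpy as cp

c = [0.0, 1.0, -3.0, 0.0, 1.0]        # coefficients c0..c4 of p
gamma = cp.Variable()                 # candidate lower bound on min p
Q = cp.Variable((3, 3), PSD=True)     # Gram matrix in the basis [1, x, x^2]

constraints = [                       # match coefficients of p(x) - gamma
    Q[0, 0] == c[0] - gamma,          # constant term
    2 * Q[0, 1] == c[1],              # x
    2 * Q[0, 2] + Q[1, 1] == c[2],    # x^2
    2 * Q[1, 2] == c[3],              # x^3
    Q[2, 2] == c[4],                  # x^4
]
prob = cp.Problem(cp.Maximize(gamma), constraints)
prob.solve()
print("SOS lower bound on min p:", gamma.value)
```

In the univariate case the SOS bound is exact; the paper applies such relaxations, with sparsity exploited, to the polynomial optimization problems produced by the finite-element discretization.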
We propose a numerical method for solving parameter-dependent hyperbolic partial differential equations (PDEs) with a moment approach, building on previous work by Marx et al. (2020). The approach relies on a very weak notion of solution of nonlinear equations, namely parametric entropy measure-valued (MV) solutions, which satisfy linear equations in the space of Borel measures. The infinite-dimensional linear problem is approximated by a hierarchy of convex, finite-dimensional semidefinite programming problems, called Lasserre's hierarchy. This yields a sequence of approximations of the moments of the occupation measure associated with the parametric entropy MV solution, which we prove to converge. Several post-processing steps can then be performed on this approximate moment sequence. In particular, the graph of the solution can be reconstructed by optimizing the Christoffel-Darboux kernel associated with the approximate measure, a powerful approximation tool able to capture a large class of irregular functions. Moreover, for uncertainty quantification problems, several quantities of interest can be estimated, sometimes directly, such as the expectation of smooth functionals of the solutions. The performance of our approach is evaluated through numerical experiments on the inviscid Burgers equation with a parametrised initial condition or a parametrised flux function.
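As a small illustration of the Christoffel-Darboux post-treatment (our sketch, not the authors' code), the inverse Christoffel function $q_d(x) = v_d(x)^\top M_d^{-1} v_d(x)$ built from a moment matrix $M_d$ stays moderate on the support of the underlying measure and grows rapidly off it; here we use the exactly known moments of the uniform measure on $[-1,1]$.

```python
# Support detection via the Christoffel-Darboux kernel (our illustration).
import numpy as np

d = 6                                          # polynomial degree of the kernel
# Hankel moment matrix M[i, j] = integral of x^(i+j) dmu, mu uniform on [-1, 1]
m = [(1 + (-1) ** k) / (2 * (k + 1)) for k in range(2 * d + 1)]
M = np.array([[m[i + j] for j in range(d + 1)] for i in range(d + 1)])
Minv = np.linalg.inv(M)

def inv_christoffel(x):
    """q(x) = v(x)^T M^{-1} v(x): small on the support, large off it."""
    v = np.array([x ** k for k in range(d + 1)])
    return v @ Minv @ v

for x in [0.0, 0.5, 0.9, 1.1, 1.5]:
    print(f"x = {x:4.1f}   q(x) = {inv_christoffel(x):12.2f}")
```

For graph reconstruction one applies the same idea to the joint moments of the occupation measure and selects, at each point of the domain, the solution value minimizing the kernel.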
We study the size of a neural network needed to approximate the maximum function over $d$ inputs, in the most basic setting of approximation with respect to the $L_2$ norm, for continuous distributions, with a network that uses ReLU activations. We provide new lower and upper bounds on the width required for approximation across various depths. Our results establish new depth separations between depth 2 and 3 and between depth 3 and 5 networks, and provide a depth $\mathcal{O}(\log(\log(d)))$, width $\mathcal{O}(d)$ construction that approximates the maximum function, significantly improving upon the depth requirements of the best previously known bounds for networks with linearly bounded width. Our depth separation results are facilitated by a new lower bound for depth 2 networks approximating the maximum function over the uniform distribution, assuming an exponential upper bound on the size of the weights. Furthermore, we use this depth 2 lower bound to provide tight bounds on the number of neurons needed to approximate the maximum with a depth 3 network. Our lower bounds are of potentially broad interest, as they apply to the widely studied and used \emph{max} function, in contrast to many previous results that base their bounds on specially constructed or pathological functions and distributions.
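For orientation, the classical pairwise-tournament construction computes the maximum exactly with depth $\lceil\log_2 d\rceil$ and width $\mathcal{O}(d)$, using $\max(a,b) = \tfrac{1}{2}(a+b) + \tfrac{1}{2}|a-b|$ and $|z| = \mathrm{ReLU}(z) + \mathrm{ReLU}(-z)$. The sketch below (ours, the standard baseline rather than the paper's improved $\mathcal{O}(\log\log d)$ construction) simulates it; the linear terms can themselves be written with ReLUs via $z = \mathrm{ReLU}(z) - \mathrm{ReLU}(-z)$.

```python
# Exact max of d inputs by a pairwise ReLU tournament (classical baseline).
import numpy as np

def relu(z):
    return np.maximum(z, 0.0)

def tournament_max(x):
    """One ReLU layer per round; depth = ceil(log2(len(x)))."""
    x = np.asarray(x, dtype=float)
    while x.size > 1:
        if x.size % 2 == 1:              # odd count: duplicate the last entry
            x = np.append(x, x[-1])
        a, b = x[0::2], x[1::2]
        # max(a, b) = (a + b)/2 + |a - b|/2, with |z| = relu(z) + relu(-z)
        x = 0.5 * (a + b) + 0.5 * (relu(a - b) + relu(b - a))
    return x[0]

print(tournament_max([3.1, -2.0, 7.5, 0.4, 5.9]))   # -> 7.5
```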
It is well known that useful confidence intervals (CIs) for the mean or median of a response $Y$ conditional on features $X = x$ cannot be constructed without strong assumptions about the joint distribution of $X$ and $Y$. This paper introduces a new framework for reasoning about problems of this kind by casting the conditional problem at different levels of resolution, ranging from coarse to fine localization. In each of these problems, we consider local quantiles, defined as the marginal quantiles of $Y$ when $(X,Y)$ is resampled in such a way that samples with $X$ near $x$ are up-weighted while the conditional distribution $Y \mid X$ does not change. We then introduce the Weighted Quantile method, which asymptotically produces the uniformly most accurate confidence intervals for these local quantiles regardless of the (unknown) underlying distribution. A second method, the Quantile Rejection method, achieves finite-sample validity under no assumptions whatsoever. We conduct extensive numerical studies demonstrating that both methods are valid. In particular, we show that the Weighted Quantile procedure achieves nominal coverage as soon as the effective sample size is in the range of 10 to 20.
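A schematic sketch of the reweighting idea (ours; the Gaussian weights and bandwidth are illustrative assumptions, not the authors' procedure): a local quantile can be estimated as a weighted empirical quantile of $Y$ in which samples with $X$ near $x$ receive larger weight.

```python
# Weighted empirical quantile as a point estimate of a local quantile.
import numpy as np

def local_quantile(X, Y, x0, tau=0.5, bandwidth=0.5):
    """Weighted tau-quantile of Y with Gaussian weights centred at x0."""
    w = np.exp(-0.5 * ((X - x0) / bandwidth) ** 2)
    w = w / w.sum()
    order = np.argsort(Y)
    cdf = np.cumsum(w[order])              # weighted empirical CDF of Y
    return Y[order][np.searchsorted(cdf, tau)]

rng = np.random.default_rng(0)
X = rng.uniform(-2, 2, 2000)
Y = X ** 2 + rng.normal(0, 0.3, 2000)      # median of Y given X = x0 is x0^2
print(local_quantile(X, Y, x0=1.0))        # close to 1.0 for small bandwidths
```

As the bandwidth grows, the estimate interpolates toward the marginal quantile of $Y$, which is the coarse end of the resolution scale described above.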
We study the validity of the Neumann or Born series approach for solving the Helmholtz equation and for coefficient identification in related inverse scattering problems. Precisely, we derive a necessary and sufficient condition under which the series converges strongly, and we investigate its rate of convergence. The condition obtained is optimal and can be much weaker than the traditional requirement for convergence of the series. Our approach makes use of reduction space techniques proposed by Suzuki \cite{Suzuki-1976}. Furthermore, we propose an interpolation method that allows the Neumann series to be used in all cases. Finally, we provide several numerical tests with different medium functions and frequency values to validate our theoretical results.
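The gap between norm conditions and sharp convergence conditions is easy to see in finite dimensions (a generic sketch of ours, not the paper's Helmholtz setting): the Neumann series $\sum_k K^k f$ for $(I-K)u = f$ converges whenever the spectral radius of $K$ is below one, even if $\|K\| > 1$.

```python
# Neumann series for (I - K) u = f via u_{n+1} = f + K u_n.
import numpy as np

K = np.array([[0.5, 2.0],
              [0.0, 0.5]])               # ||K|| > 1 but spectral radius = 0.5
f = np.array([1.0, 1.0])

u = np.zeros(2)
for n in range(200):
    u = f + K @ u                        # partial sums of the Neumann series

print("series:", u)
print("direct:", np.linalg.solve(np.eye(2) - K, f))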
The criticality problem in nuclear engineering asks for the principal eigen-pair of a Boltzmann operator describing neutron transport in a reactor core. Reliably designing and controlling such reactors requires assessing these quantities within quantifiable accuracy tolerances. In this paper we propose a paradigm that deviates from the common practice of approximately solving the corresponding spectral problem with a fixed, presumably sufficiently fine discretization. Instead, our approach is based on first devising iterative schemes, formulated in function space, that are shown to converge at a quantitative rate without assuming any a priori excess regularity, exploiting only properties of the optical parameters in the underlying radiative transfer model. We develop the analytical and numerical tools for approximately realizing each iteration step within judiciously chosen accuracy tolerances, verified by a posteriori estimates, so as to still warrant quantifiable convergence to the exact eigen-pair. This is carried out in full first for a Newton scheme. Since Newton's method is only locally convergent, we additionally analyze the convergence of a power iteration in function space to produce sufficiently accurate initial guesses. Here we have to deal with intrinsic difficulties posed by compact but nonsymmetric operators, which prevent the standard arguments used in the finite-dimensional case. Our main point is that we can avoid any condition requiring an initial guess to lie in a small neighborhood of the exact solution. We close with a discussion of the remaining intrinsic obstructions to a certifiable numerical implementation, mainly related to not knowing the gap between the principal eigenvalue and the next smaller one in modulus.
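A finite-dimensional caricature of the power-iteration step (ours; the paper works with the transport operator in function space, where compactness without symmetry defeats the standard finite-dimensional arguments): for a positive matrix, the Perron eigen-pair is obtained by repeated application and normalization, at a rate governed by the very eigenvalue gap the abstract mentions.

```python
# Power iteration for the principal eigen-pair of a nonsymmetric matrix.
import numpy as np

rng = np.random.default_rng(1)
A = rng.uniform(0.1, 1.0, (50, 50))      # positive entries => Perron eigen-pair

x = np.ones(50)
for k in range(200):
    y = A @ x
    lam = np.linalg.norm(y)              # since x is normalized, ||Ax|| -> lambda
    x = y / np.linalg.norm(y)

print("principal eigenvalue approx:", lam)
print("reference:", max(abs(np.linalg.eigvals(A))))
```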
Let $\Omega = [0,1]^d$ be the unit cube in $\mathbb{R}^d$. We study the problem of how efficiently, in terms of the number of parameters, deep neural networks with the ReLU activation function can approximate functions in the Sobolev spaces $W^s(L_q(\Omega))$ and Besov spaces $B^s_r(L_q(\Omega))$, with error measured in the $L_p(\Omega)$ norm. This problem is important when studying the application of neural networks in a variety of fields, including scientific computing and signal processing, and has previously been completely solved only when $p=q=\infty$. Our contribution is to provide a complete solution for all $1\leq p,q\leq \infty$ and $s > 0$, including asymptotically matching upper and lower bounds. The key technical tool is a novel bit-extraction technique which gives an optimal encoding of sparse vectors. This enables us to obtain sharp upper bounds in the non-linear regime where $p > q$. We also provide a novel method for deriving $L_p$-approximation lower bounds based upon VC-dimension when $p < \infty$. Our results show that very deep ReLU networks significantly outperform classical methods of approximation in terms of the number of parameters, but that this comes at the cost of parameters which are not encodable.
Statistical inference for high-dimensional regression coefficients is challenging because the uncertainty introduced by the model selection procedure is hard to account for. A critical question remains unsettled: is it possible, and if so how, to embed the inference of the model into the simultaneous inference of the coefficients? To this end, we propose a notion of simultaneous confidence intervals called sparsified simultaneous confidence intervals. Our intervals are sparse in the sense that some of the intervals' upper and lower bounds are shrunken to zero (i.e., $[0,0]$), indicating the unimportance of the corresponding covariates; these covariates should be excluded from the final model. The remaining intervals, either containing zero (e.g., $[-1,1]$ or $[0,1]$) or not containing zero (e.g., $[2,3]$), indicate the plausible and the significant covariates, respectively. The proposed method can be coupled with various selection procedures, making it ideal for comparing their uncertainty. We establish desirable asymptotic properties for the proposed method, develop intuitive graphical tools for visualization, and demonstrate its superior performance through simulation and real data analysis.
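The three kinds of intervals can be read off mechanically; a toy sketch (ours) of the classification just described:

```python
# Classify sparsified simultaneous confidence intervals.
def classify(lo, hi):
    if lo == 0.0 and hi == 0.0:
        return "excluded (shrunken to [0, 0])"
    if lo <= 0.0 <= hi:
        return "plausible (interval contains zero)"
    return "significant (interval excludes zero)"

for interval in [(0.0, 0.0), (-1.0, 1.0), (0.0, 1.0), (2.0, 3.0)]:
    print(interval, "->", classify(*interval))
```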
We introduce the notion of a Real Equation System (RES), which lifts Boolean Equation Systems (BESs) to the domain of the extended real numbers. Our RESs allow arbitrary nesting of least and greatest fixed-point operators. We show that each RES can be rewritten into an equivalent RES in normal form. These normal forms provide the basis for a complete procedure for solving RESs, which eliminates the fixed-point variable on the left-hand side of an equation from its right-hand side, combined with a technique often referred to as Gau{\ss} elimination. We illustrate how this framework can be used to verify quantitative modal formulas with alternating fixed-point operators interpreted over probabilistic labelled transition systems.
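As a toy example (ours, not the paper's procedure; we write $\mu$ for a least fixed point), consider the RES $\mu X = \max(0, Y-1)$, $\mu Y = \max(2, X)$. Kleene iteration from $(-\infty,-\infty)$ stabilizes at its least solution $(1,2)$; in general such iteration need not terminate over the reals, which is what makes the normal forms and Gau{\ss} elimination necessary.

```python
# Kleene iteration for the least fixed point of a monotone RES over the
# extended reals:  mu X = max(0, Y - 1),  mu Y = max(2, X).
import math

X, Y = -math.inf, -math.inf
for _ in range(10):
    X, Y = max(0.0, Y - 1.0), max(2.0, X)

print(X, Y)    # least fixed point: X = 1, Y = 2
```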
In contrast to the diffusion equation, which smooths the initial data to $C^\infty$ for $t>0$ (away from the corners/edges of the domain), the subdiffusion equation exhibits only limited spatial regularity. As a result, one generally cannot expect high-order accuracy in space when solving the subdiffusion equation with nonsmooth initial data. In this paper, a new splitting of the solution is constructed for high-order finite element approximations to the subdiffusion equation with nonsmooth initial data: the solution is split into a time-dependent smooth part and a time-independent nonsmooth part, and the two parts are approximated via different strategies. The time-dependent smooth part is approximated using a high-order finite element method in space and convolution quadrature in time, while the steady nonsmooth part can be approximated using a smaller mesh size or other methods that yield high-order accuracy. Several examples are presented to show how to accurately approximate the steady nonsmooth part, including piecewise smooth initial data, Dirac delta point initial data, and a Dirac measure concentrated on an interface. The argument extends directly to subdiffusion equations with nonsmooth source data. Extensive numerical experiments are presented to support the theoretical analysis and to illustrate the performance of the proposed high-order splitting finite element methods.
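Schematically (our notation, with $\partial_t^\alpha$ the Caputo fractional derivative of order $\alpha\in(0,1)$; the precise splitting is defined in the paper), the model problem and splitting read
\[
\partial_t^\alpha u - \Delta u = f, \quad u(0) = u_0, \qquad u = v + w,
\]
where $w$ is time-independent and carries the nonsmooth data, resolved separately (e.g., on a locally refined mesh), while $v$ is smooth in space for $t>0$ and is approximated by high-order finite elements in space and convolution quadrature in time.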
We introduce a new quantum algorithm for computing the Betti numbers of a simplicial complex. In contrast to previous quantum algorithms, which work by estimating the eigenvalues of the combinatorial Laplacian, our algorithm is an instance of the generic Incremental Algorithm for computing Betti numbers, which incrementally adds simplices to the simplicial complex and tests whether or not each addition creates a cycle. Whereas existing quantum algorithms for computing Betti numbers work best when the complex has close to the maximal number of simplices, our algorithm works best for sparse complexes. To test whether a simplex creates a cycle, we introduce a quantum span-program algorithm whose query complexity is parameterized by quantities called the effective resistance and effective capacitance of the boundary of the simplex. Unfortunately, we also prove upper and lower bounds on the effective resistance and capacitance showing that both quantities can be exponentially large with respect to the size of the complex, implying that our algorithm would have to run for exponential time to compute Betti numbers exactly. However, as a corollary to these bounds, we show that the spectral gap of the combinatorial Laplacian can be exponentially small. As the runtimes of all previous quantum algorithms for computing Betti numbers are parameterized by the inverse of the spectral gap, our bounds show that all such algorithms must likewise run for exponential time to compute Betti numbers exactly. Finally, we prove some novel formulas for effective resistance and effective capacitance to give intuition for these quantities.
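For intuition, the quantities being computed are classically ranks of boundary matrices, via $\beta_k = \dim C_k - \operatorname{rank}\partial_k - \operatorname{rank}\partial_{k+1}$; the quantum algorithm instead tests cycle creation for each added simplex with a span program. A small sketch (ours, the classical computation rather than the quantum algorithm) for the hollow triangle on vertices $\{0,1,2\}$:

```python
# Betti numbers from boundary-matrix ranks for the hollow triangle.
import numpy as np

# d1: columns are the oriented edges (01), (02), (12); rows are vertices 0,1,2
d1 = np.array([[-1, -1,  0],
               [ 1,  0, -1],
               [ 0,  1,  1]])
r1 = np.linalg.matrix_rank(d1)

beta0 = 3 - 0 - r1    # 3 vertices, rank d0 = 0       -> 1 connected component
beta1 = 3 - r1 - 0    # 3 edges, no 2-simplices       -> 1 independent cycle
print(beta0, beta1)   # -> 1 1
```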