Following White's approach to robust multiple linear regression, we give asymptotic confidence intervals for the multiple correlation coefficient R2 under minimal moment conditions. We also give the asymptotic joint distribution of the empirical estimators of the individual R2's. Through different sets of simulations, we show that the procedure is indeed robust (contrary to the procedure involving the near-exact distribution of the empirical estimator of R2 in the multivariate Gaussian case) and can also be applied to count linear regression.
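As a rough illustration of the kind of interval described above, the sketch below computes the empirical R2 from an OLS fit and forms a Wald-type confidence interval. The jackknife standard error is our stand-in for the paper's closed-form asymptotic variance, and all function names are ours.

```python
import numpy as np
from scipy.stats import norm

def r2(X, y):
    """Empirical multiple correlation coefficient R^2 from an OLS fit."""
    X1 = np.column_stack([np.ones(len(y)), X])
    beta, *_ = np.linalg.lstsq(X1, y, rcond=None)
    resid = y - X1 @ beta
    return 1.0 - resid.var() / y.var()

def r2_ci(X, y, alpha=0.05):
    """Wald-type CI for R^2; the jackknife variance below is a stand-in
    for the paper's closed-form asymptotic variance."""
    n = len(y)
    est = r2(X, y)
    loo = np.array([r2(np.delete(X, i, axis=0), np.delete(y, i))
                    for i in range(n)])
    se = np.sqrt((n - 1) / n * np.sum((loo - loo.mean()) ** 2))
    z = norm.ppf(1 - alpha / 2)
    return est - z * se, est + z * se
```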
It is well known that Newton's method, especially when applied to large problems such as the discretization of nonlinear partial differential equations (PDEs), can have trouble converging if the initial guess is too far from the solution. This work focuses on accelerating this convergence, in the context of the discretization of nonlinear elliptic PDEs. We first provide a quick review of existing methods, and justify our choice of learning an initial guess with a Fourier neural operator (FNO). This choice was motivated by the mesh-independence of such operators, whose training and evaluation can be performed on grids with different resolutions. The FNO is trained by minimizing, over generated data, loss functions based on the PDE discretization. Numerical results, in one and two dimensions, show that the proposed initial guess accelerates the convergence of Newton's method by a large margin compared to a naive initial guess, especially for highly nonlinear or anisotropic problems.
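To make the role of the initial guess concrete, here is a minimal Newton solver for a discretized 1D nonlinear elliptic model problem (the Bratu equation, chosen by us for illustration, not taken from the paper); a trained FNO would simply replace the naive zero vector supplied as `u0`.

```python
import numpy as np

def newton_bratu(u0, lam=3.0, tol=1e-10, maxit=50):
    """Newton's method for the 1D Bratu problem -u'' = lam * exp(u) on (0,1),
    u(0) = u(1) = 0, discretized with centered finite differences.
    `u0` holds the initial guess on the interior nodes."""
    n = len(u0)
    h = 1.0 / (n + 1)
    # discrete second-derivative operator
    A = (np.diag(np.full(n, -2.0))
         + np.diag(np.ones(n - 1), 1)
         + np.diag(np.ones(n - 1), -1)) / h**2
    u = u0.copy()
    for it in range(1, maxit + 1):
        F = A @ u + lam * np.exp(u)        # nonlinear residual
        J = A + lam * np.diag(np.exp(u))   # Jacobian of F
        du = np.linalg.solve(J, -F)
        u += du
        if np.linalg.norm(du) < tol:
            return u, it
    return u, maxit

# naive guess; a trained operator would instead supply u0 (hypothetical fno(...))
u, iters = newton_bratu(np.zeros(64))
```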
One way to make decisions under uncertainty is to select an optimal option from a range of possible options by maximizing the expected utility derived from a probability model. However, under severe uncertainty, identifying precise probabilities is hard. For this reason, imprecise probability theory models uncertainty through convex sets of probabilities and considers decision rules that can return multiple options when the available information is insufficient. Many well-founded decision rules have been studied in the past, but none of these standard rules can control the number of returned alternatives. This can be a problem for large decision problems, owing to the cognitive burden decision makers face when presented with a large number of alternatives. Our contribution proposes regret-based ideas to construct new decision rules that return a bounded number of options, where the limit on the number of options is set in advance by the decision maker as an expression of their cognitive limitations. We also study the consistency and numerical behaviour of these rules.
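One plausible instantiation of such a rule with a user-chosen bound k (our own illustrative sketch; the paper's actual rules may differ): rank options by worst-case regret over the extreme points of the credal set and keep the k best.

```python
import numpy as np

def k_minimax_regret(U, P, k):
    """Return at most k options with smallest maximum regret.
    U: (n_options, n_states) utility matrix.
    P: (n_extreme_points, n_states) rows are probability vectors spanning
       the credal set (regret of a linear utility is attained at extreme points).
    A hypothetical regret-based rule, not necessarily the paper's."""
    EU = P @ U.T                                  # expected utility of each option under each p
    regret = EU.max(axis=1, keepdims=True) - EU   # regret of each option under each p
    max_regret = regret.max(axis=0)               # worst-case regret per option
    return np.argsort(max_regret)[:k]
```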
This work concerns the implementation of the hybridizable discontinuous Galerkin (HDG) method for solving the linear anisotropic elastic equation in the frequency domain. A first-order formulation with the compliance tensor, together with Voigt notation, is employed to provide a compact description of the discretized problem and flexibility with highly heterogeneous media. We further focus on the optimal choice of stabilization in the definition of the HDG numerical traces. For this purpose, we construct a hybridized Godunov-upwind flux for anisotropic elasticity, which possesses three distinct wavespeeds. This stabilization removes the need to choose scaling factors, contrary to the identity- and Kelvin-Christoffel-based stabilizations that are popular choices in the literature. We carry out comparisons among these families for isotropic and anisotropic materials, with constant as well as highly heterogeneous backgrounds, in two and three dimensions. These comparisons establish the optimality of the Godunov stabilization, which can serve as a reference choice for generic materials and different types of waves.
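Schematically, and in our notation rather than the paper's, the HDG traces couple through a stabilization matrix; the identity and Kelvin-Christoffel choices leave a scaling factor free, while a Godunov/upwind construction fixes the matrix from physical wavespeeds. The sketch below shows only the familiar isotropic specialization; the anisotropic matrix would be built from the three Kelvin-Christoffel wavespeeds.

```latex
% Our notation, not the paper's: HDG trace of the traction with
% stabilization matrix tau; isotropic Godunov specialization shown.
\[
  \widehat{\boldsymbol{\sigma}}_h \mathbf{n}
    \;=\; \boldsymbol{\sigma}_h \mathbf{n}
      \;-\; \boldsymbol{\tau}\,(\mathbf{u}_h - \widehat{\mathbf{u}}_h),
  \qquad
  \boldsymbol{\tau}^{\mathrm{iso}}_{\mathrm{Godunov}}
    \;=\; \rho c_p\, \mathbf{n}\otimes\mathbf{n}
      \;+\; \rho c_s \bigl(\mathbf{I} - \mathbf{n}\otimes\mathbf{n}\bigr),
\]
% in which no free scaling factor remains to be tuned.
```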
It has been shown that deep neural networks of sufficiently large width are universal approximators, but they are not if the width is too small. There have been several attempts to characterize the minimum width $w_{\min}$ enabling the universal approximation property; however, only a few of them have found the exact values. In this work, we show that the minimum width for $L^p$ approximation of $L^p$ functions from $[0,1]^{d_x}$ to $\mathbb R^{d_y}$ is exactly $\max\{d_x,d_y,2\}$ if the activation function is ReLU-like (e.g., ReLU, GELU, Softplus). Compared to the known result for ReLU networks, $w_{\min}=\max\{d_x+1,d_y\}$ when the domain is $\smash{\mathbb R^{d_x}}$, our result is the first to show that approximation on a compact domain requires a smaller width than on $\smash{\mathbb R^{d_x}}$. We next prove a lower bound on $w_{\min}$ for uniform approximation using general activation functions, including ReLU: $w_{\min}\ge d_y+1$ if $d_x<d_y\le2d_x$. Together with our first result, this shows a dichotomy between $L^p$ and uniform approximations for general activation functions and input/output dimensions.
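The two regimes quoted above can be set side by side, which makes the gap explicit.

```latex
% The stated formulas: for d_x = d_y = d >= 2 the compact domain needs
% width d, one neuron per layer fewer than the width d + 1 on all of R^{d_x}.
\[
  w_{\min}\bigl([0,1]^{d_x}\bigr) = \max\{d_x,\, d_y,\, 2\},
  \qquad
  w_{\min}\bigl(\mathbb{R}^{d_x}\bigr) = \max\{d_x + 1,\, d_y\}.
\]
```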
We prove that QMA in which the verifier may also make a single non-collapsing measurement is equal to NEXP, resolving an open question of Aaronson. We obtain this as a corollary of a modified proof of QMA+ = NEXP [arXiv:2306.13247]. At the core of many results inspired by Blier and Tapp [arXiv:0709.0738] lies an unphysical property-testing problem: deciding whether a quantum state is close to an element of a fixed basis.
Running quantum algorithms protected by quantum error correction requires a real-time classical decoder. To prevent the accumulation of a backlog, this decoder must process syndromes from the quantum device at a faster rate than they are generated. Most prior work on real-time decoding has focused on an isolated logical qubit encoded in the surface code. However, for the surface code, practically useful quantum programs will require multi-qubit interactions performed via lattice surgery. A large merged patch can arise during lattice surgery -- possibly as large as the entire device. This puts a significant strain on a real-time decoder, which must decode errors on this merged patch while maintaining the level of fault tolerance it achieves on isolated logical qubits. These requirements are relaxed by using spatially parallel decoding, which can be accomplished by dividing the physical qubits on the device into multiple overlapping groups and assigning a decoder module to each. We refer to this approach as spatially parallel windows. While previous work has explored similar ideas, none has addressed system-specific considerations pertinent to the task or the constraints from using hardware accelerators. In this work, we demonstrate how to configure spatially parallel windows so that the scheme (1) is compatible with hardware accelerators, (2) supports general lattice surgery operations, (3) maintains the fidelity of the logical qubits, and (4) meets the throughput requirement for real-time decoding. Furthermore, our results reveal the importance of optimally choosing the buffer width to achieve a balance between accuracy and throughput -- a decision that should be informed by the device's physical noise.
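A minimal sketch of the windowing idea (our own illustrative code, not the paper's configuration): split the syndrome lattice into core regions, each padded by a buffer of adjustable width, so that one decoder module can be assigned per overlapping window.

```python
def parallel_windows(n_cols, core, buffer):
    """Split syndrome-lattice columns into overlapping decoding windows.
    Each window owns `core` columns and is padded with `buffer` columns
    on each side, clipped at the device boundary. Illustrative only."""
    windows = []
    for start in range(0, n_cols, core):
        lo = max(0, start - buffer)
        hi = min(n_cols, start + core + buffer)
        windows.append((lo, hi))
    return windows

# e.g. 12 columns, 4-column cores, buffer width 2:
# parallel_windows(12, 4, 2) -> [(0, 6), (2, 10), (6, 12)]
```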
Recently, efficiently deploying deep learning solutions on the edge has received increasing attention. New platforms are emerging to support the growing demand for flexibility and high performance. In this work, we explore the efficient mapping of convolutional layers onto an open-hardware, low-power Coarse-Grain Reconfigurable Array (CGRA), namely OpenEdgeCGRA. We explore both direct implementations of convolution and solutions that transform it into a matrix multiplication through an Im2col transformation, and we experiment with various tensor-parallelism axes. We show that for this hardware target, direct convolution, coupled with weight parallelism, reaches the best latency and energy efficiency, outperforming a CPU implementation by 3.4x and 9.9x in terms of energy and latency, respectively.
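For reference, the Im2col transformation benchmarked above can be written in a few lines; this NumPy sketch shows the reshaping that turns a convolution into one matrix multiplication, not the CGRA mapping itself.

```python
import numpy as np

def im2col_conv(x, w):
    """Convolution via Im2col: unfold input patches into columns so the
    layer becomes a single matrix multiplication.
    x: (C_in, H, W) input; w: (C_out, C_in, K, K) weights; stride 1, no padding."""
    C_in, H, W = x.shape
    C_out, _, K, _ = w.shape
    Ho, Wo = H - K + 1, W - K + 1
    cols = np.empty((C_in * K * K, Ho * Wo))
    for i in range(Ho):
        for j in range(Wo):
            cols[:, i * Wo + j] = x[:, i:i+K, j:j+K].ravel()
    # (C_out, C_in*K*K) @ (C_in*K*K, Ho*Wo) -> output feature maps
    return (w.reshape(C_out, -1) @ cols).reshape(C_out, Ho, Wo)
```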
This paper explores the connections between tempering (for Sequential Monte Carlo; SMC) and entropic mirror descent to sample from a target probability distribution whose unnormalized density is known. We establish that tempering SMC corresponds to entropic mirror descent applied to the reverse Kullback-Leibler (KL) divergence and obtain convergence rates for the tempering iterates. Our result motivates the tempering iterates from an optimization point of view, showing that tempering can be seen as a descent scheme for the KL divergence with respect to the Fisher-Rao geometry, in contrast to Langevin dynamics, which performs descent on the KL with respect to the Wasserstein-2 geometry. We exploit the connection between tempering and mirror descent iterates to justify common practices in SMC and derive adaptive tempering rules that improve over alternative benchmarks in the literature.
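In formulas, and in our notation rather than the paper's, the correspondence can be sketched as follows: one entropic mirror descent step on the reverse KL produces a geometric mixture, which is exactly a tempering update.

```latex
% One entropic mirror descent step of size eta_t on KL(. || pi):
\[
  \mu_{t+1} \;\propto\; \mu_t^{\,1-\eta_t}\, \pi^{\,\eta_t},
\]
% which recovers the SMC tempering path mu_t \propto pi_0^{1-\lambda_t} pi^{\lambda_t}
% with temperatures updated as \lambda_{t+1} = \lambda_t + \eta_t (1 - \lambda_t).
```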
We present an asymptotic expansion formula of an estimator for the drift coefficient of the fractional Ornstein-Uhlenbeck process. As the main tool, we apply the general expansion scheme for Wiener functionals recently developed by the authors [26]. The central limit theorem in the principal part of the expansion has the classical scaling $T^{1/2}$. However, the asymptotic expansion formula is complex in that the order of the correction term becomes the classical $T^{-1/2}$ for $H \in (1/2, 5/8)$, but $T^{4H-3}$ for $H \in [5/8, 3/4)$.
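Schematically, and in notation we introduce only for illustration, the two regimes can be displayed as a single expansion with a regime-dependent correction order.

```latex
% Hypothetical notation: Z_T = T^{1/2}(\hat\theta_T - \theta).
\[
  \mathbb{P}\bigl[Z_T \le z\bigr]
    \;=\; \Phi(z/\sigma) \;+\; r_T\, p(z) \;+\; o(r_T),
  \qquad
  r_T \;=\;
  \begin{cases}
    T^{-1/2},   & H \in (1/2,\, 5/8),\\
    T^{\,4H-3}, & H \in [5/8,\, 3/4),
  \end{cases}
\]
% where p collects the correction terms of the expansion.
```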
In this paper, we present a new high-order discontinuous Galerkin (DG) method, in which neither a penalty parameter nor a stabilization parameter is needed. We refer to this method as penalty-free DG (PF-DG). In this method, the trial and test functions belong to the broken Sobolev space, in which the functions are in general discontinuous on the mesh skeleton and do not meet the Dirichlet boundary conditions. However, a subset can be distinguished in this space, where the functions are continuous and satisfy the Dirichlet boundary conditions; this subset is called admissible. The trial solution is chosen to lie in an \emph{augmented} admissible subset, in which a small violation of the continuity condition is permitted. This subset is constructed by applying special augmented constraints to the linear combination of finite element basis functions. In this approach, all the advantages of the DG method are retained without the need for stabilization parameters or numerical fluxes. Several benchmark problems in two dimensions (Poisson equation, linear elasticity, hyperelasticity, and biharmonic equation) on polygonal (triangles, quadrilaterals and weakly convex polygons) meshes, as well as a three-dimensional Poisson problem on hexahedral meshes, are considered. Numerical results are presented that affirm the accuracy and optimal convergence of the method in the $L^2$ norm and the energy seminorm.