We study high-dimensional, ridge-regularized logistic regression in a setting in which the covariates may be missing or corrupted by additive noise. When both the covariates and the additive corruptions are independent and normally distributed, we provide exact characterizations of both the prediction error and the estimation error. Moreover, we show that these characterizations are universal: as long as the entries of the data matrix satisfy a set of independence and moment conditions, our guarantees continue to hold. Universality, in turn, enables the detailed study of several imputation-based strategies when the covariates are missing completely at random. We ground our study by comparing the performance of these strategies with the conjectured performance (stemming from replica theory in statistical physics) of the Bayes optimal procedure. Our analysis yields several insights, including: (i) a distinction between single imputation and a simple variant of multiple imputation and (ii) the observation that adding a simple ridge regularization term to single-imputed logistic regression can yield an estimator whose prediction error is nearly indistinguishable from the Bayes optimal prediction error. We supplement our findings with extensive numerical experiments.
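To make the single versus multiple imputation comparison concrete, here is a small, self-contained Python sketch; it is our illustration, not code from the paper. It generates covariates missing completely at random, fits ridge-regularized logistic regression after mean (single) imputation and after a simple multiple-imputation variant that averages estimates across resampled completions, and compares both to the true coefficient vector. All constants (n, d, missingness rate, ridge strength lam) and the resampling-based imputer are hypothetical choices.

```python
# Illustrative sketch (not the paper's code): ridge-regularized logistic
# regression under covariates missing completely at random (MCAR),
# comparing single (mean) imputation with a simple multiple-imputation
# variant that averages coefficient estimates over M imputed datasets.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
n, d, missing_prob, lam = 2000, 50, 0.3, 1.0

beta_star = rng.normal(size=d) / np.sqrt(d)
X = rng.normal(size=(n, d))
y = (rng.random(n) < 1.0 / (1.0 + np.exp(-X @ beta_star))).astype(int)

# MCAR mask: each entry is missing independently with probability missing_prob.
mask = rng.random((n, d)) < missing_prob
X_obs = np.where(mask, np.nan, X)

def fit_ridge_logistic(X_imp, y, lam):
    # sklearn parameterizes the l2 penalty through C = 1 / lambda.
    clf = LogisticRegression(penalty="l2", C=1.0 / lam, solver="lbfgs", max_iter=1000)
    return clf.fit(X_imp, y).coef_.ravel()

# Single imputation: replace missing entries by the column-wise observed mean.
col_means = np.nanmean(X_obs, axis=0)
X_single = np.where(np.isnan(X_obs), col_means, X_obs)
beta_single = fit_ridge_logistic(X_single, y, lam)

# Simple multiple-imputation variant: impute by resampling observed values
# in each column, fit on each completed dataset, and average the estimates.
M, betas = 10, []
for _ in range(M):
    X_m = X_obs.copy()
    for j in range(d):
        miss = np.isnan(X_m[:, j])
        X_m[miss, j] = rng.choice(X_m[~miss, j], size=miss.sum(), replace=True)
    betas.append(fit_ridge_logistic(X_m, y, lam))
beta_multi = np.mean(betas, axis=0)

for name, b in [("single", beta_single), ("multiple", beta_multi)]:
    cos = b @ beta_star / (np.linalg.norm(b) * np.linalg.norm(beta_star))
    print(f"{name} imputation: cosine similarity with beta* = {cos:.3f}")
```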
In this paper we develop a numerical analysis for the finite element discretization of semilinear elliptic equations with potentially non-Lipschitz nonlinearities. The nonlinearity is essentially assumed to be continuous and monotonically decreasing, which includes non-Lipschitz and even non-H\"older continuous nonlinearities. For a direct discretization (without any regularization) with linear finite elements we derive error estimates with respect to various norms and illustrate these results with a numerical example.
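For concreteness (this example is ours and is not necessarily the one used in the paper), a continuous, monotonically decreasing nonlinearity that fails to be Lipschitz at the origin is
\[
  f(u) = -\,\operatorname{sign}(u)\,|u|^{\alpha}, \qquad 0 < \alpha < 1,
\]
whose derivative blows up like $|u|^{\alpha-1}$ as $u \to 0$.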
We discuss nonparametric estimation of the trend coefficient in models governed by a stochastic differential equation driven by a multiplicative stochastic volatility.
If the conclusion of a data analysis is sensitive to dropping very few data points, that conclusion might hinge on the particular data at hand rather than representing a more broadly applicable truth. How could we check whether this sensitivity holds? One idea is to consider every small subset of data, drop it from the dataset, and re-run our analysis. But running MCMC to approximate a Bayesian posterior is already very expensive; running multiple times is prohibitive, and the number of re-runs needed here is combinatorially large. Recent work proposes a fast and accurate approximation to find the worst-case dropped data subset, but that work was developed for problems based on estimating equations -- and does not directly handle Bayesian posterior approximations using MCMC. We make two principal contributions in the present work. We adapt the existing data-dropping approximation to estimators computed via MCMC. Observing that Monte Carlo errors induce variability in the approximation, we use a variant of the bootstrap to quantify this uncertainty. We demonstrate how to use our approximation in practice to determine whether there is non-robustness in a problem. Empirically, our method is accurate in simple models, such as linear regression. In models with complicated structure, such as hierarchical models, the performance of our method is mixed.
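One way to make the MCMC adaptation concrete is the following schematic Python sketch; it is our illustration under stated assumptions, not necessarily the authors' construction. A first-order reweighting argument gives $\partial\,\mathbb{E}[\theta]/\partial w_i = \mathrm{Cov}_{\mathrm{post}}(\theta, \ell_i)$, where $\ell_i$ is observation $i$'s log-likelihood contribution, so dropping a small subset $S$ shifts the posterior mean by roughly $-\sum_{i \in S} \mathrm{Cov}(\theta, \ell_i)$, and these covariances can be estimated from existing MCMC draws. The function names and the greedy subset choice below are hypothetical.

```python
# Schematic sketch (hypothetical, not the paper's method): a first-order
# approximation to the effect of dropping observations on a posterior mean,
# computed from existing MCMC draws.
import numpy as np

def dropping_influences(theta_draws, loglik_draws):
    """theta_draws: (S,) MCMC draws of the quantity of interest.
    loglik_draws: (S, n) per-draw, per-observation log-likelihoods."""
    theta_c = theta_draws - theta_draws.mean()
    ell_c = loglik_draws - loglik_draws.mean(axis=0)
    # Posterior covariance between theta and each observation's log-likelihood.
    return theta_c @ ell_c / len(theta_draws)   # shape (n,)

def worst_case_drop(theta_draws, loglik_draws, max_drop):
    """Greedy worst-case subset: drop the observations whose removal pushes
    the posterior mean of theta most strongly toward a sign change."""
    infl = dropping_influences(theta_draws, loglik_draws)
    sign = np.sign(theta_draws.mean())
    # Removing observation i shifts the mean by about -infl[i];
    # pick the max_drop most harmful observations.
    order = np.argsort(-sign * infl)[:max_drop]
    approx_new_mean = theta_draws.mean() - infl[order].sum()
    return order, approx_new_mean
```

Monte Carlo error in these estimated covariances is precisely the kind of variability that a bootstrap over the MCMC draws, as described above, could be used to quantify.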
Models implicitly defined through a random simulator of a process have become widely used in scientific and industrial applications in recent years. However, simulation-based inference methods for such implicit models, like approximate Bayesian computation (ABC), often scale poorly as data size increases. We develop a scalable inference method for implicitly defined models using a metamodel for the Monte Carlo log-likelihood estimator derived from simulations. This metamodel characterizes both statistical and simulation-based randomness in the distribution of the log-likelihood estimator across different parameter values. Our metamodel-based method quantifies uncertainty in parameter estimation in a principled manner, leveraging the local asymptotic normality of the mean function of the log-likelihood estimator. We apply this method to construct accurate confidence intervals for parameters of partially observed Markov process models where the Monte Carlo log-likelihood estimator is obtained using the bootstrap particle filter. We numerically demonstrate that our method enables accurate and highly scalable parameter inference across several examples, including a mechanistic compartment model for infectious diseases.
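The following schematic sketch shows the basic idea for a scalar parameter; the grid, the quadratic form of the metamodel fit, and the Wald-type interval are our assumptions rather than the paper's exact procedure. It fits a concave quadratic to noisy Monte Carlo log-likelihood estimates (e.g., from a bootstrap particle filter), takes its maximizer as the point estimate, and uses its curvature as an observed-information surrogate.

```python
# Illustrative sketch (assumptions, not the paper's algorithm): fit a quadratic
# metamodel  ll(theta) ~ a + b*theta + c*theta^2 + noise  to Monte Carlo
# log-likelihood estimates evaluated on a grid of parameter values, then read
# off a point estimate and a Wald-type confidence interval from the curvature.
import numpy as np

def quadratic_metamodel_ci(theta_grid, loglik_hat, level_z=1.96):
    # Least-squares fit of a quadratic in theta to the noisy log-likelihoods.
    c, b, _ = np.polyfit(theta_grid, loglik_hat, deg=2)  # highest degree first
    if c >= 0:
        raise ValueError("fitted metamodel is not concave; widen the grid")
    theta_hat = -b / (2.0 * c)
    # Under a locally quadratic (LAN-type) log-likelihood, -2c plays the role
    # of the observed information, giving an approximate standard error.
    se = np.sqrt(1.0 / (-2.0 * c))
    return theta_hat, (theta_hat - level_z * se, theta_hat + level_z * se)
```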
Mixed methods for linear elasticity with strongly symmetric stresses of lowest order are studied in this paper. On each simplex, the stress space has piecewise linear components with respect to its Alfeld split (which connects the vertices to the barycenter), generalizing the Johnson-Mercier two-dimensional element to higher dimensions. Further reductions in the stress space in the three-dimensional case (to 24 degrees of freedom per tetrahedron) are possible when the displacement space is reduced to local rigid displacements. Proofs of optimal error estimates for the numerical solutions and of improved error estimates via postprocessing and the duality argument are presented.
Statistical learning under distribution shift is challenging when neither prior knowledge nor fully accessible data from the target distribution is available. Distributionally robust learning (DRL) aims to control the worst-case statistical performance within an uncertainty set of candidate distributions, but how to properly specify the set remains challenging. To enable distributional robustness without being overly conservative, in this paper, we propose a shape-constrained approach to DRL, which incorporates prior information about the way in which the unknown target distribution differs from its estimate. More specifically, we assume the unknown density ratio between the target distribution and its estimate is isotonic with respect to some partial order. At the population level, we provide a solution to the shape-constrained optimization problem that does not involve the isotonic constraint. At the sample level, we provide consistency results for an empirical estimator of the target in a range of different settings. Empirical studies on both synthetic and real data examples demonstrate the improved accuracy of the proposed shape-constrained approach.
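As a schematic illustration of the shape constraint (not the paper's estimator), the snippet below projects candidate density-ratio weights onto weights that are non-decreasing in a scalar score via isotonic regression, renormalizes them to average one, and evaluates the reweighted empirical loss; the scalar score and the project-then-normalize heuristic are our assumptions.

```python
# Schematic illustration (not the paper's estimator): enforce the assumption
# that the density ratio between target and source is isotonic in a scalar
# score s by projecting candidate weights onto monotone ones and evaluating
# the reweighted empirical loss.
import numpy as np
from sklearn.isotonic import IsotonicRegression

def shape_constrained_risk(losses, scores, raw_weights):
    order = np.argsort(scores)
    # Project the (possibly non-monotone) candidate weights onto weights
    # that are non-decreasing in the score.
    w_mono = IsotonicRegression(increasing=True).fit_transform(
        scores[order], raw_weights[order]
    )
    # Clamp and normalize so the weights average to one, as a density ratio should.
    w_mono = np.maximum(w_mono, 0.0)
    w_mono /= w_mono.mean()
    return np.mean(w_mono * losses[order])
```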
Many natural computational problems in computer science, mathematics, physics, and other sciences amount to deciding if two objects are equivalent. Often this equivalence is defined in terms of group actions. A natural question is to ask when two objects can be distinguished by polynomial functions that are invariant under the group action. For finite groups, this is the usual notion of equivalence, but for continuous groups like the general linear groups it gives rise to a new notion, called orbit closure intersection. It captures, among others, the graph isomorphism problem, noncommutative PIT, null cone problems in invariant theory, equivalence problems for tensor networks, and the classification of multiparty quantum states. Despite recent algorithmic progress in celebrated special cases, the computational complexity of general orbit closure intersection problems is currently quite unclear. In particular, tensors seem to give rise to the most difficult problems. In this work we start a systematic study of orbit closure intersection from the complexity-theoretic viewpoint. To this end, we (i) define a complexity class TOCI that captures the power of orbit closure intersection problems for general tensor actions, (ii) give an appropriate notion of algebraic reductions, which imply polynomial-time reductions in the usual sense but are amenable to invariant-theoretic techniques, (iii) identify natural tensor problems that are complete for TOCI, including the equivalence of 2D tensor networks with constant physical dimension, and (iv) show that the graph isomorphism problem reduces to these complete problems, hence GI$\subseteq$TOCI. As such, our work establishes the first lower bound on the computational complexity of orbit closure intersection problems, and it explains the difficulty of finding unconditional polynomial-time algorithms beyond special cases, as has been observed in the literature.
In decision-making, maxitive functions are used for worst-case and best-case evaluations. Maxitivity gives rise to a rich structure that is well-studied in the context of the pointwise order. In this article, we investigate maxitivity with respect to general preorders and provide a representation theorem for such functionals. The results are illustrated for different stochastic orders in the literature, including the usual stochastic order, the increasing convex/concave order, and the dispersive order.
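For reference (a standard formulation stated by us, not quoted from the article), maxitivity with respect to the pointwise order reads
\[
  \Gamma(f \vee g) = \Gamma(f) \vee \Gamma(g) \qquad \text{for all } f, g,
\]
where $f \vee g$ denotes the pointwise maximum; the article replaces the pointwise order by a general preorder.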
In this work we consider an elliptic problem, referred to as the Ventcel problem, involving a second order term on the domain boundary (the Laplace-Beltrami operator). A variational formulation of the Ventcel problem is studied, leading to a finite element discretization. The focus is on the construction of high order curved meshes for the discretization of the physical domain and on the definition of the lift operator, which is aimed at transforming a function defined on the mesh domain into a function defined on the physical one. This lift is defined so as to satisfy adapted properties on the boundary, relative to the trace operator. The Ventcel problem approximation is investigated both in terms of geometrical error and of finite element approximation error. Error estimates are obtained in terms of both the mesh order r $\ge$ 1 and the finite element degree k $\ge$ 1, whereas such estimates have so far usually been considered in the isoparametric case, involving a single parameter k = r. The numerical experiments we performed, in both dimension 2 and 3, validate the a priori error estimates depending on the two parameters k and r. A numerical comparison is made between the errors obtained with the former lift definition and with the lift defined in this work, establishing an improvement in the convergence rate of the error in the latter case.
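A representative formulation of a Ventcel-type problem (generic data and unit coefficients chosen by us for orientation; the paper's exact setting may differ) couples a bulk equation with a Laplace-Beltrami term on the boundary:
\[
  -\Delta u + u = f \quad \text{in } \Omega,
  \qquad
  -\Delta_{\Gamma} u + \partial_n u + u = g \quad \text{on } \Gamma = \partial\Omega,
\]
where $\Delta_{\Gamma}$ denotes the Laplace-Beltrami operator on $\Gamma$ and $\partial_n$ the outward normal derivative.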
In the present work, strong approximation errors are analyzed for both the spatial semi-discretization and the spatio-temporal full discretization of stochastic wave equations (SWEs) with cubic polynomial nonlinearities and additive noise. The full discretization is achieved by the standard Galerkin finite element method in space and a novel exponential time integrator combined with the averaged vector field approach. The newly proposed scheme is proved to exactly satisfy a trace formula based on an energy functional. Recovering the convergence rates of the scheme, however, meets essential difficulties due to the lack of a global monotonicity condition. To overcome this issue, we derive an exponential integrability property of the considered numerical approximations based on the energy functional. Armed with these properties, we obtain strong convergence rates of the approximations in both the spatial and temporal directions. Finally, numerical results are presented to verify the theoretical findings.
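As a point of reference (our own generic statement; the paper's energy functional and noise setting may differ in detail), a trace formula of the kind mentioned above typically asserts linear-in-time growth of the expected energy at a rate set by the trace of the noise covariance operator $Q$:
\[
  \mathbb{E}\big[ E\big(u(t), \partial_t u(t)\big) \big]
  = \mathbb{E}\big[ E\big(u(0), \partial_t u(0)\big) \big]
  + \tfrac{t}{2}\,\operatorname{Tr}(Q), \qquad t \ge 0,
\]
where $E(u, v) = \tfrac12 \|v\|^2 + \tfrac12 \|\nabla u\|^2 + \int_D F(u)\,\mathrm{d}x$ and $F$ is the potential of the cubic nonlinearity.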