We prove residual-type a posteriori error estimates in the maximum norm for a linear scalar elliptic convection-diffusion problem that may be singularly perturbed. A similar error analysis in the energy norm by Verf\"{u}rth indicates that a dual norm of the convective derivative of the error must be added to the natural energy norm in order for the natural residual estimator to be reliable and efficient. We show that the situation is similar for the maximum norm. In particular, we define a mesh-dependent weighted seminorm of the convective error which serves as a maximum-norm counterpart to the dual norm used in the energy norm setting. The total error is then defined as the sum of this seminorm, the maximum norm of the error, and data oscillation. The natural maximum-norm residual error estimator is shown to be equivalent to this total error notion, with constants independent of the singular perturbation parameters. These estimates are proved under the assumption that certain natural estimates hold for the Green's function of the problem at hand. Numerical experiments confirm that our estimators effectively capture the maximum-norm error behavior for singularly perturbed problems and can drive adaptive refinement to capture layer phenomena.
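As an illustrative sketch in our own notation (not necessarily the authors' exact definitions), writing $u$ for the exact solution, $u_h$ for the finite element solution, $\eta$ for the residual estimator, $|\cdot|_{b,\omega}$ for the mesh-dependent weighted seminorm of the convective error, and $\mathrm{osc}(f)$ for data oscillation, the equivalence result has the flavor
\[
\|u-u_h\|_{L^\infty(\Omega)} \;+\; |u-u_h|_{b,\omega} \;+\; \mathrm{osc}(f) \;\simeq\; \eta,
\]
with hidden constants independent of the singular perturbation parameters.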
We propose and analyze an augmented mixed finite element method for the pseudostress-velocity formulation of the stationary convective Brinkman-Forchheimer problem in $\mathrm{R}^d$, $d\in \{2,3\}$. Since the convective and Forchheimer terms force the velocity to live in a smaller space than usual, we augment the variational formulation with suitable Galerkin-type terms. The resulting augmented scheme is written equivalently as a fixed point equation, so that the well-known Schauder and Banach theorems, combined with the Lax-Milgram theorem, allow us to prove the unique solvability of the continuous problem. The finite element discretization involves Raviart-Thomas spaces of order $k\geq 0$ for the pseudostress tensor and continuous piecewise polynomials of degree $\le k + 1$ for the velocity. Stability, convergence, and a priori error estimates for the associated Galerkin scheme are obtained. In addition, we derive two reliable and efficient residual-based a posteriori error estimators for this problem on arbitrary polygonal and polyhedral regions. The reliability of the proposed estimators draws mainly upon the uniform ellipticity of the form involved, a suitable assumption on the data, a stable Helmholtz decomposition, and the local approximation properties of the Cl\'ement and Raviart-Thomas operators. In turn, inverse inequalities, the localization technique based on bubble functions, and known results from previous works are the main tools yielding the efficiency estimate. Finally, some numerical examples illustrating the performance of the mixed finite element method, confirming the theoretical rate of convergence and the properties of the estimators, and showing the behaviour of the associated adaptive algorithms, are reported. In particular, the case of flow through a $2$D porous medium with fracture networks is considered.
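For orientation (a sketch in our notation, not necessarily the authors' exact scaling), pseudostress-velocity formulations of this type typically introduce the nonlinear pseudostress
\[
\boldsymbol{\sigma} \;=\; \nu\,\nabla\mathbf{u} \;-\; (\mathbf{u}\otimes\mathbf{u}) \;-\; p\,\mathbb{I},
\]
so that, using $\mathrm{div}\,\mathbf{u}=0$, the momentum balance takes a form such as $-\mathbf{div}\,\boldsymbol{\sigma} + \alpha\,\mathbf{u} + \mathsf{F}\,|\mathbf{u}|^{p-2}\mathbf{u} = \mathbf{f}$ with a Darcy-type coefficient $\alpha$ and a Forchheimer-type term, while the pressure is eliminated and recovered a posteriori from the trace of $\boldsymbol{\sigma}$.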
Controlling the parameters' norm often yields good generalisation when training neural networks. Beyond simple intuitions, however, the relation between the parameters' norm and the resulting estimators remains theoretically poorly understood. For networks with one hidden ReLU layer and unidimensional data, this work shows that the minimal parameters' norm required to represent a function is given by the total variation of its second derivative, weighted by a $\sqrt{1+x^2}$ factor. By comparison, this $\sqrt{1+x^2}$ weighting disappears when the norm of the bias terms is ignored. This additional weighting is of crucial importance, since it is shown in this work to enforce uniqueness and sparsity (in number of kinks) of the minimal norm interpolator. On the other hand, omitting the norm of the biases allows for non-sparse solutions. Penalising the bias terms in the regularisation, either explicitly or implicitly, thus leads to sparse estimators. This sparsity may contribute to the good generalisation of neural networks that is observed empirically.
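In formulas (a rough sketch under the paper's univariate setting, with $\theta$ the hidden-layer weights and biases and $f_\theta$ the realized network), the result says that
\[
\min\bigl\{\, \|\theta\|^2 \;:\; f_\theta = f \,\bigr\} \;=\; \int_{\mathbb{R}} \sqrt{1+x^2}\; d\,|f''|(x),
\]
up to normalization and terms accounting for the affine part of $f$, whereas ignoring the bias norms yields the unweighted total variation $\int d\,|f''|(x)$ instead.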
In this article, we present an overview of different a posteriori error analysis and post-processing methods proposed in the context of nonlinear eigenvalue problems, e.g. those arising in electronic structure calculations of the ground state, and compare them. We provide two equivalent error reconstructions, based either on a second-order Taylor expansion of the minimized energy or on a first-order expansion of the nonlinear eigenvalue equation. We then show how several a posteriori error estimates as well as post-processing methods can be formulated as specific applications of the derived reconstructed errors, and we compare their range of applicability as well as their numerical cost and precision.
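As a hedged illustration in our notation (the paper's precise statements may differ), with $(\lambda_*, u_*)$ the exact ground state of a nonlinear eigenvalue equation $A(u)\,u = \lambda\, u$ and $(\lambda_h, u_h)$ a computed approximation, a first-order error reconstruction inverts the linearized operator against the residual,
\[
u_* - u_h \;\approx\; -\,J_{u_h}^{-1}\bigl( A(u_h)\,u_h - \lambda_h\, u_h \bigr),
\]
where $J_{u_h}$ denotes the Jacobian of the eigenvalue map at $u_h$; a second-order Taylor expansion of the minimized energy leads to an equivalent reconstruction.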
The Stochastic Shortest Path (SSP) problem models probabilistic sequential-decision problems in which an agent must reach a goal while minimizing a cost function. Because of the probabilistic dynamics, it is desirable to use a cost criterion that accounts for risk. Conditional Value at Risk (CVaR) is a criterion that allows modeling an arbitrary level of risk by considering the expectation of the worst fraction $\alpha$ of trajectories. Although an optimal policy is non-Markovian, solutions of CVaR-SSPs can be found approximately with Value Iteration-based algorithms such as CVaR Value Iteration with Linear Interpolation (CVaRVILI) and CVaR Value Iteration via Quantile Representation (CVaRVIQ). These solutions depend on the algorithm's parameters, such as the number of atoms and $\alpha_0$ (the minimum $\alpha$). To compare the policies returned by these algorithms, we need a way to exactly evaluate stationary policies of CVaR-SSPs. Although there is an algorithm that evaluates such policies, it only works on problems with uniform costs. In this paper, we propose a new algorithm, Forward-PECVaR (ForPECVaR), that exactly evaluates stationary policies of CVaR-SSPs with non-uniform costs. We empirically evaluate the approximate solutions found by CVaR Value Iteration algorithms, comparing their quality with the exact solution and examining the influence of the algorithm parameters on the quality and scalability of the solutions. Experiments in two domains show that it is important to use an $\alpha_0$ smaller than the target $\alpha$ and an adequate number of atoms to obtain a good approximation.
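For reference (a standard formulation, following the abstract's convention that $\alpha$ denotes the worst fraction of trajectories), the CVaR of an accumulated cost $Z$ can be written in Rockafellar-Uryasev form as
\[
\mathrm{CVaR}_\alpha(Z) \;=\; \min_{b\in\mathbb{R}} \Bigl\{\, b + \tfrac{1}{\alpha}\,\mathbb{E}\bigl[(Z-b)^+\bigr] \,\Bigr\},
\]
i.e. the expected cost over the worst $\alpha$-fraction of outcomes, which recovers the expected cost as $\alpha \to 1$ and the worst-case cost as $\alpha \to 0$.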
The conditional survival function of a time-to-event outcome subject to censoring and truncation is a common target of estimation in survival analysis. This parameter may be of scientific interest and also often appears as a nuisance in nonparametric and semiparametric problems. In addition to classical parametric and semiparametric methods (e.g., based on the Cox proportional hazards model), flexible machine learning approaches have been developed to estimate the conditional survival function. However, many of these methods are either implicitly or explicitly targeted toward risk stratification rather than overall survival function estimation. Others apply only to discrete-time settings or require inverse probability of censoring weights, which can be as difficult to estimate as the outcome survival function itself. Here, we employ a decomposition of the conditional survival function in terms of observable regression models in which censoring and truncation play no role. This allows application of an array of flexible regression and classification methods rather than only approaches that explicitly handle the complexities inherent to survival data. We outline estimation procedures based on this decomposition, empirically assess their performance, and demonstrate their use on data from an HIV vaccine trial.
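One standard decomposition of this type, shown here only as an illustration (the authors' exact identification may differ, and truncation requires analogous handling), works in discrete time with event time $T$, observed time $Y$, event indicator $\Delta$, and covariates $X$:
\[
S(t \mid x) \;=\; \prod_{k:\, t_k \le t} \bigl( 1 - \lambda(t_k \mid x) \bigr), \qquad \lambda(t_k \mid x) \;=\; P\bigl( Y = t_k,\ \Delta = 1 \mid Y \ge t_k,\ X = x \bigr),
\]
which is valid under conditionally independent censoring; each factor is then an ordinary regression/classification problem in which censoring plays no explicit role.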
We adopt the integral definition of the fractional Laplace operator and analyze solution techniques for fractional, semilinear, and elliptic optimal control problems posed on Lipschitz polytopes. We consider two strategies of discretization: a semidiscrete scheme where the admissible control set is not discretized and a fully discrete scheme where such a set is discretized with piecewise constant functions. As an instrumental step, we derive error estimates for finite element discretizations of fractional semilinear elliptic partial differential equations (PDEs) on quasi-uniform and graded meshes. With these estimates at hand, we derive error bounds for the semidiscrete scheme and improve the ones that are available in the literature for the fully discrete scheme.
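For concreteness, the integral definition of the fractional Laplacian referred to here is, up to a normalizing constant $C(d,s)$, the standard pointwise one,
\[
(-\Delta)^s u(x) \;=\; C(d,s)\;\mathrm{p.v.}\!\int_{\mathbb{R}^d} \frac{u(x) - u(y)}{|x-y|^{d+2s}}\, dy, \qquad s \in (0,1),
\]
with the precise functional-analytic setting as specified in the paper.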
In this article, we construct semiparametrically efficient estimators of linear functionals of a probability measure in the presence of side information using an easy empirical likelihood approach. We use estimated constraint functions and allow the number of constraints to grow with the sample size. We consider three cases of side information that can be characterized by infinitely many constraints: (1) the marginal distributions are known, (2) the marginals are unknown but identical, and (3) distributional symmetry. An improved spatial depth function is defined and its asymptotic properties are studied. Simulation results on the efficiency gain are reported.
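Schematically (our notation; the paper's exact formulation may differ), with estimated constraint functions $\hat g_1,\dots,\hat g_{m_n}$ encoding the side information and $m_n$ allowed to grow with $n$, the empirical likelihood weights solve
\[
\max_{p_1,\dots,p_n} \prod_{i=1}^n n\,p_i \quad \text{subject to} \quad p_i \ge 0,\ \ \sum_{i=1}^n p_i = 1,\ \ \sum_{i=1}^n p_i\,\hat g_j(X_i) = 0,\ \ j = 1,\dots,m_n,
\]
and the linear functional of interest is then estimated under the correspondingly reweighted empirical measure.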
In statistics, log-concave density estimation is a central problem within the field of nonparametric inference under shape constraints. Despite great progress in recent years on the statistical theory of the canonical estimator, namely the log-concave maximum likelihood estimator, adoption of this method has been hampered by the complexities of the non-smooth convex optimization problem that underpins its computation. We provide enhanced understanding of the structural properties of this optimization problem, which motivates the proposal of new algorithms, based on both randomized smoothing and Nesterov smoothing, combined with an appropriate integral discretization of increasing accuracy. We prove that these methods enjoy, both with high probability and in expectation, a convergence rate of order $1/T$ up to logarithmic factors on the objective function scale, where $T$ denotes the number of iterations. The benefits of our new computational framework are demonstrated on both synthetic and real data, and our implementation is available in the GitHub repository \texttt{LogConcComp} (Log-Concave Computation).
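For context, the underlying optimization problem is the usual convex reformulation of log-concave maximum likelihood: writing $f = e^{\varphi}$ with $\varphi$ concave, the estimator based on data $X_1,\dots,X_n$ minimizes
\[
\varphi \;\mapsto\; -\frac{1}{n}\sum_{i=1}^n \varphi(X_i) \;+\; \int e^{\varphi(x)}\, dx
\]
over (suitably parametrized) concave functions $\varphi$; this objective is convex but non-smooth, which is where the randomized and Nesterov smoothing techniques enter.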
We adopt an information-theoretic framework to analyze the generalization behavior of the class of iterative, noisy learning algorithms. This class is particularly suitable for study under information-theoretic metrics as the algorithms are inherently randomized, and it includes commonly used algorithms such as Stochastic Gradient Langevin Dynamics (SGLD). Herein, we use the maximal leakage (equivalently, the Sibson mutual information of order infinity) metric, as it is simple to analyze, and it implies bounds both on the probability of a large generalization error and on its expected value. We show that, if the update function (e.g., gradient) is bounded in $L_2$-norm, then adding isotropic Gaussian noise leads to optimal generalization bounds: indeed, the input and output of the learning algorithm in this case are asymptotically statistically independent. Furthermore, we demonstrate how the assumptions on the update function affect the optimal (in the sense of minimizing the induced maximal leakage) choice of the noise. Finally, we compute explicit tight upper bounds on the induced maximal leakage for several scenarios of interest.
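For reference, for discrete $X$ and $Y$ the maximal leakage from $X$ to $Y$ is
\[
\mathcal{L}(X \to Y) \;=\; \log \sum_{y} \max_{x:\, P_X(x) > 0} P_{Y\mid X}(y \mid x),
\]
which coincides with the Sibson mutual information of order infinity; in the learning setting considered here, $X$ is the training sample and $Y$ is the output of the noisy, iterative algorithm.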
In minimum-cost inverse optimization problems, we are given a feasible solution to an underlying optimization problem together with a linear cost function, and the goal is to modify the costs by a small deviation vector so that the input solution becomes optimal. The difference between the new and the original cost functions can be measured in several ways. In this paper, we focus on two objectives: the weighted bottleneck Hamming distance and the weighted $\ell_\infty$-norm. We consider a general model in which the coordinates of the deviation vector are required to fall within given lower and upper bounds. For the weighted bottleneck Hamming distance objective, we present a simple, purely combinatorial algorithm that determines an optimal deviation vector in strongly polynomial time. For the weighted $\ell_\infty$-norm objective, we give a min-max characterization of the optimal solution and provide a pseudo-polynomial algorithm for finding an optimal deviation vector, which runs in strongly polynomial time in the case of unit weights. For both objectives, we assume that an algorithm with the same time complexity for solving the underlying combinatorial optimization problem is available, and we also show how to extend the results to inverse optimization problems with multiple cost functions.
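To fix notation (a sketch; the paper's weighting conventions may differ), if $c$ is the original cost vector, $c'$ the modified one, and $w > 0$ a weight vector, the two objectives measure the deviation $c' - c$ as
\[
\max_{i:\, c'_i \ne c_i} w_i \quad \text{(weighted bottleneck Hamming distance)}, \qquad \max_i\, w_i\,|c'_i - c_i| \quad \text{(weighted } \ell_\infty\text{-norm)},
\]
subject in both cases to the coordinatewise lower and upper bounds on the deviation vector.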