Towards identifying the number of minimal surfaces sharing the same boundary from the geometry of the boundary, we propose a fast and highly accurate numerical scheme. Our scheme is based on the method of fundamental solutions. We establish a convergence analysis for the Dirichlet energy and an $L^\infty$-error analysis for the mean curvature. Each approximate solution in our scheme is a smooth surface, a significant departure from previous studies, which relied on mesh division.
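For readers unfamiliar with the named ingredient, the Python sketch below applies the method of fundamental solutions to the 2-D Laplace Dirichlet problem on the unit disk. The source radius, boundary data, and point count are illustrative assumptions; this is only the standard building block, not the paper's minimal-surface scheme.

```python
import numpy as np

# Method of fundamental solutions (MFS) for the 2-D Laplace Dirichlet problem on the
# unit disk; an illustrative sketch of the basic ingredient, not the minimal-surface scheme.
N = 64                                                       # collocation / source points
theta = 2 * np.pi * np.arange(N) / N
bdry = np.stack([np.cos(theta), np.sin(theta)], 1)           # collocation points on the boundary
src = 1.5 * bdry                                             # source points placed outside the domain

def G(x, y):
    """Fundamental solution of the 2-D Laplacian, -(1/2pi) log|x - y|."""
    return -np.log(np.linalg.norm(x[:, None, :] - y[None, :, :], axis=2)) / (2 * np.pi)

g = lambda p: p[:, 0] ** 2 - p[:, 1] ** 2        # boundary data with a known harmonic extension
coef = np.linalg.solve(G(bdry, src), g(bdry))    # collocation: enforce u(x_i) = g(x_i)

# The approximation u(x) = sum_j coef_j G(x, s_j) is a smooth function inside the disk.
test = np.array([[0.3, 0.2], [0.0, 0.5]])
print(G(test, src) @ coef, test[:, 0] ** 2 - test[:, 1] ** 2)   # compare with the exact solution
```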
2-Opt is probably the most basic local search heuristic for the TSP. This heuristic achieves amazingly good results on real-world Euclidean instances with respect to both running time and approximation ratio. There are numerous experimental studies on the performance of 2-Opt. However, the theoretical knowledge about this heuristic is still very limited. Not even its worst-case running time on 2-dimensional Euclidean instances was known so far. We clarify this issue by presenting, for every $p\in\mathbb{N}$, a family of $L_p$ instances on which 2-Opt can take an exponential number of steps. Previous probabilistic analyses were restricted to instances in which $n$ points are placed uniformly at random in the unit square $[0,1]^2$. We consider a more advanced model in which the points can be placed independently according to general distributions on $[0,1]^d$, for an arbitrary $d\ge 2$. In particular, we allow different distributions for different points. We study the expected number of local improvements in terms of the number $n$ of points and the maximal density $\phi$ of the probability distributions. We show an upper bound on the expected length of any 2-Opt improvement path of $\tilde{O}(n^{4+1/3}\cdot\phi^{8/3})$. When starting with an initial tour computed by an insertion heuristic, the upper bound on the expected number of steps improves even to $\tilde{O}(n^{4+1/3-1/d}\cdot\phi^{8/3})$. If the distances are measured according to the Manhattan metric, then the expected number of steps is bounded by $\tilde{O}(n^{4-1/d}\cdot\phi)$. In addition, we prove an upper bound of $O(\sqrt[d]{\phi})$ on the expected approximation factor with respect to all $L_p$ metrics. Let us remark that our probabilistic analysis covers as special cases the uniform input model with $\phi=1$ and a smoothed analysis with Gaussian perturbations of standard deviation $\sigma$ with $\phi\sim1/\sigma^d$.
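For concreteness, here is a minimal sketch of 2-Opt run to a local optimum on points drawn uniformly at random from the unit square, i.e. the uniform input model with $\phi=1$; the instance size and numerical tolerance are illustrative choices.

```python
import math
import random

def tour_length(points, tour):
    """Total Euclidean (L2) length of a closed tour over the given points."""
    return sum(math.dist(points[tour[i]], points[tour[(i + 1) % len(tour)]])
               for i in range(len(tour)))

def two_opt(points, tour):
    """Apply improving 2-Opt exchanges until no local improvement remains."""
    n = len(tour)
    improved = True
    while improved:
        improved = False
        for i in range(n - 1):
            for j in range(i + 2, n if i > 0 else n - 1):
                a, b = points[tour[i]], points[tour[i + 1]]
                c, d = points[tour[j]], points[tour[(j + 1) % n]]
                # Replace edges (a,b),(c,d) by (a,c),(b,d) if that shortens the tour.
                if math.dist(a, c) + math.dist(b, d) < math.dist(a, b) + math.dist(c, d) - 1e-12:
                    tour[i + 1:j + 1] = reversed(tour[i + 1:j + 1])
                    improved = True
    return tour

# Uniform input model: n points placed uniformly at random in the unit square.
points = [(random.random(), random.random()) for _ in range(50)]
tour = list(range(len(points)))
random.shuffle(tour)
print(tour_length(points, tour))          # length of the random starting tour
print(tour_length(points, two_opt(points, tour)))   # length at the 2-Opt local optimum
```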
We analyse a class of time discretizations for solving the nonlinear Schr\"odinger equation with a non-smooth potential and at low regularity on an arbitrary Lipschitz domain $\Omega \subset \mathbb{R}^d$, $d \le 3$. We show that these schemes, together with their optimal local error structure, allow for convergence under lower regularity assumptions on both the solution and the potential than is required by classical methods, such as splitting or exponential integrator methods. Moreover, we show first- and second-order convergence in the case of periodic boundary conditions, in any fractional positive Sobolev space $H^{r}$, $r \ge 0$, beyond the more typical $L^2$ or $H^\sigma$ ($\sigma>\frac{d}{2}$) error analysis. Numerical experiments illustrate our results.
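For comparison, the sketch below implements classical Strang splitting for a 1-D cubic NLS with a potential on a periodic grid, i.e. one of the classical methods mentioned above; the cubic nonlinearity, the smooth potential, the initial datum, and the step size are assumptions made for the example, and this is not the low-regularity scheme analysed in the paper.

```python
import numpy as np

# Strang splitting for i u_t = -u_xx + V u + |u|^2 u on a periodic 1-D grid (illustrative only).
N, L, dt, steps = 256, 2 * np.pi, 1e-3, 1000
x = L * np.arange(N) / N
k = 2 * np.pi * np.fft.fftfreq(N, d=L / N)            # Fourier wavenumbers

V = np.cos(x)                                         # a smooth potential; the paper allows rougher ones
u = 1 / (2 + np.sin(x)) + 0j                          # smooth periodic initial datum (complex array)

for _ in range(steps):
    u *= np.exp(-0.5j * dt * (V + np.abs(u) ** 2))                 # half step: potential + nonlinearity
    u = np.fft.ifft(np.exp(-1j * dt * k ** 2) * np.fft.fft(u))     # full step: free Schroedinger flow
    u *= np.exp(-0.5j * dt * (V + np.abs(u) ** 2))                 # second half step

print(np.sum(np.abs(u) ** 2) * L / N)   # discrete mass, conserved by the splitting up to round-off
```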
We analyze the convergence of a nonlocal gradient descent method for minimizing a class of high-dimensional non-convex functions, where a directional Gaussian smoothing (DGS) is used to define the nonlocal gradient (also referred to as the DGS gradient). The method was first proposed in [42], where multiple numerical experiments showed that replacing the traditional local gradient with the DGS gradient can help the optimizers escape local minima more easily and significantly improve their performance. However, a rigorous theory for the efficiency of the method on non-convex landscapes has been lacking. In this work, we investigate the scenario where the objective function is composed of a convex function perturbed by oscillating noise. We provide a convergence theory under which the iterates converge exponentially to a tightened neighborhood of the solution, whose size is characterized by the noise wavelength. We also establish a correlation between the optimal values of the Gaussian smoothing radius and the noise wavelength, thus justifying the advantage of using a moderate or large smoothing radius with the method. Furthermore, if the noise level decays to zero when approaching the global minimum, we prove that DGS-based optimization converges to the exact global minimum at a linear rate, similarly to standard gradient-based methods in optimizing convex functions. Several numerical experiments are provided to confirm our theory and illustrate the superiority of the approach over those based on the local gradient.
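As a rough illustration of the idea (not the exact construction in [42]), the sketch below forms each component of a DGS-style gradient as a Gaussian-smoothed directional derivative along a coordinate axis, approximated by Gauss-Hermite quadrature; the quadrature order, smoothing radius, step size, and test objective are assumptions made for the example.

```python
import numpy as np
from numpy.polynomial.hermite import hermgauss

def dgs_gradient(f, x, radius=1.0, num_quad=7):
    """Sketch of a DGS-style gradient: Gaussian-smoothed directional derivatives
    along coordinate axes, estimated by Gauss-Hermite quadrature."""
    d = x.size
    nodes, weights = hermgauss(num_quad)     # nodes/weights for integrals against exp(-t^2)
    grad = np.zeros(d)
    for i in range(d):
        e = np.zeros(d)
        e[i] = 1.0
        vals = np.array([f(x + np.sqrt(2.0) * radius * t * e) for t in nodes])
        # (1/radius) * E_{v~N(0,1)}[ v * f(x + radius*v*e_i) ] via Gauss-Hermite quadrature.
        grad[i] = np.sum(weights * np.sqrt(2.0) * nodes * vals) / (np.sqrt(np.pi) * radius)
    return grad

def dgs_descent(f, x0, lr=0.1, radius=1.0, iters=200):
    """Gradient descent with the local gradient replaced by the DGS estimate."""
    x = x0.copy()
    for _ in range(iters):
        x = x - lr * dgs_gradient(f, x, radius)
    return x

# Illustrative objective: a convex quadratic perturbed by small high-frequency oscillations.
f = lambda z: np.dot(z, z) + 0.1 * np.sum(np.sin(50 * z))
print(dgs_descent(f, np.ones(5)))   # iterates settle in a small neighborhood of the origin
```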
Kernel mean embeddings, a widely used technique in machine learning, map probability distributions to elements of a reproducing kernel Hilbert space (RKHS). For supervised learning problems, where input-output pairs are observed, the conditional distribution of the outputs given the inputs is a key object. The input-dependent conditional distribution of an output can be encoded with an RKHS-valued function, the conditional kernel mean map. In this paper we present a new recursive algorithm to estimate the conditional kernel mean map in a Hilbert-space-valued $L_2$ space, that is, in a Bochner space. We prove the weak and strong $L_2$ consistency of our recursive estimator under mild conditions. The idea is to generalize Stone's theorem to Hilbert-space-valued regression on a locally compact Polish space. We present new insights about conditional kernel mean embeddings and give strong asymptotic bounds on the convergence of the proposed recursive method. Finally, the results are demonstrated on three application domains: inputs coming from Euclidean spaces, Riemannian manifolds, and locally compact subsets of function spaces.
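For orientation, the sketch below shows the standard batch (kernel-ridge) estimator of the conditional kernel mean map, not the recursive estimator proposed in the paper; the Gaussian kernels, bandwidths, regularization level, and toy data are illustrative choices.

```python
import numpy as np

def rbf(a, b, gamma):
    """Gaussian (RBF) kernel matrix between the rows of a and the rows of b."""
    sq = np.sum(a ** 2, 1)[:, None] + np.sum(b ** 2, 1)[None, :] - 2 * a @ b.T
    return np.exp(-gamma * sq)

def conditional_mean_embedding(X, Y, gamma_x=1.0, reg=1e-3):
    """Batch kernel-ridge estimate of the conditional kernel mean map.

    Returns a function x -> weights beta(x) so that the embedding of P(Y | X = x)
    is represented as sum_i beta_i(x) k_Y(., y_i) in the output RKHS.
    """
    n = X.shape[0]
    K_x = rbf(X, X, gamma_x)
    W = np.linalg.solve(K_x + n * reg * np.eye(n), np.eye(n))
    def beta(x):
        return W @ rbf(np.atleast_2d(x), X, gamma_x).ravel()
    return beta

# Usage: estimate E[Y | X = 0.5] through the embedding weights on a toy regression problem.
rng = np.random.default_rng(0)
X = rng.uniform(-1, 1, size=(200, 1))
Y = np.sin(3 * X) + 0.1 * rng.standard_normal((200, 1))
w = conditional_mean_embedding(X, Y)(np.array([0.5]))
print(w @ Y.ravel(), np.sin(1.5))   # embedding-based estimate vs. the noise-free target
```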
We give a proof-theoretic as well as a semantic characterization of a logic in the signature with conjunction, disjunction, negation, and the universal and existential quantifiers that we suggest has a certain fundamental status. We present a Fitch-style natural deduction system for the logic that contains only the introduction and elimination rules for the logical constants. From this starting point, if one adds the rule that Fitch called Reiteration, one obtains a proof system for intuitionistic logic in the given signature; if instead of adding Reiteration, one adds the rule of Reductio ad Absurdum, one obtains a proof system for orthologic; by adding both Reiteration and Reductio, one obtains a proof system for classical logic. Arguably neither Reiteration nor Reductio is as intimately related to the meaning of the connectives as the introduction and elimination rules are, so the base logic we identify serves as a more fundamental starting point and common ground between proponents of intuitionistic logic, orthologic, and classical logic. The algebraic semantics for the logic we motivate proof-theoretically is based on bounded lattices equipped with what has been called a weak pseudocomplementation. We show that such lattice expansions are representable using a set together with a reflexive binary relation satisfying a simple first-order condition, which yields an elegant relational semantics for the logic. This builds on our previous study of representations of lattices with negations, which we extend and specialize for several types of negation in addition to weak pseudocomplementation. Finally, we discuss ways of extending these representations to lattices with a conditional or implication operation.
This paper develops and analyzes an accelerated proximal descent method for finding stationary points of nonconvex composite optimization problems. The objective function is of the form $f+h$, where $h$ is a proper closed convex function, $f$ is a differentiable function on the domain of $h$, and $\nabla f$ is Lipschitz continuous on the domain of $h$. The main advantage of this method is that it is "parameter-free" in the sense that it does not require knowledge of the Lipschitz constant of $\nabla f$ or of any global topological properties of $f$. It is shown that the proposed method can obtain an $\varepsilon$-approximate stationary point with iteration complexity bounds that are optimal, up to logarithmic terms in $\varepsilon$, in both the convex and nonconvex settings. Some discussion is also given on how the proposed method can be leveraged within other existing optimization frameworks, such as min-max smoothing and penalty frameworks for constrained programming, to create more specialized parameter-free methods. Finally, numerical experiments are presented to support the practical viability of the method.
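The sketch below illustrates the parameter-free flavour with a plain (non-accelerated) proximal gradient method whose stepsize is adapted by backtracking, so no Lipschitz constant of $\nabla f$ is supplied; it is not the accelerated method analysed in the paper, and the lasso-type test problem is an assumption made for the example.

```python
import numpy as np

def prox_grad_backtracking(grad_f, prox_h, f, x0, iters=500, L0=1.0, tol=1e-8):
    """Proximal gradient method for min f(x) + h(x) with backtracking line search;
    the local Lipschitz estimate L is adapted on the fly (parameter-free stepsize)."""
    x, L = x0.copy(), L0
    for _ in range(iters):
        g = grad_f(x)
        while True:
            x_new = prox_h(x - g / L, 1.0 / L)
            d = x_new - x
            # Sufficient-decrease test for the quadratic upper model with curvature L.
            if f(x_new) <= f(x) + g @ d + 0.5 * L * d @ d:
                break
            L *= 2.0                                   # trial stepsize too large: backtrack
        if np.linalg.norm(L * (x - x_new)) < tol:      # prox-gradient mapping as stationarity proxy
            return x_new
        x, L = x_new, max(L / 2.0, 1e-12)              # let the estimate shrink again
    return x

# Usage on a lasso-type composite problem: f(x) = 0.5||Ax - b||^2, h(x) = lam*||x||_1.
rng = np.random.default_rng(0)
A, b, lam = rng.standard_normal((40, 100)), rng.standard_normal(40), 0.1
f = lambda x: 0.5 * np.sum((A @ x - b) ** 2)
grad_f = lambda x: A.T @ (A @ x - b)
prox_h = lambda v, t: np.sign(v) * np.maximum(np.abs(v) - lam * t, 0.0)   # soft-thresholding
x = prox_grad_backtracking(grad_f, prox_h, f, np.zeros(100))
```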
In the usual Bayesian setting, a full probabilistic model is required to link the data and parameters, and the form of this model and the inference and prediction mechanisms are specified via de Finetti's representation. In general, such a formulation is not robust to mis-specification of its component parts. An alternative approach is to draw inference based on loss functions, where the quantity of interest is defined as a minimizer of some expected loss, and to construct posterior distributions from this loss-based formulation; this strategy underpins the construction of the Gibbs posterior. We develop a Bayesian non-parametric approach; specifically, we generalize the Bayesian bootstrap and specify a Dirichlet process model for the distribution of the observables. We implement this both via direct prior-to-posterior calculations and via predictive sampling. We also study the assessment of posterior validity for non-standard Bayesian calculations, and provide an efficient way to calibrate the scaling parameter in the Gibbs posterior so that it achieves the desired coverage rate. We show that the developed non-standard Bayesian updating procedures yield valid posterior distributions, in terms of consistency and asymptotic normality, under model mis-specification. Simulation studies show that the proposed methods can recover the true value of the parameter efficiently and achieve frequentist coverage even when the sample size is small. Finally, we apply our methods to evaluate the causal impact of speed cameras on traffic collisions in England.
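The sketch below shows the plain Bayesian bootstrap for a loss-defined parameter, i.e. the starting point that the paper generalizes; the absolute-error loss (targeting the median), the data-generating distribution, and the number of posterior draws are illustrative choices, and the calibration of the Gibbs scaling parameter is not reproduced here.

```python
import numpy as np
from scipy.optimize import minimize_scalar

def bayesian_bootstrap_posterior(data, loss, n_draws=1000, rng=None):
    """Bayesian-bootstrap posterior for a parameter defined as a minimizer of expected loss:
    each draw minimizes the loss re-weighted by flat Dirichlet(1,...,1) weights."""
    rng = rng or np.random.default_rng()
    n = len(data)
    draws = []
    for _ in range(n_draws):
        w = rng.dirichlet(np.ones(n))                       # random weights on the observations
        draws.append(minimize_scalar(lambda t: np.sum(w * loss(data, t))).x)
    return np.array(draws)

# Usage: posterior for the median, defined through the absolute-error loss.
rng = np.random.default_rng(1)
y = rng.standard_t(df=3, size=200)
post = bayesian_bootstrap_posterior(y, loss=lambda d, t: np.abs(d - t), rng=rng)
print(post.mean(), np.quantile(post, [0.025, 0.975]))       # point summary and 95% credible interval
```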
When is heterogeneity in the composition of an autonomous robotic team beneficial and when is it detrimental? We investigate and answer this question in the context of a minimally viable model that examines the role of heterogeneous speeds in perimeter defense problems, where defenders share a total allocated speed budget. We consider two distinct problem settings and develop strategies based on dynamic programming and on local interaction rules. We present a theoretical analysis of both approaches and our results are extensively validated using simulations. Interestingly, our results demonstrate that the viability of heterogeneous teams depends on the amount of information available to the defenders. Moreover, our results suggest a universality property: across a wide range of problem parameters the optimal ratio of the speeds of the defenders remains nearly constant.
Interpretability in machine learning (ML) is crucial for high-stakes decisions and troubleshooting. In this work, we provide fundamental principles for interpretable ML, and dispel common misunderstandings that dilute the importance of this crucial topic. We also identify 10 technical challenge areas in interpretable machine learning and provide history and background on each problem. Some of these problems are classically important, and some have arisen only in the last few years. These problems are: (1) Optimizing sparse logical models such as decision trees; (2) Optimization of scoring systems; (3) Placing constraints into generalized additive models to encourage sparsity and better interpretability; (4) Modern case-based reasoning, including neural networks and matching for causal inference; (5) Complete supervised disentanglement of neural networks; (6) Complete or even partial unsupervised disentanglement of neural networks; (7) Dimensionality reduction for data visualization; (8) Machine learning models that can incorporate physics and other generative or causal constraints; (9) Characterization of the "Rashomon set" of good models; and (10) Interpretable reinforcement learning. This survey is suitable as a starting point for statisticians and computer scientists interested in working in interpretable machine learning.
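As a small, concrete instance of challenge (1), the sketch below fits a greedy CART tree with a hard cap on the number of leaves; this is only a proxy for the optimal sparse logical models discussed in the survey, and the dataset and leaf budget are illustrative choices.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier, export_text

# Greedy CART with a hard leaf budget as a stand-in for a sparse logical model;
# challenge (1) concerns provably *optimal* sparse trees, which this greedy fit is not.
X, y = load_breast_cancer(return_X_y=True, as_frame=True)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

tree = DecisionTreeClassifier(max_leaf_nodes=5, random_state=0).fit(X_tr, y_tr)
print(export_text(tree, feature_names=list(X.columns)))   # the whole model fits on one screen
print("test accuracy:", tree.score(X_te, y_te))
```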
This paper focuses on the expected difference in a borrower's repayment when there is a change in the lender's credit decisions. Classical estimators overlook confounding effects, and hence their estimation error can be substantial. We therefore propose an alternative way to construct the estimators so that this error is greatly reduced. The proposed estimators are shown to be unbiased, consistent, and robust through a combination of theoretical analysis and numerical testing. Moreover, we compare the power of the classical and the proposed estimators in estimating the causal quantities. The comparison is tested across a wide range of models, including linear regression models, tree-based models, and neural-network-based models, under simulated datasets that exhibit different levels of causality, different degrees of nonlinearity, and different distributional properties. Most importantly, we apply our approaches to a large observational dataset provided by a global technology firm that operates in both the e-commerce and the lending business. We find that the relative reduction in estimation error is strikingly substantial when the causal effects are accounted for correctly.
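Since the paper's estimator is not spelled out here, the sketch below only illustrates the underlying issue: on simulated confounded lending-style data, a naive difference in means is biased, while a standard inverse-propensity-weighted contrast (one common correction, not the paper's method) recovers the true effect; the data-generating model and effect size are assumptions made for the example.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

def naive_and_ipw(X, t, y):
    """Naive difference in means vs. an inverse-propensity-weighted (IPW) contrast.
    IPW is shown only as a generic confounding correction, not the paper's estimator."""
    naive = y[t == 1].mean() - y[t == 0].mean()
    e = LogisticRegression(max_iter=1000).fit(X, t).predict_proba(X)[:, 1]   # propensity scores
    ipw = np.mean(t * y / e) - np.mean((1 - t) * y / (1 - e))
    return naive, ipw

# Simulated confounded data: the covariate drives both the credit decision and the repayment.
rng = np.random.default_rng(0)
x = rng.standard_normal((5000, 1))
t = rng.binomial(1, 1 / (1 + np.exp(-2 * x[:, 0])))         # decision depends on the covariate
y = 1.0 * t + 2.0 * x[:, 0] + rng.standard_normal(5000)     # true effect of the decision is 1.0
print(naive_and_ipw(x, t, y))   # the naive estimate is biased upward; IPW is close to 1.0
```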