In this paper, we propose a $C^{0}$ interior penalty method for $m$th-Laplace equation on bounded Lipschitz polyhedral domain in $\mathbb{R}^{d}$, where $m$ and $d$ can be any positive integers. The standard $H^{1}$-conforming piecewise $r$-th order polynomial space is used to approximate the exact solution $u$, where $r$ can be any integer greater than or equal to $m$. Unlike the interior penalty method in [T.~Gudi and M.~Neilan, {\em An interior penalty method for a sixth-order elliptic equation}, IMA J. Numer. Anal., \textbf{31(4)} (2011), pp. 1734--1753], we avoid computing $D^{m}$ of numerical solution on each element and high order normal derivatives of numerical solution along mesh interfaces. Therefore our method can be easily implemented. After proving discrete $H^{m}$-norm bounded by the natural energy semi-norm associated with our method, we manage to obtain stability and optimal convergence with respect to discrete $H^{m}$-norm. Numerical experiments validate our theoretical estimate.
Leave-one-out cross-validation (LOO-CV) is a popular method for estimating out-of-sample predictive accuracy. However, computing LOO-CV criteria can be computationally expensive due to the need to fit the model multiple times. In the Bayesian context, importance sampling provides a possible solution but classical approaches can easily produce estimators whose variance is infinite, making them potentially unreliable. Here we propose and analyze a novel mixture estimator to compute Bayesian LOO-CV criteria. Our method retains the simplicity and computational convenience of classical approaches, while guaranteeing finite variance of the resulting estimators. Both theoretical and numerical results are provided to illustrate the improved robustness and efficiency. The computational benefits are particularly significant in high-dimensional problems, allowing to perform Bayesian LOO-CV for a broader range of models. The proposed methodology is easily implementable in standard probabilistic programming software and has a computational cost roughly equivalent to fitting the original model once.
In the history of first-order algorithms, Nesterov's accelerated gradient descent (NAG) is one of the milestones. However, the cause of the acceleration has been a mystery for a long time. It has not been revealed with the existence of gradient correction until the high-resolution differential equation framework proposed in [Shi et al., 2021]. In this paper, we continue to investigate the acceleration phenomenon. First, we provide a significantly simplified proof based on precise observation and a tighter inequality for $L$-smooth functions. Then, a new implicit-velocity high-resolution differential equation framework, as well as the corresponding implicit-velocity version of phase-space representation and Lyapunov function, is proposed to investigate the convergence behavior of the iterative sequence $\{x_k\}_{k=0}^{\infty}$ of NAG. Furthermore, from two kinds of phase-space representations, we find that the role played by gradient correction is equivalent to that by velocity included implicitly in the gradient, where the only difference comes from the iterative sequence $\{y_{k}\}_{k=0}^{\infty}$ replaced by $\{x_k\}_{k=0}^{\infty}$. Finally, for the open question of whether the gradient norm minimization of NAG has a faster rate $o(1/k^3)$, we figure out a positive answer with its proof. Meanwhile, a faster rate of objective value minimization $o(1/k^2)$ is shown for the case $r > 2$.
The Strong Exponential Time Hypothesis (SETH) asserts that for every $\varepsilon>0$ there exists $k$ such that $k$-SAT requires time $(2-\varepsilon)^n$. The field of fine-grained complexity has leveraged SETH to prove quite tight conditional lower bounds for dozens of problems in various domains and complexity classes, including Edit Distance, Graph Diameter, Hitting Set, Independent Set, and Orthogonal Vectors. Yet, it has been repeatedly asked in the literature whether SETH-hardness results can be proven for other fundamental problems such as Hamiltonian Path, Independent Set, Chromatic Number, MAX-$k$-SAT, and Set Cover. In this paper, we show that fine-grained reductions implying even $\lambda^n$-hardness of these problems from SETH for any $\lambda>1$, would imply new circuit lower bounds: super-linear lower bounds for Boolean series-parallel circuits or polynomial lower bounds for arithmetic circuits (each of which is a four-decade open question). We also extend this barrier result to the class of parameterized problems. Namely, for every $\lambda>1$ we conditionally rule out fine-grained reductions implying SETH-based lower bounds of $\lambda^k$ for a number of problems parameterized by the solution size $k$. Our main technical tool is a new concept called polynomial formulations. In particular, we show that many problems can be represented by relatively succinct low-degree polynomials, and that any problem with such a representation cannot be proven SETH-hard (without proving new circuit lower bounds).
In this paper, we consider a coupled chemotaxis-fluid system that models self-organized collective behavior of oxytactic bacteria in a sessile drop. This model describes the biological chemotaxis phenomenon in the fluid environment and couples a convective chemotaxis system for the oxygen-consuming and oxytactic bacteria with the incompressible Navier-Stokes equations subject to a gravitational force, which is proportional to the relative surplus of the cell density compared to the water density. We develop a new positivity preserving and high-resolution method for the studied chemotaxis-fluid system. Our method is based on the diffuse-domain approach, which we use to derive a new chemotaxis-fluid diffuse-domain (cf-DD) model for simulating bioconvection in complex geometries. The drop domain is imbedded into a larger rectangular domain, and the original boundary is replaced by a diffuse interface with finite thickness. The original chemotaxis-fluid system is reformulated on the larger domain with additional source terms that approximate the boundary conditions on the physical interface. We show that the cf-DD model converges to the chemotaxis-fluid model asymptotically as the width of the diffuse interface shrinks to zero. We numerically solve the resulting cf-DD system by a second-order hybrid finite-volume finite-difference method and demonstrate the performance of the proposed approach on a number of numerical experiments that showcase several interesting chemotactic phenomena in sessile drops of different shapes, where the bacterial patterns depend on the droplet geometries.
We propose to combine the Carleman estimate and the Newton method to solve an inverse source problem for nonlinear parabolic equations from lateral boundary data. The stability of this inverse source problem is conditionally logarithmic. Hence, numerical results due to the conventional least squares optimization might not be reliable. In order to enhance the stability, we approximate this problem by truncating the high frequency terms of the Fourier series that represents the solution to the governing equation. By this, we derive a system of nonlinear elliptic PDEs whose solution consists of Fourier coefficients of the solution to the parabolic governing equation. We solve this system by the Carleman-Newton method. The Carleman-Newton method is a newly developed algorithm to solve nonlinear PDEs. The strength of the Carleman-Newton method includes (1) no good initial guess is required and (2) the computational cost is not expensive. These features are rigorously proved. Having the solutions to this system in hand, we can directly compute the solution to the proposed inverse problem. Some numerical examples are displayed.
Recently there is a large amount of work devoted to the study of Markov chain stochastic gradient methods (MC-SGMs) which mainly focus on their convergence analysis for solving minimization problems. In this paper, we provide a comprehensive generalization analysis of MC-SGMs for both minimization and minimax problems through the lens of algorithmic stability in the framework of statistical learning theory. For empirical risk minimization (ERM) problems, we establish the optimal excess population risk bounds for both smooth and non-smooth cases by introducing on-average argument stability. For minimax problems, we develop a quantitative connection between on-average argument stability and generalization error which extends the existing results for uniform stability \cite{lei2021stability}. We further develop the first nearly optimal convergence rates for convex-concave problems both in expectation and with high probability, which, combined with our stability results, show that the optimal generalization bounds can be attained for both smooth and non-smooth cases. To the best of our knowledge, this is the first generalization analysis of SGMs when the gradients are sampled from a Markov process.
Positive and unlabelled learning is an important problem which arises naturally in many applications. The significant limitation of almost all existing methods lies in assuming that the propensity score function is constant (SCAR assumption), which is unrealistic in many practical situations. Avoiding this assumption, we consider parametric approach to the problem of joint estimation of posterior probability and propensity score functions. We show that under mild assumptions when both functions have the same parametric form (e.g. logistic with different parameters) the corresponding parameters are identifiable. Motivated by this, we propose two approaches to their estimation: joint maximum likelihood method and the second approach based on alternating maximization of two Fisher consistent expressions. Our experimental results show that the proposed methods are comparable or better than the existing methods based on Expectation-Maximisation scheme.
In this paper, we develop a novel class of linear energy-preserving integrating factor methods for the 2D nonlinear Schr\"odinger equation with wave operator (NLSW), combining the scalar auxiliary variable approach and the integrating factor methods. A second-order scheme is first proposed, which is rigorously proved to be energy-preserving. By using the energy methods, we analyze its optimal convergence in the $H^1$ norm without any restrictions on the grid ratio, where a novel technique and an improved induction argument are proposed to overcome the difficulty posed by the unavailability of a priori $L^\infty$ estimates of numerical solutions. Based on the integrating factor Runge-Kutta methods, we extend the proposed scheme to arbitrarily high order, which is also linear and conservative. Numerical experiments are presented to confirm the theoretical analysis and demonstrate the advantages of the proposed methods.
We show that under minimal assumptions on a random vector $X\in\mathbb{R}^d$, and with high probability, given $m$ independent copies of $X$, the coordinate distribution of each vector $(\langle X_i,\theta \rangle)_{i=1}^m$ is dictated by the distribution of the true marginal $\langle X,\theta \rangle$. Formally, we show that with high probability, \[\sup_{\theta \in S^{d-1}} \left( \frac{1}{m}\sum_{i=1}^m \left|\langle X_i,\theta \rangle^\sharp - \lambda^\theta_i \right|^2 \right)^{1/2} \leq c \left( \frac{d}{m} \right)^{1/4},\] where $\lambda^{\theta}_i = m\int_{(\frac{i-1}{m}, \frac{i}{m}]} F_{ \langle X,\theta \rangle }^{-1}(u)^2 \,du$ and $a^\sharp$ denotes the monotone non-decreasing rearrangement of $a$. The proof follows from the optimal estimate on the worst Wasserstein distance between a marginal of $X$ and its empirical counterpart, $\frac{1}{m} \sum_{i=1}^m \delta_{\langle X_i, \theta \rangle}$. We then use the accurate information on the structures of the vectors $(\langle X_i,\theta \rangle)_{i=1}^m$ to construct the first non-gaussian ensemble that yields the optimal estimate in the Dvoretzky-Milman Theorem: the ensemble exhibits almost Euclidean sections in arbitrary normed spaces of the same dimension as the gaussian embedding -- despite being very far from gaussian (in fact, it happens to be heavy-tailed).
We consider a potential outcomes model in which interference may be present between any two units but the extent of interference diminishes with spatial distance. The causal estimand is the global average treatment effect, which compares outcomes under the counterfactuals that all or no units are treated. We study a class of designs in which space is partitioned into clusters that are randomized into treatment and control. For each design, we estimate the treatment effect using a Horvitz-Thompson estimator that compares the average outcomes of units with all or no neighbors treated, where the neighborhood radius is of the same order as the cluster size dictated by the design. We derive the estimator's rate of convergence as a function of the design and degree of interference and use this to obtain estimator-design pairs that achieve near-optimal rates of convergence under relatively minimal assumptions on interference. We prove that the estimators are asymptotically normal and provide a variance estimator. For practical implementation of the designs, we suggest partitioning space using clustering algorithms.