A singularly perturbed reaction-diffusion problem posed on the unit square in $\mathbb{R}^2$ is solved numerically by a local discontinuous Galerkin (LDG) finite element method. Typical solutions of this class of problem exhibit boundary layers along the sides of the domain; these layers generally cause difficulties for numerical methods. Our LDG method handles the boundary layers by using a Shishkin mesh and by introducing the new concept of a ``layer-upwind flux'' -- a discrete flux whose values are chosen on the fine mesh (which lies inside the boundary layers) in the direction where the layer weakens. On the coarse mesh, one can use a standard central flux. No penalty terms are needed with these fluxes, unlike many other variants of the LDG method. Our choice of discrete flux makes it feasible to derive an optimal-order error analysis in a balanced norm; this norm is stronger than the usual energy norm and is a more appropriate measure of the error in computed solutions of singularly perturbed reaction-diffusion problems. It is proved that the LDG method is usually convergent of order $O((N^{-1}\ln N)^{k+1})$ in the balanced norm, where $N$ is the number of mesh intervals in each coordinate direction and tensor-product piecewise polynomials of degree~$k$ in each coordinate variable are used in the LDG method. This result is the first of its kind for the LDG method applied to this class of problem and is optimal for convergence on a Shishkin mesh. Its sharpness is confirmed by numerical experiments.
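As a small illustration of the kind of layer-adapted mesh used above (a generic construction, not code from the paper), the following Python snippet builds a one-dimensional Shishkin mesh for a reaction-diffusion layer of width $O(\sqrt{\varepsilon})$ and forms its tensor product; the constant `sigma` in the transition point is an illustrative assumption and is normally tied to the polynomial degree $k$.

```python
import numpy as np

def shishkin_mesh_1d(N, eps, sigma=2.0):
    """Piecewise-uniform Shishkin mesh on [0, 1] for a reaction-diffusion layer
    of width O(sqrt(eps)): N/4 fine cells in each layer region and N/2 coarse
    cells in the interior.  N must be divisible by 4; the constant sigma is an
    illustrative choice and is normally tied to the polynomial degree k."""
    assert N % 4 == 0
    tau = min(0.25, sigma * np.sqrt(eps) * np.log(N))    # transition point
    left   = np.linspace(0.0, tau, N // 4 + 1)           # fine mesh in the left layer
    middle = np.linspace(tau, 1.0 - tau, N // 2 + 1)     # coarse mesh in the interior
    right  = np.linspace(1.0 - tau, 1.0, N // 4 + 1)     # fine mesh in the right layer
    return np.unique(np.concatenate([left, middle, right]))

# Tensor-product mesh on the unit square, with layers along all four sides.
x = shishkin_mesh_1d(16, 1e-6)
X, Y = np.meshgrid(x, x)
```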
This contribution introduces a model order reduction approach for an advection-reaction problem with a parametrized reaction function. The underlying discretization uses an ultraweak formulation with an $L^2$-like trial space and an 'optimal' test space as introduced by Demkowicz et al. This ensures the stability of the discretization and, in addition, allows for a symmetric reformulation of the problem in terms of a dual solution; this reformulation can also be interpreted as the normal equations of an adjoint least-squares problem. Classical model order reduction techniques can then be applied to the space of dual solutions, which immediately yields a reduced primal space as well. We show that the necessary computations do not require the reconstruction of any primal solutions and can instead be performed entirely on the space of dual solutions. We prove exponential convergence of the Kolmogorov $N$-width and show that a greedy algorithm produces quasi-optimal approximation spaces for both the primal and the dual solution space. Numerical experiments based on the benchmark problem of a catalytic filter confirm the applicability of the proposed method.
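To make the greedy construction mentioned above concrete, here is a generic residual-driven greedy reduced-basis loop on a toy parametrized symmetric positive definite system; the matrices `A0`, `A1`, the right-hand side and the error indicator are stand-ins chosen for illustration, not the paper's ultraweak or dual formulation.

```python
import numpy as np

# Toy parametrized SPD system A(mu) u = b with A(mu) = A0 + mu * A1,
# standing in for the dual / normal-equations problem of the abstract.
rng = np.random.default_rng(0)
n = 200
M = rng.standard_normal((n, n))
A0 = M @ M.T + n * np.eye(n)
A1 = np.diag(rng.uniform(0.5, 2.0, n))
b = rng.standard_normal(n)
solve = lambda mu: np.linalg.solve(A0 + mu * A1, b)

def greedy_rb(train_params, tol=1e-8, max_dim=20):
    """Residual-driven greedy reduced-basis construction."""
    V = np.zeros((n, 0))                       # reduced basis (orthonormal columns)
    for _ in range(max_dim):
        errs = []
        for mu in train_params:                # error indicator for every training parameter
            A = A0 + mu * A1
            if V.shape[1] == 0:
                r = b
            else:
                u_rb = V @ np.linalg.solve(V.T @ A @ V, V.T @ b)
                r = b - A @ u_rb
            errs.append(np.linalg.norm(r))
        worst = int(np.argmax(errs))
        if errs[worst] < tol * np.linalg.norm(b):
            break
        snap = solve(train_params[worst])      # enrich with the snapshot at the worst parameter
        snap -= V @ (V.T @ snap)               # Gram-Schmidt against the current basis
        V = np.hstack([V, (snap / np.linalg.norm(snap)).reshape(-1, 1)])
    return V

V = greedy_rb(np.linspace(0.0, 10.0, 50))
print(V.shape)
```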
Time-fractional parabolic equations with a Caputo time derivative of order $\alpha\in(0,1)$ are discretized in time using continuous collocation methods. For such discretizations, we give sufficient conditions for existence and uniqueness of their solutions. Two approaches are explored: the Lax-Milgram Theorem and the eigenfunction expansion. The resulting sufficient conditions, which involve certain $m\times m$ matrices (where $m$ is the order of the collocation scheme), are verified both analytically, for all $m\ge 1$ and all sets of collocation points, and computationally, for all $m\le 20$. The semilinear case is also addressed.
An $n$-bit Boolean function is resilient to coalitions of size $q$ if any fixed set of $q$ bits is unlikely to influence the function when the other $n-q$ bits are chosen uniformly. We give explicit constructions of depth-$3$ circuits that are resilient to coalitions of size $cn/\log^{2}n$ with bias $n^{-c}$. Previous explicit constructions with the same resilience had constant bias. Our construction is simpler and we generalize it to biased product distributions. Our proof builds on previous work; the main differences are the use of a tail bound for expander walks in combination with a refined analysis based on Janson's inequality.
Let $X_1, \ldots, X_n$ be probability spaces, let $X$ be their direct product, let $\phi_1, \ldots, \phi_m: X \longrightarrow {\Bbb C}$ be random variables, each depending only on a few coordinates of a point $x=(x_1, \ldots, x_n)$, and let $f=\phi_1 + \ldots + \phi_m$. The expectation $E\thinspace e^{\lambda f}$, where $\lambda \in {\Bbb C}$, appears in statistical physics as the partition function of a system with multi-spin interactions, and also in combinatorics and computer science, where it is known as the partition function of edge-coloring models, tensor network contractions or a Holant polynomial. Assuming that each $\phi_i$ is 1-Lipschitz in the Hamming metric of $X$, that each $\phi_i(x)$ depends on at most $r \geq 2$ of the coordinates $x_1, \ldots, x_n$ of $x \in X$, and that for each $j$ there are at most $c \geq 1$ functions $\phi_i$ that depend on the coordinate $x_j$, we prove that $E\thinspace e^{\lambda f} \ne 0$ provided $|\lambda| \leq (3 c \sqrt{r-1})^{-1}$, and that this bound is sharp up to a constant factor. Taking a scaling limit, we prove a similar result for functions $\phi_1, \ldots, \phi_m: {\Bbb R}^n \longrightarrow {\Bbb C}$ that are 1-Lipschitz in the $\ell^1$ metric of ${\Bbb R}^n$, where the expectation is taken with respect to the standard Gaussian measure in ${\Bbb R}^n$. As a corollary, the value of the expectation can be efficiently approximated, provided $\lambda$ lies in a slightly smaller disc.
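As a brute-force sanity check of the zero-free region stated above (a toy instance constructed for this summary, not taken from the paper), the following Python snippet takes $X = \{0,1\}^n$ with the uniform measure and one 1-Lipschitz function per edge of an $n$-cycle, so that $r = 2$ and $c = 2$ and the theorem guarantees $E\, e^{\lambda f} \ne 0$ for $|\lambda| \le 1/6$; the expectation is evaluated exactly by enumeration.

```python
import numpy as np
from itertools import product

# Toy instance: X = {0,1}^n with the uniform measure, and one 1-Lipschitz
# function per edge of a cycle, phi_e(x) = 1 if the two endpoints agree, else 0.
# Each phi_e depends on r = 2 coordinates and each coordinate appears in c = 2
# of the phi_e, so the theorem guarantees E exp(lam * f) != 0 whenever
# |lam| <= 1/(3 * c * sqrt(r - 1)) = 1/6.
n = 12
edges = [(j, (j + 1) % n) for j in range(n)]

def partition_function(lam):
    total = 0.0 + 0.0j
    for x in product((0, 1), repeat=n):          # brute force over all 2^n points
        f = sum(1.0 for (u, v) in edges if x[u] == x[v])
        total += np.exp(lam * f)
    return total / 2**n                          # expectation = uniform average

for lam in (1 / 6, 1j / 6, -1 / 6 + 1j / 12):
    print(lam, partition_function(lam))          # stays away from zero
```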
Bayesian inference gets its name from *Bayes's theorem*, expressing posterior probabilities for hypotheses about a data generating process as the (normalized) product of prior probabilities and a likelihood function. But Bayesian inference uses all of probability theory, not just Bayes's theorem. Many hypotheses of scientific interest are *composite hypotheses*, with the strength of evidence for the hypothesis dependent on knowledge about auxiliary factors, such as the values of nuisance parameters (e.g., uncertain background rates or calibration factors). Many important capabilities of Bayesian methods arise from use of the law of total probability, which instructs analysts to compute probabilities for composite hypotheses by *marginalization* over auxiliary factors. This tutorial targets relative newcomers to Bayesian inference, aiming to complement tutorials that focus on Bayes's theorem and how priors modulate likelihoods. The emphasis here is on marginalization over parameter spaces -- both how it is the foundation for important capabilities, and how it may motivate caution when parameter spaces are large. Topics covered include the difference between likelihood and probability, understanding the impact of priors beyond merely shifting the maximum likelihood estimate, and the role of marginalization in accounting for uncertainty in nuisance parameters, systematic error, and model misspecification.
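As a minimal worked example of marginalization over a nuisance parameter (an illustration constructed for this summary, not taken from the tutorial), the following Python snippet computes the marginal likelihood of a signal rate $s$ when the observed Poisson count has an uncertain background $b$ with a Gamma prior, and compares it with the plug-in likelihood that simply fixes $b$ at its prior mean.

```python
import numpy as np
from scipy import stats

# Observed counts N ~ Poisson(s + b), where s is the signal of interest and b
# is an uncertain background rate with prior b ~ Gamma(shape=5, scale=2).
N_obs = 25
b_grid = np.linspace(0.0, 60.0, 2001)
prior_b = stats.gamma.pdf(b_grid, a=5, scale=2)

def marginal_likelihood(s):
    """p(N_obs | s) = integral of p(N_obs | s, b) p(b) db, done on a grid."""
    like = stats.poisson.pmf(N_obs, s + b_grid)
    return np.trapz(like * prior_b, b_grid)

s_grid = np.linspace(0.0, 50.0, 501)
marg = np.array([marginal_likelihood(s) for s in s_grid])
# Compare with the plug-in likelihood that fixes b at its prior mean (10):
plug = stats.poisson.pmf(N_obs, s_grid + 10.0)
print(s_grid[np.argmax(marg)], s_grid[np.argmax(plug)])
```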
Given an Abelian group G, a Boolean-valued function f: G -> {-1,+1} is said to be s-sparse if it has at most s non-zero Fourier coefficients over the domain G. In a seminal paper, Gopalan et al. proved a "Granularity" result for the Fourier coefficients of Boolean-valued functions over Z_2^n, which has found many diverse applications in theoretical computer science and combinatorics. They also obtained structural results for Boolean functions over Z_2^n that are approximately Fourier-sparse. In this work, we obtain structural results for approximately Fourier-sparse Boolean-valued functions over Abelian groups G of the form G := Z_{p_1}^{n_1} \times ... \times Z_{p_t}^{n_t}, for distinct primes p_i. We also obtain a lower bound of the form 1/(m^{2}s)^ceiling(phi(m)/2) on the absolute value of the smallest non-zero Fourier coefficient of an s-sparse function, where m = p_1 ... p_t and phi(m) = (p_1-1) ... (p_t-1). We carefully apply probabilistic techniques from Gopalan et al. to obtain our structural results, and use some non-trivial results from algebraic number theory to get the lower bound. We also construct a family of at most s-sparse Boolean functions over Z_p^n, where p > 2, for arbitrarily large s, for which the minimum non-zero Fourier coefficient is 1/omega(n). The "Granularity" result of Gopalan et al. implies that the absolute values of the non-zero Fourier coefficients of any s-sparse Boolean-valued function over Z_2^n are 1/O(s), so our construction shows that one cannot expect such a lower bound for general Abelian groups. Using our new structural results on the Fourier coefficients of sparse functions, we design an efficient testing algorithm for Fourier-sparse Boolean functions that requires poly((ms)^phi(m), 1/epsilon)-many queries. Further, we prove an Omega(sqrt{s}) lower bound on the query complexity of any adaptive sparsity testing algorithm.
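As a small numerical illustration of Fourier coefficients over groups Z_p^n with p > 2 (a toy example, not one of the paper's constructions), the snippet below computes the Fourier coefficients of a 3-sparse Boolean-valued function on Z_3^2 using the characters chi_a(x) = omega^{<a,x>}, and reports the number of non-zero coefficients and the smallest non-zero magnitude.

```python
import numpy as np
from itertools import product

# Fourier coefficients of a Boolean-valued function over G = Z_p^n,
# using the characters chi_a(x) = omega^{<a,x>} with omega = exp(2*pi*i/p).
p, n = 3, 2
omega = np.exp(2j * np.pi / p)
G = list(product(range(p), repeat=n))

def fourier_coeffs(f):
    """hat f(a) = (1/p^n) * sum_x f(x) * conj(chi_a(x))."""
    return {a: sum(f(x) * omega ** (-sum(ai * xi for ai, xi in zip(a, x)) % p)
                   for x in G) / p**n
            for a in G}

f = lambda x: -1 if x[0] == 0 else 1        # a 3-sparse Boolean-valued function
coeffs = fourier_coeffs(f)
nonzero = {a: c for a, c in coeffs.items() if abs(c) > 1e-12}
print(len(nonzero), min(abs(c) for c in nonzero.values()))   # 3 non-zero coefficients, smallest magnitude 1/3
```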
In this paper, we study the problem of noisy, convex, zeroth order optimisation of a function $f$ over a bounded convex set $\bar{\mathcal X}\subset \mathbb{R}^d$. Given a budget $n$ of noisy queries to the function $f$ that can be allocated sequentially and adaptively, our aim is to construct an algorithm that returns a point $\hat x\in \bar{\mathcal X}$ such that $f(\hat x)$ is as small as possible. We provide a conceptually simple method inspired by the textbook center of gravity method, but adapted to the noisy and zeroth order setting. We prove that for this method the gap $f(\hat x) - \min_{x\in \bar{\mathcal X}} f(x)$ is of smaller order than $d^2/\sqrt{n}$ up to poly-logarithmic terms. This slightly improves upon the existing literature, where, to the best of our knowledge, the best known rate, obtained in [Lattimore, 2024], is of order $d^{2.5}/\sqrt{n}$, albeit for a more challenging problem. Our main contribution is, however, conceptual, as we believe that our algorithm and its analysis bring novel ideas and are significantly simpler than existing approaches.
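For readers unfamiliar with the textbook method mentioned above, the following toy Python sketch runs the classical centre-of-gravity cutting-plane iteration with exact gradients on a simple 2D quadratic; the paper's contribution is the noisy, zeroth-order adaptation, which this sketch does not attempt to reproduce.

```python
import numpy as np

# Toy, exact-gradient sketch of the classical centre-of-gravity cutting-plane
# idea in 2D, minimising f(x, y) = (x - 0.3)^2 + 2 (y + 0.2)^2 over [-1, 1]^2.
# The feasible set is represented crudely by a point cloud.
rng = np.random.default_rng(1)
grad_f = lambda v: np.array([2.0 * (v[0] - 0.3), 4.0 * (v[1] + 0.2)])

cloud = rng.uniform(-1.0, 1.0, size=(400_000, 2))     # current feasible set, as a point cloud
while len(cloud) > 2_000:
    c = cloud.mean(axis=0)                            # approximate centre of gravity
    g = grad_f(c)
    cloud = cloud[(cloud - c) @ g <= 0.0]             # by convexity, the minimiser lies in this half-space
print(cloud.mean(axis=0))                             # roughly the minimiser (0.3, -0.2)
```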
We consider an on-line least squares regression problem with optimal solution $\theta^*$ and Hessian matrix $H$, and study a time-average stochastic gradient descent estimator of $\theta^*$. For $k\ge2$, we provide an unbiased estimator of $\theta^*$ that is a modification of the time-average estimator, runs with an expected number of time-steps of order $k$, and has $O(1/k)$ expected excess risk. The constant behind the $O$ notation depends on parameters of the regression and is a poly-logarithmic function of the smallest eigenvalue of $H$. We provide both a biased and an unbiased estimator of the expected excess risk of the time-average estimator and of its unbiased counterpart, without requiring knowledge of either $H$ or $\theta^*$. We describe an "average-start" version of our estimators with similar properties. Our approach is based on randomized multilevel Monte Carlo. Our numerical experiments confirm our theoretical findings.
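The following Python sketch illustrates the randomized multilevel Monte Carlo mechanism in its simplest single-term (Rhee--Glynn-style) form on a toy streaming least-squares problem: it is unbiased for the expected time average after base * 2^max_level SGD steps while using only a few hundred steps in expectation. The problem sizes, step size and level distribution are illustrative assumptions, not the paper's construction.

```python
import numpy as np

rng = np.random.default_rng(0)
d = 5
theta_star = rng.standard_normal(d)

def sgd_averages(n_steps, step=0.05):
    """Run n_steps of streaming least-squares SGD and record the time average
    of the iterates after every power-of-two number of steps."""
    theta, running_sum, averages = np.zeros(d), np.zeros(d), {}
    for t in range(1, n_steps + 1):
        x = rng.standard_normal(d)                        # fresh covariate
        y = x @ theta_star + 0.1 * rng.standard_normal()  # noisy response
        theta -= step * (x @ theta - y) * x               # SGD step
        running_sum += theta
        if (t & (t - 1)) == 0:                            # t is a power of two
            averages[t] = running_sum / t
    return averages

def unbiased_estimate(base=64, max_level=12):
    """Single-term randomised MLMC: unbiased for the *expected* time average
    after base * 2**max_level steps, at an expected cost of only about
    base * max_level steps (level L is drawn with probability ~ 2**-L)."""
    p = 2.0 ** -np.arange(1, max_level + 1)
    p /= p.sum()
    L = rng.choice(np.arange(1, max_level + 1), p=p)
    avg = sgd_averages(base * 2**L)                       # one coupled trajectory
    delta = avg[base * 2**L] - avg[base * 2**(L - 1)]     # level-L correction
    return avg[base] + delta / p[L - 1]

est = np.mean([unbiased_estimate() for _ in range(200)], axis=0)
print(np.linalg.norm(est - theta_star))                   # shrinks as more replicates are averaged
```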
Rational approximation is a powerful tool to obtain accurate surrogates for nonlinear functions that are easy to evaluate and linearize. The interpolatory adaptive Antoulas--Anderson (AAA) method is one approach to construct such approximants numerically. For large-scale vector- and matrix-valued functions, however, the direct application of the set-valued variant of AAA becomes inefficient. We propose and analyze a new sketching approach for such functions called sketchAAA that, with high probability, leads to much better approximants than previously suggested approaches while retaining efficiency. The sketching approach works in a black-box fashion where only evaluations of the nonlinear function at sampling points are needed. Numerical tests with nonlinear eigenvalue problems illustrate the efficacy of our approach, with speedups above 200 when sampling large-scale black-box functions, without sacrificing accuracy.
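A hedged sketch of the two ingredients named above, on a toy matrix-valued function: a thin random probe compresses F(z) to a small vector-valued function, and a bare-bones set-valued AAA routine (common support points and weights for all components, chosen greedily via a stacked Loewner least-squares problem) is then fitted to the sketched samples. All sizes and the function `Fmat` are assumptions for illustration; this is not the authors' sketchAAA implementation.

```python
import numpy as np

def set_valued_aaa(Z, F, tol=1e-10, max_terms=30):
    """Bare-bones set-valued AAA: a barycentric rational fit with support points
    and weights shared by all components.  Z: (M,) sample points; F: (M, m)."""
    M, m = F.shape
    support = []
    R = np.tile(F.mean(axis=0), (M, 1))                    # current approximation
    for _ in range(max_terms):
        j = int(np.argmax(np.max(np.abs(F - R), axis=1)))  # worst sample point
        support.append(j)
        rest = [i for i in range(M) if i not in support]
        C = 1.0 / (Z[rest][:, None] - Z[support][None, :]) # Cauchy matrix
        # Stack the Loewner matrices of all components; the common weights are the
        # right singular vector belonging to the smallest singular value.
        L = np.vstack([C * (F[rest, i:i + 1] - F[support, i][None, :]) for i in range(m)])
        w = np.linalg.svd(L, full_matrices=False)[2].conj()[-1]
        num = C @ (w[:, None] * F[support])                # barycentric numerators
        den = C @ w                                        # barycentric denominator
        R = F.copy()
        R[rest] = num / den[:, None]
        if np.max(np.abs(F - R)) < tol * np.max(np.abs(F)):
            break
    return Z[support], F[support], w

# Sketching step: compress a toy large matrix-valued function with a thin random probe.
rng = np.random.default_rng(0)
n, ell = 300, 2
A, B = rng.standard_normal((n, n)), rng.standard_normal((n, n))
Fmat = lambda z: A + z * B + np.exp(-z) * np.eye(n)        # stand-in nonlinear matrix-valued function
Omega = rng.standard_normal((n, ell)) / np.sqrt(n)         # thin random probe ("sketch")
Z = np.linspace(0.1, 4.0, 200)
F_sk = np.array([(Fmat(z) @ Omega).ravel() for z in Z])    # sketched samples, shape (200, n*ell)
t, f_t, w = set_valued_aaa(Z, F_sk)
print(len(t))   # support points found on the sketch; they could then be reused with
                # full evaluations Fmat(t_k) to assemble an approximant of Fmat itself
```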
We investigate perturbations of orthonormal bases of $L^2$ via a composition operator $C_h$ induced by a mapping $h$. We provide a comprehensive characterization of the mapping $h$ required for the perturbed sequence to form an orthonormal or Riesz basis. Restricting our analysis to differentiable mappings, we reveal that all Riesz bases of the given form are induced by bi-Lipschitz mappings. In addition, we discuss implications of these results for approximation theory, highlighting the potential of using bijective neural networks to construct complete sequences with favorable approximation properties.
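As a numerical illustration of the role of bi-Lipschitz mappings (constructed for this summary, not taken from the paper), the snippet below computes finite Gram sections of the composed exponential system $e_n(h(x)) = e^{2\pi i n h(x)}$ on $[0,1]$ for a bi-Lipschitz $h$ and for $h(x) = x^2$, whose derivative vanishes at $0$; the eigenvalues stay in a fixed band in the first case and spread as the section grows in the second.

```python
import numpy as np

x = np.linspace(0.0, 1.0, 20001)                               # quadrature grid on [0, 1]
w = np.full_like(x, x[1] - x[0]); w[0] *= 0.5; w[-1] *= 0.5    # trapezoidal weights

def gram_eigs(h_vals, K):
    """Eigenvalues of the Gram matrix of {exp(2*pi*i*n*h(x)) : |n| <= K} in L^2(0,1)."""
    n = np.arange(-K, K + 1)
    E = np.exp(2j * np.pi * np.outer(n, h_vals))               # rows: e_n(h(x)) on the grid
    G = (E * w) @ E.conj().T                                   # G[n, m] = <e_n o h, e_m o h>
    return np.linalg.eigvalsh(G)

h_bilip = 0.5 * (x + x**2)     # h' in [1/2, 3/2]: a bi-Lipschitz bijection of [0, 1]
h_sq    = x**2                 # h'(0) = 0: not bi-Lipschitz
for K in (5, 20, 40):
    for name, h in (("bi-Lipschitz", h_bilip), ("x^2", h_sq)):
        lam = gram_eigs(h, K)
        print(K, name, round(lam.min(), 3), round(lam.max(), 3))
```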