We study overlapping Schwarz methods for the Helmholtz equation posed in any dimension with large, real wavenumber and smooth variable wave speed. The radiation condition is approximated by a Cartesian perfectly-matched layer (PML). The domain-decomposition subdomains are overlapping hyperrectangles with Cartesian PMLs at their boundaries. The overlaps of the subdomains and the widths of the PMLs are all taken to be independent of the wavenumber. For both parallel (i.e., additive) and sequential (i.e., multiplicative) methods, we show that after a specified number of iterations -- depending on the behaviour of the geometric-optic rays -- the error is smooth and smaller than any negative power of the wavenumber. For the parallel method, the specified number of iterations is less than the maximum number of subdomains, counted with their multiplicity, that a geometric-optic ray can intersect. These results, which are illustrated by numerical experiments, are the first wavenumber-explicit results about convergence of overlapping Schwarz methods for the Helmholtz equation, and the first wavenumber-explicit results about convergence of any domain-decomposition method for the Helmholtz equation with a non-trivial scatterer (here a variable wave speed).
We consider the fundamental problem of constructing fast and small circuits for binary addition. We propose a new algorithm with running time $\mathcal O(n \log_2 n)$ for constructing linear-size $n$-bit adder circuits with a significantly better depth guarantee compared to previous approaches: Our circuits have a depth of at most $\log_2 n + \log_2 \log_2 n + \log_2 \log_2 \log_2 n + \text{const}$, improving upon the previously best circuits by [12] with a depth of at most $\log_2 n + 8 \sqrt{\log_2 n} + 6 \log_2 \log_2 n + \text{const}$. Hence, we decrease the gap to the lower bound of $\log_2 n + \log_2 \log_2 n + \text{const}$ by [5] significantly from $\mathcal O (\sqrt{\log_2 n})$ to $\mathcal O(\log_2 \log_2 \log_2 n)$. Our core routine is a new algorithm for the construction of a circuit for a single carry bit, or, more generally, for an And-Or path, i.e., a Boolean function of type $t_0 \lor ( t_1 \land (t_2 \lor ( \dots t_{m-1}) \dots ))$. We compute linear-size And-Or path circuits with a depth of at most $\log_2 m + \log_2 \log_2 m + 0.65$ in time $\mathcal O(m \log_2 m)$. These are the first And-Or path circuits known that, up to an additive constant, match the lower bound by [5] and at the same time have a linear size. The previously fastest And-Or path circuits are only by an additive constant worse in depth, but have a much higher size in the order of $\mathcal O (m \log_2 m)$.
We propose an operator learning approach to accelerate geometric Markov chain Monte Carlo (MCMC) for solving infinite-dimensional Bayesian inverse problems (BIPs). While geometric MCMC employs high-quality proposals that adapt to posterior local geometry, it requires repeated computations of gradients and Hessians of the log-likelihood, which becomes prohibitive when the parameter-to-observable (PtO) map is defined through expensive-to-solve parametric partial differential equations (PDEs). We consider a delayed-acceptance geometric MCMC method driven by a neural operator surrogate of the PtO map, where the proposal exploits fast surrogate predictions of the log-likelihood and, simultaneously, its gradient and Hessian. To achieve a substantial speedup, the surrogate must accurately approximate the PtO map and its Jacobian, which often demands a prohibitively large number of PtO map samples via conventional operator learning methods. In this work, we present an extension of derivative-informed operator learning [O'Leary-Roseberry et al., J. Comput. Phys., 496 (2024)] that uses joint samples of the PtO map and its Jacobian. This leads to derivative-informed neural operator (DINO) surrogates that accurately predict the observables and posterior local geometry at a significantly lower training cost than conventional methods. Cost and error analysis for reduced basis DINO surrogates are provided. Numerical studies demonstrate that DINO-driven MCMC generates effective posterior samples 3--9 times faster than geometric MCMC and 60--97 times faster than prior geometry-based MCMC. Furthermore, the training cost of DINO surrogates breaks even compared to geometric MCMC after just 10--25 effective posterior samples.
We analyze a bilinear optimal control problem for the Stokes--Brinkman equations: the control variable enters the state equations as a coefficient. In two- and three-dimensional Lipschitz domains, we perform a complete continuous analysis that includes the existence of solutions and first- and second-order optimality conditions. We also develop two finite element methods that differ fundamentally in whether the admissible control set is discretized or not. For each of the proposed methods, we perform a convergence analysis and derive a priori error estimates; the latter under the assumption that the domain is convex. Finally, assuming that the domain is Lipschitz, we develop an a posteriori error estimator for each discretization scheme and obtain a global reliability bound.
We study three kinetic Langevin samplers including the Euler discretization, the BU and the UBU splitting scheme. We provide contraction results in $L^1$-Wasserstein distance for non-convex potentials. These results are based on a carefully tailored distance function and an appropriate coupling construction. Additionally, the error in the $L^1$-Wasserstein distance between the true target measure and the invariant measure of the discretization scheme is bounded. To get an $\varepsilon$-accuracy in $L^1$-Wasserstein distance, we show complexity guarantees of order $\mathcal{O}(\sqrt{d}/\varepsilon)$ for the Euler scheme and $\mathcal{O}(d^{1/4}/\sqrt{\varepsilon})$ for the UBU scheme under appropriate regularity assumptions on the target measure. The results are applicable to interacting particle systems and provide bounds for sampling probability measures of mean-field type.
Maximal regularity is a kind of a priori estimates for parabolic-type equations and it plays an important role in the theory of nonlinear differential equations. The aim of this paper is to investigate the temporally discrete counterpart of maximal regularity for the discontinuous Galerkin (DG) time-stepping method. We will establish such an estimate without logarithmic factor over a quasi-uniform temporal mesh. To show the main result, we introduce the temporally regularized Green's function and then reduce the discrete maximal regularity to a weighted error estimate for its DG approximation. Our results would be useful for investigation of DG approximation of nonlinear parabolic problems.
This paper presents a novel approach for pointwise estimation of multivariate density functions on known domains of arbitrary dimensions using nonparametric local polynomial estimators. Our method is highly flexible, as it applies to both simple domains, such as open connected sets, and more complicated domains that are not star-shaped around the point of estimation. This enables us to handle domains with sharp concavities, holes, and local pinches, such as polynomial sectors. Additionally, we introduce a data-driven selection rule based on the general ideas of Goldenshluger and Lepski. Our results demonstrate that the local polynomial estimators are minimax under a $L^2$ risk across a wide range of H\"older-type functional classes. In the adaptive case, we provide oracle inequalities and explicitly determine the convergence rate of our statistical procedure. Simulations on polynomial sectors show that our oracle estimates outperform those of the most popular alternative method, found in the sparr package for the R software. Our statistical procedure is implemented in an online R package which is readily accessible.
Deflation techniques are typically used to shift isolated clusters of small eigenvalues in order to obtain a tighter distribution and a smaller condition number. Such changes induce a positive effect in the convergence behavior of Krylov subspace methods, which are among the most popular iterative solvers for large sparse linear systems. We develop a deflation strategy for symmetric saddle point matrices by taking advantage of their underlying block structure. The vectors used for deflation come from an elliptic singular value decomposition relying on the generalized Golub-Kahan bidiagonalization process. The block targeted by deflation is the off-diagonal one since it features a problematic singular value distribution for certain applications. One example is the Stokes flow in elongated channels, where the off-diagonal block has several small, isolated singular values, depending on the length of the channel. Applying deflation to specific parts of the saddle point system is important when using solvers such as CRAIG, which operates on individual blocks rather than the whole system. The theory is developed by extending the existing framework for deflating square matrices before applying a Krylov subspace method like MINRES. Numerical experiments confirm the merits of our strategy and lead to interesting questions about using approximate vectors for deflation.
The detection of Extreme Mass Ratio Inspirals (EMRIs) is intricate due to their complex waveforms, extended duration, and low signal-to-noise ratio (SNR), making them more challenging to be identified compared to compact binary coalescences. While matched filtering-based techniques are known for their computational demands, existing deep learning-based methods primarily handle time-domain data and are often constrained by data duration and SNR. In addition, most existing work ignores time-delay interferometry (TDI) and applies the long-wavelength approximation in detector response calculations, thus limiting their ability to handle laser frequency noise. In this study, we introduce DECODE, an end-to-end model focusing on EMRI signal detection by sequence modeling in the frequency domain. Centered around a dilated causal convolutional neural network, trained on synthetic data considering TDI-1.5 detector response, DECODE can efficiently process a year's worth of multichannel TDI data with an SNR of around 50. We evaluate our model on 1-year data with accumulated SNR ranging from 50 to 120 and achieve a true positive rate of 96.3% at a false positive rate of 1%, keeping an inference time of less than 0.01 seconds. With the visualization of three showcased EMRI signals for interpretability and generalization, DECODE exhibits strong potential for future space-based gravitational wave data analyses.
When modeling a vector of risk variables, extreme scenarios are often of special interest. The peaks-over-thresholds method hinges on the notion that, asymptotically, the excesses over a vector of high thresholds follow a multivariate generalized Pareto distribution. However, existing literature has primarily concentrated on the setting when all risk variables are always large simultaneously. In reality, this assumption is often not met, especially in high dimensions. In response to this limitation, we study scenarios where distinct groups of risk variables may exhibit joint extremes while others do not. These discernible groups are derived from the angular measure inherent in the corresponding max-stable distribution, whence the term extreme direction. We explore such extreme directions within the framework of multivariate generalized Pareto distributions, with a focus on their probability density functions in relation to an appropriate dominating measure. Furthermore, we provide a stochastic construction that allows any prespecified set of risk groups to constitute the distribution's extreme directions. This construction takes the form of a smoothed max-linear model and accommodates the full spectrum of conceivable max-stable dependence structures. Additionally, we introduce a generic simulation algorithm tailored for multivariate generalized Pareto distributions, offering specific implementations for extensions of the logistic and H\"usler-Reiss families capable of carrying arbitrary extreme directions.
We study stochastic approximation procedures for approximately solving a $d$-dimensional linear fixed point equation based on observing a trajectory of length $n$ from an ergodic Markov chain. We first exhibit a non-asymptotic bound of the order $t_{\mathrm{mix}} \tfrac{d}{n}$ on the squared error of the last iterate of a standard scheme, where $t_{\mathrm{mix}}$ is a mixing time. We then prove a non-asymptotic instance-dependent bound on a suitably averaged sequence of iterates, with a leading term that matches the local asymptotic minimax limit, including sharp dependence on the parameters $(d, t_{\mathrm{mix}})$ in the higher order terms. We complement these upper bounds with a non-asymptotic minimax lower bound that establishes the instance-optimality of the averaged SA estimator. We derive corollaries of these results for policy evaluation with Markov noise -- covering the TD($\lambda$) family of algorithms for all $\lambda \in [0, 1)$ -- and linear autoregressive models. Our instance-dependent characterizations open the door to the design of fine-grained model selection procedures for hyperparameter tuning (e.g., choosing the value of $\lambda$ when running the TD($\lambda$) algorithm).