In $d$ dimensions, approximating an arbitrary function oscillating with frequency $\lesssim k$ requires $\sim k^d$ degrees of freedom. A numerical method for solving the Helmholtz equation (with wavenumber $k$) suffers from the pollution effect if, as $k\to \infty$, the total number of degrees of freedom needed to maintain accuracy grows faster than this natural threshold. While the $h$-version of the finite element method (FEM) (where accuracy is increased by decreasing the meshwidth $h$ and keeping the polynomial degree $p$ fixed) suffers from the pollution effect, the celebrated papers [Melenk, Sauter 2010], [Melenk, Sauter 2011], [Esterhazy, Melenk 2012], and [Melenk, Parsania, Sauter 2013] showed that the $hp$-FEM (where accuracy is increased by decreasing the meshwidth $h$ and increasing the polynomial degree $p$) applied to a variety of constant-coefficient Helmholtz problems does not suffer from the pollution effect. The heart of the proofs of these results is a PDE result splitting the solution of the Helmholtz equation into "high" and "low" frequency components. In this expository paper we prove this splitting for the constant-coefficient Helmholtz equation in full space (i.e., in $\mathbb{R}^d$) using only integration by parts and elementary properties of the Fourier transform; this is in contrast to the proof for this set-up in [Melenk, Sauter 2010] which uses somewhat-involved bounds on Bessel and Hankel functions. The proof in this paper is motivated by the recent proof in [Lafontaine, Spence, Wunsch 2022] of this splitting for the variable-coefficient Helmholtz equation in full space; indeed, the proof in [Lafontaine, Spence, Wunsch 2022] uses more-sophisticated tools that reduce to the elementary ones above for constant coefficients.
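To fix ideas, here is a minimal sketch of such a splitting via a Fourier frequency cutoff (our notation: $\lambda>1$ is a free cutoff parameter and $\chi$ is, in the simplest case, the indicator function of $[0,1]$; the paper makes the choice of cutoff and the resulting $k$-explicit bounds precise). Writing $\widehat{u}$ for the Fourier transform of the solution $u$,
\[
u = u_{\mathrm{low}} + u_{\mathrm{high}}, \qquad
\widehat{u_{\mathrm{low}}}(\xi) := \chi\!\left(\frac{|\xi|}{\lambda k}\right)\widehat{u}(\xi), \qquad
\widehat{u_{\mathrm{high}}}(\xi) := \left(1-\chi\!\left(\frac{|\xi|}{\lambda k}\right)\right)\widehat{u}(\xi),
\]
so that $u_{\mathrm{low}}$ has Fourier support in $\{|\xi|\le \lambda k\}$ (and hence is entire, with derivative bounds that are explicit in $k$), while $u_{\mathrm{high}}$ carries only frequencies $\gtrsim k$.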
A conceptually appealing approach for learning Extensive-Form Games (EFGs) is to convert them to Normal-Form Games (NFGs). This approach enables us to directly translate state-of-the-art techniques and analyses in NFGs to learning EFGs, but typically suffers from computational intractability due to the exponential blow-up of the game size introduced by the conversion. In this paper, we address this problem in natural and important setups for the \emph{$\Phi$-Hedge} algorithm -- a generic algorithm capable of learning a large class of equilibria for NFGs. We show that $\Phi$-Hedge can be directly used to learn Nash Equilibria (zero-sum settings), Normal-Form Coarse Correlated Equilibria (NFCCE), and Extensive-Form Correlated Equilibria (EFCE) in EFGs. We prove that, in those settings, the \emph{$\Phi$-Hedge} algorithms are equivalent to standard Online Mirror Descent (OMD) algorithms for EFGs with suitable dilated regularizers, and run in polynomial time. This new connection further allows us to design and analyze a new class of OMD algorithms based on modifying the log-partition function. In particular, we design an improved algorithm with balancing techniques that achieves a sharp $\widetilde{\mathcal{O}}(\sqrt{XAT})$ EFCE-regret under bandit feedback in an EFG with $X$ information sets, $A$ actions, and $T$ episodes. To the best of our knowledge, this is the first such rate, and it matches the information-theoretic lower bound.
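For background, the following is a minimal sketch of the vanilla Hedge (exponential-weights) update over a finite strategy set -- the NFG primitive that $\Phi$-Hedge generalizes; the $\Phi$-transformation sets, dilated regularizers, and bandit estimators from the paper are not shown, and the function name is ours.

```python
import numpy as np

def hedge_update(weights, loss, eta):
    """One Hedge (exponential-weights) step over a finite strategy set.

    weights: current probability vector over strategies
    loss: loss vector observed for each strategy this round
    eta: learning rate
    """
    w = weights * np.exp(-eta * loss)  # multiplicative-weights update
    return w / w.sum()                 # renormalize to a distribution
```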
In this paper, we analyze the numerical approximation of the Navier-Stokes problem over a bounded polygonal domain in $\mathbb{R}^2$, where the initial condition is modeled by a log-normal random field. This problem typically arises in the area of uncertainty quantification. We aim to compute the expected value of linear functionals of the solution to the Navier-Stokes equations and perform a rigorous error analysis for the problem. In particular, our method combines a finite element discretization, a fully discrete time-stepping scheme, a truncated Karhunen-Lo\'eve expansion for the realizations of the initial condition, and a lattice-based quasi-Monte Carlo (QMC) method to estimate the expected values over the parameter space. Our QMC analysis is based on randomly shifted lattice rules for the integration over the high-dimensional parameter domain, which guarantees that the error decays with $\mathcal{O}(N^{-1+\delta})$, where $N$ is the number of sampling points, $\delta>0$ is an arbitrarily small number, and the constant in the decay estimate is independent of the dimension of integration.
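For concreteness, here is a minimal sketch of the randomly shifted rank-1 lattice points underlying such a QMC rule (the generating vector $z$ would in practice be chosen by a component-by-component construction tailored to the weighted function space, which is not shown; names are ours).

```python
import numpy as np

def shifted_lattice_points(z, N, rng):
    """Points of a rank-1 lattice rule with one uniform random shift.

    z: integer generating vector, shape (s,)
    N: number of lattice points
    rng: numpy random Generator supplying the shift
    """
    shift = rng.random(len(z))             # one uniform shift in [0, 1)^s
    i = np.arange(N).reshape(-1, 1)
    return np.mod(i * z / N + shift, 1.0)  # N points in [0, 1)^s

# Averaging the integrand over these points, and over several
# independent shifts, gives the randomized QMC estimator and a
# practical error estimate.
```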
Projection-based model order reduction allows for the parsimonious representation of full order models (FOMs), typically obtained through the discretization of certain partial differential equations (PDEs) using conventional techniques where the discretization may contain a very large number of degrees of freedom. As a result of this more compact representation, the resulting projection-based reduced order models (ROMs) can achieve considerable computational speedups, which are especially useful in real-time or multi-query analyses. One known deficiency of projection-based ROMs is that they can suffer from a lack of robustness, stability and accuracy, especially in the predictive regime, which ultimately limits their useful application. Another research gap that has prevented the widespread adoption of ROMs within the modeling and simulation community is the lack of theoretical and algorithmic foundations necessary for the "plug-and-play" integration of these models into existing multi-scale and multi-physics frameworks. This paper describes a new methodology that has the potential to address both of the aforementioned deficiencies by coupling projection-based ROMs with each other as well as with conventional FOMs by means of the Schwarz alternating method. Leveraging recent work that adapted the Schwarz alternating method to enable consistent and concurrent multi-scale coupling of finite element FOMs in solid mechanics, we present a new extension of the Schwarz formulation that enables ROM-FOM and ROM-ROM coupling in nonlinear solid mechanics. In order to maintain efficiency, we employ hyper-reduction via the Energy-Conserving Sampling and Weighting approach. We evaluate the proposed coupling approach in the reproductive as well as in the predictive regime on a canonical test case that involves the dynamic propagation of a traveling wave in a nonlinear hyper-elastic material.
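As a toy illustration of the alternating Schwarz iteration that underlies the proposed coupling, the following sketch couples two overlapping finite-difference solves of a 1D Poisson problem (FOM-FOM and linear; in the paper the subdomain solvers are nonlinear solid-mechanics FOMs or hyper-reduced ROMs, and all names here are ours).

```python
import numpy as np

def solve_poisson(f, left, right, x):
    """Solve -u'' = f on the grid x with Dirichlet values left, right."""
    n, h = len(x), x[1] - x[0]
    A = (np.diag(2.0 * np.ones(n - 2))
         - np.diag(np.ones(n - 3), 1)
         - np.diag(np.ones(n - 3), -1)) / h**2
    rhs = f(x[1:-1]).copy()
    rhs[0] += left / h**2
    rhs[-1] += right / h**2
    u = np.empty(n)
    u[0], u[-1] = left, right
    u[1:-1] = np.linalg.solve(A, rhs)
    return u

# Two overlapping subdomains of [0, 1]; each Schwarz iteration solves one
# subdomain with Dirichlet data sampled from the latest iterate of the other.
N = 101
x = np.linspace(0.0, 1.0, N)
i1, i2 = 60, 40                    # the overlap region is x[40]..x[60]
f = lambda s: np.ones_like(s)      # -u'' = 1 with u(0) = u(1) = 0
u = np.zeros(N)
for _ in range(20):                # alternating Schwarz iterations
    u[:i1 + 1] = solve_poisson(f, 0.0, u[i1], x[:i1 + 1])
    u[i2:] = solve_poisson(f, u[i2], 0.0, x[i2:])
```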
Comparing the functional behavior of neural network models, whether it is a single network over time or two (or more) networks during or post-training, is an essential step in understanding what they are learning (and what they are not), and for identifying strategies for regularization or efficiency improvements. Despite recent progress, e.g., comparing vision transformers to CNNs, systematic comparison of function, especially across different networks, remains difficult and is often carried out layer by layer. Approaches such as canonical correlation analysis (CCA) are applicable in principle, but have been used sparingly so far. In this paper, we revisit a (less widely known) measure from statistics, called distance correlation (and its partial variant), designed to evaluate correlation between feature spaces of different dimensions. We describe the steps necessary to deploy it for large-scale models -- this opens the door to a surprising array of applications, ranging from conditioning one deep model w.r.t. another and learning disentangled representations, to optimizing diverse models that would directly be more robust to adversarial attacks. Our experiments suggest a versatile regularizer (or constraint) with many advantages, which avoids some of the common difficulties one faces in such analyses. Code is at //github.com/zhenxingjian/Partial_Distance_Correlation.
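For reference, a minimal NumPy/SciPy sketch of the (biased) sample distance correlation between two feature matrices of possibly different dimension; the partial variant and the batching needed for large-scale models, as described in the paper, are not shown.

```python
import numpy as np
from scipy.spatial.distance import pdist, squareform

def distance_correlation(X, Y):
    """Sample distance correlation between X (n, p) and Y (n, q)."""
    def centered(D):
        # Double-center the pairwise distance matrix.
        return D - D.mean(0) - D.mean(1)[:, None] + D.mean()
    A = centered(squareform(pdist(X)))
    B = centered(squareform(pdist(Y)))
    dcov2_xy = (A * B).mean()          # squared distance covariance
    dcov2_xx = (A * A).mean()
    dcov2_yy = (B * B).mean()
    return np.sqrt(dcov2_xy / np.sqrt(dcov2_xx * dcov2_yy))
```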
Quantized constant envelope (QCE) precoding, a transmission scheme in which only discrete QCE transmit signals are allowed at each antenna, has attracted growing research interest due to its ability to reduce the hardware cost and the energy consumption of massive multiple-input multiple-output (MIMO) systems. However, the discrete nature of QCE transmit signals greatly complicates the precoding design. In this paper, we consider the QCE precoding problem for a massive MIMO system with phase shift keying (PSK) modulation and develop an efficient approach for solving the constructive interference (CI) based problem formulation. Our approach is based on a custom-designed (continuous) penalty model that is equivalent to the original discrete problem. Specifically, the penalty model relaxes the discrete QCE constraint and penalizes it in the objective with a negative $\ell_2$-norm term, which leads to a non-smooth non-convex optimization problem. To tackle it, we resort to our recently proposed alternating optimization (AO) algorithm. We show that the AO algorithm admits closed-form updates at each iteration when applied to our problem and thus can be efficiently implemented. Simulation results demonstrate the superiority of the proposed approach over the existing algorithms.
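Schematically, and in our own notation, the penalty idea can be sketched as
\[
\min_{x \in \operatorname{conv}(\mathcal{X})} \; f(x) \;-\; \rho \,\|x\|_2^2,
\]
where $\mathcal{X}$ is the discrete set of QCE transmit vectors, $\operatorname{conv}(\mathcal{X})$ its convex hull, $f$ the CI-based objective, and $\rho>0$ a penalty parameter: the negative norm term favors the extreme points of the relaxed set, which are exactly the discrete QCE signals. The paper's precise penalty term and the condition on $\rho$ that yields exact equivalence are as derived there.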
We present a highly scalable strategy for developing mesh-free neuro-symbolic partial differential equation solvers from existing numerical discretizations found in scientific computing. This strategy is unique in that it can be used to efficiently train neural network surrogate models for the solution functions and the differential operators, while retaining the accuracy and convergence properties of state-of-the-art numerical solvers. This neural bootstrapping method is based on minimizing residuals of discretized differential systems on a set of random collocation points with respect to the trainable parameters of the neural network, achieving unprecedented resolution and optimal scaling for solving physical and biological systems.
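A minimal sketch of this residual-minimization idea on a 1D Poisson toy problem, using a central finite-difference stencil as the underlying numerical discretization (the example, names, and hyperparameters are ours; the paper targets far more general operators and achieves its scaling with problem-specific discretizations).

```python
import math
import torch

# Fit a network u_theta so that a finite-difference discretization of
# u'' = f holds at randomly drawn collocation points.
net = torch.nn.Sequential(
    torch.nn.Linear(1, 64), torch.nn.Tanh(),
    torch.nn.Linear(64, 64), torch.nn.Tanh(),
    torch.nn.Linear(64, 1),
)
opt = torch.optim.Adam(net.parameters(), lr=1e-3)
h = 1e-2                             # stencil width of the discretization
f = lambda x: -torch.sin(x)          # so u(x) = sin(x) solves u'' = f

for step in range(2000):
    x = 2 * math.pi * torch.rand(256, 1)  # random collocation points
    residual = (net(x + h) - 2 * net(x) + net(x - h)) / h**2 - f(x)
    boundary = (net(torch.zeros(1, 1)) ** 2
                + net(torch.full((1, 1), 2 * math.pi)) ** 2).mean()
    loss = residual.pow(2).mean() + boundary
    opt.zero_grad(); loss.backward(); opt.step()
```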
We consider the classic facility location problem in fully dynamic data streams, where elements can be both inserted and deleted. In this problem, one is interested in maintaining a stable and high-quality solution throughout the data stream while using only a small amount of time per update (insertion or deletion). We study the problem and provide the first algorithm that simultaneously maintains a constant approximation and incurs polylogarithmic amortized recourse per update. We complement our theoretical results with an experimental analysis showing the practical efficiency of our method.
The common methods of spectral analysis for multivariate ($n$-dimensional) time series, like the discrete Fourier transform (FT) or the wavelet transform, are based on Fourier series to decompose discrete data into a set of trigonometric model components, e.g., amplitude and phase. Applied to discrete data on a finite interval, the (time-discrete) FT exhibits several limitations caused by the orthogonality mismatch of the trigonometric basis functions on that interval. Moreover, in the general situation of non-equidistant or fragmented sampling, FT-based methods will cause significant errors in the parameter estimation. The classical Lomb-Scargle method (LSM), which is not based on Fourier series, was therefore developed as a statistical tool for one-dimensional data to circumvent the inconsistent and erroneous parameter estimation of the FT. The present work extends the LSM to $n$-dimensional data sets by a redefinition of the shift parameter $\tau$, so as to maintain orthogonality of the trigonometric basis. An analytical derivation shows that the $n$-D LSM extends the traditional 1D case, preserving all of its statistical benefits, such as improved noise rejection. We also derive the parameter confidence intervals for the LSM and compare them with those of the FT. Applications to ideal test data and experimental data illustrate and support the proposed method.
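For reference, a minimal NumPy sketch of the classical 1D Lomb-Scargle periodogram, with the shift $\tau$ that the present work redefines for $n$ dimensions (normalization conventions vary across the literature).

```python
import numpy as np

def lomb_scargle(t, y, omegas):
    """Classical 1D Lomb-Scargle periodogram at angular frequencies omegas."""
    y = y - y.mean()
    P = np.empty(len(omegas))
    for k, w in enumerate(omegas):
        # The shift tau restores orthogonality of sin/cos on the sample times.
        tau = np.arctan2(np.sum(np.sin(2 * w * t)),
                         np.sum(np.cos(2 * w * t))) / (2 * w)
        c = np.cos(w * (t - tau))
        s = np.sin(w * (t - tau))
        P[k] = 0.5 * ((y @ c) ** 2 / (c @ c) + (y @ s) ** 2 / (s @ s))
    return P
```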
In this paper, we consider recent progress in estimating the average treatment effect when extreme inverse probability weights are present and focus on methods that account for a possible violation of the positivity assumption. These methods aim at estimating the treatment effect on the subpopulation of patients for whom there is clinical equipoise. We propose a systematic approach to determine their related causal estimands and develop new insights into the properties of the weights targeting such a subpopulation. We then examine the roles of overlap weights, matching weights, Shannon's entropy weights, and beta weights. This helps us characterize and compare their underlying estimators, analytically and via simulations, in terms of accuracy, precision, and root mean squared error. Moreover, we study the asymptotic behaviors of their augmented estimators (which mimic doubly robust estimators), which lead to improved estimation when either the propensity or the regression model is correctly specified. Based on the analytical and simulation results, we conclude that, overall, overlap weights are preferable to matching weights, especially when there are moderate or extreme violations of the positivity assumption. Finally, we illustrate the methods using a real data example marked by extreme inverse probability weights.
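These weights share a common "tilting function" form; the following is a minimal sketch, assuming estimated propensity scores e in (0, 1) and a binary treatment indicator z (the beta-weight family takes two additional shape parameters and is omitted; names are ours).

```python
import numpy as np

def balancing_weights(e, z, kind="overlap"):
    """Weights targeting the subpopulation with clinical equipoise.

    Each weight is a tilting function h(e) divided by the probability
    of the treatment actually received.
    """
    h = {
        "overlap": e * (1 - e),
        "matching": np.minimum(e, 1 - e),
        "entropy": -(e * np.log(e) + (1 - e) * np.log(1 - e)),
    }[kind]
    return h / np.where(z == 1, e, 1 - e)
```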
As soon as abstract mathematical computations were adapted to computation on digital computers, the problem of efficient representation, manipulation, and communication of the numerical values in those computations arose. Strongly related to the problem of numerical representation is the problem of quantization: in what manner should a set of continuous real-valued numbers be distributed over a fixed discrete set of numbers to minimize the number of bits required and also to maximize the accuracy of the attendant computations? This perennial problem of quantization is particularly relevant whenever memory and/or computational resources are severely restricted, and it has come to the forefront in recent years due to the remarkable performance of Neural Network models in computer vision, natural language processing, and related areas. Moving from floating-point representations to low-precision fixed integer values represented in four bits or less holds the potential to reduce the memory footprint and latency by a factor of 16x; and, in fact, reductions of 4x to 8x are often realized in practice in these applications. Thus, it is not surprising that quantization has emerged recently as an important and very active sub-area of research in the efficient implementation of computations associated with Neural Networks. In this article, we survey approaches to the problem of quantizing the numerical values in deep Neural Network computations, covering the advantages/disadvantages of current methods. With this survey and its organization, we hope to have presented a useful snapshot of the current research in quantization for Neural Networks and to have given an intelligent organization to ease the evaluation of future research in this area.
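As a concrete primitive behind many of the surveyed methods, here is a minimal sketch of uniform affine (asymmetric, per-tensor) quantization; function names are ours.

```python
import numpy as np

def quantize_uniform(x, bits=4):
    """Uniformly quantize a float array to `bits`-bit integers."""
    qmax = 2 ** bits - 1
    scale = (x.max() - x.min()) / qmax       # real-valued step of the grid
    zero_point = np.round(-x.min() / scale)  # integer representing 0.0
    q = np.clip(np.round(x / scale) + zero_point, 0, qmax)
    return q.astype(np.int32), scale, zero_point

def dequantize(q, scale, zero_point):
    """Map quantized integers back to approximate real values."""
    return scale * (q.astype(np.float32) - zero_point)
```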