Lightning is a destructive and highly visible product of severe storms, yet there is still much to be learned about the conditions under which lightning is most likely to occur. The GOES-16 and GOES-17 satellites, launched in 2016 and 2018 by NOAA and NASA, collect a wealth of data regarding individual lightning strike occurrence and potentially related atmospheric variables. The acute nature of lightning data and their inherent spatial correlation render standard regression analyses inappropriate. Further, computational considerations are foregrounded by the desire to analyze the immense and rapidly increasing volume of lightning data. We present a new computationally feasible method that combines spectral and Laplace approximations in an EM algorithm, denoted SLEM, to fit the widely popular log-Gaussian Cox process model to large spatial point pattern datasets. In simulations, we find SLEM is competitive with contemporary techniques in terms of speed and accuracy. When applied to two lightning datasets, SLEM provides better out-of-sample prediction scores and quicker runtimes, suggesting its particular usefulness for analyzing lightning data, which tend to have sparse signals.
Determining the proper level of detail at which to develop and solve physical models is usually difficult when one encounters new engineering problems. The difficulty lies in balancing the time (simulation cost) and the accuracy of the subsequent physical model simulation. We propose a framework for automatic development of a family of surrogate models of physical systems that provide flexible cost-accuracy tradeoffs to assist in making such determinations. We present both a model-based and a data-driven strategy to generate surrogate models. The former starts from a high-fidelity model generated from first principles and applies a bottom-up model order reduction (MOR) that preserves stability and convergence while providing a priori error bounds, although the resulting reduced-order model may lose its interpretability. The latter generates interpretable surrogate models by fitting artificial constitutive relations to a presupposed topological structure using experimental or simulation data. For the latter, we use Tonti diagrams to systematically produce differential equations from the assumed topological structure using algebraic topological semantics that are common to various lumped-parameter models (LPM). The parameters of the constitutive relations are estimated using standard system identification algorithms. Our framework is compatible with various spatial discretization schemes for distributed parameter models (DPM), and can support solving engineering problems in different domains of physics.
It is common practice to use Laplace approximations to compute marginal likelihoods in Bayesian versions of generalised linear models (GLM). Marginal likelihoods combined with model priors are then used in different search algorithms to compute the posterior marginal probabilities of models and individual covariates. This allows performing Bayesian model selection and model averaging. For large sample sizes, even the Laplace approximation becomes computationally challenging because the optimisation routine involved needs to evaluate the likelihood on the full set of data in multiple iterations. As a consequence, the algorithm is not scalable for large datasets. To address this problem, we suggest using a version of a popular batch stochastic gradient descent (BSGD) algorithm for estimating the marginal likelihood of a GLM by subsampling from the data. We further combine the algorithm with Markov chain Monte Carlo (MCMC) based methods for Bayesian model selection and provide some theoretical results on the convergence of the estimates. Finally, we report results from experiments illustrating the performance of the proposed algorithm.
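As a point of reference for the Laplace approximation mentioned above, the following is a minimal sketch for a one-coefficient Bayesian logistic regression. It uses full-data Newton iterations to find the posterior mode, not the subsampling scheme the abstract proposes. The Gaussian prior, its variance, and all function names (`simulate`, `log_joint`, `laplace_log_marginal`) are assumptions made for illustration only.

```python
import math
import random


def simulate(n, beta, seed=0):
    """Draw n (x, y) pairs from a logistic model with a single coefficient."""
    rng = random.Random(seed)
    xs = [rng.gauss(0.0, 1.0) for _ in range(n)]
    ys = [1 if rng.random() < 1.0 / (1.0 + math.exp(-beta * x)) else 0 for x in xs]
    return xs, ys


def log_joint(beta, xs, ys, prior_var=10.0):
    """log p(y | beta) + log p(beta), with an assumed N(0, prior_var) prior."""
    ll = 0.0
    for x, y in zip(xs, ys):
        z = beta * x
        # Bernoulli log-likelihood y*z - log(1 + e^z), written stably.
        ll += y * z - (max(z, 0.0) + math.log1p(math.exp(-abs(z))))
    log_prior = -0.5 * math.log(2.0 * math.pi * prior_var) - 0.5 * beta ** 2 / prior_var
    return ll + log_prior


def grad_and_neg_hess(beta, xs, ys, prior_var):
    """Gradient and negative second derivative of log_joint at beta."""
    grad = -beta / prior_var
    neg_hess = 1.0 / prior_var
    for x, y in zip(xs, ys):
        p = 1.0 / (1.0 + math.exp(-beta * x))
        grad += (y - p) * x
        neg_hess += p * (1.0 - p) * x * x
    return grad, neg_hess


def laplace_log_marginal(xs, ys, prior_var=10.0, iters=50):
    """Return (posterior mode, Laplace estimate of the log marginal likelihood)."""
    beta = 0.0
    for _ in range(iters):  # Newton steps on the full data set
        grad, neg_hess = grad_and_neg_hess(beta, xs, ys, prior_var)
        beta += grad / neg_hess
    _, neg_hess = grad_and_neg_hess(beta, xs, ys, prior_var)
    # One-dimensional Laplace formula: log p(y) ~ log_joint(mode) + 0.5*log(2*pi / neg_hess).
    log_ml = log_joint(beta, xs, ys, prior_var) + 0.5 * math.log(2.0 * math.pi / neg_hess)
    return beta, log_ml
```

Every Newton step here touches all observations, which is the cost that grows with the sample size; a subsampling approach would replace `grad_and_neg_hess` with estimates computed on random batches.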
With high-frequency data streams being generated continuously, real-time learning is becoming increasingly important. These data streams should be processed sequentially, allowing for the possibility that the stream changes over time. In this streaming setting, we propose techniques for minimizing convex objectives through unbiased estimates of their gradients, commonly referred to as stochastic approximation problems. Our methods rely on stochastic approximation algorithms because of their applicability and computational advantages. The analysis includes iterate averaging, which guarantees optimal statistical efficiency under classical conditions. Our non-asymptotic analysis shows accelerated convergence when the learning rate is selected according to the expected data streams. We show that the average estimate converges optimally and robustly for any data stream rate. In addition, noise reduction can be achieved by processing the data in a specific pattern, which is advantageous for large-scale machine learning problems. These theoretical results are illustrated for various data streams, showing the effectiveness of the proposed algorithms.
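The iterate averaging mentioned above can be illustrated with a minimal sketch of stochastic gradient descent with a running (Polyak-Ruppert style) average on a streaming least-squares problem. The objective, the step-size schedule `lr0 * t ** (-decay)`, and the names `averaged_sgd` and `make_stream` are assumptions for illustration, not the schedules or algorithms analyzed in the abstract.

```python
import random


def make_stream(true_theta, n, noise=0.1, seed=0):
    """Yield n observations (x, y) with y = <x, true_theta> + Gaussian noise."""
    rng = random.Random(seed)
    for _ in range(n):
        x = [rng.gauss(0.0, 1.0) for _ in true_theta]
        y = sum(xi * ti for xi, ti in zip(x, true_theta)) + rng.gauss(0.0, noise)
        yield x, y


def averaged_sgd(stream, dim, lr0=0.2, decay=0.5):
    """Process the stream one observation at a time.

    Returns (last iterate, running average of all iterates).
    """
    theta = [0.0] * dim
    theta_bar = [0.0] * dim
    for t, (x, y) in enumerate(stream, start=1):
        # Unbiased gradient estimate of 0.5 * (<x, theta> - y)^2 from one observation.
        residual = sum(xi * ti for xi, ti in zip(x, theta)) - y
        step = lr0 * t ** (-decay)
        theta = [ti - step * residual * xi for ti, xi in zip(theta, x)]
        # Incremental update of the average, so no past iterates are stored.
        theta_bar = [bi + (ti - bi) / t for bi, ti in zip(theta_bar, theta)]
    return theta, theta_bar
```

For example, `averaged_sgd(make_stream([1.0, -2.0], 5000), dim=2)` returns both the last iterate and its running average; the average is typically the less noisy of the two.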
We propose a method for quantifying uncertainty in high-dimensional PDE systems with random parameters, where the number of solution evaluations is small. Parametric PDE solutions are often approximated using a spectral decomposition based on polynomial chaos expansions. For the class of systems we consider (i.e., high dimensional with limited solution evaluations), the coefficients are given by an underdetermined linear system in a regression formulation. This implies that additional assumptions, such as sparsity of the coefficient vector, are needed to approximate the solution. Here, we present an approach where we assume the coefficients are close to the range of a generative model that maps from a low- to a high-dimensional space of coefficients. Our approach is inspired by recent work examining how generative models can be used for compressed sensing in systems with random Gaussian measurement matrices. Using results from PDE theory on coefficient decay rates, we construct an explicit generative model that predicts the polynomial chaos coefficient magnitudes. The algorithm we developed to find the coefficients, which we call GenMod, is composed of two main steps. First, we predict the coefficient signs using Orthogonal Matching Pursuit. Then, we assume the coefficients are within a sparse deviation from the range of a sign-adjusted generative model. This allows us to find the coefficients by solving a nonconvex optimization problem over the input space of the generative model and the space of sparse vectors. We obtain theoretical recovery results for a Lipschitz continuous generative model and for a more specific generative model based on coefficient decay rate bounds. We examine three high-dimensional problems and show that, for all three examples, the generative model approach outperforms sparsity-promoting methods at small sample sizes.
Online markets are a part of everyday life, and their rules are governed by algorithms. Assuming participants are inherently self-interested, well-designed rules can help to increase social welfare. Many algorithms for online markets are based on prices: the seller is responsible for posting prices, while buyers make the purchases that are most profitable given the posted prices. To make adjustments to the market, the seller is allowed to update prices at certain timepoints. Posted prices are an intuitive way to design a market. Despite the fact that each buyer acts selfishly, the seller's goal is often assumed to be that of social welfare maximization. Berger, Eden and Feldman recently considered the case of a market with only three buyers where each buyer has a fixed number of goods to buy and the profit of a bought bundle of items is the sum of profits of the items in the bundle. For such markets, Berger et al. showed that the seller can maximize social welfare by dynamically updating posted prices before the arrival of each buyer. B\'{e}rczi, B\'{e}rczi-Kov\'{a}cs and Sz\"{o}gi showed that the social welfare can also be maximized when each buyer is ready to buy at most two items. We study the power of posted prices with dynamic updates in more general cases. First, we show that the result of Berger et al. can be generalized from three to four buyers. Then we show that the result of B\'{e}rczi, B\'{e}rczi-Kov\'{a}cs and Sz\"{o}gi can be generalized to the case when each buyer is ready to buy up to three items. We also show that dynamic pricing is possible whenever there are at most two allocations maximizing social welfare.
We study approximation of probability measures supported on $n$-dimensional manifolds embedded in $\mathbb{R}^m$ by injective flows -- neural networks composed of invertible flows and injective layers. We show that in general, injective flows between $\mathbb{R}^n$ and $\mathbb{R}^m$ universally approximate measures supported on images of extendable embeddings, which are a subset of standard embeddings: when the embedding dimension $m$ is small, topological obstructions may preclude certain manifolds as admissible targets. When the embedding dimension is sufficiently large, $m \geq 3n+1$, we use an argument from algebraic topology known as the clean trick to prove that the topological obstructions vanish and injective flows universally approximate any differentiable embedding. Along the way we show that the studied injective flows admit efficient projections on the range, and that their optimality can be established "in reverse," resolving a conjecture made in Brehmer and Cranmer (2020).
We propose a new hybrid topology optimization algorithm based on a multigrid approach that combines CPU parallelization using OpenMP with the heavy multithreading capabilities of modern Graphics Processing Units (GPUs). In addition, a significant improvement in memory efficiency has been achieved using a homogenization strategy. The algorithm has been integrated with the versatile computing platform MATLAB for ease of use and customization. The bottleneck, the repeated solution of the state equation, is handled using an optimized geometric multigrid approach along with CUDA parallelization, yielding computational times an order of magnitude faster than current state-of-the-art implementations. On-the-fly computation of auxiliary matrices in the multigrid scheme and modification of the interpolation schemes using the homogenization strategy remove the memory limitation of GPUs. The memory hierarchy of the GPU has also been exploited for further optimized implementations. All these enable the solution of structures involving hundreds of millions of three-dimensional brick elements on a standard desktop computer or workstation. The performance of the proposed algorithm is illustrated using several examples, including design-dependent loads and multiple materials. The results obtained indicate the excellent performance and scalability of the proposed approach.
In order to avoid the curse of dimensionality, frequently encountered in Big Data analysis, there has been vast development in the field of linear and nonlinear dimension reduction techniques in recent years. These techniques (sometimes referred to as manifold learning) assume that the scattered input data lie on a lower dimensional manifold; thus the high dimensionality problem can be overcome by learning the lower dimensional behavior. However, in real-life applications, data are often very noisy. In this work, we propose a method to approximate $\mathcal{M}$, a $d$-dimensional $C^{m+1}$ smooth submanifold of $\mathbb{R}^n$ ($d \ll n$), based upon noisy scattered data points (i.e., a data cloud). We assume that the data points are located "near" the lower dimensional manifold and suggest a non-linear moving least-squares projection on an approximating $d$-dimensional manifold. Under some mild assumptions, the resulting approximant is shown to be infinitely smooth and of high approximation order (i.e., $O(h^{m+1})$, where $h$ is the fill distance and $m$ is the degree of the local polynomial approximation). The method presented here assumes no analytic knowledge of the approximated manifold, and the approximation algorithm is linear in the large dimension $n$. Furthermore, the approximating manifold can serve as a framework to perform operations directly on the high dimensional data in a computationally efficient manner. This way, the preparatory step of dimension reduction, which induces distortions to the data, can be avoided altogether.
Neural waveform models such as the WaveNet are used in many recent text-to-speech systems, but the original WaveNet is quite slow in waveform generation because of its autoregressive (AR) structure. Although faster non-AR models were recently reported, they may be prohibitively complicated due to the use of a distilling training method and the blend of other disparate training criteria. This study proposes a non-AR neural source-filter waveform model that can be directly trained using spectrum-based training criteria and the stochastic gradient descent method. Given the input acoustic features, the proposed model first uses a source module to generate a sine-based excitation signal and then uses a filter module to transform the excitation signal into the output speech waveform. Our experiments demonstrated that the proposed model generated waveforms at least 100 times faster than the AR WaveNet and the quality of its synthetic speech is close to that of speech generated by the AR WaveNet. Ablation test results showed that both the sine-wave excitation signal and the spectrum-based training criteria were essential to the performance of the proposed model.
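To make the idea of a sine-based excitation signal concrete, the following is a simplified sketch of how a source module might turn a per-sample fundamental frequency (F0) track into an excitation waveform: a sine with accumulated phase in voiced regions and low-level noise in unvoiced regions. The function name `sine_excitation`, the convention that `f0 <= 0` marks unvoiced samples, and the amplitude and noise constants are assumptions for illustration, not values taken from the abstract.

```python
import math
import random


def sine_excitation(f0, sample_rate=16000, amplitude=0.1, noise_std=0.003, seed=0):
    """Generate one excitation sample per entry of f0 (in Hz, one value per sample).

    Voiced samples (f0 > 0): sine wave whose phase is the running sum of
    2*pi*f0/sample_rate, plus a little Gaussian noise.
    Unvoiced samples (f0 <= 0): Gaussian noise only.
    """
    rng = random.Random(seed)
    phase = 0.0
    out = []
    for f in f0:
        noise = rng.gauss(0.0, noise_std) if noise_std > 0 else 0.0
        if f > 0:
            # Accumulating phase keeps the sine continuous when f0 changes.
            phase += 2.0 * math.pi * f / sample_rate
            out.append(amplitude * math.sin(phase) + noise)
        else:
            out.append(noise)
    return out
```

In a source-filter model of the kind described above, a signal like this would then be passed to the filter module, which is not sketched here.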
We study the problem of learning a latent variable model from a stream of data. Latent variable models are popular in practice because they can explain observed data in terms of unobserved concepts. These models have been traditionally studied in the offline setting. The online EM is arguably the most popular algorithm for learning latent variable models online. Although it is computationally efficient, it typically converges to a local optimum. In this work, we develop a new online learning algorithm for latent variable models, which we call SpectralLeader. SpectralLeader always converges to the global optimum, and we derive an $O(\sqrt{n})$ upper bound up to log factors on its $n$-step regret in the bag-of-words model. We show that SpectralLeader performs similarly to or better than the online EM with tuned hyper-parameters, in both synthetic and real-world experiments.