We introduce a novel approach to waveform inversion, based on a data driven reduced order model (ROM) of the wave operator. The presentation is for the acoustic wave equation, but the approach can be extended to elastic or electromagnetic waves. The data are time resolved measurements of the pressure wave gathered by an acquisition system which probes the unknown medium with pulses and measures the generated waves. We propose to solve the inverse problem of velocity estimation by minimizing the square misfit between the ROM computed from the recorded data and the ROM computed from the modeled data, at the current guess of the velocity. We give the step by step computation of the ROM, which depends nonlinearly on the data and yet can be obtained from them in a non-iterative fashion, using efficient methods from linear algebra. We also explain how to make the ROM robust to data inaccuracy. The ROM computation requires the full array response matrix gathered with collocated sources and receivers. However, we show that the computation can deal with an approximation of this matrix, obtained from towed-streamer data using interpolation and reciprocity on-the-fly. While the full-waveform inversion approach of nonlinear least-squares data fitting is challenging without low frequency information, due to multiple minima of the data fit objective function, we show that the ROM misfit objective function has a better behavior, even for a poor initial guess. We also show by an explicit computation of the objective functions in a simple setting that the ROM misfit objective function has convexity properties, whereas the least squares data fit objective function displays multiple local minima.
Dynamic optimization of mean and variance in Markov decision processes (MDPs) is a long-standing challenge caused by the failure of dynamic programming. In this paper, we propose a new approach to find the globally optimal policy for combined metrics of steady-state mean and variance in an infinite-horizon undiscounted MDP. By introducing the concepts of pseudo mean and pseudo variance, we convert the original problem to a bilevel MDP problem, where the inner one is a standard MDP optimizing pseudo mean-variance and the outer one is a single parameter selection problem optimizing pseudo mean. We use the sensitivity analysis of MDPs to derive the properties of this bilevel problem. By solving inner standard MDPs for pseudo mean-variance optimization, we can identify worse policy spaces dominated by optimal policies of the pseudo problems. We propose an optimization algorithm which can find the globally optimal policy by repeatedly removing worse policy spaces. The convergence and complexity of the algorithm are studied. Another policy dominance property is also proposed to further improve the algorithm efficiency. Numerical experiments demonstrate the performance and efficiency of our algorithms. To the best of our knowledge, our algorithm is the first that efficiently finds the globally optimal policy of mean-variance optimization in MDPs. These results are also valid for solely minimizing the variance metrics in MDPs.
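The role of the pseudo mean can be illustrated numerically for a single fixed policy. The sketch below assumes the pseudo variance has the form E[(r - y)^2] for a pseudo mean y, which the abstract does not spell out; the transition matrix `P`, the rewards `r`, and the weight `beta` are hypothetical values chosen only for illustration. Under that assumption the pseudo mean-variance objective never exceeds the true mean-variance objective, and the two coincide when y equals the steady-state mean, which is why the outer problem reduces to selecting a single parameter.

```python
import numpy as np

def stationary_distribution(P):
    """Stationary distribution of an ergodic chain with row-stochastic matrix P."""
    n = P.shape[0]
    # Solve pi (P - I) = 0 together with sum(pi) = 1 in the least-squares sense.
    A = np.vstack([P.T - np.eye(n), np.ones((1, n))])
    b = np.zeros(n + 1)
    b[-1] = 1.0
    pi, *_ = np.linalg.lstsq(A, b, rcond=None)
    return pi

def mean_variance(pi, r, beta):
    """Steady-state mean minus beta times steady-state variance."""
    mean = pi @ r
    variance = pi @ (r - mean) ** 2
    return mean - beta * variance

def pseudo_mean_variance(pi, r, beta, y):
    """Average of the modified reward r - beta * (r - y)^2 (assumed form)."""
    return pi @ (r - beta * (r - y) ** 2)

# Hypothetical three-state chain induced by one fixed policy.
P = np.array([[0.5, 0.5, 0.0],
              [0.2, 0.3, 0.5],
              [0.4, 0.0, 0.6]])
r = np.array([1.0, 3.0, -2.0])
beta = 0.5

pi = stationary_distribution(P)
true_value = mean_variance(pi, r, beta)
ys = np.linspace(-3.0, 3.0, 61)
pseudo_values = np.array([pseudo_mean_variance(pi, r, beta, y) for y in ys])
value_at_mean = pseudo_mean_variance(pi, r, beta, pi @ r)
```

For a fixed y the modified reward is an ordinary per-state reward, so the inner problem can be handled by standard average-reward MDP solvers.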
Sparse graph recovery methods work well when the data follow their assumptions, but they are often not designed for downstream probabilistic queries. This limits their use to identifying connections among the input variables. Probabilistic Graphical Models (PGMs), on the other hand, assume an underlying base graph between variables and learn a distribution over them. PGM design choices are made carefully so that the inference and sampling algorithms are efficient, which brings certain restrictions and often simplifying assumptions. In this work, we propose Neural Graph Revealers (NGRs), which attempt to efficiently merge sparse graph recovery methods with PGMs into a single flow. The problem setting consists of input data X with D features and M samples, and the task is to recover a sparse graph showing the connections between the features. NGRs view neural networks as a `white box' or, more specifically, as a multitask learning framework. We introduce the `Graph-constrained path norm', which NGRs leverage to learn a graphical model that captures complex non-linear functional dependencies between the features in the form of an undirected sparse graph. Furthermore, NGRs can handle multimodal inputs such as images, text, categorical data, and embeddings, which are not straightforward to incorporate in existing methods. We show experimental results for sparse graph recovery and probabilistic inference on data from Gaussian graphical models and a multimodal infant mortality dataset from the CDC.
We introduce the Weak-form Estimation of Nonlinear Dynamics (WENDy) method for estimating model parameters for nonlinear systems of ODEs. The core mathematical idea involves an efficient conversion of the strong form representation of a model to its weak form, and then solving a regression problem to perform parameter inference. The core statistical idea rests on the Errors-In-Variables framework, which necessitates the use of the iteratively reweighted least squares algorithm. Further improvements are obtained by using orthonormal test functions, created from a set of $C^{\infty}$ bump functions of varying support sizes. We demonstrate that WENDy is a highly robust and efficient method for parameter inference in differential equations. Without relying on any numerical differential equation solvers, WENDy computes accurate estimates and is robust to large (biologically relevant) levels of measurement noise. For low dimensional systems with modest amounts of data, WENDy is competitive with conventional forward solver-based nonlinear least squares methods in terms of speed and accuracy. For both higher dimensional systems and stiff systems, WENDy is typically both faster (often by orders of magnitude) and more accurate than forward solver-based approaches. We illustrate the method and its performance in some common population and neuroscience models, including logistic growth, Lotka-Volterra, FitzHugh-Nagumo, Hindmarsh-Rose, and a Protein Transduction Benchmark model. Software and code for reproducing the examples are available at (//github.com/MathBioCU/WENDy).
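The strong-to-weak conversion can be sketched on logistic growth, $u' = w_1 u + w_2 u^2$. Multiplying by a compactly supported test function $\phi$ and integrating by parts gives $-\int \phi' u \, dt = w_1 \int \phi u \, dt + w_2 \int \phi u^2 \, dt$, which is linear in the parameters and needs no ODE solver. The code below is a minimal sketch of that idea only, not the WENDy implementation: it uses plain least squares on noise-free data, and it omits the Errors-In-Variables weighting, the iteratively reweighted least squares loop, and the orthonormalization of the test functions. The helper names, the bump parameter `eta`, and the grid choices are assumptions made for illustration.

```python
import numpy as np

def bump(t, center, radius, eta=1.0):
    """C-infinity bump exp(-eta / (1 - s^2)), s = (t - center) / radius, and its derivative."""
    s = (t - center) / radius
    phi = np.zeros_like(t)
    dphi = np.zeros_like(t)
    inside = np.abs(s) < 1.0
    q = 1.0 - s[inside] ** 2
    phi[inside] = np.exp(-eta / q)
    # Chain rule: d/dt [-eta / (1 - s^2)] = -2 * eta * s / ((1 - s^2)^2 * radius).
    dphi[inside] = phi[inside] * (-2.0 * eta * s[inside] / (q ** 2 * radius))
    return phi, dphi

def weak_form_fit(t, u, centers, radius):
    """Least-squares fit of (w1, w2) in u' = w1 * u + w2 * u^2 from the weak form."""
    dt = t[1] - t[0]
    G, b = [], []
    for c in centers:
        phi, dphi = bump(t, c, radius)
        # The test functions vanish at both ends of the grid, so a plain sum
        # times dt coincides with the trapezoidal rule.
        G.append([np.sum(phi * u) * dt, np.sum(phi * u ** 2) * dt])
        b.append(-np.sum(dphi * u) * dt)
    w, *_ = np.linalg.lstsq(np.array(G), np.array(b), rcond=None)
    return w

# Synthetic data from the exact logistic solution (hypothetical parameter values).
growth, capacity, u0 = 1.0, 2.0, 0.1
t = np.linspace(0.0, 10.0, 401)
u = capacity / (1.0 + (capacity - u0) / u0 * np.exp(-growth * t))

centers = np.linspace(2.0, 8.0, 13)   # supports [c - 2, c + 2] stay inside [0, 10]
w_hat = weak_form_fit(t, u, centers, radius=2.0)
```

With these values the true parameters are $w_1 = 1$ and $w_2 = -0.5$. With noisy data the regression matrix itself is corrupted, which is the situation the Errors-In-Variables treatment in the abstract is meant to address.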
In the present work, we introduce a novel approach to enhance the precision of reduced order models by exploiting a multi-fidelity perspective and DeepONets. Reduced models provide a real-time numerical approximation by simplifying the original model. The error introduced by this simplification is usually neglected, sacrificed for the sake of fast computation. We propose to couple model reduction with machine-learning-based residual learning, so that this error can be learned by a neural network and inferred for new predictions. We emphasize that the framework maximizes the exploitation of the high-fidelity information, using it both to build the reduced order model and to learn the residual. In this work we explore the integration of proper orthogonal decomposition (POD), and gappy POD for sensor data, with the recent DeepONet architecture. Numerical investigations for a parametric benchmark function and a nonlinear parametric Navier-Stokes problem are presented.
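The quantity that the residual learner is asked to predict can be made concrete with a small POD example. The sketch below builds a POD basis from high-fidelity snapshots by a singular value decomposition, projects the snapshots onto it, and forms the residual between the two. The benchmark function, the grid, and the rank are hypothetical choices, and the network itself (a DeepONet in the abstract) is left out; only the training targets are constructed.

```python
import numpy as np

# Hypothetical parametric benchmark u(x; mu) = sin(mu * pi * x) * exp(-x).
x = np.linspace(0.0, 1.0, 200)
mus = np.linspace(1.0, 5.0, 40)
snapshots = np.stack(
    [np.sin(mu * np.pi * x) * np.exp(-x) for mu in mus], axis=1
)  # shape (n_points, n_parameters)

# POD basis: leading left singular vectors of the snapshot matrix.
U, s, _ = np.linalg.svd(snapshots, full_matrices=False)
rank = 3
basis = U[:, :rank]

# Low-fidelity (reduced) reconstruction and the residual it leaves behind.
reduced = basis @ (basis.T @ snapshots)
residual = snapshots - reduced

# Pairs (mu_i, residual[:, i]) would serve as training data for a residual model.
residual_norm = np.linalg.norm(residual)
discarded_energy = np.sqrt(np.sum(s[rank:] ** 2))
```

The Frobenius norm of the residual equals the square root of the sum of the squared discarded singular values, so the singular value decay indicates how much error is left for the residual model to recover.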
We consider the problem of supervised dimension reduction with a particular focus on extreme values of the target $Y\in\mathbb{R}$ to be explained by a covariate vector $X \in \mathbb{R}^p$. The general purpose is to define and estimate a projection on a lower dimensional subspace of the covariate space which is sufficient for predicting exceedances of the target above high thresholds. We propose an original definition of Tail Conditional Independence which matches this purpose. Inspired by Sliced Inverse Regression (SIR) methods, we develop a novel framework (TIREX, Tail Inverse Regression for EXtreme response) in order to estimate an extreme sufficient dimension reduction (SDR) space of potentially smaller dimension than that of a classical SDR space. We prove the weak convergence of tail empirical processes involved in the estimation procedure and we illustrate the relevance of the proposed approach on simulated and real world data.
We establish Lipschitz stability properties for a class of inverse problems. In that class, the associated direct problem is formulated by an integral operator $A_m$ depending non-linearly on a parameter $m$ and operating on a function $u$. In the inversion step both $u$ and $m$ are unknown but we are only interested in recovering $m$. We discuss examples of such inverse problems for the elasticity equation with applications to seismology and for the inverse scattering problem in electromagnetic theory. Assuming a few injectivity and regularity properties for $A_m$, we prove that the inverse problem with a finite number of data points is solvable and that the solution is Lipschitz stable in the data. We show a reconstruction example illustrating the use of neural networks.
This work introduces, analyzes and demonstrates an efficient and theoretically sound filtering strategy to control the conditioning of the least-squares problem solved at each iteration of Anderson acceleration. The filtering strategy consists of two steps: the first controls the length disparity between columns of the least-squares matrix, and the second enforces a lower bound on the angles between subspaces spanned by the columns of that matrix. The combined strategy is shown to control the condition number of the least-squares matrix at each iteration. The method is shown to be effective on a range of problems based on discretizations of partial differential equations. It is shown to be particularly effective for problems where the initial iterate may lie far from the solution, and which progress through distinct preasymptotic and asymptotic phases.
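A two-step filter of this kind can be sketched inside a basic Anderson acceleration loop. The code below is an illustration under stated assumptions, not the paper's algorithm: the thresholds `max_length_ratio` and `min_sine`, the exact dropping criteria, and the test problem are all hypothetical. The length step discards columns much longer than the newest one, and the angle step uses a QR factorization to discard columns that make a small angle with the span of the more recent columns.

```python
import numpy as np

def filter_columns(F, max_length_ratio=100.0, min_sine=0.01):
    """Indices of the columns of F to keep; the newest column is the last one."""
    norms = np.linalg.norm(F, axis=0)
    if norms[-1] == 0.0:
        return []
    # Step 1 (length): drop zero columns and columns far longer than the newest.
    keep = [j for j in range(F.shape[1])
            if 0.0 < norms[j] <= max_length_ratio * norms[-1]]
    # Step 2 (angle): QR with the newest column first. |R[i, i]| / norm is the sine
    # of the angle between column i and the span of the columns ordered before it.
    order = keep[::-1]
    R = np.linalg.qr(F[:, order], mode="r")
    sines = np.abs(np.diag(R)) / norms[order]
    return sorted(j for j, s in zip(order, sines) if s >= min_sine)

def anderson(g, x0, depth=5, tol=1e-10, max_iter=1000):
    """Anderson acceleration for x = g(x) with the column filter applied each step."""
    x = np.asarray(x0, dtype=float)
    gx = g(x)
    f = gx - x
    dF, dG = [], []   # differences of residuals and of g-values
    for k in range(max_iter):
        if np.linalg.norm(f) < tol:
            return x, k
        x_new = gx    # plain fixed-point step if no history survives the filter
        if dF:
            keep = filter_columns(np.column_stack(dF))
            dF = [dF[j] for j in keep]
            dG = [dG[j] for j in keep]
        if dF:
            gamma, *_ = np.linalg.lstsq(np.column_stack(dF), f, rcond=None)
            x_new = gx - np.column_stack(dG) @ gamma
        g_new = g(x_new)
        f_new = g_new - x_new
        dF = (dF + [f_new - f])[-depth:]
        dG = (dG + [g_new - gx])[-depth:]
        x, gx, f = x_new, g_new, f_new
    return x, max_iter

# Hypothetical test problem: a linear contraction x = a * x + b (componentwise).
a = np.linspace(0.1, 0.95, 20)
b = np.ones(20)
x_fix, n_iter = anderson(lambda x: a * x + b, np.zeros(20))
```

Dropping a column can only enlarge the angles measured for the remaining, older columns, so the sine bound enforced in the second step still holds for the columns that are kept.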
This paper investigates the inverse source problem with a single propagating mode at multiple frequencies in an acoustic waveguide. The goal is to provide both theoretical justifications and efficient algorithms for imaging extended sources using sampling methods. In contrast to the existing far/near field operator based on the integral over the space variable in the sampling methods, a multi-frequency far-field operator is introduced based on the integral over the frequency variable. This far-field operator is defined in a way to incorporate the possibly non-linear dispersion relation, a unique feature in waveguides. The factorization method is deployed to establish a rigorous characterization of the range support, which is the support of the source in the direction of wave propagation. A related factorization-based sampling method is also discussed. These sampling methods are shown to be capable of imaging the range support of the source. Numerical examples are provided to illustrate the performance of the sampling methods, including an example to image a complete sound-soft block.
Multimodal learning helps to comprehensively understand the world by integrating different senses. Accordingly, multiple input modalities are expected to boost model performance, but we find that they are not fully exploited even when the multimodal model outperforms its uni-modal counterpart. Specifically, in this paper we point out that existing multimodal discriminative models, in which a uniform objective is designed for all modalities, can leave uni-modal representations under-optimized because another modality dominates in some scenarios, e.g., sound in a blowing-wind event or vision in a picture-drawing event. To alleviate this optimization imbalance, we propose on-the-fly gradient modulation to adaptively control the optimization of each modality by monitoring the discrepancy of their contributions towards the learning objective. Further, extra Gaussian noise that changes dynamically is introduced to avoid a possible generalization drop caused by gradient modulation. As a result, we achieve considerable improvement over common fusion methods on different multimodal tasks, and this simple strategy can also boost existing multimodal methods, which illustrates its efficacy and versatility. The source code is available at \url{//github.com/GeWu-Lab/OGM-GE_CVPR2022}.
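One way such a modulation could be computed is sketched below for two modalities. The abstract does not give the formulas, so everything here is an assumption made for illustration: each modality's contribution is measured by the summed softmax score of the correct class, the dominant modality's gradient is scaled by `1 - tanh(alpha * ratio)`, and the added Gaussian noise takes its scale from the gradient's own standard deviation. The function names and the hyperparameter `alpha` are hypothetical.

```python
import numpy as np

def softmax(z):
    """Row-wise softmax with the usual max-shift for numerical stability."""
    z = z - z.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def modulation_coefficients(logits_a, logits_b, labels, alpha=0.5):
    """Gradient scaling factors for modalities a and b (assumed tanh form)."""
    idx = np.arange(len(labels))
    score_a = softmax(logits_a)[idx, labels].sum()
    score_b = softmax(logits_b)[idx, labels].sum()
    ratio_a = score_a / score_b
    ratio_b = score_b / score_a
    # Only the currently dominant modality (ratio > 1) is slowed down.
    k_a = 1.0 - np.tanh(alpha * ratio_a) if ratio_a > 1.0 else 1.0
    k_b = 1.0 - np.tanh(alpha * ratio_b) if ratio_b > 1.0 else 1.0
    return float(k_a), float(k_b)

def modulate(grad, coeff, rng):
    """Scale a gradient and add zero-mean Gaussian noise matched to its spread."""
    noise = rng.normal(0.0, grad.std() + 1e-8, size=grad.shape)
    return coeff * grad + noise

# Toy batch: modality a is confident and correct, modality b is uninformative.
labels = np.array([0, 1])
logits_a = np.array([[5.0, 0.0], [0.0, 5.0]])
logits_b = np.zeros((2, 2))
k_a, k_b = modulation_coefficients(logits_a, logits_b, labels)

rng = np.random.default_rng(0)
grad_a = np.array([0.2, -0.1, 0.4])
grad_a_modulated = modulate(grad_a, k_a, rng)
```

In this toy batch the confident modality receives a coefficient below one while the weaker modality keeps its full gradient, which is the rebalancing effect the abstract describes.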
GAN inversion aims to invert a given image back into the latent space of a pretrained GAN model, for the image to be faithfully reconstructed from the inverted code by the generator. As an emerging technique to bridge the real and fake image domains, GAN inversion plays an essential role in enabling the pretrained GAN models such as StyleGAN and BigGAN to be used for real image editing applications. Meanwhile, GAN inversion also provides insights on the interpretation of GAN's latent space and how the realistic images can be generated. In this paper, we provide an overview of GAN inversion with a focus on its recent algorithms and applications. We cover important techniques of GAN inversion and their applications to image restoration and image manipulation. We further elaborate on some trends and challenges for future directions.