The development of surrogate models to study uncertainties in hydrologic systems requires significant effort in designing sampling strategies and running forward model simulations. Furthermore, in applications where prediction time is critical, such as the prediction of hurricane storm surge, estimates of the system response and its uncertainties may be required within short time frames. Here, we develop an efficient stochastic shallow water model to address these issues. We discretize the physical and probability spaces with a Stochastic Galerkin method and advance the solution in time with an Incremental Pressure Correction scheme. To overcome discrete stability issues, we propose cross-mode stabilization methods, which extend existing stabilization methods to the probability space by adding stabilization terms to every stochastic mode in a mode-coupled way. We extensively verify the developed method on both idealized shallow water test cases and hindcasts of past hurricanes. We then use the developed and verified method to perform a comprehensive statistical analysis of the established shallow water surrogate models. Finally, we propose a predictor for hurricane storm surge under uncertain wind drag coefficients and demonstrate its effectiveness for Hurricanes Ike and Harvey.
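For orientation, the stochastic Galerkin setup referred to above takes the following generic form (a standard sketch, not the paper's exact formulation): the uncertain solution is expanded in a polynomial chaos basis and the residual is projected onto each mode,

$$
u(x,t,\xi) \approx \sum_{k=0}^{P} u_k(x,t)\,\Psi_k(\xi), \qquad \Big\langle \mathcal{R}\Big(\sum_{k=0}^{P} u_k \Psi_k\Big),\, \Psi_j \Big\rangle = 0, \quad j = 0,\dots,P,
$$

where the $\Psi_k$ are polynomials orthogonal with respect to the probability measure of the random inputs $\xi$ and $\langle\cdot,\cdot\rangle$ denotes expectation over the probability space. The result is a coupled system of $P+1$ deterministic shallow-water-like equations, and the cross-mode stabilization terms act on, and couple, these modes.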
This article surveys research on the application of compatible finite element methods to large-scale atmosphere and ocean simulation. Compatible finite element methods extend Arakawa's C-grid finite difference scheme to the finite element world. They are constructed from a discrete de Rham complex: a sequence of finite element spaces linked by the operators of differential calculus. The use of discrete de Rham complexes to solve partial differential equations is well established, but in this article we focus on the specifics of dynamical cores for simulating weather, oceans and climate. The most important consequence of the discrete de Rham complex is the Hodge-Helmholtz decomposition, which has been used to exclude the possibility of several types of spurious oscillation in linear equations of geophysical flow. This means that compatible finite element spaces provide a useful framework for building dynamical cores. In this article we introduce the main concepts of compatible finite element spaces and discuss their wave propagation properties. We survey methods for discretising the transport terms that arise in dynamical core equation systems, provide some example discretisations, and briefly discuss their iterative solution. We then focus on the recent use of compatible finite element spaces in designing structure-preserving methods, surveying variational discretisations, Poisson bracket discretisations, and consistent vorticity transport.
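For reference, the three-dimensional de Rham complex underlying these constructions is the standard sequence

$$
H^1 \xrightarrow{\ \nabla\ } H(\mathrm{curl}) \xrightarrow{\ \nabla\times\ } H(\mathrm{div}) \xrightarrow{\ \nabla\cdot\ } L^2 .
$$

Compatible finite element methods choose discrete subspaces forming a subcomplex of this sequence, so that identities such as $\nabla\times\nabla = 0$ and $\nabla\cdot(\nabla\times) = 0$ hold exactly at the discrete level; this is what yields the discrete Hodge-Helmholtz decomposition mentioned above.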
We introduce the Weak-form Estimation of Nonlinear Dynamics (WENDy) method for estimating model parameters in nonlinear systems of ODEs. The core mathematical idea is to efficiently convert the strong-form representation of a model into its weak form and then solve a regression problem to perform parameter inference. The core statistical idea rests on the Errors-In-Variables framework, which necessitates the use of the iteratively reweighted least squares algorithm. Further improvements are obtained by using orthonormal test functions created from a set of $C^{\infty}$ bump functions of varying support sizes. We demonstrate that WENDy is a highly robust and efficient method for parameter inference in differential equations. Without relying on any numerical differential equation solvers, WENDy computes accurate estimates and is robust to large (biologically relevant) levels of measurement noise. For low-dimensional systems with modest amounts of data, WENDy is competitive with conventional forward solver-based nonlinear least squares methods in terms of speed and accuracy. For both higher-dimensional and stiff systems, WENDy is typically both faster (often by orders of magnitude) and more accurate than forward solver-based approaches. We illustrate the method and its performance on common population and neuroscience models, including logistic growth, Lotka-Volterra, FitzHugh-Nagumo, Hindmarsh-Rose, and a Protein Transduction Benchmark model. Software and code for reproducing the examples are available at https://github.com/MathBioCU/WENDy.
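To make the weak-form conversion concrete, here is a minimal, self-contained sketch on the logistic growth model (an illustrative reduction of the idea only: it omits WENDy's orthonormalized test functions and the iteratively reweighted least squares step). For $u' = \theta\, u(1-u)$, multiplying by a compactly supported test function $\phi$ and integrating by parts gives the solver-free regression $-\int \phi' u \, dt = \theta \int \phi\, u(1-u)\, dt$.

```python
import numpy as np

rng = np.random.default_rng(0)
t = np.linspace(0.0, 10.0, 400)
dt = t[1] - t[0]
theta_true = 0.8
u = 1.0 / (1.0 + 9.0 * np.exp(-theta_true * t))        # exact logistic solution, u(0)=0.1
u = u + 0.02 * rng.standard_normal(t.size)             # noisy observations

def bump(t, a, b):
    """C^infinity bump supported on (a, b), zero elsewhere."""
    s = 2.0 * (t - a) / (b - a) - 1.0                  # map (a, b) -> (-1, 1)
    out = np.zeros_like(t)
    m = np.abs(s) < 1.0
    out[m] = np.exp(-1.0 / (1.0 - s[m] ** 2))
    return out

def bump_deriv(t, a, b):
    """Analytic derivative of bump via the chain rule (no numerical differentiation)."""
    s = 2.0 * (t - a) / (b - a) - 1.0
    out = np.zeros_like(t)
    m = np.abs(s) < 1.0
    out[m] = np.exp(-1.0 / (1.0 - s[m] ** 2)) * (-2.0 * s[m] / (1.0 - s[m] ** 2) ** 2)
    return out * 2.0 / (b - a)

# One regression row per test function, then a 1-D least squares fit for theta.
lhs, rhs = [], []
for a in np.linspace(0.0, 7.0, 25):                    # sliding supports (a, a+3)
    lhs.append(-np.sum(bump_deriv(t, a, a + 3.0) * u) * dt)   # -int(phi' u)
    rhs.append(np.sum(bump(t, a, a + 3.0) * u * (1.0 - u)) * dt)  # int(phi u(1-u))

theta_hat = np.dot(rhs, lhs) / np.dot(rhs, rhs)
print(f"estimated theta = {theta_hat:.3f} (true value {theta_true})")
```

Because the derivative lands on the test function rather than the noisy data, no numerical differentiation or forward solve is needed, which is the source of the method's robustness to noise.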
Elliptic interface problems whose solutions are $C^0$ continuous have been well studied over the past two decades. Well-known numerical methods include the strongly stable generalized finite element method (SGFEM) and the immersed FEM (IFEM). In this paper, we study numerically a larger class of elliptic interface problems whose solutions are discontinuous. A direct application of the existing methods fails immediately, as the approximate solution must lie in a larger space that admits discontinuous functions. We propose a class of high-order enriched unfitted FEMs to solve these problems with implicit or Robin-type interface jump conditions. We design new enrichment functions that capture the imposed discontinuity of the solution while keeping the condition number from growing rapidly. A linear enriched method in 1D was recently developed using one enrichment function; we generalize it to arbitrary degree using two simple discontinuous one-sided enrichment functions. The natural tensor-product extension to the 2D case is demonstrated. Optimal-order convergence in the $L^2$ and broken $H^1$ norms is established. We also establish superconvergence at all discretization nodes (including exact nodal values in special cases). Numerical examples are provided to confirm the theory. Finally, to demonstrate the efficiency of the method for practical problems, the enriched linear, quadratic, and cubic elements are applied to a multi-layer wall model for drug-eluting stents in which both zero-flux jump conditions and implicit concentration interface conditions are present.
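As a representative model problem (generic form, with constants not taken from the paper), consider the 1D equation $-(\beta u')' = f$ on $(0,1)$ with an interface at $x = \alpha$ and jump conditions

$$
[\![u]\!]_{\alpha} = \gamma_0, \qquad [\![\beta u']\!]_{\alpha} = \gamma_1,
$$

where $[\![\cdot]\!]_{\alpha}$ denotes the jump across the interface. A nonzero $\gamma_0$ makes the solution discontinuous at $\alpha$, which is precisely what standard $C^0$-oriented methods cannot represent; implicit or Robin-type conditions instead couple these jumps to the one-sided solution values at the interface.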
When implementing the gradient descent method in low precision, the use of stochastic rounding schemes helps prevent stagnation of convergence caused by the vanishing gradient effect. Unbiased stochastic rounding achieves zero bias by preserving small updates with probabilities proportional to their relative magnitudes. This study provides a theoretical explanation for the stagnation of the gradient descent method in low-precision computation. Additionally, we propose two new stochastic rounding schemes that trade the zero-bias property for a larger probability of preserving small gradients. Our methods yield a constant rounding bias that, on average, lies in a descent direction. For convex problems, we prove that the proposed rounding methods typically have a beneficial effect on the convergence rate of gradient descent. We validate our theoretical analysis by comparing the performance of various rounding schemes when optimizing a multinomial logistic regression model and when training a simple neural network with an 8-bit floating-point format.
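For concreteness, here is a minimal sketch of unbiased stochastic rounding on a uniformly spaced grid (the grid spacing `ulp` and the setup are illustrative; the proposed biased schemes would modify the probability `p_up` below):

```python
import numpy as np

def stochastic_round(x, ulp, rng):
    """Unbiased stochastic rounding of x to a grid with spacing ulp.

    Rounds up with probability equal to the fractional distance to the
    next grid point, so E[round(x)] = x and small updates survive with
    small probability instead of always being rounded to zero.
    """
    lo = np.floor(x / ulp) * ulp          # nearest grid point below x
    p_up = (x - lo) / ulp                 # in [0, 1)
    return lo + ulp * (rng.random(np.shape(x)) < p_up)

rng = np.random.default_rng(0)
ulp = 2.0 ** -8
update = 1e-4 * np.ones(100_000)          # an update far below one grid spacing
rounded = stochastic_round(update, ulp, rng)
print(rounded.mean())                     # approx 1e-4: unbiased on average
# Round-to-nearest would return 0 for every entry here, stalling descent.
```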
Nearly all simulation-based games have environment parameters that affect incentives in the interaction but are not explicitly incorporated into the game model. To understand the impact of these parameters on strategic incentives, typical game-theoretic analysis involves selecting a small set of representative values and constructing and analyzing a separate game model for each value. We introduce a novel technique to learn a single model representing a family of closely related games that differ in the number of symmetric players or in other ordinal environment parameters. Prior work trains a multi-headed neural network to output mixed-strategy deviation payoffs, which can be used to compute symmetric $\varepsilon$-Nash equilibria. We extend this work by making environment parameters into input dimensions of the regressor, enabling a single model to learn patterns that generalize across the parameter space. For both continuous and discrete parameters, our results show that these generalized models outperform existing approaches, achieving better accuracy with far less data. This technique makes thorough analysis of the parameter space more tractable and promotes analyses that capture relationships between parameters and incentives.
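A minimal PyTorch sketch of this idea, with illustrative architecture and names (not the authors' exact model): the environment parameter enters as an extra input dimension, so a single regressor covers the whole game family.

```python
import torch
import torch.nn as nn

class DeviationPayoffNet(nn.Module):
    """Illustrative regressor: (mixed strategy, environment parameter)
    -> per-strategy deviation payoffs. Names and sizes are hypothetical."""

    def __init__(self, num_strategies: int, hidden: int = 64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(num_strategies + 1, hidden),  # +1 input for the parameter
            nn.ReLU(),
            nn.Linear(hidden, hidden),
            nn.ReLU(),
            nn.Linear(hidden, num_strategies),      # one payoff per deviation
        )

    def forward(self, mixture: torch.Tensor, param: torch.Tensor) -> torch.Tensor:
        # mixture: (batch, num_strategies) points on the simplex;
        # param: (batch, 1), e.g. a normalized number of players.
        # Concatenating lets one model generalize across parameter values
        # instead of training a separate model per value.
        return self.net(torch.cat([mixture, param], dim=-1))

model = DeviationPayoffNet(num_strategies=5)
mix = torch.softmax(torch.randn(8, 5), dim=-1)   # batch of mixed strategies
n_players = torch.full((8, 1), 0.5)              # e.g. 50 players scaled to [0, 1]
dev_payoffs = model(mix, n_players)              # shape (8, 5)
# Regret of a mixture = best deviation payoff minus its expected payoff;
# symmetric eps-Nash candidates are mixtures with regret below eps.
regret = dev_payoffs.max(dim=-1).values - (dev_payoffs * mix).sum(dim=-1)
```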
Dynamic Algorithm Configuration (DAC) tackles the question of how to automatically learn policies for controlling the parameters of algorithms in a data-driven fashion. This question has received considerable attention from the evolutionary computation community in recent years. A good benchmark collection for gaining a structural understanding of the effectiveness and limitations of different solution methods for DAC is therefore strongly desirable. Following recent work proposing DAC benchmarks with well-understood theoretical properties and ground-truth information, we suggest as a new DAC benchmark the control of the key parameter $\lambda$ in the $(1+(\lambda,\lambda))$~Genetic Algorithm for solving OneMax problems. We study how to solve this DAC problem via (static) automated algorithm configuration on the benchmark and propose techniques that significantly improve the performance of the approach. On sufficiently large problem sizes, our approach consistently outperforms the benchmark's default parameter control policy, which is derived from previous theoretical work. We also present new findings on the landscape of parameter-control search policies and propose methods to compute stronger baselines for the benchmark via numerical approximations of the true optimal policies.
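For readers unfamiliar with the benchmark's target algorithm, here is a compact sketch of the $(1+(\lambda,\lambda))$ Genetic Algorithm on OneMax, together with the theory-derived fitness-dependent policy $\lambda = \sqrt{n/(n-f(x))}$ used as the baseline (a simplified rendering of the standard algorithm, not the benchmark's reference implementation):

```python
import math
import random

def one_max(x):
    return sum(x)

def one_ll_ga_step(x, lam, rng):
    """One iteration of the (1+(lambda,lambda)) GA on a bit string x.

    Mutation phase: lambda offspring each flipping k ~ Bin(n, lambda/n) bits;
    crossover phase: lambda biased crossovers (bias c = 1/lambda) between x
    and the best mutant; the best child replaces x if it is not worse.
    """
    n = len(x)
    p, c = lam / n, 1.0 / lam
    k = sum(rng.random() < p for _ in range(n)) or 1   # mutation strength, at least 1

    mutants = []
    for _ in range(max(1, round(lam))):
        y = x[:]
        for i in rng.sample(range(n), k):              # flip exactly k random bits
            y[i] ^= 1
        mutants.append(y)
    x_prime = max(mutants, key=one_max)

    children = []
    for _ in range(max(1, round(lam))):
        children.append([xp if rng.random() < c else xi
                         for xi, xp in zip(x, x_prime)])
    y_best = max(children + [x_prime], key=one_max)
    return y_best if one_max(y_best) >= one_max(x) else x

rng = random.Random(0)
n = 200
x = [rng.randint(0, 1) for _ in range(n)]
evals = 0
while one_max(x) < n:
    lam = math.sqrt(n / (n - one_max(x)))  # theory-derived default control policy
    x = one_ll_ga_step(x, lam, rng)
    evals += 2 * max(1, round(lam))
print("solved OneMax after ~", evals, "evaluations")
```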
Subgradient methods are the natural extension of classical gradient descent to non-smooth convex optimization problems. In general, however, they exhibit slow convergence rates and require decreasing step-sizes to converge. In this paper we propose a subgradient method with constant step-size for composite convex objectives with $\ell_1$-regularization. If the smooth term is strongly convex, we establish a linear convergence result for the function values. This result relies on a careful choice of the subdifferential element used for the update, and on appropriate actions taken when regions of non-differentiability are crossed. We then propose an accelerated version of the algorithm, based on conservative inertial dynamics and an adaptive restart strategy. Finally, we test the performance of our algorithms on strongly convex and non-strongly convex examples.
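The following sketch illustrates the kind of subgradient selection involved (our own minimal-norm choice at zero coordinates, for illustration; the paper's exact selection rule and crossing safeguards differ):

```python
import numpy as np

def l1_subgradient_step(x, grad_f, lam, step):
    """One constant step-size subgradient step for F(x) = f(x) + lam * ||x||_1.

    At nonzero coordinates the l1 term contributes sign(x_i); at zero
    coordinates we pick the element of [-lam, lam] that minimizes the norm
    of the total subgradient, soft-thresholding the gradient there so that
    zero coordinates with small gradients stay exactly zero.
    """
    g = grad_f(x)
    sub = g + lam * np.sign(x)                          # nonzero coordinates
    zero = (x == 0)
    sub[zero] = g[zero] - np.clip(g[zero], -lam, lam)   # minimal-norm element
    return x - step * sub

# Example with a strongly convex smooth term f(x) = 0.5 * ||A x - b||^2.
rng = np.random.default_rng(0)
A = rng.standard_normal((50, 10))
b = rng.standard_normal(50)
grad_f = lambda x: A.T @ (A @ x - b)
lam, step = 0.5, 1e-3

x = np.zeros(10)
for _ in range(5000):
    x = l1_subgradient_step(x, grad_f, lam, step)
print("objective:", 0.5 * np.sum((A @ x - b) ** 2) + lam * np.abs(x).sum())
```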
We study the well-known Cubic-Newton method in the stochastic setting and propose a general framework for variance reduction, which we call the helper framework. Previous work proposed such methods with very large batches (for both gradients and Hessians) and under various, often strong, assumptions. In this work, we investigate the possibility of using such methods without large batches, under very simple assumptions that suffice for all our methods to work. In addition, we study these methods applied to gradient-dominated functions. In the general case, we show improved convergence (compared to first-order methods) to an approximate local minimum, and for gradient-dominated functions we show convergence to approximate global minima.
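For reference, the cubically regularized Newton step underlying these methods is the standard model minimization (with $g_t$ and $H_t$ the stochastic gradient and Hessian estimates, e.g. produced via helper-based variance reduction, and $M$ an upper bound on the Hessian's Lipschitz constant):

$$
x_{t+1} = \operatorname*{arg\,min}_{y} \; \langle g_t,\, y - x_t \rangle + \tfrac{1}{2}\, \langle H_t (y - x_t),\, y - x_t \rangle + \tfrac{M}{6}\, \| y - x_t \|^3 .
$$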
Recent advances in data-driven machine learning have revolutionized fields such as computer vision and reinforcement learning, along with many scientific and engineering domains. In many real-world and scientific problems, the systems that generate data are governed by physical laws. Recent work shows that incorporating physical priors alongside collected data can benefit machine learning models, making the intersection of machine learning and physics a prevailing paradigm. In this survey, we present this learning paradigm, called Physics-Informed Machine Learning (PIML), in which a model leverages both empirical data and available physical prior knowledge to improve performance on tasks that involve a physical mechanism. We systematically review the recent development of physics-informed machine learning from three perspectives: machine learning tasks, representations of physical priors, and methods for incorporating physical priors. We also propose several important open research problems based on current trends in the field. We argue that encoding different forms of physical priors into model architectures, optimizers, and inference algorithms, as well as significant domain-specific applications such as inverse engineering design and robotic control, remain far from fully explored in physics-informed machine learning. We believe that this study will encourage researchers in the machine learning community to actively participate in the interdisciplinary research of physics-informed machine learning.
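As one concrete (and deliberately generic) example of encoding a physical prior, the PINN-style sketch below fits a network to noisy data while penalizing the residual of a known governing law, here the toy ODE $u' = -u$; all names and sizes are illustrative.

```python
import torch
import torch.nn as nn

# Fit u(t) to data while penalizing the residual of the physical law u' = -u.
net = nn.Sequential(nn.Linear(1, 32), nn.Tanh(), nn.Linear(32, 1))

t_data = torch.rand(20, 1)
u_data = torch.exp(-t_data) + 0.01 * torch.randn(20, 1)   # noisy observations
t_phys = torch.rand(200, 1, requires_grad=True)           # collocation points

opt = torch.optim.Adam(net.parameters(), lr=1e-3)
for _ in range(2000):
    opt.zero_grad()
    data_loss = ((net(t_data) - u_data) ** 2).mean()      # empirical data term
    u = net(t_phys)
    du_dt = torch.autograd.grad(u.sum(), t_phys, create_graph=True)[0]
    phys_loss = ((du_dt + u) ** 2).mean()                 # residual of u' = -u
    loss = data_loss + phys_loss                          # physics as a prior
    loss.backward()
    opt.step()
```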
This PhD thesis contains several contributions to the field of statistical causal modeling. Statistical causal models are statistical models embedded with causal assumptions that allow for inference and reasoning about the behavior of stochastic systems affected by external manipulations (interventions). The thesis contributes to the research areas of causal effect estimation, causal structure learning, and distributionally robust (out-of-distribution generalizing) prediction methods. We present novel and consistent linear and nonlinear causal effect estimators for instrumental variable settings that employ data-dependent mean squared prediction error regularization. In certain settings, our proposed estimators improve on the mean squared error of both canonical and state-of-the-art estimators. We show that recent research on distributionally robust prediction methods has connections to well-studied estimators from econometrics. This connection leads us to prove that general K-class estimators possess distributional robustness properties. Furthermore, we propose a general framework for distributional robustness with respect to intervention-induced distributions. In this framework, we derive sufficient conditions for the identifiability of distributionally robust prediction methods and present impossibility results that show the necessity of several of these conditions. We present a new structure learning method applicable in additive noise models with directed trees as causal graphs. We prove its consistency in a vanishing-identifiability setup and provide a method for testing substructure hypotheses with asymptotic family-wise error control that remains valid post-selection. Finally, we present heuristic ideas for learning summary graphs of nonlinear time-series models.
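For context, the K-class estimators mentioned above take the standard econometric form: with outcome $Y$, regressors $X$, instruments $Z$, and $M_Z = I - Z(Z^\top Z)^{-1} Z^\top$,

$$
\hat{\beta}(\kappa) = \big( X^\top (I - \kappa M_Z) X \big)^{-1} X^\top (I - \kappa M_Z) Y,
$$

which recovers ordinary least squares at $\kappa = 0$ and two-stage least squares at $\kappa = 1$; the thesis's result ties the choice of $\kappa$ to robustness against intervention-induced distributions.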