In this work, we offer a theoretical analysis of two modern optimization techniques for training large and complex models: (i) adaptive optimization algorithms, such as Adam, and (ii) the model exponential moving average (EMA). Specifically, we demonstrate that a clipped version of Adam with model EMA achieves the optimal convergence rates in various nonconvex optimization settings, both smooth and nonsmooth. Moreover, when the scale varies significantly across different coordinates, we demonstrate that the coordinate-wise adaptivity of Adam is provably advantageous. Notably, unlike previous analyses of Adam, our analysis crucially relies on its core elements -- momentum and discounting factors -- as well as model EMA, motivating their wide applications in practice.
Finding the optimal design of experiments in the Bayesian setting typically requires estimation and optimization of the expected information gain functional. This functional consists of one outer and one inner integral, separated by the logarithm function applied to the inner integral. When the mathematical model of the experiment contains uncertainty about the parameters of interest and nuisance uncertainty, (i.e., uncertainty about parameters that affect the model but are not themselves of interest to the experimenter), two inner integrals must be estimated. Thus, the already considerable computational effort required to determine good approximations of the expected information gain is increased further. The Laplace approximation has been applied successfully in the context of experimental design in various ways, and we propose two novel estimators featuring the Laplace approximation to alleviate the computational burden of both inner integrals considerably. The first estimator applies Laplace's method followed by a Laplace approximation, introducing a bias. The second estimator uses two Laplace approximations as importance sampling measures for Monte Carlo approximations of the inner integrals. Both estimators use Monte Carlo approximation for the remaining outer integral estimation. We provide four numerical examples demonstrating the applicability and effectiveness of our proposed estimators.
In this paper, we present a stochastic method for the simulation of Laplace's equation with a mixed boundary condition in planar domains that are polygonal or bounded by circular arcs. We call this method the Reflected Walk-on-Spheres algorithm. The method combines a traditional Walk-on-Spheres algorithm with use of reflections at the Neumann boundaries. We apply our algorithm to simulate numerical conformal mappings from certain quadrilaterals to the corresponding canonical domains, and to compute their conformal moduli. Finally, we give examples of the method on three dimensional polyhedral domains, and use it to simulate the heat flow on an L-shaped insulated polyhedron.
Stability is a basic requirement when studying the behavior of dynamical systems. However, stabilizing dynamical systems via reinforcement learning is challenging because only little data can be collected over short time horizons before instabilities are triggered and data become meaningless. This work introduces a reinforcement learning approach that is formulated over latent manifolds of unstable dynamics so that stabilizing policies can be trained from few data samples. The unstable manifolds are minimal in the sense that they contain the lowest dimensional dynamics that are necessary for learning policies that guarantee stabilization. This is in stark contrast to generic latent manifolds that aim to approximate all -- stable and unstable -- system dynamics and thus are higher dimensional and often require higher amounts of data. Experiments demonstrate that the proposed approach stabilizes even complex physical systems from few data samples for which other methods that operate either directly in the system state space or on generic latent manifolds fail.
In this paper, we introduce the finite difference weighted essentially non-oscillatory (WENO) scheme based on the neural network for hyperbolic conservation laws. We employ the supervised learning and design two loss functions, one with the mean squared error and the other with the mean squared logarithmic error, where the WENO3-JS weights are computed as the labels. Each loss function consists of two components where the first component compares the difference between the weights from the neural network and WENO3-JS weights, while the second component matches the output weights of the neural network and the linear weights. The former of the loss function enforces the neural network to follow the WENO properties, implying that there is no need for the post-processing layer. Additionally the latter leads to better performance around discontinuities. As a neural network structure, we choose the shallow neural network (SNN) for computational efficiency with the Delta layer consisting of the normalized undivided differences. These constructed WENO3-SNN schemes show the outperformed results in one-dimensional examples and improved behavior in two-dimensional examples, compared with the simulations from WENO3-JS and WENO3-Z.
We review some recent development in the theory of spatial extremes related to Pareto Processes and modeling of threshold exceedances. We provide theoretical background, methodology for modeling, simulation and inference as well as an illustration to wave height modelling. This preprint is an author version of a chapter to appear in a collaborative book.
We investigate the set of invariant idempotent probabilities for countable idempotent iterated function systems (IFS) defined in compact metric spaces. We demonstrate that, with constant weights, there exists a unique invariant idempotent probability. Utilizing Secelean's approach to countable IFSs, we introduce partially finite idempotent IFSs and prove that the sequence of invariant idempotent measures for these systems converges to the invariant measure of the original countable IFS. We then apply these results to approximate such measures with discrete systems, producing, in the one-dimensional case, data series whose Higuchi fractal dimension can be calculated. Finally, we provide numerical approximations for two-dimensional cases and discuss the application of generalized Higuchi dimensions in these scenarios.
In this work, we aim to develop energy-stable parametric finite element approximations for a sharp-interface model with strong surface energy anisotropy, which is derived from the first variation of an energy functional composed of film/vapor interfacial energy, substrate energy, and regularized Willmore energy. By introducing two geometric relations, we innovatively establish an equivalent regularized sharp-interface model and further construct an energy-stable parametric finite element algorithm for this equivalent model. We provide a detailed proof of the energy stability of the numerical scheme, addressing a gap in the relevant theory. Additionally, we develop another structure-preserving parametric finite element scheme that can preserve both area conservation and energy stability. Finally, we present several numerical simulations to show accuracy and efficiency as well as some structure-preserving properties of the proposed numerical methods. More importantly, extensive numerical simulations reveal that our schemes provide better mesh quality and are more suitable for long-term computations.
In the field of autonomous driving research, the use of immersive virtual reality (VR) techniques is widespread to enable a variety of studies under safe and controlled conditions. However, this methodology is only valid and consistent if the conduct of participants in the simulated setting mirrors their actions in an actual environment. In this paper, we present a first and innovative approach to evaluating what we term the behavioural gap, a concept that captures the disparity in a participant's conduct when engaging in a VR experiment compared to an equivalent real-world situation. To this end, we developed a digital twin of a pre-existed crosswalk and carried out a field experiment (N=18) to investigate pedestrian-autonomous vehicle interaction in both real and simulated driving conditions. In the experiment, the pedestrian attempts to cross the road in the presence of different driving styles and an external Human-Machine Interface (eHMI). By combining survey-based and behavioural analysis methodologies, we develop a quantitative approach to empirically assess the behavioural gap, as a mechanism to validate data obtained from real subjects interacting in a simulated VR-based environment. Results show that participants are more cautious and curious in VR, affecting their speed and decisions, and that VR interfaces significantly influence their actions.
In this work, we use the monolithic convex limiting (MCL) methodology to enforce relevant inequality constraints in implicit finite element discretizations of the compressible Euler equations. In this context, preservation of invariant domains follows from positivity preservation for intermediate states of the density and internal energy. To avoid spurious oscillations, we additionally impose local maximum principles on intermediate states of the density, velocity components, and specific total energy. For the backward Euler time stepping, we show the invariant domain preserving (IDP) property of the fully discrete MCL scheme by constructing a fixed-point iteration that is IDP and converges under a strong time step restriction. Our iterative solver for the nonlinear discrete problem employs a more efficient fixed-point iteration. The matrix of the associated linear system is a robust low-order Jacobian approximation that exploits the homogeneity property of the flux function. The limited antidiffusive terms are treated explicitly. We use positivity preservation as a stopping criterion for nonlinear iterations. The first iteration yields the solution of a linearized semi-implicit problem. This solution possesses the discrete conservation property but is generally not IDP. Further iterations are performed if any non-IDP states are detected. The existence of an IDP limit is guaranteed by our analysis. To facilitate convergence to steady-state solutions, we perform adaptive explicit underrelaxation at the end of each time step. The calculation of appropriate relaxation factors is based on an approximate minimization of nodal entropy residuals. The performance of proposed algorithms and alternative solution strategies is illustrated by the convergence history for standard two-dimensional test problems.
We study weighted basic parallel processes (WBPP), a nonlinear recursive generalisation of weighted finite automata inspired from process algebra and Petri net theory. Our main result is an algorithm of 2-EXPSPACE complexity for the WBPP equivalence problem. While (unweighted) BPP language equivalence is undecidable, we can use this algorithm to decide multiplicity equivalence of BPP and language equivalence of unambiguous BPP, with the same complexity. These are long-standing open problems for the related model of weighted context-free grammars. Our second contribution is a connection between WBPP, power series solutions of systems of polynomial differential equations, and combinatorial enumeration. To this end we consider constructible differentially finite power series (CDF), a class of multivariate differentially algebraic series introduced by Bergeron and Reutenauer in order to provide a combinatorial interpretation to differential equations. CDF series generalise rational, algebraic, and a large class of D-finite (holonomic) series, for which decidability of equivalence was an open problem. We show that CDF series correspond to commutative WBPP series. As a consequence of our result on WBPP and commutativity, we show that equivalence of CDF power series can be decided with 2-EXPTIME complexity. The complexity analysis is based on effective bounds from algebraic geometry, namely on the length of chains of polynomial ideals constructed by repeated application of finitely many, not necessarily commuting derivations of a multivariate polynomial ring. This is obtained by generalising a result of Novikov and Yakovenko in the case of a single derivation, which is noteworthy since generic bounds on ideal chains are non-primitive recursive in general. On the way, we develop the theory of \WBPP~series and \CDF~power series, exposing several of their appealing properties.