Data assimilation algorithms integrate prior information from numerical model simulations with observed data. Ensemble-based filters, regarded as state-of-the-art, are widely employed for large-scale estimation tasks in disciplines such as geoscience and meteorology. Despite their inability to produce the true posterior distribution for nonlinear systems, their robustness and capacity for state tracking are noteworthy. In contrast, Particle filters yield the correct distribution in the ensemble limit but require substantially larger ensemble sizes than ensemble-based filters to maintain stability in higher-dimensional spaces. It is essential to transcend traditional Gaussian assumptions to achieve realistic quantification of uncertainties. One approach involves the hybridisation of filters, facilitated by tempering, to harness the complementary strengths of different filters. A new adaptive tempering method is proposed to tune the underlying schedule, aiming to systematically surpass the performance previously achieved. Although promising numerical results for certain filter combinations in toy examples exist in the literature, the tuning of hyperparameters presents a considerable challenge. A deeper understanding of these interactions is crucial for practical applications.
We consider optimal experimental design (OED) for Bayesian nonlinear inverse problems governed by partial differential equations (PDEs) under model uncertainty. Specifically, we consider inverse problems in which, in addition to the inversion parameters, the governing PDEs include secondary uncertain parameters. We focus on problems with infinite-dimensional inversion and secondary parameters and present a scalable computational framework for optimal design of such problems. The proposed approach enables Bayesian inversion and OED under uncertainty within a unified framework. We build on the Bayesian approximation error (BAE) approach, to incorporate modeling uncertainties in the Bayesian inverse problem, and methods for A-optimal design of infinite-dimensional Bayesian nonlinear inverse problems. Specifically, a Gaussian approximation to the posterior at the maximum a posteriori probability point is used to define an uncertainty aware OED objective that is tractable to evaluate and optimize. In particular, the OED objective can be computed at a cost, in the number of PDE solves, that does not grow with the dimension of the discretized inversion and secondary parameters. The OED problem is formulated as a binary bilevel PDE constrained optimization problem and a greedy algorithm, which provides a pragmatic approach, is used to find optimal designs. We demonstrate the effectiveness of the proposed approach for a model inverse problem governed by an elliptic PDE on a three-dimensional domain. Our computational results also highlight the pitfalls of ignoring modeling uncertainties in the OED and/or inference stages.
Generalized linear models (GLMs) arguably represent the standard approach for statistical regression beyond the Gaussian likelihood scenario. When Bayesian formulations are employed, the general absence of a tractable posterior distribution has motivated the development of deterministic approximations, which are generally more scalable than sampling techniques. Among them, expectation propagation (EP) showed extreme accuracy, usually higher than many variational Bayes solutions. However, the higher computational cost of EP posed concerns about its practical feasibility, especially in high-dimensional settings. We address these concerns by deriving a novel efficient formulation of EP for GLMs, whose cost scales linearly in the number of covariates p. This reduces the state-of-the-art O(p^2 n) per-iteration computational cost of the EP routine for GLMs to O(p n min{p,n}), with n being the sample size. We also show that, for binary models and log-linear GLMs approximate predictive means can be obtained at no additional cost. To preserve efficient moment matching for count data, we propose employing a combination of log-normal Laplace transform approximations, avoiding numerical integration. These novel results open the possibility of employing EP in settings that were believed to be practically impossible. Improvements over state-of-the-art approaches are illustrated both for simulated and real data. The efficient EP implementation is available at //github.com/niccoloanceschi/EPglm.
Supervised machine learning methods for geological mapping via remote sensing face limitations due to the scarcity of accurately labelled training data that can be addressed by unsupervised learning, such as dimensionality reduction and clustering. Dimensionality reduction methods have the potential to play a crucial role in improving the accuracy of geological maps. Although conventional dimensionality reduction methods may struggle with nonlinear data, unsupervised deep learning models such as autoencoders can model non-linear relationships. Stacked autoencoders feature multiple interconnected layers to capture hierarchical data representations useful for remote sensing data. This study presents an unsupervised machine learning-based framework for processing remote sensing data using stacked autoencoders for dimensionality reduction and k-means clustering for mapping geological units. We use Landsat 8, ASTER, and Sentinel-2 datasets to evaluate the framework for geological mapping of the Mutawintji region in Western New South Wales, Australia. We also compare stacked autoencoders with principal component analysis and canonical autoencoders. Our results reveal that the framework produces accurate and interpretable geological maps, efficiently discriminating rock units. We find that the accuracy of stacked autoencoders ranges from 86.6 % to 90 %, depending on the remote sensing data type, which is superior to their counterparts. We also find that the generated maps align with prior geological knowledge of the study area while providing novel insights into geological structures.
Compared to widely used likelihood-based approaches, the minimum contrast (MC) method offers a computationally efficient method for estimation and inference of spatial point processes. These relative gains in computing time become more pronounced when analyzing complicated multivariate point process models. Despite this, there has been little exploration of the MC method for multivariate spatial point processes. Therefore, this article introduces a new MC method for parametric multivariate spatial point processes. A contrast function is computed based on the trace of the power of the difference between the conjectured $K$-function matrix and its nonparametric unbiased edge-corrected estimator. Under standard assumptions, we derive the asymptotic normality of our MC estimator. The performance of the proposed method is demonstrated through simulation studies of bivariate log-Gaussian Cox processes and five-variate product-shot-noise Cox processes.
This paper introduces a new numerical scheme for a system that includes evolution equations describing a perfect plasticity model with a time-dependent yield surface. We demonstrate that the solution to the proposed scheme is stable under suitable norms. Moreover, the stability leads to the existence of an exact solution, and we also prove that the solution to the proposed scheme converges strongly to the exact solution under suitable norms.
We present a space-time continuous-Galerkin finite element method for solving incompressible Navier-Stokes equations. To ensure stability of the discrete variational problem, we apply ideas from the variational multi-scale method. The finite element problem is posed on the ``full" space-time domain, considering time as another dimension. We provide a rigorous analysis of the stability and convergence of the stabilized formulation. And finally, we apply this method on two benchmark problems in computational fluid dynamics, namely, lid-driven cavity flow and flow past a circular cylinder. We validate the current method with existing results from literature and show that very large space-time blocks can be solved using our approach.
What types of differences among causal structures with latent variables are impossible to distinguish by statistical data obtained by probing each visible variable? If the probing scheme is simply passive observation, then it is well-known that many different causal structures can realize the same joint probability distributions. Even for the simplest case of two visible variables, for instance, one cannot distinguish between one variable being a causal parent of the other and the two variables sharing a latent common cause. However, it is possible to distinguish between these two causal structures if we have recourse to more powerful probing schemes, such as the possibility of intervening on one of the variables and observing the other. Herein, we address the question of which causal structures remain indistinguishable even given the most informative types of probing schemes on the visible variables. We find that two causal structures remain indistinguishable if and only if they are both associated with the same mDAG structure (as defined by Evans (2016)). We also consider the question of when one causal structure dominates another in the sense that it can realize all of the joint probability distributions that can be realized by the other using a given probing scheme. (Equivalence of causal structures is the special case of mutual dominance.) Finally, we investigate to what extent one can weaken the probing schemes implemented on the visible variables and still have the same discrimination power as a maximally informative probing scheme.
We deal with a model selection problem for structural equation modeling (SEM) with latent variables for diffusion processes. Based on the asymptotic expansion of the marginal quasi-log likelihood, we propose two types of quasi-Bayesian information criteria of the SEM. It is shown that the information criteria have model selection consistency. Furthermore, we examine the finite-sample performance of the proposed information criteria by numerical experiments.
We consider functional linear regression models where functional outcomes are associated with scalar predictors by coefficient functions with shape constraints, such as monotonicity and convexity, that apply to sub-domains of interest. To validate the partial shape constraints, we propose testing a composite hypothesis of linear functional constraints on regression coefficients. Our approach employs kernel- and spline-based methods within a unified inferential framework, evaluating the statistical significance of the hypothesis by measuring an $L^2$-distance between constrained and unconstrained model fits. In the theoretical study of large-sample analysis under mild conditions, we show that both methods achieve the standard rate of convergence observed in the nonparametric estimation literature. Through numerical experiments of finite-sample analysis, we demonstrate that the type I error rate keeps the significance level as specified across various scenarios and that the power increases with sample size, confirming the consistency of the test procedure under both estimation methods. Our theoretical and numerical results provide researchers the flexibility to choose a method based on computational preference. The practicality of partial shape-constrained inference is illustrated by two data applications: one involving clinical trials of NeuroBloc in type A-resistant cervical dystonia and the other with the National Institute of Mental Health Schizophrenia Study.
This contribution introduces a model order reduction approach for an advection-reaction problem with a parametrized reaction function. The underlying discretization uses an ultraweak formulation with an $L^2$-like trial space and an 'optimal' test space as introduced by Demkowicz et al. This ensures the stability of the discretization and in addition allows for a symmetric reformulation of the problem in terms of a dual solution which can also be interpreted as the normal equations of an adjoint least-squares problem. Classic model order reduction techniques can then be applied to the space of dual solutions which also immediately gives a reduced primal space. We show that the necessary computations do not require the reconstruction of any primal solutions and can instead be performed entirely on the space of dual solutions. We prove exponential convergence of the Kolmogorov $N$-width and show that a greedy algorithm produces quasi-optimal approximation spaces for both the primal and the dual solution space. Numerical experiments based on the benchmark problem of a catalytic filter confirm the applicability of the proposed method.