This paper presents a regularized recursive identification algorithm with simultaneous on-line estimation of both the model parameters and the algorithm's hyperparameters. A new kernel is proposed to facilitate the algorithm development. The performance of this novel scheme is compared with that of the recursive least squares algorithm in simulation.
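For orientation, the recursive least squares baseline mentioned above can be sketched as follows. This is a minimal illustration of standard recursive least squares only, not the regularized algorithm or the kernel proposed in the paper. The two-tap model, the value of `gamma`, and the function names are assumptions made for the example. Starting from `theta = 0` and `P = I / gamma` with forgetting factor 1 reproduces a ridge estimate with penalty `gamma * ||theta||^2`.

```python
import random


def rls_update(theta, P, phi, y, lam=1.0):
    """One recursive least squares step with forgetting factor lam."""
    n = len(theta)
    # P @ phi (P stays symmetric, so phi^T P is the same vector transposed)
    P_phi = [sum(P[i][j] * phi[j] for j in range(n)) for i in range(n)]
    denom = lam + sum(phi[i] * P_phi[i] for i in range(n))
    gain = [v / denom for v in P_phi]
    # Prediction error using the previous estimate
    err = y - sum(phi[i] * theta[i] for i in range(n))
    theta_new = [theta[i] + gain[i] * err for i in range(n)]
    P_new = [[(P[i][j] - gain[i] * P_phi[j]) / lam for j in range(n)]
             for i in range(n)]
    return theta_new, P_new


def identify_demo(n_samples=200, gamma=1e-3, seed=0):
    """Identify a hypothetical model y_t = 2.0*u_t - 0.5*u_{t-1} (noise-free)."""
    rng = random.Random(seed)
    true_theta = [2.0, -0.5]           # assumed coefficients, for illustration
    theta = [0.0, 0.0]                 # prior mean
    P = [[1.0 / gamma, 0.0], [0.0, 1.0 / gamma]]  # ridge-type initialization
    u_prev = 0.0
    for _ in range(n_samples):
        u = rng.uniform(-1.0, 1.0)
        phi = [u, u_prev]
        y = true_theta[0] * u + true_theta[1] * u_prev
        theta, P = rls_update(theta, P, phi, y)
        u_prev = u
    return theta
```

Calling `identify_demo()` returns an estimate close to the assumed coefficients. In this baseline `gamma` is fixed by hand, whereas the abstract describes estimating such hyperparameters on-line.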
We introduce an algorithm that simplifies the construction of efficient estimators, making them accessible to a broader audience. 'Dimple' takes as input computer code representing a parameter of interest and outputs an efficient estimator. Unlike standard approaches, it does not require users to derive a functional derivative known as the efficient influence function. Dimple avoids this task by applying automatic differentiation to the statistical functional of interest. Doing so requires expressing this functional as a composition of primitives satisfying a novel differentiability condition. Dimple also uses this composition to determine the nuisances it must estimate. In software, primitives can be implemented independently of one another and reused across different estimation problems. We provide a proof-of-concept Python implementation and showcase through examples how it allows users to go from parameter specification to efficient estimation with just a few lines of code.
Noninformative priors constructed for estimation purposes are usually not appropriate for model selection and testing. The methodology of integral priors was developed to get prior distributions for Bayesian model selection when comparing two models, modifying initial improper reference priors. We propose a generalization of this methodology to more than two models. Our approach adds an artificial copy of each model under comparison by compactifying the parametric space and creating an ergodic Markov chain across all models that returns the integral priors as marginals of the stationary distribution. Besides the guarantee of their existence and the lack of paradoxes attached to estimation reference priors, an additional advantage of this methodology is that the simulation of this Markov chain is straightforward as it only requires simulations of imaginary training samples for all models and from the corresponding posterior distributions. This renders its implementation automatic and generic, both in the nested case and in the nonnested case.
We study the problem of parametric estimation for a continuously observed stochastic differential equation driven by fractional Brownian motion. Under some assumptions on the drift and diffusion coefficients, we construct the maximum likelihood estimator of the drift parameter and establish its asymptotic normality and moment convergence as the small dispersion coefficient vanishes.
We present a method for computing nearly singular integrals that occur when single or double layer surface integrals, for harmonic potentials or Stokes flow, are evaluated at points nearby. Such values could be needed in solving an integral equation when one surface is close to another or to obtain values at grid points. We replace the singular kernel with a regularized version having a length parameter $\delta$ in order to control discretization error. Analysis near the singularity leads to an expression for the error due to regularization which has terms with unknown coefficients multiplying known quantities. By computing the integral with three choices of $\delta$ we can solve for an extrapolated value that has regularization error reduced to $O(\delta^5)$, uniformly for target points on or near the surface. In examples with $\delta/h$ constant and moderate resolution we observe total error about $O(h^5)$ close to the surface. For convergence as $h \to 0$ we can choose $\delta$ proportional to $h^q$ with $q < 1$ to ensure the discretization error is dominated by the regularization error. With $q = 4/5$ we find errors about $O(h^4)$. For harmonic potentials we extend the approach to a version with $O(\delta^7)$ regularization; it typically has smaller errors but the order of accuracy is less predictable.
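The extrapolation from three choices of $\delta$ can be sketched as a small linear solve. The sketch assumes the regularized value has the form $I_\delta = I + c_1 a_1(\delta) + c_2 a_2(\delta) + O(\delta^5)$ with known factors $a_1, a_2$ and unknown $c_1, c_2$. The factors used here ($\delta$ and $\delta^3$) are placeholders chosen for illustration, since the abstract does not state them, and the "computed" integrals are synthetic numbers rather than surface quadratures.

```python
def solve_linear(A, b):
    """Gaussian elimination with partial pivoting for a small dense system."""
    n = len(b)
    M = [row[:] + [rhs] for row, rhs in zip(A, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(M[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (M[r][n] - s) / M[r][r]
    return x


def extrapolate(values, deltas, a1, a2):
    """Solve I + c1*a1(d) + c2*a2(d) = I_d for three deltas; return I."""
    A = [[1.0, a1(d), a2(d)] for d in deltas]
    I, _c1, _c2 = solve_linear(A, list(values))
    return I


def demo_error(d):
    """Error of the extrapolated value on synthetic data with an O(d^5) term."""
    I_true, c1, c2, c3 = 1.0, 0.7, -1.3, 1.0      # made-up coefficients
    deltas = [2.0 * d, 3.0 * d, 4.0 * d]
    values = [I_true + c1 * t + c2 * t**3 + c3 * t**5 for t in deltas]
    # Placeholder known factors (assumption): a1 = delta, a2 = delta^3
    return extrapolate(values, deltas, lambda t: t, lambda t: t**3) - I_true
```

In this synthetic setting, halving `d` reduces `demo_error(d)` by a factor of about 32, consistent with a remaining error of order $\delta^5$.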
We propose a test for the identification of causal effects in mediation and dynamic treatment models that is based on two sets of observed variables, namely covariates to be controlled for and suspected instruments, building on the test by Huber and Kueck (2022) for single treatment models. We consider models with a sequential assignment of a treatment and a mediator to assess the direct treatment effect (net of the mediator), the indirect treatment effect (via the mediator), or the joint effect of both treatment and mediator. We establish testable conditions for identifying such effects in observational data. These conditions jointly imply (1) the exogeneity of the treatment and the mediator conditional on covariates and (2) the validity of distinct instruments for the treatment and the mediator, meaning that the instruments do not directly affect the outcome (other than through the treatment or mediator) and are unconfounded given the covariates. Our framework extends to post-treatment sample selection or attrition problems when replacing the mediator by a selection indicator for observing the outcome, enabling joint testing of the selectivity of treatment and attrition. We propose a machine learning-based test to control for covariates in a data-driven manner and analyze its finite sample performance in a simulation study. Additionally, we apply our method to Slovak labor market data and find that our testable implications are not rejected for a sequence of training programs typically considered in dynamic treatment evaluations.
This paper introduces an unsupervised method to estimate the class separability of text datasets from a topological point of view. Using persistent homology, we demonstrate how tracking the evolution of embedding manifolds during training can inform about class separability. More specifically, we show how this technique can be applied to detect when the training process stops improving the separability of the embeddings. Our results, validated across binary and multi-class text classification tasks, show that the proposed method's estimates of class separability align with those obtained from supervised methods. This approach offers a novel perspective on monitoring and improving the fine-tuning of sentence transformers for classification tasks, particularly in scenarios where labeled data is scarce. We also discuss how tracking these quantities can provide additional insights into the properties of the trained classifier.
We propose a scalable variational Bayes method for statistical inference for a single or low-dimensional subset of the coordinates of a high-dimensional parameter in sparse linear regression. Our approach relies on assigning a mean-field approximation to the nuisance coordinates and carefully modelling the conditional distribution of the target given the nuisance. This requires only a preprocessing step and preserves the computational advantages of mean-field variational Bayes, while ensuring accurate and reliable inference for the target parameter, including for uncertainty quantification. We investigate the numerical performance of our algorithm, showing that it performs competitively with existing methods. We further establish accompanying theoretical guarantees for estimation and uncertainty quantification in the form of a Bernstein--von Mises theorem.
Making inference with spatial extremal dependence models can be computationally burdensome since they involve intractable and/or censored likelihoods. Building on recent advances in likelihood-free inference with neural Bayes estimators, that is, neural networks that approximate Bayes estimators, we develop highly efficient estimators for censored peaks-over-threshold models that use data augmentation techniques to encode censoring information in the neural network input. Our new method provides a paradigm shift that challenges traditional censored likelihood-based inference methods for spatial extremal dependence models. Our simulation studies highlight significant gains in both computational and statistical efficiency, relative to competing likelihood-based approaches, when applying our novel estimators to make inference with popular extremal dependence models, such as max-stable, $r$-Pareto, and random scale mixture process models. We also illustrate that it is possible to train a single neural Bayes estimator for a general censoring level, precluding the need to retrain the network when the censoring level is changed. We illustrate the efficacy of our estimators by making fast inference on hundreds of thousands of high-dimensional spatial extremal dependence models to assess extreme particulate matter 2.5 microns or less in diameter (${\rm PM}_{2.5}$) concentration over the whole of Saudi Arabia.
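One plausible way to encode censoring information in the network input is sketched below. This is an assumption about the general idea, not the authors' exact construction: values at or below the threshold are replaced by a constant, and a binary indicator channel records which sites were censored. The function name `augment_censored` and the `fill_value` parameter are hypothetical.

```python
def augment_censored(field, threshold, fill_value=0.0):
    """Return a two-channel input: censored values and a censoring indicator.

    field      -- list of observations at the spatial sites
    threshold  -- censoring threshold (observations <= threshold are censored)
    fill_value -- constant written in place of censored observations
    """
    censored = [z if z > threshold else fill_value for z in field]
    indicator = [1.0 if z <= threshold else 0.0 for z in field]
    return [censored, indicator]


# Example: two of four sites fall below the threshold and are masked.
channels = augment_censored([0.2, 1.5, 0.9, 3.0], threshold=1.0)
# channels == [[0.0, 1.5, 0.0, 3.0], [1.0, 0.0, 1.0, 0.0]]
```

A network trained on such two-channel inputs can distinguish a censored site from a site whose observed value happens to equal the fill value.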
Mechanical problems in noncircular and asymmetrical tunnelling can be analysed using the complex variable method with a suitable conformal mapping. Existing solution schemes of conformal mapping for noncircular tunnels generally need an iteration or optimization strategy, and are thereby mathematically complicated. This paper proposes a new bidirectional conformal mapping for deep and shallow tunnels of noncircular and asymmetrical shapes by incorporating the Charge Simulation Method. The solution scheme of this new bidirectional conformal mapping only involves a pair of linear systems, and is therefore logically straightforward, computationally efficient, and easy to code. New numerical strategies are developed to deal with possible sharp corners of the cavity by small arc simulation and densified collocation points. Several numerical examples are presented to illustrate the geometrical usage of the new bidirectional conformal mapping. Furthermore, the new bidirectional conformal mapping is embedded into two complex variable solutions of noncircular and asymmetrical shallow tunnelling in gravitational geomaterial with reasonable far-field displacement. The respective comparisons with a finite element solution and an existing analytical solution show good agreement, indicating the feasible mechanical usage of the new bidirectional conformal mapping.
Suitable discretizations through tensor product formulas of popular multidimensional operators (diffusion or diffusion--advection, for instance) lead to matrices with $d$-dimensional Kronecker sum structure. For evolutionary Partial Differential Equations containing such operators and integrated in time with exponential integrators, it is then of paramount importance to efficiently approximate the actions of $\varphi$-functions of the arising matrices. In this work, we show how to produce directional split approximations of third order with respect to the time step size. They conveniently employ tensor-matrix products (the so-called $\mu$-mode product and related Tucker operator, realized in practice with high performance level 3 BLAS), and allow for the effective usage of exponential Runge--Kutta integrators up to order three. The technique can also be efficiently implemented on modern computer hardware such as Graphics Processing Units. The approach has been successfully tested against state-of-the-art techniques on two well-known physical models that lead to Turing patterns, namely the 2D Schnakenberg and the 3D FitzHugh--Nagumo systems, on different hardware and software architectures.
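The benefit of the Kronecker sum structure can be illustrated in two dimensions for the matrix exponential, where the directional splitting is exact: $\exp(I \otimes A_1 + A_2 \otimes I)\,\mathrm{vec}(U) = \mathrm{vec}(\exp(A_1)\, U \exp(A_2)^{T})$. The sketch below checks this identity numerically with small hand-picked matrices and a plain Taylor series for the exponential, both choices made for illustration. It does not reproduce the third-order approximations for higher $\varphi$-functions developed in the paper.

```python
def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]


def identity(n):
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]


def kron(A, B):
    """Kronecker product of two dense matrices given as nested lists."""
    return [[a * b for a in row_a for b in row_b] for row_a in A for row_b in B]


def expm_taylor(A, terms=30):
    """Truncated Taylor series for exp(A); adequate for small-norm matrices."""
    n = len(A)
    result, term = identity(n), identity(n)
    for k in range(1, terms + 1):
        term = [[x / k for x in row] for row in matmul(term, A)]
        result = [[r + t for r, t in zip(rr, tr)] for rr, tr in zip(result, term)]
    return result


def kron_sum_demo(tau=0.1):
    """Max difference between the full and the directionally split action."""
    n1, n2 = 2, 3
    A1 = [[-2 * tau, tau], [tau, -2 * tau]]                  # acts on direction 1
    A2 = [[-2 * tau, tau, 0.0], [tau, -2 * tau, tau], [0.0, tau, -2 * tau]]
    U = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]]                   # n1 x n2 data
    # Full approach: build the (n1*n2) x (n1*n2) Kronecker sum explicitly.
    K1, K2 = kron(identity(n2), A1), kron(A2, identity(n1))
    K = [[a + b for a, b in zip(r1, r2)] for r1, r2 in zip(K1, K2)]
    vec_u = [[U[i][j]] for j in range(n2) for i in range(n1)]  # column-major vec
    full = matmul(expm_taylor(K), vec_u)
    # Split approach: only the small matrices are exponentiated.
    E1, E2 = expm_taylor(A1), expm_taylor(A2)
    E2_t = [list(col) for col in zip(*E2)]
    split = matmul(matmul(E1, U), E2_t)
    return max(abs(full[j * n1 + i][0] - split[i][j])
               for i in range(n1) for j in range(n2))
```

`kron_sum_demo()` returns a difference at the level of rounding error. The split form works only with the small one-directional matrices, which is what makes tensor-matrix products attractive for larger grids.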