We study the parameter estimation method for linear regression models with possibly skewed stable distributed errors. Our estimation procedure consists of two stages: first, for the regression coefficients, the Cauchy quasi-maximum likelihood estimator (CQMLE) is considered after taking the differences to remove the skewness of noise, and we prove its asymptotic normality and tail-probability estimate; second, as for stable-distribution parameters, we consider the moment estimators based on the symmetrized and centered residuals and prove their $\sqrt{n}$-consistency. To derive the $\sqrt{n}$-consistency, we essentially used the tail-probability estimate of the CQMLE. The proposed estimation procedure has a very low computational load and is much less time-consuming compared with the maximum-likelihood estimator. Further, our estimator can be effectively used as an initial value of the numerical optimization of the log-likelihood.
We propose a new Bayesian strategy for adaptation to smoothness in nonparametric models based on heavy tailed series priors. We illustrate it in a variety of settings, showing in particular that the corresponding Bayesian posterior distributions achieve adaptive rates of contraction in the minimax sense (up to logarithmic factors) without the need to sample hyperparameters. Unlike many existing procedures, where a form of direct model (or estimator) selection is performed, the method can be seen as performing a soft selection through the prior tail. In Gaussian regression, such heavy tailed priors are shown to lead to (near-)optimal simultaneous adaptation both in the $L^2$- and $L^\infty$-sense. Results are also derived for linear inverse problems, for anisotropic Besov classes, and for certain losses in more general models through the use of tempered posterior distributions. We present numerical simulations corroborating the theory.
The aim of the present work is to design, analyze theoretically, and test numerically, a generalized Dryja-Smith-Widlund (GDSW) preconditioner for composite Discontinuous Galerkin discretizations of multicompartment parabolic reaction-diffusion equations, where the solution can exhibit natural discontinuities across the domain. We prove that the resulting preconditioned operator for the solution of the discrete system arising at each time step converges with a scalable and quasi-optimal upper bound for the condition number. The GDSW preconditioner is then applied to the EMI (Extracellular - Membrane - Intracellular) reaction-diffusion system, recently proposed to model microscopically the spatiotemporal evolution of cardiac bioelectrical potentials. Numerical tests validate the scalability and quasi-optimality of the EMI-GDSW preconditioner, and investigate its robustness with respect to the time step size as well as jumps in the diffusion coefficients.
In survival analysis, prediction models are needed as stand-alone tools and in applications of causal inference to estimate nuisance parameters. The super learner is a machine learning algorithm which combines a library of prediction models into a meta learner based on cross-validated loss. In right-censored data, the choice of the loss function and the estimation of the expected loss need careful consideration. We introduce the state learner, a new super learner for survival analysis, which simultaneously evaluates libraries of prediction models for the event of interest and the censoring distribution. The state learner can be applied to all types of survival models, works in the presence of competing risks, and does not require a single pre-specified estimator of the conditional censoring distribution. We establish an oracle inequality for the state learner and investigate its performance through numerical experiments. We illustrate the application of the state learner with prostate cancer data, as a stand-alone prediction tool, and, for causal inference, as a way to estimate the nuisance parameter models of a smooth statistical functional.
We study the problem of training diffusion models to sample from a distribution with a given unnormalized density or energy function. We benchmark several diffusion-structured inference methods, including simulation-based variational approaches and off-policy methods (continuous generative flow networks). Our results shed light on the relative advantages of existing algorithms while bringing into question some claims from past work. We also propose a novel exploration strategy for off-policy methods, based on local search in the target space with the use of a replay buffer, and show that it improves the quality of samples on a variety of target distributions. Our code for the sampling methods and benchmarks studied is made public at //github.com/GFNOrg/gfn-diffusion as a base for future work on diffusion models for amortized inference.
In a regression model with multiple response variables and multiple explanatory variables, if the difference of the mean vectors of the response variables for different values of explanatory variables is always in the direction of the first principal eigenvector of the covariance matrix of the response variables, then it is called a multivariate allometric regression model. This paper studies the estimation of the first principal eigenvector in the multivariate allometric regression model. A class of estimators that includes conventional estimators is proposed based on weighted sum-of-squares matrices of regression sum-of-squares matrix and residual sum-of-squares matrix. We establish an upper bound of the mean squared error of the estimators contained in this class, and the weight value minimizing the upper bound is derived. Sufficient conditions for the consistency of the estimators are discussed in weak identifiability regimes under which the difference of the largest and second largest eigenvalues of the covariance matrix decays asymptotically and in ``large $p$, large $n$" regimes, where $p$ is the number of response variables and $n$ is the sample size. Several numerical results are also presented.
This paper focuses on the numerical scheme for delay-type stochastic McKean-Vlasov equations (DSMVEs) driven by fractional Brownian motion with Hurst parameter $H\in (0,1/2)\cup (1/2,1)$. The existence and uniqueness of the solutions to such DSMVEs whose drift coefficients contain polynomial delay terms are proved by exploting the Banach fixed point theorem. Then the propagation of chaos between interacting particle system and non-interacting system in $\mathcal{L}^p$ sense is shown. We find that even if the delay term satisfies the polynomial growth condition, the unmodified classical Euler-Maruyama scheme still can approximate the corresponding interacting particle system without the particle corruption. The convergence rates are revealed for $H\in (0,1/2)\cup (1/2,1)$. Finally, as an example that closely fits the original equation, a stochastic opinion dynamics model with both extrinsic memory and intrinsic memory is simulated to illustrate the plausibility of the theoretical result.
Despite the success of adaptive time-stepping in ODE simulation, it has so far seen few applications for Stochastic Differential Equations (SDEs). To simulate SDEs adaptively, methods such as the Virtual Brownian Tree (VBT) have been developed, which can generate Brownian motion (BM) non-chronologically. However, in most applications, knowing only the values of Brownian motion is not enough to achieve a high order of convergence; for that, we must compute time-integrals of BM such as $\int_s^t W_r \, dr$. With the aim of using high order SDE solvers adaptively, we extend the VBT to generate these integrals of BM in addition to the Brownian increments. A JAX-based implementation of our construction is included in the popular Diffrax library (//github.com/patrick-kidger/diffrax). Since the entire Brownian path produced by VBT is uniquely determined by a single PRNG seed, previously generated samples need not be stored, which results in a constant memory footprint and enables experiment repeatability and strong error estimation. Based on binary search, the VBT's time complexity is logarithmic in the tolerance parameter $\varepsilon$. Unlike the original VBT algorithm, which was only precise at some dyadic times, we prove that our construction exactly matches the joint distribution of the Brownian motion and its time integrals at any query times, provided they are at least $\varepsilon$ apart. We present two applications of adaptive high order solvers enabled by our new VBT. Using adaptive solvers to simulate a high-volatility CIR model, we achieve more than twice the convergence order of constant stepping. We apply an adaptive third order underdamped or kinetic Langevin solver to an MCMC problem, where our approach outperforms the No U-Turn Sampler, while using only a tenth of its function evaluations.
Data visualization and dimension reduction for regression between a general metric space-valued response and Euclidean predictors is proposed. Current Fr\'ech\'et dimension reduction methods require that the response metric space be continuously embeddable into a Hilbert space, which imposes restriction on the type of metric and kernel choice. We relax this assumption by proposing a Euclidean embedding technique which avoids the use of kernels. Under this framework, classical dimension reduction methods such as ordinary least squares and sliced inverse regression are extended. An extensive simulation experiment demonstrates the superior performance of the proposed method on synthetic data compared to existing methods where applicable. The real data analysis of factors influencing the distribution of COVID-19 transmission in the U.S. and the association between BMI and structural brain connectivity of healthy individuals are also investigated.
Probabilistic graphical models are widely used to model complex systems under uncertainty. Traditionally, Gaussian directed graphical models are applied for analysis of large networks with continuous variables as they can provide conditional and marginal distributions in closed form simplifying the inferential task. The Gaussianity and linearity assumptions are often adequate, yet can lead to poor performance when dealing with some practical applications. In this paper, we model each variable in graph G as a polynomial regression of its parents to capture complex relationships between individual variables and with a utility function of polynomial form. We develop a message-passing algorithm to propagate information throughout the network solely using moments which enables the expected utility scores to be calculated exactly. Our propagation method scales up well and enables to perform inference in terms of a finite number of expectations. We illustrate how the proposed methodology works with examples and in an application to decision problems in energy planning and for real-time clinical decision support.
We consider a robust estimation of linear regression coefficients. In this note, we focus on the case where the covariates are sampled from an $L$-subGaussian distribution with unknown covariance, the noises are sampled from a distribution with a bounded absolute moment and both covariates and noises may be contaminated by an adversary. We derive an estimation error bound, which depends on the stable rank and the condition number of the covariance matrix of covariates with a polynomial computational complexity of estimation.