We study the efficiency of nonparametric estimation of diffusions (stochastic differential equations driven by Brownian motion) from long stationary trajectories. First, we introduce estimators based on conditional expectations, motivated by the definitions of the drift and diffusion coefficients. These estimators involve time- and space-discretization parameters for computing expected values from discretely sampled stationary data. Next, we analyze the consistency and mean squared error of these estimators as functions of the computational parameters. We derive relationships between the number of observation points and the time- and space-discretization parameters that achieve the optimal rate of convergence while minimizing computational complexity. We illustrate our approach with numerical simulations.
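As a rough illustration of such conditional-expectation estimators, the following Python sketch bins a discretely sampled trajectory in space and averages increments within each bin to estimate the drift and the squared diffusion coefficient; the bin count and the sampling interval play the role of the space- and time-discretization parameters. This is a generic binned estimator for illustration, not the specific construction analyzed in the paper.

```python
import numpy as np

def estimate_drift_diffusion(x, dt, n_bins=50):
    """Binned conditional-expectation estimators of the drift b(x) and the
    squared diffusion a(x) from a discretely sampled trajectory.

    x      : 1-D array of observations X_0, X_dt, ..., X_{N*dt}
    dt     : sampling interval (time-discretization parameter)
    n_bins : number of spatial bins (space-discretization parameter)
    """
    dx = np.diff(x)                    # increments X_{t+dt} - X_t
    states = x[:-1]                    # conditioning states
    edges = np.linspace(states.min(), states.max(), n_bins + 1)
    centers = 0.5 * (edges[:-1] + edges[1:])
    idx = np.clip(np.digitize(states, edges) - 1, 0, n_bins - 1)

    drift = np.full(n_bins, np.nan)
    diff2 = np.full(n_bins, np.nan)
    for k in range(n_bins):
        mask = idx == k
        if mask.any():
            drift[k] = dx[mask].mean() / dt          # ~ E[dX | X = x] / dt
            diff2[k] = (dx[mask] ** 2).mean() / dt   # ~ E[dX^2 | X = x] / dt
    return centers, drift, diff2

# demo on a simulated Ornstein-Uhlenbeck path (true drift -x, diffusion 1)
rng = np.random.default_rng(0)
dt, n = 1e-2, 100_000
x = np.empty(n)
x[0] = 0.0
for i in range(n - 1):
    x[i + 1] = x[i] - x[i] * dt + np.sqrt(dt) * rng.standard_normal()
centers, drift, diff2 = estimate_drift_diffusion(x, dt)
```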
Linear Mixed Effects (LME) models have been widely applied to clustered data analysis in many areas, including marketing research, clinical trials, and biomedical studies. Inference can be conducted via maximum likelihood when the random effects are assumed to be normally distributed. However, in many applications in economics, business, and medicine, it is essential to impose constraints on the regression parameters that reflect their real-world interpretations. In this paper we therefore extend the classical (unconstrained) LME model to allow sign constraints on its overall coefficients. We propose to assume a symmetric doubly truncated Normal (SDTN) distribution on the random effects in place of the unconstrained Normal distribution commonly used in the classical literature. This change substantially complicates estimation, as the exact distribution of the dependent variable becomes analytically intractable. We develop likelihood-based approaches to estimate the unknown model parameters based on an approximation of this exact distribution. Simulation studies show that the proposed constrained model not only improves the real-world interpretability of the results but also achieves model fits comparable to those of the existing unconstrained model.
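As a minimal sketch of the constrained setup, the code below simulates clustered data whose random effects follow a symmetric doubly truncated Normal, so that each cluster-level (overall) coefficient retains the sign of the fixed effect. The SDTN parameterisation used here (truncation symmetric about the mean at ±bound) is an assumption for illustration and may differ from the paper's definition.

```python
import numpy as np
from scipy.stats import truncnorm

rng = np.random.default_rng(1)

def sample_sdtn(mean, sd, bound, size, rng):
    """Draw from a symmetric doubly truncated Normal on [mean - bound, mean + bound].
    Illustrative parameterisation only."""
    a, b = -bound / sd, bound / sd          # standardised truncation limits
    return truncnorm.rvs(a, b, loc=mean, scale=sd, size=size, random_state=rng)

# clustered data: y_ij = x_ij * (beta + u_i) + eps_ij, with beta and beta + u_i >= 0
n_clusters, n_per = 30, 20
beta = 1.5                                             # sign-constrained fixed effect
u = sample_sdtn(0.0, 0.5, bound=beta, size=n_clusters, rng=rng)  # keeps beta + u_i >= 0
x = rng.normal(size=(n_clusters, n_per))
eps = rng.normal(scale=0.3, size=(n_clusters, n_per))
y = x * (beta + u[:, None]) + eps
```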
This paper studies the optimal stationary control of continuous-time linear stochastic systems with both additive and multiplicative noises, using reinforcement learning techniques. Based on policy iteration, we propose a novel off-policy reinforcement learning algorithm, named optimistic least-squares-based policy iteration, which, starting from an initial admissible control policy, iteratively finds near-optimal policies for the optimal stationary control problem directly from input/state data, without explicitly identifying any system matrices. Under mild conditions, the solutions given by the proposed optimistic least-squares-based policy iteration are proved to converge to a small neighborhood of the optimal solution with probability one. An application to a triple inverted pendulum example validates the feasibility and effectiveness of the proposed algorithm.
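For context, the sketch below implements the model-based policy iteration (Kleinman's algorithm) for a deterministic continuous-time LQR problem, which is the backbone that such off-policy, data-driven schemes build on; the algorithm described in the paper instead performs the evaluation step by least squares on input/state data and handles multiplicative noise, neither of which is reflected here.

```python
import numpy as np
from scipy.linalg import solve_continuous_lyapunov

def kleinman_policy_iteration(A, B, Q, R, K0, n_iter=20):
    """Model-based policy iteration for deterministic continuous-time LQR.
    K0 must be stabilising, i.e. A - B @ K0 is Hurwitz."""
    K = K0
    for _ in range(n_iter):
        Ak = A - B @ K
        # policy evaluation: Ak' P + P Ak + Q + K' R K = 0
        P = solve_continuous_lyapunov(Ak.T, -(Q + K.T @ R @ K))
        # policy improvement
        K = np.linalg.solve(R, B.T @ P)
    return K, P

# small example: double integrator
A = np.array([[0.0, 1.0], [0.0, 0.0]])
B = np.array([[0.0], [1.0]])
Q = np.eye(2)
R = np.eye(1)
K0 = np.array([[1.0, 1.0]])          # stabilising initial gain
K, P = kleinman_policy_iteration(A, B, Q, R, K0)
```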
We extend the framework of a posteriori error estimation by preconditioning in [Li, Y., Zikatanov, L.: Computers \& Mathematics with Applications. \textbf{91}, 192-201 (2021)] and derive new a posteriori error estimates for H(curl)-elliptic two-phase interface problems. The proposed error estimator provides two-sided bounds for the discretization error and is robust with respect to coefficient variation under mild assumptions. For H(curl) problems with constant coefficients, the performance of this estimator is numerically compared with the one analyzed in [Sch\"oberl, J.: Math.~Comp. \textbf{77}(262), 633-649 (2008)].
We consider the nonparametric estimation of an S-shaped regression function. The least squares estimator provides a very natural, tuning-free approach, but results in a non-convex optimisation problem, since the inflection point is unknown. We show that the estimator may nevertheless be regarded as a projection onto a finite union of convex cones, which allows us to propose a mixed primal-dual bases algorithm for its efficient, sequential computation. After developing a projection framework that demonstrates the consistency and robustness to misspecification of the estimator, our main theoretical results provide sharp oracle inequalities that yield worst-case and adaptive risk bounds for the estimation of the regression function, as well as a rate of convergence for the estimation of the inflection point. These results reveal not only that the estimator achieves the minimax optimal rate of convergence for both the estimation of the regression function and its inflection point (up to a logarithmic factor in the latter case), but also that it is able to achieve an almost-parametric rate when the true regression function is piecewise affine with not too many affine pieces. Simulations and a real data application to air pollution modelling also confirm the desirable finite-sample properties of the estimator, and our algorithm is implemented in the R package Sshaped.
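The following Python sketch conveys the projection view of the estimator: for each candidate inflection point it projects the data onto the corresponding convex cone of increasing, convex-then-concave vectors by solving a small quadratic program (here with cvxpy), and keeps the best fit. This brute-force loop is for illustration only; the mixed primal-dual bases algorithm implemented in the Sshaped package is far more efficient.

```python
import numpy as np
import cvxpy as cp

def fit_s_shaped(x, y):
    """Brute-force least-squares S-shaped fit at sorted, distinct design
    points x: for each candidate inflection index m, project y onto the
    cone of increasing vectors that are convex left of x[m] and concave
    right of x[m]; return the best of these projections."""
    n = len(x)
    best_rss, best_fit, best_infl = np.inf, None, None
    for m in range(2, n - 2):
        f = cp.Variable(n)
        slopes = cp.multiply(1.0 / np.diff(x), cp.diff(f))
        cons = [slopes >= 0,                      # increasing
                cp.diff(slopes[:m]) >= 0,         # convex on the left
                cp.diff(slopes[m:]) <= 0]         # concave on the right
        prob = cp.Problem(cp.Minimize(cp.sum_squares(y - f)), cons)
        prob.solve()
        if prob.value is not None and prob.value < best_rss:
            best_rss, best_fit, best_infl = prob.value, f.value, x[m]
    return best_fit, best_infl

# noisy logistic curve as a test case
x = np.linspace(0.0, 1.0, 50)
y = 1.0 / (1.0 + np.exp(-10 * (x - 0.4))) + 0.05 * np.random.default_rng(0).standard_normal(50)
fit, inflection = fit_s_shaped(x, y)
```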
More often than not, we encounter problems with varying parameters rather than static ones. In this paper, we treat the estimation of parameters that vary with space. We use the Metropolis-Hastings algorithm as a selection criterion for the maximum filter likelihood. Comparisons are made with joint estimation of both the spatially varying parameters and the state. We illustrate the procedures with two hyperbolic SPDEs: the advection equation and the wave equation. The Metropolis-Hastings procedure yields the better estimates.
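A minimal sketch of this selection step, assuming the filter log-likelihood is available as a black-box function of the spatially varying parameter vector: a random-walk Metropolis-Hastings chain explores parameter space and the best visited state is retained as the maximum-filter-likelihood estimate. The proposal scale and the specific filter are placeholders.

```python
import numpy as np

def metropolis_hastings_select(log_lik, theta0, n_steps=5000, step=0.1, rng=None):
    """Random-walk Metropolis-Hastings over a parameter vector theta, using a
    user-supplied filter log-likelihood as the target; the best visited state
    is kept as the (approximate) maximum-filter-likelihood selection."""
    if rng is None:
        rng = np.random.default_rng()
    theta = np.asarray(theta0, dtype=float)
    ll = log_lik(theta)
    best_theta, best_ll = theta.copy(), ll
    for _ in range(n_steps):
        prop = theta + step * rng.standard_normal(theta.shape)
        ll_prop = log_lik(prop)
        if np.log(rng.uniform()) < ll_prop - ll:      # symmetric proposal
            theta, ll = prop, ll_prop
            if ll > best_ll:
                best_theta, best_ll = theta.copy(), ll
    return best_theta, best_ll
```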
In this work we propose a new algorithm for solving high-dimensional backward stochastic differential equations (BSDEs). Based on a general theta-discretization of the time integrands, we show how to use eXtreme Gradient Boosting (XGBoost) regression to efficiently approximate the resulting conditional expectations in high dimensions. Numerical results illustrate the efficiency and accuracy of the proposed algorithm for solving nonlinear BSDEs in up to $10000$ dimensions.
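The core numerical step is a regression approximation of conditional expectations. The toy sketch below, with placeholder dynamics and terminal function, shows how XGBoost regression can estimate E[g(X_{t+dt}) | X_t] from simulated paths, which is the building block of each backward theta-scheme step.

```python
import numpy as np
from xgboost import XGBRegressor

# Approximate E[g(X_{t+dt}) | X_t] by regressing simulated next-step values
# on the current state. Toy 10-dimensional example with Brownian dynamics.
rng = np.random.default_rng(0)
d, n_paths, dt = 10, 50_000, 0.01
x_t = rng.standard_normal((n_paths, d))                         # samples of X_t
x_next = x_t + np.sqrt(dt) * rng.standard_normal((n_paths, d))  # one Euler step
g = np.sin(x_next).sum(axis=1)                                  # placeholder g(X_{t+dt})

model = XGBRegressor(n_estimators=200, max_depth=6, learning_rate=0.1)
model.fit(x_t, g)                     # regression estimate of x -> E[g(X_{t+dt}) | X_t = x]
cond_exp = model.predict(x_t)         # conditional expectation at the sample points
```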
We develop a Bayesian nonparametric autoregressive model to flexibly estimate general transition densities exhibiting nonlinear lag dependence. Our approach is related to Bayesian density regression using Dirichlet process mixtures, with the Markovian likelihood defined through the conditional distribution obtained from the mixture. This results in a Bayesian nonparametric extension of a mixtures-of-experts model formulation. We address computational challenges to posterior sampling that arise from the Markovian structure in the likelihood. The base model is illustrated with synthetic data from a classical model for population dynamics, as well as a series of waiting times between eruptions of Old Faithful Geyser. We study inferences available through the base model before extending the methodology to include automatic relevance detection among a pre-specified set of lags. Inference for global and local lag selection is explored with additional simulation studies, and the methods are illustrated through analysis of an annual time series of pink salmon abundance in a stream in Alaska. We further explore and compare transition density estimation performance for alternative configurations of the proposed model.
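To make the mixtures-of-experts structure concrete, the sketch below computes the transition density implied by a finite bivariate Gaussian mixture over consecutive observations: conditioning each component on the lagged value yields covariate-dependent weights and component-wise linear predictions. The finite, fixed-weight mixture is a stand-in for the Dirichlet process mixture used in the paper.

```python
import numpy as np
from scipy.stats import norm

def transition_density(y, x, w, mu, Sigma):
    """Transition density f(y_t = y | y_{t-1} = x) implied by a finite
    bivariate Gaussian mixture over (y_{t-1}, y_t). Conditioning each
    component gives the mixtures-of-experts form: lag-dependent weights
    times component-wise Gaussian regressions."""
    w = np.asarray(w, dtype=float)
    # lag-dependent ("expert") weights from the component marginals in y_{t-1}
    marg = np.array([norm.pdf(x, mu[k][0], np.sqrt(Sigma[k][0, 0])) for k in range(len(w))])
    wk = w * marg
    wk = wk / wk.sum()
    dens = 0.0
    for k in range(len(w)):
        m = mu[k][1] + Sigma[k][1, 0] / Sigma[k][0, 0] * (x - mu[k][0])
        s2 = Sigma[k][1, 1] - Sigma[k][1, 0] ** 2 / Sigma[k][0, 0]
        dens += wk[k] * norm.pdf(y, m, np.sqrt(s2))
    return dens

# two-component example
w = [0.6, 0.4]
mu = [np.array([0.0, 0.0]), np.array([2.0, 1.0])]
Sigma = [np.array([[1.0, 0.7], [0.7, 1.0]]), np.array([[1.0, -0.5], [-0.5, 1.5]])]
print(transition_density(0.5, 1.0, w, mu, Sigma))
```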
We derive consistent and asymptotically normal estimators for the drift and volatility parameters of the stochastic heat equation driven by an additive space-only white noise when the solution is sampled discretely in the physical domain. We consider both the full space and a bounded domain. We establish the exact spatial regularity of the solution, which in turn allows us to build the desired estimators using power-variation arguments. We show that naive approximations of the derivatives appearing in the power-variation-based estimators may create nontrivial biases, which we compute explicitly. The proofs are rooted in the Malliavin-Stein method.
We introduce a new family of deep neural network models. Instead of specifying a discrete sequence of hidden layers, we parameterize the derivative of the hidden state using a neural network. The output of the network is computed using a black-box differential equation solver. These continuous-depth models have constant memory cost, adapt their evaluation strategy to each input, and can explicitly trade numerical precision for speed. We demonstrate these properties in continuous-depth residual networks and continuous-time latent variable models. We also construct continuous normalizing flows, a generative model that can be trained by maximum likelihood without partitioning or ordering the data dimensions. For training, we show how to scalably backpropagate through any ODE solver, without access to its internal operations. This allows end-to-end training of ODEs within larger models.
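A minimal forward-pass sketch of this idea, using randomly initialised weights and SciPy's solve_ivp as the black-box solver: the hidden state is evolved by integrating an MLP-parameterised derivative rather than by applying a discrete stack of layers. Training (e.g. via the adjoint method) is not shown.

```python
import numpy as np
from scipy.integrate import solve_ivp

rng = np.random.default_rng(0)

# A tiny MLP f(h, t; W) parameterising dh/dt. Weights are random here,
# purely to illustrate the forward pass of a continuous-depth model.
d_hidden = 8
W1 = rng.standard_normal((d_hidden + 1, 32)) * 0.1   # input: (h, t)
W2 = rng.standard_normal((32, d_hidden)) * 0.1

def f(t, h):
    z = np.tanh(np.concatenate([h, [t]]) @ W1)
    return z @ W2

def ode_block(h0, t0=0.0, t1=1.0):
    """Forward pass of an ODE block: h(t1) = h(t0) + integral of f(h, t) dt,
    computed with a black-box adaptive solver instead of discrete layers."""
    sol = solve_ivp(f, (t0, t1), h0, rtol=1e-5, atol=1e-7)
    return sol.y[:, -1]

h0 = rng.standard_normal(d_hidden)
h1 = ode_block(h0)    # plays the role of the output of a residual stack
```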
Stochastic gradient Markov chain Monte Carlo (SGMCMC) has become a popular method for scalable Bayesian inference. These methods are based on sampling a discrete-time approximation to a continuous-time process, such as the Langevin diffusion. When applied to distributions defined on a constrained space, such as the simplex, the time-discretisation error can dominate when we are near the boundary of the space. We demonstrate that while current SGMCMC methods for the simplex perform well in certain cases, they struggle with sparse simplex spaces, i.e. when many of the components are close to zero. However, most popular large-scale applications of Bayesian inference on simplex spaces, such as network or topic models, are sparse. We argue that this poor performance is due to the biases of SGMCMC caused by the discretisation error. To get around this, we propose the stochastic Cox-Ingersoll-Ross (CIR) process, which removes all discretisation error, and we prove that samples from the stochastic CIR process are asymptotically unbiased. Use of the stochastic CIR process within an SGMCMC algorithm is shown to give substantially better performance for a topic model and a Dirichlet process mixture model than existing SGMCMC approaches.
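The property exploited here is that the CIR process admits an exact transition law (a scaled noncentral chi-squared), so it can be simulated without any time-discretisation error. The sketch below draws exact CIR transitions and normalises independent coordinates onto the simplex; the parameterisation is generic, and the stochastic-gradient modification of the drift used in the paper is not included.

```python
import numpy as np

def cir_exact_step(x, h, a, b, sigma, rng):
    """Exact (discretisation-error-free) transition sampling for the CIR
    process dX = a*(b - X) dt + sigma*sqrt(X) dW over a step of length h,
    using its noncentral chi-squared transition law."""
    c = 2.0 * a / (sigma ** 2 * (1.0 - np.exp(-a * h)))
    df = 4.0 * a * b / sigma ** 2
    nonc = 2.0 * c * x * np.exp(-a * h)
    return rng.noncentral_chisquare(df, nonc) / (2.0 * c)

# a sparse simplex draw: K independent CIR coordinates (Gamma-stationary),
# evolved exactly and then normalised onto the simplex
rng = np.random.default_rng(0)
K, h = 20, 0.01
x = np.full(K, 0.05)
for _ in range(1000):
    x = cir_exact_step(x, h, a=1.0, b=0.1, sigma=np.sqrt(2.0), rng=rng)
theta = x / x.sum()    # point on the simplex
```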