Estimating nested expectations is an important task in computational mathematics and statistics. In this paper we propose a new Monte Carlo method using post-stratification to estimate nested expectations efficiently without sampling the inner random variable from the conditional distribution given the outer random variable. This property gives our method an advantage over many existing approaches: nested expectations can be estimated from a single dataset of inner-outer variable pairs drawn from the joint distribution. We prove an upper bound on the mean squared error of the proposed method under some assumptions. Numerical experiments comparing our method with several existing ones (the nested Monte Carlo method, the multilevel Monte Carlo method, and a regression-based method) show that it is superior in terms of both efficiency and applicability.
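To make the post-stratification idea concrete, here is a minimal NumPy sketch of one way such an estimator can be built from joint samples alone; the quantile binning rule, the estimator, and the toy problem are illustrative assumptions, not the authors' exact construction.

```python
import numpy as np

def nested_ps_estimate(y, gx, f, n_strata=20):
    """Estimate E_Y[ f( E[g(X) | Y] ) ] from joint samples (y_i, g(x_i))
    by post-stratifying on the outer variable y (hypothetical estimator)."""
    # Equal-probability strata defined by empirical quantiles of y.
    edges = np.quantile(y, np.linspace(0.0, 1.0, n_strata + 1))
    idx = np.clip(np.searchsorted(edges, y, side="right") - 1, 0, n_strata - 1)
    est = 0.0
    for s in range(n_strata):
        mask = idx == s
        if mask.any():
            # Within-stratum mean approximates E[g(X) | Y] on the stratum,
            # using only joint samples -- no conditional sampling needed.
            est += mask.mean() * f(gx[mask].mean())
    return est

# Toy check: Y ~ N(0,1), X | Y ~ N(Y,1), g(x) = x, f(m) = m**2,
# so the target is E[Y**2] = 1 (coarse strata bias the estimate slightly low).
rng = np.random.default_rng(0)
y = rng.normal(size=100_000)
x = y + rng.normal(size=y.size)
print(nested_ps_estimate(y, x, lambda m: m**2))
```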
This paper contains new identification results for undirected weighted stochastic blockmodels. They are sharper than those available to date, and the arguments underlying them are constructive. A computationally attractive nonparametric estimation framework is presented and the associated distribution theory is derived. Results of numerical experiments are reported.
We propose new approximate alternating projection methods, based on randomized sketching, for the low-rank nonnegative matrix approximation problem: find a low-rank approximation of a nonnegative matrix that is itself nonnegative, but whose factors can be arbitrary. We calculate the computational complexities of the proposed methods and evaluate their performance in numerical experiments. Comparison with the known deterministic alternating projection methods shows that the randomized approaches are faster and exhibit similar convergence properties.
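A minimal sketch of the general scheme follows, assuming a Gaussian range-finder sketch for the rank projection and entrywise clipping for the nonnegativity projection; the paper's concrete sketching choices and stopping rules may differ.

```python
import numpy as np

def randomized_rank_r_projection(A, r, oversample=10, rng=None):
    """Approximate projection of A onto the set of rank-r matrices,
    using a randomized range finder instead of a full SVD."""
    rng = rng or np.random.default_rng()
    # Sketch the column space of A with a Gaussian test matrix.
    Q, _ = np.linalg.qr(A @ rng.standard_normal((A.shape[1], r + oversample)))
    B = Q.T @ A                      # small (r + oversample) x n matrix
    U, s, Vt = np.linalg.svd(B, full_matrices=False)
    return (Q @ U[:, :r]) * s[:r] @ Vt[:r]

def low_rank_nonneg_approx(M, r, n_iter=50, rng=None):
    """Alternate between the (approximate) rank-r projection and the
    projection onto the nonnegative orthant (entrywise clipping).
    The final clipping step keeps the iterate nonnegative; its rank is
    then only approximately r."""
    X = M.copy()
    for _ in range(n_iter):
        X = randomized_rank_r_projection(X, r, rng=rng)
        X = np.maximum(X, 0.0)
    return X

rng = np.random.default_rng(1)
M = np.maximum(rng.standard_normal((200, 150)), 0)
X = low_rank_nonneg_approx(M, r=10, rng=rng)
print(np.linalg.norm(M - X) / np.linalg.norm(M))
```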
We study the computational complexity of the zigzag sampling algorithm for strongly log-concave distributions. The zigzag process has the advantages that no time discretization is required for implementation and that each proposed bouncing event requires only one evaluation of a partial derivative of the potential, while its convergence rate is dimension-independent. Using these properties, we prove that the zigzag sampling algorithm achieves $\varepsilon$ error in chi-square divergence with a computational cost equivalent to $O\bigl(\kappa^2 d^{\frac{1}{2}}(\log\frac{1}{\varepsilon})^{\frac{3}{2}}\bigr)$ gradient evaluations in the regime $\kappa \ll \frac{d}{\log d}$ under a warm start assumption, where $\kappa$ is the condition number and $d$ is the dimension.
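For intuition about how bouncing events are generated, here is a small sketch of the zigzag process for the standard Gaussian potential (so $\kappa = 1$), where the event times can be sampled exactly by inverting the integrated rate; general potentials would instead require Poisson thinning with an upper bound on the rates.

```python
import numpy as np

def zigzag_std_gaussian(x0, T, rng=None):
    """Exact zigzag sampler for the standard Gaussian potential
    U(x) = |x|^2 / 2, where the bounce rate of coordinate i is
    lambda_i(t) = max(0, v_i * x_i(t)) = (v_i * x_i + t)_+.  Returns the
    piecewise-linear skeleton (event times, positions) up to time T."""
    rng = rng or np.random.default_rng()
    x = np.asarray(x0, dtype=float).copy()
    v = rng.choice([-1.0, 1.0], size=x.size)
    t, times, xs = 0.0, [0.0], [x.copy()]
    while t < T:
        a = v * x
        # First-arrival times of the inhomogeneous rates (a_i + s)_+,
        # obtained by inverting the integrated rate against Exp(1) draws.
        e = rng.exponential(size=x.size)
        taus = -a + np.sqrt(np.maximum(a, 0.0) ** 2 + 2.0 * e)
        i = np.argmin(taus)
        tau = taus[i]
        x += v * tau          # all coordinates move linearly
        v[i] = -v[i]          # only the winning coordinate bounces
        t += tau
        times.append(t); xs.append(x.copy())
    return np.array(times), np.array(xs)

times, xs = zigzag_std_gaussian(np.zeros(5), T=1000.0)
# Event-skeleton positions (time-averaging along segments would give the
# proper trajectory average); roughly centered at 0 for this target.
print(xs.mean(axis=0))
```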
In this paper we propose a new optimization model for maximum likelihood estimation of causal and invertible ARMA models. Through a set of numerical experiments we show that our proposed model outperforms the classical estimation procedure based on the Jones reparametrization, both in the quality of the fitted model and in computational time. We also propose a regularization term for the model and show that this addition improves the out-of-sample quality of the fitted model. This improvement is achieved through an increased penalty on models close to the non-causality or non-invertibility boundary.
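The abstract does not specify the penalty's functional form; the sketch below shows one plausible boundary penalty based on the moduli of the AR and MA polynomial roots (the margin parameter and the sign convention for the polynomials are illustrative assumptions, not the paper's regularizer).

```python
import numpy as np

def boundary_penalty(ar_coefs, ma_coefs, margin=0.05):
    """Hypothetical penalty discouraging AR/MA polynomial roots from
    approaching the unit circle (the causality/invertibility boundary).
    Convention assumed here: roots of 1 - a_1 z - ... - a_p z^p must stay
    outside |z| = 1 for causality (analogously for invertibility)."""
    pen = 0.0
    for coefs in (ar_coefs, ma_coefs):
        if len(coefs) == 0:
            continue
        # numpy.roots expects the highest-degree coefficient first.
        roots = np.roots(np.r_[-np.asarray(coefs)[::-1], 1.0])
        moduli = np.abs(roots)
        # Penalize any root whose modulus falls below 1 + margin.
        pen += np.sum(np.maximum(0.0, 1.0 + margin - moduli) ** 2)
    return pen

print(boundary_penalty([0.99], [0.3]))  # AR root at ~1.01: penalized
print(boundary_penalty([0.50], [0.3]))  # AR root at 2.0: no penalty
```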
This paper considers identification and estimation of the causal effect of the time Z until a subject is treated on a survival outcome T. The treatment is not randomly assigned, T is randomly right censored by a random variable C, and the time to treatment Z is right censored by min(T, C). The endogeneity issue is treated using an instrumental variable that explains Z and is independent of the error term of the model. We study identification in a fully nonparametric framework. We show that our specification generates an integral equation of which the regression function of interest is a solution. We provide identification conditions that rely on this identification equation. For estimation purposes, we assume that the regression function follows a parametric model. We propose an estimation procedure and give conditions under which the estimator is asymptotically normal. The estimators exhibit good finite-sample properties in simulations. Our methodology is applied to find evidence supporting the efficacy of a therapy for burnout.
Motivated by applications in reinforcement learning (RL), we study a nonlinear stochastic approximation (SA) algorithm under Markovian noise, and establish its finite-sample convergence bounds under various stepsizes. Specifically, we show that when using a constant stepsize (i.e., $\alpha_k \equiv \alpha$), the algorithm achieves exponentially fast convergence to a neighborhood (with radius $O(\alpha\log(1/\alpha))$) around the desired limit point. When using diminishing stepsizes with an appropriate decay rate, the algorithm converges at rate $O(\log(k)/k)$. Our proof is based on Lyapunov drift arguments, and to handle the Markovian noise, we exploit the fast mixing of the underlying Markov chain. To demonstrate the generality of our theoretical results on Markovian SA, we use them to derive finite-sample bounds for the popular $Q$-learning algorithm with linear function approximation, under a condition on the behavior policy. Importantly, we do not need to assume that the samples are i.i.d., and we do not require an artificial projection step in the algorithm to maintain the boundedness of the iterates. Numerical simulations corroborate our theoretical results.
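As a toy instance of SA driven by Markovian (rather than i.i.d.) noise, the following sketch estimates the stationary mean of a two-state Markov chain with both constant and diminishing stepsizes; the chain, the stepsize values, and the linear update are illustrative stand-ins for the paper's general nonlinear setting.

```python
import numpy as np

def markovian_sa(P, values, alpha, n_iter, rng=None):
    """Stochastic approximation x_{k+1} = x_k + alpha_k (Y_k - x_k)
    driven by a Markov chain, not i.i.d. samples.  With constant alpha
    the iterates settle in a neighborhood of the stationary mean (radius
    O(alpha log(1/alpha)) in the paper's analysis); with alpha_k ~ 1/k
    they converge to it."""
    rng = rng or np.random.default_rng(0)
    n = len(values)
    s, x, xs = 0, 0.0, []
    for k in range(n_iter):
        s = rng.choice(n, p=P[s])        # Markovian noise: next chain state
        a = alpha if np.isscalar(alpha) else alpha(k)
        x += a * (values[s] - x)         # SA update toward the target
        xs.append(x)
    return np.array(xs)

# Two-state chain with stationary distribution (2/3, 1/3); the target
# mean is 2/3 * 0 + 1/3 * 3 = 1.
P = np.array([[0.8, 0.2], [0.4, 0.6]])
vals = np.array([0.0, 3.0])
print(markovian_sa(P, vals, alpha=0.01, n_iter=50_000)[-1])           # near 1
print(markovian_sa(P, vals, alpha=lambda k: 1 / (k + 1), n_iter=50_000)[-1])
```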
We present a hybrid sampling-surrogate approach for reducing the computational expense of uncertainty quantification in nonlinear dynamical systems. Our motivation is to enable rapid uncertainty quantification in complex mechanical systems such as automotive propulsion systems. Our approach builds upon ideas from multifidelity uncertainty quantification to leverage the benefits of both sampling and surrogate modeling, while mitigating their downsides. In particular, the surrogate model is selected to exploit problem structure, such as smoothness, and offers a highly correlated information source for the original nonlinear dynamical system. We utilize an intrusive generalized polynomial chaos surrogate because it avoids any statistical errors in its construction and provides analytic estimates of output statistics. We then leverage a Monte Carlo-based control variate technique to correct the bias caused by the surrogate approximation error. The primary theoretical contribution of this work is the analysis and solution of an estimator design strategy that optimally balances the computational effort needed to adapt a surrogate against that of sampling the original expensive nonlinear system. While previous works have similarly combined surrogates and sampling, to the best of our knowledge this work is the first to provide a rigorous analysis of estimator design. We deploy our approach on multiple examples stemming from the simulation of mechanical automotive propulsion system models. We show that, in some cases, the estimator achieves orders-of-magnitude reductions in the mean squared error of statistics estimation at costs comparable to purely sampling or purely surrogate approaches.
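The core correction step is a standard control variate estimator in which the surrogate's exact mean is known analytically; the sketch below illustrates it on a toy problem (the surrogate, the target function, and the input distribution are stand-ins for the generalized polynomial chaos construction used in the paper).

```python
import numpy as np

def control_variate_estimate(f_samples, g_samples, g_mean):
    """Monte Carlo control variate estimator: correct the sample mean of
    the expensive model f using a cheap surrogate g whose exact mean is
    known analytically (e.g., from a polynomial chaos expansion)."""
    cov = np.cov(f_samples, g_samples, ddof=1)
    alpha = cov[0, 1] / cov[1, 1]        # estimated optimal CV coefficient
    return f_samples.mean() - alpha * (g_samples.mean() - g_mean)

# Toy problem: f(x) = sin(x), surrogate g(x) = x - x**3/6 (truncated
# series) with known mean 0 under x ~ N(0,1); true E[sin(x)] = 0.
rng = np.random.default_rng(2)
x = rng.normal(size=2_000)
f, g = np.sin(x), x - x**3 / 6
print(control_variate_estimate(f, g, g_mean=0.0))   # variance-reduced
print(f.mean())                                     # plain Monte Carlo
```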
In this paper we propose a flexible nested error regression small area model with high-dimensional parameters that incorporates heterogeneity in regression coefficients and variance components. We develop a new robust small-area-specific estimating equations method that allows appropriate pooling of a large number of areas in estimating small-area-specific model parameters. We propose parametric bootstrap and jackknife methods to estimate not only the mean squared errors but also other commonly used uncertainty measures such as standard errors and coefficients of variation. We conduct both model-based and design-based simulation experiments and a real-life data analysis to evaluate the proposed methodology.
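To illustrate the parametric bootstrap idea in its simplest form, here is a sketch for a balanced one-way nested error model without covariates or heterogeneous coefficients; the paper's model and its uncertainty measures are considerably richer than this toy version.

```python
import numpy as np

def fit_one_way(y):
    """Moment (ANOVA) estimates for the balanced nested error model
    y_ij = mu + u_i + e_ij with m areas and n units per area."""
    m, n = y.shape
    mu = y.mean()
    msw = ((y - y.mean(axis=1, keepdims=True)) ** 2).sum() / (m * (n - 1))
    msb = n * ((y.mean(axis=1) - mu) ** 2).sum() / (m - 1)
    return mu, max((msb - msw) / n, 0.0), msw   # mu, sigma_u^2, sigma_e^2

def eblup(y, mu, su2, se2):
    """Shrinkage predictor of the area means mu + u_i."""
    gamma = su2 / (su2 + se2 / y.shape[1])
    return gamma * y.mean(axis=1) + (1 - gamma) * mu

def bootstrap_mse(y, B=500, rng=None):
    """Parametric bootstrap MSE of the small-area EBLUPs: regenerate data
    from the fitted model, re-estimate, and compare predictor to truth."""
    rng = rng or np.random.default_rng(3)
    m, n = y.shape
    mu, su2, se2 = fit_one_way(y)
    sq = np.zeros(m)
    for _ in range(B):
        u = rng.normal(0, np.sqrt(su2), size=(m, 1))
        yb = mu + u + rng.normal(0, np.sqrt(se2), size=(m, n))
        mub, su2b, se2b = fit_one_way(yb)
        sq += (eblup(yb, mub, su2b, se2b) - (mu + u.ravel())) ** 2
    return sq / B

rng = np.random.default_rng(4)
y = 10 + rng.normal(0, 1, (20, 1)) + rng.normal(0, 2, (20, 8))
print(bootstrap_mse(y)[:5])
```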
Implicit probabilistic models are models defined naturally in terms of a sampling procedure; they often induce a likelihood function that cannot be expressed explicitly. We develop a simple method for estimating parameters in implicit models that does not require knowledge of the form of the likelihood function or any derived quantities, but can be shown to be equivalent to maximizing likelihood under some conditions. Our result holds in the non-asymptotic parametric setting, where both the capacity of the model and the number of data examples are finite. We also demonstrate encouraging experimental results.
Discrete random structures are important tools in Bayesian nonparametrics, and the resulting models have proven effective in density estimation, clustering, topic modeling and prediction, among others. In this paper, we consider nested processes and study the dependence structures they induce. Dependence ranges between homogeneity, corresponding to full exchangeability, and maximum heterogeneity, corresponding to (unconditional) independence across samples. The popular nested Dirichlet process is shown to degenerate to the fully exchangeable case when there are ties across samples at the observed or latent level. To overcome this drawback, inherent to nesting general discrete random measures, we introduce a novel class of latent nested processes. These are obtained by adding common and group-specific completely random measures and then normalizing, to yield dependent random probability measures. We provide results on the partition distributions induced by latent nested processes, and develop a Markov chain Monte Carlo sampler for Bayesian inference. A test for distributional homogeneity across groups is obtained as a by-product. The results and their inferential implications are showcased on synthetic and real data.
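A crude finite-atom sketch of the additive construction follows: each group's random probability measure is a normalized sum of a common and a group-specific gamma completely random measure, each approximated by K atoms (the gamma choice, the Gaussian base measure, and the truncation level are illustrative assumptions, not the paper's general framework).

```python
import numpy as np

def latent_nested_draw(n_groups, K=200, a_shared=1.0, a_group=1.0, rng=None):
    """Finite-dimensional sketch of a latent nested process: each group's
    random probability measure normalizes the sum of a common gamma CRM
    and a group-specific gamma CRM (K atoms each; the K iid Gamma(a/K)
    jumps sum to a Gamma(a) total mass, approximating a gamma process)."""
    rng = rng or np.random.default_rng(5)
    atoms_shared = rng.normal(size=K)           # atoms from a N(0,1) base
    w_shared = rng.gamma(a_shared / K, size=K)  # shared jump sizes
    measures = []
    for _ in range(n_groups):
        atoms_g = rng.normal(size=K)
        w_g = rng.gamma(a_group / K, size=K)    # group-specific jumps
        atoms = np.r_[atoms_shared, atoms_g]
        w = np.r_[w_shared, w_g]
        measures.append((atoms, w / w.sum()))   # normalize to a probability
    return measures

# Two groups share the common atoms, inducing dependence; the ratio
# a_shared / a_group tunes between near-exchangeability and independence.
for atoms, probs in latent_nested_draw(2):
    print(probs[:5].round(4))
```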