In epidemics, many quantities of interest, such as the reproduction number, depend on the incubation period (the time from infection to symptom onset) and/or the generation time (the time from the infection of one person to the infection of the next). Estimating the distributions of these two quantities is therefore of distinct interest. However, this is a challenging problem, since precise observations of either variable are normally not available. Instead, at the beginning of an epidemic, it is possible to observe, for pairs of infected people, the time of symptom onset of both, as well as a window for the infection of the first person (e.g. because of travel to a risk area). In this paper we propose a simple semi-parametric sieve-estimation method based on Laguerre polynomials for estimating these distributions. We provide detailed consistency theory and illustrate the finite-sample performance on small datasets via a simulation study.
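The core of such a Laguerre sieve can be illustrated numerically: a density on $[0,\infty)$ is written as a squared Laguerre expansion times the exponential weight, which is automatically a probability density whenever the coefficient vector lies on the unit sphere (by orthonormality of the Laguerre polynomials under the weight $e^{-x}$). The sketch below uses arbitrary coefficients for illustration, not coefficients fitted to data, and is our reading of the general construction, not the authors' code:

```python
import numpy as np
from numpy.polynomial import laguerre
from scipy.integrate import quad

# Illustrative sieve coefficients on the unit sphere: sum a_k^2 = 1.
a = np.array([0.8, 0.5, np.sqrt(1.0 - 0.8**2 - 0.5**2)])

def sieve_density(x, coef):
    # (sum_k a_k L_k(x))^2 * exp(-x) is a valid density on [0, inf)
    # whenever sum_k a_k^2 = 1, since the Laguerre polynomials L_k are
    # orthonormal with respect to the weight exp(-x).
    return laguerre.lagval(x, coef) ** 2 * np.exp(-x)

total, _ = quad(sieve_density, 0, np.inf, args=(a,))
print(total)  # integrates to 1 by orthonormality
```

In an actual sieve estimator the coefficients would be fitted (e.g. by maximum likelihood over the interval-censored onset data), with the truncation order growing with the sample size.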
Consider two independent exponential populations having different unknown location parameters and a common unknown scale parameter. Call the population associated with the larger location parameter the "best" population and the population associated with the smaller location parameter the "worst" population. For the goal of selecting the best (worst) population, a natural selection rule, which has many optimality properties, is the one that selects the population corresponding to the larger (smaller) minimal sufficient statistic. In this article, we consider the problem of estimating the location parameter of the population selected using this natural selection rule. For estimating the location parameter of the selected best population, we derive the uniformly minimum variance unbiased estimator (UMVUE) and show that the analogue of the best affine equivariant estimators (BAEEs) of location parameters is a generalized Bayes estimator. We provide some admissibility and minimaxity results for estimators in the class of linear, affine and permutation equivariant estimators, under the criterion of scaled mean squared error. We also derive a sufficient condition for the inadmissibility of an arbitrary affine and permutation equivariant estimator. We provide similar results for the problem of estimating the location parameter of the selected population when the selection goal is that of selecting the worst exponential population. Finally, we provide a simulation study to compare, numerically, the performances of some of the proposed estimators.
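Why estimation after selection is delicate can be seen in a small simulation (a toy illustration of the selection bias, not any of the estimators derived in the paper): with two exponential populations, the natural rule picks the population with the larger sample minimum, and the naive estimator (that selected minimum) overshoots the selected location parameter by more than the usual single-population bias.

```python
import numpy as np

rng = np.random.default_rng(0)
theta, sigma, n, reps = (0.0, 0.0), 1.0, 10, 20000  # equal locations: hardest case

# Two exponential samples with location theta_i and scale sigma.
s1 = theta[0] + rng.exponential(sigma, (reps, n))
s2 = theta[1] + rng.exponential(sigma, (reps, n))
m1, m2 = s1.min(axis=1), s2.min(axis=1)   # minimal sufficient statistics (location part)

naive = np.maximum(m1, m2)      # naive estimate for the selected "best" location
bias_selected = naive.mean()    # both true locations are 0 here
bias_single = m1.mean()         # bias of the sample minimum without selection (~ sigma/n)
```

Here `bias_single` is close to $\sigma/n = 0.1$, while `bias_selected` is markedly larger (about $1.5\,\sigma/n$ for equal locations), which is the effect the UMVUE and equivariant estimators in the paper are designed to correct.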
We focus on the problem of manifold estimation: given a set of observations sampled close to some unknown submanifold $M$, one wants to recover information about the geometry of $M$. Minimax estimators which have been proposed so far all depend crucially on the a priori knowledge of some parameters quantifying the underlying distribution generating the sample (such as bounds on its density), whereas those quantities will be unknown in practice. Our contribution to the matter is twofold: first, we introduce a one-parameter family of manifold estimators $(\hat{M}_t)_{t\geq 0}$ based on a localized version of convex hulls, and show that for some choice of $t$, the corresponding estimator is minimax on the class of models of $C^2$ manifolds introduced in [Genovese et al., Manifold estimation and singular deconvolution under Hausdorff loss]. Second, we propose a completely data-driven selection procedure for the parameter $t$, leading to a minimax adaptive manifold estimator on this class of models. This selection procedure actually allows us to recover the Hausdorff distance between the set of observations and $M$, and can therefore be used as a scale parameter in other settings, such as tangent space estimation.
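A crude two-point version of such a localized hull estimator can be sketched as follows (segments between $t$-close sample pairs rather than full local convex hulls; the setup, with noiseless samples on the unit circle, is purely illustrative and not the authors' estimator or selection procedure):

```python
import numpy as np

rng = np.random.default_rng(3)
n, t = 300, 0.3
ang = rng.uniform(0.0, 2.0 * np.pi, n)
X = np.column_stack([np.cos(ang), np.sin(ang)])  # samples on the unit circle M

# Crude localized hull M_t: union of segments between t-close sample pairs,
# discretized with a few points per segment.
pieces = [X]
lam = np.linspace(0.0, 1.0, 5)[:, None]
for i in range(n):
    d = np.linalg.norm(X - X[i], axis=1)
    for j in np.nonzero((d > 0) & (d <= t))[0]:
        pieces.append((1.0 - lam) * X[i] + lam * X[j])
Mhat = np.vstack(pieces)

# One-sided Hausdorff distance from the estimator to the circle: | ||p|| - 1 |.
dist = np.max(np.abs(np.linalg.norm(Mhat, axis=1) - 1.0))
```

For a chord of length at most $t$, the sagitta is at most $1-\sqrt{1-t^2/4}\approx t^2/8$, so the estimator stays within $O(t^2)$ of the circle, consistent with the $C^2$ rates the paper targets.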
We consider the Bayesian analysis of models in which the unknown distribution of the outcomes is specified up to a set of conditional moment restrictions. The nonparametric exponentially tilted empirical likelihood function is constructed to satisfy a sequence of unconditional moments based on an increasing (in sample size) vector of approximating functions (such as tensor splines based on the splines of each conditioning variable). For any given sample size, results are robust to the number of expanded moments. We derive Bernstein-von Mises theorems for the behavior of the posterior distribution under both correct and incorrect specification of the conditional moments, subject to growth rate conditions (slower under misspecification) on the number of approximating functions. A large-sample theory for comparing different conditional moment models is also developed. The central result is that the marginal likelihood criterion selects the model that is less misspecified. We also introduce sparsity-based model search for high-dimensional conditioning variables, and provide efficient MCMC computations for high-dimensional parameters. Along with clarifying examples, the framework is illustrated with real-data applications to risk-factor determination in finance, and causal inference under conditional ignorability.
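The exponential tilting step for a single unconditional moment can be sketched in a few lines (a toy scalar illustration, not the paper's conditional-moment machinery): the tilted weights $w_i \propto \exp(\lambda g_i)$, with $\lambda$ solving the dual problem, are the ones closest to uniform in Kullback-Leibler divergence subject to the moment restriction.

```python
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(4)
x = rng.normal(0.2, 1.0, size=500)
g = x - 0.0   # moment function g(x, theta) = x - theta, evaluated at theta = 0

# Dual problem: minimize mean(exp(lambda * g)); its first-order condition is
# mean(g * exp(lambda * g)) = 0, i.e. the tilted moment restriction holds.
res = minimize(lambda lam: np.mean(np.exp(lam * g)), x0=0.0)
lam = res.x[0]

w = np.exp(lam * g)
w /= w.sum()          # exponentially tilted empirical weights
tilted_mean = w @ x   # close to 0: the weights enforce the moment restriction
```

In the paper this construction is applied to a growing vector of unconditional moments built from approximating functions of the conditioning variables, and the resulting nonparametric likelihood enters the posterior.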
In the first part of this work, we develop a novel scheme for solving nonparametric regression problems, that is, the approximation of possibly low-regularity and noisy functions from their approximate values at some random points. The proposed scheme is based on the pseudo-inverse of a random projection matrix, combined with specific properties of the Jacobi polynomial system and of positive definite random matrices. The scheme has the advantages of being stable, robust, accurate and fairly fast in terms of execution time. In particular, we provide $L_2$ error and $L_2$-risk bounds for the proposed nonparametric regression estimator. Moreover, unlike most existing nonparametric regression estimators, no extra regularization step is required. Although the estimator is initially designed for a random sampling set of univariate i.i.d. random variables following a Beta distribution, we show that it still works for a wide range of sampling distributions. We also briefly describe how the estimator can be adapted to handle multivariate random sampling sets. In the second part of this work, we extend the random pseudo-inverse scheme to build a stable and accurate estimator for linear functional regression (LFR) problems, using a dyadic decomposition approach. We also give an $L_2$-risk error bound for the proposed LFR estimator. Finally, the performance of the two proposed estimators is illustrated by various numerical simulations; in particular, a real dataset is used to illustrate the performance of the nonparametric regression estimator.
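The basic pseudo-inverse step can be sketched as follows (a minimal illustration with assumed Jacobi parameters $\alpha=\beta=0.5$ and a hand-picked target function; the paper's actual estimator, error analysis and sampling conditions are more involved): build the random design matrix of Jacobi polynomial values at Beta-distributed points and apply its Moore-Penrose pseudo-inverse, with no regularization step.

```python
import numpy as np
from scipy.special import eval_jacobi

rng = np.random.default_rng(0)
n, deg = 200, 8
alpha = beta = 0.5   # assumed Jacobi parameters, for illustration only

# Beta-distributed design points mapped to (-1, 1), matching the Jacobi weight.
x = 2.0 * rng.beta(alpha + 1, beta + 1, size=n) - 1.0
y = np.sin(np.pi * x) + 0.1 * rng.standard_normal(n)  # noisy observations

# Random projection matrix of Jacobi polynomial values and its pseudo-inverse.
V = np.column_stack([eval_jacobi(k, alpha, beta, x) for k in range(deg + 1)])
coef = np.linalg.pinv(V) @ y   # least squares fit, no extra regularization

fitted = V @ coef
max_err = np.max(np.abs(fitted - np.sin(np.pi * x)))
```

With a smooth target, a moderate degree already drives the error down to roughly the noise level, illustrating why no separate regularization is needed in this regime.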
In this paper several related estimation problems are addressed from a Bayesian point of view, and optimal estimators are obtained for each of them when some natural loss functions are considered. Namely, we are interested in estimating a regression curve. Simultaneously, the estimation problems of a conditional distribution function, of a conditional density, and even of the conditional distribution itself are considered. All these problems are posed in a sufficiently general framework to cover continuous and discrete, univariate and multivariate, parametric and non-parametric cases, without the need to use a specific prior distribution. The loss functions considered come naturally from the quadratic error loss function commonly used when estimating a real function of the unknown parameter. The cornerstone of the mentioned Bayes estimators is the posterior predictive distribution. Some examples are provided to illustrate these results.
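The role of the posterior predictive distribution can be checked numerically in the simplest conjugate case (a toy normal-normal illustration, not one of the paper's examples): under quadratic loss, the Bayes estimate of a future observation is the posterior predictive mean, which a Monte Carlo risk curve recovers as the minimizer.

```python
import numpy as np

rng = np.random.default_rng(6)

# Conjugate model: y_i ~ N(mu, 1), prior mu ~ N(0, tau2).
tau2 = 4.0
y = rng.normal(1.0, 1.0, 30)
n = len(y)
post_var = 1.0 / (n + 1.0 / tau2)
post_mean = post_var * y.sum()

# Draws from the posterior predictive distribution of a future y*.
mu_draws = rng.normal(post_mean, np.sqrt(post_var), 5000)
ystar = mu_draws + rng.standard_normal(5000)

# Posterior predictive risk a -> E[(y* - a)^2], scanned over a grid:
# minimized at the posterior predictive mean.
grid = np.linspace(post_mean - 1.0, post_mean + 1.0, 201)
risk = [np.mean((ystar - a) ** 2) for a in grid]
best = grid[np.argmin(risk)]
```

The grid minimizer agrees with the sample mean of the predictive draws up to the grid spacing, illustrating the quadratic-loss optimality of the posterior predictive mean.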
This paper provides a new semi-parametric estimator for LARCH($\infty$) processes, and therefore also for LARCH(p) or GLARCH(p, q) processes. The estimator is obtained by minimizing a contrast, leading to a least squares estimator based on the absolute values of the process. Strong consistency and asymptotic normality are shown, and the convergence occurs at rate $\sqrt{n}$ in both the short- and long-memory cases. Numerical experiments confirm the theoretical results and show that this new estimator clearly outperforms the smoothed quasi-maximum likelihood estimators and the weighted least squares estimators often used for such processes.
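The idea of a least squares contrast on absolute values can be sketched for a LARCH(1) model (a toy version with assumed Gaussian innovations and a direct optimizer; the paper's estimator for LARCH($\infty$) and its asymptotics are considerably more general): since $E(|X_t| \mid \mathcal{F}_{t-1}) = E|\zeta|\,|a_0 + a_1 X_{t-1}|$, matching $|X_t|$ to this conditional mean in least squares identifies the coefficients up to a joint sign flip.

```python
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(7)
a0, a1, n = 1.0, 0.3, 3000
z = rng.standard_normal(n)
X = np.zeros(n)
for t in range(1, n):
    X[t] = z[t] * (a0 + a1 * X[t - 1])   # LARCH(1) recursion

c = np.sqrt(2.0 / np.pi)   # E|z| for standard normal innovations

def contrast(b):
    # Least squares contrast on the absolute values of the process.
    pred = c * np.abs(b[0] + b[1] * X[:-1])
    return np.mean((np.abs(X[1:]) - pred) ** 2)

bhat = minimize(contrast, x0=[0.5, 0.0], method="Nelder-Mead").x
if bhat[0] < 0:   # (b0, b1) is identified only up to a joint sign change
    bhat = -bhat
```

With a few thousand observations the minimizer lands close to the true $(a_0, a_1)$, consistent with the $\sqrt{n}$ rate reported in the paper.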
We consider a dynamical system with two sources of uncertainty: (1) a parameterized input with a known probability distribution and (2) a stochastic input-to-response (ItR) function with heteroscedastic randomness. Our purpose is to efficiently quantify the extreme response probability when the ItR function is expensive to evaluate. This problem setup arises often in physics and engineering, with the randomness in the ItR coming either from intrinsic uncertainties (say, as the solution of a stochastic equation) or from additional (critical) uncertainties that are not incorporated in the input parameter space. To reduce the required number of samples, we develop a sequential Bayesian experimental design method leveraging variational heteroscedastic Gaussian process regression (VHGPR) to account for the stochastic ItR, along with a new criterion to select the next-best samples sequentially. The validity of the new method is first tested on two synthetic problems with artificially defined stochastic ItR functions. Finally, we demonstrate the application of the method to the engineering problem of estimating the probability of extreme ship motion in an ensemble of wave groups, where the uncertainty in the ItR naturally originates from the uncertain initial condition of the ship motion in each wave group.
The widely applicable information criterion (WAIC) has been used as a model selection criterion for Bayesian statistics in recent years. It is an asymptotically unbiased estimator of the Kullback-Leibler divergence between a Bayesian predictive distribution and the true distribution. Not only is the WAIC theoretically more sound than other information criteria, its usefulness in practice has also been reported. On the other hand, the WAIC is intended for settings in which the prior distribution does not have an asymptotic influence, and as the class of prior distributions is made more complex, it invariably selects the most complex one. To alleviate this concern, this paper proposes the prior intensified information criterion (PIIC). In addition, we customize this criterion to incorporate sparse estimation and causal inference. Numerical experiments show that the PIIC clearly outperforms the WAIC in terms of prediction performance when the above concern is manifested. A real data analysis confirms that the variable selection results and Bayesian estimators of the WAIC and the PIIC differ significantly.
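The standard WAIC computation from posterior draws can be sketched as follows (a minimal normal-mean illustration with a flat prior; the PIIC modification proposed in the paper is not reproduced here): the log pointwise predictive density is penalized by the posterior variance of the pointwise log-likelihoods.

```python
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(1)
y = rng.normal(0.3, 1.0, size=50)   # synthetic data, unit-variance normal

# Posterior draws for the mean under a flat prior: N(ybar, 1/n).
mu = rng.normal(y.mean(), 1.0 / np.sqrt(len(y)), size=2000)

# S x n matrix of pointwise log-likelihoods log p(y_i | mu_s).
loglik = norm.logpdf(y[None, :], loc=mu[:, None], scale=1.0)

lppd = np.sum(np.log(np.mean(np.exp(loglik), axis=0)))  # log pointwise predictive density
p_waic = np.sum(np.var(loglik, axis=0, ddof=1))         # effective number of parameters
waic = -2.0 * (lppd - p_waic)
```

In this one-parameter model `p_waic` comes out close to 1, matching its interpretation as an effective parameter count; the PIIC replaces the posterior in this computation with one under an intensified prior.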
We propose a method for the description and simulation of the nonlinear dynamics of slender structures modeled as Cosserat rods. It is based on interpreting the strains and the generalized velocities of the cross sections as basic variables and as elements of the special Euclidean algebra. This perspective emerges naturally from the evolution equations for strands, i.e. one-dimensional submanifolds, of the special Euclidean group. The corresponding equations for the three-dimensional motion of a Cosserat rod are discretized in space using a staggered grid, and the time evolution is then approximated with a semi-implicit method. Within this approach we can easily include dissipative effects due both to the action of external forces and to the presence of internal mechanical dissipation. Comparison with results obtained with different schemes shows the effectiveness of the proposed method, which provides very good predictions of nonlinear dynamical effects and shows competitive computation times, also when used as an energy-minimizing method for static problems.
This paper proposes a generalization of Gaussian mixture models in which the mixture weight is allowed to behave as an unknown function of time. This model is capable of successfully capturing the features of the data, as demonstrated by simulated and real datasets, and can be useful in studies such as clustering, change-point detection and process control. In order to estimate the mixture weight function, we propose two new Bayesian nonlinear dynamic approaches for polynomial models, which can be extended to other problems involving polynomial nonlinear dynamic models. One of the methods, called here component-wise Metropolis-Hastings, applies the Metropolis-Hastings algorithm to each local-level component of the state equation. It is more general and can be used in any situation where the observation and state equations are nonlinearly connected. The other method tends to be faster but applies specifically to binary data (using the probit link function). The performance of these estimation methods, in the context of the proposed dynamic Gaussian mixture model, is evaluated through simulated datasets. Also, an application to an array Comparative Genomic Hybridization (aCGH) dataset from glioblastoma cancer illustrates our proposal, highlighting the ability of the method to detect chromosome aberrations.
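A single component-wise update of this kind can be sketched as follows (a toy random-walk Metropolis-Hastings step for one local-level state component with a probit-linked binary observation; the prior scale and proposal step are illustrative choices, not the paper's settings):

```python
import numpy as np
from math import erf, log, sqrt

rng = np.random.default_rng(8)

def log_post(th, y, th_prev, tau=0.5):
    # Local-level prior th ~ N(th_prev, tau^2); probit observation
    # y ~ Bernoulli(Phi(th)) links the state and the binary data nonlinearly.
    p = 0.5 * (1.0 + erf(th / sqrt(2.0)))
    p = min(max(p, 1e-12), 1.0 - 1e-12)
    return y * log(p) + (1 - y) * log(1.0 - p) - 0.5 * ((th - th_prev) / tau) ** 2

# Random-walk Metropolis-Hastings for one state component, given y_t = 1.
th, draws = 0.0, []
for _ in range(5000):
    prop = th + 0.5 * rng.standard_normal()
    if log(rng.uniform()) < log_post(prop, 1, 0.0) - log_post(th, 1, 0.0):
        th = prop
    draws.append(th)
post_mean = np.mean(draws[1000:])
```

The observation $y_t = 1$ pulls the posterior of the state above its prior mean of zero, and in a full sampler this update would be applied to each local-level component in turn.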