We consider the problem of estimating a dose-response curve, both globally and locally at a point. Continuous treatments often arise in practice, e.g., as time spent on an operation, distance traveled to a location, or the dosage of a drug. Letting A denote a continuous treatment variable, the target of inference is the expected outcome if everyone in the population were assigned treatment level A=a. Under standard assumptions, the dose-response function takes the form of a partial mean. Building upon the recent literature on nonparametric regression with estimated outcomes, we study three different estimators. As a global method, we construct an empirical-risk-minimization-based estimator with an explicit characterization of second-order remainder terms. As a local method, we develop a two-stage, doubly robust (DR) learner. Finally, we construct an $m$th-order estimator based on the theory of higher-order influence functions. Under certain conditions, this higher-order estimator achieves the fastest rate of convergence that we are aware of for this problem. However, the other two approaches are easier to implement with off-the-shelf software, since they are formulated as two-stage regression tasks. For each estimator, we provide an upper bound on the mean squared error and investigate its finite-sample performance in simulations. Finally, we describe a flexible, nonparametric method for sensitivity analysis to the no-unmeasured-confounding assumption when the treatment is continuous.
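The partial-mean form of the dose-response function can be illustrated with a minimal plug-in (standardization) sketch on hypothetical simulated data; this is not any of the three estimators above, and all model choices below are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100_000

# Simulated data: confounder X, continuous treatment A correlated with X,
# outcome Y = A + X + noise, so the true dose-response is theta(a) = a + E[X].
X = rng.normal(1.0, 1.0, n)
A = 0.5 * X + rng.normal(0.0, 1.0, n)
Y = A + X + rng.normal(0.0, 1.0, n)

# Stage 1: outcome regression mu(a, x) of Y on (A, X) by least squares.
design = np.column_stack([np.ones(n), A, X])
beta, *_ = np.linalg.lstsq(design, Y, rcond=None)

def theta_hat(a):
    """Plug-in partial mean: average mu(a, X_i) over the empirical X-distribution."""
    return np.mean(beta[0] + beta[1] * a + beta[2] * X)

# True theta(a) = a + E[X] = a + 1, so theta(1) - theta(0) should be near 1.
print(theta_hat(1.0) - theta_hat(0.0))
```

The DR-learner and higher-order estimators of the abstract refine exactly this target by debiasing the first-stage regression.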
A trigonometrically approximated maximum likelihood estimator for $\alpha$-stable laws is proposed. The estimator solves an approximated likelihood equation, obtained by projecting the true score function onto the space spanned by trigonometric functions. The projected score is expressed solely in terms of the real and imaginary parts of the characteristic function and their derivatives, so the target estimating equation can be constructed explicitly. We study the asymptotic properties of the proposed estimator and establish consistency and asymptotic normality. Furthermore, as the number of trigonometric functions increases, the estimator converges to the exact maximum likelihood estimator, in the sense that the two share the same asymptotic law. Simulation studies show that our estimator outperforms other moment-type estimators, with a standard deviation that nearly attains the Cram\'er--Rao lower bound. We apply our method to the estimation problem for $\alpha$-stable Ornstein--Uhlenbeck processes in a high-frequency setting; the results illustrate the theory of asymptotic mixed normality.
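Why the characteristic function makes $\alpha$-stable estimation tractable can be seen with a much simpler, classical moment-type method (not the projected-score estimator above): for a standard symmetric stable law, $|\phi(t)| = \exp(-|t|^\alpha)$, so regressing $\log(-\log|\hat\phi(t)|)$ on $\log t$ recovers $\alpha$. Sample size and frequencies below are arbitrary choices:

```python
import numpy as np

rng = np.random.default_rng(1)
n = 200_000
alpha = 1.5

# Simulate standard symmetric alpha-stable variates (Chambers-Mallows-Weron).
U = rng.uniform(-np.pi / 2, np.pi / 2, n)
W = rng.exponential(1.0, n)
X = (np.sin(alpha * U) / np.cos(U) ** (1 / alpha)
     * (np.cos((1 - alpha) * U) / W) ** ((1 - alpha) / alpha))

# Empirical characteristic function at a few frequencies; the slope of
# log(-log|ecf|) against log(t) estimates alpha.
t = np.array([0.2, 0.4, 0.6, 0.8, 1.0])
ecf = np.abs(np.exp(1j * np.outer(t, X)).mean(axis=1))
slope, _ = np.polyfit(np.log(t), np.log(-np.log(ecf)), 1)
print(slope)  # close to 1.5
```

The abstract's projected-score estimator replaces this ad hoc regression with an explicit estimating equation built from the same characteristic-function ingredients.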
We study the problem of unbiased estimation of expectations with respect to (w.r.t.) a given, general probability measure $\pi$ on $(\mathbb{R}^d,\mathcal{B}(\mathbb{R}^d))$ that is absolutely continuous with respect to a standard Gaussian measure. We focus on simulation associated with a particular class of diffusion processes, sometimes termed the Schr\"odinger--F\"ollmer sampler, which approximates the law of a particular diffusion bridge process $\{X_t\}_{t\in [0,1]}$ on $\mathbb{R}^d$, $d\in \mathbb{N}$. This latter process is constructed such that, starting at $X_0=0$, one has $X_1\sim \pi$. Typically, the drift of the diffusion is intractable and, even if it were not, exact sampling of the associated diffusion would not be possible. As a result, \cite{sf_orig,jiao} consider a stochastic Euler--Maruyama scheme that yields biased estimators of expectations w.r.t.~$\pi$. We show that for this methodology to achieve a mean square error of $\mathcal{O}(\epsilon^2)$, for arbitrary $\epsilon>0$, the associated cost is $\mathcal{O}(\epsilon^{-5})$. We then introduce an alternative approach that provides unbiased estimates of expectations w.r.t.~$\pi$; that is, it suffers neither from the time-discretization bias nor from the bias associated with the approximation of the drift function. We prove that to achieve a mean square error of $\mathcal{O}(\epsilon^2)$, the associated cost is, with high probability, $\mathcal{O}(\epsilon^{-2}|\log(\epsilon)|^{2+\delta})$, for any $\delta>0$. We implement our method on several examples, including Bayesian inverse problems.
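A minimal sketch of the stochastic Euler-Maruyama idea, on a one-dimensional toy target whose Radon-Nikodym derivative w.r.t. the standard Gaussian is known in closed form. The Monte Carlo approximation of the drift is the point of the illustration; all tuning constants (path count, step count, inner samples) are assumptions, not the schemes analyzed above:

```python
import numpy as np

rng = np.random.default_rng(2)

# Toy target pi = N(mu, 1) on R, absolutely continuous w.r.t. the standard
# Gaussian gamma; f is proportional to the density ratio dpi/dgamma.
mu = 2.0
f = lambda y: np.exp(mu * y - mu**2 / 2)

M, K, N = 2000, 50, 256   # paths, time steps, Monte Carlo samples for the drift
dt = 1.0 / K
X = np.zeros(M)
for k in range(K):
    s = np.sqrt(1.0 - k * dt)
    Z = rng.normal(size=(M, N))
    fy = f(X[:, None] + s * Z)
    # Monte Carlo estimate of the Schrodinger-Follmer drift
    #   b(t, x) = E[Z f(x + s Z)] / (s E[f(x + s Z)]),  s = sqrt(1 - t).
    b = (Z * fy).mean(axis=1) / (s * fy.mean(axis=1))
    X = X + b * dt + np.sqrt(dt) * rng.normal(size=M)

print(X.mean())  # X_1 is approximately pi = N(2, 1), so the mean is near 2
```

Both the time discretization and the inner Monte Carlo average introduce the biases that the unbiased approach of the abstract removes.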
In this work, we study the problem of four-degree-of-freedom (3D position and heading) robot-to-robot relative frame transformation estimation using onboard odometry and inter-robot distance measurements. First, we present a theoretical analysis of the problem, namely the derivation and interpretation of the Cram\'er--Rao Lower Bound (CRLB), the Fisher Information Matrix (FIM) and its determinant. Second, we propose optimization-based methods to solve the problem, including a quadratically constrained quadratic programming (QCQP) formulation and the corresponding semidefinite programming (SDP) relaxation. Moreover, we address practical issues that previous works have ignored, such as accounting for spatio-temporal offsets between the ultra-wideband (UWB) and odometry sensors, rejecting UWB outliers, and checking for singular configurations before commencing operation. Finally, extensive simulations and real-life experiments with aerial robots show that the proposed QCQP and SDP methods outperform state-of-the-art methods, especially under poor geometric configurations or large measurement noise. In general, the QCQP method provides the best results at the expense of computation time, while the SDP method runs much faster and is sufficiently accurate in most cases.
Two aspects of neural networks that have been studied extensively in the recent literature are their function approximation properties and their training by gradient descent methods. The approximation problem seeks accurate approximations with a minimal number of weights. In most of the current literature these weights are fully or partially hand-crafted, demonstrating the capabilities of neural networks but not necessarily their practical performance. In contrast, optimization theory for neural networks relies heavily on an abundance of weights in over-parametrized regimes. This paper balances these two demands and provides an approximation result for shallow networks in $1d$ with non-convex weight optimization by gradient descent. We consider finite-width networks and the infinite-sample limit, which is the typical setup in approximation theory. Technically, this problem is not over-parametrized; however, some form of redundancy reappears as a loss in approximation rate compared to the best possible rates.
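The setting (a finite-width shallow network in $1d$, a dense sample grid as a proxy for the infinite-sample limit, and plain gradient descent on all weights) can be sketched as follows; the width, target function, and learning rate are illustrative assumptions, not the choices analyzed in the paper:

```python
import numpy as np

rng = np.random.default_rng(3)

# Dense 1d grid (proxy for the infinite-sample limit) and a smooth target.
x = np.linspace(-1.0, 1.0, 200)
y = np.sin(np.pi * x)

# Shallow ReLU network of finite width m: f(x) = sum_j a_j * relu(w_j x + b_j).
m = 32
w = rng.normal(size=m); b = rng.normal(size=m); a = rng.normal(size=m) / m

def forward(x):
    h = np.maximum(0.0, np.outer(x, w) + b)  # (n, m) hidden activations
    return h @ a, h

lr = 0.02
loss0 = np.mean((forward(x)[0] - y) ** 2)
for _ in range(3000):
    pred, h = forward(x)
    g = 2.0 * (pred - y) / len(x)            # d(loss)/d(pred)
    mask = (h > 0.0).astype(float)           # ReLU derivative
    grad_a = h.T @ g
    grad_w = ((mask * x[:, None]).T @ g) * a
    grad_b = (mask.T @ g) * a
    a -= lr * grad_a; w -= lr * grad_w; b -= lr * grad_b
loss1 = np.mean((forward(x)[0] - y) ** 2)
print(loss0, loss1)  # the squared error decreases substantially
```

Note the problem is non-convex in (w, b): gradient descent here has no global optimality guarantee, which is exactly the tension the paper addresses.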
This work considers Gaussian process interpolation with a periodized version of the Mat{\'e}rn covariance function (Stein, 1999, Section 6.7) with Fourier coefficients $\phi(\alpha^2 + j^2)^{-\nu-1/2}$. Convergence rates are studied for the joint maximum likelihood estimation of $\nu$ and $\phi$ when the data is sampled according to the model. The mean integrated squared error is also analyzed with fixed and estimated parameters, showing that maximum likelihood estimation yields asymptotically the same error as if the ground truth were known. Finally, the case where the observed function is a ``deterministic'' element of a continuous Sobolev space is also considered, suggesting that bounding assumptions on some parameters can lead to different estimates.
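A sketch of interpolation with a periodized Matérn covariance, building the kernel directly from a truncated version of its Fourier series; the truncation level, parameter values, and design points are arbitrary illustrative choices:

```python
import numpy as np

# Periodized Matern covariance on [0, 1) via its Fourier series with
# coefficients phi * (alpha^2 + j^2)^(-nu - 1/2); the j = 0 term is separate.
phi, alpha, nu = 1.0, 1.0, 1.5
J = 200  # truncation of the Fourier series

def k(x, y):
    d = np.subtract.outer(x, y)
    j = np.arange(1, J + 1)
    coef = phi * (alpha**2 + j**2) ** (-nu - 0.5)
    base = phi * alpha ** (-2 * nu - 1)
    return base + 2 * (coef * np.cos(2 * np.pi * np.multiply.outer(d, j))).sum(axis=-1)

rng = np.random.default_rng(4)
xs = np.linspace(0.0, 1.0, 8, endpoint=False)
K = k(xs, xs) + 1e-10 * np.eye(len(xs))  # tiny jitter for numerical stability
y = rng.normal(size=len(xs))             # arbitrary observations at the nodes
w = np.linalg.solve(K, y)

x_new = np.array([0.03, 0.41, 0.77])
pred = k(x_new, xs) @ w                  # GP posterior mean (noiseless interpolation)
resid = np.abs(k(xs, xs) @ w - y).max()
print(resid)                             # interpolates the data: tiny residual
```

The spectral construction makes the role of $\nu$ (smoothness, i.e. coefficient decay) and $\phi$ (scale) explicit, which is what the joint maximum likelihood analysis above exploits.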
Interval-censored multi-state data arise in many studies of chronic diseases, where the health status of a subject can be characterized by a finite number of disease states and the transition between any two states is only known to occur over a broad time interval. We formulate the effects of potentially time-dependent covariates on multi-state processes through semiparametric proportional intensity models with random effects. We adopt nonparametric maximum likelihood estimation (NPMLE) under general interval censoring and develop a stable expectation-maximization (EM) algorithm. We show that the resulting parameter estimators are consistent and that the finite-dimensional components are asymptotically normal with a covariance matrix that attains the semiparametric efficiency bound and can be consistently estimated through profile likelihood. In addition, we demonstrate through extensive simulation studies that the proposed numerical and inferential procedures perform well in realistic settings. Finally, we provide an application to a major epidemiologic cohort study.
We study the algorithmic problem of optimally covering a tree with $k$ mobile robots. The tree is known to all robots, and the goal is to assign a walk to each robot so that the union of these walks covers the whole tree. We assume that all edges have the same length and that traveling along an edge takes one unit of time. Two objective functions are considered: the cover time and the cover length. The cover time is the maximum time a robot needs to finish its assigned walk, and the cover length is the sum of the lengths of all the walks. We also consider a variant in which the robots must rendezvous periodically at the same vertex, at most a certain number of moves apart. We show that the two cost functions lead to problems of different complexity. For cover time minimization, we prove that the problem is NP-hard when $k$ is part of the input, regardless of whether periodic rendezvous are required. For cover length minimization, we show that the problem can be solved in polynomial time when periodic rendezvous are not required, and that it is NP-hard otherwise.
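The cover-length objective is easiest to see for a single robot: a closed walk produced by depth-first traversal covers every edge and crosses each edge exactly twice, for length $2(n-1)$. A minimal sketch (the multi-robot algorithms studied above are not reproduced here):

```python
# A single robot covers a tree with a closed walk of length 2(n-1):
# depth-first traversal crosses every edge exactly twice.
def dfs_cover_walk(tree, root=0):
    """Return a closed walk from root that traverses every edge of the tree."""
    walk = [root]
    def visit(u, parent):
        for v in tree[u]:
            if v != parent:
                walk.append(v)
                visit(v, u)
                walk.append(u)   # backtrack over the same edge
    visit(root, None)
    return walk

# Example tree on 6 vertices (adjacency lists).
tree = {0: [1, 2], 1: [0, 3, 4], 2: [0, 5], 3: [1], 4: [1], 5: [2]}
walk = dfs_cover_walk(tree)
covered = {frozenset(e) for e in zip(walk, walk[1:])}
print(len(walk) - 1)   # 10 moves = 2 * (6 - 1)
print(len(covered))    # all 5 edges covered
```

Splitting such a walk among $k$ robots, with or without the periodic-rendezvous constraint, is where the complexity gap between the two objectives appears.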
The parameters of the log-logistic distribution are generally estimated by classical methods such as maximum likelihood, but these methods can yield severely biased estimates when the data contain outliers. In this paper, we consider several alternative estimators, which not only have closed-form expressions but are also quite robust to a certain level of data contamination. We investigate the robustness of each estimator in terms of its breakdown point. The finite-sample performance and effectiveness of these estimators are evaluated through Monte Carlo simulations and a real-data application. Numerical results demonstrate that the proposed estimators perform favorably: they are comparable with the maximum likelihood estimator on uncontaminated data and superior in the presence of data contamination.
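One classical example of a closed-form, quantile-based estimator (illustrative, not necessarily among those proposed above): the log-logistic median equals the scale parameter $\alpha$, and since $\log x_p = \log\alpha + \mathrm{logit}(p)/\beta$, the quartiles determine the shape $\beta$. Such estimators tolerate a sizable fraction of outliers:

```python
import numpy as np

rng = np.random.default_rng(5)
a_true, b_true = 2.0, 3.0   # scale alpha and shape beta of the log-logistic

# Inverse-CDF sampling from F(x) = 1 / (1 + (x / alpha)^(-beta)).
u = rng.uniform(size=20_000)
x = a_true * (u / (1 - u)) ** (1 / b_true)
x[:1000] = 1e6              # contaminate 5% of the sample with gross outliers

# Closed-form quantile estimators: the median estimates alpha, and the
# interquartile range on the log scale gives beta = 2*log(3) / log(Q3/Q1).
q1, med, q3 = np.quantile(x, [0.25, 0.5, 0.75])
alpha_hat = med
beta_hat = 2 * np.log(3.0) / np.log(q3 / q1)
print(alpha_hat, beta_hat)  # near (2, 3) despite the outliers
```

Maximum likelihood on the same contaminated sample would be dragged far from the truth, which is the motivating contrast of the abstract.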
Variational Bayesian posterior inference often requires simplifying approximations, such as mean-field parametrisation, to ensure tractability. However, prior work has associated the variational mean-field approximation for Bayesian neural networks with underfitting in the case of small datasets or large model sizes. In this work, we show that invariances in the likelihood function of over-parametrised models contribute to this phenomenon: these invariances complicate the structure of the posterior by introducing discrete and/or continuous modes that cannot be well approximated by Gaussian mean-field distributions. In particular, we show that the mean-field approximation has an additional gap in the evidence lower bound compared to a purpose-built posterior that takes the known invariances into account. Importantly, this invariance gap is not constant; it vanishes as the approximation reverts to the prior. We proceed by first analysing, in detail, translation invariances in a linear model with a single data point. We show that, while the true posterior can be constructed from a mean-field parametrisation, this is achieved only if the objective function takes the invariance gap into account. We then transfer our analysis of the linear model to neural networks. Our analysis provides a framework for future work to explore solutions to the invariance problem.
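The core phenomenon can be reproduced in a two-parameter Gaussian toy model: when the likelihood depends on the weights only through $w_1+w_2$, the exact posterior is strongly coupled along the unidentified direction $w_1-w_2$, and the best Gaussian mean-field fit incurs a strictly positive KL gap in the ELBO. All numerical values below are illustrative:

```python
import numpy as np

# Linear model y = w1 + w2 + noise with prior N(0, I): the likelihood depends
# on w only through w1 + w2, a continuous (translation) invariance direction.
sigma2 = 0.25
Lam = np.eye(2) + np.outer([1.0, 1.0], [1.0, 1.0]) / sigma2  # posterior precision
Sigma = np.linalg.inv(Lam)                                   # exact posterior covariance

# Best Gaussian mean-field fit under KL(q||p) with matching mean: for a
# Gaussian target, each q_i gets precision equal to the diagonal of Lam.
D = np.diag(np.diag(Lam))
Sq = np.linalg.inv(D)

# KL(q||p) for zero-mean Gaussians: the irreducible mean-field gap in the ELBO.
kl = 0.5 * (np.trace(Lam @ Sq) - 2 + np.log(np.linalg.det(Sigma) * np.linalg.det(D)))
print(kl)  # strictly positive: mean-field cannot represent the coupled posterior
```

As the noise variance grows, the posterior reverts to the isotropic prior and this gap shrinks to zero, mirroring the vanishing invariance gap described above.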
To estimate causal effects, analysts performing observational studies in health settings use several strategies to mitigate bias due to confounding by indication. There are two broad classes of approaches for this purpose: adjustment for confounders and instrumental variables (IVs). Because such approaches rest largely on untestable assumptions, analysts must operate under the working assumption that these methods are imperfect. In this tutorial, we formalize a set of general principles and heuristics for estimating causal effects with the two approaches when their assumptions are potentially violated. This crucially requires reframing observational studies as exercises in hypothesizing potential scenarios under which the estimates from one approach are less inconsistent than those from the other. While most of our discussion of methodology centers on the linear setting, we touch upon complexities in non-linear settings and flexible procedures such as targeted minimum loss-based estimation (TMLE) and double machine learning (DML). To demonstrate the application of our principles, we investigate the off-label use of donepezil for mild cognitive impairment (MCI). We compare and contrast results from confounder and IV methods, traditional and flexible, within our analysis and against a similar observational study and a clinical trial.
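The basic contrast between the two approaches in the linear setting can be sketched on simulated data with an unmeasured confounder; all structural coefficients are assumptions for illustration:

```python
import numpy as np

rng = np.random.default_rng(6)
n, tau = 100_000, 1.0           # tau is the true causal effect of A on Y

U = rng.normal(size=n)          # unmeasured confounder
Z = rng.normal(size=n)          # instrument: affects A, not Y directly
A = Z + U + rng.normal(size=n)  # treatment, confounded by U
Y = tau * A + U + rng.normal(size=n)

# Naive regression of Y on A absorbs the confounding through U.
ols = np.cov(A, Y)[0, 1] / np.var(A)
# IV (Wald / two-stage least squares) estimate uses only the Z-induced variation.
iv = np.cov(Z, Y)[0, 1] / np.cov(Z, A)[0, 1]
print(ols, iv)  # ols is biased upward (about 1.33 here); iv is near 1.0
```

If the exclusion restriction failed (Z affecting Y directly), the IV estimate would be inconsistent as well; weighing such scenarios against residual confounding is the kind of reasoning the tutorial formalizes.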