
Given univariate random variables $Y_1, \ldots, Y_n$ with the $\text{Uniform}(\theta_0 - 1, \theta_0 + 1)$ distribution, the sample midrange $\frac{Y_{(n)}+Y_{(1)}}{2}$ is the MLE for $\theta_0$ and estimates $\theta_0$ with error of order $1/n$, much smaller than the $1/\sqrt{n}$ error rate of the usual sample mean estimator. However, the sample midrange performs poorly when the data instead follow, say, the Gaussian $N(\theta_0, 1)$ distribution, where its error rate degrades to $1/\sqrt{\log n}$. In this paper, we propose an estimator of the location $\theta_0$ with a rate of convergence that can, in many settings, adapt to the underlying distribution, which we assume to be symmetric around $\theta_0$ but otherwise unknown. When the underlying distribution is compactly supported, we show that our estimator attains a rate of convergence of $n^{-\frac{1}{\alpha}}$ up to polylog factors, where the rate parameter $\alpha$ can take on any value in $(0, 2]$ and depends on the moments of the underlying distribution. Our estimator is formed by the $\ell^\gamma$-center of the data, for a $\gamma\geq2$ chosen in a data-driven way -- by minimizing a criterion motivated by the asymptotic variance. Our approach can be directly applied to the regression setting where $\theta_0$ is a function of observed features, and it motivates the use of the $\ell^\gamma$ loss function for $\gamma > 2$ in certain settings.
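The rate gap between the two classical estimators is easy to see numerically. The simulation below is our own illustration, not from the paper: it compares the midrange and the sample mean on uniform data, where with $n = 1000$ the midrange's $1/n$ error should be far smaller than the mean's $1/\sqrt{n}$ error.

```python
import numpy as np

rng = np.random.default_rng(0)
theta0 = 5.0

def midrange(y):
    """Midpoint of the sample range, (Y_(1) + Y_(n)) / 2."""
    return 0.5 * (y.min() + y.max())

errs_mid, errs_mean = [], []
for _ in range(200):
    y = rng.uniform(theta0 - 1, theta0 + 1, size=1000)
    errs_mid.append(abs(midrange(y) - theta0))
    errs_mean.append(abs(y.mean() - theta0))

# For Uniform data the midrange error is O(1/n) while the mean error is
# O(1/sqrt(n)), so the midrange wins by roughly an order of magnitude here.
print(np.mean(errs_mid), np.mean(errs_mean))
```

Re-running the same comparison with `rng.standard_normal` in place of `rng.uniform` reverses the ranking, which is the failure mode the adaptive $\ell^\gamma$-center estimator is designed to avoid.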

Related Content

Recent approaches to causal inference have focused on causal effects defined as contrasts between the distribution of counterfactual outcomes under hypothetical interventions on the nodes of a graphical model. In this article we develop theory for causal effects defined with respect to a different type of intervention, one which alters the information propagated through the edges of the graph. These information transfer interventions may be more useful than node interventions in settings in which causes are non-manipulable, for example when considering race or genetics as a causal agent. Furthermore, information transfer interventions allow us to define path-specific decompositions which are identified in the presence of treatment-induced mediator-outcome confounding, a practical problem whose general solution remains elusive. We prove that the proposed effects provide valid statistical tests of mechanisms, unlike popular methods based on randomized interventions on the mediator. We propose efficient non-parametric estimators for a covariance version of the proposed effects, using data-adaptive regression coupled with semi-parametric efficiency theory to address model misspecification bias while retaining $\sqrt{n}$-consistency and asymptotic normality. We illustrate the use of our methods in two examples using publicly available data.

Inference tasks in signal processing are often characterized by the availability of reliable statistical modeling with some missing instance-specific parameters. One conventional approach uses data to estimate these missing parameters and then infers based on the estimated model. Alternatively, data can also be leveraged to directly learn the inference mapping end-to-end. These approaches for combining partially-known statistical models and data in inference are related to the notions of generative and discriminative models used in the machine learning literature, typically considered in the context of classifiers. The goal of this lecture note is to introduce the concepts of generative and discriminative learning for inference with a partially-known statistical model. While machine learning systems often lack the interpretability of traditional signal processing methods, we focus on a simple setting where one can interpret and compare the approaches in a tractable manner that is accessible and relevant to signal processing readers. In particular, we exemplify the approaches for the task of Bayesian signal estimation in a jointly Gaussian setting with the mean-squared error (MSE) objective, i.e., a linear estimation setting.
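In the jointly Gaussian linear setting described above, the model-based ("generative") route has a closed form: the MMSE estimator is linear in the observation. The sketch below is our own illustration with assumed model parameters `H`, `Q`, `R`, not notation from the lecture note.

```python
import numpy as np

rng = np.random.default_rng(1)

# Jointly Gaussian model: s ~ N(0, Q), y = H s + w, w ~ N(0, R).
n, m = 3, 4
Q = np.eye(n)
H = rng.standard_normal((m, n))
R = 0.1 * np.eye(m)

# In the Gaussian case the MMSE estimator is linear,
#   s_hat = Q H^T (H Q H^T + R)^{-1} y,
# and coincides with the LMMSE estimator.
K = Q @ H.T @ np.linalg.inv(H @ Q @ H.T + R)

s = rng.standard_normal(n)
y = H @ s + np.sqrt(0.1) * rng.standard_normal(m)
s_hat = K @ y
```

A discriminative alternative would learn the gain matrix `K` directly from paired samples `(s, y)` by least squares, which is the kind of comparison the lecture note makes tractable.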

Conditional effect estimation has great scientific and policy importance because interventions may impact subjects differently depending on their characteristics. Most research has focused on estimating the conditional average treatment effect (CATE). However, identification of the CATE requires that all subjects have a non-zero probability of receiving treatment (i.e., positivity), which may be unrealistic in practice. Instead, we propose conditional effects based on incremental propensity score interventions, which are stochastic interventions where the odds of treatment are multiplied by some factor. These effects do not require positivity for identification and can be better suited for modeling scenarios in which people cannot be forced into treatment. We develop a projection estimator and a flexible nonparametric estimator that can each estimate all the conditional effects we propose, and we derive model-agnostic error guarantees showing that both estimators satisfy a form of double robustness. Further, we propose a summary of treatment effect heterogeneity and a test for any effect heterogeneity based on the variance of a conditional derivative effect, and derive a nonparametric estimator that also satisfies a form of double robustness. Finally, we demonstrate our estimators by analyzing the effect of intensive care unit admission on mortality using a dataset from the (SPOT)light study.
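Multiplying the odds of treatment by a factor $\delta$ maps a propensity score $\pi(x)$ to the shifted propensity $q(x) = \delta\pi(x) / (\delta\pi(x) + 1 - \pi(x))$. A minimal sketch (function and variable names are ours) shows why positivity is not needed:

```python
import numpy as np

def incremental_propensity(pi, delta):
    """Shifted propensity after multiplying the odds of treatment by delta:
    new odds = delta * pi / (1 - pi), so q = delta*pi / (delta*pi + 1 - pi)."""
    pi = np.asarray(pi, dtype=float)
    return delta * pi / (delta * pi + 1.0 - pi)

pi = np.array([0.0, 0.2, 0.5, 1.0])
q = incremental_propensity(pi, delta=2.0)
# q stays in [0, 1] even at the boundary: subjects with pi = 0 keep q = 0
# and subjects with pi = 1 keep q = 1, so no one is forced into (or out of)
# treatment and positivity is not required for identification.
```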

Obtaining guarantees on the convergence of the minimizers of empirical risks to those of the true risk is a fundamental matter in statistical learning. Instead of deriving guarantees on the usual estimation error, the goal of this paper is to provide concentration inequalities on the distance between the sets of minimizers of the risks for a broad spectrum of estimation problems. In particular, the risks are defined on metric spaces through probability measures that are also supported on metric spaces. Particular attention is therefore given to including unbounded spaces and non-convex cost functions that might also be unbounded. This work identifies a set of assumptions describing a regime that seems to govern the concentration in many estimation problems, in which the empirical minimizers are stable. This stability can then be leveraged to prove parametric concentration rates in probability and in expectation. The assumptions are verified, and the bounds showcased, on a selection of estimation problems such as barycenters on metric spaces with positive or negative curvature, subspaces of covariance matrices, regression problems, and entropic-Wasserstein barycenters.

This study examines the use of a recurrent neural network for estimating the parameters of a Hawkes model based on high-frequency financial data, and subsequently, for computing volatility. Neural networks have shown promising results in various fields, and interest within finance is growing as well. Our approach demonstrates significantly faster computational performance than traditional maximum likelihood estimation methods while yielding comparable accuracy in both simulation and empirical studies. Furthermore, we demonstrate the application of this method to real-time volatility measurement, enabling the continuous estimation of financial volatility as new price data arrive from the market.
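The abstract does not specify the Hawkes parametrization; as a hedged illustration, the exponential-kernel conditional intensity commonly used with high-frequency data can be sketched as follows (parameter names `mu`, `alpha`, `beta` are ours):

```python
import numpy as np

def hawkes_intensity(t, events, mu, alpha, beta):
    """Conditional intensity of an exponential-kernel Hawkes process:
    lambda(t) = mu + sum_{t_i < t} alpha * exp(-beta * (t - t_i)),
    i.e. a baseline rate mu plus an exponentially decaying bump
    contributed by each past event."""
    past = np.asarray([ti for ti in events if ti < t])
    return mu + alpha * np.exp(-beta * (t - past)).sum()

lam = hawkes_intensity(2.0, [0.5, 1.0, 1.5], mu=0.2, alpha=0.8, beta=1.3)
```

Estimating `(mu, alpha, beta)` from observed event times is the task the paper delegates to a recurrent network instead of maximum likelihood.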

We consider a Prohorov metric-based nonparametric approach to estimating the probability distribution of a random parameter vector in discrete-time abstract parabolic systems. We establish the existence and consistency of a least squares estimator. We develop a finite-dimensional approximation and convergence theory, and obtain numerical results by applying the nonparametric estimation approach and the finite-dimensional approximation framework to a problem involving an alcohol biosensor, wherein we estimate the probability distribution of random parameters in a parabolic PDE. To show the convergence of the estimated distribution to the "true" distribution, we simulate data from the "true" distribution, apply our algorithm, and obtain the estimated cumulative distribution function. We then use the Markov Chain Monte Carlo Metropolis Algorithm to generate random samples from the estimated distribution, and perform a generalized (2-dimensional) two-sample Kolmogorov-Smirnov test with the null hypothesis that our generated random samples from the estimated distribution and generated random samples from the "true" distribution are drawn from the same distribution. We then apply our algorithm to actual human subject data from the alcohol biosensor and observe the behavior of the normalized root-mean-square error (NRMSE) using leave-one-out cross-validation (LOOCV) under different model complexities.

Sampling from the posterior is a key technical problem in Bayesian statistics. Rigorous guarantees are difficult to obtain for Markov Chain Monte Carlo algorithms in common use. In this paper, we study an alternative class of algorithms based on diffusion processes. The diffusion is constructed in such a way that, at its final time, it approximates the target posterior distribution. The stochastic differential equation that defines this process is discretized (using an Euler scheme) to provide an efficient sampling algorithm. Our construction of the diffusion is based on the notion of observation process and the related idea of stochastic localization. Namely, the diffusion process describes a sample that is conditioned on increasing information. An overlapping family of processes was derived in the machine learning literature via time-reversal. We apply this method to posterior sampling in the high-dimensional symmetric spiked model. We observe a rank-one matrix ${\boldsymbol \theta}{\boldsymbol \theta}^{\sf T}$ corrupted by Gaussian noise, and want to sample ${\boldsymbol \theta}$ from the posterior. Our sampling algorithm makes use of an oracle that computes the posterior expectation of ${\boldsymbol \theta}$ given the data and the additional observation process. We provide an efficient implementation of this oracle using approximate message passing. We thus develop the first sampling algorithm for this problem with approximation guarantees.
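The Euler discretization step mentioned above is generic and can be sketched independently of the paper's construction. Below is a standard Euler (Euler-Maruyama) scheme for an SDE $dX_t = b(X_t, t)\,dt + dW_t$ with a placeholder drift; in the paper's sampler, the drift would come from the AMP-based posterior-expectation oracle, which we do not reproduce here.

```python
import numpy as np

def euler_maruyama(drift, x0, t_grid, rng):
    """Euler scheme for dX_t = drift(X_t, t) dt + dW_t:
    X_{k+1} = X_k + drift(X_k, t_k) * dt + sqrt(dt) * N(0, I)."""
    x = np.array(x0, dtype=float)
    for t0, t1 in zip(t_grid[:-1], t_grid[1:]):
        dt = t1 - t0
        x = x + drift(x, t0) * dt + np.sqrt(dt) * rng.standard_normal(x.shape)
    return x

rng = np.random.default_rng(2)
# Ornstein-Uhlenbeck drift as a stand-in: pulls X toward 0.
x_T = euler_maruyama(lambda x, t: -x, np.ones(5), np.linspace(0.0, 1.0, 101), rng)
```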

We present self-supervised geometric perception (SGP), the first general framework to learn a feature descriptor for correspondence matching without any ground-truth geometric model labels (e.g., camera poses, rigid transformations). Our first contribution is to formulate geometric perception as an optimization problem that jointly optimizes the feature descriptor and the geometric models given a large corpus of visual measurements (e.g., images, point clouds). Under this optimization formulation, we show that two important streams of research in vision, namely robust model fitting and deep feature learning, correspond to optimizing one block of the unknown variables while fixing the other block. This analysis naturally leads to our second contribution -- the SGP algorithm that performs alternating minimization to solve the joint optimization. SGP iteratively executes two meta-algorithms: a teacher that performs robust model fitting given learned features to generate geometric pseudo-labels, and a student that performs deep feature learning under noisy supervision of the pseudo-labels. As a third contribution, we apply SGP to two perception problems on large-scale real datasets, namely relative camera pose estimation on MegaDepth and point cloud registration on 3DMatch. We demonstrate that SGP achieves state-of-the-art performance that is on-par or superior to the supervised oracles trained using ground-truth labels.
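The teacher/student alternation in SGP is an instance of block-coordinate (alternating) minimization: fix one block of variables, solve for the other, and repeat. The toy example below is not SGP itself, only an illustration of that pattern on a rank-one fitting problem where each block update has a closed form.

```python
import numpy as np

rng = np.random.default_rng(3)

# Toy stand-in for the two-block alternation: fit a rank-1 model
# A ~ u v^T by minimizing ||A - u v^T||^2 over one block at a time.
A = np.outer(rng.standard_normal(6), rng.standard_normal(4))
u = rng.standard_normal(6)
v = rng.standard_normal(4)
for _ in range(50):
    u = A @ v / (v @ v)      # fix v, least-squares solve for u
    v = A.T @ u / (u @ u)    # fix u, least-squares solve for v

residual = np.linalg.norm(A - np.outer(u, v))
```

In SGP, the two closed-form solves are replaced by a robust model-fitting "teacher" and a feature-learning "student", but the fix-one-block, optimize-the-other structure is the same.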

Substantial progress has been made recently on developing provably accurate and efficient algorithms for low-rank matrix factorization via nonconvex optimization. While conventional wisdom often takes a dim view of nonconvex optimization algorithms due to their susceptibility to spurious local minima, simple iterative methods such as gradient descent have been remarkably successful in practice. The theoretical footings, however, had been largely lacking until recently. In this tutorial-style overview, we highlight the important role of statistical models in enabling efficient nonconvex optimization with performance guarantees. We review two contrasting approaches: (1) two-stage algorithms, which consist of a tailored initialization step followed by successive refinement; and (2) global landscape analysis and initialization-free algorithms. Several canonical matrix factorization problems are discussed, including but not limited to matrix sensing, phase retrieval, matrix completion, blind deconvolution, robust principal component analysis, phase synchronization, and joint alignment. Special care is taken to illustrate the key technical insights underlying their analyses. This article serves as a testament that the integrated consideration of optimization and statistics leads to fruitful research findings.

With the rapid increase of large-scale, real-world datasets, it becomes critical to address the problem of long-tailed data distribution (i.e., a few classes account for most of the data, while most classes are under-represented). Existing solutions typically adopt class re-balancing strategies such as re-sampling and re-weighting based on the number of observations for each class. In this work, we argue that as the number of samples increases, the additional benefit of a newly added data point will diminish. We introduce a novel theoretical framework to measure data overlap by associating with each sample a small neighboring region rather than a single point. The effective number of samples is defined as the volume of samples and can be calculated by a simple formula $(1-\beta^{n})/(1-\beta)$, where $n$ is the number of samples and $\beta \in [0,1)$ is a hyperparameter. We design a re-weighting scheme that uses the effective number of samples for each class to re-balance the loss, thereby yielding a class-balanced loss. Comprehensive experiments are conducted on artificially induced long-tailed CIFAR datasets and large-scale datasets including ImageNet and iNaturalist. Our results show that when trained with the proposed class-balanced loss, the network is able to achieve significant performance gains on long-tailed datasets.
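The effective-number formula $(1-\beta^{n})/(1-\beta)$ lends itself to a direct sketch. In the code below, the normalization of the per-class weights (summing to the number of classes) is a common convention that may differ from the paper's exact choice.

```python
def effective_number(n, beta):
    """Effective number of samples E_n = (1 - beta**n) / (1 - beta).
    As n grows, E_n saturates at 1 / (1 - beta), capturing the
    diminishing marginal benefit of each newly added sample."""
    return (1.0 - beta ** n) / (1.0 - beta)

def class_balanced_weights(counts, beta):
    """Per-class weights proportional to 1 / E_n, normalized here to
    sum to the number of classes (a common convention)."""
    inv = [1.0 / effective_number(n, beta) for n in counts]
    s = sum(inv)
    return [len(counts) * w / s for w in inv]

w = class_balanced_weights([5000, 500, 50], beta=0.999)
# Rare classes receive larger weights than frequent ones, re-balancing
# the loss on a long-tailed class distribution.
```

Note that $\beta = 0$ recovers plain inverse-count-free (uniform) weighting, since every class then has effective number 1, while $\beta \to 1$ approaches inverse-frequency re-weighting.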
