亚洲男人的天堂2018av,欧美草比,久久久久久免费视频精选,国色天香在线看免费,久久久久亚洲av成人片仓井空

We consider a general statistical estimation problem wherein binary labels across different observations are not independent conditioned on their feature vectors, but dependent, capturing settings where e.g. these observations are collected on a spatial domain, a temporal domain, or a social network, which induce dependencies. We model these dependencies in the language of Markov Random Fields and, importantly, allow these dependencies to be substantial, i.e do not assume that the Markov Random Field capturing these dependencies is in high temperature. As our main contribution we provide algorithms and statistically efficient estimation rates for this model, giving several instantiations of our bounds in logistic regression, sparse logistic regression, and neural network settings with dependent data. Our estimation guarantees follow from novel results for estimating the parameters (i.e. external fields and interaction strengths) of Ising models from a {\em single} sample. {We evaluate our estimation approach on real networked data, showing that it outperforms standard regression approaches that ignore dependencies, across three text classification datasets: Cora, Citeseer and Pubmed.}

相關內容

We consider the task of estimating a conditional density using i.i.d. samples from a joint distribution, which is a fundamental problem with applications in both classification and uncertainty quantification for regression. For joint density estimation, minimax rates have been characterized for general density classes in terms of uniform (metric) entropy, a well-studied notion of statistical capacity. When applying these results to conditional density estimation, the use of uniform entropy -- which is infinite when the covariate space is unbounded and suffers from the curse of dimensionality -- can lead to suboptimal rates. Consequently, minimax rates for conditional density estimation cannot be characterized using these classical results. We resolve this problem for well-specified models, obtaining matching (within logarithmic factors) upper and lower bounds on the minimax Kullback--Leibler risk in terms of the empirical Hellinger entropy for the conditional density class. The use of empirical entropy allows us to appeal to concentration arguments based on local Rademacher complexity, which -- in contrast to uniform entropy -- leads to matching rates for large, potentially nonparametric classes and captures the correct dependence on the complexity of the covariate space. Our results require only that the conditional densities are bounded above, and do not require that they are bounded below or otherwise satisfy any tail conditions.

The estimation of information measures of continuous distributions based on samples is a fundamental problem in statistics and machine learning. In this paper, we analyze estimates of differential entropy in $K$-dimensional Euclidean space, computed from a finite number of samples, when the probability density function belongs to a predetermined convex family $\mathcal{P}$. First, estimating differential entropy to any accuracy is shown to be infeasible if the differential entropy of densities in $\mathcal{P}$ is unbounded, clearly showing the necessity of additional assumptions. Subsequently, we investigate sufficient conditions that enable confidence bounds for the estimation of differential entropy. In particular, we provide confidence bounds for simple histogram based estimation of differential entropy from a fixed number of samples, assuming that the probability density function is Lipschitz continuous with known Lipschitz constant and known, bounded support. Our focus is on differential entropy, but we provide examples that show that similar results hold for mutual information and relative entropy as well.

The probability distribution of precipitation amount strongly depends on geography, climate zone, and time scale considered. Closed-form parametric probability distributions are not sufficiently flexible to provide accurate and universal models for precipitation amount over different time scales. In this paper we derive non-parametric estimates of the cumulative distribution function (CDF) of precipitation amount for wet time intervals. The CDF estimates are obtained by integrating the kernel density estimator leading to semi-explicit CDF expressions for different kernel functions. We investigate kernel-based CDF estimation with an adaptive plug-in bandwidth (KCDE), using both synthetic data sets and reanalysis precipitation data from the island of Crete (Greece). We show that KCDE provides better estimates of the probability distribution than the standard empirical (staircase) estimate and kernel-based estimates that use the normal reference bandwidth. We also demonstrate that KCDE enables the simulation of non-parametric precipitation amount distributions by means of the inverse transform sampling method.

Human motion synthesis is an important problem with applications in graphics, gaming and simulation environments for robotics. Existing methods require accurate motion capture data for training, which is costly to obtain. Instead, we propose a framework for training generative models of physically plausible human motion directly from monocular RGB videos, which are much more widely available. At the core of our method is a novel optimization formulation that corrects imperfect image-based pose estimations by enforcing physics constraints and reasons about contacts in a differentiable way. This optimization yields corrected 3D poses and motions, as well as their corresponding contact forces. Results show that our physically-corrected motions significantly outperform prior work on pose estimation. We can then use these to train a generative model to synthesize future motion. We demonstrate both qualitatively and quantitatively significantly improved motion estimation, synthesis quality and physical plausibility achieved by our method on the large scale Human3.6m dataset \cite{h36m_pami} as compared to prior kinematic and physics-based methods. By enabling learning of motion synthesis from video, our method paves the way for large-scale, realistic and diverse motion synthesis.

Human pose estimation has achieved significant progress on images with high imaging resolution. However, low-resolution imagery data bring nontrivial challenges which are still under-studied. To fill this gap, we start with investigating existing methods and reveal that the most dominant heatmap-based methods would suffer more severe model performance degradation from low-resolution, and offset learning is an effective strategy. Established on this observation, in this work we propose a novel Confidence-Aware Learning (CAL) method which further addresses two fundamental limitations of existing offset learning methods: inconsistent training and testing, decoupled heatmap and offset learning. Specifically, CAL selectively weighs the learning of heatmap and offset with respect to ground-truth and most confident prediction, whilst capturing the statistical importance of model output in mini-batch learning manner. Extensive experiments conducted on the COCO benchmark show that our method outperforms significantly the state-of-the-art methods for low-resolution human pose estimation.

Measurement error is a pervasive issue which renders the results of an analysis unreliable. The measurement error literature contains numerous correction techniques, which can be broadly divided into those which aim to produce exactly consistent estimators, and those which are only approximately consistent. While consistency is a desirable property, it is typically attained only under specific model assumptions. Two techniques, regression calibration and simulation extrapolation, are used frequently in a wide variety of parametric and semiparametric settings. However, in many settings these methods are only approximately consistent. We generalize these corrections, relaxing assumptions placed on replicate measurements. Under regularity conditions, the estimators are shown to be asymptotically normal, with a sandwich estimator for the asymptotic variance. Through simulation, we demonstrate the improved performance of the modified estimators, over the standard techniques, when these assumptions are violated. We motivate these corrections using the Framingham Heart Study, and apply the generalized techniques to an analysis of these data.

While deep neural networks are highly performant and successful in a wide range of real-world problems, estimating their predictive uncertainty remains a challenging task. To address this challenge, we propose and implement a loss function for regression uncertainty estimation based on the Bayesian Validation Metric (BVM) framework while using ensemble learning. A series of experiments on in-distribution data show that the proposed method is competitive with existing state-of-the-art methods. In addition, experiments on out-of-distribution data show that the proposed method is robust to statistical change and exhibits superior predictive capability.

The remarkable practical success of deep learning has revealed some major surprises from a theoretical perspective. In particular, simple gradient methods easily find near-optimal solutions to non-convex optimization problems, and despite giving a near-perfect fit to training data without any explicit effort to control model complexity, these methods exhibit excellent predictive accuracy. We conjecture that specific principles underlie these phenomena: that overparametrization allows gradient methods to find interpolating solutions, that these methods implicitly impose regularization, and that overparametrization leads to benign overfitting. We survey recent theoretical progress that provides examples illustrating these principles in simpler settings. We first review classical uniform convergence results and why they fall short of explaining aspects of the behavior of deep learning methods. We give examples of implicit regularization in simple settings, where gradient methods lead to minimal norm functions that perfectly fit the training data. Then we review prediction methods that exhibit benign overfitting, focusing on regression problems with quadratic loss. For these methods, we can decompose the prediction rule into a simple component that is useful for prediction and a spiky component that is useful for overfitting but, in a favorable setting, does not harm prediction accuracy. We focus specifically on the linear regime for neural networks, where the network can be approximated by a linear model. In this regime, we demonstrate the success of gradient flow, and we consider benign overfitting with two-layer networks, giving an exact asymptotic analysis that precisely demonstrates the impact of overparametrization. We conclude by highlighting the key challenges that arise in extending these insights to realistic deep learning settings.

Implicit probabilistic models are models defined naturally in terms of a sampling procedure and often induces a likelihood function that cannot be expressed explicitly. We develop a simple method for estimating parameters in implicit models that does not require knowledge of the form of the likelihood function or any derived quantities, but can be shown to be equivalent to maximizing likelihood under some conditions. Our result holds in the non-asymptotic parametric setting, where both the capacity of the model and the number of data examples are finite. We also demonstrate encouraging experimental results.

A fundamental computation for statistical inference and accurate decision-making is to compute the marginal probabilities or most probable states of task-relevant variables. Probabilistic graphical models can efficiently represent the structure of such complex data, but performing these inferences is generally difficult. Message-passing algorithms, such as belief propagation, are a natural way to disseminate evidence amongst correlated variables while exploiting the graph structure, but these algorithms can struggle when the conditional dependency graphs contain loops. Here we use Graph Neural Networks (GNNs) to learn a message-passing algorithm that solves these inference tasks. We first show that the architecture of GNNs is well-matched to inference tasks. We then demonstrate the efficacy of this inference approach by training GNNs on a collection of graphical models and showing that they substantially outperform belief propagation on loopy graphs. Our message-passing algorithms generalize out of the training set to larger graphs and graphs with different structure.

北京阿比特科技有限公司