亚洲男人的天堂2018av,欧美草比,久久久久久免费视频精选,国色天香在线看免费,久久久久亚洲av成人片仓井空

A confidence sequence is a sequence of confidence intervals that is uniformly valid over an unbounded time horizon. Our work develops confidence sequences whose widths go to zero, with nonasymptotic coverage guarantees under nonparametric conditions. We draw connections between the Cram\'er-Chernoff method for exponential concentration, the law of the iterated logarithm (LIL), and the sequential probability ratio test -- our confidence sequences are time-uniform extensions of the first; provide tight, nonasymptotic characterizations of the second; and generalize the third to nonparametric settings, including sub-Gaussian and Bernstein conditions, self-normalized processes, and matrix martingales. We illustrate the generality of our proof techniques by deriving an empirical-Bernstein bound growing at a LIL rate, as well as a novel upper LIL for the maximum eigenvalue of a sum of random matrices. Finally, we apply our methods to covariance matrix estimation and to estimation of sample average treatment effect under the Neyman-Rubin potential outcomes model.

相關內容

We study nonparametric Bayesian models for reversible multi-dimensional diffusions with periodic drift. For continuous observation paths, reversibility is exploited to prove a general posterior contraction rate theorem for the drift gradient vector field under approximation-theoretic conditions on the induced prior for the invariant measure. The general theorem is applied to Gaussian priors and $p$-exponential priors, which are shown to converge to the truth at the minimax optimal rate over Sobolev smoothness classes in any dimension

High-resolution simulations of particle-based kinetic plasma models typically require a high number of particles and thus often become computationally intractable. This is exacerbated in multi-query simulations, where the problem depends on a set of parameters. In this work, we derive reduced order models for the semi-discrete Hamiltonian system resulting from a geometric particle-in-cell approximation of the parametric Vlasov-Poisson equations. Since the problem's non-dissipative and highly nonlinear nature makes it reducible only locally in time, we adopt a nonlinear reduced basis approach where the reduced phase space evolves in time. This strategy allows a significant reduction in the number of simulated particles, but the evaluation of the nonlinear operators associated with the Vlasov-Poisson coupling remains computationally expensive. We propose a novel reduction of the nonlinear terms that combines adaptive parameter sampling and hyper-reduction techniques to address this. The proposed approach allows decoupling the operations having a cost dependent on the number of particles from those that depend on the instances of the required parameters. In particular, in each time step, the electric potential is approximated via dynamic mode decomposition (DMD) and the particle-to-grid map via a discrete empirical interpolation method (DEIM). These approximations are constructed from data obtained from a past temporal window at a few selected values of the parameters to guarantee a computationally efficient adaptation. The resulting DMD-DEIM reduced dynamical system retains the Hamiltonian structure of the full model, provides good approximations of the solution, and can be solved at a reduced computational cost.

This paper studies the classical problem of finding all $k$ nearest neighbors to points of a query set $Q$ in another reference set $R$ within any metric space. The well-known work by Beygelzimer, Kakade, and Langford in 2006 introduced cover trees and claimed to guarantee a near linear time complexity in the size $|R|$ of the reference set for $k=1$. Our previous work defined compressed cover trees and corrected the key arguments for $k\geq 1$ and previously unknown challenging data cases. In 2009 Ram, Lee, March, and Gray attempted to improve the time complexity by using pairs of cover trees on the query and reference sets. In 2015 Curtin with the above co-authors used extra parameters to finally prove a similar complexity for $k = 1$. Our work fills all previous gaps and substantially improves the neighbor search based on pairs of new compressed cover trees. The novel imbalance parameter of paired trees allowed us to prove a better time complexity for any number of neighbors $k\geq 1$.

The aim of this paper is to describe a novel non-parametric noise reduction technique from the point of view of Bayesian inference that may automatically improve the signal-to-noise ratio of one- and two-dimensional data, such as e.g. astronomical images and spectra. The algorithm iteratively evaluates possible smoothed versions of the data, the smooth models, obtaining an estimation of the underlying signal that is statistically compatible with the noisy measurements. Iterations stop based on the evidence and the $\chi^2$ statistic of the last smooth model, and we compute the expected value of the signal as a weighted average of the whole set of smooth models. In this paper, we explain the mathematical formalism and numerical implementation of the algorithm, and we evaluate its performance in terms of the peak signal to noise ratio, the structural similarity index, and the time payload, using a battery of real astronomical observations. Our Fully Adaptive Bayesian Algorithm for Data Analysis (FABADA) yields results that, without any parameter tuning, are comparable to standard image processing algorithms whose parameters have been optimized based on the true signal to be recovered, something that is impossible in a real application. State-of-the-art non-parametric methods, such as BM3D, offer slightly better performance at high signal-to-noise ratio, while our algorithm is significantly more accurate for extremely noisy data (higher than $20-40\%$ relative errors, a situation of particular interest in the field of astronomy). In this range, the standard deviation of the residuals obtained by our reconstruction may become more than an order of magnitude lower than that of the original measurements. The source code needed to reproduce all the results presented in this report, including the implementation of the method, is publicly available at //github.com/PabloMSanAla/fabada

We consider a class of statistical estimation problems in which we are given a random data matrix ${\boldsymbol X}\in {\mathbb R}^{n\times d}$ (and possibly some labels ${\boldsymbol y}\in{\mathbb R}^n$) and would like to estimate a coefficient vector ${\boldsymbol \theta}\in{\mathbb R}^d$ (or possibly a constant number of such vectors). Special cases include low-rank matrix estimation and regularized estimation in generalized linear models (e.g., sparse regression). First order methods proceed by iteratively multiplying current estimates by ${\boldsymbol X}$ or its transpose. Examples include gradient descent or its accelerated variants. Celentano, Montanari, Wu proved that for any constant number of iterations (matrix vector multiplications), the optimal first order algorithm is a specific approximate message passing algorithm (known as `Bayes AMP'). The error of this estimator can be characterized in the high-dimensional asymptotics $n,d\to\infty$, $n/d\to\delta$, and provides a lower bound to the estimation error of any first order algorithm. Here we present a simpler proof of the same result, and generalize it to broader classes of data distributions and of first order algorithms, including algorithms with non-separable nonlinearities. Most importantly, the new proof technique does not require to construct an equivalent tree-structured estimation problem, and is therefore susceptible of a broader range of applications.

Neyman-Scott process (NSP) are point process models that generate clusters of points in time or space. They are natural models for a wide range of phenomena, ranging from neural spike trains to document streams. The clustering property is achieved via a doubly stochastic formulation: first, a set of latent events is drawn from a Poisson process; then, each latent event generates a set of observed data points according to another Poisson process. This construction is similar to Bayesian nonparametric mixture models like the Dirichlet process mixture model (DPMM) in that the number of latent events (i.e. clusters) is a random variable, but the point process formulation makes the NSP especially well suited to modeling spatiotemporal data. While many specialized algorithms have been developed for DPMMs, comparatively fewer works have focused on inference in NSPs. Here, we present novel connections between NSPs and DPMMs, with the key link being a third class of Bayesian mixture models called mixture of finite mixture models (MFMMs). Leveraging this connection, we adapt the standard collapsed Gibbs sampling algorithm for DPMMs to enable scalable Bayesian inference on NSP models. We demonstrate the potential of Neyman-Scott processes on a variety of applications including sequence detection in neural spike trains and event detection in document streams.

We present self-supervised geometric perception (SGP), the first general framework to learn a feature descriptor for correspondence matching without any ground-truth geometric model labels (e.g., camera poses, rigid transformations). Our first contribution is to formulate geometric perception as an optimization problem that jointly optimizes the feature descriptor and the geometric models given a large corpus of visual measurements (e.g., images, point clouds). Under this optimization formulation, we show that two important streams of research in vision, namely robust model fitting and deep feature learning, correspond to optimizing one block of the unknown variables while fixing the other block. This analysis naturally leads to our second contribution -- the SGP algorithm that performs alternating minimization to solve the joint optimization. SGP iteratively executes two meta-algorithms: a teacher that performs robust model fitting given learned features to generate geometric pseudo-labels, and a student that performs deep feature learning under noisy supervision of the pseudo-labels. As a third contribution, we apply SGP to two perception problems on large-scale real datasets, namely relative camera pose estimation on MegaDepth and point cloud registration on 3DMatch. We demonstrate that SGP achieves state-of-the-art performance that is on-par or superior to the supervised oracles trained using ground-truth labels.

We propose a new method of estimation in topic models, that is not a variation on the existing simplex finding algorithms, and that estimates the number of topics K from the observed data. We derive new finite sample minimax lower bounds for the estimation of A, as well as new upper bounds for our proposed estimator. We describe the scenarios where our estimator is minimax adaptive. Our finite sample analysis is valid for any number of documents (n), individual document length (N_i), dictionary size (p) and number of topics (K), and both p and K are allowed to increase with n, a situation not handled well by previous analyses. We complement our theoretical results with a detailed simulation study. We illustrate that the new algorithm is faster and more accurate than the current ones, although we start out with a computational and theoretical disadvantage of not knowing the correct number of topics K, while we provide the competing methods with the correct value in our simulations.

We consider the task of learning the parameters of a {\em single} component of a mixture model, for the case when we are given {\em side information} about that component, we call this the "search problem" in mixture models. We would like to solve this with computational and sample complexity lower than solving the overall original problem, where one learns parameters of all components. Our main contributions are the development of a simple but general model for the notion of side information, and a corresponding simple matrix-based algorithm for solving the search problem in this general setting. We then specialize this model and algorithm to four common scenarios: Gaussian mixture models, LDA topic models, subspace clustering, and mixed linear regression. For each one of these we show that if (and only if) the side information is informative, we obtain parameter estimates with greater accuracy, and also improved computation complexity than existing moment based mixture model algorithms (e.g. tensor methods). We also illustrate several natural ways one can obtain such side information, for specific problem instances. Our experiments on real data sets (NY Times, Yelp, BSDS500) further demonstrate the practicality of our algorithms showing significant improvement in runtime and accuracy.

Discrete random structures are important tools in Bayesian nonparametrics and the resulting models have proven effective in density estimation, clustering, topic modeling and prediction, among others. In this paper, we consider nested processes and study the dependence structures they induce. Dependence ranges between homogeneity, corresponding to full exchangeability, and maximum heterogeneity, corresponding to (unconditional) independence across samples. The popular nested Dirichlet process is shown to degenerate to the fully exchangeable case when there are ties across samples at the observed or latent level. To overcome this drawback, inherent to nesting general discrete random measures, we introduce a novel class of latent nested processes. These are obtained by adding common and group-specific completely random measures and, then, normalising to yield dependent random probability measures. We provide results on the partition distributions induced by latent nested processes, and develop an Markov Chain Monte Carlo sampler for Bayesian inferences. A test for distributional homogeneity across groups is obtained as a by product. The results and their inferential implications are showcased on synthetic and real data.

北京阿比特科技有限公司