
We consider a nonparametric Bayesian approach to estimation and testing for a multivariate monotone density. Instead of following the conventional Bayesian route of putting a prior distribution complying with the monotonicity restriction, we put a prior on the step heights through binning and a Dirichlet distribution. An arbitrary piecewise constant probability density is converted to a monotone one by a projection map, taking its $\mathbb{L}_1$-projection onto the space of monotone functions, which is subsequently normalized to integrate to one. We construct consistent Bayesian tests of multivariate monotonicity of a probability density based on the $\mathbb{L}_1$-distance to the class of monotone functions. The test is shown to have a size going to zero and high power against alternatives sufficiently separated from the null hypothesis. To obtain a Bayesian credible interval for the value of the density function at an interior point with guaranteed asymptotic frequentist coverage, we consider a posterior quantile interval of an induced map transforming the function value to its value optimized over certain blocks. The limiting coverage is explicitly calculated and is seen to be higher than the credibility level used in the construction. By exploring the asymptotic relationship between the coverage and the credibility, we show that a desired asymptotic coverage can be obtained exactly by starting with an appropriate credibility level.
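As a rough, univariate illustration of the binning, projection, and renormalization pipeline (not the paper's construction: the Dirichlet prior on step heights is omitted and the $\mathbb{L}_1$-projection is replaced by the weighted least-squares isotonic projection available in scikit-learn), a minimal Python sketch:

```python
import numpy as np
from sklearn.isotonic import IsotonicRegression

rng = np.random.default_rng(0)
x = rng.beta(1.0, 3.0, size=500)   # toy sample with a decreasing density on [0, 1]

# Step 1: piecewise constant density from binning (the heights play the role of the
# Dirichlet-distributed step heights in the abstract).
bins = 20
heights, edges = np.histogram(x, bins=bins, range=(0.0, 1.0), density=True)
widths = np.diff(edges)

# Step 2: project the step heights onto monotone (non-increasing) functions.
# NOTE: the paper uses an L1-projection; this sketch substitutes the weighted
# least-squares (L2) isotonic projection implemented in scikit-learn.
iso = IsotonicRegression(increasing=False)
mono = iso.fit_transform(np.arange(bins), heights, sample_weight=widths)

# Step 3: renormalize so the projected density integrates to one.
mono /= np.sum(mono * widths)
print(np.sum(mono * widths))  # ~1.0
```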

Related Content

We propose a spectral clustering algorithm for analyzing the dependence structure of multivariate extremes. More specifically, we focus on the asymptotic dependence of multivariate extremes characterized by the angular or spectral measure in extreme value theory. Our work studies the theoretical performance of spectral clustering based on a random $k$-nearest neighbor graph constructed from an extremal sample, i.e., the angular part of random vectors for which the radius exceeds a large threshold. In particular, we derive the asymptotic distribution of extremes arising from a linear factor model and prove that, under certain conditions, spectral clustering can consistently identify the clusters of extremes arising in this model. Leveraging this result we propose a simple consistent estimation strategy for learning the angular measure. Our theoretical findings are complemented with numerical experiments illustrating the finite sample performance of our methods.
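A minimal sketch of this clustering pipeline on a toy linear factor model, built from standard scikit-learn components (`kneighbors_graph`, `SpectralClustering`); the threshold, neighborhood size, and the angular-measure estimation step are illustrative choices, not the paper's tuning:

```python
import numpy as np
from sklearn.neighbors import kneighbors_graph
from sklearn.cluster import SpectralClustering

rng = np.random.default_rng(1)

# Toy linear factor model: each coordinate loads on exactly one heavy-tailed factor,
# so the angular measure concentrates on two groups of axes (the "clusters" below).
n, d, k = 20000, 6, 2
A = np.zeros((d, k))
A[:3, 0] = 1.0
A[3:, 1] = 1.0
F = rng.pareto(a=1.0, size=(n, k)) + 1.0               # regularly varying factors
X = F @ A.T + rng.exponential(scale=0.1, size=(n, d))  # light-tailed noise

# Extremal sample: angular parts of observations whose radius exceeds a high threshold.
R = np.linalg.norm(X, ord=1, axis=1)
thr = np.quantile(R, 0.98)
W = X[R > thr] / R[R > thr, None]

# k-nearest-neighbor graph on the angular sample, symmetrized into an affinity matrix.
G = kneighbors_graph(W, n_neighbors=15, mode="connectivity")
affinity = 0.5 * (G + G.T).toarray()

labels = SpectralClustering(n_clusters=k, affinity="precomputed",
                            random_state=0).fit_predict(affinity)
print(np.bincount(labels))
```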

Bayesian inference and kernel methods are well established in machine learning. The neural network Gaussian process in particular provides a concept to investigate neural networks in the limit of infinitely wide hidden layers by using kernel and inference methods. Here we build upon this limit and provide a field-theoretic formalism which covers the generalization properties of infinitely wide networks. We systematically compute generalization properties of linear, non-linear, and deep non-linear networks for kernel matrices with heterogeneous entries. In contrast to currently employed spectral methods, we derive the generalization properties from the statistical properties of the input, elucidating the interplay of input dimensionality, size of the training data set, and variability of the data. We show that data variability leads to a non-Gaussian action reminiscent of a ($\varphi^3+\varphi^4$)-theory. Using our formalism on a synthetic task and on MNIST, we obtain a homogeneous kernel matrix approximation for the learning curve as well as corrections due to data variability, which allow us to estimate the generalization properties and obtain exact results for the bounds of the learning curves in the limit of infinitely many training data points.
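For context, a short sketch of the neural network Gaussian process baseline referred to above: the standard ReLU (arc-cosine) NNGP kernel recursion plugged into Gaussian process regression on synthetic data. This corresponds only to the homogeneous-kernel baseline; the field-theoretic data-variability corrections derived in the paper are not implemented here.

```python
import numpy as np

def relu_nngp_kernel(X1, X2, depth=3, sw2=1.0, sb2=0.1):
    # Standard NNGP recursion for fully connected ReLU layers (arc-cosine kernel).
    d = X1.shape[1]
    k12 = sb2 + sw2 * (X1 @ X2.T) / d
    k11 = sb2 + sw2 * np.sum(X1 ** 2, axis=1) / d
    k22 = sb2 + sw2 * np.sum(X2 ** 2, axis=1) / d
    for _ in range(depth):
        norms = np.sqrt(np.outer(k11, k22))
        cos_t = np.clip(k12 / norms, -1.0, 1.0)
        theta = np.arccos(cos_t)
        k12 = sb2 + sw2 * norms * (np.sin(theta) + (np.pi - theta) * cos_t) / (2 * np.pi)
        k11 = sb2 + sw2 * k11 / 2.0       # diagonal update: E[relu(u)^2] = k/2
        k22 = sb2 + sw2 * k22 / 2.0
    return k12

rng = np.random.default_rng(2)
Xtr, Xte = rng.normal(size=(80, 10)), rng.normal(size=(20, 10))
ytr = np.sin(Xtr[:, 0])                   # toy target

# GP regression with the NNGP kernel: posterior mean on the test inputs.
Ktt = relu_nngp_kernel(Xtr, Xtr)
Kst = relu_nngp_kernel(Xte, Xtr)
noise = 1e-3
mean_pred = Kst @ np.linalg.solve(Ktt + noise * np.eye(len(Xtr)), ytr)
print(mean_pred[:5])
```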

In the recent paper [5], a Bayesian approach for constructing confidence intervals in monotone regression problems is proposed, based on credible intervals. We view this method from a frequentist point of view, and show that it corresponds to a percentile bootstrap method of which we give two versions. It is shown that a (non-percentile) smoothed bootstrap method has better behavior and does not need correction for over- or undercoverage. The proofs use martingale methods.
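A stripped-down illustration of a naive percentile bootstrap interval for the value of an isotonic regression function at an interior point, using scikit-learn's `IsotonicRegression`; it is meant only to convey the type of construction being analyzed, not either of the two percentile versions or the smoothed bootstrap studied here.

```python
import numpy as np
from sklearn.isotonic import IsotonicRegression

rng = np.random.default_rng(3)
n = 200
x = np.sort(rng.uniform(size=n))
f0 = lambda t: t ** 2                      # true monotone regression function
y = f0(x) + 0.1 * rng.normal(size=n)
t0 = 0.5                                   # interior point of interest

def iso_fit_at(xs, ys, t):
    fit = IsotonicRegression(increasing=True, out_of_bounds="clip").fit(xs, ys)
    return fit.predict([t])[0]

theta_hat = iso_fit_at(x, y, t0)

# Naive percentile bootstrap: resample (x_i, y_i) pairs, refit, read off quantiles.
B = 500
boot = np.empty(B)
for b in range(B):
    idx = rng.integers(0, n, size=n)
    boot[b] = iso_fit_at(x[idx], y[idx], t0)
lo, hi = np.quantile(boot, [0.025, 0.975])
print(theta_hat, (lo, hi))
```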

Focusing on stochastic programming (SP) with covariate information, this paper proposes an empirical risk minimization (ERM) method embedded within a nonconvex piecewise affine decision rule (PADR), which aims to learn the direct mapping from features to optimal decisions. We establish nonasymptotic consistency results of our PADR-based ERM model for unconstrained problems and asymptotic consistency results for constrained ones. To solve the nonconvex and nondifferentiable ERM problem, we develop an enhanced stochastic majorization-minimization algorithm and establish its asymptotic convergence to (composite strong) directional stationarity, along with a complexity analysis. We show that the proposed PADR-based ERM method applies to a broad class of nonconvex SP problems with theoretical consistency guarantees and computational tractability. Our numerical study demonstrates the superior performance of PADR-based ERM methods compared to state-of-the-art approaches under various settings, with significantly lower costs, less computation time, and robustness to feature dimensions and nonlinearity of the underlying dependency.
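A hypothetical sketch of a piecewise affine decision rule, written as a difference of two max-affine functions, plugged into an empirical risk with a toy newsvendor-style cost. The rule parameters below are random placeholders; the enhanced stochastic majorization-minimization solver that learns them is not reproduced.

```python
import numpy as np

def padr_decision(S, A, b, C, d):
    """Piecewise affine decision rule as a difference of two max-affine functions:
    z(s) = max_i (a_i^T s + b_i) - max_j (c_j^T s + d_j)."""
    return np.max(S @ A.T + b, axis=1) - np.max(S @ C.T + d, axis=1)

def newsvendor_cost(z, demand, c_over=1.0, c_under=3.0):
    # Toy downstream SP cost: overage/underage penalties for an order quantity z.
    return c_over * np.maximum(z - demand, 0.0) + c_under * np.maximum(demand - z, 0.0)

rng = np.random.default_rng(4)
n, p, k = 1000, 5, 3
S = rng.normal(size=(n, p))                                  # covariates (features)
demand = np.exp(0.5 * S[:, 0]) + rng.gamma(2.0, 1.0, size=n)

# Random PADR parameters; in the paper these would be learned by minimizing the
# empirical risk below with the enhanced stochastic MM algorithm.
A, b = rng.normal(size=(k, p)), rng.normal(size=k)
C, d = rng.normal(size=(k, p)), rng.normal(size=k)

z = padr_decision(S, A, b, C, d)
empirical_risk = newsvendor_cost(z, demand).mean()
print(empirical_risk)
```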

This note addresses the question of optimally estimating a linear functional of an object acquired through linear observations corrupted by random noise, where optimality pertains to a worst-case setting tied to a symmetric, convex, and closed model set containing the object. It complements the article "Statistical Estimation and Optimal Recovery" published in the Annals of Statistics in 1994. There, Donoho showed (among other things) that, for Gaussian noise, linear maps provide near-optimal estimation schemes relative to a performance measure relevant in Statistical Estimation. Here, we advocate for a different performance measure that is arguably more relevant in Optimal Recovery. We show that, relative to this new measure, linear maps still provide near-optimal estimation schemes even if the noise is merely log-concave. Our arguments, which make a connection to the deterministic noise situation and bypass properties specific to the Gaussian case, offer an alternative to parts of Donoho's proof.

The stochastic partial differential equation (SPDE) approach is widely used for modeling large spatial datasets. It is based on representing a Gaussian random field $u$ on $\mathbb{R}^d$ as the solution of an elliptic SPDE $L^\beta u = \mathcal{W}$, where $L$ is a second-order differential operator, $\beta$ is a positive parameter that controls the smoothness of $u$ (with $2\beta\in\{1,2,\dots\}$ in the original formulation), and $\mathcal{W}$ is Gaussian white noise. A few approaches have been suggested in the literature to extend the approach to allow for any smoothness parameter satisfying $\beta>d/4$. Even though those approaches work well for simulating SPDEs with general smoothness, they are less suitable for Bayesian inference since they do not provide approximations which are Gaussian Markov random fields (GMRFs) as in the original SPDE approach. We address this issue by proposing a new method based on approximating the covariance operator $L^{-2\beta}$ of the Gaussian field $u$ by a finite element method combined with a rational approximation of the fractional power. This results in a numerically stable GMRF approximation which can be combined with the integrated nested Laplace approximation (INLA) method for fast Bayesian inference. A rigorous convergence analysis of the method is performed and the accuracy of the method is investigated with simulated data. Finally, we illustrate the approach and corresponding implementation in the R package rSPDE via an application to precipitation data, which is analyzed by combining the rSPDE package with the R-INLA software for full Bayesian inference.
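A crude stand-alone illustration of the key numerical idea, assuming a generic sparse symmetric positive definite operator in place of the finite element discretization: a rational approximation of $x^{-\beta}$ turns the fractional solve $L^{-\beta}\mathcal{W}$ into a sum of shifted sparse solves. The rational approximation below is a simple least-squares fit with fixed poles; the rSPDE package uses a much more careful construction that preserves the GMRF structure, so this only mirrors the overall shape of the computation.

```python
import numpy as np
import scipy.sparse as sp
import scipy.sparse.linalg as spla

beta = 0.75                      # fractional exponent (2*beta need not be an integer)

# Sparse SPD operator L: a 1D finite-difference Laplacian plus identity, used as a
# toy stand-in for the discretized second-order elliptic operator.
m = 400
L = sp.diags([-1.0, 2.0, -1.0], [-1, 0, 1], shape=(m, m), format="csc") * m ** 2 \
    + sp.eye(m, format="csc")

# Crude rational approximation x^{-beta} ~ sum_i c_i / (x + p_i): fixed log-spaced
# poles, least-squares coefficients fitted over the spectral range of L.
lam_max = spla.eigsh(L, k=1, return_eigenvectors=False)[0]
poles = np.geomspace(1e-2, 10 * lam_max, num=8)
xs = np.geomspace(1.0, lam_max, num=2000)
basis = 1.0 / (xs[:, None] + poles[None, :])
coef, *_ = np.linalg.lstsq(basis, xs ** (-beta), rcond=None)

def apply_L_minus_beta(v):
    # L^{-beta} v ~ sum_i c_i (L + p_i I)^{-1} v: each term is a shifted sparse solve.
    out = np.zeros_like(v)
    for c, p in zip(coef, poles):
        out += c * spla.spsolve(L + p * sp.eye(m, format="csc"), v)
    return out

w = np.random.default_rng(5).normal(size=m)   # discretized white noise
u = apply_L_minus_beta(w)                     # field sample with fractional smoothness
print(u[:5])
```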

Identifying how dependence relationships vary across different conditions plays a significant role in many scientific investigations. For example, it is important for the comparison of biological systems to see if relationships between genomic features differ between cases and controls. In this paper, we seek to evaluate whether the relationships between two sets of variables are different across two conditions. Specifically, we ask: do two sets of high-dimensional variables have similar dependence relationships across two conditions? We propose a new kernel-based test to capture the differential dependence. In particular, the new test determines whether two measures that detect dependence relationships are similar or not under the two conditions. We derive the asymptotic permutation null distribution of the test statistic and show that it works well in finite samples, so the test is computationally efficient and easily applicable to large data sets. We demonstrate through numerical studies that the proposed test has high power for detecting differential linear and non-linear relationships. The proposed method is implemented in the R package kerDAA.
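One plausible (hypothetical) instantiation of a kernel-based differential dependence test, not the kerDAA statistic itself: the difference of empirical HSIC values across the two conditions, calibrated by permuting the condition labels of the pooled sample pairs.

```python
import numpy as np
from scipy.spatial.distance import squareform, pdist

def gaussian_gram(Z):
    # Gaussian kernel Gram matrix with the median heuristic bandwidth.
    D = squareform(pdist(Z, "sqeuclidean"))
    med = np.median(D[D > 0])
    return np.exp(-D / med)

def hsic(X, Y):
    # Biased empirical HSIC between paired samples X and Y.
    n = len(X)
    H = np.eye(n) - np.ones((n, n)) / n
    return np.trace(H @ gaussian_gram(X) @ H @ gaussian_gram(Y)) / n ** 2

def differential_dependence_test(X1, Y1, X2, Y2, n_perm=200, seed=0):
    # Statistic: |HSIC under condition 1 - HSIC under condition 2|.
    # Null calibration: pool the (x, y) pairs and randomly reassign condition labels.
    rng = np.random.default_rng(seed)
    X, Y = np.vstack([X1, X2]), np.vstack([Y1, Y2])
    n1 = len(X1)
    stat = abs(hsic(X1, Y1) - hsic(X2, Y2))
    null = np.empty(n_perm)
    for b in range(n_perm):
        idx = rng.permutation(len(X))
        g1, g2 = idx[:n1], idx[n1:]
        null[b] = abs(hsic(X[g1], Y[g1]) - hsic(X[g2], Y[g2]))
    return stat, np.mean(null >= stat)   # statistic and permutation p-value

rng = np.random.default_rng(6)
n, p = 150, 3
X1 = rng.normal(size=(n, p)); Y1 = X1 + 0.5 * rng.normal(size=(n, p))   # dependent
X2 = rng.normal(size=(n, p)); Y2 = rng.normal(size=(n, p))               # independent
print(differential_dependence_test(X1, Y1, X2, Y2))
```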

We investigate the fixed-budget best-arm identification (BAI) problem for linear bandits in a potentially non-stationary environment. Given a finite arm set $\mathcal{X}\subset\mathbb{R}^d$, a fixed budget $T$, and an unpredictable sequence of parameters $\left\lbrace\theta_t\right\rbrace_{t=1}^{T}$, an algorithm aims to correctly identify the best arm $x^* := \arg\max_{x\in\mathcal{X}}x^\top\sum_{t=1}^{T}\theta_t$ with probability as high as possible. Prior work has addressed the stationary setting where $\theta_t = \theta_1$ for all $t$ and demonstrated that the error probability decreases as $\exp(-T /\rho^*)$ for a problem-dependent constant $\rho^*$. But in many real-world $A/B/n$ multivariate testing scenarios that motivate our work, the environment is non-stationary and an algorithm expecting a stationary setting can easily fail. For robust identification, it is well-known that if arms are chosen randomly and non-adaptively from a G-optimal design over $\mathcal{X}$ at each time then the error probability decreases as $\exp(-T\Delta^2_{(1)}/d)$, where $\Delta_{(1)} = \min_{x \neq x^*} (x^* - x)^\top \frac{1}{T}\sum_{t=1}^T \theta_t$. As there exist environments where $\Delta_{(1)}^2/ d \ll 1/ \rho^*$, we are motivated to propose a novel algorithm $\mathsf{P1}$-$\mathsf{RAGE}$ that aims to obtain the best of both worlds: robustness to non-stationarity and fast rates of identification in benign settings. We characterize the error probability of $\mathsf{P1}$-$\mathsf{RAGE}$ and demonstrate empirically that the algorithm indeed never performs worse than G-optimal design but compares favorably to the best algorithms in the stationary setting.
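A sketch of the non-adaptive G-optimal-design baseline mentioned above (not of $\mathsf{P1}$-$\mathsf{RAGE}$): an approximate design is computed with Frank-Wolfe/Wynn iterations, arms are sampled i.i.d. from it, and the best arm is read off an ordinary least-squares estimate of the average parameter. The drifting parameter sequence below is a hypothetical toy environment.

```python
import numpy as np

def g_optimal_design(X, n_iter=200):
    """Frank-Wolfe (Wynn) iterations for the D-/G-optimal design over the rows of X."""
    n, d = X.shape
    lam = np.full(n, 1.0 / n)
    for _ in range(n_iter):
        A = X.T @ (lam[:, None] * X)
        g = np.einsum("ij,jk,ik->i", X, np.linalg.inv(A), X)  # x_i^T A^{-1} x_i
        i = np.argmax(g)
        gamma = (g[i] / d - 1.0) / (g[i] - 1.0)
        lam = (1.0 - gamma) * lam
        lam[i] += gamma
    return lam

rng = np.random.default_rng(7)
n_arms, d, T = 30, 5, 2000
X = rng.normal(size=(n_arms, d))
theta_t = 0.1 * rng.normal(size=(T, d)) + np.array([1.0, 0, 0, 0, 0])  # drifting params

lam = g_optimal_design(X)
pulls = rng.choice(n_arms, size=T, p=lam / lam.sum())
rewards = np.einsum("ij,ij->i", X[pulls], theta_t) + rng.normal(size=T)

# Ordinary least squares for the (average) parameter, then pick the empirical best arm.
A = X[pulls].T @ X[pulls]
theta_hat = np.linalg.solve(A, X[pulls].T @ rewards)
best_arm = np.argmax(X @ theta_hat)
true_best = np.argmax(X @ theta_t.mean(axis=0))
print(best_arm, true_best)
```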

Two-component mixture models have proved to be a powerful tool for modeling heterogeneity in several cluster analysis contexts. However, most methods based on these models assume a constant behavior for the mixture weights, which can be restrictive and unsuitable for some applications. In this paper, we relax this assumption and allow the mixture weights to vary according to the index (e.g., time) to make the model more adaptive to a broader range of data sets. We propose an efficient MCMC algorithm to jointly estimate both component parameters and dynamic weights from their posterior samples. We evaluate the method's performance by running Monte Carlo simulation studies under different scenarios for the dynamic weights. In addition, we apply the algorithm to a time series that records the level reached by a river in southern Brazil. The Taquari River is a water body whose frequent flood inundations have caused extensive damage to riverside communities. Implementing a dynamic mixture model allows us to properly describe the flood regimes for the areas most affected by these phenomena.
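To make the model structure concrete, a short simulation of a two-component Gaussian mixture whose weight varies with the time index (a hypothetical logistic-seasonal pattern); the MCMC algorithm that estimates the component parameters and dynamic weights is not reproduced here.

```python
import numpy as np

rng = np.random.default_rng(8)
T = 500
t = np.arange(T)

# Time-varying mixture weight: a smooth seasonal pattern on the logit scale.
w = 1.0 / (1.0 + np.exp(-(-1.0 + 2.0 * np.sin(2 * np.pi * t / 100))))

# Two-component Gaussian mixture; component 1 could represent a "flood" regime.
mu, sigma = np.array([2.0, 6.0]), np.array([0.5, 1.0])
z = rng.binomial(1, w)                       # latent regime indicator at each time
y = rng.normal(mu[z], sigma[z])              # observed level (toy units)
print(y[:10], z[:10])
```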

Supervised classification algorithms are used to solve a growing number of real-life problems around the globe. Their performance is strictly connected with the quality of labels used in training. Unfortunately, acquiring good-quality annotations for many tasks is infeasible or too expensive to be done in practice. To tackle this challenge, active learning algorithms are commonly employed to select only the most relevant data for labeling. However, this is possible only when the quality and quantity of labels acquired from experts are sufficient. Unfortunately, in many applications, a trade-off is necessary between annotating individual samples by multiple annotators to increase label quality and annotating new samples to increase the total number of labeled instances. In this paper, we address the issue of faulty data annotations in the context of active learning. In particular, we propose two novel annotation unification algorithms that utilize unlabeled parts of the sample space. The proposed methods require little to no intersection between samples annotated by different experts. Our experiments on four public datasets indicate the robustness and superiority of the proposed methods, in both the estimation of annotators' reliability and the assignment of actual labels, over state-of-the-art algorithms and simple majority voting.
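For reference, a toy comparison of the simple majority-voting baseline against reliability-weighted voting, where each annotator's reliability is estimated from agreement with the majority vote; the proposed annotation unification algorithms, which exploit the unlabeled part of the sample space and require little annotator overlap, are not implemented here.

```python
import numpy as np

rng = np.random.default_rng(9)
n_samples, n_annotators = 1000, 5
true_labels = rng.integers(0, 2, size=n_samples)
reliability = np.array([0.95, 0.85, 0.75, 0.65, 0.55])   # per-annotator accuracy

# Each annotator labels every sample, flipping it with probability 1 - reliability.
flips = rng.random((n_samples, n_annotators)) > reliability
votes = np.where(flips, 1 - true_labels[:, None], true_labels[:, None])

# Baseline 1: simple majority voting.
majority = (votes.mean(axis=1) > 0.5).astype(int)

# Baseline 2: reliability-weighted (log-odds) voting, with each annotator's
# reliability estimated from agreement with the majority vote.
est_rel = (votes == majority[:, None]).mean(axis=0)
weights = np.log(np.clip(est_rel, 1e-6, 1 - 1e-6) / np.clip(1 - est_rel, 1e-6, 1 - 1e-6))
weighted = ((votes @ weights) > weights.sum() / 2).astype(int)

print("majority acc:", (majority == true_labels).mean())
print("weighted acc:", (weighted == true_labels).mean())
```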
