
We show that under minimal assumptions on a random vector $X\in\mathbb{R}^d$, and with high probability, given $m$ independent copies of $X$, the coordinate distribution of each vector $(\langle X_i,\theta \rangle)_{i=1}^m$ is dictated by the distribution of the true marginal $\langle X,\theta \rangle$. Formally, we show that with high probability, \[\sup_{\theta \in S^{d-1}} \left( \frac{1}{m}\sum_{i=1}^m \left|\langle X_i,\theta \rangle^\sharp - \lambda^\theta_i \right|^2 \right)^{1/2} \leq c \left( \frac{d}{m} \right)^{1/4},\] where $\lambda^{\theta}_i = m\int_{(\frac{i-1}{m}, \frac{i}{m}]} F_{ \langle X,\theta \rangle }^{-1}(u) \,du$ and $a^\sharp$ denotes the monotone non-decreasing rearrangement of $a$. The proof follows from an optimal estimate on the worst-case Wasserstein distance, over directions $\theta$, between a marginal of $X$ and its empirical counterpart, $\frac{1}{m} \sum_{i=1}^m \delta_{\langle X_i, \theta \rangle}$. We then use this accurate information on the structure of the vectors $(\langle X_i,\theta \rangle)_{i=1}^m$ to construct the first non-gaussian ensemble that yields the optimal estimate in the Dvoretzky-Milman Theorem: the ensemble exhibits almost Euclidean sections in arbitrary normed spaces of the same dimension as the gaussian embedding -- despite being very far from gaussian (in fact, it happens to be heavy-tailed).
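
As a quick numerical sanity check of the displayed estimate (for a single direction $\theta$ rather than the supremum, and in the simplest case $X \sim \mathcal{N}(0,I_d)$, where $F_{\langle X,\theta \rangle}^{-1}$ is the standard normal quantile function), the following sketch compares the rearranged empirical marginal with the block averages $\lambda^\theta_i$ of the quantile function; the quadrature grid and all parameter values are illustrative:

```python
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(0)
d, m = 20, 2000

# m independent copies of X ~ N(0, I_d), one fixed direction theta.
X = rng.standard_normal((m, d))
theta = np.ones(d) / np.sqrt(d)

# Monotone non-decreasing rearrangement of the empirical marginal.
marginal_sorted = np.sort(X @ theta)

# lambda_i = m * int over ((i-1)/m, i/m] of F^{-1}(u) du, i.e. the average
# of the quantile function over the i-th block, approximated on an interior
# grid (the endpoints are excluded since F^{-1} blows up at 0 and 1).
lam = np.empty(m)
for i in range(m):
    u = np.linspace(i / m, (i + 1) / m, 51)[1:-1]
    lam[i] = norm.ppf(u).mean()

err = np.sqrt(np.mean((marginal_sorted - lam) ** 2))
print(f"empirical L2 distance: {err:.4f};  (d/m)^(1/4) = {(d / m) ** 0.25:.4f}")
```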

Related Content

The study of Markov processes and broadcasting on trees has deep connections to a variety of areas including statistical physics, graphical models, phylogenetic reconstruction, Markov Chain Monte Carlo, and community detection in random graphs. Notably, the celebrated Belief Propagation (BP) algorithm achieves Bayes-optimal performance for the reconstruction problem of predicting the value of the Markov process at the root of the tree from its values at the leaves. Recently, the analysis of low-degree polynomials has emerged as a valuable tool for predicting computational-to-statistical gaps. In this work, we investigate the performance of low-degree polynomials for the reconstruction problem on trees. Perhaps surprisingly, we show that there are simple tree models with $N$ leaves and bounded arity where (1) nontrivial reconstruction of the root value is possible with a simple polynomial time algorithm and with robustness to noise, but not with any polynomial of degree $N^{c}$ for $c > 0$ a constant depending only on the arity, and (2) when the tree is unknown and given multiple samples with correlated root assignments, nontrivial reconstruction of the root value is possible with a simple Statistical Query algorithm but not with any polynomial of degree $N^c$. These results clarify some of the limitations of low-degree polynomials vs. polynomial time algorithms for Bayesian estimation problems. They also complement recent work of Moitra, Mossel, and Sandon who studied the circuit complexity of Belief Propagation. As a consequence of our main result, we show that for some $c' > 0$ depending only on the arity, $\exp(N^{c'})$ many samples are needed for RBF kernel regression to obtain nontrivial correlation with the true regression function (BP). We pose related open questions about low-degree polynomials and the Kesten-Stigum threshold.
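
A minimal runnable companion (a sketch, not the constructions from the paper): belief propagation computes the exact posterior of the root of a broadcast tree from its leaf values. The sketch below uses a complete tree of fixed arity in which each edge flips a $\pm 1$ value with a given probability (flip_p in the code); with arity $2$ and flip probability $0.1$, the Kesten-Stigum condition $2(1-2\cdot 0.1)^2>1$ holds, so nontrivial reconstruction is possible. All parameter values are illustrative.

```python
import numpy as np

rng = np.random.default_rng(1)

def broadcast_tree(depth, arity, flip_p, root=None):
    """Broadcast a +/-1 root value down a complete tree; each edge flips the
    parent's value independently with probability flip_p. Returns the root
    value and the leaf values in left-to-right order."""
    if root is None:
        root = int(rng.choice([-1, 1]))
    if depth == 0:
        return root, [root]
    leaves = []
    for _ in range(arity):
        child = root if rng.random() > flip_p else -root
        _, sub = broadcast_tree(depth - 1, arity, flip_p, child)
        leaves += sub
    return root, leaves

def bp_root_posterior(leaves, depth, arity, flip_p):
    """Exact P(root = +1 | leaves) by upward message passing. A message m
    stores (P(observed leaves below | node = +1), P(... | node = -1))."""
    msgs = [np.array([1.0, 0.0]) if v == 1 else np.array([0.0, 1.0])
            for v in leaves]
    for _ in range(depth):
        nxt = []
        for j in range(0, len(msgs), arity):   # consecutive nodes share a parent
            m = np.ones(2)
            for c in msgs[j:j + arity]:
                # marginalize the child over the flip channel
                m *= np.array([(1 - flip_p) * c[0] + flip_p * c[1],
                               flip_p * c[0] + (1 - flip_p) * c[1]])
            nxt.append(m / m.sum())            # normalize; only ratios matter
        msgs = nxt
    m = msgs[0]
    return m[0] / (m[0] + m[1])                # uniform prior on the root

root, leaves = broadcast_tree(depth=8, arity=2, flip_p=0.1)
p_plus = bp_root_posterior(leaves, depth=8, arity=2, flip_p=0.1)
print(f"true root: {root:+d}   BP posterior P(root = +1): {p_plus:.3f}")
```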

We study the following fundamental hypothesis testing problem, which we term Gaussian mean testing. Given i.i.d. samples from a distribution $p$ on $\mathbb{R}^d$, the task is to distinguish, with high probability, between the following cases: (i) $p$ is the standard Gaussian distribution, $\mathcal{N}(0,I_d)$, and (ii) $p$ is a Gaussian $\mathcal{N}(\mu,\Sigma)$ for some unknown covariance $\Sigma$ and mean $\mu \in \mathbb{R}^d$ satisfying $\|\mu\|_2 \geq \epsilon$. Recent work gave an algorithm for this testing problem with the optimal sample complexity of $\Theta(\sqrt{d}/\epsilon^2)$. Both the previous algorithm and its analysis are quite complicated. Here we give an extremely simple algorithm for Gaussian mean testing with a one-page analysis. Our algorithm is sample-optimal and runs in time linear in the sample size.
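
The abstract does not reproduce the algorithm, so the following is only a natural candidate sketch, not necessarily the paper's method: the pairwise statistic $\frac{1}{m(m-1)}\sum_{i \neq j} \langle x_i, x_j \rangle$ is an unbiased estimate of $\|\mu\|_2^2$ with standard deviation roughly $\sqrt{2d}/m$ under the null, which explains why $m \asymp \sqrt{d}/\epsilon^2$ samples suffice in the identity-covariance case; the paper's algorithm must also handle arbitrary unknown $\Sigma$.

```python
import numpy as np

def mean_test_statistic(X):
    """T = (1 / (m(m-1))) * sum over i != j of <x_i, x_j>, an unbiased
    estimate of ||mu||_2^2 (each cross term has mean <mu, mu>)."""
    m = X.shape[0]
    s = X.sum(axis=0)
    return (s @ s - (X * X).sum()) / (m * (m - 1))

rng = np.random.default_rng(2)
d, eps = 1000, 0.5
m = int(8 * np.sqrt(d) / eps ** 2)        # ~ sqrt(d)/eps^2 samples

mu = np.zeros(d)
mu[0] = eps                               # ||mu||_2 = eps exactly
for name, X in [("null N(0, I)", rng.standard_normal((m, d))),
                ("alternative N(mu, I)", rng.standard_normal((m, d)) + mu)]:
    print(f"{name:22s} T = {mean_test_statistic(X):+.4f}"
          f"   (reject the null if T > eps^2/2 = {eps ** 2 / 2:.4f})")
```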

We investigate the contraction properties of locally differentially private mechanisms. More specifically, we derive tight upper bounds on the divergence between the output distributions $PK$ and $QK$ of an $\epsilon$-LDP mechanism $K$ in terms of a divergence between the corresponding input distributions $P$ and $Q$. Our first main technical result presents a sharp upper bound on the $\chi^2$-divergence $\chi^2(PK\|QK)$ in terms of $\chi^2(P\|Q)$ and $\epsilon$. We also show that the same result holds for a large family of divergences, including KL-divergence and squared Hellinger distance. Our second main technical result gives an upper bound on $\chi^2(PK\|QK)$ in terms of the total variation distance $TV(P, Q)$ and $\epsilon$. We then utilize these bounds to establish locally private versions of the Cram\'er-Rao bound, Le Cam's method, Assouad's method, and the mutual information method, which are powerful tools for bounding minimax estimation risks. These results are shown to lead to better privacy analyses than the state of the art in several statistical problems, such as entropy and discrete distribution estimation, non-parametric density estimation, and hypothesis testing.
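
For a concrete feel for such contraction bounds (a numerical check only, on binary randomized response rather than the paper's general setting), the sketch below compares observed $\chi^2$ contraction ratios against the candidate rate $((e^\epsilon-1)/(e^\epsilon+1))^2$, which is the squared Dobrushin coefficient of this particular channel, not necessarily the constant from the paper.

```python
import numpy as np

def randomized_response(eps):
    """Binary eps-LDP randomized response: report the truth w.p. e^eps/(1+e^eps)."""
    p = np.exp(eps) / (1 + np.exp(eps))
    return np.array([[p, 1 - p],
                     [1 - p, p]])

def chi2(P, Q):
    return np.sum((P - Q) ** 2 / Q)

rng = np.random.default_rng(3)
eps = 1.0
K = randomized_response(eps)
candidate = ((np.exp(eps) - 1) / (np.exp(eps) + 1)) ** 2

worst = 0.0
for _ in range(10_000):
    P, Q = rng.dirichlet([1, 1]), rng.dirichlet([1, 1])
    worst = max(worst, chi2(P @ K, Q @ K) / chi2(P, Q))

print(f"largest observed contraction ratio: {worst:.4f}")
print(f"((e^eps - 1)/(e^eps + 1))^2       : {candidate:.4f}")
```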

Program sensitivity measures how robust a program is to small changes in its input, and is a fundamental notion in domains ranging from differential privacy to cyber-physical systems. A natural way to formalize program sensitivity is in terms of metrics on the input and output spaces, requiring that an $r$-sensitive function map inputs that are at distance $d$ to outputs that are at distance at most $r \cdot d$. Program sensitivity is thus an analogue of Lipschitz continuity for programs. Reed and Pierce introduced Fuzz, a functional language with a linear type system that can express program sensitivity. They show soundness operationally, in the form of a metric preservation property. Inspired by their work, we study program sensitivity and metric preservation from a denotational point of view. In particular, we introduce metric CPOs, a novel semantic structure for reasoning about computation on metric spaces, by endowing CPOs with a compatible notion of distance. This structure is useful for reasoning about metric properties of programs, and specifically about program sensitivity. We demonstrate metric CPOs by giving a model for the deterministic fragment of Fuzz.
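
The metric-CPO semantics is what certifies sensitivity statically; as an operational illustration only, here is a randomized checker for the defining inequality $d_{\mathrm{out}}(f(x), f(y)) \leq r \cdot d_{\mathrm{in}}(x,y)$ on sampled input pairs (the example functions and metrics below are made up for the sketch).

```python
import random

def check_sensitivity(f, r, metric_in, metric_out, sample_input, trials=10000):
    """Randomized test that f is r-sensitive:
    metric_out(f(x), f(y)) <= r * metric_in(x, y) on sampled pairs.
    A passing test is evidence, not a proof -- the proof is what the
    linear type system / metric semantics provide."""
    for _ in range(trials):
        x, y = sample_input(), sample_input()
        if metric_out(f(x), f(y)) > r * metric_in(x, y) + 1e-12:
            return False, (x, y)
    return True, None

# Summing a fixed-length list is 1-sensitive for the L1 metric on inputs
# and the absolute-value metric on outputs: |sum a - sum b| <= sum |a - b|.
l1 = lambda xs, ys: sum(abs(a - b) for a, b in zip(xs, ys))
dist = lambda a, b: abs(a - b)
sample = lambda: [random.uniform(-1, 1) for _ in range(5)]

ok, _ = check_sensitivity(sum, 1.0, l1, dist, sample)
print("sum is 1-sensitive on sampled pairs:", ok)

# Doubling each element is 2-sensitive but not 1-sensitive.
double = lambda xs: [2 * a for a in xs]
ok1, witness = check_sensitivity(double, 1.0, l1, l1, sample)
ok2, _ = check_sensitivity(double, 2.0, l1, l1, sample)
print("double: 1-sensitive?", ok1, " 2-sensitive?", ok2)
```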

The independent set polynomial is important in many areas. For every integer $\Delta\geq 2$, the Shearer threshold is the value $\lambda^*(\Delta)=(\Delta-1)^{\Delta-1}/\Delta^{\Delta}$. It is known that for $\lambda < - \lambda^*(\Delta)$, there are graphs~$G$ with maximum degree~$\Delta$ whose independent set polynomial, evaluated at~$\lambda$, is at most~$0$. Also, there are no such graphs for any $\lambda > -\lambda^*(\Delta)$. This paper is motivated by the computational problem of approximating the independent set polynomial when $\lambda < - \lambda^*(\Delta)$. The key issue in complexity bounds for this problem is "implementation". Informally, an implementation of a real number $\lambda'$ is a graph whose hard-core partition function, evaluated at~$\lambda$, simulates a vertex-weight of~$\lambda'$, in the sense that $\lambda'$ is the ratio between the contribution to the partition function from independent sets containing a certain vertex and the contribution from independent sets that do not contain that vertex. Implementations are the cornerstone of intractability results for the problem of approximately evaluating the independent set polynomial. Our main result is that, for any $\lambda < - \lambda^*(\Delta)$, it is possible to implement a set of values that is dense over the reals. The result is tight in the sense that it is not possible to implement a set of values that is dense over the reals for any $\lambda > -\lambda^*(\Delta)$. Our result has already been used in a paper with Bez\'akov\'a (STOC 2018) to show that it is \#P-hard to approximate the evaluation of the independent set polynomial on graphs of degree at most~$\Delta$ at any value $\lambda<-\lambda^*(\Delta)$. In the appendix, we give an additional, incomparable inapproximability result (strengthening the inapproximability bound to an exponential factor, but weakening the hardness to NP-hardness).
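
The implementations in the paper are built from carefully constructed graphs; the sketch below merely computes the defining ratio by brute force on a toy path graph (an arbitrary choice), which also illustrates why density over the reals can only arise for negative $\lambda$: for $\lambda \geq 0$ every contribution is nonnegative, so the ratio is too.

```python
from itertools import combinations

def independent_sets(n, edges):
    """All independent sets of the graph on vertices 0..n-1 (brute force)."""
    for k in range(n + 1):
        for s in combinations(range(n), k):
            s = set(s)
            if all(not (u in s and v in s) for u, v in edges):
                yield s

def implemented_value(n, edges, v, lam):
    """lambda' = (contribution of independent sets containing v)
              / (contribution of independent sets avoiding v),
    where an independent set I contributes lam^|I|."""
    z_in = z_out = 0.0
    for s in independent_sets(n, edges):
        w = lam ** len(s)
        if v in s:
            z_in += w
        else:
            z_out += w
    return z_in / z_out

# A path on 4 vertices (max degree 2, so lambda* = 1/4), terminal vertex v = 0.
n, edges = 4, [(0, 1), (1, 2), (2, 3)]
for lam in [-0.5, -0.3, 0.25]:
    print(f"lam = {lam:+.2f}: implemented value = "
          f"{implemented_value(n, edges, 0, lam):+.4f}")
```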

Wasserstein dictionary learning is an unsupervised approach to learning a collection of probability distributions that generate observed distributions as Wasserstein barycentric combinations. Existing methods for Wasserstein dictionary learning optimize an objective that seeks a dictionary with sufficient representation capacity via barycentric interpolation to approximate the observed training data, but without imposing additional structural properties on the coefficients associated to the dictionary. This leads to dictionaries that densely represent the observed data, which makes interpretation of the coefficients challenging and may also lead to poor empirical performance when using the learned coefficients in downstream tasks. In contrast and motivated by sparse dictionary learning in Euclidean spaces, we propose a geometrically sparse regularizer for Wasserstein space that promotes representations of a data point using only nearby dictionary elements. We show this approach leads to sparse representations in Wasserstein space and addresses the problem of non-uniqueness of barycentric representation. Moreover, when data is generated as Wasserstein barycenters of fixed distributions, this regularizer facilitates the recovery of the generating distributions in cases that are ill-posed for unregularized Wasserstein dictionary learning. Through experimentation on synthetic and real data, we show that our geometrically regularized approach yields sparser and more interpretable dictionaries in Wasserstein space, which perform better in downstream applications.
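
The non-uniqueness issue is easy to see in one dimension, where the $W_2$ barycenter has a closed form: its quantile function is the weighted average of the atoms' quantile functions. The toy sketch below (the dictionary atoms and weights are made up) generates data from a sparse barycentric combination and shows that unregularized least-squares recovery of the coefficients returns a dense, even negative, solution.

```python
import numpy as np
from scipy.stats import norm

# Three dictionary atoms represented by their quantile functions on a grid:
# in 1-D, the W2 barycenter with weights w has quantile function
# sum_k w_k F_k^{-1} (a closed form special to one dimension).
u = np.linspace(0.01, 0.99, 99)
A = np.stack([norm.ppf(u, loc=-2.0),             # atom 1: N(-2, 1)
              norm.ppf(u, loc=+2.0),             # atom 2: N(+2, 1)
              norm.ppf(u, scale=3.0)],           # atom 3: N(0, 9)
             axis=1)

w_true = np.array([0.7, 0.3, 0.0])   # sparse ground truth: atom 3 unused
target = A @ w_true                  # quantiles of the observed distribution

# Unregularized recovery: these atoms' quantile functions are linearly
# dependent, so the barycentric representation is not unique; least squares
# returns the minimum-norm coefficients, which are dense and even negative.
w_min_norm, *_ = np.linalg.lstsq(A, target, rcond=None)
print("sparse truth:     ", w_true)
print("min-norm solution:", np.round(w_min_norm, 3))
```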

Motivated by a recently established result saying that, within the class of bivariate Archimedean copulas, standard pointwise convergence implies weak convergence of almost all conditional distributions, this contribution studies the class $\mathcal{C}_{ar}^d$ of all $d$-dimensional Archimedean copulas with $d \geq 3$ and proves the aforementioned implication with respect to conditioning on the first $d-1$ coordinates. Several properties equivalent to pointwise convergence in $\mathcal{C}_{ar}^d$ are established and - as a by-product of working with conditional distributions (Markov kernels) - alternative simple proofs for the well-known formulas for the level set masses $\mu_C(L_t)$ and the Kendall distribution function $F_K^d$, as well as a novel geometric interpretation of the latter, are provided. Viewing normalized generators $\psi$ of $d$-dimensional Archimedean copulas from the perspective of their so-called Williamson measures $\gamma$ on $(0,\infty)$ is then shown to allow not only to derive surprisingly simple expressions for $\mu_C(L_t)$ and $F_K^d$ in terms of $\gamma$ and to characterize pointwise convergence in $\mathcal{C}_{ar}^d$ by weak convergence of the Williamson measures, but also to prove that regularity/singularity properties of $\gamma$ carry over directly to the corresponding copula $C_\gamma \in \mathcal{C}_{ar}^d$. These results are finally used to prove that both the family of all absolutely continuous and the family of all singular $d$-dimensional Archimedean copulas are dense in $\mathcal{C}_{ar}^d$, and to underline that, despite their simple algebraic structure, Archimedean copulas may exhibit surprisingly singular behavior in the sense of irregularity of their conditional distribution functions.
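
As a runnable companion to the Kendall distribution function $F_K^d$ (using a concrete Archimedean family, the Clayton copula with generator $\psi(t) = (1+t)^{-1/\theta}$, rather than the paper's general Williamson-measure setting), the sketch below samples the copula via the standard Marshall-Olkin frailty construction and estimates $F_K^d(t) = \mathbb{P}(C(U) \leq t)$ by Monte Carlo.

```python
import numpy as np

rng = np.random.default_rng(4)

def sample_clayton(n, d, theta):
    """Marshall-Olkin frailty sampling for the d-dim Clayton copula:
    psi(t) = (1 + t)^(-1/theta) is the Laplace transform of Gamma(1/theta, 1),
    and U_i = psi(E_i / V) with V ~ Gamma(1/theta, 1), E_i i.i.d. Exp(1)."""
    V = rng.gamma(1.0 / theta, 1.0, size=(n, 1))
    E = rng.exponential(1.0, size=(n, d))
    return (1.0 + E / V) ** (-1.0 / theta)

def kendall_cdf(U, theta, t_grid):
    """Monte Carlo estimate of F_K^d(t) = P(C(U) <= t), with
    C(u) = psi(sum_i psi^{-1}(u_i)) and psi^{-1}(u) = u^(-theta) - 1."""
    s = np.sum(U ** (-theta) - 1.0, axis=1)
    C = (1.0 + s) ** (-1.0 / theta)
    return np.array([(C <= t).mean() for t in t_grid])

U = sample_clayton(200_000, d=3, theta=2.0)
ts = np.array([0.1, 0.25, 0.5, 0.75, 0.9])
print("t:       ", ts)
print("F_K^3(t):", np.round(kendall_cdf(U, 2.0, ts), 4))
```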

V. Levenshtein first proposed the sequence reconstruction problem in 2001. This problem studies the model in which the same sequence from some set is transmitted over multiple channels, and the decoder receives the different outputs. Assume that the transmitted sequence is at distance $d$ from some code and that there are at most $r$ errors in every channel. The sequence reconstruction problem is then to find the minimum number of channels required to exactly recover the transmitted sequence; this number has to be greater than the maximum size of the intersection of two metric balls of radius $r$ whose centers are at distance at least $d$. In this paper, we study the sequence reconstruction problem for permutations under the Hamming distance. In this model, we define a Cayley graph and determine the exact value of the largest intersection of two metric balls in this graph under the Hamming distance for $r=4$ with $d\geqslant 5$, and for $d=2r$.
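
For tiny parameters, the quantity in question can be computed exhaustively. The sketch below brute-forces the largest intersection of two Hamming balls in the symmetric group $S_n$ over centers at distance at least $d$ (toy parameters only, far below the paper's regime; the search is exponential in $n$).

```python
from itertools import permutations

def hamming(p, q):
    """Hamming distance between permutations: positions where they differ."""
    return sum(a != b for a, b in zip(p, q))

def max_ball_intersection(n, r, d):
    """Brute-force max |B_r(p) intersect B_r(q)| over pairs with
    hamming(p, q) >= d; one more than this is the number of channels
    Levenshtein's criterion requires."""
    perms = list(permutations(range(n)))
    best = 0
    for i, p in enumerate(perms):
        for q in perms[i + 1:]:
            if hamming(p, q) >= d:
                inter = sum(1 for s in perms
                            if hamming(s, p) <= r and hamming(s, q) <= r)
                best = max(best, inter)
    return best

n, r, d = 5, 2, 4   # illustrative toy parameters, not the paper's regime
print(f"max intersection for n={n}, r={r}, d>={d}:",
      max_ball_intersection(n, r, d))
```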

This paper focuses on the expected difference in a borrower's repayment when there is a change in the lender's credit decisions. Classical estimators overlook confounding effects, and hence their estimation error can be substantial. We therefore propose an alternative approach to constructing the estimators so that the error is greatly reduced. The proposed estimators are shown to be unbiased, consistent, and robust through a combination of theoretical analysis and numerical testing. Moreover, we compare the power of the classical estimators and the proposed estimators for estimating the causal quantities. The comparison is tested across a wide range of models, including linear regression models, tree-based models, and neural network-based models, under different simulated datasets that exhibit different levels of causality, different degrees of nonlinearity, and different distributional properties. Most importantly, we apply our approaches to a large observational dataset provided by a global technology firm that operates in both the e-commerce and the lending business. We find that the relative reduction in estimation error is strikingly substantial if the causal effects are accounted for correctly.
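
The abstract does not spell out the construction of the estimators, so the following is a generic illustration of the underlying issue rather than the paper's method: on simulated lending data with a confounder, the naive treated-versus-control contrast is biased, while a standard inverse-propensity-weighted estimator recovers the true effect (all model choices below are made up for the example).

```python
import numpy as np

rng = np.random.default_rng(5)
n = 200_000

# Simulated lending data with a confounder: credit quality z drives both
# the approval decision (treatment) and repayment, so the naive
# treated-vs-control contrast is biased.
z = rng.standard_normal(n)                       # confounder (credit quality)
p_treat = 1.0 / (1.0 + np.exp(-2.0 * z))         # approval probability
t = rng.random(n) < p_treat                      # lender's decision
tau = 1.0                                        # true causal effect
y = tau * t + 2.0 * z + rng.standard_normal(n)   # repayment outcome

naive = y[t].mean() - y[~t].mean()

# Inverse-propensity weighting with the true propensity (in practice it
# would be estimated, e.g. by logistic regression).
ipw = np.mean(t * y / p_treat) - np.mean((~t) * y / (1.0 - p_treat))

print(f"true effect: {tau:.2f}   naive: {naive:.2f}   IPW: {ipw:.2f}")
```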
