
A polynomial threshold function (PTF) $f:\mathbb{R}^n \rightarrow \mathbb{R}$ is a function of the form $f(x) = \mathsf{sign}(p(x))$ where $p$ is a polynomial of degree at most $d$. PTFs are a classical and well-studied complexity class with applications across complexity theory, learning theory, approximation theory, quantum complexity, and more. We address the question of designing pseudorandom generators (PRGs) for PTFs in Gaussian space: design a PRG that takes a seed of few random bits and outputs an $n$-dimensional vector whose distribution is indistinguishable from the standard multivariate Gaussian by any degree-$d$ PTF. Our main result is a PRG that takes a seed of $d^{O(1)}\log ( n / \varepsilon)\log(1/\varepsilon)/\varepsilon^2$ random bits and whose output cannot be distinguished from the $n$-dimensional Gaussian distribution with advantage better than $\varepsilon$ by degree-$d$ PTFs. The best previous generator, due to O'Donnell, Servedio, and Tan (STOC'20), had a quasi-polynomial dependence on the degree $d$ (i.e., a seed length of $d^{O(\log d)}$). Along the way we prove a few nearly tight structural properties of restrictions of PTFs that may be of independent interest.
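To make the fooling criterion concrete, here is a minimal Monte Carlo sketch (not the paper's construction) that measures the fooling error of a candidate generator against one fixed degree-2 PTF; the cheap linear-map "generator" used as a strawman is an illustrative assumption only.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 10          # ambient dimension (illustrative)
m = 200_000     # Monte Carlo samples

# A random degree-2 PTF f(x) = sign(x' A x + b' x + c).
A = rng.standard_normal((n, n))
b = rng.standard_normal(n)
c = rng.standard_normal()

def ptf(X):
    """Evaluate the PTF on each row of X."""
    quad = np.einsum('mi,ij,mj->m', X, A, X)
    return np.sign(quad + X @ b + c)

# True standard n-dimensional Gaussian samples.
G = rng.standard_normal((m, n))

# A deliberately cheap generator: k < n Gaussian seeds pushed through a
# fixed linear map. This is NOT the paper's PRG; it is a strawman whose
# fooling error we measure the same way one would for any candidate.
k = 3
M = rng.standard_normal((k, n)) / np.sqrt(k)
Y = rng.standard_normal((m, k)) @ M

# Fooling error |E f(G) - E f(Y)|; a PRG for degree-d PTFs must keep this
# below epsilon simultaneously for every degree-d PTF, not just this one.
print(abs(ptf(G).mean() - ptf(Y).mean()))
```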

Related content

Spectral clustering algorithms are very popular. Starting from a pairwise similarity matrix, spectral clustering gives a partition of the data that approximately minimizes the total similarity score across clusters. Since there is no need to model how the data are distributed within each cluster, such a method enjoys algorithmic simplicity and robustness when clustering non-Gaussian data, such as data lying near manifolds. Nevertheless, several important questions remain unaddressed, such as how to estimate the similarity scores and the cluster assignment probabilities, which are important uncertainty estimates in clustering. In this article, we propose to solve these problems by discovering a generative modeling counterpart. Our clustering model is based on a spanning forest graph that consists of several disjoint spanning trees, with each tree corresponding to a cluster. Taking a Bayesian approach, we assign proper densities to the root and leaf nodes, and we prove that the posterior mode is almost the same as the spectral clustering estimate. Further, we show that the associated generative process, named the "forest process", is a continuous extension of the classic urn process, hence inheriting many nice properties such as having unbounded support for the number of clusters and being amenable to existing partition probability functions; at the same time, we carefully characterize the differences between the two. We demonstrate a novel application in the joint clustering of multiple-subject functional magnetic resonance imaging scans of the human brain.
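For context, here is a minimal sketch of the spectral clustering baseline that this generative counterpart targets (normalized-Laplacian eigenvectors followed by k-means); the Gaussian similarity and its bandwidth are illustrative choices, not the article's.

```python
import numpy as np
from scipy.linalg import eigh
from scipy.cluster.vq import kmeans2

def spectral_clustering(X, k, bandwidth=1.0, seed=0):
    """Baseline normalized spectral clustering (Ng-Jordan-Weiss style)."""
    # Pairwise Gaussian similarities.
    sq = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
    W = np.exp(-sq / (2 * bandwidth ** 2))
    np.fill_diagonal(W, 0.0)
    # Symmetric normalized Laplacian L = I - D^{-1/2} W D^{-1/2}.
    d = W.sum(axis=1)
    Dm12 = 1.0 / np.sqrt(np.maximum(d, 1e-12))
    L = np.eye(len(X)) - Dm12[:, None] * W * Dm12[None, :]
    # Eigenvectors for the k smallest eigenvalues, rows normalized.
    _, V = eigh(L, subset_by_index=[0, k - 1])
    V /= np.linalg.norm(V, axis=1, keepdims=True) + 1e-12
    _, labels = kmeans2(V, k, minit='++', seed=seed)
    return labels
```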

In DP-SGD, each round communicates a local SGD update, which leaks some new information about the underlying local data set to the outside world. To provide privacy, Gaussian noise with standard deviation $\sigma$ is added to local SGD updates after a clipping operation. We show that for attaining $(\epsilon,\delta)$-differential privacy, $\sigma$ can be chosen equal to $\sqrt{2(\epsilon +\ln(1/\delta))/\epsilon}$ for $\epsilon=\Omega(T/N^2)$, where $T$ is the total number of rounds and $N$ is the size of the local data set. In many machine learning problems, $N$ is large and $T=O(N)$. Hence, $\sigma$ becomes "independent" of any $T=O(N)$ choice with $\epsilon=\Omega(1/N)$; that is, our $\sigma$ depends only on $N$ rather than on $T$. As shown in our paper, this differential privacy characterization allows one to {\it a-priori} select the parameters of DP-SGD based on a fixed privacy budget (in terms of $\epsilon$ and $\delta$) so as to optimize the anticipated utility (test accuracy) the most. This ability to plan ahead, together with $\sigma$'s independence of $T$ (which allows local gradient computations to be split among as many rounds as needed, even for large $T$ as usually happens in practice), leads to a {\it proactive DP-SGD algorithm} that allows a client to balance its privacy budget with the accuracy of the learned global model based on local test data. We note that the current state-of-the-art differential privacy accountant method, based on $f$-DP, has a closed form for computing the privacy loss of DP-SGD. However, due to its interpretation complexity, it cannot be used in a simple way to plan ahead. Instead, accountant methods are only used to keep track of how the privacy budget has been spent (after the fact).
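A tiny worked example of the planning-ahead step, directly evaluating the $\sigma$ formula from the abstract for a given budget (the preconditions, e.g. $\epsilon=\Omega(T/N^2)$, and constants live in the paper and are not re-derived here):

```python
import math

def dpsgd_sigma(eps, delta):
    """Noise multiplier sigma = sqrt(2*(eps + ln(1/delta))/eps) from the
    abstract; valid only in the regime eps = Omega(T/N^2) discussed there."""
    return math.sqrt(2 * (eps + math.log(1 / delta)) / eps)

# Planning ahead for a fixed privacy budget, e.g. (eps, delta) = (2, 1e-5):
print(dpsgd_sigma(2.0, 1e-5))   # sqrt(2*(2 + 11.51)/2) ~= 3.68
```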

We propose a new algorithm for k-means clustering in a distributed setting, where the data is distributed across many machines and a coordinator communicates with these machines to compute the output clustering. Our algorithm guarantees a cost approximation factor and a number of communication rounds that depend only on the computational capacity of the coordinator. Moreover, the algorithm includes a built-in stopping mechanism, which allows it to use fewer communication rounds whenever possible. We show both theoretically and empirically that in many natural cases 1-4 rounds indeed suffice. In comparison with the popular k-means|| algorithm, our approach allows exploiting a larger coordinator capacity to obtain a smaller number of rounds. Our experiments show that the k-means cost obtained by the proposed algorithm is usually better than the cost obtained by k-means||, even when the latter is allowed a larger number of rounds. Moreover, the per-machine running time of our approach is considerably smaller than that of k-means||. Code for running the algorithm and experiments is available at //github.com/selotape/distributed_k_means.
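The coordinator-round pattern can be sketched as follows. This is an illustrative stand-in, not the paper's algorithm (see the repository above for that): each machine sends a capacity-bounded sample, the coordinator clusters the pooled sample, and an early-stop check plays the role of the built-in stopping mechanism.

```python
import numpy as np
from scipy.cluster.vq import kmeans2, vq

def distributed_kmeans(shards, k, capacity=1024, max_rounds=4, seed=0):
    """Illustrative coordinator loop (NOT the paper's algorithm)."""
    rng = np.random.default_rng(seed)
    per_machine = max(capacity // len(shards), k)
    centers, prev_cost = None, np.inf
    for _ in range(max_rounds):
        # Each machine sends a uniform sample that fits coordinator capacity.
        pooled = np.vstack([
            s[rng.choice(len(s), min(per_machine, len(s)), replace=False)]
            for s in shards
        ])
        centers, _ = kmeans2(pooled, k, minit='++', seed=seed)
        # Each machine reports its local cost; the coordinator sums them.
        cost = sum(((s - centers[vq(s, centers)[0]]) ** 2).sum() for s in shards)
        if cost >= 0.99 * prev_cost:   # stand-in stopping rule: no improvement
            break
        prev_cost = cost
    return centers
```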

We consider the infinitely many-armed bandit problem with rotting rewards, where the mean reward of an arm decreases at each pull of the arm according to an arbitrary trend with maximum rotting rate $\varrho=o(1)$. We show that this learning problem has an $\Omega(\max\{\varrho^{1/3}T,\sqrt{T}\})$ worst-case regret lower bound, where $T$ is the time horizon. When the algorithm knows the value of the maximum rotting rate $\varrho$, we show that a matching upper bound $\tilde{O}(\max\{\varrho^{1/3}T,\sqrt{T}\})$, up to a poly-logarithmic factor, can be achieved by an algorithm that uses a UCB index for each arm together with a threshold value to decide whether to keep pulling an arm or to remove it from further consideration. We also show that an $\tilde{O}(\max\{\varrho^{1/3}T,T^{3/4}\})$ regret upper bound can be achieved by an algorithm that does not know the value of $\varrho$, using an adaptive UCB index along with an adaptive threshold value.
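A schematic sketch of the known-$\varrho$ policy shape: keep a UCB index on the current arm and resample a fresh arm once the index falls below a threshold. The index form, the threshold $\max\{\varrho^{1/3}, 1/\sqrt{T}\}$, the rewards-in-$[0,1]$ assumption, and the `sample_arm`/`pull` interface are all assumptions for illustration; the paper's exact constants and forms differ.

```python
import numpy as np

def rotting_ucb_threshold(sample_arm, pull, T, varrho, seed=0):
    """Schematic UCB-index-with-threshold policy for rotting rewards.

    sample_arm(rng) draws a fresh arm; pull(arm) returns a reward in [0, 1]
    (hypothetical interface for this sketch).
    """
    rng = np.random.default_rng(seed)
    threshold = max(varrho ** (1 / 3), 1 / np.sqrt(T))  # assumed form
    total = 0.0
    arm, rewards = sample_arm(rng), []
    for _ in range(T):
        rewards.append(pull(arm))
        total += rewards[-1]
        n = len(rewards)
        ucb = np.mean(rewards) + np.sqrt(2 * np.log(T) / n)
        if ucb < 1 - threshold:       # arm looks too rotten: discard it
            arm, rewards = sample_arm(rng), []
    return total
```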

We propose a method for quantifying uncertainty in high-dimensional PDE systems with random parameters, where the number of solution evaluations is small. Parametric PDE solutions are often approximated using a spectral decomposition based on polynomial chaos expansions. For the class of systems we consider (i.e., high-dimensional with limited solution evaluations), the coefficients are given by an underdetermined linear system in a regression formulation. This implies that additional assumptions, such as sparsity of the coefficient vector, are needed to approximate the solution. Here, we present an approach where we assume the coefficients are close to the range of a generative model that maps from a low- to a high-dimensional space of coefficients. Our approach is inspired by recent work examining how generative models can be used for compressed sensing in systems with random Gaussian measurement matrices. Using results from PDE theory on coefficient decay rates, we construct an explicit generative model that predicts the polynomial chaos coefficient magnitudes. The algorithm we developed to find the coefficients, which we call GenMod, is composed of two main steps. First, we predict the coefficient signs using Orthogonal Matching Pursuit. Then, we assume the coefficients are within a sparse deviation from the range of a sign-adjusted generative model. This allows us to find the coefficients by solving a nonconvex optimization problem over the input space of the generative model and the space of sparse vectors. We obtain theoretical recovery results for a Lipschitz continuous generative model and for a more specific generative model based on coefficient decay rate bounds. We examine three high-dimensional problems and show that, for all three examples, the generative model approach outperforms sparsity-promoting methods at small sample sizes.
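A minimal sketch of the two GenMod steps, under simplifying assumptions: the generative model here is a two-parameter exponential-decay profile and the sparse deviation is handled with an $\ell_1$ penalty, whereas the paper builds a PDE-informed model and its own recovery formulation.

```python
import numpy as np
from sklearn.linear_model import OrthogonalMatchingPursuit
from scipy.optimize import minimize

def genmod_fit(Phi, y, n_signs=20, lam=1e-3):
    """Sketch of GenMod's two steps on a regression system Phi c = y."""
    m, p = Phi.shape
    # Step 1: predict coefficient signs with Orthogonal Matching Pursuit.
    omp = OrthogonalMatchingPursuit(n_nonzero_coefs=n_signs).fit(Phi, y)
    s = np.sign(omp.coef_)
    s[s == 0] = 1.0
    k = np.arange(p)

    def g(theta):            # assumed decay model |c_k| ~ a * exp(-b * k)
        a, b = theta
        return s * a * np.exp(-np.abs(b) * k)

    # Step 2: c = g(theta) + nu with nu a sparse deviation, found by a
    # nonconvex joint optimization over (theta, nu).
    def loss(z):
        theta, nu = z[:2], z[2:]
        c = g(theta) + nu
        return ((Phi @ c - y) ** 2).sum() + lam * np.abs(nu).sum()

    z0 = np.concatenate([[1.0, 0.1], np.zeros(p)])
    z = minimize(loss, z0, method='Powell').x
    return g(z[:2]) + z[2:]
```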

We establish some limit theorems for quasi-arithmetic means of random variables. This class of means contains the arithmetic, geometric, and harmonic means. A distinctive feature of our work is that the generators of the quasi-arithmetic means are allowed to be complex-valued, which makes it possible to consider quasi-arithmetic means of random variables that can take negative values. Our motivation for the limit theorems is finding simple estimators of the parameters of the Cauchy distribution. By applying the limit theorems, we obtain closed-form, unbiased, strongly consistent estimators for the joint estimation of the location and scale parameters of the Cauchy distribution, which are easy to compute and analyze.
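One concrete instance in this spirit uses the complex generator $f(x)=1/(x-i)$: for $X \sim \mathrm{Cauchy}(\mu,\gamma)$, the Stieltjes-transform identity $\mathbb{E}[1/(X-i)] = 1/(\mu - i(\gamma+1))$ holds, so inverting the quasi-arithmetic mean recovers both parameters in closed form and strongly consistently. Whether this matches the paper's exact estimator is an assumption; the identity itself is standard.

```python
import numpy as np

def cauchy_fit(x):
    """Closed-form location/scale estimators for the Cauchy distribution via
    the quasi-arithmetic mean with complex generator f(x) = 1/(x - i).

    Uses E[1/(X - i)] = 1/(mu - i*(gamma + 1)), so inverting the empirical
    mean recovers (mu, gamma) consistently by the strong law.
    """
    w = 1.0 / np.mean(1.0 / (x - 1j))   # quasi-arithmetic mean, inverted
    return w.real, -w.imag - 1.0        # (mu_hat, gamma_hat)

rng = np.random.default_rng(0)
x = rng.standard_cauchy(100_000) * 2.0 + 3.0   # Cauchy(mu=3, gamma=2)
print(cauchy_fit(x))                            # approximately (3.0, 2.0)
```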

The $p$-center problem (pCP) is a fundamental problem in location science, where we are given customer demand points and possible facility locations, and we want to choose $p$ of these locations to open a facility so that the maximum distance of any customer demand point to its closest open facility is minimized. State-of-the-art approaches to pCP exploit its connection to the set cover problem and solve pCP iteratively by repeatedly solving set cover subproblems. The classical textbook integer programming (IP) formulation of pCP is usually dismissed due to its size and its weak linear programming (LP) relaxation bounds. We present a novel solution approach that works on a new IP formulation obtained from the classical formulation by projection. The formulation is solved by branch-and-cut, where cuts for demand points are generated iteratively. Moreover, the formulation can be strengthened with combinatorial information to obtain a much tighter LP relaxation. In particular, we present a novel way to use lower bound information to obtain stronger cuts. We show that the LP relaxation bound of our strengthened formulation has the same strength as the best known bound in the literature, which is based on a semi-relaxation. Finally, we present a computational study on instances from the literature with more than 700,000 customers and locations. Our solution algorithm is competitive with highly sophisticated set-cover-based solution algorithms, which depend on various components and parameters.
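For reference, the classical textbook IP formulation that is "usually dismissed" reads as follows, with $y_j$ indicating whether location $j$ is opened, $x_{ij}$ assigning demand point $i$ to location $j$, and $d_{ij}$ the corresponding distance (a standard statement; the paper's new formulation can be seen as a projection of this one onto the $(y,z)$ variables):

```latex
\begin{align*}
\min \quad & z \\
\text{s.t.} \quad & \sum_{j} y_j = p, \\
& \sum_{j} x_{ij} = 1 \quad \forall i, \\
& x_{ij} \le y_j \quad \forall i, j, \\
& \sum_{j} d_{ij}\, x_{ij} \le z \quad \forall i, \\
& x_{ij},\, y_j \in \{0,1\} \quad \forall i, j.
\end{align*}
```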

We consider the problem of finding a compromise between the opinions of a group of individuals on a number of mutually independent, binary topics. In this paper, we quantify the loss in representativeness that results from requiring the outcome to have majority support; in other words, the "price of majority support". Each individual is assumed to support an outcome if they agree with the outcome on at least as many topics as they disagree on. Our results can also be seen as quantifying Anscombe's paradox, which states that the topic-wise majority outcome may not be supported by a majority. To measure the representativeness of an outcome, we consider two metrics. First, we look for an outcome that agrees with a majority on as many topics as possible. We prove that the maximum number such that there is guaranteed to exist an outcome that agrees with a majority on this number of topics and has majority support equals $\lceil (t+1)/2 \rceil$, where $t$ is the total number of topics. Second, we count the number of times a voter's opinion on a topic matches the outcome on that topic. The goal is to find the outcome with majority support that has the largest number of matches. We consider the ratio between this number and the number of matches of the overall best outcome, which may not have majority support, and seek the maximum ratio such that an outcome with majority support achieving this ratio of matches relative to the overall best is guaranteed to exist. For 3 topics, we show this ratio to be $5/6\approx 0.83$. In general, we prove an upper bound that comes arbitrarily close to $2\sqrt{6}-4\approx 0.90$ as $t$ tends to infinity. Furthermore, we numerically compute a better upper bound and a non-matching lower bound in the relevant range of $t$.
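The following snippet demonstrates Anscombe's paradox on the classic 5-voter, 3-topic instance: the topic-wise majority outcome is supported by only 2 of 5 voters under the agreement-based support notion defined above.

```python
import numpy as np

# Classic Anscombe-paradox instance: 5 voters, 3 binary topics (1 = yes).
votes = np.array([
    [1, 0, 0],   # voter A
    [0, 1, 0],   # voter B
    [0, 0, 1],   # voter C
    [1, 1, 1],   # voter D
    [1, 1, 1],   # voter E
])
t = votes.shape[1]

# Topic-wise majority outcome: yes on every topic (3 of 5 votes each time).
outcome = (votes.sum(axis=0) * 2 > len(votes)).astype(int)

# A voter supports the outcome if they agree on at least as many topics
# as they disagree on, i.e. on at least t/2 topics.
agreements = (votes == outcome).sum(axis=1)
supporters = int((2 * agreements >= t).sum())

print(outcome, agreements, supporters)   # [1 1 1] [1 1 1 3 3] 2 -> minority
```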

Predictive models for binary data are fundamental in various fields, and the growing complexity of modern applications has motivated several flexible specifications for modeling the relationship between the observed predictors and the binary responses. A widely implemented solution is to express the probability parameter via a probit mapping of a Gaussian process indexed by the predictors. However, unlike in continuous settings, there is a lack of closed-form results for predictive distributions in binary models with Gaussian process priors. Markov chain Monte Carlo methods and approximation strategies provide common solutions to this problem, but state-of-the-art algorithms are either computationally intractable or inaccurate in moderate-to-high dimensions. In this article, we aim to fill this gap by deriving closed-form expressions for the predictive probabilities in probit Gaussian processes that rely either on cumulative distribution functions of multivariate Gaussians or on functionals of multivariate truncated normals. To evaluate these quantities, we develop novel scalable solutions based on tile-low-rank Monte Carlo methods for computing multivariate Gaussian probabilities, and on mean-field variational approximations of multivariate truncated normals. Closed-form expressions for the marginal likelihood and for the posterior distribution of the Gaussian process are also discussed. As shown in simulated and real-world empirical studies, the proposed methods scale to dimensions where state-of-the-art solutions are impractical.
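To illustrate the closed-form flavor: for the probit GP model $\Pr(y_i=1\mid f_i)=\Phi(f_i)$, $f\sim N(0,K)$, the marginal likelihood is $\Phi_n(0; DKD+I_n)$ with $D=\mathrm{diag}(y)$, and the predictive probability is a ratio of two multivariate Gaussian CDFs. The sketch below evaluates that generic identity with an off-the-shelf CDF routine; the paper's scalable tile-low-rank and variational evaluation schemes are not reproduced here, and the RBF kernel is an illustrative choice.

```python
import numpy as np
from scipy.stats import multivariate_normal as mvn

def rbf(A, B, ell=1.0):
    """Squared-exponential kernel matrix between rows of A and B."""
    sq = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-sq / (2 * ell ** 2))

def probit_gp_predict(X, y, x_star, ell=1.0, jitter=1e-8):
    """Pr(y* = 1 | y) for a probit GP as a ratio of multivariate Gaussian
    CDFs; y is in {-1, +1}. Feasible only for moderate n with this routine."""
    n = len(y)
    Xa = np.vstack([X, x_star[None, :]])
    Ka = rbf(Xa, Xa, ell) + jitter * np.eye(n + 1)
    D = np.diag(y.astype(float))
    Da = np.diag(np.append(y, 1).astype(float))   # test label set to +1
    num = mvn(mean=np.zeros(n + 1),
              cov=Da @ Ka @ Da + np.eye(n + 1)).cdf(np.zeros(n + 1))
    den = mvn(mean=np.zeros(n),
              cov=D @ Ka[:n, :n] @ D + np.eye(n)).cdf(np.zeros(n))
    return num / den
```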

Amortized inference has led to efficient approximate inference for large datasets. The quality of posterior inference is largely determined by two factors: (a) the ability of the variational distribution to model the true posterior and (b) the capacity of the recognition network to generalize inference over all datapoints. We analyze approximate inference in variational autoencoders in terms of these two factors. We find that suboptimal inference is often due to amortizing inference rather than to the limited complexity of the approximating distribution. We show that this is partly because the generator learns to accommodate the choice of approximation. Furthermore, we show that the parameters used to increase the expressiveness of the approximation play a role in generalizing inference rather than simply improving the complexity of the approximation.
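A minimal sketch of how the amortization contribution can be measured: compare the ELBO under the encoder's amortized variational parameters against the ELBO after optimizing those parameters per datapoint. The `vae.encode(x) -> (mu, logvar)` and `vae.elbo(x, mu, logvar)` interface is hypothetical, assumed for this sketch.

```python
import torch

def amortization_gap(vae, x, steps=200, lr=0.05):
    """ELBO gain from replacing amortized Gaussian variational parameters
    with per-datapoint optimized ones, for one batch x."""
    with torch.no_grad():
        mu0, logvar0 = vae.encode(x)                       # hypothetical API
        amortized = vae.elbo(x, mu0, logvar0).mean()
    # Detach the variational parameters from the encoder and refine them.
    mu = mu0.clone().requires_grad_(True)
    logvar = logvar0.clone().requires_grad_(True)
    opt = torch.optim.Adam([mu, logvar], lr=lr)
    for _ in range(steps):
        opt.zero_grad()
        loss = -vae.elbo(x, mu, logvar).mean()
        loss.backward()
        opt.step()
    with torch.no_grad():
        optimized = vae.elbo(x, mu, logvar).mean()
    return optimized - amortized   # >= 0 up to optimization noise
```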
