亚洲男人的天堂2018av,欧美草比,久久久久久免费视频精选,国色天香在线看免费,久久久久亚洲av成人片仓井空

Clustering is an essential primitive in unsupervised machine learning. We bring forth the problem of sublinear-time differentially-private clustering as a natural and well-motivated direction of research. We combine the $k$-means and $k$-median sublinear-time results of Mishra et al. (SODA, 2001) and of Czumaj and Sohler (Rand. Struct. and Algorithms, 2007) with recent results on private clustering of Balcan et al. (ICML 2017), Gupta et al. (SODA, 2010) and Ghazi et al. (NeurIPS, 2020) to obtain sublinear-time private $k$-means and $k$-median algorithms via subsampling. We also investigate the privacy benefits of subsampling for group privacy.

相關內容

Although robust learning and local differential privacy are both widely studied fields of research, combining the two settings is just starting to be explored. We consider the problem of estimating a discrete distribution in total variation from $n$ contaminated data batches under a local differential privacy constraint. A fraction $1-\epsilon$ of the batches contain $k$ i.i.d. samples drawn from a discrete distribution $p$ over $d$ elements. To protect the users' privacy, each of the samples is privatized using an $\alpha$-locally differentially private mechanism. The remaining $\epsilon n $ batches are an adversarial contamination. The minimax rate of estimation under contamination alone, with no privacy, is known to be $\epsilon/\sqrt{k}+\sqrt{d/kn}$, up to a $\sqrt{\log(1/\epsilon)}$ factor. Under the privacy constraint alone, the minimax rate of estimation is $\sqrt{d^2/\alpha^2 kn}$. We show that combining the two constraints leads to a minimax estimation rate of $\epsilon\sqrt{d/\alpha^2 k}+\sqrt{d^2/\alpha^2 kn}$ up to a $\sqrt{\log(1/\epsilon)}$ factor, larger than the sum of the two separate rates. We provide a polynomial-time algorithm achieving this bound, as well as a matching information theoretic lower bound.

Stochastic partial differential equations (SPDEs) are the mathematical tool of choice for modelling spatiotemporal PDE-dynamics under the influence of randomness. Based on the notion of mild solution of an SPDE, we introduce a novel neural architecture to learn solution operators of PDEs with (possibly stochastic) forcing from partially observed data. The proposed Neural SPDE model provides an extension to two popular classes of physics-inspired architectures. On the one hand, it extends Neural CDEs and variants -- continuous-time analogues of RNNs -- in that it is capable of processing incoming sequential information arriving irregularly in time and observed at arbitrary spatial resolutions. On the other hand, it extends Neural Operators -- generalizations of neural networks to model mappings between spaces of functions -- in that it can parameterize solution operators of SPDEs depending simultaneously on the initial condition and a realization of the driving noise. By performing operations in the spectral domain, we show how a Neural SPDE can be evaluated in two ways, either by calling an ODE solver (emulating a spectral Galerkin scheme), or by solving a fixed point problem. Experiments on various semilinear SPDEs, including the stochastic Navier-Stokes equations, demonstrate how the Neural SPDE model is capable of learning complex spatiotemporal dynamics in a resolution-invariant way, with better accuracy and lighter training data requirements compared to alternative models, and up to 3 orders of magnitude faster than traditional solvers.

This paper considers the problem of inference in cluster randomized experiments when cluster sizes are non-ignorable. Here, by a cluster randomized experiment, we mean one in which treatment is assigned at the level of the cluster; by non-ignorable cluster sizes we mean that "large" clusters and "small" clusters may be heterogeneous, and, in particular, the effects of the treatment may vary across clusters of differing sizes. In order to permit this sort of flexibility, we consider a sampling framework in which cluster sizes themselves are random. In this way, our analysis departs from earlier analyses of cluster randomized experiments in which cluster sizes are treated as non-random. We distinguish between two different parameters of interest: the equally-weighted cluster-level average treatment effect, and the size-weighted cluster-level average treatment effect. For each parameter, we provide methods for inference in an asymptotic framework where the number of clusters tends to infinity and treatment is assigned using simple random sampling. We additionally permit the experimenter to sample only a subset of the units within each cluster rather than the entire cluster and demonstrate the implications of such sampling for some commonly used estimators. A small simulation study shows the practical relevance of our theoretical results.

We propose a multiple-splitting projection test (MPT) for one-sample mean vectors in high-dimensional settings. The idea of projection test is to project high-dimensional samples to a 1-dimensional space using an optimal projection direction such that traditional tests can be carried out with projected samples. However, estimation of the optimal projection direction has not been systematically studied in the literature. In this work, we bridge the gap by proposing a consistent estimation via regularized quadratic optimization. To retain type I error rate, we adopt a data-splitting strategy when constructing test statistics. To mitigate the power loss due to data-splitting, we further propose a test via multiple splits to enhance the testing power. We show that the $p$-values resulted from multiple splits are exchangeable. Unlike existing methods which tend to conservatively combine dependent $p$-values, we develop an exact level $\alpha$ test that explicitly utilizes the exchangeability structure to achieve better power. Numerical studies show that the proposed test well retains the type I error rate and is more powerful than state-of-the-art tests.

In the interdependent values (IDV) model introduced by Milgrom and Weber [1982], agents have private signals that capture their information about different social alternatives, and the valuation of every agent is a function of all agent signals. While interdependence has been mainly studied for auctions, it is extremely relevant for a large variety of social choice settings, including the canonical setting of public projects. The IDV model is very challenging relative to standard independent private values, and welfare guarantees have been achieved through two alternative conditions known as {\em single-crossing} and {\em submodularity over signals (SOS)}. In either case, the existing theory falls short of solving the public projects setting. Our contribution is twofold: (i) We give a workable characterization of truthfulness for IDV public projects for the largest class of valuations for which such a characterization exists, and term this class \emph{decomposable valuations}; (ii) We provide possibility and impossibility results for welfare approximation in public projects with SOS valuations. Our main impossibility result is that, in contrast to auctions, no universally truthful mechanism performs better for public projects with SOS valuations than choosing a project at random. Our main positive result applies to {\em excludable} public projects with SOS, for which we establish a constant factor approximation similar to auctions. Our results suggest that exclusion may be a key tool for achieving welfare guarantees in the IDV model.

Motivated by problems from neuroimaging in which existing approaches make use of "mass univariate" analysis which neglects spatial structure entirely, but the full joint modelling of all quantities of interest is computationally infeasible, a novel method for incorporating spatial dependence within a (potentially large) family of model-selection problems is presented. Spatial dependence is encoded via a Markov random field model for which a variant of the pseudo-marginal Markov chain Monte Carlo algorithm is developed and extended by a further augmentation of the underlying state space. This approach allows the exploitation of existing unbiased marginal likelihood estimators used in settings in which spatial independence is normally assumed thereby facilitating the incorporation of spatial dependence using non-spatial estimates with minimal additional development effort. The proposed algorithm can be realistically used for analysis of %smaller subsets of large image moderately sized data sets such as $2$D slices of whole $3$D dynamic PET brain images or other regions of interest. Principled approximations of the proposed method, together with simple extensions based on the augmented spaces, are investigated and shown to provide similar results to the full pseudo-marginal method. Such approximations and extensions allow the improved performance obtained by incorporating spatial dependence to be obtained at negligible additional cost. An application to measured PET image data shows notable improvements in revealing underlying spatial structure when compared to current methods that assume spatial independence.

Policy gradient (PG) estimation becomes a challenge when we are not allowed to sample with the target policy but only have access to a dataset generated by some unknown behavior policy. Conventional methods for off-policy PG estimation often suffer from either significant bias or exponentially large variance. In this paper, we propose the double Fitted PG estimation (FPG) algorithm. FPG can work with an arbitrary policy parameterization, assuming access to a Bellman-complete value function class. In the case of linear value function approximation, we provide a tight finite-sample upper bound on policy gradient estimation error, that is governed by the amount of distribution mismatch measured in feature space. We also establish the asymptotic normality of FPG estimation error with a precise covariance characterization, which is further shown to be statistically optimal with a matching Cramer-Rao lower bound. Empirically, we evaluate the performance of FPG on both policy gradient estimation and policy optimization, using either softmax tabular or ReLU policy networks. Under various metrics, our results show that FPG significantly outperforms existing off-policy PG estimation methods based on importance sampling and variance reduction techniques.

Federated learning with differential privacy, or private federated learning, provides a strategy to train machine learning models while respecting users' privacy. However, differential privacy can disproportionately degrade the performance of the models on under-represented groups, as these parts of the distribution are difficult to learn in the presence of noise. Existing approaches for enforcing fairness in machine learning models have considered the centralized setting, in which the algorithm has access to the users' data. This paper introduces an algorithm to enforce group fairness in private federated learning, where users' data does not leave their devices. First, the paper extends the modified method of differential multipliers to empirical risk minimization with fairness constraints, thus providing an algorithm to enforce fairness in the central setting. Then, this algorithm is extended to the private federated learning setting. The proposed algorithm, \texttt{FPFL}, is tested on a federated version of the Adult dataset and an "unfair" version of the FEMNIST dataset. The experiments on these datasets show how private federated learning accentuates unfairness in the trained models, and how FPFL is able to mitigate such unfairness.

We present a new sublinear time algorithm for approximating the spectral density (eigenvalue distribution) of an $n\times n$ normalized graph adjacency or Laplacian matrix. The algorithm recovers the spectrum up to $\epsilon$ accuracy in the Wasserstein-1 distance in $O(n\cdot \text{poly}(1/\epsilon))$ time given sample access to the graph. This result compliments recent work by David Cohen-Steiner, Weihao Kong, Christian Sohler, and Gregory Valiant (2018), which obtains a solution with runtime independent of $n$, but exponential in $1/\epsilon$. We conjecture that the trade-off between dimension dependence and accuracy is inherent. Our method is simple and works well experimentally. It is based on a Chebyshev polynomial moment matching method that employees randomized estimators for the matrix trace. We prove that, for any Hermitian $A$, this moment matching method returns an $\epsilon$ approximation to the spectral density using just $O({1}/{\epsilon})$ matrix-vector products with $A$. By leveraging stability properties of the Chebyshev polynomial three-term recurrence, we then prove that the method is amenable to the use of coarse approximate matrix-vector products. Our sublinear time algorithm follows from combining this result with a novel sampling algorithm for approximating matrix-vector products with a normalized graph adjacency matrix. Of independent interest, we show a similar result for the widely used \emph{kernel polynomial method} (KPM), proving that this practical algorithm nearly matches the theoretical guarantees of our moment matching method. Our analysis uses tools from Jackson's seminal work on approximation with positive polynomial kernels.

The conjoining of dynamical systems and deep learning has become a topic of great interest. In particular, neural differential equations (NDEs) demonstrate that neural networks and differential equation are two sides of the same coin. Traditional parameterised differential equations are a special case. Many popular neural network architectures, such as residual networks and recurrent networks, are discretisations. NDEs are suitable for tackling generative problems, dynamical systems, and time series (particularly in physics, finance, ...) and are thus of interest to both modern machine learning and traditional mathematical modelling. NDEs offer high-capacity function approximation, strong priors on model space, the ability to handle irregular data, memory efficiency, and a wealth of available theory on both sides. This doctoral thesis provides an in-depth survey of the field. Topics include: neural ordinary differential equations (e.g. for hybrid neural/mechanistic modelling of physical systems); neural controlled differential equations (e.g. for learning functions of irregular time series); and neural stochastic differential equations (e.g. to produce generative models capable of representing complex stochastic dynamics, or sampling from complex high-dimensional distributions). Further topics include: numerical methods for NDEs (e.g. reversible differential equations solvers, backpropagation through differential equations, Brownian reconstruction); symbolic regression for dynamical systems (e.g. via regularised evolution); and deep implicit models (e.g. deep equilibrium models, differentiable optimisation). We anticipate this thesis will be of interest to anyone interested in the marriage of deep learning with dynamical systems, and hope it will provide a useful reference for the current state of the art.

北京阿比特科技有限公司