
Assessing goodness-of-fit is challenging because theoretically there is no uniformly powerful test, whereas in practice the question "what would be a preferable default test?" is important to applied statisticians. To examine this so-called omnibus testing problem, this paper considers the class of reweighted Anderson-Darling tests and makes two contributions. The first contribution is to provide a geometric understanding of the problem by establishing an explicit one-to-one correspondence between the weights and their focal directions of deviations of the distributions under the alternative hypothesis from those under the null. It is argued that the weights that produce the test statistic with minimum variance can serve as a general-purpose test. In addition, this default or optimal weights-based test is found to be practically equivalent to the Zhang test, which has been commonly perceived as powerful. The second contribution is to establish new large-sample results. It is shown that, like the Anderson-Darling statistic, the minimum variance test statistic under the null has the same distribution as that of a weighted sum of an infinite number of independent squared normal random variables. These theoretical results are shown to be useful for large sample-based approximations. Finally, the paper concludes with a few remarks, including how the present approach can be extended to create new multinomial goodness-of-fit tests.
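As a point of reference for the abstract above, the following is a minimal sketch of the classical (unweighted) Anderson-Darling statistic for a fully specified null distribution. It is not the reweighted or minimum-variance statistic of the paper, whose weights the abstract does not give; the function name `anderson_darling` and the `cdf` argument are illustrative assumptions.

```python
import math


def anderson_darling(sample, cdf):
    """Classical Anderson-Darling statistic A^2 for a fully specified null CDF.

    Assumption: `cdf` maps every observation strictly inside (0, 1),
    otherwise the logarithms below are undefined.
    """
    # Probability integral transform, then sort: u_(1) <= ... <= u_(n).
    u = sorted(cdf(x) for x in sample)
    n = len(u)
    total = 0.0
    for i in range(1, n + 1):
        # Term (2i - 1) * [ln u_(i) + ln(1 - u_(n+1-i))].
        total += (2 * i - 1) * (math.log(u[i - 1]) + math.log(1.0 - u[n - i]))
    return -n - total / n


# Usage: test against the Uniform(0, 1) null, whose CDF is the identity.
evenly_spread = [(i + 0.5) / 10 for i in range(10)]
skewed = [x ** 3 for x in evenly_spread]
a2_even = anderson_darling(evenly_spread, lambda x: x)
a2_skewed = anderson_darling(skewed, lambda x: x)
```

A sample that crowds toward one tail, such as `skewed`, yields a larger statistic than one spread evenly over the unit interval.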

Related content

We introduce a new statistical test based on the observed spacings of ordered data. The statistic is sensitive to non-uniformity in random samples and to short-lived features in event time series. Under some conditions, this new test can outperform existing ones, such as the well-known Kolmogorov-Smirnov or Anderson-Darling tests, in particular when the number of samples is small and differences occur over a small quantile of the null hypothesis distribution. A detailed description of the test statistic is provided, including a discussion of the parameterization of its distribution via asymptotic bootstrapping as well as a novel per-quantile error estimation of the empirical distribution. Two example applications are provided: using the test to boost the sensitivity in generic "bump hunting", and employing the test to detect supernovae. The article is rounded off with an extended performance comparison to other, established goodness-of-fit tests.

Bayesian nonparametric mixture models are common for modeling complex data. While these models are well-suited for density estimation, their application for clustering has some limitations. Miller and Harrison (2014) proved posterior inconsistency in the number of clusters when the true number of clusters is finite for Dirichlet process and Pitman--Yor process mixture models. In this work, we extend this result to additional Bayesian nonparametric priors such as Gibbs-type processes and finite-dimensional representations of them. The latter include the Dirichlet multinomial process and the recently proposed Pitman--Yor and normalized generalized gamma multinomial processes. We show that mixture models based on these processes are also inconsistent in the number of clusters and discuss possible solutions. Notably, we show that a post-processing algorithm introduced by Guha et al. (2021) for the Dirichlet process extends to more general models and provides a consistent method to estimate the number of components.

We introduce new goodness-of-fit tests and corresponding confidence bands for distribution functions. They are inspired by multi-scale methods of testing and based on refined laws of the iterated logarithm for the normalized uniform empirical process $\mathbb{U}_n (t)/\sqrt{t(1-t)}$ and its natural limiting process, the normalized Brownian bridge process $\mathbb{U}(t)/\sqrt{t(1-t)}$. The new tests and confidence bands refine the procedures of Berk and Jones (1979) and Owen (1995). Roughly speaking, the high power and accuracy of the latter methods in the tail regions of distributions are essentially preserved while gaining considerably in the central region. The goodness-of-fit tests perform well in signal detection problems involving sparsity, as in Ingster (1997), Donoho and Jin (2004) and Jager and Wellner (2007), but also under contiguous alternatives. Our analysis of the confidence bands sheds new light on the influence of the underlying $\phi$-divergences.

The problem of fair division of indivisible goods has been receiving much attention recently. The prominent metric of envy-freeness can always be satisfied in the divisible goods setting (see for example \cite{BT95}), but often cannot be satisfied in the indivisible goods setting. This has led to many relaxations thereof being introduced. We study the existence of maximin share (MMS) allocations, which is one such relaxation. Previous work has shown that MMS allocations are guaranteed to exist for all instances with $n$ players and $m$ goods if $m \leq n+4$. We extend this guarantee to the case of $m = n+5$ and show that the same guarantee fails for $m = n+6$.
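To make the maximin share notion concrete, here is a small brute-force sketch that computes one player's MMS value: the best guarantee the player can secure by splitting the goods into $n$ bundles and receiving the least valuable one. Additive, non-negative valuations are an assumption of this sketch (the abstract does not state the valuation class), and the function name `maximin_share` is hypothetical. The search is exponential in the number of goods, so it is only suitable for tiny instances.

```python
from itertools import product


def maximin_share(values, n):
    """Brute-force MMS value of one player.

    Assumptions: valuations are additive and non-negative; `values[j]`
    is the player's value for good j, and `n` is the number of players.
    """
    best = 0
    # Enumerate every assignment of goods to the n bundles.
    for assignment in product(range(n), repeat=len(values)):
        bundles = [0] * n
        for good_value, bundle in zip(values, assignment):
            bundles[bundle] += good_value
        # The player is left with the worst bundle of this partition.
        best = max(best, min(bundles))
    return best


# Usage: five goods, two players; the split {3, 3} / {2, 2, 2} is balanced.
mms_value = maximin_share([3, 3, 2, 2, 2], 2)
```

An allocation is then an MMS allocation if every player receives a bundle worth at least their own MMS value.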

We propose a sampling method based on an ensemble approximation of second order Langevin dynamics. The log target density is appended with a quadratic term in an auxiliary momentum variable and damped-driven Hamiltonian dynamics introduced; the resulting stochastic differential equation is invariant to the Gibbs measure, with marginal on the position coordinates given by the target. A preconditioner based on covariance under the law of the dynamics does not change this invariance property, and is introduced to accelerate convergence to the Gibbs measure. The resulting mean-field dynamics may be approximated by an ensemble method; this results in a gradient-free and affine-invariant stochastic dynamical system. Numerical results demonstrate its potential as the basis for a numerical sampler in Bayesian inverse problems.

The current best approximation algorithms for $k$-median rely on first obtaining a structured fractional solution known as a bi-point solution, and then rounding it to an integer solution. We improve this second step by unifying and refining previous approaches. We describe a hierarchy of increasingly-complex partitioning schemes for the facilities, along with corresponding sets of algorithms and factor-revealing non-linear programs. We prove that the third layer of this hierarchy is a $2.613$-approximation, improving upon the current best ratio of $2.675$, while no layer can be proved better than $2.588$ under the proposed analysis. On the negative side, we give a family of bi-point solutions which cannot be approximated better than the square root of the golden ratio, even if allowed to open $k+o(k)$ facilities. This gives a barrier to current approaches for obtaining an approximation better than $2 \sqrt{\phi} \approx 2.544$. Altogether we reduce the approximation gap of bi-point solutions by two thirds.

A central question in computational neuroscience is how structure determines function in neural networks. The emerging high-quality large-scale connectomic datasets raise the question of what general functional principles can be gleaned from structural information such as the distribution of excitatory/inhibitory synapse types and the distribution of synaptic weights. Motivated by this question, we developed a statistical mechanical theory of learning in neural networks that incorporates structural information as constraints. We derived an analytical solution for the memory capacity of the perceptron, a basic feedforward model of supervised learning, with constraint on the distribution of its weights. Our theory predicts that the reduction in capacity due to the constrained weight-distribution is related to the Wasserstein distance between the imposed distribution and that of the standard normal distribution. To test the theoretical predictions, we use optimal transport theory and information geometry to develop an SGD-based algorithm to find weights that simultaneously learn the input-output task and satisfy the distribution constraint. We show that training in our algorithm can be interpreted as geodesic flows in the Wasserstein space of probability distributions. We further developed a statistical mechanical theory for teacher-student perceptron rule learning and ask for the best way for the student to incorporate prior knowledge of the rule. Our theory shows that it is beneficial for the learner to adopt different prior weight distributions during learning, and shows that distribution-constrained learning outperforms unconstrained and sign-constrained learning. Our theory and algorithm provide novel strategies for incorporating prior knowledge about weights into learning, and reveal a powerful connection between structure and function in neural networks.

We study the problem of covering and learning sums $X = X_1 + \cdots + X_n$ of independent integer-valued random variables $X_i$ (SIIRVs) with unbounded, or even infinite, support. De et al. at FOCS 2018, showed that the maximum value of the collective support of $X_i$'s necessarily appears in the sample complexity of learning $X$. In this work, we address two questions: (i) Are there general families of SIIRVs with unbounded support that can be learned with sample complexity independent of both $n$ and the maximal element of the support? (ii) Are there general families of SIIRVs with unbounded support that admit proper sparse covers in total variation distance? As for question (i), we provide a set of simple conditions that allow the unbounded SIIRV to be learned with complexity $\text{poly}(1/\epsilon)$ bypassing the aforementioned lower bound. We further address question (ii) in the general setting where each variable $X_i$ has unimodal probability mass function and is a different member of some, possibly multi-parameter, exponential family $\mathcal{E}$ that satisfies some structural properties. These properties allow $\mathcal{E}$ to contain heavy tailed and non log-concave distributions. Moreover, we show that for every $\epsilon > 0$, and every $k$-parameter family $\mathcal{E}$ that satisfies some structural assumptions, there exists an algorithm with $\tilde{O}(k) \cdot \text{poly}(1/\epsilon)$ samples that learns a sum of $n$ arbitrary members of $\mathcal{E}$ within $\epsilon$ in TV distance. The output of the learning algorithm is also a sum of random variables whose distribution lies in the family $\mathcal{E}$. En route, we prove that any discrete unimodal exponential family with bounded constant-degree central moments can be approximated by the family corresponding to a bounded subset of the initial (unbounded) parameter space.

We present combinatorial and parallelizable algorithms for maximization of a submodular function, not necessarily monotone, with respect to a size constraint. We improve the best approximation factor achieved by an algorithm that has optimal adaptivity and nearly optimal query complexity to $0.193 - \varepsilon$. The conference version of this work mistakenly employed a subroutine that does not work for non-monotone, submodular functions. In this version, we propose a fixed and improved subroutine to add a set with high average marginal gain, ThreshSeq, which returns a solution in $O( \log(n) )$ adaptive rounds with high probability. Moreover, we provide two approximation algorithms. The first has approximation ratio $1/6 - \varepsilon$, adaptivity $O( \log (n) )$, and query complexity $O( n \log (k) )$, while the second has approximation ratio $0.193 - \varepsilon$, adaptivity $O( \log^2 (n) )$, and query complexity $O(n \log (k))$. Our algorithms are empirically validated to use a low number of adaptive rounds and total queries while obtaining solutions with high objective value in comparison with state-of-the-art approximation algorithms, including continuous algorithms that use the multilinear extension.

Recent studies have revealed several vulnerabilities to attacks that can jeopardize the integrity of machine learning models, opening in recent years a new window of opportunity in terms of cyber-security. This paper focuses on data poisoning attacks involving label-flipping. Such attacks occur during the training phase; the attacker aims to compromise the integrity of the targeted machine learning model by drastically reducing its overall accuracy and/or causing the misclassification of specific samples. The paper proposes two new kinds of data poisoning attacks based on label-flipping, targeting a variety of machine learning classifiers dedicated to malware detection using mobile exfiltration data. The proposed attacks are shown to be model-agnostic, having successfully corrupted a wide variety of machine learning models; Logistic Regression, Decision Tree, Random Forest and KNN are some examples. The first attack performs label-flipping actions randomly, while the second flips the labels of only one of the two classes. The effects of each attack are analyzed in further detail, with special emphasis on the accuracy drop and the misclassification rate. Finally, the paper points to a further research direction by suggesting the development of a defense technique that could provide feasible detection and/or mitigation mechanisms; such a technique should be capable of conferring a certain level of robustness on a target model against potential attackers.
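The two label-flipping strategies described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the paper's implementation: labels are assumed binary (0/1), and the function names, the `rate` parameter, and the fixed `seed` are all hypothetical choices made for this example.

```python
import random


def flip_random(labels, rate, seed=0):
    """Flip a fraction `rate` of all labels, chosen uniformly at random.

    Assumption: labels are binary, encoded as 0 and 1.
    """
    rng = random.Random(seed)
    n_flip = int(rate * len(labels))
    poisoned = list(labels)
    for i in rng.sample(range(len(labels)), n_flip):
        poisoned[i] = 1 - poisoned[i]
    return poisoned


def flip_one_class(labels, source_class, rate, seed=0):
    """Flip a fraction `rate` of the labels of `source_class` only."""
    rng = random.Random(seed)
    candidates = [i for i, y in enumerate(labels) if y == source_class]
    n_flip = int(rate * len(candidates))
    poisoned = list(labels)
    for i in rng.sample(candidates, n_flip):
        poisoned[i] = 1 - poisoned[i]
    return poisoned


# Usage: poison a toy training set of 50 benign (0) and 50 malicious (1) labels.
clean = [0] * 50 + [1] * 50
random_attack = flip_random(clean, rate=0.2)
targeted_attack = flip_one_class(clean, source_class=1, rate=0.5)
```

Because both functions touch only the label vector, they can be applied before training any classifier, which is the sense in which such attacks are model-agnostic.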
