亚洲男人的天堂2018av,欧美草比,久久久久久免费视频精选,国色天香在线看免费,久久久久亚洲av成人片仓井空

Distribution function is essential in statistical inference, and connected with samples to form a directed closed loop by the correspondence theorem in measure theory and the Glivenko-Cantelli and Donsker properties. This connection creates a paradigm for statistical inference. However, existing distribution functions are defined in Euclidean spaces and no longer convenient to use in rapidly evolving data objects of complex nature. It is imperative to develop the concept of distribution function in a more general space to meet emerging needs. Note that the linearity allows us to use hypercubes to define the distribution function in a Euclidean space, but without the linearity in a metric space, we must work with the metric to investigate the probability measure. We introduce a class of metric distribution functions through the metric between random objects and a fixed location in metric spaces. We overcome this challenging step by proving the correspondence theorem and the Glivenko-Cantelli theorem for metric distribution functions in metric spaces that lie the foundation for conducting rational statistical inference for metric space-valued data. Then, we develop homogeneity test and mutual independence test for non-Euclidean random objects, and present comprehensive empirical evidence to support the performance of our proposed methods.

相關內容

In this work, we introduce statistical testing under distributional shifts. We are interested in the hypothesis $P^* \in H_0$ for a target distribution $P^*$, but observe data from a different distribution $Q^*$. We assume that $P^*$ is related to $Q^*$ through a known shift $\tau$ and formally introduce hypothesis testing in this setting. We propose a general testing procedure that first resamples from the observed data to construct an auxiliary data set and then applies an existing test in the target domain. We prove that if the size of the resample is at most $o(\sqrt{n})$ and the resampling weights are well-behaved, this procedure inherits the pointwise asymptotic level and power from the target test. If the map $\tau$ is estimated from data, we can maintain the above guarantees under mild conditions if the estimation works sufficiently well. We further extend our results to uniform asymptotic level and a different resampling scheme. Testing under distributional shifts allows us to tackle a diverse set of problems. We argue that it may prove useful in reinforcement learning and covariate shift, we show how it reduces conditional to unconditional independence testing and we provide example applications in causal inference.

We develop a rigorous and general framework for constructing information-theoretic divergences that subsume both $f$-divergences and integral probability metrics (IPMs), such as the $1$-Wasserstein distance. We prove under which assumptions these divergences, hereafter referred to as $(f,\Gamma)$-divergences, provide a notion of `distance' between probability measures and show that they can be expressed as a two-stage mass-redistribution/mass-transport process. The $(f,\Gamma)$-divergences inherit features from IPMs, such as the ability to compare distributions which are not absolutely continuous, as well as from $f$-divergences, namely the strict concavity of their variational representations and the ability to control heavy-tailed distributions for particular choices of $f$. When combined, these features establish a divergence with improved properties for estimation, statistical learning, and uncertainty quantification applications. Using statistical learning as an example, we demonstrate their advantage in training generative adversarial networks (GANs) for heavy-tailed, not-absolutely continuous sample distributions. We also show improved performance and stability over gradient-penalized Wasserstein GAN in image generation.

A Lattice is a partially ordered set where both least upper bound and greatest lower bound of any pair of elements are unique and exist within the set. K\"{o}tter and Kschischang proved that codes in the linear lattice can be used for error and erasure-correction in random networks. Codes in the linear lattice have previously been shown to be special cases of codes in modular lattices. Two well known classifications of modular lattices are geometric and distributive lattices. We have identified the unique criterion which makes a geometric lattice distributive, thus characterizing all finite geometric distributive lattices. Our characterization helps to prove a conjecture regarding the maximum size of a distributive sublattice of a finite geometric lattice and identify the maximal case. The Whitney numbers of the class of geometric distributive lattices are also calculated. We present a few other applications of this unique characterization to derive certain results regarding linearity and complements in the linear lattice.

Property inference attacks reveal statistical properties about a training set but are difficult to distinguish from the primary purposes of statistical machine learning, which is to produce models that capture statistical properties about a distribution. Motivated by Yeom et al.'s membership inference framework, we propose a formal and generic definition of property inference attacks. The proposed notion describes attacks that can distinguish between possible training distributions, extending beyond previous property inference attacks that infer the ratio of a particular type of data in the training data set. In this paper, we show how our definition captures previous property inference attacks as well as a new attack that reveals the average degree of nodes of a training graph and report on experiments giving insight into the potential risks of property inference attacks.

Nonparametric latent structure models provide flexible inference on distinct, yet related, groups of observations. Each component of a vector of $d \ge 2$ random measures models the distribution of a group of exchangeable observations, while their dependence structure regulates the borrowing of information across different groups. Recent work has quantified the dependence between random measures in terms of Wasserstein distance from the maximally dependent scenario when $d=2$. By solving an intriguing max-min problem we are now able to define a Wasserstein index of dependence $I_\mathcal{W}$ with the following properties: (i) it simultaneously quantifies the dependence of $d \ge 2$ random measures; (ii) it takes values in [0,1]; (iii) it attains the extreme values $\{0,1\}$ under independence and complete dependence, respectively; (iv) since it is defined in terms of the underlying L\'evy measures, it is possible to evaluate it numerically in many Bayesian nonparametric models for partially exchangeable data.

The problem of skewness is common among clinical trials and survival data which has being the research focus derivation and proposition of different flexible distributions. Thus, a new distribution called Extended Rayleigh Lomax distribution is constructed from Rayleigh Lomax distribution to capture the excessiveness of some survival data. We derive the new distribution by using beta logit function proposed by Jones (2004). Some statistical properties of the distribution such as probability density function, cumulative density function, reliability rate, hazard rate, reverse hazard rate, moment generating functions, likelihood functions, skewness, kurtosis and coefficient of variation are obtained. We also performed the expected estimation of model parameters by maximum likelihood; goodness of fit and model selection criteria including Anderson Darling (AD), CramerVon Misses (CVM), Kolmogorov Smirnov (KS), Akaike Information Criterion (AIC), Bayesian Information Criterion (BIC) and Consistent Akaike Information Criterion (CAIC) are employed to select the better distribution from those models considered in the work. The results from the statistics criteria show that the proposed distribution performs better with better representation of the States in Nigeria COVID-19 death cases data than other competing models.

This work focuses on combining nonparametric topic models with Auto-Encoding Variational Bayes (AEVB). Specifically, we first propose iTM-VAE, where the topics are treated as trainable parameters and the document-specific topic proportions are obtained by a stick-breaking construction. The inference of iTM-VAE is modeled by neural networks such that it can be computed in a simple feed-forward manner. We also describe how to introduce a hyper-prior into iTM-VAE so as to model the uncertainty of the prior parameter. Actually, the hyper-prior technique is quite general and we show that it can be applied to other AEVB based models to alleviate the {\it collapse-to-prior} problem elegantly. Moreover, we also propose HiTM-VAE, where the document-specific topic distributions are generated in a hierarchical manner. HiTM-VAE is even more flexible and can generate topic distributions with better variability. Experimental results on 20News and Reuters RCV1-V2 datasets show that the proposed models outperform the state-of-the-art baselines significantly. The advantages of the hyper-prior technique and the hierarchical model construction are also confirmed by experiments.

Deep learning is the mainstream technique for many machine learning tasks, including image recognition, machine translation, speech recognition, and so on. It has outperformed conventional methods in various fields and achieved great successes. Unfortunately, the understanding on how it works remains unclear. It has the central importance to lay down the theoretic foundation for deep learning. In this work, we give a geometric view to understand deep learning: we show that the fundamental principle attributing to the success is the manifold structure in data, namely natural high dimensional data concentrates close to a low-dimensional manifold, deep learning learns the manifold and the probability distribution on it. We further introduce the concepts of rectified linear complexity for deep neural network measuring its learning capability, rectified linear complexity of an embedding manifold describing the difficulty to be learned. Then we show for any deep neural network with fixed architecture, there exists a manifold that cannot be learned by the network. Finally, we propose to apply optimal mass transportation theory to control the probability distribution in the latent space.

Deep distance metric learning (DDML), which is proposed to learn image similarity metrics in an end-to-end manner based on the convolution neural network, has achieved encouraging results in many computer vision tasks.$L2$-normalization in the embedding space has been used to improve the performance of several DDML methods. However, the commonly used Euclidean distance is no longer an accurate metric for $L2$-normalized embedding space, i.e., a hyper-sphere. Another challenge of current DDML methods is that their loss functions are usually based on rigid data formats, such as the triplet tuple. Thus, an extra process is needed to prepare data in specific formats. In addition, their losses are obtained from a limited number of samples, which leads to a lack of the global view of the embedding space. In this paper, we replace the Euclidean distance with the cosine similarity to better utilize the $L2$-normalization, which is able to attenuate the curse of dimensionality. More specifically, a novel loss function based on the von Mises-Fisher distribution is proposed to learn a compact hyper-spherical embedding space. Moreover, a new efficient learning algorithm is developed to better capture the global structure of the embedding space. Experiments for both classification and retrieval tasks on several standard datasets show that our method achieves state-of-the-art performance with a simpler training procedure. Furthermore, we demonstrate that, even with a small number of convolutional layers, our model can still obtain significantly better classification performance than the widely used softmax loss.

Discrete random structures are important tools in Bayesian nonparametrics and the resulting models have proven effective in density estimation, clustering, topic modeling and prediction, among others. In this paper, we consider nested processes and study the dependence structures they induce. Dependence ranges between homogeneity, corresponding to full exchangeability, and maximum heterogeneity, corresponding to (unconditional) independence across samples. The popular nested Dirichlet process is shown to degenerate to the fully exchangeable case when there are ties across samples at the observed or latent level. To overcome this drawback, inherent to nesting general discrete random measures, we introduce a novel class of latent nested processes. These are obtained by adding common and group-specific completely random measures and, then, normalising to yield dependent random probability measures. We provide results on the partition distributions induced by latent nested processes, and develop an Markov Chain Monte Carlo sampler for Bayesian inferences. A test for distributional homogeneity across groups is obtained as a by product. The results and their inferential implications are showcased on synthetic and real data.

北京阿比特科技有限公司