
This article deals with hypothesis testing for extremely heavy-tailed distributions with infinite mean or variance, using a truncated sample mean. We obtain three necessary and sufficient conditions under which the asymptotic distribution of the truncated test statistic is normal, is neither normal nor stable, or converges to $-\infty$ or to a combination of stable distributions, respectively. A numerical simulation illustrates an application of these theoretical results to hypothesis testing.
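
As a rough illustration of the setting (a sketch under assumed details, not the paper's exact statistic), the following Python snippet computes a truncated sample mean and a studentized statistic for testing $H_0: \mu = \mu_0$; the truncation level `b = n ** 0.25` is an arbitrary placeholder rather than the rate analyzed in the article.

```python
import numpy as np

def truncated_test_statistic(x, mu0, b=None):
    """Studentized statistic based on a truncated sample mean (sketch).

    Observations are clipped to [-b, b] before averaging, which keeps
    the mean finite even when the raw distribution has infinite mean
    or variance. The default b = n**0.25 is an illustrative choice,
    not the truncation rate studied in the paper.
    """
    x = np.asarray(x, dtype=float)
    n = len(x)
    if b is None:
        b = n ** 0.25
    xt = np.clip(x, -b, b)      # truncated sample
    mean_t = xt.mean()          # truncated sample mean
    sd_t = xt.std(ddof=1)       # truncated sample standard deviation
    return np.sqrt(n) * (mean_t - mu0) / sd_t

# Example: a Cauchy sample (infinite mean), testing mu0 = 0.
rng = np.random.default_rng(0)
print(truncated_test_statistic(rng.standard_cauchy(10_000), mu0=0.0))
```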

Related content

Let $P$ be a linear differential operator over $\mathcal{D} \subset \mathbb{R}^d$ and $U = (U_x)_{x \in \mathcal{D}}$ a second-order stochastic process. In the first part of this article, we prove a new necessary and sufficient condition for all the trajectories of $U$ to verify the partial differential equation (PDE) $P(U) = 0$. This condition is formulated in terms of the covariance kernel of $U$. Compared to previous similar results, the novelty lies in the fact that the equality $P(U) = 0$ is understood in the \textit{sense of distributions}, which is a relevant framework for PDEs. This theorem provides valuable insight for the second part of this article, devoted to performing "physically informed" machine learning for the homogeneous three-dimensional free-space wave equation. We perform Gaussian process regression (GPR) on pointwise observations of a solution of this PDE. To do so, we propagate Gaussian process (GP) priors over its initial conditions through the wave equation. We obtain explicit formulas for the covariance kernel of the propagated GP, which can then be used for GPR. We then explore the particular cases of radial symmetry and point source. For the former, we derive convolution-free GPR formulas; for the latter, we show a direct link between GPR and the classical triangulation method for point source localization used in GPS systems. Additionally, this Bayesian framework provides a new answer to the ill-posed inverse problem of reconstructing initial conditions for the wave equation with a limited number of sensors, and simultaneously enables the inference of physical parameters from these data. Finally, we illustrate this physically informed GPR on a number of practical examples.
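
For context, once the covariance kernel of the propagated GP is available, prediction reduces to the standard GPR posterior formulas. A minimal sketch in Python, with a squared-exponential kernel standing in for the wave-equation kernels derived in the article:

```python
import numpy as np

def rbf_kernel(A, B, lengthscale=1.0):
    """Squared-exponential kernel; a stand-in for the propagated
    wave-equation kernel derived in the article."""
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-0.5 * d2 / lengthscale ** 2)

def gpr_posterior_mean(X, y, Xstar, noise=1e-2):
    """Standard GPR posterior mean: k(X*, X) (K + s^2 I)^{-1} y."""
    K = rbf_kernel(X, X) + noise * np.eye(len(X))
    Ks = rbf_kernel(Xstar, X)
    return Ks @ np.linalg.solve(K, y)

# Example: noisy pointwise observations of a 1-d function.
rng = np.random.default_rng(1)
X = rng.uniform(0, 1, (20, 1))
y = np.sin(6 * X[:, 0]) + 0.1 * rng.standard_normal(20)
Xstar = np.linspace(0, 1, 5)[:, None]
print(gpr_posterior_mean(X, y, Xstar))
```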

Simulator-based models are models for which the likelihood is intractable but simulation of synthetic data is possible. They are often used to describe complex real-world phenomena, and as such can often be misspecified in practice. Unfortunately, existing Bayesian approaches for simulators are known to perform poorly in those cases. In this paper, we propose a novel algorithm based on the posterior bootstrap and maximum mean discrepancy estimators. This leads to a highly parallelisable Bayesian inference algorithm with strong robustness properties. This is demonstrated through an in-depth theoretical study which includes generalisation bounds and proofs of frequentist consistency and robustness of our posterior. The approach is then assessed on a range of examples including a g-and-k distribution and a toggle-switch model.
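
The maximum mean discrepancy between two samples admits a simple closed-form estimator once a kernel is fixed; a minimal sketch (the Gaussian kernel and U-statistic form here are assumptions, and the paper's estimator may differ in detail):

```python
import numpy as np

def gaussian_kernel(A, B, bandwidth=1.0):
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-d2 / (2 * bandwidth ** 2))

def mmd2_unbiased(X, Y, bandwidth=1.0):
    """Unbiased (U-statistic) estimator of squared MMD between X and Y."""
    m, n = len(X), len(Y)
    Kxx = gaussian_kernel(X, X, bandwidth)
    Kyy = gaussian_kernel(Y, Y, bandwidth)
    Kxy = gaussian_kernel(X, Y, bandwidth)
    # Drop diagonal terms for the unbiased version.
    term_x = (Kxx.sum() - np.trace(Kxx)) / (m * (m - 1))
    term_y = (Kyy.sum() - np.trace(Kyy)) / (n * (n - 1))
    return term_x + term_y - 2 * Kxy.mean()

rng = np.random.default_rng(2)
X = rng.normal(0.0, 1.0, (500, 2))   # observed data
Y = rng.normal(0.5, 1.0, (500, 2))   # simulated data
print(mmd2_unbiased(X, Y))
```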

Real-world machine learning deployments are characterized by mismatches between the source (training) and target (test) distributions that may cause performance drops. In this work, we investigate methods for predicting the target domain accuracy using only labeled source data and unlabeled target data. We propose Average Thresholded Confidence (ATC), a practical method that learns a threshold on the model's confidence, predicting accuracy as the fraction of unlabeled examples for which model confidence exceeds that threshold. ATC outperforms previous methods across several model architectures, types of distribution shifts (e.g., due to synthetic corruptions, dataset reproduction, or novel subpopulations), and datasets (Wilds, ImageNet, Breeds, CIFAR, and MNIST). In our experiments, ATC estimates target performance $2$-$4\times$ more accurately than prior methods. We also explore the theoretical foundations of the problem, proving that, in general, identifying the accuracy is just as hard as identifying the optimal predictor and thus, the efficacy of any method rests upon (perhaps unstated) assumptions on the nature of the shift. Finally, analyzing our method on some toy distributions, we provide insights concerning when it works.
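
Going by the abstract's description, ATC can be sketched as follows: choose the threshold on labeled source data so that the fraction of source examples whose confidence exceeds it matches the source accuracy, then report the corresponding fraction on the unlabeled target data. The score function (maximum softmax probability) is an assumption here:

```python
import numpy as np

def atc_predict_accuracy(source_conf, source_correct, target_conf):
    """Average Thresholded Confidence (ATC), per the abstract.

    Learn a threshold t on source confidences so that the fraction of
    source examples with confidence > t equals the source accuracy,
    then predict target accuracy as the fraction of target confidences
    above t. Using max softmax probability as the score is an
    assumption; the paper may use other score functions.
    """
    acc_source = source_correct.mean()
    # Threshold at which the exceedance fraction matches source accuracy.
    t = np.quantile(source_conf, 1.0 - acc_source)
    return (target_conf > t).mean()

# Toy example with a calibrated source model and a shifted target.
rng = np.random.default_rng(3)
source_conf = rng.beta(5, 2, 10_000)               # model confidences
source_correct = rng.random(10_000) < source_conf  # correctness indicators
target_conf = rng.beta(4, 3, 10_000)               # shifted target confidences
print(atc_predict_accuracy(source_conf, source_correct, target_conf))
```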

The generalized g-formula can be used to estimate the probability of survival under a sustained treatment strategy. When treatment strategies are deterministic, estimators derived from the so-called efficient influence function (EIF) for the g-formula will be doubly robust to model misspecification. In recent years, several practical applications have motivated estimation of the g-formula under non-deterministic treatment strategies, where treatment assignment at each time point depends on the observed treatment process. In this case, EIF-based estimators may or may not be doubly robust. In this paper, we provide sufficient conditions ensuring the existence of doubly robust estimators for intervention treatment distributions that depend on the observed treatment process in point treatment settings, and give a class of intervention treatment distributions dependent on the observed treatment process that guarantees doubly and multiply robust estimators in longitudinal settings. Motivated by an application to pre-exposure prophylaxis (PrEP) initiation studies, we propose a new treatment intervention dependent on the observed treatment process. We show that there exist 1) estimators that are doubly and multiply robust to model misspecification, and 2) estimators that, when used with machine learning algorithms, can attain fast convergence rates for our proposed intervention. Theoretical results are confirmed via simulation studies.
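
For orientation, in the point-treatment case the g-formula under a stochastic intervention distribution $G$ takes the following standard form (the notation is schematic rather than the paper's):

\[
\psi \;=\; \int \int \mathbb{E}\left[ Y \mid A = a, L = l \right] \, dG(a \mid l) \, dF_L(l),
\]

where $L$ denotes baseline covariates, $A$ the observed treatment, $Y$ the outcome, and $G(\cdot \mid l)$ the intervention treatment distribution, which in the settings studied here may itself depend on the observed treatment process.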

Common practice when using recurrent neural networks (RNNs) is to apply a model to sequences longer than those seen in training. This "extrapolating" usage deviates from the traditional statistical learning setup, where guarantees are provided under the assumption that train and test distributions are identical. Here we set out to understand when RNNs can extrapolate, focusing on a simple case where the data generating distribution is memoryless. We first show that even with infinite training data, there exist RNN models that interpolate perfectly (i.e., they fit the training data) yet extrapolate poorly to longer sequences. We then show that if gradient descent is used for training, learning will converge to perfect extrapolation under certain assumptions on initialization. Our results complement recent studies on the implicit bias of gradient descent, showing that it plays a key role in extrapolation when learning temporal prediction models.

The community of scientists is characterized by their need to publish in peer-reviewed journals, in an attempt to avoid the "perish" side of the famous maxim. Accordingly, almost all researchers have authored some scientific articles. Scholarly publications represent at least two benefits for the study of the scientific community as a social group. First, they attest to some form of relation between scientists (collaborations, mentoring, heritage, ...), useful for determining and analyzing social subgroups. Second, most of them are recorded in large databases, easily accessible and including a lot of pertinent information, easing the quantitative and qualitative study of the scientific community. Understanding the underlying dynamics driving the creation of knowledge in general, and of scientific publications in particular, in addition to its interest from the social science point of view, can contribute to maintaining a high level of research by identifying good and bad practices in science. In this manuscript, we aim to advance this understanding through a statistical analysis of publications within peer-reviewed journals. Namely, we show that the distribution of the number of articles published by an author in a given journal is heavy-tailed, but has a lighter tail than a power law. Interestingly, we demonstrate (both analytically and numerically) that such distributions are the result of a modified preferential attachment process.
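
To make the mechanism concrete, here is a minimal simulation of a classical Simon-style preferential attachment process for authors accumulating articles in a journal; note that this plain version produces power-law tails, whereas the modification analyzed in the manuscript yields lighter tails and is not reproduced here:

```python
import numpy as np

def simulate_preferential_attachment(n_articles, p_new=0.3, seed=0):
    """Simon-style preferential attachment (sketch).

    Each new article is written by a brand-new author with probability
    p_new; otherwise an existing author is chosen with probability
    proportional to their current article count.
    """
    rng = np.random.default_rng(seed)
    counts = [1]                              # first article, first author
    for _ in range(n_articles - 1):
        if rng.random() < p_new:
            counts.append(1)                  # a new author enters
        else:
            probs = np.array(counts) / sum(counts)
            counts[rng.choice(len(counts), p=probs)] += 1
    return counts

counts = simulate_preferential_attachment(5_000)
print(max(counts), len(counts))   # most prolific author, number of authors
```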

This article introduces an informative goodness-of-fit (iGOF) approach to study multivariate distributions. When the null model is rejected, iGOF allows us to identify the underlying sources of mismodeling and naturally equips practitioners with additional insights on the nature of the deviations from the true distribution. The informative character of the procedure is achieved by exploiting smooth tests and random fields theory to facilitate the analysis of multivariate data. Simulation studies show that iGOF enjoys high power for different types of alternatives. The methods presented here directly address the problem of background mismodeling arising in physics and astronomy. It is in these areas that the motivation of this work is rooted.

Though remarkable progress has been achieved in various vision tasks, deep neural networks still suffer obvious performance degradation when tested in out-of-distribution scenarios. We argue that the feature statistics (mean and standard deviation), which carry the domain characteristics of the training data, can be properly manipulated to improve the generalization ability of deep learning models. Common methods often treat the feature statistics as deterministic values measured from the learned features and do not explicitly account for the uncertain discrepancy in statistics caused by potential domain shifts during testing. In this paper, we improve the network generalization ability by modeling the uncertainty of domain shifts with synthesized feature statistics during training. Specifically, we hypothesize that the feature statistic, after considering the potential uncertainties, follows a multivariate Gaussian distribution. Hence, each feature statistic is no longer a deterministic value, but a probabilistic point with diverse distribution possibilities. With the uncertain feature statistics, the models can be trained to alleviate the domain perturbations and achieve better robustness against potential domain shifts. Our method can be readily integrated into networks without additional parameters. Extensive experiments demonstrate that our proposed method consistently improves the network generalization ability on multiple vision tasks, including image classification, semantic segmentation, and instance retrieval. The code will be released soon at //github.com/lixiaotong97/DSU.
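
A hedged sketch of the idea in PyTorch: treat the channel-wise mean and standard deviation of a feature map as Gaussian random variables whose uncertainty is estimated across the mini-batch, and re-sample them during training. The details below (e.g., how the uncertainty is estimated) are assumptions, not the authors' released implementation:

```python
import torch

def uncertain_stats_perturbation(x, eps=1e-6):
    """Perturb feature statistics to simulate domain shift (sketch).

    x: feature map of shape (B, C, H, W). The channel-wise mean and std
    are re-sampled from Gaussians centered at their observed values,
    with variances estimated from their variability across the batch.
    """
    mu = x.mean(dim=(2, 3), keepdim=True)          # (B, C, 1, 1)
    sig = x.std(dim=(2, 3), keepdim=True) + eps
    # Uncertainty of the statistics, estimated across the batch.
    sig_mu = mu.var(dim=0, keepdim=True).sqrt()    # (1, C, 1, 1)
    sig_sig = sig.var(dim=0, keepdim=True).sqrt()
    mu_new = mu + torch.randn_like(mu) * sig_mu
    sig_new = sig + torch.randn_like(sig) * sig_sig
    # Re-normalize with the sampled statistics.
    return sig_new * (x - mu) / sig + mu_new

x = torch.randn(8, 16, 32, 32)
print(uncertain_stats_perturbation(x).shape)
```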

The problem of learning functions over spaces of probabilities - or distribution regression - is gaining significant interest in the machine learning community. A key challenge behind this problem is to identify a suitable representation capturing all relevant properties of the underlying functional mapping. A principled approach to distribution regression is provided by kernel mean embeddings, which lift kernel-induced similarity on the input domain to the level of probability distributions. This strategy effectively tackles the two-stage sampling nature of the problem, enabling one to derive estimators with strong statistical guarantees, such as universal consistency and excess risk bounds. However, kernel mean embeddings implicitly hinge on the maximum mean discrepancy (MMD), a metric on probabilities, which may fail to capture key geometrical relations between distributions. In contrast, optimal transport (OT) metrics are potentially more appealing, as documented by the recent literature on the topic. In this work, we propose the first OT-based estimator for distribution regression. We build on the Sliced Wasserstein distance to obtain an OT-based representation. We study the theoretical properties of a kernel ridge regression estimator based on this representation, for which we prove universal consistency and excess risk bounds. Preliminary experiments complement our theoretical findings by showing the effectiveness of the proposed approach and comparing it with MMD-based estimators.
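
The Sliced Wasserstein distance underlying this representation has a simple Monte Carlo estimator: project both samples onto random directions and average the resulting one-dimensional Wasserstein distances, which for equal sample sizes reduce to comparisons of sorted samples. A minimal sketch:

```python
import numpy as np

def sliced_wasserstein(X, Y, n_projections=100, seed=0):
    """Monte Carlo estimate of the Sliced Wasserstein-1 distance.

    Assumes X and Y contain the same number of points; in 1-d, the
    Wasserstein-1 distance equals the mean absolute difference
    between sorted projected samples.
    """
    rng = np.random.default_rng(seed)
    d = X.shape[1]
    total = 0.0
    for _ in range(n_projections):
        theta = rng.standard_normal(d)
        theta /= np.linalg.norm(theta)   # uniform direction on the sphere
        px, py = np.sort(X @ theta), np.sort(Y @ theta)
        total += np.abs(px - py).mean()  # 1-d W1 via sorted samples
    return total / n_projections

rng = np.random.default_rng(4)
X = rng.normal(0.0, 1.0, (300, 5))
Y = rng.normal(1.0, 1.0, (300, 5))
print(sliced_wasserstein(X, Y))
```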

Discrete random structures are important tools in Bayesian nonparametrics, and the resulting models have proven effective in density estimation, clustering, topic modeling and prediction, among others. In this paper, we consider nested processes and study the dependence structures they induce. Dependence ranges between homogeneity, corresponding to full exchangeability, and maximum heterogeneity, corresponding to (unconditional) independence across samples. The popular nested Dirichlet process is shown to degenerate to the fully exchangeable case when there are ties across samples at the observed or latent level. To overcome this drawback, inherent to nesting general discrete random measures, we introduce a novel class of latent nested processes. These are obtained by adding common and group-specific completely random measures and then normalising to yield dependent random probability measures. We provide results on the partition distributions induced by latent nested processes, and develop a Markov chain Monte Carlo sampler for Bayesian inference. A test for distributional homogeneity across groups is obtained as a by-product. The results and their inferential implications are showcased on synthetic and real data.
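
Schematically, the latent nested construction described above can be written as

\[
\tilde{p}_i \;=\; \frac{\mu_0 + \mu_i}{\left(\mu_0 + \mu_i\right)(\mathbb{X})}, \qquad i = 1, \dots, g,
\]

where $\mu_0$ is a completely random measure shared across groups, $\mu_i$ is group-specific, and $\mathbb{X}$ is the sample space; the shared component $\mu_0$ induces dependence across the resulting random probability measures (the notation here is illustrative rather than the paper's).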
