
We show posterior convergence for the community structure in the planted bi-section model under several interesting priors. Examples include priors in which the label of each vertex is i.i.d. Bernoulli distributed with some parameter $r\in(0,1)$; the parameter $r$ may be fixed or equipped with a beta distribution. We place no constraints on the class sizes, which may range from zero vertices to all vertices and anything in between. This enables us to test between a uniform (Erd\H{o}s-R\'enyi) random graph with no distinguishable community and the planted bi-section model. The exact bounds for posterior convergence enable us to convert credible sets into confidence sets. Symmetric testing with posterior odds is shown to be consistent.
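
As a concrete illustration of the model under discussion, the following minimal sketch samples one graph from a planted bi-section model with i.i.d. Bernoulli$(r)$ vertex labels. The within- and between-community edge probabilities `p_in` and `p_out` are illustrative assumptions and are not taken from the abstract.

```python
import numpy as np

def sample_planted_bisection(n, r=0.5, p_in=0.6, p_out=0.2, rng=None):
    """Draw one graph from a planted bi-section model.

    Each vertex label is i.i.d. Bernoulli(r); an edge between vertices i and j
    is present with probability p_in if they share a label and p_out otherwise.
    """
    rng = np.random.default_rng() if rng is None else rng
    labels = rng.binomial(1, r, size=n)               # community assignment per vertex
    same = labels[:, None] == labels[None, :]         # True where labels agree
    probs = np.where(same, p_in, p_out)
    upper = np.triu(rng.random((n, n)) < probs, k=1)  # sample the upper triangle only
    adj = upper | upper.T                             # symmetrize, no self-loops
    return labels, adj.astype(int)

labels, adj = sample_planted_bisection(200, r=0.5)
```

Setting `p_in == p_out` recovers the uniform (Erd\H{o}s-R\'enyi) case with no distinguishable community, which is the null model in the testing problem mentioned above.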


In this paper, we consider several efficient data structures for the problem of sampling from a dynamically changing discrete probability distribution, where some prior information is known about the distribution of the rates, in particular the maximum and minimum rate, and where the number of possible outcomes $N$ is large. We consider three basic data structures: the Acceptance-Rejection method, the Complete Binary Tree and the Alias method. These can be used as building blocks in a multi-level data structure, where one of the basic data structures is used at each level, with the top level selecting a group of events and the bottom level selecting an element from a group. Depending on assumptions about the distribution of the rates of outcomes, different combinations of the basic structures can be used. We prove that for particular data structures the expected time of sampling and update is constant when the rate distribution satisfies certain conditions. We show that for any distribution, by combining a tree structure with the Acceptance-Rejection method, an expected sampling and update time of $O\left(\log\log\left(r_{\max}/r_{\min}\right)\right)$ is possible, where $r_{\max}$ is the maximum rate and $r_{\min}$ the minimum rate. We also discuss an implementation of a Two-Level Acceptance-Rejection data structure that allows expected constant time for sampling and amortized constant time for updates, assuming that $r_{\max}$ and $r_{\min}$ are known and the number of events is sufficiently large. We also present an experimental verification, highlighting the limits imposed by the constraints of a real-life setting.
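
A minimal sketch of the basic Acceptance-Rejection building block, assuming a known upper bound `r_bound` on the rates; the multi-level structures discussed above layer components like this one (or a tree or Alias table) on top of each other. The class name and interface are illustrative, not the paper's implementation.

```python
import random

class AcceptanceRejectionSampler:
    """Acceptance-rejection sampler over N dynamically changing rates.

    Sampling proposes a uniform index and accepts it with probability
    rate[i] / r_bound, so the expected number of trials is
    N * r_bound / sum(rates), which stays bounded when r_bound tracks the
    maximum rate and the rates are bounded away from zero. Updates are O(1).
    """

    def __init__(self, rates, r_bound):
        self.rates = list(rates)
        self.r_bound = r_bound  # assumed known upper bound on all rates

    def update(self, i, new_rate):
        assert 0 < new_rate <= self.r_bound
        self.rates[i] = new_rate

    def sample(self):
        n = len(self.rates)
        while True:
            i = random.randrange(n)                        # propose uniformly
            if random.random() * self.r_bound < self.rates[i]:
                return i                                   # accept with prob. rate[i]/r_bound

rates = [0.5 + random.random() for _ in range(1000)]       # rates bounded away from zero
sampler = AcceptanceRejectionSampler(rates, r_bound=1.5)
sampler.update(7, 1.2)
print(sampler.sample())
```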

We consider statistical inference in the density estimation model using a tree-based Bayesian approach, with Optional P\'olya trees as the prior distribution. We derive near-optimal convergence rates for the corresponding posterior distributions with respect to the supremum norm. For broad classes of H\"older-smooth densities, we show that the method automatically adapts to the unknown H\"older regularity parameter. We address uncertainty quantification by providing mathematical guarantees for credible sets derived from the obtained posterior distributions, leading to near-optimal uncertainty quantification for the density function, as well as for related functionals such as the cumulative distribution function. The results are illustrated through a brief simulation study.

This paper presents a new and unified approach to the derivation and analysis of many existing, as well as new, discontinuous Galerkin methods for linear elasticity problems. The analysis is based on a unified discrete formulation for the linear elasticity problem consisting of four discretization variables: the strongly symmetric stress tensor $\boldsymbol{\sigma}$ and the displacement $\boldsymbol{u}$ inside each element, and the modifications $\hat{\boldsymbol{\sigma}}$ and $\hat{\boldsymbol{u}}$ of these two variables on element boundaries. Motivated by many relevant methods in the literature, this formulation can be used to derive most existing discontinuous, nonconforming and conforming Galerkin methods for linear elasticity problems and, in particular, to develop a number of new discontinuous Galerkin methods. Many special cases of this four-field formulation are proved to be hybridizable and can be reduced to known hybridizable discontinuous Galerkin, weak Galerkin and local discontinuous Galerkin methods by eliminating one or two of the four fields. As a certain stabilization parameter tends to zero, this four-field formulation is proved to converge to conforming and nonconforming mixed methods for linear elasticity problems. Two families of inf-sup conditions, one known as $H^1$-based and the other as $H({\rm div})$-based, are proved to be uniformly valid with respect to different choices of discrete spaces and parameters. These inf-sup conditions guarantee the well-posedness of the newly proposed methods and, as a by-product, offer a new and unified analysis for many existing methods in the literature. Numerical examples are provided to verify the theoretical analysis, including the optimal convergence of the newly proposed methods.

One- and multi-dimensional stochastic Maxwell equations with additive noise are considered in this paper. It is known that such a system can be written in a multi-symplectic structure, and that the stochastic energy increases linearly in time. High-order discontinuous Galerkin methods are designed for the stochastic Maxwell equations with additive noise, and we show that the proposed methods satisfy the discrete form of the linear growth of the stochastic energy and preserve the multi-symplectic structure at the discrete level. An optimal error estimate of the semi-discrete DG method is also derived. The fully discrete methods are obtained by coupling with symplectic temporal discretizations. One- and two-dimensional numerical results are provided to demonstrate the performance of the proposed methods; optimal error estimates and linear growth of the discrete energy are observed in all cases.

In this paper, we present a simple yet effective method (ABSGD) for addressing the data imbalance issue in deep learning. Our method is a simple modification of momentum SGD in which we leverage an attentional mechanism to assign an individual importance weight to each gradient in the mini-batch. Unlike many existing heuristic-driven methods for tackling data imbalance, our method is grounded in {\it theoretically justified distributionally robust optimization (DRO)} and is guaranteed to converge to a stationary point of an information-regularized DRO problem. The individual-level weight of a sampled data point is proportional to the exponential of a scaled loss value of that point, where the scaling factor is interpreted as the regularization parameter in the framework of information-regularized DRO. Compared with existing class-level weighting schemes, our method can capture the diversity between individual examples within each class. Compared with existing individual-level weighting methods based on meta-learning, which require three backward propagations for computing mini-batch stochastic gradients, our method is more efficient, requiring only one backward propagation per iteration as in standard deep learning methods. To balance the learning of the feature-extraction layers and the learning of the classifier layer, we employ a two-stage method that uses SGD for pretraining followed by ABSGD for learning a robust classifier and finetuning the lower layers. Our empirical studies on several benchmark datasets demonstrate the effectiveness of the proposed method.
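
A minimal NumPy sketch of the individual-level weighting described above, assuming access to per-example losses and gradients; the name `lam` for the scaling/regularization parameter is illustrative, and the momentum update and deep-learning framework details are omitted.

```python
import numpy as np

def attentional_weights(losses, lam):
    """Per-example weights proportional to exp(loss / lam).

    A small lam concentrates weight on hard (high-loss) examples, while a
    large lam approaches uniform averaging, recovering plain SGD.
    """
    scaled = losses / lam
    scaled -= scaled.max()                 # subtract max for numerical stability
    w = np.exp(scaled)
    return w / w.sum()

def weighted_minibatch_gradient(per_example_grads, losses, lam=1.0):
    """Combine per-example gradients with attentional weights (one backward pass)."""
    w = attentional_weights(losses, lam)
    return (w[:, None] * per_example_grads).sum(axis=0)

# toy usage: 4 examples, 3-dimensional gradients
grads = np.random.randn(4, 3)
losses = np.array([0.2, 1.5, 0.3, 2.4])
print(weighted_minibatch_gradient(grads, losses, lam=0.5))
```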

We advocate for a practical Maximum Likelihood Estimation (MLE) approach to designing loss functions for regression and forecasting, as an alternative to the typical approach of direct empirical risk minimization on a specific target metric. The MLE approach is better suited to capturing inductive biases such as prior domain knowledge in datasets, and can output post-hoc estimators at inference time that optimize different types of target metrics. We present theoretical results demonstrating that our approach is competitive with any estimator for the target metric under some general conditions. In two practical example settings, Poisson and Pareto regression, we show that our competitive results can be used to prove that the MLE approach has better excess risk bounds than direct minimization of the target metric. We also demonstrate empirically that our method, instantiated with a well-designed general-purpose mixture likelihood family, can obtain superior performance on a variety of tasks across time-series forecasting and regression datasets with different data distributions.
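
A minimal sketch of the general idea for a single-parameter Poisson case, not the paper's full regression setup: fit by maximum likelihood once, then read off different post-hoc point predictions depending on the target metric (the mean minimizes squared error, the predictive median minimizes absolute error).

```python
import numpy as np
from scipy.stats import poisson

def fit_poisson_mle(y):
    """The MLE of a Poisson rate is the sample mean."""
    return float(np.mean(y))

def post_hoc_estimate(mu_hat, target_metric):
    """Read off the metric-optimal point prediction from the fitted likelihood."""
    if target_metric == "squared_error":
        return mu_hat                              # predictive mean
    if target_metric == "absolute_error":
        return float(poisson.ppf(0.5, mu_hat))     # predictive median
    raise ValueError(f"unknown metric: {target_metric}")

y = np.random.default_rng(0).poisson(3.2, size=500)   # synthetic count data
mu_hat = fit_poisson_mle(y)
print(post_hoc_estimate(mu_hat, "squared_error"),
      post_hoc_estimate(mu_hat, "absolute_error"))
```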

We argue that proven exponential upper bounds on runtimes, an established area in classic algorithms, are also interesting in heuristic search, and we prove several such results. We show that any of the algorithms randomized local search, the Metropolis algorithm, simulated annealing, and the (1+1) evolutionary algorithm can optimize any pseudo-Boolean weakly monotonic function under a large set of noise assumptions in a runtime that is at most exponential in the problem dimension~$n$. This drastically extends a previous such result, which was limited to the (1+1) EA, the LeadingOnes function, and one-bit or bit-wise prior noise with noise probability at most $1/2$, and at the same time simplifies its proof. With the same general argument we also derive, among other results, a sub-exponential upper bound for the runtime of the $(1,\lambda)$ evolutionary algorithm on the OneMax problem when the offspring population size $\lambda$ is logarithmic but below the efficiency threshold. To show that our approach can also handle non-trivial parent population sizes, we prove an exponential upper bound for the runtime of the mutation-based version of the simple genetic algorithm on the OneMax benchmark, matching a known exponential lower bound.
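
For readers unfamiliar with the benchmark, here is a minimal reference implementation of the noise-free (1+1) EA on OneMax, one of the algorithm/benchmark pairs named above; the noisy variants analyzed in the abstract add a noise model on top of the fitness evaluation, which this sketch omits.

```python
import random

def one_max(x):
    """OneMax fitness: the number of ones in the bit string."""
    return sum(x)

def one_plus_one_ea(n, max_iters=100_000, seed=0):
    """(1+1) EA: flip each bit independently with probability 1/n and keep the
    offspring if its fitness is at least as good as the parent's."""
    rng = random.Random(seed)
    parent = [rng.randint(0, 1) for _ in range(n)]
    for t in range(max_iters):
        if one_max(parent) == n:
            return t                                  # optimum reached after t iterations
        child = [b ^ (rng.random() < 1.0 / n) for b in parent]
        if one_max(child) >= one_max(parent):
            parent = child
    return max_iters

print(one_plus_one_ea(50))
```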

In this paper, we consider possibly misspecified stochastic differential equation models driven by L\'{e}vy processes. Regardless of whether the driving noise is Gaussian or not, the Gaussian quasi-likelihood estimator can estimate the unknown parameters in the drift and scale coefficients. However, in the misspecified case, the asymptotic distribution of the estimator changes with the correction of the misspecification bias, and consistent estimators of the asymptotic variance proposed for the correctly specified case may lose their theoretical validity. As one solution, we propose a bootstrap method for approximating the asymptotic distribution. We show that our bootstrap method works theoretically in both the correctly specified and the misspecified case, without assuming the precise distribution of the driving noise.

This paper develops a general methodology for conducting statistical inference on observations indexed by multiple sets of entities. We propose a novel multiway empirical likelihood statistic that converges to a chi-square distribution in the non-degenerate case, where the corresponding Hoeffding-type decomposition is dominated by its linear terms. Our methodology is related to the notion of jackknife empirical likelihood, but the leave-out pseudo-values are constructed by leaving out columns or rows. We further develop a modified version of our multiway empirical likelihood statistic, which converges to a chi-square distribution regardless of degeneracy, and establish its desirable higher-order properties compared with the t-ratio based on the conventional Eicker-White type variance estimator. The proposed methodology is illustrated by several important statistical problems, such as bipartite networks, two-stage sampling, generalized estimating equations, and three-way observations.

Developing classification algorithms that are fair with respect to sensitive attributes of the data has become an important problem due to the growing deployment of classification algorithms in various social contexts. Several recent works have focused on fairness with respect to a specific metric, modeled the corresponding fair classification problem as a constrained optimization problem, and developed tailored algorithms to solve it. Despite this, there remain important metrics for which we do not have fair classifiers, and many of the aforementioned algorithms do not come with theoretical guarantees, perhaps because the resulting optimization problem is non-convex. The main contribution of this paper is a new meta-algorithm for classification that takes as input a large class of fairness constraints, with respect to multiple non-disjoint sensitive attributes, and comes with provable guarantees. This is achieved by first developing a meta-algorithm for a large family of classification problems with convex constraints, and then showing that classification problems with general types of fairness constraints can be reduced to problems in this family. We present empirical results showing that our algorithm can achieve near-perfect fairness with respect to various fairness metrics, and that the loss in accuracy due to the imposed fairness constraints is often small. Overall, this work unifies several prior works on fair classification, presents a practical algorithm with theoretical guarantees, and can handle fairness metrics that could not be handled previously.
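
As a small illustration of the kind of group-fairness quantities treated as constraints in this line of work, the sketch below evaluates two common metrics (statistical parity and false-positive-rate gaps) for a given classifier's predictions over a binary sensitive attribute. It does not implement the meta-algorithm itself; the function names and synthetic data are illustrative assumptions.

```python
import numpy as np

def statistical_parity_gap(y_pred, sensitive):
    """Absolute difference in positive-prediction rates between the two groups."""
    rates = [y_pred[sensitive == g].mean() for g in np.unique(sensitive)]
    return abs(rates[0] - rates[1])

def false_positive_gap(y_pred, y_true, sensitive):
    """Absolute difference in false-positive rates between the two groups."""
    fprs = []
    for g in np.unique(sensitive):
        mask = (sensitive == g) & (y_true == 0)     # negatives within group g
        fprs.append(y_pred[mask].mean())
    return abs(fprs[0] - fprs[1])

# toy usage on synthetic labels and predictions
rng = np.random.default_rng(1)
sensitive = rng.integers(0, 2, size=1000)
y_true = rng.integers(0, 2, size=1000)
y_pred = rng.integers(0, 2, size=1000)
print(statistical_parity_gap(y_pred, sensitive),
      false_positive_gap(y_pred, y_true, sensitive))
```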
