亚洲男人的天堂2018av,欧美草比,久久久久久免费视频精选,国色天香在线看免费,久久久久亚洲av成人片仓井空

Mixture distributions with dynamic weights are an efficient way of modeling loss data characterized by heavy tails. However, maximum likelihood estimation of this family of models is difficult, mostly because of the need to evaluate numerically an intractable normalizing constant. In such a setup, simulation-based estimation methods are an appealing alternative. The approximate maximum likelihood estimation (AMLE) approach is employed. It is a general method that can be applied to mixtures with any component densities, as long as simulation is feasible. The focus is on the dynamic lognormal-generalized Pareto distribution, and the Cram\'er - von Mises distance is used to measure the discrepancy between observed and simulated samples. After deriving the theoretical properties of the estimators, a hybrid procedure is developed, where standard maximum likelihood is first employed to determine the bounds of the uniform priors required as input for AMLE. Simulation experiments and two real-data applications suggest that this approach yields a major improvement with respect to standard maximum likelihood estimation.

相關內容

The empirical likelihood is a powerful nonparametric tool, that emulates its parametric counterpart -- the parametric likelihood -- preserving many of its large-sample properties. This article tackles the problem of assessing the discriminatory power of three-class diagnostic tests from an empirical likelihood perspective. In particular, we concentrate on interval estimation in a three-class ROC analysis, where a variety of inferential tasks could be of interest. We present novel theoretical results and tailored techniques studied to efficiently solve some of such tasks. Extensive simulation experiments are provided in a supporting role, with our novel proposals compared to existing competitors, when possible. It emerges that our new proposals are extremely flexible, being able to compete with contestants and being the most suited to accommodating flexible distributions for target populations. We illustrate the application of the novel proposals with a real data example. The article ends with a discussion and a presentation of some directions for future research.

We propose EB-TC$\varepsilon$, a novel sampling rule for $\varepsilon$-best arm identification in stochastic bandits. It is the first instance of Top Two algorithm analyzed for approximate best arm identification. EB-TC$\varepsilon$ is an *anytime* sampling rule that can therefore be employed without modification for fixed confidence or fixed budget identification (without prior knowledge of the budget). We provide three types of theoretical guarantees for EB-TC$\varepsilon$. First, we prove bounds on its expected sample complexity in the fixed confidence setting, notably showing its asymptotic optimality in combination with an adaptive tuning of its exploration parameter. We complement these findings with upper bounds on its probability of error at any time and for any error parameter, which further yield upper bounds on its simple regret at any time. Finally, we show through numerical simulations that EB-TC$\varepsilon$ performs favorably compared to existing algorithms, in different settings.

Estimating causal effects from observational data is a central problem in many domains. A general approach is to balance covariates with weights such that the distribution of the data mimics randomization. We present generalized balancing weights, Neural Balancing Weights (NBW), to estimate the causal effects of an arbitrary mixture of discrete and continuous interventions. The weights were obtained through direct estimation of the density ratio between the source and balanced distributions by optimizing the variational representation of $f$-divergence. For this, we selected $\alpha$-divergence as it presents efficient optimization because it has an estimator whose sample complexity is independent of its ground truth value and unbiased mini-batch gradients; moreover, it is advantageous for the vanishing-gradient problem. In addition, we provide the following two methods for estimating the balancing weights: improving the generalization performance of the balancing weights and checking the balance of the distribution changed by the weights. Finally, we discuss the sample size requirements for the weights as a general problem of a curse of dimensionality when balancing multidimensional data. Our study provides a basic approach for estimating the balancing weights of multidimensional data using variational $f$-divergences.

We consider the problem of estimating the optimal transport map between two probability distributions, $P$ and $Q$ in $\mathbb R^d$, on the basis of i.i.d. samples. All existing statistical analyses of this problem require the assumption that the transport map is Lipschitz, a strong requirement that, in particular, excludes any examples where the transport map is discontinuous. As a first step towards developing estimation procedures for discontinuous maps, we consider the important special case where the data distribution $Q$ is a discrete measure supported on a finite number of points in $\mathbb R^d$. We study a computationally efficient estimator initially proposed by Pooladian and Niles-Weed (2021), based on entropic optimal transport, and show in the semi-discrete setting that it converges at the minimax-optimal rate $n^{-1/2}$, independent of dimension. Other standard map estimation techniques both lack finite-sample guarantees in this setting and provably suffer from the curse of dimensionality. We confirm these results in numerical experiments, and provide experiments for other settings, not covered by our theory, which indicate that the entropic estimator is a promising methodology for other discontinuous transport map estimation problems.

Configuration spaces (C-spaces) are an essential component of many robot path-planning algorithms, yet calculating them is a time-consuming task, especially in spaces involving a large number of degrees of freedom (DoF). Here we explore a two-step data-driven approach to C-space approximation: (1) sample (i.e., explicitly calculate) a few configurations; (2) train a machine learning (ML) model on these configurations to predict the collision status of other points in the C-space. We studied multiple factors that impact this approximation process, including model representation, number of DoF (up to 42), collision density, sample size, training set distribution, and desired confidence of predictions. We conclude that XGBoost offers a significant time improvement over other methods, while maintaining low error rates, even in C-Spaces with over 14 DoF.

We consider a social choice setting in which agents and alternatives are represented by points in a metric space, and the cost of an agent for an alternative is the distance between the corresponding points in the space. The goal is to choose a single alternative to (approximately) minimize the social cost (cost of all agents) or the maximum cost of any agent, when only limited information about the preferences of the agents is given. Previous work has shown that the best possible distortion one can hope to achieve is $3$ when access to the ordinal preferences of the agents is given, even when the distances between alternatives in the metric space are known. We improve upon this bound of $3$ by designing deterministic mechanisms that exploit a bit of cardinal information. We show that it is possible to achieve distortion $1+\sqrt{2}$ by using the ordinal preferences of the agents, the distances between alternatives, and a threshold approval set per agent that contains all alternatives for whom her cost is within an appropriately chosen factor of her cost for her most-preferred alternative. We show that this bound is the best possible for any deterministic mechanism in general metric spaces, and also provide improved bounds for the fundamental case of a line metric.

Simulation-based inference (SBI) enables amortized Bayesian inference for simulators with implicit likelihoods. But when we are primarily interested in the quality of predictive simulations, or when the model cannot exactly reproduce the observed data (i.e., is misspecified), targeting the Bayesian posterior may be overly restrictive. Generalized Bayesian Inference (GBI) aims to robustify inference for (misspecified) simulator models, replacing the likelihood-function with a cost function that evaluates the goodness of parameters relative to data. However, GBI methods generally require running multiple simulations to estimate the cost function at each parameter value during inference, making the approach computationally infeasible for even moderately complex simulators. Here, we propose amortized cost estimation (ACE) for GBI to address this challenge: We train a neural network to approximate the cost function, which we define as the expected distance between simulations produced by a parameter and observed data. The trained network can then be used with MCMC to infer GBI posteriors for any observation without running additional simulations. We show that, on several benchmark tasks, ACE accurately predicts cost and provides predictive simulations that are closer to synthetic observations than other SBI methods, especially for misspecified simulators. Finally, we apply ACE to infer parameters of the Hodgkin-Huxley model given real intracellular recordings from the Allen Cell Types Database. ACE identifies better data-matching parameters while being an order of magnitude more simulation-efficient than a standard SBI method. In summary, ACE combines the strengths of SBI methods and GBI to perform robust and simulation-amortized inference for scientific simulators.

The examination of uncertainty in the predictions of machine learning (ML) models is receiving increasing attention. One uncertainty modeling technique used for this purpose is Monte-Carlo (MC)-Dropout, where repeated predictions are generated for a single input. Therefore, clustering is required to describe the resulting uncertainty, but only through efficient clustering is it possible to describe the uncertainty from the model attached to each object. This article uses Bayesian Gaussian Mixture (BGM) to solve this problem. In addition, we investigate different values for the dropout rate and other techniques, such as focal loss and calibration, which we integrate into the Mask-RCNN model to obtain the most accurate uncertainty approximation of each instance and showcase it graphically.

The Bayesian Synthetic Likelihood (BSL) method is a widely-used tool for likelihood-free Bayesian inference. This method assumes that some summary statistics are normally distributed, which can be incorrect in many applications. We propose a transformation, called the Wasserstein Gaussianization transformation, that uses a Wasserstein gradient flow to approximately transform the distribution of the summary statistics into a Gaussian distribution. BSL also implicitly requires compatibility between simulated summary statistics under the working model and the observed summary statistics. A robust BSL variant which achieves this has been developed in the recent literature. We combine the Wasserstein Gaussianization transformation with robust BSL, and an efficient Variational Bayes procedure for posterior approximation, to develop a highly efficient and reliable approximate Bayesian inference method for likelihood-free problems.

The mechanisms by which certain training interventions, such as increasing learning rates and applying batch normalization, improve the generalization of deep networks remains a mystery. Prior works have speculated that "flatter" solutions generalize better than "sharper" solutions to unseen data, motivating several metrics for measuring flatness (particularly $\lambda_{max}$, the largest eigenvalue of the Hessian of the loss); and algorithms, such as Sharpness-Aware Minimization (SAM) [1], that directly optimize for flatness. Other works question the link between $\lambda_{max}$ and generalization. In this paper, we present findings that call $\lambda_{max}$'s influence on generalization further into question. We show that: (1) while larger learning rates reduce $\lambda_{max}$ for all batch sizes, generalization benefits sometimes vanish at larger batch sizes; (2) by scaling batch size and learning rate simultaneously, we can change $\lambda_{max}$ without affecting generalization; (3) while SAM produces smaller $\lambda_{max}$ for all batch sizes, generalization benefits (also) vanish with larger batch sizes; (4) for dropout, excessively high dropout probabilities can degrade generalization, even as they promote smaller $\lambda_{max}$; and (5) while batch-normalization does not consistently produce smaller $\lambda_{max}$, it nevertheless confers generalization benefits. While our experiments affirm the generalization benefits of large learning rates and SAM for minibatch SGD, the GD-SGD discrepancy demonstrates limits to $\lambda_{max}$'s ability to explain generalization in neural networks.

北京阿比特科技有限公司