
In Bayesian inverse problems, one aims at characterizing the posterior distribution of a set of unknowns, given indirect measurements. For non-linear or non-Gaussian problems, analytic solutions are seldom available; Sequential Monte Carlo (SMC) samplers offer a powerful tool for approximating complex posteriors by constructing an auxiliary sequence of densities that smoothly reaches the posterior. Often the posterior depends on a scalar hyper-parameter. In this work, we show that properly designed SMC samplers naturally provide an approximation of the marginal likelihood associated with this hyper-parameter for free, i.e. at a negligible additional computational cost. The proposed method proceeds by constructing the auxiliary sequence of distributions in such a way that each of them can be interpreted as a posterior distribution corresponding to a different value of the hyper-parameter. This can be exploited to perform selection of the hyper-parameter in Empirical Bayes (EB) approaches, as well as averaging across values of the hyper-parameter according to some hyper-prior distribution in Fully Bayesian (FB) approaches. For FB approaches, the proposed method has the further benefit of allowing prior sensitivity analysis at a negligible computational cost. In addition, the proposed method exploits particles at all the (relevant) iterations, thus alleviating one of the known limitations of SMC samplers, i.e. the fact that all samples at intermediate iterations are typically discarded. We show numerical results for two distinct cases where the hyper-parameter affects only the likelihood: a toy example, where an SMC sampler is used to approximate the full posterior distribution; and a brain imaging example, where a Rao-Blackwellized SMC sampler is used to approximate the posterior distribution of a subset of parameters in a conditionally linear Gaussian model.
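The idea of reading each intermediate SMC target as a posterior for a different hyper-parameter value can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes a hypothetical toy model (x ~ N(0, 1), y = x + Gaussian noise) in which the hyper-parameter is the noise standard deviation, and it uses likelihood tempering: raising a Gaussian likelihood with standard deviation sigma to the power t gives, up to a constant, a Gaussian likelihood with standard deviation sigma / sqrt(t). The function names, the resampling scheme and the random-walk Metropolis move are all illustrative choices.

```python
import numpy as np

def smc_log_marginal_likelihoods(y, sigma_final, exponents,
                                 n_particles=5000, n_mcmc=5, seed=0):
    """Tempered SMC sampler for the toy model x ~ N(0, 1), y = x + N(0, sigma^2).

    Returns (sigma_t, estimated log p(y | sigma_t)) for each exponent t,
    where sigma_t = sigma_final / sqrt(t). Exponents must be increasing
    and positive.
    """
    rng = np.random.default_rng(seed)
    x = rng.standard_normal(n_particles)   # particles drawn from the prior
    log_z = 0.0                            # running log normalizing constant
    t_prev = 0.0
    out = []

    def log_lik_kernel(x):
        # Gaussian likelihood at sigma_final, without its normalizing constant
        return -0.5 * (y - x) ** 2 / sigma_final ** 2

    for t in exponents:
        # Incremental weights between consecutive tempered targets
        log_w = (t - t_prev) * log_lik_kernel(x)
        shift = log_w.max()
        w = np.exp(log_w - shift)
        log_z += shift + np.log(w.mean())

        # Multinomial resampling, then random-walk Metropolis moves
        x = x[rng.choice(n_particles, size=n_particles, p=w / w.sum())]
        step = 2.0 * x.std()
        for _ in range(n_mcmc):
            prop = x + step * rng.standard_normal(n_particles)
            log_ratio = ((-0.5 * prop ** 2 + t * log_lik_kernel(prop))
                         - (-0.5 * x ** 2 + t * log_lik_kernel(x)))
            accept = np.log(rng.random(n_particles)) < log_ratio
            x = np.where(accept, prop, x)

        # The target at exponent t is the posterior for noise std sigma_t;
        # dividing by the Gaussian constant gives the marginal likelihood.
        sigma_t = sigma_final / np.sqrt(t)
        out.append((sigma_t, log_z - 0.5 * np.log(2 * np.pi * sigma_t ** 2)))
        t_prev = t
    return out

def exact_log_marginal(y, sigma):
    # Closed form for this toy model: y ~ N(0, 1 + sigma^2)
    var = 1.0 + sigma ** 2
    return -0.5 * np.log(2 * np.pi * var) - 0.5 * y ** 2 / var
```

In this toy setting a single run returns marginal likelihood estimates for the whole grid of noise levels, and these can be compared against `exact_log_marginal`. Under these assumptions, an EB choice would pick the sigma_t with the largest estimate.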

Related content


Calculating the expected information gain in optimal Bayesian experimental design typically relies on nested Monte Carlo sampling. When the model also contains nuisance parameters, which are parameters that contribute to the overall uncertainty of the system but are of no interest in the Bayesian design framework, this introduces a second inner loop. We propose and derive a small-noise approximation for this additional inner loop. The computational cost of our method can be further reduced by applying a Laplace approximation to the remaining inner loop. Thus, we present two methods, the small-noise Double-loop Monte Carlo and small-noise Monte Carlo Laplace methods. Moreover, we demonstrate that the total complexity of these two approaches remains comparable to the case without nuisance uncertainty. To assess the efficiency of these methods, we present three examples, and the last example includes the partial differential equation for the electrical impedance tomography experiment for composite laminate materials.
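For orientation, the baseline nested (double-loop) Monte Carlo estimator of the expected information gain can be sketched as follows. This is only the standard estimator the abstract starts from, not the proposed small-noise approximations, and it contains no nuisance parameters. The model (theta ~ N(0, 1), y = theta + Gaussian noise) and the function name are hypothetical, chosen because the expected information gain then has the closed form 0.5 * log(1 + 1 / noise_std^2).

```python
import numpy as np

def dlmc_expected_information_gain(n_outer=2000, n_inner=2000,
                                   noise_std=0.5, seed=0):
    """Double-loop Monte Carlo estimate of the expected information gain
    for the toy model theta ~ N(0, 1), y = theta + N(0, noise_std^2)."""
    rng = np.random.default_rng(seed)

    def log_lik(y, theta):
        return (-0.5 * np.log(2 * np.pi * noise_std ** 2)
                - 0.5 * ((y - theta) / noise_std) ** 2)

    # Outer loop: draw parameters and simulate one observation for each
    theta = rng.standard_normal(n_outer)
    y = theta + noise_std * rng.standard_normal(n_outer)
    log_likelihood = log_lik(y, theta)

    # Inner loop: fresh prior samples per outer draw to estimate the evidence
    theta_inner = rng.standard_normal((n_outer, n_inner))
    ll = log_lik(y[:, None], theta_inner)
    shift = ll.max(axis=1, keepdims=True)
    log_evidence = shift[:, 0] + np.log(np.exp(ll - shift).mean(axis=1))

    return float(np.mean(log_likelihood - log_evidence))
```

The inner loop accounts for most of the cost (n_outer times n_inner likelihood evaluations). A second inner loop over nuisance parameters would multiply this cost further.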

Calibration weighting has been widely used to correct selection biases in non-probability sampling, missing data, and causal inference. The main idea is to calibrate the biased sample to the benchmark by adjusting the subject weights. However, hard calibration can produce enormous weights when an exact calibration is enforced on a large set of extraneous covariates. This article proposes a soft calibration scheme, in which the outcome and the selection indicator follow mixed-effects models. The scheme imposes an exact calibration on the fixed effects and an approximate calibration on the random effects. On the one hand, our soft calibration has an intrinsic connection with best linear unbiased prediction, which results in a more efficient estimation compared to hard calibration. On the other hand, soft calibration weighting estimation can be envisioned as penalized propensity score weight estimation, with the penalty term motivated by the mixed-effects structure. The asymptotic distribution and a valid variance estimator are derived for soft calibration. We demonstrate the superiority of the proposed estimator over other competitors in simulation studies and a real-data application.

The emerging availability of trained machine learning models has put forward the novel concept of a Machine Learning Model Market, in which one can harness the collective intelligence of multiple well-trained models to improve the performance of the resultant model through one-shot federated learning and ensemble learning in a data-free manner. However, picking the models available in the market for ensemble learning is time-consuming, as using all the models is not always the best approach. It is thus crucial to have an effective ensemble selection strategy that can find a good subset of the base models for the ensemble. Conventional ensemble selection techniques are not applicable, as we do not have access to the local datasets of the parties in the federated learning setting. In this paper, we present a novel Data-Free Diversity-Based method called DeDES to address the ensemble selection problem for models generated by one-shot federated learning in practical applications such as model markets. Experiments showed that our method achieves both better performance and higher efficiency across 5 datasets and 4 different model structures under different data-partition strategies.

Solving inverse problems is central to a variety of important applications, such as biomedical image reconstruction and non-destructive testing. These problems are characterized by the sensitivity of direct solution methods with respect to data perturbations. To stabilize the reconstruction process, regularization methods have to be employed. Well-known regularization methods are based on frame expansions, such as the wavelet-vaguelette decomposition (WVD), which are well adapted to the underlying signal class and the forward model and furthermore allow efficient implementation. However, it is well known that the lack of translational invariance of wavelets and related systems leads to specific artifacts in the reconstruction. To overcome this problem, in this paper we introduce and analyze the translation invariant diagonal frame decomposition (TI-DFD) of linear operators as a novel concept generalizing the SVD. We characterize ill-posedness via the TI-DFD and prove that a TI-DFD combined with a regularizing filter leads to a convergent regularization method with optimal convergence rates. As an illustrative example, we construct a wavelet-based TI-DFD for one-dimensional integration, where we also investigate our approach numerically. The results indicate that filtered TI-DFDs eliminate the typical wavelet artifacts when using standard wavelets and provide a fast, accurate, and stable solution scheme for inverse problems.

Selective classification (or classification with a reject option) pairs a classifier with a selection function to determine whether or not a prediction should be accepted. This framework trades off coverage (probability of accepting a prediction) with predictive performance, typically measured by distributive loss functions. In many application scenarios, such as credit scoring, performance is instead measured by ranking metrics, such as the Area Under the ROC Curve (AUC). We propose a model-agnostic approach to associate a selection function with a given probabilistic binary classifier. The approach is specifically targeted at optimizing the AUC. We provide both theoretical justifications and a novel algorithm, called AUCROSS, to achieve such a goal. Experiments show that our method succeeds in trading off coverage for AUC, improving over existing selective classification methods targeted at optimizing accuracy.

We consider the Ensemble Kalman Inversion, which has recently been introduced as an efficient, gradient-free optimisation method to estimate unknown parameters in an inverse setting. In the case of large data sets, the Ensemble Kalman Inversion becomes computationally infeasible, as the data misfit needs to be evaluated for each particle in each iteration. Here, randomised algorithms like stochastic gradient descent have been demonstrated to successfully overcome this issue by using only a random subset of the data in each iteration, so-called subsampling techniques. Based on a recent analysis of a continuous-time representation of stochastic gradient methods, we propose, analyse, and apply subsampling techniques within Ensemble Kalman Inversion. Specifically, we propose two different subsampling techniques: either every particle observes the same data subset (single subsampling) or every particle observes a different data subset (batch subsampling).
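A rough discrete-time sketch of the single-subsampling variant, where every particle sees the same random data subset in a given iteration, is shown below. It is not the continuous-time scheme analysed in the paper. It assumes a hypothetical linear forward model G(u) = A u with independent Gaussian noise and uses the deterministic Kalman-type update without perturbed observations. The function name and all default values are illustrative, and batch subsampling is not shown.

```python
import numpy as np

def eki_single_subsampling(A, y, noise_std, n_particles=50, n_iter=20,
                           batch_size=5, seed=0):
    """Ensemble Kalman Inversion for y = A u + noise, where each iteration
    uses one random data subset shared by all particles."""
    rng = np.random.default_rng(seed)
    n_data, dim = A.shape
    U = rng.standard_normal((n_particles, dim))    # initial ensemble ~ N(0, I)
    gamma = noise_std ** 2 * np.eye(batch_size)    # noise covariance on a subset

    for _ in range(n_iter):
        # One data subset per iteration, observed by every particle
        idx = rng.choice(n_data, size=batch_size, replace=False)
        G = U @ A[idx].T                           # forward evaluations, (J, batch)

        # Empirical cross-covariance and output covariance of the ensemble
        dU = U - U.mean(axis=0)
        dG = G - G.mean(axis=0)
        C_up = dU.T @ dG / n_particles             # (dim, batch)
        C_pp = dG.T @ dG / n_particles             # (batch, batch)

        # u_j <- u_j + C_up (C_pp + Gamma)^{-1} (y_subset - G(u_j))
        residual = y[idx] - G                      # (J, batch)
        U = U + np.linalg.solve(C_pp + gamma, residual.T).T @ C_up.T
    return U
```

Each iteration evaluates the forward model on `batch_size` data points per particle rather than on the full data set. This reduced per-iteration cost is the saving that subsampling is meant to provide.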

Solving high-dimensional Bayesian inverse problems (BIPs) with the variational inference (VI) method is promising but still challenging. The main difficulties arise from two aspects. First, VI methods approximate the posterior distribution using a simple and analytic variational distribution, which makes it difficult to estimate complex spatially-varying parameters in practice. Second, VI methods typically rely on gradient-based optimization, which can be computationally expensive or intractable when applied to BIPs involving partial differential equations (PDEs). To address these challenges, we propose a novel approximation method for estimating the high-dimensional posterior distribution. This approach leverages a deep generative model to learn a prior model capable of generating spatially-varying parameters. This enables posterior approximation over the latent variable instead of the complex parameters, thus improving estimation accuracy. Moreover, to accelerate gradient computation, we employ a differentiable physics-constrained surrogate model to replace the adjoint method. The proposed method can be fully implemented in an automatic differentiation manner. Numerical examples demonstrate two types of log-permeability estimation for flow in heterogeneous media. The results show the validity, accuracy, and high efficiency of the proposed method.

We propose derivative-informed neural operators (DINOs), a general family of neural networks to approximate operators as infinite-dimensional mappings from input function spaces to output function spaces or quantities of interest. After discretization, both inputs and outputs are high-dimensional. We aim to approximate not only the operators with improved accuracy but also their derivatives (Jacobians) with respect to the input function-valued parameter to empower derivative-based algorithms in many applications, e.g., Bayesian inverse problems, optimization under parameter uncertainty, and optimal experimental design. The major difficulties include the computational cost of generating derivative training data and the high dimensionality of the problem, leading to large training cost. To address these challenges, we exploit the intrinsic low-dimensionality of the derivatives and develop algorithms for compressing derivative information and efficiently imposing it in neural operator training, yielding derivative-informed neural operators. We demonstrate that these advances can significantly reduce the costs of both data generation and training for large classes of problems (e.g., nonlinear steady state parametric PDE maps), making the costs marginal or comparable to the costs without using derivatives, and in particular independent of the discretization dimension of the input and output functions. Moreover, we show that the proposed DINO achieves significantly higher accuracy than neural operators trained without derivative information, for both function approximation and derivative approximation (e.g., Gauss-Newton Hessian), especially when the training data are limited.

The kernel two-sample test based on the maximum mean discrepancy (MMD) is one of the most popular methods for detecting differences between two distributions over general metric spaces. In this paper we propose a method to boost the power of the kernel test by combining MMD estimates over multiple kernels using their Mahalanobis distance. We derive the asymptotic null distribution of the proposed test statistic and use a multiplier bootstrap approach to efficiently compute the rejection region. The resulting test is universally consistent and, since it is obtained by aggregating over a collection of kernels/bandwidths, is more powerful in detecting a wide range of alternatives in finite samples. We also derive the distribution of the test statistic for both fixed and local contiguous alternatives. The latter, in particular, implies that the proposed test is statistically efficient, that is, it has non-trivial asymptotic (Pitman) efficiency. Extensive numerical experiments are performed on both synthetic and real-world datasets to illustrate the efficacy of the proposed method over single kernel tests. Our asymptotic results rely on deriving the joint distribution of MMD estimates using the framework of multiple stochastic integrals, which is more broadly useful, specifically, in understanding the efficiency properties of recently proposed adaptive MMD tests based on kernel aggregation.

Since deep neural networks were developed, they have made huge contributions to everyday lives. Machine learning provides more rational advice than humans are capable of in almost every aspect of daily life. However, despite this achievement, the design and training of neural networks are still challenging and unpredictable procedures. To lower the technical thresholds for common users, automated hyper-parameter optimization (HPO) has become a popular topic in both academic and industrial areas. This paper provides a review of the most essential topics on HPO. The first section introduces the key hyper-parameters related to model training and structure, and discusses their importance and methods to define the value range. Then, the research focuses on major optimization algorithms and their applicability, covering their efficiency and accuracy especially for deep learning networks. This study next reviews major services and toolkits for HPO, comparing their support for state-of-the-art searching algorithms, feasibility with major deep learning frameworks, and extensibility for new modules designed by users. The paper concludes with problems that exist when HPO is applied to deep learning, a comparison between optimization algorithms, and prominent approaches for model evaluation with limited computational resources.
