亚洲男人的天堂2018av,欧美草比,久久久久久免费视频精选,国色天香在线看免费,久久久久亚洲av成人片仓井空

Uncertainty quantification of machine learning and deep learning methods plays an important role in enhancing trust to the obtained result. In recent years, a numerous number of uncertainty quantification methods have been introduced. Monte Carlo dropout (MC-Dropout) is one of the most well-known techniques to quantify uncertainty in deep learning methods. In this study, we propose two new loss functions by combining cross entropy with Expected Calibration Error (ECE) and Predictive Entropy (PE). The obtained results clearly show that the new proposed loss functions lead to having a calibrated MC-Dropout method. Our results confirmed the great impact of the new hybrid loss functions for minimising the overlap between the distributions of uncertainty estimates for correct and incorrect predictions without sacrificing the model's overall performance.

相關內容

損失函數,在AI中亦稱呼距離函數,度量函數。此處的距離代表的是抽象性的,代表真實數據與預測數據之間的誤差。損失函數(loss function)是用來估量你模型的預測值f(x)與真實值Y的不一致程度,它是一個非負實值函數,通常使用L(Y, f(x))來表示,損失函數越小,模型的魯棒性就越好。損失函數是經驗風險函數的核心部分,也是結構風險函數重要組成部分。

Deep neural networks have shown great success in prediction quality while reliable and robust uncertainty estimation remains a challenge. Predictive uncertainty supplements model predictions and enables improved functionality of downstream tasks including embedded and mobile applications, such as virtual reality, augmented reality, sensor fusion, and perception. These applications often require a compromise in complexity to obtain uncertainty estimates due to very limited memory and compute resources. We tackle this problem by building upon Monte Carlo Dropout (MCDO) models using the Axolotl framework; specifically, we diversify sampled subnetworks, leverage dropout patterns, and use a branching technique to improve predictive performance while maintaining fast computations. We conduct experiments on (1) a multi-class classification task using the CIFAR10 dataset, and (2) a more complex human body segmentation task. Our results show the effectiveness of our approach by reaching close to Deep Ensemble prediction quality and uncertainty estimation, while still achieving faster inference on resource-limited mobile platforms.

In this paper, we explore model-based approach to training robust and interpretable binarized regression models for multiclass classification tasks using Mixed-Integer Programming (MIP). Our MIP model balances the optimization of prediction margin and model size by using a weighted objective that: minimizes the total margin of incorrectly classified training instances, maximizes the total margin of correctly classified training instances, and maximizes the overall model regularization. We conduct two sets of experiments to test the classification accuracy of our MIP model over standard and corrupted versions of multiple classification datasets, respectively. In the first set of experiments, we show that our MIP model outperforms an equivalent Pseudo-Boolean Optimization (PBO) model and achieves competitive results to Logistic Regression (LR) and Gradient Descent (GD) in terms of classification accuracy over the standard datasets. In the second set of experiments, we show that our MIP model outperforms the other models (i.e., GD and LR) in terms of classification accuracy over majority of the corrupted datasets. Finally, we visually demonstrate the interpretability of our MIP model in terms of its learned parameters over the MNIST dataset. Overall, we show the effectiveness of training robust and interpretable binarized regression models using MIP.

Deep Learning (DL) holds great promise in reshaping the healthcare systems given its precision, efficiency, and objectivity. However, the brittleness of DL models to noisy and out-of-distribution inputs is ailing their deployment in the clinic. Most systems produce point estimates without further information about model uncertainty or confidence. This paper introduces a new Bayesian deep learning framework for uncertainty quantification in segmentation neural networks, specifically encoder-decoder architectures. The proposed framework uses the first-order Taylor series approximation to propagate and learn the first two moments (mean and covariance) of the distribution of the model parameters given the training data by maximizing the evidence lower bound. The output consists of two maps: the segmented image and the uncertainty map of the segmentation. The uncertainty in the segmentation decisions is captured by the covariance matrix of the predictive distribution. We evaluate the proposed framework on medical image segmentation data from Magnetic Resonances Imaging and Computed Tomography scans. Our experiments on multiple benchmark datasets demonstrate that the proposed framework is more robust to noise and adversarial attacks as compared to state-of-the-art segmentation models. Moreover, the uncertainty map of the proposed framework associates low confidence (or equivalently high uncertainty) to patches in the test input images that are corrupted with noise, artifacts or adversarial attacks. Thus, the model can self-assess its segmentation decisions when it makes an erroneous prediction or misses part of the segmentation structures, e.g., tumor, by presenting higher values in the uncertainty map.

We study downlink channel estimation in a multi-cell Massive multiple-input multiple-output (MIMO) system operating in time-division duplex. The users must know their effective channel gains to decode their received downlink data. Previous works have used the mean value as the estimate, motivated by channel hardening. However, this is associated with a performance loss in non-isotropic scattering environments. We propose two novel estimation methods that can be applied without downlink pilots. The first method is model-based and asymptotic arguments are utilized to identify a connection between the effective channel gain and the average received power during a coherence interval. The second method is data-driven and trains a neural network to identify a mapping between the available information and the effective channel gain. Both methods can be utilized for any channel distribution and precoding. For the model-aided method, we derive all expressions in closed form for the case when maximum ratio or zero-forcing precoding is used. We compare the proposed methods with the state-of-the-art using the normalized mean-squared error and spectral efficiency (SE). The results suggest that the two proposed methods provide better SE than the state-of-the-art when there is a low level of channel hardening, while the performance difference is relatively small with the uncorrelated channel model.

Deep Reinforcement Learning (Deep RL) has been receiving increasingly more attention thanks to its encouraging performance on a variety of control tasks. Yet, conventional regularization techniques in training neural networks (e.g., $L_2$ regularization, dropout) have been largely ignored in RL methods, possibly because agents are typically trained and evaluated in the same environment, and because the deep RL community focuses more on high-level algorithm designs. In this work, we present the first comprehensive study of regularization techniques with multiple policy optimization algorithms on continuous control tasks. Interestingly, we find conventional regularization techniques on the policy networks can often bring large improvement, especially on harder tasks. Our findings are shown to be robust against training hyperparameter variations. We also compare these techniques with the more widely used entropy regularization. In addition, we study regularizing different components and find that only regularizing the policy network is typically the best. We further analyze why regularization may help generalization in RL from four perspectives - sample complexity, reward distribution, weight norm, and noise robustness. We hope our study provides guidance for future practices in regularizing policy optimization algorithms. Our code is available at //github.com/xuanlinli17/iclr2021_rlreg .

Due to their increasing spread, confidence in neural network predictions became more and more important. However, basic neural networks do not deliver certainty estimates or suffer from over or under confidence. Many researchers have been working on understanding and quantifying uncertainty in a neural network's prediction. As a result, different types and sources of uncertainty have been identified and a variety of approaches to measure and quantify uncertainty in neural networks have been proposed. This work gives a comprehensive overview of uncertainty estimation in neural networks, reviews recent advances in the field, highlights current challenges, and identifies potential research opportunities. It is intended to give anyone interested in uncertainty estimation in neural networks a broad overview and introduction, without presupposing prior knowledge in this field. A comprehensive introduction to the most crucial sources of uncertainty is given and their separation into reducible model uncertainty and not reducible data uncertainty is presented. The modeling of these uncertainties based on deterministic neural networks, Bayesian neural networks, ensemble of neural networks, and test-time data augmentation approaches is introduced and different branches of these fields as well as the latest developments are discussed. For a practical application, we discuss different measures of uncertainty, approaches for the calibration of neural networks and give an overview of existing baselines and implementations. Different examples from the wide spectrum of challenges in different fields give an idea of the needs and challenges regarding uncertainties in practical applications. Additionally, the practical limitations of current methods for mission- and safety-critical real world applications are discussed and an outlook on the next steps towards a broader usage of such methods is given.

Ensembles over neural network weights trained from different random initialization, known as deep ensembles, achieve state-of-the-art accuracy and calibration. The recently introduced batch ensembles provide a drop-in replacement that is more parameter efficient. In this paper, we design ensembles not only over weights, but over hyperparameters to improve the state of the art in both settings. For best performance independent of budget, we propose hyper-deep ensembles, a simple procedure that involves a random search over different hyperparameters, themselves stratified across multiple random initializations. Its strong performance highlights the benefit of combining models with both weight and hyperparameter diversity. We further propose a parameter efficient version, hyper-batch ensembles, which builds on the layer structure of batch ensembles and self-tuning networks. The computational and memory costs of our method are notably lower than typical ensembles. On image classification tasks, with MLP, LeNet, and Wide ResNet 28-10 architectures, our methodology improves upon both deep and batch ensembles.

Despite the state-of-the-art performance for medical image segmentation, deep convolutional neural networks (CNNs) have rarely provided uncertainty estimations regarding their segmentation outputs, e.g., model (epistemic) and image-based (aleatoric) uncertainties. In this work, we analyze these different types of uncertainties for CNN-based 2D and 3D medical image segmentation tasks. We additionally propose a test-time augmentation-based aleatoric uncertainty to analyze the effect of different transformations of the input image on the segmentation output. Test-time augmentation has been previously used to improve segmentation accuracy, yet not been formulated in a consistent mathematical framework. Hence, we also propose a theoretical formulation of test-time augmentation, where a distribution of the prediction is estimated by Monte Carlo simulation with prior distributions of parameters in an image acquisition model that involves image transformations and noise. We compare and combine our proposed aleatoric uncertainty with model uncertainty. Experiments with segmentation of fetal brains and brain tumors from 2D and 3D Magnetic Resonance Images (MRI) showed that 1) the test-time augmentation-based aleatoric uncertainty provides a better uncertainty estimation than calculating the test-time dropout-based model uncertainty alone and helps to reduce overconfident incorrect predictions, and 2) our test-time augmentation outperforms a single-prediction baseline and dropout-based multiple predictions.

Data augmentation has been widely used for training deep learning systems for medical image segmentation and plays an important role in obtaining robust and transformation-invariant predictions. However, it has seldom been used at test time for segmentation and not been formulated in a consistent mathematical framework. In this paper, we first propose a theoretical formulation of test-time augmentation for deep learning in image recognition, where the prediction is obtained through estimating its expectation by Monte Carlo simulation with prior distributions of parameters in an image acquisition model that involves image transformations and noise. We then propose a novel uncertainty estimation method based on the formulated test-time augmentation. Experiments with segmentation of fetal brains and brain tumors from 2D and 3D Magnetic Resonance Images (MRI) showed that 1) our test-time augmentation outperforms a single-prediction baseline and dropout-based multiple predictions, and 2) it provides a better uncertainty estimation than calculating the model-based uncertainty alone and helps to reduce overconfident incorrect predictions.

In this paper, we study the optimal convergence rate for distributed convex optimization problems in networks. We model the communication restrictions imposed by the network as a set of affine constraints and provide optimal complexity bounds for four different setups, namely: the function $F(\xb) \triangleq \sum_{i=1}^{m}f_i(\xb)$ is strongly convex and smooth, either strongly convex or smooth or just convex. Our results show that Nesterov's accelerated gradient descent on the dual problem can be executed in a distributed manner and obtains the same optimal rates as in the centralized version of the problem (up to constant or logarithmic factors) with an additional cost related to the spectral gap of the interaction matrix. Finally, we discuss some extensions to the proposed setup such as proximal friendly functions, time-varying graphs, improvement of the condition numbers.

北京阿比特科技有限公司