
In predictive modeling with simulation or machine learning, it is critical to accurately assess the quality of estimated values through output analysis. In recent decades, output analysis has been enriched with methods that quantify the impact of input data uncertainty on the model outputs to increase robustness. However, most developments apply only when the input data can be modeled parametrically. We propose a unified output analysis framework for simulation and machine learning outputs through the lens of Monte Carlo sampling. This framework provides nonparametric quantification of the variance and bias induced in the outputs with higher-order accuracy. Our new bias-corrected estimation from the model outputs leverages an extension of fast iterative bootstrap sampling and higher-order influence functions. For the scalability of the proposed estimation methods, we devise budget-optimal rules and leverage control variates for variance reduction. Our numerical results demonstrate a clear advantage in building better and more robust confidence intervals for both simulation and machine learning frameworks.
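To make the idea of nonparametric bias and variance quantification concrete, the sketch below applies the ordinary single-level bootstrap to a deliberately biased estimator (the plug-in variance). It is only an illustration under stated assumptions: it is not the fast iterative bootstrap or the higher-order influence function method from the abstract, and the names `bootstrap_bias_and_variance` and `plugin_variance` are hypothetical.

```python
import random
import statistics


def bootstrap_bias_and_variance(data, estimator, n_boot=2000, seed=0):
    """Estimate bias and variance of `estimator` by resampling `data` with replacement."""
    rng = random.Random(seed)
    n = len(data)
    point = estimator(data)
    replicates = []
    for _ in range(n_boot):
        # Draw n observations from the empirical distribution of the data.
        resample = [data[rng.randrange(n)] for _ in range(n)]
        replicates.append(estimator(resample))
    # Bootstrap bias estimate: mean of the replicates minus the original estimate.
    bias = statistics.fmean(replicates) - point
    variance = statistics.variance(replicates)
    return point, bias, variance


def plugin_variance(xs):
    """Plug-in variance (divides by n), which is biased downward."""
    m = statistics.fmean(xs)
    return sum((x - m) ** 2 for x in xs) / len(xs)


data_rng = random.Random(1)
sample = [data_rng.gauss(0.0, 1.0) for _ in range(30)]
point, bias, var = bootstrap_bias_and_variance(sample, plugin_variance)
corrected = point - bias  # first-order bias-corrected estimate
```

Subtracting the estimated bias pushes the plug-in variance upward, which is the direction of the known correction factor n/(n-1). A single bootstrap level removes only the leading bias term.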

Related content


Parameter space · Subspace · Cluster · Lecture notes · Top-down

November 8, 2022

A local approach to parameter space reduction for regression and classification tasks

Francesco Romor, Marco Tezzele, Gianluigi Rozza

Parameter space reduction has proved to be a crucial tool for speeding up many numerical tasks such as optimization, inverse problems, sensitivity analysis, and the design of surrogate models, especially in the presence of high-dimensional parametrized systems. In this work we propose a new method called local active subspaces (LAS), which exploits the synergies of active subspaces with supervised clustering techniques in order to carry out a more efficient dimension reduction in the parameter space. The clustering is performed without losing the input-output relations by introducing a distance metric induced by the global active subspace. We present two possible clustering algorithms: K-medoids and a hierarchical top-down approach, which is able to impose a variety of subdivision criteria specifically tailored for parameter space reduction tasks. This method is particularly useful for the community working on surrogate modelling. Frequently, the parameter space presents subdomains where the objective function of interest varies less on average along different directions, so it can be approximated more accurately if restricted to those subdomains and studied separately. We test the new method on several numerical experiments of increasing complexity, and we show how to deal with vectorial outputs and how to classify the different regions with respect to the local active subspace dimension. Employing this classification technique as a preprocessing step in the parameter space, or in the output space in the case of vectorial outputs, brings remarkable results for the purpose of surrogate modelling.

t-test · Statistic · Exact · Performer · Confidence level

November 8, 2022

Te Test: A New Non-asymptotic T-test for Behrens-Fisher Problems

Chang Wang, Jinzhu Jia

from arxiv, 27 pages

The Behrens-Fisher problem is a classical statistical problem: testing the equality of the means of two normal populations using two independent samples when it is unknown whether the population variances are equal. Linnik (1968) has shown that this problem has no exact fixed-level tests based on the complete sufficient statistics. However, exact conventional solutions based on other statistics and approximate solutions based on the complete sufficient statistics do exist. Existing methods are mainly asymptotic tests, and usually do not perform well when the variances or sample sizes differ substantially. In this paper, we propose a new method to find an exact t-test (Te) to solve this classical Behrens-Fisher problem. Confidence intervals for the difference between two means are provided. We also use detailed analysis to show that the Te test attains the maximum degrees of freedom and to give a weak version of a proof that the Te test has the shortest expected confidence interval length. Some simulations are performed to show the advantages of our newly proposed method compared to available conventional methods such as Welch's test, the paired t-test and so on. We also compare it to unconventional methods, such as the two-stage test.
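For orientation, the conventional baseline named in the abstract, Welch's test, can be sketched as follows. This is the standard approximate test with Welch-Satterthwaite degrees of freedom, not the proposed Te test, whose construction is not given in the abstract. The function name `welch_t` and the sample values are illustrative assumptions.

```python
import math
import statistics


def welch_t(sample_a, sample_b):
    """Return Welch's t statistic and the Welch-Satterthwaite degrees of freedom."""
    n_a, n_b = len(sample_a), len(sample_b)
    # Squared standard errors of each sample mean (unbiased variance / n).
    se2_a = statistics.variance(sample_a) / n_a
    se2_b = statistics.variance(sample_b) / n_b
    diff = statistics.fmean(sample_a) - statistics.fmean(sample_b)
    t_stat = diff / math.sqrt(se2_a + se2_b)
    # Approximate degrees of freedom; lies between min(n)-1 and n_a+n_b-2.
    df = (se2_a + se2_b) ** 2 / (se2_a ** 2 / (n_a - 1) + se2_b ** 2 / (n_b - 1))
    return t_stat, df


a = [1.0, 2.0, 3.0, 4.0, 5.0]
b = [2.0, 4.0, 6.0, 8.0, 10.0, 12.0]
t_stat, df = welch_t(a, b)
```

The degrees of freedom here are data-dependent and generally non-integer, which is one reason the test is approximate rather than exact.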

Outlier · FAST · Statistic · Estimation/Estimator · Performer

November 7, 2022

Gaining Outlier Resistance with Progressive Quantiles: Fast Algorithms and Theoretical Studies

Yiyuan She, Zhifeng Wang, Jiahui Shen

Outliers widely occur in big-data applications and may severely affect statistical estimation and inference. In this paper, a framework of outlier-resistant estimation is introduced to robustify an arbitrarily given loss function. It has a close connection to the method of trimming and includes explicit outlyingness parameters for all samples, which in turn facilitates computation, theory, and parameter tuning. To tackle the issues of nonconvexity and nonsmoothness, we develop scalable algorithms with implementation ease and guaranteed fast convergence. In particular, a new technique is proposed to alleviate the requirement on the starting point such that on regular datasets, the number of data resamplings can be substantially reduced. Based on combined statistical and computational treatments, we are able to perform nonasymptotic analysis beyond M-estimation. The obtained resistant estimators, though not necessarily globally or even locally optimal, enjoy minimax rate optimality in both low dimensions and high dimensions. Experiments in regression, classification, and neural networks show excellent performance of the proposed methodology at the occurrence of gross outliers.
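The connection to trimming mentioned above can be illustrated with a toy location estimate that alternates between fitting on the retained points and re-selecting the points with the smallest squared residuals. This is a generic trimmed least-squares sketch under stated assumptions, not the authors' progressive quantile algorithm. The function `trimmed_location` and the assumed known number of outliers are hypothetical.

```python
import statistics


def trimmed_location(xs, n_outliers, n_iter=50):
    """Estimate a location while discarding the `n_outliers` worst-fitting points."""
    center = statistics.median(xs)  # robust starting point
    keep = len(xs) - n_outliers
    for _ in range(n_iter):
        # Keep the points with the smallest squared loss under the current fit.
        kept = sorted(xs, key=lambda x: (x - center) ** 2)[:keep]
        new_center = statistics.fmean(kept)
        if new_center == center:
            break  # the retained set no longer changes the fit
        center = new_center
    return center


data = [9.8, 10.1, 10.0, 9.9, 10.2, 10.0, 55.0, 80.0]
estimate = trimmed_location(data, n_outliers=2)
plain_mean = statistics.fmean(data)
```

With two gross outliers in the data, the plain mean is pulled far from the bulk of the sample, while the trimmed estimate stays near it.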

Inference · Monte Carlo · TOOLS · Galaxy · Marginalization

November 7, 2022

Monte Carlo Techniques for Addressing Large Errors and Missing Data in Simulation-based Inference

Bingjie Wang, Joel Leja, Ashley Villar, Joshua S. Speagle

from arxiv, 8 pages, 2 figures, accepted to the Machine Learning and the Physical Sciences workshop at NeurIPS 2022

Upcoming astronomical surveys will observe billions of galaxies across cosmic time, providing a unique opportunity to map the many pathways of galaxy assembly to an incredibly high resolution. However, the huge amount of data also poses an immediate computational challenge: current tools for inferring parameters from the light of galaxies take $\gtrsim 10$ hours per fit. This is prohibitively expensive. Simulation-based Inference (SBI) is a promising solution. However, it requires simulated data with identical characteristics to the observed data, whereas real astronomical surveys are often highly heterogeneous, with missing observations and variable uncertainties determined by sky and telescope conditions. Here we present a Monte Carlo technique for treating out-of-distribution measurement errors and missing data using standard SBI tools. We show that out-of-distribution measurement errors can be approximated by using standard SBI evaluations, and that missing data can be marginalized over using SBI evaluations over nearby data realizations in the training set. While these techniques slow the inference process from $\sim 1$ sec to $\sim 1.5$ min per object, this is still significantly faster than standard approaches while also dramatically expanding the applicability of SBI. This expanded regime has broad implications for future applications to astronomical surveys.

Cross entropy · Learning · contrastive · Objective function · Loss

November 7, 2022

Contrastive Classification and Representation Learning with Probabilistic Interpretation

Rahaf Aljundi, Yash Patel, Milan Sulc, Daniel Olmeda, Nikolay Chumerin

Cross entropy loss has served as the main objective function for classification-based tasks. Widely deployed for learning neural network classifiers, it offers both effectiveness and a probabilistic interpretation. Recently, after the success of self-supervised contrastive representation learning methods, supervised contrastive methods have been proposed to learn representations and have shown superior and more robust performance compared to training solely with cross entropy loss. However, cross entropy loss is still needed to train the final classification layer. In this work, we investigate the possibility of learning both the representation and the classifier using one objective function that combines the robustness of contrastive learning and the probabilistic interpretation of cross entropy loss. First, we revisit a previously proposed contrastive-based objective function that approximates cross entropy loss and present a simple extension to learn the classifier jointly. Second, we propose a new version of supervised contrastive training that jointly learns the parameters of the classifier and the backbone of the network. We empirically show that our proposed objective functions yield a significant improvement over the standard cross entropy loss, with more training stability and robustness in various challenging settings.

Estimation/Estimator · Performer · Sparse · Linear · Extensibility

November 7, 2022

Sparse Horseshoe Estimation via Expectation-Maximisation

Shu Yu Tew, Daniel F. Schmidt, Enes Makalic

The horseshoe prior is known to possess many desirable properties for Bayesian estimation of sparse parameter vectors, yet its density function lacks an analytic form. As such, it is challenging to find a closed-form solution for the posterior mode. Conventional horseshoe estimators use the posterior mean to estimate the parameters, but these estimates are not sparse. We propose a novel expectation-maximisation (EM) procedure for computing the MAP estimates of the parameters in the case of the standard linear model. A particular strength of our approach is that the M-step depends only on the form of the prior and is independent of the form of the likelihood. We introduce several simple modifications of this EM procedure that allow for straightforward extension to generalised linear models. In experiments performed on simulated and real data, our approach performs comparably to, or better than, state-of-the-art sparse estimation methods in terms of statistical performance and computational cost.

Continuity · Weight · MoDELS · Learning · Performer

November 6, 2022

Momentum-based Weight Interpolation of Strong Zero-Shot Models for Continual Learning

Zafir Stojanovski, Karsten Roth, Zeynep Akata

from arxiv, First Workshop on Interpolation Regularizers and Beyond, NeurIPS 2022 (Spotlight) and Workshop on Distribution Shifts, NeurIPS 2022

Large pre-trained, zero-shot capable models have shown considerable success both for standard transfer and adaptation tasks, with particular robustness towards distribution shifts. In addition, subsequent fine-tuning can considerably improve performance on a selected downstream task. However, through naive fine-tuning, these zero-shot models lose their generalizability and robustness towards distribution shifts. This is a particular problem for tasks such as Continual Learning (CL), where continuous adaptation has to be performed as new task distributions are introduced sequentially. In this work, we show that, where fine-tuning falls short in adapting such zero-shot capable models, simple momentum-based weight interpolation can provide consistent improvements for CL tasks in both memory-free and memory-based settings. In particular, we find improvements of over $+4\%$ on standard CL benchmarks, while reducing the error to the upper limit of jointly training on all tasks at once in parts by more than half, allowing the continual learner to inch closer to the joint training limits.
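A momentum-based weight interpolation can be sketched as an exponential moving average between a slowly changing copy of the weights and the fine-tuned weights. The abstract does not specify the exact update schedule, so the rule below, the function `momentum_interpolate`, the flat-list weight representation, and the stand-in training step are all assumptions made for illustration.

```python
def momentum_interpolate(slow_weights, fast_weights, momentum=0.99):
    """Move the slow weights a small step towards the fast (fine-tuned) weights."""
    return [
        momentum * s + (1.0 - momentum) * f
        for s, f in zip(slow_weights, fast_weights)
    ]


# Both copies start from the same pre-trained (zero-shot) weights.
slow = [0.0, 1.0]
fast = [0.0, 1.0]
for step in range(3):
    # Stand-in for one fine-tuning update on the current task.
    fast = [w + 1.0 for w in fast]
    # The slow copy trails behind, retaining part of the original weights.
    slow = momentum_interpolate(slow, fast, momentum=0.5)
```

A momentum close to 1 keeps the slow copy near the pre-trained weights, while a smaller value lets it follow fine-tuning more closely. The value 0.5 is used here only to make the arithmetic easy to follow.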

Round · MoDELS · Estimation/Estimator · Markov · Markov chain Monte Carlo

November 6, 2022

A Dynamic Spatiotemporal Stochastic Volatility Model with an Application to Environmental Risks

Philipp Otto, Osman Doğan, Süleyman Taşpınar

This article introduces a dynamic spatiotemporal stochastic volatility (SV) model with explicit terms for the spatial, temporal, and spatiotemporal spillover effects. Moreover, the model includes time-invariant site-specific constant log-volatility terms. Thus, this formulation makes it possible to distinguish between spatial and temporal interactions, while each location may have a different volatility level. We study the statistical properties of an outcome variable under this process and show that it introduces spatial dependence in the outcome variable. Further, we present a Bayesian estimation procedure based on the Markov Chain Monte Carlo (MCMC) approach using a suitable data transformation. After providing simulation evidence on the proposed Bayesian estimator's performance, we apply the model in a highly relevant field, namely environmental risk modeling. Even though there are only a few empirical studies on environmental risks, previous literature has clearly demonstrated the importance of climate variation studies. For example, for local air quality in Northern Italy in 2021, we show pronounced spatial and temporal spillovers and larger uncertainties/risks during the winter season compared to the summer season.

Markov · Controller · Processing (programming language) · Learning · Continuity

November 6, 2022

On learning history based policies for controlling Markov decision processes

Gandharv Patil, Aditya Mahajan, Doina Precup

Reinforcement learning (RL) folklore suggests that history-based function approximation methods, such as recurrent neural nets or history-based state abstraction, perform better than their memory-less counterparts, due to the fact that function approximation in Markov decision processes (MDP) can be viewed as inducing a partially observable MDP. However, there has been little formal analysis of such history-based algorithms, as most existing frameworks focus exclusively on memory-less features. In this paper, we introduce a theoretical framework for studying the behaviour of RL algorithms that learn to control an MDP using history-based feature abstraction mappings. Furthermore, we use this framework to design a practical RL algorithm and we numerically evaluate its effectiveness on a set of continuous control tasks.

Mutually independent · MoDELS · Error metric · motivation · Model averaging

November 4, 2022

Prediction & Model Evaluation for Space-Time Data

Gregory L. Watson, Colleen E. Reid, Michael Jerrett, Donatello Telesca

Evaluation metrics for prediction error, model selection and model averaging on space-time data are understudied and poorly understood. The absence of independent replication makes prediction ambiguous as a concept and renders evaluation procedures developed for independent data inappropriate for most space-time prediction problems. Motivated by air pollution data collected during California wildfires in 2008, this manuscript attempts a formalization of the true prediction error associated with spatial interpolation. We investigate a variety of cross-validation (CV) procedures employing both simulations and case studies to provide insight into the nature of the estimand targeted by alternative data partition strategies. Consistent with recent best practice, we find that location-based cross-validation is appropriate for estimating spatial interpolation error as in our analysis of the California wildfire data. Interestingly, commonly held notions of bias-variance trade-off of CV fold size do not trivially apply to dependent data, and we recommend leave-one-location-out (LOLO) CV as the preferred prediction error metric for spatial interpolation.
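The leave-one-location-out partition recommended above can be sketched as a fold generator that holds out every observation from one location at a time. This is a minimal illustration of the data split only. The function name `leave_one_location_out` and the toy location labels are assumptions, and model fitting and error computation are left out.

```python
from collections import defaultdict


def leave_one_location_out(locations):
    """Yield (held_out_location, train_indices, test_indices) for each location."""
    by_location = defaultdict(list)
    for index, loc in enumerate(locations):
        by_location[loc].append(index)
    for held_out, test_idx in by_location.items():
        # Train on every observation that comes from a different location.
        train_idx = [
            i
            for loc, idxs in by_location.items()
            if loc != held_out
            for i in idxs
        ]
        yield held_out, train_idx, test_idx


# One location label per observation; repeated labels are repeated visits in time.
locations = ["A", "A", "B", "C", "B", "C"]
folds = list(leave_one_location_out(locations))
```

Because all time points of the held-out location leave the training set together, each fold mimics predicting at an unobserved site, which is the spatial interpolation setting the abstract targets.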