
Reliable uncertainty quantification (UQ) in machine learning (ML) regression tasks is becoming the focus of many studies in materials and chemical science. It is now well understood that average calibration is insufficient, and most studies implement additional methods to test conditional calibration with respect to uncertainty, i.e. consistency. Consistency is assessed mostly through so-called reliability diagrams. There exists, however, another target beyond average calibration: conditional calibration with respect to input features, i.e. adaptivity. In practice, adaptivity is the main concern of the final users of a ML-UQ method, who seek reliable predictions and uncertainties for any point in feature space. This article aims to show that consistency and adaptivity are complementary validation targets, and that good consistency does not imply good adaptivity. Suitable validation methods are proposed and illustrated on a representative example.
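
As a minimal sketch of how the two targets can be checked separately (synthetic data and a simple variance-of-z-scores diagnostic, not necessarily the article's specific methods), the snippet below bins standardized errors once by predicted uncertainty (consistency) and once by an input feature (adaptivity). The data are constructed so that consistency looks fine while adaptivity fails, which is exactly the gap the abstract points at.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic example: predicted sigma is correct "on average over x",
# but the true noise level actually scales with the feature x.
n = 20000
x = rng.uniform(0.0, 1.0, n)                      # input feature
sigma = 0.1 + 0.2 * rng.uniform(size=n)           # predicted uncertainties, independent of x
mu = np.sin(2 * np.pi * x)                        # predictions
y = mu + sigma * np.sqrt(2.0 * x) * rng.normal(size=n)  # E[(y-mu)^2 / sigma^2 | sigma] = 1

z = (y - mu) / sigma                              # standardized errors

def binned_zvar(values, z, n_bins=10):
    """Variance of z within quantile bins of `values`; ~1 in every bin indicates calibration."""
    edges = np.quantile(values, np.linspace(0.0, 1.0, n_bins + 1))
    idx = np.clip(np.digitize(values, edges[1:-1]), 0, n_bins - 1)
    return np.array([z[idx == b].var() for b in range(n_bins)])

print("consistency (bins of sigma):", binned_zvar(sigma, z).round(2))  # all close to 1
print("adaptivity  (bins of x):    ", binned_zvar(x, z).round(2))      # ranges from ~0.1 to ~1.9
```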

Difference-in-differences (DID) is a popular approach to identify the causal effects of treatments and policies in the presence of unmeasured confounding. DID identifies the sample average treatment effect in the treated (SATT). However, a goal of such research is often to inform decision-making in target populations outside the treated sample. Transportability methods have been developed to extend inferences from study samples to external target populations; these methods have primarily been developed and applied in settings where identification is based on conditional independence between the treatment and potential outcomes, such as in a randomized trial. This paper develops identification results and estimators for effects in a target population, based on DID conducted in a study sample that differs from the target population. We present a range of assumptions under which one may identify causal effects in the target population and employ causal diagrams to illustrate these assumptions. In most realistic settings, results depend critically on the assumption that any unmeasured confounders are not effect measure modifiers on the scale of the effect of interest. We develop several estimators of transported effects, including a doubly robust estimator based on the efficient influence function. Simulation results support the theoretical properties of the proposed estimators. We discuss the potential application of our approach to a study of the effects of a US federal smoke-free housing policy, where the original study was conducted in New York City alone and the goal is to extend inferences to other US cities.
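
For orientation, a toy sketch of the ideas involved (simulated data and a simple stratified reweighting, not the paper's doubly robust estimator): a standard 2x2 DID gives the SATT in the study sample, and averaging stratum-specific DID estimates over a target covariate distribution gives a transported effect.

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy study sample: treated / control groups observed pre and post treatment.
n = 4000
w = rng.binomial(1, 0.5, n)                   # 1 = treated group
x = rng.binomial(1, 0.3, n)                   # binary covariate acting as an effect modifier
effect = 1.0 + 1.0 * x                        # true effect depends on x
y_pre = 2.0 * w + x + rng.normal(size=n)
y_post = y_pre + 0.5 + effect * w + 0.1 * rng.normal(size=n)

delta = y_post - y_pre

# Standard 2x2 DID: sample average treatment effect in the treated (SATT).
satt = delta[w == 1].mean() - delta[w == 0].mean()

# Transport: stratum-specific DID averaged over the target covariate distribution
# (here P(x=1) = 0.7 in the target versus 0.3 in the study sample).
did_x = {}
for xv in (0, 1):
    did_x[xv] = delta[(w == 1) & (x == xv)].mean() - delta[(w == 0) & (x == xv)].mean()
transported = 0.3 * did_x[0] + 0.7 * did_x[1]

print(f"SATT in study sample ~ {satt:.2f}, transported effect ~ {transported:.2f}")
```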

The processing and analysis of computed tomography (CT) imaging is important for both basic scientific development and clinical applications. In AutoCT, we provide a comprehensive pipeline that integrates an end-to-end automatic preprocessing, registration, segmentation, and quantitative analysis of 3D CT scans. The engineered pipeline enables atlas-based CT segmentation and quantification leveraging diffeomorphic transformations through efficient forward and inverse mappings. The extracted localized features from the deformation field allow for downstream statistical learning that may facilitate medical diagnostics. On a lightweight and portable software platform, AutoCT provides a new toolkit for the CT imaging community to underpin the deployment of artificial intelligence-driven applications.

We present a new high-order accurate spectral element solution to the two-dimensional scalar Poisson equation subject to a general Robin boundary condition. The solution is based on a simplified version of the shifted boundary method employing a continuous arbitrary-order $hp$-Galerkin spectral element method as the numerical discretization procedure. The simplification relies on a polynomial correction to avoid explicitly evaluating high-order partial derivatives from the Taylor series expansion, which have traditionally been used within the shifted boundary method. In this setting, we apply an extrapolation and novel interpolation approach to project the basis functions from the true domain onto the approximate surrogate domain. The resulting method naturally incorporates curved geometrical features of the domain, overcomes complex and cumbersome mesh generation, and avoids problems with small cut cells. Dirichlet, Neumann, and general Robin boundary conditions are enforced weakly through: i) a generalized Nitsche's method and ii) a generalized Aubin's method. For this, a consistent asymptotic-preserving formulation of the embedded Robin formulations is presented. We present several numerical experiments and an analysis of the algorithmic properties of the different weak formulations. These include convergence studies under polynomial ($p$) enrichment of the basis functions, mesh ($h$) refinement, and matrix conditioning, to highlight the spectral and algebraic convergence features, respectively. This is done to assess the influence of errors across variational formulations, polynomial order, mesh size, and mappings between the true and surrogate boundaries.
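
For orientation only (this is the classical fitted-domain weak form, not the paper's shifted boundary formulation), a Robin condition $\alpha u + \beta\,\partial_n u = g$ with $\beta \neq 0$ for the Poisson problem $-\Delta u = f$ is enforced weakly through boundary terms of the form

$$\int_\Omega \nabla u_h \cdot \nabla v_h \,\mathrm{d}\Omega + \int_\Gamma \frac{\alpha}{\beta}\, u_h\, v_h \,\mathrm{d}\Gamma = \int_\Omega f\, v_h \,\mathrm{d}\Omega + \int_\Gamma \frac{1}{\beta}\, g\, v_h \,\mathrm{d}\Gamma ,$$

which the generalized Nitsche and Aubin formulations discussed in the abstract adapt to the surrogate boundary setting.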

Large learning rates, when applied to gradient descent for nonconvex optimization, yield various implicit biases, including the edge of stability (Cohen et al., 2021), balancing (Wang et al., 2022), and catapult (Lewkowycz et al., 2020). These phenomena cannot be well explained by classical optimization theory. Though significant theoretical progress has been made in understanding these implicit biases, it remains unclear for which objective functions they occur. This paper provides an initial step in answering this question, namely that these implicit biases are in fact various tips of the same iceberg. They occur when the objective function of optimization has some good regularity, which, in combination with a provable preference of large-learning-rate gradient descent for moving toward flatter regions, results in these nontrivial dynamical phenomena. To establish this result, we develop a new global convergence theory under large learning rates for a family of nonconvex functions without a globally Lipschitz continuous gradient, which is typically assumed in existing convergence analyses. A byproduct is the first non-asymptotic convergence rate bound for large-learning-rate gradient descent optimization of nonconvex functions. We also validate our theory with experiments on neural networks, where different losses, activation functions, and batch normalization all can significantly affect regularity and lead to very different training dynamics.
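
As a minimal, hedged illustration of the kind of non-monotone dynamics large learning rates can produce (a toy one-dimensional quartic, not the paper's setting or analysis), the sketch below runs gradient descent at a small and a large step size and prints the loss trajectories: the small step decreases monotonically, while the large step oscillates instead of settling at the sharp minimum.

```python
import numpy as np

def f(x):            # toy nonconvex objective; its gradient is not globally Lipschitz
    return 0.25 * x**4 - x**2 + 1.0

def grad(x):
    return x**3 - 2.0 * x

def run_gd(lr, x0=0.1, steps=30):
    x, losses = x0, []
    for _ in range(steps):
        x = x - lr * grad(x)
        losses.append(f(x))
    return np.array(losses)

# Small learning rate: monotone decrease toward the minimum.
# Large learning rate: oscillatory, edge-of-stability-like loss behavior.
print("lr=0.05:", run_gd(0.05)[:8].round(3))
print("lr=0.6 :", run_gd(0.6)[:8].round(3))
```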

Reinforcement learning suffers from limitations in real-world practice, primarily due to the number of interactions required with the environment. This leads to a challenging problem: for many learning methods, it is implausible to obtain an optimal strategy within only a few attempts. We therefore design an improved reinforcement learning method based on model predictive control that models the environment through a data-driven approach. Based on the learned environment model, it performs multi-step prediction to estimate the value function and optimize the policy. The method demonstrates higher learning efficiency, faster convergence of the strategy toward the optimal value, and a smaller experience replay buffer. Experimental results, both on classic benchmarks and in a dynamic obstacle-avoidance scenario for an unmanned aerial vehicle, validate the proposed approach.
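
To make the general idea concrete, here is a hedged sketch of model-based control with a learned model: a random-shooting model predictive controller rolls out a (pretend) learned dynamics model over a short horizon and executes the first action of the best sequence. The toy environment, the linear "learned" model, and all constants are illustrative assumptions, not the paper's implementation.

```python
import numpy as np

rng = np.random.default_rng(0)

def true_dynamics(s, a):
    """Toy 1-D environment; reward is -|s|, so the goal is to drive the state to zero."""
    return 0.9 * s + a + 0.01 * rng.normal()

def reward(s):
    return -abs(s)

# Hypothetical learned model: a linear fit that approximates the true dynamics.
learned_A, learned_B = 0.88, 1.02   # pretend these came from regression on logged transitions

def learned_model(s, a):
    return learned_A * s + learned_B * a

def mpc_action(s, horizon=5, n_samples=200, gamma=0.95):
    """Random-shooting MPC: sample action sequences, roll out the learned model,
    and return the first action of the highest-return sequence."""
    best_a, best_ret = 0.0, -np.inf
    for _ in range(n_samples):
        seq = rng.uniform(-1.0, 1.0, horizon)
        s_sim, ret = s, 0.0
        for t, a in enumerate(seq):
            s_sim = learned_model(s_sim, a)
            ret += gamma**t * reward(s_sim)
        if ret > best_ret:
            best_ret, best_a = ret, seq[0]
    return best_a

s = 3.0
for _ in range(10):
    s = true_dynamics(s, mpc_action(s))
print(f"state after 10 MPC steps: {s:.3f} (driven toward 0)")
```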

Structured reinforcement learning leverages policies with advantageous properties to reach better performance, particularly in scenarios where exploration poses challenges. We explore this field through the concept of orchestration, where a (small) set of expert policies guides decision-making; the modeling thereof constitutes our first contribution. We then establish value-function regret bounds for orchestration in the tabular setting by transferring regret-bound results from adversarial settings. We generalize and extend the analysis of natural policy gradient in Agarwal et al. [2021, Section 5.3] to arbitrary adversarial aggregation strategies. We also extend it to the case of estimated advantage functions, providing insights into sample complexity both in expectation and with high probability. A key point of our approach lies in its arguably more transparent proofs compared to existing methods. Finally, we present simulations for a stochastic matching toy model.
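
As a hedged, minimal sketch of the orchestration idea in an adversarial-aggregation flavor (a plain exponential-weights aggregator over fixed expert policies with full-information feedback, not the paper's algorithm or analysis):

```python
import numpy as np

rng = np.random.default_rng(0)

n_experts, n_actions, T, eta = 3, 4, 2000, 0.05
experts = rng.dirichlet(np.ones(n_actions), size=n_experts)   # hypothetical expert policies
true_means = np.array([0.2, 0.5, 0.8, 0.3])                   # expected reward of each action

log_w, total = np.zeros(n_experts), 0.0
for t in range(T):
    w = np.exp(log_w - log_w.max()); w /= w.sum()
    k = rng.choice(n_experts, p=w)                             # orchestrator follows one expert
    # Full-information feedback: a sampled reward for each expert's proposed action.
    acts = np.array([rng.choice(n_actions, p=pi) for pi in experts])
    rewards = rng.binomial(1, true_means[acts]).astype(float)
    total += rewards[k]
    log_w += eta * rewards                                     # exponential-weights (Hedge) update

w_final = np.exp(log_w - log_w.max()); w_final /= w_final.sum()
print("final weights over experts:", w_final.round(3))
print("average reward of the orchestrated policy:", round(total / T, 3))
```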

Feature attribution is a fundamental task in both machine learning and data analysis, which involves determining the contribution of individual features or variables to a model's output. This process helps identify the most important features for predicting an outcome. The history of feature attribution methods can be traced back to Generalized Additive Models (GAMs), which extend linear regression models by incorporating non-linear relationships between dependent and independent variables. In recent years, gradient-based methods and surrogate models have been applied to unravel complex Artificial Intelligence (AI) systems, but these methods have limitations: GAMs tend to achieve lower accuracy, gradient-based methods can be difficult to interpret, and surrogate models often suffer from stability and fidelity issues. Furthermore, most existing methods do not consider users' contexts, which can significantly influence their preferences. To address these limitations and advance the current state of the art, we define a novel feature attribution framework called Context-Aware Feature Attribution Through Argumentation (CA-FATA). Our framework harnesses the power of argumentation by treating each feature as an argument that can either support, attack, or neutralize a prediction. Additionally, CA-FATA formulates feature attribution as an argumentation procedure, where each computation has explicit semantics, which makes it inherently interpretable. CA-FATA also easily integrates side information, such as users' contexts, resulting in more accurate predictions.
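
A toy illustration of the general idea, with an entirely hypothetical scoring rule (it is not the CA-FATA semantics): each feature is an argument with a polarity toward a prediction, and a user context can reweight the arguments.

```python
# Hypothetical example: arguments for/against recommending a restaurant.
features = {
    "price_low":    +0.6,   # supports the prediction
    "distance_far": -0.4,   # attacks the prediction
    "rating_high":  +0.8,   # supports the prediction
    "noisy_area":   -0.2,   # attacks the prediction
}
context_weight = {"price_low": 1.5}   # e.g. a budget-sensitive user context

def attribution(features, context_weight):
    """Context-reweighted argument strengths (toy scoring, for illustration only)."""
    return {f: s * context_weight.get(f, 1.0) for f, s in features.items()}

scores = attribution(features, context_weight)
prediction = "recommend" if sum(scores.values()) > 0 else "do not recommend"
print(scores, "->", prediction)
```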

Bagging is a commonly used ensemble technique in statistics and machine learning to improve the performance of prediction procedures. In this paper, we study the prediction risk of variants of bagged predictors under the proportional asymptotics regime, in which the ratio of the number of features to the number of observations converges to a constant. Specifically, we propose a general strategy to analyze the prediction risk under squared error loss of bagged predictors using classical results on simple random sampling. Specializing the strategy, we derive the exact asymptotic risk of the bagged ridge and ridgeless predictors with an arbitrary number of bags under a well-specified linear model with arbitrary feature covariance matrices and signal vectors. Furthermore, we prescribe a generic cross-validation procedure to select the optimal subsample size for bagging and discuss its utility to eliminate the non-monotonic behavior of the limiting risk in the sample size (i.e., double or multiple descents). In demonstrating the proposed procedure for bagged ridge and ridgeless predictors, we thoroughly investigate the oracle properties of the optimal subsample size and provide an in-depth comparison between different bagging variants.
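
A minimal sketch of the setting (synthetic well-specified linear model, simple random subsamples, and a plain holdout as a stand-in for the paper's generic cross-validation procedure): bagged ridge predictors are fit on subsamples of size k, and the validation error is compared across candidate k.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic well-specified linear model with p comparable to n (proportional regime).
n, p = 300, 150
X = rng.normal(size=(n, p))
beta = rng.normal(size=p) / np.sqrt(p)
y = X @ beta + rng.normal(size=n)

def ridge_fit(X, y, lam=1e-1):
    d = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(d), X.T @ y)

def bagged_ridge_predict(X, y, X_test, k, n_bags=20, lam=1e-1):
    """Average ridge predictors fit on simple random subsamples of size k."""
    preds = []
    for _ in range(n_bags):
        idx = rng.choice(len(y), size=k, replace=False)
        preds.append(X_test @ ridge_fit(X[idx], y[idx], lam))
    return np.mean(preds, axis=0)

# Select the subsample size k on held-out data.
X_tr, y_tr, X_val, y_val = X[:200], y[:200], X[200:], y[200:]
for k in (60, 100, 140, 180):
    err = np.mean((bagged_ridge_predict(X_tr, y_tr, X_val, k) - y_val) ** 2)
    print(f"k={k}: validation MSE ~ {err:.3f}")
```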

Most state-of-the-art machine learning techniques revolve around the optimisation of loss functions. Defining appropriate loss functions is therefore critical to successfully solving problems in this field. We present a survey of the most commonly used loss functions for a wide range of applications, divided into classification, regression, ranking, sample generation, and energy-based modelling. Overall, we introduce 33 different loss functions and organise them into an intuitive taxonomy. Each loss function is given a theoretical backing, and we describe where it is best used. This survey aims to provide a reference of the most essential loss functions for both beginner and advanced machine learning practitioners.
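
For concreteness, a few widely used losses of the kind such a survey covers, written out in numerically naive form (reference definitions, not code from the survey):

```python
import numpy as np

def squared_error(y, y_hat):
    """Mean squared error, the standard regression loss."""
    return np.mean((y - y_hat) ** 2)

def binary_cross_entropy(y, p, eps=1e-12):
    """Negative log-likelihood for binary classification with predicted probabilities p."""
    p = np.clip(p, eps, 1 - eps)
    return -np.mean(y * np.log(p) + (1 - y) * np.log(1 - p))

def hinge(y_pm1, score):
    """Margin-based classification loss with labels in {-1, +1}."""
    return np.mean(np.maximum(0.0, 1.0 - y_pm1 * score))

y = np.array([0, 1, 1, 0])
p = np.array([0.1, 0.8, 0.6, 0.3])
print(squared_error(y, p), binary_cross_entropy(y, p), hinge(2 * y - 1, 2 * p - 1))
```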

Incorporating prior knowledge into pre-trained language models has proven to be effective for knowledge-driven NLP tasks, such as entity typing and relation extraction. Current pre-training procedures usually inject external knowledge into models through knowledge masking, knowledge fusion, and knowledge replacement. However, the factual information contained in the input sentences has not been fully mined, and the external knowledge to be injected has not been strictly checked. As a result, the context information cannot be fully exploited, and either extra noise is introduced or the amount of knowledge injected is limited. To address these issues, we propose MLRIP, which modifies the knowledge masking strategies proposed by ERNIE-Baidu and introduces a two-stage entity replacement strategy. Extensive experiments with comprehensive analyses illustrate the superiority of MLRIP over BERT-based models in military knowledge-driven NLP tasks.
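
To illustrate the general flavor of knowledge (entity-level) masking during pre-training, here is a generic sketch that masks whole entity spans rather than isolated word pieces; the function, spans, and sentence are hypothetical and are not the MLRIP or ERNIE-Baidu implementation.

```python
import random

random.seed(0)

def mask_entities(tokens, entity_spans, mask_token="[MASK]", prob=0.5):
    """Whole-entity masking: mask every token of a randomly chosen subset of entities."""
    tokens = list(tokens)
    for start, end in entity_spans:
        if random.random() < prob:
            for i in range(start, end):
                tokens[i] = mask_token
    return tokens

sentence = ["the", "battalion", "crossed", "the", "yalu", "river", "in", "october"]
entities = [(1, 2), (4, 6)]          # hypothetical entity spans (token index ranges)
print(mask_entities(sentence, entities))
```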
