亚洲成AV人片乱码色午夜刚交,久久婷婷人人喊人人泡人人爽

Conformal prediction is a popular, modern technique for providing valid predictive inference for arbitrary machine learning models. Its validity relies on the assumptions of exchangeability of the data, and symmetry of the given model fitting algorithm as a function of the data. However, exchangeability is often violated when predictive models are deployed in practice. For example, if the data distribution drifts over time, then the data points are no longer exchangeable; moreover, in such settings, we might want to use a nonsymmetric algorithm that treats recent observations as more relevant. This paper generalizes conformal prediction to deal with both aspects: we employ weighted quantiles to introduce robustness against distribution drift, and design a new randomization technique to allow for algorithms that do not treat data points symmetrically. Our new methods are provably robust, with substantially less loss of coverage when exchangeability is violated due to distribution drift or other challenging features of real data, while also achieving the same coverage guarantees as existing conformal prediction methods if the data points are in fact exchangeable. We demonstrate the practical utility of these new tools with simulations and real-data experiments on electricity and election forecasting.

相關內容

可交換的

關注 0

模型評估 · 穩健性 · 容差 · 預測器/決策函數 · 估計/估計量 ·

2022 年 10 月 27 日

Robust Monocular Localization of Drones by Adapting Domain Maps to Depth Prediction Inaccuracies

Priyesh Shukla,Sureshkumar S.,Alex C. Stutts,Sathya Ravi,Theja Tulabandhula,Amit R. Trivedi

We present a novel monocular localization framework by jointly training deep learning-based depth prediction and Bayesian filtering-based pose reasoning. The proposed cross-modal framework significantly outperforms deep learning-only predictions with respect to model scalability and tolerance to environmental variations. Specifically, we show little-to-no degradation of pose accuracy even with extremely poor depth estimates from a lightweight depth predictor. Our framework also maintains high pose accuracy in extreme lighting variations compared to standard deep learning, even without explicit domain adaptation. By openly representing the map and intermediate feature maps (such as depth estimates), our framework also allows for faster updates and reusing intermediate predictions for other tasks, such as obstacle avoidance, resulting in much higher resource efficiency.

Conformer · 容差 · 預測器/決策函數 · 情景 · 邊緣化 ·

2022 年 10 月 26 日

Distribution-Free Finite-Sample Guarantees and Split Conformal Prediction

Roel Hulsman

from arxiv, 38 pages, 3 figures

Modern black-box predictive models are often accompanied by weak performance guarantees that only hold asymptotically in the size of the dataset or require strong parametric assumptions. In response to this, split conformal prediction represents a promising avenue to obtain finite-sample guarantees under minimal distribution-free assumptions. Although prediction set validity most often concerns marginal coverage, we explore the related but different guarantee of tolerance regions, reformulating known results in the language of nested prediction sets and extending on the duality between marginal coverage and tolerance regions. Furthermore, we highlight the connection between split conformal prediction and classical tolerance predictors developed in the 1940s, as well as recent developments in distribution-free risk control. One result that transfers from classical tolerance predictors is that the coverage of a prediction set based on order statistics, conditional on the calibration set, is a random variable stochastically dominating the Beta distribution. We demonstrate the empirical effectiveness of our findings on synthetic and real datasets using a popular split conformal prediction procedure called conformalized quantile regression (CQR).

情景 · MoDELS · 泛函 · 模型評估 · 稀疏 ·

2022 年 10 月 25 日

Exploring the Whole Rashomon Set of Sparse Decision Trees

Rui Xin,Chudi Zhong,Zhi Chen,Takuya Takagi,Margo Seltzer,Cynthia Rudin

from arxiv, NeurIPS 2022 (Oral)

In any given machine learning problem, there may be many models that could explain the data almost equally well. However, most learning algorithms return only one of these models, leaving practitioners with no practical way to explore alternative models that might have desirable properties beyond what could be expressed within a loss function. The Rashomon set is the set of these all almost-optimal models. Rashomon sets can be extremely complicated, particularly for highly nonlinear function classes that allow complex interaction terms, such as decision trees. We provide the first technique for completely enumerating the Rashomon set for sparse decision trees; in fact, our work provides the first complete enumeration of any Rashomon set for a non-trivial problem with a highly nonlinear discrete function class. This allows the user an unprecedented level of control over model choice among all models that are approximately equally good. We represent the Rashomon set in a specialized data structure that supports efficient querying and sampling. We show three applications of the Rashomon set: 1) it can be used to study variable importance for the set of almost-optimal trees (as opposed to a single tree), 2) the Rashomon set for accuracy enables enumeration of the Rashomon sets for balanced accuracy and F1-score, and 3) the Rashomon set for a full dataset can be used to produce Rashomon sets constructed with only subsets of the data set. Thus, we are able to examine Rashomon sets across problems with a new lens, enabling users to choose models rather than be at the mercy of an algorithm that produces only a single model.

MoDELS · 操作 · 層 · 情景 · 元學習 ·

2022 年 10 月 25 日

Expanding the Deployment Envelope of Behavior Prediction via Adaptive Meta-Learning

Boris Ivanovic,James Harrison,Marco Pavone

from arxiv, 12 pages, 13 figures, 2 tables. Fixed links and references to the appendix

Learning-based behavior prediction methods are increasingly being deployed in real-world autonomous systems, e.g., in fleets of self-driving vehicles, which are beginning to commercially operate in major cities across the world. Despite their advancements, however, the vast majority of prediction systems are specialized to a set of well-explored geographic regions or operational design domains, complicating deployment to additional cities, countries, or continents. Towards this end, we present a novel method for efficiently adapting behavior prediction models to new environments. Our approach leverages recent advances in meta-learning, specifically Bayesian regression, to augment existing behavior prediction models with an adaptive layer that enables efficient domain transfer via offline fine-tuning, online adaptation, or both. Experiments across multiple real-world datasets demonstrate that our method can efficiently adapt to a variety of unseen environments.

語言模型化 · 置信度 · 可約的 · MoDELS · Continuity ·

2022 年 10 月 25 日

Confident Adaptive Language Modeling

Tal Schuster,Adam Fisch,Jai Gupta,Mostafa Dehghani,Dara Bahri,Vinh Q. Tran,Yi Tay,Donald Metzler

from arxiv, NeurIPS 2022 (selected as Oral)

Recent advances in Transformer-based large language models (LLMs) have led to significant performance improvements across many tasks. These gains come with a drastic increase in the models' size, potentially leading to slow and costly use at inference time. In practice, however, the series of generations made by LLMs is composed of varying levels of difficulty. While certain predictions truly benefit from the models' full capacity, other continuations are more trivial and can be solved with reduced compute. In this work, we introduce Confident Adaptive Language Modeling (CALM), a framework for dynamically allocating different amounts of compute per input and generation timestep. Early exit decoding involves several challenges that we address here, such as: (1) what confidence measure to use; (2) connecting sequence-level constraints to local per-token exit decisions; and (3) attending back to missing hidden representations due to early exits in previous tokens. Through theoretical analysis and empirical experiments on three diverse text generation tasks, we demonstrate the efficacy of our framework in reducing compute -- potential speedup of up to $\times 3$ -- while provably maintaining high performance.

Conformer · 優化器 · 協變量偏移 · 覆蓋 · MoDELS ·

2022 年 10 月 25 日

Bayesian Optimization with Conformal Coverage Guarantees

Samuel Stanton,Wesley Maddox,Andrew Gordon Wilson

from arxiv, For code, see //www.github.com/samuelstanton/conformal-bayesopt.git

Bayesian optimization is a coherent, ubiquitous approach to decision-making under uncertainty, with applications including multi-arm bandits, active learning, and black-box optimization. Bayesian optimization selects decisions (i.e. objective function queries) with maximal expected utility with respect to the posterior distribution of a Bayesian model, which quantifies reducible, epistemic uncertainty about query outcomes. In practice, subjectively implausible outcomes can occur regularly for two reasons: 1) model misspecification and 2) covariate shift. Conformal prediction is an uncertainty quantification method with coverage guarantees even for misspecified models and a simple mechanism to correct for covariate shift. We propose conformal Bayesian optimization, which directs queries towards regions of search space where the model predictions have guaranteed validity, and investigate its behavior on a suite of black-box optimization tasks and tabular ranking tasks. In many cases we find that query coverage can be significantly improved without harming sample-efficiency.

去噪 · state-of-the-art · 數據集 · Learning · 去噪自編碼器 ·

2022 年 10 月 25 日

Pre-training via Denoising for Molecular Property Prediction

Sheheryar Zaidi,Michael Schaarschmidt,James Martens,Hyunjik Kim,Yee Whye Teh,Alvaro Sanchez-Gonzalez,Peter Battaglia,Razvan Pascanu,Jonathan Godwin

Many important problems involving molecular property prediction from 3D structures have limited data, posing a generalization challenge for neural networks. In this paper, we describe a pre-training technique based on denoising that achieves a new state-of-the-art in molecular property prediction by utilizing large datasets of 3D molecular structures at equilibrium to learn meaningful representations for downstream tasks. Relying on the well-known link between denoising autoencoders and score-matching, we show that the denoising objective corresponds to learning a molecular force field -- arising from approximating the Boltzmann distribution with a mixture of Gaussians -- directly from equilibrium structures. Our experiments demonstrate that using this pre-training objective significantly improves performance on multiple benchmarks, achieving a new state-of-the-art on the majority of targets in the widely used QM9 dataset. Our analysis then provides practical insights into the effects of different factors -- dataset sizes, model size and architecture, and the choice of upstream and downstream datasets -- on pre-training.

Conformer · 推斷 · 情景 · 在線 · 可交換的 ·

2022 年 10 月 24 日

Conformal Inference for Online Prediction with Arbitrary Distribution Shifts

Isaac Gibbs,Emmanuel Candès

from arxiv, 29 pages, 10 figures

Conformal inference is a powerful tool for quantifying the uncertainty around predictions made by black-box models (e.g. neural nets, random forests). Formally, this methodology guarantees that if the training and test data are exchangeable (e.g. i.i.d.) then we can construct a prediction set $C$ for the target $Y$ such that $P(Y \in C) = 1-\alpha$ for any target level $\alpha$. In this article, we extend this methodology to an online prediction setting where the distribution generating the data is allowed to vary over time. To account for the non-exchangeability, we develop a protective layer that lies on top of conformal inference and gradually re-calibrates its predictions to adapt to the observed changes in the environment. Our methods are highly flexible and can be used in combination with any predictive algorithm that produces estimates of the target or its conditional distribution and without any assumptions on the size or type of the distribution shift. We test our techniques on two real-world datasets aimed at predicting stock market volatility and COVID-19 case counts and find that they are robust and adaptive to real-world distribution shifts.

圖 · Networking · Processing（編程語言） · 圖卷積 · 圖卷積神經網絡/圖卷積網絡 ·

2021 年 12 月 27 日

Powerful Graph Convolutioal Networks with Adaptive Propagation Mechanism for Homophily and Heterophily

Tao Wang,Rui Wang,Di Jin,Dongxiao He,Yuxiao Huang

Graph Convolutional Networks (GCNs) have been widely applied in various fields due to their significant power on processing graph-structured data. Typical GCN and its variants work under a homophily assumption (i.e., nodes with same class are prone to connect to each other), while ignoring the heterophily which exists in many real-world networks (i.e., nodes with different classes tend to form edges). Existing methods deal with heterophily by mainly aggregating higher-order neighborhoods or combing the immediate representations, which leads to noise and irrelevant information in the result. But these methods did not change the propagation mechanism which works under homophily assumption (that is a fundamental part of GCNs). This makes it difficult to distinguish the representation of nodes from different classes. To address this problem, in this paper we design a novel propagation mechanism, which can automatically change the propagation and aggregation process according to homophily or heterophily between node pairs. To adaptively learn the propagation process, we introduce two measurements of homophily degree between node pairs, which is learned based on topological and attribute information, respectively. Then we incorporate the learnable homophily degree into the graph convolution framework, which is trained in an end-to-end schema, enabling it to go beyond the assumption of homophily. More importantly, we theoretically prove that our model can constrain the similarity of representations between nodes according to their homophily degree. Experiments on seven real-world datasets demonstrate that this new approach outperforms the state-of-the-art methods under heterophily or low homophily, and gains competitive performance under homophily.

contrastive · 圖 · 對比學習 · INFORMS · 學成 ·

2021 年 9 月 24 日

GeomGCL: Geometric Graph Contrastive Learning for Molecular Property Prediction

Shuangli Li,Jingbo Zhou,Tong Xu,Dejing Dou,Hui Xiong

Recently many efforts have been devoted to applying graph neural networks (GNNs) to molecular property prediction which is a fundamental task for computational drug and material discovery. One of major obstacles to hinder the successful prediction of molecule property by GNNs is the scarcity of labeled data. Though graph contrastive learning (GCL) methods have achieved extraordinary performance with insufficient labeled data, most focused on designing data augmentation schemes for general graphs. However, the fundamental property of a molecule could be altered with the augmentation method (like random perturbation) on molecular graphs. Whereas, the critical geometric information of molecules remains rarely explored under the current GNN and GCL architectures. To this end, we propose a novel graph contrastive learning method utilizing the geometry of the molecule across 2D and 3D views, which is named GeomGCL. Specifically, we first devise a dual-view geometric message passing network (GeomMPNN) to adaptively leverage the rich information of both 2D and 3D graphs of a molecule. The incorporation of geometric properties at different levels can greatly facilitate the molecular representation learning. Then a novel geometric graph contrastive scheme is designed to make both geometric views collaboratively supervise each other to improve the generalization ability of GeomMPNN. We evaluate GeomGCL on various downstream property prediction tasks via a finetune process. Experimental results on seven real-life molecular datasets demonstrate the effectiveness of our proposed GeomGCL against state-of-the-art baselines.