We propose local prediction pools as a method for combining the predictive distributions of a set of experts conditional on a set of variables believed to be related to the predictive accuracy of the experts. This is done in a two-step process where we first estimate the conditional predictive accuracy of each expert given a vector of covariates (or pooling variables) and then combine the predictive distributions of the experts conditional on this local predictive accuracy. To estimate the local predictive accuracy of each expert, we introduce the simple, fast, and interpretable caliper method. Expert pooling weights from the local prediction pool approach the equal weight solution whenever there is little data on local predictive performance, making the pools robust and adaptive. We also propose a local version of the widely used optimal prediction pools. Local prediction pools are shown to outperform the widely used optimal linear pools in a macroeconomic forecasting evaluation, and in predicting daily bike usage for a bike rental company.
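
The two-step idea in the abstract can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration, not the authors' estimator: the pooling variable is one-dimensional, the caliper is an absolute-distance window, and the weights are a softmax of shrunken local mean log scores (the `prior_strength` parameter is hypothetical). The shrinkage is chosen only so that the weights fall back to equal weights when no past observations lie within the caliper, mirroring the behaviour the abstract describes.

```python
import math

def caliper_neighbors(z_history, z_now, caliper):
    """Indices of past observations whose pooling variable is within the caliper of z_now."""
    return [i for i, z in enumerate(z_history) if abs(z - z_now) <= caliper]

def local_pool_weights(log_scores, z_history, z_now, caliper, prior_strength=1.0):
    """Illustrative local pooling weights (an assumption, not the paper's formula).

    log_scores[k][i] is the log predictive density expert k assigned to past
    observation i. Each expert's local mean log score is shrunk toward zero,
    so with no neighbours all experts tie and the weights are equal.
    """
    idx = caliper_neighbors(z_history, z_now, caliper)
    local = [sum(s[i] for i in idx) / (len(idx) + prior_strength) for s in log_scores]
    m = max(local)  # subtract the maximum for a numerically stable softmax
    unnorm = [math.exp(v - m) for v in local]
    total = sum(unnorm)
    return [u / total for u in unnorm]

def pooled_density(weights, expert_densities):
    """Linear pool: weighted sum of the experts' predictive densities at one point."""
    return sum(w * d for w, d in zip(weights, expert_densities))
```

Calling `local_pool_weights` with a `z_now` far from every past pooling value returns equal weights, while a `z_now` close to observations where one expert scored well shifts weight toward that expert.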

Related content

Model evaluation

System evaluation criteria in machine learning system design

INTERACT · Mathematics · Examples · Principle · Lecture notes

October 13, 2023

Univalent Double Categories

Niels van der Weide, Nima Rasekh, Benedikt Ahrens, Paige Randall North

Category theory is a branch of mathematics that provides a formal framework for understanding the relationship between mathematical structures. To this end, a category not only incorporates the data of the desired objects, but also "morphisms", which capture how different objects interact with each other. Category theory has found many applications in mathematics and in computer science, for example in functional programming. Double categories are a natural generalization of categories which incorporate the data of two separate classes of morphisms, allowing a more nuanced representation of relationships and interactions between objects. Similar to category theory, double categories have been successfully applied to various situations in mathematics and computer science, in which objects naturally exhibit two types of morphisms. Examples include categories themselves, but also lenses, Petri nets, and spans. While categories have already been formalized in a variety of proof assistants, double categories have received far less attention. In this paper we remedy this situation by presenting a formalization of double categories via the proof assistant Coq, relying on the Coq UniMath library. As part of this work we present two equivalent formalizations of the definition of a double category, an unfolded explicit definition and a second definition which exhibits excellent formal properties via 2-sided displayed categories. As an application of the formal approach we establish a notion of univalent double category along with a univalence principle: equivalences of univalent double categories coincide with their identities.

TOOLS · Datasets · Scenarios · Diversity · Examples

October 13, 2023

Structured Prediction Problem Archive

Paul Swoboda, Ahmed Abbas, Florian Bernard, Andrea Hornakova, Paul Roetzer, Bogdan Savchynskyy

from arxiv, Added new shape matching instances based on learned descriptors

Structured prediction problems are one of the fundamental tools in machine learning. In order to facilitate algorithm development for their numerical solution, we collect in one place a large number of datasets in easy to read formats for a diverse set of problem classes. We provide archival links to datasets, description of the considered problems and problem formats, and a short summary of problem characteristics including size, number of instances etc. For reference we also give a non-exhaustive selection of algorithms proposed in the literature for their solution. We hope that this central repository will make benchmarking and comparison to established works easier. We welcome submission of interesting new datasets and algorithms for inclusion in our archive.

Model evaluation · binary · Performer · MoDELS · Fashion MNIST (dataset)

October 12, 2023

Efficient Hyperdimensional Computing

Zhanglu Yan, Shida Wang, Kaiwen Tang, Weng-Fai Wong

Hyperdimensional computing (HDC) is a method to perform classification that uses binary vectors with high dimensions and the majority rule. This approach has the potential to be energy-efficient and hence deemed suitable for resource-limited platforms due to its simplicity and massive parallelism. However, in order to achieve high accuracy, HDC sometimes uses hypervectors with tens of thousands of dimensions. This potentially negates its efficiency advantage. In this paper, we examine the necessity of such high dimensions and conduct a detailed theoretical analysis of the relationship between hypervector dimensions and accuracy. Our results demonstrate that as the dimension of the hypervectors increases, the worst-case/average-case HDC prediction accuracy with the majority rule decreases. Building on this insight, we develop HDC models that use binary hypervectors with dimensions orders of magnitude lower than those of state-of-the-art HDC models while maintaining equivalent or even improved accuracy and efficiency. For instance, on the MNIST dataset, we achieve 91.12% HDC accuracy in image classification with a dimension of only 64. Our methods perform operations that are only 0.35% of other HDC models with dimensions of 10,000. Furthermore, we evaluate our methods on ISOLET, UCI-HAR, and Fashion-MNIST datasets and investigate the limits of HDC computing.
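
For readers unfamiliar with the basic mechanics, the following is a minimal sketch of a generic binary HDC classifier using the majority rule. It is a textbook-style illustration under stated assumptions, not the low-dimensional models proposed in the paper: features are assumed binary, binding is assumed to be XOR between a position hypervector and a value hypervector, and all function names are hypothetical.

```python
import random

def random_hv(dim, rng):
    """Random binary hypervector of length dim."""
    return [rng.randint(0, 1) for _ in range(dim)]

def majority(vectors):
    """Bitwise majority rule over a list of binary vectors; ties resolve to 1."""
    n = len(vectors)
    return [1 if 2 * sum(bits) >= n else 0 for bits in zip(*vectors)]

def hamming(a, b):
    """Number of positions in which two binary vectors differ."""
    return sum(x != y for x, y in zip(a, b))

def encode(features, pos_hvs, val_hvs):
    """Bind each position with its feature value via XOR, then bundle by majority."""
    bound = [[p ^ v for p, v in zip(pos_hvs[i], val_hvs[f])]
             for i, f in enumerate(features)]
    return majority(bound)

def train(samples, labels, pos_hvs, val_hvs):
    """One class hypervector per label: the majority of its encoded samples."""
    by_class = {}
    for x, y in zip(samples, labels):
        by_class.setdefault(y, []).append(encode(x, pos_hvs, val_hvs))
    return {y: majority(hvs) for y, hvs in by_class.items()}

def predict(x, class_hvs, pos_hvs, val_hvs):
    """Label whose class hypervector is nearest in Hamming distance."""
    q = encode(x, pos_hvs, val_hvs)
    return min(class_hvs, key=lambda y: hamming(q, class_hvs[y]))
```

The dimension `dim` is the quantity the abstract studies: every operation above costs time linear in `dim`, which is why reducing it from tens of thousands to tens directly reduces the work per prediction.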

Approximation · Multimodal · Samples · Exact · Annealed importance sampling

October 12, 2023

Multimodal Sampling via Approximate Symmetries

Lexing Ying

Sampling from multimodal distributions is a challenging task in scientific computing. When a distribution has an exact symmetry between the modes, direct jumps among them can accelerate sampling significantly. However, the distributions from most applications do not have exact symmetries. This paper considers distributions with approximate symmetries. We first construct an exactly symmetric reference distribution from the target one by averaging over the group orbit associated with the approximate symmetry. Next, we can apply multilevel Monte Carlo methods by constructing a continuation path between the reference and target distributions. We discuss how to implement these steps with annealed importance sampling and tempered transitions. Compared with traditional multilevel methods, the proposed approach can be more effective since the reference and target distributions are much closer. Numerical results for Ising models are presented to illustrate the efficiency of the proposed method.
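
The first two steps (symmetrizing the target and building a continuation path) can be sketched as follows. The example density, the two-element reflection group, and the geometric form of the path are all assumptions chosen for illustration; the abstract does not specify which path the paper uses, and the geometric interpolation shown is simply a common choice in annealed importance sampling.

```python
import math

def log_target(x, tilt=0.3):
    """Unnormalized log density of a tilted double well (hypothetical example).

    Without the tilt the density is symmetric under x -> -x; the tilt makes
    the symmetry only approximate.
    """
    return -4.0 * (x * x - 1.0) ** 2 + tilt * x

def symmetrize(log_p, group):
    """Log of the average of p over the orbit {g(x) : g in group}."""
    def log_ref(x):
        vals = [log_p(g(x)) for g in group]
        m = max(vals)  # log-sum-exp for numerical stability
        return m + math.log(sum(math.exp(v - m) for v in vals) / len(vals))
    return log_ref

def path_log_density(log_ref, log_p, t):
    """Geometric continuation path (an assumed choice): (1 - t) log_ref + t log_p."""
    return lambda x: (1.0 - t) * log_ref(x) + t * log_p(x)

# Reflection group {identity, x -> -x} acting on the real line.
Z2 = [lambda x: x, lambda x: -x]
log_ref = symmetrize(log_target, Z2)
```

Because `log_ref` is exactly invariant under the group, a sampler on the reference can jump directly between modes, and the path at `t = 0` and `t = 1` recovers the reference and the target respectively.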

Multimodal · Samples · Exact · Approximation · Annealed importance sampling

October 11, 2023

Multimodal Sampling via Approximate Symmetries

Lexing Ying

Reducible · Active learning · Learning · Hyperparameters · Datasets

October 11, 2023

Active Learning for Multilingual Semantic Parser

Zhuang Li, Gholamreza Haffari

from arxiv, EACL 2023 (findings)

Current multilingual semantic parsing (MSP) datasets are almost all collected by translating the utterances in the existing datasets from the resource-rich language to the target language. However, manual translation is costly. To reduce the translation effort, this paper proposes the first active learning procedure for MSP (AL-MSP). AL-MSP selects only a subset from the existing datasets to be translated. We also propose a novel selection method that prioritizes the examples diversifying the logical form structures with more lexical choices, and a novel hyperparameter tuning method that needs no extra annotation cost. Our experiments show that AL-MSP significantly reduces translation costs with ideal selection methods. Our selection method with proper hyperparameters yields better parsing performance than the other baselines on two multilingual datasets.

Learning · Processing (programming language) · MoDELS · Decomposed · Representation learning

November 21, 2022

Disentangled Representation Learning

Xin Wang, Hong Chen, Si'ao Tang, Zihao Wu, Wenwu Zhu

from arxiv, 22 pages, 9 figures

Disentangled Representation Learning (DRL) aims to learn a model capable of identifying and disentangling the underlying factors hidden in the observable data in representation form. The process of separating underlying factors of variation into variables with semantic meaning helps in learning explainable representations of data, which imitates the meaningful understanding process of humans when observing an object or relation. As a general learning strategy, DRL has demonstrated its power in improving model explainability, controllability, robustness, and generalization capacity in a wide range of scenarios such as computer vision, natural language processing, and data mining. In this article, we comprehensively review DRL from various aspects including motivations, definitions, methodologies, evaluations, applications and model designs. We discuss works on DRL based on two well-recognized definitions, i.e., Intuitive Definition and Group Theory Definition. We further categorize the methodologies for DRL into four groups, i.e., Traditional Statistical Approaches, Variational Auto-encoder Based Approaches, Generative Adversarial Networks Based Approaches, Hierarchical Approaches and Other Approaches. We also analyze principles to design different DRL models that may benefit different tasks in practical applications. Finally, we point out challenges in DRL as well as potential research directions deserving future investigations. We believe this work may provide insights for promoting the DRL research in the community.

INFORMS · Performer · Random variable · Optimizer · Generalization theory

December 22, 2020

Disentangled Information Bottleneck

Ziqi Pan, Li Niu, Jianfu Zhang, Liqing Zhang

from arxiv, Revised mathematical proof

The information bottleneck (IB) method is a technique for extracting information that is relevant for predicting the target random variable from the source random variable, which is typically implemented by optimizing the IB Lagrangian that balances the compression and prediction terms. However, the IB Lagrangian is hard to optimize, and multiple trials are required to tune the value of the Lagrangian multiplier. Moreover, we show that the prediction performance strictly decreases as the compression gets stronger during optimizing the IB Lagrangian. In this paper, we implement the IB method from the perspective of supervised disentangling. Specifically, we introduce Disentangled Information Bottleneck (DisenIB) that is consistent on compressing source maximally without target prediction performance loss (maximum compression). Theoretical and experimental results demonstrate that our method is consistent on maximum compression, and performs well in terms of generalization, robustness to adversarial attack, out-of-distribution detection, and supervised disentangling.
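
To make the trade-off between the compression and prediction terms concrete, here is a small sketch that evaluates the IB Lagrangian for discrete variables. It assumes the common convention I(X;T) - beta * I(T;Y), to be minimized over the encoder p(t|x); sign and multiplier conventions differ between papers, so this may not match the exact form used in the paper. The function names and the tabular representation are hypothetical.

```python
import math

def mutual_information(joint):
    """I(A;B) in nats for a joint probability table joint[a][b]."""
    pa = [sum(row) for row in joint]
    pb = [sum(col) for col in zip(*joint)]
    mi = 0.0
    for a, row in enumerate(joint):
        for b, p in enumerate(row):
            if p > 0:  # zero-probability cells contribute nothing
                mi += p * math.log(p / (pa[a] * pb[b]))
    return mi

def ib_lagrangian(p_xy, p_t_given_x, beta):
    """I(X;T) - beta * I(T;Y) for a discrete encoder, assuming the chain T - X - Y."""
    n_x, n_y, n_t = len(p_xy), len(p_xy[0]), len(p_t_given_x[0])
    p_x = [sum(row) for row in p_xy]
    # Joint of source and representation: p(x, t) = p(x) p(t|x).
    p_xt = [[p_x[x] * p_t_given_x[x][t] for t in range(n_t)] for x in range(n_x)]
    # Joint of representation and target: p(t, y) = sum_x p(t|x) p(x, y).
    p_ty = [[sum(p_t_given_x[x][t] * p_xy[x][y] for x in range(n_x))
             for y in range(n_y)] for t in range(n_t)]
    return mutual_information(p_xt) - beta * mutual_information(p_ty)
```

With an identity encoder the representation keeps everything (high compression cost, high prediction term), while a constant encoder drives both terms to zero; the multiplier `beta` decides which of the two the objective prefers, which is the tuning burden the abstract refers to.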

Long short-term memory network · Named entity recognition · MoDELS · Better · Gating

May 15, 2018

Chinese NER Using Lattice LSTM

Yue Zhang, Jie Yang

from arxiv, Accepted at ACL 2018 as Long paper

We investigate a lattice-structured LSTM model for Chinese NER, which encodes a sequence of input characters as well as all potential words that match a lexicon. Compared with character-based methods, our model explicitly leverages word and word sequence information. Compared with word-based methods, lattice LSTM does not suffer from segmentation errors. Gated recurrent cells allow our model to choose the most relevant characters and words from a sentence for better NER results. Experiments on various datasets show that lattice LSTM outperforms both word-based and character-based LSTM baselines, achieving the best results.

BLEU · MoDELS · Attention mechanism · Transformer · Networking

December 6, 2017

Attention Is All You Need

Ashish Vaswani, Noam Shazeer, Niki Parmar, Jakob Uszkoreit, Llion Jones, Aidan N. Gomez, Lukasz Kaiser, Illia Polosukhin

from arxiv, 15 pages, 5 figures

The dominant sequence transduction models are based on complex recurrent or convolutional neural networks in an encoder-decoder configuration. The best performing models also connect the encoder and decoder through an attention mechanism. We propose a new simple network architecture, the Transformer, based solely on attention mechanisms, dispensing with recurrence and convolutions entirely. Experiments on two machine translation tasks show these models to be superior in quality while being more parallelizable and requiring significantly less time to train. Our model achieves 28.4 BLEU on the WMT 2014 English-to-German translation task, improving over the existing best results, including ensembles, by over 2 BLEU. On the WMT 2014 English-to-French translation task, our model establishes a new single-model state-of-the-art BLEU score of 41.8 after training for 3.5 days on eight GPUs, a small fraction of the training costs of the best models from the literature. We show that the Transformer generalizes well to other tasks by applying it successfully to English constituency parsing both with large and limited training data.
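
The attention mechanism the abstract refers to is, in the body of the paper, scaled dot-product attention: softmax(Q K^T / sqrt(d_k)) V. The sketch below implements that single operation on plain Python lists. It is a simplified illustration only: it omits masking, multiple heads, and the learned projections, and the list-of-rows matrix layout is an assumption made to keep the example dependency-free.

```python
import math

def softmax(row):
    """Numerically stable softmax over a list of scores."""
    m = max(row)
    exps = [math.exp(v - m) for v in row]
    total = sum(exps)
    return [e / total for e in exps]

def scaled_dot_product_attention(Q, K, V):
    """softmax(Q K^T / sqrt(d_k)) V, with matrices given as lists of rows."""
    d_k = len(K[0])
    out = []
    for q in Q:
        # Similarity of this query to every key, scaled by sqrt(d_k).
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d_k) for k in K]
        weights = softmax(scores)
        # Output row is the attention-weighted average of the value rows.
        out.append([sum(w * v[j] for w, v in zip(weights, V))
                    for j in range(len(V[0]))])
    return out
```

Each query row is processed independently of the others, which is one reason the architecture parallelizes more readily than a recurrent model that must step through the sequence in order.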