Model-based deep learning solutions to inverse problems have attracted increasing attention in recent years as they bridge state-of-the-art numerical performance with interpretability. In addition, the incorporated prior domain knowledge can make the training more efficient, as the smaller number of parameters allows the training step to be executed with smaller datasets. Algorithm unrolling schemes stand out among these model-based learning techniques. Despite their rapid advancement and their close connection to traditional high-dimensional statistical methods, they lack certainty estimates, and a theory for uncertainty quantification is still elusive. This work provides a step towards closing this gap by proposing a rigorous way to obtain confidence intervals for the LISTA estimator.
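For readers unfamiliar with algorithm unrolling, the idea behind LISTA can be sketched as follows. This is a minimal NumPy sketch of classical ISTA unrolled into a fixed number of layers; it is not the confidence-interval construction the abstract refers to. In LISTA the matrices `W_e`, `S` and the threshold `theta` would be learned from data, whereas here they are simply set to their classical ISTA values. All function and variable names are illustrative assumptions.

```python
import numpy as np

def soft_threshold(v, theta):
    """Elementwise soft-thresholding, the proximal operator of theta * ||.||_1."""
    return np.sign(v) * np.maximum(np.abs(v) - theta, 0.0)

def unrolled_ista(y, A, lam, num_layers):
    """Run ISTA for the lasso problem as a fixed stack of identical layers.

    Assumption: W_e, S and theta are fixed to their classical ISTA values.
    A LISTA network would treat them as trainable parameters instead.
    """
    L = np.linalg.norm(A, 2) ** 2            # Lipschitz constant of the gradient
    W_e = A.T / L                            # maps the measurement into signal space
    S = np.eye(A.shape[1]) - A.T @ A / L     # linear map applied to the previous iterate
    theta = lam / L                          # soft-thresholding level
    x = np.zeros(A.shape[1])
    for _ in range(num_layers):              # each loop iteration is one "layer"
        x = soft_threshold(W_e @ y + S @ x, theta)
    return x

# Toy usage with an identity measurement matrix: the result is soft_threshold(y, lam).
y = np.array([3.0, -0.5, 1.0])
x_hat = unrolled_ista(y, np.eye(3), lam=1.0, num_layers=5)
```

Each layer has the same form as one ISTA step, which is why the unrolled network keeps a direct link to the underlying sparse-recovery model.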

Related content

Question answering · Performer · Biased · Optimizer · Visual question answering

October 31, 2023

Adaptive loose optimization for robust question answering

Jie Ma, Pinghui Wang, Zewei Wang, Dechen Kong, Min Hu, Ting Han, Jun Liu

from arXiv, 13 pages, 8 figures

Question answering methods are well-known for leveraging data bias, such as the language prior in visual question answering and the position bias in machine reading comprehension (extractive question answering). Current debiasing methods often come at the cost of significant in-distribution performance to achieve favorable out-of-distribution generalizability, while non-debiasing methods sacrifice a considerable amount of out-of-distribution performance in order to obtain high in-distribution performance. Therefore, it is challenging for them to deal with the complicated changing real-world situations. In this paper, we propose a simple yet effective novel loss function with adaptive loose optimization, which seeks to make the best of both worlds for question answering. Our main technical contribution is to reduce the loss adaptively according to the ratio between the previous and current optimization state on mini-batch training data. This loose optimization can be used to prevent non-debiasing methods from overlearning data bias while enabling debiasing methods to maintain slight bias learning. Experiments on the visual question answering datasets, including VQA v2, VQA-CP v1, VQA-CP v2, GQA-OOD, and the extractive question answering dataset SQuAD demonstrate that our approach enables QA methods to obtain state-of-the-art in- and out-of-distribution performance in most cases. The source code has been released publicly at https://github.com/reml-group/ALO.

Reducible · MoDELS · Learning · Identifiable · Decomposable

October 30, 2023

Strategic Data Sharing between Competitors

Nikita Tsoy, Nikola Konstantinov

from arXiv, Accepted to NeurIPS 2023

Collaborative learning techniques have significantly advanced in recent years, enabling private model training across multiple organizations. Despite this opportunity, firms face a dilemma when considering data sharing with competitors -- while collaboration can improve a company's machine learning model, it may also benefit competitors and hence reduce profits. In this work, we introduce a general framework for analyzing this data-sharing trade-off. The framework consists of three components, representing the firms' production decisions, the effect of additional data on model quality, and the data-sharing negotiation process, respectively. We then study an instantiation of the framework, based on a conventional market model from economic theory, to identify key factors that affect collaboration incentives. Our findings indicate a profound impact of market conditions on the data-sharing incentives. In particular, we find that reduced competition, in terms of the similarities between the firms' products, and harder learning tasks foster collaboration.

Threshold · Estimation/Estimator · MoDELS · Linear · Functional

October 28, 2023

Threshold detection under a semiparametric regression model

Graciela Boente, Florencia Leonardi, Daniela Rodriguez, Mariela Sued

Linear regression models have been extensively considered in the literature. However, in some practical applications they may not be appropriate all over the range of the covariate. In this paper, a more flexible model is introduced by considering a regression model $Y=r(X)+\varepsilon$ where the regression function $r(\cdot)$ is assumed to be linear for large values in the domain of the predictor variable $X$. More precisely, we assume that $r(x)=\alpha_0+\beta_0 x$ for $x> u_0$, where the value $u_0$ is identified as the smallest value satisfying such a property. A penalized procedure is introduced to estimate the threshold $u_0$. The considered proposal focusses on a semiparametric approach since no parametric model is assumed for the regression function for values smaller than $u_0$. Consistency properties of both the threshold estimator and the estimators of $(\alpha_0,\beta_0)$ are derived, under mild assumptions. Through a numerical study, the small sample properties of the proposed procedure and the importance of introducing a penalization are investigated. The analysis of a real data set allows us to demonstrate the usefulness of the penalized estimators.
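The model in this abstract, $r(x)=\alpha_0+\beta_0 x$ for $x>u_0$ with no parametric form below $u_0$, can be illustrated with a small simulation. The sketch below assumes the threshold is already known and only shows ordinary least squares on the linear tail; it does not reproduce the penalized threshold estimator, whose details are not given in the abstract. The function name, the noise-free data and the `sin` shape below the threshold are all hypothetical choices for illustration.

```python
import numpy as np

def fit_linear_tail(x, y, u):
    """Ordinary least squares for y = alpha + beta * x using only points with x > u."""
    mask = x > u
    beta, alpha = np.polyfit(x[mask], y[mask], deg=1)  # highest-degree coefficient first
    return alpha, beta

# Hypothetical noise-free data: linear above u0, an arbitrary nonlinear shape below it.
u0, alpha0, beta0 = 1.0, 2.0, 0.5
x = np.linspace(0.0, 3.0, 61)
y = np.where(x > u0, alpha0 + beta0 * x, np.sin(3.0 * x))

# With the true threshold, the tail fit recovers (alpha0, beta0) up to rounding error.
alpha_hat, beta_hat = fit_linear_tail(x, y, u0)
```

In practice $u_0$ is unknown and the data are noisy, which is the situation the penalized procedure described in the abstract is designed for.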

Linear · Tractable · motivation · Scenario · Performer

October 27, 2023

Selection and Ordering Policies for Hiring Pipelines via Linear Programming

Boris Epstein, Will Ma

Motivated by hiring pipelines, we study three selection and ordering problems in which applicants for a finite set of positions must be interviewed or sent offers. There is a finite time budget for interviewing/sending offers, and every interview/offer is followed by a stochastic realization of discovering the applicant's quality or acceptance decision, leading to computationally challenging problems. In the first problem, we study sequential interviewing and show that a computationally tractable, non-adaptive policy that must make offers immediately after interviewing is near-optimal, assuming offers are always accepted. We further show how to use this policy as a subroutine for obtaining a PTAS. In the second problem, we assume that applicants have already been interviewed but only accept offers with some probability; we develop a computationally tractable policy that makes offers for the different positions in parallel, which can be used even if positions are heterogeneous, and is near-optimal relative to a policy that can make the same total number of offers one by one. In the third problem, we introduce a parsimonious model of overbooking where all offers must be sent simultaneously and a linear penalty is incurred for each acceptance beyond the number of positions; we provide nearly tight bounds on the performance of practically motivated value-ordered policies. All in all, our paper takes a unified approach to three different hiring problems, based on linear programming. Our results in the first two problems generalize and improve the existing guarantees due to Purohit et al. (2019) that were between 1/8 and 1/2 to new guarantees that are at least 1-1/e. We also numerically compare three different settings of making offers to candidates (sequentially, in parallel, or simultaneously), providing insight into when a firm should favor each one.

Predictor/Decision function · SSL · Regularization term · Variance · Learning

October 27, 2023

Implicit variance regularization in non-contrastive SSL

Manu Srinath Halvagal, Axel Laborieux, Friedemann Zenke

from arXiv, Accepted at NeurIPS 2023

Non-contrastive SSL methods like BYOL and SimSiam rely on asymmetric predictor networks to avoid representational collapse without negative samples. Yet, how predictor networks facilitate stable learning is not fully understood. While previous theoretical analyses assumed Euclidean losses, most practical implementations rely on cosine similarity. To gain further theoretical insight into non-contrastive SSL, we analytically study learning dynamics in conjunction with Euclidean and cosine similarity in the eigenspace of closed-form linear predictor networks. We show that both avoid collapse through implicit variance regularization albeit through different dynamical mechanisms. Moreover, we find that the eigenvalues act as effective learning rate multipliers and propose a family of isotropic loss functions (IsoLoss) that equalize convergence rates across eigenmodes. Empirically, IsoLoss speeds up the initial learning dynamics and increases robustness, thereby allowing us to dispense with the EMA target network typically used with non-contrastive methods. Our analysis sheds light on the variance regularization mechanisms of non-contrastive SSL and lays the theoretical grounds for crafting novel loss functions that shape the learning dynamics of the predictor's spectrum.
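The difference between the two loss families named in this abstract can be made concrete with a toy linear predictor. The sketch below is only an illustration under assumed conventions (a factor of 1/2 in the Euclidean loss, negative cosine similarity as the cosine loss); it does not implement IsoLoss or the learning-dynamics analysis from the paper, and all names are hypothetical.

```python
import numpy as np

def euclidean_loss(p, z):
    """Half squared Euclidean distance between prediction p and target z."""
    return 0.5 * np.sum((p - z) ** 2)

def cosine_loss(p, z):
    """Negative cosine similarity between prediction p and target z."""
    return -np.dot(p, z) / (np.linalg.norm(p) * np.linalg.norm(z))

z = np.array([1.0, 2.0, 2.0])   # target embedding, treated as a constant
W = 2.0 * np.eye(3)             # hypothetical linear predictor that only rescales
p = W @ z                       # prediction, here simply 2 * z

loss_euclid = euclidean_loss(p, z)   # penalizes the change in scale
loss_cosine = cosine_loss(p, z)      # ignores scale, only direction matters
```

Because the cosine loss is invariant to the norm of the prediction while the Euclidean loss is not, the two objectives can lead to different dynamics even for the same predictor.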

INFORMS · Topic · Natural language processing · Multimedia · Evolutionary computation

September 11, 2021

A Survey on Multi-modal Summarization

Anubhav Jangra, Adam Jatowt, Sriparna Saha, Mohammad Hasanuzzaman

The new era of technology has brought us to the point where it is convenient for people to share their opinions over an abundance of platforms. These platforms have a provision for the users to express themselves in multiple forms of representations, including text, images, videos, and audio. This, however, makes it difficult for users to obtain all the key information about a topic, making the task of automatic multi-modal summarization (MMS) essential. In this paper, we present a comprehensive survey of the existing research in the area of MMS.

Vision · Transformation · Learned · Reducible · Scaling

April 8, 2021

SiT: Self-supervised vIsion Transformer

Sara Atito, Muhammad Awais, Josef Kittler

Self-supervised learning methods are gaining increasing traction in computer vision due to their recent success in reducing the gap with supervised learning. In natural language processing (NLP) self-supervised learning and transformers are already the methods of choice. The recent literature suggests that transformers are also becoming increasingly popular in computer vision. So far, vision transformers have been shown to work well when pretrained either using large-scale supervised data or with some kind of co-supervision, e.g. in terms of a teacher network. These supervised pretrained vision transformers achieve very good results in downstream tasks with minimal changes. In this work we investigate the merits of self-supervised learning for pretraining image/vision transformers and then using them for downstream classification tasks. We propose Self-supervised vIsion Transformers (SiT) and discuss several self-supervised training mechanisms to obtain a pretext model. The architectural flexibility of SiT allows us to use it as an autoencoder and work with multiple self-supervised tasks seamlessly. We show that a pretrained SiT can be finetuned for a downstream classification task on small-scale datasets, consisting of a few thousand images rather than several millions. The proposed approach is evaluated on standard datasets using common protocols. The results demonstrate the strength of the transformers and their suitability for self-supervised learning. We outperformed existing self-supervised learning methods by a large margin. We also observed that SiT is good for few-shot learning and showed that it learns useful representations by simply training a linear classifier on top of the learned features from SiT. Pretraining, finetuning, and evaluation code will be available at https://github.com/Sara-Ahmed/SiT.

Machine Learning · Learned · Judea Pearl · Comprehensibility · AI

November 24, 2019

Causality for Machine Learning

Bernhard Schölkopf

Graphical causal inference as pioneered by Judea Pearl arose from research on artificial intelligence (AI), and for a long time had little connection to the field of machine learning. This article discusses where links have been and should be established, introducing key concepts along the way. It argues that the hard open problems of machine learning and AI are intrinsically related to causality, and explains how the field is beginning to understand them.

entity · Graph · Knowledge graph · MoDELS · Similarity

September 11, 2019

Domain Representation for Knowledge Graph Embedding

Cunxiang Wang, Feiliang Ren, Zhichao Lin, Chenxv Zhao, Tian Xie, Yue Zhang

from arXiv, Accepted by NLPCC 2019

Embedding entities and relations into a continuous multi-dimensional vector space has become the dominant method for knowledge graph embedding in representation learning. However, most existing models fail to represent hierarchical knowledge, such as the similarities and dissimilarities of entities in one domain. We propose to learn domain representations on top of existing knowledge graph embedding models, such that entities that have similar attributes are organized into the same domain. Such hierarchical knowledge of domains can give further evidence in link prediction. Experimental results show that domain embeddings give a significant improvement over the most recent state-of-the-art baseline knowledge graph embedding models.

Object detection · Model evaluation · Learned · Attention mechanism · Networking

April 15, 2019

Reverse Attention for Salient Object Detection

Shuhan Chen, Xiuli Tan, Ben Wang, Xuelong Hu

from arXiv, ECCV 2018

Benefiting from the rapid development of deep learning techniques, salient object detection has recently achieved remarkable progress. However, two major challenges still hinder its application in embedded devices: low-resolution output and heavy model weight. To this end, this paper presents an accurate yet compact deep network for efficient salient object detection. More specifically, given a coarse saliency prediction in the deepest layer, we first employ residual learning to learn side-output residual features for saliency refinement, which can be achieved with very limited convolutional parameters while maintaining accuracy. Secondly, we further propose reverse attention to guide such side-output residual learning in a top-down manner. By erasing the current predicted salient regions from side-output features, the network can eventually explore the missing object parts and details, which results in high resolution and accuracy. Experiments on six benchmark datasets demonstrate that the proposed approach compares favorably against state-of-the-art methods, with advantages in terms of simplicity, efficiency (45 FPS) and model size (81 MB).
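The "erasing" step described in this abstract can be sketched as an elementwise reweighting of the side-output features. The following is a minimal NumPy sketch under stated assumptions: the coarse prediction is given as logits already upsampled to the feature resolution, the reverse weight is taken as one minus the sigmoid of those logits, and the convolutional residual units of the actual network are omitted. Function names and array shapes are hypothetical.

```python
import numpy as np

def sigmoid(v):
    """Logistic function, mapping logits to saliency probabilities."""
    return 1.0 / (1.0 + np.exp(-v))

def reverse_attention(features, coarse_logits):
    """Suppress feature responses where the coarse prediction is already salient.

    features:      array of shape (C, H, W), side-output features
    coarse_logits: array of shape (H, W), coarse saliency logits at the same resolution
    """
    weight = 1.0 - sigmoid(coarse_logits)     # close to 0 on confidently salient pixels
    return features * weight[None, :, :]      # broadcast the weight over channels

# Toy usage: three pixels predicted as background, uncertain, and salient.
features = np.ones((2, 1, 3))
coarse_logits = np.array([[-20.0, 0.0, 20.0]])
erased = reverse_attention(features, coarse_logits)
```

Regions already marked as salient contribute almost nothing after reweighting, so a subsequent residual branch would be pushed towards the object parts the coarse prediction missed.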