
We propose a general framework for machine learning based optimization under uncertainty. Our approach replaces the complex forward model by a surrogate, which is learned simultaneously in a one-shot sense when solving the optimal control problem. Our approach relies on a reformulation of the problem as a penalized empirical risk minimization problem for which we provide a consistency analysis in terms of large data and increasing penalty parameter. To solve the resulting problem, we suggest a stochastic gradient method with adaptive control of the penalty parameter and prove convergence under suitable assumptions on the surrogate model. Numerical experiments illustrate the results for linear and nonlinear surrogate models.
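As a rough illustration of the one-shot principle (updating the surrogate and the control simultaneously while the penalty grows), the following sketch uses a hypothetical linear surrogate, a quadratic tracking cost, and finite-difference gradients; the objective, schedule, and model are our own illustrative assumptions, not the authors' formulation.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical setup: linear surrogate S_theta(u) = theta @ u approximating an
# unknown forward map, with noisy samples (u_i, y_i) as training data.
def surrogate(theta, u):
    return theta @ u

def control_cost(y, u, target):
    return 0.5 * np.sum((y - target) ** 2) + 0.01 * np.sum(u ** 2)

def penalized_objective(theta, u, batch_u, batch_y, target, lam):
    # control objective on the surrogate + penalized empirical risk of the fit
    fit = np.mean([np.sum((surrogate(theta, ui) - yi) ** 2)
                   for ui, yi in zip(batch_u, batch_y)])
    return control_cost(surrogate(theta, u), u, target) + lam * fit

def numerical_grad(f, x, eps=1e-6):
    g = np.zeros_like(x)
    for i in range(x.size):
        e = np.zeros_like(x); e.flat[i] = eps
        g.flat[i] = (f(x + e) - f(x - e)) / (2 * eps)
    return g

dim = 3
theta = rng.normal(size=(dim, dim))      # surrogate parameters
u = np.zeros(dim)                        # control variable
target = np.ones(dim)
true_A = rng.normal(size=(dim, dim))     # stands in for the expensive forward model
data_u = [rng.normal(size=dim) for _ in range(200)]
data_y = [true_A @ ui + 0.01 * rng.normal(size=dim) for ui in data_u]

lam, lr = 1.0, 1e-2
for step in range(2000):
    i = rng.integers(len(data_u))        # stochastic mini-batch of size 1
    bu, by = [data_u[i]], [data_y[i]]
    f_th = lambda th: penalized_objective(th, u, bu, by, target, lam)
    f_u = lambda uu: penalized_objective(theta, uu, bu, by, target, lam)
    theta -= lr * numerical_grad(f_th, theta)   # one-shot: update surrogate ...
    u -= lr * numerical_grad(f_u, u)            # ... and control together
    lam *= 1.001                                # adaptively increase the penalty
```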

Related content

We propose a new algorithm for model-based distributional reinforcement learning (RL), and prove that it is minimax-optimal for approximating return distributions with a generative model (up to logarithmic factors), resolving an open question of Zhang et al. (2023). Our analysis provides new theoretical results on categorical approaches to distributional RL, and also introduces a new distributional Bellman equation, the stochastic categorical CDF Bellman equation, which we expect to be of independent interest. We also provide an experimental study comparing several model-based distributional RL algorithms, with several takeaways for practitioners.
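For context on categorical distributional backups, the sketch below implements the standard C51-style projection of the Bellman target distribution onto a fixed support. It is a generic illustration of the categorical approach, not the stochastic categorical CDF Bellman operator introduced in the paper.

```python
import numpy as np

def categorical_projection(next_probs, rewards, gamma, support):
    """Project the Bellman target r + gamma * z onto a fixed categorical support.

    next_probs: (batch, n_atoms) probabilities of the next-state return distribution
    rewards:    (batch,) immediate rewards
    support:    (n_atoms,) evenly spaced atom locations z_1 < ... < z_K
    """
    v_min, v_max = support[0], support[-1]
    delta_z = support[1] - support[0]
    n_atoms = len(support)
    # Shifted and discounted atoms, clipped back onto the support.
    tz = np.clip(rewards[:, None] + gamma * support[None, :], v_min, v_max)
    b = (tz - v_min) / delta_z                          # fractional atom index
    lower, upper = np.floor(b).astype(int), np.ceil(b).astype(int)
    projected = np.zeros_like(next_probs)
    for i in range(next_probs.shape[0]):
        for j in range(n_atoms):
            if lower[i, j] == upper[i, j]:              # lands exactly on an atom
                projected[i, lower[i, j]] += next_probs[i, j]
            else:                                       # split mass between neighbours
                projected[i, lower[i, j]] += next_probs[i, j] * (upper[i, j] - b[i, j])
                projected[i, upper[i, j]] += next_probs[i, j] * (b[i, j] - lower[i, j])
    return projected

# Example usage with a uniform next-state distribution over 51 atoms.
support = np.linspace(-10.0, 10.0, 51)
probs = np.full((4, 51), 1.0 / 51)
target = categorical_projection(probs, rewards=np.zeros(4), gamma=0.99, support=support)
```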

Anomaly detection requires detecting abnormal samples in large unlabeled datasets. While progress in deep learning and the advent of foundation models have produced powerful zero-shot anomaly detection methods, their deployment in practice is often hindered by the lack of labeled data -- without it, their detection performance cannot be evaluated reliably. In this work, we propose SWSA (Selection With Synthetic Anomalies): a general-purpose framework to select image-based anomaly detectors with a generated synthetic validation set. Our proposed anomaly generation method assumes access to only a small support set of normal images and requires no training or fine-tuning. Once generated, our synthetic validation set is used to create detection tasks that compose a validation framework for model selection. In an empirical study, we find that SWSA often selects models that match selections made with a ground-truth validation set, resulting in higher AUROCs than baseline methods. We also find that SWSA selects prompts for CLIP-based anomaly detection that outperform baseline prompt selection strategies on all datasets, including the challenging MVTec-AD and VisA datasets.
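The selection step itself can be summarized in a few lines: score every candidate detector on the synthetic validation set and keep the one with the highest AUROC. In the sketch below, generate_synthetic_anomalies and the detector scoring interface are hypothetical placeholders; the paper's actual training-free anomaly generation is not reproduced here.

```python
import numpy as np
from sklearn.metrics import roc_auc_score

def select_detector(detectors, normal_support, generate_synthetic_anomalies):
    """Pick the detector with the highest AUROC on a synthetic validation set.

    detectors: dict mapping a name to a callable returning an anomaly score per image.
    normal_support: small set of normal images (the only real data assumed available).
    generate_synthetic_anomalies: hypothetical stand-in for SWSA's anomaly generation.
    """
    synthetic_anomalies = generate_synthetic_anomalies(normal_support)
    images = list(normal_support) + list(synthetic_anomalies)
    labels = np.array([0] * len(normal_support) + [1] * len(synthetic_anomalies))
    best_name, best_auroc = None, -np.inf
    for name, score_fn in detectors.items():
        scores = np.array([score_fn(img) for img in images])
        auroc = roc_auc_score(labels, scores)     # evaluate on synthetic labels
        if auroc > best_auroc:
            best_name, best_auroc = name, auroc
    return best_name, best_auroc
```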

Recent advances in real-world applications of reinforcement learning (RL) have relied on the ability to accurately simulate systems at scale. However, domains such as fluid dynamical systems exhibit complex dynamic phenomena that are hard to simulate at high integration rates, limiting the direct application of modern deep RL algorithms to often expensive or safety-critical hardware. In this work, we introduce "Box o Flows", a novel benchtop experimental control system for systematically evaluating RL algorithms in dynamic real-world scenarios. We describe the key components of the Box o Flows, and through a series of experiments demonstrate how state-of-the-art model-free RL algorithms can synthesize a variety of complex behaviors via simple reward specifications. Furthermore, we explore the role of offline RL in data-efficient hypothesis testing by reusing past experiences. We believe that the insights gained from this preliminary study and the availability of systems like the Box o Flows pave the way for developing systematic RL algorithms that can be generally applied to complex, dynamical systems. Supplementary material and videos of experiments are available at //sites.google.com/view/box-o-flows/home.

The integration of physiological computing into mixed-initiative human-robot interaction systems offers valuable advantages in autonomous task allocation by incorporating real-time features as human state observations into the decision-making system. This approach may alleviate the cognitive load on human operators by intelligently allocating mission tasks between agents. Nevertheless, accommodating a diverse pool of human participants with varying physiological and behavioral measurements presents a substantial challenge. To address this, resorting to a probabilistic framework becomes necessary, given the inherent uncertainty and partial observability of the human's state. Recent research suggests learning a Partially Observable Markov Decision Process (POMDP) model from a data set of previously collected experiences, which can be solved using Offline Reinforcement Learning (ORL) methods. In the present work, we not only highlight the potential of partially observable representations and physiological measurements to improve human operator state estimation and performance, but also enhance the overall mission effectiveness of a human-robot team. Importantly, as the fixed data set may not contain enough information to fully represent complex stochastic processes, we propose a method to incorporate model uncertainty, thus enabling risk-sensitive sequential decision-making. Experiments were conducted with a group of twenty-six human participants within a simulated robot teleoperation environment, yielding empirical evidence of the method's efficacy. The obtained adaptive task allocation policy led to statistically significantly higher scores than the policy used to collect the data set, and generalized across diverse participants while also accounting for risk-sensitive metrics.
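One minimal way to obtain risk-sensitive behavior under model uncertainty, in the spirit of what the abstract describes, is to act on a low quantile of value estimates from an ensemble of learned models rather than on their mean. The ensemble interface and the quantile level below are our own illustrative assumptions, not the paper's method.

```python
import numpy as np

def risk_sensitive_action(belief, actions, q_fns, quantile=0.1):
    """Pick the action maximizing a low quantile of values across an ensemble
    of learned POMDP models (pessimism under model uncertainty).

    q_fns is a list of hypothetical callables, one per ensemble member, each
    mapping (belief, action) to that model's estimated action value.
    """
    best_action, best_value = None, -np.inf
    for a in actions:
        qs = np.array([q(belief, a) for q in q_fns])
        value = np.quantile(qs, quantile)   # risk-sensitive (pessimistic) estimate
        if value > best_value:
            best_action, best_value = a, value
    return best_action
```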

While deep reinforcement learning (RL) has fueled multiple high-profile successes in machine learning, it is held back from more widespread adoption by its often poor data efficiency and the limited generality of the policies it produces. A promising approach for alleviating these limitations is to cast the development of better RL algorithms as a machine learning problem itself in a process called meta-RL. Meta-RL is most commonly studied in a problem setting where, given a distribution of tasks, the goal is to learn a policy that is capable of adapting to any new task from the task distribution with as little data as possible. In this survey, we describe the meta-RL problem setting in detail as well as its major variations. We discuss how, at a high level, meta-RL research can be clustered based on the presence of a task distribution and the learning budget available for each individual task. Using these clusters, we then survey meta-RL algorithms and applications. We conclude by presenting the open problems on the path to making meta-RL part of the standard toolbox for a deep RL practitioner.
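To make the problem setting concrete, the meta-RL objective is often written (in one common formulation; notation ours) as a bilevel optimization over the task distribution:

\[
\max_{\theta}\;\; \mathbb{E}_{\mathcal{M}\sim p(\mathcal{M})}\Big[\,\mathbb{E}\big[\,G(\mathcal{M};\,\pi_{\phi})\,\big]\Big],
\qquad \phi = f_{\theta}(\mathcal{D}_{\mathcal{M}}),
\]

where \(p(\mathcal{M})\) is the task distribution, \(f_{\theta}\) is the learned adaptation procedure applied to the small amount of per-task data \(\mathcal{D}_{\mathcal{M}}\), and \(G(\mathcal{M};\,\pi_{\phi})\) is the return collected in task \(\mathcal{M}\) by the adapted policy \(\pi_{\phi}\).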

Pre-trained Language Models (PLMs), which are trained on large text corpora via self-supervised learning, have yielded promising performance on various tasks in Natural Language Processing (NLP). However, though PLMs with a huge number of parameters can effectively possess rich knowledge learned from massive training text and benefit downstream tasks at the fine-tuning stage, they still have some limitations, such as poor reasoning ability due to the lack of external knowledge. Research has been dedicated to incorporating knowledge into PLMs to tackle these issues. In this paper, we present a comprehensive review of Knowledge-Enhanced Pre-trained Language Models (KE-PLMs) to provide a clear insight into this thriving field. We introduce appropriate taxonomies respectively for Natural Language Understanding (NLU) and Natural Language Generation (NLG) to highlight these two main tasks of NLP. For NLU, we divide the types of knowledge into four categories: linguistic knowledge, text knowledge, knowledge graph (KG), and rule knowledge. The KE-PLMs for NLG are categorized into KG-based and retrieval-based methods. Finally, we point out some promising future directions for KE-PLMs.

Despite its great success, machine learning can have its limits when dealing with insufficient training data. A potential solution is the additional integration of prior knowledge into the training process, which leads to the notion of informed machine learning. In this paper, we present a structured overview of various approaches in this field. We provide a definition and propose a concept for informed machine learning which illustrates its building blocks and distinguishes it from conventional machine learning. We introduce a taxonomy that serves as a classification framework for informed machine learning approaches. It considers the source of knowledge, its representation, and its integration into the machine learning pipeline. Based on this taxonomy, we survey related research and describe how different knowledge representations such as algebraic equations, logic rules, or simulation results can be used in learning systems. This evaluation of numerous papers on the basis of our taxonomy uncovers key methods in the field of informed machine learning.

The notion of uncertainty is of major importance in machine learning and constitutes a key element of machine learning methodology. In line with the statistical tradition, uncertainty has long been perceived as almost synonymous with standard probability and probabilistic predictions. Yet, due to the steadily increasing relevance of machine learning for practical applications and related issues such as safety requirements, new problems and challenges have recently been identified by machine learning scholars, and these problems may call for new methodological developments. In particular, this includes the importance of distinguishing between (at least) two different types of uncertainty, often referred to as aleatoric and epistemic. In this paper, we provide an introduction to the topic of uncertainty in machine learning as well as an overview of previous attempts at handling uncertainty in general and at formalizing this distinction in particular.
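One standard information-theoretic formalization of this distinction for a Bayesian predictive model (not the only one in the literature) decomposes the total predictive uncertainty as

\[
\underbrace{\mathrm{H}\!\Big[\mathbb{E}_{p(\theta\mid\mathcal{D})}\,p(y\mid x,\theta)\Big]}_{\text{total}}
\;=\;
\underbrace{\mathbb{E}_{p(\theta\mid\mathcal{D})}\,\mathrm{H}\big[p(y\mid x,\theta)\big]}_{\text{aleatoric}}
\;+\;
\underbrace{\mathrm{I}\big(y;\theta\mid x,\mathcal{D}\big)}_{\text{epistemic}},
\]

where the mutual-information term vanishes as the posterior over parameters \(\theta\) concentrates; that is, epistemic uncertainty is reducible with more data, whereas aleatoric uncertainty is not.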

Meta-learning extracts the common knowledge acquired from learning different tasks and uses it for unseen tasks. It demonstrates a clear advantage on tasks that have insufficient training data, e.g., few-shot learning. In most meta-learning methods, tasks are implicitly related via the shared model or optimizer. In this paper, we show that a meta-learner that explicitly relates tasks on a graph describing the relations of their output dimensions (e.g., classes) can significantly improve the performance of few-shot learning. This type of graph is usually free or cheap to obtain but has rarely been explored in previous works. We study prototype-based few-shot classification, in which a prototype is generated for each class, such that the nearest-neighbor search between the prototypes produces an accurate classification. We introduce the "Gated Propagation Network (GPN)", which learns to propagate messages between prototypes of different classes on the graph, so that learning the prototype of each class benefits from the data of other related classes. In GPN, an attention mechanism is used for the aggregation of messages from neighboring classes, and a gate is deployed to choose between the aggregated messages and the message from the class itself. GPN is trained on a sequence of tasks from many-shot to few-shot generated by subgraph sampling. During training, it is able to reuse and update previously computed prototypes from memory in a life-long learning cycle. In experiments, we vary the training-test discrepancy and the test task generation settings for thorough evaluation. GPN outperforms recent meta-learning methods on two benchmark datasets in all studied cases.
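A minimal sketch of the propagation step, assuming a single layer, scaled dot-product attention, and a sigmoid gate (all simplifications of ours rather than the authors' exact architecture), could look as follows:

```python
import torch
import torch.nn as nn

class GatedPrototypePropagation(nn.Module):
    """Attention-based message passing between class prototypes with a gate
    that chooses between the aggregated neighbor message and the class itself."""

    def __init__(self, dim):
        super().__init__()
        self.q = nn.Linear(dim, dim)
        self.k = nn.Linear(dim, dim)
        self.v = nn.Linear(dim, dim)
        self.gate = nn.Linear(2 * dim, 1)

    def forward(self, prototypes, adjacency):
        # prototypes: [num_classes, dim]; adjacency: [num_classes, num_classes],
        # assumed to include self-loops so every row has at least one neighbor.
        scores = self.q(prototypes) @ self.k(prototypes).T / prototypes.shape[-1] ** 0.5
        scores = scores.masked_fill(adjacency == 0, float("-inf"))
        attn = torch.softmax(scores, dim=-1)          # attention over neighbor classes
        messages = attn @ self.v(prototypes)          # aggregated neighbor message
        g = torch.sigmoid(self.gate(torch.cat([prototypes, messages], dim=-1)))
        return g * prototypes + (1 - g) * messages    # gated choice: self vs. neighbors

# Example: propagate among 5 class prototypes on a fully connected class graph.
layer = GatedPrototypePropagation(dim=64)
new_prototypes = layer(torch.randn(5, 64), torch.ones(5, 5))
```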

Deep learning (DL) based semantic segmentation methods have been providing state-of-the-art performance in the last few years. More specifically, these techniques have been successfully applied to medical image classification, segmentation, and detection tasks. One deep learning technique, U-Net, has become one of the most popular for these applications. In this paper, we propose a Recurrent Convolutional Neural Network (RCNN) based on U-Net as well as a Recurrent Residual Convolutional Neural Network (RRCNN) based on U-Net, named RU-Net and R2U-Net respectively. The proposed models utilize the power of U-Net, the Residual Network, as well as RCNN. There are several advantages of these proposed architectures for segmentation tasks. First, a residual unit helps when training deep architectures. Second, feature accumulation with recurrent residual convolutional layers ensures better feature representation for segmentation tasks. Third, it allows us to design a better U-Net architecture with the same number of network parameters and better performance for medical image segmentation. The proposed models are tested on three benchmark datasets: blood vessel segmentation in retina images, skin cancer segmentation, and lung lesion segmentation. The experimental results show superior performance on segmentation tasks compared to equivalent models, including U-Net and residual U-Net (ResU-Net).
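A minimal sketch of the recurrent residual unit underlying R2U-Net-style blocks, with the channel projection, number of time steps, and layer composition chosen as illustrative assumptions of ours, might look as follows:

```python
import torch
import torch.nn as nn

class RecurrentConvUnit(nn.Module):
    """A convolution applied recurrently: the feature map is fed back and
    added to the block input for a fixed number of time steps."""

    def __init__(self, channels, t=2):
        super().__init__()
        self.t = t
        self.conv = nn.Sequential(
            nn.Conv2d(channels, channels, 3, padding=1),
            nn.BatchNorm2d(channels),
            nn.ReLU(inplace=True),
        )

    def forward(self, x):
        h = self.conv(x)
        for _ in range(self.t):
            h = self.conv(x + h)      # recurrent feature accumulation
        return h

class RecurrentResidualBlock(nn.Module):
    """Two recurrent conv units wrapped in a residual connection, in the spirit
    of the R2U-Net building block."""

    def __init__(self, in_channels, out_channels, t=2):
        super().__init__()
        self.project = nn.Conv2d(in_channels, out_channels, 1)   # channel match
        self.body = nn.Sequential(
            RecurrentConvUnit(out_channels, t),
            RecurrentConvUnit(out_channels, t),
        )

    def forward(self, x):
        x = self.project(x)
        return x + self.body(x)       # residual unit eases training of deep stacks
```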
