We consider the challenge of AI value alignment with multiple individuals that have different reward functions and optimal policies in an underlying Markov decision process. We formalize this problem as one of policy aggregation, where the goal is to identify a desirable collective policy. We argue that an approach informed by social choice theory is especially suitable. Our key insight is that social choice methods can be reinterpreted by identifying ordinal preferences with volumes of subsets of the state-action occupancy polytope. Building on this insight, we demonstrate that a variety of methods--including approval voting, Borda count, the proportional veto core, and quantile fairness--can be practically applied to policy aggregation.
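As a point of reference for the voting rules named above, here is a minimal sketch of the classic Borda count over a finite list of candidate policies. It is not the volume-based reinterpretation over the state-action occupancy polytope that the abstract describes; the policy names `pi_a`, `pi_b`, `pi_c` and the three rankings are hypothetical.

```python
from collections import defaultdict


def borda_count(rankings):
    """Return total Borda scores. Each ranking lists candidates best-first."""
    scores = defaultdict(int)
    for ranking in rankings:
        m = len(ranking)
        for position, candidate in enumerate(ranking):
            # The top choice earns m - 1 points, the last choice earns 0.
            scores[candidate] += m - 1 - position
    return dict(scores)


def borda_winner(rankings):
    """Pick the highest-scoring candidate; ties are broken by name."""
    scores = borda_count(rankings)
    return min(scores, key=lambda c: (-scores[c], c))


# Hypothetical ordinal preferences of three individuals over three policies.
rankings = [
    ["pi_a", "pi_b", "pi_c"],
    ["pi_b", "pi_a", "pi_c"],
    ["pi_b", "pi_c", "pi_a"],
]

print(borda_count(rankings))
print(borda_winner(rankings))
```

In the continuous setting of the abstract the set of policies is infinite, which is why ranks have to be replaced by something like volumes; the discrete version only illustrates the scoring rule itself.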

Related content

Diversity · MOC · Samples · Processing (programming language) · Language modeling

December 18, 2024

Generating Diverse Hypotheses for Inductive Reasoning

Kang-il Lee,Hyukhun Koh,Dongryeol Lee,Seunghyun Yoon,Minsung Kim,Kyomin Jung

from arxiv, 14 pages

Inductive reasoning - the process of inferring general rules from a small number of observations - is a fundamental aspect of human intelligence. Recent works suggest that large language models (LLMs) can engage in inductive reasoning by sampling multiple hypotheses about the rules and selecting the one that best explains the observations. However, due to the IID sampling, semantically redundant hypotheses are frequently generated, leading to significant wastage of compute. In this paper, we 1) demonstrate that increasing the temperature to enhance diversity is limited by the text degeneration issue, and 2) propose a novel method to improve diversity while maintaining text quality. We first analyze the effect of increasing the temperature parameter, which is regarded as the LLM's diversity control, on IID hypotheses. Our analysis shows that as temperature rises, the diversity and accuracy of hypotheses increase up to a certain point, but this trend saturates due to text degeneration. To generate hypotheses that are more semantically diverse and of higher quality, we propose a novel approach inspired by human inductive reasoning, which we call Mixture of Concepts (MoC). When applied to several inductive reasoning benchmarks, MoC demonstrated significant performance improvements compared to standard IID sampling and other approaches.
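To make the temperature parameter concrete, the following is a minimal sketch of temperature-scaled softmax over a toy set of next-token logits. The logits are made up for illustration and the code is not taken from the paper; it only shows that a higher temperature flattens the distribution (higher entropy), which is the diversity knob the abstract refers to.

```python
import math


def softmax_with_temperature(logits, temperature):
    """Convert logits to probabilities after dividing by the temperature."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the maximum for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def entropy(probs):
    """Shannon entropy in nats; larger values mean a flatter distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0)


# Hypothetical logits for three candidate tokens.
logits = [2.0, 1.0, 0.1]
for t in (0.5, 1.0, 2.0):
    probs = softmax_with_temperature(logits, t)
    print(t, [round(p, 3) for p in probs], round(entropy(probs), 3))
```

The flatter distribution at high temperature also gives low-probability tokens more mass, which is one intuition for the degeneration effect described above.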

Annotation · MoDELS · Performer · Machine Learning · Continuity

December 17, 2024

Sequential Harmful Shift Detection Without Labels

Salim I. Amoukou,Tom Bewley,Saumitra Mishra,Freddy Lecue,Daniele Magazzeni,Manuela Veloso

from arxiv, Accepted at the 38th Conference on Neural Information Processing Systems (NeurIPS 2024)

We introduce a novel approach for detecting distribution shifts that negatively impact the performance of machine learning models in continuous production environments, which requires no access to ground truth data labels. It builds upon the work of Podkopaev and Ramdas [2022], who address scenarios where labels are available for tracking model errors over time. Our solution extends this framework to work in the absence of labels, by employing a proxy for the true error. This proxy is derived using the predictions of a trained error estimator. Experiments show that our method has high power and false alarm control under various distribution shifts, including covariate and label shifts and natural shifts over geography and time.

Functional · Scenario · Normalized · Serialization · Loss

December 16, 2024

Quantifying Inefficiency

Yannai A. Gonczarowski,Ella Segev

We axiomatically define a cardinal social inefficiency function, which, given a set of alternatives and individuals' vNM preferences over the alternatives, assigns a unique number -- the social inefficiency -- to each alternative. These numbers -- and not only their order -- are uniquely defined by our axioms despite no exogenously given interpersonal comparison, outside option, or disagreement point. We interpret these numbers as per capita losses in endogenously normalized utility. We apply our social inefficiency function to a setting in which interpersonal comparison is notoriously hard to justify -- object allocation without money -- leveraging techniques from computer science to prove an approximate-efficiency result for the Random Serial Dictatorship mechanism.

Optimizer · MoDELS · Functional · GPS · Power method

December 16, 2024

Simulation Based Bayesian Optimization

Roi Naveiro,Becky Tang

Bayesian Optimization (BO) is a powerful method for optimizing black-box functions by combining prior knowledge with ongoing function evaluations. BO constructs a probabilistic surrogate model of the objective function given the covariates, which is in turn used to inform the selection of future evaluation points through an acquisition function. For smooth continuous search spaces, Gaussian Processes (GPs) are commonly used as the surrogate model as they offer analytical access to posterior predictive distributions, thus facilitating the computation and optimization of acquisition functions. However, in complex scenarios involving optimization over categorical or mixed covariate spaces, GPs may not be ideal. This paper introduces Simulation Based Bayesian Optimization (SBBO) as a novel approach to optimizing acquisition functions that only requires sampling-based access to posterior predictive distributions. SBBO allows the use of surrogate probabilistic models tailored for combinatorial spaces with discrete variables. Any Bayesian model in which posterior inference is carried out through Markov chain Monte Carlo can be selected as the surrogate model in SBBO. We demonstrate empirically the effectiveness of SBBO using various choices of surrogate models in applications involving combinatorial optimization.
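To illustrate what "sampling-based access" to a posterior predictive distribution can buy, here is a minimal sketch that estimates the expected-improvement acquisition function by plain Monte Carlo and picks the best of a few categorical candidates by enumeration. This is not the SBBO algorithm itself. The toy posterior (a Gaussian per category), the candidate names, and the minimization convention are all assumptions made for the example.

```python
import random


def mc_expected_improvement(samples, best_so_far):
    """Monte Carlo estimate of E[max(best_so_far - y, 0)] (minimization)."""
    return sum(max(best_so_far - y, 0.0) for y in samples) / len(samples)


def pick_next(candidates, sample_predictive, best_so_far, n_samples=2000):
    """Score each candidate using only draws from its posterior predictive."""
    scores = {}
    for x in candidates:
        draws = [sample_predictive(x) for _ in range(n_samples)]
        scores[x] = mc_expected_improvement(draws, best_so_far)
    return max(scores, key=scores.get), scores


# Hypothetical surrogate: a fixed Gaussian predictive per categorical input.
rng = random.Random(0)
toy_mean = {"red": 1.0, "green": 0.2, "blue": 0.8}
toy_sd = 0.1


def sample_predictive(x):
    """Stand-in for a draw from an MCMC-fitted surrogate model."""
    return rng.gauss(toy_mean[x], toy_sd)


best_x, scores = pick_next(list(toy_mean), sample_predictive, best_so_far=0.5)
print(best_x, {k: round(v, 3) for k, v in scores.items()})
```

Enumeration is only feasible for tiny candidate sets; for large combinatorial spaces the search over candidates is the hard part that the abstract's method addresses.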

Scenario · Similarity · ForCES · Separated · Projection

December 15, 2024

The Relational Quotient Completion

Francesco Dagnino,Fabio Pasquali

Taking a quotient roughly means changing the notion of equality on a given object, set or type. In a quantitative setting, equality naturally generalises to a distance, measuring how much elements are similar instead of just stating their equivalence. Hence, quotients can be understood quantitatively as a change of distance. In this paper, we show how, combining Lawvere's doctrines and the calculus of relations, one can unify quantitative and usual quotients in a common picture. More in detail, we introduce relational doctrines as a functorial description of (the core of) the calculus of relations. Then, we define quotients and a universal construction adding them to any relational doctrine, generalising the quotient completion of existential elementary doctrine and also recovering many quantitative examples. This construction deals with an intensional notion of quotient and breaks extensional equality of morphisms. Then, we describe another construction forcing extensionality, showing how it abstracts several notions of separation in metric and topological structures. Combining these two constructions, we get the extensional quotient completion, whose essential image is characterized through the notion of projective cover. As an application, we show that, under suitable conditions, relational doctrines of algebras arise as the extensional quotient completion of free algebras. Finally, we compare relational doctrines to other categorical structures where one can model the calculus of relations.

from arxiv, 12 pages, 3 figures

We investigate the hiring problem where a sequence of applicants is sequentially interviewed, and a decision on whether to hire an applicant is immediately made based on the applicant's score. For the maximal and average improvement strategies, the decision depends on the applicant's score and the scores of all employees, i.e., previous successful applicants. For local improvement strategies, an interviewing committee randomly chosen for each applicant makes the decision depending on the score of the applicant and the scores of the members of the committee. These idealized hiring strategies capture the challenges of decision-making under uncertainty. We probe the average score of the best employee, the probability of hiring all first $N$ applicants, the fraction of superior companies in which, throughout the evolution, every hired applicant has a score above expected, etc.

Learning · Processing (programming language) · MoDELS · Decomposed · Representation learning

November 21, 2022

Disentangled Representation Learning

Xin Wang,Hong Chen,Si'ao Tang,Zihao Wu,Wenwu Zhu

from arxiv, 22 pages,9 figures

Disentangled Representation Learning (DRL) aims to learn a model capable of identifying and disentangling the underlying factors hidden in the observable data in representation form. The process of separating underlying factors of variation into variables with semantic meaning benefits the learning of explainable representations of data, which imitates the meaningful understanding process of humans when observing an object or relation. As a general learning strategy, DRL has demonstrated its power in improving model explainability, controllability, robustness, and generalization capacity in a wide range of scenarios such as computer vision, natural language processing, data mining, etc. In this article, we comprehensively review DRL from various aspects including motivations, definitions, methodologies, evaluations, applications and model designs. We discuss works on DRL based on two well-recognized definitions, i.e., Intuitive Definition and Group Theory Definition. We further categorize the methodologies for DRL into four groups, i.e., Traditional Statistical Approaches, Variational Auto-encoder Based Approaches, Generative Adversarial Networks Based Approaches, Hierarchical Approaches and Other Approaches. We also analyze principles to design different DRL models that may benefit different tasks in practical applications. Finally, we point out challenges in DRL as well as potential research directions deserving future investigations. We believe this work may provide insights for promoting the DRL research in the community.

Multimodal · Modality · INFORMS · MoDELS · Reducible

June 30, 2021

Attention Bottlenecks for Multimodal Fusion

Arsha Nagrani,Shan Yang,Anurag Arnab,Aren Jansen,Cordelia Schmid,Chen Sun

Humans perceive the world by concurrently processing and fusing high-dimensional inputs from multiple modalities such as vision and audio. Machine perception models, in stark contrast, are typically modality-specific and optimised for unimodal benchmarks, and hence late-stage fusion of final representations or predictions from each modality ('late-fusion') is still a dominant paradigm for multimodal video classification. Instead, we introduce a novel transformer based architecture that uses 'fusion bottlenecks' for modality fusion at multiple layers. Compared to traditional pairwise self-attention, our model forces information between different modalities to pass through a small number of bottleneck latents, requiring the model to collate and condense the most relevant information in each modality and only share what is necessary. We find that such a strategy improves fusion performance, at the same time reducing computational cost. We conduct thorough ablation studies, and achieve state-of-the-art results on multiple audio-visual classification benchmarks including Audioset, Epic-Kitchens and VGGSound. All code and models will be released.

Confidence · MoDELS · Extensibility · Graph · entity

February 26, 2019

Embedding Uncertain Knowledge Graphs

Xuelu Chen,Muhao Chen,Weijia Shi,Yizhou Sun,Carlo Zaniolo

Embedding models for deterministic Knowledge Graphs (KG) have been extensively studied, with the purpose of capturing latent semantic relations between entities and incorporating the structured knowledge into machine learning. However, there are many KGs that model uncertain knowledge, which typically model the inherent uncertainty of relation facts with a confidence score, and embedding such uncertain knowledge represents an unresolved challenge. The capturing of uncertain knowledge will benefit many knowledge-driven applications such as question answering and semantic search by providing more natural characterization of the knowledge. In this paper, we propose a novel uncertain KG embedding model UKGE, which aims to preserve both structural and uncertainty information of relation facts in the embedding space. Unlike previous models that characterize relation facts with binary classification techniques, UKGE learns embeddings according to the confidence scores of uncertain relation facts. To further enhance the precision of UKGE, we also introduce probabilistic soft logic to infer confidence scores for unseen relation facts during training. We propose and evaluate two variants of UKGE based on different learning objectives. Experiments are conducted on three real-world uncertain KGs via three tasks, i.e., confidence prediction, relation fact ranking, and relation fact classification. UKGE shows effectiveness in capturing uncertain knowledge, achieving promising results on these tasks and consistently outperforming the baselines.

Performer · Deep reinforcement learning · Learned · entity · Reinforcement learning

June 28, 2018

Relational Deep Reinforcement Learning

Vinicius Zambaldi,David Raposo,Adam Santoro,Victor Bapst,Yujia Li,Igor Babuschkin,Karl Tuyls,David Reichert,Timothy Lillicrap,Edward Lockhart,Murray Shanahan,Victoria Langston,Razvan Pascanu,Matthew Botvinick,Oriol Vinyals,Peter Battaglia

We introduce an approach for deep reinforcement learning (RL) that improves upon the efficiency, generalization capacity, and interpretability of conventional approaches through structured perception and relational reasoning. It uses self-attention to iteratively reason about the relations between entities in a scene and to guide a model-free policy. Our results show that in a novel navigation and planning task called Box-World, our agent finds interpretable solutions that improve upon baselines in terms of sample complexity, ability to generalize to more complex scenes than experienced during training, and overall performance. In the StarCraft II Learning Environment, our agent achieves state-of-the-art performance on six mini-games -- surpassing human grandmaster performance on four. By considering architectural inductive biases, our work opens new directions for overcoming important, but stubborn, challenges in deep RL.