
Communities on GitHub often use issue labels as a way of triaging issues by assigning them priority ratings based on how urgently they should be addressed. The labels used are determined by the repository contributors and not standardised by GitHub. This makes priority-related reasoning across repositories difficult for both researchers and contributors. Previous work shows interest in how issues are labelled and what the consequences of those labels are. For instance, some previous work has used clustering models and natural language processing to categorise labels without a particular emphasis on priority. With this publication, we introduce a unique data set of 812 manually categorised labels pertaining to priority, normalised and ranked as low-, medium-, or high-priority. To provide an example of how this data set could be used, we have created a tool for GitHub contributors that will create a list of the highest priority issues from the repositories to which they contribute. We have released the data set and the tool for anyone to use on Zenodo because we hope that this will help the open source community address high-priority issues more effectively and inspire other uses.
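The idea of normalising repository-specific labels into low, medium, or high priority and then listing the most urgent issues can be sketched as follows. This is only an illustration under stated assumptions: the label strings in `EXAMPLE_LABELS`, the issue dictionaries, and the function names are hypothetical and are not taken from the released data set or tool.

```python
from typing import Optional

# Rank of each normalised priority level (higher means more urgent).
PRIORITY_RANK = {"low": 1, "medium": 2, "high": 3}

# Hypothetical excerpt of a label-to-priority mapping; the real data set
# contains 812 manually categorised labels, none of which are shown here.
EXAMPLE_LABELS = {
    "priority: low": "low",
    "p2": "medium",
    "priority/critical": "high",
    "urgent": "high",
}


def normalise(label: str) -> Optional[str]:
    """Map a raw label to 'low', 'medium', or 'high', or None if unknown."""
    return EXAMPLE_LABELS.get(label.strip().lower())


def highest_priority(issues):
    """Return issues with a known priority label, most urgent first."""
    ranked = []
    for issue in issues:
        levels = [normalise(label) for label in issue["labels"]]
        ranks = [PRIORITY_RANK[level] for level in levels if level is not None]
        if ranks:
            # An issue with several priority labels takes its most urgent one.
            ranked.append((max(ranks), issue))
    ranked.sort(key=lambda pair: pair[0], reverse=True)
    return [issue for _, issue in ranked]
```

Issues without any recognised priority label are dropped rather than guessed at, which keeps the resulting list limited to issues whose priority is actually known.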

Related content

Facebook AI Research · Representation · Learning · motivation · state-of-the-art

24 June 2024

Learning Interpretable Fair Representations

Tianhao Wang, Zana Buçinca, Zilin Ma

Numerous approaches have been recently proposed for learning fair representations that mitigate unfair outcomes in prediction tasks. A key motivation for these methods is that the representations can be used by third parties with unknown objectives. However, because current fair representations are generally not interpretable, the third party cannot use these fair representations for exploration, or to obtain any additional insights, besides the pre-contracted prediction tasks. Thus, to increase data utility beyond prediction tasks, we argue that the representations need to be fair, yet interpretable. We propose a general framework for learning interpretable fair representations by introducing an interpretable "prior knowledge" during the representation learning process. We implement this idea and conduct experiments with ColorMNIST and Dsprite datasets. The results indicate that in addition to being interpretable, our representations attain slightly higher accuracy and fairer outcomes in a downstream classification task compared to state-of-the-art fair representations.

Data selection · Learning · Criterion · MoDELS · Bandits

24 June 2024

Towards Bayesian Data Selection

Julian Rodemann

from arxiv, 5th Workshop on Data-Centric Machine Learning Research (DMLR) at ICML 2024

A wide range of machine learning algorithms iteratively add data to the training sample. Examples include semi-supervised learning, active learning, multi-armed bandits, and Bayesian optimization. We embed this kind of data addition into decision theory by framing data selection as a decision problem. This paves the way for finding Bayes-optimal selections of data. For the illustrative case of self-training in semi-supervised learning, we derive the respective Bayes criterion. We further show that deploying this criterion mitigates the issue of confirmation bias by empirically assessing our method for generalized linear models, semi-parametric generalized additive models, and Bayesian neural networks on simulated and real-world data.

Agent · SimPLe · CASE · Approximation · Scenario

24 June 2024

Simple Delegated Choice

Ali Khodabakhsh, Emmanouil Pountourakis, Samuel Taggart

This paper studies delegation in a model of discrete choice. In the delegation problem, an uninformed principal must consult an informed agent to make a decision. Both the agent and principal have preferences over the decided-upon action which vary based on the state of the world, and which may not be aligned. The principal may commit to a mechanism, which maps reports of the agent to actions. When this mechanism is deterministic, it can take the form of a menu of actions, from which the agent simply chooses upon observing the state. In this case, the principal is said to have delegated the choice of action to the agent. We consider a setting where the decision being delegated is a choice of a utility-maximizing action from a set of several options. We assume the shared portion of the agent's and principal's utilities is drawn from a distribution known to the principal, and that utility misalignment takes the form of a known bias for or against each action. We provide tight approximation analyses for simple threshold policies under three increasingly general sets of assumptions. With independently-distributed utilities, we prove a $3$-approximation. When the agent has an outside option the principal cannot rule out, the constant approximation fails, but we prove a $\log \rho/\log\log \rho$-approximation, where $\rho$ is the ratio of the maximum value to the optimal utility. We also give a weaker but tight bound that holds for correlated values, and complement our upper bounds with hardness results. One special case of our model is utility-based assortment optimization, for which our results are new.

Functional · Latent · Representation · MoDELS · Learning

21 June 2024

Latent Functional Maps

Marco Fumero, Marco Pegoraro, Valentino Maiorca, Francesco Locatello, Emanuele Rodolà

Neural models learn data representations that lie on low-dimensional manifolds, yet modeling the relation between these representational spaces is an ongoing challenge. By integrating spectral geometry principles into neural modeling, we show that this problem can be better addressed in the functional domain, mitigating complexity while enhancing interpretability and performance on downstream tasks. To this end, we introduce a multi-purpose framework to the representation learning community, which makes it possible to: (i) compare different spaces in an interpretable way and measure their intrinsic similarity; (ii) find correspondences between them, in both unsupervised and weakly supervised settings; and (iii) effectively transfer representations between distinct spaces. We validate our framework on various applications, ranging from stitching to retrieval tasks, demonstrating that latent functional maps can serve as a Swiss Army knife for representation alignment.

Feature selection · Facebook AI Research · Stream · Processing (programming language) · Scenario

20 June 2024

Fair Streaming Feature Selection

Zhangling Duan, Tianci Li, Xingyu Wu, Zhaolong Ling, Jingye Yang, Zhaohong Jia

from arxiv, 30 pages, 10 figures

Streaming feature selection techniques have become essential in processing real-time data streams, as they facilitate the identification of the most relevant attributes from continuously updating information. Despite their performance, current streaming feature selection algorithms frequently fall short in managing biases and avoiding discrimination that could be perpetuated by sensitive attributes, potentially leading to unfair outcomes in the resulting models. To address this issue, we propose FairSFS, a novel algorithm for Fair Streaming Feature Selection, to uphold fairness in the feature selection process without compromising the ability to handle data in an online manner. FairSFS adapts to incoming feature vectors by dynamically adjusting the feature set and discerns the correlations between classification attributes and sensitive attributes from this revised set, thereby forestalling the propagation of sensitive data. Empirical evaluations show that FairSFS not only maintains accuracy that is on par with leading streaming feature selection methods and existing fair feature techniques but also significantly improves fairness metrics.

MoDELS · Extensibility · Performer · binary · Ring

20 June 2024

In Tree Structure Should Sentence Be Generated

Yaguang Li, Xin Chen

Generative models reliant on sequential autoregression have been at the forefront of language generation for an extensive period, particularly following the introduction of widely acclaimed transformers. Despite their excellent performance, these models still face issues such as hallucinations and getting trapped in a logic loop. To enhance the performance of existing systems, this paper introduces a new method for generating sequences in natural language, which involves generating the targeted sentence in a tree-traversing order. The paper includes an illustration of the theoretical basis and validity of the approach, as well as a comparison of its fundamentals with the diffusion model in graphic generation. Finally, a module called SenTree is introduced for generating an approximating binary tree. It is already available at https://github.com/arklyg/sentree. Additionally, a joint training framework based on this approach is proposed, incorporating the intrinsics of generative adversarial networks.

Data selection · Learning · Criterion · MoDELS · Bandits

18 June 2024

Bayesian Data Selection

Julian Rodemann

from arxiv, 5th Workshop on Data-Centric Machine Learning Research (DMLR) at ICML 2024

MoDELS · Performer · ASSETS · Identifiable · Use Case

17 June 2024

Task Me Anything

Jieyu Zhang, Weikai Huang, Zixian Ma, Oscar Michel, Dong He, Tanmay Gupta, Wei-Chiu Ma, Ali Farhadi, Aniruddha Kembhavi, Ranjay Krishna

from arxiv, website: https://www.task-me-anything.org

Benchmarks for large multimodal language models (MLMs) now serve to simultaneously assess the general capabilities of models instead of evaluating for a specific capability. As a result, when a developer wants to identify which models to use for their application, they are overwhelmed by the number of benchmarks and remain uncertain about which benchmark's results are most reflective of their specific use case. This paper introduces Task-Me-Anything, a benchmark generation engine which produces a benchmark tailored to a user's needs. Task-Me-Anything maintains an extendable taxonomy of visual assets and can programmatically generate a vast number of task instances. Additionally, it algorithmically addresses user queries regarding MLM performance efficiently within a computational budget. It contains 113K images, 10K videos, 2K 3D object assets, over 365 object categories, 655 attributes, and 335 relationships. It can generate 750M image/video question-answering pairs, which focus on evaluating MLM perceptual capabilities. Task-Me-Anything reveals critical insights: open-source MLMs excel in object and attribute recognition but lack spatial and temporal understanding; each model exhibits unique strengths and weaknesses; larger models generally perform better, though exceptions exist; and GPT4o demonstrates challenges in recognizing rotating/moving objects and distinguishing colors.

Denoising · Processing (programming language) · MoDELS · Noise · Guidance

17 June 2024

Denoising Diffusion Recommender Model

Jujia Zhao, Wenjie Wang, Yiyan Xu, Teng Sun, Fuli Feng, Tat-Seng Chua

from arxiv, Accepted by SIGIR 2024

Recommender systems often grapple with noisy implicit feedback. Most studies alleviate the noise issue from a data cleaning perspective, using techniques such as data resampling and reweighting, but they are constrained by heuristic assumptions. Another denoising avenue takes a model perspective: it proactively injects noise into user-item interactions and enhances the intrinsic denoising ability of models. However, this kind of denoising process poses significant challenges to the recommender model's representation capacity to capture noise patterns. To address this issue, we propose the Denoising Diffusion Recommender Model (DDRM), which leverages the multi-step denoising process of diffusion models to robustify user and item embeddings from any recommender model. DDRM injects controlled Gaussian noise in the forward process and iteratively removes noise in the reverse denoising process, thereby improving embedding robustness against noisy feedback. To achieve this, the key lies in offering appropriate guidance to steer the reverse denoising process and providing a proper starting point for the forward-reverse process during inference. In particular, we propose a dedicated denoising module that encodes collaborative information as denoising guidance. In addition, in the inference stage, DDRM uses the average embedding of a user's historically liked items as the starting point rather than pure noise, since pure noise lacks personalization, which increases the difficulty of the denoising process. Extensive experiments on three datasets with three representative backend recommender models demonstrate the effectiveness of DDRM.
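To make the forward noising step and the inference starting point more concrete, here is a minimal sketch. It assumes the standard closed-form forward step of denoising diffusion models, x_t = sqrt(alpha_bar_t) * x_0 + sqrt(1 - alpha_bar_t) * eps; the abstract does not state DDRM's exact noise schedule or guidance module, so the function names, the `alpha_bar_t` parameter, and the plain-list embeddings are all assumptions for illustration.

```python
import math
import random


def forward_noise(x0, alpha_bar_t, rng):
    """Noise an embedding with the standard diffusion forward step.

    Assumed form: x_t = sqrt(alpha_bar_t) * x0 + sqrt(1 - alpha_bar_t) * eps,
    with eps drawn from a standard Gaussian. alpha_bar_t lies in [0, 1];
    values near 1 keep most of the signal, values near 0 give mostly noise.
    """
    scale_signal = math.sqrt(alpha_bar_t)
    scale_noise = math.sqrt(1.0 - alpha_bar_t)
    return [scale_signal * v + scale_noise * rng.gauss(0.0, 1.0) for v in x0]


def average_embedding(item_embeddings):
    """Element-wise mean of a user's liked-item embeddings.

    The abstract describes using this average, rather than pure noise,
    as the starting point at inference time.
    """
    n = len(item_embeddings)
    dim = len(item_embeddings[0])
    return [sum(e[d] for e in item_embeddings) / n for d in range(dim)]


# Usage: start from the personalised average, then apply a noising step.
rng = random.Random(0)
start = average_embedding([[1.0, 2.0], [3.0, 4.0]])
noised = forward_noise(start, alpha_bar_t=0.9, rng=rng)
```

The reverse denoising process, which the paper steers with collaborative information, is deliberately left out because the abstract gives no detail about how it is computed.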

Convolutional neural networks · Neural Networks · Knowledge representation · Networking · Convolution

14 February 2018

Interpretable Convolutional Neural Networks

Quanshi Zhang, Ying Nian Wu, Song-Chun Zhu

from arxiv, In this version, we release the website of the code. Compared to the previous version, we have corrected all values of location instability in Table 3--6 by dividing the values by sqrt(2), i.e., a=a/sqrt(2). Such revisions do NOT decrease the significance of the superior performance of our method, because we make the same correction to location-instability values of all baselines

This paper proposes a method to modify traditional convolutional neural networks (CNNs) into interpretable CNNs, in order to clarify knowledge representations in high conv-layers of CNNs. In an interpretable CNN, each filter in a high conv-layer represents a certain object part. We do not need any annotations of object parts or textures to supervise the learning process. Instead, the interpretable CNN automatically assigns an object part to each filter in a high conv-layer during the learning process. Our method can be applied to different types of CNNs with different structures. The clear knowledge representation in an interpretable CNN can help people understand the logic inside a CNN, i.e., the patterns on which the CNN bases its decisions. Experiments showed that filters in an interpretable CNN were more semantically meaningful than those in traditional CNNs.