
User interaction data is an important source of supervision in counterfactual learning to rank (CLTR). Such data suffers from presentation bias. Much work in unbiased learning to rank (ULTR) focuses on position bias, i.e., items at higher ranks are more likely to be examined and clicked. Inter-item dependencies also influence examination probabilities, with outlier items in a ranking as an important example. Outliers are defined as items that observably deviate from the rest and therefore stand out in the ranking. In this paper, we identify and introduce the bias brought about by outlier items: users tend to click more on outlier items and their close neighbors. To this end, we first conduct a controlled experiment to study the effect of outliers on user clicks. Next, to examine whether the findings from our controlled experiment generalize to naturalistic situations, we explore real-world click logs from an e-commerce platform. We show that, in both scenarios, users tend to click significantly more on outlier items than on non-outlier items in the same rankings. We show that this tendency holds for all positions, i.e., for any specific position, an item receives more interactions when presented as an outlier as opposed to a non-outlier item. We conclude from our analysis that the effect of outliers on clicks is a type of bias that should be addressed in ULTR. We therefore propose an outlier-aware click model that accounts for both outlier and position bias, called the outlier-aware position-based model (OPBM). We estimate click propensities based on OPBM; through extensive experiments performed on both real-world e-commerce data and semi-synthetic data, we verify the effectiveness of our outlier-aware click model. Our results show the superiority of OPBM over baselines in terms of ranking performance and true relevance estimation.
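
The abstract does not give the functional form of OPBM, so the following is only a minimal sketch of the general idea, under stated assumptions: a position-based model in which a click requires both examination and relevance, with an examination propensity that depends on the pair (position, is_outlier) rather than on position alone. The propensity values, the log, and all function names are hypothetical, and the inverse-propensity estimator is a standard ULTR technique, not necessarily the one used in the paper.

```python
def click_probability(examination, relevance):
    """Position-based assumption: P(click) = P(examined) * P(relevant)."""
    return examination * relevance


def ips_relevance_estimate(clicks, propensities):
    """Inverse-propensity-scored estimate of an item's relevance.

    Each click is divided by the probability that the item was examined
    in that impression, which corrects for unequal exposure.
    """
    return sum(c / p for c, p in zip(clicks, propensities)) / len(clicks)


# Hypothetical examination propensities keyed by (position, is_outlier).
# The outlier entries are larger, mirroring the reported tendency of
# outlier items to attract more attention at the same position.
propensity = {
    (0, False): 0.9, (0, True): 0.95,
    (1, False): 0.5, (1, True): 0.7,
}

# Hypothetical impressions of one item: (position, is_outlier, clicked).
log = [(0, False, 1), (1, True, 0), (1, False, 0), (1, True, 1)]

clicks = [clicked for _, _, clicked in log]
props = [propensity[(pos, outlier)] for pos, outlier, _ in log]
estimate = ips_relevance_estimate(clicks, props)
```

If the propensities ignored the outlier flag, the click at position 1 would be divided by 0.5 instead of 0.7, and the item's relevance would be overestimated in this sketch.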

Related content

Outliers


Performer · Estimation/Estimator · Target domain · Identifiable · Benchmark

June 13, 2023

DAPPER: Label-Free Performance Estimation after Personalization for Heterogeneous Mobile Sensing

Taesik Gong,Yewon Kim,Adiba Orzikulova,Yunxin Liu,Sung Ju Hwang,Jinwoo Shin,Sung-Ju Lee

from arxiv, Accepted to Proceedings of the ACM on Interactive, Mobile, Wearable and Ubiquitous Technologies (IMWUT), 2023

Many applications utilize sensors in mobile devices and machine learning to provide novel services. However, various factors such as different users, devices, and environments impact the performance of such applications, thus making the domain shift (i.e., distributional shift between the training domain and the target domain) a critical issue in mobile sensing. Despite attempts in domain adaptation to solve this challenging problem, their performance is unreliable due to the complex interplay among diverse factors. In principle, the performance uncertainty can be identified and redeemed by performance validation with ground-truth labels. However, it is infeasible for every user to collect high-quality, sufficient labeled data. To address the issue, we present DAPPER (Domain AdaPtation Performance EstimatoR) that estimates the adaptation performance in a target domain with only unlabeled target data. Our key idea is to approximate the model performance based on the mutual information between the model inputs and corresponding outputs. Our evaluation with four real-world sensing datasets compared against six baselines shows that on average, DAPPER outperforms the state-of-the-art baseline by 39.8% in estimation accuracy. Moreover, our on-device experiment shows that DAPPER achieves up to 396X less computation overhead compared with the baselines.
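
The abstract states only that performance is approximated from the mutual information between model inputs and outputs; it does not say how that quantity is computed. As a rough illustration, one common label-free estimator works on predicted class probabilities: the entropy of the average prediction minus the average entropy of individual predictions. The sketch below assumes that estimator and hypothetical function names; it is not presented as DAPPER's actual procedure.

```python
import math


def entropy(p):
    """Shannon entropy (in nats) of a discrete distribution."""
    return -sum(q * math.log(q) for q in p if q > 0)


def mutual_information(probs):
    """Estimate I(X; Y) from per-sample predicted class probabilities.

    Computed as H(mean prediction) - mean(H(prediction)): high when
    predictions are individually confident and collectively diverse.
    """
    n = len(probs)
    k = len(probs[0])
    marginal = [sum(row[j] for row in probs) / n for j in range(k)]
    return entropy(marginal) - sum(entropy(row) for row in probs) / n


confident = mutual_information([[1.0, 0.0], [0.0, 1.0]])
uncertain = mutual_information([[0.5, 0.5], [0.5, 0.5]])
```

In this toy case the confident, diverse predictions reach the maximum of log 2 nats, while uniformly uncertain predictions give zero, so no ground-truth labels are needed to tell the two situations apart.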

Robustness · Prompt · Language modeling · Analysis · Comprehensibility

June 13, 2023

PromptBench: Towards Evaluating the Robustness of Large Language Models on Adversarial Prompts

Kaijie Zhu,Jindong Wang,Jiaheng Zhou,Zichen Wang,Hao Chen,Yidong Wang,Linyi Yang,Wei Ye,Neil Zhenqiang Gong,Yue Zhang,Xing Xie

from arxiv, Technical report; 23 pages; code is at: //github.com/microsoft/promptbench

The increasing reliance on Large Language Models (LLMs) across academia and industry necessitates a comprehensive understanding of their robustness to prompts. In response to this vital need, we introduce PromptBench, a robustness benchmark designed to measure LLMs' resilience to adversarial prompts. This study uses a plethora of adversarial textual attacks targeting prompts across multiple levels: character, word, sentence, and semantic. These prompts are then employed in diverse tasks, such as sentiment analysis, natural language inference, reading comprehension, machine translation, and math problem-solving. Our study generates 4,032 adversarial prompts, meticulously evaluated over 8 tasks and 13 datasets, with 567,084 test samples in total. Our findings demonstrate that contemporary LLMs are vulnerable to adversarial prompts. Furthermore, we present comprehensive analysis to understand the mystery behind prompt robustness and its transferability. We then offer insightful robustness analysis and pragmatic recommendations for prompt composition, beneficial to both researchers and everyday users. We make our code, prompts, and methodologies to generate adversarial prompts publicly accessible, thereby enabling and encouraging collaborative exploration in this pivotal field: //github.com/microsoft/promptbench.

Weight · Outliers · Language modeling · Identifiable · MoDELS

June 13, 2023

OWQ: Lessons learned from activation outliers for weight quantization in large language models

Changhun Lee,Jungyu Jin,Taesu Kim,Hyungjun Kim,Eunhyeok Park

Large language models (LLMs) with hundreds of billions of parameters show impressive results across various language tasks using simple prompt tuning and few-shot examples, without the need for task-specific fine-tuning. However, their enormous size requires multiple server-grade GPUs even for inference, creating a significant cost barrier. To address this limitation, we introduce a novel post-training quantization (PTQ) method for weights with minimal quality degradation. While activation outliers are known to be problematic in activation quantization, our theoretical analysis suggests that we can identify factors contributing to weight quantization errors by considering activation outliers. We propose an innovative PTQ scheme called outlier-aware weight quantization (OWQ), which identifies vulnerable weights and allocates high precision to them. Our extensive experiments demonstrate that the 3.01-bit models produced by OWQ exhibit comparable quality to the 4-bit models generated by OPTQ.
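
To make the idea of mixed precision concrete, here is a minimal sketch that quantizes each weight column to a low bit width with plain round-to-nearest uniform quantization, while leaving a chosen set of columns untouched. How OWQ actually selects the vulnerable weights is not described in the abstract, so the sketch simply takes the column indices as an input; the function names and the toy weights are hypothetical.

```python
def quantize_column(col, bits):
    """Round-to-nearest uniform quantization of one column to `bits` bits."""
    lo, hi = min(col), max(col)
    if hi == lo:
        return list(col)  # constant column: nothing to quantize
    levels = 2 ** bits - 1
    scale = (hi - lo) / levels
    return [lo + round((w - lo) / scale) * scale for w in col]


def mixed_precision_quantize(columns, bits, keep_full_precision):
    """Quantize every column except those listed in `keep_full_precision`."""
    return [
        list(col) if j in keep_full_precision else quantize_column(col, bits)
        for j, col in enumerate(columns)
    ]


# Toy weights stored column by column; column 1 is assumed to be sensitive.
columns = [[0.0, 0.3, 1.0], [0.1, 0.2, 0.9]]
quantized = mixed_precision_quantize(columns, bits=1, keep_full_precision={1})
```

Keeping only a small fraction of columns in high precision raises the average bit width slightly above the base setting, which is consistent with the fractional 3.01-bit figure quoted in the abstract.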

Factored · Identifiable · Sensors · motivation · Correlation coefficient

June 12, 2023

Factors Impacting the Quality of User Answers on Smartphones

Ivano Bison,Haonan Zhao

from arxiv, 5 pages, 1 table

So far, most research investigating the predictability of human behavior, such as mobility and social interactions, has focused mainly on the exploitation of sensor data. However, it can be difficult for sensor data to capture the subjective motivations behind individuals' behavior. Understanding personal context (e.g., where one is and what they are doing) can greatly increase predictability. The main limitation is that human input is often missing or inaccurate. The goal of this paper is to identify factors that influence the quality of responses when users are asked about their current context. We find that two key factors influence the quality of responses: user reaction time and completion time. These factors correlate with various exogenous causes (e.g., situational context, time of day) and endogenous causes (e.g., procrastination attitude, mood). In turn, we study how these two factors impact the quality of responses.

INFORMS · Cost · Graph · MoDELS · Networking

June 12, 2023

The Local Information Cost of Distributed Graph Problems

Peter Robinson

from arxiv, A preliminary version of this paper appeared in the proceedings of SODA 2021

We introduce the local information cost (LIC), which quantifies the amount of information that nodes in a network need to learn when solving a graph problem. We show that the local information cost presents a natural lower bound on the communication complexity of distributed algorithms. For the synchronous CONGEST $KT_1$ model, where each node has initial knowledge of its neighbors' IDs, we prove that $\Omega(\frac{\text{LIC}_\gamma(P)}{\log\tau \log n})$ bits are required for solving a graph problem $P$ with a $\tau$-round algorithm that errs with probability at most $\gamma$. Our result is the first lower bound that yields a general trade-off between communication and time for graph problems in the CONGEST $KT_1$ model. We demonstrate how to apply the local information cost by deriving a lower bound on the communication complexity of computing routing tables for all-pairs-shortest-paths (APSP) routing, as well as for computing a spanner with multiplicative stretch $2t-1$ that consists of at most $O(n^{1+\frac{1}{t} + \epsilon})$ edges, where $\epsilon = O( {1}/{t^2} )$. More concretely, we derive the following lower bounds in the CONGEST model under the $KT_1$ assumption: For constructing routing tables, we show that any $O(\text{poly}(n))$-time algorithm has a communication complexity of $\Omega( {n^2}/{\log^2 n} )$ bits. Our main result is for constructing graph spanners: We show that any $O(\text{poly}(n))$-time algorithm must send at least $\tilde\Omega(\tfrac{1}{t^2} n^{1+{1}/{2t}})$ bits. Previously, only a trivial lower bound of $\tilde \Omega(n)$ bits was known for these problems.

INFORMS · Approximation · CTR · SimPLe · Design

June 11, 2023

Bayesian Calibrated Click-Through Auction

Junjie Chen,Minming Li,Haifeng Xu,Song Zuo

We study information design in click-through auctions, in which the bidders/advertisers bid for winning an opportunity to show their ads but only pay for realized clicks. The payment may or may not happen, and its probability is called the click-through rate (CTR). This auction format is widely used in the industry of online advertising. Bidders have private values, whereas the seller has private information about each bidder's CTRs. We are interested in the seller's problem of partially revealing CTR information to maximize revenue. Information design in click-through auctions turns out to be intriguingly different from almost all previous studies in this space since any revealed information about CTRs will never affect bidders' bidding behaviors -- they will always bid their true value for a click -- but only affect the auction's allocation and payment rule. This makes information design effectively a (constrained) mechanism design problem. We primarily focus on the two-bidder situation, which is already notoriously challenging as demonstrated in recent works, and adopt the algorithmic lens of developing approximate algorithms. Our first result is an FPTAS to compute an approximately optimal mechanism. The design of this algorithm leverages Bayesian bidder values which help to "smooth" the seller's revenue function and lead to better tractability. Our second result seeks to design "simple" and more practical signaling schemes. When bidders' CTR distribution is symmetric, we develop a simple prior-free signaling scheme, whose construction relies on a single parameter called optimal signal ratio. The constructed scheme provably obtains a good approximation as long as the maximum and minimum of bidders' value density functions do not differ much.

MoDELS · Generalization theory · Score · Learning · Prompt

June 9, 2023

How Does Fine-Tuning Impact Out-of-Distribution Detection for Vision-Language Models?

Yifei Ming,Yixuan Li

Recent large vision-language models such as CLIP have shown remarkable out-of-distribution (OOD) detection and generalization performance. However, their zero-shot in-distribution (ID) accuracy is often limited for downstream datasets. Recent CLIP-based fine-tuning methods such as prompt learning have demonstrated significant improvements in ID classification and OOD generalization where OOD labels are available. Nonetheless, it remains unclear whether the model is reliable to semantic shifts without OOD labels. In this paper, we aim to bridge the gap and present a comprehensive study to understand how fine-tuning impacts OOD detection for few-shot downstream tasks. By framing OOD detection as multi-modal concept matching, we establish a connection between fine-tuning methods and various OOD scores. Our results suggest that a proper choice of OOD scores is essential for CLIP-based fine-tuning. In particular, the maximum concept matching (MCM) score provides a promising solution consistently. We also show that prompt learning demonstrates the state-of-the-art OOD detection performance over the zero-shot counterpart.
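
The abstract names the MCM score without defining it. As it is usually described, the score is the largest softmax probability over temperature-scaled cosine similarities between an image embedding and the text embeddings of the ID concepts; the sketch below assumes that definition. The two-dimensional embeddings and the function names are hypothetical stand-ins for real CLIP features.

```python
import math


def cosine(u, v):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def mcm_score(image_emb, concept_embs, temperature=1.0):
    """Maximum softmax probability over scaled image-concept similarities."""
    sims = [cosine(image_emb, c) / temperature for c in concept_embs]
    top = max(sims)
    exps = [math.exp(s - top) for s in sims]  # shift for numerical stability
    return max(exps) / sum(exps)


concepts = [[1.0, 0.0], [0.0, 1.0]]          # toy text embeddings
matched = mcm_score([1.0, 0.0], concepts)    # aligned with one concept
ambiguous = mcm_score([1.0, 1.0], concepts)  # equally close to both
```

A higher score indicates that the input matches one ID concept clearly; an input that sits equally far from every concept gets the minimum score of one over the number of concepts and would be flagged as OOD by a threshold above that value.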

Controller · Operation · Transformation · Scaling · Tolerance

June 9, 2023

Operational Concurrency Control in the Face of Arbitrary Scale and Latency

James Smith

from arxiv, 21 pages, 12 figures

We present for the first time a complete solution to the problem of proving the correctness of a concurrency control algorithm for collaborative text editors against the standard consistency model. The success of our approach stems from the use of comprehensive stringwise operational transformations, which appear to have escaped a formal treatment until now. Because these transformations sometimes lead to an increase in the number of operations as they are transformed, we cannot use inductive methods and adopt the novel idea of decreasing diagrams instead. We also base our algorithm on a client-server model rather than a peer-to-peer one, which leads to the correct application of operational transformations to both newly generated and pending operations. And lastly we solve the problem of latency, so that our algorithm works perfectly in practice. The result of these innovations is the first ever formally correct concurrency control algorithm for collaborative text editors together with a fast, fault tolerant and highly scalable implementation.

Learning · Federated learning · MoDELS · Scenario · Taxonomy

October 10, 2022

A Survey on Heterogeneous Federated Learning

Dashan Gao,Xin Yao,Qiang Yang

from arxiv, 46 pages, 10 figures, 10 tables

Federated learning (FL) has been proposed to protect data privacy and virtually assemble the isolated data silos by cooperatively training models among organizations without breaching privacy and security. However, FL faces heterogeneity from various aspects, including data space, statistical, and system heterogeneity. For example, collaborative organizations without conflict of interest often come from different areas and have heterogeneous data from different feature spaces. Participants may also want to train heterogeneous personalized local models due to non-IID and imbalanced data distribution and various resource-constrained devices. Therefore, heterogeneous FL is proposed to address the problem of heterogeneity in FL. In this survey, we comprehensively investigate the domain of heterogeneous FL in terms of data space, statistical, system, and model heterogeneity. We first give an overview of FL, including its definition and categorization. Then, we propose a precise taxonomy of heterogeneous FL settings for each type of heterogeneity according to the problem setting and learning objective. We also investigate the transfer learning methodologies to tackle the heterogeneity in FL. We further present the applications of heterogeneous FL. Finally, we highlight the challenges and opportunities and envision promising future research directions toward new framework design and trustworthy approaches.

Ripple · Networking · Graph · Knowledge graph · Extensibility

March 9, 2018

Ripple Network: Propagating User Preferences on the Knowledge Graph for Recommender Systems

Hongwei Wang,Fuzheng Zhang,Jialin Wang,Miao Zhao,Wenjie Li,Xing Xie,Minyi Guo

To address the sparsity and cold start problem of collaborative filtering, researchers usually make use of side information, such as social networks or item attributes, to improve recommendation performance. This paper considers the knowledge graph as the source of side information. To address the limitations of existing embedding-based and path-based methods for knowledge-graph-aware recommendation, we propose Ripple Network, an end-to-end framework that naturally incorporates the knowledge graph into recommender systems. Similar to actual ripples propagating on the surface of water, Ripple Network stimulates the propagation of user preferences over the set of knowledge entities by automatically and iteratively extending a user's potential interests along links in the knowledge graph. The multiple "ripples" activated by a user's historically clicked items are thus superposed to form the preference distribution of the user with respect to a candidate item, which could be used for predicting the final clicking probability. Through extensive experiments on real-world datasets, we demonstrate that Ripple Network achieves substantial gains in a variety of scenarios, including movie, book and news recommendation, over several state-of-the-art baselines.