Federated Learning (FL) has emerged as a leading paradigm for decentralized, privacy-preserving machine learning training. However, recent research on gradient inversion attacks (GIAs) has shown that gradient updates in FL can leak information about private training samples. While existing surveys on GIAs have focused on the honest-but-curious server threat model, there is a dearth of research categorizing attacks under the realistic and far more privacy-infringing cases of malicious servers and clients. In this paper, we present a survey and novel taxonomy of GIAs that emphasizes FL threat models, particularly those of malicious servers and clients. We first formally define GIAs and contrast conventional attacks with those of the malicious attacker. We then summarize existing honest-but-curious attack strategies, corresponding defenses, and evaluation metrics. Critically, we dive into attacks with malicious servers and clients to highlight how they break existing FL defenses, focusing specifically on reconstruction methods, target model architectures, target data, and evaluation metrics. Lastly, we discuss open problems and future research directions.
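To make the leakage concrete, the following is a minimal sketch of one well-known analytic case, not any specific attack from the survey: for a single sample passing through a fully connected layer with a bias, each row of the weight gradient is the input scaled by the matching bias gradient, so dividing one by the other returns the input. The function names, the squared-error loss, and the random data are illustrative assumptions.

```python
import random


def linear_layer_gradients(x, y, W, b):
    """Gradients of 0.5 * ||W x + b - y||^2 w.r.t. W and b for ONE sample."""
    # residual[i] is dLoss/d(output_i) for the squared-error loss assumed here.
    residual = [
        sum(w_ij * x_j for w_ij, x_j in zip(row, x)) + b_i - y_i
        for row, b_i, y_i in zip(W, b, y)
    ]
    grad_W = [[r * x_j for x_j in x] for r in residual]  # dLoss/dW[i][j] = residual[i] * x[j]
    grad_b = list(residual)                              # dLoss/db[i]    = residual[i]
    return grad_W, grad_b


def reconstruct_input(grad_W, grad_b, eps=1e-12):
    """Recover x from the shared gradients alone: x = grad_W[i] / grad_b[i]."""
    # Use the row with the largest bias gradient for numerical stability.
    i = max(range(len(grad_b)), key=lambda k: abs(grad_b[k]))
    if abs(grad_b[i]) < eps:
        raise ValueError("all bias gradients are ~0; the input cannot be recovered")
    return [g / grad_b[i] for g in grad_W[i]]


# Hypothetical client data and layer parameters (illustration only).
random.seed(0)
x_private = [random.gauss(0, 1) for _ in range(8)]
y = [random.gauss(0, 1) for _ in range(3)]
W = [[random.gauss(0, 1) for _ in range(8)] for _ in range(3)]
b = [random.gauss(0, 1) for _ in range(3)]

grad_W, grad_b = linear_layer_gradients(x_private, y, W, b)
x_recovered = reconstruct_input(grad_W, grad_b)
```

The sketch only covers a batch of size one; with larger batches the gradients are summed over samples and this simple division no longer isolates a single input.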

Performer · state-of-the-art · scaling · end-to-end · constraints

June 26, 2024

LoongTrain: Efficient Training of Long-Sequence LLMs with Head-Context Parallelism

Diandian Gu,Peng Sun,Qinghao Hu,Ting Huang,Xun Chen,Yingtong Xiong,Guoteng Wang,Qiaoling Chen,Shangchun Zhao,Jiarui Fang,Yonggang Wen,Tianwei Zhang,Xin Jin,Xuanzhe Liu

Efficiently training LLMs with long sequences is important yet challenged by the massive computation and memory requirements. Sequence parallelism has been proposed to tackle these problems, but existing methods suffer from scalability or efficiency issues. We propose LoongTrain, a novel system to efficiently train LLMs with long sequences at scale. The core of LoongTrain is the 2D-Attention mechanism, which combines both head-parallel and context-parallel techniques to break the scalability constraints while maintaining efficiency. We introduce Double-Ring-Attention and analyze the performance of device placement strategies to further speed up training. We implement LoongTrain with the hybrid ZeRO and Selective Checkpoint++ techniques. Experiment results show that LoongTrain outperforms state-of-the-art baselines, i.e., DeepSpeed-Ulysses and Megatron Context Parallelism, in both end-to-end training speed and scalability, and improves Model FLOPs Utilization (MFU) by up to 2.88x.

MoDELS · Continuity · Analysis · Taxonomy · directed

June 26, 2024

A Survey of Privacy-Preserving Model Explanations: Privacy Risks, Attacks, and Countermeasures

Thanh Tam Nguyen,Thanh Trung Huynh,Zhao Ren,Thanh Toan Nguyen,Phi Le Nguyen,Hongzhi Yin,Quoc Viet Hung Nguyen

from arxiv, Revision

As the adoption of explainable AI (XAI) continues to expand, the urgency to address its privacy implications intensifies. Despite a growing corpus of research in AI privacy and explainability, there is little attention on privacy-preserving model explanations. This article presents the first thorough survey about privacy attacks on model explanations and their countermeasures. Our contribution to this field comprises a thorough analysis of research papers with a connected taxonomy that facilitates the categorisation of privacy attacks and countermeasures based on the targeted explanations. This work also includes an initial investigation into the causes of privacy leaks. Finally, we discuss unresolved issues and prospective research directions uncovered in our analysis. This survey aims to be a valuable resource for the research community and offers clear insights for those new to this domain. To support ongoing research, we have established an online resource repository, which will be continuously updated with new and relevant findings. Interested readers are encouraged to access our repository at //github.com/tamlhp/awesome-privex.

graph · MoDELS · Performer · Learning · object detection

June 24, 2024

EGTR: Extracting Graph from Transformer for Scene Graph Generation

Jinbae Im,JeongYeon Nam,Nokyung Park,Hyungmin Lee,Seunghyun Park

from arxiv, CVPR 2024 (Best paper award candidate)

Scene Graph Generation (SGG) is a challenging task of detecting objects and predicting relationships between objects. After DETR was developed, one-stage SGG models based on a one-stage object detector have been actively studied. However, complex modeling is used to predict the relationship between objects, and the inherent relationship between object queries learned in the multi-head self-attention of the object detector has been neglected. We propose a lightweight one-stage SGG model that extracts the relation graph from the various relationships learned in the multi-head self-attention layers of the DETR decoder. By fully utilizing the self-attention by-products, the relation graph can be extracted effectively with a shallow relation extraction head. Considering the dependency of the relation extraction task on the object detection task, we propose a novel relation smoothing technique that adjusts the relation label adaptively according to the quality of the detected objects. With relation smoothing, the model is trained according to a continuous curriculum that focuses on the object detection task at the beginning of training and performs multi-task learning as the object detection performance gradually improves. Furthermore, we propose a connectivity prediction task that predicts whether a relation exists between object pairs as an auxiliary task of the relation extraction. We demonstrate the effectiveness and efficiency of our method on the Visual Genome and Open Image V6 datasets. Our code is publicly available at //github.com/naver-ai/egtr.

Learning · model evaluation · optimizer · GROUP · MoDELS

June 21, 2024

Towards Dynamic Resource Allocation and Client Scheduling in Hierarchical Federated Learning: A Two-Phase Deep Reinforcement Learning Approach

Xiaojing Chen,Zhenyuan Li,Wei Ni,Xin Wang,Shunqing Zhang,Yanzan Sun,Shugong Xu,Qingqi Pei

Federated learning (FL) is a viable technique to train a shared machine learning model without sharing data. Hierarchical FL (HFL) systems have yet to be studied regarding their multiple levels of energy, computation, communication, and client scheduling, especially when clients rely on energy harvesting to power their operations. This paper presents a new two-phase deep deterministic policy gradient (DDPG) framework, referred to as "TP-DDPG", to balance online the learning delay and model accuracy of an FL process in an energy harvesting-powered HFL system. The key idea is that we divide optimization decisions into two groups, and employ DDPG to learn one group in the first phase, while interpreting the other group as part of the environment to provide rewards for training the DDPG in the second phase. Specifically, the DDPG learns the selection of participating clients, their CPU configurations, and their transmission powers. A new straggler-aware client association and bandwidth allocation (SCABA) algorithm efficiently optimizes the other decisions and evaluates the reward for the DDPG. Experiments demonstrate that, with a substantially reduced number of learnable parameters, the TP-DDPG can quickly converge to effective policies that shorten the training time of HFL by 39.4% compared to its benchmarks when the required test accuracy of HFL is 0.9.

June 21, 2024

Ink and Algorithm: Exploring Temporal Dynamics in Human-AI Collaborative Writing

Kaixun Yang,Yixin Cheng,Linxuan Zhao,Mladen Raković,Zachari Swiecki,Dragan Gašević,Guanliang Chen

The advent of Generative Artificial Intelligence (GAI) has revolutionized the field of writing, marking a shift towards human-AI collaborative writing in education. However, the dynamics of human-AI interaction in the collaborative writing process are not well understood, and thus it remains largely unknown how human learning can be effectively supported with such cutting-edge GAI technologies. In this study, we aim to bridge this gap by investigating how humans employ GAI in collaborative writing and examining the interplay between the patterns of GAI usage and human writing behaviors. Considering the potentially varying degrees to which people rely on GAI, we proposed to use Dynamic Time Warping time-series clustering for the identification and analysis of common temporal patterns in AI usage during human-AI collaborative writing processes. Additionally, we incorporated Epistemic Network Analysis to reveal the correlation between GAI usage and human writing behaviors that reflect cognitive processes (i.e., knowledge telling, knowledge transformation, and cognitive presence), aiming to offer insights for developing better approaches and tools to support humans in learning effectively via such human-AI collaborative writing activities. Our findings reveal four major distinct temporal patterns in AI utilization and highlight significant correlations between these patterns and human writing behaviors. These findings have significant implications for effectively supporting human learning with GAI in educational writing tasks.
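For readers unfamiliar with Dynamic Time Warping, the following is a minimal sketch of the classic DTW distance between two numeric sequences, which is the kind of pairwise measure a time-series clustering step can be built on. It is the textbook dynamic program, not the study's pipeline; the absolute-difference local cost and the example usage traces are assumptions made for illustration.

```python
def dtw_distance(a, b):
    """Classic DTW distance between two non-empty numeric sequences.

    cost[i][j] holds the cheapest cumulative cost of aligning a[:i] with b[:j].
    """
    n, m = len(a), len(b)
    inf = float("inf")
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(a[i - 1] - b[j - 1])  # assumed local cost: absolute difference
            cost[i][j] = local + min(
                cost[i - 1][j],      # a[i-1] is matched again with the same b element
                cost[i][j - 1],      # b[j-1] is matched again with the same a element
                cost[i - 1][j - 1],  # both sequences advance together
            )
    return cost[n][m]


# Hypothetical per-interval counts of AI requests for three writers.
early_user = [3, 2, 1, 0, 0, 0]
early_user_slower = [3, 3, 2, 1, 0, 0, 0]
late_user = [0, 0, 0, 1, 2, 3]

same_shape = dtw_distance(early_user, early_user_slower)
different_shape = dtw_distance(early_user, late_user)
```

Because DTW allows elements to be repeated during alignment, the two early-heavy traces come out at distance zero despite their different lengths, while the late-heavy trace stays far from both. A clustering method would then group writers using such pairwise distances.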

random forest · MoDELS · state-of-the-art · model evaluation · periodic

June 20, 2024

QC-Forest: a Classical-Quantum Algorithm to Provably Speedup Retraining of Random Forest

Romina Yalovetzky,Niraj Kumar,Changhao Li,Marco Pistoia

Random Forest (RF) is a popular tree-ensemble method for supervised learning, prized for its ease of use and flexibility. Online RF models must account for new training data to maintain model accuracy. This is particularly important in applications where data is periodically and sequentially generated over time in data streams, such as autonomous driving systems and credit card payments. In this setting, performing periodic model retraining with the old and new data accumulated is beneficial as it fully captures possible drifts in the data distribution over time. However, this is impractical with state-of-the-art classical algorithms for RF as they scale linearly with the accumulated number of samples. We propose QC-Forest, a classical-quantum algorithm designed to time-efficiently retrain RF models in the streaming setting for multi-class classification and regression, achieving a runtime poly-logarithmic in the total number of accumulated samples. QC-Forest leverages Des-q, a quantum algorithm for single tree construction and retraining proposed by Kumar et al., extending it to multi-class classification, as the original proposal was limited to binary classes, and introducing an exact classical method to replace an underlying quantum subroutine that incurs a finite error, while maintaining the same poly-logarithmic dependence. Finally, we showcase that QC-Forest achieves competitive accuracy in comparison to state-of-the-art RF methods on widely used benchmark datasets with up to 80,000 samples, while significantly speeding up model retraining.

controller · Performer · HER · MoDELS · comprehensibility

June 18, 2024

Vernacular? I Barely Know Her: Challenges with Style Control and Stereotyping

Ankit Aich,Tingting Liu,Salvatore Giorgi,Kelsey Isman,Lyle Ungar,Brenda Curtis

Large Language Models (LLMs) are increasingly being used in educational and learning applications. Research has demonstrated that controlling for style, to fit the needs of the learner, fosters increased understanding, promotes inclusion, and helps with knowledge distillation. To understand the capabilities and limitations of contemporary LLMs in style control, we evaluated five state-of-the-art models: GPT-3.5, GPT-4, GPT-4o, Llama-3, and Mistral-instruct-7B across two style control tasks. We observed significant inconsistencies in the first task, with model performances averaging between 5th and 8th grade reading levels for tasks intended for first-graders, and standard deviations up to 27.6. For our second task, we observed a statistically significant improvement in performance from 0.02 to 0.26. However, we find that even without stereotypes in reference texts, LLMs often generated culturally insensitive content during their tasks. We provide a thorough analysis and discussion of the results.

graph · node · Networking · GNN · Learning

June 18, 2024

The Heterophilic Snowflake Hypothesis: Training and Empowering GNNs for Heterophilic Graphs

Kun Wang,Guibin Zhang,Xinnan Zhang,Junfeng Fang,Xun Wu,Guohao Li,Shirui Pan,Wei Huang,Yuxuan Liang

from arxiv, KDD 2024

Graph Neural Networks (GNNs) have become pivotal tools for a range of graph-based learning tasks. Notably, most current GNN architectures operate under the assumption of homophily, whether explicitly or implicitly. While this underlying assumption is frequently adopted, it is not universally applicable, which can result in potential shortcomings in learning effectiveness. In this paper, for the first time, we transfer the prevailing concept of "one node one receptive field" to the heterophilic graph. By constructing a proxy label predictor, we enable each node to possess a latent prediction distribution, which assists connected nodes in determining whether they should aggregate their associated neighbors. Ultimately, every node can have its own unique aggregation hop and pattern, much like each snowflake is unique and possesses its own characteristics. Based on observations, we innovatively introduce the Heterophily Snowflake Hypothesis and provide an effective solution to guide and facilitate research on heterophilic graphs and beyond. We conduct comprehensive experiments including (1) main results on 10 graphs with varying heterophily ratios across 10 backbones; (2) scalability on various deep GNN backbones (SGC, JKNet, etc.) across a wide range of depths (2, 4, 6, 8, 16, and 32 layers); (3) comparison with the conventional snowflake hypothesis; (4) efficiency comparison with existing graph pruning algorithms. Our observations show that our framework acts as a versatile operator for diverse tasks. It can be integrated into various GNN frameworks, boosting performance in-depth and offering an explainable approach to choosing the optimal network depth. The source code is available at //github.com/bingreeky/HeteroSnoH.
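As a point of reference for the "heterophily ratios" mentioned above, the following sketch computes one commonly used measure, the edge homophily ratio (the fraction of edges whose endpoints share a label), and its complement. The abstract does not state which definition the paper uses, so treating heterophily as one minus edge homophily is an assumption, and the toy graph is hypothetical.

```python
def edge_homophily(edges, labels):
    """Fraction of edges whose two endpoints carry the same label."""
    if not edges:
        raise ValueError("edge list is empty; the ratio is undefined")
    same = sum(1 for u, v in edges if labels[u] == labels[v])
    return same / len(edges)


def edge_heterophily(edges, labels):
    """Assumed complement: fraction of edges that join differently labeled nodes."""
    return 1.0 - edge_homophily(edges, labels)


# Hypothetical 4-node cycle with two classes (illustration only).
toy_edges = [(0, 1), (1, 2), (2, 3), (3, 0)]
toy_labels = [0, 0, 1, 1]

toy_homophily = edge_homophily(toy_edges, toy_labels)
toy_heterophily = edge_heterophily(toy_edges, toy_labels)
```

In the toy graph, two of the four edges connect nodes of the same class, so both ratios equal 0.5. Graphs with a low homophily ratio are the heterophilic setting in which neighbor aggregation is least likely to help.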

MoDELS · dataset · language modeling · scenario · large language models

June 17, 2024

RepLiQA: A Question-Answering Dataset for Benchmarking LLMs on Unseen Reference Content

Joao Monteiro,Pierre-Andre Noel,Etienne Marcotte,Sai Rajeswar,Valentina Zantedeschi,David Vazquez,Nicolas Chapados,Christopher Pal,Perouz Taslakian

Large Language Models (LLMs) are trained on vast amounts of data, most of which is automatically scraped from the internet. This data includes encyclopedic documents that harbor a vast amount of general knowledge (e.g., Wikipedia) but also potentially overlap with benchmark datasets used for evaluating LLMs. Consequently, evaluating models on test splits that might have leaked into the training set is prone to misleading conclusions. To foster sound evaluation of language models, we introduce a new test dataset named RepLiQA, suited for question-answering and topic retrieval tasks. RepLiQA is a collection of five splits of test sets, four of which have not been released to the internet or exposed to LLM APIs prior to this publication. Each sample in RepLiQA comprises (1) a reference document crafted by a human annotator and depicting an imaginary scenario (e.g., a news article) absent from the internet; (2) a question about the document's topic; (3) a ground-truth answer derived directly from the information in the document; and (4) the paragraph extracted from the reference document containing the answer. As such, accurate answers can only be generated if a model can find relevant content within the provided document. We run a large-scale benchmark comprising several state-of-the-art LLMs to uncover differences in performance across models of various types and sizes in a context-conditional language modeling setting. Released splits of RepLiQA can be found here: //huggingface.co/datasets/ServiceNow/repliqa.

INFORMS · Performer · language modeling · MoDELS · Processing (programming language)

June 17, 2024

Retaining Key Information under High Compression Ratios: Query-Guided Compressor for LLMs

Zhiwei Cao,Qian Cao,Yu Lu,Ningxin Peng,Luyang Huang,Shanbo Cheng,Jinsong Su

from arxiv, Accepted to ACL 2024

The growing popularity of Large Language Models (LLMs) has sparked interest in context compression for these models. However, the performance of previous methods degrades dramatically as compression ratios increase, sometimes even falling to the closed-book level. This decline can be attributed to the loss of key information during the compression process. Our preliminary study supports this hypothesis, emphasizing the significance of retaining key information to maintain model performance under high compression ratios. As a result, we introduce Query-Guided Compressor (QGC), which leverages queries to guide the context compression process, effectively preserving key information within the compressed context. Additionally, we employ a dynamic compression strategy. We validate the effectiveness of our proposed QGC on the Question Answering task, including the NaturalQuestions, TriviaQA, and HotpotQA datasets. Experimental results show that QGC can consistently perform well even at high compression ratios, which also offers significant benefits in terms of inference cost and throughput.