在线点播亚洲日韩国产欧美_麻豆尤物国产AV一区二区_国产精品一区二区在线观看99_国产日韩欧美在线观看播放_国内自拍一区三页_高潮国产喷水白_男人插入女人啊啊啊啊啊视频在线观看

Recent advancements in federated learning (FL) have produced models that retain user privacy by training across multiple decentralized devices or systems holding local data samples. However, these strategies often neglect the inherent challenges of statistical heterogeneity and vulnerability to adversarial attacks, which can degrade model robustness and fairness. Personalized FL strategies offer some respite by adjusting models to fit individual client profiles, yet they tend to neglect server-side aggregation vulnerabilities. To address these issues, we propose Reinforcement Federated Learning (RFL), a novel framework that leverages deep reinforcement learning to adaptively optimize client contribution during aggregation, thereby enhancing both model robustness against malicious clients and fairness across participants under non-identically distributed settings. To achieve this goal, we propose a meticulous approach involving a Deep Deterministic Policy Gradient-based algorithm for continuous control of aggregation weights, an innovative client selection method based on model parameter distances, and a reward mechanism guided by validation set performance. Empirically, extensive experiments demonstrate that, in terms of robustness, RFL outperforms the state-of-the-art methods, while maintaining comparable levels of fairness, offering a promising solution to build resilient and fair federated systems.

相關內容

穩健性(xing)

關注 3

Microsoft Surface · MoDELS · Metal · CASES · 數據可用性 ·

2024 年 3 月 20 日

Stochastic Geometry Models for Texture Synthesis of Machined Metallic Surfaces: Sandblasting and Milling

Natascha Jeziorski,Claudia Redenbach

Training defect detection algorithms for visual surface inspection systems requires a large and representative set of training data. Often there is not enough real data available which additionally cannot cover the variety of possible defects. Synthetic data generated by a synthetic visual surface inspection environment can overcome this problem. Therefore, a digital twin of the object is needed, whose micro-scale surface topography is modeled by texture synthesis models. We develop stochastic texture models for sandblasted and milled surfaces based on topography measurements of such surfaces. As the surface patterns differ significantly, we use separate modeling approaches for the two cases. Sandblasted surfaces are modeled by a combination of data-based texture synthesis methods that rely entirely on the measurements. In contrast, the model for milled surfaces is procedural and includes all process-related parameters known from the machine settings.

MoDELS · 模型評估 · 數據集 · Continuity · 查準率/準確率 ·

2024 年 3 月 18 日

The Power of Few: Accelerating and Enhancing Data Reweighting with Coreset Selection

Mohammad Jafari,Yimeng Zhang,Yihua Zhang,Sijia Liu

from arxiv, Accepted to ICASSP 2024

As machine learning tasks continue to evolve, the trend has been to gather larger datasets and train increasingly larger models. While this has led to advancements in accuracy, it has also escalated computational costs to unsustainable levels. Addressing this, our work aims to strike a delicate balance between computational efficiency and model accuracy, a persisting challenge in the field. We introduce a novel method that employs core subset selection for reweighting, effectively optimizing both computational time and model performance. By focusing on a strategically selected coreset, our approach offers a robust representation, as it efficiently minimizes the influence of outliers. The re-calibrated weights are then mapped back to and propagated across the entire dataset. Our experimental results substantiate the effectiveness of this approach, underscoring its potential as a scalable and precise solution for model training.

泛函 · INFORMS · 通用動力公司 · Networking · Networks ·

2024 年 3 月 18 日

The Benefits of Reusing Batches for Gradient Descent in Two-Layer Networks: Breaking the Curse of Information and Leap Exponents

Yatin Dandi,Emanuele Troiani,Luca Arnaboldi,Luca Pesce,Lenka Zdeborová,Florent Krzakala

We investigate the training dynamics of two-layer neural networks when learning multi-index target functions. We focus on multi-pass gradient descent (GD) that reuses the batches multiple times and show that it significantly changes the conclusion about which functions are learnable compared to single-pass gradient descent. In particular, multi-pass GD with finite stepsize is found to overcome the limitations of gradient flow and single-pass GD given by the information exponent (Ben Arous et al., 2021) and leap exponent (Abbe et al., 2023) of the target function. We show that upon re-using batches, the network achieves in just two time steps an overlap with the target subspace even for functions not satisfying the staircase property (Abbe et al., 2021). We characterize the (broad) class of functions efficiently learned in finite time. The proof of our results is based on the analysis of the Dynamical Mean-Field Theory (DMFT). We further provide a closed-form description of the dynamical process of the low-dimensional projections of the weights, and numerical experiments illustrating the theory.

INTERACT · Cognition · Analysis · 情景 · 可理解性 ·

2024 年 3 月 18 日

ChildCI Framework: Analysis of Motor and Cognitive Development in Children-Computer Interaction for Age Detection

Juan Carlos Ruiz-Garcia,Ruben Tolosana,Ruben Vera-Rodriguez,Julian Fierrez,Jaime Herreros-Rodriguez

from arxiv, 12 pages, 3 figures, 7 tables

This article presents a comprehensive analysis of the different tests proposed in the recent ChildCI framework, proving its potential for generating a better understanding of children's neuromotor and cognitive development along time, as well as their possible application in other research areas such as e-Health and e-Learning. In particular, we propose a set of over 100 global features related to motor and cognitive aspects of the children interaction with mobile devices, some of them collected and adapted from the literature. Furthermore, we analyse the robustness and discriminative power of the proposed feature set including experimental results for the task of children age group detection based on their motor and cognitive behaviours. Two different scenarios are considered in this study: i) single-test scenario, and ii) multiple-test scenario. Results over 93% accuracy are achieved using the publicly available ChildCIdb_v1 database (over 400 children from 18 months to 8 years old), proving the high correlation of children's age with the way they interact with mobile devices.

Continuity · 學成 · Vision · 計算機視覺 · 批量學習 ·

2021 年 9 月 23 日

Recent Advances of Continual Learning in Computer Vision: An Overview

Haoxuan Qu,Hossein Rahmani,Li Xu,Bryan Williams,Jun Liu

from arxiv, 21 pages, 5 figures

In contrast to batch learning where all training data is available at once, continual learning represents a family of methods that accumulate knowledge and learn continuously with data available in sequential order. Similar to the human learning process with the ability of learning, fusing, and accumulating new knowledge coming at different time steps, continual learning is considered to have high practical significance. Hence, continual learning has been studied in various artificial intelligence tasks. In this paper, we present a comprehensive review of the recent progress of continual learning in computer vision. In particular, the works are grouped by their representative techniques, including regularization, knowledge distillation, memory, generative replay, parameter isolation, and a combination of the above techniques. For each category of these techniques, both its characteristics and applications in computer vision are presented. At the end of this overview, several subareas, where continuous knowledge accumulation is potentially helpful while continual learning has not been well studied, are discussed.

MoDELS · Machine Learning · 學成 · 線性的 · 線性模型 ·

2021 年 9 月 6 日

A Farewell to the Bias-Variance Tradeoff? An Overview of the Theory of Overparameterized Machine Learning

Yehuda Dar,Vidya Muthukumar,Richard G. Baraniuk

The rapid recent progress in machine learning (ML) has raised a number of scientific questions that challenge the longstanding dogma of the field. One of the most important riddles is the good empirical generalization of overparameterized models. Overparameterized models are excessively complex with respect to the size of the training dataset, which results in them perfectly fitting (i.e., interpolating) the training data, which is usually noisy. Such interpolation of noisy data is traditionally associated with detrimental overfitting, and yet a wide range of interpolating models -- from simple linear models to deep neural networks -- have recently been observed to generalize extremely well on fresh test data. Indeed, the recently discovered double descent phenomenon has revealed that highly overparameterized models often improve over the best underparameterized model in test performance. Understanding learning in this overparameterized regime requires new theory and foundational empirical studies, even for the simplest case of the linear model. The underpinnings of this understanding have been laid in very recent analyses of overparameterized linear regression and related statistical learning tasks, which resulted in precise analytic characterizations of double descent. This paper provides a succinct overview of this emerging theory of overparameterized ML (henceforth abbreviated as TOPML) that explains these recent findings through a statistical signal processing perspective. We emphasize the unique aspects that define the TOPML research area as a subfield of modern ML theory and outline interesting open questions that remain.

INFORMS · Taxonomy · Machine Learning · Integration · 學成 ·

2021 年 5 月 28 日

Informed Machine Learning -- A Taxonomy and Survey of Integrating Knowledge into Learning Systems

Laura von Rueden,Sebastian Mayer,Katharina Beckh,Bogdan Georgiev,Sven Giesselbach,Raoul Heese,Birgit Kirsch,Julius Pfrommer,Annika Pick,Rajkumar Ramamurthy,Michal Walczak,Jochen Garcke,Christian Bauckhage,Jannis Schuecker

from arxiv, Accepted at IEEE Transactions on Knowledge and Data Engineering: //ieeexplore.ieee.org/document/9429985

Despite its great success, machine learning can have its limits when dealing with insufficient training data. A potential solution is the additional integration of prior knowledge into the training process which leads to the notion of informed machine learning. In this paper, we present a structured overview of various approaches in this field. We provide a definition and propose a concept for informed machine learning which illustrates its building blocks and distinguishes it from conventional machine learning. We introduce a taxonomy that serves as a classification framework for informed machine learning approaches. It considers the source of knowledge, its representation, and its integration into the machine learning pipeline. Based on this taxonomy, we survey related research and describe how different knowledge representations such as algebraic equations, logic rules, or simulation results can be used in learning systems. This evaluation of numerous papers on the basis of our taxonomy uncovers key methods in the field of informed machine learning.

Performer · 聯邦學習 · Extensibility · 學成 · 分解的 ·

2021 年 2 月 21 日

Characterizing Impacts of Heterogeneity in Federated Learning upon Large-Scale Smartphone Data

Chengxu Yang,QiPeng Wang,Mengwei Xu,Zhenpeng Chen,Kaigui Bian,Yunxin Liu,Xuanzhe Liu

Federated learning (FL) is an emerging, privacy-preserving machine learning paradigm, drawing tremendous attention in both academia and industry. A unique characteristic of FL is heterogeneity, which resides in the various hardware specifications and dynamic states across the participating devices. Theoretically, heterogeneity can exert a huge influence on the FL training process, e.g., causing a device unavailable for training or unable to upload its model updates. Unfortunately, these impacts have never been systematically studied and quantified in existing FL literature. In this paper, we carry out the first empirical study to characterize the impacts of heterogeneity in FL. We collect large-scale data from 136k smartphones that can faithfully reflect heterogeneity in real-world settings. We also build a heterogeneity-aware FL platform that complies with the standard FL protocol but with heterogeneity in consideration. Based on the data and the platform, we conduct extensive experiments to compare the performance of state-of-the-art FL algorithms under heterogeneity-aware and heterogeneity-unaware settings. Results show that heterogeneity causes non-trivial performance degradation in FL, including up to 9.2% accuracy drop, 2.32x lengthened training time, and undermined fairness. Furthermore, we analyze potential impact factors and find that device failure and participant bias are two potential factors for performance degradation. Our study provides insightful implications for FL practitioners. On the one hand, our findings suggest that FL algorithm designers consider necessary heterogeneity during the evaluation. On the other hand, our findings urge system providers to design specific mechanisms to mitigate the impacts of heterogeneity.

Continuity · Neural Networks · 學成 · Performer · Networks ·

2020 年 9 月 3 日

A Wholistic View of Continual Learning with Deep Neural Networks: Forgotten Lessons and the Bridge to Active and Open World Learning

Martin Mundt,Yong Won Hong,Iuliia Pliushch,Visvanathan Ramesh

from arxiv, 32 pages

Current deep learning research is dominated by benchmark evaluation. A method is regarded as favorable if it empirically performs well on the dedicated test set. This mentality is seamlessly reflected in the resurfacing area of continual learning, where consecutively arriving sets of benchmark data are investigated. The core challenge is framed as protecting previously acquired representations from being catastrophically forgotten due to the iterative parameter updates. However, comparison of individual methods is nevertheless treated in isolation from real world application and typically judged by monitoring accumulated test set performance. The closed world assumption remains predominant. It is assumed that during deployment a model is guaranteed to encounter data that stems from the same distribution as used for training. This poses a massive challenge as neural networks are well known to provide overconfident false predictions on unknown instances and break down in the face of corrupted data. In this work we argue that notable lessons from open set recognition, the identification of statistically deviating data outside of the observed dataset, and the adjacent field of active learning, where data is incrementally queried such that the expected performance gain is maximized, are frequently overlooked in the deep learning era. Based on these forgotten lessons, we propose a consolidated view to bridge continual learning, active learning and open set recognition in deep neural networks. Our results show that this not only benefits each individual paradigm, but highlights the natural synergies in a common framework. We empirically demonstrate improvements when alleviating catastrophic forgetting, querying data in active learning, selecting task orders, while exhibiting robust open world application where previously proposed methods fail.

MoDELS · Transformer模型 · 變換 · 推斷 · 模型評估 ·

2020 年 6 月 23 日

Train Large, Then Compress: Rethinking Model Size for Efficient Training and Inference of Transformers

Zhuohan Li,Eric Wallace,Sheng Shen,Kevin Lin,Kurt Keutzer,Dan Klein,Joseph E. Gonzalez

from arxiv, ICML 2020

Since hardware resources are limited, the objective of training deep learning models is typically to maximize accuracy subject to the time and memory constraints of training and inference. We study the impact of model size in this setting, focusing on Transformer models for NLP tasks that are limited by compute: self-supervised pretraining and high-resource machine translation. We first show that even though smaller Transformer models execute faster per iteration, wider and deeper models converge in significantly fewer steps. Moreover, this acceleration in convergence typically outpaces the additional computational overhead of using larger models. Therefore, the most compute-efficient training strategy is to counterintuitively train extremely large models but stop after a small number of iterations. This leads to an apparent trade-off between the training efficiency of large Transformer models and the inference efficiency of small Transformer models. However, we show that large models are more robust to compression techniques such as quantization and pruning than small models. Consequently, one can get the best of both worlds: heavily compressed, large models achieve higher accuracy than lightly compressed, small models.