黄片一级在线视频播放,亚洲国产中文精品在线观看香蕉,日韩AⅤ精品国内在线,亚洲热码中文字幕视频

All machine learning algorithms use a loss, cost, utility or reward function to encode the learning objective and oversee the learning process. This function that supervises learning is a frequently unrecognized hyperparameter that determines how incorrect outputs are penalized and can be tuned to improve performance. This paper shows that training speed and final accuracy of neural networks can significantly depend on the loss function used to train neural networks. In particular derivative values can be significantly different with different loss functions leading to significantly different performance after gradient descent based Backpropagation (BP) training. This paper explores the effect on performance of using new loss functions that are also convex but penalize errors differently compared to the popular Cross-entropy loss. Two new classification loss functions that significantly improve performance on a wide variety of benchmark tasks are proposed. A new loss function call smooth absolute error that outperforms the Squared error, Huber and Log-Cosh losses on datasets with significantly many outliers is proposed. This smooth absolute error loss function is infinitely differentiable and more closely approximates the absolute error loss compared to the Huber and Log-Cosh losses used for robust regression.

相關內容

泛函

關注 0

Learning · 推斷 · MoDELS · 圖 · 混合 ·

2024 年 12 月 17 日

Scrutinizing the Vulnerability of Decentralized Learning to Membership Inference Attacks

Ousmane Touat,Jezekael Brunon,Yacine Belal,Julien Nicolas,Mohamed Maouche,César Sabater,Sonia Ben Mokhtar

from arxiv, 12 pages, 8 figures

The primary promise of decentralized learning is to allow users to engage in the training of machine learning models in a collaborative manner while keeping their data on their premises and without relying on any central entity. However, this paradigm necessitates the exchange of model parameters or gradients between peers. Such exchanges can be exploited to infer sensitive information about training data, which is achieved through privacy attacks (e.g Membership Inference Attacks -- MIA). In order to devise effective defense mechanisms, it is important to understand the factors that increase/reduce the vulnerability of a given decentralized learning architecture to MIA. In this study, we extensively explore the vulnerability to MIA of various decentralized learning architectures by varying the graph structure (e.g number of neighbors), the graph dynamics, and the aggregation strategy, across diverse datasets and data distributions. Our key finding, which to the best of our knowledge we are the first to report, is that the vulnerability to MIA is heavily correlated to (i) the local model mixing strategy performed by each node upon reception of models from neighboring nodes and (ii) the global mixing properties of the communication graph. We illustrate these results experimentally using four datasets and by theoretically analyzing the mixing properties of various decentralized architectures. Our paper draws a set of lessons learned for devising decentralized learning systems that reduce by design the vulnerability to MIA.

MoDELS · 決策樹 · 分解的 · 示例 · Performer ·

2024 年 12 月 16 日

Evaluating the Efficacy of Vectocardiographic and ECG Parameters for Efficient Tertiary Cardiology Care Allocation Using Decision Tree Analysis

Lucas José da Costa,Vinicius Ruiz Uemoto,Mariana F. N. de Marchi,Renato de Aguiar Hortegal,Renata Valeri de Freitas

Use real word data to evaluate the performance of the electrocardiographic markers of GEH as features in a machine learning model with Standard ECG features and Risk Factors in Predicting Outcome of patients in a population referred to a tertiary cardiology hospital. Patients forwarded to specific evaluation in a cardiology specialized hospital performed an ECG and a risk factor anamnesis. A series of follow up attendances occurred in periods of 6 months, 12 months and 15 months to check for cardiovascular related events (mortality or new nonfatal cardiovascular events (Stroke, MI, PCI, CS), as identified during 1-year phone follow-ups. The first attendance ECG was measured by a specialist and processed in order to obtain the global electric heterogeneity (GEH) using the Kors Matriz. The ECG measurements, GEH parameters and risk factors were combined for training multiple instances of XGBoost decision trees models. Each instance were optmized for the AUCPR and the instance with higher AUC is chosen as representative to the model. The importance of each parameter for the winner tree model was compared to better understand the improvement from using GEH parameters. The GEH parameters turned out to have statistical significance for this population specially the QRST angle and the SVG. The combined model with the tree parameters class had the best performance. The findings suggest that using VCG features can facilitate more accurate identification of patients who require tertiary care, thereby optimizing resource allocation and improving patient outcomes. Moreover, the decision tree model's transparency and ability to pinpoint critical features make it a valuable tool for clinical decision-making and align well with existing clinical practices.

穩健性 · Agent · 閾值 · INTERACT · DC ·

2024 年 12 月 16 日

The Black Ninjas and the Sniper: On Robustness of Population Protocols

Benno Lossin,Philipp Czerner,Javier Esparza,Roland Guttenberg,Tobias Prehn

from arxiv, Final version to appear in Principles of Verification: Cycling the Probabilistic Landscape: Essays Dedicated to Joost-Pieter Katoen on the Occasion of His 60th Birthday, Part III (2025)

Population protocols are a model of distributed computation in which an arbitrary number of indistinguishable finite-state agents interact in pairs to decide some property of their initial configuration. We investigate the behaviour of population protocols under adversarial faults that cause agents to silently crash and no longer interact with other agents. As a starting point, we consider the property ``the number of agents exceeds a given threshold $t$'', represented by the predicate $x \geq t$, and show that the standard protocol for $x \geq t$ is very fragile: one single crash in a computation with $x:=2t-1$ agents can already cause the protocol to answer incorrectly that $x \geq t$ does not hold. However, a slightly less known protocol is robust: for any number $t' \geq t$ of agents, at least $t' - t+1$ crashes must occur for the protocol to answer that the property does not hold. We formally define robustness for arbitrary population protocols, and investigate the question whether every predicate computable by population protocols has a robust protocol. Angluin et al. proved in 2007 that population protocols decide exactly the Presburger predicates, which can be represented as Boolean combinations of threshold predicates of the form $\sum_{i=1}^n a_i \cdot x_i \geq t$ for $a_1,...,a_n, t \in \mathbb{Z}$ and modulo prdicates of the form $\sum_{i=1}^n a_i \cdot x_i \bmod m \geq t $ for $a_1, \ldots, a_n, m, t \in \mathbb{N}$. We design robust protocols for all threshold and modulo predicates. We also show that, unfortunately, the techniques in the literature that construct a protocol for a Boolean combination of predicates given protocols for the conjuncts do not preserve robustness. So the question remains open.

binary · 代碼 · 知識 (knowledge) · MoDELS · Processing（編程語言） ·

2024 年 12 月 15 日

A Progressive Transformer for Unifying Binary Code Embedding and Knowledge Transfer

Hanxiao Lu,Hongyu Cai,Yiming Liang,Antonio Bianchi,Z. Berkay Celik

Language model approaches have recently been integrated into binary analysis tasks, such as function similarity detection and function signature recovery. These models typically employ a two-stage training process: pre-training via Masked Language Modeling (MLM) on machine code and fine-tuning for specific tasks. While MLM helps to understand binary code structures, it ignores essential code characteristics, including control and data flow, which negatively affect model generalization. Recent work leverages domain-specific features (e.g., control flow graphs and dynamic execution traces) in transformer-based approaches to improve binary code semantic understanding. However, this approach involves complex feature engineering, a cumbersome and time-consuming process that can introduce predictive uncertainty when dealing with stripped or obfuscated code, leading to a performance drop. In this paper, we introduce ProTST, a novel transformer-based methodology for binary code embedding. ProTST employs a hierarchical training process based on a unique tree-like structure, where knowledge progressively flows from fundamental tasks at the root to more specialized tasks at the leaves. This progressive teacher-student paradigm allows the model to build upon previously learned knowledge, resulting in high-quality embeddings that can be effectively leveraged for diverse downstream binary analysis tasks. The effectiveness of ProTST is evaluated in seven binary analysis tasks, and the results show that ProTST yields an average validation score (F1, MRR, and Recall@1) improvement of 14.8% compared to traditional two-stage training and an average validation score of 10.7% compared to multimodal two-stage frameworks.

tuning · 控制器 · MoDELS · Continuity · Integration ·

2024 年 12 月 13 日

A Clinical Tuning Framework for Continuous Kinematic and Impedance Control of a Powered Knee-Ankle Prosthesis

Emma Reznick,T. Kevin Best,Robert Gregg

from arxiv, 9 pages, 8 figures, and Appendix

Objective: Configuring a prosthetic leg is an integral part of the fitting process, but the personalization of a multi-modal powered knee-ankle prosthesis is often too complex to realize in a clinical environment. This paper develops both the technical means to individualize a hybrid kinematic-impedance controller for variable-incline walking and sit-stand transitions, and an intuitive Clinical Tuning Interface (CTI) that allows prosthetists to directly modify the controller behavior. Methods: Utilizing an established method for predicting kinematic gait individuality alongside a new parallel approach for kinetic individuality, we applied tuned characteristics exclusively from level-ground walking to personalize continuous-phase/task models of joint kinematics and impedance. To take advantage of this method, we developed a CTI that translates common clinical tuning parameters into model adjustments. We then conducted a case study involving an above-knee amputee participant where a prosthetist iteratively tuned the prosthesis in a simulated clinical session involving walking and sit-stand transitions. Results: The prosthetist fully tuned the multi-activity prosthesis controller in under 20 min. Each iteration of tuning (i.e., observation, parameter adjustment, and model reprocessing) took 2 min on average for walking and 1 min on average for sit-stand. The tuned behavior changes were appropriately manifested in the commanded prosthesis torques, both at the tuned tasks and across untuned tasks (inclines). Conclusion: The CTI leveraged able-bodied trends to efficiently personalize a wide array of walking tasks and sit-stand transitions. A case-study validated the CTI tuning method and demonstrated the efficiency necessary for powered knee-ankle prostheses to become clinically viable.

穩健性 · Machine Learning · Learning · 量子機器學習 · 奇異的 ·

2024 年 12 月 13 日

Robust Dequantization of the Quantum Singular value Transformation and Quantum Machine Learning Algorithms

Fran?ois Le Gall

from arxiv, 56 pages; v2: minor changes (final journal version)

Several quantum algorithms for linear algebra problems, and in particular quantum machine learning problems, have been "dequantized" in the past few years. These dequantization results typically hold when classical algorithms can access the data via length-squared sampling. In this work we investigate how robust these dequantization results are. We introduce the notion of approximate length-squared sampling, where classical algorithms are only able to sample from a distribution close to the ideal distribution in total variation distance. While quantum algorithms are natively robust against small perturbations, current techniques in dequantization are not. Our main technical contribution is showing how many techniques from randomized linear algebra can be adapted to work under this weaker assumption as well. We then use these techniques to show that the recent low-rank dequantization framework by Chia, Gily\'en, Li, Lin, Tang and Wang (JACM 2022) and the dequantization framework for sparse matrices by Gharibian and Le Gall (STOC 2022), which are both based on the Quantum Singular Value Transformation, can be generalized to the case of approximate length-squared sampling access to the input. We also apply these results to obtain a robust dequantization of many quantum machine learning algorithms, including quantum algorithms for recommendation systems, supervised clustering and low-rank matrix inversion.

MoDELS · 講稿 · Learning · Sphering · 表示 ·

2023 年 11 月 2 日

A Review and Roadmap of Deep Causal Model from Different Causal Structures and Representations

Hang Chen,Keqing Du,Chenguang Li,Xinyu Yang

from arxiv, under review

The fusion of causal models with deep learning introducing increasingly intricate data sets, such as the causal associations within images or between textual components, has surfaced as a focal research area. Nonetheless, the broadening of original causal concepts and theories to such complex, non-statistical data has been met with serious challenges. In response, our study proposes redefinitions of causal data into three distinct categories from the standpoint of causal structure and representation: definite data, semi-definite data, and indefinite data. Definite data chiefly pertains to statistical data used in conventional causal scenarios, while semi-definite data refers to a spectrum of data formats germane to deep learning, including time-series, images, text, and others. Indefinite data is an emergent research sphere inferred from the progression of data forms by us. To comprehensively present these three data paradigms, we elaborate on their formal definitions, differences manifested in datasets, resolution pathways, and development of research. We summarize key tasks and achievements pertaining to definite and semi-definite data from myriad research undertakings, present a roadmap for indefinite data, beginning with its current research conundrums. Lastly, we classify and scrutinize the key datasets presently utilized within these three paradigms.

知識 (knowledge) · Machine Learning · MoDELS · 學成 · Conformer ·

2022 年 5 月 10 日

Knowledge Augmented Machine Learning with Applications in Autonomous Driving: A Survey

Julian W?rmann,Daniel Bogdoll,Etienne Bührle,Han Chen,Evaristus Fuh Chuo,Kostadin Cvejoski,Ludger van Elst,Tobias Glei?ner,Philip Gottschall,Stefan Griesche,Christian Hellert,Christian Hesels,Sebastian Houben,Tim Joseph,Niklas Keil,Johann Kelsch,Hendrik K?nigshof,Erwin Kraft,Leonie Kreuser,Kevin Krone,Tobias Latka,Denny Mattern,Stefan Matthes,Mohsin Munir,Moritz Nekolla,Adrian Paschke,Maximilian Alexander Pintz,Tianming Qiu,Faraz Qureishi,Syed Tahseen Raza Rizvi,J?rg Reichardt,Laura von Rueden,Stefan Rudolph,Alexander Sagel,Gerhard Schunk,Hao Shen,Hendrik Stapelbroek,Vera Stehr,Gurucharan Srinivas,Anh Tuan Tran,Abhishek Vivekanandan,Ya Wang,Florian Wasserrab,Tino Werner,Christian Wirth,Stefan Zwicklbauer

from arxiv, 93 pages

The existence of representative datasets is a prerequisite of many successful artificial intelligence and machine learning models. However, the subsequent application of these models often involves scenarios that are inadequately represented in the data used for training. The reasons for this are manifold and range from time and cost constraints to ethical considerations. As a consequence, the reliable use of these models, especially in safety-critical applications, is a huge challenge. Leveraging additional, already existing sources of knowledge is key to overcome the limitations of purely data-driven approaches, and eventually to increase the generalization capability of these models. Furthermore, predictions that conform with knowledge are crucial for making trustworthy and safe decisions even in underrepresented scenarios. This work provides an overview of existing techniques and methods in the literature that combine data-based models with existing knowledge. The identified approaches are structured according to the categories integration, extraction and conformity. Special attention is given to applications in the field of autonomous driving.

Machine Learning · 學成 · 可辨認的 · 統計量 · 話題 ·

2020 年 4 月 3 日

Aleatoric and Epistemic Uncertainty in Machine Learning: An Introduction to Concepts and Methods

Eyke Hüllermeier,Willem Waegeman

from arxiv, 52 pages

The notion of uncertainty is of major importance in machine learning and constitutes a key element of machine learning methodology. In line with the statistical tradition, uncertainty has long been perceived as almost synonymous with standard probability and probabilistic predictions. Yet, due to the steadily increasing relevance of machine learning for practical applications and related issues such as safety requirements, new problems and challenges have recently been identified by machine learning scholars, and these problems may call for new methodological developments. In particular, this includes the importance of distinguishing between (at least) two different types of uncertainty, often refereed to as aleatoric and epistemic. In this paper, we provide an introduction to the topic of uncertainty in machine learning as well as an overview of hitherto attempts at handling uncertainty in general and formalizing this distinction in particular.

Machine Translation · NMT · Performer · state-of-the-art · 學成 ·

2018 年 6 月 1 日

A Survey of Domain Adaptation for Neural Machine Translation

Chenhui Chu,Rui Wang

from arxiv, COLING 2018, 16 pages, 9 figures

Neural machine translation (NMT) is a deep learning based approach for machine translation, which yields the state-of-the-art translation performance in scenarios where large-scale parallel corpora are available. Although the high-quality and domain-specific translation is crucial in the real world, domain-specific corpora are usually scarce or nonexistent, and thus vanilla NMT performs poorly in such scenarios. Domain adaptation that leverages both out-of-domain parallel corpora as well as monolingual corpora for in-domain translation, is very important for domain-specific translation. In this paper, we give a comprehensive survey of the state-of-the-art domain adaptation techniques for NMT.