
from arxiv, Accepted at the Theory and Foundation of Continual Learning Workshop, International Conference on Machine Learning (ICML) 2021. Supplementary material included. V4 includes additional results of linear probing of intermediate representations that were added to the supplementary material.

In class-incremental learning, an agent with limited resources needs to learn a sequence of classification tasks, forming an ever-growing classification problem, with the constraint of not being able to access data from previous tasks. The main difference from task-incremental learning, where a task-ID is available at inference time, is that the learner also needs to perform cross-task discrimination, i.e. distinguish between classes that have not been seen together. Approaches to tackle this problem are numerous and mostly make use of an external memory (buffer) of non-negligible size. In this paper, we ablate the learning of cross-task features and study its influence on the performance of basic replay strategies used for class-IL. We also define a new forgetting measure for class-incremental learning, and see that forgetting is not the principal cause of low performance. Our experimental results show that future algorithms for class-incremental learning should not only prevent forgetting, but also aim to improve the quality of the cross-task features and the knowledge transfer between tasks. This is especially important when tasks contain a limited amount of data.

Related content

Learning


Estimation/estimator · binary · Analysis · Normalized · Unbiased

July 9, 2024

Effect estimation in the presence of a misclassified binary mediator

Kimberly A. Hochstedler Webb,Martin T. Wells

from arxiv, 44 pages, 5 figures, 6 tables

Mediation analyses allow researchers to quantify the effect of an exposure variable on an outcome variable through a mediator variable. If a binary mediator variable is misclassified, the resulting analysis can be severely biased. Misclassification is especially difficult to deal with when it is differential and when there are no gold standard labels available. Previous work has addressed this problem using a sensitivity analysis framework or by assuming that misclassification rates are known. We leverage a variable related to the misclassification mechanism to recover unbiased parameter estimates without using gold standard labels. The proposed methods require the reasonable assumption that the sum of the sensitivity and specificity is greater than 1. Three correction methods are presented: (1) an ordinary least squares correction for Normal outcome models, (2) a multi-step predictive value weighting method, and (3) a seamless expectation-maximization algorithm. We apply our misclassification correction strategies to investigate the mediating role of gestational hypertension on the association between maternal age and pre-term birth.
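The assumption that sensitivity plus specificity exceeds 1 can be illustrated with a minimal sketch. The confusion-matrix counts and function names below are hypothetical and serve only to show what the condition means; the methods in the abstract work without gold standard labels, so counts like these would not be observed in that setting.

```python
def sensitivity_specificity(tp, fn, tn, fp):
    """Return (sensitivity, specificity) from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)  # share of true positives correctly labelled
    specificity = tn / (tn + fp)  # share of true negatives correctly labelled
    return sensitivity, specificity


def satisfies_assumption(tp, fn, tn, fp):
    """Check the condition sensitivity + specificity > 1."""
    sens, spec = sensitivity_specificity(tp, fn, tn, fp)
    return sens + spec > 1


# Hypothetical counts: a mediator measured better than chance.
informative = satisfies_assumption(tp=80, fn=20, tn=70, fp=30)
# Hypothetical counts: a mediator measured worse than chance.
uninformative = satisfies_assumption(tp=30, fn=70, tn=40, fp=60)
```

A measurement with sensitivity and specificity summing to exactly 1 carries no information about the true mediator value, which is one way to read why the condition is described as reasonable.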

Scenario · Processing (programming language) · MoDELS · Linear · Performer

July 9, 2024

Compact formulations and valid inequalities for parallel machine scheduling with conflicts

Phablo F. S. Moura,Roel Leus,Hande Yaman

The problem of scheduling conflicting jobs on parallel machines consists in assigning a set of jobs to a set of machines so that no two conflicting jobs are allocated to the same machine, and the maximum processing time among all machines is minimized. We propose a new compact mixed integer linear formulation based on the representatives model for the vertex coloring problem, which overcomes a number of issues inherent in the natural assignment model. We present a polyhedral study of the associated polytope, and describe classes of valid inequalities inherited from the stable set polytope. We describe branch-and-cut algorithms for the problem, and report on computational experiments with benchmark instances. Our computational results on the hardest instances of the benchmark set show that the proposed algorithms are superior (either in running time or quality of the solutions) to the current state-of-the-art methods. We find that our new method performs better than the existing ones especially when the gap between the optimal value and the trivial lower bound (i.e., the sum of all processing times divided by the number of machines) increases.
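The problem statement and the trivial lower bound mentioned above can be sketched in a few lines. The instance data and function names are hypothetical, and the code only evaluates a given assignment; it is not the formulation or the branch-and-cut algorithm from the paper.

```python
def trivial_lower_bound(processing_times, num_machines):
    """Sum of all processing times divided by the number of machines."""
    return sum(processing_times) / num_machines


def makespan(assignment, processing_times, num_machines):
    """Maximum total processing time over all machines."""
    loads = [0] * num_machines
    for job, machine in enumerate(assignment):
        loads[machine] += processing_times[job]
    return max(loads)


def is_conflict_free(assignment, conflicts):
    """True if no two conflicting jobs share a machine."""
    return all(assignment[a] != assignment[b] for a, b in conflicts)


# Hypothetical instance: 4 jobs, 2 machines, two conflicting pairs.
processing_times = [4, 3, 3, 2]
conflicts = [(0, 1), (2, 3)]
assignment = [0, 1, 0, 1]  # assignment[j] is the machine of job j

bound = trivial_lower_bound(processing_times, 2)
value = makespan(assignment, processing_times, 2)
feasible = is_conflict_free(assignment, conflicts)
```

In this toy instance the feasible assignment has makespan 7 while the trivial bound is 6, so there is a gap between the two, the regime the abstract singles out.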

Manifold · Latent · Learning · Optimizer · AIM

July 8, 2024

System stabilization with policy optimization on unstable latent manifolds

Steffen W. R. Werner,Benjamin Peherstorfer

from arxiv, 29 pages, 10 figures, 1 table

Stability is a basic requirement when studying the behavior of dynamical systems. However, stabilizing dynamical systems via reinforcement learning is challenging because only little data can be collected over short time horizons before instabilities are triggered and data become meaningless. This work introduces a reinforcement learning approach that is formulated over latent manifolds of unstable dynamics so that stabilizing policies can be trained from few data samples. The unstable manifolds are minimal in the sense that they contain the lowest dimensional dynamics that are necessary for learning policies that guarantee stabilization. This is in stark contrast to generic latent manifolds that aim to approximate all -- stable and unstable -- system dynamics and thus are higher dimensional and often require higher amounts of data. Experiments demonstrate that the proposed approach stabilizes even complex physical systems from few data samples for which other methods that operate either directly in the system state space or on generic latent manifolds fail.

XAI · Learning · Processing (programming language) · 3D · Identifiable

July 8, 2024

An explainable three dimension framework to uncover learning patterns: A unified look in variable sulci recognition

Michail Mamalakis,Heloise de Vareilles,Atheer AI-Manea,Samantha C. Mitchell,Ingrid Arartz,Lynn Egeland Morch-Johnsen,Jane Garrison,Jon Simons,Pietro Lio,John Suckling,Graham Murray

The significant features identified in a representative subset of the dataset during the learning process of an artificial intelligence model are referred to as a 'global' explanation. Three-dimensional (3D) global explanations are crucial in neuroimaging where a complex representational space demands more than basic two-dimensional interpretations. Currently, studies in the literature lack accurate, low-complexity, and 3D global explanations in neuroimaging and beyond. To fill this gap, we develop a novel explainable artificial intelligence (XAI) 3D-Framework that provides robust, faithful, and low-complexity global explanations. We evaluated our framework on various 3D deep learning networks trained, validated, and tested on a well-annotated cohort of 596 MRI images. The focus of detection was on the presence or absence of the paracingulate sulcus, a highly variable feature of brain topology associated with symptoms of psychosis. Our proposed 3D-Framework outperformed traditional XAI methods in terms of faithfulness for global explanations. As a result, these explanations uncovered new patterns that not only enhance the credibility and reliability of the training process but also reveal the broader developmental landscape of the human cortex. Our XAI 3D-Framework proposes, for the first time, a way to utilize global explanations to discover the context in which the detection of specific features is embedded, opening our understanding of normative brain development and atypical trajectories that can lead to the emergence of mental illness.

Machine Learning · Quantum machine learning · Learning · MoDELS · Quantum computing

July 7, 2024

Shadows of quantum machine learning

Sofiene Jerbi,Casper Gyurik,Simon C. Marshall,Riccardo Molteni,Vedran Dunjko

from arxiv, 7 + 16 pages, 5 figures; changes in the main text, added content in the appendix

Quantum machine learning is often highlighted as one of the most promising practical applications for which quantum computers could provide a computational advantage. However, a major obstacle to the widespread use of quantum machine learning models in practice is that these models, even once trained, still require access to a quantum computer in order to be evaluated on new data. To solve this issue, we introduce a new class of quantum models where quantum resources are only required during training, while the deployment of the trained model is classical. Specifically, the training phase of our models ends with the generation of a 'shadow model' from which the classical deployment becomes possible. We prove that: i) this class of models is universal for classically-deployed quantum machine learning; ii) it does have restricted learning capacities compared to 'fully quantum' models, but nonetheless iii) it achieves a provable learning advantage over fully classical learners, contingent on widely-believed assumptions in complexity theory. These results provide compelling evidence that quantum machine learning can confer learning advantages across a substantially broader range of scenarios, where quantum computers are exclusively employed during the training phase. By enabling classical deployment, our approach facilitates the implementation of quantum machine learning models in various practical contexts.

Manifold · Latent · Autoencoder · Learning · PDE

July 7, 2024

Ricci flow-guided autoencoders in learning time-dependent dynamics

Andrew Gracyk

from arxiv, Misc. edits and reformatting; new baseline; redid Burger's experiment with last-layer sigmoid activation; improved and new figures

We present a manifold-based autoencoder method for learning dynamics in time, notably partial differential equations (PDEs), in which the manifold latent space evolves according to Ricci flow. This can be accomplished by simulating Ricci flow in a physics-informed setting, and manifold quantities can be matched so that Ricci flow is empirically achieved. With our method, the manifold is discerned through the training procedure, while the latent evolution due to Ricci flow induces a more accommodating representation over static methods. We present our method on a range of experiments consisting of PDE data that encompasses desirable characteristics such as periodicity and randomness. By incorporating latent dynamics, we sustain a manifold latent representation for all values in the ambient PDE time interval. Furthermore, the dynamical manifold latent space facilitates qualities such as learning for out-of-distribution data, and robustness. We showcase our method by demonstrating these features.

Model evaluation · MoDELS · Loss function (machine learning) · Performer · Identifiable

July 5, 2024

Research on target detection method of distracted driving behavior based on improved YOLOv8

Shiquan Shen,Zhizhong Wu,Pan Zhang

from arxiv, Major revision on content, no replacement available soon

With the development of deep learning technology, the detection and classification of distracted driving behaviour requires higher accuracy. Existing deep learning-based methods are computationally intensive and parameter-redundant, limiting their efficiency and accuracy in practical applications. To solve this problem, this study proposes an improved YOLOv8 detection method based on the original YOLOv8 model by integrating the BoTNet module, GAM attention mechanism and EIoU loss function. By optimising the feature extraction and multi-scale feature fusion strategies, the training and inference processes are simplified, and the detection accuracy and efficiency are significantly improved. Experimental results show that the improved model performs well in both detection speed and accuracy, with an accuracy rate of 99.4%. The model is also smaller and easy to deploy, and it is able to identify and classify distracted driving behaviours in real time, provide timely warnings, and enhance driving safety.

Linear · Latent variable/hidden variable · Latent · Observed variable · Mutually independent

July 5, 2024

Linear causal disentanglement via higher-order cumulants

Paula Leyes Carreno,Chiara Meroni,Anna Seigal

Linear causal disentanglement is a recent method in causal representation learning to describe a collection of observed variables via latent variables with causal dependencies between them. It can be viewed as a generalization of both independent component analysis and linear structural equation models. We study the identifiability of linear causal disentanglement, assuming access to data under multiple contexts, each given by an intervention on a latent variable. We show that one perfect intervention on each latent variable is sufficient and in the worst case necessary to recover parameters under perfect interventions, generalizing previous work to allow more latent than observed variables. We give a constructive proof that computes parameters via a coupled tensor decomposition. For soft interventions, we find the equivalence class of latent graphs and parameters that are consistent with observed data, via the study of a system of polynomial equations. Our results hold assuming the existence of non-zero higher-order cumulants, which implies non-Gaussianity of variables.

Perceptron · Inductive bias · Networking · Neural Networks · Biased

July 5, 2024

Exploiting the equivalence between quantum neural networks and perceptrons

Chris Mingard,Jessica Pointing,Charles London,Yoonsoo Nam,Ard A. Louis

Quantum machine learning models based on parametrized quantum circuits, also called quantum neural networks (QNNs), are considered to be among the most promising candidates for applications on near-term quantum devices. Here we explore the expressivity and inductive bias of QNNs by exploiting an exact mapping from QNNs with inputs $x$ to classical perceptrons acting on $x \otimes x$ (generalised to complex inputs). The simplicity of the perceptron architecture allows us to provide clear examples of the shortcomings of current QNN models, and the many barriers they face to becoming useful general-purpose learning algorithms. For example, a QNN with amplitude encoding cannot express the Boolean parity function for $n\geq 3$, which is but one of an exponential number of data structures that such a QNN is unable to express. Mapping a QNN to a classical perceptron simplifies training, allowing us to systematically study the inductive biases of other, more expressive embeddings on Boolean data. Several popular embeddings primarily produce an inductive bias towards functions with low class balance, reducing their generalisation performance compared to deep neural network architectures which exhibit much richer inductive biases. We explore two alternate strategies that move beyond standard QNNs. In the first, we use a QNN to help generate a classical DNN-inspired kernel. In the second we draw an analogy to the hierarchical structure of deep neural networks and construct a layered non-linear QNN that is provably fully expressive on Boolean data, while also exhibiting a richer inductive bias than simple QNNs. Finally, we discuss characteristics of the QNN literature that may obscure how hard it is to achieve quantum advantage over deep learning algorithms on classical data.

Multimodal · Learning · Graph · Representation learning · MoDELS

September 7, 2022

Geometric multimodal representation learning

Yasha Ektefaie,George Dasoulas,Ayush Noori,Maha Farhat,Marinka Zitnik

from arxiv, 28 pages, 5 figures, 2 boxes

Graph-centric artificial intelligence (graph AI) has achieved remarkable success in modeling interacting systems prevalent in nature, from dynamical systems in biology to particle physics. The increasing heterogeneity of data calls for graph neural architectures that can combine multiple inductive biases. However, combining data from various sources is challenging because appropriate inductive bias may vary by data modality. Multimodal learning methods fuse multiple data modalities while leveraging cross-modal dependencies to address this challenge. Here, we survey 140 studies in graph-centric AI and find that diverse data types are increasingly brought together using graphs and fed into sophisticated multimodal models. These models stratify into image-, language-, and knowledge-grounded multimodal learning. We put forward an algorithmic blueprint for multimodal graph learning based on this categorization. The blueprint serves as a way to group state-of-the-art architectures that treat multimodal data by appropriately choosing four different components. This effort can pave the way for standardizing the design of sophisticated multimodal architectures for highly complex real-world problems.