
Reinforcement learning is commonly concerned with maximizing accumulated rewards in Markov decision processes. Often, a certain goal state or a subset of the state space attains maximal reward. In such a case, the environment may be considered solved when the goal is reached. Whereas numerous techniques, learning-based or not, exist for solving environments, doing so optimally is the greater challenge: one may, for instance, choose a reward rate that penalizes action effort. Reinforcement learning is currently among the most actively developed frameworks for solving environments optimally by maximizing accumulated reward, in other words, the return. Yet tuning agents is a notoriously hard task, as reported in a series of works. Our aim here is to help the agent learn a near-optimal policy efficiently while ensuring the goal-reaching property of some basis policy that merely solves the environment. We suggest an algorithm that is fairly flexible and can be used to augment practically any agent as long as it includes a critic. A formal proof of the goal-reaching property is provided. Simulation experiments on six problems under five agents, including the benchmarked one, provide empirical evidence that learning can indeed be boosted while the goal-reaching property is ensured.
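As a rough illustration of how a critic can gate a learning agent's actions so that a goal-reaching basis policy is never abandoned unsafely, here is a minimal sketch. All names, the acceptance rule, and the toy problem are assumptions chosen for illustration; this is not the algorithm proposed in the abstract.

```python
import numpy as np

class GoalReachingWrapper:
    """Hypothetical wrapper: act with the learning agent only when a critic-based
    check passes; otherwise fall back to a basis policy known to reach the goal.
    This is an illustrative sketch, not the abstract's algorithm."""

    def __init__(self, agent_policy, basis_policy, critic, threshold=0.0):
        self.agent_policy = agent_policy   # callable: state -> action
        self.basis_policy = basis_policy   # callable: state -> action (goal-reaching)
        self.critic = critic               # callable: (state, action) -> estimated value
        self.threshold = threshold         # minimal required advantage over the fallback

    def act(self, state):
        candidate = self.agent_policy(state)
        fallback = self.basis_policy(state)
        # Accept the learned action only if the critic rates it at least as highly
        # as the goal-reaching fallback action; otherwise keep the fallback.
        if self.critic(state, candidate) >= self.critic(state, fallback) + self.threshold:
            return candidate
        return fallback

# Toy usage on a 1-D "reach the origin" problem.
agent = lambda s: np.sign(s)          # a (bad) exploratory policy
basis = lambda s: -np.sign(s)         # moves toward the goal at 0
critic = lambda s, a: -abs(s + a)     # higher is better: closer to the goal
wrapper = GoalReachingWrapper(agent, basis, critic)
print(wrapper.act(2.0))               # falls back to the basis action -1.0
```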

Related content

Continual learning and few-shot learning are important frontiers in progress toward broader Machine Learning (ML) capabilities. Recently, there has been intense interest in combining the two. One of the first examples to do so was the Continual Few-Shot Learning (CFSL) framework of Antoniou et al. (arXiv:2004.11967). In this study, we extend CFSL in two ways that capture a broader range of challenges important for intelligent agent behaviour in real-world conditions. First, we increased the number of classes by an order of magnitude, making the results more comparable to standard continual learning experiments. Second, we introduced an 'instance test' which requires recognition of specific instances of classes -- a capability of animal cognition that is usually neglected in ML. For an initial exploration of ML model performance under these conditions, we selected representative baseline models from the original CFSL work and added a model variant with replay. As expected, learning more classes is more difficult than in the original CFSL experiments, and, interestingly, the way in which image instances and classes are presented affects classification performance. Surprisingly, accuracy in the baseline instance test is comparable to that of other classification tasks, but poor under significant occlusion and noise. The use of replay for consolidation substantially improves performance for both types of tasks, but particularly for the instance test.

Self-supervised learning (SSL) has revolutionized visual representation learning, but it has not achieved the robustness of human vision. One reason for this could be that SSL does not leverage all the data available to humans during learning. When learning about an object, humans often purposefully turn it or move around it, and research suggests that these interactions can substantially enhance their learning. Here we explore whether such object-related actions can boost SSL. To this end, we extract the actions performed to change from one egocentric view of an object to another in four video datasets. We then introduce a new loss function to learn visual and action embeddings by aligning the performed action with the representations of two images extracted from the same clip. This permits the performed actions to structure the latent visual representation. Our experiments show that our method consistently outperforms previous methods on downstream category recognition. In our analysis, we find that the observed improvement is associated with a better viewpoint-wise alignment of different objects from the same category. Overall, our work demonstrates that embodied interactions with objects can improve SSL of object categories.
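One possible way to implement such an action-conditioned alignment objective is sketched below. The encoder dimensions, the predictor, and the cosine-based loss are assumptions chosen for illustration, not the authors' implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class ActionAlignedSSL(nn.Module):
    """Illustrative sketch: align an embedded action with the change between the
    representations of two views from the same clip. Sizes and the exact loss
    are assumptions, not the paper's implementation."""

    def __init__(self, img_dim=512, act_dim=8, emb_dim=128):
        super().__init__()
        self.visual_head = nn.Linear(img_dim, emb_dim)    # projects backbone features
        self.action_head = nn.Linear(act_dim, emb_dim)    # embeds the recorded action
        self.predictor = nn.Linear(2 * emb_dim, emb_dim)  # predicts the second view

    def forward(self, feat1, feat2, action):
        z1 = F.normalize(self.visual_head(feat1), dim=-1)
        z2 = F.normalize(self.visual_head(feat2), dim=-1)
        a = F.normalize(self.action_head(action), dim=-1)
        # Predict the second view's embedding from the first view and the action,
        # so that the performed action structures the latent visual representation.
        z2_pred = F.normalize(self.predictor(torch.cat([z1, a], dim=-1)), dim=-1)
        return 1.0 - F.cosine_similarity(z2_pred, z2, dim=-1).mean()

# Toy usage with random backbone features and action vectors.
model = ActionAlignedSSL()
loss = model(torch.randn(4, 512), torch.randn(4, 512), torch.randn(4, 8))
loss.backward()
```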

Humans can learn both individual episodes and generalizable rules, and they successfully retain both kinds of acquired knowledge over time. In the cognitive science literature, (1) learning episodes versus learning rules and (2) learning versus remembering are often conceptualized as competing processes that necessitate separate, complementary learning systems. Inspired by recent research in statistical learning, we challenge these trade-offs, hypothesizing that they arise from capacity limitations rather than from the inherent incompatibility of the underlying cognitive processes. Using an associative learning task, we show that a single system with excess representational capacity can learn and remember both episodes and rules.

The significant features identified in a representative subset of the dataset during the learning process of an artificial intelligence model are referred to as a 'global' explanation. Three-dimensional (3D) global explanations are crucial in neuroimaging, where a complex representational space demands more than basic two-dimensional interpretations. Currently, studies in the literature lack accurate, low-complexity, and 3D global explanations in neuroimaging and beyond. To fill this gap, we develop a novel explainable artificial intelligence (XAI) 3D-Framework that provides robust, faithful, and low-complexity global explanations. We evaluated our framework on various 3D deep learning networks trained, validated, and tested on a well-annotated cohort of 596 MRI images. The focus of detection was the presence or absence of the paracingulate sulcus, a highly variable feature of brain topology associated with symptoms of psychosis. Our proposed 3D-Framework outperformed traditional XAI methods in terms of faithfulness for global explanations. As a result, these explanations uncovered new patterns that not only enhance the credibility and reliability of the training process but also reveal the broader developmental landscape of the human cortex. Our XAI 3D-Framework proposes, for the first time, a way to use global explanations to discover the context in which the detection of specific features is embedded, advancing our understanding of normative brain development and of the atypical trajectories that can lead to the emergence of mental illness.
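As a hedged illustration of the notion of a global explanation defined above (and not of the proposed 3D-Framework itself), one common way to obtain a low-complexity global map is to aggregate per-sample 3D attributions over a representative subset and keep only the most significant voxels:

```python
import numpy as np

def global_explanation(attribution_fn, volumes, top_fraction=0.01):
    """Illustrative sketch: average per-scan 3D attribution maps over a
    representative subset, then keep the top fraction of voxels.
    attribution_fn is any per-sample XAI method (e.g., input gradients);
    this is not the framework proposed in the abstract."""
    mean_map = np.mean([np.abs(attribution_fn(v)) for v in volumes], axis=0)
    cutoff = np.quantile(mean_map, 1.0 - top_fraction)
    return mean_map * (mean_map >= cutoff)   # sparse, low-complexity global map

# Toy usage: identity "attributions" on 8 random volumes of shape 16x16x16.
vols = [np.random.rand(16, 16, 16) for _ in range(8)]
explanation = global_explanation(lambda v: v, vols)
print(np.count_nonzero(explanation))
```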

The individualization of learning content based on digital technologies promises large individual and social benefits. However, it remains an open question how this individualization can be implemented. To tackle this question, we conduct a randomized controlled trial on a large digital self-learning platform. We develop an algorithm based on two convolutional neural networks that assigns tasks to 4,365 learners according to their learning paths. Learners are randomized into three groups: two treatment groups -- a group-based adaptive treatment group and an individual adaptive treatment group -- and one control group. We analyze the differences between the three groups with respect to the effort learners provide and their performance on the platform. Our null results shed light on the multiple challenges associated with the individualization of learning paths.

We review some recent developments in the theory of spatial extremes related to Pareto processes and the modelling of threshold exceedances. We provide theoretical background; methodology for modelling, simulation, and inference; and an illustration on wave height modelling. This preprint is an author version of a chapter to appear in a collaborative book.
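For readers unfamiliar with the univariate building block behind threshold-exceedance modelling, the Pickands-Balkema-de Haan theorem states that, for a high threshold $u$ and a suitable scale $\sigma(u) > 0$, exceedances are approximately generalized Pareto:

\[
P\left(X - u \le y \mid X > u\right) \;\approx\; H_{\xi,\sigma}(y) \;=\; 1 - \left(1 + \xi \frac{y}{\sigma}\right)_{+}^{-1/\xi},
\]

where $\xi$ is the shape (tail-index) parameter and the case $\xi = 0$ is read as the exponential limit $1 - \exp(-y/\sigma)$; Pareto processes extend this peaks-over-threshold idea to functional, spatial exceedances.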

Linear causal disentanglement is a recent method in causal representation learning to describe a collection of observed variables via latent variables with causal dependencies between them. It can be viewed as a generalization of both independent component analysis and linear structural equation models. We study the identifiability of linear causal disentanglement, assuming access to data under multiple contexts, each given by an intervention on a latent variable. We show that one perfect intervention on each latent variable is sufficient and in the worst case necessary to recover parameters under perfect interventions, generalizing previous work to allow more latent than observed variables. We give a constructive proof that computes parameters via a coupled tensor decomposition. For soft interventions, we find the equivalence class of latent graphs and parameters that are consistent with observed data, via the study of a system of polynomial equations. Our results hold assuming the existence of non-zero higher-order cumulants, which implies non-Gaussianity of variables.
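In one standard formulation (the notation below is an assumption, since the abstract does not fix it), the model reads

\[
Z = A Z + \varepsilon, \qquad X = H Z,
\]

where the latent vector $Z$ follows a linear structural equation model with an acyclic coefficient matrix $A$ and independent, non-Gaussian noise $\varepsilon$, and $H$ maps latent variables to observations (possibly with more latent than observed variables). Setting $A = 0$ recovers independent component analysis, taking $H$ to be the identity recovers a linear structural equation model, and an intervention on latent variable $i$ replaces the $i$-th structural equation.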

This research presents a comprehensive approach to predicting the duration of traffic incidents and classifying them as short-term or long-term across the Sydney Metropolitan Area. Leveraging a dataset that encompasses detailed records of traffic incidents, road network characteristics, and socio-economic indicators, we train and evaluate a variety of advanced machine learning models, including Gradient Boosted Decision Trees (GBDT), Random Forest, LightGBM, and XGBoost. The models are assessed using Root Mean Square Error (RMSE) for regression tasks and the F1 score for classification tasks. Our experimental results demonstrate that XGBoost and LightGBM outperform conventional models, with XGBoost achieving the lowest RMSE of 33.7 for predicting incident duration and the highest classification F1 score of 0.62 for a 30-minute duration threshold. For classification, the 30-minute threshold balances performance, with 70.84% short-term and 62.72% long-term duration classification accuracy. Feature importance analysis, employing both tree split counts and SHAP values, identifies the number of affected lanes, traffic volume, and types of primary and secondary vehicles as the most influential features. The proposed methodology not only achieves high predictive accuracy but also provides stakeholders with vital insights into factors contributing to incident durations. These insights enable more informed decision-making for traffic management and response strategies. The code is available at //github.com/Future-Mobility-Lab/SydneyIncidents
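A minimal sketch of the regression and 30-minute classification setup described above, using synthetic placeholder data (the real features come from incident, road-network, and socio-economic records, and the exact model settings are not specified in the abstract):

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.metrics import mean_squared_error, f1_score
from xgboost import XGBRegressor, XGBClassifier

# Placeholder data; the real pipeline uses the features described in the abstract.
X = np.random.rand(1000, 10)                     # feature matrix (synthetic)
duration = np.random.exponential(30, size=1000)  # incident duration in minutes

X_tr, X_te, y_tr, y_te = train_test_split(X, duration, random_state=0)

# Regression: predict incident duration, evaluated with RMSE.
reg = XGBRegressor(n_estimators=200, max_depth=4).fit(X_tr, y_tr)
rmse = mean_squared_error(y_te, reg.predict(X_te)) ** 0.5

# Classification: short-term vs. long-term at a 30-minute threshold, evaluated with F1.
clf = XGBClassifier(n_estimators=200, max_depth=4).fit(X_tr, y_tr > 30)
f1 = f1_score(y_te > 30, clf.predict(X_te))

print(f"RMSE: {rmse:.1f} min, F1 @ 30 min: {f1:.2f}")
```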

The remarkable practical success of deep learning has revealed some major surprises from a theoretical perspective. In particular, simple gradient methods easily find near-optimal solutions to non-convex optimization problems, and despite giving a near-perfect fit to training data without any explicit effort to control model complexity, these methods exhibit excellent predictive accuracy. We conjecture that specific principles underlie these phenomena: that overparametrization allows gradient methods to find interpolating solutions, that these methods implicitly impose regularization, and that overparametrization leads to benign overfitting. We survey recent theoretical progress that provides examples illustrating these principles in simpler settings. We first review classical uniform convergence results and why they fall short of explaining aspects of the behavior of deep learning methods. We give examples of implicit regularization in simple settings, where gradient methods lead to minimal norm functions that perfectly fit the training data. Then we review prediction methods that exhibit benign overfitting, focusing on regression problems with quadratic loss. For these methods, we can decompose the prediction rule into a simple component that is useful for prediction and a spiky component that is useful for overfitting but, in a favorable setting, does not harm prediction accuracy. We focus specifically on the linear regime for neural networks, where the network can be approximated by a linear model. In this regime, we demonstrate the success of gradient flow, and we consider benign overfitting with two-layer networks, giving an exact asymptotic analysis that precisely demonstrates the impact of overparametrization. We conclude by highlighting the key challenges that arise in extending these insights to realistic deep learning settings.
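A concrete instance of the implicit-regularization principle mentioned above: for overparameterized linear regression with $X$ of full row rank (more parameters than samples), gradient descent on the squared loss initialized at zero converges to the minimum-$\ell_2$-norm interpolant

\[
\hat\theta \;=\; \arg\min_{\theta : X\theta = y} \|\theta\|_2 \;=\; X^{\top}\left(X X^{\top}\right)^{-1} y,
\]

and it is the decomposition of such interpolants into a component useful for prediction and a spiky component that absorbs the noise that the benign-overfitting analyses make precise.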

Deep learning is usually described as an experiment-driven field under continuous criticism for lacking theoretical foundations. This problem has been partially addressed by a large volume of literature which, so far, has not been well organized. This paper reviews and organizes the recent advances in deep learning theory. The literature is categorized into six groups: (1) complexity- and capacity-based approaches for analyzing the generalizability of deep learning; (2) stochastic differential equations and their dynamic systems for modelling stochastic gradient descent and its variants, which characterize the optimization and generalization of deep learning, partially inspired by Bayesian inference; (3) the geometrical structures of the loss landscape that drive the trajectories of the dynamic systems; (4) the roles of over-parameterization of deep neural networks from both positive and negative perspectives; (5) theoretical foundations of several special structures in network architectures; and (6) the increasingly intensive concerns about ethics and security and their relationships with generalizability.
