Reinforcement learning (RL) theory, with an emphasis on sample complexity analysis.
This course introduces the fundamental concepts and core principles that make up modern computer operating systems. Its goal is to explain concepts and principles that are likely to remain relevant for many years to come, and it serves as a starting point for the study of operating systems and distributed systems. Specifically, the course covers processes, concurrency, synchronization, scheduling, multiprogramming, memory management, and file systems.
This is an introductory course on machine learning. Machine learning is a set of techniques that allow machines to learn from data and experience, rather than requiring humans to specify the desired behavior by hand. Over the past 20 years, machine learning techniques have become increasingly important both in academic artificial intelligence research and in the technology industry. This course provides a broad introduction to some of the most commonly used ML algorithms.
The first half of the course focuses on supervised learning. We begin with nearest neighbors, decision trees, and ensembles. We then introduce parametric models, including linear regression, logistic regression, softmax regression, and neural networks. We next turn to unsupervised learning, with particular attention to probabilistic models, as well as principal component analysis and k-means. Finally, we cover the basics of reinforcement learning.
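To make one of the listed methods concrete, here is a minimal NumPy sketch of k-means clustering on synthetic data; the cluster count, iteration budget, and toy data are illustrative choices, not values taken from the course.

```python
import numpy as np

def kmeans(X, k, n_iters=100, seed=0):
    """Plain k-means: alternate between assigning points and updating centroids."""
    rng = np.random.default_rng(seed)
    # Initialize centroids by picking k distinct data points at random.
    centroids = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(n_iters):
        # Assignment step: each point goes to its nearest centroid.
        dists = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=-1)
        labels = dists.argmin(axis=1)
        # Update step: each centroid becomes the mean of its assigned points.
        new_centroids = np.array([
            X[labels == j].mean(axis=0) if np.any(labels == j) else centroids[j]
            for j in range(k)
        ])
        if np.allclose(new_centroids, centroids):
            break
        centroids = new_centroids
    return centroids, labels

# Toy usage: two well-separated Gaussian blobs.
rng = np.random.default_rng(1)
X = np.vstack([rng.normal(0, 1, (50, 2)), rng.normal(5, 1, (50, 2))])
centroids, labels = kmeans(X, k=2)
print(centroids)
```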
Course content:
//www.cs.toronto.edu/~huang/courses/csc2515_2020f/
Recommended readings:
Hastie, Tibshirani, and Friedman, “The Elements of Statistical Learning”.
Christopher Bishop, “Pattern Recognition and Machine Learning”, 2006.
Kevin Murphy, “Machine Learning: A Probabilistic Perspective”, 2012.
David MacKay, “Information Theory, Inference, and Learning Algorithms”, 2003.
Shai Shalev-Shwartz & Shai Ben-David, “Understanding Machine Learning: From Theory to Algorithms”, 2014.
Learning roadmap:
[Introduction] Pieter Abbeel is a professor at UC Berkeley and the director of the Berkeley Robot Learning Lab. His new course, CS294 Deep Unsupervised Learning, covers two areas: generative models and self-supervised learning. The 15-week course comes with videos, slides, and other resources that help build an understanding of unsupervised deep learning. The most recent installment covers Generative Adversarial Networks, with 257 pages of slides spanning GAN, DC GAN, ImprovedGAN, WGAN, WGAN-GP, Progr.GAN, SN-GAN, SAGAN, BigGAN(-Deep), StyleGAN-v1,2, VIB-GAN, and GANs as Energy Models. Well worth a look!
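For readers who want a code-level anchor for the generative-model portion, below is a minimal, generic GAN training step in PyTorch (the standard non-saturating objective on toy 2-D data); it is a sketch for orientation only and is not taken from the course slides.

```python
import torch
import torch.nn as nn

# Tiny generator and discriminator for toy 2-D data; all sizes are arbitrary.
G = nn.Sequential(nn.Linear(8, 32), nn.ReLU(), nn.Linear(32, 2))
D = nn.Sequential(nn.Linear(2, 32), nn.ReLU(), nn.Linear(32, 1))

opt_g = torch.optim.Adam(G.parameters(), lr=2e-4)
opt_d = torch.optim.Adam(D.parameters(), lr=2e-4)
bce = nn.BCEWithLogitsLoss()

def train_step(real):
    batch = real.size(0)
    z = torch.randn(batch, 8)
    fake = G(z)

    # Discriminator step: push real samples toward label 1, fakes toward 0.
    opt_d.zero_grad()
    d_loss = bce(D(real), torch.ones(batch, 1)) + \
             bce(D(fake.detach()), torch.zeros(batch, 1))
    d_loss.backward()
    opt_d.step()

    # Generator step (non-saturating loss): make D label fakes as real.
    opt_g.zero_grad()
    g_loss = bce(D(fake), torch.ones(batch, 1))
    g_loss.backward()
    opt_g.step()
    return d_loss.item(), g_loss.item()

# Toy usage: "real" data drawn from a shifted Gaussian.
for _ in range(5):
    real = torch.randn(64, 2) + 3.0
    train_step(real)
```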
Contents:
[Introduction] In the Spring 2020 semester, Carnegie Mellon University (CMU) once again offered the classic course Probabilistic Graphical Models, taught by Professor Eric P. Xing. First offered in 2005, the course has been running for well over a decade; it has influenced generation after generation of computer scientists and trained a great many machine learning researchers. Probabilistic graphical models remain a very active direction in machine learning today, so interested readers should not miss it.
Course description:
In artificial intelligence, statistics, computer systems, computer vision, natural language processing, computational biology, and many other fields, a large number of problems can be viewed as deriving consistent global conclusions from local information. The probabilistic graphical model framework provides a unified view of this broad class of problems and supports efficient inference, decision making, and learning for problems with very large numbers of attributes and huge datasets. Whether you intend to apply graphical models to complex problems or to study graphical models as a core research topic, this course will give you a solid foundation.
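As a toy illustration of turning local information into a consistent global conclusion (not part of the course materials), the following sketch computes a posterior in a hypothetical three-variable Bayesian network by brute-force enumeration; all probabilities are invented.

```python
import itertools

# Hypothetical network: Rain -> Sprinkler, and (Sprinkler, Rain) -> GrassWet.
P_rain = {True: 0.2, False: 0.8}
P_sprinkler = {True: {True: 0.01, False: 0.99},   # P(S | R=True)
               False: {True: 0.4, False: 0.6}}    # P(S | R=False)
P_wet = {  # P(W=True | S, R)
    (True, True): 0.99, (True, False): 0.9,
    (False, True): 0.8, (False, False): 0.0,
}

def joint(r, s, w):
    """Product of local conditional probabilities = global joint probability."""
    pw = P_wet[(s, r)]
    return P_rain[r] * P_sprinkler[r][s] * (pw if w else 1 - pw)

# Posterior P(Rain=True | GrassWet=True), summing out the hidden variable S.
num = sum(joint(True, s, True) for s in (True, False))
den = sum(joint(r, s, True) for r, s in itertools.product((True, False), repeat=2))
print(num / den)
```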
Professor Eric P. Xing (邢波)
Eric P. Xing is a professor of computer science at Carnegie Mellon University and the founder, CEO, and chief scientist of Petuum Inc., a 2018 World Economic Forum Technology Pioneer company that builds standardized AI development platforms and operating systems for broad, general-purpose industrial AI applications. He holds a Ph.D. in molecular biology and biochemistry from Rutgers, the State University of New Jersey, and a Ph.D. in computer science from the University of California, Berkeley (UC Berkeley). His research interests center on the methodology and theory of machine learning and statistical learning, and on the development of large-scale computing systems and architectures, for automated learning, inference, and decision making in the high-dimensional, multimodal, and dynamic possible worlds found in complex systems. He serves or has served as an associate editor of the Journal of the American Statistical Association (JASA), the Annals of Applied Statistics (AOAS), IEEE Transactions on Pattern Analysis and Machine Intelligence (PAMI), and PLoS Computational Biology, and as an action editor of the Machine Learning Journal (MLJ) and the Journal of Machine Learning Research (JMLR). He is a member of the DARPA Information Science and Technology (ISAT) advisory group, and his honors include the NSF CAREER Award, an Alfred P. Sloan Fellowship, the United States Air Force Young Investigator Award, the IBM Open Collaborative Research Faculty Award, and multiple paper awards. He served as chair of the International Conference on Machine Learning (ICML) in 2014.
Course information:
The tutorial is written for those who would like an introduction to reinforcement learning (RL). The aim is to provide an intuitive presentation of the ideas rather than concentrate on the deeper mathematics underlying the topic. RL is generally used to solve the so-called Markov decision problem (MDP). In other words, the problem that you are attempting to solve with RL should be an MDP or its variant. The theory of RL relies on dynamic programming (DP) and artificial intelligence (AI). We will begin with a quick description of MDPs. We will discuss what we mean by “complex” and “large-scale” MDPs. Then we will explain why RL is needed to solve complex and large-scale MDPs. The semi-Markov decision problem (SMDP) will also be covered.
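Since MDPs and dynamic programming are the foundation here, a minimal value iteration sketch may help fix ideas; the two-state MDP below, with its transition probabilities and rewards, is invented purely for illustration.

```python
import numpy as np

# Hypothetical MDP with 2 states and 2 actions.
# P[a][s][s'] = transition probability, R[a][s] = expected immediate reward.
P = np.array([[[0.9, 0.1], [0.2, 0.8]],   # action 0
              [[0.3, 0.7], [0.6, 0.4]]])  # action 1
R = np.array([[5.0, -1.0],                # action 0
              [10.0, 2.0]])               # action 1
gamma = 0.95

V = np.zeros(2)
for _ in range(1000):
    # Bellman optimality backup: Q(s,a) = R(s,a) + gamma * sum_s' P(s'|s,a) V(s').
    Q = R + gamma * (P @ V)      # shape (actions, states)
    V_new = Q.max(axis=0)
    if np.max(np.abs(V_new - V)) < 1e-8:
        break
    V = V_new

policy = Q.argmax(axis=0)        # greedy action per state
print(V, policy)
```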
The tutorial is meant to serve as an introduction to these topics and is based mostly on the book “Simulation-based Optimization: Parametric Optimization Techniques and Reinforcement Learning” [4], which discusses the topic in greater detail in the context of simulators. There are at least two other textbooks I would recommend: (i) Neuro-Dynamic Programming [2] (lots of detail on convergence analysis) and (ii) Reinforcement Learning: An Introduction [11] (lots of detail on the underlying AI concepts). A more recent tutorial on the topic is [8]. This tutorial has two sections: Section 2 discusses MDPs and SMDPs, and Section 3 discusses RL. By the end of this tutorial, you should be able to identify problem structures that can be set up as MDPs/SMDPs and use some RL algorithms.
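As one concrete instance of “using an RL algorithm”, here is a minimal tabular Q-learning sketch on a hypothetical five-state chain environment; the environment, hyperparameters, and reward structure are invented for illustration and do not come from the tutorial.

```python
import random

# Hypothetical 5-state chain: action 1 moves right, action 0 moves left;
# reaching the rightmost state yields reward 1 and ends the episode.
N_STATES, ACTIONS = 5, (0, 1)
alpha, gamma, eps = 0.1, 0.9, 0.1
Q = {(s, a): 0.0 for s in range(N_STATES) for a in ACTIONS}

def step(s, a):
    s2 = min(s + 1, N_STATES - 1) if a == 1 else max(s - 1, 0)
    done = s2 == N_STATES - 1
    return s2, (1.0 if done else 0.0), done

for _ in range(2000):
    s, done = 0, False
    while not done:
        # Epsilon-greedy action selection.
        if random.random() < eps:
            a = random.choice(ACTIONS)
        else:
            a = max(ACTIONS, key=lambda b: Q[(s, b)])
        s2, r, done = step(s, a)
        # Q-learning update: bootstrap from the greedy value of the next state.
        target = r + (0.0 if done else gamma * max(Q[(s2, b)] for b in ACTIONS))
        Q[(s, a)] += alpha * (target - Q[(s, a)])
        s = s2

# Greedy policy learned for each state (should always move right here).
print([max(ACTIONS, key=lambda b: Q[(s, b)]) for s in range(N_STATES)])
```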
Interpretability and Explainability in Machine Learning
While supervised learning has enabled great progress in many applications, unsupervised learning has not seen such widespread adoption, and remains an important and challenging endeavor for artificial intelligence. In this work, we propose a universal unsupervised learning approach to extract useful representations from high-dimensional data, which we call Contrastive Predictive Coding. The key insight of our model is to learn such representations by predicting the future in latent space by using powerful autoregressive models. We use a probabilistic contrastive loss which induces the latent space to capture information that is maximally useful to predict future samples. It also makes the model tractable by using negative sampling. While most prior work has focused on evaluating representations for a particular modality, we demonstrate that our approach is able to learn useful representations achieving strong performance on four distinct domains: speech, images, text and reinforcement learning in 3D environments.
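A rough PyTorch sketch of an InfoNCE-style contrastive loss in the spirit described above follows; it is a simplified, generic variant (cosine similarities with in-batch negatives), not the authors' implementation, and all tensor shapes are arbitrary.

```python
import torch
import torch.nn.functional as F

def info_nce_loss(pred, future, temperature=0.1):
    """Contrastive loss: each predicted latent should score its own future
    encoding higher than the other samples in the batch (the negatives)."""
    pred = F.normalize(pred, dim=-1)       # (batch, dim) predictions from an autoregressive model
    future = F.normalize(future, dim=-1)   # (batch, dim) encodings of the true future observations
    logits = pred @ future.t() / temperature   # (batch, batch) similarity scores
    targets = torch.arange(pred.size(0))       # positive pair lies on the diagonal
    return F.cross_entropy(logits, targets)

# Toy usage with random vectors standing in for encoder / autoregressive outputs.
pred = torch.randn(32, 128)
future = torch.randn(32, 128)
print(info_nce_loss(pred, future).item())
```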
Recent years have witnessed significant progress in deep Reinforcement Learning (RL). Empowered with large-scale neural networks, carefully designed architectures, novel training algorithms and massively parallel computing devices, researchers are able to attack many challenging RL problems. However, in machine learning, more training power comes with a potential risk of more overfitting. As deep RL techniques are being applied to critical problems such as healthcare and finance, it is important to understand the generalization behaviors of the trained agents. In this paper, we conduct a systematic study of standard RL agents and find that they could overfit in various ways. Moreover, overfitting could happen "robustly": commonly used techniques in RL that add stochasticity do not necessarily prevent or detect overfitting. In particular, the same agents and learning algorithms could have drastically different test performance, even when all of them achieve optimal rewards during training. The observations call for more principled and careful evaluation protocols in RL. We conclude with a general discussion on overfitting in RL and a study of the generalization behaviors from the perspective of inductive bias.
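One simple form of a more careful evaluation protocol, sketched below under an assumed agent/environment interface (act, reset, step), is to compare average returns on training seeds against returns on held-out seeds never used during training; this is an illustrative sketch, not the paper's exact setup.

```python
import numpy as np

train_seeds = list(range(10))        # seeds the agent was trained on (assumed)
test_seeds = list(range(100, 110))   # held-out seeds, never seen during training

def evaluate(agent, make_env, seeds, episodes_per_seed=5):
    """Average return over environments built from the given seeds.

    `agent` and `make_env` are hypothetical: `make_env(seed)` returns an
    environment with `reset() -> obs` and `step(action) -> (obs, reward, done)`,
    and `agent.act(obs)` returns an action.
    """
    returns = []
    for seed in seeds:
        env = make_env(seed)
        for _ in range(episodes_per_seed):
            obs, done, total = env.reset(), False, 0.0
            while not done:
                obs, reward, done = env.step(agent.act(obs))
                total += reward
            returns.append(total)
    return float(np.mean(returns))

# A large gap between these two numbers is evidence of overfitting:
# train_return = evaluate(agent, make_env, train_seeds)
# test_return  = evaluate(agent, make_env, test_seeds)
```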