A concise Causal Inference course from Fan Li at Duke University!
Chapter 1. Introduction
Chapter 2. Randomized experiments
Chapter 2.1. Fisher's and Neyman's modes of inference
Chapter 2.2. Covariate adjustment in RCTs
Chapter 3. Observational studies with ignorable assignments: single-time treatments
Chapter 3.1. Outcome regression
Chapter 3.2. Covariate balance, matching, stratification
Chapter 3.3. Propensity score
Chapter 3.4. Propensity score weighting: inverse probability weighting and overlap weighting
Chapter 3.5. Augmented weighting and doubly robust estimators
Chapter 3.6. Causal inference with multiple or continuous treatments
Chapter 4. Heterogeneous treatment effects and machine learning
Chapter 5. Sensitivity analysis
Chapter 6. Instrumental variables and principal stratification
Chapter 6.1. Instrumental variables (IV), noncompliance in RCTs
Chapter 6.2. Post-treatment confounding: principal stratification
Chapter 7. Regression discontinuity design (RDD)
Chapter 8. Panel data: difference-in-differences (DID) and synthetic control (SC)
Chapter 9. Sequentially ignorable assignments: time-varying treatments
Chapter 10. Bayesian inference for causal effects
The regression discontinuity (RD) design is widely used for program evaluation with observational data. The RD design enables the identification of the local average treatment effect (LATE) at the treatment cutoff by exploiting known deterministic treatment assignment mechanisms. The primary focus of the existing literature has been the development of rigorous estimation methods for the LATE. In contrast, we consider policy learning under the RD design. We develop a robust optimization approach to finding an optimal treatment cutoff that improves upon the existing one. Under the RD design, policy learning requires extrapolation. We address this problem by partially identifying the conditional expectation function of counterfactual outcome under a smoothness assumption commonly used for the estimation of LATE. We then minimize the worst case regret relative to the status quo policy. The resulting new treatment cutoffs have a safety guarantee, enabling policy makers to limit the probability that they yield a worse outcome than the existing cutoff. Going beyond the standard single-cutoff case, we generalize the proposed methodology to the multi-cutoff RD design by developing a doubly robust estimator. We establish the asymptotic regret bounds for the learned policy using semi-parametric efficiency theory. Finally, we apply the proposed methodology to empirical and simulated data sets.
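As an illustration of the estimand the abstract targets, here is a minimal local-linear sketch of estimating the LATE at a single RD cutoff. The simulated data, bandwidth, and functional form are assumptions made for illustration, not the paper's method:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 5000
x = rng.uniform(-1, 1, n)           # running variable, cutoff at 0
d = (x >= 0).astype(float)          # deterministic treatment assignment
tau = 2.0                           # true local effect at the cutoff
y = 1.0 + 0.5 * x + tau * d + rng.normal(0, 0.3, n)

h = 0.2                             # bandwidth, fixed here for illustration

def intercept_at_cutoff(xs, ys):
    """Local linear fit; the intercept is the fitted value at x = 0."""
    X = np.column_stack([np.ones_like(xs), xs])
    beta, *_ = np.linalg.lstsq(X, ys, rcond=None)
    return beta[0]

left = (x < 0) & (x > -h)
right = (x >= 0) & (x < h)
late = intercept_at_cutoff(x[right], y[right]) - intercept_at_cutoff(x[left], y[left])
```

The difference of the two boundary intercepts recovers the jump in the conditional expectation function at the cutoff, which is the LATE under the usual continuity assumptions.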
This is a collection of course materials from the various machine learning courses Mark Schmidt teaches at UBC, including materials for over 100 lectures covering a wide range of machine-learning-related topics.
Part 1: Computer Science 340
Exploratory Data Analysis
Decision Trees (Notes on Big-O Notation)
Fundamentals of Learning (Notation Guide)
Probabilistic Classifiers (Probability Slides, Notes on Probability)
Non-Parametric Models
Ensemble Methods
More Clustering
Outlier Detection
Finding Similar Items
Nonlinear Regression
Gradient Descent
Robust Regression
Feature Selection
Regularization
More Regularization
Linear Classifiers
More Linear Classifiers
Feature Engineering
Convolutions
Kernel Methods
Stochastic Gradient
Boosting
MLE and MAP (Notes on Max and Argmax)
More PCA
Sparse Matrix Factorization
Recommender Systems
Nonlinear Dimensionality Reduction
More Deep Learning
Convolutional Neural Networks
More CNNs
Part 2: Data Science 573 and 575
Structure Learning
Sequence Mining
Tensor Basics
Semi-Supervised Learning
PageRank
Part 3: Computer Science 440
A. Binary Random Variables
Binary Density Estimation
Bernoulli Distribution
MAP Estimation
Generative Classifiers
Discriminative Classifiers
Neural Networks
Double Descent Curves
Automatic Differentiation
Convolutional Neural Networks
Autoencoders
Fully-Convolutional Networks
B. Categorical Random Variables
Monte Carlo Approximation
Conjugate Priors
Bayesian Learning
Empirical Bayes
Multi-Class Classification
What do we learn?
Recurrent Neural Networks
Long Short Term Memory
Attention and Transformers
C. Gaussian Random Variables
Univariate Gaussian
Multivariate Gaussian (Motivation)
Multivariate Gaussian (Definition)
Learning Gaussians
Bayesian Linear Regression
End to End Learning
Exponential Family
D. Markov Models
Markov Chains
Learning Markov Chains
Message Passing
Markov Chain Monte Carlo
Directed Acyclic Graphical Models
Learning Graphical Models
Log-Linear Models
E. Latent-Variable Models
Mixture Models
EM and KDE (Notes on EM)
HMMs and RBMs (Forward-Backward for HMMs)
Topic Models and Variational Inference
VAEs and GANs
This course begins with an introduction to topics including machine learning, security, privacy, adversarial machine learning, and game theory. It then discusses, from a research perspective, the novelty and potential extensions of each topic and its related work. Through a series of readings and projects, students will learn about different machine learning algorithms, analyze their implementations and security vulnerabilities, and develop the ability to carry out research projects on related topics.
//aisecure.github.io/TEACHING/2020_fall.html
Evasion Attacks Against Machine Learning Models (Against Classifiers)
Evasion Attacks Against Machine Learning Models (Non-traditional Attacks)
Evasion Attacks Against Machine Learning Models (Against Detectors/Generative Models/RL)
Evasion Attacks Against Machine Learning Models (Blackbox Attacks)
Detection Against Adversarial Attacks
Defenses Against Adversarial Attacks (Empirical)
Defenses Against Adversarial Attacks (Theoretic)
Poisoning Attacks Against Machine Learning Models
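A minimal numpy sketch of the simplest kind of evasion attack in this list, an FGSM-style perturbation against a toy linear logistic classifier. The weights, input, and step size below are made up purely for illustration:

```python
import numpy as np

# Toy linear logistic classifier: p(y=1|x) = sigmoid(w.x + b).
sigmoid = lambda z: 1.0 / (1.0 + np.exp(-z))
w = np.array([1.5, -2.0, 0.5])
b = 0.1

x = np.array([0.2, -0.4, 0.3])        # clean input, classified as y=1
assert sigmoid(w @ x + b) > 0.5

# FGSM: take one epsilon-sized step along the sign of the loss gradient.
# For logistic loss with true label y=1, d(loss)/dx = (p - 1) * w.
eps = 0.5
grad = (sigmoid(w @ x + b) - 1.0) * w
x_adv = x + eps * np.sign(grad)
assert sigmoid(w @ x_adv + b) < 0.5   # the prediction flips
```

Against a linear model the attack reduces to moving against the weight vector; the deep-network case replaces the closed-form gradient with backpropagation.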
Bayesian decision theory provides a unified and intuitively appealing approach to drawing inferences from observations and making rational, informed decisions. Bayesians view statistical inference as a problem in belief dynamics: evidence about a phenomenon is used to revise and update knowledge about it. Bayesian statistics is a scientifically sound approach to integrating informed expert judgment with empirical data, and Bayesian inference cannot be treated entirely independently of the context of the decisions that will be made on its basis. In recent years, Bayesian methods have become increasingly common across disciplines that rely heavily on data. This course introduces students to Bayesian theory and methodology, including modern computational methods for Bayesian inference. Students will learn the commonalities and differences between Bayesian and frequentist approaches to statistical inference, how to approach statistical problems from a Bayesian perspective, and how to combine data with expert judgment in a principled way to arrive at useful and policy-relevant conclusions. They will learn the theory needed to develop a firm understanding of when and how to apply Bayesian and frequentist methods, along with practical procedures for building statistical models of phenomena, drawing inferences, and evaluating the evidence for hypotheses. The course covers the fundamentals of Bayesian inference theory, including representing degrees of belief as probabilities, the likelihood principle, using Bayes' rule to revise beliefs in light of evidence, conjugate prior distributions for common statistical models, Markov chain Monte Carlo methods for approximating posterior distributions, Bayesian hierarchical models, and other key topics. Graphical models are introduced to represent complex probability and decision problems by specifying them as modular components. Assignments use modern computational techniques and focus on applying the methods to real problems.
//seor.vse.gmu.edu/~klaskey/SYST664/SYST664.html
Contents:
Unit 1: A Brief Tour of Bayesian Inference and Decision Theory
Unit 2: Random Variables, Parametric Models, and Inference from Observation
Unit 3: Bayesian Inference with Conjugate Pairs: Single Parameter Models
Unit 4: Introduction to Monte Carlo Approximation
Unit 5: The Normal Model
Unit 6: Gibbs Sampling
Unit 7: Hierarchical Bayesian Models
Unit 8: Bayesian Regression
Unit 9: Conclusion: Multinomial Distribution and Latent Groups
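The conjugate-pair idea from Unit 3 fits in a few lines: for a Bernoulli likelihood, a Beta prior updates in closed form. The prior and the counts below are illustrative:

```python
from math import isclose

# Beta(a, b) is conjugate to the Bernoulli likelihood: after observing
# s successes and f failures, the posterior is Beta(a + s, b + f).
a, b = 1.0, 1.0          # uniform prior, chosen here for illustration
s, f = 7, 3              # illustrative data: 7 successes, 3 failures

a_post, b_post = a + s, b + f
posterior_mean = a_post / (a_post + b_post)
assert isclose(posterior_mean, 8 / 12)   # closed-form update, no sampling
```

When the model admits no conjugate pair, the closed-form update is replaced by the MCMC methods of Units 4 and 6.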
Modern data analysis methods are expected to handle massive amounts of high-dimensional data collected in a wide variety of fields. The high dimensionality of such data poses many challenges, commonly known as the "curse of dimensionality", which make traditional statistical learning approaches impractical or ineffective. To meet these challenges, substantial effort has gone into developing geometric data analysis methods that model and capture the intrinsic geometry of the data rather than modeling their distribution directly. In this course, we explore these methods and provide an analytical study of the models and algorithms they use. We begin with supervised learning, distinguishing classifiers based on geometric principles from posterior- and likelihood-estimation approaches. Next, we consider the unsupervised task of clustering, contrasting density-estimation-based methods with those that rely on metric-space or graph structure. Finally, we turn to the more fundamental task of intrinsic representation learning, with particular attention to dimensionality reduction and manifold learning, e.g., using diffusion maps, tSNE, and PHATE. Time permitting, we will include guest lectures from research areas related to the course and discuss recent developments in graph signal processing and geometric deep learning.
Contents:
Topic 01 - Introduction (incl. curse of dimensionality & overview of data analysis tasks)
Topic 02 - Data Formalism (incl. summary statistics, data types, preprocessing, and simple visualizations)
Topic 03 - Bayesian Classification (incl. decision boundaries, MLE, MAP, Bayes error rate, and Bayesian belief networks)
Topic 04 - Decision Trees (incl. random forests, random projections, and Johnson-Lindenstrauss lemma)
Topic 05 - Principal Component Analysis (incl. preprocessing & dimensionality reduction)
Topic 06 - Support Vector Machines (incl. the "kernel trick" & mercer kernels)
Topic 07 - Multidimensional Scaling (incl. spectral theorem & distance metrics)
Topic 08 - Density-based Clustering (incl. intro. to clustering & cluster eval. with RandIndex)
Topic 09 - Partitional Clustering (incl. lazy learners, kNN, voronoi partitions)
Topic 10 - Hierarchical Clustering (incl. large-scale & graph partitioning)
Topic 11 - Manifold Learning (incl. Isomap & LLE)
Topic 12 - Diffusion Maps
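A minimal sketch of the dimensionality-reduction step from Topic 05: PCA via the eigendecomposition of the covariance matrix, run on synthetic data that is nearly one-dimensional. The data-generating process is made up for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)
# 200 points lying near a 1-D line embedded in 3-D (low intrinsic dimension).
t = rng.normal(size=(200, 1))
X = t @ np.array([[2.0, 1.0, -1.0]]) + 0.05 * rng.normal(size=(200, 3))

# PCA: eigendecompose the covariance of the centered data.
Xc = X - X.mean(axis=0)
cov = Xc.T @ Xc / (len(X) - 1)
vals, vecs = np.linalg.eigh(cov)    # eigenvalues in ascending order
explained = vals[-1] / vals.sum()   # variance captured by the top component
Z = Xc @ vecs[:, -1:]               # 1-D embedding of the 3-D data
```

Because the data are nearly collinear, the top principal component captures almost all of the variance; manifold-learning methods such as Isomap and diffusion maps generalize this to curved low-dimensional structure.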
A Large-Scale Machine Learning tutorial taught by Associate Professor Shan-Hung Wu of National Tsing Hua University, Taiwan, covering an overview of deep learning and learning theory.
This course introduces the concepts and practice of deep learning. It consists of three parts. In the first part, we give a quick introduction to classical machine learning and review some key concepts required to understand deep learning. In the second part, we discuss how deep learning differs from classical machine learning and explain why it is effective on complex problems such as image and natural language processing, introducing various CNN and RNN models along the way. In the third part, we introduce deep reinforcement learning and its applications.
The course also provides programming labs. Throughout the course, Python 3 is the main programming language, and popular machine learning libraries such as Scikit-learn and TensorFlow 2.0 are used and explained in detail.
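As a taste of what the CNN part of the course builds on, here is a minimal numpy sketch of the valid-mode 2-D convolution (strictly, cross-correlation) at the heart of a CNN layer, written in plain numpy rather than TensorFlow so the sketch stays self-contained:

```python
import numpy as np

def conv2d(img, kernel):
    """Valid-mode 2-D cross-correlation, the core op of a CNN layer."""
    kh, kw = kernel.shape
    h = img.shape[0] - kh + 1
    w = img.shape[1] - kw + 1
    out = np.empty((h, w))
    for i in range(h):
        for j in range(w):
            # Dot product of the kernel with one image patch.
            out[i, j] = np.sum(img[i:i + kh, j:j + kw] * kernel)
    return out

img = np.arange(16.0).reshape(4, 4)   # toy 4x4 "image"
edge = np.array([[1.0, -1.0]])        # horizontal difference filter
feat = conv2d(img, edge)              # shape (4, 3)
```

A framework like TensorFlow implements the same operation with strides, padding, and many filters at once, but the sliding dot product is unchanged.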
Contents: