
Here, we explain and illustrate a geometric perspective on causal inference in cohort studies that can help epidemiologists understand the role of standardization in causal inference as well as the distinctions between confounding, effect modification, and noncollapsibility. For simplicity, we focus on a binary exposure X, a binary outcome D, and a binary confounder C that is not causally affected by X. Rothman diagrams plot risk in the unexposed on the x-axis and risk in the exposed on the y-axis. The crude risks define one point in the unit square, and the stratum-specific risks define two other points in the unit square. These three points can be used to identify confounding and effect modification, and we show briefly how these concepts generalize to confounders with more than two levels. We propose a simplified but equivalent definition of collapsibility in terms of standardization, and we show that a measure of association is collapsible if and only if all of its contour lines are straight. We illustrate these ideas using data from a study conducted in Newcastle upon Tyne, United Kingdom, where the causal effect of smoking on 20-year mortality was confounded by age. We conclude that causal inference should be taught using geometry before using regression models.
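
The three points described above can be computed directly from a stratified 2x2 table. The following is a minimal sketch, assuming hypothetical counts (not the Newcastle upon Tyne data) and standardization to the total population; the function names are illustrative only.

```python
# Hypothetical (cases, total) counts by confounder level C and exposure X.
# These numbers are made up for illustration; they are not from the study.
strata = {
    0: {"unexposed": (10, 100), "exposed": (40, 200)},
    1: {"unexposed": (120, 300), "exposed": (50, 100)},
}

def stratum_point(s):
    """Return (risk in unexposed, risk in exposed) for one stratum."""
    (d0, n0), (d1, n1) = s["unexposed"], s["exposed"]
    return d0 / n0, d1 / n1

def crude_point(strata):
    """Pool all strata, ignoring C, and return the crude risks."""
    d0 = sum(s["unexposed"][0] for s in strata.values())
    n0 = sum(s["unexposed"][1] for s in strata.values())
    d1 = sum(s["exposed"][0] for s in strata.values())
    n1 = sum(s["exposed"][1] for s in strata.values())
    return d0 / n0, d1 / n1

def standardized_point(strata):
    """Average the stratum-specific risks using the same weights
    (the distribution of C in the total population) for both exposures."""
    sizes = {c: s["unexposed"][1] + s["exposed"][1] for c, s in strata.items()}
    total = sum(sizes.values())
    x = sum(sizes[c] * stratum_point(s)[0] for c, s in strata.items()) / total
    y = sum(sizes[c] * stratum_point(s)[1] for c, s in strata.items()) / total
    return x, y

crude = crude_point(strata)
standardized = standardized_point(strata)
crude_rd = crude[1] - crude[0]                # crude risk difference
std_rd = standardized[1] - standardized[0]    # standardized risk difference
```

Because the standardized point uses the same weights for both coordinates, it lies on the line segment joining the two stratum-specific points. In this hypothetical table the crude point does not, and the crude risk difference has the opposite sign from the standardized one.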

Related content

Biased · MoDELS · Lecture notes · Reducible · Image classification

December 11, 2023

Classification for everyone : Building geography agnostic models for fairer recognition

Akshat Jindal,Shreya Singh,Soham Gadgil

from arxiv, Stanford CS 231n Course Project

In this paper, we analyze different methods to mitigate the inherent geographical biases present in state-of-the-art image classification models. We first quantify this bias in two datasets, the Dollar Street dataset and ImageNet, using images with location information. We then present different methods that can be employed to reduce this bias. Finally, we analyze the effectiveness of the different techniques in making these models more robust to the geographical locations of the images.

TransAct · ASSETS · Optimizer · Online · Learning

December 8, 2023

Onflow: an online portfolio allocation algorithm

Gabriel Turinici,Pierre Brugiere

We introduce Onflow, a reinforcement learning technique that enables online optimization of portfolio allocation policies based on gradient flows. We devise dynamic allocations of an investment portfolio to maximize its expected log return while taking into account transaction fees. The portfolio allocation is parameterized through a softmax function, and at each time step, the gradient flow method leads to an ordinary differential equation whose solutions correspond to the updated allocations. This algorithm belongs to the large class of stochastic optimization procedures; we measure its efficiency by comparing our results to the theoretical values in a log-normal framework and to standard benchmarks from the 'old NYSE' dataset. For log-normal assets, the strategy learned by Onflow, with transaction costs at zero, mimics Markowitz's optimal portfolio and thus the best possible asset allocation strategy. Numerical experiments on the 'old NYSE' dataset show that Onflow leads to dynamic asset allocation strategies whose performance is: a) comparable to benchmark strategies such as Cover's Universal Portfolio or the "multiplicative updates" approach of Helmbold et al. when transaction costs are zero, and b) better than previous procedures when transaction costs are high. Onflow can even remain efficient in regimes where other dynamic allocation techniques no longer work. Therefore, as far as tested, Onflow appears to be a promising dynamic portfolio management strategy based on observed prices only and without any assumption on the distributions of the underlying assets' returns. In particular, it could avoid model risk when building a trading strategy.
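
To make the softmax parameterization concrete, here is a minimal sketch of a softmax-parameterized allocation with one explicit gradient-ascent step on the one-period log return. This is an assumption-laden illustration, not the Onflow algorithm itself: it replaces the gradient-flow ODE with a single discrete step, ignores transaction fees, and the names `softmax`, `log_return_gradient`, `update`, and `step` are hypothetical.

```python
import math

def softmax(theta):
    """Map unconstrained parameters to portfolio weights that sum to 1."""
    m = max(theta)  # subtract the max for numerical stability
    exps = [math.exp(t - m) for t in theta]
    z = sum(exps)
    return [e / z for e in exps]

def log_return_gradient(theta, price_relatives):
    """Gradient of log(sum_i w_i * r_i) with respect to theta,
    where w = softmax(theta) and r_i is the price relative of asset i."""
    w = softmax(theta)
    growth = sum(wi * ri for wi, ri in zip(w, price_relatives))
    return [wi * (ri / growth - 1.0) for wi, ri in zip(w, price_relatives)]

def update(theta, price_relatives, step=0.1):
    """One discrete gradient-ascent step (no transaction fees modeled)."""
    g = log_return_gradient(theta, price_relatives)
    return [t + step * gi for t, gi in zip(theta, g)]

# Hypothetical two-asset example: asset 0 gained 10%, asset 1 lost 10%.
theta0 = [0.0, 0.0]
theta1 = update(theta0, [1.1, 0.9])
weights1 = softmax(theta1)
```

After the step, the weight on the asset that performed better increases, and the softmax keeps the weights positive and summing to one without any explicit projection.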

Markov · Divergence · Markov chain · Information divergence · INFORMS

December 8, 2023

Information divergences of Markov chains and their applications

Youjia Wang,Michael C. H. Choi

from arxiv, 36 pages

In this paper, we first introduce and define several new information divergences in the space of transition matrices of finite Markov chains, which measure the discrepancy between two Markov chains. These divergences offer natural generalizations of classical information-theoretic divergences, such as the $f$-divergences and the Rényi divergence between probability measures, to the context of finite Markov chains. We begin by detailing and deriving fundamental properties of these divergences and notably give Markov chain versions of Pinsker's inequality and Chernoff information. We then utilize these notions in a few applications. First, we investigate the binary hypothesis testing problem for Markov chains, where the newly defined Rényi divergence between Markov chains and its geometric interpretation play an important role in the analysis. Second, we propose and analyze information-theoretic (Cesàro) mixing times and ergodicity coefficients, along with spectral bounds on these notions in the reversible setting. Examples of the random walk on the hypercube, as well as the connections between the critical height of the low-temperature Metropolis-Hastings chain and these proposed ergodicity coefficients, are highlighted.
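
As a rough illustration of what a divergence between two transition matrices can look like, the sketch below computes a Kullback-Leibler divergence in which the row-wise KL terms are weighted by a reference distribution. This is one common convention and an assumption here; the paper's exact definitions (choice of reference distribution, argument order) may differ. The function names are hypothetical.

```python
import math

def stationary(P, iters=1000):
    """Approximate the stationary distribution of an ergodic chain
    by repeatedly applying pi <- pi P (power iteration)."""
    n = len(P)
    pi = [1.0 / n] * n
    for _ in range(iters):
        pi = [sum(pi[x] * P[x][y] for x in range(n)) for y in range(n)]
    return pi

def kl_between_chains(P, Q, pi):
    """sum_x pi(x) * sum_y P(x,y) * log(P(x,y) / Q(x,y)).
    Returns infinity if P puts mass where Q does not."""
    total = 0.0
    for x, weight in enumerate(pi):
        for y, p in enumerate(P[x]):
            if p > 0.0:
                if Q[x][y] == 0.0:
                    return math.inf
                total += weight * p * math.log(p / Q[x][y])
    return total

# Hypothetical two-state chains.
P = [[0.9, 0.1], [0.5, 0.5]]
Q = [[0.8, 0.2], [0.4, 0.6]]
pi = stationary(P)
divergence = kl_between_chains(P, Q, pi)
```

The quantity is zero when the two matrices coincide and strictly positive in this example, mirroring the behavior of the classical KL divergence between probability measures.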

RNN · Recurrent neural network · Networking · Transfer learning · MoDELS

December 7, 2023

Recurrent neural networks and transfer learning for elasto-plasticity in woven composites

Ehsan Ghane,Martin Fagerström,Mohsen Mirkhalaf

from arxiv, 25 pages and 13 EPS images; includes links to supporting materials

As a surrogate for computationally intensive meso-scale simulation of woven composites, this article presents Recurrent Neural Network (RNN) models. Transfer learning is used to address the initialization challenges and sparse-data issues inherent in cyclic shear strain loads. A mean-field model generates a comprehensive data set representing elasto-plastic behavior. In simulations, arbitrary six-dimensional strain histories are used to predict stresses, with random-walk loading as the source task and cyclic loading conditions as the target task. Incorporating sub-scale properties enhances RNN versatility. To achieve accurate predictions, a grid search is used to tune the network architecture and hyper-parameter configurations. The results of this study demonstrate that transfer learning can effectively adapt the RNN to varying strain conditions, which establishes its potential as a useful tool for modeling path-dependent responses in woven composites.

Biased · Facebook AI Research · Identifiable · Estimation/Estimator · Round

December 6, 2023

Detecting algorithmic bias in medical AI-models

Jeffrey Smith,Andre Holder,Rishikesan Kamaleswaran,Yao Xie

from arxiv, 26 pages, 9 figures

With the growing prevalence of machine learning and artificial intelligence-based medical decision support systems, it is important to ensure that these systems provide patient outcomes in a fair and equitable fashion. This paper presents an innovative framework for detecting areas of algorithmic bias in medical-AI decision support systems. Our approach efficiently identifies potential biases in medical-AI models, specifically in the context of sepsis prediction, by employing the Classification and Regression Trees (CART) algorithm. We verify our methodology by conducting a series of synthetic data experiments, showcasing its ability to precisely estimate areas of bias in controlled settings. The effectiveness of the concept is further validated by experiments using electronic medical records from Grady Memorial Hospital in Atlanta, Georgia. These tests demonstrate the practical implementation of our strategy in a clinical environment, where it can function as a vital instrument for guaranteeing fairness and equity in AI-based medical decisions.

Model evaluation · Principle · Precision/Accuracy · Robustness · Smoothing

December 6, 2023

Incorporating the algorithm for the boundary condition from FVM into the framework of Eulerian SPH

Zhentong Wang,Oskar J. Haidn,Xiangyu Hu

from arxiv, 36 pages, 12 figures and 3 tables

The finite volume method (FVM) is a widely used mesh-based technique renowned for its computational efficiency and accuracy, but it has significant drawbacks, particularly in mesh generation and in handling complex boundary interfaces or conditions. By contrast, the smoothed particle hydrodynamics (SPH) method, a popular meshless alternative, inherently circumvents mesh generation and yields smoother numerical outcomes, but at the expense of computational efficiency. Numerous researchers have therefore combined the strengths of both methods to investigate complex flow phenomena, and this synergy has yielded precise and computationally efficient outcomes. However, algorithms involving the weak coupling of these two methods tend to be intricate, which raises issues of versatility, implementation, and mutual adaptation to hardware and coding structures. Thus, achieving a robust and strong coupling of FVM and SPH in a unified framework is imperative. Because the boundary algorithms of the two methods differ in Wang's work, the crucial step in establishing a strong coupling of both methods within a unified SPH framework lies in incorporating the FVM boundary algorithm into the Eulerian SPH method. In this paper, we propose a straightforward algorithm in the Eulerian SPH method, algorithmically equivalent to that in FVM, grounded in the principle of zero-order consistency. Moreover, several numerical examples, including fully and weakly compressible flows with various boundary conditions in the Eulerian SPH method, validate the stability and accuracy of the proposed algorithm.

Greedy · Modality · MoDELS · Learned · Generalization theory

February 10, 2022

Characterizing and overcoming the greedy nature of learning in multi-modal deep neural networks

Nan Wu,Stanisław Jastrzębski,Kyunghyun Cho,Krzysztof J. Geras

We hypothesize that, due to the greedy nature of learning in multi-modal deep neural networks, these models tend to rely on just one modality while under-fitting the other modalities. Such behavior is counter-intuitive and hurts the models' generalization, as we observe empirically. To estimate the model's dependence on each modality, we compute the gain in accuracy when the model has access to it in addition to another modality. We refer to this gain as the conditional utilization rate. In the experiments, we consistently observe an imbalance in conditional utilization rates between modalities, across multiple tasks and architectures. Since the conditional utilization rate cannot be computed efficiently during training, we introduce a proxy for it based on the pace at which the model learns from each modality, which we refer to as the conditional learning speed. We propose an algorithm to balance the conditional learning speeds between modalities during training and demonstrate that it indeed addresses the issue of greedy learning. The proposed algorithm improves the model's generalization on three datasets: Colored MNIST, Princeton ModelNet40, and NVIDIA Dynamic Hand Gesture.
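
The accuracy gain described above can be illustrated with a few lines of code. This is a minimal sketch under stated assumptions: it uses the plain difference in accuracy (the paper may normalize the gain differently), and the labels and predictions are hypothetical toy values, not results from the paper.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

def conditional_gain(y_true, pred_both, pred_other_only):
    """Accuracy gained by adding a modality on top of the other one
    (unnormalized difference; an assumption of this sketch)."""
    return accuracy(y_true, pred_both) - accuracy(y_true, pred_other_only)

# Hypothetical labels and predictions for a two-modality model.
y_true = [0, 1, 1, 0]
pred_both = [0, 1, 1, 0]     # model sees modalities A and B
pred_a_only = [0, 1, 0, 1]   # model sees only modality A
pred_b_only = [0, 1, 1, 1]   # model sees only modality B

gain_b_given_a = conditional_gain(y_true, pred_both, pred_a_only)
gain_a_given_b = conditional_gain(y_true, pred_both, pred_b_only)
```

In this toy example, adding modality B to A helps more than adding A to B, which is the kind of imbalance between modalities the abstract refers to.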

Generalization theory · Black box · Learned · INFORMS · Supervised learning algorithm

October 4, 2021

Information-theoretic generalization bounds for black-box learning algorithms

Hrayr Harutyunyan,Maxim Raginsky,Greg Ver Steeg,Aram Galstyan

from arxiv, NeurIPS 2021

We derive information-theoretic generalization bounds for supervised learning algorithms based on the information contained in predictions rather than in the output of the training algorithm. These bounds improve over the existing information-theoretic bounds, are applicable to a wider range of algorithms, and solve two key challenges: (a) they give meaningful results for deterministic algorithms and (b) they are significantly easier to estimate. We show experimentally that the proposed bounds closely follow the generalization gap in practical scenarios for deep learning.

Neural Networks · Optimizer · Networks · Local minimum · Networking

December 19, 2019

Optimization for deep learning: theory and algorithms

Ruoyu Sun

from arxiv, 38 pages of main body; 5 pages of appendix; 12 pages of references

When and why can a neural network be successfully trained? This article provides an overview of optimization algorithms and theory for training neural networks. First, we discuss the issue of gradient explosion/vanishing and the more general issue of undesirable spectrum, and then discuss practical solutions including careful initialization and normalization methods. Second, we review generic optimization methods used in training neural networks, such as SGD, adaptive gradient methods and distributed methods, and theoretical results for these algorithms. Third, we review existing research on the global issues of neural network training, including results on bad local minima, mode connectivity, lottery ticket hypothesis and infinite-width analysis.

Graphics processing unit · Graph · INTERACT · Performer · Neural Networks

November 6, 2019

Hyper-SAGNN: a self-attention based graph neural network for hypergraphs

Ruochi Zhang,Yuesong Zou,Jian Ma

Graph representation learning for hypergraphs can be used to extract patterns among higher-order interactions that are critically important in many real world problems. Current approaches designed for hypergraphs, however, are unable to handle different types of hypergraphs and are typically not generic for various learning tasks. Indeed, models that can predict variable-sized heterogeneous hyperedges have not been available. Here we develop a new self-attention based graph neural network called Hyper-SAGNN applicable to homogeneous and heterogeneous hypergraphs with variable hyperedge sizes. We perform extensive evaluations on multiple datasets, including four benchmark network datasets and two single-cell Hi-C datasets in genomics. We demonstrate that Hyper-SAGNN significantly outperforms the state-of-the-art methods on traditional tasks while also achieving great performance on a new task called outsider identification. Hyper-SAGNN will be useful for graph representation learning to uncover complex higher-order interactions in different applications.