
In this paper we address the challenges posed by non-proportional hazards and informative censoring, offering a path toward more meaningful causal inference conclusions. We start from the marginal structural Cox model, which has been widely used for analyzing observational studies with survival outcomes and typically relies on the inverse probability weighting method. The latter hinges on a propensity score model for the treatment assignment and a censoring model that incorporates both the treatment and the covariates. In such settings, model misspecification can easily occur, and the non-collapsibility of the Cox regression model has historically made it difficult to guard against misspecification through augmentation. We introduce an augmented inverse probability weighted estimator with doubly robust properties, which paves the way for integrating machine learning and other nonparametric methods and overcomes the challenges of non-collapsibility. The estimator extends naturally to estimating a time-average treatment effect when the proportional hazards assumption fails. We closely examine its theoretical and practical performance, showing that it satisfies both the assumption-lean and the well-specification criteria discussed in the recent literature. Finally, its application to a dataset reveals insights into the impact of mid-life alcohol consumption on mortality in later life.
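
As a rough illustration of the inverse probability weighting step mentioned above, the following minimal sketch computes unstabilized treatment weights from propensity scores. It is not the paper's estimator: it omits the censoring weights and the augmentation term entirely, and the function name `ipw_weights` and the toy data are assumptions made for this example.

```python
from typing import List, Sequence


def ipw_weights(treatment: Sequence[int], propensity: Sequence[float]) -> List[float]:
    """Unstabilized inverse probability of treatment weights.

    treatment[i] is 1 (treated) or 0 (untreated); propensity[i] is the
    estimated probability of treatment given covariates, assumed to lie
    strictly between 0 and 1.
    """
    weights = []
    for a, p in zip(treatment, propensity):
        if not 0.0 < p < 1.0:
            raise ValueError("propensity scores must lie strictly in (0, 1)")
        # Treated subjects get 1/p, untreated subjects get 1/(1 - p).
        weights.append(1.0 / p if a == 1 else 1.0 / (1.0 - p))
    return weights


# Hypothetical toy data: two treated and two untreated subjects.
treatment = [1, 1, 0, 0]
propensity = [0.8, 0.5, 0.5, 0.2]
weights = ipw_weights(treatment, propensity)
```

Subjects whose observed treatment was unlikely given their covariates receive larger weights, which is also why a misspecified propensity model can distort the weighted analysis.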

Related content

Estimation / Estimator

Discretization · Manifold · Networking · Neural Networks · Approximation

January 4, 2024

Learning Discretized Neural Networks under Ricci Flow

Jun Chen,Hanwen Chen,Mengmeng Wang,Guang Dai,Ivor W. Tsang,Yong Liu

In this paper, we study Discretized Neural Networks (DNNs) composed of low-precision weights and activations, which suffer from either infinite or zero gradients due to the non-differentiable discrete function during training. Most training-based DNNs in such scenarios employ the standard Straight-Through Estimator (STE) to approximate the gradient w.r.t. discrete values. However, the use of STE introduces the problem of gradient mismatch, arising from perturbations in the approximated gradient. To address this problem, this paper reveals that this mismatch can be interpreted as a metric perturbation in a Riemannian manifold, viewed through the lens of duality theory. Building on information geometry, we construct the Linearly Nearly Euclidean (LNE) manifold for DNNs, providing a background for addressing perturbations. By introducing a partial differential equation on metrics, i.e., the Ricci flow, we establish the dynamical stability and convergence of the LNE metric with the $L^2$-norm perturbation. In contrast to previous perturbation theories with convergence rates in fractional powers, the metric perturbation under the Ricci flow exhibits exponential decay in the LNE manifold. Experimental results across various datasets demonstrate that our method achieves superior and more stable performance for DNNs compared to other representative training-based methods.
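
To make the Straight-Through Estimator mentioned above concrete, here is a minimal sketch of one common variant (a sign quantizer with a clipped pass-through gradient). It only illustrates the baseline STE idea, not the Ricci-flow method of the paper; the function names and toy values are assumptions made for this example.

```python
from typing import List


def sign_forward(weights: List[float]) -> List[float]:
    """Forward pass: binarize each latent weight to -1.0 or +1.0."""
    return [1.0 if w >= 0.0 else -1.0 for w in weights]


def ste_backward(weights: List[float], grad_output: List[float]) -> List[float]:
    """Backward pass under a clipped straight-through estimator.

    The true derivative of sign() is zero almost everywhere, so the upstream
    gradient is copied through unchanged where |w| <= 1 and zeroed elsewhere.
    """
    return [g if abs(w) <= 1.0 else 0.0 for w, g in zip(weights, grad_output)]


# Hypothetical latent weights and upstream gradients.
latent = [0.3, -0.7, 1.5]
quantized = sign_forward(latent)
upstream = [0.1, -0.2, 0.4]
grads = ste_backward(latent, upstream)
```

The gradient returned by `ste_backward` is not the gradient of the function actually evaluated in the forward pass, which is the gradient mismatch the abstract refers to.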

INFORMS · Maximum a posteriori · Performer · Sample · Performance

January 4, 2024

The Theoretical Limit of Radar Target Detection

Dazhuan Xu,Nan Wang,Han Zhang,Xiaolong Kong

from arxiv, 19 pages, 9 figures

In this paper, we solve the optimal target detection problem using the ideas and methods of Shannon's information theory. Introducing a target state variable into a general radar system model, an equivalent detection channel is derived, and the a posteriori probability distribution is given accordingly. Detection information (DI) is proposed for measuring system performance, which holds for any specific detection method. Moreover, we provide an analytic expression for the false alarm probability in terms of the a priori probability. In particular, for a sufficiently large observation interval, the false alarm probability equals the a priori probability of the existing state. A stochastic detection method, the sampling a posteriori probability, is also proposed. The target detection theorem is proved mathematically, which indicates that DI is an achievable theoretical limit of target detection. Specifically, when the empirical DI obtained from the sampling a posteriori detection method approaches the DI, the probability of failed decisions tends to zero. Conversely, there is no detector whose empirical DI exceeds the DI. Numerical simulations are performed to verify the correctness of the theorems. The results demonstrate that the maximum a posteriori and the Neyman-Pearson detection methods are upper bounded by the theoretical limit.

Learning · Comprehensibility · Rust · Processing (programming language) · INTERACT

January 2, 2024

Profiling Programming Language Learning

Will Crichton,Shriram Krishnamurthi

from arxiv, To appear at OOPSLA'24

This paper documents a year-long experiment to "profile" the process of learning a programming language: gathering data to understand what makes a language hard to learn, and using that data to improve the learning process. We added interactive quizzes to The Rust Programming Language, the official textbook for learning Rust. Over 13 months, 62,526 readers answered questions 1,140,202 times. First, we analyze the trajectories of readers. We find that many readers drop out of the book early when faced with difficult language concepts like Rust's ownership types. Second, we use classical test theory and item response theory to analyze the characteristics of quiz questions. We find that better questions are more conceptual in nature, such as asking why a program does not compile vs. whether a program compiles. Third, we performed 12 interventions into the book to help readers with difficult questions. We find that on average, interventions improved quiz scores on the targeted questions by +20%. Fourth, we show that our technique can likely generalize to languages with smaller user bases by simulating our statistical inferences on small N. These results demonstrate that quizzes are a simple and useful technique for understanding language learning at all scales.
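
For readers unfamiliar with classical test theory, the sketch below computes the two standard per-question statistics: difficulty (fraction of correct answers) and discrimination (correlation between the question score and the total score). It is a generic textbook illustration, not the paper's analysis pipeline; the function names and the four-respondent data are assumptions made for this example.

```python
from math import sqrt
from typing import List


def item_difficulty(item_scores: List[int]) -> float:
    """Fraction of respondents who answered the item correctly (1 = correct)."""
    return sum(item_scores) / len(item_scores)


def item_discrimination(item_scores: List[int], total_scores: List[float]) -> float:
    """Pearson correlation between the item score and the total quiz score."""
    n = len(item_scores)
    mean_i = sum(item_scores) / n
    mean_t = sum(total_scores) / n
    cov = sum((i - mean_i) * (t - mean_t) for i, t in zip(item_scores, total_scores))
    var_i = sum((i - mean_i) ** 2 for i in item_scores)
    var_t = sum((t - mean_t) ** 2 for t in total_scores)
    if var_i == 0 or var_t == 0:
        raise ValueError("correlation is undefined when a score has zero variance")
    return cov / sqrt(var_i * var_t)


# Hypothetical data: four respondents, one question, and their total scores.
item = [1, 1, 0, 0]
totals = [9.0, 8.0, 4.0, 3.0]
difficulty = item_difficulty(item)
discrimination = item_discrimination(item, totals)
```

A question with high discrimination is answered correctly mostly by respondents who also score well overall, which is one way to quantify what makes a question "better".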

Reducible · Exact · Optimizer · Automator · Design

January 2, 2024

Quantum State Preparation Using an Exact CNOT Synthesis Formulation

Hanyu Wang,Bochen Tan,Jason Cong,Giovanni De Micheli

from arxiv, 6 pages, 7 figures

Minimizing the use of CNOT gates in quantum state preparation is a crucial step in quantum compilation, as they introduce coupling constraints and more noise than single-qubit gates. Reducing the number of CNOT gates can lead to more efficient and accurate quantum computations. However, the lack of compatibility to model superposition and entanglement challenges the scalability and optimality of CNOT optimization algorithms on classical computers. In this paper, we propose an effective state preparation algorithm using an exact CNOT synthesis formulation. Our method represents a milestone as the first design automation algorithm to surpass manual design, reducing the best CNOT numbers to prepare a Dicke state by 2x. For general states with up to 20 qubits, our method reduces the CNOT number by 9% and 32% for dense and sparse states, on average, compared to the latest algorithms.

Estimation/Estimator · contrastive · INFORMS · Mutual information · Representation learning

June 25, 2021

Decomposed Mutual Information Estimation for Contrastive Representation Learning

Alessandro Sordoni,Nouha Dziri,Hannes Schulz,Geoff Gordon,Phil Bachman,Remi Tachet

from arxiv, ICML 2021

Recent contrastive representation learning methods rely on estimating mutual information (MI) between multiple views of an underlying context. E.g., we can derive multiple views of a given image by applying data augmentation, or we can split a sequence into views comprising the past and future of some step in the sequence. Contrastive lower bounds on MI are easy to optimize, but have a strong underestimation bias when estimating large amounts of MI. We propose decomposing the full MI estimation problem into a sum of smaller estimation problems by splitting one of the views into progressively more informed subviews and by applying the chain rule on MI between the decomposed views. This expression contains a sum of unconditional and conditional MI terms, each measuring modest chunks of the total MI, which facilitates approximation via contrastive bounds. To maximize the sum, we formulate a contrastive lower bound on the conditional MI which can be approximated efficiently. We refer to our general approach as Decomposed Estimation of Mutual Information (DEMI). We show that DEMI can capture a larger amount of MI than standard non-decomposed contrastive bounds in a synthetic setting, and learns better representations in a vision domain and for dialogue generation.
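
The decomposition described above rests on the chain rule for mutual information. As a sketch in assumed notation (the symbols $x$, $y'$, $y''$, and $K$ are chosen for this note and are not taken from the paper), splitting one view as $y = (y', y'')$ gives:

```latex
I(x; y) = I(x; y', y'') = I(x; y') + I(x; y'' \mid y')
```

Each term on the right is no larger than the total. This matters because a standard contrastive (InfoNCE-style) bound computed from $K$ samples cannot exceed $\log K$, so estimating several smaller terms separately is less affected by that ceiling than estimating the full $I(x; y)$ at once.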

Marginalization · Learned · Networking · Performer · Better

March 25, 2021

Recent Advances in Large Margin Learning

Yiwen Guo,Changshui Zhang

from arxiv, 8 pages, 3 figures

This paper serves as a survey of recent advances in large margin training and its theoretical foundations, mostly for (nonlinear) deep neural networks (DNNs), which are probably the most prominent machine learning models for large-scale data in the community over the past decade. We generalize the formulation of classification margins from classical research to the latest DNNs, summarize theoretical connections between the margin, network generalization, and robustness, and comprehensively introduce recent efforts in enlarging the margins for DNNs. Since the viewpoints of different methods differ, we categorize them into groups for ease of comparison and discussion in the paper. Hopefully, our discussions and overview inspire new research work in the community that aims to improve the performance of DNNs, and we also point to directions where the large margin principle can be verified to provide theoretical evidence for why certain regularizations for DNNs function well in practice. We managed to shorten the paper such that the crucial spirit of large margin learning and related methods is better emphasized.

Directed · Extensibility · Processing (programming language) · MoDELS · Compiler

December 16, 2020

Communicative Message Passing for Inductive Relation Reasoning

Sijie Mai,Shuangjia Zheng,Yuedong Yang,Haifeng Hu

from arxiv, Accepted by AAAI-2021

Relation prediction for knowledge graphs aims at predicting missing relationships between entities. Despite the importance of inductive relation prediction, most previous works are limited to a transductive setting and cannot process previously unseen entities. The recently proposed subgraph-based relation reasoning models provide alternatives that predict links inductively from the subgraph structure surrounding a candidate triplet. However, we observe that these methods often neglect the directed nature of the extracted subgraph and weaken the role of relation information in the subgraph modeling. As a result, they fail to effectively handle asymmetric/anti-symmetric triplets and produce insufficient embeddings for the target triplets. To this end, we introduce a Communicative Message Passing neural network for Inductive reLation rEasoning, CoMPILE, that reasons over local directed subgraph structures and has a vigorous inductive bias to process entity-independent semantic relations. In contrast to existing models, CoMPILE strengthens the message interactions between edges and entities through a communicative kernel and enables a sufficient flow of relation information. Moreover, we demonstrate that, by extracting the directed enclosing subgraphs, CoMPILE can naturally handle asymmetric/anti-symmetric relations without explosively increasing the number of model parameters. Extensive experiments show substantial performance gains in comparison to state-of-the-art methods on commonly used benchmark datasets with various inductive settings.

Matrix calculus · Comprehensibility · Learned · Neural Networks · Networks

July 2, 2018

The Matrix Calculus You Need For Deep Learning

Terence Parr,Jeremy Howard

from arxiv, PDF version of mobile/web friendly version //explained.ai/matrix-calculus/index.html

This paper is an attempt to explain all the matrix calculus you need in order to understand the training of deep neural networks. We assume no math knowledge beyond what you learned in calculus 1, and provide links to help you refresh the necessary math where needed. Note that you do not need to understand this material before you start learning to train and use deep learning in practice; rather, this material is for those who are already familiar with the basics of neural networks, and wish to deepen their understanding of the underlying math. Don't worry if you get stuck at some point along the way---just go back and reread the previous section, and try writing down and working through some examples. And if you're still stuck, we're happy to answer your questions in the Theory category at forums.fast.ai. Note: There is a reference section at the end of the paper summarizing all the key matrix calculus rules and terminology discussed here. See related articles at //explained.ai

Attention mechanism · Machine reading comprehension · Extensibility · state-of-the-art · MoDELS

April 25, 2018

Reinforced Mnemonic Reader for Machine Reading Comprehension

Minghao Hu,Yuxing Peng,Zhen Huang,Xipeng Qiu,Furu Wei,Ming Zhou

from arxiv, Published in 26th International Joint Conference on Artificial Intelligence (IJCAI), 2018

In this paper, we introduce the Reinforced Mnemonic Reader for machine reading comprehension tasks, which enhances previous attentive readers in two aspects. First, a reattention mechanism is proposed to refine current attentions by directly accessing past attentions that are temporally memorized in a multi-round alignment architecture, so as to avoid the problems of attention redundancy and attention deficiency. Second, a new optimization approach, called dynamic-critical reinforcement learning, is introduced to extend the standard supervised method. It always encourages the model to predict a more acceptable answer, so as to address the convergence suppression problem that occurs in traditional reinforcement learning algorithms. Extensive experiments on the Stanford Question Answering Dataset (SQuAD) show that our model achieves state-of-the-art results. Meanwhile, our model outperforms previous systems by over 6% in terms of both Exact Match and F1 metrics on two adversarial SQuAD datasets.

Convolutional neural network · Neural Networks · Knowledge representation · Networking · Convolution

February 14, 2018

Interpretable Convolutional Neural Networks

Quanshi Zhang,Ying Nian Wu,Song-Chun Zhu

from arxiv, In this version, we release the website of the code. Compared to the previous version, we have corrected all values of location instability in Table 3--6 by dividing the values by sqrt(2), i.e., a=a/sqrt(2). Such revisions do NOT decrease the significance of the superior performance of our method, because we make the same correction to location-instability values of all baselines

This paper proposes a method to modify traditional convolutional neural networks (CNNs) into interpretable CNNs, in order to clarify knowledge representations in high conv-layers of CNNs. In an interpretable CNN, each filter in a high conv-layer represents a certain object part. We do not need any annotations of object parts or textures to supervise the learning process. Instead, the interpretable CNN automatically assigns each filter in a high conv-layer with an object part during the learning process. Our method can be applied to different types of CNNs with different structures. The clear knowledge representation in an interpretable CNN can help people understand the logics inside a CNN, i.e., based on which patterns the CNN makes the decision. Experiments showed that filters in an interpretable CNN were more semantically meaningful than those in traditional CNNs.