
Experimental and observational studies often lack validity due to untestable assumptions. We propose a double machine learning approach to combine experimental and observational studies, allowing practitioners to test for assumption violations and estimate treatment effects consistently. Our framework tests for violations of external validity and ignorability under milder assumptions. When only one of these assumptions is violated, we provide semiparametrically efficient treatment effect estimators. However, our no-free-lunch theorem highlights the necessity of accurately identifying the violated assumption for consistent treatment effect estimation. Through comparative analyses, we show our framework's superiority over existing data fusion methods. The practical utility of our approach is further exemplified by three real-world case studies, underscoring its potential for widespread application in empirical research.

Related content

Machine Learning


Machine Learning is an international forum for research on computational approaches to learning. The journal publishes articles reporting substantive results on a wide range of learning methods applied to a variety of learning problems. It features papers that describe research on problems and methods, applications research, and issues of research methodology. Papers making claims about learning problems or methods provide solid support via empirical studies, theoretical analysis, or comparison to psychological phenomena. Applications papers show how to apply learning methods to solve important application problems. Research methodology papers improve how machine learning research is conducted. All papers describe the supporting evidence in ways that can be verified or replicated by other researchers. The papers also detail the learning component clearly and discuss assumptions regarding knowledge representation and the performance task. Official website:

Estimation/Estimator · Statistic · Sample · Error rate · Example ·

May 20, 2024

On Efficient and Statistical Quality Estimation for Data Annotation

Jan-Christoph Klie,Rahul Nair,Juan Haladjian,Marc Kirchner

from arxiv, Accepted to ACL 2024

Annotated datasets are an essential ingredient to train, evaluate, compare and productionalize supervised machine learning models. It is therefore imperative that annotations are of high quality. For their creation, good quality management and thereby reliable quality estimates are needed. Then, if quality is insufficient during the annotation process, rectifying measures can be taken to improve it. Quality estimation is often performed by having experts manually label instances as correct or incorrect. But checking all annotated instances tends to be expensive. Therefore, in practice, usually only subsets are inspected; sizes are chosen mostly without justification or regard to statistical power and, more often than not, are relatively small. Basing estimates on small sample sizes, however, can lead to imprecise values for the error rate. Using unnecessarily large sample sizes costs money that could be better spent, for instance on more annotations. Therefore, we first describe in detail how to use confidence intervals for finding the minimal sample size needed to estimate the annotation error rate. Then, we propose applying acceptance sampling as an alternative to error rate estimation. We show that acceptance sampling can reduce the required sample sizes by up to 50% while providing the same statistical guarantees.
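As a rough illustration of the confidence-interval idea in the abstract above, the sketch below computes the smallest sample size for which a normal-approximation (Wald) interval around the estimated error rate has a given half-width. The function name `min_sample_size`, the default worst-case guess `p_guess = 0.5`, and the choice of the Wald interval are assumptions made here for illustration; the paper itself may rely on a different interval construction.

```python
import math
from statistics import NormalDist


def min_sample_size(margin: float, confidence: float = 0.95, p_guess: float = 0.5) -> int:
    """Smallest n for which the Wald interval half-width is at most `margin`.

    Hypothetical helper: `p_guess` is a prior guess of the error rate, and
    0.5 is the worst case (it maximizes the variance p * (1 - p)).
    """
    if not 0.0 < margin < 1.0:
        raise ValueError("margin must lie strictly between 0 and 1")
    if not 0.0 < confidence < 1.0:
        raise ValueError("confidence must lie strictly between 0 and 1")
    # Two-sided critical value of the standard normal distribution.
    z = NormalDist().inv_cdf(1.0 - (1.0 - confidence) / 2.0)
    # Solve z * sqrt(p * (1 - p) / n) <= margin for n.
    n = (z ** 2) * p_guess * (1.0 - p_guess) / margin ** 2
    return math.ceil(n)
```

For example, `min_sample_size(0.05)` asks how many annotated instances must be inspected to estimate the error rate to within five percentage points at 95% confidence when nothing is known about the rate in advance. A smaller `p_guess` (for instance from a pilot sample) reduces the required size.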

Statistic · Networking · Approximation · Partition function · Model evaluation ·

May 20, 2024

Statistical Mechanics Calculations Using Variational Autoregressive Networks and Quantum Annealing

Yuta Tamura,Masayuki Ohzeki

from arxiv, 5 pages

In statistical mechanics, computing the partition function is generally difficult. An approximation method using a variational autoregressive network (VAN) has been proposed recently. This approach offers the advantage of directly calculating the generation probabilities while obtaining a significantly large number of samples. The present study introduces a novel approximation method that employs samples derived from quantum annealing machines in conjunction with VAN, which are empirically assumed to adhere to the Gibbs-Boltzmann distribution. When applied to the finite-size Sherrington-Kirkpatrick model, the proposed method demonstrates enhanced accuracy compared to the traditional VAN approach and other approximate methods, such as the widely utilized naive mean field.

Line search · Optimizer · Global minimum · Performer · Unconstrained optimization ·

May 17, 2024

Efficient Line Search Method Based on Regression and Uncertainty Quantification

Sören Laue,Tomislav Prusina

from arxiv, To be featured in LION18 2024

Unconstrained optimization problems are typically solved using iterative methods, which often depend on line search techniques to determine optimal step lengths in each iteration. This paper introduces a novel line search approach. Traditional line search methods, aimed at determining optimal step lengths, often discard valuable data from the search process and focus on refining step length intervals. This paper proposes a more efficient method using Bayesian optimization, which utilizes all available data points, i.e., function values and gradients, to guide the search towards a potential global minimum. This new approach more effectively explores the search space, leading to better solution quality. It is also easy to implement and integrate into existing frameworks. Tested on the challenging CUTEst test set, it demonstrates superior performance compared to existing state-of-the-art methods, solving more problems to optimality with equivalent resource usage.
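For context, the kind of traditional line search that the abstract contrasts itself with can be sketched as a backtracking search with the Armijo sufficient-decrease condition. This is explicitly not the Bayesian-optimization method proposed in the paper; the function name `backtracking_line_search` and its default parameters are assumptions chosen for illustration. Note how each rejected trial point is simply discarded, which is the inefficiency the abstract refers to.

```python
from typing import Callable, Sequence


def backtracking_line_search(
    f: Callable[[Sequence[float]], float],
    x: Sequence[float],
    grad: Sequence[float],
    direction: Sequence[float],
    alpha0: float = 1.0,
    shrink: float = 0.5,
    c: float = 1e-4,
    max_iter: int = 50,
) -> float:
    """Return a step length that satisfies the Armijo condition."""
    # Directional derivative of f at x along `direction`.
    slope = sum(g * d for g, d in zip(grad, direction))
    if slope >= 0:
        raise ValueError("direction is not a descent direction")
    fx = f(x)
    alpha = alpha0
    for _ in range(max_iter):
        trial = [xi + alpha * di for xi, di in zip(x, direction)]
        # Accept the step once it yields a sufficient decrease.
        if f(trial) <= fx + c * alpha * slope:
            return alpha
        # Otherwise shrink the step; the rejected evaluation is thrown away.
        alpha *= shrink
    return alpha  # best effort after max_iter shrinkages
```

A typical call passes the negative gradient as `direction`, for instance `backtracking_line_search(f, x, g, [-gi for gi in g])` inside a gradient-descent loop.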

Active learning · Networking · Neural Networks · Learning · Functional ·

May 17, 2024

Active Learning with Fully Bayesian Neural Networks for Discontinuous and Nonstationary Data

Maxim Ziatdinov

from arxiv, Fixed PGM in Figure 2 and update caption

Active learning optimizes the exploration of large parameter spaces by strategically selecting which experiments or simulations to conduct, thus reducing resource consumption and potentially accelerating scientific discovery. A key component of this approach is a probabilistic surrogate model, typically a Gaussian Process (GP), which approximates an unknown functional relationship between control parameters and a target property. However, conventional GPs often struggle when applied to systems with discontinuities and non-stationarities, prompting the exploration of alternative models. This limitation becomes particularly relevant in physical science problems, which are often characterized by abrupt transitions between different system states and rapid changes in physical property behavior. Fully Bayesian Neural Networks (FBNNs) serve as a promising substitute, treating all neural network weights probabilistically and leveraging advanced Markov Chain Monte Carlo techniques for direct sampling from the posterior distribution. This approach enables FBNNs to provide reliable predictive distributions, crucial for making informed decisions under uncertainty in the active learning setting. Although traditionally considered too computationally expensive for 'big data' applications, many physical sciences problems involve small amounts of data in relatively low-dimensional parameter spaces. Here, we assess the suitability and performance of FBNNs with the No-U-Turn Sampler for active learning tasks in the 'small data' regime, highlighting their potential to enhance predictive accuracy and reliability on test functions relevant to problems in physical sciences.

MoDELS · Statistic · Learning · Manifold · Machine Learning ·

May 15, 2024

Kuramoto Oscillators and Swarms on Manifolds for Geometry Informed Machine Learning

Vladimir Jacimovic

We propose the idea of using Kuramoto models (including their higher-dimensional generalizations) for machine learning over non-Euclidean data sets. These models are systems of matrix ODEs describing collective motions (swarming dynamics) of abstract particles (generalized oscillators) on spheres, homogeneous spaces and Lie groups. Such models have been extensively studied since the beginning of the 21st century in both statistical physics and control theory. They provide a suitable framework for encoding maps between various manifolds and are capable of learning over spherical and hyperbolic geometries. In addition, they can learn coupled actions of transformation groups (such as special orthogonal, unitary and Lorentz groups). Furthermore, we overview families of probability distributions that provide appropriate statistical models for probabilistic modeling and inference in Geometric Deep Learning. We argue in favor of using statistical models which arise in different Kuramoto models in the continuum limit of particles. The most convenient families of probability distributions are those which are invariant with respect to actions of certain symmetry groups.
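To make the swarming dynamics concrete, the sketch below simulates only the classical Kuramoto model on the circle, where each phase obeys dθ_i/dt = ω_i + (K/N) Σ_j sin(θ_j − θ_i). The higher-dimensional generalizations on spheres and Lie groups discussed in the abstract are not covered. The function names, the explicit Euler integrator, and the step size are assumptions made for illustration.

```python
import cmath
import math
from typing import List, Sequence


def order_parameter(phases: Sequence[float]) -> float:
    """Magnitude r of the mean of exp(i * theta); r = 1 means full synchrony."""
    return abs(sum(cmath.exp(1j * t) for t in phases)) / len(phases)


def simulate_kuramoto(
    phases: Sequence[float],
    frequencies: Sequence[float],
    coupling: float,
    dt: float = 0.01,
    steps: int = 1000,
) -> List[float]:
    """Integrate the classical Kuramoto ODEs with explicit Euler steps."""
    theta = list(phases)
    n = len(theta)
    for _ in range(steps):
        # Each oscillator is pulled toward the phases of all the others.
        velocity = [
            frequencies[i]
            + coupling / n * sum(math.sin(theta[j] - theta[i]) for j in range(n))
            for i in range(n)
        ]
        theta = [t + dt * v for t, v in zip(theta, velocity)]
    return theta
```

With identical natural frequencies, positive coupling, and initial phases spread over less than half the circle, the oscillators should contract to a common phase, which shows up as the order parameter approaching 1.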

Anomaly detection · Unsupervised · Decoding · Autoencoder · Performer ·

May 15, 2024

A Hierarchically Feature Reconstructed Autoencoder for Unsupervised Anomaly Detection

Honghui Chen,Pingping Chen,Huan Mao,Mengxi Jiang

from arxiv, 12 pages, 4 figures

Anomaly detection and localization without any manual annotations or prior knowledge is a challenging task in the unsupervised learning setting. Existing works achieve excellent anomaly detection performance, but rely on complex networks or cumbersome pipelines. To address this issue, this paper explores a simple but effective architecture for anomaly detection. It consists of a well pre-trained encoder to extract hierarchical feature representations and a decoder to reconstruct these intermediate features from the encoder. In particular, it does not require any data augmentations or anomalous images for training. Anomalies are detected when the decoder fails to reconstruct features well, and the errors of hierarchical feature reconstruction are then aggregated into an anomaly map to achieve anomaly localization. Comparing the features of the encoder and the decoder leads to more accurate and robust localization results than the single-feature or pixel-by-pixel comparisons used in conventional works. Experimental results show that the proposed method outperforms state-of-the-art methods on the MNIST, Fashion-MNIST, CIFAR-10, and MVTec Anomaly Detection datasets for both anomaly detection and localization.

Example · Learner · Annotation · Active learning · Learning ·

May 13, 2024

Active Learning with Simple Questions

Vasilis Kontonis,Mingchen Ma,Christos Tzamos

from arxiv, To appear at COLT 2024

We consider an active learning setting where a learner is presented with a pool S of n unlabeled examples belonging to a domain X and asks queries to find the underlying labeling that agrees with a target concept h^* \in H. In contrast to traditional active learning that queries a single example for its label, we study more general region queries that allow the learner to pick a subset of the domain T \subset X and a target label y and ask a labeler whether h^*(x) = y for every example in the set T \cap S. Such more powerful queries allow us to bypass the limitations of traditional active learning and use significantly fewer rounds of interactions to learn but can potentially lead to a significantly more complex query language. Our main contribution is quantifying the trade-off between the number of queries and the complexity of the query language used by the learner. We measure the complexity of the region queries via the VC dimension of the family of regions. We show that given any hypothesis class H with VC dimension d, one can design a region query family Q with VC dimension O(d) such that for every set of n examples S \subset X and every h^* \in H, a learner can submit O(d log n) queries from Q to a labeler and perfectly label S. We show a matching lower bound by designing a hypothesis class H with VC dimension d and a dataset S \subset X of size n such that any learning algorithm using any query class with VC dimension O(d) must make poly(n) queries to label S perfectly. Finally, we focus on well-studied hypothesis classes including unions of intervals, high-dimensional boxes, and d-dimensional halfspaces, and obtain stronger results. In particular, we design learning algorithms that (i) are computationally efficient and (ii) work even when the queries are not answered based on the learner's pool of examples S but on some unknown superset L of S.
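A toy instance of region queries, under strong simplifying assumptions: the hypothesis class is one-dimensional thresholds (label 1 if x is at or above an unknown threshold, else 0), the pool contains distinct real numbers, and the query regions are half-lines. The sketch below is not the paper's construction for general classes; `make_labeler` and `label_thresholds` are hypothetical names. It only shows how a query of the form "is h^*(x) = y for every x in T \cap S?" lets a binary search label the whole pool with a logarithmic number of queries.

```python
from typing import Callable, Dict, List, Tuple

Region = Callable[[float], bool]
Oracle = Callable[[Region, int], bool]


def make_labeler(pool: List[float], target: Callable[[float], int]) -> Tuple[Oracle, Dict[str, int]]:
    """Build a region-query oracle over `pool`, plus a counter of queries asked."""
    counter = {"queries": 0}

    def ask(region: Region, label: int) -> bool:
        # Answers whether every pool point inside the region has this label.
        counter["queries"] += 1
        return all(target(x) == label for x in pool if region(x))

    return ask, counter


def label_thresholds(pool: List[float], ask: Oracle) -> Dict[float, int]:
    """Label a pool of distinct points under an unknown threshold concept."""
    points = sorted(pool)
    # Invariant: the number of leading points labeled 0 lies in [lo, hi].
    lo, hi = 0, len(points)
    while lo < hi:
        mid = (lo + hi + 1) // 2
        cutoff = points[mid - 1]
        # Region query: are the first `mid` points all labeled 0?
        if ask(lambda x, c=cutoff: x <= c, 0):
            lo = mid
        else:
            hi = mid - 1
    return {x: (0 if i < lo else 1) for i, x in enumerate(points)}
```

For a pool of 16 points this search needs at most 5 region queries, whereas asking for one label at a time could need up to 16 in the worst case if the learner did not exploit the ordering.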

Sparsification · Model evaluation · Learning · MoDELS · Notability ·

May 13, 2024

Secure Aggregation Meets Sparsification in Decentralized Learning

Sayan Biswas,Anne-Marie Kermarrec,Rafael Pires,Rishi Sharma,Milos Vujasinovic

Decentralized learning (DL) faces increased vulnerability to privacy breaches due to sophisticated attacks on machine learning (ML) models. Secure aggregation is a computationally efficient cryptographic technique that enables multiple parties to compute an aggregate of their private data while keeping their individual inputs concealed from each other and from any central aggregator. To enhance communication efficiency in DL, sparsification techniques are used, selectively sharing only the most crucial parameters or gradients in a model, thereby maintaining efficiency without notably compromising accuracy. However, applying secure aggregation to sparsified models in DL is challenging due to the transmission of disjoint parameter sets by distinct nodes, which can prevent masks from canceling out effectively. This paper introduces CESAR, a novel secure aggregation protocol for DL designed to be compatible with existing sparsification mechanisms. CESAR provably defends against honest-but-curious adversaries and can be formally adapted to counteract collusion between them. We provide a foundational understanding of the interaction between the sparsification carried out by the nodes and the proportion of the parameters shared under CESAR in both colluding and non-colluding environments, offering analytical insight into the working and applicability of the protocol. Experiments on a network with 48 nodes in a 3-regular topology show that with random subsampling, CESAR is always within 0.5% accuracy of decentralized parallel stochastic gradient descent (D-PSGD), while adding only 11% of data overhead. Moreover, it surpasses the accuracy on TopK by up to 0.3% on independent and identically distributed (IID) data.
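The mask-cancellation idea behind secure aggregation can be sketched with generic pairwise additive masking over integers modulo a fixed constant. This is a textbook-style simplification and not the CESAR protocol; `MODULUS`, `pairwise_masks`, `mask_value` and `aggregate` are names assumed here for illustration, and a real protocol would derive the shared masks from key agreement rather than a common random generator.

```python
import random
from typing import Dict, List, Tuple

MODULUS = 2 ** 32  # assumed size of the integer group used for masking


def pairwise_masks(num_nodes: int, rng: random.Random) -> Dict[Tuple[int, int], int]:
    """Draw one shared random mask for every unordered pair of nodes."""
    return {
        (i, j): rng.randrange(MODULUS)
        for i in range(num_nodes)
        for j in range(i + 1, num_nodes)
    }


def mask_value(node: int, value: int, num_nodes: int, masks: Dict[Tuple[int, int], int]) -> int:
    """Hide `value`: add masks shared with higher-indexed nodes, subtract the rest."""
    masked = value
    for other in range(num_nodes):
        if other == node:
            continue
        if node < other:
            masked += masks[(node, other)]
        else:
            masked -= masks[(other, node)]
    return masked % MODULUS


def aggregate(masked_values: List[int]) -> int:
    """Sum the masked values; every pairwise mask appears once with each sign."""
    return sum(masked_values) % MODULUS
```

Each mask is added by one node and subtracted by its partner, so the sum of the masked values equals the sum of the private values. If nodes transmit disjoint parameter sets after sparsification, a mask applied at an index by one node may have no counterpart from its partner, which is the cancellation problem the abstract describes.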

Continuity · Machine Learning · Functional · Learning · MoDELS ·

May 12, 2024

Stochastic Langevin Differential Inclusions with Applications to Machine Learning

Fabio V. Difonzo,Vyacheslav Kungurtsev,Jakub Marecek

from arxiv, 26 pages, 11 figures

Stochastic differential equations of Langevin-diffusion form have received significant attention, thanks to their foundational role in both Bayesian sampling algorithms and optimization in machine learning. In the latter, they serve as a conceptual model of the stochastic gradient flow in training over-parameterized models. However, the literature typically assumes smoothness of the potential, whose gradient is the drift term. Nevertheless, there are many problems for which the potential function is not continuously differentiable, and hence the drift is not Lipschitz continuous everywhere. This is exemplified by robust losses and Rectified Linear Units in regression problems. In this paper, we show some foundational results regarding the flow and asymptotic properties of Langevin-type Stochastic Differential Inclusions under assumptions appropriate to the machine-learning settings. In particular, we show strong existence of the solution, as well as an asymptotic minimization of the canonical free-energy functional.

Few-shot learning · Object detection · Networking · Dataset · Scenario ·

March 31, 2020

Few-Shot Object Detection with Attention-RPN and Multi-Relation Detector

Qi Fan,Wei Zhuo,Chi-Keung Tang,Yu-Wing Tai

from arxiv, CVPR2020 Camera Ready. (Fix Figure 3 and Table 5. More implementation details in the supplementary material.)

Conventional methods for object detection typically require a substantial amount of training data and preparing such high-quality training data is very labor-intensive. In this paper, we propose a novel few-shot object detection network that aims at detecting objects of unseen categories with only a few annotated examples. Central to our method are our Attention-RPN, Multi-Relation Detector and Contrastive Training strategy, which exploit the similarity between the few shot support set and query set to detect novel objects while suppressing false detection in the background. To train our network, we contribute a new dataset that contains 1000 categories of various objects with high-quality annotations. To the best of our knowledge, this is one of the first datasets specifically designed for few-shot object detection. Once our few-shot network is trained, it can detect objects of unseen categories without further training or fine-tuning. Our method is general and has a wide range of potential applications. We produce a new state-of-the-art performance on different datasets in the few-shot setting. The dataset link is //github.com/fanq15/Few-Shot-Object-Detection-Dataset.