
We study the problem of estimating Individual Treatment Effects (ITE) in the context of multiple treatments and networked observational data. By leveraging the network information, we aim to recover hidden confounders that are not directly accessible in the observed data, thereby making the strong ignorability assumption more plausible in practice. To achieve this, we first employ Graph Convolutional Networks (GCN) to learn a shared representation of the confounders. Our approach then uses separate neural networks to infer potential outcomes for each treatment. We design a loss function as a weighted combination of two components: a representation loss and a Mean Squared Error (MSE) loss on the factual outcomes. To measure the representation loss, we extend existing metrics such as the Wasserstein distance and Maximum Mean Discrepancy (MMD) from the binary treatment setting to the multiple-treatment scenario. To validate the effectiveness of the proposed methodology, we conduct a series of experiments on benchmark datasets such as BlogCatalog and Flickr. The experimental results consistently demonstrate the superior performance of our models compared to baseline methods.
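A minimal PyTorch sketch of this kind of architecture and loss, assuming a single GCN layer, per-treatment MLP heads, a linear-kernel MMD, and an illustrative trade-off weight `alpha`; all layer sizes and names are ours, not the paper's:

```python
# Minimal sketch, assuming one GCN layer, per-treatment MLP heads, a
# linear-kernel MMD, and a trade-off weight `alpha`; all sizes and names
# here are illustrative, not the paper's.
import torch
import torch.nn as nn

class GCNLayer(nn.Module):
    def __init__(self, d_in, d_out):
        super().__init__()
        self.lin = nn.Linear(d_in, d_out)

    def forward(self, x, a_norm):           # a_norm: normalized adjacency (n, n)
        return torch.relu(self.lin(a_norm @ x))

class MultiTreatmentITE(nn.Module):
    def __init__(self, d_in, d_rep, n_treat):
        super().__init__()
        self.enc = GCNLayer(d_in, d_rep)    # shared confounder representation
        self.heads = nn.ModuleList(
            nn.Sequential(nn.Linear(d_rep, d_rep), nn.ReLU(),
                          nn.Linear(d_rep, 1))
            for _ in range(n_treat))        # one outcome head per treatment

    def forward(self, x, a_norm):
        rep = self.enc(x, a_norm)
        y_hat = torch.cat([h(rep) for h in self.heads], dim=1)  # (n, T)
        return rep, y_hat

def mmd2(a, b):
    # squared MMD with a linear kernel: distance between group means
    return ((a.mean(0) - b.mean(0)) ** 2).sum()

def loss_fn(rep, y_hat, y, t, n_treat, alpha=1.0):
    # factual MSE: supervise only the head of the treatment actually received
    mse = ((y_hat.gather(1, t.view(-1, 1)).squeeze(1) - y) ** 2).mean()
    # multi-treatment representation loss: average pairwise group discrepancy
    groups = [rep[t == k] for k in range(n_treat)]
    pairs = [(i, j) for i in range(n_treat) for j in range(i + 1, n_treat)]
    imb = sum(mmd2(groups[i], groups[j]) for i, j in pairs) / len(pairs)
    return mse + alpha * imb
```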

Related content


We study variants of the average treatment effect on the treated with population parameters replaced by their sample counterparts. For each estimand, we derive the limiting distribution of a semiparametric efficient estimator of the population effect and provide guidance on variance estimation. Included in our analysis is the well-known sample average treatment effect on the treated, for which we obtain some unexpected results. Unlike for the ordinary sample average treatment effect, we find that the asymptotic variance for the sample average treatment effect on the treated is point-identified and consistently estimable, but it potentially exceeds that of the population estimand. To address this shortcoming, we propose a modification that yields a new estimand, the mixed average treatment effect on the treated, which is always estimated more precisely than both the population and sample effects. We also introduce a second new estimand that arises from an alternative interpretation of the treatment effect on the treated, in which all individuals are weighted by the propensity score.
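To make the final sentence concrete, here is the standard propensity-weighted identity behind that alternative interpretation (the notation is ours, not the paper's):

```latex
% Notation (ours): e(X) = P(T = 1 | X) is the propensity score and
% tau(X) = E[Y(1) - Y(0) | X] the conditional effect. Under unconfoundedness,
\[
  \tau_{\mathrm{ATT}}
  \;=\; \mathbb{E}\bigl[\,Y(1) - Y(0) \mid T = 1\,\bigr]
  \;=\; \frac{\mathbb{E}\bigl[\,e(X)\,\tau(X)\,\bigr]}{\mathbb{E}\bigl[\,e(X)\,\bigr]},
\]
% so the effect on the treated is an average of tau(X) over all individuals,
% each weighted by the propensity score; replacing the outer expectations by
% sample averages yields the sample-counterpart estimands studied here.
```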

We propose a new loss function for supervised and physics-informed training of neural networks and operators that incorporates an a posteriori error estimate. More specifically, during the training stage, the neural network learns additional physical fields that lead to rigorous error majorants after a computationally cheap postprocessing stage. The theoretical results build on the theory of functional a posteriori error estimates, which allows for the systematic construction of such loss functions for a diverse class of practically relevant partial differential equations. On the numerical side, we demonstrate on a series of elliptic problems that, for a variety of architectures and approaches (physics-informed neural networks, physics-informed neural operators, neural operators, and classical architectures in the regression and physics-informed settings), we can reach better or comparable accuracy and, in addition, cheaply recover high-quality upper bounds on the error after training.
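A minimal 1D sketch of such a majorant-based loss, assuming the classical functional estimate for the Poisson problem -u'' = f on (0, 1) with zero boundary values; the architecture, constant, and training loop below are illustrative, not the paper's exact formulation:

```python
# Sketch: for -u'' = f on (0, 1), u(0) = u(1) = 0, any flux field y gives the
# guaranteed bound  ||(u - v)'|| <= ||v' - y|| + C_F ||y' + f||  with
# C_F = 1/pi, so the network learns both v and y and minimizes the squared
# majorant. All choices below are illustrative.
import math
import torch
import torch.nn as nn

class SolutionAndFlux(nn.Module):
    def __init__(self, width=64):
        super().__init__()
        self.body = nn.Sequential(nn.Linear(1, width), nn.Tanh(),
                                  nn.Linear(width, width), nn.Tanh(),
                                  nn.Linear(width, 2))   # outputs (v, y)

    def forward(self, x):
        v_raw, y = self.body(x).split(1, dim=1)
        v = x * (1.0 - x) * v_raw        # hard-enforce v(0) = v(1) = 0
        return v, y

def majorant_loss(model, f, n=256):
    x = torch.rand(n, 1, requires_grad=True)
    v, y = model(x)
    dv = torch.autograd.grad(v.sum(), x, create_graph=True)[0]
    dy = torch.autograd.grad(y.sum(), x, create_graph=True)[0]
    c_f = 1.0 / math.pi                            # Friedrichs constant
    term1 = (dv - y).pow(2).mean().sqrt()          # Monte-Carlo L2 norms
    term2 = (dy + f(x)).pow(2).mean().sqrt()
    return (term1 + c_f * term2) ** 2              # squared majorant

model = SolutionAndFlux()
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
f = lambda x: (math.pi ** 2) * torch.sin(math.pi * x)   # true u = sin(pi x)
for _ in range(1000):
    opt.zero_grad()
    loss = majorant_loss(model, f)
    loss.backward()
    opt.step()
# After training, the square root of the loss is itself a rigorous upper
# bound on the energy-norm error of v (up to Monte-Carlo quadrature error).
```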

We address modelling and computational issues for multiple treatment effect inference under many potential confounders. A primary issue is preventing the harmful effects of omitting relevant covariates (under-selection) without running into over-selection issues, which introduce substantial variance and a bias related to the non-random over-inclusion of covariates. We propose a novel empirical Bayes framework for Bayesian model averaging that learns from the data the extent to which the inclusion of key covariates should be encouraged, specifically those highly associated with the treatments. A key challenge is computational. We develop fast algorithms, including an Expectation-Propagation variational approximation and simple stochastic gradient optimization algorithms, to learn the hyper-parameters from the data. Our framework uses widely used ingredients and largely existing software, and it is implemented within the R package mombf available on CRAN. This work is motivated by and illustrated in two applications. The first is the association between salary variation and discriminatory factors. The second, which has been debated in previous work, is the association between abortion policies and crime. Our approach provides insights that differ from previous analyses, especially in situations with weaker treatment effects.
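One way to read the empirical-Bayes idea is the following hedged sketch in our own notation; the paper's actual prior specification may differ:

```latex
% Hedged sketch (our notation): let gamma_j indicate inclusion of covariate j
% in the outcome model and d_j measure its association with the treatments.
% Inclusion probabilities depend on hyper-parameters theta learned by
% empirical Bayes from the marginal likelihood:
\[
  P(\gamma_j = 1 \mid \theta)
    = \frac{1}{1 + e^{-(\theta_0 + \theta_1 d_j)}},
  \qquad
  \hat{\theta} = \arg\max_{\theta}\; \sum_{\gamma} p(y \mid \gamma)\,
    p(\gamma \mid \theta).
\]
% A fitted theta_1 > 0 encourages covariates highly associated with the
% treatments (guarding against under-selection), while the marginal
% likelihood penalizes needless inclusions (guarding against over-selection).
```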

Recent architectural developments have enabled recurrent neural networks (RNNs) to reach and even surpass the performance of Transformers on certain sequence modeling tasks. These modern RNNs feature a prominent design pattern: linear recurrent layers interconnected by feedforward paths with multiplicative gating. Here, we show how RNNs equipped with these two design elements can exactly implement (linear) self-attention, the main building block of Transformers. By reverse-engineering a set of trained RNNs, we find that gradient descent in practice discovers our construction. In particular, we examine RNNs trained to solve simple in-context learning tasks on which Transformers are known to excel and find that gradient descent instills in our RNNs the same attention-based in-context learning algorithm used by Transformers. Our findings highlight the importance of multiplicative interactions in neural networks and suggest that certain RNNs might be unexpectedly implementing attention under the hood.
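The core equivalence admits a compact demonstration. The toy construction below (ours, not the paper's reverse-engineered one) shows that unnormalized linear self-attention is exactly a linear recurrence over an outer-product state, read out through a multiplicative interaction with the query:

```python
# Toy demonstration: linear self-attention o_t = sum_{s<=t} (k_s . q_t) v_s
# equals a linear recurrence over the outer-product state S_t, where the
# outer product k_t v_t^T plays the role of the multiplicative interaction.
import numpy as np

rng = np.random.default_rng(0)
T, d = 6, 4
q, k, v = (rng.standard_normal((T, d)) for _ in range(3))

# (a) linear self-attention, computed directly
attn = np.stack([sum((k[s] @ q[t]) * v[s] for s in range(t + 1))
                 for t in range(T)])

# (b) the same computation as an RNN: linear state update, multiplicative read-out
S = np.zeros((d, d))                 # recurrent state: running sum of k v^T
rnn = np.zeros((T, d))
for t in range(T):
    S = S + np.outer(k[t], v[t])     # linear recurrence in S
    rnn[t] = S.T @ q[t]              # multiplicative interaction with the query

assert np.allclose(attn, rnn)        # identical outputs
```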

We propose a diffusion approximation method for continuous-state Markov Decision Processes (MDPs) that can be utilized to address autonomous navigation and control in unstructured off-road environments. In contrast to most decision-theoretic planning frameworks, which assume fully known state transition models, we design a method that eliminates this strong assumption, which is often extremely difficult to satisfy in reality. We first take the second-order Taylor expansion of the value function. The Bellman optimality equation is then approximated by a partial differential equation, which relies only on the first and second moments of the transition model. Combining this with a kernel representation of the value function, we design an efficient policy iteration algorithm whose policy evaluation step can be represented as a linear system of equations characterized by a finite set of supporting states. We first validate the proposed method through extensive simulations in 2D obstacle avoidance and 2.5D terrain navigation problems. The results show that the proposed approach significantly outperforms several baselines. We then develop a system that integrates our decision-making framework with onboard perception and conduct real-world experiments in both cluttered indoor and unstructured outdoor environments. The results from the physical systems further demonstrate the applicability of our method in challenging real-world environments.
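A hedged reconstruction of the key approximation step, in our notation: writing the transition as s' = s + δ and Taylor-expanding V to second order, the Bellman optimality equation depends on the transition model only through the first two moments of δ:

```latex
% Hedged reconstruction, notation ours: with s' = s + \delta and moments
% \mu(s,a) = E[\delta], \Sigma(s,a) = E[\delta\delta^\top], a second-order
% Taylor expansion of V around s turns the Bellman optimality equation into
\[
  V(s) \;\approx\; \max_{a}\,\Bigl[ r(s,a)
    + \gamma\,\bigl( V(s) + \nabla V(s)^{\top}\mu(s,a)
    + \tfrac{1}{2}\,\operatorname{tr}\!\bigl(\Sigma(s,a)\,\nabla^{2}V(s)\bigr)\bigr) \Bigr],
\]
% a partial differential equation in V that involves the transition model
% only through its first and second moments.
```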

Transfer Entropy (TE), the primary method for determining directed information flow within a network system, can be biased, either downward or upward, in both pairwise and conditioned calculations, owing to high-order dependencies between the dynamic processes under consideration and the remaining processes in the system used for conditioning. Here, we propose a novel approach. Instead of conditioning TE on all network processes except the driver and target, as in its fully conditioned version, or not conditioning at all, as in the pairwise approach, our method searches for both the multiplets of variables that maximize information flow and those that minimize it. This provides a decomposition of TE into unique, redundant, and synergistic atoms. Our approach enables the quantification of the relative importance of high-order effects compared to pure two-body effects in information transfer between two processes, while also highlighting the processes that contribute to building these high-order effects alongside the driver. We demonstrate the application of our approach in climatology by analyzing data from El Niño and the Southern Oscillation.
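A toy sketch of the multiplet search (our simplification, assuming jointly Gaussian processes so that conditioned TE reduces to a log-ratio of regression-residual variances; the greedy procedure and its stopping rule are illustrative):

```python
# Toy sketch, assuming jointly Gaussian processes so that conditioned TE
# reduces to a log-ratio of regression-residual variances; the greedy search
# and its stopping rule are our simplification of the multiplet search.
import numpy as np

def _res_var(y, X):
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return (y - X @ beta).var()

def cond_te(x, y, Z):
    # one-lag Gaussian transfer entropy x -> y given conditioning columns Z
    yp, y0, x0 = y[1:], y[:-1, None], x[:-1, None]
    Z0 = Z[:-1] if Z.size else np.empty((len(yp), 0))
    ones = np.ones((len(yp), 1))
    v_reduced = _res_var(yp, np.hstack([ones, y0, Z0]))
    v_full = _res_var(yp, np.hstack([ones, y0, x0, Z0]))
    return 0.5 * np.log(v_reduced / v_full)

def greedy_multiplet(x, y, others, maximize=True):
    # grow the conditioning multiplet that maximizes (synergy) or
    # minimizes (redundancy) the conditioned TE, one process at a time
    chosen, pool = [], list(range(others.shape[1]))
    best = cond_te(x, y, others[:, chosen])
    while pool:
        scores = [cond_te(x, y, others[:, chosen + [j]]) for j in pool]
        idx = int(np.argmax(scores) if maximize else np.argmin(scores))
        better = scores[idx] > best if maximize else scores[idx] < best
        if not better:
            break
        best = scores[idx]
        chosen.append(pool.pop(idx))
    return chosen, best
```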

We study the coordination of actions and the allocation of profit in supply chains under decentralized control, in which a single supplier supplies several retailers with goods for the replenishment of stocks. The goal of the supplier and the retailers is to maximize their individual profits. Since the outcome under decentralized control is inefficient, cooperation among firms by means of coordination of actions may improve the individual profits. Cooperation is studied by means of cooperative game theory. Among other results, we show that the corresponding games are balanced, and we propose a stable solution concept for these games.
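For background, the standard notions involved (textbook definitions, in our notation): with player set N comprising the supplier and the retailers, and characteristic function v(S) giving the maximal joint profit of coalition S, an allocation is stable when it lies in the core:

```latex
% Textbook definitions in our notation: an allocation x is in the core,
% so that no coalition gains by splitting off, when
\[
  \sum_{i \in N} x_i = v(N)
  \qquad \text{and} \qquad
  \sum_{i \in S} x_i \;\ge\; v(S) \quad \text{for all } S \subseteq N.
\]
% By the Bondareva-Shapley theorem, balancedness of the game is precisely
% the condition that guarantees the core is non-empty.
```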

The interest in network analysis of bibliographic data has grown substantially in recent years, yet comprehensive statistical models for examining the complete dynamics of scientific networks based on bibliographic data are generally lacking. Current empirical studies often restrict analysis either to paper citation networks (paper-by-paper) or author networks (author-by-author). However, such networks encompass not only direct connections between papers, but also indirect relationships between the references of papers connected by a citation link. In this paper, we extend recently developed relational hyperevent models (RHEM) for analyzing scientific networks. We introduce new covariates representing theoretically meaningful and empirically interesting sub-network configurations. The model accommodates testing hypotheses concerning (i) the polyadic nature of scientific publication events and (ii) the interdependencies between the authors and references of current and prior papers. We implement the model using purpose-built, publicly available open-source software, demonstrating its empirical value in an analysis of a large, publicly available scientific network dataset. Assessing the relative strength of the various effects reveals that both the hyperedge structure of publication events and the interconnections between authors and references significantly improve our understanding and interpretation of collaborative scientific production.
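A hedged sketch of the model family, in our notation (the paper's exact specification may differ): each publication event is a hyperedge linking its author set and its cited references, occurring at a rate that is log-linear in sub-network statistics, with the new covariates entering as additional statistics:

```latex
% Hedged sketch, notation ours: given the event history H_t, a candidate
% publication event e (a hyperedge over authors and references) has rate
\[
  \lambda(e \mid H_t; \theta)
    = \exp\!\Bigl( \sum_{k} \theta_k\, s_k(e, H_t) \Bigr),
\]
% so hypotheses (i) and (ii) are tested through the estimated coefficients
% theta_k on the corresponding polyadic and author-reference statistics.
```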

Cluster randomization trials commonly employ multiple endpoints. When a single summary of treatment effects across endpoints is of primary interest, global hypothesis testing/effect estimation methods represent a common analysis strategy. However, specification of the joint distribution required by these methods is non-trivial, particularly when endpoint properties differ. We develop rank-based interval estimators for a global treatment effect referred to as the "global win probability," or the probability that a treatment individual responds better than a control individual on average. Using endpoint-specific ranks among the combined sample and within each arm, each individual-level observation is converted to a "win fraction," which quantifies the proportion of wins experienced over every observation in the comparison arm. An individual's multiple observations are then replaced by a single "global win fraction," constructed by averaging win fractions across endpoints. A linear mixed model is applied directly to the global win fractions to recover point, variance, and interval estimates of the global win probability adjusted for clustering. Simulations demonstrate that our approach performs well with respect to coverage and type I error, and the methods are easily implemented using standard software. A case study using publicly available data is provided, with corresponding R and SAS code.
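An illustrative Python rendering of the pipeline (the column names `arm` and `cluster` and the endpoint list are hypothetical; the paper itself provides R and SAS code):

```python
# Illustrative sketch; column names `arm` (1 = treatment, 0 = control),
# `cluster`, and the endpoint list are hypothetical.
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

def win_fraction(own, other):
    # share of comparison-arm observations beaten; ties count as half a win
    other = np.asarray(other)
    return np.array([np.mean(other < y) + 0.5 * np.mean(other == y)
                     for y in own])

def global_win_fractions(df, endpoints):
    out = df.copy()
    for ep in endpoints:
        wf = np.empty(len(df))
        for arm in (1, 0):
            mask = (df["arm"] == arm).values
            wf[mask] = win_fraction(df.loc[mask, ep], df.loc[~mask, ep])
        out[f"wf_{ep}"] = wf
    # one global win fraction per individual: average across endpoints
    out["gwf"] = out[[f"wf_{ep}" for ep in endpoints]].mean(axis=1)
    return out

# df = global_win_fractions(df, ["endpoint_a", "endpoint_b"])
# A linear mixed model with a random cluster intercept then recovers point,
# variance, and interval estimates adjusted for clustering:
# fit = smf.mixedlm("gwf ~ arm", df, groups=df["cluster"]).fit()
# print(fit.summary())
```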

We hypothesize that due to the greedy nature of learning in multi-modal deep neural networks, these models tend to rely on just one modality while under-fitting the other modalities. Such behavior is counter-intuitive and hurts the models' generalization, as we observe empirically. To estimate a model's dependence on each modality, we compute the gain in accuracy when the model has access to it in addition to another modality. We refer to this gain as the conditional utilization rate. In our experiments, we consistently observe an imbalance in conditional utilization rates between modalities, across multiple tasks and architectures. Since the conditional utilization rate cannot be computed efficiently during training, we introduce a proxy for it based on the pace at which the model learns from each modality, which we refer to as the conditional learning speed. We propose an algorithm to balance the conditional learning speeds between modalities during training and demonstrate that it indeed addresses the issue of greedy learning. The proposed algorithm improves the model's generalization on three datasets: Colored MNIST, Princeton ModelNet40, and NVIDIA Dynamic Hand Gesture.
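A minimal sketch of the two quantities, in our own formalization; `evaluate` is a hypothetical helper returning validation accuracy when the model sees only the listed modalities, and the rebalancing rule is illustrative rather than the paper's algorithm:

```python
# Minimal sketch, our formalization of the two quantities in the text.
# `evaluate` is a hypothetical helper returning validation accuracy when the
# model sees only the listed modalities (the others zeroed or masked out).
def conditional_utilization_rate(evaluate, m1, m2):
    # accuracy gained by adding modality m1 on top of modality m2
    return evaluate(modalities=[m1, m2]) - evaluate(modalities=[m2])

def rebalance(speeds, base_lr):
    # conditional-learning-speed balancing (illustrative rule): slow down the
    # modality that is learning fastest and speed up the one lagging behind
    mean = sum(speeds.values()) / len(speeds)
    return {m: base_lr * mean / max(s, 1e-8) for m, s in speeds.items()}

# e.g. speeds = {"rgb": 0.9, "depth": 0.3} gives depth a larger step size
```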
