Heterogeneous treatment effects are driven by treatment effect modifiers, pre-treatment covariates that modify the effect of a treatment on an outcome. Current approaches for uncovering these variables are limited to low-dimensional data, data with weakly correlated covariates, or data generated according to parametric processes. We resolve these issues by developing a framework for defining model-agnostic treatment effect modifier variable importance parameters applicable to high-dimensional data with arbitrary correlation structure, deriving one-step, estimating equation and targeted maximum likelihood estimators of these parameters, and establishing these estimators' asymptotic properties. This framework is showcased by defining variable importance parameters for data-generating processes with continuous, binary, and time-to-event outcomes with binary treatments, and deriving accompanying multiply-robust and asymptotically linear estimators. Simulation experiments demonstrate that these estimators' asymptotic guarantees are approximately achieved in realistic sample sizes for observational and randomized studies alike. This framework is applied to gene expression data collected for a clinical trial assessing the effect of a monoclonal antibody therapy on disease-free survival in breast cancer patients. Genes predicted to have the greatest potential for treatment effect modification have previously been linked to breast cancer. An open-source R package implementing this methodology, unihtee, is made available on GitHub at https://github.com/insightsengineering/unihtee.

Related content

Nonparametric density estimation is an unsupervised learning problem. In this work we propose a two-step procedure that casts the density estimation problem in the first step into a supervised regression problem. The advantage is that we can afterwards apply supervised learning methods. Compared to the standard nonparametric regression setting, the proposed procedure creates, however, dependence among the training samples. To derive statistical risk bounds, one can therefore not rely on the well-developed theory for i.i.d. data. To overcome this, we prove an oracle inequality for this specific form of data dependence. As an application, it is shown that under a compositional structure assumption on the underlying density, the proposed two-step method achieves convergence rates that are faster than the standard nonparametric rates. A simulation study illustrates the finite sample performance.

Many combinatorial optimization problems can be formulated as the search for a subgraph that satisfies certain properties and minimizes the total weight. We assume here that the vertices correspond to points in a metric space and can take any position in given uncertainty sets. Then, the cost function to be minimized is the sum of the distances for the worst positions of the vertices in their uncertainty sets. We propose two types of polynomial-time approximation algorithms. The first one relies on solving a deterministic counterpart of the problem where the uncertain distances are replaced with maximum pairwise distances. We study in detail the resulting approximation ratio, which depends on the structure of the feasible subgraphs and whether the metric space is Ptolemaic or not. The second algorithm is a fully polynomial-time approximation scheme for the special case of $s-t$ paths.
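
The deterministic counterpart described above can be sketched for the $s-t$ path case. This is only an illustration under stated assumptions, not the authors' implementation: each uncertainty set is assumed to be a finite list of points in the Euclidean plane, and the function names, graph, and coordinates are hypothetical. Each edge weight is set to the maximum pairwise distance between the two endpoint sets, and an ordinary shortest-path search is then run on those fixed weights.

```python
import heapq
import math


def max_pairwise_distance(set_a, set_b):
    """Largest Euclidean distance between a point of set_a and a point of set_b."""
    return max(math.dist(u, v) for u in set_a for v in set_b)


def shortest_path_deterministic_counterpart(uncertainty_sets, edges, source, target):
    """Dijkstra on the graph whose edge weights are maximum pairwise distances.

    uncertainty_sets: dict mapping vertex -> list of candidate positions (assumed finite).
    edges: iterable of undirected (vertex, vertex) pairs.
    Returns (path, cost); path is None if target is unreachable.
    """
    graph = {v: [] for v in uncertainty_sets}
    for a, b in edges:
        w = max_pairwise_distance(uncertainty_sets[a], uncertainty_sets[b])
        graph[a].append((b, w))
        graph[b].append((a, w))

    best = {source: 0.0}
    parent = {}
    heap = [(0.0, source)]
    while heap:
        dist, node = heapq.heappop(heap)
        if node == target:
            break
        if dist > best.get(node, math.inf):
            continue  # stale heap entry
        for nxt, w in graph[node]:
            cand = dist + w
            if cand < best.get(nxt, math.inf):
                best[nxt] = cand
                parent[nxt] = node
                heapq.heappush(heap, (cand, nxt))

    if target not in best:
        return None, math.inf
    path = [target]
    while path[-1] != source:
        path.append(parent[path[-1]])
    path.reverse()
    return path, best[target]


# Hypothetical instance: vertex "a" may sit at either of two positions.
sets = {
    "s": [(0.0, 0.0)],
    "a": [(1.0, 0.0), (1.0, 1.0)],
    "b": [(0.0, 3.0)],
    "t": [(2.0, 0.0)],
}
path, cost = shortest_path_deterministic_counterpart(
    sets, [("s", "a"), ("a", "t"), ("s", "b"), ("b", "t")], "s", "t"
)
```

Because every realized edge length is at most the maximum pairwise distance of its endpoint sets, the cost computed this way is an upper bound on the true worst-case cost of the returned path; how tight that bound is depends on the approximation ratio studied in the paper.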

Diffusion models have emerged as effective distribution estimators in vision, language, and reinforcement learning, but their use as priors in downstream tasks poses an intractable posterior inference problem. This paper studies amortized sampling of the posterior over data, $\mathbf{x}\sim p^{\rm post}(\mathbf{x})\propto p(\mathbf{x})r(\mathbf{x})$, in a model that consists of a diffusion generative model prior $p(\mathbf{x})$ and a black-box constraint or likelihood function $r(\mathbf{x})$. We state and prove the asymptotic correctness of a data-free learning objective, relative trajectory balance, for training a diffusion model that samples from this posterior, a problem that existing methods solve only approximately or in restricted cases. Relative trajectory balance arises from the generative flow network perspective on diffusion models, which allows the use of deep reinforcement learning techniques to improve mode coverage. Experiments illustrate the broad potential of unbiased inference of arbitrary posteriors under diffusion priors: in vision (classifier guidance), language (infilling under a discrete diffusion LLM), and multimodal data (text-to-image generation). Beyond generative modeling, we apply relative trajectory balance to the problem of continuous control with a score-based behavior prior, achieving state-of-the-art results on benchmarks in offline reinforcement learning.
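
To make the target distribution concrete, the following toy sketch computes $p^{\rm post}(\mathbf{x})\propto p(\mathbf{x})r(\mathbf{x})$ on a small discrete space where normalization is trivial. It does not implement relative trajectory balance or any diffusion model; the three states and all numbers are hypothetical and chosen only to show how the constraint reweights the prior.

```python
def posterior_from_prior(prior, constraint):
    """Normalize prior[i] * constraint[i] into a probability vector.

    prior: probabilities p(x) over a finite set of states (assumed to sum to 1).
    constraint: non-negative scores r(x) from a black-box function.
    """
    unnormalized = [p * r for p, r in zip(prior, constraint)]
    total = sum(unnormalized)
    if total <= 0:
        raise ValueError("constraint assigns zero mass to every state")
    return [u / total for u in unnormalized]


# Hypothetical three-state example.
prior = [0.5, 0.3, 0.2]        # p(x)
constraint = [0.1, 1.0, 2.0]   # r(x)
post = posterior_from_prior(prior, constraint)
```

In the setting of the abstract the state space is continuous and high-dimensional, so the normalizing sum above is not available in closed form, which is why a sampler for the posterior has to be learned instead.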

Accurate prediction of antibody structure is a central task in the design and development of monoclonal antibodies, notably to understand both their developability and their binding properties. In this article, we introduce ABodyBuilder3, an improved and scalable antibody structure prediction model based on ImmuneBuilder. We achieve a new state-of-the-art accuracy in the modelling of CDR loops by leveraging language model embeddings, and show how predicted structures can be further improved through careful relaxation strategies. Finally, we incorporate a predicted Local Distance Difference Test into the model output to allow for a more accurate estimation of uncertainties.

Studies intended to estimate the effect of a treatment, like randomized trials, may not be sampled from the desired target population. To correct for this discrepancy, estimates can be transported to the target population. Methods for transporting between populations are often premised on a positivity assumption, such that all relevant covariate patterns in one population are also present in the other. However, eligibility criteria, particularly in the case of trials, can result in violations of positivity when transporting to external populations. To address nonpositivity, a synthesis of statistical and mathematical models can be considered. This approach integrates multiple data sources (e.g. trials, observational, pharmacokinetic studies) to estimate treatment effects, leveraging mathematical models to handle positivity violations. This approach was previously demonstrated for positivity violations by a single binary covariate. Here, we extend the synthesis approach for positivity violations with a continuous covariate. For estimation, two novel augmented inverse probability weighting estimators are proposed. Both estimators are contrasted with other common approaches for addressing nonpositivity. Empirical performance is compared via Monte Carlo simulation. Finally, the competing approaches are illustrated with an example in the context of two-drug versus one-drug antiretroviral therapy on CD4 T cell counts among women with HIV.

Diagnosing and managing a patient is a complex, sequential decision making process that requires physicians to obtain information -- such as which tests to perform -- and to act upon it. Recent advances in artificial intelligence (AI) and large language models (LLMs) promise to profoundly impact clinical care. However, current evaluation schemes overrely on static medical question-answering benchmarks, falling short on the interactive decision-making that is required in real-life clinical work. Here, we present AgentClinic: a multimodal benchmark to evaluate LLMs in their ability to operate as agents in simulated clinical environments. In our benchmark, the doctor agent must uncover the patient's diagnosis through dialogue and active data collection. We present two open medical agent benchmarks: a multimodal image and dialogue environment, AgentClinic-NEJM, and a dialogue-only environment, AgentClinic-MedQA. We embed cognitive and implicit biases both in patient and doctor agents to emulate realistic interactions between biased agents. We find that introducing bias leads to large reductions in diagnostic accuracy of the doctor agents, as well as reduced compliance, confidence, and follow-up consultation willingness in patient agents. Evaluating a suite of state-of-the-art LLMs, we find that several models that excel in benchmarks like MedQA perform poorly in AgentClinic-MedQA. We find that the LLM used in the patient agent is an important factor for performance in the AgentClinic benchmark. We show that both too few and too many interactions reduce diagnostic accuracy in doctor agents. The code and data for this work are publicly available at https://AgentClinic.github.io.

Robotic exploration has long captivated researchers aiming to map complex environments efficiently. Techniques such as potential fields and frontier exploration have traditionally been employed in this pursuit, primarily focusing on solitary agents. Recent advancements have shifted towards optimizing exploration efficiency through multiagent systems. However, many existing approaches overlook critical real-world factors, such as broadcast range limitations, communication costs, and coverage overlap. This paper addresses these gaps by proposing a distributed maze exploration strategy (CU-LVP) that assumes constrained broadcast ranges and utilizes Voronoi diagrams for better area partitioning. By adapting traditional multiagent methods to distributed environments with limited broadcast ranges, this study evaluates their performance across diverse maze topologies, demonstrating the efficacy and practical applicability of the proposed method. The code and experimental results supporting this study are available in the following repository: https://github.com/manouslinard/multiagent-exploration/.
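
The idea of Voronoi-based area partitioning can be illustrated on a small grid. The sketch below is not CU-LVP: it ignores maze walls, broadcast ranges, and communication costs, and simply assigns every cell to its nearest agent. The function name, the Euclidean metric, and the tie-breaking rule (lowest agent index wins) are assumptions made for illustration.

```python
def voronoi_partition(rows, cols, agents):
    """Assign each grid cell to the closest agent (discrete Voronoi partition).

    agents: list of (row, col) agent positions.
    Returns a dict mapping agent index -> list of (row, col) cells it owns.
    Ties are broken in favor of the lower agent index (an assumption).
    """
    regions = {i: [] for i in range(len(agents))}
    for r in range(rows):
        for c in range(cols):
            owner = min(
                range(len(agents)),
                key=lambda i: ((r - agents[i][0]) ** 2 + (c - agents[i][1]) ** 2, i),
            )
            regions[owner].append((r, c))
    return regions


# Hypothetical 2 x 4 grid with one agent in each top corner.
regions = voronoi_partition(2, 4, [(0, 0), (0, 3)])
```

Each agent would then explore only its own region, which is one way to limit the coverage overlap mentioned in the abstract.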

Statistical significance of both the original and the replication study is a commonly used criterion to assess replication attempts, also known as the two-trials rule in drug development. However, replication studies are sometimes conducted although the original study is non-significant, in which case Type-I error rate control across both studies is no longer guaranteed. We propose an alternative method to assess replicability using the sum of p-values from the two studies. The approach provides a combined p-value and can be calibrated to control the overall Type-I error rate at the same level as the two-trials rule but allows for replication success even if the original study is non-significant. The unweighted version requires a less restrictive level of significance at replication if the original study is already convincing, which facilitates sample size reductions of up to 10%. Downweighting the original study accounts for possible bias and requires a more stringent significance level and larger sample sizes at replication. Data from four large-scale replication projects are used to illustrate and compare the proposed method with the two-trials rule, meta-analysis and Fisher's combination method.
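
A minimal sketch of the unweighted sum-of-p-values idea follows. It assumes the two p-values are independent and uniform on [0, 1] under the null hypothesis, so that their sum $s$ has distribution function $s^2/2$ for $s \le 1$ and $1 - (2-s)^2/2$ for $1 < s \le 2$. The one-sided level $\alpha = 0.025$, the function names, and the example p-values are assumptions for illustration; the paper's exact procedure, including the weighted version, may differ in detail.

```python
def sum_p_combined(p_original, p_replication):
    """Combined p-value: null probability that the sum of two independent
    uniform p-values is at most the observed sum."""
    s = p_original + p_replication
    if s <= 1.0:
        return s * s / 2.0
    return 1.0 - (2.0 - s) ** 2 / 2.0


def two_trials_success(p_original, p_replication, alpha=0.025):
    """Two-trials rule: both studies must be significant at level alpha."""
    return p_original <= alpha and p_replication <= alpha


def sum_rule_success(p_original, p_replication, alpha=0.025):
    """Sum rule calibrated to the same overall Type-I error rate, alpha**2."""
    return sum_p_combined(p_original, p_replication) <= alpha ** 2


# Hypothetical p-values: the original study just misses the 0.025 threshold.
p_o, p_r = 0.03, 0.004
combined = sum_p_combined(p_o, p_r)
```

Under these assumptions the calibrated rule is equivalent to requiring that the sum of the two p-values be at most $\sqrt{2}\,\alpha \approx 0.035$, so a non-significant original study can still lead to replication success if the replication p-value is small enough, whereas the two-trials rule rejects the pair above outright.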

We hypothesize that due to the greedy nature of learning in multi-modal deep neural networks, these models tend to rely on just one modality while under-fitting the other modalities. Such behavior is counter-intuitive and hurts the models' generalization, as we observe empirically. To estimate the model's dependence on each modality, we compute the gain in accuracy when the model has access to that modality in addition to another modality. We refer to this gain as the conditional utilization rate. In the experiments, we consistently observe an imbalance in conditional utilization rates between modalities, across multiple tasks and architectures. Since the conditional utilization rate cannot be computed efficiently during training, we introduce a proxy for it based on the pace at which the model learns from each modality, which we refer to as the conditional learning speed. We propose an algorithm to balance the conditional learning speeds between modalities during training and demonstrate that it indeed addresses the issue of greedy learning. The proposed algorithm improves the model's generalization on three datasets: Colored MNIST, Princeton ModelNet40, and NVIDIA Dynamic Hand Gesture.
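
The conditional utilization rate can be illustrated with a small calculation. The sketch below takes the abstract's wording literally and uses the plain difference between the accuracy with both modalities and the accuracy with only the other modality; the paper may normalize this gain differently. All labels and predictions are hypothetical toy data, not results from the paper.

```python
def accuracy(predictions, labels):
    """Fraction of predictions that match the labels."""
    correct = sum(1 for p, y in zip(predictions, labels) if p == y)
    return correct / len(labels)


def conditional_utilization(acc_both, acc_other_only):
    """Accuracy gained by adding a modality on top of the other one
    (plain difference; an assumed definition)."""
    return acc_both - acc_other_only


# Hypothetical toy predictions for a two-modality classifier.
labels    = [0, 1, 0, 1, 1, 0, 1, 0, 1, 1]
pred_both = [0, 1, 0, 1, 1, 0, 1, 0, 1, 0]  # both modalities available
pred_m0   = [0, 1, 0, 1, 1, 0, 1, 0, 0, 0]  # modality m0 only
pred_m1   = [1, 1, 1, 1, 1, 1, 1, 1, 1, 1]  # modality m1 only

acc_both = accuracy(pred_both, labels)
u_m1_given_m0 = conditional_utilization(acc_both, accuracy(pred_m0, labels))
u_m0_given_m1 = conditional_utilization(acc_both, accuracy(pred_m1, labels))
```

In this toy case adding m1 on top of m0 helps much less than adding m0 on top of m1, which is the kind of imbalance between modalities that the abstract describes.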

Graph representation learning for hypergraphs can be used to extract patterns among higher-order interactions that are critically important in many real world problems. Current approaches designed for hypergraphs, however, are unable to handle different types of hypergraphs and are typically not generic for various learning tasks. Indeed, models that can predict variable-sized heterogeneous hyperedges have not been available. Here we develop a new self-attention based graph neural network called Hyper-SAGNN applicable to homogeneous and heterogeneous hypergraphs with variable hyperedge sizes. We perform extensive evaluations on multiple datasets, including four benchmark network datasets and two single-cell Hi-C datasets in genomics. We demonstrate that Hyper-SAGNN significantly outperforms the state-of-the-art methods on traditional tasks while also achieving great performance on a new task called outsider identification. Hyper-SAGNN will be useful for graph representation learning to uncover complex higher-order interactions in different applications.
