Estimating causal effects among different events is of great importance to critical fields such as drug development. Nevertheless, the data features associated with events may be distributed across various silos and remain private within respective parties, impeding direct information exchange between them. This, in turn, can result in biased estimations of local causal effects, which rely on the characteristics of only a subset of the covariates. To tackle this challenge, we introduce an innovative disentangle architecture designed to facilitate the seamless cross-silo transmission of model parameters, enriched with causal mechanisms, through a combination of shared and private branches. Besides, we introduce global constraints into the equation to effectively mitigate bias within the various missing domains, thereby elevating the accuracy of our causal effect estimation. Extensive experiments conducted on new semi-synthetic datasets show that our method outperforms state-of-the-art baselines.
In complex industrial and chemical process control rooms, effective decision-making is crucial for safety and effi- ciency. The experiments in this paper evaluate the impact and applications of an AI-based decision support system integrated into an improved human-machine interface, using dynamic influ- ence diagrams, a hidden Markov model, and deep reinforcement learning. The enhanced support system aims to reduce operator workload, improve situational awareness, and provide different intervention strategies to the operator adapted to the current state of both the system and human performance. Such a system can be particularly useful in cases of information overload when many alarms and inputs are presented all within the same time window, or for junior operators during training. A comprehensive cross-data analysis was conducted, involving 47 participants and a diverse range of data sources such as smartwatch metrics, eye- tracking data, process logs, and responses from questionnaires. The results indicate interesting insights regarding the effec- tiveness of the approach in aiding decision-making, decreasing perceived workload, and increasing situational awareness for the scenarios considered. Additionally, the results provide valuable insights to compare differences between styles of information gathering when using the system by individual participants. These findings are particularly relevant when predicting the overall performance of the individual participant and their capacity to successfully handle a plant upset and the alarms connected to it using process and human-machine interaction logs in real-time. These predictions enable the development of more effective intervention strategies.
Conversational dense retrieval has shown to be effective in conversational search. However, a major limitation of conversational dense retrieval is their lack of interpretability, hindering intuitive understanding of model behaviors for targeted improvements. This paper presents CONVINV, a simple yet effective approach to shed light on interpretable conversational dense retrieval models. CONVINV transforms opaque conversational session embeddings into explicitly interpretable text while faithfully maintaining their original retrieval performance as much as possible. Such transformation is achieved by training a recently proposed Vec2Text model based on the ad-hoc query encoder, leveraging the fact that the session and query embeddings share the same space in existing conversational dense retrieval. To further enhance interpretability, we propose to incorporate external interpretable query rewrites into the transformation process. Extensive evaluations on three conversational search benchmarks demonstrate that CONVINV can yield more interpretable text and faithfully preserve original retrieval performance than baselines. Our work connects opaque session embeddings with transparent query rewriting, paving the way toward trustworthy conversational search.
Identifying which variables do influence a response while controlling false positives pervades statistics and data science. In this paper, we consider a scenario in which we only have access to summary statistics, such as the values of marginal empirical correlations between each dependent variable of potential interest and the response. This situation may arise due to privacy concerns, e.g., to avoid the release of sensitive genetic information. We extend GhostKnockoffs (He et al. [2022]) and introduce variable selection methods based on penalized regression achieving false discovery rate (FDR) control. We report empirical results in extensive simulation studies, demonstrating enhanced performance over previous work. We also apply our methods to genome-wide association studies of Alzheimer's disease, and evidence a significant improvement in power.
Estimating the average treatment effect in social networks is challenging due to individuals influencing each other. One approach to address interference is ego cluster experiments, where each cluster consists of a central individual (ego) and its peers (alters). Clusters are randomized, and only the effects on egos are measured. In this work, we propose an improved framework for ego cluster experiments called ego group partition (EGP), which directly generates two groups and an ego sub-population instead of ego clusters. Under specific model assumptions, we propose two ego group partition algorithms. Compared to the original ego clustering algorithm, our algorithms produce more egos, yield smaller biases, and support parallel computation. The performance of our algorithms is validated through simulation and real-world case studies.
Vascular structure segmentation plays a crucial role in medical analysis and clinical applications. The practical adoption of fully supervised segmentation models is impeded by the intricacy and time-consuming nature of annotating vessels in the 3D space. This has spurred the exploration of weakly-supervised approaches that reduce reliance on expensive segmentation annotations. Despite this, existing weakly supervised methods employed in organ segmentation, which encompass points, bounding boxes, or graffiti, have exhibited suboptimal performance when handling sparse vascular structure. To alleviate this issue, we employ maximum intensity projection (MIP) to decrease the dimensionality of 3D volume to 2D image for efficient annotation, and the 2D labels are utilized to provide guidance and oversight for training 3D vessel segmentation model. Initially, we generate pseudo-labels for 3D blood vessels using the annotations of 2D projections. Subsequently, taking into account the acquisition method of the 2D labels, we introduce a weakly-supervised network that fuses 2D-3D deep features via MIP to further improve segmentation performance. Furthermore, we integrate confidence learning and uncertainty estimation to refine the generated pseudo-labels, followed by fine-tuning the segmentation network. Our method is validated on five datasets (including cerebral vessel, aorta and coronary artery), demonstrating highly competitive performance in segmenting vessels and the potential to significantly reduce the time and effort required for vessel annotation. Our code is available at: //github.com/gzq17/Weakly-Supervised-by-MIP.
Deep neural networks are susceptible to adversarial attacks, which can compromise their performance and accuracy. Adversarial Training (AT) has emerged as a popular approach for protecting neural networks against such attacks. However, a key challenge of AT is robust overfitting, where the network's robust performance on test data deteriorates with further training, thus hindering generalization. Motivated by the concept of active forgetting in the brain, we introduce a novel learning paradigm called "Forget to Mitigate Overfitting (FOMO)". FOMO alternates between the forgetting phase, which randomly forgets a subset of weights and regulates the model's information through weight reinitialization, and the relearning phase, which emphasizes learning generalizable features. Our experiments on benchmark datasets and adversarial attacks show that FOMO alleviates robust overfitting by significantly reducing the gap between the best and last robust test accuracy while improving the state-of-the-art robustness. Furthermore, FOMO provides a better trade-off between standard and robust accuracy, outperforming baseline adversarial methods. Finally, our framework is robust to AutoAttacks and increases generalization in many real-world scenarios.
We study the following model of disease spread in a social network. At first, all individuals are either infected or healthy. Next, in discrete rounds, the disease spreads in the network from infected to healthy individuals such that a healthy individual gets infected if and only if a sufficient number of its direct neighbors are already infected. We represent the social network as a graph. Inspired by the real-world restrictions in the current epidemic, especially by social and physical distancing requirements, we restrict ourselves to networks that can be represented as geometric intersection graphs. We show that finding a minimal vertex set of initially infected individuals to spread the disease in the whole network is computationally hard, already on unit disk graphs. Hence, to provide some algorithmic results, we focus ourselves on simpler geometric graph classes, such as interval graphs and grid graphs.
Goal recognition is a fundamental cognitive process that enables individuals to infer intentions based on available cues. Current goal recognition algorithms often take only observed actions as input, but here we use a Bayesian framework to explore the role of actions, timing, and goal solvability in goal recognition. We analyze human responses to goal-recognition problems in the Sokoban domain, and find that actions are assigned most importance, but that timing and solvability also influence goal recognition in some cases, especially when actions are uninformative. We leverage these findings to develop a goal recognition model that matches human inferences more closely than do existing algorithms. Our work provides new insight into human goal recognition and takes a step towards more human-like AI models.
Recent artificial intelligence (AI) systems have reached milestones in "grand challenges" ranging from Go to protein-folding. The capability to retrieve medical knowledge, reason over it, and answer medical questions comparably to physicians has long been viewed as one such grand challenge. Large language models (LLMs) have catalyzed significant progress in medical question answering; Med-PaLM was the first model to exceed a "passing" score in US Medical Licensing Examination (USMLE) style questions with a score of 67.2% on the MedQA dataset. However, this and other prior work suggested significant room for improvement, especially when models' answers were compared to clinicians' answers. Here we present Med-PaLM 2, which bridges these gaps by leveraging a combination of base LLM improvements (PaLM 2), medical domain finetuning, and prompting strategies including a novel ensemble refinement approach. Med-PaLM 2 scored up to 86.5% on the MedQA dataset, improving upon Med-PaLM by over 19% and setting a new state-of-the-art. We also observed performance approaching or exceeding state-of-the-art across MedMCQA, PubMedQA, and MMLU clinical topics datasets. We performed detailed human evaluations on long-form questions along multiple axes relevant to clinical applications. In pairwise comparative ranking of 1066 consumer medical questions, physicians preferred Med-PaLM 2 answers to those produced by physicians on eight of nine axes pertaining to clinical utility (p < 0.001). We also observed significant improvements compared to Med-PaLM on every evaluation axis (p < 0.001) on newly introduced datasets of 240 long-form "adversarial" questions to probe LLM limitations. While further studies are necessary to validate the efficacy of these models in real-world settings, these results highlight rapid progress towards physician-level performance in medical question answering.
Understanding causality helps to structure interventions to achieve specific goals and enables predictions under interventions. With the growing importance of learning causal relationships, causal discovery tasks have transitioned from using traditional methods to infer potential causal structures from observational data to the field of pattern recognition involved in deep learning. The rapid accumulation of massive data promotes the emergence of causal search methods with brilliant scalability. Existing summaries of causal discovery methods mainly focus on traditional methods based on constraints, scores and FCMs, there is a lack of perfect sorting and elaboration for deep learning-based methods, also lacking some considers and exploration of causal discovery methods from the perspective of variable paradigms. Therefore, we divide the possible causal discovery tasks into three types according to the variable paradigm and give the definitions of the three tasks respectively, define and instantiate the relevant datasets for each task and the final causal model constructed at the same time, then reviews the main existing causal discovery methods for different tasks. Finally, we propose some roadmaps from different perspectives for the current research gaps in the field of causal discovery and point out future research directions.