Background: Policy evaluation studies that assess how state-level policies affect health-related outcomes are foundational to health and social policy research. The relative ability of newer analytic methods to address confounding, a key source of bias in observational studies, has not been closely examined. Methods: We conducted a simulation study to examine how differing magnitudes of confounding affected the performance of four methods used for policy evaluations: (1) the two-way fixed effects (TWFE) difference-in-differences (DID) model; (2) a one-period lagged autoregressive (AR) model; (3) the augmented synthetic control method (ASCM); and (4) the doubly robust DID approach with multiple time periods from Callaway-Sant'Anna (CSA). We simulated our data to have staggered policy adoption and multiple confounding scenarios (i.e., varying the magnitude and nature of confounding relationships). Results: Bias increased for each method: (1) as confounding magnitude increased; (2) when confounding was generated with respect to prior outcome trends (rather than levels); and (3) when confounding associations were nonlinear (rather than linear). The AR model and ASCM had notably lower root mean squared error than the TWFE model and CSA approach in all scenarios except nonlinear confounding by prior trends, where CSA excelled. Coverage rates were unreasonably high for ASCM (e.g., 100%), reflecting large model-based standard errors and wide confidence intervals in practice. Conclusions: Our simulation study indicated that no single method consistently outperforms the others. Thus, a researcher's toolkit should include all methodological options. Our simulations and associated R package can help researchers choose the most appropriate approach for their data.
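For intuition about what the DID-based methods above estimate, the canonical two-group, two-period DID contrast can be sketched as follows. This is an illustrative simplification only: it is not the staggered-adoption TWFE, AR, ASCM, or CSA estimators evaluated in the study, and the function names and numbers are hypothetical. With two groups and two periods in a balanced panel, the TWFE coefficient on the treatment indicator reduces to this contrast.

```python
def mean(values):
    """Arithmetic mean of a non-empty list of numbers."""
    return sum(values) / len(values)


def did_2x2(treated_pre, treated_post, control_pre, control_post):
    """Two-group, two-period difference-in-differences estimate.

    Returns the change in the treated group's mean outcome minus the
    change in the control group's mean outcome. Under parallel trends,
    this identifies the average treatment effect on the treated.
    """
    treated_change = mean(treated_post) - mean(treated_pre)
    control_change = mean(control_post) - mean(control_pre)
    return treated_change - control_change


# Hypothetical outcome values for illustration only (not from the study).
estimate = did_2x2(
    treated_pre=[10.0, 12.0],
    treated_post=[15.0, 17.0],
    control_pre=[8.0, 10.0],
    control_post=[10.0, 12.0],
)
print(estimate)
```

Confounding by prior outcome trends, as studied above, corresponds to a violation of the parallel-trends assumption that this simple contrast relies on.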
Background: Studies have shown the potential adverse health effects, ranging from headaches to cardiovascular disease, associated with long-term negative emotions and chronic stress. Since many indicators of stress are imperceptible to observers, early detection of stress and timely intervention remain a pressing medical need. Physiological signals offer a non-invasive method of monitoring emotions and are easily collected by smartwatches. Existing research primarily focuses on developing generalized machine learning-based models for emotion classification. Objective: We aim to study the differences between personalized and generalized machine learning models for three-class emotion classification (neutral, stress, and amusement) using wearable biosignal data. Methods: We developed a convolutional encoder for the three-class emotion classification problem using data from WESAD, a multimodal dataset with physiological signals for 15 subjects. We compared the results of a subject-exclusive generalized model, a subject-inclusive generalized model, and a personalized model. Results: For the three-class classification problem, our personalized model achieved an average accuracy of 95.06% and F1-score of 91.71, our subject-inclusive generalized model achieved an average accuracy of 66.95% and F1-score of 42.50, and our subject-exclusive generalized model achieved an average accuracy of 67.65% and F1-score of 43.05. Conclusions: Our results emphasize the need for increased research in personalized emotion recognition models given that they outperform generalized models in certain contexts. We also demonstrate that personalized machine learning models for emotion classification are viable and can achieve high performance.
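The three evaluation settings compared above can be illustrated with a small data-splitting sketch. The definitions used here are assumptions, since the abstract does not spell them out: the personalized model trains only on the target subject's data, the subject-inclusive generalized model trains on all subjects including the target, and the subject-exclusive generalized model trains only on the other subjects. The function name, the 30% hold-out fraction, and the subject labels are hypothetical.

```python
def split_indices(subject_ids, target, test_fraction=0.3):
    """Build train/test index lists for three evaluation settings.

    subject_ids: one subject label per sample, in recording order.
    target: the subject whose held-out samples form the test set.
    All three settings share the same test set so results are comparable.
    """
    target_idx = [i for i, s in enumerate(subject_ids) if s == target]
    other_idx = [i for i, s in enumerate(subject_ids) if s != target]

    # Hold out the last part of the target subject's samples for testing.
    n_test = max(1, int(len(target_idx) * test_fraction))
    test = target_idx[-n_test:]
    target_train = target_idx[:-n_test]

    return {
        # Assumed: train on the target subject only.
        "personalized": (target_train, test),
        # Assumed: train on every subject, including the target.
        "subject_inclusive": (other_idx + target_train, test),
        # Assumed: train on the other subjects only.
        "subject_exclusive": (other_idx, test),
    }


# Hypothetical labels: four samples each from two subjects.
ids = ["S2"] * 4 + ["S3"] * 4
splits = split_indices(ids, target="S2", test_fraction=0.25)
for name, (train, test) in splits.items():
    print(name, train, test)
```

Keeping the test indices identical across settings means any accuracy difference comes from the training data, not from the evaluation samples.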
Calibration of neural networks is a topical problem that is becoming increasingly important as neural networks underpin more real-world applications. The problem is especially noticeable in modern neural networks, for which there is a significant difference between the confidence of the model and the probability of correct prediction. Various strategies have been proposed to improve calibration, yet accurate calibration remains challenging. We propose a novel framework with two contributions: a new differentiable surrogate for expected calibration error (DECE) that allows calibration quality to be directly optimised, and a meta-learning framework that uses DECE to optimise for validation set calibration with respect to model hyper-parameters. The results show that we achieve competitive performance with existing calibration approaches. Our framework opens up a new avenue and toolset for tackling calibration, which we believe will inspire further work on this important challenge.
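As background for the surrogate described above, the standard binned expected calibration error (ECE) can be sketched as follows. This is the usual non-differentiable metric, not the authors' DECE: the hard bin assignment and the 0/1 correctness indicator are the steps a differentiable surrogate would need to relax. The function name and the choice of ten equal-width bins are assumptions made for illustration.

```python
def expected_calibration_error(confidences, correct, n_bins=10):
    """Binned ECE: sample-weighted mean of |accuracy - confidence| per bin.

    confidences: predicted top-class probabilities in [0, 1].
    correct: 1 if the corresponding prediction was right, else 0.
    """
    n = len(confidences)
    bins = [[] for _ in range(n_bins)]
    for conf, hit in zip(confidences, correct):
        # Hard assignment to an equal-width bin; conf == 1.0 goes to the last bin.
        idx = min(int(conf * n_bins), n_bins - 1)
        bins[idx].append((conf, hit))

    ece = 0.0
    for members in bins:
        if not members:
            continue  # empty bins contribute nothing
        avg_conf = sum(c for c, _ in members) / len(members)
        accuracy = sum(h for _, h in members) / len(members)
        ece += (len(members) / n) * abs(accuracy - avg_conf)
    return ece


# Hypothetical predictions: 90% confident but only 75% accurate.
print(expected_calibration_error([0.9, 0.9, 0.9, 0.9], [1, 1, 1, 0]))
```

In this toy case all samples fall into one bin, so the ECE is simply the gap between mean confidence (0.9) and accuracy (0.75).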
To promote viral marketing, major social platforms (e.g., Facebook Marketplace and Pinduoduo) repeatedly select and invite different users (as seeds) in online social networks to share fresh information about a product or service with their friends. This motivates us to optimize a multi-stage seeding process of viral marketing in social networks, adopting the recent notions of the peak and the average age of information (AoI) to measure the timeliness of promotion information received by network users. Our problem differs from the literature on information diffusion in social networks, which is limited to one-time seeding and overlooks AoI dynamics or information replacement over time. As a critical step, we develop closed-form expressions that characterize and trace AoI dynamics over any social network. For the peak AoI problem, we first prove the NP-hardness of our multi-stage seeding problem by a nontrivial reduction from the dominating set problem, and then present a new polynomial-time algorithm that achieves good approximation guarantees (e.g., less than 2 for linear network topology). For the average AoI problem, we also prove NP-hardness, via a reduction from the set cover problem. Building on our two-sided bound analysis of the average AoI objective, we construct a new framework for approximation analysis and link our problem to a much simplified sum-distance minimization problem. This connection inspires us to develop another polynomial-time algorithm that achieves a good approximation guarantee. Additionally, our theoretical results are well corroborated by experiments on a real social network.
One-class classification (OCC) is a longstanding method for anomaly detection. With the powerful representation capability of pre-trained backbones, OCC methods have witnessed significant performance improvements. Typically, most of these OCC methods employ transfer learning to enhance the discriminative nature of the pre-trained backbone's features, thus achieving remarkable efficacy. While most current approaches emphasize feature transfer strategies, we argue that the optimization objective space within OCC methods could also be an underlying critical factor influencing performance. In this work, we conduct a thorough investigation into the optimization objective of OCC. Through rigorous theoretical analysis and derivation, we unveil a key insight: any space with a suitable norm can serve as an equivalent substitute for the hypersphere center, without relying on the distribution assumption of training samples. Further, we provide guidelines for determining the feasible domain of norms for the OCC optimization objective. This novel insight sparks a simple and data-agnostic deep one-class classification method. Our method is straightforward, with a single 1x1 convolutional layer as a trainable projector and any space with a suitable norm as the optimization objective. Extensive experiments validate the reliability and efficacy of our findings and the corresponding methodology, resulting in state-of-the-art performance in both one-class classification and industrial vision anomaly detection and segmentation tasks.
Deep learning models have achieved state-of-the-art results in estimating brain age, which is an important brain health biomarker, from magnetic resonance (MR) images. However, most of these models only provide a global age prediction and rely on techniques such as saliency maps to interpret their results. These saliency maps highlight regions in the input image that were significant for the model's predictions, but they are hard to interpret, and saliency map values are not directly comparable across different samples. In this work, we reframe the age prediction problem from MR images as an image-to-image regression problem in which we estimate the brain age for each brain voxel in MR images. We compare voxel-wise age prediction models against global age prediction models and their corresponding saliency maps. The results indicate that voxel-wise age prediction models are more interpretable, since they provide spatial information about the brain aging process, and they benefit from being quantitative.
In pace with developments in the research field of artificial intelligence, knowledge graphs (KGs) have attracted a surge of interest from both academia and industry. As a representation of semantic relations between entities, KGs have proven to be particularly relevant for natural language processing (NLP), experiencing a rapid spread and wide adoption within recent years. Given the increasing amount of research work in this area, several KG-related approaches have been surveyed in the NLP research community. However, a comprehensive study that categorizes established topics and reviews the maturity of individual research streams remains absent to this day. Contributing to closing this gap, we systematically analyzed 507 papers from the literature on KGs in NLP. Our survey encompasses a multifaceted review of tasks, research types, and contributions. As a result, we present a structured overview of the research landscape, provide a taxonomy of tasks, summarize our findings, and highlight directions for future work.
Understanding causality helps to structure interventions to achieve specific goals and enables predictions under interventions. With the growing importance of learning causal relationships, causal discovery has moved from traditional methods that infer potential causal structures from observational data toward deep learning-based pattern recognition. The rapid accumulation of massive data has promoted the emergence of highly scalable causal search methods. Existing surveys of causal discovery methods mainly focus on traditional methods based on constraints, scores, and FCMs; they lack a thorough organization and discussion of deep learning-based methods, and they give little consideration to causal discovery from the perspective of variable paradigms. Therefore, we divide the possible causal discovery tasks into three types according to the variable paradigm and define each task, define and instantiate the relevant datasets and the final causal model constructed for each task, and then review the main existing causal discovery methods for the different tasks. Finally, we propose roadmaps from different perspectives to address the current research gaps in the field of causal discovery and point out future research directions.
In contrast to batch learning, where all training data is available at once, continual learning represents a family of methods that accumulate knowledge and learn continuously from data that arrives in sequential order. Because it resembles the human ability to learn, fuse, and accumulate new knowledge arriving at different time steps, continual learning is considered to have high practical significance. Hence, continual learning has been studied in various artificial intelligence tasks. In this paper, we present a comprehensive review of the recent progress of continual learning in computer vision. In particular, the works are grouped by their representative techniques, including regularization, knowledge distillation, memory, generative replay, parameter isolation, and combinations of these techniques. For each category, both its characteristics and its applications in computer vision are presented. At the end of this overview, we discuss several subareas in which continuous knowledge accumulation is potentially helpful but continual learning has not been well studied.
Clinical Named Entity Recognition (CNER) aims to identify and classify clinical terms such as diseases, symptoms, treatments, exams, and body parts in electronic health records, which is a fundamental and crucial task for clinical and translational research. In recent years, deep neural networks have achieved significant success in named entity recognition and many other Natural Language Processing (NLP) tasks. Most of these algorithms are trained end to end and can automatically learn features from large-scale labeled datasets. However, these data-driven methods typically lack the capability to process rare or unseen entities. Previous statistical methods and feature engineering practice have demonstrated that human knowledge can provide valuable information for handling rare and unseen cases. In this paper, we address the problem by incorporating dictionaries into deep neural networks for the Chinese CNER task. We propose two different architectures that extend the Bi-directional Long Short-Term Memory (Bi-LSTM) neural network and five different feature representation schemes to handle the task. Computational results on the CCKS-2017 Task 2 benchmark dataset show that the proposed method achieves highly competitive performance compared with state-of-the-art deep learning methods.
Object detection typically assumes that training and test data are drawn from an identical distribution, which, however, does not always hold in practice. Such a distribution mismatch will lead to a significant performance drop. In this work, we aim to improve the cross-domain robustness of object detection. We tackle the domain shift on two levels: 1) the image-level shift, such as image style, illumination, etc., and 2) the instance-level shift, such as object appearance, size, etc. We build our approach on the recent state-of-the-art Faster R-CNN model, and design two domain adaptation components, on the image level and the instance level, to reduce the domain discrepancy. The two domain adaptation components are based on H-divergence theory, and are implemented by learning a domain classifier in an adversarial training manner. The domain classifiers on different levels are further reinforced with a consistency regularization to learn a domain-invariant region proposal network (RPN) in the Faster R-CNN model. We evaluate our newly proposed approach using multiple datasets including Cityscapes, KITTI, SIM10K, etc. The results demonstrate the effectiveness of our proposed approach for robust object detection in various domain shift scenarios.