Hypertension remains a global health concern with rising prevalence, necessitating effective monitoring and a clear understanding of blood pressure (BP) dynamics. This study examines the wealth of information that BP measurement provides, a crucial source for understanding hypertensive trends. Numerous studies have reported on the relationship between BP variation and various factors. In this research, we leveraged an extensive dataset of 75 million records spanning two decades, offering a unique opportunity to analyze BP variation across demographic features such as age, race, and gender. Our findings revealed that gender-based BP variation was not statistically significant, challenging conventional assumptions. Interestingly, systolic blood pressure (SBP) consistently increased with age, while diastolic blood pressure (DBP) peaked in the forties age group. Moreover, our analysis uncovered striking similarities in the distribution of BP among some of the racial groups. This comprehensive investigation contributes to the ongoing discourse on hypertension and underscores the importance of considering diverse demographic factors when interpreting BP variation. Our results provide insights that may inform personalized healthcare approaches tailored to specific demographic profiles.
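The abstract does not specify the statistical procedure, but a minimal sketch of the kind of demographic comparison described might look as follows (the file name and the columns sbp, dbp, gender, and age are hypothetical placeholders, not the study's actual schema):

```python
import pandas as pd
from scipy import stats

# Hypothetical tabular export; the real dataset schema is not given in the abstract.
df = pd.read_csv("bp_records.csv")  # columns: sbp, dbp, gender, age

# Welch's t-test for a gender difference in systolic BP.
male = df.loc[df["gender"] == "M", "sbp"]
female = df.loc[df["gender"] == "F", "sbp"]
t, p = stats.ttest_ind(male, female, equal_var=False)
print(f"SBP gender difference: t={t:.2f}, p={p:.3f}")

# Age trend: mean SBP/DBP per decade of age.
df["decade"] = (df["age"] // 10) * 10
print(df.groupby("decade")[["sbp", "dbp"]].mean())
```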
In high-stakes systems such as healthcare, it is critical to understand the causal reasons behind unusual events, such as sudden changes in a patient's health. Unveiling these causal reasons enables quick diagnosis and precise treatment planning. In this paper, we propose an automated method for uncovering "if-then" logic rules that explain observational events. We introduce temporal point processes to model the events of interest and discover a set of latent rules that explain their occurrence. To achieve this, we employ an Expectation-Maximization (EM) algorithm. In the E-step, we calculate the likelihood of each event being explained by each discovered rule. In the M-step, we update both the rule set and the model parameters to raise the lower bound of the likelihood function. Notably, we optimize the rule set in a differentiable manner. Our approach achieves accurate performance in both discovering rules and identifying root causes, and we showcase promising results on synthetic and real healthcare datasets.
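As an illustrative sketch of the EM scheme described above (not the authors' implementation; the temporal-point-process likelihood is abstracted behind hypothetical likelihood and fit methods on each rule object):

```python
import numpy as np

# Schematic EM loop for explaining events with latent rules. Illustrative only:
# `rule.likelihood(event)` and `rule.fit(events, weights)` are hypothetical
# placeholders for the paper's point-process likelihood and parameter update.
def em_rule_discovery(events, rules, n_iters=50):
    weights = np.ones(len(rules)) / len(rules)  # mixing weights over rules
    for _ in range(n_iters):
        # E-step: responsibility of each rule for each event,
        # proportional to weight * likelihood under that rule.
        lik = np.array([[r.likelihood(e) for r in rules] for e in events])
        resp = weights * lik
        resp /= resp.sum(axis=1, keepdims=True)

        # M-step: update mixing weights and each rule's parameters
        # to raise the lower bound on the data likelihood.
        weights = resp.mean(axis=0)
        for j, r in enumerate(rules):
            r.fit(events, resp[:, j])  # responsibility-weighted update
    return rules, weights
```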
The prevalence of social media and its escalating impact on mental health have highlighted the need for effective digital wellbeing strategies. Current digital wellbeing interventions have primarily focused on reducing screen time and social media use, often neglecting the potential benefits of these platforms. This paper introduces a new perspective centered on empowering positive social media experiences rather than constraining users with restrictive rules. In line with this perspective, we lay out the key requirements that future work should consider, aiming to spark a dialogue in this emerging area. We further present our initial effort to address these requirements with PauseNow, an innovative digital wellbeing intervention designed to align users' digital behaviors with their intentions. PauseNow leverages digital nudging and intention-aware recommendations to gently guide users back to their original intentions when they "get lost" during digital use, promoting more mindful use of social media.
Advancements in clinical treatment are increasingly constrained by the limitations of supervised learning techniques, which depend heavily on large volumes of annotated data. The annotation process is not only costly but also demands substantial time from clinical specialists. To address this issue, we introduce the S4MI (Self-Supervision and Semi-Supervision for Medical Imaging) pipeline, a novel approach that leverages advances in self-supervised and semi-supervised learning. These techniques rely on auxiliary tasks that require no labels, making machine supervision easier to scale than in fully supervised methods. Our study benchmarks these techniques on three distinct medical imaging datasets to evaluate their effectiveness in classification and segmentation tasks. Notably, self-supervised learning significantly surpassed supervised methods in classification on all evaluated datasets. Remarkably, the semi-supervised approach achieved superior segmentation results, outperforming fully supervised methods while using 50% fewer labels across all datasets. In line with our commitment to the scientific community, we have made the S4MI code openly available, allowing broader application and further development of these methods.
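As a generic illustration of the semi-supervised ingredient (confidence-thresholded pseudo-labeling, a standard technique; the abstract does not detail S4MI's actual components, so this is a sketch under that assumption):

```python
import torch
import torch.nn.functional as F

# Minimal pseudo-labeling loss: supervised term on labeled data plus an
# unsupervised term on confidently pseudo-labeled unlabeled data.
def semi_supervised_loss(model, x_labeled, y, x_unlabeled, conf_thresh=0.95):
    sup = F.cross_entropy(model(x_labeled), y)

    with torch.no_grad():
        probs = F.softmax(model(x_unlabeled), dim=1)
        conf, pseudo = probs.max(dim=1)
        keep = conf >= conf_thresh  # trust only confident predictions

    unsup = torch.tensor(0.0, device=x_labeled.device)
    if keep.any():
        unsup = F.cross_entropy(model(x_unlabeled[keep]), pseudo[keep])
    return sup + unsup
```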
With the increasing importance of remote healthcare monitoring in the healthcare industry, it is essential to evaluate the usefulness and ease of use of the technology in remote healthcare. In this research, we seek to understand the perspective of healthcare professionals, their competence in using technology related to remote healthcare monitoring, and their trust in and adoption of that technology. In addition to these core factors, we introduce sustainability as a pivotal dimension of the Technology Acceptance Model, reflecting its importance in motivating and determining the use of remote healthcare technology. The results suggest that participants view the use of remote monitoring devices for telemedicine positively, but have some concerns about security and privacy and believe that network coverage needs to improve in remote areas. However, advances in technology and a focus on sustainable development can facilitate more effective and widespread adoption of remote monitoring devices in telemedicine.
Electronic health records contain information on patients' status and medical history, including histories of diseases and disorders that may be hereditary. One important use of family history information is in precision health, where the goal is to keep the population healthy with preventative measures. Natural Language Processing (NLP) and machine learning techniques can help extract information that allows health professionals to identify health risks before a condition develops, saving lives and reducing healthcare costs. We survey the literature on techniques from the NLP field that have been developed to utilise digital health records to identify risks of familial diseases. We highlight that rule-based methods are heavily investigated and are still actively used for family history extraction, while more recent efforts have turned to neural models based on large-scale pre-trained language models. In addition to the areas where NLP has been successfully utilised, we identify the areas where more research is needed to unlock the value of patients' records regarding data collection, task formulation and downstream applications.
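As a toy illustration of the rule-based family history extraction that the survey discusses (real systems use much richer lexicons, section detection, and negation handling; the word lists here are hypothetical):

```python
import re

# Toy rule-based extractor for family-history mentions in clinical notes.
RELATIVES = r"(mother|father|sister|brother|grandmother|grandfather|aunt|uncle)"
CONDITIONS = r"(diabetes|hypertension|breast cancer|heart disease|stroke)"
PATTERN = re.compile(rf"\b{RELATIVES}\b.{{0,40}}?\b{CONDITIONS}\b", re.IGNORECASE)

def extract_family_history(note: str):
    """Return (relative, condition) pairs found within a short text window."""
    return [(m.group(1).lower(), m.group(2).lower()) for m in PATTERN.finditer(note)]

print(extract_family_history("Family Hx: mother with hypertension; father had a stroke."))
# [('mother', 'hypertension'), ('father', 'stroke')]
```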
People increasingly rely on online sources for health information because of their convenience and timeliness, traditionally using search engines like Google as the primary search agent. Recently, the emergence of generative Artificial Intelligence (AI) has made Large Language Model (LLM) powered conversational agents such as ChatGPT a viable alternative for health information search. However, while trust is crucial for adopting online health advice, the factors influencing people's trust judgments of health information provided by LLM-powered conversational agents remain unclear. To address this, we conducted a mixed-methods, within-subjects lab study (N=21) to explore how interactions with different agents (ChatGPT vs. Google) across three health search tasks influence participants' trust judgments of the search results as well as of the search agents themselves. Our key findings were that: (a) participants' trust levels in ChatGPT were significantly higher than in Google in the context of health information seeking; (b) there was a significant correlation between trust in health-related information and trust in the search agent, but only for Google; (c) the type of search task did not affect participants' perceived trust; and (d) participants' prior knowledge, the style of information presentation, and the interactive manner of using search agents were key determinants of trust in the health-related information. Our study surfaces differences in trust perceptions between traditional search engines and LLM-powered conversational agents. We highlight the potential role LLMs can play in health-related information seeking, where they excel as stepping stones for further search, and we contribute key factors and considerations for ensuring effective and reliable personal health information seeking in the age of generative AI.
With the rapid growth of artificial intelligence (AI) in healthcare, there has been a significant increase in the generation and storage of sensitive medical data. This abundance of data has, in turn, propelled the advancement of medical AI technologies. However, concerns about unauthorized data exploitation, such as training commercial AI models, often deter researchers from making their invaluable datasets publicly available. One promising way to protect this hard-to-collect data while still encouraging medical institutions to share it is to introduce imperceptible noise into the data, safeguarding it against unauthorized training by degrading model generalization. Although existing methods show commendable data protection capabilities in general domains, they tend to fall short on biomedical data, mainly because they fail to account for the sparse nature of medical images. To address this problem, we propose the Sparsity-Aware Local Masking (SALM) method, a novel approach that selectively perturbs significant pixel regions rather than the entire image, as previous strategies have done. This simple-yet-effective approach significantly reduces the perturbation search space by concentrating on local regions, improving both the efficiency and the effectiveness of data protection for biomedical datasets characterized by sparse features. Moreover, we demonstrate that SALM maintains the essential characteristics of the data, ensuring that its clinical utility remains uncompromised. Extensive experiments across various datasets and model architectures demonstrate that SALM effectively prevents unauthorized training of deep-learning models and outperforms previous state-of-the-art data protection methods.
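As a simplified sketch of the sparsity-aware idea (restricting the perturbation to the most loss-sensitive pixel locations; this is not the paper's exact SALM algorithm, and the eps and frac values are illustrative):

```python
import torch

def sparse_local_perturb(model, images, labels, loss_fn, eps=8 / 255, frac=0.05):
    """Perturb only the top `frac` fraction of pixels by gradient magnitude.
    A simplified sketch of a sparsity-aware masking idea, not the paper's SALM."""
    images = images.clone().requires_grad_(True)
    loss = loss_fn(model(images), labels)
    grad = torch.autograd.grad(loss, images)[0]

    # Per-sample threshold: magnitude of the k-th largest gradient entry.
    flat = grad.abs().flatten(1)
    k = max(1, int(frac * flat.shape[1]))
    thresh = flat.topk(k, dim=1).values[:, -1].view(-1, 1, 1, 1)
    mask = (grad.abs() >= thresh).float()

    # Sign-based perturbation applied only inside the sparse mask.
    return (images + eps * grad.sign() * mask).clamp(0, 1).detach()
```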
Process mining in healthcare presents a range of challenges when working with different types of data within the healthcare domain. The data collected from healthcare processes are highly diverse: operational processes captured by claims data, collections of events recorded during surgery, data related to pre-operative and post-operative care, and high-level data collections based on regular ambulant visits with no apparent events. In this case study, a data set from the last category is analyzed. We apply process-mining techniques to sparse patient heart-failure data and investigate whether an information gain towards several research questions is achievable. The available data are transformed into an event log format, and process discovery and conformance checking are applied. Additionally, patients are split into cohorts based on comorbidities, such as diabetes and chronic kidney disease, and multiple statistics are compared between the cohorts. Finally, we apply decision mining to determine whether a patient will have a cardiovascular outcome and whether a patient will die.
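As a minimal sketch of the event-log transformation and discovery steps (assuming pm4py's simplified interface and a hypothetical visit-level table with patient_id, visit_type, and visit_date columns; the case study's actual schema is not given):

```python
import pandas as pd
import pm4py

# Hypothetical visit-level table; one row per ambulant visit.
visits = pd.read_csv("heart_failure_visits.csv")  # patient_id, visit_type, visit_date

# Reshape into the standard event-log columns that pm4py expects.
log = visits.rename(columns={
    "patient_id": "case:concept:name",
    "visit_type": "concept:name",
    "visit_date": "time:timestamp",
})
log["time:timestamp"] = pd.to_datetime(log["time:timestamp"])
log = pm4py.format_dataframe(log)

# Process discovery, then conformance checking against the discovered model.
net, im, fm = pm4py.discover_petri_net_inductive(log)
fitness = pm4py.fitness_token_based_replay(log, net, im, fm)
print(fitness)
```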
In pace with developments in the research field of artificial intelligence, knowledge graphs (KGs) have attracted a surge of interest from both academia and industry. As a representation of semantic relations between entities, KGs have proven particularly relevant for natural language processing (NLP) and have seen rapid spread and wide adoption in recent years. Given the increasing amount of research in this area, several KG-related approaches have been surveyed in the NLP research community. However, a comprehensive study that categorizes established topics and reviews the maturity of individual research streams remains absent to this day. Contributing to closing this gap, we systematically analyzed 507 papers from the literature on KGs in NLP. Our survey encompasses a multifaceted review of tasks, research types, and contributions. As a result, we present a structured overview of the research landscape, provide a taxonomy of tasks, summarize our findings, and highlight directions for future work.
Small data challenges have emerged in many learning problems, since the success of deep neural networks often relies on the availability of huge amounts of labeled data that are expensive to collect. To address this, much effort has gone into training complex models with small data in unsupervised and semi-supervised fashions. In this paper, we review recent progress in these two major categories of methods. A wide spectrum of small data models is categorized in a big picture, where we show how they interplay with each other to motivate explorations of new ideas. We review the criteria for learning transformation-equivariant, disentangled, self-supervised and semi-supervised representations, which underpin the foundations of recent developments. Many instantiations of unsupervised and semi-supervised generative models have been developed on the basis of these criteria, greatly expanding the territory of existing autoencoders, generative adversarial nets (GANs) and other deep networks by exploiting the distribution of unlabeled data for more powerful representations. While we focus on unsupervised and semi-supervised methods, we also provide a broader review of other emerging topics, from unsupervised and semi-supervised domain adaptation to the fundamental roles of transformation equivariance and invariance in training a wide spectrum of deep networks. It is impossible to write an exhaustive encyclopedia covering all related works; instead, we aim to explore the main ideas, principles and methods in this area to reveal where we are heading on the journey towards addressing the small data challenges in this big data era.