Parkinson's disease (PD) is a prevalent neurodegenerative disorder known for its impact on motor neurons, causing symptoms like tremors, stiffness, and gait difficulties. This study explores the potential of vocal feature alterations in PD patients as a means of early disease prediction. This research aims to predict the onset of Parkinson's disease. Utilizing a variety of advanced machine-learning algorithms, including XGBoost, LightGBM, Bagging, AdaBoost, and Support Vector Machine, among others, the study evaluates the predictive performance of these models using metrics such as accuracy, area under the curve (AUC), sensitivity, and specificity. The findings of this comprehensive analysis highlight LightGBM as the most effective model, achieving an impressive accuracy rate of 96% alongside a matching AUC of 96%. LightGBM exhibited a remarkable sensitivity of 100% and specificity of 94.43%, surpassing other machine learning algorithms in accuracy and AUC scores. Given the complexities of Parkinson's disease and its challenges in early diagnosis, this study underscores the significance of leveraging vocal biomarkers coupled with advanced machine-learning techniques for precise and timely PD detection.
Intervertebral disc disease, a prevalent ailment, frequently leads to intermittent or persistent low back pain, and diagnosing and assessing of this disease rely on accurate measurement of vertebral bone and intervertebral disc geometries from lumbar MR images. Deep neural network (DNN) models may assist clinicians with more efficient image segmentation of individual instances (disks and vertebrae) of the lumbar spine in an automated way, which is termed as instance image segmentation. In this work, we proposed SymTC, an innovative lumbar spine MR image segmentation model that combines the strengths of Transformer and Convolutional Neural Network (CNN). Specifically, we designed a parallel dual-path architecture to merge CNN layers and Transformer layers, and we integrated a novel position embedding into the self-attention module of Transformer, enhancing the utilization of positional information for more accurate segmentation. To further improves model performance, we introduced a new data augmentation technique to create synthetic yet realistic MR image dataset, named SSMSpine, which is made publicly available. We evaluated our SymTC and the other 15 existing image segmentation models on our private in-house dataset and the public SSMSpine dataset, using two metrics, Dice Similarity Coefficient and 95% Hausdorff Distance. The results show that our SymTC has the best performance for segmenting vertebral bones and intervertebral discs in lumbar spine MR images. The SymTC code and SSMSpine dataset are available at //github.com/jiasongchen/SymTC.
We propose Compact and Swift Segmenting 3D Gaussians(CoSSegGaussians), a method for compact 3D-consistent scene segmentation at fast rendering speed with only RGB images input. Previous NeRF-based segmentation methods have relied on time-consuming neural scene optimization. While recent 3D Gaussian Splatting has notably improved speed, existing Gaussian-based segmentation methods struggle to produce compact masks, especially in zero-shot segmentation. This issue probably stems from their straightforward assignment of learnable parameters to each Gaussian, resulting in a lack of robustness against cross-view inconsistent 2D machine-generated labels. Our method aims to address this problem by employing Dual Feature Fusion Network as Gaussians' segmentation field. Specifically, we first optimize 3D Gaussians under RGB supervision. After Gaussian Locating, DINO features extracted from images are applied through explicit unprojection, which are further incorporated with spatial features from the efficient point cloud processing network. Feature aggregation is utilized to fuse them in a global-to-local strategy for compact segmentation features. Experimental results show that our model outperforms baselines on both semantic and panoptic zero-shot segmentation task, meanwhile consumes less than 10\% inference time compared to NeRF-based methods. Code and more results will be available at //David-Dou.github.io/CoSSegGaussians.
Wearable technologies enable continuous monitoring of various health metrics, such as physical activity, heart rate, sleep, and stress levels. A key challenge with wearable data is obtaining quality labels. Unlike modalities like video where the videos themselves can be effectively used to label objects or events, wearable data do not contain obvious cues about the physical manifestation of the users and usually require rich metadata. As a result, label noise can become an increasingly thorny issue when labeling such data. In this paper, we propose a novel solution to address noisy label learning, entitled Few-Shot Human-in-the-Loop Refinement (FHLR). Our method initially learns a seed model using weak labels. Next, it fine-tunes the seed model using a handful of expert corrections. Finally, it achieves better generalizability and robustness by merging the seed and fine-tuned models via weighted parameter averaging. We evaluate our approach on four challenging tasks and datasets, and compare it against eight competitive baselines designed to deal with noisy labels. We show that FHLR achieves significantly better performance when learning from noisy labels and achieves state-of-the-art by a large margin, with up to 19% accuracy improvement under symmetric and asymmetric noise. Notably, we find that FHLR is particularly robust to increased label noise, unlike prior works that suffer from severe performance degradation. Our work not only achieves better generalization in high-stakes health sensing benchmarks but also sheds light on how noise affects commonly-used models.
Major depressive disorder (MDD) presents challenges in diagnosis and treatment due to its complex and heterogeneous nature. Emerging evidence indicates that reward processing abnormalities may serve as a behavioral marker for MDD. To measure reward processing, patients perform computer-based behavioral tasks that involve making choices or responding to stimulants that are associated with different outcomes. Reinforcement learning (RL) models are fitted to extract parameters that measure various aspects of reward processing to characterize how patients make decisions in behavioral tasks. Recent findings suggest the inadequacy of characterizing reward learning solely based on a single RL model; instead, there may be a switching of decision-making processes between multiple strategies. An important scientific question is how the dynamics of learning strategies in decision-making affect the reward learning ability of individuals with MDD. Motivated by the probabilistic reward task (PRT) within the EMBARC study, we propose a novel RL-HMM framework for analyzing reward-based decision-making. Our model accommodates learning strategy switching between two distinct approaches under a hidden Markov model (HMM): subjects making decisions based on the RL model or opting for random choices. We account for continuous RL state space and allow time-varying transition probabilities in the HMM. We introduce a computationally efficient EM algorithm for parameter estimation and employ a nonparametric bootstrap for inference. We apply our approach to the EMBARC study to show that MDD patients are less engaged in RL compared to the healthy controls, and engagement is associated with brain activities in the negative affect circuitry during an emotional conflict task.
With the emergence and spread of infectious diseases with pandemic potential, such as COVID- 19, the urgency for vaccine development have led to unprecedented compressed and accelerated schedules that shortened the standard development timeline. In a relatively short time, the leading pharmaceutical companies1, received an Emergency Use Authorization (EUA) for vaccine\prime s en-mass deployment To monitor the potential side effect(s) of the vaccine during the (initial) vaccination campaign, we developed an optimal sequential test that allows for the early detection of potential side effect(s). This test employs a rule to stop the vaccination process once the observed number of side effect incidents exceeds a certain (pre-determined) threshold. The optimality of the proposed sequential test is justified when compared with the ({\alpha}, {\beta}) optimality of the non-randomized fixed-sample Uniformly Most Powerful (UMP) test. In the case of a single side effect, we study the properties of the sequential test and derive the exact expressions of the Average Sample Number (ASN) curve of the stopping time (and its variance) via the regularized incomplete beta function. Additionally, we derive the asymptotic distribution of the relative savings in ASN as compared to maximal sample size. Moreover, we construct the post-test parameter estimate and studied its sampling properties, including its asymptotic behavior under local-type alternatives. These limiting behavior results are the consistency and asymptotic normality of the post-test parameter estimator. We conclude the paper with a small simulation study illustrating the asymptotic performance of the point and interval estimation and provide a detailed example, based on COVID-19 side effect data (see Beatty et al. (2021)) of our suggested testing procedure.
To cope with the growing prevalence of colorectal cancer (CRC), screening programs for polyp detection and removal have proven their usefulness. Colonoscopy is considered the best-performing procedure for CRC screening. To ease the examination, deep learning based methods for automatic polyp detection have been developed for conventional white-light imaging (WLI). Compared with WLI, narrow-band imaging (NBI) can improve polyp classification during colonoscopy but requires special equipment. We propose a CycleGAN-based framework to convert images captured with regular WLI to synthetic NBI (SNBI) as a pre-processing method for improving object detection on WLI when NBI is unavailable. This paper first shows that better results for polyp detection can be achieved on NBI compared to a relatively similar dataset of WLI. Secondly, experimental results demonstrate that our proposed modality translation can achieve improved polyp detection on SNBI images generated from WLI compared to the original WLI. This is because our WLI-to-SNBI translation model can enhance the observation of polyp surface patterns in the generated SNBI images.
Globally, cardiovascular diseases (CVDs) are the leading cause of mortality, accounting for an estimated 17.9 million deaths annually. One critical clinical objective is the early detection of CVDs using electrocardiogram (ECG) data, an area that has received significant attention from the research community. Recent advancements based on machine learning and deep learning have achieved great progress in this domain. However, existing methodologies exhibit inherent limitations, including inappropriate model evaluations and instances of data leakage. In this study, we present a streamlined workflow paradigm for preprocessing ECG signals into consistent 10-second durations, eliminating the need for manual feature extraction/beat detection. We also propose a hybrid model of Long Short-Term Memory (LSTM) with Support Vector Machine (SVM) for fraud detection. This architecture consists of two LSTM layers and an SVM classifier, which achieves a SOTA results with an Average precision score of 0.9402 on the MIT-BIH arrhythmia dataset and 0.9563 on the MIT-BIH atrial fibrillation dataset. Based on the results, we believe our method can significantly benefit the early detection and management of CVDs.
Graph Neural Networks (GNNs) have shown considerable effectiveness in a variety of graph learning tasks, particularly those based on the message-passing approach in recent years. However, their performance is often constrained by a limited receptive field, a challenge that becomes more acute in the presence of sparse graphs. In light of the power series, which possesses infinite expansion capabilities, we propose a novel Graph Power Filter Neural Network (GPFN) that enhances node classification by employing a power series graph filter to augment the receptive field. Concretely, our GPFN designs a new way to build a graph filter with an infinite receptive field based on the convergence power series, which can be analyzed in the spectral and spatial domains. Besides, we theoretically prove that our GPFN is a general framework that can integrate any power series and capture long-range dependencies. Finally, experimental results on three datasets demonstrate the superiority of our GPFN over state-of-the-art baselines.
Diffusion-based Image Editing (DIE) is an emerging research hot-spot, which often applies a semantic mask to control the target area for diffusion-based editing. However, most existing solutions obtain these masks via manual operations or off-line processing, greatly reducing their efficiency. In this paper, we propose a novel and efficient image editing method for Text-to-Image (T2I) diffusion models, termed Instant Diffusion Editing(InstDiffEdit). In particular, InstDiffEdit aims to employ the cross-modal attention ability of existing diffusion models to achieve instant mask guidance during the diffusion steps. To reduce the noise of attention maps and realize the full automatics, we equip InstDiffEdit with a training-free refinement scheme to adaptively aggregate the attention distributions for the automatic yet accurate mask generation. Meanwhile, to supplement the existing evaluations of DIE, we propose a new benchmark called Editing-Mask to examine the mask accuracy and local editing ability of existing methods. To validate InstDiffEdit, we also conduct extensive experiments on ImageNet and Imagen, and compare it with a bunch of the SOTA methods. The experimental results show that InstDiffEdit not only outperforms the SOTA methods in both image quality and editing results, but also has a much faster inference speed, i.e., +5 to +6 times.
Few-shot Knowledge Graph (KG) completion is a focus of current research, where each task aims at querying unseen facts of a relation given its few-shot reference entity pairs. Recent attempts solve this problem by learning static representations of entities and references, ignoring their dynamic properties, i.e., entities may exhibit diverse roles within task relations, and references may make different contributions to queries. This work proposes an adaptive attentional network for few-shot KG completion by learning adaptive entity and reference representations. Specifically, entities are modeled by an adaptive neighbor encoder to discern their task-oriented roles, while references are modeled by an adaptive query-aware aggregator to differentiate their contributions. Through the attention mechanism, both entities and references can capture their fine-grained semantic meanings, and thus render more expressive representations. This will be more predictive for knowledge acquisition in the few-shot scenario. Evaluation in link prediction on two public datasets shows that our approach achieves new state-of-the-art results with different few-shot sizes.