The rapid development of the Internet of Medical Things (IoMT) creates opportunities for real-time health monitoring using various data types such as electroencephalography (EEG) and electrocardiography (ECG). However, security concerns have significantly impeded the implementation of e-healthcare systems. Three important challenges for privacy-preserving systems need to be addressed: accurate diagnosis, privacy protection without compromising accuracy, and computational efficiency. Guaranteeing prediction accuracy is essential, since disease diagnosis is directly related to health and life. Using a matrix encryption method, we propose a real-time disease diagnosis scheme based on a support vector machine (SVM). A biomedical signal provided by the client is diagnosed such that the server learns neither the signal nor the final diagnosis result, while the scheme also preserves the confidentiality of the SVM classifier and the server's medical data. The proposed scheme incurs no accuracy degradation. Experiments on real-world data demonstrate the efficiency of the proposed scheme: deriving a diagnosis result takes less than one second on a device with 4 GB of RAM, suggesting the feasibility of real-time privacy-preserving health monitoring.
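A minimal numerical sketch of the dot-product-preserving transformation that matrix-encryption schemes of this kind build on is shown below. The key matrix M, the dimension, and the single-party key handling are illustrative assumptions, not the paper's actual protocol; the point is only that a linear SVM's decision value, and hence its accuracy, is unchanged under the transformation.

```python
import numpy as np

rng = np.random.default_rng(0)

# Plaintext linear SVM (w, b) and a client's feature vector x (toy sizes).
d = 8
w, b = rng.normal(size=d), 0.3
x = rng.normal(size=d)

# Random invertible "key" matrix M (hypothetical; real schemes must also
# specify who generates and holds the key, which is omitted here).
M = rng.normal(size=(d, d))

x_enc = M @ x                      # encrypted signal
w_enc = np.linalg.inv(M).T @ w     # correspondingly transformed classifier

# The decision value is preserved: (M^{-1T} w) . (M x) = w . x,
# so classification accuracy does not degrade.
assert np.isclose(w_enc @ x_enc + b, w @ x + b)
print("decision:", np.sign(w_enc @ x_enc + b))
```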
External validation is often recommended to ensure the generalizability of ML models. However, it neither guarantees generalizability nor equates to a model's clinical usefulness (the ultimate goal of any clinical decision-support tool). External validation is also misaligned with current healthcare ML needs for two reasons. First, patient data changes across time, geography, and facilities. These changes create significant volatility in the performance of a single fixed model (especially for deep learning models, which dominate clinical ML). Second, newer ML techniques, current market forces, and updated regulatory frameworks are enabling frequent updating and monitoring of individual deployed model instances. We submit that external validation is insufficient to establish the safety or utility of ML models. Proposals to fix the external validation paradigm do not go far enough, and continued reliance on it as the ultimate test is likely to lead us astray. We propose the MLOps-inspired paradigm of recurring local validation as an alternative that ensures the validity of models while protecting against performance-disruptive data variability. This paradigm relies on site-specific reliability tests before every deployment, followed by regular and recurrent checks throughout the life cycle of the deployed algorithm. These initial and recurrent reliability tests protect against the performance-disruptive distribution shifts and concept drifts that jeopardize patient safety.
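The following sketch illustrates what a recurring local reliability test might look like in practice. The metric, threshold, and cadence are illustrative assumptions; a real monitoring program would also track calibration, subgroup performance, and drift statistics.

```python
from sklearn.metrics import roc_auc_score

def local_reliability_check(model, X_recent, y_recent, min_auroc=0.80):
    """Site-specific check run before deployment and then on a recurring
    schedule (e.g. monthly) against fresh local data.

    `min_auroc` is a hypothetical acceptance threshold chosen per site.
    Returns (passed, observed_auroc).
    """
    auroc = roc_auc_score(y_recent, model.predict_proba(X_recent)[:, 1])
    return auroc >= min_auroc, auroc
```

A failed check would trigger retraining, recalibration, or rollback rather than continued use of the deployed model instance.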
For large-scale cyber-physical systems, the collaboration of spatially distributed sensors is often needed to perform the state estimation process. Privacy concerns naturally arise from disclosing sensitive measurement signals to a cloud estimator that predicts the system state. To address this issue, we propose a differentially private set-based estimation protocol that preserves the privacy of the measurement signals. Compared to existing research, our approach achieves less privacy loss and utility loss by using a numerically optimized truncated noise distribution. The proposed estimator is perturbed by weaker noise than the analytical approaches in the literature while guaranteeing the same level of privacy, thereby improving estimation utility. Numerical experiments and comparisons with truncated Laplace noise are presented to support our approach. Zonotopes, a less conservative form of set representation, are used to represent the estimation sets, giving set operations a computational advantage. The privacy-preserving noise anonymizes the centers of these estimated zonotopes, concealing their precise positions.
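A minimal sketch of the perturbation step is given below: bounded noise is added to a zonotope's center while its generators are left untouched, so the set's position is hidden but its shape remains usable. A plain truncated Laplace distribution (via rejection sampling) stands in for the paper's numerically optimized noise; the scale, bound, and two-dimensional example are assumptions.

```python
import numpy as np

rng = np.random.default_rng(1)

def truncated_laplace(scale, bound, size, rng):
    """Rejection-sample Laplace noise truncated to [-bound, bound].
    Used here only as an illustrative baseline noise distribution."""
    out = np.empty(size)
    i = 0
    while i < size:
        s = rng.laplace(scale=scale, size=size - i)
        s = s[np.abs(s) <= bound]
        out[i:i + len(s)] = s
        i += len(s)
    return out

# Zonotope <center c, generator matrix G>; only the center is perturbed.
c = np.array([2.0, -1.0])
G = np.array([[0.5, 0.1],
              [0.0, 0.4]])
c_private = c + truncated_laplace(scale=0.3, bound=1.0, size=2, rng=rng)
```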
The rapidly growing traffic demands in fiber-optical networks require flexibility and accuracy in configuring lightpaths, for which fast and accurate quality of transmission (QoT) estimation is of pivotal importance. This paper introduces a machine learning (ML)-based QoT estimation approach that meets these requirements. The proposed gradient-boosting ML model uses precomputed per-channel self-channel-interference values as representative and condensed features to estimate the nonlinear interference in a flexible-grid network. With an enhanced Gaussian noise (GN) model simulation as the baseline, the ML model achieves a mean absolute signal-to-noise ratio error of approximately 0.1 dB, an improvement over the GN model. For three different network topologies and network planning approaches of varying complexity, a multi-period network planning study is performed in which ML and GN are compared as path computation elements (PCEs). The results show that the ML PCE matches or slightly improves upon the performance of the GN PCE on all topologies while significantly reducing the computation time of network planning, by up to 70%.
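As a rough sketch of the regression setup, the snippet below trains a gradient-boosting model on condensed per-channel features to predict an SNR-like target. The feature count, synthetic data, and hyperparameters are placeholders, not the paper's configuration.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor

rng = np.random.default_rng(2)

# Placeholder training set: condensed per-channel features (e.g. precomputed
# self-channel-interference values plus lightpath descriptors) and an
# SNR target [dB] that would come from an enhanced GN-model simulation.
X = rng.normal(size=(5000, 6))
y = X @ rng.normal(size=6) + 0.1 * rng.normal(size=5000)

model = GradientBoostingRegressor(n_estimators=300, max_depth=4,
                                  learning_rate=0.05)
model.fit(X[:4000], y[:4000])

mae = np.mean(np.abs(model.predict(X[4000:]) - y[4000:]))
print(f"mean absolute SNR error: {mae:.3f} dB")
```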
The primary challenge in video super-resolution (VSR) is to handle large motions in the input frames, which make it difficult to accurately aggregate information from multiple frames. Existing works either adopt deformable convolutions or estimate optical flow as a prior to establish correspondences between frames for effective alignment and fusion. However, they fail to take into account the valuable semantic information that could greatly enhance alignment and fusion, and flow-based methods rely heavily on the accuracy of a flow estimation model, which may not provide precise flow given two low-resolution frames. In this paper, we investigate a more robust and semantic-aware prior for enhanced VSR by utilizing the Segment Anything Model (SAM), a powerful foundation model that is less susceptible to image degradation. To use the SAM-based prior, we propose a simple yet effective module -- the SAM-guidEd refinEment Module (SEEM) -- which enhances both the alignment and fusion procedures through the use of semantic information. This lightweight plug-in module is specifically designed both to leverage the attention mechanism for the generation of semantic-aware features and to be easily and seamlessly integrated into existing methods. Concretely, we apply SEEM to two representative methods, EDVR and BasicVSR, and obtain consistently improved performance with minimal implementation effort on three widely used VSR datasets: Vimeo-90K, REDS, and Vid4. More importantly, we find that the proposed SEEM can advance existing methods in an efficient tuning manner, providing increased flexibility in adjusting the trade-off between performance and the number of training parameters. Code will be open-sourced soon.
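A minimal sketch of what such a semantic-guided plug-in could look like is shown below: VSR features cross-attend to SAM-derived semantic features and the result is added residually, so the module can be dropped into an existing pipeline. The module name, token layout, and internal design here are assumptions for illustration and may differ from the paper's actual SEEM.

```python
import torch
import torch.nn as nn

class SEEMBlock(nn.Module):
    """Illustrative SAM-guided refinement plug-in (not the paper's exact design).

    Queries come from the VSR backbone features, keys/values from SAM
    semantic features; the refined output is added residually so the block
    integrates into existing methods with minimal changes.
    """
    def __init__(self, dim, heads=4):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.proj = nn.Linear(dim, dim)

    def forward(self, vsr_feat, sam_feat):
        # vsr_feat, sam_feat: (batch, tokens, channels)
        refined, _ = self.attn(vsr_feat, sam_feat, sam_feat)
        return vsr_feat + self.proj(refined)
```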
Single sign-on (SSO) allows users to authenticate to third-party applications through a central identity provider. Despite their wide adoption, deployed SSO systems suffer from privacy problems such as user tracking by the identity provider. While numerous solutions have been proposed in academic papers, none have been adopted because they require modifying identity providers, a significant adoption barrier in practice. Solutions that do get deployed, however, fail to eliminate major privacy issues. Leveraging Trusted Execution Environments (TEEs), we propose MISO, the first privacy-preserving SSO system that is fully compatible with existing identity providers (such as Google and Facebook). This means MISO can be easily integrated into the existing SSO ecosystem today and benefit end users. MISO also enables new functionality that standard SSO cannot offer: it allows users to leverage multiple identity providers in a single SSO workflow, potentially in a threshold fashion, to better protect user accounts. We fully implemented MISO based on Intel SGX. Our evaluation shows that MISO can handle high user concurrency with practical performance.
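One ingredient of TEE-based SSO designs of this kind is deriving per-application pseudonyms inside the enclave, so that neither the identity provider nor any application can link a user's accounts across services. The sketch below shows that derivation only; the key name, derivation scheme, and identifiers are hypothetical and not taken from MISO's specification.

```python
import hmac
import hashlib

# Hypothetical secret sealed inside the TEE; it never leaves the enclave,
# so the linkage between a user's per-app identities stays hidden.
ENCLAVE_KEY = b"sealed-inside-the-tee"

def app_specific_subject(idp_subject: str, app_id: str) -> str:
    """Derive an unlinkable, per-application pseudonym for the user."""
    msg = f"{idp_subject}|{app_id}".encode()
    return hmac.new(ENCLAVE_KEY, msg, hashlib.sha256).hexdigest()

# Same user, different apps -> different pseudonyms, preventing cross-app tracking.
print(app_specific_subject("google-sub-123", "app-A"))
print(app_specific_subject("google-sub-123", "app-B"))
```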
Deep machine learning models, including Convolutional Neural Networks (CNNs), have been successful in detecting Mild Cognitive Impairment (MCI) using medical images, questionnaires, and videos. This paper proposes a novel Multi-branch Classifier-Video Vision Transformer (MC-ViViT) model to distinguish participants with MCI from those with normal cognition by analyzing facial features. The data come from I-CONECT, a behavioral intervention trial aimed at improving cognitive function through frequent video chats. MC-ViViT extracts spatiotemporal features from videos in one branch and augments the representations with the MC module. The I-CONECT dataset is challenging because it is imbalanced, containing Hard-Easy and Positive-Negative samples, which impedes the performance of MC-ViViT. We propose a loss function for Hard-Easy and Positive-Negative samples (HP Loss) that combines Focal loss and AD-CORRE loss to address this imbalance. Our experimental results on the I-CONECT dataset show the great potential of MC-ViViT in predicting MCI, achieving 90.63% accuracy on some of the interview videos.
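The sketch below shows the Focal-loss component of such a combined loss and how it might be weighted against a second term. The AD-CORRE term is not reproduced here (it is passed in as a stand-in value), and the weighting factor `lam` is a hypothetical parameter, so this is only an outline of the combination, not the paper's HP Loss.

```python
import torch

def focal_loss(logits, targets, gamma=2.0, alpha=0.25):
    """Standard binary focal loss; `targets` is a float tensor of 0/1 labels.
    Down-weights easy samples so hard samples dominate the gradient."""
    p = torch.sigmoid(logits)
    pt = targets * p + (1 - targets) * (1 - p)        # prob. of the true class
    w = targets * alpha + (1 - targets) * (1 - alpha)
    return (-w * (1 - pt) ** gamma * torch.log(pt.clamp_min(1e-8))).mean()

def hp_loss(logits, targets, ad_corre_term, lam=1.0):
    """Sketch of a combined loss: focal term plus a correlation-based term.
    `ad_corre_term` stands in for the AD-CORRE component (not shown) and
    `lam` is an assumed weighting factor."""
    return focal_loss(logits, targets) + lam * ad_corre_term
```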
We propose a Bayesian model selection approach that allows medical practitioners to select among predictor variables while taking their respective costs into account. Medical procedures almost always incur costs in time and/or money, and these costs may exceed the usefulness of the resulting predictors for modeling the outcome of interest. We develop a Bayesian model selection approach that uses flexible model priors to penalize costly predictors a priori and select a subset of predictors that are useful relative to their costs. Our approach (i) gives the practitioner control over the magnitude of cost penalization, (ii) enables the prior to scale well with sample size, and (iii) enables the creation of our proposed inclusion-path visualization, which can be used to make decisions about individual candidate predictors using both probabilistic and visual tools. We demonstrate the effectiveness of our inclusion-path approach, and the importance of being able to adjust the magnitude of the prior's cost penalization, on a dataset pertaining to heart disease diagnosis in patients at the Cleveland Clinic Foundation, where several candidate predictors with varying costs were recorded, as well as on simulated data.
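As a minimal sketch of the cost-penalization idea, the snippet below assigns each candidate model a log-prior proportional to the negative total cost of its included predictors, with a tuning parameter controlling the penalty strength. The predictor names, costs, functional form, and parameter `lam` are illustrative assumptions, not the paper's specific prior.

```python
import numpy as np
from itertools import combinations

# Hypothetical per-predictor costs (e.g., dollars or minutes per test).
costs = {"age": 0.0, "ecg": 5.0, "thal_scan": 100.0, "chol": 2.0}

def model_log_prior(included, lam=0.05):
    """Cost-penalized model prior: log p(M) proportional to -lam * cost(M).
    `lam` controls how strongly expensive predictors are penalized a priori;
    lam = 0 recovers a cost-agnostic prior."""
    return -lam * sum(costs[p] for p in included)

# Enumerate candidate subsets and normalize into prior model probabilities.
preds = list(costs)
models = [m for r in range(len(preds) + 1) for m in combinations(preds, r)]
logp = np.array([model_log_prior(m) for m in models])
prior = np.exp(logp - logp.max())
prior /= prior.sum()
```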
High-throughput sequencing (HTS) technologies have revolutionized the field of genomics, enabling rapid and cost-effective genome analysis for various applications. However, the increasing volume of genomic data generated by HTS technologies presents significant challenges for computational techniques to effectively analyze genomes. To address these challenges, several algorithm-architecture co-design works have been proposed, targeting different steps of the genome analysis pipeline. These works explore emerging technologies to provide fast, accurate, and low-power genome analysis. This paper provides a brief review of the recent advancements in accelerating genome analysis, covering the opportunities and challenges associated with the acceleration of the key steps of the genome analysis pipeline. Our analysis highlights the importance of integrating multiple steps of genome analysis using suitable architectures to unlock significant performance improvements and reduce data movement and energy consumption. We conclude by emphasizing the need for novel strategies and techniques to address the growing demands of genomic data generation and analysis.
Federated learning (FL) allows multiple parties to cooperatively learn a federated model without sharing private data with each other. The need to protect such federated models from being plagiarized or misused therefore motivates us to propose a provably secure model ownership verification scheme using zero-knowledge proofs, named FedZKP. The FedZKP scheme, which does not disclose credentials, is guaranteed to defeat a variety of existing and potential attacks. Both theoretical analysis and empirical studies demonstrate the security of FedZKP, in the sense that the probability of attackers breaching the proposed scheme is negligible. Moreover, extensive experimental results confirm the fidelity and robustness of our scheme.
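To make the "prove ownership without disclosing the credential" idea concrete, the toy sketch below runs a Schnorr-style proof of knowledge of a secret value: the verifier is convinced the prover holds the credential, yet the credential itself is never sent. The parameters are deliberately small and insecure, and FedZKP's actual construction differs; only the commit-challenge-respond-verify flow is illustrated.

```python
import secrets

# Toy parameters (insecure, for illustration only).
p = 2305843009213693951            # Mersenne prime 2**61 - 1 as a toy modulus
g = 3

x = secrets.randbelow(p - 1)       # prover's secret credential
y = pow(g, x, p)                   # public value registered in advance

k = secrets.randbelow(p - 1)       # prover: fresh random nonce
r = pow(g, k, p)                   # prover -> verifier: commitment
c = secrets.randbelow(p - 1)       # verifier -> prover: random challenge
s = (k + c * x) % (p - 1)          # prover -> verifier: response

# Verifier accepts iff g^s == r * y^c (mod p); x is never disclosed.
assert pow(g, s, p) == (r * pow(y, c, p)) % p
```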
Data trading has been hindered by privacy concerns associated with user-owned data and by the infinite reproducibility of data, which makes it challenging for data owners to retain exclusive rights over their data once it has been disclosed. Traditional data pricing models have relied on uniform pricing or subscription-based schemes. However, with the development of privacy-preserving computing techniques, the market can now protect privacy and complete transactions using progressively disclosed information, which creates a technical foundation for generating greater social welfare through data usage. In this study, we propose a novel approach to modeling multi-round data trading with progressively disclosed information using a matchmaking-based Markov Decision Process (MDP), and we introduce a Social Welfare-optimized Data Pricing Mechanism (SWDPM) to find optimal pricing strategies. To the best of our knowledge, this is the first study to model multi-round data trading with progressively disclosed information. Numerical experiments demonstrate that, by encouraging better matching of demand and price negotiation among traders, the SWDPM can increase social welfare by up to 54% in trading feasibility, 43% in trading efficiency, and 25% in trading fairness.
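The toy sketch below shows how multi-round trading with progressive disclosure can be cast as an MDP and solved by value iteration: the state is the fraction of information disclosed so far, the action is the price asked this round, and a rejected offer leads to further disclosure in the next round. All dynamics, payoffs, and discretizations here are illustrative stand-ins, not the paper's matchmaking-based MDP or its SWDPM pricing rule.

```python
import numpy as np

disclosure_levels = np.linspace(0.0, 1.0, 6)   # states: fraction disclosed
prices = np.linspace(0.1, 1.0, 10)             # actions: price asked this round
gamma = 0.95                                   # discount factor (assumed)

def accept_prob(d, price):
    # Assumed buyer model: acceptance rises with disclosure, falls with price.
    return np.clip(0.2 + 0.7 * d - 0.5 * price, 0.0, 1.0)

V = np.zeros(len(disclosure_levels))
for _ in range(200):                           # value iteration
    V_new = np.empty_like(V)
    for i, d in enumerate(disclosure_levels):
        nxt = min(i + 1, len(disclosure_levels) - 1)
        q = [accept_prob(d, p) * p
             + gamma * (1 - accept_prob(d, p)) * V[nxt]
             for p in prices]
        V_new[i] = max(q)                      # best price at this state
    V = V_new
```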