Machine-learning algorithms can facilitate low-cost, user-guided visual diagnostic platforms for addressing disparities in access to sexual health services. We developed a clinical image dataset using original and augmented images for five penile diseases: herpes eruption, syphilitic chancres, penile candidiasis, penile cancer, and genital warts. We used a U-net architecture model for semantic pixel segmentation into background or subject image, the Inception-ResNet version 2 neural architecture to classify each pixel as diseased or non-diseased, and a salience map using GradCAM++. We trained the model on a random 91% sample of the image database using 150 epochs per image, and evaluated the model on the remaining 9% of images, assessing recall (or sensitivity), precision, specificity, and F1-score (accuracy). Of the 239 images in the validation dataset, 45 (18.8%) were of genital warts, 43 (18.0%) were of HSV infection, 29 (12.1%) were of penile cancer, 40 (16.7%) were of penile candidiasis, 37 (15.5%) were of syphilitic chancres, and 45 (18.8%) were of non-diseased penises. The overall accuracy of the model for correctly classifying the diseased image was 0.944. Between July 1st and October 1st 2023, there were 2,640 unique users of the mobile platform. Among a random sample of submissions (n=437), 271 (62.0%) were from the United States, 64 (14.6%) from Singapore, 41 (9.4%) from Candia, 40 (9.2%) from the United Kingdom, and 21 (4.8%) from Vietnam. The majority (n=277 [63.4%]) were between 18 and 30 years old. We report on the development of a machine-learning model for classifying five penile diseases, which demonstrated excellent performance on a validation dataset. That model is currently in use globally and has the potential to improve access to diagnostic services for penile diseases.
We introduce a hybrid method that integrates deep learning with model-analog forecasting, a straightforward yet effective approach that generates forecasts from similar initial climate states in a repository of model simulations. This hybrid framework employs a convolutional neural network to estimate state-dependent weights to identify analog states. The advantage of our method lies in its physical interpretability, offering insights into initial-error-sensitive regions through estimated weights and the ability to trace the physically-based temporal evolution of the system through analog forecasting. We evaluate our approach using the Community Earth System Model Version 2 Large Ensemble to forecast the El Ni\~no-Southern Oscillation (ENSO) on a seasonal-to-annual time scale. Results show a 10% improvement in forecasting sea surface temperature anomalies over the equatorial Pacific at 9-12 months leads compared to the traditional model-analog technique. Furthermore, our hybrid model demonstrates improvements in boreal winter and spring initialization when evaluated against a reanalysis dataset. Our deep learning-based approach reveals state-dependent sensitivity linked to various seasonally varying physical processes, including the Pacific Meridional Modes, equatorial recharge oscillator, and stochastic wind forcing. Notably, disparities emerge in the sensitivity associated with El Ni\~no and La Ni\~na events. We find that sea surface temperature over the tropical Pacific plays a more crucial role in El Ni\~no forecasting, while zonal wind stress over the same region exhibits greater significance in La Ni\~na prediction. This approach has broad implications for forecasting diverse climate phenomena, including regional temperature and precipitation, which are challenging for the traditional model-analog forecasting method.
Graph neural networks are becoming increasingly popular in the field of machine learning due to their unique ability to process data structured in graphs. They have also been applied in safety-critical environments where perturbations inherently occur. However, these perturbations require us to formally verify neural networks before their deployment in safety-critical environments as neural networks are prone to adversarial attacks. While there exists research on the formal verification of neural networks, there is no work verifying the robustness of generic graph convolutional network architectures with uncertainty in the node features and in the graph structure over multiple message-passing steps. This work addresses this research gap by explicitly preserving the non-convex dependencies of all elements in the underlying computations through reachability analysis with (matrix) polynomial zonotopes. We demonstrate our approach on three popular benchmark datasets.
Intelligent machine learning approaches are finding active use for event detection and identification that allow real-time situational awareness. Yet, such machine learning algorithms have been shown to be susceptible to adversarial attacks on the incoming telemetry data. This paper considers a physics-based modal decomposition method to extract features for event classification and focuses on interpretable classifiers including logistic regression and gradient boosting to distinguish two types of events: load loss and generation loss. The resulting classifiers are then tested against an adversarial algorithm to evaluate their robustness. The adversarial attack is tested in two settings: the white box setting, wherein the attacker knows exactly the classification model; and the gray box setting, wherein the attacker has access to historical data from the same network as was used to train the classifier, but does not know the classification model. Thorough experiments on the synthetic South Carolina 500-bus system highlight that a relatively simpler model such as logistic regression is more susceptible to adversarial attacks than gradient boosting.
The search for a general model that can operate seamlessly across multiple domains remains a key goal in machine learning research. The prevailing methodology in Reinforcement Learning (RL) typically limits models to a single task within a unimodal framework, a limitation that contrasts with the broader vision of a versatile, multi-domain model. In this paper, we present Jack of All Trades (JAT), a transformer-based model with a unique design optimized for handling sequential decision-making tasks and multimodal data types. The JAT model demonstrates its robust capabilities and versatility by achieving strong performance on very different RL benchmarks, along with promising results on Computer Vision (CV) and Natural Language Processing (NLP) tasks, all using a single set of weights. The JAT model marks a significant step towards more general, cross-domain AI model design, and notably, it is the first model of its kind to be fully open-sourced (see //huggingface.co/jat-project/jat), including a pioneering general-purpose dataset.
Interpreting neural network classifiers using gradient-based saliency maps has been extensively studied in the deep learning literature. While the existing algorithms manage to achieve satisfactory performance in application to standard image recognition datasets, recent works demonstrate the vulnerability of widely-used gradient-based interpretation schemes to norm-bounded perturbations adversarially designed for every individual input sample. However, such adversarial perturbations are commonly designed using the knowledge of an input sample, and hence perform sub-optimally in application to an unknown or constantly changing data point. In this paper, we show the existence of a Universal Perturbation for Interpretation (UPI) for standard image datasets, which can alter a gradient-based feature map of neural networks over a significant fraction of test samples. To design such a UPI, we propose a gradient-based optimization method as well as a principal component analysis (PCA)-based approach to compute a UPI which can effectively alter a neural network's gradient-based interpretation on different samples. We support the proposed UPI approaches by presenting several numerical results of their successful applications to standard image datasets.
This paper presents a distributed model predictive control (DMPC) algorithm for a heterogeneous platoon using arbitrary communication topologies, as long as each vehicle is able to communicate with a preceding vehicle in the platoon. The proposed DMPC algorithm is able to accommodate any spacing policy that is affine in a vehicle's velocity, which includes constant distance or constant time headway spacing policies. By analyzing the total cost for the entire platoon, a sufficient condition is derived to guarantee platoon asymptotic stability. Simulation experiments with a platoon of 50 vehicles and hardware experiments with a platoon of four 1/10th scale vehicles validate the algorithm and compare performance under different spacing policies and communication topologies.
Smart recommendation algorithms have revolutionized information dissemination, enhancing efficiency and reshaping content delivery across various domains. However, concerns about user agency have arisen due to the inherent opacity (information asymmetry) and the nature of one-way output (power asymmetry) on algorithms. While both issues have been criticized by scholars via advocating explainable AI (XAI) and human-AI collaborative decision-making (HACD), few research evaluates their integrated effects on users, and few HACD discussions in recommender systems beyond improving and filtering the results. This study proposes an incubating idea as a missing step in HACD that allows users to control the degrees of AI-recommended content. Then, we integrate it with existing XAI to a flow prototype aimed at assessing the enhancement of user agency. We seek to understand how types of agency impact user perception and experience, and bring empirical evidence to refine the guidelines and designs for human-AI interactive systems.
We introduce the optimized dynamic mode decomposition algorithm for constructing an adaptive and computationally efficient reduced order model and forecasting tool for global atmospheric chemistry dynamics. By exploiting a low-dimensional set of global spatio-temporal modes, interpretable characterizations of the underlying spatial and temporal scales can be computed. Forecasting is also achieved with a linear model that uses a linear superposition of the dominant spatio-temporal features. The DMD method is demonstrated on three months of global chemistry dynamics data, showing its significant performance in computational speed and interpretability. We show that the presented decomposition method successfully extracts known major features of atmospheric chemistry, such as summertime surface pollution and biomass burning activities. Moreover, the DMD algorithm allows for rapid reconstruction of the underlying linear model, which can then easily accommodate non-stationary data and changes in the dynamics.
The fusion of causal models with deep learning introducing increasingly intricate data sets, such as the causal associations within images or between textual components, has surfaced as a focal research area. Nonetheless, the broadening of original causal concepts and theories to such complex, non-statistical data has been met with serious challenges. In response, our study proposes redefinitions of causal data into three distinct categories from the standpoint of causal structure and representation: definite data, semi-definite data, and indefinite data. Definite data chiefly pertains to statistical data used in conventional causal scenarios, while semi-definite data refers to a spectrum of data formats germane to deep learning, including time-series, images, text, and others. Indefinite data is an emergent research sphere inferred from the progression of data forms by us. To comprehensively present these three data paradigms, we elaborate on their formal definitions, differences manifested in datasets, resolution pathways, and development of research. We summarize key tasks and achievements pertaining to definite and semi-definite data from myriad research undertakings, present a roadmap for indefinite data, beginning with its current research conundrums. Lastly, we classify and scrutinize the key datasets presently utilized within these three paradigms.
We introduce a multi-task setup of identifying and classifying entities, relations, and coreference clusters in scientific articles. We create SciERC, a dataset that includes annotations for all three tasks and develop a unified framework called Scientific Information Extractor (SciIE) for with shared span representations. The multi-task setup reduces cascading errors between tasks and leverages cross-sentence relations through coreference links. Experiments show that our multi-task model outperforms previous models in scientific information extraction without using any domain-specific features. We further show that the framework supports construction of a scientific knowledge graph, which we use to analyze information in scientific literature.