In the last decade, the United States has lost more than 500,000 people to overdoses involving prescription and illicit opioids (https://www.cdc.gov/drugoverdose/epidemic/index.html), making it a national public health emergency (USDHHS, 2017). To prevent unintentional opioid overdoses more effectively, medical practitioners require robust and timely tools that can reliably identify at-risk patients. Community-based social media platforms such as Reddit allow users to self-disclose and discuss otherwise sensitive drug-related behaviors, which often act as indicators of opioid use disorder. To this end, we present a moderate-size corpus of 2,500 opioid-related posts from various subreddits spanning six phases of opioid use: Medical Use, Misuse, Addiction, Recovery, Relapse, and Not Using. For every post, we annotate span-level extractive explanations and study their role in both annotation quality and model development. We evaluate several state-of-the-art models in supervised, few-shot, and zero-shot settings. Experimental results and error analysis show that identifying the phases of opioid use disorder is highly contextual and challenging. However, we find that using explanations during modeling leads to a significant boost in classification accuracy, demonstrating their value in a high-stakes domain such as studying the opioid use disorder continuum. The dataset will be made available for research on GitHub with the formal version of the paper.
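As a rough illustration of how span-level explanations might be used during modeling, the sketch below concatenates each annotated explanation span to its post before training a simple classifier. The posts, spans, labels, and the [EXPLANATION] marker are illustrative assumptions, not material from the corpus described above.

```python
# Hypothetical sketch: augmenting a post classifier with span-level explanations.
# The corpus, labels, and explanation spans below are illustrative placeholders.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

posts = [
    "My doctor prescribed oxycodone after surgery and I take it as directed.",
    "I've been taking extra pills when the pain isn't even that bad.",
]
explanations = [
    "prescribed oxycodone after surgery",  # annotated span supporting 'Medical Use'
    "taking extra pills",                  # annotated span supporting 'Misuse'
]
labels = ["Medical Use", "Misuse"]

# One simple way to use explanations during modeling: concatenate the annotated
# span to the post so the classifier sees the rationale explicitly.
augmented = [p + " [EXPLANATION] " + e for p, e in zip(posts, explanations)]

clf = make_pipeline(TfidfVectorizer(), LogisticRegression(max_iter=1000))
clf.fit(augmented, labels)
print(clf.predict(["I keep needing more pills than prescribed [EXPLANATION] needing more pills"]))
```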
Compared to the generations up to 4G, whose main focus was on broadband and coverage aspects, 5G has expanded the scope of wireless cellular systems towards embracing two new types of connectivity: massive machine-type communication (mMTC) and ultra-reliable low-latency communications (URLLC). This paper discusses the possible evolution of these two types of connectivity under the umbrella of 6G wireless systems. The paper consists of three parts. The first part deals with connectivity for a massive number of devices. While mMTC research in 5G focused predominantly on the problem of uncoordinated uplink access for a large number of devices, the traffic patterns in 6G may become more symmetric, leading to closed-loop massive connectivity. One of the drivers for this is distributed learning/inference. The second part of the paper discusses the evolution of wireless connectivity for critical services. While latency and reliability are tightly coupled in 5G, 6G will support a variety of safety-critical control applications with different types of timing requirements, as evidenced by the emergence of metrics related to information freshness and information value. Additionally, ensuring ultra-high reliability for safety-critical control applications requires modeling and estimation of the tail statistics of the wireless channel, queue length, and delay. Fulfilling these stringent requirements calls for the development of novel AI-based techniques that incorporate optimization theory, explainable AI, generative AI, and digital twins. The third part analyzes the coexistence of massive connectivity and critical services. We consider scenarios in which a massive number of devices must support traffic patterns of mixed criticality, and then discuss the management of wireless resources shared by services of different criticality.
Graph Neural Networks (GNNs) have in recent years achieved great success in Knowledge Graph Completion (KGC) by modelling how entities and relations interact. However, explaining the predicted facts has not received the attention it deserves. Proper explanations for the results of GNN-based KGC models increase model transparency and help researchers develop more reliable models. Existing practices for explaining KGC tasks rely on instance/subgraph-based approaches, while in some scenarios, paths can provide more user-friendly and interpretable explanations. Nonetheless, methods for generating path-based explanations for KGs remain under-explored. To address this gap, we propose Power-Link, the first path-based KGC explainer that explains GNN-based models. We design a novel simplified graph-powering technique, which enables the generation of path-based explanations with a fully parallelisable and memory-efficient training scheme. We further introduce three new metrics for quantitative evaluation of the explanations, together with a qualitative human evaluation. Extensive experiments demonstrate that Power-Link outperforms SOTA baselines in interpretability, efficiency, and scalability.
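The following is a minimal sketch of the general idea of path-based explanation, not the Power-Link algorithm itself: candidate paths between the head and tail entities are ranked by per-edge importance scores, standing in for the scores a trained GNN explainer would produce. The toy graph, entities, and scores are assumptions.

```python
# Illustrative sketch (not the paper's method): rank candidate paths between a
# head and a tail entity by a simple edge-importance score and return the top
# path as an explanation. Graph, relations, and scores are made up.
import networkx as nx

G = nx.DiGraph()
G.add_edges_from([
    ("Einstein", "Ulm", {"rel": "born_in", "importance": 0.9}),
    ("Ulm", "Germany", {"rel": "located_in", "importance": 0.8}),
    ("Einstein", "Physics", {"rel": "field", "importance": 0.4}),
    ("Physics", "Germany", {"rel": "studied_in", "importance": 0.2}),
])

def path_score(path):
    # Score a path by the product of per-edge importance (a stand-in for the
    # importance weights a trained GNN explainer would assign).
    score = 1.0
    for u, v in zip(path, path[1:]):
        score *= G[u][v]["importance"]
    return score

candidates = nx.all_simple_paths(G, "Einstein", "Germany", cutoff=3)
best = max(candidates, key=path_score)
print("Explanation path:", best)
```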
As the horizon of intelligent transportation expands with the evolution of Automated Driving Systems (ADS), ensuring safety becomes more imperative than ever. Traditional risk assessment methodologies, primarily crafted for human-driven vehicles, struggle to adapt to the multifaceted, evolving environments of ADS. This paper introduces a framework for real-time Dynamic Risk Assessment (DRA) in ADS that harnesses Artificial Neural Networks (ANNs). Our proposed solution overcomes these limitations by drawing upon ANNs, a cornerstone of deep learning, to analyze and categorize risk dimensions using real-time On-board Sensor (OBS) data. This learning-centric approach not only elevates the ADS's situational awareness but also enriches its understanding of immediate operational contexts. By dissecting OBS data, the system can pinpoint its current risk profile, thereby enhancing safety for onboard passengers and the broader traffic ecosystem. Through this framework, we chart a new direction in risk assessment that bridges the gaps left by conventional methods and enhances the proficiency of ADS. By utilizing ANNs, our methodology allows ADS to adeptly identify and react to potential risk factors, ensuring safer and more informed autonomous journeys.
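A minimal sketch of the kind of ANN-based risk classifier described above, assuming a fixed-length feature vector derived from OBS data and a small set of discrete risk levels; the architecture, feature count, and risk categories are illustrative, not from the paper.

```python
# Minimal sketch: a small feed-forward network that maps an on-board-sensor
# feature vector to a discrete risk level. All sizes are assumptions.
import torch
import torch.nn as nn

NUM_FEATURES = 8      # e.g., speed, headway, yaw rate, ... (hypothetical)
NUM_RISK_LEVELS = 3   # e.g., low / medium / high (hypothetical)

class RiskClassifier(nn.Module):
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(NUM_FEATURES, 32), nn.ReLU(),
            nn.Linear(32, 32), nn.ReLU(),
            nn.Linear(32, NUM_RISK_LEVELS),
        )

    def forward(self, x):
        return self.net(x)

model = RiskClassifier()
obs_batch = torch.randn(4, NUM_FEATURES)   # placeholder sensor readings
risk_logits = model(obs_batch)
print(risk_logits.argmax(dim=1))           # predicted risk level per sample
```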
Although Multi-modal Large Language Models (MM-LLMs) have made exciting strides recently, they still struggle to efficiently model the interactions among multi-modal inputs and the generation of non-textual modalities. In this work, we propose TEAL (Tokenize and Embed ALl), an approach that treats the input from any modality as a token sequence and learns a joint embedding space for all modalities. Specifically, for the input from any modality, TEAL first discretizes it into a token sequence with an off-the-shelf tokenizer and embeds the token sequence into a joint embedding space with a learnable embedding matrix. MM-LLMs then only need to predict the multi-modal tokens autoregressively, as textual LLMs do. Finally, the corresponding de-tokenizer is applied to generate the output in each modality from the predicted token sequence. With the joint embedding space, TEAL enables frozen LLMs to perform both understanding and generation tasks involving non-textual modalities, such as image and audio. Thus, the textual LLM can simply act as an interface while maintaining its high performance in textual understanding and generation. Experiments show that TEAL achieves substantial improvements in multi-modal understanding and implements a simple scheme for multi-modal generation.
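The sketch below illustrates the tokenize-and-embed-all idea in the abstract above: tokens from each modality are shifted into disjoint id ranges and mapped through one shared, learnable embedding matrix before being fed to an autoregressive model. The vocabulary sizes and embedding width are assumptions for the demo, not TEAL's actual configuration.

```python
# Illustrative sketch: modality-specific token ids are offset into disjoint
# ranges and share one learnable embedding matrix (a joint embedding space).
import torch
import torch.nn as nn

TEXT_VOCAB, IMAGE_VOCAB, AUDIO_VOCAB = 1000, 512, 256
offsets = {"text": 0, "image": TEXT_VOCAB, "audio": TEXT_VOCAB + IMAGE_VOCAB}
joint_vocab = TEXT_VOCAB + IMAGE_VOCAB + AUDIO_VOCAB

embed = nn.Embedding(joint_vocab, 64)   # joint embedding space for all modalities

def to_joint_ids(token_ids, modality):
    # Shift modality-specific token ids into the shared id space.
    return token_ids + offsets[modality]

text_ids = torch.randint(0, TEXT_VOCAB, (1, 5))    # from a text tokenizer
image_ids = torch.randint(0, IMAGE_VOCAB, (1, 4))  # from an off-the-shelf image tokenizer

sequence = torch.cat([to_joint_ids(text_ids, "text"),
                      to_joint_ids(image_ids, "image")], dim=1)
hidden = embed(sequence)   # shape (1, 9, 64): ready for an autoregressive LLM
print(hidden.shape)
```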
Benefitting from the vast spatial degrees of freedom, the amalgamation of integrated sensing and communication (ISAC) and massive multiple-input multiple-output (MIMO) is expected to simultaneously improve spectral and energy efficiencies as well as sensing capability. However, the large number of antennas deployed in massive MIMO-ISAC raises critical challenges in acquiring both accurate channel state information and target parameter information. To overcome these two challenges within a unified framework, we first analyze their underlying system models and then propose a novel tensor-based approach that addresses both the channel estimation and target sensing problems. Specifically, by parameterizing the high-dimensional communication channel with a small number of physical parameters, we associate the channel state information with the sensing parameters of targets in the angular, delay, and Doppler dimensions. Then, we propose a shared training pattern adopting the same time-frequency resources, such that both channel estimation and target parameter estimation can be formulated as canonical polyadic decomposition problems with a similar mathematical expression. On this basis, we first investigate the uniqueness condition of the tensor factorization and the maximum number of resolvable targets by utilizing the specific Vandermonde structure of the factor matrices.
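As a toy illustration of the canonical polyadic (CP) decomposition at the core of this approach, the sketch below builds a small angle-delay-Doppler tensor from known factors and recovers them with an off-the-shelf CP solver; the dimensions, rank, and data are placeholders rather than the paper's setup.

```python
# Sketch of the CP decomposition underlying the joint channel/target estimation
# idea; dimensions, rank, and data are placeholders.
import numpy as np
import tensorly as tl
from tensorly.decomposition import parafac

tl.set_backend("numpy")

# Toy received-signal tensor: (angle bins) x (delay bins) x (Doppler bins).
rank = 3                        # number of propagation paths / targets (assumed)
A = np.random.randn(16, rank)   # angular factor
B = np.random.randn(32, rank)   # delay factor
C = np.random.randn(8, rank)    # Doppler factor
tensor = tl.cp_to_tensor((np.ones(rank), [A, B, C]))

# Factorizing the measurement tensor recovers per-path factors, from which
# angles, delays, and Doppler shifts could then be extracted.
weights, factors = parafac(tensor, rank=rank)
print([f.shape for f in factors])
```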
Diffusion Magnetic Resonance Imaging (dMRI) plays a crucial role in the noninvasive investigation of tissue microstructural properties and structural connectivity in the \textit{in vivo} human brain. However, effectively capturing the intricate characteristics of water diffusion across directions and scales requires comprehensive q-space sampling. Unfortunately, this requirement leads to long scan times, limiting the clinical applicability of dMRI. To address this challenge, we propose SSOR, a Simultaneous q-Space sampling Optimization and Reconstruction framework. We jointly optimize a subset of q-space samples using a continuous representation of spherical harmonic functions and a reconstruction network. Additionally, we exploit the unique properties of dMRI in both the q-space and image domains by applying $\ell_1$-norm and total-variation regularization. Experiments conducted on HCP data demonstrate that SSOR is promising both quantitatively and qualitatively and exhibits robustness to noise.
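A minimal sketch of the regularized objective described above: a data-fidelity term plus an $\ell_1$ penalty on q-space (e.g., spherical-harmonic) coefficients and total-variation regularization in the image domain. Tensor shapes and regularization weights are illustrative assumptions, not the paper's values.

```python
# Minimal sketch of a reconstruction loss with l1 (q-space) and total-variation
# (image-domain) regularization; shapes and weights are assumptions.
import torch

def total_variation(img):
    # Anisotropic TV over the last two (spatial) dimensions.
    dh = (img[..., 1:, :] - img[..., :-1, :]).abs().sum()
    dw = (img[..., :, 1:] - img[..., :, :-1]).abs().sum()
    return dh + dw

def reconstruction_loss(pred, target, q_coeffs, lam_l1=1e-3, lam_tv=1e-4):
    fidelity = torch.mean((pred - target) ** 2)     # data-fidelity term
    sparsity = q_coeffs.abs().mean()                # l1 on q-space coefficients
    smoothness = total_variation(pred)              # TV in the image domain
    return fidelity + lam_l1 * sparsity + lam_tv * smoothness

pred = torch.rand(1, 1, 64, 64, requires_grad=True)
target = torch.rand(1, 1, 64, 64)
q_coeffs = torch.rand(1, 45)   # e.g., spherical-harmonic coefficients (placeholder)
loss = reconstruction_loss(pred, target, q_coeffs)
loss.backward()
print(loss.item())
```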
The new era of technology has brought us to the point where it is convenient for people to share their opinions across an abundance of platforms. These platforms allow users to express themselves through multiple forms of representation, including text, images, videos, and audio. This, however, makes it difficult for users to obtain all the key information about a topic, making automatic multi-modal summarization (MMS) essential. In this paper, we present a comprehensive survey of existing research in the area of MMS.
Deployment of Internet of Things (IoT) devices and Data Fusion techniques have gained popularity in public and government domains. This usually requires capturing and consolidating data from multiple sources. Because datasets do not necessarily originate from identical sensors, fused data typically results in a complex data problem. Since the military is investigating how heterogeneous IoT devices can aid its processes and tasks, we investigate a multi-sensor approach. Moreover, we propose a signal-to-image encoding approach that transforms and fuses signals from IoT wearable devices into an image that is invertible and easier to visualize, supporting decision making. Furthermore, we investigate the challenge of enabling intelligent identification and detection operations and demonstrate the feasibility of the proposed Deep Learning and Anomaly Detection models, which can support future applications that utilize hand gesture data from wearable devices.
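One simple way to realize an invertible signal-to-image encoding, sketched below under the assumption of a 1-D wearable-sensor signal: the signal is min-max scaled and reshaped into a 2-D image, and the stored scale and length allow exact inversion. This illustrates the general idea only, not the paper's specific encoding.

```python
# Sketch of an invertible signal-to-image encoding; all values are synthetic.
import numpy as np

def signal_to_image(signal, side):
    padded = np.zeros(side * side)
    padded[: signal.size] = signal
    lo, hi = padded.min(), padded.max()
    image = (padded - lo) / (hi - lo + 1e-12)   # scale to [0, 1] for display
    return image.reshape(side, side), (lo, hi, signal.size)

def image_to_signal(image, meta):
    lo, hi, n = meta
    flat = image.reshape(-1) * (hi - lo + 1e-12) + lo
    return flat[:n]

signal = np.sin(np.linspace(0, 8 * np.pi, 200))   # placeholder accelerometer trace
image, meta = signal_to_image(signal, side=16)     # 16x16 "sensor image"
recovered = image_to_signal(image, meta)
print(np.allclose(recovered, signal))              # True: the encoding is invertible
```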
Defensive deception is a promising approach for cyberdefense. Although defensive deception is increasingly popular in the research community, there has not been a systematic investigation of its key components, the underlying principles, and its tradeoffs in various problem settings. This survey paper focuses on defensive deception research centered on game theory and machine learning, since these are prominent families of artificial intelligence approaches that are widely employed in defensive deception. This paper brings forth insights, lessons, and limitations from prior work. It closes with an outline of some research directions to tackle major gaps in current defensive deception research.
Deep neural networks (DNNs) are successful in many computer vision tasks. However, the most accurate DNNs require millions of parameters and operations, making them energy, computation, and memory intensive. This impedes the deployment of large DNNs on low-power devices with limited compute resources. Recent research improves DNN models by reducing the memory requirement, energy consumption, and number of operations without significantly decreasing accuracy. This paper surveys the progress of low-power deep learning and computer vision, specifically with regard to inference, and discusses methods for compacting and accelerating DNN models. The techniques can be divided into four major categories: (1) parameter quantization and pruning, (2) compressed convolutional filters and matrix factorization, (3) network architecture search, and (4) knowledge distillation. We analyze the accuracy, advantages, disadvantages, and potential solutions to the problems with the techniques in each category. We also discuss new evaluation metrics as a guideline for future research.
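To make category (1) above concrete, the sketch below applies magnitude-based weight pruning and dynamic int8 quantization to a small placeholder model using standard PyTorch utilities; the model and pruning ratio are arbitrary examples, not drawn from the surveyed works.

```python
# Hedged sketch of parameter pruning and quantization on a toy model.
import torch
import torch.nn as nn
import torch.nn.utils.prune as prune

model = nn.Sequential(nn.Linear(128, 64), nn.ReLU(), nn.Linear(64, 10))

# Parameter pruning: zero out the 50% of weights with the smallest magnitude.
prune.l1_unstructured(model[0], name="weight", amount=0.5)
prune.remove(model[0], "weight")   # make the pruning permanent
sparsity = (model[0].weight == 0).float().mean().item()
print(f"Layer-0 sparsity after pruning: {sparsity:.2f}")

# Parameter quantization: convert Linear layers to int8 for faster CPU inference.
quantized = torch.quantization.quantize_dynamic(model, {nn.Linear}, dtype=torch.qint8)
print(quantized(torch.randn(1, 128)).shape)
```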