We consider the localization of a mobile millimeter-wave client in a large indoor environment using multilayer perceptron neural networks (NNs). Instead of training and deploying a single deep model, we proceed by choosing among multiple tiny NNs trained in a self-supervised manner. The main challenge then becomes to determine and switch to the best NN among the available ones, as an incorrect NN will fail to localize the client. In order to upkeep the localization accuracy, we propose two switching schemes: one based on a Kalman filter, and one based on the statistical distribution of the training data. We analyze the proposed schemes via simulations, showing that our approach outperforms both geometric localization schemes and the use of a single NN.
Consumer Internet of Things (IoT) devices often leverage the local network to communicate with the corresponding companion app or other devices. This has benefits in terms of efficiency since it offloads the cloud. ENISA and NIST security guidelines underscore the importance of enabling default local communication for safety and reliability. Indeed, an IoT device should continue to function in case the cloud connection is not available. While the security of cloud-device connections is typically strengthened through the usage of standard protocols, local connectivity security is frequently overlooked. Neglecting the security of local communication opens doors to various threats, including replay attacks. In this paper, we investigate this class of attacks by designing a systematic methodology for automatically testing IoT devices vulnerability to replay attacks. Specifically, we propose a tool, named REPLIOT, able to test whether a replay attack is successful or not, without prior knowledge of the target devices. We perform thousands of automated experiments using popular commercial devices spanning various vendors and categories. Notably, our study reveals that among these devices, 51% of them do not support local connectivity, thus they are not compliant with the reliability and safety requirements of the ENISA/NIST guidelines. We find that 75% of the remaining devices are vulnerable to replay attacks with REPLIOT having a detection accuracy of 0.98-1. Finally, we investigate the possible causes of this vulnerability, discussing possible mitigation strategies.
The manipulation of deformable objects by robotic systems presents a significant challenge due to their complex and infinite-dimensional configuration spaces. This paper introduces a novel approach to Deformable Object Manipulation (DOM) by emphasizing the identification and manipulation of Structures of Interest (SOIs) in deformable fabric bags. We propose a bimanual manipulation framework that leverages a Graph Neural Network (GNN)-based latent dynamics model to succinctly represent and predict the behavior of these SOIs. Our approach involves constructing a graph representation from partial point cloud data of the object and learning the latent dynamics model that effectively captures the essential deformations of the fabric bag within a reduced computational space. By integrating this latent dynamics model with Model Predictive Control (MPC), we empower robotic manipulators to perform precise and stable manipulation tasks focused on the SOIs. We have validated our framework through various empirical experiments demonstrating its efficacy in bimanual manipulation of fabric bags. Our contributions not only address the complexities inherent in DOM but also provide new perspectives and methodologies for enhancing robotic interactions with deformable objects by concentrating on their critical structural elements. Experimental videos can be obtained from //sites.google.com/view/bagbot.
Deep learning-based joint source-channel coding (DJSCC) is expected to be a key technique for {the} next-generation wireless networks. However, the existing DJSCC schemes still face the challenge of channel adaptability as they are typically trained under specific channel conditions. In this paper, we propose a generic framework for channel-adaptive DJSCC by utilizing hypernetworks. To tailor the hypernetwork-based framework for communication systems, we propose a memory-efficient hypernetwork parameterization and then develop a channel-adaptive DJSCC network, named Hyper-AJSCC. Compared with existing adaptive DJSCC based on the attention mechanism, Hyper-AJSCC introduces much fewer parameters and can be seamlessly combined with various existing DJSCC networks without any substantial modifications to their neural network architecture. Extensive experiments demonstrate the better adaptability to channel conditions and higher memory efficiency of Hyper-AJSCC compared with state-of-the-art baselines.
Quantum communication networks (QCNs) utilize quantum mechanics for secure information transmission, but the reliance on fragile and expensive photonic quantum resources renders QCN resource optimization challenging. Unlike prior QCN works that relied on blindly compressing direct quantum embeddings of classical data, this letter proposes a novel quantum semantic communications (QSC) framework exploiting advancements in quantum machine learning and quantum semantic representations to extracts and embed only the relevant information from classical data into minimal high-dimensional quantum states that are accurately communicated over quantum channels with quantum communication and semantic fidelity measures. Simulation results indicate that, compared to semantic-agnostic QCN schemes, the proposed framework achieves approximately 50-75% reduction in quantum communication resources needed, while achieving a higher quantum semantic fidelity.
Pseudorange errors are the root cause of localization inaccuracy in GPS. Previous data-driven methods regress and eliminate pseudorange errors using handcrafted intermediate labels. Unlike them, we propose an end-to-end GPS localization framework, E2E-PrNet, to train a neural network for pseudorange correction (PrNet) directly using the final task loss calculated with the ground truth of GPS receiver states. The gradients of the loss with respect to learnable parameters are backpropagated through a differentiable nonlinear least squares optimizer to PrNet. The feasibility is verified with GPS data collected by Android phones, showing that E2E-PrNet outperforms the state-of-the-art end-to-end GPS localization methods.
This is a first draft of a quick primer on the use of Python (and relevant libraries) to build a wireless communication prototype that supports multiple-input and multiple-output (MIMO) systems with orthogonal frequency division multiplexing (OFDM) in addition to some machine learning use cases. This primer is intended to empower researchers with a means to efficiently create simulations. This draft is aligned with the syllabus of a graduate course we created to be taught in Fall 2022 and we aspire to update this draft occasionally based on feedback from the larger research community.
Vast amount of data generated from networks of sensors, wearables, and the Internet of Things (IoT) devices underscores the need for advanced modeling techniques that leverage the spatio-temporal structure of decentralized data due to the need for edge computation and licensing (data access) issues. While federated learning (FL) has emerged as a framework for model training without requiring direct data sharing and exchange, effectively modeling the complex spatio-temporal dependencies to improve forecasting capabilities still remains an open problem. On the other hand, state-of-the-art spatio-temporal forecasting models assume unfettered access to the data, neglecting constraints on data sharing. To bridge this gap, we propose a federated spatio-temporal model -- Cross-Node Federated Graph Neural Network (CNFGNN) -- which explicitly encodes the underlying graph structure using graph neural network (GNN)-based architecture under the constraint of cross-node federated learning, which requires that data in a network of nodes is generated locally on each node and remains decentralized. CNFGNN operates by disentangling the temporal dynamics modeling on devices and spatial dynamics on the server, utilizing alternating optimization to reduce the communication cost, facilitating computations on the edge devices. Experiments on the traffic flow forecasting task show that CNFGNN achieves the best forecasting performance in both transductive and inductive learning settings with no extra computation cost on edge devices, while incurring modest communication cost.
Leveraging datasets available to learn a model with high generalization ability to unseen domains is important for computer vision, especially when the unseen domain's annotated data are unavailable. We study a novel and practical problem of Open Domain Generalization (OpenDG), which learns from different source domains to achieve high performance on an unknown target domain, where the distributions and label sets of each individual source domain and the target domain can be different. The problem can be generally applied to diverse source domains and widely applicable to real-world applications. We propose a Domain-Augmented Meta-Learning framework to learn open-domain generalizable representations. We augment domains on both feature-level by a new Dirichlet mixup and label-level by distilled soft-labeling, which complements each domain with missing classes and other domain knowledge. We conduct meta-learning over domains by designing new meta-learning tasks and losses to preserve domain unique knowledge and generalize knowledge across domains simultaneously. Experiment results on various multi-domain datasets demonstrate that the proposed Domain-Augmented Meta-Learning (DAML) outperforms prior methods for unseen domain recognition.
A large number of real-world graphs or networks are inherently heterogeneous, involving a diversity of node types and relation types. Heterogeneous graph embedding is to embed rich structural and semantic information of a heterogeneous graph into low-dimensional node representations. Existing models usually define multiple metapaths in a heterogeneous graph to capture the composite relations and guide neighbor selection. However, these models either omit node content features, discard intermediate nodes along the metapath, or only consider one metapath. To address these three limitations, we propose a new model named Metapath Aggregated Graph Neural Network (MAGNN) to boost the final performance. Specifically, MAGNN employs three major components, i.e., the node content transformation to encapsulate input node attributes, the intra-metapath aggregation to incorporate intermediate semantic nodes, and the inter-metapath aggregation to combine messages from multiple metapaths. Extensive experiments on three real-world heterogeneous graph datasets for node classification, node clustering, and link prediction show that MAGNN achieves more accurate prediction results than state-of-the-art baselines.
Detecting carried objects is one of the requirements for developing systems to reason about activities involving people and objects. We present an approach to detect carried objects from a single video frame with a novel method that incorporates features from multiple scales. Initially, a foreground mask in a video frame is segmented into multi-scale superpixels. Then the human-like regions in the segmented area are identified by matching a set of extracted features from superpixels against learned features in a codebook. A carried object probability map is generated using the complement of the matching probabilities of superpixels to human-like regions and background information. A group of superpixels with high carried object probability and strong edge support is then merged to obtain the shape of the carried object. We applied our method to two challenging datasets, and results show that our method is competitive with or better than the state-of-the-art.