Cyberattacks are increasingly threatening networked systems, often with the emergence of new types of unknown (zero-day) attacks and the rise of vulnerable devices. While Machine Learning (ML)-based Intrusion Detection Systems (IDSs) have been shown to be extremely promising in detecting these attacks, the need to learn large amounts of labelled data often limits the applicability of ML-based IDSs to cybersystems that only have access to private local data. To address this issue, this paper proposes a novel Decentralized and Online Federated Learning Intrusion Detection (DOF-ID) architecture. DOF-ID is a collaborative learning system that allows each IDS used for a cybersystem to learn from experience gained in other cybersystems in addition to its own local data without violating the data privacy of other systems. As the performance evaluation results using public Kitsune and Bot-IoT datasets show, DOF-ID significantly improves the intrusion detection performance in all collaborating nodes simultaneously with acceptable computation time for online learning.
A near-field wideband communication system is studied, wherein a base station (BS) employs an extremely large-scale antenna array (ELAA) to serve multiple users situated within its near-field region. To facilitate the near-field beamfocusing and mitigate the wideband beam split, true-time delayer (TTD)-based hybrid beamforming architectures are employed at the BS. Apart from the fully-connected TTD-based architecture, a new sub-connected TTD-based architecture is proposed for enhancing energy efficiency. Three wideband beamfocusing optimization approaches are proposed to maximize spectral efficiency for both architectures. 1) Fully-digital approximation (FDA) approach: In this approach, the TTD-based hybrid beamformers are optimized to approximate the optimal fully-digital beamformers using block coordinate descent. 2) Penalty-based FDA approach: In this approach, the penalty method is leveraged in the FDA approach to guarantee the convergence to a stationary point of the spectral maximization problem. 3) Heuristic two-stage (HTS) approach: In this approach, the closed-form TTD-based analog beamformers are first designed based on the outcomes of near-field beam training and the piecewise-near-field approximation. Subsequently, the low-dimensional digital beamformer is optimized using knowledge of the low-dimensional equivalent channels, resulting in reduced computational complexity and channel estimation complexity. Our numerical results unveil that 1) the proposed approaches effectively eliminate the near-field beam split effect, and 2) compared to the fully-connected architecture, the proposed sub-connected architecture exhibits higher energy efficiency and imposes fewer hardware limitations on TTDs and system bandwidth.
We consider a robust beamforming problem where large amount of downlink (DL) channel state information (CSI) data available at a multiple antenna access point (AP) is used to improve the link quality to a user equipment (UE) for beyond-5G and 6G applications such as environment-specific initial access (IA) or wireless power transfer (WPT). As the DL CSI available at the current instant may be imperfect or outdated, we propose a novel scheme which utilizes the (unknown) correlation between the antenna domain and physical domain to localize the possible future UE positions from the historical CSI database. Then, we develop a codebook design procedure to maximize the minimum sum beamforming gain to that localized CSI neighborhood. We also incorporate a UE specific parameter to enlarge the neighborhood to robustify the link further. We adopt an indoor channel model to demonstrate the performance of our solution, and benchmark against a usually optimal (but now sub-optimal due to outdated CSI) maximum ratio transmission (MRT) and a subspace based method.We numerically show that our algorithm outperforms the other methods by a large margin. This shows that customized environment-specific solutions are important to solve many future wireless applications, and we have paved the way to develop further data-driven approaches.
The field of motion prediction for automated driving has seen tremendous progress recently, bearing ever-more mighty neural network architectures. Leveraging these powerful models bears great potential for the closely related planning task. In this letter we propose a novel goal-conditioning method and show its potential to transform a state-of-the-art prediction model into a goal-directed planner. Our key insight is that conditioning prediction on a navigation goal at the behaviour level outperforms other widely adopted methods, with the additional benefit of increased model interpretability. We train our model on a large open-source dataset and show promising performance in a comprehensive benchmark.
Accurately detecting symbols transmitted over multiple-input multiple-output (MIMO) wireless channels is crucial in realizing the benefits of MIMO techniques. However, optimal MIMO detection is associated with a complexity that grows exponentially with the MIMO dimensions and quickly becomes impractical. Recently, stochastic sampling-based Bayesian inference techniques, such as Markov chain Monte Carlo (MCMC), have been combined with the gradient descent (GD) method to provide a promising framework for MIMO detection. In this work, we propose to efficiently approach optimal detection by exploring the discrete search space via MCMC random walk accelerated by Nesterov's gradient method. Nesterov's GD guides MCMC to make efficient searches without the computationally expensive matrix inversion and line search. Our proposed method operates using multiple GDs per random walk, achieving sufficient descent towards important regions of the search space before adding random perturbations, guaranteeing high sampling efficiency. To provide augmented exploration, extra samples are derived through the trajectory of Nesterov's GD by simple operations, effectively supplementing the sample list for statistical inference and boosting the overall MIMO detection performance. Furthermore, we design an early stopping tactic to terminate unnecessary further searches, remarkably reducing the complexity. Simulation results and complexity analysis reveal that the proposed method achieves near-optimal performance in both uncoded and coded MIMO systems, adapts to realistic channel models, and scales well to large MIMO dimensions.
Weakly hard real-time systems can, to some degree, tolerate deadline misses, but their schedulability still needs to be analyzed to ensure their quality of service. Such analysis usually occurs at early design stages to provide implementation guidelines to engineers so that they can make better design decisions. Estimating worst-case execution times (WCET) is a key input to schedulability analysis. However, early on during system design, estimating WCET values is challenging and engineers usually determine them as plausible ranges based on their domain knowledge. Our approach aims at finding restricted, safe WCET sub-ranges given a set of ranges initially estimated by experts in the context of weakly hard real-time systems. To this end, we leverage (1) multi-objective search aiming at maximizing the violation of weakly hard constraints in order to find worst-case scheduling scenarios and (2) polynomial logistic regression to infer safe WCET ranges with a probabilistic interpretation. We evaluated our approach by applying it to an industrial system in the satellite domain and several realistic synthetic systems. The results indicate that our approach significantly outperforms a baseline relying on random search without learning, and estimates safe WCET ranges with a high degree of confidence in practical time (< 23h).
Vast amount of data generated from networks of sensors, wearables, and the Internet of Things (IoT) devices underscores the need for advanced modeling techniques that leverage the spatio-temporal structure of decentralized data due to the need for edge computation and licensing (data access) issues. While federated learning (FL) has emerged as a framework for model training without requiring direct data sharing and exchange, effectively modeling the complex spatio-temporal dependencies to improve forecasting capabilities still remains an open problem. On the other hand, state-of-the-art spatio-temporal forecasting models assume unfettered access to the data, neglecting constraints on data sharing. To bridge this gap, we propose a federated spatio-temporal model -- Cross-Node Federated Graph Neural Network (CNFGNN) -- which explicitly encodes the underlying graph structure using graph neural network (GNN)-based architecture under the constraint of cross-node federated learning, which requires that data in a network of nodes is generated locally on each node and remains decentralized. CNFGNN operates by disentangling the temporal dynamics modeling on devices and spatial dynamics on the server, utilizing alternating optimization to reduce the communication cost, facilitating computations on the edge devices. Experiments on the traffic flow forecasting task show that CNFGNN achieves the best forecasting performance in both transductive and inductive learning settings with no extra computation cost on edge devices, while incurring modest communication cost.
The military is investigating methods to improve communication and agility in its multi-domain operations (MDO). Nascent popularity of Internet of Things (IoT) has gained traction in public and government domains. Its usage in MDO may revolutionize future battlefields and may enable strategic advantage. While this technology offers leverage to military capabilities, it comes with challenges where one is the uncertainty and associated risk. A key question is how can these uncertainties be addressed. Recently published studies proposed information camouflage to transform information from one data domain to another. As this is comparatively a new approach, we investigate challenges of such transformations and how these associated uncertainties can be detected and addressed, specifically unknown-unknowns to improve decision-making.
Seamlessly interacting with humans or robots is hard because these agents are non-stationary. They update their policy in response to the ego agent's behavior, and the ego agent must anticipate these changes to co-adapt. Inspired by humans, we recognize that robots do not need to explicitly model every low-level action another agent will make; instead, we can capture the latent strategy of other agents through high-level representations. We propose a reinforcement learning-based framework for learning latent representations of an agent's policy, where the ego agent identifies the relationship between its behavior and the other agent's future strategy. The ego agent then leverages these latent dynamics to influence the other agent, purposely guiding them towards policies suitable for co-adaptation. Across several simulated domains and a real-world air hockey game, our approach outperforms the alternatives and learns to influence the other agent.
The demand for artificial intelligence has grown significantly over the last decade and this growth has been fueled by advances in machine learning techniques and the ability to leverage hardware acceleration. However, in order to increase the quality of predictions and render machine learning solutions feasible for more complex applications, a substantial amount of training data is required. Although small machine learning models can be trained with modest amounts of data, the input for training larger models such as neural networks grows exponentially with the number of parameters. Since the demand for processing training data has outpaced the increase in computation power of computing machinery, there is a need for distributing the machine learning workload across multiple machines, and turning the centralized into a distributed system. These distributed systems present new challenges, first and foremost the efficient parallelization of the training process and the creation of a coherent model. This article provides an extensive overview of the current state-of-the-art in the field by outlining the challenges and opportunities of distributed machine learning over conventional (centralized) machine learning, discussing the techniques used for distributed machine learning, and providing an overview of the systems that are available.
Multivariate time series forecasting is extensively studied throughout the years with ubiquitous applications in areas such as finance, traffic, environment, etc. Still, concerns have been raised on traditional methods for incapable of modeling complex patterns or dependencies lying in real word data. To address such concerns, various deep learning models, mainly Recurrent Neural Network (RNN) based methods, are proposed. Nevertheless, capturing extremely long-term patterns while effectively incorporating information from other variables remains a challenge for time-series forecasting. Furthermore, lack-of-explainability remains one serious drawback for deep neural network models. Inspired by Memory Network proposed for solving the question-answering task, we propose a deep learning based model named Memory Time-series network (MTNet) for time series forecasting. MTNet consists of a large memory component, three separate encoders, and an autoregressive component to train jointly. Additionally, the attention mechanism designed enable MTNet to be highly interpretable. We can easily tell which part of the historic data is referenced the most.