Indoor visible light communication (VLC) is considered secure against attackers outside the confined area where the light propagates, but it is still susceptible to interception from inside the coverage area. A new technology, intelligent reflecting surfaces (IRS), has been recently introduced, offering a way to enhance physical layer security (PLS). Most research on IRS-assisted VLC assumes the same time of arrival from all reflecting elements and overlooks the effect of time delay and the associated intersymbol interference. This paper tackles, for the first time, the effect of time delay on the secrecy rate in VLC systems. Our results show that, at a fixed light-emitting diode (LED) power of 3W, the secrecy rate can be enhanced by up to 253\% at random positions for the legitimate user when the eavesdropper is located within a 1-meter radius of the LED. Our results also show that careful allocation of the IRS elements can lead to enhanced PLS even when the eavesdropper has a more favourable position and, thus, a better channel gain than the legitimate user.
Accurate positioning of remotely operated underwater vehicles (ROVs) in confined environments is crucial for inspection and mapping tasks and is also a prerequisite for autonomous operations. Presently, there are no positioning systems available that are suited for real-world use in confined underwater environments, unconstrained by environmental lighting and water turbidity levels and have sufficient accuracy for long-term, reliable and repeatable navigation. This shortage presents a significant barrier to enhancing the capabilities of ROVs in such scenarios. This paper introduces an innovative positioning system for ROVs operating in confined, cluttered underwater settings, achieved through the collaboration of an omnidirectional surface vehicle and an ROV. A formulation is proposed and evaluated in the simulation against ground truth. The experimental results from the simulation form a proof of principle of the proposed system and also demonstrate its deployability. Unlike many previous approaches, the system does not rely on fixed infrastructure or tracking of features in the environment and can cover large enclosed areas without additional equipment.
Deep neural networks have exhibited substantial success in the field of Natural Language Processing (NLP) and ensuring their safety and reliability is crucial: there are safety critical contexts where such models must be robust to variability or attack, and give guarantees over their output. Unlike Computer Vision, NLP lacks a unified verification methodology and, despite recent advancements in literature, they are often light on the pragmatical issues of NLP verification. In this paper, we make an attempt to distil and evaluate general components of an NLP verification pipeline, that emerges from the progress in the field to date. Our contributions are two-fold. Firstly, we give a general characterisation of verifiable subspaces that result from embedding sentences into continuous spaces. We identify, and give an effective method to deal with, the technical challenge of semantic generalisability of verified subspaces; and propose it as a standard metric in the NLP verification pipelines (alongside with the standard metrics of model accuracy and model verifiability). Secondly, we propose a general methodology to analyse the effect of the embedding gap, a problem that refers to the discrepancy between verification of geometric subpspaces on the one hand, and semantic meaning of sentences which the geometric subspaces are supposed to represent, on the other hand. In extreme cases, poor choices in embedding of sentences may invalidate verification results. We propose a number of practical NLP methods that can help to identify the effects of the embedding gap; and in particular we propose the metric of falsifiability of semantic subpspaces as another fundamental metric to be reported as part of the NLP verification pipeline. We believe that together these general principles pave the way towards a more consolidated and effective development of this new domain.
Multi-agent perception (MAP) allows autonomous systems to understand complex environments by interpreting data from multiple sources. This paper investigates intermediate collaboration for MAP with a specific focus on exploring "good" properties of collaborative view (i.e., post-collaboration feature) and its underlying relationship to individual views (i.e., pre-collaboration features), which were treated as an opaque procedure by most existing works. We propose a novel framework named CMiMC (Contrastive Mutual Information Maximization for Collaborative Perception) for intermediate collaboration. The core philosophy of CMiMC is to preserve discriminative information of individual views in the collaborative view by maximizing mutual information between pre- and post-collaboration features while enhancing the efficacy of collaborative views by minimizing the loss function of downstream tasks. In particular, we define multi-view mutual information (MVMI) for intermediate collaboration that evaluates correlations between collaborative views and individual views on both global and local scales. We establish CMiMNet based on multi-view contrastive learning to realize estimation and maximization of MVMI, which assists the training of a collaboration encoder for voxel-level feature fusion. We evaluate CMiMC on V2X-Sim 1.0, and it improves the SOTA average precision by 3.08% and 4.44% at 0.5 and 0.7 IoU (Intersection-over-Union) thresholds, respectively. In addition, CMiMC can reduce communication volume to 1/32 while achieving performance comparable to SOTA. Code and Appendix are released at //github.com/77SWF/CMiMC.
The problem of distributed optimization requires a group of networked agents to compute a parameter that minimizes the average of their local cost functions. While there are a variety of distributed optimization algorithms that can solve this problem, they are typically vulnerable to "Byzantine" agents that do not follow the algorithm. Recent attempts to address this issue focus on single dimensional functions, or assume certain statistical properties of the functions at the agents. In this paper, we provide two resilient, scalable, distributed optimization algorithms for multi-dimensional functions. Our schemes involve two filters, (1) a distance-based filter and (2) a min-max filter, which each remove neighborhood states that are extreme (defined precisely in our algorithms) at each iteration. We show that these algorithms can mitigate the impact of up to $F$ (unknown) Byzantine agents in the neighborhood of each regular agent. In particular, we show that if the network topology satisfies certain conditions, all of the regular agents' states are guaranteed to converge to a bounded region that contains the minimizer of the average of the regular agents' functions.
Localization in outdoor wireless systems typically requires transmitting specific reference signals to estimate distance (trilateration methods) or angle (triangulation methods). These cause overhead on communication, need a LoS link to work well, and require multiple base stations, often imposing synchronization or specific hardware requirements. Fingerprinting has none of these drawbacks, but building its database requires high human effort to collect real-world measurements. For a long time, this issue limited the size of databases and thus their performance. This work proposes significantly reducing human effort in building fingerprinting databases by populating them with \textit{digital twin RF maps}. These RF maps are built from ray-tracing simulations on a digital replica of the environment across several frequency bands and beamforming configurations. Online user fingerprints are then matched against this spatial database. The approach was evaluated with practical simulations using realistic propagation models and user measurements. Our experiments show sub-meter localization errors on a NLoS location 95\% of the time using sensible user measurement report sizes. Results highlight the promising potential of the proposed digital twin approach for ubiquitous wide-area 6G localization.
Object counting typically uses 2D point annotations. The complexity of object shapes and the subjectivity of annotators may lead to annotation inconsistency, potentially confusing counting model training. Some sophisticated noise-resistance counting methods have been proposed to alleviate this issue. Differently, we aim to directly refine the initial point annotations before training counting models. For that, we propose the Shifted Autoencoders (SAE), which enhances annotation consistency. Specifically, SAE applies random shifts to initial point annotations and employs a UNet to restore them to their original positions. Similar to MAE reconstruction, the trained SAE captures general position knowledge and ignores specific manual offset noise. This allows to restore the initial point annotations to more general and thus consistent positions. Extensive experiments show that using such refined consistent annotations to train some advanced (including noise-resistance) object counting models steadily/significantly boosts their performances. Remarkably, the proposed SAE helps to set new records on nine datasets. We will make codes and refined point annotations available.
We address the problem of user association in a dense millimeter wave (mmWave) network, in which each arriving user brings a file containing a random number of packets and each time slot is divided into multiple mini-slots. This problem is an instance of the restless multi-armed bandit problem, and is provably hard to solve. Using a technique introduced by Whittle, we relax the hard per-stage constraint that each arriving user must be associated with exactly one mmWave base station (mBS) to a long-term constraint and then use the Lagrangian multiplier technique to convert the problem into an unconstrained problem. This decouples the process governing the system into separate Markov Decision Processes at different mBSs. We prove that the problem is Whittle indexable, present a scheme for computing the Whittle indices of different mBSs, and propose an association scheme under which, each arriving user is associated with the mBS with the smallest value of the Whittle index. Using extensive simulations, we show that the proposed Whittle index based scheme outperforms several user association schemes proposed in prior work in terms of various performance metrics such as average cost, delay, throughput, and Jain's fairness index.
Collision detection is one of the most time-consuming operations during motion planning. Thus, there is an increasing interest in exploring machine learning techniques to speed up collision detection and sampling-based motion planning. A recent line of research focuses on utilizing neural signed distance functions of either the robot geometry or the swept volume of the robot motion. Building on this, we present a novel neural implicit swept volume model to continuously represent arbitrary motions parameterized by their start and goal configurations. This allows to quickly compute signed distances for any point in the task space to the robot motion. Further, we present an algorithm combining the speed of the deep learning-based signed distance computations with the strong accuracy guarantees of geometric collision checkers. We validate our approach in simulated and real-world robotic experiments, and demonstrate that it is able to speed up a commercial bin picking application.
We describe ACE0, a lightweight platform for evaluating the suitability and viability of AI methods for behaviour discovery in multiagent simulations. Specifically, ACE0 was designed to explore AI methods for multi-agent simulations used in operations research studies related to new technologies such as autonomous aircraft. Simulation environments used in production are often high-fidelity, complex, require significant domain knowledge and as a result have high R&D costs. Minimal and lightweight simulation environments can help researchers and engineers evaluate the viability of new AI technologies for behaviour discovery in a more agile and potentially cost effective manner. In this paper we describe the motivation for the development of ACE0.We provide a technical overview of the system architecture, describe a case study of behaviour discovery in the aerospace domain, and provide a qualitative evaluation of the system. The evaluation includes a brief description of collaborative research projects with academic partners, exploring different AI behaviour discovery methods.
Detecting carried objects is one of the requirements for developing systems to reason about activities involving people and objects. We present an approach to detect carried objects from a single video frame with a novel method that incorporates features from multiple scales. Initially, a foreground mask in a video frame is segmented into multi-scale superpixels. Then the human-like regions in the segmented area are identified by matching a set of extracted features from superpixels against learned features in a codebook. A carried object probability map is generated using the complement of the matching probabilities of superpixels to human-like regions and background information. A group of superpixels with high carried object probability and strong edge support is then merged to obtain the shape of the carried object. We applied our method to two challenging datasets, and results show that our method is competitive with or better than the state-of-the-art.