The limited priors required by neural networks make them the dominant choice for encoding and learning policies with reinforcement learning (RL). However, they are also black boxes, making it hard to understand an agent's behaviour, especially when it operates at the image level. Therefore, neuro-symbolic RL aims at creating policies that are interpretable in the first place. Unfortunately, interpretability is not explainability. To achieve both, we introduce Neurally gUided Differentiable loGic policiEs (NUDGE). NUDGE exploits trained neural network-based agents to guide the search for candidate weighted logic rules, then uses differentiable logic to train the logic agents. Our experimental evaluation demonstrates that NUDGE agents can induce interpretable and explainable policies while outperforming purely neural ones, adapting well to environments with different initial states and problem sizes.
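As a minimal sketch of what a differentiable weighted-rule policy can look like (the rule format, the product t-norm for conjunction, and the softmax action selection are our illustrative assumptions, not NUDGE's exact implementation):

```python
import numpy as np

def soft_and(truth_values):
    # Differentiable conjunction: product t-norm over body-atom valuations.
    return np.prod(truth_values)

def evaluate_policy(rules, rule_weights, state_valuation):
    """Score each action by the weighted soft truth of its supporting rules.

    rules: list of (action, [body_atom, ...]) pairs -- illustrative structure
    rule_weights: learnable weights, one per rule
    state_valuation: ground atoms mapped to soft truth values in [0, 1]
    """
    action_scores = {}
    for (action, body), w in zip(rules, rule_weights):
        truth = soft_and([state_valuation.get(atom, 0.0) for atom in body])
        action_scores[action] = action_scores.get(action, 0.0) + w * truth
    # Softmax turns rule scores into a stochastic policy.
    actions = list(action_scores)
    scores = np.array([action_scores[a] for a in actions])
    probs = np.exp(scores - scores.max()) / np.exp(scores - scores.max()).sum()
    return dict(zip(actions, probs))

# Toy example: two rules for a simple agent.
rules = [("jump", ["close_by(enemy)"]), ("right", ["on_left(agent, goal)"])]
weights = np.array([1.5, 0.8])
valuation = {"close_by(enemy)": 0.9, "on_left(agent, goal)": 0.2}
print(evaluate_policy(rules, weights, valuation))
```

Because every step (valuation, conjunction, weighting) is differentiable, the rule weights can be trained by policy gradient like any neural parameter, while the learned rules remain readable.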
Policy robustness in reinforcement learning may not be desirable at any cost: the alterations that robustness requirements impose on otherwise optimal policies should be explainable, quantifiable, and formally verifiable. In this work we study how policies can be made maximally robust to arbitrary observational noise by analysing how this noise alters them, through a stochastic-linear-operator interpretation of the disturbances, and we establish connections between robustness and properties of the noise kernel and of the underlying MDPs. We then construct sufficient conditions for policy robustness, and propose a robustness-inducing scheme, applicable to any policy gradient algorithm, that formally trades off expected policy utility for robustness through lexicographic optimisation, while preserving convergence and sub-optimality guarantees in the policy synthesis.
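As a worked rendering of the stochastic-linear-operator view (our notation, which may differ from the paper's): if the agent observes a corrupted state drawn from a noise kernel instead of the true state, the policy it effectively executes is the linear image of the original one under that kernel.

```latex
% Observational noise as a stochastic kernel K acting linearly on the policy:
% the agent sees s' ~ K(. | s) instead of the true state s, so it executes
\tilde{\pi}(a \mid s) \;=\; \sum_{s' \in \mathcal{S}} K(s' \mid s)\, \pi(a \mid s').
```

Robustness then amounts to requiring that $\tilde{\pi}$ and $\pi$ achieve (nearly) the same expected return, which is what ties robustness to the properties of the kernel $K$ and of the underlying MDP.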
Owing to its strong robustness to illumination variations, near-infrared (NIR) imaging can be an effective and essential complement to visible (VIS) facial expression recognition in low-light or completely dark conditions. However, facial expression recognition (FER) from NIR images is more challenging than traditional FER due to the limited scale of available data and the difficulty of extracting discriminative features from images with incomplete visible-lighting content. In this paper, we make the first attempt at deep NIR facial expression recognition and propose a novel method called the near-infrared facial expression transformer (NFER-Former). Specifically, to make full use of the abundant label information available in the VIS domain, we introduce a Self-Attention Orthogonal Decomposition mechanism that disentangles expression information from spectrum information in the input image, so that expression features can be extracted without interference from spectrum variation. We also propose a Hypergraph-Guided Feature Embedding method that models key facial behaviors and learns the structure of the complex correlations between them, thereby alleviating the interference of inter-class similarity. Additionally, we have constructed a large NIR-VIS Facial Expression dataset, comprising 360 subjects, to better validate the efficacy of NFER-Former. Extensive experiments and ablation studies show that NFER-Former significantly improves the performance of NIR FER and achieves state-of-the-art results on the only two available NIR FER datasets, Oulu-CASIA and Large-HFE.
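One generic way to realize such disentanglement is an orthogonality penalty between the two feature sets; the loss below is a common sketch of this idea, not NFER-Former's exact formulation.

```python
import numpy as np

def orthogonality_loss(expr_feats, spec_feats):
    """Penalize correlation between expression and spectrum features.

    expr_feats, spec_feats: (batch, dim) arrays of L2-normalized features.
    Returns the squared Frobenius norm of their cross-correlation, which is
    zero exactly when the two feature subspaces are orthogonal.
    """
    cross = expr_feats.T @ spec_feats / expr_feats.shape[0]
    return np.sum(cross ** 2)

# Toy check: orthogonal features incur zero loss.
e = np.array([[1.0, 0.0], [1.0, 0.0]])
s = np.array([[0.0, 1.0], [0.0, 1.0]])
print(orthogonality_loss(e, s))  # 0.0
```

Driving such a penalty to zero forces expression features to carry no information that is linearly predictable from the spectrum features, which is the property the abstract's decomposition aims for.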
In the era of extensive data growth, robust and efficient mechanisms, such as Data Storage Systems (DSSs), are needed to store and manage vast amounts of digital information. Concurrently, privacy concerns have arisen, leading to the development of techniques like Private Information Retrieval (PIR) to enable data access while preserving privacy. A PIR protocol allows users to retrieve information from a database without revealing the specifics of their query or the data they are accessing. With the advent of quantum computing, researchers have explored the potential of using quantum systems to enhance privacy in information retrieval. In a Quantum Private Information Retrieval (QPIR) protocol, a user can retrieve information from a database by downloading quantum systems from multiple servers, while ensuring that the servers remain oblivious to the specific information being accessed. This scenario offers a unique advantage by leveraging the inherent properties of quantum systems to provide enhanced privacy guarantees and improved communication rates compared to classical PIR protocols. In this thesis we consider the QPIR setting where the queries and the coded storage systems are classical, while the responses from the servers are quantum. This problem was treated by Song et al. for replicated storage and different collusion patterns. This thesis aims to develop QPIR protocols for coded storage by combining known classical PIR protocols with quantum communication algorithms, achieving enhanced privacy and reduced communication costs. We consider different storage codes and robustness assumptions, and we prove that the achieved communication cost is always lower than that of the classical counterparts.
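For intuition about the classical building blocks being combined with quantum responses, here is the textbook two-server PIR scheme over replicated storage in a few lines of Python; it is a standard classical protocol, not the thesis's coded-storage construction.

```python
import secrets

def two_server_pir(db, index):
    """Textbook two-server PIR over a replicated bit database.

    The user sends each server a random-looking index subset; neither query
    alone reveals `index`, yet XORing the two one-bit answers recovers
    db[index].
    """
    q1 = {i for i in range(len(db)) if secrets.randbits(1)}  # uniform random subset
    q2 = q1 ^ {index}                                        # same subset, target toggled
    a1 = sum(db[i] for i in q1) % 2                          # server 1 XORs its subset
    a2 = sum(db[i] for i in q2) % 2                          # server 2 likewise
    return (a1 + a2) % 2                                     # XOR of answers = db[index]

db = [1, 0, 1, 1, 0, 1, 0, 0]
assert all(two_server_pir(db, i) == db[i] for i in range(len(db)))
```

Each server sees only a uniformly random subset, so (absent collusion) it learns nothing about the queried index; QPIR protocols improve the communication rate of such schemes by replacing the classical answers with quantum systems.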
The Quantum Internet signifies a remarkable advancement in communication technology, harnessing the principles of quantum entanglement and superposition to facilitate unparalleled levels of security and efficient computation. Quantum communication can be achieved through the use of quantum entanglement: by exchanging entangled pairs, two entities can communicate via quantum teleportation. Given the lossy nature of the channels and the exponential decoherence of the transmitted photons, a set of intermediate nodes can serve as quantum repeaters, performing entanglement swapping to directly entangle two distant nodes. Such quantum repeaters may, however, be malicious: by setting up malicious entanglements, intermediate nodes can jeopardize the confidentiality of the quantum information exchanged between the two communicating nodes. Hence, this paper proposes a quantum identity authentication protocol that protects quantum networks from malicious entanglements. Unlike existing protocols, the proposed quantum authentication protocol does not require periodic refreshment of the shared secret keys. Simulation results demonstrate that the proposed protocol detects malicious entanglements with 100% probability after an average of 4 authentication rounds.
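Read as a geometric detection process, an average of 4 rounds would correspond to a per-round detection probability of 1/4, since a geometric distribution with mean 1/p = 4 gives p = 0.25. The simulation below is a minimal sketch under that independence assumption (not the protocol's actual measurement statistics) and reproduces the reported average.

```python
import random

def rounds_until_detection(p_detect=0.25):
    """Simulate authentication rounds until a malicious entanglement is caught.

    Assumes (for illustration) that each round independently detects tampering
    with probability p_detect; the expected round count is then 1 / p_detect.
    """
    rounds = 1
    while random.random() >= p_detect:
        rounds += 1
    return rounds

trials = [rounds_until_detection() for _ in range(100_000)]
print(sum(trials) / len(trials))  # ~4.0, matching the reported average
```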
We consider the high-dimensional linear regression model and assume that a fraction of the measurements are altered by an adversary with complete knowledge of the data and the underlying distribution. We are interested in a scenario where the dense additive noise is heavy-tailed while the measurement vectors follow a sub-Gaussian distribution. Within this framework, we establish minimax lower bounds for the performance of an arbitrary estimator that depend on the fraction of corrupted observations as well as the tail behavior of the additive noise. Moreover, we design a modification of the so-called Square-Root Slope estimator with several desirable features: (a) it is provably robust to adversarial contamination, and satisfies performance guarantees in the form of sub-Gaussian deviation inequalities that match the lower error bounds, up to logarithmic factors; (b) it is fully adaptive with respect to the unknown sparsity level and the variance of the additive noise; and (c) it is computationally tractable as the solution of a convex optimization problem. To analyze the performance of the proposed estimator, we prove several properties of matrices with sub-Gaussian rows that may be of independent interest.
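For reference, the standard Square-Root SLOPE objective reads as below (the paper's modification and exact weight sequence may differ); here $|\beta|_{(1)} \ge \dots \ge |\beta|_{(p)}$ denote the sorted absolute entries of $\beta$.

```latex
\hat{\beta} \in \arg\min_{\beta \in \mathbb{R}^p}
  \frac{\lVert Y - X\beta \rVert_2}{\sqrt{n}}
  \;+\; \sum_{j=1}^{p} \lambda_j \,\lvert \beta \rvert_{(j)},
\qquad \lambda_1 \ge \lambda_2 \ge \cdots \ge \lambda_p \ge 0.
```

The square-root loss is what removes the need to know the noise variance, and the sorted ℓ1 penalty is what adapts to the unknown sparsity level, matching the adaptivity claims in feature (b).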
In computational social choice, the distortion of a voting rule quantifies the degree to which the rule overcomes limited preference information to select a socially desirable outcome. This concept has been investigated extensively, but only through a worst-case lens. Instead, we study the expected distortion of voting rules with respect to an underlying distribution over voter utilities. Our main contribution is the design and analysis of a novel and intuitive rule, binomial voting, which provides strong distribution-independent guarantees for both expected distortion and expected welfare.
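To make "expected distortion with respect to a utility distribution" concrete, here is a minimal Monte Carlo sketch; the uniform utility model and the plurality rule are illustrative stand-ins, not the paper's binomial voting rule.

```python
import random

def expected_distortion(num_voters=50, num_alts=4, samples=2000):
    """Monte Carlo estimate of a voting rule's expected distortion.

    Utilities are drawn i.i.d. uniform (an illustrative distribution); the
    rule sees only each voter's top choice (plurality ballots). Distortion is
    the optimal social welfare divided by the welfare of the chosen winner.
    """
    total = 0.0
    for _ in range(samples):
        utils = [[random.random() for _ in range(num_alts)]
                 for _ in range(num_voters)]
        welfare = [sum(u[a] for u in utils) for a in range(num_alts)]
        votes = [max(range(num_alts), key=lambda a: u[a]) for u in utils]
        winner = max(range(num_alts), key=votes.count)
        total += max(welfare) / welfare[winner]
    return total / samples

print(expected_distortion())  # near 1 under this benign utility model
```

Worst-case distortion instead takes the maximum of this ratio over all utility profiles consistent with the ballots, which is the lens the paper moves away from.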
The increasing spread of small commercial Unmanned Aerial Vehicles (UAVs, aka drones) poses serious threats to critical areas such as airports, power plants, and governmental and military facilities. In fact, such UAVs can easily disturb or jam radio communications, collide with other flying objects, perform espionage activity, and carry offensive payloads, e.g., weapons or explosives. A central problem when designing surveillance solutions for the localization of unauthorized UAVs in critical areas is deciding how many triangulating sensors to use, and where to deploy them, to optimize both coverage and cost effectiveness. In this article, we compute deployments of triangulating sensors for UAV localization, optimizing a given blend of metrics, namely: coverage under multiple sensing-quality levels, cost-effectiveness, and fault-tolerance. We focus on large, complex 3D regions that exhibit obstacles (e.g., buildings), varying terrain elevation, different coverage priorities, and constraints on possible sensor placement. Our novel approach relies on computational geometry and statistical model checking, and it enables the effective use of off-the-shelf AI-based black-box optimizers. Moreover, our method allows us to compute a closed-form, analytical representation of the region left uncovered by a sensor deployment, which provides the means for rigorous, formal certification of the deployment's quality. We show the practical feasibility of our approach by computing optimal sensor deployments for UAV localization in two large, complex 3D critical regions, the Rome Leonardo Da Vinci International Airport (FCO) and the Vienna International Center (VIC), using NOMAD as our state-of-the-art underlying optimization engine. Results show that we can compute optimal sensor deployments within a few hours on a standard workstation and within minutes on a small parallel infrastructure.
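Schematically, a black-box engine such as NOMAD only needs a scalar objective over candidate deployments. The sketch below blends coverage, fault tolerance, and cost under a toy distance-based sensing model; the weights, the sensing range, and the omission of obstacles and terrain are all simplifying assumptions, whereas the paper estimates coverage via computational geometry and statistical model checking.

```python
import math
import random

def coverage_fraction(sensors, region, min_sensors,
                      sensing_range=300.0, samples=5000):
    """Monte Carlo coverage estimate: fraction of sampled points in the 3D
    region seen by at least `min_sensors` sensors (line-of-sight and terrain
    effects omitted in this toy model)."""
    (x0, x1), (y0, y1), (z0, z1) = region
    covered = 0
    for _ in range(samples):
        p = (random.uniform(x0, x1), random.uniform(y0, y1),
             random.uniform(z0, z1))
        seen = sum(1 for s in sensors if math.dist(s, p) <= sensing_range)
        if seen >= min_sensors:
            covered += 1
    return covered / samples

def deployment_objective(sensors, region, w_cov=1.0, w_fault=0.5, w_cost=0.05):
    """Scalar score (to minimize) blending coverage, fault tolerance, and cost.
    Triangulation needs >= 3 sensors; >= 4 tolerates one sensor failure."""
    cov = coverage_fraction(sensors, region, min_sensors=3)
    fault = coverage_fraction(sensors, region, min_sensors=4)
    return -(w_cov * cov + w_fault * fault) + w_cost * len(sensors)

region = ((0, 1000), (0, 1000), (0, 100))
sensors = [(200, 200, 10), (800, 200, 10), (500, 800, 10), (500, 400, 10)]
print(deployment_objective(sensors, region))
```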
Adversarial attacks on image classification systems present challenges for convolutional networks and opportunities for understanding them. This study suggests that adversarial perturbations of images lead to noise in the features constructed by these networks. Motivated by this observation, we develop new network architectures that increase adversarial robustness by performing feature denoising. Specifically, our networks contain blocks that denoise the features using non-local means or other filters; the entire networks are trained end-to-end. When combined with adversarial training, our feature-denoising networks substantially improve the state of the art in adversarial robustness in both white-box and black-box attack settings. On ImageNet, under 10-iteration PGD white-box attacks where prior art attains 27.9% accuracy, our method achieves 55.7%; even under extreme 2000-iteration PGD white-box attacks, our method secures 42.6% accuracy. A network based on our method was ranked first in the Competition on Adversarial Attacks and Defenses (CAAD) 2018: it achieved 50.6% classification accuracy on a secret, ImageNet-like test dataset against 48 unknown attackers, surpassing the runner-up approach by ~10%. Code and models will be made publicly available.
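As a sketch of the core operation, here is a numpy rendering of a non-local means denoising block with a residual connection, using softmax dot-product weights; the paper's full blocks also include a 1x1 convolution and explore other weighting functions, which are omitted here.

```python
import numpy as np

def nonlocal_means_block(x):
    """Feature-denoising block: each spatial position is replaced by a
    similarity-weighted mean over all positions, added back residually.

    x: feature map of shape (channels, height, width)
    """
    c, h, w = x.shape
    feats = x.reshape(c, h * w)                      # one feature vector per position
    sim = feats.T @ feats                            # pairwise dot-product similarity
    sim -= sim.max(axis=1, keepdims=True)            # stabilize the softmax
    weights = np.exp(sim) / np.exp(sim).sum(axis=1, keepdims=True)
    denoised = (feats @ weights.T).reshape(c, h, w)  # weighted mean of all positions
    return x + denoised                              # residual connection

noisy = np.random.randn(8, 14, 14).astype(np.float32)
print(nonlocal_means_block(noisy).shape)  # (8, 14, 14)
```

Averaging over self-similar positions suppresses the spatially incoherent feature noise that adversarial perturbations induce, while the residual path preserves the clean signal.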
Multi-relation question answering is a challenging task, as it requires elaborate analysis of questions and reasoning over multiple fact triples in a knowledge base. In this paper, we present a novel model, the Interpretable Reasoning Network, that employs an interpretable, hop-by-hop reasoning process for question answering. The model dynamically decides which part of an input question should be analyzed at each hop; predicts a relation that corresponds to the currently parsed results; uses the predicted relation to update the question representation and the state of the reasoning process; and then drives the next-hop reasoning. Experiments show that our model yields state-of-the-art results on two datasets. More interestingly, the model offers traceable and observable intermediate predictions for reasoning analysis and failure diagnosis, thereby allowing manual intervention in predicting the final answer.
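The hop-by-hop loop can be sketched as below; the keyword-matching "modules" are illustrative stand-ins for the learned attention, relation-prediction, and update networks of the actual model.

```python
def interpretable_reasoning(question_tokens, kb, entity, max_hops=3):
    """Toy hop-by-hop reasoning: each hop consumes the question token naming a
    relation, follows that relation in the KB, and records an inspectable
    intermediate prediction. Real modules are learned networks; this keyword
    matcher only illustrates the control flow."""
    relations = {r for (_, r) in kb}
    remaining = list(question_tokens)
    trace = []                            # traceable intermediate predictions
    for hop in range(max_hops):
        match = next((t for t in remaining if t in relations), None)
        if match is None:                 # nothing left to analyze: stop reasoning
            break
        entity = kb[(entity, match)]      # follow the predicted fact triple
        remaining.remove(match)           # update the question representation
        trace.append((hop, match, entity))
    return entity, trace

kb = {("obama", "spouse"): "michelle", ("michelle", "birthplace"): "chicago"}
answer, trace = interpretable_reasoning(
    ["where", "born", "spouse", "birthplace"], kb, "obama")
print(answer, trace)  # chicago, with hop-by-hop intermediate predictions
```

The returned trace is what makes the reasoning observable: each hop's predicted relation and intermediate entity can be inspected, diagnosed, or manually overridden.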
Detecting carried objects is one of the requirements for developing systems that reason about activities involving people and objects. We present an approach for detecting carried objects from a single video frame with a novel method that incorporates features from multiple scales. Initially, the foreground mask of a video frame is segmented into multi-scale superpixels. Human-like regions in the segmented area are then identified by matching a set of features extracted from the superpixels against learned features in a codebook. A carried-object probability map is generated from the complement of the superpixels' matching probabilities to human-like regions, combined with background information. Groups of superpixels with high carried-object probability and strong edge support are then merged to recover the shape of the carried object. We applied our method to two challenging datasets, and the results show that our method is competitive with or better than the state of the art.
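A minimal sketch of the probability-map step as described: per superpixel, the carried-object score combines the complement of the human-likeness match with background evidence. The inputs and the exact combination rule are illustrative assumptions, not the paper's formulation.

```python
import numpy as np

def carried_object_map(human_match, background_prob):
    """Per-superpixel carried-object probability: high when the superpixel
    neither matches the human-like codebook nor looks like background.

    human_match, background_prob: arrays of probabilities in [0, 1],
    one entry per superpixel (illustrative inputs).
    """
    return (1.0 - human_match) * (1.0 - background_prob)

human_match = np.array([0.9, 0.8, 0.2, 0.1])  # codebook match to human-like regions
background = np.array([0.1, 0.1, 0.1, 0.8])   # background evidence
print(carried_object_map(human_match, background))
# high value at superpixel 2 flags a likely carried-object region
```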