Metaverse is a vast virtual environment parallel to the physical world in which users enjoy a variety of services acting as an avatar. To build a secure living habitat, it's vital to ensure the virtual-physical traceability that tracking a malicious player in the physical world via his avatars in virtual space. In this paper, we propose a two-factor authentication framework based on chameleon signature and biometric-based authentication. First, aiming at disguise in virtual space, we propose a chameleon collision signature algorithm to achieve the verifiability of the avatar's virtual identity. Second, facing at impersonation in physical world, we construct an avatar's identity model based on the player's biometric template and the chameleon key to realize the verifiability of the avatar's physical identity. Finally, we design two decentralized authentication protocols based on the avatar's identity model to ensure the consistency of the avatar's virtual and physical identities. Security analysis indicates that the proposed authentication framework guarantees the consistency and traceability of avatar's identity. Simulation experiments show that the framework not only completes the decentralized authentication between avatars but also achieves the virtual-physical tracking.
In the modern paradigm of federated learning, a large number of users are involved in a global learning task, in a collaborative way. They alternate local computations and two-way communication with a distant orchestrating server. Communication, which can be slow and costly, is the main bottleneck in this setting. To reduce the communication load and therefore accelerate distributed gradient descent, two strategies are popular: 1) communicate less frequently; that is, perform several iterations of local computations between the communication rounds; and 2) communicate compressed information instead of full-dimensional vectors. In this paper, we propose the first algorithm for distributed optimization and federated learning, which harnesses these two strategies jointly and converges linearly to an exact solution, with a doubly accelerated rate: our algorithm benefits from the two acceleration mechanisms provided by local training and compression, namely a better dependency on the condition number of the functions and on the dimension of the model, respectively.
The extensive use of smartphones and wearable devices has facilitated many useful applications. For example, with Global Positioning System (GPS)-equipped smart and wearable devices, many applications can gather, process, and share rich metadata, such as geolocation, trajectories, elevation, and time. For example, fitness applications, such as Runkeeper and Strava, utilize the information for activity tracking and have recently witnessed a boom in popularity. Those fitness tracker applications have their own web platforms and allow users to share activities on such platforms or even with other social network platforms. To preserve the privacy of users while allowing sharing, several of those platforms may allow users to disclose partial information, such as the elevation profile for an activity, which supposedly would not leak the location of the users. In this work, and as a cautionary tale, we create a proof of concept where we examine the extent to which elevation profiles can be used to predict the location of users. To tackle this problem, we devise three plausible threat settings under which the city or borough of the targets can be predicted. Those threat settings define the amount of information available to the adversary to launch the prediction attacks. Establishing that simple features of elevation profiles, e.g., spectral features, are insufficient, we devise both natural language processing (NLP)-inspired text-like representation and computer vision-inspired image-like representation of elevation profiles, and we convert the problem at hand into text and image classification problem. We use both traditional machine learning- and deep learning-based techniques and achieve a prediction success rate ranging from 59.59\% to 99.80\%. The findings are alarming, highlighting that sharing elevation information may have significant location privacy risks.
Since 2021, the term "Metaverse" has been the most popular one, garnering a lot of interest. Because of its contained environment and built-in computing and networking capabilities, a modern car makes an intriguing location to host its own little metaverse. Additionally, the travellers don't have much to do to pass the time while traveling, making them ideal customers for immersive services. Vetaverse (Vehicular-Metaverse), which we define as the future continuum between vehicular industries and Metaverse, is envisioned as a blended immersive realm that scales up to cities and countries, as digital twins of the intelligent Transportation Systems, referred to as "TS-Metaverse", as well as customized XR services inside each Individual Vehicle, referred to as "IV-Metaverse". The two subcategories serve fundamentally different purposes, namely long-term interconnection, maintenance, monitoring, and management on scale for large transportation systems (TS), and personalized, private, and immersive infotainment services (IV). By outlining the framework of Vetaverse and examining important enabler technologies, we reveal this impending trend. Additionally, we examine unresolved issues and potential routes for future study while highlighting some intriguing Vetaverse services.
Free-form rationales aim to aid model interpretability by supplying the background knowledge that can help understand model decisions. Crowdsourced rationales are provided for commonsense QA instances in popular datasets such as CoS-E and ECQA, but their utility remains under-investigated. We present human studies which show that ECQA rationales indeed provide additional background information to understand a decision, while over 88% of CoS-E rationales do not. Inspired by this finding, we ask: can the additional context provided by free-form rationales benefit models, similar to human users? We investigate the utility of rationales as an additional source of supervision, by varying the quantity and quality of rationales during training. After controlling for instances where rationales leak the correct answer while not providing additional background knowledge, we find that incorporating only 5% of rationales during training can boost model performance by 47.22% for CoS-E and 57.14% for ECQA during inference. Moreover, we also show that rationale quality matters: compared to crowdsourced rationales, T5-generated rationales provide not only weaker supervision to models, but are also not helpful for humans in aiding model interpretability.
Machine learning (ML) models are costly to train as they can require a significant amount of data, computational resources and technical expertise. Thus, they constitute valuable intellectual property that needs protection from adversaries wanting to steal them. Ownership verification techniques allow the victims of model stealing attacks to demonstrate that a suspect model was in fact stolen from theirs. Although a number of ownership verification techniques based on watermarking or fingerprinting have been proposed, most of them fall short either in terms of security guarantees (well-equipped adversaries can evade verification) or computational cost. A fingerprinting technique introduced at ICLR '21, Dataset Inference (DI), has been shown to offer better robustness and efficiency than prior methods. The authors of DI provided a correctness proof for linear (suspect) models. However, in the same setting, we prove that DI suffers from high false positives (FPs) -- it can incorrectly identify an independent model trained with non-overlapping data from the same distribution as stolen. We further prove that DI also triggers FPs in realistic, non-linear suspect models. We then confirm empirically that DI leads to FPs, with high confidence. Second, we show that DI also suffers from false negatives (FNs) -- an adversary can fool DI by regularising a stolen model's decision boundaries using adversarial training, thereby leading to an FN. To this end, we demonstrate that DI fails to identify a model adversarially trained from a stolen dataset -- the setting where DI is the hardest to evade. Finally, we discuss the implications of our findings, the viability of fingerprinting-based ownership verification in general, and suggest directions for future work.
A new approach to calculating the finite Fourier transform is suggested throughout the process of this study. The idea that the series has been updated with the appropriate modification and purification, which serves as the basis for the study, and that this update functions as the basis for the investigation is the conceptual goal of this method, which was designed especially for the purpose of this study. It is provided here that this methodology, which was designed especially for the purpose of this study, has been updated with the appropriate modification and purification, which serves as the basis for the study, is provided here. This study also used this update as the premise to get started. In order for this approach to be successful, the starting point must be the presumption that the series has been appropriately purified and organized to the point where it can be considered adequate. The attributes of this series were discovered as a result of the work that was ordered to choose an acceptable application of the Fourier series, to apply it, and to conduct an analysis of it in relation to the finite Fourier transform. These qualities were determined this study. The results of this study provided a better understanding of the characteristics of this series.
Voice control in the smart home is commonplace, enabling the convenient control of smart home Internet of Things hubs, gateways and devices, along with information seeking dialogues. Cloud-based voice assistants are used to facilitate the interaction, yet privacy concerns surround the cloud analysis of data. To what extent can voice control be performed using purely local computation, to ensure user data remains private? In this paper we present a taxonomy of the voice control technologies present in commercial smart home systems. We first review literature on the topic, and summarise relevant work categorising IoT devices and voice control in the home. The taxonomic classification of these entities is then presented, and we analyse our findings. Following on, we turn to academic efforts in implementing and evaluating voice-controlled smart home set-ups, and we then discuss open-source libraries and devices that are applicable to the design of a privacy-preserving voice assistant for smart homes and the IoT. Towards the end, we consider additional technologies and methods that could support a cloud-free voice assistant, and conclude the work.
Classic machine learning methods are built on the $i.i.d.$ assumption that training and testing data are independent and identically distributed. However, in real scenarios, the $i.i.d.$ assumption can hardly be satisfied, rendering the sharp drop of classic machine learning algorithms' performances under distributional shifts, which indicates the significance of investigating the Out-of-Distribution generalization problem. Out-of-Distribution (OOD) generalization problem addresses the challenging setting where the testing distribution is unknown and different from the training. This paper serves as the first effort to systematically and comprehensively discuss the OOD generalization problem, from the definition, methodology, evaluation to the implications and future directions. Firstly, we provide the formal definition of the OOD generalization problem. Secondly, existing methods are categorized into three parts based on their positions in the whole learning pipeline, namely unsupervised representation learning, supervised model learning and optimization, and typical methods for each category are discussed in detail. We then demonstrate the theoretical connections of different categories, and introduce the commonly used datasets and evaluation metrics. Finally, we summarize the whole literature and raise some future directions for OOD generalization problem. The summary of OOD generalization methods reviewed in this survey can be found at //out-of-distribution-generalization.com.
Deep learning models on graphs have achieved remarkable performance in various graph analysis tasks, e.g., node classification, link prediction and graph clustering. However, they expose uncertainty and unreliability against the well-designed inputs, i.e., adversarial examples. Accordingly, various studies have emerged for both attack and defense addressed in different graph analysis tasks, leading to the arms race in graph adversarial learning. For instance, the attacker has poisoning and evasion attack, and the defense group correspondingly has preprocessing- and adversarial- based methods. Despite the booming works, there still lacks a unified problem definition and a comprehensive review. To bridge this gap, we investigate and summarize the existing works on graph adversarial learning tasks systemically. Specifically, we survey and unify the existing works w.r.t. attack and defense in graph analysis tasks, and give proper definitions and taxonomies at the same time. Besides, we emphasize the importance of related evaluation metrics, and investigate and summarize them comprehensively. Hopefully, our works can serve as a reference for the relevant researchers, thus providing assistance for their studies. More details of our works are available at //github.com/gitgiter/Graph-Adversarial-Learning.
The concept of smart grid has been introduced as a new vision of the conventional power grid to figure out an efficient way of integrating green and renewable energy technologies. In this way, Internet-connected smart grid, also called energy Internet, is also emerging as an innovative approach to ensure the energy from anywhere at any time. The ultimate goal of these developments is to build a sustainable society. However, integrating and coordinating a large number of growing connections can be a challenging issue for the traditional centralized grid system. Consequently, the smart grid is undergoing a transformation to the decentralized topology from its centralized form. On the other hand, blockchain has some excellent features which make it a promising application for smart grid paradigm. In this paper, we have an aim to provide a comprehensive survey on application of blockchain in smart grid. As such, we identify the significant security challenges of smart grid scenarios that can be addressed by blockchain. Then, we present a number of blockchain-based recent research works presented in different literatures addressing security issues in the area of smart grid. We also summarize several related practical projects, trials, and products that have been emerged recently. Finally, we discuss essential research challenges and future directions of applying blockchain to smart grid security issues.