Smart cities often rely on technological innovation to improve citizens' safety and quality of life. This paper presents a novel smart mobility system that aims to facilitate people accessing public mobility while preserving their privacy. The system is based on a zero interaction approach whereby a person can use public transport services without any need to perform explicit actions. Operations related to ticket purchase and validation have been fully automated. The system is also designed with the privacy-by-design paradigm in mind, to preserve user privacy as much as possible. Throughout the paper several technical details are discussed as well to describe a prototype version of the system that was implemented. The prototype has been successfully tested in the city of Imola (Emilia Romagna, Italy) in order to prove the system validity on the field.
Post-processing immunity is a fundamental property of differential privacy: it enables arbitrary data-independent transformations to differentially private outputs without affecting their privacy guarantees. Post-processing is routinely applied in data-release applications, including census data, which are then used to make allocations with substantial societal impacts. This paper shows that post-processing causes disparate impacts on individuals or groups and analyzes two critical settings: the release of differentially private datasets and the use of such private datasets for downstream decisions, such as the allocation of funds informed by US Census data. In the first setting, the paper proposes tight bounds on the unfairness of traditional post-processing mechanisms, giving a unique tool to decision-makers to quantify the disparate impacts introduced by their release. In the second setting, this paper proposes a novel post-processing mechanism that is (approximately) optimal under different fairness metrics, either reducing fairness issues substantially or reducing the cost of privacy. The theoretical analysis is complemented with numerical simulations on Census data.
Millions of consumers depend on smart camera systems to remotely monitor their homes and businesses. However, the architecture and design of popular commercial systems require users to relinquish control of their data to untrusted third parties, such as service providers (e.g., the cloud). Third parties therefore can (and in some instances have) access the video footage without the users' knowledge or consent -- violating the core tenet of user privacy. In this paper, we present CaCTUs, a privacy-preserving smart Camera system Controlled Totally by Users. CaCTUs returns control to the user; the root of trust begins with the user and is maintained through a series of cryptographic protocols, designed to support popular features, such as sharing, deleting, and viewing videos live. We show that the system can support live streaming with a latency of 2s at a frame rate of 10fps and a resolution of 480p. In so doing, we demonstrate that it is feasible to implement a performant smart-camera system that leverages the convenience of a cloud-based model while retaining the ability to control access to (private) data.
Within process mining, a relevant activity is conformance checking. Such activity consists of establishing the extent to which actual executions of a process conform the expected behavior of a reference model. Current techniques focus on prescriptive models of the control-flow as references. In certain scenarios, however, a prescriptive model might not be available and, additionally, the control-flow perspective might not be ideal for this purpose. This paper tackles these two problems by suggesting a conformance approach that uses a descriptive model (i.e., a pattern of the observed behavior over a certain amount of time) which is not necessarily referring to the control-flow (e.g., it can be based on the social network of handover of work). Additionally, the entire approach can work both offline and online, thus providing feedback in real time. The approach, which is implemented in ProM, has been tested and results from 3 experiments with real world as well as synthetic data are reported.
One of the most prominent and widely-used blockchain privacy solutions are zero-knowledge proof (ZKP) mixers operating on top of smart contract-enabled blockchains. ZKP mixers typically advertise their level of privacy through a so-called anonymity set size, similar to k-anonymity, where a user hides among a set of $k$ other users. In reality, however, these anonymity set claims are mostly inaccurate, as we find through empirical measurements of the currently most active ZKP mixers. We propose five heuristics that, in combination, can increase the probability that an adversary links a withdrawer to the correct depositor on average by 51.94% (108.63%) on the most popular Ethereum (ETH) and Binance Smart Chain (BSC) mixer, respectively. Our empirical evidence is hence also the first to suggest a differing privacy-predilection of users on ETH and BSC. We further identify 105 Decentralized Finance (DeFi) attackers leveraging ZKP mixers as the initial funds and to deposit attack revenue (e.g., from phishing scams, hacking centralized exchanges, and blockchain project attacks). State-of-the-art mixers are moreover tightly intertwined with the growing DeFi ecosystem by offering ``anonymity mining'' (AM) incentives, i.e., mixer users receive monetary rewards for mixing coins. However, contrary to the claims of related work, we find that AM does not always contribute to improving the quality of an anonymity set size of a mixer, because AM tends to attract privacy-ignorant users naively reusing addresses.
Cross-Blockchain communication has gained traction due to the increasing fragmentation of blockchain networks and scalability solutions such as side-chaining and sharding. With SmartSync, we propose a novel concept for cross-blockchain smart contract interactions that creates client contracts on arbitrary blockchain networks supporting the same execution environment. Client contracts mirror the logic and state of the original instance and enable seamless on-chain function executions providing recent states. Synchronized contracts supply instant read-only function calls to other applications hosted on the target blockchain. Hereby, current limitations in cross-chain communication are alleviated and new forms of contract interactions are enabled. State updates are transmitted in a verifiable manner using Merkle proofs and do not require trusted intermediaries. To permit lightweight synchronizations, we introduce transition confirmations that facilitate the application of verifiable state transitions without re-executing transactions of the source blockchain. We prove the concept's soundness by providing a prototypical implementation that enables smart contract forks, state synchronizations, and on-chain validation on EVM-compatible blockchains. Our evaluation demonstrates SmartSync's applicability for presented use cases providing access to recent states to third-party contracts on the target blockchain. Execution costs scale sub-linearly with the number of value updates and depend on the depth and index of corresponding Merkle proofs.
As data are increasingly being stored in different silos and societies becoming more aware of data privacy issues, the traditional centralized training of artificial intelligence (AI) models is facing efficiency and privacy challenges. Recently, federated learning (FL) has emerged as an alternative solution and continue to thrive in this new reality. Existing FL protocol design has been shown to be vulnerable to adversaries within or outside of the system, compromising data privacy and system robustness. Besides training powerful global models, it is of paramount importance to design FL systems that have privacy guarantees and are resistant to different types of adversaries. In this paper, we conduct the first comprehensive survey on this topic. Through a concise introduction to the concept of FL, and a unique taxonomy covering: 1) threat models; 2) poisoning attacks and defenses against robustness; 3) inference attacks and defenses against privacy, we provide an accessible review of this important topic. We highlight the intuitions, key techniques as well as fundamental assumptions adopted by various attacks and defenses. Finally, we discuss promising future research directions towards robust and privacy-preserving federated learning.
News recommendation aims to display news articles to users based on their personal interest. Existing news recommendation methods rely on centralized storage of user behavior data for model training, which may lead to privacy concerns and risks due to the privacy-sensitive nature of user behaviors. In this paper, we propose a privacy-preserving method for news recommendation model training based on federated learning, where the user behavior data is locally stored on user devices. Our method can leverage the useful information in the behaviors of massive number users to train accurate news recommendation models and meanwhile remove the need of centralized storage of them. More specifically, on each user device we keep a local copy of the news recommendation model, and compute gradients of the local model based on the user behaviors in this device. The local gradients from a group of randomly selected users are uploaded to server, which are further aggregated to update the global model in the server. Since the model gradients may contain some implicit private information, we apply local differential privacy (LDP) to them before uploading for better privacy protection. The updated global model is then distributed to each user device for local model update. We repeat this process for multiple rounds. Extensive experiments on a real-world dataset show the effectiveness of our method in news recommendation model training with privacy protection.
Most existing recommender systems leverage user behavior data of one type only, such as the purchase behavior in E-commerce that is directly related to the business KPI (Key Performance Indicator) of conversion rate. Besides the key behavioral data, we argue that other forms of user behaviors also provide valuable signal, such as views, clicks, adding a product to shop carts and so on. They should be taken into account properly to provide quality recommendation for users. In this work, we contribute a new solution named NMTR (short for Neural Multi-Task Recommendation) for learning recommender systems from user multi-behavior data. We develop a neural network model to capture the complicated and multi-type interactions between users and items. In particular, our model accounts for the cascading relationship among different types of behaviors (e.g., a user must click on a product before purchasing it). To fully exploit the signal in the data of multiple types of behaviors, we perform a joint optimization based on the multi-task learning framework, where the optimization on a behavior is treated as a task. Extensive experiments on two real-world datasets demonstrate that NMTR significantly outperforms state-of-the-art recommender systems that are designed to learn from both single-behavior data and multi-behavior data. Further analysis shows that modeling multiple behaviors is particularly useful for providing recommendation for sparse users that have very few interactions.
Privacy is a major good for users of personalized services such as recommender systems. When applied to the field of health informatics, privacy concerns of users may be amplified, but the possible utility of such services is also high. Despite availability of technologies such as k-anonymity, differential privacy, privacy-aware recommendation, and personalized privacy trade-offs, little research has been conducted on the users' willingness to share health data for usage in such systems. In two conjoint-decision studies (sample size n=521), we investigate importance and utility of privacy-preserving techniques related to sharing of personal health data for k-anonymity and differential privacy. Users were asked to pick a preferred sharing scenario depending on the recipient of the data, the benefit of sharing data, the type of data, and the parameterized privacy. Users disagreed with sharing data for commercial purposes regarding mental illnesses and with high de-anonymization risks but showed little concern when data is used for scientific purposes and is related to physical illnesses. Suggestions for health recommender system development are derived from the findings.
With the ever-growing volume, complexity and dynamicity of online information, recommender system has been an effective key solution to overcome such information overload. In recent years, deep learning's revolutionary advances in speech recognition, image analysis and natural language processing have gained significant attention. Meanwhile, recent studies also demonstrate its effectiveness in coping with information retrieval and recommendation tasks. Applying deep learning techniques into recommender system has been gaining momentum due to its state-of-the-art performances and high-quality recommendations. In contrast to traditional recommendation models, deep learning provides a better understanding of user's demands, item's characteristics and historical interactions between them. This article aims to provide a comprehensive review of recent research efforts on deep learning based recommender systems towards fostering innovations of recommender system research. A taxonomy of deep learning based recommendation models is presented and used to categorize the surveyed articles. Open problems are identified based on the analytics of the reviewed works and potential solutions discussed.