Wearable devices generate various types of physiological data about individuals. These data can provide medical researchers and clinicians with valuable insights that cannot be obtained through traditional measures. Researchers have historically relied on survey responses or observed behavior. Interestingly, physiological data can offer richer insight into user cognition than any other source, including the users themselves. Therefore, inexpensive consumer-grade wearable devices have become a point of interest for health researchers. They are also used for continuous remote health monitoring and, in some cases, by insurance companies. However, the biggest concern in such use cases is the privacy of individuals. A few privacy mechanisms, such as abstraction and k-anonymity, are widely used in information systems. Recently, Differential Privacy (DP) has emerged as a proficient technique for publishing privacy-sensitive data, including data from wearable devices. In this paper, we conduct a Systematic Literature Review (SLR) to identify, select, and critically appraise research on DP, and to understand the different techniques and existing uses of DP in wearable data publishing. Based on our study, we identify the limitations of the proposed solutions and provide future directions.
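For readers unfamiliar with DP, the sketch below illustrates its canonical building block, the Laplace mechanism, applied to a summary statistic a wearable might report. It is a minimal illustration only; the heart-rate values, bounds, and epsilon are assumptions, not taken from any surveyed work.

```python
# Minimal sketch of the Laplace mechanism for releasing a private mean.
# All values (heart rates, bounds, epsilon) are illustrative assumptions.
import numpy as np

def laplace_mean(values, lower, upper, epsilon):
    """Release a differentially private mean of bounded values."""
    values = np.clip(values, lower, upper)       # bound each record's influence
    sensitivity = (upper - lower) / len(values)  # L1 sensitivity of the mean
    noise = np.random.laplace(scale=sensitivity / epsilon)
    return values.mean() + noise

# Synthetic daily resting heart rates from a wearable device.
heart_rates = np.array([62, 58, 71, 66, 74, 69, 60], dtype=float)
print(laplace_mean(heart_rates, lower=40, upper=120, epsilon=0.5))
```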
With the rapid development of information technology, online platforms have produced enormous text resources. As a particular form of Information Extraction (IE), Event Extraction (EE) has gained increasing popularity due to its ability to automatically extract events from human language. However, literature surveys on event extraction are limited. Existing review works either spend much effort describing the details of various approaches or focus on a particular field. This study provides a comprehensive overview of state-of-the-art event extraction methods and their applications from text, covering both closed-domain and open-domain event extraction. A distinguishing trait of this survey is that it provides an overview at a moderate level of complexity, avoiding excessive detail on particular approaches. It focuses on the common characteristics, application fields, advantages, and disadvantages of representative works, rather than the specifics of individual approaches. Finally, we summarize the common issues, current solutions, and future research directions. We hope this work helps researchers and practitioners obtain a quick overview of recent event extraction research.
Large organizations that collect data about populations (like the US Census Bureau) release summary statistics that are used by multiple stakeholders for resource allocation and policy making. These organizations are also legally required to protect the privacy of the individuals from whom they collect data. Differential Privacy (DP) provides a solution for releasing useful summary data while preserving privacy. Most DP mechanisms are designed to answer a single set of queries. In reality, there are often multiple stakeholders that use a given data release and have overlapping, but not identical, queries. This introduces a novel joint optimization problem in DP where the privacy budget must be shared among different analysts. We initiate the study of DP query answering across multiple analysts. To capture the competing goals and priorities of multiple analysts, we formulate three desiderata that any mechanism should satisfy in this setting -- the Sharing Incentive, Non-Interference, and Adaptivity -- while still optimizing for overall error. We demonstrate how existing DP query answering mechanisms in the multi-analyst setting fail to satisfy at least one of these desiderata. We present novel DP algorithms that provably satisfy all of our desiderata and empirically show that they incur low error on realistic tasks.
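As a point of reference for the multi-analyst setting, the sketch below shows a naive baseline that splits a total privacy budget equally across analysts and answers each analyst's counting queries with the Laplace mechanism. The paper's mechanisms are more sophisticated; the query predicates, analyst names, and epsilon here are illustrative assumptions.

```python
# Naive baseline only: equal budget split across analysts, Laplace noise per
# counting query. Analyst names, predicates, and epsilon are assumptions.
import numpy as np

def answer_multi_analyst(data, analyst_queries, total_epsilon):
    """data: 1-D array of records; analyst_queries: {analyst: [predicate, ...]}."""
    eps_per_analyst = total_epsilon / len(analyst_queries)
    answers = {}
    for analyst, predicates in analyst_queries.items():
        eps_per_query = eps_per_analyst / len(predicates)  # sequential composition
        answers[analyst] = [
            np.sum(pred(data)) + np.random.laplace(scale=1.0 / eps_per_query)
            for pred in predicates                          # each count has sensitivity 1
        ]
    return answers

ages = np.array([23, 35, 41, 29, 67, 52, 18, 74])
queries = {
    "analyst_A": [lambda d: d < 30, lambda d: d >= 65],     # overlapping interests
    "analyst_B": [lambda d: d >= 65],
}
print(answer_multi_analyst(ages, queries, total_epsilon=1.0))
```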
The dissemination and reach of scientific knowledge have increased at a blistering pace. In this context, e-print servers have played a central role by providing scientists with a rapid and open mechanism for disseminating research without having to wait for the (lengthy) peer-review process. While helping the scientific community in several ways, e-print servers also provide science communicators and the general public with access to a wealth of knowledge without having to pay hefty subscription fees. Arguably, e-print servers' value has never been so evident, for better or worse, as during the COVID-19 pandemic. This motivates us to study how e-print servers are positioned within the greater Web, and how they are "used" by Web communities. Using data from Reddit (2005-2021) and 4chan's Politically Incorrect board (2016-2021), we uncover a surprisingly diverse set of communities discussing e-print papers. We find that real-world events and distinct factors influence which e-prints people talk about. For instance, there was a sudden increase in the discussion of e-prints, corresponding to a surge in COVID-19-related research, in the first phase of the pandemic. We also find a substantial difference between the conversation around e-prints and their actual content; in fact, e-prints are often exploited to further conspiracy theories and/or extremist ideology. Overall, our work highlights the need to quickly and effectively validate non-peer-reviewed e-prints that receive substantial press or social media coverage, as well as to mitigate wrongful interpretations of scientific outputs.
As data are increasingly stored in different silos and societies become more aware of data privacy issues, the traditional centralized training of artificial intelligence (AI) models is facing efficiency and privacy challenges. Recently, federated learning (FL) has emerged as an alternative solution and continues to thrive in this new reality. Existing FL protocol designs have been shown to be vulnerable to adversaries within or outside of the system, compromising data privacy and system robustness. Besides training powerful global models, it is of paramount importance to design FL systems that have privacy guarantees and are resistant to different types of adversaries. In this paper, we conduct the first comprehensive survey on this topic. Through a concise introduction to the concept of FL and a unique taxonomy covering: 1) threat models; 2) poisoning attacks against robustness and their defenses; 3) inference attacks against privacy and their defenses, we provide an accessible review of this important topic. We highlight the intuitions, key techniques, and fundamental assumptions adopted by various attacks and defenses. Finally, we discuss promising future research directions towards robust and privacy-preserving federated learning.
Over the past few years, we have seen fundamental breakthroughs in core problems in machine learning, largely driven by advances in deep neural networks. At the same time, the amount of data collected in a wide array of scientific domains is dramatically increasing in both size and complexity. Taken together, this suggests many exciting opportunities for deep learning applications in scientific settings. But a significant challenge is simply knowing where to start. The sheer breadth and diversity of deep learning techniques make it difficult to determine which scientific problems might be most amenable to these methods, or which specific combination of methods might offer the most promising first approach. In this survey, we focus on addressing this central issue by providing an overview of many widely used deep learning models, spanning visual, sequential, and graph-structured data, their associated tasks, and different training methods, along with techniques for using deep learning with less data and for better interpreting these complex models -- two central considerations for many scientific use cases. We also include overviews of the full design process, implementation tips, and links to a plethora of tutorials, research summaries, and open-sourced deep learning pipelines and pretrained models developed by the community. We hope that this survey will help accelerate the use of deep learning across different scientific domains.
Federated learning has emerged as a promising approach for paving the last mile of artificial intelligence, due to its great potential for solving the data isolation problem in large-scale machine learning. In particular, considering the heterogeneity of practical edge computing systems, asynchronous edge-cloud collaborative federated learning can further improve learning efficiency by significantly reducing the straggler effect. Although no raw data is shared, the open architecture and extensive collaborations of asynchronous federated learning (AFL) still give malicious participants ample opportunity to infer other parties' training data, leading to serious privacy concerns. To achieve a rigorous privacy guarantee with high utility, we investigate how to secure asynchronous edge-cloud collaborative federated learning with differential privacy (DP), focusing on the impact of DP on the model convergence of AFL. Formally, we give the first analysis of the model convergence of AFL under DP and propose a multi-stage adjustable private algorithm (MAPA) to improve the trade-off between model utility and privacy by dynamically adjusting both the noise scale and the learning rate. Through extensive simulations and real-world experiments on an edge-cloud testbed, we demonstrate that MAPA significantly improves both model accuracy and convergence speed with a sufficient privacy guarantee.
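To make the idea of stage-wise adjustment concrete, the sketch below shows a generic DP-SGD-style loop in which both the noise scale and the learning rate shrink from stage to stage. It is not the authors' MAPA algorithm; the schedules, model, and data are illustrative assumptions.

```python
# Sketch of stage-wise adjustment of noise scale and learning rate in a
# differentially private training loop. NOT the MAPA algorithm itself;
# schedules, model, and data are assumptions made for illustration.
import numpy as np

def dp_sgd_multistage(X, y, stages=3, steps_per_stage=200,
                      lr0=0.1, sigma0=4.0, clip=1.0, decay=0.5, seed=0):
    rng = np.random.default_rng(seed)
    w = np.zeros(X.shape[1])
    for s in range(stages):
        lr, sigma = lr0 * decay**s, sigma0 * decay**s       # shrink both per stage
        for _ in range(steps_per_stage):
            i = rng.integers(len(X))
            grad = (X[i] @ w - y[i]) * X[i]                  # per-example squared-loss gradient
            grad *= min(1.0, clip / (np.linalg.norm(grad) + 1e-12))  # clip
            grad += rng.normal(scale=sigma * clip, size=grad.shape)  # Gaussian noise
            w -= lr * grad
    return w

X = np.random.default_rng(1).normal(size=(500, 5))
w_true = np.array([1.0, -2.0, 0.5, 0.0, 3.0])
y = X @ w_true
print(dp_sgd_multistage(X, y))
```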
Alternating Direction Method of Multipliers (ADMM) is a widely used tool for machine learning in distributed settings, where a machine learning model is trained over distributed data sources through an iterative process of local computation and message passing. Such an iterative process can raise privacy concerns for data owners. The goal of this paper is to provide differential privacy for ADMM-based distributed machine learning. Prior approaches to differentially private ADMM exhibit low utility under strong privacy guarantees and often assume that the objective functions of the learning problems are smooth and strongly convex. To address these concerns, we propose a novel differentially private ADMM-based distributed learning algorithm called DP-ADMM, which combines an approximate augmented Lagrangian function with time-varying Gaussian noise addition in the iterative process to achieve higher utility for general objective functions under the same differential privacy guarantee. We also apply the moments accountant method to bound the end-to-end privacy loss. The theoretical analysis shows that DP-ADMM can be applied to a wider class of distributed learning problems, is provably convergent, and offers an explicit utility-privacy tradeoff. To our knowledge, this is the first paper to provide explicit convergence and utility guarantees for differentially private ADMM-based distributed learning algorithms. The evaluation results demonstrate that our approach achieves good convergence and model accuracy under a strong end-to-end differential privacy guarantee.
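The sketch below conveys the flavor of adding time-varying Gaussian noise inside an iterative ADMM loop, using a plain consensus ADMM for distributed least squares. It is not the DP-ADMM algorithm itself; the noise schedule, penalty parameter, and data are assumptions made for illustration.

```python
# Illustrative sketch: consensus ADMM for distributed least squares with
# time-varying Gaussian noise added to each party's local update. NOT the
# paper's DP-ADMM; noise schedule, rho, and data are assumptions.
import numpy as np

def noisy_consensus_admm(parts, dim, rho=1.0, iters=50, sigma0=0.5, seed=0):
    rng = np.random.default_rng(seed)
    z = np.zeros(dim)
    u = [np.zeros(dim) for _ in parts]
    x = [np.zeros(dim) for _ in parts]
    for t in range(iters):
        sigma_t = sigma0 / (t + 1)                           # decaying noise scale
        for i, (Xi, yi) in enumerate(parts):
            A = Xi.T @ Xi + rho * np.eye(dim)                # local least-squares step
            b = Xi.T @ yi + rho * (z - u[i])
            x[i] = np.linalg.solve(A, b) + rng.normal(scale=sigma_t, size=dim)
        z = np.mean([x[i] + u[i] for i in range(len(parts))], axis=0)  # consensus
        for i in range(len(parts)):
            u[i] += x[i] - z                                 # dual update
    return z

rng = np.random.default_rng(2)
w_true = np.array([1.0, -1.0, 2.0])
parts = []
for _ in range(4):                                           # four data owners
    Xi = rng.normal(size=(100, 3))
    parts.append((Xi, Xi @ w_true + 0.1 * rng.normal(size=100)))
print(noisy_consensus_admm(parts, dim=3))
```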
Machine learning is a widely used method for generating predictions, and these predictions are more accurate when the model is trained on a larger dataset. On the other hand, the data is usually divided among different entities. For privacy reasons, the training can be done locally, and the models can then be safely aggregated among the participants. However, if there are only two participants in \textit{Collaborative Learning}, the safe aggregation loses its power, since the output of the training already contains much information about the participants. To resolve this issue, the participants must employ privacy-preserving mechanisms, which inevitably affect the accuracy of the model. In this paper, we model the training process as a two-player game in which each player aims to achieve higher accuracy while preserving its privacy. We introduce the notion of \textit{Price of Privacy}, a novel approach to measuring the effect of privacy protection on the accuracy of the model. We develop a theoretical model for different player types, and we either find or prove the existence of a Nash Equilibrium under certain assumptions. Moreover, we confirm these assumptions via a Recommendation Systems use case: for a specific learning algorithm, we apply three privacy-preserving mechanisms to two real-world datasets. Finally, as complementary work to the designed game, we interpolate the relationship between privacy and accuracy for this use case and present three other methods to approximate it in a real-world scenario.
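One plausible way to quantify such an accuracy cost, sketched below, is to compare the same model evaluated with and without a simple parameter-perturbation mechanism. The paper's formal definition of the Price of Privacy may differ, and the dataset, noise scale, and mechanism here are assumptions.

```python
# Hedged sketch: measure the relative accuracy drop caused by a privacy
# mechanism (here, Gaussian perturbation of the shared model parameters).
# The paper's exact "Price of Privacy" definition may differ.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=2000, n_features=10, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

clf = LogisticRegression(max_iter=1000).fit(X_tr, y_tr)
acc_plain = clf.score(X_te, y_te)

# Perturb the parameters before "sharing" them (illustrative mechanism only).
clf.coef_ = clf.coef_ + np.random.default_rng(0).normal(scale=0.5, size=clf.coef_.shape)
acc_private = clf.score(X_te, y_te)

price_of_privacy = (acc_plain - acc_private) / acc_plain
print(f"plain={acc_plain:.3f}  private={acc_private:.3f}  PoP={price_of_privacy:.3f}")
```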
Steve Jobs, one of the greatest visionaries of our time, was quoted in 1996 as saying that "a lot of times, people do not know what they want until you show it to them" [38], indicating that he advocated developing products based on human intuition rather than research. With the advancements of mobile devices, social networks, and the Internet of Things, enormous amounts of complex data, both structured and unstructured, are being captured in the hope of allowing organizations to make better business decisions, as data is now vital to an organization's success. These enormous amounts of data are referred to as Big Data, which enables a competitive advantage over rivals when processed and analyzed appropriately. However, Big Data Analytics has a few concerns, including Data-Lifecycle Management, Privacy & Security, and Data Representation. This paper reviews the fundamental concept of Big Data, the Data Storage domain, and the MapReduce programming paradigm used to process these large datasets, and focuses on two case studies showing the effectiveness of Big Data Analytics and how it could be of greater benefit in the future if handled appropriately.
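For readers unfamiliar with the MapReduce paradigm mentioned above, the sketch below shows a minimal in-memory word-count example of the map, shuffle, and reduce phases. Real frameworks such as Hadoop distribute these phases across a cluster; the documents here are illustrative.

```python
# Minimal in-memory sketch of the MapReduce pattern (word count). A real
# framework runs map, shuffle, and reduce across a cluster; this single-process
# version only illustrates the programming model.
from collections import defaultdict

def map_phase(document):
    """Emit (word, 1) pairs for each word in a document."""
    return [(word.lower(), 1) for word in document.split()]

def shuffle(pairs):
    """Group intermediate values by key."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped

def reduce_phase(key, values):
    """Sum the counts for a single word."""
    return key, sum(values)

documents = ["big data enables better decisions",
             "big data requires appropriate processing"]
pairs = [p for doc in documents for p in map_phase(doc)]
print(dict(reduce_phase(k, v) for k, v in shuffle(pairs).items()))
```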