午夜剧场成年免费视,97国产精品无码免费视频

The privacy of personal information has received significant attention in mobile software. Although previous researchers have designed some methods to identify the conflict between app behavior and privacy policies, little is known about investigating regulation requirements for third-party libraries (TPLs). The regulators enacted multiple regulations to regulate the usage of personal information for TPLs (e.g., the "California Consumer Privacy Act" requires businesses clearly notify consumers if they share consumers' data with third parties or not). However, it remains challenging to analyze the legality of TPLs due to three reasons: 1) TPLs are mainly published on public repositoriesinstead of app market (e.g., Google play). The public repositories do not perform privacy compliance analysis for each TPL. 2) TPLs only provide independent functions or function sequences. They cannot run independently, which limits the application of performing dynamic analysis. 3) Since not all the functions of TPLs are related to user privacy, we must locate the functions of TPLs that access/process personal information before performing privacy compliance analysis. To overcome the above challenges, in this paper, we propose an automated system named ATPChecker to analyze whether the Android TPLs meet privacy-related regulations or not. Our findings remind developers to be mindful of TPL usage when developing apps or writing privacy policies to avoid violating regulations.

相關內容

INFORMS

關注 10

《計算機信息》雜志發表高質量的論文，擴大了運籌學和計算的范圍，尋求有關理論、方法、實驗、系統和應用方面的原創研究論文、新穎的調查和教程論文，以及描述新的和有用的軟件工具的論文。官網鏈接： · INFORMS · 講稿 · Performer · TOOLS ·

2023 年 7 月 17 日

Probabilistic Counters for Privacy Preserving Data Aggregation

Dominik Bojko,Krzysztof Grining,Marek Klonowski

Probabilistic counters are well-known tools often used for space-efficient set cardinality estimation. In this paper, we investigate probabilistic counters from the perspective of preserving privacy. We use the standard, rigid differential privacy notion. The intuition is that the probabilistic counters do not reveal too much information about individuals but provide only general information about the population. Therefore, they can be used safely without violating the privacy of individuals. However, it turned out, that providing a precise, formal analysis of the privacy parameters of probabilistic counters is surprisingly difficult and needs advanced techniques and a very careful approach. We demonstrate that probabilistic counters can be used as a privacy protection mechanism without extra randomization. Namely, the inherent randomization from the protocol is sufficient for protecting privacy, even if the probabilistic counter is used multiple times. In particular, we present a specific privacy-preserving data aggregation protocol based on Morris Counter and MaxGeo Counter. Some of the presented results are devoted to counters that have not been investigated so far from the perspective of privacy protection. Another part is an improvement of previous results. We show how our results can be used to perform distributed surveys and compare the properties of counter-based solutions and a standard Laplace method.

CASE · INFORMS · Processing（編程語言） · Continuity · Analysis ·

2023 年 7 月 17 日

Visualising Personal Data Flows: Insights from a Case Study of Booking.com

Haiyue Yuan,Matthew Boakes,Xiao Ma,Dongmei Cao,Shujun Li

from arxiv, This is the full edition of a paper published in Intelligent Information Systems: CAiSE Forum 2023, Zaragoza, Spain, June 12-16, 2023, Proceedings, Lecture Notes in Business Information Processing (LNBIP), Volume 477, pp. 52-60, 2023, Springer Nature, //link.springer.com/book/10.1007/978-3-031-34674-3

Commercial organisations are holding and processing an ever-increasing amount of personal data. Policies and laws are continually changing to require these companies to be more transparent regarding the collection, storage, processing and sharing of this data. This paper reports our work of taking Booking.com as a case study to visualise personal data flows extracted from their privacy policy. By showcasing how the company shares its consumers' personal data, we raise questions and extend discussions on the challenges and limitations of using privacy policies to inform online users about the true scale and the landscape of personal data flows. This case study can inform us about future research on more data flow-oriented privacy policy analysis and on the construction of a more comprehensive ontology on personal data flows in complicated business ecosystems.

INFORMS · 可理解性 · 語言模型化 · 代碼 · TOOLS ·

2023 年 7 月 17 日

In-IDE Generation-based Information Support with a Large Language Model

Daye Nam,Andrew Macvean,Vincent Hellendoorn,Bogdan Vasilescu,Brad Myers

Developers often face challenges in code understanding, which is crucial for building and maintaining high-quality software systems. Code comments and documentation can provide some context for the code, but are often scarce or missing. This challenge has become even more pressing with the rise of large language model (LLM) based code generation tools. To understand unfamiliar code, most software developers rely on general-purpose search engines to search through various programming information resources, which often requires multiple iterations of query rewriting and information foraging. More recently, developers have turned to online chatbots powered by LLMs, such as ChatGPT, which can provide more customized responses but also incur more overhead as developers need to communicate a significant amount of context to the LLM via a textual interface. In this study, we provide the investigation of an LLM-based conversational UI in the IDE. We aim to understand the promises and obstacles for tools powered by LLMs that are contextually aware, in that they automatically leverage the developer's programming context to answer queries. To this end, we develop an IDE Plugin that allows users to query back-ends such as OpenAI's GPT-3.5 and GPT-4 with high-level requests, like: explaining a highlighted section of code, explaining key domain-specific terms, or providing usage examples for an API. We conduct an exploratory user study with 32 participants to understand the usefulness and effectiveness, as well as individual preferences in the usage of, this LLM-powered information support tool. The study confirms that this approach can aid code understanding more effectively than web search, but the degree of the benefit differed by participants' experience levels.

數據集 · Analysis · MoDELS · state-of-the-art · Performer ·

2023 年 7 月 16 日

Analyzing Dataset Annotation Quality Management in the Wild

Jan-Christoph Klie,Richard Eckart de Castilho,Iryna Gurevych

Data quality is crucial for training accurate, unbiased, and trustworthy machine learning models and their correct evaluation. Recent works, however, have shown that even popular datasets used to train and evaluate state-of-the-art models contain a non-negligible amount of erroneous annotations, bias or annotation artifacts. There exist best practices and guidelines regarding annotation projects. But to the best of our knowledge, no large-scale analysis has been performed as of yet on how quality management is actually conducted when creating natural language datasets and whether these recommendations are followed. Therefore, we first survey and summarize recommended quality management practices for dataset creation as described in the literature and provide suggestions on how to apply them. Then, we compile a corpus of 591 scientific publications introducing text datasets and annotate it for quality-related aspects, such as annotator management, agreement, adjudication or data validation. Using these annotations, we then analyze how quality management is conducted in practice. We find that a majority of the annotated publications apply good or very good quality management. However, we deem the effort of 30% of the works as only subpar. Our analysis also shows common errors, especially with using inter-annotator agreement and computing annotation error rates.

Processing（編程語言） · Agent · MoDELS · Engineering · Seven ·

2023 年 7 月 16 日

Communicative Agents for Software Development

Chen Qian,Xin Cong,Cheng Yang,Weize Chen,Yusheng Su,Juyuan Xu,Zhiyuan Liu,Maosong Sun

from arxiv, 25 pages, 9 figures, 2 tables

Software engineering is a domain characterized by intricate decision-making processes, often relying on nuanced intuition and consultation. Recent advancements in deep learning have started to revolutionize software engineering practices through elaborate designs implemented at various stages of software development. In this paper, we present an innovative paradigm that leverages large language models (LLMs) throughout the entire software development process, streamlining and unifying key processes through natural language communication, thereby eliminating the need for specialized models at each phase. At the core of this paradigm lies ChatDev, a virtual chat-powered software development company that mirrors the established waterfall model, meticulously dividing the development process into four distinct chronological stages: designing, coding, testing, and documenting. Each stage engages a team of agents, such as programmers, code reviewers, and test engineers, fostering collaborative dialogue and facilitating a seamless workflow. The chat chain acts as a facilitator, breaking down each stage into atomic subtasks. This enables dual roles, allowing for proposing and validating solutions through context-aware communication, leading to efficient resolution of specific subtasks. The instrumental analysis of ChatDev highlights its remarkable efficacy in software generation, enabling the completion of the entire software development process in under seven minutes at a cost of less than one dollar. It not only identifies and alleviates potential vulnerabilities but also rectifies potential hallucinations while maintaining commendable efficiency and cost-effectiveness. The potential of ChatDev unveils fresh possibilities for integrating LLMs into the realm of software development.

CRAFT · 可辨認的 · 設計 · 單元 · 講稿 ·

2023 年 7 月 15 日

Privately Policing Dark Patterns

Gregory M. Dickinson

Lawmakers around the country are crafting new laws to target "dark patterns" -- user interface designs that trick or coerce users into enabling cell phone location tracking, sharing browsing data, initiating automatic billing, or making whatever other choices their designers prefer. Dark patterns pose a serious problem. In their most aggressive forms, they interfere with human autonomy, undermine customers' evaluation and selection of products, and distort online markets for goods and services. Yet crafting legislation is a major challenge: Persuasion and deception are difficult to distinguish, and shifting tech trends present an ever-moving target. To address these challenges, this Article proposes leveraging state private law to define and track dark patterns as they evolve. Judge-crafted decisional law can respond quickly to new techniques, flexibly define the boundary between permissible and impermissible designs, and bolster state and federal regulatory enforcement efforts by quickly identifying those designs that most undermine user autonomy.

層 · INFORMS · 統計量 · 系統架構 · 控制器 ·

2023 年 7 月 13 日

Data Behind the Walls An Advanced Architecture for Data Privacy Management

Amen Faridoon,M. Tahar Kechadi

from arxiv, 7 pages

In today's highly connected society, we are constantly asked to provide personal information to retailers, voter surveys, medical professionals, and other data collection efforts. The collected data is stored in large data warehouses. Organisations and statistical agencies share and use this data to facilitate research in public health, economics, sociology, etc. However, this data contains sensitive information about individuals, which can result in identity theft, financial loss, stress and depression, embarrassment, abuse, etc. Therefore, one must ensure rigorous management of individuals' privacy. We propose, an advanced data privacy management architecture composed of three layers. The data management layer consists of de-identification and anonymisation, the access management layer for re-enforcing data access based on the concepts of Role-Based Access Control and the Chinese Wall Security Policy, and the roles layer for regulating different users. The proposed system architecture is validated on healthcare datasets.

MoDELS · Performer · 模型選擇 · 表示容量 · Principle ·

2023 年 7 月 13 日

Introducing Foundation Models as Surrogate Models: Advancing Towards More Practical Adversarial Attacks

Jiaming Zhang,Jitao Sang,Qi Yi

Recently, the no-box adversarial attack, in which the attacker lacks access to the model's architecture, weights, and training data, become the most practical and challenging attack setup. However, there is an unawareness of the potential and flexibility inherent in the surrogate model selection process on no-box setting. Inspired by the burgeoning interest in utilizing foundational models to address downstream tasks, this paper adopts an innovative idea that 1) recasting adversarial attack as a downstream task. Specifically, image noise generation to meet the emerging trend and 2) introducing foundational models as surrogate models. Harnessing the concept of non-robust features, we elaborate on two guiding principles for surrogate model selection to explain why the foundational model is an optimal choice for this role. However, paradoxically, we observe that these foundational models underperform. Analyzing this unexpected behavior within the feature space, we attribute the lackluster performance of foundational models (e.g., CLIP) to their significant representational capacity and, conversely, their lack of discriminative prowess. To mitigate this issue, we propose the use of a margin-based loss strategy for the fine-tuning of foundational models on target images. The experimental results verify that our approach, which employs the basic Fast Gradient Sign Method (FGSM) attack algorithm, outstrips the performance of other, more convoluted algorithms. We conclude by advocating for the research community to consider surrogate models as crucial determinants in the effectiveness of adversarial attacks in no-box settings. The implications of our work bear relevance for improving the efficacy of such adversarial attacks and the overall robustness of AI systems.

表示定理 · Agent · 泛函 · 表示 · 約束 ·

2023 年 7 月 11 日

Sequential Language-based Decisions

Adam Bjorndahl,Joseph Y. Halpern

from arxiv, In Proceedings TARK 2023, arXiv:2307.04005

In earlier work, we introduced the framework of language-based decisions, the core idea of which was to modify Savage's classical decision-theoretic framework by taking actions to be descriptions in some language, rather than functions from states to outcomes, as they are defined classically. Actions had the form "if psi then do(phi)", where psi and phi were formulas in some underlying language, specifying what effects would be brought about under what circumstances. The earlier work allowed only one-step actions. But, in practice, plans are typically composed of a sequence of steps. Here, we extend the earlier framework to sequential actions, making it much more broadly applicable. Our technical contribution is a representation theorem in the classical spirit: agents whose preferences over actions satisfy certain constraints can be modeled as if they are expected utility maximizers. As in the earlier work, due to the language-based specification of the actions, the representation theorem requires a construction not only of the probability and utility functions representing the agent's beliefs and preferences, but also the state and outcomes spaces over which these are defined, as well as a "selection function" which intuitively captures how agents disambiguate coarse descriptions. The (unbounded) depth of action sequencing adds substantial interest (and complexity!) to the proof.

分布式機器學習 · Machine Learning · 學成 · Storage · 優化器 ·

2019 年 9 月 18 日

Distributed Machine Learning on Mobile Devices: A Survey

Renjie Gu,Shuo Yang,Fan Wu

In recent years, mobile devices have gained increasingly development with stronger computation capability and larger storage. Some of the computation-intensive machine learning and deep learning tasks can now be run on mobile devices. To take advantage of the resources available on mobile devices and preserve users' privacy, the idea of mobile distributed machine learning is proposed. It uses local hardware resources and local data to solve machine learning sub-problems on mobile devices, and only uploads computation results instead of original data to contribute to the optimization of the global model. This architecture can not only relieve computation and storage burden on servers, but also protect the users' sensitive information. Another benefit is the bandwidth reduction, as various kinds of local data can now participate in the training process without being uploaded to the server. In this paper, we provide a comprehensive survey on recent studies of mobile distributed machine learning. We survey a number of widely-used mobile distributed machine learning methods. We also present an in-depth discussion on the challenges and future directions in this area. We believe that this survey can demonstrate a clear overview of mobile distributed machine learning and provide guidelines on applying mobile distributed machine learning to real applications.