
Brian Belgodere,Pierre Dognin,Adam Ivankay,Igor Melnyk,Youssef Mroueh,Aleksandra Mojsilovic,Jiri Navratil,Apoorva Nitsure,Inkit Padhi,Mattia Rigotti,Jerret Ross,Yair Schiff,Radhika Vedpathak,Richard A. Young

from arxiv, 49 pages; submitted

Data collected from the real world tends to be biased, unbalanced, and at risk of exposing sensitive and private information. This reality has given rise to the idea of creating synthetic datasets to alleviate risk, bias, harm, and privacy concerns inherent in the real data. This concept relies on Generative AI models to produce unbiased, privacy-preserving synthetic data while being true to the real data. In this new paradigm, how can we tell if this approach delivers on its promises? We present an auditing framework that offers a holistic assessment of synthetic datasets and AI models trained on them, centered around bias and discrimination prevention, fidelity to the real data, utility, robustness, and privacy preservation. We showcase our framework by auditing multiple generative models on diverse use cases, including education, healthcare, banking, human resources, and across different modalities, from tabular, to time-series, to natural language. Our use cases demonstrate the importance of a holistic assessment in order to ensure compliance with socio-technical safeguards that regulators and policymakers are increasingly enforcing. For this purpose, we introduce the trust index that ranks multiple synthetic datasets based on their prescribed safeguards and their desired trade-offs. Moreover, we devise a trust-index-driven model selection and cross-validation procedure via auditing in the training loop that we showcase on a class of transformer models that we dub TrustFormers, across different modalities. This trust-driven model selection allows for controllable trust trade-offs in the resulting synthetic data. We instrument our auditing framework with workflows that connect different stakeholders from model development to audit and certification via a synthetic data auditing report.

Related content

MoDELS

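The ACM/IEEE 23rd International Conference on Model Driven Engineering Languages and Systems is the premier conference series for model-driven software and systems engineering, organized with the support of ACM SIGSOFT and IEEE TCSE. Since 1998, MODELS has covered all aspects of modeling, from languages and methods to tools and applications. MODELS attendees come from diverse backgrounds, including researchers, academics, engineers, and industrial professionals. MODELS 2019 is a forum where participants can exchange cutting-edge research results and innovative practical experience around modeling and model-driven software and systems. This year's edition will give the modeling community an opportunity to further advance the foundations of modeling and to present innovative applications of modeling in emerging areas such as cyber-physical systems, embedded systems, socio-technical systems, cloud computing, big data, machine learning, security, open source, and sustainability. Official website:

Peak · Operation · Better · Divergence

June 16, 2023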

Approaching Unanticipated Consequences

Andrew Darby,Pete Sawyer,Nelly Bencomo

from arxiv, 14 pages, 2 figures

In an ever-changing world, even software that fulfils its requirements may have un-envisioned aftereffects with significant impacts. We explored how such impacts can be better understood at the pre-design phase in support of organisational preparedness. We considered three real-world case studies and engaged with literature from several disciplines to develop a conceptual framework. Across three workshops with industry practitioners and academics, creative strategies from speculative design practices were used to prompt engagement with the framework. We found participant groups navigated the model with either a convergent or divergent intent. The academics, operating in an exploratory mode, came to a broad understanding of a class of technologies through its impacts. Operating in an anticipatory mode, the industry practitioners came to a specific understanding of a technology's potential in their workplace. The study demonstrated potential for the conceptual framework to be used as a tool with implications for research and practice.

Knowledge · Task-oriented dialogue systems · Datasets · Unstructured · Extensibility

June 16, 2023

AUGUST: an Automatic Generation Understudy for Synthesizing Conversational Recommendation Datasets

Yu Lu,Junwei Bao,Zichen Ma,Xiaoguang Han,Youzheng Wu,Shuguang Cui,Xiaodong He

from arxiv, 10 pages

High-quality data is essential for conversational recommendation systems and serves as the cornerstone of the network architecture development and training strategy design. Existing works devote heavy human effort to manually labeling data or to designing and extending recommender dialogue templates. However, they suffer from two limitations: (i) the limited number of human annotators means that the datasets can hardly capture rich and large-scale real-world cases, and (ii) the limited experience and knowledge of annotators lead to uninformative corpora and inappropriate recommendations. In this paper, we propose a novel automatic dataset synthesis approach that can generate both large-scale and high-quality recommendation dialogues through a data2text generation process, where unstructured recommendation conversations are generated from structured graphs based on user-item information from the real world. In doing so, we comprehensively exploit: (i) rich personalized user profiles from traditional recommendation datasets, (ii) rich external knowledge from knowledge graphs, and (iii) the conversation ability contained in human-to-human conversational recommendation datasets. Extensive experiments validate the benefit brought by the automatically synthesized data under low-resource scenarios and demonstrate the promising potential to facilitate the development of a more effective conversational recommendation system.

INFORMS · Model evaluation · Trace · Optimizer · Sensors

June 15, 2023

Privacy Guarantees for Personal Mobility Data in Humanitarian Response

Nitin Kohli,Emily Aiken,Joshua Blumenstock

Personal mobility data from mobile phones and other sensors are increasingly used to inform policymaking during pandemics, natural disasters, and other humanitarian crises. However, even aggregated mobility traces can reveal private information about individual movements to potentially malicious actors. This paper develops and tests an approach for releasing private mobility data, which provides formal guarantees over the privacy of the underlying subjects. Specifically, we (1) introduce an algorithm for constructing differentially private mobility matrices, and derive privacy and accuracy bounds on this algorithm; (2) use real-world data from mobile phone operators in Afghanistan and Rwanda to show how this algorithm can enable the use of private mobility data in two high-stakes policy decisions: pandemic response and the distribution of humanitarian aid; and (3) discuss practical decisions that need to be made when implementing this approach, such as how to optimally balance privacy and accuracy. Taken together, these results can help enable the responsible use of private mobility data in humanitarian response.
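The abstract does not spell out how the differentially private mobility matrices are constructed. As a rough illustration only, the sketch below assumes the standard Laplace mechanism applied to an origin-destination count matrix; it is not the authors' algorithm. The function names, the `(user, origin, destination)` trip format, and the `max_trips_per_user` cap are all hypothetical. The cap bounds how much one person can change the matrix, and that bound (the sensitivity) divided by `epsilon` sets the noise scale.

```python
import random


def laplace_noise(scale, rng):
    # The difference of two i.i.d. exponentials with mean `scale`
    # follows a Laplace(0, scale) distribution.
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def private_mobility_matrix(trips, regions, epsilon, max_trips_per_user=1, seed=None):
    """Noisy origin-destination counts (illustrative sketch, not the paper's method).

    trips: iterable of (user_id, origin, destination) tuples (assumed format).
    regions: ordered list of region names defining rows and columns.
    """
    rng = random.Random(seed)
    index = {region: i for i, region in enumerate(regions)}
    n = len(regions)
    counts = [[0.0] * n for _ in range(n)]

    # Keep at most `max_trips_per_user` trips per user so that adding or
    # removing one user changes the matrix by at most that amount in L1 norm.
    trips_used = {}
    for user, origin, destination in trips:
        used = trips_used.get(user, 0)
        if used >= max_trips_per_user:
            continue
        trips_used[user] = used + 1
        counts[index[origin]][index[destination]] += 1

    # Laplace mechanism: noise scale = sensitivity / epsilon.
    scale = max_trips_per_user / epsilon
    return [[c + laplace_noise(scale, rng) for c in row] for row in counts]


if __name__ == "__main__":
    demo_trips = [("u1", "A", "B"), ("u2", "A", "B"), ("u3", "B", "A")]
    for row in private_mobility_matrix(demo_trips, ["A", "B"], epsilon=1.0, seed=0):
        print(row)
```

A smaller `epsilon` gives stronger privacy but noisier counts, which is the privacy and accuracy balance the abstract mentions as a practical decision.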

Point cloud · Distillation · Vision · MoDELS · Diversity

June 15, 2023

Segment Any Point Cloud Sequences by Distilling Vision Foundation Models

Youquan Liu,Lingdong Kong,Jun Cen,Runnan Chen,Wenwei Zhang,Liang Pan,Kai Chen,Ziwei Liu

from arxiv, Preprint; 36 pages, 16 figures, 14 tables; Code at https://github.com/youquanl/Segment-Any-Point-Cloud

Recent advancements in vision foundation models (VFMs) have opened up new possibilities for versatile and efficient visual perception. In this work, we introduce Seal, a novel framework that harnesses VFMs for segmenting diverse automotive point cloud sequences. Seal exhibits three appealing properties: i) Scalability: VFMs are directly distilled into point clouds, eliminating the need for annotations in either 2D or 3D during pretraining. ii) Consistency: Spatial and temporal relationships are enforced at both the camera-to-LiDAR and point-to-segment stages, facilitating cross-modal representation learning. iii) Generalizability: Seal enables knowledge transfer in an off-the-shelf manner to downstream tasks involving diverse point clouds, including those from real/synthetic, low/high-resolution, large/small-scale, and clean/corrupted datasets. Extensive experiments conducted on eleven different point cloud datasets showcase the effectiveness and superiority of Seal. Notably, Seal achieves a remarkable 45.0% mIoU on nuScenes after linear probing, surpassing random initialization by 36.9% mIoU and outperforming prior arts by 6.1% mIoU. Moreover, Seal demonstrates significant performance gains over existing methods across 20 different few-shot fine-tuning tasks on all eleven tested point cloud datasets.

Regularization term · MoDELS · PAR · state-of-the-art · Statistics

June 15, 2023

Stable Deep MRI Reconstruction using Generative Priors

Martin Zach,Florian Knoll,Thomas Pock

Data-driven approaches recently achieved remarkable success in magnetic resonance imaging (MRI) reconstruction, but integration into clinical routine remains challenging due to a lack of generalizability and interpretability. In this paper, we address these challenges in a unified framework based on generative image priors. We propose a novel deep neural network based regularizer which is trained in a generative setting on reference magnitude images only. After training, the regularizer encodes higher-level domain statistics which we demonstrate by synthesizing images without data. Embedding the trained model in a classical variational approach yields high-quality reconstructions irrespective of the sub-sampling pattern. In addition, the model shows stable behavior when confronted with out-of-distribution data in the form of contrast variation. Furthermore, a probabilistic interpretation provides a distribution of reconstructions and hence allows uncertainty quantification. To reconstruct parallel MRI, we propose a fast algorithm to jointly estimate the image and the sensitivity maps. The results demonstrate competitive performance, on par with state-of-the-art end-to-end deep learning methods, while preserving the flexibility with respect to sub-sampling patterns and allowing for uncertainty quantification.

Receiver operating characteristic · AUC · Exact · Machine Learning · Scenario

June 15, 2023

ppAURORA: Privacy Preserving Area Under Receiver Operating Characteristic and Precision-Recall Curves

Ali Burak Ünal,Nico Pfeifer,Mete Akgün

from arxiv, Accepted in NSS-SocialSec 2023

Computing an AUC as a performance measure to compare the quality of different machine learning models is one of the final steps of many research projects. Many of these methods are trained on privacy-sensitive data and there are several different approaches like $\epsilon$-differential privacy, federated machine learning and cryptography if the datasets cannot be shared or used jointly at one place for training and/or testing. In this setting, it can also be a problem to compute the global AUC, since the labels might also contain privacy-sensitive information. There have been approaches based on $\epsilon$-differential privacy to address this problem, but to the best of our knowledge, no exact privacy preserving solution has been introduced. In this paper, we propose an MPC-based solution, called ppAURORA, with private merging of individually sorted lists from multiple sources to compute the exact AUC as one could obtain on the pooled original test samples. With ppAURORA, the computation of the exact area under precision-recall and receiver operating characteristic curves is possible even when ties between prediction confidence values exist. We use ppAURORA to evaluate two different models predicting acute myeloid leukemia therapy response and heart disease, respectively. We also assess its scalability via synthetic data experiments. All these experiments show that we efficiently and privately compute the exact same AUC with both evaluation metrics as one can obtain on the pooled test samples in plaintext according to the semi-honest adversary setting.
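To make concrete what "exact AUC even when ties exist" refers to, here is a minimal plaintext reference for the ROC case. It shows only the target quantity that would be obtained on pooled test samples, not the MPC protocol or the private list merging from the paper. The function name is hypothetical, and the convention that a tied positive/negative pair counts as half a win is the usual Mann-Whitney convention, assumed here.

```python
def roc_auc_with_ties(labels, scores):
    """Exact ROC AUC on pooled data; a tied positive/negative pair counts 0.5."""
    n_pos = sum(1 for label in labels if label == 1)
    n_neg = len(labels) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("AUC needs at least one positive and one negative sample")

    pairs = sorted(zip(scores, labels))  # ascending by score
    wins = 0.0      # (positive, negative) pairs where the positive ranks higher
    neg_below = 0   # negatives with a strictly lower score than the current group
    i = 0
    while i < len(pairs):
        # Collect the group of samples sharing the same score.
        j = i
        pos_in_group = 0
        neg_in_group = 0
        while j < len(pairs) and pairs[j][0] == pairs[i][0]:
            if pairs[j][1] == 1:
                pos_in_group += 1
            else:
                neg_in_group += 1
            j += 1
        # Each positive in the group beats all lower negatives and
        # ties with the negatives inside the same group.
        wins += pos_in_group * (neg_below + 0.5 * neg_in_group)
        neg_below += neg_in_group
        i = j
    return wins / (n_pos * n_neg)


if __name__ == "__main__":
    print(roc_auc_with_ties([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8]))
```

Grouping equal scores before counting is what makes the result independent of the order in which tied samples happen to be sorted.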

Speech recognition · MoDELS · Tokenizer · Performer · Separate

June 14, 2023

Towards training Bilingual and Code-Switched Speech Recognition models from Monolingual data sources

Kunal Dhawan,Dima Rekesh,Boris Ginsburg

Multilingual Automatic Speech Recognition (ASR) models are capable of transcribing audios across multiple languages, eliminating the need for separate models. In addition, they can perform Language Identification (LID) and handle code-switched speech. However, training these models requires special code-switch and multilingual speech corpora which are sparsely available. In this paper, we evaluate different approaches towards training of bilingual as well as code-switched ASR models using purely monolingual data sources. We introduce the concept of aggregate tokenizers that differs from the current prevalent technique of generating LIDs at the boundaries of monolingual samples and produces LID for each emitted token instead. We compare bilingual and monolingual model performance, showcase the efficacy of aggregate tokenizers, present a synthetic code-switched ASR data generation technique and demonstrate the effectiveness of the proposed code-switched ASR models for the tasks of speech recognition and spoken language identification.

Robustness · Integration · TransAct · Performer · Reviewer

June 13, 2023

RETINA: Distributed and Secure Trust Management for Smart Grid Applications and Energy Trading

Vaios Boulgourasa,Thodoris Ioannidis,Ilias Politis,Christos Xenakis

from arxiv, Under submission to Elsevier's Sustainable Energy, Grids and Networks (SEGAN)

The rapid adoption of smart grids demands robust security and efficiency measures due to their critical role in delivering electricity and their potential for customer-oriented benefits. This paper presents an innovative framework, named RETINA, which provides a resilient and secure energy trading mechanism within smart grid systems. RETINA tackles the inherent security and infrastructure challenges in smart grids by establishing a trust-based security layer and facilitating energy transactions through blockchain technology. Our proposed solution integrates Public Key Infrastructure (PKI) and the Web of Trust (WoT) concepts, promoting decentralized communication channels and robust key management. We further introduce a smart contract-based energy trading mechanism that factors in trust, distance, and energy type (green or non-green) in cost calculation. The utility and robustness of RETINA have been validated in a virtualized testbed environment with 500 nodes, demonstrating superior performance in terms of scalability and resilience compared to the existing WoT scheme. Furthermore, RETINA successfully enables a secure and efficient energy trading scheme, promoting the use of renewable energy sources. Future enhancements will include application to a realistic smart grid deployment and the integration of additional functionalities. This groundbreaking solution has the potential to revolutionize the smart grid ecosystem, addressing its current limitations and propelling the industry towards a future of advanced and secure energy exchange.

Learning · Controller · Taxonomy · Knowledge · Deep learning

July 19, 2022

Controllable Data Generation by Deep Learning: A Review

Shiyu Wang,Yuanqi Du,Xiaojie Guo,Bo Pan,Liang Zhao

Designing and generating new data with targeted properties has attracted attention in various critical applications such as molecule design, image editing and speech synthesis. Traditional hand-crafted approaches rely heavily on expert experience and intensive human effort, yet still suffer from insufficient scientific knowledge and low throughput, which limits effective and efficient data generation. Recently, advances in deep learning have produced expressive methods that can learn the underlying representation and properties of data. This capability provides new opportunities to work out the mutual relationship between the structural patterns and functional properties of the data, and to leverage that relationship to generate structural data with the desired properties. This article provides a systematic review of this promising research area, commonly known as controllable deep data generation. First, the potential challenges are raised and preliminaries are provided. Then controllable deep data generation is formally defined, a taxonomy of techniques is proposed, and the evaluation metrics in this domain are summarized. After that, exciting applications of controllable deep data generation are introduced, and existing works are experimentally analyzed and compared. Finally, promising future directions of controllable deep data generation are highlighted and five potential challenges are identified.

Controller · INTERACT · state-of-the-art · Model evaluation · Next

August 3, 2020

Controllable Multi-Interest Framework for Recommendation

Yukuo Cen,Jianwei Zhang,Xu Zou,Chang Zhou,Hongxia Yang,Jie Tang

from arxiv, Accepted to KDD 2020

Recently, neural networks have been widely used in e-commerce recommender systems, owing to the rapid development of deep learning. We formalize the recommender system as a sequential recommendation problem, intending to predict the next items that the user might interact with. Recent works usually give an overall embedding from a user's behavior sequence. However, a unified user embedding cannot reflect the user's multiple interests during a period. In this paper, we propose a novel controllable multi-interest framework for the sequential recommendation, called ComiRec. Our multi-interest module captures multiple interests from user behavior sequences, which can be exploited for retrieving candidate items from the large-scale item pool. These items are then fed into an aggregation module to obtain the overall recommendation. The aggregation module leverages a controllable factor to balance the recommendation accuracy and diversity. We conduct experiments for the sequential recommendation on two real-world datasets, Amazon and Taobao. Experimental results demonstrate that our framework achieves significant improvements over state-of-the-art models. Our framework has also been successfully deployed on the offline Alibaba distributed cloud platform.
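The abstract does not say how the controllable factor enters the aggregation module. One plausible reading, sketched below purely as an assumption, is a greedy selection in which each candidate's relevance score is increased by `lam` times the number of already selected items from a different category. The function name, the `scores` and `categories` dictionaries, and the category-based diversity measure are hypothetical and are not taken from the source.

```python
def greedy_aggregate(scores, categories, k, lam):
    """Pick k items, trading relevance against category diversity (sketch).

    scores: dict mapping item -> relevance score (assumed precomputed).
    categories: dict mapping item -> category label.
    lam: controllable factor; 0 ranks purely by relevance.
    """
    selected = []
    remaining = set(scores)
    while remaining and len(selected) < k:

        def gain(item):
            # Diversity bonus: already chosen items in a different category.
            diversity = sum(
                1 for chosen in selected if categories[chosen] != categories[item]
            )
            return scores[item] + lam * diversity

        # Sorting first makes tie-breaking deterministic.
        best = max(sorted(remaining), key=gain)
        selected.append(best)
        remaining.remove(best)
    return selected


if __name__ == "__main__":
    demo_scores = {"a": 0.9, "b": 0.8, "c": 0.5}
    demo_categories = {"a": "shoes", "b": "shoes", "c": "books"}
    print(greedy_aggregate(demo_scores, demo_categories, k=2, lam=0.0))
    print(greedy_aggregate(demo_scores, demo_categories, k=2, lam=1.0))
```

With `lam=0.0` the two highest-scoring items are returned even though they share a category; raising `lam` lets a lower-scoring item from another category displace one of them.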