
Differentially private (DP) synthetic data sets are a solution for sharing data while preserving the privacy of individual data providers. Understanding the effects of utilizing DP synthetic data in end-to-end machine learning pipelines impacts areas such as health care and humanitarian action, where data is scarce and regulated by restrictive privacy laws. In this work, we investigate the extent to which synthetic data can replace real, tabular data in machine learning pipelines and identify the most effective synthetic data generation techniques for training and evaluating machine learning models. We investigate the impacts of differentially private synthetic data on downstream classification tasks from the point of view of utility as well as fairness. Our analysis is comprehensive and includes representatives of the two main types of synthetic data generation algorithms: marginal-based and GAN-based. To the best of our knowledge, our work is the first that: (i) proposes a training and evaluation framework that does not assume that real data is available for testing the utility and fairness of machine learning models trained on synthetic data; (ii) presents the most extensive analysis of synthetic data set generation algorithms in terms of utility and fairness when used for training machine learning models; and (iii) encompasses several different definitions of fairness. Our findings demonstrate that marginal-based synthetic data generators surpass GAN-based ones regarding model training utility for tabular data. Indeed, we show that models trained using data generated by marginal-based algorithms can exhibit similar utility to models trained using real data. Our analysis also reveals that the marginal-based synthetic data generator MWEM PGM can train models that simultaneously achieve utility and fairness characteristics close to those obtained by models trained with real data.
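To make the notion of "fairness" in the abstract concrete, the sketch below computes two widely used group-fairness measures for a binary classifier: demographic parity difference and equal opportunity difference. The abstract does not state which fairness definitions the authors evaluate, so the choice of metrics, the function names, and the toy data are assumptions made for illustration only.

```python
import numpy as np


def demographic_parity_difference(y_pred, group):
    """Absolute gap in positive-prediction rate between group 0 and group 1."""
    y_pred = np.asarray(y_pred)
    group = np.asarray(group)
    rate_0 = y_pred[group == 0].mean()
    rate_1 = y_pred[group == 1].mean()
    return abs(rate_0 - rate_1)


def equal_opportunity_difference(y_true, y_pred, group):
    """Absolute gap in true-positive rate between group 0 and group 1."""
    y_true = np.asarray(y_true)
    y_pred = np.asarray(y_pred)
    group = np.asarray(group)
    tprs = []
    for g in (0, 1):
        positives = (group == g) & (y_true == 1)
        tprs.append(y_pred[positives].mean())
    return abs(tprs[0] - tprs[1])


# Toy data (hypothetical): labels, predictions, and a binary protected attribute.
y_true = [1, 1, 0, 0, 1, 1, 0, 0]
y_pred = [1, 1, 1, 0, 1, 0, 0, 0]
group = [0, 0, 0, 0, 1, 1, 1, 1]

dp_gap = demographic_parity_difference(y_pred, group)
eo_gap = equal_opportunity_difference(y_true, y_pred, group)
print(dp_gap, eo_gap)
```

In a comparison like the one the abstract describes, such gaps would be computed once for a model trained on real data and once for a model trained on synthetic data, and then compared alongside a utility metric such as accuracy.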

Related content

Machine Learning


Machine Learning is an international forum for research on computational approaches to learning. The journal publishes articles reporting substantive results on a wide range of learning methods applied to a variety of learning problems. Its featured papers describe research on problems and methods, applications research, and issues of research methodology. Papers on learning problems or methods provide solid support through empirical studies, theoretical analysis, or comparison with psychological phenomena. Applications papers show how to apply learning methods to solve important application problems. Research methodology papers improve how machine learning research is conducted. All papers describe their supporting evidence in ways that other researchers can verify or replicate. Papers also detail the components of learning and discuss assumptions about knowledge representation and the performance task.

INTERACT · TOOLS · Design · Engineering · MoDELS

December 20, 2023

Composable Design of Multiphase Fluid Dynamics Solvers in Flash-X

Akash Dhruv

Multiphysics incompressible fluid dynamics simulations play a crucial role in understanding intricate behaviors of many complex engineering systems that involve interactions between solids, fluids, and various phases like liquid and gas. Numerical modeling of these interactions has generated significant research interest in recent decades and has led to the development of open source simulation tools and commercial software products targeting specific applications or general problem classes in computational fluid dynamics. As the demand increases for these simulations to adapt to platform heterogeneity, ensure composability between different physics models, and effectively utilize inheritance within partial differentiation systems, a fundamental reconsideration of numerical solver design becomes imperative. The discussion presented in this paper emphasizes the importance of these considerations and introduces the Flash-X approach as a potential solution. The software design strategies outlined in the article serve as a guide for Flash-X developers, providing insights into complexities associated with performance portability, composability, and sustainable development. These strategies provide a foundation for improving design of both new and existing simulation tools grappling with these challenges. By incorporating the principles outlined in the Flash-X approach, engineers and researchers can enhance the adaptability, efficiency, and overall effectiveness of their numerical solvers in the ever-evolving field of multiphysics simulations.

GROUP · Datasets · Generalization theory · Learning · Robustness

December 19, 2023

Self-Supervised Detection of Perfect and Partial Input-Dependent Symmetries

Alonso Urbano,David W. Romero

Group equivariance ensures consistent responses to group transformations of the input, leading to more robust models and enhanced generalization capabilities. However, this property can lead to overly constrained models if the symmetries considered in the group differ from those observed in data. While common methods address this by determining the appropriate level of symmetry at the dataset level, they are limited to supervised settings and ignore scenarios in which multiple levels of symmetry co-exist in the same dataset. For instance, pictures of cars and planes exhibit different levels of rotation, yet both are included in the CIFAR-10 dataset. In this paper, we propose a method able to detect the level of symmetry of each input without the need for labels. To this end, we derive a sufficient and necessary condition to learn the distribution of symmetries in the data. Using the learned distribution, we generate pseudo-labels that allow us to learn the levels of symmetry of each input in a self-supervised manner. We validate the effectiveness of our approach on synthetic datasets with different per-class levels of symmetries, e.g., MNISTMultiple, in which digits are uniformly rotated within a class-dependent interval. We demonstrate that our method can be used for practical applications such as the generation of standardized datasets in which the symmetries are not present, as well as the detection of out-of-distribution symmetries during inference. By doing so, both the generalization and robustness of non-equivariant models can be improved. Our code is publicly available at https://github.com/aurban0/ssl-sym.

Outliers · Classes · Learning · Samples · Extensibility

December 19, 2023

Out-of-Distribution Detection in Long-Tailed Recognition with Calibrated Outlier Class Learning

Wenjun Miao,Guansong Pang,Tianqi Li,Xiao Bai,Jin Zheng

From arXiv; AAAI 2024, with supplementary material

Existing out-of-distribution (OOD) methods have shown great success on balanced datasets but become ineffective in long-tailed recognition (LTR) scenarios where 1) OOD samples are often wrongly classified into head classes and/or 2) tail-class samples are treated as OOD samples. To address these issues, current studies fit a prior distribution of auxiliary/pseudo OOD data to the long-tailed in-distribution (ID) data. However, it is difficult to obtain such an accurate prior distribution given the unknown nature of real OOD samples and the heavy class imbalance in LTR. A straightforward solution to avoid the requirement of this prior is to learn an outlier class to encapsulate the OOD samples. The main challenge is then to tackle the aforementioned confusion between OOD samples and head/tail-class samples when learning the outlier class. To this end, we introduce a novel calibrated outlier class learning (COCL) approach, in which 1) a debiased large margin learning method is introduced in the outlier class learning to distinguish OOD samples from both head and tail classes in the representation space and 2) an outlier-class-aware logit calibration method is defined to enhance the long-tailed classification confidence. Extensive empirical results on three popular benchmarks, CIFAR10-LT, CIFAR100-LT, and ImageNet-LT, demonstrate that COCL substantially outperforms state-of-the-art OOD detection methods in LTR while being able to improve the classification accuracy on ID data. Code is available at https://github.com/mala-lab/COCL.
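The basic idea of an outlier class can be sketched as follows: a classifier over K in-distribution classes gets one extra output, and the probability assigned to that extra class serves as the OOD score. This is only the generic outlier-class scheme. It does not implement COCL's debiased large margin learning or its logit calibration, and the function names and the convention that the outlier class is the last logit are assumptions.

```python
import numpy as np


def softmax(logits: np.ndarray) -> np.ndarray:
    """Numerically stable softmax over the last axis."""
    shifted = logits - logits.max(axis=-1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=-1, keepdims=True)


def ood_score(logits: np.ndarray) -> np.ndarray:
    """Probability mass on the outlier class (assumed to be the last logit)."""
    return softmax(logits)[..., -1]


def predict_id_class(logits: np.ndarray) -> np.ndarray:
    """Predict among the K in-distribution classes, ignoring the outlier logit."""
    return logits[..., :-1].argmax(axis=-1)


# Two samples, K = 2 ID classes plus one outlier class (hypothetical logits).
logits = np.array([
    [2.0, 0.5, 0.1],  # confident ID sample
    [0.1, 0.2, 3.0],  # sample the model assigns mostly to the outlier class
])
scores = ood_score(logits)
id_predictions = predict_id_class(logits)
print(scores, id_predictions)
```

A sample would then be flagged as OOD when its score exceeds a threshold chosen on validation data.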

Controllers · Learning · Robots · MoDELS · Design

December 15, 2023

Gaussian Process-Based Learning Control of Underactuated Balance Robots with an External and Internal Convertible Modeling Structure

Feng Han,Jingang Yi

External and internal convertible (EIC) form-based motion control is one of the effective designs for simultaneous trajectory tracking and balance of underactuated balance robots. Under certain conditions, however, the EIC-based control design leads to uncontrolled robot motion. We present a Gaussian process (GP)-based data-driven learning control for underactuated balance robots with the EIC modeling structure. Two GP-based learning controllers are presented that exploit the EIC structure property. The partial EIC (PEIC)-based control design partitions the robotic dynamics into a fully actuated subsystem and a reduced-order underactuated subsystem. The null-space EIC (NEIC)-based control compensates for the uncontrolled motion in a subspace, while the other closed-loop dynamics are not affected. Under the PEIC- and NEIC-based controls, the tracking and balance tasks are guaranteed, and a convergence rate and bounded errors are achieved without the uncontrolled motion caused by the original EIC-based control. We validate the results and demonstrate the performance of the GP-based learning control design using two inverted pendulum platforms.

MoDELS · Networking · Performer · Autoencoders · TOOLS

December 15, 2023

Symplectic Autoencoders for Model Reduction of Hamiltonian Systems

Benedikt Brantner,Michael Kraus

Many applications, such as optimization, uncertainty quantification and inverse problems, require repeatedly performing simulations of large-dimensional physical systems for different choices of parameters. This can be prohibitively expensive. In order to save computational cost, one can construct surrogate models by expressing the system in a low-dimensional basis, obtained from training data. This is referred to as model reduction. Past investigations have shown that, when performing model reduction of Hamiltonian systems, it is crucial to preserve the symplectic structure associated with the system in order to ensure long-term numerical stability. Up to this point structure-preserving reductions have largely been limited to linear transformations. We propose a new neural network architecture in the spirit of autoencoders, which are established tools for dimension reduction and feature extraction in data science, to obtain more general mappings. In order to train the network, a non-standard gradient descent approach is applied that leverages the differential-geometric structure emerging from the network design. The new architecture is shown to significantly outperform existing designs in accuracy.
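For the linear case that the abstract says earlier work was largely limited to, "preserving the symplectic structure" has a concrete test: a reduction matrix A of shape 2N x 2n is symplectic when A^T J_2N A = J_2n, where J is the canonical Poisson matrix. The sketch below checks this condition for a cotangent-lift basis, a standard linear construction. It does not reproduce the paper's autoencoder or its training procedure, and the function names are hypothetical.

```python
import numpy as np


def canonical_j(n: int) -> np.ndarray:
    """Canonical 2n x 2n Poisson matrix [[0, I], [-I, 0]]."""
    identity = np.eye(n)
    zero = np.zeros((n, n))
    return np.block([[zero, identity], [-identity, zero]])


def is_symplectic(a: np.ndarray, tol: float = 1e-10) -> bool:
    """Check A^T J_2N A == J_2n for a 2N x 2n matrix A."""
    rows, cols = a.shape
    lhs = a.T @ canonical_j(rows // 2) @ a
    return np.allclose(lhs, canonical_j(cols // 2), atol=tol)


def cotangent_lift(phi: np.ndarray) -> np.ndarray:
    """Block-diagonal lift diag(phi, phi); symplectic if phi has orthonormal columns."""
    zero = np.zeros_like(phi)
    return np.block([[phi, zero], [zero, phi]])


rng = np.random.default_rng(0)
# Reduced QR yields a 6 x 2 matrix with orthonormal columns.
phi, _ = np.linalg.qr(rng.standard_normal((6, 2)))
reduction = cotangent_lift(phi)  # shape (12, 4)

print(is_symplectic(reduction))        # the lift satisfies the condition
print(is_symplectic(2.0 * reduction))  # a rescaled copy does not
```

The check works because phi^T phi = I makes the off-diagonal blocks of A^T J A reduce to plus and minus the identity.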

Graphs · Dimensionality reduction · Euclidean distance · Fuzzy logic · CASES

December 15, 2023

Concise Fuzzy Planar Embedding of Graphs: a Dimensionality Reduction Approach

Faisal N. Abu-Khzam,Rana H. Mouawi,Amer Hajj Ahmad,Sergio Thoumi

The enormous amount of data to be represented using large graphs exceeds in some cases the resources of a conventional computer. Edges in particular can take up a considerable amount of memory as compared to the number of nodes. However, rigorous edge storage might not always be essential to be able to draw the needed conclusions. A similar problem takes records with many variables and attempts to extract the most discernible features. It is said that the "dimension" of this data is reduced. Following an approach with the same objective in mind, we can map a graph representation to a $k$-dimensional space and answer queries of neighboring nodes mainly by measuring Euclidean distances. The accuracy of our answers would decrease but would be compensated for by fuzzy logic which gives an idea about the likelihood of error. This method allows for reasonable representation in memory while maintaining a fair amount of useful information, and allows for concise embedding in $k$-dimensional Euclidean space as well as solving some problems without having to decompress the graph. Of particular interest is the case where $k=2$. Promising, highly accurate experimental results are obtained and reported.
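The query side of this idea can be sketched in a few lines: once every node has a point in the plane, an adjacency query becomes a distance computation, and a fuzzy membership value expresses how likely the edge is to exist. The abstract does not give the embedding procedure or the membership function, so the linear ramp, the two radii, and all names below are assumptions chosen only to illustrate the query.

```python
import math

Point = tuple[float, float]


def edge_membership(p: Point, q: Point, r_in: float = 1.0, r_out: float = 2.0) -> float:
    """Fuzzy degree in [0, 1] that two embedded nodes are adjacent.

    Distances up to r_in count as certainly adjacent, distances from r_out
    upward as certainly non-adjacent, with a linear ramp in between.
    """
    distance = math.dist(p, q)
    if distance <= r_in:
        return 1.0
    if distance >= r_out:
        return 0.0
    return (r_out - distance) / (r_out - r_in)


# Hypothetical 2-D embedding of four nodes (the k = 2 case).
embedding: dict[str, Point] = {
    "a": (0.0, 0.0),
    "b": (0.5, 0.0),
    "c": (1.5, 0.0),
    "d": (3.0, 0.0),
}

for other in ("b", "c", "d"):
    print("a", other, edge_membership(embedding["a"], embedding[other]))
```

Only the coordinates are stored, so memory grows with the number of nodes rather than the number of edges, at the cost of answers that carry a degree of uncertainty.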

MoDELS · Scenarios

December 15, 2023

Relational Models for the Lambek Calculus with Intersection and Constants

Stepan L. Kuznetsov

From arXiv. This article is an extended version of the conference paper presented at RAMiCS 2021.

We consider relational semantics (R-models) for the Lambek calculus extended with intersection and explicit constants for zero and unit. For its variant without constants and with a restriction that disallows empty antecedents, Andréka and Mikulás (1994) prove strong completeness. We show that it fails without this restriction but, on the other hand, prove weak completeness for a non-standard interpretation of constants. For the standard interpretation, even weak completeness fails. The weak completeness result extends to an infinitary setting, for so-called iterative divisions (Kleene star under division). We also prove strong completeness results for product-free fragments.

Knowledge · Machine Learning · MoDELS · Learned · Conformer

May 10, 2022

Knowledge Augmented Machine Learning with Applications in Autonomous Driving: A Survey

Julian Wörmann,Daniel Bogdoll,Etienne Bührle,Han Chen,Evaristus Fuh Chuo,Kostadin Cvejoski,Ludger van Elst,Tobias Gleißner,Philip Gottschall,Stefan Griesche,Christian Hellert,Christian Hesels,Sebastian Houben,Tim Joseph,Niklas Keil,Johann Kelsch,Hendrik Königshof,Erwin Kraft,Leonie Kreuser,Kevin Krone,Tobias Latka,Denny Mattern,Stefan Matthes,Mohsin Munir,Moritz Nekolla,Adrian Paschke,Maximilian Alexander Pintz,Tianming Qiu,Faraz Qureishi,Syed Tahseen Raza Rizvi,Jörg Reichardt,Laura von Rueden,Stefan Rudolph,Alexander Sagel,Gerhard Schunk,Hao Shen,Hendrik Stapelbroek,Vera Stehr,Gurucharan Srinivas,Anh Tuan Tran,Abhishek Vivekanandan,Ya Wang,Florian Wasserrab,Tino Werner,Christian Wirth,Stefan Zwicklbauer

From arXiv; 93 pages

The existence of representative datasets is a prerequisite of many successful artificial intelligence and machine learning models. However, the subsequent application of these models often involves scenarios that are inadequately represented in the data used for training. The reasons for this are manifold and range from time and cost constraints to ethical considerations. As a consequence, the reliable use of these models, especially in safety-critical applications, is a huge challenge. Leveraging additional, already existing sources of knowledge is key to overcoming the limitations of purely data-driven approaches, and eventually to increasing the generalization capability of these models. Furthermore, predictions that conform with knowledge are crucial for making trustworthy and safe decisions even in underrepresented scenarios. This work provides an overview of existing techniques and methods in the literature that combine data-based models with existing knowledge. The identified approaches are structured according to the categories integration, extraction, and conformity. Special attention is given to applications in the field of autonomous driving.

INFORMS · Graphs · Reducible · Knowledge graphs · Identifiable

August 29, 2018

Multi-Task Identification of Entities, Relations, and Coreference for Scientific Knowledge Graph Construction

Yi Luan,Luheng He,Mari Ostendorf,Hannaneh Hajishirzi

We introduce a multi-task setup of identifying and classifying entities, relations, and coreference clusters in scientific articles. We create SciERC, a dataset that includes annotations for all three tasks, and develop a unified framework called Scientific Information Extractor (SciIE) with shared span representations. The multi-task setup reduces cascading errors between tasks and leverages cross-sentence relations through coreference links. Experiments show that our multi-task model outperforms previous models in scientific information extraction without using any domain-specific features. We further show that the framework supports construction of a scientific knowledge graph, which we use to analyze information in scientific literature.

Reducible · Model evaluation · Object detection · FAST · Processing (programming language)

March 27, 2018

Dynamic Zoom-in Network for Fast Object Detection in Large Images

Mingfei Gao,Ruichi Yu,Ang Li,Vlad I. Morariu,Larry S. Davis

From arXiv; CVPR 2018

We introduce a generic framework that reduces the computational cost of object detection while retaining accuracy for scenarios where objects with varied sizes appear in high-resolution images. Detection progresses in a coarse-to-fine manner, first on a down-sampled version of the image and then on a sequence of higher-resolution regions identified as likely to improve the detection accuracy. Built upon reinforcement learning, our approach consists of a model (R-net) that uses coarse detection results to predict the potential accuracy gain for analyzing a region at a higher resolution and another model (Q-net) that sequentially selects regions to zoom in on. Experiments on the Caltech Pedestrians dataset show that our approach reduces the number of processed pixels by over 50% without a drop in detection accuracy. The merits of our approach become more significant on a high-resolution test set collected from the YFCC100M dataset, where our approach maintains high detection performance while reducing the number of processed pixels by about 70% and the detection time by over 50%.