
Training deep learning models with differential privacy (DP) results in degraded performance. The training dynamics of models trained with DP differ significantly from those of standard training, yet the geometric properties of private learning remain largely unexplored. In this paper, we investigate sharpness, a key factor in achieving better generalization, in private learning. We show that flat minima can help reduce the negative effects of per-example gradient clipping and the addition of Gaussian noise. We then verify the effectiveness of Sharpness-Aware Minimization (SAM) for seeking flat minima in private learning. However, we also find that SAM consumes additional privacy budget and computation time due to its two-step optimization. We therefore propose a new sharpness-aware training method that mitigates this privacy-optimization trade-off. Our experimental results demonstrate that the proposed method improves the performance of deep learning models with DP both when training from scratch and when fine-tuning. Code is available at //github.com/jinseongP/DPSAT.
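
As background for the clipping and noising that this paper works around, here is a minimal sketch of a DP-SGD aggregation step; the function name and hyperparameters are illustrative, not the authors' implementation:

```python
import numpy as np

def dp_sgd_aggregate(per_example_grads, clip_norm=1.0, noise_multiplier=1.0, rng=None):
    """One DP-SGD aggregation step: clip every example's gradient to
    clip_norm, sum the clipped gradients, then add Gaussian noise whose
    scale is tied to the clipping norm."""
    rng = rng or np.random.default_rng(0)
    clipped = [g * min(1.0, clip_norm / (np.linalg.norm(g) + 1e-12))
               for g in per_example_grads]
    total = np.sum(clipped, axis=0)
    noise = rng.normal(0.0, noise_multiplier * clip_norm, size=total.shape)
    return (total + noise) / len(per_example_grads)
```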

Related Content

In this work, we investigate the problem of out-of-distribution (OOD) generalization for unsupervised learning methods on graph data. This scenario is particularly challenging because graph neural networks (GNNs) have been shown to be sensitive to distributional shifts, even when labels are available. To address this challenge, we propose a Model-Agnostic Recipe for Improving OOD generalizability of unsupervised graph contrastive learning methods, which we refer to as MARIO. MARIO introduces two principles aimed at developing distributional-shift-robust graph contrastive methods to overcome the limitations of existing frameworks: (i) an Information Bottleneck (IB) principle for achieving generalizable representations and (ii) an invariance principle that incorporates adversarial data augmentation to obtain invariant representations. To the best of our knowledge, this is the first work that investigates the OOD generalization problem of graph contrastive learning, with a specific focus on node-level tasks. Through extensive experiments, we demonstrate that our method achieves state-of-the-art performance on the OOD test set, while maintaining comparable performance on the in-distribution test set when compared to existing approaches. The source code for our method can be found at: //github.com/ZhuYun97/MARIO
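
To illustrate the invariance principle, here is a hedged sketch of adversarial feature augmentation for a contrastive objective; the InfoNCE form, the FGSM-style single step, and the stand-in encoder are assumptions for illustration, not MARIO's exact procedure:

```python
import torch
import torch.nn.functional as F

def info_nce(z1, z2, tau=0.5):
    """Contrastive loss between two views; positives sit on the diagonal."""
    z1, z2 = F.normalize(z1, dim=1), F.normalize(z2, dim=1)
    logits = z1 @ z2.t() / tau
    labels = torch.arange(z1.size(0), device=z1.device)
    return F.cross_entropy(logits, labels)

def adversarial_augment(encoder, x, edge_index, eps=0.01):
    """One FGSM-style step on node features that increases the contrastive
    loss; training then minimizes the loss on this perturbed view."""
    x_adv = x.clone().requires_grad_(True)
    loss = info_nce(encoder(x, edge_index), encoder(x_adv, edge_index))
    loss.backward()
    return (x_adv + eps * x_adv.grad.sign()).detach()

# Stand-in encoder for illustration (a real GNN would also use edge_index).
enc = torch.nn.Linear(16, 8)
x_adv = adversarial_augment(lambda f, ei: enc(f), torch.randn(32, 16), None)
```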

Medical image segmentation is crucial for clinical diagnosis. However, current losses for medical image segmentation mainly focus on overall segmentation results, and few losses are designed to guide boundary segmentation. Those that do exist often must be combined with other losses and still produce unsatisfactory results. To address this issue, we have developed a simple and effective loss called the Boundary Difference over Union Loss (Boundary DoU Loss) to guide boundary region segmentation. It is obtained by calculating the ratio of the difference set of the prediction and the ground truth to the union of the difference set and a partial intersection set. Our loss relies only on region calculations, making it easy to implement and stable to train without requiring any additional losses. Additionally, we use the target size to adaptively adjust the attention applied to the boundary regions. Experimental results using UNet, TransUNet, and Swin-UNet on two datasets (ACDC and Synapse) demonstrate the effectiveness of our proposed loss function. Code is available at //github.com/sunfan-bvb/BoundaryDoULoss.
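
Reading that description literally, the loss divides the symmetric difference of prediction and ground truth by that difference plus a fraction of the intersection. A minimal PyTorch sketch follows; the fixed `alpha` is a stand-in for the paper's size-adaptive weighting:

```python
import torch

def boundary_dou_loss(pred, target, alpha=0.8, eps=1e-6):
    """Difference-over-Union sketch: symmetric difference of prediction and
    ground truth, divided by that difference plus a fraction (alpha) of the
    intersection. pred/target are soft masks in [0, 1]."""
    inter = (pred * target).sum(dim=(-2, -1))
    union = (pred + target - pred * target).sum(dim=(-2, -1))
    diff = union - inter  # pixels where prediction and ground truth disagree
    return (diff / (diff + alpha * inter + eps)).mean()
```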

A key challenge in off-road navigation is that even visually similar terrains, or terrains from the same semantic class, may have substantially different traction properties. Existing work typically assumes no wheel slip or uses the expected traction for motion planning, where the predicted trajectories provide a poor indication of the actual performance if the terrain traction has high uncertainty. In contrast, this work proposes to analyze terrain traversability with the empirical distribution of traction parameters in unicycle dynamics, which can be learned by a neural network in a self-supervised fashion. The probabilistic traction model leads to two risk-aware cost formulations that account for the worst-case expected cost and traction. To help the learned model generalize to unseen environments, terrains with features that lead to unreliable predictions are detected via a density estimator fit to the trained network's latent space and avoided via auxiliary penalties during planning. Simulation results demonstrate that the proposed approach outperforms existing approaches that assume no slip or use the expected traction, in both navigation success rate and completion time. Furthermore, avoiding terrains with a low density-based confidence score achieves up to a 30% improvement in success rate when the learned traction model is used in a novel environment.
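
The "worst-case expected cost" corresponds to a conditional value-at-risk (CVaR) over the empirical traction distribution. Below is a toy sketch, assuming multiplicative traction on commanded speed and time-to-goal as the cost (both illustrative choices, not the paper's exact model):

```python
import numpy as np

def cvar(costs, alpha=0.1):
    """Mean of the worst alpha-fraction of sampled costs: the
    'worst-case expected cost'."""
    worst_first = np.sort(costs)[::-1]
    k = max(1, int(np.ceil(alpha * len(costs))))
    return worst_first[:k].mean()

def risk_aware_cost(v_cmd, traction_samples, goal_dist, alpha=0.1):
    """Evaluate a candidate speed command against the empirical traction
    distribution: traction scales commanded speed, cost is time-to-goal."""
    eff_speed = np.maximum(traction_samples * v_cmd, 1e-3)
    return cvar(goal_dist / eff_speed, alpha)
```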

The Lookahead optimizer improves the training stability of deep neural networks by maintaining a set of fast weights that "look ahead" to guide the descent direction. Here, we combine this idea with sharpness-aware minimization (SAM) to stabilize its multi-step variant and improve the loss-sharpness trade-off. We propose Lookbehind, which computes $k$ gradient ascent steps ("looking behind") at each iteration and combines the gradients to bias the descent step toward flatter minima. We apply Lookbehind on top of two popular sharpness-aware training methods -- SAM and adaptive SAM (ASAM) -- and show that our approach yields numerous benefits across a variety of tasks and training regimes. In particular, we show increased generalization performance, greater robustness against noisy weights, and higher tolerance to catastrophic forgetting in lifelong-learning settings.
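
A rough sketch of this idea in PyTorch follows; the per-step ascent size and the plain averaging rule are assumptions, not the exact Lookbehind update:

```python
import torch

def lookbehind_sam_step(model, loss_fn, data, target, rho=0.05, k=3, lr=0.1):
    """Take k small ascent steps from the current weights ('looking behind'),
    average the gradients seen along the way, then take one descent step
    with that averaged gradient from the starting point."""
    params = [p for p in model.parameters() if p.requires_grad]
    start = [p.detach().clone() for p in params]
    avg_grads = [torch.zeros_like(p) for p in params]
    for _ in range(k):
        model.zero_grad()
        loss_fn(model(data), target).backward()
        with torch.no_grad():
            grads = [p.grad.detach().clone() for p in params]
            gnorm = torch.norm(torch.stack([g.norm() for g in grads]))
            scale = (rho / k) / (gnorm + 1e-12)
            for p, g, a in zip(params, grads, avg_grads):
                a += g / k          # running average of the ascent gradients
                p += scale * g      # move toward higher loss (ascent)
    with torch.no_grad():           # restore weights, descend with the average
        for p, s, a in zip(params, start, avg_grads):
            p.copy_(s - lr * a)
```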

Online social media is rife with offensive and hateful comments, prompting the need for their automatic detection given the sheer number of posts created every second. Creating high-quality human-labelled datasets for this task is difficult and costly, especially because non-offensive posts are significantly more frequent than offensive ones. However, unlabelled data is abundant and far easier and cheaper to obtain. In this scenario, self-training methods, which use weakly labelled examples to increase the amount of training data, can be employed. Recent "noisy" self-training approaches incorporate data augmentation techniques to ensure prediction consistency and increase robustness against noisy data and adversarial attacks. In this paper, we experiment with default and noisy self-training using three different textual data augmentation techniques across five pre-trained BERT architectures of varying size. We evaluate our experiments on two offensive/hate-speech datasets and demonstrate that (i) self-training consistently improves performance regardless of model size, yielding up to +1.5% F1-macro on both datasets, and (ii) noisy self-training with textual data augmentations, despite having been successfully applied in similar settings, decreases performance on the offensive and hate-speech domains compared to the default method, even with state-of-the-art augmentations such as backtranslation.
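
Here is a minimal sketch of the default self-training loop that serves as the baseline; `train_fn` and `predict_fn` are hypothetical stand-ins for model training and per-example prediction:

```python
def self_train(model, train_fn, predict_fn, labelled, unlabelled,
               threshold=0.9, rounds=3):
    """Default self-training: fit on labelled data, pseudo-label confident
    unlabelled posts, add them to the training set, and repeat. The noisy
    variant would additionally augment the pseudo-labelled texts."""
    data = list(labelled)
    for _ in range(rounds):
        train_fn(model, data)
        pseudo = []
        for text in unlabelled:
            label, prob = predict_fn(model, text)
            if prob >= threshold:           # keep only confident predictions
                pseudo.append((text, label))
        data = list(labelled) + pseudo
    return model
```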

Matrix factorization is an inference problem that has acquired importance due to its vast range of applications, from dictionary learning to recommendation systems and machine learning with deep networks. The study of its fundamental statistical limits represents a true challenge, and despite a decade-long history of efforts in the community, there is still no closed formula describing its optimal performance in the case where the rank of the matrix scales linearly with its size. In the present paper, we study this extensive-rank problem, extending the alternative 'decimation' procedure that we recently introduced, and carry out a thorough study of its performance. Decimation aims at recovering one column/line of the factors at a time by mapping the problem onto a sequence of neural-network models of associative memory at a tunable temperature. Though sub-optimal, decimation has the advantage of being theoretically analyzable. We extend its scope and analysis to two families of matrices. For a large class of compactly supported priors, we show that the replica-symmetric free entropy of the neural-network models takes a universal form in the low-temperature limit. For a sparse Ising prior, we show that the storage capacity of the neural-network models diverges as sparsity in the patterns increases, and we introduce a simple algorithm based on a ground-state search that implements decimation and performs matrix factorization with no need for an informative initialization.
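
To make the ground-state-search step concrete, here is a toy greedy single-spin-flip descent on a Hopfield-style energy; the restart scheme is an illustrative choice, not the paper's algorithm:

```python
import numpy as np

def ground_state_search(J, n_restarts=10, rng=None):
    """Greedy single-spin-flip descent on E(s) = -0.5 * s^T J s. Decimation
    would subtract the recovered pattern's rank-one contribution from J and
    call this again for the next column of the factors."""
    rng = rng or np.random.default_rng(0)
    best_s, best_e = None, np.inf
    for _ in range(n_restarts):
        s = rng.choice([-1.0, 1.0], size=J.shape[0])
        improved = True
        while improved:
            improved = False
            for i in range(len(s)):
                # Energy change of flipping spin i (J assumed symmetric).
                delta = 2.0 * s[i] * (J[i] @ s) - 2.0 * J[i, i]
                if delta < 0:
                    s[i] = -s[i]
                    improved = True
        e = -0.5 * s @ J @ s
        if e < best_e:
            best_s, best_e = s.copy(), e
    return best_s
```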

An analytical solution for high supersonic flow over a circular cylinder, based on Schneider's inverse method, is presented. In the inverse method, a shock shape is assumed, and the corresponding flow field and the shape of the body producing the shock are found by integrating the equations of motion using the stream function. A shock shape theorised by Moeckel is assumed, and it is optimized by minimising the error between the shape of the body obtained using Schneider's method and the actual shape of the body. A further improvement in the shock shape is obtained by using Moeckel's shock shape in a small series expansion. With this shock shape, the whole flow field in the shock layer is calculated using Schneider's method by integrating the equations of motion. This solution is compared against a fifth-order-accurate numerical solution using the discontinuous Galerkin method (DGM), and the maximum error in density is found to be of the order of 0.001, which demonstrates the accuracy of the method for both plane and axisymmetric flows.
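
Structurally, the optimization wraps the inverse method in an error-minimization loop. The sketch below shows that structure only: `trace_body_from_shock` is a toy placeholder, not Schneider's stream-function integration, and the conic-like radius is an assumed stand-in for Moeckel's shock shape:

```python
import numpy as np
from scipy.optimize import minimize

def trace_body_from_shock(params, n=50):
    """Placeholder for the Schneider-method integration: maps shock-shape
    parameters to body-surface points. A real implementation integrates
    the equations of motion via the stream function."""
    a, b = params
    theta = np.linspace(0.0, np.pi / 3, n)
    r = a / (1.0 + b * np.cos(theta))   # toy conic-like radius (assumption)
    return np.column_stack([r * np.cos(theta), r * np.sin(theta)])

def body_error(params, body_radius=1.0):
    """Deviation of the implied body from a circular cylinder."""
    body = trace_body_from_shock(params)
    return np.mean((np.linalg.norm(body, axis=1) - body_radius) ** 2)

best = minimize(body_error, x0=[1.2, 0.3], method="Nelder-Mead")
```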

Trustworthy federated learning aims to achieve optimal performance while ensuring clients' privacy. Existing privacy-preserving federated learning approaches are mostly tailored to image data and lack applications for time series data, which has many important uses, such as machine health monitoring and human activity recognition. Furthermore, protective noising of a time series analytics model can significantly interfere with temporally dependent learning, leading to a greater decline in accuracy. To address these issues, we develop a privacy-preserving federated learning algorithm for time series data. Specifically, we employ local differential privacy to extend the privacy-protection trust boundary to the clients. We also incorporate shuffling techniques to achieve privacy amplification, mitigating the accuracy decline caused by local differential privacy. Extensive experiments were conducted on five time series datasets. The evaluation results reveal that our algorithm suffers minimal accuracy loss compared to non-private federated learning in both small- and large-client scenarios. Under the same level of privacy protection, our algorithm demonstrates improved accuracy compared to centralized differentially private federated learning in both scenarios.
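
A minimal sketch of the two mechanisms: each client noises its own update before release (local DP), and a shuffler breaks the link between clients and updates, which amplifies the guarantee. The clip and noise scales are illustrative, not the paper's calibration:

```python
import numpy as np

def local_dp_update(update, clip=1.0, sigma=1.0, rng=None):
    """Client side: clip the model update and add Gaussian noise locally,
    before anything leaves the device (local DP)."""
    rng = rng or np.random.default_rng()
    clipped = update * min(1.0, clip / (np.linalg.norm(update) + 1e-12))
    return clipped + rng.normal(0.0, sigma * clip, size=update.shape)

def shuffle_and_aggregate(noised_updates, rng=None):
    """Shuffler: permute anonymized updates so the server cannot link an
    update to a client; shuffling amplifies the local DP guarantee."""
    rng = rng or np.random.default_rng()
    updates = list(noised_updates)
    rng.shuffle(updates)
    return np.mean(updates, axis=0)
```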

In a decentralized machine learning system, data is typically partitioned among multiple devices or nodes, each of which trains a local model using its own data. These local models are then shared and combined to create a global model that can make accurate predictions on new data. In this paper, we begin to explore how the topology of the network connecting the nodes affects the performance of a machine learning model trained through direct collaboration between nodes. We investigate how different types of topologies impact the "spreading of knowledge", i.e., the ability of nodes to incorporate into their local models the knowledge derived from learning patterns in the data available at other nodes across the network. Specifically, we highlight the different roles played in this process by more and less connected nodes (hubs and leaves), as well as by macroscopic network properties (primarily, degree distribution and modularity). Among other findings, we show that, while even weak connectivity among network components is known to be sufficient for information spread, it may not be sufficient for knowledge spread. More intuitively, we also find that hubs play a more significant role than leaves in spreading knowledge, and that this holds not only for heavy-tailed degree distributions but also when "hubs" have only moderately more connections than leaves. Finally, we show that tightly knit communities severely hinder knowledge spread.
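
A toy decentralized-averaging round makes the role of topology concrete; `neighbors` encodes the graph, and comparing a ring with a star suggests why hubs spread knowledge faster (a sketch, not the paper's training protocol):

```python
import numpy as np

def gossip_round(models, neighbors):
    """One round of decentralized averaging: each node replaces its model
    with the mean of its own and its neighbours' models; `neighbors`
    encodes the topology."""
    return {i: np.mean([w] + [models[j] for j in neighbors[i]], axis=0)
            for i, w in models.items()}

# A 4-node ring versus a star with node 0 as the hub: the star spreads
# node 0's knowledge to everyone in one round, the ring needs several.
ring = {0: [1, 3], 1: [0, 2], 2: [1, 3], 3: [2, 0]}
star = {0: [1, 2, 3], 1: [0], 2: [0], 3: [0]}
```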

In this paper, we propose physics-informed neural operators (PINO) that combine training data and physics constraints to learn the solution operator of a given family of parametric Partial Differential Equations (PDE). PINO is the first hybrid approach incorporating data and PDE constraints at different resolutions to learn the operator. Specifically, in PINO, we combine coarse-resolution training data with PDE constraints imposed at a higher resolution. The resulting PINO model can accurately approximate the ground-truth solution operator for many popular PDE families and shows no degradation in accuracy even under zero-shot super-resolution, i.e., it can predict beyond the resolution of the training data. PINO uses the Fourier neural operator (FNO) framework, which is guaranteed to be a universal approximator for any continuous operator and discretization-convergent in the limit of mesh refinement. By adding PDE constraints to FNO at a higher resolution, we obtain a high-fidelity reconstruction of the ground-truth operator. Moreover, PINO succeeds in settings where no training data is available and only PDE constraints are imposed, while previous approaches, such as the Physics-Informed Neural Network (PINN), fail due to optimization challenges, e.g., in multi-scale dynamic systems such as Kolmogorov flows.
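
The hybrid objective can be sketched as a coarse-resolution data loss plus a fine-resolution PDE-residual penalty; `pde_residual` is a placeholder for the equation-specific residual, and the weighting is an assumption:

```python
import torch

def pino_loss(model, a_coarse, u_coarse, a_fine, pde_residual, w_pde=1.0):
    """Supervised data loss on coarse-resolution input/solution pairs plus
    a PDE-residual penalty evaluated on a finer grid, where no labels are
    needed; `pde_residual` maps a predicted solution to its equation
    residual (equation-specific, left abstract here)."""
    data_loss = torch.mean((model(a_coarse) - u_coarse) ** 2)
    u_fine = model(a_fine)              # query the operator at high resolution
    physics_loss = torch.mean(pde_residual(u_fine, a_fine) ** 2)
    return data_loss + w_pde * physics_loss
```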
