斗破苍穹第四季25集免费观看_日韩欧美国产AⅤ另类_甜味弥漫一区二区在线观看_一区二区精品日韩欧美在线观看_波多野结衣电影国产一区二区三区_一级做人爰全程视频在线看_日本伦理手机在线

Group imbalance has been a known problem in empirical risk minimization (ERM), where the achieved high average accuracy is accompanied by low accuracy in a minority group. Despite algorithmic efforts to improve the minority group accuracy, a theoretical generalization analysis of ERM on individual groups remains elusive. By formulating the group imbalance problem with the Gaussian Mixture Model, this paper quantifies the impact of individual groups on the sample complexity, the convergence rate, and the average and group-level testing performance. Although our theoretical framework is centered on binary classification using a one-hidden-layer neural network, to the best of our knowledge, we provide the first theoretical analysis of the group-level generalization of ERM in addition to the commonly studied average generalization performance. Sample insights of our theoretical results include that when all group-level co-variance is in the medium regime and all mean are close to zero, the learning performance is most desirable in the sense of a small sample complexity, a fast training rate, and a high average and group-level testing accuracy. Moreover, we show that increasing the fraction of the minority group in the training data does not necessarily improve the generalization performance of the minority group. Our theoretical results are validated on both synthetic and empirical datasets, such as CelebA and CIFAR-10 in image classification.

相關內容

GROUP

關注 1

Group一直是研究計算機支持的合作工作、人機交互、計算機支持的協作學習和社會技術研究的主要場所。該會議將社會科學、計算機科學、工程、設計、價值觀以及其他與小組工作相關的多個不同主題的工作結合起來，并進行了廣泛的概念化。官網鏈接： · 分離的 · Processing（編程語言） · 路徑 ·

2024 年 5 月 1 日

Quantum cryptographic protocols with dual messaging system via 2D alternate quantum walks and genuine single particle entangled states

Dinesh Kumar Panda,Colin Benjamin

from arxiv, 11 pages (including supplementary material), 2 figures and 1 table

Single-particle entangled states (SPES) can offer a more secure way of encoding and processing quantum information than their multi-particle counterparts. The SPES generated via a 2D alternate quantum-walk setup from initially separable states can be either 3-way or 2-way entangled. This letter shows that the generated genuine three-way and nonlocal two-way SPES can be used as cryptographic keys to securely encode two distinct messages simultaneously. We detail the message encryption-decryption steps and show the resilience of the 3-way and 2-way SPES-based cryptographic protocols against eavesdropper attacks like intercept-and-resend and man-in-the-middle. We also detail how these protocols can be experimentally realized using single photons, with the three degrees of freedom being OAM, path, and polarization. These have unparalleled security for quantum communication tasks. The ability to simultaneously encode two distinct messages using the generated SPES showcases the versatility and efficiency of the proposed cryptographic protocol. This capability could significantly improve the throughput of quantum communication systems.

Networking · SR · 極大 · QoS · AIM ·

2024 年 4 月 30 日

Alternative paths computation for congestion mitigation in segment-routing networks

Sébastien Martin,Youcef Magnouche,Paolo Medagliani,Jérémie Leguay

from arxiv, 6 pages

In backbone networks, it is fundamental to quickly protect traffic against any unexpected event, such as failures or congestions, which may impact Quality of Service (QoS). Standard solutions based on Segment Routing (SR), such as Topology-Independent Loop-Free Alternate (TI-LFA), are used in practice to handle failures, but no distributed solutions exist for distributed and tactical congestion mitigation. A promising approach leveraging SR has been recently proposed to quickly steer traffic away from congested links over alternative paths. As the pre-computation of alternative paths plays a paramount role to efficiently mitigating congestions, we investigate the associated path computation problem aiming at maximizing the amount of traffic that can be rerouted as well as the resilience against any 1-link failure. In particular, we focus on two variants of this problem. First, we maximize the residual flow after all possible failures. We show that the problem is NP-Hard, and we solve it via a Benders decomposition algorithm. Then, to provide a practical and scalable solution, we solve a relaxed variant problem, that maximizes, instead of flow, the number of surviving alternative paths after all possible failures. We provide a polynomial algorithm. Through numerical experiments, we compare the two variants and show that they allow to increase the amount of rerouted traffic and the resiliency of the network after any 1-link failure.

MoDELS · 有偏 · 推斷 · INFORMS · 卡爾曼濾波 ·

2024 年 4 月 29 日

A real-time digital twin of azimuthal thermoacoustic instabilities

Andrea Nóvoa,Nicolas Noiray,James R. Dawson,Luca Magri

from arxiv, 23 pages

When they occur, azimuthal thermoacoustic oscillations can detrimentally affect the safe operation of gas turbines and aeroengines. We develop a real-time digital twin of azimuthal thermoacoustics of a hydrogen-based annular combustor. The digital twin seamlessly combines two sources of information about the system (i) a physics-based low-order model; and (ii) raw and sparse experimental data from microphones, which contain both aleatoric noise and turbulent fluctuations. First, we derive a low-order thermoacoustic model for azimuthal instabilities, which is deterministic. Second, we propose a real-time data assimilation framework to infer the acoustic pressure, the physical parameters, and the model and measurement biases simultaneously. This is the bias-regularized ensemble Kalman filter (r-EnKF), for which we find an analytical solution that solves the optimization problem. Third, we propose a reservoir computer, which infers both the model bias and measurement bias to close the assimilation equations. Fourth, we propose a real-time digital twin of the azimuthal thermoacoustic dynamics of a laboratory hydrogen-based annular combustor for a variety of equivalence ratios. We find that the real-time digital twin (i) autonomously predicts azimuthal dynamics, in contrast to bias-unregularized methods; (ii) uncovers the physical acoustic pressure from the raw data, i.e., it acts as a physics-based filter; (iii) is a time-varying parameter system, which generalizes existing models that have constant parameters, and capture only slow-varying variables. The digital twin generalizes to all equivalence ratios, which bridges the gap of existing models. This work opens new opportunities for real-time digital twinning of multi-physics problems.

Networking · Neural Networks · 樣本 · 優化器 · UniFormer ·

2024 年 4 月 29 日

Optimal time sampling in physics-informed neural networks

Gabriel Turinici

Physics-informed neural networks (PINN) is a extremely powerful paradigm used to solve equations encountered in scientific computing applications. An important part of the procedure is the minimization of the equation residual which includes, when the equation is time-dependent, a time sampling. It was argued in the literature that the sampling need not be uniform but should overweight initial time instants, but no rigorous explanation was provided for these choice. In this paper we take some prototypical examples and, under standard hypothesis concerning the neural network convergence, we show that the optimal time sampling follows a truncated exponential distribution. In particular we explain when the time sampling is best to be uniform and when it should not be. The findings are illustrated with numerical examples on linear equation, Burgers' equation and the Lorenz system.

Networking · Neural Networks · 損失 · 圖片分類 · 統計量 ·

2024 年 4 月 29 日

Statistical theory for image classification using deep convolutional neural networks with cross-entropy loss under the hierarchical max-pooling model

Michael Kohler,Sophie Langer

from arxiv, arXiv admin note: text overlap with arXiv:2003.01526

Convolutional neural networks (CNNs) trained with cross-entropy loss have proven to be extremely successful in classifying images. In recent years, much work has been done to also improve the theoretical understanding of neural networks. Nevertheless, it seems limited when these networks are trained with cross-entropy loss, mainly because of the unboundedness of the target function. In this paper, we aim to fill this gap by analyzing the rate of the excess risk of a CNN classifier trained by cross-entropy loss. Under suitable assumptions on the smoothness and structure of the a posteriori probability, it is shown that these classifiers achieve a rate of convergence which is independent of the dimension of the image. These rates are in line with the practical observations about CNNs.

估計/估計量 · Extensibility · motivation · 泛函 · Analysis ·

2024 年 4 月 28 日

Semiparametric causal mediation analysis in cluster-randomized experiments

Chao Cheng,Fan Li

In cluster-randomized experiments, there is emerging interest in exploring the causal mechanism in which a cluster-level treatment affects the outcome through an intermediate outcome. Despite an extensive development of causal mediation methods in the past decade, only a few exceptions have been considered in assessing causal mediation in cluster-randomized studies, all of which depend on parametric model-based estimators. In this article, we develop the formal semiparametric efficiency theory to motivate several doubly-robust methods for addressing several mediation effect estimands corresponding to both the cluster-average and the individual-level treatment effects in cluster-randomized experiments--the natural indirect effect, natural direct effect, and spillover mediation effect. We derive the efficient influence function for each mediation effect, and carefully parameterize each efficient influence function to motivate practical strategies for operationalizing each estimator. We consider both parametric working models and data-adaptive machine learners to estimate the nuisance functions, and obtain semiparametric efficient causal mediation estimators in the latter case. Our methods are illustrated via extensive simulations and two completed cluster-randomized experiments.

近似 · Networking · Neural Networks · 泛函 · 層 ·

2024 年 4 月 26 日

Multi-layer random features and the approximation power of neural networks

Rustem Takhanov

from arxiv, Accepted to Uncertainty in Artificial Intelligence (UAI) 2024

A neural architecture with randomly initialized weights, in the infinite width limit, is equivalent to a Gaussian Random Field whose covariance function is the so-called Neural Network Gaussian Process kernel (NNGP). We prove that a reproducing kernel Hilbert space (RKHS) defined by the NNGP contains only functions that can be approximated by the architecture. To achieve a certain approximation error the required number of neurons in each layer is defined by the RKHS norm of the target function. Moreover, the approximation can be constructed from a supervised dataset by a random multi-layer representation of an input vector, together with training of the last layer's weights. For a 2-layer NN and a domain equal to an $n-1$-dimensional sphere in ${\mathbb R}^n$, we compare the number of neurons required by Barron's theorem and by the multi-layer features construction. We show that if eigenvalues of the integral operator of the NNGP decay slower than $k^{-n-\frac{2}{3}}$ where $k$ is an order of an eigenvalue, then our theorem guarantees a more succinct neural network approximation than Barron's theorem. We also make some computational experiments to verify our theoretical findings. Our experiments show that realistic neural networks easily learn target functions even when both theorems do not give any guarantees.

異常點 · Analysis · INFORMS · 列 · 列聯表 ·

2024 年 4 月 26 日

Correspondence analysis: handling cell-wise outliers via the reconstitution algorithm

Qianqian Qi,David J. Hessen,Aike N. Vonk,Peter G. M. van der Heijden

Correspondence analysis (CA) is a popular technique to visualize the relationship between two categorical variables. CA uses the data from a two-way contingency table and is affected by the presence of outliers. The supplementary points method is a popular method to handle outliers. Its disadvantage is that the information from entire rows or columns is removed. However, outliers can be caused by cells only. In this paper, a reconstitution algorithm is introduced to cope with such cells. This algorithm can reduce the contribution of cells in CA instead of deleting entire rows or columns. Thus the remaining information in the row and column involved can be used in the analysis. The reconstitution algorithm is compared with two alternative methods for handling outliers, the supplementary points method and MacroPCA. It is shown that the proposed strategy works well.

貪心 · 模態 · MoDELS · 學成 · 泛化理論 ·

2022 年 2 月 10 日

Characterizing and overcoming the greedy nature of learning in multi-modal deep neural networks

Nan Wu,Stanis?aw Jastrz?bski,Kyunghyun Cho,Krzysztof J. Geras

We hypothesize that due to the greedy nature of learning in multi-modal deep neural networks, these models tend to rely on just one modality while under-fitting the other modalities. Such behavior is counter-intuitive and hurts the models' generalization, as we observe empirically. To estimate the model's dependence on each modality, we compute the gain on the accuracy when the model has access to it in addition to another modality. We refer to this gain as the conditional utilization rate. In the experiments, we consistently observe an imbalance in conditional utilization rates between modalities, across multiple tasks and architectures. Since conditional utilization rate cannot be computed efficiently during training, we introduce a proxy for it based on the pace at which the model learns from each modality, which we refer to as the conditional learning speed. We propose an algorithm to balance the conditional learning speeds between modalities during training and demonstrate that it indeed addresses the issue of greedy learning. The proposed algorithm improves the model's generalization on three datasets: Colored MNIST, Princeton ModelNet40, and NVIDIA Dynamic Hand Gesture.

FCN · 全卷積網絡 · 3D · 級聯 · MoDELS ·

2018 年 3 月 20 日

An application of cascaded 3D fully convolutional networks for medical image segmentation

Holger R. Roth,Hirohisa Oda,Xiangrong Zhou,Natsuki Shimizu,Ying Yang,Yuichiro Hayashi,Masahiro Oda,Michitaka Fujiwara,Kazunari Misawa,Kensaku Mori

from arxiv, Preprint accepted for publication in Computerized Medical Imaging and Graphics. Substantial extension of arXiv:1704.06382; Corrected references to figure numbers in this version

Recent advances in 3D fully convolutional networks (FCN) have made it feasible to produce dense voxel-wise predictions of volumetric images. In this work, we show that a multi-class 3D FCN trained on manually labeled CT scans of several anatomical structures (ranging from the large organs to thin vessels) can achieve competitive segmentation results, while avoiding the need for handcrafting features or training class-specific models. To this end, we propose a two-stage, coarse-to-fine approach that will first use a 3D FCN to roughly define a candidate region, which will then be used as input to a second 3D FCN. This reduces the number of voxels the second FCN has to classify to ~10% and allows it to focus on more detailed segmentation of the organs and vessels. We utilize training and validation sets consisting of 331 clinical CT images and test our models on a completely unseen data collection acquired at a different hospital that includes 150 CT scans, targeting three anatomical organs (liver, spleen, and pancreas). In challenging organs such as the pancreas, our cascaded approach improves the mean Dice score from 68.5 to 82.2%, achieving the highest reported average score on this dataset. We compare with a 2D FCN method on a separate dataset of 240 CT scans with 18 classes and achieve a significantly higher performance in small organs and vessels. Furthermore, we explore fine-tuning our models to different datasets. Our experiments illustrate the promise and robustness of current 3D FCN based semantic segmentation of medical images, achieving state-of-the-art results. Our code and trained models are available for download: //github.com/holgerroth/3Dunet_abdomen_cascade.