
Within distributed learning, workers typically compute gradients on their assigned dataset chunks and send them to the parameter server (PS), which aggregates them to compute either an exact or approximate version of $\nabla L$ (gradient of the loss function $L$). However, in large-scale clusters, many workers are slower than their promised speed or even failure-prone. A gradient coding solution introduces redundancy within the assignment of chunks to the workers and uses coding theoretic ideas to allow the PS to recover $\nabla L$ (exactly or approximately), even in the presence of stragglers. Unfortunately, most existing gradient coding protocols are inefficient from a computation perspective as they coarsely classify workers as operational or failed; the potentially valuable work performed by slow workers (partial stragglers) is ignored. In this work, we present novel gradient coding protocols that judiciously leverage the work performed by partial stragglers. Our protocols are efficient from a computation and communication perspective and numerically stable. For an important class of chunk assignments, we present efficient algorithms for optimizing the relative ordering of chunks within the workers; this ordering affects the overall execution time. For exact gradient reconstruction, our protocol is around $2\times$ faster than the original class of protocols and for approximate gradient reconstruction, the mean-squared-error of our reconstructed gradient is several orders of magnitude better.
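
To make the coding idea concrete, below is a minimal sketch of exact gradient recovery under a single straggler, in the spirit of classical cyclic gradient coding schemes (e.g., Tandon et al.'s); the encoding matrix, chunk gradients, and worker count are illustrative and are not the protocol proposed in this work.

```python
import numpy as np

# Sketch of exact gradient coding under one straggler (hypothetical
# 3-worker, 3-chunk cyclic scheme; B is an illustrative encoding matrix).
rng = np.random.default_rng(0)
d = 4                                  # gradient dimension
g = rng.normal(size=(3, d))            # per-chunk gradients g_1, g_2, g_3

# Row i of B is the linear combination of chunk gradients sent by worker i;
# every pair of rows can be combined to recover the all-ones row.
B = np.array([[0.5, 1.0, 0.0],
              [0.0, 1.0, -1.0],
              [0.5, 0.0, 1.0]])
coded = B @ g                          # each worker's transmitted vector

surv = [0, 2]                          # suppose the second worker straggles
# PS finds a with a^T B[surv] = (1, 1, 1), so a^T coded[surv] = sum_i g_i
a, *_ = np.linalg.lstsq(B[surv].T, np.ones(3), rcond=None)
full_grad = a @ coded[surv]
assert np.allclose(full_grad, g.sum(axis=0))
```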

Related Content

Deep neural networks and other modern machine learning models are often susceptible to adversarial attacks. Indeed, an adversary may often be able to change a model's prediction through a small, directed perturbation of the model's input, which is an issue in safety-critical applications. Adversarially robust machine learning is usually based on a minmax optimisation problem that minimises the machine learning loss under maximisation-based adversarial attacks. In this work, we study adversaries that determine their attack using a Bayesian statistical approach rather than maximisation. The resulting Bayesian adversarial robustness problem is a relaxation of the usual minmax problem. To solve this problem, we propose Abram, a continuous-time particle system designed to approximate the gradient flow corresponding to the underlying learning problem. We show that Abram approximates a McKean-Vlasov process and justify the use of Abram by giving assumptions under which the McKean-Vlasov process finds the minimiser of the Bayesian adversarial robustness problem. We discuss two ways to discretise Abram and show its suitability in benchmark adversarial deep learning experiments.
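
As a rough illustration of the particle-system idea, the sketch below runs one plausible two-timescale discretisation on a toy scalar model: perturbation particles follow Langevin dynamics targeting a Gibbs posterior over attacks (a Bayesian adversary), while the parameter descends the particle-averaged loss. The quadratic loss, Gaussian prior, and all step sizes are assumptions for illustration; this is not the paper's exact discretisation of Abram.

```python
import numpy as np

# Toy two-timescale particle discretisation (illustrative, not the paper's
# exact scheme): Langevin particles sample the Bayesian adversary's Gibbs
# posterior over perturbations; the parameter follows the averaged gradient.
rng = np.random.default_rng(1)

def grad_theta(theta, delta, x=1.0, y=0.0):
    # d/dtheta of the squared loss 0.5*(theta*(x+delta) - y)^2
    return (theta * (x + delta) - y) * (x + delta)

def grad_delta(theta, delta, x=1.0, y=0.0):
    # d/ddelta of the same loss
    return (theta * (x + delta) - y) * theta

theta = 2.0
deltas = rng.normal(scale=0.1, size=64)      # particle cloud of attacks
eta, tau, beta, eps = 0.05, 0.01, 4.0, 0.1   # assumed step sizes / temps

for _ in range(500):
    # Langevin ascent: particles target p(delta) ~ exp(beta*loss) * N(0, eps^2)
    deltas += tau * (beta * grad_delta(theta, deltas) - deltas / eps**2)
    deltas += np.sqrt(2.0 * tau) * rng.normal(size=deltas.size)
    # slow timescale: descend the particle-averaged (mean-field) loss
    theta -= eta * grad_theta(theta, deltas).mean()

print(theta)   # driven toward 0, the robust minimiser of this toy loss
```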

A common method for estimating the Hessian operator from random samples on a low-dimensional manifold involves locally fitting a quadratic polynomial. Although this estimator is widely used, it is unclear whether it introduces bias, especially on complex manifolds with boundaries and nonuniform sampling, and rigorous theoretical guarantees of its asymptotic behavior have been lacking. We show that, under mild conditions, this estimator asymptotically converges to the Hessian operator, with nonuniform sampling and curvature effects proving negligible, even near boundaries. Our analysis framework simplifies the intensive computations required for direct analysis.
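
The estimator itself is easy to state. Below is a minimal sketch of the local quadratic fit in flat 2D coordinates (on a curved manifold one would first construct local tangent coordinates, e.g. by local PCA); the test function, bandwidth, and sample size are illustrative.

```python
import numpy as np

# Local quadratic fit around x0 in flat 2D coordinates (illustrative
# function, bandwidth, and sample size).
rng = np.random.default_rng(2)
f = lambda u, v: np.sin(u) + u * v + 0.5 * v**2
x0 = np.array([0.3, -0.2])

pts = x0 + 0.05 * rng.normal(size=(200, 2))      # local samples near x0
u, v = (pts - x0).T                              # centered coordinates
# fit f ~ c0 + c1*u + c2*v + c3*u^2/2 + c4*u*v + c5*v^2/2
A = np.column_stack([np.ones_like(u), u, v, u**2 / 2, u * v, v**2 / 2])
c, *_ = np.linalg.lstsq(A, f(pts[:, 0], pts[:, 1]), rcond=None)

H_est = np.array([[c[3], c[4]], [c[4], c[5]]])   # estimated Hessian at x0
H_true = np.array([[-np.sin(x0[0]), 1.0], [1.0, 1.0]])
print(H_est, H_true, sep="\n")
```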

We prove that random quantum circuits on any geometry, including a 1D line, can form approximate unitary designs over $n$ qubits in $\log n$ depth. In a similar manner, we construct pseudorandom unitaries (PRUs) in 1D circuits in $\text{poly} \log n $ depth, and in all-to-all-connected circuits in $\text{poly} \log \log n $ depth. In all three cases, the $n$ dependence is optimal and improves exponentially over known results. These shallow quantum circuits have low complexity and create only short-range entanglement, yet are indistinguishable from unitaries with exponential complexity. Our construction glues local random unitaries on $\log n$-sized or $\text{poly} \log n$-sized patches of qubits to form a global random unitary on all $n$ qubits. In the case of designs, the local unitaries are drawn from existing constructions of approximate unitary $k$-designs, and hence also inherit an optimal scaling in $k$. In the case of PRUs, the local unitaries are drawn from existing unitary ensembles conjectured to form PRUs. Applications of our results include proving that classical shadows with 1D log-depth Clifford circuits are as powerful as those with deep circuits, demonstrating superpolynomial quantum advantage in learning low-complexity physical systems, and establishing quantum hardness for recognizing phases of matter with topological order.
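
A toy numerical illustration of the gluing construction: Haar-random unitaries (standing in for approximate designs or conjectured PRU ensembles) act on overlapping patches of a small 1D chain, and a shifted second layer glues the patches into a global unitary. The qubit and patch counts are illustrative; the actual constructions use $\log n$-sized or $\text{poly} \log n$-sized patches and rigorously analyzed local ensembles.

```python
import numpy as np

# Gluing Haar-random patch unitaries on a toy 1D chain (n = 8, patch = 4);
# sizes are illustrative stand-ins for log(n)-sized patches.
def haar_unitary(dim, rng):
    # Haar-random unitary via QR of a complex Gaussian matrix
    z = rng.normal(size=(dim, dim)) + 1j * rng.normal(size=(dim, dim))
    q, r = np.linalg.qr(z)
    return q * (np.diag(r) / np.abs(np.diag(r)))   # fix column phases

def apply_patch(state, u, start, k):
    # apply a 2^k x 2^k unitary on qubits [start, start+k)
    psi = state.reshape(2**start, 2**k, -1)
    return np.einsum('ab,ibj->iaj', u, psi).reshape(-1)

n, xi = 8, 4
rng = np.random.default_rng(3)
state = np.zeros(2**n, dtype=complex)
state[0] = 1.0
for s in (0, 4):          # layer 1: independent unitaries on disjoint patches
    state = apply_patch(state, haar_unitary(2**xi, rng), s, xi)
state = apply_patch(state, haar_unitary(2**xi, rng), 2, xi)  # gluing layer
print(np.linalg.norm(state))   # still 1: the glued map is unitary
```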

A new two-parameter discrete distribution, namely the PoiG distribution, is derived by the convolution of a Poisson variate and an independently distributed geometric random variable. This distribution generalizes both the Poisson and geometric distributions and can be used for modelling over-dispersed as well as equi-dispersed count data. A number of important statistical properties of the proposed count model are derived, such as the probability generating function, the moment generating function, the moments, the survival function and the hazard rate function. Monotonic properties, such as log-concavity and stochastic ordering, are also investigated in detail. Method-of-moments and maximum likelihood estimators of the parameters of the proposed model are presented. It is envisaged that the proposed distribution may prove to be useful for practitioners for modelling over-dispersed count data compared to its closest competitors.
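
For concreteness, the pmf of a PoiG variate can be computed by direct convolution; the sketch below (with illustrative parameter values, and the geometric supported on $\{0, 1, \dots\}$) also checks numerically that the mean is $\lambda + (1-p)/p$ and the variance is $\lambda + (1-p)/p^2$, exhibiting the over-dispersion.

```python
import numpy as np
from scipy.stats import geom, poisson

# PoiG pmf by convolving Poisson(lam) with a geometric(p) on {0, 1, ...}
# (illustrative parameter values).
def poig_pmf(k, lam, p):
    j = np.arange(k + 1)
    # scipy's geom lives on {1, 2, ...}; loc=-1 shifts it to {0, 1, ...}
    return np.sum(poisson.pmf(j, lam) * geom.pmf(k - j, p, loc=-1))

lam, p = 2.0, 0.4
ks = np.arange(60)
pmf = np.array([poig_pmf(k, lam, p) for k in ks])

mean = np.sum(ks * pmf)
var = np.sum(ks**2 * pmf) - mean**2
print(mean, lam + (1 - p) / p)        # ~ 3.5
print(var, lam + (1 - p) / p**2)      # ~ 5.75 > mean: over-dispersed
```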

Anomaly detection is a branch of data analysis and machine learning which aims at identifying observations that exhibit abnormal behaviour. Be it measurement errors, disease development, severe weather, defective items in production, failed equipment, financial fraud or crisis events, their timely identification, isolation and explanation constitute an important task in almost any branch of science and industry. By providing a robust ordering, data depth, a statistical function that measures how central a point of the space is with respect to a data set, becomes a particularly useful tool for the detection of anomalies. Already known for its theoretical properties, data depth has undergone substantial computational development in the last decade, and particularly in recent years, which has made it applicable to contemporary-sized problems of data analysis and machine learning. In this article, data depth is studied as an efficient anomaly detection tool that assigns abnormality labels to observations with lower depth values, in a multivariate setting. Practical questions concerning the necessity and reasonability of invariances, the shape of the depth function, its robustness and computational complexity, and the choice of threshold are discussed. Illustrations include use-cases that underline the advantageous behaviour of data depth in various settings.
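
As a minimal sketch of the recipe, the snippet below uses Mahalanobis depth, $D(x) = (1 + (x-\mu)^\top \Sigma^{-1}(x-\mu))^{-1}$, to flag the least-deep observations; the data, the 2% threshold, and the choice of depth are illustrative (robust or halfspace depths would follow the same pattern).

```python
import numpy as np

# Mahalanobis-depth anomaly flagging (illustrative data and 2% threshold;
# halfspace or projection depth would follow the same recipe).
rng = np.random.default_rng(4)
X = rng.multivariate_normal([0.0, 0.0], [[1.0, 0.5], [0.5, 1.0]], size=500)
X[:5] += 6.0                                   # plant five anomalies

mu = X.mean(axis=0)
Sigma_inv = np.linalg.inv(np.cov(X.T))
md2 = np.einsum('ij,jk,ik->i', X - mu, Sigma_inv, X - mu)
depth = 1.0 / (1.0 + md2)                      # Mahalanobis depth

threshold = np.quantile(depth, 0.02)           # flag the 2% least-deep points
print(np.where(depth <= threshold)[0])         # contains indices 0..4
```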

Mutual information has many applications in image alignment and matching, mainly due to its ability to measure the statistical dependence between two images, even if the two images are from different modalities (e.g., CT and MRI). It considers not only the pixel intensities of the images but also the spatial relationships between the pixels. In this project, we apply the mutual information formula to image matching, where image A is the moving object and image B is the target object, and calculate the mutual information between them to evaluate the similarity between the images. For comparison, we also use entropy and information-gain methods to test the dependency of the images. We also investigate the effect of different environments on the mutual information of the same image and demonstrate the results with experiments and plots.
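
A compact sketch of the histogram-based mutual information computation used in this kind of matching (the bin count and test images are illustrative):

```python
import numpy as np

# Histogram-based mutual information between two equally sized images
# (bin count and test images are illustrative).
def mutual_information(a, b, bins=32):
    joint, _, _ = np.histogram2d(a.ravel(), b.ravel(), bins=bins)
    pxy = joint / joint.sum()                  # joint distribution
    px, py = pxy.sum(axis=1), pxy.sum(axis=0)  # marginals
    nz = pxy > 0                               # avoid log(0)
    return np.sum(pxy[nz] * np.log(pxy[nz] / (px[:, None] * py[None, :])[nz]))

rng = np.random.default_rng(5)
A = rng.random((64, 64))
B_dependent = np.sqrt(A)           # a nonlinear but deterministic relation
B_independent = rng.random((64, 64))
print(mutual_information(A, B_dependent))    # large
print(mutual_information(A, B_independent))  # near zero
```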

Statistical learning under distribution shift is challenging when neither prior knowledge nor fully accessible data from the target distribution is available. Distributionally robust learning (DRL) aims to control the worst-case statistical performance within an uncertainty set of candidate distributions, but how to properly specify the set remains challenging. To enable distributional robustness without being overly conservative, in this paper, we propose a shape-constrained approach to DRL, which incorporates prior information about the way in which the unknown target distribution differs from its estimate. More specifically, we assume the unknown density ratio between the target distribution and its estimate is isotonic with respect to some partial order. At the population level, we provide a solution to the shape-constrained optimization problem that does not involve the isotonic constraint. At the sample level, we provide consistency results for an empirical estimator of the target in a range of different settings. Empirical studies on both synthetic and real data examples demonstrate the improved accuracy of the proposed shape-constrained approach.
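
In one dimension, the isotonic constraint can be illustrated by projecting a crude pointwise density-ratio estimate onto the cone of nondecreasing functions; the data and the crude estimator below are hypothetical and only sketch the shape-constrained idea, not the paper's estimator.

```python
import numpy as np
from sklearn.isotonic import IsotonicRegression

# Isotonic projection of a noisy density-ratio estimate in 1D (hypothetical
# data and crude estimator; sketch of the shape constraint only).
rng = np.random.default_rng(6)
x = np.sort(rng.uniform(0.0, 1.0, size=300))
true_ratio = 0.5 + x                            # nondecreasing in x
crude = true_ratio + rng.normal(scale=0.3, size=x.size)

iso = IsotonicRegression(increasing=True)
shaped = iso.fit_transform(x, crude)            # project onto monotone cone
print(np.mean((crude - true_ratio) ** 2))       # error of the raw estimate
print(np.mean((shaped - true_ratio) ** 2))      # smaller after projection
```

Since the isotonic fit is an $L_2$ projection onto a convex cone that contains the true (isotonic) ratio, the projected estimate can only move closer to the truth.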

Shape-restricted inferences have exhibited empirical success in various applications with survival data. However, certain works fall short of providing a rigorous theoretical justification and an easy-to-use variance estimator with theoretical guarantees. Motivated by Deng et al. (2023), this paper delves into an additive and shape-restricted partially linear Cox model for right-censored data, where each additive component satisfies a specific shape restriction, encompassing monotonic increasing/decreasing and convexity/concavity. We systematically investigate the consistency and convergence rates of the shape-restricted maximum partial likelihood estimator (SMPLE) of all the underlying parameters. We further establish the asymptotic normality and semiparametric efficiency of the SMPLE for the linear covariate shift. To estimate the asymptotic variance, we propose an innovative data-splitting variance estimation method that boasts exceptional versatility and broad applicability. Our simulation results and an analysis of the Rotterdam Breast Cancer dataset demonstrate that the SMPLE has comparable performance with the maximum likelihood estimator under the Cox model when the Cox model is correct, and outperforms the latter and Huang (1999)'s method when the Cox model is violated or the hazard is nonsmooth. Meanwhile, the proposed variance estimation method usually leads to reliable interval estimates based on the SMPLE and its competitors.
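
As a generic illustration of the data-splitting idea (emphatically not the authors' exact construction for the SMPLE), one can compute an estimator on disjoint folds and use the spread of the fold estimates to build an interval; the stand-in estimator and data below are hypothetical.

```python
import numpy as np

# Generic K-fold data-splitting variance recipe (stand-in estimator and
# data; not the authors' construction for the SMPLE).
rng = np.random.default_rng(7)
data = rng.exponential(scale=2.0, size=1000)    # illustrative survival times

K = 10
folds = np.array_split(rng.permutation(data), K)
theta_k = np.array([np.log(f.mean()) for f in folds])  # per-fold estimates

theta_hat = theta_k.mean()                      # aggregated estimate
se_hat = theta_k.std(ddof=1) / np.sqrt(K)       # SE from the fold spread
print(theta_hat - 1.96 * se_hat, theta_hat + 1.96 * se_hat)
```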

Stability is a basic requirement when studying the behavior of dynamical systems. However, stabilizing dynamical systems via reinforcement learning is challenging because only a small amount of data can be collected over short time horizons before instabilities are triggered and the data become meaningless. This work introduces a reinforcement learning approach that is formulated over latent manifolds of unstable dynamics, so that stabilizing policies can be trained from few data samples. The unstable manifolds are minimal in the sense that they contain the lowest-dimensional dynamics that are necessary for learning policies that guarantee stabilization. This is in stark contrast to generic latent manifolds that aim to approximate all -- stable and unstable -- system dynamics and thus are higher dimensional and often require larger amounts of data. Experiments demonstrate that the proposed approach stabilizes even complex physical systems from few data samples, for which other methods that operate either directly in the system state space or on generic latent manifolds fail.
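
For a linear toy system $x_{k+1} = Ax_k + Bu_k$, the "minimal unstable manifold" reduces to the unstable eigenspace of $A$; the sketch below keeps only that subspace and stabilizes the reduced dynamics. The matrices and the deadbeat gain are illustrative assumptions; the paper's latent manifolds are nonlinear and learned from data.

```python
import numpy as np

# Keep only the unstable eigenspace of A and stabilize the reduced dynamics
# (illustrative linear system; the paper's manifolds are nonlinear/learned).
A = np.diag([1.2, 0.95, 0.5, 0.3])      # exactly one unstable mode
B = np.ones((4, 1))

eigvals, eigvecs = np.linalg.eig(A)
V = np.real(eigvecs[:, np.abs(eigvals) > 1.0])   # unstable basis (4 x 1)
print("latent dimension:", V.shape[1])           # 1 instead of 4

Ar, Br = V.T @ A @ V, V.T @ B            # reduced dynamics z' = Ar z + Br u
K = Ar / Br                              # deadbeat gain for the 1D latent
z = V.T @ np.ones((4, 1))                # unstable coordinate of the state
for _ in range(5):
    z = Ar @ z - Br @ (K @ z)            # closed loop: (Ar - Br K) z = 0
print(abs(z).item())                     # the unstable coordinate is killed
```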

Deep learning is usually described as an experiment-driven field that is under continual criticism for lacking theoretical foundations. This problem has been partially addressed by a large volume of literature which has so far not been well organized. This paper reviews and organizes the recent advances in deep learning theory. The literature is categorized into six groups: (1) complexity and capacity-based approaches for analyzing the generalizability of deep learning; (2) stochastic differential equations and their dynamic systems for modelling stochastic gradient descent and its variants, which characterize the optimization and generalization of deep learning, partially inspired by Bayesian inference; (3) the geometrical structures of the loss landscape that drive the trajectories of the dynamic systems; (4) the roles of over-parameterization of deep neural networks from both positive and negative perspectives; (5) theoretical foundations of several special structures in network architectures; and (6) the increasingly intensive concerns regarding ethics and security and their relationships with generalizability.
