The ability to align points across two related yet incomparable point clouds (e.g. living in different spaces) plays an important role in machine learning. The Gromov-Wasserstein (GW) framework provides an increasingly popular answer to such problems, by seeking a low-distortion, geometry-preserving assignment between these points. As a non-convex, quadratic generalization of optimal transport (OT), GW is NP-hard. While practitioners often resort to solving GW approximately as a nested sequence of entropy-regularized OT problems, the cubic complexity (in the number $n$ of samples) of that approach is a roadblock. We show in this work how a recent variant of the OT problem that restricts the set of admissible couplings to those having a low-rank factorization is remarkably well suited to the resolution of GW: when applied to GW, we show that this approach is not only able to compute a stationary point of the GW problem in time $O(n^2)$, but also uniquely positioned to benefit from the knowledge that the initial cost matrices are low-rank, to yield a linear time $O(n)$ GW approximation. Our approach yields similar results, yet orders of magnitude faster computation than the SoTA entropic GW approaches, on both simulated and real data.
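As a rough illustration of why the objective is costly to evaluate, the sketch below computes the squared-loss GW objective for a given coupling in two ways: by direct summation over all index quadruples, and through an expansion that needs only matrix products. The function names and the plain-list representation are assumptions made for this illustration, not code from the paper.

```python
def gw_cost_naive(A, B, P):
    """Squared-loss GW objective by direct summation over all index quadruples."""
    n, m = len(A), len(B)
    total = 0.0
    for i in range(n):
        for k in range(n):
            for j in range(m):
                for l in range(m):
                    total += (A[i][k] - B[j][l]) ** 2 * P[i][j] * P[k][l]
    return total


def matmul(X, Y):
    """Plain-Python matrix product, sufficient for a small illustration."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)] for row in X]


def gw_cost_factored(A, B, P):
    """Same objective, expanded so that the cross term is a chain of matrix products."""
    n, m = len(A), len(B)
    a = [sum(row) for row in P]          # row marginals of the coupling
    b = [sum(col) for col in zip(*P)]    # column marginals of the coupling
    term_a = sum(A[i][k] ** 2 * a[i] * a[k] for i in range(n) for k in range(n))
    term_b = sum(B[j][l] ** 2 * b[j] * b[l] for j in range(m) for l in range(m))
    P_t = [list(col) for col in zip(*P)]
    PBPt = matmul(matmul(P, B), P_t)     # entry [i][k] is sum_{j,l} P[i][j] B[j][l] P[k][l]
    cross = sum(A[i][k] * PBPt[i][k] for i in range(n) for k in range(n))
    return term_a + term_b - 2.0 * cross


# Hypothetical toy data: two small cost matrices and a coupling with uniform marginals.
A = [[0.0, 1.0, 2.0], [1.0, 0.0, 1.0], [2.0, 1.0, 0.0]]
B = [[0.0, 3.0], [3.0, 0.0]]
P = [[1 / 3, 0.0], [1 / 6, 1 / 6], [0.0, 1 / 3]]
cost = gw_cost_factored(A, B, P)
```

With dense matrices, the product chain in `gw_cost_factored` is the cubic-cost step. When the coupling and the cost matrices are available as thin low-rank factors, the same chain can be evaluated through those factors without forming any dense n-by-n matrix, which is the kind of saving the abstract refers to.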

Related content

Rank

Reversible · Concurrent systems · Backtracking · Systems · Consistency

March 28, 2023

Bridging Causal Reversibility and Time Reversibility: A Stochastic Process Algebraic Approach

Marco Bernardo,Claudio A. Mezzina

Causal reversibility blends reversibility and causality for concurrent systems. It indicates that an action can be undone provided that all of its consequences have been undone already, thus making it possible to bring the system back to a past consistent state. Time reversibility is instead considered in the field of stochastic processes, mostly for efficient analysis purposes. A performance model based on a continuous-time Markov chain is time reversible if its stochastic behavior remains the same when the direction of time is reversed. We bridge these two theories of reversibility by showing the conditions under which causal reversibility and time reversibility are both ensured by construction. This is done in the setting of a stochastic process calculus, which is then equipped with a variant of stochastic bisimilarity accounting for both forward and backward directions.

Gini index · Generalized · Fairness · Non-smooth · Non-smooth optimization

March 28, 2023

Optimizing generalized Gini indices for fairness in rankings

Virginie Do,Nicolas Usunier

from arxiv, Accepted to SIGIR 2022

There is growing interest in designing recommender systems that aim at being fair towards item producers or their least satisfied users. Inspired by the domain of inequality measurement in economics, this paper explores the use of generalized Gini welfare functions (GGFs) as a means to specify the normative criterion that recommender systems should optimize for. GGFs weight individuals depending on their ranks in the population, giving more weight to worse-off individuals to promote equality. Depending on these weights, GGFs minimize the Gini index of item exposure to promote equality between items, or focus on the performance on specific quantiles of least satisfied users. GGFs for ranking are challenging to optimize because they are non-differentiable. We resolve this challenge by leveraging tools from non-smooth optimization and projection operators used in differentiable sorting. We present experiments using real datasets with up to 15k users and items, which show that our approach obtains better trade-offs than the baselines on a variety of recommendation tasks and fairness criteria.
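A minimal sketch of the rank-weighted welfare described above, assuming the common form in which utilities are sorted from worst-off to best-off and paired with non-increasing weights. The function name and the example numbers are hypothetical.

```python
def generalized_gini(utilities, weights):
    """Rank-weighted welfare: the worst-off individual receives the largest weight."""
    if len(utilities) != len(weights):
        raise ValueError("utilities and weights must have the same length")
    if any(w1 < w2 for w1, w2 in zip(weights, weights[1:])):
        raise ValueError("weights must be non-increasing")
    ranked = sorted(utilities)  # ascending: worst-off first
    return sum(w * u for w, u in zip(weights, ranked))


# Hypothetical example: the same total utility, spread unequally and equally.
weights = [0.5, 0.3, 0.2]
unequal = generalized_gini([3.0, 1.0, 2.0], weights)
equal = generalized_gini([2.0, 2.0, 2.0], weights)
```

Because of the sorting step, the function is piecewise linear in the utilities and is not differentiable where two utilities tie, which is the difficulty the abstract addresses with non-smooth optimization tools.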

Stochastic geometry · Geometric graphs · Geodesics · Manifolds · Discrete

March 27, 2023

Stability of Entropic Wasserstein Barycenters and application to random geometric graphs

Marc Theveneau,Nicolas Keriven

As interest in graph data has grown in recent years, the computation of various geometric tools has become essential. In some areas, such as mesh processing, these tools often rely on the computation of geodesics and shortest paths in discretized manifolds. A recent example of such a tool is the computation of Wasserstein barycenters (WB), a very general notion of barycenters derived from the theory of Optimal Transport, and their entropic-regularized variant. In this paper, we examine how WBs on discretized meshes relate to the geometry of the underlying manifold. We first provide a generic stability result with respect to the input cost matrices. We then apply this result to random geometric graphs on manifolds, whose shortest paths converge to geodesics, hence proving the consistency of WBs computed on discretized shapes.

Posterior distribution · Empirical frequency · CMS · Bayesian · Samples

March 27, 2023

Random measure priors in Bayesian frequency recovery from sketches

Mario Beraha,Stefano Favaro

Given a lossy-compressed representation, or sketch, of data with values in a set of symbols, the frequency recovery problem considers the estimation of the empirical frequency of a new data point. Recent studies have applied Bayesian nonparametrics (BNPs) to develop learning-augmented versions of the popular count-min sketch (CMS) recovery algorithm. In this paper, we present a novel BNP approach to frequency recovery, which is not built from the CMS but still relies on a sketch obtained by random hashing. Assuming data to be modeled as random samples from an unknown discrete distribution, which is endowed with a Poisson-Kingman (PK) prior, we provide the posterior distribution of the empirical frequency of a symbol, given the sketch. Estimates are then obtained as mean functionals. An application of our result is presented for the Dirichlet process (DP) and Pitman-Yor process (PYP) priors, and in particular: i) we characterize the DP prior as the sole PK prior featuring a property of sufficiency with respect to the sketch, leading to a simple posterior distribution; ii) we identify a large sample regime under which the PYP prior leads to a simple approximation of the posterior distribution. Then, we develop our BNP approach to a "traits" formulation of the frequency recovery problem, not yet studied in the CMS literature, in which data belong to more than one symbol (trait), and exhibit nonnegative integer levels of associations with each trait. In particular, by modeling data as random samples from a generalized Indian buffet process, we provide the posterior distribution of the empirical frequency level of a trait, given the sketch. This result is then applied under the assumption of a Poisson and Bernoulli distribution for the levels of associations, leading to a simple posterior distribution and a simple approximation of the posterior distribution, respectively.
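For context, here is a minimal sketch of the classic count-min sketch that the abstract uses as its point of comparison. It is not the Bayesian nonparametric estimator proposed in the paper. The class name, the SHA-256-based hashing, and the table sizes are assumptions chosen for the illustration.

```python
import hashlib


class CountMinSketch:
    """Classic count-min sketch: `depth` hash rows, each with `width` counters."""

    def __init__(self, width, depth):
        self.width = width
        self.depth = depth
        self.table = [[0] * width for _ in range(depth)]

    def _bucket(self, row, symbol):
        # Derive one hash function per row by salting the symbol with the row index.
        digest = hashlib.sha256(f"{row}:{symbol}".encode("utf-8")).digest()
        return int.from_bytes(digest[:8], "big") % self.width

    def add(self, symbol):
        for row in range(self.depth):
            self.table[row][self._bucket(row, symbol)] += 1

    def estimate(self, symbol):
        # Collisions can only inflate a counter, so the minimum never underestimates.
        return min(
            self.table[row][self._bucket(row, symbol)] for row in range(self.depth)
        )


# Hypothetical stream of symbols.
sketch = CountMinSketch(width=16, depth=4)
for symbol in ["a"] * 5 + ["b"] * 3 + ["c"]:
    sketch.add(symbol)
upper_bound_for_a = sketch.estimate("a")
```

The estimate is an upper bound on the true empirical frequency. Learning-augmented and Bayesian approaches such as the one in the abstract aim to replace this bound with a posterior distribution over the frequency given the hashed counts.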

Preconditioning · Difference · Nonlocal · Parallel computing · Parallel

March 25, 2023

Preconditioned Algorithm for Difference of Convex Functions with applications to Graph Ginzburg-Landau Model

Xinhua Shen,Hongpeng Sun,Xuecheng Tai

In this work, we propose and study a preconditioned framework with a graph Ginzburg-Landau functional for image segmentation and data clustering by parallel computing. Solving nonlocal models is usually challenging due to the heavy computational burden. For the nonconvex and nonlocal variational functional, we propose several damped Jacobi and generalized Richardson preconditioners for the large-scale linear systems within a difference-of-convex-functions algorithm framework. They are efficient for parallel computing on GPUs and can reduce the computational cost. Our framework also provides flexible step sizes with a global convergence guarantee. Numerical experiments show that the proposed algorithms are very competitive compared with the spectral method based on singular value decomposition.

Time series · Multivariate time series · Sequence clustering · Sequence · Similarity

March 24, 2023

Clustering Multivariate Time Series using Energy Distance

Richard A. Davis,Leon Fernandes,Konstantinos Fokianos

from arxiv, 26 pages, 7 figures, to be published in Journal of Time Series Analysis

A novel methodology is proposed for clustering multivariate time series data using the energy distance defined in Székely and Rizzo (2013). Specifically, a dissimilarity matrix is formed using the energy distance statistic to measure separation between the finite dimensional distributions for the component time series. Once the pairwise dissimilarity matrix is calculated, a hierarchical clustering method is then applied to obtain the dendrogram. This procedure is completely nonparametric as the dissimilarities between stationary distributions are directly calculated without making any model assumptions. In order to justify this procedure, asymptotic properties of the energy distance estimates are derived for general stationary and ergodic time series. The method is illustrated in a simulation study for various component time series that are either linear or nonlinear. Finally, the methodology is applied to two examples: one involves the GDP of selected countries and the other the population size of various states in the U.S.A. in the years 1900-1999.
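The first step of the procedure, building the pairwise dissimilarity matrix, can be sketched as follows. This uses the standard empirical energy distance, 2 E|X-Y| - E|X-X'| - E|Y-Y'|, and treats each series as a plain sample of points, which is a simplification of the finite dimensional distributions used in the paper. Function names and data are hypothetical.

```python
import math


def mean_pairwise_distance(xs, ys):
    """Average Euclidean distance over all pairs (x, y) with x in xs and y in ys."""
    total = sum(math.dist(x, y) for x in xs for y in ys)
    return total / (len(xs) * len(ys))


def energy_distance(xs, ys):
    """Empirical energy distance between two samples of points."""
    return (
        2.0 * mean_pairwise_distance(xs, ys)
        - mean_pairwise_distance(xs, xs)
        - mean_pairwise_distance(ys, ys)
    )


def dissimilarity_matrix(samples):
    """Symmetric matrix of pairwise energy distances, with zeros on the diagonal."""
    k = len(samples)
    D = [[0.0] * k for _ in range(k)]
    for i in range(k):
        for j in range(i + 1, k):
            D[i][j] = D[j][i] = energy_distance(samples[i], samples[j])
    return D


# Hypothetical one-dimensional samples; each point is a tuple of coordinates.
series = [
    [(0.0,), (1.0,)],
    [(0.0,), (1.0,)],
    [(5.0,), (6.0,)],
]
D = dissimilarity_matrix(series)
```

The resulting matrix can then be passed to any hierarchical clustering routine that accepts precomputed dissimilarities in order to obtain the dendrogram.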

Regularization · Convergence rate · Gromov-Wasserstein distance · Sample complexity · Optimal

March 24, 2023

Gromov-Wasserstein Distances: Entropic Regularization, Duality, and Sample Complexity

Zhengxin Zhang,Ziv Goldfeld,Youssef Mroueh,Bharath K. Sriperumbudur

from arxiv, 32 pages

The Gromov-Wasserstein (GW) distance quantifies dissimilarity between metric measure spaces and provides a meaningful figure of merit for applications involving heterogeneous data. While computational aspects of the GW distance have been widely studied, a strong duality theory and fundamental statistical questions concerning empirical convergence rates remained obscure. This work closes these gaps for the $(2,2)$-GW distance (namely, with quadratic cost) over Euclidean spaces of different dimensions $d_x$ and $d_y$. We consider both the standard GW and the entropic GW (EGW) distances, derive their dual forms, and use them to analyze expected empirical convergence rates. The resulting rates are $n^{-2/\max\{d_x,d_y,4\}}$ (up to a log factor when $\max\{d_x,d_y\}=4$) and $n^{-1/2}$ for the two-sample GW and EGW problems, respectively, which match the corresponding rates for standard and entropic optimal transport distances. We also study stability of EGW in the entropic regularization parameter and establish approximation and continuity results for the cost and optimal couplings. Lastly, the duality is leveraged to shed new light on the open problem of the one-dimensional GW distance between uniform distributions on $n$ points, illuminating why the identity and anti-identity permutations may not be optimal. Our results serve as a first step towards a comprehensive statistical theory as well as computational advancements for GW distances, based on the discovered dual formulation.

Finite elements · Error estimates · Discrete · Mesh · Vortices

March 24, 2023

Error bounds for discrete minimizers of the Ginzburg-Landau energy in the high-$\kappa$ regime

Benjamin Dörich,Patrick Henning

In this work, we study discrete minimizers of the Ginzburg-Landau energy in finite element spaces. Special focus is given to the influence of the Ginzburg-Landau parameter $\kappa$. This parameter is of physical interest as large values can trigger the appearance of vortex lattices. Since the vortices have to be resolved on sufficiently fine computational meshes, it is important to translate the size of $\kappa$ into a mesh resolution condition, which can be done through error estimates that are explicit with respect to $\kappa$ and the spatial mesh width $h$. For that, we first work in an abstract framework for a general class of discrete spaces, where we present convergence results in a problem-adapted $\kappa$-weighted norm. Afterwards we apply our findings to Lagrangian finite elements and a particular generalized finite element construction. In numerical experiments we confirm that our derived $L^2$- and $H^1$-error estimates are indeed optimal in $\kappa$ and $h$.

INFORMS · Divergence · Markov · Continuity · state-of-the-art

March 24, 2023

Lower Bounds on the Bayesian Risk via Information Measures

Amedeo Roberto Esposito,Adrien Vandenbroucque,Michael Gastpar

This paper focuses on parameter estimation and introduces a new method for lower bounding the Bayesian risk. The method allows for the use of virtually any information measure, including Rényi's $\alpha$, $\varphi$-Divergences, and Sibson's $\alpha$-Mutual Information. The approach considers divergences as functionals of measures and exploits the duality between spaces of measures and spaces of functions. In particular, we show that one can lower bound the risk with any information measure by upper bounding its dual via Markov's inequality. We are thus able to provide estimator-independent impossibility results thanks to the Data-Processing Inequalities that divergences satisfy. The results are then applied to settings of interest involving both discrete and continuous parameters, including the "Hide-and-Seek" problem, and compared to the state-of-the-art techniques. An important observation is that the behaviour of the lower bound in the number of samples is influenced by the choice of the information measure. We leverage this by introducing a new divergence inspired by the "Hockey-Stick" Divergence, which is demonstrated empirically to provide the largest lower-bound across all considered settings. If the observations are subject to privatisation, stronger impossibility results can be obtained via Strong Data-Processing Inequalities. The paper also discusses some generalisations and alternative directions.

Learning · Neural Networks · Networking · Reducible · Networks

September 1, 2022

Learning with Differentiable Algorithms

Felix Petersen

from arxiv, PhD thesis (summa cum laude), University of Konstanz, 162 pages

Classic algorithms and machine learning systems like neural networks are both abundant in everyday life. While classic computer science algorithms are suitable for precise execution of exactly defined tasks such as finding the shortest path in a large graph, neural networks allow learning from data to predict the most likely answer in more complex tasks such as image classification, which cannot be reduced to an exact algorithm. To get the best of both worlds, this thesis explores combining both concepts leading to more robust, better performing, more interpretable, more computationally efficient, and more data efficient architectures. The thesis formalizes the idea of algorithmic supervision, which allows a neural network to learn from or in conjunction with an algorithm. When integrating an algorithm into a neural architecture, it is important that the algorithm is differentiable such that the architecture can be trained end-to-end and gradients can be propagated back through the algorithm in a meaningful way. To make algorithms differentiable, this thesis proposes a general method for continuously relaxing algorithms by perturbing variables and approximating the expectation value in closed form, i.e., without sampling. In addition, this thesis proposes differentiable algorithms, such as differentiable sorting networks, differentiable renderers, and differentiable logic gate networks. Finally, this thesis presents alternative training strategies for learning with algorithms.
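One way to picture a continuous relaxation of this kind is a sorting network whose hard compare-and-swap is replaced by a smooth blend. The sketch below assumes logistic perturbations, for which the expected value of the comparison has a closed form (the logistic sigmoid), and an odd-even transposition network. The function names and the `beta` temperature parameter are illustrative choices, not the thesis's code.

```python
import math


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def soft_swap(a, b, beta=1.0):
    """Relaxed compare-and-swap: returns smooth versions of (min, max)."""
    s = sigmoid((b - a) / beta)  # close to 1 when a is clearly smaller than b
    soft_min = s * a + (1.0 - s) * b
    soft_max = s * b + (1.0 - s) * a
    return soft_min, soft_max


def soft_sort(values, beta=1.0):
    """Odd-even transposition network built from relaxed swaps."""
    xs = list(values)
    n = len(xs)
    for layer in range(n):
        start = layer % 2  # alternate between even and odd neighbor pairs
        for i in range(start, n - 1, 2):
            xs[i], xs[i + 1] = soft_swap(xs[i], xs[i + 1], beta)
    return xs


nearly_sorted = soft_sort([3.0, 1.0, 2.0], beta=1e-3)
smoothed = soft_sort([3.0, 1.0, 2.0], beta=1.0)
```

As `beta` shrinks, the output approaches the hard sort. For larger `beta` the outputs are blended, but every operation is differentiable in the inputs, so gradients can flow through the whole network.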