
The aim of this paper is twofold. Based on the geometric Wasserstein tangent space, we first introduce Wasserstein steepest descent flows. These are locally absolutely continuous curves in the Wasserstein space whose tangent vectors point into a steepest descent direction of a given functional. This allows the use of Euler forward schemes instead of minimizing movement schemes introduced by Jordan, Kinderlehrer and Otto. For locally Lipschitz continuous functionals which are $\lambda$-convex along generalized geodesics, we show that there exists a unique Wasserstein steepest descent flow which coincides with the Wasserstein gradient flow. The second aim is to study Wasserstein flows of the maximum mean discrepancy with respect to certain Riesz kernels. The crucial part is hereby the treatment of the interaction energy. Although it is not $\lambda$-convex along generalized geodesics, we give analytic expressions for Wasserstein steepest descent flows of the interaction energy starting at Dirac measures. In contrast to smooth kernels, the particle may explode, i.e., a Dirac measure becomes a non-Dirac one. The computation of steepest descent flows amounts to finding equilibrium measures with external fields, which nicely links Wasserstein flows of interaction energies with potential theory. Finally, we provide numerical simulations of Wasserstein steepest descent flows of discrepancies.
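As a small illustration of the discrepancy discussed in this abstract, the squared maximum mean discrepancy between two empirical measures can be computed directly from kernel averages. The sketch below is not taken from the paper: it assumes the Riesz kernel has the form $K(x,y) = -\|x-y\|^r$ with $r \in (0,2)$, and the function names `riesz_kernel` and `mmd_squared` are hypothetical.

```python
import math


def riesz_kernel(x, y, r=1.0):
    """Assumed Riesz-type kernel K(x, y) = -||x - y||^r for points given as tuples."""
    return -math.dist(x, y) ** r


def mmd_squared(xs, ys, r=1.0):
    """Squared MMD between the empirical measures of the point lists xs and ys."""

    def mean_kernel(a, b):
        # Average of K over all pairs (p, q) with p in a and q in b.
        return sum(riesz_kernel(p, q, r) for p in a for q in b) / (len(a) * len(b))

    # MMD^2 = E K(X, X') + E K(Y, Y') - 2 E K(X, Y)
    return mean_kernel(xs, xs) + mean_kernel(ys, ys) - 2.0 * mean_kernel(xs, ys)


if __name__ == "__main__":
    same = [(0.0,), (1.0,)]
    print(mmd_squared(same, same))        # identical measures
    print(mmd_squared([(0.0,)], [(1.0,)]))  # two distinct Dirac measures
```

For $r = 1$ this quantity coincides with the energy distance, so it vanishes when both point sets are identical and is positive for two distinct Dirac measures.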

Related content

Steepest descent method


Estimation/Estimator · Mixture of experts · Maximum likelihood estimation · Networking · Neural Networks

May 12, 2023

Towards Convergence Rates for Parameter Estimation in Gaussian-gated Mixture of Experts

Huy Nguyen, TrungTin Nguyen, Khai Nguyen, Nhat Ho

From arXiv: 30 pages, 4 figures; Huy Nguyen and TrungTin Nguyen contributed equally to this work

Originally introduced as a neural network for ensemble learning, mixture of experts (MoE) has recently become a fundamental building block of highly successful modern deep neural networks for heterogeneous data analysis in several applications, including those in machine learning, statistics, bioinformatics, economics, and medicine. Despite its popularity in practice, a satisfactory level of understanding of the convergence behavior of Gaussian-gated MoE parameter estimation is far from complete. The underlying reason for this challenge is the inclusion of covariates in the Gaussian gating and expert networks, which leads to their intrinsically complex interactions via partial differential equations with respect to their parameters. We address these issues by designing novel Voronoi loss functions to accurately capture heterogeneity in the maximum likelihood estimator (MLE) for resolving parameter estimation in these models. Our results reveal distinct behaviors of the MLE under two settings: the first setting is when all the location parameters in the Gaussian gating are non-zeros while the second setting is when there exists at least one zero-valued location parameter. Notably, these behaviors can be characterized by the solvability of two different systems of polynomial equations. Finally, we conduct a simulation study to verify our theoretical results.

Networking · Graph · Graph attention network · Functional · Learning

May 12, 2023

Graph Neural Modeling of Network Flows

Victor-Alexandru Darvariu, Stephen Hailes, Mirco Musolesi

Network flow problems, which involve distributing traffic over a network such that the underlying infrastructure is used effectively, are ubiquitous in transportation and logistics. Among them, the Multi-Commodity Network Flow (MCNF) problem is of general interest, as it concerns the distribution of multiple flows of different sizes between several sources and sinks, while achieving effective utilization of the links. Due to the appeal of data-driven optimization, these problems have increasingly been approached using graph learning methods. In this paper, we propose a novel graph learning architecture for network flow problems called Per-Edge Weights (PEW). This method builds on a Graph Attention Network and uses distinctly parametrized message functions along each link. We extensively evaluate the proposed solution through an Internet flow routing case study using $17$ Service Provider topologies and $2$ routing schemes. We show that PEW yields substantial gains over architectures whose global message function constrains the routing unnecessarily. We also find that an MLP is competitive with other standard architectures. Furthermore, we shed some light on the relationship between graph structure and predictive performance for data-driven routing of flows, an aspect that has not been considered by existing work in the area.

Networking · Gossip protocol · Example · Processing (programming language) · MoDELS

May 12, 2023

Parameterized Verification of Disjunctive Timed Networks

Étienne André, Paul Eichler, Swen Jacobs, Shyam Lal Karra

From arXiv: 21 pages, 6 figures

We introduce new techniques for the parameterized verification of disjunctive timed networks (DTNs), i.e., networks of timed automata (TAs) that communicate via location guards that enable a transition only if at least one process is in a given location. This computational model has been considered in the literature before, and example applications are gossiping clock synchronization protocols or planning problems. We address the minimum-time reachability problem (minreach) in DTNs, and show how to efficiently solve it based on a novel zone-graph algorithm. We further show that solving minreach allows us to construct a summary TA capturing exactly the possible behaviors of a single TA within a DTN of arbitrary size. The combination of these two results enables the parameterized verification of DTNs, while avoiding the construction of an exponential-size cutoff-system required by existing results. Our techniques are also implemented, and experiments show their practicality.

Regularization term · Smoothing · Learning · CASES · Kernelization

May 12, 2023

Random Smoothing Regularization in Kernel Gradient Descent Learning

Liang Ding, Tianyang Hu, Jiahang Jiang, Donghao Li, Wenjia Wang, Yuan Yao

Random smoothing data augmentation is a unique form of regularization that can prevent overfitting by introducing noise to the input data, encouraging the model to learn more generalized features. Despite its success in various applications, there has been a lack of systematic study on the regularization ability of random smoothing. In this paper, we aim to bridge this gap by presenting a framework for random smoothing regularization that can adaptively and effectively learn a wide range of ground truth functions belonging to the classical Sobolev spaces. Specifically, we investigate two underlying function spaces: the Sobolev space of low intrinsic dimension, which includes the Sobolev space in $D$-dimensional Euclidean space or low-dimensional sub-manifolds as special cases, and the mixed smooth Sobolev space with a tensor structure. By using random smoothing regularization as novel convolution-based smoothing kernels, we can attain optimal convergence rates in these cases using a kernel gradient descent algorithm, either with early stopping or weight decay. It is noteworthy that our estimator can adapt to the structural assumptions of the underlying data and avoid the curse of dimensionality. This is achieved through various choices of injected noise distributions such as Gaussian, Laplace, or general polynomial noises, allowing for broad adaptation to the aforementioned structural assumptions of the underlying data. The convergence rate depends only on the effective dimension, which may be significantly smaller than the actual data dimension. We conduct numerical experiments on simulated data to validate our theoretical results.

Signal Processing · Processing (programming language) · Performer · Range · CASE

May 11, 2023

Generalized signals on simplicial complexes

Xingchao Jian, Feng Ji, Wee Peng Tay

Topological signal processing (TSP) over simplicial complexes typically assumes observations associated with the simplicial complexes are real scalars. In this paper, we develop TSP theories for the case where observations belong to abelian groups more general than real numbers, including function spaces that are commonly used to represent time-varying signals. Our approach generalizes the Hodge decomposition and allows for signal processing tasks to be performed on these more complex observations. We propose a unified and flexible framework for TSP that expands its applicability to a wider range of signal processing applications. Numerical results demonstrate the effectiveness of this approach and provide a foundation for future research in this area.

Analysis · Performer · MoDELS · Informative prior · INFORMS

May 11, 2023

Bayesian sensitivity analysis for a missing data model

Bart Eggen, Stéphanie L. van der Pas, Aad W. van der Vaart

In causal inference, sensitivity analysis is important to assess the robustness of study conclusions to key assumptions. We perform sensitivity analysis of the assumption that missing outcomes are missing completely at random. We follow a Bayesian approach, which is nonparametric for the outcome distribution and can be combined with an informative prior on the sensitivity parameter. We give insight into the posterior and provide theoretical guarantees in the form of Bernstein-von Mises theorems for estimating the mean outcome. We study different parametrisations of the model involving Dirichlet process priors on the distribution of the outcome and on the distribution of the outcome conditional on the subject being treated. We show that these parametrisations incorporate a prior on the sensitivity parameter in different ways and discuss the relative merits. We also present a simulation study, showing the performance of the methods in finite sample scenarios.

MoDELS · Variance · Extensibility · Performer · Exact

May 10, 2023

Bayesian variance change point detection with credible sets

Lorenzo Cappello, Oscar Hernan Madrid Padilla

This paper introduces a novel Bayesian approach to detect changes in the variance of a Gaussian sequence model, focusing on quantifying the uncertainty in the change point locations and providing a scalable algorithm for inference. Such a measure of uncertainty is necessary when change point methods are deployed in sensitive applications, for example, when one is interested in determining whether an organ is viable for transplant. The key of our proposal is framing the problem as a product of multiple single changes in the scale parameter. We fit the model through an iterative procedure similar to what is done for additive models. The novelty is that each iteration returns a probability distribution on time instances, which captures the uncertainty in the change point location. Leveraging a recent result in the literature, we can show that our proposal is a variational approximation of the exact model posterior distribution. We study the algorithm's convergence and the change point localization rate. Extensive experiments in simulation studies illustrate the performance of our method and the possibility of generalizing it to more complex data-generating mechanisms. We apply the new model to an experiment involving a novel technique to assess the viability of a liver, and to oceanographic data.

Controller · Weight · CASE · Trial · Extensibility

May 10, 2023

Case Weighted Adaptive Power Priors for Hybrid Control Analyses with Time-to-Event Data

Evan Kwiatkowski, Jiawen Zhu, Xiao Li, Herbert Pang, Grazyna Lieberman, Matthew A. Psioda

From arXiv: 27 pages, 10 figures

We develop a method for hybrid analyses that uses external controls to augment internal control arms in randomized controlled trials (RCT) where the degree of borrowing is determined based on similarity between RCT and external control patients to account for systematic differences (e.g. unmeasured confounders). The method represents a novel extension of the power prior where discounting weights are computed separately for each external control based on compatibility with the randomized control data. The discounting weights are determined using the predictive distribution for the external controls derived via the posterior distribution for time-to-event parameters estimated from the RCT. This method is applied using a proportional hazards regression model with piecewise constant baseline hazard. A simulation study and a real-data example are presented based on a completed trial in non-small cell lung cancer. It is shown that the case weighted adaptive power prior provides robust inference under various forms of incompatibility between the external controls and RCT population.

Minimum point · Width · Graph · Weight · Identifiable

May 9, 2023

Width Helps and Hinders Splitting Flows

Manuel Cáceres, Massimo Cairo, Andreas Grigorjew, Shahbaz Khan, Brendan Mumey, Romeo Rizzi, Alexandru I. Tomescu, Lucia Williams

From arXiv: a preliminary version was submitted to ESA 2022

Minimum flow decomposition (MFD) is the NP-hard problem of finding a smallest decomposition of a network flow/circulation $X$ on a directed graph $G$ into weighted source-to-sink paths whose superposition equals $X$. We show that, for acyclic graphs, considering the \emph{width} of the graph (the minimum number of paths needed to cover all of its edges) yields advances in our understanding of its approximability. For the version of the problem that uses only non-negative weights, we identify and characterise a new class of \emph{width-stable} graphs, for which a popular heuristic is a \gwsimple-approximation ($|X|$ being the total flow of $X$), and strengthen its worst-case approximation ratio from $\Omega(\sqrt{m})$ to $\Omega(m / \log m)$ for sparse graphs, where $m$ is the number of edges in the graph. We also study a new problem on graphs with cycles, Minimum Cost Circulation Decomposition (MCCD), and show that it generalises MFD through a simple reduction. For the version allowing also negative weights, we give a $(\lceil \log \Vert X \Vert \rceil +1)$-approximation ($\Vert X \Vert$ being the maximum absolute value of $X$ on any edge) using a power-of-two approach, combined with parity fixing arguments and a decomposition of unitary circulations ($\Vert X \Vert \leq 1$), using a generalised notion of width for this problem. Finally, we disprove a conjecture about the linear independence of minimum (non-negative) flow decompositions posed by Kloster et al. [ALENEX 2018], but show that its useful implication (polynomial-time assignments of weights to a given set of paths to decompose a flow) holds for the negative version.
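To make the decomposition problem in this abstract concrete, here is a minimal sketch of one common greedy strategy for acyclic graphs: repeatedly extract the source-to-sink path with the largest bottleneck and subtract it from the flow. It is an illustration only, not necessarily the heuristic analysed in the paper, and the names `widest_path` and `greedy_flow_decomposition` are hypothetical. The input is assumed to be a valid flow on a DAG, given as a dict from edges `(u, v)` to positive integers.

```python
def widest_path(residual, source, sink):
    """Return (path, bottleneck) for the widest source-to-sink path, or (None, 0)."""
    # Adjacency lists restricted to edges that still carry flow.
    succ = {}
    for (u, v), f in residual.items():
        if f > 0:
            succ.setdefault(u, []).append(v)

    best = {}  # node -> (best bottleneck to sink, next node on that path)

    def visit(u):
        if u == sink:
            return float("inf")
        if u in best:
            return best[u][0]
        top, nxt = 0, None
        for v in succ.get(u, []):
            w = min(residual[(u, v)], visit(v))
            if w > top:
                top, nxt = w, v
        best[u] = (top, nxt)
        return top

    weight = visit(source)
    if weight == 0:
        return None, 0
    path = [source]
    while path[-1] != sink:
        path.append(best[path[-1]][1])
    return path, weight


def greedy_flow_decomposition(flow, source, sink):
    """Greedily split a DAG flow into weighted source-to-sink paths."""
    residual = dict(flow)
    paths = []
    while True:
        path, weight = widest_path(residual, source, sink)
        if path is None:
            break
        for edge in zip(path, path[1:]):
            residual[edge] -= weight  # remove the extracted path from the flow
        paths.append((path, weight))
    return paths


if __name__ == "__main__":
    flow = {("s", "a"): 3, ("s", "b"): 2, ("a", "t"): 3, ("b", "t"): 2}
    print(greedy_flow_decomposition(flow, "s", "t"))
```

Each iteration zeroes at least one edge, so the loop ends after at most as many rounds as there are edges. The greedy result is a valid decomposition but is not guaranteed to use the minimum number of paths.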

Generative adversarial network · Support vector machine

October 17, 2019

Connections between Support Vector Machines, Wasserstein distance and gradient-penalty GANs