
The use of heuristics to assess the convergence and compress the output of Markov chain Monte Carlo can be sub-optimal in terms of the empirical approximations that are produced. Typically a number of the initial states are attributed to "burn in" and removed, whilst the remainder of the chain is "thinned" if compression is also required. In this paper we consider the problem of retrospectively selecting a subset of states, of fixed cardinality, from the sample path such that the approximation provided by their empirical distribution is close to optimal. A novel method is proposed, based on greedy minimisation of a kernel Stein discrepancy, that is suitable for problems where heavy compression is required. Theoretical results guarantee consistency of the method and its effectiveness is demonstrated in the challenging context of parameter inference for ordinary differential equations. Software is available in the Stein Thinning package in Python, R and MATLAB.
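
The greedy construction the abstract refers to is simple to sketch: at each step, add the sample point that most reduces the kernel Stein discrepancy of the selected set. Below is a minimal NumPy sketch using a Langevin Stein kernel built from an inverse multiquadric base kernel; the kernel parameters (c = 1, beta = -1/2) and function names are illustrative, not the Stein Thinning package's API.

```python
import numpy as np

def imq_stein_kernel(X, G, c=1.0, beta=-0.5):
    """Langevin Stein kernel matrix for points X (n, d) with scores
    G = grad log p at X. Base kernel: (c^2 + ||x - y||^2)^beta."""
    D = X[:, None, :] - X[None, :, :]        # pairwise differences
    r2 = np.sum(D ** 2, axis=-1)             # squared distances
    base = c ** 2 + r2
    d = X.shape[1]
    cross = np.einsum('nmd,nmd->nm', D, G[None, :, :] - G[:, None, :])
    return (-2 * beta * (d * base ** (beta - 1)
                         + 2 * (beta - 1) * r2 * base ** (beta - 2))
            + 2 * beta * base ** (beta - 1) * cross
            + base ** beta * (G @ G.T))

def stein_thin(X, G, m):
    """Greedily pick m indices whose empirical distribution minimises
    the kernel Stein discrepancy (repeats are allowed)."""
    K = imq_stein_kernel(X, G)
    obj = np.diag(K).copy()                  # objective for the first point
    idx = []
    for _ in range(m):
        j = int(np.argmin(obj))
        idx.append(j)
        obj += 2 * K[:, j]                   # running sum over selected points
    return np.array(idx)

rng = np.random.default_rng(0)
X = rng.standard_normal((500, 2))            # stand-in for an MCMC sample path
G = -X                                       # grad log p for a standard Gaussian
print(stein_thin(X, G, m=20))
```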

Related Content

Choosing the optimal machine learning algorithm for a given problem is rarely straightforward. To help future researchers, we identify in this paper the best-performing algorithm among a set of strong candidates. We built a synthetic data set and performed supervised machine learning runs for five different algorithms. For heterogeneous data, we identified Random Forest, among others, to be the best algorithm.
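
The comparison protocol is easy to reproduce in outline. The abstract does not name the five algorithms or the data-generation settings, so the choices below are assumptions made purely for illustration, using scikit-learn.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

# Synthetic data with informative and redundant features (illustrative).
X, y = make_classification(n_samples=2000, n_features=20,
                           n_informative=8, random_state=0)

models = {
    'logistic': LogisticRegression(max_iter=1000),
    'knn': KNeighborsClassifier(),
    'svm': SVC(),
    'tree': DecisionTreeClassifier(random_state=0),
    'forest': RandomForestClassifier(random_state=0),
}
for name, model in models.items():
    scores = cross_val_score(model, X, y, cv=5)
    print(f'{name:10s} accuracy = {scores.mean():.3f} +/- {scores.std():.3f}')
```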

We study the offline data-driven sequential decision making problem in the framework of Markov decision processes (MDPs). In order to enhance the generalizability and adaptivity of the learned policy, we propose to evaluate each policy by a set of average rewards with respect to distributions centered at the policy-induced stationary distribution. Given a pre-collected dataset of multiple trajectories generated by some behavior policy, our goal is to learn a robust policy in a pre-specified policy class that maximizes the smallest value in this set. Leveraging the theory of semi-parametric statistics, we develop a statistically efficient policy learning method for estimating the defined robust optimal policy. A rate-optimal regret bound, up to a logarithmic factor, is established in terms of the total number of decision points in the dataset.
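
The abstract does not specify how the distribution set is constructed; one common choice, used here purely for illustration, is a KL ball around the policy's stationary distribution, for which the worst-case average reward has a well-known one-dimensional dual.

```python
import numpy as np
from scipy.optimize import minimize_scalar

def worst_case_value(rewards, probs, delta):
    """Smallest expected reward over distributions within KL radius delta
    of the nominal distribution probs, via the standard dual formula:
        inf_{KL(q||p) <= delta} E_q[r]
          = sup_{lam > 0} -lam * log E_p[exp(-r/lam)] - lam * delta."""
    def neg_dual(lam):
        z = -rewards / lam
        m = z.max()                               # log-sum-exp stabilisation
        log_mgf = m + np.log(np.dot(probs, np.exp(z - m)))
        return lam * log_mgf + lam * delta        # negative of the dual
    res = minimize_scalar(neg_dual, bounds=(1e-6, 1e3), method='bounded')
    return -res.fun

rewards = np.array([1.0, 0.5, 2.0, 0.1])
probs = np.full(4, 0.25)       # stand-in for a policy's stationary distribution
print(worst_case_value(rewards, probs, delta=0.1))
```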

The graph matching problem seeks an alignment between the nodes of two graphs that minimizes the number of adjacency disagreements. Solving graph matching is increasingly important due to its applications in operations research, computer vision, neuroscience, and more. However, while current state-of-the-art algorithms achieve good accuracy, they are inefficient when matching very large graphs. The main computational bottleneck of these algorithms is the linear assignment problem, which must be solved at each iteration. In this paper, we leverage recent advances in the field of optimal transport to replace the accepted use of linear assignment algorithms. We present GOAT, a modification of the state-of-the-art graph matching approximation algorithm "FAQ" (Vogelstein, 2015) that replaces its linear sum assignment step with the "Lightspeed Optimal Transport" method of Cuturi (2013). The modification improves both speed and empirical matching accuracy. The effectiveness of the approach is demonstrated on simulated and real data.
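
The substitution at the heart of GOAT is easy to sketch: inside FAQ's Frank-Wolfe iteration, the linear assignment over the gradient is replaced by an entropy-regularised transport plan computed with Sinkhorn iterations. Below is a minimal Sinkhorn routine for uniform marginals; the regularisation strength and iteration count are illustrative, not the paper's settings.

```python
import numpy as np

def sinkhorn(cost, reg=1.0, n_iter=200):
    """Entropy-regularised OT plan for a square cost matrix with uniform
    marginals (Cuturi's 'Lightspeed Optimal Transport')."""
    n = cost.shape[0]
    K = np.exp(-cost / reg)           # Gibbs kernel
    mu = np.full(n, 1.0 / n)          # uniform marginals
    u = np.full(n, 1.0 / n)
    for _ in range(n_iter):
        v = mu / (K.T @ u)            # scale columns to match marginals
        u = mu / (K @ v)              # scale rows to match marginals
    return u[:, None] * K * v[None, :]

# Inside a Frank-Wolfe step one would use sinkhorn(-gradient) in place of
# the permutation returned by scipy.optimize.linear_sum_assignment(-gradient).
```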

We give a new algorithm for the estimation of the cross-covariance matrix $\mathbb{E} XY'$ of two large dimensional signals $X\in\mathbb{R}^n$, $Y\in \mathbb{R}^p$ in the context where the number $T$ of observations of the pair $(X,Y)$ is itself large, but with $T/n$ and $T/p$ not necessarily small. In the asymptotic regime where $n,p,T$ are large, this algorithm is, with high probability, optimal in Frobenius norm among rotationally invariant estimators, i.e. estimators derived from the empirical estimator by cleaning the singular values while leaving the singular vectors unchanged.
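
The class of rotationally invariant estimators is concrete: take the SVD of the empirical cross-covariance, keep the singular vectors, and shrink the singular values. The sketch below uses a soft-threshold purely as a placeholder; the paper derives the actual optimal cleaning function, which depends on $n/T$ and $p/T$ and is not reproduced here.

```python
import numpy as np

def rie_cross_covariance(X, Y, shrink):
    """Rotationally invariant estimator of E[X Y']: keep the singular
    vectors of the empirical cross-covariance, clean the singular values."""
    T = X.shape[0]
    C = X.T @ Y / T                              # empirical cross-covariance
    U, s, Vt = np.linalg.svd(C, full_matrices=False)
    return U @ (shrink(s)[:, None] * Vt)

rng = np.random.default_rng(0)
T, n, p = 500, 100, 80
X = rng.standard_normal((T, n))
Y = 0.3 * X[:, :p] + rng.standard_normal((T, p))   # correlated signals
# Illustrative soft-threshold cleaning only.
est = rie_cross_covariance(X, Y, shrink=lambda s: np.maximum(s - 0.2, 0.0))
```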

Understanding the generalization of deep neural networks is one of the most important tasks in deep learning. Although much progress has been made, theoretical error bounds still often behave disparately from empirical observations. In this work, we develop margin-based generalization bounds, where the margins are normalized with optimal transport costs between independent random subsets sampled from the training distribution. In particular, the optimal transport cost can be interpreted as a generalization of variance which captures the structural properties of the learned feature space. Our bounds robustly predict the generalization error, given training data and network parameters, on large scale datasets. Theoretically, we demonstrate that the concentration and separation of features play crucial roles in generalization, supporting empirical results in the literature. The code is available at \url{https://github.com/chingyaoc/kV-Margin}.
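
The normalizing quantity is computable exactly: for two equal-size subsets with uniform weights, the optimal transport cost reduces to a linear assignment over pairwise feature distances. The sketch below shows only this normalizer, not the full kV-Margin bound; the feature array is a random stand-in for a network's penultimate-layer features.

```python
import numpy as np
from scipy.optimize import linear_sum_assignment
from scipy.spatial.distance import cdist

def subset_transport_cost(feats, k, rng):
    """Exact OT cost between two independent random subsets of the
    learned features (uniform weights, equal sizes), via assignment."""
    idx = rng.choice(len(feats), size=2 * k, replace=False)
    A, B = feats[idx[:k]], feats[idx[k:]]
    cost = cdist(A, B)                       # pairwise Euclidean costs
    r, c = linear_sum_assignment(cost)
    return cost[r, c].mean()

rng = np.random.default_rng(0)
feats = rng.normal(size=(1000, 64))          # stand-in for learned features
print(subset_transport_cost(feats, k=200, rng=rng))
```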

We study Markov chain Monte Carlo (MCMC) algorithms for target distributions defined on matrix spaces. This important sampling problem has yet to be explored analytically. We take a major step towards filling this gap by developing a theoretical framework that allows for the identification of ergodicity properties of typical MCMC algorithms relevant in this context. Beyond the standard Random-Walk Metropolis (RWM) and preconditioned Crank--Nicolson (pCN) algorithms, a contribution of this paper is the development of a novel algorithm, termed the `Mixed' pCN (MpCN). RWM and pCN are shown not to be geometrically ergodic for an important class of matrix distributions with heavy tails. In contrast, MpCN is robust across targets with different tail behaviour and has very good empirical performance within the class of heavy-tailed distributions. Geometric ergodicity for MpCN is not fully proven in this work, as some remaining drift conditions are quite challenging to obtain owing to the complexity of the state space. We do, however, make substantial progress towards a proof, and show in detail the steps left for future work. We illustrate the computational performance of the various algorithms through numerical applications, including calibration on real data of a challenging model arising in financial statistics.
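
The MpCN proposal itself is defined in the paper; as context, here is a generic sketch of the standard pCN step it modifies, written for a target specified by its log-density ratio with respect to a standard Gaussian reference (a matrix-valued state can be passed in vectorised form). The target in the loop is an arbitrary illustrative choice.

```python
import numpy as np

def pcn_step(x, log_ratio, rho, rng):
    """One preconditioned Crank--Nicolson step. `log_ratio` is the log
    target density relative to the N(0, I) reference measure, so the
    Gaussian factors cancel in the acceptance ratio."""
    prop = rho * x + np.sqrt(1.0 - rho ** 2) * rng.standard_normal(x.shape)
    if np.log(rng.uniform()) < log_ratio(prop) - log_ratio(x):
        return prop, True
    return x, False

rng = np.random.default_rng(0)
x = rng.standard_normal(9)            # e.g. a vectorised 3x3 matrix state
for _ in range(100):
    x, _ = pcn_step(x, lambda z: -0.1 * np.sum(z ** 4), rho=0.9, rng=rng)
```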

The Wasserstein distance, rooted in optimal transport (OT) theory, is a popular discrepancy measure between probability distributions with various applications to statistics and machine learning. Despite their rich structure and demonstrated utility, Wasserstein distances are sensitive to outliers in the considered distributions, which hinders applicability in practice. Inspired by the Huber contamination model, we propose a new outlier-robust Wasserstein distance $\mathsf{W}_p^\varepsilon$ which allows for $\varepsilon$ outlier mass to be removed from each contaminated distribution. Our formulation amounts to a highly regular optimization problem that lends itself better to analysis than previously considered frameworks. Leveraging this, we conduct a thorough theoretical study of $\mathsf{W}_p^\varepsilon$, encompassing characterization of optimal perturbations, regularity, duality, and statistical estimation and robustness results. In particular, by decoupling the optimization variables, we arrive at a simple dual form for $\mathsf{W}_p^\varepsilon$ that can be implemented via an elementary modification to standard, duality-based OT solvers. We illustrate the benefits of our framework via applications to generative modeling with contaminated datasets.
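
In the discrete case, letting each marginal discard up to an $\varepsilon$ fraction of mass can be written as a small linear program: relax the marginal constraints to inequalities with renormalised bounds and fix the total transported mass to one. This is a sketch in the spirit of the abstract, not the paper's exact definition of $\mathsf{W}_p^\varepsilon$ or its dual-form solver.

```python
import numpy as np
from scipy.optimize import linprog
from scipy.spatial.distance import cdist

def robust_ot(x, y, mu, nu, eps):
    """Illustrative outlier-robust OT cost (p = 1): row sums of the plan
    are <= mu/(1-eps), column sums <= nu/(1-eps), total mass = 1."""
    n, m = len(mu), len(nu)
    C = cdist(x, y)                            # ground cost matrix
    A_ub = np.zeros((n + m, n * m))
    for i in range(n):
        A_ub[i, i * m:(i + 1) * m] = 1.0       # row-sum constraints
    for j in range(m):
        A_ub[n + j, j::m] = 1.0                # column-sum constraints
    b_ub = np.concatenate([mu, nu]) / (1.0 - eps)
    A_eq = np.ones((1, n * m))                 # total mass constraint
    res = linprog(C.ravel(), A_ub=A_ub, b_ub=b_ub,
                  A_eq=A_eq, b_eq=[1.0], bounds=(0, None))
    return res.fun

rng = np.random.default_rng(0)
x, y = rng.normal(size=(20, 2)), rng.normal(size=(20, 2))
y[0] += 50.0                                   # one gross outlier
mu = nu = np.full(20, 1 / 20)
print(robust_ot(x, y, mu, nu, eps=0.05))       # outlier mass discarded
print(robust_ot(x, y, mu, nu, eps=0.0))        # plain OT, outlier dominates
```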

A $(1+\epsilon)$-approximate distance oracle of an edge-weighted graph is a data structure that returns an approximate shortest path distance between any two query vertices up to a $(1+\epsilon)$ factor. Thorup (FOCS 2001, JACM 2004) and Klein (SODA 2002) independently constructed a $(1+\epsilon)$-approximate distance oracle with $O(n\log n)$ space, measured in number of words, and $O(1)$ query time when $G$ is an undirected planar graph with $n$ vertices and $\epsilon$ is a fixed constant. Many follow-up works gave $(1+\epsilon)$-approximate distance oracles with various trade-offs between space and query time. However, improving upon the $O(n\log n)$ space bound without sacrificing query time has remained an open problem for almost two decades. In this work, we resolve this problem affirmatively by constructing a $(1+\epsilon)$-approximate distance oracle with optimal $O(n)$ space and $O(1)$ query time for undirected planar graphs and fixed $\epsilon$. We also make substantial progress for planar digraphs with non-negative edge weights. For fixed $\epsilon > 0$, we give a $(1+\epsilon)$-approximate distance oracle with space $o(n\log(Nn))$ and $O(\log\log(Nn))$ query time; here $N$ is the ratio between the largest and smallest positive edge weight. This improves Thorup's (FOCS 2001, JACM 2004) $O(n\log(Nn)\log n)$ space bound by more than a logarithmic factor while matching the query time of his structure. This is the first improvement for planar digraphs in two decades, both in the weighted and unweighted setting.
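
The basic mechanism behind these oracles, going back to Thorup and Klein, is portal-based: store each vertex's distance to a few "portals" on a shortest-path separator, then answer a query by routing through the best portal. The toy below illustrates only this one level on a grid graph with a hand-picked separator; a real $(1+\epsilon)$ oracle chooses $O(1/\epsilon)$ portals per separator and recurses on the pieces, none of which is done here.

```python
import heapq
from collections import defaultdict

def dijkstra(adj, src):
    """Single-source shortest paths on a weighted adjacency dict."""
    dist = {src: 0.0}
    heap = [(0.0, src)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist.get(u, float('inf')):
            continue
        for v, w in adj[u]:
            if d + w < dist.get(v, float('inf')):
                dist[v] = d + w
                heapq.heappush(heap, (d + w, v))
    return dist

class PortalOracle:
    """Answer distance queries via the best of a few separator portals."""
    def __init__(self, adj, portals):
        self.tables = {p: dijkstra(adj, p) for p in portals}
    def query(self, u, v):
        return min(t[u] + t[v] for t in self.tables.values())

# 5x5 unit-weight grid; the middle column plays the separator path.
adj = defaultdict(list)
def node(i, j): return 5 * i + j
for i in range(5):
    for j in range(5):
        for di, dj in ((0, 1), (1, 0)):
            if i + di < 5 and j + dj < 5:
                a, b = node(i, j), node(i + di, j + dj)
                adj[a].append((b, 1.0)); adj[b].append((a, 1.0))
portals = [node(i, 2) for i in (0, 2, 4)]    # sparse portals on the separator
oracle = PortalOracle(adj, portals)
print(oracle.query(node(0, 0), node(4, 4)))  # 8.0, exact here
```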

In classical statistics, a statistical experiment consisting of $n$ i.i.d. observations from $d$-dimensional multinomial distributions can be well approximated by a $(d-1)$-dimensional Gaussian distribution. In a quantum version of this result, it has been shown that a collection of $n$ qudits of full rank can be well approximated by a quantum system containing a classical part, which is a $(d-1)$-dimensional Gaussian distribution, and a quantum part containing an ensemble of $d(d-1)/2$ shifted thermal states. In this paper, we obtain a generalization of this result for qudits that are not of full rank. We show that when the rank of the qudits is $r$, the limiting experiment consists of an $(r-1)$-dimensional Gaussian distribution and an ensemble of both shifted pure and shifted thermal states. We also outline a two-stage procedure for the estimation of the low-rank qudit, obtaining an estimator that is sharp minimax optimal. For the estimation of a linear functional of the quantum state, we construct an estimator, analyze its risk, and use quantum LAN to show that our estimator is also optimal in the minimax sense.
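
For reference, the classical statement being generalised can be written compactly. The notation below (local parameter $h$, covariance $\Sigma(\theta_0)$) is standard in the local asymptotic normality literature but is not taken verbatim from the paper.

```latex
% Classical local approximation of the multinomial experiment: for an
% interior point \theta_0 of the simplex and local parameter h with
% \sum_i h_i = 0,
\[
  \mathrm{Mult}\bigl(n,\ \theta_0 + h/\sqrt{n}\bigr)
  \;\rightsquigarrow\;
  \mathcal{N}\bigl(h,\ \Sigma(\theta_0)\bigr),
  \qquad
  \Sigma(\theta_0) = \operatorname{diag}(\theta_0) - \theta_0\theta_0^{\top},
\]
% a Gaussian shift experiment of dimension d-1, since \Sigma(\theta_0)
% has rank d-1 on the tangent space of the simplex.
```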

We show that for the problem of testing if a matrix $A \in F^{n \times n}$ has rank at most $d$, or requires changing an $\epsilon$-fraction of entries to have rank at most $d$, there is a non-adaptive query algorithm making $\widetilde{O}(d^2/\epsilon)$ queries. Our algorithm works for any field $F$. This improves upon the previous $O(d^2/\epsilon^2)$ bound (SODA'03), and bypasses an $\Omega(d^2/\epsilon^2)$ lower bound (KDD'14) which holds if the algorithm is required to read a submatrix. Ours is the first such algorithm that does not read a submatrix, and instead reads a carefully selected non-adaptive pattern of entries in rows and columns of $A$. We complement our algorithm with a matching query complexity lower bound for non-adaptive testers over any field. We also give tight bounds of $\widetilde{\Theta}(d^2)$ queries in the sensing model, for which query access comes in the form of $\langle X_i, A\rangle := \mathrm{tr}(X_i^\top A)$; perhaps surprisingly, these bounds do not depend on $\epsilon$. We next develop a novel property testing framework for testing numerical properties of a real-valued matrix $A$ more generally, which includes the stable rank, Schatten-$p$ norms, and SVD entropy. Specifically, we propose a bounded entry model, where $A$ is required to have entries bounded by $1$ in absolute value. We give upper and lower bounds for a wide range of problems in this model, and discuss connections to the sensing model above.
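
As context for what the paper improves upon, the baseline submatrix tester is a few lines: read a random submatrix and check its rank. The sample size below is an illustrative placeholder, and the paper's $\widetilde{O}(d^2/\epsilon)$ tester uses a more carefully structured non-adaptive pattern of rows and columns instead.

```python
import numpy as np

def rank_test_submatrix(A, d, eps, rng):
    """Baseline non-adaptive tester: accept iff a random submatrix has
    rank at most d. The paper's improved tester avoids reading a full
    submatrix; this is only the classical approach it bypasses."""
    n = A.shape[0]
    k = min(n, int(np.ceil(4 * d / eps)))    # illustrative sample size
    rows = rng.choice(n, size=k, replace=False)
    cols = rng.choice(n, size=k, replace=False)
    return np.linalg.matrix_rank(A[np.ix_(rows, cols)]) <= d

rng = np.random.default_rng(0)
A = rng.standard_normal((200, 3)) @ rng.standard_normal((3, 200))  # rank 3
print(rank_test_submatrix(A, d=3, eps=0.1, rng=rng))               # True
```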
