
from arxiv, 35 pages, 6 figures. In v2: details added in some proofs and remark about estimator convergence added in the Optimality section (Sect. 1.3). In v3: added details about the global effect of error in estimating ideally cleaned singular values. In v4: typos corrected and comments added

We give a new algorithm for the estimation of the cross-covariance matrix $\mathbb{E} XY'$ of two large dimensional signals $X\in\mathbb{R}^n$, $Y\in \mathbb{R}^p$ in the context where the number $T$ of observations of the pair $(X,Y)$ is itself large, but with $T/n$ and $T/p$ not supposed to be small. In the asymptotic regime where $n,p,T$ are large, with high probability, this algorithm is optimal for the Frobenius norm among rotationally invariant estimators, i.e. estimators derived from the empirical estimator by cleaning the singular values while leaving the singular vectors unchanged.

Related content


Estimation/Estimator · Optimizer · Graph · Cost matrix · Similarity measure

January 11, 2022

Entropic Optimal Transport in Random Graphs

Nicolas Keriven

In graph analysis, a classic task consists in computing similarity measures between (groups of) nodes. In latent space random graphs, nodes are associated to unknown latent variables. One may then seek to compute distances directly in the latent space, using only the graph structure. In this paper, we show that it is possible to consistently estimate entropic-regularized Optimal Transport (OT) distances between groups of nodes in the latent space. We provide a general stability result for entropic OT with respect to perturbations of the cost matrix. We then apply it to several examples of random graphs, such as graphons or $\epsilon$-graphs on manifolds. Along the way, we prove new concentration results for the so-called Universal Singular Value Thresholding estimator, and for the estimation of geodesic distances on a manifold.
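As a rough illustration of the entropic-regularized OT quantity mentioned in the abstract, the sketch below runs plain Sinkhorn iterations on a small hand-made cost matrix. Everything here is an assumption made for illustration: the function names `sinkhorn_plan` and `transport_cost`, the regularization `eps`, the marginals, and the cost matrix are hypothetical, and this is not the estimator or the stability argument of the paper. The perturbation used is the simplest possible one, a uniform shift of the cost matrix.

```python
import math

def sinkhorn_plan(cost, a, b, eps=0.5, iters=2000):
    """Entropic-OT transport plan for marginals a, b (plain Sinkhorn scaling)."""
    n, m = len(a), len(b)
    # Gibbs kernel K_ij = exp(-C_ij / eps)
    K = [[math.exp(-cost[i][j] / eps) for j in range(m)] for i in range(n)]
    u, v = [1.0] * n, [1.0] * m
    for _ in range(iters):
        # Alternately rescale rows and columns to match the marginals
        u = [a[i] / sum(K[i][j] * v[j] for j in range(m)) for i in range(n)]
        v = [b[j] / sum(K[i][j] * u[i] for i in range(n)) for j in range(m)]
    return [[u[i] * K[i][j] * v[j] for j in range(m)] for i in range(n)]

def transport_cost(plan, cost):
    """Linear cost <P, C> of a transport plan."""
    return sum(p * c for prow, crow in zip(plan, cost) for p, c in zip(prow, crow))

# Hypothetical toy data: 3 source points, 3 target points
cost = [[0.0, 1.0, 2.0], [1.0, 0.0, 1.0], [2.0, 1.0, 0.0]]
a = [0.5, 0.3, 0.2]
b = [0.2, 0.3, 0.5]

plan = sinkhorn_plan(cost, a, b)
shifted = [[c + 0.1 for c in row] for row in cost]   # uniform perturbation of the cost
plan_shifted = sinkhorn_plan(shifted, a, b)
gap = transport_cost(plan_shifted, shifted) - transport_cost(plan, cost)
```

A uniform shift only rescales the Gibbs kernel, so the plan is unchanged and the transport cost moves by exactly the size of the shift. General perturbations of the cost matrix, which are what the paper's stability result addresses, do not behave this simply.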

Matrix product · Decomposed · Scenario · CASES · Discretization

January 11, 2022

Non-Sturmian sequences of matrices providing the maximum growth rate of matrix products

Victor Kozyakin

from arxiv, 27 pages, 11 figures, 47 bibliography references, 1 Python program listing, restructured/rewritten presentation

In the theory of linear switching systems with discrete time, as in other areas of mathematics, the problem of studying the growth rate of the norms of all possible matrix products $A_{\sigma_{n}}\cdots A_{\sigma_{0}}$ with factors from a set of matrices $\mathscr{A}$ arises. So far, only for a relatively small number of classes of matrices $\mathscr{A}$ has it been possible to accurately describe the sequences of matrices that guarantee the maximum rate of increase of the corresponding norms. Moreover, in almost all cases studied theoretically, the index sequences $\{\sigma_{n}\}$ of matrices maximizing the norms of the corresponding matrix products have been shown to be periodic or so-called Sturmian, which entails a whole set of "good" properties of the sequences $\{A_{\sigma_{n}}\}$, in particular the existence of a limiting frequency of occurrence of each matrix factor $A_{i}\in\mathscr{A}$ in them. In the paper it is shown that this is not always the case: a class of matrices is defined consisting of two $2\times 2$ matrices, similar to rotations in the plane, in which the sequence $\{A_{\sigma_{n}}\}$ maximizing the growth rate of the norms $\|A_{\sigma_{n}}\cdots A_{\sigma_{0}}\|$ is not Sturmian. All considerations are based on numerical modeling and cannot be considered mathematically rigorous in this part; rather, they should be interpreted as a set of questions for further comprehensive theoretical analysis.
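The quantity studied in the abstract, the maximal growth rate of $\|A_{\sigma_{n}}\cdots A_{\sigma_{0}}\|$, can be explored numerically for short products by brute force. The sketch below is only an illustration under stated assumptions: it is not the Python listing from the paper, the helper names (`matmul2`, `spectral_norm2`, `max_growth`) are hypothetical, the example matrices are not the ones constructed in the paper, and exhaustive enumeration is feasible only for small product lengths.

```python
import math
from itertools import product

def matmul2(A, B):
    """Product of two 2x2 matrices given as nested tuples."""
    (a, b), (c, d) = A
    (e, f), (g, h) = B
    return ((a * e + b * g, a * f + b * h), (c * e + d * g, c * f + d * h))

def spectral_norm2(M):
    """Largest singular value of a 2x2 matrix, in closed form."""
    (a, b), (c, d) = M
    s = a * a + b * b + c * c + d * d          # trace of M^T M
    det = a * d - b * c
    disc = max(s * s - 4.0 * det * det, 0.0)   # guard against round-off
    return math.sqrt((s + math.sqrt(disc)) / 2.0)

def max_growth(mats, n):
    """Maximise ||A_{s_{n-1}} ... A_{s_0}||^(1/n) over all index sequences of length n."""
    best_rate, best_seq = 0.0, None
    for seq in product(range(len(mats)), repeat=n):
        P = ((1.0, 0.0), (0.0, 1.0))
        for idx in seq:
            P = matmul2(mats[idx], P)          # newest factor multiplies on the left
        rate = spectral_norm2(P) ** (1.0 / n)
        if rate > best_rate:
            best_rate, best_seq = rate, seq
    return best_rate, best_seq

# Hypothetical example pair (two shear matrices), not taken from the paper
A0 = ((1.0, 1.0), (0.0, 1.0))
A1 = ((1.0, 0.0), (1.0, 1.0))
rate, seq = max_growth([A0, A1], 6)
```

For this particular pair the finite-length maximum equals the golden ratio. Both matrices have spectral norm equal to the golden ratio, which bounds the rate from above, and an alternating index sequence attains it.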

Approximation · Functional · Linear · Origin · Scenario

January 10, 2022

Matrix and tensor rigidity and $L_p$-approximation

Yuri Malykhin

In this paper we apply methods originating in complexity theory to some problems of approximation. We notice that the construction of Alman and Williams that disproves the rigidity of Walsh-Hadamard matrices provides good $\ell_p$-approximation for $p<2$. It follows that the first $n$ functions of the Walsh system can be approximated with an error $n^{-\delta}$ by a linear space of dimension $n^{1-\delta}$: $$ d_{n^{1-\delta}}(\{w_1,\ldots,w_n\}, L_p[0,1]) \le n^{-\delta},\quad p\in[1,2),\;\delta=\delta(p)>0. $$ We do not know if this is possible for the trigonometric system. We show that the algebraic method of Alon--Frankl--Rödl for bounding the number of low-signum-rank matrices works for tensors: almost all signum-tensors have large signum-rank and can't be $\ell_1$-approximated by low-rank tensors. This implies lower bounds for $\Theta_m$, the error of $m$-term approximation of multivariate functions by sums of tensor products $u^1(x_1)\cdots u^d(x_d)$. In particular, for the set of trigonometric polynomials with spectrum in $\prod_{j=1}^d[-n_j,n_j]$ and of norm $\|t\|_\infty\le 1$ we have $$ \Theta_m(\mathcal T(n_1,\ldots,n_d)_\infty,L_1[-\pi,\pi]^d) \ge c_1(d)>0,\quad m\le c_2(d)\frac{\prod n_j}{\max\{n_j\}}. $$ Sharp bounds follow for classes of dominated mixed smoothness: $$ \Theta_m(W^{(r,r,\ldots,r)}_p,L_q[0,1]^d)\asymp m^{-\frac{rd}{d-1}},\quad 2\le p\le\infty,\; 1\le q\le 2. $$

Singular · Subspace · Approximation · Singular value · Singular value decomposition

January 9, 2022

A FEAST SVDsolver for the computation of singular value decompositions of large matrices based on the Chebyshev--Jackson series expansion

Zhongxiao Jia,Kailiang Zhang

from arxiv, 33 pages, 3 figures

The FEAST eigensolver is extended to the computation of the singular triplets of a large matrix $A$ with the singular values in a given interval. It is, in essence, subspace iteration applied to an approximate spectral projector associated with the cross-product matrix $A^TA$, and it constructs approximate left and right singular subspaces corresponding to the desired singular values, onto which $A$ is projected to obtain approximations to the desired singular triplets. Approximate spectral projectors are constructed using the Chebyshev--Jackson series expansion rather than contour integration and quadrature rules, and they are proven to be always symmetric positive semi-definite with the eigenvalues in $[0,1]$. Compact estimates are established for pointwise approximation errors of a specific step function that corresponds to the exact spectral projector, the accuracy of the approximate spectral projector, the number of desired singular triplets, the distance between the desired right singular subspace and the subspace generated at each iteration, and the convergence of the FEAST SVDsolver. Practical selection strategies are proposed for the series degree and the subspace dimension. Numerical experiments illustrate that the FEAST SVDsolver is robust and efficient.
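The scalar idea behind the approximate spectral projector can be sketched as follows: expand the indicator function of an interval $[a,b]\subset[-1,1]$ in Chebyshev polynomials and damp the coefficients with Jackson factors. The sketch assumes one common form of the Jackson damping coefficients; the paper's exact coefficients, scaling of the spectrum, and degree selection may differ, and the function names (`jackson_coeffs`, `step_coeffs`, `cheb_jackson_step`) and the interval are hypothetical.

```python
import math

def jackson_coeffs(d):
    """Jackson damping factors g_0..g_d (one common form; assumed, not from the paper)."""
    alpha = math.pi / (d + 2)
    return [((1 - j / (d + 2)) * math.sin(alpha) * math.cos(j * alpha)
             + math.cos(alpha) * math.sin(j * alpha) / (d + 2)) / math.sin(alpha)
            for j in range(d + 1)]

def step_coeffs(a, b, d):
    """Chebyshev coefficients of the indicator function of [a, b] within [-1, 1]."""
    ta, tb = math.acos(a), math.acos(b)
    coeffs = [(ta - tb) / math.pi]
    coeffs += [2.0 * (math.sin(k * ta) - math.sin(k * tb)) / (k * math.pi)
               for k in range(1, d + 1)]
    return coeffs

def cheb_jackson_step(x, a, b, d):
    """Damped degree-d Chebyshev approximation of the step function at x in [-1, 1]."""
    theta = math.acos(x)
    g = jackson_coeffs(d)
    c = step_coeffs(a, b, d)
    # T_k(x) = cos(k * arccos(x))
    return sum(g[k] * c[k] * math.cos(k * theta) for k in range(d + 1))

# Hypothetical interval and degree
a, b, degree = -0.2, 0.3, 100
grid = [i / 50 - 1 for i in range(101)]
values = [cheb_jackson_step(x, a, b, degree) for x in grid]
```

With this damping the approximation stays within $[0,1]$ on the grid, up to round-off, which is the scalar counterpart of the eigenvalue bound stated in the abstract. Applying the same polynomial to a suitably scaled $A^TA$ would give a matrix analogue, but that step is not shown here.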

Saddle point · Lipschitz continuity · Continuity · Tractable · Lipschitz

January 7, 2022

Stochastic Saddle Point Problems with Decision-Dependent Distributions

Killian Wood,Emiliano Dall'Anese

This paper focuses on stochastic saddle point problems with decision-dependent distributions in both the static and time-varying settings. These are problems whose objective is the expected value of a stochastic payoff function, where random variables are drawn from a distribution induced by a distributional map. For general distributional maps, the problem of finding saddle points is in general computationally burdensome, even if the distribution is known. To enable a tractable solution approach, we introduce the notion of equilibrium points -- which are saddle points for the stationary stochastic minimax problem that they induce -- and provide conditions for their existence and uniqueness. We demonstrate that the distance between the two classes of solutions is bounded provided that the objective has a strongly-convex-strongly-concave payoff and Lipschitz continuous distributional map. We develop deterministic and stochastic primal-dual algorithms and demonstrate their convergence to the equilibrium point. In particular, by modeling errors emerging from a stochastic gradient estimator as sub-Weibull random variables, we provide error bounds in expectation and in high probability that hold for each iteration; moreover, we show convergence to a neighborhood in expectation and almost surely. Finally, we investigate a condition on the distributional map -- which we call opposing mixture dominance -- that ensures the objective is strongly-convex-strongly-concave. Under this assumption, we show that primal-dual algorithms converge to the saddle points in a similar fashion.
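To make the notion of an equilibrium point concrete, here is a minimal deterministic sketch on a toy problem. All specifics are assumptions chosen for illustration and are not taken from the paper: the payoff $\phi(x,y,w)=\tfrac12 x^2 - wx + xy - \tfrac12 y^2$, the distributional map under which $w$ has mean $m_0+\varepsilon x$, the step size, and the function name `equilibrium_gda`. The iteration is a simple primal-dual (gradient descent-ascent) update that freezes the distribution at the current decision, using the exact mean instead of samples.

```python
def equilibrium_gda(m0=1.0, eps=0.5, step=0.1, iters=2000):
    """Primal-dual iteration for a toy decision-dependent saddle point problem.

    Hypothetical payoff: phi(x, y, w) = 0.5*x**2 - w*x + x*y - 0.5*y**2,
    with E[w] = m0 + eps*x (the assumed distributional map).
    """
    x, y = 0.0, 0.0
    for _ in range(iters):
        m = m0 + eps * x      # mean of w under the distribution induced by current x
        gx = x - m + y        # partial derivative in x with the distribution frozen
        gy = x - y            # partial derivative in y
        # Simultaneous descent in x and ascent in y
        x, y = x - step * gx, y + step * gy
    return x, y

x_eq, y_eq = equilibrium_gda()
```

For these toy values the iteration settles at $x=y=m_0/(2-\varepsilon)=2/3$, the point that is a saddle point of the problem it induces. The saddle point of the full objective, which also accounts for the dependence of the distribution on $x$, sits at $(1,1)$ in this example, so the two solution concepts differ here.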

Scalar · Mutually independent · Markov chain · Graph · binary

January 6, 2022

Scalar and Matrix Chernoff Bounds from $\ell_{\infty}$-Independence

Tali Kaufman,Rasmus Kyng,Federico Soldá

We present new scalar and matrix Chernoff-style concentration bounds for a broad class of probability distributions over the binary hypercube $\{0,1\}^n$. Motivated by recent tools developed for the study of mixing times of Markov chains on discrete distributions, we say that a distribution is $\ell_\infty$-independent when the infinity norm of its influence matrix $\mathcal{I}$ is bounded by a constant. We show that any distribution which is $\ell_\infty$-independent satisfies a matrix Chernoff bound that matches the matrix Chernoff bound for independent random variables due to Tropp. Our matrix Chernoff bound is a broad generalization and strengthening of the matrix Chernoff bound of Kyng and Song (FOCS'18). Using our bound, we can conclude as a corollary that a union of $O(\log|V|)$ random spanning trees gives a spectral graph sparsifier of a graph with $|V|$ vertices with high probability, matching results for independent edge sampling, and matching lower bounds from Kyng and Song.

Fisher information matrix · INFORMS · ReLU · Vectorization · Hidden layer

January 5, 2022

Approximate Spectral Decomposition of Fisher Information Matrix for Simple ReLU Networks

Yoshinari Takeishi,Masazumi Iida,Jun'ichi Takeuchi

We study the Fisher information matrix (FIM) of one-hidden-layer networks with the ReLU activation function. Let $W$ denote the $d \times p$ weight matrix from the $d$-dimensional input to the hidden layer consisting of $p$ neurons, and $v$ the $p$-dimensional weight vector from the hidden layer to the scalar output. We focus on the FIM of $v$, which we denote as $I$. When $p$ is large, under certain conditions, the following approximately holds. 1) There are three major clusters in the eigenvalue distribution. 2) Since $I$ is non-negative owing to the ReLU, the first eigenvalue is the Perron-Frobenius eigenvalue. 3) For the cluster of the next maximum values, the eigenspace is spanned by the row vectors of $W$. 4) The direct sum of the eigenspace of the first eigenvalue and that of the third cluster is spanned by the set of all the vectors obtained as the Hadamard product of any pair of the row vectors of $W$. We confirmed by numerical simulation that the above is approximately correct when the number of hidden nodes is about 10000.

Noise distribution · Optimizer · Noise · Category · Functional

January 4, 2022

Optimal design of the Barker proposal and other locally-balanced Metropolis-Hastings algorithms

Jure Vogrinc,Samuel Livingstone,Giacomo Zanella

from arxiv, 24 pages, 4 figures

We study the class of first-order locally-balanced Metropolis--Hastings algorithms introduced in Livingstone & Zanella (2021). To choose a specific algorithm within the class the user must select a balancing function $g:\mathbb{R} \to \mathbb{R}$ satisfying $g(t) = tg(1/t)$, and a noise distribution for the proposal increment. Popular choices within the class are the Metropolis-adjusted Langevin algorithm and the recently introduced Barker proposal. We first establish a universal limiting optimal acceptance rate of 57% and scaling of $n^{-1/3}$ as the dimension $n$ tends to infinity among all members of the class under mild smoothness assumptions on $g$ and when the target distribution for the algorithm is of the product form. In particular we obtain an explicit expression for the asymptotic efficiency of an arbitrary algorithm in the class, as measured by expected squared jumping distance. We then consider how to optimise this expression under various constraints. We derive an optimal choice of noise distribution for the Barker proposal, optimal choice of balancing function under a Gaussian noise distribution, and optimal choice of first-order locally-balanced algorithm among the entire class, which turns out to depend on the specific target distribution. Numerical simulations confirm our theoretical findings and in particular show that a bi-modal choice of noise distribution in the Barker proposal gives rise to a practical algorithm that is consistently more efficient than the original Gaussian version.
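The balancing condition $g(t) = tg(1/t)$ from the abstract is easy to check numerically for candidate functions. The sketch below tests two choices usually associated with the algorithms named above, $g(t)=t/(1+t)$ for the Barker proposal and $g(t)=\sqrt{t}$ for the Metropolis-adjusted Langevin algorithm, together with a function that fails the condition. The helper name `satisfies_balance` and the test points are hypothetical, and the check is purely numerical on a handful of points rather than a proof.

```python
import math

def barker(t):
    """Balancing function commonly associated with the Barker proposal."""
    return t / (1.0 + t)

def root(t):
    """Balancing function commonly associated with MALA."""
    return math.sqrt(t)

def square(t):
    """A function that does NOT satisfy the balancing condition."""
    return t * t

def satisfies_balance(g, ts, tol=1e-12):
    """Check g(t) == t * g(1/t) on the given positive test points."""
    return all(abs(g(t) - t * g(1.0 / t)) <= tol * max(1.0, abs(g(t))) for t in ts)

test_points = [0.1, 0.5, 1.0, 2.0, 7.5]
results = {g.__name__: satisfies_balance(g, test_points)
           for g in (barker, root, square)}
```

Here `results` maps each function name to whether the identity held on the test points. Satisfying the identity is only the entry requirement for the class; the smoothness assumptions on $g$ mentioned in the abstract are separate and are not checked.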

Correlation coefficient · Covariance matrix · Sample · Identifiable · Almost surely

January 4, 2022

Large sample correlation matrices: a comparison theorem and its applications

Johannes Heiny

from arxiv, 20 pages

In this paper, we show that the diagonal of a high-dimensional sample covariance matrix stemming from $n$ independent observations of a $p$-dimensional time series with finite fourth moments can be approximated in spectral norm by the diagonal of the population covariance matrix. We assume that $n,p\to \infty$ with $p/n$ tending to a constant which might be positive or zero. As applications, we provide an approximation of the sample correlation matrix ${\mathbf R}$ and derive a variety of results for its eigenvalues. We identify the limiting spectral distribution of ${\mathbf R}$ and construct an estimator for the population correlation matrix and its eigenvalues. Finally, the almost sure limits of the extreme eigenvalues of ${\mathbf R}$ in a generalized spiked correlation model are analyzed.
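For readers unfamiliar with the object ${\mathbf R}$, the sketch below builds a sample correlation matrix by rescaling the sample covariance matrix with its own diagonal, $\mathbf{R} = \operatorname{diag}(\mathbf{S})^{-1/2}\,\mathbf{S}\,\operatorname{diag}(\mathbf{S})^{-1/2}$. The function names and the tiny data set are hypothetical, and the example is far from the high-dimensional regime the paper analyses; it only fixes the definitions.

```python
import math

def sample_covariance(rows):
    """Sample covariance of observations (one row per observation).

    Uses 1/n normalisation; the choice between 1/n and 1/(n-1) cancels in R.
    """
    n, p = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(p)]
    return [[sum((r[i] - means[i]) * (r[j] - means[j]) for r in rows) / n
             for j in range(p)] for i in range(p)]

def sample_correlation(rows):
    """Rescale the sample covariance by the inverse square root of its diagonal."""
    S = sample_covariance(rows)
    d = [math.sqrt(S[j][j]) for j in range(len(S))]
    return [[S[i][j] / (d[i] * d[j]) for j in range(len(S))]
            for i in range(len(S))]

# Hypothetical data: the second coordinate is 2 * first + 1, the third is unrelated
rows = [(1.0, 3.0, 2.0), (2.0, 5.0, 1.0), (3.0, 7.0, 4.0), (4.0, 9.0, 3.0)]
R = sample_correlation(rows)
```

The matrix `R` has unit diagonal, and the entry linking the first two coordinates equals one because they are exactly linearly related. The abstract's result concerns replacing the sample diagonal by the population diagonal in this rescaling, which the toy example does not attempt.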

Rank · MoDELS · Optimizer · Singular value decomposition · Column

October 18, 2018

Testing Matrix Rank, Optimally

Maria-Florina Balcan,Yi Li,David P. Woodruff,Hongyang Zhang

from arxiv, 51 pages. To appear in SODA 2019

We show that for the problem of testing if a matrix $A \in F^{n \times n}$ has rank at most $d$, or requires changing an $\epsilon$-fraction of entries to have rank at most $d$, there is a non-adaptive query algorithm making $\widetilde{O}(d^2/\epsilon)$ queries. Our algorithm works for any field $F$. This improves upon the previous $O(d^2/\epsilon^2)$ bound (SODA'03), and bypasses an $\Omega(d^2/\epsilon^2)$ lower bound of (KDD'14) which holds if the algorithm is required to read a submatrix. Our algorithm is the first such algorithm which does not read a submatrix, and instead reads a carefully selected non-adaptive pattern of entries in rows and columns of $A$. We complement our algorithm with a matching query complexity lower bound for non-adaptive testers over any field. We also give tight bounds of $\widetilde{\Theta}(d^2)$ queries in the sensing model for which query access comes in the form of $\langle X_i, A\rangle:=tr(X_i^\top A)$; perhaps surprisingly these bounds do not depend on $\epsilon$. We next develop a novel property testing framework for testing numerical properties of a real-valued matrix $A$ more generally, which includes the stable rank, Schatten-$p$ norms, and SVD entropy. Specifically, we propose a bounded entry model, where $A$ is required to have entries bounded by $1$ in absolute value. We give upper and lower bounds for a wide range of problems in this model, and discuss connections to the sensing model above.