
In this paper we analyze a simple spectral method (EIG1) for the matrix alignment problem, which aligns two matrices through their leading eigenvectors: given two matrices $A$ and $B$, we compute $v_1$ and $v'_1$, their respective leading eigenvectors. The algorithm returns the permutation $\hat{\pi}$ such that the rank of coordinate $\hat{\pi}(i)$ in $v_1$ equals the rank of coordinate $i$ in $v'_1$ (up to the sign of $v'_1$). We consider a model of weighted graphs in which the adjacency matrix $A$ belongs to the Gaussian Orthogonal Ensemble (GOE) of size $N \times N$, and $B$ is a noisy version of $A$ in which all nodes have been relabeled according to some planted permutation $\pi$; namely, $B= \Pi^T (A+\sigma H) \Pi$, where $\Pi$ is the permutation matrix associated with $\pi$ and $H$ is an independent copy of $A$. We show the following zero-one law: with high probability, if $\sigma N^{7/6+\epsilon} \to 0$ for some $\epsilon>0$, then EIG1 recovers all but a vanishing fraction of the underlying permutation $\pi$, whereas if $\sigma N^{7/6-\epsilon} \to \infty$, this method cannot recover more than $o(N)$ correct matches. This result gives an understanding of the simplest and fastest spectral method for matrix alignment (or complete weighted graph alignment), and involves proof methods and techniques that could be of independent interest.
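As a concrete reference, here is a minimal NumPy sketch of the EIG1 rank-matching procedure described above, run on the planted model. The function name, the sign-selection heuristic, the GOE-style normalization, and the convention chosen for $\Pi$ are our own illustrative choices, not from the paper.

```python
import numpy as np

def eig1_align(A, B):
    """Sketch of EIG1: match coordinates of the leading eigenvectors
    of A and B by rank, trying both eigenvector signs."""
    # np.linalg.eigh sorts eigenvalues ascending: leading vector is last.
    _, VA = np.linalg.eigh(A)
    _, VB = np.linalg.eigh(B)
    v1, w1 = VA[:, -1], VB[:, -1]

    order_A = np.argsort(v1)            # order_A[r] = coord of v1 with rank r
    best_pi, best_score = None, -np.inf
    for s in (+1.0, -1.0):              # eigenvector signs are arbitrary
        rank_B = np.argsort(np.argsort(s * w1))  # rank of coord i in s*w1
        pi_hat = order_A[rank_B]        # pi_hat[i] has the same rank in v1
        score = float(v1[pi_hat] @ (s * w1))
        if score > best_score:
            best_pi, best_score = pi_hat, score
    return best_pi

# Toy check on the planted model B = Pi^T (A + sigma*H) Pi.
rng = np.random.default_rng(0)
N, sigma = 200, 1e-4
A = rng.normal(size=(N, N)); A = (A + A.T) / np.sqrt(2 * N)  # GOE-like
H = rng.normal(size=(N, N)); H = (H + H.T) / np.sqrt(2 * N)
pi = rng.permutation(N)
P = np.eye(N)[:, pi]                    # with this convention, B[i,j] = M[pi[i], pi[j]]
B = P.T @ (A + sigma * H) @ P
print("fraction recovered:", np.mean(eig1_align(A, B) == pi))
```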

Related content

Estimation of the precision matrix (or inverse covariance matrix) is of great importance in statistical data analysis. However, since the number of parameters scales quadratically with the dimension $p$, computation becomes very challenging when $p$ is large. In this paper, we propose an adaptive sieving reduction algorithm to generate a solution path for the estimation of precision matrices under the $\ell_1$ penalized D-trace loss, with each subproblem solved by a second-order algorithm. In each iteration of our algorithm, we are able to greatly reduce the number of variables in the problem based on the Karush-Kuhn-Tucker (KKT) conditions and the sparse structure of the precision matrix estimated in the previous iteration. As a result, our algorithm is capable of handling datasets with very high dimensions that may go beyond the capacity of existing methods. Moreover, for the subproblem in each iteration, rather than solving the primal problem directly, we develop a semismooth Newton augmented Lagrangian algorithm with global linear convergence on the dual problem to improve efficiency. We establish theoretical properties of the proposed algorithm; in particular, we show that its convergence rate is asymptotically superlinear. The high efficiency and promising performance of our algorithm are illustrated via extensive simulation studies and real data applications, with comparison to several state-of-the-art solvers.
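For orientation, the sketch below minimizes the same $\ell_1$-penalized D-trace objective, $\frac{1}{2}\langle \Theta^2, S\rangle - \mathrm{tr}(\Theta) + \lambda \|\Theta\|_1$, but with a plain proximal-gradient baseline rather than the paper's adaptive sieving plus semismooth Newton ALM; all names and parameter choices are ours.

```python
import numpy as np

def dtrace_prox_grad(S, lam, step=None, iters=500):
    """First-order baseline for the l1-penalized D-trace loss
    0.5*<Theta^2, S> - tr(Theta) + lam*||Theta||_1 (not the paper's
    second-order method; illustrates the objective only)."""
    p = S.shape[0]
    Theta = np.eye(p)
    if step is None:
        # The gradient map Theta -> 0.5*(S@Theta + Theta@S) - I is
        # Lipschitz with constant ||S||_2 (spectral norm).
        step = 1.0 / np.linalg.norm(S, 2)
    for _ in range(iters):
        grad = 0.5 * (S @ Theta + Theta @ S) - np.eye(p)
        Z = Theta - step * grad
        # Soft-thresholding = proximal operator of the l1 norm.
        Theta = np.sign(Z) * np.maximum(np.abs(Z) - step * lam, 0.0)
        Theta = 0.5 * (Theta + Theta.T)   # keep the iterate symmetric
    return Theta

# Usage: S is a sample covariance; Theta_hat estimates its sparse inverse.
rng = np.random.default_rng(1)
X = rng.normal(size=(500, 20))
Theta_hat = dtrace_prox_grad(np.cov(X, rowvar=False), lam=0.05)
```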

Low rank matrix recovery problems, including matrix completion and matrix sensing, appear in a broad range of applications. In this work we present GNMR -- an extremely simple iterative algorithm for low rank matrix recovery, based on a Gauss-Newton linearization. On the theoretical front, we derive recovery guarantees for GNMR in both the matrix sensing and matrix completion settings. A key property of GNMR is that it implicitly keeps the factor matrices approximately balanced throughout its iterations. On the empirical front, we show that for matrix completion with uniform sampling, GNMR performs better than several popular methods, especially when given very few observations close to the information limit.
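A minimal sketch of the Gauss-Newton idea behind GNMR for matrix completion is given below: linearize $X = UV^\top$ around the current factors and solve the resulting least-squares problem on the observed entries via the minimum-norm solution (which is what tends to keep the factors balanced). This is our simplified reading of the scheme, from a random initialization, not the authors' exact variant.

```python
import numpy as np

def gnmr_step(A, mask, U, V):
    """One Gauss-Newton step for rank-r matrix completion (sketch):
    solve min || P_Omega(U_new V^T + U V_new^T - U V^T - A) ||_F^2,
    a linear least-squares problem in (U_new, V_new)."""
    n1, r = U.shape
    n2 = V.shape[0]
    obs = np.argwhere(mask)
    J = np.zeros((len(obs), (n1 + n2) * r))
    b = np.empty(len(obs))
    for k, (i, j) in enumerate(obs):
        J[k, i * r:(i + 1) * r] = V[j]                     # dX_ij / dU_new[i,:]
        J[k, n1 * r + j * r:n1 * r + (j + 1) * r] = U[i]   # dX_ij / dV_new[j,:]
        b[k] = A[i, j] + U[i] @ V[j]                       # linearized target
    sol, *_ = np.linalg.lstsq(J, b, rcond=None)            # minimum-norm solution
    return sol[:n1 * r].reshape(n1, r), sol[n1 * r:].reshape(n2, r)

# Toy run: rank-2 matrix with 50% of entries observed.
rng = np.random.default_rng(2)
n1, n2, r = 30, 25, 2
A = rng.normal(size=(n1, r)) @ rng.normal(size=(r, n2))
mask = rng.random((n1, n2)) < 0.5
U, V = rng.normal(size=(n1, r)), rng.normal(size=(n2, r))
for _ in range(15):
    U, V = gnmr_step(A, mask, U, V)
print("relative error:", np.linalg.norm(U @ V.T - A) / np.linalg.norm(A))
```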

For a graph $G$, let $\lambda_2(G)$ denote its second smallest Laplacian eigenvalue. It was conjectured that $\lambda_2(G) + \lambda_2(\overline{G}) \geq 1$, where $\overline{G}$ is the complement of $G$. We prove this conjecture in full generality. We also show that $\max\{\lambda_2(G), \lambda_2(\overline{G})\} \geq 1 - O(n^{-\frac 13})$, where $n$ is the number of vertices of $G$.
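The inequality is easy to check numerically. The following sketch (our own sanity check on random graphs, not a proof) computes $\lambda_2$ of the combinatorial Laplacian of a graph and of its complement:

```python
import numpy as np

def lambda2_laplacian(adj):
    """Second smallest eigenvalue of the combinatorial Laplacian D - A."""
    L = np.diag(adj.sum(axis=1)) - adj
    return np.sort(np.linalg.eigvalsh(L))[1]

# Empirical check of lambda_2(G) + lambda_2(complement of G) >= 1.
rng = np.random.default_rng(3)
n = 40
for _ in range(5):
    adj = np.triu(rng.random((n, n)) < 0.3, 1).astype(float)
    adj = adj + adj.T
    comp = 1.0 - adj - np.eye(n)          # complement adjacency matrix
    print(lambda2_laplacian(adj) + lambda2_laplacian(comp))  # always >= 1
```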

In 1998, Reed conjectured that every graph $G$ satisfies $\chi(G) \leq \lceil \frac{1}{2}(\Delta(G) + 1 + \omega(G))\rceil$, where $\chi(G)$ is the chromatic number of $G$, $\Delta(G)$ is the maximum degree of $G$, and $\omega(G)$ is the clique number of $G$. As evidence for his conjecture, he proved an "epsilon version" of it, i.e., that there exists some $\varepsilon > 0$ such that $\chi(G) \leq (1 - \varepsilon)(\Delta(G) + 1) + \varepsilon\omega(G)$. It is natural to ask whether Reed's conjecture, or an epsilon version of it, is true for the list-chromatic number. In this paper we consider a "local version" of the list-coloring version of Reed's conjecture. Namely, we conjecture that if $G$ is a graph with list-assignment $L$ such that for each vertex $v$ of $G$, $|L(v)| \geq \lceil \frac{1}{2}(d(v) + 1 + \omega(v))\rceil$, where $d(v)$ is the degree of $v$ and $\omega(v)$ is the size of the largest clique containing $v$, then $G$ is $L$-colorable. Our main result is that an "epsilon version" of this conjecture is true, under some mild assumptions. Using this result, we also prove a significantly improved lower bound on the density of $k$-critical graphs with clique number less than $k/2$, as follows. For every $\alpha > 0$, if $\varepsilon \leq \frac{\alpha^2}{1350}$, $G$ is $L$-critical for some $k$-list-assignment $L$ with $\omega(G) < (\frac{1}{2} - \alpha)k$, and $k$ is sufficiently large, then $G$ has average degree at least $(1 + \varepsilon)k$. This implies that for every $\alpha > 0$, there exists $\varepsilon > 0$ such that if $G$ is a graph with $\omega(G)\leq (\frac{1}{2} - \alpha)\mathrm{mad}(G)$, where $\mathrm{mad}(G)$ is the maximum average degree of $G$, then $\chi_\ell(G) \leq \left\lceil (1 - \varepsilon)(\mathrm{mad}(G) + 1) + \varepsilon \omega(G)\right\rceil$.
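As a small numerical illustration of the bound in Reed's conjecture (the list-coloring results above are harder to check by brute force), the sketch below verifies $\chi(G) \leq \lceil \frac{1}{2}(\Delta(G)+1+\omega(G))\rceil$ on tiny random graphs; no small counterexample is known. All names are ours, and the exhaustive search is exponential, so this only works for very small graphs.

```python
import itertools
import numpy as np

def chromatic_number(adj):
    """Brute-force chromatic number of a small graph (adjacency 0/1)."""
    n = len(adj)
    for k in range(1, n + 1):
        for colors in itertools.product(range(k), repeat=n):
            if all(colors[i] != colors[j]
                   for i in range(n) for j in range(i + 1, n) if adj[i][j]):
                return k

def clique_number(adj):
    """Brute-force clique number of a small graph."""
    n = len(adj)
    for k in range(n, 0, -1):
        for S in itertools.combinations(range(n), k):
            if all(adj[i][j] for i, j in itertools.combinations(S, 2)):
                return k

# Spot-check chi <= ceil((Delta + 1 + omega) / 2) on random graphs.
rng = np.random.default_rng(4)
n = 6
for _ in range(10):
    adj = np.triu(rng.random((n, n)) < 0.5, 1)
    adj = (adj | adj.T).astype(int)
    chi = chromatic_number(adj)
    Delta = int(adj.sum(axis=1).max())
    bound = -(-(Delta + 1 + clique_number(adj)) // 2)  # ceiling division
    print(chi, "<=", bound, chi <= bound)
```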

Coloring unit-disk graphs efficiently is an important problem in the global and distributed setting, with applications to radio channel assignment problems when the communication relies on omni-directional antennas of the same power. In this context it is important to bound not only the complexity of the coloring algorithms, but also the number of colors used. In this paper, we consider two natural distributed settings. In the location-aware setting (when nodes know their coordinates in the plane), we give a constant-time distributed algorithm coloring any unit-disk graph $G$ with at most $(3+\epsilon)\omega(G)+6$ colors, for any constant $\epsilon>0$, where $\omega(G)$ is the clique number of $G$. This improves upon a classical 3-approximation algorithm for this problem, for all unit-disk graphs whose chromatic number significantly exceeds their clique number. When nodes do not know their coordinates in the plane, we give a distributed algorithm in the LOCAL model that colors every unit-disk graph $G$ with at most $5.68\omega(G)$ colors in $O(2^{\sqrt{\log \log n}})$ rounds. Moreover, when $\omega(G)=O(1)$, the algorithm runs in $O(\log^* n)$ rounds. This algorithm is based on a study of the local structure of unit-disk graphs, which is of independent interest. We conjecture that every unit-disk graph $G$ has average degree at most $4\omega(G)$, which would imply the existence of an $O(\log n)$-round algorithm coloring any unit-disk graph $G$ with approximately $4\omega(G)$ colors.
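To make the location-aware setting concrete, here is a folklore grid-based sketch (far weaker than the paper's $(3+\epsilon)\omega(G)+6$ algorithm, and purely illustrative): tile the plane with squares of side $1/\sqrt{2}$, so each tile induces a clique, and reuse palettes on tiles far enough apart to never conflict, using at most $9\omega(G)$ colors.

```python
import math
from collections import defaultdict

def grid_color(points):
    """Folklore location-aware coloring sketch for unit-disk graphs.
    Tiles of side 1/sqrt(2) have diameter 1, so each tile is a clique;
    tiles with equal (i mod 3, j mod 3) are >= sqrt(2) > 1 apart and
    may safely share a palette. Uses <= 9 * omega(G) colors."""
    s = 1.0 / math.sqrt(2.0)
    tiles = defaultdict(list)
    for idx, (x, y) in enumerate(points):
        tiles[(math.floor(x / s), math.floor(y / s))].append(idx)
    colors = {}
    for (i, j), members in tiles.items():
        base = (i % 3) * 3 + (j % 3)          # one of 9 disjoint palettes
        for rank, idx in enumerate(members):  # distinct colors in the clique
            colors[idx] = (base, rank)
    return colors

# Usage: colors = grid_color([(0.1, 0.2), (0.3, 0.9), (2.4, 1.1)])
```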

We introduce Gaussian orthogonal latent factor processes for modeling and predicting large correlated data. To handle the computational challenge, we first decompose the likelihood function of the Gaussian random field with a multi-dimensional input domain into a product of densities at the orthogonal components with lower-dimensional inputs. The continuous-time Kalman filter is implemented to compute the likelihood function efficiently without making approximations. We also show that the factor processes are independent in the posterior, as a consequence of their prior independence and of the orthogonality of the factor loading matrix. For studies with large sample sizes, we propose a flexible way to model the mean, and we derive the marginal posterior distribution to resolve identifiability issues in sampling these parameters. Both simulated and real data applications confirm the outstanding performance of this method.
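The following sketch shows only the likelihood decomposition described above, under our own simplifying assumptions: with an orthogonal loading matrix $U$ and independent factor processes, the likelihood of the projected series $U^\top Y$ factorizes over factors (the noise-only complement term is omitted for brevity). The paper evaluates each term with a continuous-time Kalman filter; this direct Cholesky version is $O(n^3)$ per factor but exhibits the same structure. The `kernels` interface is assumed.

```python
import numpy as np

def orthogonal_factor_loglik(Y, U, kernels, noise_var):
    """Sum of independent GP log-likelihoods of the projected data
    Z = U^T Y (k x n data, k x d orthogonal loading U). `kernels[j]`
    is the n x n covariance matrix of factor j (assumed interface)."""
    Z = U.T @ Y                                   # d x n projected series
    total = 0.0
    for j, K in enumerate(kernels):
        C = K + noise_var * np.eye(K.shape[0])
        L = np.linalg.cholesky(C)
        alpha = np.linalg.solve(L, Z[j])
        # log N(z; 0, C) = -0.5 z^T C^{-1} z - 0.5 log|C| - (n/2) log 2pi
        total += (-0.5 * (alpha @ alpha)
                  - np.log(np.diag(L)).sum()
                  - 0.5 * len(alpha) * np.log(2 * np.pi))
    return total
```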

The area of Data Analytics on graphs promises a paradigm shift, as we approach information processing of classes of data that are typically acquired on irregular but structured domains (social networks, various ad-hoc sensor networks). Yet, despite its long history, current approaches mostly focus on optimizing the graphs themselves rather than on directly inferring learning strategies, such as detection, estimation, statistical and probabilistic inference, clustering, and separation, from signals and data acquired on graphs. To fill this void, we first revisit graph topologies from a Data Analytics point of view and establish a taxonomy of graph networks through a linear algebraic formalism of graph topology (vertices, connections, directivity). This serves as a basis for the spectral analysis of graphs, whereby the eigenvalues and eigenvectors of the graph Laplacian and adjacency matrices are shown to convey physical meaning related to both graph topology and higher-order graph properties, such as cuts, walks, paths, and neighborhoods. Next, to illustrate estimation strategies performed on graph signals, spectral analysis of graphs is introduced through the eigenanalysis of mathematical descriptors of graphs, in a generic way. Finally, a framework for vertex clustering and graph segmentation is established based on graph spectral representation (eigenanalysis), which illustrates the power of graphs in various data association tasks. The supporting examples demonstrate the promise of Graph Data Analytics in modeling structural and functional/semantic inferences. At the same time, Part I serves as a basis for Part II and Part III, which deal with theory, methods, and applications of processing data on graphs and learning graph topology from data.
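A standard concrete instance of the vertex-clustering-from-spectra idea mentioned above is the Fiedler-vector split (our minimal example, not code from the tutorial): partition vertices by the sign of the Laplacian eigenvector associated with the second smallest eigenvalue.

```python
import numpy as np

def fiedler_partition(adj):
    """Two-way vertex clustering by the sign of the Fiedler vector,
    i.e. the eigenvector of the second smallest Laplacian eigenvalue."""
    L = np.diag(adj.sum(axis=1)) - adj
    _, vecs = np.linalg.eigh(L)      # eigenvalues sorted ascending
    return vecs[:, 1] >= 0           # boolean cluster labels

# Two triangles joined by a single bridge edge split cleanly:
adj = np.array([[0, 1, 1, 0, 0, 0],
                [1, 0, 1, 0, 0, 0],
                [1, 1, 0, 1, 0, 0],
                [0, 0, 1, 0, 1, 1],
                [0, 0, 0, 1, 0, 1],
                [0, 0, 0, 1, 1, 0]], float)
print(fiedler_partition(adj))        # e.g. [False False False True True True]
```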

We show that for the problem of testing if a matrix $A \in F^{n \times n}$ has rank at most $d$, or requires changing an $\epsilon$-fraction of entries to have rank at most $d$, there is a non-adaptive query algorithm making $\widetilde{O}(d^2/\epsilon)$ queries. Our algorithm works for any field $F$. This improves upon the previous $O(d^2/\epsilon^2)$ bound (SODA'03), and bypasses an $\Omega(d^2/\epsilon^2)$ lower bound of (KDD'14) which holds if the algorithm is required to read a submatrix. Our algorithm is the first such algorithm which does not read a submatrix, and instead reads a carefully selected non-adaptive pattern of entries in rows and columns of $A$. We complement our algorithm with a matching query complexity lower bound for non-adaptive testers over any field. We also give tight bounds of $\widetilde{\Theta}(d^2)$ queries in the sensing model, in which query access comes in the form of $\langle X_i, A\rangle := \mathrm{tr}(X_i^\top A)$; perhaps surprisingly, these bounds do not depend on $\epsilon$. We next develop a novel property testing framework for testing numerical properties of a real-valued matrix $A$ more generally, which includes the stable rank, Schatten-$p$ norms, and SVD entropy. Specifically, we propose a bounded entry model, where $A$ is required to have entries bounded by $1$ in absolute value. We give upper and lower bounds for a wide range of problems in this model, and discuss connections to the sensing model above.
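For contrast, here is the naive submatrix-based tester that the discussion above improves upon (the submatrix size heuristic is ours, for illustration only; the paper's tester reads a more careful non-adaptive pattern and avoids full submatrices):

```python
import numpy as np

def naive_rank_test(A, d, eps, rng):
    """Baseline tester: read a random k x k submatrix and accept iff
    its rank is at most d. A submatrix never has larger rank than A,
    so matrices of rank <= d are always accepted (completeness)."""
    n = A.shape[0]
    k = min(n, int(np.ceil(4 * (d + 1) / eps)))   # heuristic size
    rows = rng.choice(n, size=k, replace=False)
    cols = rng.choice(n, size=k, replace=False)
    return np.linalg.matrix_rank(A[np.ix_(rows, cols)]) <= d

# Usage: naive_rank_test(np.ones((50, 50)), d=1, eps=0.1,
#                        rng=np.random.default_rng(5))  -> True
```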

We study the problem of learning a latent variable model from a stream of data. Latent variable models are popular in practice because they can explain observed data in terms of unobserved concepts. These models have traditionally been studied in the offline setting. Online EM is arguably the most popular algorithm for learning latent variable models online; although it is computationally efficient, it typically converges to a local optimum. In this work, we develop a new online learning algorithm for latent variable models, which we call SpectralLeader. SpectralLeader always converges to the global optimum, and we derive an $O(\sqrt{n})$ upper bound, up to log factors, on its $n$-step regret in the bag-of-words model. We show that SpectralLeader performs similarly to or better than online EM with tuned hyper-parameters, in both synthetic and real-world experiments.
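To convey the flavor of a spectral (method-of-moments) learner on a stream, the sketch below maintains a running average of the empirical word co-occurrence (second-moment) matrix and re-estimates the topic subspace from its top eigenvectors. This is our simplification; SpectralLeader itself works with higher-order moments and carries the regret guarantee above.

```python
import numpy as np

class OnlineSpectralEstimator:
    """Schematic online moment estimator for a bag-of-words model
    (a simplification of the spectral approach, names ours)."""
    def __init__(self, vocab_size, n_topics):
        self.M2 = np.zeros((vocab_size, vocab_size))
        self.t = 0
        self.k = n_topics

    def update(self, doc_word_ids):
        """Fold one document's co-occurrence moment into a running mean."""
        w = np.asarray(doc_word_ids)
        pairs = len(w) * (len(w) - 1)
        if pairs == 0:
            return                     # need at least two tokens
        M = np.zeros_like(self.M2)
        for a in range(len(w)):
            for b in range(len(w)):
                if a != b:
                    M[w[a], w[b]] += 1.0 / pairs
        self.t += 1
        self.M2 += (M - self.M2) / self.t

    def topic_subspace(self):
        """Top-k eigenvectors of the averaged second moment."""
        _, vecs = np.linalg.eigh(self.M2)
        return vecs[:, -self.k:]
```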

Review-based recommender systems have gained noticeable ground in recent years. In addition to rating scores, those systems are enriched with textual evaluations of items by the users. Neural language processing models, on the other hand, have already found application in recommender systems, mainly as a means of encoding user preference data, with the actual textual descriptions of items serving only as side information. In this paper, a novel approach to incorporating such models into the recommendation process is presented. First, a neural language processing model, specifically the paragraph vector model, is used to encode textual user reviews of variable length into feature vectors of fixed length. This information is then fused with the rating scores in a probabilistic matrix factorization algorithm based on maximum a posteriori estimation. The resulting system, ParVecMF, is compared to a ratings-only matrix factorization approach on a reference dataset. The preliminary results obtained on two metrics are encouraging and may stimulate further research in this area.
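The sketch below shows one plausible way to fuse review embeddings into a MAP-style matrix factorization update, in the spirit of ParVecMF; the objective, names, and hyperparameters are our own guess at such a fusion, not the paper's exact formulation.

```python
import numpy as np

def parvec_mf_step(R, mask, P, Q, D_user, lr=0.01, lam=0.1, tau=0.1):
    """One full-gradient step of a schematic MAP objective
        sum_obs (R_ui - P_u . Q_i)^2 + lam*(||P||^2 + ||Q||^2)
                                     + tau*||P - D_user||^2,
    where D_user holds fixed paragraph-vector encodings of each user's
    reviews, pulling user factors toward their review semantics."""
    E = mask * (R - P @ Q.T)                 # residuals on observed ratings
    grad_P = -2 * E @ Q + 2 * lam * P + 2 * tau * (P - D_user)
    grad_Q = -2 * E.T @ P + 2 * lam * Q
    return P - lr * grad_P, Q - lr * grad_Q
```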
