We propose an efficient algorithm for matching two correlated Erd\H{o}s--R\'enyi graphs with $n$ vertices whose edges are correlated through a latent vertex correspondence. When the edge density $q= n^{- \alpha+o(1)}$ for a constant $\alpha \in [0,1)$, we show that our algorithm runs in polynomial time and succeeds in recovering the latent matching as long as the edge correlation is non-vanishing. This is closely related to our previous work on a polynomial-time algorithm that matches two Gaussian Wigner matrices with non-vanishing correlation, and provides the first polynomial-time random graph matching algorithm (regardless of the regime of $q$) when the edge correlation is below the square root of Otter's constant (which is $\approx 0.338$).
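To make the setting concrete, the following is a minimal sketch of how a pair of correlated Erd\H{o}s--R\'enyi graphs could be sampled. The abstract does not spell out the sampling model, so the parent-graph subsampling construction, the parameter names `p` and `s`, and the helper `overlap` are all assumptions made for illustration. This is not the matching algorithm itself.

```python
import random
from itertools import combinations


def sample_correlated_er(n, p, s, seed=0):
    """Sample two correlated Erdos-Renyi graphs through a shared parent graph.

    Assumed model (not stated in the abstract): every vertex pair is a parent
    edge with probability p, and each child graph keeps a parent edge
    independently with probability s, so each child has edge density q = p * s.
    The second child is then relabelled by a hidden permutation pi.
    """
    rng = random.Random(seed)
    pi = list(range(n))
    rng.shuffle(pi)  # latent correspondence: vertex u of A is vertex pi[u] of B
    edges_a, edges_b = set(), set()
    for u, v in combinations(range(n), 2):
        if rng.random() < p:  # the pair is an edge of the parent graph
            if rng.random() < s:
                edges_a.add((u, v))
            if rng.random() < s:
                a, b = pi[u], pi[v]
                edges_b.add((min(a, b), max(a, b)))
    return edges_a, edges_b, pi


def overlap(edges_a, edges_b, sigma):
    """Count edges of A that land on edges of B under a candidate matching sigma."""
    return sum(
        (min(sigma[u], sigma[v]), max(sigma[u], sigma[v])) in edges_b
        for u, v in edges_a
    )


if __name__ == "__main__":
    edges_a, edges_b, pi = sample_correlated_er(n=200, p=0.1, s=0.8, seed=1)
    identity = list(range(200))
    # The hidden permutation typically aligns far more edges than an unrelated one.
    print(overlap(edges_a, edges_b, pi), overlap(edges_a, edges_b, identity))
```

A matching algorithm receives only the two edge sets and must recover `pi`; the overlap count is just one simple way to see that the true correspondence stands out.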
The problem of matching markets has been studied for a long time in the literature due to its wide range of applications. Finding a stable matching is a common equilibrium objective in this problem. Since market participants are usually uncertain of their preferences, a rich line of recent work studies the online setting where the participants on one side (players) learn their unknown preferences from iterative interactions with the other side (arms). Most previous works in this line are only able to derive theoretical guarantees for player-pessimal stable regret, which is defined with respect to the players' least-preferred stable matching. However, under the pessimal stable matching, players only obtain the least reward among all stable matchings. To maximize players' profits, the player-optimal stable matching would be the most desirable. Though \citet{basu21beyond} establish an upper bound for player-optimal stable regret, their result can be exponentially large if the players' preference gap is small. Whether a polynomial guarantee for this regret exists is a significant but still open problem. In this work, we provide a new algorithm named explore-then-Gale-Shapley (ETGS) and show that the optimal stable regret of each player can be upper bounded by $O(K\log T/\Delta^2)$, where $K$ is the number of arms, $T$ is the horizon and $\Delta$ is the players' minimum preference gap among the top $N+1$ ranked arms. This result significantly improves on previous works, which either have a weaker player-pessimal stable matching objective or apply only to markets with special assumptions. When the preferences of participants satisfy some special conditions, our regret upper bound also matches the previously derived lower bound.
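The Gale-Shapley step that ETGS is named after can be sketched as follows. This is only the classical player-proposing deferred-acceptance procedure run on already-known preference lists; the exploration phase and the way ETGS turns reward estimates into rankings are not described in the abstract and are not modelled here. The function name and the dictionary-based input format are assumptions for illustration.

```python
def gale_shapley(player_prefs, arm_prefs):
    """Player-proposing deferred acceptance.

    player_prefs[p] lists arms from most to least preferred;
    arm_prefs[a] lists players from most to least preferred.
    Assumes complete preference lists and no more players than arms.
    Returns a dict mapping each player to its matched arm.
    """
    # rank[a][p] is the position of player p in arm a's list (lower is better).
    rank = {a: {p: i for i, p in enumerate(prefs)} for a, prefs in arm_prefs.items()}
    next_choice = {p: 0 for p in player_prefs}  # index of the next arm to propose to
    holder = {}  # arm -> player it currently holds
    free = list(player_prefs)
    while free:
        p = free.pop()
        a = player_prefs[p][next_choice[p]]
        next_choice[p] += 1
        current = holder.get(a)
        if current is None:
            holder[a] = p  # arm was unmatched, accept
        elif rank[a][p] < rank[a][current]:
            holder[a] = p  # arm prefers the new proposer
            free.append(current)
        else:
            free.append(p)  # proposal rejected, p tries its next arm later
    return {p: a for a, p in holder.items()}


if __name__ == "__main__":
    player_prefs = {"p1": ["a1", "a2"], "p2": ["a2", "a1"]}
    arm_prefs = {"a1": ["p2", "p1"], "a2": ["p1", "p2"]}
    print(gale_shapley(player_prefs, arm_prefs))
```

In this small market both `{p1: a1, p2: a2}` and `{p1: a2, p2: a1}` are stable. Because the players propose, the procedure returns the first one, which is the player-optimal stable matching, the benchmark used by the player-optimal stable regret.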
We contribute to the sparsely populated area of unsupervised deep graph matching with application to keypoint matching in images. Contrary to the standard \emph{supervised} approach, our method does not require ground truth correspondences between keypoint pairs. Instead, it is self-supervised by enforcing consistency of matchings between images of the same object category. As the matching and the consistency loss are discrete, their derivatives cannot be straightforwardly used for learning. We address this issue in a principled way by building our method upon the recent results on black-box differentiation of combinatorial solvers. This makes our method exceptionally flexible, as it is compatible with arbitrary network architectures and combinatorial solvers. Our experimental evaluation suggests that our technique sets a new state-of-the-art for unsupervised graph matching.
This paper is focused on the approximation of the Euler equations of compressible fluid dynamics on a staggered mesh. With this aim, the flow parameters are described by the velocity, the density and the internal energy. The thermodynamic quantities are described on the elements of the mesh, and thus the approximation is only in $L^2$, while the kinematic quantities are globally continuous. The method is general in the sense that the thermodynamic and kinematic parameters are described by polynomials of arbitrary degree. In practice, the difference between the degrees of the kinematic parameters and the thermodynamic ones is set to $1$. The integration in time is done using the forward Euler method but can be extended straightforwardly to higher-order methods. In order to guarantee that the limit solution will be a weak solution of the problem, we introduce a general correction method in the spirit of the Lagrangian staggered method described in \cite{Svetlana,MR4059382, MR3023731}, and we prove a Lax--Wendroff theorem. The proof is valid for multidimensional versions of the scheme, even though most of the numerical illustrations in this work, on classical benchmark problems, are one-dimensional because we have easy access to the exact solution for comparison. We conclude by explaining that the method is general and can be used in different settings, for example, finite volume or discontinuous Galerkin methods, not just the specific one presented in this paper.
We study the operator norm discrepancy of i.i.d. random matrices, initiating the matrix-valued analog of a long line of work on the $\ell^{\infty}$ norm discrepancy of i.i.d. random vectors. First, we give a new analysis of the matrix hyperbolic cosine algorithm of Zouzias (2011), a matrix version of an online vector discrepancy algorithm of Spencer (1977) studied for average-case inputs by Bansal and Spencer (2020), for the case of i.i.d. random matrix inputs. We both give a general analysis and extract concrete bounds on the discrepancy achieved by this algorithm for matrices with independent entries and positive semidefinite matrices drawn from Wishart distributions. Second, using the first moment method, we give lower bounds on the discrepancy of random matrices, in particular showing that the matrix hyperbolic cosine algorithm achieves optimal discrepancy up to logarithmic terms in several cases. We both treat the special case of the Gaussian orthogonal ensemble and give a general result for low-rank matrix distributions that we apply to orthogonally invariant random projections.
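As a point of reference for the vector case mentioned above, the following is a minimal sketch of a greedy online signing rule driven by a hyperbolic cosine potential, in the spirit of the vector algorithm attributed to Spencer. The function name, the fixed parameter `lam` and the tie-breaking rule are assumptions for illustration. The matrix algorithm analysed in the paper would, roughly, replace the coordinate-wise sum by a trace of a matrix hyperbolic cosine, and that step is not shown here.

```python
import math


def cosh_signs(vectors, lam=0.5):
    """Greedy online signing of vectors.

    For each incoming vector v, choose the sign in {+1, -1} that minimises
    the potential sum_i cosh(lam * x_i), where x is the running signed sum.
    Ties are broken in favour of +1. Returns the signs and the final
    l-infinity norm of the signed sum (the discrepancy achieved).
    """
    d = len(vectors[0])
    x = [0.0] * d
    signs = []
    for v in vectors:
        def potential(sign):
            # Potential of the running sum if v were added with this sign.
            return sum(math.cosh(lam * (xi + sign * vi)) for xi, vi in zip(x, v))

        sign = 1 if potential(1) <= potential(-1) else -1
        signs.append(sign)
        x = [xi + sign * vi for xi, vi in zip(x, v)]
    return signs, max(abs(xi) for xi in x)


if __name__ == "__main__":
    signs, disc = cosh_signs([[1, 1], [1, 1], [1, 1], [1, 1]])
    print(signs, disc)
```

The potential grows quickly in any coordinate that drifts away from zero, so minimising it at each step keeps every coordinate of the signed sum small at once.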
We study dynamic algorithms in the model of algorithms with predictions. We assume the algorithm is given imperfect predictions regarding future updates, and we ask how such predictions can be used to improve the running time. This can be seen as a model interpolating between classic online and offline dynamic algorithms. Our results give smooth tradeoffs between these two extreme settings. First, we give algorithms for incremental and decremental transitive closure and approximate APSP that take as an additional input a predicted sequence of updates (edge insertions, or edge deletions, respectively). They preprocess it in $\tilde{O}(n^{(3+\omega)/2})$ time, and then handle updates in $\tilde{O}(1)$ worst-case time and queries in $\tilde{O}(\eta^2)$ worst-case time. Here $\eta$ is an error measure that can be bounded by the maximum difference between the predicted and actual insertion (deletion) time of an edge, i.e., by the $\ell_\infty$-error of the predictions. The second group of results concerns fully dynamic problems with vertex updates, where the algorithm has access to a predicted sequence of the next $n$ updates. We show how to solve fully dynamic triangle detection, maximum matching, single-source reachability, and more, in $O(n^{\omega-1}+n\eta_i)$ worst-case update time. Here $\eta_i$ denotes how much earlier the $i$-th update occurs than predicted. Our last result is a reduction that transforms a worst-case incremental algorithm without predictions into a fully dynamic algorithm which is given a predicted deletion time for each element at the time of its insertion. As a consequence we can, e.g., maintain fully dynamic exact APSP with such predictions in $\tilde{O}(n^2)$ worst-case vertex insertion time and $\tilde{O}(n^2 (1+\eta_i))$ worst-case vertex deletion time (for the prediction error $\eta_i$ defined as above).
This paper develops an approximation to the (effective) $p$-resistance and applies it to multi-class clustering. Spectral methods based on the graph Laplacian and its generalization to the graph $p$-Laplacian have been a backbone of non-Euclidean clustering techniques. The advantage of the $p$-Laplacian is that the parameter $p$ induces a controllable bias on cluster structure. The drawback of $p$-Laplacian eigenvector based methods is that the third and higher eigenvectors are difficult to compute. Thus, instead, we are motivated to use the $p$-resistance induced by the $p$-Laplacian for clustering. For $p$-resistance, small $p$ biases towards clusters with high internal connectivity while large $p$ biases towards clusters of small "extent," that is, a preference for smaller shortest-path distances between vertices in the cluster. However, the $p$-resistance is expensive to compute. We overcome this by developing an approximation to the $p$-resistance. We prove upper and lower bounds on this approximation and observe that it is exact when the graph is a tree. We also provide theoretical justification for the use of $p$-resistance for clustering. Finally, we provide experiments comparing our approximated $p$-resistance clustering to other $p$-Laplacian based methods.
In 1954, Alston S. Householder published Principles of Numerical Analysis, one of the first modern treatments of matrix decomposition, which favored a (block) LU decomposition, that is, the factorization of a matrix into the product of lower and upper triangular matrices. Today, matrix decomposition has become a core technology in machine learning, largely due to the development of the back propagation algorithm for fitting neural networks. The sole aim of this survey is to give a self-contained introduction to concepts and mathematical tools in numerical linear algebra and matrix analysis in order to seamlessly introduce matrix decomposition techniques and their applications in subsequent sections. However, we cannot cover all the useful and interesting results concerning matrix decomposition within the limited scope of this discussion; for example, we omit the separate analysis of Euclidean space, Hermitian space, Hilbert space, and the complex domain. We refer the reader to the literature in the field of linear algebra for a more detailed introduction to the related fields.
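As a small worked illustration of the LU decomposition mentioned above, the following sketch factors a square matrix into a unit lower triangular matrix and an upper triangular matrix using the textbook Doolittle scheme. It assumes that no pivoting is needed (every leading pivot is nonzero) and uses plain nested lists; it is not taken from the survey itself.

```python
def lu_decompose(a):
    """Doolittle LU factorization without pivoting.

    Returns (L, U) with A = L U, where L is unit lower triangular and U is
    upper triangular. Assumes a is square and every pivot U[i][i] is nonzero.
    """
    n = len(a)
    L = [[float(i == j) for j in range(n)] for i in range(n)]
    U = [[0.0] * n for _ in range(n)]
    for i in range(n):
        # Row i of U: subtract the contribution of the rows already computed.
        for j in range(i, n):
            U[i][j] = a[i][j] - sum(L[i][k] * U[k][j] for k in range(i))
        # Column i of L: divide the remaining entries by the pivot U[i][i].
        for j in range(i + 1, n):
            L[j][i] = (a[j][i] - sum(L[j][k] * U[k][i] for k in range(i))) / U[i][i]
    return L, U


if __name__ == "__main__":
    L, U = lu_decompose([[4.0, 3.0], [6.0, 3.0]])
    print(L)  # [[1.0, 0.0], [1.5, 1.0]]
    print(U)  # [[4.0, 3.0], [0.0, -1.5]]
```

Practical implementations add row pivoting for numerical stability, which turns the factorization into $PA = LU$ for a permutation matrix $P$.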
We employ a toolset -- dubbed Dr. Frankenstein -- to analyse the similarity of representations in deep neural networks. With this toolset, we aim to match the activations on given layers of two trained neural networks by joining them with a stitching layer. We demonstrate that the inner representations emerging in deep convolutional neural networks with the same architecture but different initializations can be matched with a surprisingly high degree of accuracy even with a single, affine stitching layer. We choose the stitching layer from several possible classes of linear transformations and investigate their performance and properties. The task of matching representations is closely related to notions of similarity. Using this toolset, we also provide a novel viewpoint on the current line of research regarding similarity indices of neural network representations: the perspective of the performance on a task.
Seeking equivalent entities among multi-source Knowledge Graphs (KGs), also known as \emph{entity alignment} (EA), is a pivotal step in KG integration. However, most existing EA methods are inefficient and scale poorly. A recent summary points out that some of them even require several days to deal with a dataset containing 200,000 nodes (DWY100K). We believe that overly complex graph encoders and inefficient negative sampling strategies are the two main reasons. In this paper, we propose a novel KG encoder -- Dual Attention Matching Network (Dual-AMN), which not only models both intra-graph and cross-graph information smartly, but also greatly reduces computational complexity. Furthermore, we propose the Normalized Hard Sample Mining Loss to smoothly select hard negative samples with reduced loss shift. The experimental results on widely used public datasets indicate that our method achieves both high accuracy and high efficiency. On DWY100K, the whole running process of our method can be finished in 1,100 seconds, at least 10$\times$ faster than previous work. Our method also outperforms previous works across all datasets, with Hits@1 and MRR improved by 6% to 13%.
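For readers unfamiliar with the two reported metrics, the following sketch shows how Hits@k and MRR (mean reciprocal rank) are commonly computed for an alignment task. The function name and the dictionary-based input format are assumptions for illustration, and this is not the evaluation code of Dual-AMN.

```python
def hits_at_k_and_mrr(ranked_candidates, gold, k=1):
    """Compute Hits@k and mean reciprocal rank.

    ranked_candidates[e] is the candidate list for source entity e, best first;
    gold[e] is the true counterpart of e. A gold entity that is missing from
    the candidate list contributes 0 to both metrics.
    """
    hits, reciprocal_rank_sum = 0, 0.0
    for entity, target in gold.items():
        candidates = ranked_candidates[entity]
        if target in candidates:
            rank = candidates.index(target) + 1  # ranks are 1-based
            reciprocal_rank_sum += 1.0 / rank
            hits += rank <= k
    n = len(gold)
    return hits / n, reciprocal_rank_sum / n


if __name__ == "__main__":
    gold = {"a": "x", "b": "z"}
    ranked = {"a": ["x", "y", "z"], "b": ["y", "z", "x"]}
    print(hits_at_k_and_mrr(ranked, gold, k=1))  # (0.5, 0.75)
```

Here entity `a` is aligned correctly at rank 1 and entity `b` at rank 2, so Hits@1 is 0.5 and MRR is (1 + 1/2) / 2 = 0.75.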
Learning from a few examples remains a key challenge in machine learning. Despite recent advances in important domains such as vision and language, the standard supervised deep learning paradigm does not offer a satisfactory solution for learning new concepts rapidly from little data. In this work, we employ ideas from metric learning based on deep neural features and from recent advances that augment neural networks with external memories. Our framework learns a network that maps a small labelled support set and an unlabelled example to its label, obviating the need for fine-tuning to adapt to new class types. We then define one-shot learning problems on vision (using Omniglot, ImageNet) and language tasks. Our algorithm improves one-shot accuracy on ImageNet from 87.6% to 93.2% and from 88.0% to 93.8% on Omniglot compared to competing approaches. We also demonstrate the usefulness of the same model on language modeling by introducing a one-shot task on the Penn Treebank.
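The idea of mapping a labelled support set and an unlabelled example directly to a label can be sketched as an attention-weighted vote over the support labels. The sketch below assumes that embeddings are already computed and uses cosine similarity followed by a softmax as the attention; the abstract does not specify these choices, and the learned embedding networks and external memory are not modelled. All names are hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity between two nonzero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def classify(query, support):
    """Predict a label for `query` from a small labelled support set.

    support is a list of (embedding, label) pairs. Each support example votes
    for its label with a softmax weight over cosine similarities to the query.
    Returns (predicted_label, {label: probability}).
    """
    sims = [cosine(query, emb) for emb, _ in support]
    m = max(sims)  # subtract the maximum for numerical stability
    weights = [math.exp(s - m) for s in sims]
    total = sum(weights)
    probs = {}
    for w, (_, label) in zip(weights, support):
        probs[label] = probs.get(label, 0.0) + w / total
    return max(probs, key=probs.get), probs


if __name__ == "__main__":
    support = [([1.0, 0.0], "cat"), ([0.0, 1.0], "dog")]
    print(classify([0.9, 0.1], support))
```

Because the prediction is a function of the support set itself, new classes can be handled by swapping in a new support set, with no fine-tuning step.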