In this note, we provide a refined analysis of Mitra's algorithm \cite{mitra2008clustering} for classifying general discrete mixture distribution models. Built upon spectral clustering \cite{mcsherry2001spectral}, this algorithm offers compelling conditions for probability distributions. We enhance this analysis by tailoring the model to bipartite stochastic block models, resulting in more refined conditions. Compared to those derived in \cite{mitra2008clustering}, our improved separation conditions are obtained.
Stochastic approximation is a class of algorithms that update a vector iteratively, incrementally, and stochastically, including, e.g., stochastic gradient descent and temporal difference learning. One fundamental challenge in analyzing a stochastic approximation algorithm is to establish its stability, i.e., to show that the stochastic vector iterates are bounded almost surely. In this paper, we extend the celebrated Borkar-Meyn theorem for stability from the Martingale difference noise setting to the Markovian noise setting, which greatly improves its applicability in reinforcement learning, especially in those off-policy reinforcement learning algorithms with linear function approximation and eligibility traces. Central to our analysis is the diminishing asymptotic rate of change of a few functions, which is implied by both a form of strong law of large numbers and a commonly used V4 Lyapunov drift condition and trivially holds if the Markov chain is finite and irreducible.
In a seminal work, Chiba and Nishizeki [SIAM J. Comput. `85] developed subgraph listing algorithms for triangles, 4-cycle and $k$-cliques, where $k \geq 3.$ The runtimes of their algorithms are parameterized by the number of edges $m$ and the arboricity $\alpha$ of a graph. The arboricity $\alpha$ of a graph is the minimum number of spanning forests required to cover it. Their work introduces: * A triangle listing algorithm that runs in $O(m\alpha)$ time. * An output-sensitive 4-Cycle-Listing algorithm that lists all 4-cycles in $O(m\alpha + t)$ time, where $t$ is the number of 4-cycles in the graph. * A k-Clique-Listing algorithm that runs in $O(m\alpha^{k-2})$ time, for $k \geq 4.$ Despite the widespread use of these algorithms in practice, no improvements have been made over them in the past few decades. Therefore, recent work has gone into studying lower bounds for subgraph listing problems. The works of Kopelowitz, Pettie and Porat [SODA `16] and Vassilevska W. and Xu [FOCS `20] showed that the triangle-listing algorithm of Chiba and Nishizeki is optimal under the $\mathsf{3SUM}$ and $\mathsf{APSP}$ hypotheses respectively. However, it remained open whether the remaining algorithms were optimal. In this note, we show that in fact all the above algorithms are optimal under popular hardness conjectures. First, we show that the $\mathsf{4}\text{-}\mathsf{Cycle}\text{-}\mathsf{Listing}$ algorithm is tight under the $\mathsf{3SUM}$ hypothesis following the techniques of Jin and Xu [STOC `23], and Abboud, Bringmann and Fishcher [STOC `23] . Additionally, we show that the $k\text{-}\mathsf{Clique}\text{-}\mathsf{Listing}$ algorithm is essentially tight under the exact $k$-clique hypothesis by following the techniques of Dalirooyfard, Mathialagan, Vassilevska W. and Xu [STOC `24]. These hardness results hold even when the number of 4-cycles or $k$-cliques in the graph is small.
Hashing functions, which are created to provide brief and erratic digests for the message entered, are the primary cryptographic primitives used in blockchain networks. Hashing is employed in blockchain networks to create linked block lists, which offer safe and secure distributed repository storage for critical information. Due to the unique nature of the hash search problem in blockchain networks, the most parallelization of calculations is possible. This technical report presents a performance evaluation of three popular hashing algorithms Blake3, SHA-256, and SHA-512. These hashing algorithms are widely used in various applications, such as digital signatures, message authentication, and password storage. It then discusses the performance metrics used to evaluate the algorithms, such as hash rate/throughput and memory usage. The evaluation is conducted on a range of hardware platforms, including desktop and VMs. The evaluation includes synthetic benchmarks. The results of the evaluation show that Blake3 generally outperforms both SHA-256 and SHA-512 in terms of throughput and latency. However, the performance advantage of Blake3 varies depending on the specific hardware platform and the size of the input data. The report concludes with recommendations for selecting the most suitable hashing algorithm for a given application, based on its performance requirements and security needs. The evaluation results can also inform future research and development efforts to improve the performance and security of hashing algorithms.
Since the work of Polyanskiy, Poor and Verd\'u on the finite blocklength performance of capacity-achieving codes for discrete memoryless channels, many papers have attempted to find further results for more practically relevant channels. However, it seems that the complexity of computing capacity-achieving codes has not been investigated until now. We study this question for the simplest non-trivial Gaussian channels, i.e., the additive colored Gaussian noise channel. To assess the computational complexity, we consider the classes $\mathrm{FP}_1$ and $\#\mathrm{P}_1$. $\mathrm{FP}_1$ includes functions computable by a deterministic Turing machine in polynomial time, whereas $\#\mathrm{P}_1$ encompasses functions that count the number of solutions verifiable in polynomial time. It is widely assumed that $\mathrm{FP}_1\neq\#\mathrm{P}_1$. It is of interest to determine the conditions under which, for a given $M \in \mathbb{N}$, where $M$ describes the precision of the deviation of $C(P,N)$, for a certain blocklength $n_M$ and a decoding error $\epsilon > 0$ with $\epsilon\in\mathbb{Q}$, the following holds: $R_{n_M}(\epsilon)>C(P,N)-\frac{1}{2^M}$. It is shown that there is a polynomial-time computable $N_*$ such that for sufficiently large $P_*\in\mathbb{Q}$, the sequences $\{R_{n_M}(\epsilon)\}_{{n_M}\in\mathbb{N}}$, where each $R_{n_M}(\epsilon)$ satisfies the previous condition, cannot be computed in polynomial time if $\mathrm{FP}_1\neq\#\mathrm{P}_1$. Hence, the complexity of computing the sequence $\{R_{n_M}(\epsilon)\}_{n_M\in\mathbb{N}}$ grows faster than any polynomial as $M$ increases. Consequently, it is shown that either the sequence of achievable rates $\{R_{n_M}(\epsilon)\}_{n_M\in\mathbb{N}}$ as a function of the blocklength, or the sequence of blocklengths $\{n_M\}_{M\in\mathbb{N}}$ corresponding to the achievable rates, is not a polynomial-time computable sequence.
Shared dynamics models are important for capturing the complexity and variability inherent in Human-Robot Interaction (HRI). Therefore, learning such shared dynamics models can enhance coordination and adaptability to enable successful reactive interactions with a human partner. In this work, we propose a novel approach for learning a shared latent space representation for HRIs from demonstrations in a Mixture of Experts fashion for reactively generating robot actions from human observations. We train a Variational Autoencoder (VAE) to learn robot motions regularized using an informative latent space prior that captures the multimodality of the human observations via a Mixture Density Network (MDN). We show how our formulation derives from a Gaussian Mixture Regression formulation that is typically used approaches for learning HRI from demonstrations such as using an HMM/GMM for learning a joint distribution over the actions of the human and the robot. We further incorporate an additional regularization to prevent "mode collapse", a common phenomenon when using latent space mixture models with VAEs. We find that our approach of using an informative MDN prior from human observations for a VAE generates more accurate robot motions compared to previous HMM-based or recurrent approaches of learning shared latent representations, which we validate on various HRI datasets involving interactions such as handshakes, fistbumps, waving, and handovers. Further experiments in a real-world human-to-robot handover scenario show the efficacy of our approach for generating successful interactions with four different human interaction partners.
In differentially private (DP) machine learning, the privacy guarantees of DP mechanisms are often reported and compared on the basis of a single $(\varepsilon, \delta)$-pair. This practice overlooks that DP guarantees can vary substantially even between mechanisms sharing a given $(\varepsilon, \delta)$, and potentially introduces privacy vulnerabilities which can remain undetected. This motivates the need for robust, rigorous methods for comparing DP guarantees in such cases. Here, we introduce the $\Delta$-divergence between mechanisms which quantifies the worst-case excess privacy vulnerability of choosing one mechanism over another in terms of $(\varepsilon, \delta)$, $f$-DP and in terms of a newly presented Bayesian interpretation. Moreover, as a generalisation of the Blackwell theorem, it is endowed with strong decision-theoretic foundations. Through application examples, we show that our techniques can facilitate informed decision-making and reveal gaps in the current understanding of privacy risks, as current practices in DP-SGD often result in choosing mechanisms with high excess privacy vulnerabilities.
The Paterson--Stockmeyer method is an evaluation scheme for matrix polynomials with scalar coefficients that arise in many state-of-the-art algorithms based on polynomial or rational approximation, for example, those for computing transcendental matrix functions. We derive a mixed-precision version of the Paterson--Stockmeyer method that is particularly useful for evaluating matrix polynomials with scalar coefficients of decaying magnitude. The new method is mainly of interest in the arbitrary precision arithmetic, and it is particularly attractive for high-precision computations. The key idea is to perform computations on data of small magnitude in low precision, and rounding error analysis is provided for the use of lower-than-working precisions. We focus on the evaluation of the Taylor approximants of the matrix exponential and show the applicability of our method to the existing scaling and squaring algorithms. We also demonstrate through experiments the general applicability of our method to other problems, such as computing the polynomials from the Pad\'e approximant of the matrix exponential and the Taylor approximant of the matrix cosine. Numerical experiments show our mixed-precision Paterson--Stockmeyer algorithms can be more efficient than its fixed-precision counterpart while delivering the same level of accuracy.
Knowledge graph reasoning (KGR), aiming to deduce new facts from existing facts based on mined logic rules underlying knowledge graphs (KGs), has become a fast-growing research direction. It has been proven to significantly benefit the usage of KGs in many AI applications, such as question answering and recommendation systems, etc. According to the graph types, the existing KGR models can be roughly divided into three categories, \textit{i.e.,} static models, temporal models, and multi-modal models. The early works in this domain mainly focus on static KGR and tend to directly apply general knowledge graph embedding models to the reasoning task. However, these models are not suitable for more complex but practical tasks, such as inductive static KGR, temporal KGR, and multi-modal KGR. To this end, multiple works have been developed recently, but no survey papers and open-source repositories comprehensively summarize and discuss models in this important direction. To fill the gap, we conduct a survey for knowledge graph reasoning tracing from static to temporal and then to multi-modal KGs. Concretely, the preliminaries, summaries of KGR models, and typical datasets are introduced and discussed consequently. Moreover, we discuss the challenges and potential opportunities. The corresponding open-source repository is shared on GitHub: //github.com/LIANGKE23/Awesome-Knowledge-Graph-Reasoning.
The existence of representative datasets is a prerequisite of many successful artificial intelligence and machine learning models. However, the subsequent application of these models often involves scenarios that are inadequately represented in the data used for training. The reasons for this are manifold and range from time and cost constraints to ethical considerations. As a consequence, the reliable use of these models, especially in safety-critical applications, is a huge challenge. Leveraging additional, already existing sources of knowledge is key to overcome the limitations of purely data-driven approaches, and eventually to increase the generalization capability of these models. Furthermore, predictions that conform with knowledge are crucial for making trustworthy and safe decisions even in underrepresented scenarios. This work provides an overview of existing techniques and methods in the literature that combine data-based models with existing knowledge. The identified approaches are structured according to the categories integration, extraction and conformity. Special attention is given to applications in the field of autonomous driving.
In multi-turn dialog, utterances do not always take the full form of sentences \cite{Carbonell1983DiscoursePA}, which naturally makes understanding the dialog context more difficult. However, it is essential to fully grasp the dialog context to generate a reasonable response. Hence, in this paper, we propose to improve the response generation performance by examining the model's ability to answer a reading comprehension question, where the question is focused on the omitted information in the dialog. Enlightened by the multi-task learning scheme, we propose a joint framework that unifies these two tasks, sharing the same encoder to extract the common and task-invariant features with different decoders to learn task-specific features. To better fusing information from the question and the dialog history in the encoding part, we propose to augment the Transformer architecture with a memory updater, which is designed to selectively store and update the history dialog information so as to support downstream tasks. For the experiment, we employ human annotators to write and examine a large-scale dialog reading comprehension dataset. Extensive experiments are conducted on this dataset, and the results show that the proposed model brings substantial improvements over several strong baselines on both tasks. In this way, we demonstrate that reasoning can indeed help better response generation and vice versa. We release our large-scale dataset for further research.