Persistent homology is a methodology central to topological data analysis that extracts and summarizes the topological features within a dataset as a persistence diagram; it has recently gained much popularity from its myriad successful applications to many domains. However, its algebraic construction induces a metric space of persistence diagrams with a highly complex geometry. In this paper, we prove convergence of the $k$-means clustering algorithm on persistence diagram space and establish theoretical properties of the solution to the optimization problem in the Karush--Kuhn--Tucker framework. Additionally, we perform numerical experiments on various representations of persistent homology, including embeddings of persistence diagrams as well as diagrams themselves and their generalizations as persistence measures; we find that clustering performance directly on persistence diagrams and measures outperform their vectorized representations.
The coalgebraic $\mu$-calculus provides a generic semantic framework for fixpoint logics over systems whose branching type goes beyond the standard relational setup, e.g. probabilistic, weighted, or game-based. Previous work on the coalgebraic $\mu$-calculus includes an exponential-time upper bound on satisfiability checking, which however relies on the availability of tableau rules for the next-step modalities that are sufficiently well-behaved in a formally defined sense; in particular, rule matches need to be representable by polynomial-sized codes, and the sequent duals of the rules need to absorb cut. While such rule sets have been identified for some important cases, they are not known to exist in all cases of interest, in particular ones involving either integer weights as in the graded $\mu$-calculus, or real-valued weights in combination with non-linear arithmetic. In the present work, we prove the same upper complexity bound under more general assumptions, specifically regarding the complexity of the (much simpler) satisfiability problem for the underlying one-step logic, roughly described as the nesting-free next-step fragment of the logic. The bound is realized by a generic global caching algorithm that supports on-the-fly satisfiability checking. Notably, our approach directly accommodates unguarded formulae, and thus avoids use of the guardedness transformation. Example applications include new exponential-time upper bounds for satisfiability checking in an extension of the graded $\mu$-calculus with polynomial inequalities (including positive Presburger arithmetic), as well as an extension of the (two-valued) probabilistic $\mu$-calculus with polynomial inequalities.
We present C$\cdot$ASE, an efficient and effective framework that learns conditional Adversarial Skill Embeddings for physics-based characters. Our physically simulated character can learn a diverse repertoire of skills while providing controllability in the form of direct manipulation of the skills to be performed. C$\cdot$ASE divides the heterogeneous skill motions into distinct subsets containing homogeneous samples for training a low-level conditional model to learn conditional behavior distribution. The skill-conditioned imitation learning naturally offers explicit control over the character's skills after training. The training course incorporates the focal skill sampling, skeletal residual forces, and element-wise feature masking to balance diverse skills of varying complexities, mitigate dynamics mismatch to master agile motions and capture more general behavior characteristics, respectively. Once trained, the conditional model can produce highly diverse and realistic skills, outperforming state-of-the-art models, and can be repurposed in various downstream tasks. In particular, the explicit skill control handle allows a high-level policy or user to direct the character with desired skill specifications, which we demonstrate is advantageous for interactive character animation.
Consider the problem of estimating a random variable $X$ from noisy observations $Y = X+ Z$, where $Z$ is standard normal, under the $L^1$ fidelity criterion. It is well known that the optimal Bayesian estimator in this setting is the conditional median. This work shows that the only prior distribution on $X$ that induces linearity in the conditional median is Gaussian. Along the way, several other results are presented. In particular, it is demonstrated that if the conditional distribution $P_{X|Y=y}$ is symmetric for all $y$, then $X$ must follow a Gaussian distribution. Additionally, we consider other $L^p$ losses and observe the following phenomenon: for $p \in [1,2]$, Gaussian is the only prior distribution that induces a linear optimal Bayesian estimator, and for $p \in (2,\infty)$, infinitely many prior distributions on $X$ can induce linearity. Finally, extensions are provided to encompass noise models leading to conditional distributions from certain exponential families.
We study the multi-task learning problem that aims to simultaneously analyze multiple datasets collected from different sources and learn one model for each of them. We propose a family of adaptive methods that automatically utilize possible similarities among those tasks while carefully handling their differences. We derive sharp statistical guarantees for the methods and prove their robustness against outlier tasks. Numerical experiments on synthetic and real datasets demonstrate the efficacy of our new methods.
Matrix product codes are generalizations of some well-known constructions of codes, such as Reed-Muller codes, $[u+v,u-v]$-construction, etc. Recently, a bound for the symbol-pair distance of a matrix product code was given in \cite{LEL}, and new families of MDS symbol-pair codes were constructed by using this bound. In this paper, we generalize this bound to the $b$-symbol distance of a matrix product code and determine all minimum $b$-symbol distances of Reed-Muller codes. We also give a bound for the minimum $b$-symbol distance of codes obtained from the $[u+v,u-v]$-construction, and use this bound to construct some $[2n,2n-2]_q$-linear $b$-symbol almost MDS codes with arbitrary length. All the minimum $b$-symbol distances of $[n,n-1]_q$-linear codes and $[n,n-2]_q$-linear codes for $1\leq b\leq n$ are determined. Some examples are presented to illustrate these results.
Secure multi-party computation using a physical deck of cards, often called card-based cryptography, has been extensively studied during the past decade. Card-based protocols to compute various Boolean functions have been developed. As each input bit is typically encoded by two cards, computing an $n$-variable Boolean function requires at least $2n$ cards. We are interested in optimal protocols that use exactly $2n$ cards. In particular, we focus on symmetric functions. In this paper, we formulate the problem of developing $2n$-card protocols to compute $n$-variable symmetric Boolean functions by classifying all such functions into several NPN-equivalence classes. We then summarize existing protocols that can compute some representative functions from these classes, and also solve some open problems in the cases $n=4$, 5, 6, and 7. In particular, we develop a protocol to compute a function $k$Mod3, which determines whether the sum of all inputs is congruent to $k$ modulo 3 ($k \in \{0,1,2\}$).
Most existing salient object detection methods mostly use U-Net or feature pyramid structure, which simply aggregates feature maps of different scales, ignoring the uniqueness and interdependence of them and their respective contributions to the final prediction. To overcome these, we propose the M$^3$Net, i.e., the Multilevel, Mixed and Multistage attention network for Salient Object Detection (SOD). Firstly, we propose Multiscale Interaction Block which innovatively introduces the cross-attention approach to achieve the interaction between multilevel features, allowing high-level features to guide low-level feature learning and thus enhancing salient regions. Secondly, considering the fact that previous Transformer based SOD methods locate salient regions only using global self-attention while inevitably overlooking the details of complex objects, we propose the Mixed Attention Block. This block combines global self-attention and window self-attention, aiming at modeling context at both global and local levels to further improve the accuracy of the prediction map. Finally, we proposed a multilevel supervision strategy to optimize the aggregated feature stage-by-stage. Experiments on six challenging datasets demonstrate that the proposed M$^3$Net surpasses recent CNN and Transformer-based SOD arts in terms of four metrics. Codes are available at //github.com/I2-Multimedia-Lab/M3Net.
Graph Neural Networks (GNNs) have received considerable attention on graph-structured data learning for a wide variety of tasks. The well-designed propagation mechanism which has been demonstrated effective is the most fundamental part of GNNs. Although most of GNNs basically follow a message passing manner, litter effort has been made to discover and analyze their essential relations. In this paper, we establish a surprising connection between different propagation mechanisms with a unified optimization problem, showing that despite the proliferation of various GNNs, in fact, their proposed propagation mechanisms are the optimal solution optimizing a feature fitting function over a wide class of graph kernels with a graph regularization term. Our proposed unified optimization framework, summarizing the commonalities between several of the most representative GNNs, not only provides a macroscopic view on surveying the relations between different GNNs, but also further opens up new opportunities for flexibly designing new GNNs. With the proposed framework, we discover that existing works usually utilize naive graph convolutional kernels for feature fitting function, and we further develop two novel objective functions considering adjustable graph kernels showing low-pass or high-pass filtering capabilities respectively. Moreover, we provide the convergence proofs and expressive power comparisons for the proposed models. Extensive experiments on benchmark datasets clearly show that the proposed GNNs not only outperform the state-of-the-art methods but also have good ability to alleviate over-smoothing, and further verify the feasibility for designing GNNs with our unified optimization framework.
Embedding entities and relations into a continuous multi-dimensional vector space have become the dominant method for knowledge graph embedding in representation learning. However, most existing models ignore to represent hierarchical knowledge, such as the similarities and dissimilarities of entities in one domain. We proposed to learn a Domain Representations over existing knowledge graph embedding models, such that entities that have similar attributes are organized into the same domain. Such hierarchical knowledge of domains can give further evidence in link prediction. Experimental results show that domain embeddings give a significant improvement over the most recent state-of-art baseline knowledge graph embedding models.
Network representation learning in low dimensional vector space has attracted considerable attention in both academic and industrial domains. Most real-world networks are dynamic with addition/deletion of nodes and edges. The existing graph embedding methods are designed for static networks and they cannot capture evolving patterns in a large dynamic network. In this paper, we propose a dynamic embedding method, dynnode2vec, based on the well-known graph embedding method node2vec. Node2vec is a random walk based embedding method for static networks. Applying static network embedding in dynamic settings has two crucial problems: 1) Generating random walks for every time step is time consuming 2) Embedding vector spaces in each timestamp are different. In order to tackle these challenges, dynnode2vec uses evolving random walks and initializes the current graph embedding with previous embedding vectors. We demonstrate the advantages of the proposed dynamic network embedding by conducting empirical evaluations on several large dynamic network datasets.