We present a $(1+\frac{k}{k+2})$-approximation algorithm for the Maximum $k$-dependent Set problem on bipartite graphs for any $k\ge1$. For a graph with $n$ vertices and $m$ edges, the algorithm runs in $O(k m \sqrt{n})$ time and improves upon the previously best-known approximation ratio of $1+\frac{k}{k+1}$ established by Kumar et al. [Theoretical Computer Science, 526: 90--96 (2014)]. Our proof also indicates that the algorithm retains its approximation ratio when applied to the (more general) class of K\"{o}nig-Egerv\'{a}ry graphs.
The modeling and simulation of dynamical systems is a necessary step for many control approaches. Using classical, parameter-based techniques to model modern systems, e.g., soft robotics or human-robot interaction, is often challenging or even infeasible due to the complexity of the system dynamics. In contrast, data-driven approaches need only a minimum of prior knowledge and scale with the complexity of the system. In particular, Gaussian process dynamical models (GPDMs) provide very promising results for the modeling of complex dynamics. However, the control properties of these GP models have been only sparsely researched, which leads to a "blackbox" treatment in modeling and control scenarios. In addition, sampling GPDMs for prediction purposes in a way that respects their non-parametric nature results in non-Markovian dynamics, making the theoretical analysis challenging. In this article, we present approximated GPDMs which are Markov and analyze their control-theoretic properties. Among other results, the approximation error is analyzed and conditions for boundedness of the trajectories are provided. The outcomes are illustrated with numerical examples that show the power of the approximated models while the computational time is significantly reduced.
Defeaturing consists in simplifying geometrical models by removing the geometrical features that are considered not relevant for a given simulation. Feature removal and simplification of computer-aided design models enable faster simulations for engineering analysis problems and simplify the meshing problem, which is otherwise often infeasible. The effects of defeaturing on the analysis are then neglected and, as of today, there are very few strategies to quantitatively evaluate such an impact. A good understanding of the effects of this process is an important step for the automatic integration of design and analysis. We formalize the process of defeaturing by understanding its effect on the solution of the Poisson equation defined on the geometrical model of interest containing a single feature, with Neumann boundary conditions on the feature itself. We derive an a posteriori estimator of the energy error between the solutions of the exact and the defeatured geometries in $\mathbb{R}^n$, $n\in\{2,3\}$, that is simple, reliable and efficient up to oscillations. The dependence of the estimator upon the size of the features is explicit.
We study the Maximum Independent Set (MIS) problem under the notion of stability introduced by Bilu and Linial (2010): a weighted instance of MIS is $\gamma$-stable if it has a unique optimal solution that remains the unique optimum under multiplicative perturbations of the weights by a factor of at most $\gamma\geq 1$. The goal then is to efficiently recover the unique optimal solution. In this work, we solve stable instances of MIS on several graph classes: we solve $\widetilde{O}(\Delta/\sqrt{\log \Delta})$-stable instances on graphs of maximum degree $\Delta$, $(k - 1)$-stable instances on $k$-colorable graphs and $(1 + \varepsilon)$-stable instances on planar graphs. For general graphs, we present a strong lower bound showing that there are no efficient algorithms for $O(n^{\frac{1}{2} - \varepsilon})$-stable instances of MIS, assuming the planted clique conjecture. We also give an algorithm for $(\varepsilon n)$-stable instances. As a by-product of our techniques, we give algorithms and lower bounds for stable instances of Node Multiway Cut. Furthermore, we prove a general result showing that the integrality gap of convex relaxations of several maximization problems reduces dramatically on stable instances. Moreover, we initiate the study of certified algorithms, a notion recently introduced by Makarychev and Makarychev (2018), which is a class of $\gamma$-approximation algorithms that satisfy one crucial property: the solution returned is optimal for a perturbation of the original instance. We obtain $\Delta$-certified algorithms for MIS on graphs of maximum degree $\Delta$, and $(1+\varepsilon)$-certified algorithms on planar graphs. Finally, we analyze the algorithm of Berman and Furer (1994) and prove that it is a $\left(\frac{\Delta + 1}{3} + \varepsilon\right)$-certified algorithm for MIS on graphs of maximum degree $\Delta$ where all weights are equal to 1.
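To make the stability notion above concrete, the following is a minimal brute-force sketch, not one of the paper's algorithms. It assumes strictly positive weights and relies on the characterization commonly used in the stability literature: the optimum $I^*$ stays the unique optimum under every $\gamma$-perturbation exactly when $w(I^* \setminus S) > \gamma \cdot w(S \setminus I^*)$ for every independent set $S \neq I^*$. The function names `independent_sets` and `is_gamma_stable` are hypothetical, and the enumeration is exponential, so it is only suitable for tiny graphs.

```python
from itertools import combinations

def independent_sets(vertices, edges):
    """Yield every independent set (as a frozenset) of a small graph."""
    edge_set = {frozenset(e) for e in edges}
    for r in range(len(vertices) + 1):
        for subset in combinations(vertices, r):
            # A subset is independent if no pair of its vertices is an edge.
            if all(frozenset(p) not in edge_set for p in combinations(subset, 2)):
                yield frozenset(subset)

def is_gamma_stable(vertices, edges, weight, gamma):
    """Brute-force stability check (assumes strictly positive weights).

    Uses the assumed characterization: the optimum I* is gamma-stable iff
    w(I* \\ S) > gamma * w(S \\ I*) for every independent set S != I*.
    """
    sets = list(independent_sets(vertices, edges))

    def total(s):
        return sum(weight[v] for v in s)

    best = max(sets, key=total)
    return all(total(best - s) > gamma * total(s - best)
               for s in sets if s != best)

# Path a - b - c with unit weights: the optimum {a, c} has weight 2 and
# its only real competitor {b} has weight 1, so stability holds for gamma < 2.
path_vertices = ["a", "b", "c"]
path_edges = [("a", "b"), ("b", "c")]
unit = {"a": 1, "b": 1, "c": 1}
print(is_gamma_stable(path_vertices, path_edges, unit, 1.5))  # True
print(is_gamma_stable(path_vertices, path_edges, unit, 2.0))  # False
```

The check compares the optimum against every other independent set, which mirrors the definition directly; the algorithms in the abstract recover the optimum without such enumeration.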
We consider the product of determinantal point processes (DPPs), a point process whose probability mass is proportional to the product of principal minors of multiple matrices, as a natural, promising generalization of DPPs. We study the computational complexity of computing its normalizing constant, which is among the most essential probabilistic inference tasks. Our complexity-theoretic results (almost) rule out the existence of efficient algorithms for this task unless the input matrices are forced to have favorable structures. In particular, we prove the following: (1) Computing $\sum_S\det({\bf A}_{S,S})^p$ exactly for every (fixed) positive even integer $p$ is UP-hard and Mod$_3$P-hard, which gives a negative answer to an open question posed by Kulesza and Taskar. (2) $\sum_S\det({\bf A}_{S,S})\det({\bf B}_{S,S})\det({\bf C}_{S,S})$ is NP-hard to approximate within a factor of $2^{O(|I|^{1-\epsilon})}$ or $2^{O(n^{1/\epsilon})}$ for any $\epsilon>0$, where $|I|$ is the input size and $n$ is the order of the input matrix. This result is stronger than the #P-hardness for the case of two matrices derived by Gillenwater. (3) There exists a $k^{O(k)}n^{O(1)}$-time algorithm for computing $\sum_S\det({\bf A}_{S,S})\det({\bf B}_{S,S})$, where $k$ is the maximum rank of $\bf A$ and $\bf B$ or the treewidth of the graph formed by nonzero entries of $\bf A$ and $\bf B$. Such parameterized algorithms are said to be fixed-parameter tractable. These results can be extended to the fixed-size case. Further, we present two applications of fixed-parameter tractable algorithms given a matrix $\bf A$ of treewidth $w$: (4) We can compute a $2^{\frac{n}{2p-1}}$-approximation to $\sum_S\det({\bf A}_{S,S})^p$ for any fractional number $p>1$ in $w^{O(wp)}n^{O(1)}$ time. (5) We can find a $2^{\sqrt n}$-approximation to unconstrained MAP inference in $w^{O(w\sqrt n)}n^{O(1)}$ time.
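As an illustration of the quantity $\sum_S\det({\bf A}_{S,S})^p$ studied above, here is a small brute-force sketch that enumerates all $2^n$ subsets. It is exponential in $n$ and is not the parameterized algorithm from the abstract. For $p=1$ it can be checked against the standard identity $\sum_S\det({\bf A}_{S,S})=\det({\bf I}+{\bf A})$. The helper names `det`, `principal_minor`, and `normalizer` are hypothetical, and the cofactor-expansion determinant is chosen only to keep the example dependency-free.

```python
from itertools import combinations

def det(m):
    """Determinant by cofactor expansion along the first row (tiny matrices only)."""
    n = len(m)
    if n == 0:
        return 1  # determinant of the empty matrix, used for S = {}
    total = 0
    for j in range(n):
        minor = [row[:j] + row[j + 1:] for row in m[1:]]
        total += (-1) ** j * m[0][j] * det(minor)
    return total

def principal_minor(a, s):
    """Determinant of the submatrix of a with rows and columns indexed by s."""
    return det([[a[i][j] for j in s] for i in s])

def normalizer(a, p=1):
    """Brute-force sum over all subsets S of det(A_{S,S})**p."""
    n = len(a)
    return sum(principal_minor(a, s) ** p
               for r in range(n + 1)
               for s in combinations(range(n), r))

A = [[2, 1],
     [1, 2]]
I_plus_A = [[3, 1],
            [1, 3]]
print(normalizer(A, p=1), det(I_plus_A))  # 8 8
print(normalizer(A, p=2))                 # 18
```

For $p=1$ the closed form makes the sum computable in polynomial time; for the powers and products covered by the hardness results above, no closed form of this kind is used here, which is why the sketch falls back to enumeration.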
Given a simple graph $G$ and an integer $k$, the goal of the $k$-Clique problem is to decide if $G$ contains a complete subgraph of size $k$. We say an algorithm approximates $k$-Clique within a factor $g(k)$ if it can find a clique of size at least $k / g(k)$ when $G$ is guaranteed to have a $k$-clique. Recently, it was shown that approximating $k$-Clique within a constant factor is W[1]-hard [Lin21]. We study the approximation of $k$-Clique under the Exponential Time Hypothesis (ETH). The reduction of [Lin21] already implies an $n^{\Omega(\sqrt[6]{\log k})}$-time lower bound under ETH. We improve this lower bound to $n^{\Omega(\log k)}$. Using the gap-amplification technique by expander graphs, we also prove that there is no $k^{o(1)}$ factor FPT-approximation algorithm for $k$-Clique under ETH. We also suggest a new way to prove the Parameterized Inapproximability Hypothesis (PIH) under ETH. We show that if there is no $n^{O(\frac{k}{\log k})}$-time algorithm to approximate $k$-Clique within a constant factor, then PIH is true.
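The approximation notion defined above can be stated as a simple verifier. The sketch below only checks whether a candidate vertex set is an acceptable output, i.e. a clique of size at least $k/g(k)$; it does not find cliques. The adjacency-dictionary representation and the names `is_clique` and `is_valid_approximation` are assumptions made for illustration.

```python
from itertools import combinations

def is_clique(adj, vertices):
    """True if every pair of the given vertices is adjacent in adj."""
    return all(v in adj[u] for u, v in combinations(vertices, 2))

def is_valid_approximation(adj, vertices, k, g):
    """True if `vertices` is a clique of size at least k / g.

    Here g stands for the value g(k) of the approximation factor at k.
    """
    chosen = set(vertices)
    return is_clique(adj, chosen) and len(chosen) >= k / g

# A 4-clique on {0, 1, 2, 3} plus a pendant vertex 4 attached to 0.
adj = {
    0: {1, 2, 3, 4},
    1: {0, 2, 3},
    2: {0, 1, 3},
    3: {0, 1, 2},
    4: {0},
}
print(is_valid_approximation(adj, [0, 1], k=4, g=2))  # True: clique of size 2 >= 4/2
print(is_valid_approximation(adj, [1, 4], k=4, g=2))  # False: 1 and 4 are not adjacent
```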
A permutation graph can be defined as an intersection graph of segments whose endpoints lie on two parallel lines $\ell_1$ and $\ell_2$, one on each. A bipartite permutation graph is a permutation graph which is bipartite. In the bipartite permutation vertex deletion problem we ask, for a given $n$-vertex graph, whether we can remove at most $k$ vertices to obtain a bipartite permutation graph. This problem is NP-complete but it does admit an FPT algorithm parameterized by $k$. In this paper we study the kernelization of this problem and show that it admits a polynomial kernel with $O(k^{99})$ vertices.
Efficient computation of node proximity queries such as transition probabilities, Personalized PageRank, and Katz is of fundamental importance in various graph mining and learning tasks. In particular, several recent works leverage fast node proximity computation to improve the scalability of Graph Neural Networks (GNN). However, prior studies on proximity computation and GNN feature propagation proceed on a case-by-case basis, with each paper focusing on a particular proximity measure. In this paper, we propose Approximate Graph Propagation (AGP), a unified randomized algorithm that computes various proximity queries and GNN feature propagation, including transition probabilities, Personalized PageRank, heat kernel PageRank, Katz, SGC, GDC, and APPNP. Our algorithm provides a theoretical bounded error guarantee and runs in almost optimal time complexity. We conduct an extensive experimental study to demonstrate AGP's effectiveness in two concrete applications: local clustering with heat kernel PageRank and node classification with GNNs. Most notably, we present an empirical study on a billion-edge graph Papers100M, the largest publicly available GNN dataset so far. The results show that AGP can significantly improve various existing GNN models' scalability without sacrificing prediction accuracy.
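For readers unfamiliar with the proximity measures listed above, the following sketch computes Personalized PageRank by plain power iteration. It is a textbook baseline and not the AGP algorithm. It iterates $\pi \leftarrow \alpha e_s + (1-\alpha)\,\pi P$, where $P$ is the random-walk transition matrix, and assumes every node has at least one out-neighbour. The function name, the adjacency-dictionary input, and the default values of `alpha` and `iters` are illustrative assumptions.

```python
def personalized_pagerank(adj, source, alpha=0.2, iters=100):
    """Power iteration for pi = alpha * e_source + (1 - alpha) * pi * P.

    adj maps each node to a non-empty list of out-neighbours (assumption:
    no dangling nodes), and alpha is the restart probability.
    """
    nodes = list(adj)
    pi = {v: 0.0 for v in nodes}
    pi[source] = 1.0
    for _ in range(iters):
        nxt = {v: 0.0 for v in nodes}
        nxt[source] += alpha  # restart mass returns to the source
        for u in nodes:
            # Each node spreads its remaining mass evenly over its neighbours.
            share = (1 - alpha) * pi[u] / len(adj[u])
            for v in adj[u]:
                nxt[v] += share
        pi = nxt
    return pi

# Two nodes joined by an edge: the fixed point is pi[0] = 1 / (2 - alpha).
scores = personalized_pagerank({0: [1], 1: [0]}, source=0, alpha=0.2)
print(round(scores[0], 4), round(scores[1], 4))  # 0.5556 0.4444
```

Each iteration touches every edge once, so the cost grows linearly with the graph size per step; this whole-graph cost is the kind of overhead that approximate propagation methods aim to avoid.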
In order to avoid the curse of dimensionality, frequently encountered in Big Data analysis, there has been vast development in the field of linear and nonlinear dimension reduction techniques in recent years. These techniques (sometimes referred to as manifold learning) assume that the scattered input data lies on a lower dimensional manifold; thus the high dimensionality problem can be overcome by learning the lower dimensionality behavior. However, in real life applications, data is often very noisy. In this work, we propose a method to approximate $\mathcal{M}$, a $d$-dimensional $C^{m+1}$ smooth submanifold of $\mathbb{R}^n$ ($d \ll n$), based upon noisy scattered data points (i.e., a data cloud). We assume that the data points are located "near" the lower dimensional manifold and suggest a non-linear moving least-squares projection on an approximating $d$-dimensional manifold. Under some mild assumptions, the resulting approximant is shown to be infinitely smooth and of high approximation order (i.e., $O(h^{m+1})$, where $h$ is the fill distance and $m$ is the degree of the local polynomial approximation). The method presented here assumes no analytic knowledge of the approximated manifold and the approximation algorithm is linear in the large dimension $n$. Furthermore, the approximating manifold can serve as a framework to perform operations directly on the high dimensional data in a computationally efficient manner. This way, the preparatory step of dimension reduction, which induces distortions to the data, can be avoided altogether.
Implicit probabilistic models are models defined naturally in terms of a sampling procedure and often induce a likelihood function that cannot be expressed explicitly. We develop a simple method for estimating parameters in implicit models that does not require knowledge of the form of the likelihood function or any derived quantities, but can be shown to be equivalent to maximizing likelihood under some conditions. Our result holds in the non-asymptotic parametric setting, where both the capacity of the model and the number of data examples are finite. We also demonstrate encouraging experimental results.
Many resource allocation problems in the cloud can be described as a basic Virtual Network Embedding Problem (VNEP): finding mappings of request graphs (describing the workloads) onto a substrate graph (describing the physical infrastructure). In the offline setting, the two natural objectives are profit maximization, i.e., embedding a maximal number of request graphs subject to the resource constraints, and cost minimization, i.e., embedding all requests at minimal overall cost. The VNEP can be seen as a generalization of classic routing and call admission problems, in which requests are arbitrary graphs whose communication endpoints are not fixed. Due to its applications, the problem has been studied intensively in the networking community. However, the underlying algorithmic problem is poorly understood. This paper presents the first fixed-parameter tractable approximation algorithms for the VNEP. Our algorithms are based on randomized rounding. Due to the flexible mapping options and the arbitrary request graph topologies, we show that a novel linear program formulation is required. Only this novel formulation enables the computation of convex combinations of valid mappings, as it accounts for the structure of the request graphs. Accordingly, to capture the structure of request graphs, we introduce the graph-theoretic notion of extraction orders and extraction width and show that our algorithms have exponential runtime in the request graphs' maximal width. Hence, for request graphs of fixed extraction width, we obtain the first polynomial-time approximations. Studying the new notion of extraction orders, we show that (i) computing extraction orders of minimal width is NP-hard and (ii) computing decomposable LP solutions is in general NP-hard, even when restricting request graphs to planar ones.