A (unit) disk graph is the intersection graph of closed (unit) disks in the plane. Almost three decades ago, an elegant polynomial-time algorithm was found for \textsc{Maximum Clique} on unit disk graphs [Clark, Colbourn, Johnson; Discrete Mathematics '90]. Since then, it has been an intriguing open question whether or not tractability can be extended to general disk graphs. We show that the disjoint union of two odd cycles is never the complement of a disk graph nor of a unit (3-dimensional) ball graph. From that fact and existing results, we derive a simple QPTAS and a subexponential algorithm running in time $2^{\tilde{O}(n^{2/3})}$ for \textsc{Maximum Clique} on disk and unit ball graphs. We then obtain a randomized EPTAS for computing the independence number on graphs having no disjoint union of two odd cycles as an induced subgraph, bounded VC-dimension, and linear independence number. This, in combination with our structural results, yields a randomized EPTAS for \textsc{Max Clique} on disk and unit ball graphs. \textsc{Max Clique} on unit ball graphs is equivalent to finding, given a collection of points in $\mathbb R^3$, a maximum subset of points with diameter at most some fixed value. In stark contrast, \textsc{Maximum Clique} on ball graphs and unit $4$-dimensional ball graphs, as well as intersection graphs of filled ellipses (even close to unit disks) or filled triangles is unlikely to have such algorithms. Indeed, we show that, for all those problems, there is a constant ratio of approximation which cannot be attained even in time $2^{n^{1-\varepsilon}}$, unless the Exponential Time Hypothesis fails.
We propose a simple and efficient clustering method for high-dimensional data with a large number of clusters. Our algorithm achieves high-performance by evaluating distances of datapoints with a subset of the cluster centres. Our contribution is substantially more efficient than k-means as it does not require an all to all comparison of data points and clusters. We show that the optimal solutions of our approximation are the same as in the exact solution. However, our approach is considerably more efficient at extracting these clusters compared to the state-of-the-art. We compare our approximation with the exact k-means and alternative approximation approaches on a series of standardised clustering tasks. For the evaluation, we consider the algorithmic complexity, including number of operations to convergence, and the stability of the results.
In 1943, Hadwiger conjectured that every graph with no $K_t$ minor is $(t-1)$-colorable for every $t\ge 1$. In the 1980s, Kostochka and Thomason independently proved that every graph with no $K_t$ minor has average degree $O(t\sqrt{\log t})$ and hence is $O(t\sqrt{\log t})$-colorable. Recently, Norin, Song and the second author showed that every graph with no $K_t$ minor is $O(t(\log t)^{\beta})$-colorable for every $\beta > 1/4$, making the first improvement on the order of magnitude of the $O(t\sqrt{\log t})$ bound. The first main result of this paper is that every graph with no $K_t$ minor is $O(t\log\log t)$-colorable. This is a corollary of our main technical result that the chromatic number of a $K_t$-minor-free graph is bounded by $O(t(1+f(G,t)))$ where $f(G,t)$ is the maximum of $\frac{\chi(H)}{a}$ over all $a\ge \frac{t}{\sqrt{\log t}}$ and $K_a$-minor-free subgraphs $H$ of $G$ that are small (i.e. $O(a\log^4 a)$ vertices). This has a number of interesting corollaries. First as mentioned, using the current best-known bounds on coloring small $K_t$-minor-free graphs, we show that $K_t$-minor-free graphs are $O(t\log\log t)$-colorable. Second, it shows that proving Linear Hadwiger's Conjecture (that $K_t$-minor-free graphs are $O(t)$-colorable) reduces to proving it for small graphs. Third, we prove that $K_t$-minor-free graphs with clique number at most $\sqrt{\log t}/ (\log \log t)^2$ are $O(t)$-colorable. This implies our final corollary that Linear Hadwiger's Conjecture holds for $K_r$-free graphs for every fixed $r$. One key to proving the main theorem is a new standalone result that every $K_t$-minor-free graph of average degree $d=\Omega(t)$ has a subgraph on $O(t \log^3 t)$ vertices with average degree $\Omega(d)$.
Huang proved that every set of more than half the vertices of the $d$-dimensional hypercube $Q_d$ induces a subgraph of maximum degree at least $\sqrt{d}$, which is tight by a result of Chung, F\"uredi, Graham, and Seymour. Huang asked whether similar results can be obtained for other highly symmetric graphs. First, we present three infinite families of Cayley graphs of unbounded degree that contain induced subgraphs of maximum degree $1$ on more than half the vertices. In particular, this refutes a conjecture of Potechin and Tsang, for which first counterexamples were shown recently by Lehner and Verret. The first family consists of dihedrants and contains a sporadic counterexample encountered earlier by Lehner and Verret. The second family are star graphs, these are edge-transitive Cayley graphs of the symmetric group. All members of the third family are $d$-regular containing an induced matching on a $\frac{d}{2d-1}$-fraction of the vertices. This is largest possible and answers a question of Lehner and Verret. Second, we consider Huang's lower bound for graphs with subcubes and show that the corresponding lower bound is tight for products of Coxeter groups of type $\mathbf{A_n}$, $\mathbf{I_2}(2k+1)$, and most exceptional cases. We believe that Coxeter groups are a suitable generalization of the hypercube with respect to Huang's question. Finally, we show that induced subgraphs on more than half the vertices of Levi graphs of projective planes and of the Ramanujan graphs of Lubotzky, Phillips, and Sarnak have unbounded degree. This gives classes of Cayley graphs with properties similar to the ones provided by Huang's results. However, in contrast to Coxeter groups these graphs have no subcubes.
We previously proposed the first nontrivial examples of a code having support $t$-designs for all weights obtained from the Assmus-Mattson theorem and having support $t'$-designs for some weights with some $t'>t$. This suggests the possibility of generalizing the Assmus-Mattson theorem, which is very important in design and coding theory. In the present paper, we generalize this example as a strengthening of the Assmus-Mattson theorem along this direction. As a corollary, we provide a new characterization of the extended Golay code $\mathcal{G}_{24}$.
This study concerns probability distribution estimation of sample maximum. The traditional approach is the parametric fitting to the limiting distribution - the generalized extreme value distribution; however, the model in finite cases is misspecified to a certain extent. We propose a plug-in type of the kernel distribution estimator which does not need model specification. It is proved that both asymptotic convergence rates depend on the tail index and the second order parameter. As the tail gets light, the degree of misspecification of the parametric fitting becomes large, that means the convergence rate becomes slow. In the Weibull cases, which can be seen as the limit of tail-lightness, only the nonparametric distribution estimator keeps its consistency. Finally, we report results of numerical experiments and two real case studies.
Let $Q_{n}^{r}$ be the graph with vertex set $\{-1,1\}^{n}$ in which two vertices are joined if their Hamming distance is at most $r$. The edge-isoperimetric problem for $Q_{n}^{r}$ is that: For every $(n,r,M)$ such that $1\le r\le n$ and $1\le M\le2^{n}$, determine the minimum edge-boundary size of a subset of vertices of $Q_{n}^{r}$ with a given size $M$. In this paper, we apply two different approaches to prove bounds for this problem. The first approach is a linear programming approach and the second is a probabilistic approach. Our bound derived by the first approach generalizes the tight bound for $M=2^{n-1}$ derived by Kahn, Kalai, and Linial in 1989. Moreover, our bound is also tight for $M=2^{n-2}$ and $r\le\frac{n}{2}-1$. Our bounds derived by the second approach are expressed in terms of the \emph{noise stability}, and they are shown to be asymptotically tight as $n\to\infty$ when $r=2\lfloor\frac{\beta n}{2}\rfloor+1$ and $M=\lfloor\alpha2^{n}\rfloor$ for fixed $\alpha,\beta\in(0,1)$, and is tight up to a factor $2$ when $r=2\lfloor\frac{\beta n}{2}\rfloor$ and $M=\lfloor\alpha2^{n}\rfloor$. In fact, the edge-isoperimetric problem is equivalent to a ball-noise stability problem which is a variant of the traditional (i.i.d.-) noise stability problem. Our results can be interpreted as bounds for the ball-noise stability problem.
For $m, d \in {\mathbb N}$, a jittered sampling point set $P$ having $N = m^d$ points in $[0,1)^d$ is constructed by partitioning the unit cube $[0,1)^d$ into $m^d$ axis-aligned cubes of equal size and then placing one point independently and uniformly at random in each cube. We show that there are constants $c \ge 0$ and $C$ such that for all $d$ and all $m \ge d$ the expected non-normalized star discrepancy of a jittered sampling point set satisfies \[c \,dm^{\frac{d-1}{2}} \sqrt{1 + \log(\tfrac md)} \le {\mathbb E} D^*(P) \le C\, dm^{\frac{d-1}{2}} \sqrt{1 + \log(\tfrac md)}.\] This discrepancy is thus smaller by a factor of $\Theta\big(\sqrt{\frac{1+\log(m/d)}{m/d}}\,\big)$ than the one of a uniformly distributed random point set of $m^d$ points. This result improves both the upper and the lower bound for the discrepancy of jittered sampling given by Pausinger and Steinerberger (Journal of Complexity (2016)). It also removes the asymptotic requirement that $m$ is sufficiently large compared to $d$.
The mutual information (MI) of Gaussian multi-input multi-output (MIMO) channels has been evaluated by utilizing random matrix theory (RMT) and shown to asymptotically follow Gaussian distribution, where the ergodic mutual information (EMI) converges to a deterministic quantity. However, with non-Gaussian channels, there is a bias between the EMI and its deterministic equivalent (DE), whose evaluation is not available in the literature. This bias of the EMI is related to the bias for the trace of the resolvent in large RMT. In this paper, we first derive the bias for the trace of the resolvent, which is further extended to compute the bias for the linear spectral statistics (LSS). Then, we apply the above results on non-Gaussian MIMO channels to determine the bias for the EMI. It is also proved that the bias for the EMI is $-0.5$ times of that for the variance of the MI. Finally, the derived bias is utilized to modify the central limit theory (CLT) and calculate the outage probability. Numerical results show that the modified CLT significantly outperforms previous methods in approximating the distribution of the MI and improves the accuracy for the outage probability evaluation.
We initiate a systematic study on $\mathit{dynamic}$ $\mathit{influence}$ $\mathit{maximization}$ (DIM). In the DIM problem, one maintains a seed set $S$ of at most $k$ nodes in a dynamically involving social network, with the goal of maximizing the expected influence spread while minimizing the amortized updating cost. We consider two evolution models. In the $\mathit{incremental}$ model, the social network gets enlarged over time and one only introduces new users and establishes new social links, we design an algorithm that achieves $(1-1/e-\epsilon)$-approximation to the optimal solution and has $k \cdot\mathsf{poly}(\log n, \epsilon^{-1})$ amortized running time, which matches the state-of-art offline algorithm with only poly-logarithmic overhead. In the $\mathit{fully}$ $\mathit{dynamic}$ model, users join in and leave, influence propagation gets strengthened or weakened in real time, we prove that under the Strong Exponential Time Hypothesis (SETH), no algorithm can achieve $2^{-(\log n)^{1-o(1)}}$-approximation unless the amortized running time is $n^{1-o(1)}$. On the technical side, we exploit novel adaptive sampling approaches that reduce DIM to the dynamic MAX-k coverage problem, and design an efficient $(1-1/e-\epsilon)$-approximation algorithm for it. Our lower bound leverages the recent developed distributed PCP framework.
We present a randomized $O(m \log^2 n)$ work, $O(\text{polylog } n)$ depth parallel algorithm for minimum cut. This algorithm matches the work bounds of a recent sequential algorithm by Gawrychowski, Mozes, and Weimann [ICALP'20], and improves on the previously best parallel algorithm by Geissmann and Gianinazzi [SPAA'18], which performs $O(m \log^4 n)$ work in $O(\text{polylog } n)$ depth. Our algorithm makes use of three components that might be of independent interest. Firstly, we design a parallel data structure that efficiently supports batched mixed queries and updates on trees. It generalizes and improves the work bounds of a previous data structure of Geissmann and Gianinazzi and is work efficient with respect to the best sequential algorithm. Secondly, we design a parallel algorithm for approximate minimum cut that improves on previous results by Karger and Motwani. We use this algorithm to give a work-efficient procedure to produce a tree packing, as in Karger's sequential algorithm for minimum cuts. Lastly, we design an efficient parallel algorithm for solving the minimum $2$-respecting cut problem.