In 1982, Tuza conjectured that the size $\tau(G)$ of a minimum set of edges that intersects every triangle of a graph $G$ is at most twice the size $\nu(G)$ of a maximum set of edge-disjoint triangles of $G$. This conjecture has been proved for several graph classes. In this paper, we present three results regarding Tuza's Conjecture for dense graphs. By using a probabilistic argument, Tuza proved his conjecture for graphs on $n$ vertices with minimum degree at least $\frac{7n}{8}$. We extend this technique to show that Tuza's Conjecture is valid for split graphs with minimum degree at least $\frac{3n}{5}$, and that $\tau(G) < \frac{28}{15}\nu(G)$ for every tripartite graph with minimum degree more than $\frac{33n}{56}$. Finally, we show that $\tau(G)\leq \frac{3}{2}\nu(G)$ when $G$ is a complete 4-partite graph. Moreover, this bound is tight.
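To make the two quantities concrete, the following is a minimal brute-force sketch that computes $\tau(G)$ and $\nu(G)$ for very small graphs. It is not the probabilistic technique used in the paper; the function names (`triangles`, `nu`, `tau`, `complete_graph`) are hypothetical, and the exhaustive search is only feasible for a handful of vertices.

```python
from itertools import combinations

def triangles(n, edges):
    """Return every triangle of the graph as a frozenset of its three edges."""
    edge_set = {frozenset(e) for e in edges}
    found = []
    for a, b, c in combinations(range(n), 3):
        tri = [frozenset((a, b)), frozenset((a, c)), frozenset((b, c))]
        if all(e in edge_set for e in tri):
            found.append(frozenset(tri))
    return found

def nu(tris):
    """Largest number of pairwise edge-disjoint triangles (exhaustive search)."""
    for size in range(len(tris), 0, -1):
        for pick in combinations(tris, size):
            if all(s.isdisjoint(t) for s, t in combinations(pick, 2)):
                return size
    return 0

def tau(tris):
    """Smallest number of edges that meets every triangle (exhaustive search)."""
    edges = list({e for t in tris for e in t})
    for size in range(len(edges) + 1):
        for pick in combinations(edges, size):
            chosen = set(pick)
            if all(t & chosen for t in tris):
                return size
    return len(edges)

def complete_graph(n):
    """Edge list of the complete graph on vertices 0..n-1 (illustrative helper)."""
    return list(combinations(range(n), 2))

if __name__ == "__main__":
    for n in (4, 5):
        tris = triangles(n, complete_graph(n))
        print(n, nu(tris), tau(tris))
```

On $K_4$ and $K_5$ the search gives $\tau = 2\nu$, so these small complete graphs meet the conjectured bound with equality.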
Given a graph $G=(V,E)$ and an integer $k\in \mathbb{N}$, we investigate the 2-Eigenvalue Vertex Deletion (2-EVD) problem. The objective is to remove at most $k$ vertices such that the adjacency matrix of the resulting graph has at most two distinct eigenvalues. It is known that the adjacency matrix of a graph has at most two distinct eigenvalues if and only if the graph is a collection of equal-sized cliques. Thus, 2-Eigenvalue Vertex Deletion amounts to removing a set of at most $k$ vertices to transform the graph into a collection of equal-sized cliques. The 2-Eigenvalue Edge Editing (2-EEE), 2-Eigenvalue Edge Deletion (2-EED) and 2-Eigenvalue Edge Addition (2-EEA) problems are defined analogously. We present a kernel of size $\mathcal{O}(k^{3})$ for $2$-EVD, along with an FPT algorithm with a running time of $\mathcal{O}^{*}(2^{k})$. For the problem $2$-EEE, we provide a kernel of size $\mathcal{O}(k^{2})$. Additionally, we present linear kernels of size $5k$ and $6k$ for $2$-EEA and $2$-EED respectively. For $2$-EED, we also construct an algorithm with running time $\mathcal{O}^{*}(1.47^{k})$. These results address open questions posed by Misra et al. (ISAAC 2023) regarding the complexity of these problems when parameterized by the solution size.
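The clique characterization above gives a purely combinatorial test for the target property. The sketch below checks whether a simple undirected graph is a collection of equal-sized cliques and then solves 2-EVD by trying every deletion set of size at most $k$. This is an exponential baseline for illustration only, not the kernel or the $\mathcal{O}^{*}(2^{k})$ algorithm of the paper, and the function names are hypothetical.

```python
from itertools import combinations

def is_equal_clique_collection(vertices, edges):
    """True if the subgraph induced on `vertices` is a disjoint union of
    cliques that all have the same size. Assumes a simple undirected graph."""
    vertices = set(vertices)
    adj = {v: set() for v in vertices}
    for u, v in edges:
        if u in vertices and v in vertices:
            adj[u].add(v)
            adj[v].add(u)
    seen, sizes = set(), set()
    for start in vertices:
        if start in seen:
            continue
        # Collect the connected component of `start` by depth-first search.
        comp, stack = {start}, [start]
        while stack:
            u = stack.pop()
            for w in adj[u]:
                if w not in comp:
                    comp.add(w)
                    stack.append(w)
        seen |= comp
        # A component is a clique iff every vertex sees all the others.
        if any(len(adj[v]) != len(comp) - 1 for v in comp):
            return False
        sizes.add(len(comp))
    return len(sizes) <= 1

def two_evd_brute_force(vertices, edges, k):
    """Return a smallest deletion set of size <= k, or None if none exists."""
    vertices = list(vertices)
    for size in range(k + 1):
        for removed in combinations(vertices, size):
            rest = set(vertices) - set(removed)
            if is_equal_clique_collection(rest, edges):
                return set(removed)
    return None
```

For example, a triangle together with an isolated vertex is not a collection of equal-sized cliques, but deleting the isolated vertex leaves a single $K_3$, so the instance is a yes-instance for $k = 1$.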
Given a (multi)graph $G$ which contains a bipartite subgraph with $\rho$ edges, what is the largest triangle-free subgraph of $G$ that can be found efficiently? We present an SDP-based algorithm that finds one with at least $0.8823 \rho$ edges, thus improving on the subgraph with $0.878 \rho$ edges obtained by the classic Max-Cut algorithm of Goemans and Williamson. On the other hand, by a reduction from Håstad's 3-bit PCP we show that it is NP-hard to find a triangle-free subgraph with $(25 / 26 + \epsilon) \rho \approx (0.961 + \epsilon) \rho$ edges. As an application, we classify the Maximum Promise Constraint Satisfaction Problem MaxPCSP($G$,$H$) for all bipartite $G$: Given an input (multi)graph $X$ which admits a $G$-colouring satisfying $\rho$ edges, find an $H$-colouring of $X$ that satisfies $\rho$ edges. This problem is solvable in polynomial time, apart from trivial cases, if $H$ contains a triangle, and is NP-hard otherwise.
Sorting has a natural generalization where the input consists of: (1) a ground set $X$ of size $n$, (2) a partial oracle $O_P$ specifying some fixed partial order $P$ on $X$ and (3) a linear oracle $O_L$ specifying a linear order $L$ that extends $P$. The goal is to recover the linear order $L$ on $X$ using the fewest number of linear oracle queries. In this problem, we measure algorithmic complexity through three metrics: oracle queries to $O_L$, oracle queries to $O_P$, and the time spent. Any algorithm requires worst-case $\log_2 e(P)$ linear oracle queries to recover the linear order on $X$. Kahn and Saks presented the first algorithm that uses $\Theta(\log e(P))$ linear oracle queries (using $O(n^2)$ partial oracle queries and exponential time). The state-of-the-art for the general problem is by Cardinal, Fiorini, Joret, Jungers and Munro who at STOC'10 manage to separate the linear and partial oracle queries into a preprocessing and query phase. They can preprocess $P$ using $O(n^2)$ partial oracle queries and $O(n^{2.5})$ time. Then, given $O_L$, they uncover the linear order on $X$ in $\Theta(\log e(P))$ linear oracle queries and $O(n + \log e(P))$ time -- which is worst-case optimal in the number of linear oracle queries but not in the time spent. For $c \geq 1$, our algorithm can preprocess $O_P$ using $O(n^{1 + \frac{1}{c}})$ queries and time. Given $O_L$, we uncover $L$ using $\Theta(c \log e(P))$ queries and time. We show a matching lower bound, as there exist positive constants $(\alpha, \beta)$ where for any constant $c \geq 1$, any algorithm that uses at most $\alpha \cdot n^{1 + \frac{1}{c}}$ preprocessing must use worst-case at least $\beta \cdot c \log e(P)$ linear oracle queries. Thus, we solve the problem of sorting under partial information through an algorithm that is asymptotically tight across all three metrics.
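The quantity $e(P)$ in the bounds above is the number of linear extensions of $P$, and $\log_2 e(P)$ is the information-theoretic lower bound on linear oracle queries. As a small illustration (not the algorithm of the paper), the sketch below counts linear extensions by enumerating all permutations, which is only practical for tiny ground sets. The names `count_linear_extensions` and `query_lower_bound` are hypothetical.

```python
from itertools import permutations
from math import ceil, log2

def count_linear_extensions(elements, relations):
    """Count the linear orders of `elements` that respect every pair (a, b)
    in `relations`, where (a, b) is read as "a comes before b"."""
    count = 0
    for order in permutations(elements):
        position = {x: i for i, x in enumerate(order)}
        if all(position[a] < position[b] for a, b in relations):
            count += 1
    return count

def query_lower_bound(elements, relations):
    """Worst-case number of yes/no comparisons needed: ceil(log2 e(P)).
    Assumes the relations are consistent, so that e(P) >= 1."""
    return ceil(log2(count_linear_extensions(elements, relations)))

if __name__ == "__main__":
    ground = ["a", "b", "c", "d"]
    print(count_linear_extensions(ground, [("a", "b"), ("c", "d")]))
    print(query_lower_bound(ground, [("a", "b"), ("c", "d")]))
```

With two disjoint chains $a < b$ and $c < d$ there are $4!/(2 \cdot 2) = 6$ linear extensions, so at least $\lceil \log_2 6 \rceil = 3$ comparisons are needed in the worst case, compared with $\lceil \log_2 24 \rceil = 5$ when nothing is known in advance.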
In the Steiner Tree problem we are given an undirected edge-weighted graph as input, along with a set $K$ of vertices called terminals. The task is to output a minimum-weight connected subgraph that spans all the terminals. The famous Dreyfus-Wagner algorithm running in $3^{|K|} \mathsf{poly}(n)$ time shows that the problem is fixed-parameter tractable parameterized by the number of terminals. We present fixed-parameter tractable algorithms for Steiner Tree using structurally smaller parameterizations. Our first result concerns the parameterization by a multiway cut $S$ of the terminals, which is a vertex set $S$ (possibly containing terminals) such that each connected component of $G-S$ contains at most one terminal. We show that Steiner Tree can be solved in $2^{O(|S|\log|S|)}\mathsf{poly}(n)$ time and polynomial space, where $S$ is a minimum multiway cut for $K$. The algorithm is based on the insight that, after guessing how an optimal Steiner tree interacts with a multiway cut $S$, computing a minimum-cost solution of this type can be formulated as minimum-cost bipartite matching. Our second result concerns a new hybrid parameterization called $K$-free treewidth that simultaneously refines the number of terminals $|K|$ and the treewidth of the input graph. By utilizing recent work on $\mathcal{H}$-Treewidth in order to find a corresponding decomposition of the graph, we give an algorithm that solves Steiner Tree in time $2^{O(k)} \mathsf{poly}(n)$, where $k$ denotes the $K$-free treewidth of the input graph. To obtain this running time, we show how the rank-based approach for solving Steiner Tree parameterized by treewidth can be extended to work in the setting of $K$-free treewidth, by exploiting existing algorithms parameterized by $|K|$ to compute the table entries of leaf bags of a tree $K$-free decomposition.
Given a weighted graph $G$, a minimum weight $\alpha$-spanner is a least-weight subgraph $H\subseteq G$ that preserves minimum distances between all node pairs up to a factor of $\alpha$. There are many results on heuristics and approximation algorithms, including a recent investigation of their practical performance [20]. Exact approaches, in contrast, have long been denounced as impractical: The first exact ILP (integer linear program) method [48] from 2004 is based on a model with exponentially many path variables, solved via column generation. A second approach [2], modeling via arc-based multicommodity flow, was presented in 2019. In both cases, only graphs with 40-100 nodes were reported to be solvable. In this paper, we briefly report on a theoretical comparison between these two models from a polyhedral point of view, and then concentrate on improvements and engineering aspects. We evaluate their performance in a large-scale empirical study. We report that our tuned column generation approach, based on multicriteria shortest path computations, is able to solve instances with over 16000 nodes within 13 minutes. Furthermore, now knowing optimal solutions for larger graphs, we are able to investigate the quality of the strongest known heuristic on reasonably sized instances for the first time.
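The $\alpha$-spanner condition itself is easy to verify for a candidate subgraph, even though finding a minimum-weight one is hard. The following sketch checks the definition directly using Floyd-Warshall all-pairs distances. It assumes an undirected graph with non-negative edge weights and vertices numbered $0, \dots, n-1$; it is a feasibility check only, unrelated to the ILP models compared in the paper, and the function names are hypothetical.

```python
from itertools import product

def all_pairs_distances(n, weighted_edges):
    """Floyd-Warshall on an undirected graph given as (u, v, weight) triples."""
    inf = float("inf")
    dist = [[0 if i == j else inf for j in range(n)] for i in range(n)]
    for u, v, w in weighted_edges:
        dist[u][v] = min(dist[u][v], w)
        dist[v][u] = min(dist[v][u], w)
    # product() varies its last index fastest, so k is the outermost loop.
    for k, i, j in product(range(n), repeat=3):
        if dist[i][k] + dist[k][j] < dist[i][j]:
            dist[i][j] = dist[i][k] + dist[k][j]
    return dist

def is_spanner(n, graph_edges, sub_edges, alpha):
    """True if distances in the subgraph exceed those in the graph by at most
    a factor of alpha for every node pair."""
    d_graph = all_pairs_distances(n, graph_edges)
    d_sub = all_pairs_distances(n, sub_edges)
    return all(
        d_sub[i][j] <= alpha * d_graph[i][j]
        for i in range(n)
        for j in range(n)
    )
```

For a unit-weight triangle, dropping one edge leaves a path in which the two endpoints are at distance 2 instead of 1, so the path is a 2-spanner but not a 1.5-spanner.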
Let $R \cup B$ be a set of $n$ points in $\mathbb{R}^2$, and let $k \in 1..n$. Our goal is to compute a line that "best" separates the "red" points $R$ from the "blue" points $B$ with at most $k$ outliers. We present an efficient semi-online dynamic data structure that can maintain whether such a separator exists. Furthermore, we present efficient exact and approximation algorithms that compute a linear separator that is guaranteed to misclassify at most $k$ points and minimizes the distance to the farthest outlier. Our exact algorithm runs in $O(nk + n \log n)$ time, and our $(1+\varepsilon)$-approximation algorithm runs in $O(\varepsilon^{-1/2}((n + k^2) \log n))$ time. Based on our $(1+\varepsilon)$-approximation algorithm we then also obtain a semi-online data structure to maintain such a separator efficiently.
Kaplan et al. and Hoffmann et al. developed influential scaling laws for the optimal model size as a function of the compute budget, but these laws yield substantially different predictions. We explain the discrepancy by reproducing the Kaplan scaling law on two datasets (OpenWebText2 and RefinedWeb) and identifying three factors causing the difference: last layer computational cost, warmup duration, and scale-dependent optimizer tuning. With these factors corrected, we obtain excellent agreement with the Hoffmann et al. (i.e., "Chinchilla") scaling law. Counter to a hypothesis of Hoffmann et al., we find that careful learning rate decay is not essential for the validity of their scaling law. As a secondary result, we derive scaling laws for the optimal learning rate and batch size, finding that tuning the AdamW $\beta_2$ parameter is essential at lower batch sizes.
{\sc Vertex $(s, t)$-Cut} and {\sc Vertex Multiway Cut} are two fundamental graph separation problems in algorithmic graph theory. We study matroidal generalizations of these problems, where in addition to the usual input, we are given a representation $R \in \mathbb{F}^{r \times n}$ of a linear matroid $\mathcal{M} = (V(G), \mathcal{I})$ of rank $r$, and the goal is to determine whether there exists a vertex subset $S \subseteq V(G)$ that has the required cut properties and is also independent in the matroid $\mathcal{M}$. We refer to these problems as {\sc Independent Vertex $(s, t)$-cut} and {\sc Independent Multiway Cut}, respectively. We show that these problems are fixed-parameter tractable ({\sf FPT}) when parameterized by the solution size (which can be assumed to be equal to the rank of the matroid $\mathcal{M}$). These results are obtained by exploiting the recent technique of flow augmentation [Kim et al.~STOC '22], combined with a dynamic programming algorithm on flow-paths \`a la [Feige and Mahdian,~STOC '06] that maintains a representative family of solutions w.r.t.~the given matroid [Marx, TCS '06; Fomin et al., JACM]. As a corollary, we also obtain {\sf FPT} algorithms for the independent version of {\sc Odd Cycle Transversal}. Further, our results can be generalized to other variants of the problems, e.g., weighted versions, or edge-deletion versions.
In the Maximum Independent Set of Hyperrectangles problem, we are given a set of $n$ (possibly overlapping) $d$-dimensional axis-aligned hyperrectangles, and the goal is to find a subset of non-overlapping hyperrectangles of maximum cardinality. For $d=1$, this corresponds to the classical Interval Scheduling problem, where a simple greedy algorithm returns an optimal solution. In the offline setting, for $d$-dimensional hyperrectangles, polynomial time $(\log n)^{O(d)}$-approximation algorithms are known. However, the problem becomes notably challenging in the online setting, where the input objects (hyperrectangles) appear one by one in an adversarial order, and on the arrival of an object, the algorithm needs to make an immediate and irrevocable decision whether or not to select the object while maintaining feasibility. Even for interval scheduling, an $\Omega(n)$ lower bound is known on the competitive ratio. To circumvent these negative results, in this work, we study the online maximum independent set of axis-aligned hyperrectangles in the random-order arrival model, where the adversary specifies the set of input objects which then arrive in a uniformly random order. Starting from the prototypical secretary problem, the random-order model has received significant attention as a way to study algorithms beyond worst-case competitive analysis. Surprisingly, we show that the problem in the random-order model almost matches the best-known offline approximation guarantees, up to polylogarithmic factors. In particular, we give a simple $(\log n)^{O(d)}$-competitive algorithm for $d$-dimensional hyperrectangles in this model, which runs in $\tilde{O}_d(n)$ time. Our approach also yields $(\log n)^{O(d)}$-competitive algorithms in the random-order model for more general objects such as $d$-dimensional fat objects and ellipsoids. Furthermore, our guarantees hold with high probability.
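For the $d=1$ case mentioned above, the offline greedy algorithm can be sketched in a few lines. The abstract does not name a specific rule; the version below uses the standard earliest-finish-time rule from textbooks, and it treats intervals as half-open $[\text{start}, \text{end})$ so that touching intervals do not overlap. Both choices, and the function name, are assumptions made for illustration. This is the offline baseline, not the random-order online algorithm of the paper.

```python
def max_disjoint_intervals(intervals):
    """Offline greedy for Interval Scheduling: scan intervals by increasing
    right endpoint and keep each one that starts no earlier than the end of
    the last interval kept. Intervals are (start, end) pairs, half-open."""
    chosen = []
    last_end = float("-inf")
    for start, end in sorted(intervals, key=lambda iv: iv[1]):
        if start >= last_end:
            chosen.append((start, end))
            last_end = end
    return chosen

if __name__ == "__main__":
    jobs = [(1, 4), (3, 5), (0, 6), (5, 7), (3, 9), (5, 9),
            (6, 10), (8, 11), (8, 12), (2, 14), (12, 16)]
    print(max_disjoint_intervals(jobs))
```

The greedy rule needs the whole input sorted in advance, which is exactly what the online setting forbids: there, intervals arrive one at a time and each accept or reject decision is final.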
In multi-turn dialog, utterances do not always take the full form of sentences \cite{Carbonell1983DiscoursePA}, which naturally makes understanding the dialog context more difficult. However, it is essential to fully grasp the dialog context to generate a reasonable response. Hence, in this paper, we propose to improve the response generation performance by examining the model's ability to answer a reading comprehension question, where the question is focused on the omitted information in the dialog. Inspired by the multi-task learning scheme, we propose a joint framework that unifies these two tasks, sharing the same encoder to extract the common and task-invariant features with different decoders to learn task-specific features. To better fuse information from the question and the dialog history in the encoding part, we propose to augment the Transformer architecture with a memory updater, which is designed to selectively store and update the history dialog information so as to support downstream tasks. For the experiment, we employ human annotators to write and examine a large-scale dialog reading comprehension dataset. Extensive experiments are conducted on this dataset, and the results show that the proposed model brings substantial improvements over several strong baselines on both tasks. In this way, we demonstrate that reasoning can indeed help improve response generation and vice versa. We release our large-scale dataset for further research.