We give an algorithm that takes as input an $n$-vertex graph $G$ and an integer $k$, runs in time $2^{O(k^2)} n^{O(1)}$, and outputs a tree decomposition of $G$ of width at most $k$, if such a decomposition exists. This resolves the long-standing open problem of whether there is a $2^{o(k^3)} n^{O(1)}$ time algorithm for treewidth. In particular, our algorithm is the first improvement on the dependency on $k$ in algorithms for treewidth since the $2^{O(k^3)} n^{O(1)}$ time algorithm given by Bodlaender and Kloks [ICALP 1991] and Lagergren and Arnborg [ICALP 1991]. We also give an algorithm that given an $n$-vertex graph $G$, an integer $k$, and a rational $\varepsilon \in (0,1)$, in time $k^{O(k/\varepsilon)} n^{O(1)}$ either outputs a tree decomposition of $G$ of width at most $(1+\varepsilon)k$ or determines that the treewidth of $G$ is larger than $k$. Prior to our work, no approximation algorithms for treewidth with approximation ratio less than $2$, other than the exact algorithms, were known. Both of our algorithms work in polynomial space.
Consider that there are $k\le n$ agents in a simple, connected, and undirected graph $G=(V,E)$ with $n$ nodes and $m$ edges. The goal of the dispersion problem is to move these $k$ agents to distinct nodes. Agents can communicate only when they are at the same node, and no other means of communication such as whiteboards are available. We assume that the agents operate synchronously. We consider two scenarios: when all agents are initially located at any single node (rooted setting) and when they are initially distributed over any one or more nodes (general setting). Kshemkalyani and Sharma presented a dispersion algorithm for the general setting, which uses $O(m_k)$ time and $\log(k+\delta)$ bits of memory per agent [OPODIS 2021]. Here, $m_k$ is the maximum number of edges in any induced subgraph of $G$ with $k$ nodes, and $\delta$ is the maximum degree of $G$. This algorithm is the fastest in the literature, as no algorithm with $o(m_k)$ time has been discovered even for the rooted setting. In this paper, we present faster algorithms for both the rooted and general settings. First, we present an algorithm for the rooted setting that solves the dispersion problem in $O(k\log \min(k,\delta))=O(k\log k)$ time using $O(\log \delta)$ bits of memory per agent. Next, we propose an algorithm for the general setting that achieves dispersion in $O(k (\log k)\cdot (\log \min(k,\delta))=O(k \log^2 k)$ time using $O(\log (k+\delta))$ bits.
Most adversarial attacks and defenses focus on perturbations within small $\ell_p$-norm constraints. However, $\ell_p$ threat models cannot capture all relevant semantic-preserving perturbations, and hence, the scope of robustness evaluations is limited. In this work, we introduce Score-Based Adversarial Generation (ScoreAG), a novel framework that leverages the advancements in score-based generative models to generate adversarial examples beyond $\ell_p$-norm constraints, so-called unrestricted adversarial examples, overcoming their limitations. Unlike traditional methods, ScoreAG maintains the core semantics of images while generating realistic adversarial examples, either by transforming existing images or synthesizing new ones entirely from scratch. We further exploit the generative capability of ScoreAG to purify images, empirically enhancing the robustness of classifiers. Our extensive empirical evaluation demonstrates that ScoreAG matches the performance of state-of-the-art attacks and defenses across multiple benchmarks. This work highlights the importance of investigating adversarial examples bounded by semantics rather than $\ell_p$-norm constraints. ScoreAG represents an important step towards more encompassing robustness assessments.
We consider an online binary prediction setting where a forecaster observes a sequence of $T$ bits one by one. Before each bit is revealed, the forecaster predicts the probability that the bit is $1$. The forecaster is called well-calibrated if for each $p \in [0, 1]$, among the $n_p$ bits for which the forecaster predicts probability $p$, the actual number of ones, $m_p$, is indeed equal to $p \cdot n_p$. The calibration error, defined as $\sum_p |m_p - p n_p|$, quantifies the extent to which the forecaster deviates from being well-calibrated. It has long been known that an $O(T^{2/3})$ calibration error is achievable even when the bits are chosen adversarially, and possibly based on the previous predictions. However, little is known on the lower bound side, except an $\Omega(\sqrt{T})$ bound that follows from the trivial example of independent fair coin flips. In this paper, we prove an $\Omega(T^{0.528})$ bound on the calibration error, which is the first super-$\sqrt{T}$ lower bound for this setting to the best of our knowledge. The technical contributions of our work include two lower bound techniques, early stopping and sidestepping, which circumvent the obstacles that have previously hindered strong calibration lower bounds. We also propose an abstraction of the prediction setting, termed the Sign-Preservation game, which may be of independent interest. This game has a much smaller state space than the full prediction setting and allows simpler analyses. The $\Omega(T^{0.528})$ lower bound follows from a general reduction theorem that translates lower bounds on the game value of Sign-Preservation into lower bounds on the calibration error.
In this paper, we prove the following non-linear generalization of the classical Sylvester-Gallai theorem. Let $\mathbb{K}$ be an algebraically closed field of characteristic $0$, and $\mathcal{F}=\{F_1,\cdots,F_m\} \subset \mathbb{K}[x_1,\cdots,x_N]$ be a set of irreducible homogeneous polynomials of degree at most $d$ such that $F_i$ is not a scalar multiple of $F_j$ for $i\neq j$. Suppose that for any two distinct $F_i,F_j\in \mathcal{F}$, there is $k\neq i,j$ such that $F_k\in \mathrm{rad}(F_i,F_j)$. We prove that such radical SG configurations must be low dimensional. More precisely, we show that there exists a function $\lambda : \mathbb{N} \to \mathbb{N}$, independent of $\mathbb{K},N$ and $m$, such that any such configuration $\mathcal{F}$ must satisfy $$ \dim (\mathrm{span}_{\mathbb{K}}{\mathcal{F}}) \leq \lambda(d). $$ Our result confirms a conjecture of Gupta [Gup14, Conjecture 2] and generalizes the quadratic and cubic Sylvester-Gallai theorems of [S20,OS22]. Our result takes us one step closer towards the first deterministic polynomial time algorithm for the Polynomial Identity Testing (PIT) problem for depth-4 circuits of bounded top and bottom fanins. Our result, when combined with the Stillman uniformity type results of [AH20a,DLL19,ESS21], yields uniform bounds for several algebraic invariants such as projective dimension, Betti numbers and Castelnuovo-Mumford regularity of ideals generated by radical SG configurations.
We formalize and interpret the geometric structure of $d$-dimensional fully connected ReLU-layers in neural networks. The parameters of a ReLU-layer induce a natural partition of the input domain, such that in each sector of the partition, the ReLU-layer can be greatly simplified. This leads to a geometric interpretation of a ReLU-layer as a projection onto a polyhedral cone followed by an affine transformation, in line with the description in [doi:10.48550/arXiv.1905.08922] for convolutional networks with ReLU activations. Further, this structure facilitates simplified expressions for preimages of the intersection between partition sectors and hyperplanes, which is useful when describing decision boundaries in a classification setting. We investigate this in detail for a feed-forward network with one hidden ReLU-layer, where we provide results on the geometric complexity of the decision boundary generated by such networks, as well as proving that modulo an affine transformation, such a network can only generate $d$ different decision boundaries. Finally, the effect of adding more layers to the network is discussed.
We introduce Longitudinal Predictive Conformal Inference (LPCI), a novel distribution-free conformal prediction algorithm for longitudinal data. Current conformal prediction approaches for time series data predominantly focus on the univariate setting, and thus lack cross-sectional coverage when applied individually to each time series in a longitudinal dataset. The current state-of-the-art for longitudinal data relies on creating infinitely-wide prediction intervals to guarantee both cross-sectional and asymptotic longitudinal coverage. The proposed LPCI method addresses this by ensuring that both longitudinal and cross-sectional coverages are guaranteed without resorting to infinitely wide intervals. In our approach, we model the residual data as a quantile fixed-effects regression problem, constructing prediction intervals with a trained quantile regressor. Our extensive experiments demonstrate that LPCI achieves valid cross-sectional coverage and outperforms existing benchmarks in terms of longitudinal coverage rates. Theoretically, we establish LPCI's asymptotic coverage guarantees for both dimensions, with finite-width intervals. The robust performance of LPCI in generating reliable prediction intervals for longitudinal data underscores its potential for broad applications, including in medicine, finance, and supply chain management.
Explicit Runge--Kutta (\rk{}) methods are susceptible to a reduction in the observed order of convergence when applied to initial-boundary value problem with time-dependent boundary conditions. We study conditions on \erk{} methods that guarantee high-order convergence for linear problems; we refer to these conditions as weak stage order conditions. We prove a general relationship between the method's order, weak stage order, and number of stages. We derive \erk{} methods with high weak stage order and demonstrate, through numerical tests, that they avoid the order reduction phenomenon up to any order for linear problems and up to order three for nonlinear problems.
We consider the following natural problem that generalizes min-sum-radii clustering: Given is $k\in\mathbb{N}$ as well as some metric space $(V,d)$ where $V=F\cup C$ for facilities $F$ and clients $C$. The goal is to find a clustering given by $k$ facility-radius pairs $(f_1,r_1),\dots,(f_k,r_k)\in F\times\mathbb{R}_{\geq 0}$ such that $C\subseteq B(f_1,r_1)\cup\dots\cup B(f_k,r_k)$ and $\sum_{i=1,\dots,k} g(r_i)$ is minimized for some increasing function $g:\mathbb{R}_{\geq 0}\rightarrow\mathbb{R}_{\geq 0}$. Here, $B(x,r)$ is the radius-$r$ ball centered at $x$. For the case that $(V,d)$ is the shortest-path metric of some edge-weighted graph of bounded treewidth, we present a dynamic program that is tailored to this class of problems and achieves a polynomial running time, establishing that the problem is in $\mathsf{XP}$ with parameter treewidth.
We introduce the natural notion of a matching frame in a $2$-dimensional string. A matching frame in a $2$-dimensional $n\times m$ string $M$, is a rectangle such that the strings written on the horizontal sides of the rectangle are identical, and so are the strings written on the vertical sides of the rectangle. Formally, a matching frame in $M$ is a tuple $(u,d,\ell,r)$ such that $M[u][\ell ..r] = M[d][\ell ..r]$ and $M[u..d][\ell] = M[u..d][r]$. In this paper, we present an algorithm for finding the maximum perimeter matching frame in a matrix $M$ in $\tilde{O}(n^{2.5})$ time (assuming $n \ge m)$. Additionally, for every constant $\epsilon> 0$ we present a near-linear $(1-\epsilon)$-approximation algorithm for the maximum perimeter of a matching frame. In the development of the aforementioned algorithms, we introduce inventive technical elements and uncover distinctive structural properties that we believe will captivate the curiosity of the community.
Let $P$ be a set of at most $n$ points and let $R$ be a set of at most $n$ geometric ranges, such as for example disks or rectangles, where each $p \in P$ has an associated supply $s_{p} > 0$, and each $r \in R$ has an associated demand $d_{r} > 0$. An assignment is a set $\mathcal{A}$ of ordered triples $(p,r,a_{pr}) \in P \times R \times \mathbb{R}_{>0}$ such that $p \in r$. We show how to compute a maximum assignment that satisfies the constraints given by the supplies and demands. Using our techniques, we can also solve minimum bottleneck problems, such as computing a perfect matching between a set of $n$ red points~$P$ and $n$ blue points $Q$ that minimizes the length of the longest edge. For the $L_\infty$-metric, we can do this in time $O(n^{1+\varepsilon})$ in any fixed dimension, for the $L_2$-metric in the plane in time $O(n^{4/3 + \varepsilon})$, for any $\varepsilon > 0$.