An essential requirement of spanners in many applications is to be fault-tolerant: a $(1+\epsilon)$-spanner of a metric space is called (vertex) $f$-fault-tolerant ($f$-FT) if it remains a $(1+\epsilon)$-spanner (for the non-faulty points) when up to $f$ faulty points are removed from the spanner. Fault-tolerant (FT) spanners for Euclidean and doubling metrics have been extensively studied since the 90s. For low-dimensional Euclidean metrics, Czumaj and Zhao in SoCG'03 [CZ03] showed that the optimal guarantees $O(f n)$, $O(f)$ and $O(f^2)$ on the size, degree and lightness of $f$-FT spanners can be achieved via a greedy algorithm, which na\"{\i}vely runs in $O(n^3) \cdot 2^{O(f)}$ time. The question of whether the optimal bounds of [CZ03] can be achieved via a fast construction has remained elusive, with the lightness parameter being the bottleneck. Moreover, in the wider family of doubling metrics, it is not even clear whether there exists an $f$-FT spanner with lightness that depends solely on $f$ (even exponentially): all existing constructions have lightness $\Omega(\log n)$ since they are built on the net-tree spanner, which is induced by a hierarchical net-tree of lightness $\Omega(\log n)$. In this paper we settle in the affirmative these longstanding open questions. Specifically, we design a construction of $f$-FT spanners that is optimal with respect to all the involved parameters (size, degree, lightness and running time): For any $n$-point doubling metric, any $\epsilon > 0$, and any integer $1 \le f \le n-2$, our construction provides, within time $O(n \log n + f n)$, an $f$-FT $(1+\epsilon)$-spanner with size $O(f n)$, degree $O(f)$ and lightness $O(f^2)$.
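The greedy algorithm referenced above can be illustrated, in its basic non-fault-tolerant form, by the following sketch (names and structure are ours, not the paper's): scan point pairs in increasing order of distance and add an edge only when the current partial spanner does not already provide a $(1+\epsilon)$-approximate path. This is the naive cubic-style construction, not the paper's optimized FT variant.

```python
import heapq
import math
from itertools import combinations

def shortest_path(adj, s, t):
    """Dijkstra over the current spanner edges; math.inf if unreachable."""
    dist = {s: 0.0}
    pq = [(0.0, s)]
    while pq:
        d, u = heapq.heappop(pq)
        if u == t:
            return d
        if d > dist.get(u, math.inf):
            continue
        for v, w in adj[u]:
            nd = d + w
            if nd < dist.get(v, math.inf):
                dist[v] = nd
                heapq.heappush(pq, (nd, v))
    return math.inf

def greedy_spanner(points, eps):
    """Naive greedy (1+eps)-spanner: scan pairs by increasing distance and
    add an edge only if the pair is not already (1+eps)-covered."""
    adj = {i: [] for i in range(len(points))}
    edges = []
    for u, v in sorted(combinations(range(len(points)), 2),
                       key=lambda e: math.dist(points[e[0]], points[e[1]])):
        d = math.dist(points[u], points[v])
        if shortest_path(adj, u, v) > (1 + eps) * d:
            adj[u].append((v, d))
            adj[v].append((u, d))
            edges.append((u, v))
    return edges, adj
```

By construction, every pair is either connected directly or was already stretch-covered when scanned, so the output satisfies the $(1+\epsilon)$ stretch bound for all pairs.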
For a given function $F$ from $\mathbb F_{p^n}$ to itself, determining whether there exists a function which is CCZ-equivalent but EA-inequivalent to $F$ is a very important and interesting problem. For example, K\"olsch \cite{KOL21} showed that there is no function which is CCZ-equivalent but EA-inequivalent to the inverse function. On the other hand, for the cases of the Gold function $F(x)=x^{2^i+1}$ and $F(x)=x^3+{\rm Tr}(x^9)$ over $\mathbb F_{2^n}$, Budaghyan, Carlet and Pott (respectively, Budaghyan, Carlet and Leander) \cite{BCP06, BCL09FFTA} found functions which are CCZ-equivalent but EA-inequivalent to $F$. In this paper, when a given function $F$ has a component function which has a linear structure, we present functions which are CCZ-equivalent to $F$, and if suitable conditions are satisfied, the constructed functions are shown to be EA-inequivalent to $F$. As a consequence, for every quadratic function $F$ on $\mathbb F_{2^n}$ ($n\geq 4$) with nonlinearity $>0$ and differential uniformity $\leq 2^{n-3}$, we explicitly construct functions which are CCZ-equivalent but EA-inequivalent to $F$. Also, for every non-planar quadratic function $F$ on $\mathbb F_{p^n}$ $(p>2, n\geq 4)$ with $|\mathcal W_F|\leq p^{n-1}$ and differential uniformity $\leq p^{n-3}$, we explicitly construct functions which are CCZ-equivalent but EA-inequivalent to $F$.
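The notion of a linear structure can be checked numerically. The following toy sketch (field, modulus and parameter choices are ours for illustration) works in $\mathbb F_{2^4}$ with the Gold function $F(x)=x^3$: the component function $x \mapsto \mathrm{Tr}(\lambda F(x))$ has linear structure $a$ exactly when its derivative $\mathrm{Tr}(\lambda(F(x+a)+F(x)))$ is constant in $x$, which for this $F$ happens when $\lambda = a^{-3}$.

```python
MOD, N = 0b10011, 4          # GF(2^4) = GF(2)[x] / (x^4 + x + 1)

def gf_mul(a, b):
    """Carry-less multiplication with reduction modulo x^4 + x + 1."""
    r = 0
    while b:
        if b & 1:
            r ^= a
        b >>= 1
        a <<= 1
        if a & (1 << N):
            a ^= MOD
    return r

def gf_pow(a, e):
    r = 1
    while e:
        if e & 1:
            r = gf_mul(r, a)
        a = gf_mul(a, a)
        e >>= 1
    return r

def tr(x):
    """Absolute trace Tr(x) = x + x^2 + x^4 + x^8, valued in {0, 1}."""
    return x ^ gf_pow(x, 2) ^ gf_pow(x, 4) ^ gf_pow(x, 8)

def derivative_trace_values(lam, a):
    """Values taken by x -> Tr(lam * (F(x + a) + F(x))) for F(x) = x^3.
    A singleton set means a is a linear structure of that component."""
    return {tr(gf_mul(lam, gf_pow(x ^ a, 3) ^ gf_pow(x, 3)))
            for x in range(1 << N)}
```

Since the multiplicative group of $\mathbb F_{2^4}$ has order 15, $a^{-3} = a^{12}$, which the test below uses.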
We consider the sequential allocation of $m$ balls (jobs) into $n$ bins (servers) by allowing each ball to choose from some bins sampled uniformly at random. The goal is to maintain a small gap between the maximum load and the average load. In this paper, we present a general framework that allows us to analyze various allocation processes that slightly prefer allocating into underloaded, as opposed to overloaded bins. Our analysis covers several natural instances of processes, including: The Caching process (a.k.a. memory protocol) as studied by Mitzenmacher, Prabhakar and Shah (2002): At each round we only take one bin sample, but we also have access to a cache in which the most recently used bin is stored. We place the ball into the least loaded of the two. The Packing process: At each round we only take one bin sample. If the load is below some threshold (e.g., the average load), then we place balls until the threshold is reached; otherwise, we place only one ball. The Twinning process: At each round, we only take one bin sample. If the load is below some threshold, then we place two balls; otherwise, we place only one ball. The Thinning process as recently studied by Feldheim and Gurel-Gurevich (2021): At each round, we first take one bin sample. If its load is below some threshold, we place one ball; otherwise, we place one ball into a $\textit{second}$ bin sample. As we demonstrate, our general framework implies for all these processes a gap of $\mathcal{O}(\log n)$ between the maximum load and average load, even when an arbitrary number of balls $m \geq n$ are allocated (heavily loaded case). Our analysis is inspired by previous work of Peres, Talwar and Wieder (2010) for the $(1+\beta)$-process; here, however, we rely on the interplay between different potential functions to prove stabilization.
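The Twinning process described above is simple enough to simulate directly. The sketch below (a toy illustration, with the threshold taken to be the current average load, one plausible choice) places two balls when the sampled bin is underloaded and one otherwise, and reports the resulting gap.

```python
import random

def twinning(n, m, seed=0):
    """Twinning process: one uniform bin sample per round; place two balls
    if the sampled bin is below the current average load, else one ball."""
    rng = random.Random(seed)
    loads = [0] * n
    placed = 0
    while placed < m:
        i = rng.randrange(n)
        if loads[i] < placed / n and placed + 2 <= m:
            loads[i] += 2
            placed += 2
        else:
            loads[i] += 1
            placed += 1
    return loads

def gap(loads, m):
    """Gap between the maximum load and the average load m / n."""
    return max(loads) - m / len(loads)
```

Running this for, say, $n = 50$ bins and $m = 5000$ balls gives a small gap in practice, consistent with (though of course not proving) the $\mathcal{O}(\log n)$ bound.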
Node classification is the task of predicting the labels of unlabeled nodes in a graph. State-of-the-art methods based on graph neural networks achieve excellent performance when all labels are available during training. But in real life, models are often applied on data with new classes, which can lead to massive misclassification and thus significantly degrade performance. Hence, developing open-set classification methods is crucial to determine if a given sample belongs to a known class. Existing methods for open-set node classification generally use transductive learning with part or all of the features of real unseen class nodes to help with open-set classification. In this paper, we propose a novel generative open-set node classification method, i.e. $\mathcal{G}^2Pxy$, which follows a stricter inductive learning setting where no information about unknown classes is available during training and validation. Two kinds of proxy unknown nodes, inter-class unknown proxies and external unknown proxies, are generated via mixup to efficiently anticipate the distribution of novel classes. Using the generated proxies, a closed-set classifier can be transformed into an open-set one, by augmenting it with an extra proxy classifier. Under the constraints of both cross entropy loss and complement entropy loss, $\mathcal{G}^2Pxy$ achieves superior effectiveness for unknown class detection and known class classification, which is validated by experiments on benchmark graph datasets. Moreover, $\mathcal{G}^2Pxy$ has no specific requirements on the GNN architecture and shows good generalization.
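The mixup idea behind proxy generation can be sketched as follows. This is a minimal illustration, not the paper's exact scheme: the function names, the interpolation-weight range, and the sampling loop are our assumptions. Inter-class proxies arise from convex combinations of embeddings of nodes with different known labels.

```python
import random

def mixup_proxy(x_i, x_j, lam):
    """Convex combination of two node embeddings; with x_i, x_j drawn from
    different known classes this yields an inter-class unknown proxy."""
    return [lam * a + (1 - lam) * b for a, b in zip(x_i, x_j)]

def inter_class_proxies(embeddings, labels, k, seed=0):
    """Sample k proxies by mixing embeddings of nodes with different labels."""
    rng = random.Random(seed)
    idx = list(range(len(embeddings)))
    proxies = []
    while len(proxies) < k:
        i, j = rng.sample(idx, 2)
        if labels[i] != labels[j]:
            proxies.append(mixup_proxy(embeddings[i], embeddings[j],
                                       rng.uniform(0.3, 0.7)))
    return proxies
```

The generated proxies live between known-class clusters in embedding space, which is where novel-class samples are anticipated to fall.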
This paper describes a purely functional library for computing level-$p$-complexity of Boolean functions, and applies it to two-level iterated majority. Boolean functions are simply functions from $n$ bits to one bit, and they can describe digital circuits, voting systems, etc. An example of a Boolean function is majority, which returns the value that has majority among the $n$ input bits for odd $n$. The complexity of a Boolean function $f$ measures the cost of evaluating it: how many bits of the input are needed to be certain about the result of $f$. There are many competing complexity measures but we focus on level-$p$-complexity -- a function of the probability $p$ that a bit is 1. The level-$p$-complexity $D_p(f)$ is the minimum expected cost when the input bits are independent and identically distributed with Bernoulli($p$) distribution. We specify the problem as choosing the minimum expected cost of all possible decision trees -- which directly translates to a clearly correct, but very inefficient implementation. The library uses thinning and memoization for efficiency and type classes for separation of concerns. The complexity is represented using (sets of) polynomials, and the order relation used for thinning is implemented using polynomial factorisation and root-counting. Finally we compute the complexity for two-level iterated majority and improve on an earlier result by J.~Jansson.
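The definition of level-$p$-complexity can be made concrete on the smallest interesting case, 3-bit majority. The sketch below (in Python rather than the paper's Haskell, purely for illustration) enumerates all inputs under the i.i.d. Bernoulli($p$) distribution and computes the expected cost of the natural adaptive decision tree: read two bits; if they agree the majority is decided, otherwise read the third.

```python
from itertools import product

def expected_cost_maj3(p):
    """Expected number of queried bits for 3-bit majority under the
    adaptive strategy: read bits 0 and 1; if they agree the majority is
    decided (cost 2), otherwise bit 2 must also be read (cost 3)."""
    total = 0.0
    for bits in product((0, 1), repeat=3):
        prob = 1.0
        for b in bits:
            prob *= p if b else 1 - p
        total += prob * (2 if bits[0] == bits[1] else 3)
    return total
```

The enumeration reproduces the polynomial $2 + 2p(1-p)$, the expected cost of this particular tree; minimizing such polynomials over all decision trees is exactly the computation the library performs.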
An $n$-vertex $m$-edge graph is \emph{$k$-vertex connected} if it cannot be disconnected by deleting fewer than $k$ vertices. After more than half a century of intensive research, the result by [Li et al. STOC'21] finally gave a \emph{randomized} algorithm for checking $k$-connectivity in near-optimal $\widehat{O}(m)$ time. (We use $\widehat{O}(\cdot)$ to hide an $n^{o(1)}$ factor.) Deterministic algorithms, unfortunately, have remained much slower even if we assume a linear-time max-flow algorithm: they either require at least $\Omega(mn)$ time [Even'75; Henzinger, Rao and Gabow, FOCS'96; Gabow, FOCS'00] or assume that $k=o(\sqrt{\log n})$ [Saranurak and Yingchareonthawornchai, FOCS'22]. We show a \emph{deterministic} algorithm for checking $k$-vertex connectivity in time proportional to making $\widehat{O}(k^{2})$ max-flow calls, and, hence, in $\widehat{O}(mk^{2})$ time using the deterministic max-flow algorithm by [Brand et al. FOCS'23]. Our algorithm gives the first almost-linear-time bound for all $k$ where $\sqrt{\log n}\le k\le n^{o(1)}$ and subsumes, up to a subpolynomial factor, the long-standing state-of-the-art algorithm by [Even'75] which requires $O(n+k^{2})$ max-flow calls. Our key technique is a deterministic algorithm for terminal reduction for vertex connectivity: given a terminal set separated by a vertex mincut, output either a vertex mincut or a smaller terminal set that remains separated by a vertex mincut. We also show a deterministic $(1+\epsilon)$-approximation algorithm for vertex connectivity that makes $O(n/\epsilon^2)$ max-flow calls, improving the bound of $O(n^{1.5})$ max-flow calls in the exact algorithm of [Gabow, FOCS'00]. The technique is based on Ramanujan graphs.
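The definition of $k$-vertex connectivity can be checked directly on tiny graphs by exhausting all candidate vertex cuts. The brute-force sketch below illustrates the definition only; it is exponential in $n$ and bears no relation to the max-flow-based algorithms discussed in the abstract.

```python
from itertools import combinations

def is_connected(adj, removed):
    """DFS connectivity check on adj (dict: vertex -> neighbor list)
    after deleting the vertices in `removed`."""
    nodes = [v for v in adj if v not in removed]
    if not nodes:
        return True
    seen, stack = {nodes[0]}, [nodes[0]]
    while stack:
        u = stack.pop()
        for w in adj[u]:
            if w not in removed and w not in seen:
                seen.add(w)
                stack.append(w)
    return len(seen) == len(nodes)

def vertex_connectivity(adj):
    """Smallest k such that deleting some k vertices disconnects the graph
    (n - 1 for complete graphs, by the usual convention)."""
    n = len(adj)
    for k in range(n - 1):
        for cut in combinations(adj, k):
            if not is_connected(adj, set(cut)):
                return k
    return n - 1
```

On a 5-cycle this returns 2 (removing two non-adjacent vertices disconnects it), and on $K_4$ it returns 3.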
Metric spaces $(X, d)$ are ubiquitous objects in mathematics and computer science that allow for capturing (pairwise) distance relationships $d(x, y)$ between points $x, y \in X$. Because of this, it is natural to ask what useful generalizations there are of metric spaces for capturing "$k$-wise distance relationships" $d(x_1, \ldots, x_k)$ among points $x_1, \ldots, x_k \in X$ for $k > 2$. To that end, G\"{a}hler (Math. Nachr., 1963) (and perhaps others even earlier) defined $k$-metric spaces, which generalize metric spaces, and most notably generalize the triangle inequality $d(x_1, x_2) \leq d(x_1, y) + d(y, x_2)$ to the "simplex inequality" $d(x_1, \ldots, x_k) \leq \sum_{i=1}^k d(x_1, \ldots, x_{i-1}, y, x_{i+1}, \ldots, x_k)$. (The definition holds for any fixed $k \geq 2$, and a $2$-metric space is just a (standard) metric space.) In this work, we introduce strong $k$-metric spaces, $k$-metric spaces that satisfy a topological condition stronger than the simplex inequality, which makes them "behave nicely." We also introduce coboundary $k$-metrics, which generalize $\ell_p$ metrics (and in fact all finite metric spaces induced by norms) and minimum bounding chain $k$-metrics, which generalize shortest path metrics (and capture all strong $k$-metrics). Using these definitions, we prove analogs of a number of fundamental results about embedding finite metric spaces including Fr\'{e}chet embedding (isometric embedding into $\ell_{\infty}$) and isometric embedding of all tree metrics into $\ell_1$. We also study relationships between families of (strong) $k$-metrics, and show that natural quantities, like simplex volume, are strong $k$-metrics.
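The simplex inequality for $k = 3$ can be illustrated with triangle area as the 3-point "distance" (the simplex-volume example from the abstract). The check below (a numeric illustration, not part of the paper's development) verifies $d(x_1,x_2,x_3) \leq d(y,x_2,x_3) + d(x_1,y,x_3) + d(x_1,x_2,y)$; intuitively, the three signed areas on the right sum to the left-hand side, with equality when $y$ lies inside the triangle.

```python
def area(p, q, r):
    """Triangle area via the cross product: a natural 3-point 'distance'."""
    return abs((q[0] - p[0]) * (r[1] - p[1])
               - (r[0] - p[0]) * (q[1] - p[1])) / 2.0

def simplex_inequality_holds(x1, x2, x3, y, tol=1e-9):
    """Check d(x1,x2,x3) <= d(y,x2,x3) + d(x1,y,x3) + d(x1,x2,y)."""
    return area(x1, x2, x3) <= (area(y, x2, x3) + area(x1, y, x3)
                                + area(x1, x2, y) + tol)
```

For an interior point $y$ the three sub-triangle areas partition the original triangle, so the inequality holds with equality; for exterior $y$ the sum strictly exceeds it.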
Dialogue relation extraction (DRE), which identifies the relations between argument pairs in dialogue text, suffers much from the frequent occurrence of personal pronouns, or entity and speaker coreference. This work introduces a new benchmark dataset, DialogRE^C+, introducing coreference resolution into the DRE scenario. With the aid of high-quality coreference knowledge, the reasoning of argument relations is expected to be enhanced. In the DialogRE^C+ dataset, we manually annotate a total of 5,068 coreference chains over 36,369 argument mentions based on the existing DialogRE data, where four different coreference chain types, namely speaker, person, location and organization chains, are explicitly marked. We further develop 4 coreference-enhanced graph-based DRE models, which learn effective coreference representations for improving the DRE task. We also train a coreference resolution model based on our annotations and evaluate the effect of automatically extracted coreference chains, demonstrating the practicality of our dataset and its potential for other domains and tasks.
This paper addresses the challenges of real-time, large-scale, and near-optimal multi-agent pathfinding (MAPF) through enhancements to the recently proposed LaCAM* algorithm. LaCAM* is a scalable search-based algorithm that guarantees the eventual finding of optimal solutions for cumulative transition costs. While it has demonstrated remarkable planning success rates, surpassing various state-of-the-art MAPF methods, its initial solution quality is far from optimal, and its convergence speed to the optimum is slow. To overcome these limitations, this paper introduces several improvement techniques, partly drawing inspiration from other MAPF methods. We provide empirical evidence that the fusion of these techniques significantly improves the solution quality of LaCAM*, thus further pushing the boundaries of MAPF algorithms.
Simile detection is a valuable task for many natural language processing (NLP)-based applications, particularly in the field of literature. However, existing research on simile detection often relies on corpora that are limited in size and do not adequately represent the full range of simile forms. To address this issue, we propose a simile data augmentation method based on \textbf{W}ord replacement \textbf{A}nd \textbf{S}entence completion using the GPT-2 language model. Our iterative process, called I-WAS, is designed to improve the quality of the augmented sentences. To better evaluate the performance of our method in real-world applications, we have compiled a corpus containing a more diverse set of simile forms for experimentation. Our experimental results demonstrate the effectiveness of our proposed data augmentation method for simile detection.
Most existing works in visual question answering (VQA) are dedicated to improving the accuracy of predicted answers, while disregarding the explanations. We argue that the explanation for an answer is as important as, or even more important than, the answer itself, since it makes the question answering process more understandable and traceable. To this end, we propose a new task of VQA-E (VQA with Explanation), where the computational models are required to generate an explanation with the predicted answer. We first construct a new dataset, and then frame the VQA-E problem in a multi-task learning architecture. Our VQA-E dataset is automatically derived from the VQA v2 dataset by intelligently exploiting the available captions. We have conducted a user study to validate the quality of explanations synthesized by our method. We quantitatively show that the additional supervision from explanations can not only produce insightful textual sentences to justify the answers, but also improve the performance of answer prediction. Our model outperforms the state-of-the-art methods by a clear margin on the VQA v2 dataset.