In this paper, for any fixed integer $q>2$, we construct $q$-ary codes correcting a burst of at most $t$ deletions with redundancy $\log n+8\log\log n+o(\log\log n)+\gamma_{q,t}$ bits and near-linear encoding/decoding complexity, where $n$ is the message length and $\gamma_{q,t}$ is a constant that depends only on $q$ and $t$. Previous constructions of such codes have redundancy $\log n+O(\log q\log\log n)$ bits or $\log n+O(t^2\log\log n)+O(t\log q)$ bits. In our new construction, the coefficient of the second term of the redundancy is independent of $q$ and $t$.
Let $G$ be a simple graph with adjacency matrix $A(G)$, signless Laplacian matrix $Q(G)$, and degree diagonal matrix $D(G)$, and let $l(G)$ be the line graph of $G$. In 2017, Nikiforov defined the $A_\alpha$-matrix of $G$, $A_\alpha(G)$, as a convex combination of $A(G)$ and $D(G)$, as follows: $A_\alpha(G):=\alpha A(G)+(1-\alpha)D(G)$, where $\alpha\in[0,1]$. In this paper, we present some bounds for the eigenvalues of $A_\alpha(G)$ and for the largest and smallest eigenvalues of $A_\alpha(l(G))$. Extremal graphs attaining some of these bounds are characterized.
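As a small illustration of the definition above, the following sketch builds $A_\alpha$ for a graph and for its line graph and computes their eigenvalues numerically. It uses the combination exactly as written in the abstract; the helper names `a_alpha` and `line_graph_adjacency` and the path example are assumptions made for illustration and are not taken from the paper.

```python
import numpy as np

def a_alpha(adj, alpha):
    """Return alpha*A + (1-alpha)*D, the combination as written in the abstract."""
    adj = np.asarray(adj, dtype=float)
    deg = np.diag(adj.sum(axis=1))  # degree diagonal matrix D(G)
    return alpha * adj + (1.0 - alpha) * deg

def line_graph_adjacency(edges):
    """Adjacency matrix of l(G): two edges of G are adjacent if they share a vertex."""
    m = len(edges)
    adj = np.zeros((m, m))
    for i in range(m):
        for j in range(i + 1, m):
            if set(edges[i]) & set(edges[j]):
                adj[i, j] = adj[j, i] = 1.0
    return adj

# Hypothetical example: the path on three vertices 0 - 1 - 2.
edges = [(0, 1), (1, 2)]
A = np.zeros((3, 3))
for u, v in edges:
    A[u, v] = A[v, u] = 1.0

eig_G = np.linalg.eigvalsh(a_alpha(A, 0.5))  # eigenvalues in ascending order
eig_LG = np.linalg.eigvalsh(a_alpha(line_graph_adjacency(edges), 0.5))
print(eig_G, eig_LG)
```

For $\alpha=1/2$ the matrix equals $Q(G)/2$, so the computed spectrum can be checked against the signless Laplacian spectrum of the same graph.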
On current computer architectures, GMRES' performance can be limited by its communication cost to generate orthonormal basis vectors of the Krylov subspace. To address this performance bottleneck, its $s$-step variant orthogonalizes a block of $s$ basis vectors at a time, potentially reducing the communication cost by a factor of $s$. Unfortunately, for a large step size $s$, the solver can generate extremely ill-conditioned basis vectors, and to maintain stability in practice, a conservatively small step size is used, which limits the performance of the $s$-step solver. To enhance the performance using a small step size, in this paper, we introduce a two-stage block orthogonalization scheme. Similar to the original scheme, the first stage of the proposed method operates on a block of $s$ basis vectors at a time, but its objective is to maintain the well-conditioning of the generated basis vectors with a lower cost. The orthogonalization of the basis vectors is delayed until the second stage when enough basis vectors are generated to obtain higher performance. Our analysis shows the stability of the proposed two-stage scheme. The performance is improved because while the same amount of computation as the original scheme is required, most of the communication is done at the second stage of the proposed scheme, reducing the overall communication requirements. Our performance results with up to 192 NVIDIA V100 GPUs on the Summit supercomputer demonstrate that when solving a 2D Laplace problem, the two-stage approach can reduce the orthogonalization time and the total time-to-solution by the respective factors of up to $2.6\times$ and $1.6\times$ over the original $s$-step GMRES, which had already obtained the respective speedups of $2.1\times$ and $1.8\times$ over the standard GMRES. Similar speedups were obtained for 3D problems and for matrices from the SuiteSparse Matrix Collection.
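The ill-conditioning that motivates a small step size can be observed numerically. The sketch below is not the two-stage scheme of the abstract above; it only generates an unorthogonalized Krylov basis and reports its condition number as $s$ grows. The 1D Laplace test matrix, the monomial basis, the problem size, and all function names are assumptions chosen for illustration.

```python
import numpy as np

def laplace_1d(n):
    """Tridiagonal 1D Laplace matrix (2 on the diagonal, -1 off the diagonal)."""
    return 2.0 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)

def monomial_basis(A, v, s):
    """Columns v, Av, ..., A^s v, each scaled to unit norm, with no orthogonalization."""
    cols = [v / np.linalg.norm(v)]
    for _ in range(s):
        w = A @ cols[-1]
        cols.append(w / np.linalg.norm(w))
    return np.column_stack(cols)

n = 200
A = laplace_1d(n)
v = np.random.default_rng(0).standard_normal(n)

# Condition number of the s-step basis for several step sizes.
conds = {s: np.linalg.cond(monomial_basis(A, v, s)) for s in (2, 5, 10, 20)}
for s, c in conds.items():
    print(f"s = {s:2d}  cond = {c:.3e}")
```

Because each basis is a column subset of the next larger one, the condition number cannot decrease as $s$ grows, which is the effect a block orthogonalization scheme has to control.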
Analytic combinatorics in several variables refers to a suite of tools that provide sharp asymptotic estimates for certain combinatorial quantities. In this paper, we apply these tools to determine the Gilbert--Varshamov lower bound on the rate of optimal codes in the $L_1$ metric. Several different code spaces are analyzed, including the simplex and the hypercube in $\mathbb{Z}^n$, all of which are inspired by concrete data storage and transmission models such as the sticky insertion channel, the permutation channel, the adjacent transposition (bit-shift) channel, the multilevel flash memory channel, etc.
We consider online algorithms for the $k$-server problem on trees of size $n$. Chrobak and Larmore proposed a $k$-competitive algorithm for this problem that has the optimal competitive ratio. However, the existing implementations have $O\left(k^2 + k\cdot \log n\right)$ or $O\left(k(\log n)^2\right)$ time complexity for processing a query, where $n$ is the number of nodes. We propose a new time-efficient implementation of this algorithm that has $O(n)$ time complexity for preprocessing and $O\left(k\log k\right)$ time for processing a query. The new algorithm is faster than both existing algorithms and the time complexity for query processing does not depend on the tree size.
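For intuition about the algorithm discussed above, the sketch below implements the well-known Double Coverage rule in the special case where the tree is a path, that is, servers on a line. It is a naive illustration under that assumption, not the tree algorithm and not the time-efficient implementation proposed in the abstract; the function name and the example positions are hypothetical.

```python
def double_coverage_line(servers, request):
    """Serve `request` with the Double Coverage rule on the real line.

    Returns (sorted server positions after the move, total distance moved).
    """
    pos = sorted(servers)
    if request in pos:
        return pos, 0.0
    if request < pos[0]:
        # Request is left of all servers: only the leftmost server moves.
        cost = pos[0] - request
        pos[0] = request
        return pos, float(cost)
    if request > pos[-1]:
        # Request is right of all servers: only the rightmost server moves.
        cost = request - pos[-1]
        pos[-1] = request
        return pos, float(cost)
    # Request lies strictly between two adjacent servers: both move toward it
    # at the same speed until the closer one arrives.
    i = max(j for j, p in enumerate(pos) if p < request)
    left, right = pos[i], pos[i + 1]
    d = min(request - left, right - request)
    pos[i] = left + d
    pos[i + 1] = right - d
    return pos, float(2 * d)

positions, cost = double_coverage_line([0, 10], 3)
print(positions, cost)
```

Sorting on every request keeps the sketch short; it is not meant to reflect the query-time bounds quoted in the abstract.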
We introduce the $k$-Plane Insertion into Plane drawing ($k$-PIP) problem: given a plane drawing of a planar graph $G$ and a set of edges $F$, insert the edges in $F$ into the drawing such that the resulting drawing is $k$-plane. In this paper, we focus on the $1$-PIP scenario. We present a linear-time algorithm for the case that $G$ is a triangulation, while proving NP-completeness for the case that $G$ is biconnected and $F$ forms a path or a matching.
In the Tricolored Euclidean Traveling Salesperson problem, we are given~$k=3$ sets of points in the plane and are looking for disjoint tours, each covering one of the sets. Arora (1998) famously gave a PTAS based on ``patching'' for the case $k=1$ and, recently, Dross et al.~(2023) generalized this result to~$k=2$. Our contribution is a $(5/3+\epsilon)$-approximation algorithm for~$k=3$ that further generalizes Arora's approach. It is believed that patching is generally no longer possible for more than two tours. We circumvent this issue by either applying a conditional patching scheme for three tours or using an alternative approach based on a weighted solution for $k=2$.
The remarkable capability of large language models (LLMs) for in-context learning (ICL) needs to be activated by demonstration examples. Prior work has extensively explored the selection of examples for ICL, predominantly following the "select then organize" paradigm; such approaches often neglect the internal relationships between examples and suffer from an inconsistency between training and inference. In this paper, we formulate the problem as a $\textit{se}$quential $\textit{se}$lection problem and introduce $\texttt{Se}^2$, a sequential-aware method that leverages the LLM's feedback on varying context, aiding in capturing inter-relationships and sequential information among examples, significantly enriching the contextuality and relevance of ICL prompts. Meanwhile, we utilize beam search to seek and construct example sequences, enhancing both quality and diversity. Extensive experiments across 23 NLP tasks from 8 distinct categories illustrate that $\texttt{Se}^2$ markedly surpasses competitive baselines and achieves a 42% relative improvement over random selection. Further in-depth analysis shows the effectiveness of the proposed strategies, highlighting $\texttt{Se}^2$'s exceptional stability and adaptability across various scenarios. Our code will be released to facilitate future research.
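The idea of building example sequences with beam search can be sketched generically as follows. This is not the released $\texttt{Se}^2$ code: the function `beam_search_examples`, its parameters, and the toy scoring function are hypothetical, and `toy_score` merely stands in for whatever feedback signal a model would provide for appending a candidate to a given prefix.

```python
def beam_search_examples(candidates, score, length, beam_width):
    """Build ordered example sequences of the given length with beam search.

    `score(prefix, candidate)` is an assumed callable returning the gain of
    appending `candidate` to the tuple `prefix`; higher is better.
    Returns a list of (sequence, total_score) pairs, best first.
    """
    beams = [((), 0.0)]
    for _ in range(length):
        expanded = []
        for seq, total in beams:
            for cand in candidates:
                if cand in seq:  # do not reuse an example within one sequence
                    continue
                expanded.append((seq + (cand,), total + score(seq, cand)))
        if not expanded:
            break
        expanded.sort(key=lambda item: item[1], reverse=True)
        beams = expanded[:beam_width]  # keep only the best partial sequences
    return beams

def toy_score(prefix, cand):
    """Hypothetical stand-in for model feedback: rewards longer examples and
    gives a bonus when a longer example follows a shorter one."""
    bonus = 0.5 if prefix and len(prefix[-1]) < len(cand) else 0.0
    return float(len(cand)) + bonus

beams = beam_search_examples(["a", "bb", "ccc"], toy_score, length=2, beam_width=2)
print(beams)
```

Because the score depends on the prefix, the order of examples matters: in the toy run a sequence that starts with the second-best single example ends up ranked first, which a purely greedy choice would miss.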
The $k$-Maximum Inner Product Search ($k$MIPS) serves as a foundational component in recommender systems and various data mining tasks. However, while most existing $k$MIPS approaches prioritize the efficient retrieval of highly relevant items for users, they often neglect an equally pivotal facet of search results: \emph{diversity}. To bridge this gap, we revisit and refine the diversity-aware $k$MIPS (D$k$MIPS) problem by incorporating two well-known diversity objectives -- minimizing the average and maximum pairwise item similarities within the results -- into the original relevance objective. This enhancement, inspired by Maximal Marginal Relevance (MMR), offers users a controllable trade-off between relevance and diversity. We introduce \textsc{Greedy} and \textsc{DualGreedy}, two linear scan-based algorithms tailored for D$k$MIPS. They both achieve data-dependent approximations and, when aiming to minimize the average pairwise similarity, \textsc{DualGreedy} attains an approximation ratio of $1/4$ with an additive term for regularization. To further improve query efficiency, we integrate a lightweight Ball-Cone Tree (BC-Tree) index with the two algorithms. Finally, comprehensive experiments on ten real-world data sets demonstrate the efficacy of our proposed methods, showcasing their capability to efficiently deliver diverse and relevant search results to users.
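The relevance and diversity trade-off described above can be illustrated with an MMR-style greedy linear scan. The sketch below is an assumption-laden illustration, not the paper's \textsc{Greedy} or \textsc{DualGreedy}: the marginal-gain formula, the trade-off parameter `lam`, and the function name are hypothetical, and diversity is measured by the average inner product with the items already selected.

```python
import numpy as np

def greedy_diverse_mips(items, query, k, lam=0.5):
    """Pick k item indices, trading relevance against average pairwise similarity.

    At each step the item maximizing
        lam * <query, item> - (1 - lam) * mean(<item, s> for s already selected)
    is added. lam = 1 reduces to plain top-k inner product search.
    """
    items = np.asarray(items, dtype=float)
    relevance = items @ np.asarray(query, dtype=float)
    selected = []
    sim_sum = np.zeros(len(items))  # sum of inner products with selected items
    for _ in range(min(k, len(items))):
        penalty = sim_sum / len(selected) if selected else sim_sum
        gain = lam * relevance - (1.0 - lam) * penalty
        if selected:
            gain[selected] = -np.inf  # never pick the same item twice
        best = int(np.argmax(gain))
        selected.append(best)
        sim_sum += items @ items[best]
    return selected

items = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0]]
query = [1.0, 0.2]
print(greedy_diverse_mips(items, query, k=2, lam=1.0))  # relevance only
print(greedy_diverse_mips(items, query, k=2, lam=0.5))  # relevance and diversity
```

In the toy data the second item is nearly a duplicate of the first, so lowering `lam` replaces it with the less relevant but dissimilar third item.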
We present a KE-tableau-based procedure for the main TBox and ABox reasoning tasks for the description logic $\mathcal{DL}\langle \mathsf{4LQS^{R,\!\times}}\rangle(\mathbf{D})$, in short $\mathcal{DL}_{\mathbf{D}}^{4,\!\times}$. The logic $\mathcal{DL}_{\mathbf{D}}^{4,\!\times}$, representable in the decidable multi-sorted quantified set-theoretic fragment $\mathsf{4LQS^R}$, combines the high scalability and efficiency of rule languages such as the Semantic Web Rule Language (SWRL) with the expressivity of description logics. Our algorithm is based on a variant of the KE-tableau system for sets of universally quantified clauses, where the KE-elimination rule is generalized in such a way as to incorporate the $\gamma$-rule. The novel system, called KE$^\gamma$-tableau, turns out to be an improvement of the system introduced in \cite{RR2017} and of standard first-order KE-tableau \cite{dagostino94}. Suitable benchmark test sets executed on C++ implementations of the three mentioned systems show that the performance of the KE$^\gamma$-tableau-based reasoner is often up to about 400% better than that of the other two systems. This is a first step towards the construction of efficient reasoners for expressive OWL ontologies based on fragments of computable set theory.
We present a KE-tableau-based implementation of a reasoner for a decidable fragment of (stratified) set theory expressing the description logic $\mathcal{DL}\langle \mathsf{4LQS^{R,\!\times}}\rangle(\mathbf{D})$ ($\mathcal{DL}_{\mathbf{D}}^{4,\!\times}$, for short). Our application solves the main TBox and ABox reasoning problems for $\mathcal{DL}_{\mathbf{D}}^{4,\!\times}$. In particular, it solves the consistency problem for $\mathcal{DL}_{\mathbf{D}}^{4,\!\times}$-knowledge bases represented in set-theoretic terms, and a generalization of the \emph{Conjunctive Query Answering} problem in which conjunctive queries with variables of three sorts are admitted. The reasoner, which extends and optimizes a previous prototype for the consistency checking of $\mathcal{DL}_{\mathbf{D}}^{4,\!\times}$-knowledge bases (see \cite{cilc17}), is implemented in \textsf{C++}. It supports $\mathcal{DL}_{\mathbf{D}}^{4,\!\times}$-knowledge bases serialized in the OWL/XML format, and it also admits rules expressed in SWRL (Semantic Web Rule Language).