An average-case variant of the $k$-SUM conjecture asserts that finding $k$ numbers that sum to 0 in a list of $r$ random numbers, each of the order $r^k$, cannot be done in much less than $r^{\lceil k/2 \rceil}$ time. On the other hand, in the dense regime of parameters, where the list contains more numbers and many solutions exist, the complexity of finding one of them can be significantly improved by Wagner's $k$-tree algorithm. Such algorithms for $k$-SUM in the dense regime have many applications, notably in cryptanalysis. In this paper, assuming the average-case $k$-SUM conjecture, we prove that known algorithms are essentially optimal for $k= 3,4,5$. For $k>5$, we prove the optimality of the $k$-tree algorithm for a limited range of parameters. We also prove similar results for $k$-XOR, where the sum is replaced with exclusive or. Our results are obtained by a self-reduction that, given an instance of $k$-SUM which has a few solutions, produces from it many instances in the dense regime. We solve each of these instances using the dense $k$-SUM oracle, and hope that a solution to a dense instance also solves the original problem. We deal with potentially malicious oracles (that repeatedly output correlated useless solutions) by an obfuscation process that adds noise to the dense instances. Using discrete Fourier analysis, we show that the obfuscation eliminates correlations among the oracle's solutions, even though its inputs are highly correlated.
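For intuition about the dense-regime algorithms whose optimality is being proved, here is a minimal Python sketch of Wagner's $k$-tree algorithm for $k=4$ in the XOR setting: merge two pairs of lists on their low bits, then join the partial XORs (which already vanish on those bits) on the remaining bits. Function name and parameter choices are illustrative, not taken from the paper.

    import random
    from collections import defaultdict

    def four_xor_tree(L1, L2, L3, L4, ell):
        # Find x1 ^ x2 ^ x3 ^ x4 == 0 with one value from each list.
        mask = (1 << ell) - 1

        def merge(A, B):
            # pairs (a, b) whose XOR is zero on the low ell bits
            buckets = defaultdict(list)
            for a in A:
                buckets[a & mask].append(a)
            return [(a, b) for b in B for a in buckets[b & mask]]

        table = defaultdict(list)
        for a, b in merge(L1, L2):
            table[a ^ b].append((a, b))
        for c, d in merge(L3, L4):
            if table[c ^ d]:
                a, b = table[c ^ d][0]
                return a, b, c, d  # a ^ b ^ c ^ d == 0
        return None

    n, ell = 24, 8  # lists of size 2^(n/3), as in Wagner's analysis
    lists = [[random.getrandbits(n) for _ in range(1 << ell)] for _ in range(4)]
    print(four_xor_tree(*lists, ell))  # finds a solution with constant probability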
Goemans and Rothvoss (SODA'14) gave a framework for solving, in time $enc(P)^{2^{O(N)}}enc(Q)^{O(1)}$, problems that can be described as finding a point in $\text{int.cone}(P\cap\mathbb{Z}^N)\cap Q$, where $P,Q\subset\mathbb{R}^N$ are (bounded) polyhedra. This framework can be used to solve various scheduling problems, but the encoding length $enc(P)$ usually involves large parameters like the makespan. We describe three tools that improve the framework of Goemans and Rothvoss: problem-specific preprocessing, LP relaxation techniques, and a new bound on the number of vertices of the integer hull. In particular, applied to the classical scheduling problem $P||C_{\max}$, each of these tools improves the running time from $(\log(C_{\max}))^{2^{O(d)}} enc(I)^{O(1)}$ to the possibly much better $(\log(p_{\max}))^{2^{O(d)}}enc(I)^{O(1)}$. Here, $p_{\max}$ is the largest processing time, $d$ is the number of different processing times, $C_{\max}$ is the makespan, and $enc(I)$ is the encoding length of the instance. This running time is FPT w.r.t. the parameter $d$ if $p_{\max}$ is given in unary. We obtain similar results for various other problems. Moreover, we show how a balancing result by Govzmann et al. can be used to speed up an additive approximation scheme by Buchem et al. (ICALP'21) in the high-multiplicity setting. On the complexity side, we use reductions from the literature to provide new parameterized lower bounds for $P||C_{\max}$ and to show that the improved running time of the additive approximation algorithm is probably optimal. Finally, we show that the big open question posed by Mnich and van Bevern (Comput. Oper. Res. '18) of whether $P||C_{\max}$ is FPT w.r.t. the number of job types $d$ has the same answer as the question of whether $Q||C_{\max}$ is FPT w.r.t. the number of job and machine types $d+\tau$ (all in high-multiplicity encoding). The same holds for the objective $C_{\min}$.
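To make the $\text{int.cone}(P\cap\mathbb{Z}^N)\cap Q$ structure concrete for $P||C_{\max}$: a machine configuration is a multiplicity vector of job types whose total processing time is at most the makespan, and an instance is feasible iff its job-multiplicity vector is the sum of at most $m$ configurations. The following brute-force Python sketch (exponential time, illustrative names, in no way the Goemans-Rothvoss algorithm) checks exactly this.

    from functools import lru_cache
    from itertools import product

    def p_cmax_feasible(p, mult, m, T):
        # Jobs: mult[j] copies of processing time p[j].  Feasible iff mult
        # is the sum of at most m machine configurations, i.e. vectors c
        # with sum_j c[j]*p[j] <= T -- the int.cone(P cap Z^N) cap Q
        # structure, decided here by brute force.
        def configs(upper):
            for c in product(*(range(u + 1) for u in upper)):
                if any(c) and sum(ci * pi for ci, pi in zip(c, p)) <= T:
                    yield c

        @lru_cache(maxsize=None)
        def solve(remaining, machines):
            if not any(remaining):
                return True
            if machines == 0:
                return False
            return any(
                solve(tuple(r - ci for r, ci in zip(remaining, c)), machines - 1)
                for c in configs(remaining)
            )

        return solve(tuple(mult), m)

    # four jobs of length 3, two of length 5, one of length 7, on 3 machines:
    print(p_cmax_feasible([3, 5, 7], [4, 2, 1], 3, 10))  # True: 7+3 | 5+5 | 3+3+3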
We prove that Sherali-Adams with polynomially bounded coefficients requires proofs of size $n^{\Omega(d)}$ to rule out the existence of an $n^{\Theta(1)}$-clique in Erd\H{o}s-R\'{e}nyi random graphs whose maximum clique is of size $d\leq 2\log n$. This lower bound is tight up to the multiplicative constant in the exponent. We obtain this result by introducing a technique inspired by pseudo-calibration which may be of independent interest. The technique involves defining a measure on monomials that precisely captures the contribution of a monomial to a refutation. This measure intuitively captures progress and should have further applications in proof complexity.
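For concreteness, the claim being refuted is typically encoded by the following standard clique axioms over $\{0,1\}$-variables $x_v$ ($x_v=1$ meaning $v$ is in the clique); the paper's exact formulation may differ in details:
\begin{align*}
x_v^2 - x_v &= 0 && \text{for every vertex } v\in V,\\
x_u x_v &= 0 && \text{for every non-edge } \{u,v\}\notin E,\\
\sum_{v\in V} x_v &\ge k.
\end{align*}
A Sherali-Adams refutation derives the contradiction $-1\ge 0$ as a nonnegative combination of these axioms multiplied by products of the form $\prod_{i\in A}x_i\prod_{j\in B}(1-x_j)$; the coefficient bound in the theorem constrains the magnitudes appearing in this combination.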
Reinforcement Learning (RL), bolstered by the expressive capabilities of Deep Neural Networks (DNNs) for function approximation, has demonstrated considerable success in numerous applications. However, its practicality in addressing various real-world scenarios, characterized by diverse and unpredictable dynamics, noisy signals, and large state and action spaces, remains limited. This limitation stems from poor data efficiency, limited generalization capabilities, a lack of safety guarantees, and the absence of interpretability, among other factors. To overcome these challenges and improve performance across these crucial metrics, one promising avenue is to incorporate additional structural information about the problem into the learning process. Various sub-fields of RL have proposed methods for incorporating such inductive biases. We amalgamate these diverse methodologies under a unified framework, shedding light on the role of structure in the learning problem, and classify these methods into distinct patterns of incorporating structure. By leveraging this comprehensive framework, we provide valuable insights into the challenges of structured RL and lay the groundwork for a design-pattern perspective on RL research. This novel perspective paves the way for future advancements and aids in developing more effective and efficient RL algorithms that can better handle real-world scenarios.
Let $G=(V, E)$ be a graph and suppose that each vertex of $G$ has a lamp and a button. Each button is of $\sigma^+$-type or $\sigma$-type. Assume that initially some lamps are on and others are off. The button on vertex $x$ is of $\sigma^+$-type ($\sigma$-type, respectively) if pressing it changes the lamp states on $x$ and on its neighbors in $G$ (on the neighbors of $x$ only, respectively). Assume that there is a set $X\subseteq V$ such that pressing the buttons on the vertices of $X$ lights all lamps of $G$. In particular, such a set is known to exist when initially all lamps are off and all buttons are of $\sigma^+$-type. Finding such a set $X$ of the smallest size is NP-hard even if initially all lamps are off and all buttons are of $\sigma^+$-type. Using a linear-algebraic approach, we design a polynomial-time approximation algorithm for the problem such that the set $X$ constructed by the algorithm satisfies $|X|\le \min\{r,(|V|+{\rm opt})/2\},$ where $r$ is the rank of a (modified) adjacency matrix of $G$ and ${\rm opt}$ is the size of an optimal solution to the problem. To the best of our knowledge, this is the first polynomial-time approximation algorithm for the problem with a nontrivial approximation guarantee.
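The linear-algebraic core of such problems is a system over GF(2): row $v$ of the (modified) adjacency matrix records which buttons toggle lamp $v$. The Python sketch below (illustrative interface; it finds some solution, not the paper's approximation algorithm, which additionally controls $|X|$) performs Gauss-Jordan elimination with bitmask rows.

    def light_all_lamps(adj, sigma_plus, init_on):
        # Row v marks the buttons that toggle lamp v: every neighbour of v,
        # plus v itself when v's button is of sigma+-type.  We solve
        # M x = b over GF(2), where b marks the lamps that start off;
        # rows are bitmasks, with the b column stored in bit n.
        n = len(adj)
        rows = []
        for v in range(n):
            r = sum(1 << u for u in adj[v])
            if sigma_plus[v]:
                r |= 1 << v
            if not init_on[v]:
                r |= 1 << n
            rows.append(r)
        # Gauss-Jordan elimination over GF(2)
        pivot_cols, rank = [], 0
        for col in range(n):
            piv = next((i for i in range(rank, n) if (rows[i] >> col) & 1), None)
            if piv is None:
                continue
            rows[rank], rows[piv] = rows[piv], rows[rank]
            for i in range(n):
                if i != rank and (rows[i] >> col) & 1:
                    rows[i] ^= rows[rank]
            pivot_cols.append(col)
            rank += 1
        if any((rows[i] >> n) & 1 for i in range(rank, n)):
            return None  # inconsistent: no button set lights all lamps
        # free variables set to 0; pivot variables copy the b column
        return [pivot_cols[i] for i in range(rank) if (rows[i] >> n) & 1]

    # path on 3 vertices, all buttons of sigma+-type, all lamps initially off
    adj = [[1], [0, 2], [1]]
    print(light_all_lamps(adj, [True] * 3, [False] * 3))  # [1]: press the middle button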
An $\ell$-vertex-ranking of a graph $G$ is a colouring of the vertices of $G$ with integer colours so that in any connected subgraph $H$ of $G$ with diameter at most $\ell$, there is a vertex in $H$ whose colour is larger than that of every other vertex in $H$. The $\ell$-vertex-ranking number, $\chi_{\ell-\mathrm{vr}}(G)$, of $G$ is the minimum integer $k$ such that $G$ has an $\ell$-vertex-ranking using $k$ colours. We prove that, for any fixed $d$ and $\ell$, every $d$-degenerate $n$-vertex graph $G$ satisfies $\chi_{\ell-\mathrm{vr}}(G)= O(n^{1-2/(\ell+1)}\log n)$ if $\ell$ is even and $\chi_{\ell-\mathrm{vr}}(G)= O(n^{1-2/\ell}\log n)$ if $\ell$ is odd. The case $\ell=2$ resolves (up to the $\log n$ factor) an open problem posed by \citet{karpas.neiman.ea:on} and the cases $\ell\in\{2,3\}$ are asymptotically optimal (up to the $\log n$ factor).
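As a sanity check of the definition, the following brute-force Python verifier tests whether a given colouring is an $\ell$-vertex-ranking. Since deleting edges can only increase distances, it suffices to examine induced connected subgraphs of diameter at most $\ell$. Exponential time; names are illustrative.

    from itertools import combinations
    from collections import deque

    def bfs_dists(adj, S, src):
        # BFS distances from src within the subgraph induced on S
        dist = {src: 0}
        queue = deque([src])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v in S and v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        return dist

    def is_l_vertex_ranking(adj, colour, ell):
        n = len(adj)
        for k in range(2, n + 1):
            for S in map(frozenset, combinations(range(n), k)):
                dist = bfs_dists(adj, S, next(iter(S)))
                if len(dist) < len(S):
                    continue  # induced subgraph not connected
                if max(max(bfs_dists(adj, S, v).values()) for v in S) > ell:
                    continue  # diameter larger than ell
                top = max(colour[v] for v in S)
                if sum(1 for v in S if colour[v] == top) > 1:
                    return False  # no unique maximum colour
        return True

    # a path on 4 vertices: colours 1,2,1,3 form a 2-vertex-ranking
    adj = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
    print(is_l_vertex_ranking(adj, {0: 1, 1: 2, 2: 1, 3: 3}, 2))  # True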
MAG$\pi$ is a Multiparty, Asynchronous and Generalised $\pi$-calculus that introduces timeouts into session types as a means of reasoning about failure-prone communication. Its type system guarantees that all possible message loss is handled by timeout branches. In this work, we argue that this requirement is unnecessarily strict. We present MAG$\pi$!, an extension that serves as the first introduction of replication into Multiparty Session Types (MPST). Replication is a standard $\pi$-calculus construct used to model infinitely available servers. We lift this construct to the type level and show that it simplifies the specification of distributed client-server interactions. We prove the properties relevant to generalised MPST: subject reduction, session fidelity and process property verification.
In the $k$-Disjoint Shortest Paths ($k$-DSP) problem, we are given a weighted graph $G$ on $n$ nodes and $m$ edges with specified source vertices $s_1, \dots, s_k$ and target vertices $t_1, \dots, t_k$, and are tasked with determining if $G$ contains pairwise vertex-disjoint paths $P_1, \dots, P_k$, where each $P_i$ is a shortest $(s_i,t_i)$-path. For any constant $k$, it is known that $k$-DSP can be solved in polynomial time over undirected graphs and directed acyclic graphs (DAGs). However, the exact time complexity of $k$-DSP remains mysterious, with large gaps between the fastest known algorithms and the best conditional lower bounds. In this paper, we obtain faster algorithms for important cases of $k$-DSP, and present better conditional lower bounds for $k$-DSP and its variants. Previous work solved 2-DSP over weighted undirected graphs in $O(n^7)$ time, and over weighted DAGs in $O(mn)$ time. For the main result of this paper, we present linear-time algorithms for solving 2-DSP on weighted undirected graphs and DAGs. Our algorithms are algebraic, however, and so solve only the detection version of 2-DSP rather than the search version. For lower bounds, prior work implied that $k$-Clique can be reduced to $2k$-DSP in DAGs and undirected graphs with $O((kn)^2)$ nodes. We improve this reduction by showing how to reduce from $k$-Clique to $k$-DSP in DAGs and undirected graphs with $O((kn)^2)$ nodes. A variant of $k$-DSP is the $k$-Disjoint Paths ($k$-DP) problem, where the solution paths no longer need to be shortest paths. Previous work reduced from $k$-Clique to $p$-DP in DAGs with $O(kn)$ nodes, for $p = k + k(k-1)/2$. We improve this by showing a reduction from $k$-Clique to $p$-DP, for $p = k + \lfloor k^2/4\rfloor$. Under the $k$-Clique Hypothesis from fine-grained complexity, our results establish better conditional lower bounds for $k$-DSP for all $k \ge 4$, and better conditional lower bounds for $p$-DP for all $p \le 4031$.
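As a correctness reference for the detection problem (not the paper's algebraic algorithms), the following Python sketch enumerates all shortest paths over "tight" edges, i.e. edges $(u,v)$ with $d(s,u)+w(u,v)=d(s,v)$, and tests pairs for vertex-disjointness. It assumes positive edge weights (so the tight edges form a DAG) and takes exponential time in the worst case.

    import heapq

    def dijkstra(adj, s):
        # adj[u] = list of (v, w) pairs; assumes positive weights
        dist = {s: 0}
        heap = [(0, s)]
        while heap:
            d, u = heapq.heappop(heap)
            if d > dist.get(u, float("inf")):
                continue
            for v, w in adj[u]:
                if d + w < dist.get(v, float("inf")):
                    dist[v] = d + w
                    heapq.heappush(heap, (d + w, v))
        return dist

    def shortest_paths(adj, s, t):
        # enumerate every shortest s-t path by DFS over tight edges
        dist = dijkstra(adj, s)

        def dfs(u, path):
            if u == t:
                yield tuple(path)
                return
            for v, w in adj[u]:
                if dist.get(v) == dist[u] + w:  # edge lies on a shortest path
                    path.append(v)
                    yield from dfs(v, path)
                    path.pop()

        if t in dist:
            yield from dfs(s, [s])

    def two_dsp(adj, s1, t1, s2, t2):
        for P in shortest_paths(adj, s1, t1):
            for Q in shortest_paths(adj, s2, t2):
                if not set(P) & set(Q):
                    return P, Q  # vertex-disjoint pair of shortest paths
        return None

    # two unit-weight paths 0-1-2 and 3-4-5, joined by the edge {1,4}
    adj = {0: [(1, 1)], 1: [(0, 1), (2, 1), (4, 1)], 2: [(1, 1)],
           3: [(4, 1)], 4: [(3, 1), (5, 1), (1, 1)], 5: [(4, 1)]}
    print(two_dsp(adj, 0, 2, 3, 5))  # ((0, 1, 2), (3, 4, 5))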
We consider the distributed complexity of the (degree+1)-list coloring problem, in which each node $u$ of degree $d(u)$ is assigned a palette of $d(u)+1$ colors, and the goal is to find a proper coloring using these color palettes. The (degree+1)-list coloring problem is a natural generalization of the classical $(\Delta+1)$-coloring and $(\Delta+1)$-list coloring problems, both being benchmark problems extensively studied in distributed and parallel computing. In this paper we settle the complexity of the (degree+1)-list coloring problem in the Congested Clique model by showing that it can be solved deterministically in a constant number of rounds.
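Sequentially, the problem is easy, which is what makes the distributed round complexity the interesting question: when a vertex $u$ is coloured greedily, at most $d(u)$ of its $d(u)+1$ palette colours are blocked, so a free colour always exists. A minimal Python sketch (illustrative names):

    def greedy_list_colouring(adj, palette):
        # Each vertex u has |palette[u]| = d(u) + 1 colours, so at most d(u)
        # of them are blocked by already-coloured neighbours and a free
        # colour always exists, in any processing order.
        colour = {}
        for u in adj:
            blocked = {colour[v] for v in adj[u] if v in colour}
            colour[u] = next(c for c in palette[u] if c not in blocked)
        return colour

    adj = {0: [1, 2], 1: [0, 2], 2: [0, 1, 3], 3: [2]}
    palette = {0: [1, 2, 3], 1: [1, 2, 3], 2: [1, 2, 3, 4], 3: [1, 2]}
    print(greedy_list_colouring(adj, palette))  # {0: 1, 1: 2, 2: 3, 3: 1}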
Existing datasets for attribute value extraction (AVE) predominantly focus on explicit attribute values while neglecting implicit ones, lack product images, are often not publicly available, and lack in-depth human inspection across diverse domains. To address these limitations, we present ImplicitAVE, the first publicly available multimodal dataset for implicit attribute value extraction. ImplicitAVE, sourced from the MAVE dataset, is carefully curated and expanded to include implicit AVE and multimodality, resulting in a refined dataset of 68k training and 1.6k testing examples across five domains. We also explore the application of multimodal large language models (MLLMs) to implicit AVE, establishing a comprehensive benchmark for MLLMs on the ImplicitAVE dataset. Six recent MLLMs with eleven variants are evaluated across diverse settings, revealing that implicit value extraction remains a challenging task for MLLMs. The contributions of this work include the development and release of ImplicitAVE, and the exploration and benchmarking of various MLLMs for implicit AVE, providing valuable insights and potential future research directions. Dataset and code are available at https://github.com/HenryPengZou/ImplicitAVE
In prior work, Gupta et al. (SPAA 2022) presented a distributed algorithm for multiplying sparse $n \times n$ matrices, using $n$ computers. They assumed that the input matrices are uniformly sparse -- there are at most $d$ non-zeros in each row and column -- and the task is to compute a uniformly sparse part of the product matrix. Initially each computer knows one row of each input matrix, and eventually each computer needs to know one row of the product matrix. In each communication round each computer can send and receive one $O(\log n)$-bit message. Their algorithm solves this task in $O(d^{1.907})$ rounds, while the trivial bound is $O(d^2)$. We improve on the prior work in two dimensions: First, we show that we can solve the same task faster, in only $O(d^{1.832})$ rounds. Second, we explore what happens when matrices are not uniformly sparse. We consider the following alternative notions of sparsity: row-sparse matrices (at most $d$ non-zeros per row), column-sparse matrices, matrices with bounded degeneracy (we can recursively delete a row or column with at most $d$ non-zeros), average-sparse matrices (at most $dn$ non-zeros in total), and general matrices. We show that we can still compute $X = AB$ in $O(d^{1.832})$ rounds even if one of the three matrices ($A$, $B$, or $X$) is average-sparse instead of uniformly sparse. We present algorithms that handle a much broader range of sparsity in $O(d^2 + \log n)$ rounds, and present conditional hardness results that put limits on further improvements and generalizations.
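To see where the trivial $O(d^2)$ bound comes from, here is the sequential analogue in Python with rows stored sparsely: computing one row of $X = AB$ touches at most $d$ rows of $B$, each with at most $d$ entries. A sketch of the setting only, not the paper's $O(d^{1.832})$-round algorithm.

    def sparse_row_product(A, B):
        # Rows stored as {column: value} dicts.  With at most d non-zeros
        # per row of A and per row of B, building one row of X = AB touches
        # at most d rows of B with at most d entries each: O(d^2) work per
        # row, mirroring the trivial O(d^2)-round distributed algorithm.
        X = []
        for rowA in A:
            rowX = {}
            for k, a in rowA.items():
                for j, b in B[k].items():
                    rowX[j] = rowX.get(j, 0) + a * b
            X.append({j: v for j, v in rowX.items() if v != 0})
        return X

    A = [{0: 1, 2: 2}, {1: 3}]
    B = [{1: 4}, {0: 5}, {2: 6}]
    print(sparse_row_product(A, B))  # [{1: 4, 2: 12}, {0: 15}]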