In the MAXSAT problem, we are given a set $V$ of $m$ variables and a collection $C$ of $n$ clauses over $V$. We seek a truth assignment that maximizes the number of satisfied clauses. This problem is $\textit{NP}$-hard even for its restricted version, the 2-MAXSAT problem, in which every clause contains at most 2 literals. In this paper, we discuss an efficient algorithm to solve this problem. Its worst-case time complexity is bounded by $O((nm)^2(\log_2 nm)^{\log_2 nm})$. This shows that the 2-MAXSAT problem can be solved in polynomial time.
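As a minimal illustration of the problem statement (not the paper's algorithm), the following Python sketch counts how many 2-literal clauses a given truth assignment satisfies; the instance and the literal encoding are made up for the example.

```python
# Illustration only: evaluating a 2-MAXSAT objective for a fixed assignment.
# A literal is an integer: +i means x_i, -i means NOT x_i.
def satisfied_clauses(clauses, assignment):
    count = 0
    for clause in clauses:
        # a clause is satisfied if at least one of its (<= 2) literals is true
        if any(assignment[abs(lit)] == (lit > 0) for lit in clause):
            count += 1
    return count

clauses = [(1, 2), (-1, 2), (1, -2)]                   # (x1 v x2), (~x1 v x2), (x1 v ~x2)
print(satisfied_clauses(clauses, {1: True, 2: True}))  # -> 3
```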
In the committee selection problem, the goal is to choose a subset of size $k$ from a set of candidates $C$ that collectively gives the best representation to a set of voters. We consider this problem in Euclidean $d$-space, where each voter and each candidate is a point and voters' preferences are implicitly represented by Euclidean distances to candidates. We explore fault-tolerance in committee selection and study the following three variants: (1) given a committee and a set of $f$ failing candidates, find their optimal replacement; (2) compute the worst-case replacement score for a given committee under failure of $f$ candidates; and (3) design a committee with the best replacement score under worst-case failures. The score of a committee is determined using the well-known (min-max) Chamberlin-Courant rule: minimize the maximum distance between any voter and its closest candidate in the committee. Our main results include the following: (1) in one dimension, all three problems can be solved in polynomial time; (2) in dimension $d \geq 2$, all three problems are NP-hard; and (3) all three problems admit a constant-factor approximation in any fixed dimension, and the optimal committee problem has an FPT bicriterion approximation.
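The min-max Chamberlin-Courant score used above is simple to state in code; the sketch below (with illustrative data, not taken from the paper) computes it for a one-dimensional instance.

```python
import math

# Score of a committee under the (min-max) Chamberlin-Courant rule:
# the largest distance from any voter to its closest committee member.
def cc_score(voters, committee):
    return max(min(math.dist(v, c) for c in committee) for v in voters)

voters = [(0.0,), (2.0,), (5.0,)]      # three voters on the line
committee = [(1.0,), (5.0,)]           # a committee of size k = 2
print(cc_score(voters, committee))     # -> 1.0
```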
Rank-order coding, a form of temporal coding, has emerged as a promising scheme to explain the rapid processing abilities of the mammalian brain. Owing to its speed and efficiency, rank-order coding is gaining increasing interest in diverse research areas beyond neuroscience. However, much uncertainty still exists about the performance of rank-order coding under noise. Herein we show what information rates are fundamentally possible and what trade-offs are at stake. An unexpected finding in this paper is the emergence of a special class of errors that, in a certain regime, become more frequent as noise decreases.
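For readers unfamiliar with the scheme, the toy sketch below illustrates the generic idea of rank-order coding (a stimulus is represented only by the order in which channels fire); the activation values and the robustness behavior shown are illustrative assumptions, not the paper's model.

```python
import numpy as np

# Generic rank-order code: keep only the firing order, assumed here to be the
# descending order of activation strengths.
def rank_order_code(activations):
    return tuple(np.argsort(-np.asarray(activations)))

clean = rank_order_code([0.9, 0.1, 0.5, 0.3])       # -> (0, 2, 3, 1)
noisy = rank_order_code([0.85, 0.12, 0.55, 0.28])   # mild perturbation, same order
print(clean == noisy)                               # -> True
```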
We consider the maximum weight $b$-matching problem in the random-order semi-streaming model. Assuming all weights are small integers drawn from $[1,W]$, we present a $2 - \frac{1}{2W} + \varepsilon$ approximation algorithm, using a memory of $O(\max(|M_G|, n) \cdot \mathrm{poly}(\log(m),W,1/\varepsilon))$, where $|M_G|$ denotes the cardinality of the optimal matching. Our result generalizes that of Bernstein [Bernstein, 2015], which achieves a $3/2 + \varepsilon$ approximation for the maximum cardinality simple matching. When $W$ is small, our result also improves upon that of Gamlath et al. [Gamlath et al., 2019], which obtains a $2 - \delta$ approximation (for some small constant $\delta \sim 10^{-17}$) for the maximum weight simple matching. In particular, for the weighted $b$-matching problem, ours is the first result beating the approximation ratio of $2$. Our technique hinges on a generalized weighted version of edge-degree constrained subgraphs, originally developed by Bernstein and Stein [Bernstein and Stein, 2015]. Such a subgraph has bounded vertex degree (hence uses only a small number of edges) and can be easily computed. The fact that it contains a $2 - \frac{1}{2W} + \varepsilon$ approximation of the maximum weight matching is proved using the classical K\H{o}nig-Egerv\'ary duality theorem.
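As a reminder of the objective only (this is a definition check, not the streaming algorithm), a $b$-matching lets each vertex $v$ be incident to at most $b(v)$ selected edges, and its value is the total selected weight; the toy graph below is made up.

```python
from collections import Counter

# Check the b-matching constraint and report the weight of a chosen edge set.
def is_b_matching(edges, b):
    degree = Counter()
    for u, v, _w in edges:
        degree[u] += 1
        degree[v] += 1
    return all(degree[x] <= b[x] for x in degree)

chosen = [("a", "b", 3), ("a", "c", 2)]
print(is_b_matching(chosen, {"a": 2, "b": 1, "c": 1}))  # -> True
print(sum(w for *_, w in chosen))                       # total weight = 5
```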
Generative Language Models (GLMs) have shown impressive performance in tasks such as text generation, understanding, and reasoning. However, their large model size poses challenges for practical deployment. To address this problem, Quantization-Aware Training (QAT) has become increasingly popular. However, current QAT methods for generative models incur a noticeable loss of accuracy. To counteract this issue, we propose a novel knowledge distillation method specifically designed for GLMs. Our method, called token-scaled logit distillation, prevents overfitting and provides superior learning from both the teacher model and the ground truth. This research marks the first evaluation of ternary weight quantization-aware training of large-scale GLMs with less than 1.0 point of perplexity degradation and no loss of accuracy on a reasoning task.
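The paper's exact token-scaling rule is not reproduced here; as a hedged sketch of the general idea of per-token-weighted logit distillation, one might weight the per-token teacher-student KL divergence, for instance by teacher confidence (the entropy-based weight below is purely an assumption for illustration).

```python
import torch
import torch.nn.functional as F

# Generic per-token-weighted logit distillation loss (illustrative only).
def token_weighted_kd_loss(student_logits, teacher_logits):
    # logits: (batch, seq_len, vocab)
    t_logp = F.log_softmax(teacher_logits, dim=-1)
    s_logp = F.log_softmax(student_logits, dim=-1)
    per_token_kl = (t_logp.exp() * (t_logp - s_logp)).sum(dim=-1)  # (B, T)
    # assumed weighting: emphasize tokens where the teacher is confident
    entropy = -(t_logp.exp() * t_logp).sum(dim=-1)
    weight = 1.0 / (1.0 + entropy)
    return (weight * per_token_kl).mean()
```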
In this paper, we present a new family of cross $Z$-complementary pairs (CZCPs) based on generalized Boolean functions and two roots of unity. Our key idea is to consider an arbitrary partition of the set $\{1,2,\ldots,n\}$ into two subsets, each associated with one of the two given roots of unity, from which two truncated sequences over a new alphabet determined by the two roots of unity are obtained. We show that these two truncated sequences form a new $q$-ary CZCP with flexible sequence length and a large zero-correlation zone width. Furthermore, we derive an enumeration formula by counting the partitions via the Stirling number of the second kind and show that the number of constructed CZCPs increases significantly compared to existing works.
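As background only (this is not the paper's construction), the zero-correlation zone of a pair is stated in terms of aperiodic correlation sums; the helper below computes them, and the length-4 binary pair in the example is a classical Golay complementary pair whose autocorrelation sums vanish at all nonzero shifts.

```python
import numpy as np

# Aperiodic cross-correlation of two equal-length complex sequences at shift tau.
def aperiodic_correlation(a, b, tau):
    a, b = np.asarray(a, complex), np.asarray(b, complex)
    n = len(a)
    if tau >= 0:
        return np.sum(a[:n - tau] * np.conj(b[tau:]))
    return np.sum(a[-tau:] * np.conj(b[:n + tau]))

a = [1, 1, 1, -1]    # classical binary Golay pair, mapped to {+1, -1}
b = [1, 1, -1, 1]
print([aperiodic_correlation(a, a, t) + aperiodic_correlation(b, b, t)
       for t in range(4)])    # -> [8, 0, 0, 0] (as complex numbers)
```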
Multilevel lattice codes, such as those associated with Constructions $C$, $\overline{D}$, $D$, and $D'$, have relevant applications in communications. In this paper, we investigate properties of lattices obtained via Constructions $D$ and $D'$ from $q$-ary linear codes. Connections with Construction $A$, generator matrices, and expressions and bounds for the lattice volume and minimum distance are derived. Extensions of previous results regarding the construction and decoding of binary and $p$-ary linear codes ($p$ prime) are also presented.
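For context (an illustrative sketch, not one of the paper's results), Construction $A$ attaches to a $q$-ary linear code $C$ the lattice $\Lambda_A(C) = C + q\mathbb{Z}^n$; membership can be checked by reducing a point modulo $q$.

```python
import numpy as np

# A point x in Z^n lies in Lambda_A(C) = C + qZ^n iff (x mod q) is a codeword.
def in_construction_A(x, codewords, q):
    return tuple(np.mod(x, q)) in {tuple(c) for c in codewords}

C = [(0, 0), (1, 1), (2, 2)]              # ternary (q = 3) repetition code
print(in_construction_A((4, 7), C, 3))    # (1, 1) is a codeword -> True
print(in_construction_A((4, 5), C, 3))    # (1, 2) is not -> False
```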
In this article, we consider an $n$-dimensional random walk $X_t$ whose error terms are linear processes generated by $n$-dimensional noise vectors, where each component of these noise vectors is independent, not necessarily identically distributed, and has a uniformly bounded 8th moment and a density. Given $T$ observations such that the dimension $n$ and the sample size $T$ go to infinity proportionally, define $\boldsymbol{X}$ and $\hat{\boldsymbol{R}}$ as the data matrix and the sample correlation matrix of $\boldsymbol{X}$, respectively. This article establishes the central limit theorem (CLT) for the first $K$ largest eigenvalues of $n^{-1}\hat{\boldsymbol{R}}$. Subsequently, we propose a new unit root test for panel high-dimensional nonstationary time series based on the CLT of the largest eigenvalue of $n^{-1}\hat{\boldsymbol{R}}$. A numerical experiment is undertaken to verify the power of the proposed unit root test.
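A toy simulation of the object of study (arbitrary parameters, i.i.d. Gaussian increments rather than the paper's linear-process errors) can be set up as below to inspect the largest eigenvalues of $n^{-1}\hat{\boldsymbol{R}}$.

```python
import numpy as np

rng = np.random.default_rng(0)
n, T = 50, 100                              # dimension and sample size, n/T fixed
increments = rng.standard_normal((T, n))    # simplified: no linear filtering of the noise
X = np.cumsum(increments, axis=0)           # random walk: X_t = X_{t-1} + eps_t
R_hat = np.corrcoef(X, rowvar=False)        # n x n sample correlation matrix
eigvals = np.sort(np.linalg.eigvalsh(R_hat / n))[::-1]
print(eigvals[:3])                          # the few largest eigenvalues of n^{-1} R_hat
```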
We revisit the complexity of kernelization for the $d$-Hitting Set problem. This is a classic problem in Parameterized Complexity, which encompasses several of the most well-studied problems in this field, such as Vertex Cover, Feedback Vertex Set in Tournaments (FVST) and Cluster Vertex Deletion (CVD). In fact, $d$-Hitting Set encompasses any deletion problem to a hereditary property that can be characterized by a finite set of forbidden induced subgraphs. With respect to bit size, the kernelization complexity of $d$-Hitting Set is essentially settled: there exists a kernel with $O(k^d)$ bits ($O(k^d)$ sets and $O(k^{d-1})$ elements), and this is tight by the result of Dell and van Melkebeek [STOC 2010, JACM 2014]. Still, the question of whether there exists a kernel for $d$-Hitting Set with fewer elements has remained one of the major open problems~in~kernelization. In this paper, we first show that if we allow the kernelization to be lossy with a qualitatively better loss than the best possible approximation ratio of polynomial-time approximation algorithms, then one can obtain kernels in which the number of elements is linear for every fixed $d$. Building on this, we present our main result: we show that there exist approximate Turing kernelizations for $d$-Hitting Set that even beat the established bit-size lower bounds for exact kernelizations -- in fact, we use a constant number of oracle calls, each with ``near-linear'' ($O(k^{1+\epsilon})$) bit size, which is almost the best one could hope for. Lastly, for two special cases of implicit 3-Hitting Set, namely FVST and CVD, we obtain ``best of both worlds'' type results -- $(1+\epsilon)$-approximate kernelizations with a linear number of vertices. In terms of size, this substantially improves upon the exact kernels of Fomin et al. [SODA 2018, TALG 2019], with simpler arguments.
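For readers outside the area, the folklore greedy $d$-approximation below fixes what a $d$-Hitting Set instance and a feasible solution look like; it is unrelated to the kernelizations obtained in the paper.

```python
# Classical d-approximation: whenever a set is not yet hit, add all of its
# (at most d) elements to the solution.
def greedy_hitting_set(sets):
    solution = set()
    for s in sets:
        if not solution & set(s):
            solution |= set(s)
    return solution

instance = [{1, 2, 3}, {3, 4, 5}, {6, 7}]   # a 3-Hitting Set instance
print(greedy_hitting_set(instance))         # -> {1, 2, 3, 6, 7}
```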
Event-based motion deblurring has shown promising results by exploiting low-latency events. However, current approaches are limited in practical use, as they assume that the inputs share the same spatial resolution and that blurriness follows specific distributions. This work addresses these limitations and aims to generalize the performance of event-based deblurring to real-world scenarios. We propose a scale-aware network that allows flexible input spatial scales and enables learning from different temporal scales of motion blur. A two-stage self-supervised learning scheme is then developed to fit the real-world data distribution. By utilizing the relativity of blurriness, our approach efficiently recovers the brightness and structure of latent images and further generalizes deblurring performance to varying spatial and temporal scales of motion blur in a self-distillation manner. Our method is extensively evaluated and demonstrates remarkable performance, and we also introduce a real-world dataset consisting of multi-scale blurry frames and events to facilitate research in event-based deblurring.
The problem of Multiple Object Tracking (MOT) consists of following the trajectories of different objects in a sequence, usually a video. In recent years, with the rise of Deep Learning, the algorithms that address this problem have benefited from the representational power of deep models. This paper provides a comprehensive survey of works that employ Deep Learning models to solve the task of MOT on single-camera videos. Four main steps in MOT algorithms are identified, and an in-depth review of how Deep Learning has been employed in each of these stages is presented. A complete experimental comparison of the presented works on the three MOTChallenge datasets is also provided, identifying a number of similarities among the top-performing methods and presenting some possible future research directions.