The notion of $\alpha$-equivalence between $\lambda$-terms is commonly used to identify terms that are considered equal. However, due to the primitive treatment of free variables, this notion falls short when comparing subterms occurring within a larger context. Depending on how the Barendregt convention (choosing different variable names for all involved binders) is applied, it equates either too few or too many subterms. We introduce a formal notion of context-sensitive $\alpha$-equivalence, where two open terms can be compared within a context that resolves their free variables. We show that this equivalence coincides exactly with the notion of bisimulation equivalence. Furthermore, we present an efficient algorithm with $O(n\log n)$ runtime that identifies $\lambda$-terms modulo context-sensitive $\alpha$-equivalence, improving upon the previously established $O(n\log^2 n)$ bound for hashing modulo ordinary $\alpha$-equivalence by Maziarz et al. Hashing $\lambda$-terms is useful in many applications that require common subterm elimination and structure sharing. We employ the algorithm to obtain a large-scale, densely packed, interconnected graph of mathematical knowledge from the Coq proof assistant for machine learning purposes.
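To make the baseline concrete, here is a minimal Python sketch of hashing modulo ordinary $\alpha$-equivalence via De Bruijn conversion; the tuple encoding of terms is our own hypothetical representation, and the paper's context-sensitive notion and its $O(n\log n)$ algorithm go beyond this:

```python
# Illustrative sketch (not the paper's algorithm): hash lambda-terms modulo
# ordinary alpha-equivalence by replacing bound names with binder depths,
# so alpha-equivalent terms get identical hashes. Hypothetical encoding:
#   ('var', name) | ('lam', name, body) | ('app', fun, arg)

def to_de_bruijn(term, env=()):
    """Replace bound-variable names with De Bruijn indices; free names stay."""
    kind = term[0]
    if kind == 'var':
        name = term[1]
        return ('bound', env.index(name)) if name in env else ('free', name)
    if kind == 'lam':
        _, name, body = term
        return ('lam', to_de_bruijn(body, (name,) + env))
    _, fun, arg = term
    return ('app', to_de_bruijn(fun, env), to_de_bruijn(arg, env))

def alpha_hash(term):
    return hash(to_de_bruijn(term))

# \x. \y. x y  and  \a. \b. a b  hash identically:
t1 = ('lam', 'x', ('lam', 'y', ('app', ('var', 'x'), ('var', 'y'))))
t2 = ('lam', 'a', ('lam', 'b', ('app', ('var', 'a'), ('var', 'b'))))
assert alpha_hash(t1) == alpha_hash(t2)
```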
Analytic combinatorics in several variables refers to a suite of tools that provide sharp asymptotic estimates for certain combinatorial quantities. In this paper, we apply these tools to determine the Gilbert--Varshamov lower bound on the rate of optimal codes in the $L_1$ metric. Several different code spaces are analyzed, including the simplex and the hypercube in $\mathbb{Z}^n$, all of which are inspired by concrete data storage and transmission models such as the sticky insertion channel, the permutation channel, the adjacent transposition (bit-shift) channel, and the multilevel flash memory channel.
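For context, the Gilbert--Varshamov argument in a generic finite metric space (notation here is ours, not the paper's) picks codewords greedily, with each new codeword excluding at most a ball of radius $d-1$; hence the maximal size $A(\mathcal{X}_n, d)$ of a code with minimum distance $d$ satisfies
\[
A(\mathcal{X}_n, d) \;\ge\; \frac{|\mathcal{X}_n|}{\max_{x \in \mathcal{X}_n} |B(x, d-1)|},
\]
and sharp asymptotics for the $L_1$-ball sizes $|B(x, d-1)|$, which analytic combinatorics in several variables supplies, turn this into a lower bound on the rate.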
We consider online algorithms for the $k$-server problem on trees with $n$ nodes. Chrobak and Larmore proposed a $k$-competitive algorithm for this problem that achieves the optimal competitive ratio. However, the existing implementations require $O\left(k^2 + k\log n\right)$ or $O\left(k(\log n)^2\right)$ time to process a query. We propose a new time-efficient implementation of this algorithm with $O(n)$ preprocessing time and $O\left(k\log k\right)$ time per query. The new algorithm is faster than both existing implementations, and its query-processing time does not depend on the tree size.
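As a point of reference, a naive per-query implementation of the Chrobak--Larmore rule on a tree might look as follows (a sketch only, far from the $O(k\log k)$ bound above; representing the tree by parent pointers is our assumption): on each query, every server whose path to the query is not blocked by another server advances one edge, until some server arrives.

```python
# Naive per-query sketch of the Chrobak--Larmore rule on a tree given by
# parent pointers (illustrative only; the paper's point is doing this fast).

def path_up(v, parent):
    p = [v]
    while parent[v] is not None:
        v = parent[v]
        p.append(v)
    return p

def tree_path(u, v, parent):
    """Vertices on the unique u-v path, found via the lowest common ancestor."""
    pu, pv = path_up(u, parent), path_up(v, parent)
    on_pu = set(pu)
    lca = next(x for x in pv if x in on_pu)
    return pu[:pu.index(lca) + 1] + pv[:pv.index(lca)][::-1]

def serve(query, servers, parent):
    """Unblocked servers advance one edge toward the query until one arrives."""
    while query not in servers:
        occupied = set(servers)
        for i, s in enumerate(servers):
            p = tree_path(s, query, parent)
            blocked = any(x in occupied for x in p[1:-1])
            if not blocked:
                servers[i] = p[1]    # advance one edge toward the query
    return servers
```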
We introduce the $k$-Plane Insertion into Plane drawing ($k$-PIP) problem: given a plane drawing of a planar graph $G$ and a set of edges $F$, insert the edges in $F$ into the drawing such that the resulting drawing is $k$-plane. In this paper, we focus on the $1$-PIP scenario. We present a linear-time algorithm for the case that $G$ is a triangulation, and prove NP-completeness for the case that $G$ is biconnected and $F$ forms a path or a matching.
As a promising 3D generation technique, multiview diffusion (MVD) has received considerable attention due to its advantages in generalizability, quality, and efficiency. By finetuning pretrained large image diffusion models with 3D data, MVD methods first generate multiple views of a 3D object based on an image or text prompt and then reconstruct 3D shapes with multiview 3D reconstruction. However, the sparse views and inconsistent details in the generated images make 3D reconstruction challenging. We present MVD$^2$, an efficient 3D reconstruction method for MVD images. MVD$^2$ aggregates image features into a 3D feature volume by projection and convolution and then decodes the volumetric features into a 3D mesh. We train MVD$^2$ with 3D shape collections and MVD images prompted by rendered views of the 3D shapes. To address the discrepancy between the generated multiview images and the ground-truth views of the 3D shapes, we design a simple yet efficient view-dependent training scheme. MVD$^2$ improves the 3D generation quality of MVD and is fast and robust across various MVD methods. After training, it can efficiently decode 3D meshes from multiview images within one second. We train MVD$^2$ with Zero-123++ and the ObjectVerse-LVIS 3D dataset and demonstrate its superior performance in generating 3D models from multiview images produced by different MVD methods, using both synthetic and real images as prompts.
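A hedged sketch of the aggregation step as we read it from the abstract (not the authors' code; camera matrices, nearest-neighbor sampling, and all names below are our illustrative assumptions): project each voxel center into every generated view, sample per-view features, and average them into a 3D feature volume.

```python
# Illustrative projection-and-average feature aggregation for MVD images.
import numpy as np

def aggregate_views(feats, cams, grid):
    """feats: (V, H, W, C) per-view feature maps
       cams:  (V, 3, 4) projection matrices (assumed known per view)
       grid:  (N, 3) voxel-center coordinates
       returns (N, C): per-voxel features averaged over views that see them."""
    V, H, W, C = feats.shape
    out = np.zeros((grid.shape[0], C))
    hits = np.zeros(grid.shape[0])
    homo = np.concatenate([grid, np.ones((grid.shape[0], 1))], axis=1)  # (N, 4)
    for v in range(V):
        xyw = homo @ cams[v].T                            # (N, 3) projective
        with np.errstate(divide='ignore', invalid='ignore'):
            u = np.round(xyw[:, 0] / xyw[:, 2]).astype(int)   # column index
            r = np.round(xyw[:, 1] / xyw[:, 2]).astype(int)   # row index
        ok = (xyw[:, 2] > 0) & (u >= 0) & (u < W) & (r >= 0) & (r < H)
        out[ok] += feats[v, r[ok], u[ok]]                 # nearest-neighbor sample
        hits[ok] += 1
    return out / np.maximum(hits, 1)[:, None]             # average over views
```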
In the Tricolored Euclidean Traveling Salesperson problem, we are given~$k=3$ sets of points in the plane and are looking for disjoint tours, each covering one of the sets. Arora (1998) famously gave a PTAS based on ``patching'' for the case $k=1$ and, recently, Dross et al.~(2023) generalized this result to~$k=2$. Our contribution is a $(5/3+\epsilon)$-approximation algorithm for~$k=3$ that further generalizes Arora's approach. It is believed that patching is generally no longer possible for more than two tours. We circumvent this issue by either applying a conditional patching scheme for three tours or using an alternative approach based on a weighted solution for $k=2$.
The $k$-Maximum Inner Product Search ($k$MIPS) serves as a foundational component in recommender systems and various data mining tasks. However, while most existing $k$MIPS approaches prioritize the efficient retrieval of highly relevant items for users, they often neglect an equally pivotal facet of search results: \emph{diversity}. To bridge this gap, we revisit and refine the diversity-aware $k$MIPS (D$k$MIPS) problem by incorporating two well-known diversity objectives -- minimizing the average and maximum pairwise item similarities within the results -- into the original relevance objective. This enhancement, inspired by Maximal Marginal Relevance (MMR), offers users a controllable trade-off between relevance and diversity. We introduce \textsc{Greedy} and \textsc{DualGreedy}, two linear scan-based algorithms tailored for D$k$MIPS. They both achieve data-dependent approximations and, when aiming to minimize the average pairwise similarity, \textsc{DualGreedy} attains an approximation ratio of $1/4$ with an additive term for regularization. To further improve query efficiency, we integrate a lightweight Ball-Cone Tree (BC-Tree) index with the two algorithms. Finally, comprehensive experiments on ten real-world data sets demonstrate the efficacy of our proposed methods, showcasing their capability to efficiently deliver diverse and relevant search results to users.
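To illustrate the MMR-style objective, here is a minimal greedy sketch in Python (parameter names such as `lam` are hypothetical; \textsc{DualGreedy} and the BC-Tree acceleration are not reproduced here): each step adds the item whose inner-product relevance, discounted by its average similarity to the items already chosen, is largest.

```python
# Minimal MMR-style greedy for diversity-aware kMIPS (illustrative sketch).
import numpy as np

def greedy_dkmips(q, items, k, lam=0.5):
    """Pick k rows of `items`, trading inner product with q against
    average pairwise similarity within the result set."""
    rel = items @ q                        # relevance: inner products with query
    chosen = []
    for _ in range(k):
        best, best_gain = None, -np.inf
        for i in range(len(items)):
            if i in chosen:
                continue
            div = (np.mean([items[i] @ items[j] for j in chosen])
                   if chosen else 0.0)     # avg similarity to current result set
            gain = rel[i] - lam * div      # relevance minus diversity penalty
            if gain > best_gain:
                best, best_gain = i, gain
        chosen.append(best)
    return chosen
```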
Sparse Mixture of Experts (MoE) models are popular for training large language models due to their computational efficiency. However, the commonly used top-$k$ routing mechanism suffers from redundant computation and memory costs due to unbalanced routing: some experts overflow, and the tokens exceeding their capacity are dropped, while other experts are vacant and padded with zeros, negatively impacting model performance. To address the dropped tokens and padding, we propose the Rectify-Router, comprising the Intra-GPU Rectification and the Fill-in Rectification. The Intra-GPU Rectification handles dropped tokens, efficiently routing them to experts within the GPU where they are located to avoid inter-GPU communication. The Fill-in Rectification addresses padding by replacing padding tokens with the tokens that have high routing scores. Our experimental results demonstrate that the Intra-GPU Rectification and the Fill-in Rectification effectively handle dropped tokens and padding, respectively. Furthermore, combining them achieves superior performance, surpassing the accuracy of the vanilla top-1 router by 4.7%.
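A toy, single-GPU sketch of the two rectifications as we read them from the abstract (shapes, capacity handling, and tie-breaking below are our illustrative assumptions, not the authors' implementation):

```python
# Toy top-1 routing with the two rectifications from the abstract.
import numpy as np

def route_with_rectification(scores, capacity):
    """scores: (T, E) router scores; returns {expert: [token indices]}."""
    T, E = scores.shape
    top1 = scores.argmax(axis=1)
    buckets = {e: [t for t in range(T) if top1[t] == e] for e in range(E)}
    dropped = []
    for e in range(E):
        ranked = sorted(buckets[e], key=lambda t: -scores[t, e])
        buckets[e], over = ranked[:capacity], ranked[capacity:]
        dropped.extend(over)               # tokens over capacity are "dropped"
    # Intra-GPU Rectification (idea): reroute dropped tokens to an expert with
    # spare capacity instead of discarding them (all experts are "local" in
    # this single-GPU toy, so no inter-GPU communication is modeled).
    for t in dropped:
        e = min(range(E), key=lambda e: len(buckets[e]))
        if len(buckets[e]) < capacity:
            buckets[e].append(t)
    # Fill-in Rectification (idea): fill remaining padding slots with the
    # highest-scoring (token, expert) pairs rather than zero-padding.
    for f in np.argsort(-scores, axis=None):
        t, e = divmod(int(f), E)
        if len(buckets[e]) < capacity and t not in buckets[e]:
            buckets[e].append(t)
    return buckets
```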
Processing and reasoning over long contexts is crucial for many practical applications of Large Language Models (LLMs), such as document comprehension and agent construction. Despite recent strides in making LLMs process contexts with more than 100K tokens, there is currently a lack of a standardized benchmark to evaluate this long-context capability. Existing public benchmarks typically focus on contexts around 10K tokens, limiting the assessment and comparison of LLMs in processing longer contexts. In this paper, we propose $\infty$Bench, the first LLM benchmark featuring an average data length surpassing 100K tokens. $\infty$Bench comprises synthetic and realistic tasks spanning diverse domains, presented in both English and Chinese. The tasks in $\infty$Bench are designed to require a thorough understanding of long dependencies in the context, so that simply retrieving a limited number of passages is not sufficient to solve them. In our experiments based on $\infty$Bench, we evaluate state-of-the-art proprietary and open-source LLMs tailored for processing long contexts. The results indicate that existing long-context LLMs still require significant advancements to effectively process contexts of 100K+ tokens. We further present three intriguing analyses regarding the behavior of LLMs processing long contexts.
We develop a sparse hierarchical $hp$-finite element method ($hp$-FEM) for the Helmholtz equation with rotationally invariant variable coefficients posed on a two-dimensional disk or annulus. The mesh consists of an inner disk cell (omitted on an annulus domain) and concentric annulus cells. The discretization preserves the Fourier mode decoupling of rotationally invariant operators, such as the Laplacian, which manifests as block diagonal mass and stiffness matrices. Moreover, the matrices have a sparsity pattern independent of the order of the discretization and admit an optimal complexity factorization. The sparse $hp$-FEM can handle radial discontinuities in the right-hand side and in rotationally invariant Helmholtz coefficients. We consider examples such as a high-frequency Helmholtz equation with radial discontinuities, the time-dependent Schr\"odinger equation, and an extension to a three-dimensional cylinder domain, with a quasi-optimal solve, via the Alternating Direction Implicit (ADI) algorithm.
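The decoupling invoked above is the standard polar-coordinate identity (stated here with an illustrative radial coefficient $\kappa$): writing $u(r,\theta) = \sum_{m \in \mathbb{Z}} u_m(r) e^{im\theta}$, the equation $\Delta u + \kappa(r) u = f$ separates into independent radial ODEs,
\[
u_m''(r) + \frac{1}{r}\,u_m'(r) - \frac{m^2}{r^2}\,u_m(r) + \kappa(r)\,u_m(r) = f_m(r), \qquad m \in \mathbb{Z},
\]
one per Fourier mode, which is why the mass and stiffness matrices are block diagonal with one block per mode.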
A supervised feature selection method selects an appropriate but concise set of features to differentiate classes, which is highly expensive for large-scale datasets. Feature selection should therefore aim at both minimizing the number of selected features and maximizing the accuracy of classification, or any other downstream task. However, this crucial task is computationally highly demanding on many real-world datasets and requires a very efficient algorithm to reach a set of optimal features within a limited number of fitness evaluations. For this purpose, we propose the binary multi-objective coordinate search (MOCS) algorithm to solve large-scale feature selection problems. To the best of our knowledge, this is the first multi-objective coordinate search algorithm. In this method, we generate new individuals by flipping a single variable of candidate solutions on the Pareto front, which enables us to investigate the effectiveness of each feature in the corresponding subset. In effect, this strategy plays the role of the crossover and mutation operators used to generate distinct subsets of features. The reported results indicate the significant superiority of our method over NSGA-II on five real-world large-scale datasets, particularly when the computing budget is limited. Moreover, this simple, hyper-parameter-free algorithm solves feature selection much faster and more efficiently than NSGA-II.
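A schematic of the bit-flip move in Python (our illustration, not the authors' implementation; the objective function is a placeholder and the bookkeeping is simplified):

```python
# One iteration of a binary multi-objective coordinate search: flip a single
# coordinate of each Pareto-front solution and keep the non-dominated set.
import random

def dominates(a, b):
    """Objective tuples, minimization: a dominates b if no worse, once better."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))

def pareto_front(pop, evals):
    return [s for s in pop
            if not any(dominates(evals[tuple(o)], evals[tuple(s)])
                       for o in pop if o is not s)]

def mocs_step(front, evaluate, evals):
    """front: list of 0/1 lists; evals: {solution tuple: objective tuple}."""
    children = []
    for s in front:
        c = s[:]
        i = random.randrange(len(c))
        c[i] = 1 - c[i]                   # coordinate (bit) flip
        evals[tuple(c)] = evaluate(c)     # e.g. (n_selected_features, error)
        children.append(c)
    return pareto_front(front + children, evals)
```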