This paper investigates the rate-distortion function, under a squared error distortion $D$, for an $n$-dimensional random vector uniformly distributed on an $(n-1)$-sphere of radius $R$. First, an expression for the rate-distortion function is derived for any values of $n$, $D$, and $R$. Second, two types of asymptotics with respect to the rate-distortion function of a Gaussian source are characterized. More specifically, these asymptotics concern the low-distortion regime (that is, $D \to 0$) and the high-dimensional regime (that is, $n \to \infty$).

Related content

22 February 2024

Gilbert-Varshamov Bound for Codes in $L_1$ Metric using Multivariate Analytic Combinatorics

Keshav Goyal,Duc Tu Dao,Mladen Kovačević,Han Mao Kiah

from arxiv, 33 pages, 3 figures, submitted to IEEE Transactions on Information Theory

Analytic combinatorics in several variables refers to a suite of tools that provide sharp asymptotic estimates for certain combinatorial quantities. In this paper, we apply these tools to determine the Gilbert-Varshamov lower bound on the rate of optimal codes in the $L_1$ metric. Several different code spaces are analyzed, including the simplex and the hypercube in $\mathbb{Z}^n$, all of which are inspired by concrete data storage and transmission models such as the sticky insertion channel, the permutation channel, the adjacent transposition (bit-shift) channel, and the multilevel flash memory channel.

22 February 2024

Time Efficient Implementation for Online $k$-server Problem on Trees

Kamil Khadiev,Maxim Yagafarov

from arxiv, TAMC 2024. arXiv admin note: text overlap with arXiv:2008.00270

We consider online algorithms for the $k$-server problem on trees of size $n$. Chrobak and Larmore proposed a $k$-competitive algorithm for this problem that has the optimal competitive ratio. However, the existing implementations have $O\left(k^2 + k\cdot \log n\right)$ or $O\left(k(\log n)^2\right)$ time complexity for processing a query, where $n$ is the number of nodes. We propose a new time-efficient implementation of this algorithm that has $O(n)$ time complexity for preprocessing and $O\left(k\log k\right)$ time for processing a query. The new algorithm is faster than both existing algorithms and the time complexity for query processing does not depend on the tree size.

22 February 2024

On $k$-Plane Insertion into Plane Drawings

Julia Katheder,Philipp Kindermann,Fabian Klute,Irene Parada,Ignaz Rutter

We introduce the $k$-Plane Insertion into Plane drawing ($k$-PIP) problem: given a plane drawing of a planar graph $G$ and a set of edges $F$, insert the edges in $F$ into the drawing such that the resulting drawing is $k$-plane. In this paper, we focus on the $1$-PIP scenario. We present a linear-time algorithm for the case that $G$ is a triangulation, while proving NP-completeness for the case that $G$ is biconnected and $F$ forms a path or a matching.

22 February 2024

E^2-LLM: Efficient and Extreme Length Extension of Large Language Models

Jiaheng Liu,Zhiqi Bai,Yuanxing Zhang,Chenchen Zhang,Yu Zhang,Ge Zhang,Jiakai Wang,Haoran Que,Yukang Chen,Wenbo Su,Tiezheng Ge,Jie Fu,Wenhu Chen,Bo Zheng

Typically, training LLMs with long context sizes is computationally expensive, requiring extensive training hours and GPU resources. Existing long-context extension methods usually need additional training procedures to support the corresponding long-context windows, which requires long-context training data (e.g., 32k) and incurs high GPU training costs. To address these issues, we propose an Efficient and Extreme length extension method for Large Language Models, called E^2-LLM, with only one training procedure and dramatically reduced computation cost, which also removes the need to collect long-context data. Concretely, first, the training data of our E^2-LLM only requires a short length (e.g., 4k), which greatly reduces the tuning cost. Second, the training procedure on the short training context window is performed only once, and we can support different evaluation context windows at inference. Third, in E^2-LLM, based on RoPE position embeddings, we introduce two different augmentation methods on the scale and position index parameters for different samples in training. This aims to make the model more robust to the different relative differences when directly interpolating to an arbitrary context length at inference. Comprehensive experimental results on multiple benchmark datasets demonstrate the effectiveness of our E^2-LLM on challenging long-context tasks.
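
As a rough illustration of the third point, the sketch below computes RoPE rotation angles when each position index is shifted by an offset and divided by a scale factor. The names `rope_angles` and `sample_augmentation`, the parameters `scale` and `offset`, and the sampling ranges are assumptions made for illustration only; the abstract does not say how E^2-LLM samples or applies these parameters.

```python
import random


def rope_angles(seq_len, head_dim, scale=1.0, offset=0, base=10000.0):
    """Return angles[p][i] = ((p + offset) / scale) * base ** (-2 * i / head_dim).

    `scale` compresses position indices (position interpolation) and
    `offset` shifts them; both are illustrative knobs, not the paper's API.
    """
    if head_dim % 2 != 0:
        raise ValueError("head_dim must be even")
    # Standard RoPE inverse frequencies, one per pair of channels.
    inv_freq = [base ** (-2.0 * i / head_dim) for i in range(head_dim // 2)]
    return [
        [((p + offset) / scale) * f for f in inv_freq]
        for p in range(seq_len)
    ]


def sample_augmentation(rng, max_scale=8, max_offset=1024):
    """Draw a (scale, offset) pair for one training sample (hypothetical ranges)."""
    return rng.randint(1, max_scale), rng.randint(0, max_offset)


rng = random.Random(0)
scale, offset = sample_augmentation(rng)
angles = rope_angles(seq_len=8, head_dim=4, scale=scale, offset=offset)
```

Because the offset is added to every position, the angle difference between two tokens depends only on their distance and on `scale`, which is why varying `scale` across samples exposes the model to different relative-position densities.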

22 February 2024

MVD$^2$: Efficient Multiview 3D Reconstruction for Multiview Diffusion

Xin-Yang Zheng,Hao Pan,Yu-Xiao Guo,Xin Tong,Yang Liu

As a promising 3D generation technique, multiview diffusion (MVD) has received a lot of attention due to its advantages in terms of generalizability, quality, and efficiency. By finetuning pretrained large image diffusion models with 3D data, the MVD methods first generate multiple views of a 3D object based on an image or text prompt and then reconstruct 3D shapes with multiview 3D reconstruction. However, the sparse views and inconsistent details in the generated images make 3D reconstruction challenging. We present MVD$^2$, an efficient 3D reconstruction method for multiview diffusion (MVD) images. MVD$^2$ aggregates image features into a 3D feature volume by projection and convolution and then decodes volumetric features into a 3D mesh. We train MVD$^2$ with 3D shape collections and MVD images prompted by rendered views of 3D shapes. To address the discrepancy between the generated multiview images and ground-truth views of the 3D shapes, we design a simple-yet-efficient view-dependent training scheme. MVD$^2$ improves the 3D generation quality of MVD and is fast and robust to various MVD methods. After training, it can efficiently decode 3D meshes from multiview images within one second. We train MVD$^2$ with Zero-123++ and ObjectVerse-LVIS 3D dataset and demonstrate its superior performance in generating 3D models from multiview images generated by different MVD methods, using both synthetic and real images as prompts.

21 February 2024

A $(5/3+ε)$-Approximation for Tricolored Non-crossing Euclidean TSP

Júlia Baligács,Yann Disser,Andreas Emil Feldmann,Anna Zych-Pawlewicz

In the Tricolored Euclidean Traveling Salesperson problem, we are given $k=3$ sets of points in the plane and are looking for disjoint tours, each covering one of the sets. Arora (1998) famously gave a PTAS based on "patching" for the case $k=1$ and, recently, Dross et al. (2023) generalized this result to $k=2$. Our contribution is a $(5/3+\epsilon)$-approximation algorithm for $k=3$ that further generalizes Arora's approach. It is believed that patching is generally no longer possible for more than two tours. We circumvent this issue by either applying a conditional patching scheme for three tours or using an alternative approach based on a weighted solution for $k=2$.

21 February 2024

Diversity-Aware $k$-Maximum Inner Product Search Revisited

Qiang Huang,Yanhao Wang,Yiqun Sun,Anthony K. H. Tung

from arxiv, 14 pages, 9 figures, and 5 tables

The $k$-Maximum Inner Product Search ($k$MIPS) serves as a foundational component in recommender systems and various data mining tasks. However, while most existing $k$MIPS approaches prioritize the efficient retrieval of highly relevant items for users, they often neglect an equally pivotal facet of search results: diversity. To bridge this gap, we revisit and refine the diversity-aware $k$MIPS (D$k$MIPS) problem by incorporating two well-known diversity objectives, minimizing the average and the maximum pairwise item similarities within the results, into the original relevance objective. This enhancement, inspired by Maximal Marginal Relevance (MMR), offers users a controllable trade-off between relevance and diversity. We introduce Greedy and DualGreedy, two linear scan-based algorithms tailored for D$k$MIPS. They both achieve data-dependent approximations and, when aiming to minimize the average pairwise similarity, DualGreedy attains an approximation ratio of $1/4$ with an additive term for regularization. To further improve query efficiency, we integrate a lightweight Ball-Cone Tree (BC-Tree) index with the two algorithms. Finally, comprehensive experiments on ten real-world data sets demonstrate the efficacy of our proposed methods, showcasing their capability to efficiently deliver diverse and relevant search results to users.
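
To make the relevance/diversity trade-off concrete, here is a minimal sketch of a generic MMR-style greedy linear scan. It is not the paper's Greedy or DualGreedy algorithm: the function name `greedy_diverse_mips`, the trade-off parameter `lam`, and the exact scoring rule (relevance minus average inner product with already selected items) are assumptions chosen for illustration.

```python
def inner(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def greedy_diverse_mips(items, query, k, lam=0.5):
    """Pick up to k item indices by repeated linear scans.

    Each round selects the item maximizing
        lam * <item, query> - (1 - lam) * avg_j <item, selected_j>,
    so lam = 1.0 reduces to plain top-k inner product search.
    """
    selected = []
    remaining = list(range(len(items)))
    while remaining and len(selected) < k:
        best, best_score = None, float("-inf")
        for idx in remaining:
            relevance = inner(items[idx], query)
            if selected:
                # Average similarity to the items already in the result set.
                penalty = sum(inner(items[idx], items[j]) for j in selected) / len(selected)
            else:
                penalty = 0.0
            score = lam * relevance - (1.0 - lam) * penalty
            if score > best_score:
                best, best_score = idx, score
        selected.append(best)
        remaining.remove(best)
    return selected


items = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0]]
query = [1.0, 0.5]
print(greedy_diverse_mips(items, query, k=2, lam=1.0))  # [0, 1]
print(greedy_diverse_mips(items, query, k=2, lam=0.5))  # [0, 2]
```

With `lam=1.0` the two most relevant (and nearly identical) items are returned; lowering `lam` swaps the near-duplicate for a less relevant but dissimilar item.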

21 February 2024

Turn Waste into Worth: Rectifying Top-$k$ Router of MoE

Zhiyuan Zeng,Qipeng Guo,Zhaoye Fei,Zhangyue Yin,Yunhua Zhou,Linyang Li,Tianxiang Sun,Hang Yan,Dahua Lin,Xipeng Qiu

Sparse Mixture of Experts (MoE) models are popular for training large language models due to their computational efficiency. However, the commonly used top-$k$ routing mechanism suffers from redundant computation and memory costs due to unbalanced routing. Some experts overflow, and the tokens exceeding their capacity are dropped, while other experts are left vacant and are padded with zeros, which negatively impacts model performance. To address the dropped tokens and padding, we propose the Rectify-Router, comprising the Intra-GPU Rectification and the Fill-in Rectification. The Intra-GPU Rectification handles dropped tokens, efficiently routing them to experts within the GPU where they are located to avoid inter-GPU communication. The Fill-in Rectification addresses padding by replacing padding tokens with the tokens that have high routing scores. Our experimental results demonstrate that the Intra-GPU Rectification and the Fill-in Rectification effectively handle dropped tokens and padding, respectively. Furthermore, their combination achieves superior performance, surpassing the accuracy of the vanilla top-1 router by 4.7%.
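
The overflow and padding problem described above can be reproduced with a small sketch of top-1 routing under a fixed per-expert capacity. This only illustrates the problem, not the Rectify-Router itself; the function name `route_top1`, the first-come-first-served order in which tokens claim slots, and the example scores are assumptions.

```python
def route_top1(scores, capacity):
    """Assign each token to its highest-scoring expert, subject to capacity.

    scores[t][e] is the routing score of token t for expert e.
    Returns (assigned, dropped, padding), where assigned[e] lists the tokens
    kept by expert e, dropped lists tokens that found their expert full, and
    padding[e] is the number of unused (zero-padded) slots of expert e.
    """
    num_experts = len(scores[0])
    assigned = [[] for _ in range(num_experts)]
    dropped = []
    for t, row in enumerate(scores):
        e = max(range(num_experts), key=lambda j: row[j])
        if len(assigned[e]) < capacity:
            assigned[e].append(t)
        else:
            # Expert e has overflowed: this token is dropped.
            dropped.append(t)
    padding = [capacity - len(tokens) for tokens in assigned]
    return assigned, dropped, padding


scores = [[0.9, 0.1], [0.8, 0.2], [0.7, 0.3], [0.4, 0.6]]
assigned, dropped, padding = route_top1(scores, capacity=2)
print(assigned)  # [[0, 1], [3]]
print(dropped)   # [2]
print(padding)   # [0, 1]
```

In this toy case expert 0 overflows and token 2 is dropped, while expert 1 still has one padded slot, which is exactly the waste the abstract refers to.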

21 February 2024

A Unifying Theory for Runge-Kutta-like Time Integrators: Convergence and Stability

Thomas Izgin

from arxiv, Doctoral thesis

This work deals with two major topics concerning the numerical analysis of Runge-Kutta-like (RK-like) methods, namely their stability and order of convergence. RK-like methods differ from additive RK methods in that their coefficients are allowed to depend on the solution and the step size. For this reason, we also refer to them as non-standard additive RK (NSARK) methods. The first major part of this thesis is dedicated to providing a tool for deriving order conditions for NSARK methods. The proposed approach may yield implicit order conditions, which can be rewritten in explicit form using the NB-series of the stages. The obtained explicit order conditions can be further reduced using Gröbner basis computations. With the presented approach, it was possible for the first time to obtain conditions for the construction of 3rd and 4th order GeCo as well as 4th order MPRK schemes. Moreover, a new fourth order MPRK method is constructed using our theory and the order of convergence is validated numerically. The second major part is concerned with the stability of nonlinear time integrators preserving at least one linear invariant. We discuss how the given approach generalizes the notion of A-stability. We can prove that investigating the Jacobian of the generating map is sufficient to understand the stability of the nonlinear method in a neighborhood of the steady state. This approach allows for the first time the investigation of several modified Patankar schemes. In the case of MPRK schemes, we compute a general stability function in a way that can be easily adapted to the case of PDRS. Finally, the approach from the theory of dynamical systems is used to derive a necessary condition for avoiding unrealistic oscillations of the numerical approximation.

20 February 2024

Near-Optimal Quantum Algorithm for Minimizing the Maximal Loss

Hao Wang,Chenyi Zhang,Tongyang Li

from arxiv, 22 pages, 1 figure, To appear in The Twelfth International Conference on Learning Representations (ICLR 2024)

The problem of minimizing the maximum of $N$ convex, Lipschitz functions plays a significant role in optimization and machine learning. It has been the subject of a series of results, with the most recent one requiring $O(N\epsilon^{-2/3} + \epsilon^{-8/3})$ queries to a first-order oracle to compute an $\epsilon$-suboptimal point. Meanwhile, quantum algorithms for optimization are rapidly advancing, with speedups shown on many important optimization problems. In this paper, we conduct a systematic study of quantum algorithms and lower bounds for minimizing the maximum of $N$ convex, Lipschitz functions. On one hand, we develop quantum algorithms with an improved complexity bound of $\tilde{O}(\sqrt{N}\epsilon^{-5/3} + \epsilon^{-8/3})$. On the other hand, we prove that quantum algorithms must take $\tilde{\Omega}(\sqrt{N}\epsilon^{-2/3})$ queries to a first-order quantum oracle, showing that our dependence on $N$ is optimal up to poly-logarithmic factors.