
Consider the problem of estimating a random variable $X$ from noisy observations $Y = X+ Z$, where $Z$ is standard normal, under the $L^1$ fidelity criterion. It is well known that the optimal Bayesian estimator in this setting is the conditional median. This work shows that the only prior distribution on $X$ that induces linearity in the conditional median is Gaussian. Along the way, several other results are presented. In particular, it is demonstrated that if the conditional distribution $P_{X|Y=y}$ is symmetric for all $y$, then $X$ must follow a Gaussian distribution. Additionally, we consider other $L^p$ losses and observe the following phenomenon: for $p \in [1,2]$, Gaussian is the only prior distribution that induces a linear optimal Bayesian estimator, and for $p \in (2,\infty)$, infinitely many prior distributions on $X$ can induce linearity. Finally, extensions are provided to encompass noise models leading to conditional distributions from certain exponential families.
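The "if" direction of the linearity claim can be checked by hand. The following is a minimal worked example, assuming a zero-mean Gaussian prior with variance $\sigma^2$ that is independent of the noise (the symbol $\sigma^2$ is introduced here for illustration and does not appear in the abstract):

```latex
% Assumption: X ~ N(0, sigma^2), Z ~ N(0, 1), X independent of Z, Y = X + Z.
\begin{align*}
  X \mid Y = y \;&\sim\; \mathcal{N}\!\left(\frac{\sigma^2}{1+\sigma^2}\, y,\; \frac{\sigma^2}{1+\sigma^2}\right), \\
  \operatorname{median}(X \mid Y = y) \;&=\; \mathbb{E}[X \mid Y = y] \;=\; \frac{\sigma^2}{1+\sigma^2}\, y .
\end{align*}
```

Because the posterior is Gaussian, it is symmetric about its mean, so the conditional median coincides with the conditional mean and is linear in $y$. The abstract's result is the converse: no non-Gaussian prior produces a linear conditional median.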

Related content

Estimation / Estimators


Approximation · Undirected · Kernelization · Graph · Directed

February 23, 2024

Tight Approximation and Kernelization Bounds for Vertex-Disjoint Shortest Paths

Matthias Bentert, Fedor V. Fomin, Petr A. Golovach

We examine the possibility of approximating Maximum Vertex-Disjoint Shortest Paths. In this problem, the input is an edge-weighted (directed or undirected) $n$-vertex graph $G$ along with $k$ terminal pairs $(s_1,t_1),(s_2,t_2),\ldots,(s_k,t_k)$. The task is to connect as many terminal pairs as possible by pairwise vertex-disjoint paths such that each path is a shortest path between the respective terminals. Our work is anchored in the recent breakthrough by Lochet [SODA '21], which demonstrates the polynomial-time solvability of the problem for a fixed value of $k$. Lochet's result implies the existence of a polynomial-time $ck$-approximation for Maximum Vertex-Disjoint Shortest Paths, where $c \leq 1$ is a constant. Our first result suggests that this approximation algorithm is, in a sense, the best we can hope for. More precisely, assuming the gap-ETH, we exclude the existence of an $o(k)$-approximation within $f(k) \cdot \operatorname{poly}(n)$ time for any function $f$ that only depends on $k$. Our second result demonstrates the infeasibility of achieving an approximation ratio of $n^{\frac{1}{2}-\varepsilon}$ in polynomial time, unless P = NP. It is not difficult to show that a greedy algorithm selecting a path with the minimum number of arcs results in a $\lceil\sqrt{\ell}\rceil$-approximation, where $\ell$ is the number of edges in all the paths of an optimal solution. Since $\ell \leq n$, this underscores the tightness of the $n^{\frac{1}{2}-\varepsilon}$-inapproximability bound. Additionally, we establish that Maximum Vertex-Disjoint Shortest Paths is fixed-parameter tractable when parameterized by $\ell$ but does not admit a polynomial kernel. Our hardness results hold for undirected graphs with unit weights, while our positive results extend to scenarios where the input graph is directed and features arbitrary (non-negative) edge weights.

Scenario · Approximation · Graph · Discretization · Decomposed

February 23, 2024

Tight Inapproximability of Target Set Reconfiguration

Naoto Ohsaka

from arxiv, 13 pages

Given a graph $G$ with a vertex threshold function $\tau$, consider a dynamic process in which any inactive vertex $v$ becomes activated whenever at least $\tau(v)$ of its neighbors are activated. A vertex set $S$ is called a target set if all vertices of $G$ would be activated when initially activating vertices of $S$. In the Minmax Target Set Reconfiguration problem, for a graph $G$ and its two target sets $X$ and $Y$, we wish to transform $X$ into $Y$ by repeatedly adding or removing a single vertex, using only target sets of $G$, so as to minimize the maximum size of any intermediate target set. We prove that it is NP-hard to approximate Minmax Target Set Reconfiguration within a factor of $2-o\left(\frac{1}{\operatorname{polylog} n}\right)$, where $n$ is the number of vertices. Our result establishes a tight lower bound on approximability of Minmax Target Set Reconfiguration, which admits a $2$-factor approximation algorithm. The proof is based on a gap-preserving reduction from Target Set Selection to Minmax Target Set Reconfiguration, where NP-hardness of approximation for the former problem is proven by Chen (SIAM J. Discrete Math., 2009) and Charikar, Naamad, and Wirth (APPROX/RANDOM 2016).
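The activation process that defines a target set can be simulated directly. The following is a minimal sketch, assuming an undirected graph stored as an adjacency dictionary; the function name `is_target_set` and the data layout are illustrative choices, not taken from the paper:

```python
from collections import deque


def is_target_set(adj, tau, seed):
    """Return True if initially activating `seed` eventually activates every vertex.

    adj  : dict mapping each vertex to a list of its neighbors (undirected graph)
    tau  : dict mapping each vertex to its activation threshold
    seed : iterable of initially activated vertices
    """
    # Vertices with a non-positive threshold activate on their own (assumption).
    active = set(seed) | {v for v in adj if tau[v] <= 0}
    # active_neighbors[v] = number of activated neighbors of v processed so far
    active_neighbors = {v: 0 for v in adj}
    queue = deque(active)
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            active_neighbors[v] += 1
            # An inactive vertex activates once enough neighbors are active.
            if v not in active and active_neighbors[v] >= tau[v]:
                active.add(v)
                queue.append(v)
    return len(active) == len(adj)


# Path a - b - c, where b needs both neighbors active.
adj = {"a": ["b"], "b": ["a", "c"], "c": ["b"]}
tau = {"a": 1, "b": 2, "c": 1}
print(is_target_set(adj, tau, {"a"}))       # False: b never reaches its threshold
print(is_target_set(adj, tau, {"a", "c"}))  # True
```

A reconfiguration sequence in the sense of the abstract would call such a check on every intermediate set, since each step must itself be a target set.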

Channel · TOOLS · Estimation/Estimator · Simplex · Adobe Flash

February 22, 2024

Gilbert-Varshamov Bound for Codes in $L_1$ Metric using Multivariate Analytic Combinatorics

Keshav Goyal, Duc Tu Dao, Mladen Kovačević, Han Mao Kiah

from arxiv, 33 pages, 3 figures, submitted to IEEE Transactions on Information Theory

Analytic combinatorics in several variables refers to a suite of tools that provide sharp asymptotic estimates for certain combinatorial quantities. In this paper, we apply these tools to determine the Gilbert--Varshamov lower bound on the rate of optimal codes in the $L_1$ metric. Several different code spaces are analyzed, including the simplex and the hypercube in $\mathbb{Z}^n$, all of which are inspired by concrete data storage and transmission models such as the sticky insertion channel, the permutation channel, the adjacent transposition (bit-shift) channel, the multilevel flash memory channel, etc.

Functional · Extensibility · Data point · Performer · Example

February 22, 2024

InfFeed: Influence Functions as a Feedback to Improve the Performance of Subjective Tasks

Somnath Banerjee, Maulindu Sarkar, Punyajoy Saha, Binny Mathew, Animesh Mukherjee

from arxiv, Accepted at LREC-COLING 2024 (Long Paper)

Recently, influence functions have emerged as an apparatus for achieving explainability for deep neural models by quantifying the perturbation of individual training instances that might impact a test prediction. Our objectives in this paper are twofold. First, we incorporate influence functions as feedback into the model to improve its performance. Second, in a dataset extension exercise, we use influence functions to automatically identify data points that have been initially `silver' annotated by some existing method and need to be cross-checked (and corrected) by annotators to improve the model performance. To meet these objectives, we introduce InfFeed, which uses influence functions to compute the influential instances for a target instance. Toward the first objective, we adjust the label of the target instance based on the labels of its influencers. In doing this, InfFeed outperforms the state-of-the-art baselines (including LLMs) by a maximum macro F1-score margin of almost 4% for hate speech classification, 3.5% for stance classification, 3% for irony detection, and 2% for sarcasm detection. Toward the second objective, we show that manually re-annotating only those silver-annotated data points in the extension set that have a negative influence can immensely improve the model performance, bringing it very close to the scenario where all the data points in the extension set have gold labels. This allows for a huge reduction in the number of data points that need to be manually annotated, since out of the silver-annotated extension dataset, the influence function scheme picks up ~1/1000 points that need manual correction.

CASE · Scenario · Graph · Path · Lecture notes

February 22, 2024

On $k$-Plane Insertion into Plane Drawings

Julia Katheder, Philipp Kindermann, Fabian Klute, Irene Parada, Ignaz Rutter

We introduce the $k$-Plane Insertion into Plane drawing ($k$-PIP) problem: given a plane drawing of a planar graph $G$ and a set of edges $F$, insert the edges in $F$ into the drawing such that the resulting drawing is $k$-plane. In this paper, we focus on the $1$-PIP scenario. We present a linear-time algorithm for the case that $G$ is a triangulation, while proving NP-completeness for the case that $G$ is biconnected and $F$ forms a path or a matching.

Extensibility · Reducible · Context window · Large language model · MoDELS

February 22, 2024

E$^2$-LLM: Efficient and Extreme Length Extension of Large Language Models

Jiaheng Liu, Zhiqi Bai, Yuanxing Zhang, Chenchen Zhang, Yu Zhang, Ge Zhang, Jiakai Wang, Haoran Que, Yukang Chen, Wenbo Su, Tiezheng Ge, Jie Fu, Wenhu Chen, Bo Zheng

Typically, training LLMs with long context sizes is computationally expensive, requiring extensive training hours and GPU resources. Existing long-context extension methods usually need additional training procedures to support corresponding long-context windows, where long-context training data (e.g., 32k) is needed and high GPU training costs are assumed. To address these issues, we propose an Efficient and Extreme length extension method for Large Language Models, called E$^2$-LLM, with only one training procedure and dramatically reduced computation cost, which also removes the need to collect long-context data. Concretely, first, the training data of E$^2$-LLM only requires a short length (e.g., 4k), which greatly reduces the tuning cost. Second, the training procedure on the short training context window is performed only once, and we can support different evaluation context windows at inference. Third, in E$^2$-LLM, based on RoPE position embeddings, we introduce two different augmentation methods on the scale and position index parameters for different samples in training. This aims to make the model more robust to the different relative differences when directly interpolating the arbitrary context length at inference. Comprehensive experimental results on multiple benchmark datasets demonstrate the effectiveness of E$^2$-LLM on challenging long-context tasks.

3D · 3D reconstruction · Shaping · Decoding · Prompt

February 22, 2024

MVD$^2$: Efficient Multiview 3D Reconstruction for Multiview Diffusion

Xin-Yang Zheng, Hao Pan, Yu-Xiao Guo, Xin Tong, Yang Liu

As a promising 3D generation technique, multiview diffusion (MVD) has received a lot of attention due to its advantages in terms of generalizability, quality, and efficiency. By finetuning pretrained large image diffusion models with 3D data, the MVD methods first generate multiple views of a 3D object based on an image or text prompt and then reconstruct 3D shapes with multiview 3D reconstruction. However, the sparse views and inconsistent details in the generated images make 3D reconstruction challenging. We present MVD$^2$, an efficient 3D reconstruction method for multiview diffusion (MVD) images. MVD$^2$ aggregates image features into a 3D feature volume by projection and convolution and then decodes volumetric features into a 3D mesh. We train MVD$^2$ with 3D shape collections and MVD images prompted by rendered views of 3D shapes. To address the discrepancy between the generated multiview images and ground-truth views of the 3D shapes, we design a simple-yet-efficient view-dependent training scheme. MVD$^2$ improves the 3D generation quality of MVD and is fast and robust to various MVD methods. After training, it can efficiently decode 3D meshes from multiview images within one second. We train MVD$^2$ with Zero-123++ and ObjectVerse-LVIS 3D dataset and demonstrate its superior performance in generating 3D models from multiview images generated by different MVD methods, using both synthetic and real images as prompts.

TAP · Prompt · state-of-the-art · Black box · Large language model

February 21, 2024

Tree of Attacks: Jailbreaking Black-Box LLMs Automatically

Anay Mehrotra, Manolis Zampetakis, Paul Kassianik, Blaine Nelson, Hyrum Anderson, Yaron Singer, Amin Karbasi

from arxiv, An implementation of the presented method is available at //github.com/RICommunity/TAP

While Large Language Models (LLMs) display versatile functionality, they continue to generate harmful, biased, and toxic content, as demonstrated by the prevalence of human-designed jailbreaks. In this work, we present Tree of Attacks with Pruning (TAP), an automated method for generating jailbreaks that only requires black-box access to the target LLM. TAP utilizes an LLM to iteratively refine candidate (attack) prompts using tree-of-thought reasoning until one of the generated prompts jailbreaks the target. Crucially, before sending prompts to the target, TAP assesses them and prunes the ones unlikely to result in jailbreaks. Using tree-of-thought reasoning allows TAP to navigate a large search space of prompts and pruning reduces the total number of queries sent to the target. In empirical evaluations, we observe that TAP generates prompts that jailbreak state-of-the-art LLMs (including GPT4 and GPT4-Turbo) for more than 80% of the prompts using only a small number of queries. Interestingly, TAP is also capable of jailbreaking LLMs protected by state-of-the-art guardrails, e.g., LlamaGuard. This significantly improves upon the previous state-of-the-art black-box method for generating jailbreaks.

Inner product · Diversity · Pairwise · Approximation · Similarity

February 21, 2024

Diversity-Aware $k$-Maximum Inner Product Search Revisited

Qiang Huang, Yanhao Wang, Yiqun Sun, Anthony K. H. Tung

from arxiv, 14 pages, 9 figures, and 5 tables

The $k$-Maximum Inner Product Search ($k$MIPS) serves as a foundational component in recommender systems and various data mining tasks. However, while most existing $k$MIPS approaches prioritize the efficient retrieval of highly relevant items for users, they often neglect an equally pivotal facet of search results: \emph{diversity}. To bridge this gap, we revisit and refine the diversity-aware $k$MIPS (D$k$MIPS) problem by incorporating two well-known diversity objectives -- minimizing the average and maximum pairwise item similarities within the results -- into the original relevance objective. This enhancement, inspired by Maximal Marginal Relevance (MMR), offers users a controllable trade-off between relevance and diversity. We introduce \textsc{Greedy} and \textsc{DualGreedy}, two linear scan-based algorithms tailored for D$k$MIPS. They both achieve data-dependent approximations and, when aiming to minimize the average pairwise similarity, \textsc{DualGreedy} attains an approximation ratio of $1/4$ with an additive term for regularization. To further improve query efficiency, we integrate a lightweight Ball-Cone Tree (BC-Tree) index with the two algorithms. Finally, comprehensive experiments on ten real-world data sets demonstrate the efficacy of our proposed methods, showcasing their capability to efficiently deliver diverse and relevant search results to users.
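The relevance/diversity trade-off can be illustrated with a generic MMR-style greedy linear scan. This is only a sketch under stated assumptions, not the paper's \textsc{Greedy} or \textsc{DualGreedy}: the scoring rule, the trade-off parameter `lam`, and the function name `mmr_greedy` are hypothetical, and diversity is measured here by the average inner product with the items already selected.

```python
def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def mmr_greedy(items, query, k, lam=0.5):
    """Greedily pick k item indices by scanning all remaining items each round.

    Score of a candidate = lam * relevance - (1 - lam) * average similarity
    to the items selected so far (assumed objective, for illustration only).
    """
    selected = []
    remaining = set(range(len(items)))
    while remaining and len(selected) < k:
        def score(i):
            relevance = dot(items[i], query)
            if selected:
                avg_sim = sum(dot(items[i], items[j]) for j in selected) / len(selected)
            else:
                avg_sim = 0.0
            return lam * relevance - (1 - lam) * avg_sim

        # sorted() makes tie-breaking deterministic (lowest index wins).
        best = max(sorted(remaining), key=score)
        selected.append(best)
        remaining.remove(best)
    return selected


items = [(1.0, 0.0), (0.9, 0.1), (0.0, 1.0)]
query = (1.0, 0.2)
print(mmr_greedy(items, query, k=2, lam=1.0))  # [0, 1]: relevance only
print(mmr_greedy(items, query, k=2, lam=0.5))  # [0, 2]: near-duplicate item 1 is skipped
```

With `lam=1.0` the procedure reduces to plain $k$MIPS; lowering `lam` penalizes candidates that resemble items already in the result set.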

Tokenizer · Performance · MoDELS · Mixture-of-experts model · Model evaluation

February 21, 2024

Turn Waste into Worth: Rectifying Top-$k$ Router of MoE

Zhiyuan Zeng, Qipeng Guo, Zhaoye Fei, Zhangyue Yin, Yunhua Zhou, Linyang Li, Tianxiang Sun, Hang Yan, Dahua Lin, Xipeng Qiu

Sparse Mixture of Experts (MoE) models are popular for training large language models due to their computational efficiency. However, the commonly used top-$k$ routing mechanism suffers from redundant computation and memory costs due to unbalanced routing. Some experts overflow, so the excess tokens are dropped, while other experts are vacant and are padded with zeros, negatively impacting model performance. To address the dropped tokens and padding, we propose the Rectify-Router, comprising the Intra-GPU Rectification and the Fill-in Rectification. The Intra-GPU Rectification handles dropped tokens, efficiently routing them to experts within the GPU where they are located to avoid inter-GPU communication. The Fill-in Rectification addresses padding by replacing padding tokens with the tokens that have high routing scores. Our experimental results demonstrate that the Intra-GPU Rectification and the Fill-in Rectification effectively handle dropped tokens and padding, respectively. Furthermore, their combination achieves superior performance, surpassing the accuracy of the vanilla top-1 router by 4.7%.
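The overflow and padding problem is easy to reproduce in a toy setting. The following is a minimal sketch of vanilla top-1 routing with a fixed expert capacity, assuming tokens are admitted in index order; the function name `route_top1` and this first-come-first-served policy are illustrative assumptions, and the sketch shows the problem rather than the Rectify-Router itself.

```python
def route_top1(scores, capacity):
    """Route each token to its highest-scoring expert, subject to a capacity limit.

    scores[t][e] is the routing score of token t for expert e.
    Returns (assignments, dropped, padding):
      assignments[e] : tokens kept by expert e
      dropped        : tokens whose chosen expert was already full
      padding[e]     : number of empty (zero-padded) slots left in expert e
    """
    num_experts = len(scores[0])
    assignments = [[] for _ in range(num_experts)]
    dropped = []
    for t, row in enumerate(scores):
        expert = max(range(num_experts), key=lambda e: row[e])
        if len(assignments[expert]) < capacity:
            assignments[expert].append(t)
        else:
            dropped.append(t)  # overflow: this token is not processed by any expert
    padding = [capacity - len(tokens) for tokens in assignments]
    return assignments, dropped, padding


scores = [[0.9, 0.1], [0.8, 0.2], [0.7, 0.3], [0.4, 0.6]]
assignments, dropped, padding = route_top1(scores, capacity=2)
print(assignments)  # [[0, 1], [3]]
print(dropped)      # [2]
print(padding)      # [0, 1]
```

In this example expert 0 overflows and drops token 2 while expert 1 still has an empty slot, which is the imbalance the two rectification steps in the abstract are designed to address.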