
Chain-of-Thought (CoT) prompting has shown promising performance in various reasoning tasks. Recently, Self-Consistency \citep{wang2023selfconsistency} proposed sampling a diverse set of reasoning chains, which may lead to different answers, and selecting the answer that receives the most votes. In this paper, we propose a novel method that uses backward reasoning to verify candidate answers. We mask a token in the question by ${\bf x}$ and ask the LLM to predict the masked token when a candidate answer is provided via \textit{a simple template}, i.e., ``\textit{\textbf{If we know the answer of the above question is \{a candidate answer\}, what is the value of unknown variable ${\bf x}$?}}'' Intuitively, the LLM is expected to predict the masked token successfully if the provided candidate answer is correct. We further propose FOBAR, which combines forward and backward reasoning to estimate the probability of candidate answers. We conduct extensive experiments on six data sets and three LLMs. Experimental results demonstrate that FOBAR achieves state-of-the-art performance on various reasoning benchmarks.
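To make the backward verification step concrete, below is a minimal Python sketch under our own assumptions: `llm_predict` is a hypothetical stand-in for an LLM call, and the regex-based number extraction is an illustrative heuristic, not part of the paper.

```python
import re

def backward_verify(question: str, masked_value: str, candidate_answer: str, llm_predict) -> bool:
    """Sketch of backward verification: mask a number in the question with x,
    append the template, and check whether the LLM recovers the masked value.
    `llm_predict` is a hypothetical callable returning the model's answer text."""
    # Replace the first occurrence of the chosen number with the unknown x.
    masked_question = question.replace(masked_value, "x", 1)
    prompt = (
        masked_question
        + f"\nIf we know the answer of the above question is {candidate_answer}, "
        + "what is the value of unknown variable x?"
    )
    prediction = llm_predict(prompt)
    # Heuristic (our assumption): take the last number in the model's reply.
    numbers = re.findall(r"-?\d+\.?\d*", prediction)
    return bool(numbers) and numbers[-1] == masked_value
```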

Related Content

Large Language Models (LLMs) can solve a variety of tasks, such as text summarization and mathematical questions, right out of the box, but they are often trained with a single task in mind. Due to high computational costs, the current trend is to use prompt instruction tuning to better adjust monolithic, pretrained LLMs for new -- but often individual -- downstream tasks. Thus, how one would expand prompt tuning to handle -- concomitantly -- heterogeneous tasks and data distributions remains an open question. To address this gap, we suggest the use of \emph{Mixture of Prompts}, or MoPs, associated with smart gating functionality: the latter -- whose design is one of the contributions of this paper -- can identify relevant skills embedded in different groups of prompts and dynamically assign combined experts (i.e., collections of prompts) based on the target task. Additionally, MoPs are empirically agnostic to any model compression technique applied -- for efficiency reasons -- as well as to the instruction data source and task composition. In practice, MoPs can simultaneously mitigate prompt training "interference" in multi-task, multi-source scenarios (e.g., task and data heterogeneity across sources), as well as possible implications of model approximations. As a highlight, MoPs decrease final perplexity by $\sim20\%$ up to $\sim70\%$, as compared to baselines, in the federated scenario, and by $\sim 3\%$ up to $\sim30\%$ in the centralized scenario.
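The gating idea can be illustrated with a short PyTorch sketch; the mean-pooled task summary, the dimensions, and the single linear gate are our illustrative assumptions rather than the paper's exact design.

```python
import torch
import torch.nn as nn

class MixtureOfPrompts(nn.Module):
    """Minimal sketch of prompt mixing with a gating function (our reading of
    MoPs). The pooling-based gate and all dimensions are illustrative."""
    def __init__(self, num_groups: int = 4, prompt_len: int = 8, d_model: int = 512):
        super().__init__()
        # One group of soft prompts per "skill": (num_groups, prompt_len, d_model).
        self.prompts = nn.Parameter(torch.randn(num_groups, prompt_len, d_model) * 0.02)
        # The gate scores each prompt group from a summary of the input.
        self.gate = nn.Linear(d_model, num_groups)

    def forward(self, input_embeds: torch.Tensor) -> torch.Tensor:
        # input_embeds: (batch, seq_len, d_model); mean-pool as the task summary.
        summary = input_embeds.mean(dim=1)                   # (batch, d_model)
        weights = torch.softmax(self.gate(summary), dim=-1)  # (batch, num_groups)
        # Convex combination of prompt groups, prepended to the input sequence.
        mixed = torch.einsum("bg,gld->bld", weights, self.prompts)
        return torch.cat([mixed, input_embeds], dim=1)
```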

Non-orthogonal multiple access (NOMA) is a promising transmission scheme employed at the physical layer to improve spectral efficiency. In this paper, we develop a novel cross-layer approach by employing NOMA at the physical layer and instantly decodable network coding (IDNC) at the network layer in downlink cellular networks. Following this approach, two IDNC packets are selected for each transmission: one designed for all receivers and the other designed only for the strong receivers, which can employ successive interference cancellation (SIC). The IDNC packet selection, the transmission rate adaptation for the two IDNC packets, and the NOMA power allocation are jointly considered to improve the throughput of the network. Given the intractability of the problem, we decouple it into two separate subproblems: the IDNC scheduling, which jointly selects the IDNC packets and the transmission rates given the NOMA power allocation, and the NOMA power allocation given the IDNC scheduling. The IDNC scheduling can be reduced to a maximum weight clique problem, and two heuristic algorithms, named maximum weight vertex (MWV) search and maximum weight path based maximum weight vertex (MWP-MWV) search, are developed to solve the first subproblem. An iterative function evaluation (IFE) approach is proposed to solve the second subproblem. Simulation results are presented to demonstrate the throughput gain of the proposed approach over existing solutions.
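Since the abstract reduces IDNC scheduling to a maximum weight clique problem, a greedy heuristic in the spirit of MWV search can be sketched in a few lines of Python; the greedy rule and the toy instance below are our simplifications, not the paper's exact algorithm.

```python
def mwv_clique(weights, adj):
    """Greedy maximum-weight-vertex heuristic for the maximum weight clique
    problem that IDNC scheduling reduces to. `weights[v]` is the vertex weight
    and `adj[v]` the set of neighbours of v."""
    candidates = set(range(len(weights)))
    clique = []
    while candidates:
        # Pick the heaviest remaining vertex compatible with the current clique.
        v = max(candidates, key=lambda u: weights[u])
        clique.append(v)
        # Keep only vertices adjacent to every member of the current clique.
        candidates = {u for u in candidates if u != v and u in adj[v]}
    return clique

# Toy instance: 4 IDNC candidate vertices with pairwise compatibility edges.
w = [3.0, 5.0, 2.0, 4.0]
adj = {0: {1, 3}, 1: {0, 2, 3}, 2: {1}, 3: {0, 1}}
print(mwv_clique(w, adj))  # -> [1, 3, 0]
```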

We introduce an iterative solver named MINARES for symmetric linear systems $Ax \approx b$, where $A$ is possibly singular. MINARES is based on the symmetric Lanczos process, like MINRES and MINRES-QLP, but it minimizes $\|Ar_k\|$ in each Krylov subspace rather than $\|r_k\|$, where $r_k$ is the current residual vector. When $A$ is symmetric, MINARES minimizes the same quantity $\|Ar_k\|$ as LSMR, but in more relevant Krylov subspaces, and it requires only one matrix-vector product $Av$ per iteration, whereas LSMR would need two. Our numerical experiments with MINRES-QLP and LSMR show that MINARES is a pertinent alternative on consistent symmetric systems and the most suitable Krylov method for inconsistent symmetric systems. We derive properties of MINARES from an equivalent solver named CAR that is to MINARES as CR is to MINRES, is not based on the Lanczos process, and minimizes $\|Ar_k\|$ in the same Krylov subspace as MINARES. We establish that MINARES and CAR generate monotonic $\|x_k - x_{\star}\|$, $\|x_k - x_{\star}\|_A$ and $\|r_k\|$ when $A$ is positive definite.
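The objective MINARES minimizes can be demonstrated naively: build a Krylov basis for $K_k(A, b)$ and pick $x_k$ minimizing $\|Ar_k\|$ by dense least squares. The Python sketch below is for intuition only; the actual algorithm uses short Lanczos-based recurrences and is far more stable and efficient.

```python
import numpy as np

def naive_min_Ar(A, b, k):
    """Naive illustration of the MINARES objective: over the Krylov subspace
    K_k(A, b), pick x_k minimizing ||A r_k|| with r_k = b - A x_k.
    A dense least-squares solve stands in for the Lanczos-based recurrences."""
    n = b.size
    V = np.zeros((n, k))
    V[:, 0] = b / np.linalg.norm(b)
    for j in range(1, k):                      # unorthogonalized Krylov basis
        w = A @ V[:, j - 1]
        V[:, j] = w / np.linalg.norm(w)
    # x = V y;  minimize ||A(b - A V y)|| = ||A b - (A A V) y||.
    y, *_ = np.linalg.lstsq(A @ (A @ V), A @ b, rcond=None)
    x = V @ y
    return x, np.linalg.norm(A @ (b - A @ x))

rng = np.random.default_rng(0)
M = rng.standard_normal((50, 50))
A = (M + M.T) / 2                              # symmetric, possibly indefinite
b = rng.standard_normal(50)
for k in (5, 10, 20):
    print(k, naive_min_Ar(A, b, k)[1])         # ||A r_k|| decreases with k
```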

We propose a volumetric formulation for computing the Optimal Transport problem defined on surfaces in $\mathbb{R}^3$, which arises in disciplines such as optics, computer graphics, and computational science. Instead of directly tackling the original problem on the surface, we define a new Optimal Transport problem on a thin tubular region, $T_{\epsilon}$, adjacent to the surface. This extension offers enhanced flexibility and simplicity for numerical discretization on Cartesian grids. The Optimal Transport mapping and potential function computed on $T_{\epsilon}$ are consistent with the original problem on surfaces. We demonstrate that, with the proposed volumetric approach, it is possible to use simple and straightforward numerical methods to solve Optimal Transport for $\Gamma = \mathbb{S}^2$.
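As a small illustration of the construction, the tubular region $T_{\epsilon}$ around $\Gamma = \mathbb{S}^2$ can be realized as a mask on a Cartesian grid; the grid resolution and $\epsilon$ below are arbitrary choices for the sketch, not values from the paper.

```python
import numpy as np

# Sketch of the tubular region T_eps around Gamma = S^2 on a Cartesian grid.
eps, n = 0.1, 64
ax = np.linspace(-1.5, 1.5, n)
X, Y, Z = np.meshgrid(ax, ax, ax, indexing="ij")
dist = np.sqrt(X**2 + Y**2 + Z**2) - 1.0       # signed distance to the sphere
T_eps = np.abs(dist) < eps                     # boolean mask of the thin tube
print(f"grid points in T_eps: {T_eps.sum()} of {n**3}")
```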

This paper considers the task of linear regression with shuffled labels, i.e., $\mathbf Y = \mathbf \Pi \mathbf X \mathbf B + \mathbf W$, where $\mathbf Y \in \mathbb R^{n\times m}$, $\mathbf \Pi \in \mathbb R^{n\times n}$, $\mathbf X\in \mathbb R^{n\times p}$, $\mathbf B \in \mathbb R^{p\times m}$, and $\mathbf W\in \mathbb R^{n\times m}$, respectively, represent the sensing results, the (unknown or missing) correspondence information, the sensing matrix, the signal of interest, and the additive sensing noise. Given the observation $\mathbf Y$ and sensing matrix $\mathbf X$, we propose a one-step estimator to reconstruct $(\mathbf \Pi, \mathbf B)$. From the computational perspective, our estimator's complexity is $O(n^3 + np^2m)$, which is no greater than the combined complexity of a linear assignment algorithm (e.g., $O(n^3)$) and a least squares algorithm (e.g., $O(np^2 m)$). From the statistical perspective, we divide the minimum $snr$ requirement into four regimes, i.e., the unknown, hard, medium, and easy regimes, and present sufficient conditions for correct permutation recovery under each regime: $(i)$ $snr \geq \Omega(1)$ in the easy regime; $(ii)$ $snr \geq \Omega(\log n)$ in the medium regime; and $(iii)$ $snr \geq \Omega((\log n)^{c_0}\cdot n^{{c_1}/{srank(\mathbf B)}})$ in the hard regime ($c_0, c_1$ are positive constants and $srank(\mathbf B)$ denotes the stable rank of $\mathbf B$). Finally, we provide numerical experiments to confirm the above claims.
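For intuition, the complexity decomposition (one $O(n^3)$ linear assignment plus one $O(np^2m)$ least squares) can be mirrored in a short Python pipeline. We stress that this is a simplified stand-in, not the paper's one-step estimator: the crude initial fit pretending $\mathbf \Pi = \mathbf I$ is our own assumption.

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

def shuffled_regression(Y, X):
    """Illustrative pipeline for Y = Pi X B + W, mirroring the complexity split
    in the abstract (one linear assignment + one least squares). NOT the
    paper's one-step estimator, only a simplified stand-in."""
    # Crude initial fit pretending Pi = I (our assumption).
    B0, *_ = np.linalg.lstsq(X, Y, rcond=None)
    fitted = X @ B0                                   # rows to match against Y
    # Cost of assigning observed row i to fitted row j; O(n^3) assignment.
    cost = ((Y[:, None, :] - fitted[None, :, :]) ** 2).sum(axis=2)
    rows, cols = linear_sum_assignment(cost)
    Pi = np.zeros((Y.shape[0], Y.shape[0]))
    Pi[rows, cols] = 1.0                              # estimated permutation
    # Refit B on unshuffled data: Pi^T Y ~ X B; O(n p^2 m) least squares.
    B, *_ = np.linalg.lstsq(X, Pi.T @ Y, rcond=None)
    return Pi, B
```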

The Geometric Bin Packing (GBP) problem is a generalization of Bin Packing where the input is a set of $d$-dimensional rectangles, and the goal is to pack them into unit $d$-dimensional cubes efficiently. It is NP-hard to obtain a PTAS for the problem, even when $d=2$. For general $d$, the best-known approximation algorithm has an approximation guarantee exponential in $d$, while the best-known hardness of approximation remains the small-constant inapproximability inherited from the case $d=2$. In this paper, we show that the problem cannot be approximated within a $d^{1-\epsilon}$ factor unless NP=ZPP. Recently, $d$-dimensional Vector Bin Packing, a problem closely related to GBP, was shown to be hard to approximate within $\Omega(\log d)$ when $d$ is a fixed constant, using the notion of the Packing Dimension of set families. In this paper, we introduce a geometric analog, the Geometric Packing Dimension of set families. While we fall short of obtaining similar inapproximability results for Geometric Bin Packing when $d$ is fixed, we prove a couple of key properties of the Geometric Packing Dimension which highlight fundamental differences between Geometric Bin Packing and Vector Bin Packing.

Chain-of-Thought (CoT) prompting in large language models (LLMs) has shown promising performance on mathematical reasoning tasks. Recently, Self-Consistency samples a diverse set of reasoning chains with different answers and chooses the answer by majority voting. Though effective, its performance cannot be further improved by sampling more reasoning chains. To address this problem, we propose to integrate backward reasoning into answer verification. We first mask a number in the question by ${\bf x}$. The LLM is then asked to predict the masked number with a candidate answer $A$ embedded in the template: ``If we know the answer to the above question is $\{A\}$, what is the value of unknown variable ${\bf x}$?'' The LLM is expected to predict the masked number successfully if the provided candidate answer is correct. To further improve performance, we propose FOBAR (FOrward-BAckward Reasoning) to combine forward and backward reasoning for verifying candidate answers. Experiments are performed on six standard mathematical data sets and three LLMs (text-davinci-003, GPT-3.5-Turbo, GPT-4). Results show that FOBAR achieves state-of-the-art performance. In particular, FOBAR outperforms Self-Consistency, which uses forward reasoning alone, demonstrating that combining forward and backward reasoning is better. It also outperforms existing verification methods, verifying the effectiveness of the simple template in backward reasoning and of the proposed combination.
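The combination step can be sketched as follows; the weighted geometric mean of the forward vote share and the backward verification rate is our illustrative choice, not necessarily the exact formula used in FOBAR.

```python
from collections import Counter

def fobar_scores(forward_answers, backward_success, alpha=0.5):
    """Sketch of combining forward and backward reasoning scores.
    `forward_answers` is the list of answers from sampled CoT chains;
    `backward_success[A]` is the fraction of backward checks that recover the
    masked number under candidate A. The weighted geometric mean with weight
    `alpha` is our illustrative choice."""
    counts = Counter(forward_answers)
    total = sum(counts.values())
    scores = {}
    for cand, cnt in counts.items():
        p_forward = cnt / total                       # forward vote share
        p_backward = backward_success.get(cand, 0.0)  # backward check rate
        scores[cand] = (p_forward ** alpha) * (p_backward ** (1 - alpha))
    return max(scores, key=scores.get), scores

# Forward voting alone would pick "12"; backward checks flip it to "10".
best, s = fobar_scores(["12", "12", "10", "12", "10"], {"12": 0.2, "10": 0.9})
print(best, s)
```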

Emotion recognition in conversation (ERC) aims to detect the emotion label for each utterance. Motivated by recent studies which have shown that feeding training examples in a meaningful order, rather than randomly, can boost model performance, we propose an ERC-oriented hybrid curriculum learning framework. Our framework consists of two curricula: (1) a conversation-level curriculum (CC); and (2) an utterance-level curriculum (UC). In CC, we construct a difficulty measurer based on the "emotion shift" frequency within a conversation, and the conversations are then scheduled in an "easy to hard" schema according to the difficulty score returned by the measurer. UC is implemented from an emotion-similarity perspective, progressively strengthening the model's ability to identify confusing emotions. With the proposed model-agnostic hybrid curriculum learning strategy, we observe significant performance boosts over a wide range of existing ERC models, and we achieve new state-of-the-art results on four public ERC datasets.
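The conversation-level difficulty measurer admits a direct implementation; the sketch below assumes a conversation is a list of (utterance, emotion_label) pairs and uses the fraction of adjacent label changes as the difficulty score, which is our reading of "emotion shift" frequency.

```python
def emotion_shift_difficulty(conversation):
    """Difficulty measurer sketch based on "emotion shift" frequency: the
    fraction of adjacent utterance pairs whose emotion labels differ.
    `conversation` is a list of (utterance, emotion_label) pairs."""
    labels = [emo for _, emo in conversation]
    if len(labels) < 2:
        return 0.0
    shifts = sum(a != b for a, b in zip(labels, labels[1:]))
    return shifts / (len(labels) - 1)

def easy_to_hard_schedule(conversations):
    # Conversation-level curriculum: train on low-shift dialogues first.
    return sorted(conversations, key=emotion_shift_difficulty)
```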

Deep Learning (DL) is vulnerable to out-of-distribution and adversarial examples, which result in incorrect outputs. To make DL more robust, several post-hoc anomaly detection techniques that detect (and discard) these anomalous samples have recently been proposed. This survey provides a structured and comprehensive overview of research on anomaly detection for DL-based applications. We provide a taxonomy for existing techniques based on their underlying assumptions and adopted approaches. We discuss the various techniques in each category and present the relative strengths and weaknesses of the approaches. Our goal in this survey is to provide a clearer and more accessible understanding of the techniques in the different categories of research on this topic. Finally, we highlight the unsolved research challenges in applying anomaly detection techniques to DL systems and present some high-impact future research directions.

Named entity recognition (NER) is the task of identifying text spans that mention named entities and classifying them into predefined categories such as person, location, and organization. NER serves as the basis for a variety of natural language applications such as question answering, text summarization, and machine translation. Although early NER systems were successful in producing decent recognition accuracy, they often required considerable human effort in carefully designing rules or features. In recent years, deep learning, empowered by continuous real-valued vector representations and semantic composition through nonlinear processing, has been employed in NER systems, yielding state-of-the-art performance. In this paper, we provide a comprehensive review of existing deep learning techniques for NER. We first introduce NER resources, including tagged NER corpora and off-the-shelf NER tools. Then, we systematically categorize existing works based on a taxonomy along three axes: distributed representations for input, context encoder, and tag decoder. Next, we survey the most representative recent applications of deep learning techniques in new NER problem settings and applications. Finally, we present readers with the challenges faced by NER systems and outline future directions in this area.
