
MoDELS · Language Modeling · Performer · Reducible · Composite Data ·

August 3, 2023

Local Large Language Models for Complex Structured Medical Tasks

V. K. Cody Bumgardner,Aaron Mullen,Sam Armstrong,Caylin Hickey,Jeff Talbert

from arxiv, 12 pages, Preprint of an article submitted for consideration in Pacific Symposium on Biocomputing © 2024 World Scientific Publishing Company //www.worldscientific.com/

This paper introduces an approach that combines the language reasoning capabilities of large language models (LLMs) with the benefits of local training to tackle complex, domain-specific tasks. Specifically, the authors demonstrate their approach by extracting structured condition codes from pathology reports. The proposed approach utilizes local LLMs, which can be fine-tuned to respond to specific generative instructions and provide structured outputs. The authors collected a dataset of over 150k uncurated surgical pathology reports, containing gross descriptions, final diagnoses, and condition codes. They trained different model architectures, including LLaMA, BERT and LongFormer and evaluated their performance. The results show that the LLaMA-based models significantly outperform BERT-style models across all evaluated metrics, even with extremely reduced precision. The LLaMA models performed especially well with large datasets, demonstrating their ability to handle complex, multi-label tasks. Overall, this work presents an effective approach for utilizing LLMs to perform domain-specific tasks using accessible hardware, with potential applications in the medical domain, where complex data extraction and classification are required.

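Related content

MoDELS

The ACM/IEEE 23rd International Conference on Model Driven Engineering Languages and Systems (MODELS) is the premier conference series for model-driven software and systems engineering, organized with the support of ACM SIGSOFT and IEEE TCSE. Since 1998, MODELS has covered all aspects of modeling, from languages and methods to tools and applications. Its attendees come from diverse backgrounds, including researchers, academics, engineers, and industrial professionals. MODELS 2019 is a forum where participants can exchange cutting-edge research results and innovative practical experience around modeling and model-driven software and systems. This year's edition will give the modeling community an opportunity to further advance the foundations of modeling and to present innovative applications of modeling in emerging areas such as cyber-physical systems, embedded systems, socio-technical systems, cloud computing, big data, machine learning, security, open source, and sustainability. Official website:

MoDELS · Language Modeling · Performer · Integration ·

September 26, 2023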

Connecting Speech Encoder and Large Language Model for ASR

Wenyi Yu,Changli Tang,Guangzhi Sun,Xianzhao Chen,Tian Tan,Wei Li,Lu Lu,Zejun Ma,Chao Zhang

The impressive capability and versatility of large language models (LLMs) have aroused increasing attention in automatic speech recognition (ASR), with several pioneering studies attempting to build integrated ASR models by connecting a speech encoder with an LLM. This paper presents a comparative study of three commonly used structures as connectors, including fully connected layers, multi-head cross-attention, and Q-Former. Speech encoders from the Whisper model series as well as LLMs from the Vicuna model series with different model sizes were studied. Experiments were performed on the commonly used LibriSpeech, Common Voice, and GigaSpeech datasets, where the LLMs with Q-Formers demonstrated consistent and considerable word error rate (WER) reductions over LLMs with other connector structures. Q-Former-based LLMs can generalise well to out-of-domain datasets, where 12% relative WER reductions over the Whisper baseline ASR model were achieved on the Eval2000 test set without using any in-domain training data from Switchboard. Moreover, a novel segment-level Q-Former is proposed to enable LLMs to recognise speech segments with a duration exceeding the limitation of the encoders, which results in 17% relative WER reductions over other connector structures on 90-second-long speech data.
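
The abstract above reports results as relative word error rate (WER) reductions. As a minimal sketch of what those numbers mean, assuming the standard definition of WER as word-level edit distance divided by reference length, the two quantities can be computed as follows. The function names and the example sentences are illustrative and do not come from the paper.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """WER = (substitutions + deletions + insertions) / number of reference words."""
    ref, hyp = reference.split(), hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # prev[j] holds the edit distance between the reference prefix seen so far
    # and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            curr.append(min(
                prev[j] + 1,                           # deletion
                curr[j - 1] + 1,                       # insertion
                prev[j - 1] + (ref_word != hyp_word),  # substitution or match
            ))
        prev = curr
    return prev[-1] / len(ref)


def relative_reduction(baseline_wer: float, new_wer: float) -> float:
    """Fraction of the baseline WER that the new system removes."""
    return (baseline_wer - new_wer) / baseline_wer


# Illustrative sentences: one substitution and one deletion over six words.
example_wer = word_error_rate("the cat sat on the mat", "the cat sit on mat")
```

Under this definition, a system that lowers WER from 10.0% to 8.8% achieves a 12% relative reduction, even though the absolute change is only 1.2 points.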

Layer · Networking · Tensor · Neural Networks · Jupyter ·

September 25, 2023

Efficient Finite Initialization for Tensorized Neural Networks

Alejandro Mata Ali,Iñigo Perez Delgado,Marina Ristol Roura,Aitor Moreno Fdez. de Leceta

from arxiv, 6 pages, 9 figures

We present a novel method for initializing layers of tensorized neural networks in a way that avoids the explosion of the parameters of the matrix it emulates. The method is intended for layers with a high number of nodes in which there is a connection to the input or output of all or most of the nodes. The core of this method is the use of the Frobenius norm of this layer in an iterative partial form, so that it has to be finite and within a certain range. This norm is efficient to compute, fully or partially for most cases of interest. We apply the method to different layers and check its performance. We create a Python function to run it on an arbitrary layer, available in a Jupyter Notebook in the i3BQuantum repository: //github.com/i3BQuantumTeam/Q4Real/blob/e07c827651ef16bcf74590ab965ea3985143f891/Quantum-Inspired%20Variational%20Methods/Normalization_process.ipynb
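
To make the idea of norm-controlled initialization concrete, here is a minimal sketch under simplifying assumptions. It is not the authors' function from the linked notebook. It assumes a layer stored as a tensor train (a chain of cores of shape `(r_left, d, r_right)`), computes the Frobenius norm of the represented tensor by contracting the chain with itself without ever building the dense tensor, and rescales every core by the same factor so that the norm hits a target value. The abstract describes an iterative, partial use of the norm for large layers; this sketch does the full computation in one pass, which is only viable for small chains. All names are hypothetical.

```python
import numpy as np


def tt_frobenius_norm(cores):
    """Frobenius norm of the tensor represented by a tensor train.

    Each core has shape (r_left, d, r_right) and the boundary ranks are 1.
    The chain is contracted with a copy of itself, one core at a time.
    """
    env = np.ones((1, 1))
    for core in cores:
        # env[c, d] = sum over a, b, i of env[a, b] * core[a, i, c] * core[b, i, d]
        env = np.einsum("ab,aic,bid->cd", env, core, core)
    return float(np.sqrt(env[0, 0]))


def rescale_to_norm(cores, target=1.0):
    """Scale all cores equally so the represented tensor has norm `target`."""
    norm = tt_frobenius_norm(cores)
    factor = (target / norm) ** (1.0 / len(cores))
    return [core * factor for core in cores]


# Small random example: four cores, physical dimension 3, bond dimension 4.
rng = np.random.default_rng(0)
ranks = [1, 4, 4, 4, 1]
cores = [rng.normal(size=(ranks[k], 3, ranks[k + 1])) for k in range(4)]
scaled = rescale_to_norm(cores, target=1.0)
```

Because the represented tensor is multilinear in its cores, multiplying each of the n cores by a factor f multiplies the norm by f to the power n, which is why the n-th root appears in `rescale_to_norm`.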

Transform · INTERACT · Unfolding · FFT · state-of-the-art ·

September 22, 2023

Pixel Adaptive Deep Unfolding Transformer for Hyperspectral Image Reconstruction

Miaoyu Li,Ying Fu,Ji Liu,Yulun Zhang

from arxiv, ICCV 2023

Hyperspectral Image (HSI) reconstruction has made gratifying progress with the deep unfolding framework by formulating the problem into a data module and a prior module. Nevertheless, existing methods still face the problem of insufficient matching with HSI data. The issues lie in three aspects: 1) a fixed gradient descent step in the data module, although the degradation of HSI is agnostic at the pixel level; 2) an inadequate prior module for the 3D HSI cube; 3) stage interaction that ignores the differences in features at different stages. To address these issues, in this work, we propose a Pixel Adaptive Deep Unfolding Transformer (PADUT) for HSI reconstruction. In the data module, a pixel adaptive descent step is employed to focus on pixel-level agnostic degradation. In the prior module, we introduce the Non-local Spectral Transformer (NST) to emphasize the 3D characteristics of HSI for recovery. Moreover, inspired by the diverse expression of features in different stages and depths, the stage interaction is improved by the Fast Fourier Transform (FFT). Experimental results on both simulated and real scenes exhibit the superior performance of our method compared to state-of-the-art HSI reconstruction methods. The code is released at: //github.com/MyuLi/PADUT.

Processing (programming language) · Compositionality · Operation · Divergence · Invariant ·

September 22, 2023

Encodability Criteria for Quantum Based Systems

Anna Schmitt,Kirstin Peters,Yuxin Deng

from arxiv, preprint for submission to LMCS

Quantum based systems are a relatively new research area for which different modelling languages, including process calculi, are currently under development. Encodings are often used to compare process calculi. Quality criteria are then used to rule out trivial or meaningless encodings. In this new context of quantum based systems, it is necessary to analyse the applicability of these quality criteria and to potentially extend or adapt them. As a first step, we test the suitability of classical criteria for encodings between quantum based languages and discuss new criteria. Concretely, we present an encoding from a language inspired by CQP into a language inspired by qCCS. We show that this encoding satisfies compositionality, name invariance (for channel and qubit names), operational correspondence, divergence reflection, success sensitiveness, and that it preserves the size of quantum registers. Then we show that there is no encoding from qCCS into CQP that is compositional, operationally corresponding, and success sensitive.

MoDELS · Learning · Language Modeling · Diversity · Identifiable ·

September 22, 2023

Learning to Diversify Neural Text Generation via Degenerative Model

Jimin Hong,ChaeHun Park,Jaegul Choo

from arxiv, IJCNLP-AACL2023 Findings, 10 pages

Neural language models often fail to generate diverse and informative texts, limiting their applicability in real-world problems. While previous approaches have proposed to address these issues by identifying and penalizing undesirable behaviors (e.g., repetition, overuse of frequent words) from language models, we propose an alternative approach based on an observation: models primarily learn attributes within examples that are likely to cause degeneration problems. Based on this observation, we propose a new approach to prevent degeneration problems by training two models. Specifically, we first train a model that is designed to amplify undesirable patterns. We then enhance the diversity of the second model by focusing on patterns that the first model fails to learn. Extensive experiments on two tasks, namely language modeling and dialogue generation, demonstrate the effectiveness of our approach.

Performer · Extensibility · MoDELS · ROUGE · state-of-the-art ·

September 22, 2023

Automatic Answerability Evaluation for Question Generation

Zifan Wang,Kotaro Funakoshi,Manabu Okumura

Conventional automatic evaluation metrics, such as BLEU and ROUGE, developed for natural language generation (NLG) tasks, are based on measuring the n-gram overlap between the generated and reference text. These simple metrics may be insufficient for more complex tasks, such as question generation (QG), which requires generating questions that are answerable by the reference answers. Developing a more sophisticated automatic evaluation metric thus remains an urgent problem in QG research. This work proposes a Prompting-based Metric on ANswerability (PMAN), a novel automatic evaluation metric to assess whether the generated questions are answerable by the reference answers for the QG tasks. Extensive experiments demonstrate that its evaluation results are reliable and align with human evaluations. We further apply our metric to evaluate the performance of QG models, which shows our metric complements conventional metrics. Our implementation of a ChatGPT-based QG model achieves state-of-the-art (SOTA) performance in generating answerable questions.
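
The limitation of n-gram overlap described above can be illustrated with a small sketch. This is a simplified overlap score in the spirit of BLEU precision and ROUGE recall, not the full definition of either metric, and it is not PMAN. The function names and example questions are illustrative assumptions.

```python
from collections import Counter


def ngrams(tokens, n):
    """Multiset of the n-grams in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def ngram_overlap(generated: str, reference: str, n: int = 2):
    """Return (precision, recall) of clipped n-gram matches."""
    gen = ngrams(generated.lower().split(), n)
    ref = ngrams(reference.lower().split(), n)
    matched = sum((gen & ref).values())  # Counter intersection clips counts
    precision = matched / max(sum(gen.values()), 1)
    recall = matched / max(sum(ref.values()), 1)
    return precision, recall


# The generated question shares four of five bigrams with the reference,
# yet it can no longer be answered by the reference answer "Paris".
precision, recall = ngram_overlap(
    "what is the capital of germany",
    "what is the capital of france",
)
```

Both scores come out at 0.8 here even though the generated question asks about a different country, which is the kind of case an answerability-aware metric is meant to catch.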

Language Modeling · MoDELS · Bootstrapping · Mathematics · state-of-the-art ·

September 21, 2023

MetaMath: Bootstrap Your Own Mathematical Questions for Large Language Models

Longhui Yu,Weisen Jiang,Han Shi,Jincheng Yu,Zhengying Liu,Yu Zhang,James T. Kwok,Zhenguo Li,Adrian Weller,Weiyang Liu

from arxiv, Technical Report, Work in Progress. Project Page: //meta-math.github.io/

Large language models (LLMs) have pushed the limits of natural language understanding and exhibited excellent problem-solving ability. Despite the great success, most existing open-source LLMs (e.g., LLaMA-2) are still far from satisfactory for solving mathematical problems due to the complex reasoning procedures. To bridge this gap, we propose MetaMath, a fine-tuned language model that specializes in mathematical reasoning. Specifically, we start by bootstrapping mathematical questions by rewriting the question from multiple perspectives without extra knowledge, which results in a new dataset called MetaMathQA. Then we fine-tune the LLaMA-2 models on MetaMathQA. Experimental results on two popular benchmarks (i.e., GSM8K and MATH) for mathematical reasoning demonstrate that MetaMath outperforms a suite of open-source LLMs by a significant margin. Our MetaMath-7B model achieves 66.4% on GSM8K and 19.4% on MATH, exceeding the state-of-the-art models of the same size by 11.5% and 8.7%. Particularly, MetaMath-70B achieves an accuracy of 82.3% on GSM8K, slightly better than GPT-3.5-Turbo. We release the MetaMathQA dataset, the MetaMath models with different model sizes and the training code for public use.

TOOLS · Performer · Scenario · Inference · Extensibility ·

September 20, 2023

LLM Guided Inductive Inference for Solving Compositional Problems

Abhigya Sodani,Lauren Moos,Matthew Mirman

from arxiv, 5 pages, ICML TEACH Workshop

While large language models (LLMs) have demonstrated impressive performance in question-answering tasks, their performance is limited when the questions require knowledge that is not included in the model's training data and can only be acquired through direct observation or interaction with the real world. Existing methods decompose reasoning tasks through the use of modules invoked sequentially, limiting their ability to answer deep reasoning tasks. We introduce a method, Recursion based extensible LLM (REBEL), which handles open-world, deep reasoning tasks by employing automated reasoning techniques like dynamic planning and forward-chaining strategies. REBEL allows LLMs to reason via recursive problem decomposition and utilization of external tools. The tools that REBEL uses are specified only by natural language description. We further demonstrate REBEL capabilities on a set of problems that require a deeply nested use of external tools in a compositional and conversational setting.

MoDELS · Code · Functional · Automator · Language Modeling ·

September 7, 2023

Automatically Testing Functional Properties of Code Translation Models

Hasan Ferit Eniser,Valentin Wüstholz,Maria Christakis

from arxiv, 13 pages including appendix and references

Large language models are becoming increasingly practical for translating code across programming languages, a process known as transpiling. Even though automated transpilation significantly boosts developer productivity, a key concern is whether the generated code is correct. Existing work initially used manually crafted test suites to test the translations of a small corpus of programs; these test suites were later automated. In contrast, we devise the first approach for automated, functional, property-based testing of code translation models. Our general, user-provided specifications about the transpiled code capture a range of properties, from purely syntactic to purely semantic ones. As shown by our experiments, this approach is very effective in detecting property violations in popular code translation models, and therefore, in evaluating model quality with respect to given properties. We also go a step further and explore the usage scenario where a user simply aims to obtain a correct translation of some code with respect to certain properties without necessarily being concerned about the overall quality of the model. To this purpose, we develop the first property-guided search procedure for code translation models, where a model is repeatedly queried with slightly different parameters to produce alternative and potentially more correct translations. Our results show that this search procedure helps to obtain significantly better code translations.
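
The two ideas in the abstract above, checking a translation against user-provided properties and searching among alternative translations for one that passes, can be sketched in a few lines. This is only an illustration under strong assumptions and is not the authors' tool. The "translations" are hand-written Python stand-ins for model outputs, the properties are purely semantic, and every name here is hypothetical.

```python
import random


def source_program(xs):
    """Reference behaviour of the hypothetical source-language program."""
    return sorted(set(xs))


def candidate_faulty(xs):
    """Stand-in for a wrong translation: it sorts but keeps duplicates."""
    return sorted(xs)


def candidate_ok(xs):
    """Stand-in for a correct translation."""
    return sorted(set(xs))


# User-provided properties, each a predicate over (input, translated output).
properties = {
    "same output as source": lambda xs, out: out == source_program(xs),
    "output is sorted": lambda xs, out: out == sorted(out),
}


def find_violation(translated, properties, trials=200, seed=0):
    """Return (property name, input) for the first violation, or None."""
    rng = random.Random(seed)
    for _ in range(trials):
        xs = rng.choices(range(5), k=rng.randint(0, 10))
        for name, holds in properties.items():
            if not holds(xs, translated(list(xs))):
                return name, xs
    return None


def first_passing(candidates, properties):
    """Property-guided search: return the first candidate with no violation."""
    for candidate in candidates:
        if find_violation(candidate, properties) is None:
            return candidate
    return None


chosen = first_passing([candidate_faulty, candidate_ok], properties)
```

In the setting the abstract describes, the candidate list would come from repeatedly querying a translation model with slightly different parameters. Here it is hard-coded so that the example stays self-contained.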

Estimation/Estimator · contrastive · INFORMS · Mutual Information · Representation Learning ·

June 25, 2021

Decomposed Mutual Information Estimation for Contrastive Representation Learning

Alessandro Sordoni,Nouha Dziri,Hannes Schulz,Geoff Gordon,Phil Bachman,Remi Tachet

from arxiv, ICML 2021

Recent contrastive representation learning methods rely on estimating mutual information (MI) between multiple views of an underlying context. E.g., we can derive multiple views of a given image by applying data augmentation, or we can split a sequence into views comprising the past and future of some step in the sequence. Contrastive lower bounds on MI are easy to optimize, but have a strong underestimation bias when estimating large amounts of MI. We propose decomposing the full MI estimation problem into a sum of smaller estimation problems by splitting one of the views into progressively more informed subviews and by applying the chain rule on MI between the decomposed views. This expression contains a sum of unconditional and conditional MI terms, each measuring modest chunks of the total MI, which facilitates approximation via contrastive bounds. To maximize the sum, we formulate a contrastive lower bound on the conditional MI which can be approximated efficiently. We refer to our general approach as Decomposed Estimation of Mutual Information (DEMI). We show that DEMI can capture a larger amount of MI than standard non-decomposed contrastive bounds in a synthetic setting, and learns better representations in a vision domain and for dialogue generation.
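
The decomposition described above rests on the chain rule for mutual information. As a sketch, with notation assumed here rather than taken from the paper, write the view $x$ as a pair of subviews $(x', x'')$ and let $y$ be the other view:

```latex
\begin{align}
I(x; y) &= I(x', x''; y) \\
        &= \underbrace{I(x'; y)}_{\text{unconditional term}}
         + \underbrace{I(x''; y \mid x')}_{\text{conditional term}}
\end{align}
```

Each term on the right carries only part of the total MI, so a contrastive lower bound applied to each term separately is less affected by the underestimation bias that the abstract mentions for large MI values.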
