丰满人妻被公侵犯高清版_色香欲天天影视综合网八区_日韩在线不卡中文字幕视频_美女裸体无遮挡免费视频_国产一区二区三区资源在线观看_精品久久久久久精品日日_色偷偷91久久综合噜噜噜

Rui Yang,Qingcheng Zeng,Keen You,Yujie Qiao,Lucas Huang,Chia-Chun Hsieh,Benjamin Rosand,Jeremy Goldwasser,Amisha D Dave,Tiarnan D. L. Keenan,Emily Y Chew,Dragomir Radev,Zhiyong Lu,Hua Xu,Qingyu Chen,Irene Li

from arxiv, 5 figures, 4 tables

This study introduces MedGen, a comprehensive natural language processing (NLP) toolkit designed for medical text processing. MedGen is tailored for biomedical researchers and healthcare professionals with an easy-to-use, all-in-one solution that requires minimal programming expertise. It includes (1) Generative Functions: For the first time, MedGen includes four advanced generative functions: question answering, text summarization, text simplification, and machine translation; (2) Basic NLP Functions: MedGen integrates 12 essential NLP functions such as word tokenization and sentence segmentation; and (3) Query and Search Capabilities: MedGen provides user-friendly query and search functions on text corpora. We fine-tuned 32 domain-specific language models, evaluated them thoroughly on 24 established benchmarks and conducted manual reviews with clinicians. Additionally, we expanded our toolkit by introducing query and search functions, while also standardizing and integrating functions from third-party libraries. The toolkit, its models, and associated data are publicly available via //github.com/Yale-LILY/MedGen.

相關內容

Processing（編(bian)程語(yu)言）

關注 121

Processing 是(shi)一門(men)開(kai)源編程語言和(he)(he)與(yu)之(zhi)配套的(de)集成(cheng)開(kai)發環境（IDE）的(de)名稱。Processing 在電子藝術(shu)和(he)(he)視覺設(she)計社區(qu)被用來教授編程基礎，并運用于大量的(de)新媒體和(he)(he)互(hu)動(dong)藝術(shu)作(zuo)品中。

知識 (knowledge) · 解碼 · 輸出 · 約束 · MoDELS ·

2024 年 1 月 19 日

DeepEdit: Knowledge Editing as Decoding with Constraints

Yiwei Wang,Muhao Chen,Nanyun Peng,Kai-Wei Chang

We develop a new perspective of knowledge editing for large language models (LLMs) as decoding with constraints. We propose DeepEdit (Depth-first Search based Progressive Decoding for Knowledge Editing), a neuro-symbolic method that improves knowledge editing with better coherence of reasoning, relevance to the question, and awareness of updated knowledge. DeepEdit can be flexibly applied to all black-box LLMs: it does not require any access to the model parameters, representations, or output vocabulary distributions. DeepEdit progressively produces the high-quality reasoning steps towards effective knowledge editing. It utilizes a depth-first search to revise the LLMs' output, which improves the output's informativeness to the input question and awareness of the updated knowledge. Qualitatively, DeepEdit effectively controls LLMs to produce more succinct reasoning in accord with knowledge editing. Quantitatively, DeepEdit yields significant gains on MQuaKE, a challenging multi-hop question-answering dataset with knowledge editing. We release the source code at //github.com/wangywUST/DeepEdit.

知識 (knowledge) · 邊 · Networking · QoS · 蒸餾 ·

2024 年 1 月 18 日

Tailoring Semantic Communication at Network Edge: A Novel Approach Using Dynamic Knowledge Distillation

Abdullatif Albaseer,Mohamed Abdallah

from arxiv, Accepted for the International Conference on Communications (ICC) 2024

Semantic Communication (SemCom) systems, empowered by deep learning (DL), represent a paradigm shift in data transmission. These systems prioritize the significance of content over sheer data volume. However, existing SemCom designs face challenges when applied to diverse computational capabilities and network conditions, particularly in time-sensitive applications. A key challenge is the assumption that diverse devices can uniformly benefit from a standard, large DL model in SemCom systems. This assumption becomes increasingly impractical, especially in high-speed, high-reliability applications such as industrial automation or critical healthcare. Therefore, this paper introduces a novel SemCom framework tailored for heterogeneous, resource-constrained edge devices and computation-intensive servers. Our approach employs dynamic knowledge distillation (KD) to customize semantic models for each device, balancing computational and communication constraints while ensuring Quality of Service (QoS). We formulate an optimization problem and develop an adaptive algorithm that iteratively refines semantic knowledge on edge devices, resulting in better models tailored to their resource profiles. This algorithm strategically adjusts the granularity of distilled knowledge, enabling devices to maintain high semantic accuracy for precise inference tasks, even under unstable network conditions. Extensive simulations demonstrate that our approach significantly reduces model complexity for edge devices, leading to better semantic extraction and achieving the desired QoS.

Branch · Networking · INFORMS · Performer · 位置編碼 ·

2024 年 1 月 18 日

CMFN: Cross-Modal Fusion Network for Irregular Scene Text Recognition

Jinzhi Zheng,Ruyi Ji,Libo Zhang,Yanjun Wu,Chen Zhao

from arxiv, Accepted to ICONIP 2023

Scene text recognition, as a cross-modal task involving vision and text, is an important research topic in computer vision. Most existing methods use language models to extract semantic information for optimizing visual recognition. However, the guidance of visual cues is ignored in the process of semantic mining, which limits the performance of the algorithm in recognizing irregular scene text. To tackle this issue, we propose a novel cross-modal fusion network (CMFN) for irregular scene text recognition, which incorporates visual cues into the semantic mining process. Specifically, CMFN consists of a position self-enhanced encoder, a visual recognition branch and an iterative semantic recognition branch. The position self-enhanced encoder provides character sequence position encoding for both the visual recognition branch and the iterative semantic recognition branch. The visual recognition branch carries out visual recognition based on the visual features extracted by CNN and the position encoding information provided by the position self-enhanced encoder. The iterative semantic recognition branch, which consists of a language recognition module and a cross-modal fusion gate, simulates the way that human recognizes scene text and integrates cross-modal visual cues for text recognition. The experiments demonstrate that the proposed CMFN algorithm achieves comparable performance to state-of-the-art algorithms, indicating its effectiveness.

大語言模型 · Agent · INTERACT · MoDELS · HTTPS ·

2024 年 1 月 18 日

R-Judge: Benchmarking Safety Risk Awareness for LLM Agents

Tongxin Yuan,Zhiwei He,Lingzhong Dong,Yiming Wang,Ruijie Zhao,Tian Xia,Lizhen Xu,Binglin Zhou,Fangqi Li,Zhuosheng Zhang,Rui Wang,Gongshen Liu

Large language models (LLMs) have exhibited great potential in autonomously completing tasks across real-world applications. Despite this, these LLM agents introduce unexpected safety risks when operating in interactive environments. Instead of centering on LLM-generated content safety in most prior studies, this work addresses the imperative need for benchmarking the behavioral safety of LLM agents within diverse environments. We introduce R-Judge, a benchmark crafted to evaluate the proficiency of LLMs in judging safety risks given agent interaction records. R-Judge comprises 162 agent interaction records, encompassing 27 key risk scenarios among 7 application categories and 10 risk types. It incorporates human consensus on safety with annotated safety risk labels and high-quality risk descriptions. Utilizing R-Judge, we conduct a comprehensive evaluation of 8 prominent LLMs commonly employed as the backbone for agents. The best-performing model, GPT-4, achieves 72.29% in contrast to the human score of 89.38%, showing considerable room for enhancing the risk awareness of LLMs. Notably, leveraging risk descriptions as environment feedback significantly improves model performance, revealing the importance of salient safety risk feedback. Furthermore, we design an effective chain of safety analysis technique to help the judgment of safety risks and conduct an in-depth case study to facilitate future research. R-Judge is publicly available at //github.com/Lordog/R-Judge.

解碼 · 優化器 · MoDELS · 樣本 · state-of-the-art ·

2024 年 1 月 18 日

SpecTr: Fast Speculative Decoding via Optimal Transport

Ziteng Sun,Ananda Theertha Suresh,Jae Hun Ro,Ahmad Beirami,Himanshu Jain,Felix Yu

from arxiv, NeurIPS 2023

Autoregressive sampling from large language models has led to state-of-the-art results in several natural language tasks. However, autoregressive sampling generates tokens one at a time making it slow, and even prohibitive in certain tasks. One way to speed up sampling is $\textit{speculative decoding}$: use a small model to sample a $\textit{draft}$ (block or sequence of tokens), and then score all tokens in the draft by the large language model in parallel. A subset of the tokens in the draft are accepted (and the rest rejected) based on a statistical method to guarantee that the final output follows the distribution of the large model. In this work, we provide a principled understanding of speculative decoding through the lens of optimal transport (OT) with $\textit{membership cost}$. This framework can be viewed as an extension of the well-known $\textit{maximal-coupling}$ problem. This new formulation enables us to generalize the speculative decoding method to allow for a set of $k$ candidates at the token-level, which leads to an improved optimal membership cost. We show that the optimal draft selection algorithm (transport plan) can be computed via linear programming, whose best-known runtime is exponential in $k$. We then propose a valid draft selection algorithm whose acceptance probability is $(1-1/e)$-optimal multiplicatively. Moreover, it can be computed in time almost linear with size of domain of a single token. Using this $new draft selection$ algorithm, we develop a new autoregressive sampling algorithm called $\textit{SpecTr}$, which provides speedup in decoding while ensuring that there is no quality degradation in the decoded output. We experimentally demonstrate that for state-of-the-art large language models, the proposed approach achieves a wall clock speedup of 2.13X, a further 1.37X speedup over speculative decoding on standard benchmarks.

Projection · Networking · 估計/估計量 · Learning · 平滑 ·

2024 年 1 月 17 日

Intensity Profile Projection: A Framework for Continuous-Time Representation Learning for Dynamic Networks

Alexander Modell,Ian Gallagher,Emma Ceccherini,Nick Whiteley,Patrick Rubin-Delanchy

from arxiv, 38 pages, 10 figures

We present a new representation learning framework, Intensity Profile Projection, for continuous-time dynamic network data. Given triples $(i,j,t)$, each representing a time-stamped ($t$) interaction between two entities ($i,j$), our procedure returns a continuous-time trajectory for each node, representing its behaviour over time. The framework consists of three stages: estimating pairwise intensity functions, e.g. via kernel smoothing; learning a projection which minimises a notion of intensity reconstruction error; and constructing evolving node representations via the learned projection. The trajectories satisfy two properties, known as structural and temporal coherence, which we see as fundamental for reliable inference. Moreoever, we develop estimation theory providing tight control on the error of any estimated trajectory, indicating that the representations could even be used in quite noise-sensitive follow-on analyses. The theory also elucidates the role of smoothing as a bias-variance trade-off, and shows how we can reduce the level of smoothing as the signal-to-noise ratio increases on account of the algorithm `borrowing strength' across the network.

MoDELS · Taxonomy · 語言模型化 · 可理解性 · Performance ·

2023 年 9 月 2 日

Explainability for Large Language Models: A Survey

Haiyan Zhao,Hanjie Chen,Fan Yang,Ninghao Liu,Huiqi Deng,Hengyi Cai,Shuaiqiang Wang,Dawei Yin,Mengnan Du

Large language models (LLMs) have demonstrated impressive capabilities in natural language processing. However, their internal mechanisms are still unclear and this lack of transparency poses unwanted risks for downstream applications. Therefore, understanding and explaining these models is crucial for elucidating their behaviors, limitations, and social impacts. In this paper, we introduce a taxonomy of explainability techniques and provide a structured overview of methods for explaining Transformer-based language models. We categorize techniques based on the training paradigms of LLMs: traditional fine-tuning-based paradigm and prompting-based paradigm. For each paradigm, we summarize the goals and dominant approaches for generating local explanations of individual predictions and global explanations of overall model knowledge. We also discuss metrics for evaluating generated explanations, and discuss how explanations can be leveraged to debug models and improve performance. Lastly, we examine key challenges and emerging opportunities for explanation techniques in the era of LLMs in comparison to conventional machine learning models.

INFORMS · 學成 · 強化學習 · 分離的 · state-of-the-art ·

2021 年 2 月 7 日

MetaCURE: Meta Reinforcement Learning with Empowerment-Driven Exploration

Jin Zhang,Jianhao Wang,Hao Hu,Tong Chen,Yingfeng Chen,Changjie Fan,Chongjie Zhang

Meta reinforcement learning (meta-RL) extracts knowledge from previous tasks and achieves fast adaptation to new tasks. Despite recent progress, efficient exploration in meta-RL remains a key challenge in sparse-reward tasks, as it requires quickly finding informative task-relevant experiences in both meta-training and adaptation. To address this challenge, we explicitly model an exploration policy learning problem for meta-RL, which is separated from exploitation policy learning, and introduce a novel empowerment-driven exploration objective, which aims to maximize information gain for task identification. We derive a corresponding intrinsic reward and develop a new off-policy meta-RL framework, which efficiently learns separate context-aware exploration and exploitation policies by sharing the knowledge of task inference. Experimental evaluation shows that our meta-RL method significantly outperforms state-of-the-art baselines on various sparse-reward MuJoCo locomotion tasks and more complex sparse-reward Meta-World tasks.

圖卷積神經網絡/圖卷積網絡 · 圖 · Networking · 結點 · Neural Networks ·

2019 年 11 月 18 日

EvolveGCN: Evolving Graph Convolutional Networks for Dynamic Graphs

Aldo Pareja,Giacomo Domeniconi,Jie Chen,Tengfei Ma,Toyotaro Suzumura,Hiroki Kanezashi,Tim Kaler,Tao B. Schardl,Charles E. Leiserson

from arxiv, AAAI 2020. The code is available at //github.com/IBM/EvolveGCN

Graph representation learning resurges as a trending research subject owing to the widespread use of deep learning for Euclidean data, which inspire various creative designs of neural networks in the non-Euclidean domain, particularly graphs. With the success of these graph neural networks (GNN) in the static setting, we approach further practical scenarios where the graph dynamically evolves. Existing approaches typically resort to node embeddings and use a recurrent neural network (RNN, broadly speaking) to regulate the embeddings and learn the temporal dynamics. These methods require the knowledge of a node in the full time span (including both training and testing) and are less applicable to the frequent change of the node set. In some extreme scenarios, the node sets at different time steps may completely differ. To resolve this challenge, we propose EvolveGCN, which adapts the graph convolutional network (GCN) model along the temporal dimension without resorting to node embeddings. The proposed approach captures the dynamism of the graph sequence through using an RNN to evolve the GCN parameters. Two architectures are considered for the parameter evolution. We evaluate the proposed approach on tasks including link prediction, edge classification, and node classification. The experimental results indicate a generally higher performance of EvolveGCN compared with related approaches. The code is available at \url{//github.com/IBM/EvolveGCN}.

圖 · 學成 · 知識圖譜 · FreeBASIC · 強化學習 ·

2018 年 1 月 8 日

DeepPath: A Reinforcement Learning Method for Knowledge Graph Reasoning

Wenhan Xiong,Thien Hoang,William Yang Wang

We study the problem of learning to reason in large scale knowledge graphs (KGs). More specifically, we describe a novel reinforcement learning framework for learning multi-hop relational paths: we use a policy-based agent with continuous states based on knowledge graph embeddings, which reasons in a KG vector space by sampling the most promising relation to extend its path. In contrast to prior work, our approach includes a reward function that takes the accuracy, diversity, and efficiency into consideration. Experimentally, we show that our proposed method outperforms a path-ranking based algorithm and knowledge graph embedding methods on Freebase and Never-Ending Language Learning datasets.