Large language models (LLMs) are incredibly powerful at comprehending and generating data in the form of text, but are brittle and error-prone. There has been an advent of toolkits and recipes centered around so-called prompt engineering, the process of asking an LLM to do something via a series of prompts. However, for LLM-powered data processing workflows in particular, optimizing for quality while keeping cost bounded is a tedious, manual process. We put forth a vision for declarative prompt engineering. We view LLMs like crowd workers and leverage ideas from the declarative crowdsourcing literature (including leveraging multiple prompting strategies, ensuring internal consistency, and exploring hybrid LLM/non-LLM approaches) to make prompt engineering a more principled process. Preliminary case studies on sorting, entity resolution, and imputation demonstrate the promise of our approach.
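As a rough illustration of the "internal consistency" idea for the sorting case study, the sketch below treats the LLM as a noisy pairwise comparator, asks each question in both directions, and ranks items by their number of wins. This is not the authors' method: `mock_llm_prefers` is a hypothetical stand-in for a real model call, and the half-vote rule for contradictory answers is an assumption made only for this example.

```python
from itertools import combinations
from typing import Callable, List, Tuple


def mock_llm_prefers(a: str, b: str) -> bool:
    """Hypothetical stand-in for an LLM prompt asking "should a come before b?".

    It orders strings by length, and deliberately contradicts itself for one
    pair to mimic an unreliable model response.
    """
    if (a, b) == ("kiwi", "fig"):
        return True  # contradicts the answer given for ("fig", "kiwi")
    return len(a) < len(b)


def sort_by_pairwise_votes(
    items: List[str], prefers: Callable[[str, str], bool]
) -> Tuple[List[str], List[Tuple[str, str]]]:
    """Rank items by pairwise wins, flagging pairs whose two answers disagree."""
    wins = {item: 0.0 for item in items}
    inconsistent = []
    for a, b in combinations(items, 2):
        forward = prefers(a, b)   # "is a before b?"
        backward = prefers(b, a)  # "is b before a?"
        if forward != backward:
            # Consistent answers: exactly one direction holds.
            wins[a if forward else b] += 1.0
        else:
            # Both or neither direction holds: record it and split the vote.
            inconsistent.append((a, b))
            wins[a] += 0.5
            wins[b] += 0.5
    # Python's sort is stable, so ties keep their input order.
    ranked = sorted(items, key=lambda item: -wins[item])
    return ranked, inconsistent


ranked, flagged = sort_by_pairwise_votes(
    ["banana", "fig", "kiwi", "apple"], mock_llm_prefers
)
print(ranked)   # ['fig', 'kiwi', 'apple', 'banana']
print(flagged)  # [('fig', 'kiwi')]
```

The flagged pairs are the ones a workflow might re-query, escalate to a different prompting strategy, or resolve with a non-LLM rule.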

Related content

Prompt

Multimodal · MoDELS · Output · Performer · Integration

September 28, 2023

Jointly Training Large Autoregressive Multimodal Models

Emanuele Aiello,Lili Yu,Yixin Nie,Armen Aghajanyan,Barlas Oguz

In recent years, advances in the large-scale pretraining of language and text-to-image models have revolutionized the field of machine learning. Yet, integrating these two modalities into a single, robust model capable of generating seamless multimodal outputs remains a significant challenge. To address this gap, we present the Joint Autoregressive Mixture (JAM) framework, a modular approach that systematically fuses existing text and image generation models. We also introduce a specialized, data-efficient instruction-tuning strategy, tailored for mixed-modal generation tasks. Our final instruct-tuned model demonstrates unparalleled performance in generating high-quality multimodal outputs and represents the first model explicitly designed for this purpose.

MoDELS · Language modeling · Performer · Base · Agent

September 28, 2023

Qwen Technical Report

Jinze Bai,Shuai Bai,Yunfei Chu,Zeyu Cui,Kai Dang,Xiaodong Deng,Yang Fan,Wenbin Ge,Yu Han,Fei Huang,Binyuan Hui,Luo Ji,Mei Li,Junyang Lin,Runji Lin,Dayiheng Liu,Gao Liu,Chengqiang Lu,Keming Lu,Jianxin Ma,Rui Men,Xingzhang Ren,Xuancheng Ren,Chuanqi Tan,Sinan Tan,Jianhong Tu,Peng Wang,Shijie Wang,Wei Wang,Shengguang Wu,Benfeng Xu,Jin Xu,An Yang,Hao Yang,Jian Yang,Shusheng Yang,Yang Yao,Bowen Yu,Hongyi Yuan,Zheng Yuan,Jianwei Zhang,Xingxuan Zhang,Yichang Zhang,Zhenru Zhang,Chang Zhou,Jingren Zhou,Xiaohuan Zhou,Tianhang Zhu

from arxiv, 59 pages, 5 figures

Large language models (LLMs) have revolutionized the field of artificial intelligence, enabling natural language processing tasks that were previously thought to be exclusive to humans. In this work, we introduce Qwen, the first installment of our large language model series. Qwen is a comprehensive language model series that encompasses distinct models with varying parameter counts. It includes Qwen, the base pretrained language models, and Qwen-Chat, the chat models finetuned with human alignment techniques. The base language models consistently demonstrate superior performance across a multitude of downstream tasks, and the chat models, particularly those trained using Reinforcement Learning from Human Feedback (RLHF), are highly competitive. The chat models possess advanced tool-use and planning capabilities for creating agent applications, showcasing impressive performance even when compared to bigger models on complex tasks like utilizing a code interpreter. Furthermore, we have developed coding-specialized models, Code-Qwen and Code-Qwen-Chat, as well as mathematics-focused models, Math-Qwen-Chat, which are built upon base language models. These models demonstrate significantly improved performance in comparison with open-source models, and slightly fall behind the proprietary models.

Agent · Cognition · Identifiable · INTERACT · Language modeling

September 27, 2023

Cognitive Architectures for Language Agents

Theodore R. Sumers,Shunyu Yao,Karthik Narasimhan,Thomas L. Griffiths

from arxiv, v2 enriched actionable insights and discussions, and polished abstract and introduction. 18 pages of main content, 12 pages of references, 5 figures. The first two authors contributed equally, order decided by coin flip. A CoALA-based repo of recent work on language agents: //github.com/ysymyth/awesome-language-agents

Recent efforts have augmented large language models (LLMs) with external resources (e.g., the Internet) or internal control flows (e.g., prompt chaining) for tasks requiring grounding or reasoning, leading to a new class of language agents. While these agents have achieved substantial empirical success, we lack a systematic framework to organize existing agents and plan future developments. In this paper, we draw on the rich history of cognitive science and symbolic artificial intelligence to propose Cognitive Architectures for Language Agents (CoALA). CoALA describes a language agent with modular memory components, a structured action space to interact with internal memory and external environments, and a generalized decision-making process to choose actions. We use CoALA to retrospectively survey and organize a large body of recent work, and prospectively identify actionable directions towards more capable agents. Taken together, CoALA contextualizes today's language agents within the broader history of AI and outlines a path towards language-based general intelligence.

Language modeling · Prompt · Performer · Speech recognition · MoDELS

September 27, 2023

Generative Speech Recognition Error Correction with Large Language Models

Chao-Han Huck Yang,Yile Gu,Yi-Chieh Liu,Shalini Ghosh,Ivan Bulyko,Andreas Stolcke

from arxiv, Accepted to IEEE Automatic Speech Recognition and Understanding (ASRU) 2023

We explore the ability of large language models (LLMs) to act as ASR post-processors that perform rescoring and error correction. Our focus is on instruction prompting to let LLMs perform these tasks without fine-tuning, for which we evaluate different prompting schemes, both zero- and few-shot in-context learning, and a novel task-activating prompting (TAP) method that combines instruction and demonstration. Using a pre-trained first-pass system and rescoring output on two out-of-domain tasks (ATIS and WSJ), we show that rescoring only by in-context learning with frozen LLMs achieves results that are competitive with rescoring by domain-tuned LMs. By combining prompting techniques with fine-tuning, we achieve error rates below the N-best oracle level, showcasing the generalization power of the LLMs.

MoDELS · Reducible · Continuity · Mutually independent · Conditionally independent

September 25, 2023

Nonlinear Heterogeneous Bayesian Decentralized Data Fusion

Ofer Dagan,Tycho L. Cinquini,Nisar R. Ahmed

from arxiv, 7 pages, 3 figures, 3 tables, submitted to the IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS 2023). Replaces the previous version to account for reviewers' comments; this is the version presented at IROS 2023

The factor graph decentralized data fusion (FG-DDF) framework was developed for the analysis and exploitation of conditional independence in heterogeneous Bayesian decentralized fusion problems, in which robots update and fuse pdfs over different, but overlapping subsets of random states. This allows robots to efficiently use smaller probabilistic models and sparse message passing to accurately and scalably fuse relevant local parts of a larger global joint state pdf while accounting for data dependencies between robots. Whereas prior work required limiting assumptions about network connectivity and model linearity, this paper relaxes these to explore the applicability and robustness of FG-DDF in more general settings. We develop a new heterogeneous fusion rule which generalizes the homogeneous covariance intersection algorithm for such cases and test it in multi-robot tracking and localization scenarios with non-linear motion/observation models under communication dropouts. Simulation and hardware experiments show that, in practice, the FG-DDF continues to provide consistent filtered estimates under these more practical operating conditions, while reducing computation and communication costs by more than 99%, thus enabling the design of scalable real-world multi-robot systems.
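For readers unfamiliar with the baseline that the abstract says it generalizes, here is a minimal sketch of the standard homogeneous covariance intersection rule, in which two estimates over the same state are fused as a convex combination in information (inverse covariance) form. It is not the paper's heterogeneous rule. The 2x2 restriction, the helper names, and the grid search that picks the weight by minimizing the trace of the fused covariance are all assumptions made to keep the example dependency-free.

```python
from typing import List, Tuple

Matrix = List[List[float]]
Vector = List[float]


def inv2(m: Matrix) -> Matrix:
    """Invert a 2x2 matrix (assumed non-singular)."""
    (a, b), (c, d) = m
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]


def matvec(m: Matrix, v: Vector) -> Vector:
    """Multiply a 2x2 matrix by a length-2 vector."""
    return [m[0][0] * v[0] + m[0][1] * v[1], m[1][0] * v[0] + m[1][1] * v[1]]


def covariance_intersection(
    xa: Vector, Pa: Matrix, xb: Vector, Pb: Matrix, w: float
) -> Tuple[Vector, Matrix]:
    """Fuse (xa, Pa) and (xb, Pb) with weight w in [0, 1].

    P^-1 = w * Pa^-1 + (1 - w) * Pb^-1
    x    = P * (w * Pa^-1 * xa + (1 - w) * Pb^-1 * xb)
    """
    info_a, info_b = inv2(Pa), inv2(Pb)
    info = [
        [w * info_a[i][j] + (1 - w) * info_b[i][j] for j in range(2)]
        for i in range(2)
    ]
    P = inv2(info)
    ya, yb = matvec(info_a, xa), matvec(info_b, xb)
    y = [w * ya[i] + (1 - w) * yb[i] for i in range(2)]
    return matvec(P, y), P


def best_weight(Pa: Matrix, Pb: Matrix, steps: int = 100) -> float:
    """Grid-search the weight that minimizes the trace of the fused covariance."""
    zero = [0.0, 0.0]

    def trace_at(w: float) -> float:
        _, P = covariance_intersection(zero, Pa, zero, Pb, w)
        return P[0][0] + P[1][1]

    return min((i / steps for i in range(steps + 1)), key=trace_at)


Pa = [[1.0, 0.0], [0.0, 4.0]]  # confident in the first state component
Pb = [[4.0, 0.0], [0.0, 1.0]]  # confident in the second state component
w = best_weight(Pa, Pb)
x, P = covariance_intersection([0.0, 0.0], Pa, [2.0, 2.0], Pb, w)
print(w, x, P)
```

Because the rule never assumes the two estimates are independent, it stays conservative when the robots' data share unknown common information, which is the property that makes it a natural starting point for decentralized fusion.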

Multimodal · Anomaly detection · Point cloud · Extensibility · Link

March 1, 2023

Multimodal Industrial Anomaly Detection via Hybrid Fusion

Yue Wang,Jinlong Peng,Jiangning Zhang,Ran Yi,Yabiao Wang,Chengjie Wang

from arxiv, Accepted by CVPR 2023

2D-based industrial anomaly detection has been widely discussed; however, multimodal industrial anomaly detection based on 3D point clouds and RGB images remains largely unexplored. Existing multimodal industrial anomaly detection methods directly concatenate the multimodal features, which leads to a strong disturbance between features and harms the detection performance. In this paper, we propose Multi-3D-Memory (M3DM), a novel multimodal anomaly detection method with a hybrid fusion scheme: firstly, we design an unsupervised feature fusion with patch-wise contrastive learning to encourage the interaction of different modal features; secondly, we use a decision layer fusion with multiple memory banks to avoid loss of information and additional novelty classifiers to make the final decision. We further propose a point feature alignment operation to better align the point cloud and RGB features. Extensive experiments show that our multimodal industrial anomaly detection model outperforms the state-of-the-art (SOTA) methods on both detection and segmentation precision on the MVTec-3D AD dataset. Code is available at //github.com/nomewang/M3DM.

INTERACT · Networking · Attention · Graph · Node

November 21, 2022

Graph Ordering Attention Networks

Michail Chatzianastasis,Johannes F. Lutzeyer,George Dasoulas,Michalis Vazirgiannis

from arxiv, Accepted at AAAI 2023

Graph Neural Networks (GNNs) have been successfully used in many problems involving graph-structured data, achieving state-of-the-art performance. GNNs typically employ a message-passing scheme, in which every node aggregates information from its neighbors using a permutation-invariant aggregation function. Standard well-examined choices such as the mean or sum aggregation functions have limited capabilities, as they are not able to capture interactions among neighbors. In this work, we formalize these interactions using an information-theoretic framework that notably includes synergistic information. Driven by this definition, we introduce the Graph Ordering Attention (GOAT) layer, a novel GNN component that captures interactions between nodes in a neighborhood. This is achieved by learning local node orderings via an attention mechanism and processing the ordered representations using a recurrent neural network aggregator. This design allows us to make use of a permutation-sensitive aggregator while maintaining the permutation-equivariance of the proposed GOAT layer. The GOAT model demonstrates its increased performance in modeling graph metrics that capture complex information, such as the betweenness centrality and the effective size of a node. In practical use-cases, its superior modeling capability is confirmed through its success in several real-world node classification benchmarks.

Confidence · MoDELS · Extensibility · Graph · entity

February 26, 2019

Embedding Uncertain Knowledge Graphs

Xuelu Chen,Muhao Chen,Weijia Shi,Yizhou Sun,Carlo Zaniolo

Embedding models for deterministic Knowledge Graphs (KG) have been extensively studied, with the purpose of capturing latent semantic relations between entities and incorporating the structured knowledge into machine learning. However, there are many KGs that model uncertain knowledge, which typically model the inherent uncertainty of relation facts with a confidence score, and embedding such uncertain knowledge represents an unresolved challenge. The capturing of uncertain knowledge will benefit many knowledge-driven applications such as question answering and semantic search by providing a more natural characterization of the knowledge. In this paper, we propose a novel uncertain KG embedding model UKGE, which aims to preserve both structural and uncertainty information of relation facts in the embedding space. Unlike previous models that characterize relation facts with binary classification techniques, UKGE learns embeddings according to the confidence scores of uncertain relation facts. To further enhance the precision of UKGE, we also introduce probabilistic soft logic to infer confidence scores for unseen relation facts during training. We propose and evaluate two variants of UKGE based on different learning objectives. Experiments are conducted on three real-world uncertain KGs via three tasks, i.e., confidence prediction, relation fact ranking, and relation fact classification. UKGE shows effectiveness in capturing uncertain knowledge by achieving promising results on these tasks, and it consistently outperforms the baselines.

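Graph convolutional neural network/graph convolutional network · Text classification · Graph convolutional network · Graph convolution · Graph

November 13, 2018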

Graph Convolutional Networks for Text Classification

Liang Yao,Chengsheng Mao,Yuan Luo

from arxiv, Accepted by 33rd AAAI Conference on Artificial Intelligence (AAAI 2019)

Text classification is an important and classical problem in natural language processing. There have been a number of studies that applied convolutional neural networks (convolution on regular grid, e.g., sequence) to classification. However, only a limited number of studies have explored the more flexible graph convolutional neural networks (convolution on non-grid, e.g., arbitrary graph) for the task. In this work, we propose to use graph convolutional networks for text classification. We build a single text graph for a corpus based on word co-occurrence and document word relations, then learn a Text Graph Convolutional Network (Text GCN) for the corpus. Our Text GCN is initialized with one-hot representations for words and documents; it then jointly learns the embeddings for both words and documents, as supervised by the known class labels for documents. Our experimental results on multiple benchmark datasets demonstrate that a vanilla Text GCN without any external word embeddings or knowledge outperforms state-of-the-art methods for text classification. On the other hand, Text GCN also learns predictive word and document embeddings. In addition, experimental results show that the improvement of Text GCN over state-of-the-art comparison methods becomes more prominent as we lower the percentage of training data, suggesting the robustness of Text GCN to less training data in text classification.
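To make the "single text graph" concrete, the sketch below builds one graph whose nodes are documents and words, with document-word edges and word-word co-occurrence edges. The abstract does not state how edges are weighted, so the raw counts, the sliding-window size, and the `doc:`/`word:` node naming used here are assumptions for illustration only, not the authors' construction.

```python
from collections import Counter
from typing import Dict, List, Tuple

Edge = Tuple[str, str]


def build_text_graph(docs: List[List[str]], window: int = 2) -> Dict[Edge, int]:
    """Build a corpus-level graph over document nodes and word nodes.

    Document-word edges count how often a word occurs in a document.
    Word-word edges count how many sliding windows contain both words.
    """
    edges: Counter = Counter()
    for doc_id, tokens in enumerate(docs):
        doc_node = f"doc:{doc_id}"
        for token in tokens:
            edges[(doc_node, f"word:{token}")] += 1
        # A document shorter than the window still yields one window.
        for start in range(max(1, len(tokens) - window + 1)):
            span = sorted(set(tokens[start:start + window]))
            for i, left in enumerate(span):
                for right in span[i + 1:]:
                    # Sorted order gives each undirected pair a single key.
                    edges[(f"word:{left}", f"word:{right}")] += 1
    return dict(edges)


corpus = [["graph", "neural", "network"], ["neural", "text"]]
for edge, weight in build_text_graph(corpus).items():
    print(edge, weight)
```

Because "neural" appears in both documents, the two document nodes end up two hops apart through a shared word node, which is the kind of path a graph convolution can propagate label information along.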

Long short-term memory network · Named entity recognition · MoDELS · Better · Gating

May 15, 2018

Chinese NER Using Lattice LSTM

Yue Zhang,Jie Yang

from arxiv, Accepted at ACL 2018 as Long paper

We investigate a lattice-structured LSTM model for Chinese NER, which encodes a sequence of input characters as well as all potential words that match a lexicon. Compared with character-based methods, our model explicitly leverages word and word sequence information. Compared with word-based methods, lattice LSTM does not suffer from segmentation errors. Gated recurrent cells allow our model to choose the most relevant characters and words from a sentence for better NER results. Experiments on various datasets show that lattice LSTM outperforms both word-based and character-based LSTM baselines, achieving the best results.
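The lattice input described above depends on finding every lexicon word that matches a span of the character sequence. A minimal sketch of that matching step follows; it covers only span enumeration, not the LSTM itself. The toy lexicon, the `max_len` cap, and the choice to report only multi-character words (single characters are already in the character sequence) are assumptions made for this example.

```python
from typing import List, Set, Tuple

Span = Tuple[int, int, str]  # (start index, end index exclusive, matched word)


def match_lexicon_words(chars: str, lexicon: Set[str], max_len: int = 4) -> List[Span]:
    """Return every multi-character substring of `chars` found in `lexicon`."""
    matches: List[Span] = []
    for start in range(len(chars)):
        longest_end = min(len(chars), start + max_len)
        for end in range(start + 2, longest_end + 1):
            word = chars[start:end]
            if word in lexicon:
                matches.append((start, end, word))
    return matches


# Hypothetical toy lexicon; a real system would load a large word list.
toy_lexicon = {"南京", "南京市", "市长", "长江", "大桥", "长江大桥"}
for span in match_lexicon_words("南京市长江大桥", toy_lexicon):
    print(span)
```

The overlapping matches (for example "南京市" and "市长" sharing a character) are exactly the segmentation ambiguity that a word-based pipeline must resolve up front, and that a lattice model can instead leave for its gates to weigh.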