Generative Artificial Intelligence (AI) is one of the most exciting developments in Computer Science of the last decade. At the same time, Reinforcement Learning (RL) has emerged as a very successful paradigm for a variety of machine learning tasks. In this survey, we discuss the state of the art, opportunities and open research questions in applying RL to generative AI. In particular, we will discuss three types of applications, namely, RL as an alternative way for generation without specified objectives; as a way for generating outputs while concurrently maximizing an objective function; and, finally, as a way of embedding desired characteristics, which cannot be easily captured by means of an objective function, into the generative process. We conclude the survey with an in-depth discussion of the opportunities and challenges in this fascinating emerging area.
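As a rough illustration of the second application type (generating outputs while maximizing an objective function), the following is a minimal sketch and not a method taken from the survey. It assumes a toy generator with a single logit `theta`, a hypothetical objective (the fraction of ones in the generated sequence), and a plain REINFORCE update with a running-average baseline. All names and hyperparameters are illustrative.

```python
import math
import random

def train_generator(episodes=2000, seq_len=8, lr=0.1, seed=0):
    """Train a one-parameter token generator with REINFORCE.

    Returns the final probability of emitting token 1.
    """
    rng = random.Random(seed)
    theta = 0.0      # single logit: P(token = 1) = sigmoid(theta)
    baseline = 0.5   # running average of rewards, used to reduce variance
    for _ in range(episodes):
        p = 1.0 / (1.0 + math.exp(-theta))
        # Generate a sequence by sampling each token from the current policy.
        tokens = [1 if rng.random() < p else 0 for _ in range(seq_len)]
        # Toy objective (an assumption for this sketch): fraction of ones.
        reward = sum(tokens) / seq_len
        # For a Bernoulli policy, d/dtheta log P(tokens) = sum(token - p).
        grad_log_prob = sum(t - p for t in tokens)
        theta += lr * (reward - baseline) * grad_log_prob
        baseline = 0.9 * baseline + 0.1 * reward
    return 1.0 / (1.0 + math.exp(-theta))

final_p = train_generator()
```

The generator is never told which tokens to emit. It only receives a scalar reward for each sampled sequence, which is the property that makes RL attractive when the objective is not differentiable.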

Related content

Generative Artificial Intelligence

Generative artificial intelligence is an AI technology that uses complex algorithms, models, and rules to learn from large-scale datasets in order to create new, original content. It can create many types of content, including text, images, sound, video, and code, surpassing the data processing and analysis capabilities of traditional software. ChatGPT, released by OpenAI at the end of 2022, marked significant progress for this technology in text generation, and 2023 has been called the breakthrough year for generative AI. The technology is developing rapidly from pure language generation toward multimodal and embodied forms. In image generation, generative systems have made significant progress in interpreting prompts and producing realistic output. Video and audio generation are also developing rapidly, offering new paths toward realizing virtual reality and the metaverse. Generative AI has broad application prospects across industries and fields.

Robustness · Language Modeling · Large Language Models · MoDELS · Prompt

February 26, 2024

RoCoIns: Enhancing Robustness of Large Language Models through Code-Style Instructions

Yuansen Zhang,Xiao Wang,Zhiheng Xi,Han Xia,Tao Gui,Qi Zhang,Xuanjing Huang

from arxiv, Accepted by COLING 2024

Large Language Models (LLMs) have showcased remarkable capabilities in following human instructions. However, recent studies have raised concerns about the robustness of LLMs when prompted with instructions combined with textual adversarial samples. In this paper, drawing inspiration from recent findings that LLMs are sensitive to the design of instructions, we use code-style instructions, which are more structured and less ambiguous, to replace typical natural language instructions. Through this conversion, we provide LLMs with more precise instructions and strengthen their robustness. Moreover, under few-shot scenarios, we propose a novel method to compose in-context demonstrations using both clean and adversarial samples (the adversarial context method) to further boost the robustness of the LLMs. Experiments on eight robustness datasets show that our method consistently outperforms prompting LLMs with natural language instructions. For example, with gpt-3.5-turbo, our method achieves an improvement of 5.68% in test set accuracy and a reduction of 5.66 points in Attack Success Rate (ASR).
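To make the idea of a code-style instruction with mixed clean and adversarial demonstrations more concrete, here is a minimal sketch. The template, the function name `build_code_style_prompt`, and the toy demonstrations are all hypothetical. The abstract does not give the paper's actual prompt format, so this only shows the general shape such a prompt could take.

```python
def build_code_style_prompt(task, labels, demonstrations, query):
    """Render a classification task as a code-like instruction (hypothetical template)."""
    lines = [
        f"# Task: {task}",
        f"LABELS = {labels!r}",
        "",
        "def classify(text: str) -> str:",
        '    """Return exactly one element of LABELS."""',
        "",
    ]
    # Each demonstration becomes an assertion over the pseudo-function.
    for text, label in demonstrations:
        lines.append(f"assert classify({text!r}) == {label!r}")
    # The query is left open for the model to complete.
    lines.append(f"classify({query!r})  # ->")
    return "\n".join(lines)

# Toy data: a clean sample, a perturbed variant with the same label,
# and one sample of the other class.
demos = [
    ("a charming and often affecting journey", "positive"),
    ("a charmimg and oftn affecting journey", "positive"),
    ("the plot is a tedious mess", "negative"),
]
prompt = build_code_style_prompt(
    "sentiment classification",
    ["negative", "positive"],
    demos,
    "an utterly forgettable film",
)
```

Pairing a clean sample with its perturbed variant under the same label is one plausible reading of composing demonstrations from "both clean and adversarial samples". The paper's own selection procedure may differ.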

Identifiable · Question Answering · Cost · MoDELS · Learning

February 26, 2024

Quality Assurance for Artificial Intelligence: A Study of Industrial Concerns, Challenges and Best Practices

Chenyu Wang,Zhou Yang,Ze Shi Li,Daniela Damian,David Lo

Quality Assurance (QA) aims to prevent mistakes and defects in manufactured products and avoid problems when delivering products or services to customers. QA for AI systems, however, poses particular challenges, given their data-driven and non-deterministic nature as well as more complex architectures and algorithms. While there is growing empirical evidence about practices of machine learning in industrial contexts, little is known about the challenges and best practices of quality assurance for AI systems (QA4AI). In this paper, we report on a mixed-method study of QA4AI in industry practice from various countries and companies. Through interviews with fifteen industry practitioners and a validation survey with 50 practitioner responses, we studied the concerns as well as challenges and best practices in ensuring the QA4AI properties reported in the literature, such as correctness, fairness, interpretability and others. Our findings suggest correctness as the most important property, followed by model relevance, efficiency and deployability. In contrast, transferability (applying knowledge learned in one task to another task), security and fairness receive less attention from practitioners than other properties. Challenges and solutions are identified for each QA4AI property. For example, interviewees highlighted the trade-off among latency, cost and accuracy as a challenge for efficiency (latency and cost are both part of the efficiency concern). Solutions such as model compression are proposed. We identified 21 QA4AI practices across the stages of AI development, with 10 practices being well recognized and another 8 practices being marginally agreed upon by the surveyed practitioners.

CNN · Comprehensibility · Convolution · Learning · Vision

February 23, 2024

A Comprehensive Survey of Convolutions in Deep Learning: Applications, Challenges, and Future Trends

Abolfazl Younesi, Mohsen Ansari, MohammadAmin Fazli, Alireza Ejlali, Muhammad Shafique, Jörg Henkel

In today's digital age, Convolutional Neural Networks (CNNs), a subset of Deep Learning (DL), are widely used for various computer vision tasks such as image classification, object detection, and image segmentation. There are numerous types of CNNs designed to meet specific needs and requirements, including 1D, 2D, and 3D CNNs, as well as dilated, grouped, attention, depthwise convolutions, and NAS, among others. Each type of CNN has its unique structure and characteristics, making it suitable for specific tasks. It's crucial to gain a thorough understanding and perform a comparative analysis of these different CNN types to understand their strengths and weaknesses. Furthermore, studying the performance, limitations, and practical applications of each type of CNN can aid in the development of new and improved architectures in the future. We also dive into the platforms and frameworks that researchers utilize for their research or development from various perspectives. Additionally, we explore the main research fields of CNN like 6D vision, generative models, and meta-learning. This survey paper provides a comprehensive examination and comparison of various CNN architectures, highlighting their architectural differences and emphasizing their respective advantages, disadvantages, applications, challenges, and future trends.
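Among the convolution variants listed above, dilation is easy to show in a few lines. The sketch below is a plain-Python illustration rather than code from the survey. It implements a 1D "valid" cross-correlation (the operation deep learning frameworks usually call convolution) with an optional dilation factor. The function name `conv1d` and the sample inputs are illustrative.

```python
def conv1d(signal, kernel, dilation=1):
    """Valid (no padding) 1D cross-correlation with a dilated kernel."""
    # Receptive field of one output element: kernel taps are spaced `dilation` apart.
    span = (len(kernel) - 1) * dilation + 1
    out_len = len(signal) - span + 1
    if out_len <= 0:
        return []
    return [
        sum(signal[i + k * dilation] * w for k, w in enumerate(kernel))
        for i in range(out_len)
    ]

signal = [1, 2, 3, 4, 5]
kernel = [1, 0, -1]
dense = conv1d(signal, kernel)                 # taps at offsets 0, 1, 2
dilated = conv1d(signal, kernel, dilation=2)   # taps at offsets 0, 2, 4
```

With the same three weights, the dilated call covers five input positions instead of three, which is how dilation enlarges the receptive field without adding parameters.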

Everything (software) · Large Language Models · Cognition · Monte Carlo · Monte Carlo Tree Search

February 23, 2024

Everything of Thoughts: Defying the Law of Penrose Triangle for Thought Generation

Ruomeng Ding,Chaoyun Zhang,Lu Wang,Yong Xu,Minghua Ma,Wei Zhang,Si Qin,Saravan Rajmohan,Qingwei Lin,Dongmei Zhang

from arxiv, 17 pages, 5 figures

Recent advancements in Large Language Models (LLMs) have revolutionized decision-making by breaking down complex problems into more manageable language sequences referred to as "thoughts". An effective thought design should consider three key perspectives: performance, efficiency, and flexibility. However, existing thought paradigms can exhibit at most two of these attributes. To address these limitations, we introduce a novel thought prompting approach called "Everything of Thoughts" (XoT) to defy the law of the "Penrose triangle" of existing thought paradigms. XoT leverages pretrained reinforcement learning and Monte Carlo Tree Search (MCTS) to incorporate external domain knowledge into thoughts, thereby enhancing LLMs' capabilities and enabling them to generalize to unseen problems efficiently. Through the utilization of the MCTS-LLM collaborative thought revision framework, this approach autonomously produces high-quality comprehensive cognitive mappings with minimal LLM interactions. Additionally, XoT empowers LLMs to engage in unconstrained thinking, allowing for flexible cognitive mappings for problems with multiple solutions. We evaluate XoT on several challenging multi-solution problem-solving tasks, including Game of 24, 8-Puzzle, and Pocket Cube. Our results demonstrate that XoT significantly outperforms existing approaches. Notably, XoT can yield multiple solutions with just one LLM call, showcasing its remarkable proficiency in addressing complex problems across diverse domains.
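For readers unfamiliar with Game of 24, the sketch below shows why it counts as a multi-solution task: four numbers must be combined with +, -, * and / to reach 24, and many inputs admit several distinct expressions. This is a plain exhaustive search written for illustration. It is not MCTS and not the XoT method, and the function name `solve24` is hypothetical.

```python
from fractions import Fraction
from itertools import combinations

def solve24(nums, target=24):
    """Return the set of fully parenthesized expressions over nums equal to target."""
    solutions = set()

    def search(state):
        # state is a list of (exact value, expression string) pairs.
        if len(state) == 1:
            if state[0][0] == target:
                solutions.add(state[0][1])
            return
        # Pick any two remaining items and combine them with every operator.
        for i, j in combinations(range(len(state)), 2):
            (a, ea), (b, eb) = state[i], state[j]
            rest = [state[k] for k in range(len(state)) if k not in (i, j)]
            candidates = [
                (a + b, f"({ea} + {eb})"),
                (a * b, f"({ea} * {eb})"),
                (a - b, f"({ea} - {eb})"),
                (b - a, f"({eb} - {ea})"),
            ]
            if b != 0:
                candidates.append((a / b, f"({ea} / {eb})"))
            if a != 0:
                candidates.append((b / a, f"({eb} / {ea})"))
            for value, expr in candidates:
                search(rest + [(value, expr)])

    # Fractions keep division exact, so 8 / (3 - 8 / 3) is recognised as 24.
    search([(Fraction(n), str(n)) for n in nums])
    return solutions

easy = solve24([4, 7, 8, 8])
tricky = solve24([3, 3, 8, 8])
impossible = solve24([1, 1, 1, 1])
```

Exhaustive search is feasible here only because the puzzle is tiny. The abstract's point is that guided search such as MCTS can supply this kind of domain knowledge to an LLM at much lower cost on larger problems.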

Comprehensibility · Operation · Random Walk · Performer · Graphics Processing Unit

February 23, 2024

Understanding Oversmoothing in Diffusion-Based GNNs From the Perspective of Operator Semigroup Theory

Weichen Zhao,Chenguang Wang,Xinyan Wang,Congying Han,Tiande Guo,Tianshu Yu

This paper presents a novel study of the oversmoothing issue in diffusion-based Graph Neural Networks (GNNs). Diverging from extant approaches grounded in random walk analysis or particle systems, we approach this problem through operator semigroup theory. This theoretical framework allows us to rigorously prove that oversmoothing is intrinsically linked to the ergodicity of the diffusion operator. This finding further poses a general and mild ergodicity-breaking condition, encompassing the various specific solutions previously offered, thereby presenting a more universal and theoretically grounded approach to mitigating oversmoothing in diffusion-based GNNs. Additionally, we offer a probabilistic interpretation of our theory, forging a link with prior works and broadening the theoretical horizon. Our experimental results reveal that this ergodicity-breaking term effectively mitigates oversmoothing measured by Dirichlet energy, and simultaneously enhances performance in node classification tasks.
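Oversmoothing measured by Dirichlet energy can be seen on a tiny example. The sketch below is an assumption-laden illustration, not the paper's setup. It uses an unnormalized Dirichlet energy (the sum of squared feature differences over edges; other works add a factor of 1/2 or degree normalization) and a simple diffusion step that averages each node with its neighbours on a hypothetical four-node path graph.

```python
def dirichlet_energy(features, edges):
    """Sum of squared feature differences over undirected edges."""
    return sum(
        sum((a - b) ** 2 for a, b in zip(features[i], features[j]))
        for i, j in edges
    )

def diffusion_step(features, edges):
    """Replace each node's features by the mean over itself and its neighbours."""
    n = len(features)
    neighbours = {i: [i] for i in range(n)}   # self-loop keeps the walk aperiodic
    for i, j in edges:
        neighbours[i].append(j)
        neighbours[j].append(i)
    dim = len(features[0])
    return [
        [sum(features[k][d] for k in neighbours[i]) / len(neighbours[i])
         for d in range(dim)]
        for i in range(n)
    ]

edges = [(0, 1), (1, 2), (2, 3)]          # path graph on four nodes
x = [[0.0], [1.0], [2.0], [3.0]]          # one scalar feature per node
energies = [dirichlet_energy(x, edges)]
for _ in range(200):
    x = diffusion_step(x, edges)
    energies.append(dirichlet_energy(x, edges))
```

Because the averaging operator on a connected graph is ergodic, repeated application drives all node features to the same value and the energy toward zero. This is the behaviour the abstract links to ergodicity of the diffusion operator.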

Multimodal · Language Modeling · MoDELS · Comprehensibility · Modality

November 10, 2023

How to Bridge the Gap between Modalities: A Comprehensive Survey on Multimodal Large Language Model

Shezheng Song,Xiaopeng Li,Shasha Li

This review paper explores Multimodal Large Language Models (MLLMs), which integrate Large Language Models (LLMs) like GPT-4 to handle multimodal data such as text and vision. MLLMs demonstrate capabilities like generating image narratives and answering image-based questions, bridging the gap towards real-world human-computer interactions and hinting at a potential pathway to artificial general intelligence. However, MLLMs still face challenges in processing the semantic gap in multimodality, which may lead to erroneous generation, posing potential risks to society. Choosing the appropriate modality alignment method is crucial, as improper methods might require more parameters with limited performance improvement. This paper aims to explore modality alignment methods for LLMs and their existing capabilities. Implementing modality alignment allows LLMs to address environmental issues and enhance accessibility. The study surveys existing modal alignment methods in MLLMs into four groups: (1) Multimodal Converters that change data into something LLMs can understand; (2) Multimodal Perceivers to improve how LLMs perceive different types of data; (3) Tools Assistance for changing data into one common format, usually text; and (4) Data-Driven methods that teach LLMs to understand specific types of data in a dataset. This field is still in a phase of exploration and experimentation, and we will organize and update various existing research methods for multimodal information alignment.

Networking · Lecture Notes · Comprehensibility · AI · Multi-Agent Model

October 11, 2023

Generative Agent-Based Social Networks for Disinformation: Research Opportunities and Open Challenges

Javier Pastor-Galindo,Pantaleone Nespoli,José A. Ruipérez-Valiente

This article presents the affordances that Generative Artificial Intelligence can have in the context of disinformation, one of the major threats to our digitalized society. We present a research framework to generate customized agent-based social networks for disinformation simulations that would enable understanding and evaluation of the phenomenon, and we discuss open challenges.

Learning · Comprehensibility · state-of-the-art · Less · Neural Networks

June 12, 2022

A Survey on Uncertainty Reasoning and Quantification for Decision Making: Belief Theory Meets Deep Learning

Zhen Guo, Zelin Wan, Qisheng Zhang, Xujiang Zhao, Feng Chen, Jin-Hee Cho, Qi Zhang, Lance M. Kaplan, Dong H. Jeong, Audun Jøsang

from arxiv, First four authors contributed equally. Submitted to ACM Computing Surveys

An in-depth understanding of uncertainty is the first step to making effective decisions under uncertainty. Machine/deep learning (ML/DL) has been heavily leveraged to solve complex problems that involve processing high-dimensional data. However, reasoning about and quantifying different types of uncertainty to achieve effective decision-making have been much less explored in ML/DL than in other Artificial Intelligence (AI) domains. In particular, belief/evidence theories have been studied in knowledge representation and reasoning (KRR) since the 1960s to reason about and measure uncertainties to enhance decision-making effectiveness. We found that only a few studies have leveraged the mature uncertainty research in belief/evidence theories in ML/DL to tackle complex problems under different types of uncertainty. In this survey paper, we discuss several popular belief theories and their core ideas for dealing with uncertainty causes and types and quantifying them, along with their applicability in ML/DL. In addition, we discuss three main approaches that leverage belief theories in Deep Neural Networks (DNNs), including Evidential DNNs, Fuzzy DNNs, and Rough DNNs, in terms of their uncertainty causes, types, and quantification methods along with their applicability in diverse problem domains. Based on our in-depth survey, we discuss insights, lessons learned, limitations of the current state-of-the-art bridging belief theories and ML/DL, and finally, future research directions.
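As a small worked example of how a belief theory can quantify uncertainty in a classifier, the sketch below uses the subjective-logic mapping commonly associated with evidential deep learning: per-class evidence e_k gives Dirichlet parameters alpha_k = e_k + 1, strength S = sum(alpha_k), belief b_k = e_k / S, and uncertainty u = K / S. This is one common formulation shown for illustration; the survey may cover other variants. The function name `subjective_opinion` is hypothetical.

```python
def subjective_opinion(evidence):
    """Map non-negative per-class evidence to belief masses and uncertainty."""
    if any(e < 0 for e in evidence):
        raise ValueError("evidence must be non-negative")
    num_classes = len(evidence)
    strength = sum(e + 1.0 for e in evidence)    # Dirichlet strength S = sum(alpha_k)
    beliefs = [e / strength for e in evidence]   # b_k = e_k / S
    uncertainty = num_classes / strength         # u = K / S, so sum(b_k) + u = 1
    return beliefs, uncertainty

no_evidence = subjective_opinion([0, 0, 0])      # nothing observed yet
confident = subjective_opinion([9, 0, 0])        # strong evidence for class 0
```

With no evidence, all mass goes to uncertainty (u = 1). With nine units of evidence for one of three classes, S = 12, so the belief in that class is 0.75 and the remaining 0.25 is uncertainty rather than belief in the other classes.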

Image Captioning · state-of-the-art · Vision · Identifiable · Language Modeling

July 14, 2021

From Show to Tell: A Survey on Image Captioning

Matteo Stefanini,Marcella Cornia,Lorenzo Baraldi,Silvia Cascianelli,Giuseppe Fiameni,Rita Cucchiara

Connecting Vision and Language plays an essential role in Generative Intelligence. For this reason, in the last few years, a large research effort has been devoted to image captioning, i.e. the task of describing images with syntactically and semantically meaningful sentences. Starting from 2015 the task has generally been addressed with pipelines composed of a visual encoding step and a language model for text generation. During these years, both components have evolved considerably through the exploitation of object regions, attributes, and relationships and the introduction of multi-modal connections, fully-attentive approaches, and BERT-like early-fusion strategies. However, regardless of the impressive results obtained, research in image captioning has not reached a conclusive answer yet. This work aims at providing a comprehensive overview and categorization of image captioning approaches, from visual encoding and text generation to training strategies, used datasets, and evaluation metrics. In this respect, we quantitatively compare many relevant state-of-the-art approaches to identify the most impactful technical innovations in image captioning architectures and training strategies. Moreover, many variants of the problem and its open challenges are analyzed and discussed. The final goal of this work is to serve as a tool for understanding the existing state-of-the-art and highlighting the future directions for an area of research where Computer Vision and Natural Language Processing can find an optimal synergy.

Skip Connections · Neural Networks · Optimizer · Linear · Graph

May 10, 2021

Optimization of Graph Neural Networks: Implicit Acceleration by Skip Connections and More Depth

Keyulu Xu,Mozhi Zhang,Stefanie Jegelka,Kenji Kawaguchi

Graph Neural Networks (GNNs) have been studied from the lens of expressive power and generalization. However, their optimization properties are less well understood. We take the first step towards analyzing GNN training by studying the gradient dynamics of GNNs. First, we analyze linearized GNNs and prove that despite the non-convexity of training, convergence to a global minimum at a linear rate is guaranteed under mild assumptions that we validate on real-world graphs. Second, we study what may affect the GNNs' training speed. Our results show that the training of GNNs is implicitly accelerated by skip connections, more depth, and/or a good label distribution. Empirical results confirm that our theoretical results for linearized GNNs align with the training behavior of nonlinear GNNs. Our results provide the first theoretical support for the success of GNNs with skip connections in terms of optimization, and suggest that deep GNNs with skip connections would be promising in practice.