美国式禁忌电影在线观看免费观看,亚洲国产精品成人综合一区,日韩精品一区二区三区视频在线观看,一本大道加勒比香蕉中文在线

The integration of Machine Learning and Artificial Intelligence (ML/AI) into fifth-generation (5G) networks has made evident the limitations of network intelligence with ever-increasing, strenuous requirements for current and next-generation devices. This transition to ubiquitous intelligence demands high connectivity, synchronicity, and end-to-end communication between users and network operators, and will pave the way towards full network automation without human intervention. Intent-based networking is a key factor in the reduction of human actions, roles, and responsibilities while shifting towards novel extraction and interpretation of automated network management. This paper presents the development of a custom Large Language Model (LLM) for 5G and next-generation intent-based networking and provides insights into future LLM developments and integrations to realize end-to-end intent-based networking for fully automated network intelligence.

相關內容

Networking

關注 22

Networking：IFIP International Conferences on Networking。 Explanation：國際網絡會議。 Publisher：IFIP。 SIT：

INFORMS · 回合 · 語言模型化 · Agent · MoDELS ·

2024 年 4 月 16 日

Asking Before Acting: Gather Information in Embodied Decision Making with Language Models

Xiaoyu Chen,Shenao Zhang,Pushi Zhang,Li Zhao,Jianyu Chen

With strong capabilities of reasoning and a broad understanding of the world, Large Language Models (LLMs) have demonstrated immense potential in building versatile embodied decision-making agents capable of executing a wide array of tasks. Nevertheless, when deployed in unfamiliar environments, we show that LLM agents encounter challenges in efficiently gathering essential information, leading to suboptimal performance. Conversely, human individuals often seek additional information from their peers prior to taking action, harnessing external knowledge to avoid unnecessary trial and error. Drawing inspiration from this behavior, we propose \textit{Asking Before Acting} (ABA), a method that empowers the agent to proactively inquire with external sources for pertinent information using natural language during their interactions within the environment. In this way, the agent is able to enhance its efficiency and performance by circumventing potentially laborious steps and combating the difficulties associated with exploration in unfamiliar environments and vagueness of the instructions. We conduct extensive experiments involving a spectrum of environments including text-based household everyday tasks, robot arm manipulation tasks, and real world open domain image based embodied tasks. The experiments involve various models from Vicuna to GPT-4. The results demonstrate that, even with modest prompts modifications, ABA exhibits substantial advantages on both performance and efficiency over baseline LLM agents. Further finetuning ABA with reformulated metadata (ABA-FT) faciliates learning the rationale for asking and allows for additional enhancements especially in tasks that baselines struggle to solve.

SGD · Analysis · CASE · 目標函數 · 散度 ·

2024 年 4 月 11 日

Demystifying Why Local Aggregation Helps: Convergence Analysis of Hierarchical SGD

Jiayi Wang,Shiqiang Wang,Rong-Rong Chen,Mingyue Ji

from arxiv, 36 pages, in AAAI 2022

Hierarchical SGD (H-SGD) has emerged as a new distributed SGD algorithm for multi-level communication networks. In H-SGD, before each global aggregation, workers send their updated local models to local servers for aggregations. Despite recent research efforts, the effect of local aggregation on global convergence still lacks theoretical understanding. In this work, we first introduce a new notion of "upward" and "downward" divergences. We then use it to conduct a novel analysis to obtain a worst-case convergence upper bound for two-level H-SGD with non-IID data, non-convex objective function, and stochastic gradient. By extending this result to the case with random grouping, we observe that this convergence upper bound of H-SGD is between the upper bounds of two single-level local SGD settings, with the number of local iterations equal to the local and global update periods in H-SGD, respectively. We refer to this as the "sandwich behavior". Furthermore, we extend our analytical approach based on "upward" and "downward" divergences to study the convergence for the general case of H-SGD with more than two levels, where the "sandwich behavior" still holds. Our theoretical results provide key insights of why local aggregation can be beneficial in improving the convergence of H-SGD.

狀態空間 · MoDELS · 數據集 · 得分 · Performer ·

2024 年 4 月 11 日

ChangeMamba: Remote Sensing Change Detection with Spatio-Temporal State Space Model

Hongruixuan Chen,Jian Song,Chengxi Han,Junshi Xia,Naoto Yokoya

Convolutional neural networks (CNN) and Transformers have made impressive progress in the field of remote sensing change detection (CD). However, both architectures have inherent shortcomings. Recently, the Mamba architecture, based on state space models, has shown remarkable performance in a series of natural language processing tasks, which can effectively compensate for the shortcomings of the above two architectures. In this paper, we explore for the first time the potential of the Mamba architecture for remote sensing CD tasks. We tailor the corresponding frameworks, called MambaBCD, MambaSCD, and MambaBDA, for binary change detection (BCD), semantic change detection (SCD), and building damage assessment (BDA), respectively. All three frameworks adopt the cutting-edge Visual Mamba architecture as the encoder, which allows full learning of global spatial contextual information from the input images. For the change decoder, which is available in all three architectures, we propose three spatio-temporal relationship modeling mechanisms, which can be naturally combined with the Mamba architecture and fully utilize its attribute to achieve spatio-temporal interaction of multi-temporal features, thereby obtaining accurate change information. On five benchmark datasets, our proposed frameworks outperform current CNN- and Transformer-based approaches without using any complex training strategies or tricks, fully demonstrating the potential of the Mamba architecture in CD tasks. Specifically, we obtained 83.11%, 88.39% and 94.19% F1 scores on the three BCD datasets SYSU, LEVIR-CD+, and WHU-CD; on the SCD dataset SECOND, we obtained 24.11% SeK; and on the BDA dataset xBD, we obtained 81.41% overall F1 score. Further experiments show that our architecture is quite robust to degraded data. The source code will be available in //github.com/ChenHongruixuan/MambaCD

圖 · INFORMS · 變換 · 詞元分析器 · Graph Transformer ·

2024 年 4 月 8 日

Technical Report: The Graph Spectral Token -- Enhancing Graph Transformers with Spectral Information

Zihan Pengmei,Zimu Li

from arxiv, Technical Report. The code is available at //github.com/zpengmei/SubFormer-Spec

Graph Transformers have emerged as a powerful alternative to Message-Passing Graph Neural Networks (MP-GNNs) to address limitations such as over-squashing of information exchange. However, incorporating graph inductive bias into transformer architectures remains a significant challenge. In this report, we propose the Graph Spectral Token, a novel approach to directly encode graph spectral information, which captures the global structure of the graph, into the transformer architecture. By parameterizing the auxiliary [CLS] token and leaving other tokens representing graph nodes, our method seamlessly integrates spectral information into the learning process. We benchmark the effectiveness of our approach by enhancing two existing graph transformers, GraphTrans and SubFormer. The improved GraphTrans, dubbed GraphTrans-Spec, achieves over 10% improvements on large graph benchmark datasets while maintaining efficiency comparable to MP-GNNs. SubFormer-Spec demonstrates strong performance across various datasets.

Networking · 設計 · Integration · Analysis · 情景 ·

2024 年 4 月 8 日

SRAM-PG: Power Delivery Network Benchmarks from SRAM Circuits

Shan Shen,Zhiqiang Liu,Wenjian Yu

from arxiv, Oral presentation at ISQED'24

Designing the power delivery network (PDN) in very large-scale integrated (VLSI) circuits is increasingly important, especially for nowadays low-power integrated circuit (IC) design. In order to ensure that the designed PDN enables a low level of voltage drop and noise which is required for the success of IC design, accurate analysis of PDN is largely demanded and brings a challenge of computation during the whole process of IC design. This promotes the research of efficient and scalable simulation methods for PDN. However, the lack of sufficient public PDN benchmarks hinders the relevant research. % on this aspect since it is hard to conduct a rapid and clear comparison between different approaches to solving this problem. To this end, we construct and release a set of PDN benchmarks (named \emph{SRAM-PG}) from SRAM circuit design in this work. The benchmarks are obtained from realistic and state-of-the-art SRAM designs, following a workflow for generating the post-layout PDN netlists with full RC parasitics. With careful modeling of load currents, the benchmarks reflect the dynamic work mode of the IC and can be used for both transient and DC analysis. The benchmarks are derived from the designs for diverse applications. And, sharing them in the public domain with detailed descriptions would largely benefit the relevant research. The whole set of benchmarks is available at \href{github}{//github.com/ShenShan123/SRAM-PG}.

語言模型化 · MoDELS · 可理解性 · Learning · 教程 ·

2024 年 4 月 4 日

Understanding Language Modeling Paradigm Adaptations in Recommender Systems: Lessons Learned and Open Challenges

Lemei Zhang,Peng Liu,Yashar Deldjoo,Yong Zheng,Jon Atle Gulla

from arxiv, Tutorial held at the 27th European Conference on Artificial Intelligence (ECAI) in Santiago de Compostela, Spain, on October 19-24, 2024

The emergence of Large Language Models (LLMs) has achieved tremendous success in the field of Natural Language Processing owing to diverse training paradigms that empower LLMs to effectively capture intricate linguistic patterns and semantic representations. In particular, the recent "pre-train, prompt and predict" training paradigm has attracted significant attention as an approach for learning generalizable models with limited labeled data. In line with this advancement, these training paradigms have recently been adapted to the recommendation domain and are seen as a promising direction in both academia and industry. This half-day tutorial aims to provide a thorough understanding of extracting and transferring knowledge from pre-trained models learned through different training paradigms to improve recommender systems from various perspectives, such as generality, sparsity, effectiveness and trustworthiness. In this tutorial, we first introduce the basic concepts and a generic architecture of the language modeling paradigm for recommendation purposes. Then, we focus on recent advancements in adapting LLM-related training strategies and optimization objectives for different recommendation tasks. After that, we will systematically introduce ethical issues in LLM-based recommender systems and discuss possible approaches to assessing and mitigating them. We will also summarize the relevant datasets, evaluation metrics, and an empirical study on the recommendation performance of training paradigms. Finally, we will conclude the tutorial with a discussion of open challenges and future directions.

Learning · 路徑 · 優化器 · GROUP · Extensibility ·

2024 年 4 月 4 日

No Panacea in Planning: Algorithm Selection for Suboptimal Multi-Agent Path Finding

Weizhe Chen,Zhihan Wang,Jiaoyang Li,Sven Koenig,Bistra Dilkina

Since more and more algorithms are proposed for multi-agent path finding (MAPF) and each of them has its strengths, choosing the correct one for a specific scenario that fulfills some specified requirements is an important task. Previous research in algorithm selection for MAPF built a standard workflow and showed that machine learning can help. In this paper, we study general solvers for MAPF, which further include suboptimal algorithms. We propose different groups of optimization objectives and learning tasks to handle the new tradeoff between runtime and solution quality. We conduct extensive experiments to show that the same loss can not be used for different groups of optimization objectives, and that standard computer vision models are no worse than customized architecture. We also provide insightful discussions on how feature-sensitive pre-processing is needed for learning for MAPF, and how different learning metrics are correlated to different learning tasks.

大語言模型 · 語言模型化 · AIM · MoDELS · 可辨認的 ·

2024 年 4 月 3 日

Large Language Model for Vulnerability Detection and Repair: Literature Review and Roadmap

Xin Zhou,Sicong Cao,Xiaobing Sun,David Lo

from arxiv, 11 pages

The significant advancements in Large Language Models (LLMs) have resulted in their widespread adoption across various tasks within Software Engineering (SE), including vulnerability detection and repair. Numerous recent studies have investigated the application of LLMs to enhance vulnerability detection and repair tasks. Despite the increasing research interest, there is currently no existing survey that focuses on the utilization of LLMs for vulnerability detection and repair. In this paper, we aim to bridge this gap by offering a systematic literature review of approaches aimed at improving vulnerability detection and repair through the utilization of LLMs. The review encompasses research work from leading SE, AI, and Security conferences and journals, covering 36 papers published at 21 distinct venues. By answering three key research questions, we aim to (1) summarize the LLMs employed in the relevant literature, (2) categorize various LLM adaptation techniques in vulnerability detection, and (3) classify various LLM adaptation techniques in vulnerability repair. Based on our findings, we have identified a series of challenges that still need to be tackled considering existing studies. Additionally, we have outlined a roadmap highlighting potential opportunities that we believe are pertinent and crucial for future research endeavors.

MoDELS · 變換 · 優化器 · Taxonomy · HTTPS ·

2023 年 11 月 21 日

Advancing Transformer Architecture in Long-Context Large Language Models: A Comprehensive Survey

Yunpeng Huang,Jingwei Xu,Zixu Jiang,Junyu Lai,Zenan Li,Yuan Yao,Taolue Chen,Lijuan Yang,Zhou Xin,Xiaoxing Ma

from arxiv, 35 pages, 3 figures, 4 tables

With the bomb ignited by ChatGPT, Transformer-based Large Language Models (LLMs) have paved a revolutionary path toward Artificial General Intelligence (AGI) and have been applied in diverse areas as knowledge bases, human interfaces, and dynamic agents. However, a prevailing limitation exists: many current LLMs, constrained by resources, are primarily pre-trained on shorter texts, rendering them less effective for longer-context prompts, commonly encountered in real-world settings. In this paper, we present a comprehensive survey focusing on the advancement of model architecture in Transformer-based LLMs to optimize long-context capabilities across all stages from pre-training to inference. We firstly delineate and analyze the problems of handling long-context input and output with the current Transformer-based models. Then, we mainly offer a holistic taxonomy to navigate the landscape of Transformer upgrades on architecture to solve these problems. Afterward, we provide the investigation on wildly used evaluation necessities tailored for long-context LLMs, including datasets, metrics, and baseline models, as well as some amazing optimization toolkits like libraries, systems, and compilers to augment LLMs' efficiency and efficacy across different stages. Finally, we further discuss the predominant challenges and potential avenues for future research in this domain. Additionally, we have established a repository where we curate relevant literature with real-time updates at //github.com/Strivin0311/long-llms-learning.

GANs · Taxonomy · Vision · 生成式對抗網絡 · 計算機視覺 ·

2020 年 12 月 21 日

Generative Adversarial Networks in Computer Vision: A Survey and Taxonomy

Zhengwei Wang,Qi She,Tomas E. Ward

from arxiv, Accepted by ACM Computing Surveys, 23 November 2020

Generative adversarial networks (GANs) have been extensively studied in the past few years. Arguably their most significant impact has been in the area of computer vision where great advances have been made in challenges such as plausible image generation, image-to-image translation, facial attribute manipulation and similar domains. Despite the significant successes achieved to date, applying GANs to real-world problems still poses significant challenges, three of which we focus on here. These are: (1) the generation of high quality images, (2) diversity of image generation, and (3) stable training. Focusing on the degree to which popular GAN technologies have made progress against these challenges, we provide a detailed review of the state of the art in GAN-related research in the published scientific literature. We further structure this review through a convenient taxonomy we have adopted based on variations in GAN architectures and loss functions. While several reviews for GANs have been presented to date, none have considered the status of this field based on their progress towards addressing practical challenges relevant to computer vision. Accordingly, we review and critically discuss the most popular architecture-variant, and loss-variant GANs, for tackling these challenges. Our objective is to provide an overview as well as a critical analysis of the status of GAN research in terms of relevant progress towards important computer vision application requirements. As we do this we also discuss the most compelling applications in computer vision in which GANs have demonstrated considerable success along with some suggestions for future research directions. Code related to GAN-variants studied in this work is summarized on //github.com/sheqi/GAN_Review.