from arxiv, To be published in the proceedings of the P3HPC workshop, hosted at SC23 (International Conference for High Performance Computing, Networking, Storage, and Analysis)

In recent history, GPUs became a key driver of compute performance in HPC. With the installation of the Frontier supercomputer, they became the enablers of the Exascale era; further largest-scale installations are in progress (Aurora, El Capitan, JUPITER). But the early-day dominance by NVIDIA and their CUDA programming model has changed: The current HPC GPU landscape features three vendors (AMD, Intel, NVIDIA), each with native and derived programming models. The choices are ample, but not all models are supported on all platforms, especially if support for Fortran is needed; in addition, some restrictions might apply. It is hard for scientific programmers to navigate this abundance of choices and limits. This paper gives a guide by matching the GPU platforms with supported programming models, presented in a concise table and further elaborated in detailed comments. An assessment is made regarding the level of support of a model on a platform.

Related content

MoDELS

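The 23rd ACM/IEEE International Conference on Model Driven Engineering Languages and Systems (MODELS) is the premier conference series for model-driven software and systems engineering, organized with the support of ACM SIGSOFT and IEEE TCSE. Since 1998, MODELS has covered all aspects of modeling, from languages and methods to tools and applications. Its attendees come from diverse backgrounds, including researchers, academics, engineers, and industry professionals. MODELS 2019 is a forum where participants can exchange cutting-edge research results and innovative practical experience around modeling and model-driven software and systems. This year's edition will give the modeling community an opportunity to further advance the foundations of modeling and to present innovative applications of modeling in emerging areas such as cyber-physical systems, embedded systems, socio-technical systems, cloud computing, big data, machine learning, security, and open source, as well as sustainability. Official website link:

Anomaly detection · Supervision · Scenario · Neighborhood aggregation ·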

November 16, 2023

GADBench: Revisiting and Benchmarking Supervised Graph Anomaly Detection

Jianheng Tang,Fengrui Hua,Ziqi Gao,Peilin Zhao,Jia Li

from arxiv, NeurIPS 2023 Datasets and Benchmarks Track camera ready version

With a long history of traditional Graph Anomaly Detection (GAD) algorithms and recently popular Graph Neural Networks (GNNs), it is still not clear (1) how they perform under a standard comprehensive setting, (2) whether GNNs can outperform traditional algorithms such as tree ensembles, and (3) how efficient they are on large-scale graphs. In response, we introduce GADBench -- a benchmark tool dedicated to supervised anomalous node detection in static graphs. GADBench facilitates a detailed comparison across 29 distinct models on ten real-world GAD datasets, encompassing thousands to millions ($\sim$6M) nodes. Our main finding is that tree ensembles with simple neighborhood aggregation can outperform the latest GNNs tailored for the GAD task. We shed light on the current progress of GAD, setting a robust groundwork for subsequent investigations in this domain. GADBench is open-sourced at //github.com/squareRoot3/GADBench.
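
The idea of "simple neighborhood aggregation" feeding a tree ensemble can be illustrated with a minimal sketch. This is not the GADBench implementation. The function name `aggregate_neighbors`, the choice of mean aggregation, and the undirected edge list are assumptions made here for illustration. Each node's feature vector is extended with the mean of its neighbors' features, once per hop, and the resulting table could then be passed to any tree ensemble (for example a random forest or gradient-boosted trees).

```python
from statistics import mean


def aggregate_neighbors(features, edges, hops=1):
    """Append mean-aggregated neighbor features to each node's own features.

    features: list of per-node feature lists (all the same length)
    edges:    list of (u, v) pairs, treated as undirected (an assumption)
    hops:     number of aggregation rounds to append
    """
    n = len(features)
    neighbors = [[] for _ in range(n)]
    for u, v in edges:
        neighbors[u].append(v)
        neighbors[v].append(u)

    current = [list(f) for f in features]    # features aggregated so far
    augmented = [list(f) for f in features]  # output rows, grown per hop
    for _ in range(hops):
        nxt = []
        for node in range(n):
            dim = len(current[node])
            if neighbors[node]:
                agg = [mean(current[m][d] for m in neighbors[node])
                       for d in range(dim)]
            else:
                agg = [0.0] * dim  # isolated node: nothing to aggregate
            nxt.append(agg)
        for node in range(n):
            augmented[node].extend(nxt[node])
        current = nxt
    return augmented


# Path graph 0 - 1 - 2 with one feature per node.
rows = aggregate_neighbors([[1.0], [3.0], [5.0]], [(0, 1), (1, 2)])
# rows == [[1.0, 3.0], [3.0, 3.0], [5.0, 3.0]]
```

The aggregation has no learned parameters, so in this sketch all model capacity sits in the downstream tree ensemble.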

Optimizer · Learning · Deep reinforcement learning · Reinforcement learning · Extensibility ·

November 16, 2023

Short vs. Long-term Coordination of Drones: When Distributed Optimization Meets Deep Reinforcement Learning

Chuhao Qin,Evangelos Pournaras

from arxiv, 14 pages, 13 figures

Swarms of smart drones, with the support of charging technology, can provide compelling sensing capabilities in Smart Cities, such as traffic monitoring and disaster response. Existing approaches, including distributed optimization and deep reinforcement learning (DRL), aim to coordinate drones to achieve cost-effective, high-quality navigation, sensing, and recharging. However, they have distinct challenges: short-term optimization struggles to provide sustained benefits, while long-term DRL lacks scalability, resilience, and flexibility. To bridge this gap, this paper introduces a new progressive approach that encompasses planning and selection based on distributed optimization, as well as DRL-based flying direction scheduling. Extensive experiments with datasets generated from realistic urban mobility demonstrate the outstanding performance of the proposed solution in traffic monitoring compared to three baseline methods.

Prompt · Engineering · Optimizer · Everything (software) · Networking ·

November 15, 2023

Optimizing Mobile-Edge AI-Generated Everything (AIGX) Services by Prompt Engineering: Fundamental, Framework, and Case Study

Yinqiu Liu,Hongyang Du,Dusit Niyato,Jiawen Kang,Shuguang Cui,Xuemin Shen,Ping Zhang

from arxiv, 9 pages, 6 figures

As the next-generation paradigm for content creation, AI-Generated Content (AIGC), i.e., generating content automatically by Generative AI (GAI) based on user prompts, has gained great attention and success recently. With the ever-increasing power of GAI, especially the emergence of Pretrained Foundation Models (PFMs) that contain billions of parameters and prompt engineering methods (i.e., finding the best prompts for the given task), the application range of AIGC is rapidly expanding, covering various forms of information for humans, systems, and networks, such as network designs, channel coding, and optimization solutions. In this article, we present the concept of mobile-edge AI-Generated Everything (AIGX). Specifically, we first review the building blocks of AIGX, the evolution from AIGC to AIGX, as well as practical AIGX applications. Then, we present a unified mobile-edge AIGX framework, which employs edge devices to provide PFM-empowered AIGX services and optimizes such services via prompt engineering. More importantly, we demonstrate that suboptimal prompts lead to poor generation quality, which adversely affects user satisfaction, edge network performance, and resource utilization. Accordingly, we conduct a case study, showcasing how to train an effective prompt optimizer using ChatGPT and investigating how much improvement is possible with prompt engineering in terms of user experience, quality of generation, and network performance.

MIMO · 6G · Performer · Rank · Networking ·

November 15, 2023

MIMO Evolution toward 6G: End-User-Centric Collaborative MIMO

Lung-Sheng Tsai,Shang-Ling Shih,Pei-Kai Liao,Chao-Kai Wen

from arxiv, 7 pages, 5 figures, 1 table. This work has been accepted in IEEE Communications Magazine

In 6G, the trend of transitioning from massive antenna elements to even more massive ones is continued. However, installing additional antennas in the limited space of user equipment (UE) is challenging, resulting in limited capacity scaling gain for end users, despite network side support for increasing numbers of antennas. To address this issue, we propose an end-user-centric collaborative MIMO (UE-CoMIMO) framework that groups several fixed or portable devices to provide a virtual abundance of antennas. This article outlines how advanced L1 relays and conventional relays enable device collaboration to offer diversity, rank, and localization enhancements. We demonstrate through system-level simulations how the UE-CoMIMO approaches lead to significant performance gains. Lastly, we discuss necessary research efforts to make UE-CoMIMO available for 6G and future research directions.

Agent · Language modeling · Learning · MoDELS · Prompt ·

May 25, 2023

Voyager: An Open-Ended Embodied Agent with Large Language Models

Guanzhi Wang,Yuqi Xie,Yunfan Jiang,Ajay Mandlekar,Chaowei Xiao,Yuke Zhu,Linxi Fan,Anima Anandkumar

from arxiv, Project website and open-source codebase: //voyager.minedojo.org/

We introduce Voyager, the first LLM-powered embodied lifelong learning agent in Minecraft that continuously explores the world, acquires diverse skills, and makes novel discoveries without human intervention. Voyager consists of three key components: 1) an automatic curriculum that maximizes exploration, 2) an ever-growing skill library of executable code for storing and retrieving complex behaviors, and 3) a new iterative prompting mechanism that incorporates environment feedback, execution errors, and self-verification for program improvement. Voyager interacts with GPT-4 via blackbox queries, which bypasses the need for model parameter fine-tuning. The skills developed by Voyager are temporally extended, interpretable, and compositional, which compounds the agent's abilities rapidly and alleviates catastrophic forgetting. Empirically, Voyager shows strong in-context lifelong learning capability and exhibits exceptional proficiency in playing Minecraft. It obtains 3.3x more unique items, travels 2.3x longer distances, and unlocks key tech tree milestones up to 15.3x faster than prior SOTA. Voyager is able to utilize the learned skill library in a new Minecraft world to solve novel tasks from scratch, while other techniques struggle to generalize. We open-source our full codebase and prompts at //voyager.minedojo.org/.

Text classification · Graph · Neural Networks · GPU · Networking ·

April 27, 2023

Graph Neural Networks for Text Classification: A Survey

Kunze Wang,Yihao Ding,Soyeon Caren Han

from arxiv, 28 pages

Text Classification is the most essential and fundamental problem in Natural Language Processing. While numerous recent text classification models applied the sequential deep learning technique, graph neural network-based models can directly deal with complex structured text data and exploit global information. Many real text classification applications can be naturally cast into a graph, which captures words, documents, and corpus global features. In this survey, we bring the coverage of methods up to 2023, including corpus-level and document-level graph neural networks. We discuss each of these methods in detail, dealing with the graph construction mechanisms and the graph-based learning process. As well as the technological survey, we look at issues behind and future directions addressed in text classification using graph neural networks. We also cover datasets, evaluation metrics, and experiment design and present a summary of published performance on the publicly available benchmarks. Note that we present a comprehensive comparison between different techniques and identify the pros and cons of various evaluation metrics in this survey.

September 21, 2022

Deep Learning for Medical Image Segmentation: Tricks, Challenges and Future Directions

Dong Zhang,Yi Lin,Hao Chen,Zhuotao Tian,Xin Yang,Jinhui Tang,Kwang Ting Cheng

from arxiv, Under consideration

Over the past few years, the rapid development of deep learning technologies for computer vision has greatly promoted the performance of medical image segmentation (MedISeg). However, the recent MedISeg publications usually focus on presentations of the major contributions (e.g., network architectures, training strategies, and loss functions) while unwittingly ignoring some marginal implementation details (also known as "tricks"), leading to a potential problem of the unfair experimental result comparisons. In this paper, we collect a series of MedISeg tricks for different model implementation phases (i.e., pre-training model, data pre-processing, data augmentation, model implementation, model inference, and result post-processing), and experimentally explore the effectiveness of these tricks on the consistent baseline models. Compared to paper-driven surveys that only blandly focus on the advantages and limitation analyses of segmentation models, our work provides a large number of solid experiments and is more technically operable. With the extensive experimental results on both the representative 2D and 3D medical image datasets, we explicitly clarify the effect of these tricks. Moreover, based on the surveyed tricks, we also open-sourced a strong MedISeg repository, where each of its components has the advantage of plug-and-play. We believe that this milestone work not only completes a comprehensive and complementary survey of the state-of-the-art MedISeg approaches, but also offers a practical guide for addressing the future medical image processing challenges including but not limited to small dataset learning, class imbalance learning, multi-modality learning, and domain adaptation. The code has been released at: //github.com/hust-linyi/MedISeg

Transformation · Learned · Performer · MoDELS · Vision ·

March 24, 2022

Transformers Meet Visual Learning Understanding: A Comprehensive Review

Yuting Yang,Licheng Jiao,Xu Liu,Fang Liu,Shuyuan Yang,Zhixi Feng,Xu Tang

from arxiv, arXiv admin note: text overlap with arXiv:2010.11929, arXiv:1706.03762 by other authors

Its dynamic attention mechanism and global modeling ability give Transformer a strong feature learning ability. In recent years, Transformer has become comparable to CNN-based methods in computer vision. This review mainly investigates the current research progress of Transformer in image and video applications, giving a comprehensive overview of Transformer in visual learning understanding. First, the attention mechanism, which plays an essential part in Transformer, is reviewed. Second, the visual Transformer model and the principle of each module are introduced. Third, the existing Transformer-based models are investigated, and their performance is compared in visual learning understanding applications. Three image tasks and two video tasks of computer vision are investigated. The former mainly includes image classification, object detection, and image segmentation. The latter contains object tracking and video classification. Comparing different models' performance in various tasks on several public benchmark data sets is of particular significance. Finally, ten general problems are summarized, and the developing prospects of the visual Transformer are given in this review.

Engineering · Performer · MoDELS · Vectorization · FAST ·

August 10, 2020

Beyond Lexical: A Semantic Retrieval Framework for Textual SearchEngine

Kuan Fang,Long Zhao,Zhan Shen,RuiXing Wang,RiKang Zhour,LiWen Fan

from arxiv, 9 pages

Search engines have become a fundamental component in various web and mobile applications. Retrieving relevant documents from massive datasets is challenging for a search engine system, especially when faced with verbose or tail queries. In this paper, we explore a vector space search framework for document retrieval. Specifically, we trained a deep semantic matching model so that each query and document can be encoded as a low-dimensional embedding. Our model was trained based on the BERT architecture. We deployed a fast k-nearest-neighbor index service for online serving. Both offline and online metrics demonstrate that our method improved retrieval performance and search quality considerably, particularly for tail

Vision · Model evaluation · Reducible · Computer vision · DNN ·

March 24, 2020

A Survey of Methods for Low-Power Deep Learning and Computer Vision

Abhinav Goel,Caleb Tung,Yung-Hsiang Lu,George K. Thiruvathukal

from arxiv, Accepted for publication at 2020 IEEE 6th World Forum on Internet of Things (WF-IoT), New Orleans, LA, USA 2020

Deep neural networks (DNNs) are successful in many computer vision tasks. However, the most accurate DNNs require millions of parameters and operations, making them energy, computation and memory intensive. This impedes the deployment of large DNNs in low-power devices with limited compute resources. Recent research improves DNN models by reducing the memory requirement, energy consumption, and number of operations without significantly decreasing the accuracy. This paper surveys the progress of low-power deep learning and computer vision, specifically in regards to inference, and discusses the methods for compacting and accelerating DNN models. The techniques can be divided into four major categories: (1) parameter quantization and pruning, (2) compressed convolutional filters and matrix factorization, (3) network architecture search, and (4) knowledge distillation. We analyze the accuracy, advantages, disadvantages, and potential solutions to the problems with the techniques in each category. We also discuss new evaluation metrics as a guideline for future research.
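
To make the first category, parameter quantization, concrete, here is a minimal sketch of uniform affine quantization of a weight list to 8-bit integer codes. It is an illustration only, not a method taken from the survey. The function names `quantize` and `dequantize` and the min/max calibration are assumptions chosen for brevity.

```python
def quantize(weights, num_bits=8):
    """Map float weights to integer codes in [0, 2**num_bits - 1].

    Returns (codes, scale, offset) so that w is approximately
    code * scale + offset. Min/max calibration is an assumption
    of this sketch.
    """
    levels = 2 ** num_bits - 1
    w_min, w_max = min(weights), max(weights)
    scale = (w_max - w_min) / levels
    if scale == 0.0:
        scale = 1.0  # constant weights: any positive scale works
    codes = [round((w - w_min) / scale) for w in weights]
    return codes, scale, w_min


def dequantize(codes, scale, offset):
    """Recover approximate float weights from integer codes."""
    return [c * scale + offset for c in codes]


weights = [-1.0, -0.2, 0.0, 0.5, 1.0]
codes, scale, offset = quantize(weights)
restored = dequantize(codes, scale, offset)
# Each restored value differs from the original by at most about scale / 2.
```

Storing 8-bit codes in place of 32-bit floats cuts weight memory by roughly a factor of four, and the rounding error per weight is bounded by about half the quantization step.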