女生喊疼男生越往里寨的免费观看,日日狠狠久久一区二区三区色综,国内成人自拍视频,久久天天躁夜夜躁狠狠2000,在线视频亚洲欧美在线视频

Yang Zhao,Di Huang,Chongxiao Li,Pengwei Jin,Ziyuan Nan,Tianyun Ma,Lei Qi,Yansong Pan,Zhenxing Zhang,Rui Zhang,Xishan Zhang,Zidong Du,Qi Guo,Xing Hu,Yunji Chen

from arxiv, 16 pages, 8 figures, conference

The increasing complexity and high costs associated with modern processor design have led to a surge in demand for processor design automation. Instruction-tuned large language models (LLMs) have demonstrated remarkable performance in automatically generating code for general-purpose programming languages like Python. However, these methods fail on hardware description languages (HDLs) like Verilog due to the scarcity of high-quality instruction tuning data, as even advanced LLMs like GPT-3.5 exhibit limited performance on Verilog generation. Regarding this issue, we observe that (1) Verilog code collected from the real world has higher quality than those generated by LLMs. (2) LLMs like GPT-3.5 excel in summarizing Verilog code rather than generating it. Based on these observations, this paper introduces CodeV, a series of open-source instruction-tuned Verilog generation LLMs. Instead of generating descriptions first and then getting the corresponding code from advanced LLMs, we prompt the LLM with Verilog code and let the LLM generate the corresponding natural language description by multi-level summarization. Experimental results show that CodeV relatively surpasses the previous open-source SOTA by 14.4% (BetterV in VerilogEval) and 11.3% (RTLCoder in RTLLM) respectively, and also relatively outperforms previous commercial SOTA GPT-4 by 22.1% in VerilogEval.

流 · Learning · 圖片分類 · state-of-the-art · 離散化 ·

2024 年 7 月 15 日

Employing Sentence Space Embedding for Classification of Data Stream from Fake News Domain

Pawe? Zyblewski,Jakub Klikowski,Weronika Borek-Marciniec,Pawe? Ksieniewicz

from arxiv, 8 pages, 8 figures

Tabular data is considered the last unconquered castle of deep learning, yet the task of data stream classification is stated to be an equally important and demanding research area. Due to the temporal constraints, it is assumed that deep learning methods are not the optimal solution for application in this field. However, excluding the entire -- and prevalent -- group of methods seems rather rash given the progress that has been made in recent years in its development. For this reason, the following paper is the first to present an approach to natural language data stream classification using the sentence space method, which allows for encoding text into the form of a discrete digital signal. This allows the use of convolutional deep networks dedicated to image classification to solve the task of recognizing fake news based on text data. Based on the real-life Fakeddit dataset, the proposed approach was compared with state-of-the-art algorithms for data stream classification based on generalization ability and time complexity.

MoDELS · Performer · 大語言模型 · 樣例 · 語言模型化 ·

2024 年 7 月 15 日

An Empirical Study of Validating Synthetic Data for Formula Generation

Usneek Singh,José Cambronero,Sumit Gulwani,Aditya Kanade,Anirudh Khatry,Vu Le,Mukul Singh,Gust Verbruggen

Large language models (LLMs) can be leveraged to help with writing formulas in spreadsheets, but resources on these formulas are scarce, impacting both the base performance of pre-trained models and limiting the ability to fine-tune them. Given a corpus of formulas, we can use a(nother) model to generate synthetic natural language utterances for fine-tuning. However, it is important to validate whether the NL generated by the LLM is indeed accurate to be beneficial for fine-tuning. In this paper, we provide empirical results on the impact of validating these synthetic training examples with surrogate objectives that evaluate the accuracy of the synthetic annotations. We demonstrate that validation improves performance over raw data across four models (2 open and 2 closed weight). Interestingly, we show that although validation tends to prune more challenging examples, it increases the complexity of problems that models can solve after being fine-tuned on validated data.

Performer · GPT-3.5 · 代碼 · 大語言模型 · SOTA ·

2024 年 7 月 15 日

Empowering LLMs for Verilog Generation through Multi-Level Summarization

Yang Zhao,Di Huang,Chongxiao Li,Pengwei Jin,Ziyuan Nan,Tianyun Ma,Lei Qi,Yansong Pan,Zhenxing Zhang,Rui Zhang,Xishan Zhang,Zidong Du,Qi Guo,Xing Hu,Yunji Chen

from arxiv, 16 pages, 8 figures, conference

The increasing complexity and high costs associated with modern processor design have led to a surge in demand for processor design automation. Instruction-tuned large language models (LLMs) have demonstrated remarkable performance in automatically generating code for general-purpose programming languages like Python. However, these methods fail on hardware description languages (HDLs) like Verilog due to the scarcity of high-quality instruction tuning data, as even advanced LLMs like GPT-3.5 exhibit limited performance on Verilog generation. Regarding this issue, we observe that (1) Verilog code collected from the real world has higher quality than those generated by LLMs. (2) LLMs like GPT-3.5 excel in summarizing Verilog code rather than generating it. Based on these observations, this paper introduces CodeV, a series of open-source instruction-tuned Verilog generation LLMs. Instead of generating descriptions first and then getting the corresponding code from advanced LLMs, we prompt the LLM with Verilog code and let the LLM generate the corresponding natural language description by multi-level summarization. Experimental results show that CodeV relatively surpasses the previous open-source SOTA by 14.4% (BetterV in VerilogEval) and 11.3% (RTLCoder in RTLLM) respectively, and also relatively outperforms previous commercial SOTA GPT-4 by 22.1% in VerilogEval.

LIDAR · 模型評估 · 3D · 估計/估計量 · Integration ·

2024 年 7 月 11 日

Roadside LiDAR Assisted Cooperative Localization for Connected Autonomous Vehicles

Yuze Jiang,Ehsan Javanmardi,Jin Nakazato,Manabu Tsukada,Hiroshi Esaki

from arxiv, Accepted by 2023 International Conference on Intelligent Computing and its Emerging Applications

Advancements in LiDAR technology have led to more cost-effective production while simultaneously improving precision and resolution. As a result, LiDAR has become integral to vehicle localization, achieving centimeter-level accuracy through techniques like Normal Distributions Transform (NDT) and other advanced 3D registration algorithms. Nonetheless, these approaches are reliant on high-definition 3D point cloud maps, the creation of which involves significant expenditure. When such maps are unavailable or lack sufficient features for 3D registration algorithms, localization accuracy diminishes, posing a risk to road safety. To address this, we proposed to use LiDAR-equipped roadside unit and Vehicle-to-Infrastructure (V2I) communication to accurately estimate the connected autonomous vehicle's position and help the vehicle when its self-localization is not accurate enough. Our simulation results indicate that this method outperforms traditional NDT scan matching-based approaches in terms of localization accuracy.

MoDELS · 語言模型化 · Learning · 逼真度 · 講稿 ·

2024 年 4 月 11 日

Best Practices and Lessons Learned on Synthetic Data for Language Models

Ruibo Liu,Jerry Wei,Fangyu Liu,Chenglei Si,Yanzhe Zhang,Jinmeng Rao,Steven Zheng,Daiyi Peng,Diyi Yang,Denny Zhou,Andrew M. Dai

The success of AI models relies on the availability of large, diverse, and high-quality datasets, which can be challenging to obtain due to data scarcity, privacy concerns, and high costs. Synthetic data has emerged as a promising solution by generating artificial data that mimics real-world patterns. This paper provides an overview of synthetic data research, discussing its applications, challenges, and future directions. We present empirical evidence from prior art to demonstrate its effectiveness and highlight the importance of ensuring its factuality, fidelity, and unbiasedness. We emphasize the need for responsible use of synthetic data to build more powerful, inclusive, and trustworthy language models.

語言模型化 · MoDELS · 泛化理論 · 可辨認的 · Continuity ·

2023 年 7 月 12 日

A Comprehensive Overview of Large Language Models

Humza Naveed,Asad Ullah Khan,Shi Qiu,Muhammad Saqib,Saeed Anwar,Muhammad Usman,Nick Barnes,Ajmal Mian

Large Language Models (LLMs) have shown excellent generalization capabilities that have led to the development of numerous models. These models propose various new architectures, tweaking existing architectures with refined training strategies, increasing context length, using high-quality training data, and increasing training time to outperform baselines. Analyzing new developments is crucial for identifying changes that enhance training stability and improve generalization in LLMs. This survey paper comprehensively analyses the LLMs architectures and their categorization, training strategies, training datasets, and performance evaluations and discusses future research directions. Moreover, the paper also discusses the basic building blocks and concepts behind LLMs, followed by a complete overview of LLMs, including their important features and functions. Finally, the paper summarizes significant findings from LLM research and consolidates essential architectural and training strategies for developing advanced LLMs. Given the continuous advancements in LLMs, we intend to regularly update this paper by incorporating new sections and featuring the latest LLM models.

道德化 · 極小點 · Agent · Continuity · MoDELS ·

2023 年 7 月 2 日

Minimum Levels of Interpretability for Artificial Moral Agents

Avish Vijayaraghavan,Cosmin Badea

As artificial intelligence (AI) models continue to scale up, they are becoming more capable and integrated into various forms of decision-making systems. For models involved in moral decision-making, also known as artificial moral agents (AMA), interpretability provides a way to trust and understand the agent's internal reasoning mechanisms for effective use and error correction. In this paper, we provide an overview of this rapidly-evolving sub-field of AI interpretability, introduce the concept of the Minimum Level of Interpretability (MLI) and recommend an MLI for various types of agents, to aid their safe deployment in real-world settings.

Neural Networks · Networking · 可約的 · Continuity · 推斷 ·

2021 年 6 月 21 日

A Survey of Quantization Methods for Efficient Neural Network Inference

Amir Gholami,Sehoon Kim,Zhen Dong,Zhewei Yao,Michael W. Mahoney,Kurt Keutzer

from arxiv, Book Chapter: Low-Power Computer Vision: Improving the Efficiency of Artificial Intelligence

As soon as abstract mathematical computations were adapted to computation on digital computers, the problem of efficient representation, manipulation, and communication of the numerical values in those computations arose. Strongly related to the problem of numerical representation is the problem of quantization: in what manner should a set of continuous real-valued numbers be distributed over a fixed discrete set of numbers to minimize the number of bits required and also to maximize the accuracy of the attendant computations? This perennial problem of quantization is particularly relevant whenever memory and/or computational resources are severely restricted, and it has come to the forefront in recent years due to the remarkable performance of Neural Network models in computer vision, natural language processing, and related areas. Moving from floating-point representations to low-precision fixed integer values represented in four bits or less holds the potential to reduce the memory footprint and latency by a factor of 16x; and, in fact, reductions of 4x to 8x are often realized in practice in these applications. Thus, it is not surprising that quantization has emerged recently as an important and very active sub-area of research in the efficient implementation of computations associated with Neural Networks. In this article, we survey approaches to the problem of quantizing the numerical values in deep Neural Network computations, covering the advantages/disadvantages of current methods. With this survey and its organization, we hope to have presented a useful snapshot of the current research in quantization for Neural Networks and to have given an intelligent organization to ease the evaluation of future research in this area.

圖像分割 · 代價 · Performer · SCAN · Better ·

2018 年 1 月 31 日

Improved Image Segmentation via Cost Minimization of Multiple Hypotheses

Marc Bosch,Christopher M. Gifford,Austin G. Dress,Clare W. Lau,Jeffrey G. Skibo,Gordon A. Christie

from arxiv, Accepted BMVC 17

Image segmentation is an important component of many image understanding systems. It aims to group pixels in a spatially and perceptually coherent manner. Typically, these algorithms have a collection of parameters that control the degree of over-segmentation produced. It still remains a challenge to properly select such parameters for human-like perceptual grouping. In this work, we exploit the diversity of segments produced by different choices of parameters. We scan the segmentation parameter space and generate a collection of image segmentation hypotheses (from highly over-segmented to under-segmented). These are fed into a cost minimization framework that produces the final segmentation by selecting segments that: (1) better describe the natural contours of the image, and (2) are more stable and persistent among all the segmentation hypotheses. We compare our algorithm's performance with state-of-the-art algorithms, showing that we can achieve improved results. We also show that our framework is robust to the choice of segmentation kernel that produces the initial set of hypotheses.

亚洲男人的天堂2018av,欧美草比,久久久久久免费视频精选,国色天香在线看免费,久久久久亚洲av成人片仓井空

相關內容