亚洲综合蜜桃久久丁香婷,亚洲欧洲视频图片,午夜福利视频1692

Srinivas Sridharan,Taekyung Heo,Louis Feng,Zhaodong Wang,Matt Bergeron,Wenyin Fu,Shengbao Zheng,Brian Coutinho,Saeed Rashidi,Changhai Man,Tushar Krishna

Benchmarking and co-design are essential for driving optimizations and innovation around ML models, ML software, and next-generation hardware. Full workload benchmarks, e.g. MLPerf, play an essential role in enabling fair comparison across different software and hardware stacks especially once systems are fully designed and deployed. However, the pace of AI innovation demands a more agile methodology to benchmark creation and usage by simulators and emulators for future system co-design. We propose Chakra, an open graph schema for standardizing workload specification capturing key operations and dependencies, also known as Execution Trace (ET). In addition, we propose a complementary set of tools/capabilities to enable collection, generation, and adoption of Chakra ETs by a wide range of simulators, emulators, and benchmarks. For instance, we use generative AI models to learn latent statistical properties across thousands of Chakra ETs and use these models to synthesize Chakra ETs. These synthetic ETs can obfuscate key proprietary information and also target future what-if scenarios. As an example, we demonstrate an end-to-end proof-of-concept that converts PyTorch ETs to Chakra ETs and uses this to drive an open-source training system simulator (ASTRA-sim). Our end-goal is to build a vibrant industry-wide ecosystem of agile benchmarks and tools to drive future AI system co-design.

相關內容

ETS

關注 0

ETS：European Test Symposium。 Explanation：歐洲測試研討會。 Publisher：IEEE。 SIT：

估計/估計量 · 查準率/準確率 · MoDELS · Learning · Integration ·

2023 年 7 月 11 日

Improving Image-Based Precision Medicine with Uncertainty-Aware Causal Models

Joshua Durso-Finley,Jean-Pierre Falet,Raghav Mehta,Douglas L. Arnold,Nick Pawlowski,Tal Arbel

Image-based precision medicine aims to personalize treatment decisions based on an individual's unique imaging features so as to improve their clinical outcome. Machine learning frameworks that integrate uncertainty estimation as part of their treatment recommendations would be safer and more reliable. However, little work has been done in adapting uncertainty estimation techniques and validation metrics for precision medicine. In this paper, we use Bayesian deep learning for estimating the posterior distribution over factual and counterfactual outcomes on several treatments. This allows for estimating the uncertainty for each treatment option and for the individual treatment effects (ITE) between any two treatments. We train and evaluate this model to predict future new and enlarging T2 lesion counts on a large, multi-center dataset of MR brain images of patients with multiple sclerosis, exposed to several treatments during randomized controlled trials. We evaluate the correlation of the uncertainty estimate with the factual error, and, given the lack of ground truth counterfactual outcomes, demonstrate how uncertainty for the ITE prediction relates to bounds on the ITE error. Lastly, we demonstrate how knowledge of uncertainty could modify clinical decision-making to improve individual patient and clinical trial outcomes.

Oracle · Extensibility · Better · 泛函 · Engineering ·

2023 年 7 月 11 日

Tests4Py: A Benchmark for System Testing

Marius Smytzek,Martin Eberlein,Batuhan Serce,Lars Grunske,Andreas Zeller

from arxiv, 5 pages, 4 figures

Benchmarks are among the main drivers of progress in software engineering research, especially in software testing and debugging. However, current benchmarks in this field could be better suited for specific research tasks, as they rely on weak system oracles like crash detection, come with few unit tests only, need more elaborative research, or cannot verify the outcome of system tests. Our Tests4Py benchmark addresses these issues. It is derived from the popular BugsInPy benchmark, including 30 bugs from 5 real-world Python applications. Each subject in Tests4Py comes with an oracle to verify the functional correctness of system inputs. Besides, it enables the generation of system tests and unit tests, allowing for qualitative studies by investigating essential aspects of test sets and extensive evaluations. These opportunities make Tests4Py a next-generation benchmark for research in test generation, debugging, and automatic program repair.

MoDELS · state-of-the-art · Performer · HTTPS · Networking ·

2023 年 7 月 11 日

RAPHAEL: Text-to-Image Generation via Large Mixture of Diffusion Paths

Zeyue Xue,Guanglu Song,Qiushan Guo,Boxiao Liu,Zhuofan Zong,Yu Liu,Ping Luo

from arxiv, Project Page: //miaohua.sensetime.com/en

Text-to-image generation has recently witnessed remarkable achievements. We introduce a text-conditional image diffusion model, termed RAPHAEL, to generate highly artistic images, which accurately portray the text prompts, encompassing multiple nouns, adjectives, and verbs. This is achieved by stacking tens of mixture-of-experts (MoEs) layers, i.e., space-MoE and time-MoE layers, enabling billions of diffusion paths (routes) from the network input to the output. Each path intuitively functions as a "painter" for depicting a particular textual concept onto a specified image region at a diffusion timestep. Comprehensive experiments reveal that RAPHAEL outperforms recent cutting-edge models, such as Stable Diffusion, ERNIE-ViLG 2.0, DeepFloyd, and DALL-E 2, in terms of both image quality and aesthetic appeal. Firstly, RAPHAEL exhibits superior performance in switching images across diverse styles, such as Japanese comics, realism, cyberpunk, and ink illustration. Secondly, a single model with three billion parameters, trained on 1,000 A100 GPUs for two months, achieves a state-of-the-art zero-shot FID score of 6.61 on the COCO dataset. Furthermore, RAPHAEL significantly surpasses its counterparts in human evaluation on the ViLG-300 benchmark. We believe that RAPHAEL holds the potential to propel the frontiers of image generation research in both academia and industry, paving the way for future breakthroughs in this rapidly evolving field. More details can be found on a webpage: //miaohua.sensetime.com/en.

圖 · CASES · Performer · Analysis · 知識 (knowledge) ·

2023 年 7 月 10 日

The Linked Data Benchmark Council (LDBC): Driving competition and collaboration in the graph data management space

Gábor Szárnyas,Brad Bebee,Altan Birler,Alin Deutsch,George Fletcher,Henry A. Gabb,Denise Gosnell,Alastair Green,Zhihui Guo,Keith W. Hare,Jan Hidders,Alexandru Iosup,Atanas Kiryakov,Tomas Kovatchev,Xinsheng Li,Leonid Libkin,Heng Lin,Xiaojian Luo,Arnau Prat-Pérez,David Püroja,Shipeng Qi,Oskar van Rest,Benjamin A. Steer,Dávid Szakállas,Bing Tong,Jack Waudby,Mingxi Wu,Bin Yang,Wenyuan Yu,Chen Zhang,Jason Zhang,Yan Zhou,Peter Boncz

Graph data management is instrumental for several use cases such as recommendation, root cause analysis, financial fraud detection, and enterprise knowledge representation. Efficiently supporting these use cases yields a number of unique requirements, including the need for a concise query language and graph-aware query optimization techniques. The goal of the Linked Data Benchmark Council (LDBC) is to design a set of standard benchmarks that capture representative categories of graph data management problems, making the performance of systems comparable and facilitating competition among vendors. LDBC also conducts research on graph schemas and graph query languages. This paper introduces the LDBC organization and its work over the last decade.

泛函 · 可約的 · Integration · ARM · Continuity ·

2023 年 7 月 9 日

Towards a RISC-V Open Platform for Next-generation Automotive ECUs

Luca Cuomo,Claudio Scordino,Alessandro Ottaviano,Nils Wistoff,Robert Balas,Luca Benini,Errico Guidieri,Ida Maria Savino

from arxiv, 8 pages, 2023 12th Mediterranean Conference on Embedded Computing (MECO)

The complexity of automotive systems is increasing quickly due to the integration of novel functionalities such as assisted or autonomous driving. However, increasing complexity poses considerable challenges to the automotive supply chain since the continuous addition of new hardware and network cabling is not considered tenable. The availability of modern heterogeneous multi-processor chips represents a unique opportunity to reduce vehicle costs by integrating multiple functionalities into fewer Electronic Control Units (ECUs). In addition, the recent improvements in open-hardware technology allow to further reduce costs by avoiding lock-in solutions. This paper presents a mixed-criticality multi-OS architecture for automotive ECUs based on open hardware and open-source technologies. Safety-critical functionalities are executed by an AUTOSAR OS running on a RISC-V processor, while the Linux OS executes more advanced functionalities on a multi-core ARM CPU. Besides presenting the implemented stack and the communication infrastructure, this paper provides a quantitative gap analysis between an HW/SW optimized version of the RISC-V processor and a COTS Arm Cortex-R in terms of real-time features, confirming that RISC-V is a valuable candidate for running AUTOSAR Classic stacks of next-generation automotive MCUs.

設計 · 情景 · surge · EASE · Performer ·

2023 年 7 月 9 日

Tartarus: A Benchmarking Platform for Realistic And Practical Inverse Molecular Design

AkshatKumar Nigam,Robert Pollice,Gary Tom,Kjell Jorner,Luca A. Thiede,Anshul Kundaje,Alan Aspuru-Guzik

from arxiv, 29+21 pages, 6+19 figures, 6+2 tables

The efficient exploration of chemical space to design molecules with intended properties enables the accelerated discovery of drugs, materials, and catalysts, and is one of the most important outstanding challenges in chemistry. Encouraged by the recent surge in computer power and artificial intelligence development, many algorithms have been developed to tackle this problem. However, despite the emergence of many new approaches in recent years, comparatively little progress has been made in developing realistic benchmarks that reflect the complexity of molecular design for real-world applications. In this work, we develop a set of practical benchmark tasks relying on physical simulation of molecular systems mimicking real-life molecular design problems for materials, drugs, and chemical reactions. Additionally, we demonstrate the utility and ease of use of our new benchmark set by demonstrating how to compare the performance of several well-established families of algorithms. Overall, we believe that our benchmark suite will help move the field towards more realistic molecular design benchmarks, and move the development of inverse molecular design algorithms closer to the practice of designing molecules that solve existing problems in both academia and industry alike.

通用智能 · 語言模型化 · MoDELS · INTERACT · Performer ·

2023 年 7 月 7 日

Brain in a Vat: On Missing Pieces Towards Artificial General Intelligence in Large Language Models

Yuxi Ma,Chi Zhang,Song-Chun Zhu

In this perspective paper, we first comprehensively review existing evaluations of Large Language Models (LLMs) using both standardized tests and ability-oriented benchmarks. We pinpoint several problems with current evaluation methods that tend to overstate the capabilities of LLMs. We then articulate what artificial general intelligence should encompass beyond the capabilities of LLMs. We propose four characteristics of generally intelligent agents: 1) they can perform unlimited tasks; 2) they can generate new tasks within a context; 3) they operate based on a value system that underpins task generation; and 4) they have a world model reflecting reality, which shapes their interaction with the world. Building on this viewpoint, we highlight the missing pieces in artificial general intelligence, that is, the unity of knowing and acting. We argue that active engagement with objects in the real world delivers more robust signals for forming conceptual representations. Additionally, knowledge acquisition isn't solely reliant on passive input but requires repeated trials and errors. We conclude by outlining promising future research directions in the field of artificial general intelligence.

機器學習建模 · 示例 · MoDELS · ML · Learning ·

2023 年 7 月 7 日

Improving the Efficiency of Human-in-the-Loop Systems: Adding Artificial to Human Experts

Johannes Jakubik,Daniel Weber,Patrick Hemmer,Michael V?ssing,Gerhard Satzger

from arxiv, Accepted at International Conference on Wirtschaftsinformatik, 2023

Information systems increasingly leverage artificial intelligence (AI) and machine learning (ML) to generate value from vast amounts of data. However, ML models are imperfect and can generate incorrect classifications. Hence, human-in-the-loop (HITL) extensions to ML models add a human review for instances that are difficult to classify. This study argues that continuously relying on human experts to handle difficult model classifications leads to a strong increase in human effort, which strains limited resources. To address this issue, we propose a hybrid system that creates artificial experts that learn to classify data instances from unknown classes previously reviewed by human experts. Our hybrid system assesses which artificial expert is suitable for classifying an instance from an unknown class and automatically assigns it. Over time, this reduces human effort and increases the efficiency of the system. Our experiments demonstrate that our approach outperforms traditional HITL systems for several benchmarks on image classification.

Neural Networks · Networking · 可約的 · Continuity · 推斷 ·

2021 年 6 月 21 日

A Survey of Quantization Methods for Efficient Neural Network Inference

Amir Gholami,Sehoon Kim,Zhen Dong,Zhewei Yao,Michael W. Mahoney,Kurt Keutzer

from arxiv, Book Chapter: Low-Power Computer Vision: Improving the Efficiency of Artificial Intelligence

As soon as abstract mathematical computations were adapted to computation on digital computers, the problem of efficient representation, manipulation, and communication of the numerical values in those computations arose. Strongly related to the problem of numerical representation is the problem of quantization: in what manner should a set of continuous real-valued numbers be distributed over a fixed discrete set of numbers to minimize the number of bits required and also to maximize the accuracy of the attendant computations? This perennial problem of quantization is particularly relevant whenever memory and/or computational resources are severely restricted, and it has come to the forefront in recent years due to the remarkable performance of Neural Network models in computer vision, natural language processing, and related areas. Moving from floating-point representations to low-precision fixed integer values represented in four bits or less holds the potential to reduce the memory footprint and latency by a factor of 16x; and, in fact, reductions of 4x to 8x are often realized in practice in these applications. Thus, it is not surprising that quantization has emerged recently as an important and very active sub-area of research in the efficient implementation of computations associated with Neural Networks. In this article, we survey approaches to the problem of quantizing the numerical values in deep Neural Network computations, covering the advantages/disadvantages of current methods. With this survey and its organization, we hope to have presented a useful snapshot of the current research in quantization for Neural Networks and to have given an intelligent organization to ease the evaluation of future research in this area.

MoDELS · Transformer模型 · 變換 · 推斷 · 模型評估 ·

2020 年 6 月 23 日

Train Large, Then Compress: Rethinking Model Size for Efficient Training and Inference of Transformers

Zhuohan Li,Eric Wallace,Sheng Shen,Kevin Lin,Kurt Keutzer,Dan Klein,Joseph E. Gonzalez

from arxiv, ICML 2020

Since hardware resources are limited, the objective of training deep learning models is typically to maximize accuracy subject to the time and memory constraints of training and inference. We study the impact of model size in this setting, focusing on Transformer models for NLP tasks that are limited by compute: self-supervised pretraining and high-resource machine translation. We first show that even though smaller Transformer models execute faster per iteration, wider and deeper models converge in significantly fewer steps. Moreover, this acceleration in convergence typically outpaces the additional computational overhead of using larger models. Therefore, the most compute-efficient training strategy is to counterintuitively train extremely large models but stop after a small number of iterations. This leads to an apparent trade-off between the training efficiency of large Transformer models and the inference efficiency of small Transformer models. However, we show that large models are more robust to compression techniques such as quantization and pruning than small models. Consequently, one can get the best of both worlds: heavily compressed, large models achieve higher accuracy than lightly compressed, small models.