成人不卡顿免费视频在线,一区二区三区高清视频精品

Quantum computing promises to help humanity solve problems that would otherwise be intractable on classical computers. Unlike today's machines, quantum computers use a novel computing process that leverages the foundational quantum mechanical laws of nature. This unlocks unparalleled compute power for certain applications and promises to help solve some of our generation's gravest challenges, including the climate crisis, food insecurity, and widespread disease. No one entity will be able to realize this end state alone. Developing a fault-tolerant quantum supercomputer and a vibrant ecosystem around it will require deep partnerships between industry, governments, and academia. It will also require collective action to enable and promote positive applications of quantum computing and ensure that the safe and responsible use of the technology is at the center of its development and deployment. Achieving these objectives will require focusing on three priorities: 1. Impact. Ensure quantum computing benefits all of humankind by developing quantum solutions to solve critical, global problems. 2. Use. Protect against malicious use by accelerating the deployment of quantum-safe cryptography and developing governance processes and controls for the responsible use of quantum machines. 3. Access. Democratize the potential for economic growth across all of society through skilling, workforce and ecosystem development, and digital infrastructure. This paper discusses each in turn.

知識薈萃

精品入門和進階教程、論文和代碼整理等

查看相關VIP內容、論文、資訊等

估計/估計量 · Processing（編程語言） · 感知機 · Learning · 深度學習 ·

2024 年 4 月 17 日

A Comparison of Traditional and Deep Learning Methods for Parameter Estimation of the Ornstein-Uhlenbeck Process

Jacob Fein-Ashley

We consider the Ornstein-Uhlenbeck (OU) process, a stochastic process widely used in finance, physics, and biology. Parameter estimation of the OU process is a challenging problem. Thus, we review traditional tracking methods and compare them with novel applications of deep learning to estimate the parameters of the OU process. We use a multi-layer perceptron to estimate the parameters of the OU process and compare its performance with traditional parameter estimation methods, such as the Kalman filter and maximum likelihood estimation. We find that the multi-layer perceptron can accurately estimate the parameters of the OU process given a large dataset of observed trajectories; however, traditional parameter estimation methods may be more suitable for smaller datasets.

Vision · MoDELS · 變換 · state-of-the-art · 模型評估 ·

2024 年 4 月 16 日

Comprehensive Survey of Model Compression and Speed up for Vision Transformers

Feiyang Chen,Ziqian Luo,Lisang Zhou,Xueting Pan,Ying Jiang

Vision Transformers (ViT) have marked a paradigm shift in computer vision, outperforming state-of-the-art models across diverse tasks. However, their practical deployment is hampered by high computational and memory demands. This study addresses the challenge by evaluating four primary model compression techniques: quantization, low-rank approximation, knowledge distillation, and pruning. We methodically analyze and compare the efficacy of these techniques and their combinations in optimizing ViTs for resource-constrained environments. Our comprehensive experimental evaluation demonstrates that these methods facilitate a balanced compromise between model accuracy and computational efficiency, paving the way for wider application in edge computing devices.

同態加密 · Analysis · 優化器 · Performer · Performance ·

2024 年 4 月 12 日

CiFlow: Dataflow Analysis and Optimization of Key Switching for Homomorphic Encryption

Negar Neda,Austin Ebel,Benedict Reynwar,Brandon Reagen

Homomorphic encryption (HE) is a privacy-preserving computation technique that enables computation on encrypted data. Today, the potential of HE remains largely unrealized as it is impractically slow, preventing it from being used in real applications. A major computational bottleneck in HE is the key-switching operation, accounting for approximately 70% of the overall HE execution time and involving a large amount of data for inputs, intermediates, and keys. Prior research has focused on hardware accelerators to improve HE performance, typically featuring large on-chip SRAMs and high off-chip bandwidth to deal with large scale data. In this paper, we present a novel approach to improve key-switching performance by rigorously analyzing its dataflow. Our primary goal is to optimize data reuse with limited on-chip memory to minimize off-chip data movement. We introduce three distinct dataflows: Max-Parallel (MP), Digit-Centric (DC), and Output-Centric (OC), each with unique scheduling approaches for key-switching computations. Through our analysis, we show how our proposed Output-Centric technique can effectively reuse data by significantly lowering the intermediate key-switching working set and alleviating the need for massive off-chip bandwidth. We thoroughly evaluate the three dataflows using the RPU, a recently published vector processor tailored for ring processing algorithms, which includes HE. This evaluation considers sweeps of bandwidth and computational throughput, and whether keys are buffered on-chip or streamed. With OC, we demonstrate up to 4.16x speedup over the MP dataflow and show how OC can save 12.25x on-chip SRAM by streaming keys for minimal performance penalty.

Performer · Engineering · Learning · 深度學習框架 · Analysis ·

2024 年 4 月 8 日

Design and Implementation of an Analysis Pipeline for Heterogeneous Data

Arup Kumar Sarker,Aymen Alsaadi,Niranda Perera,Mills Staylor,Gregor von Laszewski,Matteo Turilli,Ozgur Ozan Kilic,Mikhail Titov,Andre Merzky,Shantenu Jha,Geoffrey Fox

from arxiv, 14 pages, 16 figures, 2 tables

Managing and preparing complex data for deep learning, a prevalent approach in large-scale data science can be challenging. Data transfer for model training also presents difficulties, impacting scientific fields like genomics, climate modeling, and astronomy. A large-scale solution like Google Pathways with a distributed execution environment for deep learning models exists but is proprietary. Integrating existing open-source, scalable runtime tools and data frameworks on high-performance computing (HPC) platforms is crucial to address these challenges. Our objective is to establish a smooth and unified method of combining data engineering and deep learning frameworks with diverse execution capabilities that can be deployed on various high-performance computing platforms, including cloud and supercomputers. We aim to support heterogeneous systems with accelerators, where Cylon and other data engineering and deep learning frameworks can utilize heterogeneous execution. To achieve this, we propose Radical-Cylon, a heterogeneous runtime system with a parallel and distributed data framework to execute Cylon as a task of Radical Pilot. We thoroughly explain Radical-Cylon's design and development and the execution process of Cylon tasks using Radical Pilot. This approach enables the use of heterogeneous MPI-communicators across multiple nodes. Radical-Cylon achieves better performance than Bare-Metal Cylon with minimal and constant overhead. Radical-Cylon achieves (4~15)% faster execution time than batch execution while performing similar join and sort operations with 35 million and 3.5 billion rows with the same resources. The approach aims to excel in both scientific and engineering research HPC systems while demonstrating robust performance on cloud infrastructures. This dual capability fosters collaboration and innovation within the open-source scientific research community.

Performer · MoDELS · 優化器 · 核化 · CUDA ·

2024 年 4 月 5 日

Evaluation of Programming Models and Performance for Stencil Computation on Current GPU Architectures

Baodi Shan,Mauricio Araya-Polo

Accelerated computing is widely used in high-performance computing. Therefore, it is crucial to experiment and discover how to better utilize GPUGPUs latest generations on relevant applications. In this paper, we present results and share insights about highly tuned stencil-based kernels for NVIDIA Ampere (A100) and Hopper (GH200) architectures. Performance results yield useful insights into the behavior of this type of algorithms for these new accelerators. This knowledge can be leveraged by many scientific applications which involve stencils computations. Further, evaluation of three different programming models: CUDA, OpenACC, and OpenMP target offloading is conducted on aforementioned accelerators. We extensively study the performance and portability of various kernels under each programming model and provide corresponding optimization recommendations. Furthermore, we compare the performance of different programming models on the mentioned architectures. Up to 58% performance improvement was achieved against the previous GPGPU's architecture generation for an highly optimized kernel of the same class, and up to 42% for all classes. In terms of programming models, and keeping portability in mind, optimized OpenACC implementation outperforms OpenMP implementation by 33%. If portability is not a factor, our best tuned CUDA implementation outperforms the optimized OpenACC one by 2.1x.

量子計算 · MoDELS · 類別 · INFORMS · Natural Computing ·

2024 年 4 月 4 日

Quantum Algorithms in a Superposition of Spacetimes

Omri Shmueli

Quantum computers are expected to revolutionize our ability to process information. The advancement from classical to quantum computing is a product of our advancement from classical to quantum physics -- the more our understanding of the universe grows, so does our ability to use it for computation. A natural question that arises is, what will physics allow in the future? Can more advanced theories of physics increase our computational power, beyond quantum computing? An active field of research in physics studies theoretical phenomena outside the scope of explainable quantum mechanics, that form when attempting to combine Quantum Mechanics (QM) with General Relativity (GR) into a unified theory of Quantum Gravity (QG). QG is known to present the possibility of a quantum superposition of causal structure and event orderings. In the literature of quantum information theory, this translates to a superposition of unitary evolution orders. In this work we show a first example of a natural computational model based on QG, that provides an exponential speedup over standard quantum computation (under standard hardness assumptions). We define a model and complexity measure for a quantum computer that has the ability to generate a superposition of unitary evolution orders, and show that such computer is able to solve in polynomial time two of the fundamental problems in computer science: The Graph Isomorphism Problem ($\mathsf{GI}$) and the Gap Closest Vector Problem ($\mathsf{GapCVP}$), with gap $O\left( n^{2} \right)$. These problems are believed by experts to be hard to solve for a regular quantum computer. Interestingly, our model does not seem overpowered, and we found no obvious way to solve entire complexity classes that are considered hard in computer science, like the classes $\mathbf{NP}$ and $\mathbf{SZK}$.

知識 (knowledge) · 損失 · Engineering · 損失函數（機器學習） · 原點 ·

2024 年 4 月 3 日

Techniques for Measuring the Inferential Strength of Forgetting Policies

Patrick Doherty,Andrzej Szalas

The technique of forgetting in knowledge representation has been shown to be a powerful and useful knowledge engineering tool with widespread application. Yet, very little research has been done on how different policies of forgetting, or use of different forgetting operators, affects the inferential strength of the original theory. The goal of this paper is to define loss functions for measuring changes in inferential strength based on intuitions from model counting and probability theory. Properties of such loss measures are studied and a pragmatic knowledge engineering tool is proposed for computing loss measures using Problog. The paper includes a working methodology for studying and determining the strength of different forgetting policies, in addition to concrete examples showing how to apply the theoretical results using Problog. Although the focus is on forgetting, the results are much more general and should have wider application to other areas.

Learning · 泛化理論 · 概率近似正確 · 泛化誤差 · 少試學習 ·

2022 年 7 月 29 日

A Survey of Learning on Small Data

Xiaofeng Cao,Weixin Bu,Shengjun Huang,Yingpeng Tang,Yaming Guo,Yi Chang,Ivor W. Tsang

Learning on big data brings success for artificial intelligence (AI), but the annotation and training costs are expensive. In future, learning on small data is one of the ultimate purposes of AI, which requires machines to recognize objectives and scenarios relying on small data as humans. A series of machine learning models is going on this way such as active learning, few-shot learning, deep clustering. However, there are few theoretical guarantees for their generalization performance. Moreover, most of their settings are passive, that is, the label distribution is explicitly controlled by one specified sampling scenario. This survey follows the agnostic active sampling under a PAC (Probably Approximately Correct) framework to analyze the generalization error and label complexity of learning on small data using a supervised and unsupervised fashion. With these theoretical analyses, we categorize the small data learning models from two geometric perspectives: the Euclidean and non-Euclidean (hyperbolic) mean representation, where their optimization solutions are also presented and discussed. Later, some potential learning scenarios that may benefit from small data learning are then summarized, and their potential learning scenarios are also analyzed. Finally, some challenging applications such as computer vision, natural language processing that may benefit from learning on small data are also surveyed.

Vision · 模型評估 · 可約的 · 計算機視覺 · DNN ·

2020 年 3 月 24 日

A Survey of Methods for Low-Power Deep Learning and Computer Vision

Abhinav Goel,Caleb Tung,Yung-Hsiang Lu,George K. Thiruvathukal

from arxiv, Accepted for publication at 2020 IEEE 6th World Forum on Internet of Things (WF-IoT), New Orleans, LA, USA 2020

Deep neural networks (DNNs) are successful in many computer vision tasks. However, the most accurate DNNs require millions of parameters and operations, making them energy, computation and memory intensive. This impedes the deployment of large DNNs in low-power devices with limited compute resources. Recent research improves DNN models by reducing the memory requirement, energy consumption, and number of operations without significantly decreasing the accuracy. This paper surveys the progress of low-power deep learning and computer vision, specifically in regards to inference, and discusses the methods for compacting and accelerating DNN models. The techniques can be divided into four major categories: (1) parameter quantization and pruning, (2) compressed convolutional filters and matrix factorization, (3) network architecture search, and (4) knowledge distillation. We analyze the accuracy, advantages, disadvantages, and potential solutions to the problems with the techniques in each category. We also discuss new evaluation metrics as a guideline for future research.

學成 · CASES · 大數據 · 人工智能 · 統計方法 ·

2018 年 9 月 25 日

A Survey of Learning Causality with Data: Problems and Methods

Ruocheng Guo,Lu Cheng,Jundong Li,P. Richard Hahn,Huan Liu

from arxiv, 35 pages, under review

The era of big data provides researchers with convenient access to copious data. However, people often have little knowledge about it. The increasing prevalence of big data is challenging the traditional methods of learning causality because they are developed for the cases with limited amount of data and solid prior causal knowledge. This survey aims to close the gap between big data and learning causality with a comprehensive and structured review of traditional and frontier methods and a discussion about some open problems of learning causality. We begin with preliminaries of learning causality. Then we categorize and revisit methods of learning causality for the typical problems and data types. After that, we discuss the connections between learning causality and machine learning. At the end, some open problems are presented to show the great potential of learning causality with data.