
In the current high-performance and embedded computing era, full-stack energy-centric design is paramount. Use cases require increasingly high performance at an affordable power budget, often under real-time constraints. Extreme heterogeneity and parallelism address these issues but greatly complicate online power consumption assessment, which is essential for dynamic hardware and software stack adaptations. We introduce a novel architecture-agnostic power modeling methodology with state-of-the-art accuracy, low overhead, and high responsiveness. Our methodology identifies the best Performance Monitoring Counters (PMCs) to model the power consumption of each hardware sub-system at each Dynamic Voltage and Frequency Scaling (DVFS) state. The individual linear models are combined into a complete model that effectively describes the power consumption of the whole system, achieving high accuracy and low overhead. Our evaluation reports an average estimation error of 7.5 % for power consumption and 1.3 % for energy. Furthermore, we propose Runmeter, an open-source, PMC-based monitoring framework integrated into the Linux kernel. Runmeter manages the collection and manipulation of PMC samples, efficiently evaluating our power models at runtime. With a time overhead of only 0.7 % in the worst case, Runmeter provides responsive and accurate power measurements directly in the kernel, which can be employed for actuation policies such as Dynamic Power Management (DPM) and power-aware task scheduling.
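As a rough illustration of how per-subsystem linear models might be combined at runtime, the Python sketch below picks one linear model per subsystem according to its current DVFS state and sums the estimates. The coefficient table, PMC names, frequencies, and the function name `estimate_power` are hypothetical placeholders and are not taken from Runmeter or from the paper's fitted models.

```python
# Hypothetical coefficients: one linear model per (subsystem, DVFS frequency in kHz).
# Each model is (intercept in watts, {pmc_name: watts per (event/second)}).
MODELS = {
    ("cpu", 1_000_000): (0.50, {"cycles": 2.0e-10, "instructions": 1.0e-10}),
    ("cpu", 2_000_000): (1.00, {"cycles": 4.0e-10, "instructions": 2.0e-10}),
    ("gpu", 500_000): (0.25, {"active_cycles": 8.0e-10}),
}


def estimate_power(dvfs_state, pmc_rates):
    """Sum the per-subsystem linear models selected by the current DVFS state.

    dvfs_state: {subsystem: frequency_khz}
    pmc_rates:  {subsystem: {pmc_name: events_per_second}}
    """
    total = 0.0
    for subsystem, freq in dvfs_state.items():
        intercept, weights = MODELS[(subsystem, freq)]
        rates = pmc_rates.get(subsystem, {})
        # Linear model: intercept plus a weighted sum of PMC event rates.
        # Counters missing from the sample are treated as zero.
        total += intercept + sum(w * rates.get(name, 0.0) for name, w in weights.items())
    return total


if __name__ == "__main__":
    state = {"cpu": 2_000_000, "gpu": 500_000}
    rates = {
        "cpu": {"cycles": 2e9, "instructions": 1e9},
        "gpu": {"active_cycles": 5e8},
    }
    print(f"estimated power: {estimate_power(state, rates):.2f} W")
```

Keeping one small model per DVFS state means evaluation is only a table lookup and a handful of multiply-adds, which is consistent with the low-overhead goal stated above.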

Related content

Path · Labeling · Weight · Optimizer · CASES

February 19, 2024

Labeling Methods for Partially Ordered Paths

Ricardo Euler, Pedro Maristany de las Casas

The landscape of applications and subroutines relying on shortest path computations continues to grow steadily. This growth is driven by the undeniable success of shortest path algorithms in theory and practice. It also introduces new challenges as the models and assessing the optimality of paths become more complicated. Hence, multiple recent publications in the field adapt existing labeling methods in an ad-hoc fashion to their specific problem variant without considering the underlying general structure: they always deal with multi-criteria scenarios and those criteria define different partial orders on the paths. In this paper, we introduce the partial order shortest path problem (POSP), a generalization of the multi-objective shortest path problem (MOSP) and in turn also of the classical shortest path problem. POSP captures the particular structure of many shortest path applications as special cases. In this generality, we study optimality conditions or the lack of them, depending on the objective functions' properties. Our final contribution is a big lookup table summarizing our findings and providing the reader an easy way to choose among the most recent multicriteria shortest path algorithms depending on their problem's weight structure. Examples range from time-dependent shortest path and bottleneck path problems to the fuzzy shortest path problem and complex financial weight functions studied in the public transportation community. Our results hold for general digraphs and therefore surpass previous generalizations that were limited to acyclic graphs.
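To make the idea of labeling under a partial order concrete, here is a minimal Python sketch of a label-correcting search for the multi-objective special case. It assumes additive, non-negative cost tuples compared componentwise (Pareto dominance) and that every node appears as a key in the graph; the function names and the example graph are hypothetical, and this is not the algorithm or lookup table from the paper.

```python
from collections import deque


def dominates(a, b):
    """Componentwise (Pareto) order: a is no worse in every criterion and a != b."""
    return a != b and all(x <= y for x, y in zip(a, b))


def pareto_labels(graph, source, num_criteria):
    """Return the non-dominated cost tuples for every node.

    graph maps node -> list of (neighbor, cost_tuple); every node must be a key.
    Assumes additive, non-negative costs so that dominance survives extension.
    """
    labels = {node: [] for node in graph}
    start = (0,) * num_criteria
    labels[source].append(start)
    queue = deque([(source, start)])
    while queue:
        node, cost = queue.popleft()
        if cost not in labels[node]:
            continue  # this label was dominated and removed after being queued
        for neighbor, arc_cost in graph[node]:
            new = tuple(c + a for c, a in zip(cost, arc_cost))
            current = labels[neighbor]
            if any(old == new or dominates(old, new) for old in current):
                continue  # an equal or better label already exists
            # Keep only labels the new one does not dominate, then add it.
            labels[neighbor] = [old for old in current if not dominates(new, old)] + [new]
            queue.append((neighbor, new))
    return labels


if __name__ == "__main__":
    graph = {
        "s": [("a", (1, 4)), ("b", (3, 1)), ("t", (5, 5))],
        "a": [("t", (1, 1))],
        "b": [("t", (1, 1))],
        "t": [],
    }
    print(sorted(pareto_labels(graph, "s", 2)["t"]))
```

Swapping `dominates` for a different partial order is where the optimality conditions discussed in the abstract matter: pruning dominated labels is only safe when the order is preserved by path extension.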

SOI · Better · contrastive · Loss · Performer

February 16, 2024

A Low-Dissipation and Scalable GEMM Accelerator with Silicon Nitride Photonics

Venkata Sai Praneeth Karempudi, Sairam Sri Vatsavai, Ishan Thakkar, Oluwaseun Adewunmi Alo, Jeffrey Todd Hastings, Justin Scott Woods

From arXiv; to appear at ISQED 2024

Over the past few years, several microring resonator (MRR)-based analog photonic architectures have been proposed to accelerate general matrix-matrix multiplications (GEMMs), which are found in abundance in deep learning workloads. These architectures have dramatically grown in popularity because they offer exceptional throughput and energy efficiency compared to their electronic counterparts. However, such architectures, due to their traditional realization based on the silicon-on-insulator (SOI) material platform, face two shortcomings. First, the high index contrast of the SOI platform incurs high scattering losses, which mandates the provisioning of high optical input power. Second, SOI waveguides are susceptible to two-photon absorption (TPA), which can incur substantial optical signal losses at moderate-to-high signal fan-in. These shortcomings have severely detrimental effects on the achievable parallelism, throughput, and energy efficiency of SOI MRR-based GEMM accelerators. To address these shortcomings, we present a novel Silicon Nitride (SiN)-Based Photonic GEMM Accelerator called SiNPhAR. The SiNPhAR architecture employs SiN-based active and passive devices to implement analog GEMM functions. Since the SiN material exhibits lower index contrast and no TPA, the optical signal losses in our SiNPhAR architecture are very low. This advantage significantly enhances the achievable processing parallelism, throughput, and energy efficiency of the SiNPhAR architecture, compared to SOI-based photonic GEMM accelerators from prior work. We quantify and compare these benefits of the SiNPhAR architecture via our cross-layer evaluation for a benchmark workload comprising four modern deep neural network models. In the system-level performance analysis, SiNPhAR achieves at least 1.7x higher throughput (FPS) and at least 2.8x better energy efficiency (FPS/W) than prior SOI-based GEMM accelerators.

Robustness · Representation · Representation learning · Learning · Pivotal (company)

February 16, 2024

Enhancement-Driven Pretraining for Robust Fingerprint Representation Learning

Ekta Gavas, Kaustubh Olpadkar, Anoop Namboodiri

From arXiv; 8 pages, 4 figures; accepted at 19th VISIGRAPP 2024: VISAPP conference

Fingerprint recognition stands as a pivotal component of biometric technology, with diverse applications from identity verification to advanced search tools. In this paper, we propose a unique method for deriving robust fingerprint representations by leveraging enhancement-based pre-training. Building on the achievements of U-Net-based fingerprint enhancement, our method employs a specialized encoder to derive representations from fingerprint images in a self-supervised manner. We further refine these representations, aiming to enhance the verification capabilities. Our experimental results, tested on publicly available fingerprint datasets, reveal a marked improvement in verification performance against established self-supervised training techniques. Our findings not only highlight the effectiveness of our method but also pave the way for potential advancements. Crucially, our research indicates that it is feasible to extract meaningful fingerprint representations from degraded images without relying on enhanced samples.

Peak · Graph · INTERACT · ForCES · Convex set

February 15, 2024

Towards Tight Convex Relaxations for Contact-Rich Manipulation

Bernhard P. Graesdal, Shao Y. C. Chia, Tobia Marcucci, Savva Morozov, Alexandre Amice, Pablo A. Parrilo, Russ Tedrake

We present a method for global motion planning of robotic systems that interact with the environment through contacts. Our method directly handles the hybrid nature of such tasks using tools from convex optimization. We formulate the motion-planning problem as a shortest-path problem in a graph of convex sets, where a path in the graph corresponds to a contact sequence and a convex set models the quasi-static dynamics within a fixed contact mode. For each contact mode, we use semidefinite programming to relax the nonconvex dynamics that result from the simultaneous optimization of the object's pose, contact locations, and contact forces. The result is a tight convex relaxation of the overall planning problem that can be efficiently solved and quickly rounded to find a feasible contact-rich trajectory. As a first application of this technique, we focus on the task of planar pushing. Exhaustive experiments show that our convex-optimization method generates plans that are consistently within a small percentage of the global optimum. We demonstrate the quality of these plans on a real robotic system.

INFORMS · Optimizer · Functional · Controller · Sensor

February 14, 2024

Semantic Filtering and Source Coding in Distributed Wireless Monitoring Systems

Pouya Agheli, Nikolaos Pappas, Marios Kountouris

From arXiv; accepted for publication in IEEE Transactions on Communications

The problem of goal-oriented semantic filtering and timely source coding in multiuser communication systems is considered here. We study a distributed monitoring system in which multiple information sources, each observing a physical process, provide status update packets to multiple monitors having heterogeneous goals. Two semantic filtering schemes are first proposed as a means to admit or drop arriving packets based on their goal-dependent importance, which is a function of the intrinsic and extrinsic attributes of information and the probability of occurrence of each realization. Admitted packets at each sensor are then encoded and transmitted over block-fading wireless channels so that served monitors can fulfill their goals in a timely manner. A truncated error control scheme is derived, which allows transmitters to drop or retransmit undelivered packets based on their significance. Then, we formulate the timely source encoding optimization problem and analytically derive the optimal codeword lengths assigned to the admitted packets, which maximize a weighted sum of semantic utility functions for all pairs of communicating sensors and monitors. Our analytical and numerical results provide the optimal design parameters for different arrival rates and highlight the improvement in timely status update delivery using the proposed semantic filtering, source coding, and error control schemes.

Multi-task learning · Learning · Comprehensibility · INFORMS · Generalization theory

March 28, 2022

Multi-Task Learning for Visual Scene Understanding

Simon Vandenhende

From arXiv; PhD thesis

Despite the recent progress in deep learning, most approaches still go for a silo-like solution, focusing on learning each task in isolation: training a separate neural network for each individual task. Many real-world problems, however, call for a multi-modal approach and, therefore, for multi-tasking models. Multi-task learning (MTL) aims to leverage useful information across tasks to improve the generalization capability of a model. This thesis is concerned with multi-task learning in the context of computer vision. First, we review existing approaches for MTL. Next, we propose several methods that tackle important aspects of multi-task learning. The proposed methods are evaluated on various benchmarks. The results show several advances in the state-of-the-art of multi-task learning. Finally, we discuss several possibilities for future work.

Prompt · MoDELS · Learning · Extensibility · Vectorization

March 10, 2022

Conditional Prompt Learning for Vision-Language Models

Kaiyang Zhou, Jingkang Yang, Chen Change Loy, Ziwei Liu

From arXiv; CVPR 2022. TL;DR: We propose a conditional prompt learning approach to solve the generalizability issue of static prompts.

With the rise of powerful pre-trained vision-language models like CLIP, it becomes essential to investigate ways to adapt these models to downstream datasets. A recently proposed method named Context Optimization (CoOp) introduces the concept of prompt learning -- a recent trend in NLP -- to the vision domain for adapting pre-trained vision-language models. Specifically, CoOp turns context words in a prompt into a set of learnable vectors and, with only a few labeled images for learning, can achieve huge improvements over intensively-tuned manual prompts. In our study, we identify a critical problem of CoOp: the learned context is not generalizable to wider unseen classes within the same dataset, suggesting that CoOp overfits base classes observed during training. To address the problem, we propose Conditional Context Optimization (CoCoOp), which extends CoOp by further learning a lightweight neural network to generate for each image an input-conditional token (vector). Compared to CoOp's static prompts, our dynamic prompts adapt to each instance and are thus less sensitive to class shift. Extensive experiments show that CoCoOp generalizes much better than CoOp to unseen classes, even showing promising transferability beyond a single dataset; and yields stronger domain generalization performance as well. Code is available at https://github.com/KaiyangZhou/CoOp.

Learning · Machine Learning · INTERACT · Graph · INFORMS

May 27, 2021

Graph-Based Deep Learning for Medical Diagnosis and Analysis: Past, Present and Future

David Ahmedt-Aristizabal, Mohammad Ali Armin, Simon Denman, Clinton Fookes, Lars Petersson

With the advances of data-driven machine learning research, a wide variety of prediction problems have been tackled. It has become critical to explore how machine learning, and specifically deep learning methods, can be exploited to analyse healthcare data. A major limitation of existing methods has been the focus on grid-like data; however, the structure of physiological recordings is often irregular and unordered, which makes it difficult to conceptualise them as a matrix. As such, graph neural networks have attracted significant attention by exploiting implicit information that resides in a biological system, with interactive nodes connected by edges whose weights can be either temporal associations or anatomical junctions. In this survey, we thoroughly review the different types of graph architectures and their applications in healthcare. We provide an overview of these methods in a systematic manner, organized by their domain of application including functional connectivity, anatomical structure and electrical-based analysis. We also outline the limitations of existing techniques and discuss potential directions for future research.

Learning · Deep learning · Vision · Computer vision · Performance

September 5, 2019

Scene Text Detection and Recognition: The Deep Learning Era

Shangbang Long, Xin He, Cong Yao

From arXiv; submitted version

With the rise and development of deep learning, computer vision has been tremendously transformed and reshaped. As an important research area in computer vision, scene text detection and recognition has inevitably been influenced by this wave of revolution, consequently entering the era of deep learning. In recent years, the community has witnessed substantial advancements in mindset, approach and performance. This survey aims to summarize and analyze the major changes and significant progress in scene text detection and recognition in the deep learning era. In this article, we aim to: (1) introduce new insights and ideas; (2) highlight recent techniques and benchmarks; (3) look ahead to future trends. Specifically, we emphasize the dramatic differences brought by deep learning and the grand challenges that remain. We expect this review paper to serve as a reference book for researchers in this field. Related resources are also collected and compiled in our GitHub repository: https://github.com/Jyouhou/SceneTextPapers.

Object detection · Fashion MNIST (dataset) · SimPLe · Vision · Training data

May 17, 2018

Zero-Shot Object Detection by Hybrid Region Embedding

Berkan Demirel, Ramazan Gokberk Cinbis, Nazli Ikizler-Cinbis

Object detection is considered one of the most challenging problems in computer vision, since it requires correct prediction of both the classes and the locations of objects in images. In this study, we define a more difficult scenario, namely zero-shot object detection (ZSD), where no visual training data is available for some of the target object classes. We present a novel approach to tackle this ZSD problem, where a convex combination of embeddings is used in conjunction with a detection framework. For evaluation of ZSD methods, we propose a simple dataset constructed from Fashion-MNIST images and also a custom zero-shot split for the Pascal VOC detection challenge. The experimental results suggest that our method yields promising results for ZSD.
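The phrase "convex combination of embeddings" can be illustrated with a small Python sketch. It assumes, purely for illustration, that a region's scores over seen classes are used as mixing weights for the seen-class embeddings and that the mixed vector is matched to unseen classes by cosine similarity; the class names, 2-D embeddings, scores, and function names are hypothetical and do not come from the paper.

```python
import math


def convex_combination(weights, embeddings):
    """Normalize non-negative weights to sum to 1, then mix the embeddings."""
    total = sum(weights)
    if total <= 0 or any(w < 0 for w in weights):
        raise ValueError("weights must be non-negative with a positive sum")
    dim = len(embeddings[0])
    return [sum(w / total * e[i] for w, e in zip(weights, embeddings)) for i in range(dim)]


def cosine(a, b):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Hypothetical 2-D class embeddings and seen-class scores for one image region.
seen = {"shirt": [1.0, 0.0], "boot": [0.0, 1.0]}
unseen = {"coat": [0.9, 0.1], "sandal": [0.1, 0.9]}
scores = {"shirt": 3.0, "boot": 1.0}

# Mix the seen-class embeddings, then pick the closest unseen class.
region = convex_combination([scores[c] for c in seen], [seen[c] for c in seen])
best = max(unseen, key=lambda c: cosine(region, unseen[c]))

if __name__ == "__main__":
    print(region, best)
```

Because the weights are non-negative and sum to one, the mixed vector always stays inside the convex hull of the seen-class embeddings, which is what makes the combination "convex".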