High Dynamic Range (HDR) videos can represent wider ranges of contrast and color than Standard Dynamic Range (SDR) videos, yielding more vivid viewing experiences. As a result, HDR video is expected to become the dominant video modality of the future. However, HDR videos are incompatible with existing SDR displays, which constitute the majority of affordable consumer displays on the market. HDR videos must therefore be tone-mapped to reduced bit depths to serve the broad population of SDR-limited video consumers. Here, we analyze the impact of tone-mapping operators on the visual quality of streaming HDR videos. To this end, we built the first large-scale, subjectively annotated, open-source database of compressed tone-mapped HDR videos, containing 15,000 tone-mapped sequences derived from 40 unique HDR source contents. The videos in the database were labeled with more than 750,000 subjective quality annotations, collected from more than 1,600 unique human observers. We demonstrate the usefulness of the new subjective database by benchmarking objective models of visual quality on it. We envision that the new LIVE Tone-Mapped HDR (LIVE-TMHDR) database will enable significant progress on HDR video tone mapping and quality assessment in the future. To this end, we make the database freely available to the community at //live.ece.utexas.edu/research/LIVE_TMHDR/index.html
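To make the tone-mapping step concrete, here is a minimal sketch of one classic global operator (Reinhard-style) that maps HDR luminance into a quantized SDR range. It is an illustrative example only, not one of the database's specific operators; the key value and bit depth are assumed parameters.

```python
import numpy as np

def reinhard_tonemap(hdr_luminance, key=0.18, bit_depth=8):
    """Map HDR luminance to an SDR range with a global Reinhard-style operator."""
    eps = 1e-6
    # Scale scene luminance by its log-average (the "key" of the scene).
    log_avg = np.exp(np.mean(np.log(hdr_luminance + eps)))
    scaled = key * hdr_luminance / log_avg
    # Compress: bright values saturate toward 1.0, dark values stay near-linear.
    ldr = scaled / (1.0 + scaled)
    # Quantize to the target SDR bit depth.
    levels = 2 ** bit_depth - 1
    return np.round(ldr * levels).astype(np.uint8 if bit_depth <= 8 else np.uint16)

# Example: a synthetic HDR frame with a 10^4:1 contrast range.
frame = np.random.uniform(1e-2, 1e2, size=(1080, 1920))
sdr = reinhard_tonemap(frame)
```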
Micro-expression recognition (MER) aims to recognize short, subtle facial movements in micro-expression (ME) video clips, which reveal real emotions. Most recent MER methods use only special frames from ME video clips, or extract optical flow from those special frames. However, they neglect the space-time relationships of facial movements, within which key facial cues are hidden. To address this issue, we propose the Hierarchical Space-Time Attention (HSTA). Specifically, we first process ME video frames and special frames (or derived data) in parallel with our cascaded Unimodal Space-Time Attention (USTA), establishing connections between subtle facial movements and specific facial areas. Then, we design a Crossmodal Space-Time Attention (CSTA) to achieve higher-quality fusion of crossmodal data. Finally, we hierarchically integrate USTA and CSTA to capture deeper facial cues. Our model emphasizes temporal modeling without neglecting the processing of special data, and it fuses the contents of different modalities while maintaining their respective uniqueness. Extensive experiments on four benchmarks show the effectiveness of the proposed HSTA. In particular, compared with the latest method on the CASME3 dataset, HSTA achieves an improvement of about 3% in the seven-category classification score.
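As an illustration of the crossmodal fusion idea, the following is a minimal PyTorch sketch in which tokens from one modality query the other and are merged through a residual connection. The dimensions, single-block layout, and module name are our assumptions, not the paper's exact design.

```python
import torch
import torch.nn as nn

class CrossmodalSpaceTimeAttention(nn.Module):
    """Minimal sketch of crossmodal fusion: one modality queries the other."""
    def __init__(self, dim=256, heads=8):
        super().__init__()
        self.norm_q = nn.LayerNorm(dim)
        self.norm_kv = nn.LayerNorm(dim)
        self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)

    def forward(self, video_tokens, special_tokens):
        # video_tokens:   (B, T*N, dim) space-time tokens from the full clip
        # special_tokens: (B, M, dim)   tokens from special frames or derived data
        q = self.norm_q(video_tokens)
        kv = self.norm_kv(special_tokens)
        fused, _ = self.attn(q, kv, kv)   # queries attend to the other modality
        return video_tokens + fused       # residual keeps each modality's identity
```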
Blind face restoration (BFR) of images has progressed significantly over the last several years, while real-world video face restoration (VFR), which is more challenging because it involves more complex face motions such as changing gaze directions and facial orientations, remains unsolved. Typical BFR methods are evaluated on privately synthesized datasets or self-collected real-world low-quality face images, which offer limited coverage of real-world video frames. In this work, we introduce new real-world datasets named FOS, with a taxonomy of "Full, Occluded, and Side" faces drawn mainly from video frames, to study the applicability of current methods to videos. Compared with existing test datasets, the FOS datasets cover more diverse degradations and include face samples from more complex scenarios, which helps to revisit current face restoration approaches more comprehensively. Given the established datasets, we benchmark both state-of-the-art BFR methods and video super-resolution (VSR) methods to comprehensively study current approaches, identifying their potential and limitations for VFR tasks. In addition, we study the effectiveness of commonly used image quality assessment (IQA) and face IQA (FIQA) metrics by leveraging a subjective user study. With extensive experimental results and detailed analysis, we draw insights from the successes and failures of both current BFR and VSR methods. These results also pose challenges to current face restoration approaches, which we hope will stimulate future advances in VFR research.
The White House Executive Order on Artificial Intelligence highlights the risks of large language models (LLMs) empowering malicious actors to develop biological, cyber, and chemical weapons. To measure these risks of malicious use, government institutions and major AI labs are developing evaluations for hazardous capabilities in LLMs. However, current evaluations are private, preventing further research into mitigating risk. Furthermore, they focus on only a few, highly specific pathways for malicious use. To fill these gaps, we publicly release the Weapons of Mass Destruction Proxy (WMDP) benchmark, a dataset of 3,668 multiple-choice questions that serve as a proxy measurement of hazardous knowledge in biosecurity, cybersecurity, and chemical security. WMDP was developed by a consortium of academics and technical consultants, and was stringently filtered to eliminate sensitive information prior to public release. WMDP serves two roles: first, as an evaluation of hazardous knowledge in LLMs, and second, as a benchmark for unlearning methods that remove such hazardous knowledge. To guide progress on unlearning, we develop RMU, a state-of-the-art unlearning method based on controlling model representations. RMU reduces model performance on WMDP while maintaining general capabilities in areas such as biology and computer science, suggesting that unlearning may be a concrete path towards reducing malicious use of LLMs. We release our benchmark and code publicly at //wmdp.ai
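To sketch what "controlling model representations" can look like in code, below is a minimal, hedged PyTorch rendering of an RMU-style objective: hidden activations on forget data are steered toward a fixed random control vector, while retain-data activations are anchored to the frozen base model. The coefficients, shapes, and names are illustrative assumptions, not the released implementation.

```python
import torch
import torch.nn.functional as F

def rmu_loss(acts_forget, acts_retain, frozen_acts_retain,
             control_vec, steer_coeff=100.0, alpha=1200.0):
    """RMU-style objective (coefficients and shapes are illustrative).

    acts_*: (batch, seq, hidden) activations at an intermediate layer of the
    model being updated; frozen_acts_retain comes from the frozen base model.
    """
    # Forget term: steer activations on hazardous data toward a random vector.
    target = steer_coeff * control_vec.expand_as(acts_forget)
    forget_loss = F.mse_loss(acts_forget, target)
    # Retain term: keep benign-data activations close to the base model's.
    retain_loss = F.mse_loss(acts_retain, frozen_acts_retain)
    return forget_loss + alpha * retain_loss

# The control vector is sampled once, normalized, then held fixed.
hidden = 4096
control_vec = torch.rand(hidden)
control_vec = control_vec / control_vec.norm()
```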
The rapid advancement of Large Language Models (LLMs) has markedly enhanced their language understanding and generation capabilities. However, their substantial size poses hardware challenges, affecting both the memory required for serving and the inference latency of token generation. To address these challenges, we propose Dependency-aware Semi-structured Sparsity (DaSS), a novel method for pruning the recently prevalent SwiGLU-based LLMs. Our approach incorporates structural dependency into weight magnitude-based unstructured pruning. We introduce an MLP-specific pruning metric that evaluates the importance of each weight by jointly considering its magnitude and the norm of its corresponding MLP intermediate activations. DaSS strikes a balance between the adaptability offered by unstructured pruning and the structural consistency inherent in dependency-based structured pruning. Empirical evaluations on the Mistral and LLaMA2 model families demonstrate that DaSS not only outperforms both SparseGPT and Wanda in achieving hardware-friendly N:M sparsity patterns, but also retains the computational efficiency of Wanda.
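The following is a minimal sketch of how such a magnitude-times-activation-norm metric could be computed and used to impose an N:M pattern on a down-projection matrix. The exact formula, grouping axis, and function names are our reading of the abstract rather than the paper's specification.

```python
import torch

def dass_style_scores(weight, intermediate_acts):
    """Importance scores for a down-projection matrix of a SwiGLU MLP.

    weight:            (d_out, d_int) down_proj weights
    intermediate_acts: (n_tokens, d_int) calibration activations feeding it
    Each weight's score couples its magnitude with the norm of the
    intermediate activation on its input channel.
    """
    act_norms = intermediate_acts.norm(p=2, dim=0)   # (d_int,)
    return weight.abs() * act_norms.unsqueeze(0)     # broadcast over output rows

def apply_nm_sparsity(weight, scores, n=2, m=4):
    """Keep the n highest-scoring weights in every group of m along d_int.

    Assumes d_int is divisible by m.
    """
    d_out, d_int = weight.shape
    grouped = scores.reshape(d_out, d_int // m, m)
    # Mask out everything but the top-n entries of each group.
    idx = grouped.topk(n, dim=-1).indices
    mask = torch.zeros_like(grouped, dtype=torch.bool).scatter_(-1, idx, True)
    return weight * mask.reshape(d_out, d_int)
```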
Insufficient overlap between the melt pools produced during Laser Powder Bed Fusion (L-PBF) can lead to lack-of-fusion defects and deteriorated mechanical and fatigue performance. In-situ monitoring of the melt pool subsurface morphology requires specialized equipment that may not be readily accessible or scalable. Therefore, we introduce a machine learning framework to correlate in-situ two-color thermal images observed via high-speed color imaging to the two-dimensional profile of the melt pool cross-section. Specifically, we employ a hybrid CNN-Transformer architecture to establish a correlation between single-bead off-axis thermal image sequences and melt pool cross-section contours measured via optical microscopy. In this architecture, a ResNet model embeds the spatial information contained within the thermal images into a latent vector, while a Transformer model correlates the sequence of embedded vectors to extract temporal information. Our framework can model the curvature of the subsurface melt pool structure, with improved performance in high energy density regimes compared to analytical melt pool models. The performance of this model is evaluated through dimensional and geometric comparisons to the corresponding experimental melt pool observations.
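A minimal sketch of the hybrid CNN-Transformer pattern described above: a ResNet backbone embeds each thermal frame into a latent vector, a Transformer encoder correlates the sequence, and a regression head emits contour coordinates. The backbone choice, dimensions, channel count, and contour parameterization are illustrative assumptions.

```python
import torch
import torch.nn as nn
from torchvision.models import resnet18

class MeltPoolNet(nn.Module):
    """Sketch of the hybrid CNN-Transformer idea (sizes are illustrative)."""
    def __init__(self, d_model=512, n_contour_pts=64):
        super().__init__()
        cnn = resnet18(weights=None)
        cnn.fc = nn.Identity()                  # ResNet as a per-frame embedder
        self.cnn = cnn
        layer = nn.TransformerEncoderLayer(d_model, nhead=8, batch_first=True)
        self.temporal = nn.TransformerEncoder(layer, num_layers=4)
        self.head = nn.Linear(d_model, 2 * n_contour_pts)  # (x, y) per point

    def forward(self, frames):
        # frames: (B, T, 3, H, W); two-color data would need a channel adapter.
        b, t = frames.shape[:2]
        z = self.cnn(frames.flatten(0, 1)).view(b, t, -1)  # per-frame embeddings
        z = self.temporal(z)                   # correlate the temporal sequence
        return self.head(z.mean(dim=1))        # pooled -> contour coordinates
```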
There are two paradigms in Federated Learning (FL): parallel FL (PFL), where models are trained in parallel across clients, and sequential FL (SFL), where models are trained sequentially across clients. In contrast to PFL, the convergence theory of SFL on heterogeneous data is still lacking. To resolve this theoretical gap, we establish sharp convergence guarantees for SFL on heterogeneous data, with both upper and lower bounds. Specifically, we derive upper bounds for strongly convex, general convex, and non-convex objective functions, and construct matching lower bounds for the strongly convex and general convex cases. We then compare the upper bounds of SFL with those of PFL, showing that SFL outperforms PFL (at least when the level of heterogeneity is relatively high). Experimental results on quadratic functions and real datasets validate this counterintuitive comparison.
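To make the two paradigms concrete, here is a minimal Python sketch contrasting one round of each, with models represented as plain name-to-tensor dicts. The client.local_update method is a hypothetical helper standing in for local SGD; it is not from the paper.

```python
import copy

def average_models(models):
    """FedAvg-style aggregation: average the parameters of the local models."""
    avg = copy.deepcopy(models[0])
    for name in avg:                        # models are name -> tensor dicts
        for m in models[1:]:
            avg[name] = avg[name] + m[name]
        avg[name] = avg[name] / len(models)
    return avg

def sequential_fl_round(model, clients, local_steps):
    """One SFL round: the single model visits clients one after another."""
    for client in clients:                  # each client continues training
        model = client.local_update(model, local_steps)
    return model                            # last client's model becomes global

def parallel_fl_round(model, clients, local_steps):
    """One PFL round (FedAvg-style): train copies in parallel, then average."""
    local_models = [client.local_update(copy.deepcopy(model), local_steps)
                    for client in clients]
    return average_models(local_models)
```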
Although several image super-resolution solutions exist, they still face many challenges. CNN-based algorithms, despite their reduced computational complexity, still fall short in accuracy, while Transformer-based algorithms achieve higher accuracy at an ultra-high computational cost that makes them hard to adopt in practical applications. To overcome these challenges, this paper proposes a novel super-resolution reconstruction algorithm that achieves a significant increase in accuracy while maintaining low complexity. The core of the algorithm lies in its Global-Local Information Extraction Module and Basic Block Module. By combining global and local information, the Global-Local Information Extraction Module captures image content more comprehensively, recovering both the global structure and the local details of the image more accurately and providing rich information for the subsequent reconstruction process. Experimental results show that the proposed algorithm achieves the best overall performance among the compared methods, providing an efficient and practical new solution for super-resolution reconstruction.
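The abstract does not specify the module's internals; as one plausible reading, the sketch below gates convolutional local features with a channel-wise global descriptor inside a residual block. All names and sizes are assumptions.

```python
import torch
import torch.nn as nn

class GlobalLocalBlock(nn.Module):
    """One possible global-local mixing block (our guess at the idea)."""
    def __init__(self, channels=64):
        super().__init__()
        self.local = nn.Conv2d(channels, channels, 3, padding=1)  # local details
        self.global_pool = nn.AdaptiveAvgPool2d(1)                # global statistics
        self.global_fc = nn.Sequential(
            nn.Conv2d(channels, channels // 4, 1), nn.ReLU(inplace=True),
            nn.Conv2d(channels // 4, channels, 1), nn.Sigmoid())
        self.act = nn.GELU()

    def forward(self, x):
        local_feat = self.act(self.local(x))
        gate = self.global_fc(self.global_pool(x))  # channel-wise global gate
        return x + local_feat * gate                # residual fusion
```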
Step Chemical Reaction Networks (step CRNs) augment the Chemical Reaction Network (CRN) model by allowing additional species to be introduced to the system in a sequence of ``steps.'' We study step CRN systems using a weak subset of reaction rules, \emph{void} rules, in which molecular species can only be deleted. We demonstrate that step CRNs with only void rules of size (2,0) can simulate threshold formulas (TFs) using linear resources. These limited systems can also simulate threshold \emph{circuits} (TCs) if the volume of the system is allowed to grow exponentially. We then prove a matching exponential lower bound on the volume required to simulate threshold circuits in a step CRN with (2,0)-size rules under a restricted \emph{gate-wise} simulation, showing that our construction is optimal for simulating circuits in this way.
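For intuition, here is a toy Python simulator of this model: each (2,0) void rule deletes a pair of species and produces nothing, and each step adds species before the rules fire to completion. The scheduling discipline shown is an assumption for illustration, not the paper's formal semantics.

```python
from collections import Counter

def apply_void_rules(config, rules):
    """Fire (2,0) void rules until none is applicable.

    config: Counter of species counts; rules: pairs of species deleted together.
    Termination is guaranteed because void rules only remove molecules.
    """
    changed = True
    while changed:
        changed = False
        for a, b in rules:
            while config[a] > 0 and config[b] > 0 and (a != b or config[a] >= 2):
                config[a] -= 1
                config[b] -= 1
                changed = True
    return +config   # drop zero-count species

def run_step_crn(steps, rules):
    """Add each step's species, then let the void rules fire to completion."""
    config = Counter()
    for step in steps:
        config.update(step)
        config = apply_void_rules(config, rules)
    return config

# Example: inputs x1, x2 "cancel" against gate species g across two steps.
steps = [Counter({"g": 2, "x1": 1}), Counter({"x2": 1})]
rules = [("g", "x1"), ("g", "x2")]
print(run_step_crn(steps, rules))   # Counter() once every gate is consumed
```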
Medical image segmentation requires consensus ground-truth segmentations to be derived from multiple expert annotations. We propose a novel approach that obtains consensus segmentations from experts using graph cuts (GC) and semi-supervised learning (SSL). Popular approaches use iterative Expectation Maximization (EM) to estimate the final annotation and quantify each annotator's performance; such techniques risk getting trapped in local minima. We propose a self-consistency (SC) score that quantifies annotator consistency using low-level image features. SSL is used to predict missing annotations by considering global features and local image consistency. The SC score also serves as the penalty cost in a second-order Markov random field (MRF) cost function that is optimized using graph cuts to derive the final consensus label. Graph cuts obtain a globally optimal solution without an iterative procedure. Experimental results on synthetic images, real data from Crohn's disease patients, and retinal images show that our final segmentations are accurate and more consistent than those of competing methods.
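Below is a minimal sketch of the final fusion step, phrased as a binary MRF solved with a single graph cut via the PyMaxflow library (assumed available). The data and smoothness terms shown are illustrative stand-ins for the paper's SC-weighted cost.

```python
import numpy as np
import maxflow  # PyMaxflow; assumed available

def consensus_labels(prob_fg, sc_penalty, smooth=1.0):
    """Fuse annotations into a consensus mask with one graph cut.

    prob_fg:    (H, W) per-pixel foreground probability fused from annotators
    sc_penalty: (H, W) self-consistency-derived pairwise weight (our reading)
    """
    g = maxflow.Graph[float]()
    nodes = g.add_grid_nodes(prob_fg.shape)
    eps = 1e-6
    # Data term: negative log-likelihoods as terminal edge capacities.
    g.add_grid_tedges(nodes, -np.log(1 - prob_fg + eps), -np.log(prob_fg + eps))
    # Smoothness term: SC-weighted 4-connected pairwise edges.
    g.add_grid_edges(nodes, weights=smooth * sc_penalty, symmetric=True)
    g.maxflow()
    return g.get_grid_segments(nodes)  # boolean consensus mask
```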
We propose a novel single-shot object detection network named Detection with Enriched Semantics (DES). Our motivation is to enrich the semantics of object detection features within a typical deep detector via a semantic segmentation branch and a global activation module. The segmentation branch is supervised by weak segmentation ground truth, i.e., no extra annotation is required. In conjunction, we employ a global activation module that learns the relationships between channels and object classes in a self-supervised manner. Comprehensive experimental results on both the PASCAL VOC and MS COCO detection datasets demonstrate the effectiveness of the proposed method. In particular, with a VGG16-based DES, we achieve an mAP of 81.7 on VOC2007 test and an mAP of 32.8 on COCO test-dev, with an inference speed of 31.5 milliseconds per image on a Titan Xp GPU. With a lower-resolution version, we achieve an mAP of 79.7 on VOC2007 with an inference speed of 13.0 milliseconds per image.
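One hedged reading of the global activation module, sketched in PyTorch below: a global channel descriptor gates the feature map, while an auxiliary image-level classification head ties channels to object classes using labels already present in the detection ground truth. The layer sizes and layout are assumptions.

```python
import torch
import torch.nn as nn

class GlobalActivationModule(nn.Module):
    """Sketch: recalibrate channels and relate them to object classes."""
    def __init__(self, channels=512, num_classes=20):
        super().__init__()
        self.pool = nn.AdaptiveAvgPool2d(1)
        self.gate = nn.Sequential(nn.Linear(channels, channels), nn.Sigmoid())
        self.cls = nn.Linear(channels, num_classes)  # channel-class relationship

    def forward(self, feat):                         # feat: (B, C, H, W)
        v = self.pool(feat).flatten(1)               # global channel descriptor
        g = self.gate(v)                             # per-channel activation
        out = feat * g.unsqueeze(-1).unsqueeze(-1)   # reweight feature channels
        logits = self.cls(v)                         # trained with image-level
        return out, logits                           # class labels from det. GT
```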