We present DINAR, an approach for creating realistic rigged full-body avatars from single RGB images. Similarly to previous works, our method uses neural textures combined with the SMPL-X body model to achieve photo-realistic quality of avatars while keeping them easy to animate and fast to infer. To restore the texture, we use a latent diffusion model and show how such a model can be trained in the neural texture space. The use of the diffusion model allows us to realistically reconstruct large unseen regions, such as the back of a person given the frontal view. The models in our pipeline are trained using 2D images and videos only. In the experiments, our approach achieves state-of-the-art rendering quality and good generalization to new poses and viewpoints. In particular, the approach improves on the state of the art on the SnapshotPeople public benchmark.

Related content

MoDELS

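The ACM/IEEE 23rd International Conference on Model Driven Engineering Languages and Systems is the premier conference series for model-driven software and systems engineering, organized with the support of ACM SIGSOFT and IEEE TCSE. Since 1998, MODELS has covered all aspects of modeling, from languages and methods to tools and applications. Attendees of MODELS come from diverse backgrounds, including researchers, academics, engineers, and industrial professionals. MODELS 2019 is a forum where participants can exchange cutting-edge research results and innovative practical experience around modeling and model-driven software and systems. This year's edition will give the modeling community an opportunity to further advance the foundations of modeling and to propose innovative applications of modeling in emerging areas such as cyber-physical systems, embedded systems, socio-technical systems, cloud computing, big data, machine learning, security, open source, and sustainability. Official website:

Neural Networks · Diversity · Extensibility · state-of-the-art ·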

October 5, 2023

PINNacle: A Comprehensive Benchmark of Physics-Informed Neural Networks for Solving PDEs

Zhongkai Hao,Jiachen Yao,Chang Su,Hang Su,Ziao Wang,Fanzhi Lu,Zeyu Xia,Yichi Zhang,Songming Liu,Lu Lu,Jun Zhu

While significant progress has been made on Physics-Informed Neural Networks (PINNs), a comprehensive comparison of these methods across a wide range of Partial Differential Equations (PDEs) is still lacking. This study introduces PINNacle, a benchmarking tool designed to fill this gap. PINNacle provides a diverse dataset, comprising over 20 distinct PDEs from various domains, including heat conduction, fluid dynamics, biology, and electromagnetics. These PDEs encapsulate key challenges inherent to real-world problems, such as complex geometry, multi-scale phenomena, nonlinearity, and high dimensionality. PINNacle also offers a user-friendly toolbox, incorporating about 10 state-of-the-art PINN methods for systematic evaluation and comparison. We have conducted extensive experiments with these methods, offering insights into their strengths and weaknesses. In addition to providing a standardized means of assessing performance, PINNacle also offers an in-depth analysis to guide future research, particularly in areas such as domain decomposition methods and loss reweighting for handling multi-scale problems and complex geometry. To the best of our knowledge, it is the largest benchmark with a diverse and comprehensive evaluation that will undoubtedly foster further research in PINNs.

Stream · Extensibility · Continuity · Twitch · Analysis ·

October 5, 2023

Toward One-Second Latency: Evolution of Live Media Streaming

Abdelhak Bentaleb,May Lim,Mehmet N. Akcay,Ali C. Begen,Sarra Hammoudi,Roger Zimmermann

This survey presents the evolution of live media streaming and the technological developments behind today's IP-based low-latency live streaming systems. Live streaming primarily involves capturing, encoding, packaging and delivering real-time events such as live sports, live news, personal broadcasts and surveillance videos. Live streaming also involves concurrent streaming of linear TV programming off the satellite, cable, over-the-air or IPTV broadcast, where the programming is not necessarily a real-time event. The survey starts with a discussion on the latency and latency continuum in streaming applications. Then, it lays out the existing live streaming workflows and protocols, followed by an in-depth analysis of the latency sources in these workflows and protocols. The survey continues with the technology enablers, low-latency extensions for the popular HTTP adaptive streaming methods and enhancements for robust low-latency playback. An entire section is dedicated to the detailed summary and findings of Twitch's grand challenge on low-latency live streaming. The survey concludes with a discussion of ongoing research problems in this space.

Rank · LORA · Linear · Linear combination · MoDELS ·

October 4, 2023

NOLA: Networks as Linear Combination of Low Rank Random Basis

Soroush Abbasi Koohpayegani,KL Navaneet,Parsa Nooralinejad,Soheil Kolouri,Hamed Pirsiavash

from arxiv, Our code is available here: //github.com/UCDvision/NOLA

Large Language Models (LLMs) have recently gained popularity due to their impressive few-shot performance across various downstream tasks. However, fine-tuning all parameters and storing a unique model for each downstream task or domain becomes impractical because of the massive size of checkpoints (e.g., 350GB in GPT-3). Current literature, such as LoRA, showcases the potential of low-rank modifications to the original weights of an LLM, enabling efficient adaptation and storage for task-specific models. These methods can reduce the number of parameters needed to fine-tune an LLM by several orders of magnitude. Yet, these methods face two primary limitations: 1) the parameter reduction is lower-bounded by the rank one decomposition, and 2) the extent of reduction is heavily influenced by both the model architecture and the chosen rank. For instance, in larger models, even a rank one decomposition might exceed the number of parameters truly needed for adaptation. In this paper, we introduce NOLA, which overcomes the rank one lower bound present in LoRA. It achieves this by re-parameterizing the low-rank matrices in LoRA using linear combinations of randomly generated matrices (basis) and optimizing the linear mixture coefficients only. This approach allows us to decouple the number of trainable parameters from both the choice of rank and the network architecture. We present adaptation results using GPT-2 and ViT in natural language and computer vision tasks. NOLA performs as well as, or better than models with equivalent parameter counts. Furthermore, we demonstrate that we can halve the parameters in larger models compared to LoRA with rank one, without sacrificing performance.
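The re-parameterization described in the NOLA abstract can be illustrated with a minimal sketch. It assumes the usual LoRA factorization of the weight update into two low-rank factors, and it builds each factor as a linear combination of frozen random matrices regenerated from a seed, so only the mixture coefficients would be trained. The function names, the Gaussian basis, the toy dimensions, and the coefficient values are hypothetical choices made for illustration; this is not the authors' implementation.

```python
import random

def random_basis(count, rows, cols, seed):
    """Generate `count` frozen random matrices from a seed (assumed Gaussian)."""
    rng = random.Random(seed)
    return [[[rng.gauss(0.0, 1.0) for _ in range(cols)] for _ in range(rows)]
            for _ in range(count)]

def combine(coeffs, basis):
    """Return sum_i coeffs[i] * basis[i] as a list-of-lists matrix."""
    rows, cols = len(basis[0]), len(basis[0][0])
    out = [[0.0] * cols for _ in range(rows)]
    for c, mat in zip(coeffs, basis):
        for i in range(rows):
            for j in range(cols):
                out[i][j] += c * mat[i][j]
    return out

def matmul(a, b):
    """Plain matrix product of two list-of-lists matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def nola_delta(alpha, beta, m, n, rank, seed=0):
    """Weight update A @ B where A and B are mixtures of random bases.

    Only `alpha` and `beta` would be trainable; the bases are rebuilt
    from the seed on demand and never need to be stored.
    """
    basis_a = random_basis(len(alpha), m, rank, seed)
    basis_b = random_basis(len(beta), rank, n, seed + 1)
    return matmul(combine(alpha, basis_a), combine(beta, basis_b))

# Hypothetical toy sizes: a 6 x 5 weight matrix adapted at rank 2.
m, n, rank = 6, 5, 2
alpha = [0.5, -1.0, 0.25]   # mixture coefficients for the A bases
beta = [1.0, 0.1]           # mixture coefficients for the B bases
delta = nola_delta(alpha, beta, m, n, rank)

lora_params = rank * (m + n)           # trainable entries of A and B in LoRA
nola_params = len(alpha) + len(beta)   # trainable coefficients in this sketch
```

In this sketch the trainable count depends only on how many basis matrices are mixed, not on the rank or the layer shape, which is the decoupling the abstract describes.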

NLP · Extensibility · Identifiable · motivation · Notability ·

October 3, 2023

Hierarchical Evaluation Framework: Best Practices for Human Evaluation

Iva Bojic,Jessica Chen,Si Yuan Chang,Qi Chwen Ong,Shafiq Joty,Josip Car

Human evaluation plays a crucial role in Natural Language Processing (NLP) as it assesses the quality and relevance of developed systems, thereby facilitating their enhancement. However, the absence of widely accepted human evaluation metrics in NLP hampers fair comparisons among different systems and the establishment of universal assessment standards. Through an extensive analysis of existing literature on human evaluation metrics, we identified several gaps in NLP evaluation methodologies. These gaps served as motivation for developing our own hierarchical evaluation framework. The proposed framework offers notable advantages, particularly in providing a more comprehensive representation of the NLP system's performance. We applied this framework to evaluate the developed Machine Reading Comprehension system, which was utilized within a human-AI symbiosis model. The results highlighted the associations between the quality of inputs and outputs, underscoring the necessity to evaluate both components rather than solely focusing on outputs. In future work, we will investigate the potential time-saving benefits of our proposed framework for evaluators assessing NLP systems.

Separated · MoDELS · state-of-the-art · Position embedding · Performance ·

October 3, 2023

Mel-Band RoFormer for Music Source Separation

Ju-Chiang Wang,Wei-Tsung Lu,Minz Won

from arxiv, submitted as an ISMIR 2023 late-breaking and demo paper

Recently, multi-band spectrogram-based approaches such as Band-Split RNN (BSRNN) have demonstrated promising results for music source separation. In our recent work, we introduce the BS-RoFormer model, which inherits the band-split scheme of BSRNN at the front-end, and then uses the hierarchical Transformer with Rotary Position Embedding (RoPE) to model the inner-band and inter-band sequences for multi-band mask estimation. This model has achieved state-of-the-art performance, but the band-split scheme is defined empirically, without analytic support from the literature. In this paper, we propose Mel-RoFormer, which adopts the Mel-band scheme that maps the frequency bins into overlapped subbands according to the mel scale. In contrast, the band-split mapping in BSRNN and BS-RoFormer is non-overlapping and designed based on heuristics. Using the MUSDB18HQ dataset for experiments, we demonstrate that Mel-RoFormer outperforms BS-RoFormer in the separation tasks of vocals, drums, and other stems.
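As a rough illustration of mapping frequency bins into overlapped subbands on the mel scale, the sketch below places band edges at equal mel spacing and lets each band span two consecutive edge intervals, so neighbouring bands share bins. The common 2595 / 700 mel formula, the two-interval band width, the function names, and the bin, sample-rate, and band counts are assumptions made for illustration; the paper's exact mapping may differ.

```python
import math

def hz_to_mel(freq_hz):
    """Convert Hz to mel using the common 2595 * log10(1 + f / 700) formula."""
    return 2595.0 * math.log10(1.0 + freq_hz / 700.0)

def mel_to_hz(mel):
    """Inverse of hz_to_mel."""
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)

def overlapping_mel_bands(n_bins, sample_rate, n_bands):
    """Assign spectrogram bin indices to overlapping mel-spaced subbands.

    Band b covers the frequencies between edge b and edge b + 2, so it
    overlaps band b + 1 on the interval between edge b + 1 and edge b + 2.
    """
    nyquist = sample_rate / 2.0
    mel_max = hz_to_mel(nyquist)
    # n_bands + 2 edges equally spaced in mel, converted back to Hz.
    edges = [mel_to_hz(mel_max * i / (n_bands + 1)) for i in range(n_bands + 2)]
    edges[0], edges[-1] = 0.0, nyquist  # pin the ends against rounding error
    bin_hz = [nyquist * k / (n_bins - 1) for k in range(n_bins)]
    return [[k for k, f in enumerate(bin_hz) if edges[b] <= f <= edges[b + 2]]
            for b in range(n_bands)]

# Hypothetical setup: 1025 bins (a 2048-point FFT) at 44.1 kHz, 8 bands.
bands = overlapping_mel_bands(n_bins=1025, sample_rate=44100, n_bands=8)
```

Because the edges are uniform in mel rather than in Hz, the low bands hold few bins and the high bands hold many, while every bin belongs to at least one band.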

Learning · Representation · Reinforcement learning · Moving average · HTTPS ·

October 2, 2023

H-InDex: Visual Reinforcement Learning with Hand-Informed Representations for Dexterous Manipulation

Yanjie Ze,Yuyao Liu,Ruizhe Shi,Jiaxin Qin,Zhecheng Yuan,Jiashun Wang,Huazhe Xu

from arxiv, NeurIPS 2023. Code and videos: //yanjieze.com/H-InDex

Human hands possess remarkable dexterity and have long served as a source of inspiration for robotic manipulation. In this work, we propose a human $\textbf{H}$and$\textbf{-In}$formed visual representation learning framework to solve difficult $\textbf{Dex}$terous manipulation tasks ($\textbf{H-InDex}$) with reinforcement learning. Our framework consists of three stages: (i) pre-training representations with 3D human hand pose estimation, (ii) offline adapting representations with self-supervised keypoint detection, and (iii) reinforcement learning with exponential moving average BatchNorm. The last two stages only modify $0.36\%$ of the parameters of the pre-trained representation in total, ensuring the knowledge from pre-training is maintained to the full extent. We empirically study 12 challenging dexterous manipulation tasks and find that H-InDex largely surpasses strong baseline methods and the recent visual foundation models for motor control. Code is available at //yanjieze.com/H-InDex .

IPU · Performer · Analysis · Processing (programming language) · Square matrix ·

September 30, 2023

On Performance Analysis of Graphcore IPUs: Analyzing Squared and Skewed Matrix Multiplication

S.-Kazem Shekofteh,Christian Alles,Nils Kochendörfer,Holger Fröning

In recent decades, High Performance Computing (HPC) has undergone significant enhancements, particularly in the realm of hardware platforms, aimed at delivering increased processing power while keeping power consumption within reasonable limits. The Intelligence Processing Unit (IPU) represents an entirely novel category of massively parallel processors, meticulously designed to expedite parallel computations through a multitude of processing cores and on-chip memory components interconnected via high-speed fabrics. While IPUs are primarily tailored for machine learning applications and come equipped with several libraries for the seamless implementation of neural networks, they also retain the capability to execute traditional parallel programs like matrix multiplication. However, it is essential to acknowledge that there are certain considerations and limitations when utilizing IPUs for such tasks. This paper embarks on an extensive analytical examination of matrix multiplications (MM) executed on an IPU, focusing on aspects such as execution efficiency and memory usage. Additionally, a comparative analysis is conducted, pitting the IPU against a GPU. Our findings indicate that IPUs can outperform modern GPUs, especially in handling the consistently challenging skewed matrix multiplication operations. For a more comprehensive understanding, we scrutinize various aspect ratios of matrices for these operations on an IPU and a Turing-class GPU (RTX 2080TI), revealing that the IPU consistently delivers more robust performance when dealing with skewed matrices compared to a GPU.

Pruning · Better · CAP · contrastive · MoDELS ·

December 14, 2021

From Dense to Sparse: Contrastive Pruning for Better Pre-trained Language Model Compression

Runxin Xu,Fuli Luo,Chengyu Wang,Baobao Chang,Jun Huang,Songfang Huang,Fei Huang

from arxiv, Accepted to AAAI 2022

Pre-trained Language Models (PLMs) have achieved great success in various Natural Language Processing (NLP) tasks under the pre-training and fine-tuning paradigm. With large quantities of parameters, PLMs are computation-intensive and resource-hungry. Hence, model pruning has been introduced to compress large-scale PLMs. However, most prior approaches only consider task-specific knowledge towards downstream tasks, but ignore the essential task-agnostic knowledge during pruning, which may cause catastrophic forgetting and lead to poor generalization ability. To maintain both task-agnostic and task-specific knowledge in our pruned model, we propose ContrAstive Pruning (CAP) under the paradigm of pre-training and fine-tuning. It is designed as a general framework, compatible with both structured and unstructured pruning. Unified in contrastive learning, CAP enables the pruned model to learn from the pre-trained model for task-agnostic knowledge, and from the fine-tuned model for task-specific knowledge. Besides, to better retain the performance of the pruned model, the snapshots (i.e., the intermediate models at each pruning iteration) also serve as effective supervision for pruning. Our extensive experiments show that adopting CAP consistently yields significant improvements, especially in extremely high sparsity scenarios. With only 3% model parameters reserved (i.e., 97% sparsity), CAP successfully achieves 99.2% and 96.3% of the original BERT performance in QQP and MNLI tasks. In addition, our probing experiments demonstrate that the model pruned by CAP tends to achieve better generalization ability.

contrastive · Performer · Unsupervised · Controller · Learned ·

April 28, 2020

CURL: Contrastive Unsupervised Representations for Reinforcement Learning

Aravind Srinivas,Michael Laskin,Pieter Abbeel

from arxiv, First two authors contributed equally, website: //mishalaskin.github.io/curl code: //github.com/MishaLaskin/curl

We present CURL: Contrastive Unsupervised Representations for Reinforcement Learning. CURL extracts high-level features from raw pixels using contrastive learning and performs off-policy control on top of the extracted features. CURL outperforms prior pixel-based methods, both model-based and model-free, on complex tasks in the DeepMind Control Suite and Atari Games showing 1.9x and 1.6x performance gains at the 100K environment and interaction steps benchmarks respectively. On the DeepMind Control Suite, CURL is the first image-based algorithm to nearly match the sample-efficiency and performance of methods that use state-based features.
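The contrastive objective that this kind of representation learning relies on can be sketched with a generic InfoNCE-style loss: each query embedding should score highest against its own positive key, with the other keys in the batch acting as negatives. The plain dot-product similarity, the temperature value, the function name, and the toy embeddings are assumptions made for illustration; the abstract does not specify the similarity measure CURL uses.

```python
import math

def info_nce_loss(queries, keys, temperature=0.1):
    """Mean cross-entropy where keys[i] is the positive for queries[i].

    Every other key in the batch serves as a negative. Similarity is a
    plain dot product scaled by a temperature (both assumed here).
    """
    total = 0.0
    for i, q in enumerate(queries):
        logits = [sum(a * b for a, b in zip(q, k)) / temperature for k in keys]
        max_logit = max(logits)  # subtract the max for numerical stability
        log_sum = max_logit + math.log(sum(math.exp(l - max_logit) for l in logits))
        total += log_sum - logits[i]  # negative log-softmax of the positive
    return total / len(queries)

# Hypothetical 2-D embeddings: matched pairs versus swapped pairs.
aligned = [[1.0, 0.0], [0.0, 1.0]]
swapped = [[0.0, 1.0], [1.0, 0.0]]
loss_matched = info_nce_loss(aligned, aligned)
loss_mismatched = info_nce_loss(aligned, swapped)
```

The loss is small when each query lines up with its own key and large when the pairs are swapped, which is the signal an encoder would be trained on before or alongside the control objective.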

XAI · Responsible AI · Taxonomy · AI · MoDELS ·

October 22, 2019

Explainable Artificial Intelligence (XAI): Concepts, Taxonomies, Opportunities and Challenges toward Responsible AI

Alejandro Barredo Arrieta,Natalia Díaz-Rodríguez,Javier Del Ser,Adrien Bennetot,Siham Tabik,Alberto Barbado,Salvador García,Sergio Gil-López,Daniel Molina,Richard Benjamins,Raja Chatila,Francisco Herrera

from arxiv, 67 pages, 13 figures, under review in the Information Fusion journal

In recent years, Artificial Intelligence (AI) has achieved notable momentum that may deliver the best of expectations over many application sectors across the field. For this to occur, the entire community stands in front of the barrier of explainability, an inherent problem of AI techniques brought by sub-symbolism (e.g. ensembles or Deep Neural Networks) that were not present in the last hype of AI. Paradigms underlying this problem fall within the so-called eXplainable AI (XAI) field, which is acknowledged as a crucial feature for the practical deployment of AI models. This overview examines the existing literature in the field of XAI, including a prospect toward what is yet to be reached. We summarize previous efforts to define explainability in Machine Learning, establishing a novel definition that covers prior conceptual propositions with a major focus on the audience for which explainability is sought. We then propose and discuss a taxonomy of recent contributions related to the explainability of different Machine Learning models, including those aimed at Deep Learning methods, for which a second taxonomy is built. This literature analysis serves as the background for a series of challenges faced by XAI, such as the crossroads between data fusion and explainability. Our prospects lead toward the concept of Responsible Artificial Intelligence, namely, a methodology for the large-scale implementation of AI methods in real organizations with fairness, model explainability and accountability at its core. Our ultimate goal is to provide newcomers to XAI with reference material in order to stimulate future research advances, but also to encourage experts and professionals from other disciplines to embrace the benefits of AI in their activity sectors, without any prior bias for its lack of interpretability.