91婷婷国产精选国产色_亚洲中文字幕一二区精品_国产美女精品一区二区三区四区_97精品国产高清自在线观看超_牲激情无码一区二区三区四区_欧美日韩国产在线视频你懂的_8050午夜一级毛片久久亚洲欧

This paper describes a data-driven approach to creating real-time neural network models of guitar amplifiers, recreating the amplifiers' sonic response to arbitrary inputs at the full range of controls present on the physical device. While the focus on the paper is on the data collection pipeline, we demonstrate the effectiveness of this conditioned black-box approach by training an LSTM model to the task, and comparing its performance to an offline white-box SPICE circuit simulation. Our listening test results demonstrate that the neural amplifier modeling approach can match the subjective performance of a high-quality SPICE model, all while using an automated, non-intrusive data collection process, and an end-to-end trainable, real-time feasible neural network model.

相關內容

MoDELS

關注 43

ACM/IEEE第23屆模型驅動工程語言和系統國際會議，是模型驅動軟件和系統工程的首要會議系列，由ACM-SIGSOFT和IEEE-TCSE支持組織。自1998年以來，模型涵蓋了建模的各個方面，從語言和方法到工具和應用程序。模特的參加者來自不同的背景，包括研究人員、學者、工程師和工業專業人士。MODELS 2019是一個論壇，參與者可以圍繞建模和模型驅動的軟件和系統交流前沿研究成果和創新實踐經驗。今年的版本將為建模社區提供進一步推進建模基礎的機會，并在網絡物理系統、嵌入式系統、社會技術系統、云計算、大數據、機器學習、安全、開源等新興領域提出建模的創新應用以及可持續性。官網鏈接： · INTERACT · 連結 · 正則的 · 逼真度 ·

2024 年 4 月 25 日

GaussianTalker: Real-Time High-Fidelity Talking Head Synthesis with Audio-Driven 3D Gaussian Splatting

Kyusun Cho,Joungbin Lee,Heeji Yoon,Yeobin Hong,Jaehoon Ko,Sangjun Ahn,Seungryong Kim

from arxiv, Project Page: //ku-cvlab.github.io/GaussianTalker

We propose GaussianTalker, a novel framework for real-time generation of pose-controllable talking heads. It leverages the fast rendering capabilities of 3D Gaussian Splatting (3DGS) while addressing the challenges of directly controlling 3DGS with speech audio. GaussianTalker constructs a canonical 3DGS representation of the head and deforms it in sync with the audio. A key insight is to encode the 3D Gaussian attributes into a shared implicit feature representation, where it is merged with audio features to manipulate each Gaussian attribute. This design exploits the spatial-aware features and enforces interactions between neighboring points. The feature embeddings are then fed to a spatial-audio attention module, which predicts frame-wise offsets for the attributes of each Gaussian. It is more stable than previous concatenation or multiplication approaches for manipulating the numerous Gaussians and their intricate parameters. Experimental results showcase GaussianTalker's superiority in facial fidelity, lip synchronization accuracy, and rendering speed compared to previous methods. Specifically, GaussianTalker achieves a remarkable rendering speed up to 120 FPS, surpassing previous benchmarks. Our code is made available at //github.com/KU-CVLAB/GaussianTalker/ .

2024 年 4 月 24 日

Introducing EEG Analyses to Help Personal Music Preference Prediction

Zhiyu He,Jiayu Li,Weizhi Ma,Min Zhang,Yiqun Liu,Shaoping Ma

from arxiv, Accepted by CHCI 2022

Nowadays, personalized recommender systems play an increasingly important role in music scenarios in our daily life with the preference prediction ability. However, existing methods mainly rely on users' implicit feedback (e.g., click, dwell time) which ignores the detailed user experience. This paper introduces Electroencephalography (EEG) signals to personal music preferences as a basis for the personalized recommender system. To realize collection in daily life, we use a dry-electrodes portable device to collect data. We perform a user study where participants listen to music and record preferences and moods. Meanwhile, EEG signals are collected with a portable device. Analysis of the collected data indicates a significant relationship between music preference, mood, and EEG signals. Furthermore, we conduct experiments to predict personalized music preference with the features of EEG signals. Experiments show significant improvement in rating prediction and preference classification with the help of EEG. Our work demonstrates the possibility of introducing EEG signals in personal music preference with portable devices. Moreover, our approach is not restricted to the music scenario, and the EEG signals as explicit feedback can be used in personalized recommendation tasks.

原點 · 可約的 · 模型評估 · 計算機科學 · 機器人 ·

2024 年 4 月 24 日

Visualizing High-Dimensional Configuration Spaces: A Comprehensive Analytical Approach

Jorge Ocampo Jimenez,Wael Suleiman

from arxiv, 8 pages, 11 figures

The representation of a Configuration Space C plays a vital role in accelerating the finding of a collision-free path for sampling-based motion planners where the majority of computation time is spent in collision checking of states. Traditionally, planners evaluate C's representations through limited evaluations of collision-free paths using the collision checker or by reducing the dimensionality of C for visualization. However, a collision checker may indicate high accuracy even when only a subset of the original C is represented; limiting the motion planner's ability to find paths comparable to those in the original C. Additionally, dealing with high-dimensional Cs is challenging, as qualitative evaluations become increasingly difficult in dimensions higher than three, where reduced-dimensional C evaluation may decrease accuracy in cluttered environments. In this paper, we present a novel approach for visualizing representations of high-dimensional Cs of manipulator robots in a 2D format. We provide a new tool for qualitative evaluation of high-dimensional Cs approximations without reducing the original dimension. This enhances our ability to compare the accuracy and coverage of two different high-dimensional Cs. Leveraging the kinematic chain of manipulator robots and human color perception, we show the efficacy of our method using a 7-degree-of-freedom CS of a manipulator robot. This visualization offers qualitative insights into the joint boundaries of the robot and the coverage of collision state combinations without reducing the dimensionality of the original data. To support our claim, we conduct a numerical evaluation of the proposed visualization.

知識 (knowledge) · 可理解性 · Processing（編程語言） · CASE · 在線 ·

2024 年 4 月 23 日

Exploring Convergence in Relation using Association Rules Mining: A Case Study in Collaborative Knowledge Production

Jiahe Ling,Corey B. Jackson

This study delves into the pivotal role played by non-experts in knowledge production on open collaboration platforms, with a particular focus on the intricate process of tag development that culminates in the proposal of new glitch classes. Leveraging the power of Association Rule Mining (ARM), this research endeavors to unravel the underlying dynamics of collaboration among citizen scientists. By meticulously quantifying tag associations and scrutinizing their temporal dynamics, the study provides a comprehensive and nuanced understanding of how non-experts collaborate to generate valuable scientific insights. Furthermore, this investigation extends its purview to examine the phenomenon of ideological convergence within online citizen science knowledge production. To accomplish this, a novel measurement algorithm, based on the Mann-Kendall Trend Test, is introduced. This innovative approach sheds illuminating light on the dynamics of collaborative knowledge production, revealing both the vast opportunities and daunting challenges inherent in leveraging non-expert contributions for scientific research endeavors. Notably, the study uncovers a robust pattern of convergence in ideology, employing both the newly proposed convergence testing method and the traditional approach based on the stationarity of time series data. This groundbreaking discovery holds significant implications for understanding the dynamics of online citizen science communities and underscores the crucial role played by non-experts in shaping the scientific landscape of the digital age. Ultimately, this study contributes significantly to our understanding of online citizen science communities, highlighting their potential to harness collective intelligence for tackling complex scientific tasks and enriching our comprehension of collaborative knowledge production processes in the digital age.

機器人 · INTERACT · Processing（編程語言） · 掩碼 · INFORMS ·

2024 年 4 月 22 日

Robotic Blended Sonification: Consequential Robot Sound as Creative Material for Human-Robot Interaction

Stine S. Johansen,Yanto Browning,Anthony Brumpton,Jared Donovan,Markus Rittenbruch

from arxiv, Paper accepted at ISEA 24, The 29th International Symposium on Electronic Art, Brisbane, Australia, 21-29 June 2024

Current research in robotic sounds generally focuses on either masking the consequential sound produced by the robot or on sonifying data about the robot to create a synthetic robot sound. We propose to capture, modify, and utilise rather than mask the sounds that robots are already producing. In short, this approach relies on capturing a robot's sounds, processing them according to contextual information (e.g., collaborators' proximity or particular work sequences), and playing back the modified sound. Previous research indicates the usefulness of non-semantic, and even mechanical, sounds as a communication tool for conveying robotic affect and function. Adding to this, this paper presents a novel approach which makes two key contributions: (1) a technique for real-time capture and processing of consequential robot sounds, and (2) an approach to explore these sounds through direct human-robot interaction. Drawing on methodologies from design, human-robot interaction, and creative practice, the resulting 'Robotic Blended Sonification' is a concept which transforms the consequential robot sounds into a creative material that can be explored artistically and within application-based studies.

Learning · 變換 · 表示 · contrastive · 表示學習 ·

2024 年 4 月 19 日

MART: Learning Hierarchical Music Audio Representations with Part-Whole Transformer

Dong Yao,Jieming Zhu,Jiahao Xun,Shengyu Zhang,Zhou Zhao,Liqun Deng,Wenqiao Zhang,Zhenhua Dong,Xin Jiang

from arxiv, Short paper accepted by WWW 2024. This is revised and condensed based on the previous version titled "Music-PAW: Learning Music Representations via Hierarchical Part-whole Interaction and Contrast". For more experimental details and discussions, please refer to the original long paper at arXiv:2312.06197v1

Recent research in self-supervised contrastive learning of music representations has demonstrated remarkable results across diverse downstream tasks. However, a prevailing trend in existing methods involves representing equally-sized music clips in either waveform or spectrogram formats, often overlooking the intrinsic part-whole hierarchies within music. In our quest to comprehend the bottom-up structure of music, we introduce MART, a hierarchical music representation learning approach that facilitates feature interactions among cropped music clips while considering their part-whole hierarchies. Specifically, we propose a hierarchical part-whole transformer to capture the structural relationships between music clips in a part-whole hierarchy. Furthermore, a hierarchical contrastive learning objective is crafted to align part-whole music representations at adjacent levels, progressively establishing a multi-hierarchy representation space. The effectiveness of our music representation learning from part-whole hierarchies has been empirically validated across multiple downstream tasks, including music classification and cover song identification.

tuning · MoDELS · 評論員 · 語言模型化 · 數據集 ·

2023 年 8 月 21 日

Instruction Tuning for Large Language Models: A Survey

Shengyu Zhang,Linfeng Dong,Xiaoya Li,Sen Zhang,Xiaofei Sun,Shuhe Wang,Jiwei Li,Runyi Hu,Tianwei Zhang,Fei Wu,Guoyin Wang

from arxiv, A Survey paper, Pre-print

This paper surveys research works in the quickly advancing field of instruction tuning (IT), a crucial technique to enhance the capabilities and controllability of large language models (LLMs). Instruction tuning refers to the process of further training LLMs on a dataset consisting of \textsc{(instruction, output)} pairs in a supervised fashion, which bridges the gap between the next-word prediction objective of LLMs and the users' objective of having LLMs adhere to human instructions. In this work, we make a systematic review of the literature, including the general methodology of IT, the construction of IT datasets, the training of IT models, and applications to different modalities, domains and applications, along with an analysis on aspects that influence the outcome of IT (e.g., generation of instruction outputs, size of the instruction dataset, etc). We also review the potential pitfalls of IT along with criticism against it, along with efforts pointing out current deficiencies of existing strategies and suggest some avenues for fruitful research.

Networking · CNN · MoDELS · Performer · 數學 ·

2023 年 3 月 5 日

DeepMAD: Mathematical Architecture Design for Deep Convolutional Neural Network

Xuan Shen,Yaohua Wang,Ming Lin,Yilun Huang,Hao Tang,Xiuyu Sun,Yanzhi Wang

from arxiv, Accepted by CVPR 2023

The rapid advances in Vision Transformer (ViT) refresh the state-of-the-art performances in various vision tasks, overshadowing the conventional CNN-based models. This ignites a few recent striking-back research in the CNN world showing that pure CNN models can achieve as good performance as ViT models when carefully tuned. While encouraging, designing such high-performance CNN models is challenging, requiring non-trivial prior knowledge of network design. To this end, a novel framework termed Mathematical Architecture Design for Deep CNN (DeepMAD) is proposed to design high-performance CNN models in a principled way. In DeepMAD, a CNN network is modeled as an information processing system whose expressiveness and effectiveness can be analytically formulated by their structural parameters. Then a constrained mathematical programming (MP) problem is proposed to optimize these structural parameters. The MP problem can be easily solved by off-the-shelf MP solvers on CPUs with a small memory footprint. In addition, DeepMAD is a pure mathematical framework: no GPU or training data is required during network design. The superiority of DeepMAD is validated on multiple large-scale computer vision benchmark datasets. Notably on ImageNet-1k, only using conventional convolutional layers, DeepMAD achieves 0.7% and 1.5% higher top-1 accuracy than ConvNeXt and Swin on Tiny level, and 0.8% and 0.9% higher on Small level.

度量學習 · 縮放 · CSL · 值域 · 學成 ·

2021 年 3 月 22 日

Dynamic Metric Learning: Towards a Scalable Metric Space to Accommodate Multiple Semantic Scales

Yifan Sun,Yuke Zhu,Yuhan Zhang,Pengkun Zheng,Xi Qiu,Chi Zhang,Yichen Wei

from arxiv, 8pages, accepted by CVPR 2021

This paper introduces a new fundamental characteristic, \ie, the dynamic range, from real-world metric tools to deep visual recognition. In metrology, the dynamic range is a basic quality of a metric tool, indicating its flexibility to accommodate various scales. Larger dynamic range offers higher flexibility. In visual recognition, the multiple scale problem also exist. Different visual concepts may have different semantic scales. For example, ``Animal'' and ``Plants'' have a large semantic scale while ``Elk'' has a much smaller one. Under a small semantic scale, two different elks may look quite \emph{different} to each other . However, under a large semantic scale (\eg, animals and plants), these two elks should be measured as being \emph{similar}. %We argue that such flexibility is also important for deep metric learning, because different visual concepts indeed correspond to different semantic scales. Introducing the dynamic range to deep metric learning, we get a novel computer vision task, \ie, the Dynamic Metric Learning. It aims to learn a scalable metric space to accommodate visual concepts across multiple semantic scales. Based on three types of images, \emph{i.e.}, vehicle, animal and online products, we construct three datasets for Dynamic Metric Learning. We benchmark these datasets with popular deep metric learning methods and find Dynamic Metric Learning to be very challenging. The major difficulty lies in a conflict between different scales: the discriminative ability under a small scale usually compromises the discriminative ability under a large one, and vice versa. As a minor contribution, we propose Cross-Scale Learning (CSL) to alleviate such conflict. We show that CSL consistently improves the baseline on all the three datasets. The datasets and the code will be publicly available at //github.com/SupetZYK/DynamicMetricLearning.

entity · 鏈路預測 · Extensibility · 圖 · 知識圖譜 ·

2019 年 3 月 13 日

MMKG: Multi-Modal Knowledge Graphs

Ye Liu,Hui Li,Alberto Garcia-Duran,Mathias Niepert,Daniel Onoro-Rubio,David S. Rosenblum

from arxiv, ESWC 2019

We present MMKG, a collection of three knowledge graphs that contain both numerical features and (links to) images for all entities as well as entity alignments between pairs of KGs. Therefore, multi-relational link prediction and entity matching communities can benefit from this resource. We believe this data set has the potential to facilitate the development of novel multi-modal learning approaches for knowledge graphs.We validate the utility ofMMKG in the sameAs link prediction task with an extensive set of experiments. These experiments show that the task at hand benefits from learning of multiple feature types.