唯美清纯另类亚洲一区二区-日本中文字幕高清专区久久

Tablature notation is widely used in popular music to transcribe and share guitar musical content. As a complement to standard score notation, tablatures transcribe performance gesture information including finger positions and a variety of guitar-specific playing techniques such as slides, hammer-on/pull-off or bends.This paper focuses on bends, which enable to progressively shift the pitch of a note, therefore circumventing physical limitations of the discrete fretted fingerboard. In this paper, we propose a set of 25 high-level features, computed for each note of the tablature, to study how bend occurrences can be predicted from their past and future short-term context. Experiments are performed on a corpus of 932 lead guitar tablatures of popular music and show that a decision tree successfully predicts bend occurrences with an F1 score of 0.71 anda limited amount of false positive predictions, demonstrating promising applications to assist the arrangement of non-guitar music into guitar tablatures.

相關內容

Performer

關注 10

MoDELS · 噪聲 · Automator · Performer · 余弦相似度 ·

2023 年 10 月 12 日

Effective Slogan Generation with Noise Perturbation

Jongeun Kim,MinChung Kim,Taehwan Kim

from arxiv, Accepted in CIKM 2023 short paper //github.com/joannekim0420/SloganGeneration

Slogans play a crucial role in building the brand's identity of the firm. A slogan is expected to reflect firm's vision and brand's value propositions in memorable and likeable ways. Automating the generation of slogans with such characteristics is challenging. Previous studies developted and tested slogan generation with syntactic control and summarization models which are not capable of generating distinctive slogans. We introduce a a novel apporach that leverages pre-trained transformer T5 model with noise perturbation on newly proposed 1:N matching pair dataset. This approach serves as a contributing fator in generting distinctive and coherent slogans. Turthermore, the proposed approach incorporates descriptions about the firm and brand into the generation of slogans. We evaluate generated slogans based on ROUGE1, ROUGEL and Cosine Similarity metrics and also assess them with human subjects in terms of slogan's distinctiveness, coherence, and fluency. The results demonstrate that our approach yields better performance than baseline models and other transformer-based models.

2023 年 10 月 12 日

Conditional Mutual Information for Disentangled Representations in Reinforcement Learning

Mhairi Dunion,Trevor McInroe,Kevin Sebastian Luck,Josiah P. Hanna,Stefano V. Albrecht

from arxiv, Conference on Neural Information Processing Systems (NeurIPS), 2023

Reinforcement Learning (RL) environments can produce training data with spurious correlations between features due to the amount of training data or its limited feature coverage. This can lead to RL agents encoding these misleading correlations in their latent representation, preventing the agent from generalising if the correlation changes within the environment or when deployed in the real world. Disentangled representations can improve robustness, but existing disentanglement techniques that minimise mutual information between features require independent features, thus they cannot disentangle correlated features. We propose an auxiliary task for RL algorithms that learns a disentangled representation of high-dimensional observations with correlated features by minimising the conditional mutual information between features in the representation. We demonstrate experimentally, using continuous control tasks, that our approach improves generalisation under correlation shifts, as well as improving the training performance of RL algorithms in the presence of correlated features.

INTERACT · 情景 · state-of-the-art · 異常點 · 分解的 ·

2023 年 10 月 11 日

Faster Location in Combinatorial Interaction Testing

Ryan E. Dougherty,Dylan N. Green,Grace M. Kim

Factors within a large-scale software system that simultaneously interact and strongly impact the system's response under a configuration are often difficult to identify. Although screening such a system for the existence of such interactions is important, determining their location is more useful for system engineers. Combinatorial interaction testing (CIT) concerns creation of test suites that nonadaptively either detect or locate the desired interactions, each of at most a specified size or show that no such set exists. Under the assumption that there are at most a given number of such interactions causing such a response, locating arrays (LAs) guarantee unique location for every such set of interactions and an algorithm to deal with outliers and nondeterministic behavior from real systems, we additionally require the LAs to have a "separation" between these collections. State-of-the-art approaches generate LAs that can locate at most one interaction of size at most three, due to the massive number of interaction combinations for larger parameters if no constraints are given. This paper presents LocAG, a two-stage algorithm that generates (unconstrained) LAs using a simple, but powerful partitioning strategy of these combinations. In particular, we are able to generate LAs with more factors, with any desired separation, and greater interaction size than existing approaches.

2023 年 10 月 11 日

MuseChat: A Conversational Music Recommendation System for Videos

Zhikang Dong,Bin Chen,Xiulong Liu,Pawel Polak,Peng Zhang

We introduce MuseChat, an innovative dialog-based music recommendation system. This unique platform not only offers interactive user engagement but also suggests music tailored for input videos, so that users can refine and personalize their music selections. In contrast, previous systems predominantly emphasized content compatibility, often overlooking the nuances of users' individual preferences. For example, all the datasets only provide basic music-video pairings or such pairings with textual music descriptions. To address this gap, our research offers three contributions. First, we devise a conversation-synthesis method that simulates a two-turn interaction between a user and a recommendation system, which leverages pre-trained music tags and artist information. In this interaction, users submit a video to the system, which then suggests a suitable music piece with a rationale. Afterwards, users communicate their musical preferences, and the system presents a refined music recommendation with reasoning. Second, we introduce a multi-modal recommendation engine that matches music either by aligning it with visual cues from the video or by harmonizing visual information, feedback from previously recommended music, and the user's textual input. Third, we bridge music representations and textual data with a Large Language Model(Vicuna-7B). This alignment equips MuseChat to deliver music recommendations and their underlying reasoning in a manner resembling human communication. Our evaluations show that MuseChat surpasses existing state-of-the-art models in music retrieval tasks and pioneers the integration of the recommendation process within a natural language framework.

CASE · INTERACT · CASES · SimPLe · TOOLS ·

2023 年 10 月 10 日

Towards Proof Repair in Cubical Agda

Cosmo Viola,Max Fan,Talia Ringer

from arxiv, for associated code, see //github.com/InnovativeInventor/proof-repair-cubical

Recent work introduced an algorithm and tool in Coq to automatically repair broken proofs in response to changes that correspond to type equivalences. We report on case studies for manual proof repair across type equivalences using an adaptation of this algorithm in Cubical Agda. Crucially, these case studies capture proof repair use cases that were challenging to impossible in prior work in Coq due to type theoretic limitations, highlighting three benefits to working in Cubical Agda: (1) quotient types enrich the space of repairs we can express as type equivalences, (2) dependent path equality makes it possible to internally state and prove correctness of repaired proofs relative to the original proofs, and (3) functional extensionality and transport make it simple to move between slow and fast computations after repair. They also highlight two challenges of working in Cubical Agda, namely those introduced by: (1) lack of tools for automation, and (2) proof relevance, especially as it interacts with definitional equality. We detail these benefits and challenges in hopes to set the stage for later work in proof repair bridging the benefits of both languages.

符號函數 · 泛函 · Oracle · 論文 ·

2023 年 10 月 10 日

Double Public Key Signing Function Oracle Attack on EdDSA Software Implementations

Sam Grierson,Konstantinos Chalkias,William J Buchanan,Leandros Maglaras

EdDSA is a standardised elliptic curve digital signature scheme introduced to overcome some of the issues prevalent in the more established ECDSA standard. Due to the EdDSA standard specifying that the EdDSA signature be deterministic, if the signing function were to be used as a public key signing oracle for the attacker, the unforgeability notion of security of the scheme can be broken. This paper describes an attack against some of the most popular EdDSA implementations, which results in an adversary recovering the private key used during signing. With this recovered secret key, an adversary can sign arbitrary messages that would be seen as valid by the EdDSA verification function. A list of libraries with vulnerable APIs at the time of publication is provided. Furthermore, this paper provides two suggestions for securing EdDSA signing APIs against this vulnerability while it additionally discusses failed attempts to solve the issue.

控制器 · MoDELS · 數據集 · INFORMS · HTTPS ·

2023 年 10 月 10 日

Audio Generation with Multiple Conditional Diffusion Model

Zhifang Guo,Jianguo Mao,Rui Tao,Long Yan,Kazushige Ouchi,Hong Liu,Xiangdong Wang

from arxiv, Submitted to AAAI 2024

Text-based audio generation models have limitations as they cannot encompass all the information in audio, leading to restricted controllability when relying solely on text. To address this issue, we propose a novel model that enhances the controllability of existing pre-trained text-to-audio models by incorporating additional conditions including content (timestamp) and style (pitch contour and energy contour) as supplements to the text. This approach achieves fine-grained control over the temporal order, pitch, and energy of generated audio. To preserve the diversity of generation, we employ a trainable control condition encoder that is enhanced by a large language model and a trainable Fusion-Net to encode and fuse the additional conditions while keeping the weights of the pre-trained text-to-audio model frozen. Due to the lack of suitable datasets and evaluation metrics, we consolidate existing datasets into a new dataset comprising the audio and corresponding conditions and use a series of evaluation metrics to evaluate the controllability performance. Experimental results demonstrate that our model successfully achieves fine-grained control to accomplish controllable audio generation. Audio samples and our dataset are publicly available at //conditionaudiogen.github.io/conditionaudiogen/

語言模型化 · 數據選擇 · 簇 · MoDELS · 無監督 ·

2020 年 4 月 5 日

Unsupervised Domain Clusters in Pretrained Language Models

Roee Aharoni,Yoav Goldberg

from arxiv, Accepted as a long paper in ACL 2020

The notion of "in-domain data" in NLP is often over-simplistic and vague, as textual data varies in many nuanced linguistic aspects such as topic, style or level of formality. In addition, domain labels are many times unavailable, making it challenging to build domain-specific systems. We show that massive pre-trained language models implicitly learn sentence representations that cluster by domains without supervision -- suggesting a simple data-driven definition of domains in textual data. We harness this property and propose domain data selection methods based on such models, which require only a small set of in-domain monolingual data. We evaluate our data selection methods for neural machine translation across five diverse domains, where they outperform an established approach as measured by both BLEU and by precision and recall of sentence selection with respect to an oracle.

MoDELS · Performer · ELMo · 詞向量表示 · Processing（編程語言） ·

2020 年 3 月 16 日

A Survey on Contextual Embeddings

Qi Liu,Matt J. Kusner,Phil Blunsom

from arxiv, 13 pages

Contextual embeddings, such as ELMo and BERT, move beyond global word representations like Word2Vec and achieve ground-breaking performance on a wide range of natural language processing tasks. Contextual embeddings assign each word a representation based on its context, thereby capturing uses of words across varied contexts and encoding knowledge that transfers across languages. In this survey, we review existing contextual embedding models, cross-lingual polyglot pre-training, the application of contextual embeddings in downstream tasks, model compression, and model analyses.

Single-Shot · Branch · 目標檢測 · 推斷 · MS ·

2018 年 4 月 8 日

Single-Shot Object Detection with Enriched Semantics

Zhishuai Zhang,Siyuan Qiao,Cihang Xie,Wei Shen,Bo Wang,Alan L. Yuille

We propose a novel single shot object detection network named Detection with Enriched Semantics (DES). Our motivation is to enrich the semantics of object detection features within a typical deep detector, by a semantic segmentation branch and a global activation module. The segmentation branch is supervised by weak segmentation ground-truth, i.e., no extra annotation is required. In conjunction with that, we employ a global activation module which learns relationship between channels and object classes in a self-supervised manner. Comprehensive experimental results on both PASCAL VOC and MS COCO detection datasets demonstrate the effectiveness of the proposed method. In particular, with a VGG16 based DES, we achieve an mAP of 81.7 on VOC2007 test and an mAP of 32.8 on COCO test-dev with an inference speed of 31.5 milliseconds per image on a Titan Xp GPU. With a lower resolution version, we achieve an mAP of 79.7 on VOC2007 with an inference speed of 13.0 milliseconds per image.