99视频在线播放喷射_亚洲成A人片在线观看网站黄_一级黄色视频一区_久久久久久国产免费A片_国产欧美日韩精品一区二区三区蜜桃不卡_中文字幕色欲AV一区二区三区_乱女乱妇熟女熟妇综合网网站

Speech anonymisation prevents misuse of spoken data by removing any personal identifier while preserving at least linguistic content. However, emotion preservation is crucial for natural human-computer interaction. The well-known voice conversion technique StarGANv2-VC achieves anonymisation but fails to preserve emotion. This work presents an any-to-many semi-supervised StarGANv2-VC variant trained on partially emotion-labelled non-parallel data. We propose emotion-aware losses computed on the emotion embeddings and acoustic features correlated to emotion. Additionally, we use an emotion classifier to provide direct emotion supervision. Objective and subjective evaluations show that the proposed approach significantly improves emotion preservation over the vanilla StarGANv2-VC. This considerable improvement is seen over diverse datasets, emotions, target speakers, and inter-group conversions without compromising intelligibility and anonymisation.

相關內容

可辨認的

關注 4

估計/估計量 · 學習器 · Learning · 學習率 · Adam ·

2023 年 10 月 29 日

Prodigy: An Expeditiously Adaptive Parameter-Free Learner

Konstantin Mishchenko,Aaron Defazio

We consider the problem of estimating the learning rate in adaptive methods, such as Adagrad and Adam. We describe two techniques, Prodigy and Resetting, to provably estimate the distance to the solution $D$, which is needed to set the learning rate optimally. Our techniques are modifications of the D-Adaptation method for learning-rate-free learning. Our methods improve upon the convergence rate of D-Adaptation by a factor of $O(\sqrt{\log(D/d_0)})$, where $d_0$ is the initial estimate of $D$. We test our methods on 12 common logistic-regression benchmark datasets, VGG11 and ResNet-50 training on CIFAR10, ViT training on Imagenet, LSTM training on IWSLT14, DLRM training on Criteo dataset, VarNet on Knee MRI dataset, as well as RoBERTa and GPT transformer training on BookWiki. Our experimental results show that our approaches consistently outperform D-Adaptation and reach test accuracy values close to that of hand-tuned Adam.

Attention · INTERACT · Learning · Performer · Integration ·

2023 年 10 月 28 日

DACOOP-A: Decentralized Adaptive Cooperative Pursuit via Attention

Zheng Zhang,Dengyu Zhang,Qingrui Zhang,Wei Pan,Tianjiang Hu

from arxiv, 8 Pages; This manuscript has been accepted by IEEE Robotics and Automation Letters

Integrating rule-based policies into reinforcement learning promises to improve data efficiency and generalization in cooperative pursuit problems. However, most implementations do not properly distinguish the influence of neighboring robots in observation embedding or inter-robot interaction rules, leading to information loss and inefficient cooperation. This paper proposes a cooperative pursuit algorithm named Decentralized Adaptive COOperative Pursuit via Attention (DACOOP-A) by empowering reinforcement learning with artificial potential field and attention mechanisms. An attention-based framework is developed to emphasize important neighbors by concurrently integrating the learned attention scores into observation embedding and inter-robot interaction rules. A KL divergence regularization is introduced to alleviate the resultant learning stability issue. Improvements in data efficiency and generalization are demonstrated through numerical simulations. Extensive quantitative analysis and ablation studies are performed to illustrate the advantages of the proposed modules. Real-world experiments are performed to justify the feasibility of deploying DACOOP-A in physical systems.

潛在 · 優化器 · 可辨認的 · 流形 · MoDELS ·

2023 年 10 月 27 日

Neural Latent Geometry Search: Product Manifold Inference via Gromov-Hausdorff-Informed Bayesian Optimization

Haitz Saez de Ocariz Borde,Alvaro Arroyo,Ismael Morales,Ingmar Posner,Xiaowen Dong

Recent research indicates that the performance of machine learning models can be improved by aligning the geometry of the latent space with the underlying data structure. Rather than relying solely on Euclidean space, researchers have proposed using hyperbolic and spherical spaces with constant curvature, or combinations thereof, to better model the latent space and enhance model performance. However, little attention has been given to the problem of automatically identifying the optimal latent geometry for the downstream task. We mathematically define this novel formulation and coin it as neural latent geometry search (NLGS). More specifically, we introduce an initial attempt to search for a latent geometry composed of a product of constant curvature model spaces with a small number of query evaluations, under some simplifying assumptions. To accomplish this, we propose a novel notion of distance between candidate latent geometries based on the Gromov-Hausdorff distance from metric geometry. In order to compute the Gromov-Hausdorff distance, we introduce a mapping function that enables the comparison of different manifolds by embedding them in a common high-dimensional ambient space. We then design a graph search space based on the notion of smoothness between latent geometries and employ the calculated distances as an additional inductive bias. Finally, we use Bayesian optimization to search for the optimal latent geometry in a query-efficient manner. This is a general method which can be applied to search for the optimal latent geometry for a variety of models and downstream tasks. We perform experiments on synthetic and real-world datasets to identify the optimal latent geometry for multiple machine learning problems.

Analysis · 情感分析 · 數據集 · MoDELS · Prompt ·

2023 年 10 月 27 日

SentMix-3L: A Bangla-English-Hindi Code-Mixed Dataset for Sentiment Analysis

Md Nishat Raihan,Dhiman Goswami,Antara Mahmud,Antonios Anstasopoulos,Marcos Zampieri

Code-mixing is a well-studied linguistic phenomenon when two or more languages are mixed in text or speech. Several datasets have been build with the goal of training computational models for code-mixing. Although it is very common to observe code-mixing with multiple languages, most datasets available contain code-mixed between only two languages. In this paper, we introduce SentMix-3L, a novel dataset for sentiment analysis containing code-mixed data between three languages Bangla, English, and Hindi. We carry out a comprehensive evaluation using SentMix-3L. We show that zero-shot prompting with GPT-3.5 outperforms all transformer-based models on SentMix-3L.

可約的 · INTERACT · 泛函 · 控制器 · Automator ·

2023 年 10 月 26 日

VST-A: A Foundationally Sound Annotation Verifier

Litao Zhou,Jianxing Qin,Qinshi Wang,Andrew W. Appel,Qinxiang Cao

Program verifiers for imperative languages such as C may be annotation-based, in which assertions and invariants are put into source files and then checked, or tactic-based, where proof scripts separate from programs are interactively developed in a proof assistant such as Coq. Annotation verifiers have been more automated and convenient, but some interactive verifiers have richer assertion languages and formal proofs of soundness. We present VST-A, an annotation verifier that uses the rich assertion language of VST, leverages the formal soundness proof of VST, but allows users to describe functional correctness proofs intuitively by inserting assertions. VST-A analyzes control flow graphs, decomposes every C function into control flow paths between assertions, and reduces program verification problems into corresponding straightline Hoare triples. Compared to existing foundational program verification tools like VST and Iris, in VST-A, such decompositions and reductions are allowed to be nonstructural, which makes VST-A more flexible to use. VST-A's decomposition and reduction is defined in Coq, proved sound in Coq, and computed in a call-by-value way in Coq. The soundness proof for reduction is totally logical, independent of the complicated semantic model (and soundness proof) of VST's Hoare triple. Because of the rich assertion language, not all reduced proof goals can be automatically checked, but the system allows users to prove residual proof goals using the full power of the Coq proof assistant.

優化器 · 機器人 · SOTA · 線搜索 · state-of-the-art ·

2023 年 10 月 26 日

CuRobo: Parallelized Collision-Free Minimum-Jerk Robot Motion Generation

Balakumar Sundaralingam,Siva Kumar Sastry Hari,Adam Fishman,Caelan Garrett,Karl Van Wyk,Valts Blukis,Alexander Millane,Helen Oleynikova,Ankur Handa,Fabio Ramos,Nathan Ratliff,Dieter Fox

from arxiv, Technical Report, 61 pages, Website: //curobo.org

This paper explores the problem of collision-free motion generation for manipulators by formulating it as a global motion optimization problem. We develop a parallel optimization technique to solve this problem and demonstrate its effectiveness on massively parallel GPUs. We show that combining simple optimization techniques with many parallel seeds leads to solving difficult motion generation problems within 50ms on average, 60x faster than state-of-the-art (SOTA) trajectory optimization methods. We achieve SOTA performance by combining L-BFGS step direction estimation with a novel parallel noisy line search scheme and a particle-based optimization solver. To further aid trajectory optimization, we develop a parallel geometric planner that plans within 20ms and also introduce a collision-free IK solver that can solve over 7000 queries/s. We package our contributions into a state of the art GPU accelerated motion generation library, CuRobo and release it to enrich the robotics community. Additional details are available at //curobo.org

相似度 · MoDELS · Extensibility · Performer · 語言模型化 ·

2023 年 10 月 26 日

X-SNS: Cross-Lingual Transfer Prediction through Sub-Network Similarity

Taejun Yun,Jinhyeon Kim,Deokyeong Kang,Seong Hoon Lim,Jihoon Kim,Taeuk Kim

from arxiv, Accepted to EMNLP 2023 (Findings)

Cross-lingual transfer (XLT) is an emergent ability of multilingual language models that preserves their performance on a task to a significant extent when evaluated in languages that were not included in the fine-tuning process. While English, due to its widespread usage, is typically regarded as the primary language for model adaption in various tasks, recent studies have revealed that the efficacy of XLT can be amplified by selecting the most appropriate source languages based on specific conditions. In this work, we propose the utilization of sub-network similarity between two languages as a proxy for predicting the compatibility of the languages in the context of XLT. Our approach is model-oriented, better reflecting the inner workings of foundation models. In addition, it requires only a moderate amount of raw text from candidate languages, distinguishing it from the majority of previous methods that rely on external resources. In experiments, we demonstrate that our method is more effective than baselines across diverse tasks. Specifically, it shows proficiency in ranking candidates for zero-shot XLT, achieving an improvement of 4.6% on average in terms of NDCG@3. We also provide extensive analyses that confirm the utility of sub-networks for XLT prediction.

tuning · Continuity · 數據監管 · Better · Learning ·

2023 年 10 月 26 日

Dynosaur: A Dynamic Growth Paradigm for Instruction-Tuning Data Curation

Da Yin,Xiao Liu,Fan Yin,Ming Zhong,Hritik Bansal,Jiawei Han,Kai-Wei Chang

from arxiv, EMNLP 2023. Code and data are available at //github.com/WadeYin9712/Dynosaur

Instruction tuning has emerged to enhance the capabilities of large language models (LLMs) to comprehend instructions and generate appropriate responses. Existing methods either manually annotate or employ LLM (e.g., GPT-series) to generate data for instruction tuning. However, they often overlook associating instructions with existing annotated datasets. In this paper, we propose Dynosaur, a dynamic growth paradigm for the automatic curation of instruction-tuning data. Based on the metadata of existing datasets, we use LLMs to automatically construct instruction-tuning data by identifying relevant data fields and generating appropriate instructions. By leveraging the existing annotated datasets, Dynosaur offers several advantages: 1) it reduces the API cost for generating instructions (e.g., it costs less than $12 USD by calling GPT-3.5-turbo for generating 800K instruction tuning samples; 2) it provides high-quality data for instruction tuning (e.g., it performs better than Alpaca and Flan on Super-NI and Longform with comparable data sizes); and 3) it supports the continuous improvement of models by generating instruction-tuning data when a new annotated dataset becomes available. We further investigate a continual learning scheme for learning with the ever-growing instruction-tuning dataset, and demonstrate that replaying tasks with diverse instruction embeddings not only helps mitigate forgetting issues but generalizes to unseen tasks better. Code and data are available at //github.com/WadeYin9712/Dynosaur.

BART · 圖 · MoDELS · 知識圖譜 · 生成模型 ·

2021 年 1 月 21 日

KG-BART: Knowledge Graph-Augmented BART for Generative Commonsense Reasoning

Ye Liu,Yao Wan,Lifang He,Hao Peng,Philip S. Yu

from arxiv, 10 pages, 7 figures, Appear in AAAI 2021

Generative commonsense reasoning which aims to empower machines to generate sentences with the capacity of reasoning over a set of concepts is a critical bottleneck for text generation. Even the state-of-the-art pre-trained language generation models struggle at this task and often produce implausible and anomalous sentences. One reason is that they rarely consider incorporating the knowledge graph which can provide rich relational information among the commonsense concepts. To promote the ability of commonsense reasoning for text generation, we propose a novel knowledge graph augmented pre-trained language generation model KG-BART, which encompasses the complex relations of concepts through the knowledge graph and produces more logical and natural sentences as output. Moreover, KG-BART can leverage the graph attention to aggregate the rich concept semantics that enhances the model generalization on unseen concept sets. Experiments on benchmark CommonGen dataset verify the effectiveness of our proposed approach by comparing with several strong pre-trained language generation models, particularly KG-BART outperforms BART by 5.80, 4.60, in terms of BLEU-3, 4. Moreover, we also show that the generated context by our model can work as background scenarios to benefit downstream commonsense QA tasks.

有向 · 注意力機制 · 可理解性 · 模型評估 · Networking ·

2017 年 11 月 20 日

DiSAN: Directional Self-Attention Network for RNN/CNN-Free Language Understanding

Tao Shen,Tianyi Zhou,Guodong Long,Jing Jiang,Shirui Pan,Chengqi Zhang

from arxiv, 10 pages, 8 figures; Accepted in AAAI-18

Recurrent neural nets (RNN) and convolutional neural nets (CNN) are widely used on NLP tasks to capture the long-term and local dependencies, respectively. Attention mechanisms have recently attracted enormous interest due to their highly parallelizable computation, significantly less training time, and flexibility in modeling dependencies. We propose a novel attention mechanism in which the attention between elements from input sequence(s) is directional and multi-dimensional (i.e., feature-wise). A light-weight neural net, "Directional Self-Attention Network (DiSAN)", is then proposed to learn sentence embedding, based solely on the proposed attention without any RNN/CNN structure. DiSAN is only composed of a directional self-attention with temporal order encoded, followed by a multi-dimensional attention that compresses the sequence into a vector representation. Despite its simple form, DiSAN outperforms complicated RNN models on both prediction quality and time efficiency. It achieves the best test accuracy among all sentence encoding methods and improves the most recent best result by 1.02% on the Stanford Natural Language Inference (SNLI) dataset, and shows state-of-the-art test accuracy on the Stanford Sentiment Treebank (SST), Multi-Genre natural language inference (MultiNLI), Sentences Involving Compositional Knowledge (SICK), Customer Review, MPQA, TREC question-type classification and Subjectivity (SUBJ) datasets.