嘘嘘中国免费观看网站,美女午夜一区视频在线播放,精品欧洲视频人人视频网站,成人A视频片在线观看免费,综合国产一区二区三区欧美

Speech technologies rely on capturing a speaker's voice variability while obtaining comprehensive language information. Textual prompts and sentence selection methods have been proposed in the literature to comprise such adequate phonetic data, referred to as a phonetically rich \textit{corpus}. However, they are still insufficient for acoustic modeling, especially critical for languages with limited resources. Hence, this paper proposes a novel approach and outlines the methodological aspects required to create a \textit{corpus} with broad phonetic coverage for a low-resourced language, Brazilian Portuguese. Our methodology includes text dataset collection up to a sentence selection algorithm based on triphone distribution. Furthermore, we propose a new phonemic classification according to acoustic-articulatory speech features since the absolute number of distinct triphones, or low-probability triphones, does not guarantee an adequate representation of every possible combination. Using our algorithm, we achieve a 55.8\% higher percentage of distinct triphones -- for samples of similar size -- while the currently available phonetic-rich corpus, CETUC and TTS-Portuguese, 12.6\% and 12.3\% in comparison to a non-phonetically rich dataset.

相關內容

評論(lun)員

關注 1

自動問答 · 視覺問答 · Vision · 基準 · MoDELS ·

2024 年 3 月 20 日

Improved Baselines for Data-efficient Perceptual Augmentation of LLMs

Théophane Vallaeys,Mustafa Shukor,Matthieu Cord,Jakob Verbeek

The abilities of large language models (LLMs) have recently progressed to unprecedented levels, paving the way to novel applications in a wide variety of areas. In computer vision, LLMs can be used to prime vision-language tasks such image captioning and visual question answering when coupled with pre-trained vision backbones. While different approaches have been explored to interface LLMs with ``perceptual backbones'' that process, e.g., visual or audio data, they are often explored for different tasks, different datasets, and using different perceptual backbones and language models, hindering direct comparison of the interfacing mechanisms. To remedy this lack of comparability between methods, we present an extensive experimental evaluation of different interfacing mechanisms, across multiple tasks (including image, video, and audio captioning as well as visual question answering), datasets and backbones, paying special attention to low-data settings. We find improved performance using existing mechanisms over state-of-the-art results, and identify a new interfacing mechanism that yields (near) optimal results across different tasks, while obtaining a 4x reduction in training time.

Metamaterial · MoDELS · 等變 · GNN · 數據集 ·

2024 年 3 月 20 日

Energy-conserving equivariant GNN for elasticity of lattice architected metamaterials

Ivan Grega,Ilyes Batatia,Gábor Csányi,Sri Karlapati,Vikram S. Deshpande

from arxiv, International Conference on Learning Representations 2024

Lattices are architected metamaterials whose properties strongly depend on their geometrical design. The analogy between lattices and graphs enables the use of graph neural networks (GNNs) as a faster surrogate model compared to traditional methods such as finite element modelling. In this work, we generate a big dataset of structure-property relationships for strut-based lattices. The dataset is made available to the community which can fuel the development of methods anchored in physical principles for the fitting of fourth-order tensors. In addition, we present a higher-order GNN model trained on this dataset. The key features of the model are (i) SE(3) equivariance, and (ii) consistency with the thermodynamic law of conservation of energy. We compare the model to non-equivariant models based on a number of error metrics and demonstrate its benefits in terms of predictive performance and reduced training requirements. Finally, we demonstrate an example application of the model to an architected material design task. The methods which we developed are applicable to fourth-order tensors beyond elasticity such as piezo-optical tensor etc.

吸引點 · 無限 · 分解的 · 周期的 · 情景 ·

2024 年 3 月 20 日

String attractors and bi-infinite words

Pierre Béaur,France Gheeraert,Benjamin Hellouin de Menibus

from arxiv, 19 pages

String attractors are a combinatorial tool coming from the field of data compression. It is a set of positions within a word which intersects an occurrence of every factor. While one-sided infinite words admitting a finite string attractor are eventually periodic, the situation is different for two-sided infinite words. In this paper, we characterise the bi-infinite words admitting a finite string attractor as the characteristic Sturmian words and their morphic images. For words that do not admit finite string attractors, we study the structure and properties of their infinite string attractors.

多樣性 · MoDELS · contrastive · Learning · 語音合成 ·

2024 年 3 月 20 日

Building speech corpus with diverse voice characteristics for its prompt-based representation

Aya Watanabe,Shinnosuke Takamichi,Yuki Saito,Wataru Nakata,Detai Xin,Hiroshi Saruwatari

from arxiv, Submitted to IEEE/ACM Transactions on Audio, Speech, and Language Processing. arXiv admin note: text overlap with arXiv:2309.13509

In text-to-speech synthesis, the ability to control voice characteristics is vital for various applications. By leveraging thriving text prompt-based generation techniques, it should be possible to enhance the nuanced control of voice characteristics. While previous research has explored the prompt-based manipulation of voice characteristics, most studies have used pre-recorded speech, which limits the diversity of voice characteristics available. Thus, we aim to address this gap by creating a novel corpus and developing a model for prompt-based manipulation of voice characteristics in text-to-speech synthesis, facilitating a broader range of voice characteristics. Specifically, we propose a method to build a sizable corpus pairing voice characteristics descriptions with corresponding speech samples. This involves automatically gathering voice-related speech data from the Internet, ensuring its quality, and manually annotating it using crowdsourcing. We implement this method with Japanese language data and analyze the results to validate its effectiveness. Subsequently, we propose a construction method of the model to retrieve speech from voice characteristics descriptions based on a contrastive learning method. We train the model using not only conservative contrastive learning but also feature prediction learning to predict quantitative speech features corresponding to voice characteristics. We evaluate the model performance via experiments with the corpus we constructed above.

周期的 · INFORMS · TOOLS · Integration · 論文 ·

2024 年 3 月 20 日

(Non-)retracted academic papers in OpenAlex

Christian Hauschke,Serhii Nazarovets

The proliferation of scholarly publications underscores the necessity for reliable tools to navigate scientific literature. OpenAlex, an emerging platform amalgamating data from diverse academic sources, holds promise in meeting these evolving demands. Nonetheless, our investigation uncovered a flaw in OpenAlex's portrayal of publication status, particularly concerning retractions. Despite accurate metadata sourced from Crossref database, OpenAlex consolidated this information into a single boolean field, "is_retracted," leading to misclassifications of papers. This challenge not only impacts OpenAlex users but also extends to users of other academic resources integrating the OpenAlex API. The issue affects data provided by OpenAlex in the period between 22 Dec 2023 and 19 Mar 2024. Anyone using data from this period should urgently check it and replace it if necessary.

Networking · Performer · 演化計算 · 結點 · 泛函 ·

2024 年 3 月 19 日

Using evolutionary computation to optimize task performance of unclocked, recurrent Boolean circuits in FPGAs

Raphael Norman-Tenazas,David Kleinberg,Erik C. Johnson,Daniel P. Lathrop,Matthew J. Roos

It has been shown that unclocked, recurrent networks of Boolean gates in FPGAs can be used for low-SWaP reservoir computing. In such systems, topology and node functionality of the network are randomly initialized. To create a network that solves a task, weights are applied to output nodes and learning is achieved by adjusting those weights with conventional machine learning methods. However, performance is often limited compared to networks where all parameters are learned. Herein, we explore an alternative learning approach for unclocked, recurrent networks in FPGAs. We use evolutionary computation to evolve the Boolean functions of network nodes. In one type of implementation the output nodes are used directly to perform a task and all learning is via evolution of the network's node functions. In a second type of implementation a back-end classifier is used as in traditional reservoir computing. In that case, both evolution of node functions and adjustment of output node weights contribute to learning. We demonstrate the practicality of node function evolution, obtaining an accuracy improvement of ~30% on an image classification task while processing at a rate of over three million samples per second. We additionally demonstrate evolvability of network memory and dynamic output signals.

對數幾率回歸 · 稀疏 · 預測器/決策函數 · 頻率主義學派 · 聯合分布 ·

2024 年 3 月 19 日

On high-dimensional classification by sparse generalized Bayesian logistic regression

The Tien Mai

This work addresses the problem of high-dimensional classification by exploring the generalized Bayesian logistic regression method under a sparsity-inducing prior distribution. The method involves utilizing a fractional power of the likelihood resulting the fractional posterior. Our study yields concentration results for the fractional posterior, not only on the joint distribution of the predictor and response variable but also for the regression coefficients. Significantly, we derive novel findings concerning misclassification excess risk bounds using sparse generalized Bayesian logistic regression. These results parallel recent findings for penalized methods in the frequentist literature. Furthermore, we extend our results to the scenario of model misspecification, which is of critical importance.

Processing（編程語言） · 流形 · 泛函 · 核化 · 易處理的 ·

2024 年 3 月 19 日

Simulating conditioned diffusions on manifolds

Marc Corstanje,Frank van der Meulen,Moritz Schauer,Stefan Sommer

To date, most methods for simulating conditioned diffusions are limited to the Euclidean setting. The conditioned process can be constructed using a change of measure known as Doob's $h$-transform. The specific type of conditioning depends on a function $h$ which is typically unknown in closed form. To resolve this, we extend the notion of guided processes to a manifold $M$, where one replaces $h$ by a function based on the heat kernel on $M$. We consider the case of a Brownian motion with drift, constructed using the frame bundle of $M$, conditioned to hit a point $x_T$ at time $T$. We prove equivalence of the laws of the conditioned process and the guided process with a tractable Radon-Nikodym derivative. Subsequently, we show how one can obtain guided processes on any manifold $N$ that is diffeomorphic to $M$ without assuming knowledge of the heat kernel on $N$. We illustrate our results with numerical simulations and an example of parameter estimation where a diffusion process on the torus is observed discretely in time.

異常點 · 決策樹 · Performer · 估計/估計量 · 得分 ·

2024 年 3 月 19 日

DTOR: Decision Tree Outlier Regressor to explain anomalies

Riccardo Crupi,Daniele Regoli,Alessandro Damiano Sabatino,Immacolata Marano,Massimiliano Brinis,Luca Albertazzi,Andrea Cirillo,Andrea Claudio Cosentini

Explaining outliers occurrence and mechanism of their occurrence can be extremely important in a variety of domains. Malfunctions, frauds, threats, in addition to being correctly identified, oftentimes need a valid explanation in order to effectively perform actionable counteracts. The ever more widespread use of sophisticated Machine Learning approach to identify anomalies make such explanations more challenging. We present the Decision Tree Outlier Regressor (DTOR), a technique for producing rule-based explanations for individual data points by estimating anomaly scores generated by an anomaly detection model. This is accomplished by first applying a Decision Tree Regressor, which computes the estimation score, and then extracting the relative path associated with the data point score. Our results demonstrate the robustness of DTOR even in datasets with a large number of features. Additionally, in contrast to other rule-based approaches, the generated rules are consistently satisfied by the points to be explained. Furthermore, our evaluation metrics indicate comparable performance to Anchors in outlier explanation tasks, with reduced execution time.

Tensor · Better · MoDELS · 估計/估計量 · 統計量 ·

2024 年 3 月 19 日

An efficient tensor regression for high-dimensional data

Yuefeng Si,Yingying Zhang,Yuxi Cai,Chunling Liu,Guodong Li

Most currently used tensor regression models for high-dimensional data are based on Tucker decomposition, which has good properties but loses its efficiency in compressing tensors very quickly as the order of tensors increases, say greater than four or five. However, for the simplest tensor autoregression in handling time series data, its coefficient tensor already has the order of six. This paper revises a newly proposed tensor train (TT) decomposition and then applies it to tensor regression such that a nice statistical interpretation can be obtained. The new tensor regression can well match the data with hierarchical structures, and it even can lead to a better interpretation for the data with factorial structures, which are supposed to be better fitted by models with Tucker decomposition. More importantly, the new tensor regression can be easily applied to the case with higher order tensors since TT decomposition can compress the coefficient tensors much more efficiently. The methodology is also extended to tensor autoregression for time series data, and nonasymptotic properties are derived for the ordinary least squares estimations of both tensor regression and autoregression. A new algorithm is introduced to search for estimators, and its theoretical justification is also discussed. Theoretical and computational properties of the proposed methodology are verified by simulation studies, and the advantages over existing methods are illustrated by two real examples.