
Biased · Specialization · Decomposed · Weight · Second derivative

November 1, 2023

Penalising the biases in norm regularisation enforces sparsity

Etienne Boursier,Nicolas Flammarion

Controlling the parameters' norm often yields good generalisation when training neural networks. Beyond simple intuitions, the relation between regularising the parameters' norm and the estimators obtained remains poorly understood theoretically. For one-hidden-layer ReLU networks with unidimensional data, this work shows that the parameters' norm required to represent a function is given by the total variation of its second derivative, weighted by a $\sqrt{1+x^2}$ factor. Notably, this weighting factor disappears when the norm of the bias terms is not regularised. The presence of this additional weighting factor is of utmost significance, as it is shown to enforce the uniqueness and sparsity (in the number of kinks) of the minimal norm interpolator. Conversely, omitting the bias' norm allows for non-sparse solutions. Penalising the bias terms in the regularisation, either explicitly or implicitly, thus leads to sparse estimators.
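As a rough illustration of the weighting described in this abstract, the sketch below assumes the target function is piecewise linear, so that its second derivative is a sum of point masses at the kinks and the weighted total variation reduces to a finite sum. The function name `representation_cost` and its arguments are hypothetical and do not come from the paper, and any unpenalised affine part of the function is ignored.

```python
import math


def representation_cost(kinks, slope_changes, penalise_bias=True):
    """Weighted total variation of f'' for a piecewise linear function.

    kinks:          locations x_j where the slope of f changes
    slope_changes:  jump of f' at each kink (mass of f'' at x_j)
    penalise_bias:  if True, weight each kink by sqrt(1 + x_j^2);
                    if False, use the unweighted total variation
    """
    total = 0.0
    for x, delta in zip(kinks, slope_changes):
        weight = math.sqrt(1.0 + x * x) if penalise_bias else 1.0
        total += weight * abs(delta)
    return total


# Toy function with two kinks: slope jumps by +2 at x = 0 and by -1 at x = 0.75.
kinks = [0.0, 0.75]
slope_changes = [2.0, -1.0]

with_bias = representation_cost(kinks, slope_changes, penalise_bias=True)
without_bias = representation_cost(kinks, slope_changes, penalise_bias=False)
print(with_bias, without_bias)
```

In this toy case the kink away from the origin costs more once the bias is penalised, which is the mechanism the abstract credits with favouring sparse interpolators.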

Related content

Robustness · Correlation coefficient · Noise · Distillation · MoDELS

December 19, 2023

Noise robust distillation of self-supervised speech models via correlation metrics

Fabian Ritter-Gutierrez,Kuan-Po Huang,Dianwen Ng,Jeremy H. M. Wong,Hung-yi Lee,Eng Siong Chng,Nancy F. Chen

from arxiv, 6 pages

Compared to large speech foundation models, small distilled models exhibit degraded noise robustness. The student's robustness can be improved by introducing noise at the inputs during pre-training. Despite this, using the standard distillation loss still yields a student with degraded performance. Thus, this paper proposes improving student robustness via distillation with correlation metrics. Teacher behavior is learned by driving the cross-correlation matrix between the teacher and student representations towards the identity. Noise robustness is encouraged by minimizing the student's self-correlation. The proposed method is agnostic of the teacher model and consistently outperforms the previous approach. This work also proposes a heuristic to weigh the importance of the two correlation terms automatically. Experiments show consistently better clean and noise generalization on Intent Classification, Keyword Spotting, and Automatic Speech Recognition tasks on the SUPERB Challenge.
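The two correlation terms mentioned in this abstract can be sketched as follows. This is only a minimal illustration under assumptions: the exact loss in the paper is not given here, so the code uses a generic squared distance between the cross-correlation matrix and the identity, plus the squared off-diagonal entries of the student's self-correlation matrix. All function names and the fixed `weight` parameter are hypothetical, and the paper's automatic weighting heuristic is not reproduced.

```python
import math


def standardise(rows):
    """Give each feature zero mean and unit variance across the batch."""
    n, dim = len(rows), len(rows[0])
    out = [[0.0] * dim for _ in range(n)]
    for j in range(dim):
        column = [row[j] for row in rows]
        mean = sum(column) / n
        std = math.sqrt(sum((v - mean) ** 2 for v in column) / n) or 1.0
        for i in range(n):
            out[i][j] = (rows[i][j] - mean) / std
    return out


def correlation_matrix(a, b):
    """Feature-by-feature correlation between two batches of representations."""
    n = len(a)
    za, zb = standardise(a), standardise(b)
    return [
        [sum(za[k][i] * zb[k][j] for k in range(n)) / n for j in range(len(b[0]))]
        for i in range(len(a[0]))
    ]


def distance_to_identity(c):
    """Squared distance between a square matrix and the identity."""
    return sum(
        (c[i][j] - (1.0 if i == j else 0.0)) ** 2
        for i in range(len(c))
        for j in range(len(c[0]))
    )


def off_diagonal_energy(c):
    """Sum of squared off-diagonal entries (redundancy between features)."""
    return sum(
        c[i][j] ** 2 for i in range(len(c)) for j in range(len(c[0])) if i != j
    )


def correlation_distillation_loss(teacher, student, weight=1.0):
    """Assumed combination: cross-correlation term plus weighted self-correlation term."""
    cross = correlation_matrix(teacher, student)
    self_corr = correlation_matrix(student, student)
    return distance_to_identity(cross) + weight * off_diagonal_energy(self_corr)


teacher = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0], [0.0, -1.0]]
matching_student = [row[:] for row in teacher]
redundant_student = [[1.0, 1.0], [0.0, 0.0], [-1.0, -1.0], [0.0, 0.0]]

print(correlation_distillation_loss(teacher, matching_student))
print(correlation_distillation_loss(teacher, redundant_student))
```

A student that copies the teacher's decorrelated features incurs almost no loss in this toy setting, while a student whose two features are identical is penalised by both terms.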

Predictor/decision function · Model selection · MoDELS · Cost · Path

December 18, 2023

Flexible cost-penalized Bayesian model selection: developing inclusion paths with an application to diagnosis of heart disease

Erica M. Porter,Christopher T. Franck,Stephen Adams

We propose a Bayesian model selection approach that allows medical practitioners to select among predictor variables while taking their respective costs into account. Medical procedures almost always incur costs in time and/or money. These costs might exceed their usefulness for modeling the outcome of interest. We develop Bayesian model selection that uses flexible model priors to penalize costly predictors a priori and select a subset of predictors useful relative to their costs. Our approach (i) gives the practitioner control over the magnitude of cost penalization, (ii) enables the prior to scale well with sample size, and (iii) enables the creation of our proposed inclusion path visualization, which can be used to make decisions about individual candidate predictors using both probabilistic and visual tools. We demonstrate the effectiveness of our inclusion path approach and the importance of being able to adjust the magnitude of the prior's cost penalization through a dataset pertaining to heart disease diagnosis in patients at the Cleveland Clinic Foundation, where several candidate predictors with various costs were recorded for patients, and through simulated data.

Networking · Estimation/estimator · Loss · Mean · Mean squared error

December 18, 2023

Estimation of individual causal effects in network setup for multiple treatments

Abhinav Thorat,Ravi Kolla,Niranjan Pedanekar,Naoyuki Onoe

from arxiv, 7 pages, accepted at AAAI-GCLR 2024

We study the problem of estimation of Individual Treatment Effects (ITE) in the context of multiple treatments and networked observational data. Leveraging the network information, we aim to utilize hidden confounders that may not be directly accessible in the observed data, thereby enhancing the practical applicability of the strong ignorability assumption. To achieve this, we first employ Graph Convolutional Networks (GCN) to learn a shared representation of the confounders. Then, our approach utilizes separate neural networks to infer potential outcomes for each treatment. We design a loss function as a weighted combination of two components: representation loss and Mean Squared Error (MSE) loss on the factual outcomes. To measure the representation loss, we extend existing metrics such as Wasserstein and Maximum Mean Discrepancy (MMD) from the binary treatment setting to the multiple treatments scenario. To validate the effectiveness of our proposed methodology, we conduct a series of experiments on the benchmark datasets such as BlogCatalog and Flickr. The experimental results consistently demonstrate the superior performance of our models when compared to baseline methods.
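The weighted loss described in this abstract can be sketched in a few lines. The code below is a minimal illustration under stated assumptions, not the authors' formulation: it uses a linear-kernel MMD (the squared distance between group means) as the discrepancy and extends it to several treatments by averaging over all pairs of treatment groups. That pairwise averaging, the parameter `alpha`, and all function names are hypothetical choices made for illustration.

```python
from itertools import combinations


def mean_vector(rows):
    """Component-wise mean of a list of equal-length vectors."""
    n = len(rows)
    return [sum(row[j] for row in rows) / n for j in range(len(rows[0]))]


def linear_mmd_sq(group_a, group_b):
    """Squared linear-kernel MMD: squared distance between the group means."""
    mean_a, mean_b = mean_vector(group_a), mean_vector(group_b)
    return sum((a - b) ** 2 for a, b in zip(mean_a, mean_b))


def multi_treatment_discrepancy(representations, treatments):
    """Average pairwise discrepancy between treatment groups (assumed extension)."""
    groups = {}
    for rep, treatment in zip(representations, treatments):
        groups.setdefault(treatment, []).append(rep)
    pairs = list(combinations(sorted(groups), 2))
    if not pairs:
        return 0.0
    return sum(linear_mmd_sq(groups[a], groups[b]) for a, b in pairs) / len(pairs)


def factual_mse(predicted, observed):
    """Mean squared error on the observed (factual) outcomes."""
    return sum((p - y) ** 2 for p, y in zip(predicted, observed)) / len(observed)


def combined_loss(predicted, observed, representations, treatments, alpha=1.0):
    """Factual MSE plus alpha times the representation discrepancy."""
    return factual_mse(predicted, observed) + alpha * multi_treatment_discrepancy(
        representations, treatments
    )


# Toy data: three units, each receiving a different treatment.
representations = [[0.0, 0.0], [2.0, 0.0], [0.0, 2.0]]
treatments = [0, 1, 2]
predicted = [1.0, 2.0, 3.0]
observed = [1.0, 2.0, 5.0]

print(combined_loss(predicted, observed, representations, treatments, alpha=0.5))
```

In a real model the representations would come from the shared GCN encoder and the predictions from the per-treatment outcome networks; here they are fixed toy values.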

Speech synthesis · Extensibility · MoDELS · state-of-the-art · Transcript

December 17, 2023

ParrotTTS: Text-to-Speech synthesis by exploiting self-supervised representations

Neil Shah,Saiteja Kosgi,Vishal Tambrahalli,Neha Sahipjohn,Niranjan Pedanekar,Vineet Gandhi

We present ParrotTTS, a modularized text-to-speech synthesis model leveraging disentangled self-supervised speech representations. It can train a multi-speaker variant effectively using transcripts from a single speaker. ParrotTTS adapts to a new language in a low-resource setup and generalizes to languages not seen while training the self-supervised backbone. Moreover, without training on bilingual or parallel examples, ParrotTTS can transfer voices across languages while preserving speaker-specific characteristics, e.g., synthesizing fluent Hindi speech using a French speaker's voice and accent. We present extensive results in monolingual and multi-lingual scenarios. ParrotTTS outperforms state-of-the-art multi-lingual TTS models while using only a fraction of the paired data that the latter require.

Estimation/estimator · Unbiased · Overestimation · HTTPS · TOOLS

December 16, 2023

One step closer to unbiased aleatoric uncertainty estimation

Wang Zhang,Ziwen Ma,Subhro Das,Tsui-Wei Weng,Alexandre Megretski,Luca Daniel,Lam M. Nguyen

Neural networks are powerful tools in various applications, and quantifying their uncertainty is crucial for reliable decision-making. In the deep learning field, the uncertainties are usually categorized into aleatoric (data) and epistemic (model) uncertainty. In this paper, we point out that the existing popular variance attenuation method highly overestimates aleatoric uncertainty. To address this issue, we propose a new estimation method by actively de-noising the observed data (source code available at //github.com/wz16/DVA). By conducting a broad range of experiments, we demonstrate that our proposed approach provides a much closer approximation to the actual data uncertainty than the standard method.

Performer · Functional · MoDELS · INFORMS · Large language model

December 15, 2023

Low-resource classification of mobility functioning information in clinical sentences using large language models

Tuan Dung Le,Thanh Duong,Thanh Thieu

Objective: Function is increasingly recognized as an important indicator of whole-person health. This study evaluates the ability of publicly available large language models (LLMs) to accurately identify the presence of functioning information in clinical notes. We explore various strategies to improve performance on this task. Materials and Methods: We collect a balanced binary classification dataset of 1000 sentences from the Mobility NER dataset, which was curated from n2c2 clinical notes. For evaluation, we construct zero-shot and few-shot prompts to query the LLMs on whether a given sentence contains mobility functioning information. Two sampling techniques, random sampling and k-nearest neighbor (kNN)-based sampling, are used to select the few-shot examples. Furthermore, we apply a parameter-efficient prompt-based fine-tuning method to the LLMs and evaluate their performance under various training settings. Results: Flan-T5-xxl outperforms all other models in both zero-shot and few-shot settings, achieving an F1 score of 0.865 with a single demonstrative example selected by kNN sampling. In prompt-based fine-tuning experiments, this foundation model also demonstrates superior performance across all low-resource settings, achieving an F1 score of 0.922 using the full training dataset. The smaller model, Flan-T5-xl, requires fine-tuning with only 2.3M additional parameters to achieve performance comparable to the fully fine-tuned Gatortron-base model, with both surpassing an F1 score of 0.9. Conclusion: Open-source instruction-tuned LLMs demonstrate impressive in-context learning capability in the mobility functioning classification task. The performance of these models can be further improved by continued fine-tuning on a task-specific dataset.
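The kNN-based selection of few-shot examples mentioned in this abstract can be sketched as below. This is a minimal illustration, assuming that each sentence already has an embedding vector from some encoder and that cosine similarity is the distance measure. The prompt wording, the toy sentences, the two-dimensional vectors, and all function names are hypothetical and are not taken from the study.

```python
import math


def cosine_similarity(u, v):
    """Cosine similarity between two vectors (0.0 if either has zero norm)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def select_knn_examples(query_vector, pool, k=1):
    """Return the k labelled examples whose embeddings are closest to the query.

    pool is a list of (sentence, label, vector) tuples.
    """
    ranked = sorted(
        pool,
        key=lambda item: cosine_similarity(query_vector, item[2]),
        reverse=True,
    )
    return ranked[:k]


def build_prompt(query_sentence, examples):
    """Assemble a few-shot prompt from the selected demonstrations."""
    lines = [
        "Does the sentence contain mobility functioning information? Answer yes or no.",
        "",
    ]
    for sentence, label, _ in examples:
        lines += [f"Sentence: {sentence}", f"Answer: {label}", ""]
    lines += [f"Sentence: {query_sentence}", "Answer:"]
    return "\n".join(lines)


# Toy pool with made-up embeddings; a real system would use an encoder model.
pool = [
    ("He walks with a cane.", "yes", [1.0, 0.0]),
    ("Blood pressure is stable.", "no", [0.0, 1.0]),
]
chosen = select_knn_examples([0.9, 0.1], pool, k=1)
print(build_prompt("She uses a walker at home.", chosen))
```

Random sampling, the other strategy named in the abstract, would replace `select_knn_examples` with a uniform draw from the pool.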

Taxonomy · Synonymy · Extensibility · Annotation · AIM

December 15, 2023

Evaluation of semantic relations impact in query expansion-based retrieval systems

With the increasing demand for intelligent systems capable of operating in different contexts (e.g. users on the move), the correct interpretation of the user's need by such systems has become crucial to giving consistent answers to user questions. The most effective applications addressing this task are in the fields of natural language processing and semantic expansion of terms. These techniques aim to estimate the goal of an input query by reformulating it as an intent, commonly relying on textual resources built by exploiting different semantic relations such as synonymy, antonymy and many others. The aim of this paper is to generate such resources using the labels of a given taxonomy as the source of information. The obtained resources are integrated into a plain classifier that reformulates a set of input queries as intents, and the effect of each relation is tracked in order to quantify the impact of each semantic relation on the classification. As an extension, the best tradeoff between improvement and noise introduction when combining such relations is evaluated. The assessment is made by generating the resources and their combinations and using them to tune the classifier that reformulates the user questions as labels. The evaluation employs a wide and varied taxonomy as a use case, exploiting its labels as the basis for the semantic expansion and producing several corpora with the purpose of enhancing the estimation of pseudo-queries.

Estimation/estimator · Scenario · Performer · Normalized · Example

December 14, 2023

Stein estimation in a multivariate setting

Adrian Fischer,Robert E. Gaunt,Yvik Swan

from arxiv, 19 pages

We use Stein characterisations to derive new moment-type estimators for the parameters of several multivariate distributions in the i.i.d. case; we also derive the asymptotic properties of these estimators. Our examples include the multivariate truncated normal distribution and several spherical distributions. The estimators are explicit and therefore provide an interesting alternative to the maximum-likelihood estimator. The quality of these estimators is assessed through competitive simulation studies in which we compare their behaviour to the performance of other estimators available in the literature.

Optimizer · INTERACT · Networking · Knowledge · Performer

May 11, 2022

Dynamic neighbourhood optimisation for task allocation using multi-agent

Niall Creech,Natalia Criado Pacheco,Simon Miles

from arxiv, 28 pages

In large-scale systems there are fundamental challenges when centralised techniques are used for task allocation. The number of interactions is limited by resource constraints on computation, storage, and network communication. We can increase scalability by implementing the system as a distributed task-allocation system, sharing tasks across many agents. However, this also increases the resource cost of communications and synchronisation, and is difficult to scale. In this paper we present four algorithms to solve these problems. The combination of these algorithms enables each agent to improve its task allocation strategy through reinforcement learning, while changing how much it explores the system in response to how optimal it believes its current strategy is, given its past experience. We focus on distributed agent systems where the agents' behaviours are constrained by resource usage limits, limiting agents to local rather than system-wide knowledge. We evaluate these algorithms in a simulated environment where agents are given a task composed of multiple subtasks that must be allocated to other agents with differing capabilities, which then carry out those subtasks. We also simulate real-life system effects such as networking instability. Our solution is shown to solve the task allocation problem to within 6.7% of the theoretical optimum for the system configurations considered. It provides 5x better performance recovery than approaches without knowledge retention when system connectivity is impacted, and is tested on systems of up to 100 agents with less than a 9% impact on the algorithms' performance.

Greedy · Modality · MoDELS · Learned · Generalization theory

February 10, 2022

Characterizing and overcoming the greedy nature of learning in multi-modal deep neural networks

Nan Wu,Stanisław Jastrzębski,Kyunghyun Cho,Krzysztof J. Geras

We hypothesize that due to the greedy nature of learning in multi-modal deep neural networks, these models tend to rely on just one modality while under-fitting the other modalities. Such behavior is counter-intuitive and hurts the models' generalization, as we observe empirically. To estimate the model's dependence on each modality, we compute the gain on the accuracy when the model has access to it in addition to another modality. We refer to this gain as the conditional utilization rate. In the experiments, we consistently observe an imbalance in conditional utilization rates between modalities, across multiple tasks and architectures. Since conditional utilization rate cannot be computed efficiently during training, we introduce a proxy for it based on the pace at which the model learns from each modality, which we refer to as the conditional learning speed. We propose an algorithm to balance the conditional learning speeds between modalities during training and demonstrate that it indeed addresses the issue of greedy learning. The proposed algorithm improves the model's generalization on three datasets: Colored MNIST, Princeton ModelNet40, and NVIDIA Dynamic Hand Gesture.
