The efficiency of natural language processing has improved dramatically with the advent of machine learning models, particularly neural network-based solutions. However, some tasks remain challenging, especially in specific domains. In this paper, we present a cloud-based system that extracts insights from customer reviews using machine learning methods integrated into a pipeline. For topic modeling, our composite model uses transformer-based neural networks designed for natural language processing, vector-embedding-based keyword extraction, and clustering. The elements of our model have been integrated and further developed to better meet the requirements of efficient information extraction, topic modeling of the extracted information, and user needs. Furthermore, our system achieves better results on this task than existing topic modeling and keyword extraction solutions. Our approach is validated and compared with other state-of-the-art methods using publicly available benchmark datasets.
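To make the embedding-based keyword extraction and clustering stage concrete, the following is a minimal sketch of that kind of pipeline, not the system described above; the embedding model, cluster count, and sample reviews are illustrative assumptions.

```python
# Minimal sketch of embedding-based keyword extraction plus clustering,
# loosely in the spirit of the pipeline above. Model name, cluster count,
# and sample reviews are illustrative assumptions, not the paper's settings.
from sklearn.cluster import KMeans
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.metrics.pairwise import cosine_similarity
from sentence_transformers import SentenceTransformer

reviews = [
    "The battery drains far too quickly on this phone.",
    "Battery life is poor, barely lasts half a day.",
    "Excellent camera, photos are sharp even at night.",
]

encoder = SentenceTransformer("all-MiniLM-L6-v2")  # assumed embedding model
review_vecs = encoder.encode(reviews)

# Group reviews into candidate topics.
topics = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(review_vecs)

# Score candidate keywords per review by embedding similarity (KeyBERT-style).
candidates = CountVectorizer(ngram_range=(1, 2), stop_words="english").fit(reviews)
vocab = candidates.get_feature_names_out()
vocab_vecs = encoder.encode(list(vocab))

for review, vec, topic in zip(reviews, review_vecs, topics):
    sims = cosine_similarity([vec], vocab_vecs)[0]
    top = vocab[sims.argsort()[-3:][::-1]]
    print(f"topic {topic}: {review!r} -> keywords {list(top)}")
```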
Recent conditional language models are able to continue any kind of text source in an often seemingly fluent way. This has encouraged research on open-domain conversational systems that are based on powerful language models and aim to imitate an interlocutor by generating appropriate contributions to a written dialogue. From a linguistic perspective, however, the complexity of contributing to a conversation is high. In this survey, we interpret Grice's maxims of cooperative conversation from the perspective of this specific research area and systematize the literature under the aspect of what makes a contribution appropriate: a neural conversation model has to be fluent, informative, consistent, coherent, and follow social norms. To ensure these qualities, recent approaches try to tame the underlying language models at various intervention points, such as the data, the training regime, or decoding. Organized by these categories and intervention points, we discuss promising attempts and suggest novel ways for future research.
Recent studies have shown that multi-encoder models are agnostic to the choice of context, and that the context encoder generates noise which helps improve the models in terms of BLEU score. In this paper, we explore this idea further by training multi-encoder models on three different context settings, namely the previous two sentences, two random sentences, and a mix of both, and evaluating them on a context-aware pronoun translation test set. Specifically, we evaluate the models on the ContraPro test set to study how different contexts affect pronoun translation accuracy. The results show that the model can perform well on the ContraPro test set even when the context is random. We also analyze the source representations to study whether the context encoder generates noise. Our analysis shows that the context encoder provides sufficient information for learning discourse-level phenomena. Additionally, we observe that mixing the selected context (the previous two sentences in this case) with the random context is generally better than the other settings.
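As an illustration of the three context settings compared above, a minimal sketch of how such contexts could be constructed is shown below; the function name and the coin-flip mixing rule are assumptions, not the paper's implementation.

```python
import random

def build_context(doc_sents, i, setting, rng=random):
    """Return context for sentence i under one of three settings.

    'previous': the two sentences preceding sentence i,
    'random'  : two sentences drawn at random from the document,
    'mixed'   : per example, flip a coin between the two settings above.
    The coin-flip mixing is an illustrative assumption.
    """
    if setting == "mixed":
        setting = rng.choice(["previous", "random"])
    if setting == "previous":
        return doc_sents[max(0, i - 2):i]
    if setting == "random":
        pool = [s for j, s in enumerate(doc_sents) if j != i]
        return rng.sample(pool, k=min(2, len(pool)))
    raise ValueError(f"unknown setting: {setting}")

doc = ["A cat sat.", "It purred.", "Then it slept.", "The dog barked."]
print(build_context(doc, 2, "previous"))  # ['A cat sat.', 'It purred.']
```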
Deep reinforcement learning has achieved significant results in low-level control tasks. However, for applications such as autonomous driving and drone flying, it is difficult to maintain stable control because the agent may suddenly change its actions, which often lowers the control system's efficiency, induces excessive mechanical wear, and causes uncontrollable, dangerous behavior of the vehicle. Recently, a method called conditioning for action policy smoothness (CAPS) was proposed to solve the problem of jerkiness for low-dimensional features in applications such as quadrotor drones. To cope with high-dimensional features, this paper proposes image-based regularization for action smoothness (I-RAS) for solving jerky control in autonomous miniature car racing. We also introduce a control scheme based on an impact ratio, an adaptive regularization weight that controls the smoothness constraint, called IR control. In our experiments, an agent with I-RAS and IR control significantly improves the success rate from 59% to 95%. In the real-world-track experiment, the agent also outperforms other methods, reducing the average lap time while improving the completion rate, even without real-world training. This is further corroborated by an agent based on I-RAS winning the 2022 AWS DeepRacer Final Championship Cup.
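For intuition, the following sketches a CAPS-style action-smoothness regularizer in PyTorch. It is a generic illustration: the exact loss terms, and in particular how the impact ratio adapts the weights in I-RAS/IR control, are assumed here rather than taken from the paper.

```python
import torch

def smoothness_loss(policy, obs_t, obs_t1, lam_t=1.0, lam_s=1.0, sigma=0.05):
    """CAPS-style action-smoothness regularizer (a sketch, not the paper's code).

    Temporal term: keep actions close for consecutive observations.
    Spatial term : keep actions close for slightly perturbed observations.
    For I-RAS the observations would be images; the lam_* weights would be
    adapted by the impact-ratio (IR) control described above (details assumed).
    """
    a_t, a_t1 = policy(obs_t), policy(obs_t1)
    temporal = (a_t - a_t1).pow(2).mean()
    a_near = policy(obs_t + sigma * torch.randn_like(obs_t))
    spatial = (a_t - a_near).pow(2).mean()
    return lam_t * temporal + lam_s * spatial

# Usage inside a training step (policy_loss from the RL algorithm):
# total_loss = policy_loss + smoothness_loss(policy, obs_t, obs_t1)
```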
The ability to interpret machine learning models has become increasingly important as their usage in data science continues to rise. Most current interpretability methods are optimized to work on either (\textit{i}) a global scale, where the goal is to rank features based on their contributions to overall variation in an observed population, or (\textit{ii}) the local level, which aims to detail how important a feature is to a particular individual in the data set. In this work, we propose a new operator called the "GlObal And Local Score" (GOALS): a simple \textit{post hoc} approach to simultaneously assess local and global feature importance in nonlinear models. Motivated by problems in biomedicine, the approach is demonstrated using Gaussian process regression, where understanding how genetic markers are associated with disease progression both within individuals and across populations is of high interest. Detailed simulations and real data analyses illustrate the flexible and efficient utility of GOALS over state-of-the-art variable importance strategies.
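To illustrate the local-versus-global distinction (not the GOALS operator itself), here is a hedged stand-in that computes simple perturbation-based local scores with a Gaussian process regressor and averages them into global scores; the toy data and the scoring rule are assumptions.

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor

# Toy data: y depends on feature 0 strongly, feature 1 weakly.
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 2))
y = 2.0 * X[:, 0] + 0.2 * X[:, 1] + 0.1 * rng.normal(size=100)

gp = GaussianProcessRegressor().fit(X, y)

def local_scores(model, x, baseline):
    """Per-individual scores: change in prediction when one feature is reset
    to a population baseline. A simple stand-in illustrating the local-vs-
    global idea, NOT the GOALS operator from the paper."""
    base_pred = model.predict(x[None, :])[0]
    scores = np.empty(x.shape[0])
    for j in range(x.shape[0]):
        x_mod = x.copy()
        x_mod[j] = baseline[j]
        scores[j] = abs(base_pred - model.predict(x_mod[None, :])[0])
    return scores

baseline = X.mean(axis=0)
local = np.array([local_scores(gp, x, baseline) for x in X])
print("local scores, individual 0:", local[0])
print("global scores (mean over population):", local.mean(axis=0))
```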
Deceptive text classification is a critical task in natural language processing that aims to identify deceptive or fraudulent content. This study presents a comparative analysis of machine learning and transformer-based approaches to deceptive text classification. We investigate the effectiveness of traditional machine learning algorithms and state-of-the-art transformer models, such as BERT, XLNet, DistilBERT, and RoBERTa, in detecting deceptive text. A labeled dataset consisting of deceptive and non-deceptive texts is used for training and evaluation. Through extensive experimentation, we compare the performance of the different approaches on metrics including accuracy, precision, recall, and F1 score. The results of this study shed light on the strengths and limitations of machine learning and transformer-based methods for deceptive text classification, enabling researchers and practitioners to make informed decisions when dealing with deceptive content.
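As a sketch of the classical-ML side of such a comparison, the following trains a TF-IDF plus logistic regression baseline and reports the four metrics listed above; the toy texts and labels are placeholders, not the study's dataset.

```python
# TF-IDF + logistic regression baseline with the four reported metrics.
# Texts and labels are placeholders, not the study's dataset.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, precision_recall_fscore_support
from sklearn.model_selection import train_test_split

texts = ["Win a free prize now!!!", "Meeting moved to 3pm.",
         "You have been selected for a reward.", "Lunch tomorrow?"] * 25
labels = [1, 0, 1, 0] * 25  # 1 = deceptive, 0 = non-deceptive

X_tr, X_te, y_tr, y_te = train_test_split(texts, labels, test_size=0.2, random_state=0)
vec = TfidfVectorizer().fit(X_tr)
clf = LogisticRegression().fit(vec.transform(X_tr), y_tr)
pred = clf.predict(vec.transform(X_te))

p, r, f1, _ = precision_recall_fscore_support(y_te, pred, average="binary")
print(f"acc={accuracy_score(y_te, pred):.3f} prec={p:.3f} rec={r:.3f} f1={f1:.3f}")
```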
Balance assessment during physical rehabilitation often relies on rubric-oriented battery tests to score a patient's physical capabilities, leading to subjectivity. While some objective balance assessments exist, they are often limited to tracking the center of pressure (COP), which does not fully capture whole-body postural stability. This study explores the use of the center of mass (COM) state space and presents a promising avenue for monitoring balance capabilities in humans. We employ a musculoskeletal model integrated with a balance controller, trained through reinforcement learning (RL), to investigate balancing capabilities. The RL framework consists of two interconnected neural networks governing balance recovery and muscle coordination, respectively, trained using Proximal Policy Optimization (PPO) with reference state initialization, early termination, and multiple training strategies. By exploring recovery from random initial COM states (positions and velocities) for a trained controller, we obtain the final balance region (BR) enclosing successful balance recovery trajectories. Comparing the BRs with analytical postural stability limits from a linear inverted pendulum model, we observe a similar trend in successful COM states but more limited ranges in the recoverable areas. We further investigate the effect of muscle weakness and neural excitation delay on the BRs, revealing reduced balancing capability in different regions. Overall, our approach of learning muscular balance controllers presents a promising new method for establishing balance recovery limits and objectively assessing balance capability in bipedal systems, particularly humans.
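The analytical stability limits referenced above can be made concrete with the extrapolated center of mass condition for a linear inverted pendulum (Hof's XCoM); the sketch below uses assumed leg length and base-of-support values and is not the paper's controller.

```python
import numpy as np

def recoverable_lip(x_com, v_com, leg_length=1.0, bos=(-0.05, 0.20), g=9.81):
    """Analytical stability check from a linear inverted pendulum (LIP) model.

    A COM state (position, velocity) is recoverable without stepping if the
    extrapolated center of mass x + v/omega0 (Hof's XCoM) falls inside the
    base of support. Leg length and base-of-support bounds are assumed values.
    """
    omega0 = np.sqrt(g / leg_length)
    xcom = x_com + v_com / omega0
    return bos[0] <= xcom <= bos[1]

# Map a grid of COM states to recoverable / not-recoverable, analogous to the
# analytical limits the learned balance regions (BRs) are compared against.
for x in (0.0, 0.1):
    for v in (-0.3, 0.0, 0.3):
        print(f"x={x:+.2f} v={v:+.2f} recoverable={recoverable_lip(x, v)}")
```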
Pre-trained Language Models (PLMs), which are trained on large text corpora via self-supervised learning, have yielded promising performance on various tasks in Natural Language Processing (NLP). However, though PLMs with huge numbers of parameters can effectively capture rich knowledge from massive training text and benefit downstream tasks at the fine-tuning stage, they still have limitations such as poor reasoning ability due to the lack of external knowledge. Much research has been dedicated to incorporating knowledge into PLMs to tackle these issues. In this paper, we present a comprehensive review of Knowledge-Enhanced Pre-trained Language Models (KE-PLMs) to provide a clear insight into this thriving field. We introduce appropriate taxonomies for Natural Language Understanding (NLU) and Natural Language Generation (NLG), respectively, to highlight these two main tasks of NLP. For NLU, we divide the types of knowledge into four categories: linguistic knowledge, text knowledge, knowledge graph (KG), and rule knowledge. The KE-PLMs for NLG are categorized into KG-based and retrieval-based methods. Finally, we point out some promising future directions for KE-PLMs.
The canonical approach to video-and-language learning (e.g., video question answering) dictates that a neural model learn from densely extracted offline video features from vision models and text features from language models. These feature extractors are trained independently, and usually on tasks different from the target domains, rendering the fixed features sub-optimal for downstream tasks. Moreover, due to the high computational overhead of dense video features, it is often difficult (or infeasible) to plug feature extractors directly into existing approaches for easy fine-tuning. To remedy this dilemma, we propose ClipBERT, a generic framework that enables affordable end-to-end learning for video-and-language tasks by employing sparse sampling, where only a single short clip or a few sparsely sampled short clips from a video are used at each training step. Experiments on text-to-video retrieval and video question answering across six datasets demonstrate that ClipBERT outperforms (or is on par with) existing methods that exploit full-length videos, suggesting that end-to-end learning with just a few sparsely sampled clips is often more accurate than using densely extracted offline features from full-length videos, proving the proverbial less-is-more principle. The videos in these datasets span considerably different domains and lengths, ranging from 3-second generic-domain GIF videos to 180-second YouTube human activity videos, showing the generalization ability of our approach. Comprehensive ablation studies and thorough analyses are provided to dissect the factors that lead to this success. Our code is publicly available at //github.com/jayleicn/ClipBERT
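A minimal sketch of the sparse sampling idea, assuming a frames-first video tensor and illustrative clip count and length (ClipBERT's actual sampler and settings may differ):

```python
import torch

def sample_sparse_clips(video, num_clips=2, clip_len=8, generator=None):
    """Sparse sampling as described above: use only a few short clips per
    training step instead of densely extracted full-video features.
    `video` is (num_frames, C, H, W); clip count and length are assumptions.
    """
    num_frames = video.shape[0]
    starts = torch.randint(0, num_frames - clip_len + 1, (num_clips,),
                           generator=generator)
    return torch.stack([video[s:s + clip_len] for s in starts])

video = torch.randn(300, 3, 224, 224)  # e.g. a 10 s video at 30 fps
clips = sample_sparse_clips(video)
print(clips.shape)                     # torch.Size([2, 8, 3, 224, 224])
```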
The notion of uncertainty is of major importance in machine learning and constitutes a key element of machine learning methodology. In line with the statistical tradition, uncertainty has long been perceived as almost synonymous with standard probability and probabilistic predictions. Yet, due to the steadily increasing relevance of machine learning for practical applications and related issues such as safety requirements, new problems and challenges have recently been identified by machine learning scholars, and these problems may call for new methodological developments. In particular, this includes the importance of distinguishing between (at least) two different types of uncertainty, often referred to as aleatoric and epistemic. In this paper, we provide an introduction to the topic of uncertainty in machine learning as well as an overview of attempts made so far at handling uncertainty in general and formalizing this distinction in particular.
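One common formalization of this distinction (among several the literature discusses) decomposes the entropy of an ensemble's mean prediction into the expected member entropy (aleatoric) plus the mutual information between prediction and model (epistemic); a minimal sketch:

```python
import numpy as np

def entropy(p, axis=-1):
    return -(p * np.log(p + 1e-12)).sum(axis=axis)

def decompose_uncertainty(member_probs):
    """Entropy-based split of predictive uncertainty for an ensemble.

    member_probs: (n_members, n_classes) class probabilities.
    total     = H[mean prediction]
    aleatoric = mean of member entropies
    epistemic = total - aleatoric (mutual information / disagreement)
    """
    mean_probs = member_probs.mean(axis=0)
    total = entropy(mean_probs)
    aleatoric = entropy(member_probs).mean()
    return total, aleatoric, total - aleatoric

# Agreeing ensemble -> near-zero epistemic; disagreeing -> high epistemic.
print(decompose_uncertainty(np.array([[0.7, 0.3], [0.7, 0.3]])))
print(decompose_uncertainty(np.array([[0.95, 0.05], [0.05, 0.95]])))
```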
Deep learning has yielded state-of-the-art performance on many natural language processing tasks, including named entity recognition (NER). However, this typically requires large amounts of labeled data. In this work, we demonstrate that the amount of labeled training data can be drastically reduced when deep learning is combined with active learning. While active learning is sample-efficient, it can be computationally expensive since it requires iterative retraining. To speed this up, we introduce a lightweight architecture for NER, viz., the CNN-CNN-LSTM model, consisting of convolutional character and word encoders and a long short-term memory (LSTM) tag decoder. The model achieves nearly state-of-the-art performance on standard datasets for the task while being computationally much more efficient than the best-performing models. We carry out incremental active learning during the training process and are able to nearly match state-of-the-art performance with just 25\% of the original training data.
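For intuition, here is a generic uncertainty-sampling active-learning loop with a linear classifier, not the paper's CNN-CNN-LSTM tagger; the seed size, query batch size, and margin criterion are assumptions.

```python
# Generic uncertainty-sampling loop: repeatedly retrain, query the examples
# the model is least confident about, and add them to the labeled set.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

X, y = make_classification(n_samples=500, random_state=0)

# Seed set containing both classes (assumed seed size of 10).
seed = np.concatenate([np.where(y == 0)[0][:5], np.where(y == 1)[0][:5]])
labeled = [int(i) for i in seed]
pool = [i for i in range(len(X)) if i not in labeled]

for round_ in range(5):
    clf = LogisticRegression(max_iter=1000).fit(X[labeled], y[labeled])
    probs = clf.predict_proba(X[pool])
    margin = np.abs(probs[:, 0] - probs[:, 1])      # small margin = uncertain
    query = [pool[i] for i in np.argsort(margin)[:10]]
    labeled += query                                # "annotate" queried items
    pool = [i for i in pool if i not in query]
    print(f"round {round_}: labeled={len(labeled)} "
          f"full-set acc={clf.score(X, y):.3f}")
```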