
Networking · Neural Networks · Learning · Generalization theory · Similarity

September 22, 2023

What Makes a Language Easy to Deep-Learn?

Lukas Galke, Yoav Ram, Limor Raviv

from arxiv, 32 pages, major update: improved text, added new analyses, added supplementary material

Neural networks drive the success of natural language processing. A fundamental property of language is its compositional structure, allowing humans to produce forms for new meanings systematically. However, unlike humans, neural networks notoriously struggle with systematic generalization, and do not necessarily benefit from compositional structure in emergent communication simulations. This poses a problem for using neural networks to simulate human language learning and evolution, and suggests crucial differences in the biases of the different learning systems. Here, we directly test how neural networks compare to humans in learning and generalizing different input languages that vary in their degree of structure. We evaluate the memorization and generalization capabilities of a pre-trained language model, GPT-3.5 (analogous to an adult second language learner), and recurrent neural networks trained from scratch (analogous to a child first language learner). Our results show striking similarities between deep neural networks and adult human learners, with more structured linguistic input leading to more systematic generalization and to better convergence between neural networks and humans. These findings suggest that all the learning systems are sensitive to the structure of languages in similar ways, with compositionality being advantageous for learning. Our findings draw a clear prediction regarding children's learning biases, as well as highlight the challenges of automated processing of languages spoken by small communities. Notably, the similarity between humans and machines opens new avenues for research on language learning and evolution.

Related content

Networking

Networking: IFIP International Conferences on Networking. Explanation: international networking conference. Publisher: IFIP. SIT:

Interpretability · Analysis · Optimizer · Diversity · Design

November 7, 2023

What Makes a Fantastic Passenger-Car Driver in Urban Contexts?

Yueteng Yu, Zhijie Yi, Xinyu Yang, Mengdi Chu, Junrong Lu, Xiang Chang, Yiyao Liu, Jingli Qin, Ye Jin, Jialin Song, Xingrui Gu, Jirui Yuan, Guyue Zhou, Jiangtao Gong

The accurate evaluation of the quality of driving behavior is crucial for optimizing and implementing autonomous driving technology in practice. However, there is currently no comprehensive understanding of good driving behaviors. In this paper, we sought to understand driving behaviors from the perspectives of both drivers and passengers. We invited 10 expert drivers and 14 novice drivers to complete a 5.7-kilometer urban road driving task. After the experiments, we conducted semi-structured interviews with 24 drivers and 48 of their passengers (two passengers per driver). Through the analysis of interview data, we identified the logic passengers use to assess driving behaviors, drivers' considerations and efforts to achieve good driving, and gaps between these perspectives. Our research provided insights into a systematic evaluation of autonomous driving and the design implications for future autonomous vehicles.

Knowledge · Distillation · Networking · Similarity · Same

November 6, 2023

What Knowledge Gets Distilled in Knowledge Distillation?

Utkarsh Ojha, Yuheng Li, Anirudh Sundara Rajan, Yingyu Liang, Yong Jae Lee

from arxiv, NeurIPS 2023 camera ready

Knowledge distillation aims to transfer useful information from a teacher network to a student network, with the primary goal of improving the student's performance for the task at hand. Over the years, there has been a deluge of novel techniques and use cases of knowledge distillation. Yet, despite the various improvements, there seems to be a glaring gap in the community's fundamental understanding of the process. Specifically, what is the knowledge that gets distilled in knowledge distillation? In other words, in what ways does the student become similar to the teacher? Does it start to localize objects in the same way? Does it get fooled by the same adversarial samples? Do its data invariance properties become similar? Our work presents a comprehensive study to try to answer these questions. We show that existing methods can indeed indirectly distill these properties beyond improving task performance. We further study why knowledge distillation might work this way, and show that our findings have practical implications as well.

Code · Continuity · Reviewer · FAST · Processing (programming language)

November 4, 2023

Does Code Review Speed Matter for Practitioners?

Gunnar Kudrjavets, Ayushi Rastogi

from arxiv, 29 pages, 7 figures. To be published in Empirical Software Engineering An International Journal

Increasing code velocity is a common goal for a variety of software projects. The efficiency of the code review process significantly impacts how fast the code gets merged into the final product and reaches the customers. We conducted a survey to study the code velocity-related beliefs and practices in place. We analyzed 75 completed surveys from 39 participants from the industry and 36 from the open-source community. Our critical findings are (a) the industry and open-source community hold a similar set of beliefs, (b) quick reaction time is of utmost importance and applies to the tooling infrastructure and the behavior of other engineers, (c) time-to-merge is the essential code review metric to improve, (d) engineers have differing opinions about the benefits of increased code velocity for their career growth, and (e) the controlled application of the commit-then-review model can increase code velocity. Our study supports the continued need to invest in and improve code velocity regardless of the underlying organizational ecosystem.

MoDELS · Taxonomy · Language modeling · Interpretability · Performance

September 2, 2023

Explainability for Large Language Models: A Survey

Haiyan Zhao, Hanjie Chen, Fan Yang, Ninghao Liu, Huiqi Deng, Hengyi Cai, Shuaiqiang Wang, Dawei Yin, Mengnan Du

Large language models (LLMs) have demonstrated impressive capabilities in natural language processing. However, their internal mechanisms are still unclear and this lack of transparency poses unwanted risks for downstream applications. Therefore, understanding and explaining these models is crucial for elucidating their behaviors, limitations, and social impacts. In this paper, we introduce a taxonomy of explainability techniques and provide a structured overview of methods for explaining Transformer-based language models. We categorize techniques based on the training paradigms of LLMs: traditional fine-tuning-based paradigm and prompting-based paradigm. For each paradigm, we summarize the goals and dominant approaches for generating local explanations of individual predictions and global explanations of overall model knowledge. We also discuss metrics for evaluating generated explanations, and discuss how explanations can be leveraged to debug models and improve performance. Lastly, we examine key challenges and emerging opportunities for explanation techniques in the era of LLMs in comparison to conventional machine learning models.

Language modeling · Performer · Agent · MoDELS · Learning

May 19, 2023

Introspective Tips: Large Language Model for In-Context Decision Making

Liting Chen, Lu Wang, Hang Dong, Yali Du, Jie Yan, Fangkai Yang, Shuang Li, Pu Zhao, Si Qin, Saravan Rajmohan, Qingwei Lin, Dongmei Zhang

from arxiv, 22 pages, 4 figures

The emergence of large language models (LLMs) has substantially influenced natural language processing, demonstrating exceptional results across various tasks. In this study, we employ "Introspective Tips" to help LLMs self-optimize their decision-making. By introspectively examining trajectories, the LLM refines its policy by generating succinct and valuable tips. Our method enhances the agent's performance in both few-shot and zero-shot learning situations by considering three essential scenarios: learning from the agent's past experiences, integrating expert demonstrations, and generalizing across diverse games. Importantly, we accomplish these improvements without fine-tuning the LLM parameters; rather, we adjust the prompt to generalize insights from the three aforementioned situations. Our framework not only supports but also emphasizes the advantage of employing LLMs in in-context decision-making. Experiments involving over 100 games in TextWorld illustrate the superior performance of our approach.

Transformation · Vision · Recognizable · Taxonomy · Prompt

January 24, 2022

Transformers in Medical Imaging: A Survey

Fahad Shamshad, Salman Khan, Syed Waqas Zamir, Muhammad Haris Khan, Munawar Hayat, Fahad Shahbaz Khan, Huazhu Fu

from arxiv, 41 pages, https://github.com/fahadshamshad/awesome-transformers-in-medical-imaging

Following unprecedented success on natural language tasks, Transformers have been successfully applied to several computer vision problems, achieving state-of-the-art results and prompting researchers to reconsider the supremacy of convolutional neural networks (CNNs) as de facto operators. Capitalizing on these advances in computer vision, the medical imaging field has also witnessed growing interest in Transformers, which can capture global context, in contrast to CNNs with local receptive fields. Inspired by this transition, in this survey we attempt to provide a comprehensive review of the applications of Transformers in medical imaging covering various aspects, ranging from recently proposed architectural designs to unsolved issues. Specifically, we survey the use of Transformers in medical image segmentation, detection, classification, reconstruction, synthesis, registration, clinical report generation, and other tasks. In particular, for each of these applications, we develop a taxonomy, identify application-specific challenges, provide insights to solve them, and highlight recent trends. Further, we provide a critical discussion of the field's current state as a whole, including the identification of key challenges and open problems and an outline of promising future directions. We hope this survey will ignite further interest in the community and provide researchers with an up-to-date reference regarding applications of Transformer models in medical imaging. Finally, to cope with the rapid development in this field, we intend to regularly update the relevant latest papers and their open-source implementations at https://github.com/fahadshamshad/awesome-transformers-in-medical-imaging.

Long short-term memory (LSTM) networks · RNN · Networking · Weight · MoDELS

June 10, 2020

Do RNN and LSTM have Long Memory?

Jingyu Zhao, Feiqing Huang, Jia Lv, Yanjie Duan, Zhen Qin, Guodong Li, Guangjian Tian

from arxiv, Accepted by ICML 2020. Added references, experiments and acknowledgements

The LSTM network was proposed to overcome the difficulty in learning long-term dependence, and has made significant advancements in applications. With its success and drawbacks in mind, this paper raises the question - do RNN and LSTM have long memory? We answer it partially by proving that RNN and LSTM do not have long memory from a statistical perspective. A new definition for long memory networks is further introduced, and it requires the model weights to decay at a polynomial rate. To verify our theory, we convert RNN and LSTM into long memory networks by making a minimal modification, and their superiority is illustrated in modeling long-term dependence of various datasets.

AdderNet · Neural Networks · Networking · Convolution · Model evaluation

December 31, 2019

AdderNet: Do We Really Need Multiplications in Deep Learning?

Hanting Chen, Yunhe Wang, Chunjing Xu, Boxin Shi, Chao Xu, Qi Tian, Chang Xu

Compared with the cheap addition operation, the multiplication operation has much higher computational complexity. The widely used convolutions in deep neural networks are exactly cross-correlations that measure the similarity between input features and convolution filters, which involves massive multiplications between float values. In this paper, we present adder networks (AdderNets) to trade these massive multiplications in deep neural networks, especially convolutional neural networks (CNNs), for much cheaper additions to reduce computation costs. In AdderNets, we take the $\ell_1$-norm distance between filters and input feature as the output response. The influence of this new similarity measure on the optimization of neural networks has been thoroughly analyzed. To achieve better performance, we develop a special back-propagation approach for AdderNets by investigating the full-precision gradient. We then propose an adaptive learning rate strategy to enhance the training procedure of AdderNets according to the magnitude of each neuron's gradient. As a result, the proposed AdderNets can achieve 74.9% Top-1 accuracy and 91.7% Top-5 accuracy using ResNet-50 on the ImageNet dataset without any multiplication in the convolution layers.
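
To make the $\ell_1$-based response concrete, here is a minimal one-dimensional sketch in plain Python. It is an illustration under stated assumptions, not the authors' implementation: the function names `adder_response` and `adder_conv1d` are hypothetical, and the negative sign (so that a closer match gives a larger response) is an assumption that the abstract itself does not spell out.

```python
def adder_response(patch, filt):
    """Similarity between an input patch and a filter using only
    subtraction, absolute value and addition (no multiplication).

    Assumption: the response is the NEGATIVE l1 distance, so an
    exact match yields the maximum value 0.
    """
    return -sum(abs(x - f) for x, f in zip(patch, filt))


def adder_conv1d(signal, filt):
    """Slide the filter over a 1-D signal (stride 1, no padding) and
    collect the adder response at each position."""
    k = len(filt)
    return [adder_response(signal[i:i + k], filt)
            for i in range(len(signal) - k + 1)]


if __name__ == "__main__":
    # The middle window [2, 3] matches the filter exactly.
    print(adder_conv1d([1, 2, 3, 4], [2, 3]))  # [-2, 0, -2]
```

A standard cross-correlation would compute `sum(x * f)` at each position; the sketch swaps that for an absolute difference, which is the trade of multiplications for additions that the abstract describes.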

Text classification · Language modeling · BERT · state-of-the-art · MoDELS

May 14, 2019

How to Fine-Tune BERT for Text Classification?

Chi Sun, Xipeng Qiu, Yige Xu, Xuanjing Huang

Language model pre-training has proven to be useful in learning universal language representations. As a state-of-the-art language model pre-training model, BERT (Bidirectional Encoder Representations from Transformers) has achieved amazing results in many language understanding tasks. In this paper, we conduct exhaustive experiments to investigate different fine-tuning methods of BERT on text classification task and provide a general solution for BERT fine-tuning. Finally, the proposed solution obtains new state-of-the-art results on eight widely-studied text classification datasets.

entity · Performer · Named entity recognition · state-of-the-art · Active learning

February 4, 2018

Deep Active Learning for Named Entity Recognition

Yanyao Shen, Hyokun Yun, Zachary C. Lipton, Yakov Kronrod, Animashree Anandkumar

Deep learning has yielded state-of-the-art performance on many natural language processing tasks including named entity recognition (NER). However, this typically requires large amounts of labeled data. In this work, we demonstrate that the amount of labeled training data can be drastically reduced when deep learning is combined with active learning. While active learning is sample-efficient, it can be computationally expensive since it requires iterative retraining. To speed this up, we introduce a lightweight architecture for NER, viz., the CNN-CNN-LSTM model consisting of convolutional character and word encoders and a long short-term memory (LSTM) tag decoder. The model achieves nearly state-of-the-art performance on standard datasets for the task while being computationally much more efficient than the best-performing models. We carry out incremental active learning during the training process, and are able to nearly match state-of-the-art performance with just 25% of the original training data.
