
We present a novel computational model employing hierarchical active inference to simulate reading and eye movements. The model characterizes linguistic processing as inference over a hierarchical generative model, supporting predictions and inferences at various levels of granularity, from syllables to sentences. Our approach combines the strengths of large language models, which provide realistic textual predictions, with active inference, which guides eye movements toward the most informative parts of the text so that those predictions can be tested. The model reads both known and unknown words and sentences proficiently, in line with the distinction between lexical and nonlexical routes in dual-route theories of reading. Notably, our model permits the exploration of maladaptive inference effects on eye movements during reading, such as in dyslexia. To simulate this condition, we attenuate the contribution of priors during the reading process, leading to incorrect inferences and a more fragmented reading style characterized by a greater number of shorter saccades. This alignment with empirical findings on eye movements in dyslexic individuals highlights the model's potential to aid in understanding the cognitive processes underlying reading and eye movements, as well as how the reading deficits associated with dyslexia may emerge from maladaptive predictive processing. In summary, our model advances the understanding of the intricate cognitive processes involved in reading and eye movements, and its simulation of maladaptive inference may offer valuable insights into dyslexia and contribute to the development of more effective interventions.
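To make the prior-attenuation manipulation concrete, here is a minimal sketch of how weakening priors flattens a posterior in a toy discrete inference step; the vocabulary, the likelihoods, and the `prior_precision` knob are illustrative stand-ins, not the paper's hierarchical generative model:

```python
import numpy as np

# Toy sketch of prior attenuation in a discrete word-inference step.
# All quantities here (vocab, likelihoods, prior_precision) are
# illustrative; the paper's actual generative model is hierarchical
# and far richer.

vocab = ["cat", "car", "can"]
prior = np.array([0.7, 0.2, 0.1])          # lexical prior over candidate words
likelihood = np.array([0.3, 0.4, 0.3])     # evidence from one noisy visual sample

def posterior(prior, likelihood, prior_precision=1.0):
    """Combine prior and likelihood; prior_precision < 1 attenuates the
    prior, mimicking the maladaptive (dyslexia-like) condition."""
    log_post = prior_precision * np.log(prior) + np.log(likelihood)
    p = np.exp(log_post - log_post.max())
    return p / p.sum()

print(posterior(prior, likelihood, 1.0))   # intact priors: "cat" dominates
print(posterior(prior, likelihood, 0.2))   # attenuated priors: flatter belief,
                                           # so more samples (saccades) are needed
```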


We aim at developing white-box machine learning algorithms. We focus here on algorithms for learning axioms in description logic. We extend the Class Expression Learning for Ontology Engineering (CELOE) algorithm contained in the DL-Learner tool. The approach uses multiple search trees and a shared pool of refinements in order to split the search space into smaller subspaces. We introduce a conjunction operation over the best class expressions from each tree, keeping the results that convey the most information. The aim is to foster exploration from a diverse set of starting classes and to streamline the process of finding class expressions in ontologies. The current implementation and settings indicated that the Forest Mixing approach did not outperform the traditional CELOE. Despite these results, the conceptual proposal brought forward by this approach may stimulate future improvements in class expression finding in ontologies.
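The conjunction step can be illustrated with a toy sketch; the set-based "class expressions" and the accuracy score below are stand-ins for DL-Learner's OWL class expressions and CELOE's heuristic, not the actual Forest Mixing implementation:

```python
from itertools import combinations

# Illustrative sketch of the Forest Mixing idea: take the best class
# expression from each search tree and keep the conjunction that adds
# the most information, judged by a toy accuracy score.

def score(expr, positives, negatives):
    """Toy accuracy: fraction of examples classified correctly by expr."""
    covered = lambda example: all(atom in example for atom in expr)
    tp = sum(covered(p) for p in positives)
    tn = sum(not covered(n) for n in negatives)
    return (tp + tn) / (len(positives) + len(negatives))

best_per_tree = [{"Person"}, {"hasChild"}, {"Male"}]  # best expression per tree
positives = [{"Person", "hasChild", "Male"}, {"Person", "hasChild"}]
negatives = [{"Person"}, {"Male"}]

# Conjunctions (here: set unions of atoms) of the per-tree best expressions.
candidates = [a | b for a, b in combinations(best_per_tree, 2)]
best = max(candidates, key=lambda e: score(e, positives, negatives))
print(best, score(best, positives, negatives))
```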

FP8 formats are gaining popularity as a way to boost computational efficiency in the training and inference of large deep learning models. Their main challenge is that a careful choice of scaling is needed to prevent degradation due to the reduced dynamic range compared to higher-precision formats. Although there exists ample literature about selecting such scalings for INT formats, this critical aspect has yet to be addressed for FP8. This paper presents a methodology to select the scalings for FP8 linear layers, based on dynamically updating per-tensor scales for the weights, gradients and activations. We apply this methodology to train and validate GPT- and Llama 2-style large language models using FP8, for model sizes ranging from 111M to 70B. To facilitate the understanding of the FP8 dynamics, our results are accompanied by plots of the per-tensor scale distribution for weights, activations and gradients during both training and inference.
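As a rough illustration of per-tensor scaling, the sketch below computes an amax-based scale for a weight tensor and simulates FP8's limited dynamic range by clamping; the constant `FP8_E4M3_MAX` is the standard E4M3 maximum, but the margin parameter and update rule are generic choices, not necessarily the paper's exact recipe:

```python
import torch

# Minimal sketch of amax-based per-tensor scaling for an FP8 linear layer.

FP8_E4M3_MAX = 448.0  # largest finite value representable in E4M3

def compute_scale(tensor, margin=0.0):
    """Scale so the tensor's absolute maximum maps near the FP8 maximum."""
    amax = tensor.abs().max().clamp(min=1e-12)
    return (FP8_E4M3_MAX / amax) * (2.0 ** -margin)

def fake_quant_fp8(tensor, scale):
    """Simulate FP8's limited dynamic range by clamping in the scaled
    domain (mantissa rounding is omitted in this sketch)."""
    scaled = (tensor * scale).clamp(-FP8_E4M3_MAX, FP8_E4M3_MAX)
    return scaled / scale  # dequantize back for a higher-precision matmul

w = torch.randn(1024, 1024) * 0.02
s_w = compute_scale(w)
w_q = fake_quant_fp8(w, s_w)
print(f"scale={s_w.item():.1f}, max err={(w - w_q).abs().max().item():.3e}")
```

In practice such scales are tracked and updated dynamically per tensor (weights, activations, gradients) as their distributions drift during training.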

Neural network pruning compresses automatic speech recognition (ASR) models effectively. However, in multilingual ASR, language-agnostic pruning may lead to severe performance drops on some languages because language-agnostic pruning masks may not fit all languages and discard important language-specific parameters. In this work, we present ASR pathways, a sparse multilingual ASR model that activates language-specific sub-networks ("pathways"), such that the parameters for each language are learned explicitly. With the overlapping sub-networks, the shared parameters can also enable knowledge transfer for lower-resource languages via joint multilingual training. We propose a novel algorithm to learn ASR pathways, and evaluate the proposed method on 4 languages with a streaming RNN-T model. Our proposed ASR pathways outperform both dense models and a language-agnostically pruned model, and provide better performance on low-resource languages compared to the monolingual sparse models.
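A minimal sketch of the pathways idea follows, with fixed per-language binary masks over a shared weight matrix; the paper learns these masks jointly with the model, which is omitted here for clarity:

```python
import torch

# Illustrative sketch of "pathways": per-language binary masks over shared
# weights, so each language activates its own sub-network. Overlapping
# masks mean parameters are shared across languages, enabling transfer.

shared_weight = torch.randn(256, 256)
masks = {
    "en": (torch.rand(256, 256) < 0.7).float(),  # higher-resource: denser pathway
    "sw": (torch.rand(256, 256) < 0.5).float(),  # lower-resource: sparser pathway
}

def forward(x, language):
    w = shared_weight * masks[language]  # activate the language's sub-network
    return x @ w.T

x = torch.randn(8, 256)
print(forward(x, "en").shape)
```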

How do neural networks extract patterns from pixels? Feature visualizations attempt to answer this important question by visualizing highly activating patterns through optimization. Today, visualization methods form the foundation of our knowledge about the internal workings of neural networks, as a type of mechanistic interpretability. Here we ask: How reliable are feature visualizations? We start our investigation by developing network circuits that trick feature visualizations into showing arbitrary patterns that are completely disconnected from normal network behavior on natural input. We then provide evidence for a similar phenomenon occurring in standard, unmanipulated networks: feature visualizations are processed very differently from standard input, casting doubt on their ability to "explain" how neural networks process natural images. This can be used as a sanity check for feature visualizations. We underpin our empirical findings with theory, proving that the set of functions that can be reliably understood by feature visualization is extremely small and does not include general black-box neural networks. Therefore, a promising way forward could be the development of networks that enforce certain structures in order to ensure more reliable feature visualizations.
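For reference, here is a bare-bones activation-maximization loop of the kind such visualizations rely on; mature tools add regularizers (jitter, frequency penalties, transformations) that are omitted here, and the layer/channel choice is arbitrary:

```python
import torch
import torchvision.models as models

# Bare-bones feature visualization by activation maximization: optimize an
# input image to maximize the mean activation of one channel in one layer.

model = models.resnet18(weights=None).eval()
target_layer, target_channel = model.layer3, 5  # arbitrary illustrative choice

acts = {}
target_layer.register_forward_hook(lambda mod, inp, out: acts.update(out=out))

img = torch.randn(1, 3, 224, 224, requires_grad=True)
opt = torch.optim.Adam([img], lr=0.05)
for _ in range(100):
    opt.zero_grad()
    model(img)
    loss = -acts["out"][0, target_channel].mean()  # maximize channel activation
    loss.backward()
    opt.step()
# `img` now shows a highly activating pattern for the chosen channel.
```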

We present ReCAT, a recursive composition augmented Transformer that is able to explicitly model hierarchical syntactic structures of raw texts without relying on gold trees during both learning and inference. Existing research along this line restricts data to follow a hierarchical tree structure and thus lacks inter-span communication. To overcome the problem, we propose a novel contextual inside-outside (CIO) layer that learns contextualized representations of spans through bottom-up and top-down passes, where a bottom-up pass forms representations of high-level spans by composing low-level spans, while a top-down pass combines information inside and outside a span. By stacking several CIO layers between the embedding layer and the attention layers in Transformer, the ReCAT model can perform both deep intra-span and deep inter-span interactions, and thus generate multi-grained representations fully contextualized with other spans. Moreover, the CIO layers can be jointly pre-trained with Transformers, making ReCAT enjoy scaling ability, strong performance, and interpretability at the same time. We conduct experiments on various sentence-level and span-level tasks. Evaluation results indicate that ReCAT can significantly outperform vanilla Transformer models on all span-level tasks and baselines that combine recursive networks with Transformers on natural language inference tasks. More interestingly, the hierarchical structures induced by ReCAT exhibit strong consistency with human-annotated syntactic trees, indicating good interpretability brought by the CIO layers.
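A toy inside (bottom-up) pass in the spirit of the CIO layer is sketched below; the real layer also runs a top-down (outside) pass, scores and soft-weights all split points, and contextualizes spans against each other, all of which is omitted here:

```python
import torch
import torch.nn as nn

# Toy bottom-up span composition: representations of longer spans are
# built by composing shorter sub-spans, as in the inside pass of a CIO
# layer. Only the trivial left-branching split is used in this sketch.

d = 64
compose = nn.Sequential(nn.Linear(2 * d, d), nn.Tanh())

tokens = torch.randn(5, d)                 # word embeddings for a 5-token sentence
spans = {(i, i): tokens[i] for i in range(5)}
for length in range(1, 5):                 # grow spans from short to long
    for i in range(5 - length):
        j = i + length
        # simplest possible split; the model considers every split point
        spans[(i, j)] = compose(torch.cat([spans[(i, j - 1)], tokens[j]]))
print(spans[(0, 4)].shape)                 # representation of the full sentence
```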

The success of language models, especially transformer-based architectures, has trickled into other domains giving rise to "scientific language models" that operate on small molecules, proteins or polymers. In chemistry, language models contribute to accelerating the molecule discovery cycle as evidenced by promising recent findings in early-stage drug discovery. Here, we review the role of language models in molecular discovery, underlining their strength in de novo drug design, property prediction and reaction chemistry. We highlight valuable open-source software assets thus lowering the entry barrier to the field of scientific language modeling. Last, we sketch a vision for future molecular design that combines a chatbot interface with access to computational chemistry tools. Our contribution serves as a valuable resource for researchers, chemists, and AI enthusiasts interested in understanding how language models can and will be used to accelerate chemical discovery.

A question answering system extracts precise answers from a text for a given input question. Recent studies have built Marathi question answering systems using ontology-based, rule-based, and machine learning approaches. More recently, transformer models and transfer learning have been applied to question answering challenges. In this paper we investigate different transformer models for creating a reading-comprehension-based Marathi question answering system. We experimented with multilingual and monolingual pretrained Marathi language models such as Multilingual Representations for Indian Languages (MuRIL), MahaBERT, and Indic Bidirectional Encoder Representations from Transformers (IndicBERT), fine-tuning them on a Marathi reading-comprehension dataset. The fine-tuned MuRIL multilingual model achieved the best accuracy, with an EM score of 0.64 and an F1 score of 0.74.
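A minimal sketch of extractive QA with a public MuRIL checkpoint via Hugging Face Transformers follows; note that the question-answering head is freshly initialized here and only becomes useful after fine-tuning on a Marathi reading-comprehension dataset, and the placeholder strings must be replaced with a real question and passage:

```python
from transformers import AutoTokenizer, AutoModelForQuestionAnswering, pipeline

# "google/muril-base-cased" is the public MuRIL base checkpoint; its QA
# head is randomly initialized until fine-tuned, so predictions below are
# meaningless before training.

model_name = "google/muril-base-cased"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForQuestionAnswering.from_pretrained(model_name)

qa = pipeline("question-answering", model=model, tokenizer=tokenizer)
result = qa(question="...", context="...")  # Marathi question and passage here
print(result["answer"], result["score"])
```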

Eye movements have a spatial (where people look), but also a temporal (when people look) component. Various types of visualizations have been proposed that take this spatio-temporal nature of the data into account, but it is unclear how well each one can be interpreted and whether such interpretation depends on the question asked about the data or the nature of the data-set that is being visualised. In this study, four spatio-temporal visualization techniques for eye movements (chord diagram, scanpath, scarfplot, space-time cube) were compared in a user study. Participants (N = 25) answered three questions (what region first, what region most, which regions most between) about each visualization, which was based on two types of data-sets (eye movements towards adverts, eye movements towards pairs of gambles). Accuracy of the answers depended on a combination of the data-set, the question that needed to be answered, and the type of visualization. For most questions, the scanpath, which did not use area of interest (AOI) information, resulted in lower accuracy than the other graphs. This suggests that AOIs improve the information conveyed by graphs. No effects of experience with reading graphs (for work or not for work) or education on accuracy of the answers were found. The results therefore suggest that there is no single best visualisation of the spatio-temporal aspects of eye movements. When visualising eye movement data, a user study may therefore be beneficial to determine the optimal visualization for the data-set and research question at hand.

We present a novel learned image reconstruction method for accelerated cardiac MRI with multiple receiver coils based on deep convolutional neural networks (CNNs) and algorithm unrolling. In contrast to many existing learned MR image reconstruction techniques that necessitate coil-sensitivity map (CSM) estimation as a distinct network component, our proposed approach avoids explicit CSM estimation. Instead, it implicitly captures and learns to exploit the inter-coil relationships of the images. Our method consists of a series of novel learned image and k-space blocks with shared latent information and adaptation to the acquisition parameters by feature-wise modulation (FiLM), as well as coil-wise data-consistency (DC) blocks. Our method achieved PSNR values of 34.89 and 35.56 and SSIM values of 0.920 and 0.942 in the cine track and mapping track validation leaderboard of the MICCAI STACOM CMRxRecon Challenge, respectively, ranking 4th among different teams at the time of writing. Code will be made available at //github.com/fzimmermann89/CMRxRecon
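Feature-wise linear modulation (FiLM) itself is a generic conditioning mechanism; a minimal sketch follows, with illustrative shapes and conditioning vector (the paper's actual blocks and acquisition-parameter encoding may differ):

```python
import torch
import torch.nn as nn

# Generic FiLM block: a conditioning vector (here standing in for encoded
# acquisition parameters) produces per-channel scale and shift that
# modulate the feature maps of a reconstruction block.

class FiLM(nn.Module):
    def __init__(self, n_channels, cond_dim):
        super().__init__()
        self.to_scale_shift = nn.Linear(cond_dim, 2 * n_channels)

    def forward(self, x, cond):
        # x: (B, C, H, W) feature maps; cond: (B, cond_dim) conditioning
        scale, shift = self.to_scale_shift(cond).chunk(2, dim=-1)
        return x * (1 + scale[:, :, None, None]) + shift[:, :, None, None]

film = FiLM(n_channels=32, cond_dim=4)
x = torch.randn(2, 32, 64, 64)
cond = torch.randn(2, 4)  # e.g. acceleration factor, sequence type (illustrative)
print(film(x, cond).shape)
```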

Deep learning constitutes a recent, modern technique for image processing and data analysis, with promising results and large potential. Having been successfully applied in various domains, deep learning has recently also entered the domain of agriculture. In this paper, we perform a survey of 40 research efforts that employ deep learning techniques, applied to various agricultural and food production challenges. We examine the particular agricultural problems under study, the specific models and frameworks employed, the sources, nature and pre-processing of the data used, and the overall performance achieved according to the metrics used in each work under study. Moreover, we study comparisons of deep learning with other existing popular techniques, with respect to differences in classification or regression performance. Our findings indicate that deep learning provides high accuracy, outperforming existing commonly used image processing techniques.
