
Vectorization · Question answering · Modality · MoDELS · Language modeling

October 13, 2023

Driving with LLMs: Fusing Object-Level Vector Modality for Explainable Autonomous Driving

Long Chen, Oleg Sinavski, Jan Hünermann, Alice Karnsund, Andrew James Willmott, Danny Birch, Daniel Maund, Jamie Shotton

Large Language Models (LLMs) have shown promise in the autonomous driving sector, particularly in generalization and interpretability. We introduce a unique object-level multimodal LLM architecture that merges vectorized numeric modalities with a pre-trained LLM to improve context understanding in driving situations. We also present a new dataset of 160k QA pairs derived from 10k driving scenarios, paired with high-quality control commands collected with an RL agent and question-answer pairs generated by a teacher LLM (GPT-3.5). A distinct pretraining strategy is devised to align numeric vector modalities with static LLM representations using vector captioning language data. We also introduce an evaluation metric for Driving QA and demonstrate our LLM-driver's proficiency in interpreting driving scenarios, answering questions, and decision-making. Our findings highlight the potential of LLM-based driving action generation in comparison to traditional behavioral cloning. We make our benchmark, datasets, and model available for further exploration.

Related content

Vectorization

Attention · Reducible · state-of-the-art · Supervision · Reviewer

December 1, 2023

AME-CAM: Attentive Multiple-Exit CAM for Weakly Supervised Segmentation on MRI Brain Tumor

Yu-Jen Chen, Xinrong Hu, Yiyu Shi, Tsung-Yi Ho

from arxiv, arXiv admin note: text overlap with arXiv:2306.05476

Magnetic resonance imaging (MRI) is commonly used for brain tumor segmentation, which is critical for patient evaluation and treatment planning. To reduce the labor and expertise required for labeling, weakly-supervised semantic segmentation (WSSS) methods with class activation mapping (CAM) have been proposed. However, existing CAM methods suffer from low resolution due to strided convolution and pooling layers, resulting in inaccurate predictions. In this study, we propose a novel CAM method, Attentive Multiple-Exit CAM (AME-CAM), that extracts activation maps from multiple resolutions to hierarchically aggregate and improve prediction accuracy. We evaluate our method on the BraTS 2021 dataset and show that it outperforms state-of-the-art methods.

Conformer · Exchangeable · Inference · CASE · Performer

December 1, 2023

AdaptiveConformal: An R Package for Adaptive Conformal Inference

Herbert Susmann, Antoine Chambaz, Julie Josse

Conformal Inference (CI) is a popular approach for generating finite sample prediction intervals based on the output of any point prediction method when data are exchangeable. Adaptive Conformal Inference (ACI) algorithms extend CI to the case of sequentially observed data, such as time series, and exhibit strong theoretical guarantees without having to assume exchangeability of the observed data. The common thread that unites algorithms in the ACI family is that they adaptively adjust the width of the generated prediction intervals in response to the observed data. We provide a detailed description of five ACI algorithms and their theoretical guarantees, and test their performance in simulation studies. We then present a case study of producing prediction intervals for influenza incidence in the United States based on black-box point forecasts. Implementations of all the algorithms are released as an open-source R package, AdaptiveConformal, which also includes tools for visualizing and summarizing conformal prediction intervals.
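The adaptive width adjustment described in this abstract can be illustrated with a minimal sketch. It assumes the basic ACI update rule, alpha_{t+1} = alpha_t + gamma * (alpha - err_t), with absolute residuals as conformity scores. It is written in Python for illustration and is not the AdaptiveConformal R package's API; the function names and the default `gamma` are hypothetical.

```python
import math


def empirical_quantile(scores, q):
    """Empirical q-quantile of past scores, with ACI-style edge cases."""
    if not scores or q >= 1.0:
        return math.inf  # no history, or miscoverage level <= 0: infinite interval
    if q <= 0.0:
        return 0.0  # miscoverage level >= 1: degenerate (zero-width) interval
    ordered = sorted(scores)
    idx = max(0, math.ceil(q * len(ordered)) - 1)
    return ordered[idx]


def adaptive_conformal(predictions, observations, target_alpha=0.1, gamma=0.05):
    """Sketch of ACI: widen intervals after a miss, narrow them after a hit."""
    alpha_t = target_alpha
    scores, intervals, alphas = [], [], []
    for pred, obs in zip(predictions, observations):
        # Interval half-width is a quantile of past absolute residuals.
        half_width = empirical_quantile(scores, 1.0 - alpha_t)
        lower, upper = pred - half_width, pred + half_width
        intervals.append((lower, upper))
        # err_t = 1 if the interval missed the observation, else 0.
        err = 0.0 if lower <= obs <= upper else 1.0
        # A miss lowers alpha_t (wider future intervals); a hit raises it.
        alpha_t = alpha_t + gamma * (target_alpha - err)
        alphas.append(alpha_t)
        scores.append(abs(obs - pred))
    return intervals, alphas
```

Calling `adaptive_conformal(point_forecasts, observed_values)` returns one interval per time step, together with the running miscoverage level that drives the interval width.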

Language modeling · MoDELS · Dataset · Extensibility · Performer

November 30, 2023

Mol-Instructions: A Large-Scale Biomolecular Instruction Dataset for Large Language Models

Yin Fang, Xiaozhuan Liang, Ningyu Zhang, Kangwei Liu, Rui Huang, Zhuo Chen, Xiaohui Fan, Huajun Chen

from arxiv, Project homepage: //github.com/zjunlp/Mol-Instructions, add more experiments

Large Language Models (LLMs), with their remarkable task-handling capabilities and innovative outputs, have catalyzed significant advancements across a spectrum of fields. However, their proficiency within specialized domains such as biomolecular studies remains limited. To address this challenge, we introduce Mol-Instructions, a comprehensive instruction dataset designed for the biomolecular domain. Mol-Instructions encompasses three key components: molecule-oriented instructions, protein-oriented instructions, and biomolecular text instructions. Each component aims to improve the understanding and prediction capabilities of LLMs concerning biomolecular features and behaviors. Through extensive instruction tuning experiments on LLMs, we demonstrate the effectiveness of Mol-Instructions in enhancing large models' performance in the intricate realm of biomolecular studies, thus fostering progress in the biomolecular research community. Mol-Instructions is publicly available for ongoing research and will undergo regular updates to enhance its applicability.

Inference · Representation · Dataset · Deep feedforward network · state-of-the-art

November 30, 2023

ECNR: Efficient Compressive Neural Representation of Time-Varying Volumetric Datasets

Kaiyuan Tang, Chaoli Wang

Due to its conceptual simplicity and generality, compressive neural representation has emerged as a promising alternative to traditional compression methods for managing massive volumetric datasets. The current practice of neural compression utilizes a single large multilayer perceptron (MLP) to encode the global volume, incurring slow training and inference. This paper presents an efficient compressive neural representation (ECNR) solution for time-varying data compression, utilizing the Laplacian pyramid for adaptive signal fitting. Following a multiscale structure, we leverage multiple small MLPs at each scale for fitting local content or residual blocks. By assigning similar blocks to the same MLP via size uniformization, we enable balanced parallelization among MLPs to significantly speed up training and inference. Working in concert with the multiscale structure, we tailor a deep compression strategy to compact the resulting model. We show the effectiveness of ECNR with multiple datasets and compare it with state-of-the-art compression methods (mainly SZ3, TTHRESH, and neurcomp). The results position ECNR as a promising solution for volumetric data compression.
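The multiscale idea behind a Laplacian pyramid can be sketched on a 1D signal: each level stores the residual between the signal and a prediction upsampled from the next coarser level. This is only an illustration of the decomposition under simplifying assumptions (pairwise averaging instead of a Gaussian filter, a signal length divisible by 2 to the power of `levels`). It is not the ECNR method, which fits small MLPs to volumetric blocks; all function names are hypothetical.

```python
def downsample(signal):
    """Halve the resolution by averaging adjacent pairs (even length assumed)."""
    return [(signal[i] + signal[i + 1]) / 2.0 for i in range(0, len(signal), 2)]


def upsample(signal):
    """Double the resolution by repeating each sample twice."""
    return [value for value in signal for _ in range(2)]


def build_pyramid(signal, levels):
    """Return the coarsest signal and per-level residuals, finest first."""
    residuals = []
    current = list(signal)
    for _ in range(levels):
        coarse = downsample(current)
        predicted = upsample(coarse)
        # The residual is what the coarser level cannot explain at this scale.
        residuals.append([c - p for c, p in zip(current, predicted)])
        current = coarse
    return current, residuals


def reconstruct(coarsest, residuals):
    """Invert build_pyramid by adding residuals back from coarse to fine."""
    current = list(coarsest)
    for residual in reversed(residuals):
        current = [p + r for p, r in zip(upsample(current), residual)]
    return current
```

In a compression setting, each residual level would be fitted or encoded separately; here the residuals are stored exactly, so `reconstruct` recovers the input signal.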

Language modeling · Knowledge · Tokenizer · MoDELS · Decoding

November 29, 2023

OPERA: Alleviating Hallucination in Multi-Modal Large Language Models via Over-Trust Penalty and Retrospection-Allocation

Qidong Huang, Xiaoyi Dong, Pan Zhang, Bin Wang, Conghui He, Jiaqi Wang, Dahua Lin, Weiming Zhang, Nenghai Yu

from arxiv, technical report

Hallucination, a pervasive challenge for multi-modal large language models (MLLMs), has significantly impeded their real-world usage where precise judgment is required. Existing methods mitigate this issue either by training with specifically designed data or by performing inference with external knowledge from other sources, incurring inevitable additional costs. In this paper, we present OPERA, a novel MLLM decoding method grounded in an Over-trust Penalty and a Retrospection-Allocation strategy, serving as a nearly free lunch to alleviate the hallucination issue without additional data, knowledge, or training. Our approach begins with an interesting observation that most hallucinations are closely tied to the knowledge aggregation patterns manifested in the self-attention matrix, i.e., MLLMs tend to generate new tokens by focusing on a few summary tokens, but not all the previous tokens. Such a partial over-trust inclination results in the neglect of image tokens and in hallucinated descriptions of the image content. Statistically, we observe an 80% to 95% co-occurrence rate between hallucinated content and such knowledge aggregation patterns. Based on this observation, OPERA introduces a penalty term on the model logits during beam-search decoding to mitigate the over-trust issue, along with a rollback strategy that retrospects the presence of summary tokens in the previously generated tokens and re-allocates the token selection if necessary. With extensive experiments, OPERA shows significant hallucination-mitigating performance on different MLLMs and metrics, proving its effectiveness and generality. Our code is available at: //github.com/shikiw/OPERA.

Adaptive learning · Learning · Learning rate · INFORMS · MoDELS

November 29, 2023

FedAgg: Adaptive Federated Learning with Aggregated Gradients

Wenhao Yuan, Xuehe Wang

from arxiv, 16 pages, 9 figures

Federated Learning (FL) has become an emerging norm for distributed model training, which enables multiple devices to cooperatively train a shared model utilizing their own datasets, scheduled by a central server, while keeping private data localized. However, during the training process, the non-independent-and-identically-distributed (Non-IID) data generated on heterogeneous clients and frequent communication across participants may significantly influence the training performance, slow down the convergence rate, and increase communication consumption. In this paper, we improve the standard stochastic gradient descent approach by introducing the aggregated gradients at each local update epoch and propose an adaptive learning rate iterative algorithm that further takes the deviation between the local parameter and global parameter into account. The aforementioned adaptive learning rate design mechanism requires local information of all clients, which is challenging as there is no communication during the local update epochs. To obtain a decentralized adaptive learning rate for each client, we introduce the mean-field approach by utilizing two mean-field terms to estimate the average local parameters and gradients respectively without exchanging clients' local information with each other over time. Through theoretical analysis, we prove that our method can provide the convergence guarantee for model training and derive a convergent upper bound for the client drifting term. Extensive numerical results show that our proposed framework is superior to the state-of-the-art FL schemes in both model accuracy and convergence rate on real-world datasets with IID and Non-IID data distribution.

Estimation/Estimator · Transformation · Vision · SimPLe · Tokenizer

November 29, 2023

PViT-6D: Overclocking Vision Transformers for 6D Pose Estimation with Confidence-Level Prediction and Pose Tokens

Sebastian Stapf, Tobias Bauernfeind, Marco Riboldi

In the current state of 6D pose estimation, top-performing techniques depend on complex intermediate correspondences, specialized architectures, and non-end-to-end algorithms. In contrast, our research reframes the problem as a straightforward regression task by exploring the capabilities of Vision Transformers for direct 6D pose estimation through a tailored use of classification tokens. We also introduce a simple method for determining pose confidence, which can be readily integrated into most 6D pose estimation frameworks. This involves modifying the transformer architecture by decreasing the number of query elements based on the network's assessment of the scene complexity. Our method, which we call Pose Vision Transformer (PViT-6D), provides the benefits of simple implementation and being end-to-end learnable while outperforming current state-of-the-art methods by +0.3% ADD(-S) on Linemod-Occlusion and +2.7% ADD(-S) on the YCB-V dataset. Moreover, our method enhances both the model's interpretability and the reliability of its performance during inference.

Channel · Microsoft Surface · MoDELS · MIMO · Notability

November 29, 2023

Holographic MIMO Communications with Arbitrary Surface Placements: Near-Field LoS Channel Model and Capacity Limit

Tierui Gong, Li Wei, Chongwen Huang, Zhijia Yang, Jiguang He, Mérouane Debbah, Chau Yuen

from arxiv, double column, 17 pages, 13 figures, accepted by IEEE Journal on Selected Areas in Communications

Envisioned as one of the most promising technologies, holographic multiple-input multiple-output (H-MIMO) has recently attracted notable research interest for its great potential in expanding wireless possibilities and achieving fundamental wireless limits. Empowered by nearly continuous, large, and energy-efficient surfaces with powerful electromagnetic (EM) wave control capabilities, H-MIMO opens up the opportunity for signal processing in a more fundamental EM domain, paving the way for realizing holographic-imaging-level communications that support the extremely high spectral efficiency and energy efficiency of future networks. In this article, we attempt to develop generalized EM-domain near-field channel modeling and study the capacity limit of point-to-point H-MIMO systems equipped with arbitrarily placed surfaces in a line-of-sight (LoS) environment. Two effective and computationally efficient channel models are established from their integral counterpart: one has a more sophisticated formula but is more accurate, and the other is concise at a slight cost in precision. Furthermore, we unveil the capacity limit using our channel model and derive a tight upper bound based upon an elaborately built analytical framework. Our result reveals that the capacity limit grows logarithmically with the product of transmit element area, receive element area, and the combined effects of $1/d_{mn}^2$, $1/d_{mn}^4$, and $1/d_{mn}^6$ over all transmit and receive antenna elements, where $d_{mn}$ indicates the distance between each pair of transmit and receive elements. Numerical evaluations validate the effectiveness of our channel models and show only a slight disparity between the upper bound and the exact capacity, which is beneficial for predicting practical system performance.

Transformation · Vision · Identifiable · Taxonomy · Prompt

January 24, 2022

Transformers in Medical Imaging: A Survey

Fahad Shamshad, Salman Khan, Syed Waqas Zamir, Muhammad Haris Khan, Munawar Hayat, Fahad Shahbaz Khan, Huazhu Fu

from arxiv, 41 pages, //github.com/fahadshamshad/awesome-transformers-in-medical-imaging

Following unprecedented success on natural language tasks, Transformers have been successfully applied to several computer vision problems, achieving state-of-the-art results and prompting researchers to reconsider the supremacy of convolutional neural networks (CNNs) as de facto operators. Capitalizing on these advances in computer vision, the medical imaging field has also witnessed growing interest in Transformers, which can capture global context, in contrast to CNNs with local receptive fields. Inspired by this transition, in this survey, we attempt to provide a comprehensive review of the applications of Transformers in medical imaging covering various aspects, ranging from recently proposed architectural designs to unsolved issues. Specifically, we survey the use of Transformers in medical image segmentation, detection, classification, reconstruction, synthesis, registration, clinical report generation, and other tasks. In particular, for each of these applications, we develop a taxonomy, identify application-specific challenges, provide insights to solve them, and highlight recent trends. Further, we provide a critical discussion of the field's current state as a whole, including identifying key challenges and open problems and outlining promising future directions. We hope this survey will ignite further interest in the community and provide researchers with an up-to-date reference regarding applications of Transformer models in medical imaging. Finally, to cope with the rapid development in this field, we intend to regularly update the relevant latest papers and their open-source implementations at //github.com/fahadshamshad/awesome-transformers-in-medical-imaging.

Graph · Networking · Learned · Performer · Deep learning

October 9, 2020

Temporal Graph Networks for Deep Learning on Dynamic Graphs

Emanuele Rossi, Ben Chamberlain, Fabrizio Frasca, Davide Eynard, Federico Monti, Michael Bronstein

Graph Neural Networks (GNNs) have recently become increasingly popular due to their ability to learn complex systems of relations or interactions arising in a broad spectrum of problems ranging from biology and particle physics to social networks and recommendation systems. Despite the plethora of different models for deep learning on graphs, few approaches have been proposed thus far for dealing with graphs that present some sort of dynamic nature (e.g. evolving features or connectivity over time). In this paper, we present Temporal Graph Networks (TGNs), a generic, efficient framework for deep learning on dynamic graphs represented as sequences of timed events. Thanks to a novel combination of memory modules and graph-based operators, TGNs are able to significantly outperform previous approaches while also being more computationally efficient. We furthermore show that several previous models for learning on dynamic graphs can be cast as specific instances of our framework. We perform a detailed ablation study of different components of our framework and devise the best configuration that achieves state-of-the-art performance on several transductive and inductive prediction tasks for dynamic graphs.
