亚洲黄色网站不卡免费,免费一区二区日韩精品视频,亚洲成人高清在线观看

Large Language Models (LLMs) have recently demonstrated exceptional performance in various Natural Language Processing (NLP) tasks. They have also shown the ability to perform chain-of-thought (CoT) reasoning to solve complex problems. Recent studies have explored CoT reasoning in complex multimodal scenarios, such as the science question answering task, by fine-tuning multimodal models with high-quality human-annotated CoT rationales. However, collecting high-quality COT rationales is usually time-consuming and costly. Besides, the annotated rationales are hardly accurate due to the external essential information missed. To address these issues, we propose a novel method termed \emph{T-SciQ} that aims at teaching science question answering with LLM signals. The T-SciQ approach generates high-quality CoT rationales as teaching signals and is advanced to train much smaller models to perform CoT reasoning in complex modalities. Additionally, we introduce a novel data mixing strategy to produce more effective teaching data samples by policy for simple and complex science question answer problems. Extensive experimental results show that our T-SciQ method achieves a new state-of-the-art performance on the ScienceQA benchmark, with an accuracy of 96.18\%. Moreover, our approach outperforms the most powerful fine-tuned baseline by 4.5\%.

相關內容

CoT

關注 2

Networking · Neural Networks · 多樣性 · Extensibility · state-of-the-art ·

2023 年 10 月 5 日

PINNacle: A Comprehensive Benchmark of Physics-Informed Neural Networks for Solving PDEs

Zhongkai Hao,Jiachen Yao,Chang Su,Hang Su,Ziao Wang,Fanzhi Lu,Zeyu Xia,Yichi Zhang,Songming Liu,Lu Lu,Jun Zhu

While significant progress has been made on Physics-Informed Neural Networks (PINNs), a comprehensive comparison of these methods across a wide range of Partial Differential Equations (PDEs) is still lacking. This study introduces PINNacle, a benchmarking tool designed to fill this gap. PINNacle provides a diverse dataset, comprising over 20 distinct PDEs from various domains, including heat conduction, fluid dynamics, biology, and electromagnetics. These PDEs encapsulate key challenges inherent to real-world problems, such as complex geometry, multi-scale phenomena, nonlinearity, and high dimensionality. PINNacle also offers a user-friendly toolbox, incorporating about 10 state-of-the-art PINN methods for systematic evaluation and comparison. We have conducted extensive experiments with these methods, offering insights into their strengths and weaknesses. In addition to providing a standardized means of assessing performance, PINNacle also offers an in-depth analysis to guide future research, particularly in areas such as domain decomposition methods and loss reweighting for handling multi-scale problems and complex geometry. To the best of our knowledge, it is the largest benchmark with a diverse and comprehensive evaluation that will undoubtedly foster further research in PINNs.

語言模型化 · MoDELS · 穩健性 · 置信度 · 推斷 ·

2023 年 10 月 5 日

A Formalism and Approach for Improving Robustness of Large Language Models Using Risk-Adjusted Confidence Scores

Ke Shen,Mayank Kejriwal

Large Language Models (LLMs), such as ChatGPT, have achieved impressive milestones in natural language processing (NLP). Despite their impressive performance, the models are known to pose important risks. As these models are deployed in real-world applications, a systematic understanding of different risks posed by these models on tasks such as natural language inference (NLI), is much needed. In this paper, we define and formalize two distinct types of risk: decision risk and composite risk. We also propose a risk-centric evaluation framework, and four novel metrics, for assessing LLMs on these risks in both in-domain and out-of-domain settings. Finally, we propose a risk-adjusted calibration method called DwD for helping LLMs minimize these risks in an overall NLI architecture. Detailed experiments, using four NLI benchmarks, three baselines and two LLMs, including ChatGPT, show both the practical utility of the evaluation framework, and the efficacy of DwD in reducing decision and composite risk. For instance, when using DwD, an underlying LLM is able to address an extra 20.1% of low-risk inference tasks (but which the LLM erroneously deems high-risk without risk adjustment) and skip a further 19.8% of high-risk tasks, which would have been answered incorrectly.

講稿 · DeepFakes · MoDELS · 遷移學習 · Learning ·

2023 年 10 月 3 日

PAD-Phys: Exploiting Physiology for Presentation Attack Detection in Face Biometrics

Luis F. Gomez,Julian Fierrez,Aythami Morales,Mahdi Ghafourian,Ruben Tolosana,Imanol Solano,Alejandro Garcia,Francisco Zamora-Martinez

from arxiv, Preprint of the paper presented to the Workshop on IEEE 47th Annual Computers, Software, and Applications Conference (COMPSAC, 2023)

Presentation Attack Detection (PAD) is a crucial stage in facial recognition systems to avoid leakage of personal information or spoofing of identity to entities. Recently, pulse detection based on remote photoplethysmography (rPPG) has been shown to be effective in face presentation attack detection. This work presents three different approaches to the presentation attack detection based on rPPG: (i) The physiological domain, a domain using rPPG-based models, (ii) the Deepfakes domain, a domain where models were retrained from the physiological domain to specific Deepfakes detection tasks; and (iii) a new Presentation Attack domain was trained by applying transfer learning from the two previous domains to improve the capability to differentiate between bona-fides and attacks. The results show the efficiency of the rPPG-based models for presentation attack detection, evidencing a 21.70% decrease in average classification error rate (ACER) (from 41.03% to 19.32%) when the presentation attack domain is compared to the physiological and Deepfakes domains. Our experiments highlight the efficiency of transfer learning in rPPG-based models and perform well in presentation attack detection in instruments that do not allow copying of this physiological feature.

向量化 · 自動問答 · 模態 · MoDELS · 語言模型化 ·

2023 年 10 月 3 日

Driving with LLMs: Fusing Object-Level Vector Modality for Explainable Autonomous Driving

Long Chen,Oleg Sinavski,Jan Hünermann,Alice Karnsund,Andrew James Willmott,Danny Birch,Daniel Maund,Jamie Shotton

Large Language Models (LLMs) have shown promise in the autonomous driving sector, particularly in generalization and interpretability. We introduce a unique object-level multimodal LLM architecture that merges vectorized numeric modalities with a pre-trained LLM to improve context understanding in driving situations. We also present a new dataset of 160k QA pairs derived from 10k driving scenarios, paired with high quality control commands collected with RL agent and question answer pairs generated by teacher LLM (GPT-3.5). A distinct pretraining strategy is devised to align numeric vector modalities with static LLM representations using vector captioning language data. We also introduce an evaluation metric for Driving QA and demonstrate our LLM-driver's proficiency in interpreting driving scenarios, answering questions, and decision-making. Our findings highlight the potential of LLM-based driving action generation in comparison to traditional behavioral cloning. We make our benchmark, datasets, and model available for further exploration.

Processing（編程語言） · NeRF · 縮放 · Subspace · 深度前饋網絡 ·

2023 年 10 月 3 日

Adaptive Multi-NeRF: Exploit Efficient Parallelism in Adaptive Multiple Scale Neural Radiance Field Rendering

Tong Wang,Shuichi Kurabayashi

Recent advances in Neural Radiance Fields (NeRF) have demonstrated significant potential for representing 3D scene appearances as implicit neural networks, enabling the synthesis of high-fidelity novel views. However, the lengthy training and rendering process hinders the widespread adoption of this promising technique for real-time rendering applications. To address this issue, we present an effective adaptive multi-NeRF method designed to accelerate the neural rendering process for large scenes with unbalanced workloads due to varying scene complexities. Our method adaptively subdivides scenes into axis-aligned bounding boxes using a tree hierarchy approach, assigning smaller NeRFs to different-sized subspaces based on the complexity of each scene portion. This ensures the underlying neural representation is specific to a particular part of the scene. We optimize scene subdivision by employing a guidance density grid, which balances representation capability for each Multilayer Perceptron (MLP). Consequently, samples generated by each ray can be sorted and collected for parallel inference, achieving a balanced workload suitable for small MLPs with consistent dimensions for regular and GPU-friendly computations. We aosl demonstrated an efficient NeRF sampling strategy that intrinsically adapts to increase parallelism, utilization, and reduce kernel calls, thereby achieving much higher GPU utilization and accelerating the rendering process.

多樣性 · 語言模型化 · MoDELS · TOOLS · 穩健性 ·

2023 年 10 月 3 日

LLMatic: Neural Architecture Search via Large Language Models and Quality Diversity Optimization

Muhammad U. Nasir,Sam Earle,Julian Togelius,Steven James,Christopher Cleghorn

Large Language Models (LLMs) have emerged as powerful tools capable of accomplishing a broad spectrum of tasks. Their abilities span numerous areas, and one area where they have made a significant impact is in the domain of code generation. In this context, we view LLMs as mutation and crossover tools. Meanwhile, Quality-Diversity (QD) algorithms are known to discover diverse and robust solutions. By merging the code-generating abilities of LLMs with the diversity and robustness of QD solutions, we introduce LLMatic, a Neural Architecture Search (NAS) algorithm. While LLMs struggle to conduct NAS directly through prompts, LLMatic uses a procedural approach, leveraging QD for prompts and network architecture to create diverse and highly performant networks. We test LLMatic on the CIFAR-10 image classification benchmark, demonstrating that it can produce competitive networks with just $2,000$ searches, even without prior knowledge of the benchmark domain or exposure to any previous top-performing models for the benchmark.

可理解性 · 語言模型化 · MoDELS · 知識 (knowledge) · Learning ·

2023 年 10 月 3 日

RecallM: An Adaptable Memory Mechanism with Temporal Understanding for Large Language Models

Brandon Kynoch,Hugo Latapie,Dwane van der Sluis

from arxiv, 8 pages, 7 figures, 1 table, Our code is publicly available online at: //github.com/cisco-open/DeepVision/tree/main/recallm

Large Language Models (LLMs) have made extraordinary progress in the field of Artificial Intelligence and have demonstrated remarkable capabilities across a large variety of tasks and domains. However, as we venture closer to creating Artificial General Intelligence (AGI) systems, we recognize the need to supplement LLMs with long-term memory to overcome the context window limitation and more importantly, to create a foundation for sustained reasoning, cumulative learning and long-term user interaction. In this paper we propose RecallM, a novel architecture for providing LLMs with an adaptable and updatable long-term memory mechanism. Unlike previous methods, the RecallM architecture is particularly effective at belief updating and maintaining a temporal understanding of the knowledge provided to it. We demonstrate through various experiments the effectiveness of this architecture. Furthermore, through our own temporal understanding and belief updating experiments, we show that RecallM is four times more effective than using a vector database for updating knowledge previously stored in long-term memory. We also demonstrate that RecallM shows competitive performance on general question-answering and in-context learning tasks.

查準率/準確率 · Learning · DNN · Extensibility · SOTA ·

2023 年 10 月 2 日

RF-ULM: Deep Learning for Radio-Frequency Ultrasound Localization Microscopy

Christopher Hahne,Georges Chabouh,Arthur Chavignon,Olivier Couture,Raphael Sznitman

In Ultrasound Localization Microscopy (ULM), achieving high-resolution images relies on the precise localization of contrast agent particles across consecutive beamformed frames. However, our study uncovers an enormous potential: The process of delay-and-sum beamforming leads to an irreversible reduction of Radio-Frequency (RF) data, while its implications for localization remain largely unexplored. The rich contextual information embedded within RF wavefronts, including their hyperbolic shape and phase, offers great promise for guiding Deep Neural Networks (DNNs) in challenging localization scenarios. To fully exploit this data, we propose to directly localize scatterers in RF signals. Our approach involves a custom super-resolution DNN using learned feature channel shuffling and a novel semi-global convolutional sampling block tailored for reliable and accurate localization in RF input data. Additionally, we introduce a geometric point transformation that facilitates seamless mapping between B-mode and RF spaces. To validate the effectiveness of our method and understand the impact of beamforming, we conduct an extensive comparison with State-Of-The-Art (SOTA) techniques in ULM. We present the inaugural in vivo results from an RF-trained DNN, highlighting its real-world practicality. Our findings show that RF-ULM bridges the domain gap between synthetic and real datasets, offering a considerable advantage in terms of precision and complexity. To enable the broader research community to benefit from our findings, our code and the associated SOTA methods are made available at //github.com/hahnec/rf-ulm.

剪枝 · Better · CAP · contrastive · MoDELS ·

2021 年 12 月 14 日

From Dense to Sparse: Contrastive Pruning for Better Pre-trained Language Model Compression

Runxin Xu,Fuli Luo,Chengyu Wang,Baobao Chang,Jun Huang,Songfang Huang,Fei Huang

from arxiv, Accepted to AAAI 2022

Pre-trained Language Models (PLMs) have achieved great success in various Natural Language Processing (NLP) tasks under the pre-training and fine-tuning paradigm. With large quantities of parameters, PLMs are computation-intensive and resource-hungry. Hence, model pruning has been introduced to compress large-scale PLMs. However, most prior approaches only consider task-specific knowledge towards downstream tasks, but ignore the essential task-agnostic knowledge during pruning, which may cause catastrophic forgetting problem and lead to poor generalization ability. To maintain both task-agnostic and task-specific knowledge in our pruned model, we propose ContrAstive Pruning (CAP) under the paradigm of pre-training and fine-tuning. It is designed as a general framework, compatible with both structured and unstructured pruning. Unified in contrastive learning, CAP enables the pruned model to learn from the pre-trained model for task-agnostic knowledge, and fine-tuned model for task-specific knowledge. Besides, to better retain the performance of the pruned model, the snapshots (i.e., the intermediate models at each pruning iteration) also serve as effective supervisions for pruning. Our extensive experiments show that adopting CAP consistently yields significant improvements, especially in extremely high sparsity scenarios. With only 3% model parameters reserved (i.e., 97% sparsity), CAP successfully achieves 99.2% and 96.3% of the original BERT performance in QQP and MNLI tasks. In addition, our probing experiments demonstrate that the model pruned by CAP tends to achieve better generalization ability.

相似度 · INFORMS · 估計/估計量 · Extensibility · 無監督 ·

2021 年 3 月 10 日

SDD-FIQA: Unsupervised Face Image Quality Assessment with Similarity Distribution Distance

Fu-Zhao Ou,Xingyu Chen,Ruixin Zhang,Yuge Huang,Shaoxin Li,Jilin Li,Yong Li,Liujuan Cao,Yuan-Gen Wang

In recent years, Face Image Quality Assessment (FIQA) has become an indispensable part of the face recognition system to guarantee the stability and reliability of recognition performance in an unconstrained scenario. For this purpose, the FIQA method should consider both the intrinsic property and the recognizability of the face image. Most previous works aim to estimate the sample-wise embedding uncertainty or pair-wise similarity as the quality score, which only considers the information from partial intra-class. However, these methods ignore the valuable information from the inter-class, which is for estimating to the recognizability of face image. In this work, we argue that a high-quality face image should be similar to its intra-class samples and dissimilar to its inter-class samples. Thus, we propose a novel unsupervised FIQA method that incorporates Similarity Distribution Distance for Face Image Quality Assessment (SDD-FIQA). Our method generates quality pseudo-labels by calculating the Wasserstein Distance (WD) between the intra-class similarity distributions and inter-class similarity distributions. With these quality pseudo-labels, we are capable of training a regression network for quality prediction. Extensive experiments on benchmark datasets demonstrate that the proposed SDD-FIQA surpasses the state-of-the-arts by an impressive margin. Meanwhile, our method shows good generalization across different recognition systems.