
The Versal Adaptive Compute Acceleration Platform (ACAP) is a new architecture that combines AI Engines (AIEs) with reconfigurable fabric. This architecture offers significant acceleration potential for uniform recurrences in various domains, such as deep learning, high-performance computation, and signal processing. However, efficiently mapping these computations onto the Versal ACAP architecture while achieving high utilization of the AIEs is challenging. To address this issue, we propose a mapping scheme called \fname, which accelerates uniform recurrences on the Versal ACAP architecture by leveraging the features of both the hardware and the computations. Considering the array architecture of the AIEs, our approach uses space-time transformations based on the polyhedral model to generate legally optimized systolic array mappings. In addition, we have developed a routing-aware PLIO assignment algorithm tailored for communication on the AIE array; the algorithm aims at successful compilation while maximizing array utilization. Furthermore, we introduce an automatic mapping framework that generates the corresponding executable code for uniform recurrences, comprising the AIE kernel program, the programmable logic bitstreams, and the host program. The experimental results validate the effectiveness of our mapping scheme. Specifically, when applying our scheme to matrix multiplication on the VCK5000 board, we achieve a throughput of 4.15 TOPS on the float data type, which is 1.11$\times$ higher than the state-of-the-art accelerator on the Versal ACAP architecture.
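As a rough illustration of what a space-time transformation does, the sketch below maps each iteration (i, j, k) of a matrix multiplication to a processing element (i, j) and a time step t = i + j + k. This is a textbook output-stationary systolic mapping simulated in plain Python, not the mapping scheme proposed in the abstract; the function name `systolic_matmul` is hypothetical.

```python
def systolic_matmul(A, B):
    """Simulate C = A @ B under an assumed space-time map.

    Iteration (i, j, k) runs on processing element (i, j) at time i + j + k.
    This is an illustrative textbook mapping, not the paper's scheme.
    """
    n, m, p = len(A), len(B), len(B[0])
    C = [[0.0] * p for _ in range(n)]

    # Group iterations by the time step assigned to them.
    schedule = {}
    for i in range(n):
        for j in range(p):
            for k in range(m):
                schedule.setdefault(i + j + k, []).append((i, j, k))

    # Execute time steps in order; within a step all PEs work in parallel.
    for t in sorted(schedule):
        busy = set()
        for i, j, k in schedule[t]:
            # Legality check: a PE performs at most one multiply-accumulate
            # per time step.
            assert (i, j) not in busy
            busy.add((i, j))
            C[i][j] += A[i][k] * B[k][j]
    return C
```

Because the accumulation over k on a given element advances by exactly one time step per iteration, the dependence order of the original loop nest is preserved, which is the property a legal mapping has to satisfy.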

Related content

Emotion recognition in conversation (ERC) is the task of predicting the emotion of an utterance in the context of a conversation. It depends closely on the dialogue context, speaker identity information, the multiparty dialogue scenario, and so on. However, the state-of-the-art method (instructERC) only identifies the speaker and ignores the commonsense knowledge behind speakers during a conversation (e.g., the reaction of the listeners and the intention of the speaker), which can be used to mine speaker information more deeply. To this end, we propose CKERC, a novel framework that joins large language models with commonsense knowledge for emotion recognition in conversation. We design prompts that use a large language model to generate interlocutors' commonsense from historical utterances, and we use an interlocutor commonsense identification task in LLM pre-training to fine-tune on implicit speaker clues. By addressing the above challenge, our method achieves state-of-the-art results. Extensive experiments on three widely used datasets, i.e., IEMOCAP, MELD, and EmoryNLP, demonstrate the superiority of our method. We also conduct an in-depth analysis that further demonstrates the effectiveness of commonsense knowledge for the ERC task with large language models.

Recent advances in instruction-tuned Large Vision-Language Models (LVLMs) have imbued the models with the ability to generate high-level, image-grounded explanations with ease. While such capability is largely attributed to the rich world knowledge contained within the Large Language Models (LLMs), our work reveals their shortcomings in fine-grained visual categorization (FGVC) across six different benchmark settings. Recent state-of-the-art LVLMs such as LLaVA-1.5, InstructBLIP, and GPT-4V not only deteriorate severely in classification performance, e.g., an average drop of 65.58 in EM on Stanford Dogs for LLaVA-1.5, but also struggle to generate an accurate explanation with detailed attributes for the concept that appears in an input image, despite their capability to generate holistic image-level descriptions. In-depth analyses show that instruction-tuned LVLMs exhibit a modality gap: they behave differently when given textual and visual inputs that correspond to the same concept, which prevents the image modality from leveraging the rich parametric knowledge within the LLMs. To further the community's endeavor in this direction, we propose Finer, a multiple-granularity, attribute-centric evaluation benchmark that aims to establish a ground for evaluating LVLMs' fine-grained visual comprehension ability and to provide significantly improved explainability.

Multiple Instance Learning (MIL) has emerged as a dominant paradigm to extract discriminative feature representations within Whole Slide Images (WSIs) in computational pathology. Despite driving notable progress, existing MIL approaches suffer from limitations in facilitating comprehensive and efficient interactions among instances, as well as challenges related to time-consuming computations and overfitting. In this paper, we incorporate the Selective Scan Space State Sequential Model (Mamba) in Multiple Instance Learning (MIL) for long sequence modeling with linear complexity, termed as MambaMIL. By inheriting the capability of vanilla Mamba, MambaMIL demonstrates the ability to comprehensively understand and perceive long sequences of instances. Furthermore, we propose the Sequence Reordering Mamba (SR-Mamba) aware of the order and distribution of instances, which exploits the inherent valuable information embedded within the long sequences. With the SR-Mamba as the core component, MambaMIL can effectively capture more discriminative features and mitigate the challenges associated with overfitting and high computational overhead. Extensive experiments on two public challenging tasks across nine diverse datasets demonstrate that our proposed framework performs favorably against state-of-the-art MIL methods. The code is released at //github.com/isyangshu/MambaMIL.

We present Polish Information Retrieval Benchmark (PIRB), a comprehensive evaluation framework encompassing 41 text information retrieval tasks for Polish. The benchmark incorporates existing datasets as well as 10 new, previously unpublished datasets covering diverse topics such as medicine, law, business, physics, and linguistics. We conduct an extensive evaluation of over 20 dense and sparse retrieval models, including the baseline models trained by us as well as other available Polish and multilingual methods. Finally, we introduce a three-step process for training highly effective language-specific retrievers, consisting of knowledge distillation, supervised fine-tuning, and building sparse-dense hybrid retrievers using a lightweight rescoring model. In order to validate our approach, we train new text encoders for Polish and compare their results with previously evaluated methods. Our dense models outperform the best solutions available to date, and the use of hybrid methods further improves their performance.
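To make the idea of a sparse-dense hybrid retriever more concrete, the sketch below fuses two candidate lists with a weighted sum of min-max-normalised scores. This simple linear fusion is a stand-in chosen for illustration; the abstract describes a lightweight rescoring model whose details are not given here. The names `min_max`, `hybrid_rank`, and the weight `alpha` are assumptions.

```python
def min_max(scores: dict[str, float]) -> dict[str, float]:
    """Rescale scores to [0, 1] so sparse and dense scores are comparable."""
    if not scores:
        return {}
    lo, hi = min(scores.values()), max(scores.values())
    if hi == lo:
        # All candidates tie; treat them as equally relevant.
        return {doc: 1.0 for doc in scores}
    return {doc: (s - lo) / (hi - lo) for doc, s in scores.items()}


def hybrid_rank(sparse: dict[str, float],
                dense: dict[str, float],
                alpha: float = 0.5) -> list[str]:
    """Rank the union of candidates by a weighted sum of normalised scores.

    alpha is the (assumed) weight of the dense score; a document missing
    from one retriever's list contributes 0 for that retriever.
    """
    s, d = min_max(sparse), min_max(dense)
    fused = {
        doc: alpha * d.get(doc, 0.0) + (1.0 - alpha) * s.get(doc, 0.0)
        for doc in set(s) | set(d)
    }
    # Sort by descending fused score, breaking ties by document id.
    return sorted(fused, key=lambda doc: (-fused[doc], doc))
```

For example, `hybrid_rank({"a": 10, "b": 5, "c": 0}, {"b": 0.9, "c": 0.8, "d": 0.1})` ranks `b` first, because it scores reasonably well under both retrievers.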

In this study, we delve into the Thresholding Linear Bandit (TLB) problem, a nuanced domain within stochastic Multi-Armed Bandit (MAB) problems, focusing on maximizing decision accuracy against a linearly defined threshold under resource constraints. We present LinearAPT, a novel algorithm designed for the fixed budget setting of TLB, providing an efficient solution to optimize sequential decision-making. This algorithm not only offers a theoretical upper bound for estimated loss but also showcases robust performance on both synthetic and real-world datasets. Our contributions highlight the adaptability, simplicity, and computational efficiency of LinearAPT, making it a valuable addition to the toolkit for addressing complex sequential decision-making challenges.

Distributed Deep Learning (DDL), as a paradigm, dictates the use of GPU-based clusters as the optimal infrastructure for training large-scale Deep Neural Networks (DNNs). However, the high cost of such resources makes them inaccessible to many users. Public cloud services, particularly Spot Virtual Machines (VMs), offer a cost-effective alternative, but their unpredictable availability poses a significant challenge to the crucial checkpointing process in DDL. To address this, we introduce DeepVM, a novel solution that recommends cost-effective cluster configurations by intelligently balancing the use of Spot and On-Demand VMs. DeepVM leverages a four-stage process that analyzes instance performance using the FLOPP (FLoating-point Operations Per Price) metric, performs architecture-level analysis with linear programming, and identifies the optimal configuration for the user-specific needs. Extensive simulations and real-world deployments in the AWS environment demonstrate that DeepVM consistently outperforms other policies, reducing training costs and overall makespan. By enabling cost-effective checkpointing with Spot VMs, DeepVM opens up DDL to a wider range of users and facilitates a more efficient training of complex DNNs.
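The FLOPP metric can be pictured with a small sketch. The abstract only expands the acronym (FLoating-point Operations Per Price), so the exact formula used here, peak throughput divided by hourly price, is an assumption, as are the `Instance` type, its fields, and the helper names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Instance:
    name: str              # hypothetical instance label
    tflops: float          # assumed peak throughput in TFLOP/s
    price_per_hour: float  # assumed hourly price in USD


def flopp(inst: Instance) -> float:
    """TFLOP obtained per USD spent, under the assumed definition."""
    if inst.price_per_hour <= 0:
        raise ValueError("price_per_hour must be positive")
    # (TFLOP/s) * (3600 s/h) / (USD/h) = TFLOP per USD
    return inst.tflops * 3600.0 / inst.price_per_hour


def rank_by_flopp(instances: list[Instance]) -> list[Instance]:
    """Order candidate instances from most to least cost-efficient."""
    return sorted(instances, key=flopp, reverse=True)
```

Under this reading, a cheaper Spot instance with the same hardware as an On-Demand one gets a proportionally higher FLOPP, which is the kind of trade-off a cost-aware configuration search would weigh against availability.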

Large language models (LLMs) are demonstrating remarkable capabilities across various tasks despite lacking a foundation in human cognition. This raises the question: can these models, beyond simply mimicking human language patterns, offer insights into the mechanisms underlying human cognition? This study explores the ability of ChatGPT to predict human performance in a language-based memory task. Building upon theories of text comprehension, we hypothesize that recognizing ambiguous sentences (e.g., "Because Bill drinks wine is never kept in the house") is facilitated by preceding them with contextually relevant information. Participants, both human and ChatGPT, were presented with pairs of sentences. The second sentence was always a garden-path sentence designed to be inherently ambiguous, while the first sentence provided either a fitting context (e.g., "Bill has chronic alcoholism") or an unfitting context (e.g., "Bill likes to play golf"). We measured both humans' and ChatGPT's ratings of sentence relatedness, ChatGPT's memorability ratings for the garden-path sentences, and humans' spontaneous memory for the garden-path sentences. The results revealed a striking alignment between ChatGPT's assessments and human performance. Sentences deemed more related and assessed as being more memorable by ChatGPT were indeed better remembered by humans, even though ChatGPT's internal mechanisms likely differ significantly from human cognition. This finding, which was confirmed with a robustness check employing synonyms, underscores the potential of generative AI models to predict human performance accurately. We discuss the broader implications of these findings for leveraging LLMs in the development of psychological theories and for gaining a deeper understanding of human cognition.

The moving discontinuous Galerkin method with interface condition enforcement (MDG-ICE) is a high-order, r-adaptive method that treats the grid as a variable and weakly enforces the conservation law, constitutive law, and corresponding interface conditions in order to implicitly fit high-gradient flow features. In this paper, we develop an optimization solver based on the Levenberg-Marquardt algorithm that features an anisotropic, locally adaptive penalty method to enhance robustness and prevent cell degeneration in the computation of hypersonic, viscous flows. Specifically, we incorporate an anisotropic grid regularization based on the mesh-implied metric that inhibits grid motion in directions with small element length scales, an element shape regularization that inhibits nonlinear deformations of the high-order elements, and a penalty regularization that penalizes degenerate elements. Additionally, we introduce a procedure for locally scaling the regularization operators in an adaptive, elementwise manner in order to maintain grid validity. We apply the proposed MDG-ICE formulation to two- and three-dimensional test cases involving viscous shocks and/or boundary layers, including Mach 17.6 hypersonic viscous flow over a circular cylinder and Mach 5 hypersonic viscous flow over a sphere, which are very challenging test cases for conventional numerical schemes on simplicial grids. Even without artificial dissipation, the computed solutions are free from spurious oscillations and yield highly symmetric surface heat-flux profiles.

In the Architecture, Engineering and Construction (AEC) sector, data extracted from building information modelling (BIM) can be used to create a digital twin (DT). The algorithms of a BIM-based DT can facilitate the retrieval of information, which can then be used to improve building operation and maintenance procedures. However, with the increased complexity and automation of the building, maintenance operations are likely to become more complex and may require expert intervention. Collaboration and interaction between the operator and the expert may be limited, as the latter may not be on site or within the company. Recently, extended reality (XR) technologies have proven to be effective in improving collaboration during maintenance operations, through data display and shared interactions. This paper presents a new collaborative solution that uses these technologies to enhance collaboration during remote maintenance operations. The proposed approach consists of a mixed reality (MR) set-up for the operator, a virtual reality (VR) set-up for the remote expert, and a shared Digital Model of a heat exchanger. The MR set-up is used for tracking and for displaying specific information provided by the VR module. A user study was carried out to compare the efficiency of our solution with a standard audio-video collaboration. Our approach demonstrated substantial enhancements in collaborative inspection, resulting in a significant reduction in both the overall completion time of the inspection and the frequency of errors committed by the operators.

Pretrained Foundation Models (PFMs) are regarded as the foundation for various downstream tasks with different data modalities. A pretrained foundation model, such as BERT, GPT-3, MAE, DALL-E, or ChatGPT, is trained on large-scale data, which provides a reasonable parameter initialization for a wide range of downstream applications. The idea of pretraining behind PFMs plays an important role in the application of large models. Unlike previous methods that apply convolution and recurrent modules for feature extraction, the generative pre-training (GPT) method applies the Transformer as the feature extractor and is trained on large datasets with an autoregressive paradigm. Similarly, BERT applies Transformers to train on large datasets as a contextual language model. Recently, ChatGPT has shown promising success as a large language model, applying an autoregressive language model with zero-shot or few-shot prompting. With the extraordinary success of PFMs, AI has made waves in a variety of fields over the past few years. Numerous methods, datasets, and evaluation metrics have been proposed in the literature, and the need for an updated survey is rising. This study provides a comprehensive review of recent research advancements, current and future challenges, and opportunities for PFMs in text, image, graph, and other data modalities. We first review the basic components and existing pretraining methods in natural language processing, computer vision, and graph learning. We then discuss advanced PFMs for other data modalities and unified PFMs, considering data quality and quantity. In addition, we discuss relevant research on the fundamentals of PFMs, including model efficiency and compression, security, and privacy. Finally, we lay out key implications, future research directions, challenges, and open problems.
