Despite recent community findings on the advancements and potential of Large Language Models (LLMs) in understanding Text-Attributed Graphs (TAGs), the deployment of LLMs in production is hindered by their high computational and storage requirements, as well as long inference latencies. At the same time, although traditional Graph Neural Networks (GNNs) are lightweight and adept at learning the structural features of graphs, their ability to grasp the complex semantics in TAGs is limited in real applications. To address these limitations, we concentrate on the downstream task of node classification in TAGs and propose a novel graph knowledge distillation framework, termed Linguistic Graph Knowledge Distillation (LinguGKD), which uses LLMs as teacher models and GNNs as student models. It involves TAG-oriented instruction tuning of the LLM on designed node classification prompts, followed by aligning the hierarchically learned node features of the teacher LLM and the student GNN in latent space with a layer-adaptive contrastive learning strategy. In extensive experiments on a variety of LLM and GNN models and multiple benchmark datasets, the proposed LinguGKD significantly boosts the student GNN's predictive accuracy and convergence rate, without the need for extra data or model parameters. Compared to the teacher LLM, the distilled GNN achieves superior inference speed with far lower computing and storage demands, while surpassing the teacher LLM's classification performance on some of the benchmark datasets.
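The latent-space alignment described in this abstract can be illustrated with a generic InfoNCE-style contrastive loss. This is a minimal sketch under stated assumptions, not the LinguGKD implementation: the names `cosine` and `info_nce`, the temperature value, and the toy feature vectors are all hypothetical, and the layer-adaptive weighting mentioned in the abstract is not modelled.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def info_nce(teacher, student, temperature=0.5):
    """Mean InfoNCE loss: student[i] should match teacher[i], not teacher[j]."""
    total = 0.0
    for i, s in enumerate(student):
        logits = [cosine(s, t) / temperature for t in teacher]
        # Numerically stable log-sum-exp over all teacher candidates.
        m = max(logits)
        log_denom = m + math.log(sum(math.exp(l - m) for l in logits))
        total += log_denom - logits[i]
    return total / len(student)

# Toy node features (assumed values, two nodes, two dimensions).
teacher = [[1.0, 0.0], [0.0, 1.0]]
aligned = [[0.9, 0.1], [0.1, 0.9]]   # student features close to the teacher's
swapped = [[0.1, 0.9], [0.9, 0.1]]   # student features matched to the wrong node

loss_aligned = info_nce(teacher, aligned)
loss_swapped = info_nce(teacher, swapped)
print(loss_aligned, loss_swapped)
```

The loss is lower when each student feature is closest to the teacher feature of the same node, which is the behaviour a distillation objective of this kind rewards.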

Related content

Large Language Models

A large language model is a deep learning model trained on massive amounts of text data. It can not only generate natural-language text but also understand the meaning of text in depth and handle a variety of natural-language tasks, such as text summarization, question answering, and translation. In 2023, large language models and their applications in artificial intelligence became a focus of global technology research. Their growth in scale is especially striking: parameter counts have jumped from roughly a billion at the outset to one trillion today. The increase in parameters lets models capture the subtleties of human language more finely and understand its complexity more deeply. Over the past year, large language models have improved markedly in absorbing new knowledge, decomposing complex tasks, and aligning images with text. As the technology matures, it will keep expanding its range of applications, provide more intelligent and personalized services, and further improve how people live and work.

binary · category · biased · dataset · regularization term

March 21, 2024

Biased Binary Attribute Classifiers Ignore the Majority Classes

Xinyi Zhang,Johanna Sophie Bieri,Manuel Günther

To visualize the regions of interest that classifiers base their decisions on, different Class Activation Mapping (CAM) methods have been developed. However, all of these techniques target categorical classifiers only, though most real-world tasks are binary classification. In this paper, we extend gradient-based CAM techniques to work with binary classifiers and visualize the active regions for binary facial attribute classifiers. When training an unbalanced binary classifier on an imbalanced dataset, it is well known that the majority class, i.e., the class with many training samples, is mostly predicted much better than the minority class with few training instances. In our experiments on the CelebA dataset, we verify these results when training an unbalanced classifier to extract 40 facial attributes simultaneously. One would expect that the biased classifier has learned to extract features mainly for the majority classes and that the proportional energy of the activations mainly resides in certain specific regions of the image where the attribute is located. However, we find very little regular activation for samples of majority classes, while the active regions for minority classes seem mostly reasonable and overlap with our expectations. These results suggest that biased classifiers mainly rely on bias activation for majority classes. When training a balanced classifier on the imbalanced data by employing attribute-specific class weights, majority and minority classes are classified similarly well and show expected activations for almost all attributes.
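The idea of attribute-specific class weights can be sketched as a class-weighted binary cross-entropy. The abstract does not state the weighting formula, so inverse-frequency weighting is an assumption here, and the names `class_weights` and `weighted_bce` as well as the toy labels are hypothetical.

```python
import math

def class_weights(labels):
    """Inverse-frequency weights for one binary attribute (labels are 0 or 1).

    Assumes both classes occur at least once in `labels`.
    """
    n = len(labels)
    n_pos = sum(labels)
    n_neg = n - n_pos
    # Each class contributes the same total weight (n / 2) to the loss.
    return {1: n / (2 * n_pos), 0: n / (2 * n_neg)}

def weighted_bce(probs, labels, weights, eps=1e-12):
    """Class-weighted binary cross-entropy, averaged over samples."""
    total = 0.0
    for p, y in zip(probs, labels):
        p = min(max(p, eps), 1 - eps)          # guard against log(0)
        loss = -math.log(p) if y == 1 else -math.log(1 - p)
        total += weights[y] * loss
    return total / len(labels)

labels = [1, 0, 0, 0, 0, 0, 0, 0, 0, 0]        # 1 positive, 9 negatives
weights = class_weights(labels)
majority_biased = [0.1] * len(labels)          # always leans toward the majority class

plain = weighted_bce(majority_biased, labels, {0: 1.0, 1: 1.0})
balanced = weighted_bce(majority_biased, labels, weights)
print(weights, plain, balanced)
```

With the weights applied, a classifier that always leans toward the majority class is penalized more heavily, because the single minority sample carries as much total weight as all majority samples combined.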

Performer · reducible · dimensionality reduction · unsupervised · PCA

March 20, 2024

Evaluating Unsupervised Dimensionality Reduction Methods for Pretrained Sentence Embeddings

Gaifan Zhang,Yi Zhou,Danushka Bollegala

Sentence embeddings produced by Pretrained Language Models (PLMs) have received wide attention from the NLP community due to their superior performance when representing texts in numerous downstream applications. However, the high dimensionality of the sentence embeddings produced by PLMs is problematic when representing large numbers of sentences in memory- or compute-constrained devices. As a solution, we evaluate unsupervised dimensionality reduction methods to reduce the dimensionality of sentence embeddings produced by PLMs. Our experimental results show that simple methods such as Principal Component Analysis (PCA) can reduce the dimensionality of sentence embeddings by almost $50\%$, without incurring a significant loss in performance in multiple downstream tasks. Surprisingly, reducing the dimensionality further improves performance over the original high-dimensional versions for the sentence embeddings produced by some PLMs in some tasks.
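As a rough illustration of the PCA-based reduction discussed in this abstract, the sketch below halves the dimensionality of a matrix of row vectors using NumPy's SVD. The function name `pca_reduce`, the random stand-in data, and the chosen sizes are assumptions; the paper's own pipeline and embedding models are not reproduced.

```python
import numpy as np

def pca_reduce(embeddings, k):
    """Project row vectors onto their top-k principal components."""
    mean = embeddings.mean(axis=0)
    centered = embeddings - mean
    # Rows of vt are principal directions, ordered by decreasing singular value.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    components = vt[:k]
    return centered @ components.T, components, mean

rng = np.random.default_rng(0)
embeddings = rng.normal(size=(100, 8))    # stand-in for PLM sentence embeddings
reduced, components, mean = pca_reduce(embeddings, k=4)   # 8 -> 4 dimensions
print(reduced.shape)
```

New sentences would be reduced with the same `mean` and `components`, i.e. `(x - mean) @ components.T`, so that all vectors live in the same reduced space.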

linear · end-to-end · MoDELS · Performer · dataset

March 20, 2024

Practical End-to-End Optical Music Recognition for Pianoform Music

Jiří Mayer,Milan Straka,Jan Hajič jr.,Pavel Pecina

from arxiv, 15+4 pages, 6 figures

The majority of recent progress in Optical Music Recognition (OMR) has been achieved with Deep Learning methods, especially models following the end-to-end paradigm, reading input images and producing a linear sequence of tokens. Unfortunately, many music scores, especially piano music, cannot be easily converted to a linear sequence. This has led OMR researchers to use custom linearized encodings instead of broadly accepted structured formats for music notation. The diversity of these encodings makes it difficult to compare the performance of OMR systems directly. To bring recent OMR model progress closer to useful results: (a) We define a sequential format called Linearized MusicXML, which allows an end-to-end model to be trained directly while maintaining close cohesion and compatibility with the industry-standard MusicXML format. (b) We create a dev and test set for benchmarking typeset OMR with MusicXML ground truth based on the OpenScore Lieder corpus. They contain 1,438 and 1,493 pianoform systems, each with an image from IMSLP. (c) We train and fine-tune an end-to-end model to serve as a baseline on the dataset and employ the TEDn metric to evaluate the model. We also test our model against the recently published synthetic pianoform dataset GrandStaff and surpass the state-of-the-art results.

estimation/estimator · point estimate · moment · sample · random field

March 19, 2024

Quadratic Point Estimate Method for Probabilistic Moments Computation

Minhyeok Ko,Konstantinos G. Papakonstantinou

This paper presents in detail the originally developed Quadratic Point Estimate Method (QPEM), aimed at efficiently and accurately computing the first four output moments of probabilistic distributions, using $2n^2+1$ sample (or sigma) points, where $n$ is the number of input random variables. The proposed QPEM offers an effective, superior, and practical alternative to existing sampling and quadrature methods for low- and moderately-high-dimensional problems. Detailed theoretical derivations are provided, proving that the proposed method can achieve fifth- or higher-order accuracy for symmetric input distributions. Various numerical examples, from simple polynomial functions to nonlinear finite element analyses with random field representations, support the theoretical findings and further showcase the validity, efficiency, and applicability of the QPEM, from low- to high-dimensional problems.

Agent · Performer · Processing (programming language) · large language model · MoDELS

March 19, 2024

HYDRA: A Hyper Agent for Dynamic Compositional Visual Reasoning

Fucai Ke,Zhixi Cai,Simindokht Jahangard,Weiqing Wang,Pari Delir Haghighi,Hamid Rezatofighi

Recent advances in visual reasoning (VR), particularly with the aid of Large Vision-Language Models (VLMs), show promise but require access to large-scale datasets and face challenges such as high computational costs and limited generalization capabilities. Compositional visual reasoning approaches have emerged as effective strategies; however, they heavily rely on the commonsense knowledge encoded in Large Language Models (LLMs) to perform planning, reasoning, or both, without considering the effect of their decisions on the visual reasoning process, which can lead to errors or failed procedures. To address these challenges, we introduce HYDRA, a multi-stage dynamic compositional visual reasoning framework designed for reliable and incrementally progressive general reasoning. HYDRA integrates three essential modules: a planner, a Reinforcement Learning (RL) agent serving as a cognitive controller, and a reasoner. The planner and reasoner modules utilize an LLM to generate instruction samples and executable code from the selected instruction, respectively, while the RL agent dynamically interacts with these modules, making high-level decisions on selection of the best instruction sample given information from the historical state stored through a feedback loop. This adaptable design enables HYDRA to adjust its actions based on previous feedback received during the reasoning process, leading to more reliable reasoning outputs and ultimately enhancing its overall effectiveness. Our framework demonstrates state-of-the-art performance in various VR tasks on four different widely-used datasets.

contrastive · Automator · Learning · contrastive learning · dataset

March 19, 2024

Automated Contrastive Learning Strategy Search for Time Series

Baoyu Jing,Yansen Wang,Guoxin Sui,Jing Hong,Jingrui He,Yuqing Yang,Dongsheng Li,Kan Ren

from arxiv, Preprint. Work in progress

In recent years, Contrastive Learning (CL) has become a predominant representation learning paradigm for time series. Most existing methods in the literature focus on manually building specific Contrastive Learning Strategies (CLS) by human heuristics for certain datasets and tasks. However, manually developing CLS usually requires excessive prior knowledge about the datasets and tasks, e.g., professional cognition of the medical time series in healthcare, as well as huge human labor and massive experiments to determine the detailed learning configurations. In this paper, we present an Automated Machine Learning (AutoML) practice at Microsoft, which automatically learns to contrastively learn representations for various time series datasets and tasks, namely Automated Contrastive Learning (AutoCL). We first construct a principled universal search space of size over $3\times10^{12}$, covering data augmentation, embedding transformation, contrastive pair construction, and contrastive losses. Further, we introduce an efficient reinforcement learning algorithm, which optimizes CLS from the performance on the validation tasks, to obtain more effective CLS within the space. Experimental results on various real-world tasks and datasets demonstrate that AutoCL can automatically find a suitable CLS for a given dataset and task. From the candidate CLS found by AutoCL on several public datasets/tasks, we compose a transferable Generally Good Strategy (GGS), which performs strongly on other datasets. We also provide empirical analysis as guidance for the future design of CLS.

estimation/estimator · divergence · MoDELS · tractable · Monte Carlo

March 18, 2024

Variational Approach for Efficient KL Divergence Estimation in Dirichlet Mixture Models

Samyajoy Pal,Christian Heumann

This study tackles the efficient estimation of Kullback-Leibler (KL) Divergence in Dirichlet Mixture Models (DMM), crucial for clustering compositional data. Despite the significance of DMMs, obtaining an analytically tractable solution for KL Divergence has proven elusive. Past approaches relied on computationally demanding Monte Carlo methods, motivating our introduction of a novel variational approach. Our method offers a closed-form solution, significantly enhancing computational efficiency for swift model comparisons and robust estimation evaluations. Validation using real and simulated data showcases its superior efficiency and accuracy over traditional Monte Carlo-based methods, opening new avenues for rapid exploration of diverse DMM models and advancing statistical analyses of compositional data.
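For context, the Monte Carlo baseline that this abstract contrasts against can be sketched as follows: draw samples from one Dirichlet mixture and average the log-density ratio. This is an illustrative sketch of the generic sampling estimator only, not the paper's variational method; the function names, mixture parameters, sample count, and seed are all assumptions.

```python
import math
import random

def dirichlet_logpdf(x, alpha):
    """Log density of Dirichlet(alpha) at a point x on the simplex."""
    log_norm = math.lgamma(sum(alpha)) - sum(math.lgamma(a) for a in alpha)
    return log_norm + sum((a - 1) * math.log(xi) for a, xi in zip(alpha, x))

def mixture_logpdf(x, weights, alphas):
    """Log density of a Dirichlet mixture, via log-sum-exp over components."""
    terms = [math.log(w) + dirichlet_logpdf(x, a) for w, a in zip(weights, alphas)]
    m = max(terms)
    return m + math.log(sum(math.exp(t - m) for t in terms))

def sample_mixture(rng, weights, alphas):
    """Draw one point: pick a component, then normalize independent Gamma draws."""
    alpha = rng.choices(alphas, weights=weights, k=1)[0]
    gammas = [rng.gammavariate(a, 1.0) for a in alpha]
    total = sum(gammas)
    return [g / total for g in gammas]

def mc_kl(p, q, n_samples=5000, seed=0):
    """Monte Carlo estimate of KL(p || q) using samples drawn from p."""
    rng = random.Random(seed)
    acc = 0.0
    for _ in range(n_samples):
        x = sample_mixture(rng, *p)
        acc += mixture_logpdf(x, *p) - mixture_logpdf(x, *q)
    return acc / n_samples

# Each mixture is (weights, list of concentration vectors); values are made up.
p = ([0.6, 0.4], [[2.0, 5.0, 3.0], [6.0, 2.0, 2.0]])
q = ([0.5, 0.5], [[3.0, 3.0, 3.0], [2.0, 2.0, 6.0]])

kl_pq = mc_kl(p, q)
kl_pp = mc_kl(p, p)
print(kl_pq, kl_pp)
```

The estimate needs thousands of density evaluations per comparison and varies with the sample size and seed, which is the cost that a closed-form approximation aims to avoid.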

Performer · Processing (programming language) · Intel · optimizer · operation

March 18, 2024

Benchmarking Analytical Query Processing in Intel SGXv2

Adrian Lutsch,Muhammad El-Hindi,Matthias Heinrich,Daniel Ritter,Zsolt István,Carsten Binnig

from arxiv, 14 pages, 17 figures, submitted for VLDB 2024 in the EA&B category, associated code is available under //github.com/DataManagementLab/sgxv2-analytical-query-processing-benchmarks

The recently introduced second generation of Intel SGX (SGXv2) lifts memory size limitations of the first generation. Theoretically, this promises to enable secure and highly efficient analytical DBMSs in the cloud. To validate this promise, in this paper, we conduct the first in-depth evaluation study of running analytical query processing algorithms inside SGXv2. Our study reveals that state-of-the-art query operators like radix joins and SIMD-based scans can indeed achieve high performance inside SGXv2 enclaves. These operations are orders of magnitude faster than joins optimized for the discontinued SGXv1 hardware. However, substantial performance overheads are still caused by subtle hardware and software differences influencing code execution inside an SGX enclave. We investigate these differences and propose new optimizations to bring the performance inside the enclave on par with native code execution outside an enclave.

MoDELS · transformation · optimizer · Taxonomy · HTTPS

November 21, 2023

Advancing Transformer Architecture in Long-Context Large Language Models: A Comprehensive Survey

Yunpeng Huang,Jingwei Xu,Zixu Jiang,Junyu Lai,Zenan Li,Yuan Yao,Taolue Chen,Lijuan Yang,Zhou Xin,Xiaoxing Ma

from arxiv, 35 pages, 3 figures, 4 tables

With the bomb ignited by ChatGPT, Transformer-based Large Language Models (LLMs) have paved a revolutionary path toward Artificial General Intelligence (AGI) and have been applied in diverse areas as knowledge bases, human interfaces, and dynamic agents. However, a prevailing limitation exists: many current LLMs, constrained by resources, are primarily pre-trained on shorter texts, rendering them less effective for the longer-context prompts commonly encountered in real-world settings. In this paper, we present a comprehensive survey focusing on the advancement of model architecture in Transformer-based LLMs to optimize long-context capabilities across all stages from pre-training to inference. We first delineate and analyze the problems of handling long-context input and output with current Transformer-based models. Then, we offer a holistic taxonomy to navigate the landscape of Transformer architecture upgrades that address these problems. Afterward, we investigate widely used evaluation necessities tailored for long-context LLMs, including datasets, metrics, and baseline models, as well as optimization toolkits such as libraries, systems, and compilers that augment LLMs' efficiency and efficacy across different stages. Finally, we discuss the predominant challenges and potential avenues for future research in this domain. Additionally, we have established a repository where we curate relevant literature with real-time updates at //github.com/Strivin0311/long-llms-learning.

Institute for AI Industry Research, Tsinghua University · range · TEAM · coefficient of determination · MoDELS

November 4, 2021

Engagement Decision Support for Beyond Visual Range Air Combat

Joao P. A. Dantas,Andre N. Costa,Diego Geraldo,Marcos R. O. A. Maximo,Takashi Yoneyama

This work aims to provide an engagement decision support tool for Beyond Visual Range (BVR) air combat in the context of Defensive Counter Air (DCA) missions. In BVR air combat, engagement decision refers to the choice of the moment the pilot engages a target by assuming an offensive stance and executing corresponding maneuvers. To model this decision, we use the Brazilian Air Force's Aerospace Simulation Environment (Ambiente de Simulação Aeroespacial, ASA, in Portuguese), which generated 3,729 constructive simulations lasting 12 minutes each and a total of 10,316 engagements. We analyzed all samples by an operational metric called the DCA index, which represents, based on the experience of subject matter experts, the degree of success in this type of mission. This metric considers the distances of the aircraft of the same team and the opposite team, the point of Combat Air Patrol, and the number of missiles used. By defining the engagement status right before it starts and the average of the DCA index throughout the engagement, we create a supervised learning model to determine the quality of a new engagement. An algorithm based on decision trees, working with the XGBoost library, provides a regression model to predict the DCA index with a coefficient of determination close to 0.8 and a Root Mean Square Error of 0.05 that can furnish parameters to the BVR pilot to decide whether or not to engage. Thus, using data obtained through simulations, this work contributes by building a decision support system based on machine learning for BVR air combat.
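The two regression metrics quoted in this abstract, the coefficient of determination and the Root Mean Square Error, can be computed as in the sketch below. The function names and the four index values are made up purely for illustration; they are not data or results from the paper, and no XGBoost model is trained here.

```python
import math

def rmse(y_true, y_pred):
    """Root Mean Square Error between targets and predictions."""
    squared = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    return math.sqrt(squared / len(y_true))

def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - residual sum of squares / total sum of squares."""
    mean = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot

# Hypothetical index values and predictions (illustrative only).
index_true = [0.2, 0.4, 0.6, 0.8]
index_pred = [0.25, 0.35, 0.65, 0.75]

error = rmse(index_true, index_pred)         # approximately 0.05
score = r_squared(index_true, index_pred)    # approximately 0.95
print(error, score)
```

A coefficient of determination near 1 means the predictions explain most of the variance in the target, while the RMSE is expressed in the same units as the target itself.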