
Serialization · Large Language Models · Language Modeling · MoDELS · Better ·

December 21, 2023

Towards Better Serialization of Tabular Data for Few-shot Classification with Large Language Models

Sukriti Jaitly,Tanay Shah,Ashish Shugani,Razik Singh Grewal

from arxiv, 4 pages, 2 figures

We present a study on the integration of Large Language Models (LLMs) in tabular data classification, emphasizing an efficient framework. Building upon existing work done in TabLLM (arXiv:2210.10723), we introduce three novel serialization techniques, including the standout LaTeX serialization method. This method significantly boosts the performance of LLMs in processing domain-specific datasets. Our method stands out for its memory efficiency and ability to fully utilize complex data structures. Through extensive experimentation, including various serialization approaches like feature combination and importance, we demonstrate our work's superiority in accuracy and efficiency over traditional models.
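As a rough illustration of what serializing a table row as LaTeX for an LLM prompt could look like, here is a minimal Python sketch. The abstract does not give the exact format used in the paper, so the `tabular` layout, the function names (`latex_escape`, `serialize_row_as_latex`, `build_prompt`), and the example feature names are all assumptions made for illustration.

```python
# Hypothetical sketch: the paper's actual LaTeX template is not specified in the abstract.

def latex_escape(value):
    """Escape LaTeX special characters that commonly appear in table cells."""
    replacements = {
        "\\": r"\textbackslash{}",
        "&": r"\&",
        "%": r"\%",
        "_": r"\_",
        "#": r"\#",
        "$": r"\$",
    }
    # Replace character by character so that inserted backslashes are not escaped again.
    return "".join(replacements.get(ch, ch) for ch in str(value))


def serialize_row_as_latex(row):
    """Render one record (dict of feature name -> value) as a LaTeX tabular."""
    columns = list(row.keys())
    col_spec = "|" + "|".join("l" for _ in columns) + "|"
    header = " & ".join(latex_escape(c) for c in columns) + r" \\"
    values = " & ".join(latex_escape(row[c]) for c in columns) + r" \\"
    lines = [
        r"\begin{tabular}{" + col_spec + "}",
        r"\hline",
        header,
        r"\hline",
        values,
        r"\hline",
        r"\end{tabular}",
    ]
    return "\n".join(lines)


def build_prompt(row, question):
    """Combine the serialized row with a classification question (assumed prompt shape)."""
    return serialize_row_as_latex(row) + "\n\n" + question


if __name__ == "__main__":
    example = {"age": 42, "work_class": "Private"}
    print(build_prompt(example, "Does this person earn more than 50000 dollars per year? Yes or no?"))
```

A few-shot prompt would presumably concatenate several such serialized rows together with their labels before the query row; that assembly step is omitted here.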

Related content

Serialization

The process of converting an object's state information into a form that can be stored or transmitted.

Extensibility · on the fly · MoDELS · Performer · Agent ·

February 9, 2024

On the Fly Detection of Root Causes from Observed Data with Application to IT Systems

Lei Zan,Charles K. Assaad,Emilie Devijver,Eric Gaussier

This paper introduces a new structural causal model tailored for representing threshold-based IT systems and presents a new algorithm designed to rapidly detect root causes of anomalies in such systems. When root causes are not causally related, the method is proven to be correct, and an extension based on the intervention of an agent is proposed to relax this assumption. Our algorithm and its agent-based extension leverage causal discovery from offline data and engage in subgraph traversal when encountering new anomalies in online data. Our extensive experiments demonstrate the superior performance of our methods, even when applied to data generated from alternative structural causal models or real IT monitoring data.

Layer · Inference · MoDELS · Extensibility · Mutually independent ·

February 8, 2024

Answering Causal Queries at Layer 3 with DiscoSCMs-Embracing Heterogeneity

In the realm of causal inference, Potential Outcomes (PO) and Structural Causal Models (SCM) are recognized as the principal frameworks. However, when it comes to Layer 3 valuations -- counterfactual queries deeply entwined with individual-level semantics -- both frameworks encounter limitations due to the degenerative issues brought forth by the consistency rule. This paper advocates for the Distribution-consistency Structural Causal Models (DiscoSCM) framework as a pioneering approach to counterfactual inference, skillfully integrating the strengths of both PO and SCM. The DiscoSCM framework distinctively incorporates a unit selection variable $U$ and embraces the concept of uncontrollable exogenous noise realization. Through personalized incentive scenarios, we demonstrate the inadequacies of PO and SCM frameworks in representing the probability of a user being a complier (a Layer 3 event) without degeneration, an issue adeptly resolved by adopting the assumption of independent counterfactual noises within DiscoSCM. This innovative assumption broadens the foundational counterfactual theory, facilitating the extension of numerous theoretical results regarding the probability of causation to an individual granularity level and leading to a comprehensive set of theories on heterogeneous counterfactual bounds. Ultimately, our paper posits that if one acknowledges and wishes to leverage the ubiquitous heterogeneity, understanding causality as invariance across heterogeneous units, then DiscoSCM stands as a significant advancement in the methodology of counterfactual inference.

Linear · Functional · Approximation · Regularization term · Approximation error ·

February 8, 2024

Linear Convergence of Entropy-Regularized Natural Policy Gradient with Linear Function Approximation

Semih Cayci,Niao He,R. Srikant

Natural policy gradient (NPG) methods with entropy regularization achieve impressive empirical success in reinforcement learning problems with large state-action spaces. However, their convergence properties and the impact of entropy regularization remain elusive in the function approximation regime. In this paper, we establish finite-time convergence analyses of entropy-regularized NPG with linear function approximation under softmax parameterization. In particular, we prove that entropy-regularized NPG with averaging satisfies the \emph{persistence of excitation} condition, and achieves a fast convergence rate of $\tilde{O}(1/T)$ up to a function approximation error in regularized Markov decision processes. This convergence result does not require any a priori assumptions on the policies. Furthermore, under mild regularity conditions on the concentrability coefficient and basis vectors, we prove that entropy-regularized NPG exhibits \emph{linear convergence} up to a function approximation error.

Large Language Models · Language Modeling · MoDELS · Prompt · Performer ·

February 8, 2024

Guiding Large Language Models with Divide-and-Conquer Program for Discerning Problem Solving

Yizhou Zhang,Lun Du,Defu Cao,Qiang Fu,Yan Liu

from arxiv, Preprint

Foundation models, such as Large Language Models (LLMs), have attracted a significant amount of interest due to their large number of applications. Existing works show that appropriate prompt design, such as Chain-of-Thoughts, can unlock LLMs' powerful capacity in diverse areas. However, when handling tasks involving repetitive sub-tasks and/or deceptive contents, such as arithmetic calculation and article-level fake news detection, existing prompting strategies suffer from either insufficient expressive power or intermediate errors triggered by hallucination. To make LLMs more discerning of such intermediate errors, we propose to guide LLMs with a Divide-and-Conquer program that simultaneously ensures superior expressive power and disentangles the task decomposition, sub-task resolution, and resolution assembly processes. Theoretical analysis reveals that our strategy can guide LLMs to extend the expressive power of a fixed-depth Transformer. Experiments indicate that our proposed method can achieve better performance than typical prompting strategies in tasks affected by intermediate errors and deceptive contents, such as large integer multiplication, hallucination detection and misinformation detection.

Mask · Prompt · Sparse · Unlabeled · MoDELS ·

February 7, 2024

Sparse Anatomical Prompt Semi-Supervised Learning with Masked Image Modeling for CBCT Tooth Segmentation

Pengyu Dai,Yafei Ou,Yang Liu,Yue Zhao

Accurate tooth identification and segmentation in Cone Beam Computed Tomography (CBCT) dental images can significantly enhance the efficiency and precision of manual diagnoses performed by dentists. However, existing segmentation methods mainly rely on training with large volumes of data, whose annotation is extremely time-consuming. Meanwhile, the close positioning of the teeth of each class in CBCT dental images, coupled with subtle inter-class differences, gives rise to the challenge of indistinct boundaries when training a model with limited data. To address these challenges, this study proposes a task-oriented Masked Auto-Encoder paradigm to effectively utilize large amounts of unlabeled data to achieve accurate tooth segmentation with limited labeled data. Specifically, we first construct a self-supervised pre-training framework based on a masked auto-encoder to efficiently utilize unlabeled data to enhance the network performance. Subsequently, we introduce a sparse masked prompt mechanism based on graph attention to incorporate boundary information of the teeth, aiding the network in learning the anatomical structural features of teeth. To the best of our knowledge, we are pioneering the integration of the mask pre-training paradigm into the CBCT tooth segmentation task. Extensive experiments demonstrate both the feasibility of our proposed method and the potential of the boundary prompt mechanism.

February 6, 2024

An Implementation of the Extended Tower Number Field Sieve using 4d Sieving in a Box and a Record Computation in Fp4

We report on an implementation of the Extended Tower Number Field Sieve (ExTNFS) and record computation in a medium characteristic finite field $\mathbb{F}_{p^4}$ of 512 bits size. Empirically, we show that sieving in a 4-dimensional box (orthotope) for collecting relations for ExTNFS in $\mathbb{F}_{p^4}$ is faster than sieving in a 4-dimensional hypersphere. We also give a new intermediate descent method, `descent using random vectors', without which the descent stage in our ExTNFS computation would have been difficult/impossible, and analyze its complexity.

Sensor · INFORMS · Performance · MoDELS · Machine Learning ·

February 3, 2024

A Plug-in Tiny AI Module for Intelligent and Selective Sensor Data Transmission

Wenjun Huang,Arghavan Rezvani,Hanning Chen,Yang Ni,Sanggeon Yun,Sungheon Jeong,Mohsen Imani

from arxiv, 14 pages, 6 figures

Applications in the Internet of Things (IoT) utilize machine learning to analyze sensor-generated data. However, a major challenge lies in the lack of targeted intelligence in current sensing systems, leading to vast data generation and increased computational and communication costs. To address this challenge, we propose a novel sensing module to equip sensing frameworks with intelligent data transmission capabilities by integrating a highly efficient machine learning model placed near the sensor. This model provides prompt feedback for the sensing system to transmit only valuable data while discarding irrelevant information by regulating the frequency of data transmission. The near-sensor model is quantized and optimized for real-time sensor control. To enhance the framework's performance, the training process is customized and a "lazy" sensor deactivation strategy utilizing temporal information is introduced. The suggested method is orthogonal to other IoT frameworks and can be considered as a plugin for selective data transmission. The framework is implemented, encompassing both software and hardware components. The experiments demonstrate that the framework utilizing the suggested module achieves over 85% system efficiency in terms of energy consumption and storage, with negligible impact on performance. This methodology has the potential to significantly reduce data output from sensors, benefiting a wide range of IoT applications.

Optimizer · Graph · GPU · Neural Networks · Kernelization ·

January 28, 2021

Interpreting and Unifying Graph Neural Networks with An Optimization Framework

Meiqi Zhu,Xiao Wang,Chuan Shi,Houye Ji,Peng Cui

from arxiv, WWW2021, 12 pages

Graph Neural Networks (GNNs) have received considerable attention on graph-structured data learning for a wide variety of tasks. The well-designed propagation mechanism, which has been demonstrated to be effective, is the most fundamental part of GNNs. Although most GNNs basically follow a message passing manner, little effort has been made to discover and analyze their essential relations. In this paper, we establish a surprising connection between different propagation mechanisms with a unified optimization problem, showing that despite the proliferation of various GNNs, in fact, their proposed propagation mechanisms are the optimal solution optimizing a feature fitting function over a wide class of graph kernels with a graph regularization term. Our proposed unified optimization framework, summarizing the commonalities between several of the most representative GNNs, not only provides a macroscopic view on surveying the relations between different GNNs, but also further opens up new opportunities for flexibly designing new GNNs. With the proposed framework, we discover that existing works usually utilize naive graph convolutional kernels for the feature fitting function, and we further develop two novel objective functions considering adjustable graph kernels showing low-pass or high-pass filtering capabilities respectively. Moreover, we provide the convergence proofs and expressive power comparisons for the proposed models. Extensive experiments on benchmark datasets clearly show that the proposed GNNs not only outperform the state-of-the-art methods but also have good ability to alleviate over-smoothing, and further verify the feasibility for designing GNNs with our unified optimization framework.

Language Modeling · MoDELS · Vocabulary · Optimizer · state-of-the-art ·

September 25, 2019

Extreme Language Model Compression with Optimal Subwords and Shared Projections

Sanqiang Zhao,Raghav Gupta,Yang Song,Denny Zhou

Pre-trained deep neural network language models such as ELMo, GPT, BERT and XLNet have recently achieved state-of-the-art performance on a variety of language understanding tasks. However, their size makes them impractical for a number of scenarios, especially on mobile and edge devices. In particular, the input word embedding matrix accounts for a significant proportion of the model's memory footprint, due to the large input vocabulary and embedding dimensions. Knowledge distillation techniques have had success at compressing large neural network models, but they are ineffective at yielding student models with vocabularies different from the original teacher models. We introduce a novel knowledge distillation technique for training a student model with a significantly smaller vocabulary as well as lower embedding and hidden state dimensions. Specifically, we employ a dual-training mechanism that trains the teacher and student models simultaneously to obtain optimal word embeddings for the student vocabulary. We combine this approach with learning shared projection matrices that transfer layer-wise knowledge from the teacher model to the student model. Our method is able to compress the BERT_BASE model by more than 60x, with only a minor drop in downstream task metrics, resulting in a language model with a footprint of under 7MB. Experimental results also demonstrate higher compression efficiency and accuracy when compared with other state-of-the-art compression techniques.

Named Entity Recognition · entity · Learned · Deep Learning · Recognizable ·

December 22, 2018

A Survey on Deep Learning for Named Entity Recognition

Jing Li,Aixin Sun,Jianglei Han,Chenliang Li

from arxiv, 20 pages, 15 figures

Named entity recognition (NER) is the task of identifying text spans that mention named entities and classifying them into predefined categories such as person, location, organization, etc. NER serves as the basis for a variety of natural language applications such as question answering, text summarization, and machine translation. Although early NER systems are successful in producing decent recognition accuracy, they often require much human effort in carefully designing rules or features. In recent years, deep learning, empowered by continuous real-valued vector representations and semantic composition through nonlinear processing, has been employed in NER systems, yielding state-of-the-art performance. In this paper, we provide a comprehensive review on existing deep learning techniques for NER. We first introduce NER resources, including tagged NER corpora and off-the-shelf NER tools. Then, we systematically categorize existing works based on a taxonomy along three axes: distributed representations for input, context encoder, and tag decoder. Next, we survey the most representative methods for recent applied techniques of deep learning in new NER problem settings and applications. Finally, we present readers with the challenges faced by NER systems and outline future directions in this area.
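To make the notion of "text spans classified into predefined categories" concrete, the sketch below turns per-token tags into labeled entity spans. It assumes the widely used BIO tagging convention (`B-` begins an entity, `I-` continues it, `O` is outside), which the abstract itself does not specify; the function name `decode_bio` and the example sentence are hypothetical.

```python
# Hypothetical sketch assuming BIO-style tags; the survey covers many tag decoders.

def decode_bio(tokens, tags):
    """Group tokens into (entity text, entity type) spans from BIO tags."""
    spans = []
    current_tokens, current_type = [], None

    def flush():
        # Close the entity being built, if any.
        if current_tokens:
            spans.append((" ".join(current_tokens), current_type))

    for token, tag in zip(tokens, tags):
        if tag.startswith("B-"):
            flush()
            current_tokens, current_type = [token], tag[2:]
        elif tag.startswith("I-") and current_type == tag[2:]:
            current_tokens.append(token)
        else:
            # "O" tags, and "I-" tags that do not continue the open entity,
            # are treated as outside any entity in this simplified decoder.
            flush()
            current_tokens, current_type = [], None
    flush()
    return spans


if __name__ == "__main__":
    tokens = ["Barack", "Obama", "visited", "Paris", "."]
    tags = ["B-PER", "I-PER", "O", "B-LOC", "O"]
    print(decode_bio(tokens, tags))
```

In a neural NER system, the tags would come from the tag decoder stage of the taxonomy described above; here they are supplied by hand.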

Related topics

Large Language Models

Language Modeling
