一本色道综合久久欧美日韩精品,亚洲国产中文精品在线观看香蕉,日韩AⅤ精品国内在线

Decision Tree (DT) Learning is a fundamental problem in Interpretable Machine Learning, yet it poses a formidable optimisation challenge. Despite numerous efforts dating back to the early 1990's, practical algorithms have only recently emerged, primarily leveraging Dynamic Programming (DP) and Branch & Bound (B&B) techniques. These methods fall into two categories: algorithms like DL8.5, MurTree and STreeD utilise an efficient DP strategy but lack effective bounds for pruning the search space; while algorithms like OSDT and GOSDT employ more efficient pruning bounds but at the expense of a less refined DP strategy. We introduce Branches, a new algorithm that combines the strengths of both approaches. Using DP and B&B with a novel analytical bound for efficient pruning, Branches offers both speed and sparsity optimisation. Unlike other methods, it also handles non-binary features. Theoretical analysis shows its lower complexity compared to existing methods, and empirical results confirm that Branches outperforms the state-of-the-art in speed, iterations, and optimality.

相關內容

Branch

關注 0

平滑 · MoDELS · 語言模型化 · 大語言模型 · Pivotal（公司） ·

2024 年 11 月 12 日

ASER: Activation Smoothing and Error Reconstruction for Large Language Model Quantization

Weibo Zhao,Yubin Shi,Xinyu Lyu,Wanchen Sui,Shen Li,Yong Li

Quantization stands as a pivotal technique for large language model (LLM) serving, yet it poses significant challenges particularly in achieving effective low-bit quantization. The limited numerical mapping makes the quantized model produce a non-trivial error, bringing out intolerable performance degration. This paper is anchored in the basic idea of model compression objectives, and delves into the layer-wise error distribution of LLMs during post-training quantization. Subsequently, we introduce ASER, an algorithm consisting of (1) Error Reconstruction: low-rank compensation for quantization error with LoRA-style matrices constructed by whitening SVD; (2) Activation Smoothing: outlier extraction to gain smooth activation and better error compensation. ASER is capable of quantizing typical LLMs to low-bit ones, particularly preserving accuracy even in W4A8 per-channel setup. Experimental results show that ASER is competitive among the state-of-the-art quantization algorithms, showing potential to activation quantization, with minor overhead.

MoDELS · 翻轉 · 有偏 · NLP · 數據增強 ·

2024 年 11 月 12 日

LLMs for Generating and Evaluating Counterfactuals: A Comprehensive Study

Van Bach Nguyen,Paul Youssef,Christin Seifert,J?rg Schl?tterer

from arxiv, Accepted to EMNLP Findings 2024

As NLP models become more complex, understanding their decisions becomes more crucial. Counterfactuals (CFs), where minimal changes to inputs flip a model's prediction, offer a way to explain these models. While Large Language Models (LLMs) have shown remarkable performance in NLP tasks, their efficacy in generating high-quality CFs remains uncertain. This work fills this gap by investigating how well LLMs generate CFs for two NLU tasks. We conduct a comprehensive comparison of several common LLMs, and evaluate their CFs, assessing both intrinsic metrics, and the impact of these CFs on data augmentation. Moreover, we analyze differences between human and LLM-generated CFs, providing insights for future research directions. Our results show that LLMs generate fluent CFs, but struggle to keep the induced changes minimal. Generating CFs for Sentiment Analysis (SA) is less challenging than NLI where LLMs show weaknesses in generating CFs that flip the original label. This also reflects on the data augmentation performance, where we observe a large gap between augmenting with human and LLMs CFs. Furthermore, we evaluate LLMs' ability to assess CFs in a mislabelled data setting, and show that they have a strong bias towards agreeing with the provided labels. GPT4 is more robust against this bias and its scores correlate well with automatic metrics. Our findings reveal several limitations and point to potential future work directions.

INFORMS · 詞元分析器 · Processing（編程語言） · 表示 · MoDELS ·

2024 年 11 月 11 日

ChuLo: Chunk-Level Key Information Representation for Long Document Processing

Yan Li,Soyeon Caren Han,Yue Dai,Feiqi Cao

from arxiv, The paper has been submitted to a conference and is currently under review

Transformer-based models have achieved remarkable success in various Natural Language Processing (NLP) tasks, yet their ability to handle long documents is constrained by computational limitations. Traditional approaches, such as truncating inputs, sparse self-attention, and chunking, attempt to mitigate these issues, but they often lead to information loss and hinder the model's ability to capture long-range dependencies. In this paper, we introduce ChuLo, a novel chunk representation method for long document classification that addresses these limitations. Our ChuLo groups input tokens using unsupervised keyphrase extraction, emphasizing semantically important keyphrase based chunk to retain core document content while reducing input length. This approach minimizes information loss and improves the efficiency of Transformer-based models. Preserving all tokens in long document understanding, especially token classification tasks, is especially important to ensure that fine-grained annotations, which depend on the entire sequence context, are not lost. We evaluate our method on multiple long document classification tasks and long document token classification tasks, demonstrating its effectiveness through comprehensive qualitative and quantitative analyses.

代碼 · SQL · MoDELS · 大語言模型 · Performer ·

2024 年 11 月 11 日

PDC & DM-SFT: A Road for LLM SQL Bug-Fix Enhancing

Yiwen Duan,Yonghong Yu,Xiaoming Zhao,Yichang Wu,Wenbo Liu

from arxiv, COLING-Industry 2025 accepted

Code Large Language Models (Code LLMs), such as Code llama and DeepSeek-Coder, have demonstrated exceptional performance in the code generation tasks. However, most existing models focus on the abilities of generating correct code, but often struggle with bug repair. We introduce a suit of methods to enhance LLM's SQL bug-fixing abilities. The methods are mainly consisted of two parts: A Progressive Dataset Construction (PDC) from scratch and Dynamic Mask Supervised Fine-tuning (DM-SFT). PDC proposes two data expansion methods from the perspectives of breadth first and depth first respectively. DM-SFT introduces an efficient bug-fixing supervised learning approach, which effectively reduce the total training steps and mitigate the "disorientation" in SQL code bug-fixing training. In our evaluation, the code LLM models trained with two methods have exceeds all current best performing model which size is much larger.

Processing（編程語言） · MINE · 語言模型化 · MoDELS · Prompt ·

2024 年 11 月 11 日

Mining Causality: AI-Assisted Search for Instrumental Variables

Sukjin Han

The instrumental variables (IVs) method is a leading empirical strategy for causal inference. Finding IVs is a heuristic and creative process, and justifying its validity--especially exclusion restrictions--is largely rhetorical. We propose using large language models (LLMs) to search for new IVs through narratives and counterfactual reasoning, similar to how a human researcher would. The stark difference, however, is that LLMs can dramatically accelerate this process and explore an extremely large search space. We demonstrate how to construct prompts to search for potentially valid IVs. We contend that multi-step and role-playing prompting strategies are effective for simulating the endogenous decision-making processes of economic agents and for navigating language models through the realm of real-world scenarios. We apply our method to three well-known examples in economics: returns to schooling, supply and demand, and peer effects. We then extend our strategy to finding (i) control variables in regression and difference-in-differences and (ii) running variables in regression discontinuity designs.

MoDELS · Networking · 門控 · RNN · Neural Networks ·

2024 年 11 月 10 日

Delayed Memory Unit: Modelling Temporal Dependency Through Delay Gate

Pengfei Sun,Jibin Wu,Malu Zhang,Paul Devos,Dick Botteldooren

from arxiv, Accepted for publication in IEEE Transactions on Neural Networks and Learning Systems, 2024

Recurrent Neural Networks (RNNs) are widely recognized for their proficiency in modeling temporal dependencies, making them highly prevalent in sequential data processing applications. Nevertheless, vanilla RNNs are confronted with the well-known issue of gradient vanishing and exploding, posing a significant challenge for learning and establishing long-range dependencies. Additionally, gated RNNs tend to be over-parameterized, resulting in poor computational efficiency and network generalization. To address these challenges, this paper proposes a novel Delayed Memory Unit (DMU). The DMU incorporates a delay line structure along with delay gates into vanilla RNN, thereby enhancing temporal interaction and facilitating temporal credit assignment. Specifically, the DMU is designed to directly distribute the input information to the optimal time instant in the future, rather than aggregating and redistributing it over time through intricate network dynamics. Our proposed DMU demonstrates superior temporal modeling capabilities across a broad range of sequential modeling tasks, utilizing considerably fewer parameters than other state-of-the-art gated RNN models in applications such as speech recognition, radar gesture recognition, ECG waveform segmentation, and permuted sequential image classification.

MoDELS · 有偏 · 語言模型化 · 大語言模型 · AI ·

2024 年 11 月 10 日

PRISM: A Methodology for Auditing Biases in Large Language Models

Leif Azzopardi,Yashar Moshfeghi

Auditing Large Language Models (LLMs) to discover their biases and preferences is an emerging challenge in creating Responsible Artificial Intelligence (AI). While various methods have been proposed to elicit the preferences of such models, countermeasures have been taken by LLM trainers, such that LLMs hide, obfuscate or point blank refuse to disclosure their positions on certain subjects. This paper presents PRISM, a flexible, inquiry-based methodology for auditing LLMs - that seeks to illicit such positions indirectly through task-based inquiry prompting rather than direct inquiry of said preferences. To demonstrate the utility of the methodology, we applied PRISM on the Political Compass Test, where we assessed the political leanings of twenty-one LLMs from seven providers. We show LLMs, by default, espouse positions that are economically left and socially liberal (consistent with prior work). We also show the space of positions that these models are willing to espouse - where some models are more constrained and less compliant than others - while others are more neutral and objective. In sum, PRISM can more reliably probe and audit LLMs to understand their preferences, biases and constraints.

MoDELS · 語言模型化 · Learning · 數據集 · ML ·

2024 年 11 月 10 日

LML-DAP: Language Model Learning a Dataset for Data-Augmented Prediction

Praneeth Vadlapati

from arxiv, Made the abstract and the content clearer

Classification tasks are typically handled using Machine Learning (ML) models, which lack a balance between accuracy and interpretability. This paper introduces a new approach for classification tasks using Large Language Models (LLMs) in an explainable method. Unlike ML models, which rely heavily on data cleaning and feature engineering, this method streamlines the process using LLMs. This paper proposes a method called "Language Model Learning (LML)" powered by a new method called "Data-Augmented Prediction (DAP)." The classification is performed by LLMs using a method similar to that used by humans who manually explore and understand the data to decide classifications. In the process of LML, a dataset is summarized and evaluated to determine the features leading to each label the most. In the DAP process, the system uses the data summary and a row of the testing dataset to automatically generate a query to retrieve relevant rows from the dataset for context-aware classification. LML and DAP unlock new possibilities in areas that require explainable and context-aware decisions by ensuring satisfactory accuracy even with complex data. The system scored an accuracy above 90% in some test cases, confirming the effectiveness and potential of the system to outperform ML models in various scenarios. The source code is available at //github.com/Pro-GenAI/LML-DAP

端到端 · 語言模型化 · MoDELS · INTERACT · Performer ·

2024 年 11 月 9 日

SiriusBI: Building End-to-End Business Intelligence Enhanced by Large Language Models

Jie Jiang,Haining Xie,Yu Shen,Zihan Zhang,Meng Lei,Yifeng Zheng,Yide Fang,Chunyou Li,Danqing Huang,Wentao Zhang,Yang Li,Xiaofeng Yang,Bin Cui,Peng Chen

from arxiv, 14 pages, 5figures

The rapid advancement of AI technologies, particularly Large Language Models (LLMs), is establishing a new paradigm for Business Intelligence (BI). Despite the emergence of pioneering work in enhancing BI systems with LLMs, we have identified the following three issues when deployed in real industrial scenarios: interaction limitations, performance bottlenecks, and functionality deficiencies. In this paper, we present SiriusBI, an end-to-end business intelligence system that is designed to address the three issues simultaneously. First, we propose an intelligent and application-oriented module called multi-round dialogue with querying, which aims to overcome the prevalent interaction limitations in current BI solutions. Next, to mitigate the performance bottlenecks caused by scenario migration, we introduce two SQL generation methods that strike a balance between accuracy and deployment costs. Finally, to tackle the practical challenges posed by functionality deficiencies, we develop an end-to-end workflow that covers the entire BI process, ensuring that SiriusBI delivers a robust and complete set of functionalities. As an independent cloud service in Tencent's data platform, SiriusBI has been applied across Tencent's finance, advertising, and cloud sectors, providing services to dozens of enterprise clients. Experiments on real-world datasets and practical applications in industrial BI scenarios demonstrate the practicality and effectiveness of SiriusBI. Remarkably, SiriusBI achieves remarkable accuracy rates of 97% in SQL generation for Tencent Finance, 89% for Tencent Advertisement, and 91% for Tencent Cloud.

MoDELS · 模型評估 · 語言模型化 · Vision · LORA ·

2024 年 11 月 8 日

HPE-CogVLM: Advancing Vision Language Models with a Head Pose Grounding Task

Yu Tian,Tianqi Shao,Tsukasa Demizu,Xuyang Wu,Hsin-Tai Wu

from arxiv, This work has been submitted to the IEEE for possible publication

Head pose estimation (HPE) requires a sophisticated understanding of 3D spatial relationships to generate precise yaw, pitch, and roll angles. Previous HPE models, primarily CNN-based, rely on cropped close-up human head images as inputs and often lack robustness in real-world scenario. Vision Language Models (VLMs) can analyze entire images while focusing on specific objects through their attention mechanisms. In this paper, we propose a novel framework to improve the HPE accuracy by leveraging the object detection grounding capability of a VLM, referred to as CogVLM. We empirically find that directly LoRA fine-tuning of this VLM for the HPE task fails to achieve desirable HPE accuracy, while some model merging methods can improve accuracy but frequently produce blended invalid response formats, struggling to handle both object detection and HPE tasks simultaneously. To integrate HPE capability into CogVLM effectively, we develop a novel LoRA layer-based model merging method. This merging approach applies a high cosine similarity threshold and a winner-takes-all layer selection strategy, aligning attention to the HPE task while preserving original object detection knowledge. It successfully resolves issues with blended invalid response formats and improves accuracy. Results show that our HPE-CogVLM achieves a 31.5\% reduction in Mean Absolute Error over the current state-of-the-art CNN model, 6DRepNet, in cross-dataset evaluation. Furthermore, HPE-CogVLM outperforms both directly LoRA fine-tuned and task arithmetic-based merged VLMs across all HPE metrics.