The paper presents the main characteristics and a preliminary implementation of a novel computational framework named CompLog. Inspired by probabilistic programming systems like ProbLog, CompLog builds upon the inferential mechanisms proposed by Simplicity Theory, relying on the computation of two Kolmogorov complexities (here implemented as min-path searches via ASP programs) rather than probabilistic inference. The proposed system enables users to compute ex-post and ex-ante measures of unexpectedness of a certain situation, mapping respectively to posterior and prior subjective probabilities. The computation is based on the specification of world and mental models by means of causal and descriptive relations between predicates weighted by complexity. The paper illustrates a few examples of application: generating relevant descriptions, and providing alternative approaches to disjunction and to negation.
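
As a rough illustration of the min-path idea, the sketch below computes unexpectedness as the gap between two shortest-path costs, following Simplicity Theory's definition U = C_W - C_D (generation complexity minus description complexity) and its mapping to a subjective probability p = 2^(-U). It uses Dijkstra's algorithm in Python instead of ASP, and the two graphs, the node names, and the bit weights are all invented for illustration; none of this is CompLog's actual code.

```python
import heapq

def min_path_cost(graph, source, target):
    """Dijkstra search: cheapest total weight from source to target (inf if unreachable)."""
    best = {source: 0.0}
    queue = [(0.0, source)]
    while queue:
        cost, node = heapq.heappop(queue)
        if node == target:
            return cost
        if cost > best.get(node, float("inf")):
            continue  # stale queue entry, a cheaper path was already found
        for nxt, weight in graph.get(node, {}).items():
            new_cost = cost + weight
            if new_cost < best.get(nxt, float("inf")):
                best[nxt] = new_cost
                heapq.heappush(queue, (new_cost, nxt))
    return float("inf")

# Hypothetical world model: causal relations weighted by complexity (in bits).
world_model = {
    "start": {"storm": 6.0, "fire": 9.0},
    "storm": {"power_cut": 2.0},
    "fire": {"power_cut": 1.0},
}

# Hypothetical mental model: descriptive relations weighted by complexity (in bits).
mental_model = {
    "start": {"my_street": 2.0, "some_city": 5.0},
    "my_street": {"power_cut": 1.0},
    "some_city": {"power_cut": 1.0},
}

def unexpectedness(world, mental, situation, origin="start"):
    """U = C_W - C_D, each complexity approximated by a min-path cost."""
    c_world = min_path_cost(world, origin, situation)
    c_desc = min_path_cost(mental, origin, situation)
    return c_world - c_desc

u = unexpectedness(world_model, mental_model, "power_cut")
p = 2.0 ** (-u)  # subjective probability under Simplicity Theory's mapping
print(f"U = {u} bits, p = {p}")
```

With these made-up weights, the cheapest causal path costs 8 bits and the cheapest description costs 3 bits, so the situation is 5 bits unexpected.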

Related content

ASP

ASP is short for Active Server Pages, meaning "dynamic server pages". It is an application developed by Microsoft as a replacement for CGI scripts. It can interact with databases and other programs, and is a simple, convenient programming tool.

MoDELS · Performer · SimPLe · Learning · Identifiable

25 October 2023

Revisiting Deep Learning Models for Tabular Data

Yury Gorishniy,Ivan Rubachev,Valentin Khrulkov,Artem Babenko

from arxiv, NeurIPS 2021 camera-ready. Code: https://github.com/yandex-research/tabular-dl-revisiting-models (v4: minor update)

The existing literature on deep learning for tabular data proposes a wide range of novel architectures and reports competitive results on various datasets. However, the proposed models are usually not properly compared to each other and existing works often use different benchmarks and experiment protocols. As a result, it is unclear for both researchers and practitioners what models perform best. Additionally, the field still lacks effective baselines, that is, the easy-to-use models that provide competitive performance across different problems. In this work, we perform an overview of the main families of DL architectures for tabular data and raise the bar of baselines in tabular DL by identifying two simple and powerful deep architectures. The first one is a ResNet-like architecture which turns out to be a strong baseline that is often missing in prior works. The second model is our simple adaptation of the Transformer architecture for tabular data, which outperforms other solutions on most tasks. Both models are compared to many existing architectures on a diverse set of tasks under the same training and tuning protocols. We also compare the best DL models with Gradient Boosted Decision Trees and conclude that there is still no universally superior solution.

Performer · MoDELS · Independent · Learning · Representation

25 October 2023

Learning Independent Program and Architecture Representations for Generalizable Performance Modeling

Lingda Li,Thomas Flynn,Adolfy Hoisie

This paper proposes PerfVec, a novel deep learning-based performance modeling framework that learns high-dimensional, independent/orthogonal program and microarchitecture representations. Once learned, a program representation can be used to predict its performance on any microarchitecture, and likewise, a microarchitecture representation can be applied in the performance prediction of any program. Additionally, PerfVec yields a foundation model that captures the performance essence of instructions, which can be directly used by developers in numerous performance modeling related tasks without incurring its training cost. The evaluation demonstrates that PerfVec is more general, efficient, and accurate than previous approaches.

Analysis · Operation · Substitution · Round · Directed

24 October 2023

A Pure Demand Operational Semantics With Applications to Program Analysis

Scott Smith,Robert Zhang

from arxiv, 32 pages, 21 figures

This paper develops a novel minimal-state operational semantics for higher-order functional languages which uses only the call stack and two source program points as the complete state information: there is no environment, no substitution, no continuation, etc. We prove this form of operational semantics is equivalent to standard presentations. We then show how this approach can open the door to potential new applications: we define a program analysis as a direct finitization of this operational semantics. The program analysis that naturally emerges has a number of novel and interesting properties compared to standard program analyses for higher-order programs: for example, it can infer recurrences, and does not need value widening. We both give a formal definition of the analysis and describe our current implementation.

Understandability · MoDELS · Similarity · Identifiable · binary

23 October 2023

Paraphrase Types for Generation and Detection

Jan Philip Wahle,Bela Gipp,Terry Ruas

from arxiv, Published at EMNLP 2023

Current approaches in paraphrase generation and detection heavily rely on a single general similarity score, ignoring the intricate linguistic properties of language. This paper introduces two new tasks to address this shortcoming by considering paraphrase types - specific linguistic perturbations at particular text positions. We name these tasks Paraphrase Type Generation and Paraphrase Type Detection. Our results suggest that while current techniques perform well in a binary classification scenario, i.e., paraphrased or not, the inclusion of fine-grained paraphrase types poses a significant challenge. While most approaches are good at generating and detecting general semantic similar content, they fail to understand the intrinsic linguistic variables they manipulate. Models trained in generating and identifying paraphrase types also show improvements in tasks without them. In addition, scaling these models further improves their ability to understand paraphrase types. We believe paraphrase types can unlock a new paradigm for developing paraphrase models and solving tasks in the future.

MoDELS · Representation · Guangdong-Hong Kong-Macao Greater Bay Area Digital Economy Research Institute · Subspace · Identifiable

23 October 2023

Concept Algebra for Score-Based Conditional Models

Zihao Wang,Lin Gui,Jeffrey Negrea,Victor Veitch

This paper concerns the structure of learned representations in text-guided generative models, focusing on score-based models. A key property of such models is that they can compose disparate concepts in a 'disentangled' manner. This suggests these models have internal representations that encode concepts in a 'disentangled' manner. Here, we focus on the idea that concepts are encoded as subspaces of some representation space. We formalize what this means, show there's a natural choice for the representation, and develop a simple method for identifying the part of the representation corresponding to a given concept. In particular, this allows us to manipulate the concepts expressed by the model through algebraic manipulation of the representation. We demonstrate the idea with examples using Stable Diffusion.

Autoencoder · MoDELS · Optimizer · Latent · Reducible

21 October 2023

Exploring Autoencoder-based Error-bounded Compression for Scientific Data

Jinyang Liu,Sheng Di,Kai Zhao,Sian Jin,Dingwen Tao,Xin Liang,Zizhong Chen,Franck Cappello

Error-bounded lossy compression is becoming an indispensable technique for the success of today's scientific projects with vast volumes of data produced during simulations or instrument data acquisitions. Not only can it significantly reduce data size, but it can also control the compression errors based on user-specified error bounds. Autoencoder (AE) models have been widely used in image compression, but few AE-based compression approaches support error-bounding features, which are highly required by scientific applications. To address this issue, we explore using convolutional autoencoders to improve error-bounded lossy compression for scientific data, with the following three key contributions. (1) We provide an in-depth investigation of the characteristics of various autoencoder models and develop an error-bounded autoencoder-based framework in terms of the SZ model. (2) We optimize the compression quality for the main stages in our designed AE-based error-bounded compression framework, fine-tuning the block sizes and latent sizes and also optimizing the compression efficiency of latent vectors. (3) We evaluate our proposed solution using five real-world scientific datasets and compare it with six other related works. Experiments show that our solution exhibits a very competitive compression quality among all the compressors in our tests. In absolute terms, it can obtain a much better compression quality (100% to 800% improvement in compression ratio with the same data distortion) compared with SZ2.1 and ZFP in cases with a high compression ratio.
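
To make the error-bounding idea concrete, here is a minimal sketch of the linear-scale residual quantization commonly used in SZ-style compressors. A predictor supplies an estimate for each value (here a hand-written list stands in for what an autoencoder's reconstruction would provide), and the residual is quantized into bins of width twice the error bound, so each decoded value stays within the bound. The function names and sample data are hypothetical, and this is not the authors' implementation.

```python
def quantize(values, predictions, error_bound):
    """Map each residual (value - prediction) to an integer bin of width 2 * error_bound."""
    step = 2.0 * error_bound
    return [round((v - p) / step) for v, p in zip(values, predictions)]

def reconstruct(codes, predictions, error_bound):
    """Invert quantize(): each decoded value lies within error_bound of the original."""
    step = 2.0 * error_bound
    return [p + c * step for c, p in zip(codes, predictions)]

# Hypothetical data: 'predictions' stands in for a predictor's output.
values = [1.00, 1.07, 0.93, 2.50]
predictions = [1.02, 1.00, 1.00, 2.10]
error_bound = 0.05

codes = quantize(values, predictions, error_bound)
decoded = reconstruct(codes, predictions, error_bound)
max_error = max(abs(v - d) for v, d in zip(values, decoded))
print(codes, max_error)
```

The integer codes are what a real compressor would then entropy-code. The better the predictor, the more codes cluster around zero and the smaller the compressed output.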

Noise · Learning · Context window · One-shot learning · Tolerance

20 October 2023

An Event based Prediction Suffix Tree

Evie Andrew,Travis Monk,André van Schaik

This article introduces the Event based Prediction Suffix Tree (EPST), a biologically inspired, event-based prediction algorithm. The EPST learns a model online based on the statistics of an event based input and can make predictions over multiple overlapping patterns. The EPST uses a representation specific to event based data, defined as a portion of the power set of event subsequences within a short context window. It is explainable, and possesses many promising properties such as fault tolerance, resistance to event noise, as well as the capability for one-shot learning. The computational features of the EPST are examined in a synthetic data prediction task with additive event noise, event jitter, and dropout. The resulting algorithm outputs predicted projections for the near term future of the signal, which may be applied to tasks such as event based anomaly detection or pattern recognition.

Generalization theory · Transformation · Independent · Mask · Linear transformation

19 October 2023

Sequence Length Independent Norm-Based Generalization Bounds for Transformers

Jacob Trauger,Ambuj Tewari

from arxiv, 18 pages

This paper provides norm-based generalization bounds for the Transformer architecture that do not depend on the input sequence length. We employ a covering number based approach to prove our bounds. We use three novel covering number bounds for the function class of bounded linear transformations to upper bound the Rademacher complexity of the Transformer. Furthermore, we show this generalization bound applies to the common Transformer training technique of masking and then predicting the masked word. We also run a simulated study on a sparse majority data set that empirically validates our theoretical findings.

MoDELS · Integration · Support vector machine · Model evaluation · AI

14 October 2023

Software Metadata Classification based on Generative Artificial Intelligence

Seetharam Killivalavan,Durairaj Thenmozhi

from arxiv, FIRE Track: Information Retrieval in Software Engineering (IRSE), 9 pages

This paper presents a novel approach to enhance the performance of binary code comment quality classification models through the application of Generative Artificial Intelligence (AI). By leveraging the OpenAI API, a dataset comprising 1239 newly generated code-comment pairs, extracted from various GitHub repositories and open-source projects, has been labelled as "Useful" or "Not Useful", and integrated into the existing corpus of 9048 pairs in the C programming language. Employing a cutting-edge Large Language Model Architecture, the generated dataset demonstrates notable improvements in model accuracy. Specifically, when incorporated into the Support Vector Machine (SVM) model, a 6-percentage-point increase in precision is observed, rising from 0.79 to 0.85. Additionally, the Artificial Neural Network (ANN) model exhibits a 1.5-percentage-point increase in recall, climbing from 0.731 to 0.746. This paper sheds light on the potential of Generative AI in augmenting code comment quality classification models. The results affirm the effectiveness of this methodology, indicating its applicability in broader contexts within software development and quality assurance domains. The findings underscore the significance of integrating generative techniques to advance the accuracy and efficacy of machine learning models in practical software engineering scenarios.

MoDELS · CLUES · INTERACT · Graphics processing unit · Neural Networks

28 January 2021

A Graph-based Relevance Matching Model for Ad-hoc Retrieval

Yufeng Zhang,Jinghao Zhang,Zeyu Cui,Shu Wu,Liang Wang

from arxiv, To appear at AAAI 2021

To retrieve more relevant, appropriate and useful documents given a query, finding clues about that query through the text is crucial. Recent deep learning models regard the task as a term-level matching problem, which seeks exact or similar query patterns in the document. However, we argue that they are inherently based on local interactions and do not generalise to ubiquitous, non-consecutive contextual relationships. In this work, we propose a novel relevance matching model based on graph neural networks to leverage the document-level word relationships for ad-hoc retrieval. In addition to the local interactions, we explicitly incorporate all contexts of a term through the graph-of-word text format. Matching patterns can be revealed accordingly to provide a more accurate relevance score. Our approach significantly outperforms strong baselines on two ad-hoc benchmarks. We also experimentally compare our model with BERT and show our advantages on long documents.