In this paper, we give a generalization of results on the error-correcting capability of twisted centralizer codes obtained from a fixed rank-1 matrix. In particular, we fix the combinatorial matrix, which is a linear combination of the all-ones matrix and the identity matrix of order n. Our results show that, for any fixed combinatorial matrix and constant a, such codes have dimension 1 and therefore a relatively low information rate, owing to the way their codewords are constructed, but they are maximum distance separable (MDS) codes.
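The closing claim can be made concrete with the Singleton bound. The short calculation below assumes, as is usual for centralizer codes but not stated in the abstract, that each codeword is an n × n matrix read as a vector of length N = n²; the symbols N, k, d, and R are introduced here for illustration.

```latex
\[
d \le N - k + 1, \qquad N = n^2,\quad k = 1
\;\Longrightarrow\; d \le n^2 ,
\]
\[
R = \frac{k}{N} = \frac{1}{n^2}.
\]
```

Under that assumption, an MDS code of dimension 1 meets the bound with d = n², so every nonzero codeword has no zero entries. For n = 3 this would be a [9, 1, 9] code with rate 1/9, which shows the trade-off between the low information rate and the maximal distance.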

Related content

Tokenizer · Analysis · SCAM · Model evaluation · Cost

December 19, 2024

From Programming Bugs to Multimillion-Dollar Scams: An Analysis of Trapdoor Tokens on Uniswap

Phuong Duy Huynh,Thisal De Silva,Son Hoang Dau,Xiaodong Li,Iqbal Gondal,Emanuele Viterbo

from arxiv, 22 pages, 11 figures

In this work, we investigate a recently emerged type of scam ERC-20 token called Trapdoor, which has cost investors billions of US dollars on Uniswap, the largest decentralised exchange on Ethereum, from 2020 to 2023. In essence, Trapdoor tokens allow users to buy but prevent them from selling by embedding logical bugs and/or owner-only features in their smart contracts. By manually inspecting a number of Trapdoor samples, we established the first systematic classification of Trapdoor tokens and a comprehensive list of techniques that scammers used to embed and conceal malicious code, accompanied by a detailed analysis of representative scam contracts. In particular, we developed TrapdoorAnalyser, a fine-grained detection tool that generates and crosschecks the error log of a buy-and-sell test and the list of embedded Trapdoor indicators from a contract-semantic check to reliably identify a Trapdoor token. TrapdoorAnalyser not only outperforms the state-of-the-art commercial tool GoPlus in accuracy, but also provides traces of malicious code with a full explanation, which most existing tools lack. Using TrapdoorAnalyser, we constructed the very first dataset of about 30,000 Trapdoor and non-Trapdoor tokens on UniswapV2, which allows us to train several machine learning algorithms that can detect, with very high accuracy, even Trapdoor tokens with no available Solidity source code.

MoDELS · Language modeling · Comprehensibility · contrastive · Integration

December 18, 2024

TEncDM: Understanding the Properties of the Diffusion Model in the Space of Language Model Encodings

Alexander Shabalin,Viacheslav Meshchaninov,Egor Chimbulatov,Vladislav Lapikov,Roman Kim,Grigory Bartosh,Dmitry Molchanov,Sergey Markov,Dmitry Vetrov

from arxiv, 15 pages, 13 figures

This paper presents the Text Encoding Diffusion Model (TEncDM), a novel approach to diffusion modeling that operates in the space of pre-trained language model encodings. In contrast to traditionally used embeddings, encodings integrate contextual information. In our approach, we also employ a transformer-based decoder, specifically designed to incorporate context in the token prediction process. We conduct a comprehensive examination of the influence of the encoder, decoder, noise scheduler, and self-conditioning on zero-shot generation. Furthermore, we compare TEncDM with previous approaches on three conditional text generation tasks: QQP, XSum, and Wiki-Auto. The results show that TEncDM exhibits superior performance compared to existing non-autoregressive diffusion models. Our code is available at //github.com/M0RJIQUE/tencdm.

CLIP · MoDELS · Comprehensibility · Category · Object recognition

December 18, 2024

Real Classification by Description: Extending CLIP's Limits of Part Attributes Recognition

Ethan Baron,Idan Tankel,Peter Tu,Guy Ben-Yosef

In this study, we define and tackle zero shot "real" classification by description, a novel task that evaluates the ability of Vision-Language Models (VLMs) like CLIP to classify objects based solely on descriptive attributes, excluding object class names. This approach highlights the current limitations of VLMs in understanding intricate object descriptions, pushing these models beyond mere object recognition. To facilitate this exploration, we introduce a new challenge and release description data for six popular fine-grained benchmarks, which omit object names to encourage genuine zero-shot learning within the research community. Additionally, we propose a method to enhance CLIP's attribute detection capabilities through targeted training using ImageNet21k's diverse object categories, paired with rich attribute descriptions generated by large language models. Furthermore, we introduce a modified CLIP architecture that leverages multiple resolutions to improve the detection of fine-grained part attributes. Through these efforts, we broaden the understanding of part-attribute recognition in CLIP, improving its performance in fine-grained classification tasks across six popular benchmarks, as well as in the PACO dataset, a widely used benchmark for object-attribute recognition. Code is available at: //github.com/ethanbar11/grounding_ge_public.

MoDELS · Speech enhancement · Processing (programming language) · Round · AIM

December 18, 2024

Investigating the Effects of Diffusion-based Conditional Generative Speech Models Used for Speech Enhancement on Dysarthric Speech

Joanna Reszka,Parvaneh Janbakhshi,Tilak Purohit,Sadegh Mohammadi

from arxiv, Accepted at ICASSP 2025 Satellite Workshop: Workshop on Speech Pathology Analysis and DEtection (SPADE)

In this study, we explore, for the first time, the effect of pre-trained conditional generative speech models on dysarthric speech due to Parkinson's disease recorded in an ideal, non-noisy condition. We consider one category of generative models, diffusion-based speech enhancement; these models are pre-trained to learn the distribution of clean (i.e., recorded in a noise-free environment) typical speech signals. We therefore hypothesized that, when exposed to dysarthric speech, they might remove the unseen atypical paralinguistic cues during the enhancement process. Using the automatic dysarthric speech detection task, we show experimentally that some of the acoustic dysarthric speech cues are lost when dysarthric speech data recorded in an ideal non-noisy environment is enhanced. Such pre-trained models are therefore not yet suitable for dysarthric speech enhancement, since they manipulate the pathological speech cues when they process clean dysarthric speech. Furthermore, we show that the acoustic cues removed by the enhancement models, in the form of a residue speech signal, can provide complementary dysarthric cues when fused with the original input speech signal in the feature space.

Language modeling · MoDELS · Code · CodeBERT · Engineering

December 18, 2024

On the Compression of Language Models for Code: An Empirical Study on CodeBERT

Giordano d'Aloisio,Luca Traini,Federica Sarro,Antinisca Di Marco

Language models have proven successful across a wide range of software engineering tasks, but their significant computational costs often hinder their practical adoption. To address this challenge, researchers have begun applying various compression strategies to improve the efficiency of language models for code. These strategies aim to optimize inference latency and memory usage, though often at the cost of reduced model effectiveness. However, there is still a significant gap in understanding how these strategies influence the efficiency and effectiveness of language models for code. Here, we empirically investigate the impact of three well-known compression strategies -- knowledge distillation, quantization, and pruning -- across three different classes of software engineering tasks: vulnerability detection, code summarization, and code search. Our findings reveal that the impact of these strategies varies greatly depending on the task and the specific compression method employed. Practitioners and researchers can use these insights to make informed decisions when selecting the most appropriate compression strategy, balancing both efficiency and effectiveness based on their specific needs.
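As a rough illustration of one of the three strategies named above, the following sketch applies symmetric 8-bit quantization to a small list of weights. It is a generic textbook scheme, not the setup evaluated in the study; the function names `quantize_int8` and `dequantize` and the sample weights are hypothetical.

```python
def quantize_int8(weights):
    """Map floats to integers in [-127, 127] using one shared scale.

    Hypothetical helper: symmetric per-tensor quantization, chosen only
    because it is the simplest scheme to show in a few lines.
    """
    max_abs = max(abs(w) for w in weights)
    # Guard against an all-zero input, where any scale would do.
    scale = max_abs / 127 if max_abs > 0 else 1.0
    quantized = [max(-127, min(127, round(w / scale))) for w in weights]
    return quantized, scale


def dequantize(quantized, scale):
    """Recover approximate float weights from the stored integers."""
    return [q * scale for q in quantized]


if __name__ == "__main__":
    weights = [0.5, -1.0, 0.25, 0.0]  # illustrative values only
    q, scale = quantize_int8(weights)
    restored = dequantize(q, scale)
    for w, r in zip(weights, restored):
        print(f"original={w:+.4f} restored={r:+.4f} error={abs(w - r):.6f}")
```

Each weight is stored in one byte instead of four, and the rounding error per weight is at most half the scale. That loss of precision is the kind of efficiency-versus-effectiveness trade-off the abstract refers to.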

Minimum point · Cumulative distribution function · Functional · Vectorization · Example

December 18, 2024

On the Maximum and Minimum of a Multivariate Poisson Distribution

Zheng Liu,Feifan Shi,Jing Yao,Yang Yang

In this paper, we investigate the cumulative distribution functions (CDFs) of the maximum and minimum of multivariate Poisson distributions with three dependence structures, namely, the common shock, comonotonic shock, and thinning-dependence models. In particular, we formulate the definition of a thinning-dependent multivariate Poisson distribution based on Wang and Yuen (2005). We derive explicit CDFs of the maximum and minimum of the multivariate Poisson random vectors and conduct asymptotic analyses of them. Our results reveal substantial differences among the three dependence structures for the multivariate Poisson distribution and may suggest an alternative method for studying dependence in other multivariate distributions. We further provide numerical examples demonstrating the obtained results.
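For the common shock case, the CDFs follow from conditioning on the shared component. The sketch below assumes the standard construction X_i = Y_i + Y_0 with independent Poisson variables Y_0, Y_1, ..., Y_d, which the abstract does not spell out; the function names and parameter values are hypothetical. Conditioning on Y_0 = k gives P(max X_i ≤ x) = Σ_{k=0}^{x} P(Y_0 = k) Π_i P(Y_i ≤ x − k), and the minimum is handled the same way through survival probabilities.

```python
from math import exp, factorial


def poisson_pmf(k, lam):
    """P(Y = k) for Y ~ Poisson(lam)."""
    if k < 0:
        return 0.0
    return exp(-lam) * lam**k / factorial(k)


def poisson_cdf(x, lam):
    """P(Y <= x) for Y ~ Poisson(lam)."""
    if x < 0:
        return 0.0
    return sum(poisson_pmf(k, lam) for k in range(x + 1))


def cdf_max_common_shock(x, lam0, lams):
    """P(max_i X_i <= x), assuming X_i = Y_i + Y_0 (common shock Y_0)."""
    total = 0.0
    for k in range(x + 1):  # the shock itself cannot exceed x
        prod = 1.0
        for lam in lams:
            prod *= poisson_cdf(x - k, lam)
        total += poisson_pmf(k, lam0) * prod
    return total


def cdf_min_common_shock(x, lam0, lams):
    """P(min_i X_i <= x) = 1 - P(all X_i > x) under the same assumption."""
    # If the shock alone exceeds x, every component exceeds x.
    survival = 1.0 - poisson_cdf(x, lam0)
    for k in range(x + 1):
        prod = 1.0
        for lam in lams:
            prod *= 1.0 - poisson_cdf(x - k, lam)
        survival += poisson_pmf(k, lam0) * prod
    return 1.0 - survival


if __name__ == "__main__":
    # Illustrative rates only: shock rate 0.5, individual rates 1.0 and 2.0.
    for x in range(6):
        print(x, cdf_max_common_shock(x, 0.5, [1.0, 2.0]),
              cdf_min_common_shock(x, 0.5, [1.0, 2.0]))
```

Two sanity checks are available: with a single component, both functions reduce to the CDF of a Poisson variable with rate lam0 + lam1, and with lam0 = 0 the maximum's CDF becomes the product of the marginal CDFs.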

Operation · Processing (programming language) · Networking · Maximum · M-step

December 17, 2024

Accelerating the Operation of Complex Workflows through Standard Data Interfaces

Taylor Paul,William Regli

from arxiv, 2 pages, 2 figures, accepted at the 19th Workshop on Workflows in Support of Large-Scale Science (WORKS24), IEEE/ACM The International Conference for High Performance Computing, Networking, Storage, and Analysis, SC24

In this position paper, we argue for standardizing how we share and process data in scientific workflows at the network level to maximize step reuse and workflow portability across platforms and networks, in pursuit of a foundational workflow stack. We look to evolve workflows from steps connected point-to-point in a directed acyclic graph (DAG) to steps connected via shared channels in a message system implemented as a network service. To start this evolution, we contribute a preliminary reference model, an architecture, and open tools to implement the architecture today. Our goal is to improve the deployment and operation of complex workflows by decoupling data sharing and data processing in workflow steps. We seek the workflow community's input on the merit of this approach, related research to explore, and initial requirements to inform future research.

MoDELS · Language modeling · Processing (programming language) · Transformation · Stream

December 17, 2024

Harnessing Event Sensory Data for Error Pattern Prediction in Vehicles: A Language Model Approach

Hugo Math,Rainer Lienhart,Robin Schön

from arxiv, 10 pages, 8 figures, accepted to AAAI 2025

In this paper, we draw an analogy between processing natural languages and processing multivariate event streams from vehicles in order to predict $\textit{when}$ and $\textit{what}$ error pattern is most likely to occur in the future for a given car. Our approach leverages the temporal dynamics and contextual relationships of our event data from a fleet of cars. Event data is composed of discrete values of error codes as well as continuous values such as time and mileage. Using two causal Transformers, we can anticipate vehicle failures and malfunctions before they happen. Specifically, we introduce $\textit{CarFormer}$, a Transformer model trained via a new self-supervised learning strategy, and $\textit{EPredictor}$, an autoregressive Transformer decoder model capable of predicting $\textit{when}$ and $\textit{what}$ error pattern will most likely occur after a given error code appears. Despite the challenges of high cardinality of event types, their unbalanced frequency of appearance, and limited labelled data, our experimental results demonstrate the excellent predictive ability of our novel model. Specifically, with sequences of $160$ error codes on average, our model needs only half of the error codes to achieve an $80\%$ F1 score for predicting $\textit{what}$ error pattern will occur, and it achieves an average absolute error of $58.4 \pm 13.2$ h $\textit{when}$ forecasting the time of occurrence, thus enabling confident predictive maintenance and enhancing vehicle safety.

Knowledge · Estimation/estimator · MoDELS · Latent · Knowledge extraction

December 17, 2024

Towards Reliable Latent Knowledge Estimation in LLMs: Zero-Prompt Many-Shot Based Factual Knowledge Extraction

Qinyuan Wu,Mohammad Aflah Khan,Soumi Das,Vedant Nanda,Bishwamittra Ghosh,Camila Kolling,Till Speicher,Laurent Bindschaedler,Krishna P. Gummadi,Evimaria Terzi

In this paper, we focus on the challenging task of reliably estimating factual knowledge that is embedded inside large language models (LLMs). To avoid reliability concerns with prior approaches, we propose to eliminate prompt engineering when probing LLMs for factual knowledge. Our approach, called Zero-Prompt Latent Knowledge Estimator (ZP-LKE), leverages the in-context learning ability of LLMs to communicate both the factual knowledge question as well as the expected answer format. Our knowledge estimator is both conceptually simpler (i.e., doesn't depend on meta-linguistic judgments of LLMs) and easier to apply (i.e., is not LLM-specific), and we demonstrate that it can surface more of the latent knowledge embedded in LLMs. We also investigate how different design choices affect the performance of ZP-LKE. Using the proposed estimator, we perform a large-scale evaluation of the factual knowledge of a variety of open-source LLMs, like OPT, Pythia, Llama(2), Mistral, Gemma, etc. over a large set of relations and facts from the Wikidata knowledge base. We observe differences in the factual knowledge between different model families and models of different sizes, that some relations are consistently better known than others but that models differ in the precise facts they know, and differences in the knowledge of base models and their finetuned counterparts. Code available at: //github.com/QinyuanWu0710/ZeroPrompt_LKE
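One way to picture an input with no instructions is a bare list of subject-object pairs for a single relation, followed by the query subject alone, so that the examples themselves convey both the question and the answer format. The sketch below is an assumption about what such an input could look like, not the format used by ZP-LKE; `build_zero_prompt`, the separator, and the example pairs are hypothetical.

```python
def build_zero_prompt(examples, query_subject, sep=" "):
    """Join (subject, object) pairs line by line, then append the query.

    Hypothetical helper: no instruction text or question template is
    added, so the in-context examples alone indicate the relation.
    """
    lines = [f"{subject}{sep}{obj}" for subject, obj in examples]
    lines.append(query_subject)
    return "\n".join(lines)


if __name__ == "__main__":
    # Illustrative pairs for a "capital of" relation.
    shots = [("France", "Paris"), ("Japan", "Tokyo"), ("Italy", "Rome")]
    print(build_zero_prompt(shots, "Canada"))
```

A model that knows the fact would be expected to continue the last line with the matching object. How many examples to include and how to order them are among the design choices the abstract says the authors investigate.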

Subspace · Outlier · Column · Origin · Integration

December 17, 2024

Multi-Subspace Matrix Recovery from Permuted Data

Liangqi Xie,Jicong Fan

from arxiv, The paper was accepted by AAAI 2025

This paper aims to recover a multi-subspace matrix from permuted data: given a matrix, in which the columns are drawn from a union of low-dimensional subspaces and some columns are corrupted by permutations on their entries, recover the original matrix. The task has numerous practical applications such as data cleaning, integration, and de-anonymization, but it remains challenging and cannot be well addressed by existing techniques such as robust principal component analysis because of the presence of multiple subspaces and the permutations on the elements of vectors. To solve the challenge, we develop a novel four-stage algorithm pipeline including outlier identification, subspace reconstruction, outlier classification, and unsupervised sensing for permuted vector recovery. Particularly, we provide theoretical guarantees for the outlier classification step, ensuring reliable multi-subspace matrix recovery. Our pipeline is compared with state-of-the-art competitors on multiple benchmarks and shows superior performance.