
A Pyramidal Histogram Of Characters (PHOC) represents the spatial location of symbols as binary vectors. The vectors are composed of levels that split a formula into equal-sized regions of one or more types (e.g., rectangles or ellipses). For each region type, this produces a pyramid of overlapping regions, where the first level contains the entire formula and the final level contains the finest-grained regions. In this work, we introduce concentric rectangles for regions, and analyze whether subsequent PHOC levels encode redundant information by omitting levels from PHOC configurations. As a baseline, we include a bag-of-words PHOC containing only the first whole-formula level. Finally, using the ARQMath-3 formula retrieval benchmark, we demonstrate that some levels encoded in the original PHOC configurations are redundant, that PHOC models with rectangular regions outperform earlier PHOC models, and that despite their simplicity, PHOC models are surprisingly competitive with the state of the art. PHOC is not math-specific and might also be used for chemical diagrams, charts, or other graphics.
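
The construction can be made concrete with a toy example. Below is a minimal sketch that assumes each symbol is given as a label with normalized (x, y) coordinates; the small vocabulary, the k x k grid of rectangles per level, and the symbol-centre membership test are illustrative simplifications, not the exact configurations evaluated in the paper.

```python
# Minimal PHOC sketch with rectangular regions; vocabulary, levels, and the
# membership test are illustrative choices, not the paper's configuration.
from itertools import product

VOCAB = ["x", "y", "+", "=", "2"]   # hypothetical symbol vocabulary
LEVELS = [1, 2, 3]                  # level k splits width and height into k parts

def phoc(symbols):
    """Binary vector: one bit per (level, row, column, vocabulary entry)."""
    bits = []
    for k in LEVELS:
        for row, col in product(range(k), range(k)):
            region = set()
            for label, x, y in symbols:
                if int(x * k) == col and int(y * k) == row:  # symbol centre in this cell
                    region.add(label)
            bits.extend(1 if v in region else 0 for v in VOCAB)
    return bits

# "x + y = 2" with rough horizontal positions, all on one baseline (y = 0.5)
formula = [("x", 0.1, 0.5), ("+", 0.3, 0.5), ("y", 0.5, 0.5), ("=", 0.7, 0.5), ("2", 0.9, 0.5)]
vec = phoc(formula)
print(sum(vec), "bits set out of", len(vec))
```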

Related content

 Pyramid is a small, fast, down-to-earth Python web application development framework.
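
For reference, a minimal Pyramid application follows the framework's standard quickstart pattern; the route name and port below are arbitrary choices.

```python
from wsgiref.simple_server import make_server
from pyramid.config import Configurator
from pyramid.response import Response

def hello_world(request):
    # Return a plain-text response for the root route.
    return Response("Hello World!")

if __name__ == "__main__":
    with Configurator() as config:
        config.add_route("hello", "/")
        config.add_view(hello_world, route_name="hello")
        app = config.make_wsgi_app()
    make_server("0.0.0.0", 6543, app).serve_forever()
```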

An increasing number of organizations are deploying Large Language Models (LLMs) for a wide range of tasks. Despite their general utility, LLMs are prone to errors, ranging from inaccuracies to hallucinations. To objectively assess the capabilities of existing LLMs, performance benchmarks are conducted. However, these benchmarks often do not translate to more specific real-world tasks. This paper addresses the gap in benchmarking LLM performance in the Business Process Management (BPM) domain. Currently, no BPM-specific benchmarks exist, creating uncertainty about the suitability of different LLMs for BPM tasks. This paper systematically compares LLM performance on four BPM tasks, focusing on small open-source models. The analysis aims to identify task-specific performance variations, compare the effectiveness of open-source versus commercial models, and assess the impact of model size on BPM task performance. This paper provides insights into the practical applications of LLMs in BPM, guiding organizations in selecting appropriate models for their specific needs.
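
The comparison described above reduces to scoring each model's output on a fixed set of task instances. The sketch below illustrates such an evaluation loop; the sample task, the exact-match metric, and the stand-in model call are hypothetical placeholders rather than the paper's actual setup.

```python
# Minimal benchmark-loop sketch; the task instance, the exact-match metric,
# and the dummy model are illustrative placeholders, not the paper's setup.
from typing import Callable

def evaluate(generate: Callable[[str], str], cases: list[tuple[str, str]]) -> float:
    """Fraction of task instances where the model output matches the expected answer."""
    hits = sum(1 for prompt, expected in cases
               if generate(prompt).strip().lower() == expected.strip().lower())
    return hits / len(cases)

# Hypothetical BPM task instance (activity extraction from a process description).
cases = [("List the first activity in: 'Receive order, check stock, ship goods.'",
          "receive order")]

def dummy_model(prompt: str) -> str:   # stand-in for an actual LLM call
    return "Receive order"

print(f"accuracy = {evaluate(dummy_model, cases):.2f}")
```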

Density Functional Theory (DFT) is used extensively in the computation of electronic properties of matter, with various applications. Approximating the exchange-correlation (XC) functional is the key to the Kohn-Sham DFT approach, the basis of most DFT calculations. The choice of this density functional approximation (DFA) depends crucially on the particular system under study, which has resulted in the development of hundreds of DFAs. Though the exact density functional is not known, researchers have discovered analytical properties of this exact functional, known as exact conditions, which are used when designing DFAs. We present XCVerifier, the first approach for verifying whether a DFA implementation satisfies the DFT exact conditions. XCVerifier was evaluated on five DFAs from the popular Libxc library and seven exact conditions from recent work. XCVerifier was able to verify or find violations for a majority of the DFA/condition pairs, demonstrating the feasibility of using formal methods to verify DFA implementations.
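
As a rough illustration of what checking an exact condition involves (independent of XCVerifier's formal-methods machinery, which the abstract does not detail), the non-positivity of the exchange energy can be tested numerically for the LDA exchange functional; the random densities and unit-weight grid below are arbitrary choices.

```python
# Illustrative numerical check of one exact condition (E_x <= 0) for LDA exchange.
# A hand-rolled sanity check, not how XCVerifier or Libxc are actually used.
import numpy as np

def lda_exchange_energy_density(rho):
    """Slater/LDA exchange energy per unit volume: -(3/4)(3/pi)^(1/3) * rho^(4/3)."""
    return -0.75 * (3.0 / np.pi) ** (1.0 / 3.0) * rho ** (4.0 / 3.0)

rng = np.random.default_rng(0)
for _ in range(1000):
    rho = rng.uniform(0.0, 10.0, size=64)          # random non-negative densities
    e_x = lda_exchange_energy_density(rho).sum()   # crude "integral" with unit weights
    assert e_x <= 0.0, "exact condition E_x <= 0 violated"
print("E_x <= 0 held on all sampled densities")
```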

Vision-based Semantic Scene Completion (SSC) has gained much attention due to its widespread applications in various 3D perception tasks. Existing sparse-to-dense approaches typically employ shared, context-independent queries across different input images, which fails to capture the distinctions among them, since the focal regions of different inputs vary, and may result in undirected feature aggregation during cross-attention. Additionally, the absence of depth information may lead to points projected onto the image plane sharing the same 2D position or similar sampling points in the feature map, resulting in depth ambiguity. In this paper, we present a novel context and geometry aware voxel transformer. It utilizes a context aware query generator to initialize context-dependent queries tailored to individual input images, effectively capturing their unique characteristics and aggregating information within the region of interest. Furthermore, it extends deformable cross-attention from 2D to 3D pixel space, enabling the differentiation of points with similar image coordinates based on their depth coordinates. Building upon this module, we introduce a neural network named CGFormer to achieve semantic scene completion. CGFormer also leverages multiple 3D representations (i.e., voxel and TPV) to boost the semantic and geometric representation abilities of the transformed 3D volume from both local and global perspectives. Experimental results demonstrate that CGFormer achieves state-of-the-art performance on the SemanticKITTI and SSCBench-KITTI-360 benchmarks, attaining mIoU scores of 16.87 and 20.05 and IoU scores of 45.99 and 48.07, respectively. Remarkably, CGFormer even outperforms approaches that use temporal images as inputs or much larger image backbone networks.
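
A simplified reading of the context aware query generator is sketched below: pool the image features into a global context vector and let a small MLP modulate a set of shared learnable queries. The layer sizes and the additive modulation are our own assumptions, not the authors' implementation.

```python
# Simplified sketch of context-dependent query initialization; dimensions and
# the additive (residual) modulation are illustrative assumptions.
import torch
import torch.nn as nn

class ContextAwareQueryGenerator(nn.Module):
    def __init__(self, num_queries=100, dim=128):
        super().__init__()
        self.base_queries = nn.Parameter(torch.randn(num_queries, dim))  # shared, learnable
        self.mlp = nn.Sequential(nn.Linear(dim, dim), nn.ReLU(), nn.Linear(dim, dim))

    def forward(self, img_feats):              # img_feats: (B, C=dim, H, W)
        ctx = img_feats.mean(dim=(2, 3))       # global context vector, (B, dim)
        offset = self.mlp(ctx).unsqueeze(1)    # per-image modulation, (B, 1, dim)
        return self.base_queries.unsqueeze(0) + offset  # context-dependent queries, (B, Q, dim)

queries = ContextAwareQueryGenerator()(torch.randn(2, 128, 24, 77))
print(queries.shape)   # torch.Size([2, 100, 128])
```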

The Industrial Internet of Things (IIoT) paradigm has emerged as a transformative force, revolutionizing industrial processes by integrating advanced wireless technologies into traditional procedures to enhance their efficiency. The importance of this paradigm shift has produced a massive, yet heterogeneous, proliferation of scientific contributions. However, these works lack a standardized and cohesive characterization of the IIoT framework: definitions come from different entities, such as the 3rd Generation Partnership Project (3GPP) and the 5G Alliance for Connected Industries and Automation (5G-ACIA), resulting in divergent perspectives and potentially hindering interoperability. To bridge this gap, this article offers a unified characterization of (i) the main IIoT application domains, (ii) their respective requirements, and (iii) the principal technological gaps in the current literature, and, most importantly, (iv) proposes a systematic approach for assessing and addressing the identified research challenges. This article therefore serves as a roadmap for future research endeavors, promoting a unified vision of the IIoT paradigm and fostering collaborative efforts to advance the field.

With the massive advancements in processing power, Digital Twins (DTs) have become powerful tools to monitor and analyze physical entities. However, due to the potentially very high number of Physical Systems (PSs) to be tracked and emulated, for instance, in a factory environment or an Internet of Things (IoT) network, continuous twinning might become infeasible. In this paper, a DT system is investigated with a set of random PSs, where the twinning rate is limited due to resource constraints. Three cost functions are considered to quantify and penalize the twinning delay. For these cost functions, the optimal twinning problem under twinning rate constraints is formulated. In a numerical example, the proposed cost functions are evaluated for two benchmark twinning policies, one push-based and one pull-based. The proposed methodology is the first to investigate the optimal twinning problem with random PSs and twinning rate constraints, and it serves as a guideline for real-world implementations on how frequently PSs should be twinned.
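
A toy simulation makes the setup tangible. The sketch below twins the systems with the oldest digital twins under a fixed per-slot twinning budget and accumulates a linear cost in twinning delay; both the scheduling rule and the cost function are illustrative stand-ins, not the paper's benchmark policies or cost functions.

```python
# Toy twinning simulation under a rate constraint; the oldest-first schedule and
# the linear delay cost are illustrative choices, not the paper's models.
N_PS, RATE, HORIZON = 10, 2, 100     # physical systems, twinnings per slot, time slots
last_twinned = [0] * N_PS            # slot at which each PS was last twinned

total_cost = 0.0
for t in range(1, HORIZON + 1):
    # Twin the RATE systems whose digital twins are the oldest.
    for ps in sorted(range(N_PS), key=lambda i: last_twinned[i])[:RATE]:
        last_twinned[ps] = t
    # Linear cost in twinning delay (age of each digital twin), summed over systems.
    total_cost += sum(t - lt for lt in last_twinned)

print(f"average per-slot cost: {total_cost / HORIZON:.1f}")
```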

Neural Ordinary Differential Equations (NODEs) often struggle to adapt to new dynamic behaviors caused by parameter changes in the underlying system, even when these dynamics are similar to previously observed behaviors. This problem becomes more challenging when the changing parameters are unobserved, meaning their values or influence cannot be directly measured when collecting data. To address this issue, we introduce Neural Context Flow (NCF), a robust and interpretable Meta-Learning framework that includes uncertainty estimation. NCF uses a higher-order Taylor expansion to enable contextual self-modulation, allowing context vectors to influence dynamics from other domains while also modulating themselves. After establishing convergence guarantees, we empirically test NCF and compare it to related adaptation methods. Our results show that NCF achieves state-of-the-art Out-of-Distribution performance on 5 out of 6 linear and non-linear benchmark problems. Through extensive experiments, we explore the flexible model architecture of NCF and the encoded representations within the learned context vectors. Our findings highlight the potential implications of NCF for foundational models in the physical sciences, offering a promising approach to improving the adaptability and generalization of NODEs in various scientific applications. Our code is openly available at https://github.com/ddrous/ncflow.
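
The contextual self-modulation idea can be sketched at first order: evaluate the vector field with another domain's context vector and correct it with a directional derivative toward the current context. The toy oscillator below and its hand-derived Jacobian are our own illustration; NCF itself uses higher-order Taylor expansions of a learned vector field.

```python
# First-order Taylor illustration of context-conditioned dynamics; the toy
# vector field and its analytic context-Jacobian are illustrative stand-ins.
import numpy as np

def vector_field(x, ctx):
    # Toy dynamics: a damped oscillator whose stiffness depends on the context vector.
    return np.array([x[1], -(1.0 + ctx @ ctx) * x[0] - 0.1 * x[1]])

def ctx_jacobian_product(x, ctx, delta):
    # Directional derivative of the toy field with respect to ctx, along delta.
    return np.array([0.0, -2.0 * (ctx @ delta) * x[0]])

def taylor_modulated(x, ctx_i, ctx_j):
    """Approximate f(x, ctx_i) by expanding f(x, .) to first order around ctx_j."""
    return vector_field(x, ctx_j) + ctx_jacobian_product(x, ctx_j, ctx_i - ctx_j)

x = np.array([1.0, 0.0])
ctx_i, ctx_j = np.array([0.30, 0.10]), np.array([0.20, 0.00])
print(taylor_modulated(x, ctx_i, ctx_j), "vs exact", vector_field(x, ctx_i))
```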

Robot-assisted surgery has profoundly influenced current forms of minimally invasive surgery. However, transurethral suburethral urological surgical robots need to work in a liquid environment. Shearing and heating vaporize this liquid, producing bubble atomization that degrades the robot's visual perception. This can force pauses in the surgical procedure, prolonging the operation. To address the atomization characteristics of liquids in urological surgical robotic vision, we propose an unsupervised zero-shot dehazing method (RSF-Dehaze) for urological surgical robotic vision. Specifically, the proposed Region Similarity Filling Module (RSFM) of RSF-Dehaze significantly improves the recovery of tissue in blurred regions. In addition, we organize and propose a dehazing dataset for robotic vision in urological surgery (the USRobot-Dehaze dataset), which covers the three most common urological surgical robot operation scenarios. To the best of our knowledge, we are the first to organize and release a publicly available dehazing dataset for urological surgical robot vision. Extensive comparative experiments with 20 classical and state-of-the-art dehazing and image recovery algorithms demonstrate the effectiveness of RSF-Dehaze in the three urological surgical robot operation scenarios. The source code and dataset are available at https://github.com/wurenkai/RSF-Dehaze.
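
A much-simplified reading of region-similarity filling is sketched below: split the frame into patches, flag low-contrast (blurred) patches, and blend each with its most similar clear patch. The patch size, contrast threshold, similarity proxy, and blending weight are arbitrary assumptions, not the actual RSFM.

```python
# Hand-rolled illustration of region-similarity-based filling; all thresholds
# and the mean-intensity similarity proxy are arbitrary choices, not the RSFM.
import numpy as np

def fill_blurred_regions(img, patch=16, contrast_thr=0.05, alpha=0.5):
    h, w = img.shape
    patches = {(i, j): img[i:i + patch, j:j + patch]
               for i in range(0, h - patch + 1, patch)
               for j in range(0, w - patch + 1, patch)}
    clear = {k: p for k, p in patches.items() if p.std() > contrast_thr}  # high-contrast regions
    out = img.copy()
    for k, p in patches.items():
        if k in clear or not clear:
            continue
        # Most similar clear patch by mean-intensity distance (a crude similarity proxy).
        best = min(clear.values(), key=lambda q: abs(q.mean() - p.mean()))
        out[k[0]:k[0] + patch, k[1]:k[1] + patch] = alpha * p + (1 - alpha) * best
    return out

print(fill_blurred_regions(np.random.rand(64, 64).astype(np.float32)).shape)
```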

Large Language Models (LLMs) are increasingly being explored for their potential in software engineering, particularly in static analysis tasks. In this study, we investigate the potential of current LLMs to enhance call-graph analysis and type inference for Python and JavaScript programs. We empirically evaluated 24 LLMs, including OpenAI's GPT series and open-source models like LLaMA and Mistral, using existing and newly developed benchmarks. Specifically, we enhanced TypeEvalPy, a micro-benchmarking framework for type inference in Python, with auto-generation capabilities, expanding its scope from 860 to 77,268 type annotations for Python. Additionally, we introduced SWARM-CG and SWARM-JS, comprehensive benchmarking suites for evaluating call-graph construction tools across multiple programming languages. Our findings reveal contrasting performance of LLMs across static analysis tasks. For call-graph generation in Python, traditional static analysis tools like PyCG significantly outperform LLMs. In JavaScript, the static tool TAJS underperforms due to its inability to handle modern language features, while LLMs, despite showing potential with models like mistral-large-it-2407-123b and GPT-4o, struggle with completeness and soundness in both languages for call-graph analysis. Conversely, LLMs demonstrate a clear advantage in type inference for Python, surpassing traditional tools like HeaderGen and hybrid approaches such as HiTyper. These results suggest that while LLMs hold promise in type inference, their limitations in call-graph analysis highlight the need for further research. Our study provides a foundation for integrating LLMs into static analysis workflows, offering insights into their strengths and current limitations.
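
To make concrete what call-graph construction asks of a tool, whether a static analyzer like PyCG or an LLM, the sketch below extracts only direct, module-level function calls with Python's ast module; real tools must additionally resolve methods, higher-order functions, and dynamic dispatch, which is where static analyzers and LLMs diverge.

```python
# Deliberately naive call-graph extraction (direct calls only); illustrative, not PyCG.
import ast

SOURCE = """
def helper():
    pass

def main():
    helper()
    print("done")
"""

def call_graph(source: str) -> dict[str, set[str]]:
    tree = ast.parse(source)
    graph: dict[str, set[str]] = {}
    for fn in ast.walk(tree):
        if isinstance(fn, ast.FunctionDef):
            callees = {node.func.id for node in ast.walk(fn)
                       if isinstance(node, ast.Call) and isinstance(node.func, ast.Name)}
            graph[fn.name] = callees
    return graph

print(call_graph(SOURCE))   # e.g. {'helper': set(), 'main': {'helper', 'print'}}
```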

Large Language Models (LLMs) have emerged as powerful tools in the field of Natural Language Processing (NLP) and have recently gained significant attention in the domain of Recommendation Systems (RS). These models, trained on massive amounts of data using self-supervised learning, have demonstrated remarkable success in learning universal representations and have the potential to enhance various aspects of recommendation systems through effective transfer techniques such as fine-tuning and prompt tuning. The crucial aspect of harnessing the power of language models to enhance recommendation quality is the utilization of their high-quality representations of textual features and their extensive coverage of external knowledge to establish correlations between items and users. To provide a comprehensive understanding of existing LLM-based recommendation systems, this survey presents a taxonomy that categorizes these models into two major paradigms, Discriminative LLM for Recommendation (DLLM4Rec) and Generative LLM for Recommendation (GLLM4Rec), with the latter being systematically reviewed for the first time. Furthermore, we systematically review and analyze existing LLM-based recommendation systems within each paradigm, providing insights into their methodologies, techniques, and performance. Additionally, we identify key challenges and several valuable findings to provide researchers and practitioners with inspiration.
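
In the generative (GLLM4Rec) paradigm, the simplest entry point is prompting: the user's interaction history and a candidate list are rendered as text, and the model is asked to rank the candidates. The sketch below shows such prompt construction; the item names and the downstream model call are hypothetical placeholders, not a specific system from the survey.

```python
# Minimal prompt-construction sketch for generative LLM recommendation;
# items and the downstream model call are placeholders.
def build_rec_prompt(history: list[str], candidates: list[str]) -> str:
    return (
        "A user recently watched: " + ", ".join(history) + ".\n"
        "Rank the following candidates from most to least likely to be watched next:\n"
        + "\n".join(f"- {c}" for c in candidates)
    )

prompt = build_rec_prompt(
    history=["Interstellar", "The Martian"],
    candidates=["Gravity", "Notting Hill", "Arrival"],
)
print(prompt)   # would be sent to an LLM; the reply is parsed into a ranked list
```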

Visual Question Answering (VQA) models have so far struggled with counting objects in natural images. We identify the soft attention used in these models as a fundamental cause. To circumvent this problem, we propose a neural network component that allows robust counting from object proposals. Experiments on a toy task show the effectiveness of this component, and we obtain state-of-the-art accuracy on the number category of the VQA v2 dataset without negatively affecting other categories, even outperforming ensemble models with our single model. On a difficult balanced-pair metric, the component improves counting over a strong baseline by a substantial 6.6%.
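
A stripped-down, non-differentiable version of counting from object proposals is sketched below: keep proposals whose relevance score clears a threshold and suppress duplicates of the same object by IoU. The thresholds are arbitrary, and the paper's actual component is a differentiable formulation rather than this hard-decision variant.

```python
# Simplified illustration of counting from proposals; thresholds are arbitrary,
# and the paper's component is differentiable, unlike this hard-decision sketch.
def iou(a, b):
    ax1, ay1, ax2, ay2 = a
    bx1, by1, bx2, by2 = b
    ix = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    iy = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = ix * iy
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    return inter / union if union > 0 else 0.0

def count_objects(boxes, scores, score_thr=0.5, iou_thr=0.5):
    kept = []
    for box, s in sorted(zip(boxes, scores), key=lambda p: -p[1]):
        # Keep a proposal only if it is relevant and not a duplicate of a kept box.
        if s >= score_thr and all(iou(box, k) < iou_thr for k in kept):
            kept.append(box)
    return len(kept)

boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (30, 30, 40, 40)]   # first two overlap heavily
print(count_objects(boxes, scores=[0.9, 0.8, 0.7]))           # -> 2
```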
