又黄又爽又色的视频免费-亚洲综合在线观看一区二区三区

In this paper, we introduce HoughToRadon Transform layer, a novel layer designed to improve the speed of neural networks incorporated with Hough Transform to solve semantic image segmentation problems. By placing it after a Hough Transform layer, "inner" convolutions receive modified feature maps with new beneficial properties, such as a smaller area of processed images and parameter space linearity by angle and shift. These properties were not presented in Hough Transform alone. Furthermore, HoughToRadon Transform layer allows us to adjust the size of intermediate feature maps using two new parameters, thus allowing us to balance the speed and quality of the resulting neural network. Our experiments on the open MIDV-500 dataset show that this new approach leads to time savings in document segmentation tasks and achieves state-of-the-art 97.7% accuracy, outperforming HoughEncoder with larger computational complexity.

相關內容

層

關注 2

變換 · state-of-the-art · 3D · 歸納偏好 · HTTPS ·

2024 年 3 月 18 日

VIHE: Virtual In-Hand Eye Transformer for 3D Robotic Manipulation

Weiyao Wang,Yutian Lei,Gregory D. Hage,Liangjun Zhang

In this work, we introduce the Virtual In-Hand Eye Transformer (VIHE), a novel method designed to enhance 3D manipulation capabilities through action-aware view rendering. VIHE autoregressively refines actions in multiple stages by conditioning on rendered views posed from action predictions in the earlier stages. These virtual in-hand views provide a strong inductive bias for effectively recognizing the correct pose for the hand, especially for challenging high-precision tasks such as peg insertion. On 18 manipulation tasks in RLBench simulated environments, VIHE achieves a new state-of-the-art, with a 12% absolute improvement, increasing from 65% to 77% over the existing state-of-the-art model using 100 demonstrations per task. In real-world scenarios, VIHE can learn manipulation tasks with just a handful of demonstrations, highlighting its practical utility. Videos and code implementation can be found at our project site: //vihe-3d.github.io.

可理解性 · 在線 · 論文 · Analysis · 模態 ·

2024 年 3 月 17 日

HarmPot: An Annotation Framework for Evaluating Offline Harm Potential of Social Media Text

Ritesh Kumar,Ojaswee Bhalla,Madhu Vanthi,Shehlat Maknoon Wani,Siddharth Singh

from arxiv, Accepted for: LREC COLING 2024

In this paper, we discuss the development of an annotation schema to build datasets for evaluating the offline harm potential of social media texts. We define "harm potential" as the potential for an online public post to cause real-world physical harm (i.e., violence). Understanding that real-world violence is often spurred by a web of triggers, often combining several online tactics and pre-existing intersectional fissures in the social milieu, to result in targeted physical violence, we do not focus on any single divisive aspect (i.e., caste, gender, religion, or other identities of the victim and perpetrators) nor do we focus on just hate speech or mis/dis-information. Rather, our understanding of the intersectional causes of such triggers focuses our attempt at measuring the harm potential of online content, irrespective of whether it is hateful or not. In this paper, we discuss the development of a framework/annotation schema that allows annotating the data with different aspects of the text including its socio-political grounding and intent of the speaker (as expressed through mood and modality) that together contribute to it being a trigger for offline harm. We also give a comparative analysis and mapping of our framework with some of the existing frameworks.

大語言模型 · MoDELS · 語言模型化 · 命名實體識別 · Extensibility ·

2024 年 3 月 15 日

Redefining Developer Assistance: Through Large Language Models in Software Ecosystem

Somnath Banerjee,Avik Dutta,Sayan Layek,Amruit Sahoo,Sam Conrad Joyce,Rima Hazra

from arxiv, Under review

In this paper, we delve into the advancement of domain-specific Large Language Models (LLMs) with a focus on their application in software development. We introduce DevAssistLlama, a model developed through instruction tuning, to assist developers in processing software-related natural language queries. This model, a variant of instruction tuned LLM, is particularly adept at handling intricate technical documentation, enhancing developer capability in software specific tasks. The creation of DevAssistLlama involved constructing an extensive instruction dataset from various software systems, enabling effective handling of Named Entity Recognition (NER), Relation Extraction (RE), and Link Prediction (LP). Our results demonstrate DevAssistLlama's superior capabilities in these tasks, in comparison with other models including ChatGPT. This research not only highlights the potential of specialized LLMs in software development also the pioneer LLM for this domain.

TEAM · 情景 · MoDELS · 模型評估 · 語言模型化 ·

2024 年 3 月 15 日

Team Trifecta at Factify5WQA: Setting the Standard in Fact Verification with Fine-Tuning

Shang-Hsuan Chiang,Ming-Chih Lo,Lin-Wei Chao,Wen-Chih Peng

from arxiv, Accepted by AAAI 2024 Workshop: FACTIFY 3.0 - Workshop Series on Multimodal Fact-Checking and Hate Speech Detection

In this paper, we present Pre-CoFactv3, a comprehensive framework comprised of Question Answering and Text Classification components for fact verification. Leveraging In-Context Learning, Fine-tuned Large Language Models (LLMs), and the FakeNet model, we address the challenges of fact verification. Our experiments explore diverse approaches, comparing different Pre-trained LLMs, introducing FakeNet, and implementing various ensemble methods. Notably, our team, Trifecta, secured first place in the AAAI-24 Factify 3.0 Workshop, surpassing the baseline accuracy by 103% and maintaining a 70% lead over the second competitor. This success underscores the efficacy of our approach and its potential contributions to advancing fact verification research.

估計/估計量 · 3D · Performer · Automator · 穩健性 ·

2024 年 3 月 14 日

ThermoHands: A Benchmark for 3D Hand Pose Estimation from Egocentric Thermal Image

Fangqiang Ding,Yunzhou Zhu,Xiangyu Wen,Chris Xiaoxuan Lu

from arxiv, 20 pages, 6 pages, 5 tables

In this work, we present ThermoHands, a new benchmark for thermal image-based egocentric 3D hand pose estimation, aimed at overcoming challenges like varying lighting and obstructions (e.g., handwear). The benchmark includes a diverse dataset from 28 subjects performing hand-object and hand-virtual interactions, accurately annotated with 3D hand poses through an automated process. We introduce a bespoken baseline method, TheFormer, utilizing dual transformer modules for effective egocentric 3D hand pose estimation in thermal imagery. Our experimental results highlight TheFormer's leading performance and affirm thermal imaging's effectiveness in enabling robust 3D hand pose estimation in adverse conditions.

控制器 · 級聯 · Microsoft Surface · MoDELS · Learning ·

2024 年 3 月 14 日

C3D: Cascade Control with Change Point Detection and Deep Koopman Learning for Autonomous Surface Vehicles

Jianwen Li,Hyunsang Park,Wenjian Hao,Lei Xin,Jalil Chavez-Galaviz,Ajinkya Chaudhary,Meredith Bloss,Kyle Pattison,Christopher Vo,Devesh Upadhyay,Shreyas Sundaram,Shaoshuai Mou,Nina Mahmoudian

from arxiv, This work has been submitted to the IEEE for possible publication. Copyright may be transferred without notice, after which this version may no longer be accessible

In this paper, we discuss the development and deployment of a robust autonomous system capable of performing various tasks in the maritime domain under unknown dynamic conditions. We investigate a data-driven approach based on modular design for ease of transfer of autonomy across different maritime surface vessel platforms. The data-driven approach alleviates issues related to a priori identification of system models that may become deficient under evolving system behaviors or shifting, unanticipated, environmental influences. Our proposed learning-based platform comprises a deep Koopman system model and a change point detector that provides guidance on domain shifts prompting relearning under severe exogenous and endogenous perturbations. Motion control of the autonomous system is achieved via an optimal controller design. The Koopman linearized model naturally lends itself to a linear-quadratic regulator (LQR) control design. We propose the C3D control architecture Cascade Control with Change Point Detection and Deep Koopman Learning. The framework is verified in station keeping task on an ASV in both simulation and real experiments. The approach achieved at least 13.9 percent improvement in mean distance error in all test cases compared to the methods that do not consider system changes.

知識 (knowledge) · 蒸餾 · 目標檢測 · Performance · MoDELS ·

2024 年 3 月 14 日

Knowledge Distillation in YOLOX-ViT for Side-Scan Sonar Object Detection

Martin Aubard,László Antal,Ana Madureira,Erika ábrahám

In this paper we present YOLOX-ViT, a novel object detection model, and investigate the efficacy of knowledge distillation for model size reduction without sacrificing performance. Focused on underwater robotics, our research addresses key questions about the viability of smaller models and the impact of the visual transformer layer in YOLOX. Furthermore, we introduce a new side-scan sonar image dataset, and use it to evaluate our object detector's performance. Results show that knowledge distillation effectively reduces false positives in wall detection. Additionally, the introduced visual transformer layer significantly improves object detection accuracy in the underwater environment. The source code of the knowledge distillation in the YOLOX-ViT is at //github.com/remaro-network/KD-YOLOX-ViT.

代價 · 回合 · 流 · MoDELS · 操作 ·

2024 年 3 月 13 日

COSTREAM: Learned Cost Models for Operator Placement in Edge-Cloud Environments

Roman Heinrich,Carsten Binnig,Harald Kornmayer,Manisha Luthra

from arxiv, This paper has been accepted by IEEE ICDE 2024

In this work, we present COSTREAM, a novel learned cost model for Distributed Stream Processing Systems that provides accurate predictions of the execution costs of a streaming query in an edge-cloud environment. The cost model can be used to find an initial placement of operators across heterogeneous hardware, which is particularly important in these environments. In our evaluation, we demonstrate that COSTREAM can produce highly accurate cost estimates for the initial operator placement and even generalize to unseen placements, queries, and hardware. When using COSTREAM to optimize the placements of streaming operators, a median speed-up of around 21x can be achieved compared to baselines.

全局優化 · 優化器 · 樣本 · Lipschitz · Continuity ·

2024 年 3 月 13 日

Stein Boltzmann Sampling: A Variational Approach for Global Optimization

Ga?tan Serré,Argyris Kalogeratos,Nicolas Vayatis

In this paper, we introduce a new flow-based method for global optimization of Lipschitz functions, called Stein Boltzmann Sampling (SBS). Our method samples from the Boltzmann distribution that becomes asymptotically uniform over the set of the minimizers of the function to be optimized. Candidate solutions are sampled via the \emph{Stein Variational Gradient Descent} algorithm. We prove the asymptotic convergence of our method, introduce two SBS variants, and provide a detailed comparison with several state-of-the-art global optimization algorithms on various benchmark functions. The design of our method, the theoretical results, and our experiments, suggest that SBS is particularly well-suited to be used as a continuation of efficient global optimization methods as it can produce better solutions while making a good use of the budget.

Taxonomy · 目標檢測 · 可辨認的 · 評論員 · HTTPS ·

2020 年 3 月 11 日

Imbalance Problems in Object Detection: A Review

Kemal Oksuz,Baris Can Cam,Sinan Kalkan,Emre Akbas

from arxiv, Accepted to IEEE TPAMI; currently in press

In this paper, we present a comprehensive review of the imbalance problems in object detection. To analyze the problems in a systematic manner, we introduce a problem-based taxonomy. Following this taxonomy, we discuss each problem in depth and present a unifying yet critical perspective on the solutions in the literature. In addition, we identify major open issues regarding the existing imbalance problems as well as imbalance problems that have not been discussed before. Moreover, in order to keep our review up to date, we provide an accompanying webpage which catalogs papers addressing imbalance problems, according to our problem-based taxonomy. Researchers can track newer studies on this webpage available at: //github.com/kemaloksuz/ObjectDetectionImbalance .