
Purpose: In this paper, we present a novel approach to the automatic evaluation of open surgery skills using depth cameras. This work is intended to show that depth cameras achieve results similar to those of RGB cameras, which are the common choice for the automatic evaluation of open surgery skills. Moreover, depth cameras offer advantages such as robustness to lighting variations and camera positioning, simplified data compression, and enhanced privacy, making them a promising alternative to RGB cameras. Methods: Expert and novice surgeons performed open suturing on two simulators. We focused on hand and tool detection, and on action segmentation in suturing procedures. YOLOv8 was used for tool detection in RGB and depth videos. Furthermore, UVAST and MSTCN++ were used for action segmentation. Our study includes the collection and annotation of a dataset recorded with Azure Kinect. Results: We demonstrated that using depth cameras in object detection and action segmentation achieves results comparable to those of RGB cameras. Furthermore, we analyzed 3D hand path length, revealing significant differences between expert and novice surgeons and emphasizing the potential of depth cameras in capturing surgical skills. We also investigated the influence of camera angles on measurement accuracy, highlighting the advantages of 3D cameras in providing a more accurate representation of hand movements. Conclusion: Our research contributes to advancing the field of surgical skill assessment by leveraging depth cameras for more reliable and privacy-preserving evaluations. The findings suggest that depth cameras can be valuable in assessing surgical skills and provide a foundation for future research in this area.
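The abstract does not say how hand path length is computed, so the following is only a minimal sketch of the general idea, assuming the hand position is available as a sequence of (x, y, z) points in metres. The trajectory values and the function names `path_length` and `drop_depth` are hypothetical, and dropping the z coordinate is a simplified (orthographic) stand-in for what a single view without depth would measure.

```python
import math


def path_length(points):
    """Sum of Euclidean distances between consecutive points."""
    return sum(math.dist(p, q) for p, q in zip(points, points[1:]))


def drop_depth(points):
    """Discard the depth (z) coordinate, keeping only image-plane motion."""
    return [(x, y) for x, y, _ in points]


# Hypothetical hand trajectory in metres: 3 cm sideways, then 4 cm toward the camera.
trajectory = [(0.00, 0.0, 0.50), (0.03, 0.0, 0.50), (0.03, 0.0, 0.46)]

length_3d = path_length(trajectory)              # counts both movements
length_2d = path_length(drop_depth(trajectory))  # misses motion along the depth axis

print(f"3D path length: {length_3d:.3f} m, 2D path length: {length_2d:.3f} m")
```

In this toy case the motion toward the camera is invisible once depth is dropped, which is one way a measurement without depth can depend on the camera angle.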

Related content

Kinect


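Kinect for Xbox 360, or simply Kinect, is a peripheral developed by Microsoft for the Xbox 360 console. It lets players operate the Xbox 360 system interface with voice commands or gestures instead of holding or stepping on a controller. It can also capture players' full-body movements so that they play games with their bodies, giving them a "controller-free gaming and entertainment experience." On June 1, 2009, Microsoft unveiled the sensor at the E3 game show under the name "Project Natal"; it can capture users' body movements and perform facial recognition. The sensor also has a built-in microphone that can be used to recognize voice commands. It is compatible with all Xbox 360 consoles, so players only need to purchase the sensor to use it. At E3 2010, Microsoft announced that the official name of Project Natal was "Kinect" and that it was expected to launch in the United States on November 4, 2010, at a suggested retail price of 149 US dollars. It launched in Taiwan on November 20, 2010.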

Unstructured · Knowledge · Language modeling · Performer · MoDELS

February 29, 2024

Updating Language Models with Unstructured Facts: Towards Practical Knowledge Editing

Xiaobao Wu,Liangming Pan,William Yang Wang,Anh Tuan Luu

Knowledge editing aims to inject knowledge updates into language models to keep them correct and up-to-date. However, its current evaluation strategies are notably impractical: they solely update with well-curated structured facts (triplets with subjects, relations, and objects), whereas real-world knowledge updates commonly emerge in unstructured texts like news articles. In this paper, we propose a new benchmark, Unstructured Knowledge Editing (UKE). It evaluates editing performance directly using unstructured texts as knowledge updates, termed unstructured facts. Hence UKE avoids the laborious construction of structured facts and enables efficient and responsive knowledge editing, making it a more practical benchmark. We conduct extensive experiments on newly built datasets and demonstrate that UKE poses a significant challenge to state-of-the-art knowledge editing methods, causing critical declines in their performance. We further show that this challenge persists even if we extract triplets as structured facts. Our analysis discloses key insights to motivate future research in UKE for more practical knowledge editing.

Estimation/Estimator · State estimation · Agent · Biased · Model evaluation

February 28, 2024

Dual-IMU State Estimation for Relative Localization of Two Mobile Agents

Wenqian Lai,Ruonan Guo,Kejian J. Wu

In this paper, we address the problem of relative localization of two mobile agents. Specifically, we consider the Dual-IMU system, where each agent is equipped with one IMU and relative pose observations between the agents are employed. Previous works, however, typically assumed known ego motion and ignored the biases of the IMUs. Instead, we study the most general case of unknown biases for both IMUs. Besides deriving the dynamic model equations of the proposed system, we focus on the observability analysis, covering both observability under general motion and the unobservable directions that arise under various special motions. Through numerical simulations, we validate our key observability findings and examine their impact on estimation accuracy and consistency. Finally, the system is implemented to achieve effective relative localization of an HMD with respect to a vehicle moving in the real world.

Global optimization · Optimizer · Sample · Lipschitz · Continuity

February 28, 2024

Stein Boltzmann Sampling: A Variational Approach for Global Optimization

Gaëtan Serré,Argyris Kalogeratos,Nicolas Vayatis

In this paper, we introduce a new flow-based method for global optimization of Lipschitz functions, called Stein Boltzmann Sampling (SBS). Our method samples from the Boltzmann distribution, which becomes asymptotically uniform over the set of minimizers of the function to be optimized. Candidate solutions are sampled via the Stein Variational Gradient Descent algorithm. We prove the asymptotic convergence of our method, introduce two SBS variants, and provide a detailed comparison with several state-of-the-art global optimization algorithms on various benchmark functions. The design of our method, the theoretical results, and our experiments suggest that SBS is particularly well-suited to be used as a continuation of efficient global optimization methods, as it can produce better solutions while making good use of the budget.
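To make the idea of sampling a Boltzmann distribution with Stein Variational Gradient Descent concrete, here is a minimal one-dimensional sketch. It is not the authors' SBS implementation or either of its variants: the target density p(x) ∝ exp(-kappa * f(x)), the RBF kernel, the fixed bandwidth, the step size, the objective f(x) = (x - 1)^2, and the function name `svgd_boltzmann_1d` are all assumptions chosen for illustration.

```python
import math


def svgd_boltzmann_1d(grad_f, particles, kappa=10.0, bandwidth=0.5,
                      step_size=0.01, n_steps=1000):
    """Move particles toward p(x) proportional to exp(-kappa * f(x)) with SVGD."""
    xs = list(particles)
    n = len(xs)
    for _ in range(n_steps):
        # Score of the Boltzmann density: d/dx log p(x) = -kappa * f'(x).
        scores = [-kappa * grad_f(x) for x in xs]
        updates = []
        for xi in xs:
            phi = 0.0
            for xj, sj in zip(xs, scores):
                diff = xi - xj
                k = math.exp(-diff * diff / (2.0 * bandwidth ** 2))  # RBF kernel
                # Kernel-weighted attraction to high density, plus the kernel
                # gradient term that pushes particles apart.
                phi += k * sj + k * diff / bandwidth ** 2
            updates.append(phi / n)
        xs = [x + step_size * u for x, u in zip(xs, updates)]
    return xs


def grad_f(x):
    """Gradient of the hypothetical objective f(x) = (x - 1)^2 (minimizer at x = 1)."""
    return 2.0 * (x - 1.0)


final_particles = svgd_boltzmann_1d(grad_f, [-2.0, -1.0, 0.0, 1.0, 2.0])
mean_position = sum(final_particles) / len(final_particles)
spread = max(final_particles) - min(final_particles)
print(final_particles, mean_position, spread)
```

The particles gather around the minimizer while the repulsive kernel term keeps them from collapsing onto a single point. A larger `kappa` concentrates the Boltzmann density more tightly around the minimizers.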

Wireless Networks · Cluster · Networks · Optimizer · Block

February 28, 2024

Bandwidth Efficient Livestreaming in Mobile Wireless Networks: A Peer-to-Peer ACIDE Solution

Andrei Negulescu,Weijia Shang

from arxiv, 13 pages, 12 figures, 3 tables. This work has been submitted to the IEEE for possible publication. Copyright may be transferred without notice, after which this version may no longer be accessible.

In this paper, a media distribution model, Active Control in an Intelligent and Distributed Environment (ACIDE), is proposed for bandwidth-efficient livestreaming in mobile wireless networks. Two optimization problems are addressed. The first problem is how to minimize the bandwidth allocated to a cluster of n peers such that continuous media playback is guaranteed for all peers. The second problem is how to find the maximum number of peers n, chosen from a group of N users, that can be admitted to a cluster given the allocated bandwidth, that is, the amount of bandwidth that a base station allocates to a cluster prior to admitting users. Media is sent in packages, and each package is divided into n blocks. The distribution of blocks to the peers follows a two-phase, multi-step approach. For the first problem, a solution is proposed to find the optimal block sizes such that the allocated bandwidth is minimized; its lower bound is the bandwidth required for multicasting. The second problem is NP-complete, and a greedy strategy is proposed to calculate a near-optimal solution for peer selection such that the network capacity, the total number of users who are able to access livestream media, increases.

INTERACT · Trace · Principle · Python · Understandability

February 26, 2024

Anteater: Interactive Visualization of Program Execution Values in Context

Rebecca Faust,Katherine Isaacs,William Z. Bernstein,Michael Sharp,Carlos Scheidegger

from arxiv, 31 pages, 9 figures, 3 tables

Debugging is famously one of the hardest parts of programming. In this paper, we tackle the question: what does a debugging environment look like when we take interactive visualization as a central design principle? We introduce Anteater, an interactive visualization system for tracing and exploring the execution of Python programs. Existing systems often have visualization components built on top of an existing infrastructure. In contrast, Anteater's organization of trace data enables an intermediate representation which can be leveraged to automatically synthesize a variety of visualizations and interactions. These interactive visualizations help with tasks such as discovering important structures in the execution and understanding and debugging unexpected behaviors. To assess the utility of Anteater, we conducted a participant study where programmers completed tasks on their own Python programs using Anteater. Finally, we discuss limitations and where further research is needed.

Multimodal · MoDELS · Performer · Integration · Language modeling

February 19, 2024

The (R)Evolution of Multimodal Large Language Models: A Survey

Davide Caffagni,Federico Cocchi,Luca Barsellotti,Nicholas Moratelli,Sara Sarto,Lorenzo Baraldi,Lorenzo Baraldi,Marcella Cornia,Rita Cucchiara

Connecting text and visual modalities plays an essential role in generative intelligence. For this reason, inspired by the success of large language models, significant research efforts are being devoted to the development of Multimodal Large Language Models (MLLMs). These models can seamlessly integrate visual and textual modalities, both as input and output, while providing a dialogue-based interface and instruction-following capabilities. In this paper, we provide a comprehensive review of recent visual-based MLLMs, analyzing their architectural choices, multimodal alignment strategies, and training techniques. We also conduct a detailed analysis of these models across a wide range of tasks, including visual grounding, image generation and editing, visual understanding, and domain-specific applications. Additionally, we compile and describe training datasets and evaluation benchmarks, conducting comparisons among existing models in terms of performance and computational requirements. Overall, this survey offers a comprehensive overview of the current state of the art, laying the groundwork for future MLLMs.

FRN · INFORMS · Networking · MoDELS · Learning

April 12, 2021

Feature Decomposition and Reconstruction Learning for Effective Facial Expression Recognition

Delian Ruan,Yan Yan,Shenqi Lai,Zhenhua Chai,Chunhua Shen,Hanzi Wang

from arxiv, IEEE/CVF Conference on Computer Vision and Pattern Recognition 2021 (CVPR 2021)

In this paper, we propose a novel Feature Decomposition and Reconstruction Learning (FDRL) method for effective facial expression recognition. We view the expression information as the combination of the shared information (expression similarities) across different expressions and the unique information (expression-specific variations) for each expression. More specifically, FDRL mainly consists of two crucial networks: a Feature Decomposition Network (FDN) and a Feature Reconstruction Network (FRN). In particular, FDN first decomposes the basic features extracted from a backbone network into a set of facial action-aware latent features to model expression similarities. Then, FRN captures the intra-feature and inter-feature relationships for latent features to characterize expression-specific variations, and reconstructs the expression feature. To this end, two modules including an intra-feature relation modeling module and an inter-feature relation modeling module are developed in FRN. Experimental results on both the in-the-lab databases (including CK+, MMI, and Oulu-CASIA) and the in-the-wild databases (including RAF-DB and SFEW) show that the proposed FDRL method consistently achieves higher recognition accuracy than several state-of-the-art methods. This clearly highlights the benefit of feature decomposition and reconstruction for classifying expressions.

Taxonomy · Object detection · Identifiable · Reviewer · HTTPS

March 11, 2020

Imbalance Problems in Object Detection: A Review

Kemal Oksuz,Baris Can Cam,Sinan Kalkan,Emre Akbas

from arxiv, Accepted to IEEE TPAMI; currently in press

In this paper, we present a comprehensive review of the imbalance problems in object detection. To analyze the problems in a systematic manner, we introduce a problem-based taxonomy. Following this taxonomy, we discuss each problem in depth and present a unifying yet critical perspective on the solutions in the literature. In addition, we identify major open issues regarding the existing imbalance problems as well as imbalance problems that have not been discussed before. Moreover, in order to keep our review up to date, we provide an accompanying webpage which catalogs papers addressing imbalance problems, according to our problem-based taxonomy. Researchers can track newer studies on this webpage, available at: https://github.com/kemaloksuz/ObjectDetectionImbalance.

Image retrieval · University of Oxford · Extensibility · Dataset · Performer

March 29, 2018

Revisiting Oxford and Paris: Large-Scale Image Retrieval Benchmarking

Filip Radenović,Ahmet Iscen,Giorgos Tolias,Yannis Avrithis,Ondřej Chum

from arxiv, CVPR 2018

In this paper, we address issues with image retrieval benchmarking on the standard and popular Oxford 5k and Paris 6k datasets. In particular, annotation errors, the size of the dataset, and the level of challenge are addressed: new annotation for both datasets is created with extra attention to the reliability of the ground truth. Three new protocols of varying difficulty are introduced. The protocols allow fair comparison between different methods, including those using a dataset pre-processing stage. For each dataset, 15 new challenging queries are introduced. Finally, a new set of 1M hard, semi-automatically cleaned distractors is selected. An extensive comparison of the state-of-the-art methods is performed on the new benchmark. Different types of methods are evaluated, ranging from local-feature-based to modern CNN-based methods. The best results are achieved by taking the best of the two worlds. Most importantly, image retrieval appears far from being solved.

Rank · Object detection · Performer · Ranking · DATE

March 14, 2018

Revisiting Salient Object Detection: Simultaneous Detection, Ranking, and Subitizing of Multiple Salient Objects

Md Amirul Islam,Mahmoud Kalash,Neil D. B. Bruce

from arxiv, To appear in CVPR 2018

Salient object detection is a problem that has been considered in detail, and many solutions have been proposed. In this paper, we argue that work to date has addressed a problem that is relatively ill-posed. Specifically, there is no universal agreement about what constitutes a salient object when multiple observers are queried. This implies that some objects are more likely to be judged salient than others, and that a relative rank exists among salient objects. The solution presented in this paper addresses this more general problem that considers relative rank, and we propose data and metrics suitable for measuring success in a relative object saliency landscape. A novel deep learning solution is proposed based on a hierarchical representation of relative saliency and stage-wise refinement. We also show that the problem of salient object subitizing can be addressed with the same network, and our approach exceeds the performance of any prior work across all metrics considered (both traditional and newly proposed).