爱琴海论坛视频播放三免费_国产欧美日韩视频一区二区_久久久久成人精品一区二区_亚洲国区二区三区_WWW欧美三级片亚洲三级片_91在线视频网站首页_久久久久久中文字幕大全免费看

Screening colonoscopy is an important clinical application for several 3D computer vision techniques, including depth estimation, surface reconstruction, and missing region detection. However, the development, evaluation, and comparison of these techniques in real colonoscopy videos remain largely qualitative due to the difficulty of acquiring ground truth data. In this work, we present a Colonoscopy 3D Video Dataset (C3VD) acquired with a high definition clinical colonoscope and high-fidelity colon models for benchmarking computer vision methods in colonoscopy. We introduce a novel multimodal 2D-3D registration technique to register optical video sequences with ground truth rendered views of a known 3D model. The different modalities are registered by transforming optical images to depth maps with a Generative Adversarial Network and aligning edge features with an evolutionary optimizer. This registration method achieves an average translation error of 0.321 millimeters and an average rotation error of 0.159 degrees in simulation experiments where error-free ground truth is available. The method also leverages video information, improving registration accuracy by 55.6% for translation and 60.4% for rotation compared to single frame registration. 22 short video sequences were registered to generate 10,015 total frames with paired ground truth depth, surface normals, optical flow, occlusion, six degree-of-freedom pose, coverage maps, and 3D models. The dataset also includes screening videos acquired by a gastroenterologist with paired ground truth pose and 3D surface models. The dataset and registration source code are available at durr.jhu.edu/C3VD.

相關內容

真實值

關注 0

示例 · 值域 · Vision · Learning · MoDELS ·

2023 年 10 月 19 日

Deep Learning Techniques for Video Instance Segmentation: A Survey

Chenhao Xu,Chang-Tsun Li,Yongjian Hu,Chee Peng Lim,Douglas Creighton

Video instance segmentation, also known as multi-object tracking and segmentation, is an emerging computer vision research area introduced in 2019, aiming at detecting, segmenting, and tracking instances in videos simultaneously. By tackling the video instance segmentation tasks through effective analysis and utilization of visual information in videos, a range of computer vision-enabled applications (e.g., human action recognition, medical image processing, autonomous vehicle navigation, surveillance, etc) can be implemented. As deep-learning techniques take a dominant role in various computer vision areas, a plethora of deep-learning-based video instance segmentation schemes have been proposed. This survey offers a multifaceted view of deep-learning schemes for video instance segmentation, covering various architectural paradigms, along with comparisons of functional performance, model complexity, and computational overheads. In addition to the common architectural designs, auxiliary techniques for improving the performance of deep-learning models for video instance segmentation are compiled and discussed. Finally, we discuss a range of major challenges and directions for further investigations to help advance this promising research field.

估計/估計量 · 協變量偏移 · Conformer · CP · MoDELS ·

2023 年 10 月 18 日

Conformal Drug Property Prediction with Density Estimation under Covariate Shift

Siddhartha Laghuvarapu,Zhen Lin,Jimeng Sun

from arxiv, Accepted at NeurIPS 2023

In drug discovery, it is vital to confirm the predictions of pharmaceutical properties from computational models using costly wet-lab experiments. Hence, obtaining reliable uncertainty estimates is crucial for prioritizing drug molecules for subsequent experimental validation. Conformal Prediction (CP) is a promising tool for creating such prediction sets for molecular properties with a coverage guarantee. However, the exchangeability assumption of CP is often challenged with covariate shift in drug discovery tasks: Most datasets contain limited labeled data, which may not be representative of the vast chemical space from which molecules are drawn. To address this limitation, we propose a method called CoDrug that employs an energy-based model leveraging both training data and unlabelled data, and Kernel Density Estimation (KDE) to assess the densities of a molecule set. The estimated densities are then used to weigh the molecule samples while building prediction sets and rectifying for distribution shift. In extensive experiments involving realistic distribution drifts in various small-molecule drug discovery tasks, we demonstrate the ability of CoDrug to provide valid prediction sets and its utility in addressing the distribution shift arising from de novo drug design models. On average, using CoDrug can reduce the coverage gap by over 35% when compared to conformal prediction sets not adjusted for covariate shift.

可辨認的 · Automator · Prompt · Guidance · 語言模型化 ·

2023 年 10 月 17 日

Evaluating LLMs for Privilege-Escalation Scenarios

Andreas Happe,Aaron Kaplan,Jürgen Cito

Penetration testing, an essential component of cybersecurity, allows organizations to proactively identify and remediate vulnerabilities in their systems, thus bolstering their defense mechanisms against potential cyberattacks. One recent advancement in the realm of penetration testing is the utilization of Language Models (LLMs). We explore the intersection of LLMs and penetration testing to gain insight into their capabilities and challenges in the context of privilige escalation. We create an automated Linux privilege-escalation benchmark utilizing local virtual machines. We introduce an LLM-guided privilege-escalation tool designed for evaluating different LLMs and prompt strategies against our benchmark. We analyze the impact of different prompt designs, the benefits of in-context learning, and the advantages of offering high-level guidance to LLMs. We discuss challenging areas for LLMs, including maintaining focus during testing, coping with errors, and finally comparing them with both stochastic parrots as well as with human hackers.

操作 · INTERACT · AIM · 相關系數 · Better ·

2023 年 10 月 17 日

Towards Operationalizing Social Bonding in Human-Robot Dyads

Imran Khan

With momentum increasing in the use of social robots as long-term assistive and collaborative partners, humans developing social bonds with these artificial agents appears to be inevitable. In human-human dyads, social bonding plays a powerful role in regulating behaviours, emotions, and even health. If this is to extend to human-robot dyads, the phenomenology of such relationships (including their emergence and stability) must be better understood. In this paper, we discuss potential approaches towards operationalizing the phenomenon of social bonding between human-robot dyads. We will discuss a number of biobehavioural proxies of social bonding, moving away from existing approaches that use subjective, psychological measures, and instead grounding our approach in some of the evolutionary, neurobiological and physiological correlates of social bond formation in natural systems: (a) reductions in physiological stress (the ''social buffering'' phenomenon), (b) narrowing of spatial proximity between dyads, and (c) inter-dyad behavioural synchrony. We provide relevant evolutionary support for each proposed component, with suggestions and considerations for how they can be recorded in (real-time) human-robot interaction scenarios. With this, we aim to inspire more robust operationalisation of ''social bonding'' between human and artificial (robotic) agents.

知識提取 · Performer · MoDELS · Vision · 知識 (knowledge) ·

2023 年 10 月 17 日

Knowledge Extraction and Distillation from Large-Scale Image-Text Colonoscopy Records Leveraging Large Language and Vision Models

Shuo Wang,Yan Zhu,Xiaoyuan Luo,Zhiwei Yang,Yizhe Zhang,Peiyao Fu,Manning Wang,Zhijian Song,Quanlin Li,Pinghong Zhou,Yike Guo

The development of artificial intelligence systems for colonoscopy analysis often necessitates expert-annotated image datasets. However, limitations in dataset size and diversity impede model performance and generalisation. Image-text colonoscopy records from routine clinical practice, comprising millions of images and text reports, serve as a valuable data source, though annotating them is labour-intensive. Here we leverage recent advancements in large language and vision models and propose EndoKED, a data mining paradigm for deep knowledge extraction and distillation. EndoKED automates the transformation of raw colonoscopy records into image datasets with pixel-level annotation. We validate EndoKED using multi-centre datasets of raw colonoscopy records (~1 million images), demonstrating its superior performance in training polyp detection and segmentation models. Furthermore, the EndoKED pre-trained vision backbone enables data-efficient and generalisable learning for optical biopsy, achieving expert-level performance in both retrospective and prospective validation.

Learning · Extensibility · 查準率/準確率 · 操作 · Integration ·

2023 年 10 月 17 日

Hyperspectral In-Memory Computing with Optical Frequency Combs and Programmable Optical Memories

Mostafa Honari Latifpour,Byoung Jun Park,Yoshihisa Yamamoto,Myoung-Gyun Suh

The rapid advancements in machine learning across numerous industries have amplified the demand for extensive matrix-vector multiplication operations, thereby challenging the capacities of traditional von Neumann computing architectures. To address this, researchers are currently exploring alternatives such as in-memory computing systems to develop faster and more energy-efficient hardware. In particular, there is renewed interest in computing systems based on optics, which could potentially handle matrix-vector multiplication in a more energy-efficient way. Despite promising initial results, developing a highly parallel, programmable, and scalable optical computing system capable of rivaling electronic computing hardware still remains elusive. In this context, we propose a hyperspectral in-memory computing architecture that integrates space multiplexing with frequency multiplexing of optical frequency combs and uses spatial light modulators as a programmable optical memory, thereby boosting the computational throughput and the energy efficiency. We have experimentally demonstrated multiply-accumulate operations with higher than 4-bit precision in both matrix-vector and matrix-matrix multiplications, which suggests the system's potential for a wide variety of deep learning and optimization tasks. This system exhibits extraordinary modularity, scalability, and programmability, effectively transcending the traditional limitations of optics-based computing architectures. Our approach demonstrates the potential to scale beyond peta operations per second, marking a significant step towards achieving high-throughput energy-efficient optical computing.

圖 · 核化 · 圖形處理器 · Neural Networks · Networking ·

2023 年 10 月 16 日

Neural Tangent Kernels Motivate Graph Neural Networks with Cross-Covariance Graphs

Shervin Khalafi,Saurabh Sihag,Alejandro Ribeiro

Neural tangent kernels (NTKs) provide a theoretical regime to analyze the learning and generalization behavior of over-parametrized neural networks. For a supervised learning task, the association between the eigenvectors of the NTK kernel and given data (a concept referred to as alignment in this paper) can govern the rate of convergence of gradient descent, as well as generalization to unseen data. Building upon this concept, we investigate NTKs and alignment in the context of graph neural networks (GNNs), where our analysis reveals that optimizing alignment translates to optimizing the graph representation or the graph shift operator in a GNN. Our results further establish the theoretical guarantees on the optimality of the alignment for a two-layer GNN and these guarantees are characterized by the graph shift operator being a function of the cross-covariance between the input and the output data. The theoretical insights drawn from the analysis of NTKs are validated by our experiments focused on a multi-variate time series prediction task for a publicly available dataset. Specifically, they demonstrate that GNNs with cross-covariance as the graph shift operator indeed outperform those that operate on the covariance matrix from only the input data.

知識 (knowledge) · Machine Learning · MoDELS · 學成 · Conformer ·

2022 年 5 月 10 日

Knowledge Augmented Machine Learning with Applications in Autonomous Driving: A Survey

Julian W?rmann,Daniel Bogdoll,Etienne Bührle,Han Chen,Evaristus Fuh Chuo,Kostadin Cvejoski,Ludger van Elst,Tobias Glei?ner,Philip Gottschall,Stefan Griesche,Christian Hellert,Christian Hesels,Sebastian Houben,Tim Joseph,Niklas Keil,Johann Kelsch,Hendrik K?nigshof,Erwin Kraft,Leonie Kreuser,Kevin Krone,Tobias Latka,Denny Mattern,Stefan Matthes,Mohsin Munir,Moritz Nekolla,Adrian Paschke,Maximilian Alexander Pintz,Tianming Qiu,Faraz Qureishi,Syed Tahseen Raza Rizvi,J?rg Reichardt,Laura von Rueden,Stefan Rudolph,Alexander Sagel,Gerhard Schunk,Hao Shen,Hendrik Stapelbroek,Vera Stehr,Gurucharan Srinivas,Anh Tuan Tran,Abhishek Vivekanandan,Ya Wang,Florian Wasserrab,Tino Werner,Christian Wirth,Stefan Zwicklbauer

from arxiv, 93 pages

The existence of representative datasets is a prerequisite of many successful artificial intelligence and machine learning models. However, the subsequent application of these models often involves scenarios that are inadequately represented in the data used for training. The reasons for this are manifold and range from time and cost constraints to ethical considerations. As a consequence, the reliable use of these models, especially in safety-critical applications, is a huge challenge. Leveraging additional, already existing sources of knowledge is key to overcome the limitations of purely data-driven approaches, and eventually to increase the generalization capability of these models. Furthermore, predictions that conform with knowledge are crucial for making trustworthy and safe decisions even in underrepresented scenarios. This work provides an overview of existing techniques and methods in the literature that combine data-based models with existing knowledge. The identified approaches are structured according to the categories integration, extraction and conformity. Special attention is given to applications in the field of autonomous driving.

INFORMS · 可辨認的 · Networking · Neural Networks · 黑盒 ·

2021 年 10 月 4 日

Fine-Grained Neural Network Explanation by Identifying Input Features with Predictive Information

Yang Zhang,Ashkan Khakzar,Yawei Li,Azade Farshad,Seong Tae Kim,Nassir Navab

from arxiv, Accepted in NeurIPS 2021 (Neural Information Processing Systems)

One principal approach for illuminating a black-box neural network is feature attribution, i.e. identifying the importance of input features for the network's prediction. The predictive information of features is recently proposed as a proxy for the measure of their importance. So far, the predictive information is only identified for latent features by placing an information bottleneck within the network. We propose a method to identify features with predictive information in the input domain. The method results in fine-grained identification of input features' information and is agnostic to network architecture. The core idea of our method is leveraging a bottleneck on the input that only lets input features associated with predictive latent features pass through. We compare our method with several feature attribution methods using mainstream feature attribution evaluation experiments. The code is publicly available.

INFORMS · 圖 · 可約的 · 知識圖譜 · 可辨認的 ·

2018 年 8 月 29 日

Multi-Task Identification of Entities, Relations, and Coreference for Scientific Knowledge Graph Construction

Yi Luan,Luheng He,Mari Ostendorf,Hannaneh Hajishirzi

We introduce a multi-task setup of identifying and classifying entities, relations, and coreference clusters in scientific articles. We create SciERC, a dataset that includes annotations for all three tasks and develop a unified framework called Scientific Information Extractor (SciIE) for with shared span representations. The multi-task setup reduces cascading errors between tasks and leverages cross-sentence relations through coreference links. Experiments show that our multi-task model outperforms previous models in scientific information extraction without using any domain-specific features. We further show that the framework supports construction of a scientific knowledge graph, which we use to analyze information in scientific literature.