Neural-network-based image- and video-quality metrics now outperform traditional methods. However, they have also become more vulnerable to adversarial attacks that increase a metric's score without improving visual quality. Existing benchmarks of quality metrics compare performance in terms of correlation with subjective quality and computation time; the adversarial robustness of image-quality metrics, however, also deserves systematic study. In this paper, we analyse the robustness of modern metrics to different adversarial attacks. We adapted adversarial attacks from computer-vision tasks and compared their efficiency against 15 no-reference image- and video-quality metrics. Some metrics showed high resistance to adversarial attacks, which makes their use in benchmarks safer than that of vulnerable metrics. The benchmark accepts submissions of new metrics from researchers who want to make their metrics more robust to attacks or to find robust metrics for their needs. Try our benchmark using pip install robustness-benchmark.
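As a hedged illustration of the kind of attack applied in such a benchmark, the sketch below runs a single FGSM-style gradient step that inflates the score of a differentiable no-reference metric. The metric_model callable and the step size are stand-in assumptions, not the benchmark's actual API.

    import torch

    def fgsm_boost_score(metric_model, image, eps=2.0 / 255):
        """One FGSM step that raises a differentiable NR metric's score.
        metric_model: maps a (1, 3, H, W) tensor in [0, 1] to a scalar score.
        eps: per-pixel perturbation budget (assumed value).
        """
        image = image.clone().detach().requires_grad_(True)
        score = metric_model(image)
        score.backward()  # gradient of the score w.r.t. the pixels
        # Ascend the score and clamp back to the valid pixel range.
        return (image + eps * image.grad.sign()).clamp(0.0, 1.0).detach()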
We introduce a novel Dual Input Stream Transformer (DIST) for the challenging problem of assigning fixation points from eye-tracking data collected during passage reading to the line of text that the reader was actually focused on. This post-processing step is crucial for analysis of the reading data due to the presence of noise in the form of vertical drift. We evaluate DIST against eleven classical approaches on a comprehensive suite of nine diverse datasets. We demonstrate that combining multiple instances of the DIST model in an ensemble achieves high accuracy across all datasets. Further combining the DIST ensemble with the best classical approach yields an average accuracy of 98.17%. Our approach presents a significant step towards addressing the bottleneck of manual line assignment in reading research. Through extensive analysis and ablation studies, we identify key factors that contribute to DIST's success, including the incorporation of line overlap features and the use of a second input stream. Via rigorous evaluation, we demonstrate that DIST is robust to various experimental setups, making it a safe first choice for practitioners in the field.
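A minimal sketch of the ensembling idea, assuming each DIST instance outputs one line index per fixation; the majority-vote rule and array shapes are illustrative assumptions, not the paper's exact combination scheme.

    import numpy as np

    def ensemble_line_assignment(predictions):
        """Majority vote across model instances.
        predictions: (n_models, n_fixations) integer array of line indices.
        Returns one line index per fixation.
        """
        preds = np.asarray(predictions)
        return np.array([np.bincount(col).argmax() for col in preds.T])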
This article introduces Lester, a novel method to automatically synthesize retro-style 2D animations from videos. The method approaches the challenge mainly as an object segmentation and tracking problem. Video frames are processed with the Segment Anything Model (SAM), and the resulting masks are tracked through subsequent frames with DeAOT, a method of hierarchical propagation for semi-supervised video object segmentation. The geometry of the masks' contours is simplified with the Douglas-Peucker algorithm. Finally, facial traits, pixelation and a basic shadow effect can be optionally added. The results show that the method exhibits excellent temporal consistency and can correctly process videos with different poses and appearances, dynamic shots, partial shots and diverse backgrounds. The proposed method provides a simpler and more deterministic approach than video-to-video translation pipelines based on diffusion models, which suffer from temporal consistency problems and do not cope well with pixelated and schematic outputs. The method is also much more practical than techniques based on 3D human pose estimation, which require custom handcrafted 3D models and are very limited with respect to the type of scenes they can process.
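For the contour-simplification step, a minimal sketch using OpenCV's implementation of the Douglas-Peucker algorithm (cv2.approxPolyDP); the binary-mask input format and the tolerance fraction are illustrative assumptions, not Lester's actual parameters.

    import cv2

    def simplify_mask_contour(mask, epsilon_frac=0.01):
        """Simplify the outer contour of a binary mask with Douglas-Peucker.
        mask: uint8 array, nonzero inside the object.
        epsilon_frac: tolerance as a fraction of the perimeter (assumed value).
        """
        contours, _ = cv2.findContours(mask, cv2.RETR_EXTERNAL,
                                       cv2.CHAIN_APPROX_SIMPLE)
        contour = max(contours, key=cv2.contourArea)  # keep the largest region
        epsilon = epsilon_frac * cv2.arcLength(contour, True)
        return cv2.approxPolyDP(contour, epsilon, True)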
Previous research has shown that Artificial Intelligence is capable of distinguishing between authentic paintings by a given artist and human-made forgeries with remarkable accuracy, provided sufficient training. However, given the limited number of known forgeries, augmentation methods for forgery detection are highly desirable. In this work, we examine the potential of incorporating synthetic artworks into training datasets to enhance the performance of forgery detection. Our investigation focuses on paintings by Vincent van Gogh, for which we release the first dataset specialized for forgery detection. To reinforce our results, we conduct the same analyses for the artists Amedeo Modigliani and Raphael. We train a classifier to distinguish original artworks from forgeries. For this, we use human-made forgeries and imitations in the style of well-known artists and augment our training sets with images in a similar style generated by Stable Diffusion and StyleGAN. We find that the additional synthetic forgeries consistently improve the detection of human-made forgeries. In addition, we find that, in line with previous research, the inclusion of synthetic forgeries in the training also enables the detection of AI-generated forgeries, especially if they were created using a similar generator.
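A minimal sketch of the augmentation setup under stated assumptions: directories of authentic works, human-made forgeries, and generator outputs, with a ratio knob controlling how many synthetic images are mixed in. The paths and labeling convention are hypothetical.

    from pathlib import Path

    def build_training_set(authentic_dir, forgery_dir, synthetic_dir,
                           synthetic_ratio=0.5):
        """Label authentic works 0 and all forgeries (human or synthetic) 1.
        synthetic_ratio: synthetic images added per human-made forgery (assumed).
        """
        authentic = [(p, 0) for p in sorted(Path(authentic_dir).glob("*.jpg"))]
        forgeries = [(p, 1) for p in sorted(Path(forgery_dir).glob("*.jpg"))]
        synthetic = [(p, 1) for p in sorted(Path(synthetic_dir).glob("*.jpg"))]
        n_syn = int(synthetic_ratio * len(forgeries))
        return authentic + forgeries + synthetic[:n_syn]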
Approaches based on deep convolutional neural networks (CNNs) have achieved strong performance in video matting. Many of these methods can produce accurate alpha estimates for the target body but typically yield fuzzy or incorrect target edges. This usually stems from two causes: 1) current methods treat the target body and edge indiscriminately; 2) the target body dominates the whole target, with the edge accounting for only a tiny proportion. For the first problem, we propose a CNN-based module that separately optimizes the matting target body and edge (SOBE). On this basis, we introduce a real-time, trimap-free video matting method via progressively optimizing the matting target body and edge (POBEVM) that is much lighter than previous approaches and achieves significant improvements in the predicted target edge. For the second problem, we propose an Edge-L1-Loss (ELL) function that focuses our network on the matting target edge. Experiments demonstrate that our method outperforms prior trimap-free matting methods on both the Distinctions-646 (D646) and VideoMatte240K (VM) datasets, especially in edge optimization.
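The abstract does not spell out the Edge-L1-Loss; the sketch below shows one common way to realize such an edge-focused loss, deriving an edge band from the ground-truth alpha via morphological dilation minus erosion and applying L1 only inside that band. The band width and threshold are assumptions, not the paper's definition.

    import torch.nn.functional as F

    def edge_l1_loss(pred_alpha, gt_alpha, band=5):
        """L1 loss restricted to a band around the ground-truth matte edge.
        pred_alpha, gt_alpha: (N, 1, H, W) tensors in [0, 1].
        band: morphological kernel size controlling edge-band width (assumed).
        """
        pad = band // 2
        dilated = F.max_pool2d(gt_alpha, band, stride=1, padding=pad)
        eroded = -F.max_pool2d(-gt_alpha, band, stride=1, padding=pad)
        edge = (dilated - eroded) > 0.01  # transition region = edge band
        return F.l1_loss(pred_alpha[edge], gt_alpha[edge])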
The newly released Segment Anything Model (SAM) is a popular tool for image processing due to its superior segmentation accuracy, variety of input prompts, training capabilities, and efficient model design. However, its current model is trained on a diverse dataset not tailored to medical images, particularly ultrasound images. Ultrasound images tend to contain substantial noise, making it difficult to segment out important structures. In this project, we developed ClickSAM, which fine-tunes the Segment Anything Model using click prompts for ultrasound images. ClickSAM has two stages of training: the first stage is trained on single-click prompts centered in the ground-truth contours, and the second stage focuses on improving the model's performance through additional positive and negative click prompts. By comparing the first-stage predictions to the ground-truth masks, true-positive, false-positive, and false-negative segments are calculated. Positive clicks are generated using the true-positive and false-negative segments, and negative clicks are generated using the false-positive segments. The Centroidal Voronoi Tessellation algorithm is then employed to collect positive and negative click prompts in each segment, which are used to enhance the model's performance during the second stage of training. With this click-based training, ClickSAM exhibits superior performance compared to other existing models for ultrasound image segmentation.
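A minimal sketch of the click-derivation step under stated assumptions: predictions and ground truth are binary masks, and one click is placed at the centroid of each error region, standing in for the full Centroidal Voronoi Tessellation sampling.

    import numpy as np

    def generate_clicks(pred_mask, gt_mask):
        """Derive clicks from first-stage predictions.
        pred_mask, gt_mask: boolean arrays of equal shape.
        Positive clicks come from TP and FN segments, negative from FP segments.
        """
        def centroid(mask):
            ys, xs = np.nonzero(mask)
            return (int(ys.mean()), int(xs.mean())) if len(ys) else None
        true_pos = pred_mask & gt_mask
        false_neg = ~pred_mask & gt_mask
        false_pos = pred_mask & ~gt_mask
        positive = [c for c in (centroid(true_pos), centroid(false_neg)) if c]
        negative = [c for c in (centroid(false_pos),) if c]
        return positive, negative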
Background: We aimed to improve the image quality (IQ) of sparse-view computed tomography (CT) images using a U-Net for lung metastasis detection and to determine the best tradeoff between number of views, IQ, and diagnostic confidence. Methods: CT images from 41 subjects aged 62.8 $\pm$ 10.6 years (mean $\pm$ standard deviation), 23 men, 34 with lung metastasis, 7 healthy, were retrospectively selected (2016-2018) and forward projected onto 2,048-view sinograms. Six corresponding sparse-view CT data subsets at varying levels of undersampling were reconstructed from the sinograms using filtered backprojection with 16, 32, 64, 128, 256, and 512 views. A dual-frame U-Net was trained and evaluated for each subsampling level on 8,658 images from 22 diseased subjects. A representative image per scan was selected from 19 subjects (12 diseased, 7 healthy) for a single-blinded multireader study. These slices, at all levels of subsampling, with and without U-Net postprocessing, were presented to three readers. IQ and diagnostic confidence were ranked using predefined scales. Subjective nodule segmentation was evaluated using sensitivity and the Dice similarity coefficient (DSC); a clustered Wilcoxon signed-rank test was used. Results: The 64-projection sparse-view images yielded 0.89 sensitivity and 0.81 DSC, while their U-Net-postprocessed counterparts had improved metrics (0.94 sensitivity and 0.85 DSC) (p = 0.400). Fewer views led to insufficient IQ for diagnosis. For more views, no substantial discrepancies were noted between sparse-view and postprocessed images. Conclusions: Projection views can be reduced from 2,048 to 64 while maintaining IQ and the radiologists' confidence at a satisfactory level.
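The forward projection and sparse-view filtered backprojection described above can be approximated with scikit-image's Radon-transform utilities; the sketch below simulates a 64-view reconstruction and illustrates the sampling scheme, not the study's actual pipeline.

    import numpy as np
    from skimage.transform import radon, iradon

    def sparse_view_fbp(image, n_views=64):
        """Forward-project onto n_views angles, then reconstruct with FBP."""
        theta = np.linspace(0.0, 180.0, n_views, endpoint=False)
        sinogram = radon(image, theta=theta)  # sparse-view sinogram
        return iradon(sinogram, theta=theta, filter_name="ramp")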
In this work we present an enriched Petrov-Galerkin (EPG) method for the simulation of Darcy flow in porous media. The new method enriches the approximation trial space of the conforming continuous Galerkin (CG) method with bubble functions and enriches the approximation test space of the CG method with piecewise constant functions, and it does not require any penalty term in the weak formulation. Moreover, we propose a framework for constructing the bubble functions and consider a decoupled algorithm for the EPG method based on this framework, which decouples the pressure solve into two steps. The first step solves the pressure with the standard CG method, and the second step is a post-processing correction of the first. Compared with the CG method, the proposed EPG method is locally mass-conservative, while keeping fewer degrees of freedom than the discontinuous Galerkin (DG) method. In addition, its error analysis is more concise than that of the enriched Galerkin (EG) method. Coupled flow and transport in porous media is considered to illustrate the advantages of the locally mass-conservative property of the EPG method. We establish the optimal convergence of the numerical solutions and present several numerical examples to illustrate the performance of the proposed method.
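For reference, a hedged sketch of the model problem and the enrichment in generic notation that may differ from the paper's: the method discretizes the Darcy problem
\[
-\nabla \cdot \left( K \nabla p \right) = f \quad \text{in } \Omega, \qquad p = g \quad \text{on } \partial\Omega,
\]
with trial and test spaces written schematically as
\[
V_h^{\mathrm{trial}} = V_h^{\mathrm{CG}} \oplus B_h, \qquad V_h^{\mathrm{test}} = V_h^{\mathrm{CG}} \oplus M_h^{0},
\]
where $V_h^{\mathrm{CG}}$ is the conforming CG space, $B_h$ spans the element bubble functions, and $M_h^{0}$ spans the piecewise constants whose presence in the test space yields elementwise mass balance.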
Quality assessment, including inspecting images for artifacts, is a critical step during MRI data acquisition to ensure data quality and the success of downstream analysis or interpretation. This study presents a deep-learning model to detect rigid motion in T1-weighted brain images. We leveraged a 2D CNN for three-class classification and tested it on publicly available retrospective and prospective datasets. Grad-CAM heatmaps enabled the identification of failure modes and provided an interpretation of the model's results. The model achieved average precision and recall of 85% and 80% on six motion-simulated retrospective datasets. Additionally, the model's classifications on the prospective dataset showed a strong inverse correlation (-0.84) with average edge strength, an image-quality metric indicative of motion. This model is part of the ArtifactID tool, aimed at inline automatic detection of Gibbs ringing, wrap-around, and motion artifacts. This tool automates part of the time-consuming QA process and augments expertise on-site, which is particularly relevant in low-resource settings where local MR knowledge is scarce.
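Average edge strength, used above as a motion-sensitive quality proxy, can be defined in several ways; the sketch below takes the mean Sobel gradient magnitude as one plausible definition, which may differ from the exact metric used in the study.

    import numpy as np
    from scipy import ndimage

    def average_edge_strength(image):
        """Mean gradient magnitude over the image (one plausible definition).
        Motion blur weakens edges, so lower values suggest more motion."""
        img = image.astype(float)
        gx = ndimage.sobel(img, axis=0)
        gy = ndimage.sobel(img, axis=1)
        return float(np.hypot(gx, gy).mean())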
We formally define the literature (reference) snowballing method and present a refined version of it. We show that the improved algorithm can substantially reduce curator work, even before application of text classification, by reducing the number of candidates to classify. We also present a desktop application named LitBall that implements this and other literature collection methods, through access to the Semantic Scholar academic graph (S2AG).
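A hedged sketch of one snowballing round against the public S2AG REST endpoint; the field selection, page size, and set-difference filtering are illustrative choices, not LitBall's implementation.

    import requests

    S2_API = "https://api.semanticscholar.org/graph/v1/paper"

    def snowball_round(seed_ids):
        """One round of backward + forward snowballing over S2AG.
        Returns IDs of papers cited by or citing the seed papers."""
        candidates = set()
        for pid in seed_ids:
            for edge in ("references", "citations"):
                resp = requests.get(f"{S2_API}/{pid}/{edge}",
                                    params={"fields": "paperId,title",
                                            "limit": 100})
                for item in resp.json().get("data", []):
                    paper = item.get("citedPaper") or item.get("citingPaper")
                    if paper and paper.get("paperId"):
                        candidates.add(paper["paperId"])
        return candidates - set(seed_ids)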
We introduce the notion of semantic image quality for applications where image quality relies on semantic requirements. Working in fetal ultrasound, where ranking is challenging and annotations are noisy, we design a robust coarse-to-fine model that ranks images based on their semantic image quality and endow our predicted rankings with an uncertainty estimate. To annotate rankings on training data, we design an efficient ranking annotation scheme based on the merge sort algorithm. Finally, we compare our ranking algorithm to a number of state-of-the-art ranking algorithms on a challenging fetal ultrasound quality assessment task, showing the superior performance of our method on the majority of rank correlation metrics.
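The merge-sort annotation scheme amounts to sorting with a human in the comparison loop; a minimal sketch, where ask_annotator is a hypothetical callback returning whether the first image should rank above the second.

    def merge_sort_rank(images, ask_annotator):
        """Rank images by quality using pairwise annotator judgments.
        ask_annotator(a, b) -> True if a ranks above b (hypothetical callback).
        Merge sort needs only O(n log n) comparisons, limiting annotation cost.
        """
        if len(images) <= 1:
            return list(images)
        mid = len(images) // 2
        left = merge_sort_rank(images[:mid], ask_annotator)
        right = merge_sort_rank(images[mid:], ask_annotator)
        merged = []
        while left and right:
            merged.append(left.pop(0) if ask_annotator(left[0], right[0])
                          else right.pop(0))
        return merged + left + right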