久久一级高潮A免费,一区二区三区免费观看在线视频播放,人亲久久精品天天中文字幕,五月天高清无码一区,中字幕午夜一区二区三区

Face Recognition (FR) systems can suffer from physical (i.e., print photo) and digital (i.e., DeepFake) attacks. However, previous related work rarely considers both situations at the same time. This implies the deployment of multiple models and thus more computational burden. The main reasons for this lack of an integrated model are caused by two factors: (1) The lack of a dataset including both physical and digital attacks with ID consistency which means the same ID covers the real face and all attack types; (2) Given the large intra-class variance between these two attacks, it is difficult to learn a compact feature space to detect both attacks simultaneously. To address these issues, we collect a Unified physical-digital Attack dataset, called UniAttackData. The dataset consists of $1,800$ participations of 2 and 12 physical and digital attacks, respectively, resulting in a total of 29,706 videos. Then, we propose a Unified Attack Detection framework based on Vision-Language Models (VLMs), namely UniAttackDetection, which includes three main modules: the Teacher-Student Prompts (TSP) module, focused on acquiring unified and specific knowledge respectively; the Unified Knowledge Mining (UKM) module, designed to capture a comprehensive feature space; and the Sample-Level Prompt Interaction (SLPI) module, aimed at grasping sample-level semantics. These three modules seamlessly form a robust unified attack detection framework. Extensive experiments on UniAttackData and three other datasets demonstrate the superiority of our approach for unified face attack detection.

相關內容

特征空間

關注 0

Analysis · Learning · 表示學習 · 數據集 · 變換 ·

2024 年 3 月 12 日

Rotation-Agnostic Image Representation Learning for Digital Pathology

Saghir Alfasly,Abubakr Shafique,Peyman Nejat,Jibran Khan,Areej Alsaafin,Ghazal Alabtah,H. R. Tizhoosh

from arxiv, CVPR 2024 - 23 pages, 10 figures, and 18 tables

This paper addresses complex challenges in histopathological image analysis through three key contributions. Firstly, it introduces a fast patch selection method, FPS, for whole-slide image (WSI) analysis, significantly reducing computational cost while maintaining accuracy. Secondly, it presents PathDino, a lightweight histopathology feature extractor with a minimal configuration of five Transformer blocks and only 9 million parameters, markedly fewer than alternatives. Thirdly, it introduces a rotation-agnostic representation learning paradigm using self-supervised learning, effectively mitigating overfitting. We also show that our compact model outperforms existing state-of-the-art histopathology-specific vision transformers on 12 diverse datasets, including both internal datasets spanning four sites (breast, liver, skin, and colorectal) and seven public datasets (PANDA, CAMELYON16, BRACS, DigestPath, Kather, PanNuke, and WSSS4LUAD). Notably, even with a training dataset of 6 million histopathology patches from The Cancer Genome Atlas (TCGA), our approach demonstrates an average 8.5% improvement in patch-level majority vote performance. These contributions provide a robust framework for enhancing image analysis in digital pathology, rigorously validated through extensive evaluation. Project Page: //kimialabmayo.github.io/PathDino-Page/

INFORMS · 3D · NeRF · Performer · Extensibility ·

2024 年 3 月 12 日

Entangled View-Epipolar Information Aggregation for Generalizable Neural Radiance Fields

Zhiyuan Min,Yawei Luo,Wei Yang,Yuesong Wang,Yi Yang

from arxiv, Accepted by CVPR-2024

Generalizable NeRF can directly synthesize novel views across new scenes, eliminating the need for scene-specific retraining in vanilla NeRF. A critical enabling factor in these approaches is the extraction of a generalizable 3D representation by aggregating source-view features. In this paper, we propose an Entangled View-Epipolar Information Aggregation method dubbed EVE-NeRF. Different from existing methods that consider cross-view and along-epipolar information independently, EVE-NeRF conducts the view-epipolar feature aggregation in an entangled manner by injecting the scene-invariant appearance continuity and geometry consistency priors to the aggregation process. Our approach effectively mitigates the potential lack of inherent geometric and appearance constraint resulting from one-dimensional interactions, thus further boosting the 3D representation generalizablity. EVE-NeRF attains state-of-the-art performance across various evaluation scenarios. Extensive experiments demonstate that, compared to prevailing single-dimensional aggregation, the entangled network excels in the accuracy of 3D scene geometry and appearance reconstruction. Our code is publicly available at //github.com/tatakai1/EVENeRF.

回合 · 穩健性 · 可行 · 控制器 · 路徑 ·

2024 年 3 月 10 日

Modular Multi-Level Replanning TAMP Framework for Dynamic Environment

Tao Lin,Chengfei Yue,Ziran Liu,Xibin Cao

Task and Motion Planning (TAMP) algorithms can generate plans that combine logic and motion aspects for robots. However, these plans are sensitive to interference and control errors. To make TAMP more applicable in real-world, we propose the modular multi-level replanning TAMP framework(MMRF), blending the probabilistic completeness of sampling-based TAMP algorithm with the robustness of reactive replanning. MMRF generates an nominal plan from the initial state, then dynamically reconstructs this nominal plan in real-time, reorders robot manipulations. Following the logic-level adjustment, GMRF will try to replan a new motion path to ensure the updated plan is feasible at the motion level. Finally, we conducted real-world experiments involving stack and rearrange task domains. The result demonstrate MMRF's ability to swiftly complete tasks in scenarios with varying degrees of interference.

同態加密 · 可約的 · MoDELS · Stable Diffusion · Processing（編程語言） ·

2024 年 3 月 9 日

Privacy-Preserving Diffusion Model Using Homomorphic Encryption

Yaojian Chen,Qiben Yan

In this paper, we introduce a privacy-preserving stable diffusion framework leveraging homomorphic encryption, called HE-Diffusion, which primarily focuses on protecting the denoising phase of the diffusion process. HE-Diffusion is a tailored encryption framework specifically designed to align with the unique architecture of stable diffusion, ensuring both privacy and functionality. To address the inherent computational challenges, we propose a novel min-distortion method that enables efficient partial image encryption, significantly reducing the overhead without compromising the model's output quality. Furthermore, we adopt a sparse tensor representation to expedite computational operations, enhancing the overall efficiency of the privacy-preserving diffusion process. We successfully implement HE-based privacy-preserving stable diffusion inference. The experimental results show that HE-Diffusion achieves 500 times speedup compared with the baseline method, and reduces time cost of the homomorphically encrypted inference to the minute level. Both the performance and accuracy of the HE-Diffusion are on par with the plaintext counterpart. Our approach marks a significant step towards integrating advanced cryptographic techniques with state-of-the-art generative models, paving the way for privacy-preserving and efficient image generation in critical applications.

MoDELS · 估計/估計量 · Extensibility · 數據集 · state-of-the-art ·

2024 年 3 月 8 日

Probabilistic Image-Driven Traffic Modeling via Remote Sensing

Scott Workman,Armin Hadzic

This work addresses the task of modeling spatiotemporal traffic patterns directly from overhead imagery, which we refer to as image-driven traffic modeling. We extend this line of work and introduce a multi-modal, multi-task transformer-based segmentation architecture that can be used to create dense city-scale traffic models. Our approach includes a geo-temporal positional encoding module for integrating geo-temporal context and a probabilistic objective function for estimating traffic speeds that naturally models temporal variations. We evaluate our method extensively using the Dynamic Traffic Speeds (DTS) benchmark dataset and significantly improve the state-of-the-art. Finally, we introduce the DTS++ dataset to support mobility-related location adaptation experiments.

邊緣化 · MoDELS · Shapley value · 閾值 · 納什均衡 ·

2024 年 3 月 8 日

A Task-Driven Multi-UAV Coalition Formation Mechanism

Xinpeng Lu,Heng Song,Huailing Ma,Junwu Zhu

With the rapid advancement of UAV technology, the problem of UAV coalition formation has become a hotspot. Therefore, designing task-driven multi-UAV coalition formation mechanism has become a challenging problem. However, existing coalition formation mechanisms suffer from low relevance between UAVs and task requirements, resulting in overall low coalition utility and unstable coalition structures. To address these problems, this paper proposed a novel multi-UAV coalition network collaborative task completion model, considering both coalition work capacity and task-requirement relationships. This model stimulated the formation of coalitions that match task requirements by using a revenue function based on the coalition's revenue threshold. Subsequently, an algorithm for coalition formation based on marginal utility was proposed. Specifically, the algorithm utilized Shapley value to achieve fair utility distribution within the coalition, evaluated coalition values based on marginal utility preference order, and achieved stable coalition partition through a limited number of iterations. Additionally, we theoretically proved that this algorithm has Nash equilibrium solution. Finally, experimental results demonstrated that the proposed algorithm, compared to currently classical algorithms, not only forms more stable coalitions but also further enhances the overall utility of coalitions effectively.

MoDELS · 近似 · 優化器 · 可約的 · 得分 ·

2024 年 3 月 8 日

Improving Diffusion-Based Generative Models via Approximated Optimal Transport

Daegyu Kim,Jooyoung Choi,Chaehun Shin,Uiwon Hwang,Sungroh Yoon

We introduce the Approximated Optimal Transport (AOT) technique, a novel training scheme for diffusion-based generative models. Our approach aims to approximate and integrate optimal transport into the training process, significantly enhancing the ability of diffusion models to estimate the denoiser outputs accurately. This improvement leads to ODE trajectories of diffusion models with lower curvature and reduced truncation errors during sampling. We achieve superior image quality and reduced sampling steps by employing AOT in training. Specifically, we achieve FID scores of 1.88 with just 27 NFEs and 1.73 with 29 NFEs in unconditional and conditional generations, respectively. Furthermore, when applying AOT to train the discriminator for guidance, we establish new state-of-the-art FID scores of 1.68 and 1.58 for unconditional and conditional generations, respectively, each with 29 NFEs. This outcome demonstrates the effectiveness of AOT in enhancing the performance of diffusion models.

簇 · Performer · 分離的 · 值域 · 可辨認的 ·

2024 年 3 月 7 日

Bayesian Level-Set Clustering

David Buch,Miheer Dewaskar,David B. Dunson

from arxiv, 36 pages, 6 figures

Broadly, the goal when clustering data is to separate observations into meaningful subgroups. The rich variety of methods for clustering reflects the fact that the relevant notion of meaningful clusters varies across applications. The classical Bayesian approach clusters observations by their association with components of a mixture model; the choice in class of components allows flexibility to capture a range of meaningful cluster notions. However, in practice the range is somewhat limited as difficulties with computation and cluster identifiability arise as components are made more flexible. Instead of mixture component attribution, we consider clusterings that are functions of the data and the density $f$, which allows us to separate flexible density estimation from clustering. Within this framework, we develop a method to cluster data into connected components of a level set of $f$. Under mild conditions, we establish that our Bayesian level-set (BALLET) clustering methodology yields consistent estimates, and we highlight its performance in a variety of toy and simulated data examples. Finally, through an application to astronomical data we show the method performs favorably relative to the popular level-set clustering algorithm DBSCAN in terms of accuracy, insensitivity to tuning parameters, and quantification of uncertainty.

圖 · entity · 知識圖譜 · 模型復雜度 · 變換 ·

2020 年 5 月 1 日

Low-Dimensional Hyperbolic Knowledge Graph Embeddings

Ines Chami,Adva Wolf,Da-Cheng Juan,Frederic Sala,Sujith Ravi,Christopher Ré

Knowledge graph (KG) embeddings learn low-dimensional representations of entities and relations to predict missing facts. KGs often exhibit hierarchical and logical patterns which must be preserved in the embedding space. For hierarchical data, hyperbolic embedding methods have shown promise for high-fidelity and parsimonious representations. However, existing hyperbolic embedding methods do not account for the rich logical patterns in KGs. In this work, we introduce a class of hyperbolic KG embedding models that simultaneously capture hierarchical and logical patterns. Our approach combines hyperbolic reflections and rotations with attention to model complex relational patterns. Experimental results on standard KG benchmarks show that our method improves over previous Euclidean- and hyperbolic-based efforts by up to 6.1% in mean reciprocal rank (MRR) in low dimensions. Furthermore, we observe that different geometric transformations capture different types of relations while attention-based transformations generalize to multiple relations. In high dimensions, our approach yields new state-of-the-art MRRs of 49.6% on WN18RR and 57.7% on YAGO3-10.

馬爾可夫隨機場 · 圖 · 圖像分割 · SSL · SC ·

2018 年 5 月 21 日

Consensus Based Medical Image Segmentation Using Semi-Supervised Learning And Graph Cuts

Dwarikanath Mahapatra

Medical image segmentation requires consensus ground truth segmentations to be derived from multiple expert annotations. A novel approach is proposed that obtains consensus segmentations from experts using graph cuts (GC) and semi supervised learning (SSL). Popular approaches use iterative Expectation Maximization (EM) to estimate the final annotation and quantify annotator's performance. Such techniques pose the risk of getting trapped in local minima. We propose a self consistency (SC) score to quantify annotator consistency using low level image features. SSL is used to predict missing annotations by considering global features and local image consistency. The SC score also serves as the penalty cost in a second order Markov random field (MRF) cost function optimized using graph cuts to derive the final consensus label. Graph cut obtains a global maximum without an iterative procedure. Experimental results on synthetic images, real data of Crohn's disease patients and retinal images show our final segmentation to be accurate and more consistent than competing methods.