亚州AV无码专区在线电影,99久热这里精品免费观看

We present an unsupervised data-driven approach for non-rigid shape matching. Shape matching identifies correspondences between two shapes and is a fundamental step in many computer vision and graphics applications. Our approach is designed to be particularly robust when matching shapes digitized using 3D scanners that contain fine geometric detail and suffer from different types of noise including topological noise caused by the coalescence of spatially close surface regions. We build on two strategies. First, using a hierarchical patch based shape representation we match shapes consistently in a coarse to fine manner, allowing for robustness to noise. This multi-scale representation drastically reduces the dimensionality of the problem when matching at the coarsest scale, rendering unsupervised learning feasible. Second, we constrain this hierarchical matching to be reflected in 3D by fitting a patch-wise near-rigid deformation model. Using this constraint, we leverage spatial continuity at different scales to capture global shape properties, resulting in matchings that generalize well to data with different deformations and noise characteristics. Experiments demonstrate that our approach obtains significantly better results on raw 3D scans than state-of-the-art methods, while performing on-par on standard test scenarios.

相關內容

塑造

關注 1

INFORMS · 信息檢索 · Storage · 分離的 · 情景 ·

2024 年 1 月 17 日

Weakly-Private Information Retrieval From MDS-Coded Distributed Storage

Asbj?rn O. Orvedal,Hsuan-Yin Lin,Eirik Rosnes

from arxiv, To be presented at the 2024 International Zurich Seminar on Information and Communication (IZS'24), Zurich, Switzerland

We consider the problem of weakly-private information retrieval (WPIR) when data is encoded by a maximum distance separable code and stored across multiple servers. In WPIR, a user wishes to retrieve a piece of data from a set of servers without leaking too much information about which piece of data she is interested in. We study and provide the first WPIR protocols for this scenario and present results on their optimal trade-off between download rate and information leakage using the maximal leakage privacy metric.

層 · 可約的 · INFORMS · 統計量 · Microsoft Surface ·

2024 年 1 月 17 日

IRS-Enhanced Anti-Jamming Precoding Against DISCO Physical Layer Jamming Attacks

Huan Huang,Hongliang Zhang,Yi Cai,Yunjing Zhang,A. Lee Swindlehurst,Zhu Han

from arxiv, This paper has been accepted by IEEE ICC 2024

Illegitimate intelligent reflective surfaces (IRSs) can pose significant physical layer security risks on multi-user multiple-input single-output (MU-MISO) systems. Recently, a DISCO approach has been proposed an illegitimate IRS with random and time-varying reflection coefficients, referred to as a "disco" IRS (DIRS). Such DIRS can attack MU-MISO systems without relying on either jamming power or channel state information (CSI), and classical anti-jamming techniques are ineffective for the DIRS-based fully-passive jammers (DIRS-based FPJs). In this paper, we propose an IRS-enhanced anti-jamming precoder against DIRS-based FPJs that requires only statistical rather than instantaneous CSI of the DIRS-jammed channels. Specifically, a legitimate IRS is introduced to reduce the strength of the DIRS-based jamming relative to the transmit signals at a legitimate user (LU). In addition, the active beamforming at the legitimate access point (AP) is designed to maximize the signal-to-jamming-plus-noise ratios (SJNRs). Numerical results are presented to evaluate the effectiveness of the proposed IRS-enhanced anti-jamming precoder against DIRS-based FPJs.

可辨認的 · Performer · 同分布的 · Networking · CASE ·

2024 年 1 月 16 日

Game-Theoretic Neyman-Pearson Detection to Combat Strategic Evasion

Yinan Hu,Quanyan Zhu

The security in networked systems depends greatly on recognizing and identifying adversarial behaviors. Traditional detection methods focus on specific categories of attacks and have become inadequate for increasingly stealthy and deceptive attacks that are designed to bypass detection strategically. This work aims to develop a holistic theory to countermeasure such evasive attacks. We focus on extending a fundamental class of statistical-based detection methods based on Neyman-Pearson's (NP) hypothesis testing formulation. We propose game-theoretic frameworks to capture the conflicting relationship between a strategic evasive attacker and an evasion-aware NP detector. By analyzing both the equilibrium behaviors of the attacker and the NP detector, we characterize their performance using Equilibrium Receiver-Operational-Characteristic (EROC) curves. We show that the evasion-aware NP detectors outperform the passive ones in the way that the former can act strategically against the attacker's behavior and adaptively modify their decision rules based on the received messages. In addition, we extend our framework to a sequential setting where the user sends out identically distributed messages. We corroborate the analytical results with a case study of anomaly detection.

Extensibility · 近似 · 掩碼 · Learning · MoDELS ·

2024 年 1 月 16 日

Exploring Phonetic Context-Aware Lip-Sync For Talking Face Generation

Se Jin Park,Minsu Kim,Jeongsoo Choi,Yong Man Ro

from arxiv, Accepted at AAAI 2024

Talking face generation is the challenging task of synthesizing a natural and realistic face that requires accurate synchronization with a given audio. Due to co-articulation, where an isolated phone is influenced by the preceding or following phones, the articulation of a phone varies upon the phonetic context. Therefore, modeling lip motion with the phonetic context can generate more spatio-temporally aligned lip movement. In this respect, we investigate the phonetic context in generating lip motion for talking face generation. We propose Context-Aware Lip-Sync framework (CALS), which explicitly leverages phonetic context to generate lip movement of the target face. CALS is comprised of an Audio-to-Lip module and a Lip-to-Face module. The former is pretrained based on masked learning to map each phone to a contextualized lip motion unit. The contextualized lip motion unit then guides the latter in synthesizing a target identity with context-aware lip motion. From extensive experiments, we verify that simply exploiting the phonetic context in the proposed CALS framework effectively enhances spatio-temporal alignment. We also demonstrate the extent to which the phonetic context assists in lip synchronization and find the effective window size for lip generation to be approximately 1.2 seconds.

泛化理論 · state-of-the-art · 數據集 · 測試數據 · Performance ·

2024 年 1 月 12 日

Generalizable Sleep Staging via Multi-Level Domain Alignment

Jiquan Wang,Sha Zhao,Haiteng Jiang,Shijian Li,Tao Li,Gang Pan

from arxiv, Accepted by the Thirty-Eighth AAAI Conference on Artificial Intelligence (AAAI-24)

Automatic sleep staging is essential for sleep assessment and disorder diagnosis. Most existing methods depend on one specific dataset and are limited to be generalized to other unseen datasets, for which the training data and testing data are from the same dataset. In this paper, we introduce domain generalization into automatic sleep staging and propose the task of generalizable sleep staging which aims to improve the model generalization ability to unseen datasets. Inspired by existing domain generalization methods, we adopt the feature alignment idea and propose a framework called SleepDG to solve it. Considering both of local salient features and sequential features are important for sleep staging, we propose a Multi-level Feature Alignment combining epoch-level and sequence-level feature alignment to learn domain-invariant feature representations. Specifically, we design an Epoch-level Feature Alignment to align the feature distribution of each single sleep epoch among different domains, and a Sequence-level Feature Alignment to minimize the discrepancy of sequential features among different domains. SleepDG is validated on five public datasets, achieving the state-of-the-art performance.

Networking · Processing（編程語言） · Learning · CASE · Machine Learning ·

2024 年 1 月 12 日

Intelligent Data-Driven Architectural Features Orchestration for Network Slicing

Rodrigo Moreira,Flavio de Oliveira Silva,Tereza Cristina Melo de Brito Carvalho,Joberto S. B. Martins

from arxiv, 12 pages, 6 figures, Conference ADVANCE 24 - International Workshop on ADVANCEs in ICT Infrastructures and Services - February 26--29, 2024 - Hanoi, Vietnam

Network slicing is a crucial enabler and a trend for the Next Generation Mobile Network (NGMN) and various other new systems like the Internet of Vehicles (IoV) and Industrial IoT (IIoT). Orchestration and machine learning are key elements with a crucial role in the network-slicing processes since the NS process needs to orchestrate resources and functionalities, and machine learning can potentially optimize the orchestration process. However, existing network-slicing architectures lack the ability to define intelligent approaches to orchestrate features and resources in the slicing process. This paper discusses machine learning-based orchestration of features and capabilities in network slicing architectures. Initially, the slice resource orchestration and allocation in the slicing planning, configuration, commissioning, and operation phases are analyzed. In sequence, we highlight the need for optimized architectural feature orchestration and recommend using ML-embed agents, federated learning intrinsic mechanisms for knowledge acquisition, and a data-driven approach embedded in the network slicing architecture. We further develop an architectural features orchestration case embedded in the SFI2 network slicing architecture. An attack prevention security mechanism is developed for the SFI2 architecture using distributed embedded and cooperating ML agents. The case presented illustrates the architectural feature's orchestration process and benefits, highlighting its importance for the network slicing process.

INFORMS · Integration · Networking · CLIP · 模態 ·

2024 年 1 月 12 日

CLIP-Driven Semantic Discovery Network for Visible-Infrared Person Re-Identification

Xiaoyan Yu,Neng Dong,Liehuang Zhu,Hao Peng,Dapeng Tao

Visible-infrared person re-identification (VIReID) primarily deals with matching identities across person images from different modalities. Due to the modality gap between visible and infrared images, cross-modality identity matching poses significant challenges. Recognizing that high-level semantics of pedestrian appearance, such as gender, shape, and clothing style, remain consistent across modalities, this paper intends to bridge the modality gap by infusing visual features with high-level semantics. Given the capability of CLIP to sense high-level semantic information corresponding to visual representations, we explore the application of CLIP within the domain of VIReID. Consequently, we propose a CLIP-Driven Semantic Discovery Network (CSDN) that consists of Modality-specific Prompt Learner, Semantic Information Integration (SII), and High-level Semantic Embedding (HSE). Specifically, considering the diversity stemming from modality discrepancies in language descriptions, we devise bimodal learnable text tokens to capture modality-private semantic information for visible and infrared images, respectively. Additionally, acknowledging the complementary nature of semantic details across different modalities, we integrate text features from the bimodal language descriptions to achieve comprehensive semantics. Finally, we establish a connection between the integrated text features and the visual features across modalities. This process embed rich high-level semantic information into visual representations, thereby promoting the modality invariance of visual representations. The effectiveness and superiority of our proposed CSDN over existing methods have been substantiated through experimental evaluations on multiple widely used benchmarks. The code will be released at \url{//github.com/nengdong96/CSDN}.

語音識別 · MoDELS · 情景 · state-of-the-art · 分離的 ·

2024 年 1 月 12 日

Attention-Guided Adaptation for Code-Switching Speech Recognition

Bobbi Aditya,Mahdin Rohmatillah,Liang-Hsuan Tai,Jen-Tzung Chien

from arxiv, Accepted to ICASSP 2024

The prevalence of the powerful multilingual models, such as Whisper, has significantly advanced the researches on speech recognition. However, these models often struggle with handling the code-switching setting, which is essential in multilingual speech recognition. Recent studies have attempted to address this setting by separating the modules for different languages to ensure distinct latent representations for languages. Some other methods considered the switching mechanism based on language identification. In this study, a new attention-guided adaptation is proposed to conduct parameter-efficient learning for bilingual ASR. This method selects those attention heads in a model which closely express language identities and then guided those heads to be correctly attended with their corresponding languages. The experiments on the Mandarin-English code-switching speech corpus show that the proposed approach achieves a 14.2% mixed error rate, surpassing state-of-the-art method, where only 5.6% additional parameters over Whisper are trained.

MoDELS · 目標檢測 · Performer · 監督 · Processing（編程語言） ·

2024 年 1 月 12 日

Zero-Shot Co-salient Object Detection Framework

Haoke Xiao,Lv Tang,Bo Li,Zhiming Luo,Shaozi Li

Co-salient Object Detection (CoSOD) endeavors to replicate the human visual system's capacity to recognize common and salient objects within a collection of images. Despite recent advancements in deep learning models, these models still rely on training with well-annotated CoSOD datasets. The exploration of training-free zero-shot CoSOD frameworks has been limited. In this paper, taking inspiration from the zero-shot transfer capabilities of foundational computer vision models, we introduce the first zero-shot CoSOD framework that harnesses these models without any training process. To achieve this, we introduce two novel components in our proposed framework: the group prompt generation (GPG) module and the co-saliency map generation (CMP) module. We evaluate the framework's performance on widely-used datasets and observe impressive results. Our approach surpasses existing unsupervised methods and even outperforms fully supervised methods developed before 2020, while remaining competitive with some fully supervised methods developed before 2022.

學成 · Networking · INFORMS · Performer · Neural Networks ·

2020 年 2 月 27 日

Meta-Transfer Learning for Zero-Shot Super-Resolution

Jae Woong Soh,Sunwoo Cho,Nam Ik Cho

from arxiv, Will be presented in CVPR 2020

Convolutional neural networks (CNNs) have shown dramatic improvements in single image super-resolution (SISR) by using large-scale external samples. Despite their remarkable performance based on the external dataset, they cannot exploit internal information within a specific image. Another problem is that they are applicable only to the specific condition of data that they are supervised. For instance, the low-resolution (LR) image should be a "bicubic" downsampled noise-free image from a high-resolution (HR) one. To address both issues, zero-shot super-resolution (ZSSR) has been proposed for flexible internal learning. However, they require thousands of gradient updates, i.e., long inference time. In this paper, we present Meta-Transfer Learning for Zero-Shot Super-Resolution (MZSR), which leverages ZSSR. Precisely, it is based on finding a generic initial parameter that is suitable for internal learning. Thus, we can exploit both external and internal information, where one single gradient update can yield quite considerable results. (See Figure 1). With our method, the network can quickly adapt to a given image condition. In this respect, our method can be applied to a large spectrum of image conditions within a fast adaptation process.