东京热加勒比中文无码_无码人妻综合精品一区色欲AV_91久久久久无码精品国产软件_国产免费AV片在线观看可下载_俺去啦激情在线视频一区二区三区_久久国产精品99久久久久久综合_WWW影音先锋在线观看

Land cover analysis using hyperspectral images (HSI) remains an open problem due to their low spatial resolution and complex spectral information. Recent studies are primarily dedicated to designing Transformer-based architectures for spatial-spectral long-range dependencies modeling, which is computationally expensive with quadratic complexity. Selective structured state space model (Mamba), which is efficient for modeling long-range dependencies with linear complexity, has recently shown promising progress. However, its potential in hyperspectral image processing that requires handling numerous spectral bands has not yet been explored. In this paper, we innovatively propose S$^2$Mamba, a spatial-spectral state space model for hyperspectral image classification, to excavate spatial-spectral contextual features, resulting in more efficient and accurate land cover analysis. In S$^2$Mamba, two selective structured state space models through different dimensions are designed for feature extraction, one for spatial, and the other for spectral, along with a spatial-spectral mixture gate for optimal fusion. More specifically, S$^2$Mamba first captures spatial contextual relations by interacting each pixel with its adjacent through a Patch Cross Scanning module and then explores semantic information from continuous spectral bands through a Bi-directional Spectral Scanning module. Considering the distinct expertise of the two attributes in homogenous and complicated texture scenes, we realize the Spatial-spectral Mixture Gate by a group of learnable matrices, allowing for the adaptive incorporation of representations learned across different dimensions. Extensive experiments conducted on HSI classification benchmarks demonstrate the superiority and prospect of S$^2$Mamba. The code will be made available at: //github.com/PURE-melo/S2Mamba.

相關內容

Mamba

關注 2

SCAN · Mamba · Neck · MoDELS · 情景 ·

2024 年 9 月 18 日

Mamba-YOLO-World: Marrying YOLO-World with Mamba for Open-Vocabulary Detection

Haoxuan Wang,Qingdong He,Jinlong Peng,Hao Yang,Mingmin Chi,Yabiao Wang

Open-vocabulary detection (OVD) aims to detect objects beyond a predefined set of categories. As a pioneering model incorporating the YOLO series into OVD, YOLO-World is well-suited for scenarios prioritizing speed and efficiency. However, its performance is hindered by its neck feature fusion mechanism, which causes the quadratic complexity and the limited guided receptive fields. To address these limitations, we present Mamba-YOLO-World, a novel YOLO-based OVD model employing the proposed MambaFusion Path Aggregation Network (MambaFusion-PAN) as its neck architecture. Specifically, we introduce an innovative State Space Model-based feature fusion mechanism consisting of a Parallel-Guided Selective Scan algorithm and a Serial-Guided Selective Scan algorithm with linear complexity and globally guided receptive fields. It leverages multi-modal input sequences and mamba hidden states to guide the selective scanning process. Experiments demonstrate that our model outperforms the original YOLO-World on the COCO and LVIS benchmarks in both zero-shot and fine-tuning settings while maintaining comparable parameters and FLOPs. Additionally, it surpasses existing state-of-the-art OVD methods with fewer parameters and FLOPs.

INTERACT · Microsoft Surface · 峰值 · 模型評估 · 設計 ·

2024 年 9 月 9 日

Evaluating Force-based Haptics for Immersive Tangible Interactions with Surface Visualizations

Hamza Afzaal,Usman Alim

from arxiv, Replaced the incorrect ORCID information with the Primary Author Added External DOI

Haptic feedback provides an essential sensory stimulus crucial for interaction and analyzing three-dimensional spatio-temporal phenomena on surface visualizations. Given its ability to provide enhanced spatial perception and scene maneuverability, virtual reality (VR) catalyzes haptic interactions on surface visualizations. Various interaction modes, encompassing both mid-air and on-surface interactions -- with or without the application of assisting force stimuli -- have been explored using haptic force feedback devices. In this paper, we evaluate the use of on-surface and assisted on-surface haptic modes of interaction compared to a no-haptic interaction mode. A force-based haptic stylus is used for all three modalities; the on-surface mode uses collision based forces, whereas the assisted on-surface mode is accompanied by an additional snapping force. We conducted a within-subjects user study involving fundamental interaction tasks performed on surface visualizations. Keeping a consistent visual design across all three modes, our study incorporates tasks that require the localization of the highest, lowest, and random points on surfaces; and tasks that focus on brushing curves on surfaces with varying complexity and occlusion levels. Our findings show that participants took almost the same time to brush curves using all the interaction modes. They could draw smoother curves using the on-surface interaction modes compared to the no-haptic mode. However, the assisted on-surface mode provided better accuracy than the on-surface mode. The on-surface mode was slower in point localization, but the accuracy depended on the visual cues and occlusions associated with the tasks. Finally, we discuss participant feedback on using haptic force feedback as a tangible input modality and share takeaways to aid the design of haptics-based tangible interactions for surface visualizations.

估計/估計量 · 優化器 · 穩健性 · 知識 (knowledge) · Excel ·

2024 年 9 月 9 日

KRONC: Keypoint-based Robust Camera Optimization for 3D Car Reconstruction

Davide Di Nucci,Alessandro Simoni,Matteo Tomei,Luca Ciuffreda,Roberto Vezzani,Rita Cucchiara

from arxiv, Accepted at ECCVW

The three-dimensional representation of objects or scenes starting from a set of images has been a widely discussed topic for years and has gained additional attention after the diffusion of NeRF-based approaches. However, an underestimated prerequisite is the knowledge of camera poses or, more specifically, the estimation of the extrinsic calibration parameters. Although excellent general-purpose Structure-from-Motion methods are available as a pre-processing step, their computational load is high and they require a lot of frames to guarantee sufficient overlapping among the views. This paper introduces KRONC, a novel approach aimed at inferring view poses by leveraging prior knowledge about the object to reconstruct and its representation through semantic keypoints. With a focus on vehicle scenes, KRONC is able to estimate the position of the views as a solution to a light optimization problem targeting the convergence of keypoints' back-projections to a singular point. To validate the method, a specific dataset of real-world car scenes has been collected. Experiments confirm KRONC's ability to generate excellent estimates of camera poses starting from very coarse initialization. Results are comparable with Structure-from-Motion methods with huge savings in computation. Code and data will be made publicly available.

知識 (knowledge) · 模型評估 · Learning · 流 · 蒸餾 ·

2024 年 9 月 9 日

Look One and More: Distilling Hybrid Order Relational Knowledge for Cross-Resolution Image Recognition

Shiming Ge,Kangkai Zhang,Haolin Liu,Yingying Hua,Shengwei Zhao,Xin Jin,Hao Wen

from arxiv, Accepted by AAAI 2020

In spite of great success in many image recognition tasks achieved by recent deep models, directly applying them to recognize low-resolution images may suffer from low accuracy due to the missing of informative details during resolution degradation. However, these images are still recognizable for subjects who are familiar with the corresponding high-resolution ones. Inspired by that, we propose a teacher-student learning approach to facilitate low-resolution image recognition via hybrid order relational knowledge distillation. The approach refers to three streams: the teacher stream is pretrained to recognize high-resolution images in high accuracy, the student stream is learned to identify low-resolution images by mimicking the teacher's behaviors, and the extra assistant stream is introduced as bridge to help knowledge transfer across the teacher to the student. To extract sufficient knowledge for reducing the loss in accuracy, the learning of student is supervised with multiple losses, which preserves the similarities in various order relational structures. In this way, the capability of recovering missing details of familiar low-resolution images can be effectively enhanced, leading to a better knowledge transfer. Extensive experiments on metric learning, low-resolution image classification and low-resolution face recognition tasks show the effectiveness of our approach, while taking reduced models.

NeRF · MoDELS · Extensibility · 可辨認的 · Integration ·

2024 年 9 月 3 日

$S^2$NeRF: Privacy-preserving Training Framework for NeRF

Bokang Zhang,Yanglin Zhang,Zhikun Zhang,Jinglan Yang,Lingying Huang,Junfeng Wu

from arxiv, To appear in the ACM Conference on Computer and Communications Security (CCS'24), October 14-18, 2024, Salt Lake City, UT, USA

Neural Radiance Fields (NeRF) have revolutionized 3D computer vision and graphics, facilitating novel view synthesis and influencing sectors like extended reality and e-commerce. However, NeRF's dependence on extensive data collection, including sensitive scene image data, introduces significant privacy risks when users upload this data for model training. To address this concern, we first propose SplitNeRF, a training framework that incorporates split learning (SL) techniques to enable privacy-preserving collaborative model training between clients and servers without sharing local data. Despite its benefits, we identify vulnerabilities in SplitNeRF by developing two attack methods, Surrogate Model Attack and Scene-aided Surrogate Model Attack, which exploit the shared gradient data and a few leaked scene images to reconstruct private scene information. To counter these threats, we introduce $S^2$NeRF, secure SplitNeRF that integrates effective defense mechanisms. By introducing decaying noise related to the gradient norm into the shared gradient information, $S^2$NeRF preserves privacy while maintaining a high utility of the NeRF model. Our extensive evaluations across multiple datasets demonstrate the effectiveness of $S^2$NeRF against privacy breaches, confirming its viability for secure NeRF training in sensitive applications.

優化器 · 評論員 · Better · GPUs · 可約的 ·

2024 年 8 月 25 日

cuSZ-$i$: High-Ratio Scientific Lossy Compression on GPUs with Optimized Multi-Level Interpolation

Jinyang Liu,Jiannan Tian,Shixun Wu,Sheng Di,Boyuan Zhang,Robert Underwood,Yafan Huang,Jiajun Huang,Kai Zhao,Guanpeng Li,Dingwen Tao,Zizhong Chen,Franck Cappello

from arxiv, camera-ready version for SC '24

Error-bounded lossy compression is a critical technique for significantly reducing scientific data volumes. Compared to CPU-based compressors, GPU-based compressors exhibit substantially higher throughputs, fitting better for today's HPC applications. However, the critical limitations of existing GPU-based compressors are their low compression ratios and qualities, severely restricting their applicability. To overcome these, we introduce a new GPU-based error-bounded scientific lossy compressor named cuSZ-$i$, with the following contributions: (1) A novel GPU-optimized interpolation-based prediction method significantly improves the compression ratio and decompression data quality. (2) The Huffman encoding module in cuSZ-$i$ is optimized for better efficiency. (3) cuSZ-$i$ is the first to integrate the NVIDIA Bitcomp-lossless as an additional compression-ratio-enhancing module. Evaluations show that cuSZ-$i$ significantly outperforms other latest GPU-based lossy compressors in compression ratio under the same error bound (hence, the desired quality), showcasing a 476% advantage over the second-best. This leads to cuSZ-$i$'s optimized performance in several real-world use cases.

估計/估計量 · Processing（編程語言） · 操作 · 泛函 · 求逆 ·

2024 年 8 月 23 日

Estimating Lagged (Cross-)Covariance Operators of $L^p$-$m$-approximable Processes in Cartesian Product Hilbert Spaces

Sebastian Kühnert

from arxiv, 15 pages

Estimating parameters of functional ARMA, GARCH and invertible processes requires estimating lagged covariance and cross-covariance operators of Cartesian product Hilbert space-valued processes. Asymptotic results have been derived in recent years, either less generally or under a strict condition. This article derives upper bounds of the estimation errors for such operators based on the mild condition Lp-m-approximability for each lag, Cartesian power(s) and sample size, where the two processes can take values in different spaces in the context of lagged cross-covariance operators. Implications of our results on eigenelements, parameters in functional AR(MA) models and other general situations are also discussed.

自動問答 · Better · 知識庫 · 基 · 圖 ·

2019 年 12 月 12 日

AliMe KBQA: Question Answering over Structured Knowledge for E-commerce Customer Service

Feng-Lin Li,Weijia Chen,Qi Huang,Yikun Guo

With the rise of knowledge graph (KG), question answering over knowledge base (KBQA) has attracted increasing attention in recent years. Despite much research has been conducted on this topic, it is still challenging to apply KBQA technology in industry because business knowledge and real-world questions can be rather complicated. In this paper, we present AliMe-KBQA, a bold attempt to apply KBQA in the E-commerce customer service field. To handle real knowledge and questions, we extend the classic "subject-predicate-object (SPO)" structure with property hierarchy, key-value structure and compound value type (CVT), and enhance traditional KBQA with constraints recognition and reasoning ability. We launch AliMe-KBQA in the Marketing Promotion scenario for merchants during the "Double 11" period in 2018 and other such promotional events afterwards. Online results suggest that AliMe-KBQA is not only able to gain better resolution and improve customer satisfaction, but also becomes the preferred knowledge management method by business knowledge staffs since it offers a more convenient and efficient management experience.

視覺問答 · 數據集 · Performer · state-of-the-art · MoDELS ·

2018 年 3 月 20 日

VQA-E: Explaining, Elaborating, and Enhancing Your Answers for Visual Questions

Qing Li,Qingyi Tao,Shafiq Joty,Jianfei Cai,Jiebo Luo

Most existing works in visual question answering (VQA) are dedicated to improving the accuracy of predicted answers, while disregarding the explanations. We argue that the explanation for an answer is of the same or even more importance compared with the answer itself, since it makes the question and answering process more understandable and traceable. To this end, we propose a new task of VQA-E (VQA with Explanation), where the computational models are required to generate an explanation with the predicted answer. We first construct a new dataset, and then frame the VQA-E problem in a multi-task learning architecture. Our VQA-E dataset is automatically derived from the VQA v2 dataset by intelligently exploiting the available captions. We have conducted a user study to validate the quality of explanations synthesized by our method. We quantitatively show that the additional supervision from explanations can not only produce insightful textual sentences to justify the answers, but also improve the performance of answer prediction. Our model outperforms the state-of-the-art methods by a clear margin on the VQA v2 dataset.

圖片分類 · 生成式對抗網絡 · Networking · 未標記 · GANs ·

2018 年 2 月 10 日

Generative Adversarial Networks and Probabilistic Graph Models for Hyperspectral Image Classification

Zilong Zhong,Jonathan Li

from arxiv, Accepted by AAAI-18

High spectral dimensionality and the shortage of annotations make hyperspectral image (HSI) classification a challenging problem. Recent studies suggest that convolutional neural networks can learn discriminative spatial features, which play a paramount role in HSI interpretation. However, most of these methods ignore the distinctive spectral-spatial characteristic of hyperspectral data. In addition, a large amount of unlabeled data remains an unexploited gold mine for efficient data use. Therefore, we proposed an integration of generative adversarial networks (GANs) and probabilistic graphical models for HSI classification. Specifically, we used a spectral-spatial generator and a discriminator to identify land cover categories of hyperspectral cubes. Moreover, to take advantage of a large amount of unlabeled data, we adopted a conditional random field to refine the preliminary classification results generated by GANs. Experimental results obtained using two commonly studied datasets demonstrate that the proposed framework achieved encouraging classification accuracy using a small number of data for training.