2021精品一级毛片一区二区_美女被狂C到高潮视频网站18_日韩AAA黄片免费看_中文字幕第35在线播放_人妻无码精品久久久久一区_免费看高清黄A级毛片手機看片影視_久久久久中文无码精品

Monocular depth estimation plays a fundamental role in computer vision. Due to the costly acquisition of depth ground truth, self-supervised methods that leverage adjacent frames to establish a supervisory signal have emerged as the most promising paradigms. In this work, we propose two novel ideas to improve self-supervised monocular depth estimation: 1) self-reference distillation and 2) disparity offset refinement. Specifically, we use a parameter-optimized model as the teacher updated as the training epochs to provide additional supervision during the training process. The teacher model has the same structure as the student model, with weights inherited from the historical student model. In addition, a multiview check is introduced to filter out the outliers produced by the teacher model. Furthermore, we leverage the contextual consistency between high-scale and low-scale features to obtain multiscale disparity offsets, which are used to refine the disparity output incrementally by aligning disparity information at different scales. The experimental results on the KITTI and Make3D datasets show that our method outperforms previous state-of-the-art competitors.

相關內容

估計/估計量

關注 3

蒸餾 · 知識 (knowledge) · Machine Translation · MoDELS · SimPLe ·

2023 年 8 月 4 日

Selective Knowledge Distillation for Non-Autoregressive Neural Machine Translation

Min Liu,Yu Bao,Chengqi Zhao,Shujian Huang

from arxiv, Accepted to AAAI 2023

Benefiting from the sequence-level knowledge distillation, the Non-Autoregressive Transformer (NAT) achieves great success in neural machine translation tasks. However, existing knowledge distillation has side effects, such as propagating errors from the teacher to NAT students, which may limit further improvements of NAT models and are rarely discussed in existing research. In this paper, we introduce selective knowledge distillation by introducing an NAT evaluator to select NAT-friendly targets that are of high quality and easy to learn. In addition, we introduce a simple yet effective progressive distillation method to boost NAT performance. Experiment results on multiple WMT language directions and several representative NAT models show that our approach can realize a flexible trade-off between the quality and complexity of training data for NAT models, achieving strong performances. Further analysis shows that distilling only 5% of the raw translations can help an NAT outperform its counterpart trained on raw data by about 2.4 BLEU.

估計/估計量 · 曲率 · 標量 · 流形 · 樣本 ·

2023 年 8 月 4 日

An Intrinsic Approach to Scalar-Curvature Estimation for Point Clouds

Abigail Hickok,Andrew J. Blumberg

from arxiv, 37 pages, 5 figures

We introduce an intrinsic estimator for the scalar curvature of a data set presented as a finite metric space. Our estimator depends only on the metric structure of the data and not on an embedding in $\mathbb{R}^n$. We show that the estimator is consistent in the sense that for points sampled from a probability measure on a compact Riemannian manifold, the estimator converges to the scalar curvature as the number of points increases. To justify its use in applications, we show that the estimator is stable with respect to perturbations of the metric structure, e.g., noise in the sample or error estimating the intrinsic metric. We validate our estimator experimentally on synthetic data that is sampled from manifolds with specified curvature.

估計/估計量 · Integration · 稀疏 · 穩健性 · MoDELS ·

2023 年 8 月 4 日

Diffusion-Augmented Depth Prediction with Sparse Annotations

Jiaqi Li,Yiran Wang,Zihao Huang,Jinghong Zheng,Ke Xian,Zhiguo Cao,Jianming Zhang

from arxiv, Accepted by ACM MM'2023

Depth estimation aims to predict dense depth maps. In autonomous driving scenes, sparsity of annotations makes the task challenging. Supervised models produce concave objects due to insufficient structural information. They overfit to valid pixels and fail to restore spatial structures. Self-supervised methods are proposed for the problem. Their robustness is limited by pose estimation, leading to erroneous results in natural scenes. In this paper, we propose a supervised framework termed Diffusion-Augmented Depth Prediction (DADP). We leverage the structural characteristics of diffusion model to enforce depth structures of depth models in a plug-and-play manner. An object-guided integrality loss is also proposed to further enhance regional structure integrality by fetching objective information. We evaluate DADP on three driving benchmarks and achieve significant improvements in depth structures and robustness. Our work provides a new perspective on depth estimation with sparse annotations in autonomous driving scenes.

差分進化 · Networking · MoDELS · Neural Networks · 變換 ·

2023 年 8 月 4 日

Differential Evolution Algorithm based Hyper-Parameters Selection of Transformer Neural Network Model for Load Forecasting

Anuvab Sen,Arul Rhik Mazumder,Udayon Sen

from arxiv, 6 Pages, 6 Figures, 2 Tables

Accurate load forecasting plays a vital role in numerous sectors, but accurately capturing the complex dynamics of dynamic power systems remains a challenge for traditional statistical models. For these reasons, time-series models (ARIMA) and deep-learning models (ANN, LSTM, GRU, etc.) are commonly deployed and often experience higher success. In this paper, we analyze the efficacy of the recently developed Transformer-based Neural Network model in Load forecasting. Transformer models have the potential to improve Load forecasting because of their ability to learn long-range dependencies derived from their Attention Mechanism. We apply several metaheuristics namely Differential Evolution to find the optimal hyperparameters of the Transformer-based Neural Network to produce accurate forecasts. Differential Evolution provides scalable, robust, global solutions to non-differentiable, multi-objective, or constrained optimization problems. Our work compares the proposed Transformer based Neural Network model integrated with different metaheuristic algorithms by their performance in Load forecasting based on numerical metrics such as Mean Squared Error (MSE) and Mean Absolute Percentage Error (MAPE). Our findings demonstrate the potential of metaheuristic-enhanced Transformer-based Neural Network models in Load forecasting accuracy and provide optimal hyperparameters for each model.

控制器 · Performer · MoDELS · 可約的 · Performance ·

2023 年 8 月 3 日

Planning and Control for a Dynamic Morphing-Wing UAV Using a Vortex Particle Model

Gino Perrotta,Luca Scheuer,Yocheved Kopel,Max Basescu,Adam Polevoy,Kevin Wolfe,Joseph Moore

Achieving precise, highly-dynamic maneuvers with Unmanned Aerial Vehicles (UAVs) is a major challenge due to the complexity of the associated aerodynamics. In particular, unsteady effects -- as might be experienced in post-stall regimes or during sudden vehicle morphing -- can have an adverse impact on the performance of modern flight control systems. In this paper, we present a vortex particle model and associated model-based controller capable of reasoning about the unsteady aerodynamics during aggressive maneuvers. We evaluate our approach in hardware on a morphing-wing UAV executing post-stall perching maneuvers. Our results show that the use of the unsteady aerodynamics model improves performance during both fixed-wing and dynamic-wing perching, while the use of wing-morphing planned with quasi-steady aerodynamics results in reduced performance. While the focus of this paper is a pre-computed control policy, we believe that, with sufficient computational resources, our approach could enable online planning in the future.

Cognition · 虛擬現實（VR） · SPE · Performer · 設計 ·

2023 年 8 月 3 日

A Virtual Reality Game to Improve Physical and Cognitive Acuity

Blooma John,Ramanathan Subramanian,Jayan Chirayath Kurian

from arxiv, 5 Figures

We present the Virtual Human Benchmark (VHB) game to evaluate and improve physical and cognitive acuity. VHB simulates in 3D the BATAK lightboard game, which is designed to improve physical reaction and hand-eye coordination, on the \textit{Oculus Rift} and \textit{Quest} headsets. The game comprises the \textit{reaction}, \textit{accumulator} and \textit{sequence} modes; \bj{along} with the \textit{reaction} and \textit{accumulator} modes which mimic BATAK functionalities, the \textit{sequence} mode involves the user repeating a sequence of illuminated targets with increasing complexity to train visual memory and cognitive processing. A first version of the game (VHB v1) was evaluated against the real-world BATAK by 20 users, and their feedback was utilized to improve game design and obtain a second version (VHB v2). Another study to evaluate VHB v2 was conducted with 20 users, whose results confirmed that the deign improvements enhanced game usability and user experience in multiple respects. Also, logging and visualization of performance data such as \textit{reaction time}, \textit{speed between targets} and \textit{completed sequence patterns} provides useful data for coaches/therapists monitoring sports/rehabilitation regimens.

entity · 小樣本學習 · 注意力機制 · 圖 · Networking ·

2020 年 10 月 19 日

Adaptive Attentional Network for Few-Shot Knowledge Graph Completion

Jiawei Sheng,Shu Guo,Zhenyu Chen,Juwei Yue,Lihong Wang,Tingwen Liu,Hongbo Xu

from arxiv, 11 pages, 3 figures

Few-shot Knowledge Graph (KG) completion is a focus of current research, where each task aims at querying unseen facts of a relation given its few-shot reference entity pairs. Recent attempts solve this problem by learning static representations of entities and references, ignoring their dynamic properties, i.e., entities may exhibit diverse roles within task relations, and references may make different contributions to queries. This work proposes an adaptive attentional network for few-shot KG completion by learning adaptive entity and reference representations. Specifically, entities are modeled by an adaptive neighbor encoder to discern their task-oriented roles, while references are modeled by an adaptive query-aware aggregator to differentiate their contributions. Through the attention mechanism, both entities and references can capture their fine-grained semantic meanings, and thus render more expressive representations. This will be more predictive for knowledge acquisition in the few-shot scenario. Evaluation in link prediction on two public datasets shows that our approach achieves new state-of-the-art results with different few-shot sizes.

Vision · 模型評估 · 可約的 · 計算機視覺 · DNN ·

2020 年 3 月 24 日

A Survey of Methods for Low-Power Deep Learning and Computer Vision

Abhinav Goel,Caleb Tung,Yung-Hsiang Lu,George K. Thiruvathukal

from arxiv, Accepted for publication at 2020 IEEE 6th World Forum on Internet of Things (WF-IoT), New Orleans, LA, USA 2020

Deep neural networks (DNNs) are successful in many computer vision tasks. However, the most accurate DNNs require millions of parameters and operations, making them energy, computation and memory intensive. This impedes the deployment of large DNNs in low-power devices with limited compute resources. Recent research improves DNN models by reducing the memory requirement, energy consumption, and number of operations without significantly decreasing the accuracy. This paper surveys the progress of low-power deep learning and computer vision, specifically in regards to inference, and discusses the methods for compacting and accelerating DNN models. The techniques can be divided into four major categories: (1) parameter quantization and pruning, (2) compressed convolutional filters and matrix factorization, (3) network architecture search, and (4) knowledge distillation. We analyze the accuracy, advantages, disadvantages, and potential solutions to the problems with the techniques in each category. We also discuss new evaluation metrics as a guideline for future research.

INFORMS · 圖像分割 · Networking · state-of-the-art · Integration ·

2019 年 4 月 9 日

Cross-Modal Self-Attention Network for Referring Image Segmentation

Linwei Ye,Mrigank Rochan,Zhi Liu,Yang Wang

from arxiv, Accepted to CVPR2019

We consider the problem of referring image segmentation. Given an input image and a natural language expression, the goal is to segment the object referred by the language expression in the image. Existing works in this area treat the language expression and the input image separately in their representations. They do not sufficiently capture long-range correlations between these two modalities. In this paper, we propose a cross-modal self-attention (CMSA) module that effectively captures the long-range dependencies between linguistic and visual features. Our model can adaptively focus on informative words in the referring expression and important regions in the input image. In addition, we propose a gated multi-level fusion module to selectively integrate self-attentive cross-modal features corresponding to different levels in the image. This module controls the information flow of features at different levels. We validate the proposed approach on four evaluation datasets. Our proposed approach consistently outperforms existing state-of-the-art methods.

Performance · 相似度度量 · Performer · state-of-the-art · 圖像檢索 ·

2018 年 4 月 6 日

Cross-Domain Image Matching with Deep Feature Maps

Bailey Kong,James Supancic,Deva Ramanan,Charless C. Fowlkes

We investigate the problem of automatically determining what type of shoe left an impression found at a crime scene. This recognition problem is made difficult by the variability in types of crime scene evidence (ranging from traces of dust or oil on hard surfaces to impressions made in soil) and the lack of comprehensive databases of shoe outsole tread patterns. We find that mid-level features extracted by pre-trained convolutional neural nets are surprisingly effective descriptors for this specialized domains. However, the choice of similarity measure for matching exemplars to a query image is essential to good performance. For matching multi-channel deep features, we propose the use of multi-channel normalized cross-correlation and analyze its effectiveness. Our proposed metric significantly improves performance in matching crime scene shoeprints to laboratory test impressions. We also show its effectiveness in other cross-domain image retrieval problems: matching facade images to segmentation labels and aerial photos to map images. Finally, we introduce a discriminatively trained variant and fine-tune our system through our proposed metric, obtaining state-of-the-art performance.