In this study, we present a method for synthesizing novel views from a single 360-degree RGB-D image based on the neural radiance field (NeRF). Prior studies relied on the neighborhood interpolation capability of multi-layer perceptrons to complete missing regions caused by occlusion and zooming, which leads to artifacts. In the proposed method, the input image is reprojected to 360-degree RGB images at other camera positions, the missing regions of the reprojected images are completed by a 2D image generative model, and the completed images are used to train the NeRF. Because the multiple completed images contain inconsistencies in 3D, we introduce a method that trains the NeRF model on a subset of completed images that covers the target scene with little overlap among completed regions. Selecting such a subset can be formulated as a maximum-weight independent set problem, which we solve with simulated annealing. Experiments demonstrate that the proposed method can synthesize plausible novel views while preserving the features of the scene for both artificial and real-world data.
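
As a rough illustration of the selection step, the sketch below runs simulated annealing on a maximum-weight independent set instance: nodes are completed images, edges connect pairs whose completed regions overlap, and weights would reflect scene coverage. The graph construction, weights, and annealing schedule here are illustrative assumptions, not the paper's exact formulation.

```python
import math
import random

def simulated_annealing_mwis(weights, edges, steps=20000, t0=1.0, t_min=1e-3):
    """Approximate a maximum-weight independent set by simulated annealing.

    weights: list of node weights (e.g., scene coverage of each completed image).
    edges:   set of frozenset({u, v}) pairs whose completed regions overlap.
    """
    n = len(weights)
    adj = [set() for _ in range(n)]
    for e in edges:
        u, v = tuple(e)
        adj[u].add(v)
        adj[v].add(u)

    selected = set()                              # current independent set
    score = 0.0
    best, best_score = set(), 0.0
    for step in range(steps):
        t = max(t_min, t0 * (1 - step / steps))   # linear cooling schedule
        v = random.randrange(n)
        if v in selected:                         # propose removing v
            delta = -weights[v]
            new_sel = selected - {v}
        else:                                     # propose adding v, evicting conflicting neighbors
            conflicts = adj[v] & selected
            delta = weights[v] - sum(weights[u] for u in conflicts)
            new_sel = (selected - conflicts) | {v}
        if delta >= 0 or random.random() < math.exp(delta / t):
            selected, score = new_sel, score + delta
            if score > best_score:
                best, best_score = set(selected), score
    return best, best_score
```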

Related Content

New Intent Discovery (NID) aims to recognize both new and known intents from unlabeled data with the aid of limited labeled data containing only known intents. Without considering structural relationships between samples, previous methods generate noisy supervisory signals that cannot strike a balance between quantity and quality, hindering the formation of new intent clusters and the effective transfer of pre-training knowledge. To mitigate this limitation, we propose a novel Diffusion Weighted Graph Framework (DWGF) that captures both the semantic similarities and the structural relationships inherent in the data, enabling more sufficient and reliable supervisory signals. Specifically, for each sample, we diffuse neighborhood relationships along semantic paths guided by its nearest neighbors for multiple hops to characterize its local structure discriminatively. We then sample its positive keys and weight them based on semantic similarities and local structures for contrastive learning. During inference, we further propose a Graph Smoothing Filter (GSF) that explicitly exploits the structural relationships to filter out the high-frequency noise embodied in semantically ambiguous samples on cluster boundaries. Extensive experiments show that our method outperforms state-of-the-art models on all evaluation metrics across multiple benchmark datasets. Code and data are available at //github.com/yibai-shi/DWGF.
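
A minimal sketch of the multi-hop diffusion idea, assuming embeddings are L2-normalized and neighborhoods come from cosine similarity; the hop count, decay, and adjacency construction below are illustrative assumptions, not the paper's exact scheme.

```python
import numpy as np

def diffused_neighbors(emb, k=5, hops=2, decay=0.5):
    """Diffuse neighborhood relations along semantic paths for several hops.

    emb: (n, d) array of L2-normalized sample embeddings.
    Returns an (n, n) weight matrix where entry (i, j) accumulates evidence
    that j lies in i's local structure, with farther hops discounted.
    """
    n = emb.shape[0]
    sim = emb @ emb.T                          # cosine similarity
    np.fill_diagonal(sim, -np.inf)             # exclude self from neighbors
    knn = np.zeros((n, n))
    idx = np.argsort(-sim, axis=1)[:, :k]      # k nearest neighbors per sample
    rows = np.repeat(np.arange(n), k)
    knn[rows, idx.ravel()] = 1.0               # one-hop adjacency

    weights = np.zeros((n, n))
    reach = np.eye(n)
    for h in range(1, hops + 1):
        reach = reach @ knn                    # paths of length h through neighbors
        weights += (decay ** (h - 1)) * (reach > 0)
    np.fill_diagonal(weights, 0.0)
    return weights
```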

In this study, we present two distinct Deep Reinforcement Learning (Deep-RL) approaches aimed at enhancing mapless navigation for a ground-based mobile robot. The research methodology is primarily a comparative analysis between a Deep-RL strategy based on the foundational Deep Q-Network (DQN) algorithm and an alternative based on the Double Deep Q-Network (DDQN) algorithm. The agents use 24 laser range measurements together with the robot's positional and orientation differences relative to the target; from these inputs they choose navigation actions that set the robot's velocities. With this parsimonious sensory setup, we show that an agent can be trained to perform navigation tasks and avoid obstacles without relying on complex sensory inputs such as those used in image-based methods. The proposed methodology is evaluated in three different real environments, revealing that Double Deep structures significantly enhance the navigation capabilities of mobile robots compared to simple Q structures.
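
The core difference between the two agents lies in the bootstrap target used for learning. A minimal sketch, assuming discrete actions and a separate target network (array shapes and defaults are illustrative):

```python
import numpy as np

def dqn_target(reward, next_q_target, gamma=0.99, done=False):
    """Classic DQN: max over the target network's own estimates,
    which tends to overestimate action values."""
    return reward if done else reward + gamma * np.max(next_q_target)

def ddqn_target(reward, next_q_online, next_q_target, gamma=0.99, done=False):
    """Double DQN: the online network selects the action,
    the target network evaluates it, reducing overestimation bias."""
    if done:
        return reward
    best_action = int(np.argmax(next_q_online))
    return reward + gamma * next_q_target[best_action]
```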

Large Language Models (LLMs) serve as powerful, black-box readers in the retrieve-then-read pipeline, making remarkable progress on knowledge-intensive tasks. This work introduces Rewrite-Retrieve-Read, a new framework for retrieval-augmented LLMs that replaces the previous retrieve-then-read pipeline and approaches the problem from the perspective of query rewriting. Unlike prior studies that adapt either the retriever or the reader, our approach adapts the search query itself, since there is inevitably a gap between the input text and the knowledge needed for retrieval. We first prompt an LLM to generate a rewritten query, then use a web search engine to retrieve contexts. Furthermore, to better align the query with the frozen modules, we propose a trainable scheme for the pipeline: a small language model is adopted as a trainable rewriter to cater to the black-box LLM reader, and it is trained by reinforcement learning using the feedback of the LLM reader. Evaluation is conducted on downstream tasks: open-domain QA and multiple-choice QA. Experimental results show consistent performance improvements, indicating that our framework is effective and scalable and offers a new paradigm for retrieval-augmented LLMs.
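
A minimal sketch of the pipeline's control flow, with `rewrite_model`, `web_search`, and `reader_llm` as hypothetical stand-ins for the rewriter, search engine, and frozen reader; the prompts and interfaces are assumptions, not the paper's exact ones.

```python
def rewrite_retrieve_read(question, rewrite_model, web_search, reader_llm):
    """Three stages: rewrite the query, retrieve with it, then read.

    rewrite_model: small trainable LM mapping a question to a search query.
    web_search:    callable returning a list of context passages for a query.
    reader_llm:    frozen black-box LLM that answers given contexts.
    """
    # 1. Rewrite: bridge the gap between the input text and retrievable knowledge.
    query = rewrite_model(f"Rewrite as a web search query: {question}")
    # 2. Retrieve: fetch contexts with the rewritten query.
    contexts = web_search(query)
    # 3. Read: the frozen reader answers conditioned on the retrieved contexts.
    prompt = "\n".join(contexts) + f"\n\nQuestion: {question}\nAnswer:"
    answer = reader_llm(prompt)
    return answer, query  # the query is kept so RL can reward good rewrites
```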

This paper presents a novel approach to online hidden representation learning for decision-making under uncertainty in non-stationary, partially observable environments. The proposed algorithm, Distributed Hebbian Temporal Memory (DHTM), is based on the factor graph formalism and a multi-component neuron model. DHTM aims to capture sequential data relationships and make cumulative predictions about future observations, forming a Successor Representation (SR). Inspired by neurophysiological models of the neocortex, the algorithm uses distributed representations, sparse transition matrices, and local Hebbian-like learning rules to overcome the instability and slow learning of traditional temporal memory algorithms such as RNNs and HMMs. Experimental results demonstrate that DHTM outperforms a classical LSTM, performs comparably to more advanced RNN-like algorithms, and speeds up Temporal Difference learning of the SR in changing environments. Additionally, we compare the SRs produced by DHTM with those of CSCG, another biologically inspired HMM-like algorithm. Our findings suggest that DHTM is a promising approach to online hidden representation learning in dynamic environments.
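
For context, the Successor Representation that such predictions feed into is typically learned with a TD rule; a minimal tabular sketch follows (the tabular setting is an illustrative assumption, as DHTM itself operates on distributed representations):

```python
import numpy as np

def td_update_sr(M, s, s_next, alpha=0.1, gamma=0.95):
    """One TD(0) update of a tabular Successor Representation.

    M: (n_states, n_states) matrix; M[s, j] estimates the discounted
       expected number of future visits to state j starting from s.
    """
    onehot = np.zeros(M.shape[0])
    onehot[s] = 1.0
    td_error = onehot + gamma * M[s_next] - M[s]
    M[s] += alpha * td_error
    return M
```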

Audio fingerprinting is a well-established solution for identifying songs from short recording excerpts. Popular methods rely on the extraction of sparse representations, generally spectral peaks, and have proven to be accurate, fast, and scalable to large collections. However, real-world audio identification often happens in noisy environments, which can cause these systems to fail. In this work, we tackle this problem by introducing and releasing a new audio augmentation pipeline that adds noise to music snippets in a realistic way, stochastically mimicking real-world scenarios. We then propose and release a deep learning model that removes noisy components from spectrograms in order to improve the accuracy of peak-based fingerprinting systems. We show that the addition of our model improves the identification performance of commonly used audio fingerprinting systems, even under noisy conditions.
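
A minimal sketch of the kind of stochastic mixing such an augmentation pipeline performs: a noise recording is added to a music snippet at a randomly drawn signal-to-noise ratio. The SNR range and single noise source here are simplifying assumptions, not the released pipeline's design.

```python
import numpy as np

def mix_at_random_snr(clean, noise, snr_db_range=(0.0, 15.0), rng=None):
    """Add real-world noise to a music snippet at a random SNR.

    clean, noise: 1-D float arrays of equal length (mono waveforms).
    """
    rng = rng or np.random.default_rng()
    snr_db = rng.uniform(*snr_db_range)
    clean_power = np.mean(clean ** 2)
    noise_power = np.mean(noise ** 2) + 1e-12
    # Scale the noise so the resulting mixture has the sampled SNR:
    # 10 * log10(clean_power / scaled_noise_power) == snr_db.
    scale = np.sqrt(clean_power / (noise_power * 10 ** (snr_db / 10)))
    return clean + scale * noise
```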

This work presents an algorithm for tracking the shape of multiple entangling Deformable Linear Objects (DLOs) from a sequence of RGB-D images. The algorithm runs in real time and improves on previous single-DLO tracking approaches by enabling the tracking of multiple objects, which is achieved using Global-Local Topology Preservation (GLTP). Specifically, the geodesic distance in GLTP is used to define both the distance between separate objects and the distance between different parts of the same object. Tracking of multiple entangling DLOs is demonstrated experimentally, and the source code is publicly released.
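
To illustrate why geodesic distance matters here: two points on different, entangled DLOs can be close in Euclidean space yet have no short path along either object. A minimal sketch computing geodesic distance along a polyline model of one DLO (the polyline representation is an illustrative assumption):

```python
import numpy as np

def geodesic_distance(polyline, i, j):
    """Distance between nodes i and j measured along the DLO itself.

    polyline: (n, 3) array of ordered 3-D node positions on one DLO.
    Unlike Euclidean distance, this grows with the arc length between
    the nodes, so points that are spatially close but belong to
    different entangled objects (no connecting path along the object)
    are not confused with neighbors on the same object.
    """
    lo, hi = sorted((i, j))
    segments = np.diff(polyline[lo:hi + 1], axis=0)   # consecutive edges
    return float(np.sum(np.linalg.norm(segments, axis=1)))
```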

This paper offers a comprehensive review of the research on Natural Language Generation (NLG) over the past two decades, especially in relation to deep learning methods for data-to-text and text-to-text generation, as well as new applications of NLG technology. This survey aims to (a) give the latest synthesis of deep learning research on the core NLG tasks and the architectures adopted in the field; (b) detail meticulously and comprehensively the various NLG tasks and datasets, and draw attention to the challenges in NLG evaluation, focusing on different evaluation methods and their relationships; and (c) highlight future areas of emphasis and relatively recent research issues that arise from the increasing synergy between NLG and other artificial intelligence areas, such as computer vision, text, and computational creativity.

In this paper, we propose a novel Feature Decomposition and Reconstruction Learning (FDRL) method for effective facial expression recognition. We view expression information as the combination of shared information (expression similarities) across different expressions and unique information (expression-specific variations) for each expression. More specifically, FDRL consists of two crucial networks: a Feature Decomposition Network (FDN) and a Feature Reconstruction Network (FRN). FDN first decomposes the basic features extracted by a backbone network into a set of facial action-aware latent features to model expression similarities. FRN then captures the intra-feature and inter-feature relationships among the latent features to characterize expression-specific variations and reconstructs the expression feature. To this end, two modules, an intra-feature relation modeling module and an inter-feature relation modeling module, are developed in FRN. Experimental results on both in-the-lab databases (CK+, MMI, and Oulu-CASIA) and in-the-wild databases (RAF-DB and SFEW) show that the proposed FDRL method consistently achieves higher recognition accuracy than several state-of-the-art methods. This clearly highlights the benefit of feature decomposition and reconstruction for classifying expressions.
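
A minimal PyTorch sketch of the decompose-then-reconstruct pattern, assuming a fixed number of latent features, simple linear projections, and a per-feature gate in place of the paper's intra- and inter-feature relation modules, which are more elaborate than this.

```python
import torch
import torch.nn as nn

class DecomposeReconstruct(nn.Module):
    """Decompose a backbone feature into K latent features, then
    reconstruct an expression feature as their weighted sum."""

    def __init__(self, dim=512, n_latent=8):
        super().__init__()
        # FDN-like step: one projection per facial-action-aware latent feature.
        self.decompose = nn.ModuleList(
            [nn.Linear(dim, dim) for _ in range(n_latent)]
        )
        # FRN-like step (simplified): predict one importance weight per
        # latent feature instead of full relation modeling.
        self.weigh = nn.Linear(dim, 1)

    def forward(self, x):                                             # x: (B, dim)
        latents = torch.stack([f(x) for f in self.decompose], dim=1)  # (B, K, dim)
        w = torch.softmax(self.weigh(latents).squeeze(-1), dim=1)     # (B, K)
        return (w.unsqueeze(-1) * latents).sum(dim=1)                 # (B, dim)
```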

In this paper, we introduce the Reinforced Mnemonic Reader for machine reading comprehension, which enhances previous attentive readers in two aspects. First, a reattention mechanism is proposed to refine current attentions by directly accessing past attentions that are temporally memorized in a multi-round alignment architecture, avoiding the problems of attention redundancy and attention deficiency. Second, a new optimization approach called dynamic-critical reinforcement learning is introduced to extend the standard supervised method. It encourages the model to predict a more acceptable answer, addressing the convergence suppression problem that occurs in traditional reinforcement learning algorithms. Extensive experiments on the Stanford Question Answering Dataset (SQuAD) show that our model achieves state-of-the-art results. Moreover, our model outperforms previous systems by over 6% on both Exact Match and F1 metrics on two adversarial SQuAD datasets.
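
A rough sketch of the reattention idea: the similarity scores of the current alignment round are biased by the attention distribution memorized from the previous round before the softmax. The additive combination and single past round here are simplifying assumptions, not the paper's exact formulation.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def reattention(sim, past_attn, gamma=0.5):
    """Refine current attention with temporally memorized past attention.

    sim:       (m, n) raw similarity scores between query and context words.
    past_attn: (m, n) attention distribution from the previous alignment round.
    Conditioning on past attention discourages attending to the same word
    redundantly across rounds (redundancy) or missing it entirely (deficiency).
    """
    return softmax(sim + gamma * past_attn, axis=-1)
```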

We propose a novel single-shot object detection network named Detection with Enriched Semantics (DES). Our motivation is to enrich the semantics of object detection features within a typical deep detector via a semantic segmentation branch and a global activation module. The segmentation branch is supervised by weak segmentation ground truth, i.e., no extra annotation is required. In conjunction with that, the global activation module learns the relationship between channels and object classes in a self-supervised manner. Comprehensive experimental results on both the PASCAL VOC and MS COCO detection datasets demonstrate the effectiveness of the proposed method. In particular, with a VGG16-based DES, we achieve an mAP of 81.7 on VOC2007 test and an mAP of 32.8 on COCO test-dev, with an inference speed of 31.5 milliseconds per image on a Titan Xp GPU. With a lower-resolution version, we achieve an mAP of 79.7 on VOC2007 with an inference speed of 13.0 milliseconds per image.
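
A minimal sketch of what a channel-wise global activation module can look like, in the spirit of squeeze-and-excitation gating: pool each channel globally, predict per-channel weights, and rescale the feature map. This is an illustrative stand-in; the paper's module and its self-supervised objective may differ.

```python
import torch
import torch.nn as nn

class GlobalActivation(nn.Module):
    """Rescale feature channels using globally pooled statistics so that
    channels correlated with the present object classes are amplified."""

    def __init__(self, channels, reduction=16):
        super().__init__()
        self.gate = nn.Sequential(
            nn.Linear(channels, channels // reduction),
            nn.ReLU(inplace=True),
            nn.Linear(channels // reduction, channels),
            nn.Sigmoid(),
        )

    def forward(self, x):                       # x: (B, C, H, W)
        pooled = x.mean(dim=(2, 3))             # global average pool -> (B, C)
        w = self.gate(pooled)                   # per-channel activation weights
        return x * w.unsqueeze(-1).unsqueeze(-1)
```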
