In this work, we study universal zero-shot segmentation, which aims to achieve panoptic, instance, and semantic segmentation for novel categories without any training samples. This zero-shot ability relies on inter-class relationships in semantic space to transfer the visual knowledge learned from seen categories to unseen ones. It is therefore desirable to bridge the semantic and visual spaces well and to apply the semantic relationships to visual feature learning. We introduce a generative model that synthesizes features for unseen categories, which links the semantic and visual spaces and addresses the lack of unseen training data. Furthermore, to mitigate the domain gap between the semantic and visual spaces, we first enhance the vanilla generator with learned primitives, each of which contains fine-grained attributes related to categories, and synthesize unseen features by selectively assembling these primitives. Second, we propose to disentangle the visual feature into a semantic-related part and a semantic-unrelated part; the latter contains useful visual classification clues but is less relevant to semantic representation. The inter-class relationships of the semantic-related visual features are then required to align with those in semantic space, thereby transferring semantic knowledge to visual feature learning. The proposed approach achieves state-of-the-art performance on zero-shot panoptic segmentation, instance segmentation, and semantic segmentation. Code is available at //henghuiding.github.io/PADing/.

Related content

Global minimum · Covariance matrix · Minimum point · Estimation/estimator · Regularization term

August 13, 2023

Spectrally-Corrected and Regularized Global Minimum Variance Portfolio for Spiked Model

Hua Li,Jiafu Huang

To address the shortcomings of traditional sample covariance matrix estimation, this paper proposes an improved global minimum variance portfolio model, named the spectrally-corrected and regularized global minimum variance portfolio (SCRGMVP), which outperforms the traditional risk model. The key idea is that, under the assumption that the population covariance matrix follows the spiked model, the method combines the sample spectrally-corrected covariance matrix with regularization. Simulations on real and synthetic data show that our method not only outperforms traditional sample covariance matrix estimation (SCME), shrinkage estimation (SHRE), weighted shrinkage estimation (WSHRE), and simple spectral correction estimation (SCE), but also has lower computational complexity.
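For readers unfamiliar with the global minimum variance portfolio, the sketch below computes the standard closed-form weights, w = Σ⁻¹1 / (1ᵀΣ⁻¹1), from a given covariance matrix. It is not the SCRGMVP estimator from the abstract: the function names and the optional `ridge` term (a plain diagonal loading) are illustrative assumptions, standing in for whatever corrected covariance estimate one would plug in.

```python
def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Work on an augmented copy so the inputs are not modified.
    m = [list(row) + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("matrix is singular or nearly singular")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    # Back substitution on the upper-triangular system.
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def gmv_weights(cov, ridge=0.0):
    """Global minimum variance weights for covariance matrix `cov`.

    `ridge` adds a constant to the diagonal; it is an illustrative
    regularizer, not the scheme proposed in the paper.
    """
    n = len(cov)
    reg = [[cov[i][j] + (ridge if i == j else 0.0) for j in range(n)]
           for i in range(n)]
    x = solve_linear(reg, [1.0] * n)  # x = inverse(cov) * ones
    total = sum(x)                    # ones^T * inverse(cov) * ones
    return [v / total for v in x]


if __name__ == "__main__":
    # Two uncorrelated assets with variances 0.04 and 0.01.
    print(gmv_weights([[0.04, 0.0], [0.0, 0.01]]))
```

With uncorrelated assets the weights are inversely proportional to the variances, so the lower-variance asset receives the larger share.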

Robot · Round · MoDELS · INTERACT · Design

August 11, 2023

Towards a Causal Probabilistic Framework for Prediction, Action-Selection & Explanations for Robot Block-Stacking Tasks

Ricardo Cannizzaro,Jonathan Routley,Lars Kunze

from arxiv, 3 pages, 3 figures, accepted to the "Causality for Robotics: Answering the Question of Why" workshop at the 2023 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS)

Uncertainties in the real world mean that it is impossible for system designers to anticipate and explicitly design for all scenarios that a robot might encounter. Thus, robots designed like this are fragile and fail outside of highly-controlled environments. Causal models provide a principled framework to encode formal knowledge of the causal relationships that govern the robot's interaction with its environment, in addition to probabilistic representations of noise and uncertainty typically encountered by real-world robots. Combined with causal inference, these models permit an autonomous agent to understand, reason about, and explain its environment. In this work, we focus on a robot block-stacking task because it demonstrates fundamental perception and manipulation capabilities required by many applications, including warehouse logistics and domestic human support robotics. We propose a novel causal probabilistic framework to embed a physics simulation capability into a structural causal model to permit robots to perceive and assess the current state of a block-stacking task, reason about the next-best action from placement candidates, and generate post-hoc counterfactual explanations. We provide exemplar next-best action selection results and outline planned experimentation in simulated and real-world robot block-stacking tasks.

HTTPS · Automatic question answering · Identifiable · Extensibility · Integration

August 10, 2023

Progressive Spatio-temporal Perception for Audio-Visual Question Answering

Guangyao Li,Wenxuan Hou,Di Hu

from arxiv, Accepted by ACM MM 2023

The Audio-Visual Question Answering (AVQA) task aims to answer questions about different visual objects, sounds, and their associations in videos. Such naturally multi-modal videos are composed of rich and complex dynamic audio-visual components, most of which may be unrelated to the given question or may even interfere with answering the content of interest. Conversely, focusing only on the question-aware audio-visual content removes this interference and enables the model to answer more efficiently. In this paper, we propose a Progressive Spatio-Temporal Perception Network (PSTP-Net), which contains three modules that progressively identify key spatio-temporal regions w.r.t. questions. Specifically, a temporal segment selection module is first introduced to select the most relevant audio-visual segments related to the given question. Then, a spatial region selection module is utilized to choose the most relevant regions associated with the question from the selected temporal segments. To further refine the selection of features, an audio-guided visual attention module is employed to perceive the association between audio and the selected spatial regions. Finally, the spatio-temporal features from these modules are integrated for answering the question. Extensive experimental results on the public MUSIC-AVQA and AVQA datasets provide compelling evidence of the effectiveness and efficiency of PSTP-Net. Code is available at: //github.com/GeWu-Lab/PSTP-Net
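The idea of selecting the segments most relevant to a question can be illustrated with a very small sketch. This is not the PSTP-Net module, which is learned end to end. It assumes each segment and the question are already embedded as plain vectors, scores them by cosine similarity, and keeps the top k. The function names are hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def select_top_k_segments(question_vec, segment_vecs, k):
    """Return indices of the k segments most similar to the question.

    Indices are returned in temporal (ascending) order so that later
    stages still see the selected segments as a sequence.
    """
    scores = [cosine(question_vec, seg) for seg in segment_vecs]
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return sorted(ranked[:k])


if __name__ == "__main__":
    question = [1.0, 0.0]
    segments = [[0.0, 1.0], [1.0, 0.0], [1.0, 1.0], [-1.0, 0.0]]
    print(select_top_k_segments(question, segments, k=2))
```

The same top-k pattern could be applied a second time over spatial regions inside the selected segments, which is the progressive narrowing the abstract describes.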

Performer · Few-shot learning · Knowledge · Learning · Graph

August 10, 2023

Few-Shot Data-to-Text Generation via Unified Representation and Multi-Source Learning

Alexander Hanbo Li,Mingyue Shang,Evangelia Spiliopoulou,Jie Ma,Patrick Ng,Zhiguo Wang,Bonan Min,William Wang,Kathleen McKeown,Vittorio Castelli,Dan Roth,Bing Xiang

We present a novel approach for structured data-to-text generation that addresses the limitations of existing methods that primarily focus on specific types of structured data. Our proposed method aims to improve performance in multi-task training, zero-shot and few-shot scenarios by providing a unified representation that can handle various forms of structured data such as tables, knowledge graph triples, and meaning representations. We demonstrate that our proposed approach can effectively adapt to new structured forms, and can improve performance in comparison to current methods. For example, our method resulted in a 66% improvement in zero-shot BLEU scores when transferring models trained on table inputs to a knowledge graph dataset. Our proposed method is an important step towards a more general data-to-text generation framework.
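To make the notion of a unified representation concrete, the sketch below maps both tables and knowledge graph triples to the same flat string, so that a single text generator could consume either. The marker tokens (`<head>`, `<relation>`, `<tail>`) and the function names are hypothetical and are not taken from the paper, whose actual representation may differ.

```python
def table_to_triples(header, rows, key_column=0):
    """Turn each table row into (subject, column name, cell) triples.

    The cell in `key_column` is treated as the row's subject; this is
    an illustrative convention, not the paper's.
    """
    triples = []
    for row in rows:
        subject = row[key_column]
        for j, cell in enumerate(row):
            if j != key_column:
                triples.append((subject, header[j], cell))
    return triples


def linearize_triples(triples):
    """Flatten triples into one string with hypothetical marker tokens."""
    return " ".join(
        f"<head> {h} <relation> {r} <tail> {t}" for h, r, t in triples
    )


if __name__ == "__main__":
    header = ["name", "born", "field"]
    rows = [["Ada Lovelace", "1815", "mathematics"]]
    # A table and a knowledge graph end up in the same input format.
    print(linearize_triples(table_to_triples(header, rows)))
    print(linearize_triples([("Ada Lovelace", "occupation", "writer")]))
```

Because both input types share one format, a model trained on table inputs sees nothing structurally new when it is given knowledge graph inputs, which is the kind of transfer the abstract reports.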

INTERACT · Optimizer · TOOLS · Discretization · Analysis

August 10, 2023

Density-Based Topology Optimization of High-Fidelity Fluid-Structure Interaction Problems with Large Deformations

Mohamed Abdelhamid,Aleksander Czekanski

The application of modern topology optimization techniques to single physics systems has seen great advances in the last three decades. However, the application of these tools to sophisticated multiphysics systems such as fluid-structure interactions is still lagging behind, mainly due to the multidisciplinary and complex nature of such systems. In this work, we implement topology optimization of high-fidelity, fully-coupled fluid-structure interaction problems with large deformations. We use the arbitrary Lagrangian-Eulerian approach to deform the fluid mesh as a pseudo-structural system such that structural deformations are completely reflected in the fluid flow mesh. The fluid-structure interaction problem is formulated using the three-field formulation and the sensitivity analysis is derived using the discrete adjoint approach. We show through numerical examples the effect of the projection and interpolation parameters on the convergence and topology of the optimized designs and demonstrate the effect of considering the structural deformations in the fluid mesh.

Attention · Attention model · MoDELS · Comprehensibility · Representation

August 9, 2023

Hierarchical Representations for Spatio-Temporal Visual Attention Modeling and Understanding

Miguel-Ángel Fernández-Torres

from arxiv, PhD thesis

This PhD thesis concerns the study and development of hierarchical representations for spatio-temporal visual attention modeling and understanding in video sequences. More specifically, we propose two computational models for visual attention. First, we present a generative probabilistic model for context-aware visual attention modeling and understanding. Second, we develop a deep network architecture for visual attention modeling, which first estimates top-down spatio-temporal visual attention and ultimately serves for modeling attention in the temporal domain.

Discretization · Optimizer · Sample · Robustness · Paper

August 9, 2023

Costate Convergence with Legendre-Lobatto Collocation for Trajectory Optimization

José Garrido,Artemi Makarow,Marco Sagliano,David Seelbinder,Stephan Theil

This paper introduces a new method of discretization that collocates both endpoints of the domain and enables the complete convergence of the costate variables associated with the Hamilton boundary-value problem. This is achieved by adding an "exceptional sample" to the roots of the Legendre-Lobatto polynomial, which makes the associated differentiation matrix full-rank. We study the location of the new sample such that the differentiation matrix is the most robust to perturbations, and we prove that this location is also the choice that mitigates the Runge phenomenon associated with polynomial interpolation. Two benchmark problems are successfully implemented in support of our theoretical findings. The new method is observed to converge exponentially with the number of discretization points used.
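As background, the standard Legendre-Gauss-Lobatto nodes can be computed as follows. They are the endpoints ±1 together with the roots of the derivative of the Legendre polynomial P_N. This sketch covers only that baseline node set and does not include the exceptional sample, whose location is derived in the paper. The function names are illustrative, and the Newton iteration starts from Chebyshev-Lobatto points, which is a common choice rather than one stated in the abstract.

```python
import math


def legendre_pair(n, x):
    """Return (P_{n-1}(x), P_n(x)) via the three-term recurrence, n >= 1."""
    p_prev, p = 1.0, x
    for k in range(1, n):
        p_prev, p = p, ((2 * k + 1) * x * p - k * p_prev) / (k + 1)
    return p_prev, p


def lgl_nodes(n, tol=1e-14, max_iter=100):
    """Return the n + 1 Legendre-Gauss-Lobatto nodes on [-1, 1], ascending.

    The nodes are the roots of q(x) = (1 - x^2) P_n'(x). Using
    q(x) = n (P_{n-1}(x) - x P_n(x)) and q'(x) = -n (n + 1) P_n(x),
    Newton's update is x <- x - (x P_n - P_{n-1}) / ((n + 1) P_n).
    """
    if n < 1:
        raise ValueError("n must be at least 1")
    nodes = []
    for k in range(n + 1):
        x = math.cos(math.pi * k / n)  # Chebyshev-Lobatto initial guess
        for _ in range(max_iter):
            p_prev, p = legendre_pair(n, x)
            step = (x * p - p_prev) / ((n + 1) * p)
            x -= step
            if abs(step) < tol:
                break
        nodes.append(x)
    return sorted(nodes)


if __name__ == "__main__":
    print(lgl_nodes(3))
```

For N = 3 the interior nodes are ±1/√5, which gives a quick sanity check on the output.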

FRN · INFORMS · Networking · MoDELS · Learned

April 12, 2021

Feature Decomposition and Reconstruction Learning for Effective Facial Expression Recognition

Delian Ruan,Yan Yan,Shenqi Lai,Zhenhua Chai,Chunhua Shen,Hanzi Wang

from arxiv, IEEE/CVF Conference on Computer Vision and Pattern Recognition 2021 (CVPR 2021)

In this paper, we propose a novel Feature Decomposition and Reconstruction Learning (FDRL) method for effective facial expression recognition. We view the expression information as the combination of the shared information (expression similarities) across different expressions and the unique information (expression-specific variations) for each expression. More specifically, FDRL mainly consists of two crucial networks: a Feature Decomposition Network (FDN) and a Feature Reconstruction Network (FRN). In particular, FDN first decomposes the basic features extracted from a backbone network into a set of facial action-aware latent features to model expression similarities. Then, FRN captures the intra-feature and inter-feature relationships for latent features to characterize expression-specific variations, and reconstructs the expression feature. To this end, two modules including an intra-feature relation modeling module and an inter-feature relation modeling module are developed in FRN. Experimental results on both the in-the-lab databases (including CK+, MMI, and Oulu-CASIA) and the in-the-wild databases (including RAF-DB and SFEW) show that the proposed FDRL method consistently achieves higher recognition accuracy than several state-of-the-art methods. This clearly highlights the benefit of feature decomposition and reconstruction for classifying expressions.

INFORMS · Graph · Reducible · Knowledge graph · Identifiable

August 29, 2018

Multi-Task Identification of Entities, Relations, and Coreference for Scientific Knowledge Graph Construction

Yi Luan,Luheng He,Mari Ostendorf,Hannaneh Hajishirzi

We introduce a multi-task setup of identifying and classifying entities, relations, and coreference clusters in scientific articles. We create SciERC, a dataset that includes annotations for all three tasks, and develop a unified framework called Scientific Information Extractor (SciIE) with shared span representations. The multi-task setup reduces cascading errors between tasks and leverages cross-sentence relations through coreference links. Experiments show that our multi-task model outperforms previous models in scientific information extraction without using any domain-specific features. We further show that the framework supports construction of a scientific knowledge graph, which we use to analyze information in scientific literature.

MoDELS · Attention mechanism · RNN · Annotation · Networking

December 20, 2017

Order-Free RNN with Visual Attention for Multi-Label Classification

Shang-Fu Chen,Yi-Chen Chen,Chih-Kuan Yeh,Yu-Chiang Frank Wang

from arxiv, Accepted at 32nd AAAI Conference on Artificial Intelligence (AAAI-18)

In this paper, we propose joint learning of attention and recurrent neural network (RNN) models for multi-label classification. While approaches based on the use of either model exist (e.g., for the task of image captioning), training such existing network architectures typically requires pre-defined label sequences. For multi-label classification, it would be desirable to have a robust inference process, so that the prediction error would not propagate and thus affect the performance. Our proposed model uniquely integrates attention and Long Short-Term Memory (LSTM) models, which not only addresses the above problem but also allows one to identify visual objects of interest with varying sizes without prior knowledge of a particular label ordering. More importantly, label co-occurrence information can be jointly exploited by our LSTM model. Finally, by advancing the technique of beam search, prediction of multiple labels can be efficiently achieved by our proposed network model.