欧美丰满大乳屁股流白浆,国精产品W灬源码网站1688

Object pose estimation underwater allows an autonomous system to perform tracking and intervention tasks. Nonetheless, underwater target pose estimation is remarkably challenging due to, among many factors, limited visibility, light scattering, cluttered environments, and constantly varying water conditions. An approach is to employ sonar or laser sensing to acquire 3D data, however, the data is not clear and the sensors expensive. For this reason, the community has focused on extracting pose estimates from RGB input. In this work, we propose an approach that leverages 2D object detection to reliably compute 6D pose estimates in different underwater scenarios. We test our proposal with 4 objects with symmetrical shapes and poor texture spanning across 33,920 synthetic and 10 real scenes. All objects and scenes are made available in an open-source dataset that includes annotations for object detection and pose estimation. When benchmarking against similar end-to-end methodologies for 6D object pose estimation, our pipeline provides estimates that are 8% more accurate. We also demonstrate the real world usability of our pose estimation pipeline on an underwater robotic manipulator in a reaching task.

相關內容

估計/估計量

關注 3

多峰值 · MoDELS · INTERACT · 模態 · Learning ·

2023 年 10 月 31 日

A Transformer-Based Model With Self-Distillation for Multimodal Emotion Recognition in Conversations

Hui Ma,Jian Wang,Hongfei Lin,Bo Zhang,Yijia Zhang,Bo Xu

from arxiv, 13 pages, 10 figures. Accepted by IEEE Transactions on Multimedia (TMM)

Emotion recognition in conversations (ERC), the task of recognizing the emotion of each utterance in a conversation, is crucial for building empathetic machines. Existing studies focus mainly on capturing context- and speaker-sensitive dependencies on the textual modality but ignore the significance of multimodal information. Different from emotion recognition in textual conversations, capturing intra- and inter-modal interactions between utterances, learning weights between different modalities, and enhancing modal representations play important roles in multimodal ERC. In this paper, we propose a transformer-based model with self-distillation (SDT) for the task. The transformer-based model captures intra- and inter-modal interactions by utilizing intra- and inter-modal transformers, and learns weights between modalities dynamically by designing a hierarchical gated fusion strategy. Furthermore, to learn more expressive modal representations, we treat soft labels of the proposed model as extra training supervision. Specifically, we introduce self-distillation to transfer knowledge of hard and soft labels from the proposed model to each modality. Experiments on IEMOCAP and MELD datasets demonstrate that SDT outperforms previous state-of-the-art baselines.

Networking · AVS · 極小點 · 多峰值 · INTERACT ·

2023 年 10 月 30 日

Regulating For-Hire Autonomous Vehicles for An Equitable Multimodal Transportation Network

Jing Gao,Sen Li

This paper assesses the equity impacts of for-hire autonomous vehicles (AVs) and investigates regulatory policies that promote spatial and social equity in future autonomous mobility ecosystems. To this end, we consider a multimodal transportation network, where a ride-hailing platform operates a fleet of AVs to offer mobility-on-demand services in competition with a public transit agency that offers transit services on a transportation network. A game-theoretic model is developed to characterize the intimate interactions between the ride-hailing platform, the transit agency, and multiclass passengers with distinct income levels. An algorithm is proposed to compute the Nash equilibrium of the game and conduct an ex-post evaluation of the performance of the obtained solution. Based on the proposed framework, we evaluate the spatial and social equity in transport accessibility using the Theil index, and find that although the proliferation of for-hire AVs in the ride-hailing network improves overall accessibility, the benefits are not fairly distributed among distinct locations or population groups, implying that the deployment of AVs will enlarge the existing spatial and social inequity gaps in the transportation network if no regulatory intervention is in place. To address this concern, we investigate two regulatory policies that can improve transport equity: (a) a minimum service-level requirement on ride-hailing services, which improves the spatial equity in the transport network; (b) a subsidy on transit services by taxing ride-hailing services, which promotes the use of public transit and improves the spatial and social equity of the transport network. We show that the minimum service-level requirement entails a trade-off: as a higher minimum service level is imposed, the spatial inequity reduces, but the social inequity will be exacerbated. On the other hand ...

Mail · Learning · Agent · 機器人 · INTERACT ·

2023 年 10 月 30 日

Learning Robot Manipulation from Cross-Morphology Demonstration

Gautam Salhotra,I-Chun Arthur Liu,Gaurav Sukhatme

from arxiv, Accepted to the Conference on Robot Learning (CoRL) 2023

Some Learning from Demonstrations (LfD) methods handle small mismatches in the action spaces of the teacher and student. Here we address the case where the teacher's morphology is substantially different from that of the student. Our framework, Morphological Adaptation in Imitation Learning (MAIL), bridges this gap allowing us to train an agent from demonstrations by other agents with significantly different morphologies. MAIL learns from suboptimal demonstrations, so long as they provide $\textit{some}$ guidance towards a desired solution. We demonstrate MAIL on manipulation tasks with rigid and deformable objects including 3D cloth manipulation interacting with rigid obstacles. We train a visual control policy for a robot with one end-effector using demonstrations from a simulated agent with two end-effectors. MAIL shows up to $24\%$ improvement in a normalized performance metric over LfD and non-LfD baselines. It is deployed to a real Franka Panda robot, handles multiple variations in properties for objects (size, rotation, translation), and cloth-specific properties (color, thickness, size, material). An overview is on //uscresl.github.io/mail .

語言模型化 · INTERACT · MoDELS · Performer · 泛化理論 ·

2023 年 10 月 28 日

Training Socially Aligned Language Models on Simulated Social Interactions

Ruibo Liu,Ruixin Yang,Chenyan Jia,Ge Zhang,Denny Zhou,Andrew M. Dai,Diyi Yang,Soroush Vosoughi

from arxiv, Code, data, and models can be downloaded via //github.com/agi-templar/Stable-Alignment

Social alignment in AI systems aims to ensure that these models behave according to established societal values. However, unlike humans, who derive consensus on value judgments through social interaction, current language models (LMs) are trained to rigidly replicate their training corpus in isolation, leading to subpar generalization in unfamiliar scenarios and vulnerability to adversarial attacks. This work presents a novel training paradigm that permits LMs to learn from simulated social interactions. In comparison to existing methodologies, our approach is considerably more scalable and efficient, demonstrating superior performance in alignment benchmarks and human evaluations. This paradigm shift in the training of LMs brings us a step closer to developing AI systems that can robustly and accurately reflect societal norms and values.

Learning · Prompt · 聯邦學習 · MoDELS · 知識 (knowledge) ·

2023 年 10 月 27 日

Heterogeneous Federated Learning with Group-Aware Prompt Tuning

Wenlong Deng,Christos Thrampoulidis,Xiaoxiao Li

Transformers have achieved remarkable success in various machine-learning tasks, prompting their widespread adoption. In this paper, we explore their application in the context of federated learning (FL), with a particular focus on heterogeneous scenarios where individual clients possess diverse local datasets. To meet the computational and communication demands of FL, we leverage pre-trained Transformers and use an efficient prompt-tuning strategy. Our strategy introduces the concept of learning both shared and group prompts, enabling the acquisition of universal knowledge and group-specific knowledge simultaneously. Additionally, a prompt selection module assigns personalized group prompts to each input, aligning the global model with the data distribution of each client. This approach allows us to train a single global model that can automatically adapt to various local client data distributions without requiring local fine-tuning. In this way, our proposed method effectively bridges the gap between global and personalized local models in Federated Learning and surpasses alternative approaches that lack the capability to adapt to previously unseen clients. The effectiveness of our approach is rigorously validated through extensive experimentation and ablation studies.

估計/估計量 · 標注 · 連接主義 · 隨機變量 · 均值 ·

2023 年 10 月 27 日

Label Shift Estimators for Non-Ignorable Missing Data

Andrew C. Miller,Joseph Futoma

from arxiv, 8 pages, 5 figures

We consider the problem of estimating the mean of a random variable Y subject to non-ignorable missingness, i.e., where the missingness mechanism depends on Y . We connect the auxiliary proxy variable framework for non-ignorable missingness (West and Little, 2013) to the label shift setting (Saerens et al., 2002). Exploiting this connection, we construct an estimator for non-ignorable missing data that uses high-dimensional covariates (or proxies) without the need for a generative model. In synthetic and semi-synthetic experiments, we study the behavior of the proposed estimator, comparing it to commonly used ignorable estimators in both well-specified and misspecified settings. Additionally, we develop a score to assess how consistent the data are with the label shift assumption. We use our approach to estimate disease prevalence using a large health survey, comparing ignorable and non-ignorable approaches. We show that failing to account for non-ignorable missingness can have profound consequences on conclusions drawn from non-representative samples.

Learning · INFORMS · 表示 · 控制器 · 強化學習 ·

2023 年 10 月 27 日

Towards Control-Centric Representations in Reinforcement Learning from Images

Chen Liu,Hongyu Zang,Xin Li,Yong Heng,Yifei Wang,Zhen Fang,Yisen Wang,Mingzhong Wang

Image-based Reinforcement Learning is a practical yet challenging task. A major hurdle lies in extracting control-centric representations while disregarding irrelevant information. While approaches that follow the bisimulation principle exhibit the potential in learning state representations to address this issue, they still grapple with the limited expressive capacity of latent dynamics and the inadaptability to sparse reward environments. To address these limitations, we introduce ReBis, which aims to capture control-centric information by integrating reward-free control information alongside reward-specific knowledge. ReBis utilizes a transformer architecture to implicitly model the dynamics and incorporates block-wise masking to eliminate spatiotemporal redundancy. Moreover, ReBis combines bisimulation-based loss with asymmetric reconstruction loss to prevent feature collapse in environments with sparse rewards. Empirical studies on two large benchmarks, including Atari games and DeepMind Control Suit, demonstrate that ReBis has superior performance compared to existing methods, proving its effectiveness.

Performer · 學成 · Boosting（一種模型訓練加速方式） · MoDELS · 可辨認的 ·

2021 年 12 月 22 日

Hybrid Curriculum Learning for Emotion Recognition in Conversation

Lin Yang,Yi Shen,Yue Mao,Longjun Cai

from arxiv, Accepted by AAAI-2022

Emotion recognition in conversation (ERC) aims to detect the emotion label for each utterance. Motivated by recent studies which have proven that feeding training examples in a meaningful order rather than considering them randomly can boost the performance of models, we propose an ERC-oriented hybrid curriculum learning framework. Our framework consists of two curricula: (1) conversation-level curriculum (CC); and (2) utterance-level curriculum (UC). In CC, we construct a difficulty measurer based on "emotion shift" frequency within a conversation, then the conversations are scheduled in an "easy to hard" schema according to the difficulty score returned by the difficulty measurer. For UC, it is implemented from an emotion-similarity perspective, which progressively strengthens the model's ability in identifying the confusing emotions. With the proposed model-agnostic hybrid curriculum learning strategy, we observe significant performance boosts over a wide range of existing ERC models and we are able to achieve new state-of-the-art results on four public ERC datasets.

entity · 小樣本學習 · 注意力機制 · 圖 · Networking ·

2020 年 10 月 19 日

Adaptive Attentional Network for Few-Shot Knowledge Graph Completion

Jiawei Sheng,Shu Guo,Zhenyu Chen,Juwei Yue,Lihong Wang,Tingwen Liu,Hongbo Xu

from arxiv, 11 pages, 3 figures

Few-shot Knowledge Graph (KG) completion is a focus of current research, where each task aims at querying unseen facts of a relation given its few-shot reference entity pairs. Recent attempts solve this problem by learning static representations of entities and references, ignoring their dynamic properties, i.e., entities may exhibit diverse roles within task relations, and references may make different contributions to queries. This work proposes an adaptive attentional network for few-shot KG completion by learning adaptive entity and reference representations. Specifically, entities are modeled by an adaptive neighbor encoder to discern their task-oriented roles, while references are modeled by an adaptive query-aware aggregator to differentiate their contributions. Through the attention mechanism, both entities and references can capture their fine-grained semantic meanings, and thus render more expressive representations. This will be more predictive for knowledge acquisition in the few-shot scenario. Evaluation in link prediction on two public datasets shows that our approach achieves new state-of-the-art results with different few-shot sizes.

Performance · 相似度度量 · Performer · state-of-the-art · 圖像檢索 ·

2018 年 4 月 6 日

Cross-Domain Image Matching with Deep Feature Maps

Bailey Kong,James Supancic,Deva Ramanan,Charless C. Fowlkes

We investigate the problem of automatically determining what type of shoe left an impression found at a crime scene. This recognition problem is made difficult by the variability in types of crime scene evidence (ranging from traces of dust or oil on hard surfaces to impressions made in soil) and the lack of comprehensive databases of shoe outsole tread patterns. We find that mid-level features extracted by pre-trained convolutional neural nets are surprisingly effective descriptors for this specialized domains. However, the choice of similarity measure for matching exemplars to a query image is essential to good performance. For matching multi-channel deep features, we propose the use of multi-channel normalized cross-correlation and analyze its effectiveness. Our proposed metric significantly improves performance in matching crime scene shoeprints to laboratory test impressions. We also show its effectiveness in other cross-domain image retrieval problems: matching facade images to segmentation labels and aerial photos to map images. Finally, we introduce a discriminatively trained variant and fine-tune our system through our proposed metric, obtaining state-of-the-art performance.