Physical human-robot interactions (pHRIs) can improve robot autonomy and reduce the physical demands on humans. In this paper, we consider a collaborative task involving a long object whose parameters are not known in advance. We propose an integrated control framework consisting of an online object parameter estimator and a Cartesian object-aware impedance controller to handle such scenarios. During transportation, the object parameters are estimated online while the robot and the human jointly lift the object. A perturbation motion is injected into the null space of the desired trajectory to improve the estimator's accuracy. The object-aware impedance controller then uses the real-time estimates to transmit the intended human motion to the robot through the object. Experiments on collaborative tasks, including object transportation and assembly, demonstrate the effectiveness of the proposed method.
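As a rough illustration of one way such online parameter estimation can work, the sketch below runs a recursive least-squares (RLS) update on simulated wrist-force readings to recover an object's mass under a quasi-static load model; the class name, the simplified model, and the numbers are assumptions for illustration, not the paper's estimator.

```python
# Hypothetical sketch: recursive least-squares (RLS) estimation of an object
# parameter (here, mass) from force measurements under a quasi-static model.
import numpy as np

class RLSEstimator:
    def __init__(self, n_params, forgetting=0.99):
        self.theta = np.zeros(n_params)      # parameter estimate
        self.P = np.eye(n_params) * 1e3      # estimate covariance
        self.lam = forgetting

    def update(self, phi, y):
        # phi: regressor vector, y: measured quantity (e.g., vertical force)
        k = self.P @ phi / (self.lam + phi @ self.P @ phi)
        self.theta = self.theta + k * (y - phi @ self.theta)
        self.P = (self.P - np.outer(k, phi @ self.P)) / self.lam
        return self.theta

# Example: estimate mass from vertical force, f_z ~ m * g (simulated readings)
est = RLSEstimator(n_params=1)
g = 9.81
for f_z in [19.5, 19.7, 19.6]:
    m_hat = est.update(np.array([g]), f_z)
print("estimated mass:", m_hat)              # ~2 kg
```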
In this paper, we use graph-based techniques to investigate the use of geometric deep learning (GDL) for classifying and down-selecting aircraft thermal management systems (TMSs). Previous work developed an enumerative graph generation procedure that uses a component catalog with network structure constraints to represent novel aircraft TMSs as graphs. However, as with many enumerative approaches, combinatorial explosion limits its efficacy in many real-world problems, particularly when simulations and optimization must be performed on the many automatically generated physics models. We therefore present an approach that takes the directed graphs representing aircraft TMSs and uses GDL to predict the critical characteristics of the remaining graphs. Our findings demonstrate that incorporating additional graph-based features enhances performance, achieving an accuracy of 97% in determining a graph's compilability and simulatability while using only 5% of the data for training. By applying iterative classification methods, we also successfully segment the full set of graphs into more specific groups, capturing an average of 84.7 of the top 100 highest-performing graphs when training on 45% of the data.
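For intuition only, the toy classifier below shows the general shape of a graph-level prediction such as the compilability/simulatability label above: node features are aggregated with a normalized adjacency matrix and pooled into a single graph embedding. The layer sizes, feature dimensions, and two-step message passing are illustrative assumptions, not the paper's GDL architecture.

```python
# Minimal graph-level classifier: message passing via a normalized adjacency
# matrix, mean pooling, and a binary prediction head. All sizes are assumed.
import torch
import torch.nn as nn

class TinyGraphClassifier(nn.Module):
    def __init__(self, in_dim=8, hid=32):
        super().__init__()
        self.lin1 = nn.Linear(in_dim, hid)
        self.lin2 = nn.Linear(hid, hid)
        self.head = nn.Linear(hid, 1)

    def forward(self, x, adj):
        # x: [num_nodes, in_dim] node features, adj: [num_nodes, num_nodes]
        deg = adj.sum(dim=1, keepdim=True).clamp(min=1)
        a_norm = adj / deg                          # row-normalized adjacency
        h = torch.relu(self.lin1(a_norm @ x))       # message-passing step 1
        h = torch.relu(self.lin2(a_norm @ h))       # message-passing step 2
        graph_emb = h.mean(dim=0)                   # mean-pool over nodes
        return torch.sigmoid(self.head(graph_emb))  # e.g., P(compilable)

model = TinyGraphClassifier()
x = torch.randn(5, 8)                               # 5 components, 8 features each
adj = torch.eye(5)                                  # placeholder connectivity
print(model(x, adj))
```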
Risk-sensitive reinforcement learning (RL) aims to optimize policies that balance expected reward and risk. In this paper, we present a novel risk-sensitive RL framework that employs an Iterated Conditional Value-at-Risk (CVaR) objective under both linear and general function approximation, enriched by human feedback. These new formulations provide a principled way to guarantee safety at each decision-making step throughout the control process. Moreover, integrating human feedback into the risk-sensitive RL framework bridges the gap between algorithmic decision-making and human participation, allowing us to also guarantee safety for human-in-the-loop systems. We propose provably sample-efficient algorithms for Iterated CVaR RL and provide a rigorous theoretical analysis. Furthermore, we establish a matching lower bound to corroborate the optimality of our algorithms in the linear setting.
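For reference, a standard dual form of CVaR at level $\alpha$ and the step-wise (iterated) Bellman-style recursion it suggests are sketched below; this is the textbook formulation shown for intuition, and the paper's exact objective and feedback-enriched variant may differ.

```latex
% Standard dual form of CVaR for a reward random variable X at level alpha:
\[
  \mathrm{CVaR}_\alpha(X) \;=\; \sup_{z \in \mathbb{R}}
  \Big\{\, z - \tfrac{1}{\alpha}\,\mathbb{E}\big[(z - X)^{+}\big] \Big\},
\]
% and an iterated application at every decision step (illustrative recursion):
\[
  V_h(s) \;=\; \max_{a}\; \mathrm{CVaR}_\alpha\!\big( r_h(s,a) + V_{h+1}(s') \big),
  \qquad s' \sim P_h(\cdot \mid s, a).
\]
```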
In this paper, we tackle the problem of learning Structure-from-Motion (SfM) using graph attention networks. SfM is a classic computer vision problem solved through iterative minimization of reprojection errors, referred to as Bundle Adjustment (BA), starting from a good initialization. To obtain an initialization good enough for BA, conventional methods rely on a sequence of sub-problems (such as pairwise pose estimation, pose averaging, and triangulation) that provides an initial solution which is then refined by BA. In this work, we replace these sub-problems with a learned model that takes as input the 2D keypoints detected across multiple views and outputs the corresponding camera poses and 3D keypoint coordinates. Our model takes advantage of graph neural networks to learn SfM-specific primitives, and we show that it can be used for fast inference of the reconstruction for new and unseen sequences. The experimental results show that the proposed model outperforms competing learning-based methods and challenges COLMAP while having a lower runtime.
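For context, the classical BA objective that such pipelines initialize is the reprojection-error minimization below (standard formulation, independent of the learned model):

```latex
\[
  \min_{\{R_i, t_i\},\, \{X_j\}} \;\sum_{(i,j)\in\mathcal{O}}
  \big\| \pi\!\left(R_i X_j + t_i\right) - x_{ij} \big\|^2,
\]
% where (R_i, t_i) is the pose of view i, X_j the j-th 3D point, x_{ij} its
% observed 2D keypoint, pi the camera projection, and O the set of observations.
```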
Inspired by the success of Large Language Models in handling new tasks via In-Context Learning (ICL) in NLP, researchers have also developed Large Vision-Language Models (LVLMs) with ICL capabilities. However, when implementing ICL with these LVLMs, researchers usually resort to the simplest approach, such as random sampling, to configure the in-context sequence, leading to sub-optimal results. To enhance ICL performance, in this study we use Visual Question Answering (VQA) as a case study to explore diverse in-context configurations and identify powerful ones. Additionally, by observing how the LVLM outputs change when the in-context sequence is altered, we gain insights into the inner properties of LVLMs, improving our understanding of them. Specifically, to explore in-context configurations, we design diverse retrieval methods and employ different strategies to manipulate the retrieved demonstrations. Through exhaustive experiments on three VQA datasets, VQAv2, VizWiz, and OK-VQA, we uncover three important inner properties of the applied LVLM and demonstrate which strategies consistently improve ICL VQA performance. Our code is available at: //github.com/GaryJiajia/OFv2_ICL_VQA.
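As one concrete (and assumed) example of the kind of retrieval method varied in such a study, the snippet below picks in-context demonstrations by cosine similarity between a query embedding and candidate embeddings; the embedding dimension and data are placeholders, not the paper's setup.

```python
# Hypothetical similarity-based retrieval of in-context demonstrations.
import numpy as np

def retrieve_demonstrations(query_emb, support_embs, k=4):
    """Return indices of the k most similar support examples (cosine similarity)."""
    q = query_emb / np.linalg.norm(query_emb)
    s = support_embs / np.linalg.norm(support_embs, axis=1, keepdims=True)
    sims = s @ q
    return np.argsort(-sims)[:k]

support_embs = np.random.randn(100, 512)   # 100 candidate (image, Q, A) triples
query_emb = np.random.randn(512)           # embedding of the test question
demo_ids = retrieve_demonstrations(query_emb, support_embs, k=4)
print("in-context sequence order:", demo_ids)
```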
In this paper, we aim to design and analyze distributed Bayesian estimation algorithms for sensor networks. The challenges we address are to (i) derive a distributed provably-correct algorithm in the functional space of probability distributions over continuous variables, and (ii) leverage these results to obtain new distributed estimators restricted to subsets of variables observed by individual agents. This relates to applications such as cooperative localization and federated learning, where the data collected at any agent depends on a subset of all variables of interest. We present Bayesian density estimation algorithms using data from non-linear likelihoods at agents in centralized, distributed, and marginal distributed settings. After setting up a distributed estimation objective, we prove almost-sure convergence to the optimal set of pdfs at each agent. Then, we prove the same for a storage-aware algorithm estimating densities only over relevant variables at each agent. Finally, we present a Gaussian version of these algorithms and implement it in a mapping problem using variational inference to handle non-linear likelihood models associated with LiDAR sensing.
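A minimal Gaussian sketch of one communication round in a distributed Bayesian estimator is given below: each agent geometrically averages its neighbors' beliefs (a weighted sum of natural parameters) and then applies a local linear-Gaussian measurement update. This is an illustrative scheme under assumed linear-Gaussian models, not the paper's algorithm.

```python
# Illustrative distributed Gaussian estimation round (assumed models/weights).
import numpy as np

def consensus_step(means, precisions, W):
    """Geometric averaging of Gaussians = weighted sum of natural parameters.
    means: [n_agents, d], precisions: [n_agents, d, d], W: doubly-stochastic."""
    new_prec = np.einsum('ij,jkl->ikl', W, precisions)
    info = np.einsum('ikl,il->ik', precisions, means)        # P_j @ mu_j
    new_info = np.einsum('ij,jk->ik', W, info)
    new_means = np.stack([np.linalg.solve(P, eta) for P, eta in zip(new_prec, new_info)])
    return new_means, new_prec

def observe(mean, prec, H, R_inv, y):
    """Standard Gaussian Bayes update with measurement y = H x + noise."""
    new_prec = prec + H.T @ R_inv @ H
    new_mean = np.linalg.solve(new_prec, prec @ mean + H.T @ R_inv @ y)
    return new_mean, new_prec

# Two agents with scalar states: after one consensus step both means agree.
means = np.array([[0.0], [2.0]])
precisions = np.array([[[1.0]], [[1.0]]])
W = np.array([[0.5, 0.5], [0.5, 0.5]])
print(consensus_step(means, precisions, W)[0])   # both move to 1.0
```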
In this paper, we consider the numerical solution of the soft-margin support vector machine optimization problem. This problem is typically solved using the SMO algorithm, given the high computational complexity of traditional optimization algorithms when dealing with large-scale kernel matrices. In this work, we propose employing an NFFT-accelerated matrix-vector product, based on an ANOVA decomposition of the feature space, within an interior point method for the overall optimization problem. As this method requires the solution of a linear system in saddle point form, we suggest a preconditioning approach based on low-rank approximations of the kernel matrix together with a Krylov subspace solver. We compare the accuracy of the ANOVA-based kernel with the default LIBSVM implementation, and we investigate the performance of the different preconditioners as well as the accuracy of the ANOVA kernel on several large-scale datasets.
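To illustrate the flavor of a low-rank preconditioner inside a Krylov solver, the sketch below builds a Nystrom-type approximation of a toy RBF kernel matrix and applies its Woodbury-form inverse as a preconditioner for SciPy's conjugate gradient solver; the kernel, landmark count, and regularization are assumptions, and this is not the paper's NFFT/ANOVA pipeline.

```python
# Toy low-rank (Nystrom) preconditioner for a regularized kernel system.
import numpy as np
from scipy.sparse.linalg import LinearOperator, cg

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 10))
lam = 1e-3                                        # diagonal regularization

def rbf_kernel(A, B, gamma=0.5):
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-gamma * d2)

K = rbf_kernel(X, X) + lam * np.eye(len(X))       # regularized kernel matrix
b = rng.normal(size=len(X))

# Low-rank Nystrom factors from m landmark points: K ~ C W^{-1} C^T + lam I
m = 50
idx = rng.choice(len(X), m, replace=False)
C = rbf_kernel(X, X[idx])
W = rbf_kernel(X[idx], X[idx]) + 1e-6 * np.eye(m)

def precond(v):
    # Woodbury identity: (lam I + C W^{-1} C^T)^{-1} v
    return v / lam - C @ np.linalg.solve(lam * W + C.T @ C, C.T @ v) / lam

M = LinearOperator(K.shape, matvec=precond)
x, info = cg(K, b, M=M, maxiter=200)
print("converged" if info == 0 else f"cg exited with info={info}")
```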
In this paper, we propose an efficient and high-performance method for partially relevant video retrieval (PRVR), which aims to retrieve untrimmed long videos that contain at least one moment relevant to the input text query. In terms of both efficiency and performance, the overlooked bottleneck in previous studies is the visual encoding of dense frames. This leads researchers to choose lightweight visual backbones, yielding sub-optimal retrieval performance due to the limited capability of the learned visual representations. However, it is undesirable to simply replace them with high-performance, large-scale vision-and-language models (VLMs) because of their low efficiency. To address these issues, instead of dense frames, we focus on super images, which are created by rearranging the video frames into an $N \times N$ grid layout. This reduces the number of visual encodings to $\frac{1}{N^2}$ of the original and compensates for the low efficiency of large-scale VLMs, allowing us to adopt them as powerful encoders. Surprisingly, we find that with a simple query-image attention trick, VLMs generalize well to super images and demonstrate promising zero-shot performance against SOTA methods while remaining efficient. In addition, we propose a fine-tuning approach that incorporates a few trainable modules into the VLM backbones. The experimental results demonstrate that our approaches efficiently achieve the best performance on ActivityNet Captions and TVR.
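A minimal sketch of the super-image construction described above: $N^2$ sampled frames are tiled into a single $N \times N$ grid so one visual encoding replaces $N^2$ per-frame encodings. The frame count and resolution below are placeholder assumptions.

```python
# Tile N*N sampled frames into one super image for a single visual encoding.
import numpy as np

def make_super_image(frames, n=3):
    """frames: array [n*n, H, W, 3] -> one super image [n*H, n*W, 3]."""
    rows = [np.concatenate(frames[i * n:(i + 1) * n], axis=1) for i in range(n)]
    return np.concatenate(rows, axis=0)

frames = np.random.randint(0, 255, size=(9, 224, 224, 3), dtype=np.uint8)
super_img = make_super_image(frames, n=3)
print(super_img.shape)   # (672, 672, 3)
```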
In this paper, we propose a policy gradient method for confounded partially observable Markov decision processes (POMDPs) with continuous state and observation spaces in the offline setting. We first establish a novel identification result to non-parametrically estimate any history-dependent policy gradient under POMDPs using the offline data. The identification enables us to solve a sequence of conditional moment restrictions and adopt the min-max learning procedure with general function approximation for estimating the policy gradient. We then provide a finite-sample non-asymptotic bound for estimating the gradient uniformly over a pre-specified policy class in terms of the sample size, length of horizon, concentrability coefficient, and the measure of ill-posedness in solving the conditional moment restrictions. Lastly, by deploying the proposed gradient estimation in the gradient ascent algorithm, we show the global convergence of the proposed algorithm in finding the history-dependent optimal policy under some technical conditions. To the best of our knowledge, this is the first work studying the policy gradient method for POMDPs in the offline setting.
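For intuition, a generic min-max estimator for conditional moment restrictions of the form $\mathbb{E}[\rho(Z;g)\mid X]=0$ is sketched below; the paper's restrictions, instruments, and function classes are specific to confounded POMDPs and differ in detail.

```latex
\[
  \hat g \;\in\; \arg\min_{g \in \mathcal{G}} \; \max_{f \in \mathcal{F}} \;
  \frac{1}{n}\sum_{i=1}^{n}
  \Big( \rho(Z_i; g)\, f(X_i) \;-\; \tfrac{1}{2} f(X_i)^2 \Big),
\]
% where f acts as a test (critic) function enforcing the conditional moment
% restriction on average; the estimated g then yields the policy gradient.
```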
In this paper, we propose applying a meta-learning approach to low-resource automatic speech recognition (ASR). We formulate ASR for different languages as different tasks and meta-learn the initialization parameters from many pretraining languages to achieve fast adaptation to an unseen target language, via the recently proposed model-agnostic meta-learning (MAML) algorithm. We evaluate the proposed approach using six languages as pretraining tasks and four languages as target tasks. Preliminary results show that the proposed method, MetaASR, significantly outperforms the state-of-the-art multitask pretraining approach on all target languages with different combinations of pretraining languages. In addition, owing to MAML's model-agnostic property, this work also opens a new research direction of applying meta-learning to more speech-related applications.
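The toy loop below shows the structure of a MAML-style meta-update (inner-loop adaptation on a task's support batch, then a meta-gradient through it on the query batch); the linear stand-in model, loss, and batch shapes are placeholders, not the MetaASR implementation.

```python
# Toy MAML meta-iteration: per-task inner adaptation + shared meta-update.
import torch
import torch.nn as nn
import torch.nn.functional as F

model = nn.Linear(40, 30)                       # stand-in for an ASR model
meta_opt = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = nn.MSELoss()
inner_lr = 0.01

def inner_adapt(params, x, y):
    """One gradient step on a task, keeping the graph for the meta-update."""
    pred = F.linear(x, params[0], params[1])
    grads = torch.autograd.grad(loss_fn(pred, y), params, create_graph=True)
    return [p - inner_lr * g for p, g in zip(params, grads)]

# One meta-iteration over a batch of "languages" (tasks)
meta_opt.zero_grad()
for _ in range(3):                               # 3 pretraining tasks
    x_s, y_s = torch.randn(8, 40), torch.randn(8, 30)   # support set
    x_q, y_q = torch.randn(8, 40), torch.randn(8, 30)   # query set
    fast = inner_adapt(list(model.parameters()), x_s, y_s)
    pred_q = F.linear(x_q, fast[0], fast[1])
    loss_fn(pred_q, y_q).backward()              # accumulates meta-gradient
meta_opt.step()
```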
In this paper, we adopt 3D convolutional neural networks to segment volumetric medical images. Although deep neural networks have proven to be very effective on many 2D vision tasks, applying them to 3D tasks remains challenging due to the limited amount of annotated 3D data and limited computational resources. We propose a novel 3D-based coarse-to-fine framework to tackle these challenges effectively and efficiently. The proposed 3D-based framework outperforms its 2D counterpart by a large margin since it can leverage the rich spatial information along all three axes. We conduct experiments on two datasets which contain healthy and pathological pancreases respectively, and achieve the current state-of-the-art in terms of the Dice-S{\o}rensen Coefficient (DSC). On the NIH pancreas segmentation dataset, we outperform the previous best by an average of over 2%, and the worst case is improved by 7% to reach almost 70%, which indicates the reliability of our framework for clinical applications.
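A schematic sketch of the coarse-to-fine idea and the DSC metric mentioned above: a coarse pass localizes the organ, the volume is cropped around that region with a margin, and a fine pass segments the crop. The models here are placeholders, and the cropping logic is an assumption for illustration.

```python
# Schematic coarse-to-fine volumetric segmentation with a DSC helper.
import numpy as np

def dice(pred, target):
    """Dice-Sorensen Coefficient (DSC) between two binary masks."""
    inter = np.logical_and(pred, target).sum()
    return 2.0 * inter / max(pred.sum() + target.sum(), 1)

def coarse_to_fine(volume, coarse_model, fine_model, margin=10):
    """Coarse pass localizes the organ; fine pass segments the cropped region."""
    coarse_mask = coarse_model(volume) > 0.5
    full = np.zeros(volume.shape, dtype=bool)
    if not coarse_mask.any():
        return full
    zs, ys, xs = np.where(coarse_mask)
    lo = np.maximum(np.array([zs.min(), ys.min(), xs.min()]) - margin, 0)
    hi = np.minimum(np.array([zs.max(), ys.max(), xs.max()]) + margin + 1, volume.shape)
    crop = volume[lo[0]:hi[0], lo[1]:hi[1], lo[2]:hi[2]]
    full[lo[0]:hi[0], lo[1]:hi[1], lo[2]:hi[2]] = fine_model(crop) > 0.5
    return full
```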