In this study, we further investigate the robustness and generalization ability of a neural network (NN) based force estimation method, using the da Vinci Research Kit Si (dVRK-Si). To evaluate our method's performance, we compare its force estimation accuracy with that of several baseline methods. We conduct comparative studies between the dVRK Classic and dVRK-Si systems to benchmark the effectiveness of these approaches. We conclude that the NN-based method provides comparable force estimation accuracy across the two systems: the average root mean square error (RMSE), taken as a ratio of the average force range, is approximately 3.07% for the dVRK Classic and 5.27% for the dVRK-Si. On the dVRK-Si, the force estimation RMSEs of all baseline methods are 2 to 4 times larger than that of the NN-based method in all directions. One possible reason is that the baseline methods assume that static forces remain the same or that the dynamics are time-invariant. These assumptions may hold for the dVRK Classic, which has a pre-loaded weight and maintains horizontal self-balance. Because the dVRK-Si configuration does not have this property, the assumptions no longer hold, and the NN-based method significantly outperforms the baselines.
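The abstract reports RMSE relative to the force range. A minimal sketch of one way such a percentage could be computed is shown here; the function names and the choice of normalizing by the range of the measured signal are assumptions for illustration, not the authors' exact procedure.

```python
import math


def rmse(estimated, measured):
    """Root mean square error between two equal-length force traces."""
    if len(estimated) != len(measured) or not estimated:
        raise ValueError("traces must be non-empty and of equal length")
    squared = [(e - m) ** 2 for e, m in zip(estimated, measured)]
    return math.sqrt(sum(squared) / len(squared))


def rmse_over_range_percent(estimated, measured):
    """RMSE as a percentage of the measured force range (assumed normalization)."""
    force_range = max(measured) - min(measured)
    if force_range == 0:
        raise ValueError("measured force range is zero")
    return 100.0 * rmse(estimated, measured) / force_range


# Toy example: every estimate is off by 0.1 N over a 4 N range.
measured = [0.0, 1.0, 2.0, 4.0]
estimated = [0.1, 0.9, 2.1, 3.9]
print(rmse_over_range_percent(estimated, measured))
```

Normalizing by the range makes errors comparable across axes or systems whose forces differ in magnitude, which is presumably why a ratio is reported rather than a raw RMSE.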

Related content

Estimation / Estimator

U-Net · Attention · Precision/Accuracy · Learning · Networking

June 25, 2024

Mask-Guided Attention U-Net for Enhanced Neonatal Brain Extraction and Image Preprocessing

Bahram Jafrasteh,Simon Pedro Lubian-Lopez,Emiliano Trimarco,Macarena Roman Ruiz,Carmen Rodriguez Barrios,Yolanda Marin Almagro,Isabel Benavente-Fernandez

In this study, we introduce MGA-Net, a novel mask-guided attention neural network, which extends the U-net model for precision neonatal brain imaging. MGA-Net is designed to extract the brain from other structures and reconstruct high-quality brain images. The network employs a common encoder and two decoders: one for brain mask extraction and the other for brain region reconstruction. A key feature of MGA-Net is its high-level mask-guided attention module, which leverages features from the brain mask decoder to enhance image reconstruction. To enable the same encoder and decoder to process both MRI and ultrasound (US) images, MGA-Net integrates sinusoidal positional encoding. This encoding assigns distinct positional values to MRI and US images, allowing the model to effectively learn from both modalities. Consequently, features learned from a single modality can aid in learning a modality with less available data, such as US. We extensively validated the proposed MGA-Net on diverse datasets from varied clinical settings and neonatal age groups. The metrics used for assessment included the DICE similarity coefficient, recall, and accuracy for image segmentation; structural similarity for image reconstruction; and root mean squared error for total brain volume estimation from 3D ultrasound images. Our results demonstrate that MGA-Net significantly outperforms traditional methods, offering superior performance in brain extraction and segmentation while achieving high precision in image reconstruction and volumetric analysis. Thus, MGA-Net represents a robust and effective preprocessing tool for MRI and 3D ultrasound images, marking a significant advance in neuroimaging that enhances both research and clinical diagnostics in the neonatal period and beyond.
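Among the segmentation metrics listed above, the DICE similarity coefficient is the overlap measure 2|A∩B| / (|A| + |B|) between a predicted and a reference mask. A minimal sketch on flattened binary masks follows; the function name and the convention of returning 1.0 when both masks are empty are assumptions for illustration.

```python
def dice_coefficient(predicted, reference):
    """Dice similarity between two flattened binary masks (sequences of 0/1)."""
    if len(predicted) != len(reference):
        raise ValueError("masks must have the same number of voxels")
    pred_count = sum(1 for p in predicted if p)
    ref_count = sum(1 for r in reference if r)
    overlap = sum(1 for p, r in zip(predicted, reference) if p and r)
    if pred_count + ref_count == 0:
        # Assumed convention: two empty masks agree perfectly.
        return 1.0
    return 2.0 * overlap / (pred_count + ref_count)


# One voxel overlaps; the prediction has 2 foreground voxels, the reference 1.
print(dice_coefficient([1, 1, 0, 0], [1, 0, 0, 0]))
```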

Model performance · Performer · MoDELS · Performance · Identifiable

June 24, 2024

Compact Proofs of Model Performance via Mechanistic Interpretability

Jason Gross,Rajashree Agrawal,Thomas Kwa,Euan Ong,Chun Hei Yip,Alex Gibson,Soufiane Noubir,Lawrence Chan

In this work, we propose using mechanistic interpretability -- techniques for reverse engineering model weights into human-interpretable algorithms -- to derive and compactly prove formal guarantees on model performance. We prototype this approach by formally proving lower bounds on the accuracy of 151 small transformers trained on a Max-of-$K$ task. We create 102 different computer-assisted proof strategies and assess their length and tightness of bound on each of our models. Using quantitative metrics, we find that shorter proofs seem to require and provide more mechanistic understanding. Moreover, we find that more faithful mechanistic understanding leads to tighter performance bounds. We confirm these connections by qualitatively examining a subset of our proofs. Finally, we identify compounding structureless noise as a key challenge for using mechanistic interpretability to generate compact proofs on model performance.
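To make the Max-of-$K$ task concrete: the model sees a sequence of $K$ tokens and should output the largest one. The sketch below computes exact accuracy by enumerating every input, which is only feasible for small vocabularies and short sequences. Here `model` is a hypothetical stand-in callable, and exhaustive enumeration is shown purely for illustration rather than as one of the paper's compact proof strategies.

```python
from itertools import product


def exact_accuracy(model, vocab_size, k):
    """Fraction of all length-k sequences on which model(seq) equals max(seq)."""
    total = 0
    correct = 0
    for seq in product(range(vocab_size), repeat=k):
        total += 1
        if model(seq) == max(seq):
            correct += 1
    return correct / total


# A perfect stand-in model, and a flawed one that always returns the first token.
print(exact_accuracy(max, vocab_size=3, k=2))
print(exact_accuracy(lambda seq: seq[0], vocab_size=3, k=2))
```

The cost of this check grows as `vocab_size ** k`, which illustrates why shorter arguments that exploit knowledge of the model's internals are attractive.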

Identifiable · CASES · Signal Processing · Directed · Precision/Accuracy

June 22, 2024

AI-based Drone Assisted Human Rescue in Disaster Environments: Challenges and Opportunities

Narek Papyan,Michel Kulhandjian,Hovannes Kulhandjian,Levon Hakob Aslanyan

In this survey, we focus on drone-based systems for detecting individuals, particularly by identifying human screams and other distress signals. This study has significant relevance in post-disaster scenarios, including events such as earthquakes, hurricanes, military conflicts, and wildfires. Unmanned aerial vehicles (UAVs), commonly referred to as drones, are frequently deployed for search-and-rescue missions during disaster situations. These drones are capable of hovering over disaster-stricken areas that may be challenging for rescue teams to access directly. Typically, drones capture aerial images to assess structural damage and identify the extent of the disaster. They also employ thermal imaging technology to detect body heat signatures, which can help locate individuals. In some cases, larger drones are used to deliver essential supplies to people stranded in isolated disaster-stricken areas. We discuss the unique challenges associated with locating humans through aerial acoustics. The auditory system must distinguish between human cries and naturally occurring sounds, such as animal calls and wind. Additionally, it should be capable of recognizing distinct patterns related to signals such as shouting, clapping, or other ways in which people attempt to signal rescue teams. One solution is to use artificial intelligence (AI) to analyze sound frequencies and identify common audio signatures. Deep learning-based networks, such as convolutional neural networks (CNNs), can be trained using these signatures to filter out noise generated by drone motors and other environmental factors. Furthermore, signal processing techniques such as direction of arrival (DOA) estimation based on microphone array signals can enhance the precision of tracking the source of human sounds.
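As a rough illustration of the DOA idea mentioned above, the sketch below estimates the time delay between two microphones by brute-force cross-correlation and converts it to an angle under a far-field assumption. The function names, the two-microphone setup, and the speed of sound of 343 m/s are assumptions for this example; practical arrays typically use more microphones and more robust estimators.

```python
import math


def best_lag(sig_a, sig_b, max_lag):
    """Lag in samples (positive if sig_b is delayed) maximizing cross-correlation."""
    best, best_score = 0, float("-inf")
    for lag in range(-max_lag, max_lag + 1):
        score = 0.0
        for i, a in enumerate(sig_a):
            j = i + lag
            if 0 <= j < len(sig_b):
                score += a * sig_b[j]
        if score > best_score:
            best, best_score = lag, score
    return best


def doa_degrees(lag_samples, sample_rate_hz, mic_spacing_m, speed_of_sound=343.0):
    """Far-field arrival angle from broadside, in degrees, for a two-mic pair."""
    delay_s = lag_samples / sample_rate_hz
    ratio = speed_of_sound * delay_s / mic_spacing_m
    ratio = max(-1.0, min(1.0, ratio))  # clamp against noise-induced overshoot
    return math.degrees(math.asin(ratio))


# Toy signals: an impulse that reaches microphone B two samples after microphone A.
mic_a = [0, 0, 1, 0, 0, 0, 0, 0]
mic_b = [0, 0, 0, 0, 1, 0, 0, 0]
lag = best_lag(mic_a, mic_b, max_lag=3)
print(lag, doa_degrees(lag, sample_rate_hz=16000, mic_spacing_m=0.1))
```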

Constraint · Scaling · Singular · Extensibility · Discretization

June 20, 2024

An Asymptotic Preserving and Energy Stable Scheme for the Euler System with Congestion Constraint

K. R. Arun,Amogh Krishnamurthy,Harihara Maharana

In this work, we design and analyze an asymptotic preserving (AP), semi-implicit finite volume scheme for the scaled compressible isentropic Euler system with a singular pressure law known as the congestion pressure law. The congestion pressure law imposes a maximal density constraint of the form $0\leq \varrho <1$, and the scaling introduces a small parameter $\varepsilon$ that controls the stiffness of the density constraint. As $\varepsilon\to 0$, the solutions of the compressible system converge to solutions of the so-called free-congested Euler equations, which couple compressible and incompressible dynamics. We show that the proposed scheme is positivity preserving and energy stable. In addition, we show that the numerical densities satisfy a discrete variant of the constraint. By means of extensive numerical case studies, we verify the efficacy of the scheme and show that it is able to capture the two dynamics in the limiting regime, thereby confirming the AP property.

Model performance · Performer · MoDELS · Performance · Identifiable

June 17, 2024

Provable Guarantees for Model Performance via Mechanistic Interpretability

Jason Gross,Rajashree Agrawal,Thomas Kwa,Euan Ong,Chun Hei Yip,Alex Gibson,Soufiane Noubir,Lawrence Chan

From arXiv. Submitted to the ICML 2024 Workshop on Mechanistic Interpretability and the Thirty-eighth Annual Conference on Neural Information Processing Systems.

In this work, we propose using mechanistic interpretability -- techniques for reverse engineering model weights into human-interpretable algorithms -- to derive and compactly prove formal guarantees on model performance. We prototype this approach by formally lower bounding the accuracy of 151 small transformers trained on a Max-of-$k$ task. We create 102 different computer-assisted proof strategies and assess their length and tightness of bound on each of our models. Using quantitative metrics, we show that shorter proofs seem to require and provide more mechanistic understanding, and that more faithful mechanistic understanding leads to tighter performance bounds. We confirm these connections by qualitatively examining a subset of our proofs. Finally, we identify compounding structureless noise as a key challenge for using mechanistic interpretability to generate compact proofs on model performance.

Biased · MoDELS · Dataset · Automator · INFORMS

June 17, 2024

A Unified Framework and Dataset for Assessing Societal Bias in Vision-Language Models

Ashutosh Sathe,Prachi Jain,Sunayana Sitaram

Vision-language models (VLMs) have gained widespread adoption in both industry and academia. In this study, we propose a unified framework for systematically evaluating gender, race, and age biases in VLMs with respect to professions. Our evaluation encompasses all supported inference modes of recent VLMs, including image-to-text, text-to-text, text-to-image, and image-to-image. Additionally, we propose an automated pipeline to generate high-quality synthetic datasets that intentionally conceal gender, race, and age information across different professional domains, both in generated text and images. The dataset includes action-based descriptions of each profession and serves as a benchmark for evaluating societal biases in VLMs. In our comparative analysis of widely used VLMs, we find that varying input-output modalities lead to discernible differences in bias magnitudes and directions. Additionally, we find that VLMs exhibit distinct biases across the different bias attributes we investigated. We hope our work will help guide future progress in improving VLMs to learn socially unbiased representations. We will release our data and code.

Robustness · Learning · Agent · Reinforcement learning · Identifiable

June 17, 2024

Byzantine Robust Cooperative Multi-Agent Reinforcement Learning as a Bayesian Game

Simin Li,Jun Guo,Jingqiao Xiu,Ruixiao Xu,Xin Yu,Jiakai Wang,Aishan Liu,Yaodong Yang,Xianglong Liu

In this study, we explore the robustness of cooperative multi-agent reinforcement learning (c-MARL) against Byzantine failures, where any agent can enact arbitrary, worst-case actions due to malfunction or adversarial attack. To address the uncertainty that any agent can be adversarial, we propose a Bayesian Adversarial Robust Dec-POMDP (BARDec-POMDP) framework, which views Byzantine adversaries as nature-dictated types, represented by a separate transition. This allows agents to learn policies grounded on their posterior beliefs about the type of other agents, fostering collaboration with identified allies and minimizing vulnerability to adversarial manipulation. We define the optimal solution to the BARDec-POMDP as an ex post robust Bayesian Markov perfect equilibrium, which we prove exists and weakly dominates the equilibrium of previous robust MARL approaches. To realize this equilibrium, we put forward a two-timescale actor-critic algorithm with almost sure convergence under specific conditions. Experiments on matrix games, level-based foraging, and StarCraft II indicate that, even under worst-case perturbations, our method successfully acquires intricate micromanagement skills and adaptively aligns with allies, demonstrating resilience against non-oblivious adversaries, random allies, observation-based attacks, and transfer-based attacks.

CASE · Robotics · INTERACT · Performer · Paper

June 17, 2024

Eliciting New Perspectives in RtD Studies through Annotated Portfolios: A Case Study of Robotic Artefacts

Marius Hoggenmuller,Wen-Ying Lee,Luke Hespanhol,Malte Jung,Martin Tomitsch

In this paper, we investigate how to elicit new perspectives in research-through-design (RtD) studies through annotated portfolios. Situating the usage in human-robot interaction (HRI), we used two robotic artefacts as a case study: we first created our own annotated portfolio and subsequently ran online workshops during which we asked HRI experts to annotate our robotic artefacts. We report on the different aspects revealed about the value, use, and further improvements of the robotic artefacts through using the annotated portfolio technique ourselves versus using it with experts. We suggest that annotated portfolios, when performed by external experts, allow design researchers to obtain a form of creative and generative peer critique. Our paper offers methodological considerations for conducting expert annotation sessions. Further, we discuss the use of annotated portfolios to unveil designerly HRI knowledge in RtD studies.

Automator · AutoML · Machine Learning · Learned · Reducible

January 17, 2019

Taking Human out of Learning Applications: A Survey on Automated Machine Learning

Quanming Yao,Mengshuo Wang,Yuqiang Chen,Wenyuan Dai,Hu Yi-Qi,Li Yu-Feng,Tu Wei-Wei,Yang Qiang,Yu Yang

From arXiv. This is a preliminary version and will be kept updated.

Machine learning techniques have become deeply rooted in our everyday life. However, since it is knowledge- and labor-intensive to pursue good learning performance, human experts are heavily involved in every aspect of machine learning. In order to make machine learning techniques easier to apply and reduce the demand for experienced human experts, automated machine learning (AutoML) has emerged as a hot topic with both industrial and academic interest. In this paper, we provide an up-to-date survey on AutoML. First, we introduce and define the AutoML problem, with inspiration from the realms of both automation and machine learning. Then, we propose a general AutoML framework that not only covers most existing approaches to date but can also guide the design of new methods. Subsequently, we categorize and review the existing works from two aspects, i.e., the problem setup and the employed techniques. Finally, we provide a detailed analysis of AutoML approaches and explain the reasons underlying their successful applications. We hope this survey can serve not only as an insightful guideline for AutoML beginners but also as an inspiration for future research.

MoDELS · Attention mechanism · RNN · Annotation · Networking

December 20, 2017

Order-Free RNN with Visual Attention for Multi-Label Classification

Shang-Fu Chen,Yi-Chen Chen,Chih-Kuan Yeh,Yu-Chiang Frank Wang

From arXiv. Accepted at the 32nd AAAI Conference on Artificial Intelligence (AAAI-18).

In this paper, we propose jointly learning attention and recurrent neural network (RNN) models for multi-label classification. While approaches based on the use of either model exist (e.g., for the task of image captioning), training such existing network architectures typically requires pre-defined label sequences. For multi-label classification, it would be desirable to have a robust inference process, so that the prediction error would not propagate and thus affect the performance. Our proposed model uniquely integrates attention and Long Short-Term Memory (LSTM) models, which not only addresses the above problem but also allows one to identify visual objects of interest with varying sizes without prior knowledge of a particular label ordering. More importantly, label co-occurrence information can be jointly exploited by our LSTM model. Finally, by advancing the technique of beam search, prediction of multiple labels can be efficiently achieved by our proposed network model.
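For readers unfamiliar with beam search, here is a minimal sketch of the standard algorithm, not the advanced variant the abstract refers to. The callable `next_log_probs`, the toy model, and the `<end>` token are hypothetical stand-ins for a trained sequence model's per-step label scores.

```python
import math


def beam_search(next_log_probs, beam_width, max_len, end_token):
    """Return (sequence, log-probability) of the best hypothesis found."""
    beams = [((), 0.0)]
    finished = []
    for _ in range(max_len):
        candidates = []
        for prefix, score in beams:
            for token, logp in next_log_probs(prefix).items():
                candidates.append((prefix + (token,), score + logp))
        # Keep only the highest-scoring partial hypotheses.
        candidates.sort(key=lambda item: item[1], reverse=True)
        beams = []
        for seq, score in candidates[:beam_width]:
            if seq[-1] == end_token:
                finished.append((seq, score))
            else:
                beams.append((seq, score))
        if not beams:
            break
    finished.extend(beams)  # hypotheses cut off by max_len still count
    if not finished:
        return ((), 0.0)
    return max(finished, key=lambda item: item[1])


def toy_model(prefix):
    """Hypothetical scores: the greedy first choice 'a' leads to a worse total."""
    if prefix == ():
        return {"a": math.log(0.6), "b": math.log(0.4)}
    if prefix == ("a",):
        return {"x": math.log(0.5), "y": math.log(0.5)}
    if prefix == ("b",):
        return {"z": 0.0}
    return {"<end>": 0.0}


print(beam_search(toy_model, beam_width=1, max_len=3, end_token="<end>")[0])
print(beam_search(toy_model, beam_width=2, max_len=3, end_token="<end>")[0])
```

With a beam width of 1 the search is greedy and commits to the locally best first label; a wider beam keeps the alternative alive long enough to find the higher-probability sequence.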