The dominant paradigm for end-to-end robot learning focuses on optimizing task-specific objectives that solve a single robotic problem such as picking up an object or reaching a target position. However, recent work on high-capacity models in robotics has shown promise toward being trained on large collections of diverse and task-agnostic datasets of video demonstrations. These models have shown impressive levels of generalization to unseen circumstances, especially as the amount of data and the model complexity scale. Surgical robot systems that learn from data have struggled to advance as quickly as other fields of robot learning for a few reasons: (1) there is a lack of existing large-scale open-source data to train models, (2) it is challenging to model the soft-body deformations that these robots work with during surgery because simulation cannot match the physical and visual complexity of biological tissue, and (3) surgical robots risk harming patients when tested in clinical trials and require more extensive safety measures. This perspective article aims to provide a path toward increasing robot autonomy in robot-assisted surgery through the development of a multi-modal, multi-task, vision-language-action model for surgical robots. Ultimately, we argue that surgical robots are uniquely positioned to benefit from general-purpose models and provide three guiding actions toward increased autonomy in robot-assisted surgery.

Related content

MoDELS

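The ACM/IEEE 23rd International Conference on Model Driven Engineering Languages and Systems (MODELS) is the premier conference series for model-driven software and systems engineering, organized with the support of ACM SIGSOFT and IEEE TCSE. Since 1998, MODELS has covered all aspects of modeling, from languages and methods to tools and applications. Its attendees come from diverse backgrounds, including researchers, academics, engineers, and industry professionals. MODELS 2019 is a forum where participants can exchange cutting-edge research results and innovative practical experience around modeling and model-driven software and systems. This year's edition will give the modeling community an opportunity to further advance the foundations of modeling and to propose innovative applications of modeling in emerging areas such as cyber-physical systems, embedded systems, socio-technical systems, cloud computing, big data, machine learning, security, open source, and sustainability.

MoDELS · tuning · Visual question answering · Performer ·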

February 16, 2024

Multi-modal preference alignment remedies regression of visual instruction tuning on language model

Shengzhi Li,Rongyu Lin,Shichao Pei

In production, multi-modal large language models (MLLMs) are expected to support multi-turn queries that interleave image and text modalities. However, current MLLMs trained with visual-question-answering (VQA) datasets can suffer from degradation, as VQA datasets lack the diversity and complexity of the original text instruction datasets on which the underlying language model was trained. To address this degradation, we first collect a lightweight (6k entries) VQA preference dataset in which answers were annotated by Gemini for 5 quality metrics in a granular fashion, and investigate standard Supervised Fine-tuning, rejection sampling, Direct Preference Optimization (DPO), and SteerLM. Our findings indicate that with DPO we are able to surpass the instruction-following capabilities of the language model, achieving a 6.73 score on MT-Bench, compared to Vicuna's 6.57 and LLaVA's 5.99, despite the small data scale. This enhancement in textual instruction proficiency correlates with boosted visual instruction performance (+4.9% on MM-Vet, +6% on LLaVA-Bench), with minimal alignment tax on visual knowledge benchmarks compared to the previous RLHF approach. In conclusion, we propose a distillation-based multi-modal alignment model with fine-grained annotations on a small dataset that reconciles the textual and visual performance of MLLMs, restoring and boosting language capability after visual instruction tuning.
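For readers unfamiliar with DPO, the following is a minimal sketch of the standard DPO loss for a single preference pair, not the authors' training code. It assumes the summed sequence log-probabilities of the preferred ("chosen") and dispreferred ("rejected") answers are already available from the policy and from a frozen reference model. The function name, argument names, and the default `beta` are illustrative assumptions.

```python
import math

def dpo_loss(policy_chosen_logp, policy_rejected_logp,
             ref_chosen_logp, ref_rejected_logp, beta=0.1):
    """Standard DPO loss for one preference pair (illustrative sketch).

    All *_logp arguments are summed log-probabilities of a full answer.
    beta is a hypothetical default, not a value taken from the abstract.
    """
    # How much more the policy likes each answer than the reference does.
    chosen_margin = policy_chosen_logp - ref_chosen_logp
    rejected_margin = policy_rejected_logp - ref_rejected_logp
    z = beta * (chosen_margin - rejected_margin)
    # loss = -log(sigmoid(z)) = log(1 + exp(-z)), evaluated stably.
    if z >= 0:
        return math.log1p(math.exp(-z))
    return -z + math.log1p(math.exp(z))
```

When the policy and the reference agree on both answers the loss equals log 2, and it falls as the policy shifts probability toward the chosen answer relative to the reference.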

Vision · INTERACT · Integration · Robotics · Automator ·

February 16, 2024

A novel integrated industrial approach with cobots in the age of industry 4.0 through conversational interaction and computer vision

Andrea Pazienza,Nicola Macchiarulo,Felice Vitulano,Antonio Fiorentini,Marco Cammisa,Leonardo Rigutini,Ernesto Di Iorio,Achille Globo,Antonio Trevisi

From robots that replace workers to robots that serve as helpful colleagues, the field of robotic automation is experiencing a new trend that represents a huge challenge for component manufacturers. The contribution starts from an innovative vision of ever closer collaboration among the cobot, which can perform a specific physical job with precision; the AI world, which can analyze information and support the decision-making process; and the human, who can hold a strategic vision of the future.

Networking · Normalized · Reducible · MoDELS · Neural Networks ·

February 16, 2024

Normalizing flow neural networks by JKO scheme

Chen Xu,Xiuyuan Cheng,Yao Xie

from arxiv, NeurIPS 2023 spotlight

Normalizing flow is a class of deep generative models for efficient sampling and likelihood estimation, which achieves attractive performance, particularly in high dimensions. The flow is often implemented using a sequence of invertible residual blocks. Existing works adopt special network architectures and regularization of flow trajectories. In this paper, we develop a neural ODE flow network called JKO-iFlow, inspired by the Jordan-Kinderlehrer-Otto (JKO) scheme, which unfolds the discrete-time dynamic of the Wasserstein gradient flow. The proposed method stacks residual blocks one after another, allowing efficient block-wise training of the residual blocks, avoiding sampling SDE trajectories and score matching or variational learning, thus reducing the memory load and difficulty in end-to-end training. We also develop adaptive time reparameterization of the flow network with a progressive refinement of the induced trajectory in probability space to improve the model accuracy further. Experiments with synthetic and real data show that the proposed JKO-iFlow network achieves competitive performance compared with existing flow and diffusion models at a significantly reduced computational and memory cost.
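As background, the classical JKO scheme that the abstract refers to can be written as a sequence of proximal steps in Wasserstein space. The block below is a sketch of that textbook formulation, not a formula quoted from the paper. The symbols are assumptions introduced here: $\rho_k$ is the density after step $k$, $h > 0$ is the step size, $W_2$ is the 2-Wasserstein distance, and $F$ is the energy functional being minimized, taken here to be the KL divergence to a fixed target density $\pi$.

```latex
\rho_{k+1} \;=\; \operatorname*{arg\,min}_{\rho}\;
  F(\rho) \;+\; \frac{1}{2h}\, W_2^2(\rho_k, \rho),
\qquad
F(\rho) \;=\; \mathrm{KL}(\rho \,\|\, \pi)
  \;=\; \int \rho(x)\,\log\frac{\rho(x)}{\pi(x)}\,dx .
```

Under this reading, each step would correspond to one residual block, which is consistent with the block-wise training the abstract describes.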

Learning · Performer · Federated learning · Layer · Deep learning ·

February 15, 2024

A chaotic maps-based privacy-preserving distributed deep learning for incomplete and Non-IID datasets

Irina Arévalo,Jose L. Salmeron

Federated Learning is a machine learning approach that enables the training of a deep learning model among several participants with sensitive data that wish to share their own knowledge without compromising the privacy of their data. In this research, the authors employ a secured Federated Learning method with an additional layer of privacy and propose a method for addressing the non-IID challenge. Moreover, differential privacy is compared with chaotic-based encryption as a layer of privacy. The experimental approach assesses the performance of the federated deep learning model with differential privacy using both IID and non-IID data. In each experiment, the Federated Learning process improves the average performance metrics of the deep neural network, even in the case of non-IID data.
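To make the aggregation idea concrete, here is a minimal sketch of size-weighted parameter averaging in the style of FedAvg. The abstract does not state which aggregation rule the authors use, so this is an assumption for illustration only. It also omits the privacy layer (differential privacy or chaotic-map encryption) entirely. The function and argument names are hypothetical.

```python
def federated_average(client_weights, client_sizes):
    """Average flat parameter lists, weighting each client by its dataset size.

    client_weights: list of equal-length lists of floats, one per client.
    client_sizes:   number of local training samples held by each client.
    """
    total = sum(client_sizes)
    n_params = len(client_weights[0])
    return [
        sum(w[i] * n for w, n in zip(client_weights, client_sizes)) / total
        for i in range(n_params)
    ]
```

Only model parameters are combined on the server; the raw local data never leaves a client, which is the property the abstract relies on.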

binary · Identifiable · Normalized · Design · Optimizer ·

February 15, 2024

Optimal Bayesian stepped-wedge cluster randomised trial designs for binary outcome data

Laura Etfer,James M. S. Wason,Michael J. Grayling

Under a generalised estimating equation analysis approach, approximate design theory is used to determine Bayesian D-optimal designs. For two examples, considering simple exchangeable and exponential decay correlation structures, we compare the efficiency of identified optimal designs to balanced stepped-wedge designs and corresponding stepped-wedge designs determined by optimising using a normal approximation approach. The dependence of the Bayesian D-optimal designs on the assumed correlation structure is explored; for the considered settings, smaller decay in the correlation between outcomes across time periods, along with larger values of the intra-cluster correlation, leads to designs closer to a balanced design being optimal. Unlike for normal data, it is shown that the optimal design need not be centro-symmetric in the binary outcome case. The efficiency of the Bayesian D-optimal design relative to a balanced design can be large, but situations are demonstrated in which the advantages are small. Similarly, the optimal design from a normal approximation approach is often not much less efficient than the Bayesian D-optimal design. Bayesian D-optimal designs can be readily identified for stepped-wedge cluster randomised trials with binary outcome data. In certain circumstances, principally ones with strong time period effects, they will indicate that a design unlikely to have been identified by previous methods may be substantially more efficient. However, they require a larger number of assumptions than existing optimal designs, and in many situations existing theory under a normal approximation will provide an easier means of identifying an efficient design for binary outcome data.

CAV · Lecture notes · Human-computer interaction ·

February 15, 2024

Comparing autonomous vehicle acceptance of German residents with and without visual impairments

Celina Kacperski,Florian Kutzner,Tobias Vogel

Connected and autonomous vehicles (CAVs) will greatly impact the lives of individuals with visual impairments, but how they differ in expectations compared to sighted individuals is not clear. The present research reports results based on survey responses from 114 visually impaired participants and 117 panel recruited participants without visual impairments, from Germany. Their attitudes towards autonomous vehicles and their expectations for consequences of wide-spread adoption of CAVs are assessed. Results indicate significantly more positive CAV attitudes in participants with visual impairments compared to those without visual impairments. Mediation analyses indicate that visually impaired individuals' more positive CAV attitudes (compared to sighted individuals') are largely explained by higher hopes for independence, and more optimistic expectations regarding safety and sustainability. Policy makers should ensure accessibility without sacrificing goals for higher safety and lower ecological impact to make CAVs an acceptable inclusive mobility solution.

Networking · Robustness · Neural Networks · Weight · Processing (programming language) ·

February 15, 2024

Gradient-descent hardware-aware training and deployment for mixed-signal Neuromorphic processors

Uğurcan Çakal,Maryada,Chenxi Wu,Ilkay Ulusoy,Dylan R. Muir

Mixed-signal neuromorphic processors provide extremely low-power operation for edge inference workloads, taking advantage of sparse asynchronous computation within Spiking Neural Networks (SNNs). However, deploying robust applications to these devices is complicated by limited controllability over analog hardware parameters, as well as unintended parameter and dynamical variations of analog circuits due to fabrication non-idealities. Here we demonstrate a novel methodology for offline training and deployment of SNNs to the mixed-signal neuromorphic processor DYNAP-SE2. The methodology utilizes gradient-based training using a differentiable simulation of the mixed-signal device, coupled with an unsupervised weight quantization method to optimize the network's parameters. Parameter noise injection during training provides robustness to the effects of quantization and device mismatch, making the method a promising candidate for real-world applications under hardware constraints and non-idealities. This work extends Rockpool, an open-source deep-learning library for SNNs, with support for accurate simulation of mixed-signal SNN dynamics. Our approach simplifies the development and deployment process for the neuromorphic community, making mixed-signal neuromorphic processors more accessible to researchers and developers.

MoDELS · Scenario · Performer · Lecture notes · Correlation coefficient ·

February 14, 2024

Long-form evaluation of model editing

Domenic Rosati,Robie Gonzales,Jinkun Chen,Xuemin Yu,Melis Erkan,Yahya Kayani,Satya Deepika Chavatapalli,Frank Rudzicz,Hassan Sajjad

Evaluations of model editing currently only use the "next few token" completions after a prompt. As a result, the impact of these methods on longer natural language generation is largely unknown. We introduce long-form evaluation of model editing (LEME), a novel evaluation protocol that measures the efficacy and impact of model editing in long-form generative settings. Our protocol consists of a machine-rated survey and a classifier which correlates well with human ratings. Importantly, we find that our protocol has very little relationship with previous short-form metrics (despite being designed to extend efficacy, generalization, locality, and portability into a long-form setting), indicating that our method introduces a novel set of dimensions for understanding model editing methods. Using this protocol, we benchmark a number of model editing techniques and present several findings including that, while some methods (ROME and MEMIT) perform well in making consistent edits within a limited scope, they suffer much more from factual drift than other methods. Finally, we present a qualitative analysis that illustrates common failure modes in long-form generative settings including internal consistency, lexical cohesion, and locality issues.

MoDELS · Same · Dataset · Model evaluation · Analysis ·

February 14, 2024

Comparison of two models to predict vertebral failure loads on the same experimental dataset

V. Allard,C. Heidsieck,F. Bermond,C. Confavreux,C. Travert,L. Gajny,W. Skalli,D. Mitton,H. Follet

from arxiv, 5 pages, 4 figures, 2 tables

Clinical use of finite element analysis requires validation and reproducibility studies. The current study compared two models of vertebral bodies including endplates, on the same experimental dataset, and evaluated the influence of the operator on the failure load. The models used were strongly correlated (R² = 0.91). The intra-operator reproducibility was 6.4% and 3.5% for the two models, respectively. Both simulated results were close to experimental results. The differences in performance could be associated with differences in the segmentation process, mesh (hexahedral vs tetrahedral), material representation, and failure criteria. Linear analysis did not decrease model accuracy. Comparison with the literature for accuracy and precision shows a wide range of values, partly related to the different experimental datasets and the different modelling approaches. Model benchmarks using the same experimental dataset are needed to move towards clinical applications.

Exchangeable · INFORMS ·

February 13, 2024

Cryptoanalysis of a key exchange protocol based on a congruence-simple semiring action

Otero Sanchez Alvaro,Lopez Ramos Juan Antonio

We show that a previously introduced key exchange based on a congruence-simple semiring action is not secure, by providing an attack that reveals the shared key from the distributed public information for any such semiring.