露脸视频一区二区三区在线播放_欧美性爱黄色网战_国产在线精品黄片_日日夜夜国产一区_免费看国产曰批40分钟妓女91_国产一级一片免费观看999_一级国产航空美女毛片内谢

Spherical robots have garnered increasing interest for their applications in exploration, tunnel inspection, and extraterrestrial missions. Diverse designs have emerged, including barycentric configurations, pendulum-based mechanisms, etc. In addition, a wide spectrum of control strategies has been proposed, ranging from traditional PID approaches to cutting-edge neural networks. Our systematic review aims to comprehensively identify and categorize locomotion systems and control schemes employed by spherical robots, spanning the years 1996 to 2023. A meticulous search across five databases yielded a dataset of 3189 records. As a result of our exhaustive analysis, we identified a collection of novel designs and control strategies. Leveraging the insights garnered, we provide valuable recommendations for optimizing the design and control aspects of spherical robots, supporting both novel design endeavors and the advancement of field deployments. Furthermore, we illuminate key research directions that hold the potential to unlock the full capabilities of spherical robots

相關內容

控制器

關注 5

查準率/準確率 · MoDELS · Learning · Seven · Performer ·

2023 年 11 月 16 日

Emu Edit: Precise Image Editing via Recognition and Generation Tasks

Shelly Sheynin,Adam Polyak,Uriel Singer,Yuval Kirstain,Amit Zohar,Oron Ashual,Devi Parikh,Yaniv Taigman

Instruction-based image editing holds immense potential for a variety of applications, as it enables users to perform any editing operation using a natural language instruction. However, current models in this domain often struggle with accurately executing user instructions. We present Emu Edit, a multi-task image editing model which sets state-of-the-art results in instruction-based image editing. To develop Emu Edit we train it to multi-task across an unprecedented range of tasks, such as region-based editing, free-form editing, and Computer Vision tasks, all of which are formulated as generative tasks. Additionally, to enhance Emu Edit's multi-task learning abilities, we provide it with learned task embeddings which guide the generation process towards the correct edit type. Both these elements are essential for Emu Edit's outstanding performance. Furthermore, we show that Emu Edit can generalize to new tasks, such as image inpainting, super-resolution, and compositions of editing tasks, with just a few labeled examples. This capability offers a significant advantage in scenarios where high-quality samples are scarce. Lastly, to facilitate a more rigorous and informed assessment of instructable image editing models, we release a new challenging and versatile benchmark that includes seven different image editing tasks.

Chatbot · ChatGPT · 道德考慮 · Taxonomy · MoDELS ·

2023 年 11 月 16 日

Revolutionizing Customer Interactions: Insights and Challenges in Deploying ChatGPT and Generative Chatbots for FAQs

Feriel Khennouche,Youssef Elmir,Nabil Djebari,Yassine Himeur,Abbes Amira

In the rapidly evolving domain of artificial intelligence, chatbots have emerged as a potent tool for various applications ranging from e-commerce to healthcare. This research delves into the intricacies of chatbot technology, from its foundational concepts to advanced generative models like ChatGPT. We present a comprehensive taxonomy of existing chatbot approaches, distinguishing between rule-based, retrieval-based, generative, and hybrid models. A specific emphasis is placed on ChatGPT, elucidating its merits for frequently asked questions (FAQs)-based chatbots, coupled with an exploration of associated Natural Language Processing (NLP) techniques such as named entity recognition, intent classification, and sentiment analysis. The paper further delves into the customization and fine-tuning of ChatGPT, its integration with knowledge bases, and the consequent challenges and ethical considerations that arise. Through real-world applications in domains such as online shopping, healthcare, and education, we underscore the transformative potential of chatbots. However, we also spotlight open challenges and suggest future research directions, emphasizing the need for optimizing conversational flow, advancing dialogue mechanics, improving domain adaptability, and enhancing ethical considerations. The research culminates in a call for further exploration in ensuring transparent, ethical, and user-centric chatbot systems.

代價函數 · 代價 · 泛函 · Learning · 圖 ·

2023 年 11 月 14 日

SceneScore: Learning a Cost Function for Object Arrangement

Ivan Kapelyukh,Edward Johns

from arxiv, Presented at CoRL 2023 LEAP Workshop. Webpage: //sites.google.com/view/scenescore

Arranging objects correctly is a key capability for robots which unlocks a wide range of useful tasks. A prerequisite for creating successful arrangements is the ability to evaluate the desirability of a given arrangement. Our method "SceneScore" learns a cost function for arrangements, such that desirable, human-like arrangements have a low cost. We learn the distribution of training arrangements offline using an energy-based model, solely from example images without requiring environment interaction or human supervision. Our model is represented by a graph neural network which learns object-object relations, using graphs constructed from images. Experiments demonstrate that the learned cost function can be used to predict poses for missing objects, generalise to novel objects using semantic features, and can be composed with other cost functions to satisfy constraints at inference time.

Performer · 可約的 · 穩健性 · MoDELS · Integration ·

2023 年 11 月 14 日

MVSA-Net: Multi-View State-Action Recognition for Robust and Deployable Trajectory Generation

Ehsan Asali,Prashant Doshi,Jin Sun

from arxiv, Conference on Robot Learning 2023 (CoRL2023)

The learn-from-observation (LfO) paradigm is a human-inspired mode for a robot to learn to perform a task simply by watching it being performed. LfO can facilitate robot integration on factory floors by minimizing disruption and reducing tedious programming. A key component of the LfO pipeline is a transformation of the depth camera frames to the corresponding task state and action pairs, which are then relayed to learning techniques such as imitation or inverse reinforcement learning for understanding the task parameters. While several existing computer vision models analyze videos for activity recognition, SA-Net specifically targets robotic LfO from RGB-D data. However, SA-Net and many other models analyze frame data captured from a single viewpoint. Their analysis is therefore highly sensitive to occlusions of the observed task, which are frequent in deployments. An obvious way of reducing occlusions is to simultaneously observe the task from multiple viewpoints and synchronously fuse the multiple streams in the model. Toward this, we present multi-view SA-Net, which generalizes the SA-Net model to allow the perception of multiple viewpoints of the task activity, integrate them, and better recognize the state and action in each frame. Performance evaluations on two distinct domains establish that MVSA-Net recognizes the state-action pairs under occlusion more accurately compared to single-view MVSA-Net and other baselines. Our ablation studies further evaluate its performance under different ambient conditions and establish the contribution of the architecture components. As such, MVSA-Net offers a significantly more robust and deployable state-action trajectory generation compared to previous methods.

數據集 · GROUP · Elevate · 評論員 · 生物特征識別 ·

2022 年 11 月 3 日

Expanding Accurate Person Recognition to New Altitudes and Ranges: The BRIAR Dataset

David Cornett III,Joel Brogan,Nell Barber,Deniz Aykac,Seth Baird,Nick Burchfield,Carl Dukes,Andrew Duncan,Regina Ferrell,Jim Goddard,Gavin Jager,Matt Larson,Bart Murphy,Christi Johnson,Ian Shelley,Nisha Srinivas,Brandon Stockwell,Leanne Thompson,Matt Yohe,Robert Zhang,Scott Dolvin,Hector J. Santos-Villalobos,David S. Bolme

Face recognition technology has advanced significantly in recent years due largely to the availability of large and increasingly complex training datasets for use in deep learning models. These datasets, however, typically comprise images scraped from news sites or social media platforms and, therefore, have limited utility in more advanced security, forensics, and military applications. These applications require lower resolution, longer ranges, and elevated viewpoints. To meet these critical needs, we collected and curated the first and second subsets of a large multi-modal biometric dataset designed for use in the research and development (R&D) of biometric recognition technologies under extremely challenging conditions. Thus far, the dataset includes more than 350,000 still images and over 1,300 hours of video footage of approximately 1,000 subjects. To collect this data, we used Nikon DSLR cameras, a variety of commercial surveillance cameras, specialized long-rage R&D cameras, and Group 1 and Group 2 UAV platforms. The goal is to support the development of algorithms capable of accurately recognizing people at ranges up to 1,000 m and from high angles of elevation. These advances will include improvements to the state of the art in face recognition and will support new research in the area of whole-body recognition using methods based on gait and anthropometry. This paper describes methods used to collect and curate the dataset, and the dataset's characteristics at the current stage.

Next · Integration · 有向 · 控制器 · Continuity ·

2022 年 3 月 5 日

AI for Next Generation Computing: Emerging Trends and Future Directions

Sukhpal Singh Gill,Minxian Xu,Carlo Ottaviani,Panos Patros,Rami Bahsoon,Arash Shaghaghi,Muhammed Golec,Vlado Stankovski,Huaming Wu,Ajith Abraham,Manmeet Singh,Harshit Mehta,Soumya K. Ghosh,Thar Baker,Ajith Kumar Parlikad,Hanan Lutfiyya,Salil S. Kanhere,Rizos Sakellariou,Schahram Dustdar,Omer Rana,Ivona Brandic,Steve Uhlig

from arxiv, Accepted for Publication in Elsevier IoT Journal, 2022

Autonomic computing investigates how systems can achieve (user) specified control outcomes on their own, without the intervention of a human operator. Autonomic computing fundamentals have been substantially influenced by those of control theory for closed and open-loop systems. In practice, complex systems may exhibit a number of concurrent and inter-dependent control loops. Despite research into autonomic models for managing computer resources, ranging from individual resources (e.g., web servers) to a resource ensemble (e.g., multiple resources within a data center), research into integrating Artificial Intelligence (AI) and Machine Learning (ML) to improve resource autonomy and performance at scale continues to be a fundamental challenge. The integration of AI/ML to achieve such autonomic and self-management of systems can be achieved at different levels of granularity, from full to human-in-the-loop automation. In this article, leading academics, researchers, practitioners, engineers, and scientists in the fields of cloud computing, AI/ML, and quantum computing join to discuss current research and potential future directions for these fields. Further, we discuss challenges and opportunities for leveraging AI and ML in next generation computing for emerging computing paradigms, including cloud, fog, edge, serverless and quantum computing environments.

變換 · Vision · 可辨認的 · Taxonomy · Prompt ·

2022 年 1 月 24 日

Transformers in Medical Imaging: A Survey

Fahad Shamshad,Salman Khan,Syed Waqas Zamir,Muhammad Haris Khan,Munawar Hayat,Fahad Shahbaz Khan,Huazhu Fu

from arxiv, 41 pages, \url{//github.com/fahadshamshad/awesome-transformers-in-medical-imaging}

Following unprecedented success on the natural language tasks, Transformers have been successfully applied to several computer vision problems, achieving state-of-the-art results and prompting researchers to reconsider the supremacy of convolutional neural networks (CNNs) as {de facto} operators. Capitalizing on these advances in computer vision, the medical imaging field has also witnessed growing interest for Transformers that can capture global context compared to CNNs with local receptive fields. Inspired from this transition, in this survey, we attempt to provide a comprehensive review of the applications of Transformers in medical imaging covering various aspects, ranging from recently proposed architectural designs to unsolved issues. Specifically, we survey the use of Transformers in medical image segmentation, detection, classification, reconstruction, synthesis, registration, clinical report generation, and other tasks. In particular, for each of these applications, we develop taxonomy, identify application-specific challenges as well as provide insights to solve them, and highlight recent trends. Further, we provide a critical discussion of the field's current state as a whole, including the identification of key challenges, open problems, and outlining promising future directions. We hope this survey will ignite further interest in the community and provide researchers with an up-to-date reference regarding applications of Transformer models in medical imaging. Finally, to cope with the rapid development in this field, we intend to regularly update the relevant latest papers and their open-source implementations at \url{//github.com/fahadshamshad/awesome-transformers-in-medical-imaging}.

可理解性 · Transformer · Vision · 塊 · INFORMS ·

2021 年 12 月 30 日

Nested Hierarchical Transformer: Towards Accurate, Data-Efficient and Interpretable Visual Understanding

Zizhao Zhang,Han Zhang,Long Zhao,Ting Chen,Sercan O. Arik,Tomas Pfister

from arxiv, AAAI2022

Hierarchical structures are popular in recent vision transformers, however, they require sophisticated designs and massive datasets to work well. In this paper, we explore the idea of nesting basic local transformers on non-overlapping image blocks and aggregating them in a hierarchical way. We find that the block aggregation function plays a critical role in enabling cross-block non-local information communication. This observation leads us to design a simplified architecture that requires minor code changes upon the original vision transformer. The benefits of the proposed judiciously-selected design are threefold: (1) NesT converges faster and requires much less training data to achieve good generalization on both ImageNet and small datasets like CIFAR; (2) when extending our key ideas to image generation, NesT leads to a strong decoder that is 8$\times$ faster than previous transformer-based generators; and (3) we show that decoupling the feature learning and abstraction processes via this nested hierarchy in our design enables constructing a novel method (named GradCAT) for visually interpreting the learned model. Source code is available //github.com/google-research/nested-transformer.

Extensibility · 噪聲 · Performer · state-of-the-art · 學成 ·

2021 年 6 月 30 日

Affective Image Content Analysis: Two Decades Review and New Perspectives

Sicheng Zhao,Xingxu Yao,Jufeng Yang,Guoli Jia,Guiguang Ding,Tat-Seng Chua,Bj?rn W. Schuller,Kurt Keutzer

from arxiv, Accepted by IEEE TPAMI

Images can convey rich semantics and induce various emotions in viewers. Recently, with the rapid advancement of emotional intelligence and the explosive growth of visual data, extensive research efforts have been dedicated to affective image content analysis (AICA). In this survey, we will comprehensively review the development of AICA in the recent two decades, especially focusing on the state-of-the-art methods with respect to three main challenges -- the affective gap, perception subjectivity, and label noise and absence. We begin with an introduction to the key emotion representation models that have been widely employed in AICA and description of available datasets for performing evaluation with quantitative comparison of label noise and dataset bias. We then summarize and compare the representative approaches on (1) emotion feature extraction, including both handcrafted and deep features, (2) learning methods on dominant emotion recognition, personalized emotion prediction, emotion distribution learning, and learning from noisy data or few labels, and (3) AICA based applications. Finally, we discuss some challenges and promising research directions in the future, such as image content and context understanding, group emotion clustering, and viewer-image interaction.

MoDELS · Taxonomy · INFORMS · 可理解性 · IR ·

2019 年 8 月 15 日

Explainable Recommendation: A Survey and New Perspectives

Yongfeng Zhang,Xu Chen

from arxiv, 88 pages

Explainable recommendation attempts to develop models that generate not only high-quality recommendations but also intuitive explanations. The explanations may either be post-hoc or directly come from an explainable model (also called interpretable or transparent model in some context). Explainable recommendation tries to address the problem of why: by providing explanations to users or system designers, it helps humans to understand why certain items are recommended by the algorithm, where the human can either be users or system designers. Explainable recommendation helps to improve the transparency, persuasiveness, effectiveness, trustworthiness, and satisfaction of recommendation systems. In this survey, we review works on explainable recommendation in or before the year of 2019. We first highlight the position of explainable recommendation in recommender system research by categorizing recommendation problems into the 5W, i.e., what, when, who, where, and why. We then conduct a comprehensive survey of explainable recommendation on three perspectives: 1) We provide a chronological research timeline of explainable recommendation, including user study approaches in the early years and more recent model-based approaches. 2) We provide a two-dimensional taxonomy to classify existing explainable recommendation research: one dimension is the information source (or display style) of the explanations, and the other dimension is the algorithmic mechanism to generate explainable recommendations. 3) We summarize how explainable recommendation applies to different recommendation tasks, such as product recommendation, social recommendation, and POI recommendation. We also devote a section to discuss the explanation perspectives in broader IR and AI/ML research. We end the survey by discussing potential future directions to promote the explainable recommendation research area and beyond.