
Conducting collaborative tasks, e.g., multi-user games, in virtual reality (VR) could enable more immersive and effective experiences. However, in current VR systems users cannot properly communicate with each other via their gaze points, which interferes with their mutual understanding of intention. In this study, we aimed to find the optimal eye tracking data visualization that minimized cognitive interference and improved users' understanding of each other's visual attention and intention. We designed three eye tracking data visualizations, gaze cursor, gaze spotlight, and gaze trajectory, in a VR scene for a course on the human heart, and found that the gaze cursor from doctors helped students learn complex 3D heart models more effectively. To explore further, pairs of students were asked to complete a quiz in the VR environment while sharing gaze cursors with each other, and they achieved higher efficiency and scores. This indicates that sharing eye tracking data visualizations can improve the quality and efficiency of collaborative work in VR environments.
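
To make the gaze-cursor idea concrete, the sketch below shows one way a gaze ray could be projected onto scene geometry to place a shared cursor; this is an illustrative Python/numpy example, not the study's actual VR implementation.

```python
# Illustrative sketch (not the paper's implementation): placing a shared gaze
# cursor by intersecting a user's gaze ray with a plane in the VR scene.
import numpy as np

def gaze_cursor_on_plane(eye_pos, gaze_dir, plane_point, plane_normal):
    """Return the 3D point where the gaze ray hits the plane, or None."""
    gaze_dir = gaze_dir / np.linalg.norm(gaze_dir)
    denom = np.dot(plane_normal, gaze_dir)
    if abs(denom) < 1e-6:          # gaze ray parallel to the plane
        return None
    t = np.dot(plane_normal, plane_point - eye_pos) / denom
    return eye_pos + t * gaze_dir if t > 0 else None

# Example: a doctor's gaze cursor shared with a student in the same scene.
cursor = gaze_cursor_on_plane(np.array([0.0, 1.6, 0.0]),
                              np.array([0.0, -0.2, -1.0]),
                              np.array([0.0, 1.0, -2.0]),
                              np.array([0.0, 0.0, 1.0]))
print(cursor)
```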

Related content

While "instruction-tuned" generative large language models (LLMs) have demonstrated an impressive ability to generalize to new tasks, their training phases rely heavily on large amounts of diverse and high-quality instruction data (as used for models such as ChatGPT and GPT-4). Unfortunately, acquiring high-quality data, especially human-written data, can pose significant challenges in terms of both cost and accessibility. Moreover, privacy concerns can further limit access to such data, making the process of obtaining it complex and nuanced. Consequently, this hinders the generality of the tuned models and may restrict their effectiveness in certain contexts. To tackle this issue, our study introduces a new approach called Federated Instruction Tuning (FedIT), which leverages federated learning (FL) as the learning framework for the instruction tuning of LLMs. This marks the first exploration of FL-based instruction tuning for LLMs. It is especially important since text data is predominantly generated by end users; it is therefore imperative to design and adapt FL approaches to effectively leverage these users' diverse instructions stored on local devices, while preserving privacy and ensuring data security. Using the widely adopted GPT-4 auto-evaluation, we demonstrate that by exploiting the heterogeneous and diverse sets of instructions on the clients' end with the proposed FedIT framework, we improve the performance of LLMs compared to centralized training with only limited local instructions. We also release a GitHub repository named Shepherd, which offers a foundational framework for exploring federated fine-tuning of LLMs using heterogeneous instructions across diverse categories.
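
As a rough illustration of the federated learning side of this setup, the following sketch runs a FedAvg-style aggregation over per-client updates; the local update is a stand-in for real instruction tuning, and this is not code from the Shepherd repository.

```python
# A minimal FedAvg-style sketch of federated instruction tuning, loosely in the
# spirit of FedIT; this is NOT the Shepherd repository's actual code.
import numpy as np

def local_update(global_weights, client_instructions, lr=0.1):
    # Placeholder for local instruction tuning on a client's private data;
    # here we just take a fake gradient step.
    fake_grad = np.random.randn(*global_weights.shape) / len(client_instructions)
    return global_weights - lr * fake_grad

def fedavg(global_weights, client_datasets):
    sizes = np.array([len(d) for d in client_datasets], dtype=float)
    local_models = [local_update(global_weights, d) for d in client_datasets]
    # Weight each client's model by its share of the total instruction data.
    return sum(w * (n / sizes.sum()) for w, n in zip(local_models, sizes))

weights = np.zeros(8)                       # stand-in for LLM/adapter weights
clients = [["instr"] * n for n in (5, 20, 75)]
for _ in range(3):                          # three communication rounds
    weights = fedavg(weights, clients)
print(weights)
```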

Cross-modal contrastive learning in vision-language pretraining (VLP) faces the challenge of (partial) false negatives. In this paper, we study this problem from the perspective of Mutual Information (MI) optimization. It is well known that the InfoNCE loss used in contrastive learning maximizes a lower bound on the MI between anchors and their positives, while we theoretically prove that the MI involving negatives also matters when noise is commonly present. Guided by a more general lower bound form for optimization, we propose a contrastive learning strategy regulated by progressively refined cross-modal similarity, to more accurately optimize the MI between an image/text anchor and its negative texts/images instead of improperly minimizing it. Our method performs competitively on four downstream cross-modal tasks and systematically balances the beneficial and harmful effects of (partial) false negative samples under theoretical guidance.
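
To illustrate the general idea of down-weighting likely false negatives, here is a hedged PyTorch sketch of a soft-weighted InfoNCE loss; the weighting scheme is an assumption for illustration and may differ from the paper's exact formulation.

```python
# Sketch of a soft-weighted InfoNCE loss: off-diagonal pairs that look highly
# similar to the anchor (likely false negatives) are down-weighted in the
# denominator. Illustrative only; not the paper's exact objective.
import torch
import torch.nn.functional as F

def weighted_infonce(image_emb, text_emb, temperature=0.07):
    image_emb = F.normalize(image_emb, dim=-1)
    text_emb = F.normalize(text_emb, dim=-1)
    logits = image_emb @ text_emb.t() / temperature      # (B, B) similarities
    targets = torch.arange(logits.size(0))
    with torch.no_grad():
        sim = (image_emb @ text_emb.t()).clamp(min=0)
        weights = 1.0 - sim                  # high similarity -> small weight
        weights.fill_diagonal_(1.0)          # keep the positive pair intact
    weighted_logits = logits + torch.log(weights + 1e-8)
    return F.cross_entropy(weighted_logits, targets)

loss = weighted_infonce(torch.randn(4, 16), torch.randn(4, 16))
print(loss.item())
```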

Simultaneous localization and mapping (SLAM) algorithms are essential for the autonomous navigation of mobile robots. With the increasing demand for autonomous systems, it is crucial to evaluate and compare the performance of these algorithms in real-world environments. In this paper, we provide an evaluation strategy and real-world datasets to test and evaluate SLAM algorithms in complex and challenging indoor environments. Further, we analyse state-of-the-art (SOTA) SLAM algorithms based on various metrics, such as absolute trajectory error, scale drift, and map accuracy and consistency. Our results demonstrate that SOTA SLAM algorithms often fail in challenging environments with dynamic objects and transparent or reflective surfaces. We also found that successful loop closures had a significant impact on algorithm performance. These findings highlight the need for further research to improve the robustness of the algorithms in real-world scenarios.
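
As a concrete example of one of these metrics, the snippet below computes absolute trajectory error (ATE) as an RMSE over matched poses; full evaluation pipelines typically also perform a rigid alignment (e.g., the Umeyama method) before computing the error.

```python
# Illustrative computation of absolute trajectory error (ATE) as the RMSE of
# position differences between estimated and ground-truth trajectories.
import numpy as np

def ate_rmse(estimated, ground_truth):
    """Both inputs: (N, 3) arrays of positions at matched timestamps."""
    errors = np.linalg.norm(estimated - ground_truth, axis=1)
    return np.sqrt(np.mean(errors ** 2))

gt = np.cumsum(np.random.randn(100, 3) * 0.1, axis=0)      # toy trajectory
est = gt + np.random.randn(100, 3) * 0.05                  # noisy estimate
print(f"ATE RMSE: {ate_rmse(est, gt):.3f} m")
```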

Human-robot collaboration (HRC) has become increasingly relevant in industrial, household, and commercial settings. However, the effectiveness of such collaborations is highly dependent on the human's and the robot's situational awareness of the environment. Improving this awareness includes not only aligning perceptions in a shared workspace, but also bidirectionally communicating intent and visualizing different states of the environment to enhance scene understanding. In this paper, we propose ARDIE (Augmented Reality with Dialogue and Eye Gaze), a novel intelligent agent that leverages multi-modal feedback cues to enhance HRC. Our system utilizes a decision-theoretic framework to formulate a joint policy that incorporates interactive augmented reality (AR), natural language, and eye gaze to portray current and future states of the environment. Through object-specific AR renders, the human can visualize future object interactions and make adjustments as needed, ultimately providing an interactive and efficient collaboration between humans and robots.
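
The toy sketch below illustrates, in a hypothetical and much-simplified form, how a decision-theoretic agent might choose among feedback cues by expected utility; the modalities, belief, and utilities here are illustrative assumptions, not ARDIE's actual model.

```python
# Hypothetical sketch: pick the feedback cue (AR render, speech, gaze prompt)
# that maximizes expected utility under a belief over the human's intent.

def expected_utility(belief, utilities):
    return sum(belief[s] * utilities.get(s, 0.0) for s in belief)

def choose_cue(belief, cue_utilities):
    return max(cue_utilities,
               key=lambda cue: expected_utility(belief, cue_utilities[cue]))

belief = {"pick_red_block": 0.7, "pick_blue_block": 0.3}
cue_utilities = {
    "ar_render":   {"pick_red_block": 0.9, "pick_blue_block": 0.4},
    "speech":      {"pick_red_block": 0.6, "pick_blue_block": 0.6},
    "gaze_prompt": {"pick_red_block": 0.5, "pick_blue_block": 0.8},
}
print(choose_cue(belief, cue_utilities))   # -> "ar_render" under this belief
```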

Virtual reality (VR) produces a highly realistic simulated environment with controllable environment variables. This paper proposes a Dynamic Scene Adjustment (DSA) mechanism based on the user's interaction status and performance, which aims to adjust the VR experiment variables to improve the user's game engagement. We combined the DSA mechanism with a musical rhythm VR game. The experimental results show that the DSA mechanism can improve the user's game engagement (task performance).
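
A minimal sketch of what such an adjustment loop could look like is given below; the hit-rate thresholds and step size are illustrative assumptions, not the parameters used in the study.

```python
# Toy Dynamic Scene Adjustment-style loop: the rhythm game's note speed is
# nudged up or down based on the player's recent hit rate.

def adjust_speed(current_speed, recent_hit_rate,
                 target_low=0.6, target_high=0.85, step=0.1):
    if recent_hit_rate > target_high:       # too easy -> increase challenge
        return current_speed + step
    if recent_hit_rate < target_low:        # too hard -> ease off
        return max(0.1, current_speed - step)
    return current_speed                    # within the engagement band

speed = 1.0
for hit_rate in (0.9, 0.95, 0.7, 0.5, 0.55):
    speed = adjust_speed(speed, hit_rate)
    print(f"hit rate {hit_rate:.2f} -> speed {speed:.2f}")
```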

This paper explores the potential of using generative AI models in group-focused co-creative frameworks to enhance problem solving and ideation in business innovation and co-creation contexts. It proposes a novel prompting technique for conversational generative AI agents that employs methods inspired by traditional 'human-to-human' facilitation and instruction to enable active contribution to Design Thinking, a co-creative framework. Through experiments using this prompting technique, we gather evidence that conversational generative transformers (i.e., ChatGPT) can contribute context-specific, useful, and creative input to Design Thinking activities. We also discuss the potential benefits, limitations, and risks associated with using generative AI models in co-creative ideation and provide recommendations for future research.
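
As an illustration only, the snippet below builds a facilitation-style prompt scaffold for a Design Thinking stage; the wording and structure are assumptions, not the paper's exact prompting technique.

```python
# Illustrative prompt scaffold in the spirit of facilitation-style prompting
# for a Design Thinking session; the template text is an assumption.
import textwrap

def facilitation_prompt(stage, problem_statement, prior_ideas):
    return textwrap.dedent(f"""\
        You are a Design Thinking facilitator working with a business team.
        Current stage: {stage}.
        Problem statement: {problem_statement}
        Ideas so far: {", ".join(prior_ideas) if prior_ideas else "none yet"}.
        As an active participant, contribute three concrete, context-specific
        ideas, then ask the team one question that pushes the ideation further.
    """)

print(facilitation_prompt(
    stage="Ideate",
    problem_statement="Reduce onboarding friction for new banking customers.",
    prior_ideas=["video walkthroughs", "gamified checklist"],
))
```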

Climate change may be classified as the most important environmental problem the Earth is currently facing, and it affects all living species on Earth. Because air-quality monitoring stations are typically ground-based, their ability to detect pollutant distributions over wide areas is limited. Satellites, however, have the potential to study the atmosphere at large; the European Space Agency (ESA) Copernicus project satellite "Sentinel-5P" is a newly launched satellite capable of measuring a variety of pollutants, with publicly available data outputs. This paper seeks to create a multi-modal machine learning model for predicting air-quality metrics where monitoring stations do not exist. The inputs of this model include a fusion of ground measurements and satellite data, with the goal of highlighting pollutant distributions and motivating change in societal and industrial behaviors. A new dataset of European pollution monitoring station measurements is created, with features including $\textit{altitude, population, etc.}$ from the ESA Copernicus project. This dataset is used to train a multi-modal ML model, Air Quality Network (AQNet), capable of fusing these various data sources to output predictions of various pollutants. These predictions are then aggregated to create an "air-quality index" that can be used to compare air quality over different regions. Three pollutants, NO$_2$, O$_3$, and PM$_{10}$, are predicted successfully by AQNet, and the network was found to be useful compared to a model using only satellite imagery. The addition of supporting data was also found to improve predictions. When testing the developed AQNet on out-of-sample data from the UK and Ireland, we obtain satisfactory estimates, though pollution metrics were on average overestimated by around 20\%.
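
For intuition, here is a hedged PyTorch sketch of a small multi-modal fusion model in the spirit of AQNet, combining an image encoder with a tabular encoder; the layer sizes and the crude index aggregation are illustrative assumptions, not AQNet's actual architecture.

```python
# Toy multi-modal fusion: a small CNN encodes a satellite patch, an MLP encodes
# tabular features (e.g. altitude, population), and fused features predict
# NO2, O3, and PM10. Illustrative only.
import torch
import torch.nn as nn

class ToyAQNet(nn.Module):
    def __init__(self, n_tabular=4, n_pollutants=3):
        super().__init__()
        self.image_enc = nn.Sequential(
            nn.Conv2d(3, 8, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
        )
        self.tab_enc = nn.Sequential(nn.Linear(n_tabular, 16), nn.ReLU())
        self.head = nn.Linear(8 + 16, n_pollutants)

    def forward(self, image, tabular):
        fused = torch.cat([self.image_enc(image), self.tab_enc(tabular)], dim=1)
        return self.head(fused)             # predicted NO2, O3, PM10 levels

model = ToyAQNet()
preds = model(torch.randn(2, 3, 64, 64), torch.randn(2, 4))
aqi = preds.mean(dim=1)                     # crude "air-quality index" per site
print(preds.shape, aqi.shape)
```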

In order to advance academic research, it is important to assess and evaluate the academic influence of researchers and the findings they produce. Citation metrics are universally used to evaluate researchers. Among the several variations of citation metrics, the h-index proposed by Hirsch has become the leading measure. Recent work shows that the h-index is not an effective measure of scientific impact due to changing authorship patterns. This can be mitigated by using the h-index of a paper to compute the h-index of an author. We show that using fractional allocation of the h-index gives better results. In this work, we reapply two indices based on the h-index of a single paper, referred to as the hp-index and the hp-frac-index. We run large-scale experiments in three different fields with about a million publications and 3,000 authors. We also compare the h-index of a paper with nine h-index-like metrics. Our experiments show that the hp-frac-index provides a unique ranking compared to the h-index and performs better than the h-index in assigning higher ranks to awarded researchers.
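
For reference, the snippet below computes the classic h-index and a fractionally allocated variant in which each paper's citations are divided by its author count; these are simplified illustrations, not the exact hp-index or hp-frac-index definitions from the paper.

```python
# Illustrative helpers: classic h-index and a fractional-allocation variant.

def h_index(citations):
    """citations: list of citation counts, one per paper."""
    ranked = sorted(citations, reverse=True)
    return sum(1 for rank, c in enumerate(ranked, start=1) if c >= rank)

def fractional_h_index(papers):
    """papers: list of (citation_count, n_authors) tuples."""
    shared = sorted((c / n for c, n in papers), reverse=True)
    return sum(1 for rank, c in enumerate(shared, start=1) if c >= rank)

print(h_index([10, 8, 5, 4, 3]))                            # -> 4
print(fractional_h_index([(10, 2), (8, 4), (5, 1), (4, 2), (3, 3)]))
```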

With the wide application of Large Language Models (LLMs) such as ChatGPT, making the content generated by LLMs accurate and credible has become very important, especially in complex knowledge-intensive tasks. In this paper, we propose a novel framework called Search-in-the-Chain (SearChain) to improve the accuracy, credibility, and traceability of LLM-generated content for multi-hop question answering, a typical complex knowledge-intensive task. SearChain is a framework that deeply integrates the LLM and information retrieval (IR). In SearChain, the LLM constructs a chain-of-query, which is the decomposition of the multi-hop question. Each node of the chain is a query-answer pair consisting of an IR-oriented query and the answer generated by the LLM for this query. IR verifies, completes, and traces the information of each node of the chain, so as to guide the LLM to construct the correct chain-of-query and finally answer the multi-hop question. SearChain makes the LLM shift from trying to give an answer directly to constructing the chain-of-query when faced with a multi-hop question, which stimulates its knowledge-reasoning ability and provides an interface for IR to be deeply involved in the reasoning process of the LLM. IR interacts with each node of the chain-of-query: it verifies the information of the node and provides unknown knowledge to the LLM, which ensures the accuracy of the whole chain while the answer is being generated. Besides the final answer, the content returned by the LLM to the user also includes the reasoning process for the question, that is, the chain-of-query and the supporting documents retrieved by IR for each node of the chain, which improves the credibility and traceability of the generated content. Experimental results show that SearChain outperforms related baselines on four multi-hop question-answering datasets.
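
The toy sketch below captures the overall interaction loop at a high level: the LLM proposes a chain-of-query, IR checks each node, and unsupported nodes are sent back with evidence. The DummyLLM and DummyIR stubs are placeholders for illustration, not the real framework's prompts or verification logic.

```python
# Toy SearChain-style loop with placeholder LLM and IR components.

class DummyLLM:
    def build_chain_of_query(self, question):
        # In the real system, the LLM decomposes the multi-hop question itself.
        return [("Who wrote Hamlet?", "Shakespeare"),
                ("When was Shakespeare born?", "1564")]

    def revise_with_evidence(self, corrections):
        pass  # a real LLM would regenerate unsupported nodes from the evidence

class DummyIR:
    corpus = {"Shakespeare": ["Hamlet was written by Shakespeare."],
              "1564": ["Shakespeare was born in 1564."]}

    def search(self, query):
        return [d for docs in self.corpus.values() for d in docs]

    def supports(self, answer, docs):
        return any(answer in d for d in docs)

def searchain(question, llm, ir, max_rounds=3):
    chain = []
    for _ in range(max_rounds):
        chain = llm.build_chain_of_query(question)
        bad = [(q, ir.search(q)) for q, a in chain
               if not ir.supports(a, ir.search(q))]
        if not bad:
            return chain[-1][1], chain      # final answer + reasoning trace
        llm.revise_with_evidence(bad)
    return None, chain

print(searchain("When was the author of Hamlet born?", DummyLLM(), DummyIR()))
```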

Recommender systems have been widely applied in different real-life scenarios to help us find useful information. Recently, Reinforcement Learning (RL)-based recommender systems have become an emerging research topic. They often surpass traditional recommendation models, and even most deep-learning-based methods, owing to their interactive nature and autonomous learning ability. Nevertheless, applying RL in recommender systems brings various challenges. To this end, we first provide a thorough overview, comparison, and summarization of RL approaches for five typical recommendation scenarios, following the three main categories of RL: value-function, policy search, and Actor-Critic. We then systematically analyze the challenges and relevant solutions on the basis of existing literature. Finally, in discussing the open issues and limitations of RL for recommendation, we highlight some potential research directions in this field.
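
As a minimal illustration of the value-function category, the following sketch runs tabular Q-learning in a simulated recommendation loop; the states, items, and reward signal are all toy stand-ins, not a real recommender.

```python
# Toy value-function (Q-learning) recommendation loop with simulated feedback.
import random
from collections import defaultdict

items = ["news", "music", "sports"]
q = defaultdict(float)                       # Q[(state, item)]
alpha, gamma, epsilon = 0.1, 0.9, 0.2

def simulated_click(state, item):            # fake user feedback
    return 1.0 if (state, item) in {("morning", "news"),
                                    ("evening", "music")} else 0.0

state = "morning"
for _ in range(500):
    if random.random() < epsilon:            # explore
        item = random.choice(items)
    else:                                    # exploit current value estimates
        item = max(items, key=lambda i: q[(state, i)])
    reward = simulated_click(state, item)
    next_state = "evening" if state == "morning" else "morning"
    best_next = max(q[(next_state, i)] for i in items)
    q[(state, item)] += alpha * (reward + gamma * best_next - q[(state, item)])
    state = next_state

print(sorted(q.items(), key=lambda kv: -kv[1])[:4])
```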
