
Automated driving has the potential to revolutionize personal, public, and freight mobility. Besides the enormous challenge of perception, i.e. accurately perceiving the environment using available sensor data, automated driving comprises planning a safe, comfortable, and efficient motion trajectory. To promote safety and progress, many works rely on modules that predict the future motion of surrounding traffic. Modular automated driving systems commonly handle prediction and planning as sequential separate tasks. While this accounts for the influence of surrounding traffic on the ego-vehicle, it fails to anticipate the reactions of traffic participants to the ego-vehicle's behavior. Recent works suggest that integrating prediction and planning in an interdependent joint step is necessary to achieve safe, efficient, and comfortable driving. While various models implement such integrated systems, a comprehensive overview and theoretical understanding of different principles are lacking. We systematically review state-of-the-art deep learning-based prediction, planning, and integrated prediction and planning models. Different facets of the integration ranging from model architecture and model design to behavioral aspects are considered and related to each other. Moreover, we discuss the implications, strengths, and limitations of different integration methods. By pointing out research gaps, describing relevant future challenges, and highlighting trends in the research field, we identify promising directions for future research.

Related content

Automator

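Automator is software developed by Apple for its Mac OS X system. By clicking and dragging with the mouse, you can combine a series of actions into a workflow that completes complex tasks automatically (and repeatably). Automator also works across many kinds of applications, including Finder, the Safari web browser, iCal, Address Book, and others. It can also work with some third-party applications, such as Microsoft Office, Adobe Photoshop, or Pixelmator.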

Performer · MoDELS · VR · Saccade · Less

October 3, 2023

The Shortest Route Is Not Always the Fastest: Probability-Modeled Stereoscopic Eye Movement Completion Time in VR

Budmonde Duinkharjav, Benjamin Liang, Anjul Patney, Rachel Brown, Qi Sun

Speed and consistency of target-shifting play a crucial role in human ability to perform complex tasks. Shifting our gaze between objects of interest quickly and consistently requires changes both in depth and direction. Gaze changes in depth are driven by slow, inconsistent vergence movements which rotate the eyes in opposite directions, while changes in direction are driven by ballistic, consistent movements called saccades, which rotate the eyes in the same direction. In the natural world, most of our eye movements are a combination of both types. While scientific consensus on the nature of saccades exists, vergence and combined movements remain less understood and agreed upon. We eschew the lack of scientific consensus in favor of proposing an operationalized computational model which predicts the speed of any type of gaze movement during target-shifting in 3D. To this end, we conduct a psychophysical study in a stereo VR environment to collect more than 12,000 gaze movement trials, analyze the temporal distribution of the observed gaze movements, and fit a probabilistic model to the data. We perform a series of objective measurements and user studies to validate the model. The results demonstrate its predictive accuracy, generalization, as well as applications for optimizing visual performance by altering content placement. Lastly, we leverage the model to measure differences in human target-changing time relative to the natural world, as well as suggest scene-aware projection depth. By incorporating the complexities and randomness of human oculomotor control, we hope this research will support new behavior-aware metrics for VR/AR display design, interface layout, and gaze-contingent rendering.

Online · Design · INTERACT · Commentator · Analysis

September 29, 2023

Decentralizing Platform Power: A Design Space of Multi-level Governance in Online Social Platforms

Shagun Jhaver, Seth Frey, Amy Zhang

Many have criticized the centralized and unaccountable governance of prominent online social platforms, leading to renewed interest in platform governance that incorporates multiple centers of power. Decentralization of power can arise horizontally, through parallel communities, each with local administration, and vertically, through multiple hierarchies of overlapping jurisdiction. Drawing from literature on federalism and polycentricity in analogous offline institutions, we scrutinize the landscape of existing platforms through the lens of multi-level governance. Our analysis describes how online platforms incorporate varying forms and degrees of decentralized governance. In particular, we propose a framework that characterizes the general design space and the various ways that middle levels of governance vary in how they can interact with a centralized governance system above and end users below. This focus provides a starting point for new lines of inquiry between platform- and community-governance scholarship. By engaging themes of decentralization, hierarchy, power, and responsibility, while discussing concrete examples, we connect designers and theorists of online spaces.

Estimation/Estimator · Precision/Accuracy · Performer · Attention · Learning

September 29, 2023

GSDC Transformer: An Efficient and Effective Cue Fusion for Monocular Multi-Frame Depth Estimation

Naiyu Fang, Lemiao Qiu, Shuyou Zhang, Zili Wang, Zheyuan Zhou, Kerui Hu

Depth estimation provides an alternative approach for perceiving 3D information in autonomous driving. Monocular depth estimation, whether with single-frame or multi-frame inputs, has achieved significant success by learning various types of cues and specializing in either static or dynamic scenes. Recently, fusing these cues has become an attractive topic, aiming to enable the combined cues to perform well in both types of scenes. However, adaptive cue fusion relies on attention mechanisms, where the quadratic complexity limits the granularity of cue representation. Additionally, explicit cue fusion depends on precise segmentation, which imposes a heavy burden on mask prediction. To address these issues, we propose the GSDC Transformer, an efficient and effective component for cue fusion in monocular multi-frame depth estimation. We utilize deformable attention to learn cue relationships at a fine scale, while sparse attention reduces computational requirements when granularity increases. To compensate for the precision drop in dynamic scenes, we represent scene attributes in the form of super tokens without relying on precise shapes. Within each super token attributed to dynamic scenes, we gather its relevant cues and learn local dense relationships to enhance cue fusion. Our method achieves state-of-the-art performance on the KITTI dataset with efficient fusion speed.

INTERACT · Performer · AI · Dataset · Seven

September 29, 2023

HoloAssist: an Egocentric Human Interaction Dataset for Interactive AI Assistants in the Real World

Xin Wang, Taein Kwon, Mahdi Rad, Bowen Pan, Ishani Chakraborty, Sean Andrist, Dan Bohus, Ashley Feniello, Bugra Tekin, Felipe Vieira Frujeri, Neel Joshi, Marc Pollefeys

from arxiv, ICCV 2023

Building an interactive AI assistant that can perceive, reason, and collaborate with humans in the real world has been a long-standing pursuit in the AI community. This work is part of a broader research effort to develop intelligent agents that can interactively guide humans through performing tasks in the physical world. As a first step in this direction, we introduce HoloAssist, a large-scale egocentric human interaction dataset, where two people collaboratively complete physical manipulation tasks. The task performer executes the task while wearing a mixed-reality headset that captures seven synchronized data streams. The task instructor watches the performer's egocentric video in real time and guides them verbally. By augmenting the data with action and conversational annotations and observing the rich behaviors of various participants, we present key insights into how human assistants correct mistakes, intervene in the task completion procedure, and ground their instructions to the environment. HoloAssist spans 166 hours of data captured by 350 unique instructor-performer pairs. Furthermore, we construct and present benchmarks on mistake detection, intervention type prediction, and hand forecasting, along with detailed analysis. We expect HoloAssist will provide an important resource for building AI assistants that can fluidly collaborate with humans in the real world. Data can be downloaded at //holoassist.github.io/.

Statistic · Random variable · Functional · Extensibility · PyTorch

September 28, 2023

QonFusion -- Quantum Approaches to Gaussian Random Variables: Applications in Stable Diffusion and Brownian Motion

Shlomo Kashani

In the present study, we delineate a strategy focused on non-parametric quantum circuits for the generation of Gaussian random variables (GRVs). This quantum-centric approach serves as a substitute for conventional pseudorandom number generators (PRNGs), such as the torch.rand function in PyTorch. The principal theme of our research is the incorporation of Quantum Random Number Generators (QRNGs) into classical models of diffusion. Notably, our Quantum Gaussian Random Variable Generator fulfills dual roles, facilitating simulations in both Stable Diffusion (SD) and Brownian Motion (BM). This diverges markedly from prevailing methods that utilize parametric quantum circuits (PQCs), often in conjunction with variational quantum eigensolvers (VQEs). Although conventional techniques can accurately approximate ground states in complex systems or model elaborate probability distributions, they require a computationally demanding optimization process to tune parameters. Our non-parametric strategy obviates this necessity. To facilitate assimilating our methodology into existing computational frameworks, we put forward QonFusion, a Python library congruent with both PyTorch and PennyLane, functioning as a bridge between classical and quantum computational paradigms. We validate QonFusion through extensive statistical testing, including tests which confirm the statistical equivalence of the Gaussian samples from our quantum approach to classical counterparts within defined significance limits. QonFusion is available at //boltzmannentropy.github.io/qonfusion.github.io/ to reproduce all findings here.
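To make the role of the Gaussian source in a Brownian Motion simulation concrete, here is a minimal sketch in plain Python. It is not the QonFusion API: the function name `brownian_path` and the pluggable `gaussian` argument are assumptions made for illustration, and a classical PRNG from the standard library stands in for any quantum sampler.

```python
import math
import random
from typing import Callable, List


def brownian_path(n_steps: int, dt: float, gaussian: Callable[[], float]) -> List[float]:
    """Simulate one standard Brownian path sampled every `dt` time units.

    `gaussian` is any zero-argument callable returning a standard normal
    sample. This parameter is hypothetical: it marks the place where a
    different Gaussian source could be swapped in.
    """
    path = [0.0]            # W(0) = 0
    scale = math.sqrt(dt)   # each increment is distributed as N(0, dt)
    for _ in range(n_steps):
        path.append(path[-1] + scale * gaussian())
    return path


# Classical stand-in for the Gaussian source (assumption, not a quantum sampler).
rng = random.Random(0)  # fixed seed so repeated runs give the same path
path = brownian_path(n_steps=1000, dt=0.001, gaussian=lambda: rng.gauss(0.0, 1.0))
print(len(path), path[0])
```

Because the simulation only needs a callable that returns standard normal samples, the comparison the abstract describes (quantum versus classical samples) would amount to passing a different `gaussian` and running the same statistical checks on the output.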

Agent · AI Agent · Language modeling · AI · MoDELS

September 14, 2023

The Rise and Potential of Large Language Model Based Agents: A Survey

Zhiheng Xi, Wenxiang Chen, Xin Guo, Wei He, Yiwen Ding, Boyang Hong, Ming Zhang, Junzhe Wang, Senjie Jin, Enyu Zhou, Rui Zheng, Xiaoran Fan, Xiao Wang, Limao Xiong, Qin Liu, Yuhao Zhou, Weiran Wang, Changhao Jiang, Yicheng Zou, Xiangyang Liu, Zhangyue Yin, Shihan Dou, Rongxiang Weng, Wensen Cheng, Qi Zhang, Wenjuan Qin, Yongyan Zheng, Xipeng Qiu, Xuanjing Huan, Tao Gui

from arxiv, 86 pages, 12 figures

For a long time, humanity has pursued artificial intelligence (AI) equivalent to or surpassing the human level, with AI agents considered a promising vehicle for this pursuit. AI agents are artificial entities that sense their environment, make decisions, and take actions. Many efforts have been made to develop intelligent AI agents since the mid-20th century. However, these efforts have mainly focused on advancement in algorithms or training strategies to enhance specific capabilities or performance on particular tasks. Actually, what the community lacks is a sufficiently general and powerful model to serve as a starting point for designing AI agents that can adapt to diverse scenarios. Due to the versatile and remarkable capabilities they demonstrate, large language models (LLMs) are regarded as potential sparks for Artificial General Intelligence (AGI), offering hope for building general AI agents. Many research efforts have leveraged LLMs as the foundation to build AI agents and have achieved significant progress. We start by tracing the concept of agents from its philosophical origins to its development in AI, and explain why LLMs are suitable foundations for AI agents. Building upon this, we present a conceptual framework for LLM-based agents, comprising three main components: brain, perception, and action, and the framework can be tailored to suit different applications. Subsequently, we explore the extensive applications of LLM-based agents in three aspects: single-agent scenarios, multi-agent scenarios, and human-agent cooperation. Following this, we delve into agent societies, exploring the behavior and personality of LLM-based agents, the social phenomena that emerge when they form societies, and the insights they offer for human society. Finally, we discuss a range of key topics and open problems within the field.

Knowledge · Processing (programming language) · Graph · NLP · Knowledge graph

September 30, 2022

A Decade of Knowledge Graphs in Natural Language Processing: A Survey

Phillip Schneider, Tim Schopf, Juraj Vladika, Mikhail Galkin, Elena Simperl, Florian Matthes

from arxiv, Accepted to AACL-IJCNLP 2022

In pace with developments in the research field of artificial intelligence, knowledge graphs (KGs) have attracted a surge of interest from both academia and industry. As a representation of semantic relations between entities, KGs have proven to be particularly relevant for natural language processing (NLP), experiencing a rapid spread and wide adoption within recent years. Given the increasing amount of research work in this area, several KG-related approaches have been surveyed in the NLP research community. However, a comprehensive study that categorizes established topics and reviews the maturity of individual research streams remains absent to this day. Contributing to closing this gap, we systematically analyzed 507 papers from the literature on KGs in NLP. Our survey encompasses a multifaceted review of tasks, research types, and contributions. As a result, we present a structured overview of the research landscape, provide a taxonomy of tasks, summarize our findings, and highlight directions for future work.

Attention mechanism · Cognition · Performer · Deep learning · Boosting (a way to speed up model training)

April 16, 2022

Visual Attention Methods in Deep Learning: An In-Depth Survey

Mohammed Hassanin, Saeed Anwar, Ibrahim Radwan, Fahad S Khan, Ajmal Mian

Inspired by the human cognitive system, attention is a mechanism that imitates the human cognitive awareness about specific information, amplifying critical details to focus more on the essential aspects of data. Deep learning has employed attention to boost performance for many applications. Interestingly, the same attention design can suit processing different data modalities and can easily be incorporated into large networks. Furthermore, multiple complementary attention mechanisms can be incorporated in one network. Hence, attention techniques have become extremely attractive. However, the literature lacks a comprehensive survey specific to attention techniques to guide researchers in employing attention in their deep models. Note that, besides being demanding in terms of training data and computational resources, transformers only cover a single category in self-attention out of the many categories available. We fill this gap and provide an in-depth survey of 50 attention techniques, categorizing them by their most prominent features. We initiate our discussion by introducing the fundamental concepts behind the success of the attention mechanism. Next, we furnish some essentials such as the strengths and limitations of each attention category, describe their fundamental building blocks, basic formulations with primary usage, and applications specifically for computer vision. We also discuss the challenges and open questions related to attention mechanisms in general. Finally, we recommend possible future research directions for deep attention.
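For readers unfamiliar with the self-attention category that the abstract attributes to transformers, the following is a minimal sketch of scaled dot-product attention in plain Python. It is the commonly used textbook formulation, not code from the survey, and the function names are illustrative.

```python
import math
from typing import List

Matrix = List[List[float]]


def softmax(row: List[float]) -> List[float]:
    """Turn raw scores into weights that are positive and sum to 1."""
    m = max(row)  # subtract the maximum for numerical stability
    exps = [math.exp(x - m) for x in row]
    total = sum(exps)
    return [e / total for e in exps]


def attention(q: Matrix, k: Matrix, v: Matrix) -> Matrix:
    """Scaled dot-product attention: softmax(Q K^T / sqrt(d)) V."""
    d = len(k[0])  # key dimension, used to scale the scores
    out = []
    for qi in q:
        # Similarity of this query to every key.
        scores = [sum(a * b for a, b in zip(qi, kj)) / math.sqrt(d) for kj in k]
        weights = softmax(scores)
        # Weighted average of the value vectors.
        out.append([sum(w * vj[c] for w, vj in zip(weights, v))
                    for c in range(len(v[0]))])
    return out


# A query that is equally similar to both keys averages the two values.
result = attention([[0.0, 0.0]], [[1.0, 0.0], [0.0, 1.0]], [[1.0, 0.0], [0.0, 1.0]])
print(result)
```

Every query is scored against every key, which is where the quadratic cost in sequence length comes from.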

Processing (programming language) · Inference · NLP · Computational Linguistics · Estimation/Estimator

September 2, 2021

Causal Inference in Natural Language Processing: Estimation, Prediction, Interpretation and Beyond

Amir Feder, Katherine A. Keith, Emaad Manzoor, Reid Pryzant, Dhanya Sridhar, Zach Wood-Doughty, Jacob Eisenstein, Justin Grimmer, Roi Reichart, Margaret E. Roberts, Brandon M. Stewart, Victor Veitch, Diyi Yang

A fundamental goal of scientific research is to learn about causal relationships. However, despite its critical role in the life and social sciences, causality has not had the same importance in Natural Language Processing (NLP), which has traditionally placed more emphasis on predictive tasks. This distinction is beginning to fade, with an emerging area of interdisciplinary research at the convergence of causal inference and language processing. Still, research on causality in NLP remains scattered across domains without unified definitions, benchmark datasets and clear articulations of the remaining challenges. In this survey, we consolidate research across academic areas and situate it in the broader NLP landscape. We introduce the statistical challenge of estimating causal effects, encompassing settings where text is used as an outcome, treatment, or as a means to address confounding. In addition, we explore potential uses of causal inference to improve the performance, robustness, fairness, and interpretability of NLP models. We thus provide a unified overview of causal inference for the computational linguistics community.
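As an illustration of what "addressing confounding" means when estimating a causal effect, here is a minimal sketch of a stratified (adjusted) difference in means next to the naive difference. It is a generic textbook estimator, not a method taken from the survey. The toy data and the idea that the stratum label could come from text (for example, a topic label) are assumptions made for the example.

```python
from collections import defaultdict
from typing import List, Tuple

Row = Tuple[str, int, float]  # (confounder stratum, treatment 0/1, outcome)


def mean(xs: List[float]) -> float:
    return sum(xs) / len(xs)


def naive_effect(rows: List[Row]) -> float:
    """Difference in mean outcome between treated and control, ignoring strata."""
    treated = [y for _, t, y in rows if t == 1]
    control = [y for _, t, y in rows if t == 0]
    return mean(treated) - mean(control)


def adjusted_effect(rows: List[Row]) -> float:
    """Average the within-stratum differences, weighted by stratum size."""
    strata = defaultdict(lambda: {0: [], 1: []})
    for z, t, y in rows:
        strata[z][t].append(y)
    effect = 0.0
    for groups in strata.values():
        if not groups[0] or not groups[1]:
            raise ValueError("each stratum needs treated and control units")
        size = len(groups[0]) + len(groups[1])
        effect += (size / len(rows)) * (mean(groups[1]) - mean(groups[0]))
    return effect


# Hypothetical toy data: stratum "a" has higher outcomes and more treated units.
rows = [("a", 1, 3.0), ("a", 1, 3.0), ("a", 1, 3.0), ("a", 0, 2.0),
        ("b", 1, 1.0), ("b", 0, 0.0), ("b", 0, 0.0), ("b", 0, 0.0)]
print(naive_effect(rows), adjusted_effect(rows))
```

In this toy data the treatment raises the outcome by 1 within each stratum, but the naive comparison reports 2 because treated units are concentrated in the stratum with higher outcomes.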

Skip connection · Neural Networks · Optimizer · Linear · Graph

May 10, 2021

Optimization of Graph Neural Networks: Implicit Acceleration by Skip Connections and More Depth

Keyulu Xu, Mozhi Zhang, Stefanie Jegelka, Kenji Kawaguchi

Graph Neural Networks (GNNs) have been studied from the lens of expressive power and generalization. However, their optimization properties are less well understood. We take the first step towards analyzing GNN training by studying the gradient dynamics of GNNs. First, we analyze linearized GNNs and prove that despite the non-convexity of training, convergence to a global minimum at a linear rate is guaranteed under mild assumptions that we validate on real-world graphs. Second, we study what may affect the GNNs' training speed. Our results show that the training of GNNs is implicitly accelerated by skip connections, more depth, and/or a good label distribution. Empirical results confirm that our theoretical results for linearized GNNs align with the training behavior of nonlinear GNNs. Our results provide the first theoretical support for the success of GNNs with skip connections in terms of optimization, and suggest that deep GNNs with skip connections would be promising in practice.
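To show what a linearized GNN layer with a skip connection can look like, here is a minimal sketch in plain Python. The abstract does not give the paper's exact formulation, so the update used here (aggregate neighbors, apply a weight matrix, add the input back) is an assumed common form, and all names are illustrative.

```python
from typing import List

Matrix = List[List[float]]


def matmul(a: Matrix, b: Matrix) -> Matrix:
    """Plain matrix product of a (n x m) and b (m x p)."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]


def linear_gnn_layer(adj: Matrix, h: Matrix, w: Matrix, skip: bool = True) -> Matrix:
    """One linear graph layer: adj @ h @ w, plus h when `skip` is set.

    There is no nonlinearity, which is what "linearized" means in this sketch.
    The skip connection requires w to be square so the shapes match.
    """
    out = matmul(matmul(adj, h), w)
    if skip:
        out = [[o + x for o, x in zip(out_row, h_row)]
               for out_row, h_row in zip(out, h)]
    return out


# Two connected nodes with self-loops, row-normalized adjacency (toy example).
adj = [[0.5, 0.5], [0.5, 0.5]]
h = [[1.0, 0.0], [0.0, 1.0]]   # one-hot node features
w = [[1.0, 0.0], [0.0, 1.0]]   # identity weights, to isolate the aggregation
print(linear_gnn_layer(adj, h, w, skip=False))
print(linear_gnn_layer(adj, h, w, skip=True))
```

Without the skip connection both nodes end up with identical features after aggregation. With it, each node keeps its own input on top of the aggregated signal.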