Trust calibration presents a major challenge in the interaction between drivers and automated vehicles (AVs). In order to calibrate trust, it is important to measure drivers' trust in real time. One possible method is to model its dynamic changes using machine learning models and physiological measures. In this paper, we propose a technique based on machine learning models to predict drivers' dynamic trust in conditional AVs using physiological measurements in real time. We conducted a study in a driving simulator in which participants were requested to take over control from automated driving under three conditions (a control condition, a false alarm condition, and a miss condition) with eight takeover requests (TORs) in different scenarios. Drivers' physiological measures were recorded during the experiment, including galvanic skin response (GSR), heart rate (HR) indices, and eye-tracking metrics. Among five machine learning models, we found that eXtreme Gradient Boosting (XGBoost) performed best and was able to predict drivers' trust in real time with an F1 score of 89.1%. Our findings offer implications for the design of an in-vehicle trust monitoring system that calibrates drivers' trust and facilitates real-time interaction between the driver and the AV.
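To make the modeling pipeline concrete, below is a minimal sketch of how such a trust classifier might be trained on physiological features; the feature names, input file, and binary trust labels are illustrative assumptions rather than the authors' actual pipeline.

```python
# Hypothetical sketch: gradient-boosted classification of a binary trust label
# from physiological features, in the spirit of the paper's XGBoost pipeline.
# Column names and the CSV file are illustrative assumptions only.
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.metrics import f1_score
from xgboost import XGBClassifier

df = pd.read_csv("physio_features.csv")  # assumed feature table
X = df[["gsr_mean", "gsr_peaks", "hr_mean", "hrv_rmssd",
        "fixation_duration", "pupil_diameter"]]
y = df["trust_label"]  # assumed: 0 = distrust, 1 = trust

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=42)

model = XGBClassifier(n_estimators=200, max_depth=4, learning_rate=0.1)
model.fit(X_train, y_train)

print("F1 score:", f1_score(y_test, model.predict(X_test)))
```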
The integration of Visual Inertial Odometry (VIO) methods into a modular control system designed for the deployment of Unmanned Aerial Vehicles (UAVs) and teams of cooperating UAVs in real-world conditions is presented in this paper. A reliability analysis and a fair performance comparison of several methods integrated into a control pipeline for achieving full autonomy in real conditions are provided. Although most VIO algorithms achieve excellent localization precision and negligible drift on artificially created datasets, their reliability in non-ideal situations, their robustness to degraded sensor data, and the effects of external disturbances and feedback-control coupling are not well studied. These imperfections, which are inherently present in real-world deployments of UAVs, negatively affect the ability of the most widely used VIO approaches to output a sensible pose estimate. We identify the conditions that are critical for reliable flight under VIO localization and propose workarounds and compensations for situations in which such conditions cannot be achieved. The performance of the UAV system with integrated VIO methods is quantitatively analyzed with respect to an RTK ground truth, and the ability to provide reliable pose estimation for feedback control is demonstrated onboard a UAV tracking dynamic trajectories under challenging illumination.
In machine learning, the selection of a promising model from a potentially large number of competing models and the assessment of its generalization performance are critical tasks that require careful consideration. Typically, model selection and evaluation are strictly separated endeavors: the sample at hand is split into a training, validation, and evaluation set, and only a single confidence interval is computed for the prediction performance of the final selected model. We, however, propose an algorithm for computing valid lower confidence bounds for multiple models that have been selected based on their prediction performance on the evaluation set, by interpreting the selection problem as a simultaneous inference problem. We use bootstrap tilting and a maxT-type multiplicity correction. The approach is universally applicable to any combination of prediction models, any model selection strategy, and any prediction performance measure that accepts weights. Various simulation experiments show that our proposed approach yields lower confidence bounds that are at least as good as bounds from standard approaches and that reliably reach the nominal coverage probability. In addition, especially when the sample size is small, our proposed approach yields better-performing prediction models than the default selection of only one model for evaluation.
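As a simplified illustration of a maxT-type construction, the sketch below computes simultaneous lower confidence bounds for the test-set accuracy of several models. It uses a basic nonparametric bootstrap in place of the paper's bootstrap tilting, and the data and parameters are assumptions.

```python
# Simplified sketch of maxT-type simultaneous lower confidence bounds for the
# evaluation-set accuracy of several models. Uses a plain nonparametric
# bootstrap rather than the paper's bootstrap tilting; illustrative only.
import numpy as np

rng = np.random.default_rng(0)

def simultaneous_lower_bounds(correct, alpha=0.05, n_boot=2000):
    """correct: (n_samples, n_models) 0/1 matrix of per-sample correctness."""
    n, m = correct.shape
    theta_hat = correct.mean(axis=0)
    se = correct.std(axis=0, ddof=1) / np.sqrt(n) + 1e-12
    max_t = np.empty(n_boot)
    for b in range(n_boot):
        idx = rng.integers(0, n, size=n)              # resample evaluation cases
        theta_b = correct[idx].mean(axis=0)
        max_t[b] = np.max((theta_b - theta_hat) / se)  # maxT statistic over models
    q = np.quantile(max_t, 1 - alpha)                  # joint critical value
    return theta_hat - q * se                          # bounds valid simultaneously

# Example: 3 models evaluated on the same 500 test cases (synthetic data).
correct = rng.integers(0, 2, size=(500, 3))
print(simultaneous_lower_bounds(correct))
```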
Driver stress is a major cause of car accidents and deaths worldwide. Furthermore, persistent stress is a health problem, contributing to hypertension and other diseases of the cardiovascular system. Stress has a measurable impact on heart and breathing rates, and stress levels can be inferred from such measurements. Galvanic skin response is a common test for measuring the perspiration caused by both physiological and psychological stress, as well as extreme emotions. In this paper, galvanic skin response is used to estimate ground-truth stress levels. A feature selection technique based on the minimal-redundancy-maximal-relevance (mRMR) method is then applied to multiple heart rate variability and breathing rate metrics to identify a novel and optimal combination for detecting stress. A support vector machine with a radial basis function kernel is then used with these features to reliably predict stress. The proposed method achieves a high level of accuracy on the target dataset.
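The following sketch loosely mirrors such a pipeline: a greedy mRMR-style selection (using feature correlation as a simple proxy for redundancy) followed by an RBF-kernel SVM. The synthetic data, number of selected features, and scoring setup are assumptions, not the paper's implementation.

```python
# Illustrative sketch: greedy mRMR-style feature selection followed by an
# RBF-kernel SVM. Correlation stands in as a simple redundancy proxy.
import numpy as np
from sklearn.feature_selection import mutual_info_classif
from sklearn.svm import SVC
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.model_selection import cross_val_score

def mrmr_select(X, y, k=5):
    relevance = mutual_info_classif(X, y, random_state=0)
    selected = [int(np.argmax(relevance))]
    while len(selected) < k:
        scores = []
        for j in range(X.shape[1]):
            if j in selected:
                scores.append(-np.inf)
                continue
            # redundancy: mean absolute correlation with already-selected features
            redundancy = np.mean([abs(np.corrcoef(X[:, j], X[:, s])[0, 1])
                                  for s in selected])
            scores.append(relevance[j] - redundancy)
        selected.append(int(np.argmax(scores)))
    return selected

# X: HRV and breathing-rate metrics; y: GSR-derived stress labels (synthetic here).
X, y = np.random.rand(200, 12), np.random.randint(0, 2, 200)
features = mrmr_select(X, y, k=5)
clf = make_pipeline(StandardScaler(), SVC(kernel="rbf"))
print(cross_val_score(clf, X[:, features], y, cv=5).mean())
```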
We consider the lossy quantum source coding problem, where the task is to compress a given quantum source below its von Neumann entropy. Inspired by the duality between the rate-distortion and channel coding problems in the classical setting, we propose a new formulation of the lossy quantum source coding problem. This formulation differs from the existing quantum rate-distortion theory in two aspects. Firstly, we require that the reconstruction of the compressed quantum source fulfill a global error constraint, as opposed to the sample-wise local error criterion used in the standard rate-distortion setting. Secondly, instead of a distortion observable, we employ the notion of a backward quantum channel, which we refer to as a "posterior reference map", to measure the reconstruction error. Using these, we characterize the asymptotic performance limit of the lossy quantum source coding problem in terms of the single-letter coherent information of the given posterior reference map. We demonstrate a protocol that encodes (at the specified rate) and decodes with the reconstruction satisfying the provided global error criterion, thereby achieving the asymptotic performance limit. The protocol is constructed by decomposing coherent information as a difference of two Holevo information quantities, inspired by prior works on quantum communication problems. To further support the findings, we develop analogous formulations for the quantum-classical and classical variants and express their asymptotic performance limits in terms of single-letter mutual information quantities with respect to appropriately defined channels analogous to posterior reference maps. We also provide various examples for the three formulations and shed light on their connection to the standard rate-distortion formulation wherever possible.
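For reference, and assuming standard notation, the coherent information appearing in such single-letter characterizations is typically defined as follows; this is the textbook definition, not the paper's specific posterior-reference-map expression.

```latex
% Coherent information of a channel \mathcal{N}_{A \to B} on input \rho_A,
% where |\psi\rangle_{RA} purifies \rho_A and S is the von Neumann entropy:
I_c(\rho, \mathcal{N})
  = S\bigl(\mathcal{N}(\rho_A)\bigr)
  - S\bigl((\mathrm{id}_R \otimes \mathcal{N})(\psi_{RA})\bigr),
\qquad
S(\sigma) = -\operatorname{Tr}\bigl[\sigma \log \sigma\bigr].
```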
Driven by ongoing improvements in machine learning, chatbots have increasingly grown from experimental interface prototypes into reliable and robust tools for process automation. Building on these advances, companies have identified various application scenarios in which the automated processing of human language can help foster task efficiency. Used in this way, chatbots may not only decrease costs but are also said to boost user satisfaction. People's intention to use and/or reuse such technology, however, often depends on less utilitarian factors. In particular, trust and task satisfaction count as relevant usage predictors. In this paper, we thus present work that aims to shed some light on these two constructs. We report on an experimental study ($n=277$) investigating four different human-chatbot interaction tasks. After each task, participants were asked to complete survey items on perceived trust, perceived task complexity, and perceived task satisfaction. Results show that task complexity negatively impacts both trust and satisfaction. Higher complexity was associated particularly with conversations that relied on broad, descriptive chatbot answers, while conversations that spanned several short steps were perceived as less complex, even when the overall conversation was ultimately longer.
As research in deep neural networks advances, deep convolutional networks have become promising for autonomous driving tasks. In particular, there is an emerging trend of employing end-to-end neural network models for autonomous driving. However, while previous research has shown that deep neural network classifiers are vulnerable to adversarial attacks, the effect of adversarial attacks on regression tasks is not as well understood. In this research, we devise two white-box targeted attacks against end-to-end autonomous driving models. The driving system uses a regression model that takes an image as input and outputs the steering angle. Our attacks manipulate the behavior of the autonomous driving system by perturbing the input image. Both attacks can be initiated in real time on CPUs without employing GPUs. The effectiveness of the attacks is demonstrated in experiments conducted in the Udacity Simulator. Demo video: //youtu.be/I0i8uN2oOP0.
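A minimal sketch of one way such a targeted attack could work, assuming a differentiable PyTorch regression model: gradient steps on a bounded perturbation push the predicted steering angle toward an attacker-chosen target. The model interface, step sizes, and perturbation bound are assumptions, not the paper's exact attacks.

```python
# Hypothetical white-box targeted attack on a steering-angle regression model:
# iteratively perturb the input image so the prediction approaches a chosen
# target angle, keeping the perturbation within an L-infinity bound.
import torch

def targeted_attack(model, image, target_angle, eps=0.03, steps=10, lr=0.005):
    """Return a perturbed copy of `image` that steers model output toward target_angle."""
    delta = torch.zeros_like(image, requires_grad=True)
    for _ in range(steps):
        pred = model(image + delta)
        loss = ((pred - target_angle) ** 2).mean()  # drive prediction toward target
        loss.backward()
        with torch.no_grad():
            delta -= lr * delta.grad.sign()         # signed-gradient step
            delta.clamp_(-eps, eps)                 # keep perturbation imperceptible
        delta.grad.zero_()
    return (image + delta).detach()
```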
Regression models that ignore measurement error in predictors may produce highly biased estimates, leading to erroneous inferences. It is well known that taking measurement error into account is extremely difficult even in Gaussian nonparametric regression, and the problem becomes considerably harder for other families, such as logistic, Poisson, and negative-binomial regression. For the first time, we present a method that corrects for measurement error when flexibly estimating regression functions, covering virtually all distributions and link functions regularly considered in generalized linear models. The approach rests on approximating the first two moments of the response after integrating out the true unobserved predictors in a semiparametric generalized linear model. Unlike previous methods, our method is not restricted to truncated splines and can utilize various basis functions. Through extensive simulation studies, we examine the performance of our method under many scenarios.
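As a minimal illustration of the problem being corrected, rather than the proposed semiparametric method itself, the simulation below shows how a naive logistic regression that ignores measurement error in a predictor attenuates the estimated coefficient; all parameters are illustrative.

```python
# Minimal simulation of the attenuation bias that motivates such corrections:
# naive logistic regression on an error-contaminated predictor underestimates
# the true coefficient. Illustrates the problem, not the paper's method.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(1)
n, beta = 5000, 2.0
x = rng.normal(size=n)                     # true, unobserved predictor
w = x + rng.normal(scale=0.8, size=n)      # observed, error-contaminated surrogate
y = rng.binomial(1, 1 / (1 + np.exp(-beta * x)))

for name, pred in [("true x", x), ("noisy w", w)]:
    fit = LogisticRegression(C=1e6).fit(pred.reshape(-1, 1), y)  # ~unpenalized
    print(name, "estimated beta:", round(float(fit.coef_[0, 0]), 2))
```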
Trust has emerged as a key factor in people's interactions with AI-infused systems. Yet little is known about which models of trust have been used and for what systems: robots, virtual characters, smart vehicles, decision aids, or others. Moreover, there is as yet no standard approach to measuring trust in AI. This scoping review maps out the state of affairs on trust in human-AI interaction (HAII) from the perspectives of models, measures, and methods. Findings suggest that trust is an important and multi-faceted topic of study within HAII contexts. However, most work is under-theorized and under-reported, generally not drawing on established trust models and missing details about methods, especially Wizard of Oz setups. We offer several targets for systematic review work as well as a research agenda for combining the strengths and addressing the weaknesses of the current literature.
Estimating human pose and shape from monocular images is a long-standing problem in computer vision. Since the release of statistical body models, 3D human mesh recovery has been drawing broader attention. With the same goal of obtaining well-aligned and physically plausible mesh results, two paradigms have been developed to overcome challenges in the 2D-to-3D lifting process: i) an optimization-based paradigm, where different data terms and regularization terms are exploited as optimization objectives; and ii) a regression-based paradigm, where deep learning techniques are embraced to solve the problem in an end-to-end fashion. Meanwhile, continuous efforts are devoted to improving the quality of 3D mesh labels for a wide range of datasets. Though remarkable progress has been achieved in the past decade, the task is still challenging due to flexible body motions, diverse appearances, complex environments, and insufficient in-the-wild annotations. To the best of our knowledge, this is the first survey to focus on the task of monocular 3D human mesh recovery. We start with an introduction to body models and then elaborate on recovery frameworks and training objectives, providing in-depth analyses of their strengths and weaknesses. We also summarize datasets, evaluation metrics, and benchmark results. Open issues and future directions are discussed at the end, with the hope of motivating researchers and facilitating their research in this area. A regularly updated project page can be found at //github.com/tinatiansjz/hmr-survey.
Autonomous driving is regarded as one of the most promising remedies for shielding human beings from severe crashes. To this end, 3D object detection serves as the core basis of such a perception system, especially for path planning, motion prediction, and collision avoidance. Generally, stereo or monocular images with corresponding 3D point clouds are already a standard layout for 3D object detection, among which point clouds are increasingly prevalent as accurate depth information becomes available. Despite existing efforts, 3D object detection on point clouds is still in its infancy, owing to the inherent sparseness and irregularity of point clouds, the misalignment between the camera view and the LiDAR bird's-eye view that complicates modality synergy, occlusions and scale variations at long distances, and other factors. Recently, profound progress has been made in 3D object detection, with a large body of literature investigating this vision task. As such, we present a comprehensive review of the latest progress in this field, covering all the main topics including sensors, fundamentals, and the recent state-of-the-art detection methods with their pros and cons. Furthermore, we introduce metrics and provide quantitative comparisons on popular public datasets. Avenues for future work are judiciously identified after an in-depth analysis of the surveyed works. Finally, we conclude the paper.