Automated vehicle acceptance (AVA) has mostly been measured subjectively through questionnaires and interviews, with a main focus on drivers inside automated vehicles (AVs). For AVs to be widely accepted by the public, acceptance by both drivers and passengers is key, and the in-vehicle experience of passengers will determine the extent to which passengers accept AVs. A comprehensive understanding of potential assessment methods for measuring the passenger experience in AVs is needed to improve that experience and thereby acceptance. The present work provides an overview of assessment methods used to measure a driver's behavior and cognitive and emotional states during (automated) driving. The review shows that these assessment methods can be classified by type of data-collection method (e.g., questionnaires, interviews, direct input devices, sensors), object of measurement (i.e., perception, behavior, state), time of measurement, and degree of objectivity of the data collected. A conceptual model synthesizes the results of the literature review, formulating relationships between the factors constituting the in-vehicle experience and AVA. It is theorized that the in-vehicle experience influences the intention to use, with intention to use serving as a predictor of actual use. The model also formulates relationships between actual use and well-being. A combined approach using both subjective and objective assessment methods is needed to provide more accurate estimates of AVA and to advance the uptake and use of AVs.
This paper presents a Bayesian optimization framework for the automatic tuning of shared controllers defined as a Model Predictive Control (MPC) problem. The proposed framework includes the design of performance metrics as well as the representation of user inputs for simulation-based optimization. The framework is applied to the optimization of a shared controller for an Image Guided Therapy robot. VR-based user experiments confirm that the automatically tuned MPC shared controller outperforms a hand-tuned baseline and generalizes well.
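The simulation-based tuning loop can be sketched, in heavily simplified form, as an upper-confidence-bound search over one controller weight. Everything here is illustrative: the kernel surrogate is a crude stand-in for a Gaussian process, and `metric` is a hypothetical scalar performance score, not the paper's actual framework.

```python
import math
import random

def surrogate(X, y, x, length_scale=0.5):
    """Kernel-regression mean plus a crude distance-based uncertainty."""
    weights = [math.exp(-((x - xi) / length_scale) ** 2) for xi in X]
    total = sum(weights)
    mean = sum(w * yi for w, yi in zip(weights, y)) / total
    sigma = math.exp(-max(weights))  # uncertainty shrinks near observed points
    return mean, sigma

def bayes_opt(objective, bounds, n_init=3, n_iter=10, kappa=2.0, seed=0):
    """Minimal Bayesian-optimization-style loop with a UCB acquisition."""
    rng = random.Random(seed)
    lo, hi = bounds
    X = [rng.uniform(lo, hi) for _ in range(n_init)]
    y = [objective(x) for x in X]
    candidates = [lo + (hi - lo) * i / 200 for i in range(201)]
    for _ in range(n_iter):
        def ucb(c):
            m, s = surrogate(X, y, c)
            return m + kappa * s
        best_x = max(candidates, key=ucb)  # explore where mean + kappa*sigma is largest
        X.append(best_x)
        y.append(objective(best_x))
    i = max(range(len(y)), key=y.__getitem__)
    return X[i], y[i]

# Hypothetical performance metric for one MPC weight (illustration only).
metric = lambda w: -(w - 0.7) ** 2
w_star, score = bayes_opt(metric, (0.0, 1.0))
```

In the actual framework, `objective` would be a simulation rollout of the shared controller against recorded user inputs rather than a closed-form function.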
Safe control with guarantees generally requires the system model to be known. It is far more challenging to handle systems with uncertain parameters. In this paper, we propose a generic algorithm that can synthesize and verify safe controllers for systems with constant, unknown parameters. In particular, we use robust-adaptive control barrier functions (raCBFs) to achieve safety. We develop new theories and techniques using sum-of-squares that enable us to pose synthesis and verification as a series of convex optimization problems. In our experiments, we show that our algorithms are general and scalable, applying them to three different polynomial systems of up to moderate size (7D). Our raCBFs are currently the most effective way to guarantee safety for uncertain systems, achieving 100% safety and up to 55% performance improvement over a robust baseline.
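The raCBF synthesis itself relies on sum-of-squares programming, but the barrier-function safety filter it builds on can be illustrated on a toy scalar system. This is a plain CBF, not the paper's robust-adaptive variant, and the system and names are hypothetical.

```python
def cbf_filter(x, u_nom, alpha=1.0):
    """Minimal 1D control-barrier-function safety filter.

    Toy system x' = u with barrier h(x) = x (safe set: x >= 0).
    The CBF condition h' >= -alpha * h(x) reduces to u >= -alpha * x,
    so the QP that stays closest to u_nom has this closed-form solution.
    """
    return max(u_nom, -alpha * x)

# Near the boundary the filter overrides an unsafe nominal command...
u1 = cbf_filter(x=0.1, u_nom=-5.0)
# ...while a command that already satisfies the condition passes through.
u2 = cbf_filter(x=2.0, u_nom=-1.0)
```

The robust-adaptive extension in the paper additionally tightens this condition to account for the unknown constant parameters.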
The study of time-inhomogeneous Markov jump processes is a traditional topic within probability theory that has recently attracted substantial attention in various applications. However, their flexibility also incurs a substantial mathematical burden which is usually circumvented by using well-known generic distributional approximations or simulations. This article provides a novel approximation method that tailors the dynamics of a time-homogeneous Markov jump process to meet those of its time-inhomogeneous counterpart on an increasingly fine Poisson grid. Strong convergence of the processes in terms of the Skorokhod $J_1$ metric is established, and convergence rates are provided. Under traditional regularity assumptions, distributional convergence to the same limit is established for unconditional proxies. Special attention is devoted to the case where the target process has one absorbing state, with the remaining states transient, for which the absorption times also converge. Some applications are outlined, such as univariate hazard-rate density estimation, ruin probabilities, and multivariate phase-type density evaluation.
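As a loose illustration of driving a time-inhomogeneous process from a Poisson grid, the sketch below simulates the absorption time of a two-state process (one transient, one absorbing state) by thinning a dominating homogeneous Poisson process. The rate function and all parameters are made up for illustration; this is not the paper's construction.

```python
import random

def simulate_absorption_time(rate, rate_max, horizon, seed=0):
    """Thinning: propose jump times from a homogeneous Poisson process
    with intensity rate_max, accept each with probability rate(t)/rate_max.
    Returns the absorption time, or None if not absorbed by the horizon."""
    rng = random.Random(seed)
    t = 0.0
    while t < horizon:
        t += rng.expovariate(rate_max)        # candidate jump on the Poisson grid
        if t < horizon and rng.random() < rate(t) / rate_max:
            return t                          # accepted: jump into the absorbing state
    return None

# Hypothetical increasing hazard rate, bounded by rate_max on [0, horizon].
lam = lambda t: 0.5 + 0.5 * t
T = simulate_absorption_time(lam, rate_max=2.0, horizon=3.0)
```

Note that `rate_max` must dominate `rate(t)` over the horizon for the acceptance probability to stay in [0, 1].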
Exploring the semantic context in scene images is essential for indoor scene recognition. However, due to the diverse intra-class spatial layouts and the coexisting inter-class objects, modeling contextual relationships that adapt to varied image characteristics is a great challenge. Existing contextual modeling methods for indoor scene recognition exhibit two limitations: 1) during training, space-independent information, such as color, may hinder optimizing the network's capacity to represent the spatial context; 2) these methods often overlook the differences in coexisting objects across different scenes, which suppresses scene recognition performance. To address these limitations, we propose SpaCoNet, which simultaneously models the Spatial relation and Co-occurrence of objects based on semantic segmentation. First, a semantic spatial relation module (SSRM) is designed to explore the spatial relations among objects within a scene. With the help of semantic segmentation, this module decouples the spatial information from the image, effectively avoiding the influence of irrelevant features. Second, both the spatial context features from the SSRM and the deep features from the Image Feature Extraction Module are used to distinguish coexisting objects across different scenes. Finally, using these discriminative features, we employ a self-attention mechanism to explore the long-range co-occurrence among objects and generate a semantic-guided feature representation for indoor scene recognition. Experimental results on three widely used scene datasets demonstrate the effectiveness and generality of the proposed method. The code will be made publicly available after the blind review process is completed.
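The long-range co-occurrence modeling rests on standard scaled dot-product self-attention over per-object feature vectors. The sketch below shows that mechanism in its barest form, with queries, keys, and values all equal and no learned projections; the toy features are invented for illustration and this is not SpaCoNet's implementation.

```python
import math

def self_attention(feats):
    """Scaled dot-product self-attention over a list of object feature
    vectors (queries = keys = values, no learned projection matrices)."""
    d = len(feats[0])
    out = []
    for q in feats:
        # Similarity of this object to every object, scaled by sqrt(d).
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(d) for k in feats]
        m = max(scores)                           # stabilized softmax
        w = [math.exp(s - m) for s in scores]
        z = sum(w)
        w = [x / z for x in w]
        # Each output is a convex combination of all object features.
        out.append([sum(wi * v[j] for wi, v in zip(w, feats)) for j in range(d)])
    return out

# Two toy "object" feature vectors (illustrative one-hot features).
out = self_attention([[1.0, 0.0], [0.0, 1.0]])
```

Each object thus aggregates information from every other object in the scene, which is what allows long-range co-occurrence to be captured.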
In the emerging field of mechanical metamaterials, periodic lattice structures are frequently used as a primary ingredient. However, aperiodic lattices offer unique advantages regarding failure, e.g., buckling or fracture, because avoiding repeated patterns prevents global failures; local failures occur instead, which can beneficially delay structural collapse. It is therefore expedient to develop models that efficiently compute the effective mechanical properties of lattices from general features, while addressing the challenge that the topologies (or graphs) involved have different sizes. In this paper, we develop a deep learning model to effectively predict energetically-equivalent mechanical properties of linear elastic lattices. Considering the lattice as a graph and defining material and geometrical features on it, we show that Graph Neural Networks provide more accurate predictions than a dense, fully connected strategy, thanks to the geometrically induced bias of the graph representation, which is closer to the underlying equilibrium laws of mechanics solved in the direct problem. Leveraging the efficient forward evaluation of a vast number of lattices with this surrogate enables the inverse problem, i.e., obtaining a structure with prescribed behavior, which is ultimately suitable for multiscale structural optimization problems.
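The core operation a GNN performs on such a lattice graph is message passing: each node updates its features from its neighbors along the struts. A dependency-free sketch of one mean-aggregation step is shown below; the feature names and the combination rule are illustrative placeholders, not the paper's architecture.

```python
def message_passing(node_feats, edges):
    """One mean-aggregation message-passing step on an undirected graph.
    node_feats: {node: [features]}, edges: list of (u, v) pairs."""
    neighbors = {n: [] for n in node_feats}
    for u, v in edges:
        neighbors[u].append(v)
        neighbors[v].append(u)
    updated = {}
    for n, feats in node_feats.items():
        # Gather neighbor features (fall back to self if isolated).
        msgs = [node_feats[m] for m in neighbors[n]] or [feats]
        agg = [sum(col) / len(msgs) for col in zip(*msgs)]
        # Combine self features with the aggregated neighbor message.
        updated[n] = [0.5 * (a + b) for a, b in zip(feats, agg)]
    return updated

# Toy 3-node lattice fragment; features = [stiffness, strut area] (made up).
feats = {0: [1.0, 0.2], 1: [2.0, 0.4], 2: [3.0, 0.6]}
out = message_passing(feats, edges=[(0, 1), (1, 2)])
```

In a learned GNN the averaging and combination steps are replaced by trainable layers, but the locality along graph edges is the inductive bias the abstract refers to.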
We systematically study a wide variety of generative models spanning semantically-diverse image datasets to understand and improve the feature extractors and metrics used to evaluate them. Using best practices in psychophysics, we measure human perception of image realism for generated samples by conducting the largest experiment evaluating generative models to date, and find that no existing metric strongly correlates with human evaluations. Comparing against 17 modern metrics for evaluating the overall performance, fidelity, diversity, rarity, and memorization of generative models, we find that the state-of-the-art perceptual realism of diffusion models as judged by humans is not reflected in commonly reported metrics such as FID. This discrepancy is not explained by diversity in generated samples, though one cause is over-reliance on Inception-V3. We address these flaws through a study of alternative self-supervised feature extractors, find that the semantic information encoded by individual networks strongly depends on their training procedure, and show that DINOv2-ViT-L/14 allows for much richer evaluation of generative models. Next, we investigate data memorization, and find that generative models do memorize training examples on simple, smaller datasets like CIFAR10, but not necessarily on more complex datasets like ImageNet. However, our experiments show that current metrics do not properly detect memorization: none in the literature is able to separate memorization from other phenomena such as underfitting or mode shrinkage. To facilitate further development of generative models and their evaluation we release all generated image datasets, human evaluation data, and a modular library to compute 17 common metrics for 9 different encoders at https://github.com/layer6ai-labs/dgm-eval.
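FID-style metrics fit Gaussians to real and generated feature statistics and compare them with the Fréchet distance, which is why the choice of encoder (Inception-V3 vs. DINOv2) changes the verdict. The sketch below shows the formula's structure for the diagonal-covariance case only; real FID uses full covariance matrices of deep features, so this is illustration, not an FID implementation.

```python
import math

def frechet_distance_diag(mu1, var1, mu2, var2):
    """Frechet distance between Gaussians with diagonal covariances:
    ||mu1 - mu2||^2 + sum_i (sqrt(var1_i) - sqrt(var2_i))^2."""
    mean_term = sum((a - b) ** 2 for a, b in zip(mu1, mu2))
    cov_term = sum((math.sqrt(a) - math.sqrt(b)) ** 2 for a, b in zip(var1, var2))
    return mean_term + cov_term

# Identical feature statistics give distance 0; shifted statistics do not.
d_same = frechet_distance_diag([0.0, 0.0], [1.0, 1.0], [0.0, 0.0], [1.0, 1.0])
d_diff = frechet_distance_diag([0.0, 0.0], [1.0, 1.0], [1.0, 0.0], [4.0, 1.0])
```

For full covariances the cross term becomes $2\,\mathrm{Tr}\!\left((\Sigma_1 \Sigma_2)^{1/2}\right)$, which requires a matrix square root.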
This contribution introduces a model order reduction approach for an advection-reaction problem with a parametrized reaction function. The underlying discretization uses an ultraweak formulation with an $L^2$-like trial space and an 'optimal' test space as introduced by Demkowicz et al. This ensures the stability of the discretization and in addition allows for a symmetric reformulation of the problem in terms of a dual solution which can also be interpreted as the normal equations of an adjoint least-squares problem. Classic model order reduction techniques can then be applied to the space of dual solutions which also immediately gives a reduced primal space. We show that the necessary computations do not require the reconstruction of any primal solutions and can instead be performed entirely on the space of dual solutions. We prove exponential convergence of the Kolmogorov $N$-width and show that a greedy algorithm produces quasi-optimal approximation spaces for both the primal and the dual solution space. Numerical experiments based on the benchmark problem of a catalytic filter confirm the applicability of the proposed method.
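The greedy construction of a reduced space can be sketched generically: repeatedly add the snapshot that the current basis approximates worst. The version below works on plain vectors with Gram-Schmidt orthonormalization; the ultraweak/dual-space machinery of the paper is not represented, so all names and data here are illustrative.

```python
import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def norm(u):
    return math.sqrt(dot(u, u))

def greedy_basis(snapshots, n_basis, tol=1e-12):
    """Greedy reduced-basis selection: at each step, orthonormalize and add
    the snapshot with the largest residual against the current basis."""
    basis = []
    for _ in range(n_basis):
        def residual(s):
            r = list(s)
            for b in basis:          # subtract projection onto each basis vector
                c = dot(r, b)
                r = [x - c * y for x, y in zip(r, b)]
            return r
        worst = max(snapshots, key=lambda s: norm(residual(s)))
        r = residual(worst)
        nr = norm(r)
        if nr < tol:                 # snapshots already well approximated
            break
        basis.append([x / nr for x in r])
    return basis

# Three snapshots spanning a 2D subspace of R^3 (illustrative).
S = [[1.0, 0.0, 0.0], [0.0, 2.0, 0.0], [1.0, 1.0, 0.0]]
B = greedy_basis(S, n_basis=3)
```

The quasi-optimality result in the abstract says that this kind of greedy selection nearly matches the Kolmogorov $N$-width of the solution set.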
The scaled boundary finite element method (SBFEM) has recently been employed as an efficient means to model three-dimensional structures, in particular when the geometry is provided as a voxel-based image. To this end, an octree decomposition of the computational domain is deployed and each cubic cell is treated as an SBFEM subdomain. The surfaces of each subdomain are discretized in the finite element sense. We improve on this idea by combining the semi-analytical concept of the SBFEM with certain transition elements on the subdomains' surfaces. Thus, we avoid the triangulation of surfaces employed in previous works and consequently reduce the number of surface elements and degrees of freedom. In addition, these discretizations allow coupling elements of arbitrary order such that local p-refinement can be achieved straightforwardly.
This paper describes two intelligibility prediction systems derived from a pretrained noise-robust automatic speech recognition (ASR) model for the second Clarity Prediction Challenge (CPC2). One system is intrusive and leverages the hidden representations of the ASR model. The other is non-intrusive and makes predictions from uncertainty measures derived from the ASR model. The ASR model is pretrained only on a simulated noisy speech corpus and does not take advantage of the CPC2 data. Its accurate prediction performance on the CPC2 evaluation therefore indicates that the intelligibility prediction systems are robust to unseen scenarios.
Recent advances in 3D fully convolutional networks (FCN) have made it feasible to produce dense voxel-wise predictions of volumetric images. In this work, we show that a multi-class 3D FCN trained on manually labeled CT scans of several anatomical structures (ranging from the large organs to thin vessels) can achieve competitive segmentation results, while avoiding the need for handcrafting features or training class-specific models. To this end, we propose a two-stage, coarse-to-fine approach that will first use a 3D FCN to roughly define a candidate region, which will then be used as input to a second 3D FCN. This reduces the number of voxels the second FCN has to classify to ~10% and allows it to focus on more detailed segmentation of the organs and vessels. We utilize training and validation sets consisting of 331 clinical CT images and test our models on a completely unseen data collection acquired at a different hospital that includes 150 CT scans, targeting three anatomical organs (liver, spleen, and pancreas). In challenging organs such as the pancreas, our cascaded approach improves the mean Dice score from 68.5% to 82.2%, achieving the highest reported average score on this dataset. We compare with a 2D FCN method on a separate dataset of 240 CT scans with 18 classes and achieve a significantly higher performance in small organs and vessels. Furthermore, we explore fine-tuning our models to different datasets. Our experiments illustrate the promise and robustness of current 3D FCN based semantic segmentation of medical images, achieving state-of-the-art results. Our code and trained models are available for download: https://github.com/holgerroth/3Dunet_abdomen_cascade.
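The Dice score reported above has a simple definition: twice the overlap of prediction and ground truth divided by their total size. A minimal sketch for flattened binary voxel masks (not the paper's evaluation code):

```python
def dice_score(pred, truth):
    """Dice coefficient between two binary voxel masks, given as
    flattened 0/1 lists: 2|A intersect B| / (|A| + |B|)."""
    inter = sum(p and t for p, t in zip(pred, truth))
    total = sum(pred) + sum(truth)
    # Convention: two empty masks are a perfect match.
    return 2.0 * inter / total if total else 1.0

# Toy 4-voxel example: 2 overlapping voxels out of 3 + 2 foreground voxels.
d = dice_score([1, 1, 0, 1], [1, 0, 0, 1])
```

In practice the score is computed per organ and averaged over scans, which is what the 68.5% to 82.2% pancreas improvement refers to.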