Technological advancements focus on developing comfortable and acceptable driving characteristics in autonomous vehicles. Present driving functions predominantly rely on predefined parameters, and there is no universally accepted driving style for autonomous vehicles. While driving may be technically safe and the likelihood of road accidents is reduced, passengers may still feel insecure due to a mismatch between their own driving style and that of the autonomous system. Incorporating driving style preferences into automated vehicles enhances acceptance, reduces uncertainty, and offers the opportunity to expedite their adoption. Despite the increased research focus on driving styles, comprehensive studies are still needed on how variations in the driving context affect the assessment of automated driving functions. Therefore, this work evaluates lateral driving style preferences for autonomous vehicles on rural roads under different weather and traffic situations. A controlled study was conducted with a diverse group of German participants utilizing a high-fidelity driving simulator. The subjects experienced four different driving styles, including one that mimicked their own driving behavior, under two weather conditions. Statistical analyses of participants' responses during and after the drives revealed a notable preference for a more passive driving style. This study could not confirm the hypothesis that subjects prefer to be driven in a manner that mimics their own driving behavior. Furthermore, the study illustrated that weather conditions and oncoming traffic substantially influence perceived comfort during autonomous rides. The gathered dataset is openly accessible at //www.kaggle.com/datasets/jhaselberger/idcld-subject-study-on-driving-style-preferences.
LiDAR sensors play a crucial role in various applications, especially in autonomous driving. Current research primarily focuses on optimizing perceptual models with point cloud data as input, while the exploration of deeper cognitive intelligence remains relatively limited. To address this challenge, parallel LiDARs have emerged as a novel theoretical framework for the next-generation intelligent LiDAR systems, which tightly integrate physical, digital, and social systems. To endow LiDAR systems with cognitive capabilities, we introduce the 3D visual grounding task into parallel LiDARs and present a novel human-computer interaction paradigm for LiDAR systems. We propose Talk2LiDAR, a large-scale benchmark dataset tailored for 3D visual grounding in autonomous driving. Additionally, we present a two-stage baseline approach and an efficient one-stage method named BEVGrounding, which significantly improves grounding accuracy by fusing coarse-grained sentence and fine-grained word embeddings with visual features. Our experiments on Talk2Car-3D and Talk2LiDAR datasets demonstrate the superior performance of BEVGrounding, laying a foundation for further research in this domain.
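As a rough illustration of the coarse-to-fine fusion described above, the sketch below blends a sentence-level embedding and word-level embeddings into flattened BEV visual features via cross-attention. The module name, dimensions, and two-stage attention design are assumptions of this sketch, not the BEVGrounding architecture itself.

```python
# Illustrative sketch only: fuse coarse sentence and fine word embeddings
# with BEV visual features, in the spirit of the BEVGrounding description.
import torch
import torch.nn as nn

class CoarseFineFusion(nn.Module):
    def __init__(self, dim=256, n_heads=8):
        super().__init__()
        self.sent_attn = nn.MultiheadAttention(dim, n_heads, batch_first=True)
        self.word_attn = nn.MultiheadAttention(dim, n_heads, batch_first=True)

    def forward(self, bev_feats, sent_emb, word_embs):
        # bev_feats: (B, H*W, dim) flattened BEV visual features
        # sent_emb:  (B, 1, dim)   coarse-grained sentence embedding
        # word_embs: (B, L, dim)   fine-grained word embeddings
        x, _ = self.sent_attn(bev_feats, sent_emb, sent_emb)  # sentence-level grounding
        x, _ = self.word_attn(x, word_embs, word_embs)        # word-level refinement
        return x  # language-conditioned BEV features for a grounding head
```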
We propose a novel application of image analysis methods in the context of direction-dependent signal characteristics. For this purpose, we describe an inpainting approach that adds value to technical signal information, which is typically used only for monitoring and control purposes in ground station operations. Recalling the theoretical properties of the employed inpainting technique, together with appropriate modeling, allows us to demonstrate the usefulness of our approach for satellite data reception quality assessment. In our application, we show the advantages of inpainting products over raw data, as well as the rich potential of visualizing technical signal information.
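To make the general idea concrete, the sketch below treats direction-dependent signal measurements as a sparsely sampled image whose unobserved directions are filled by inpainting. The use of OpenCV's Telea inpainting, the toy azimuth/elevation grid, and the synthetic measurements are illustrative assumptions; the paper's specific inpainting technique may differ.

```python
# Minimal sketch: inpaint a sparsely observed direction-dependent signal map.
import numpy as np
import cv2

# Hypothetical sparse grid: rows = elevation bins, cols = azimuth bins.
signal = np.full((90, 360), np.nan, dtype=np.float32)
rng = np.random.default_rng(0)
ei = rng.integers(0, 90, 2000)
ai = rng.integers(0, 360, 2000)
signal[ei, ai] = 50.0 + 10.0 * np.sin(np.radians(ai))  # toy measurements

mask = np.isnan(signal).astype(np.uint8)  # 1 where no measurement exists
norm = cv2.normalize(np.nan_to_num(signal), None, 0, 255,
                     cv2.NORM_MINMAX).astype(np.uint8)
filled = cv2.inpaint(norm, mask, 3, cv2.INPAINT_TELEA)
# 'filled' is a dense map suitable for visual reception-quality assessment.
```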
Language models (LMs) have long been used to improve results of automatic speech recognition (ASR) systems, but they are unaware of the errors that ASR systems make. Error correction models are designed to fix ASR errors; however, they have shown little improvement over traditional LMs, mainly due to the lack of supervised training data. In this paper, we present Denoising LM (DLM), a $\textit{scaled}$ error correction model trained with vast amounts of synthetic data, significantly exceeding prior attempts while achieving new state-of-the-art ASR performance. We use text-to-speech (TTS) systems to synthesize audio, which is fed into an ASR system to produce noisy hypotheses, which are then paired with the original texts to train the DLM. DLM has several $\textit{key ingredients}$: (i) up-scaled model and data; (ii) usage of multi-speaker TTS systems; (iii) combination of multiple noise augmentation strategies; and (iv) new decoding techniques. With a Transformer-CTC ASR, DLM achieves 1.5% word error rate (WER) on $\textit{test-clean}$ and 3.3% WER on $\textit{test-other}$ on Librispeech, which to our knowledge are the best reported numbers in the setting where no external audio data are used, and even match self-supervised methods which use external audio data. Furthermore, a single DLM is applicable to different ASRs and greatly surpasses the performance of conventional LM-based beam-search rescoring. These results indicate that properly investigated error correction models have the potential to replace conventional LMs, holding the key to a new level of accuracy in ASR systems.
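The synthetic-data pipeline described above lends itself to a short sketch: synthesize audio from clean text with TTS, transcribe it with the ASR system to obtain a noisy hypothesis, and pair the two as a training example. The `tts`, `asr`, and pairing function below are placeholders for illustration, not a real API.

```python
# Minimal sketch of the DLM training-data pipeline: TTS -> ASR -> (noisy, clean) pairs.
from typing import Callable, Iterable

def build_dlm_pairs(texts: Iterable[str],
                    tts: Callable[[str], bytes],
                    asr: Callable[[bytes], str]) -> list[tuple[str, str]]:
    pairs = []
    for text in texts:
        audio = tts(text)                 # multi-speaker TTS synthesis
        hypothesis = asr(audio)           # noisy ASR hypothesis
        pairs.append((hypothesis, text))  # (corrupted input, clean target)
    return pairs

# The DLM is then trained as a sequence-to-sequence denoiser mapping
# each noisy hypothesis back to its original text.
```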
We consider the problem of model selection in a high-dimensional sparse linear regression model under privacy constraints. We propose a differentially private best subset selection method with strong utility properties by adopting the well-known exponential mechanism for selecting the best model. We propose an efficient Metropolis-Hastings algorithm and establish that it enjoys polynomial mixing time to its stationary distribution. Furthermore, we establish approximate differential privacy for the estimates of the mixed Metropolis-Hastings chain. Finally, we perform illustrative experiments that demonstrate the strong utility of our algorithm.
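A minimal sketch of the underlying idea follows: a Metropolis-Hastings chain over sparse supports whose stationary distribution is an exponential-mechanism-style target $p(S) \propto \exp(-\mathrm{RSS}(S)/(2\tau))$. The swap proposal, residual-sum-of-squares score, and temperature are illustrative choices, not the paper's exact mechanism or its calibrated privacy parameters.

```python
# Illustrative MH sampler over size-k supports for best subset selection.
import numpy as np

def rss(X, y, support):
    Xs = X[:, support]
    beta, *_ = np.linalg.lstsq(Xs, y, rcond=None)
    r = y - Xs @ beta
    return float(r @ r)

def mh_subset_select(X, y, k, n_iter=2000, temperature=1.0, seed=0):
    rng = np.random.default_rng(seed)
    p = X.shape[1]
    support = list(rng.choice(p, size=k, replace=False))
    cur = rss(X, y, support)
    for _ in range(n_iter):
        out = rng.integers(k)  # swap one variable out of the support...
        cand = int(rng.choice([j for j in range(p) if j not in support]))
        prop = support.copy()
        prop[out] = cand       # ...and one variable in (symmetric proposal)
        new = rss(X, y, prop)
        if np.log(rng.random()) < (cur - new) / (2 * temperature):
            support, cur = prop, new  # MH accept/reject step
    return sorted(support)
```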
Deep generative models have shown tremendous capability in data density estimation and data generation from finite samples. While these models have shown impressive performance by learning correlations among features in the data, some fundamental shortcomings are their lack of explainability, tendency to induce spurious correlations, and poor out-of-distribution extrapolation. To remedy such challenges, recent work has proposed a shift toward causal generative models. Causal models offer several beneficial properties to deep generative models, such as distribution shift robustness, fairness, and interpretability. Structural causal models (SCMs) describe data-generating processes and model complex causal relationships and mechanisms among variables in a system. Thus, SCMs can naturally be combined with deep generative models. We provide a technical survey on causal generative modeling categorized into causal representation learning and controllable counterfactual generation methods. We focus on fundamental theory, methodology, drawbacks, datasets, and metrics. Then, we cover applications of causal generative models in fairness, privacy, out-of-distribution generalization, precision medicine, and biological sciences. Lastly, we discuss open problems and fruitful research directions for future work in the field.
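Since SCMs are central to the survey above, a minimal sketch may help: each variable is generated from its causal parents plus exogenous noise, and an intervention replaces one structural assignment while leaving the other mechanisms intact. The three-variable chain and noise choices are illustrative only.

```python
# Toy structural causal model (SCM) as a data-generating process.
import numpy as np

rng = np.random.default_rng(0)
n = 1000
# Structural assignments: each variable is a function of its parents + noise.
treatment = rng.normal(size=n)                              # X := U_x
mediator = 2.0 * treatment + rng.normal(size=n)             # M := 2X + U_m
outcome = 0.5 * mediator - treatment + rng.normal(size=n)   # Y := 0.5M - X + U_y

# The intervention do(X = 1) replaces X's assignment but keeps the other
# mechanisms unchanged -- the property controllable counterfactual
# generation methods exploit.
treatment_do = np.ones(n)
mediator_do = 2.0 * treatment_do + rng.normal(size=n)
outcome_do = 0.5 * mediator_do - treatment_do + rng.normal(size=n)
```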
Recent innovations in digital technology offer significant opportunities for older adults to engage in meaningful activities. To investigate older adults' perceptions of using existing and emerging technologies for meaningful activities, we conducted three participatory design workshops and follow-up interviews with adults aged over 65. The workshops encompassed discussions on existing technologies for meaningful activities, demonstrations of emerging technologies such as VR, AR, and AI, and design activities including prototyping and storyboarding. Our findings show that while participants had diverse interpretations of meaningful activities, they sought to use technologies to support continuity in the pursuit of these activities. Specifically, participants highlighted the importance of safe aging at home, which provides a pathway for meaningful activities in later life. We further discuss participants' discerning attitudes when assessing the use of different technologies for meaningful activities and several values and attributes they desire when envisioning future technologies, including simplicity, positivity, proactivity, and integration.
The tremendous recent advances in generative artificial intelligence techniques have led to significant successes and promise in a wide range of applications, from conversational agents and textual content generation to voice and visual synthesis. Amid the rise in generative AI and its increasingly widespread adoption, there has been growing concern over its use for malicious purposes. In the realm of visual content synthesis using generative AI, key areas of significant concern have been image forgery (e.g., generation of images containing or derived from copyrighted content) and data poisoning (i.e., generation of adversarially contaminated images). Motivated to address these key concerns and encourage responsible generative AI, we introduce the DeepfakeArt Challenge, a large-scale challenge benchmark dataset designed specifically to aid in building machine learning algorithms for generative AI art forgery and data poisoning detection. Comprising over 32,000 records across a variety of generative forgery and data poisoning techniques, each entry consists of a pair of images that are either forgeries / adversarially contaminated or not. Each of the generated images in the DeepfakeArt Challenge benchmark dataset \footnote{The link to the dataset: //anon\_for\_review.com} has been comprehensively quality checked.
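For illustration, one plausible in-memory representation of such a paired record is sketched below; the field names are assumptions for this sketch, not the dataset's actual schema.

```python
# Hypothetical representation of a DeepfakeArt-style paired record.
from dataclasses import dataclass

@dataclass
class PairRecord:
    source_path: str     # original image
    candidate_path: str  # possibly forged / adversarially contaminated image
    is_positive: bool    # True if the candidate derives from the source
    method: str          # generative forgery or data poisoning technique used
```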
Autonomous vehicles necessitate a delicate balance between safety, efficiency, and user preferences in trajectory planning. Existing traditional or learning-based methods face challenges in adequately addressing all these aspects. In response, this paper proposes a novel component termed the Logical Guidance Layer (LGL), designed for seamless integration into autonomous driving trajectory planning frameworks and specifically tailored for highway scenarios. The LGL guides the trajectory planning with a local target area determined through scenario reasoning, scenario evaluation, and guidance area calculation. By integrating the Responsibility-Sensitive Safety (RSS) model, the LGL ensures formal safety guarantees while accommodating various user preferences defined by logical formulae. Experimental validation demonstrates the effectiveness of the LGL in balancing safety and efficiency while meeting user preferences in autonomous highway driving scenarios.
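As a concrete example of the kind of formal check the RSS model provides, the sketch below computes the standard RSS minimum safe longitudinal following distance (Shalev-Shwartz et al.). The parameter values are illustrative; how the LGL integrates this check is specific to the paper.

```python
# RSS minimum safe longitudinal distance (illustrative parameter values).
def rss_safe_distance(v_rear, v_front, rho=0.5,
                      a_accel=3.0, b_min=4.0, b_max=8.0):
    """Distance the rear vehicle must keep so it can always stop in time,
    assuming it may accelerate at a_accel during reaction time rho, then
    brakes at b_min, while the front vehicle brakes at up to b_max (SI units)."""
    v_react = v_rear + rho * a_accel
    d = (v_rear * rho
         + 0.5 * a_accel * rho ** 2
         + v_react ** 2 / (2 * b_min)
         - v_front ** 2 / (2 * b_max))
    return max(d, 0.0)

# e.g., following at 25 m/s behind a vehicle doing 20 m/s:
print(rss_safe_distance(25.0, 20.0))
```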
Recent advancements in event argument extraction (EAE) involve incorporating beneficial auxiliary information into models during training and inference, such as retrieved instances and event templates. Additionally, some studies introduce learnable prefix vectors to models. These methods face three challenges: (1) insufficient utilization of relevant event instances due to deficiencies in retrieval; (2) neglect of important information provided by relevant event templates; (3) the advantages of prefixes are constrained due to their inability to meet the specific informational needs of EAE. In this work, we propose DEGAP, which addresses the above challenges through two simple yet effective components: (1) dual prefixes, where the instance-oriented prefix and template-oriented prefix are trained to learn information from different event instances and templates, respectively, and then provide relevant information as cues to the EAE model without retrieval; (2) an event-guided adaptive gating mechanism, which guides the prefixes based on the target event to fully leverage their advantages. Extensive experiments demonstrate that our method achieves new state-of-the-art performance on four datasets (ACE05, RAMS, WIKIEVENTS, and MLEE). Further analysis verifies the importance of the proposed design and the effectiveness of the main components.
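A rough sketch of the event-guided gating idea follows: gates computed from the target-event representation blend an instance-oriented prefix and a template-oriented prefix. The dimensions and the sigmoid-gate design are assumptions of this sketch, not DEGAP's exact architecture.

```python
# Illustrative event-guided adaptive gating over dual learnable prefixes.
import torch
import torch.nn as nn

class EventGuidedGate(nn.Module):
    def __init__(self, dim=768, prefix_len=16):
        super().__init__()
        self.instance_prefix = nn.Parameter(torch.randn(prefix_len, dim))
        self.template_prefix = nn.Parameter(torch.randn(prefix_len, dim))
        self.gate = nn.Linear(dim, prefix_len)

    def forward(self, event_repr):
        # event_repr: (B, dim) embedding of the target event
        g = torch.sigmoid(self.gate(event_repr)).unsqueeze(-1)  # (B, prefix_len, 1)
        # Per-position blend of instance and template information.
        return g * self.instance_prefix + (1 - g) * self.template_prefix
```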
We present VeriX, a first step towards verified explainability of machine learning models in safety-critical applications. Specifically, our sound and optimal explanations can guarantee prediction invariance against bounded perturbations. We utilise constraint solving techniques together with feature sensitivity ranking to efficiently compute these explanations. We evaluate our approach on image recognition benchmarks and a real-world scenario of autonomous aircraft taxiing.
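To convey the flavor of combining sensitivity ranking with constraint solving, the sketch below ranks features by output sensitivity and keeps a feature in the explanation only when the verifier cannot prove the prediction invariant with that feature left free to vary. The `verifier_proves_invariance` callable stands in for a real constraint solver and is an assumption of this sketch, as is the scalar-output model.

```python
# Illustrative VeriX-style explanation loop for a 1-D input x.
import numpy as np

def verix_like_explanation(x, model, verifier_proves_invariance, eps=0.05):
    # Sensitivity ranking: perturb each feature and measure the output change.
    base = model(x)
    sens = [abs(model(np.where(np.arange(x.size) == i, x + eps, x)) - base)
            for i in range(x.size)]
    order = np.argsort(sens)  # traverse least-sensitive features first
    free, explanation = [], []
    for i in order:
        if verifier_proves_invariance(x, free + [i], eps):
            free.append(i)           # prediction provably stable; i is irrelevant
        else:
            explanation.append(i)    # i is needed to guarantee the prediction
    return explanation
```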