This paper considers mixed traffic consisting of connected automated vehicles equipped with vehicle-to-everything (V2X) connectivity and human-driven vehicles. A control strategy is proposed for communicating pairs of connected automated vehicles, where the two vehicles regulate their longitudinal motion by responding to each other, and, at the same time, stabilize the human-driven traffic between them. Stability analysis is conducted to find stabilizing controllers, and simulations are used to show the efficacy of the proposed approach. The impact of the penetration of connectivity and automation on the string stability of traffic is quantified. It is shown that, even with moderate penetration, connected automated vehicle pairs executing the proposed controllers achieve significant benefits compared to when these vehicles are disconnected and controlled independently.
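The notion of string stability quantified in the abstract above can be illustrated with a minimal sketch. It assumes a generic linearized car-following model for a single follower, dv/dt = alpha*(f_star*h - v) + beta*(v_lead - v) with dh/dt = v_lead - v, which is not taken from the paper. The gains `alpha` and `beta`, the range-policy slope `f_star`, and both function names are hypothetical. The follower is treated as string stable when the magnitude of its velocity transfer function does not exceed 1 at any tested frequency.

```python
import numpy as np

def velocity_gain(alpha, beta, f_star, omegas):
    """Return |T(i*omega)| for the assumed linearized follower model.

    T(s) = (beta*s + alpha*f_star) / (s^2 + (alpha + beta)*s + alpha*f_star)
    maps leader velocity perturbations to follower velocity perturbations.
    """
    s = 1j * np.asarray(omegas, dtype=float)
    T = (beta * s + alpha * f_star) / (s**2 + (alpha + beta) * s + alpha * f_star)
    return np.abs(T)

def is_string_stable(alpha, beta, f_star, omegas=None, tol=1e-9):
    """Numerically check that perturbations are not amplified on a frequency grid."""
    if omegas is None:
        omegas = np.linspace(1e-3, 10.0, 5000)  # rad/s, assumed range of interest
    return bool(np.max(velocity_gain(alpha, beta, f_star, omegas)) <= 1.0 + tol)

# Responding to the leader's velocity (beta > 0) removes the amplification
# that appears for the same alpha and f_star when beta = 0.
print(is_string_stable(0.4, 0.5, 0.6), is_string_stable(0.4, 0.0, 0.6))
```

For this assumed model the grid check agrees with the closed-form condition alpha + 2*beta >= 2*f_star. The controllers and stability charts of the paper itself involve communicating vehicle pairs and are not reproduced here.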
Remote operation of vehicles makes it possible to use the benefits of vehicle automation when fully automated driving is not yet possible. A human operator ensures safety and availability from afar and supports the vehicle automation when its capabilities are exceeded. The remote operator thus fulfills the legal requirements in Germany as a Technical Supervisor to operate highly automated vehicles at SAE Level 4. To integrate the remote operator into the automated driving system, a novel user-centered human-machine interface (HMI) for remote assistance workplaces was developed and initially evaluated. The insights gained in this process were incorporated into the design of a workplace prototype for remote assistance. In the study reported here, this prototype was tested by 34 participants who met the professional background criteria for the role of Technical Supervisor under German law, using typical remote assistance scenarios created in a simulation environment. Even under elevated cognitive load induced by simultaneously engaging in a secondary task, participants were able to obtain sufficient situation awareness and quickly resolve the scenarios. The HMI also yielded favorable usability and acceptance ratings. The results of the study inform the iterative workplace development and further research on the remote assistance of highly automated vehicles.
In many applications, piecewise continuous functions are interpolated over meshes. However, accurate high-order manipulations of such functions can be challenging due to potential spurious oscillations known as the Gibbs phenomenon. To address this challenge, we propose a novel approach, Robust Discontinuity Indicators (RDI), which can efficiently and reliably detect both C^{0} and C^{1} discontinuities for node-based and cell-averaged values. We present a detailed analysis focusing on its derivation and the dual-thresholding strategy. A key advantage of RDI is its ability to handle potential inaccuracies associated with detecting discontinuities on non-uniform meshes, thanks to its innovative discontinuity indicators. We also extend the applicability of RDI to handle general surfaces with boundaries, features, and ridge points, thereby enhancing its versatility and usefulness in various scenarios. To demonstrate the robustness of RDI, we conduct a series of experiments on non-uniform meshes and general surfaces, and compare its performance with some alternative methods. By addressing the challenges posed by the Gibbs phenomenon and providing reliable detection of discontinuities, RDI opens up possibilities for improved approximation and analysis of piecewise continuous functions, such as in data remap.
We present novel cross-sectional and longitudinal claim count models for vehicle insurance built upon the Combined Actuarial Neural Network (CANN) framework proposed by Mario Wüthrich and Michael Merz. The CANN approach combines a classical actuarial model, such as a generalized linear model, with a neural network. This blending of models results in a two-component model comprising a classical regression model and a neural network part. The CANN model leverages the strengths of both components, providing a solid foundation and interpretability from the classical model while harnessing the flexibility and capacity to capture intricate relationships and interactions offered by the neural network. In our proposed models, we use well-known log-linear claim count regression models for the classical regression part and a multilayer perceptron (MLP) for the neural network part. The MLP part is used to process telematics car driving data given as a vector characterizing the driving behavior of each insured driver. In addition to the Poisson and negative binomial distributions for cross-sectional data, we propose a procedure for training our CANN model with a multivariate negative binomial (MVNB) specification. By doing so, we introduce a longitudinal model that accounts for the dependence between contracts from the same insured. Our results reveal that the CANN models exhibit superior performance compared to log-linear models that rely on manually engineered telematics features.
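The two-component structure described in the abstract above can be sketched as follows for the cross-sectional Poisson case. This is a minimal illustration under stated assumptions, not the authors' implementation: the additive combination on the log scale, the single tanh hidden layer, the exposure offset, and all names (`cann_poisson_mean`, `poisson_nll`, `W1`, `w2`, and so on) are assumptions made for the example.

```python
import numpy as np

def cann_poisson_mean(x, z, exposure, beta0, beta, W1, b1, w2, b2):
    """Expected claim count of an assumed CANN-style model.

    x: (n, p) classical covariates, z: (n, q) telematics vectors,
    exposure: (n,) contract durations. The log-mean is the sum of a
    log-linear regression term and an MLP term.
    """
    glm_part = beta0 + x @ beta            # classical log-linear component
    hidden = np.tanh(z @ W1 + b1)          # one hidden layer on the telematics vector
    nn_part = hidden @ w2 + b2             # scalar neural network adjustment
    return exposure * np.exp(glm_part + nn_part)

def poisson_nll(y, mu):
    """Poisson negative log-likelihood, dropping the constant log(y!) term."""
    return float(np.sum(mu - y * np.log(mu)))

rng = np.random.default_rng(0)
x, z = rng.normal(size=(5, 3)), rng.normal(size=(5, 4))
exposure = np.full(5, 0.5)
beta0, beta = -2.0, 0.1 * rng.normal(size=3)
W1, b1 = rng.normal(size=(4, 6)), np.zeros(6)
w2, b2 = np.zeros(6), 0.0                  # zero output layer: model reduces to the GLM
mu = cann_poisson_mean(x, z, exposure, beta0, beta, W1, b1, w2, b2)
```

With the output layer set to zero, the predicted means coincide with those of the log-linear model alone, so training the network part can be read as learning a correction to the classical model. The negative binomial and MVNB specifications mentioned in the abstract would replace `poisson_nll` with the corresponding likelihood and are not shown.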
A self-driving vehicle (SDV) must be able to perceive its surroundings and predict the future behavior of other traffic participants. Existing works either perform object detection followed by trajectory forecasting of the detected objects, or predict dense occupancy and flow grids for the whole scene. The former poses a safety concern as the number of detections needs to be kept low for efficiency reasons, sacrificing object recall. The latter is computationally expensive due to the high dimensionality of the output grid, and suffers from the limited receptive field inherent to fully convolutional networks. Furthermore, both approaches spend substantial computational resources predicting areas or objects that might never be queried by the motion planner. This motivates our unified approach to perception and future prediction that implicitly represents occupancy and flow over time with a single neural network. Our method avoids unnecessary computation, as it can be directly queried by the motion planner at continuous spatio-temporal locations. Moreover, we design an architecture that overcomes the limited receptive field of previous explicit occupancy prediction methods by adding an efficient yet effective global attention mechanism. Through extensive experiments in both urban and highway settings, we demonstrate that our implicit model outperforms the current state-of-the-art. For more information, visit the project website: //waabi.ai/research/implicito.
Contact surfaces in planar motion exhibit a coupling between tangential and rotational friction forces. This paper proposes planar friction models grounded in the LuGre model and limit surface theory. First, distributed planar extended state models are proposed and the Elasto-Plastic model is extended for multi-dimensional friction. Subsequently, we derive a reduced planar friction model, coupled with a pre-calculated limit surface, that offers reduced computational cost. The approximation of the limit surface by an ellipsoid is discussed. The properties of the planar friction models are assessed in various simulations, demonstrating that the reduced planar friction model achieves comparable performance to the distributed model at approximately 80 times lower computational cost.
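The ellipsoidal limit surface approximation mentioned in the abstract above can be illustrated with a short sketch. It assumes the common axis-aligned form (fx/f_max)^2 + (fy/f_max)^2 + (tau/tau_max)^2 = 1 and a sliding velocity normal to that surface. The function name, the parameters `f_max` and `tau_max`, and the handling of zero velocity are assumptions made for the example, and the LuGre bristle dynamics of the paper are not modeled.

```python
import numpy as np

def ellipsoid_friction_wrench(twist, f_max, tau_max):
    """Friction wrench (fx, fy, tau) on an assumed ellipsoidal limit surface.

    twist = (vx, vy, omega) is the planar sliding velocity. The returned
    wrench lies on the ellipsoid and opposes the motion.
    """
    twist = np.asarray(twist, dtype=float)
    a_inv = np.array([f_max**2, f_max**2, tau_max**2])  # inverse of the ellipsoid matrix
    scale = np.sqrt(twist @ (a_inv * twist))
    if scale == 0.0:
        return np.zeros(3)  # no sliding: stiction is outside the scope of this sketch
    return -(a_inv * twist) / scale

# Pure translation gives the full tangential force and no torque.
print(ellipsoid_friction_wrench([1.0, 0.0, 0.0], f_max=2.0, tau_max=0.5))

# Combined translation and rotation shares the friction budget between
# force and torque, which is the coupling the abstract refers to.
w = ellipsoid_friction_wrench([0.3, -0.2, 1.5], f_max=2.0, tau_max=0.5)
```

Because the wrench is computed in closed form from the twist, no integration over the contact patch is needed at run time, which is the kind of saving a pre-calculated limit surface is meant to provide.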
The complex scenario of ultrasound images, in which adjacent tissues (i.e., background) share similar intensity with, and even contain richer texture patterns than, the lesion region (i.e., foreground), poses a unique challenge for accurate lesion segmentation. This work presents a decomposition-coupling network, called DC-Net, to deal with this challenge in a (foreground-background) saliency map disentanglement-fusion manner. The DC-Net consists of decomposition and coupling subnets: the former preliminarily disentangles the original image into foreground and background saliency maps, and the latter then performs accurate segmentation with the assistance of saliency prior fusion. The coupling subnet involves three fusion strategies: 1) regional feature aggregation (via a differentiable context pooling operator in the encoder) to adaptively preserve local contextual details with a larger receptive field during dimension reduction; 2) relation-aware representation fusion (via a cross-correlation fusion module in the decoder) to efficiently fuse low-level visual characteristics and high-level semantic features during resolution restoration; 3) dependency-aware prior incorporation (via a coupler) to reinforce the foreground-salient representation with complementary information derived from the background representation. Furthermore, a harmonic loss function is introduced to encourage the network to focus more attention on low-confidence and hard samples. The proposed method is evaluated on two ultrasound lesion segmentation tasks and demonstrates a remarkable performance improvement over existing state-of-the-art methods.
Many complex engineering systems can be represented in a topological form, such as graphs. This paper utilizes a machine learning technique called Geometric Deep Learning (GDL) to aid designers with challenging, graph-centric design problems. The strategy presented here applies GDL to the graph data to find the best-performing realizable solution effectively and at lower computational cost. The case study used here is the synthesis of analog electrical circuits that attempt to match a specific frequency response within a particular frequency range. Previous studies utilized an enumeration technique to generate 43,249 unique undirected graphs representing valid potential circuits. Unfortunately, determining the sizing and performance of many circuits can be too expensive. To reduce computational costs with a quantified trade-off in accuracy, a fraction of the circuit graphs and their performance are used as input data to a classification-focused GDL model. The GDL model can then be used to predict the remainder cheaply, thus aiding decision-makers in the search for the best graph solutions. The results discussed in this paper show that additional graph-based features are useful, that a favorable total-set classification accuracy of 80% can be achieved using only 10% of the graphs, and that iteratively built GDL models can further subdivide the graphs into targeted groups with medians significantly closer to the best and containing, on average, 88.2 of the top 100 best-performing graphs using 25% of the graphs.
Large Language Models (LLMs) have shown excellent generalization capabilities that have led to the development of numerous models. These models propose new architectures, tweak existing architectures with refined training strategies, increase context length, use high-quality training data, and increase training time to outperform baselines. Analyzing new developments is crucial for identifying changes that enhance training stability and improve generalization in LLMs. This survey paper comprehensively analyzes LLM architectures and their categorization, training strategies, training datasets, and performance evaluations, and discusses future research directions. Moreover, the paper discusses the basic building blocks and concepts behind LLMs, followed by a complete overview of LLMs, including their important features and functions. Finally, the paper summarizes significant findings from LLM research and consolidates essential architectural and training strategies for developing advanced LLMs. Given the continuous advancements in LLMs, we intend to regularly update this paper by incorporating new sections and featuring the latest LLMs.
Graph Convolutional Networks (GCNs) have been widely applied in various fields due to their significant power in processing graph-structured data. The typical GCN and its variants work under a homophily assumption (i.e., nodes of the same class are prone to connect to each other), while ignoring the heterophily that exists in many real-world networks (i.e., nodes of different classes tend to form edges). Existing methods deal with heterophily mainly by aggregating higher-order neighborhoods or combining the intermediate representations, which introduces noise and irrelevant information into the result. However, these methods do not change the propagation mechanism, a fundamental part of GCNs, which works under the homophily assumption. This makes it difficult to distinguish the representations of nodes from different classes. To address this problem, in this paper we design a novel propagation mechanism, which can automatically change the propagation and aggregation process according to the homophily or heterophily between node pairs. To adaptively learn the propagation process, we introduce two measurements of the homophily degree between node pairs, which are learned based on topological and attribute information, respectively. Then we incorporate the learnable homophily degree into the graph convolution framework, which is trained in an end-to-end manner, enabling it to go beyond the assumption of homophily. More importantly, we theoretically prove that our model can constrain the similarity of representations between nodes according to their homophily degree. Experiments on seven real-world datasets demonstrate that this new approach outperforms the state-of-the-art methods under heterophily or low homophily, and achieves competitive performance under homophily.
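The idea of letting a pairwise homophily degree steer propagation, as described in the abstract above, can be sketched in a few lines. This is an illustration under strong assumptions rather than the authors' mechanism: the homophily degree matrix `S` is passed in as a fixed input instead of being learned from topology and attributes, the row normalization and ReLU are choices made for the example, and the function name is hypothetical.

```python
import numpy as np

def homophily_weighted_propagation(A, H, S, W):
    """One propagation step with edges reweighted by a homophily degree.

    A: (n, n) adjacency matrix, H: (n, d) node features,
    S: (n, n) assumed homophily degrees in [0, 1], W: (d, k) weights.
    """
    weights = A * S                              # only existing edges can carry messages
    deg = weights.sum(axis=1, keepdims=True)
    deg[deg == 0.0] = 1.0                        # nodes with no trusted neighbor aggregate zeros
    aggregated = (weights / deg) @ H             # weighted mean over neighbors
    return np.maximum(aggregated @ W, 0.0)       # ReLU nonlinearity

# Node 0 has two neighbors; only node 1 is marked as homophilous to it.
A = np.array([[0.0, 1.0, 1.0], [1.0, 0.0, 0.0], [1.0, 0.0, 0.0]])
S = np.array([[0.0, 1.0, 0.0], [1.0, 0.0, 0.0], [0.0, 0.0, 0.0]])
H = np.array([[1.0, 0.0], [0.0, 1.0], [5.0, 5.0]])
out = homophily_weighted_propagation(A, H, S, np.eye(2))
```

In this toy graph, node 0 receives only the features of node 1, and the features of the heterophilous neighbor (node 2) are suppressed. A plain GCN step with uniform edge weights would instead mix both neighbors.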
The dominant NLP paradigm of training a strong neural predictor to perform one task on a specific dataset has led to state-of-the-art performance in a variety of applications (e.g., sentiment classification, span-prediction based question answering, or machine translation). However, it builds upon the assumption that the data distribution is stationary, i.e., that the data is sampled from a fixed distribution both at training and test time. This way of training is inconsistent with how we as humans are able to learn from and operate within a constantly changing stream of information. Moreover, it is ill-adapted to real-world use cases where the data distribution is expected to shift over the course of a model's lifetime. The first goal of this thesis is to characterize the different forms this shift can take in the context of natural language processing, and to propose benchmarks and evaluation metrics to measure its effect on current deep learning architectures. We then proceed to take steps to mitigate the effect of distributional shift on NLP models. To this end, we develop methods based on parametric reformulations of the distributionally robust optimization framework. Empirically, we show on a selection of realistic problems that these approaches yield more robust models. In the third and final part of this thesis, we explore ways of efficiently adapting existing models to new domains or tasks. Our contribution to this topic takes inspiration from information geometry to derive a new gradient update rule that alleviates catastrophic forgetting issues during adaptation.