Understanding adaptive human driving behavior, in particular how drivers manage uncertainty, is of key importance for developing simulated human driver models that can be used in the evaluation and development of autonomous vehicles. However, existing traffic psychology models of adaptive driving behavior either lack computational rigor or only address specific scenarios and/or behavioral phenomena. While models developed in the fields of machine learning and robotics can effectively learn adaptive driving behavior from data, due to their black box nature, they offer little or no explanation of the mechanisms underlying the adaptive behavior. Thus, a generalizable, interpretable, computational model of adaptive human driving behavior is still lacking. This paper proposes such a model based on active inference, a behavioral modeling framework originating in computational neuroscience. The model offers a principled solution to how humans trade progress against caution through policy selection based on the single mandate to minimize expected free energy. This casts goal-seeking and information-seeking (uncertainty-resolving) behavior under a single objective function, allowing the model to seamlessly resolve uncertainty as a means to obtain its goals. We apply the model in two apparently disparate driving scenarios that require managing uncertainty, (1) driving past an occluding object and (2) visual time sharing between driving and a secondary task, and show how human-like adaptive driving behavior emerges from the single principle of expected free energy minimization.
The increasing availability of temporal data poses a challenge to time-series and signal-processing domains due to its high numerosity and complexity. Symbolic representation outperforms raw data in a variety of engineering applications due to its storage efficiency, reduced numerosity, and noise reduction. The most recent symbolic aggregate approximation technique called ABBA demonstrates outstanding performance in preserving essential shape information of time series and enhancing the downstream applications. However, ABBA cannot handle multiple time series with consistent symbols, i.e., the same symbols from distinct time series are not identical. Also, working with appropriate ABBA digitization involves the tedious task of tuning the hyperparameters, such as the number of symbols or tolerance. Therefore, we present a joint symbolic aggregate approximation that has symbolic consistency, and show how the hyperparameter of digitization can itself be optimized alongside the compression tolerance ahead of time. Besides, we propose a novel computing paradigm that enables parallel computing of symbolic approximation. The extensive experiments demonstrate its superb performance and outstanding speed regarding symbolic approximation and reconstruction.
We analyze the prediction error of principal component regression (PCR) and prove high probability bounds for the corresponding squared risk conditional on the design. Our first main result shows that PCR performs comparably to the oracle method obtained by replacing empirical principal components by their population counterparts, provided that an effective rank condition holds. On the other hand, if the latter condition is violated, then empirical eigenvalues start to have a significant upward bias, resulting in a self-induced regularization of PCR. Our approach relies on the behavior of empirical eigenvalues, empirical eigenvectors and the excess risk of principal component analysis in high-dimensional regimes.
Recent work has focused on the very common practice of prediction-based inference: that is, (i) using a pre-trained machine learning model to predict an unobserved response variable, and then (ii) conducting inference on the association between that predicted response and some covariates. As pointed out by Wang et al. (2020), applying a standard inferential approach in (ii) does not accurately quantify the association between the unobserved (as opposed to the predicted) response and the covariates. In recent work, Wang et al. (2020) and Angelopoulos et al. (2023) propose corrections to step (ii) in order to enable valid inference on the association between the unobserved response and the covariates. Here, we show that the method proposed by Angelopoulos et al. (2023) successfully controls the type 1 error rate and provides confidence intervals with correct nominal coverage, regardless of the quality of the pre-trained machine learning model used to predict the unobserved response. However, the method proposed by Wang et al. (2020) provides valid inference only under very strong conditions that rarely hold in practice: for instance, if the machine learning model perfectly estimates the true regression function in the study population of interest.
It is well-known that mood and pain interact with each other, however individual-level variability in this relationship has been less well quantified than overall associations between low mood and pain. Here, we leverage the possibilities presented by mobile health data, in particular the "Cloudy with a Chance of Pain" study, which collected longitudinal data from the residents of the UK with chronic pain conditions. Participants used an App to record self-reported measures of factors including mood, pain and sleep quality. The richness of these data allows us to perform model-based clustering of the data as a mixture of Markov processes. Through this analysis we discover four endotypes with distinct patterns of co-evolution of mood and pain over time. The differences between endotypes are sufficiently large to play a role in clinical hypothesis generation for personalised treatments of comorbid pain and low mood.
With the development of deep learning and natural language processing techniques, pre-trained language models have been widely used to solve information retrieval (IR) problems. Benefiting from the pre-training and fine-tuning paradigm, these models achieve state-of-the-art performance. In previous works, plain texts in Wikipedia have been widely used in the pre-training stage. However, the rich structured information in Wikipedia, such as the titles, abstracts, hierarchical heading (multi-level title) structure, relationship between articles, references, hyperlink structures, and the writing organizations, has not been fully explored. In this paper, we devise four pre-training objectives tailored for IR tasks based on the structured knowledge of Wikipedia. Compared to existing pre-training methods, our approach can better capture the semantic knowledge in the training corpus by leveraging the human-edited structured data from Wikipedia. Experimental results on multiple IR benchmark datasets show the superior performance of our model in both zero-shot and fine-tuning settings compared to existing strong retrieval baselines. Besides, experimental results in biomedical and legal domains demonstrate that our approach achieves better performance in vertical domains compared to previous models, especially in scenarios where long text similarity matching is needed.
Accounting for exposure measurement errors has been recognized as a crucial problem in environmental epidemiology for over two decades. Bayesian hierarchical models offer a coherent probabilistic framework for evaluating associations between environmental exposures and health effects, which take into account exposure measurement errors introduced by uncertainty in the estimated exposure as well as spatial misalignment between the exposure and health outcome data. While two-stage Bayesian analyses are often regarded as a good alternative to fully Bayesian analyses when joint estimation is not feasible, there has been minimal research on how to properly propagate uncertainty from the first-stage exposure model to the second-stage health model, especially in the case of a large number of participant locations along with spatially correlated exposures. We propose a scalable two-stage Bayesian approach, called a sparse multivariate normal (sparse MVN) prior approach, based on the Vecchia approximation for assessing associations between exposure and health outcomes in environmental epidemiology. We compare its performance with existing approaches through simulation. Our sparse MVN prior approach shows comparable performance with the fully Bayesian approach, which is a gold standard but is impossible to implement in some cases. We investigate the association between source-specific exposures and pollutant (nitrogen dioxide (NO$_2$))-specific exposures and birth outcomes for 2012 in Harris County, Texas, using several approaches, including the newly developed method.
Detecting early warning indicators for abrupt dynamical transitions in complex systems or high-dimensional observation data is essential in many real-world applications, such as brain diseases, natural disasters, financial crises, and engineering reliability. To this end, we develop a novel approach: the directed anisotropic diffusion map that captures the latent evolutionary dynamics in the low-dimensional manifold. Then three effective warning signals (Onsager-Machlup Indicator, Sample Entropy Indicator, and Transition Probability Indicator) are derived through the latent coordinates and the latent stochastic dynamical systems. To validate our framework, we apply this methodology to authentic electroencephalogram (EEG) data. We find that our early warning indicators are capable of detecting the tipping point during state transition. This framework not only bridges the latent dynamics with real-world data but also shows the potential ability for automatic labeling on complex high-dimensional time series.
For the convolutional neural network (CNN) used for pattern classification, the training loss function is usually applied to the final output of the network, except for some regularization constraints on the network parameters. However, with the increasing of the number of network layers, the influence of the loss function on the network front layers gradually decreases, and the network parameters tend to fall into local optimization. At the same time, it is found that the trained network has significant information redundancy at all stages of features, which reduces the effectiveness of feature mapping at all stages and is not conducive to the change of the subsequent parameters of the network in the direction of optimality. Therefore, it is possible to obtain a more optimized solution of the network and further improve the classification accuracy of the network by designing a loss function for restraining the front stage features and eliminating the information redundancy of the front stage features .For CNN, this article proposes a multi-stage feature decorrelation loss (MFD Loss), which refines effective features and eliminates information redundancy by constraining the correlation of features at all stages. Considering that there are many layers in CNN, through experimental comparison and analysis, MFD Loss acts on multiple front layers of CNN, constrains the output features of each layer and each channel, and performs supervision training jointly with classification loss function during network training. Compared with the single Softmax Loss supervised learning, the experiments on several commonly used datasets on several typical CNNs prove that the classification performance of Softmax Loss+MFD Loss is significantly better. Meanwhile, the comparison experiments before and after the combination of MFD Loss and some other typical loss functions verify its good universality.
Dealing with uncertainty in optimization parameters is an important and longstanding challenge. Typically, uncertain parameters are predicted accurately, and then a deterministic optimization problem is solved. However, the decisions produced by this so-called \emph{predict-then-optimize} procedure can be highly sensitive to uncertain parameters. In this work, we contribute to recent efforts in producing \emph{decision-focused} predictions, i.e., to build predictive models that are constructed with the goal of minimizing a \emph{regret} measure on the decisions taken with them. We formulate the exact expected regret minimization as a pessimistic bilevel optimization model. Then, using duality arguments, we reformulate it as a non-convex quadratic optimization problem. Finally, we show various computational techniques to achieve tractability. We report extensive computational results on shortest-path instances with uncertain cost vectors. Our results indicate that our approach can improve training performance over the approach of Elmachtoub and Grigas (2022), a state-of-the-art method for decision-focused learning.
Conventional entity typing approaches are based on independent classification paradigms, which make them difficult to recognize inter-dependent, long-tailed and fine-grained entity types. In this paper, we argue that the implicitly entailed extrinsic and intrinsic dependencies between labels can provide critical knowledge to tackle the above challenges. To this end, we propose \emph{Label Reasoning Network(LRN)}, which sequentially reasons fine-grained entity labels by discovering and exploiting label dependencies knowledge entailed in the data. Specifically, LRN utilizes an auto-regressive network to conduct deductive reasoning and a bipartite attribute graph to conduct inductive reasoning between labels, which can effectively model, learn and reason complex label dependencies in a sequence-to-set, end-to-end manner. Experiments show that LRN achieves the state-of-the-art performance on standard ultra fine-grained entity typing benchmarks, and can also resolve the long tail label problem effectively.