In this paper, we propose to quantify the execution time variability of programs using statistical dispersion parameters. We show how execution time variability can be exploited in mixed-criticality real-time systems. We propose a heuristic that computes the execution time budget to be allocated to each low-criticality real-time task according to its execution time variability. Using experiments and simulations, we show that the proposed heuristic reduces the probability of exceeding the allocated budget compared to algorithms that do not take execution time variability into account.
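As a rough illustration of the kind of dispersion-driven budgeting the abstract describes, the sketch below computes standard dispersion measures (standard deviation, coefficient of variation, interquartile range) from measured execution times and inflates a quantile-based budget by a margin proportional to the dispersion. The helper names and the budget rule are hypothetical and not the paper's heuristic.

```python
import numpy as np

def dispersion_stats(samples):
    """Basic statistical dispersion measures of measured execution times (hypothetical helper)."""
    samples = np.asarray(samples, dtype=float)
    q1, q3 = np.percentile(samples, [25, 75])
    return {
        "std": samples.std(ddof=1),
        "cv": samples.std(ddof=1) / samples.mean(),   # coefficient of variation
        "iqr": q3 - q1,                               # interquartile range
    }

def budget_from_variability(samples, base_quantile=0.95, scale=0.5):
    """Toy budget rule (illustrative only): start from a high quantile of the
    observed times and add a margin proportional to the task's dispersion."""
    stats = dispersion_stats(samples)
    return np.percentile(samples, 100 * base_quantile) + scale * stats["iqr"]

times_ms = np.random.lognormal(mean=1.0, sigma=0.3, size=1000)  # synthetic measurements
print(dispersion_stats(times_ms), budget_from_variability(times_ms))
```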
In this paper, we carry out the numerical analysis of a nonsmooth quasilinear elliptic optimal control problem, where the coefficient in the divergence term of the corresponding state equation is not differentiable with respect to the state variable. Despite the lack of differentiability of the nonlinearity in the quasilinear elliptic equation, the corresponding control-to-state operator is of class $C^1$ but not of class $C^2$. Analogously, the discrete control-to-state operators associated with the approximated control problems are proven to be of class $C^1$ only. By using an explicit second-order sufficient optimality condition, we prove a priori error estimates for a variational approximation, a piecewise constant approximation, and a continuous piecewise linear approximation of the continuous optimal control problem. The numerical tests confirm these error estimates.
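For concreteness, a representative form of a nonsmooth quasilinear elliptic state equation of the kind described above is sketched below; the specific coefficient is an assumption chosen only to illustrate the structure (a divergence term whose coefficient is Lipschitz but not differentiable in the state), not the problem studied in the paper.

```latex
% Illustrative (assumed) nonsmooth quasilinear state equation:
\begin{aligned}
  -\operatorname{div}\bigl(a(x, y(x))\,\nabla y(x)\bigr) &= u(x) && \text{in } \Omega,\\
  y &= 0 && \text{on } \partial\Omega,
\end{aligned}
% e.g. with a coefficient such as a(x,y) = a_0(x) + |y|, which is Lipschitz in y
% but not differentiable at y = 0, so second derivatives of the nonlinearity
% (and hence C^2 regularity of the control-to-state map) are not available.
```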
This paper analyzes the approximate control variate (ACV) approach to multifidelity uncertainty quantification in the case where weighted estimators are combined to form the components of the ACV. The weighted estimators make it possible to precisely group models that share input samples in order to improve variance reduction. We demonstrate that this viewpoint yields a generalized linear estimator that can assign any weight to any sample. This generalization shows that other linear estimators in the literature, in particular the multilevel best linear unbiased estimator (ML-BLUE) of Schaden and Ullman (2020), are specific instances of the ACV estimator of Gorodetsky, Geraci, Jakeman, and Eldred (2020). Moreover, this connection enables numerous extensions and insights. For example, we empirically show that non-independent groups can yield better variance reduction than the independent groups used by ML-BLUE. Furthermore, such grouped estimators can use arbitrary weighted estimators, not just the simple Monte Carlo estimators used in ML-BLUE. Finally, the analysis enables a derivation of ML-BLUE directly from a variance reduction perspective, rather than a regression perspective.
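To give a feel for the basic mechanism being generalized, here is a minimal two-model approximate control variate in which the low-fidelity model reuses the high-fidelity samples plus extra samples of its own. The function names and the simple control variate weight are illustrative assumptions; the paper's grouped, weighted estimator and its optimal weights are far more general.

```python
import numpy as np

def acv_two_model(f_hi, f_lo, n_shared=100, n_extra=900, seed=0):
    """Minimal two-fidelity approximate control variate (illustrative sketch,
    not the generalized grouped/weighted estimator of the paper)."""
    rng = np.random.default_rng(seed)
    x_shared = rng.standard_normal(n_shared)   # samples shared by both models
    x_extra = rng.standard_normal(n_extra)     # extra samples for the low-fidelity model only

    hi = f_hi(x_shared)
    lo_shared = f_lo(x_shared)
    lo_all = f_lo(np.concatenate([x_shared, x_extra]))

    # control variate weight estimated from the shared samples (simple CV formula)
    alpha = -np.cov(hi, lo_shared)[0, 1] / np.var(lo_shared, ddof=1)
    return hi.mean() + alpha * (lo_shared.mean() - lo_all.mean())

# toy example: the low-fidelity model is a cheap, correlated surrogate of exp(x)
est = acv_two_model(f_hi=lambda x: np.exp(x), f_lo=lambda x: 1 + x + 0.5 * x**2)
print(est)
```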
Datasets of software mentions collected from scholarly publications can potentially be used for research into the software used in published research, as well as into the practice of software citation. Recently, new software mention datasets with different characteristics have been published. We present an approach to assess the usability of such datasets for research on research software. Our approach includes sampling and data preparation, manual annotation for quality and mention characteristics, and annotation analysis. We applied it to two software mention datasets and evaluated them based on qualitative observation. In doing so, we identified challenges in using the selected datasets for research. The main issues concern the structure of the datasets, the quality of the extracted mentions (54% and 23% of mentions, respectively, do not refer to software), and software accessibility. While one dataset does not provide links to mentioned software at all, the other does so in a way that can impede quantitative research endeavors: (1) Links may come from different sources and point to different software for the same mention. (2) The quality of the automatically retrieved links is generally poor (in our sample, 65.4% link to the wrong software). (3) Links exist only for a small subset (in our sample, 20.5%) of mentions, which may lead to skewed or disproportionate samples. However, the greatest challenge and underlying issue in working with software mention datasets is the still suboptimal practice of software citation: software should not merely be mentioned; it should be cited following the software citation principles.
Feature attribution methods have become a staple approach to disentangling the complex behavior of black-box models. Despite their success, some scholars have argued that such methods suffer from a serious flaw: they do not allow a reliable interpretation in terms of human concepts. Simply put, visualizing an array of feature contributions is not enough for humans to conclude something about a model's internal representations, and confirmation bias can trick users into false beliefs about model behavior. We argue that a structured approach is required to test whether our hypotheses about the model are confirmed by the feature attributions. This is what we call the "semantic match" between human concepts and (sub-symbolic) explanations. Building on the conceptual framework put forward in Cin\`a et al. [2023], we propose a structured approach to evaluate semantic match in practice. We showcase the procedure in a suite of experiments spanning tabular and image data, and show how the assessment of semantic match can give insight into both desirable (e.g., focusing on an object relevant for prediction) and undesirable (e.g., focusing on a spurious correlation) model behaviors. We couple our experimental results with an analysis of the metrics used to measure semantic match, and argue that this approach constitutes a first step towards resolving the issue of confirmation bias in XAI.
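A crude proxy for the idea of semantic match is to check how much attribution mass lands on a human-annotated concept region. The sketch below computes that overlap for an image-shaped saliency map; the function name and the overlap score are illustrative assumptions, not the structured evaluation procedure the paper proposes.

```python
import numpy as np

def concept_attribution_overlap(attributions, concept_mask):
    """Toy proxy for semantic match: the fraction of (absolute) attribution mass
    that falls inside a human-annotated concept mask. Illustrative only; the paper
    defines a more structured, hypothesis-driven procedure."""
    a = np.abs(np.asarray(attributions, dtype=float))
    m = np.asarray(concept_mask, dtype=bool)
    return a[m].sum() / a.sum()

attr = np.random.rand(32, 32)          # e.g. a saliency map from any attribution method
mask = np.zeros((32, 32), dtype=bool)  # e.g. a segmentation of the object of interest
mask[8:24, 8:24] = True
print(concept_attribution_overlap(attr, mask))
```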
In this study, we systematically evaluate the impact of common design choices in Mixture-of-Experts (MoE) models on validation performance, uncovering distinct influences at the token and sequence levels. We also present empirical evidence that a learned router performs comparably to a frozen, randomly initialized router, suggesting that learned routing may not be essential. Our study further reveals that sequence-level routing can result in weak, topic-specific expert specialization, in contrast to the syntax specialization observed with token-level routing.
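The sketch below shows a minimal token-level top-1 router, with a flag that freezes the randomly initialized routing matrix to mirror the frozen-router baseline mentioned above; the class name, dimensions, and the pooling remark are assumptions for illustration, not the paper's exact architecture.

```python
import torch
import torch.nn as nn

class Top1Router(nn.Module):
    """Minimal token-level top-1 MoE router (illustrative sketch, not the paper's setup).
    With learn_router=False the routing matrix stays at its random initialization,
    mirroring the frozen-random-router baseline discussed in the abstract."""
    def __init__(self, d_model, n_experts, learn_router=True):
        super().__init__()
        self.w = nn.Linear(d_model, n_experts, bias=False)
        if not learn_router:
            for p in self.w.parameters():
                p.requires_grad_(False)

    def forward(self, x):                      # x: (batch, seq, d_model)
        logits = self.w(x)                     # (batch, seq, n_experts)
        expert_idx = logits.argmax(dim=-1)     # token-level routing decision
        gate = torch.softmax(logits, dim=-1).gather(-1, expert_idx.unsqueeze(-1))
        # sequence-level routing would instead pool the logits over the seq dimension
        return expert_idx, gate

router = Top1Router(d_model=64, n_experts=8, learn_router=False)
idx, gate = router(torch.randn(2, 16, 64))
```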
We present a fully integrated lattice Boltzmann (LB) method for fluid--structure interaction (FSI) simulations that efficiently models deformable solids in complex suspensions and active systems. Our Eulerian method (LBRMT) couples finite-strain solids to the LB fluid on the same fixed computational grid using the reference map technique (RMT). An integral part of the LBRMT is a new LB boundary condition for moving deformable interfaces across different densities. With this fully Eulerian solid--fluid coupling, the LBRMT is well suited for parallelization and for simulating multi-body contact without remeshing or extra meshes. We validate its accuracy on a benchmark of a deformable solid in a lid-driven cavity, then showcase its versatility through examples of soft solids rotating and settling. With simulations of the mixing of complex suspensions, we highlight the potential of the LBRMT for studying collective behavior in soft matter and biofluid dynamics.
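For background on the fluid side only, here is a bare single-phase D2Q9 BGK collide-and-stream step on a periodic grid. This is a generic textbook building block, assumed here for illustration; it does not include the paper's reference-map solid solver, the new moving-interface boundary condition, or the solid--fluid coupling.

```python
import numpy as np

# D2Q9 lattice: discrete velocities and weights (c_s^2 = 1/3 in lattice units)
E = np.array([[0,0],[1,0],[0,1],[-1,0],[0,-1],[1,1],[-1,1],[-1,-1],[1,-1]])
W = np.array([4/9] + [1/9]*4 + [1/36]*4)

def equilibrium(rho, ux, uy):
    eu = E[:, 0, None, None] * ux + E[:, 1, None, None] * uy
    usq = ux**2 + uy**2
    return W[:, None, None] * rho * (1 + 3*eu + 4.5*eu**2 - 1.5*usq)

def lbm_step(f, tau=0.8):
    rho = f.sum(axis=0)
    ux = (f * E[:, 0, None, None]).sum(axis=0) / rho
    uy = (f * E[:, 1, None, None]).sum(axis=0) / rho
    f = f + (equilibrium(rho, ux, uy) - f) / tau           # BGK collision
    for i, (ex, ey) in enumerate(E):                       # streaming (periodic)
        f[i] = np.roll(np.roll(f[i], ex, axis=0), ey, axis=1)
    return f

f = equilibrium(np.ones((64, 64)), np.zeros((64, 64)), np.zeros((64, 64)))
f = lbm_step(f)
```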
Loss reserving generally focuses on identifying a single model that can generate superior predictive performance. However, different loss reserving models specialise in capturing different aspects of loss data. This is recognised in practice in the sense that results from different models are often considered, and sometimes combined. For instance, actuaries may take a weighted average of the prediction outcomes from various loss reserving models, often based on subjective assessments. In this paper, we propose a systematic framework to objectively combine (i.e. ensemble) multiple _stochastic_ loss reserving models so that the strengths offered by different models can be utilised effectively. Our framework contains two main innovations compared to existing literature and practice. Firstly, our model combination criteria consider the full distributional properties of the ensemble and not just the central estimate, which is of particular importance in the reserving context. Secondly, our framework is tailored to the features inherent in reserving data. These include, for instance, accident, development, calendar, and claim maturity effects. Crucially, the relative importance and scarcity of data across accident periods render the problem distinct from traditional ensembling techniques in statistical learning. Our framework is illustrated with a complex synthetic dataset. In the results, the optimised ensemble outperforms both (i) traditional model selection strategies, and (ii) an equally weighted ensemble. In particular, the improvement occurs not only for central estimates but also for relevant quantiles, such as the 75th percentile of reserves (typically of interest to both insurers and regulators).
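One common way to combine stochastic models on their full predictive distributions (rather than central estimates) is to choose mixture weights that maximise an out-of-sample log score. The sketch below does exactly that and is an illustrative assumption: it ignores the accident/development/calendar structure and data scarcity across accident periods that the paper's framework is specifically designed to handle.

```python
import numpy as np
from scipy.optimize import minimize

def optimal_density_weights(log_dens):
    """Combine stochastic reserving models by maximising the validation log score of
    the weighted mixture of predictive densities (illustrative sketch only).
    log_dens: array of shape (n_obs, n_models), each model's log predictive density
    evaluated at held-out observations."""
    n_models = log_dens.shape[1]

    def neg_log_score(theta):
        w = np.exp(theta) / np.exp(theta).sum()    # softmax keeps weights on the simplex
        mix = np.log(np.exp(log_dens) @ w)         # log of the mixture density
        return -mix.mean()

    res = minimize(neg_log_score, x0=np.zeros(n_models), method="Nelder-Mead")
    return np.exp(res.x) / np.exp(res.x).sum()

# toy example: three hypothetical models' log-densities on 50 validation cells
rng = np.random.default_rng(1)
print(optimal_density_weights(rng.normal(-2.0, 0.5, size=(50, 3))))
```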
In this paper, we develop a novel neural network model for predicting the implied volatility surface. Prior financial domain knowledge is taken into account. A new activation function that incorporates the volatility smile is proposed and is used for the hidden nodes that process the underlying asset price. In addition, financial conditions, such as the absence of arbitrage, the boundaries, and the asymptotic slope, are embedded into the loss function. This is one of the very first studies to propose a methodological framework that incorporates prior financial domain knowledge into neural network architecture design and model training. The proposed model outperforms the benchmark models on S&P 500 index option data spanning 20 years. More importantly, the domain knowledge is satisfied empirically, showing that the model is consistent with existing financial theories and conditions related to the implied volatility surface.
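To make the idea of a smile-aware activation concrete, here is a hypothetical example: a learnable, convex, U-shaped response in its input, loosely mimicking the shape of a volatility smile. The class, parameters, and functional form are assumptions for illustration only; the paper's actual activation function and loss penalties are not reproduced here.

```python
import torch
import torch.nn as nn

class SmileActivation(nn.Module):
    """Hypothetical smile-shaped activation (illustration only): a convex, U-shaped
    response with learnable curvature and minimum location, loosely mimicking a
    volatility smile in the (scaled) moneyness input."""
    def __init__(self):
        super().__init__()
        self.a = nn.Parameter(torch.tensor(1.0))   # curvature of the smile
        self.b = nn.Parameter(torch.tensor(0.0))   # location of the smile minimum

    def forward(self, x):
        return torch.nn.functional.softplus(self.a * (x - self.b) ** 2)

act = SmileActivation()
print(act(torch.linspace(-2, 2, 5)))
```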
This paper does not describe a working system. Instead, it presents a single idea about representation which allows advances made by several different groups to be combined into an imaginary system called GLOM. The advances include transformers, neural fields, contrastive representation learning, distillation and capsules. GLOM answers the question: How can a neural network with a fixed architecture parse an image into a part-whole hierarchy which has a different structure for each image? The idea is simply to use islands of identical vectors to represent the nodes in the parse tree. If GLOM can be made to work, it should significantly improve the interpretability of the representations produced by transformer-like systems when applied to vision or language.
Convolutional Neural Networks (CNNs) have achieved impressive performance on many computer vision tasks, such as object detection, image recognition, and image retrieval. These achievements benefit from the outstanding capability of CNNs to learn input features with deep layers of neuron structures and an iterative training process. However, the learned features are hard to identify and interpret from a human vision perspective, leading to a lack of understanding of the CNN's internal working mechanism. To improve CNN interpretability, CNN visualization is widely used as a qualitative analysis method, which translates the internal features into visually perceptible patterns. Many CNN visualization methods have been proposed in the literature to interpret CNNs from the perspectives of network structure, operation, and semantic concept. In this paper, we provide a comprehensive survey of several representative CNN visualization methods, including Activation Maximization, Network Inversion, Deconvolutional Neural Networks (DeconvNet), and Network Dissection based visualization. These methods are presented in terms of motivations, algorithms, and experimental results. Based on these visualization methods, we also discuss their practical applications to demonstrate the significance of CNN interpretability in areas such as network design, optimization, and security enhancement.
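As a pointer to the first of the surveyed techniques, here is a bare-bones Activation Maximization sketch: gradient ascent on the input to maximise one class logit of a pretrained CNN. The model choice, class index, and the simple L2 penalty are assumptions; published variants add stronger priors and regularisers (jitter, blurring, total variation) to obtain cleaner patterns.

```python
import torch
import torchvision.models as models

# Activation Maximization: optimise the input so that a chosen unit (here, a class
# logit of a pretrained VGG-16) responds as strongly as possible.
model = models.vgg16(weights="DEFAULT").eval()
x = torch.zeros(1, 3, 224, 224, requires_grad=True)
opt = torch.optim.Adam([x], lr=0.05)

target_class = 130                      # arbitrary ImageNet class index for illustration
for _ in range(200):
    opt.zero_grad()
    loss = -model(x)[0, target_class] + 1e-4 * x.norm()   # maximise activation, small L2 penalty
    loss.backward()
    opt.step()

# the optimised input approximates the pattern the chosen unit responds to
visualisation = x.detach().clamp(-2.5, 2.5)
```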