We propose and discuss sensitivity metrics for reliability analysis that are based on the value of information. In the context of a specific decision, these metrics are easier to interpret than existing sensitivity metrics, and they are applicable to any type of reliability assessment, including those with dependent inputs. We develop computational strategies that enable efficient evaluation of these metrics, in some scenarios without additional runs of the deterministic model. The metrics are investigated by application to numerical examples.
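As a hedged illustration (the notation and the specific quantity are ours, not necessarily the exact metric proposed above), one value-of-information sensitivity for an uncertain input $X_i$, given decisions $a$ with loss $L$, is the expected reduction in optimal expected loss obtained by learning $X_i$ before deciding:
\[
\mathrm{EVPI}_i \;=\; \min_a \mathbb{E}\big[L(a, X)\big] \;-\; \mathbb{E}_{X_i}\Big[\min_a \mathbb{E}\big[L(a, X)\,\big|\,X_i\big]\Big] \;\geq\; 0 .
\]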
Because it determines a center-outward ordering of observations in $\mathbb{R}^d$ with $d\geq 2$, the concept of statistical depth makes it possible to define quantiles and ranks for multivariate data and to use them for various statistical tasks (e.g. inference, hypothesis testing). Whereas many depth functions have been proposed \textit{ad hoc} in the literature since the seminal contribution of \cite{Tukey75}, not all of them possess the properties desirable for emulating the notion of quantile function for univariate probability distributions. In this paper, we propose an extension of the \textit{integrated rank-weighted} statistical depth (IRW depth for short) originally introduced in \cite{IRW}, modified so as to satisfy the property of \textit{affine invariance}, thus fulfilling all four key axioms listed in the nomenclature elaborated by \cite{ZuoS00a}. The variant we propose, referred to as the Affine-Invariant IRW depth (AI-IRW for short), involves the covariance/precision matrices of the (supposedly square-integrable) $d$-dimensional random vector $X$ under study, in order to take into account the directions along which $X$ is most variable when assigning a depth value to any point $x\in \mathbb{R}^d$. The accuracy of the sampling version of the AI-IRW depth is investigated from a nonasymptotic perspective: a concentration result for the statistical counterpart of the AI-IRW depth is proved. Beyond the theoretical analysis carried out, applications to anomaly detection are considered and numerical results are displayed, providing strong empirical evidence of the relevance of the depth function proposed here.
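A minimal Monte Carlo sketch, assuming (as one plausible reading of the construction above) that the depth is obtained by whitening $X$ with an estimate of $\Sigma^{-1/2}$ and averaging a univariate rank-based depth over random projection directions; the exact definition and weighting used by the AI-IRW depth may differ.
\begin{verbatim}
# Hedged sketch of an affine-invariant, projection-based depth in the spirit of
# the AI-IRW depth: whiten the sample with an estimate of Sigma^{-1/2}, then
# average a univariate rank-based depth over random directions on the sphere.
import numpy as np

def ai_irw_depth(X, x, n_dirs=1000, seed=0):
    """Approximate depth of a point x (shape (d,)) w.r.t. a sample X (shape (n, d))."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    # Whitening with the precision matrix square root, estimated from the sample.
    cov = np.cov(X, rowvar=False)
    eigval, eigvec = np.linalg.eigh(cov)
    W = eigvec @ np.diag(1.0 / np.sqrt(np.clip(eigval, 1e-12, None))) @ eigvec.T
    Xw, xw = X @ W, x @ W
    # Random unit directions drawn uniformly on the sphere S^{d-1}.
    U = rng.standard_normal((n_dirs, d))
    U /= np.linalg.norm(U, axis=1, keepdims=True)
    # Univariate rank-based depth along each direction, averaged over directions.
    F = (Xw @ U.T <= U @ xw).mean(axis=0)   # empirical CDF of projections at x
    return np.minimum(F, 1.0 - F).mean()
\end{verbatim}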
The ever-growing importance of uncertainty and sensitivity analysis for complex model evaluations, and the difficulty of completing such analyses in a timely manner, create a need for more efficient numerical methods. Non-intrusive generalized polynomial chaos (gPC) methods are highly efficient and accurate for mapping input-output relationships of complex models. The choice of sampling scheme offers considerable potential for further increasing the efficiency of the method. We examined state-of-the-art sampling schemes, categorized into space-filling-optimal designs such as Latin Hypercube sampling and L1-optimal sampling, and compared their empirical performance against standard random sampling. The analysis was performed in the context of L1 minimization, using the least-angle regression algorithm to fit the gPC regression models. The sampling schemes were thoroughly investigated by evaluating the quality of the constructed surrogate models on distinct test cases representing different problem classes, covering low-, medium- and high-dimensional problems. Finally, the sampling schemes were tested on an application example: estimating the sensitivity of the self-impedance of a probe used to measure the impedance of biological tissues at different frequencies. Because of the random nature of the sampling schemes, we compared them using statistical stability measures and evaluated the success rates of constructing a surrogate model with an accuracy of <0.1%. We observed strong differences between the analyzed test functions in the convergence properties of the methods.
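A minimal sketch of one possible realization of such a pipeline, assuming uniform inputs on $[-1,1]^d$, a total-degree Legendre basis, Latin Hypercube sampling via SciPy, and a LARS-based sparse fit via scikit-learn; the toy model response and the sample sizes are placeholders, not the study's actual test cases.
\begin{verbatim}
# Minimal sketch: LHS design + sparse Legendre (gPC) regression via least-angle regression.
import numpy as np
from itertools import product
from scipy.stats import qmc
from sklearn.linear_model import LassoLarsCV
from numpy.polynomial.legendre import legval

def legendre_basis(X, degree):
    """Evaluate all multivariate Legendre polynomials of total degree <= degree."""
    d = X.shape[1]
    multi_idx = [m for m in product(range(degree + 1), repeat=d) if sum(m) <= degree]
    cols = []
    for m in multi_idx:
        col = np.ones(X.shape[0])
        for j, mj in enumerate(m):
            c = np.zeros(mj + 1); c[mj] = 1.0
            col *= legval(X[:, j], c)
        cols.append(col)
    return np.column_stack(cols)

d, degree, n_samples = 3, 4, 200
sampler = qmc.LatinHypercube(d=d, seed=0)
X = 2.0 * sampler.random(n_samples) - 1.0            # map LHS from [0,1]^d to [-1,1]^d
y = np.sum(X**3, axis=1) + 0.5 * X[:, 0] * X[:, 1]   # placeholder model response
Psi = legendre_basis(X, degree)
surrogate = LassoLarsCV(cv=5).fit(Psi, y)            # sparse L1-type fit via LARS
print("non-zero gPC coefficients:", np.count_nonzero(surrogate.coef_))
\end{verbatim}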
A growing body of work uses the paradigm of algorithmic fairness to frame the development of techniques to anticipate and proactively mitigate the introduction or exacerbation of health inequities that may follow from the use of model-guided decision-making. We evaluate the interplay between measures of model performance, fairness, and the expected utility of decision-making to offer practical recommendations for the operationalization of algorithmic fairness principles for the development and evaluation of predictive models in healthcare. We conduct an empirical case study by developing models to estimate the ten-year risk of atherosclerotic cardiovascular disease to inform statin initiation in accordance with clinical practice guidelines. We demonstrate that approaches that incorporate fairness considerations into the model training objective typically do not improve model performance or confer greater net benefit for any of the studied patient populations compared to the use of standard learning paradigms followed by threshold selection concordant with patient preferences, evidence of intervention effectiveness, and model calibration. These results hold when the measured outcomes are not subject to differential measurement error across patient populations and threshold selection is unconstrained, regardless of whether differences in model performance metrics, such as in true and false positive error rates, are present. In closing, we argue for focusing model development efforts on building calibrated models that predict outcomes well for all patient populations while emphasizing that such efforts are complementary to transparent reporting, participatory design, and reasoning about the impact of model-informed interventions in context.
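For reference, one standard notion of net benefit at a risk threshold $t$ (as used in decision-curve analysis; the paper's exact evaluation protocol may differ) is
\[
\mathrm{NB}(t) \;=\; \frac{\mathrm{TP}(t)}{n} \;-\; \frac{\mathrm{FP}(t)}{n}\cdot\frac{t}{1-t},
\]
where $\mathrm{TP}(t)$ and $\mathrm{FP}(t)$ count true and false positives when intervening on patients with predicted risk above $t$, and $n$ is the population size.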
"Program sensitivity" measures the distance between the outputs of a program when it is run on two related inputs. This notion, which plays an important role in areas such as data privacy and optimization, has been the focus of several program analysis techniques introduced in recent years. One approach that has proved particularly fruitful for this domain is the use of type systems inspired by linear logic, as pioneered by Reed and Pierce in the Fuzz programming language. In Fuzz, each type is equipped with its own notion of distance, and the typing rules explain how those distances can be treated soundly when analyzing the sensitivity of a program. In particular, Fuzz features two products types, corresponding to two different sensitivity analyses: the "tensor product" combines the distances of each component by adding them, while the "with product" takes their maximum. In this work, we show that products in Fuzz can be generalized to arbitrary $L^p$ distances, metrics that are often used in privacy and optimization. The original Fuzz products, tensor and with, correspond to the special cases $L^1$ and $L^\infty$. To simplify the handling of such products, we extend the Fuzz type system with bunches -- as in the logic of bunched implications -- where the distances of different groups of variables can be combined using different $L^p$ distances. We show that our extension can be used to reason about important examples of metrics between probability distributions in a natural way.
Biomechanical models often need to describe very complex systems, organs or diseases, and hence also include a large number of parameters. One of the attractive features of physics-based models is that (most of) their parameters have a clear physical meaning. Nevertheless, determining these parameters is often very elaborate and costly, and their values show a large scatter within the population. Hence, it is essential to identify the most important parameters for the particular problem at hand. In order to distinguish parameters that have a significant influence on a specific model output from non-influential parameters, we use sensitivity analysis, in particular the Sobol method as a global variance-based method. However, the Sobol method requires a large number of model evaluations, which is prohibitive for computationally expensive models. We therefore employ Gaussian processes as a metamodel for the underlying full model. Metamodelling introduces further uncertainty, which we also quantify. We demonstrate the approach by applying it to two different problems: nanoparticle-mediated drug delivery in a multiphase tumour-growth model, and arterial growth and remodelling. Even relatively small numbers of evaluations of the full model suffice to identify the influential parameters in both cases and to separate them from non-influential ones. The approach also allows the quantification of higher-order interaction effects. We thus show that a variance-based global sensitivity analysis is feasible for computationally expensive biomechanical models. Different aspects of sensitivity analysis are covered, including a transparent declaration of the uncertainties involved in the estimation process. Such a global sensitivity analysis not only helps to massively reduce the cost of experimentally determining parameters but is also highly beneficial for the inverse analysis of such complex models.
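A minimal sketch of the general workflow, assuming a scikit-learn Gaussian-process metamodel and SALib's Saltelli/Sobol estimator; the toy response, bounds and sample sizes are illustrative placeholders, not the biomechanical models studied here.
\begin{verbatim}
# Hedged sketch: fit a Gaussian-process metamodel on a small design evaluated
# with the expensive full model, then estimate Sobol indices cheaply on the GP.
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import Matern
from SALib.sample import saltelli
from SALib.analyze import sobol

problem = {"num_vars": 3,
           "names": ["k1", "k2", "k3"],
           "bounds": [[0.0, 1.0]] * 3}

# Small training design (the expensive full model is replaced by a toy stand-in).
rng = np.random.default_rng(0)
X_train = rng.uniform(size=(60, 3))
y_train = np.sin(2 * np.pi * X_train[:, 0]) + 0.5 * X_train[:, 1] ** 2  # placeholder

gp = GaussianProcessRegressor(kernel=Matern(nu=2.5), normalize_y=True).fit(X_train, y_train)

# Many cheap evaluations of the metamodel for the Sobol estimator.
X_sobol = saltelli.sample(problem, 1024)
S = sobol.analyze(problem, gp.predict(X_sobol))
print(S["S1"], S["ST"])   # first-order and total Sobol indices
\end{verbatim}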
Recent work has proposed stochastic Plackett-Luce (PL) ranking models as a robust choice for optimizing relevance and fairness metrics. Unlike their deterministic counterparts, which require heuristic optimization algorithms, PL models are fully differentiable. In theory, they can be used to optimize ranking metrics via stochastic gradient descent. In practice, however, computing the gradient exactly is infeasible because it requires iterating over all possible permutations of items. Consequently, actual applications rely on approximating the gradient via sampling techniques. In this paper, we introduce a novel algorithm, PL-Rank, which estimates the gradient of a PL ranking model w.r.t. both relevance and fairness metrics. Unlike existing approaches based on policy gradients, PL-Rank makes use of the specific structure of PL models and ranking metrics. Our experimental analysis shows that PL-Rank has greater sample efficiency and is computationally less costly than existing policy gradients, resulting in faster convergence at higher performance. PL-Rank further enables the industry to apply PL models for more relevant and fairer real-world ranking systems.
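For orientation, a minimal sketch of the sampling-based policy-gradient baseline that PL-Rank is contrasted with (not PL-Rank itself), assuming Gumbel-trick sampling of PL rankings and DCG as the reward; all function names are ours.
\begin{verbatim}
# Hedged sketch of a REINFORCE-style gradient estimate for expected DCG under a
# Plackett-Luce model, with rankings sampled via the Gumbel trick.
import numpy as np

def dcg_at_k(ranking, relevance, k=10):
    gains = relevance[ranking[:k]]
    return np.sum((2.0 ** gains - 1.0) / np.log2(np.arange(2, gains.size + 2)))

def pl_log_prob_grad(log_scores, ranking):
    """Gradient of log P(ranking) under the Plackett-Luce model w.r.t. log-scores."""
    grad = np.zeros_like(log_scores)
    available = np.ones(log_scores.size, dtype=bool)
    for item in ranking:
        probs = np.where(available, np.exp(log_scores), 0.0)
        probs /= probs.sum()
        grad -= probs          # -softmax over the items still available
        grad[item] += 1.0      # +1 for the item actually placed next
        available[item] = False
    return grad

def reinforce_gradient(log_scores, relevance, n_samples=100, seed=0):
    rng = np.random.default_rng(seed)
    grad = np.zeros_like(log_scores)
    for _ in range(n_samples):
        gumbel = rng.gumbel(size=log_scores.shape)
        ranking = np.argsort(-(log_scores + gumbel))   # one PL-distributed ranking
        grad += dcg_at_k(ranking, relevance) * pl_log_prob_grad(log_scores, ranking)
    return grad / n_samples
\end{verbatim}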
We study the link between generalization and interference in temporal-difference (TD) learning. Interference is defined as the inner product of two different gradients, representing their alignment. Interest in this quantity has emerged from a variety of observations about neural networks, parameter sharing and the dynamics of learning. We find that TD easily leads to low-interference, under-generalizing parameters, while the effect seems reversed in supervised learning. We hypothesize that the cause can be traced back to the interplay between the dynamics of interference and bootstrapping. This is supported empirically by several observations: the negative relationship between the generalization gap and interference in TD, the negative effect of bootstrapping on interference and the local coherence of targets, and the contrast between the propagation rate of information in TD(0) versus TD($\lambda$) and regression tasks such as Monte Carlo policy evaluation. We hope that these new findings can guide the future discovery of better bootstrapping methods.
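A minimal sketch of this interference measure, specialized (as an assumption of ours) to linear TD(0), where each update's gradient is the TD error times the state features.
\begin{verbatim}
# Hedged sketch: pairwise gradient inner products for linear TD(0) semi-gradients.
import numpy as np

def td0_semi_gradients(phi, phi_next, rewards, w, gamma=0.99):
    """Per-transition TD(0) semi-gradients, one row per transition."""
    td_errors = rewards + gamma * (phi_next @ w) - (phi @ w)
    return td_errors[:, None] * phi

def interference_matrix(grads):
    """Pairwise gradient inner products g_i . g_j (the interference studied here)."""
    return grads @ grads.T
\end{verbatim}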
This paper addresses the problem of viewpoint estimation of an object in a given image. It presents five key insights that should be taken into consideration when designing a CNN that solves the problem. Based on these insights, the paper proposes a network in which (i) the architecture jointly solves detection, classification, and viewpoint estimation; (ii) new types of data are added and trained on; and (iii) a novel loss function, which takes into account both the geometry of the problem and the new types of data, is proposed. Our network improves the state-of-the-art results for this problem by 9.8%.
In recent years, correlation filters have shown dominant and spectacular results for visual object tracking. The types of features employed in this family of trackers significantly affect the performance of visual tracking. The ultimate goal is to utilize robust features that are invariant to any kind of appearance change of the object, while predicting the object location as accurately as in the case of no appearance change. With the emergence of deep learning based methods, the study of learning features for specific tasks has accelerated. For instance, discriminative visual tracking methods based on deep architectures have been studied with promising performance. Nevertheless, correlation filter based (CFB) trackers confine themselves to pre-trained networks that were trained for the object classification problem. To this end, in this manuscript the problem of learning deep fully convolutional features for CFB visual tracking is formulated. In order to learn the proposed model, a novel and efficient backpropagation algorithm is presented based on the loss function of the network. The proposed learning framework enables the network model to be flexible for a custom design. Moreover, it alleviates the dependency on networks trained for classification. Extensive performance analysis shows the efficacy of the proposed custom design in the CFB tracking framework. By fine-tuning the convolutional parts of a state-of-the-art network and integrating this model into a CFB tracker that is the top-performing one of VOT2016, an 18% increase is achieved in terms of expected average overlap, and tracking failures are decreased by 25%, while maintaining superiority over state-of-the-art methods on the OTB-2013 and OTB-2015 tracking datasets.
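For readers unfamiliar with CFB tracking, a minimal sketch of a generic single-channel MOSSE-style correlation filter (not the learned deep-feature method proposed in the paper):
\begin{verbatim}
# Hedged, generic correlation filter: closed-form training and detection in the
# Fourier domain; the paper's deep features and backpropagation are not reproduced.
import numpy as np

def train_filter(patch, desired_response, lam=1e-2):
    """Closed-form correlation filter (conjugate spectrum H*) in the Fourier domain."""
    F = np.fft.fft2(patch)                    # spectrum of the target patch
    G = np.fft.fft2(desired_response)         # spectrum of the desired (Gaussian) response
    return (G * np.conj(F)) / (F * np.conj(F) + lam)

def detect(H_conj, patch):
    """Correlate a new patch with the filter and return the spatial response map."""
    return np.real(np.fft.ifft2(H_conj * np.fft.fft2(patch)))
\end{verbatim}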
Current convolutional neural network algorithms for video object tracking spend the same amount of computation on each object and video frame. However, it is harder to track an object in some frames than in others, due to varying amounts of clutter, scene complexity and motion, and to the object's distinctiveness against its background. We propose a depth-adaptive convolutional Siamese network that performs video tracking adaptively at multiple neural network depths. Parametric gating functions are trained to control the depth of the convolutional feature extractor by minimizing a joint loss of computational cost and tracking error. Our network achieves accuracy comparable to the state of the art on the VOT2016 benchmark. Furthermore, our adaptive depth computation achieves higher accuracy for a given computational cost than traditional fixed-structure neural networks. The presented framework extends to other tasks that use convolutional neural networks and enables trading speed for accuracy at runtime.
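Schematically (our notation, not necessarily the paper's exact formulation), the joint objective can be written as
\[
\min_{\theta,\phi}\; \mathbb{E}\Big[\ell_{\mathrm{track}}\big(\theta;\, d_\phi(x)\big) \;+\; \lambda\, C\big(d_\phi(x)\big)\Big],
\]
where $d_\phi(x)$ is the network depth selected by the gating functions for input $x$, $C(\cdot)$ its computational cost, $\theta$ the tracking parameters, and $\lambda$ controls the speed/accuracy trade-off.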