The field of 'explainable' artificial intelligence (XAI) has produced highly cited methods that seek to make the decisions of complex machine learning (ML) methods 'understandable' to humans, for example by attributing 'importance' scores to input features. Yet, a lack of formal underpinning leaves it unclear as to what conclusions can safely be drawn from the results of a given XAI method and has also so far hindered the theoretical verification and empirical validation of XAI methods. This means that challenging non-linear problems, typically solved by deep neural networks, presently lack appropriate remedies. Here, we craft benchmark datasets for three different non-linear classification scenarios, in which the important class-conditional features are known by design, serving as ground truth explanations. Using novel quantitative metrics, we benchmark the explanation performance of a wide set of XAI methods across three deep learning model architectures. We show that popular XAI methods are often unable to significantly outperform random performance baselines and edge detection methods. Moreover, we demonstrate that explanations derived from different model architectures can be vastly different; thus, prone to misinterpretation even under controlled conditions.
Computational analysis with the finite element method requires geometrically accurate meshes. It is well known that high-order meshes can accurately capture curved surfaces with fewer degrees of freedom in comparison to low-order meshes. Existing techniques for high-order mesh generation typically output meshes with same polynomial order for all elements. However, high order elements away from curvilinear boundaries or interfaces increase the computational cost of the simulation without increasing geometric accuracy. In prior work, we have presented one such approach for generating body-fitted uniform-order meshes that takes a given mesh and morphs it to align with the surface of interest prescribed as the zero isocontour of a level-set function. We extend this method to generate mixed-order meshes such that curved surfaces of the domain are discretized with high-order elements, while low-order elements are used elsewhere. Numerical experiments demonstrate the robustness of the approach and show that it can be used to generate mixed-order meshes that are much more efficient than high uniform-order meshes. The proposed approach is purely algebraic, and extends to different types of elements (quadrilaterals/triangles/tetrahedron/hexahedra) in two- and three-dimensions.
We present new Neumann-Neumann algorithms based on a time domain decomposition applied to unconstrained parabolic optimal control problems. After a spatial semi-discretization, the Lagrange multiplier approach provides a coupled forward-backward optimality system, which can be solved using a time domain decomposition. Due to the forward-backward structure of the optimality system, nine variants can be found for the Neumann-Neumann algorithms. We analyze their convergence behavior and determine the optimal relaxation parameter for each algorithm. Our analysis reveals that the most natural algorithms are actually only good smoothers, and there are better choices which lead to efficient solvers. We illustrate our analysis with numerical experiments.
Statistical inference for high dimensional parameters (HDPs) can be based on their intrinsic correlation; that is, parameters that are close spatially or temporally tend to have more similar values. This is why nonlinear mixed-effects models (NMMs) are commonly (and appropriately) used for models with HDPs. Conversely, in many practical applications of NMM, the random effects (REs) are actually correlated HDPs that should remain constant during repeated sampling for frequentist inference. In both scenarios, the inference should be conditional on REs, instead of marginal inference by integrating out REs. In this paper, we first summarize recent theory of conditional inference for NMM, and then propose a bias-corrected RE predictor and confidence interval (CI). We also extend this methodology to accommodate the case where some REs are not associated with data. Simulation studies indicate that this new approach leads to substantial improvement in the conditional coverage rate of RE CIs, including CIs for smooth functions in generalized additive models, as compared to the existing method based on marginal inference.
Tracking ripening tomatoes is time consuming and labor intensive. Artificial intelligence technologies combined with those of computer vision can help users optimize the process of monitoring the ripening status of plants. To this end, we have proposed a tomato ripening monitoring approach based on deep learning in complex scenes. The objective is to detect mature tomatoes and harvest them in a timely manner. The proposed approach is declined in two parts. Firstly, the images of the scene are transmitted to the pre-processing layer. This process allows the detection of areas of interest (area of the image containing tomatoes). Then, these images are used as input to the maturity detection layer. This layer, based on a deep neural network learning algorithm, classifies the tomato thumbnails provided to it in one of the following five categories: green, brittle, pink, pale red, mature red. The experiments are based on images collected from the internet gathered through searches using tomato state across diverse languages including English, German, French, and Spanish. The experimental results of the maturity detection layer on a dataset composed of images of tomatoes taken under the extreme conditions, gave a good classification rate.
In this work, a Generalized Finite Difference (GFD) scheme is presented for effectively computing the numerical solution of a parabolic-elliptic system modelling a bacterial strain with density-suppressed motility. The GFD method is a meshless method known for its simplicity for solving non-linear boundary value problems over irregular geometries. The paper first introduces the basic elements of the GFD method, and then an explicit-implicit scheme is derived. The convergence of the method is proven under a bound for the time step, and an algorithm is provided for its computational implementation. Finally, some examples are considered comparing the results obtained with a regular mesh and an irregular cloud of points.
We propose a novel algorithm for the support estimation of partially known Gaussian graphical models that incorporates prior information about the underlying graph. In contrast to classical approaches that provide a point estimate based on a maximum likelihood or a maximum a posteriori criterion using (simple) priors on the precision matrix, we consider a prior on the graph and rely on annealed Langevin diffusion to generate samples from the posterior distribution. Since the Langevin sampler requires access to the score function of the underlying graph prior, we use graph neural networks to effectively estimate the score from a graph dataset (either available beforehand or generated from a known distribution). Numerical experiments demonstrate the benefits of our approach.
We introduce an extension of first-order logic that comes equipped with additional predicates for reasoning about an abstract state. Sequents in the logic comprise a main formula together with pre- and postconditions in the style of Hoare logic, and the axioms and rules of the logic ensure that the assertions about the state compose in the correct way. The main result of the paper is a realizability interpretation of our logic that extracts programs into a mixed functional/imperative language. All programs expressible in this language act on the state in a sequential manner, and we make this intuition precise by interpreting them in a semantic metatheory using the state monad. Our basic framework is very general, and our intention is that it can be instantiated and extended in a variety of different ways. We outline in detail one such extension: A monadic version of Heyting arithmetic with a wellfounded while rule, and conclude by outlining several other directions for future work.
Emotion recognition in conversation (ERC) has emerged as a research hotspot in domains such as conversational robots and question-answer systems. How to efficiently and adequately retrieve contextual emotional cues has been one of the key challenges in the ERC task. Existing efforts do not fully model the context and employ complex network structures, resulting in limited performance gains. In this paper, we propose a novel emotion recognition network based on curriculum learning strategy (ERNetCL). The proposed ERNetCL primarily consists of temporal encoder (TE), spatial encoder (SE), and curriculum learning (CL) loss. We utilize TE and SE to combine the strengths of previous methods in a simplistic manner to efficiently capture temporal and spatial contextual information in the conversation. To ease the harmful influence resulting from emotion shift and simulate the way humans learn curriculum from easy to hard, we apply the idea of CL to the ERC task to progressively optimize the network parameters. At the beginning of training, we assign lower learning weights to difficult samples. As the epoch increases, the learning weights for these samples are gradually raised. Extensive experiments on four datasets exhibit that our proposed method is effective and dramatically beats other baseline models.
Inequality measures are quantitative measures that take values in the unit interval, with a zero value characterizing perfect equality. Although originally proposed to measure economic inequalities, they can be applied to several other situations, in which one is interested in the mutual variability between a set of observations, rather than in their deviations from the mean. While unidimensional measures of inequality, such as the Gini index, are widely known and employed, multidimensional measures, such as Lorenz Zonoids, are difficult to interpret and computationally expensive and, for these reasons, are not much well known. To overcome the problem, in this paper we propose a new scaling invariant multidimensional inequality index, based on the Fourier transform, which exhibits a number of interesting properties, and whose application to the multidimensional case is rather straightforward to calculate and interpret.
We introduce MCCE: Monte Carlo sampling of valid and realistic Counterfactual Explanations for tabular data, a novel counterfactual explanation method that generates on-manifold, actionable and valid counterfactuals by modeling the joint distribution of the mutable features given the immutable features and the decision. Unlike other on-manifold methods that tend to rely on variational autoencoders and have strict prediction model and data requirements, MCCE handles any type of prediction model and categorical features with more than two levels. MCCE first models the joint distribution of the features and the decision with an autoregressive generative model where the conditionals are estimated using decision trees. Then, it samples a large set of observations from this model, and finally, it removes the samples that do not obey certain criteria. We compare MCCE with a range of state-of-the-art on-manifold counterfactual methods using four well-known data sets and show that MCCE outperforms these methods on all common performance metrics and speed. In particular, including the decision in the modeling process improves the efficiency of the method substantially.