Most research on fair machine learning has prioritized optimizing criteria such as Demographic Parity and Equalized Odds. Despite these efforts, there remains a limited understanding of how different bias mitigation strategies affect individual predictions and whether they introduce arbitrariness into the debiasing process. This paper addresses these gaps by exploring whether models that achieve comparable fairness and accuracy metrics impact the same individuals and mitigate bias in a consistent manner. We introduce the FRAME (FaiRness Arbitrariness and Multiplicity Evaluation) framework, which evaluates bias mitigation through five dimensions: Impact Size (how many people were affected), Change Direction (positive versus negative changes), Decision Rates (impact on models' acceptance rates), Affected Subpopulations (who was affected), and Neglected Subpopulations (where unfairness persists). This framework is intended to help practitioners understand the impacts of debiasing processes and make better-informed decisions regarding model selection. Applying FRAME to various bias mitigation approaches across key datasets allows us to exhibit significant differences in the behaviors of debiasing methods. These findings highlight the limitations of current fairness criteria and the inherent arbitrariness in the debiasing process.
In shape-constrained nonparametric inference, it is often necessary to perform preliminary tests to verify whether a probability mass function (p.m.f.) satisfies qualitative constraints such as monotonicity, convexity or in general $k$-monotonicity. In this paper, we are interested in testing $k$-monotonicity of a compactly supported p.m.f. and we put our main focus on monotonicity and convexity; i.e., $k \in \{1,2\}$. We consider new testing procedures that are directly derived from the definition of $k$-monotonicity and rely exclusively on the empirical measure, as well as tests that are based on the projection of the empirical measure on the class of $k$-monotone p.m.f.s. The asymptotic behaviour of the introduced test statistics is derived and a simulation study is performed to assess the finite sample performance of all the proposed tests. Applications to real datasets are presented to illustrate the theory.
We propose a novel framework of generalised Petrov-Galerkin Dynamical Low Rank Approximations (DLR) in the context of random PDEs. It builds on the standard Dynamical Low Rank Approximations in their Dynamically Orthogonal formulation. It allows to seamlessly build-in many standard and well-studied stabilisation techniques that can be framed as either generalised Galerkin methods, or Petrov-Galerkin methods. The framework is subsequently applied to the case of Streamine Upwind/Petrov Galerkin (SUPG) stabilisation of advection-dominated problems with small stochastic perturbations of the transport field. The norm-stability properties of two time discretisations are analysed. Numerical experiments confirm that the stabilising properties of the SUPG method naturally carry over to the DLR framework.
We address the problem of the best uniform approximation of a continuous function on a convex domain. The approximation is by linear combinations of a finite system of functions (not necessarily Chebyshev) under arbitrary linear constraints. By modifying the concept of alternance and of the Remez iterative procedure we present a method, which demonstrates its efficiency in numerical problems. The linear rate of convergence is proved under some favourable assumptions. A special attention is paid to systems of complex exponents, Gaussian functions, lacunar algebraic and trigonometric polynomials. Applications to signal processing, linear ODE, switching dynamical systems, and to Markov-Bernstein type inequalities are considered.
This work aims to introduce a heuristic timestep-adaptive algorithm for Computational Fluid Dynamics (CFD) and Fluid-Structure Interaction (FSI) problems where the flow is dominated by the pressure. In such scenarios, many time-adaptive algorithms based on the interplay of implicit and explicit time schemes fail to capture the fast transient dynamics of pressure fields. We present an algorithm that relies on a temporal error estimator using Backward Differentiation Formulae (BDF$k$) of order $k=2,3$. Specifically, we demonstrate that the implicit BDF$3$ solution can be well approximated by applying a single Newton-type nonlinear solver correction to the implicit BDF$2$ solution. The difference between these solutions determines our adaptive temporal error estimator. The effectiveness of our approach is confirmed by numerical experiments conducted on a backward-facing step flow CFD test case with Reynolds number $300$ and on a two-dimensional haemodynamics FSI benchmark.
To facilitate effective decision-making, gridded satellite precipitation products should include uncertainty estimates. Machine learning has been proposed for issuing such estimates. However, most existing algorithms for this purpose rely on quantile regression. Distributional regression offers distinct advantages over quantile regression, including the ability to model intermittency as well as a stronger ability to extrapolate beyond the training data, which is critical for predicting extreme precipitation. In this work, we introduce the concept of distributional regression for the engineering task of creating precipitation datasets through data merging. Building upon this concept, we propose new ensemble learning methods that can be valuable not only for spatial prediction but also for prediction problems in general. These methods exploit conditional zero-adjusted probability distributions estimated with generalized additive models for location, scale, and shape (GAMLSS), spline-based GAMLSS and distributional regression forests as well as their ensembles (stacking based on quantile regression, and equal-weight averaging). To identify the most effective methods for our specific problem, we compared them to benchmarks using a large, multi-source precipitation dataset. Stacking emerged as the most successful strategy. Three specific stacking methods achieved the best performance based on the quantile scoring rule, although the ranking of these methods varied across quantile levels. This suggests that a task-specific combination of multiple algorithms could yield significant benefits.
Detecting and quantifying causality is a focal topic in the fields of science, engineering, and interdisciplinary studies. However, causal studies on non-intervention systems attract much attention but remain extremely challenging. To address this challenge, we propose a framework named Interventional Dynamical Causality (IntDC) for such non-intervention systems, along with its computational criterion, Interventional Embedding Entropy (IEE), to quantify causality. The IEE criterion theoretically and numerically enables the deciphering of IntDC solely from observational (non-interventional) time-series data, without requiring any knowledge of dynamical models or real interventions in the considered system. Demonstrations of performance showed the accuracy and robustness of IEE on benchmark simulated systems as well as real-world systems, including the neural connectomes of C. elegans, COVID-19 transmission networks in Japan, and regulatory networks surrounding key circadian genes.
Weakly Supervised Semantic Segmentation (WSSS) employs weak supervision, such as image-level labels, to train the segmentation model. Despite the impressive achievement in recent WSSS methods, we identify that introducing weak labels with high mean Intersection of Union (mIoU) does not guarantee high segmentation performance. Existing studies have emphasized the importance of prioritizing precision and reducing noise to improve overall performance. In the same vein, we propose ORANDNet, an advanced ensemble approach tailored for WSSS. ORANDNet combines Class Activation Maps (CAMs) from two different classifiers to increase the precision of pseudo-masks (PMs). To further mitigate small noise in the PMs, we incorporate curriculum learning. This involves training the segmentation model initially with pairs of smaller-sized images and corresponding PMs, gradually transitioning to the original-sized pairs. By combining the original CAMs of ResNet-50 and ViT, we significantly improve the segmentation performance over the single-best model and the naive ensemble model, respectively. We further extend our ensemble method to CAMs from AMN (ResNet-like) and MCTformer (ViT-like) models, achieving performance benefits in advanced WSSS models. It highlights the potential of our ORANDNet as a final add-on module for WSSS models.
Eigenmaps are important in analysis, geometry, and machine learning, especially in nonlinear dimension reduction. Approximation of the eigenmaps of a Laplace operator depends crucially on the scaling parameter $\epsilon$. If $\epsilon$ is too small or too large, then the approximation is inaccurate or completely breaks down. However, an analytic expression for the optimal $\epsilon$ is out of reach. In our work, we use some explicitly solvable models and Monte Carlo simulations to find the approximately optimal range of $\epsilon$ that gives, on average, relatively accurate approximation of the eigenmaps. Numerically we can consider several model situations where eigen-coordinates can be computed analytically, including intervals with uniform and weighted measures, squares, tori, spheres, and the Sierpinski gasket. In broader terms, we intend to study eigen-coordinates on weighted Riemannian manifolds, possibly with boundary, and on some metric measure spaces, such as fractals.
Most state-of-the-art machine learning techniques revolve around the optimisation of loss functions. Defining appropriate loss functions is therefore critical to successfully solving problems in this field. We present a survey of the most commonly used loss functions for a wide range of different applications, divided into classification, regression, ranking, sample generation and energy based modelling. Overall, we introduce 33 different loss functions and we organise them into an intuitive taxonomy. Each loss function is given a theoretical backing and we describe where it is best used. This survey aims to provide a reference of the most essential loss functions for both beginner and advanced machine learning practitioners.
The goal of explainable Artificial Intelligence (XAI) is to generate human-interpretable explanations, but there are no computationally precise theories of how humans interpret AI generated explanations. The lack of theory means that validation of XAI must be done empirically, on a case-by-case basis, which prevents systematic theory-building in XAI. We propose a psychological theory of how humans draw conclusions from saliency maps, the most common form of XAI explanation, which for the first time allows for precise prediction of explainee inference conditioned on explanation. Our theory posits that absent explanation humans expect the AI to make similar decisions to themselves, and that they interpret an explanation by comparison to the explanations they themselves would give. Comparison is formalized via Shepard's universal law of generalization in a similarity space, a classic theory from cognitive science. A pre-registered user study on AI image classifications with saliency map explanations demonstrate that our theory quantitatively matches participants' predictions of the AI.