Principled accountability in the aftermath of harms is essential to the trustworthy design and governance of algorithmic decision making. Legal philosophy offers a paramount method for assessing culpability: putting the agent 'on the stand' to subject their actions and intentions to cross-examination. We show that under minimal assumptions automated reasoning can rigorously interrogate algorithmic behaviors as in the adversarial process of legal fact finding. We model accountability processes, such as trials or review boards, as Counterfactual-Guided Logic Exploration and Abstraction Refinement (CLEAR) loops. We use an SMT-based oracle to discharge queries about agent behavior in factual and counterfactual scenarios, as adaptively formulated by a human investigator. For a decision algorithm $\mathcal{A}$, we use symbolic execution to represent its logic as a statement $\Pi$ in the decidable theory $\texttt{QF_FPBV}$. We implement our framework in a tool called $\textsf{soid}$ with an accompanying GUI, and demonstrate its utility on an illustrative car crash scenario.
The issue of ensuring privacy for users who share their personal information has been a growing priority in a business and scientific environment where the use of different types of data and the laws that protect it have increased in tandem. Different technologies have been widely developed for static publications, i.e., where the information is published only once, such as k-anonymity and {\epsilon}-differential privacy. In the case where microdata information is published dynamically, although established notions such as m-invariance and {\tau}-safety already exist, developments for improving utility remain superficial. We propose a new heuristic approach for the NP-hard combinatorial problem of m-invariance and {\tau}-safety, which is based on a mathematical optimization column generation scheme. The quality of a solution to m-invariance and {\tau}-safety can be measured by the Information Loss (IL), a value in [0,100], the closer to 0 the better. We show that our approach improves by far current heuristics, providing in some instances solutions with ILs of 1.87, 8.5 and 1.93, while the state-of-the art methods reported ILs of 39.03, 51.84 and 57.97, respectively.
We present a finite element scheme for fractional diffusion problems with varying diffusivity and fractional order. We consider a symmetric integral form of these nonlocal equations defined on general geometries and in arbitrary bounded domains. A number of challenges are encountered when discretizing these equations. The first comes from the heterogeneous kernel singularity in the fractional integral operator. The second comes from the dense discrete operator with its quadratic growth in memory footprint and arithmetic operations. An additional challenge comes from the need to handle volume conditions-the generalization of classical local boundary conditions to the nonlocal setting. Satisfying these conditions requires that the effect of the whole domain, including both the interior and exterior regions, can be computed on every interior point in the discretization. Performed directly, this would result in quadratic complexity. To address these challenges, we propose a strategy that decomposes the stiffness matrix into three components. The first is a sparse matrix that handles the singular near-field separately and is computed by adapting singular quadrature techniques available for the homogeneous case to the case of spatially variable order. The second component handles the remaining smooth part of the near-field as well as the far field and is approximated by a hierarchical $\mathcal{H}^{2}$ matrix that maintains linear complexity in storage and operations. The third component handles the effect of the global mesh at every node and is written as a weighted mass matrix whose density is computed by a fast-multipole type method. The resulting algorithm has therefore overall linear space and time complexity. Analysis of the consistency of the stiffness matrix is provided and numerical experiments are conducted to illustrate the convergence and performance of the proposed algorithm.
Context. Algorithmic racism is the term used to describe the behavior of technological solutions that constrains users based on their ethnicity. Lately, various data-driven software systems have been reported to discriminate against Black people, either for the use of biased data sets or due to the prejudice propagated by software professionals in their code. As a result, Black people are experiencing disadvantages in accessing technology-based services, such as housing, banking, and law enforcement. Goal. This study aims to explore algorithmic racism from the perspective of software professionals. Method. A survey questionnaire was applied to explore the understanding of software practitioners on algorithmic racism, and data analysis was conducted using descriptive statistics and coding techniques. Results. We obtained answers from a sample of 73 software professionals discussing their understanding and perspectives on algorithmic racism in software development. Our results demonstrate that the effects of algorithmic racism are well-known among practitioners. However, there is no consensus on how the problem can be effectively addressed in software engineering. In this paper, some solutions to the problem are proposed based on the professionals' narratives. Conclusion. Combining technical and social strategies, including training on structural racism for software professionals, is the most promising way to address the algorithmic racism problem and its effects on the software solutions delivered to our society.
Given a set of inelastic material models, a microstructure, a macroscopic structural geometry, and a set of boundary conditions, one can in principle always solve the governing equations to determine the system's mechanical response. However, for large systems this procedure can quickly become computationally overwhelming, especially in three-dimensions when the microstructure is locally complex. In such settings multi-scale modeling offers a route to a more efficient model by holding out the promise of a framework with fewer degrees of freedom, which at the same time faithfully represents, up to a certain scale, the behavior of the system. In this paper, we present a methodology that produces such models for inelastic systems upon the basis of a variational scheme. The essence of the scheme is the construction of a variational statement for the free energy as well as the dissipation potential for a coarse scale model in terms of the free energy and dissipation functions of the fine scale model. From the coarse scale energy and dissipation we can then generate coarse scale material models that are computationally far more efficient than either directly solving the fine scale model or by resorting to FE-square type modeling. Moreover, the coarse scale model preserves the essential mathematical structure of the fine scale model. An essential feature for such schemes is the proper definition of the coarse scale inelastic variables. By way of concrete examples, we illustrate the needed steps to generate successful models via application to problems in classical plasticity, included are comparisons to direct numerical simulations of the microstructure to illustrate the accuracy of the proposed methodology.
Many real-world decision-making tasks require learning causal relationships between a set of variables. Traditional causal discovery methods, however, require that all variables are observed, which is often not feasible in practical scenarios. Without additional assumptions about the unobserved variables, it is not possible to recover any causal relationships from observational data. Fortunately, in many applied settings, additional structure among the confounders can be expected. In particular, pervasive confounding is commonly encountered and has been utilized for consistent causal estimation in linear causal models. In this paper, we present a provably consistent method to estimate causal relationships in the non-linear, pervasive confounding setting. The core of our procedure relies on the ability to estimate the confounding variation through a simple spectral decomposition of the observed data matrix. We derive a DAG score function based on this insight, prove its consistency in recovering a correct ordering of the DAG, and empirically compare it to previous approaches. We demonstrate improved performance on both simulated and real datasets by explicitly accounting for both confounders and non-linear effects.
Transformer architectures have facilitated the development of large-scale and general-purpose sequence models for prediction tasks in natural language processing and computer vision, e.g., GPT-3 and Swin Transformer. Although originally designed for prediction problems, it is natural to inquire about their suitability for sequential decision-making and reinforcement learning problems, which are typically beset by long-standing issues involving sample efficiency, credit assignment, and partial observability. In recent years, sequence models, especially the Transformer, have attracted increasing interest in the RL communities, spawning numerous approaches with notable effectiveness and generalizability. This survey presents a comprehensive overview of recent works aimed at solving sequential decision-making tasks with sequence models such as the Transformer, by discussing the connection between sequential decision-making and sequence modeling, and categorizing them based on the way they utilize the Transformer. Moreover, this paper puts forth various potential avenues for future research intending to improve the effectiveness of large sequence models for sequential decision-making, encompassing theoretical foundations, network architectures, algorithms, and efficient training systems. As this article has been accepted by the Frontiers of Computer Science, here is an early version, and the most up-to-date version can be found at //journal.hep.com.cn/fcs/EN/10.1007/s11704-023-2689-5
In this work, we explore a framework for contextual decision-making to study how the relevance and quantity of past data affects the performance of a data-driven policy. We analyze a contextual Newsvendor problem in which a decision-maker needs to trade-off between an underage and an overage cost in the face of uncertain demand. We consider a setting in which past demands observed under ``close by'' contexts come from close by distributions and analyze the performance of data-driven algorithms through a notion of context-dependent worst-case expected regret. We analyze the broad class of Weighted Empirical Risk Minimization (WERM) policies which weigh past data according to their similarity in the contextual space. This class includes classical policies such as ERM, k-Nearest Neighbors and kernel-based policies. Our main methodological contribution is to characterize exactly the worst-case regret of any WERM policy on any given configuration of contexts. To the best of our knowledge, this provides the first understanding of tight performance guarantees in any contextual decision-making problem, with past literature focusing on upper bounds via concentration inequalities. We instead take an optimization approach, and isolate a structure in the Newsvendor loss function that allows to reduce the infinite-dimensional optimization problem over worst-case distributions to a simple line search. This in turn allows us to unveil fundamental insights that were obfuscated by previous general-purpose bounds. We characterize actual guaranteed performance as a function of the contexts, as well as granular insights on the learning curve of algorithms.
We consider the problem of online allocation subject to a long-term fairness penalty. Contrary to existing works, however, we do not assume that the decision-maker observes the protected attributes -- which is often unrealistic in practice. Instead they can purchase data that help estimate them from sources of different quality; and hence reduce the fairness penalty at some cost. We model this problem as a multi-armed bandit problem where each arm corresponds to the choice of a data source, coupled with the online allocation problem. We propose an algorithm that jointly solves both problems and show that it has a regret bounded by $\mathcal{O}(\sqrt{T})$. A key difficulty is that the rewards received by selecting a source are correlated by the fairness penalty, which leads to a need for randomization (despite a stochastic setting). Our algorithm takes into account contextual information available before the source selection, and can adapt to many different fairness notions. We also show that in some instances, the estimates used can be learned on the fly.
Continued model-based decision support is associated with particular challenges, especially in long-term projects. Due to the regularly changing questions and the often changing understanding of the underlying system, the models used must be regularly re-evaluated, -modelled and -implemented with respect to changing modelling purpose, system boundaries and mapped causalities. Usually, this leads to models with continuously growing complexity and volume. In this work we aim to reevaluate the idea of the model family, dating back to the 1990s, and use it to promote this as a mindset in the creation of decision support frameworks in large research projects. The idea is to generally not develop and enhance a single standalone model, but to divide the research tasks into interacting smaller models which specifically correspond to the research question. This strategy comes with many advantages, which we explain using the example of a family of models for decision support in the COVID-19 crisis and corresponding success stories. We describe the individual models, explain their role within the family, and how they are used - individually and with each other.
Foundation models pretrained on diverse data at scale have demonstrated extraordinary capabilities in a wide range of vision and language tasks. When such models are deployed in real world environments, they inevitably interface with other entities and agents. For example, language models are often used to interact with human beings through dialogue, and visual perception models are used to autonomously navigate neighborhood streets. In response to these developments, new paradigms are emerging for training foundation models to interact with other agents and perform long-term reasoning. These paradigms leverage the existence of ever-larger datasets curated for multimodal, multitask, and generalist interaction. Research at the intersection of foundation models and decision making holds tremendous promise for creating powerful new systems that can interact effectively across a diverse range of applications such as dialogue, autonomous driving, healthcare, education, and robotics. In this manuscript, we examine the scope of foundation models for decision making, and provide conceptual tools and technical background for understanding the problem space and exploring new research directions. We review recent approaches that ground foundation models in practical decision making applications through a variety of methods such as prompting, conditional generative modeling, planning, optimal control, and reinforcement learning, and discuss common challenges and open problems in the field.