The IS discourse on the potential of distributed ledger technology (DLT) in financial services has grown at a tremendous pace in recent years. Yet, little has been said about the related implications for the costly and highly regulated process of compliance reporting. Working with a group of representatives from industry and regulatory authorities, we employ the design science research (DSR) methodology in the design, development, and evaluation of an artefact that enables the automated collection and enrichment of transactional data. Our findings indicate that DLT may facilitate the automation of key compliance processes through the implementation of a "pull-model", in which regulators can access compliance data in near real-time to stage aggregate exposures at the supranational level. Generalizing our preliminary results, we present four propositions on the implications of DLT in compliance. The findings contribute new practical insights on the topic of compliance to the growing IS discourse on DLT.
This paper presents the design, analysis, and performance evaluation of an omnidirectional transformable wheel-leg robot called OmniWheg. We design a novel mechanism consisting of a separable omni-wheel and 4-bar linkages, allowing the robot to transform smoothly between omni-wheeled and legged modes. In wheeled mode, the robot can move in all directions and efficiently adjust the relative position of its wheels, while in legged mode it can overcome common obstacles such as stairs and steps. Unlike the designs in other articles studying whegs, this implementation with omnidirectional wheels allows the correction of misalignments between right and left wheels before traversing obstacles, which effectively improves the success rate and simplifies the preparation process before the wheel-leg transformation. We describe the design concept, mechanism, and dynamic characteristics of the wheel-leg structure. We then evaluate its performance in various scenarios, including passing obstacles, climbing steps of different heights, and turning/moving omnidirectionally. Our results confirm that this mobile platform can overcome common indoor obstacles and move flexibly on flat ground with the new transformable wheel-leg mechanism, while keeping a high degree of stability.
We consider a causal inference model in which individuals interact in a social network and may not comply with their assigned treatments. Estimating causal parameters is challenging in the presence of network interference of unknown form, as each individual may be influenced by both close and distant individuals in complex ways. Noncompliance with treatment assignment further complicates this problem, and prior methods that deal with network spillovers but disregard noncompliance may underestimate the effect of treatment receipt on the outcome. To estimate meaningful causal parameters, we introduce a new concept of exposure mapping, which summarizes potentially complicated spillover effects into a fixed-dimensional statistic of instrumental variables. We investigate identification conditions for the intention-to-treat effect and the average causal effect for compliers, while explicitly considering the possibility of misspecification of the exposure mapping. Based on our identification results, we develop nonparametric estimation procedures via inverse probability weighting. Their asymptotic properties, including consistency and asymptotic normality, are investigated using an approximate neighborhood interference framework, which is convenient for dealing with unknown forms of spillovers between individuals. For an empirical illustration, we apply our method to experimental data on the anti-conflict intervention school program.
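As a rough illustration of the inverse probability weighting idea mentioned in this abstract, the following Python sketch computes a Hajek-type weighted difference in mean outcomes between units assigned to treatment and units assigned to control. It is not the paper's estimator: it assumes known assignment probabilities and ignores exposure mappings, noncompliance, and network interference. The names `hajek_itt`, `y`, `z`, and `p` are hypothetical.

```python
def hajek_itt(y, z, p):
    """Hajek-style IPW estimate of an intention-to-treat effect.

    y: outcomes; z: binary assignments (0 or 1);
    p: assignment probabilities P(z = 1), each strictly between 0 and 1.
    Simplifying assumption: no interference between units.
    """
    if not (len(y) == len(z) == len(p)):
        raise ValueError("y, z and p must have the same length")
    # Inverse probability weights for the assigned and non-assigned arms.
    w1 = [zi / pi for zi, pi in zip(z, p)]
    w0 = [(1 - zi) / (1 - pi) for zi, pi in zip(z, p)]
    if sum(w1) == 0 or sum(w0) == 0:
        raise ValueError("both arms need at least one unit")
    # Normalising by the sum of weights gives the Hajek (ratio) form.
    mean1 = sum(w * yi for w, yi in zip(w1, y)) / sum(w1)
    mean0 = sum(w * yi for w, yi in zip(w0, y)) / sum(w0)
    return mean1 - mean0
```

With equal assignment probabilities the estimate reduces to a plain difference in arm means; unequal probabilities reweight units that were unlikely to receive the assignment they got.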
We consider studies where multiple measures on an outcome variable are collected over time, but some subjects drop out before the end of follow-up. Analyses of such data often proceed under either a 'last observation carried forward' or 'missing at random' assumption. We consider two alternative strategies for identification: the first is closely related to the difference-in-differences methodology in the causal inference literature. The second enables correction for violations of the parallel trend assumption, so long as one has access to a valid 'bespoke instrumental variable'. These are compared with existing approaches, first conceptually and then in an analysis of data from the Framingham Heart Study.
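One way to picture the difference-in-differences connection is sketched below in Python. It is an assumption about how a parallel trend argument could be applied to dropout, not the paper's procedure: the mean change observed among completers is added to the baseline mean of the dropouts to impute their unobserved final mean. The function name and its arguments are hypothetical.

```python
from statistics import mean


def did_final_mean(baseline_completers, final_completers, baseline_dropouts):
    """Overall final-visit mean under an assumed parallel trend.

    Assumption: dropouts would have changed, on average, by the same
    amount as completers between baseline and the final visit.
    """
    trend = mean(final_completers) - mean(baseline_completers)
    # Impute the dropouts' unobserved final mean from their baseline mean.
    imputed_dropout_mean = mean(baseline_dropouts) + trend
    n_c, n_d = len(final_completers), len(baseline_dropouts)
    # Pool the observed and imputed means, weighted by group size.
    return (n_c * mean(final_completers) + n_d * imputed_dropout_mean) / (n_c + n_d)
```

If the parallel trend assumption fails, this imputation is biased, which is the situation the second strategy in the abstract is meant to address.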
We bridge two research directions on graph neural networks (GNNs) by formalizing the relation between heterophily of node labels (i.e., connected nodes tend to have dissimilar labels) and the robustness of GNNs to adversarial attacks. Our theoretical and empirical analyses show that for homophilous graph data, impactful structural attacks always lead to reduced homophily, while for heterophilous graph data the change in the homophily level depends on the node degrees. These insights have practical implications for defending against attacks on real-world graphs: we deduce that separate aggregators for ego- and neighbor-embeddings, a design principle which has been identified to significantly improve prediction for heterophilous graph data, can also offer increased robustness to GNNs. Our comprehensive experiments show that GNNs merely adopting this design achieve improved empirical and certifiable robustness compared to the best-performing unvaccinated model. Additionally, combining this design with explicit defense mechanisms against adversarial attacks leads to improved robustness, with up to 18.33% performance increase under attacks compared to the best-performing vaccinated model.
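To make the notion of a homophily level concrete, the sketch below computes the edge homophily ratio, i.e. the fraction of edges whose endpoints share a label. This is one common measure and is assumed here for illustration; the abstract does not say which measure the paper uses. The function name `edge_homophily` and the toy graph are hypothetical.

```python
def edge_homophily(edges, labels):
    """Fraction of edges that connect two nodes with the same label.

    edges: iterable of (u, v) node-index pairs; labels: label per node.
    """
    edges = list(edges)
    if not edges:
        raise ValueError("graph has no edges")
    same = sum(1 for u, v in edges if labels[u] == labels[v])
    return same / len(edges)


# Toy graph: a path 0-1-2-3 with labels 0, 0, 1, 1.
labels = [0, 0, 1, 1]
clean_edges = [(0, 1), (1, 2), (2, 3)]
# A structural perturbation that inserts an edge between nodes
# with different labels lowers the ratio.
attacked_edges = clean_edges + [(0, 3)]
```

On this toy graph the inserted edge joins two differently labelled nodes, so the ratio of the perturbed graph is lower than that of the clean one.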
The extensive adoption of business analytics (BA) has brought financial gains and increased efficiencies. However, these advances have simultaneously drawn attention to rising legal and ethical challenges when BA inform decisions with fairness implications. As a response to these concerns, the emerging study of algorithmic fairness deals with algorithmic outputs that may result in disparate outcomes or other forms of injustices for subgroups of the population, especially those who have been historically marginalized. Fairness is relevant on the basis of legal compliance, social responsibility, and utility; if not adequately and systematically addressed, unfair BA systems may lead to societal harms and may also threaten an organization's own survival, its competitiveness, and overall performance. This paper offers a forward-looking, BA-focused review of algorithmic fairness. We first review the state-of-the-art research on sources and measures of bias, as well as bias mitigation algorithms. We then provide a detailed discussion of the utility-fairness relationship, emphasizing that the frequent assumption of a trade-off between these two constructs is often mistaken or short-sighted. Finally, we chart a path forward by identifying opportunities for business scholars to address impactful, open challenges that are key to the effective and responsible deployment of BA.
We present a novel methodology for repairing neural networks that use ReLU activation functions. Unlike existing methods that rely on modifying the weights of a neural network, which can induce a global change in the function space, our approach applies only a localized change in the function space while still guaranteeing the removal of the buggy behavior. By leveraging the piecewise linear nature of ReLU networks, our approach can efficiently construct a patch network tailored to the linear region where the buggy input resides, which, when combined with the original network, provably corrects the behavior on the buggy input. Our method is both sound and complete -- the repaired network is guaranteed to fix the buggy input, and a patch is guaranteed to be found for any buggy input. Moreover, our approach preserves the continuous piecewise linear nature of ReLU networks, automatically generalizes the repair to all the points including other undetected buggy inputs inside the repair region, is minimal in terms of changes in the function space, and guarantees that outputs on inputs away from the repair region are unaltered. On several benchmarks, we show that our approach significantly outperforms existing methods in terms of locality and limiting negative side effects. Our code is available on GitHub: https://github.com/BU-DEPEND-Lab/REASSURE.
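The piecewise linear property that the method relies on can be illustrated with a small Python sketch. It does not construct a patch network and is not taken from the REASSURE code; it only shows, for a hypothetical one-hidden-layer network, that fixing the activation pattern at an input yields an affine map that agrees with the network everywhere in that linear region. All function names and weights are illustrative assumptions.

```python
def relu_net(x, W1, b1, W2, b2):
    """Forward pass of a one-hidden-layer ReLU network (lists of floats)."""
    hidden = [max(0.0, sum(w * xi for w, xi in zip(row, x)) + b)
              for row, b in zip(W1, b1)]
    return [sum(w * h for w, h in zip(row, hidden)) + b
            for row, b in zip(W2, b2)]


def local_affine_map(x, W1, b1, W2, b2):
    """Activation pattern at x and the affine map (A, c) valid in its region."""
    pre = [sum(w * xi for w, xi in zip(row, x)) + b for row, b in zip(W1, b1)]
    active = [1.0 if p > 0 else 0.0 for p in pre]
    n_hidden, n_in, n_out = len(W1), len(x), len(W2)
    # A = W2 * diag(active) * W1 and c = W2 * diag(active) * b1 + b2.
    A = [[sum(W2[o][k] * active[k] * W1[k][i] for k in range(n_hidden))
          for i in range(n_in)] for o in range(n_out)]
    c = [sum(W2[o][k] * active[k] * b1[k] for k in range(n_hidden)) + b2[o]
         for o in range(n_out)]
    return active, A, c


def apply_affine(A, c, x):
    """Evaluate A x + c."""
    return [sum(a * xi for a, xi in zip(row, x)) + ci for row, ci in zip(A, c)]


# Hypothetical weights, chosen only for illustration.
W1, b1 = [[1.0, -1.0], [0.5, 1.0]], [0.0, -1.0]
W2, b2 = [[1.0, 2.0]], [0.5]
```

Because the network is exactly affine inside each such region, a correction confined to one region can leave the function unchanged elsewhere, which is the locality property the abstract describes.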
The production, shipping, usage, and disposal of consumer goods have a substantial impact on greenhouse gas emissions and the depletion of resources. Machine Learning (ML) can help to foster sustainable consumption patterns by accounting for sustainability aspects in the product search or recommendations of modern retail platforms. However, the lack of large, high-quality, publicly available product data with trustworthy sustainability information impedes the development of ML technology that can help to reach our sustainability goals. Here we present GreenDB, a database that collects products from European online shops on a weekly basis. As a proxy for the products' sustainability, it relies on sustainability labels, which are evaluated by experts. The GreenDB schema extends the well-known schema.org Product definition and can be readily integrated into existing product catalogs. We present initial results demonstrating that ML models trained with our data can reliably (F1 score 96%) predict the sustainability label of products. These contributions can help to complement existing e-commerce experiences and ultimately encourage users toward more sustainable consumption patterns.
We propose a data-driven way to reduce the noise of covariance matrices of nonstationary systems. In the case of stationary systems, asymptotic approaches have been proved to converge to the optimal solutions. Such methods produce eigenvalues that are highly dependent on the inputs, as common sense would suggest. Our approach proposes instead to use a set of eigenvalues that are totally independent of the inputs and that encode the long-term averaging of the influence of the future on present eigenvalues. Such an influence can be the predominant factor in nonstationary systems. Using real and synthetic data, we show that our data-driven method outperforms optimal methods designed for stationary systems for the filtering of both the covariance matrix and its inverse, as illustrated by financial portfolio variance minimization, which makes our method generically relevant to many problems of multivariate inference.
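The general mechanics of filtering a covariance matrix by replacing its eigenvalues can be sketched as follows for the 2x2 case. This is an illustration under assumptions, not the paper's method: it assumes the sample eigenvectors are kept and only the eigenvalues are swapped, and the replacement eigenvalues are supplied by hand rather than learned from data. The function names are hypothetical.

```python
import math


def eig_sym_2x2(a, b, d):
    """Eigen-decompose the symmetric matrix [[a, b], [b, d]].

    Returns (l1, l2, theta) with l1 >= l2 and eigenvectors
    v1 = (cos(theta), sin(theta)), v2 = (-sin(theta), cos(theta)).
    """
    theta = 0.5 * math.atan2(2.0 * b, a - d)
    c, s = math.cos(theta), math.sin(theta)
    l1 = a * c * c + 2.0 * b * c * s + d * s * s
    l2 = a * s * s - 2.0 * b * c * s + d * c * c
    return l1, l2, theta


def substitute_eigenvalues(cov, new_eigs):
    """Keep the eigenvectors of cov, replace its eigenvalues.

    cov: 2x2 symmetric matrix as nested lists.
    new_eigs: (largest, smallest) replacement eigenvalues, an input
    assumed to come from elsewhere (here chosen by hand).
    """
    (a, b), (_, d) = cov
    _, _, theta = eig_sym_2x2(a, b, d)
    c, s = math.cos(theta), math.sin(theta)
    l1, l2 = new_eigs
    # Rebuild l1 * v1 v1^T + l2 * v2 v2^T.
    off_diag = (l1 - l2) * c * s
    return [[l1 * c * c + l2 * s * s, off_diag],
            [off_diag, l1 * s * s + l2 * c * c]]
```

For example, the matrix [[2, 1], [1, 2]] has eigenvalues 3 and 1; substituting (2.5, 1.5) keeps the trace at 4 and halves the off-diagonal entry to 0.5.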
As sustainability becomes an increasing priority throughout global society, academic and research institutions are assessed on their contribution to relevant research publications. This study compares four methods of identifying research publications related to United Nations Sustainable Development Goal 13: climate action. The four methods, Elsevier, STRINGS, SIRIS, and Dimensions, have each developed search strings with the help of subject matter experts, which are then enhanced through distinct methods to produce a final set of publications. Our analysis showed that the methods produced comparable quantities of publications but with little overlap between them. We visualised some differences in topic focus between the methods and drew links with the search strategies used. Differences between the publications retrieved are likely to come from subjective interpretation of the goals, keyword selection, operationalising search strategies, AI enhancements, and selection of bibliographic database. Each of these elements warrants deeper investigation to understand its role in identifying SDG-related research. Before choosing any method to assess the research contribution to SDGs, end users of SDG data should carefully consider their interpretation of the goal and determine which of the available methods produces the closest dataset. Meanwhile, data providers might customise their methods for varying interpretations of the SDGs.
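Overlap between the publication sets returned by different methods can be quantified in several ways. The Python sketch below uses the Jaccard index on sets of publication identifiers; this choice is an assumption, since the abstract does not state which overlap measure the study used, and the function names are hypothetical.

```python
from itertools import combinations


def jaccard(a, b):
    """Jaccard index of two sets: size of intersection over size of union."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def pairwise_overlap(sets_by_method):
    """Jaccard index for every pair of methods, keyed by (name1, name2)."""
    return {
        (m1, m2): jaccard(sets_by_method[m1], sets_by_method[m2])
        for m1, m2 in combinations(sorted(sets_by_method), 2)
    }
```

A value near 0 for a pair of methods means they retrieve almost disjoint sets even when the set sizes are similar, which matches the pattern of comparable quantities but little overlap described in the abstract.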
Detecting and mitigating harmful biases in modern language models are widely recognized as crucial, open problems. In this paper, we take a step back and investigate how language models come to be biased in the first place. We use a relatively small language model based on the LSTM architecture, trained on an English Wikipedia corpus. With full access to the data and to the model parameters as they change at every step of training, we can map in detail how the representation of gender develops, what patterns in the dataset drive this, and how the model's internal state relates to the bias in a downstream task (semantic textual similarity). We find that the representation of gender is dynamic and identify different phases during training. Furthermore, we show that gender information is represented increasingly locally in the input embeddings of the model and that, as a consequence, debiasing these can be effective in reducing the downstream bias. Monitoring the training dynamics allows us to detect an asymmetry in how the female and male genders are represented in the input embeddings. This is important, as it may cause naive mitigation strategies to introduce new undesirable biases. We discuss the relevance of the findings for mitigation strategies more generally and the prospects of generalizing our methods to larger language models, the Transformer architecture, other languages and other undesirable biases.