Differential privacy (DP) has become the gold standard in privacy-preserving data analytics, but applying it to real-world datasets and systems remains challenging. Recently developed DP tools aim to ease data practitioners' burden in implementing DP solutions, but limited research has investigated these DP tools' usability. Through a usability study with 24 US data practitioners with varying prior DP knowledge, we comprehensively evaluate the usability of four Python-based open-source DP tools: DiffPrivLib, Tumult Analytics, PipelineDP, and OpenDP. Our results suggest that DP tools can help novices learn DP concepts; that Application Programming Interface (API) design and documentation are vital for learnability and error prevention; and that user satisfaction correlates highly with the effectiveness of the tool. We discuss the balance between ease of use and the learning curve needed to implement DP appropriately, and we provide recommendations for improving DP tools' usability to broaden adoption.
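As a rough illustration of the kind of primitive such tools wrap, a differentially private mean can be released by adding Laplace noise calibrated to the query's sensitivity. The snippet below is a minimal NumPy sketch of this idea, assuming bounded data and a public record count; it is not the API of any of the four tools studied.

    import numpy as np

    def dp_mean(data, epsilon, lower, upper):
        """Release an epsilon-DP mean of data clipped to [lower, upper]."""
        n = len(data)                       # assumed public
        clipped = np.clip(data, lower, upper)
        # Sensitivity of the mean of n values bounded in [lower, upper].
        sensitivity = (upper - lower) / n
        noise = np.random.laplace(loc=0.0, scale=sensitivity / epsilon)
        return clipped.mean() + noise

    ages = np.array([23, 35, 47, 52, 61, 29, 44])
    print(dp_mean(ages, epsilon=1.0, lower=18, upper=90))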
Neural networks have shown remarkable performance in computer vision, but their deployment in numerous scientific and technical fields is challenging due to their black-box nature. Scientists and practitioners need to evaluate the reliability of a decision, i.e., to know simultaneously whether a model relies on the relevant features and whether these features are robust to image corruptions. Existing attribution methods aim to provide human-understandable explanations by highlighting important regions in the image domain, but they fail to fully characterize the reliability of a decision process. To bridge this gap, we introduce the Wavelet sCale Attribution Method (WCAM), a generalization of attribution from the pixel domain to the space-scale domain using wavelet transforms. Attribution in the wavelet domain reveals where and on what scales the model focuses, enabling us to assess whether a decision is reliable. Our code is accessible here: \url{https://github.com/gabrielkasmi/spectral-attribution}.
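To make the idea concrete, one can decompose an image with a 2D discrete wavelet transform and measure how the model's score reacts when the detail coefficients at a given scale are perturbed. The sketch below uses PyWavelets and a placeholder model function; it is only a conceptual approximation of scale-wise attribution, not the authors' WCAM implementation.

    import numpy as np
    import pywt  # PyWavelets

    def scale_sensitivity(image, model, wavelet="haar", levels=3):
        """Rough per-scale importance: zero out the detail coefficients at
        one decomposition level, reconstruct, and record the drop in the
        model's score (coeffs are ordered from coarsest to finest)."""
        coeffs = pywt.wavedec2(image, wavelet, level=levels)
        base_score = model(image)
        sensitivities = []
        for lvl in range(1, levels + 1):
            perturbed = [coeffs[0]] + [
                tuple(np.zeros_like(d) for d in details) if i == lvl else details
                for i, details in enumerate(coeffs[1:], start=1)
            ]
            recon = pywt.waverec2(perturbed, wavelet)
            sensitivities.append(base_score - model(recon))
        return sensitivities  # one value per detail level (scale)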
In the high-performance computing (HPC) domain, performance variability is a major scalability issue for parallel applications with heavy synchronization and communication. In this paper, we present an experimental performance analysis of OpenMP benchmarks with respect to variation in execution time and identify potential factors causing performance variability. Our work offers an understanding of the resulting performance distributions and directions for future work on mitigating variability in OpenMP-based applications. Two representative OpenMP benchmarks, from the EPCC OpenMP micro-benchmark suite and BabelStream, are run across two x86 multicore platforms featuring up to 256 threads. From the obtained results, we characterize and explain execution time variability as a function of thread pinning, simultaneous multithreading (SMT), and core frequency variation.
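A typical way to collect such distributions is to run a benchmark repeatedly under different thread-pinning settings and compare the spread of execution times. The Python harness below is a simplified sketch, not the methodology of the paper; the benchmark binary path is hypothetical, and the pinning is controlled through the standard OMP_PROC_BIND and OMP_PLACES environment variables.

    import os
    import statistics
    import subprocess
    import time

    BENCH = "./syncbench"          # hypothetical EPCC-style benchmark binary
    PINNINGS = {
        "no-pinning":   {},
        "cores-close":  {"OMP_PROC_BIND": "close", "OMP_PLACES": "cores"},
        "cores-spread": {"OMP_PROC_BIND": "spread", "OMP_PLACES": "cores"},
    }

    for name, extra_env in PINNINGS.items():
        env = {**os.environ, "OMP_NUM_THREADS": "64", **extra_env}
        times = []
        for _ in range(30):                      # repeated runs per setting
            start = time.perf_counter()
            subprocess.run([BENCH], env=env, check=True,
                           stdout=subprocess.DEVNULL)
            times.append(time.perf_counter() - start)
        print(f"{name}: median={statistics.median(times):.3f}s "
              f"stdev={statistics.stdev(times):.3f}s")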
Artificial Intelligence (AI) has achieved significant advancements in technology and research with the development over several decades, and is widely used in many areas including computing vision, natural language processing, time-series analysis, speech synthesis, etc. During the age of deep learning, especially with the arise of Large Language Models, a large majority of researchers' attention is paid on pursuing new state-of-the-art (SOTA) results, resulting in ever increasing of model size and computational complexity. The needs for high computing power brings higher carbon emission and undermines research fairness by preventing small or medium-sized research institutions and companies with limited funding in participating in research. To tackle the challenges of computing resources and environmental impact of AI, Green Computing has become a hot research topic. In this survey, we give a systematic overview of the technologies used in Green Computing. We propose the framework of Green Computing and devide it into four key components: (1) Measures of Greenness, (2) Energy-Efficient AI, (3) Energy-Efficient Computing Systems and (4) AI Use Cases for Sustainability. For each components, we discuss the research progress made and the commonly used techniques to optimize the AI efficiency. We conclude that this new research direction has the potential to address the conflicts between resource constraints and AI development. We encourage more researchers to put attention on this direction and make AI more environmental friendly.
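As one example of a measure of greenness, the operational carbon footprint of a training run is commonly estimated from average power draw, runtime, data-center overhead (PUE), and the grid's carbon intensity. The snippet below is a back-of-the-envelope sketch with illustrative numbers, not figures taken from the survey.

    def training_carbon_kg(avg_power_watts, hours, pue, grid_kgco2_per_kwh):
        """Estimate the operational CO2 (kg) of a training run."""
        energy_kwh = avg_power_watts / 1000.0 * hours * pue
        return energy_kwh * grid_kgco2_per_kwh

    # Illustrative example: 8 GPUs at ~300 W each for 72 hours,
    # PUE 1.5, grid intensity 0.4 kgCO2/kWh.
    print(training_carbon_kg(8 * 300, 72, 1.5, 0.4))  # ~103.7 kg CO2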
The increasing computational and memory requirements of Deep Learning (DL) workloads have led to outstanding innovations in hardware architectures. An archetype of such architectures is the novel Versal AI Engine (AIE) by AMD/Xilinx. The AIE comprises multiple programmable processors optimized for vector-based algorithms. An AIE array of 400 processor cores operating at 1.25 GHz can deliver a peak throughput of 8 TFLOPs for 32-bit floating-point (fp32) and 128 TOPs for 8-bit integer (int8) precision. In this work, we propose MaxEVA: a novel framework to efficiently map Matrix Multiplication (MatMul) workloads on Versal AIE devices. Our framework maximizes the performance and energy efficiency of MatMul applications by efficiently exploiting features of the AIE architecture and resolving performance bottlenecks from multiple angles. When demonstrated on the VC1902 device of the VCK190 board, MaxEVA achieves up to 5.44 TFLOPs and 77.01 TOPs throughput for fp32 and int8 precisions, respectively. In terms of energy efficiency, MaxEVA attains up to 85.11 GFLOPs/W for fp32 and 1.73 TOPs/W for int8. Our proposed method substantially outperforms the state-of-the-art approach, exhibiting up to a 2.19x throughput gain and 29.4% higher energy efficiency. The MaxEVA framework provides notable insights to fill the knowledge gap in effectively designing MatMul-based DL workloads on the new Versal AIE devices.
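The quoted peak figures follow directly from the per-core compute rate: 400 cores at 1.25 GHz reach 8 TFLOPs if each core sustains 16 fp32 operations per cycle, and 128 TOPs at 256 int8 operations per cycle. The short calculation below uses per-cycle rates inferred from the stated peaks (counting a multiply-accumulate as two operations), not values quoted from the AIE datasheet.

    cores, freq_hz = 400, 1.25e9

    # Per-cycle operation counts implied by the stated peak numbers.
    fp32_ops_per_cycle = 16    # e.g. 8 fp32 MACs per core per cycle
    int8_ops_per_cycle = 256   # e.g. 128 int8 MACs per core per cycle

    print(cores * freq_hz * fp32_ops_per_cycle / 1e12)  # 8.0   TFLOPs peak
    print(cores * freq_hz * int8_ops_per_cycle / 1e12)  # 128.0 TOPs peak
    print(5.44 / 8.0, 77.01 / 128.0)                    # ~68% and ~60% of peak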
Binarization is a powerful compression technique for neural networks that significantly reduces FLOPs, but it often causes a substantial drop in model performance. To address this issue, partial binarization techniques have been developed, but a systematic approach to mixing binary and full-precision parameters in a single network is still lacking. In this paper, we propose a controlled approach to partial binarization, creating a budgeted binary neural network (B2NN) with our MixBin strategy. The method optimizes the mix of binary and full-precision components and allows explicit selection of the fraction of the network to remain binary. Our experiments show that B2NNs created using MixBin outperform those obtained from random or iterative searches and from state-of-the-art layer selection methods by up to 3% on the ImageNet-1K dataset. We also show that B2NNs outperform a structured pruning baseline by approximately 23% at the extreme FLOP budget of 15% and perform well in object tracking, with up to a 12.4% relative improvement over other baselines. Additionally, we demonstrate that B2NNs developed by MixBin can be transferred across datasets, in some cases performing better than applying MixBin directly on the downstream data.
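To illustrate the underlying operation, partial binarization keeps some layers in full precision and replaces the weights of the remaining layers with their sign scaled by the mean absolute value. The PyTorch sketch below applies this to a chosen fraction of layers and is a simplified stand-in for, not a reproduction of, the MixBin selection strategy; the layer-selection heuristic (binarizing the later layers) is purely illustrative.

    import torch
    import torch.nn as nn

    @torch.no_grad()
    def binarize_fraction(model, fraction=0.5):
        """Binarize the weights of the last `fraction` of Linear/Conv layers
        (sign(w) scaled by mean |w|); earlier layers stay full precision."""
        layers = [m for m in model.modules()
                  if isinstance(m, (nn.Linear, nn.Conv2d))]
        n_binary = int(round(fraction * len(layers)))
        for layer in layers[len(layers) - n_binary:]:
            w = layer.weight
            layer.weight.copy_(torch.sign(w) * w.abs().mean())
        return model

    model = nn.Sequential(nn.Conv2d(3, 16, 3), nn.ReLU(),
                          nn.Conv2d(16, 32, 3), nn.ReLU(),
                          nn.Flatten(), nn.Linear(32 * 28 * 28, 10))
    binarize_fraction(model, fraction=0.5)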
Resource reservation is a fundamental mechanism for ensuring quality of service in time-sensitive networks and can be decentralized by using reservation protocols. In the Ethernet technology Time-Sensitive Networking, this has been proposed in conjunction with the Credit-Based Shaper. For the reservation, the standards assume a maximum worst-case latency bound at each hop. However, we show through formal analysis and simulation that these worst-case latency bounds are not safe. To address this, we propose an extension to the current standards that allows the reservation of time-sensitive traffic with reliable latency guarantees. The effectiveness of our approach is demonstrated through simulations of both synthetic and industrial networks. Finally, by providing additional information about neighboring devices, we can further increase the maximum reservable traffic by up to 20% in our test cases.
Our goal is to perform out-of-distribution (OOD) detection, i.e., to detect when a robot is operating in environments drawn from a different distribution than the one used to train it. We leverage Probably Approximately Correct (PAC)-Bayes theory to train a policy with a guaranteed bound on performance on the training distribution. Our idea for OOD detection relies on the following intuition: violation of the performance bound on test environments provides evidence that the robot is operating OOD. We formalize this via statistical techniques based on p-values and concentration inequalities. The approach provides guaranteed confidence bounds on OOD detection, including bounds on both the false positive and false negative rates of the detector; it is also task-driven, i.e., sensitive only to changes that impact the robot's performance. We demonstrate our approach in simulation and on hardware for a grasping task with objects of unfamiliar shapes or poses and for a drone performing vision-based obstacle avoidance in environments with wind disturbances and varied obstacle densities. Our examples demonstrate that we can perform task-driven OOD detection within just a handful of trials.
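Concretely, if training guarantees an expected success rate of at least p* on the training distribution, then observing a lower empirical success rate over n independent test rollouts yields, via Hoeffding's inequality, a p-value for the null hypothesis that the robot is still in distribution. The snippet below is a minimal sketch of this style of test, not the authors' full detector.

    import math

    def ood_p_value(guaranteed_rate, successes, n_trials):
        """Hoeffding upper bound on the probability of an empirical success
        rate this low if the robot were still in distribution."""
        empirical = successes / n_trials
        if empirical >= guaranteed_rate:
            return 1.0                     # no evidence of distribution shift
        gap = guaranteed_rate - empirical
        return math.exp(-2.0 * n_trials * gap ** 2)

    # e.g. a guaranteed 90% success rate, but only 4 of 10 test trials succeed.
    print(ood_p_value(0.90, successes=4, n_trials=10))  # ~0.0067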
There has been recent interest in proposing quantum protocols whose security relies on weaker computational assumptions than their classical counterparts. Importantly for our work, it has recently been shown that public-key encryption (PKE) from one-way functions (OWF) is possible if we consider quantum public keys. Note that we do not expect classical PKE from OWF, given the impossibility results of Impagliazzo and Rudich (STOC'89). However, the distribution of quantum public keys is a challenging task. Therefore, the main question that motivates our work is whether quantum PKE from OWF is possible when public keys are classical. Such protocols are impossible if ciphertexts are also classical, given the impossibility result of Austrin et al. (CRYPTO'22) for quantum-enhanced key agreement (KA) with classical communication. In this paper, we focus on a black-box separation for PKE with classical public keys and quantum ciphertexts from OWF under the polynomial compatibility conjecture, first introduced by Austrin et al. More precisely, we show the separation when the decryption algorithm of the PKE does not query the OWF. We prove our result by extending the techniques of Austrin et al., and we show an attack on KA in an extended classical communication model where the last message in the protocol can be a quantum state.
In human-computer interaction (HCI), it is common to evaluate the value of HCI designs, techniques, devices, and systems in terms of their benefit to users. It is less common to discuss the benefit of HCI to computers. Every HCI task allows a computer to receive some data from the user. In many situations, the data received by the computer embodies human knowledge and intelligence in handling complex problems, and/or some critical information without which the computer cannot proceed. In this paper, we present an information-theoretic framework for quantifying the knowledge received by a computer from its users via HCI. We apply information-theoretic measures to some common HCI tasks as well as HCI tasks in complex data intelligence processes. We formalize methods for estimating such quantities analytically and measuring them empirically. Using theoretical reasoning, we confirm the significant but often undervalued role of HCI in data intelligence workflows.
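As a simple instance of such a measure, the information a computer can receive from a single user action is bounded by the Shannon entropy of the action distribution; for example, a choice among N equally likely options conveys at most log2(N) bits. The snippet below computes this for illustrative distributions and is not drawn from the paper itself.

    import math

    def shannon_entropy_bits(probabilities):
        """Entropy (in bits) of a user's action distribution."""
        return -sum(p * math.log2(p) for p in probabilities if p > 0)

    # A uniform choice among 8 menu items: exactly 3 bits per selection.
    print(shannon_entropy_bits([1 / 8] * 8))          # 3.0
    # A heavily skewed choice conveys far less information on average.
    print(shannon_entropy_bits([0.9, 0.05, 0.05]))    # ~0.57 bits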
The era of big data provides researchers with convenient access to copious data, yet people often have little knowledge about these data. The increasing prevalence of big data challenges traditional methods of learning causality, because those methods were developed for settings with a limited amount of data and solid prior causal knowledge. This survey aims to close the gap between big data and learning causality with a comprehensive and structured review of traditional and frontier methods and a discussion of open problems in learning causality. We begin with preliminaries of learning causality. We then categorize and revisit methods of learning causality for typical problems and data types. After that, we discuss the connections between learning causality and machine learning. Finally, we present open problems that show the great potential of learning causality with data.
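As one concrete example of learning causality from observational data, the average treatment effect can be estimated with inverse propensity weighting, where a model of treatment assignment re-weights observed outcomes. The sketch below uses scikit-learn and synthetic data; it illustrates only one family of methods discussed in the survey.

    import numpy as np
    from sklearn.linear_model import LogisticRegression

    rng = np.random.default_rng(0)
    n = 5000
    x = rng.normal(size=(n, 2))                       # observed confounders
    t = rng.binomial(1, 1 / (1 + np.exp(-x[:, 0])))   # treatment depends on x
    y = 2.0 * t + x[:, 0] + rng.normal(size=n)        # true effect = 2.0

    # Estimate propensity scores e(x) = P(T=1 | x), then apply IPW.
    e = LogisticRegression().fit(x, t).predict_proba(x)[:, 1]
    ate = np.mean(t * y / e) - np.mean((1 - t) * y / (1 - e))
    print(ate)  # close to the true effect of 2.0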