Usability is crucial yet one of the most neglected concerns in open source software (OSS). While far from an ideal approach, a common practice that OSS communities adopt to address usability collaboratively is discussion on issue tracking systems (ITSs). However, little is known about the extent to which OSS community members engage in usability issue discussions, the aspects of usability they most frequently target, or the characteristics of their collaboration around these discussions. This knowledge is important for providing practical recommendations and research directions that better support OSS communities in addressing this topic and improving OSS usability in general. Toward this goal, we performed an extensive empirical study of issues discussed in five popular OSS applications: three data science notebook projects (Jupyter Lab, Google Colab, and CoCalc) and two code editor projects (VSCode and Atom). Our results indicate that while usability issues are extensively discussed in these OSS projects, their scope tends to be limited to efficiency and aesthetics. Additionally, these issues are more frequently posted by experienced community members and display distinguishable characteristics, such as involving more visual communication and more participants. Our results provide important implications that can inform OSS practitioners on how to better engage the community in usability issue discussions, and they shed light on future research toward collaboration techniques and tools for discussing niche topics in diverse communities, such as usability in the OSS context.
Most modern computing tasks have digital electronic input and output data. Because of this constraint imposed by real-world use cases of computer systems, any analog computing accelerator, whether analog electronic or optical, must perform a digital-to-analog conversion on its input data and a subsequent analog-to-digital conversion on its output data. The energy and latency costs incurred by data conversion place performance limits on analog computing accelerators. To avoid this overhead, analog hardware must replace the full functionality of traditional digital electronic computer hardware. This is not currently possible for optical computing accelerators due to limitations in gain, input-output isolation, and information storage in optical hardware. This article presents a case study that profiles 27 benchmarks for an analog optical Fourier transform and convolution accelerator that we designed and built. The case study shows that an ideal optical Fourier transform and convolution accelerator can produce an average speedup of 9.4 times and a median speedup of 1.9 times for the set of benchmarks. The accelerator only produces significant speedup for pure Fourier transform (45.3 times) and convolution (159.4 times) applications.
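As a point of reference, the digital baseline that such an accelerator targets is FFT-based convolution, which the sketch below implements with NumPy; the function and array sizes are illustrative and not taken from the benchmark set. An optical accelerator would perform the transforms in analog optics but would still pay the conversion costs at the input and output.

```python
# Illustrative digital baseline: 2-D circular convolution via the convolution
# theorem, i.e. pointwise multiplication in the Fourier domain. An optical
# accelerator would replace the two forward FFTs and the inverse FFT with
# analog optics, but still pays for the DAC/ADC conversions at its boundaries.
import numpy as np

def fft_convolve2d(image: np.ndarray, kernel: np.ndarray) -> np.ndarray:
    """Circular 2-D convolution of `image` with `kernel` using FFTs."""
    # Zero-pad the kernel to the image size so the spectra can be multiplied.
    padded = np.zeros_like(image, dtype=complex)
    kh, kw = kernel.shape
    padded[:kh, :kw] = kernel
    # Convolution theorem: conv(a, b) = IFFT( FFT(a) * FFT(b) ).
    return np.real(np.fft.ifft2(np.fft.fft2(image) * np.fft.fft2(padded)))

image = np.random.rand(256, 256)
kernel = np.ones((3, 3)) / 9.0          # simple box blur
blurred = fft_convolve2d(image, kernel)
```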
This is a first draft of a quick primer on using Python (and relevant libraries) to build a wireless communication prototype that supports multiple-input multiple-output (MIMO) systems with orthogonal frequency division multiplexing (OFDM), in addition to some machine learning use cases. The primer is intended to empower researchers with a means to create simulations efficiently. This draft is aligned with the syllabus of a graduate course we created for Fall 2022, and we intend to update it occasionally based on feedback from the larger research community.
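As a taste of the kind of simulation the primer builds up to, here is a minimal OFDM transmit/receive sketch in NumPy; the parameter names (N_sc, N_cp) and the QPSK mapping are illustrative choices rather than anything prescribed by the course material.

```python
# Minimal OFDM chain: map bits to QPSK symbols, take an IFFT across
# subcarriers, prepend a cyclic prefix, then undo both steps at the receiver.
import numpy as np

N_sc = 64        # number of subcarriers (illustrative)
N_cp = 16        # cyclic prefix length (illustrative)

rng = np.random.default_rng(0)
bits = rng.integers(0, 2, size=2 * N_sc)

# QPSK: two bits per subcarrier, unit average energy.
symbols = ((1 - 2 * bits[0::2]) + 1j * (1 - 2 * bits[1::2])) / np.sqrt(2)

# OFDM modulation: IFFT across subcarriers, then prepend the cyclic prefix.
time_domain = np.fft.ifft(symbols) * np.sqrt(N_sc)
ofdm_symbol = np.concatenate([time_domain[-N_cp:], time_domain])

# The receiver strips the prefix and applies an FFT to recover the symbols.
recovered = np.fft.fft(ofdm_symbol[N_cp:]) / np.sqrt(N_sc)
assert np.allclose(recovered, symbols)
```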
In many real-world problems, labeled training data is limited while unlabeled data is abundant. We propose a new method, Generative Posterior Networks (GPNs), that uses unlabeled data to estimate epistemic uncertainty in high-dimensional problems. A GPN is a generative model that, given a prior distribution over functions, approximates the posterior distribution directly by regularizing the network towards samples from the prior. We prove theoretically that our method indeed approximates the Bayesian posterior and show empirically that it improves epistemic uncertainty estimation and scalability over competing methods.
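The core idea can be illustrated with a small, hypothetical sketch (not the authors' exact architecture): fit the labeled data while regularizing predictions on unlabeled inputs toward a sample drawn from a prior over functions, here approximated by a frozen, randomly initialized network.

```python
# A minimal, hypothetical sketch of prior-sample regularization with unlabeled
# data; the architecture, regularizer weight, and data are illustrative only.
import torch
import torch.nn as nn

def make_net():
    return nn.Sequential(nn.Linear(1, 64), nn.ReLU(), nn.Linear(64, 1))

net = make_net()                 # network being trained
prior_sample = make_net()        # frozen random init: one draw from the "prior"
for p in prior_sample.parameters():
    p.requires_grad_(False)

x_lab = torch.linspace(-1, 1, 32).unsqueeze(1)
y_lab = torch.sin(3 * x_lab) + 0.1 * torch.randn_like(x_lab)
x_unlab = torch.linspace(-3, 3, 256).unsqueeze(1)   # abundant unlabeled inputs

opt = torch.optim.Adam(net.parameters(), lr=1e-3)
lam = 0.1                        # strength of the prior regularizer (assumed)
for _ in range(2000):
    opt.zero_grad()
    fit = ((net(x_lab) - y_lab) ** 2).mean()            # fit the labeled data
    reg = ((net(x_unlab) - prior_sample(x_unlab)) ** 2).mean()  # pull toward prior
    (fit + lam * reg).backward()
    opt.step()
```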
This paper presents the adaptive software security model, an innovative approach integrating the MAPE-K loop with the Software Development Life Cycle (SDLC). It proactively embeds security policies throughout development, reducing vulnerabilities introduced at different stages of software engineering. Three primary contributions (MAPE-K integration, SDLC embedding, and analytical insights) converge to create a comprehensive approach for strengthening software systems against security threats. This research represents a paradigm shift, aligning security measures with agile software development and ensuring continuous improvement in the face of evolving threats. The model emerges as a robust solution to the pressing need for adaptive software security strategies in modern software development. We analytically discuss the advantages of the proposed model.
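For concreteness, the sketch below shows the shape of a MAPE-K style loop (Monitor, Analyze, Plan, Execute over shared Knowledge) applied to a security concern; the component names follow the standard autonomic-computing pattern, while the concrete policy and actions are placeholders rather than the model's actual rules.

```python
# Schematic MAPE-K loop for adaptive security; thresholds and actions are
# placeholders, not the paper's policies.
from dataclasses import dataclass, field

@dataclass
class Knowledge:
    """Shared knowledge base consulted by every phase of the loop."""
    policies: dict = field(default_factory=lambda: {"max_failed_logins": 5})
    history: list = field(default_factory=list)

def monitor(system_state: dict, k: Knowledge) -> dict:
    k.history.append(system_state)          # record observations
    return system_state

def analyze(state: dict, k: Knowledge) -> bool:
    return state.get("failed_logins", 0) > k.policies["max_failed_logins"]

def plan(k: Knowledge) -> list:
    return ["lock_account", "alert_security_team"]   # placeholder mitigations

def execute(actions: list) -> None:
    for action in actions:
        print(f"executing: {action}")

def mape_k_iteration(system_state: dict, k: Knowledge) -> None:
    state = monitor(system_state, k)
    if analyze(state, k):
        execute(plan(k))

k = Knowledge()
mape_k_iteration({"failed_logins": 7}, k)
```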
Recent advances in long-context Large Language Models (LCLMs) have generated significant interest, especially in applications such as querying scientific research papers. However, their potential is often limited by inadequate context utilization. We identify the absence of long-range semantic dependencies in typical training data as a primary hindrance. To address this, we examine the benefits of frequently incorporating related documents into training inputs. Using the inherent directory structure of code data as a source of training examples, we demonstrate improvements in perplexity, even for tasks unrelated to coding. Building on these findings, but with a broader focus, we introduce Structured Packing for Long Context (SPLiCe), a method that creates training examples by using retrieval to collate the most mutually relevant documents into a single training context. Our results indicate that SPLiCe enhances model performance and can be used to train large models to better utilize long contexts. We validate our results by training a 3B-parameter model, showing both perplexity improvements and better long-context performance on downstream tasks.
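A simplified sketch of retrieval-based packing in this spirit is shown below: starting from a seed document, it greedily appends the most similar remaining document until a token budget is filled. The TF-IDF similarity measure and the budget are illustrative stand-ins, not SPLiCe's exact retrieval setup.

```python
# Greedy, retrieval-based packing of related documents into one training
# context; similarity measure and token budget are illustrative choices.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

def pack_context(docs: list[str], seed_idx: int, token_budget: int) -> list[int]:
    vecs = TfidfVectorizer().fit_transform(docs)
    packed, used = [seed_idx], {seed_idx}
    tokens = len(docs[seed_idx].split())
    while tokens < token_budget and len(used) < len(docs):
        # Rank remaining documents by similarity to the most recently added one.
        sims = cosine_similarity(vecs[packed[-1]], vecs).ravel()
        nxt = [i for i in sims.argsort()[::-1] if i not in used][0]
        packed.append(nxt)
        used.add(nxt)
        tokens += len(docs[nxt].split())
    # Indices of documents to concatenate into a single training example.
    return packed
```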
The digital twin (DT) is a recurring feature in discussions about future technologies, bringing together advanced communication, computation, and artificial intelligence, to name a few. In the context of Industry 4.0, industries such as manufacturing, automotive, and healthcare are rapidly adopting DT-based development. The main challenges to date have been the high demands on communication and computing resources, as well as privacy and security concerns, arising from the large volumes of data exchanged. To achieve low-latency and highly secure services in emerging DTs, multi-tier computing has been proposed, combining edge/fog computing and cloud computing. Specifically, low-latency data transmission, efficient resource allocation, and validated security strategies of multi-tier computing systems are used to solve the operational problems of the DT system. In this paper, we introduce the architecture and applications of DTs using examples from manufacturing, the Internet of Vehicles, and healthcare. We also study the architecture and technologies of multi-tier computing systems that support DTs. This paper provides a valuable reference and guidance for the theory, algorithms, and applications of collaborative multi-tier computing and DTs.
We propose Stable Yet Memory Bounded Open-Loop (SYMBOL) planning, a general memory-bounded approach to partially observable open-loop planning. SYMBOL maintains an adaptive stack of Thompson Sampling bandits, whose size is bounded by the planning horizon and can be automatically adapted to the underlying domain without any prior domain knowledge beyond a generative model. We empirically test SYMBOL on four large POMDP benchmark problems to demonstrate its effectiveness and robustness with respect to the choice of hyperparameters, and we evaluate its adaptive memory consumption. We also compare its performance with other open-loop planning algorithms and POMCP.
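The following hedged sketch illustrates the general idea of open-loop planning with a stack of Thompson Sampling bandits, one per plan depth, updated from returns produced by a generative model; the Gaussian bandit model and its update rule are simple stand-ins rather than SYMBOL's exact formulation.

```python
# Open-loop planning with one Thompson Sampling bandit per plan depth.
# Each bandit samples an action, the whole sequence is rolled out in a
# generative model, and every bandit is updated with the observed return.
import random

class GaussianThompsonBandit:
    def __init__(self, n_actions: int):
        self.mean = [0.0] * n_actions
        self.count = [1] * n_actions          # pseudo-count for a vague prior

    def sample_action(self) -> int:
        draws = [random.gauss(m, 1.0 / c) for m, c in zip(self.mean, self.count)]
        return max(range(len(draws)), key=draws.__getitem__)

    def update(self, action: int, ret: float) -> None:
        self.count[action] += 1
        self.mean[action] += (ret - self.mean[action]) / self.count[action]

def plan(simulate, n_actions: int, horizon: int, budget: int) -> list[int]:
    """`simulate(actions)` is a generative model returning the episode return."""
    stack = [GaussianThompsonBandit(n_actions) for _ in range(horizon)]
    for _ in range(budget):
        actions = [b.sample_action() for b in stack]   # open-loop action sequence
        ret = simulate(actions)
        for b, a in zip(stack, actions):
            b.update(a, ret)
    # Return the greedy open-loop plan after the simulation budget is spent.
    return [max(range(n_actions), key=b.mean.__getitem__) for b in stack]
```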
Advances in artificial intelligence often stem from the development of new environments that abstract real-world situations into a form where research can be done conveniently. This paper contributes such an environment based on ideas inspired by elementary microeconomics. Agents learn to produce resources in a spatially complex world, trade them with one another, and consume those that they prefer. We show that the emergent production, consumption, and pricing behaviors respond to environmental conditions in the directions predicted by supply and demand shifts in microeconomics. We also demonstrate settings where the agents' emergent prices for goods vary over space, reflecting the local abundance of goods. After the price disparities emerge, some agents discover a niche of transporting goods between regions with different prevailing prices -- a profitable strategy because they can buy goods where they are cheap and sell them where they are expensive. Finally, in a series of ablation experiments, we investigate how choices in the environmental rewards, bartering actions, agent architecture, and ability to consume tradable goods can either aid or inhibit the emergence of this economic behavior. This work is part of the environment development branch of a research program that aims to build human-like artificial general intelligence through multi-agent interactions in simulated societies. By exploring which environment features are needed for the basic phenomena of elementary microeconomics to emerge automatically from learning, we arrive at an environment that differs from those studied in prior multi-agent reinforcement learning work along several dimensions. For example, the model incorporates heterogeneous tastes and physical abilities, and agents negotiate with one another as a grounded form of communication.
Graph neural networks (GNNs) are widely used to learn powerful representations of graph-structured data. Recent work demonstrates that transferring knowledge from self-supervised tasks to downstream tasks can further improve graph representations. However, there is an inherent gap between self-supervised tasks and downstream tasks in terms of optimization objective and training data. Conventional pre-training methods may not be effective enough at knowledge transfer since they do not make any adaptation for downstream tasks. To solve these problems, we propose a new transfer learning paradigm for GNNs that effectively leverages self-supervised tasks as auxiliary tasks to help the target task. Our method adaptively selects and combines different auxiliary tasks with the target task in the fine-tuning stage. We design an adaptive auxiliary loss weighting model that learns the weights of auxiliary tasks by quantifying the consistency between each auxiliary task and the target task, and we learn the weighting model through meta-learning. Our method can be applied to various transfer learning approaches; it performs well not only in multi-task learning but also in pre-training and fine-tuning. Comprehensive experiments on multiple downstream tasks demonstrate that the proposed method effectively combines auxiliary tasks with the target task and significantly improves performance compared to state-of-the-art methods.
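As a rough illustration of weighting auxiliary tasks by their consistency with the target task, the sketch below down-weights auxiliary losses whose gradients conflict with the target gradient; this gradient-similarity heuristic is a simple stand-in for the learned, meta-optimized weighting model described above.

```python
# Combine a target-task loss with auxiliary losses, weighting each auxiliary
# loss by how well its gradient agrees with the target gradient.
import torch

def weighted_multitask_loss(model, target_loss, aux_losses):
    params = [p for p in model.parameters() if p.requires_grad]
    g_target = torch.cat([g.flatten() for g in
                          torch.autograd.grad(target_loss, params, retain_graph=True)])
    total = target_loss
    for aux in aux_losses:
        g_aux = torch.cat([g.flatten() for g in
                           torch.autograd.grad(aux, params, retain_graph=True)])
        # Down-weight auxiliary tasks whose gradients conflict with the target task.
        w = torch.clamp(torch.cosine_similarity(g_target, g_aux, dim=0), min=0.0)
        total = total + w.detach() * aux
    return total
```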
How can we estimate the importance of nodes in a knowledge graph (KG)? A KG is a multi-relational graph that has proven valuable for many tasks, including question answering and semantic search. In this paper, we present GENI, a method for estimating node importance in KGs, which enables several downstream applications such as item recommendation and resource allocation. While a number of approaches have been developed to address this problem for general graphs, they do not fully utilize the information available in KGs, or they lack the flexibility needed to model the complex relationship between entities and their importance. To address these limitations, we explore supervised machine learning algorithms. In particular, building upon recent advances in graph neural networks (GNNs), we develop GENI, a GNN-based method designed to deal with the distinctive challenges of predicting node importance in KGs. Instead of aggregating node embeddings, our method aggregates importance scores via a predicate-aware attention mechanism and a flexible centrality adjustment. In our evaluation of GENI and existing methods on predicting node importance in real-world KGs with different characteristics, GENI achieves 5-17% higher NDCG@100 than the state of the art.
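A simplified sketch of score aggregation in this spirit appears below: each node carries a scalar importance score, neighbor scores are combined with attention weights that depend on the connecting predicate, and the result is adjusted by a degree-based centrality term; the parameterization is illustrative and differs from GENI's.

```python
# Aggregate scalar importance scores over a knowledge graph with
# predicate-dependent attention and a simple log-degree centrality factor.
import math
from collections import defaultdict

def aggregate_scores(edges, scores, predicate_weight, num_iters=2):
    """edges: list of (head, predicate, tail) triples; scores: dict node -> float."""
    in_deg = defaultdict(int)
    for _, _, t in edges:
        in_deg[t] += 1
    for _ in range(num_iters):
        incoming = defaultdict(list)
        for h, p, t in edges:
            # Predicate-aware attention logit attached to the (head -> tail) message.
            incoming[t].append((predicate_weight.get(p, 0.0), scores[h]))
        new_scores = dict(scores)          # nodes with no in-edges keep their score
        for node, msgs in incoming.items():
            z = sum(math.exp(w) for w, _ in msgs)
            new_scores[node] = sum(math.exp(w) / z * s for w, s in msgs)
        # Centrality adjustment: scale by a log in-degree factor (a simple proxy).
        scores = {n: s * (1.0 + math.log(1 + in_deg[n])) for n, s in new_scores.items()}
    return scores
```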