Profile-guided optimization is an effective technique for improving the optimization ability of compilers based on dynamic behavior, but collecting profile data is expensive and cumbersome, and the data requires regular updating to remain fresh. We present a novel statistical approach to inferring branch probabilities that improves the performance of programs that are compiled without profile-guided optimization. We perform offline training using information collected from a large corpus of binaries that have branch probability information. The learned model is used by the compiler to predict the branch probabilities of regular uninstrumented programs, which the compiler can then use to inform optimization decisions. We integrate our technique directly in LLVM, supplementing the existing human-engineered compiler heuristics. We evaluate our technique on a suite of benchmarks, demonstrating some gains over compiling without profile information. In deployment, our technique requires no profiling runs and has negligible effect on compilation time.
The design of effective online caching policies is an increasingly important problem for content distribution networks, online social networks and edge computing services, among other areas. This paper proposes a new algorithmic toolbox for tackling this problem through the lens of optimistic online learning. We build upon the Follow-the-Regularized-Leader (FTRL) framework, which is developed further here to include predictions for the file requests, and we design online caching algorithms for bipartite networks with fixed-size caches or elastic leased caches subject to time-average budget constraints. The predictions are provided by a content recommendation system that influences the users' viewing activity, and hence can naturally reduce the caching network's uncertainty about future requests. We prove that the proposed optimistic learning caching policies can achieve sub-zero performance loss (regret) for perfect predictions, and maintain the best achievable regret bound $O(\sqrt T)$ even for arbitrarily bad predictions. The performance of the proposed algorithms is evaluated with detailed trace-driven numerical tests.
The design of effective online caching policies is an increasingly important problem for content distribution networks, online social networks and edge computing services, among other areas. This paper proposes a new algorithmic toolbox for tackling this problem through the lens of optimistic online learning. We build upon the Follow-the-Regularized-Leader (FTRL) framework, which is developed further here to include predictions for the file requests, and we design online caching algorithms for bipartite networks with fixed-size caches or elastic leased caches subject to time-average budget constraints. The predictions are provided by a content recommendation system that influences the users' viewing activity and hence can naturally reduce the caching network's uncertainty about future requests. We also extend the framework to learn and utilize the best request predictor in cases where many are available. We prove that the proposed optimistic learning caching policies can achieve sub-zero performance loss (regret) for perfect predictions, and maintain the sub-linear regret bound $O(\sqrt T)$, which is the best achievable bound for policies that do not use predictions, even for arbitrarily bad predictions. The performance of the proposed algorithms is evaluated with detailed trace-driven numerical tests.
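The idea of adding a request prediction to an FTRL caching policy can be illustrated with a minimal sketch. It is not the paper's algorithm: it assumes a single fixed-size cache with fractional caching, a Euclidean regularizer with a fixed rate `eta` (the abstract does not specify the regularizer), and one-hot request predictions. All function and parameter names are hypothetical.

```python
def project_capped_simplex(y, capacity, iters=60):
    """Euclidean projection of y onto {x : 0 <= x_i <= 1, sum(x) = capacity}.

    Bisection on the shift tau in x_i = clip(y_i - tau, 0, 1).
    Assumes 0 <= capacity <= len(y).
    """
    lo, hi = min(y) - 1.0, max(y)  # sum is len(y) at lo and 0 at hi
    for _ in range(iters):
        tau = (lo + hi) / 2.0
        total = sum(min(1.0, max(0.0, v - tau)) for v in y)
        if total > capacity:
            lo = tau
        else:
            hi = tau
    tau = (lo + hi) / 2.0
    return [min(1.0, max(0.0, v - tau)) for v in y]


def optimistic_ftrl_cache(requests, predictions, n_files, capacity, eta=0.5):
    """Return the total (fractional) cache hits over the request sequence.

    requests[t] is the requested file index; predictions[t] is the predicted
    index for slot t, or None when no prediction is available.
    """
    grad_sum = [0.0] * n_files  # accumulated one-hot utility gradients
    hits = 0.0
    for req, pred in zip(requests, predictions):
        score = list(grad_sum)
        if pred is not None:
            score[pred] += 1.0  # optimistic term: predicted next gradient
        # FTRL step with a quadratic regularizer reduces to a projection.
        x = project_capped_simplex([eta * s for s in score], capacity)
        hits += x[req]
        grad_sum[req] += 1.0
    return hits
```

With `predictions` set to the true requests, the policy shifts cache mass toward the file that is about to be requested; with all predictions set to `None`, it falls back to plain FTRL. The adaptive regularization needed for the stated regret guarantees is deliberately left out of this sketch.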
Recent state-of-the-art computer vision systems are trained from natural language supervision, ranging from simple object category names to descriptive captions. This free form of supervision ensures high generality and usability of the learned visual models, based on extensive heuristics on data collection to cover as many visual concepts as possible. Alternatively, learning with external knowledge about images is a promising approach that leverages a much more structured source of supervision. In this paper, we propose K-LITE (Knowledge-augmented Language-Image Training and Evaluation), a simple strategy to leverage external knowledge to build transferable visual systems: in training, it enriches entities in natural language with WordNet and Wiktionary knowledge, leading to an efficient and scalable approach to learning image representations that can understand both visual concepts and their knowledge; in evaluation, the natural language is also augmented with external knowledge and then used to reference learned visual concepts (or describe new ones) to enable zero-shot and few-shot transfer of the pre-trained models. We study the performance of K-LITE on two important computer vision problems, image classification and object detection, benchmarking on 20 and 13 different existing datasets, respectively. The proposed knowledge-augmented models show significant improvement in transfer learning performance over existing methods.
Reinforcement learning (RL) has shown promise as a tool for engineering safe, ethical, or legal behaviour in autonomous agents. Its use typically relies on assigning punishments to state-action pairs that constitute unsafe or unethical choices. Although this assignment is a crucial step in the approach, there has been limited discussion on generalizing the process of selecting punishments and deciding where to apply them. In this paper, we adopt an approach that leverages an existing framework -- the normative supervisor of Neufeld et al. (2021) -- during training. This normative supervisor is used to dynamically translate states and the applicable normative system into defeasible deontic logic theories, feed these theories to a theorem prover, and use the conclusions derived to decide whether or not to assign a punishment to the agent. We use multi-objective RL (MORL) to balance the ethical objective of avoiding violations with a non-ethical objective; we demonstrate that our approach works for a multiplicity of MORL techniques, and show that it is effective regardless of the magnitude of the punishment we assign.
Learning program representations has been the core prerequisite of code intelligence tasks such as code search and code clone detection. State-of-the-art pre-trained models such as CodeBERT require the availability of large-scale code corpora. However, gathering training samples can be costly and infeasible for domain-specific languages such as Solidity for smart contracts. In this paper, we propose Zecoler, a zero-shot learning approach for code representations. Zecoler is built upon a pre-trained programming language model. In order to elicit knowledge from the pre-trained models efficiently, Zecoler casts the downstream tasks to the same form as the pre-training tasks by inserting trainable prompts into the original input. Then, it employs the prompt learning technique, which optimizes the pre-trained model by merely adjusting the original input. This enables the representation model to efficiently fit the scarce task-oriented data while reusing pre-trained knowledge. We evaluate Zecoler on three code intelligence tasks in two programming languages that have no training samples, namely Solidity and Go, with a model trained on corpora of common languages such as Java. Experimental results show that our approach significantly outperforms baseline models in both zero-shot and few-shot settings.
In this study, we examine a clustering problem in which the covariates of each individual element in a dataset are associated with an uncertainty specific to that element. More specifically, we consider a clustering approach in which a pre-processing step that applies a non-linear transformation to the covariates is used to capture the hidden data structure. To this end, we empirically approximate the sets representing the propagated uncertainty for the pre-processed features. To exploit the empirical uncertainty sets, we propose a greedy and optimistic clustering (GOC) algorithm that finds better feature candidates over such sets, yielding more condensed clusters. As an important application, we apply the GOC algorithm to synthetic datasets of the orbital properties of stars generated through our numerical simulation mimicking the formation process of the Milky Way. The GOC algorithm demonstrates improved performance in finding sibling stars originating from the same dwarf galaxy. These realistic datasets have also been made publicly available.
This paper is devoted to a practical method for ferroalloy consumption modeling and optimization. We consider the problem of selecting the optimal process control parameters based on the analysis of historical data from sensors. We developed an approach that predicts the results of chemical reactions and gives ferroalloy consumption recommendations. The main features of our method are easy interpretation and noise resistance. Our approach is based on the k-means clustering algorithm, decision trees and linear regression. The main idea of the method is to identify situations in which processes proceed similarly. For this, we propose using a k-means based dataset clustering algorithm and a classification algorithm to determine the cluster. This algorithm can also be applied to various technological processes; in this article, we demonstrate its application in metallurgy. To test the proposed method, we used it to optimize ferroalloy consumption in Basic Oxygen Furnace steelmaking when finishing steel in a ladle furnace. The minimum required element content for a given steel grade was selected as the predictive model's target variable, and the required amount of the element to be added to the melt as the optimized variable. Keywords: Clustering, Machine Learning, Linear Regression, Steelmaking, Optimization, Gradient Boosting, Artificial Intelligence, Decision Trees, Recommendation services
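The cluster-then-regress idea described above can be sketched as follows. This is a simplified illustration under stated assumptions, not the authors' implementation: the process context is a single scalar, a nearest-centroid rule stands in for the decision-tree classifier that determines the cluster, and each cluster gets a one-variable least-squares line relating the amount added to the resulting element content. All names are hypothetical.

```python
def nearest(centroids, v):
    """Index of the centroid closest to v (stand-in for the cluster classifier)."""
    return min(range(len(centroids)), key=lambda j: abs(v - centroids[j]))


def kmeans_1d(values, k, iters=50):
    """Lloyd's algorithm on scalars with a deterministic, spread-out start."""
    srt = sorted(values)
    centroids = [srt[(2 * i + 1) * len(srt) // (2 * k)] for i in range(k)]
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for v in values:
            groups[nearest(centroids, v)].append(v)
        # Keep the old centroid when a cluster ends up empty.
        centroids = [sum(g) / len(g) if g else c
                     for g, c in zip(groups, centroids)]
    return centroids


def fit_line(xs, ys):
    """Ordinary least squares for y = a * x + b; (0, 0) if there is no data."""
    n = len(xs)
    if n == 0:
        return 0.0, 0.0
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    a = sxy / sxx if sxx else 0.0
    return a, my - a * mx


def fit_clustered_model(context, added, content, k):
    """Cluster heats by context, then fit content = a * added + b per cluster."""
    centroids = kmeans_1d(context, k)
    models = []
    for j in range(k):
        idx = [i for i, c in enumerate(context) if nearest(centroids, c) == j]
        models.append(fit_line([added[i] for i in idx],
                               [content[i] for i in idx]))
    return centroids, models


def recommend_amount(centroids, models, context_value, target_content):
    """Invert the cluster's line to find the amount that reaches the target."""
    a, b = models[nearest(centroids, context_value)]
    if a == 0.0:
        raise ValueError("cluster model has zero slope; no recommendation")
    return (target_content - b) / a
```

Inverting a per-cluster linear model keeps the recommendation easy to interpret, which matches the emphasis on interpretability in the abstract; a real process would use several context features and more than one additive.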
Mechanism design is a central research branch in microeconomics. An effective mechanism can significantly improve the performance and efficiency of social decisions under desired objectives, such as maximizing social welfare or maximizing revenue for agents. However, mechanism design is challenging for many common models, including the public project problem model, which we study in this thesis. A typical public project problem is a group of agents crowdfunding a public project (e.g., building a bridge). The mechanism decides the payment and allocation for each agent (e.g., how much the agent pays, and whether the agent can use the project) according to their valuations. The mechanism can be applied to various economic scenarios, including those related to cyber security. Different public project scenarios (sub-problems) have different constraints and optimization objectives, making it unrealistic to design a universal mechanism that fits all scenarios, and designing mechanisms for different settings manually is a taxing job. Therefore, we explore automated mechanism design (AMD) of public project problems under different constraints. In this thesis, we focus on the public project problem, which includes many sub-problems (excludable/non-excludable, divisible/indivisible, binary/non-binary). We study the classical public project model and extend this model to other related areas such as the zero-day exploit markets. For different sub-problems of the public project problem, we adopt different novel machine learning techniques to design optimal or near-optimal mechanisms via automated mechanism design. We evaluate our mechanisms through theoretical analysis or by experimental comparison against existing mechanisms. The experiments and theoretical results show that our mechanisms are better than state-of-the-art automated or manual mechanisms.
Graph machine learning has been extensively studied in both academia and industry. However, as the literature on graph learning booms with a vast number of emerging methods and techniques, it becomes increasingly difficult to manually design the optimal machine learning algorithm for different graph-related tasks. To tackle this challenge, automated graph machine learning, which aims at discovering the best hyper-parameter and neural architecture configuration for different graph tasks/data without manual design, is gaining increasing attention from the research community. In this paper, we extensively discuss automated graph machine learning approaches, covering hyper-parameter optimization (HPO) and neural architecture search (NAS) for graph machine learning. We briefly overview existing libraries designed for either graph machine learning or automated machine learning, and further introduce in depth AutoGL, our dedicated library and the world's first open-source library for automated graph machine learning. Last but not least, we share our insights on future research directions for automated graph machine learning. This paper is the first systematic and comprehensive discussion of approaches, libraries as well as directions for automated graph machine learning.
The time and effort involved in hand-designing deep neural networks is immense. This has prompted the development of Neural Architecture Search (NAS) techniques to automate this design. However, NAS algorithms tend to be slow and expensive; they need to train vast numbers of candidate networks to inform the search process. This could be alleviated if we could partially predict a network's trained accuracy from its initial state. In this work, we examine the overlap of activations between datapoints in untrained networks and motivate how this can give a measure which is usefully indicative of a network's trained performance. We incorporate this measure into a simple algorithm that allows us to search for powerful networks without any training in a matter of seconds on a single GPU, and verify its effectiveness on NAS-Bench-101, NAS-Bench-201, NATS-Bench, and Network Design Spaces. Our approach can be readily combined with more expensive search methods; we examine a simple adaptation of regularised evolutionary search. Code for reproducing our experiments is available at //github.com/BayesWatch/nas-without-training.
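One way to turn "overlap of activations between datapoints" into a number can be sketched as follows. The abstract does not define the measure, so the specific score here (log-determinant of a kernel counting the ReLU units on which two inputs agree) is an assumption for illustration, as are all function names. The sketch uses a single random linear layer followed by ReLU instead of a full network.

```python
import math


def relu_codes(inputs, weights):
    """Binary code per input: 1 where a hidden ReLU unit is active, else 0."""
    return [[1 if sum(w * x for w, x in zip(row, inp)) > 0 else 0
             for row in weights]
            for inp in inputs]


def overlap_kernel(codes):
    """K[i][j] = number of units on which inputs i and j share the same state."""
    return [[sum(1 for a, b in zip(ci, cj) if a == b) for cj in codes]
            for ci in codes]


def log_abs_det(matrix):
    """log|det| via Gaussian elimination with partial pivoting; -inf if singular."""
    m = [[float(v) for v in row] for row in matrix]
    n = len(m)
    total = 0.0
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            return float("-inf")
        m[col], m[pivot] = m[pivot], m[col]
        total += math.log(abs(m[col][col]))
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n):
                m[r][c] -= f * m[col][c]
    return total


def overlap_score(codes):
    """Higher when datapoints produce more distinct activation patterns."""
    return log_abs_det(overlap_kernel(codes))
```

Under this assumed score, an untrained network that maps different inputs to identical activation patterns gets a singular kernel and the lowest possible score, while one that separates inputs scores higher. A search loop would sample candidate architectures, compute the score on one minibatch at initialization, and keep the best-scoring candidate.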