While most methods for solving mixed-integer optimization problems compute a single optimal solution, a diverse set of near-optimal solutions can often lead to improved outcomes. We present a new method for finding a set of diverse solutions by emphasizing diversity within the search for near-optimal solutions. Specifically, within a branch-and-bound framework, we investigate parameterized node selection rules that explicitly consider diversity. Our results indicate that our approach significantly increases the diversity of the final solution set. Compared with two existing methods, our method runs in a similar time to regular node selection rules and gives a diversity improvement of between 12% and 190%. In contrast, popular node selection rules, such as best-first search, in some instances perform worse than state-of-the-art methods by more than 35% and give an improvement of no more than 130%. Further, we find that our method is most effective when diversity in node selection is continuously emphasized after reaching a minimum depth in the tree and once the solution set has grown sufficiently large. Our method can be easily incorporated into integer programming solvers and has the potential to significantly increase the diversity of solution sets.
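To illustrate the kind of rule studied here (the exact scoring used in the paper is not reproduced), the sketch below blends the usual best-bound criterion with a diversity term measuring how far a node's relaxation solution lies from the solutions collected so far; the weight `lambda_div` and the thresholds `min_depth` and `min_pool` are hypothetical parameters introduced only for illustration.

```python
import numpy as np

def node_score(node, pool, lambda_div=0.5, min_depth=5, min_pool=3):
    """Hypothetical parameterized node selection score for branch-and-bound (minimization).

    node.bound -- LP relaxation bound of the node (lower is better)
    node.depth -- depth of the node in the search tree
    node.x     -- solution of the node's LP relaxation
    pool       -- list of near-optimal solutions collected so far
    """
    score = node.bound  # plain best-first: pick the open node with the best bound
    # Emphasize diversity only once the node is deep enough and the pool is large enough.
    if node.depth >= min_depth and len(pool) >= min_pool:
        # Diversity proxy: distance from the node's relaxation to its nearest pooled solution.
        diversity = min(np.linalg.norm(node.x - s) for s in pool)
        score -= lambda_div * diversity  # more diverse nodes get a better (lower) score
    return score

# Inside the solver loop one would select the open node with the smallest score:
# next_node = min(open_nodes, key=lambda n: node_score(n, pool))
```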
We introduce and study the online pause and resume problem. In this problem, a player attempts to find the $k$ lowest (alternatively, highest) prices in a sequence of fixed length $T$, which is revealed sequentially. At each time step, the player is presented with a price and decides whether to accept or reject it. The player incurs a switching cost whenever their decision changes in consecutive time steps, i.e., whenever they pause or resume purchasing. This online problem is motivated by the goal of carbon-aware load shifting, where a workload may be paused during periods of high carbon intensity and resumed during periods of low carbon intensity, incurring a cost whenever its state is saved or restored. It has strong connections to existing problems studied in the literature on online optimization, though it introduces unique technical challenges that prevent the direct application of existing algorithms. Extending prior work on threshold-based algorithms, we introduce double-threshold algorithms for both the minimization and maximization variants of this problem. We further show that the competitive ratios achieved by these algorithms are the best achievable by any deterministic online algorithm. Finally, we empirically validate our proposed algorithms through case studies on carbon-aware load shifting, using real carbon trace data and comparing against existing baseline algorithms.
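To convey the flavor of a two-threshold decision rule with a switching cost, here is a deliberately simplified hysteresis-style sketch of the minimization variant; the two thresholds are free parameters here, whereas the paper derives its thresholds from the competitive analysis, so this is an illustration of the structure rather than the proposed algorithm.

```python
def pause_and_resume_min(prices, k, accept_thresh, resume_thresh, switch_cost):
    """Illustrative double-threshold (hysteresis) rule for buying k low prices online.

    Two different thresholds are used depending on whether the player is currently
    purchasing, so that small price fluctuations do not trigger repeated, costly
    pause/resume switches.

    accept_thresh -- price below which an active player keeps purchasing (assumed)
    resume_thresh -- stricter price below which a paused player resumes   (assumed)
    switch_cost   -- cost incurred whenever the decision flips
    """
    purchased, cost, active = 0, 0.0, False
    for t, price in enumerate(prices):
        remaining = len(prices) - t
        threshold = accept_thresh if active else resume_thresh
        # Forced purchases: if the deadline would otherwise be missed, accept regardless.
        must_buy = remaining <= (k - purchased)
        buy = purchased < k and (price <= threshold or must_buy)
        if buy != active:
            cost += switch_cost            # pay the switching cost on every decision change
            active = buy
        if buy:
            cost += price
            purchased += 1
    return cost
```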
In the context of simulation-based methods, multiple challenges arise, two of which are considered in this work. The first challenge is that problems involving time-dependent phenomena with complex domain deformations, potentially even with changes in the domain topology, need to be tackled appropriately. The second challenge arises when computational resources and the time for evaluating the model become critical in so-called many-query scenarios for parametric problems. Such problems occur, for example, in optimization, uncertainty quantification (UQ), or automatic control, where using highly resolved full-order models (FOMs) may become impractical. To address both types of complexity, we present a novel projection-based model order reduction (MOR) approach for deforming domain problems that takes advantage of the time-continuous space-time formulation. We apply it to two examples that are relevant for engineering or biomedical applications and conduct an error and performance analysis. In both cases, we are able to drastically reduce the computational expense of a model evaluation while maintaining an adequate level of accuracy. All in all, this work indicates the effectiveness of the presented MOR approach for deforming domain problems in a time-continuous space-time setting.
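For readers unfamiliar with projection-based MOR, the sketch below shows the generic offline/online building block (a POD basis extracted from snapshots, followed by Galerkin projection of a linear full-order system); it does not reproduce the paper's space-time formulation or its treatment of deforming domains.

```python
import numpy as np

def pod_basis(snapshots, rank):
    """Offline stage: build a POD basis from full-order snapshots.

    snapshots -- array of shape (n_dofs, n_snapshots) collected from FOM runs
    rank      -- number of retained POD modes
    """
    U, _, _ = np.linalg.svd(snapshots, full_matrices=False)
    return U[:, :rank]                      # reduced basis V; columns are POD modes

def reduced_solve(A, b, V):
    """Online stage: Galerkin projection of a (here linear) FOM system A u = b."""
    A_r = V.T @ A @ V                       # rank x rank reduced operator
    b_r = V.T @ b
    u_r = np.linalg.solve(A_r, b_r)         # cheap reduced solve
    return V @ u_r                          # lift back to the full-order space
```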
Orthogonality constraints naturally appear in many machine learning problems, from Principal Component Analysis to robust neural network training. They are usually solved using Riemannian optimization algorithms, which minimize the objective function while enforcing the constraint. However, enforcing the orthogonality constraint can be the most time-consuming operation in such algorithms. Recently, Ablin & Peyré (2022) proposed the landing algorithm, a method with cheap iterations that does not enforce the orthogonality constraint but is attracted towards the manifold in a smooth manner. In this article, we provide new practical and theoretical developments for the landing algorithm. First, the method is extended to the Stiefel manifold, the set of rectangular orthogonal matrices. We also consider stochastic and variance-reduction algorithms for the case where the cost function is an average of many functions. We demonstrate that all these methods have the same rate of convergence as their Riemannian counterparts that exactly enforce the constraint. Finally, our experiments demonstrate the promise of our approach on an array of machine learning problems that involve orthogonality constraints.
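The sketch below shows a landing-style update on the Stiefel manifold as we understand it from the general description: a relative-gradient descent component plus an attraction term toward the constraint set, with no retraction. The exact scaling of the two terms in the published algorithm may differ, so treat this as a hedged illustration rather than the authors' implementation.

```python
import numpy as np

def skew(M):
    return 0.5 * (M - M.T)

def landing_step(X, grad_f, eta=1e-2, lam=1.0):
    """One landing-style update for X in {X in R^{n x p} : X^T X = I_p}.

    Instead of retracting onto the manifold after each step, the iterate follows a
    field combining a descent direction with a smooth pull toward the constraint set.
    """
    n, p = X.shape
    G = grad_f(X)                                  # Euclidean gradient of f at X
    descent = skew(G @ X.T) @ X                    # relative-gradient descent component
    attraction = X @ (X.T @ X - np.eye(p))         # gradient of 1/4 * ||X^T X - I_p||^2
    return X - eta * (descent + lam * attraction)
```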
Estimating treatment effects conditional on observed covariates can improve the ability to tailor treatments to particular individuals. Doing so effectively requires dealing with potential confounding and having enough data to adequately estimate effect moderation. A recent influx of work has looked into estimating treatment effect heterogeneity using data from multiple randomized controlled trials and/or observational datasets. With many new methods available for assessing treatment effect heterogeneity using multiple studies, it is important to understand which methods are best used in which setting, how the methods compare to one another, and what needs to be done to continue progress in this field. This paper reviews these methods, broken down by data setting: aggregate-level data, federated learning, and individual participant-level data. We define the conditional average treatment effect, discuss differences between parametric and nonparametric estimators, and list key assumptions, both those that are required within a single study and those that are necessary for data combination. After describing existing approaches, we compare and contrast them and identify open areas for future research. This review demonstrates that there are many possible approaches for estimating treatment effect heterogeneity through the combination of datasets, but that substantial work remains to compare these methods through case studies and simulations, extend them to different settings, and refine them to account for various challenges present in real data.
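For reference, in standard potential-outcome notation (with $Y(1)$ and $Y(0)$ the potential outcomes, $A$ the treatment indicator, and $X$ the observed covariates), the estimand shared by the reviewed methods is the conditional average treatment effect
\[
\tau(x) = \mathbb{E}\left[\, Y(1) - Y(0) \mid X = x \,\right],
\]
which, within a single study satisfying ignorability, $(Y(0), Y(1)) \perp\!\!\!\perp A \mid X$, and overlap, $0 < \Pr(A = 1 \mid X = x) < 1$, is identified as $\tau(x) = \mathbb{E}[Y \mid A = 1, X = x] - \mathbb{E}[Y \mid A = 0, X = x]$.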
The BrainScaleS-2 (BSS-2) system implements physical models of neurons as well as synapses and aims for an energy-efficient and fast emulation of biological neurons. When replicating neuroscientific experimental results, a major challenge is finding suitable model parameters. This study investigates the suitability of the sequential neural posterior estimation (SNPE) algorithm for parameterizing a multi-compartmental neuron model emulated on the BSS-2 analog neuromorphic hardware system. In contrast to other optimization methods such as genetic algorithms or stochastic searches, the SNPE algorithm belongs to the class of approximate Bayesian computation (ABC) methods and estimates the posterior distribution of the model parameters; access to the posterior allows assessing the confidence in parameter estimates and unveiling correlations between model parameters. In previous applications, the SNPE algorithm showed a higher computational efficiency than traditional ABC methods. For our multi-compartmental model, we show that the approximated posterior is in agreement with experimental observations and that the identified correlations between parameters agree with theoretical expectations. Furthermore, we show that the algorithm can deal with high-dimensional observations and parameter spaces. These results suggest that the SNPE algorithm is a promising approach for automating the parameterization of complex models, especially when dealing with characteristic properties of analog neuromorphic substrates, such as trial-to-trial variations or limited parameter ranges.
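As a sketch of how such a parameter search can be set up in software, the snippet below uses the open-source `sbi` package's SNPE interface with a placeholder simulator standing in for the BSS-2 emulation; the prior bounds, simulator, and observation are hypothetical, and the interface shown reflects `sbi` as we understand it rather than the exact pipeline used in the study.

```python
import torch
from sbi.inference import SNPE
from sbi.utils import BoxUniform

# Hypothetical prior over hardware parameters (e.g., inter-compartment conductances).
prior = BoxUniform(low=torch.zeros(4), high=torch.ones(4))

def simulator(theta):
    """Placeholder for the BSS-2 emulation: maps parameters to summary observations."""
    return theta + 0.05 * torch.randn_like(theta)

# Draw parameters from the prior, run the (emulated) model, and fit the density estimator.
theta = prior.sample((1000,))
x = simulator(theta)
inference = SNPE(prior=prior)
density_estimator = inference.append_simulations(theta, x).train()
posterior = inference.build_posterior(density_estimator)

# Condition on an observed measurement to obtain posterior samples over parameters.
x_o = torch.tensor([0.4, 0.6, 0.5, 0.7])
samples = posterior.sample((5000,), x=x_o)
```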
In the era of the Internet of Things (IoT), blockchain is a promising technology for improving the efficiency of healthcare systems, as it enables secure storage, management, and sharing of real-time health data collected by IoT devices. As implementations of blockchain-based healthcare systems usually involve multiple conflicting metrics, it is essential to balance them according to the requirements of specific scenarios. In this paper, we formulate a joint optimization model with three metrics, namely latency, security, and computational cost, that are particularly important for IoT-enabled healthcare. However, it is computationally intractable to identify the exact optimal solution of this problem for systems of practical size. Thus, we propose an algorithm called the Adaptive Discrete Particle Swarm Algorithm (ADPSA) to obtain near-optimal solutions in a low-complexity manner. With its roots in the classical Particle Swarm Optimization (PSO) algorithm, our proposed ADPSA can effectively manage the numerous binary and integer variables in the formulation. We demonstrate through extensive numerical experiments that the ADPSA consistently outperforms existing benchmark approaches, including the original PSO, exhaustive search, and Simulated Annealing, in a wide range of scenarios.
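The sketch below shows the standard binary-PSO mechanism that discrete variants such as the proposed ADPSA build on: velocities are updated as in classical PSO and then squashed through a sigmoid to obtain bit probabilities. The objective, dimensions, and hyperparameters are placeholders, and the ADPSA's specific adaptations (e.g., its handling of integer variables and adaptive parameters) are not reproduced here.

```python
import numpy as np

def binary_pso(objective, dim, n_particles=30, iters=200, w=0.7, c1=1.5, c2=1.5, seed=0):
    """Minimal binary PSO: the generic mechanism underlying discrete PSO variants."""
    rng = np.random.default_rng(seed)
    x = rng.integers(0, 2, size=(n_particles, dim)).astype(float)   # binary positions
    v = rng.normal(0, 1, size=(n_particles, dim))                   # real-valued velocities
    pbest, pbest_val = x.copy(), np.array([objective(p) for p in x])
    gbest = pbest[np.argmin(pbest_val)].copy()

    for _ in range(iters):
        r1, r2 = rng.random((n_particles, dim)), rng.random((n_particles, dim))
        v = w * v + c1 * r1 * (pbest - x) + c2 * r2 * (gbest - x)   # classical PSO update
        prob = 1.0 / (1.0 + np.exp(-v))                             # sigmoid -> bit probability
        x = (rng.random((n_particles, dim)) < prob).astype(float)   # resample binary positions
        vals = np.array([objective(p) for p in x])
        improved = vals < pbest_val
        pbest[improved], pbest_val[improved] = x[improved], vals[improved]
        gbest = pbest[np.argmin(pbest_val)].copy()
    return gbest, pbest_val.min()
```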
With the maturity of web services, containers, and cloud computing technologies, large services in traditional systems (e.g., the computation services of machine learning and artificial intelligence) are gradually being broken down into many microservices to increase service reusability and flexibility. Therefore, this study proposes an efficiency analysis framework based on queuing models to analyze the difference in efficiency when a traditional large service is broken down into n microservices. For generality, this study considers different service time distributions (e.g., exponentially distributed and fixed service times) and explores the system efficiency in the worst-case and best-case scenarios through queuing models (i.e., the M/M/1 and M/D/1 queuing models). In each experiment, the total time required by the original large service was higher than that required after breaking it down into multiple microservices, so the decomposition can improve system efficiency. It can also be observed that in the best-case scenario, the improvement becomes more significant as the arrival rate increases, whereas in the worst-case scenario, only a slight improvement is achieved. This study found that breaking a service down into multiple microservices can effectively improve system efficiency and proved that the improvement is greatest when the computation time of the large service is evenly distributed among the microservices. Therefore, this study's findings can serve as a reference guide for the future development of microservice architectures.
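The comparison rests on closed-form mean sojourn times: W = 1/(mu - lambda) for M/M/1 and W = 1/mu + lambda/(2 mu (mu - lambda)) for M/D/1. The snippet below evaluates both for a monolithic service of rate mu and for a chain of n microservices, each modeled (as a simplifying assumption) as an independent queue of rate n*mu fed by the same arrival rate; the numbers are illustrative and do not reproduce the paper's experiments.

```python
def mm1_sojourn(lam, mu):
    """Mean time in system for an M/M/1 queue (requires lam < mu)."""
    return 1.0 / (mu - lam)

def md1_sojourn(lam, mu):
    """Mean time in system for an M/D/1 queue: service time plus mean waiting time."""
    return 1.0 / mu + lam / (2.0 * mu * (mu - lam))

def chain_sojourn(lam, mu, n, stage_sojourn):
    """Approximate a pipeline of n microservices, each with service rate n*mu,
    as n independent queues fed by the same arrival rate (a simplifying assumption)."""
    return n * stage_sojourn(lam, n * mu)

lam, mu, n = 0.8, 1.0, 4  # illustrative arrival rate, service rate, number of microservices
print("monolith M/M/1:", mm1_sojourn(lam, mu))                      # 5.0
print("pipeline M/M/1:", chain_sojourn(lam, mu, n, mm1_sojourn))    # 4 / (4 - 0.8) = 1.25
print("monolith M/D/1:", md1_sojourn(lam, mu))                      # 3.0
print("pipeline M/D/1:", chain_sojourn(lam, mu, n, md1_sojourn))    # 1.125
```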
As soon as abstract mathematical computations were adapted to computation on digital computers, the problem of efficient representation, manipulation, and communication of the numerical values in those computations arose. Strongly related to the problem of numerical representation is the problem of quantization: in what manner should a set of continuous real-valued numbers be distributed over a fixed discrete set of numbers to minimize the number of bits required and also to maximize the accuracy of the attendant computations? This perennial problem of quantization is particularly relevant whenever memory and/or computational resources are severely restricted, and it has come to the forefront in recent years due to the remarkable performance of Neural Network models in computer vision, natural language processing, and related areas. Moving from floating-point representations to low-precision fixed integer values represented in four bits or less holds the potential to reduce the memory footprint and latency by a factor of 16x; and, in fact, reductions of 4x to 8x are often realized in practice in these applications. Thus, it is not surprising that quantization has emerged recently as an important and very active sub-area of research in the efficient implementation of computations associated with Neural Networks. In this article, we survey approaches to the problem of quantizing the numerical values in deep Neural Network computations, covering the advantages/disadvantages of current methods. With this survey and its organization, we hope to have presented a useful snapshot of the current research in quantization for Neural Networks and to have given an intelligent organization to ease the evaluation of future research in this area.
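As a concrete reference point for the surveyed techniques, the snippet below implements plain asymmetric uniform quantization, the basic building block that most of the surveyed methods refine: a floating-point tensor is mapped to b-bit integers through a scale and a zero point, and then dequantized back.

```python
import numpy as np

def quantize(x, num_bits=8):
    """Asymmetric uniform quantization of a float array to num_bits-wide integers."""
    qmin, qmax = 0, 2 ** num_bits - 1
    x_min, x_max = float(x.min()), float(x.max())
    scale = (x_max - x_min) / (qmax - qmin) if x_max > x_min else 1.0
    zero_point = int(round(qmin - x_min / scale))
    q = np.clip(np.round(x / scale) + zero_point, qmin, qmax).astype(np.int64)
    return q, scale, zero_point

def dequantize(q, scale, zero_point):
    """Map the integers back to (approximate) real values."""
    return scale * (q.astype(np.float64) - zero_point)

x = np.random.randn(1000)
q, s, z = quantize(x, num_bits=4)                  # 4-bit quantization, as discussed above
x_hat = dequantize(q, s, z)
print("max abs error:", np.abs(x - x_hat).max())   # the error grows as the bit width shrinks
```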
Graph convolutional networks (GCNs) are increasingly popular in many applications, yet remain notoriously hard to train over large graph datasets, since they need to compute node representations recursively from their neighbors. Current GCN training algorithms suffer from either high computational costs that grow exponentially with the number of layers, or high memory usage for loading the entire graph and node embeddings. In this paper, we propose a novel efficient layer-wise training framework for GCNs (L-GCN), which disentangles feature aggregation and feature transformation during training, hence greatly reducing time and memory complexity. We present a theoretical analysis of L-GCN under the graph isomorphism framework, showing that, under mild conditions, L-GCN leads to GCNs as powerful as those trained by the more costly conventional algorithm. We further propose L^2-GCN, which learns a controller for each layer that can automatically adjust the training epochs per layer in L-GCN. Experiments show that L-GCN is faster than the state of the art by at least an order of magnitude, with consistent memory usage that does not depend on dataset size, while maintaining comparable prediction performance. With the learned controller, L^2-GCN can further cut the training time in half. Our code is available at //github.com/Shen-Lab/L2-GCN.
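The core idea, disentangling aggregation from transformation, can be sketched as follows: for each layer, the parameter-free neighborhood aggregation is computed once, a single-layer transformation is trained on the aggregated features under an auxiliary classifier, and its frozen output becomes the input of the next layer. The PyTorch loop below is a schematic illustration under these assumptions, not the released implementation, and the controller of L^2-GCN that adapts the epochs per layer is omitted.

```python
import torch
import torch.nn as nn

def train_layerwise_gcn(A_hat, X, y, dims, epochs_per_layer=50, lr=1e-2):
    """Schematic layer-wise GCN training.

    A_hat -- normalized adjacency matrix (dense float tensor here for simplicity)
    X     -- node feature tensor, y -- integer class labels
    dims  -- hidden size of each layer to be trained greedily
    """
    H = X
    layers = []
    n_classes = int(y.max()) + 1
    for d in dims:
        H_agg = A_hat @ H                      # feature aggregation: done once, no parameters
        layer = nn.Linear(H_agg.shape[1], d)   # this layer's feature transformation
        clf = nn.Linear(d, n_classes)          # auxiliary classifier supervising this layer
        opt = torch.optim.Adam(list(layer.parameters()) + list(clf.parameters()), lr=lr)
        for _ in range(epochs_per_layer):
            opt.zero_grad()
            Z = torch.relu(layer(H_agg))
            loss = nn.functional.cross_entropy(clf(Z), y)
            loss.backward()
            opt.step()
        H = torch.relu(layer(H_agg)).detach()  # frozen output feeds the next layer
        layers.append(layer)
    return layers, H
```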
Graph convolutional networks (GCNs) have been successfully applied to many graph-based applications; however, training a large-scale GCN remains challenging. Current SGD-based algorithms suffer from either a high computational cost that grows exponentially with the number of GCN layers, or a large space requirement for keeping the entire graph and the embedding of each node in memory. In this paper, we propose Cluster-GCN, a novel GCN algorithm that is suitable for SGD-based training by exploiting the graph clustering structure. Cluster-GCN works as follows: at each step, it samples a block of nodes associated with a dense subgraph identified by a graph clustering algorithm, and restricts the neighborhood search to within this subgraph. This simple but effective strategy leads to significantly improved memory and computational efficiency while achieving test accuracy comparable to previous algorithms. To test the scalability of our algorithm, we create a new Amazon2M dataset with 2 million nodes and 61 million edges, which is more than 5 times larger than the previous largest publicly available dataset (Reddit). For training a 3-layer GCN on this data, Cluster-GCN is faster than the previous state-of-the-art VR-GCN (1523 seconds vs. 1961 seconds) and uses much less memory (2.2GB vs. 11.2GB). Furthermore, for training a 4-layer GCN on this data, our algorithm can finish in around 36 minutes, while all the existing GCN training algorithms fail due to out-of-memory issues. In addition, Cluster-GCN allows us to train much deeper GCNs without much time and memory overhead, which leads to improved prediction accuracy: using a 5-layer Cluster-GCN, we achieve a state-of-the-art test F1 score of 99.36 on the PPI dataset, while the previous best result was 98.71 by [16]. Our code is publicly available at //github.com/google-research/google-research/tree/master/cluster_gcn.
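The sampling scheme can be sketched as follows: nodes are first grouped by a graph clustering algorithm, and each SGD step then trains only on the subgraph induced by one cluster, so the memory needed per step is bounded by the cluster size rather than by the full graph. The clustering itself and the per-step model update are placeholders below; the released code should be consulted for the actual implementation.

```python
import numpy as np

def cluster_gcn_epoch(adj, X, y, clusters, train_step, seed=0):
    """One schematic Cluster-GCN epoch.

    adj        -- full adjacency matrix (dense NumPy array or scipy CSR)
    X, y       -- node features and labels
    clusters   -- list of node-index arrays produced offline by a graph clustering
                  algorithm (e.g., METIS); the partitioning is assumed given
    train_step -- callback performing one SGD update on a subgraph (model-specific)
    """
    rng = np.random.default_rng(seed)
    for c in rng.permutation(len(clusters)):
        nodes = clusters[c]
        sub_adj = adj[nodes][:, nodes]     # restrict the neighborhood search to the cluster
        train_step(sub_adj, X[nodes], y[nodes])
```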