We present an exact approach to analyze and quantify the sensitivity of higher moments of probabilistic loops with symbolic parameters, polynomial arithmetic and potentially uncountable state spaces. Our approach integrates methods from symbolic computation, probability theory, and static analysis in order to automatically capture sensitivity information about probabilistic loops. Sensitivity information allows us to formally establish how value distributions of probabilistic loop variables influence the functional behavior of loops, which can in particular be helpful when choosing values of loop variables in order to ensure efficient/expected computations. Our work uses algebraic techniques to model higher moments of loop variables via linear recurrence equations and introduce the notion of sensitivity recurrences. We show that sensitivity recurrences precisely model loop sensitivities, even in cases where the moments of loop variables do not satisfy a system of linear recurrences. As such, we enlarge the class of probabilistic loops for which sensitivity analysis was so far feasible. We demonstrate the success of our approach while analyzing the sensitivities of probabilistic loops.
We consider the problem of answering connectivity queries on a real algebraic curve. The curve is given as the real trace of an algebraic curve, assumed to be in generic position, and being defined by some rational parametrizations. The query points are given by a zero-dimensional parametrization. We design an algorithm which counts the number of connected components of the real curve under study, and decides which query point lie in which connected component, in time log-linear in $N^6$, where $N$ is the maximum of the degrees and coefficient bit-sizes of the polynomials given as input. This matches the currently best-known bound for computing the topology of real plane curves. The main novelty of this algorithm is the avoidance of the computation of the complete topology of the curve.
Movement primitives are trainable parametric models that reproduce robotic movements starting from a limited set of demonstrations. Previous works proposed simple linear models that exhibited high sample efficiency and generalization power by allowing temporal modulation of movements (reproducing movements faster or slower), blending (merging two movements into one), via-point conditioning (constraining a movement to meet some particular via-points) and context conditioning (generation of movements based on an observed variable, e.g., position of an object). Previous works have proposed neural network-based motor primitive models, having demonstrated their capacity to perform tasks with some forms of input conditioning or time-modulation representations. However, there has not been a single unified deep motor primitive's model proposed that is capable of all previous operations, limiting neural motor primitive's potential applications. This paper proposes a deep movement primitive architecture that encodes all the operations above and uses a Bayesian context aggregator that allows a more sound context conditioning and blending. Our results demonstrate our approach can scale to reproduce complex motions on a larger variety of input choices compared to baselines while maintaining operations of linear movement primitives provide.
This paper presents a decentralized algorithm for solving distributed convex optimization problems in dynamic networks with time-varying objectives. The unique feature of the algorithm lies in its ability to accommodate a wide range of communication systems, including previously unsupported ones, by abstractly modeling the information exchange in the network. Specifically, it supports a novel communication protocol based on the "over-the-air" function computation (OTA-C) technology, that is designed for an efficient and truly decentralized implementation of the consensus step of the algorithm. Unlike existing OTA-C protocols, the proposed protocol does not require the knowledge of network graph structure or channel state information, making it particularly suitable for decentralized implementation over ultra-dense wireless networks with time-varying topologies and fading channels. Furthermore, the proposed algorithm synergizes with the "superiorization" methodology, allowing the development of new distributed algorithms with enhanced performance for the intended applications. The theoretical analysis establishes sufficient conditions for almost sure convergence of the algorithm to a common time-invariant solution for all agents, assuming such a solution exists. Our algorithm is applied to a real-world distributed random field estimation problem, showcasing its efficacy in terms of convergence speed, scalability, and spectral efficiency. Furthermore, we present a superiorized version of our algorithm that achieves faster convergence with significantly reduced energy consumption compared to the unsuperiorized algorithm.
In this paper, we provide a novel framework for the analysis of generalization error of first-order optimization algorithms for statistical learning when the gradient can only be accessed through partial observations given by an oracle. Our analysis relies on the regularity of the gradient w.r.t. the data samples, and allows to derive near matching upper and lower bounds for the generalization error of multiple learning problems, including supervised learning, transfer learning, robust learning, distributed learning and communication efficient learning using gradient quantization. These results hold for smooth and strongly-convex optimization problems, as well as smooth non-convex optimization problems verifying a Polyak-Lojasiewicz assumption. In particular, our upper and lower bounds depend on a novel quantity that extends the notion of conditional standard deviation, and is a measure of the extent to which the gradient can be approximated by having access to the oracle. As a consequence, our analysis provides a precise meaning to the intuition that optimization of the statistical learning objective is as hard as the estimation of its gradient. Finally, we show that, in the case of standard supervised learning, mini-batch gradient descent with increasing batch sizes and a warm start can reach a generalization error that is optimal up to a multiplicative factor, thus motivating the use of this optimization scheme in practical applications.
This paper provides an introductory overview of how one may employ importance sampling effectively as a tool for solving stochastic optimization formulations incorporating tail risk measures such as Conditional Value-at-Risk. Approximating the tail risk measure by its sample average approximation, while appealing due to its simplicity and universality in use, requires a large number of samples to be able to arrive at risk-minimizing decisions with high confidence. This is primarily due to the rarity with which the relevant tail events get observed in the samples. In simulation, Importance Sampling is among the most prominent methods for substantially reducing the sample requirement while estimating probabilities of rare events. Can importance sampling be used for optimization as well? If so, what are the ingredients required for making importance sampling an effective tool for optimization formulations involving rare events? This tutorial aims to provide an introductory overview of the two key ingredients in this regard, namely, (i) how one may arrive at an importance sampling change of measure prescription at every decision, and (ii) the prominent techniques available for integrating such a prescription within a solution paradigm for stochastic optimization formulations.
Local search algorithms are well-known methods for solving large, hard instances of the satisfiability problem (SAT). The performance of these algorithms crucially depends on heuristics for setting noise parameters and scoring variables. The optimal setting for these heuristics varies for different instance distributions. In this paper, we present an approach for learning effective variable scoring functions and noise parameters by using reinforcement learning. We consider satisfiability problems from different instance distributions and learn specialized heuristics for each of them. Our experimental results show improvements with respect to both a WalkSAT baseline and another local search learned heuristic.
This paper begins with a description of methods for estimating probability density functions for images that reflects the observation that such data is usually constrained to lie in restricted regions of the high-dimensional image space - not every pattern of pixels is an image. It is common to say that images lie on a lower-dimensional manifold in the high-dimensional space. However, although images may lie on such lower-dimensional manifolds, it is not the case that all points on the manifold have an equal probability of being images. Images are unevenly distributed on the manifold, and our task is to devise ways to model this distribution as a probability distribution. In pursuing this goal, we consider generative models that are popular in AI and computer vision community. For our purposes, generative/probabilistic models should have the properties of 1) sample generation: it should be possible to sample from this distribution according to the modelled density function, and 2) probability computation: given a previously unseen sample from the dataset of interest, one should be able to compute the probability of the sample, at least up to a normalising constant. To this end, we investigate the use of methods such as normalising flow and diffusion models. We then show that such probabilistic descriptions can be used to construct defences against adversarial attacks. In addition to describing the manifold in terms of density, we also consider how semantic interpretations can be used to describe points on the manifold. To this end, we consider an emergent language framework which makes use of variational encoders to produce a disentangled representation of points that reside on a given manifold. Trajectories between points on a manifold can then be described in terms of evolving semantic descriptions.
We study a market mechanism that sets edge prices to incentivize strategic agents to organize trips that efficiently share limited network capacity. This market allows agents to form groups to share trips, make decisions on departure times and route choices, and make payments to cover edge prices and other costs. We develop a new approach to analyze the existence and computation of market equilibrium, building on theories of combinatorial auctions and dynamic network flows. Our approach tackles the challenges in market equilibrium characterization arising from: (a) integer and network constraints on the dynamic flow of trips in sharing limited edge capacity; (b) heterogeneous and private preferences of strategic agents. We provide sufficient conditions on the network topology and agents' preferences that ensure the existence and polynomial-time computation of market equilibrium. We identify a particular market equilibrium that achieves maximum utilities for all agents, and is equivalent to the outcome of the classical Vickery Clark Grove mechanism. Finally, we extend our results to general networks with multiple populations and apply them to compute dynamic tolls for efficient carpooling in San Francisco Bay Area.
We present RETA (Relative Timing Analysis), a differential timing analysis technique to verify the impact of an update on the execution time of embedded software. Timing analysis is computationally expensive and labor intensive. Software updates render repeating the analysis from scratch a waste of resources and time, because their impact is inherently confined. To determine this boundary, in RETA we apply a slicing procedure that identifies all relevant code segments and a statement categorization that determines how to analyze each such line of code. We adapt a subset of RETA for integration into aiT, an industrial timing analysis tool, and also develop a complete implementation in a tool called DELTA. Based on staple benchmarks and realistic code updates from official repositories, we test the accuracy by analyzing the worst-case execution time (WCET) before and after an update, comparing the measures with the use of the unmodified aiT as well as real executions on embedded hardware. DELTA returns WCET information that ranges from exactly the WCET of real hardware to 148% of the new version's measured WCET. With the same benchmarks, the unmodified aiT estimates are 112% and 149% of the actual executions; therefore, even when DELTA is pessimistic, an industry-strength tool such as aiT cannot do better. Crucially, we also show that RETA decreases aiT's analysis time by 45% and its memory consumption by 8.9%, whereas removing RETA from DELTA, effectively rendering it a regular timing analysis tool, increases its analysis time by 27%.
Graph machine learning has been extensively studied in both academic and industry. However, as the literature on graph learning booms with a vast number of emerging methods and techniques, it becomes increasingly difficult to manually design the optimal machine learning algorithm for different graph-related tasks. To tackle the challenge, automated graph machine learning, which aims at discovering the best hyper-parameter and neural architecture configuration for different graph tasks/data without manual design, is gaining an increasing number of attentions from the research community. In this paper, we extensively discuss automated graph machine approaches, covering hyper-parameter optimization (HPO) and neural architecture search (NAS) for graph machine learning. We briefly overview existing libraries designed for either graph machine learning or automated machine learning respectively, and further in depth introduce AutoGL, our dedicated and the world's first open-source library for automated graph machine learning. Last but not least, we share our insights on future research directions for automated graph machine learning. This paper is the first systematic and comprehensive discussion of approaches, libraries as well as directions for automated graph machine learning.