The role of symmetry in Boolean functions $f:\{0,1\}^n \to \{0,1\}$ has been extensively studied in complexity theory. For example, symmetric functions, that is, functions that are invariant under the action of $S_n$, is an important class of functions in the study of Boolean functions. A function $f:\{0,1\}^n \to \{0,1\}$ is called transitive (or weakly-symmetric) if there exists a transitive group $G$ of $S_n$ such that $f$ is invariant under the action of $G$ - that is the function value remains unchanged even after the bits of the input of $f$ are moved around according to some permutation $\sigma \in G$. Understanding various complexity measures of transitive functions has been a rich area of research for the past few decades. In this work, we study transitive functions in light of several combinatorial measures. We look at the maximum separation between various pairs of measures for transitive functions. Such study for general Boolean functions has been going on for past many years. The best-known results for general Boolean functions have been nicely compiled by Aaronson et. al (STOC, 2021). The separation between a pair of combinatorial measures is shown by constructing interesting functions that demonstrate the separation. But many of the celebrated separation results are via the construction of functions (like "pointer functions" from Ambainis et al. (JACM, 2017) and "cheat-sheet functions" Aaronson et al. (STOC, 2016)) that are not transitive. Hence, we don't have such separation between the pairs of measures for transitive functions. In this paper we show how to modify some of these functions to construct transitive functions that demonstrate similar separations between pairs of combinatorial measures.
The goal of multi-objective optimization is to identify a collection of points which describe the best possible trade-offs between the multiple objectives. In order to solve this vector-valued optimization problem, practitioners often appeal to the use of scalarization functions in order to transform the multi-objective problem into a collection of single-objective problems. This set of scalarized problems can then be solved using traditional single-objective optimization techniques. In this work, we formalise this convention into a general mathematical framework. We show how this strategy effectively recasts the original multi-objective optimization problem into a single-objective optimization problem defined over sets. An appropriate class of objective functions for this new problem is the R2 utility function, which is defined as a weighted integral over the scalarized optimization problems. We show that this utility function is a monotone and submodular set function, which can be optimised effectively using greedy optimization algorithms. We analyse the performance of these greedy algorithms both theoretically and empirically. Our analysis largely focusses on Bayesian optimization, which is a popular probabilistic framework for black-box optimization.
An $f$-edge fault-tolerant distance sensitive oracle ($f$-DSO) with stretch $\sigma \ge 1$ is a data structure that preprocesses a given undirected, unweighted graph $G$ with $n$ vertices and $m$ edges, and a positive integer $f$. When queried with a pair of vertices $s, t$ and a set $F$ of at most $f$ edges, it returns a $\sigma$-approximation of the $s$-$t$-distance in $G-F$. We study $f$-DSOs that take subquadratic space. Thorup and Zwick [JACM 2015] showed that this is only possible for $\sigma \ge 3$. We present, for any constant $f \ge 1$ and $\alpha \in (0, \frac{1}{2})$, and any $\varepsilon > 0$, an $f$-DSO with stretch $ 3 + \varepsilon$ that takes $\widetilde{O}(n^{2-\frac{\alpha}{f+1}}/\varepsilon) \cdot O(\log n/\varepsilon)^{f+1}$ space and has an $O(n^\alpha/\varepsilon^2)$ query time. We also give an improved construction for graphs with diameter at most $D$. For any constant $k$, we devise an $f$-DSO with stretch $2k-1$ that takes $O(D^{f+o(1)} n^{1+1/k})$ space and has $\widetilde{O}(D^{o(1)})$ query time, with a preprocessing time of $O(D^{f+o(1)} mn^{1/k})$. Chechik, Cohen, Fiat, and Kaplan [SODA 2017] presented an $f$-DSO with stretch $1{+}\varepsilon$ and preprocessing time $O_\varepsilon(n^{5+o(1)})$, albeit with a super-quadratic space requirement. We show how to reduce their preprocessing time to $O_{\varepsilon}(mn^{2+o(1)})$.
Random Search is one of the most widely-used method for Hyperparameter Optimization, and is critical to the success of deep learning models. Despite its astonishing performance, little non-heuristic theory has been developed to describe the underlying working mechanism. This paper gives a theoretical accounting of Random Search. We introduce the concept of \emph{scattering dimension} that describes the landscape of the underlying function, and quantifies the performance of random search. We show that, when the environment is noise-free, the output of random search converges to the optimal value in probability at rate $ \widetilde{\mathcal{O}} \left( \left( \frac{1}{T} \right)^{ \frac{1}{d_s} } \right) $, where $ d_s \ge 0 $ is the scattering dimension of the underlying function. When the observed function values are corrupted by bounded $iid$ noise, the output of random search converges to the optimal value in probability at rate $ \widetilde{\mathcal{O}} \left( \left( \frac{1}{T} \right)^{ \frac{1}{d_s + 1} } \right) $. In addition, based on the principles of random search, we introduce an algorithm, called BLiN-MOS, for Lipschitz bandits in doubling metric spaces that are also emdowed with a Borel measure, and show that BLiN-MOS achieves a regret rate of order $ \widetilde{\mathcal{O}} \left( T^{ \frac{d_z}{d_z + 1} } \right) $, where $d_z$ is the zooming dimension of the problem instance. Our results show that in metric spaces with a Borel measure, the classic theory of Lipschitz bandits can be improved. This result suggests an intrinsic axiomatic gap between metric spaces and metric measure spaces from an algorithmic perspective, since the upper bound in a metric measure space breaks the known information-theoretical lower bounds for Lipschitz bandits in a metric space with no measure structure.
Evidence Networks can enable Bayesian model comparison when state-of-the-art methods (e.g. nested sampling) fail and even when likelihoods or priors are intractable or unknown. Bayesian model comparison, i.e. the computation of Bayes factors or evidence ratios, can be cast as an optimization problem. Though the Bayesian interpretation of optimal classification is well-known, here we change perspective and present classes of loss functions that result in fast, amortized neural estimators that directly estimate convenient functions of the Bayes factor. This mitigates numerical inaccuracies associated with estimating individual model probabilities. We introduce the leaky parity-odd power (l-POP) transform, leading to the novel ``l-POP-Exponential'' loss function. We explore neural density estimation for data probability in different models, showing it to be less accurate and scalable than Evidence Networks. Multiple real-world and synthetic examples illustrate that Evidence Networks are explicitly independent of dimensionality of the parameter space and scale mildly with the complexity of the posterior probability density function. This simple yet powerful approach has broad implications for model inference tasks. As an application of Evidence Networks to real-world data we compute the Bayes factor for two models with gravitational lensing data of the Dark Energy Survey. We briefly discuss applications of our methods to other, related problems of model comparison and evaluation in implicit inference settings.
This paper is devoted to the study of infinitesimal limit cycles that can bifurcate from zero-Hopf equilibria of differential systems based on the averaging method. We develop an efficient symbolic program using Maple for computing the averaged functions of any order for continuous differential systems in arbitrary dimension. The program allows us to systematically analyze zero-Hopf bifurcations of polynomial differential systems using symbolic computation methods. We show that for the first-order averaging, $\ell\in\{0,1,\ldots,2^{n-3}\}$ limit cycles can bifurcate from the zero-Hopf equilibrium for the general class of perturbed differential systems and up to the second-order averaging, the maximum number of limit cycles can be determined by computing the mixed volume of a polynomial system obtained from the averaged functions. A number of examples are presented to demonstrate the effectiveness of the proposed algorithmic approach.
We study the problem of designing consistent sequential two-sample tests in a nonparametric setting. Guided by the principle of testing by betting, we reframe this task into that of selecting a sequence of payoff functions that maximize the wealth of a fictitious bettor, betting against the null in a repeated game. In this setting, the relative increase in the bettor's wealth has a precise interpretation as the measure of evidence against the null, and thus our sequential test rejects the null when the wealth crosses an appropriate threshold. We develop a general framework for setting up the betting game for two-sample testing, in which the payoffs are selected by a prediction strategy as data-driven predictable estimates of the witness function associated with the variational representation of some statistical distance measures, such as integral probability metrics (IPMs). We then formally relate the statistical properties of the test~(such as consistency, type-II error exponent and expected sample size) to the regret of the corresponding prediction strategy. We construct a practical sequential two-sample test by instantiating our general strategy with the kernel-MMD metric, and demonstrate its ability to adapt to the difficulty of the unknown alternative through theoretical and empirical results. Our framework is versatile, and easily extends to other problems; we illustrate this by applying our approach to construct consistent tests for the following problems: (i) time-varying two-sample testing with non-exchangeable observations, and (ii) an abstract class of "invariant" testing problems, including symmetry and independence testing.
We consider a regular splitting based on the Sherman-Morrison-Woodbury formula, which is especially effective with iterative methods for the numerical solution of large linear systems with matrices that are perturbations of circulant or block circulant matrices. Such linear systems occur usually in the numerical discretization of one-dimensional differential equations. We prove the convergence of the new iteration without any assumptions on the symmetry or diagonal-dominance of the matrix. An extension to 2-by-2 block matrices that occur in some saddle point problems is also presented. The new method converged very fast in all of the test cases we used. Due to its trivial implementation and complexity with nearly circulant matrices via the Fast Fourier Transform it can be useful in the numerical solution of various one-dimensional finite element and finite difference discretizations of differential equations. A comparison with Gauss-Seidel and GMRES methods is also presented.
We consider problems of minimizing functionals $\mathcal{F}$ of probability measures on the Euclidean space. To propose an accelerated gradient descent algorithm for such problems, we consider gradient flow of transport maps that give push-forward measures of an initial measure. Then we propose a deterministic accelerated algorithm by extending Nesterov's acceleration technique with momentum. This algorithm do not based on the Wasserstein geometry. Furthermore, to estimate the convergence rate of the accelerated algorithm, we introduce new convexity and smoothness for $\mathcal{F}$ based on transport maps. As a result, we can show that the accelerated algorithm converges faster than a normal gradient descent algorithm. Numerical experiments support this theoretical result.
Finding a provably correct subquadratic synchronization algorithm for many filesystem replicas is one of the main theoretical problems in Operational Transformation (OT) and Conflict-free Replicated Data Types (CRDT) frameworks. Based on the Algebraic Theory of Filesystems, which incorporates non-commutative filesystem commands natively, we developed and built a proof-of-concept implementation of an algorithm suite which synchronizes an arbitrary number of replicas. The result is provably correct, and the synchronized system is created in linear space and time after an initial sorting phase. It works by identifying conflicting command pairs and requesting one of the commands to be removed. The method can be guided to reach any of the theoretically possible synchronized states. The algorithm also allows asynchronous usage. After the client sends a synchronization request, the local replica remains available for further modifications. When the synchronization instructions arrive, they can be merged with the changes made since the synchronization request. The suite also works on filesystems with directed acyclic graph-based path structure in place of the traditional tree-like arrangement. Consequently, our algorithms apply to filesystems with hard or soft links as long as the links create no loops.
Transportation of probability measures underlies many core tasks in statistics and machine learning, from variational inference to generative modeling. A typical goal is to represent a target probability measure of interest as the push-forward of a tractable source measure through a learned map. We present a new construction of such a transport map, given the ability to evaluate the score of the target distribution. Specifically, we characterize the map as a zero of an infinite-dimensional score-residual operator and derive a Newton-type method for iteratively constructing such a zero. We prove convergence of these iterations by invoking classical elliptic regularity theory for partial differential equations (PDE) and show that this construction enjoys rapid convergence, under smoothness assumptions on the target score. A key element of our approach is a generalization of the elementary Newton method to infinite-dimensional operators, other forms of which have appeared in nonlinear PDE and in dynamical systems. Our Newton construction, while developed in a functional setting, also suggests new iterative algorithms for approximating transport maps.