The $\mathsf{HYBRID}$ model was introduced as a means for theoretical study of distributed networks that use various communication modes. Conceptually, it is a synchronous message passing model with a local communication mode, where in each round each node can send large messages to all its neighbors in a local network (a graph), and a global communication mode, where each node is allotted limited (polylogarithmic) bandwidth per round which it can use to communicate with any node in the network. Prior work has often focused on shortest paths problems in the local network, as their global nature makes these an interesting case study how combining communication modes in the $\mathsf{HYBRID}$ model can overcome the individual lower bounds of either mode. In this work we consider a similar problem, namely computation of distance oracles and routing schemes. In the former, all nodes have to compute local tables, which allows them to look up the distance (estimates) to any target node in the local network when provided with the label of the target. In the latter, it suffices that nodes give the next node on an (approximately) shortest path to the target. Our goal is to compute these local tables as fast as possible with labels as small as possible. We show that this can be done exactly in $\widetilde O(n^{1/3})$ communication rounds and labels of size $\Theta(n^{2/3})$ bits. For constant stretch approximations we achieve labels of size $O(\log n)$ in the same time. Further, as our main technical contribution, we provide computational lower bounds for a variety of problem parameters. For instance, we show that computing solutions with stretch below a certain constant takes $\widetilde \Omega(n^{1/3})$ rounds even for labels of size $O(n^{2/3})$.
Given its status as a classic problem and its importance to both theoreticians and practitioners, edit distance provides an excellent lens through which to understand how the theoretical analysis of algorithms impacts practical implementations. From an applied perspective, the goals of theoretical analysis are to predict the empirical performance of an algorithm and to serve as a yardstick to design novel algorithms that perform well in practice. In this paper, we systematically survey the types of theoretical analysis techniques that have been applied to edit distance and evaluate the extent to which each one has achieved these two goals. These techniques include traditional worst-case analysis, worst-case analysis parametrized by edit distance or entropy or compressibility, average-case analysis, semi-random models, and advice-based models. We find that the track record is mixed. On one hand, two algorithms widely used in practice have been born out of theoretical analysis and their empirical performance is captured well by theoretical predictions. On the other hand, all the algorithms developed using theoretical analysis as a yardstick since then have not had any practical relevance. We conclude by discussing the remaining open problems and how they can be tackled.
The approximate uniform sampling of graph realizations with a given degree sequence is an everyday task in several social science, computer science, engineering etc. projects. One approach is using Markov chains. The best available current result about the well-studied switch Markov chain is that it is rapidly mixing on P-stable degree sequences (see DOI:10.1016/j.ejc.2021.103421). The switch Markov chain does not change any degree sequence. However, there are cases where degree intervals are specified rather than a single degree sequence. (A natural scenario where this problem arises is in hypothesis testing on social networks that are only partially observed.) Rechner, Strowick, and M\"uller-Hannemann introduced in 2018 the notion of degree interval Markov chain which uses three (separately well-studied) local operations (switch, hinge-flip and toggle), and employing on degree sequence realizations where any two sequences under scrutiny have very small coordinate-wise distance. Recently Amanatidis and Kleer published a beautiful paper (arXiv:2110.09068), showing that the degree interval Markov chain is rapidly mixing if the sequences are coming from a system of very thin intervals which are centered not far from a regular degree sequence. In this paper we extend substantially their result, showing that the degree interval Markov chain is rapidly mixing if the intervals are centred at P-stable degree sequences.
The Schrijver graph $S(n,k)$ is defined for integers $n$ and $k$ with $n \geq 2k$ as the graph whose vertices are all the $k$-subsets of $\{1,2,\ldots,n\}$ that do not include two consecutive elements modulo $n$, where two such sets are adjacent if they are disjoint. A result of Schrijver asserts that the chromatic number of $S(n,k)$ is $n-2k+2$ (Nieuw Arch. Wiskd., 1978). In the computational Schrijver problem, we are given an access to a coloring of the vertices of $S(n,k)$ with $n-2k+1$ colors, and the goal is to find a monochromatic edge. The Schrijver problem is known to be complete in the complexity class $\mathsf{PPA}$. We prove that it can be solved by a randomized algorithm with running time $n^{O(1)} \cdot k^{O(k)}$, hence it is fixed-parameter tractable with respect to the parameter $k$.
Population dynamics is the study of temporal and spatial variation in the size of populations of organisms and is a major part of population ecology. One of the main difficulties in analyzing population dynamics is that we can only obtain observation data with coarse time intervals from fixed-point observations due to experimental costs or other constraints. Recently, modeling population dynamics by using continuous normalizing flows (CNFs) and dynamic optimal transport has been proposed to infer the expected trajectory of samples from a fixed-point observed population. While the sample behavior in CNF is deterministic, the actual sample in biological systems moves in an essentially random yet directional manner. Moreover, when a sample moves from point A to point B in dynamical systems, its trajectory is such that the corresponding action has the smallest possible value, known as the principle of least action. To satisfy these requirements of the sample trajectories, we formulate the Lagrangian Schr\"odinger bridge (LSB) problem and propose to solve it approximately using neural SDE with regularization. We also develop a model architecture that enables faster computation. Our experiments show that our solution to the LSB problem can approximate the dynamics at the population level and that using the prior knowledge introduced by the Lagrangian enables us to estimate the trajectories of individual samples with stochastic behavior.
We study the problem of testing whether a function $f: \mathbb{R}^n \to \mathbb{R}$ is a polynomial of degree at most $d$ in the \emph{distribution-free} testing model. Here, the distance between functions is measured with respect to an unknown distribution $\mathcal{D}$ over $\mathbb{R}^n$ from which we can draw samples. In contrast to previous work, we do not assume that $\mathcal{D}$ has finite support. We design a tester that given query access to $f$, and sample access to $\mathcal{D}$, makes $(d/\varepsilon)^{O(1)}$ many queries to $f$, accepts with probability $1$ if $f$ is a polynomial of degree $d$, and rejects with probability at least $2/3$ if every degree-$d$ polynomial $P$ disagrees with $f$ on a set of mass at least $\varepsilon$ with respect to $\mathcal{D}$. Our result also holds under mild assumptions when we receive only a polynomial number of bits of precision for each query to $f$, or when $f$ can only be queried on rational points representable using a logarithmic number of bits. Along the way, we prove a new stability theorem for multivariate polynomials that may be of independent interest.
We study the distributed minimum spanning tree (MST) problem, a fundamental problem in distributed computing. It is well-known that distributed MST can be solved in $\tilde{O}(D+\sqrt{n})$ rounds in the standard CONGEST model (where $n$ is the network size and $D$ is the network diameter) and this is essentially the best possible round complexity (up to logarithmic factors). However, in resource-constrained networks such as ad hoc wireless and sensor networks, nodes spending so much time can lead to significant spending of resources such as energy. Motivated by the above consideration, we study distributed algorithms for MST under the \emph{sleeping model} [Chatterjee et al., PODC 2020], a model for design and analysis of resource-efficient distributed algorithms. In the sleeping model, a node can be in one of two modes in any round -- \emph{sleeping} or \emph{awake} (unlike the traditional model where nodes are always awake). Only the rounds in which a node is \emph{awake} are counted, while \emph{sleeping} rounds are ignored. A node spends resources only in the awake rounds and hence the main goal is to minimize the \emph{awake complexity} of a distributed algorithm, the worst-case number of rounds any node is awake. We present deterministic and randomized distributed MST algorithms that have an \emph{optimal} awake complexity of $O(\log n)$ time with a matching lower bound. We also show that our randomized awake-optimal algorithm has essentially the best possible round complexity by presenting a lower bound of $\tilde{\Omega}(n)$ on the product of the awake and round complexity of any distributed algorithm (including randomized) that outputs an MST, where $\tilde{\Omega}$ hides a $1/(\text{polylog } n)$ factor.
In the storied Colonel Blotto game, two colonels allocate $a$ and $b$ troops, respectively, to $k$ distinct battlefields. A colonel wins a battle if they assign more troops to that particular battle, and each colonel seeks to maximize their total number of victories. Despite the problem's formulation in 1921, the first polynomial-time algorithm to compute Nash equilibrium (NE) strategies for this game was discovered only quite recently. In 2016, \citep{ahmadinejad_dehghani_hajiaghayi_lucier_mahini_seddighin_2019} formulated a breakthrough algorithm to compute NE strategies for the Colonel Blotto game\footnote{To the best of our knowledge, the algorithm from \citep{ahmadinejad_dehghani_hajiaghayi_lucier_mahini_seddighin_2019} has computational complexity $O(k^{14}\max\{a,b\}^{13})$}, receiving substantial media coverage (e.g. \citep{Insider}, \citep{NSF}, \citep{ScienceDaily}). In this work, we present the first known $\epsilon$-approximation algorithm to compute NE strategies in the two-player Colonel Blotto game in runtime $\widetilde{O}(\epsilon^{-4} k^8 \max\{a,b\}^2)$ for arbitrary settings of these parameters. Moreover, this algorithm computes approximate coarse correlated equilibrium strategies in the multiplayer (continuous and discrete) Colonel Blotto game (when there are $\ell > 2$ colonels) with runtime $\widetilde{O}(\ell \epsilon^{-4} k^8 n^2 + \ell^2 \epsilon^{-2} k^3 n (n+k))$, where $n$ is the maximum troop count. Before this work, no polynomial-time algorithm was known to compute exact or approximate equilibrium (in any sense) strategies for multiplayer Colonel Blotto with arbitrary parameters. Our algorithm computes these approximate equilibria by a novel (to the author's knowledge) sampling technique with which we implicitly perform multiplicative weights update over the exponentially many strategies available to each player.
We demonstrate that merely analog transmissions and match filtering can realize the function of an edge server in federated learning (FL). Therefore, a network with massively distributed user equipments (UEs) can achieve large-scale FL without an edge server. We also develop a training algorithm that allows UEs to continuously perform local computing without being interrupted by the global parameter uploading, which exploits the full potential of UEs' processing power. We derive convergence rates for the proposed schemes to quantify their training efficiency. The analyses reveal that when the interference obeys a Gaussian distribution, the proposed algorithm retrieves the convergence rate of a server-based FL. But if the interference distribution is heavy-tailed, then the heavier the tail, the slower the algorithm converges. Nonetheless, the system run time can be largely reduced by enabling computation in parallel with communication, whereas the gain is particularly pronounced when communication latency is high. These findings are corroborated via excessive simulations.
Learning accurate classifiers for novel categories from very few examples, known as few-shot image classification, is a challenging task in statistical machine learning and computer vision. The performance in few-shot classification suffers from the bias in the estimation of classifier parameters; however, an effective underlying bias reduction technique that could alleviate this issue in training few-shot classifiers has been overlooked. In this work, we demonstrate the effectiveness of Firth bias reduction in few-shot classification. Theoretically, Firth bias reduction removes the $O(N^{-1})$ first order term from the small-sample bias of the Maximum Likelihood Estimator. Here we show that the general Firth bias reduction technique simplifies to encouraging uniform class assignment probabilities for multinomial logistic classification, and almost has the same effect in cosine classifiers. We derive an easy-to-implement optimization objective for Firth penalized multinomial logistic and cosine classifiers, which is equivalent to penalizing the cross-entropy loss with a KL-divergence between the uniform label distribution and the predictions. Then, we empirically evaluate that it is consistently effective across the board for few-shot image classification, regardless of (1) the feature representations from different backbones, (2) the number of samples per class, and (3) the number of classes. Finally, we show the robustness of Firth bias reduction, in the case of imbalanced data distribution. Our implementation is available at //github.com/ehsansaleh/firth_bias_reduction
This paper proposes an active learning algorithm for solving regression and classification problems based on inverse-distance weighting functions for selecting the feature vectors to query. The algorithm has the following features: (i) supports both pool-based and population-based sampling; (ii) is independent of the type of predictor used; (iii) can handle known and unknown constraints on the queryable feature vectors; and (iv) can run either sequentially, or in batch mode, depending on how often the predictor is retrained. The method's potential is shown in numerical tests on illustrative synthetic problems and real-world regression and classification datasets from the UCI repository. A Python implementation of the algorithm that we call IDEAL (Inverse-Distance based Exploration for Active Learning), is available at \url{//cse.lab.imtlucca.it/~bemporad/ideal}.