We consider the Vector Scheduling problem on identical machines: we have $m$ machines, and a set $J$ of $n$ jobs, where each job $j$ has a processing-time vector $p_j\in \mathbb{R}^d_{\geq 0}$. The goal is to find an assignment $\sigma:J\to [m]$ of jobs to machines so as to minimize the makespan $\max_{i\in [m]}\max_{r\in [d]}( \sum_{j:\sigma(j)=i}p_{j,r})$. A natural lower bound on the optimal makespan is $\mathrm{lb} :=\max\{\max_{j\in J,r\in [d]}p_{j,r},\max_{r\in [d]}(\sum_{j\in J}p_{j,r}/m)\}$. Our main result is a very simple $O(\log d)$-approximation algorithm for vector scheduling with respect to the lower bound $\mathrm{lb}$: we devise an algorithm that returns an assignment whose makespan is at most $O(\log d)\cdot\mathrm{lb}$. As an application, we show that the above guarantee leads to an $O(\log\log m)$-approximation for Stochastic Minimum-Norm Load Balancing (StochNormLB). In StochNormLB, we have $m$ identical machines, a set $J$ of $n$ independent stochastic jobs whose processing times are nonnegative random variables, and a monotone, symmetric norm $f:\mathbb{R}^m \to \mathbb{R}_{\geq 0}$. The goal is to find an assignment $\sigma:J\to [m]$ that minimizes the expected $f$-norm of the induced machine-load vector, where the load on machine $i$ is the (random) total processing time assigned to it. Our $O(\log\log m)$-approximation guarantee is in fact much stronger: we obtain an assignment that is simultaneously an $O(\log\log m)$-approximation for StochNormLB with all monotone, symmetric norms. This approximation factor significantly improves upon the $O(\log m/\log\log m)$-approximation of Ibrahimpur and Swamy (FOCS 2020) for StochNormLB, and is a consequence of a more general black-box reduction that we present, showing that a $\gamma(d)$-approximation for $d$-dimensional vector scheduling with respect to the lower bound $\mathrm{lb}$ yields a simultaneous $\gamma(\log m)$-approximation for StochNormLB with all monotone, symmetric norms.
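To make the quantities concrete, here is a small Python sketch that computes the lower bound $\mathrm{lb}$ and the makespan of a natural longest-first greedy assignment on a random instance; the greedy rule and the exponential job sizes are illustrative choices for experimentation, not the paper's $O(\log d)$-approximation algorithm.

```python
import numpy as np

def lower_bound(p, m):
    """lb = max(largest single coordinate, largest per-coordinate average load)."""
    return max(p.max(), (p.sum(axis=0) / m).max())

def greedy_vector_schedule(p, m):
    """Longest-first greedy: assign each job to the machine whose resulting
    maximum coordinate load is smallest."""
    n, d = p.shape
    loads = np.zeros((m, d))
    sigma = np.empty(n, dtype=int)
    for j in np.argsort(-p.max(axis=1)):       # jobs by decreasing max coordinate
        i = int(np.argmin((loads + p[j]).max(axis=1)))
        loads[i] += p[j]
        sigma[j] = i
    return sigma, loads.max()

rng = np.random.default_rng(0)
p = rng.exponential(size=(200, 8))             # n = 200 jobs, d = 8 dimensions
sigma, makespan = greedy_vector_schedule(p, m=10)
print(makespan / lower_bound(p, m=10))         # empirical ratio against lb
```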
We prove upper and lower bounds on the minimal spherical dispersion, improving upon previous estimates obtained by Rote and Tichy [Spherical dispersion with an application to polygonal approximation of curves, Anz. \"Osterreich. Akad. Wiss. Math.-Natur. Kl. 132 (1995), 3--10]. In particular, we see that the inverse $N(\varepsilon,d)$ of the minimal spherical dispersion is, for fixed $\varepsilon>0$, linear in the dimension $d$ of the ambient space. We also derive upper and lower bounds on the expected dispersion for points chosen independently and uniformly at random from the Euclidean unit sphere. In terms of the corresponding inverse $\widetilde{N}(\varepsilon,d)$, our bounds are optimal with respect to the dependence on $\varepsilon$.
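To get a feel for the random-points bounds, the following numpy sketch draws independent uniform points on the sphere (normalized standard Gaussians) and estimates the angular radius of the largest empty spherical cap by random probing; the probe-based quantity is only a Monte Carlo lower bound, and measuring caps by angular radius is an illustrative convention, not necessarily the paper's exact normalization of dispersion.

```python
import numpy as np

def uniform_sphere(n, d, rng):
    """n points uniform on S^{d-1}: normalized standard Gaussian vectors."""
    x = rng.standard_normal((n, d))
    return x / np.linalg.norm(x, axis=1, keepdims=True)

def empty_cap_radius(points, n_probes, rng):
    """Monte Carlo lower bound on the angular radius of the largest spherical
    cap containing none of the given points: probe random cap centers and
    record the angle to the nearest point."""
    probes = uniform_sphere(n_probes, points.shape[1], rng)
    cos_nearest = (probes @ points.T).max(axis=1)
    return float(np.arccos(np.clip(cos_nearest, -1.0, 1.0)).max())

rng = np.random.default_rng(1)
for N in (100, 1000, 5000):
    pts = uniform_sphere(N, 3, rng)
    print(N, empty_cap_radius(pts, 2000, rng))  # radius shrinks as N grows
```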
We determine the exact minimax rate of a Gaussian sequence model under bounded convex constraints, purely in terms of the local geometry of the given constraint set $K$. Our main result shows that the minimax risk (up to constant factors) under the squared $L_2$ loss is given by $\epsilon^{*2} \wedge \operatorname{diam}(K)^2$ with \begin{align*} \epsilon^* = \sup \bigg\{\epsilon : \frac{\epsilon^2}{\sigma^2} \leq \log M^{\operatorname{loc}}(\epsilon)\bigg\}, \end{align*} where $\log M^{\operatorname{loc}}(\epsilon)$ denotes the local entropy of the set $K$, and $\sigma^2$ is the variance of the noise. We utilize our abstract result to re-derive known minimax rates for some special sets $K$ such as hyperrectangles, ellipses, and more generally quadratically convex orthosymmetric sets. Finally, we extend our results to the unbounded case with known $\sigma^2$ to show that the minimax rate in that case is $\epsilon^{*2}$.
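Given a routine that evaluates the local entropy, $\epsilon^*$ can be located numerically by bisection, since $\epsilon^2/\sigma^2$ is increasing in $\epsilon$ while $\log M^{\operatorname{loc}}(\epsilon)$ is typically decreasing. A minimal sketch, with a hypothetical power-law local entropy standing in for a real computation:

```python
import numpy as np

def eps_star(log_M_loc, sigma, lo=1e-9, hi=1e3, iters=200):
    """Largest eps with eps^2/sigma^2 <= log_M_loc(eps), via bisection.
    Assumes eps^2/sigma^2 - log_M_loc(eps) changes sign once on [lo, hi]."""
    g = lambda e: e**2 / sigma**2 - log_M_loc(e)
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        lo, hi = (mid, hi) if g(mid) <= 0 else (lo, mid)
    return lo

# hypothetical local entropy log M_loc(eps) = (c / eps)^alpha, c = 1, alpha = 1/2
print(eps_star(lambda e: (1.0 / e) ** 0.5, sigma=0.1))
```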
We consider performance enhancement of asymmetrically-clipped optical orthogonal frequency division multiplexing (ACO-OFDM) and related optical OFDM schemes, which are variants of OFDM used in intensity-modulated optical wireless communications. Unlike most existing studies on specific designs of improved receivers, this paper investigates the information-theoretic limits of all possible receivers. For independent and identically distributed complex Gaussian inputs, we obtain an exact characterization of the information rate of ACO-OFDM with improved receivers for all SNRs. It is proved that the high-SNR gain of improved receivers asymptotically achieves 1/4 bits per channel use, which is equivalent to 3 dB in electrical SNR or 1.5 dB in optical SNR; as the SNR decreases, the maximum achievable SNR gain of improved receivers decreases monotonically to a non-zero low-SNR limit, corresponding to an information rate gain of 36.3%. For practically used constellations, we derive an upper bound on the gain of improved receivers. Numerical results demonstrate that the upper bound can be approached to within 1 dB in optical SNR by combining existing improved receivers and coded modulation. We also show that our information-theoretic analyses extend to Flip-OFDM and PAM-DMT. Our results imply that, for the considered schemes, improved receivers may significantly reduce the gap to channel capacity at low-to-moderate SNR.
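For background, the following numpy sketch implements the standard ACO-OFDM transmit chain (data on odd subcarriers only, Hermitian symmetry, zero-level clipping) and checks the classical fact that clipping merely halves the data-bearing odd subcarriers, which is the factor improved receivers try to win back; this is textbook material, not the paper's receiver analysis.

```python
import numpy as np

rng = np.random.default_rng(2)
N = 64                                   # IFFT size (even)
X = np.zeros(N, dtype=complex)
odd = np.arange(1, N // 2, 2)            # data on odd subcarriers only
X[odd] = rng.standard_normal(odd.size) + 1j * rng.standard_normal(odd.size)
X[N - odd] = np.conj(X[odd])             # Hermitian symmetry -> real time signal

x = np.fft.ifft(X).real                  # antisymmetric: x[n + N/2] = -x[n]
x_clipped = np.maximum(x, 0.0)           # clip negatives for intensity modulation

# Clipping at zero only scales the data tones by 1/2 (clipping noise falls on
# the even subcarriers), so the data remain recoverable:
Y = np.fft.fft(x_clipped)
print(np.allclose(Y[odd], X[odd] / 2))   # True
```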
We show that the solution to the Hermite-Pad\'{e} type I approximation problem leads in a natural way to a subclass of solutions of the Hirota (discrete Kadomtsev-Petviashvili) system and of its adjoint linear problem. Our result explains the appearance of various ingredients of integrable systems theory in applications to multiple orthogonal polynomials, numerical algorithms, random matrices, and other branches of mathematical physics and applied mathematics where the Hermite-Pad\'{e} approximation problem is relevant. We also present a geometric algorithm, based on the notion of Desargues maps, for constructing solutions of the problem in the projective space over the field of rational functions. As a byproduct we obtain the corresponding generalization of the Wynn recurrence. We isolate the boundary data of the Hirota system which provide solutions to the Hermite-Pad\'{e} problem, showing that the corresponding reduction lowers the dimensionality of the system. In particular, we obtain certain equations which, in addition to the known ones given by Paszkowski, can be considered as direct analogs of the Frobenius identities. We study the place of the reduced system within the integrability theory, which results in a multidimensional (in the number of variables) extension of the discrete-time Toda chain equations.
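For readers who wish to experiment, the classical linear-algebra formulation of the type I problem reduces it to a homogeneous linear system on the polynomial coefficients; the numpy sketch below (this formulation, not the paper's Desargues-map construction) recovers, for $f=(1,\exp)$, a Padé approximant of $\exp$.

```python
import numpy as np
from math import factorial

def hermite_pade_type1(coeffs, degs):
    """Type I Hermite-Pade: polynomials A_j, deg A_j <= degs[j], not all zero,
    with sum_j A_j(x) f_j(x) = O(x^sigma), sigma = sum(degs[j] + 1) - 1.
    coeffs[j] holds the first sigma Taylor coefficients of f_j."""
    sigma = sum(d + 1 for d in degs) - 1
    cols = []
    for c, dmax in zip(coeffs, degs):
        for k in range(dmax + 1):            # column for the x^k coefficient of A_j
            col = np.zeros(sigma)
            col[k:] = c[:sigma - k]
            cols.append(col)
    M = np.column_stack(cols)                # sigma x (sigma + 1): null vector exists
    sol = np.linalg.svd(M)[2][-1]            # right singular vector, smallest value
    out, pos = [], 0
    for d in degs:
        out.append(sol[pos:pos + d + 1])
        pos += d + 1
    return out

# Example with f = (1, exp): -A_0/A_1 is then the [2/2] Pade approximant of exp.
sig = 5
one = np.array([1.0] + [0.0] * (sig - 1))
exp = np.array([1.0 / factorial(k) for k in range(sig)])
A0, A1 = hermite_pade_type1([one, exp], [2, 2])
res = np.convolve(A0, one)[:sig] + np.convolve(A1, exp)[:sig]
print(np.allclose(res, 0.0))                 # residual is O(x^5): True
```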
Given an $n$-point metric space $(\mathcal{X},d)$ where each point belongs to one of $m=O(1)$ different categories or groups, and a set of integers $k_1, \ldots, k_m$, the fair Max-Min diversification problem is to select $k_i$ points belonging to category $i\in [m]$, such that the minimum pairwise distance between selected points is maximized. The problem was introduced by Moumoulidou et al. [ICDT 2021] and is motivated by the need to down-sample large data sets in various applications so that the derived sample achieves a balance over diversity, i.e., the minimum distance between a pair of selected points, and fairness, i.e., ensuring enough points of each category are included. We prove the following results: 1. We first consider general metric spaces. We present a randomized polynomial-time algorithm that returns a $2$-approximation to the diversity but only satisfies the fairness constraints in expectation. Building upon this result, we present a $6$-approximation that is guaranteed to satisfy the fairness constraints up to a factor $1-\epsilon$ for any constant $\epsilon$. We also present a linear-time algorithm returning an $(m+1)$-approximation with exact fairness; the best previous result was a $(3m-1)$-approximation. 2. We then focus on Euclidean metrics. We first show that the problem can be solved exactly in one dimension. For constant dimension, a constant number of categories, and any constant $\epsilon>0$, we present a $(1+\epsilon)$-approximation algorithm that runs in $O(nk) + 2^{O(k)}$ time, where $k=k_1+\ldots+k_m$. We can improve the running time to $O(nk)+ \mathrm{poly}(k)$ at the expense of only picking $(1-\epsilon) k_i$ points from category $i\in [m]$. Finally, we present algorithms suitable for processing massive data sets, including single-pass data-stream algorithms and composable coresets for distributed processing.
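For experimentation, the sketch below implements a simple round-robin farthest-first heuristic with category quotas (assuming distinct points and feasible quotas); it is a natural Gonzalez-style baseline, not one of the paper's approximation algorithms.

```python
import numpy as np

def fair_farthest_first(X, cats, quotas, rng):
    """Cycle over categories that still have quota, each time adding the point
    of that category whose minimum distance to the current selection is
    largest. Assumes distinct points and enough points in each category."""
    cats = np.asarray(cats)
    left = dict(quotas)
    c0 = max(left, key=left.get)                       # start in largest category
    sel = [int(rng.choice(np.flatnonzero(cats == c0)))]
    left[c0] -= 1
    dmin = np.linalg.norm(X - X[sel[0]], axis=1)       # distance to selection
    while any(v > 0 for v in left.values()):
        for c in list(left):
            if left[c] == 0:
                continue
            cand = np.flatnonzero((cats == c) & (dmin > 0))  # unselected points
            j = int(cand[np.argmax(dmin[cand])])
            sel.append(j)
            left[c] -= 1
            dmin = np.minimum(dmin, np.linalg.norm(X - X[j], axis=1))
    return sel

rng = np.random.default_rng(3)
X = rng.random((100, 2)); cats = rng.integers(0, 2, 100)
print(fair_farthest_first(X, cats, {0: 3, 1: 3}, rng))
```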
We aim at estimating the invariant density associated with a stochastic differential equation with jumps in low dimension, namely $d=1$ and $d=2$. We consider a class of jump diffusion processes whose invariant density belongs to some H\"older space. Firstly, in dimension one, we show that the kernel density estimator achieves the convergence rate $\frac{1}{T}$, which is the optimal rate in the absence of jumps. This improves the convergence rate obtained in [Amorino, Gloter (2021)], which depends on the Blumenthal-Getoor index for $d=1$ and is equal to $\frac{\log T}{T}$ for $d=2$. Secondly, we show that it is not possible to find an estimator with a faster rate of estimation. Indeed, we obtain lower bounds with the same rates $\frac{1}{T}$ and $\frac{\log T}{T}$ in the one- and two-dimensional cases, respectively. Finally, we obtain the asymptotic normality of the estimator in the one-dimensional case.
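A minimal simulation of the estimator: the sketch below runs an Euler scheme for an illustrative one-dimensional jump diffusion (Ornstein-Uhlenbeck drift plus compound Poisson jumps; these dynamics are an arbitrary stand-in, not the paper's general class) and evaluates the kernel density estimator $\hat\pi_h(x) = \frac{1}{T}\int_0^T K_h(x - X_s)\,ds$ along the discretized path.

```python
import numpy as np

rng = np.random.default_rng(4)
T, dt = 500.0, 0.01
n = int(T / dt)

# Euler scheme for an illustrative 1-d jump diffusion:
# dX_t = -X_t dt + dW_t + jumps arriving at rate lam with N(0,1) sizes.
lam = 0.5
X = np.empty(n); X[0] = 0.0
for i in range(1, n):
    jump = rng.standard_normal() if rng.random() < lam * dt else 0.0
    X[i] = X[i-1] - X[i-1] * dt + np.sqrt(dt) * rng.standard_normal() + jump

def kde(x, path, h):
    """pi_hat(x) = (1/T) sum_i dt * K_h(x - X_{t_i}), Gaussian kernel K_h."""
    u = (x[:, None] - path[None, :]) / h
    return np.exp(-0.5 * u**2).sum(axis=1) * dt / (T * h * np.sqrt(2 * np.pi))

grid = np.linspace(-4.0, 4.0, 9)
print(kde(grid, X, h=0.2))               # estimated invariant density on a grid
```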
We show that for the problem of testing if a matrix $A \in F^{n \times n}$ has rank at most $d$, or requires changing an $\epsilon$-fraction of entries to have rank at most $d$, there is a non-adaptive query algorithm making $\widetilde{O}(d^2/\epsilon)$ queries. Our algorithm works for any field $F$. This improves upon the previous $O(d^2/\epsilon^2)$ bound (SODA'03), and bypasses an $\Omega(d^2/\epsilon^2)$ lower bound of (KDD'14) which holds if the algorithm is required to read a submatrix. Our algorithm is the first such algorithm which does not read a submatrix, and instead reads a carefully selected non-adaptive pattern of entries in rows and columns of $A$. We complement our algorithm with a matching query-complexity lower bound for non-adaptive testers over any field. We also give tight bounds of $\widetilde{\Theta}(d^2)$ queries in the sensing model, in which query access comes in the form of $\langle X_i, A\rangle:=\mathrm{tr}(X_i^\top A)$; perhaps surprisingly, these bounds do not depend on $\epsilon$. We next develop a novel property testing framework for testing numerical properties of a real-valued matrix $A$ more generally, including the stable rank, Schatten-$p$ norms, and SVD entropy. Specifically, we propose a bounded entry model, where $A$ is required to have entries bounded by $1$ in absolute value. We give upper and lower bounds for a wide range of problems in this model, and discuss connections to the sensing model above.
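For contrast with the paper's approach, here is a naive baseline tester that does read submatrices, rejecting only when a sampled $(d+1)\times(d+1)$ submatrix certifies rank greater than $d$; the trial budget is a free parameter, and the point of the discussion above is precisely that such submatrix readers are subject to the $\Omega(d^2/\epsilon^2)$ barrier.

```python
import numpy as np

def naive_rank_test(A, d, trials, rng):
    """Baseline tester: sample random (d+1) x (d+1) submatrices and reject iff
    one has rank > d (a certificate). Reads submatrices, so it is subject to
    the Omega(d^2/eps^2) barrier; the paper's O~(d^2/eps) algorithm instead
    reads a structured non-submatrix pattern of entries."""
    n = A.shape[0]
    for _ in range(trials):
        rows = rng.choice(n, size=d + 1, replace=False)
        cols = rng.choice(n, size=d + 1, replace=False)
        if np.linalg.matrix_rank(A[np.ix_(rows, cols)]) > d:
            return False
    return True

rng = np.random.default_rng(5)
B = rng.standard_normal((100, 3)) @ rng.standard_normal((3, 100))   # rank 3
print(naive_rank_test(B, d=3, trials=20, rng=rng))   # True: rank <= 3
print(naive_rank_test(B, d=2, trials=20, rng=rng))   # False w.h.p.: rank > 2
```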
Implicit probabilistic models are models defined naturally in terms of a sampling procedure; they often induce a likelihood function that cannot be expressed explicitly. We develop a simple method for estimating parameters in implicit models that does not require knowledge of the form of the likelihood function or any derived quantities, but can be shown to be equivalent to maximizing likelihood under some conditions. Our result holds in the non-asymptotic parametric setting, where both the capacity of the model and the number of data examples are finite. We also demonstrate encouraging experimental results.
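The abstract does not spell out the estimator, but a minimal sketch in its spirit, for a hypothetical one-dimensional location family, is to draw samples from the implicit model, match each data point to its nearest sample, and move the parameter to shrink the matched distances (cf. nearest-sample matching in implicit maximum likelihood estimation); the family, learning rate, and matching rule below are all illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(6)
data = rng.normal(3.0, 1.0, size=(200, 1))          # observed examples

def sampler(theta, n):
    """Implicit model: we can draw samples, but treat the density as unknown.
    Hypothetical 1-d location family x = theta + z, z ~ N(0, 1)."""
    return theta + rng.standard_normal((n, 1))

theta, lr = 0.0, 0.05
for _ in range(300):
    samples = sampler(theta, 500)
    # match each data point to its nearest model sample ...
    idx = np.argmin(np.abs(data[:, None, 0] - samples[None, :, 0]), axis=1)
    nearest = samples[idx]
    # ... and move theta to shrink the matched squared distances; for the
    # location family, d/dtheta ||x - (theta + z)||^2 = -2 (x - theta - z)
    theta += lr * 2.0 * (data - nearest).mean()
print(theta)                                        # ends up near 3.0
```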
In this work, we consider the distributed optimization of non-smooth convex functions using a network of computing units. We investigate this problem under two regularity assumptions: (1) the Lipschitz continuity of the global objective function, and (2) the Lipschitz continuity of local individual functions. Under the local regularity assumption, we provide the first optimal first-order decentralized algorithm called multi-step primal-dual (MSPD) and its corresponding optimal convergence rate. A notable aspect of this result is that, for non-smooth functions, while the dominant term of the error is in $O(1/\sqrt{t})$, the structure of the communication network only impacts a second-order term in $O(1/t)$, where $t$ is time. In other words, the error due to limits in communication resources decreases at a fast rate even in the case of non-strongly-convex objective functions. Under the global regularity assumption, we provide a simple yet efficient algorithm called distributed randomized smoothing (DRS) based on a local smoothing of the objective function, and show that DRS is within a $d^{1/4}$ multiplicative factor of the optimal convergence rate, where $d$ is the underlying dimension.
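A minimal sketch of the smoothing idea behind DRS: for Gaussian smoothing $f_\gamma(x) = \mathbb{E}[f(x+\gamma Z)]$ with $Z \sim N(0, I)$, the identity $\nabla f_\gamma(x) = \mathbb{E}[(f(x+\gamma Z)-f(x))Z]/\gamma$ gives an unbiased gradient estimator from function values alone. The single-machine demo below (test function, step size, and sample sizes are illustrative choices) is not the distributed DRS algorithm itself.

```python
import numpy as np

def smoothed_grad(f, x, gamma, n, rng):
    """Unbiased estimate of grad f_gamma(x), where f_gamma(x) = E[f(x + gamma Z)]:
    E[(f(x + gamma Z) - f(x)) Z] / gamma, averaged over n Gaussian draws."""
    Z = rng.standard_normal((n, x.size))
    fx = f(x)
    vals = np.array([f(x + gamma * z) for z in Z])
    return ((vals - fx)[:, None] * Z).mean(axis=0) / gamma

# minimize the non-smooth f(x) = ||x - 1||_1 via its smoothed surrogate
rng = np.random.default_rng(7)
f = lambda x: np.abs(x - 1.0).sum()
x = np.zeros(5)
for _ in range(400):
    x -= 0.05 * smoothed_grad(f, x, gamma=0.1, n=50, rng=rng)
print(np.round(x, 2))                    # near the minimizer (1, ..., 1)
```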
In this paper, we study the optimal convergence rate for distributed convex optimization problems in networks. We model the communication restrictions imposed by the network as a set of affine constraints and provide optimal complexity bounds for four different setups, namely when the function $F(\mathbf{x}) \triangleq \sum_{i=1}^{m}f_i(\mathbf{x})$ is (i) strongly convex and smooth, (ii) strongly convex, (iii) smooth, or (iv) just convex. Our results show that Nesterov's accelerated gradient descent on the dual problem can be executed in a distributed manner and obtains the same optimal rates as in the centralized version of the problem (up to constant or logarithmic factors), with an additional cost related to the spectral gap of the interaction matrix. Finally, we discuss some extensions of the proposed setup, such as proximal-friendly functions, time-varying graphs, and improvement of the condition numbers.
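As a toy illustration of this dual viewpoint, the sketch below runs Nesterov's accelerated gradient on the dual of a consensus least-squares problem, with the network constraint encoded as $W\mathbf{x}=0$ for a path-graph Laplacian $W$; each dual gradient evaluation costs one multiplication by $W$, i.e., one communication round. The quadratic $f_i$, the graph, and the step size are illustrative choices, not the paper's general setting.

```python
import numpy as np

# Path graph on m nodes: W is its Laplacian, so W x = 0 iff x is consensus.
m = 10
W = 2 * np.eye(m) - np.eye(m, k=1) - np.eye(m, k=-1)
W[0, 0] = W[-1, -1] = 1.0

rng = np.random.default_rng(8)
b = rng.standard_normal(m)            # local targets: f_i(x_i) = (x_i - b_i)^2 / 2

# Dual of  min ||x - b||^2 / 2  s.t.  W x = 0:  x*(lmb) = b - W lmb, and the
# dual gradient W x*(lmb) needs one multiplication by W (one communication).
step = 1.0 / np.linalg.eigvalsh(W)[-1] ** 2   # 1 / Lipschitz const. of dual grad
lmb = y = np.zeros(m)
t = 1.0
for _ in range(300):                  # Nesterov-accelerated dual ascent
    lmb_next = y + step * W @ (b - W @ y)
    t_next = (1 + np.sqrt(1 + 4 * t * t)) / 2
    y = lmb_next + (t - 1) / t_next * (lmb_next - lmb)
    lmb, t = lmb_next, t_next
x = b - W @ lmb
print(np.ptp(x), b.mean())            # x is near-consensus at the average of b
```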