Recent work by Dhulipala, Liu, Raskhodnikova, Shi, Shun, and Yu~\cite{DLRSSY22} initiated the study of the $k$-core decomposition problem under differential privacy. They show that approximate $k$-core numbers can be output under differential privacy while incurring only a multiplicative error of $(2+\eta)$ (for any constant $\eta>0$) and an additive error of $\poly(\log(n))/\eps$. In this paper, we revisit this problem. Our main result is an $\eps$-edge differentially private algorithm for $k$-core decomposition which outputs the core numbers with no multiplicative error and $O(\log(n)/\eps)$ additive error. This improves upon previous work by a factor of $2$ in the multiplicative error, while giving near-optimal additive error. With a little additional work, this implies improved algorithms for densest subgraph and low out-degree ordering under differential privacy. For low out-degree ordering, we give an $\eps$-edge differentially private algorithm which outputs an implicit orientation such that the out-degree of each vertex is at most $d + O(\log(n)/\eps)$, where $d$ is the degeneracy of the graph. This improves upon the best known guarantees for the problem by a factor of $4$ and gives near-optimal additive error. For densest subgraph, we give an $\eps$-edge differentially private algorithm outputting a subset of nodes that induces a subgraph of density at least $D^*/2 - O(\log(n)/\eps)$, where $D^*$ is the density of the optimal subgraph.
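Illustrative sketch (not the paper's algorithm, which privately releases all $n$ core numbers): to convey the $\eps$-edge-DP additive-error regime, the following Python snippet releases a single edge-sensitivity-$1$ statistic, the degeneracy $d$, via the classical Laplace mechanism. Since adding or removing one edge changes $d$ by at most $1$, Laplace noise of scale $1/\eps$ gives $\eps$-edge DP with $O(1/\eps)$ expected additive error. Function names and the graph encoding are our own.

```python
import heapq
import random

def degeneracy(adj):
    """Degeneracy via the standard min-degree peeling (Matula--Beck).
    adj: dict mapping each vertex to a list of its neighbours."""
    deg = {v: len(ns) for v, ns in adj.items()}
    removed = set()
    heap = [(d, v) for v, d in deg.items()]
    heapq.heapify(heap)
    best = 0
    while heap:
        d, v = heapq.heappop(heap)
        if v in removed or d != deg[v]:
            continue                      # stale heap entry
        best = max(best, d)
        removed.add(v)
        for u in adj[v]:
            if u not in removed:
                deg[u] -= 1
                heapq.heappush(heap, (deg[u], u))
    return best

def dp_degeneracy(adj, eps):
    """eps-edge-DP release of the degeneracy: one edge change moves it by
    at most 1, so Laplace noise of scale 1/eps suffices. Laplace(0, 1/eps)
    is sampled as a difference of two Exp(eps) variables."""
    return degeneracy(adj) + random.expovariate(eps) - random.expovariate(eps)
```

For example, `dp_degeneracy({0: [1], 1: [0, 2], 2: [1]}, eps=1.0)` releases a noisy degeneracy of a three-vertex path.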
This study introduces a novel method for estimating the entries and structure of a matrix $A$ in the linear factor model $\textbf{X} = A\textbf{Z} + \textbf{E}$, where $\textbf{X} \in \mathbb{R}^d$ is an observable vector, $\textbf{Z} \in \mathbb{R}^K$ is a vector of independent regularly varying random variables, and $\textbf{E} \in \mathbb{R}^d$ is light-tailed independent noise. This setting leads to the max-linear models treated in classical multivariate extreme value theory: the spectral measure of the limit distribution is then discrete and completely characterized by the matrix $A$, and every max-stable random vector with discrete spectral measure can be written as a max-linear model. Each row of the matrix $A$ is both scaled and sparse, and the value of $K$ is not known a priori. We address the problem of identifying the matrix $A$ from the matrix of pairwise extremal correlations. In the presence of pure variables, i.e., elements of $\textbf{X}$ linked through $A$ to a single latent factor, the matrix $A$ can be reconstructed from the extremal correlation matrix. Our proofs of identifiability are constructive and pave the way for our estimation procedure, which determines the number of factors $K$ and the matrix $A$ from $n$ weakly dependent observations of $\textbf{X}$. We apply the suggested method to weekly maxima of rainfall and to wildfires to demonstrate its applicability.
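Illustrative sketch (our own, not the paper's exact estimator): the pairwise extremal correlation matrix that drives the identification step can be estimated nonparametrically from ranks, e.g. as follows in Python.

```python
import numpy as np

def extremal_correlation_matrix(X, u=0.95):
    """Rank-based estimate of the pairwise extremal correlation (tail
    dependence coefficient chi): entry (i, j) estimates
    P(F_i(X_i) > u | F_j(X_j) > u) via joint exceedances of the
    empirical-rank transforms above level u. X is an (n, d) array."""
    n, _ = X.shape
    ranks = (np.argsort(np.argsort(X, axis=0), axis=0) + 1) / (n + 1)
    exceed = (ranks > u).astype(float)
    return (exceed.T @ exceed) / (n * (1.0 - u))
```

The diagonal is close to $1$ by construction, and the estimator stabilizes as $u \to 1$ with $n(1-u) \to \infty$.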
Leroux has proved that unreachability in Petri nets can be witnessed by a Presburger separator, i.e., if a marking $\vec{m}_\text{src}$ cannot reach a marking $\vec{m}_\text{tgt}$, then there is a formula $\varphi$ of Presburger arithmetic such that: $\varphi(\vec{m}_\text{src})$ holds; $\varphi$ is forward invariant, i.e., $\varphi(\vec{m})$ and $\vec{m} \rightarrow \vec{m}'$ imply $\varphi(\vec{m}')$; and $\neg \varphi(\vec{m}_\text{tgt})$ holds. While these separators could be used as explanations and as formal certificates of unreachability, this has not yet been the case due to their worst-case size, which is at least Ackermannian, and the complexity of checking that a formula is a separator, which is at least exponential (in the formula size). We show that, in continuous Petri nets, these two problems can be overcome. We introduce locally closed separators, and prove that: (a) unreachability can be witnessed by a locally closed separator computable in polynomial time; (b) checking whether a formula is a locally closed separator is in NC (so, simpler than unreachability, which is P-complete). We further consider the more general problem of (existential) set-to-set reachability, where two sets of markings are given as convex polytopes. We show that, while our approach does not extend directly, we can efficiently certify unreachability via an altered Petri net.
Let $X$ be a $d$-dimensional simplicial complex. A function $F\colon X(k)\to \{0,1\}^k$ is said to be a direct product function if there exists a function $f\colon X(1)\to \{0,1\}$ such that $F(\sigma) = (f(\sigma_1), \ldots, f(\sigma_k))$ for each $k$-face $\sigma$. In an effort to simplify components of the PCP theorem, Goldreich and Safra introduced the problem of direct product testing, which asks whether one can test if $F\colon X(k)\to \{0,1\}^k$ is correlated with a direct product function by querying $F$ on only $2$ inputs. Dinur and Kaufman conjectured that there exist bounded degree complexes with a direct product test in the small soundness regime. We resolve their conjecture by showing that for all $\delta>0$, there exists a family of high-dimensional expanders with degree $O_{\delta}(1)$ and a $2$-query direct product tester with soundness $\delta$. We use the characterization given by a subset of the authors and independently by Dikstein and Dinur, who showed that some form of non-Abelian coboundary expansion (which they called "Unique-Games coboundary expansion") is a necessary and sufficient condition for a complex to admit such direct product testers. Our main technical contribution is a general technique for showing coboundary expansion of complexes with coefficients in a non-Abelian group. This allows us to prove that the high dimensional expanders constructed by Chapman and Lubotzky satisfy the necessary conditions, thus admitting a 2-query direct product tester with small soundness.
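Illustrative sketch (our own, for the complete complex only; the paper works with sparse high-dimensional expanders, and the sampling pattern and parameter $t$ below are illustrative): one run of a $2$-query direct-product consistency test in Python.

```python
import random

def direct_product_test(F, n, k, t):
    """Sample a k-set sigma, a t-subset of it, and a second k-set sigma2
    containing that t-subset; accept iff F(sigma) and F(sigma2) agree on
    the overlap. F maps a sorted k-tuple of vertices to a dict
    {vertex: bit} (a convenient encoding of the tuple of bits)."""
    sigma = tuple(sorted(random.sample(range(n), k)))
    overlap = set(random.sample(sigma, t))
    fresh = random.sample([v for v in range(n) if v not in overlap], k - t)
    sigma2 = tuple(sorted(overlap | set(fresh)))
    a, b = F(sigma), F(sigma2)
    return all(a[v] == b[v] for v in overlap)
```

An honest direct product $F(\sigma) = (f(\sigma_1),\ldots,f(\sigma_k))$ passes every run; soundness asks, roughly, whether acceptance probability $\delta$ forces correlation with some direct product function.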
I study the problem of learning a Lipschitz function with corrupted binary signals. The learner tries to learn an $L$-Lipschitz function $f: [0,1]^d \rightarrow [0, L]$ chosen by the adversary. There are $T$ rounds in total. In each round $t$, the adversary selects a context vector $x_t$ in the input space, and the learner makes a guess at the true function value $f(x_t)$ and receives a binary signal indicating whether the guess is high or low. In up to $C$ of the rounds the signal may be corrupted, and the value of $C$ is \emph{unknown} to the learner. The learner's goal is to incur a small cumulative loss. This work introduces the new algorithmic technique of \emph{agnostic checking}, as well as new analysis techniques. I design algorithms achieving the following guarantees: for the symmetric loss, regret $L\cdot O(C\log T)$ when $d = 1$ and $L\cdot O_d(C\log T + T^{(d-1)/d})$ when $d > 1$; for the pricing loss, regret $L\cdot \widetilde{O} (T^{d/(d+1)} + C\cdot T^{1/(d+1)})$.
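Illustrative sketch (our own baseline, corruption-free $d=1$ case only; the paper's agnostic-checking machinery for the $C$ corrupted signals is not reproduced, and the grid resolution and all names are our choices): keep a plausible-value interval per grid cell, bisect on each query, and propagate the Lipschitz constraint.

```python
def learn_lipschitz_1d(contexts, feedback, L, m=64):
    """Corruption-free baseline for d = 1. Maintain an interval
    [lo[i], hi[i]] of plausible values of f on each of m grid cells,
    guess the midpoint, halve the interval using the binary signal, and
    tighten neighbouring cells via the L-Lipschitz constraint.
    Within-cell variation of order L/m is ignored in this sketch.
    feedback(t, guess) returns +1 if the guess is high, -1 if low."""
    lo, hi = [0.0] * m, [float(L)] * m
    guesses = []
    for t, x in enumerate(contexts):
        i = min(int(x * m), m - 1)
        g = (lo[i] + hi[i]) / 2.0
        guesses.append(g)
        if feedback(t, g) > 0:
            hi[i] = g                       # guess was too high
        else:
            lo[i] = g                       # guess was too low
        for j in range(m):                  # Lipschitz propagation
            gap = L * abs(i - j) / m
            lo[j] = max(lo[j], lo[i] - gap)
            hi[j] = min(hi[j], hi[i] + gap)
    return guesses
```

A single corrupted signal can poison this baseline's intervals permanently, which is exactly the failure mode the agnostic-checking technique is designed to detect and undo.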
We study the problem of online conditional distribution estimation with \emph{unbounded} label sets under local differential privacy. Let $\mathcal{F}$ be a distribution-valued function class with an unbounded label set. We aim to estimate an \emph{unknown} function $f\in \mathcal{F}$ in an online fashion, so that at time $t$, when the context $\boldsymbol{x}_t$ is provided, we can generate an estimate of $f(\boldsymbol{x}_t)$ under KL-divergence, knowing only a privatized version of the true labels sampled from $f(\boldsymbol{x}_t)$. The ultimate objective is to minimize the cumulative KL-risk over a finite horizon $T$. We show that under $(\epsilon,0)$-local differential privacy of the privatized labels, the KL-risk grows as $\tilde{\Theta}(\frac{1}{\epsilon}\sqrt{KT})$ up to poly-logarithmic factors, where $K=|\mathcal{F}|$. This is in stark contrast to the $\tilde{\Theta}(\sqrt{T\log K})$ bound demonstrated by Wu et al. (2023a) for bounded label sets. As a byproduct, our results recover a nearly tight upper bound for the hypothesis selection problem of Gopi et al. (2020), which was previously established only for the batch setting.
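Illustrative sketch (baseline only: the classical $k$-ary randomized response mechanism for a \emph{finite} alphabet; the unbounded-label setting studied here requires more delicate mechanisms): a standard $(\epsilon,0)$-LDP label privatizer in Python.

```python
import math
import random

def k_rr(label, alphabet, eps):
    """k-ary randomized response: report the true label with probability
    e^eps / (e^eps + k - 1), otherwise a uniformly random other label.
    Any two inputs induce output distributions within a factor e^eps of
    each other, hence (eps, 0)-local differential privacy."""
    k = len(alphabet)
    if random.random() < math.exp(eps) / (math.exp(eps) + k - 1):
        return label
    return random.choice([a for a in alphabet if a != label])
```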
In Linear Logic ($\mathsf{LL}$), the exponential modality $!$ brings forth a distinction between non-linear proofs and linear proofs, where linear means using an argument exactly once. Differential Linear Logic ($\mathsf{DiLL}$) is an extension of Linear Logic with additional rules for $!$ which encode differentiation and the ability to linearize proofs. Graded Linear Logic ($\mathsf{GLL}$), on the other hand, is a variation of Linear Logic in which $!$ is indexed over a semiring $R$. This $R$-grading allows for non-linear proofs of degree $r \in R$, with the linear proofs being those of degree $1 \in R$. There has been recent interest in combining these two variations of $\mathsf{LL}$ and developing Graded Differential Linear Logic ($\mathsf{GDiLL}$). In this paper we present a sequent calculus for $\mathsf{GDiLL}$ and introduce its categorical semantics, which we call graded differential categories, using both coderelictions and deriving transformations. We prove that symmetric powers always give graded differential categories, and provide other examples of graded differential categories. We also discuss graded versions of (monoidal) coalgebra modalities, additive bialgebra modalities, and the Seely isomorphisms, as well as their implementation in the sequent calculus of $\mathsf{GDiLL}$.
If $G$ is a group, we say a subset $S$ of $G$ is product-free if the equation $xy=z$ has no solutions with $x,y,z \in S$. For $D \in \mathbb{N}$, a group $G$ is said to be $D$-quasirandom if the minimal dimension of a nontrivial complex irreducible representation of $G$ is at least $D$. Gowers showed that in a $D$-quasirandom finite group $G$, the maximal size of a product-free set is at most $|G|/D^{1/3}$. This disproved a longstanding conjecture of Babai and S\'os from 1985. For the special unitary group, $G=SU(n)$, Gowers observed that his argument yields an upper bound of $n^{-1/3}$ on the measure of a measurable product-free subset. In this paper, we improve Gowers' upper bound to $\exp(-cn^{1/3})$, where $c>0$ is an absolute constant. In fact, we establish something stronger, namely, product-mixing for measurable subsets of $SU(n)$ with measure at least $\exp(-cn^{1/3})$; for this product-mixing result, the $n^{1/3}$ in the exponent is sharp. Our approach involves introducing novel hypercontractive inequalities, which imply that the non-Abelian Fourier spectrum of the indicator function of a small set concentrates on high-dimensional irreducible representations. Our hypercontractive inequalities are obtained via methods from representation theory, harmonic analysis, random matrix theory and differential geometry. We generalize our hypercontractive inequalities from $SU(n)$ to an arbitrary $D$-quasirandom compact connected Lie group for $D$ at least an absolute constant, thereby extending our results on product-free sets to such groups. We also demonstrate various other applications of our inequalities to geometry (viz., non-Abelian Brunn-Minkowski type inequalities), mixing times, and the theory of growth in compact Lie groups.
We investigate the complexity of solving stable or perturbation-resilient instances of $k$-Means and $k$-Median clustering in fixed-dimension Euclidean metrics (and, more generally, doubling metrics). The notion of stable (perturbation-resilient) instances was introduced by Bilu and Linial [2010] and Awasthi et al. [2012]. In our context, we say a $k$-Means instance is $\alpha$-stable if there is a unique optimum which remains optimal if distances are (non-uniformly) stretched by a factor of at most $\alpha$. Stable clustering instances have been studied to explain why heuristics such as Lloyd's algorithm perform well in practice. In this work we show that for any fixed $\epsilon>0$, $(1+\epsilon)$-stable instances of $k$-Means in doubling metrics can be solved in polynomial time. More precisely, we show that a natural multiswap local search algorithm finds the optimum for $(1+\epsilon)$-stable instances of $k$-Means and $k$-Median in a polynomial number of iterations. We complement this result by showing that, under a new PCP theorem, this is essentially tight: when the dimension $d$ is part of the input, there is a fixed $\epsilon_0>0$ such that there is not even a PTAS for $(1+\epsilon_0)$-stable $k$-Means in $\mathbb{R}^d$ unless NP=RP. To do this, we consider a robust property of CSPs: call an instance stable if there is a unique optimum solution $x^*$ and, for any other solution $x'$, the number of unsatisfied clauses is proportional to the Hamming distance between $x^*$ and $x'$. Dinur et al. have already shown that stable $Q$SAT is hard to approximate for some constant $Q$; our hypothesis is simply that stable $Q$SAT with bounded variable occurrence is also hard. Given this hypothesis, we consider "stability-preserving" reductions to prove our hardness result for stable $k$-Means. Such reductions seem to be more fragile than standard $L$-reductions and may be of further use in showing that other stable optimization problems are hard.
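Illustrative sketch (our own, deliberately naive Python version of multiswap local search; it is exponential in $p$ and restricts centers to input points, and the stability guarantee comes from the paper's analysis, not from this code):

```python
from itertools import combinations

def cost(points, centers):
    """k-Means cost: sum over points of the squared distance
    to the nearest center (points and centers are tuples)."""
    return sum(min(sum((a - b) ** 2 for a, b in zip(p, c)) for c in centers)
               for p in points)

def multiswap_local_search(points, k, p=2):
    """Exhaustive p-swap local search: while swapping at most p current
    centers for at most p non-centers strictly decreases the cost,
    perform such a swap; stop at a local optimum."""
    centers = list(points[:k])              # arbitrary initialization
    while True:
        base = cost(points, centers)
        best = None
        for q in range(1, p + 1):
            for out in combinations(centers, q):
                pool = [pt for pt in points if pt not in centers]
                for inc in combinations(pool, q):
                    trial = [c for c in centers if c not in out] + list(inc)
                    if cost(points, trial) < base:
                        best = trial
        if best is None:
            return centers
        centers = best
```

On $(1+\epsilon)$-stable instances the result above says a local optimum of such a scheme is in fact the global optimum, reached in polynomially many iterations.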
Conformal prediction (CP) is a method for constructing a prediction interval around the output of a fitted model, whose validity does not rely on the model being correct: the CP interval offers a coverage guarantee that is distribution-free, but relies on the training data being drawn from the same distribution as the test data. A recent variant, weighted conformal prediction (WCP), reweights the method to allow for covariate shift between the training and test distributions. However, WCP requires knowledge of the nature of the covariate shift, specifically, the likelihood ratio between the test and training covariate distributions. In practice, since this likelihood ratio is estimated rather than known exactly, the coverage guarantee may degrade due to the estimation error. In this paper, we consider a special scenario where observations belong to a finite number of groups, and these groups determine the covariate shift between the training and test distributions; for instance, this may arise if the training set is collected via stratified sampling. Our results demonstrate that in this special case, the predictive coverage guarantees of WCP can be drastically improved beyond the bounds implied by existing estimation-error analyses.
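Illustrative sketch (our own Python rendering of the standard WCP quantile computation, specialized to group-determined weights; names and the split-conformal setup are our assumptions):

```python
import numpy as np

def weighted_quantile_radius(scores, groups, test_group, w, alpha=0.1):
    """Weighted split-conformal quantile under group-driven covariate
    shift. scores: nonconformity scores of the calibration points;
    groups: group label of each calibration point; w[g]: likelihood
    ratio p_test(group = g) / p_train(group = g). Following the WCP
    recipe, the test point gets weight w[test_group] and a score of
    +infinity, and we return the weighted (1 - alpha)-quantile."""
    s = np.append(np.asarray(scores, dtype=float), np.inf)
    wts = np.array([w[g] for g in groups] + [w[test_group]], dtype=float)
    wts /= wts.sum()
    order = np.argsort(s)
    cdf = np.cumsum(wts[order])
    return s[order][np.searchsorted(cdf, 1.0 - alpha)]
```

The prediction set at a test covariate $x$ is then $\{y : \text{score}(x, y) \le \text{radius}\}$, and the question studied here is how well coverage survives when the weights $w[g]$ are themselves estimated.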
Federated Learning (FL) is a decentralized machine-learning paradigm in which a global server iteratively averages the model parameters of local users without accessing their data. User heterogeneity has imposed significant challenges on FL, as it can incur drifted global models that are slow to converge. Knowledge distillation has recently emerged to tackle this issue, by refining the server model using aggregated knowledge from heterogeneous users rather than directly averaging their model parameters. This approach, however, depends on a proxy dataset, making it impractical unless such a prerequisite is satisfied. Moreover, the ensemble knowledge is not fully utilized to guide local model learning, which may in turn affect the quality of the aggregated model. Inspired by prior art, we propose a data-free knowledge distillation approach to address heterogeneous FL, where the server learns a lightweight generator to ensemble user information in a data-free manner, which is then broadcast to users, regulating local training using the learned knowledge as an inductive bias. Empirical studies backed by theoretical implications show that our approach facilitates FL with better generalization performance using fewer communication rounds, compared with the state of the art.
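Illustrative sketch (our own minimal PyTorch rendering of the server-side knowledge-extraction idea; the generator signature `G(z, y)`, all names, and the plain ensemble loss are our assumptions, and the actual method additionally uses diversity terms and a user-side regularizer built from the generator):

```python
import torch
import torch.nn.functional as F

def server_generator_step(G, user_heads, opt, batch=64, z_dim=32, n_cls=10):
    """One server-side knowledge-extraction step: the generator G maps
    (noise, label) to a synthetic feature vector that the ensemble of
    user classifier heads should label correctly, so G distils the
    users' aggregated knowledge without touching any real data."""
    z = torch.randn(batch, z_dim)
    y = torch.randint(0, n_cls, (batch,))
    feat = G(z, y)                                        # synthetic features
    logits = torch.stack([h(feat) for h in user_heads]).mean(dim=0)
    loss = F.cross_entropy(logits, y)                     # ensemble agreement
    opt.zero_grad()
    loss.backward()
    opt.step()
    return loss.item()
```

The trained generator is then broadcast so that each user can sample synthetic features as an inductive bias during local training.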