In this paper, by constructing extremely hard examples of CSP (with large domains) and SAT (with long clauses), we prove that such examples cannot be solved without exhaustive search, which implies a weaker conclusion P $\neq$ NP. This constructive approach for proving impossibility results is very different (and missing) from those currently used in computational complexity theory, but is similar to that used by Kurt G\"{o}del in proving his famous logical impossibility results. Just as shown by G\"{o}del's results that proving formal unprovability is feasible in mathematics, the results of this paper show that proving computational hardness is not hard in mathematics. The intuition behind this mathematical tractability is that proving exhaustive search for constructed examples avoids handling numerous effective strategies of avoiding exhaustive search that exist for many hard problems such as 3-SAT. Consequently, it makes the separation of lower bounds between SAT (with long clauses) and 3-SAT much easier than that between 3-SAT and 2-SAT.
A classic 1993 paper by Alth\H{o}fer et al. proved a tight reduction from spanners, emulators, and distance oracles to the extremal function $\gamma$ of high-girth graphs. This paper initiated a large body of work in network design, in which problems are attacked by reduction to $\gamma$ or the analogous extremal function for other girth concepts. In this paper, we introduce and study a new girth concept that we call the bridge girth of path systems, and we show that it can be used to significantly expand and improve this web of connections between girth problems and network design. We prove two kinds of results: 1) We write the maximum possible size of an $n$-node, $p$-path system with bridge girth $>k$ as $\beta(n, p, k)$, and we write a certain variant for "ordered" path systems as $\beta^*(n, p, k)$. We identify several arguments in the literature that implicitly show upper or lower bounds on $\beta, \beta^*$, and we provide some polynomially improvements to these bounds. In particular, we construct a tight lower bound for $\beta(n, p, 2)$, and we polynomially improve the upper bounds for $\beta(n, p, 4)$ and $\beta^*(n, p, \infty)$. 2) We show that many state-of-the-art results in network design can be recovered or improved via black-box reductions to $\beta$ or $\beta^*$. Examples include bounds for distance/reachability preservers, exact hopsets, shortcut sets, the flow-cut gaps for directed multicut and sparsest cut, an integrality gap for directed Steiner forest. We believe that the concept of bridge girth can lead to a stronger and more organized map of the research area. Towards this, we leave many open problems, related to both bridge girth reductions and extremal bounds on the size of path systems with high bridge girth.
Machine learning models exhibit two seemingly contradictory phenomena: training data memorization, and various forms of forgetting. In memorization, models overfit specific training examples and become susceptible to privacy attacks. In forgetting, examples which appeared early in training are forgotten by the end. In this work, we connect these phenomena. We propose a technique to measure to what extent models "forget" the specifics of training examples, becoming less susceptible to privacy attacks on examples they have not seen recently. We show that, while non-convex models can memorize data forever in the worst-case, standard image, speech, and language models empirically do forget examples over time. We identify nondeterminism as a potential explanation, showing that deterministically trained models do not forget. Our results suggest that examples seen early when training with extremely large datasets - for instance those examples used to pre-train a model - may observe privacy benefits at the expense of examples seen later.
Diffusion-based generative graph models have been proven effective in generating high-quality small graphs. However, they need to be more scalable for generating large graphs containing thousands of nodes desiring graph statistics. In this work, we propose EDGE, a new diffusion-based generative graph model that addresses generative tasks with large graphs. To improve computation efficiency, we encourage graph sparsity by using a discrete diffusion process that randomly removes edges at each time step and finally obtains an empty graph. EDGE only focuses on a portion of nodes in the graph at each denoising step. It makes much fewer edge predictions than previous diffusion-based models. Moreover, EDGE admits explicitly modeling the node degrees of the graphs, further improving the model performance. The empirical study shows that EDGE is much more efficient than competing methods and can generate large graphs with thousands of nodes. It also outperforms baseline models in generation quality: graphs generated by our approach have more similar graph statistics to those of the training graphs.
Modern DNN workloads increasingly rely on activation functions consisting of computationally complex operations. This poses a challenge to current accelerators optimized for convolutions and matrix-matrix multiplications. This work presents Flex-SFU, a lightweight hardware accelerator for activation functions implementing non-uniform piecewise interpolation supporting multiple data formats. Non-Uniform segments and floating-point numbers are enabled by implementing a binary-tree comparison within the address decoding unit. An SGD-based optimization algorithm with heuristics is proposed to find the interpolation function reducing the mean squared error. Thanks to non-uniform interpolation and floating-point support, Flex-SFU achieves on average 22.3x better mean squared error compared to previous piecewise linear interpolation approaches. The evaluation with more than 700 computer vision and natural language processing models shows that Flex-SFU can, on average, improve the end-to-end performance of state-of-the-art AI hardware accelerators by 35.7%, achieving up to 3.3x speedup with negligible impact in the models' accuracy when using 32 segments, and only introducing an area and power overhead of 5.9% and 0.8% relative to the baseline vector processing unit.
Privacy-preserving instance encoding aims to encode raw data as feature vectors without revealing their privacy-sensitive information. When designed properly, these encodings can be used for downstream ML applications such as training and inference with limited privacy risk. However, the vast majority of existing instance encoding schemes are based on heuristics and their privacy-preserving properties are only validated empirically against a limited set of attacks. In this paper, we propose a theoretically-principled measure for the privacy of instance encoding based on Fisher information. We show that our privacy measure is intuitive, easily applicable, and can be used to bound the invertibility of encodings both theoretically and empirically.
This paper investigates the problem of efficient constrained global optimization of composite functions (hybrid models) whose input is an expensive black-box function with vector-valued outputs and noisy observations, which often arises in real-world science, engineering, manufacturing, and control applications. We propose a novel algorithm, Constrained Upper Quantile Bound (CUQB), to solve such problems that directly exploits the composite structure of the objective and constraint functions that we show leads substantially improved sampling efficiency. CUQB is conceptually simple and avoids the constraint approximations used by previous methods. Although the CUQB acquisition function is not available in closed form, we propose a novel differentiable stochastic approximation that enables it to be efficiently maximized. We further derive bounds on the cumulative regret and constraint violation. Since these bounds depend sublinearly on the number of iterations under some regularity assumptions, we establish explicit bounds on the convergence rate to the optimal solution of the original constrained problem. In contrast to existing methods, CUQB further incorporates a simple infeasibility detection scheme, which we prove triggers in a finite number of iterations (with high probability) when the original problem is infeasible. Numerical experiments on several test problems, including environmental model calibration and real-time reactor optimization, show that CUQB significantly outperforms traditional Bayesian optimization in both constrained and unconstrained cases. Furthermore, compared to other state-of-the-art methods that exploit composite structure, CUQB achieves competitive empirical performance while also providing substantially improved theoretical guarantees.
Recently, Man\v{c}inska and Roberson proved that two graphs $G$ and $G'$ are quantum isomorphic if and only if they admit the same number of homomorphisms from all planar graphs. We extend this result to planar #CSP with any pair of sets $\mathcal{F}$ and $\mathcal{F}'$ of real-valued, arbitrary-arity constraint functions. Graph homomorphism is the special case where each of $\mathcal{F}$ and $\mathcal{F}'$ contains a single symmetric 0-1-valued binary constraint function. Our treatment uses the framework of planar Holant problems. To prove that quantum isomorphic constraint function sets give the same value on any planar #CSP instance, we apply a novel form of holographic transformation of Valiant, using the quantum permutation matrix $\mathcal{U}$ defining the quantum isomorphism. Due to the noncommutativity of $\mathcal{U}$'s entries, it turns out that this form of holographic transformation is only applicable to planar Holant. To prove the converse, we introduce the quantum automorphism group Qut$(\mathcal{F})$ of a set of constraint functions $\mathcal{F}$, and characterize the intertwiners of Qut$(\mathcal{F})$ as the signature matrices of planar Holant$(\mathcal{F}\,|\,\mathcal{EQ})$ quantum gadgets. Then we define a new notion of (projective) connectivity for constraint functions and reduce arity while preserving the quantum automorphism group. Finally, to address the challenges posed by generalizing from 0-1 valued to real-valued constraint functions, we adapt a technique of Lov\'asz in the classical setting for isomorphisms of real-weighted graphs to the setting of quantum isomorphisms.
We explore the features of two methods of stabilization, aggregation and supremizer methods, for reduced-order modeling of parametrized optimal control problems. In both methods, the reduced basis spaces are augmented to guarantee stability. For the aggregation method, the reduced basis approximation spaces for the state and adjoint variables are augmented in such a way that the spaces are identical. For the supremizer method, the reduced basis approximation space for the state-control product space is augmented with the solutions of a supremizer equation. We implement both of these methods for solving several parametrized control problems and assess their performance. Results indicate that the number of reduced basis vectors needed to approximate the solution space to some tolerance with the supremizer method is much larger, possibly double, that for aggregation. There are also some cases where the supremizer method fails to produce a converged solution. We present results to compare the accuracy, efficiency, and computational costs associated with both methods of stabilization which suggest that stabilization by aggregation is a superior stabilization method for control problems.
As soon as abstract mathematical computations were adapted to computation on digital computers, the problem of efficient representation, manipulation, and communication of the numerical values in those computations arose. Strongly related to the problem of numerical representation is the problem of quantization: in what manner should a set of continuous real-valued numbers be distributed over a fixed discrete set of numbers to minimize the number of bits required and also to maximize the accuracy of the attendant computations? This perennial problem of quantization is particularly relevant whenever memory and/or computational resources are severely restricted, and it has come to the forefront in recent years due to the remarkable performance of Neural Network models in computer vision, natural language processing, and related areas. Moving from floating-point representations to low-precision fixed integer values represented in four bits or less holds the potential to reduce the memory footprint and latency by a factor of 16x; and, in fact, reductions of 4x to 8x are often realized in practice in these applications. Thus, it is not surprising that quantization has emerged recently as an important and very active sub-area of research in the efficient implementation of computations associated with Neural Networks. In this article, we survey approaches to the problem of quantizing the numerical values in deep Neural Network computations, covering the advantages/disadvantages of current methods. With this survey and its organization, we hope to have presented a useful snapshot of the current research in quantization for Neural Networks and to have given an intelligent organization to ease the evaluation of future research in this area.
Normalization is known to help the optimization of deep neural networks. Curiously, different architectures require specialized normalization methods. In this paper, we study what normalization is effective for Graph Neural Networks (GNNs). First, we adapt and evaluate the existing methods from other domains to GNNs. Faster convergence is achieved with InstanceNorm compared to BatchNorm and LayerNorm. We provide an explanation by showing that InstanceNorm serves as a preconditioner for GNNs, but such preconditioning effect is weaker with BatchNorm due to the heavy batch noise in graph datasets. Second, we show that the shift operation in InstanceNorm results in an expressiveness degradation of GNNs for highly regular graphs. We address this issue by proposing GraphNorm with a learnable shift. Empirically, GNNs with GraphNorm converge faster compared to GNNs using other normalization. GraphNorm also improves the generalization of GNNs, achieving better performance on graph classification benchmarks.