We compare the solutions of two systems of partial differential equations (PDEs), viewed as two different interpretations of the same model describing the formation of complex biological networks. Both approaches account for the time evolution of the medium flowing through the network, and we compute the solution of an elliptic-parabolic PDE system for the conductivity vector $m$, the conductivity tensor $\mathbb{C}$, and the pressure $p$. We solve both systems with finite difference schemes on a uniform Cartesian grid in the spatially two-dimensional setting, where the parabolic equation is discretized by a semi-implicit scheme in time. Since the conductivity vector and tensor also appear in the Poisson equation for the pressure $p$, the elliptic equation depends implicitly on time. For this reason, at each time step we solve three linear systems in the case of the conductivity vector $m\in\mathbb{R}^2$ and four linear systems in the case of the symmetric conductivity tensor $\mathbb{C}\in\mathbb{R}^{2\times 2}$. To accelerate the simulations, we employ the Alternating Direction Implicit (ADI) method. The choice of parameters strongly affects the structure of the computed networks, so we provide numerous tests with various parameter values to highlight the differences between the solutions of the two systems.
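As a minimal illustration of the time stepping, the sketch below performs one Peaceman-Rachford ADI step for a generic two-dimensional diffusion-type parabolic equation with homogeneous Dirichlet boundaries; the coupling to the elliptic pressure equation and the model-specific terms for $m$ and $\mathbb{C}$ are omitted, and all names and parameters are illustrative assumptions.

```python
import numpy as np
from scipy.linalg import solve_banded

def adi_step(u, r):
    """One Peaceman-Rachford ADI step for u_t = D * laplace(u) on an
    n-by-n uniform grid with homogeneous Dirichlet boundaries (u = 0 on
    boundary rows/columns); r = D * dt / (2 * h**2)."""
    n = u.shape[0]
    # Tridiagonal operator (I - r * delta2) in banded storage, with
    # identity rows at the two boundary points.
    ab = np.zeros((3, n))
    ab[0, 2:] = -r                 # superdiagonal
    ab[1, 1:-1] = 1.0 + 2.0 * r    # main diagonal (interior points)
    ab[2, :-2] = -r                # subdiagonal
    ab[1, 0] = ab[1, -1] = 1.0     # boundary values stay fixed

    def delta2(v):  # second difference along axis 0, zero on boundaries
        out = np.zeros_like(v)
        out[1:-1] = v[2:] - 2.0 * v[1:-1] + v[:-2]
        return out

    # Half-step 1: implicit in x (solve along rows), explicit in y.
    rhs = u + r * delta2(u)
    ustar = np.stack([solve_banded((1, 1), ab, rhs[i]) for i in range(n)])
    # Half-step 2: implicit in y (solve along columns), explicit in x.
    rhs = ustar + r * delta2(ustar.T).T
    return np.stack([solve_banded((1, 1), ab, rhs[:, j]) for j in range(n)]).T
```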
Despite significant advances, deep networks remain highly susceptible to adversarial attack. One fundamental challenge is that small input perturbations can often produce large movements in the network's final-layer feature space. In this paper, we define an attack model that abstracts this challenge, to help understand its intrinsic properties. In our model, the adversary may move data an arbitrary distance in feature space, but only in random low-dimensional subspaces. We prove that such adversaries can be quite powerful: they defeat any algorithm that must classify every input it is given. However, by allowing the algorithm to abstain on unusual inputs, we show that such adversaries can be overcome when classes are reasonably well-separated in feature space. We further provide strong theoretical guarantees for setting algorithm parameters to optimize over accuracy-abstention trade-offs using data-driven methods. Our results provide new robustness guarantees for nearest-neighbor style algorithms, and also have applications to contrastive learning, where we empirically demonstrate the ability of such algorithms to obtain high robust accuracy with low abstention rates. Our model is also motivated by strategic classification, where entities being classified aim to manipulate their observable features to produce a preferred classification, and we provide new insights into that area as well.
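The abstention mechanism can be made concrete with a small sketch: a nearest-neighbor style classifier that refuses to predict when the input lies far from all training data, as an adversarially moved point would. The threshold `tau` stands in for the separation-based quantities the paper optimizes; it is an assumption of this sketch, not the paper's algorithm.

```python
import numpy as np

def abstaining_1nn(X_train, y_train, x, tau):
    """1-nearest-neighbor prediction that abstains on unusual inputs:
    if x is farther than tau from every training point (as after a large
    adversarial move in feature space), return None instead of a forced
    guess. tau would be tuned against the class separation."""
    d = np.linalg.norm(X_train - x, axis=1)
    j = int(np.argmin(d))
    return y_train[j] if d[j] <= tau else None
```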
Learned regularization for MRI reconstruction can provide complex data-driven priors for inverse problems while retaining the control and insight of a variational regularization method. Moreover, unsupervised learning, without paired training data, allows the learned regularizer to remain flexible to changes in the forward problem such as noise level, sampling pattern or coil sensitivities. One such approach uses generative models, trained on ground-truth images, as priors for inverse problems, penalizing reconstructions far from images the generator can produce. In this work, we utilize variational autoencoders (VAEs) that generate not only an image but also a covariance uncertainty matrix for each image. The covariance can model the changing uncertainty dependencies caused by structure in the image, such as edges or objects, and provides a new distance metric from the manifold of learned images. We demonstrate these novel generative regularizers on radially sub-sampled knee MRI measurements from the fastMRI dataset and compare them to other unlearned, unsupervised and supervised methods. Our results show that the proposed method is competitive with other state-of-the-art methods and behaves consistently across changing sampling patterns and noise levels.
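To illustrate the idea, here is a minimal sketch of such a generative regularizer under a diagonal-covariance simplification: the distance of a candidate image from the learned manifold is the covariance-weighted residual to the closest decoder output, plus a latent prior term. The `ToyDecoder` and all hyperparameters are hypothetical stand-ins for a trained VAE.

```python
import torch

class ToyDecoder(torch.nn.Module):
    """Hypothetical VAE decoder: maps a latent z to an image mean and a
    per-pixel log-variance (a diagonal stand-in for the full covariance)."""
    def __init__(self, d_latent=8, d_image=64):
        super().__init__()
        self.mu = torch.nn.Linear(d_latent, d_image)
        self.log_var = torch.nn.Linear(d_latent, d_image)

    def forward(self, z):
        return self.mu(z), self.log_var(z)

def generative_regularizer(x, decoder, d_latent=8, steps=200, lr=1e-2):
    """Distance of x from the learned manifold: minimize over z the
    covariance-weighted residual (Gaussian log-likelihood, up to
    constants) plus a standard-normal prior term on z."""
    z = torch.zeros(1, d_latent, requires_grad=True)
    opt = torch.optim.Adam([z], lr=lr)
    for _ in range(steps):
        opt.zero_grad()
        mu, log_var = decoder(z)
        loss = (((x - mu) ** 2 * torch.exp(-log_var) + log_var).sum()
                + (z ** 2).sum())
        loss.backward()
        opt.step()
    return loss.detach()

x = torch.randn(1, 64)  # stand-in for a candidate reconstruction
print(generative_regularizer(x, ToyDecoder()))
```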
Estimating the expectations of functionals applied to sums of random variables (RVs) is a well-known problem encountered in many challenging applications. Closed-form expressions for these quantities are generally out of reach. Naive Monte Carlo simulation is an alternative approach, but it requires a prohibitive number of samples for rare-event problems. It is therefore paramount to use variance reduction techniques to develop fast and efficient estimation methods. In this work, we use importance sampling (IS), known for achieving a prescribed accuracy with far fewer samples. We propose a state-dependent IS scheme based on a stochastic optimal control formulation, where the control depends on both state and time. We aim to calculate rare-event quantities that can be written as the expectation of a functional of a sum of independent RVs. The proposed algorithm is generic and can be applied without restrictions on the univariate distributions of the RVs or the functional applied to the sum. We apply this approach to the log-normal distribution to compute the left tail and the cumulative distribution of the ratio of independent RVs. For each case, we numerically demonstrate that the proposed state-dependent IS algorithm compares favorably with most of the well-known estimators for similar problems.
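As a minimal illustration of the IS idea (not the paper's state- and time-dependent control), the sketch below estimates the left-tail probability of a sum of i.i.d. log-normal RVs by applying a constant mean shift to the underlying Gaussians and reweighting with the likelihood ratio; all parameter values are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)

def left_tail_is(gamma, n_rv=10, mu=0.0, sigma=1.0, shift=-1.5, n=10**5):
    """Estimate P(sum of n_rv i.i.d. log-normals <= gamma) by importance
    sampling: draw the underlying Gaussians from N(shift, 1) instead of
    N(0, 1) and reweight. A constant shift is used here; the paper's
    control depends on state and time."""
    z = rng.normal(shift, 1.0, size=(n, n_rv))
    s = np.exp(mu + sigma * z).sum(axis=1)
    # Likelihood ratio N(0,1)/N(shift,1), one factor per RV.
    log_w = (-shift * z + 0.5 * shift ** 2).sum(axis=1)
    return np.mean((s <= gamma) * np.exp(log_w))

print(left_tail_is(gamma=1.0))  # rare left-tail event for 10 log-normals
```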
Recently, cloud-based graph convolutional networks (GCNs) have demonstrated great success and potential in many privacy-sensitive applications such as personal healthcare and financial systems. Despite their high inference accuracy and performance in the cloud, maintaining data privacy in GCN inference, which is of paramount importance to these practical applications, remains largely unexplored. In this paper, we take a first step towards this goal and develop $\textit{CryptoGCN}$, a homomorphic encryption (HE) based GCN inference framework. A key to the success of our approach is reducing the tremendous computational overhead of HE operations, which can be orders of magnitude higher than their counterparts in the plaintext space. To this end, we develop an approach that effectively exploits the sparsity of matrix operations in GCN inference to significantly reduce this overhead. Specifically, we propose a novel AMA data formatting method and associated spatial convolution methods, which exploit the complex graph structure and perform efficient matrix-matrix multiplication under HE, greatly reducing the number of HE operations. We also develop a co-optimization framework that explores the trade-offs among accuracy, security level, and computational overhead by judicious pruning and polynomial approximation of the activation module in GCNs. On the NTU-XVIEW skeleton joint dataset, i.e., the largest dataset evaluated homomorphically that we are aware of, our experimental results demonstrate that $\textit{CryptoGCN}$ outperforms state-of-the-art solutions in latency and in the number of homomorphic operations, achieving up to a 3.10$\times$ speedup in latency and a 77.4\% reduction in the total homomorphic operation count, with a small accuracy loss of 1-1.5$\%$.
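The core savings can be illustrated in plaintext: a GCN layer $AXW$ in which the adjacency matrix is stored sparsely, so the multiplication count scales with the number of edges rather than $n^2$. This is only a stand-in for the AMA formatting, which additionally packs the data into rotation-friendly ciphertext layouts; no actual HE library is used here.

```python
import numpy as np
from scipy.sparse import csr_matrix

def gcn_layer_sparse(A, X, W):
    """One plaintext GCN layer A @ X @ W exploiting the sparsity of the
    adjacency A. Under HE every plaintext multiply becomes an expensive
    ciphertext operation, so skipping the zero entries of A is what
    drives the reduction in homomorphic operation count."""
    A = csr_matrix(A)                      # keep only actual graph edges
    print("multiplies ~", A.nnz * X.shape[1], "instead of",
          A.shape[0] ** 2 * X.shape[1])
    return A @ (X @ W)
```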
Spatially inhomogeneous functions, which may be smooth in some regions and rough in others, are modelled naturally in a Bayesian manner using so-called Besov priors, which are given by random wavelet expansions with Laplace-distributed coefficients. This paper studies theoretical guarantees for such prior measures: specifically, we examine their frequentist posterior contraction rates in the setting of non-linear inverse problems with Gaussian white noise. Our results are first derived under a general local Lipschitz assumption on the forward map. We then verify the assumption for two non-linear inverse problems arising from elliptic partial differential equations: the Darcy flow model from geophysics and a model for the Schr\"odinger equation appearing in tomography. In the course of the proofs, we also obtain novel concentration inequalities for penalized least squares estimators with $\ell^1$ wavelet penalty, which have a natural interpretation as maximum a posteriori (MAP) estimators. The true parameter is assumed to belong to some spatially inhomogeneous Besov class $B^{\alpha}_{11}$, $\alpha>0$. In a setting with direct observations, we complement these upper bounds with a lower bound on the rate of contraction for arbitrary Gaussian priors. An immediate consequence of our results is that while Laplace priors can achieve minimax-optimal rates over $B^{\alpha}_{11}$-classes, Gaussian priors are limited to a contraction rate that is slower by a polynomial factor. This gives information-theoretic justification for the intuition that Laplace priors are more compatible with $\ell^1$ regularity structure in the underlying parameter.
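For intuition, the following sketch draws a sample from a Besov-type prior with Haar wavelets: Laplace-distributed coefficients are rescaled by $2^{-j(\alpha + 1/2 - 1)}$, the $p=q=1$ scaling in one dimension. The paper's analysis uses general wavelet bases; Haar and all numerical choices here are illustrative.

```python
import numpy as np

rng = np.random.default_rng(1)

def sample_besov_1d(alpha=1.0, J=8, n=2 ** 10):
    """Draw f = sum_{j,k} 2**(-j*(alpha + 1/2 - 1)) * xi_jk * psi_jk on
    [0, 1), with i.i.d. Laplace coefficients xi_jk and L2-normalized
    Haar wavelets psi_jk. The heavy-tailed Laplace draws produce the
    spatially inhomogeneous (smooth here, rough there) samples."""
    t = np.linspace(0.0, 1.0, n, endpoint=False)
    f = rng.laplace() * np.ones(n)                 # scaling-function term
    for j in range(J):
        scale = 2.0 ** (-j * (alpha - 0.5))        # d = 1, p = q = 1
        for k in range(2 ** j):
            lo, mid, hi = k / 2 ** j, (k + 0.5) / 2 ** j, (k + 1) / 2 ** j
            psi = np.where((t >= lo) & (t < mid), 1.0,
                           np.where((t >= mid) & (t < hi), -1.0, 0.0))
            f += scale * rng.laplace() * 2 ** (j / 2) * psi
    return t, f
```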
Bayesian nonparametric mixture models are commonly used to model complex data. While these models are well-suited for density estimation, their application to clustering has some limitations. Miller and Harrison (2014) proved posterior inconsistency in the number of clusters when the true number of clusters is finite for Dirichlet process and Pitman--Yor process mixture models. In this work, we extend this result to additional Bayesian nonparametric priors, such as Gibbs-type processes and finite-dimensional representations thereof. The latter include the Dirichlet multinomial process and the recently proposed Pitman--Yor and normalized generalized gamma multinomial processes. We show that mixture models based on these processes are also inconsistent in the number of clusters and discuss possible solutions. Notably, we show that a post-processing algorithm introduced by Guha et al. (2021) for the Dirichlet process extends to more general models and provides a consistent method to estimate the number of components.
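The mechanism behind such inconsistency can be seen already at the level of the prior: under the Dirichlet process the number of clusters keeps growing, at rate roughly $\alpha\log n$, even when the data come from finitely many components. The sketch below simulates this via the Chinese restaurant process; it illustrates the prior's behavior only, not the posterior analysis of the paper.

```python
import numpy as np

rng = np.random.default_rng(2)

def crp_num_clusters(n, alpha=1.0):
    """Number of clusters among n observations under the Chinese
    restaurant process: each new point joins an existing cluster with
    probability proportional to its size, or starts a new cluster with
    probability proportional to alpha. K_n grows like alpha * log(n)."""
    counts = []
    for _ in range(n):
        probs = np.array(counts + [alpha], dtype=float)
        j = rng.choice(len(probs), p=probs / probs.sum())
        if j == len(counts):
            counts.append(1)   # a new cluster appears
        else:
            counts[j] += 1
    return len(counts)

print([crp_num_clusters(n) for n in (100, 1000, 10000)])
```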
For the well-known Survivable Network Design Problem (SNDP) we are given an undirected graph $G$ with edge costs, a set $R$ of terminal vertices, and an integer demand $d_{s,t}$ for every terminal pair $s,t\in R$. The task is to compute a minimum-cost subgraph $H$ of $G$ such that there are at least $d_{s,t}$ disjoint paths between $s$ and $t$ in $H$. If the paths are required to be edge-disjoint we obtain the edge-connectivity variant (EC-SNDP), while internally vertex-disjoint paths yield the vertex-connectivity variant (VC-SNDP). Another important case is the element-connectivity variant (LC-SNDP), where the paths must be disjoint on edges and non-terminal vertices. In this work we shed light on the parameterized complexity of these problems. We consider several natural parameters, including the solution size $\ell$, the sum of demands $D$, the number of terminals $k$, and the maximum demand $d_\max$. Using simple, elegant arguments, we prove the following results (a feasibility check for candidate solutions is sketched after the list).
- We give a complete picture of the parameterized tractability of the three variants w.r.t. parameter $\ell$: both EC-SNDP and LC-SNDP are FPT, while VC-SNDP is W[1]-hard.
- We identify some special cases of VC-SNDP that are FPT:
  * when $d_\max\leq 3$, for parameter $\ell$,
  * on graphs of locally bounded treewidth (e.g., planar graphs), for parameter $\ell$, and
  * on graphs of treewidth $tw$, for parameter $tw+D$.
- The well-known Directed Steiner Tree (DST) problem can be seen as single-source EC-SNDP with $d_\max=1$ on directed graphs, and is FPT parameterized by $k$ [Dreyfus & Wagner 1971]. We show that, in contrast, the 2-DST problem, where $d_\max=2$, is W[1]-hard even when parameterized by $\ell$.
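A hedged sketch of the feasibility check referenced above: by Menger's theorem, the number of edge-disjoint $s$-$t$ paths in an undirected graph equals the $s$-$t$ max-flow after replacing each edge by two opposite unit-capacity arcs. The function names and the networkx-based implementation are our own illustration, not the paper's code.

```python
import networkx as nx

def ec_feasible(h_edges, demands):
    """Check EC-SNDP feasibility of a candidate subgraph H given as an
    edge list: each undirected edge becomes two opposite unit-capacity
    arcs, and the s-t max-flow then counts edge-disjoint s-t paths.
    demands maps (s, t) -> required connectivity d_st."""
    D = nx.DiGraph()
    for u, v in h_edges:
        D.add_edge(u, v, capacity=1)
        D.add_edge(v, u, capacity=1)
    return all(s in D and t in D and nx.maximum_flow_value(D, s, t) >= d
               for (s, t), d in demands.items())

# Example: a 4-cycle supports demand 2 between opposite vertices.
print(ec_feasible([(0, 1), (1, 2), (2, 3), (3, 0)], {(0, 2): 2}))  # True
```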
This work presents concepts and algorithms for the simulation of dynamic fractures with a Lattice Boltzmann method (LBM) for linear elastic solids. This LBM has been presented previously and solves the wave equation, which is interpreted as the governing equation for antiplane shear deformation. Besides the steady growth of a crack at a prescribed crack velocity, we implement a fracture criterion based on stress intensity factors (SIFs). This is the first time that crack propagation with a mechanically relevant criterion has been considered in the context of LBMs. Numerical results are examined to validate the proposed method. The concepts of crack propagation introduced here are not limited to mode III cracks or the simplified deformation assumption of antiplane shear. By introducing a rather simple processing step into the existing LBM at the level of individual lattice sites, the overall performance of the LBM is maintained. Our findings underline the validity of the LBM as a numerical tool for simulating solids in general and dynamic fractures in particular.
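To indicate what such a lattice-site processing step might look like, here is a hedged sketch of a mode-III fracture criterion: the SIF is estimated by displacement extrapolation from a point on the crack face behind the tip, and the tip advances by one site when the critical value is reached. The LBM field update itself is omitted, and all names and the extrapolation choice are assumptions of this sketch.

```python
import numpy as np

def mode_iii_sif(u_z, r, mu):
    """Displacement-extrapolation estimate of the mode-III SIF: on the
    crack face (theta = pi) the near-tip field is
    u_z = (2 * K_III / mu) * sqrt(r / (2 * pi)),
    so K_III = 0.5 * mu * u_z * sqrt(2 * pi / r)."""
    return 0.5 * mu * u_z * np.sqrt(2.0 * np.pi / r)

def fracture_step(tip, u_z_behind, r, mu, K_c, direction=(1, 0)):
    """Per-time-step criterion check at the crack tip: advance the tip
    by one lattice site in `direction` if the estimated SIF reaches the
    critical value K_c, otherwise leave it in place."""
    if mode_iii_sif(u_z_behind, r, mu) >= K_c:
        return (tip[0] + direction[0], tip[1] + direction[1])
    return tip
```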
Machine learning (ML) models are costly to train, as they can require significant amounts of data, computational resources, and technical expertise. Thus, they constitute valuable intellectual property that needs protection from adversaries wanting to steal them. Ownership verification techniques allow the victims of model stealing attacks to demonstrate that a suspect model was in fact stolen from theirs. Although a number of ownership verification techniques based on watermarking or fingerprinting have been proposed, most of them fall short either in security guarantees (well-equipped adversaries can evade verification) or in computational cost. A fingerprinting technique introduced at ICLR '21, Dataset Inference (DI), has been shown to offer better robustness and efficiency than prior methods. The authors of DI provided a correctness proof for linear (suspect) models. However, in the same setting, we prove that DI suffers from high false positives (FPs) -- it can incorrectly identify an independent model, trained on non-overlapping data from the same distribution, as stolen. We further prove that DI also triggers FPs for realistic, non-linear suspect models, and we confirm empirically, with high confidence, that DI leads to FPs. We then show that DI also suffers from false negatives (FNs) -- an adversary can fool DI by regularising a stolen model's decision boundaries using adversarial training. To demonstrate this, we show that DI fails to identify a model adversarially trained from a stolen dataset -- the setting in which DI is hardest to evade. Finally, we discuss the implications of our findings and the viability of fingerprinting-based ownership verification in general, and suggest directions for future work.
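For concreteness, a simplified DI-flavoured test in the linear binary setting: compare the suspect model's signed margins on the victim's training data against held-out data, and read a small p-value as evidence of theft. Real DI uses multi-class distances and a learned confidence regressor; this sketch, including all names and the synthetic setup, is our own illustration.

```python
import numpy as np
from scipy import stats
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(3)

def di_style_test(victim_train, victim_test, suspect, y_train, y_test):
    """DI-flavoured ownership test for a binary linear suspect: compare
    signed margins on the victim's training vs. held-out data; a small
    p-value is read as evidence that the suspect was trained on (i.e.,
    stolen from) the victim's training data."""
    m_tr = suspect.decision_function(victim_train) * (2 * y_train - 1)
    m_te = suspect.decision_function(victim_test) * (2 * y_test - 1)
    return stats.ttest_ind(m_tr, m_te, alternative='greater').pvalue

# An *independent* suspect, trained on fresh data from the same
# distribution; the paper analyses settings in which such a model is
# nevertheless flagged as stolen (a false positive).
X = rng.normal(size=(3000, 10))
y = (X @ rng.normal(size=10) > 0).astype(int)
suspect = LogisticRegression().fit(X[2000:], y[2000:])
print(di_style_test(X[:1000], X[1000:2000], suspect, y[:1000], y[1000:2000]))
```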
As soon as abstract mathematical computations were adapted to computation on digital computers, the problem of efficient representation, manipulation, and communication of the numerical values in those computations arose. Strongly related to the problem of numerical representation is the problem of quantization: in what manner should a set of continuous real-valued numbers be distributed over a fixed discrete set of numbers to minimize the number of bits required and also to maximize the accuracy of the attendant computations? This perennial problem of quantization is particularly relevant whenever memory and/or computational resources are severely restricted, and it has come to the forefront in recent years due to the remarkable performance of Neural Network models in computer vision, natural language processing, and related areas. Moving from floating-point representations to low-precision fixed integer values, represented in four bits or fewer, holds the potential to reduce the memory footprint and latency by a factor of 16x; in practice, reductions of 4x to 8x are often realized in these applications. Thus, it is not surprising that quantization has emerged recently as an important and very active sub-area of research in the efficient implementation of computations associated with Neural Networks. In this article, we survey approaches to the problem of quantizing the numerical values in deep Neural Network computations, covering the advantages and disadvantages of current methods. With this survey and its organization, we hope to have presented a useful snapshot of the current research in quantization for Neural Networks and to have organized it in a way that eases the evaluation of future research in this area.
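As a minimal worked example of the basic building block, the sketch below performs uniform affine (asymmetric) quantization to a 4-bit grid via a scale and zero-point, then dequantizes to expose the rounding error; the function and parameter names are illustrative.

```python
import numpy as np

def quantize(x, num_bits=4):
    """Uniform affine (asymmetric) quantization: map float values to a
    num_bits integer grid via a scale and zero-point, then dequantize to
    expose the rounding error."""
    qmin, qmax = 0, 2 ** num_bits - 1
    scale = max((x.max() - x.min()) / (qmax - qmin), 1e-12)
    zero_point = int(round(qmin - x.min() / scale))
    q = np.clip(np.round(x / scale) + zero_point, qmin, qmax).astype(np.int32)
    return q, (q - zero_point) * scale   # integer codes, dequantized values

x = np.array([-1.2, -0.3, 0.0, 0.7, 2.5])
q, dq = quantize(x)
print(q, np.max(np.abs(x - dq)))        # worst-case rounding error
```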