Recent advances in neural-network architecture allow for seamless integration of convex optimization problems as differentiable layers in an end-to-end trainable neural network. Integrating medium- and large-scale quadratic programs into a deep neural network architecture, however, is challenging because solving quadratic programs exactly by interior-point methods has worst-case cubic complexity in the number of variables. In this paper, we present an alternative network layer architecture based on the alternating direction method of multipliers (ADMM) that is capable of scaling to problems with a moderately large number of variables. Backward differentiation is performed by implicit differentiation of the residual map of a modified fixed-point iteration. Simulation results demonstrate the computational advantage of the ADMM layer, which for medium-scale problems is approximately an order of magnitude faster than the OptNet quadratic programming layer. Furthermore, our novel backward-pass routine is efficient, from both a memory and a computation standpoint, compared with the standard approach based on unrolled differentiation or implicit differentiation of the KKT optimality conditions. We conclude with examples from portfolio optimization in the integrated prediction and optimization paradigm.
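As a minimal illustration of the forward ADMM iteration behind such a layer, consider a box-constrained QP; the NumPy sketch below is an assumption-based simplification (the variable names, penalty parameter, and box constraint are illustrative) and omits the paper's general constraint handling and the implicit backward pass.

```python
import numpy as np

def admm_box_qp(Q, q, lb, ub, rho=1.0, n_iter=200):
    """Sketch: solve  min 0.5 x'Qx + q'x  s.t.  lb <= x <= ub  by ADMM with the splitting x = z."""
    n = q.shape[0]
    x, z, u = np.zeros(n), np.zeros(n), np.zeros(n)
    # Factor (Q + rho*I) once; every x-update reuses the same Cholesky factor.
    L = np.linalg.cholesky(Q + rho * np.eye(n))
    for _ in range(n_iter):
        rhs = rho * (z - u) - q
        x = np.linalg.solve(L.T, np.linalg.solve(L, rhs))  # x-update (linear solve)
        z = np.clip(x + u, lb, ub)                         # z-update (projection onto the box)
        u = u + x - z                                      # scaled dual update
    return z
```

In a differentiable layer, gradients with respect to the problem data would be obtained not by unrolling this loop but by implicitly differentiating the fixed-point residual at the converged iterate, which is what keeps memory usage independent of the number of iterations.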
We present a parametric family of semi-implicit, second-order accurate numerical methods for non-conservative and conservative advection equations, for which the numerical solutions can be obtained in a fixed number of alternating forward and backward substitutions. The methods use a novel combination of implicit and explicit time discretizations in the one-dimensional case and the Strang splitting method in the multi-dimensional case. The methods are described for advection equations with a continuous variable velocity that can change its sign inside the computational domain. The methods are unconditionally stable in the non-conservative case for a variable velocity and a variable numerical parameter. Several numerical experiments confirm the advantages of the presented methods, including the use of differentiable programming to find optimized values of the variable numerical parameter.
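As a hedged one-dimensional illustration (not the paper's exact scheme), an implicit upwind time discretization of $u_t + v\,u_x = 0$ with $v_i > 0$ yields a bidiagonal system that is solved by a single forward substitution sweep:
\[
\frac{u_i^{n+1}-u_i^{n}}{\Delta t} + v_i\,\frac{u_i^{n+1}-u_{i-1}^{n+1}}{\Delta x} = 0
\quad\Longrightarrow\quad
u_i^{n+1} = \frac{u_i^{n} + c_i\,u_{i-1}^{n+1}}{1+c_i},
\qquad c_i = \frac{v_i\,\Delta t}{\Delta x},
\]
and the analogous stencil for $v_i < 0$ gives a backward sweep; the semi-implicit family in the paper blends such implicit terms with explicit ones through the variable numerical parameter.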
We present a novel energy-based numerical analysis of semilinear diffusion-reaction boundary value problems. Based on a suitable variational setting, the proposed computational scheme can be seen as an energy minimisation approach. More specifically, this procedure aims to generate a sequence of numerical approximations, which results from the iterative solution of related (stabilised) linearised discrete problems, and tends to a local minimum of the underlying energy functional. Simultaneously, the finite-dimensional approximation spaces are adaptively refined; this is implemented in terms of a new mesh refinement strategy in the context of finite element discretisations, which again relies on the energy structure of the problem under consideration, and does not involve any a posteriori error indicators. In combination, the resulting adaptive algorithm consists of an iterative linearisation procedure on a sequence of hierarchically refined discrete spaces, which we prove to converge towards a solution of the continuous problem in an appropriate sense. Numerical experiments demonstrate the robustness and reliability of our approach for a series of examples.
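For orientation, a typical energy functional in this setting, assuming a model problem of the form $-\Delta u = f(u)$ with homogeneous Dirichlet boundary conditions (the precise assumptions in the paper may differ), is
\[
\mathrm{E}(u) \;=\; \int_{\Omega} \Big( \tfrac{1}{2}\,|\nabla u|^{2} - F(u) \Big)\,\mathrm{d}x,
\qquad
F(s) \;=\; \int_{0}^{s} f(t)\,\mathrm{d}t,
\]
whose critical points are the weak solutions; the adaptive iterative linearisation is designed so that $\mathrm{E}$ decreases along the sequence of discrete approximations.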
Haemorrhaging of the brain is the leading cause of death in people between the ages of 15 and 24 and the third leading cause of death in older people. Computed tomography (CT) is an imaging modality used to diagnose neurological emergencies, including stroke and traumatic brain injury. Recent advances in deep learning and image processing have utilised modalities such as CT scans to help automate the detection and segmentation of brain haemorrhages. In this paper, we propose a novel implementation of an architecture that combines traditional Convolutional Neural Networks (CNNs) with Graph Neural Networks (GNNs) to produce a holistic model for brain haemorrhage segmentation. GNNs work on the principle of neighbourhood aggregation and thus provide a reliable estimate of the global structures present in images. GNNs also operate with only a few layers and therefore require fewer parameters. With limited data, our implementation achieves a Dice coefficient of around 0.81.
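The Dice coefficient reported above compares a predicted segmentation mask with the ground-truth mask; a minimal NumPy sketch (not the authors' evaluation code) is:

```python
import numpy as np

def dice_coefficient(pred, target, eps=1e-6):
    """Dice score 2|A ∩ B| / (|A| + |B|) between two binary segmentation masks."""
    pred, target = pred.astype(bool), target.astype(bool)
    intersection = np.logical_and(pred, target).sum()
    return (2.0 * intersection + eps) / (pred.sum() + target.sum() + eps)
```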
A singularly perturbed parabolic problem of convection-diffusion type with a discontinuous initial condition is examined. An analytic function is identified which matches the discontinuity in the initial condition and also satisfies the homogeneous parabolic differential equation associated with the problem. The difference between this analytical function and the solution of the parabolic problem is approximated numerically, using an upwind finite difference operator combined with an appropriate layer-adapted mesh. The numerical method is shown to be parameter-uniform. Numerical results are presented to illustrate the theoretical error bounds established in the paper.
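For reference, a standard upwind discretisation of a spatial operator of the form $-\varepsilon u_{xx} + a\,u_x$ (with $a > 0$) on a nonuniform mesh $\{x_i\}$ with steps $h_i = x_i - x_{i-1}$ reads (a generic sketch; the operator in the paper also involves the time derivative and a specific layer-adapted mesh):
\[
-\varepsilon\,\delta^{2} U_i + a_i\,D^{-} U_i,
\qquad
D^{-} U_i = \frac{U_i - U_{i-1}}{h_i},
\qquad
\delta^{2} U_i = \frac{2}{h_i + h_{i+1}}\left(\frac{U_{i+1}-U_i}{h_{i+1}} - \frac{U_i - U_{i-1}}{h_i}\right),
\]
with the mesh chosen fine inside the layer and coarse outside (e.g. a Shishkin-type mesh), which is what yields parameter-uniform convergence.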
Differentiable Wavetable Synthesis (DWTS) is a technique for neural audio synthesis which learns a dictionary of one-period waveforms, i.e., wavetables, through end-to-end training. We achieve high-fidelity audio synthesis with as few as 10 to 20 wavetables and demonstrate how a data-driven dictionary of waveforms opens up unprecedented one-shot learning paradigms on short audio clips. Notably, we show audio manipulations, such as high-quality pitch-shifting, using only a few seconds of input audio. Lastly, we investigate performance gains from using learned wavetables for real-time and interactive audio synthesis.
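The core operation that DWTS makes differentiable is reading a one-period wavetable at a target frequency; a minimal NumPy sketch is given below (the function name and the linear interpolation are illustrative choices, and the learned mixture over multiple wavetables is omitted).

```python
import numpy as np

def wavetable_oscillator(wavetable, freq, n_samples, sr=16000):
    """Read a one-period wavetable at a given frequency using linear interpolation."""
    table_len = len(wavetable)
    phase = np.cumsum(np.full(n_samples, freq / sr)) % 1.0  # normalized phase in [0, 1)
    idx = phase * table_len
    i0 = np.floor(idx).astype(int) % table_len
    i1 = (i0 + 1) % table_len                               # wrap around the table
    frac = idx - np.floor(idx)
    return (1.0 - frac) * wavetable[i0] + frac * wavetable[i1]
```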
This paper introduces the Banzhaf and Banzhaf-Owen values as novel centrality measures for ranking terrorists in a network. This new approach makes it possible to integrate the complete topology (i.e., nodes and edges) of the network together with a coalitional structure on its nodes. More precisely, it accounts for the characteristics of the nodes (e.g., terrorists) of the network and their possible relationships (e.g., types of communication links), as well as coalitional information (e.g., levels of hierarchy) that is independent of the network. First, for both centrality measures, we provide approximation algorithms and the corresponding R code. Second, as an illustration, we rank the members of the Zerkani network, responsible for the attacks in Paris (2015) and Brussels (2016). Finally, we compare the rankings established by the Banzhaf and Banzhaf-Owen values with the rankings obtained when using the Shapley value (cf. Hamers et al., 2019) and the Owen value as centrality measures.
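The paper supplies R code; purely as an illustration of the kind of approximation algorithm involved, the Banzhaf value of a player can be estimated by Monte Carlo sampling of coalitions. In the sketch below, `value` is a hypothetical characteristic function built from the network and coalitional data, not the one defined in the paper.

```python
import random

def banzhaf_estimate(players, value, i, n_samples=10_000, seed=0):
    """Monte Carlo estimate of the Banzhaf value of player i:
    the average marginal contribution of i over uniformly random coalitions not containing i."""
    rng = random.Random(seed)
    others = [p for p in players if p != i]
    total = 0.0
    for _ in range(n_samples):
        # Include each other player with probability 1/2, so every coalition is equally likely.
        S = frozenset(p for p in others if rng.random() < 0.5)
        total += value(S | {i}) - value(S)
    return total / n_samples
```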
Interpretation of Deep Neural Network (DNN) training as an optimal control problem with nonlinear dynamical systems has received considerable attention recently, yet the algorithmic development remains relatively limited. In this work, we make an attempt along this line by reformulating the training procedure from a trajectory optimization perspective. We first show that most widely used algorithms for training DNNs can be linked to Differential Dynamic Programming (DDP), a celebrated second-order trajectory optimization algorithm rooted in Approximate Dynamic Programming. In this vein, we propose a new variant of DDP that accommodates batch optimization for training feedforward networks, while integrating naturally with recent progress in curvature approximation. The resulting algorithm features layer-wise feedback policies which improve the convergence rate and reduce sensitivity to hyper-parameters compared with existing methods. We show that the algorithm is competitive against state-of-the-art first- and second-order methods. Our work opens up new avenues for principled algorithmic design built upon optimal control theory.
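For reference, the layer-wise feedback policies arise from the standard (Gauss-Newton) DDP backward recursion; in the usual notation with stage cost $\ell$, dynamics $f$, and value function $V$ at the next stage (a generic sketch, not the paper's batch variant):
\[
\begin{aligned}
Q_x &= \ell_x + f_x^{\top} V_x', &\quad Q_u &= \ell_u + f_u^{\top} V_x',\\
Q_{xx} &= \ell_{xx} + f_x^{\top} V_{xx}' f_x, &\quad Q_{uu} &= \ell_{uu} + f_u^{\top} V_{xx}' f_u, \qquad Q_{ux} = \ell_{ux} + f_u^{\top} V_{xx}' f_x,\\
\delta u^{\star} &= k + K\,\delta x, &\quad k &= -Q_{uu}^{-1} Q_u, \qquad K = -Q_{uu}^{-1} Q_{ux},
\end{aligned}
\]
after which the value function is updated and the recursion proceeds backward through the stages, i.e. the layers of the network.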
Alternating Direction Method of Multipliers (ADMM) is a widely used tool for machine learning in distributed settings, where a machine learning model is trained over distributed data sources through an interactive process of local computation and message passing. Such an iterative process can raise privacy concerns for data owners. The goal of this paper is to provide differential privacy for ADMM-based distributed machine learning. Prior approaches to differentially private ADMM exhibit low utility under strong privacy guarantees and often assume the objective functions of the learning problems to be smooth and strongly convex. To address these concerns, we propose a novel differentially private ADMM-based distributed learning algorithm called DP-ADMM, which combines an approximate augmented Lagrangian function with time-varying Gaussian noise addition in the iterative process to achieve higher utility for general objective functions under the same differential privacy guarantee. We also apply the moments accountant method to bound the end-to-end privacy loss. The theoretical analysis shows that DP-ADMM can be applied to a wider class of distributed learning problems, is provably convergent, and offers an explicit utility-privacy tradeoff. To our knowledge, this is the first paper to provide explicit convergence and utility properties for differentially private ADMM-based distributed learning algorithms. The evaluation results demonstrate that our approach achieves good convergence and model accuracy under a strong end-to-end differential privacy guarantee.
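For context, the basic building block for perturbing the iterates shared between parties is the Gaussian mechanism; a generic sketch follows (DP-ADMM additionally uses a time-varying noise schedule and the moments accountant for tighter composition, which this sketch does not capture).

```python
import numpy as np

def gaussian_mechanism(v, sensitivity, eps, delta, rng=None):
    """Release the vector v under (eps, delta)-differential privacy (eps <= 1)
    by adding Gaussian noise calibrated to its L2 sensitivity."""
    rng = rng or np.random.default_rng(0)
    sigma = sensitivity * np.sqrt(2.0 * np.log(1.25 / delta)) / eps
    return v + rng.normal(0.0, sigma, size=np.shape(v))
```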
Inference with network data necessitates mapping its nodes into a vector space where the relationships are preserved. However, with multi-layered networks, where multiple types of relationships exist for the same set of nodes, it is crucial to exploit the information shared between layers in addition to the distinct aspects of each layer. In this paper, we propose a novel approach that first obtains node embeddings in all layers jointly via DeepWalk on a \textit{supra} graph, which allows interactions between layers, and then fine-tunes the embeddings to encourage cohesive structure in the latent space. With empirical studies in node classification, link prediction and multi-layered community detection, we show that the proposed approach outperforms existing single- and multi-layered network embedding algorithms on several benchmarks. In addition to effectively scaling to a large number of layers (tested up to $37$), our approach consistently produces highly modular community structure, even when compared to methods that directly optimize for the modularity function.
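A minimal sketch of one common way to form the supra graph on which DeepWalk is run, by stacking per-layer adjacency matrices and coupling each node's copies across consecutive layers; the coupling pattern and weight `omega` are illustrative assumptions, not necessarily the construction used in the paper.

```python
import numpy as np

def supra_adjacency(layers, omega=1.0):
    """Build a supra-adjacency matrix from per-layer adjacency matrices over the same n nodes;
    copies of a node in consecutive layers are linked with weight omega."""
    L, n = len(layers), layers[0].shape[0]
    A = np.zeros((L * n, L * n))
    for l, A_l in enumerate(layers):
        A[l * n:(l + 1) * n, l * n:(l + 1) * n] = A_l      # intra-layer edges
    idx = np.arange(n)
    for l in range(L - 1):                                 # inter-layer couplings
        A[l * n + idx, (l + 1) * n + idx] = omega
        A[(l + 1) * n + idx, l * n + idx] = omega
    return A
```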
This paper addresses the problem of formally verifying desirable properties of neural networks, i.e., obtaining provable guarantees that neural networks satisfy specifications relating their inputs and outputs (robustness to bounded norm adversarial perturbations, for example). Most previous work on this topic was limited in its applicability by the size of the network, network architecture and the complexity of properties to be verified. In contrast, our framework applies to a general class of activation functions and specifications on neural network inputs and outputs. We formulate verification as an optimization problem (seeking to find the largest violation of the specification) and solve a Lagrangian relaxation of the optimization problem to obtain an upper bound on the worst-case violation of the specification being verified. Our approach is anytime, i.e., it can be stopped at any time and a valid bound on the maximum violation can be obtained. We develop specialized verification algorithms with provable tightness guarantees under special assumptions and demonstrate the practical significance of our general verification approach on a variety of verification tasks.
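Schematically, with layer maps $h_k$, input set $\mathcal{X}$, and a linear readout $c^{\top} x_K$ encoding the specification (notation illustrative, not the paper's exact formulation), the verification problem and its Lagrangian relaxation read
\[
\max_{x_0 \in \mathcal{X},\; x_{k+1} = h_k(x_k)} c^{\top} x_K
\;\;\le\;\;
\max_{x_0 \in \mathcal{X},\; x_1,\dots,x_K} \; c^{\top} x_K + \sum_{k=0}^{K-1} \lambda_k^{\top}\big(h_k(x_k) - x_{k+1}\big)
\qquad \text{for any choice of } \lambda,
\]
since the penalty terms vanish at every feasible point; hence every setting of the multipliers $\lambda$ yields a valid upper bound on the worst-case violation, which is what makes the procedure anytime.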