We introduce a new model for contagion spread using a network of interacting finite memory two-color P\'{o}lya urns, which we refer to as the finite memory interacting P\'{o}lya contagion network. The urns interact in the sense that the probability of drawing a red ball (which represents an infection state) for a given urn depends not only on the ratio of red balls in that urn but also on the ratio of red balls in the other urns in the network, hence accounting for the effect of spatial contagion. The resulting network-wide contagion process is a discrete-time finite-memory ($M$th order) Markov process, whose transition probability matrix is determined. The stochastic properties of the network contagion Markov process are analytically examined, and for homogeneous system parameters, we characterize the limiting state of infection in each urn. For the non-homogeneous case, given the complexity of the stochastic process, and in the same spirit as the well-studied SIS models, we use a mean-field type approximation to obtain a discrete-time dynamical system for the finite memory interacting P\'{o}lya contagion network. Interestingly, for $M=1$, we obtain a linear dynamical system which exactly represents the corresponding Markov process. For $M>1$, we use a mean-field approximation to obtain a nonlinear dynamical system. Furthermore, noting that the latter dynamical system admits a linear variant (realized by retaining its leading linear terms), we study the asymptotic behavior of the linear systems for both memory modes and characterize their equilibrium. Finally, we present simulation studies to assess the quality of the approximation provided by the linear and nonlinear dynamical systems.
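To make the urn interaction concrete, the following is a minimal simulation sketch, not the authors' exact model. It assumes a hypothetical row-stochastic interaction matrix `S` that mixes each urn's red-ball ratio with those of the other urns, and it assumes that `delta` balls of the drawn color are added after every draw and removed `M` steps later. The names `simulate_contagion`, `S`, `red0`, `black0`, and `delta` are illustrative only.

```python
import numpy as np

def simulate_contagion(S, red0, black0, delta, M, steps, seed=0):
    """Simulate draws for N interacting finite-memory two-color urns.

    S      : (N, N) row-stochastic interaction matrix (assumed weighting)
    red0   : (N,) initial red balls; black0 : (N,) initial black balls
    delta  : balls added after each draw; they are removed M steps later
    Returns a (steps, N) array of draws (1 = red, 0 = black).
    """
    rng = np.random.default_rng(seed)
    S = np.asarray(S, dtype=float)
    red0 = np.asarray(red0, dtype=float)
    black0 = np.asarray(black0, dtype=float)
    n = len(red0)
    draws = np.zeros((steps, n), dtype=int)
    for t in range(steps):
        recent = draws[max(0, t - M):t]        # only the last M draws are remembered
        red = red0 + delta * recent.sum(axis=0)
        total = red0 + black0 + delta * len(recent)
        ratio = red / total                    # red-ball ratio of each urn
        p_red = S @ ratio                      # mix own ratio with the other urns' ratios
        draws[t] = rng.random(n) < p_red
    return draws
```

Because the draw probabilities at time `t` depend only on the previous `M` draws, the simulated process has the finite-memory ($M$th order) Markov structure described in the abstract.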
In this paper, the singular Emden-Fowler equation of fractional order is introduced and a computational method is proposed for its numerical solution. To approximate the solutions, we use Boubaker polynomials and define the formulation of their fractional derivative operational matrix. This tool has not been used before and is introduced here for the first time, although this area has not yet found many practical applications. The operational matrix of the Caputo fractional derivative converts these problems into a system of algebraic equations whose solutions are simple and easy to compute. Numerical examples are examined to demonstrate the validity and effectiveness of the proposed method in finding accurate approximate solutions.
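An operational matrix for a polynomial basis ultimately rests on the Caputo derivative of monomials. The sketch below shows only that building block, for orders $0 < \alpha \le 1$, using the standard rule $D^{\alpha} x^{k} = \frac{\Gamma(k+1)}{\Gamma(k+1-\alpha)} x^{k-\alpha}$ for $k \ge 1$ and $D^{\alpha} c = 0$ for constants. It does not construct the Boubaker operational matrix of the paper; the function names are hypothetical.

```python
from math import gamma

def caputo_monomial(k, alpha, x):
    """Caputo derivative of order alpha (0 < alpha <= 1) of x**k at x > 0, integer k >= 0."""
    if k == 0:
        return 0.0  # the Caputo derivative of a constant is zero
    return gamma(k + 1) / gamma(k + 1 - alpha) * x ** (k - alpha)

def caputo_polynomial(coeffs, alpha, x):
    """Apply the monomial rule term by term to sum_k coeffs[k] * x**k."""
    return sum(c * caputo_monomial(k, alpha, x) for k, c in enumerate(coeffs))
```

For `alpha = 1` the rule reduces to the ordinary first derivative, which gives a quick sanity check on any polynomial.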
We introduce the Conic Blackwell Algorithm$^+$ (CBA$^+$) regret minimizer, a new parameter- and scale-free regret minimizer for general convex sets. CBA$^+$ is based on Blackwell approachability and attains $O(\sqrt{T})$ regret. We show how to efficiently instantiate CBA$^+$ for many decision sets of interest, including the simplex, $\ell_{p}$ norm balls, and ellipsoidal confidence regions in the simplex. Based on CBA$^+$, we introduce SP-CBA$^+$, a new parameter-free algorithm for solving convex-concave saddle-point problems, which achieves an $O(1/\sqrt{T})$ ergodic rate of convergence. In our simulations, we demonstrate the wide applicability of SP-CBA$^+$ on several standard saddle-point problems, including matrix games, extensive-form games, distributionally robust logistic regression, and Markov decision processes. In each setting, SP-CBA$^+$ achieves state-of-the-art numerical performance and outperforms classical methods, without the need for any choice of step sizes or other algorithmic parameters.
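The abstract does not spell out the CBA$^+$ updates, so the sketch below substitutes a different, classical algorithm: regret matching$^+$, which is also derived from Blackwell approachability and is parameter-free, but works only on the simplex. It illustrates the general recipe of running two regret minimizers in self-play on a matrix game $\min_x \max_y x^\top A y$ and averaging the iterates. The function names are hypothetical.

```python
import numpy as np

def rm_plus_matrix_game(A, T=5000):
    """Self-play with regret matching+ on min_x max_y x^T A y; returns average strategies."""
    n, m = A.shape
    Rx, Ry = np.zeros(n), np.zeros(m)          # nonnegative aggregate regrets
    x_sum, y_sum = np.zeros(n), np.zeros(m)

    def strategy(R):
        s = R.sum()
        return R / s if s > 0 else np.full(len(R), 1.0 / len(R))

    for _ in range(T):
        x, y = strategy(Rx), strategy(Ry)
        loss_x = A @ y                         # the x-player minimizes x^T A y
        gain_y = A.T @ x                       # the y-player maximizes x^T A y
        Rx = np.maximum(Rx + (x @ loss_x - loss_x), 0.0)
        Ry = np.maximum(Ry + (gain_y - y @ gain_y), 0.0)
        x_sum += x
        y_sum += y
    return x_sum / T, y_sum / T

def duality_gap(A, x, y):
    """Saddle-point residual max_y' x^T A y' - min_x' x'^T A y (zero at equilibrium)."""
    return np.max(A.T @ x) - np.min(A @ y)
```

Note that no step size appears anywhere in the loop, and the duality gap of the averaged strategies is bounded by the sum of the two players' regrets divided by `T`.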
The aim of this paper is to propose an efficient adaptive finite element method for eigenvalue problems based on the multilevel correction scheme and the inverse power method. This method involves solving associated boundary value problems on each adaptive partition and very low dimensional eigenvalue problems on some special meshes which are controlled by the proposed algorithm. Hence the efficiency of solving eigenvalue problems can be improved to be similar to that of the adaptive finite element method for the associated boundary value problems. The convergence and optimal complexity are theoretically verified and numerically demonstrated.
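The inverse power method mentioned above replaces an eigenvalue solve with repeated linear solves, which is why the cost can approach that of a boundary value problem. The sketch below shows the plain iteration on a dense matrix. It assumes a symmetric positive definite matrix and includes neither the adaptive meshes nor the multilevel correction of the paper; `inverse_power` is a hypothetical name.

```python
import numpy as np

def inverse_power(A, iters=50, seed=0):
    """Approximate the smallest eigenpair of a symmetric positive definite matrix A."""
    rng = np.random.default_rng(seed)
    v = rng.standard_normal(A.shape[0])
    v /= np.linalg.norm(v)
    for _ in range(iters):
        w = np.linalg.solve(A, v)      # one linear solve per step
        v = w / np.linalg.norm(w)
    return v @ A @ v, v                # Rayleigh quotient and eigenvector estimate
```

As a usage example, for the tridiagonal matrix with 2 on the diagonal and -1 on the off-diagonals (a finite-difference stand-in for a 1D Laplacian), the iteration converges to the known smallest eigenvalue $2 - 2\cos(\pi/(n+1))$.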
We propose a new bootstrap-based online algorithm for stochastic linear bandit problems. The key idea is to adopt residual bootstrap exploration, in which the agent estimates the next step reward by re-sampling the residuals of the mean reward estimate. Our algorithm, residual bootstrap exploration for stochastic linear bandit (\texttt{LinReBoot}), estimates the linear reward from its re-sampling distribution and pulls the arm with the highest reward estimate. In particular, we contribute a theoretical framework to demystify residual bootstrap-based exploration mechanisms in stochastic linear bandit problems. The key insight is that the strength of bootstrap exploration is based on collaborated optimism between the online-learned model and the re-sampling distribution of residuals. This observation enables us to show that the proposed \texttt{LinReBoot} secures a high-probability $\tilde{O}(d \sqrt{n})$ sub-linear regret under mild conditions. Our experiments support the easy generalizability of the \texttt{ReBoot} principle to various formulations of linear bandit problems and show the significant computational efficiency of \texttt{LinReBoot}.
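The following is a loose illustration of exploring by re-sampling residuals, not the \texttt{LinReBoot} algorithm as specified in the paper. It assumes a ridge-regression reward estimate and Gaussian multiplier weights on the residuals; both choices, and the name `residual_bootstrap_choice`, are assumptions made for the sketch.

```python
import numpy as np

def residual_bootstrap_choice(X_hist, r_hist, arms, lam=1.0, rng=None):
    """Pick an arm index using a ridge fit perturbed by re-sampled residuals."""
    rng = np.random.default_rng() if rng is None else rng
    d = arms.shape[1]
    V = X_hist.T @ X_hist + lam * np.eye(d)
    theta_hat = np.linalg.solve(V, X_hist.T @ r_hist)     # ridge estimate of the reward vector
    residuals = r_hist - X_hist @ theta_hat
    weights = rng.standard_normal(len(r_hist))            # assumed bootstrap weights
    # Refit on randomly re-weighted residuals to obtain a perturbed estimate.
    theta_boot = theta_hat + np.linalg.solve(V, X_hist.T @ (weights * residuals))
    return int(np.argmax(arms @ theta_boot))
```

When the fit is poor the residuals are large and the perturbed estimate varies more, which drives exploration; when the data are fit exactly the perturbation vanishes and the choice is greedy.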
The Dynamic Mode Decomposition has proved to be a very efficient technique to study dynamic data. It is an entirely data-driven approach that extracts all necessary information from data snapshots, which are commonly assumed to be sampled from measurements. The application of this approach becomes problematic if the available data is incomplete because some dimensions of smaller scale are either missing or unmeasured. Such a setting occurs very often in modeling complex dynamical systems such as power grids, in particular in reduced-order modeling. To take into account the effect of unresolved variables, the optimal prediction approach based on the Mori-Zwanzig formalism can be applied to obtain the most expected prediction under existing uncertainties. This effectively leads to the development of a time-predictive model accounting for the impact of missing data. In the present paper we provide a detailed derivation of the considered method from the Liouville equation and finalize it with the optimization problem that defines the optimal transition operator corresponding to the observed data. In contrast to the existing approach, we consider a first-order approximation of the Mori-Zwanzig decomposition, state the corresponding optimization problem and solve it with a gradient-based optimization method. The gradient of the obtained objective function is computed precisely through automatic differentiation. The numerical experiments illustrate that the considered approach gives practically the same dynamics as the exact Mori-Zwanzig decomposition, but is less computationally intensive.
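For readers unfamiliar with the baseline, the sketch below implements standard exact DMD on fully observed snapshots. It does not include the Mori-Zwanzig correction for missing variables that the abstract describes; the function name `dmd` and the layout of `snapshots` (features by time) are assumptions of the sketch.

```python
import numpy as np

def dmd(snapshots, rank):
    """Exact DMD: snapshots has shape (n_features, n_times)."""
    X, Y = snapshots[:, :-1], snapshots[:, 1:]        # pairs (x_t, x_{t+1})
    U, s, Vh = np.linalg.svd(X, full_matrices=False)
    U, s, Vh = U[:, :rank], s[:rank], Vh[:rank]       # truncate to the chosen rank
    A_tilde = U.conj().T @ Y @ Vh.conj().T / s        # reduced operator U^* Y V S^{-1}
    eigvals, W = np.linalg.eig(A_tilde)
    modes = Y @ Vh.conj().T / s @ W                   # exact DMD modes
    return eigvals, modes
```

On data generated by a linear map $x_{t+1} = A x_t$ with full-rank snapshots, the returned eigenvalues coincide with those of $A$.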
We introduce a new interpretation of sparse variational approximations for Gaussian processes using inducing points, which can lead to more scalable algorithms than previous methods. It is based on decomposing a Gaussian process as a sum of two independent processes: one spanned by a finite basis of inducing points and the other capturing the remaining variation. We show that this formulation recovers existing approximations and at the same time allows us to obtain tighter lower bounds on the marginal likelihood and new stochastic variational inference algorithms. We demonstrate the efficiency of these algorithms in several Gaussian process models ranging from standard regression to multi-class classification using (deep) convolutional Gaussian processes and report state-of-the-art results on CIFAR-10 among purely GP-based models.
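At the level of covariances, the decomposition into two independent processes corresponds to splitting the kernel matrix as $K = Q + (K - Q)$, where $Q = K_{xz} K_{zz}^{-1} K_{zx}$ is the part spanned by the inducing inputs $z$. The sketch below checks this numerically. It assumes an RBF kernel and a small jitter term for stability, and it illustrates only the decomposition, not the variational bounds or inference algorithms of the paper; `rbf` and `split_covariance` are hypothetical names.

```python
import numpy as np

def rbf(a, b, lengthscale=1.0):
    """Squared-exponential kernel between 1D input arrays a and b."""
    d2 = (a[:, None] - b[None, :]) ** 2
    return np.exp(-0.5 * d2 / lengthscale**2)

def split_covariance(x, z, jitter=1e-8):
    """Return (Q, R) with K = Q + R: Q is spanned by inducing inputs z, R is the remainder."""
    K = rbf(x, x)
    Kxz = rbf(x, z)
    Kzz = rbf(z, z) + jitter * np.eye(len(z))
    Q = Kxz @ np.linalg.solve(Kzz, Kxz.T)
    return Q, K - Q
```

Both pieces are positive semidefinite up to rounding, so each is a valid covariance, and the remainder shrinks as more inducing inputs are added.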
Many complex systems are composed of interacting parts, and the underlying laws are usually simple and universal. While graph neural networks provide a useful relational inductive bias for modeling such systems, generalization to new system instances of the same type is less studied. In this work we trained graph neural networks to fit time series from an example nonlinear dynamical system, the belief propagation algorithm. We found simple interpretations of the learned representation and model components, and they are consistent with core properties of the probabilistic inference algorithm. We successfully identified a `graph translator' between the statistical interactions in belief propagation and parameters of the corresponding trained network, and showed that it enables two types of novel generalization: to recover the underlying structure of a new system instance based solely on time series observations, or to construct a new network from this structure directly. Our results demonstrated a path towards understanding both dynamics and structure of a complex system and how such understanding can be used for generalization.
The interpretation of Deep Neural Network (DNN) training as an optimal control problem with nonlinear dynamical systems has received considerable attention recently, yet the algorithmic development remains relatively limited. In this work, we make an attempt along this line by reformulating the training procedure from the trajectory optimization perspective. We first show that most widely-used algorithms for training DNNs can be linked to Differential Dynamic Programming (DDP), a celebrated second-order trajectory optimization algorithm rooted in Approximate Dynamic Programming. In this vein, we propose a new variant of DDP that can accept batch optimization for training feedforward networks, while integrating naturally with the recent progress in curvature approximation. The resulting algorithm features layer-wise feedback policies which improve the convergence rate and reduce sensitivity to hyper-parameters over existing methods. We show that the algorithm is competitive against state-of-the-art first and second order methods. Our work opens up new avenues for principled algorithmic design built upon optimal control theory.
In this work, we consider the distributed optimization of non-smooth convex functions using a network of computing units. We investigate this problem under two regularity assumptions: (1) the Lipschitz continuity of the global objective function, and (2) the Lipschitz continuity of local individual functions. Under the local regularity assumption, we provide the first optimal first-order decentralized algorithm called multi-step primal-dual (MSPD) and its corresponding optimal convergence rate. A notable aspect of this result is that, for non-smooth functions, while the dominant term of the error is in $O(1/\sqrt{t})$, the structure of the communication network only impacts a second-order term in $O(1/t)$, where $t$ is time. In other words, the error due to limits in communication resources decreases at a fast rate even in the case of non-strongly-convex objective functions. Under the global regularity assumption, we provide a simple yet efficient algorithm called distributed randomized smoothing (DRS) based on a local smoothing of the objective function, and show that DRS is within a $d^{1/4}$ multiplicative factor of the optimal convergence rate, where $d$ is the underlying dimension.
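The local smoothing idea behind DRS can be illustrated on a single machine. The sketch below estimates the gradient of the Gaussian-smoothed function $f_\gamma(x) = \mathbb{E}[f(x + \gamma Z)]$, $Z \sim \mathcal{N}(0, I)$, by averaging subgradients of $f$ at randomly perturbed points. It omits the distributed and accelerated parts of the algorithm; the function name and the parameters `gamma` and `n_samples` are assumptions of the sketch.

```python
import numpy as np

def smoothed_subgradient(subgrad, x, gamma, n_samples, rng):
    """Monte Carlo gradient of f_gamma(x) = E[f(x + gamma * Z)], with Z ~ N(0, I)."""
    Z = rng.standard_normal((n_samples, len(x)))
    # Average subgradients of the non-smooth f at perturbed points.
    return np.mean([subgrad(x + gamma * z) for z in Z], axis=0)
```

For $f(x) = \|x\|_1$ with subgradient `np.sign`, the smoothed gradient has the closed form $2\Phi(x_i/\gamma) - 1$ in each coordinate, where $\Phi$ is the standard normal CDF, which provides a simple check of the estimate.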
In this paper, we study the optimal convergence rate for distributed convex optimization problems in networks. We model the communication restrictions imposed by the network as a set of affine constraints and provide optimal complexity bounds for four different setups, namely when the function $F(\xb) \triangleq \sum_{i=1}^{m}f_i(\xb)$ is strongly convex and smooth, strongly convex only, smooth only, or just convex. Our results show that Nesterov's accelerated gradient descent on the dual problem can be executed in a distributed manner and obtains the same optimal rates as in the centralized version of the problem (up to constant or logarithmic factors) with an additional cost related to the spectral gap of the interaction matrix. Finally, we discuss some extensions to the proposed setup such as proximal friendly functions, time-varying graphs, and improvement of the condition numbers.