We propose a method of constructing a joint statistical model for mixed-domain data to analyze their dependence. Multivariate Gaussian and log-linear models are particular examples of the proposed model. It is shown that the functional equation defining the model has a unique solution under fairly weak conditions. The model is characterized by two orthogonal sets of parameters: the dependence parameter and the marginal parameter. To estimate the dependence parameter, a conditional inference together with a sampling procedure is established and is shown to provide a consistent estimator of the dependence parameter. Illustrative examples of data analyses involving penguins and earthquakes are presented.
We consider the problem of computing a sparse binary representation of an image. To be precise, given an image and an overcomplete, non-orthonormal basis, we aim to find a sparse binary vector indicating the minimal set of basis vectors that, when added together, best reconstruct the given input. We formulate this problem with an $L_2$ loss on the reconstruction error, and an $L_0$ (or, equivalently, an $L_1$) loss on the binary vector enforcing sparsity. This yields a so-called Quadratic Unconstrained Binary Optimization (QUBO) problem, whose solution is generally NP-hard to find. The contribution of this work is twofold. First, we present a method of unsupervised and unnormalized dictionary feature learning that best matches the data for a desired sparsity level. Second, the binary sparse coding problem is solved on the Loihi 1 neuromorphic chip, using stochastic networks of neurons to traverse the non-convex energy landscape. The solutions are benchmarked against the classical heuristic simulated annealing. We demonstrate that neuromorphic computing is suitable for sampling low-energy solutions of binary sparse coding QUBO models. Although Loihi 1 is capable of sampling very sparse solutions of the QUBO models, the implementation needs improvement to be competitive with simulated annealing.
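The reduction to a QUBO can be sketched as follows. For a binary vector $a$ we have $a_i^2 = a_i$, so the $L_0$ and $L_1$ penalties coincide and all linear terms fold into the diagonal of a single matrix $Q$. The function names `build_qubo`, `energy`, and `brute_force_minimum`, the factor $1/2$ on the reconstruction term, the penalty weight `lam`, and the random dictionary are assumptions made for illustration. The exhaustive search only works for tiny instances and is neither the Loihi 1 implementation nor the simulated annealing baseline.

```python
import itertools
import numpy as np

def build_qubo(x, D, lam):
    """Return Q with a^T Q a + 0.5*||x||^2 == 0.5*||x - D a||^2 + lam*sum(a) for binary a."""
    gram = D.T @ D
    # Linear terms go on the diagonal because a_i**2 == a_i for a_i in {0, 1}.
    return 0.5 * gram + np.diag(lam - D.T @ x)

def energy(x, D, lam, a):
    """Sparse coding objective: squared reconstruction error plus sparsity penalty."""
    residual = x - D @ a
    return 0.5 * residual @ residual + lam * a.sum()

def brute_force_minimum(Q):
    """Enumerate all binary vectors; feasible only for a handful of basis vectors."""
    k = Q.shape[0]
    best_a, best_val = None, np.inf
    for bits in itertools.product([0, 1], repeat=k):
        a = np.array(bits, dtype=float)
        val = a @ Q @ a
        if val < best_val:
            best_a, best_val = a, val
    return best_a, best_val

# Hypothetical toy instance: 6-pixel "image", 8 unnormalized basis vectors.
rng = np.random.default_rng(0)
d, k, lam = 6, 8, 0.1
D = rng.normal(size=(d, k))
a_true = np.zeros(k)
a_true[[1, 4]] = 1.0
x = D @ a_true

Q = build_qubo(x, D, lam)
a_hat, q_val = brute_force_minimum(Q)
print("selected basis vectors:", np.flatnonzero(a_hat))
print("objective value:", energy(x, D, lam, a_hat))
```

Any QUBO sampler, whether simulated annealing or a stochastic spiking network, could replace `brute_force_minimum` here, since it only needs the matrix `Q`.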
We consider the information fiber optical channel modeled by the nonlinear Schrödinger equation with additive Gaussian noise. Using the path-integral approach and perturbation theory in the small dimensionless second-dispersion parameter associated with the input signal bandwidth, we calculate the conditional probability density functional in the leading and next-to-leading orders in this parameter. Taking into account the specific filtering of the output signal by the receiver, we calculate the mutual information in the leading and next-to-leading orders in the dispersion parameter and in the leading order in the signal-to-noise ratio (SNR) parameter. Further, we find the explicit expression for the mutual information in the case of the modified Gaussian input signal distribution, taking into account the limited frequency bandwidth of the input signal.
A non-intrusive model order reduction (MOR) method that combines features of the dynamic mode decomposition (DMD) and the radial basis function (RBF) network is proposed to predict the dynamics of parametric nonlinear systems. In many applications, we have limited access to information about the whole system, which motivates non-intrusive model reduction. One bottleneck is capturing the dynamics of the solution without knowing the physics inside the "black-box" system. DMD is a powerful tool to mimic the dynamics of the system and give a reliable approximation of the solution in the time domain using only the dominant DMD modes. However, DMD cannot reproduce the parametric behavior of the dynamics. Our contribution focuses on extending DMD to parametric DMD by RBF interpolation. Specifically, an RBF network is first trained using snapshot matrices at a limited number of parameter samples. The snapshot matrix at any new parameter sample can then be quickly generated from the RBF network. At the online stage, DMD uses the newly generated snapshot matrix to predict the time patterns of the dynamics corresponding to the new parameter sample. The proposed framework and algorithm are tested and validated by numerical examples including models with parametrized and time-varying inputs.
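The offline/online split described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the authors' algorithm: the toy system `simulate`, the Gaussian kernel with shape parameter `eps`, the exact interpolation solve standing in for network training, and all function names are hypothetical. The DMD step follows the standard "exact DMD" recipe based on a truncated SVD.

```python
import numpy as np

def simulate(p, n_steps):
    """Toy parametric system x_{k+1} = A(p) x_k, a stand-in for the black box."""
    A = np.array([[p, -0.2, 0.0], [0.2, p, 0.0], [0.0, 0.0, 0.5]])
    X = np.empty((3, n_steps))
    X[:, 0] = [1.0, 0.0, 1.0]
    for k in range(1, n_steps):
        X[:, k] = A @ X[:, k - 1]
    return X

def dmd(snapshots, rank):
    """Exact DMD: eigenvalues, modes, and amplitudes from consecutive snapshots."""
    X, Y = snapshots[:, :-1], snapshots[:, 1:]
    U, s, Vh = np.linalg.svd(X, full_matrices=False)
    U, s, Vh = U[:, :rank], s[:rank], Vh[:rank]
    YVS = Y @ Vh.conj().T / s                 # Y V S^{-1}
    eigvals, W = np.linalg.eig(U.conj().T @ YVS)
    modes = YVS @ W
    x0 = snapshots[:, 0].astype(complex)
    amplitudes = np.linalg.lstsq(modes, x0, rcond=None)[0]
    return eigvals, modes, amplitudes

def dmd_predict(eigvals, modes, amplitudes, n_steps):
    """Evaluate x_k = modes @ (eigvals**k * amplitudes) for k = 0..n_steps-1."""
    powers = eigvals[:, None] ** np.arange(n_steps)[None, :]
    return (modes @ (amplitudes[:, None] * powers)).real

def rbf_fit(params, snapshot_mats, eps):
    """Gaussian RBF weights that interpolate snapshot matrices over a scalar parameter."""
    Phi = np.exp(-(eps * (params[:, None] - params[None, :])) ** 2)
    return np.linalg.solve(Phi, snapshot_mats.reshape(len(params), -1))

def rbf_eval(p_new, params, weights, eps, shape):
    """Generate the snapshot matrix at a new parameter value."""
    phi = np.exp(-(eps * (p_new - params)) ** 2)
    return (phi @ weights).reshape(shape)

# Offline stage: snapshot matrices at a few parameter samples.
n_steps, eps = 20, 10.0
train_params = np.linspace(0.70, 0.95, 6)
train_snaps = np.stack([simulate(p, n_steps) for p in train_params])
weights = rbf_fit(train_params, train_snaps, eps)

# Online stage: interpolate a snapshot matrix, then let DMD extrapolate in time.
p_new = 0.82
S_new = rbf_eval(p_new, train_params, weights, eps, train_snaps.shape[1:])
eigvals, modes, amps = dmd(S_new, rank=3)
forecast = dmd_predict(eigvals, modes, amps, 2 * n_steps)
truth = simulate(p_new, 2 * n_steps)
rel_err = np.linalg.norm(forecast - truth) / np.linalg.norm(truth)
print(f"relative forecast error at p={p_new}: {rel_err:.2e}")
```

The online stage never calls `simulate` at the new parameter except to measure the error, which mirrors the non-intrusive setting: only stored snapshots are used.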
We consider the problem of finite-time identification of linear dynamical systems from $T$ samples of a single trajectory. Recent results have predominantly focused on the setup where no structural assumption is made on the system matrix $A^* \in \mathbb{R}^{n \times n}$, and have consequently analyzed the ordinary least squares (OLS) estimator in detail. We assume prior structural information on $A^*$ is available, which can be captured in the form of a convex set $\mathcal{K}$ containing $A^*$. For the solution of the ensuing constrained least squares estimator, we derive non-asymptotic error bounds in the Frobenius norm that depend on the local size of $\mathcal{K}$ at $A^*$. To illustrate the usefulness of these results, we instantiate them for three examples, namely when (i) $A^*$ is sparse and $\mathcal{K}$ is a suitably scaled $\ell_1$ ball; (ii) $\mathcal{K}$ is a subspace; (iii) $\mathcal{K}$ consists of matrices each of which is formed by sampling a bivariate convex function on a uniform $n \times n$ grid (convex regression). In all these situations, we show that $A^*$ can be reliably estimated for values of $T$ much smaller than what is needed for the unconstrained setting.
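As a rough illustration of example (i), the sketch below compares OLS with a least squares estimator constrained to an entrywise $\ell_1$ ball, computed by projected gradient descent. Everything specific here is an assumption made for the demonstration: the dynamics $x_{t+1} = A^* x_t + \eta_t$ with standard Gaussian noise, the choice $A^* = 0.5\,I$, the oracle radius $\|A^*\|_1$, the solver, and the function names. The projection is the standard sort-based Euclidean projection onto the $\ell_1$ ball.

```python
import numpy as np

def project_l1_ball(v, radius):
    """Euclidean projection of a vector onto {w : ||w||_1 <= radius}."""
    if np.abs(v).sum() <= radius:
        return v.copy()
    u = np.sort(np.abs(v))[::-1]
    css = np.cumsum(u)
    ks = np.arange(1, len(u) + 1)
    rho = np.nonzero(u * ks > css - radius)[0][-1]
    theta = (css[rho] - radius) / (rho + 1.0)
    return np.sign(v) * np.maximum(np.abs(v) - theta, 0.0)

def constrained_ls(X, Y, radius, n_iter=500):
    """Projected gradient for min_A 0.5*||Y - A X||_F^2 subject to sum|A_ij| <= radius."""
    n = X.shape[0]
    step = 1.0 / np.linalg.eigvalsh(X @ X.T)[-1]   # 1 / Lipschitz constant of the gradient
    A = np.zeros((n, n))
    for _ in range(n_iter):
        grad = (A @ X - Y) @ X.T
        A = project_l1_ball((A - step * grad).ravel(), radius).reshape(n, n)
    return A

# Hypothetical single trajectory of a sparse, stable system.
rng = np.random.default_rng(1)
n, T = 10, 15
A_star = 0.5 * np.eye(n)
traj = np.zeros((n, T + 1))
for t in range(T):
    traj[:, t + 1] = A_star @ traj[:, t] + rng.normal(size=n)
X, Y = traj[:, :-1], traj[:, 1:]

A_ols = np.linalg.lstsq(X.T, Y.T, rcond=None)[0].T   # unconstrained baseline
radius = np.abs(A_star).sum()                         # oracle radius, for illustration only
A_con = constrained_ls(X, Y, radius)

print("OLS error (Frobenius):        ", np.linalg.norm(A_ols - A_star))
print("constrained error (Frobenius):", np.linalg.norm(A_con - A_star))
```

In practice the radius is unknown and would have to be chosen, for instance by cross-validation; the oracle value is used here only to keep the sketch short.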
We are interested in numerically solving a transitional model derived from the Bloch model. The Bloch equation describes the time evolution of the density matrix of a quantum system forced by an electromagnetic wave. In a high frequency and low amplitude regime, it asymptotically reduces to a non-stiff rate equation. As a middle ground, the transitional model governs the diagonal part of the density matrix. It fits in a general setting of linear problems with a high-frequency quasi-periodic forcing and an exponentially decaying forcing. The numerical resolution of such problems is challenging. Adapting high-order averaging techniques to this setting, we separate the slow (rate) dynamics from the fast (oscillatory and decay) dynamics to derive a new micro-macro problem. We derive estimates for the size of the micro part of the decomposition, and of its time derivatives, showing that this new problem is non-stiff. As such, we may solve this micro-macro problem with uniform accuracy using standard numerical schemes. To validate this approach, we present numerical results first on a toy problem and then on the transitional Bloch model.
Bayesian inference and kernel methods are well established in machine learning. The neural network Gaussian process in particular provides a concept to investigate neural networks in the limit of infinitely wide hidden layers by using kernel and inference methods. Here we build upon this limit and provide a field-theoretic formalism which covers the generalization properties of infinitely wide networks. We systematically compute generalization properties of linear, non-linear, and deep non-linear networks for kernel matrices with heterogeneous entries. In contrast to currently employed spectral methods we derive the generalization properties from the statistical properties of the input, elucidating the interplay of input dimensionality, size of the training data set, and variability of the data. We show that data variability leads to a non-Gaussian action reminiscent of a ($\varphi^3+\varphi^4$)-theory. Using our formalism on a synthetic task and on MNIST we obtain a homogeneous kernel matrix approximation for the learning curve as well as corrections due to data variability which allow the estimation of the generalization properties and exact results for the bounds of the learning curves in the case of infinitely many training data points.
Which technological linkages affect a sector's ability to innovate? How do these effects transmit through the technology space? This paper answers these two key questions using novel methods of text mining and network analysis. We examine technological interdependence across sectors over a period of half a century (from 1976 to 2021) by analyzing the text of 6.5 million patents granted by the United States Patent and Trademark Office (USPTO), and applying network analysis to uncover the full spectrum of linkages existing across technology areas. We demonstrate that patent text contains a wealth of information often not captured by traditional innovation metrics, such as patent citations. By using network analysis, we document that indirect linkages are as important as direct connections and that the former would remain mostly hidden under more traditional measures of indirect linkages, such as the Leontief inverse matrix. Finally, based on an impulse-response analysis, we illustrate how technological shocks transmit through the technology (network-based) space, affecting the innovation capacity of the sectors.
We propose a new concept of codivergence, which quantifies the similarity between two probability measures $P_1, P_2$ relative to a reference probability measure $P_0$. In the neighborhood of the reference measure $P_0$, a codivergence behaves like an inner product between the measures $P_1 - P_0$ and $P_2 - P_0$. Codivergences of covariance-type and correlation-type are introduced and studied with a focus on two specific correlation-type codivergences, the $\chi^2$-codivergence and the Hellinger codivergence. We derive explicit expressions for several common parametric families of probability distributions. For a codivergence, we introduce moreover the divergence matrix as an analogue of the Gram matrix. It is shown that the $\chi^2$-divergence matrix satisfies a data-processing inequality.
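For discrete distributions, the Gram-matrix analogy can be checked numerically. The abstract does not state the definition, so the sketch assumes the $\chi^2$-codivergence takes the form $\int \frac{dP_1\, dP_2}{dP_0} - 1$, which equals the inner product of $P_1 - P_0$ and $P_2 - P_0$ weighted by $1/p_0$ and reduces to the usual $\chi^2$-divergence when $P_1 = P_2$. The function name, the random distributions, and the Markov kernel `K` are hypothetical.

```python
import numpy as np

def chi2_divergence_matrix(p0, ps):
    """Matrix with entries sum_x p_j(x) p_k(x) / p0(x) - 1 (assumed definition)."""
    ps = np.asarray(ps, dtype=float)
    return (ps / p0) @ ps.T - 1.0

rng = np.random.default_rng(0)
n_states = 5
p0 = rng.dirichlet(np.ones(n_states))            # reference measure P_0
ps = rng.dirichlet(np.ones(n_states), size=3)    # measures P_1, P_2, P_3
M = chi2_divergence_matrix(p0, ps)

# Push every measure through the same row-stochastic kernel (5 states -> 4 states).
K = rng.dirichlet(np.ones(4), size=n_states)
M_coarse = chi2_divergence_matrix(p0 @ K, ps @ K)

# Under the assumed definition, M - M_coarse should be positive semidefinite.
gap = np.linalg.eigvalsh(M - M_coarse).min()
print("divergence matrix:\n", M)
print("smallest eigenvalue of M - M_coarse:", gap)
```

Reading the data-processing inequality as a Loewner-order statement is an interpretation made for this sketch; under the assumed definition it follows from the Cauchy-Schwarz inequality.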
In this work, an exponential Discontinuous Galerkin (DG) method is proposed to numerically solve Vlasov-type equations. The DG method is used for the space discretization and is combined with an exponential Lawson Runge-Kutta method for the time discretization to obtain high-order accuracy in both time and space. In addition to providing high-order accuracy in time, the Lawson methods make it possible to overcome the stringent time-step condition induced by the linear part of the system. Moreover, it can be proved that a discrete Poisson equation is preserved. Numerical results on the Vlasov-Poisson and Vlasov-Maxwell equations are presented to illustrate the good behavior of the exponential DG method.
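The time-step benefit of a Lawson scheme shows up on a scalar model problem $u' = Lu + N(u)$ with a purely imaginary linear part, as in a transport term written in Fourier space. The sketch below implements only the first-order Lawson-Euler step $u_{n+1} = e^{hL}(u_n + hN(u_n))$ and compares it with explicit Euler. The model problem, the constant source standing in for the nonlinear term, and the function names are assumptions. No DG discretization is involved, and higher-order Lawson Runge-Kutta schemes would replace the Euler step.

```python
import numpy as np

def lawson_euler(L, N, u0, h, n_steps):
    """Lawson-Euler: the linear part is integrated exactly through exp(h*L)."""
    u = np.asarray(u0, dtype=complex)
    exp_hL = np.exp(h * L)
    for _ in range(n_steps):
        u = exp_hL * (u + h * N(u))
    return u

def explicit_euler(L, N, u0, h, n_steps):
    """Plain explicit Euler on the full right-hand side, for comparison."""
    u = np.asarray(u0, dtype=complex)
    for _ in range(n_steps):
        u = u + h * (L * u + N(u))
    return u

# Hypothetical model problem: fast oscillation plus a constant source term.
L, c, u0, T = 100j, 1.0, 1.0, 1.0

def source(u):
    return c * np.ones_like(u)

# Closed-form solution of u' = L u + c at time T.
exact = np.exp(L * T) * u0 + c * (np.exp(L * T) - 1.0) / L

h_coarse, h_fine = 0.01, 0.005
err_coarse = abs(lawson_euler(L, source, u0, h_coarse, round(T / h_coarse)) - exact)
err_fine = abs(lawson_euler(L, source, u0, h_fine, round(T / h_fine)) - exact)
euler_final = abs(explicit_euler(L, source, u0, h_coarse, round(T / h_coarse)))

print("Lawson-Euler error, h=0.01 :", err_coarse)
print("Lawson-Euler error, h=0.005:", err_fine)
print("explicit Euler |u(T)|, h=0.01:", euler_final)
```

With $h|L| = 1$ the explicit Euler amplification factor $|1 + hL| = \sqrt{2}$ exceeds one, so that iteration grows without bound, while the Lawson step has $|e^{hL}| = 1$ and its error should roughly halve when $h$ is halved.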
Monte Carlo methods represent a cornerstone of computer science. They allow one to sample high-dimensional distribution functions efficiently. In this paper we consider the extension of Automatic Differentiation (AD) techniques to Monte Carlo processes, addressing the problem of obtaining derivatives (and, in general, the Taylor series) of expectation values. Borrowing ideas from the lattice field theory community, we examine two approaches. One is based on reweighting, while the other is an extension of the Hamiltonian approach typically used by the Hybrid Monte Carlo (HMC) and similar algorithms. We show that the Hamiltonian approach can be understood as a change of variables of the reweighting approach, resulting in much reduced variances of the coefficients of the Taylor series. This work opens the door to finding other variance reduction techniques for derivatives of expectation values.
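The reweighting idea can be illustrated on a one-dimensional Gaussian toy model, where exact answers are available. For samples drawn from $p_\theta(x) \propto e^{-S(x;\theta)}$, reweighting gives $\langle O \rangle_{\theta'} = \langle O\,w \rangle_\theta / \langle w \rangle_\theta$ with $w = e^{-(S(x;\theta') - S(x;\theta))}$, and its first Taylor coefficient is the connected correlator $\partial_\theta \langle O \rangle = -(\langle O\,\partial_\theta S \rangle - \langle O \rangle \langle \partial_\theta S \rangle)$. The action $S = \theta x^2/2$, the observable $x^2$, and all function names are assumptions. The derivative of the action is written by hand rather than produced by an AD tool, and the Hamiltonian approach is not shown.

```python
import numpy as np

def action(x, theta):
    """Toy action S(x; theta) = theta * x^2 / 2, so that x ~ N(0, 1/theta)."""
    return 0.5 * theta * x**2

def d_action(x, theta):
    """Derivative of the action with respect to theta (independent of theta here)."""
    return 0.5 * x**2

def observable(x):
    return x**2

def reweighted_mean(x, theta, theta_new):
    """Estimate <O> at theta_new from samples generated at theta."""
    w = np.exp(-(action(x, theta_new) - action(x, theta)))
    return np.sum(w * observable(x)) / np.sum(w)

def derivative_estimate(x, theta):
    """First Taylor coefficient of the reweighted mean: a connected correlator."""
    O, dS = observable(x), d_action(x, theta)
    return -(np.mean(O * dS) - np.mean(O) * np.mean(dS))

rng = np.random.default_rng(0)
theta = 2.0
# Direct sampling stands in for a Monte Carlo chain targeting exp(-S).
x = rng.normal(scale=1.0 / np.sqrt(theta), size=200_000)

deriv = derivative_estimate(x, theta)
shifted = reweighted_mean(x, theta, 2.1)
print("d<x^2>/dtheta estimate:", deriv, " exact:", -1.0 / theta**2)
print("<x^2> at theta=2.1 estimate:", shifted, " exact:", 1.0 / 2.1)
```

For this model $\langle x^2 \rangle = 1/\theta$, so the estimates can be compared with $-1/\theta^2$ and $1/\theta'$; the statistical scatter of such estimators is the variance that the abstract reports is reduced by the Hamiltonian approach.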