We investigate the use of models from the theory of regularity structures as features in machine learning tasks. A model is a multi-linear function of a space-time signal designed to approximate solutions to partial differential equations (PDEs) well, even in low-regularity regimes. Models can be seen as natural multi-dimensional generalisations of signatures of paths; our work therefore aims to extend the recent use of signatures in data science beyond the context of time-ordered data. We provide a flexible definition of a model feature vector associated to a space-time signal, along with two algorithms which illustrate ways in which these features can be combined with linear regression. We apply these algorithms in several numerical experiments designed to learn solutions to PDEs with a given forcing and boundary data. Our experiments include semi-linear parabolic and wave equations with forcing, and Burgers' equation with no forcing. We find an advantage in favour of our algorithms when compared to several alternative methods. Additionally, in the experiment with Burgers' equation, we observed that the prediction power remains stable when noise is added to the observations.
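To make the feature construction concrete, the following minimal sketch (our own illustration, not the paper's exact definition: the explicit-Euler propagator, the four-element feature set, and the toy target are all assumptions) builds model-like features from a white-noise forcing by iterated heat-kernel smoothing and pointwise products, then combines them with plain linear regression:

```python
import numpy as np

def heat_propagate(f, dt, dx, nu=1.0):
    # Explicit-Euler solution of u_t = nu*u_xx + f with periodic boundary:
    # a crude stand-in for space-time convolution with the heat kernel.
    u = np.zeros_like(f)
    for n in range(1, f.shape[0]):
        lap = (np.roll(u[n-1], -1) - 2.0*u[n-1] + np.roll(u[n-1], 1)) / dx**2
        u[n] = u[n-1] + dt * (nu * lap + f[n-1])
    return u

def model_features(xi, dt, dx):
    # Illustrative feature set: 1, I[xi], I[xi]^2, I[I[xi]^2],
    # where I[.] denotes heat-kernel smoothing of a space-time signal.
    Ixi = heat_propagate(xi, dt, dx)
    return np.stack([np.ones_like(xi), Ixi, Ixi**2,
                     heat_propagate(Ixi**2, dt, dx)], axis=-1)

rng = np.random.default_rng(0)
nt, nx = 200, 64
dt, dx = 1e-4, 1.0 / 64                            # dt*nu/dx^2 < 0.5: stable
xi = rng.normal(size=(nt, nx)) / np.sqrt(dt * dx)  # white-noise forcing
# Toy target whose solution is exactly linear in the features above.
Ixi = heat_propagate(xi, dt, dx)
u_obs = Ixi + 0.1 * heat_propagate(Ixi**2, dt, dx)

X = model_features(xi, dt, dx).reshape(-1, 4)
y = u_obs.reshape(-1)
coef, *_ = np.linalg.lstsq(X, y, rcond=None)       # plain linear regression
print("learned coefficients:", coef)               # approx. [0, 1, 0, 0.1]
```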
Modern datasets often contain large subsets of correlated features and nuisance features, which are unrelated, or only loosely related, to the main underlying structures of the data. Nuisance features can be identified using the Laplacian score criterion, which evaluates the importance of a given feature via its consistency with the graph Laplacian's leading eigenvectors. We demonstrate that in the presence of large numbers of nuisance features, the Laplacian must be computed on the subset of selected features rather than on the complete feature set. To do this, we propose a fully differentiable approach for unsupervised feature selection, utilizing the Laplacian score criterion to avoid the selection of nuisance features. To cope with correlated features, we employ an autoencoder architecture trained to reconstruct the data from the subset of selected features. We build on the recently proposed concrete layer, which allows the number of selected features to be controlled by architectural design, simplifying the optimization process. In experiments on several real-world datasets, we demonstrate that our proposed approach outperforms similar approaches that are designed to avoid only correlated features or only nuisance features, but not both. Several state-of-the-art clustering results are reported.
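The Laplacian score itself is a classical criterion; a minimal sketch (the k-NN graph construction and toy data are illustrative) shows the key point above, with the graph deliberately built from a chosen feature subset rather than from the complete feature set:

```python
import numpy as np
from scipy.spatial import distance_matrix

def laplacian_score(X, X_graph, k=5, t=1.0):
    # Lower score = feature is more consistent with the graph structure.
    # Note: the affinity graph is built from X_graph, a feature subset.
    D2 = distance_matrix(X_graph, X_graph) ** 2
    W = np.exp(-D2 / t)
    idx = np.argsort(D2, axis=1)[:, k+1:]      # keep self + k nearest neighbours
    for i, far in enumerate(idx):
        W[i, far] = 0.0
    W = np.maximum(W, W.T)                     # symmetrise
    d = W.sum(axis=1)
    L = np.diag(d) - W                         # graph Laplacian
    scores = []
    for r in range(X.shape[1]):
        f = X[:, r]
        f = f - (f @ d) / d.sum()              # remove the trivial component
        scores.append((f @ L @ f) / (f @ (d * f) + 1e-12))
    return np.array(scores)

rng = np.random.default_rng(0)
informative = np.repeat(np.eye(3), 50, axis=0) + 0.1 * rng.normal(size=(150, 3))
nuisance = rng.normal(size=(150, 5))
X = np.hstack([informative, nuisance])
print(laplacian_score(X, X_graph=informative))  # informative features score lower
```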
The promise of quantum computing to open new unexplored possibilities in several scientific fields has been long discussed, but until recently the lack of a functional quantum computer confined this discussion mostly to theoretical algorithmic papers. Only in the last few years have small but functional quantum computers become available to the broader research community. One paradigm in particular, quantum annealing, can be used to sample optimal solutions for a number of NP-hard optimization problems represented with classical operations research tools, providing easy access to the potential of this emerging technology. One of the tasks that most naturally fits this mathematical formulation is feature selection. In this paper, we investigate how to design a hybrid feature selection algorithm for recommender systems that leverages the domain knowledge and behavior hidden in user interaction data. We represent the feature selection task as an optimization problem and solve it on a real quantum computer, provided by D-Wave. The results indicate that the proposed approach is effective in selecting a limited set of important features and that quantum computers are becoming powerful enough to enter the wider realm of applied science.
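The underlying optimization is a QUBO: binary indicators $x_i$ mark selected features, diagonal terms reward relevance, off-diagonal terms penalise redundancy, and a quadratic penalty fixes the subset size. The sketch below is illustrative (the relevance/redundancy scores and the penalty weight are assumptions) and solves the QUBO by brute force where the paper uses a D-Wave annealer:

```python
import itertools
import numpy as np

rng = np.random.default_rng(0)
n, k = 8, 3                                      # candidate features, subset size
relevance = rng.random(n)                        # e.g. a feature-importance score
redundancy = np.abs(rng.normal(0.0, 0.1, size=(n, n)))  # e.g. |corr| between features
redundancy = (redundancy + redundancy.T) / 2.0

Q = redundancy.copy()
np.fill_diagonal(Q, -relevance)                  # minimise redundancy - relevance
P = 10.0                                         # penalty weight dominating the rest
Q = Q + P * (np.ones((n, n)) - 2.0 * k * np.eye(n))  # adds P*((sum x)^2 - 2k*sum x)

best_x, best_e = None, np.inf                    # brute force stands in for the annealer
for bits in itertools.product([0, 1], repeat=n):
    x = np.array(bits, dtype=float)
    e = x @ Q @ x                                # QUBO energy x^T Q x
    if e < best_e:
        best_x, best_e = x, e
print("selected features:", np.flatnonzero(best_x))
```

The cardinality penalty expands to $P\big((\sum_i x_i - k)^2 - k^2\big)$ up to a constant, so any minimiser selects exactly $k$ features once $P$ dominates the relevance and redundancy scales.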
One of the major issues in computational mechanics is accounting for geometric complexity. To overcome this difficulty and avoid expensive mesh generation, geometrically unfitted methods, i.e. numerical methods using simple computational meshes that do not fit the boundary of the domain and/or the internal interfaces, have been widely developed. In the present work, we investigate the performance of an unfitted method called $\phi$-FEM, which converges optimally and uses classical finite element spaces, so that it can be easily implemented using general FEM libraries. The main idea is to take the geometry into account through a level set function describing the boundary or the interface. Up to now, the $\phi$-FEM approach has been proposed, tested and substantiated mathematically only in the simplest settings: the Poisson equation with Dirichlet/Neumann/Robin boundary conditions. Our goal here is to demonstrate its applicability to more sophisticated governing equations arising in computational mechanics. We consider the linear elasticity equations accompanied by either pure Dirichlet boundary conditions or mixed ones (Dirichlet and Neumann boundary conditions coexisting on different parts of the boundary), an interface problem (linear elasticity with material coefficients changing abruptly across an internal interface), a model of elastic structures with cracks, and finally the heat equation. In all these settings, we derive an appropriate variant of $\phi$-FEM and then illustrate it by numerical tests on manufactured solutions. We also compare the accuracy and efficiency of $\phi$-FEM with those of the standard fitted FEM on meshes of similar size, revealing the substantial gains that $\phi$-FEM can achieve in both accuracy and computational time.
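To make the main idea concrete, here is a sketch of the $\phi$-FEM formulation for the homogeneous Dirichlet Poisson problem $-\Delta u = f$ in $\Omega$, $u = 0$ on $\partial\Omega$ (stabilisation terms are abbreviated; the elasticity, interface, crack and heat variants derived in the paper follow the same pattern). Given a level set $\phi$ with $\Omega = \{\phi < 0\}$ and $\partial\Omega = \{\phi = 0\}$, the ansatz $u = \phi w$ satisfies the boundary condition by construction, and one solves for $w_h$ in a standard finite element space $V_h$ on a simple unfitted mesh $\Omega_h \supset \Omega$:
\[
\int_{\Omega_h} \nabla(\phi w_h)\cdot\nabla(\phi v_h)\,dx \;-\; \int_{\partial\Omega_h} \frac{\partial(\phi w_h)}{\partial n}\,\phi v_h\,ds \;+\; \text{stabilisation} \;=\; \int_{\Omega_h} f\,\phi v_h\,dx \qquad \forall\, v_h \in V_h.
\]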
Harnessing a block-sparse prior to recover signals from underdetermined linear measurements has been extensively shown to allow exact recovery in conditions where classical compressed sensing would provably fail. We exploit this result to propose a novel private communication framework in which secrecy is achieved by transmitting instances of an unidentifiable compressed sensing problem over a public channel. The legitimate receiver can attempt to overcome this ill-posedness by leveraging secret knowledge of the block structure that was used to encode the transmitter's message. We study the privacy guarantees of this communication protocol for a single transmission, and for multiple transmissions without refreshing the shared secret. Additionally, we propose an algorithm by which an eavesdropper can learn the block structure via the method of moments, and highlight the privacy benefits of this framework through numerical experiments.
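The encoding/decoding asymmetry can be sketched as follows (a minimal illustration with assumed parameters; block iterative hard thresholding stands in for whatever decoder the legitimate receiver actually uses): the message occupies a single secret block, the public channel carries $y = Ax$, and knowledge of the block partition makes an otherwise underdetermined problem solvable.

```python
import numpy as np

rng = np.random.default_rng(1)
n, m, B, k = 64, 24, 8, 1              # dimension, measurements, block size, active blocks
A = rng.normal(size=(m, n)) / np.sqrt(m)

x = np.zeros(n)
secret_block = 3                       # shared secret: which block carries the message
x[secret_block*B:(secret_block+1)*B] = rng.normal(size=B)
y = A @ x                              # transmitted over the public channel

def block_iht(y, A, B, k, iters=500):
    # Block iterative hard thresholding: gradient steps, then keep only
    # the k blocks with the largest Euclidean norm.
    step = 1.0 / np.linalg.norm(A, ord=2) ** 2   # conservative step size
    x_hat = np.zeros(A.shape[1])
    for _ in range(iters):
        x_hat = x_hat + step * (A.T @ (y - A @ x_hat))
        norms = np.linalg.norm(x_hat.reshape(-1, B), axis=1)
        mask = np.zeros((A.shape[1] // B, B))
        mask[np.argsort(norms)[-k:]] = 1.0       # keep the k strongest blocks
        x_hat = x_hat * mask.reshape(-1)
    return x_hat

x_hat = block_iht(y, A, B, k)
print("relative recovery error:", np.linalg.norm(x_hat - x) / np.linalg.norm(x))
```

An eavesdropper who does not know the block partition faces a generic sparse problem in which many candidate supports explain $y$ equally well, which is the source of the framework's secrecy.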
Domain adaptation (DA) aims to alleviate the domain shift between a source domain and a target domain. Most DA methods require access to the source data, but often that is not possible (e.g. due to data privacy or intellectual property constraints). In this paper, we address the challenging source-free domain adaptation (SFDA) problem, where a source-pretrained model is adapted to the target domain in the absence of source data. Our method is based on the observation that target data, which might no longer align with the source domain classifier, still form clear clusters. We capture this intrinsic structure by defining a local affinity of the target data and encouraging label consistency among data with high local affinity. We observe that higher affinity should be assigned to reciprocal neighbors, and propose a self-regularization loss to decrease the negative impact of noisy neighbors. Furthermore, to aggregate information with more context, we consider expanded neighborhoods with small affinity values. Our experimental results verify that the inherent structure of the target features is an important source of information for domain adaptation. We demonstrate that this local structure can be efficiently captured by considering the local neighbors, the reciprocal neighbors, and the expanded neighborhood. Finally, we achieve state-of-the-art performance on several 2D image and 3D point cloud recognition datasets. Code is available at https://github.com/Albert0147/SFDA_neighbors.
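A minimal sketch of the neighborhood-consistency idea (the affinity weighting and loss form are simplified relative to the paper, and the expanded-neighborhood step is omitted): reciprocal neighbors receive higher affinity, and each point's soft prediction is encouraged to agree with its neighbors' predictions.

```python
import numpy as np

def knn(feats, k):
    sim = feats @ feats.T                        # cosine similarity (rows unit-norm)
    np.fill_diagonal(sim, -np.inf)               # exclude self
    return np.argsort(-sim, axis=1)[:, :k]

def neighbourhood_consistency(feats, probs, k=5):
    nbrs = knn(feats, k)
    loss = 0.0
    for i in range(len(feats)):
        for j in nbrs[i]:
            w = 1.0 if i in nbrs[j] else 0.1     # reciprocal neighbours weigh more
            loss -= w * np.log(probs[i] @ probs[j] + 1e-8)  # prediction agreement
    return loss / len(feats)

rng = np.random.default_rng(0)
feats = rng.normal(size=(100, 32))               # target features (from a feature bank)
feats /= np.linalg.norm(feats, axis=1, keepdims=True)
logits = rng.normal(size=(100, 10))
probs = np.exp(logits) / np.exp(logits).sum(axis=1, keepdims=True)  # softmax outputs
print("consistency loss:", neighbourhood_consistency(feats, probs))
```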
In this paper, we focus on learning sparse graphs with a core-periphery structure. We propose a generative model for data associated with core-periphery structured networks, which models the dependence of node attributes on the core scores of the nodes through a latent graph structure. Using the proposed model, we jointly infer a sparse graph and nodal core scores that induce dense connections in the core parts of the network and sparse connections in the peripheral parts. Numerical experiments on a variety of real-world data indicate that the proposed method learns a core-periphery structured graph from node attributes alone, while simultaneously learning core score assignments that agree well with those of existing methods, which estimate core scores using the graph as input and ignore commonly available node attributes.
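A purely illustrative generative sketch consistent with this description (the paper's actual model may differ in every detail: the edge-probability rule and the attribute-smoothing step below are our assumptions): edges are more likely between high-core-score nodes, and node attributes are obtained by smoothing noise over the resulting graph, so the attributes carry information about both the latent graph and the core scores.

```python
import numpy as np

rng = np.random.default_rng(0)
n, d = 30, 16
c = np.sort(rng.random(n))[::-1]                 # core scores in [0, 1], descending
P = np.minimum(1.0, np.add.outer(c, c))          # edge probability grows with coreness
A = (rng.random((n, n)) < P).astype(float)
A = np.triu(A, 1)
A = A + A.T                                      # dense core, sparse periphery

L = np.diag(A.sum(axis=1)) - A                   # graph Laplacian
X = np.linalg.solve(np.eye(n) + 0.5 * L, rng.normal(size=(n, d)))  # smooth attributes
print("core degrees:", A[:5].sum(axis=1), "periphery degrees:", A[-5:].sum(axis=1))
```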
We formulate a theory of shape valid for objects of arbitrary dimension whose contours are path-connected. We apply this theory to the design and modeling of viable trajectories of complex dynamical systems. Infinite families of qualitatively similar shapes are constructed taking as input a finite ordered set of characteristic points (landmarks) and the value of a continuous parameter $\kappa \in (0,\infty)$. We prove that all shapes belonging to the same family are located within the convex hull of the landmarks. The theory is constructive in the sense that it provides a systematic means to build a mathematical model for any shape taken from the physical world. We illustrate this with a variety of examples: (chaotic) time series, plane curves, space-filling curves, knots and strange attractors.
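The convex hull property has a simple illustration. In the sketch below (our own construction, not necessarily the authors': the kernel weights and the exact role of $\kappa$ are assumptions), every point of the generated curve is a convex combination of the landmarks, so the curve cannot leave their convex hull, and varying $\kappa$ traces out a one-parameter family of qualitatively similar shapes:

```python
import numpy as np

def shape_from_landmarks(landmarks, kappa, num=400):
    P = np.asarray(landmarks, dtype=float)       # (m, d) ordered landmarks
    anchors = np.linspace(0.0, 1.0, len(P))      # parameter value of each landmark
    t = np.linspace(0.0, 1.0, num)[:, None]
    w = np.exp(-kappa * (t - anchors) ** 2)      # kernel weight of each landmark
    w /= w.sum(axis=1, keepdims=True)            # convex weights: rows sum to 1
    return w @ P                                 # curve stays in the convex hull

square = [(0, 0), (1, 0), (1, 1), (0, 1), (0, 0)]
for kappa in (10.0, 100.0, 1000.0):              # small kappa: blob; large: polygon
    curve = shape_from_landmarks(square, kappa)
    print(kappa, curve.min(axis=0), curve.max(axis=0))  # bounds stay inside the hull
```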
Graph neural networks (GNNs) are a popular class of machine learning models whose major advantage is their ability to incorporate a sparse and discrete dependency structure between data points. Unfortunately, GNNs can only be used when such a graph structure is available. In practice, however, real-world graphs are often noisy and incomplete, or might not be available at all. In this work, we propose to jointly learn the graph structure and the parameters of graph convolutional networks (GCNs) by approximately solving a bilevel program that learns a discrete probability distribution on the edges of the graph. This allows one to apply GCNs not only in scenarios where the given graph is incomplete or corrupted, but also in those where no graph is available. We conduct a series of experiments that analyze the behavior of the proposed method and demonstrate that it outperforms related methods by a significant margin.
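A minimal sketch of the core object in such a bilevel approach (parameter values and the Monte Carlo averaging are illustrative; the alternating inner/outer optimisation loop is omitted): a matrix of Bernoulli edge probabilities $\theta$, graphs sampled from it, and a GCN layer evaluated on the samples. The inner problem would train the GCN weights on sampled graphs, while the outer problem would update $\theta$ against a validation loss.

```python
import numpy as np

rng = np.random.default_rng(0)
n, d, h = 20, 8, 4
X = rng.normal(size=(n, d))                      # node features
theta = rng.uniform(0.05, 0.3, size=(n, n))      # per-edge Bernoulli probabilities
theta = np.triu(theta, 1)
theta = theta + theta.T                          # symmetric, zero diagonal

def sample_graph(theta):
    A = (rng.random(theta.shape) < theta).astype(float)
    A = np.triu(A, 1)
    return A + A.T                               # undirected sample, no self-loops

def gcn_layer(A, X, W):
    A_hat = A + np.eye(len(A))                   # add self-loops
    d_inv = 1.0 / np.sqrt(A_hat.sum(axis=1))
    A_norm = d_inv[:, None] * A_hat * d_inv[None, :]
    return np.maximum(A_norm @ X @ W, 0.0)       # ReLU(D^{-1/2} A D^{-1/2} X W)

W = rng.normal(size=(d, h))
out = np.mean([gcn_layer(sample_graph(theta), X, W) for _ in range(16)], axis=0)
print(out.shape)                                 # Monte Carlo estimate of E[GCN output]
```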
Matter evolved under the influence of gravity from minuscule density fluctuations. Non-perturbative structure formed hierarchically over all scales and developed the non-Gaussian features of the Universe known as the Cosmic Web. Fully understanding the structure formation of the Universe is one of the holy grails of modern astrophysics. Astrophysicists survey large volumes of the Universe and employ large ensembles of computer simulations to compare with the observed data in order to extract the full information content of our own Universe. However, evolving trillions of galaxies over billions of years, even with the simplest physics, is a daunting task. We build a deep neural network, the Deep Density Displacement Model (hereafter D$^3$M), to predict the non-linear structure formation of the Universe from simple linear perturbation theory. Our extensive analysis demonstrates that D$^3$M outperforms second-order Lagrangian perturbation theory (hereafter 2LPT), the commonly used fast approximate simulation method, in point-wise comparison, two-point correlation, and three-point correlation. We also show that D$^3$M is able to accurately extrapolate far beyond its training data and predict structure formation for significantly different cosmological parameters. Our study proves, for the first time, that deep learning is a practical and accurate alternative to approximate simulations of the gravitational structure formation of the Universe.
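A sketch of the learning setup described above (the architecture, sizes, and residual formulation here are assumptions; the real D$^3$M is a much larger network): a 3D convolutional model mapping the linear-theory displacement field to the non-linear displacement field, trained with a mean-squared-error loss against simulation outputs.

```python
import torch
import torch.nn as nn

class TinyD3M(nn.Module):
    # Toy 3D CNN: predicts a correction to the linear-theory displacements.
    def __init__(self, ch=16):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv3d(3, ch, 3, padding=1), nn.ReLU(),
            nn.Conv3d(ch, ch, 3, padding=1), nn.ReLU(),
            nn.Conv3d(ch, 3, 3, padding=1),
        )

    def forward(self, x):            # x: (batch, 3, N, N, N) linear displacements
        return x + self.net(x)       # residual: non-linear = linear + correction

model = TinyD3M()
za = torch.randn(2, 3, 32, 32, 32)       # linear-theory displacement field (input)
target = torch.randn(2, 3, 32, 32, 32)   # simulated non-linear displacement (label)
loss = nn.functional.mse_loss(model(za), target)
loss.backward()
print(float(loss))
```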
The era of big data provides researchers with convenient access to copious data. However, people often have little knowledge about the data itself. The increasing prevalence of big data is challenging the traditional methods of learning causality, because these methods were developed for cases with limited amounts of data and solid prior causal knowledge. This survey aims to close the gap between big data and learning causality with a comprehensive and structured review of traditional and frontier methods, and a discussion of some open problems in learning causality. We begin with preliminaries of learning causality. Then we categorize and revisit methods of learning causality for typical problems and data types. After that, we discuss the connections between learning causality and machine learning. Finally, we present some open problems to show the great potential of learning causality with data.