In ultracold atom experiments, data often comes in the form of images which suffer information loss inherent in the techniques used to prepare and measure the system. This is particularly problematic when the processes of interest are complicated, such as interactions among excitations in Bose-Einstein condensates (BECs). In this paper, we describe a framework combining machine learning (ML) models with physics-based traditional analyses to identify and track multiple solitonic excitations in images of BECs. We use an ML-based object detector to locate the solitonic excitations and develop a physics-informed classifier to sort solitonic excitations into physically motivated sub-categories. Lastly, we introduce a quality metric quantifying the likelihood that a specific feature is a kink soliton. Our trained implementation of this framework -- SolDet -- is publicly available as an open-source python package. SolDet is broadly applicable to feature identification in cold atom images when trained on a suitable user-provided dataset.
Robust machine learning is an increasingly important topic that focuses on developing models resilient to various forms of imperfect data. Due to the pervasiveness of recommender systems in online technologies, researchers have carried out several robustness studies focusing on data sparsity and profile injection attacks. Instead, we propose a more holistic view of robustness for recommender systems that encompasses multiple dimensions - robustness with respect to sub-populations, transformations, distributional disparity, attack, and data sparsity. While there are several libraries that allow users to compare different recommender system models, there is no software library for comprehensive robustness evaluation of recommender system models under different scenarios. As our main contribution, we present a robustness evaluation toolkit, Robustness Gym for RecSys (RGRecSys -- //www.github.com/salesforce/RGRecSys), that allows us to quickly and uniformly evaluate the robustness of recommender system models.
The inaccessibility of controlled randomized trials due to inherent constraints in many fields of science has been a fundamental issue in causal inference. In this paper, we focus on distinguishing the cause from effect in the bivariate setting under limited observational data. Based on recent developments in meta learning as well as in causal inference, we introduce a novel generative model that allows distinguishing cause and effect in the small data setting. Using a learnt task variable that contains distributional information of each dataset, we propose an end-to-end algorithm that makes use of similar training datasets at test time. We demonstrate our method on various synthetic as well as real-world data and show that it is able to maintain high accuracy in detecting directions across varying dataset sizes.
Recent advances in sensor and mobile devices have enabled an unprecedented increase in the availability and collection of urban trajectory data, thus increasing the demand for more efficient ways to manage and analyze the data being produced. In this survey, we comprehensively review recent research trends in trajectory data management, ranging from trajectory pre-processing, storage, common trajectory analytic tools, such as querying spatial-only and spatial-textual trajectory data, and trajectory clustering. We also explore four closely related analytical tasks commonly used with trajectory data in interactive or real-time processing. Deep trajectory learning is also reviewed for the first time. Finally, we outline the essential qualities that a trajectory management system should possess in order to maximize flexibility.
We present a continuous formulation of machine learning, as a problem in the calculus of variations and differential-integral equations, very much in the spirit of classical numerical analysis and statistical physics. We demonstrate that conventional machine learning models and algorithms, such as the random feature model, the shallow neural network model and the residual neural network model, can all be recovered as particular discretizations of different continuous formulations. We also present examples of new models, such as the flow-based random feature model, and new algorithms, such as the smoothed particle method and spectral method, that arise naturally from this continuous formulation. We discuss how the issues of generalization error and implicit regularization can be studied under this framework.
We introduce the Neural State Machine, seeking to bridge the gap between the neural and symbolic views of AI and integrate their complementary strengths for the task of visual reasoning. Given an image, we first predict a probabilistic graph that represents its underlying semantics and serves as a structured world model. Then, we perform sequential reasoning over the graph, iteratively traversing its nodes to answer a given question or draw a new inference. In contrast to most neural architectures that are designed to closely interact with the raw sensory data, our model operates instead in an abstract latent space, by transforming both the visual and linguistic modalities into semantic concept-based representations, thereby achieving enhanced transparency and modularity. We evaluate our model on VQA-CP and GQA, two recent VQA datasets that involve compositionality, multi-step inference and diverse reasoning skills, achieving state-of-the-art results in both cases. We provide further experiments that illustrate the model's strong generalization capacity across multiple dimensions, including novel compositions of concepts, changes in the answer distribution, and unseen linguistic structures, demonstrating the qualities and efficacy of our approach.
Deep neural networks have been shown to be very powerful modeling tools for many supervised learning tasks involving complex input patterns. However, they can also easily overfit to training set biases and label noises. In addition to various regularizers, example reweighting algorithms are popular solutions to these problems, but they require careful tuning of additional hyperparameters, such as example mining schedules and regularization hyperparameters. In contrast to past reweighting methods, which typically consist of functions of the cost value of each example, in this work we propose a novel meta-learning algorithm that learns to assign weights to training examples based on their gradient directions. To determine the example weights, our method performs a meta gradient descent step on the current mini-batch example weights (which are initialized from zero) to minimize the loss on a clean unbiased validation set. Our proposed method can be easily implemented on any type of deep network, does not require any additional hyperparameter tuning, and achieves impressive performance on class imbalance and corrupted label problems where only a small amount of clean validation data is available.
Machine learning methods are powerful in distinguishing different phases of matter in an automated way and provide a new perspective on the study of physical phenomena. We train a Restricted Boltzmann Machine (RBM) on data constructed with spin configurations sampled from the Ising Hamiltonian at different values of temperature and external magnetic field using Monte Carlo methods. From the trained machine we obtain the flow of iterative reconstruction of spin state configurations to faithfully reproduce the observables of the physical system. We find that the flow of the trained RBM approaches the spin configurations of the maximal possible specific heat which resemble the near criticality region of the Ising model. In the special case of the vanishing magnetic field the trained RBM converges to the critical point of the Renormalization Group (RG) flow of the lattice model. Our results suggest an alternative explanation of how the machine identifies the physical phase transitions, by recognizing certain properties of the configuration like the maximization of the specific heat, instead of associating directly the recognition procedure with the RG flow and its fixed points. Then from the reconstructed data we deduce the critical exponent associated to the magnetization to find satisfactory agreement with the actual physical value. We assume no prior knowledge about the criticality of the system and its Hamiltonian.
This tutorial is based on the lecture notes for the courses "Machine Learning: Basic Principles" and "Artificial Intelligence", which I have (co-)taught since 2015 at Aalto University. The aim is to provide an accessible introduction to some of the main concepts and methods within machine learning. Many of the current systems which are considered as (artificially) intelligent are based on combinations of few basic machine learning methods. After formalizing the main building blocks of a machine learning problem, some popular algorithmic design patterns formachine learning methods are discussed in some detail.
Meta-learning is a powerful tool that builds on multi-task learning to learn how to quickly adapt a model to new tasks. In the context of reinforcement learning, meta-learning algorithms can acquire reinforcement learning procedures to solve new problems more efficiently by meta-learning prior tasks. The performance of meta-learning algorithms critically depends on the tasks available for meta-training: in the same way that supervised learning algorithms generalize best to test points drawn from the same distribution as the training points, meta-learning methods generalize best to tasks from the same distribution as the meta-training tasks. In effect, meta-reinforcement learning offloads the design burden from algorithm design to task design. If we can automate the process of task design as well, we can devise a meta-learning algorithm that is truly automated. In this work, we take a step in this direction, proposing a family of unsupervised meta-learning algorithms for reinforcement learning. We describe a general recipe for unsupervised meta-reinforcement learning, and describe an effective instantiation of this approach based on a recently proposed unsupervised exploration technique and model-agnostic meta-learning. We also discuss practical and conceptual considerations for developing unsupervised meta-learning methods. Our experimental results demonstrate that unsupervised meta-reinforcement learning effectively acquires accelerated reinforcement learning procedures without the need for manual task design, significantly exceeds the performance of learning from scratch, and even matches performance of meta-learning methods that use hand-specified task distributions.
MLtuner automatically tunes settings for training tunables (such as the learning rate, the momentum, the mini-batch size, and the data staleness bound) that have a significant impact on large-scale machine learning (ML) performance. Traditionally, these tunables are set manually, which is unsurprisingly error-prone and difficult to do without extensive domain knowledge. MLtuner uses efficient snapshotting, branching, and optimization-guided online trial-and-error to find good initial settings as well as to re-tune settings during execution. Experiments show that MLtuner can robustly find and re-tune tunable settings for a variety of ML applications, including image classification (for 3 models and 2 datasets), video classification, and matrix factorization. Compared to state-of-the-art ML auto-tuning approaches, MLtuner is more robust for large problems and over an order of magnitude faster.