Scene reconstruction in the presence of high-speed motion and low illumination is important in many applications such as augmented and virtual reality, drone navigation, and autonomous robotics. Traditional motion estimation techniques fail in such conditions, suffering from too much blur in the presence of high-speed motion and strong noise in low-light conditions. Single-photon cameras have recently emerged as a promising technology capable of capturing hundreds of thousands of photon frames per second thanks to their high speed and extreme sensitivity. Unfortunately, traditional computer vision techniques are not well suited for dealing with the binary-valued photon data captured by these cameras because these are corrupted by extreme Poisson noise. Here we present a method capable of estimating extreme scene motion under challenging conditions, such as low light or high dynamic range, from a sequence of high-speed image frames such as those captured by a single-photon camera. Our method relies on iteratively improving a motion estimate by grouping and aggregating frames after-the-fact, in a stratified manner. We demonstrate the creation of high-quality panoramas under fast motion and extremely low light, and super-resolution results using a custom single-photon camera prototype. For code and supplemental material see our $\href{//wisionlab.com/project/panoramas-from-photons/}{\text{project webpage}}$.
Estimating weights in the synthetic control method, typically resulting in sparse weights where only a few control units have non-zero weights, involves an optimization procedure that simultaneously selects and aligns control units to closely match the treated unit. However, this simultaneous selection and alignment of control units may lead to a loss of efficiency. Another concern arising from the aforementioned procedure is its susceptibility to under-fitting due to imperfect pre-treatment fit. It is not uncommon for the linear combination, using nonnegative weights, of pre-treatment period outcomes for the control units to inadequately approximate the pre-treatment outcomes for the treated unit. To address both of these issues, this paper proposes a simple and effective method called Synthetic Regressing Control (SRC). The SRC method begins by performing the univariate linear regression to appropriately align the pre-treatment periods of the control units with the treated unit. Subsequently, a SRC estimator is obtained by synthesizing (taking a weighted average) the fitted controls. To determine the weights in the synthesis procedure, we propose an approach that utilizes a criterion of unbiased risk estimator. Theoretically, we show that the synthesis way is asymptotically optimal in the sense of achieving the lowest possible squared error. Extensive numerical experiments highlight the advantages of the SRC method.
Significant computational resources are required to train Graph Neural Networks (GNNs) at a large scale, and the process is highly data-intensive. One of the most effective ways to reduce resource requirements is minibatch training coupled with graph sampling. GNNs have the unique property that items in a minibatch have overlapping data. However, the commonly implemented Independent Minibatching approach assigns each Processing Element (PE) its own minibatch to process, leading to duplicated computations and input data access across PEs. This amplifies the Neighborhood Explosion Phenomenon (NEP), which is the main bottleneck limiting scaling. To reduce the effects of NEP in the multi-PE setting, we propose a new approach called Cooperative Minibatching. Our approach capitalizes on the fact that the size of the sampled subgraph is a concave function of the batch size, leading to significant reductions in the amount of work per seed vertex as batch sizes increase. Hence, it is favorable for processors equipped with a fast interconnect to work on a large minibatch together as a single larger processor, instead of working on separate smaller minibatches, even though global batch size is identical. We also show how to take advantage of the same phenomenon in serial execution by generating dependent consecutive minibatches. Our experimental evaluations show up to 4x bandwidth savings for fetching vertex embeddings, by simply increasing this dependency without harming model convergence. Combining our proposed approaches, we achieve up to 64% speedup over Independent Minibatching on single-node multi-GPU systems.
We consider the problem of automatically inferring specifications in the branching-time logic, Computation Tree Logic (CTL), from a given system. Designing functional and usable specifications has always been one of the biggest challenges of formal methods. While in recent years, works have focused on automatically designing specifications in linear-time logics such as Linear Temporal Logic (LTL) and Signal Temporal Logic (STL), little attention has been given to branching-time logics despite its popularity in formal methods. We intend to infer concise (thus, interpretable) CTL formulas from a given finite state model of the system in consideration. However, inferring specification only from the given model (and, in general, from only positive examples) is an ill-posed problem. As a result, we infer a CTL formula that, along with being concise, is also language-minimal, meaning that it is rather specific to the given model. We design a counter-example guided algorithm to infer a concise and language-minimal CTL formula via the generation of undesirable models. In the process, we also develop, for the first time, a passive learning algorithm to infer CTL formulas from a set of desirable and undesirable Kripke structures. The passive learning algorithm involves encoding a popular CTL model-checking procedure in the Boolean Satisfiability problem.
Quadrotors are widely used for surveillance, mapping, and deliveries. In several scenarios the quadrotor operates in pure inertial navigation mode resulting in a navigation solution drift. To handle such situations and bind the navigation drift, the quadrotor dead reckoning (QDR) approach requires flying the quadrotor in a periodic trajectory. Then, using model or learning based approaches the quadrotor position vector can be estimated. We propose to use multiple inertial measurement units (MIMU) to improve the positioning accuracy of the QDR approach. Several methods to utilize MIMU data in a deep learning framework are derived and evaluated. Field experiments were conducted to validate the proposed approach and show its benefits.
Accurately measuring discrimination in machine learning-based automated decision systems is required to address the vital issue of fairness between subpopulations and/or individuals. Any bias in measuring discrimination can lead to either amplification or underestimation of the true value of discrimination. This paper focuses on a class of bias originating in the way training data is generated and/or collected. We call such class causal biases and use tools from the field of causality to formally define and analyze such biases. Four sources of bias are considered, namely, confounding, selection, measurement, and interaction. The main contribution of this paper is to provide, for each source of bias, a closed-form expression in terms of the model parameters. This makes it possible to analyze the behavior of each source of bias, in particular, in which cases they are absent and in which other cases they are maximized. We hope that the provided characterizations help the community better understand the sources of bias in machine learning applications.
The problem of finding a constant bound on a term given a set of assumptions has wide applications in optimization as well as program analysis. However, in many contexts the objective term may be unbounded. Still, some sort of symbolic bound may be useful. In this paper we introduce the optimal symbolic-bound synthesis problem, and a technique that tackles this problem for non-linear arithmetic with function symbols. This allows us to automatically produce symbolic bounds on complex arithmetic expressions from a set of both equality and inequality assumptions. Our solution employs a novel combination of powerful mathematical objects -- Gr\"obner bases together with polyhedral cones -- to represent an infinite set of implied inequalities. We obtain a sound symbolic bound by reducing the objective term by this infinite set. We implemented our method in a tool, AutoBound, which we tested on problems originating from real Solidity programs. We find that AutoBound yields relevant bounds in each case, matching or nearly-matching upper bounds produced by a human analyst on the same set of programs.
Radar odometry estimation has emerged as a critical technique in the field of autonomous navigation, providing robust and reliable motion estimation under various environmental conditions. Despite its potential, the complex nature of radar signals and the inherent challenges associated with processing these signals have limited the widespread adoption of this technology. This paper aims to address these challenges by proposing novel improvements to an existing method for radar odometry estimation, designed to enhance accuracy and reliability in diverse scenarios. Our pipeline consists of filtering, motion compensation, oriented surface points computation, smoothing, one-to-many radar scan registration, and pose refinement. The developed method enforces local understanding of the scene, by adding additional information through smoothing techniques, and alignment of consecutive scans, as a refinement posterior to the one-to-many registration. We present an in-depth investigation of the contribution of each improvement to the localization accuracy, and we benchmark our system on the sequences of the main datasets for radar understanding, i.e., the Oxford Radar RobotCar, MulRan, and Boreas datasets. The proposed pipeline is able to achieve superior results, on all scenarios considered and under harsh environmental constraints.
The adaptive processing of structured data is a long-standing research topic in machine learning that investigates how to automatically learn a mapping from a structured input to outputs of various nature. Recently, there has been an increasing interest in the adaptive processing of graphs, which led to the development of different neural network-based methodologies. In this thesis, we take a different route and develop a Bayesian Deep Learning framework for graph learning. The dissertation begins with a review of the principles over which most of the methods in the field are built, followed by a study on graph classification reproducibility issues. We then proceed to bridge the basic ideas of deep learning for graphs with the Bayesian world, by building our deep architectures in an incremental fashion. This framework allows us to consider graphs with discrete and continuous edge features, producing unsupervised embeddings rich enough to reach the state of the art on several classification tasks. Our approach is also amenable to a Bayesian nonparametric extension that automatizes the choice of almost all model's hyper-parameters. Two real-world applications demonstrate the efficacy of deep learning for graphs. The first concerns the prediction of information-theoretic quantities for molecular simulations with supervised neural models. After that, we exploit our Bayesian models to solve a malware-classification task while being robust to intra-procedural code obfuscation techniques. We conclude the dissertation with an attempt to blend the best of the neural and Bayesian worlds together. The resulting hybrid model is able to predict multimodal distributions conditioned on input graphs, with the consequent ability to model stochasticity and uncertainty better than most works. Overall, we aim to provide a Bayesian perspective into the articulated research field of deep learning for graphs.
This manuscript portrays optimization as a process. In many practical applications the environment is so complex that it is infeasible to lay out a comprehensive theoretical model and use classical algorithmic theory and mathematical optimization. It is necessary as well as beneficial to take a robust approach, by applying an optimization method that learns as one goes along, learning from experience as more aspects of the problem are observed. This view of optimization as a process has become prominent in varied fields and has led to some spectacular success in modeling and systems that are now part of our daily lives.
Transformer is a new kind of neural architecture which encodes the input data as powerful features via the attention mechanism. Basically, the visual transformers first divide the input images into several local patches and then calculate both representations and their relationship. Since natural images are of high complexity with abundant detail and color information, the granularity of the patch dividing is not fine enough for excavating features of objects in different scales and locations. In this paper, we point out that the attention inside these local patches are also essential for building visual transformers with high performance and we explore a new architecture, namely, Transformer iN Transformer (TNT). Specifically, we regard the local patches (e.g., 16$\times$16) as "visual sentences" and present to further divide them into smaller patches (e.g., 4$\times$4) as "visual words". The attention of each word will be calculated with other words in the given visual sentence with negligible computational costs. Features of both words and sentences will be aggregated to enhance the representation ability. Experiments on several benchmarks demonstrate the effectiveness of the proposed TNT architecture, e.g., we achieve an 81.5% top-1 accuracy on the ImageNet, which is about 1.7% higher than that of the state-of-the-art visual transformer with similar computational cost. The PyTorch code is available at //github.com/huawei-noah/CV-Backbones, and the MindSpore code is available at //gitee.com/mindspore/models/tree/master/research/cv/TNT.