This work presents several techniques to improve the creation of magnetic field maps with a UAV, which can quickly and reliably gather magnetic field observations at multiple altitudes in a workspace. Unfortunately, the electronics on the UAV can introduce their own magnetic fields, distorting the resulting magnetic field map. We present methods for reducing and working with UAV-induced noise to better enable magnetic fields as a sensing modality for indoor navigation. First, we show that certain gains in our flight controller create high-frequency motor commands that introduce large noise into the measured magnetic field. Next, we implement a common noise-reduction method: distancing the magnetometer from other components on our UAV. Finally, we introduce what we call a compromise Gaussian process regression (GPR) map, which can be trained on multiple flight tests to learn flight-by-flight variations between UAV observation runs. We investigate the spatial density of observations used to train a GPR map, and then use the compromise map to define a consistency test that indicates whether or not the magnetometer data and the corresponding GPR map are appropriate to use for state estimation. The interventions introduced in this work facilitate indoor position localization of a UAV, whose estimates we found to be quite sensitive to UAV-generated noise.
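To make the mapping idea concrete, the following is a minimal sketch (not the authors' implementation; the kernel, hyperparameters, function names, and the consistency threshold are all hypothetical) of GP regression over 2-D positions with an RBF kernel, followed by a z-score-style consistency check of new observations against the map's predictive band:

```python
import numpy as np

def rbf_kernel(A, B, length_scale=1.0, variance=1.0):
    """Squared-exponential kernel between two sets of 2-D positions."""
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return variance * np.exp(-0.5 * d2 / length_scale**2)

def gpr_predict(X_train, y_train, X_test, noise=0.1):
    """Posterior mean and pointwise variance of a GP fit to field observations."""
    K = rbf_kernel(X_train, X_train) + noise**2 * np.eye(len(X_train))
    Ks = rbf_kernel(X_test, X_train)
    Kss = rbf_kernel(X_test, X_test)
    alpha = np.linalg.solve(K, y_train)
    mean = Ks @ alpha
    cov = Kss - Ks @ np.linalg.solve(K, Ks.T)
    return mean, np.diag(cov)

# Toy data: field-magnitude observations at 2-D waypoints from one flight
rng = np.random.default_rng(0)
X = rng.uniform(0, 10, size=(40, 2))
y = np.sin(X[:, 0]) + 0.05 * rng.standard_normal(40)
mu, var = gpr_predict(X, y, X[:5])

# Simple consistency test: flag observations far outside the predictive band
z = np.abs(y[:5] - mu) / np.sqrt(var + 0.1**2)
consistent = np.all(z < 3.0)
```

The same predictive variance that drives the consistency check also reflects observation density: regions visited on more flights get tighter bands.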
Electricity grids have become an essential part of daily life, even if they often go unnoticed. We usually become aware of this dependence only when the electricity grid is no longer available. However, significant changes, such as the transition to renewable energy (photovoltaics, wind turbines, etc.) and a growing number of energy consumers with complex load profiles (electric vehicles, home battery systems, etc.), pose new challenges for the electricity grid. To address these challenges, we propose two first-of-their-kind datasets based on measurements in a broadband powerline communications (PLC) infrastructure. Both datasets, FiN-1 and FiN-2, were collected during real practical use in a part of the German low-voltage grid that supplies around 4.4 million people, and comprise more than 13 billion data points collected by more than 5100 sensors. In addition, we present use cases in asset management, grid state visualization, forecasting, predictive maintenance, and novelty detection to highlight the benefits of this type of data. For these applications, we particularly highlight the use of novel machine learning architectures to extract rich information from real-world data that cannot be captured using traditional approaches. By publishing the first large-scale real-world datasets of this kind, we aim to shed light on the previously largely unrecognized potential of PLC data and to stimulate machine-learning-based research in low-voltage distribution networks through a variety of use cases.
The Internet has turned the entire world into a small village, making it possible to share millions of images and videos. However, sending and receiving such huge amounts of data remains a major challenge, which calls for algorithms that reduce the number of bits needed to represent an image. Image compression is therefore an important application for transferring large files and images, and it requires appropriate and efficient transforms to achieve the best results. In this work, we propose a new algorithm based on the discrete Hermite wavelet transform (DHWT) and demonstrate its efficiency and the quality of the compressed color images. The method analyzes a color image and decomposes it into approximation and detail coefficients after the wavelets are added to MATLAB. Using multi-resolution analysis (MRA), the appropriate filter is derived, and its mathematical properties are validated by testing the new filter and performing its operations. After decomposing the rows, reconstruction proceeds by applying the inverse filter to the columns of the matrix. The quality of the reconstructed image is assessed with standard measures: the peak signal-to-noise ratio (PSNR), compression ratio (CR), bits per pixel (BPP), and mean squared error (MSE).
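The four quality measures named above have standard definitions that are independent of the wavelet used. As a sketch (the toy image and the assumed 2-bit-per-pixel budget are hypothetical, not results from the paper), they can be computed as:

```python
import numpy as np

def mse(original, reconstructed):
    """Mean squared error between two images."""
    return np.mean((original.astype(float) - reconstructed.astype(float)) ** 2)

def psnr(original, reconstructed, peak=255.0):
    """Peak signal-to-noise ratio in dB for a given peak intensity."""
    e = mse(original, reconstructed)
    return np.inf if e == 0 else 10.0 * np.log10(peak**2 / e)

def compression_ratio(original_bits, compressed_bits):
    """CR: size of the raw image over the size of the compressed stream."""
    return original_bits / compressed_bits

def bits_per_pixel(compressed_bits, n_pixels):
    """BPP: compressed size spread over the pixel count."""
    return compressed_bits / n_pixels

# Toy example: an 8-bit image and a slightly perturbed "reconstruction"
rng = np.random.default_rng(1)
img = rng.integers(0, 256, size=(64, 64)).astype(np.uint8)
rec = np.clip(img.astype(int) + rng.integers(-2, 3, size=img.shape), 0, 255)

quality_db = psnr(img, rec)                    # higher is better
cr = compression_ratio(64 * 64 * 8, 64 * 64 * 2)  # e.g. a 2 bpp codec gives CR = 4
bpp = bits_per_pixel(64 * 64 * 2, 64 * 64)
```

Note that CR and BPP describe the same trade-off from two angles: for an 8-bit grayscale source, BPP = 8 / CR.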
Algorithm aversion occurs when humans are reluctant to use algorithms despite their superior performance. Studies show that giving users outcome control by providing agency over how models' predictions are incorporated into decision-making mitigates algorithm aversion. We study whether algorithm aversion is mitigated by process control, wherein users can decide what input factors and algorithms to use in model training. We conduct a replication study of outcome control, and test novel process control study conditions on Amazon Mechanical Turk (MTurk) and Prolific. Our results partly confirm prior findings on the mitigating effects of outcome control, while also foregrounding reproducibility challenges. We find that process control in the form of choosing the training algorithm mitigates algorithm aversion, but changing inputs does not. Furthermore, giving users both outcome and process control does not reduce algorithm aversion more than outcome or process control alone. This study contributes to design considerations around mitigating algorithm aversion.
When data is collected in an adaptive manner, even simple methods like ordinary least squares can exhibit non-normal asymptotic behavior. As an undesirable consequence, hypothesis tests and confidence intervals based on asymptotic normality can lead to erroneous results. We propose a family of online debiasing estimators to correct these distributional anomalies in least squares estimation. Our proposed methods take advantage of the covariance structure present in the dataset and provide sharper estimates in directions for which more information has accrued. We establish an asymptotic normality property for our proposed online debiasing estimators under mild conditions on the data collection process and provide asymptotically exact confidence intervals. We additionally prove a minimax lower bound for the adaptive linear regression problem, thereby providing a baseline by which to compare estimators. We identify various conditions under which our proposed estimators achieve this minimax lower bound. We demonstrate the usefulness of our theory via applications to multi-armed bandits, autoregressive time series estimation, and active learning with exploration.
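The canonical adaptive setting is a bandit, where each chosen covariate depends on past responses. The following hypothetical simulation (names and parameters are illustrative, and the debiasing step itself is not shown) sets up exactly this situation: ε-greedy arm selection followed by a plain OLS fit, the estimator whose error distribution the paper's online debiasing corrects:

```python
import numpy as np

rng = np.random.default_rng(2)
theta = np.array([0.3, 0.5])       # true mean rewards of two arms
T, eps = 2000, 0.1

X = np.zeros((T, 2))               # one-hot design rows: which arm was pulled
y = np.zeros(T)                    # observed rewards
counts = np.zeros(2)
sums = np.zeros(2)

for t in range(T):
    # Epsilon-greedy: the design depends on past observations (adaptivity)
    if t < 2 or rng.random() < eps:
        a = int(rng.integers(2))
    else:
        a = int(np.argmax(sums / np.maximum(counts, 1)))
    X[t, a] = 1.0
    y[t] = theta[a] + rng.standard_normal()
    counts[a] += 1
    sums[a] += y[t]

# Ordinary least squares on the adaptively collected data
# (here equivalent to per-arm sample means)
theta_hat = np.linalg.lstsq(X, y, rcond=None)[0]
```

Because the under-sampled arm's pull count is itself random and reward-dependent, the normalized error of `theta_hat` need not be asymptotically normal, which is what invalidates naive confidence intervals here.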
The concepts of sparsity and regularised estimation have proven useful in many high-dimensional statistical applications. Dynamic factor models (DFMs) provide a parsimonious approach to modelling high-dimensional time series; however, it is often hard to interpret the meaning of the latent factors. This paper formally introduces a class of sparse DFMs in which the loading matrices are constrained to have few non-zero entries, thus increasing the interpretability of the factors. We present a regularised M-estimator for the model parameters and construct an efficient expectation-maximisation algorithm to enable estimation. Synthetic experiments demonstrate consistency in estimating the loading structure, and superior predictive performance where a low-rank factor structure may be appropriate. The utility of the method is further illustrated in an application forecasting electricity consumption across a large set of smart meters.
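The core mechanism behind an ℓ1-regularised loading estimate is elementwise soft-thresholding, the proximal step that would appear inside an M-step of such an algorithm. A minimal sketch (the matrix and threshold are illustrative, not the paper's estimator):

```python
import numpy as np

def soft_threshold(Lambda, tau):
    """Elementwise soft-thresholding: shrinks entries toward zero by tau and
    zeroes out anything smaller, yielding a sparse loading matrix."""
    return np.sign(Lambda) * np.maximum(np.abs(Lambda) - tau, 0.0)

# A dense 3-series x 2-factor loading matrix with some near-zero entries
Lambda = np.array([[0.9, 0.05],
                   [-0.4, 0.02],
                   [0.01, 0.8]])
sparse_Lambda = soft_threshold(Lambda, 0.1)
# Each series now loads on (at most) one factor, aiding interpretability
```

In this toy case each row of `sparse_Lambda` has a single non-zero entry, so each series is attributed to exactly one factor.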
We review quasi-maximum likelihood estimation of factor models for high-dimensional panels of time series. We consider two cases: (1) estimation when no dynamic model for the factors is specified (Bai and Li, 2016); (2) estimation based on the Kalman smoother and the expectation maximization algorithm, which allows the factor dynamics to be modelled explicitly (Doz et al., 2012). Our interest is in approximate factor models, i.e., models in which the idiosyncratic components are allowed to be mildly cross-sectionally, as well as serially, correlated. Although such a setting apparently makes estimation harder, we show that, in fact, factor models do not suffer from the curse of dimensionality but instead enjoy a blessing of dimensionality property. In particular, we show that if the cross-sectional dimension of the data, $N$, grows to infinity, then: (i) identification of the model is still possible, and (ii) the mis-specification error due to the use of an exact factor model log-likelihood vanishes. Moreover, if we also let the sample size, $T$, grow to infinity, we can consistently estimate all parameters of the model and make inference. The same is true for estimation of the latent factors, which can be carried out by weighted least squares, linear projection, or Kalman filtering/smoothing. We also compare the presented approaches with principal component analysis and the classical, fixed-$N$, exact maximum likelihood approach. We conclude with a discussion of the efficiency of the considered estimators.
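The principal component benchmark mentioned above has a very short implementation. As a sketch under a standard normalisation (loadings scaled by $\sqrt{N}$, factors by projection; the toy one-factor panel is illustrative):

```python
import numpy as np

def pca_factors(X, r):
    """Principal-component estimates for an r-factor model on a T x N panel X:
    loadings from the top-r eigenvectors of the sample covariance, factors by
    cross-sectional projection."""
    Xc = X - X.mean(axis=0)
    T, N = Xc.shape
    eigval, eigvec = np.linalg.eigh(Xc.T @ Xc / T)   # ascending eigenvalues
    L = eigvec[:, ::-1][:, :r] * np.sqrt(N)          # top-r loadings, scaled
    F = Xc @ L / N                                   # projected factors
    return F, L

# Toy panel generated from a one-factor model with idiosyncratic noise
rng = np.random.default_rng(3)
T, N = 200, 50
f = rng.standard_normal(T)
lam = rng.standard_normal(N)
X = np.outer(f, lam) + 0.5 * rng.standard_normal((T, N))

F_hat, L_hat = pca_factors(X, r=1)
corr = abs(np.corrcoef(F_hat[:, 0], f)[0, 1])  # recovery up to sign and scale
```

The "blessing of dimensionality" shows up here directly: as $N$ grows, the projection averages away the idiosyncratic noise and `F_hat` tracks the true factor ever more closely.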
The safe deployment of autonomous vehicles relies on their ability to react effectively to environmental changes. This can require maneuvering on varying surfaces, which is still a difficult problem, especially on slippery terrain. To address this issue, we propose a new approach that learns a surface-aware dynamics model by conditioning it on a latent variable vector storing surface information about the current location. A latent mapper is trained to update these latent variables during inference from multiple modalities on every traversal of the corresponding locations, and stores them in a map. By training everything end-to-end with the loss of the dynamics model, we force the latent mapper to learn an update rule for the latent map that is useful for the subsequent dynamics model. We implement and evaluate our approach on a real miniature electric car. The results show that the latent map is updated to allow more accurate predictions by the dynamics model compared to a model without this information. We further show that using this model improves driving performance on varying and challenging surfaces.
Explicit exploration in the action space was assumed to be indispensable for online policy gradient methods to avoid a drastic degradation in sample complexity when solving general reinforcement learning problems over finite state and action spaces. In this paper, we establish for the first time an $\tilde{\mathcal{O}}(1/\epsilon^2)$ sample complexity for online policy gradient methods without incorporating any exploration strategies. The essential development consists of two new on-policy evaluation operators and a novel analysis of the stochastic policy mirror descent (SPMD) method. SPMD with the first evaluation operator, called value-based estimation, is tailored to the Kullback-Leibler divergence. Provided the Markov chains on the state space of generated policies are uniformly mixing with non-diminishing minimal visitation measure, an $\tilde{\mathcal{O}}(1/\epsilon^2)$ sample complexity is obtained with a linear dependence on the size of the action space. SPMD with the second evaluation operator, namely truncated on-policy Monte Carlo (TOMC), attains an $\tilde{\mathcal{O}}(\mathcal{H}_{\mathcal{D}}/\epsilon^2)$ sample complexity, where $\mathcal{H}_{\mathcal{D}}$ depends mildly on the effective horizon and the size of the action space with a properly chosen Bregman divergence (e.g., Tsallis divergence). SPMD with TOMC also exhibits stronger convergence properties in that it controls the optimality gap with high probability rather than in expectation. In contrast to explicit exploration, these new policy gradient methods can prevent repeatedly committing to potentially high-risk actions when searching for optimal policies.
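With the KL divergence as the Bregman distance, a policy mirror descent step has a closed form: a multiplicative-weights update of the action distribution. The sketch below shows one such update at a single state (the cost vector and step size are hypothetical; in the paper's setting the action values would come from one of the on-policy evaluation operators):

```python
import numpy as np

def spmd_step(pi, Q, eta):
    """One policy mirror descent update with KL Bregman divergence:
    the prox step solves argmin_p <eta*Q, p> + KL(p || pi), whose closed form
    is pi' proportional to pi * exp(-eta * Q) for action costs Q."""
    logits = np.log(pi) - eta * Q
    logits -= logits.max()            # subtract max for numerical stability
    w = np.exp(logits)
    return w / w.sum()

pi = np.full(4, 0.25)                 # uniform policy over 4 actions in one state
Q = np.array([1.0, 0.2, 0.8, 0.5])    # estimated action costs at this state
pi_new = spmd_step(pi, Q, eta=1.0)    # mass shifts toward low-cost actions
```

Iterating this step concentrates the policy on low-cost actions at a rate controlled by `eta`, without any explicit exploration bonus.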
Existing research on merging behavior generally prioritizes the application of various algorithms but often overlooks fine-grained trajectory processing and analysis. This leads to the neglect of surrounding-vehicle matching, opaque indicator definitions, and a reproducibility crisis. To address these gaps, this paper presents a reproducible approach to merging behavior analysis. Specifically, we outline the causes of subjectivity and irreproducibility in existing studies. We then employ the Lanelet2 high-definition (HD) map to construct a reproducible framework that minimizes subjectivity, defines standardized indicators, identifies alongside vehicles, and divides scenarios. A comparative macroscopic and microscopic analysis is subsequently conducted. More importantly, this paper adheres to the Reproducible Research concept, providing all source code and reproduction instructions. Our results demonstrate that although scenarios with alongside vehicles occur in less than 6% of cases, their characteristics differ significantly from other scenarios and are often accompanied by high risk. This paper refines the understanding of merging behavior and raises awareness of the importance of reproducible studies.
We introduce a new regression framework designed to deal with large-scale, complex data that lie near a low-dimensional manifold. Our approach first constructs a graph representation, referred to as the skeleton, to capture the underlying geometric structure. We then define metrics on the skeleton graph and apply nonparametric regression techniques, along with graph-based feature transformations, to estimate the regression function. Beyond the included nonparametric methods, we also discuss the limitations of some nonparametric regressors on general metric spaces such as the skeleton graph. The proposed framework bypasses the curse of dimensionality and offers additional advantages: it can handle unions of multiple manifolds and is robust to additive noise and noisy observations. We provide statistical guarantees for the proposed method and demonstrate its effectiveness through simulations and real data examples.
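As a rough illustration of regression with graph-based metrics (a simplified stand-in, not the paper's skeleton construction; the k-NN graph, bandwidth, and function names are hypothetical), one can build a neighborhood graph over the samples, use geodesic graph distances in place of ambient Euclidean ones, and apply a Nadaraya-Watson estimator:

```python
import numpy as np
from scipy.sparse.csgraph import shortest_path

def knn_graph(X, k=5):
    """Dense adjacency matrix of a symmetric k-NN graph, edges weighted by
    Euclidean distance (zeros denote non-edges)."""
    D = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)
    A = np.zeros_like(D)
    for i in range(len(X)):
        nbrs = np.argsort(D[i])[1:k + 1]   # skip self at position 0
        A[i, nbrs] = D[i, nbrs]
    return np.maximum(A, A.T)

def graph_kernel_regression(X, y, idx, h=1.0):
    """Leave-one-out Nadaraya-Watson estimate at sample idx, with a Gaussian
    kernel applied to geodesic (shortest-path) distances on the k-NN graph."""
    G = shortest_path(knn_graph(X), directed=False)
    w = np.exp(-0.5 * (G[idx] / h) ** 2)
    w[idx] = 0.0
    return (w @ y) / w.sum()

# Noisy function on a 1-D manifold (a circle) embedded in R^2
rng = np.random.default_rng(4)
t = np.sort(rng.uniform(0, 2 * np.pi, 200))
X = np.c_[np.cos(t), np.sin(t)]
y = np.sin(2 * t) + 0.1 * rng.standard_normal(200)

y_hat = graph_kernel_regression(X, y, idx=0, h=0.3)
```

Using geodesic rather than ambient distances keeps opposite points of the circle from being treated as neighbors, which is the intuition behind working on the skeleton graph rather than in the ambient space.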