This paper presents the utilization of advanced methodologies in aerial manipulation to address meaningful industrial applications and develop versatile ultrasonic Non-Destructive Testing (NDT) technologies with aerial robots. The primary objectives of this work are to enable multi-point measurements through sliding without re-approaching the work surface, and facilitate the representation of material thickness with B and C scans via dynamic scanning in arbitrary directions (i.e. omnidirections). To accomplish these objectives, a payload that can slide in omnidirections (here we call the omni-sliding payload) is designed for an over-actuated aerial vehicle, ensuring truly omnidirectional sliding mobility while exerting consistent forces in contact with a flat work surface. The omni-sliding payload is equipped with an omniwheel-based active end-effector and an Electro Magnetic Acoustic Transducer (EMAT). Furthermore, to ensure successful development of the designed payload and integration with the aerial vehicle, a comprehensive studying on contact conditions and system dynamics during active sliding is presented, and the derived system constraints are later used as guidelines for the hardware development and control setting. The proposed methods are validated through experiments, encompassing both the wall-sliding task and dynamic scanning for Ultrasonic Testing (UT), employing the aerial platform - Voliro T.
Neural graphics primitives are faster and achieve higher quality when their neural networks are augmented by spatial data structures that hold trainable features arranged in a grid. However, existing feature grids either come with a large memory footprint (dense or factorized grids, trees, and hash tables) or slow performance (index learning and vector quantization). In this paper, we show that a hash table with learned probes has neither disadvantage, resulting in a favorable combination of size and speed. Inference is faster than unprobed hash tables at equal quality while training is only 1.2-2.6x slower, significantly outperforming prior index learning approaches. We arrive at this formulation by casting all feature grids into a common framework: they each correspond to a lookup function that indexes into a table of feature vectors. In this framework, the lookup functions of existing data structures can be combined by simple arithmetic combinations of their indices, resulting in Pareto optimal compression and speed.
This paper presents a novel approach to network pruning, targeting block pruning in deep neural networks for edge computing environments. Our method diverges from traditional techniques that utilize proxy metrics, instead employing a direct block removal strategy to assess the impact on classification accuracy. This hands-on approach allows for an accurate evaluation of each block's importance. We conducted extensive experiments on CIFAR-10, CIFAR-100, and ImageNet datasets using ResNet architectures. Our results demonstrate the efficacy of our method, particularly on large-scale datasets like ImageNet with ResNet50, where it excelled in reducing model size while retaining high accuracy, even when pruning a significant portion of the network. The findings underscore our method's capability in maintaining an optimal balance between model size and performance, especially in resource-constrained edge computing scenarios.
We propose Stable Yet Memory Bounded Open-Loop (SYMBOL) planning, a general memory bounded approach to partially observable open-loop planning. SYMBOL maintains an adaptive stack of Thompson Sampling bandits, whose size is bounded by the planning horizon and can be automatically adapted according to the underlying domain without any prior domain knowledge beyond a generative model. We empirically test SYMBOL in four large POMDP benchmark problems to demonstrate its effectiveness and robustness w.r.t. the choice of hyperparameters and evaluate its adaptive memory consumption. We also compare its performance with other open-loop planning algorithms and POMCP.
The paper describes a system that uses large language model (LLM) technology to support the automatic learning of new entries in an intelligent agent's semantic lexicon. The process is bootstrapped by an existing non-toy lexicon and a natural language generator that converts formal, ontologically-grounded representations of meaning into natural language sentences. The learning method involves a sequence of LLM requests and includes an automatic quality control step. To date, this learning method has been applied to learning multiword expressions whose meanings are equivalent to those of transitive verbs in the agent's lexicon. The experiment demonstrates the benefits of a hybrid learning architecture that integrates knowledge-based methods and resources with both traditional data analytics and LLMs.
This paper introduces a learnable Deformable Hypothesis Sampler (DeformSampler) to address the challenging issue of noisy depth estimation for accurate PatchMatch Multi-View Stereo (MVS). We observe that the heuristic depth hypothesis sampling modes employed by PatchMatch MVS solvers are insensitive to (i) the piece-wise smooth distribution of depths across the object surface, and (ii) the implicit multi-modal distribution of depth prediction probabilities along the ray direction on the surface points. Accordingly, we develop DeformSampler to learn distribution-sensitive sample spaces to (i) propagate depths consistent with the scene's geometry across the object surface, and (ii) fit a Laplace Mixture model that approaches the point-wise probabilities distribution of the actual depths along the ray direction. We integrate DeformSampler into a learnable PatchMatch MVS system to enhance depth estimation in challenging areas, such as piece-wise discontinuous surface boundaries and weakly-textured regions. Experimental results on DTU and Tanks \& Temples datasets demonstrate its superior performance and generalization capabilities compared to state-of-the-art competitors. Code is available at //github.com/Geo-Tell/DS-PMNet.
This paper presents a Gaussian Process (GP) framework, a non-parametric technique widely acknowledged for regression and classification tasks, to address inverse problems in mean field games (MFGs). By leveraging GPs, we aim to recover agents' strategic actions and the environment's configurations from partial and noisy observations of the population of agents and the setup of the environment. Our method is a probabilistic tool to infer the behaviors of agents in MFGs from data in scenarios where the comprehensive dataset is either inaccessible or contaminated by noises.
This paper deals with developing techniques for the reconstruction of high-dimensional datasets given each bivariate projection, as would be found in a matrix scatterplot. A graph-based solution is introduced, involving clique-finding, providing a set of possible rows that might make up the original dataset. Complications are discussed, including cases where phantom cliques are found, as well as cases where an exact solution is impossible. Additional methods are shown, with some dealing with fully deducing rows and others dealing with having to creatively produce methods that find some possibilities to be more likely than others. Results show that these methods are highly successful in recreating a significant portion of the original dataset in many cases - for randomly generated and real-world datasets - with the factors leading to a greater rate of failure being lower dimension, higher n, and lower interval.
This tutorial aims to establish connections between polynomial modular multiplication over a ring to circular convolution and discrete Fourier transform (DFT). The main goal is to extend the well-known theory of DFT in signal processing (SP) to other applications involving polynomials in a ring such as homomorphic encryption (HE). HE allows any third party to operate on the encrypted data without decrypting it in advance. Since most HE schemes are constructed from the ring-learning with errors (R-LWE) problem, efficient polynomial modular multiplication implementation becomes critical. Any improvement in the execution of these building blocks would have significant consequences for the global performance of HE. This lecture note describes three approaches to implementing long polynomial modular multiplication using the number theoretic transform (NTT): zero-padded convolution, without zero-padding, also referred to as negative wrapped convolution (NWC), and low-complexity NWC (LC-NWC).
In this paper, we propose a cell-free scheme for unmanned aerial vehicle (UAV) base stations (BSs) to manage the severe intercell interference between terrestrial users and UAV-BSs of neighboring cells. Since the cell-free scheme requires enormous bandwidth for backhauling, we propose to use the sub-terahertz (sub-THz) band for the backhaul links between UAV-BSs and central processing unit (CPU). Also, because the sub-THz band requires a reliable line-of-sight link, we propose to use a high altitude platform station (HAPS) as a CPU. At the first time-slot of the proposed scheme, users send their messages to UAVs at the sub-6 GHz band. The UAVs then apply match-filtering and power allocation. At the second time-slot, at each UAV, orthogonal resource blocks are allocated for each user at the sub-THz band, and the signals are sent to the HAPS after analog beamforming. In the HAPS receiver, after analog beamforming, the message of each user is decoded. We formulate an optimization problem that maximizes the minimum signal-to-interference-plus-noise ratio of users by finding the optimum allocated power as well as the optimum locations of UAVs. Simulation results demonstrate the superiority of the proposed scheme compared with aerial cellular and terrestrial cell-free baseline schemes.
Graph classification aims to perform accurate information extraction and classification over graphstructured data. In the past few years, Graph Neural Networks (GNNs) have achieved satisfactory performance on graph classification tasks. However, most GNNs based methods focus on designing graph convolutional operations and graph pooling operations, overlooking that collecting or labeling graph-structured data is more difficult than grid-based data. We utilize meta-learning for fewshot graph classification to alleviate the scarce of labeled graph samples when training new tasks.More specifically, to boost the learning of graph classification tasks, we leverage GNNs as graph embedding backbone and meta-learning as training paradigm to capture task-specific knowledge rapidly in graph classification tasks and transfer them to new tasks. To enhance the robustness of meta-learner, we designed a novel step controller driven by Reinforcement Learning. The experiments demonstrate that our framework works well compared to baselines.