In this paper, we introduce a novel open source toolbox for design optimization in Soft Robotics. We consider that design optimization is an important trend in Soft Robotics that is changing the way in which designs will be shared and adopted. We evaluate this toolbox on the example of a cable-driven, sensorized soft finger. For devices like these, that feature both actuation and sensing, the need for multi-objective optimization capabilities naturally arises, because at the very least, a trade-off between these two aspects has to be found. Thus, multi-objective optimization capability is one of the central features of the proposed toolbox. We evaluate the optimization of the soft finger and show that extreme points of the optimization trade-off between sensing and actuation are indeed far apart on actually fabricated devices for the established metrics. Furthermore, we provide an in depth analysis of the sim-to-real behavior of the example, taking into account factors such as the mesh density in the simulation, mechanical parameters and fabrication tolerances.
We present a comprehensive case study comparing the performance of D-Waves' quantum-classical hybrid framework, Fujitsu's quantum-inspired digital annealer, and Gurobi's state-of-the-art classical solver in solving a transport robot scheduling problem. This problem originates from an industrially relevant real-world scenario. We provide three different models for our problem following different design philosophies. In our benchmark, we focus on the solution quality and end-to-end runtime of different model and solver combinations. We find promising results for the digital annealer and some opportunities for the hybrid quantum annealer in direct comparison with Gurobi. Our study provides insights into the workflow for solving an application-oriented optimization problem with different strategies, and can be useful for evaluating the strengths and weaknesses of different approaches.
In this paper, we aim at maximizing the weighted sum-rate (WSR) of rate splitting multiple access (RSMA) in multi-user multi-antenna transmission networks through the joint optimization of rate allocation and beamforming. Unlike conventional methods like weighted minimum mean square error (WMMSE) and standard fractional programming (FP), which tackle the non-convex WSR problem iteratively using disciplined convex subproblems and optimization toolboxes, our work pioneers a novel toolbox-free approach. For the first time, we identify the optimal beamforming structure and common rate allocation for WSR maximization in RSMA by leveraging FP and Lagrangian duality. Then we propose an algorithm based on FP and fixed point iteration to optimize the beamforming and common rate allocation without the need for optimization toolboxes. Our numerical results demonstrate that the proposed algorithm attains the same performance as standard FP and classical WMMSE methods while significantly reducing computational time.
In this paper, we study an intelligent reflecting surface (IRS)-aided communication system with single-antenna transmitter and receiver, under imperfect channel state information (CSI). More specifically, we deal with the robust selection of binary (on/off) states of the IRS elements in order to maximize the worst-case energy efficiency (EE), given a bounded CSI uncertainty, while satisfying a minimum signal-to-noise ratio (SNR). In addition, we consider not only continuous but also discrete IRS phase shifts. First, we derive closed-form expressions of the worst-case SNRs, and then formulate the robust (discrete) optimization problems for each case. In the case of continuous phase shifts, we design a dynamic programming (DP) algorithm that is theoretically guaranteed to achieve the global maximum with polynomial complexity $O(L\,{\log L})$, where $L$ is the number of IRS elements. In the case of discrete phase shifts, we develop a convex-relaxation-based method (CRBM) to obtain a feasible (sub-optimal) solution in polynomial time $O(L^{3.5})$, with a posteriori performance guarantee. Furthermore, numerical simulations provide useful insights and confirm the theoretical results. In particular, the proposed algorithms are several orders of magnitude faster than the exhaustive search when $L$ is large, thus being highly scalable and suitable for practical applications. Moreover, both algorithms outperform a baseline scheme, namely, the activation of all IRS elements.
Quantization and pruning are two effective Deep Neural Networks model compression methods. In this paper, we propose Automatic Prune Binarization (APB), a novel compression technique combining quantization with pruning. APB enhances the representational capability of binary networks using a few full-precision weights. Our technique jointly maximizes the accuracy of the network while minimizing its memory impact by deciding whether each weight should be binarized or kept in full precision. We show how to efficiently perform a forward pass through layers compressed using APB by decomposing it into a binary and a sparse-dense matrix multiplication. Moreover, we design two novel efficient algorithms for extremely quantized matrix multiplication on CPU, leveraging highly efficient bitwise operations. The proposed algorithms are 6.9x and 1.5x faster than available state-of-the-art solutions. We extensively evaluate APB on two widely adopted model compression datasets, namely CIFAR10 and ImageNet. APB delivers better accuracy/memory trade-off compared to state-of-the-art methods based on i) quantization, ii) pruning, and iii) combination of pruning and quantization. APB outperforms quantization in the accuracy/efficiency trade-off, being up to 2x faster than the 2-bit quantized model with no loss in accuracy.
In this work, we develop a novel efficient quadrature and sparse grid based polynomial interpolation method to price American options with multiple underlying assets. The approach is based on first formulating the pricing of American options using dynamic programming, and then employing static sparse grids to interpolate the continuation value function at each time step. To achieve high efficiency, we first transform the domain from $\mathbb{R}^d$ to $(-1,1)^d$ via a scaled tanh map, and then remove the boundary singularity of the resulting multivariate function over $(-1,1)^d$ by a bubble function and simultaneously, to significantly reduce the number of interpolation points. We rigorously establish that with a proper choice of the bubble function, the resulting function has bounded mixed derivatives up to a certain order, which provides theoretical underpinnings for the use of sparse grids. Numerical experiments for American arithmetic and geometric basket put options with the number of underlying assets up to 16 are presented to validate the effectiveness of the approach.
Visual Inertial Odometry (VIO) is an essential component of modern Augmented Reality (AR) applications. However, VIO only tracks the relative pose of the device, leading to drift over time. Absolute pose estimation methods infer the device's absolute pose, but their accuracy depends on the input quality. This paper introduces VIO-APR, a new framework for markerless mobile AR that combines an absolute pose regressor (APR) with a local VIO tracking system. VIO-APR uses VIO to assess the reliability of the APR and the APR to identify and compensate for VIO drift. This feedback loop results in more accurate positioning and more stable AR experiences. To evaluate VIO-APR, we created a dataset that combines camera images with ARKit's VIO system output for six indoor and outdoor scenes of various scales. Over this dataset, VIO-APR improves the median accuracy of popular APR by up to 36\% in position and 29\% in orientation, increases the percentage of frames in the high ($0.25 m, 2^{\circ}$) accuracy level by up to 112\% and reduces the percentage of frames predicted below the low ($5 m, 10^\circ$) accuracy greatly. We implement VIO-APR into a mobile AR application using Unity to demonstrate its capabilities. VIO-APR results in noticeably more accurate localization and a more stable overall experience.
This article illustrates a novel Quantum Secure Aggregation (QSA) scheme that is designed to provide highly secure and efficient aggregation of local model parameters for federated learning. The scheme is secure in protecting private model parameters from being disclosed to semi-honest attackers by utilizing quantum bits i.e. qubits to represent model parameters. The proposed security mechanism ensures that any attempts to eavesdrop private model parameters can be immediately detected and stopped. The scheme is also efficient in terms of the low computational complexity of transmitting and aggregating model parameters through entangled qubits. Benefits of the proposed QSA scheme are showcased in a horizontal federated learning setting in which both a centralized and decentralized architectures are taken into account. It was empirically demonstrated that the proposed QSA can be readily applied to aggregate different types of local models including logistic regression (LR), convolutional neural networks (CNN) as well as quantum neural network (QNN), indicating the versatility of the QSA scheme. Performances of global models are improved to various extents with respect to local models obtained by individual participants, while no private model parameters are disclosed to semi-honest adversaries.
In this paper, we propose a novel Feature Decomposition and Reconstruction Learning (FDRL) method for effective facial expression recognition. We view the expression information as the combination of the shared information (expression similarities) across different expressions and the unique information (expression-specific variations) for each expression. More specifically, FDRL mainly consists of two crucial networks: a Feature Decomposition Network (FDN) and a Feature Reconstruction Network (FRN). In particular, FDN first decomposes the basic features extracted from a backbone network into a set of facial action-aware latent features to model expression similarities. Then, FRN captures the intra-feature and inter-feature relationships for latent features to characterize expression-specific variations, and reconstructs the expression feature. To this end, two modules including an intra-feature relation modeling module and an inter-feature relation modeling module are developed in FRN. Experimental results on both the in-the-lab databases (including CK+, MMI, and Oulu-CASIA) and the in-the-wild databases (including RAF-DB and SFEW) show that the proposed FDRL method consistently achieves higher recognition accuracy than several state-of-the-art methods. This clearly highlights the benefit of feature decomposition and reconstruction for classifying expressions.
In this paper, we proposed to apply meta learning approach for low-resource automatic speech recognition (ASR). We formulated ASR for different languages as different tasks, and meta-learned the initialization parameters from many pretraining languages to achieve fast adaptation on unseen target language, via recently proposed model-agnostic meta learning algorithm (MAML). We evaluated the proposed approach using six languages as pretraining tasks and four languages as target tasks. Preliminary results showed that the proposed method, MetaASR, significantly outperforms the state-of-the-art multitask pretraining approach on all target languages with different combinations of pretraining languages. In addition, since MAML's model-agnostic property, this paper also opens new research direction of applying meta learning to more speech-related applications.
In this paper, we propose a deep reinforcement learning framework called GCOMB to learn algorithms that can solve combinatorial problems over large graphs. GCOMB mimics the greedy algorithm in the original problem and incrementally constructs a solution. The proposed framework utilizes Graph Convolutional Network (GCN) to generate node embeddings that predicts the potential nodes in the solution set from the entire node set. These embeddings enable an efficient training process to learn the greedy policy via Q-learning. Through extensive evaluation on several real and synthetic datasets containing up to a million nodes, we establish that GCOMB is up to 41% better than the state of the art, up to seven times faster than the greedy algorithm, robust and scalable to large dynamic networks.