Support Vector Machine (SVM) algorithm requires a high computational cost (both in memory and time) to solve a complex quadratic programming (QP) optimization problem during the training process. Consequently, SVM necessitates high computing hardware capabilities. The central processing unit (CPU) clock frequency cannot be increased due to physical limitations in the miniaturization process. However, the potential of parallel multi-architecture, available in both multi-core CPUs and highly scalable GPUs, emerges as a promising solution to enhance algorithm performance. Therefore, there is an opportunity to reduce the high computational time required by SVM for solving the QP optimization problem. This paper presents a comparative study that implements the SVM algorithm on different parallel architecture frameworks. The experimental results show that SVM MPI-CUDA implementation achieves a speedup over SVM TensorFlow implementation on different datasets. Moreover, SVM TensorFlow implementation provides a cross-platform solution that can be migrated to alternative hardware components, which will reduces the development time.
Adaptive finite element methods are a powerful tool to obtain numerical simulation results in a reasonable time. Due to complex chemical and mechanical couplings in lithium-ion batteries, numerical simulations are very helpful to investigate promising new battery active materials such as amorphous silicon featuring a higher energy density than graphite. Based on a thermodynamically consistent continuum model with large deformation and chemo-mechanically coupled approach, we compare three different spatial adaptive refinement strategies: Kelly-, gradient recovery- and residual based error estimation. For the residual based case, the strong formulation of the residual is explicitly derived. With amorphous silicon as example material, we investigate two 3D representative host particle geometries, reduced with symmetry assumptions to a 1D unit interval and a 2D elliptical domain. Our numerical studies show that the Kelly estimator overestimates the error, whereas the gradient recovery estimator leads to lower refinement levels and a good capture of the change of the lithium flux. The residual based error estimator reveals a strong dependency on the cell error part which can be improved by a more suitable choice of constants to be more efficient. In a 2D domain, the concentration has a larger influence on the mesh distribution than the Cauchy stress.
We propose a new algorithm for the problem of recovering data that adheres to multiple, heterogeneous low-dimensional structures from linear observations. Focusing on data matrices that are simultaneously row-sparse and low-rank, we propose and analyze an iteratively reweighted least squares (IRLS) algorithm that is able to leverage both structures. In particular, it optimizes a combination of non-convex surrogates for row-sparsity and rank, a balancing of which is built into the algorithm. We prove locally quadratic convergence of the iterates to a simultaneously structured data matrix in a regime of minimal sample complexity (up to constants and a logarithmic factor), which is known to be impossible for a combination of convex surrogates. In experiments, we show that the IRLS method exhibits favorable empirical convergence, identifying simultaneously row-sparse and low-rank matrices from fewer measurements than state-of-the-art methods. Code is available at //github.com/ckuemmerle/simirls.
Goal representation affects the performance of Hierarchical Reinforcement Learning (HRL) algorithms by decomposing the complex learning problem into easier subtasks. Recent studies show that representations that preserve temporally abstract environment dynamics are successful in solving difficult problems and provide theoretical guarantees for optimality. These methods however cannot scale to tasks where environment dynamics increase in complexity i.e. the temporally abstract transition relations depend on larger number of variables. On the other hand, other efforts have tried to use spatial abstraction to mitigate the previous issues. Their limitations include scalability to high dimensional environments and dependency on prior knowledge. In this paper, we propose a novel three-layer HRL algorithm that introduces, at different levels of the hierarchy, both a spatial and a temporal goal abstraction. We provide a theoretical study of the regret bounds of the learned policies. We evaluate the approach on complex continuous control tasks, demonstrating the effectiveness of spatial and temporal abstractions learned by this approach.
The Blahut-Arimoto (BA) algorithm has played a fundamental role in the numerical computation of rate-distortion (RD) functions. This algorithm possesses a desirable monotonic convergence property by alternatively minimizing its Lagrangian with a fixed multiplier. In this paper, we propose a novel modification of the BA algorithm, wherein the multiplier is updated through a one-dimensional root-finding step using a monotonic univariate function, efficiently implemented by Newton's method in each iteration. Consequently, the modified algorithm directly computes the RD function for a given target distortion, without exploring the entire RD curve as in the original BA algorithm. Moreover, this modification presents a versatile framework, applicable to a wide range of problems, including the computation of distortion-rate (DR) functions. Theoretical analysis shows that the outputs of the modified algorithms still converge to the solutions of the RD and DR functions with rate $O(1/n)$, where $n$ is the number of iterations. Additionally, these algorithms provide $\varepsilon$-approximation solutions with $O\left(\frac{MN\log N}{\varepsilon}(1+\log |\log \varepsilon|)\right)$ arithmetic operations, where $M,N$ are the sizes of source and reproduced alphabets respectively. Numerical experiments demonstrate that the modified algorithms exhibit significant acceleration compared with the original BA algorithms and showcase commendable performance across classical source distributions such as discretized Gaussian, Laplacian and uniform sources.
Delay and Doppler ambiguities of comb reference signal patterns are investigated through time delay and Doppler shift detection using high-resolution sensing algorithms. Necessary conditions of designing comb RS patterns and synthesizing different reference signal patterns in general are derived under the goal of eliminating side peaks and preserving the best achievable ambiguity performance of OFDM signals for target detection.
Traditionally, classical numerical schemes have been employed to solve partial differential equations (PDEs) using computational methods. Recently, neural network-based methods have emerged. Despite these advancements, neural network-based methods, such as physics-informed neural networks (PINNs) and neural operators, exhibit deficiencies in robustness and generalization. To address these issues, numerous studies have integrated classical numerical frameworks with machine learning techniques, incorporating neural networks into parts of traditional numerical methods. In this study, we focus on hyperbolic conservation laws by replacing traditional numerical fluxes with neural operators. To this end, we developed loss functions inspired by established numerical schemes related to conservation laws and approximated numerical fluxes using Fourier neural operators (FNOs). Our experiments demonstrated that our approach combines the strengths of both traditional numerical schemes and FNOs, outperforming standard FNO methods in several respects. For instance, we demonstrate that our method is robust, has resolution invariance, and is feasible as a data-driven method. In particular, our method can make continuous predictions over time and exhibits superior generalization capabilities with out-of-distribution (OOD) samples, which are challenges that existing neural operator methods encounter.
Machine learning algorithms are heavily relied on to understand the vast amounts of data from high-energy particle collisions at the CERN Large Hadron Collider (LHC). The data from such collision events can naturally be represented with graph structures. Therefore, deep geometric methods, such as graph neural networks (GNNs), have been leveraged for various data analysis tasks in high-energy physics. One typical task is jet tagging, where jets are viewed as point clouds with distinct features and edge connections between their constituent particles. The increasing size and complexity of the LHC particle datasets, as well as the computational models used for their analysis, greatly motivate the development of alternative fast and efficient computational paradigms such as quantum computation. In addition, to enhance the validity and robustness of deep networks, one can leverage the fundamental symmetries present in the data through the use of invariant inputs and equivariant layers. In this paper, we perform a fair and comprehensive comparison between classical graph neural networks (GNNs) and equivariant graph neural networks (EGNNs) and their quantum counterparts: quantum graph neural networks (QGNNs) and equivariant quantum graph neural networks (EQGNN). The four architectures were benchmarked on a binary classification task to classify the parton-level particle initiating the jet. Based on their AUC scores, the quantum networks were shown to outperform the classical networks. However, seeing the computational advantage of the quantum networks in practice may have to wait for the further development of quantum technology and its associated APIs.
Recent constructions of quantum low-density parity-check (QLDPC) codes provide optimal scaling of the number of logical qubits and the minimum distance in terms of the code length, thereby opening the door to fault-tolerant quantum systems with minimal resource overhead. However, the hardware path from nearest-neighbor-connection-based topological codes to long-range-interaction-demanding QLDPC codes is a challenging one. Given the practical difficulty in building a monolithic architecture for quantum computers based on optimal QLDPC codes, it is worth considering a distributed implementation of such codes over a network of interconnected quantum processors. In such a setting, all syndrome measurements and logical operations must be performed using high-fidelity shared entangled states between the processing nodes. Since probabilistic many-to-1 distillation schemes for purifying entanglement are inefficient, we investigate quantum error correction based entanglement purification in this work. Specifically, we employ QLDPC codes to distill GHZ states, as the resulting high-fidelity logical GHZ states can interact directly with the code used to perform distributed quantum computing (DQC), e.g. for fault-tolerant Steane syndrome extraction. This protocol is applicable beyond DQC since entanglement purification is a quintessential task of any quantum network. We use the min-sum algorithm (MSA) based iterative decoder for distilling $3$-qubit GHZ states using a rate $0.118$ family of lifted product QLDPC codes and obtain an input threshold of $\approx 0.7974$ under i.i.d. single-qubit depolarizing noise. This represents the best threshold for a yield of $0.118$ for any GHZ purification protocol. Our results apply to larger size GHZ states as well, where we extend our technical result about a measurement property of $3$-qubit GHZ states to construct a scalable GHZ purification protocol.
Dynamic programming (DP) solves a variety of structured combinatorial problems by iteratively breaking them down into smaller subproblems. In spite of their versatility, DP algorithms are usually non-differentiable, which hampers their use as a layer in neural networks trained by backpropagation. To address this issue, we propose to smooth the max operator in the dynamic programming recursion, using a strongly convex regularizer. This allows to relax both the optimal value and solution of the original combinatorial problem, and turns a broad class of DP algorithms into differentiable operators. Theoretically, we provide a new probabilistic perspective on backpropagating through these DP operators, and relate them to inference in graphical models. We derive two particular instantiations of our framework, a smoothed Viterbi algorithm for sequence prediction and a smoothed DTW algorithm for time-series alignment. We showcase these instantiations on two structured prediction tasks and on structured and sparse attention for neural machine translation.
Multi-relation Question Answering is a challenging task, due to the requirement of elaborated analysis on questions and reasoning over multiple fact triples in knowledge base. In this paper, we present a novel model called Interpretable Reasoning Network that employs an interpretable, hop-by-hop reasoning process for question answering. The model dynamically decides which part of an input question should be analyzed at each hop; predicts a relation that corresponds to the current parsed results; utilizes the predicted relation to update the question representation and the state of the reasoning process; and then drives the next-hop reasoning. Experiments show that our model yields state-of-the-art results on two datasets. More interestingly, the model can offer traceable and observable intermediate predictions for reasoning analysis and failure diagnosis.