This paper presents a computational framework for the Principal Geodesic Analysis of merge trees (MT-PGA), a novel adaptation of the celebrated Principal Component Analysis (PCA) framework [87] to the Wasserstein metric space of merge trees [92]. We formulate MT-PGA computation as a constrained optimization problem, aiming at adjusting a basis of orthogonal geodesic axes, while minimizing a fitting energy. We introduce an efficient, iterative algorithm which exploits shared-memory parallelism, as well as an analytic expression of the fitting energy gradient, to ensure fast iterations. Our approach also trivially extends to extremum persistence diagrams. Extensive experiments on public ensembles demonstrate the efficiency of our approach - with MT-PGA computations in the orders of minutes for the largest examples. We show the utility of our contributions by extending to merge trees two typical PCA applications. First, we apply MT-PGA to data reduction and reliably compress merge trees by concisely representing them by their first coordinates in the MT-PGA basis. Second, we present a dimensionality reduction framework exploiting the first two directions of the MT-PGA basis to generate two-dimensional layouts of the ensemble. We augment these layouts with persistence correlation views, enabling global and local visual inspections of the feature variability in the ensemble. In both applications, quantitative experiments assess the relevance of our framework. Finally, we provide a lightweight C++ implementation that can be used to reproduce our results.
Sparse code multiple access (SCMA) is the most concerning scheme among non-orthogonal multiple access (NOMA) technologies for 5G wireless communication new interface. Another efficient technique in 5G aimed to improve spectral efficiency for local communications is device-to-device (D2D) communications. Therefore, we utilize the SCMA cellular network coexisting with D2D communications for the connection demand of the Internet of things (IOT), and improve the system sum rate performance of the hybrid network. We first derive the information-theoretic expression of the capacity for all users and find the capacity bound of cellular users based on the mutual interference between cellular users and D2D users. Then we consider the power optimization problem for the cellular users and D2D users jointly to maximize the system sum rate. To tackle the non-convex optimization problem, we propose a geometric programming (GP) based iterative power allocation algorithm. Simulation results demonstrate that the proposed algorithm converges fast and well improves the sum rate performance.
Different from traditional reflection-only reconfigurable intelligent surfaces (RISs), simultaneously transmitting and reflecting RISs (STAR-RISs) represent a novel technology, which extends the half-space coverage to full-space coverage by simultaneously transmitting and reflecting incident signals. STAR-RISs provide new degrees-of-freedom (DoF) for manipulating signal propagation. Motivated by the above, a novel STAR-RIS assisted non-orthogonal multiple access (NOMA) (STAR-RIS-NOMA) system is proposed in this paper. Our objective is to maximize the achievable sum rate by jointly optimizing the decoding order, power allocation coefficients, active beamforming, and transmission and reflection beamforming. However, the formulated problem is non-convex with intricately coupled variables. To tackle this challenge, a suboptimal two-layer iterative algorithm is proposed. Specifically, in the inner-layer iteration, for a given decoding order, the power allocation coefficients, active beamforming, transmission and reflection beamforming are optimized alternatingly. For the outer-layer iteration, the decoding order of NOMA users in each cluster is updated with the solutions obtained from the inner-layer iteration. Moreover, an efficient decoding order determination scheme is proposed based on the equivalent-combined channel gains. Simulation results are provided to demonstrate that the proposed STAR-RIS-NOMA system, aided by our proposed algorithm, outperforms conventional RIS-NOMA and RIS assisted orthogonal multiple access (RIS-OMA) systems.
Modeling and control of high-dimensional, nonlinear robotic systems remains a challenging task. While various model- and learning-based approaches have been proposed to address these challenges, they broadly lack generalizability to different control tasks and rarely preserve the structure of the dynamics. In this work, we propose a new, data-driven approach for extracting low-dimensional models from data using Spectral Submanifold Reduction (SSMR). In contrast to other data-driven methods which fit dynamical models to training trajectories, we identify the dynamics on generic, low-dimensional attractors embedded in the full phase space of the robotic system. This allows us to obtain computationally-tractable models for control which preserve the system's dominant dynamics and better track trajectories radically different from the training data. We demonstrate the superior performance and generalizability of SSMR in dynamic trajectory tracking tasks vis-a-vis the state of the art, including Koopman operator-based approaches.
Removing background noise from speech audio has been the subject of considerable effort, especially in recent years due to the rise of virtual communication and amateur recordings. Yet background noise is not the only unpleasant disturbance that can prevent intelligibility: reverb, clipping, codec artifacts, problematic equalization, limited bandwidth, or inconsistent loudness are equally disturbing and ubiquitous. In this work, we propose to consider the task of speech enhancement as a holistic endeavor, and present a universal speech enhancement system that tackles 55 different distortions at the same time. Our approach consists of a generative model that employs score-based diffusion, together with a multi-resolution conditioning network that performs enhancement with mixture density networks. We show that this approach significantly outperforms the state of the art in a subjective test performed by expert listeners. We also show that it achieves competitive objective scores with just 4-8 diffusion steps, despite not considering any particular strategy for fast sampling. We hope that both our methodology and technical contributions encourage researchers and practitioners to adopt a universal approach to speech enhancement, possibly framing it as a generative task.
Generative Adversarial Network (GAN) based art has proliferated in the past year, going from a shiny new tool to generate fake human faces to a stage where anyone can generate thousands of artistic images with minimal effort. Some of these images are now ``good'' enough to win accolades from qualified judges. In this paper, we explore how Generative Models have impacted artistry, not only from a qualitative point of view, but also from an angle of exploitation of artisans --both via plagiarism, where models are trained on their artwork without permission, and via profit shifting, where profits in the art market have shifted from art creators to model owners or to traders in the unregulated secondary crypto market. This confluence of factors risks completely detaching humans from the artistic process, devaluing the labor of artists and distorting the public perception of the value of art.
The rapid advancement of models based on artificial intelligence demands innovative monitoring techniques which can operate in real time with low computational costs. In machine learning, especially if we consider neural network (NN) learning algorithms, and in particular deep-learning architectures, the models are often trained in a supervised manner. Consequently, the learned relationship between the input and the output must remain valid during the model's deployment. If this stationarity assumption holds, we can conclude that the NN generates accurate predictions. Otherwise, the retraining or rebuilding of the model is required. We propose to consider the latent feature representation of the data (called "embedding") generated by the NN for determining the time point when the data stream starts being nonstationary. To be precise, we monitor embeddings by applying multivariate control charts based on the calculation of the data depth and normalized ranks. The performance of the introduced method is evaluated using various NNs with different underlying data formats.
The processor failures in a multiprocessor system have a negative impact on its distributed computing efficiency. Because of the rapid expansion of multiprocessor systems, the importance of fault diagnosis is becoming increasingly prominent. The $h$-component diagnosability of $G$, denoted by $ct_{h}(G)$, is the maximum number of nodes of the faulty set $F$ that is correctly identified in a system, and the number of components in $G-F$ is at least $h$. In this paper, we determine the $(h+1)$-component diagnosability of general networks under the PMC model and MM$^{*}$ model. As applications, the component diagnosability is explored for some well-known networks, including complete cubic networks, hierarchical cubic networks, generalized exchanged hypercubes, dual-cube-like networks, hierarchical hypercubes, Cayley graphs generated by transposition trees (except star graphs), and DQcube as well. Furthermore, we provide some comparison results between the component diagnosability and other fault diagnosabilities.
In this article, we propose the construction of polar codes based on piecewise Gaussian approximation (PGA) techniques. The PGA is first optimized and then compared to the Gaussian approximation (GA) construction method, showing performance gains for medium blocks and high precision for long blocks, in scenarios with successive cancellation (SC) decoding and additive white gaussian noise (AWGN) channel. Based on the PGA, we develop two approximations based on multi-segmented polynomials that are easy to implement. We present the Approximate PGA (APGA) that is optimized for medium blocks and provides a performance improvement without increasing complexity. Furthermore, we develop the simplified PGA (SPGA) as an alternative to the GA, which is optimized for long blocks and achieves high construction accuracy. Simulation results show that the APGA and SPGA construction methods outperform existing GA and competing approaches for medium and long block codes with notable performance improvement.
Deep learning has shown great potential for modeling the physical dynamics of complex particle systems such as fluids (in Lagrangian descriptions). Existing approaches, however, require the supervision of consecutive particle properties, including positions and velocities. In this paper, we consider a partially observable scenario known as fluid dynamics grounding, that is, inferring the state transitions and interactions within the fluid particle systems from sequential visual observations of the fluid surface. We propose a differentiable two-stage network named NeuroFluid. Our approach consists of (i) a particle-driven neural renderer, which involves fluid physical properties into the volume rendering function, and (ii) a particle transition model optimized to reduce the differences between the rendered and the observed images. NeuroFluid provides the first solution to unsupervised learning of particle-based fluid dynamics by training these two models jointly. It is shown to reasonably estimate the underlying physics of fluids with different initial shapes, viscosity, and densities. It is a potential alternative approach to understanding complex fluid mechanics, such as turbulence, that are difficult to model using traditional methods of mathematical physics.
The conjoining of dynamical systems and deep learning has become a topic of great interest. In particular, neural differential equations (NDEs) demonstrate that neural networks and differential equation are two sides of the same coin. Traditional parameterised differential equations are a special case. Many popular neural network architectures, such as residual networks and recurrent networks, are discretisations. NDEs are suitable for tackling generative problems, dynamical systems, and time series (particularly in physics, finance, ...) and are thus of interest to both modern machine learning and traditional mathematical modelling. NDEs offer high-capacity function approximation, strong priors on model space, the ability to handle irregular data, memory efficiency, and a wealth of available theory on both sides. This doctoral thesis provides an in-depth survey of the field. Topics include: neural ordinary differential equations (e.g. for hybrid neural/mechanistic modelling of physical systems); neural controlled differential equations (e.g. for learning functions of irregular time series); and neural stochastic differential equations (e.g. to produce generative models capable of representing complex stochastic dynamics, or sampling from complex high-dimensional distributions). Further topics include: numerical methods for NDEs (e.g. reversible differential equations solvers, backpropagation through differential equations, Brownian reconstruction); symbolic regression for dynamical systems (e.g. via regularised evolution); and deep implicit models (e.g. deep equilibrium models, differentiable optimisation). We anticipate this thesis will be of interest to anyone interested in the marriage of deep learning with dynamical systems, and hope it will provide a useful reference for the current state of the art.