In compact settings, the convergence rate of the empirical optimal transport cost to its population value is well understood for a wide class of spaces and cost functions. In unbounded settings, however, hitherto available results require strong assumptions on the ground costs and the concentration of the involved measures. In this work, we pursue a decomposition-based approach to generalize the convergence rates found in compact spaces to unbounded settings under generic moment assumptions that are sharp up to an arbitrarily small $\epsilon > 0$. Hallmark properties of empirical optimal transport on compact spaces, like the recently established adaptation to lower complexity, are shown to carry over to the unbounded case.
The Strahler number was originally proposed to characterize the complexity of river bifurcation and has found various applications. This article proposes computation of the Strahler number's upper and lower limits for natural language sentence tree structures. Through empirical measurements across grammatically annotated data, the Strahler number of natural language sentences is shown to be almost 3 or 4, similarly to the case of river bifurcation as reported by Strahler (1957). From the theory behind the number, we show that it is one kind of lower limit on the amount of memory required to process sentences. We consider the Strahler number to provide reasoning that explains reports showing that the number of required memory areas to process sentences is 3 to 4 for parsing (Abney and Johnson, 1991; Schuler et al., 2010), and reports indicating a psychological "magical number" of 3 to 5 (Cowan, 2001). An analytical and empirical analysis shows that the Strahler number is not constant but grows logarithmically; therefore, the Strahler number of sentences derives from the range of sentence lengths. Furthermore, the Strahler number is not different for random trees, which could suggest that its origin is not specific to natural language.
In a mobile wireless channel, the small-scale multipath fading induces temporal channel fluctuations in the form of peaks and deep fades. The channel capacity degradation with fading severity in the high signal-to-noise ratio (SNR) regime is well known in the wireless communication literature: the probability of deep fades increases significantly with higher fading severity resulting in poor performance. In this paper, we focus on double-fading pinhole channels under perfect CSIT to show a very counter-intuitive result that - higher fading severity enables higher ergodic capacity at sufficiently low SNR. The underlying reason is that at low SNRs, ergodic capacity depends crucially on the probability distribution of channel peaks (simply tail distribution); for the pinhole channel, the tail distribution improves with increased fading severity. This allows a transmitter operating at low SNR to exploit channel peaks more efficiently resulting in a net improvement in achievable spectral efficiency. We derive a new key result quantifying the above dependence for the double-Nakagami-$m$ fading pinhole channel - that is, the ergodic capacity ${C} \propto (m_T m_R)^{-1}$ at low SNR, where $m_T m_R$ is the product of fading (severity) parameters of the two independent Nakagami-$m$ fadings involved.
Novel view synthesis and 3D modeling using implicit neural field representation are shown to be very effective for calibrated multi-view cameras. Such representations are known to benefit from additional geometric and semantic supervision. Most existing methods that exploit additional supervision require dense pixel-wise labels or localized scene priors. These methods cannot benefit from high-level vague scene priors provided in terms of scenes' descriptions. In this work, we aim to leverage the geometric prior of Manhattan scenes to improve the implicit neural radiance field representations. More precisely, we assume that only the knowledge of the indoor scene (under investigation) being Manhattan is known -- with no additional information whatsoever -- with an unknown Manhattan coordinate frame. Such high-level prior is used to self-supervise the surface normals derived explicitly in the implicit neural fields. Our modeling allows us to cluster the derived normals and exploit their orthogonality constraints for self-supervision. Our exhaustive experiments on datasets of diverse indoor scenes demonstrate the significant benefit of the proposed method over the established baselines. The source code will be available at //github.com/nikola3794/normal-clustering-nerf.
The design of personalized cranial implants is a challenging and tremendous task that has become a hot topic in terms of process automation with the use of deep learning techniques. The main challenge is associated with the high diversity of possible cranial defects. The lack of appropriate data sources negatively influences the data-driven nature of deep learning algorithms. Hence, one of the possible solutions to overcome this problem is to rely on synthetic data. In this work, we propose three volumetric variations of deep generative models to augment the dataset by generating synthetic skulls, i.e. Wasserstein Generative Adversarial Network with Gradient Penalty (WGAN-GP), WGAN-GP hybrid with Variational Autoencoder pretraining (VAE/WGAN-GP) and Introspective Variational Autoencoder (IntroVAE). We show that it is possible to generate dozens of thousands of defective skulls with compatible defects that achieve a trade-off between defect heterogeneity and the realistic shape of the skull. We evaluate obtained synthetic data quantitatively by defect segmentation with the use of V-Net and qualitatively by their latent space exploration. We show that the synthetically generated skulls highly improve the segmentation process compared to using only the original unaugmented data. The generated skulls may improve the automatic design of personalized cranial implants for real medical cases.
Children possess the ability to learn multiple cognitive tasks sequentially, which is a major challenge toward the long-term goal of artificial general intelligence. Existing continual learning frameworks are usually applicable to Deep Neural Networks (DNNs) and lack the exploration on more brain-inspired, energy-efficient Spiking Neural Networks (SNNs). Drawing on continual learning mechanisms during child growth and development, we propose Dynamic Structure Development of Spiking Neural Networks (DSD-SNN) for efficient and adaptive continual learning. When learning a sequence of tasks, the DSD-SNN dynamically assigns and grows new neurons to new tasks and prunes redundant neurons, thereby increasing memory capacity and reducing computational overhead. In addition, the overlapping shared structure helps to quickly leverage all acquired knowledge to new tasks, empowering a single network capable of supporting multiple incremental tasks (without the separate sub-network mask for each task). We validate the effectiveness of the proposed model on multiple class incremental learning and task incremental learning benchmarks. Extensive experiments demonstrated that our model could significantly improve performance, learning speed and memory capacity, and reduce computational overhead. Besides, our DSD-SNN model achieves comparable performance with the DNNs-based methods, and significantly outperforms the state-of-the-art (SOTA) performance for existing SNNs-based continual learning methods.
Even though the analysis of unsteady 2D flow fields is challenging, fluid mechanics experts generally have an intuition on where in the simulation domain specific features are expected. Using this intuition, showing similar regions enables the user to discover flow patterns within the simulation data. When focusing on similarity, a solid mathematical framework for a specific flow pattern is not required. We propose a technique that visualizes similar and dissimilar regions with respect to a region selected by the user. Using infinitesimal strain theory, we capture the strain and rotation progression and therefore the dynamics of fluid parcels along pathlines, which we encode as distributions. We then apply the Jensen-Shannon divergence to compute the (dis)similarity between pathline dynamics originating in a user-defined flow region and the pathline dynamics of the flow field. We validate our method by applying it to two simulation datasets of two-dimensional unsteady flows. Our results show that our approach is suitable for analyzing the similarity of time-dependent flow fields.
Learning on big data brings success for artificial intelligence (AI), but the annotation and training costs are expensive. In future, learning on small data is one of the ultimate purposes of AI, which requires machines to recognize objectives and scenarios relying on small data as humans. A series of machine learning models is going on this way such as active learning, few-shot learning, deep clustering. However, there are few theoretical guarantees for their generalization performance. Moreover, most of their settings are passive, that is, the label distribution is explicitly controlled by one specified sampling scenario. This survey follows the agnostic active sampling under a PAC (Probably Approximately Correct) framework to analyze the generalization error and label complexity of learning on small data using a supervised and unsupervised fashion. With these theoretical analyses, we categorize the small data learning models from two geometric perspectives: the Euclidean and non-Euclidean (hyperbolic) mean representation, where their optimization solutions are also presented and discussed. Later, some potential learning scenarios that may benefit from small data learning are then summarized, and their potential learning scenarios are also analyzed. Finally, some challenging applications such as computer vision, natural language processing that may benefit from learning on small data are also surveyed.
Residual networks (ResNets) have displayed impressive results in pattern recognition and, recently, have garnered considerable theoretical interest due to a perceived link with neural ordinary differential equations (neural ODEs). This link relies on the convergence of network weights to a smooth function as the number of layers increases. We investigate the properties of weights trained by stochastic gradient descent and their scaling with network depth through detailed numerical experiments. We observe the existence of scaling regimes markedly different from those assumed in neural ODE literature. Depending on certain features of the network architecture, such as the smoothness of the activation function, one may obtain an alternative ODE limit, a stochastic differential equation or neither of these. These findings cast doubts on the validity of the neural ODE model as an adequate asymptotic description of deep ResNets and point to an alternative class of differential equations as a better description of the deep network limit.
Deep neural networks have revolutionized many machine learning tasks in power systems, ranging from pattern recognition to signal processing. The data in these tasks is typically represented in Euclidean domains. Nevertheless, there is an increasing number of applications in power systems, where data are collected from non-Euclidean domains and represented as the graph-structured data with high dimensional features and interdependency among nodes. The complexity of graph-structured data has brought significant challenges to the existing deep neural networks defined in Euclidean domains. Recently, many studies on extending deep neural networks for graph-structured data in power systems have emerged. In this paper, a comprehensive overview of graph neural networks (GNNs) in power systems is proposed. Specifically, several classical paradigms of GNNs structures (e.g., graph convolutional networks, graph recurrent neural networks, graph attention networks, graph generative networks, spatial-temporal graph convolutional networks, and hybrid forms of GNNs) are summarized, and key applications in power systems such as fault diagnosis, power prediction, power flow calculation, and data generation are reviewed in detail. Furthermore, main issues and some research trends about the applications of GNNs in power systems are discussed.
With the rapid increase of large-scale, real-world datasets, it becomes critical to address the problem of long-tailed data distribution (i.e., a few classes account for most of the data, while most classes are under-represented). Existing solutions typically adopt class re-balancing strategies such as re-sampling and re-weighting based on the number of observations for each class. In this work, we argue that as the number of samples increases, the additional benefit of a newly added data point will diminish. We introduce a novel theoretical framework to measure data overlap by associating with each sample a small neighboring region rather than a single point. The effective number of samples is defined as the volume of samples and can be calculated by a simple formula $(1-\beta^{n})/(1-\beta)$, where $n$ is the number of samples and $\beta \in [0,1)$ is a hyperparameter. We design a re-weighting scheme that uses the effective number of samples for each class to re-balance the loss, thereby yielding a class-balanced loss. Comprehensive experiments are conducted on artificially induced long-tailed CIFAR datasets and large-scale datasets including ImageNet and iNaturalist. Our results show that when trained with the proposed class-balanced loss, the network is able to achieve significant performance gains on long-tailed datasets.