Most urban applications require building footprints as concise vector graphics with sharp boundaries rather than pixel-wise raster images. This need contrasts with the majority of existing methods, which typically generate over-smoothed footprint polygons. Editing these automatically produced polygons is inefficient and can be even more time-consuming than manual digitization. This paper introduces a semi-automatic approach to building footprint extraction based on semantically sensitive superpixels and graph neural networks. Drawing inspiration from object-based classification techniques, we first learn to generate superpixels that are not only boundary-preserving but also semantically sensitive: they respond exclusively to building boundaries rather than to other natural objects, while simultaneously producing a semantic segmentation of the buildings. These intermediate superpixel representations can naturally be treated as nodes of a graph, so graph neural networks are employed to model the global interactions among all superpixels and to enhance the representativeness of node features for building segmentation. Classical approaches are then used to extract and regularize boundaries for the vectorized building footprints. With only a few clicks and simple strokes, we obtain accurate segmentation results without having to edit polygon vertices. Our approach demonstrates superior precision and efficacy, as validated by experiments on several public benchmark datasets, with an improvement of 8% in AP50 in the vector-graphics evaluation over established techniques. Additionally, we have devised an optimized pipeline for interactive editing that is expected to further improve the overall quality of the results.
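To make the superpixel-as-graph-node idea concrete, the following minimal sketch builds a region adjacency graph over generic SLIC superpixels and runs one neighborhood mean-aggregation step over node features. It only illustrates the data structure: SLIC stands in for the paper's learned, semantically sensitive superpixels, and the single aggregation step stands in for the trained graph neural network.

```python
# Sketch: superpixels -> region adjacency graph -> one message-passing step.
# SLIC and mean aggregation are stand-ins, not the paper's learned components.
import numpy as np
from skimage import data
from skimage.segmentation import slic

image = data.astronaut()                          # stand-in RGB image
labels = slic(image, n_segments=200, compactness=10, start_label=0)
n = labels.max() + 1

# Node features: mean color of each superpixel.
feats = np.array([image[labels == k].mean(axis=0) for k in range(n)])

# Region adjacency from horizontally/vertically touching superpixels.
A = np.zeros((n, n), dtype=bool)
A[labels[:, :-1], labels[:, 1:]] = True
A[labels[:-1, :], labels[1:, :]] = True
A = A | A.T
np.fill_diagonal(A, True)                         # self-loops

# One neighborhood mean-aggregation step (stand-in for a trained GNN layer).
deg = A.sum(axis=1, keepdims=True)
feats_agg = (A @ feats) / deg
print(feats_agg.shape)                            # (number of superpixels, 3)
```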
Digital image correlation (DIC) has become a valuable tool in the evaluation of mechanical experiments, particularly fatigue crack growth experiments. The evaluation requires accurate information on the crack path and crack tip position, which is difficult to obtain due to inherent noise and artefacts. Machine learning models have been very successful at recognizing this relevant information, but training robust models that generalize well requires large amounts of data. However, data is typically scarce in the field of materials science and engineering because experiments are expensive and time-consuming. We present a method to generate synthetic DIC data using generative adversarial networks with a physics-guided discriminator. To decide whether data samples are real or fake, this discriminator additionally receives the derived von Mises equivalent strain. We show that this physics-guided approach leads to improved results in terms of the visual quality of samples, sliced Wasserstein distance, and geometry score.
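As a sketch of the physics-guided input, the snippet below derives a von Mises equivalent strain map from in-plane DIC strain components, using the common definition based on the deviatoric strain tensor under a plane-strain assumption, and stacks it as an extra discriminator channel. The paper's exact strain convention and network interface may differ.

```python
# Sketch: derive von Mises equivalent strain from DIC strain components and
# stack it as an additional channel of the discriminator input.
import numpy as np

def von_mises_equivalent_strain(exx, eyy, exy):
    """exx, eyy: normal strains; exy: tensor shear strain (2D arrays)."""
    ezz = np.zeros_like(exx)                      # plane-strain assumption
    mean = (exx + eyy + ezz) / 3.0
    dxx, dyy, dzz = exx - mean, eyy - mean, ezz - mean   # deviatoric parts
    contraction = dxx**2 + dyy**2 + dzz**2 + 2.0 * exy**2
    return np.sqrt(2.0 / 3.0 * contraction)

# Fake strain fields purely for illustration.
exx = np.random.randn(256, 256) * 1e-3
eyy = np.random.randn(256, 256) * 1e-3
exy = np.random.randn(256, 256) * 1e-3

eps_vm = von_mises_equivalent_strain(exx, eyy, exy)
disc_input = np.stack([exx, eyy, exy, eps_vm], axis=0)   # channels fed to D
print(disc_input.shape)                                   # (4, 256, 256)
```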
Characterizing the shapes of high-dimensional objects via Ricci curvatures plays a critical role in many research areas in mathematics and physics. However, even though several discretizations of Ricci curvatures for discrete combinatorial objects such as networks have been proposed and studied by mathematicians, the computational complexity aspects of these discretizations have largely escaped the attention of theoretical computer scientists. In this paper, we study one such discretization, namely the Ollivier-Ricci curvature, from the perspective of efficient computation by fine-grained reductions and local query-based algorithms. Our main contributions are the following. (a) We relate our curvature computation problem to the minimum-weight perfect matching problem on complete bipartite graphs via fine-grained reductions. (b) We formalize the computational aspects of the curvature computation problems in suitable frameworks so that they can be studied by researchers in local algorithms. (c) We provide the first known lower and upper bounds on the number of queries needed by query-based algorithms for the curvature computation problems in our local algorithms framework. En route, we also illustrate a localized version of our fine-grained reduction. We believe that our results raise an intriguing set of research questions, motivated both by theory and by practice, regarding the design of efficient algorithms for curvatures of such objects.
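For reference, the standard definition of the Ollivier-Ricci curvature of an edge (x, y) in an unweighted graph is shown below, with idleness parameter α and W_1 the L^1-Wasserstein (optimal transport) distance between the lazy random-walk measures at the endpoints; computing W_1 between these finitely supported measures is the step that connects curvature computation to minimum-weight matching and transportation formulations. The exact variant studied in the paper may differ.

```latex
% Standard Ollivier--Ricci curvature with idleness parameter \alpha;
% z \sim x means z is a neighbor of x, and d(x,y) is the graph distance.
\[
\mu_x^{\alpha}(z) =
\begin{cases}
\alpha, & z = x,\\[2pt]
\dfrac{1-\alpha}{\deg(x)}, & z \sim x,\\[2pt]
0, & \text{otherwise,}
\end{cases}
\qquad
\kappa_{\alpha}(x,y) = 1 - \frac{W_1\!\left(\mu_x^{\alpha},\,\mu_y^{\alpha}\right)}{d(x,y)} .
\]
```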
The direct deep learning simulation of multi-scale problems remains a challenging issue. In this work, a novel higher-order multi-scale deep Ritz method (HOMS-DRM) is developed for the thermal transfer equation of authentic composite materials with highly oscillatory and discontinuous coefficients. In this novel HOMS-DRM, higher-order multi-scale analysis and modeling are first employed to overcome the limitations of prohibitive computational cost and the Frequency Principle that arise in direct deep learning simulation. Then, an improved deep Ritz method is designed for high-accuracy and mesh-free simulation of the macroscopic homogenized equation, which has no multi-scale property, and of the microscopic lower-order and higher-order cell problems with highly discontinuous coefficients. Moreover, the theoretical convergence of the proposed HOMS-DRM is rigorously demonstrated under appropriate assumptions. Finally, extensive numerical experiments are presented to show the computational accuracy of the proposed HOMS-DRM. This study offers a robust and high-accuracy multi-scale deep learning framework that enables the effective simulation and analysis of multi-scale problems in authentic composite materials.
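For orientation, the generic deep Ritz formulation that such methods build on is sketched below for a divergence-form thermal problem; the symbols (coefficient a(x), source f, domain Ω, network u_θ, sample count N) are illustrative, and the paper's homogenized and cell problems have their own specific functionals and boundary treatments.

```latex
% Generic deep Ritz energy for -div(a(x) grad u) = f on \Omega (boundary terms
% omitted): the network u_\theta minimizes the Ritz functional, approximated by
% Monte Carlo quadrature over uniformly sampled collocation points x_i.
\[
J(u_\theta) = \int_{\Omega} \Big( \tfrac{1}{2}\, a(x)\,\lvert \nabla u_\theta(x) \rvert^2 - f(x)\, u_\theta(x) \Big)\, \mathrm{d}x
\;\approx\; \frac{\lvert \Omega \rvert}{N} \sum_{i=1}^{N} \Big( \tfrac{1}{2}\, a(x_i)\,\lvert \nabla u_\theta(x_i) \rvert^2 - f(x_i)\, u_\theta(x_i) \Big),
\qquad x_i \sim \mathcal{U}(\Omega).
\]
```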
The application of deep learning to non-stationary temporal datasets can lead to overfitted models that underperform under regime changes. In this work, we propose a modular machine learning pipeline for ranking predictions on temporal panel datasets that is robust under regime changes. The modularity of the pipeline allows the use of different models, including Gradient Boosting Decision Trees (GBDTs) and Neural Networks, with and without feature engineering. We evaluate our framework on financial data for stock portfolio prediction and find that GBDT models with dropout display high performance, robustness and generalisability with reduced complexity and computational cost. We then demonstrate how online learning techniques, which require no retraining of models, can be used post-prediction to enhance the results. First, we show that dynamic feature projection improves robustness by reducing drawdown during regime changes. Second, we demonstrate that dynamic model ensembling, based on the selection of models with good recent performance, leads to improved Sharpe and Calmar ratios of out-of-sample predictions. We also evaluate the robustness of our pipeline across different data splits and random seeds, finding good reproducibility.
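One plausible realisation of "selection of models with good recent performance" is sketched below: rank models by a naive recent Sharpe proxy and equal-weight the predictions of the top few. The window length, top-k choice, and all names are illustrative and not the paper's settings.

```python
# Hedged sketch of post-prediction model ensembling by recent performance.
import numpy as np

def select_and_ensemble(preds, recent_returns, top_k=3):
    """
    preds:           (n_models, n_assets) current-period predictions per model.
    recent_returns:  (n_models, n_periods) realised returns each model's past
                     predictions would have produced.
    """
    mean = recent_returns.mean(axis=1)
    std = recent_returns.std(axis=1) + 1e-12
    sharpe = mean / std                          # per-model recent Sharpe proxy
    best = np.argsort(sharpe)[-top_k:]           # indices of the top-k models
    return preds[best].mean(axis=0)              # equal-weight ensemble of them

preds = np.random.randn(10, 50)                  # 10 models, 50 assets (fake)
recent = np.random.randn(10, 20) * 0.01          # fake recent realised returns
ensembled = select_and_ensemble(preds, recent, top_k=3)
print(ensembled.shape)                           # (50,)
```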
In this work, we propose a simple yet generic preconditioned Krylov subspace method for a large class of nonsymmetric block Toeplitz all-at-once systems arising from discretizing evolutionary partial differential equations. Namely, our main result is to propose two novel symmetric positive definite preconditioners, which can be efficiently diagonalized by the discrete sine transform matrix. More specifically, our approach is to first permute the original linear system to obtain a symmetric one, and subsequently develop the desired preconditioners based on the spectral symbol of the modified matrix. Then, we show that the eigenvalues of the preconditioned matrix sequences are clustered around $\pm 1$, which entails rapid convergence when the minimal residual method is employed. Alternatively, when the conjugate gradient method on the normal equations is used, we show that our preconditioner is effective in the sense that the eigenvalues of the preconditioned matrix sequence are clustered around unity. An extension of our proposed preconditioning method is given for high-order backward difference time discretization schemes, which can be applied to a wide range of time-dependent equations. Numerical examples are given, also in the variable-coefficient setting, to demonstrate the effectiveness of our proposed preconditioners, which consistently outperform an existing block circulant preconditioner discussed in the relevant literature.
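One concrete instance consistent with this setting is sketched below, assuming backward Euler time-stepping and symmetric mass and stiffness matrices M and K: the block Toeplitz all-at-once system, its symmetrization by the time-reversal (anti-identity) permutation, and the discrete sine transform matrix that diagonalizes preconditioners from the τ-algebra. The paper's actual preconditioners are not reproduced here.

```latex
% Illustrative setting (backward Euler, symmetric M and K): the all-at-once
% block Toeplitz system, the anti-identity permutation Y_n that symmetrizes it,
% and the sine transform matrix S_n that diagonalizes tau-type preconditioners.
\[
\mathcal{T}\mathbf{u} = \Big( B_n \otimes \tfrac{1}{\tau} M + I_n \otimes K \Big)\mathbf{u} = \mathbf{b},
\qquad
B_n =
\begin{pmatrix}
1 & & & \\
-1 & 1 & & \\
 & \ddots & \ddots & \\
 & & -1 & 1
\end{pmatrix},
\]
\[
(Y_n \otimes I)\,\mathcal{T} \ \text{is symmetric},
\qquad
S_n = \sqrt{\tfrac{2}{n+1}}\Big[\sin\!\big(\tfrac{jk\pi}{n+1}\big)\Big]_{j,k=1}^{n},
\qquad
P = S_n \Lambda S_n \ \ (\Lambda \ \text{diagonal, positive}).
\]
```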
Tactile perception is an increasingly popular gateway in human-machine interaction, yet universal design guidelines for tactile displays are still lacking, largely due to the absence of methods to measure sensitivity across skin areas. In this study, we address this gap by developing and evaluating two fully automated vibrotactile tasks that require subjects to discriminate the position of vibrotactile stimuli using a two-interval forced-choice (2IFC) procedure. Of the two methodologies, one was initially validated through a preliminary study involving 13 participants. Subsequently, we applied the validated and improved vibrotactile testing procedure to a larger sample of 23 participants, enabling a direct and valid comparison with static perception. Our findings reveal significantly finer spatial acuity for the perception of static stimuli than for vibrotactile stimuli at stimulus separations of 15 mm and above. This study introduces a novel method for generating both universal thresholds and individual, person-specific data for vibratory perception, marking a critical step towards the development of functional vibrotactile displays. The results underline the need for further research in this area and provide a foundation for the development of universal design guidelines for tactile displays.
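Purely for illustration of the 2IFC procedure, the toy simulation below presents a reference and an offset stimulus in random order, lets an observer with Gaussian positional noise report which interval contained the more distal stimulus, and tallies proportion correct per separation. All numbers are made up and unrelated to the study's data.

```python
# Toy simulation of a two-interval forced-choice (2IFC) position discrimination.
import numpy as np

rng = np.random.default_rng(0)

def proportion_correct(separation_mm, noise_mm=6.0, n_trials=2000):
    correct = 0
    for _ in range(n_trials):
        ref, off = 0.0, separation_mm
        order = rng.permutation([ref, off])            # random interval order
        percept = order + rng.normal(0, noise_mm, 2)   # noisy perceived positions
        choice = int(np.argmax(percept))               # interval judged more distal
        correct += int(order[choice] == off)
    return correct / n_trials

for sep in (5, 10, 15, 20, 30):
    print(sep, round(proportion_correct(sep), 2))      # psychometric-like rise
```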
In this paper, we propose a parameter identification methodology for the SIRD model, an extension of the classical SIR model that considers the deceased as a separate category. In addition, our model includes one parameter which is the ratio between the real total number of infected and the number of infected documented in the official statistics. Due to many factors, such as governmental decisions, the circulation of several variants, and the opening and closing of schools, the typical assumption that the parameters of the model stay constant for long periods of time is not realistic. Thus, our objective is to create a method which works for short periods of time. To this end, we approach the estimation relying on the previous 7 days of data and then use the identified parameters to make predictions. To estimate the parameters we propose the average of an ensemble of neural networks. Each neural network is constructed based on a database built by solving the SIRD model for 7 days with random parameters. In this way, the networks learn the parameters from the solution of the SIRD model. Lastly, we use the ensemble to obtain estimates of the parameters from the real COVID-19 data for Romania and then illustrate the predictions of the number of deaths for different periods of time, from 10 up to 45 days. The main goal was to apply this approach to the analysis of the evolution of COVID-19 in Romania, but the method was also exemplified on other countries, such as Hungary, the Czech Republic and Poland, with similar results. The results are backed by a theorem which guarantees that we can recover the parameters of the model from the reported data. We believe this methodology can be used as a general tool for short-term predictions of infectious diseases or in other compartmental models.
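For reference, the standard SIRD equations assumed here are given below (the abstract does not spell them out); β, γ, and μ denote the infection, recovery, and death rates, N the population size, and the paper's additional under-reporting ratio rescales the documented infections to the real total.

```latex
% Standard SIRD compartmental model (notation illustrative).
\[
\frac{dS}{dt} = -\frac{\beta S I}{N}, \qquad
\frac{dI}{dt} = \frac{\beta S I}{N} - \gamma I - \mu I, \qquad
\frac{dR}{dt} = \gamma I, \qquad
\frac{dD}{dt} = \mu I .
\]
```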
This article presents the openCFS submodule scattered data reader for coupling multi-physical simulations performed in different simulation programs, for instance the forward-coupling of a surface vibration simulation (mechanical system) to an acoustic propagation simulation using time-dependent acoustic absorbing material as a noise mitigation measure. The nearest-neighbor search between the target and source points of the interpolation is performed using the FLANN or the CGAL library. In doing so, the coupled field (e.g., surface velocity) is interpolated from a source representation, consisting of field values physically stored and organized in a file directory, to a target representation, namely the quadrature points in the case of the finite element method. A test case of the functionality is presented in the "testsuite" module of the openCFS software, called "Abc2dcsvt". This scattered data reader module has been successfully applied in numerous studies on flow-induced sound generation. Within this short article, the functionality and usability of this module are described.
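A hedged sketch of the coupling step is given below: field values stored at source points are mapped onto target quadrature points by a nearest-neighbor query, with scipy's cKDTree standing in for the FLANN/CGAL searches used by openCFS; the names and the piecewise-constant interpolation are illustrative only.

```python
# Sketch: map scattered source field values onto target quadrature points via
# nearest-neighbor search (cKDTree as a stand-in for FLANN/CGAL).
import numpy as np
from scipy.spatial import cKDTree

source_pts = np.random.rand(1000, 3)          # coordinates where data is stored
source_vals = np.random.rand(1000)            # e.g. normal surface velocity
target_pts = np.random.rand(5000, 3)          # quadrature points of the FE mesh

tree = cKDTree(source_pts)
dist, idx = tree.query(target_pts, k=1)       # nearest source point per target
target_vals = source_vals[idx]                # piecewise-constant interpolation
print(target_vals.shape)                      # (5000,)
```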
We present ResMLP, an architecture built entirely upon multi-layer perceptrons for image classification. It is a simple residual network that alternates (i) a linear layer in which image patches interact, independently and identically across channels, and (ii) a two-layer feed-forward network in which channels interact independently per patch. When trained with a modern training strategy using heavy data-augmentation and optionally distillation, it attains surprisingly good accuracy/complexity trade-offs on ImageNet. We will share our code based on the Timm library and pre-trained models.
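A simplified PyTorch sketch of one ResMLP block, consistent with the description above, is given below: a per-channel affine transform in place of normalization, a linear layer that mixes patches with channels treated identically, and a two-layer per-patch feed-forward network, each in a residual branch. Details such as the per-layer rescaling and initialization used in the actual model are omitted.

```python
# Simplified sketch of one ResMLP block (rescaling/initialization omitted).
import torch
import torch.nn as nn

class Affine(nn.Module):
    """Per-channel affine transform used in place of LayerNorm."""
    def __init__(self, dim):
        super().__init__()
        self.alpha = nn.Parameter(torch.ones(dim))
        self.beta = nn.Parameter(torch.zeros(dim))

    def forward(self, x):                       # x: (batch, patches, dim)
        return self.alpha * x + self.beta

class ResMLPBlock(nn.Module):
    def __init__(self, num_patches, dim, hidden_dim):
        super().__init__()
        self.aff1 = Affine(dim)
        self.patch_mix = nn.Linear(num_patches, num_patches)   # cross-patch
        self.aff2 = Affine(dim)
        self.mlp = nn.Sequential(nn.Linear(dim, hidden_dim),
                                 nn.GELU(),
                                 nn.Linear(hidden_dim, dim))    # per-patch

    def forward(self, x):                       # x: (batch, patches, dim)
        x = x + self.patch_mix(self.aff1(x).transpose(1, 2)).transpose(1, 2)
        x = x + self.mlp(self.aff2(x))
        return x

x = torch.randn(2, 196, 384)                    # 14x14 patches, 384 channels
print(ResMLPBlock(196, 384, 4 * 384)(x).shape)  # torch.Size([2, 196, 384])
```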
Nowadays, Convolutional Neural Networks (CNNs) have achieved impressive performance on many computer vision related tasks, such as object detection, image recognition, and image retrieval. These achievements benefit from the CNNs' outstanding capability to learn input features through deep layers of neuron structures and an iterative training process. However, the learned features are hard to identify and interpret from a human vision perspective, resulting in a lack of understanding of the CNNs' internal working mechanisms. To improve CNN interpretability, CNN visualization is widely utilized as a qualitative analysis method that translates the internal features into visually perceptible patterns. Many CNN visualization works have been proposed in the literature to interpret CNNs from the perspectives of network structure, operation, and semantic concept. In this paper, we provide a comprehensive survey of several representative CNN visualization methods, including Activation Maximization, Network Inversion, Deconvolutional Neural Networks (DeconvNet), and Network Dissection based visualization. These methods are presented in terms of motivations, algorithms, and experimental results. Based on these visualization methods, we also discuss their practical applications to demonstrate the significance of CNN interpretability in areas such as network design, optimization, and security enhancement.
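As an example of one of the surveyed methods, the sketch below performs Activation Maximization: gradient ascent on the input image to maximize a chosen class logit of a pretrained CNN, with a small L2 penalty as a common regularizer. torchvision's ResNet-18, the target class, and all hyperparameters are illustrative choices, not tied to any particular surveyed work.

```python
# Minimal Activation Maximization: optimize the input to maximize a class logit.
import torch
from torchvision import models

model = models.resnet18(weights="IMAGENET1K_V1").eval()
target_class = 130                              # arbitrary ImageNet class index
img = torch.zeros(1, 3, 224, 224, requires_grad=True)
optimizer = torch.optim.Adam([img], lr=0.05)

for step in range(200):
    optimizer.zero_grad()
    logit = model(img)[0, target_class]
    # Ascend the logit; the L2 penalty keeps the optimized image from drifting
    # into high-frequency, adversarial-looking patterns.
    loss = -logit + 1e-4 * img.norm()
    loss.backward()
    optimizer.step()

print(float(model(img)[0, target_class]))       # logit after optimization
```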