Lattice-skin structures composed of a thin-shell skin and a lattice infill are widespread in nature and large-scale engineering due to their efficiency and exceptional mechanical properties. Recent advances in additive manufacturing, or 3D printing, make it possible to create lattice-skin structures of almost any size, with arbitrary shape and geometric complexity. We propose a novel gradient-based approach to optimising both the shape and the infill of lattice-skin structures to further improve their efficiency. The respective gradients are computed by fully taking the lattice-skin coupling into account. The shell is modelled as a Kirchhoff-Love shell and analysed using isogeometric subdivision surfaces, whereas the lattice is modelled as a pin-jointed truss. The lattice consists of many cells, possibly of different sizes, each containing a small number of struts. We propose a penalisation approach akin to the SIMP (solid isotropic material with penalisation) method for topology optimisation of the lattice. Furthermore, a corresponding sensitivity filter and a lattice extraction technique are introduced to ensure the stability of the optimisation process and to eliminate scattered struts with small cross-sectional areas. The developed topology optimisation technique is suitable for non-periodic, non-uniform lattices. For shape optimisation of both the shell and the lattice, the geometry of the lattice-skin structure is parameterised using the free-form deformation technique. The topology and shape optimisation problems are solved in an iterative, sequential manner. The effectiveness of the proposed approach and the influence of different algorithmic parameters are demonstrated with several numerical examples.
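As a concrete illustration of the SIMP-like idea described in this abstract, the following Python sketch penalises strut cross-sectional areas and applies a classic mesh-independency-style sensitivity filter. The penalisation exponent, the filter form, and all names are illustrative assumptions, not the paper's implementation.

```python
import numpy as np

def penalised_areas(x, a_max, p=3.0):
    """SIMP-style penalisation: design variables x in [0, 1] are raised
    to the power p before scaling the maximum strut cross-sectional
    area, which pushes intermediate values towards 0 or 1."""
    return (x ** p) * a_max

def sensitivity_filter(x, dc, neighbours, weights):
    """Filter in the spirit of the classic SIMP sensitivity filter:
    each strut's sensitivity dc is replaced by a weighted average over
    its neighbouring struts, stabilising the optimisation."""
    dc_f = np.empty_like(dc)
    for e in range(len(dc)):
        nb = np.asarray(neighbours[e])
        w = np.asarray(weights[e])
        dc_f[e] = np.sum(w * x[nb] * dc[nb]) / (max(x[e], 1e-3) * np.sum(w))
    return dc_f

# Example: three struts, the middle one with an intermediate density.
x = np.array([1.0, 0.5, 1.0])
print(penalised_areas(x, a_max=1.0))   # [1.0, 0.125, 1.0]
```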
This paper presents a real-time online vision framework that jointly recovers an indoor scene's 3D structure and semantic labels. Given noisy depth maps, a camera trajectory, and 2D semantic labels at training time, the proposed deep-neural-network-based approach learns to fuse depth over frames with suitable semantic labels in scene space. Our approach exploits a joint volumetric representation of depth and semantics in the scene feature space to solve this task. For compelling online fusion of semantic labels and geometry in real time, we introduce an efficient vortex pooling block while dropping the routing network used in online depth fusion, thereby preserving high-frequency surface details. We show that the context information provided by the scene semantics helps the depth fusion network learn noise-resistant features. It also helps overcome shortcomings of current online depth fusion methods in dealing with thin object structures, thickening artifacts, and false surfaces. Experimental evaluation on the Replica dataset shows that our approach can perform depth fusion at 37 or 10 frames per second with an average reconstruction F-score of 88% or 91%, respectively, depending on the depth map resolution. Moreover, our model achieves an average IoU score of 0.515 on the ScanNet 3D semantic benchmark leaderboard.
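The vortex pooling block itself is not spelled out in this abstract; the sketch below is a minimal PyTorch rendition of vortex-pooling-style context aggregation (parallel average pooling followed by matched-rate dilated convolutions), with branch rates and channel widths assumed purely for illustration.

```python
import torch
import torch.nn as nn

class VortexPooling(nn.Module):
    """Sketch of a vortex-pooling-style context block: parallel branches
    average-pool at growing (odd) kernel sizes, then apply 3x3 dilated
    convolutions whose rates match the pooling extent; branch outputs
    are concatenated and fused by a 1x1 convolution."""
    def __init__(self, c_in, c_out, rates=(1, 3, 9, 27)):
        super().__init__()
        self.branches = nn.ModuleList()
        for r in rates:
            pool = (nn.Identity() if r == 1
                    else nn.AvgPool2d(r, stride=1, padding=r // 2))
            self.branches.append(nn.Sequential(
                pool, nn.Conv2d(c_in, c_out, 3, padding=r, dilation=r)))
        self.fuse = nn.Conv2d(c_out * len(rates), c_out, 1)

    def forward(self, x):
        return self.fuse(torch.cat([b(x) for b in self.branches], dim=1))

# y = VortexPooling(64, 64)(torch.randn(1, 64, 32, 32))  # -> (1, 64, 32, 32)
```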
Cross Z-complementary pairs (CZCPs) are a special kind of Z-complementary pairs (ZCPs) with zero autocorrelation sums around the in-phase position and the end-shift position, as well as zero cross-correlation sums around the end-shift position. CZCPs can be utilized as a key component in designing optimal training sequences for broadband spatial modulation (SM) systems over frequency-selective channels. In this paper, we focus on designing new CZCPs with a large cross Z-complementary ratio $(\mathrm{CZC}_{\mathrm{ratio}})$ by exploring two promising approaches. The first constructs CZCPs by properly cascading sequences from a Golay complementary pair (GCP). It leads to $(28L,13L)$-CZCPs, $(28L,13L+\frac{L}{2})$-CZCPs and $(30L,13L-1)$-CZCPs, where $L$ is the length of a binary GCP. We emphasize that the proposed CZCPs have the largest $\mathrm{CZC}_{\mathrm{ratio}}=\frac{27}{28}$ among known non-perfect CZCPs in the literature. In particular, we propose optimal binary $(28,13)$-CZCPs and $(56,27)$-CZCPs. The second constructs CZCPs from Boolean functions (BFs); the resulting CZCPs have the largest $\mathrm{CZC}_{\mathrm{ratio}}=\frac{13}{14}$ among known non-perfect CZCPs of their kind in the literature.
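To make the correlation conditions concrete, here is a small Python check of the aperiodic autocorrelation-sum property for the length-2 binary GCP $(1,1)$, $(1,-1)$. A CZCP additionally requires the autocorrelation sums to vanish in zones around the in-phase and end-shift positions and the cross-correlation sums to vanish near the end shift; this snippet only illustrates how those sums are computed.

```python
def acf(seq, u):
    """Aperiodic autocorrelation of seq at non-negative shift u."""
    return sum(seq[i] * seq[i + u] for i in range(len(seq) - u))

def ccf(a, b, u):
    """Aperiodic cross-correlation of a and b at non-negative shift u."""
    return sum(a[i] * b[i + u] for i in range(len(a) - u))

# A binary Golay complementary pair of length 2: the autocorrelation
# sums vanish at every non-zero shift.
a, b = [1, 1], [1, -1]
assert all(acf(a, u) + acf(b, u) == 0 for u in range(1, len(a)))
print(ccf(a, b, 1))  # cross-correlation at the end shift: 1 * (-1) = -1
```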
Aiming to recover data from several concurrent node failures, linear $r$-LRC codes with locality $r$ were extended to $(r, \delta)$-LRC codes with locality $(r, \delta)$, which enable the local recovery of a failed node when more than one node fails. Optimal LRC codes are those whose parameters achieve the generalized Singleton bound with equality. In the present paper, we are interested in optimal LRC codes over small fields and, more precisely, over $\mathbb{F}_4$. We adopt an approach that investigates optimal quaternary $(r,\delta)$-LRC codes through their parity-check matrices. Our study includes determining the structural properties of optimal $(r,\delta)$-LRC codes, their constructions, and their complete classification over $\mathbb{F}_4$ by browsing all possible parameters. We emphasize that the precise structure of optimal quaternary $(r,\delta)$-LRC codes and their classification, obtained via the parity-check matrix approach, use proof techniques different from those used recently for optimal binary and ternary $(r,\delta)$-LRC codes by Hao et al. in [IEEE Trans. Inf. Theory, 2020, 66(12): 7465-7474].
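For readers unfamiliar with the optimality criterion, the sketch below checks the generalized Singleton bound $d \le n - k + 1 - (\lceil k/r \rceil - 1)(\delta - 1)$ with equality. The numeric parameters are chosen only to illustrate the arithmetic, not to assert the existence of such a code.

```python
from math import ceil

def meets_generalized_singleton(n, k, d, r, delta):
    """Check whether an [n, k, d] code with (r, delta) locality attains
    the generalized Singleton bound
        d <= n - k + 1 - (ceil(k / r) - 1) * (delta - 1)
    with equality, i.e. whether it is an optimal (r, delta)-LRC code."""
    bound = n - k + 1 - (ceil(k / r) - 1) * (delta - 1)
    return d == bound

# Hypothetical parameters, purely to show the arithmetic:
# bound = 12 - 6 + 1 - (2 - 1) * (2 - 1) = 6.
print(meets_generalized_singleton(n=12, k=6, d=6, r=3, delta=2))  # True
```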
We present a learning-based approach for removing unwanted obstructions, such as window reflections, fence occlusions or raindrops, from a short sequence of images captured by a moving camera. Our method leverages the motion differences between the background and the obstructing elements to recover both layers. Specifically, we alternate between estimating dense optical flow fields of the two layers and reconstructing each layer from the flow-warped images via a deep convolutional neural network. The learning-based layer reconstruction allows us to accommodate potential errors in the flow estimation and brittle assumptions such as brightness consistency. We show that training on synthetically generated data transfers well to real images. Our results on numerous challenging scenarios of reflection and fence removal demonstrate the effectiveness of the proposed method.
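A heavily simplified sketch of the alternating scheme just described, with dummy stand-ins (zero flow, pixel-wise min/max heuristics) in place of the learned flow estimator and reconstruction CNN; it shows only the control flow, not the actual method.

```python
import numpy as np

def decompose_layers(frames, n_iters=3):
    """Control-flow-only sketch: each iteration would warp the frames by
    the current background / obstruction flow fields and reconstruct
    each layer from its flow-warped stack with a CNN. Here the warps
    are identity (dummy zero flow) and the 'reconstructions' are
    pixel-wise min/max heuristics, purely so the loop runs end to end."""
    frames = np.stack(frames).astype(float)
    background = frames.mean(axis=0)
    for _ in range(n_iters):
        warped_bg = frames              # stand-in for flow-warped frames
        warped_ob = frames
        background = warped_bg.min(axis=0)   # reflections are additive
        obstruction = warped_ob.max(axis=0) - background
    return background, obstruction
```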
Matter evolved under the influence of gravity from minuscule density fluctuations. Non-perturbative structure formed hierarchically over all scales and developed the non-Gaussian features of the Universe known as the Cosmic Web. Fully understanding the structure formation of the Universe is one of the holy grails of modern astrophysics. Astrophysicists survey large volumes of the Universe and employ a large ensemble of computer simulations to compare with the observed data in order to extract the full information about our own Universe. However, evolving trillions of galaxies over billions of years, even with the simplest physics, is a daunting task. We build a deep neural network, the Deep Density Displacement Model (hereafter D$^3$M), to predict the non-linear structure formation of the Universe from simple linear perturbation theory. Our extensive analysis demonstrates that D$^3$M outperforms second-order perturbation theory (hereafter 2LPT), the commonly used fast approximate simulation method, in point-wise comparison and in the 2-point and 3-point correlation functions. We also show that D$^3$M is able to accurately extrapolate far beyond its training data and predict structure formation for significantly different cosmological parameters. Our study proves, for the first time, that deep learning is a practical and accurate alternative to approximate simulations of the gravitational structure formation of the Universe.
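The "simple linear perturbation theory" input can be made concrete: the sketch below computes the first-order (Zel'dovich) displacement field from a linear overdensity grid via FFTs. Conventions and normalisations here are assumptions; this shows the flavour of perturbation-theory input from which a model like D$^3$M predicts non-linear displacements, not the paper's actual pipeline.

```python
import numpy as np

def zeldovich_displacement(delta_lin, box=1.0):
    """First-order (Zel'dovich) displacement from a linear overdensity
    grid: psi = -grad(phi) with laplacian(phi) = delta, solved in
    Fourier space. Returns an (n, n, n, 3) displacement field."""
    n = delta_lin.shape[0]
    k = 2.0 * np.pi * np.fft.fftfreq(n, d=box / n)
    kx, ky, kz = np.meshgrid(k, k, k, indexing="ij")
    k2 = kx**2 + ky**2 + kz**2
    k2[0, 0, 0] = 1.0                       # avoid division by zero
    phi_k = -np.fft.fftn(delta_lin) / k2    # inverse Laplacian
    phi_k[0, 0, 0] = 0.0                    # zero the mean mode
    psi = [np.fft.ifftn(-1j * kk * phi_k).real for kk in (kx, ky, kz)]
    return np.stack(psi, axis=-1)

# disp = zeldovich_displacement(np.random.randn(32, 32, 32))
```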
Deep structured models are widely used for tasks like semantic segmentation, where explicit correlations between variables provide important prior information that generally helps to reduce the data needs of deep nets. However, current deep structured models are restricted by an oftentimes very local neighborhood structure, which cannot be enlarged for reasons of computational complexity, and by the fact that the output configuration, or a representation thereof, cannot be transformed further. Very recent approaches that address those issues include graphical model inference inside deep nets so as to permit subsequent non-linear output space transformations. However, optimization of those formulations is challenging and not well understood. Here, we develop a novel model which generalizes existing approaches, such as structured prediction energy networks, and discuss a formulation which maintains applicability of existing inference techniques.
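As a toy instance of the energy-network family mentioned above (structured prediction energy networks), the following PyTorch sketch defines an energy over a relaxed output and performs inference by unrolled gradient descent; the architecture, sizes, and step counts are illustrative assumptions, not the paper's model.

```python
import torch
import torch.nn as nn

class EnergyNet(nn.Module):
    """Toy structured-prediction energy network: a local term couples
    features f(x) to a relaxed output y in (0, 1)^d, a small MLP g adds
    a non-linear global term, and inference is unrolled gradient
    descent on the logits of y."""
    def __init__(self, d_in, d_out, d_hid=32):
        super().__init__()
        self.f = nn.Linear(d_in, d_out)
        self.g = nn.Sequential(nn.Linear(d_out, d_hid), nn.Softplus(),
                               nn.Linear(d_hid, 1))

    def energy(self, x, y):
        return -(y * self.f(x)).sum(-1, keepdim=True) + self.g(y)

    def infer(self, x, steps=20, lr=0.5):
        z = torch.zeros(x.shape[0], self.f.out_features, requires_grad=True)
        for _ in range(steps):
            e = self.energy(x, torch.sigmoid(z)).sum()
            (grad,) = torch.autograd.grad(e, z)
            z = (z - lr * grad).detach().requires_grad_(True)
        return torch.sigmoid(z)

# y = EnergyNet(8, 4).infer(torch.randn(2, 8))  # relaxed structured output
```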
The per-pixel cross-entropy loss (CEL) has been widely used in structured output prediction tasks as a spatial extension of generic image classification. However, its i.i.d. assumption neglects the structural regularity present in natural images. Various attempts have been made to incorporate structural reasoning, mostly through structure priors in a cooperative way, where co-occurring patterns are encouraged. We, on the other hand, approach this problem from an opposing angle and propose a new framework for training such structured prediction networks via an adversarial process, in which we train a structure analyzer that provides the supervisory signals: the adversarial structure matching loss (ASML). The structure analyzer is trained to maximize ASML, i.e. to exaggerate recurring structural mistakes, usually among co-occurring patterns. The structured output prediction network, on the contrary, is trained to reduce those mistakes and is thus enabled to distinguish fine-grained structures. As a result, training structured output prediction networks using ASML reduces contextual confusion among objects and improves boundary localization. We demonstrate that ASML outperforms its counterpart CEL, especially in context and boundary aspects, on figure-ground segmentation and semantic segmentation tasks with various base architectures, such as FCN, U-Net, DeepLab, and PSPNet.
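A minimal sketch of the alternating min-max training implied by this abstract, assuming an L2 distance between analyzer features of the prediction and of the ground truth; the networks, optimizers, and distance are placeholders standing in for the actual ASML components.

```python
import torch

def asm_step(segmenter, analyzer, opt_seg, opt_ana, image, target_onehot):
    """One alternating update: the analyzer ascends the structure-matching
    distance between its features of the prediction and of the ground
    truth (exaggerating structural mistakes); the segmenter then descends
    the same distance."""
    pred = segmenter(image)
    # Analyzer update (gradient ascent via a negated loss).
    loss_ana = -((analyzer(pred.detach()) - analyzer(target_onehot)) ** 2).mean()
    opt_ana.zero_grad(); loss_ana.backward(); opt_ana.step()
    # Segmenter update (gradient descent on the matching distance).
    loss_seg = ((analyzer(pred) - analyzer(target_onehot)) ** 2).mean()
    opt_seg.zero_grad(); loss_seg.backward(); opt_seg.step()
    return loss_seg.item()
```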
Dynamic programming (DP) solves a variety of structured combinatorial problems by iteratively breaking them down into smaller subproblems. In spite of their versatility, DP algorithms are usually non-differentiable, which hampers their use as a layer in neural networks trained by backpropagation. To address this issue, we propose to smooth the max operator in the dynamic programming recursion, using a strongly convex regularizer. This allows us to relax both the optimal value and the solution of the original combinatorial problem, and turns a broad class of DP algorithms into differentiable operators. Theoretically, we provide a new probabilistic perspective on backpropagating through these DP operators, and relate them to inference in graphical models. We derive two particular instantiations of our framework: a smoothed Viterbi algorithm for sequence prediction and a smoothed DTW algorithm for time-series alignment. We showcase these instantiations on two structured prediction tasks and on structured and sparse attention for neural machine translation.
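The key construction is easy to state concretely: with the entropic regularizer, the smoothed max is a $\gamma$-scaled log-sum-exp, and its gradient is the softmax, i.e. a differentiable relaxation of the one-hot argmax. A minimal NumPy sketch (the entropic choice is one instance of the strongly convex regularizers the paper allows):

```python
import numpy as np

def smoothed_max(x, gamma=1.0):
    """Entropy-regularized max: gamma * logsumexp(x / gamma). Recovers
    the hard max as gamma -> 0, and is smooth for gamma > 0."""
    x = np.asarray(x, dtype=float)
    m = x.max()
    return m + gamma * np.log(np.exp((x - m) / gamma).sum())

def smoothed_argmax(x, gamma=1.0):
    """Gradient of smoothed_max, i.e. softmax(x / gamma): a
    differentiable relaxation of the one-hot argmax."""
    x = np.asarray(x, dtype=float)
    z = np.exp((x - x.max()) / gamma)
    return z / z.sum()

# Replacing every max in a DP recursion (Viterbi, DTW, ...) by
# smoothed_max makes the whole recursion differentiable.
print(smoothed_max([1.0, 2.0, 3.0], gamma=0.1))     # ~3.0000
print(smoothed_argmax([1.0, 2.0, 3.0], gamma=0.1))  # ~[0, 0, 1]
```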
We propose an Active Learning approach to image segmentation that exploits geometric priors to streamline the annotation process. We demonstrate this for both background-foreground and multi-class segmentation tasks in 2D images and 3D image volumes. Our approach combines geometric smoothness priors in the image space with more traditional uncertainty measures to estimate which pixels or voxels are most in need of annotation. For multi-class settings, we additionally introduce two novel criteria for uncertainty. In the 3D case, we use the resulting uncertainty measure to present the annotator with voxels lying on the same planar patch, which makes batch annotation much easier than if they were randomly distributed in the volume. The planar patch is found using a branch-and-bound algorithm that selects the patch with the most informative instances. We evaluate our approach on Electron Microscopy and Magnetic Resonance image volumes, as well as on regular images of horses and faces. We demonstrate a substantial performance increase over state-of-the-art approaches.
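A toy scoring function in the spirit of combining uncertainty with geometric smoothness priors, as described above; the entropy term, the disagreement-with-smoothed-prediction term, and their weighting are illustrative assumptions rather than the paper's actual criteria.

```python
import numpy as np

def annotation_priority(prob, prob_smoothed, w_geo=0.5):
    """Toy per-pixel priority: predictive entropy (uncertainty) plus a
    weighted disagreement with a geometrically smoothed copy of the
    prediction. prob has shape (..., n_classes) and sums to 1 along
    the last axis; high scores mark candidates for annotation."""
    eps = 1e-12
    entropy = -(prob * np.log(prob + eps)).sum(axis=-1)
    geo_disagreement = np.abs(prob - prob_smoothed).sum(axis=-1)
    return entropy + w_geo * geo_disagreement
```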
The Residual Networks of Residual Networks (RoR) model exhibits excellent performance in the image classification task, but sharply increasing the number of feature map channels makes the transmission of characteristic information incoherent, which loses some of the information relevant to classification prediction and limits classification performance. In this paper, a Pyramidal RoR network model is proposed by analysing the performance characteristics of RoR and combining it with PyramidNet. Firstly, based on RoR, a Pyramidal RoR network model with gradually increasing channels is designed. Secondly, we analyse the effect of different residual block structures on performance and choose the residual block structure that best favours classification performance. Finally, we further optimize Pyramidal RoR networks with drop-path, which avoids over-fitting and saves training time. Image classification experiments were performed on the CIFAR-10/100 and SVHN datasets, where we achieved the currently lowest classification error rates of 2.96%, 16.40%, and 1.59%, respectively. Experiments show that the Pyramidal RoR network optimization method improves network performance on different datasets and effectively suppresses the vanishing-gradient problem in DCNN training.
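A sketch of what a pyramidal residual block with drop-path might look like in PyTorch, assuming a zero-padded shortcut for the growing channel count and standard stochastic-depth scaling; the actual residual block structure chosen in the paper may differ.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class PyramidalDropPathBlock(nn.Module):
    """Sketch: a residual block whose channel count grows from c_in to
    c_out (pyramidal widening, zero-padded shortcut), with drop-path /
    stochastic depth applied to the residual branch."""
    def __init__(self, c_in, c_out, p_drop=0.2):
        super().__init__()
        self.body = nn.Sequential(
            nn.BatchNorm2d(c_in),
            nn.Conv2d(c_in, c_out, 3, padding=1, bias=False),
            nn.BatchNorm2d(c_out), nn.ReLU(inplace=True),
            nn.Conv2d(c_out, c_out, 3, padding=1, bias=False),
            nn.BatchNorm2d(c_out))
        self.extra = c_out - c_in
        self.p_drop = p_drop

    def forward(self, x):
        res = self.body(x)
        if self.training:
            if torch.rand(()) < self.p_drop:
                res = torch.zeros_like(res)     # drop the residual path
        else:
            res = res * (1.0 - self.p_drop)     # expected-value scaling
        shortcut = F.pad(x, (0, 0, 0, 0, 0, self.extra))  # zero-pad channels
        return shortcut + res
```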