We present an application of multi-mesh finite element methods as part of a methodology for optimizing settlement layouts. By formulating a multi-objective optimization problem, we demonstrate how a given number of buildings may be optimally placed on a given piece of land with respect to both wind conditions and the view experienced from the buildings. The wind flow is modeled by a multi-mesh (cut finite element) method. This allows each building to be embedded in a boundary-fitted mesh which can be moved freely on top of a fixed background mesh. This approach enables a multitude of settlement layouts to be evaluated without the need for costly mesh generation when changing the configuration of buildings. The view is modeled by a measure that takes into account the totality of unobstructed view from the collection of buildings, and is efficiently computed by rasterization.
We describe the R package glmmrBase and an extension glmmrOptim. glmmrBase provides a flexible approach to specifying and analysing generalised linear mixed models. We use an object-orientated class system within R to provide methods for a wide range of covariance and mean functions relevant to multiple applications including cluster randomised trials, cohort studies, spatial and spatio-temporal modelling, and split-plot designs. The class generates relevant matrices and statistics and a wide range of methods including full likelihood estimation of generalised linear mixed models using Markov Chain Monte Carlo Maximum Likelihood, Laplace approximation, power calculation, and access to relevant calculations. The class also includes Hamiltonian Monte Carlo simulation of random effects, sparse matrix methods, and other functionality to support efficient estimation. The glmmrOptim package implements a set of algorithms to identify c-optimal experimental designs where observations are correlated and can be specified using a generalised linear mixed model. Several examples and comparisons to existing packages are provided to illustrate use of the packages.
The concepts of sparsity and regularised estimation have proven useful in many high-dimensional statistical applications. Dynamic factor models (DFMs) provide a parsimonious approach to modelling high-dimensional time series; however, it is often hard to interpret the meaning of the latent factors. This paper formally introduces a class of sparse DFMs in which the loading matrices are constrained to have few non-zero entries, thus increasing the interpretability of the factors. We present a regularised M-estimator for the model parameters, and construct an efficient expectation maximisation algorithm to enable estimation. Synthetic experiments demonstrate consistency in terms of estimating the loading structure, and superior predictive performance where a low-rank factor structure may be appropriate. The utility of the method is further illustrated in an application forecasting electricity consumption across a large set of smart meters.
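To make the idea of a sparsity-inducing constraint on the loadings concrete, the sketch below applies soft-thresholding, the proximal step associated with an L1 penalty, to a small loading matrix. The L1 penalty, the function names, and the numbers are assumptions chosen for illustration; the abstract does not state which penalty the regularised M-estimator uses.

```python
def soft_threshold(value, lam):
    """Shrink a single value toward zero by lam; values within [-lam, lam] become 0."""
    if value > lam:
        return value - lam
    if value < -lam:
        return value + lam
    return 0.0


def sparsify_loadings(loadings, lam):
    """Apply soft-thresholding entrywise to a loading matrix (list of rows).

    Hypothetical helper: an L1 penalty is assumed here, not taken from the source.
    """
    return [[soft_threshold(v, lam) for v in row] for row in loadings]


# Illustrative 2 x 2 loading matrix (made-up numbers).
dense = [[1.0, -0.1], [0.05, -2.0]]
sparse = sparsify_loadings(dense, lam=0.25)
```

Small loadings are set exactly to zero while large ones are only shrunk, which is what lets each factor be read off from the few series it loads on.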
Occluded person re-identification (Re-ID) aims to address the potential occlusion problem when matching occluded or holistic pedestrians from different camera views. Many methods use the background as artificial occlusion and rely on attention networks to exclude noisy interference. However, the significant discrepancy between simple background occlusion and realistic occlusion can negatively impact the generalization of the network. To address this issue, we propose a novel transformer-based Attention Disturbance and Dual-Path Constraint Network (ADP) to enhance the generalization of attention networks. Firstly, to imitate real-world obstacles, we introduce an Attention Disturbance Mask (ADM) module that generates an offensive noise, which can distract attention like a realistic occluder, as a more complex form of occlusion. Secondly, to fully exploit these complex occluded images, we develop a Dual-Path Constraint Module (DPC) that can obtain preferable supervision information from holistic images through dual-path interaction. With our proposed method, the network can effectively circumvent a wide variety of occlusions using the basic ViT baseline. Comprehensive experimental evaluations conducted on person Re-ID benchmarks demonstrate the superiority of ADP over state-of-the-art methods.
In real-world scenarios, it may not always be possible to collect hundreds of labeled samples per class for training deep learning-based SAR Automatic Target Recognition (ATR) models. This work specifically tackles the few-shot SAR ATR problem, where only a handful of labeled samples may be available to support the task of interest. Our approach is composed of two stages. In the first, a global representation model is trained via self-supervised learning on a large pool of diverse and unlabeled SAR data. In the second stage, the global model is used as a fixed feature extractor and a classifier is trained to partition the feature space given the few-shot support samples, while simultaneously being calibrated to detect anomalous inputs. Unlike competing approaches which require a pristine labeled dataset for pretraining via meta-learning, our approach learns highly transferable features from unlabeled data that have little-to-no relation to the downstream task. We evaluate our method in standard and extended MSTAR operating conditions and find it to achieve high accuracy and robust out-of-distribution detection in many different few-shot settings. Our results are particularly significant because they show the merit of a global model approach to SAR ATR, which makes minimal assumptions, and provides many axes for extendability.
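The second stage described above (a classifier fitted on frozen features that also flags anomalous inputs) can be illustrated with a very small stand-in. The sketch below assumes a nearest-centroid classifier with a distance cutoff for rejection; this choice, the function names, the cutoff, and the toy 2-D features are all assumptions for illustration, since the abstract does not specify the classifier or the calibration procedure.

```python
import math


def class_centroids(features, labels):
    """Average the frozen-encoder feature vectors of each class's support samples."""
    sums, counts = {}, {}
    for vec, label in zip(features, labels):
        if label not in sums:
            sums[label] = [0.0] * len(vec)
            counts[label] = 0
        sums[label] = [s + v for s, v in zip(sums[label], vec)]
        counts[label] += 1
    return {c: [s / counts[c] for s in sums[c]] for c in sums}


def classify_or_reject(vec, centroids, max_distance):
    """Return the nearest class label, or None if every centroid is too far away."""
    best_label, best_dist = None, math.inf
    for label, center in centroids.items():
        dist = math.dist(vec, center)
        if dist < best_dist:
            best_label, best_dist = label, dist
    return best_label if best_dist <= max_distance else None


# Two support samples per class, with made-up 2-D "features".
support = [[0.0, 0.0], [0.0, 2.0], [10.0, 10.0], [10.0, 12.0]]
labels = ["tank", "tank", "truck", "truck"]
centroids = class_centroids(support, labels)
```

With these toy values, `classify_or_reject([0.5, 1.0], centroids, 3.0)` selects the nearby class, while a far-away vector such as `[50.0, 50.0]` is rejected as out-of-distribution.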
With the rising complexity of numerous novel applications that serve our modern society comes the strong need to design efficient computing platforms. Designing efficient hardware is, however, a complex multi-objective problem that deals with multiple parameters and their interactions. Given that there are a large number of parameters and objectives involved in hardware design, synthesizing all possible combinations is not a feasible method to find the optimal solution. One promising approach to tackle this problem is statistical modeling of a desired hardware performance. Here, we propose a model-based active learning approach to solve this problem. Our proposed method uses Bayesian models to characterize various aspects of hardware performance. We also use transfer learning and Gaussian regression bootstrapping techniques in conjunction with active learning to create more accurate models. Our proposed statistical modeling method provides hardware models that are sufficiently accurate to perform design space exploration as well as performance prediction simultaneously. We use our proposed method to perform design space exploration and performance prediction for various hardware setups, such as micro-architecture design and OpenCL kernels for FPGA targets. Our experiments show that the number of samples required to create performance models is significantly reduced while the predictive power of our proposed statistical models is maintained. For instance, in our performance prediction setting, the proposed method needs 65% fewer samples to create the model, and in the design space exploration setting, our proposed method can find the best parameter settings by exploring fewer than 50 samples.
This paper proposes a criterion for detecting change structures in tensor data. Some tensors have a structural mode that should not be treated on an equal footing with the other modes when the difference between two adjacent tensors is summarized in a single distance. To accommodate this, we define a mode-based signal-screening Frobenius distance for the moving sums of slices of tensor data, which handles both dense and sparse model structures of the tensors. As a general distance, it can also deal with the case without a structural mode. Based on the distance, we then construct signal statistics using ratios with adaptive-to-change ridge functions. The number of changes and their locations can then be consistently estimated in certain senses, and confidence intervals for the locations of the change points are constructed. The results hold when the size of the tensor and the number of change points diverge at certain rates. Numerical studies are conducted to examine the finite-sample performance of the proposed method. We also analyze two real data examples for illustration.
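The moving-sum idea can be illustrated with a plain Frobenius distance between the averages of two adjacent windows of slices. The sketch below is a simplification under stated assumptions: it omits the mode-based signal screening and the ridge-ratio statistics described above, and the function names, window width `h`, and toy data are hypothetical.

```python
import math


def window_mean(slices, start, stop):
    """Entrywise mean of the matrix slices with indices start..stop-1."""
    rows, cols = len(slices[0]), len(slices[0][0])
    width = stop - start
    return [
        [sum(slices[t][i][j] for t in range(start, stop)) / width for j in range(cols)]
        for i in range(rows)
    ]


def mosum_distance(slices, t, h):
    """Frobenius norm of the difference between the window means after and before t."""
    left = window_mean(slices, t - h, t)
    right = window_mean(slices, t, t + h)
    squared = sum(
        (r - l) ** 2
        for left_row, right_row in zip(left, right)
        for l, r in zip(left_row, right_row)
    )
    return math.sqrt(squared)


def scan(slices, h):
    """Evaluate the distance at every time point that has a full window on both sides."""
    return {t: mosum_distance(slices, t, h) for t in range(h, len(slices) - h + 1)}


# Toy sequence: three all-zero 2 x 2 slices followed by three all-one slices.
zeros = [[0.0, 0.0], [0.0, 0.0]]
ones = [[1.0, 1.0], [1.0, 1.0]]
scores = scan([zeros] * 3 + [ones] * 3, h=2)
estimated_change = max(scores, key=scores.get)
```

In this toy sequence the distance peaks at the index where the slices switch from zeros to ones, which is the behaviour a change-point statistic built on such distances relies on.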
Having reliable specifications is an unavoidable challenge in achieving verifiable correctness, robustness, and interpretability of AI systems. Existing specifications for neural networks are in the paradigm of data as specification. That is, the local neighborhood centered on a reference input is considered to be correct (or robust). While existing specifications contribute to verifying adversarial robustness, a significant problem in many research domains, our empirical study shows that those verified regions are somewhat tight, and thus fail to allow verification of test set inputs, making them impractical for some real-world applications. To this end, we propose a new family of specifications called neural representation as specification, which uses the intrinsic information of neural networks, namely neural activation patterns (NAPs), rather than input data, to specify the correctness and/or robustness of neural network predictions. We present a simple statistical approach to mining neural activation patterns. To show the effectiveness of discovered NAPs, we formally verify several important properties, such as that various types of misclassification never occur for a given NAP, and that there is no ambiguity between different NAPs. We show that by using NAPs, we can verify a significant region of the input space, while still recalling 84% of the data on MNIST. Moreover, we can push the verifiable bound to 10 times larger on the CIFAR10 benchmark. Thus, we argue that NAPs can potentially be used as a more reliable and extensible specification for neural network verification.
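One way a statistical mining step for activation patterns could look is sketched below: a neuron is fixed to "on" or "off" in the pattern when it is in that state for at least a fraction `delta` of a class's samples, and left unconstrained otherwise. The rule, the names `mine_nap` and `follows_nap`, the value of `delta`, and the toy activations are assumptions for illustration, not the procedure confirmed by the abstract.

```python
def mine_nap(activations, delta=0.95):
    """Derive a pattern from per-sample post-ReLU activations of one class.

    Each entry of the result is 1 (almost always active), 0 (almost always
    inactive), or None (no constraint on that neuron).
    """
    n_samples = len(activations)
    width = len(activations[0])
    pattern = []
    for j in range(width):
        frac_on = sum(1 for row in activations if row[j] > 0) / n_samples
        if frac_on >= delta:
            pattern.append(1)
        elif frac_on <= 1 - delta:
            pattern.append(0)
        else:
            pattern.append(None)
    return pattern


def follows_nap(row, pattern):
    """Check whether one sample's activations agree with every constrained neuron."""
    for value, state in zip(row, pattern):
        if state is None:
            continue
        if (value > 0) != bool(state):
            return False
    return True


# Four samples, three neurons (made-up values).
acts = [[1.0, 0.0, 0.5], [2.0, 0.0, 0.0], [0.3, 0.0, 0.7], [0.9, 0.0, 0.0]]
pattern = mine_nap(acts, delta=0.75)
```

Lowering `delta` constrains more neurons, which shrinks the region described by the pattern; raising it leaves more neurons free and tends to cover more inputs.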
This paper presents a new method for combining (or aggregating or ensembling) multivariate probabilistic forecasts, taking into account dependencies between quantiles and covariates through a smoothing procedure that allows for online learning. Two smoothing methods are discussed: dimensionality reduction using basis matrices and penalized smoothing. The new online learning algorithm generalizes the standard CRPS learning framework to the multivariate setting. It is based on Bernstein Online Aggregation (BOA) and yields optimal asymptotic learning properties. We provide an in-depth discussion on possible extensions of the algorithm and several nested cases related to the existing literature on online forecast combination. The methodology is applied to forecasting day-ahead electricity prices, which are 24-dimensional distributional forecasts. The proposed method yields significant improvements over uniform combination in terms of continuous ranked probability score (CRPS). We discuss the temporal evolution of the weights and hyperparameters and present the results of reduced versions of the preferred model. A fast C++ implementation of all discussed methods is provided in the R package profoc.
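The building blocks of quantile-based online combination can be shown in a few lines: the pinball (quantile) loss that underlies CRPS-type scoring, and a weight update that shifts mass toward experts with lower loss. The update below is a plain exponentially weighted average, swapped in for simplicity; it is not BOA and it does no smoothing across quantiles. All names, the learning rate `eta`, and the numbers are illustrative assumptions, and none of this reflects the profoc API.

```python
import math


def pinball_loss(y, q, tau):
    """Quantile loss of predicting q for the tau-quantile when y is observed."""
    diff = y - q
    return max(tau * diff, (tau - 1.0) * diff)


def ewa_update(weights, losses, eta):
    """Exponentially weighted average update (a simple stand-in, not BOA)."""
    scaled = [w * math.exp(-eta * loss) for w, loss in zip(weights, losses)]
    total = sum(scaled)
    return [s / total for s in scaled]


def combine(weights, expert_quantiles):
    """Weighted average of the experts' forecasts for one quantile level."""
    return sum(w * q for w, q in zip(weights, expert_quantiles))


# Two experts forecast the 0.9-quantile; the realised value is 10.
tau, y = 0.9, 10.0
forecasts = [8.0, 12.0]
losses = [pinball_loss(y, q, tau) for q in forecasts]
weights = ewa_update([0.5, 0.5], losses, eta=1.0)
next_forecast = combine(weights, forecasts)
```

Here the expert that under-predicted the upper quantile incurs the larger loss, so the updated weights favour the second expert and the combined forecast moves above the uniform average.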
Triply periodic minimal surface (TPMS) metamaterials characterized by mathematically controlled topologies exhibit better mechanical properties than uniform structures. The unit cell topology of such metamaterials can be further optimized to improve a desired mechanical property for a specific application. However, such inverse design involves multiple costly 3D finite element analyses in topology optimization and hence has not been attempted. Data-driven models have recently gained popularity as surrogate models in the geometrical design of metamaterials. Gyroid-like unit cells are designed using a novel voxel algorithm, a homogenization-based topology optimization, and a Heaviside filter to attain optimized densities in a 0-1 configuration. A small set of optimization results is used as input-output pairs for supervised learning of the topology optimization process with a 3D CNN model. These models could then be used to instantaneously predict the optimized unit cell geometry for any topology parameters, thus alleviating the need to run any topology optimization for future design. The high accuracy of the model was demonstrated by a low mean squared error and a high Dice coefficient. This accelerated design of 3D metamaterials opens the possibility of tackling computationally costly design problems involving complex metamaterial geometry with multi-objective properties or multi-scale applications.
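The two accuracy metrics named above are easy to state on flattened voxel grids. The sketch below computes a Dice coefficient and a mean squared error between a predicted and a reference 0-1 density field, after binarising the prediction with a hard threshold. The hard threshold is used as the sharp limit of a Heaviside projection and is an assumption, as are the function names and the four-voxel toy data.

```python
def heaviside_project(density, threshold=0.5):
    """Map continuous densities to 0/1 with a hard step (assumed sharp limit)."""
    return [1 if d >= threshold else 0 for d in density]


def dice_coefficient(pred, target):
    """Dice = 2 * |overlap| / (|pred| + |target|) for binary voxel lists."""
    overlap = sum(p * t for p, t in zip(pred, target))
    total = sum(pred) + sum(target)
    return 1.0 if total == 0 else 2.0 * overlap / total


def mean_squared_error(pred, target):
    """Average squared voxel-wise difference."""
    return sum((p - t) ** 2 for p, t in zip(pred, target)) / len(pred)


# Flattened toy unit cell with four voxels (made-up densities).
predicted = heaviside_project([0.9, 0.2, 0.7, 0.1])
reference = [1, 0, 0, 0]
dice = dice_coefficient(predicted, reference)
mse = mean_squared_error(predicted, reference)
```

A Dice value of 1 means the predicted solid voxels coincide exactly with the reference, so the metric rewards getting the geometry right rather than matching the large empty regions.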
Many tasks in natural language processing can be viewed as multi-label classification problems. However, most of the existing models are trained with the standard cross-entropy loss function and use a fixed prediction policy (e.g., a threshold of 0.5) for all the labels, which completely ignores the complexity and dependencies among different labels. In this paper, we propose a meta-learning method to capture these complex label dependencies. More specifically, our method utilizes a meta-learner to jointly learn the training policies and prediction policies for different labels. The training policies are then used to train the classifier with the cross-entropy loss function, and the prediction policies are further implemented for prediction. Experimental results on fine-grained entity typing and text classification demonstrate that our proposed method can obtain more accurate multi-label classification results.
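The difference between a fixed prediction policy and label-specific ones can be seen in a tiny example. The sketch below compares a single 0.5 threshold with per-label thresholds; the thresholds here are hand-picked, hypothetical values standing in for what a meta-learner would produce, and the label names and probabilities are made up.

```python
def predict_labels(probabilities, thresholds):
    """Return every label whose predicted probability reaches its own threshold."""
    return [name for name, p in probabilities.items() if p >= thresholds[name]]


# Classifier outputs for one mention in a fine-grained entity typing task.
probabilities = {"person": 0.9, "artist": 0.4, "musician": 0.35}

# Fixed policy: the same cutoff for every label.
fixed = {name: 0.5 for name in probabilities}

# Label-specific policy (hypothetical values, not learned here).
per_label = {"person": 0.5, "artist": 0.3, "musician": 0.4}

fixed_prediction = predict_labels(probabilities, fixed)
adaptive_prediction = predict_labels(probabilities, per_label)
```

With the fixed cutoff only the coarse type is predicted, whereas the label-specific cutoffs also recover a finer type that the classifier scored below 0.5.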