Data-driven research in Additive Manufacturing (AM) has gained significant success in recent years. This has led to a plethora of scientific literature to emerge. The knowledge in these works consists of AM and Artificial Intelligence (AI) contexts that have not been mined and formalized in an integrated way. Moreover, no tools or guidelines exist to support data-driven knowledge transfer from one context to another. As a result, data-driven solutions using specific AI techniques are being developed and validated only for specific AM process technologies. There is a potential to exploit the inherent similarities across various AM technologies and adapt the existing solutions from one process or problem to another using AI, such as Transfer Learning. We propose a three-step knowledge transferability analysis framework in AM to support data-driven AM knowledge transfer. As a prerequisite to transferability analysis, AM knowledge is featurized into identified knowledge components. The framework consists of pre-transfer, transfer, and post-transfer steps to accomplish knowledge transfer. A case study is conducted between flagship metal AM processes. Laser Powder Bed Fusion (LPBF) is the source of knowledge motivated by its relative matureness in applying AI over Directed Energy Deposition (DED), which drives the need for knowledge transfer as the less explored target process. We show successful transfer at different levels of the data-driven solution, including data representation, model architecture, and model parameters. The pipeline of AM knowledge transfer can be automated in the future to allow efficient cross-context or cross-process knowledge exchange.
Threshold selection is a fundamental problem in any threshold-based extreme value analysis. While models are asymptotically motivated, selecting an appropriate threshold for finite samples can be difficult through standard methods. Inference can also be highly sensitive to the choice of threshold. Too low a threshold choice leads to bias in the fit of the extreme value model, while too high a choice leads to unnecessary additional uncertainty in the estimation of model parameters. In this paper, we develop a novel methodology for automated threshold selection that directly tackles this bias-variance trade-off. We also develop a method to account for the uncertainty in this threshold choice and propagate this uncertainty through to high quantile inference. Through a simulation study, we demonstrate the effectiveness of our method for threshold selection and subsequent extreme quantile estimation. We apply our method to the well-known, troublesome example of the River Nidd dataset.
The success of over-parameterized neural networks trained to near-zero training error has caused great interest in the phenomenon of benign overfitting, where estimators are statistically consistent even though they interpolate noisy training data. While benign overfitting in fixed dimension has been established for some learning methods, current literature suggests that for regression with typical kernel methods and wide neural networks, benign overfitting requires a high-dimensional setting where the dimension grows with the sample size. In this paper, we show that the smoothness of the estimators, and not the dimension, is the key: benign overfitting is possible if and only if the estimator's derivatives are large enough. We generalize existing inconsistency results to non-interpolating models and more kernels to show that benign overfitting with moderate derivatives is impossible in fixed dimension. Conversely, we show that rate-optimal benign overfitting is possible for regression with a sequence of spiky-smooth kernels with large derivatives. Using neural tangent kernels, we translate our results to wide neural networks. We prove that while infinite-width networks do not overfit benignly with the ReLU activation, this can be fixed by adding small high-frequency fluctuations to the activation function. Our experiments verify that such neural networks, while overfitting, can indeed generalize well even on low-dimensional data sets.
With the advent of vehicles equipped with advanced driver-assistance systems, such as adaptive cruise control (ACC) and other automated driving features, the potential for cyberattacks on these automated vehicles (AVs) has emerged. While overt attacks that force vehicles to collide may be easily identified, more insidious attacks, which only slightly alter driving behavior, can result in network-wide increases in congestion, fuel consumption, and even crash risk without being easily detected. To address the detection of such attacks, we first present a traffic model framework for three types of potential cyberattacks: malicious manipulation of vehicle control commands, false data injection attacks on sensor measurements, and denial-of-service (DoS) attacks. We then investigate the impacts of these attacks at both the individual vehicle (micro) and traffic flow (macro) levels. A novel generative adversarial network (GAN)-based anomaly detection model is proposed for real-time identification of such attacks using vehicle trajectory data. We provide numerical evidence {to demonstrate} the efficacy of our machine learning approach in detecting cyberattacks on ACC-equipped vehicles. The proposed method is compared against some recently proposed neural network models and observed to have higher accuracy in identifying anomalous driving behaviors of ACC vehicles.
Neural networks (NNs) are primarily developed within the frequentist statistical framework. Nevertheless, frequentist NNs lack the capability to provide uncertainties in the predictions, and hence their robustness can not be adequately assessed. Conversely, the Bayesian neural networks (BNNs) naturally offer predictive uncertainty by applying Bayes' theorem. However, their computational requirements pose significant challenges. Moreover, both frequentist NNs and BNNs suffer from overfitting issues when dealing with noisy and sparse data, which render their predictions unwieldy away from the available data space. To address both these problems simultaneously, we leverage insights from a hierarchical setting in which the parameter priors are conditional on hyperparameters to construct a BNN by applying a semi-analytical framework known as nonlinear sparse Bayesian learning (NSBL). We call our network sparse Bayesian neural network (SBNN) which aims to address the practical and computational issues associated with BNNs. Simultaneously, imposing a sparsity-inducing prior encourages the automatic pruning of redundant parameters based on the automatic relevance determination (ARD) concept. This process involves removing redundant parameters by optimally selecting the precision of the parameters prior probability density functions (pdfs), resulting in a tractable treatment for overfitting. To demonstrate the benefits of the SBNN algorithm, the study presents an illustrative regression problem and compares the results of a BNN using standard Bayesian inference, hierarchical Bayesian inference, and a BNN equipped with the proposed algorithm. Subsequently, we demonstrate the importance of considering the full parameter posterior by comparing the results with those obtained using the Laplace approximation with and without NSBL.
It is known that the generating function associated with the enumeration of non-backtracking walks on finite graphs is a rational matrix-valued function of the parameter; such function is also closely related to graph-theoretical results such as Ihara's theorem and the zeta function on graphs. In [P. Grindrod, D. J. Higham, V. Noferini, The deformed graph Laplacian and its application to network centrality analysis, SIAM J. Matrix Anal. Appl. 39(1), 310--341, 2018], the radius of convergence of the generating function was studied for simple (i.e., undirected, unweighted and with no loops) graphs, and shown to depend on the number of cycles in the graph. In this paper, we use technologies from the theory of polynomial and rational matrices to greatly extend these results by studying the radius of convergence of the corresponding generating function for general, possibly directed and/or weighted, graphs. We give an analogous characterization of the radius of convergence for directed unweighted graphs, showing that it depends on the number of cycles in the undirectization of the graph. For weighted graphs, we provide for the first time an exact formula for the radius of convergence, improving a previous result that exhibited a lower bound. Finally, we consider also backtracking-downweighted walks on unweighted digraphs, and we prove a version of Ihara's theorem in that case.
Numerous applications in the field of molecular communications (MC) such as healthcare systems are often event-driven. The conventional Shannon capacity may not be the appropriate metric for assessing performance in such cases. We propose the identification (ID) capacity as an alternative metric. Particularly, we consider randomized identification (RI) over the discrete-time Poisson channel (DTPC), which is typically used as a model for MC systems that utilize molecule-counting receivers. In the ID paradigm, the receiver's focus is not on decoding the message sent. However, he wants to determine whether a message of particular significance to him has been sent or not. In contrast to Shannon transmission codes, the size of ID codes for a Discrete Memoryless Channel (DMC) grows doubly exponentially fast with the blocklength, if randomized encoding is used. In this paper, we derive the capacity formula for RI over the DTPC subject to some peak and average power constraints. Furthermore, we analyze the case of state-dependent DTPC.
This proposed model introduces novel deep learning methodologies. The objective here is to create a reliable intrusion detection mechanism to help identify malicious attacks. Deep learning based solution framework is developed consisting of three approaches. The first approach is Long-Short Term Memory Recurrent Neural Network (LSTM-RNN) with seven optimizer functions such as adamax, SGD, adagrad, adam, RMSprop, nadam and adadelta. The model is evaluated on NSL-KDD dataset and classified multi attack classification. The model has outperformed with adamax optimizer in terms of accuracy, detection rate and low false alarm rate. The results of LSTM-RNN with adamax optimizer is compared with existing shallow machine and deep learning models in terms of accuracy, detection rate and low false alarm rate. The multi model methodology consisting of Recurrent Neural Network (RNN), Long-Short Term Memory Recurrent Neural Network (LSTM-RNN), and Deep Neural Network (DNN). The multi models are evaluated on bench mark datasets such as KDD99, NSL-KDD, and UNSWNB15 datasets. The models self-learnt the features and classifies the attack classes as multi-attack classification. The models RNN, and LSTM-RNN provide considerable performance compared to other existing methods on KDD99 and NSL-KDD dataset
Recommender systems build user profiles using concept analysis of usage matrices. The concepts are mined as spectra and form Galois connections. Descent is a general method for spectral decomposition in algebraic geometry and topology which also leads to generalized Galois connections. Both recommender systems and descent theory are vast research areas, separated by a technical gap so large that trying to establish a link would seem foolish. Yet a formal link emerged, all on its own, bottom-up, against authors' intentions and better judgment. Familiar problems of data analysis led to a novel solution in category theory. The present paper arose from a series of earlier efforts to provide a top-down account of these developments.
Deep Learning (DL) is the most widely used tool in the contemporary field of computer vision. Its ability to accurately solve complex problems is employed in vision research to learn deep neural models for a variety of tasks, including security critical applications. However, it is now known that DL is vulnerable to adversarial attacks that can manipulate its predictions by introducing visually imperceptible perturbations in images and videos. Since the discovery of this phenomenon in 2013~[1], it has attracted significant attention of researchers from multiple sub-fields of machine intelligence. In [2], we reviewed the contributions made by the computer vision community in adversarial attacks on deep learning (and their defenses) until the advent of year 2018. Many of those contributions have inspired new directions in this area, which has matured significantly since witnessing the first generation methods. Hence, as a legacy sequel of [2], this literature review focuses on the advances in this area since 2018. To ensure authenticity, we mainly consider peer-reviewed contributions published in the prestigious sources of computer vision and machine learning research. Besides a comprehensive literature review, the article also provides concise definitions of technical terminologies for non-experts in this domain. Finally, this article discusses challenges and future outlook of this direction based on the literature reviewed herein and [2].
Recent advances in 3D fully convolutional networks (FCN) have made it feasible to produce dense voxel-wise predictions of volumetric images. In this work, we show that a multi-class 3D FCN trained on manually labeled CT scans of several anatomical structures (ranging from the large organs to thin vessels) can achieve competitive segmentation results, while avoiding the need for handcrafting features or training class-specific models. To this end, we propose a two-stage, coarse-to-fine approach that will first use a 3D FCN to roughly define a candidate region, which will then be used as input to a second 3D FCN. This reduces the number of voxels the second FCN has to classify to ~10% and allows it to focus on more detailed segmentation of the organs and vessels. We utilize training and validation sets consisting of 331 clinical CT images and test our models on a completely unseen data collection acquired at a different hospital that includes 150 CT scans, targeting three anatomical organs (liver, spleen, and pancreas). In challenging organs such as the pancreas, our cascaded approach improves the mean Dice score from 68.5 to 82.2%, achieving the highest reported average score on this dataset. We compare with a 2D FCN method on a separate dataset of 240 CT scans with 18 classes and achieve a significantly higher performance in small organs and vessels. Furthermore, we explore fine-tuning our models to different datasets. Our experiments illustrate the promise and robustness of current 3D FCN based semantic segmentation of medical images, achieving state-of-the-art results. Our code and trained models are available for download: //github.com/holgerroth/3Dunet_abdomen_cascade.