The API economy refers to the widespread integration of API (application programming interface) microservices, through which software applications communicate with each other, as a crucial element in business models and functions. The number of possible ways in which such a system could be used is huge. It is thus desirable to monitor the usage patterns and identify when the system is used in a way it has never been used before. This provides a warning to the system analysts, who can then ensure uninterrupted operation of the system. In this work we analyze both histograms and call graphs of API usage to determine whether the usage patterns of the system have shifted. We compare the application of nonparametric statistical and Bayesian sequential analysis to the problem. This is done in a way that overcomes the issue of repeated statistical tests and ensures statistical significance of the alerts. The technique was simulated and tested, and was shown to be effective in detecting the drift in various scenarios. We also mention modifications to the technique to decrease its memory so that it can respond more quickly when the distribution drift occurs at a delay from when monitoring begins.
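As a rough illustration of monitoring a usage histogram under repeated testing, the sketch below compares a baseline sample of API calls with successive windows using a permutation test on the total variation distance, and spends the overall significance level across tests with the schedule alpha / (k (k + 1)), which sums to at most alpha. This is not necessarily the procedure used in the work above: the test statistic, the spending schedule, and all function names (`total_variation`, `permutation_p_value`, `alpha_for_test`, `monitor`) are assumptions made for the example.

```python
import random
from collections import Counter


def total_variation(sample_a, sample_b):
    """Total variation distance between the empirical histograms of two samples."""
    hist_a, hist_b = Counter(sample_a), Counter(sample_b)
    keys = set(hist_a) | set(hist_b)
    return 0.5 * sum(
        abs(hist_a[k] / len(sample_a) - hist_b[k] / len(sample_b)) for k in keys
    )


def permutation_p_value(baseline, window, n_perm=999, seed=0):
    """Permutation p-value for 'baseline and window share one distribution'."""
    rng = random.Random(seed)
    observed = total_variation(baseline, window)
    pooled = list(baseline) + list(window)
    n = len(baseline)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)
        # Count relabelings at least as extreme as the observed split.
        if total_variation(pooled[:n], pooled[n:]) >= observed:
            hits += 1
    return (hits + 1) / (n_perm + 1)


def alpha_for_test(k, alpha=0.05):
    """Level for the k-th test (k = 1, 2, ...); the levels sum to at most alpha."""
    return alpha / (k * (k + 1))


def monitor(baseline, windows, alpha=0.05):
    """Return the 1-based index of the first window flagged as drifted, or None."""
    for k, window in enumerate(windows, start=1):
        if permutation_p_value(baseline, window) <= alpha_for_test(k, alpha):
            return k
    return None


if __name__ == "__main__":
    baseline = ["get"] * 60 + ["post"] * 40
    windows = [list(baseline), ["delete"] * 100]
    print(monitor(baseline, windows))
```

Note that a permutation p-value cannot fall below 1 / (n_perm + 1), so with this schedule the number of permutations would have to grow as more windows are tested; a long-running monitor would need a different statistic or schedule.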
Social media plays an increasing role in our communication with friends and family, and our consumption of information and entertainment. Hence, to design effective ranking functions for posts on social media, it would be useful to predict the affective response to a post (e.g., whether the user is likely to be humored, inspired, angered, informed). Similar to work on emotion recognition (which focuses on the affect of the publisher of the post), the traditional approach to recognizing affective response would involve an expensive investment in human annotation of training data. We introduce CARE$_{db}$, a dataset of 230k social media posts annotated according to 7 affective responses using the Common Affective Response Expression (CARE) method. The CARE method is a means of leveraging the signal that is present in comments that are posted in response to a post, providing high-precision evidence about the affective response of the readers to the post without human annotation. Unlike human annotation, the annotation process we describe here can be iterated upon to expand the coverage of the method, particularly for new affective responses. We present experiments that demonstrate that the CARE annotations compare favorably with crowd-sourced annotations. Finally, we use CARE$_{db}$ to train competitive BERT-based models for predicting affective response as well as emotion detection, demonstrating the utility of the dataset for related tasks.
In future sixth-generation (6G) mobile networks, the Internet-of-Everything (IoE) is expected to provide extremely massive connectivity for small battery-powered devices. Indeed, a massive number of devices with limited energy storage capacity imposes a persistent energy demand that limits the lifetime of communication networks. As a remedy, wireless energy transfer (WET) is a key technology to address these critical energy supply issues. On the other hand, cell-free (CF) massive multiple-input multiple-output (MIMO) systems offer an efficient network architecture to realize the roll-out of the IoE. In this article, we first propose the paradigm of reconfigurable intelligent surface (RIS)-aided CF massive MIMO systems for WET, including its potential application scenarios and system architecture. The four-stage transmission procedure is discussed and analyzed to illustrate the practicality of the architecture. Then we put forward and analyze the hardware design of the RIS. In particular, we discuss the three corresponding operating modes and the amalgamation of WET technology and RIS-aided CF massive MIMO. Representative simulation results are given to confirm the superior performance achieved by our proposed schemes. We also investigate the optimal locations for deploying multiple RISs to achieve the best system performance. Finally, several important research directions for RIS-aided CF massive MIMO systems with WET are presented to inspire further investigation.
We develop an efficient Bayesian sequential inference framework for factor analysis models observed via various data types, such as continuous, binary and ordinal data. In the continuous data case, where it is possible to marginalise over the latent factors, the proposed methodology tailors the Iterated Batch Importance Sampling (IBIS) of Chopin (2002) to handle such models and we incorporate Hamiltonian Markov Chain Monte Carlo. For binary and ordinal data, we develop an efficient IBIS scheme to handle the parameter and latent factors, combining with Laplace or Variational Bayes approximations. The methodology can be used in the context of sequential hypothesis testing via Bayes factors, which are known to have advantages over traditional null hypothesis testing. Moreover, the developed sequential framework offers multiple benefits even in non-sequential cases, by providing posterior distribution, model evidence and scoring rules (under the prequential framework) in one go, and by offering a more robust alternative computational scheme to Markov Chain Monte Carlo that can be useful in problematic target distributions.
We introduce a novel, probabilistic binary latent variable model to detect noisy or approximate repeats of patterns in sparse binary data. The model is based on the "Noisy-OR model" (Heckerman, 1990), used previously for disease and topic modelling. The model's capability is demonstrated by extracting structure in recordings from retinal neurons, but it can be widely applied to discover and model latent structure in noisy binary data. In the context of spiking neural data, the task is to "explain" spikes of individual neurons in terms of groups of neurons, "Cell Assemblies" (CAs), that often fire together, due to mutual interactions or other causes. The model infers sparse activity in a set of binary latent variables, each describing the activity of a cell assembly. When the latent variable of a cell assembly is active, it reduces the probability that the neurons belonging to this assembly are inactive. The conditional probability kernels of the latent components are learned from the data in an expectation maximization scheme, involving inference of latent states and parameter adjustments to the model. We thoroughly validate the model on synthesized spike trains constructed to statistically resemble recorded retinal responses to white noise and natural movie stimuli. We also apply our model to spiking responses recorded in retinal ganglion cells (RGCs) during stimulation with a movie and discuss the structure found.
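To make the noisy-OR mechanism concrete, the following minimal sketch computes the firing probability of each neuron given which cell assemblies are active. It uses the textbook noisy-OR form, in which a neuron stays silent only if every active assembly independently fails to drive it; the leak term, the weight layout, and the function name `noisy_or_active_prob` are assumptions for illustration, and the exact parametrization in the work above may differ.

```python
def noisy_or_active_prob(z, weights, leak=0.0):
    """Return P(x_i = 1 | z) for every observed unit i under a noisy-OR model.

    z       : list of 0/1 latent states, one per cell assembly
    weights : weights[i][j] is the probability that assembly j alone
              activates unit i (hypothetical layout)
    leak    : probability that a unit is active with no assembly active
    """
    probs = []
    for w_i in weights:
        p_inactive = 1.0 - leak
        for z_j, w_ij in zip(z, w_i):
            if z_j:
                # Each active assembly multiplies down the chance of silence.
                p_inactive *= 1.0 - w_ij
        probs.append(1.0 - p_inactive)
    return probs


if __name__ == "__main__":
    weights = [[0.8, 0.0], [0.5, 0.5]]
    for z in ([0, 0], [1, 0], [1, 1]):
        print(z, noisy_or_active_prob(z, weights, leak=0.1))
```

Activating further assemblies can only lower the probability of silence, which matches the description above of an active latent variable reducing the probability that its member neurons are inactive.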
Recently, many studies have shed light on the high adaptivity of deep neural network methods in nonparametric regression models, and their superior performance has been established for various function classes. Motivated by this development, we study a deep neural network method to estimate the drift coefficient of a multi-dimensional diffusion process from discrete observations. We derive generalization error bounds for least squares estimates based on deep neural networks and show that they achieve the minimax rate of convergence up to a logarithmic factor when the drift function has a compositional structure.
This paper considers identification and estimation of the causal effect of the time Z until a subject is treated on a survival outcome T. The treatment is not randomly assigned, T is randomly right censored by a random variable C, and the time to treatment Z is right censored by min(T,C). The endogeneity issue is treated using an instrumental variable explaining Z and independent of the error term of the model. We study identification in a fully nonparametric framework. We show that our specification generates an integral equation, of which the regression function of interest is a solution. We provide identification conditions that rely on this identification equation. For estimation purposes, we assume that the regression function follows a parametric model. We propose an estimation procedure and give conditions under which the estimator is asymptotically normal. The estimators exhibit good finite sample properties in simulations. Our methodology is applied to find evidence supporting the efficacy of a therapy for burn-out.
Cluster analysis aims at partitioning data into groups or clusters. In applications, it is common to deal with problems where the number of clusters is unknown. Bayesian mixture models employed in such applications usually specify a flexible prior that takes into account the uncertainty with respect to the number of clusters. However, a major empirical challenge involving the use of these models is in the characterisation of the induced prior on the partitions. This work introduces an approach to compute descriptive statistics of the prior on the partitions for three selected Bayesian mixture models developed in the areas of Bayesian finite mixtures and Bayesian nonparametrics. The proposed methodology involves computationally efficient enumeration of the prior on the number of clusters in-sample (termed as ``data clusters'') and determining the first two prior moments of symmetric additive statistics characterising the partitions. The accompanying reference implementation is made available in the R package 'fipp'. Finally, we illustrate the proposed methodology through comparisons and also discuss the implications for prior elicitation in applications.
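As an illustration of what the induced prior on the number of data clusters looks like, the sketch below computes it exactly for a Dirichlet process mixture via the Chinese restaurant process recursion. The abstract above does not name its three models, so the choice of the Dirichlet process here is an assumption, and the function name `crp_num_clusters_prior` is hypothetical; this is not the interface of the 'fipp' package.

```python
def crp_num_clusters_prior(n, alpha):
    """Prior pmf of the number of occupied clusters among n observations.

    Assumes a Chinese restaurant process with concentration alpha.
    Returns a list pmf where pmf[k - 1] = P(K_n = k) for k = 1, ..., n.
    """
    pmf = [1.0]  # a single observation always forms exactly one cluster
    for m in range(2, n + 1):
        # The m-th observation opens a new cluster with this probability.
        p_new = alpha / (alpha + m - 1)
        nxt = [0.0] * m
        for k, p in enumerate(pmf):
            nxt[k] += p * (1.0 - p_new)  # joins an existing cluster
            nxt[k + 1] += p * p_new      # opens a new cluster
        pmf = nxt
    return pmf


if __name__ == "__main__":
    pmf = crp_num_clusters_prior(n=50, alpha=2.0)
    mean_k = sum(k * p for k, p in enumerate(pmf, start=1))
    print(round(mean_k, 3))
```

The prior mean computed from this pmf can be checked against the closed form sum over i = 0, ..., n - 1 of alpha / (alpha + i), which is a quick sanity test when eliciting alpha.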
Deep Learning (DL) is vulnerable to out-of-distribution and adversarial examples, resulting in incorrect outputs. To make DL more robust, several post hoc anomaly detection techniques to detect (and discard) these anomalous samples have been proposed in the recent past. This survey tries to provide a structured and comprehensive overview of the research on anomaly detection for DL-based applications. We provide a taxonomy for existing techniques based on their underlying assumptions and adopted approaches. We discuss various techniques in each of the categories and provide the relative strengths and weaknesses of the approaches. Our goal in this survey is to provide an easier yet better understanding of the techniques belonging to the different categories in which research has been done on this topic. Finally, we highlight the unsolved research challenges in applying anomaly detection techniques in DL systems and present some high-impact future research directions.
Transfer learning is one of the subjects undergoing intense study in the area of machine learning. In object recognition and object detection there are known experiments on the transferability of parameters, but not for neural networks that are suitable for object detection in real-time embedded applications, such as the SqueezeDet neural network. We use transfer learning to accelerate the training of SqueezeDet on a new group of classes. In addition, experiments are conducted to study the transferability and co-adaptation phenomena introduced by the transfer learning process. To accelerate training, we propose a new implementation of the SqueezeDet training which provides a faster pipeline for data processing and achieves a $1.8$ times speedup compared to the initial implementation. Finally, we created a mechanism for automatic hyperparameter optimization using an empirical method.
eCommerce transaction fraud patterns keep changing rapidly. This is the major issue that prevents eCommerce merchants from having a robust machine learning model for detecting fraudulent transactions. The root cause of this problem is that rapidly changing fraud patterns alter the underlying data-generating system and cause the performance of machine learning models to deteriorate. This phenomenon in statistical modeling is called "Concept Drift". To overcome this issue, we propose an approach which adds dynamic risk features as model inputs. Dynamic risk features are a set of features built on entity profiles with fraud feedback. They are introduced to quantify the fluctuation, caused by concept drift, of the probability distribution of risk features from a given entity profile. In this paper, we also illustrate why this strategy can successfully handle the effect of concept drift under a statistical learning framework. We also validate our approach on multiple businesses in production and have verified that the proposed dynamic model has a ROC curve superior to that of a static model built on the same data and training parameters.
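The abstract above does not specify how dynamic risk features are computed. Purely as an illustration of a feature built on an entity profile with fraud feedback, the sketch below maintains an exponentially decayed, prior-smoothed fraud rate per entity, so that recent feedback counts more than old feedback. The class name `DecayedFraudRate`, the half-life, and the smoothing prior are all assumptions for the example, not the authors' construction.

```python
import math


class DecayedFraudRate:
    """Per-entity fraud rate with exponential time decay (hypothetical feature)."""

    def __init__(self, half_life_days=7.0, prior_rate=0.01, prior_weight=10.0):
        self.decay = math.log(2.0) / half_life_days
        self.prior_rate = prior_rate
        self.prior_weight = prior_weight
        self.state = {}  # entity -> (fraud_weight, total_weight, last_update_time)

    def _decayed(self, entity, now):
        """Return the entity's fraud and total weights decayed to time `now`."""
        fraud, total, last = self.state.get(entity, (0.0, 0.0, now))
        factor = math.exp(-self.decay * (now - last))
        return fraud * factor, total * factor

    def update(self, entity, is_fraud, now):
        """Record one labeled transaction (fraud feedback) observed at time `now`."""
        fraud, total = self._decayed(entity, now)
        self.state[entity] = (fraud + (1.0 if is_fraud else 0.0), total + 1.0, now)

    def feature(self, entity, now):
        """Smoothed fraud rate; falls back to the prior for unseen entities."""
        fraud, total = self._decayed(entity, now)
        return (fraud + self.prior_rate * self.prior_weight) / (
            total + self.prior_weight
        )


if __name__ == "__main__":
    tracker = DecayedFraudRate()
    tracker.update("card_1", True, now=0.0)
    print(tracker.feature("card_1", now=0.0), tracker.feature("card_1", now=7.0))
```

Because the feature is recomputed from feedback at scoring time, its value shifts as fraud patterns shift even when the model that consumes it is left unchanged, which is the general idea behind supplying such features as inputs to an otherwise static model.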