Tensors, also known as multidimensional arrays, are useful data structures in machine learning and statistics. In recent years, Bayesian methods have emerged as a popular direction for analyzing tensor-valued data since they provide a convenient way to introduce sparsity into the model and conduct uncertainty quantification. In this article, we provide an overview of frequentist and Bayesian methods for solving tensor completion and regression problems, with a focus on Bayesian methods. We review common Bayesian tensor approaches including model formulation, prior assignment, posterior computation, and theoretical properties. We also discuss potential future directions in this field.
Understanding how and why certain communities bear a disproportionate burden of disease is challenging due to the scarcity of data on these communities. Surveys provide a useful avenue for accessing hard-to-reach populations, as many surveys specifically oversample understudied and vulnerable populations. When survey data is used for analysis, it is important to account for the complex survey design that gave rise to the data, in order to avoid biased conclusions. The field of Bayesian survey statistics aims to account for such survey design while leveraging the advantages of Bayesian models, which can flexibly handle sparsity through borrowing of information and provide a coherent inferential framework to easily obtain variances for complex models and data types. For these reasons, Bayesian survey methods seem uniquely well-poised for health disparities research, where heterogeneity and sparsity are frequent considerations. This review discusses three main approaches found in the Bayesian survey methodology literature: 1) multilevel regression and post-stratification, 2) weighted pseudolikelihood-based methods, and 3) synthetic population generation. We discuss advantages and disadvantages of each approach, examine recent applications and extensions, and consider how these approaches may be leveraged to improve research in population health equity.
Bayesian model averaging is a practical method for dealing with uncertainty due to model specification. Use of this technique requires the estimation of model probability weights. In this work, we revisit the derivation of estimators for these model weights. Use of the Kullback-Leibler divergence as a starting point leads naturally to a number of alternative information criteria suitable for Bayesian model weight estimation. We explore in detail three such criteria, all previously known in the statistics literature: a Bayesian analogue of the Akaike information criterion which we call the BAIC, the Bayesian predictive information criterion (BPIC), and the posterior predictive information criterion (PPIC). We compare the use of these information criteria in numerical analysis problems common in lattice field theory calculations. We find that the PPIC has the most appealing theoretical properties and can give the best performance in terms of model-averaging uncertainty, particularly in the presence of noisy data.
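As a rough illustration of how an information criterion can be turned into model weights, the sketch below assumes the common convention that each weight is proportional to exp(-IC/2), normalized over the candidate models, and that the averaged variance combines the per-model variances with the spread between model estimates. The function names `model_weights` and `model_average` and the numbers in the example are hypothetical and are not taken from the paper.

```python
import math


def model_weights(ic_values):
    """Normalized weights w_i proportional to exp(-IC_i / 2) (assumed convention)."""
    best = min(ic_values)
    # Subtracting the smallest IC leaves the normalized weights unchanged
    # and avoids underflow when the IC values are large.
    unnorm = [math.exp(-(ic - best) / 2.0) for ic in ic_values]
    total = sum(unnorm)
    return [u / total for u in unnorm]


def model_average(estimates, variances, weights):
    """Model-averaged mean and variance of a parameter.

    variance = sum_i w_i * (var_i + est_i^2) - mean^2, i.e. the weighted
    per-model variance plus the weighted spread between models.
    """
    mean = sum(w * a for w, a in zip(weights, estimates))
    second_moment = sum(
        w * (v + a * a) for w, a, v in zip(weights, estimates, variances)
    )
    return mean, second_moment - mean * mean


if __name__ == "__main__":
    # Hypothetical IC values and per-model fit results for three models.
    weights = model_weights([12.3, 10.1, 15.8])
    mean, var = model_average([1.02, 0.98, 1.10], [0.01, 0.02, 0.01], weights)
    print(weights, mean, math.sqrt(var))
```

The second term of the variance is what makes model averaging report a larger uncertainty when the candidate models disagree with each other.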
The $k$-tensor Ising model is an exponential family on a $p$-dimensional binary hypercube for modeling dependent binary data, where the sufficient statistic consists of all $k$-fold products of the observations, and the parameter is an unknown $k$-fold tensor, designed to capture higher-order interactions between the binary variables. In this paper, we describe an approach based on a penalization technique that helps us recover the signed support of the tensor parameter with high probability, assuming that no entry of the true tensor is too close to zero. The method is based on an $\ell_1$-regularized node-wise logistic regression, which recovers the signed neighborhood of each node with high probability. Our analysis is carried out in the high-dimensional regime, which allows the dimension $p$ of the Ising model, as well as the interaction factor $k$, to potentially grow to $\infty$ with the sample size $n$. We show that if the minimum interaction strength is not too small, then consistent recovery of the entire signed support is possible if one takes $n = \Omega((k!)^8 d^3 \log \binom{p-1}{k-1})$ samples, where $d$ denotes the maximum degree of the hypernetwork in question. Our results are validated in two simulation settings, and applied to a real neurobiological dataset consisting of multi-array electro-physiological recordings from the mouse visual cortex, to model higher-order interactions between the brain regions.
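The node-wise idea can be sketched as follows: for a fixed node, regress its $\pm 1$ value on all $(k-1)$-fold products of the remaining variables (there are $\binom{p-1}{k-1}$ of them) with an $\ell_1$ penalty, then read the signed neighborhood off the nonzero coefficients. This is a minimal sketch under stated assumptions, not the authors' implementation: the solver is plain proximal gradient descent (ISTA), chosen here for brevity; any constant factors in the conditional model are absorbed into the coefficients; and the names `product_features`, `nodewise_l1_logistic`, and `signed_support` are hypothetical.

```python
import itertools
import math


def product_features(row, node, k):
    """All (k-1)-fold products of the entries of `row` other than `node`."""
    others = [j for j in range(len(row)) if j != node]
    subsets = list(itertools.combinations(others, k - 1))
    feats = [math.prod(row[j] for j in s) for s in subsets]
    return subsets, feats


def soft_threshold(x, t):
    """Proximal operator of t * |x|."""
    if x > t:
        return x - t
    if x < -t:
        return x + t
    return 0.0


def nodewise_l1_logistic(samples, node, k, lam, step=1.0, iters=500):
    """Minimize (1/n) sum_i log(1 + exp(-y_i <theta, f_i>)) + lam * ||theta||_1.

    `step` must be small enough for the problem at hand; 1.0 is only
    adequate for small feature sets like the example below.
    """
    subsets = product_features(samples[0], node, k)[0]
    feats = [product_features(row, node, k)[1] for row in samples]
    labels = [row[node] for row in samples]
    n, m = len(samples), len(subsets)
    theta = [0.0] * m
    for _ in range(iters):
        grad = [0.0] * m
        for f, y in zip(feats, labels):
            margin = y * sum(t * v for t, v in zip(theta, f))
            s = 1.0 / (1.0 + math.exp(margin))  # sigmoid(-margin)
            for j in range(m):
                grad[j] -= y * f[j] * s / n
        theta = [
            soft_threshold(t - step * g, step * lam) for t, g in zip(theta, grad)
        ]
    return dict(zip(subsets, theta))


def signed_support(coefs):
    """Map each selected subset to the sign (+1 or -1) of its coefficient."""
    return {s: (1 if c > 0 else -1) for s, c in coefs.items() if c != 0.0}


if __name__ == "__main__":
    # Toy data with p = 4, k = 3: node 0 equals the product of nodes 1 and 2.
    samples = [
        (a * b, a, b, c) for a in (-1, 1) for b in (-1, 1) for c in (-1, 1)
    ]
    coefs = nodewise_l1_logistic(samples, node=0, k=3, lam=0.1)
    print(signed_support(coefs))
```

Running the same regression for every node and combining the recovered neighborhoods gives an estimate of the signed support of the whole tensor.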
Over the past few years, extensive research has been devoted to enhancing YOLO object detectors. Since its introduction, eight major versions of YOLO have been released with the purpose of improving its accuracy and efficiency. While the evident merits of YOLO have led to its extensive use in many areas, deploying it on resource-limited devices poses challenges. To address this issue, various neural network compression methods have been developed, which fall under three main categories, namely network pruning, quantization, and knowledge distillation. The fruitful outcomes of utilizing model compression methods, such as lowering memory usage and inference time, make them favorable, if not necessary, for deploying large neural networks on hardware-constrained edge devices. In this review paper, our focus is on pruning and quantization due to their comparative modularity. We categorize them and analyze the practical results of applying those methods to YOLOv5. By doing so, we identify gaps in adapting pruning and quantization for compressing YOLOv5, and provide future directions in this area for further exploration. Among several versions of YOLO, we specifically choose YOLOv5 for its excellent trade-off between recency and popularity in the literature. This is the first review paper that specifically surveys pruning and quantization methods from an implementation point of view on YOLOv5. Our study is also extendable to newer versions of YOLO, as implementing them on resource-limited devices poses the same challenges that persist even today. This paper targets those interested in the practical deployment of model compression methods on YOLOv5, and in exploring different compression techniques that can be used for subsequent versions of YOLO.
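To make the two compression families concrete, here is a minimal sketch of one simple representative of each: unstructured magnitude pruning and symmetric uniform 8-bit quantization, applied to a flat list of weights. These particular variants, the function names, and the example weights are assumptions chosen for illustration; the review covers many other schemes, and nothing here is specific to YOLOv5.

```python
def magnitude_prune(weights, sparsity):
    """Zero out the `sparsity` fraction of weights with the smallest magnitude."""
    n_prune = int(sparsity * len(weights))
    order = sorted(range(len(weights)), key=lambda i: abs(weights[i]))
    pruned = list(weights)
    for i in order[:n_prune]:
        pruned[i] = 0.0
    return pruned


def quantize_int8(weights):
    """Symmetric uniform quantization to integers in [-127, 127].

    Returns the integer codes and the scale needed to map them back.
    """
    max_abs = max((abs(w) for w in weights), default=0.0)
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    codes = [max(-127, min(127, round(w / scale))) for w in weights]
    return codes, scale


def dequantize(codes, scale):
    """Map integer codes back to approximate floating-point weights."""
    return [c * scale for c in codes]


if __name__ == "__main__":
    # Hypothetical weights of a single layer.
    weights = [0.5, -0.1, 0.03, -0.8, 0.2, 0.0]
    print(magnitude_prune(weights, sparsity=0.5))
    codes, scale = quantize_int8(weights)
    print(codes, dequantize(codes, scale))
```

Pruning reduces the number of nonzero parameters, while quantization reduces the number of bits per parameter; the two can be combined, and the per-weight quantization error in this scheme is at most half the scale.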
In recent years, Graph Neural Networks have reported outstanding performance in tasks like community detection, molecule classification and link prediction. However, the black-box nature of these models prevents their application in domains like health and finance, where understanding the models' decisions is essential. Counterfactual Explanations (CE) provide these understandings through examples. Moreover, the literature on CE is flourishing with novel explanation methods which are tailored to graph learning. In this survey, we analyse the existing Graph Counterfactual Explanation methods, by providing the reader with an organisation of the literature according to a uniform formal notation for definitions, datasets, and metrics, thus simplifying potential comparisons w.r.t. the methods' advantages and disadvantages. We discuss seven methods and sixteen synthetic and real datasets, providing details on the possible generation strategies. We highlight the most common evaluation strategies and formalise nine of the metrics used in the literature. We first introduce the evaluation framework GRETEL and how it is possible to extend and use it while providing a further dimension of comparison encompassing reproducibility aspects. Finally, we provide a discussion on how counterfactual explanation interplays with privacy and fairness, before delving into open challenges and future works.
Graph neural networks generalize conventional neural networks to graph-structured data and have received widespread attention due to their impressive representation ability. In spite of these remarkable achievements, the performance of Euclidean models in graph-related learning is still limited by the representation ability of Euclidean geometry, especially for datasets with highly non-Euclidean latent anatomy. Recently, hyperbolic space has gained increasing popularity in processing graph data with tree-like structure and power-law distribution, owing to its exponential growth property. In this survey, we comprehensively revisit the technical details of current hyperbolic graph neural networks (HGNNs), unifying them into a general framework and summarizing the variants of each component. More importantly, we present various HGNN-related applications. Lastly, we identify several challenges, which can potentially serve as guidelines for further advancing graph learning in hyperbolic spaces.
In the last decade or so, we have witnessed deep learning reinvigorating the machine learning field. It has solved many problems in the domains of computer vision, speech recognition, natural language processing, and various other tasks with state-of-the-art performance. The data is generally represented in the Euclidean space in these domains. Various other domains conform to non-Euclidean space, for which graphs are an ideal representation. Graphs are suitable for representing the dependencies and interrelationships between various entities. Traditionally, handcrafted features for graphs are incapable of providing the necessary inference for various tasks from this complex data representation. Recently, various advances in deep learning have increasingly been applied to graph-based tasks. This article provides a comprehensive survey of graph neural networks (GNNs) in each learning setting: supervised, unsupervised, semi-supervised, and self-supervised learning. A taxonomy of each graph-based learning setting is provided, with logical divisions of the methods falling in the given learning setting. The approaches for each learning task are analyzed from both theoretical and empirical standpoints. Further, we provide general architecture guidelines for building GNNs. Various applications and benchmark datasets are also provided, along with open challenges still plaguing the general applicability of GNNs.
Causal inference has been a critical research topic across many domains, such as statistics, computer science, education, public policy, and economics, for decades. Nowadays, estimating causal effects from observational data has become an appealing research direction owing to the large amount of available data and the low budget requirement compared with randomized controlled trials. Aided by the rapid development of machine learning, various causal effect estimation methods for observational data have sprung up. In this survey, we provide a comprehensive review of causal inference methods under the potential outcome framework, one of the well-known causal inference frameworks. The methods are divided into two categories depending on whether they require all three assumptions of the potential outcome framework or not. For each category, both the traditional statistical methods and the recent machine learning enhanced methods are discussed and compared. The plausible applications of these methods are also presented, including applications in advertising, recommendation, medicine, and so on. Moreover, the commonly used benchmark datasets and open-source code are also summarized, which helps researchers and practitioners explore, evaluate, and apply causal inference methods.
Many learning tasks require dealing with graph data, which contains rich relational information among elements. Modeling physical systems, learning molecular fingerprints, predicting protein interfaces, and classifying diseases all require a model to learn from graph inputs. In other domains such as learning from non-structural data like texts and images, reasoning on extracted structures, like the dependency tree of sentences and the scene graph of images, is an important research topic which also needs graph reasoning models. Graph neural networks (GNNs) are connectionist models that capture the dependence of graphs via message passing between the nodes of graphs. Unlike standard neural networks, graph neural networks retain a state that can represent information from its neighborhood with an arbitrary depth. Although the primitive graph neural networks have been found difficult to train for a fixed point, recent advances in network architectures, optimization techniques, and parallel computation have enabled successful learning with them. In recent years, systems based on graph convolutional networks (GCNs) and gated graph neural networks (GGNNs) have demonstrated ground-breaking performance on many of the tasks mentioned above. In this survey, we provide a detailed review of existing graph neural network models, systematically categorize the applications, and propose four open problems for future research.
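The message-passing idea can be illustrated with a deliberately tiny sketch: each node keeps a scalar state and, in every round, mixes its own state with the mean of its neighbors' states. The update rule (fixed weights followed by tanh), the scalar states, and the names `message_passing_round` and `run_message_passing` are all simplifying assumptions for illustration; real GNN layers use learned weight matrices and vector-valued states.

```python
import math


def message_passing_round(states, neighbors, w_self=0.5, w_nbr=0.5):
    """One synchronous round: combine each node's state with the mean of
    its neighbors' states, then apply a tanh nonlinearity."""
    new_states = {}
    for node, h in states.items():
        nbrs = neighbors.get(node, [])
        agg = sum(states[u] for u in nbrs) / len(nbrs) if nbrs else 0.0
        new_states[node] = math.tanh(w_self * h + w_nbr * agg)
    return new_states


def run_message_passing(states, neighbors, rounds):
    """Apply `rounds` rounds of message passing and return the final states."""
    for _ in range(rounds):
        states = message_passing_round(states, neighbors)
    return states


if __name__ == "__main__":
    # Path graph 0 - 1 - 2 - 3, with a signal placed only on node 0.
    neighbors = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
    states = {0: 1.0, 1: 0.0, 2: 0.0, 3: 0.0}
    for rounds in (1, 2, 3):
        print(rounds, run_message_passing(states, neighbors, rounds))
```

After one round only node 1 has received the signal from node 0, after two rounds node 2 has as well, and so on: stacking rounds is what lets a node's state reflect its neighborhood at increasing depth.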
The era of big data provides researchers with convenient access to copious data. However, researchers often have little prior knowledge about these data. The increasing prevalence of big data is challenging the traditional methods of learning causality because they were developed for cases with limited amounts of data and solid prior causal knowledge. This survey aims to close the gap between big data and learning causality with a comprehensive and structured review of traditional and frontier methods and a discussion of some open problems of learning causality. We begin with the preliminaries of learning causality. Then we categorize and revisit methods of learning causality for the typical problems and data types. After that, we discuss the connections between learning causality and machine learning. Finally, some open problems are presented to show the great potential of learning causality with data.