An expurgating linear function (ELF) is a linear outer code that disallows the low-weight codewords of the inner code. ELFs can be designed either to maximize the minimum distance or to minimize the codeword error rate (CER) of the expurgated code. List decoding of the inner code from the noiseless all-zeros codeword is an efficient way to identify ELFs that maximize the minimum distance of the expurgated code. For convolutional inner codes, this paper provides distance spectrum union (DSU) upper bounds on the CER of the concatenated code. For short codeword lengths, ELFs transform a good inner code into a great concatenated code. For a constant message size of $K=64$ bits or constant codeword blocklength of $N=152$ bits, an ELF can reduce the gap at CER $10^{-6}$ between the DSU and the random-coding union (RCU) bounds from over 1 dB for the inner code alone to 0.23 dB for the concatenated code. The DSU bounds can also characterize puncturing that mitigates the rate overhead of the ELF while maintaining the DSU-to-RCU gap. The reduction in DSU-to-RCU gap comes with a minimal increase in average complexity. List Viterbi decoding guided by the ELF approaches maximum likelihood (ML) decoding of the concatenated code, and average list size converges to 1 as SNR increases. Thus, average complexity is similar to Viterbi decoding on the trellis of the inner code. For rare large-magnitude noise events, which occur less often than the FER of the inner code, a deep search in the list finds the ML codeword.
In many high-frequency simulation workflows, eigenvalue tracking along a parameter variation is necessary. This can become computationally prohibitive when repeated time-consuming eigenvalue problems must be solved. Therefore, we employ a reduced basis approximation to bring down the computational costs. It is based on the greedy strategy from Horger et al. 2017 which considers multiple eigenvalues for elliptic eigenvalue problems. We extend this algorithm to deal with parameter-dependent domains and the Maxwell eigenvalue problem. In this setting, the reduced basis may contain spurious eigenmodes, which require special treatment. We demonstrate our algorithm in an eigenvalue tracking application for an eigenmode classification.
An introductory exposition of the virtual element method (VEM) is provided. The intent is to make this method more accessible to those unfamiliar with VEM. Familiarity with the finite element method for solving 2D linear elasticity problems is assumed. Derivations relevant to successful implementation are covered. Some theory is covered, but the focus here is on implementation and results. Examples are given that illustrate the utility of the method. Numerical results are provided to help researchers implement and verify their own results.
Matching has been widely used to mimic a randomized experiment with observational data. Ideally, treated subjects are exactly matched with controls for the covariates, and randomization-based estimation can then be conducted as in a randomized experiment (assuming no unobserved covariates). However, when there exist continuous covariates or many covariates, matching is typically inexact. Previous studies have routinely ignored inexact matching in the downstream randomization-based estimation as long as some covariate balance criteria are satisfied, which can cause severe estimation bias. Building on the covariate-adaptive randomization inference framework, in this research note, we propose two new classes of bias-corrected randomization-based estimators to reduce estimation bias due to inexact matching: the bias-corrected maximum $p$-value estimator for the constant treatment effect and the bias-corrected difference-in-means estimator for the average treatment effect. Our simulation results show that the proposed bias-corrected estimators can effectively reduce estimation bias due to inexact matching.
Given a graph $G$, the number of its vertices is represented by $n(G)$, while the number of its edges is denoted as $m(G)$. An independent set in a graph is a set of vertices where no two vertices are adjacent to each other and the size of the maximum independent set is denoted by $\alpha(G)$. A matching in a graph refers to a set of edges where no two edges share a common vertex and the maximum matching size is denoted by $\mu(G)$. If $\alpha(G) + \mu(G) = n(G)$, then the graph $G$ is called a K\"{o}nig-Egerv\'{a}ry graph. Considering a graph $G$ with a degree sequence $d_1 \leq d_2 \leq \cdots \leq d_n$, the annihilation number $a(G)$ is defined as the largest integer $k$ such that the sum of the first $k$ degrees in the sequence is less than or equal to $m(G)$ (Pepper, 2004). It is a known fact that $\alpha(G)$ is less than or equal to $a(G)$ for any graph $G$. Our goal is to estimate the difference between these two parameters. Specifically, we prove a series of inequalities, including $a(G) - \alpha(G) \leq \frac{\mu(G) - 1}{2}$ for trees, $a(G) - \alpha(G) \leq 2 + \mu(G) - 2\sqrt{1 + \mu(G)}$ for bipartite graphs and $a(G) - \alpha(G) \leq \mu(G) - 2$ for K\"{o}nig-Egerv\'{a}ry graphs. Furthermore, we demonstrate that these inequalities serve as tight upper bounds for the difference between the annihilation and independence numbers, regardless of the assigned value for $\mu(G)$.
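The definitions above can be checked by hand on a small graph. The following is a minimal sketch, not part of the original work: the helper names `annihilation_number` and `independence_number` and the choice of a five-vertex path as the example are illustrative assumptions, and the independence number is found by brute force, which is practical only for very small graphs.

```python
from itertools import combinations


def annihilation_number(degrees, num_edges):
    """Largest k such that the k smallest degrees sum to at most num_edges."""
    total, k = 0, 0
    for d in sorted(degrees):
        if total + d > num_edges:
            break
        total += d
        k += 1
    return k


def independence_number(n, edges):
    """Brute-force alpha(G) for vertices 0..n-1 (tiny graphs only)."""
    edge_set = {frozenset(e) for e in edges}
    for size in range(n, 0, -1):
        for subset in combinations(range(n), size):
            # A subset is independent if none of its vertex pairs is an edge.
            if all(frozenset(p) not in edge_set for p in combinations(subset, 2)):
                return size
    return 0


# Hypothetical example: the path 0-1-2-3-4, which is a tree.
n = 5
path_edges = [(0, 1), (1, 2), (2, 3), (3, 4)]

degrees = [0] * n
for u, v in path_edges:
    degrees[u] += 1
    degrees[v] += 1

a = annihilation_number(degrees, len(path_edges))
alpha = independence_number(n, path_edges)
print(a, alpha)
```

For this path the sorted degree sequence is $1, 1, 2, 2, 2$ and $m(G) = 4$, so the first three degrees sum to exactly $4$ and $a(G) = 3$. A maximum matching has two edges, so the tree bound $a(G) - \alpha(G) \leq \frac{\mu(G) - 1}{2} = \frac{1}{2}$ is satisfied with $a(G) - \alpha(G) = 0$.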
AI alignment refers to models acting towards human-intended goals, preferences, or ethical principles. Given that most large-scale deep learning models act as black boxes and cannot be manually controlled, analyzing the similarity between models and humans can be a proxy measure for ensuring AI safety. In this paper, we focus on the models' visual perception alignment with humans, further referred to as AI-human visual alignment. Specifically, we propose a new dataset for measuring AI-human visual alignment in terms of image classification, a fundamental task in machine perception. In order to evaluate AI-human visual alignment, a dataset should encompass samples with various scenarios that may arise in the real world and have gold human perception labels. Our dataset consists of three groups of samples, namely Must-Act (i.e., Must-Classify), Must-Abstain, and Uncertain, based on the quantity and clarity of visual information in an image and further divided into eight categories. All samples have a gold human perception label; even Uncertain (severely blurry) sample labels were obtained via crowd-sourcing. The validity of our dataset is verified by sampling theory, statistical theories related to survey design, and experts in the related fields. Using our dataset, we analyze the visual alignment and reliability of five popular visual perception models and seven abstention methods. Our code and data are available at \url{//github.com/jiyounglee-0523/VisAlign}.
Temporal point processes (TPPs) are a natural tool for modeling event-based data. Among all TPP models, Hawkes processes have proven to be the most widely used, mainly due to their adequate modeling for various applications, particularly when considering exponential or non-parametric kernels. Although non-parametric kernels are an option, such models require large datasets. While exponential kernels are more data efficient and relevant for specific applications where events immediately trigger more events, they are ill-suited for applications where latencies need to be estimated, such as in neuroscience. This work aims to offer an efficient solution to TPP inference using general parametric kernels with finite support. The developed solution consists of a fast $\ell_2$ gradient-based solver leveraging a discretized version of the events. After theoretically supporting the use of discretization, the statistical and computational efficiency of the novel approach is demonstrated through various numerical experiments. Finally, the method's effectiveness is evaluated by modeling the occurrence of stimuli-induced patterns from brain signals recorded with magnetoencephalography (MEG). Given the use of general parametric kernels, results show that the proposed approach leads to a more accurate estimation of pattern latency than the state of the art.
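To make the idea of discretized events with a finite-support kernel concrete, here is a minimal sketch under stated assumptions. It is not the solver described above: the function names `discretize` and `intensity`, the grid step, the baseline value, and the two-bin kernel weights are all hypothetical choices made for illustration. Events are projected onto a regular grid, and the intensity in each bin is the baseline plus a finite weighted sum over the preceding bins.

```python
def discretize(events, T, delta):
    """Project event times in [0, T] onto a grid of step delta (counts per bin)."""
    n_bins = int(round(T / delta)) + 1
    counts = [0] * n_bins
    for t in events:
        counts[int(round(t / delta))] += 1
    return counts


def intensity(counts, baseline, kernel):
    """Discrete Hawkes-style intensity: baseline plus a finite-lag convolution.

    kernel[j] is the (assumed) excitation weight at a lag of j + 1 bins,
    so the kernel support is len(kernel) bins and past events beyond it
    contribute nothing.
    """
    lam = []
    for i in range(len(counts)):
        value = baseline
        for lag, weight in enumerate(kernel, start=1):
            if i - lag >= 0:
                value += weight * counts[i - lag]
        lam.append(value)
    return lam


# Hypothetical toy data: three events on [0, 1] with a grid step of 0.1.
events = [0.1, 0.32, 0.9]
counts = discretize(events, T=1.0, delta=0.1)
lam = intensity(counts, baseline=0.2, kernel=[0.5, 0.25])
print(counts)
print([round(v, 2) for v in lam])
```

Because the kernel has finite support, each intensity value depends on only a fixed number of past bins, so the cost of evaluating it on the grid grows linearly with the number of bins rather than with the square of the number of events.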
Artificial Intelligence (AI) and its applications have sparked extraordinary interest in recent years. This interest can be ascribed in part to advances in AI subfields including Machine Learning (ML), Computer Vision (CV), and Natural Language Processing (NLP). Deep learning, a subfield of machine learning that employs artificial neural network concepts, has enabled the most rapid growth in these domains. As a result, the integration of vision and language has attracted considerable attention. The tasks have been designed so that they properly exemplify the concepts of deep learning. In this review paper, we provide a thorough and extensive review of state-of-the-art approaches and key model design principles, and we discuss existing datasets, methods, problem formulations, and evaluation measures for visual question answering (VQA) and visual reasoning tasks in order to understand vision and language representation learning. We also present some potential future directions in this field of research, with the hope that our study may generate new ideas and novel approaches to handle existing difficulties and develop new applications.
Recently, Mutual Information (MI) has attracted attention in bounding the generalization error of Deep Neural Networks (DNNs). However, it is intractable to accurately estimate the MI in DNNs, so most previous works have to relax the MI bound, which in turn weakens the information-theoretic explanation for generalization. To address this limitation, this paper introduces a probabilistic representation of DNNs for accurately estimating the MI. Leveraging the proposed MI estimator, we validate the information-theoretic explanation for generalization and derive a tighter generalization bound than the state-of-the-art relaxations.
Graph Neural Networks (GNNs) have been studied through the lens of expressive power and generalization. However, their optimization properties are less well understood. We take the first step towards analyzing GNN training by studying the gradient dynamics of GNNs. First, we analyze linearized GNNs and prove that despite the non-convexity of training, convergence to a global minimum at a linear rate is guaranteed under mild assumptions that we validate on real-world graphs. Second, we study what may affect the GNNs' training speed. Our results show that the training of GNNs is implicitly accelerated by skip connections, more depth, and/or a good label distribution. Empirical results confirm that our theoretical results for linearized GNNs align with the training behavior of nonlinear GNNs. Our results provide the first theoretical support for the success of GNNs with skip connections in terms of optimization, and suggest that deep GNNs with skip connections would be promising in practice.
Object detection typically assumes that training and test data are drawn from an identical distribution, which, however, does not always hold in practice. Such a distribution mismatch will lead to a significant performance drop. In this work, we aim to improve the cross-domain robustness of object detection. We tackle the domain shift on two levels: 1) the image-level shift, such as image style, illumination, etc., and 2) the instance-level shift, such as object appearance, size, etc. We build our approach on the recent state-of-the-art Faster R-CNN model, and design two domain adaptation components, on the image level and the instance level, to reduce the domain discrepancy. The two domain adaptation components are based on H-divergence theory, and are implemented by learning a domain classifier in an adversarial training manner. The domain classifiers on different levels are further reinforced with a consistency regularization to learn a domain-invariant region proposal network (RPN) in the Faster R-CNN model. We evaluate our newly proposed approach using multiple datasets including Cityscapes, KITTI, SIM10K, etc. The results demonstrate the effectiveness of our proposed approach for robust object detection in various domain shift scenarios.