We study the complexity of infinite-domain constraint satisfaction problems: our basic setting is that a complexity classification for the CSPs of first-order expansions of a structure $\mathfrak A$ can be transferred to a classification of the CSPs of first-order expansions of another structure $\mathfrak B$. We exploit a product of structures (the algebraic product) that corresponds to the product of the respective polymorphism clones and present a complete complexity classification of the CSPs for first-order expansions of the $n$-fold algebraic power of $(\mathbb{Q};<)$. This is proved by various algebraic and logical methods in combination with knowledge of the polymorphisms of the tractable first-order expansions of $(\mathbb{Q};<)$ and explicit descriptions of the expressible relations in terms of syntactically restricted first-order formulas. By combining our classification result with general classification transfer techniques, we obtain surprisingly strong new classification results for highly relevant formalisms such as Allen's Interval Algebra, the $n$-dimensional Block Algebra, and the Cardinal Direction Calculus, even if higher-arity relations are allowed. Our results confirm the infinite-domain tractability conjecture for classes of structures that have been difficult to analyse with older methods. For the special case of structures with binary signatures, the results can be substantially strengthened and tightly connected to Ord-Horn formulas; this solves several longstanding open problems from the AI literature.
One of the most prominent methods for uncertainty quantification in high-dimen-sional statistics is the desparsified LASSO that relies on unconstrained $\ell_1$-minimization. The majority of initial works focused on real (sub-)Gaussian designs. However, in many applications, such as magnetic resonance imaging (MRI), the measurement process possesses a certain structure due to the nature of the problem. The measurement operator in MRI can be described by a subsampled Fourier matrix. The purpose of this work is to extend the uncertainty quantification process using the desparsified LASSO to design matrices originating from a bounded orthonormal system, which naturally generalizes the subsampled Fourier case and also allows for the treatment of the case where the sparsity basis is not the standard basis. In particular we construct honest confidence intervals for every pixel of an MR image that is sparse in the standard basis provided the number of measurements satisfies $n \gtrsim\max\{ s\log^2 s\log p, s \log^2 p \}$ or that is sparse with respect to the Haar Wavelet basis provided a slightly larger number of measurements.
Node classification is a fundamental graph-based task that aims to predict the classes of unlabeled nodes, for which Graph Neural Networks (GNNs) are the state-of-the-art methods. Current GNNs assume that nodes in the training set contribute equally during training. However, the quality of training nodes varies greatly, and the performance of GNNs could be harmed by two types of low-quality training nodes: (1) inter-class nodes situated near class boundaries that lack the typical characteristics of their corresponding classes. Because GNNs are data-driven approaches, training on these nodes could degrade the accuracy. (2) mislabeled nodes. In real-world graphs, nodes are often mislabeled, which can significantly degrade the robustness of GNNs. To mitigate the detrimental effect of the low-quality training nodes, we present CLNode, which employs a selective training strategy to train GNN based on the quality of nodes. Specifically, we first design a multi-perspective difficulty measurer to accurately measure the quality of training nodes. Then, based on the measured qualities, we employ a training scheduler that selects appropriate training nodes to train GNN in each epoch. To evaluate the effectiveness of CLNode, we conduct extensive experiments by incorporating it in six representative backbone GNNs. Experimental results on real-world networks demonstrate that CLNode is a general framework that can be combined with various GNNs to improve their accuracy and robustness.
For optimal control problems constrained by a initial-valued parabolic PDE, we have to solve a large scale saddle point algebraic system consisting of considering the discrete space and time points all together. A popular strategy to handle such a system is the Krylov subspace method, for which an efficient preconditioner plays a crucial role. The matching-Schur-complement preconditioner has been extensively studied in literature and the implementation of this preconditioner lies in solving the underlying PDEs twice, sequentially in time. In this paper, we propose a new preconditioner for the Schur complement, which can be used parallel-in-time (PinT) via the so called diagonalization technique. We show that the eigenvalues of the preconditioned matrix are low and upper bounded by positive constants independent of matrix size and the regularization parameter. The uniform boundedness of the eigenvalues leads to an optimal linear convergence rate of conjugate gradient solver for the preconditioned Schur complement system. To the best of our knowledge, it is the first time to have an optimal convergence analysis for a PinT preconditioning technique of the optimal control problem. Numerical results are reported to show that the performance of the proposed preconditioner is robust with respect to the discretization step-sizes and the regularization parameter.
Transfer learning is a popular technique for improving the performance of neural networks. However, existing methods are limited to transferring parameters between networks with same architectures. We present a method for transferring parameters between neural networks with different architectures. Our method, called DPIAT, uses dynamic programming to match blocks and layers between architectures and transfer parameters efficiently. Compared to existing parameter prediction and random initialization methods, it significantly improves training efficiency and validation accuracy. In experiments on ImageNet, our method improved validation accuracy by an average of 1.6 times after 50 epochs of training. DPIAT allows both researchers and neural architecture search systems to modify trained networks and reuse knowledge, avoiding the need for retraining from scratch. We also introduce a network architecture similarity measure, enabling users to choose the best source network without any training.
VeriFast is a prototype tool based on separation logic for modular verification of C and Java programs. We are in the process of adding support for C++. In this report, we describe the features of C++ for which we added support so far, as well as the proof obligations we generate for these features. At this point, VeriFast has basic support for most object-oriented programming features of C++: member functions, member function and operator overloading, implicit and explicit conversions, constructors and initializer lists, destructors, reference types, allocation and deallocation on the stack or on the heap (using new and delete), inheritance (including multiple inheritance but not virtual base classes), and virtual member functions and overriding. To support specification of inheritance hierarchies, we added support for instance predicates, which can be introduced in a base class and overridden in derived classes. The main missing feature at this point is support for C++ templates, which we plan to work on next.
The aim of this work is to present a parallel solver for a formulation of fluid-structure interaction (FSI) problems which makes use of a distributed Lagrange multiplier in the spirit of the fictitious domain method. The fluid subproblem, consisting of the non-stationary Stokes equations, is discretized in space by $\mathcal{Q}_2$-$\mathcal{P}_1$ finite elements, whereas the structure subproblem, consisting of the linear or finite incompressible elasticity equations, is discretized in space by $\mathcal{Q}_1$ finite elements. A first order semi-implicit finite difference scheme is employed for time discretization. The resulting linear system at each time step is solved by a parallel GMRES solver, accelerated by block diagonal or triangular preconditioners. The parallel implementation is based on the PETSc library. Several numerical tests have been performed on Linux clusters to investigate the effectiveness of the proposed FSI solver.
We study the problem of few-shot graph classification across domains with nonequivalent feature spaces by introducing three new cross-domain benchmarks constructed from publicly available datasets. We also propose an attention-based graph encoder that uses three congruent views of graphs, one contextual and two topological views, to learn representations of task-specific information for fast adaptation, and task-agnostic information for knowledge transfer. We run exhaustive experiments to evaluate the performance of contrastive and meta-learning strategies. We show that when coupled with metric-based meta-learning frameworks, the proposed encoder achieves the best average meta-test classification accuracy across all benchmarks. The source code and data will be released here: //github.com/kavehhassani/metagrl
Recently, graph neural networks (GNNs) have been widely used for document classification. However, most existing methods are based on static word co-occurrence graphs without sentence-level information, which poses three challenges:(1) word ambiguity, (2) word synonymity, and (3) dynamic contextual dependency. To address these challenges, we propose a novel GNN-based sparse structure learning model for inductive document classification. Specifically, a document-level graph is initially generated by a disjoint union of sentence-level word co-occurrence graphs. Our model collects a set of trainable edges connecting disjoint words between sentences and employs structure learning to sparsely select edges with dynamic contextual dependencies. Graphs with sparse structures can jointly exploit local and global contextual information in documents through GNNs. For inductive learning, the refined document graph is further fed into a general readout function for graph-level classification and optimization in an end-to-end manner. Extensive experiments on several real-world datasets demonstrate that the proposed model outperforms most state-of-the-art results, and reveal the necessity to learn sparse structures for each document.
Data augmentation, the artificial creation of training data for machine learning by transformations, is a widely studied research field across machine learning disciplines. While it is useful for increasing the generalization capabilities of a model, it can also address many other challenges and problems, from overcoming a limited amount of training data over regularizing the objective to limiting the amount data used to protect privacy. Based on a precise description of the goals and applications of data augmentation (C1) and a taxonomy for existing works (C2), this survey is concerned with data augmentation methods for textual classification and aims to achieve a concise and comprehensive overview for researchers and practitioners (C3). Derived from the taxonomy, we divided more than 100 methods into 12 different groupings and provide state-of-the-art references expounding which methods are highly promising (C4). Finally, research perspectives that may constitute a building block for future work are given (C5).
High spectral dimensionality and the shortage of annotations make hyperspectral image (HSI) classification a challenging problem. Recent studies suggest that convolutional neural networks can learn discriminative spatial features, which play a paramount role in HSI interpretation. However, most of these methods ignore the distinctive spectral-spatial characteristic of hyperspectral data. In addition, a large amount of unlabeled data remains an unexploited gold mine for efficient data use. Therefore, we proposed an integration of generative adversarial networks (GANs) and probabilistic graphical models for HSI classification. Specifically, we used a spectral-spatial generator and a discriminator to identify land cover categories of hyperspectral cubes. Moreover, to take advantage of a large amount of unlabeled data, we adopted a conditional random field to refine the preliminary classification results generated by GANs. Experimental results obtained using two commonly studied datasets demonstrate that the proposed framework achieved encouraging classification accuracy using a small number of data for training.