In this paper, we propose a table and image generation task to examine how well knowledge about entities acquired from natural language is retained in Vision & Language (V&L) models. The task consists of two parts: the first is to generate a table of knowledge about an entity from the entity name and its related image, and the second is to generate an image from an entity name together with a caption and a table of knowledge about the entity. In both parts, the model must know the entities in question to generate properly. To support the proposed task, we created the Wikipedia Table and Image Generation (WikiTIG) dataset from about 200,000 infoboxes in English Wikipedia articles. We evaluated performance on the task with respect to the above research question using the V&L model OFA, which has achieved state-of-the-art results on multiple tasks. Experimental results show that OFA forgets part of its entity knowledge during pre-training aimed at improving performance on image-related tasks.
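To make the two subtasks concrete, here is a hypothetical instance in the style of a Wikipedia infobox; the field names, values, and file name are purely illustrative, not the actual WikiTIG schema.

```python
# A hypothetical WikiTIG-style instance (schema is illustrative only).
example = {
    "entity": "Eiffel Tower",
    "caption": "The Eiffel Tower at night",
    "table": {                     # infobox-style attribute/value pairs
        "Location": "Paris, France",
        "Height": "330 m",
        "Opened": "1889",
    },
    "image": "eiffel_tower.jpg",
}
# Part 1: entity name (+ related image) -> table.
# Part 2: entity name + caption + table -> image.
```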
In this paper, we consider experiments involving both discrete and continuous factors under general parametric statistical models. To search for optimal designs under the D-criterion, we propose a new algorithm, called the ForLion algorithm, which performs an exhaustive search in a design space with discrete and continuous factors while maintaining high efficiency and keeping the number of design points small. Its optimality is guaranteed by the general equivalence theorem. We demonstrate its advantages using a real-life experiment under multinomial logistic models, and further specialize the algorithm to generalized linear models, where model-specific formulae and iterative steps yield improved efficiency.
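As a sketch of how the general equivalence theorem certifies D-optimality, the code below evaluates the sensitivity function $d(x,\xi) = w(x)\,f(x)^{\top} M(\xi)^{-1} f(x)$ over a grid and checks that it never exceeds the number of parameters $p$. The logistic example with its two-point design is a standard textbook case, not the ForLion algorithm itself.

```python
import numpy as np

def d_sensitivity(x, points, weights, f, w):
    """Sensitivity function d(x, xi) = w(x) f(x)^T M(xi)^{-1} f(x).
    By the general equivalence theorem, xi is D-optimal iff
    max_x d(x, xi) <= p (the number of model parameters)."""
    M = sum(wt * w(xi) * np.outer(f(xi), f(xi)) for xi, wt in zip(points, weights))
    fx = f(x)
    return w(x) * fx @ np.linalg.solve(M, fx)

# Textbook check: logistic model with beta = (0, 1), f(x) = (1, x), and
# GLM weight w(x) = p(x)(1 - p(x)); the D-optimal design puts equal
# weight on x ~= +/-1.5434.
beta = np.array([0.0, 1.0])
f = lambda x: np.array([1.0, x])
def w(x):
    p = 1.0 / (1.0 + np.exp(-f(x) @ beta))
    return p * (1.0 - p)

grid = np.linspace(-5, 5, 501)
d_max = max(d_sensitivity(x, [-1.5434, 1.5434], [0.5, 0.5], f, w) for x in grid)
print(d_max)  # close to p = 2, certifying D-optimality
```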
In this paper, we propose three generic models of capacitated coverage and, more generally, submodular maximization to study task-worker assignment problems that arise in a wide range of gig economy platforms. Our models incorporate the following features: (1) each task and worker can have an arbitrary matching capacity, which captures the limited number of copies of, or finite budget for, the task and the working capacity of the worker; (2) each task is associated with a coverage or, more generally, a monotone submodular utility function. Our objective is to design an allocation policy that maximizes the sum of all tasks' utilities, subject to capacity constraints on tasks and workers. We consider two settings: offline, where all tasks and workers are static, and online, where tasks are static while workers arrive dynamically. We present three LP-based rounding algorithms that achieve an optimal approximation ratio of $1-1/\mathsf{e} \approx 0.632$ for offline coverage maximization, and competitive ratios of $(19-67/\mathsf{e}^3)/27 \approx 0.580$ and $0.436$ for online coverage and online monotone submodular maximization, respectively.
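For intuition about the offline $1-1/\mathsf{e}$ bound, the classical greedy algorithm below achieves the same ratio for max coverage under a simple cardinality constraint; it is a well-known stand-in, not the paper's LP-based rounding (which additionally handles arbitrary task and worker capacities).

```python
def greedy_max_coverage(cover, k):
    """Pick k workers maximizing the number of covered elements.
    cover: {worker: set of elements that worker covers}.
    Classical greedy; guarantees a 1 - 1/e approximation."""
    chosen, covered = [], set()
    for _ in range(k):
        best = max((s for s in cover if s not in chosen),
                   key=lambda s: len(cover[s] - covered))
        chosen.append(best)
        covered |= cover[best]
    return chosen, covered

workers = {"w1": {1, 2, 3}, "w2": {3, 4}, "w3": {4, 5, 6}}
print(greedy_max_coverage(workers, 2))  # (['w1', 'w3'], {1, 2, 3, 4, 5, 6})
```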
In this paper, we continue the research on the power of contextual grammars with selection languages from subfamilies of the family of regular languages. In the past, two independent hierarchies have been obtained for external and internal contextual grammars, one based on selection languages defined by structural properties (finite, monoidal, nilpotent, combinational, definite, ordered, non-counting, power-separating, suffix-closed, commutative, circular, or union-free languages), the other one based on selection languages defined by resources (number of non-terminal symbols, production rules, or states needed for generating or accepting them). In a previous paper, the language families of these hierarchies for external contextual grammars were compared and the hierarchies merged. In the present paper, we compare the language families of these hierarchies for internal contextual grammars and merge these hierarchies.
In this paper, we propose a blind source separation method for linear mixtures of dependent sources. The method is based on copula statistics, which measure the non-linear dependence between source component signals through their copula density functions. The source signals are assumed to be stationary. The method minimizes the Kullback-Leibler divergence between the copula density function of the estimated sources and that of the underlying dependency structure. The proposed method is applied to data obtained from the time-domain analysis of the classical 11-Bus 4-Machine system. Extensive simulation results demonstrate that the proposed method based on copula statistics converges faster and outperforms the state-of-the-art blind source separation method for dependent sources in terms of interference-to-signal ratio.
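A minimal sketch of the idea for two whitened sources: parameterize the demixing matrix by a rotation angle, rank-transform the estimated sources into copula pseudo-observations, and pick the angle whose empirical copula density is closest in Kullback-Leibler divergence to a target dependency structure. The grid search and KDE estimate are simplifications standing in for the paper's optimization; `target_logpdf` is an assumed user-supplied copula log-density.

```python
import numpy as np
from scipy.stats import rankdata, gaussian_kde

def pseudo_obs(s):
    # Rank-transform each estimated source into (0, 1): the copula domain.
    return np.vstack([rankdata(row) / (row.size + 1) for row in s])

def kl_to_target(u, target_logpdf):
    # Monte-Carlo estimate of KL(c_est || c_target) from a KDE fit.
    kde = gaussian_kde(u)
    return float(np.mean(kde.logpdf(u) - target_logpdf(u)))

def separate(x, target_logpdf, n_angles=180):
    # x: (2, N) whitened mixtures; search 2x2 rotations for the demixer.
    best_kl, best_theta = np.inf, None
    for theta in np.linspace(0.0, np.pi, n_angles):
        W = np.array([[np.cos(theta), -np.sin(theta)],
                      [np.sin(theta),  np.cos(theta)]])
        kl = kl_to_target(pseudo_obs(W @ x), target_logpdf)
        if kl < best_kl:
            best_kl, best_theta = kl, theta
    return best_theta, best_kl
```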
Polar duality is a well-known concept from convex geometry and analysis. In the present paper, we study two symplectically covariant versions of polar duality, keeping in mind their applications to quantum mechanics. The first variant makes use of the symplectic form on phase space and allows a precise study of the covariance matrix of a density operator, a fundamental object in quantum information theory. The second variant is a symplectically covariant version of the usual polar duality that highlights the role played by Lagrangian planes. It allows us to define the notion of "geometric quantum states", which are in bijection with generalized Gaussians.
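For reference, a minimal statement of the classical notion both variants build on, in the $\hbar$-rescaled form used in quantum-mechanical applications:

```latex
% \hbar-rescaled polar duality: for a convex body X \subset \mathbb{R}^n_x
% with 0 in its interior,
\[
X^{\hbar} = \{\, p \in \mathbb{R}^n_p : \langle p, x \rangle \le \hbar
\ \text{ for all } x \in X \,\}.
\]
% The symplectically covariant variants work on phase space \mathbb{R}^{2n},
% replacing the pairing \langle p, x \rangle with the symplectic form
% \sigma(z, z').
```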
In this paper, we present methodologies for the optimal selection of renewable energy sites under different sets of constraints and objectives. We consider two models of the site-selection problem, coarse-grained and fine-grained, and analyze them to find solutions. We consider multiple ways to measure the benefit of setting up a site. For the coarse-grained model, we provide approximation algorithms with guaranteed performance bounds for two different benefit metrics. For the fine-grained model, we provide an Integer Linear Programming formulation to find the optimal solution. We present the results of our extensive experimentation with synthetic data generated from the sparsely available real data of solar farms in Arizona.
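A minimal 0/1 ILP sketch for the fine-grained model, written with the PuLP library; the single budget constraint and per-site benefit/cost vectors are placeholders for the paper's richer constraint and objective sets.

```python
import pulp

def select_sites(benefit, cost, budget):
    """Choose a subset of candidate sites maximizing total benefit
    subject to a budget cap (illustrative constraint set)."""
    n = len(benefit)
    prob = pulp.LpProblem("site_selection", pulp.LpMaximize)
    x = [pulp.LpVariable(f"x{i}", cat="Binary") for i in range(n)]
    prob += pulp.lpSum(benefit[i] * x[i] for i in range(n))
    prob += pulp.lpSum(cost[i] * x[i] for i in range(n)) <= budget
    prob.solve(pulp.PULP_CBC_CMD(msg=False))
    return [i for i in range(n) if x[i].value() == 1]

print(select_sites(benefit=[8, 5, 7], cost=[4, 3, 4], budget=8))  # [0, 2]
```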
Aiming to expand the coverage of few-shot relations in knowledge graphs (KGs), few-shot knowledge graph completion (FKGC) has recently gained increasing research interest. Some existing models employ a few-shot relation's multi-hop neighbor information to enhance its semantic representation. However, noisy neighbor information may be amplified when the neighborhood is excessively sparse and no neighbor is available to represent the few-shot relation. Moreover, modeling and inferring complex relations of the one-to-many (1-N), many-to-one (N-1), and many-to-many (N-N) types with previous knowledge graph completion approaches requires high model complexity and a large number of training instances. Thus, inferring complex relations in the few-shot scenario is difficult for FKGC models due to limited training instances. In this paper, we propose a global-local framework for few-shot relational learning to address the above issues. At the global stage, a novel gated and attentive neighbor aggregator is built to accurately integrate the semantics of a few-shot relation's neighborhood, which helps filter out noisy neighbors even when a KG contains extremely sparse neighborhoods. At the local stage, a meta-learning based TransH (MTransH) method is designed to model complex relations and train our model in a few-shot learning fashion. Extensive experiments show that our model outperforms state-of-the-art FKGC approaches on the frequently used benchmark datasets NELL-One and Wiki-One. Compared with the strong baseline MetaR, our model improves 5-shot FKGC performance by 8.0% on NELL-One and by 2.8% on Wiki-One in terms of Hits@10.
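A sketch of the global-stage idea: attention scores weight each neighbor, and a learned gate decides how much of the aggregated neighborhood to trust versus the entity's own embedding, which is what allows very sparse or noisy neighborhoods to be down-weighted. Dimensions and scoring functions here are illustrative, not the paper's exact architecture.

```python
import torch
import torch.nn as nn

class GatedAttentiveAggregator(nn.Module):
    """Attention over neighbor embeddings plus a learned gate that can fall
    back on the entity's own embedding when the neighborhood is sparse or
    noisy (an illustrative sketch, not the paper's exact model)."""

    def __init__(self, dim):
        super().__init__()
        self.att = nn.Linear(2 * dim, 1)   # scores each (entity, neighbor) pair
        self.gate = nn.Linear(2 * dim, 1)  # how much neighborhood signal to keep

    def forward(self, entity, neighbors):
        # entity: (dim,); neighbors: (n_neighbors, dim)
        pairs = torch.cat([entity.expand_as(neighbors), neighbors], dim=-1)
        alpha = torch.softmax(self.att(pairs).squeeze(-1), dim=0)
        agg = (alpha.unsqueeze(-1) * neighbors).sum(dim=0)
        g = torch.sigmoid(self.gate(torch.cat([entity, agg])))
        return g * agg + (1 - g) * entity
```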
Many tasks in natural language processing can be viewed as multi-label classification problems. However, most existing models are trained with the standard cross-entropy loss function and use a fixed prediction policy (e.g., a threshold of 0.5) for all labels, which completely ignores the complexity of, and dependencies among, different labels. In this paper, we propose a meta-learning method to capture these complex label dependencies. More specifically, our method utilizes a meta-learner to jointly learn training policies and prediction policies for different labels. The training policies are used to train the classifier with the cross-entropy loss function, while the prediction policies are applied at inference time. Experimental results on fine-grained entity typing and text classification demonstrate that our proposed method obtains more accurate multi-label classification results.
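As a simplified stand-in for the meta-learned prediction policies, the sketch below tunes a separate threshold per label on held-out data instead of using a fixed 0.5 cutoff; the paper's meta-learner additionally learns training policies jointly with these.

```python
import numpy as np

def tune_thresholds(probs, labels, grid=np.linspace(0.05, 0.95, 19)):
    """Per-label prediction policies: for each label, pick the threshold
    maximizing F1 on held-out data.
    probs, labels: arrays of shape (n_examples, n_labels)."""
    n_labels = probs.shape[1]
    thresholds = np.full(n_labels, 0.5)
    for j in range(n_labels):
        best_f1 = -1.0
        for t in grid:
            pred = probs[:, j] >= t
            tp = np.sum(pred & (labels[:, j] == 1))
            f1 = 2 * tp / (pred.sum() + labels[:, j].sum() + 1e-9)
            if f1 > best_f1:
                best_f1, thresholds[j] = f1, t
    return thresholds
```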
In order to answer natural language questions over knowledge graphs, most processing pipelines involve entity and relation linking. Traditionally, entity linking and relation linking have been performed either as dependent sequential tasks or as independent parallel tasks. In this paper, we propose a framework called EARL, which performs entity linking and relation linking as a single joint task. EARL uses a graph-connection-based solution to the problem. We model the linking task as an instance of the Generalised Travelling Salesman Problem (GTSP) and use approximate GTSP solvers. We then develop a variant of EARL that uses a pairwise graph-distance-based solution to the problem. The system determines the best semantic connection between all keywords of the question by referring to a knowledge graph, exploiting the "connection density" between entity candidates and relation candidates. The connection-density-based solution performs on par with the approximate GTSP solution. We have empirically evaluated the framework on a dataset of 5,000 questions. Our system surpasses state-of-the-art scores on the entity linking task, reporting an accuracy of 0.65 compared with 0.40 for the next best entity linker.
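A brute-force illustration of the connection-density idea: choose one candidate per keyword so that the chosen KG nodes are maximally interconnected (small pairwise hop distances). The exhaustive product is for clarity only; EARL's GTSP approximation and pairwise-distance scoring exist precisely to avoid this exponential blow-up. `hop_dist` is a hypothetical KG distance function.

```python
from itertools import product

def joint_link(candidates, hop_dist):
    """candidates: one list of candidate KG nodes per question keyword.
    Returns the combination with the densest mutual connections."""
    best_score, best_combo = float("-inf"), None
    for combo in product(*candidates):
        # Denser connection = smaller total pairwise hop distance.
        score = -sum(hop_dist(a, b)
                     for i, a in enumerate(combo) for b in combo[i + 1:])
        if score > best_score:
            best_score, best_combo = score, combo
    return best_combo
```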
In this paper, we propose joint learning of attention and recurrent neural network (RNN) models for multi-label classification. While approaches based on either model exist (e.g., for the task of image captioning), training such network architectures typically requires pre-defined label sequences. For multi-label classification, it is desirable to have a robust inference process so that prediction errors do not propagate and degrade performance. Our proposed model uniquely integrates attention and Long Short-Term Memory (LSTM) models, which not only addresses the above problem but also allows identification of visual objects of interest with varying sizes, without prior knowledge of a particular label ordering. More importantly, label co-occurrence information can be jointly exploited by our LSTM model. Finally, by advancing the technique of beam search, our proposed network model can efficiently predict multiple labels.
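A sketch of beam search over label sequences, the decoding step used to predict multiple labels; `next_logprobs` stands in for the LSTM's per-step label distribution and is a hypothetical interface.

```python
def beam_search_labels(next_logprobs, beam_width=3, max_len=5, stop="<end>"):
    """next_logprobs(prefix) -> {label: log-prob of emitting it next}.
    Keeps the beam_width highest-scoring label prefixes at each step."""
    beams, finished = [((), 0.0)], []
    for _ in range(max_len):
        expanded = []
        for prefix, score in beams:
            for label, lp in next_logprobs(prefix).items():
                if label == stop:
                    finished.append((prefix, score + lp))
                elif label not in prefix:      # no repeated labels
                    expanded.append((prefix + (label,), score + lp))
        beams = sorted(expanded, key=lambda b: b[1], reverse=True)[:beam_width]
        if not beams:
            break
    finished.extend(beams)
    return max(finished, key=lambda b: b[1])[0]
```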