The question of whether a partition $\mathcal{P}$ and a hierarchy $\mathcal{H}$ or a tree-like split system $\mathfrak{S}$ are compatible naturally arises in a wide range of classification problems. In the setting of phylogenetic trees, one asks whether the sets of $\mathcal{P}$ coincide with the leaf sets of the connected components obtained by deleting some edges from the tree $T$ that represents $\mathcal{H}$ or $\mathfrak{S}$, respectively. More generally, we ask whether a refinement $T^*$ of $T$ exists such that $T^*$ and $\mathcal{P}$ are compatible in this sense. The latter is closely related to the question of whether there exists any tree at all that is compatible with $\mathcal{P}$. We report several characterizations of (refinements of) hierarchies and split systems that are compatible with (systems of) partitions. In addition, we provide a linear-time algorithm to check whether a tree admits a refinement that is compatible with a given partition. The latter problem becomes NP-complete, but fixed-parameter tractable, if a system of partitions is considered instead of a single partition. In this context, we also explore the close relationship between the concept of compatibility and so-called Fitch maps.
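As a concrete illustration of this notion of compatibility, the following is a minimal brute-force sketch (our own, not the linear-time algorithm mentioned above), assuming the tree is a `networkx` graph whose leaves are its degree-1 vertices; components containing no leaves are ignored, which is one possible reading of the definition.

```python
from itertools import combinations
import networkx as nx

def is_compatible(tree: nx.Graph, partition: list) -> bool:
    """Brute force: does deleting some edge set of `tree` yield
    components whose non-empty leaf sets are exactly the blocks
    of `partition`?"""
    leaves = {v for v in tree if tree.degree(v) == 1}
    blocks = {frozenset(b) for b in partition}
    edges = list(tree.edges)
    for r in range(len(edges) + 1):
        for cut in combinations(edges, r):
            forest = tree.copy()
            forest.remove_edges_from(cut)
            leaf_sets = {frozenset(c & leaves)
                         for c in nx.connected_components(forest)}
            leaf_sets.discard(frozenset())  # components without leaves
            if leaf_sets == blocks:
                return True
    return False
```

This enumeration is exponential in the number of edges and is meant only to pin down the definition; the algorithm reported above achieves linear time.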
This paper presents the slurk software, a lightweight interaction server for setting up dialog data collections and running experiments. Slurk enables a multitude of settings, including text-based, speech, and video interaction between two or more humans or between humans and bots, as well as a multimodal display area for presenting shared or private interactive context. The software is implemented in Python with an HTML and JS frontend that can easily be adapted to individual needs. It also provides a setup for pairing participants on common crowdworking platforms such as Amazon Mechanical Turk, along with example bot scripts for common interaction scenarios.
This paper is the first in a series of papers that provide a characterization of maximum packings of $T$-cuts in bipartite graphs. Given a connected graph, a set $T$ containing an even number of vertices, and a minimum $T$-join, an edge weighting can be defined, from which distances between vertices can be derived. Furthermore, given a specified vertex called the root, vertices can be classified according to their distances from the root, and this classification can be used to define a family of subgraphs called {\em distance components}. Seb\"o provided a theorem that revealed a relationship between distance components, minimum $T$-joins, and $T$-cuts. In this paper, we further investigate the structure of distance components in bipartite graphs. In particular, we focus on {\em capital} distance components, that is, those that include the root. We reveal the structure of capital distance components in terms of the $T$-join analogue of the general Kotzig-Lov\'asz canonical decomposition.
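To make the weighting concrete, here is a small illustrative sketch (our own representation, not code from the paper): edges of the given minimum $T$-join receive weight $-1$ and all other edges weight $+1$; since a minimum $T$-join induces a conservative weighting, the minimum weight of a simple root-$v$ path is well defined and serves as the distance. Grouping vertices by these distances then yields the distance components.

```python
import networkx as nx

def sebo_distances(graph: nx.Graph, t_join: set, root):
    """Distance of each vertex from `root` under the weighting that
    assigns -1 to edges of the minimum T-join and +1 to all other
    edges.  Brute force over simple paths, so small graphs only."""
    join = {frozenset(e) for e in t_join}
    weight = lambda u, v: -1 if frozenset((u, v)) in join else 1
    dist = {root: 0}
    for v in graph:
        if v != root:
            dist[v] = min(
                sum(weight(p[i], p[i + 1]) for i in range(len(p) - 1))
                for p in nx.all_simple_paths(graph, root, v))
    return dist
```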
Online markets are a part of everyday life, and their rules are governed by algorithms. Assuming participants are inherently self-interested, well-designed rules can help to increase social welfare. Many algorithms for online markets are based on prices: the seller is responsible for posting prices, while buyers make the purchases that are most profitable given the posted prices. To make adjustments to the market, the seller is allowed to update prices at certain time points. Posted prices are an intuitive way to design a market. Although each buyer acts selfishly, the seller's goal is often assumed to be that of social welfare maximization. Berger, Eden, and Feldman recently considered the case of a market with only three buyers, where each buyer has a fixed number of goods to buy and the profit of a bought bundle of items is the sum of the profits of the items in the bundle. For such markets, Berger et al. showed that the seller can maximize social welfare by dynamically updating the posted prices before the arrival of each buyer. B\'{e}rczi, B\'{e}rczi-Kov\'{a}cs, and Sz\"{o}gi showed that social welfare can also be maximized when each buyer is ready to buy at most two items. We study the power of posted prices with dynamic updates in more general cases. First, we show that the result of Berger et al. can be generalized from three to four buyers. Then we show that the result of B\'{e}rczi, B\'{e}rczi-Kov\'{a}cs, and Sz\"{o}gi can be generalized to the case where each buyer is ready to buy up to three items. We also show that dynamic pricing is possible whenever there are at most two allocations maximizing social welfare.
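As an illustration of the buyer side of such a mechanism, consider the following minimal sketch (our own, assuming additive valuations and a capacity bound as in the settings above; `values` and `prices` map item identifiers to numbers):

```python
def buyer_purchase(values: dict, prices: dict, capacity: int) -> set:
    """A self-interested buyer with additive valuations buys the most
    profitable bundle of at most `capacity` items at the posted
    prices: the top items ranked by utility (value minus price),
    skipping any item whose utility is negative."""
    ranked = sorted(prices, key=lambda i: values[i] - prices[i],
                    reverse=True)
    return {i for i in ranked[:capacity] if values[i] - prices[i] > 0}
```

The seller's task in the dynamic-pricing results discussed above is to choose prices before each arrival so that every such selfish purchase remains consistent with some welfare-maximizing allocation.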
A radio labelling of a graph $G$ is a mapping $f : V(G) \rightarrow \{0, 1, 2,\ldots\}$ such that $|f(u)-f(v)| \geq \operatorname{diam}(G) + 1 - d(u,v)$ for every pair of distinct vertices $u,v$ of $G$, where $\operatorname{diam}(G)$ is the diameter of $G$ and $d(u,v)$ is the distance between $u$ and $v$ in $G$. The radio number $rn(G)$ of $G$ is the smallest integer $k$ such that $G$ admits a radio labelling $f$ with $\max\{f(v):v \in V(G)\} = k$. The weight of a tree $T$ from a vertex $v \in V(T)$ is the sum of the distances in $T$ from $v$ to all other vertices, and a vertex of $T$ achieving the minimum weight is called a weight center of $T$. It is known that any tree has one or two weight centers. A tree is called a two-branch tree if the removal of all its weight centers results in a forest with exactly two components. In this paper we obtain a sharp lower bound for the radio number of two-branch trees, which improves a known lower bound for general trees. We also give a necessary and sufficient condition for this improved lower bound to be achieved. Using these results, we determine the radio number of two families of level-wise regular two-branch trees.
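The radio condition is straightforward to verify directly. Below is a small checker (our own sketch, using `networkx` for distances); the span of a labelling is $\max\{f(v):v \in V(G)\}$, and $rn(G)$ is the minimum span over all valid labellings.

```python
from itertools import combinations
import networkx as nx

def is_radio_labelling(graph: nx.Graph, f: dict) -> bool:
    """Check |f(u)-f(v)| >= diam(G) + 1 - d(u,v) for all distinct u, v."""
    diam = nx.diameter(graph)
    dist = dict(nx.all_pairs_shortest_path_length(graph))
    return all(abs(f[u] - f[v]) >= diam + 1 - dist[u][v]
               for u, v in combinations(graph, 2))

def span(f: dict) -> int:
    return max(f.values())
```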
We prove a bound of $O\bigl(k(n+m)\log^{d-1} n\bigr)$ on the number of incidences between $n$ points and $m$ axis-parallel boxes in $\mathbb{R}^d$, provided that no $k$ boxes contain $k$ common points, that is, the incidence graph between the points and the boxes does not contain $K_{k,k}$ as a subgraph. This new bound improves over previous work by a factor of $\log^d n$ for $d > 2$. We also study other variants of the problem. For halfspaces, using shallow cuttings, we get a near-linear bound in two and three dimensions. Finally, we present a near-linear bound for the case of shapes in the plane with low union complexity (e.g., fat triangles).
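For orientation, an incidence here is simply a point lying in a box, and the trivial algorithm counts them in $O(nm)$ time; the bound above concerns how large this count can be under the $K_{k,k}$-free condition. A minimal sketch of the containment test (our own, with boxes given by their lower and upper corner tuples):

```python
from itertools import product

def incidences(points, boxes):
    """Count point-box incidences by brute force: a point p lies in an
    axis-parallel box (lo, hi) iff lo[i] <= p[i] <= hi[i] in every
    coordinate."""
    return sum(
        all(lo[i] <= p[i] <= hi[i] for i in range(len(p)))
        for p, (lo, hi) in product(points, boxes))
```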
We revisit the widely used preferential Gaussian processes (PGP) of Chu et al. (2005) and challenge their modelling assumption that imposes rankability of data items via latent utility function values. We propose a generalisation of PGP which can capture more expressive latent preferential structures in the data and can thus be used to model inconsistent preferences, i.e., where transitivity is violated, or to discover clusters of comparable items via spectral decomposition of the learned preference functions. We also consider the properties of the associated covariance kernel functions and their reproducing kernel Hilbert space (RKHS), giving a simple construction that satisfies universality in the space of preference functions. Finally, we provide an extensive set of numerical experiments on simulated and real-world datasets showcasing the competitiveness of our proposed method with the state of the art. Our experimental findings support the conjecture that violations of rankability are ubiquitous in real-world preferential data.
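For context, the rankable model rests on the classical preference kernel: placing a GP prior with kernel $k$ on a utility $u$ and modelling a comparison $(x, x')$ through $g(x, x') = u(x) - u(x')$ induces the covariance computed below. A minimal sketch with an RBF base kernel as a stand-in (the generalisation proposed here relaxes exactly this rankable form):

```python
import numpy as np

def rbf(x, y, ls=1.0):
    return np.exp(-np.sum((x - y) ** 2) / (2 * ls ** 2))

def preference_kernel(pair_a, pair_b, base=rbf):
    """Covariance of utility differences g(x, x') = u(x) - u(x')
    under a GP prior on u with base kernel `base`."""
    (x1, x2), (y1, y2) = pair_a, pair_b
    return base(x1, y1) + base(x2, y2) - base(x1, y2) - base(x2, y1)
```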
One of the most remarkable properties of word embeddings is the fact that they capture certain types of semantic and syntactic relationships. Recently, pre-trained language models such as BERT have achieved groundbreaking results across a wide range of Natural Language Processing tasks. However, it is unclear to what extent such models capture relational knowledge beyond what is already captured by standard word embeddings. To explore this question, we propose a methodology for distilling relational knowledge from a pre-trained language model. Starting from a few seed instances of a given relation, we first use a large text corpus to find sentences that are likely to express this relation. We then use a subset of these extracted sentences as templates. Finally, we fine-tune a language model to predict whether a given word pair is likely to be an instance of some relation, when given an instantiated template for that relation as input.
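A minimal sketch of the final scoring step, using the HuggingFace `transformers` API (our own illustration: the template string is a hypothetical mined example, and `bert-base-uncased` with a freshly initialised binary head stands in for the fine-tuned model):

```python
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

# Hypothetical template mined from the corpus for a "capital-of" relation.
TEMPLATE = "X is the capital of Y"

tok = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModelForSequenceClassification.from_pretrained(
    "bert-base-uncased", num_labels=2)  # fine-tuned on templates in practice

def relation_score(head: str, tail: str) -> float:
    """Probability that (head, tail) instantiates the relation,
    judged from the instantiated template."""
    text = TEMPLATE.replace("X", head).replace("Y", tail)
    inputs = tok(text, return_tensors="pt")
    with torch.no_grad():
        logits = model(**inputs).logits
    return torch.softmax(logits, dim=-1)[0, 1].item()
```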
Recent progress has been made in using attention-based encoder-decoder frameworks for image and video captioning. Most existing decoders apply the attention mechanism to every generated word, including both visual words (e.g., "gun" and "shooting") and non-visual words (e.g., "the", "a"). However, non-visual words can be easily predicted by a natural language model without considering visual signals or attention, and imposing the attention mechanism on them can mislead the decoder and decrease the overall performance of visual captioning. Furthermore, a hierarchy of LSTMs enables more complex representations of visual data, capturing information at different scales. To address these issues, we propose a hierarchical LSTM with adaptive attention (hLSTMat) approach for image and video captioning. Specifically, the proposed framework utilizes spatial or temporal attention for selecting specific regions or frames to predict the related words, while the adaptive attention decides whether to depend on the visual information or the language context information. In addition, a hierarchy of LSTMs is designed to simultaneously consider both low-level visual information and high-level language context information to support caption generation. We initially design hLSTMat for the video captioning task; we then further refine it and apply it to the image captioning task. To demonstrate the effectiveness of our proposed framework, we test our method on both video and image captioning tasks. Experimental results show that our approach achieves state-of-the-art performance on most of the evaluation metrics for both tasks. The effect of important components is also examined in an ablation study.
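A schematic sketch of the adaptive-attention idea in PyTorch (our own simplification, not the authors' exact architecture): a learned gate mixes attended visual features with a language 'sentinel' derived from the LSTM state, so the decoder can fall back on language context for non-visual words.

```python
import torch
import torch.nn as nn

class AdaptiveAttention(nn.Module):
    def __init__(self, hidden: int, feat: int):
        super().__init__()
        self.att = nn.Linear(feat, 1)        # scores region/frame features
        self.gate = nn.Linear(hidden, 1)     # visual-vs-language gate
        self.proj = nn.Linear(hidden, feat)  # language sentinel

    def forward(self, h, feats):
        # h: (B, hidden) LSTM state; feats: (B, N, feat) regions/frames
        alpha = torch.softmax(self.att(feats).squeeze(-1), dim=1)
        visual = (alpha.unsqueeze(-1) * feats).sum(dim=1)  # attended context
        beta = torch.sigmoid(self.gate(h))                 # 1 = rely on vision
        return beta * visual + (1 - beta) * self.proj(h)
```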
Model-based methods for recommender systems have been studied extensively in recent years. In systems with a large corpus, however, the calculation cost for the learnt model to predict all user-item preferences is tremendous, which makes full-corpus retrieval extremely difficult. To overcome the calculation barriers, models such as matrix factorization resort to the inner-product form (i.e., modelling user-item preference as the inner product of user and item latent factors) and indexes to facilitate efficient approximate k-nearest neighbor searches. However, it remains challenging to incorporate more expressive interaction forms between user and item features, e.g., interactions through deep neural networks, because of the calculation cost. In this paper, we focus on the problem of introducing arbitrary advanced models to recommender systems with a large corpus. We propose a novel tree-based method which can provide logarithmic complexity w.r.t. the corpus size even with more expressive models such as deep neural networks. Our main idea is to predict user interests from coarse to fine by traversing tree nodes in a top-down fashion and making decisions for each user-node pair. We also show that the tree structure can be jointly learnt towards better compatibility with users' interest distribution and hence facilitate both training and prediction. Experimental evaluations on two large-scale real-world datasets show that the proposed method significantly outperforms traditional methods. Online A/B test results on the Taobao display advertising platform also demonstrate the effectiveness of the proposed method in production environments.
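A minimal sketch of the top-down retrieval idea (our own illustration, assuming each tree node exposes a `children` list and that `model(user, node)` returns an arbitrary learned preference score): beam search keeps a constant number of candidates per level, so the number of model evaluations grows with the tree depth, i.e., logarithmically in the corpus size, rather than linearly.

```python
def retrieve(root, user, model, beam=10, topk=10):
    """Traverse the item tree top-down, scoring user-node pairs and
    expanding only the `beam` best children at each level; leaves
    reached along the way are the candidate items."""
    frontier, leaves = [root], []
    while frontier:
        children = [c for node in frontier for c in node.children]
        if not children:
            break
        best = sorted(children, key=lambda n: model(user, n),
                      reverse=True)[:beam]
        frontier = [n for n in best if n.children]
        leaves += [n for n in best if not n.children]
    return sorted(leaves, key=lambda n: model(user, n),
                  reverse=True)[:topk]
```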
Deep neural networks and decision trees operate on largely separate paradigms; typically, the former performs representation learning with pre-specified architectures, while the latter is characterised by learning hierarchies over pre-specified features with data-driven architectures. We unite the two via adaptive neural trees (ANTs), a model that incorporates representation learning into edges, routing functions and leaf nodes of a decision tree, along with a backpropagation-based training algorithm that adaptively grows the architecture from primitive modules (e.g., convolutional layers). ANTs allow increased interpretability via hierarchical clustering, e.g., learning meaningful class associations, such as separating natural vs. man-made objects. We demonstrate this on classification and regression tasks, achieving over 99% and 90% accuracy on the MNIST and CIFAR-10 datasets, and outperforming standard neural networks, random forests and gradient boosted trees on the SARCOS dataset. Furthermore, ANT optimisation naturally adapts the architecture to the size and complexity of the training data.
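A schematic sketch of an ANT node in PyTorch (our own simplification; module names and shapes are assumptions): every node applies an edge transformer, internal nodes softly route the result between two children, and leaves apply a solver that produces the prediction.

```python
import torch
import torch.nn as nn

class ANTNode(nn.Module):
    def __init__(self, transformer, router=None, children=None, solver=None):
        super().__init__()
        self.transformer = transformer            # e.g. a small conv block
        self.router = router                      # None at leaf nodes
        self.kids = nn.ModuleList(children or [])
        self.solver = solver                      # None at internal nodes

    def forward(self, x):
        x = self.transformer(x)
        if self.solver is not None:               # leaf: predict
            return self.solver(x)
        p = torch.sigmoid(self.router(x))         # soft routing prob, (B, 1)
        left, right = self.kids
        return p * left(x) + (1 - p) * right(x)   # probability-weighted mix
```

The growth procedure adds such primitive modules incrementally, which is how the architecture adapts to the training data.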