Given an $n$-vertex planar embedded digraph $G$ with non-negative edge weights and a face $f$ of $G$, Klein presented a data structure with $O(n\log n)$ space and preprocessing time which can answer any query $(u,v)$ for the shortest path distance in $G$ from $u$ to $v$ or from $v$ to $u$ in $O(\log n)$ time, provided $u$ is on $f$. This data structure is a key tool in a number of state-of-the-art algorithms and data structures for planar graphs. Klein's data structure relies on dynamic trees and the persistence technique as well as a highly non-trivial interaction between primal shortest path trees and their duals. The construction of our data structure follows a completely different and in our opinion very simple divide-and-conquer approach that solely relies on Single-Source Shortest Path computations and contractions in the primal graph. Our space and preprocessing time bound is $O(n\log |f|)$ and query time is $O(\log |f|)$ which is an improvement over Klein's data structure when $f$ has small size.
We show that feasibility of the $t^\text{th}$ level of the Lasserre semidefinite programming hierarchy for graph isomorphism can be expressed as a homomorphism indistinguishability relation. In other words, we define a class $\mathcal{L}_t$ of graphs such that graphs $G$ and $H$ are not distinguished by the $t^\text{th}$ level of the Lasserre hierarchy if and only if they admit the same number of homomorphisms from any graph in $\mathcal{L}_t$. By analysing the treewidth of graphs in $\mathcal{L}_t$ we prove that the $3t^\text{th}$ level of Sherali--Adams linear programming hierarchy is as strong as the $t^\text{th}$ level of Lasserre. Moreover, we show that this is best possible in the sense that $3t$ cannot be lowered to $3t-1$ for any $t$. The same result holds for the Lasserre hierarchy with non-negativity constraints, which we similarly characterise in terms of homomorphism indistinguishability over a family $\mathcal{L}_t^+$ of graphs. Additionally, we give characterisations of level-$t$ Lasserre with non-negativity constraints in terms of logical equivalence and via a graph colouring algorithm akin to the Weisfeiler--Leman algorithm. This provides a polynomial time algorithm for determining if two given graphs are distinguished by the $t^\text{th}$ level of the Lasserre hierarchy with non-negativity constraints.
In scene graph generation (SGG), learning with cross-entropy loss yields biased predictions owing to the severe imbalance in the distribution of the relationship labels in the dataset. Thus, this study proposes a method to generate scene graphs using optimal transport as a measure for comparing two probability distributions. We apply learning with the optimal transport loss, which reflects the similarity between the labels in terms of transportation cost, for predicate classification in SGG. In the proposed approach, the transportation cost of the optimal transport is defined using the similarity of words obtained from the pre-trained model. The experimental evaluation of the effectiveness demonstrates that the proposed method outperforms existing methods in terms of mean Recall@50 and 100. Furthermore, it improves the recall of the relationship labels scarcely available in the dataset.
NSFW (Not Safe for Work) content, in the context of a dialogue, can have severe side effects on users in open-domain dialogue systems. However, research on detecting NSFW language, especially sexually explicit content, within a dialogue context has significantly lagged behind. To address this issue, we introduce CensorChat, a dialogue monitoring dataset aimed at NSFW dialogue detection. Leveraging knowledge distillation techniques involving GPT-4 and ChatGPT, this dataset offers a cost-effective means of constructing NSFW content detectors. The process entails collecting real-life human-machine interaction data and breaking it down into single utterances and single-turn dialogues, with the chatbot delivering the final utterance. ChatGPT is employed to annotate unlabeled data, serving as a training set. Rationale validation and test sets are constructed using ChatGPT and GPT-4 as annotators, with a self-criticism strategy for resolving discrepancies in labeling. A BERT model is fine-tuned as a text classifier on pseudo-labeled data, and its performance is assessed. The study emphasizes the importance of AI systems prioritizing user safety and well-being in digital conversations while respecting freedom of expression. The proposed approach not only advances NSFW content detection but also aligns with evolving user protection needs in AI-driven dialogues.
3D scene reconstruction from 2D images has been a long-standing task. Instead of estimating per-frame depth maps and fusing them in 3D, recent research leverages the neural implicit surface as a unified representation for 3D reconstruction. Equipped with data-driven pre-trained geometric cues, these methods have demonstrated promising performance. However, inaccurate prior estimation, which is usually inevitable, can lead to suboptimal reconstruction quality, particularly in some geometrically complex regions. In this paper, we propose a two-stage training process, decouple view-dependent and view-independent colors, and leverage two novel consistency constraints to enhance detail reconstruction performance without requiring extra priors. Additionally, we introduce an essential mask scheme to adaptively influence the selection of supervision constraints, thereby improving performance in a self-supervised paradigm. Experiments on synthetic and real-world datasets show the capability of reducing the interference from prior estimation errors and achieving high-quality scene reconstruction with rich geometric details.
The key assumption underlying linear Markov Decision Processes (MDPs) is that the learner has access to a known feature map $\phi(x, a)$ that maps state-action pairs to $d$-dimensional vectors, and that the rewards and transitions are linear functions in this representation. But where do these features come from? In the absence of expert domain knowledge, a tempting strategy is to use the ``kitchen sink" approach and hope that the true features are included in a much larger set of potential features. In this paper we revisit linear MDPs from the perspective of feature selection. In a $k$-sparse linear MDP, there is an unknown subset $S \subset [d]$ of size $k$ containing all the relevant features, and the goal is to learn a near-optimal policy in only poly$(k,\log d)$ interactions with the environment. Our main result is the first polynomial-time algorithm for this problem. In contrast, earlier works either made prohibitively strong assumptions that obviated the need for exploration, or required solving computationally intractable optimization problems. Along the way we introduce the notion of an emulator: a succinct approximate representation of the transitions that suffices for computing certain Bellman backups. Since linear MDPs are a non-parametric model, it is not even obvious whether polynomial-sized emulators exist. We show that they do exist and can be computed efficiently via convex programming. As a corollary of our main result, we give an algorithm for learning a near-optimal policy in block MDPs whose decoding function is a low-depth decision tree; the algorithm runs in quasi-polynomial time and takes a polynomial number of samples. This can be seen as a reinforcement learning analogue of classic results in computational learning theory. Furthermore, it gives a natural model where improving the sample complexity via representation learning is computationally feasible.
Given $n$-vertex simple graphs $X$ and $Y$, the friends-and-strangers graph $\mathsf{FS}(X, Y)$ has as its vertices all $n!$ bijections from $V(X)$ to $V(Y)$, where two bijections are adjacent if and only if they differ on two adjacent elements of $V(X)$ whose mappings are adjacent in $Y$. We consider the setting where $X$ and $Y$ are both edge-subgraphs of $K_{r,r}$: due to a parity obstruction, $\mathsf{FS}(X,Y)$ is always disconnected in this setting. Sharpening a result of Bangachev, we show that if $X$ and $Y$ respectively have minimum degrees $\delta(X)$ and $\delta(Y)$ and they satisfy $\delta(X) + \delta(Y) \geq \lfloor 3r/2 \rfloor + 1$, then $\mathsf{FS}(X,Y)$ has exactly two connected components. This proves that the cutoff for $\mathsf{FS}(X,Y)$ to avoid isolated vertices is equal to the cutoff for $\mathsf{FS}(X,Y)$ to have exactly two connected components. We also consider a probabilistic setup in which we fix $Y$ to be $K_{r,r}$, but randomly generate $X$ by including each edge in $K_{r,r}$ independently with probability $p$. Invoking a result of Zhu, we exhibit a phase transition phenomenon with threshold function $(\log r)/r$: below the threshold, $\mathsf{FS}(X,Y)$ has more than two connected components with high probability, while above the threshold, $\mathsf{FS}(X,Y)$ has exactly two connected components with high probability. Altogether, our results settle a conjecture and completely answer two problems of Alon, Defant, and Kravitz.
It is shown that every $n$-vertex graph that admits a 2-bend RAC drawing in the plane, where the edges are polylines with two bends per edge and any pair of edges can only cross at a right angle, has at most $20n-24$ edges for $n\geq 3$. This improves upon the previous upper bound of $74.2n$; this is the first improvement in more than 12 years. A crucial ingredient of the proof is an upper bound on the size of plane multigraphs with polyline edges in which the first and last segments are either parallel or orthogonal.
We propose a method, named DualMesh-UDF, to extract a surface from unsigned distance functions (UDFs), encoded by neural networks, or neural UDFs. Neural UDFs are becoming increasingly popular for surface representation because of their versatility in presenting surfaces with arbitrary topologies, as opposed to the signed distance function that is limited to representing a closed surface. However, the applications of neural UDFs are hindered by the notorious difficulty in extracting the target surfaces they represent. Recent methods for surface extraction from a neural UDF suffer from significant geometric errors or topological artifacts due to two main difficulties: (1) A UDF does not exhibit sign changes; and (2) A neural UDF typically has substantial approximation errors. DualMesh-UDF addresses these two difficulties. Specifically, given a neural UDF encoding a target surface $\bar{S}$ to be recovered, we first estimate the tangent planes of $\bar{S}$ at a set of sample points close to $\bar{S}$. Next, we organize these sample points into local clusters, and for each local cluster, solve a linear least squares problem to determine a final surface point. These surface points are then connected to create the output mesh surface, which approximates the target surface. The robust estimation of the tangent planes of the target surface and the subsequent minimization problem constitute our core strategy, which contributes to the favorable performance of DualMesh-UDF over other competing methods. To efficiently implement this strategy, we employ an adaptive Octree. Within this framework, we estimate the location of a surface point in each of the octree cells identified as containing part of the target surface. Extensive experiments show that our method outperforms existing methods in terms of surface reconstruction quality while maintaining comparable computational efficiency.
Let $A$ and $B$ be sets of vertices in a graph $G$. Menger's theorem states that for every positive integer $k$, either there exists a collection of $k$ vertex-disjoint paths between $A$ and $B$, or $A$ can be separated from $B$ by a set of at most $k-1$ vertices. Let $\Delta$ be the maximum degree of $G$. We show that there exists a function $f(\Delta) = (\Delta+1)^{\Delta^2+1}$, so that for every positive integer $k$, either there exists a collection of $k$ vertex-disjoint and pairwise anticomplete paths between $A$ and $B$, or $A$ can be separated from $B$ by a set of at most $k \cdot f(\Delta)$ vertices. We also show that the result can be generalized from bounded-degree graphs to graphs excluding a topological minor. On the negative side, we show that no such relation holds on graphs that have degeneracy 2 and arbitrarily large girth, even when $k = 2$. Similar results were obtained independently and concurrently by Hendrey, Norin, Steiner, and Turcotte [arXiv:2309.07905].
High spectral dimensionality and the shortage of annotations make hyperspectral image (HSI) classification a challenging problem. Recent studies suggest that convolutional neural networks can learn discriminative spatial features, which play a paramount role in HSI interpretation. However, most of these methods ignore the distinctive spectral-spatial characteristic of hyperspectral data. In addition, a large amount of unlabeled data remains an unexploited gold mine for efficient data use. Therefore, we proposed an integration of generative adversarial networks (GANs) and probabilistic graphical models for HSI classification. Specifically, we used a spectral-spatial generator and a discriminator to identify land cover categories of hyperspectral cubes. Moreover, to take advantage of a large amount of unlabeled data, we adopted a conditional random field to refine the preliminary classification results generated by GANs. Experimental results obtained using two commonly studied datasets demonstrate that the proposed framework achieved encouraging classification accuracy using a small number of data for training.