Low-distortional metric embeddings are a crucial component in the modern algorithmic toolkit. In an online metric embedding, points arrive sequentially and the goal is to embed them into a simple space irrevocably, while minimizing the distortion. Our first result is a deterministic online embedding of a general metric into Euclidean space with distortion $O(\log n)\cdot\min\{\sqrt{\log\Phi},\sqrt{n}\}$ (or, $O(d)\cdot\min\{\sqrt{\log\Phi},\sqrt{n}\}$ if the metric has doubling dimension $d$), resolving a conjecture of Newman and Rabinovich (2020) and quadratically improving the dependence on the aspect ratio $\Phi$ of Indyk et al.\ (2010). Our second result is a stochastic embedding of a metric space into trees with expected distortion $O(d\cdot \log\Phi)$, generalizing previous results (Indyk et al.\ (2010), Bartal et al.\ (2020)). Next, we study the \emph{online minimum-weight perfect matching} problem, where a sequence of $2n$ metric points arrives in pairs and one has to maintain a perfect matching at all times. We allow recourse (as otherwise the order of arrival determines the matching). The goal is to return a perfect matching that approximates the \emph{minimum-weight} perfect matching at all times, while minimizing the recourse. Our third result is a randomized algorithm with competitive ratio $O(d\cdot \log \Phi)$ and recourse $O(\log \Phi)$ against an oblivious adversary; this result is obtained via our new stochastic online embedding. Our fourth result is a deterministic algorithm against an adaptive adversary, using $O(\log^2 n)$ recourse, that maintains a matching of weight at most $O(\log n)$ times the weight of the MST, i.e., a matching of lightness $O(\log n)$. We complement our upper bounds with a strategy for an oblivious adversary that, with recourse $r$, establishes a lower bound of $\Omega(\frac{\log n}{r \log r})$ for both competitive ratio and lightness.
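For concreteness, the distortion and aspect ratio referred to above are the standard quantities: for an embedding $f:(X,d_X)\to(Y,d_Y)$,
\[
\mathrm{dist}(f) \;=\; \Bigl(\max_{u\neq v}\frac{d_Y(f(u),f(v))}{d_X(u,v)}\Bigr)\cdot\Bigl(\max_{u\neq v}\frac{d_X(u,v)}{d_Y(f(u),f(v))}\Bigr),
\qquad
\Phi \;=\; \frac{\max_{u\neq v} d_X(u,v)}{\min_{u\neq v} d_X(u,v)},
\]
i.e., the product of the worst-case expansion and the worst-case contraction of $f$, and the ratio of the largest to the smallest pairwise distance in the input metric.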
We study data-free knowledge distillation (KD) for monocular depth estimation (MDE), which learns a lightweight model for real-world depth perception tasks by distilling it from a trained teacher model without access to training data in the target domain. Owing to the essential difference between image classification and dense regression, previous methods of data-free KD are not applicable to MDE. To strengthen the applicability of KD in real-world tasks, in this paper we propose to apply KD with out-of-distribution simulated images. The major challenges to be resolved are i) the lack of prior information about the scene configurations of real-world training data and ii) the domain shift between simulated and real-world images. To cope with these difficulties, we propose a tailored framework for depth distillation. The framework generates new training samples to cover a multitude of possible object arrangements in the target domain and utilizes a transformation network to efficiently adapt them to the feature statistics preserved in the teacher model. Through extensive experiments on various depth estimation models and two different datasets, we show that our method outperforms the baseline KD by a good margin and even achieves slightly better performance with as few as 1/6 of the training images, demonstrating a clear superiority.
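As a rough illustration of the distillation step described above, the following PyTorch sketch pairs a frozen teacher with a student and a transformation network that adapts out-of-distribution simulated images before distillation; `transform_net`, `student`, and `teacher` are hypothetical stand-ins, and the paper's actual sample-generation and adaptation losses are not reproduced here.

```python
# Minimal sketch of one depth-distillation step, assuming `optimizer`
# covers the parameters of both `student` and `transform_net`.
import torch
import torch.nn.functional as F

def distill_step(sim_images, teacher, student, transform_net, optimizer):
    # Adapt simulated images toward the feature statistics the frozen
    # teacher was trained on (the role of the transformation network).
    adapted = transform_net(sim_images)
    with torch.no_grad():
        target_depth = teacher(adapted)          # pseudo-labels from teacher
    pred_depth = student(adapted)
    loss = F.l1_loss(pred_depth, target_depth)   # dense regression distillation
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```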
Representing and rendering dynamic scenes has been an important but challenging task. In particular, it is hard to guarantee high efficiency while accurately modeling complex motions. To achieve real-time dynamic scene rendering with high training and storage efficiency, we propose 4D Gaussian Splatting (4D-GS) as a holistic representation for dynamic scenes, rather than applying 3D-GS to each individual frame. In 4D-GS, we propose a novel explicit representation containing both 3D Gaussians and 4D neural voxels. A decomposed neural voxel encoding algorithm inspired by HexPlane efficiently builds Gaussian features from the 4D neural voxels, and a lightweight MLP then predicts Gaussian deformations at novel timestamps. Our 4D-GS method achieves real-time rendering at high resolutions (82 FPS at 800$\times$800 on an RTX 3090 GPU) while maintaining quality comparable to or better than previous state-of-the-art methods. More demos and code are available at //guanjunwu.github.io/4dgs/.
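The decomposed encoding can be pictured with a small HexPlane-style sketch: features of a 4D point $(x,y,z,t)$ are sampled from six 2D feature planes, fused by multiplication, and fed to a compact MLP that outputs a deformation. Module sizes and the fusion rule below are illustrative assumptions, not the paper's exact configuration.

```python
# HexPlane-style decomposed 4D encoding (illustrative sketch, PyTorch).
import torch
import torch.nn as nn
import torch.nn.functional as F

PLANES = [(0, 1), (0, 2), (1, 2), (0, 3), (1, 3), (2, 3)]  # xy,xz,yz,xt,yt,zt

class HexPlaneField(nn.Module):
    def __init__(self, feat_dim=32, res=64):
        super().__init__()
        self.planes = nn.ParameterList(
            [nn.Parameter(0.1 * torch.randn(1, feat_dim, res, res))
             for _ in PLANES])
        self.mlp = nn.Sequential(nn.Linear(feat_dim, 64), nn.ReLU(),
                                 nn.Linear(64, 3))   # xyz deformation

    def forward(self, pts4d):                        # (N, 4), coords in [-1, 1]
        feat = 1.0
        for plane, (a, b) in zip(self.planes, PLANES):
            uv = pts4d[:, [a, b]].view(1, -1, 1, 2)  # (1, N, 1, 2) sample grid
            s = F.grid_sample(plane, uv, align_corners=True)
            feat = feat * s.view(plane.shape[1], -1).t()  # multiplicative fusion
        return self.mlp(feat)                        # per-Gaussian deformation
```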
This paper presents a comparative analysis of different optimization techniques for the K-means algorithm in the context of big data. K-means is a widely used clustering algorithm, but it can suffer from scalability issues when dealing with large datasets. The paper explores different approaches to overcome these issues, including parallelization, approximation, and sampling methods. The paper evaluates the performance of these techniques on various benchmark datasets and compares them in terms of speed, quality of clustering, and scalability according to the LIMA dominance criterion. The results show that different techniques are more suitable for different types of datasets and provide insights into the trade-offs between speed and accuracy in K-means clustering for big data. Overall, the paper offers a comprehensive guide for practitioners and researchers on how to optimize K-means for big data applications.
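As one concrete instance of the sampling-style optimizations such a comparison covers, mini-batch K-means updates centroids from small random batches instead of full passes over the data; the scikit-learn snippet below (a generic example, not the paper's experimental setup) contrasts it with standard K-means via inertia.

```python
# Speed/quality trade-off of a sampling-based K-means variant.
from sklearn.cluster import KMeans, MiniBatchKMeans
from sklearn.datasets import make_blobs

X, _ = make_blobs(n_samples=100_000, centers=8, random_state=0)

full = KMeans(n_clusters=8, n_init=10, random_state=0).fit(X)
mini = MiniBatchKMeans(n_clusters=8, batch_size=1024, n_init=10,
                       random_state=0).fit(X)

# Within-cluster sum of squares: mini-batch is faster, slightly less tight.
print(full.inertia_, mini.inertia_)
```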
This paper gives a broad account of the various sequent-based proof formalisms in the proof-theoretic literature. We consider formalisms for various modal and tense logics, intuitionistic logic, conditional logics, and bunched logics. After providing an overview of the logics and proof formalisms under consideration, we show how these sequent-based formalisms can be placed in a hierarchy in terms of the underlying data structure of the sequents. We then discuss how this hierarchy can be traversed using translations. Translating proofs up this hierarchy is found to be relatively easy while translating proofs down the hierarchy is substantially more difficult. Finally, we inspect the prevalent distinction in structural proof theory between 'internal calculi' and 'external calculi'. It is observed that these classes resist a rigorous separation, and we critically assess the properties that (calculi from) these classes are purported to possess.
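To illustrate the data-structure dimension of this hierarchy with a standard textbook example not specific to the paper: an ordinary sequent is a pair of multisets of formulas, while a nested sequent is a tree of such sequents,
\[
\Gamma \vdash \Delta
\qquad\text{vs.}\qquad
\Gamma \vdash \Delta,\ [\,\Gamma' \vdash \Delta'\,],\ [\,\Gamma'' \vdash \Delta''\,],
\]
where each bracketed component is a child sequent (read modally, e.g., as a boxed zone in calculi for modal logics); structures such as labelled sequents, whose underlying data structure is a graph, are more general still.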
The Knapsack Problem is a classic problem in combinatorial optimisation. Solving such problems exactly can be computationally expensive. Recent years have seen a growing interest in the use of deep learning methods to approximate the solutions to such problems. A core problem is how to enforce or encourage constraint satisfaction in predicted solutions. A promising approach for predicting solutions to constrained optimisation problems is the Lagrangian Dual Framework, which builds on the method of Lagrangian relaxation. In this paper, we develop neural network models to approximate Knapsack Problem solutions using the Lagrangian Dual Framework while improving constraint satisfaction. We explore the problems of output interpretation and model selection within this context. Experimental results show strong constraint satisfaction with a minor reduction in optimality compared to a baseline neural network which does not explicitly model the constraints.
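A minimal sketch of how the Lagrangian Dual Framework can be wired into training, assuming a model that outputs per-item selection probabilities via a final sigmoid; the names, shapes, and single-instance setup are illustrative, not the paper's exact formulation.

```python
# One Lagrangian Dual Framework step for a knapsack predictor (PyTorch):
# the capacity constraint enters the loss with a multiplier `lam`
# that is updated by subgradient ascent on the violation.
import torch
import torch.nn.functional as F

def ldf_step(model, opt, values, weights, capacity, target, lam, lam_lr=0.01):
    x = model(torch.cat([values, weights, capacity.view(1)]))  # probs in (0, 1)
    task_loss = F.binary_cross_entropy(x, target)
    violation = torch.relu((weights * x).sum() - capacity)     # max(0, g(x))
    loss = task_loss + lam * violation                         # Lagrangian
    opt.zero_grad()
    loss.backward()
    opt.step()
    # Dual update: raise the multiplier while the constraint is violated.
    return max(0.0, lam + lam_lr * violation.item())
```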
We propose a novel machine learning algorithm for simulating radiative transfer. Our algorithm is based on physics-informed neural networks (PINNs), which are trained by minimizing the residual of the underlying radiative transfer equations. We present extensive experiments and theoretical error estimates to demonstrate that PINNs provide an easy-to-implement, fast, robust, and accurate method for simulating radiative transfer. We also present a PINN-based algorithm for efficiently simulating inverse problems for radiative transfer.
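The core PINN recipe is easy to show on a toy surrogate. The sketch below trains a network to satisfy a 1D transport equation $\mu\,u'(x) + \sigma\,u(x) = q$ with $u(0)=u_0$, standing in for the full radiative transfer equation; all constants and network sizes are illustrative.

```python
# PINN for a toy 1D transport equation (PyTorch): minimize the PDE
# residual at random collocation points plus a boundary penalty.
import torch

net = torch.nn.Sequential(torch.nn.Linear(1, 32), torch.nn.Tanh(),
                          torch.nn.Linear(32, 1))
opt = torch.optim.Adam(net.parameters(), lr=1e-3)
mu, sigma, q, u0 = 1.0, 2.0, 1.0, 0.0

for step in range(2000):
    x = torch.rand(256, 1, requires_grad=True)       # collocation points in (0, 1)
    u = net(x)
    du = torch.autograd.grad(u.sum(), x, create_graph=True)[0]
    residual = mu * du + sigma * u - q               # transport-equation residual
    bc = (net(torch.zeros(1, 1)) - u0) ** 2          # inflow boundary condition
    loss = (residual ** 2).mean() + bc.mean()
    opt.zero_grad()
    loss.backward()
    opt.step()
```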
We present parallel proof-of-work with DAG-style voting, a novel proof-of-work cryptocurrency protocol that, compared to Bitcoin, provides better consistency guarantees, higher transaction throughput, lower transaction confirmation latency, and higher resilience against incentive attacks. The superior consistency guarantees follow from implementing parallel proof-of-work, a recent consensus scheme that enforces a configurable number of proof-of-work votes per block. Our work is inspired by another recent protocol, Tailstorm, which structures the individual votes as a tree and mitigates incentive attacks by discounting the mining rewards proportionally to the depth of the tree. We propose to structure the votes as a directed acyclic graph (DAG) instead of a tree. This allows for a more targeted punishment of offending miners and, as we show through a reinforcement-learning-based attack search, makes the protocol even more resilient to incentive attacks. An interesting by-product of our analysis is that parallel proof-of-work without reward discounting is less resilient to incentive attacks than Bitcoin in some realistic network scenarios.
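To make the discounting idea concrete, here is an illustrative (not protocol-accurate) sketch of a Tailstorm-style rule: with $k$ required votes per block, rewards are scaled by the depth of the vote structure relative to $k$, so withholding attacks that produce shallow, parallel structures earn less.

```python
# Illustrative depth-based reward discounting for one summary block.
def discounted_rewards(votes, k, base_reward=1.0):
    # votes: list of (miner_id, depth_of_vote) pairs; depth <= k.
    depth = max(d for _, d in votes)           # depth of the vote tree/DAG
    per_vote = base_reward * depth / (k * len(votes))
    rewards = {}
    for miner, _ in votes:                     # miners may cast several votes
        rewards[miner] = rewards.get(miner, 0.0) + per_vote
    return rewards
```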
The subgraph isomorphism finding problem is a well-studied problem in computer science and graph theory; it aims to enumerate all instances of a query graph in a given data graph. In this paper, we propose an efficient method, SubISO, to find subgraph isomorphisms using an objective function that exploits isomorphic invariants and the eccentricity of the query graph's vertices. The proposed objective function is used to determine a pivot vertex, which minimizes both the number and size of the candidate regions in the data graph. SubISO also limits the maximum number of recursive calls of the generic SubgraphSearch function to deal with straggler queries, for which most existing algorithms show exponential behaviour. The proposed approach is evaluated over three benchmark datasets and compared with three well-known subgraph isomorphism finding algorithms in terms of execution time, number of identified embeddings, and ability to deal with straggler queries, and it performs significantly better.
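The recursion cap on SubgraphSearch can be pictured with the following simplified backtracking sketch (networkx-style graph objects assumed; SubISO's candidate filtering and pivot-selection objective are omitted):

```python
# Generic SubgraphSearch with a call budget to cut off straggler queries.
def subgraph_search(query, data, mapping, order, budget):
    if budget[0] <= 0:                 # budget exhausted: stop recursing
        return
    budget[0] -= 1
    if len(mapping) == len(order):
        yield dict(mapping)            # found one embedding
        return
    u = order[len(mapping)]            # next query vertex to map
    for v in data.nodes():             # candidates (unfiltered in this sketch)
        if v in mapping.values():
            continue
        # every already-mapped neighbor of u must map to a neighbor of v
        if all(mapping[w] in data[v] for w in query[u] if w in mapping):
            mapping[u] = v
            yield from subgraph_search(query, data, mapping, order, budget)
            del mapping[u]

# e.g. embeddings = list(subgraph_search(Q, G, {}, list(Q.nodes()), [100_000]))
```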
Knowledge graph completion aims to predict missing relations between entities in a knowledge graph. While many different methods have been proposed, there is a lack of a unifying framework that would lead to state-of-the-art results. Here we develop PathCon, a knowledge graph completion method that harnesses four novel insights to outperform existing methods. PathCon predicts relations between a pair of entities by: (1) considering the Relational Context of each entity, capturing the relation types adjacent to the entity and modeling them through a novel edge-based message passing scheme; (2) considering the Relational Paths capturing all paths between the two entities; and (3) adaptively integrating the Relational Context and Relational Paths through a learnable attention mechanism. Importantly, (4) in contrast to conventional node-based representations, PathCon represents context and paths using only the relation types, which makes it applicable in an inductive setting. Experimental results on knowledge graph benchmarks as well as our newly proposed dataset show that PathCon outperforms state-of-the-art knowledge graph completion methods by a large margin. Finally, PathCon is able to provide interpretable explanations by identifying the relations that provide the context and the paths that are important for a given predicted relation.
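A rough sketch of what edge-based message passing over relation types can look like; the tensor layouts and update rule are simplified assumptions, but the key property, that states live on edges and never depend on node identities, is what makes the scheme inductive.

```python
# Edge-based message passing using only relation-type embeddings (PyTorch).
import torch
import torch.nn as nn

class EdgeMessagePassing(nn.Module):
    def __init__(self, n_relations, dim):
        super().__init__()
        self.edge_emb = nn.Embedding(n_relations, dim)  # relation types only
        self.update = nn.Linear(2 * dim, dim)

    def forward(self, edge_index, edge_type, n_nodes, n_iters=2):
        h = self.edge_emb(edge_type)            # (E, dim) initial edge states
        src, dst = edge_index                   # LongTensors of shape (E,)
        for _ in range(n_iters):
            m = h.new_zeros((n_nodes, h.size(1)))
            m.index_add_(0, src, h)             # edge -> node aggregation
            m.index_add_(0, dst, h)
            h = torch.relu(self.update(         # node -> edge update
                torch.cat([m[src] + m[dst], h], dim=-1)))
        return h                                # context representation per edge
```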
We study how to generate captions that are not only accurate in describing an image but also discriminative across different images. The problem is both fundamental and interesting, as most machine-generated captions, despite phenomenal research progress in the past several years, are expressed in a monotonous and featureless manner. While such captions are normally accurate, they often lack important characteristics of human language: distinctiveness for each caption and diversity across different images. To address this problem, we propose a novel conditional generative adversarial network for generating diverse captions across images. Instead of estimating the quality of a caption solely with respect to one image, the proposed comparative adversarial learning framework better assesses the quality of captions by comparing a set of captions within the image-caption joint space. By contrasting with human-written captions and image-mismatched captions, the caption generator effectively exploits the inherent characteristics of human language and generates more discriminative captions. We show that our proposed network is capable of producing accurate and diverse captions across images.
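The comparative idea can be sketched as a softmax over scores of competing captions for the same image; `score` below is a hypothetical image-caption similarity head, and in practice the generator update for discrete captions additionally requires a policy-gradient or similar estimator.

```python
# Comparative adversarial losses: rank a caption against competitors.
import torch
import torch.nn.functional as F

def comparative_losses(score, image, generated, human, mismatched):
    # Score each candidate caption against the same image, then normalize
    # across the candidate set rather than judging each pair in isolation.
    logits = torch.stack([score(image, c)
                          for c in (generated, human, mismatched)])
    logp = F.log_softmax(logits, dim=0)
    d_loss = -logp[1]   # discriminator: the human caption should win
    g_loss = -logp[0]   # generator: its own caption should win
    return d_loss, g_loss
```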