We consider the maximum weight $b$-matching problem in the random-order semi-streaming model. Assuming all weights are small integers drawn from $[1,W]$, we present a $2 - \frac{1}{2W} + \varepsilon$ approximation algorithm using $O(\max(|M_G|, n) \cdot \mathrm{poly}(\log m, W, 1/\varepsilon))$ memory, where $|M_G|$ denotes the cardinality of the optimal matching. Our result generalizes that of Bernstein [Bernstein, 2015], which achieves a $3/2 + \varepsilon$ approximation for maximum cardinality simple matching. When $W$ is small, our result also improves upon that of Gamlath et al. [Gamlath et al., 2019], which obtains a $2 - \delta$ approximation (for some small constant $\delta \sim 10^{-17}$) for maximum weight simple matching. In particular, for the weighted $b$-matching problem, ours is the first result to beat the approximation ratio of $2$. Our technique hinges on a generalized weighted version of the edge-degree constrained subgraphs originally developed by Bernstein and Stein [Bernstein and Stein, 2015]. Such a subgraph has bounded vertex degree (and hence only a small number of edges) and can be computed easily. The fact that it contains a $2 - \frac{1}{2W} + \varepsilon$ approximation of the maximum weight matching is proved using the classical K\H{o}nig-Egerv\'ary duality theorem.
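To make the key object concrete, here is a minimal sketch of the unweighted edge-degree constrained subgraph (EDCS) construction via local search; the weighted generalization developed here is more involved, and the parameter names `beta`/`beta_minus` and the naive fix-up loop are illustrative assumptions rather than the paper's procedure.

```python
from collections import defaultdict

def edcs(edges, beta, beta_minus):
    """Local-search sketch of an (unweighted) edge-degree constrained
    subgraph H: every edge (u, v) kept in H has deg_H(u) + deg_H(v) <= beta,
    and every edge left out has deg_H(u) + deg_H(v) >= beta_minus.
    Assumes beta_minus <= beta - 1, so adding an edge cannot immediately
    violate the upper bound; termination follows from a standard
    potential-function argument."""
    H, deg = set(), defaultdict(int)
    changed = True
    while changed:
        changed = False
        for u, v in edges:
            d = deg[u] + deg[v]
            if (u, v) in H and d > beta:              # H-edge violates upper bound
                H.discard((u, v)); deg[u] -= 1; deg[v] -= 1; changed = True
            elif (u, v) not in H and d < beta_minus:  # missing edge violates lower bound
                H.add((u, v)); deg[u] += 1; deg[v] += 1; changed = True
    return H
```

Because every vertex degree in $H$ is at most $\beta$, the subgraph has $O(n\beta)$ edges, which is what makes it usable as a small streaming summary.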
The problem of String Matching to Labeled Graphs (SMLG) asks for all paths in a labeled graph $G = (V, E)$ whose spellings match that of an input string $S \in \Sigma^m$. SMLG can be solved in quadratic $O(m|E|)$ time [Amir et al., JALG], which was recently proven optimal by a lower bound conditional on SETH [Equi et al., ICALP 2019]: no strongly subquadratic time algorithm exists, even when restricted to directed acyclic graphs (DAGs). In this work we present the first parameterized algorithms for SMLG on DAGs. Our parameters capture the topological structure of $G$. All our results are derived from a generalization of the Knuth-Morris-Pratt algorithm [Park and Kim, CPM 1995] optimized to run in time proportional to the number of prefix-incomparable matches. To obtain the parameterization in the topological structure of $G$, we first study a special class of DAGs called funnels [Millani et al., JCO] and generalize them to $k$-funnels and the class $ST_k$. We present several novel characterizations and algorithmic contributions on both funnels and their generalizations.
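As a point of reference for the generalization, below is a minimal sketch of the classic Knuth-Morris-Pratt algorithm on a string-to-string instance; the DAG generalization and the prefix-incomparability optimization build on this failure-function mechanism but are not shown here.

```python
def kmp_matches(pattern, text):
    """Classic Knuth-Morris-Pratt matching: fail[i] is the length of the
    longest proper prefix of pattern[:i+1] that is also its suffix; on a
    mismatch we fall back along fail instead of restarting, giving
    O(|pattern| + |text|) time overall."""
    m = len(pattern)
    fail, k = [0] * m, 0
    for i in range(1, m):                    # build the failure function
        while k > 0 and pattern[i] != pattern[k]:
            k = fail[k - 1]
        if pattern[i] == pattern[k]:
            k += 1
        fail[i] = k
    out, k = [], 0
    for j, c in enumerate(text):             # scan the text
        while k > 0 and c != pattern[k]:
            k = fail[k - 1]
        if c == pattern[k]:
            k += 1
        if k == m:
            out.append(j - m + 1)            # match starting position
            k = fail[k - 1]
    return out
```

On a labeled DAG the "text position" branches at every vertex with multiple out-neighbors, which is where bounding the number of prefix-incomparable matches becomes the relevant cost measure.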
Maximum Inner Product Search (MIPS) is a popular problem in the machine learning literature due to its applicability in a wide array of applications, such as recommender systems. In high-dimensional settings, however, MIPS queries can become computationally expensive as most existing solutions do not scale well with data dimensionality. In this work, we present a state-of-the-art algorithm for the MIPS problem in high dimensions, dubbed BanditMIPS. BanditMIPS is a randomized algorithm that borrows techniques from multi-armed bandits to reduce the MIPS problem to a best-arm identification problem. BanditMIPS reduces the complexity of state-of-the-art algorithms from $O(\sqrt{d})$ to $O(\log d)$, where $d$ is the dimension of the problem data vectors. On high-dimensional real-world datasets, BanditMIPS runs approximately 12 times faster than existing approaches and returns the same solution. BanditMIPS requires no preprocessing of the data and includes a hyperparameter that practitioners may use to trade off accuracy and runtime. We also propose a variant of our algorithm, named BanditMIPS-$\alpha$, which employs non-uniform sampling across the data dimensions to provide further speedups.
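A minimal sketch of the bandit reduction follows, assuming coordinatewise products are bounded so a Hoeffding-style confidence radius applies; the exact confidence bounds, stopping rule, and the non-uniform sampling of BanditMIPS-$\alpha$ differ in the actual algorithm.

```python
import numpy as np

def bandit_mips(query, atoms, delta=0.01, batch=32, rng=None):
    """Best-arm-identification sketch of MIPS-as-bandits: each candidate
    vector is an arm, a pull samples one coordinate of its inner product
    with the query, and arms whose upper confidence bound falls below
    the best lower bound are eliminated. The confidence radius assumes
    coordinate products lie in [0, 1]; rescale in practice."""
    rng = np.random.default_rng(0) if rng is None else rng
    n, d = atoms.shape
    alive, sums, pulls = np.arange(n), np.zeros(n), 0
    while len(alive) > 1 and pulls < d:
        idx = rng.integers(0, d, size=batch)          # sampled coordinates
        sums[alive] += atoms[np.ix_(alive, idx)] @ query[idx]
        pulls += batch
        means = sums[alive] / pulls
        radius = np.sqrt(np.log(2 * n / delta) / (2 * pulls))
        alive = alive[means + radius >= (means - radius).max()]
    if len(alive) > 1:                                # budget spent: finish exactly
        return int(alive[np.argmax(atoms[alive] @ query)])
    return int(alive[0])
```

The estimated means are proportional to the true inner products (both are $d$ times the mean sampled product), so the arm ordering is preserved, which is all best-arm identification needs.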
Finite element methods are well known to admit robust optimal convergence on simplicial meshes satisfying the maximum angle condition. How to generalize this condition to polyhedra, however, has remained open in the literature. In this work, we argue that such a generalization is possible for virtual element methods (VEMs). In particular, we develop an anisotropic analysis framework for VEMs in which the virtual spaces and projection spaces remain abstract and can be problem-adapted, carrying forward the ``virtual'' spirit of VEMs. Three anisotropic cases are analyzed under this framework: (1) elements only contain non-shrinking inscribed balls but are not necessarily star convex with respect to those balls; (2) elements are cut arbitrarily from a background Cartesian mesh and may shrink to an extreme degree; (3) elements contain different materials, so the virtual spaces involve discontinuous coefficients. The error estimates are guaranteed to be independent of the polyhedral element shapes. The present work substantially improves the existing theoretical results in the literature and broadens the scope of application of VEMs.
Recent research has reported a performance degradation in self-supervised contrastive learning for specially designed efficient networks, such as MobileNet and EfficientNet. A common practice to address this problem is to introduce a pretrained contrastive teacher model and train the lightweight networks with distillation signals generated by the teacher. However, pretraining a teacher model is time- and resource-consuming when one is not available. In this work, we aim to establish a stronger baseline for lightweight contrastive models without using a pretrained teacher model. Specifically, we show that the optimal training recipe for efficient models differs from that of larger models, and that reusing the same training settings as ResNet50, as previous research does, is inappropriate. Additionally, we observe a common issue in contrastive learning where either the positive or negative views can be noisy, and propose a smoothed version of the InfoNCE loss to alleviate this problem. As a result, we successfully improve the linear evaluation results from 36.3\% to 62.3\% for MobileNet-V3-Large and from 42.2\% to 65.8\% for EfficientNet-B0 on ImageNet, closing the accuracy gap to ResNet50 with $5\times$ fewer parameters. We hope our research will facilitate the use of lightweight contrastive models.
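The abstract does not spell out the smoothed loss; the PyTorch-style sketch below shows one plausible form, label-smoothed InfoNCE over cross-view similarities, where the temperature and smoothing values are illustrative assumptions.

```python
import torch
import torch.nn.functional as F

def smoothed_info_nce(z1, z2, temperature=0.2, smoothing=0.1):
    """InfoNCE over a batch where z1[i] and z2[i] embed two views of the
    same image, with label smoothing on the one-hot target as a guard
    against noisy positives/negatives. One plausible smoothing; the
    paper's exact formulation may differ."""
    z1 = F.normalize(z1, dim=1)
    z2 = F.normalize(z2, dim=1)
    logits = z1 @ z2.t() / temperature                 # (B, B): diagonal = positives
    targets = torch.arange(z1.size(0), device=z1.device)
    return F.cross_entropy(logits, targets, label_smoothing=smoothing)
```

Smoothing spreads a small amount of target mass onto the off-diagonal entries, so a mislabeled "negative" that is actually semantically close no longer receives a hard zero target.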
We consider priority-based matching problems with limited farsightedness. We show that, once agents are sufficiently farsighted, the matching obtained from the Top Trading Cycles (TTC) algorithm becomes stable: the singleton set consisting of the TTC matching is a horizon-$k$ vNM stable set whenever the degree of farsightedness $k$ exceeds three times the number of agents in the largest cycle of the TTC. In contrast, the matching obtained from the Deferred Acceptance (DA) algorithm may not belong to any horizon-$k$ vNM stable set for $k$ large enough.
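For concreteness, here is a sketch of the classic TTC procedure on a housing market, where each agent is endowed with one object and preferences are complete and strict; the priority-based variant studied here points objects at agents according to priorities rather than endowments.

```python
def top_trading_cycles(prefs, owner):
    """Top Trading Cycles: prefs[a] is agent a's strict preference list
    over houses, owner[h] is the agent endowed with house h. Each round,
    every remaining agent points at the owner of their favourite
    remaining house; since every node has out-degree one, a cycle always
    exists, and agents on it trade along the cycle and leave."""
    agents, houses = set(prefs), set(owner)
    assignment = {}
    while agents:
        target = {a: next(h for h in prefs[a] if h in houses) for a in agents}
        points_to = {a: owner[target[a]] for a in agents}
        a, seen = next(iter(agents)), []
        while a not in seen:                 # follow pointers until a node repeats
            seen.append(a)
            a = points_to[a]
        for b in seen[seen.index(a):]:       # everyone on the cycle trades
            assignment[b] = target[b]
            agents.remove(b)
            houses.remove(target[b])
    return assignment
```

The largest cycle arising in this loop is exactly the quantity the stability threshold refers to: the deeper a deviation must look ahead to unwind a trade, the more farsightedness is needed for the TTC matching to be stable.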
Motivated by batteryless IoT devices, we consider the following scheduling problem. The input includes $n$ unit-time jobs $\mathcal{J} = \{J_1, \ldots, J_n\}$, where each job $J_i$ has a release time $r_i$, a due date $d_i$, an energy requirement $e_i$, and a weight $w_i$. We consider time to be slotted; hence, all time-related job values refer to slots. Let $T=\max_i\{d_i\}$. The input also includes a value $h_t$ for every time slot $t$ ($1 \leq t \leq T$), which is the energy harvestable in that slot. Energy is harvested in time slots during which no job is executed. The objective is to find a feasible schedule that maximizes the total weight of the scheduled jobs. A schedule assigning each scheduled job $J_j$ a slot $t_j$ is feasible if the slots are pairwise distinct ($t_{j} \neq t_{j'}$ whenever ${j} \neq {j'}$), $r_j \leq t_j \leq d_j$ for every scheduled job, and the energy available before slot $t_j$ is at least $e_j$. To the best of our knowledge, we are the first to consider the theoretical aspects of this problem. In this work we show the following. (1) A polynomial time algorithm when all jobs have identical $r_i$, $d_i$, and $w_i$. (2) A $\frac{1}{2}$-approximation algorithm when all jobs have identical $w_i$ but arbitrary $r_i$ and $d_i$. (3) An FPTAS when all jobs have identical $r_i$ and $d_i$ but arbitrary $w_i$. (4) Reductions showing that every variant of the problem in which at least one of the attributes $r_i$, $d_i$, or $w_i$ is not identical for all jobs is NP-hard.
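The feasibility conditions are simple enough to state in code; below is a small checker matching the definitions above, under two explicit assumptions the abstract leaves open: the initial battery charge `e0`, and the convention that executing job $J_j$ consumes exactly $e_j$.

```python
def is_feasible(schedule, jobs, harvest, e0=0):
    """Check schedule feasibility. `schedule` maps job index -> slot
    (1-based), `jobs[j] = (r_j, d_j, e_j, w_j)`, and `harvest[t] = h_t`
    for t = 1..T. Energy h_t accrues only in slots where no job runs;
    the battery must hold at least e_j before job j's slot."""
    busy = {t: j for j, t in schedule.items()}
    if len(busy) != len(schedule):
        return False                      # two jobs assigned to the same slot
    energy = e0
    for t in range(1, max(harvest) + 1):  # harvest is a dict {t: h_t}
        if t in busy:
            r, d, e, _w = jobs[busy[t]]
            if not (r <= t <= d) or energy < e:
                return False              # outside window or battery too low
            energy -= e                   # assumed: running job j consumes e_j
        else:
            energy += harvest[t]          # idle slot: harvest h_t
    return True
```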
We introduce MegaPose, a method to estimate the 6D pose of novel objects, that is, objects unseen during training. At inference time, the method only assumes knowledge of (i) a region of interest displaying the object in the image and (ii) a CAD model of the observed object. The contributions of this work are threefold. First, we present a 6D pose refiner based on a render&compare strategy which can be applied to novel objects. The shape and coordinate system of the novel object are provided as inputs to the network by rendering multiple synthetic views of the object's CAD model. Second, we introduce a novel approach for coarse pose estimation which leverages a network trained to classify whether the pose error between a synthetic rendering and an observed image of the same object can be corrected by the refiner. Third, we introduce a large-scale synthetic dataset of photorealistic images of thousands of objects with diverse visual and shape properties and show that this diversity is crucial to obtain good generalization performance on novel objects. We train our approach on this large synthetic dataset and apply it without retraining to hundreds of novel objects in real images from several pose estimation benchmarks. Our approach achieves state-of-the-art performance on the ModelNet and YCB-Video datasets. An extensive evaluation on the 7 core datasets of the BOP challenge demonstrates that our approach achieves performance competitive with existing approaches that require access to the target objects during training. Code, dataset and trained models are available on the project page: //megapose6d.github.io/.
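A high-level sketch of the coarse-then-refine inference described above, with all networks and the renderer injected as callables since they are out of scope here; the names, the $4\times 4$ pose-matrix convention, and the fixed iteration count are illustrative assumptions and do not reflect MegaPose's actual API.

```python
def coarse_then_refine(rgb_crop, sample_poses, render, coarse_score, refine,
                       n_hyp=512, n_iter=5):
    """Render the CAD model under many candidate poses, keep the
    hypothesis the coarse classifier scores as most correctable by the
    refiner, then iterate render-and-compare refinement. `render(T)`
    renders the object's CAD model under 4x4 pose matrix T,
    `coarse_score` returns the classifier's refinability score, and
    `refine` predicts a 4x4 corrective transform."""
    hypotheses = sample_poses(n_hyp)
    pose = max(hypotheses, key=lambda T: coarse_score(rgb_crop, render(T)))
    for _ in range(n_iter):
        delta = refine(rgb_crop, render(pose))
        pose = delta @ pose               # apply the predicted update on the left
    return pose
```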
We study streaming algorithms for the fundamental geometric problem of computing the cost of the Euclidean Minimum Spanning Tree (MST) on an $n$-point set $X \subset \mathbb{R}^d$. In the streaming model, the points in $X$ can be added and removed arbitrarily, and the goal is to maintain an approximation in small space. In low dimensions, $(1+\epsilon)$ approximations are possible in sublinear space [Frahling, Indyk, Sohler, SoCG '05]. However, for high-dimensional spaces the best known approximation for this problem was $\tilde{O}(\log n)$, due to [Chen, Jayaram, Levi, Waingarten, STOC '22], improving on the prior $O(\log^2 n)$ bound due to [Indyk, STOC '04] and [Andoni, Indyk, Krauthgamer, SODA '08]. In this paper, we break the logarithmic barrier and give the first constant-factor sublinear-space approximation to Euclidean MST. For any $0 < \epsilon \leq 1$, our algorithm achieves an $\tilde{O}(\epsilon^{-2})$ approximation in $n^{O(\epsilon)}$ space. We complement this by proving that any single-pass algorithm which obtains a better than $1.10$-approximation must use $\Omega(\sqrt{n})$ space, demonstrating that $(1+\epsilon)$ approximations are not possible in high dimensions and that our algorithm is tight up to a constant. Nevertheless, we demonstrate that $(1+\epsilon)$ approximations are possible in sublinear space with $O(1/\epsilon)$ passes over the stream. More generally, for any $\alpha \geq 2$, we give an $\alpha$-pass streaming algorithm which achieves a $(1+O(\frac{\log \alpha + 1}{ \alpha \epsilon}))$ approximation in $n^{O(\epsilon)} d^{O(1)}$ space. Our streaming algorithms are linear sketches, and therefore extend to the massively parallel computation (MPC) model. Thus, our results imply the first $(1+\epsilon)$-approximation to Euclidean MST in a constant number of rounds in the MPC model.
In this paper, we study the problem of aligning a batch of linearly correlated images, where the observed images are deformed by unknown domain transformations and simultaneously corrupted by additive Gaussian noise and sparse noise. Stacking these images as the frontal slices of a third-order tensor, we propose to explore the low-rankness of the underlying tensor by factorizing it into the product of two smaller tensors via the transformed tensor-tensor product under an arbitrary unitary transformation. The main advantage of the transformed tensor-tensor product is that its computational complexity is lower than that of existing methods based on the transformed tensor nuclear norm. Moreover, the tensor $\ell_p$ $(0<p<1)$ norm is employed to characterize the sparse noise, and the tensor Frobenius norm is adopted to model the additive Gaussian noise. A generalized Gauss-Newton algorithm is designed to solve the resulting model by linearizing the domain transformations, and a proximal Gauss-Seidel algorithm is developed to solve the corresponding subproblem. Furthermore, the convergence of the proximal Gauss-Seidel algorithm is established, and its convergence rate is analyzed based on the Kurdyka-{\L}ojasiewicz property. Extensive numerical experiments on real-world image datasets demonstrate the superior performance of the proposed method over several state-of-the-art methods in both accuracy and computational time.
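In symbols, one plausible formalization consistent with this description (the exact objective, the weight $\lambda$, and the parametrization of the transformations $\tau$ may differ in the paper) is
\[
\min_{\mathcal{U},\,\mathcal{V},\,\mathcal{S},\,\tau}\;\frac{1}{2}\,\bigl\|\mathcal{D}\circ\tau-\mathcal{U}*_{\Phi}\mathcal{V}-\mathcal{S}\bigr\|_F^2+\lambda\,\|\mathcal{S}\|_{\ell_p}^p,\qquad 0<p<1,
\]
where $\mathcal{D}\circ\tau$ stacks the transformed images as frontal slices, $*_{\Phi}$ denotes the tensor-tensor product under a unitary transformation $\Phi$, the Frobenius term absorbs the Gaussian noise, and each generalized Gauss-Newton step linearizes $\tau\mapsto\mathcal{D}\circ\tau$ around the current iterate before the proximal Gauss-Seidel sweep updates $\mathcal{U}$, $\mathcal{V}$, and $\mathcal{S}$ in turn.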
Standard contrastive learning approaches usually require a large number of negatives for effective unsupervised learning and often exhibit slow convergence. We suspect this behavior is due to the suboptimal selection of the negatives used to offer contrast to the positives. We counter this difficulty by taking inspiration from support vector machines (SVMs) and present max-margin contrastive learning (MMCL). Our approach selects negatives as the sparse support vectors obtained via a quadratic optimization problem, and contrastiveness is enforced by maximizing the decision margin. As SVM optimization can be computationally demanding, especially in an end-to-end setting, we present simplifications that alleviate the computational burden. We validate our approach on standard vision benchmark datasets, demonstrating better performance in unsupervised representation learning over the state of the art, while having better empirical convergence properties.
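A standalone sketch of the SVM view follows, using scikit-learn for the quadratic program; MMCL solves this inside the training loop with the simplifications mentioned above, so this offline version with an explicit positive/negative split is illustrative only.

```python
import numpy as np
from sklearn.svm import SVC

def max_margin_negatives(positives, negatives, C=1.0):
    """Fit a linear SVM separating positive-view embeddings (rows of
    `positives`) from negative embeddings (rows of `negatives`). The
    negatives returned as support vectors form the sparse, maximally
    informative contrast set, and the margin width 2/||w|| is the
    quantity a max-margin contrastive objective would enlarge."""
    X = np.vstack([positives, negatives])
    y = np.array([1] * len(positives) + [-1] * len(negatives))
    svm = SVC(kernel="linear", C=C).fit(X, y)
    # indices of support vectors that came from the negative set
    sv = svm.support_[svm.support_ >= len(positives)] - len(positives)
    margin = 2.0 / np.linalg.norm(svm.coef_[0])
    return negatives[sv], margin
```

Because only the support vectors carry nonzero dual weight, the loss depends on a small subset of negatives, which is the intuition behind both the improved contrast and the faster empirical convergence.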