In a recent work, Esmer et al. describe a simple method - Approximate Monotone Local Search - to obtain exponential approximation algorithms from existing parameterized exact algorithms, polynomial-time approximation algorithms and, more generally, parameterized approximation algorithms. In this work, we generalize those results to the weighted setting. More formally, we consider monotone subset minimization problems over a weighted universe of size $n$ (e.g., Vertex Cover, $d$-Hitting Set and Feedback Vertex Set). We consider a model where the algorithm is only given access to a subroutine that finds a solution of weight at most $\alpha \cdot W$ (and of arbitrary cardinality) in time $c^k \cdot n^{O(1)}$ where $W$ is the minimum weight of a solution of cardinality at most $k$. In the unweighted setting, Esmer et al. determine the smallest value $d$ for which a $\beta$-approximation algorithm running in time $d^n \cdot n^{O(1)}$ can be obtained in this model. We show that the same dependencies also hold in a weighted setting in this model: for every fixed $\varepsilon>0$ we obtain a $\beta$-approximation algorithm running in time $O\left((d+\varepsilon)^{n}\right)$, for the same $d$ as in the unweighted setting. Similarly, we also extend a $\beta$-approximate brute-force search (in a model which only provides access to a membership oracle) to the weighted setting. Using existing approximation algorithms and exact parameterized algorithms for weighted problems, we obtain the first exponential-time $\beta$-approximation algorithms that are better than brute force for a variety of problems including Weighted Vertex Cover, Weighted $d$-Hitting Set, Weighted Feedback Vertex Set and Weighted Multicut.
We develop a new method for multivariate scalar on multidimensional distribution regression. Traditional approaches typically analyze isolated univariate scalar outcomes or consider unidimensional distributional representations as predictors. However, these approaches are sub-optimal because: i) they fail to utilize the dependence between the distributional predictors: ii) neglect the correlation structure of the response. To overcome these limitations, we propose a multivariate distributional analysis framework that harnesses the power of multivariate density functions and multitask learning. We develop a computationally efficient semiparametric estimation method for modelling the effect of the latent joint density on multivariate response of interest. Additionally, we introduce a new conformal algorithm for quantifying the uncertainty of regression models with multivariate responses and distributional predictors, providing valuable insights into the conditional distribution of the response. We have validated the effectiveness of our proposed method through comprehensive numerical simulations, clearly demonstrating its superior performance compared to traditional methods. The application of the proposed method is demonstrated on tri-axial accelerometer data from the National Health and Nutrition Examination Survey (NHANES) 2011-2014 for modelling the association between cognitive scores across various domains and distributional representation of physical activity among older adult population. Our results highlight the advantages of the proposed approach, emphasizing the significance of incorporating complete spatial information derived from the accelerometer device.
Diffusion models (DMs) have demonstrated advantageous potential on generative tasks. Widespread interest exists in incorporating DMs into downstream applications, such as producing or editing photorealistic images. However, practical deployment and unprecedented power of DMs raise legal issues, including copyright protection and monitoring of generated content. In this regard, watermarking has been a proven solution for copyright protection and content monitoring, but it is underexplored in the DMs literature. Specifically, DMs generate samples from longer tracks and may have newly designed multimodal structures, necessitating the modification of conventional watermarking pipelines. To this end, we conduct comprehensive analyses and derive a recipe for efficiently watermarking state-of-the-art DMs (e.g., Stable Diffusion), via training from scratch or finetuning. Our recipe is straightforward but involves empirically ablated implementation details, providing a foundation for future research on watermarking DMs. The code is available at //github.com/yunqing-me/WatermarkDM.
This article introduces a quick and simple combinatorial approximation algorithm for the Weighted correlation clustering problem. In this problem, we have a set of vertices and two difference and similarity weight values for each pair of vertices, and the goal is to cluster the vertices with minimum total intra-cluster difference weights plus inter-cluster similarity weights. Our algorithm's approximation factor is 3 when an instance of this problem satisfies probability constraints (the best-known was 5). If the instance satisfies triangle inequality in addition to probability constraints, the approximation factor is 1.6 (the best-known was 2).
In this work, a local Fourier analysis is presented to study the convergence of multigrid methods based on additive Schwarz smoothers. This analysis is presented as a general framework which allows us to study these smoothers for any type of discretization and problem. The presented framework is crucial in practice since it allows one to know a priori the answer to questions such as what is the size of the patch to use within these relaxations, the size of the overlapping, or even the optimal values for the weights involved in the smoother. Results are shown for a class of additive and restricted additive Schwarz relaxations used within a multigrid framework applied to high-order finite-element discretizations and saddle point problems, which are two of the contexts in which these type of relaxations are widely used.
We introduce Probabilistic Regular Expressions (PRE), a probabilistic analogue of regular expressions denoting probabilistic languages in which every word is assigned a probability of being generated. We present and prove the completeness of an inference system for reasoning about probabilistic language equivalence of PRE based on Salomaa's axiomatisation of Kleene Algebra.
This paper introduces a Factor Augmented Sparse Throughput (FAST) model that utilizes both latent factors and sparse idiosyncratic components for nonparametric regression. The FAST model bridges factor models on one end and sparse nonparametric models on the other end. It encompasses structured nonparametric models such as factor augmented additive models and sparse low-dimensional nonparametric interaction models and covers the cases where the covariates do not admit factor structures. Via diversified projections as estimation of latent factor space, we employ truncated deep ReLU networks to nonparametric factor regression without regularization and to a more general FAST model using nonconvex regularization, resulting in factor augmented regression using neural network (FAR-NN) and FAST-NN estimators respectively. We show that FAR-NN and FAST-NN estimators adapt to the unknown low-dimensional structure using hierarchical composition models in nonasymptotic minimax rates. We also study statistical learning for the factor augmented sparse additive model using a more specific neural network architecture. Our results are applicable to the weak dependent cases without factor structures. In proving the main technical result for FAST-NN, we establish a new deep ReLU network approximation result that contributes to the foundation of neural network theory. Our theory and methods are further supported by simulation studies and an application to macroeconomic data.
Score-based Generative Models (SGMs) have demonstrated exceptional synthesis outcomes across various tasks. However, the current design landscape of the forward diffusion process remains largely untapped and often relies on physical heuristics or simplifying assumptions. Utilizing insights from the development of scalable Bayesian posterior samplers, we present a complete recipe for formulating forward processes in SGMs, ensuring convergence to the desired target distribution. Our approach reveals that several existing SGMs can be seen as specific manifestations of our framework. Building upon this method, we introduce Phase Space Langevin Diffusion (PSLD), which relies on score-based modeling within an augmented space enriched by auxiliary variables akin to physical phase space. Empirical results exhibit the superior sample quality and improved speed-quality trade-off of PSLD compared to various competing approaches on established image synthesis benchmarks. Remarkably, PSLD achieves sample quality akin to state-of-the-art SGMs (FID: 2.10 for unconditional CIFAR-10 generation). Lastly, we demonstrate the applicability of PSLD in conditional synthesis using pre-trained score networks, offering an appealing alternative as an SGM backbone for future advancements. Code and model checkpoints can be accessed at \url{//github.com/mandt-lab/PSLD}.
2D-based Industrial Anomaly Detection has been widely discussed, however, multimodal industrial anomaly detection based on 3D point clouds and RGB images still has many untouched fields. Existing multimodal industrial anomaly detection methods directly concatenate the multimodal features, which leads to a strong disturbance between features and harms the detection performance. In this paper, we propose Multi-3D-Memory (M3DM), a novel multimodal anomaly detection method with hybrid fusion scheme: firstly, we design an unsupervised feature fusion with patch-wise contrastive learning to encourage the interaction of different modal features; secondly, we use a decision layer fusion with multiple memory banks to avoid loss of information and additional novelty classifiers to make the final decision. We further propose a point feature alignment operation to better align the point cloud and RGB features. Extensive experiments show that our multimodal industrial anomaly detection model outperforms the state-of-the-art (SOTA) methods on both detection and segmentation precision on MVTec-3D AD dataset. Code is available at //github.com/nomewang/M3DM.
With the rapid development of deep learning, training Big Models (BMs) for multiple downstream tasks becomes a popular paradigm. Researchers have achieved various outcomes in the construction of BMs and the BM application in many fields. At present, there is a lack of research work that sorts out the overall progress of BMs and guides the follow-up research. In this paper, we cover not only the BM technologies themselves but also the prerequisites for BM training and applications with BMs, dividing the BM review into four parts: Resource, Models, Key Technologies and Application. We introduce 16 specific BM-related topics in those four parts, they are Data, Knowledge, Computing System, Parallel Training System, Language Model, Vision Model, Multi-modal Model, Theory&Interpretability, Commonsense Reasoning, Reliability&Security, Governance, Evaluation, Machine Translation, Text Generation, Dialogue and Protein Research. In each topic, we summarize clearly the current studies and propose some future research directions. At the end of this paper, we conclude the further development of BMs in a more general view.
Recent contrastive representation learning methods rely on estimating mutual information (MI) between multiple views of an underlying context. E.g., we can derive multiple views of a given image by applying data augmentation, or we can split a sequence into views comprising the past and future of some step in the sequence. Contrastive lower bounds on MI are easy to optimize, but have a strong underestimation bias when estimating large amounts of MI. We propose decomposing the full MI estimation problem into a sum of smaller estimation problems by splitting one of the views into progressively more informed subviews and by applying the chain rule on MI between the decomposed views. This expression contains a sum of unconditional and conditional MI terms, each measuring modest chunks of the total MI, which facilitates approximation via contrastive bounds. To maximize the sum, we formulate a contrastive lower bound on the conditional MI which can be approximated efficiently. We refer to our general approach as Decomposed Estimation of Mutual Information (DEMI). We show that DEMI can capture a larger amount of MI than standard non-decomposed contrastive bounds in a synthetic setting, and learns better representations in a vision domain and for dialogue generation.