We have developed efficient parameterized algorithms for the enumeration problems of graphs arising in chemistry. In particular, we have focused on the following problems: enumeration of Kekul\'e structures, computation of the Hosoya index, computation of the Merrifield-Simmons index, and computation of graph entropy based on matchings and independent sets. All these problems are known to be $\#P$-complete. We have developed FPT algorithms for bounded treewidth and bounded pathwidth for these problems with a better time complexity than the state of the art in the literature. We have also tested our algorithms experimentally on the entire PubChem database of chemical compounds. We also provide a comparison with naive baseline algorithms for these problems, along with a distribution of treewidth for the chemical compounds available in the PubChem database.
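As a rough illustration of the quantities involved, the following sketch counts matchings (the Hosoya index) and independent sets (the Merrifield-Simmons index) by exhaustive branching. It is only a naive exponential-time baseline under the assumption of a simple graph given as an edge list; it is not the treewidth-based FPT algorithm of the abstract, and the function names are hypothetical.

```python
def hosoya_index(edges):
    """Count all matchings (including the empty one) of a simple graph."""
    edges = list(edges)
    if not edges:
        return 1
    (u, v), rest = edges[0], edges[1:]
    # Case 1: the edge (u, v) is not used in the matching.
    without_edge = hosoya_index(rest)
    # Case 2: the edge (u, v) is used, so no other edge may touch u or v.
    with_edge = hosoya_index([e for e in rest if u not in e and v not in e])
    return without_edge + with_edge


def merrifield_simmons_index(vertices, edges):
    """Count all independent sets (including the empty one) of a simple graph."""
    vertices = list(vertices)
    if not vertices:
        return 1
    v, rest = vertices[0], vertices[1:]
    neighbours = {a if b == v else b for a, b in edges if v in (a, b)}
    # Case 1: v is left out of the independent set.
    excluded = merrifield_simmons_index(rest, [e for e in edges if v not in e])
    # Case 2: v is taken, so v and all of its neighbours are removed.
    kept = [x for x in rest if x not in neighbours]
    included = merrifield_simmons_index(
        kept, [(a, b) for a, b in edges if a in kept and b in kept]
    )
    return excluded + included


# Carbon skeleton of benzene: the 6-cycle.
benzene = [(i, (i + 1) % 6) for i in range(6)]
print(hosoya_index(benzene), merrifield_simmons_index(range(6), benzene))
```

Both recursions branch on a single edge or vertex, so the running time is exponential in the graph size; this is the kind of baseline against which a parameterized algorithm would be compared.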
We propose and analyze a class of particle methods for the Vlasov equation with a strong external magnetic field in a torus configuration. In this regime, the time step can be subject to stability constraints related to the smallness of the Larmor radius. To avoid this limitation, our approach is based on higher-order semi-implicit numerical schemes already validated on dissipative systems [3] and for magnetic fields pointing in a fixed direction [9, 10, 12]. It hinges on asymptotic insights gained in [11] at the continuous level. Thus, when the magnitude of the external magnetic field is large, this scheme provides a consistent approximation of the guiding-center system, taking into account the curvature and variation of the magnetic field. Finally, we carry out a theoretical proof of consistency and perform several numerical experiments that validate the method and its underlying concepts.
Convolutional codes with a maximum distance profile attain the largest possible column distances for the maximum number of time instants and thus have outstanding error-correcting capability, especially for streaming applications. Explicit constructions of such codes are scarce in the literature. In particular, known constructions of convolutional codes with rate k/n and a maximum distance profile require a field of size at least exponential in n for general code parameters. At the same time, the only known lower bound on the field size is the trivial bound that is linear in n. In this paper, we show that a finite field of size $\Omega_L(n^{L-1})$ is necessary for constructing convolutional codes with rate k/n and a maximum distance profile of length L. As a direct consequence, this rules out the possibility of constructing convolutional codes with a maximum distance profile of length L >= 3 over a finite field of size O(n). We also present an explicit construction of convolutional codes with rate k/n and a maximum distance profile of length L = 1 over a finite field of size $O(n^{\min\{k,n-k\}})$, achieving a smaller field size than known constructions with the same profile length.
We study first-order methods for constrained min-max optimization. Existing methods require either two gradient calls or two projections in each iteration, which may be costly in some applications. In this paper, we first show that a variant of the Optimistic Gradient (OG) method, a single-call single-projection algorithm, has $O(\frac{1}{\sqrt{T}})$ best-iterate convergence rate for inclusion problems with operators that satisfy the weak Minty variational inequality (MVI). Our second result is the first single-call single-projection algorithm -- the Accelerated Reflected Gradient (ARG) method -- that achieves the optimal $O(\frac{1}{T})$ last-iterate convergence rate for inclusion problems that satisfy negative comonotonicity. Both the weak MVI and negative comonotonicity are well-studied assumptions and capture a rich set of non-convex non-concave min-max optimization problems. Finally, we show that the Reflected Gradient (RG) method, another single-call single-projection algorithm, has $O(\frac{1}{\sqrt{T}})$ last-iterate convergence rate for constrained convex-concave min-max optimization, answering an open problem of [Hsieh et al., 2019]. Our convergence rates hold for standard measures such as the tangent residual and the natural residual.
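To make the phrase "single-call single-projection" concrete, here is a minimal sketch of a projected reflected-gradient iteration, $z_{k+1} = \Pi(z_k - \eta F(2z_k - z_{k-1}))$, on a toy problem. The test problem $\min_x \max_y xy$ over the box $[-1,1]^2$, the step size 0.3, the starting point, and all function names are illustrative assumptions; this is not the ARG method or the exact variant analyzed in the abstract.

```python
def project_box(z, lo=-1.0, hi=1.0):
    """Euclidean projection onto the box [lo, hi]^2 (the single projection)."""
    return tuple(min(hi, max(lo, t)) for t in z)


def operator(z):
    """F(x, y) = (df/dx, -df/dy) for the bilinear toy objective f(x, y) = x * y."""
    x, y = z
    return (y, -x)


def reflected_gradient(z0, step=0.3, iters=500):
    """One operator evaluation and one projection per iteration."""
    z_prev, z = z0, z0
    for _ in range(iters):
        # Evaluate F once, at the reflected point 2 * z_k - z_{k-1}.
        reflected = tuple(2 * a - b for a, b in zip(z, z_prev))
        g = operator(reflected)
        z_next = project_box(tuple(a - step * b for a, b in zip(z, g)))
        z_prev, z = z, z_next
    return z


print(reflected_gradient((0.5, 0.5)))  # approaches the saddle point (0, 0)
```

Plain projected gradient descent-ascent cycles or diverges on this bilinear problem, which is why the reflection step matters even in such a small example.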
A Low-rank Spectral Optimization Problem (LSOP) minimizes a linear objective subject to multiple two-sided linear matrix inequalities intersected with a low-rank and spectral constrained domain set. Although solving LSOP is, in general, NP-hard, its partial convexification (i.e., replacing the domain set by its convex hull) termed "LSOP-R," is often tractable and yields a high-quality solution. This motivates us to study the strength of LSOP-R. Specifically, we derive rank bounds for any extreme point of the feasible set of LSOP-R and prove their tightness for the domain sets with different matrix spaces. The proposed rank bounds recover two well-known results in the literature from a fresh angle and also allow us to derive sufficient conditions under which the relaxation LSOP-R is equivalent to the original LSOP. To effectively solve LSOP-R, we develop a column generation algorithm with a vector-based convex pricing oracle, coupled with a rank-reduction algorithm, which ensures the output solution satisfies the theoretical rank bound. Finally, we numerically verify the strength of the LSOP-R and the efficacy of the proposed algorithms.
This paper is concerned with the multi-frequency factorization method for imaging the support of a wave-number-dependent source function. It is supposed that the source function is given by the Fourier transform of some time-dependent source with a priori given radiating period. Using the multi-frequency far-field data at a fixed observation direction, we provide a computational criterion for characterizing the smallest strip containing the support and perpendicular to the observation direction. The far-field data from sparse observation directions can be used to recover a $\Theta$-convex polygon of the support. The inversion algorithm is proven valid even with multi-frequency near-field data in three dimensions. The connections to time-dependent inverse source problems are discussed in the near-field case. We also comment on possible extensions to source functions with two disconnected supports. Numerical tests in both two and three dimensions are implemented to show effectiveness and feasibility of the approach. This paper provides numerical analysis for a frequency-domain approach to recover the support of an admissible class of time-dependent sources.
An experimental comparison of two or more optimization algorithms requires the same computational resources to be assigned to each algorithm. When a maximum runtime is set as the stopping criterion, all algorithms need to be executed on the same machine if they are to use the same resources. Unfortunately, the implementation code of the algorithms is not always available, which means that running the algorithms to be compared on the same machine is not always possible. Even when the code is available, some optimization algorithms might be costly to run, such as training large neural networks in the cloud. In this paper, we consider the following problem: how do we compare the performance of a new optimization algorithm B with a known algorithm A from the literature if, for algorithm A, we only have the results (the objective values) and the runtime on each instance? In particular, we present a methodology that enables a statistical analysis of the performance of algorithms executed on different machines. The proposed methodology has two parts. First, we propose a model that, given the runtime of an algorithm on one machine, estimates the runtime of the same algorithm on another machine. This model can be adjusted so that the probability of estimating a runtime longer than what it should be is arbitrarily low. Second, we introduce an adaptation of the one-sided sign test that uses a modified \textit{p}-value and takes that probability into account. This adaptation avoids increasing the probability of type I error associated with executing algorithms A and B on different machines.
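For reference, the standard one-sided sign test that the methodology adapts can be sketched as follows. This is the textbook test only, assuming ties are discarded beforehand; the modified \textit{p}-value of the abstract is not specified here and is not reproduced, and the function name is hypothetical.

```python
from math import comb


def one_sided_sign_test(wins_b, losses_b):
    """p-value for H0: B beats A on an instance with probability 1/2,
    against the alternative that B wins more often. Ties are assumed
    to have been dropped before counting."""
    n = wins_b + losses_b
    if n == 0:
        return 1.0
    # Probability of observing at least wins_b wins under Binomial(n, 1/2).
    return sum(comb(n, k) for k in range(wins_b, n + 1)) / 2 ** n


# B obtains a better objective value than A on 9 of 10 instances.
print(one_sided_sign_test(9, 1))
```

A small value suggests that B outperforms A more often than chance would explain, provided both algorithms received comparable resources, which is the condition the runtime model is meant to secure.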
We consider the well-studied Robust $(k, z)$-Clustering problem, which generalizes the classic $k$-Median, $k$-Means, and $k$-Center problems. Given a constant $z\ge 1$, the input to Robust $(k, z)$-Clustering is a set $P$ of $n$ weighted points in a metric space $(M,\delta)$ and a positive integer $k$. Further, each point belongs to one (or more) of $m$ different groups $S_1,S_2,\ldots,S_m$. Our goal is to find a set $X$ of $k$ centers such that $\max_{i \in [m]} \sum_{p \in S_i} w(p) \delta(p,X)^z$ is minimized. This problem arises in the domains of robust optimization [Anthony, Goyal, Gupta, Nagarajan, Math. Oper. Res. 2010] and algorithmic fairness. For polynomial time computation, an approximation factor of $O(\log m/\log\log m)$ is known [Makarychev, Vakilian, COLT 2021], which is tight under a plausible complexity assumption even for line metrics. For FPT time, there is a $(3^z+\epsilon)$-approximation algorithm, which is tight under GAP-ETH [Goyal, Jaiswal, Inf. Proc. Letters, 2023]. Motivated by the tight lower bounds for general discrete metrics, we focus on \emph{geometric} spaces such as the (discrete) high-dimensional Euclidean setting and metrics of low doubling dimension, which play an important role in data analysis applications. First, for a universal constant $\eta_0 >0.0006$, we devise a $3^z(1-\eta_{0})$-factor FPT approximation algorithm for discrete high-dimensional Euclidean spaces, thereby bypassing the lower bound for general metrics. We complement this result by showing that even the special case of $k$-Center in dimension $\Theta(\log n)$ is $(\sqrt{3/2}- o(1))$-hard to approximate for FPT algorithms. Finally, we complete the FPT approximation landscape by designing an FPT $(1+\epsilon)$-approximation scheme (EPAS) for metrics of sub-logarithmic doubling dimension.
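The objective $\max_{i \in [m]} \sum_{p \in S_i} w(p)\,\delta(p,X)^z$ can be evaluated directly for a candidate set of centers, as in the sketch below. It only computes the cost of a given solution and does not search for centers; the data layout (groups as lists of point indices) and the function name are assumptions made for illustration.

```python
def robust_kz_cost(points, weights, groups, centers, z, dist):
    """Worst-group cost: max over groups of sum of w(p) * dist(p, X)^z."""

    def dist_to_centers(p):
        # delta(p, X): distance from p to its nearest center.
        return min(dist(p, c) for c in centers)

    return max(
        sum(weights[i] * dist_to_centers(points[i]) ** z for i in group)
        for group in groups
    )


# Four unit-weight points on a line, two groups, centers at 0 and 10.
points = [0.0, 1.0, 4.0, 10.0]
weights = [1.0, 1.0, 1.0, 1.0]
groups = [[0, 1], [2, 3]]  # groups may overlap in general
line = lambda p, q: abs(p - q)
print(robust_kz_cost(points, weights, groups, [0.0, 10.0], 2, line))
```

With a single group containing all points, the same function gives the usual $k$-Median ($z=1$) and $k$-Means ($z=2$) costs.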
A graph database is a digraph whose arcs are labeled with symbols from a fixed alphabet. A regular graph pattern (RGP) is a digraph whose edges are labeled with regular expressions over the alphabet. RGPs model navigational queries for graph databases called conjunctive regular path queries (CRPQs). A match of a CRPQ in the database is witnessed by a special navigational homomorphism of the corresponding RGP to the database. We study the complexity of deciding the existence of a homomorphism between two RGPs. Such homomorphisms model a strong type of containment between the two corresponding CRPQs. We show that this problem can be solved by an EXPTIME algorithm (while general query containment in this context is EXPSPACE-complete). We also study the problem for restricted RGPs over a unary alphabet, which arise in applications such as XPath or SPARQL. For this case, homomorphism-based CRPQ containment is in NP. We prove that certain interesting cases are in fact polynomial-time solvable.
We demonstrate the relevance of an algorithm called generalized iterative scaling (GIS) or simultaneous multiplicative algebraic reconstruction technique (SMART) and its rescaled block-iterative version (RBI-SMART) in the field of optimal transport (OT). Many OT problems can be tackled through the use of entropic regularization by solving the Schr\"odinger problem, which is an information projection problem, that is, with respect to the Kullback--Leibler divergence. Here we consider problems that have several affine constraints. It is well-known that cyclic information projections onto the individual affine sets converge to the solution. In practice, however, even these individual projections are not explicitly available in general. In this paper, we exchange them for one GIS iteration. If this is done for every affine set, we obtain RBI-SMART. We provide a convergence proof using an interpretation of these iterations as two-step affine projections in an equivalent problem. This is done in a slightly more general setting than RBI-SMART, since we use a mix of explicitly known information projections and GIS iterations. We proceed to specialize this algorithm to several OT applications. First, we find the measure that minimizes the regularized OT divergence to a given measure under moment constraints. Second and third, the proposed framework yields an algorithm for solving a regularized martingale OT problem, as well as a relaxed version of the barycentric weak OT problem. Finally, we show an approach from the literature for unbalanced OT problems.
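The classical special case of cyclic information projections is entropic OT with two marginal constraints, where both projections are explicit and the cycle is the Sinkhorn iteration. The sketch below shows only that special case, as an assumption-laden illustration; it is not GIS, SMART, or RBI-SMART, which the abstract uses precisely when such projections are not available in closed form. The regularization value and names are illustrative.

```python
import math


def sinkhorn(cost, mu, nu, eps=0.5, iters=200):
    """Alternate KL projections onto the row- and column-marginal constraints."""
    n, m = len(mu), len(nu)
    # Gibbs kernel of the entropically regularized problem.
    K = [[math.exp(-cost[i][j] / eps) for j in range(m)] for i in range(n)]
    u, v = [1.0] * n, [1.0] * m
    for _ in range(iters):
        # Projection onto {plans with row sums mu}: rescale the rows.
        u = [mu[i] / sum(K[i][j] * v[j] for j in range(m)) for i in range(n)]
        # Projection onto {plans with column sums nu}: rescale the columns.
        v = [nu[j] / sum(K[i][j] * u[i] for i in range(n)) for j in range(m)]
    return [[u[i] * K[i][j] * v[j] for j in range(m)] for i in range(n)]


plan = sinkhorn([[0.0, 1.0], [1.0, 0.0]], [0.5, 0.5], [0.3, 0.7])
print([sum(row) for row in plan])  # row sums, close to mu
```

Each half-step is an exact information projection onto one affine set; when no such closed form exists, a single GIS iteration per set takes its place in the scheme described above.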
Given $\mathbf A \in \mathbb{R}^{n \times n}$ with entries bounded in magnitude by $1$, it is well-known that if $S \subset [n] \times [n]$ is a uniformly random subset of $s = \tilde{O} (n/\epsilon^2)$ entries, and if ${\mathbf A}_S$ equals $\mathbf A$ on the entries in $S$ and is zero elsewhere, then $\|\mathbf A - \frac{n^2}{s} \cdot {\mathbf A}_S\|_2 \le \epsilon n$ with high probability, where $\|\cdot\|_2$ is the spectral norm. We show that for positive semidefinite (PSD) matrices, no randomness is needed at all in this statement. Namely, there exists a fixed subset $S$ of $\tilde{O} (n/\epsilon^2)$ entries that acts as a universal sparsifier: the above error bound holds simultaneously for every bounded-entry PSD matrix $\mathbf A \in \mathbb{R}^{n \times n}$. One can view this result as a significant extension of a Ramanujan expander graph, which sparsifies any bounded-entry PSD matrix, not just the all-ones matrix. We leverage the existence of such universal sparsifiers to give the first deterministic algorithms for several central problems related to singular value computation that run in faster than matrix multiplication time. We also prove universal sparsification bounds for non-PSD matrices, showing that $\tilde{O} (n/\epsilon^4)$ entries suffice to achieve error $\epsilon \cdot \max(n,\|\mathbf A\|_1)$, where $\|\mathbf A\|_1$ is the trace norm. We prove that this is optimal up to an $\tilde{O} (1/\epsilon^2)$ factor. Finally, we give an improved deterministic spectral approximation algorithm for PSD $\mathbf A$ with entries lying in $\{-1,0,1\}$, which we show is nearly information-theoretically optimal.
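The error quantity $\|\mathbf A - \frac{n^2}{s} \cdot {\mathbf A}_S\|_2$ can be measured numerically for any given entry set $S$, as in the following sketch. It only evaluates the error; it does not construct the deterministic universal sparsifier of the abstract. The power-iteration estimator, its iteration count, and the function names are assumptions chosen to keep the example dependency-free.

```python
import random


def spectral_norm(M, iters=300, seed=0):
    """Estimate the largest singular value of M by power iteration on M^T M."""
    rows, cols = len(M), len(M[0])
    rng = random.Random(seed)
    x = [rng.gauss(0.0, 1.0) for _ in range(cols)]
    for _ in range(iters):
        y = [sum(M[i][j] * x[j] for j in range(cols)) for i in range(rows)]
        x = [sum(M[i][j] * y[i] for i in range(rows)) for j in range(cols)]
        norm = sum(t * t for t in x) ** 0.5
        if norm == 0.0:
            return 0.0  # M maps the iterate to zero; happens for M = 0
        x = [t / norm for t in x]
    y = [sum(M[i][j] * x[j] for j in range(cols)) for i in range(rows)]
    return sum(t * t for t in y) ** 0.5


def sparsification_error(A, S):
    """Spectral norm of A - (n^2 / s) * A_S, where s = |S| and A_S keeps
    only the entries of A indexed by the set S of (row, column) pairs."""
    n, s = len(A), len(S)
    scale = n * n / s
    E = [
        [A[i][j] - (scale * A[i][j] if (i, j) in S else 0.0) for j in range(n)]
        for i in range(n)
    ]
    return spectral_norm(E)


n = 20
A = [[1.0] * n for _ in range(n)]  # all-ones matrix: PSD with bounded entries
rng = random.Random(1)
S = set(rng.sample([(i, j) for i in range(n) for j in range(n)], 200))
print(sparsification_error(A, S) / n)  # error relative to n
```

Keeping every entry gives zero error, and shrinking $S$ trades sparsity against the spectral error.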