Given access to the hypergraph through a subset query oracle in the query model, we give sublinear-time algorithms for Hitting-Set with almost tight parameterized query complexity. In parameterized query complexity, we bound the number of queries to the oracle in terms of the parameter $k$, the size of the Hitting-Set. The subset query oracle we use in this paper is called the Generalized $d$-partite Independent Set query oracle (GPIS); it was introduced by Bishnu et al. (ISAAC'18). GPIS is a generalization to hypergraphs of the Bipartite Independent Set query oracle (BIS) introduced by Beame et al. (ITCS'18 and TALG'20) for estimating the number of edges in graphs. Formally, GPIS is defined as follows: the GPIS oracle for a $d$-uniform hypergraph $\mathcal{H}$ takes as input $d$ pairwise disjoint non-empty subsets $A_1, \ldots, A_d$ of vertices in $\mathcal{H}$ and answers whether there is a hyperedge in $\mathcal{H}$ that intersects each set $A_i$, where $i \in \{1, \, 2, \, \ldots, d\}$. For $d=2$, the GPIS oracle is simply the BIS oracle. We show that $d$-Hitting-Set, the hitting set problem for $d$-uniform hypergraphs, can be solved using $\widetilde{\mathcal{O}}_d(k^{d} \log n)$ GPIS queries. Additionally, we show that $d$-Decision-Hitting-Set, the decision version of $d$-Hitting-Set, can be solved with $\widetilde{\mathcal{O}}_d\left( \min \left\{ k^d\log n, k^{2d^2} \right\} \right)$ GPIS queries. We complement these parameterized upper bounds with an almost matching parameterized lower bound: any algorithm that solves $d$-Decision-Hitting-Set requires $\Omega \left( \binom{k+d}{d} \right)$ GPIS queries.
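To make the query model concrete, the following minimal Python sketch evaluates a single GPIS query against an explicitly stored $d$-uniform hypergraph (the hypergraph and the queries are hypothetical illustrations; in the query model, an algorithm may access $\mathcal{H}$ only through such yes/no answers):

```python
def gpis_oracle(hyperedges, parts):
    """Answer one GPIS query: given pairwise disjoint non-empty vertex
    sets A_1, ..., A_d, report whether some hyperedge intersects every A_i."""
    return any(all(edge & part for part in parts) for edge in hyperedges)

# Toy 3-uniform hypergraph on vertices 0..5 (hypothetical example).
H = [frozenset(e) for e in [(0, 1, 2), (1, 3, 4), (2, 4, 5)]]
print(gpis_oracle(H, [{0}, {1, 3}, {2, 4}]))  # True: (0,1,2) meets all three sets
print(gpis_oracle(H, [{0}, {3}, {5}]))        # False: no hyperedge meets all three
```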
We consider the question of estimating multi-dimensional Gaussian mixtures (GM) with compactly supported or subgaussian mixing distributions. The minimax estimation rate for this class (under Hellinger, TV and KL divergences) is a long-standing open question, even in one dimension. In this paper we characterize this rate (for all constant dimensions) in terms of the metric entropy of the class. Such characterizations originate from the seminal works of Le Cam (1973), Birgé (1983), Haussler and Opper (1997), and Yang and Barron (1999). However, for GMs a key ingredient missing from earlier work (and widely sought after) is a comparison result showing that the KL divergence and the squared Hellinger distance are within a constant multiple of each other uniformly over the class. Our main technical contribution is to establish this fact, from which we derive the entropy characterization of the estimation rate under Hellinger and KL. Interestingly, the sequential (online learning) estimation rate is characterized by the global entropy, while the single-step (batch) rate corresponds to the local entropy, paralleling a similar result for the Gaussian sequence model recently discovered by Neykov (2022) and Mourtada (2023). Additionally, since Hellinger is a proper metric, our comparison shows that GMs under KL satisfy the triangle inequality up to multiplicative constants, implying that proper and improper estimation rates coincide.
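In schematic form (the notation below is ours, not the paper's), the comparison in question reads as follows; the left inequality is classical and holds for all pairs of distributions, while the right one, with a constant $C$ presumably depending on the dimension and on the compact-support or subgaussian parameter, is the main technical contribution:
\[
H^2(P,Q) := \int \bigl(\sqrt{\mathrm{d}P} - \sqrt{\mathrm{d}Q}\bigr)^2
\;\le\;
D_{\mathrm{KL}}(P \,\|\, Q)
\;\le\;
C \cdot H^2(P,Q)
\quad \text{uniformly over GMs } P, Q.
\]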
We give improved and almost optimal testers for several classes of Boolean functions on $n$ inputs that have concise representations, in both the uniform and the distribution-free model. These classes include $k$-juntas, $k$-linear functions, $s$-term DNF, $s$-term monotone DNF, $r$-DNF, decision lists, $r$-decision lists, size-$s$ decision trees, size-$s$ Boolean formulas, size-$s$ branching programs, $s$-sparse polynomials over the binary field, and functions with Fourier degree at most $d$. The method can be extended to several other classes of functions over any domain that can be approximated by functions with a small number of relevant variables.
We revisit the problem of computing with noisy information considered in Feige et al. (1994), which includes computing the OR function from noisy queries, and computing the MAX, SEARCH and SORT functions from noisy pairwise comparisons. For $K$ given elements, the goal is to correctly recover the desired function with probability at least $1-\delta$ when the outcome of each query is flipped with probability $p$. We consider both the adaptive sampling setting, where each query can be adaptively designed based on past outcomes, and the non-adaptive sampling setting, where queries cannot depend on past outcomes. Prior work provides tight bounds on the worst-case query complexity in terms of the dependence on $K$; however, the upper and lower bounds do not match in terms of the dependence on $\delta$ and $p$. We improve the lower bounds for all four functions under both adaptive and non-adaptive query models. Most of our lower bounds match the upper bounds up to constant factors when either $p$ or $\delta$ is bounded away from $0$, while the ratio between the best prior upper and lower bounds goes to infinity as $p\rightarrow 0$ or $p\rightarrow 1/2$. In addition, we provide matching upper and lower bounds for the number of queries in expectation, improving both the upper and lower bounds for the variable-length query model.
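For intuition about the roles of $p$ and $\delta$, the classical repetition-and-majority baseline for noisy OR (not the improved schemes of this paper) is sketched below in Python; the repetition count $O\!\left(\log(K/\delta)/(1/2-p)^2\right)$ follows from a Hoeffding bound for each majority vote plus a union bound over the $K$ elements:

```python
import math
import random

def noisy_query(bit, p, rng):
    """Return `bit` flipped with probability p."""
    return bit ^ (rng.random() < p)

def noisy_or(bits, p, delta, rng):
    """Non-adaptive baseline for OR: query each of the K bits m times and
    take majority votes, with m chosen so the overall error is <= delta."""
    K = len(bits)
    # One majority vote errs with prob <= exp(-2*m*(1/2 - p)^2) (Hoeffding);
    # a union bound over the K coordinates then gives total error <= delta.
    m = math.ceil(math.log(K / delta) / (2 * (0.5 - p) ** 2))
    return any(sum(noisy_query(b, p, rng) for _ in range(m)) > m / 2
               for b in bits)

rng = random.Random(0)
print(noisy_or([0] * 63 + [1], p=0.2, delta=0.05, rng=rng))  # True w.h.p.
```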
We study the problem of finding the smallest graph that does not occur as an induced subgraph of a given graph. In an $n$-vertex graph, this missing induced subgraph has at most logarithmic size and can be found by brute-force search in time $n^{O(\log n)}$. We show that under the Exponential Time Hypothesis this quasipolynomial time bound is optimal. We also consider variations of the problem in which either the missing subgraph or the given graph comes from a restricted graph family; for instance, we prove that the smallest missing planar induced subgraph of a given planar graph can be found in polynomial time.
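For illustration, a naive Python sketch of the brute-force search behind the $n^{O(\log n)}$ bound: enumerate candidate graphs by increasing size (labelled rather than unlabelled, for simplicity) and test induced-subgraph containment exhaustively. This is practical only for very small inputs:

```python
from itertools import combinations, permutations

def has_induced(G, H):
    """Test whether graph G (dict: vertex -> set of neighbours) contains
    H as an induced subgraph, by brute force over injections of V(H) into V(G)."""
    hv = list(H)
    for img in permutations(G, len(hv)):
        if all((hv[j] in H[hv[i]]) == (img[j] in G[img[i]])
               for i, j in combinations(range(len(hv)), 2)):
            return True
    return False

def all_graphs(s):
    """Yield every labelled graph on vertex set {0, ..., s-1}."""
    pairs = list(combinations(range(s), 2))
    for mask in range(1 << len(pairs)):
        G = {v: set() for v in range(s)}
        for k, (u, v) in enumerate(pairs):
            if mask >> k & 1:
                G[u].add(v)
                G[v].add(u)
        yield G

def smallest_missing(G):
    """Return a smallest graph that is not an induced subgraph of G."""
    s = 1
    while True:
        for H in all_graphs(s):
            if not has_induced(G, H):
                return H
        s += 1

# Hypothetical example: the 4-cycle has no independent set of size 3,
# so the empty graph on 3 vertices is a smallest missing induced subgraph.
C4 = {0: {1, 3}, 1: {0, 2}, 2: {1, 3}, 3: {0, 2}}
H = smallest_missing(C4)
print(len(H), [(u, v) for u, vs in H.items() for v in vs if u < v])  # 3 []
```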
We study the problem of social welfare maximization in bilateral trade, where two agents, a buyer and a seller, trade an indivisible item. We consider arguably the simplest form of mechanisms -- fixed-price mechanisms, where the designer offers trade to the seller and the buyer at a fixed price. Besides their simple form, fixed-price mechanisms are also the only DSIC and budget-balanced mechanisms in bilateral trade. We obtain improved approximation ratios for fixed-price mechanisms in different settings. In the full prior information setting, where the designer has access to the value distributions of both the seller and the buyer, we show that the optimal fixed-price mechanism achieves at least $0.72$ of the optimal welfare, and that no fixed-price mechanism can achieve more than $0.7381$ of the optimal welfare. Prior to our result, the state-of-the-art approximation ratio was $1 - 1/e + 0.0001 \approx 0.632$. Interestingly, we further show that the optimal approximation ratio achievable with full prior information is identical to the optimal ratio obtainable with only one-sided prior information. We further consider two limited-information settings. In the first, the designer is given only the mean of the buyer's (or the seller's) value. We show that with such minimal information one can already design a fixed-price mechanism that achieves $2/3$ of the optimal social welfare, which surpasses the previous state-of-the-art ratio even when the designer has access to the full prior information; furthermore, $2/3$ is the optimal attainable ratio in this setting. In the second, we assume that the designer has sample access to the value distributions. We propose a new family of mechanisms called order statistic mechanisms and provide a complete characterization of their approximation ratios for any fixed number of samples.
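The welfare benchmark in these results can be illustrated with a small Monte Carlo experiment; the uniform value distributions below are hypothetical, and the price rule (the mean of the buyer's value) mirrors the limited-information setting just described:

```python
import random

def fixed_price_welfare(price, seller_vals, buyer_vals):
    """Average realized welfare of a fixed-price mechanism: trade occurs
    iff seller value <= price <= buyer value; welfare is the buyer's
    value if trade occurs and the seller's value otherwise."""
    return sum(b if s <= price <= b else s
               for s, b in zip(seller_vals, buyer_vals)) / len(buyer_vals)

def first_best_welfare(seller_vals, buyer_vals):
    """Optimal welfare: the item always ends up with the higher-value agent."""
    return sum(max(s, b) for s, b in zip(seller_vals, buyer_vals)) / len(buyer_vals)

rng = random.Random(1)
S = [rng.random() for _ in range(200_000)]  # hypothetical seller values
B = [rng.random() for _ in range(200_000)]  # hypothetical buyer values
price = sum(B) / len(B)                     # price = mean of the buyer's value
# On this instance the ratio is well above the 2/3 guarantee.
print(fixed_price_welfare(price, S, B) / first_best_welfare(S, B))
```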
The Multiple Drone-Delivery Scheduling Problem (MDSP) is a scheduling problem that maximizes the reward earned by a set of $m$ drones executing a sequence of deliveries along a truck delivery route. The current best-known approximation algorithm for the problem is the $\frac{1}{4}$-approximation algorithm of Jana and Mandal (2022). In this paper, we propose exact and approximation algorithms for the general MDSP as well as for a unit-cost variant. We first propose a greedy algorithm and show that it is a $\frac{1}{3}$-approximation for the general MDSP, provided the number of conflicting intervals is less than the number of drones. We then introduce a unit-cost variant of MDSP and devise an exact dynamic programming algorithm that runs in polynomial time when the number of drones $m$ is constant.
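Since the abstract does not fully specify the instance format, the Python sketch below assumes a common MDSP formulation: each delivery is an interval on the truck route with a reward and a battery cost, each drone has a battery budget, and deliveries assigned to one drone must be pairwise non-overlapping. The greedy rule shown (largest reward first, first feasible drone) only illustrates the problem structure; the paper's $\frac{1}{3}$-approximate greedy may order and assign differently:

```python
def greedy_mdsp(deliveries, m, budget):
    """Hedged greedy sketch for an assumed MDSP formulation: deliveries
    are (start, end, reward, cost) tuples; a drone may take pairwise
    non-overlapping deliveries of total battery cost at most `budget`."""
    drones = [{"jobs": [], "battery": budget} for _ in range(m)]
    for st, en, rw, cost in sorted(deliveries, key=lambda d: -d[2]):
        for dr in drones:
            fits = cost <= dr["battery"]
            disjoint = all(en <= s or e <= st for s, e, _, _ in dr["jobs"])
            if fits and disjoint:
                dr["jobs"].append((st, en, rw, cost))
                dr["battery"] -= cost
                break
    return sum(rw for dr in drones for _, _, rw, _ in dr["jobs"]), drones

# Hypothetical instance: 2 drones, battery budget 10.
jobs = [(0, 3, 8, 4), (2, 5, 7, 5), (4, 8, 6, 6), (1, 6, 9, 9)]
total, plan = greedy_mdsp(jobs, m=2, budget=10)
print(total)  # 23: job (2,5,7,5) conflicts or exceeds the remaining battery
```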
The problem Power Dominating Set (PDS) is motivated by the placement of phasor measurement units to monitor electrical networks. It asks for a minimum set of vertices in a graph that observes all remaining vertices by exhaustively applying two observation rules. Our contribution is twofold. First, we determine the parameterized complexity of PDS by proving it is $W[P]$-complete when parameterized with respect to the solution size. We note that it was only known to be $W[2]$-hard before. Our second and main contribution is a new algorithm for PDS that efficiently solves practical instances. Our algorithm consists of two complementary parts. The first is a set of reduction rules for PDS that can also be used in conjunction with previously existing algorithms. The second is an algorithm for solving the remaining kernel based on the implicit hitting set approach. Our evaluation on a set of power grid instances from the literature shows that our solver outperforms previous state-of-the-art solvers for PDS by more than one order of magnitude on average. Furthermore, our algorithm can solve previously unsolved instances of continental scale within a few minutes.
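The two observation rules, in their standard form, are: (1) a vertex with a PMU observes itself and all its neighbours; (2) if an observed vertex has exactly one unobserved neighbour, that neighbour becomes observed as well. A minimal Python sketch of the resulting propagation closure (the graph is a hypothetical example):

```python
def observed_set(adj, pmus):
    """Exhaustively apply the two standard PDS observation rules and
    return the set of observed vertices for PMU placement `pmus`."""
    obs = set(pmus)
    for v in pmus:                     # rule 1: PMUs observe their neighbourhood
        obs |= adj[v]
    changed = True
    while changed:                     # rule 2: single-unobserved-neighbour propagation
        changed = False
        for v in list(obs):
            unobserved = [u for u in adj[v] if u not in obs]
            if len(unobserved) == 1:
                obs.add(unobserved[0])
                changed = True
    return obs

# Path a-b-c-d: a single PMU at `a` observes everything via propagation.
adj = {"a": {"b"}, "b": {"a", "c"}, "c": {"b", "d"}, "d": {"c"}}
print(observed_set(adj, {"a"}) == set(adj))  # True
```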
We consider the problem of answering connectivity queries on a real algebraic curve. The curve is given as the real trace of an algebraic curve, assumed to be in generic position and defined by rational parametrizations. The query points are given by a zero-dimensional parametrization. We design an algorithm which counts the number of connected components of the real curve under study, and decides which connected component each query point lies in, in time log-linear in $N^6$, where $N$ is the maximum of the degrees and coefficient bit-sizes of the input polynomials. This matches the currently best-known bound for computing the topology of real plane curves. The main novelty of this algorithm is that it avoids computing the complete topology of the curve.
Despite the growing interest in parallel-in-time methods as an approach to accelerate numerical simulations in atmospheric modelling, improving their stability and convergence remains a substantial challenge for their application to operational models. In this work, we study the temporal parallelization of the shallow water equations on the rotating sphere combined with time-stepping schemes commonly used in atmospheric modelling due to their stability properties, namely an Eulerian implicit-explicit (IMEX) method and a semi-Lagrangian semi-implicit method (SL-SI-SETTLS). The main goal is to investigate the performance of parallel-in-time methods, namely Parareal and Multigrid Reduction in Time (MGRIT), when these well-established schemes are used on the coarse discretization levels, and to provide insight into how they can be improved for better performance. We begin with an analytical stability study of Parareal and MGRIT applied to a linearized ordinary differential equation, as a function of the choice of coarse scheme. Next, we perform numerical simulations of two standard tests to evaluate the stability, convergence and speedup provided by the parallel-in-time methods compared to a fine reference solution computed serially. We also conduct a detailed investigation of the influence of artificial viscosity and hyperviscosity approaches, applied on the coarse discretization levels, on the performance of the temporal parallelization. Both the analytical stability study and the numerical simulations indicate poorer stability when SL-SI-SETTLS is used on the coarse levels compared to the IMEX scheme. With the IMEX scheme, a better trade-off between convergence, stability and speedup relative to serial simulations can be obtained under suitable parameter and artificial viscosity choices, suggesting that the approach could become competitive for realistic models.
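A minimal Python sketch of the Parareal iteration on the Dahlquist test problem $y' = \lambda y$, the kind of linearized ODE used in such stability studies; for brevity, both propagators here are implicit Euler rather than the IMEX and SL-SI-SETTLS schemes considered in the paper:

```python
import numpy as np

def parareal(y0, lam, T, n_coarse, n_fine_per, n_iter):
    """Parareal for y' = lam*y with implicit Euler as both propagators:
    one step per coarse interval for G, n_fine_per substeps for F."""
    dT = T / n_coarse

    def G(y):  # coarse propagator over one interval
        return y / (1 - lam * dT)

    def F(y):  # fine propagator over one interval
        dt = dT / n_fine_per
        for _ in range(n_fine_per):
            y = y / (1 - lam * dt)
        return y

    Y = np.empty(n_coarse + 1, dtype=complex)
    Y[0] = y0
    for n in range(n_coarse):          # initial coarse sweep
        Y[n + 1] = G(Y[n])

    # Correction: Y_{n+1}^{k+1} = G(Y_n^{k+1}) + F(Y_n^k) - G(Y_n^k).
    for _ in range(n_iter):
        Fy = np.array([F(Y[n]) for n in range(n_coarse)])
        Gy_old = np.array([G(Y[n]) for n in range(n_coarse)])
        for n in range(n_coarse):
            Y[n + 1] = G(Y[n]) + Fy[n] - Gy_old[n]
    return Y

lam = -1.0 + 5j
Y = parareal(1.0, lam, T=2.0, n_coarse=20, n_fine_per=50, n_iter=5)
print(abs(Y[-1] - np.exp(lam * 2.0)))  # error vs the exact solution
```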
Reliable probabilistic primality tests are fundamental in public-key cryptography. In adversarial scenarios, a composite with a high probability of passing a specific primality test could be chosen deliberately; in such cases, we need worst-case error estimates for the test. However, in many scenarios the numbers are chosen at random and thus have a significantly smaller error probability, so we are interested in average-case error estimates. In this paper, we establish such bounds for the strong Lucas primality test, for which only worst-case, but no average-case, error bounds are currently available. This allows us to use this test with more confidence. We examine an algorithm that draws odd $k$-bit integers uniformly and independently, runs $t$ independent iterations of the strong Lucas test with randomly chosen parameters, and outputs the first number that passes all $t$ consecutive rounds. We obtain numerical upper bounds on the probability of returning a composite. Furthermore, we consider a modified version of this algorithm that excludes integers divisible by small primes, resulting in improved bounds. Additionally, we classify the numbers that contribute most to our estimate.
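The sampling procedure under study can be sketched as follows; for brevity, the sketch substitutes a Miller–Rabin round for the strong Lucas test that the paper actually analyses (a faithful strong Lucas round would require Lucas sequences and Jacobi-symbol parameter selection):

```python
import random

def miller_rabin_round(n, rng):
    """One round of Miller-Rabin with a random base; a stand-in for the
    strong Lucas round with random parameters analysed in the paper."""
    d, r = n - 1, 0
    while d % 2 == 0:
        d //= 2
        r += 1
    a = rng.randrange(2, n - 1)
    x = pow(a, d, n)
    if x in (1, n - 1):
        return True
    for _ in range(r - 1):
        x = x * x % n
        if x == n - 1:
            return True
    return False

def first_probable_prime(k, t, rng):
    """Draw odd k-bit integers uniformly and independently, run t
    independent test rounds, and output the first number passing all t."""
    while True:
        n = rng.randrange(1 << (k - 1), 1 << k) | 1  # uniform odd k-bit integer
        if all(miller_rabin_round(n, rng) for _ in range(t)):
            return n

print(first_probable_prime(k=64, t=10, rng=random.Random(2)))
```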