The $p$-center problem (pCP) is a fundamental problem in location science, where we are given customer demand points and possible facility locations, and we want to choose $p$ of these locations to open a facility so that the maximum distance of any customer demand point to its closest open facility is minimized. State-of-the-art solution approaches for pCP exploit its connection to the set cover problem and solve pCP iteratively by repeatedly solving set cover problems. The classical textbook integer programming (IP) formulation of pCP is usually dismissed due to its size and its weak linear programming (LP) relaxation bounds. We present a novel solution approach that works on a new IP formulation, obtained by a projection from the classical formulation. The formulation is solved by branch-and-cut, where cuts for demand points are generated iteratively. Moreover, the formulation can be strengthened with combinatorial information to obtain a much tighter LP relaxation. In particular, we present a novel way to use lower bound information to obtain stronger cuts. We show that the LP relaxation bound of our strengthened formulation has the same strength as the best known bound in the literature, which is based on a semi-relaxation. Finally, we present a computational study on instances from the literature, including instances with more than 700,000 customers and locations. Our solution algorithm is competitive with highly sophisticated set-cover-based solution algorithms, which depend on various components and parameters.
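For reference, a minimal sketch of the classical textbook formulation (notation ours: $y_j=1$ if a facility is opened at location $j$, $x_{ij}=1$ if demand point $i$ is assigned to location $j$, $d_{ij}$ the distances, and $z$ the coverage radius):
$$\min\, z \quad \text{s.t.} \quad \sum_j y_j = p, \qquad \sum_j x_{ij} = 1 \;\;\forall i, \qquad x_{ij} \le y_j \;\;\forall i,j, \qquad \sum_j d_{ij}\, x_{ij} \le z \;\;\forall i, \qquad x_{ij}, y_j \in \{0,1\}.$$
Its LP relaxation is notoriously weak because fractional assignments can average out large distances; this is the bound that the strengthened projected formulation improves upon.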
Given a fixed finite metric space $(V,\mu)$, the {\em minimum $0$-extension problem}, denoted ${\tt 0\mbox{-}Ext}[\mu]$, is equivalent to the following optimization problem: minimize a function of the form $\sum_i f_i(x_i) + \sum_{ij}c_{ij}\mu(x_i,x_j)$ over $x\in V^n$, where $c_{ij},c_{vi}$ are given nonnegative costs and $f_i:V\rightarrow \mathbb R$ are functions of the form $f_i(x_i)=\sum_{v\in V}c_{vi}\mu(x_i,v)$. The computational complexity of ${\tt 0\mbox{-}Ext}[\mu]$ has recently been established by Karzanov and by Hirai: if the metric $\mu$ is {\em orientable modular} then ${\tt 0\mbox{-}Ext}[\mu]$ can be solved in polynomial time; otherwise ${\tt 0\mbox{-}Ext}[\mu]$ is NP-hard. To prove the tractability part, Hirai developed a theory of discrete convex functions on orientable modular graphs generalizing several known classes of functions in discrete convex analysis, such as $L^\natural$-convex functions. We consider a more general version of the problem in which the unary functions $f_i(x_i)$ can additionally have terms of the form $c_{uv;i}\mu(x_i,\{u,v\})$ for $\{u,v\}\in F$, where the set $F\subseteq\binom{V}{2}$ is fixed. We extend the complexity classification above by providing an explicit condition on $(\mu,F)$ under which the problem is tractable. To prove the tractability part, we generalize Hirai's theory and define a larger class of discrete convex functions. It covers, in particular, another well-known class of functions, namely submodular functions on an integer lattice. Finally, we improve the complexity of Hirai's algorithm for solving ${\tt 0\mbox{-}Ext}[\mu]$ on orientable modular graphs.
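Concretely, reading $\mu(x,\{u,v\})$ as the distance from $x$ to the set $\{u,v\}$ (an interpretation we assume here), the generalized unary terms take the form
$$f_i(x_i) \;=\; \sum_{v\in V} c_{vi}\,\mu(x_i,v) \;+\; \sum_{\{u,v\}\in F} c_{uv;i}\,\min\bigl\{\mu(x_i,u),\,\mu(x_i,v)\bigr\},$$
which strictly generalizes the unary terms allowed in ${\tt 0\mbox{-}Ext}[\mu]$ (recovered by taking $F=\emptyset$).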
We study model-based reward-free reinforcement learning with linear function approximation for episodic Markov decision processes (MDPs). In this setting, the agent works in two phases. In the exploration phase, the agent interacts with the environment and collects samples without observing any reward. In the planning phase, the agent is given a specific reward function and uses the samples collected in the exploration phase to learn a good policy. We propose a new provably efficient algorithm, called UCRL-RFE, under the linear mixture MDP assumption, where the transition probability kernel of the MDP can be parameterized by a linear function over certain feature mappings defined on the triplet of state, action, and next state. We show that to obtain an $\epsilon$-optimal policy for an arbitrary reward function, UCRL-RFE needs to sample at most $\tilde O(H^5d^2\epsilon^{-2})$ episodes during the exploration phase. Here, $H$ is the length of an episode and $d$ is the dimension of the feature mapping. We also propose a variant of UCRL-RFE using a Bernstein-type bonus and show that it needs to sample at most $\tilde O(H^4d(H + d)\epsilon^{-2})$ episodes to achieve an $\epsilon$-optimal policy. By constructing a special class of linear mixture MDPs, we also prove that any reward-free algorithm needs to sample at least $\tilde \Omega(H^2d\epsilon^{-2})$ episodes to obtain an $\epsilon$-optimal policy. Our upper bound matches the lower bound in terms of the dependence on $\epsilon$, and in terms of the dependence on $d$ if $H \ge d$.
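Under this assumption (written in notation we adopt for illustration), the transition kernel is a known feature map combined with an unknown parameter vector:
$$\mathbb{P}(s' \mid s, a) \;=\; \bigl\langle \phi(s, a, s'),\, \theta^* \bigr\rangle, \qquad \theta^* \in \mathbb{R}^d,$$
so learning the dynamics reduces to estimating the $d$-dimensional vector $\theta^*$, which is what makes sample complexity bounds independent of the number of states possible.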
The well-known middle levels conjecture asserts that for every integer $n\geq 1$, all binary strings of length $2(n+1)$ with exactly $n+1$ many 0s and $n+1$ many 1s can be ordered cyclically so that any two consecutive strings differ in swapping the first bit with a complementary bit at some later position. In his book `The Art of Computer Programming Vol. 4A' Knuth raised a stronger form of this conjecture (Problem 56 in Chapter 7, Section 2.1.3), which requires that the sequence of positions with which the first bit is swapped in each step of such an ordering consists of $2n+1$ blocks of the same length, where each block is obtained by adding $s=1$ (modulo $2n+1$) to the previous block. In this work, we prove Knuth's conjecture in a more general form, allowing for arbitrary shifts $s\geq 1$ that are coprime to $2n+1$. We also present an algorithm to compute this ordering, generating each new bitstring in $\mathcal{O}(n)$ time and using $\mathcal{O}(n)$ memory in total.
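To make the step rule concrete, here is a small brute-force sketch (our own illustration, not the paper's $\mathcal{O}(n)$-per-string algorithm) that searches for such a cyclic ordering for tiny $n$:

```python
from itertools import combinations

def valid_step(a, b):
    """b must arise from a by swapping the first bit with a
    complementary bit at some later position."""
    diff = [i for i, (x, y) in enumerate(zip(a, b)) if x != y]
    return len(diff) == 2 and diff[0] == 0 and a[0] != a[diff[1]]

def middle_level_strings(n):
    """All binary strings of length 2(n+1) with exactly n+1 many 1s."""
    length = 2 * (n + 1)
    strings = []
    for ones in combinations(range(length), n + 1):
        s = ['0'] * length
        for i in ones:
            s[i] = '1'
        strings.append(''.join(s))
    return strings

def find_cyclic_ordering(n):
    """Depth-first search for a Hamiltonian cycle under valid_step."""
    strings = middle_level_strings(n)
    start = strings[0]
    path, used = [start], {start}

    def dfs():
        if len(path) == len(strings):
            return valid_step(path[-1], start)  # close the cycle
        for s in strings:
            if s not in used and valid_step(path[-1], s):
                used.add(s)
                path.append(s)
                if dfs():
                    return True
                used.remove(s)
                path.pop()
        return False

    return path if dfs() else None

print(find_cyclic_ordering(1))
# ['1100', '0110', '1010', '0011', '1001', '0101']
```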
Approximate linear programs (ALPs) are well-known models based on value function approximations (VFAs) to obtain policies and lower bounds on the optimal policy cost of discounted-cost Markov decision processes (MDPs). Formulating an ALP requires (i) basis functions, the linear combination of which defines the VFA, and (ii) a state-relevance distribution, which determines the relative importance of different states in the ALP objective for the purpose of minimizing VFA error. Both of these choices are typically made heuristically: basis function selection relies on domain knowledge, while the state-relevance distribution is specified using the frequency of states visited by a heuristic policy. We propose a self-guided sequence of ALPs that embeds random basis functions obtained via inexpensive sampling and uses the known VFA from the previous iteration to guide VFA computation in the current iteration. Self-guided ALPs mitigate the need for domain knowledge during basis function selection, as well as the impact of the initial choice of the state-relevance distribution, thus significantly reducing the ALP implementation burden. We establish high-probability error bounds on the VFAs from this sequence and show that a worst-case measure of policy performance is improved. We find that these favorable implementation and theoretical properties translate to encouraging numerical results on perishable inventory control and options pricing applications, where self-guided ALP policies improve upon policies from problem-specific methods. More broadly, our research takes a meaningful step toward application-agnostic policies and bounds for MDPs.
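For orientation, a generic sketch of the ALP behind this approach (notation ours: $\varphi_k$ basis functions with weights $\beta_k$, $\nu$ the state-relevance distribution, $c$ the cost function, $\gamma$ the discount factor):
$$\max_{\beta}\; \sum_{s} \nu(s) \sum_k \beta_k \varphi_k(s) \quad \text{s.t.} \quad \sum_k \beta_k \varphi_k(s) \;\le\; c(s,a) + \gamma \sum_{s'} P(s' \mid s, a) \sum_k \beta_k \varphi_k(s'), \;\; \forall (s,a).$$
Any feasible $\beta$ yields a VFA that lower-bounds the optimal cost at every state, which is why the ALP simultaneously provides bounds and policies (by acting greedily with respect to the VFA).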
Fluid-structure interactions are central to many biomolecular processes, and they pose a great challenge for computational and modeling methods. In this paper, we consider the immersed boundary method (IBM) for biofluid systems, and to alleviate the computational cost, we apply reduced-order techniques to eliminate the degrees of freedom associated with the large number of fluid variables. We show how reduced models can be derived using a Petrov-Galerkin projection and subspaces that maintain the incompressibility condition. More importantly, the reduced-order model is shown to preserve Lyapunov stability. We also address the practical issue of computing the coefficient matrices in the reduced-order model using an interpolation technique. The efficiency and robustness of the proposed formulation are examined with test examples from various applications.
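As a reminder of the mechanics (a generic sketch, not the paper's specific construction): given a full-order model $\dot{u} = A u + f(u)$ with $u \in \mathbb{R}^N$, a Petrov-Galerkin reduced model seeks $u \approx V \hat{u}$ with a trial basis $V \in \mathbb{R}^{N \times r}$, $r \ll N$, and forces the residual to be orthogonal to a test basis $W \in \mathbb{R}^{N \times r}$:
$$W^\top\!\bigl( V \dot{\hat{u}} - A V \hat{u} - f(V \hat{u}) \bigr) = 0 \;\;\Longrightarrow\;\; \dot{\hat{u}} = (W^\top V)^{-1}\, W^\top\!\bigl( A V \hat{u} + f(V \hat{u}) \bigr),$$
assuming $W^\top V$ is invertible. The choice of $W$ relative to $V$ is what the incompressibility-maintaining and stability-preserving constructions in the paper pin down.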
In this work we study the problem of differentially private (DP) quantiles, in which, given a dataset $X$ and quantiles $q_1, ..., q_m \in [0,1]$, we want to output $m$ quantile estimates which are as close as possible to the true quantiles and preserve DP. We describe a simple recursive DP algorithm, which we call ApproximateQuantiles (AQ), for this task. We give a worst-case upper bound on its error and show that its error is much lower than that of previous implementations on several different datasets. Furthermore, it achieves this low error while running two orders of magnitude faster than the best previous implementation.
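A minimal sketch of the recursive idea (our own illustration under simplifying assumptions: data lies in a known range $[lo, hi]$, a standard exponential-mechanism quantile serves as the base primitive, and edge cases as well as the paper's exact privacy accounting are glossed over):

```python
import numpy as np

def dp_quantile(x, q, lo, hi, eps, rng):
    """One DP quantile via the exponential mechanism. x is a sorted
    array with values in [lo, hi]; the utility of a candidate value v
    is -|#{x_i < v} - q*len(x)|, which is constant on each gap between
    consecutive points, so we sample a gap, then a point inside it."""
    pts = np.concatenate(([lo], x, [hi]))
    gaps = np.maximum(np.diff(pts), 1e-12)
    utility = -np.abs(np.arange(len(x) + 1) - q * len(x))
    logw = np.log(gaps) + (eps / 2.0) * utility
    p = np.exp(logw - logw.max())
    p /= p.sum()
    i = rng.choice(len(p), p=p)
    return rng.uniform(pts[i], pts[i + 1])

def approximate_quantiles(x, qs, lo, hi, eps, rng):
    """Recursive scheme in the spirit of AQ: estimate the middle
    quantile, then split the data and the remaining (rescaled)
    quantiles around it and recurse. Privacy composes over the
    O(log m) recursion levels rather than over all m calls."""
    if len(qs) == 0:
        return []
    m = len(qs) // 2
    v = dp_quantile(np.sort(x), qs[m], lo, hi, eps, rng)
    left = approximate_quantiles(x[x < v], [q / qs[m] for q in qs[:m]],
                                 lo, v, eps, rng)
    right = approximate_quantiles(x[x >= v],
                                  [(q - qs[m]) / (1 - qs[m]) for q in qs[m + 1:]],
                                  v, hi, eps, rng)
    return left + [v] + right

rng = np.random.default_rng(0)
data = rng.normal(size=10_000)
print(approximate_quantiles(data, [0.1, 0.25, 0.5, 0.75, 0.9], -10, 10, 1.0, rng))
```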
We give a number of approximation metatheorems for monotone maximization problems expressible in first-order logic, in substantially more general settings than previously known. We obtain
* a constant-factor approximation algorithm in any class of graphs with bounded expansion,
* a QPTAS in any class with strongly sublinear separators, and
* a PTAS in any fractionally treewidth-fragile class (which includes all common classes with strongly sublinear separators).
Moreover, our tools also give an exact subexponential-time algorithm in any class with strongly sublinear separators.
In the Minimum Common String Partition Problem (MCSP), we are given two strings as input, and we want to partition both into the same collection of substrings, minimizing the number of substrings in the partition. This combinatorial optimization problem has applications in computational biology and is NP-hard. Many different heuristic and exact methods exist for this problem, such as a Greedy approach, Ant Colony Optimization, or Integer Linear Programming. In this paper, we formulate the MCSP as a Dynamic Program and develop an exact solution algorithm for it based on Decision Diagrams. We also introduce a restricted Decision Diagram that allows computing heuristic solutions to the MCSP, and we compare solution quality and runtime with existing approaches on instances from the literature. Our approach scales well and is suitable for heuristically solving large-scale instances.
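For intuition about the problem (and the Greedy baseline mentioned above), here is a small sketch of the classic greedy heuristic, which repeatedly marks the longest block that is still contiguous and unmarked in both strings (our own illustration; it assumes the two strings are permutations of each other, as MCSP requires):

```python
def greedy_mcsp(s, t):
    """Greedy heuristic for Minimum Common String Partition:
    repeatedly mark the longest block that is still unmarked and
    contiguous in both strings."""
    ms = [False] * len(s)  # marked positions in s
    mt = [False] * len(t)  # marked positions in t
    blocks = []
    while not all(ms):
        best = None  # (length, start in s, start in t)
        for i in range(len(s)):
            for j in range(len(t)):
                k = 0
                while (i + k < len(s) and j + k < len(t)
                       and not ms[i + k] and not mt[j + k]
                       and s[i + k] == t[j + k]):
                    k += 1
                if k > 0 and (best is None or k > best[0]):
                    best = (k, i, j)
        k, i, j = best
        for d in range(k):
            ms[i + d] = mt[j + d] = True
        blocks.append(s[i:i + k])
    return blocks

print(greedy_mcsp("ababcab", "abcabab"))  # ['abcab', 'ab'], i.e. 2 blocks
```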
We study financial networks with debt contracts and credit default swaps between specific pairs of banks. Given such a financial system, we want to decide which of the banks are in default and how much of their liabilities these defaulting banks can pay. There can easily be multiple different solutions to this problem, leading to a situation of default ambiguity and to a range of possible solutions for a financial authority to implement. In this paper, we study the properties of the solution space of such financial systems and analyze a wide range of reasonable objective functions for selecting from the set of solutions. Examples of such objective functions include minimizing the number of defaulting banks, minimizing the amount of unpaid debt, and maximizing the number of satisfied banks, among many others. We show that for all of these objectives, it is NP-hard to approximate the optimal solution within an $n^{1-\epsilon}$ factor for any $\epsilon>0$, where $n$ denotes the number of banks. Furthermore, we show that this situation is rather difficult to avoid from a financial regulator's perspective: the same hardness results hold if we apply strong restrictions on the weights of the debts, the structure of the network, or the amount of funds that banks must possess. However, if we restrict both the network structure and the amount of funds simultaneously, then the solution becomes unique and can be found efficiently.
We study query- and computationally efficient planning algorithms with linear function approximation and a simulator. We assume that the agent has only local access to the simulator, meaning that the agent can only query the simulator at states that have been visited before. This setting is more practical than the settings assumed in many prior works on reinforcement learning with a generative model. We propose an algorithm named confident Monte Carlo least-squares policy iteration (Confident MC-LSPI) for this setting. Under the assumption that the Q-functions of all deterministic policies are linear in known features of the state-action pairs, we show that our algorithm has query and computational complexities that are polynomial in the dimension of the features, the effective planning horizon, and the inverse of the targeted sub-optimality, while being independent of the size of the state space. One technical contribution of our work is the introduction of a novel proof technique that makes use of a virtual policy iteration algorithm. We use this method to leverage existing results on $\ell_\infty$-bounded approximate policy iteration and show that our algorithm can learn the optimal policy for the given initial state even with only local access to the simulator. We believe that this technique can be extended to broader settings beyond this work.
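A stripped-down sketch of the Monte Carlo LSPI core (illustration only; it omits the "confident" mechanism that the paper adds to handle rollouts drifting outside the visited region, and all names here are our own):

```python
import numpy as np

def mc_lspi(simulator, phi, core_pairs, actions, d,
            gamma, horizon, n_rollouts, n_iters):
    """Monte Carlo least-squares policy iteration with a linear Q.

    simulator(s, a) -> (reward, next_state)   (local access only:
        we always roll forward from previously visited states)
    phi(s, a) -> np.ndarray of shape (d,)     (known features)
    core_pairs: list of (state, action) pairs to fit on."""
    w = np.zeros(d)  # weights of Q(s, a) = phi(s, a) . w

    def greedy(s):
        return max(actions, key=lambda a: phi(s, a) @ w)

    for _ in range(n_iters):
        X, y = [], []
        for s0, a0 in core_pairs:
            ret = 0.0
            for _ in range(n_rollouts):
                s, a, disc = s0, a0, 1.0
                for _ in range(horizon):
                    r, s = simulator(s, a)   # s has been visited before
                    ret += disc * r / n_rollouts
                    disc *= gamma
                    a = greedy(s)            # follow the current greedy policy
            X.append(phi(s0, a0))
            y.append(ret)
        # policy evaluation: least-squares fit of Q to Monte Carlo
        # returns; improvement happens implicitly via greedy() above
        w = np.linalg.lstsq(np.array(X), np.array(y), rcond=None)[0]
    return w
```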