We revisit the problem of computing an optimal partial cover of points by intervals. We show that the greedy algorithm computes a permutation $\Pi = \pi_1, \pi_2,\ldots$ of the intervals that is $3/4$-competitive for any prefix of $k$ intervals. That is, for any $k$, the union $\pi_1 \cup \cdots \cup \pi_k$ covers at least a $3/4$-fraction of the points covered by the optimal solution using $k$ intervals. We also provide an approximation algorithm that, in $O(n + m/\varepsilon)$ time, computes a cover by $(1+\varepsilon)k$ intervals that is as good as the optimal solution using $k$ intervals, where $n$ is the number of input points and $m$ is the number of intervals (we assume here that the input is presorted). Finally, we show a counterexample illustrating that optimal solutions for set cover do not have the diminishing-returns property -- that is, the marginal benefit of using more sets is not monotonically decreasing. Fortunately, diminishing returns do hold for intervals.
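For illustration, here is a minimal sketch (not the paper's implementation or analysis) of the greedy permutation: repeatedly pick the interval covering the most points not yet covered. Any prefix of the resulting order is the object the $3/4$-competitiveness statement refers to.

```python
# Minimal sketch of the greedy permutation for partial cover of points by
# intervals: repeatedly pick the interval covering the most uncovered points.
# Illustrative only; the paper's algorithm and its analysis are more careful.

def greedy_interval_cover(points, intervals):
    """points: list of numbers; intervals: list of (lo, hi) pairs.
    Returns the greedy permutation of interval indices."""
    uncovered = set(points)
    remaining = set(range(len(intervals)))
    order = []
    while remaining and uncovered:
        # Pick the interval that covers the most still-uncovered points.
        best = max(remaining,
                   key=lambda i: sum(intervals[i][0] <= p <= intervals[i][1]
                                     for p in uncovered))
        order.append(best)
        remaining.remove(best)
        lo, hi = intervals[best]
        uncovered = {p for p in uncovered if not (lo <= p <= hi)}
    return order

# Example: any prefix of the returned order is (per the paper) 3/4-competitive.
print(greedy_interval_cover([1, 2, 3, 5, 8, 9], [(1, 3), (4, 6), (7, 9), (2, 8)]))
```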
Why is ordinary language vague? We argue that in contexts in which a cooperative speaker is not perfectly informed about the world, the use of vague expressions can offer an optimal tradeoff between truthfulness (Gricean Quality) and informativeness (Gricean Quantity). Focusing on expressions of approximation such as "around", which are semantically vague, we show that they allow the speaker to convey indirect probabilistic information, in a way that can give the listener a more accurate representation of the information available to the speaker than any more precise expression would (such as intervals of the form "between"). That is, vague sentences can be more informative than their precise counterparts. We give a probabilistic treatment of the interpretation of "around", and offer a model for the interpretation and use of "around"-statements within the Rational Speech Act (RSA) framework. In our account, the shape of the speaker's distribution matters in ways not predicted by the Lexical Uncertainty model standardly used in the RSA framework for vague predicates. We use our approach to draw further lessons concerning the semantic flexibility of vague expressions and their irreducibility to more precise meanings.
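As a generic illustration of the RSA machinery referenced above (a toy sketch, not the authors' model of "around"): the worlds, the 0/1 semantics, and the tolerance assigned to "around 5" below are hypothetical placeholders.

```python
import numpy as np

# Toy Rational Speech Act computation: literal listener, softmax speaker,
# pragmatic listener.  Generic RSA sketch only, not the paper's account.

worlds = np.arange(0, 11)                              # candidate quantities 0..10
utterances = {
    "around 5":        lambda w: abs(w - 5) <= 2,      # hypothetical tolerance
    "between 4 and 6": lambda w: 4 <= w <= 6,
    "exactly 5":       lambda w: w == 5,
    "(say nothing)":   lambda w: True,                 # always-true null utterance
}
names = list(utterances)
prior = np.full(len(worlds), 1.0 / len(worlds))

# Literal listener L0: prior restricted to the worlds where the utterance is true.
truth = np.array([[float(utterances[u](w)) for w in worlds] for u in names])
L0 = truth * prior
L0 /= L0.sum(axis=1, keepdims=True)

# Speaker S1: softmax over utterances of how informative each one is.
alpha = 4.0
with np.errstate(divide="ignore"):
    utility = alpha * np.log(L0)
S1 = np.exp(utility - utility.max(axis=0, keepdims=True))
S1 /= S1.sum(axis=0, keepdims=True)

# Pragmatic listener L1: Bayesian inversion of the speaker.
L1 = S1 * prior
L1 /= L1.sum(axis=1, keepdims=True)
for name, row in zip(names, L1):
    print(name, np.round(row, 3))
```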
Let $f$ be a nonnegative integer-valued function on the vertex set of a graph. A graph is \textbf{strictly $f$-degenerate} if each nonempty subgraph $\Gamma$ has a vertex $v$ such that $\mathrm{deg}_{\Gamma}(v) < f(v)$. In this paper, we define a new concept, strictly $f$-degenerate transversal, which generalizes list coloring, signed coloring, DP-coloring, $L$-forested-coloring, and $(f_{1}, f_{2}, \dots, f_{s})$-partition. A \textbf{cover} of a graph $G$ is a graph $H$ with vertex set $V(H) = \bigcup_{v \in V(G)} X_{v}$, where $X_{v} = \{(v, 1), (v, 2), \dots, (v, s)\}$, and edge set $\mathscr{M} = \bigcup_{uv \in E(G)}\mathscr{M}_{uv}$, where $\mathscr{M}_{uv}$ is a matching between $X_{u}$ and $X_{v}$. A vertex set $R \subseteq V(H)$ is a \textbf{transversal} of $H$ if $|R \cap X_{v}| = 1$ for each $v \in V(G)$. A transversal $R$ is a \textbf{strictly $f$-degenerate transversal} if $H[R]$ is strictly $f$-degenerate. The main result of this paper is a degree-type result, which generalizes Brooks' theorem, Gallai's theorem, the degree-choosable result, the signed degree-colorable result, and the DP-degree-colorable result. We also give some structural results on critical graphs with respect to strictly $f$-degenerate transversals. Using these results, we can uniformly prove many new and known results. In the final section, we pose some open problems.
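For intuition, strict $f$-degeneracy can be tested by greedy peeling: repeatedly delete a vertex whose current degree is less than $f(v)$; the graph is strictly $f$-degenerate exactly when every vertex can eventually be deleted. A minimal sketch (illustrative, not taken from the paper):

```python
# A graph is strictly f-degenerate iff greedy peeling empties it: any nonempty
# remainder in which no vertex v has degree < f(v) is a witness subgraph.

def is_strictly_f_degenerate(adj, f):
    """adj: dict vertex -> set of neighbours; f: dict vertex -> nonnegative int."""
    alive = set(adj)
    while alive:
        removable = [v for v in alive
                     if sum(1 for u in adj[v] if u in alive) < f[v]]
        if not removable:
            return False          # the subgraph induced by `alive` is a witness
        alive.difference_update(removable)
    return True

# A triangle with f(v) = 2 everywhere is NOT strictly f-degenerate: the
# triangle itself has all degrees equal to 2, never strictly below f.
triangle = {1: {2, 3}, 2: {1, 3}, 3: {1, 2}}
print(is_strictly_f_degenerate(triangle, {1: 2, 2: 2, 3: 2}))   # False

# A path on three vertices with the same f is strictly f-degenerate.
path = {1: {2}, 2: {1, 3}, 3: {2}}
print(is_strictly_f_degenerate(path, {1: 2, 2: 2, 3: 2}))       # True
```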
When are inferences (whether Direct-Likelihood, Bayesian, or Frequentist) obtained from partial data valid? This paper answers this question by offering a new theory about inference with missing data. It proves that as the sample size increases and the extent of missingness decreases, the mean log-likelihood function generated by partial data while ignoring the missingness mechanism will almost surely converge uniformly to the one that would have been generated by complete data; and if the data are Missing at Random (or "partially missing at random"), this convergence depends only on the sample size. Thus, inferences from partial data, such as posterior modes, uncertainty estimates, confidence intervals, likelihood ratios, and indeed all quantities or features derived from the partial-data log-likelihood function, will be consistently estimated and will approximate their complete-data analogues. This adds to previous research, which had only proved the consistency of the posterior mode. Practical implications of this result are discussed, and the theory is verified using a previous study of International Human Rights Law.
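In notation that is standard for ignorable-likelihood inference (and may differ from the paper's), the partial-data mean log-likelihood in question averages the logarithms of the observed-data densities obtained by integrating out the missing components:
\[
  \bar{\ell}_n(\theta) \;=\; \frac{1}{n}\sum_{i=1}^{n}\log f(y_{\mathrm{obs},i}\mid\theta),
  \qquad
  f(y_{\mathrm{obs},i}\mid\theta) \;=\; \int f(y_{\mathrm{obs},i}, y_{\mathrm{mis},i}\mid\theta)\,\mathrm{d}y_{\mathrm{mis},i}.
\]
The theorem above states that, under the stated conditions, $\bar{\ell}_n$ converges almost surely and uniformly to its complete-data analogue, which is why maximizers, likelihood ratios, and other derived quantities converge as well.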
Solving the time-dependent Schr\"odinger equation is an important application area for quantum algorithms. We consider Schr\"odinger's equation in the semi-classical regime. Here the solutions exhibit strong multiple-scale behavior due to a small parameter $\hbar$, in the sense that the dynamics of the quantum states and the induced observables can occur on different spatial and temporal scales. Such a Schr\"odinger equation finds many applications, including in Born-Oppenheimer molecular dynamics and Ehrenfest dynamics. This paper considers quantum analogues of the pseudo-spectral (PS) methods used on classical computers. Estimates of the gate counts in terms of $\hbar$ and the precision $\varepsilon$ are obtained. It is found that the number of required qubits, $m$, scales only logarithmically with respect to $\hbar$. When the solution has bounded derivatives up to order $\ell$, the symmetric Trotter method has gate complexity $\mathcal{O}\Big({ (\varepsilon \hbar)^{-\frac12} \mathrm{polylog}(\varepsilon^{-\frac{3}{2\ell}} \hbar^{-1-\frac{1}{2\ell}})}\Big),$ provided that the diagonal unitary operators in the pseudo-spectral methods can be implemented with $\mathrm{poly}(m)$ operations. When physical observables are the desired outcomes, however, the step size in the time integration can be chosen independently of $\hbar$. The gate complexity in this case is reduced to $\mathcal{O}\Big({\varepsilon^{-\frac12} \mathrm{polylog}( \varepsilon^{-\frac3{2\ell}} \hbar^{-1} )}\Big),$ with $\ell$ again indicating the smoothness of the solution.
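For concreteness (standard notation; the paper's precise setup and scaling may differ), the semi-classical equation is $i\hbar\,\partial_t \psi = -\tfrac{\hbar^{2}}{2}\Delta\psi + V(x)\psi$, and one symmetric Trotter (Strang) step of size $\Delta t$ for the pseudo-spectral discretization splits the propagator as
\[
  e^{-\frac{i}{\hbar}(K+V)\Delta t} \;\approx\; e^{-\frac{i}{2\hbar}V\Delta t}\, e^{-\frac{i}{\hbar}K\Delta t}\, e^{-\frac{i}{2\hbar}V\Delta t},
  \qquad K = -\frac{\hbar^{2}}{2}\Delta,
\]
where the kinetic factor is applied diagonally in Fourier space and the potential factors are diagonal in position space (the diagonal unitaries referred to above).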
We consider the dynamic pricing problem with covariates under a generalized linear demand model: a seller can dynamically adjust the price of a product over a horizon of $T$ time periods, and at each time period $t$, the demand of the product is jointly determined by the price and an observable covariate vector $x_t\in\mathbb{R}^d$ through an unknown generalized linear model. Most of the existing literature assumes the covariate vectors $x_t$'s are independently and identically distributed (i.i.d.); the few papers that relax this assumption either sacrifice model generality or yield sub-optimal regret bounds. In this paper we show that a simple pricing algorithm has an $O(d\sqrt{T}\log T)$ regret upper bound without assuming any statistical structure on the covariates $x_t$ (which can even be arbitrarily chosen). The upper bound on the regret matches the lower bound (even under the i.i.d. assumption) up to logarithmic factors. Our paper thus shows that (i) the i.i.d. assumption is not necessary for obtaining low regret, and (ii) the regret bound can be independent of the (inverse) minimum eigenvalue of the covariance matrix of the $x_t$'s, a quantity present in previous bounds. Furthermore, we discuss a condition under which a better regret bound is achievable, and show how a Thompson sampling algorithm can be applied to compute the prices efficiently.
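For concreteness (this is one common way to write such a model and its regret; the paper's exact parameterization may differ), with a known link $g$, a known feature map $\phi$ combining covariate and price, and an unknown parameter $\theta$, the expected demand and the regret of a pricing policy $\{p_t\}$ are
\[
  \mathbb{E}\big[d_t \mid x_t, p_t\big] \;=\; g\big(\theta^{\top}\phi(x_t, p_t)\big),
  \qquad
  \mathrm{Regret}(T) \;=\; \sum_{t=1}^{T}\Big(\max_{p}\, p\, g\big(\theta^{\top}\phi(x_t, p)\big) \;-\; p_t\, g\big(\theta^{\top}\phi(x_t, p_t)\big)\Big).
\]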
In this article, we address a class of non-convex, integer, non-linear mathematical programs using dynamic programming. The mathematical program considered, whose properties are studied in this article, may be used to model the optimal liquidation problem of a single-asset portfolio, held in a very large quantity, in a low-volatility and perfect-memory market with few market participants. In this context, the Portfolio Manager's selling actions convey information to market participants, who in turn lower bid prices and further penalize the liquidation proceeds we attempt to maximize. We show that the problem can be solved exactly using Dynamic Programming (DP) in polynomial time. However, exact resolution is only efficient for small instances. For medium-size and large instances, we introduce dedicated heuristics which provide good admissible solutions, and hence tight lower bounds for the initial problem. We also benchmark them against the commercial solver LocalSolver [7]. We are also interested in the continuously relaxed problem, which is non-convex. First, we take continuous solutions obtained with the free solver NLopt [26] and transform them into good admissible solutions of the discrete problem. Second, we provide, under some convexity assumptions, an upper bound for the continuous relaxation, and hence for the initial (integer) problem. Numerical experiments confirm the quality of the proposed heuristics (lower bounds), which often reach the optimum, or prove very tight, for small and medium-size instances, with very fast CPU times. Our upper bound, however, is not tight.
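As an illustration of the dynamic-programming viewpoint, here is a toy sketch over the state (period, remaining quantity); the price-impact model below (permanent impact from cumulative sales plus a temporary per-period impact) is a hypothetical stand-in, not the paper's market model.

```python
from functools import lru_cache

# Toy dynamic program for single-asset liquidation over T periods (sketch only;
# the impact model and parameters below are hypothetical placeholders).

T, Q = 5, 20                        # number of periods, initial quantity (toy sizes)
P0, PERM, TEMP = 100.0, 0.5, 0.2    # initial bid price, impact coefficients

def proceeds(a, sold_before):
    """Cash for selling a units when sold_before units were sold previously."""
    return a * (P0 - PERM * sold_before) - TEMP * a * a

@lru_cache(maxsize=None)
def best(t, remaining):
    """Maximal proceeds from period t onward with `remaining` units left."""
    if t == T:
        return 0.0 if remaining == 0 else float("-inf")   # must liquidate all
    sold_before = Q - remaining
    return max(proceeds(a, sold_before) + best(t + 1, remaining - a)
               for a in range(remaining + 1))

# O(T * Q) states with O(Q) actions each, i.e. polynomial in the instance size.
print(round(best(0, Q), 2))
```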
This article presents a matheuristic algorithm for the single-source capacitated facility location problem (SSCFLP) and its variants: SSCFLP with K facilities (SSCKFLP), SSCFLP with contiguous service areas (CFLSAP), and SSCFLP with K facilities and contiguous service areas (CKFLSAP). The algorithm starts from an initial solution and iteratively improves it by exactly solving large-neighborhood-based sub-problems. The performance of the algorithm is tested on 5 sets of SSCFLP benchmark instances. Among the 272 instances, 191 optimal solutions are found, and 35 best-known solutions are updated. For the largest set of instances, with 300-1000 facilities and 300-1500 customers (Avella and Boccia 2009), the proposed algorithm outperforms existing methods in terms of both solution quality and computational time. Furthermore, based on two geographic areas, two sets of instances are generated to test the algorithm on SSCFLP and its variants. The solutions found by the proposed algorithm are close to the optimal solutions or lower bounds, with average gaps of 0.07% for SSCFLP, 0.22% for CFLSAP, 0.04% for SSCKFLP, and 0.13% for CKFLSAP.
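The overall loop can be sketched as follows (a skeleton only: the neighborhood selection, the cost function, and the exact MIP sub-solver are placeholders passed in by the caller, and the paper's neighborhoods for SSCFLP are more elaborate).

```python
import random

# Skeleton of an improve-by-exact-subproblem matheuristic loop.  `best` is the
# current assignment (e.g. a dict customer -> facility); the three callbacks
# are hypothetical placeholders for the problem-specific components.

def matheuristic(initial_solution, build_subproblem, solve_mip_exactly, cost,
                 iterations=100, neighborhood_size=20, seed=0):
    rng = random.Random(seed)
    best = initial_solution
    for _ in range(iterations):
        # 1. Choose a large neighborhood: free part of the assignment, fix the rest.
        free_part = rng.sample(sorted(best), min(neighborhood_size, len(best)))
        sub = build_subproblem(best, free_part)
        # 2. Solve the restricted sub-problem exactly (e.g. with a MIP solver).
        candidate = solve_mip_exactly(sub)
        # 3. Keep the candidate if it improves the incumbent.
        if candidate is not None and cost(candidate) < cost(best):
            best = candidate
    return best
```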
A fundamental problem in numerical analysis and approximation theory is approximating smooth functions by polynomials. A much harder version under recent consideration is to enforce bounds constraints on the approximating polynomial. In this paper, we consider the problem of approximating functions by polynomials whose Bernstein coefficients with respect to a given degree satisfy such bounds, which implies such bounds on the approximant. We frame the problem as an inequality-constrained optimization problem and give an algorithm for finding the Bernstein coefficients of the exact solution. Additionally, our method can be modified slightly to include equality constraints such as mass preservation. It also extends naturally to multivariate polynomials over a simplex.
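One simple way to compute such a bound-constrained approximation in practice is a discrete least-squares sketch (illustrative; the paper computes the Bernstein coefficients of the exact constrained best approximation): sample the target function, build the Bernstein basis matrix, and solve a bound-constrained least-squares problem for the coefficients. Because the Bernstein basis functions are nonnegative and sum to one, bounds on the coefficients transfer to the polynomial.

```python
import numpy as np
from math import comb
from scipy.optimize import lsq_linear

# Sketch: approximate f on [0,1] by a degree-n polynomial whose Bernstein
# coefficients lie in [lower, upper].  Discrete least-squares illustration,
# not the paper's exact-solution algorithm.

def bernstein_matrix(x, n):
    """Rows: sample points; columns: Bernstein basis polynomials B_{k,n}(x)."""
    return np.column_stack([comb(n, k) * x**k * (1 - x)**(n - k)
                            for k in range(n + 1)])

def bounded_bernstein_fit(f, n, lower, upper, num_samples=200):
    x = np.linspace(0.0, 1.0, num_samples)
    B = bernstein_matrix(x, n)
    res = lsq_linear(B, f(x), bounds=(lower, upper))
    return res.x                      # Bernstein coefficients of the fit

coeffs = bounded_bernstein_fit(lambda x: np.sin(np.pi * x), n=8,
                               lower=0.0, upper=1.0)
print(np.round(coeffs, 3))
```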
Simulations of high-energy particle collisions, such as those used at the Large Hadron Collider, are based on quantum field theory; however, many approximations are made in practice. For example, the simulation of the parton shower, which gives rise to objects called `jets', is based on a semi-classical approximation that neglects various interference effects. While there is a desire to incorporate interference effects, new computational techniques are needed to cope with the exponential growth in complexity associated with quantum processes. We present a classical algorithm called the quantum trellis to efficiently compute the un-normalized probability density over $N$-body phase space including all interference effects, and we pair this with an MCMC-based sampling strategy. This provides a potential path forward for classical computers and a strong baseline for approaches based on quantum computing.
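The MCMC pairing can be illustrated with a generic Metropolis-Hastings sampler over an un-normalized density (the Gaussian-mixture target below is a stand-in for the density the quantum trellis would supply; it is not the paper's density).

```python
import numpy as np

# Generic Metropolis-Hastings sketch for sampling from an un-normalized density.

def unnormalized_density(x):
    # Placeholder two-mode target; the quantum trellis would supply the real one.
    return (np.exp(-0.5 * np.sum((x - 1.0) ** 2))
            + np.exp(-0.5 * np.sum((x + 1.0) ** 2)))

def metropolis_hastings(density, dim, steps=10_000, step_size=0.5, seed=0):
    rng = np.random.default_rng(seed)
    x = np.zeros(dim)
    p = density(x)
    samples = []
    for _ in range(steps):
        proposal = x + step_size * rng.standard_normal(dim)   # symmetric proposal
        p_new = density(proposal)
        if rng.random() < p_new / p:                          # accept with min(1, ratio)
            x, p = proposal, p_new
        samples.append(x)
    return np.array(samples)

samples = metropolis_hastings(unnormalized_density, dim=2)
print(samples.mean(axis=0))          # roughly 0 for this symmetric toy target
```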
In this work, we compare three different modeling approaches for the scores of soccer matches with regard to their predictive performance, based on all matches from the four previous FIFA World Cups 2002--2014: Poisson regression models, random forests, and ranking methods. While the former two are based on the teams' covariate information, the latter method estimates ability parameters that best reflect the teams' current strength. Within this comparison, the best-performing prediction methods on the training data turn out to be the ranking methods and the random forests. However, we show that by combining the random forest with the team ability parameters from the ranking methods as an additional covariate, we can improve the predictive power substantially. Finally, this combination of methods is chosen as the final model, and based on its estimates, the FIFA World Cup 2018 is simulated repeatedly and winning probabilities are obtained for all teams. The model slightly favors Spain over the defending champion Germany. Additionally, we provide survival probabilities for all teams at all tournament stages, as well as the most probable tournament outcome.
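A minimal sketch of the score-simulation step behind such winning probabilities (the Poisson rates below are placeholders, not the fitted values from the combined model):

```python
import numpy as np

# Simulate match scores from Poisson goal rates and estimate outcome
# probabilities empirically.  The rates here are hypothetical examples.

rng = np.random.default_rng(42)

def simulate_match(rate_a, rate_b, n_sims=100_000):
    """Return empirical probabilities of (team A wins, draw, team B wins)."""
    goals_a = rng.poisson(rate_a, n_sims)
    goals_b = rng.poisson(rate_b, n_sims)
    return ((goals_a > goals_b).mean(),
            (goals_a == goals_b).mean(),
            (goals_a < goals_b).mean())

# Hypothetical expected-goal rates for two teams in a single match:
print(simulate_match(1.6, 1.1))
```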