We consider the problem of maximizing a fractionally subadditive function under a knapsack constraint that grows over time. An incremental solution to this problem is given by an order in which to include the elements of the ground set, and the competitive ratio of an incremental solution is defined as the worst, over all capacities, of the ratio between the value of an optimum solution for that capacity and the value achieved by the incremental solution. We present an algorithm that finds an incremental solution of competitive ratio at most $\max\{3.293\sqrt{M},2M\}$, under the assumption that the values of singleton sets are in the range $[1,M]$, and we give a lower bound of $\max\{2.618,M\}$ on the attainable competitive ratio. In addition, we establish that our framework captures potential-based flows between two vertices, and we give a lower bound of $\max\{2,M\}$ and an upper bound of $2M$ for the incremental maximization of classical flows with capacities in $[1,M]$, which is tight for the unit-capacity case.
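To make the competitive-ratio definition concrete, the following is a minimal sketch of how it could be evaluated for a fixed incremental order; the element attribute \texttt{size}, the value oracle \texttt{value}, and the optimum oracle \texttt{opt} are illustrative assumptions, not part of the paper.

\begin{verbatim}
def competitive_ratio(order, value, opt, capacities):
    # Worst ratio, over the given capacities C, of the optimum
    # value opt(C) to the value of the longest prefix of the
    # order that fits within C (assumed nonzero).
    worst = 0.0
    for C in capacities:
        prefix, used = [], 0.0
        for e in order:
            if used + e.size > C:
                break
            prefix.append(e)
            used += e.size
        worst = max(worst, opt(C) / value(prefix))
    return worst
\end{verbatim}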
The non-parametric estimation of a non-linear reaction term in a semi-linear parabolic stochastic partial differential equation (SPDE) is discussed. The estimation error can be bounded in terms of the diffusivity and the noise level. The estimator is easily computable and consistent under general assumptions, owing to the asymptotic spatial ergodicity of the SPDE as both the diffusivity and the noise level tend to zero. If the SPDE is driven by space-time white noise, a central limit theorem for the estimation error and minimax-optimality of the convergence rate are obtained. The analysis of the estimation error requires control of spatial averages of non-linear transformations of the SPDE, and combines the Clark--Ocone formula from Malliavin calculus with the Markovianity of the SPDE. In contrast to previous results on the convergence of spatial averages, the obtained variance bound is uniform in the Lipschitz constant of the transformation. Additionally, new upper and lower Gaussian bounds for the marginal (Lebesgue) densities of the SPDE are required and derived.
We propose a simple and efficient approach to generating prediction intervals (PIs) for approximated and forecasted trends. Our method leverages a weighted asymmetric loss function to estimate the lower and upper bounds of the PI, with the weights determined by the PI's coverage probability. We provide a concise mathematical proof of the method, show how it can be extended to derive PIs for parametrised functions, and argue why the method works for predicting PIs of dependent variables. Tests of the method on a real-world forecasting task using a neural-network-based model show that it can produce reliable PIs in complex machine learning scenarios.
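As a rough illustration of the idea (not the paper's exact loss), the following sketch fits the two PI bounds with the standard pinball (quantile) loss, whose asymmetric weights are set by the target coverage probability; the function names and the choice of quantile levels are assumptions.

\begin{verbatim}
import numpy as np

def pinball_loss(y, y_hat, q):
    # Asymmetric loss: under-predictions (y > y_hat) are
    # weighted by q, over-predictions by (1 - q).
    r = y - y_hat
    return np.mean(np.maximum(q * r, (q - 1.0) * r))

# For a PI with target coverage 1 - alpha, fit the lower bound
# with q = alpha / 2 and the upper bound with q = 1 - alpha / 2.
alpha = 0.1
lower_q, upper_q = alpha / 2, 1.0 - alpha / 2
\end{verbatim}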
We study the initial beam acquisition problem in millimeter wave (mm-wave) networks from the perspective of best arm identification in multi-armed bandits (MABs). For the stationary environment, we propose a novel algorithm called concurrent beam exploration (CBE), in which multiple beams, grouped based on their beam indices, are simultaneously activated to detect the presence of the user. The best beam is then identified using a Hamming decoding strategy. For the case of orthogonal and highly directional thin beams, we characterize the performance of CBE in terms of the probability of missed detection and false alarm in a beam group (BG). Leveraging this, we derive the probability of beam selection error and prove that CBE outperforms state-of-the-art strategies in this metric. Then, for abruptly changing environments, e.g., in the case of moving blockages, we characterize the performance of the classical sequential halving (SH) algorithm. In particular, we derive conditions on the distribution of the change under which the beam selection error is exponentially bounded. When the change is restricted to a subset of the beams, we devise a strategy called K-sequential halving and exhaustive search (K-SHES) that yields an improved bound on the beam selection error compared to SH. This policy is particularly useful when a near-optimal beam becomes optimal during the beam selection procedure due to abruptly changing channel conditions. Finally, we demonstrate the efficacy of the proposed scheme by employing it in a tandem beam refinement and data transmission scheme.
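For reference, here is a minimal sketch of the classical sequential halving baseline analyzed above (not the proposed CBE or K-SHES); the reward oracle \texttt{pull} and the even budget split across rounds are standard choices, stated here as assumptions.

\begin{verbatim}
import math
import numpy as np

def sequential_halving(pull, n_arms, budget):
    # Classical SH: split the budget evenly across ceil(log2 n)
    # rounds, sample all surviving arms equally, keep the better
    # half, and return the last survivor. pull(i) returns a
    # stochastic reward for arm (beam) i.
    arms = list(range(n_arms))
    rounds = max(1, math.ceil(math.log2(n_arms)))
    for _ in range(rounds):
        if len(arms) == 1:
            break
        t = max(1, budget // (len(arms) * rounds))
        means = [np.mean([pull(i) for _ in range(t)]) for i in arms]
        keep = np.argsort(means)[::-1][:max(1, len(arms) // 2)]
        arms = [arms[j] for j in keep]
    return arms[0]
\end{verbatim}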
This work addresses a version of the two-armed Bernoulli bandit problem in which the means of the arms sum to one (the symmetric two-armed Bernoulli bandit). In a regime where the gap between these means goes to zero and the number of prediction periods approaches infinity, we obtain the leading-order terms of the minimax optimal regret and pseudoregret for this problem by associating each of them with a solution of a linear heat equation. Our results improve upon previously known results; specifically, we explicitly compute these leading-order terms in three different scaling regimes for the gap. Additionally, we obtain new non-asymptotic bounds for any given time horizon.
\textit{Pursuit-evasion games} have been intensively studied for several decades due to their numerous applications in artificial intelligence, robot motion planning, database theory, distributed computing, and algorithmic theory. \textsc{Cops and Robber} (\CR) is one of the most well-known pursuit-evasion games played on graphs, where multiple \textit{cops} pursue a single \textit{robber}. The aim is to compute the \textit{cop number} of a graph, $k$, which is the minimum number of cops that ensures the \textit{capture} of the robber. From the viewpoint of parameterized complexity, \CR is W[2]-hard parameterized by $k$~[Fomin et al., TCS, 2010]. Thus, we study structural parameters of the input graph. We begin with the \textit{vertex cover number} ($\mathsf{vcn}$). First, we establish that $k \leq \frac{\mathsf{vcn}}{3}+1$. Second, we prove that \CR parameterized by $\mathsf{vcn}$ is \FPT by designing an exponential kernel. We complement this result by showing that \CR parameterized by $\mathsf{vcn}$ is unlikely to admit a polynomial compression. We extend our exponential kernels to the parameters \textit{cluster vertex deletion number} and \textit{deletion to stars number}, and design a linear vertex kernel for \textit{neighborhood diversity}. Additionally, we extend all of our results to several well-studied variations of \CR.
This work, for the first time, introduces two constant-factor approximation algorithms with linear query complexity for non-monotone submodular maximization over a ground set of size $n$ subject to a knapsack constraint: $\mathsf{DLA}$ and $\mathsf{RLA}$. $\mathsf{DLA}$ is a deterministic algorithm that provides an approximation factor of $6+\epsilon$, while $\mathsf{RLA}$ is a randomized algorithm with an approximation factor of $4+\epsilon$. Both run in $O(n \log(1/\epsilon)/\epsilon)$ query complexity. The key idea for obtaining a constant approximation ratio with linear query complexity lies in: (1) dividing the ground set into two appropriate subsets on which near-optimal solutions can be found with linear queries, and (2) combining a threshold greedy with properties of two disjoint sets, or with a random selection process, to improve solution quality. In addition to the theoretical analysis, we evaluate our proposed solutions on three applications: Revenue Maximization, Image Summarization, and Maximum Weighted Cut, showing that our algorithms not only return results comparable to state-of-the-art algorithms but also require significantly fewer queries.
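To illustrate the threshold-greedy building block mentioned in (2), here is a generic single-threshold sketch (not the $\mathsf{DLA}$/$\mathsf{RLA}$ algorithms themselves; the oracle names and the single-threshold simplification are assumptions):

\begin{verbatim}
def threshold_greedy(elements, f, cost, budget, tau):
    # One pass over the ground set: add an element when its
    # marginal density (gain per unit cost) clears the threshold
    # tau and it still fits in the knapsack. Each element needs
    # O(1) value-oracle queries, so the pass is linear in n.
    S, used = [], 0.0
    for e in elements:
        gain = f(S + [e]) - f(S)
        if used + cost(e) <= budget and gain >= tau * cost(e):
            S.append(e)
            used += cost(e)
    return S
\end{verbatim}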
Maximum weight independent set (MWIS) admits a $\frac1k$-approximation in inductively $k$-independent graphs and a $\frac{1}{2k}$-approximation in $k$-perfectly orientable graphs. These are parameterized classes of graphs that generalize $k$-degenerate graphs, chordal graphs, and intersection graphs of various geometric shapes such as intervals and pseudo-disks, among several others. We consider a generalization of MWIS to a submodular objective. Given a graph $G=(V,E)$ and a non-negative submodular function $f: 2^V \rightarrow \mathbb{R}_+$, the goal is to approximately solve $\max_{S \in \mathcal{I}_G} f(S)$ where $\mathcal{I}_G$ is the set of independent sets of $G$. We obtain an $\Omega(\frac1k)$-approximation for this problem in the two graph classes mentioned above. The first approach is via the multilinear relaxation framework and a simple contention resolution scheme, and this results in a randomized algorithm with approximation ratio at least $\frac{1}{e(k+1)}$. This approach also yields parallel (or low-adaptivity) approximations. Motivated by the goal of designing efficient and deterministic algorithms, we describe two other algorithms for inductively $k$-independent graphs that are inspired by work on streaming algorithms: a preemptive greedy algorithm and a primal-dual algorithm. In addition to being simpler and faster, these algorithms, in the monotone submodular case, yield the first deterministic constant-factor approximations for various special cases that have been previously considered, such as intersection graphs of intervals, disks, and pseudo-disks.
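A sketch in the spirit of the preemptive greedy (illustrative only; the eviction rule, the credit bookkeeping, and the factor \texttt{beta} are assumptions rather than the paper's exact algorithm):

\begin{verbatim}
def preemptive_greedy(stream, f, conflicts, beta=1.0):
    # Streaming-style rule: an arriving vertex v evicts the kept
    # vertices it conflicts with if its marginal gain beats
    # (1 + beta) times the gain previously credited to them.
    S, credit = [], {}
    for v in stream:
        C = [u for u in S if conflicts(u, v)]
        gain = f(S + [v]) - f(S)
        if gain >= (1.0 + beta) * sum(credit[u] for u in C):
            for u in C:
                S.remove(u)
            S.append(v)
            credit[v] = gain
    return S
\end{verbatim}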
The development of new manufacturing techniques such as 3D printing has enabled the creation of previously infeasible chemical reactor designs. Systematically optimizing the highly parameterized geometries involved in these new classes of reactor is vital to ensure enhanced mixing characteristics and feasible manufacturability. Here we present a framework to rapidly solve this nonlinear, computationally expensive, and derivative-free optimization problem, enabling the fast prototyping of novel reactor parameterizations. We take advantage of Gaussian processes to adaptively learn a multi-fidelity model of reactor simulations across a number of different continuous mesh fidelities. The search space of reactor geometries is explored through an amalgam of different, potentially lower-fidelity simulations, which are chosen for evaluation based on a weighted acquisition function that trades off information gain against simulation cost. Within our framework, we derive a novel criterion for monitoring the progress and dictating the termination of multi-fidelity Bayesian optimization, ensuring that a high-fidelity solution is returned before the experimental budget is exhausted. The class of reactors we investigate is helical-tube reactors under pulsed-flow conditions, which have demonstrated outstanding mixing characteristics, have the potential to be highly parameterized, and are easily manufactured using 3D printing. To validate our results, we 3D print and experimentally test the optimal reactor geometry, confirming its mixing performance. In doing so, we demonstrate that our design framework is extensible to a broad variety of expensive simulation-based optimization problems, supporting the design of the next generation of highly parameterized chemical reactors.
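One common way to realize such a cost-aware trade-off is to divide an information-gain proxy by the simulation cost of the chosen fidelity. The sketch below uses expected improvement under the GP posterior as that proxy; the paper's exact acquisition function is not specified here, so this is an assumption.

\begin{verbatim}
import numpy as np
from scipy.stats import norm

def cost_weighted_ei(mu, sigma, best, cost):
    # Expected improvement of a candidate (geometry, fidelity)
    # under the GP posterior (mean mu, std sigma > 0), divided
    # by the simulation cost of that fidelity, so cheap
    # low-fidelity runs can be preferred over expensive ones.
    z = (mu - best) / sigma
    ei = sigma * (z * norm.cdf(z) + norm.pdf(z))
    return ei / cost
\end{verbatim}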
Algorithmic differentiation (AD) is a set of techniques that provide partial derivatives of computer-implemented functions. Such a function can be supplied to state-of-the-art AD tools via its source code, or via an intermediate representation produced while compiling its source code. We present the novel AD tool Derivgrind, which augments the machine code of compiled programs with forward-mode AD logic. Derivgrind leverages the Valgrind instrumentation framework for structured access to the machine code, and a shadow memory tool to store dot values. Access to the source code is required at most for the files in which input and output variables are defined. Derivgrind's versatility comes at the price of scaling the run time by a factor between 30 and 75, measured on a benchmark based on a numerical solver for a partial differential equation. Results of our extensive regression test suite indicate that Derivgrind produces correct results on GCC- and Clang-compiled programs, including a Python interpreter, with a small number of exceptions. While we provide a list of scenarios that Derivgrind does not handle correctly, nearly all of them are academic counterexamples or originate from highly optimized math libraries. As long as differentiating those is avoided, Derivgrind can be applied to an unprecedentedly wide range of cross-language or partially closed-source software with little integration effort.
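To illustrate the forward-mode principle that Derivgrind applies at the machine-code level, here is a minimal Python sketch of dual-number arithmetic; Derivgrind itself operates on compiled binaries and shadow memory, not Python objects.

\begin{verbatim}
class Dual:
    # Carry a value and a dot value (derivative) through each
    # arithmetic operation, as forward-mode AD does per
    # machine-code instruction.
    def __init__(self, val, dot=0.0):
        self.val, self.dot = val, dot
    def __add__(self, o):
        o = o if isinstance(o, Dual) else Dual(o)
        return Dual(self.val + o.val, self.dot + o.dot)
    __radd__ = __add__
    def __mul__(self, o):
        o = o if isinstance(o, Dual) else Dual(o)
        return Dual(self.val * o.val,
                    self.dot * o.val + self.val * o.dot)
    __rmul__ = __mul__

x = Dual(3.0, 1.0)      # seed dx/dx = 1
y = x * x + 2 * x       # f(x) = x^2 + 2x
print(y.val, y.dot)     # 15.0 and f'(3) = 8.0
\end{verbatim}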
With the rapid increase of large-scale, real-world datasets, it becomes critical to address the problem of long-tailed data distribution (i.e., a few classes account for most of the data, while most classes are under-represented). Existing solutions typically adopt class re-balancing strategies such as re-sampling and re-weighting based on the number of observations for each class. In this work, we argue that as the number of samples increases, the additional benefit of a newly added data point will diminish. We introduce a novel theoretical framework to measure data overlap by associating with each sample a small neighboring region rather than a single point. The effective number of samples is defined as the volume of samples and can be calculated by a simple formula $(1-\beta^{n})/(1-\beta)$, where $n$ is the number of samples and $\beta \in [0,1)$ is a hyperparameter. We design a re-weighting scheme that uses the effective number of samples for each class to re-balance the loss, thereby yielding a class-balanced loss. Comprehensive experiments are conducted on artificially induced long-tailed CIFAR datasets and large-scale datasets including ImageNet and iNaturalist. Our results show that when trained with the proposed class-balanced loss, the network is able to achieve significant performance gains on long-tailed datasets.
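A minimal sketch of the resulting re-weighting; only the effective-number formula is taken from the text, while the per-class normalization is an assumption commonly paired with such schemes.

\begin{verbatim}
import numpy as np

def class_balanced_weights(samples_per_class, beta=0.999):
    # Effective number of samples per class:
    # E_n = (1 - beta^n) / (1 - beta).
    n = np.asarray(samples_per_class, dtype=float)
    effective_num = (1.0 - np.power(beta, n)) / (1.0 - beta)
    weights = 1.0 / effective_num
    # Normalize so the weights sum to the number of classes.
    return weights / weights.sum() * len(n)

# Head classes get small weights, tail classes large ones.
print(class_balanced_weights([5000, 500, 50]))
\end{verbatim}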