$d$-dimensional (for $d>1$) efficient range-summability ($d$D-ERS) of ideally i.i.d. random variables (RVs) is a fundamental algorithmic problem that has applications to two important families of database problems, namely, fast approximate wavelet tracking (FAWT) on data streams and approximately answering range-sum queries over a data cube. Whether there are efficient solutions to the $d$D-ERS problem, or to the latter database problem, have been two long-standing open problems. Both are solved in this work. Specifically, we propose a novel solution framework to $d$D-ERS on RVs that have Gaussian or Poisson distribution. Our $d$D-ERS solutions are the first ones that have polylogarithmic time complexities. Furthermore, we develop a novel $k$-wise independence theory that allows our $d$D-ERS solutions to have both high computational efficiencies and strong provable independence guarantees. Finally, we show that under a sufficient and likely necessary condition, certain existing solutions for 1D-ERS can be generalized to higher dimensions.
In this paper, we provide the first deterministic algorithm that achieves the tight $1-1/e$ approximation guarantee for submodular maximization under a cardinality (size) constraint while making a number of queries that scales only linearly with the size of the ground set $n$. To complement our result, we also show strong information-theoretic lower bounds. More specifically, we show that when the maximum cardinality allowed for a solution is constant, no algorithm making a sub-linear number of function evaluations can guarantee any constant approximation ratio. Furthermore, when the constraint allows the selection of a constant fraction of the ground set, we show that any algorithm making fewer than $\Omega(n/\log(n))$ function evaluations cannot perform better than an algorithm that simply outputs a uniformly random subset of the ground set of the right size. We then provide a variant of our deterministic algorithm for the more general knapsack constraint, which is the first linear-time algorithm that achieves $1/2$-approximation guarantee for this constraint. Finally, we extend our results to the general case of maximizing a monotone submodular function subject to the intersection of a $p$-set system and multiple knapsack constraints. We extensively evaluate the performance of our algorithms on multiple real-life machine learning applications, including movie recommendation, location summarization, twitter text summarization and video summarization.
This work deals with the a posteriori error estimates for the Darcy-Forchheimer problem. We introduce the corresponding variational formulation and discretize it by using the finite-element method. A posteriori error estimate with two types of computable error indicators is showed. The first one is linked to the linearization and the second one to the discretization. Finally, numerical computations are performed to show the effectiveness of the error indicators.
We present a method to fit exact Gaussian process models to large datasets by considering only a subset of the data. Our approach is novel in that the size of the subset is selected on the fly during exact inference with little computational overhead. From an empirical observation that the log-marginal likelihood often exhibits a linear trend once a sufficient subset of a dataset has been observed, we conclude that many large datasets contain redundant information that only slightly affects the posterior. Based on this, we provide probabilistic bounds on the full model evidence that can identify such subsets. Remarkably, these bounds are largely composed of terms that appear in intermediate steps of the standard Cholesky decomposition, allowing us to modify the algorithm to adaptively stop the decomposition once enough data have been observed. Empirically, we show that our method can be directly plugged into well-known inference schemes to fit exact Gaussian process models to large datasets.
We establish a novel convergent iteration framework for a weak approximation of general switching diffusion. The key theoretical basis of the proposed approach is a restriction of the maximum number of switching so as to untangle and compensate a challenging system of weakly coupled partial differential equations to a collection of independent partial differential equations, for which a variety of accurate and efficient numerical methods are available. Upper and lower bounding functions for the solutions are constructed using the iterative approximate solutions. We provide a rigorous convergence analysis for the iterative approximate solutions, as well as for the upper and lower bounding functions. Numerical results are provided to examine our theoretical findings and the effectiveness of the proposed framework.
Quantifying the data uncertainty in learning tasks is often done by learning a prediction interval or prediction set of the label given the input. Two commonly desired properties for learned prediction sets are \emph{valid coverage} and \emph{good efficiency} (such as low length or low cardinality). Conformal prediction is a powerful technique for learning prediction sets with valid coverage, yet by default its conformalization step only learns a single parameter, and does not optimize the efficiency over more expressive function classes. In this paper, we propose a generalization of conformal prediction to multiple learnable parameters, by considering the constrained empirical risk minimization (ERM) problem of finding the most efficient prediction set subject to valid empirical coverage. This meta-algorithm generalizes existing conformal prediction algorithms, and we show that it achieves approximate valid population coverage and near-optimal efficiency within class, whenever the function class in the conformalization step is low-capacity in a certain sense. Next, this ERM problem is challenging to optimize as it involves a non-differentiable coverage constraint. We develop a gradient-based algorithm for it by approximating the original constrained ERM using differentiable surrogate losses and Lagrangians. Experiments show that our algorithm is able to learn valid prediction sets and improve the efficiency significantly over existing approaches in several applications such as prediction intervals with improved length, minimum-volume prediction sets for multi-output regression, and label prediction sets for image classification.
Sampling of sharp posteriors in high dimensions is a challenging problem, especially when gradients of the likelihood are unavailable. In low to moderate dimensions, affine-invariant methods, a class of ensemble-based gradient-free methods, have found success in sampling concentrated posteriors. However, the number of ensemble members must exceed the dimension of the unknown state in order for the correct distribution to be targeted. Conversely, the preconditioned Crank-Nicolson (pCN) algorithm succeeds at sampling in high dimensions, but samples become highly correlated when the posterior differs significantly from the prior. In this article we combine the above methods in two different ways as an attempt to find a compromise. The first method involves inflating the proposal covariance in pCN with that of the current ensemble, whilst the second performs approximately affine-invariant steps on a continually adapting low-dimensional subspace, while using pCN on its orthogonal complement.
We present a general framework for designing approximately revenue-optimal mechanisms for multi-item additive auctions, which applies to both truthful and non-truthful auctions. Given a (not necessarily truthful) single-item auction format $A$ satisfying certain technical conditions, we run simultaneous item auctions augmented with a personalized entry fee for each bidder that must be paid before the auction can be accessed. These entry fees depend only on the prior distribution of bidder types, and in particular are independent of realized bids. We bound the revenue of the resulting two-part tariff mechanism using a novel geometric technique that enables revenue guarantees for many common non-truthful auctions that previously had none. Our approach adapts and extends the duality framework of Cai et al [CDW16] beyond truthful auctions. Our framework can be used with many common auction formats, such as simultaneous first-price, simultaneous second-price, and simultaneous all-pay auctions. Our results for first price and all-pay are the first revenue guarantees of non-truthful mechanisms in multi-dimensional environments, addressing an open question in the literature [RST17]. If all-pay auctions are used, we prove that the resulting mechanism is also credible in the sense that the auctioneer cannot benefit by deviating from the stated mechanism after observing agent bids. This is the first static credible mechanism for multi-item additive auctions that achieves a constant factor of the optimal revenue. If second-price auctions are used, we obtain a truthful $O(1)$-approximate mechanism with fixed entry fees that are amenable to tuning via online learning techniques.
In this paper, we propose a direct parallel-in-time (PinT) algorithm for time-dependent problems with first- or second-order derivative. We use a second-order boundary value method as the time integrator that leads to a tridiagonal time discretization matrix. Instead of solving the corresponding all-at-once system iteratively, we diagonalize the time discretization matrix, which yields a direct parallel implementation across all time levels. A crucial issue on this methodology is how the condition number of the eigenvector matrix $V$ grows as $n$ is increased, where $n$ is the number of time levels. A large condition number leads to large roundoff error in the diagonalization procedure, which could seriously pollute the numerical accuracy. Based on a novel connection between the characteristic equation and the Chebyshev polynomials, we present explicit formulas for computing $V$ and $V^{-1}$, by which we prove that $\mathrm{Cond}_2(V)=\mathcal{O}(n^{2})$. This implies that the diagonalization process is well-conditioned and the roundoff error only increases moderately as $n$ grows and thus, compared to other direct PinT algorithms, a much larger $n$ can be used to yield satisfactory parallelism. Numerical results on parallel machine are given to support our findings, where over 60 times speedup is achieved with 256 cores.
Developing classification algorithms that are fair with respect to sensitive attributes of the data has become an important problem due to the growing deployment of classification algorithms in various social contexts. Several recent works have focused on fairness with respect to a specific metric, modeled the corresponding fair classification problem as a constrained optimization problem, and developed tailored algorithms to solve them. Despite this, there still remain important metrics for which we do not have fair classifiers and many of the aforementioned algorithms do not come with theoretical guarantees; perhaps because the resulting optimization problem is non-convex. The main contribution of this paper is a new meta-algorithm for classification that takes as input a large class of fairness constraints, with respect to multiple non-disjoint sensitive attributes, and which comes with provable guarantees. This is achieved by first developing a meta-algorithm for a large family of classification problems with convex constraints, and then showing that classification problems with general types of fairness constraints can be reduced to those in this family. We present empirical results that show that our algorithm can achieve near-perfect fairness with respect to various fairness metrics, and that the loss in accuracy due to the imposed fairness constraints is often small. Overall, this work unifies several prior works on fair classification, presents a practical algorithm with theoretical guarantees, and can handle fairness metrics that were previously not possible.
Robust estimation is much more challenging in high dimensions than it is in one dimension: Most techniques either lead to intractable optimization problems or estimators that can tolerate only a tiny fraction of errors. Recent work in theoretical computer science has shown that, in appropriate distributional models, it is possible to robustly estimate the mean and covariance with polynomial time algorithms that can tolerate a constant fraction of corruptions, independent of the dimension. However, the sample and time complexity of these algorithms is prohibitively large for high-dimensional applications. In this work, we address both of these issues by establishing sample complexity bounds that are optimal, up to logarithmic factors, as well as giving various refinements that allow the algorithms to tolerate a much larger fraction of corruptions. Finally, we show on both synthetic and real data that our algorithms have state-of-the-art performance and suddenly make high-dimensional robust estimation a realistic possibility.