The main focus of this paper is radius-based (supplier) clustering in the two-stage stochastic setting with recourse, where the inherent stochasticity of the model comes in the form of a budget constraint. We also explore a number of variants where additional constraints are imposed on the first-stage decisions, specifically matroid and multi-knapsack constraints. Our eventual goal is to handle supplier problems in the most general distributional setting, where there is only black-box access to the underlying distribution. To that end, we follow a two-step approach. First, we develop algorithms for a restricted version of each problem, where all scenarios are explicitly provided; second, we employ a novel scenario-discarding variant of the standard Sample Average Approximation (SAA) method, which crucially exploits properties of the restricted-case algorithms. We note that the scenario-discarding modification to the SAA method is necessary in order to optimize over the radius.
This paper investigates the best arm identification (BAI) problem in stochastic multi-armed bandits in the fixed confidence setting. The general class of the exponential family of bandits is considered. The state-of-the-art algorithms for the exponential family of bandits face computational challenges. To mitigate these challenges, a novel framework is proposed, which views the BAI problem as sequential hypothesis testing, and is amenable to tractable analysis for the exponential family of bandits. Based on this framework, a BAI algorithm is designed that leverages the canonical sequential probability ratio tests. This algorithm has three features for both settings: (1) its sample complexity is asymptotically optimal, (2) it is guaranteed to be $\delta-$PAC, and (3) it addresses the computational challenge of the state-of-the-art approaches. Specifically, these approaches, which are focused only on the Gaussian setting, require Thompson sampling from the arm that is deemed the best and a challenger arm. This paper analytically shows that identifying the challenger is computationally expensive and that the proposed algorithm circumvents it. Finally, numerical experiments are provided to support the analysis.
In this paper, we study a sequential decision making problem faced by e-commerce carriers related to when to send out a vehicle from the central depot to serve customer requests, and in which order to provide the service, under the assumption that the time at which parcels arrive at the depot is stochastic and dynamic. The objective is to maximize the number of parcels that can be delivered during the service hours. We propose two reinforcement learning approaches for solving this problem, one based on a policy function approximation (PFA) and the second on a value function approximation (VFA). Both methods are combined with a look-ahead strategy, in which future release dates are sampled in a Monte-Carlo fashion and a tailored batch approach is used to approximate the value of future states. Our PFA and VFA make a good use of branch-and-cut-based exact methods to improve the quality of decisions. We also establish sufficient conditions for partial characterization of optimal policy and integrate them into PFA/VFA. In an empirical study based on 720 benchmark instances, we conduct a competitive analysis using upper bounds with perfect information and we show that PFA and VFA greatly outperform two alternative myopic approaches. Overall, PFA provides best solutions, while VFA (which benefits from a two-stage stochastic optimization model) achieves a better tradeoff between solution quality and computing time.
Differentiable renderers provide a direct mathematical link between an object's 3D representation and images of that object. In this work, we develop an approximate differentiable renderer for a compact, interpretable representation, which we call Fuzzy Metaballs. Our approximate renderer focuses on rendering shapes via depth maps and silhouettes. It sacrifices fidelity for utility, producing fast runtimes and high-quality gradient information that can be used to solve vision tasks. Compared to mesh-based differentiable renderers, our method has forward passes that are 5x faster and backwards passes that are 30x faster. The depth maps and silhouette images generated by our method are smooth and defined everywhere. In our evaluation of differentiable renderers for pose estimation, we show that our method is the only one comparable to classic techniques. In shape from silhouette, our method performs well using only gradient descent and a per-pixel loss, without any surrogate losses or regularization. These reconstructions work well even on natural video sequences with segmentation artifacts. Project page: //leonidk.github.io/fuzzy-metaballs
We propose a data-driven way to reduce the noise of covariance matrices of nonstationary systems. In the case of stationary systems, asymptotic approaches were proved to converge to the optimal solutions. Such methods produce eigenvalues that are highly dependent on the inputs, as common sense would suggest. Our approach proposes instead to use a set of eigenvalues totally independent from the inputs and that encode the long-term averaging of the influence of the future on present eigenvalues. Such an influence can be the predominant factor in nonstationary systems. Using real and synthetic data, we show that our data-driven method outperforms optimal methods designed for stationary systems for the filtering of both covariance matrix and its inverse, as illustrated by financial portfolio variance minimization, which makes out method generically relevant to many problems of multivariate inference.
In this work we are interested in general linear inverse problems where the corresponding forward problem is solved iteratively using fixed point methods. Then one-shot methods, which iterate at the same time on the forward problem solution and on the inverse problem unknown, can be applied. We analyze two variants of the so-called multi-step one-shot methods and establish sufficient conditions on the descent step for their convergence, by studying the eigenvalues of the block matrix of the coupled iterations. Several numerical experiments are provided to illustrate the convergence of these methods in comparison with the classical usual and shifted gradient descent. In particular, we observe that very few inner iterations on the forward problem are enough to guarantee good convergence of the inversion algorithm.
Generative Adversarial Networks (GANs) have achieved a great success in unsupervised learning. Despite its remarkable empirical performance, there are limited theoretical studies on the statistical properties of GANs. This paper provides approximation and statistical guarantees of GANs for the estimation of data distributions that have densities in a H\"{o}lder space. Our main result shows that, if the generator and discriminator network architectures are properly chosen, GANs are consistent estimators of data distributions under strong discrepancy metrics, such as the Wasserstein-1 distance. Furthermore, when the data distribution exhibits low-dimensional structures, we show that GANs are capable of capturing the unknown low-dimensional structures in data and enjoy a fast statistical convergence, which is free of curse of the ambient dimensionality. Our analysis for low-dimensional data builds upon a universal approximation theory of neural networks with Lipschitz continuity guarantees, which may be of independent interest.
This paper proposes a two-stage estimation approach for a spatial misalignment scenario that is motivated by the epidemiological problem of linking pollutant exposures and health outcomes. We use the integrated nested Laplace approximation method to estimate the parameters of a two-stage spatio-temporal model; the first stage models the exposures while the second stage links the health outcomes to exposures. The first stage is based on the Bayesian melding model, which assumes a common latent field for the geostatistical monitors data and a high-resolution data such as satellite data. The second stage fits a GLMM using the spatial averages of the estimated latent field, and additional spatial and temporal random effects. Uncertainty from the first stage is accounted for by simulating repeatedly from the posterior predictive distribution of the latent field. A simulation study was carried out to assess the impact of the sparsity of the data on the monitors, number of time points, and the specification of the priors in terms of the biases, RMSEs, and coverage probabilities of the parameters and the block-level exposure estimates. The results show that the parameters are generally estimated correctly but there is difficulty in estimating the latent field parameters. The method works very well in estimating block-level exposures and the effect of exposures on the health outcomes, which is the primary parameter of interest for spatial epidemiologists and health policy makers, even with the use of non-informative priors.
The Laplace approximation (LA) has been proposed as a method for approximating the marginal likelihood of statistical models with latent variables. However, the approximate maximum likelihood estimators (MLEs) based on the LA are often biased for binary or spatial data, and the corresponding Hessian matrix underestimates the standard errors of these approximate MLEs. A higher-order approximation has been proposed; however, it cannot be applied to complicated models such as correlated random effects models and does not provide consistent variance estimators. In this paper, we propose an enhanced LA (ELA) that provides the true MLE and its consistent variance estimator. We study its relationship to the variational Bayes method. We also introduce a new restricted maximum likelihood estimator (REMLE) for estimating dispersion parameters. The results of numerical studies show that the ELA provides a satisfactory MLE and REMLE, as well as their variance estimators for fixed parameters. The MLE and REMLE can be viewed as posterior mode and marginal posterior mode under flat priors, respectively. Some comparisons are also made with Bayesian procedures under different priors.
In this paper, a new Discontinuity Capturing Shallow Neural Network (DCSNN) for approximating $d$-dimensional piecewise continuous functions and for solving elliptic interface problems is developed. There are three novel features in the present network; namely, (i) jump discontinuities are accurately captured, (ii) it is completely shallow, comprising only one hidden layer, (iii) it is completely mesh-free for solving partial differential equations. The crucial idea here is that a $d$-dimensional piecewise continuous function can be extended to a continuous function defined in $(d+1)$-dimensional space, where the augmented coordinate variable labels the pieces of each sub-domain. We then construct a shallow neural network to express this new function. Since only one hidden layer is employed, the number of training parameters (weights and biases) scales linearly with the dimension and the neurons used in the hidden layer. For solving elliptic interface problems, the network is trained by minimizing the mean square error loss that consists of the residual of the governing equation, boundary condition, and the interface jump conditions. We perform a series of numerical tests to demonstrate the accuracy of the present network. Our DCSNN model is efficient due to only a moderate number of parameters needed to be trained (a few hundred parameters used throughout all numerical examples), and the results indicate good accuracy. Compared with the results obtained by the traditional grid-based immersed interface method (IIM), which is designed particularly for elliptic interface problems, our network model shows a better accuracy than IIM. We conclude by solving a six-dimensional problem to demonstrate the capability of the present network for high-dimensional applications.
We consider the design of sublinear space and query complexity algorithms for estimating the cost of a minimum spanning tree (MST) and the cost of a minimum traveling salesman (TSP) tour in a metric on $n$ points. We first consider the $o(n)$-space regime and show that, when the input is a stream of all $\binom{n}{2}$ entries of the metric, for any $\alpha \ge 2$, both MST and TSP cost can be $\alpha$-approximated using $\tilde{O}(n/\alpha)$ space, and that $\Omega(n/\alpha^2)$ space is necessary for this task. Moreover, we show that even if the streaming algorithm is allowed $p$ passes over a metric stream, it still requires $\tilde{\Omega}(\sqrt{n/\alpha p^2})$ space. We next consider the semi-streaming regime, where computing even the exact MST cost is easy and the main challenge is to estimate TSP cost to within a factor that is strictly better than $2$. We show that, if the input is a stream of all edges of the weighted graph that induces the underlying metric, for any $\varepsilon > 0$, any one-pass $(2-\varepsilon)$-approximation of TSP cost requires $\Omega(\varepsilon^2 n^2)$ space; on the other hand, there is an $\tilde{O}(n)$ space two-pass algorithm that approximates the TSP cost to within a factor of 1.96. Finally, we consider the query complexity of estimating metric TSP cost to within a factor that is strictly better than $2$, when the algorithm is given access to a matrix that specifies pairwise distances between all points. For MST estimation in this model, it is known that a $(1+\varepsilon)$-approximation is achievable with $\tilde{O}(n/\varepsilon^{O(1)})$ queries. We design an algorithm that performs $\tilde{O}(n^{1.5})$ distance queries and achieves a strictly better than $2$-approximation when either the metric is known to contain a spanning tree supported on weight-$1$ edges or the algorithm is given access to a minimum spanning tree of the graph.