This work presents a novel matrix-based method for constructing an approximate Hessian using only function evaluations. The method requires less computational effort than interpolation-based methods and is easy to implement in matrix-based programming languages such as MATLAB. As only function evaluations are required, the method is suitable for use in derivative-free algorithms. For reasonably structured sample sets, the method is proven to produce an order-$1$ accurate approximation of the full Hessian. Under more specialized structures, it is proven to yield order-$2$ accuracy. The underdetermined case, where the number of sample points is less than required for full interpolation, is studied, and error bounds are developed for the resulting partial Hessians.
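The paper's matrix-based construction is not reproduced here; as a point of reference for what a Hessian approximation from function evaluations alone looks like, a minimal central-difference sketch (a generic $O(h^2)$ baseline, not the proposed method) is:

\begin{verbatim}
import numpy as np

def fd_hessian(f, x, h=1e-4):
    """Central-difference Hessian of f at x using only function values.

    A generic O(h^2) finite-difference baseline, not the matrix-based
    construction proposed in the paper.
    """
    n = len(x)
    H = np.zeros((n, n))
    E = np.eye(n)
    for i in range(n):
        for j in range(i, n):
            fpp = f(x + h * E[i] + h * E[j])
            fpm = f(x + h * E[i] - h * E[j])
            fmp = f(x - h * E[i] + h * E[j])
            fmm = f(x - h * E[i] - h * E[j])
            H[i, j] = (fpp - fpm - fmp + fmm) / (4.0 * h * h)
            H[j, i] = H[i, j]
    return H

# Example: Rosenbrock function; the exact Hessian at (1, 1) is
# [[802, -400], [-400, 200]].
rosen = lambda x: (1 - x[0])**2 + 100 * (x[1] - x[0]**2)**2
print(fd_hessian(rosen, np.array([1.0, 1.0])))
\end{verbatim}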
This paper examines robust functional data analysis for discretely observed data, where the underlying process may exhibit heavy tails, skewness, or contamination. We propose a unified robust treatment of the functional mean, covariance, and principal component analysis, whereas existing methods and definitions often differ from one another or address only fully observed functions (the ``ideal'' case). Specifically, the robust functional mean can deviate from its non-robust counterpart and is estimated using robust local linear regression. Moreover, we define a new robust functional covariance that shares useful properties with its classic counterpart. Importantly, this covariance yields a robust version of the Karhunen--Lo\`eve decomposition and the corresponding principal components, which are beneficial for dimension reduction. Theoretical results for the robust functional mean, covariance, and eigenfunction estimates, based on pooling discretely observed data (ranging from sparse to dense), are established and align with their non-robust counterparts. Newly proposed perturbation bounds for the estimated eigenfunctions, with indices allowed to grow with the sample size, lay the foundation for further modeling based on robust functional principal component analysis.
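As a rough illustration of one ingredient above, a robust local linear estimate of the mean function from pooled, discretely observed curves can be sketched as follows; the Huber loss, Epanechnikov kernel, and tuning constants are stand-ins chosen for the sketch, not necessarily the paper's choices:

\begin{verbatim}
import numpy as np
from scipy.optimize import minimize

def huber(r, c=1.345):
    a = np.abs(r)
    return np.where(a <= c, 0.5 * r**2, c * (a - 0.5 * c))

def robust_local_linear_mean(t_obs, y_obs, t_grid, h=0.1, c=1.345):
    """Kernel-weighted local linear M-estimate of the mean function.

    t_obs, y_obs: pooled observation times/values across all subjects.
    The Huber loss is a placeholder for a generic robust loss.
    """
    mu = np.empty(len(t_grid))
    for k, t0 in enumerate(t_grid):
        u = (t_obs - t0) / h
        w = np.where(np.abs(u) < 1, 0.75 * (1 - u**2), 0.0)  # Epanechnikov
        idx = w > 0
        def obj(beta):
            r = y_obs[idx] - beta[0] - beta[1] * (t_obs[idx] - t0)
            return np.sum(w[idx] * huber(r, c))
        mu[k] = minimize(obj, x0=np.zeros(2), method="Nelder-Mead").x[0]
    return mu

# Toy data: 50 sparsely observed curves with heavy-tailed noise.
rng = np.random.default_rng(0)
t = rng.uniform(0, 1, size=(50, 5)).ravel()
y = np.sin(2 * np.pi * t) + rng.standard_t(df=2, size=t.size)
print(robust_local_linear_mean(t, y, np.linspace(0.1, 0.9, 5)))
\end{verbatim}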
Sampling-based planning algorithms are a powerful tool for solving planning problems in high-dimensional state spaces. In this article, we present a novel approach to sampling in the most promising regions, which significantly reduces planning time. The RRT# algorithm defines the Relevant Region based on the cost-to-come provided by the optimal forward-searching tree; however, it uses the cumulative cost of a direct connection between the current state and the goal state as the cost-to-go. To improve path-planning efficiency, we propose a batch sampling method that samples in a refined Relevant Region with a direct sampling strategy, defined according to the optimal cost-to-come and an adaptive cost-to-go that takes advantage of various sources of heuristic information. The proposed sampling approach allows the algorithm to grow the search tree toward the most promising area, resulting in superior initial solution quality and a reduced overall computation time compared to related work. To validate the effectiveness of our method, we conducted several simulations in both $SE(2)$ and $SE(3)$ state spaces; the results demonstrate the superiority of the proposed algorithm.
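The refined Relevant Region and the adaptive cost-to-go are defined in the paper; the general mechanism of restricting samples by a cost-to-come plus cost-to-go test can be sketched as below, where \texttt{nearest\_cost\_to\_come}, \texttt{heuristic}, and \texttt{c\_best} are hypothetical stand-ins for quantities maintained by the planner (this is a generic Relevant-Region-style filter, not the proposed refined region):

\begin{verbatim}
import numpy as np

def sample_relevant_region(nearest_cost_to_come, heuristic, c_best,
                           lower, upper, batch_size=64, max_tries=10000,
                           rng=np.random.default_rng()):
    """Rejection-sample a batch of states whose estimated total cost
    (cost-to-come of the nearest tree vertex + heuristic cost-to-go)
    does not exceed the current best solution cost c_best."""
    batch, tries = [], 0
    while len(batch) < batch_size and tries < max_tries:
        x = rng.uniform(lower, upper)
        if nearest_cost_to_come(x) + heuristic(x) <= c_best:
            batch.append(x)
        tries += 1
    return np.array(batch)

# Toy usage in a 2D workspace with a straight-line heuristic to the goal.
start, goal = np.array([0.0, 0.0]), np.array([9.0, 9.0])
batch = sample_relevant_region(
    nearest_cost_to_come=lambda x: np.linalg.norm(x - start),  # stand-in
    heuristic=lambda x: np.linalg.norm(goal - x),
    c_best=16.0, lower=np.zeros(2), upper=10.0 * np.ones(2))
print(batch.shape)
\end{verbatim}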
Matrix completion aims to estimate missing entries in a data matrix, using the assumption of a low-complexity structure (e.g., low rank) so that imputation is possible. While many effective estimation algorithms exist in the literature, uncertainty quantification for this problem has proved to be challenging, and existing methods are extremely sensitive to model misspecification. In this work, we propose a distribution-free method for predictive inference in the matrix completion problem. Our method adapts the framework of conformal prediction, which provides confidence intervals with guaranteed distribution-free validity in the setting of regression, to the problem of matrix completion. Our resulting method, conformalized matrix completion (cmc), offers provable predictive coverage regardless of the accuracy of the low-rank model. Empirical results on simulated and real data demonstrate that cmc is robust to model misspecification while matching the performance of existing model-based methods when the model is correct.
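A minimal split-conformal sketch in the spirit of cmc (assuming uniformly sampled entries and using a naive SVD-based imputer; not the authors' implementation or weighting) is:

\begin{verbatim}
import numpy as np

def lowrank_estimate(M_obs, mask, rank=3, iters=200):
    """Naive low-rank imputation via iterative truncated-SVD refitting."""
    X = np.where(mask, M_obs, 0.0)
    for _ in range(iters):
        U, s, Vt = np.linalg.svd(X, full_matrices=False)
        s[rank:] = 0.0
        X_hat = (U * s) @ Vt
        X = np.where(mask, M_obs, X_hat)
    return X_hat

def conformal_matrix_completion(M_obs, mask, alpha=0.1, rank=3, rng=None):
    """Split-conformal intervals for unobserved entries; a sketch of the
    idea behind cmc, not the paper's algorithm."""
    rng = rng or np.random.default_rng(0)
    obs = np.argwhere(mask)
    cal = obs[rng.random(len(obs)) < 0.3]            # calibration split
    train_mask = mask.copy()
    train_mask[cal[:, 0], cal[:, 1]] = False
    M_hat = lowrank_estimate(M_obs, train_mask, rank)
    scores = np.abs(M_obs[cal[:, 0], cal[:, 1]] - M_hat[cal[:, 0], cal[:, 1]])
    k = int(np.ceil((1 - alpha) * (len(scores) + 1)))
    q = np.sort(scores)[min(k, len(scores)) - 1]     # clipped for brevity
    return M_hat - q, M_hat + q                      # entrywise intervals

# Toy example: noisy rank-2 matrix with ~50% of entries observed.
rng = np.random.default_rng(1)
M_full = rng.normal(size=(60, 2)) @ rng.normal(size=(2, 40))
M_full += 0.1 * rng.normal(size=M_full.shape)
mask = rng.random(M_full.shape) < 0.5
lo, hi = conformal_matrix_completion(M_full, mask)
print(np.mean(((M_full >= lo) & (M_full <= hi))[~mask]))  # empirical coverage
\end{verbatim}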
This paper presents theoretical and practical results for the bin packing problem with scenarios, a generalization of the classical bin packing problem that considers the presence of uncertain scenarios, of which only one is realized. For this problem, we propose an absolute approximation algorithm whose ratio is bounded by the square root of the number of scenarios times the approximation ratio of an algorithm for the vector bin packing problem. We also show how an asymptotic polynomial-time approximation scheme is derived when the number of scenarios is constant. As a practical study of the problem, we present a branch-and-price algorithm to solve an exponential model and a variable neighborhood search heuristic. To speed up the convergence of the exact algorithm, we also consider lower bounds based on dual feasible functions. Computational results show that the branch-and-price algorithm obtains optimal solutions for about 59% of the instances considered, while the heuristic combined with branch-and-price solves 62% of the instances to optimality.
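To make the scenario constraint concrete, a simple first-fit-decreasing baseline (not the proposed branch-and-price or VNS) can be sketched as follows; here a bin is feasible if its items fit the capacity within every scenario in which they appear:

\begin{verbatim}
def first_fit_decreasing_scenarios(items, num_scenarios, capacity=1.0):
    """items: list of (size, set_of_scenarios) pairs.

    A bin may hold any items, but for every scenario the total size of the
    bin's items appearing in that scenario must not exceed the capacity.
    Greedy first-fit over items sorted by decreasing size.
    """
    bins = []         # each bin: per-scenario load vector
    assignment = []   # (item index, bin index) pairs
    order = sorted(range(len(items)), key=lambda i: -items[i][0])
    for i in order:
        size, scen = items[i]
        for b, load in enumerate(bins):
            if all(load[s] + size <= capacity for s in scen):
                for s in scen:
                    load[s] += size
                assignment.append((i, b))
                break
        else:
            load = [0.0] * num_scenarios
            for s in scen:
                load[s] = size
            bins.append(load)
            assignment.append((i, len(bins) - 1))
    return len(bins), assignment

# Toy instance: 6 items, 2 scenarios.
items = [(0.6, {0}), (0.6, {1}), (0.5, {0, 1}),
         (0.4, {0}), (0.4, {1}), (0.3, {0, 1})]
print(first_fit_decreasing_scenarios(items, num_scenarios=2))
\end{verbatim}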
Understanding the gradient variance of black-box variational inference (BBVI) is a crucial step for establishing its convergence and developing algorithmic improvements. However, existing studies have yet to show that the gradient variance of BBVI satisfies the conditions used to study the convergence of stochastic gradient descent (SGD), the workhorse of BBVI. In this work, we show that BBVI satisfies a matching bound corresponding to the $ABC$ condition used in the SGD literature when applied to smooth and quadratically-growing log-likelihoods. Our results generalize to nonlinear covariance parameterizations widely used in the practice of BBVI. Furthermore, we show that the variance of the mean-field parameterization has provably superior dimensional dependence.
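For reference, the $ABC$ condition referred to above is commonly stated in the SGD literature as a bound on the second moment of the stochastic gradient $\hat{g}(\lambda)$ of an objective $F$,
\[
\mathbb{E}\,\bigl\|\hat{g}(\lambda)\bigr\|^{2} \;\le\; 2A\bigl(F(\lambda)-F^{*}\bigr) + B\,\bigl\|\nabla F(\lambda)\bigr\|^{2} + C,
\]
for constants $A, B, C \ge 0$, where $F^{*}$ denotes the infimum of $F$; the result above establishes a bound of this form for the BBVI gradient estimator under smooth, quadratically-growing log-likelihoods.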
We present Design-by-Morphing (DbM), a novel design methodology for creating a search space for the topology optimization of 2D airfoils. Most design techniques impose geometric constraints, and sometimes the designers' bias, on the design space itself, thus restricting the novelty of the designs created and allowing only small local changes. We show that the DbM methodology imposes no such restrictions on the design space and allows extrapolation from the search space, thus yielding a truly radical and large search space with only a few design parameters. To compare with other shape design methodologies, we apply DbM to create a search space for 2D airfoils. We optimize this airfoil shape design space to maximize the lift-over-drag ratio, $CLD_{max}$, and the stall angle tolerance, $\Delta \alpha$. Using a bi-objective genetic algorithm to optimize the DbM space, we obtain a Pareto front of radical airfoils exhibiting remarkable properties for both objectives.
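At its core, morphing amounts to forming a weighted combination of consistently parameterized baseline shapes, with weights summing to one and allowed to be negative, which is what permits extrapolation beyond the baselines. A minimal sketch with hypothetical symmetric baseline airfoils (not the paper's baseline set or parameterization):

\begin{verbatim}
import numpy as np

def naca4_symmetric(thickness, n=100):
    """Upper-surface coordinates of a symmetric 4-digit-style airfoil."""
    x = 0.5 * (1 - np.cos(np.linspace(0, np.pi, n)))   # cosine spacing
    yt = 5 * thickness * (0.2969 * np.sqrt(x) - 0.1260 * x
                          - 0.3516 * x**2 + 0.2843 * x**3 - 0.1015 * x**4)
    return np.column_stack([x, yt])

def morph(baselines, weights):
    """Weighted combination of baseline shapes with weights summing to 1;
    negative weights are allowed, which enables extrapolation outside the
    convex hull of the baselines."""
    weights = np.asarray(weights, dtype=float)
    assert np.isclose(weights.sum(), 1.0)
    return sum(w * b for w, b in zip(weights, baselines))

# Three baseline shapes of different thickness; one extrapolating design.
baselines = [naca4_symmetric(t) for t in (0.08, 0.12, 0.18)]
new_shape = morph(baselines, [-0.3, 0.9, 0.4])
print(new_shape.shape)  # (100, 2) array of (x, y) surface points
\end{verbatim}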
The problem of constrained reinforcement learning (CRL) holds significant importance as it provides a framework for addressing critical safety concerns in reinforcement learning (RL). However, with the introduction of constraint satisfaction, current CRL methods require second-order optimization or primal-dual frameworks with additional Lagrangian multipliers, resulting in increased complexity and inefficiency during implementation. To address these issues, we propose a novel first-order feasible method named Constrained Proximal Policy Optimization (CPPO). Treating the CRL problem as a probabilistic inference problem, our approach adopts the Expectation-Maximization framework and solves it in two steps: 1) calculating the optimal policy distribution within the feasible region (E-step), and 2) conducting a first-order update to move the current policy towards the optimal policy obtained in the E-step (M-step). We establish the relationship between probability ratios and the KL divergence to convert the E-step into a convex optimization problem, and we develop an iterative heuristic algorithm from a geometric perspective to solve it. Additionally, we introduce a conservative update mechanism to overcome the constraint violation issue that arises in existing feasible-region methods. Empirical evaluations conducted in complex and uncertain environments validate the effectiveness of the proposed method, which performs at least as well as other baselines.
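As a toy illustration of the E-step/M-step structure only (a generic EM-style policy update on a single-state problem with discrete actions, ignoring the feasible-region construction and the probability-ratio reformulation; not the CPPO algorithm itself):

\begin{verbatim}
import numpy as np

def softmax(z):
    z = z - z.max()
    e = np.exp(z)
    return e / e.sum()

# Toy one-state problem: 4 actions with reward and cost advantages.
adv_r = np.array([1.0, 0.3, -0.2, -1.0])     # reward advantages
adv_c = np.array([0.8, -0.1, -0.3, 0.2])     # cost advantages (>0 = unsafe)
theta = np.zeros(4)                          # logits of the current policy
eta, cost_penalty, lr = 1.0, 2.0, 0.5

for it in range(50):
    pi = softmax(theta)
    # E-step (sketch): nonparametric target distribution that up-weights
    # high-reward, low-cost actions relative to the current policy.
    q = pi * np.exp((adv_r - cost_penalty * np.maximum(adv_c, 0.0)) / eta)
    q /= q.sum()
    # M-step (sketch): first-order step on KL(q || pi_theta); for a softmax
    # policy the gradient of the cross-entropy term is (pi - q).
    theta -= lr * (pi - q)

print(np.round(softmax(theta), 3))
\end{verbatim}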
Score-based generative models (SGMs) are powerful tools to sample from complex data distributions. Their underlying idea is to (i) run a forward process for time $T_1$ by adding noise to the data, (ii) estimate its score function, and (iii) use such estimate to run a reverse process. As the reverse process is initialized with the stationary distribution of the forward one, the existing analysis paradigm requires $T_1\to\infty$. This is however problematic: from a theoretical viewpoint, for a given precision of the score approximation, the convergence guarantee fails as $T_1$ diverges; from a practical viewpoint, a large $T_1$ increases computational costs and leads to error propagation. This paper addresses the issue by considering a version of the popular predictor-corrector scheme: after running the forward process, we first estimate the final distribution via an inexact Langevin dynamics and then revert the process. Our key technical contribution is to provide convergence guarantees in Wasserstein distance which require to run the forward process only for a finite time $T_1$. Our bounds exhibit a mild logarithmic dependence on the input dimension and the subgaussian norm of the target distribution, have minimal assumptions on the data, and require only to control the $L^2$ loss on the score approximation, which is the quantity minimized in practice.
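On a one-dimensional Gaussian toy example, where the score of the noised distribution is available in closed form (standing in for a learned score network), the finite-time forward process, Langevin corrector, and reverse process can be sketched as:

\begin{verbatim}
import numpy as np

rng = np.random.default_rng(0)
m0, s0 = 3.0, 0.5          # data distribution N(m0, s0^2)
T1, n_steps, n = 1.0, 200, 5000

def score(x, t):
    """Exact score of the OU-noised marginal p_t for Gaussian data;
    in practice this is replaced by a learned score network."""
    m = m0 * np.exp(-t)
    v = s0**2 * np.exp(-2 * t) + 1.0 - np.exp(-2 * t)
    return -(x - m) / v

# (i) forward OU process dX = -X dt + sqrt(2) dW for a *finite* time T1;
# these noised samples are what a score network would be trained on.
x = m0 + s0 * rng.standard_normal(n)
dt = T1 / n_steps
for _ in range(n_steps):
    x = x - x * dt + np.sqrt(2 * dt) * rng.standard_normal(n)

# (ii) corrector: inexact Langevin dynamics targeting p_{T1}, started from
# a rough initialization rather than the stationary distribution.
y = rng.standard_normal(n)
eps = 0.05
for _ in range(100):
    y = y + eps * score(y, T1) + np.sqrt(2 * eps) * rng.standard_normal(n)

# (iii) reverse process dY = [Y + 2 * score(Y, T1 - s)] ds + sqrt(2) dW.
for k in range(n_steps):
    s = k * dt
    y = (y + (y + 2 * score(y, T1 - s)) * dt
         + np.sqrt(2 * dt) * rng.standard_normal(n))

print(f"target mean/std: {m0:.2f}/{s0:.2f}, "
      f"sampled: {y.mean():.2f}/{y.std():.2f}")
\end{verbatim}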
We consider a social choice setting in which agents and alternatives are represented by points in a metric space, and the cost of an agent for an alternative is the distance between the corresponding points in the space. The goal is to choose a single alternative to (approximately) minimize the social cost (cost of all agents) or the maximum cost of any agent, when only limited information about the preferences of the agents is given. Previous work has shown that the best possible distortion one can hope to achieve is $3$ when access to the ordinal preferences of the agents is given, even when the distances between alternatives in the metric space are known. We improve upon this bound of $3$ by designing deterministic mechanisms that exploit a bit of cardinal information. We show that it is possible to achieve distortion $1+\sqrt{2}$ by using the ordinal preferences of the agents, the distances between alternatives, and a threshold approval set per agent that contains all alternatives for whom her cost is within an appropriately chosen factor of her cost for her most-preferred alternative. We show that this bound is the best possible for any deterministic mechanism in general metric spaces, and also provide improved bounds for the fundamental case of a line metric.
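To make the objectives concrete, the following toy line-metric instance (hypothetical positions) computes the social cost and maximum cost of each alternative and the distortion of a given choice on that instance; the paper's $1+\sqrt{2}$-distortion mechanism itself is not reproduced here:

\begin{verbatim}
import numpy as np

agents = np.array([0.0, 1.0, 2.0, 10.0])        # agent positions on a line
alternatives = np.array([1.5, 9.0])             # alternative positions

dist = np.abs(agents[:, None] - alternatives[None, :])   # cost matrix
social_cost = dist.sum(axis=0)                  # utilitarian objective
max_cost = dist.max(axis=0)                     # egalitarian objective
print("social cost per alternative:", social_cost)
print("max cost per alternative:   ", max_cost)

# Distortion of choosing alternative j = (cost of j) / (optimal cost),
# evaluated here on one known metric; a mechanism's distortion is the
# worst case over all metrics consistent with the information it sees.
chosen = 1
print("distortion of this choice (social cost):",
      social_cost[chosen] / social_cost.min())
\end{verbatim}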
Sampling methods (e.g., node-wise, layer-wise, or subgraph sampling) have become an indispensable strategy for speeding up the training of large-scale Graph Neural Networks (GNNs). However, existing sampling methods are mostly based on graph structural information and ignore the dynamics of optimization, which leads to high variance in estimating the stochastic gradients. The high-variance issue can be very pronounced in extremely large graphs, where it results in slow convergence and poor generalization. In this paper, we theoretically analyze the variance of sampling methods and show that, due to the composite structure of the empirical risk, the variance of any sampling method can be decomposed into \textit{embedding approximation variance} in the forward stage and \textit{stochastic gradient variance} in the backward stage, and that mitigating both types of variance is necessary to obtain a faster convergence rate. We propose a decoupled variance reduction strategy that employs (approximate) gradient information to adaptively sample nodes with minimal variance and explicitly reduces the variance introduced by embedding approximation. We show theoretically and empirically that the proposed method, even with smaller mini-batch sizes, enjoys a faster convergence rate and achieves better generalization compared to existing methods.
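A small numerical illustration of the two variance sources on a one-layer mean-aggregation GNN with random (hypothetical) data: mini-batching nodes alone incurs stochastic gradient variance, while additionally sampling neighbors in the forward pass adds embedding approximation variance on top of it. This only illustrates the decomposition, not the proposed variance-reduction strategy:

\begin{verbatim}
import numpy as np

rng = np.random.default_rng(0)
n, d, c, deg = 200, 16, 4, 10           # nodes, features, classes, degree

# Random fixed-degree graph, features, labels, and weight matrix.
nbrs = np.array([rng.choice(n, size=deg, replace=False) for _ in range(n)])
X = rng.standard_normal((n, d))
y = rng.integers(0, c, size=n)
W = 0.1 * rng.standard_normal((d, c))

def softmax(z):
    e = np.exp(z - z.max(axis=1, keepdims=True))
    return e / e.sum(axis=1, keepdims=True)

def grad(nodes, k_nbrs=None):
    """Gradient of the cross-entropy loss w.r.t. W on a node mini-batch.
    If k_nbrs is set, each node aggregates only k sampled neighbors
    (forward-pass embedding approximation)."""
    g = np.zeros_like(W)
    for v in nodes:
        nb = nbrs[v] if k_nbrs is None else rng.choice(nbrs[v], size=k_nbrs)
        h = X[nb].mean(axis=0)                       # mean aggregation
        p = softmax((h @ W)[None, :])[0]
        p[y[v]] -= 1.0                               # dLoss/dlogits
        g += np.outer(h, p)
    return g / len(nodes)

g_full = grad(np.arange(n))                          # exact full gradient

def emp_var(k_nbrs, batch=32, reps=300):
    return np.mean([np.sum((grad(rng.choice(n, batch, replace=False), k_nbrs)
                            - g_full) ** 2) for _ in range(reps)])

print("stochastic gradient variance only :", emp_var(None))
print("+ embedding approximation variance:", emp_var(k_nbrs=2))
\end{verbatim}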