Finding the optimal design of a hydrodynamic or aerodynamic surface is often impossible due to the expense of evaluating the cost functions (say, with computational fluid dynamics) needed to determine the performances of the flows that the surface controls. In addition, inherent limitations of the design space itself due to imposed geometric constraints, conventional parameterization methods, and user bias can restrict {\it all} of the designs within a chosen design space regardless of whether traditional optimization methods or newer, data-driven design algorithms with machine learning are used to search the design space. We present a 2-pronged attack to address these difficulties: we propose (1) a methodology to create the design space using morphing that we call {\it Design-by-Morphing} (DbM); and (2) an optimization algorithm to search that space that uses a novel Bayesian Optimization (BO) strategy that we call {\it Mixed variable, Multi-Objective Bayesian Optimization} (MixMOBO). We apply this shape optimization strategy to maximize the power output of a hydrokinetic turbine. Applying these two strategies in tandem, we demonstrate that we can create a novel, geometrically-unconstrained, design space of a draft tube and hub shape and then optimize them simultaneously with a {\it minimum} number of cost function calls. Our framework is versatile and can be applied to the shape optimization of a variety of fluid problems.
We propose coordinating guiding vector fields to achieve two tasks simultaneously with a team of robots: first, the guidance and navigation of multiple robots to possibly different paths or surfaces typically embedded in 2D or 3D; second, their motion coordination while tracking their prescribed paths or surfaces. The motion coordination is defined by desired parametric displacements between robots on the path or surface. Such a desired displacement is achieved by controlling the virtual coordinates, which correspond to the path or surface's parameters, between guiding vector fields. Rigorous mathematical guarantees underpinned by dynamical systems theory and Lyapunov theory are provided for the effective distributed motion coordination and navigation of robots on paths or surfaces from all initial positions. As an example for practical robotic applications, we derive a control algorithm from the proposed coordinating guiding vector fields for a Dubins-car-like model with actuation saturation. Our proposed algorithm is distributed and scalable to an arbitrary number of robots. Furthermore, extensive illustrative simulations and fixed-wing aircraft outdoor experiments validate the effectiveness and robustness of our algorithm.
The goal of Bayesian deep learning is to provide uncertainty quantification via the posterior distribution. However, exact inference over the weight space is computationally intractable due to the ultra-high dimensions of the neural network. Variational inference (VI) is a promising approach, but naive application on weight space does not scale well and often underperform on predictive accuracy. In this paper, we propose a new adaptive variational Bayesian algorithm to train neural networks on weight space that achieves high predictive accuracy. By showing that there is an equivalence to Stochastic Gradient Hamiltonian Monte Carlo(SGHMC) with preconditioning matrix, we then propose an MCMC within EM algorithm, which incorporates the spike-and-slab prior to capture the sparsity of the neural network. The EM-MCMC algorithm allows us to perform optimization and model pruning within one-shot. We evaluate our methods on CIFAR-10, CIFAR-100 and ImageNet datasets, and demonstrate that our dense model can reach the state-of-the-art performance and our sparse model perform very well compared to previously proposed pruning schemes.
For users to trust model predictions, they need to understand model outputs, particularly their confidence - calibration aims to adjust (calibrate) models' confidence to match expected accuracy. We argue that the traditional calibration evaluation does not promote effective calibrations: for example, it can encourage always assigning a mediocre confidence score to all predictions, which does not help users distinguish correct predictions from wrong ones. Building on those observations, we propose a new calibration metric, MacroCE, that better captures whether the model assigns low confidence to wrong predictions and high confidence to correct predictions. Focusing on the practical application of open-domain question answering, we examine conventional calibration methods applied on the widely-used retriever-reader pipeline, all of which do not bring significant gains under our new MacroCE metric. Toward better calibration, we propose a new calibration method (ConsCal) that uses not just final model predictions but whether multiple model checkpoints make consistent predictions. Altogether, we provide an alternative view of calibration along with a new metric, re-evaluation of existing calibration methods on our metric, and proposal of a more effective calibration method.
Traditional monocular Visual Simultaneous Localization and Mapping (vSLAM) systems can be divided into three categories: those that use features, those that rely on the image itself, and hybrid models. In the case of feature-based methods, new research has evolved to incorporate more information from their environment using geometric primitives beyond points, such as lines and planes. This is because in many environments, which are man-made environments, characterized as Manhattan world, geometric primitives such as lines and planes occupy most of the space in the environment. The exploitation of these schemes can lead to the introduction of algorithms capable of optimizing the trajectory of a Visual SLAM system and also helping to construct an exuberant map. Thus, we present a real-time monocular Visual SLAM system that incorporates real-time methods for line and VP extraction, as well as two strategies that exploit vanishing points to estimate the robot's translation and improve its rotation.Particularly, we build on ORB-SLAM2, which is considered the current state-of-the-art solution in terms of both accuracy and efficiency, and extend its formulation to handle lines and VPs to create two strategies the first optimize the rotation and the second refine the translation part from the known rotation. First, we extract VPs using a real-time method and use them for a global rotation optimization strategy. Second, we present a translation estimation method that takes advantage of last-stage rotation optimization to model a linear system. Finally, we evaluate our system on the TUM RGB-D benchmark and demonstrate that the proposed system achieves state-of-the-art results and runs in real time, and its performance remains close to the original ORB-SLAM2 system
In randomized experiments and observational studies, weighting methods are often used to generalize and transport treatment effect estimates to a target population. Traditional methods construct the weights by separately modeling the treatment assignment and study selection probabilities and then multiplying functions (e.g., inverses) of their estimates. However, these estimated multiplicative weights may not produce adequate covariate balance and can be highly variable, resulting in biased and unstable estimators, especially when there is limited covariate overlap across populations or treatment groups. To address these limitations, we propose a general weighting approach that weights each treatment group towards the target population in a single step. We present a framework and provide a justification for this one-step approach in terms of generic probability distributions. We show a formal connection between our method and inverse probability and inverse odds weighting. By construction, the proposed approach balances covariates and produces stable estimators. We show that our estimator for the target average treatment effect is consistent, asymptotically Normal, multiply robust, and semiparametrically efficient. We demonstrate the performance of this approach using a simulation study and a randomized case study on the effects of physician racial diversity on preventive healthcare utilization among Black men in California.
In group testing, the goal is to identify a subset of defective items within a larger set of items based on tests whose outcomes indicate whether at least one defective item is present. This problem is relevant in areas such as medical testing, DNA sequencing, communication protocols, and many more. In this paper, we study (i) a sparsity-constrained version of the problem, in which the testing procedure is subjected to one of the following two constraints: items are finitely divisible and thus may participate in at most $\gamma$ tests; or tests are size-constrained to pool no more than $\rho$ items per test; and (ii) a noisy version of the problem, where each test outcome is independently flipped with some constant probability. Under each of these settings, considering the for-each recovery guarantee with asymptotically vanishing error probability, we introduce a fast splitting algorithm and establish its near-optimality not only in terms of the number of tests, but also in terms of the decoding time. While the most basic formulations of our algorithms require $\Omega(n)$ storage for each algorithm, we also provide low-storage variants based on hashing, with similar recovery guarantees.
We apply the PAC-Bayes theory to the setting of learning-to-optimize. To the best of our knowledge, we present the first framework to learn optimization algorithms with provable generalization guarantees (PAC-bounds) and explicit trade-off between a high probability of convergence and a high convergence speed. Even in the limit case, where convergence is guaranteed, our learned optimization algorithms provably outperform related algorithms based on a (deterministic) worst-case analysis. Our results rely on PAC-Bayes bounds for general, unbounded loss-functions based on exponential families. By generalizing existing ideas, we reformulate the learning procedure into a one-dimensional minimization problem and study the possibility to find a global minimum, which enables the algorithmic realization of the learning procedure. As a proof-of-concept, we learn hyperparameters of standard optimization algorithms to empirically underline our theory.
Submodular functions and variants, through their ability to characterize diversity and coverage, have emerged as a key tool for data selection and summarization. Many recent approaches to learn submodular functions suffer from limited expressiveness. In this work, we propose FLEXSUBNET, a family of flexible neural models for both monotone and non-monotone submodular functions. To fit a latent submodular function from (set, value) observations, FLEXSUBNET applies a concave function on modular functions in a recursive manner. We do not draw the concave function from a restricted family, but rather learn from data using a highly expressive neural network that implements a differentiable quadrature procedure. Such an expressive neural model for concave functions may be of independent interest. Next, we extend this setup to provide a novel characterization of monotone \alpha-submodular functions, a recently introduced notion of approximate submodular functions. We then use this characterization to design a novel neural model for such functions. Finally, we consider learning submodular set functions under distant supervision in the form of (perimeter-set, high-value-subset) pairs. This yields a novel subset selection method based on an order-invariant, yet greedy sampler built around the above neural set functions. Our experiments on synthetic and real data show that FLEXSUBNET outperforms several baselines.
We propose an analysis in fair learning that preserves the utility of the data while reducing prediction disparities under the criteria of group sufficiency. We focus on the scenario where the data contains multiple or even many subgroups, each with limited number of samples. As a result, we present a principled method for learning a fair predictor for all subgroups via formulating it as a bilevel objective. Specifically, the subgroup specific predictors are learned in the lower-level through a small amount of data and the fair predictor. In the upper-level, the fair predictor is updated to be close to all subgroup specific predictors. We further prove that such a bilevel objective can effectively control the group sufficiency and generalization error. We evaluate the proposed framework on real-world datasets. Empirical evidence suggests the consistently improved fair predictions, as well as the comparable accuracy to the baselines.
This paper focuses on the expected difference in borrower's repayment when there is a change in the lender's credit decisions. Classical estimators overlook the confounding effects and hence the estimation error can be magnificent. As such, we propose another approach to construct the estimators such that the error can be greatly reduced. The proposed estimators are shown to be unbiased, consistent, and robust through a combination of theoretical analysis and numerical testing. Moreover, we compare the power of estimating the causal quantities between the classical estimators and the proposed estimators. The comparison is tested across a wide range of models, including linear regression models, tree-based models, and neural network-based models, under different simulated datasets that exhibit different levels of causality, different degrees of nonlinearity, and different distributional properties. Most importantly, we apply our approaches to a large observational dataset provided by a global technology firm that operates in both the e-commerce and the lending business. We find that the relative reduction of estimation error is strikingly substantial if the causal effects are accounted for correctly.