In warehouses, order picking is known to be the most labor-intensive and costly task in which the employees account for a large part of the warehouse performance. Hence, many approaches exist, that optimize the order picking process based on diverse economic criteria. However, most of these approaches focus on a single economic objective at once and disregard ergonomic criteria in their optimization. Further, the influence of the placement of the items to be picked is underestimated and accordingly, too little attention is paid to the interdependence of these two problems. In this work, we aim at optimizing the storage assignment and the order picking problem within mezzanine warehouse with regards to their reciprocal influence. We propose a customized version of the Non-dominated Sorting Genetic Algorithm II (NSGA-II) for optimizing the storage assignment problem as well as an Ant Colony Optimization (ACO) algorithm for optimizing the order picking problem. Both algorithms incorporate multiple economic and ergonomic constraints simultaneously. Furthermore, the algorithms incorporate knowledge about the interdependence between both problems, aiming to improve the overall warehouse performance. Our evaluation results show that our proposed algorithms return better storage assignments and order pick routes compared to commonly used techniques for the following quality indicators for comparing Pareto fronts: Coverage, Generational Distance, Euclidian Distance, Pareto Front Size, and Inverted Generational Distance. Additionally, the evaluation regarding the interaction of both algorithms shows a better performance when combining both proposed algorithms.
The stochastic nature of iterative optimization heuristics leads to inherently noisy performance measurements. Since these measurements are often gathered once and then used repeatedly, the number of collected samples will have a significant impact on the reliability of algorithm comparisons. We show that care should be taken when making decisions based on limited data. Particularly, we show that the number of runs used in many benchmarking studies, e.g., the default value of 15 suggested by the COCO environment, can be insufficient to reliably rank algorithms on well-known numerical optimization benchmarks. Additionally, methods for automated algorithm configuration are sensitive to insufficient sample sizes. This may result in the configurator choosing a `lucky' but poor-performing configuration despite exploring better ones. We show that relying on mean performance values, as many configurators do, can require a large number of runs to provide accurate comparisons between the considered configurations. Common statistical tests can greatly improve the situation in most cases but not always. We show examples of performance losses of more than 20%, even when using statistical races to dynamically adjust the number of runs, as done by irace. Our results underline the importance of appropriately considering the statistical distribution of performance values.
Existing inferential methods for small area data involve a trade-off between maintaining area-level frequentist coverage rates and improving inferential precision via the incorporation of indirect information. In this article, we propose a method to obtain an area-level prediction region for a future observation which mitigates this trade-off. The proposed method takes a conformal prediction approach in which the conformity measure is the posterior predictive density of a working model that incorporates indirect information. The resulting prediction region has guaranteed frequentist coverage regardless of the working model, and, if the working model assumptions are accurate, the region has minimum expected volume compared to other regions with the same coverage rate. When constructed under a normal working model, we prove such a prediction region is an interval and construct an efficient algorithm to obtain the exact interval. We illustrate the performance of our method through simulation studies and an application to EPA radon survey data.
This paper addresses the numerical solution of nonlinear eigenvector problems such as the Gross-Pitaevskii and Kohn-Sham equation arising in computational physics and chemistry. These problems characterize critical points of energy minimization problems on the infinite-dimensional Stiefel manifold. To efficiently compute minimizers, we propose a novel Riemannian gradient descent method induced by an energy-adaptive metric. Quantified convergence of the methods is established under suitable assumptions on the underlying problem. A non-monotone line search and the inexact evaluation of Riemannian gradients substantially improve the overall efficiency of the method. Numerical experiments illustrate the performance of the method and demonstrates its competitiveness with well-established schemes.
The dynamic response of the legged robot locomotion is non-Lipschitz and can be stochastic due to environmental uncertainties. To test, validate, and characterize the safety performance of legged robots, existing solutions on observed and inferred risk can be incomplete and sampling inefficient. Some formal verification methods suffer from the model precision and other surrogate assumptions. In this paper, we propose a scenario sampling based testing framework that characterizes the overall safety performance of a legged robot by specifying (i) where (in terms of a set of states) the robot is potentially safe, and (ii) how safe the robot is within the specified set. The framework can also help certify the commercial deployment of the legged robot in real-world environment along with human and compare safety performance among legged robots with different mechanical structures and dynamic properties. The proposed framework is further deployed to evaluate a group of state-of-the-art legged robot locomotion controllers from various model-based, deep neural network involved, and reinforcement learning based methods in the literature. Among a series of intended work domains of the studied legged robots (e.g. tracking speed on sloped surface, with abrupt changes on demanded velocity, and against adversarial push-over disturbances), we show that the method can adequately capture the overall safety characterization and the subtle performance insights. Many of the observed safety outcomes, to the best of our knowledge, have never been reported by the existing work in the legged robot literature.
Dynamic Linear Models (DLMs) are commonly employed for time series analysis due to their versatile structure, simple recursive updating, ability to handle missing data, and probabilistic forecasting. However, the options for count time series are limited: Gaussian DLMs require continuous data, while Poisson-based alternatives often lack sufficient modeling flexibility. We introduce a novel semiparametric methodology for count time series by warping a Gaussian DLM. The warping function has two components: a (nonparametric) transformation operator that provides distributional flexibility and a rounding operator that ensures the correct support for the discrete data-generating process. We develop conjugate inference for the warped DLM, which enables analytic and recursive updates for the state space filtering and smoothing distributions. We leverage these results to produce customized and efficient algorithms for inference and forecasting, including Monte Carlo simulation for offline analysis and an optimal particle filter for online inference. This framework unifies and extends a variety of discrete time series models and is valid for natural counts, rounded values, and multivariate observations. Simulation studies illustrate the excellent forecasting capabilities of the warped DLM. The proposed approach is applied to a multivariate time series of daily overdose counts and demonstrates both modeling and computational successes.
Holonomic functions play an essential role in Computer Algebra since they allow the application of many symbolic algorithms. Among all algorithmic attempts to find formulas for power series, the holonomic property remains the most important requirement to be satisfied by the function under consideration. The targeted functions mainly summarize that of meromorphic functions. However, expressions like $\tan(z)$, $z/(\exp(z)-1)$, $\sec(z)$, etc., particularly, reciprocals, quotients and compositions of holonomic functions, are generally not holonomic. Therefore their power series are inaccessible by the holonomic framework. From the mathematical dictionaries, one can observe that most of the known closed-form formulas of non-holonomic power series involve another sequence whose evaluation depends on some finite summations. In the case of $\tan(z)$ and $\sec(z)$ the corresponding sequences are the Bernoulli and Euler numbers, respectively. Thus providing a symbolic approach that yields complete representations when linear summations for power series coefficients of non-holonomic functions appear, might be seen as a step forward towards the representation of non-holonomic power series. By adapting the method of ansatz with undetermined coefficients, we build an algorithm that computes least-order quadratic differential equations with polynomial coefficients for a large class of non-holonomic functions. A differential equation resulting from this procedure is converted into a recurrence equation by applying the Cauchy product formula and rewriting powers into polynomials and derivatives into shifts. Finally, using enough initial values we are able to give normal form representations to characterize several non-holonomic power series and prove non-trivial identities. We discuss this algorithm and its implementation for Maple 2022.
This paper focuses on the expected difference in borrower's repayment when there is a change in the lender's credit decisions. Classical estimators overlook the confounding effects and hence the estimation error can be magnificent. As such, we propose another approach to construct the estimators such that the error can be greatly reduced. The proposed estimators are shown to be unbiased, consistent, and robust through a combination of theoretical analysis and numerical testing. Moreover, we compare the power of estimating the causal quantities between the classical estimators and the proposed estimators. The comparison is tested across a wide range of models, including linear regression models, tree-based models, and neural network-based models, under different simulated datasets that exhibit different levels of causality, different degrees of nonlinearity, and different distributional properties. Most importantly, we apply our approaches to a large observational dataset provided by a global technology firm that operates in both the e-commerce and the lending business. We find that the relative reduction of estimation error is strikingly substantial if the causal effects are accounted for correctly.
Spectral clustering (SC) is a popular clustering technique to find strongly connected communities on a graph. SC can be used in Graph Neural Networks (GNNs) to implement pooling operations that aggregate nodes belonging to the same cluster. However, the eigendecomposition of the Laplacian is expensive and, since clustering results are graph-specific, pooling methods based on SC must perform a new optimization for each new sample. In this paper, we propose a graph clustering approach that addresses these limitations of SC. We formulate a continuous relaxation of the normalized minCUT problem and train a GNN to compute cluster assignments that minimize this objective. Our GNN-based implementation is differentiable, does not require to compute the spectral decomposition, and learns a clustering function that can be quickly evaluated on out-of-sample graphs. From the proposed clustering method, we design a graph pooling operator that overcomes some important limitations of state-of-the-art graph pooling techniques and achieves the best performance in several supervised and unsupervised tasks.
Deep learning has penetrated all aspects of our lives and brought us great convenience. However, the process of building a high-quality deep learning system for a specific task is not only time-consuming but also requires lots of resources and relies on human expertise, which hinders the development of deep learning in both industry and academia. To alleviate this problem, a growing number of research projects focus on automated machine learning (AutoML). In this paper, we provide a comprehensive and up-to-date study on the state-of-the-art AutoML. First, we introduce the AutoML techniques in details according to the machine learning pipeline. Then we summarize existing Neural Architecture Search (NAS) research, which is one of the most popular topics in AutoML. We also compare the models generated by NAS algorithms with those human-designed models. Finally, we present several open problems for future research.
Deep Convolutional Neural Networks (CNNs) are a special type of Neural Networks, which have shown state-of-the-art results on various competitive benchmarks. The powerful learning ability of deep CNN is largely achieved with the use of multiple non-linear feature extraction stages that can automatically learn hierarchical representation from the data. Availability of a large amount of data and improvements in the hardware processing units have accelerated the research in CNNs and recently very interesting deep CNN architectures are reported. The recent race in deep CNN architectures for achieving high performance on the challenging benchmarks has shown that the innovative architectural ideas, as well as parameter optimization, can improve the CNN performance on various vision-related tasks. In this regard, different ideas in the CNN design have been explored such as use of different activation and loss functions, parameter optimization, regularization, and restructuring of processing units. However, the major improvement in representational capacity is achieved by the restructuring of the processing units. Especially, the idea of using a block as a structural unit instead of a layer is gaining substantial appreciation. This survey thus focuses on the intrinsic taxonomy present in the recently reported CNN architectures and consequently, classifies the recent innovations in CNN architectures into seven different categories. These seven categories are based on spatial exploitation, depth, multi-path, width, feature map exploitation, channel boosting and attention. Additionally, it covers the elementary understanding of the CNN components and sheds light on the current challenges and applications of CNNs.