Airplane refueling problem is a nonlinear combinatorial optimization problem with $n!$ feasible feasible solutions. Given a fleet of $n$ airplanes with mid-air refueling technique, each airplane has a specific fuel capacity and fuel consumption rate. The fleet starts to fly together to a same target and during the trip each airplane could instantaneously refuel to other airplanes and then be dropped out. The question is how to find the best refueling policy to make the last remaining airplane travels the farthest. To solve the large scale of the airplane refueling problem in polynomial-time, we propose the definition of the sequential feasible solution by employing the data structural properties of the airplane refueling problem. We prove that if an airplane refueling problem has feasible solutions, it must have sequential feasible solutions, and its optimal feasible solution must be the optimal sequential feasible solution. Then we present the sequential search algorithm which has a computational complexity that depends on the number of sequential feasible solutions referred to $Q_n$, which is proved to be upper bounded by $2^{n-2}$ as an exponential bound that lacks of applicability on larger input for worst case. Therefore we investigate the complexity behavior of the sequential search algorithm from dynamic perspective, and find out that $Q_n$ is bounded by $\frac{m^2}{n}C_n^m$ when the input $n$ is greater than $2m$. Here $m$ is a constant and $2m$ is regarded as the "inflection point" of the complexity of the sequential search algorithm from exponential-time to polynomial-time. Moreover, we build an efficient computability scheme according to which we shall predict the specific complexity of the sequential search algorithm to choose a proper algorithm considering the available running time for decision makers or users.
Real-world software applications must constantly evolve to remain relevant. This evolution occurs when developing new applications or adapting existing ones to meet new requirements, make corrections, or incorporate future functionality. Traditional methods of software quality control involve software quality models and continuous code inspection tools. These measures focus on directly assessing the quality of the software. However, there is a strong correlation and causation between the quality of the development process and the resulting software product. Therefore, improving the development process indirectly improves the software product, too. To achieve this, effective learning from past processes is necessary, often embraced through post mortem organizational learning. While qualitative evaluation of large artifacts is common, smaller quantitative changes captured by application lifecycle management are often overlooked. In addition to software metrics, these smaller changes can reveal complex phenomena related to project culture and management. Leveraging these changes can help detect and address such complex issues. Software evolution was previously measured by the size of changes, but the lack of consensus on a reliable and versatile quantification method prevents its use as a dependable metric. Different size classifications fail to reliably describe the nature of evolution. While application lifecycle management data is rich, identifying which artifacts can model detrimental managerial practices remains uncertain. Approaches such as simulation modeling, discrete events simulation, or Bayesian networks have only limited ability to exploit continuous-time process models of such phenomena. Even worse, the accessibility and mechanistic insight into such gray- or black-box models are typically very low. To address these challenges, we suggest leveraging objectively [...]
We study the fundamental problem of finding the best string to represent a given set, in the form of the Closest String problem: Given a set $X \subseteq \Sigma^d$ of $n$ strings, find the string $x^*$ minimizing the radius of the smallest Hamming ball around $x^*$ that encloses all the strings in $X$. In this paper, we investigate whether the Closest String problem admits algorithms that are faster than the trivial exhaustive search algorithm. We obtain the following results for the two natural versions of the problem: $\bullet$ In the continuous Closest String problem, the goal is to find the solution string $x^*$ anywhere in $\Sigma^d$. For binary strings, the exhaustive search algorithm runs in time $O(2^d poly(nd))$ and we prove that it cannot be improved to time $O(2^{(1-\epsilon) d} poly(nd))$, for any $\epsilon > 0$, unless the Strong Exponential Time Hypothesis fails. $\bullet$ In the discrete Closest String problem, $x^*$ is required to be in the input set $X$. While this problem is clearly in polynomial time, its fine-grained complexity has been pinpointed to be quadratic time $n^{2 \pm o(1)}$ whenever the dimension is $\omega(\log n) < d < n^{o(1)}$. We complement this known hardness result with new algorithms, proving essentially that whenever $d$ falls out of this hard range, the discrete Closest String problem can be solved faster than exhaustive search. In the small-$d$ regime, our algorithm is based on a novel application of the inclusion-exclusion principle. Interestingly, all of our results apply (and some are even stronger) to the natural dual of the Closest String problem, called the Remotest String problem, where the task is to find a string maximizing the Hamming distance to all the strings in $X$.
We consider a nonprehensile manipulation task in which a mobile manipulator must balance objects on its end effector without grasping them -- known as the waiter's problem -- and move to a desired location while avoiding static and dynamic obstacles. In constrast to existing approaches, our focus is on fast online planning in response to new and changing environments. Our main contribution is a whole-body constrained model predictive controller (MPC) for a mobile manipulator that balances objects and avoids collisions. Furthermore, we propose planning using the minimum statically-feasible friction coefficients, which provides robustness to frictional uncertainty and other force disturbances while also substantially reducing the compute time required to update the MPC policy. Simulations and hardware experiments on a velocity-controlled mobile manipulator with up to seven balanced objects, stacked objects, and various obstacles show that our approach can handle a variety of conditions that have not been previously demonstrated, with end effector speeds and accelerations up to 2.0 m/s and 7.9 m/s$^2$, respectively. Notably, we demonstrate a projectile avoidance task in which the robot avoids a thrown ball while balancing a tall bottle.
We prove new upper and lower bounds on the number of iterations the $k$-dimensional Weisfeiler-Leman algorithm ($k$-WL) requires until stabilization. For $k \geq 3$, we show that $k$-WL stabilizes after at most $O(kn^{k-1}\log n)$ iterations (where $n$ denotes the number of vertices of the input structures), obtaining the first improvement over the trivial upper bound of $n^{k}-1$ and extending a previous upper bound of $O(n \log n)$ for $k=2$ [Lichter et al., LICS 2019]. We complement our upper bounds by constructing $k$-ary relational structures on which $k$-WL requires at least $n^{\Omega(k)}$ iterations to stabilize. This improves over a previous lower bound of $n^{\Omega(k / \log k)}$ [Berkholz, Nordstr\"{o}m, LICS 2016]. We also investigate tradeoffs between the dimension and the iteration number of WL, and show that $d$-WL, where $d = \lceil\frac{3(k+1)}{2}\rceil$, can simulate the $k$-WL algorithm using only $O(k^2 \cdot n^{\lfloor k/2\rfloor + 1} \log n)$ many iterations, but still requires at least $n^{\Omega(k)}$ iterations for any $d$ (that is sufficiently smaller than $n$). The number of iterations required by $k$-WL to distinguish two structures corresponds to the quantifier rank of a sentence distinguishing them in the $(k + 1)$-variable fragment $C_{k+1}$ of first-order logic with counting quantifiers. Hence, our results also imply new upper and lower bounds on the quantifier rank required in the logic $C_{k+1}$, as well as tradeoffs between variable number and quantifier rank.
The Fisher information matrix is a quantity of fundamental importance for information geometry and asymptotic statistics. In practice, it is widely used to quickly estimate the expected information available in a data set and guide experimental design choices. In many modern applications, it is intractable to analytically compute the Fisher information and Monte Carlo methods are used instead. The standard Monte Carlo method produces estimates of the Fisher information that can be biased when the Monte-Carlo noise is non-negligible. Most problematic is noise in the derivatives as this leads to an overestimation of the available constraining power, given by the inverse Fisher information. In this work we find another simple estimate that is oppositely biased and produces an underestimate of the constraining power. This estimator can either be used to give approximate bounds on the parameter constraints or can be combined with the standard estimator to give improved, approximately unbiased estimates. Both the alternative and the combined estimators are asymptotically unbiased so can be also used as a convergence check of the standard approach. We discuss potential limitations of these estimators and provide methods to assess their reliability. These methods accelerate the convergence of Fisher forecasts, as unbiased estimates can be achieved with fewer Monte Carlo samples, and so can be used to reduce the simulated data set size by several orders of magnitude.
Compositional data are contemporarily defined as positive vectors, the ratios among whose elements are of interest to the researcher. Financial statement analysis by means of accounting ratios fulfils this definition to the letter. Compositional data analysis solves the major problems in statistical analysis of standard financial ratios at industry level, such as skewness, non-normality, non-linearity and dependence of the results on the choice of which accounting figure goes to the numerator and to the denominator of the ratio. In spite of this, compositional applications to financial statement analysis are still rare. In this article, we present some transformations within compositional data analysis that are particularly useful for financial statement analysis. We show how to compute industry or sub-industry means of standard financial ratios from a compositional perspective. We show how to visualise firms in an industry with a compositional biplot, to classify them with compositional cluster analysis and to relate financial and non-financial indicators with compositional regression models. We show an application to the accounting statements of Spanish wineries using DuPont analysis, and a step-by-step tutorial to the compositional freeware CoDaPack.
The paper revisits the robust $s$-$t$ path problem, one of the most fundamental problems in robust optimization. In the problem, we are given a directed graph with $n$ vertices and $k$ distinct cost functions (scenarios) defined over edges, and aim to choose an $s$-$t$ path such that the total cost of the path is always provable no matter which scenario is realized. With the view of each cost function being associated with an agent, our goal is to find a common $s$-$t$ path minimizing the maximum objective among all agents, and thus create a fair solution for them. The problem is hard to approximate within $o(\log k)$ by any quasi-polynomial time algorithm unless $\mathrm{NP} \subseteq \mathrm{DTIME}(n^{\mathrm{poly}\log n})$, and the best approximation ratio known to date is $\widetilde{O}(\sqrt{n})$ which is based on the natural flow linear program. A longstanding open question is whether we can achieve a polylogarithmic approximation even when a quasi-polynomial running time is allowed. We give the first polylogarithmic approximation for robust $s$-$t$ path since the problem was proposed more than two decades ago. In particular, we introduce a $O(\log n \log k)$-approximate algorithm running in quasi-polynomial time. The algorithm is built on a novel linear program formulation for a decision-tree-type structure which enables us to get rid of the $\Omega(\max\{k,\sqrt{n}\})$ integrality gap of the natural flow LP. Further, we also consider some well-known graph classes, e.g., graphs with bounded treewidth, and show that the polylogarithmic approximation can be achieved polynomially on these graphs. We hope the new proposed techniques in the paper can offer new insights into the robust $s$-$t$ path problem and related problems in robust optimization.
A linear layout of a graph $ G $ consists of a linear order $\prec$ of the vertices and a partition of the edges. A part is called a queue (stack) if no two edges nest (cross), that is, two edges $ (v,w) $ and $ (x,y) $ with $ v \prec x \prec y \prec w $ ($ v \prec x \prec w \prec y $) may not be in the same queue (stack). The best known lower and upper bounds for the number of queues needed for planar graphs are 4 [Alam et al., Algorithmica 2020] and 42 [Bekos et al., Algorithmica 2022], respectively. While queue layouts of special classes of planar graphs have received increased attention following the breakthrough result of [Dujmovi\'c et al., J. ACM 2020], the meaningful class of bipartite planar graphs has remained elusive so far, explicitly asked for by Bekos et al. In this paper we investigate bipartite planar graphs and give an improved upper bound of 28 by refining existing techniques. In contrast, we show that two queues or one queue together with one stack do not suffice; the latter answers an open question by Pupyrev [GD 2018]. We further investigate subclasses of bipartite planar graphs and give improved upper bounds; in particular we construct 5-queue layouts for 2-degenerate quadrangulations.
We consider contextual bandit problems with knapsacks [CBwK], a problem where at each round, a scalar reward is obtained and vector-valued costs are suffered. The learner aims to maximize the cumulative rewards while ensuring that the cumulative costs are lower than some predetermined cost constraints. We assume that contexts come from a continuous set, that costs can be signed, and that the expected reward and cost functions, while unknown, may be uniformly estimated -- a typical assumption in the literature. In this setting, total cost constraints had so far to be at least of order $T^{3/4}$, where $T$ is the number of rounds, and were even typically assumed to depend linearly on $T$. We are however motivated to use CBwK to impose a fairness constraint of equalized average costs between groups: the budget associated with the corresponding cost constraints should be as close as possible to the natural deviations, of order $\sqrt{T}$. To that end, we introduce a dual strategy based on projected-gradient-descent updates, that is able to deal with total-cost constraints of the order of $\sqrt{T}$ up to poly-logarithmic terms. This strategy is more direct and simpler than existing strategies in the literature. It relies on a careful, adaptive, tuning of the step size.
In large-scale systems there are fundamental challenges when centralised techniques are used for task allocation. The number of interactions is limited by resource constraints such as on computation, storage, and network communication. We can increase scalability by implementing the system as a distributed task-allocation system, sharing tasks across many agents. However, this also increases the resource cost of communications and synchronisation, and is difficult to scale. In this paper we present four algorithms to solve these problems. The combination of these algorithms enable each agent to improve their task allocation strategy through reinforcement learning, while changing how much they explore the system in response to how optimal they believe their current strategy is, given their past experience. We focus on distributed agent systems where the agents' behaviours are constrained by resource usage limits, limiting agents to local rather than system-wide knowledge. We evaluate these algorithms in a simulated environment where agents are given a task composed of multiple subtasks that must be allocated to other agents with differing capabilities, to then carry out those tasks. We also simulate real-life system effects such as networking instability. Our solution is shown to solve the task allocation problem to 6.7% of the theoretical optimal within the system configurations considered. It provides 5x better performance recovery over no-knowledge retention approaches when system connectivity is impacted, and is tested against systems up to 100 agents with less than a 9% impact on the algorithms' performance.