Given a set of points $P = (P^+ \sqcup P^-) \subset \mathbb{R}^d$ for some constant $d$ and a supply function $\mu:P\to \mathbb{R}$ such that $\mu(p) > 0~\forall p \in P^+$, $\mu(p) < 0~\forall p \in P^-$, and $\sum_{p\in P}{\mu(p)} = 0$, the geometric transportation problem asks one to find a transportation map $\tau: P^+\times P^-\to \mathbb{R}_{\ge 0}$ such that $\sum_{q\in P^-}{\tau(p, q)} = \mu(p)~\forall p \in P^+$, $\sum_{p\in P^+}{\tau(p, q)} = -\mu(q)~\forall q \in P^-$, and the weighted sum of Euclidean distances over the pairs, $\sum_{(p,q)\in P^+\times P^-}\tau(p, q)\cdot ||q-p||_2$, is minimized. We present the first deterministic algorithm that computes, in near-linear time, a transportation map whose cost is within a $(1 + \varepsilon)$ factor of optimal. More precisely, our algorithm runs in $O(n\varepsilon^{-(d+2)}\log^5{n}\log{\log{n}})$ time for any constant $\varepsilon > 0$. Surprisingly, our result not only generalizes an earlier result for geometric bipartite matching to arbitrary instances of geometric transportation, but also improves upon the running times of all previously known $(1 + \varepsilon)$-approximation algorithms, randomized or deterministic, even for geometric bipartite matching. In particular, we give the first $(1 + \varepsilon)$-approximate deterministic algorithm for geometric bipartite matching and the first $(1 + \varepsilon)$-approximate deterministic or randomized algorithm for geometric transportation with no dependence on $d$ in the exponent of the running time's polylog. As an additional application of our main ideas, we also give the first randomized near-linear $O(\mathrm{poly}(1/\varepsilon)\, m \log^{O(1)} n)$ time $(1 + \varepsilon)$-approximation algorithm for the uncapacitated minimum cost flow (transshipment) problem in undirected graphs with arbitrary real edge costs.
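For context only (this is not the near-linear algorithm above), a tiny transportation instance can be solved exactly as a linear program, which makes the supply and demand constraints concrete. The sketch below assumes SciPy is available; the point coordinates and supplies are made up, and an exact LP solve of course does not scale.

```python
import numpy as np
from scipy.optimize import linprog  # exact LP solve; only viable for tiny instances

# Toy instance: two supply points (mu > 0) and two demand points (mu < 0).
P_plus = np.array([[0.0, 0.0], [1.0, 0.0]]); mu_plus = np.array([2.0, 1.0])
P_minus = np.array([[0.0, 1.0], [2.0, 1.0]]); mu_minus = np.array([-1.5, -1.5])

n_s, n_d = len(P_plus), len(P_minus)
# Cost of the variable tau(p_i, q_j) is the Euclidean distance ||q_j - p_i||_2.
cost = np.linalg.norm(P_plus[:, None, :] - P_minus[None, :, :], axis=2).ravel()

# Equality constraints: each supply is shipped out fully, each demand is met fully.
A_eq = np.zeros((n_s + n_d, n_s * n_d))
for i in range(n_s):
    A_eq[i, i * n_d:(i + 1) * n_d] = 1.0   # sum_q tau(p_i, q) = mu(p_i)
for j in range(n_d):
    A_eq[n_s + j, j::n_d] = 1.0            # sum_p tau(p, q_j) = -mu(q_j)
b_eq = np.concatenate([mu_plus, -mu_minus])

res = linprog(cost, A_eq=A_eq, b_eq=b_eq, bounds=(0, None))
tau = res.x.reshape(n_s, n_d)              # tau[i, j] = mass shipped from p_i to q_j
print("optimal transportation cost:", res.fun)
```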
We consider the problem of fair allocation of indivisible goods or chores to $n$ agents with $\textit{weights}$ that describe their entitlements to a set of indivisible resources. Stemming from the well-studied fairness notions of envy-freeness up to one good (EF1) and envy-freeness up to any good (EFX) for agents with $\textit{equal}$ entitlements, we present the first impossibility results, in addition to algorithmic guarantees, on fairness for agents with $\textit{unequal}$ entitlements. In this paper, we extend the notion of envy-freeness up to any good or chore to the weighted context (WEFX and XWEF respectively), proving that these allocations are not guaranteed to exist for two or three agents. In spite of these negative results, we provide an approximate WEFX procedure for two agents -- the first result of its kind. We further present a polynomial-time algorithm that guarantees a weighted envy-free up to one chore (1WEF) allocation for any number of agents with additive cost functions. Our work highlights the increased complexity of the weighted fair division problem as compared to its unweighted counterpart.
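One plausible reading of weighted envy-freeness up to one chore for additive cost functions, used here purely for illustration (the paper's exact formalization may differ): agent $i$ does not envy agent $j$ once some single chore is removed from $i$'s own bundle, with both sides scaled by the agents' weights. A minimal checker under that assumed definition:

```python
def is_1wef(bundles, costs, weights):
    """Check weighted envy-freeness up to one chore (1WEF) under one plausible
    reading: for all agents i, j there is a chore c in bundles[i] such that
    removing c gives cost_i(bundles[i] without c) / w_i <= cost_i(bundles[j]) / w_j.
    costs[i][c] is agent i's additive cost for chore c; this exact definition is
    an assumption for illustration, not quoted from the paper."""
    n = len(bundles)
    for i in range(n):
        own = sum(costs[i][c] for c in bundles[i])
        # Dropping the chore that agent i finds costliest is always the best removal.
        best_drop = max((costs[i][c] for c in bundles[i]), default=0.0)
        for j in range(n):
            if i == j:
                continue
            other = sum(costs[i][c] for c in bundles[j]) / weights[j]
            if (own - best_drop) / weights[i] > other + 1e-12:
                return False
    return True
```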
Mobility systems often suffer from a high price of anarchy due to the uncontrolled behavior of selfish users. This may result in societal costs that are significantly higher than what could be achieved by a centralized system-optimal controller. Monetary tolling schemes can effectively align the behavior of selfish users with the system optimum. Yet, they inevitably discriminate among users in terms of income. Artificial currencies were recently presented as an effective alternative that can achieve the same performance whilst guaranteeing fairness among the population. However, those studies were based on behavioral models that may differ from practical implementations. This paper presents a data-driven approach to automatically adapt artificial-currency tolls within repetitive-game settings. We first consider a parallel-arc setting whereby users commute on a daily basis from an individual origin to an individual destination, choosing a route in exchange for an artificial-currency price or reward, while accounting for the impact of the choices of the other users on travel discomfort. Second, we devise a model-based reinforcement learning controller that autonomously learns the optimal pricing policy by interacting with the proposed framework, using the closeness of the observed aggregate flows to a desired system-optimal distribution as the reward function. Our numerical results show that the proposed data-driven pricing scheme can effectively align the users' flows with the system optimum, significantly reducing the societal costs with respect to the uncontrolled flows (by about 15% and 25% depending on the scenario), and respond to environmental changes in a robust and efficient manner.
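A minimal sketch of the reward signal described above, i.e., rewarding the controller for the closeness of the observed aggregate flows to the system-optimal distribution; the choice of norm and the normalization are assumptions made for illustration, not the paper's exact formulation.

```python
import numpy as np

def pricing_reward(observed_flows, optimal_flows):
    """Reward for the pricing controller: larger when the aggregate arc flows
    induced by the current artificial-currency prices are closer to the desired
    system-optimal distribution. The L1 norm and the normalization to flow
    fractions are illustrative choices."""
    observed = np.asarray(observed_flows, dtype=float)
    target = np.asarray(optimal_flows, dtype=float)
    observed = observed / observed.sum()   # assumes strictly positive total flow
    target = target / target.sum()
    return -np.linalg.norm(observed - target, ord=1)
```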
Rotary Indexing Machines (RIMs) are widely used in manufacturing due to their ability to perform multiple production steps on a single product without manual repositioning, reducing production time and improving accuracy and consistency. Despite their advantages, little research has been done on diagnosing faults in RIMs, especially from the perspective of the actual production steps carried out on these machines. Long downtimes due to failures are problematic, especially for smaller companies employing these machines. To address this gap, we propose a diagnosis algorithm based on the product perspective, which focuses on the product being processed by the RIM. The algorithm traces the steps that a product takes through the machine and is able to diagnose possible causes in case of failure. Our contributions are three-fold. Firstly, we provide an analysis of the properties of RIMs and how they influence the diagnosis of faults in these machines. Secondly, we suggest a diagnosis algorithm based on the product perspective capable of diagnosing faults in such a machine. Finally, we test this algorithm on a model of a rotary indexing machine, demonstrating its effectiveness in identifying faults and their root causes.
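A minimal sketch of the product-perspective idea under illustrative assumptions: a product's expected pass through the machine is modeled as an ordered list of stations, each with a check for the effect that station should leave on the product, and the stations whose effects are missing are reported as fault candidates. The station names and checks below are hypothetical, not the paper's algorithm.

```python
def diagnose_product(expected_steps, observed_product_state):
    """Product-perspective diagnosis sketch: expected_steps is an ordered list
    of (station, check) pairs, where check inspects the product state for the
    effect that station should have produced. Stations whose effects are
    missing are returned as fault candidates."""
    candidates = []
    for station, check in expected_steps:
        if not check(observed_product_state):
            candidates.append(station)
    return candidates

# Hypothetical usage: a drilled hole and a pressed-in pin are expected.
steps = [
    ("drilling", lambda s: s.get("hole_depth_mm", 0.0) >= 5.0),
    ("pressing", lambda s: s.get("pin_inserted", False)),
]
print(diagnose_product(steps, {"hole_depth_mm": 5.2, "pin_inserted": False}))
# -> ['pressing']
```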
While there has been progress in establishing the unprovability of complexity statements in lower fragments of bounded arithmetic, understanding the limits of Je\v{r}\'abek's theory $APC_1$ (2007) and of higher levels of Buss's hierarchy $S^i_2$ (1986) has been a more elusive task. Even in the more restricted setting of Cook's theory PV (1975), known results often rely on a less natural formalization that encodes a complexity statement using a collection of sentences instead of a single sentence. This is done to reduce the quantifier complexity of the resulting sentences so that standard witnessing results can be invoked. In this work, we establish unprovability results for stronger theories and for sentences of higher quantifier complexity. In particular, we unconditionally show that $APC_1$ cannot prove strong complexity lower bounds separating the third level of the polynomial hierarchy. In more detail, we consider non-uniform average-case separations, and establish that $APC_1$ cannot prove a sentence stating that $\forall n \ge n_0\;\exists\,f_n \in \Pi_{3}$-$SIZE[n^d]$ that is $(1/n)$-far from every $\Sigma_{3}$-$SIZE[2^{n^{\delta}}]$ circuit. This is a consequence of a much more general result showing that, for every $i \geq 1$, strong separations for $\Pi_{i}$-$SIZE[poly(n)]$ versus $\Sigma_{i}$-$SIZE[2^{n^{\Omega(1)}}]$ cannot be proved in the theory $T_{PV}^i$ consisting of all true $\forall \Sigma^b_{i-1}$-sentences in the language of Cook's theory PV. Our argument employs a convenient game-theoretic witnessing result that can be applied to sentences of arbitrary quantifier complexity. We combine it with extensions of a technique introduced by Kraj\'i\v{c}ek (2011) that was recently employed by Pich and Santhanam (2021) to establish the unprovability of lower bounds in PV (i.e., the case $i=1$ above, but under a weaker formalization) and in a fragment of $APC_1$.
We study implementations of basic fault-tolerant primitives, such as consensus and registers, in message-passing systems subject to process crashes and a broad range of communication failures. Our results characterize the necessary and sufficient conditions for implementing these primitives as a function of the connectivity constraints and synchrony assumptions. Our main contribution is a new algorithm for partially synchronous consensus that is resilient to process crashes and channel failures and is optimal in its connectivity requirements. In contrast to prior work, our algorithm assumes the most general model of message loss where faulty channels are flaky, i.e., can lose messages without any guarantee of fairness. This failure model is particularly challenging for consensus algorithms, as it rules out standard solutions based on leader oracles and failure detectors. To circumvent this limitation, we construct our solution using a new variant of the recently proposed view synchronizer abstraction, which we adapt to the crash-prone setting with flaky channels.
We study learning optimal policies from a logged dataset, i.e., offline RL, with function approximation. Despite considerable effort, existing algorithms with theoretical finite-sample guarantees typically assume exploratory data coverage or strongly realizable function classes, which are hard to satisfy in practice. While recent works have successfully tackled these strong assumptions, they either require gap assumptions that are satisfied only by a subset of MDPs or rely on behavior regularization, which makes even the optimality of the learned policy intractable. To address this challenge, we provide finite-sample guarantees for a simple algorithm based on marginalized importance sampling (MIS), showing that sample-efficient offline RL for general MDPs is possible with only a partial-coverage dataset and weakly realizable function classes, given additional side information in the form of a covering distribution. Furthermore, we demonstrate that the covering distribution trades off prior knowledge of the optimal trajectories against the coverage requirement of the dataset, revealing the effect of this inductive bias on the learning process.
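For context, the plain MIS value estimate reweights logged rewards by the ratio of the target policy's state-action occupancy to the data distribution. The sketch below assumes the weight function is already given (learning it against a function class is the subject of the paper) and is not the paper's algorithm itself.

```python
import numpy as np

def mis_value_estimate(states, actions, rewards, weight_fn):
    """Plain marginalized importance sampling (MIS) estimate of a policy's
    value from logged transitions: the average of w(s, a) * r, where w(s, a)
    approximates d^pi(s, a) / d^mu(s, a), the ratio of the target policy's
    occupancy to the data distribution. weight_fn is assumed given here."""
    w = np.array([weight_fn(s, a) for s, a in zip(states, actions)])
    return float(np.mean(w * np.asarray(rewards, dtype=float)))
```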
We study the problem of allocating a set of indivisible goods among a set of agents with 2-value additive valuations. In this setting, each good is valued either $1$ or $\frac{p}{q}$, for some fixed co-prime numbers $p,q\in \mathbb{N}$ such that $1\leq q < p$, and the value of a bundle is the sum of the values of the contained goods. Our goal is to find an allocation which maximizes the Nash social welfare (NSW), i.e., the geometric mean of the valuations of the agents. In this work, we give a complete characterization of the polynomial-time tractability of NSW maximization that depends solely on the value of $q$. We start by providing a rather simple polynomial-time algorithm to find a maximum NSW allocation when the valuation functions are integral, that is, $q=1$. We then exploit more involved techniques to obtain an algorithm that produces a maximum NSW allocation for the half-integral case, that is, $q=2$. Finally, we show that such an improvement cannot be further extended to the case $q=3$; indeed, we prove that it is NP-hard to compute an allocation with maximum NSW whenever $q\geq 3$.
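To make the objective concrete: under 2-value valuations each good is worth either $1$ or $p/q$ to a given agent, bundle values are additive, and the NSW is the geometric mean of the bundle values. The sketch below (the input encoding is an illustrative assumption) evaluates the product of bundle values exactly; its $n$-th root is the NSW, so the product induces the same ranking of allocations.

```python
from fractions import Fraction
from math import prod

def nsw_objective(allocation, is_high, p, q):
    """Exact NSW objective for 2-value additive valuations: is_high[i][g] is
    True if agent i values good g at p/q and False if at 1. Returns the product
    of the agents' bundle values as a Fraction; the NSW is its n-th root, so
    maximizing this product maximizes the NSW."""
    ratio = Fraction(p, q)
    bundle_values = [
        sum((ratio if is_high[i][g] else Fraction(1)) for g in bundle)
        for i, bundle in enumerate(allocation)
    ]
    return prod(bundle_values, start=Fraction(1))
```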
This paper deals with the problem of efficient sampling from a stochastic differential equation, given the drift function and the diffusion matrix. The proposed approach leverages a recent model for probabilities \cite{rudi2021psd} (the positive semi-definite -- PSD model) from which it is possible to obtain independent and identically distributed (i.i.d.) samples at precision $\varepsilon$ with a cost that is $O(m^2 d \log(1/\varepsilon))$, where $m$ is the dimension of the model and $d$ the dimension of the space. The approach consists of two steps: first, computing the PSD model that satisfies the Fokker-Planck equation (or its fractional variant) associated with the SDE, up to error $\varepsilon$, and then sampling from the resulting PSD model. Assuming some regularity of the Fokker-Planck solution (i.e., $\beta$-times differentiability plus some geometric condition on its zeros), we obtain an algorithm that: (a) in the preparatory phase obtains a PSD model within $L^2$ distance $\varepsilon$ of the solution of the equation, with a model of dimension $m = \varepsilon^{-(d+1)/(\beta-2s)} (\log(1/\varepsilon))^{d+1}$, where $1/2\leq s\leq1$ is the fractional power of the Laplacian, and total computational complexity $O(m^{3.5} \log(1/\varepsilon))$; and then (b) for the Fokker-Planck equation, produces i.i.d.\ samples with error $\varepsilon$ in Wasserstein-1 distance, with a cost that is $O(d \varepsilon^{-2(d+1)/\beta-2} \log(1/\varepsilon)^{2d+3})$ per sample. This means that, if the probability associated with the SDE is somewhat regular, i.e., $\beta \geq 4d+2$, then the algorithm requires $O(\varepsilon^{-0.88} \log(1/\varepsilon)^{4.5d})$ time in the preparatory phase and $O(\varepsilon^{-1/2}\log(1/\varepsilon)^{2d+2})$ time per sample. Our results suggest that as the true solution gets smoother, we can circumvent the curse of dimensionality without requiring any sort of convexity.
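For reference, the PSD model of \cite{rudi2021psd} represents a non-negative function as a quadratic form in kernel features, $f(x) = \phi(x)^\top A \phi(x)$ with $A \succeq 0$. The sketch below only evaluates such a model with Gaussian features and placeholder parameters; it does not perform the fitting or the sampling steps described above.

```python
import numpy as np

def psd_model_value(x, centers, A, sigma):
    """Evaluate a PSD model f(x) = phi(x)^T A phi(x), where phi(x) collects
    Gaussian kernel features k(x, x_i) at the model's m anchor points and A is
    an m x m positive semi-definite matrix, so f(x) >= 0 everywhere. The
    parameters here are placeholders; fitting them to the Fokker-Planck
    solution and sampling from the model are what the paper contributes."""
    d2 = np.sum((np.asarray(centers) - np.asarray(x)) ** 2, axis=1)
    phi = np.exp(-d2 / (2.0 * sigma ** 2))
    return float(phi @ A @ phi)
```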
We analyze to what extent end users can infer information about the level of protection of their data when the data obfuscation mechanism is a priori unknown to them (the so-called ``black-box'' scenario). In particular, we investigate two notions of local differential privacy (LDP), namely $\epsilon$-LDP and R\'enyi LDP. On one hand, we prove that, without any assumption on the underlying distributions, it is not possible to have an algorithm able to infer the level of data protection with provable guarantees; this result also holds for the central versions of the two notions of DP considered. On the other hand, we demonstrate that, under reasonable assumptions (namely, Lipschitzness of the involved densities on a closed interval), such guarantees exist and can be achieved by a simple histogram-based estimator. We validate our results experimentally and we note that, on a particularly well-behaved distribution (namely, the Laplace noise), our method gives even better results than expected, in the sense that in practice the number of samples needed to achieve the desired confidence is smaller than the theoretical bound, and the estimation of $\epsilon$ is more precise than predicted.
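One way to picture a histogram-based estimator in the black-box setting (a sketch under assumed choices, not the paper's exact construction): estimate the mechanism's output densities on two neighboring inputs by histograms over a shared grid on a closed interval, and take the largest absolute log-ratio over bins as the estimate of $\epsilon$.

```python
import numpy as np

def estimate_eps_histogram(samples_x, samples_xp, bins=40, interval=(-4.0, 4.0)):
    """Estimate the LDP parameter epsilon from black-box output samples of the
    mechanism on two inputs x and x'. Output densities are approximated by
    histograms on a shared grid over a closed interval (where Lipschitzness
    keeps bin-level ratios meaningful); the bin count, interval, and smoothing
    constant are illustrative choices."""
    edges = np.linspace(interval[0], interval[1], bins + 1)
    h_x, _ = np.histogram(samples_x, bins=edges, density=True)
    h_xp, _ = np.histogram(samples_xp, bins=edges, density=True)
    smooth = 1e-6                       # avoid division by zero in empty bins
    return float(np.max(np.abs(np.log((h_x + smooth) / (h_xp + smooth)))))

# Sanity check on the Laplace mechanism with sensitivity 1 and epsilon = 1.
rng = np.random.default_rng(0)
out0 = rng.laplace(0.0, 1.0, 200_000)   # output distribution on input 0
out1 = rng.laplace(1.0, 1.0, 200_000)   # output distribution on input 1
print(estimate_eps_histogram(out0, out1))  # roughly 1.0
```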
Overlapping instruction subsets derived from human-originated code have previously been shown to dramatically shrink the inductive programming search space, often by many orders of magnitude. Here we extend the instruction subset approach to consider direct instruction-instruction applications (or instruction digrams) as an additional search heuristic for inductive programming. In this study we analyse the frequency distribution of instruction digrams in a large sample of open source code. This indicates that the instruction digram distribution is highly skewed, with over 93% of possible instruction digrams not represented in the code sample. We demonstrate that instruction digrams can be used to constrain instruction selection during search, further reducing the size of the search space, in some cases by several orders of magnitude. This significantly increases the size of programs that can be generated using search-based inductive programming techniques. We discuss the results and provide some suggestions for further work.
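To make the digram heuristic concrete, the sketch below counts adjacent instruction pairs over a sample of instruction sequences and keeps only those that actually occur; the instruction names and the extraction of sequences from source code are placeholders.

```python
from collections import Counter

def digram_counts(instruction_sequences):
    """Count instruction digrams (adjacent instruction pairs) across a sample
    of instruction sequences. Search can then be constrained to digrams that
    occur at least once, since most possible pairs never appear in
    human-written code."""
    counts = Counter()
    for seq in instruction_sequences:
        counts.update(zip(seq, seq[1:]))
    return counts

# Hypothetical instruction sequences extracted from a code sample.
sample = [["load", "add", "store"], ["load", "mul", "add", "store"]]
counts = digram_counts(sample)
allowed_digrams = set(counts)          # digrams permitted during program search
print(counts.most_common(3))
```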