We study a generalization of the well-known disjoint paths problem which we call the metric Menger problem, denoted MM(r,k), where one is given two subsets of a graph and must decide whether they can be connected by $k$ paths of pairwise distance at least $r$. We prove that this problem is NP-complete for every $r\geq 3$ and $k\geq 2$ by giving a reduction from 3SAT. This resolves a conjecture recently stated by Georgakopoulos and Papasoglu. On the other hand, we show that the problem is in XP when parameterised by treewidth and maximum degree by observing that it is `locally checkable'. In the case $r\leq 3$, we prove that it suffices to parameterise by treewidth. We also state some open questions relating to this work.
We study the Renting Servers in the Cloud problem (RSiC) in multiple dimensions. In this problem, a sequence of multi-parameter jobs must be scheduled on servers that can be rented on-demand. Each job has an arrival time, a finishing time, and a multi-dimensional size vector that specifies its resource demands. Each server has a multi-dimensional capacity, and jobs can be scheduled on a server as long as, in each dimension, the sum of the sizes of the jobs does not exceed the capacity of the server in that dimension. The goal is to minimize the total rental time of servers needed to process the job sequence. Any Fit (AF) algorithms do not rent new servers to accommodate a job unless they have to. We introduce a sub-family of AF algorithms called monotone AF algorithms. We show that this family has a tight competitive ratio of $\Theta(d\mu)$, where $d$ is the dimension of the problem and $\mu$ is the ratio between the maximum and minimum duration of jobs in the input sequence. We also show that upper bounds for the RSiC problem obey the direct-sum property with respect to the dimension $d$; that is, we show how to transform $1$-dimensional algorithms for RSiC to work in the $d$-dimensional setting with the competitive ratio scaling by a factor of $d$. As a corollary, we obtain an $O(d\sqrt{\log \mu})$ upper bound for $d$-dimensional clairvoyant RSiC. We also establish a lower bound of $\widetilde{\Omega}(d\mu)$ for both deterministic and randomized algorithms for $d$-dimensional non-clairvoyant RSiC, under the assumption that $\mu \le \log d - 2$. Lastly, we propose a natural greedy algorithm called Greedy. Greedy is a clairvoyant algorithm that belongs to the monotone AF family and achieves a competitive ratio of $\Theta(d\mu)$. Our experimental results indicate that Greedy matches or outperforms all other existing algorithms for almost all the settings of arrival rates and values of $\mu$ and $d$ that we implemented.
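To make the problem setting above concrete, the following is a minimal sketch of a First Fit rule for multi-dimensional server renting: a job goes on the earliest-rented server that can hold it in every dimension, and a new server is rented only when no open server fits. This is an illustration under stated assumptions, not the Greedy algorithm of the abstract. The function name `first_fit_rental_time` and the job representation `(arrival, finish, size_vector)` are hypothetical, and we assume a server is released as soon as it holds no active job.

```python
def first_fit_rental_time(jobs, capacity):
    """Total rental time of a First Fit schedule (illustrative sketch).

    jobs: iterable of (arrival, finish, size_vector) with finish > arrival.
    capacity: per-dimension server capacity, same length as each size_vector.
    Assumption: a server is released once all jobs placed on it have finished.
    """
    d = len(capacity)
    total = 0
    open_servers = []  # each server: {"start", "end", "jobs": [(finish, size)]}
    for arrival, finish, size in sorted(jobs, key=lambda j: j[0]):
        # Drop finished jobs; servers that became empty are released and paid for.
        still_open = []
        for s in open_servers:
            s["jobs"] = [(f, v) for f, v in s["jobs"] if f > arrival]
            if s["jobs"]:
                still_open.append(s)
            else:
                total += s["end"] - s["start"]
        open_servers = still_open

        # First Fit: scan open servers in the order they were rented.
        for s in open_servers:
            load = [sum(v[i] for _, v in s["jobs"]) for i in range(d)]
            if all(load[i] + size[i] <= capacity[i] for i in range(d)):
                s["jobs"].append((finish, size))
                s["end"] = max(s["end"], finish)
                break
        else:
            # No open server fits in every dimension: rent a new one.
            open_servers.append(
                {"start": arrival, "end": finish, "jobs": [(finish, size)]}
            )
    # Pay for servers still open after the last arrival.
    total += sum(s["end"] - s["start"] for s in open_servers)
    return total
```

For example, with capacity `(10, 10)` and jobs `(0, 4, (6, 2))`, `(1, 3, (3, 9))`, `(2, 5, (4, 1))`, the second job overflows the second dimension of the first server and forces a new rental, while the third job fits on the first server and extends its rental to time 5.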
To analyze the worst-case running time of branching algorithms, the majority of work in exponential time algorithms focuses on designing complicated branching rules rather than on developing better analysis methods for simple algorithms. In the mid-$2000$s, Fomin et al. [2005] introduced measure & conquer, an advanced general analysis method, sparking widespread adoption for obtaining tighter worst-case running time upper bounds for many fundamental NP-complete problems. Yet, much potential in this direction remains untapped, as most subsequent work applied it without further advancement. Motivated by this, we present piecewise analysis, a new general method for analyzing the running time of branching algorithms. Our approach is to define a similarity ratio that divides instances into groups and then analyze the running time within each group separately. The similarity ratio is a scale between two parameters of an instance $I$. Instead of relying on a single measure and a single analysis for the whole instance space, our method allows us to take advantage of different intrinsic properties of instances with different similarity ratios. To showcase its potential, we reanalyze two $17$-year-old algorithms from Fomin et al. [2007] that solve $4$-Coloring and #$3$-Coloring, respectively. The original analysis in their paper gave running times of $O(1.7272^n)$ and $O(1.6262^n)$, respectively, for these algorithms; our analysis improves these running times to $O(1.7215^n)$ and $O(1.6232^n)$. Of the two improvements, our new running time $O(1.7215^n)$ is the first improvement in the best known running time for the $4$-Coloring problem since 2007.
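As background for bounds of the form $O(c^n)$, the sketch below shows the standard textbook calculation behind branching analyses: for a branching vector $(a_1,\dots,a_k)$, where branch $i$ decreases the measure by $a_i$, the base $c$ is the unique root $x \ge 1$ of $\sum_i x^{-a_i} = 1$. This is not the piecewise analysis described in the abstract, only the basic building block that such analyses refine. The function name `branching_number` is hypothetical.

```python
def branching_number(vector, tol=1e-12):
    """Return the unique x >= 1 with sum(x ** -a for a in vector) == 1.

    vector: positive measure decreases, one per branch.
    Uses bisection; f below is strictly decreasing in x for x >= 1.
    """
    if not vector or any(a <= 0 for a in vector):
        raise ValueError("branching vector must contain positive decreases")
    if len(vector) == 1:
        return 1.0  # a single branch gives no exponential growth

    def f(x):
        return sum(x ** (-a) for a in vector) - 1.0

    lo, hi = 1.0, 2.0
    while f(hi) > 0:      # grow the bracket until the root lies in [lo, hi]
        hi *= 2.0
    while hi - lo > tol:  # bisection on the decreasing function f
        mid = (lo + hi) / 2
        if f(mid) > 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2
```

A branching vector of `(1, 2)` yields the golden ratio, roughly 1.618, and `(1, 1)` yields 2; a better measure or a finer case analysis lowers the worst vector's root and hence the base of the running time.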
Existing efforts to improve the logical reasoning ability of language models have predominantly relied on supervised fine-tuning, hindering generalization to new domains and/or tasks. The development of Large Language Models (LLMs) has demonstrated the capacity of compressing abundant knowledge into a single proxy, enabling them to tackle multiple tasks effectively. Our preliminary experiments, however, show that LLMs struggle with logical reasoning: their performance on logical reasoning benchmarks is far behind the existing state-of-the-art baselines. In this paper, we make the first attempt to investigate the feasibility of incorporating logical knowledge through self-supervised post-training and activating it via in-context learning, an approach we term LogicLLM. Specifically, we devise an auto-regressive objective variant of MERIt and integrate it with two LLM series, i.e., FLAN-T5 and LLaMA, with parameter sizes ranging from 3 billion to 13 billion. The results on two challenging logical reasoning benchmarks demonstrate the effectiveness of LogicLLM. In addition, we conduct extensive ablation studies to analyze the key factors in designing logic-oriented proxy tasks.
We propose a technique called Rotate-and-Kill for solving the polygon inclusion and circumscribing problems. By applying this technique, we obtain $O(n)$ time algorithms for computing (1) the maximum area triangle in a given $n$-sided convex polygon $P$, (2) the minimum area triangle enclosing $P$, (3) the minimum area triangle enclosing $P$ touching edge-to-edge, i.e., the minimum area triangle that is the intersection of three half-planes out of the $n$ half-planes defining $P$, and (4) the minimum perimeter triangle enclosing $P$ touching edge-to-edge. Our algorithm for computing the maximum area triangle is simpler than the alternatives given in [Chandran and Mount, IJCGA'92] and [Kallus, arXiv'17]. Our algorithms for computing the minimum area or perimeter triangle enclosing $P$ touching edge-to-edge improve on the $O(n\log n)$ or $O(n\log^2n)$ time algorithms given in [Boyce \emph{et al.}, STOC'82], [Aggarwal \emph{et al.}, Algorithmica'87], [Aggarwal and Park, FOCS'88], [Aggarwal \emph{et al.}, DCG'94], and [Schieber, SODA'95].
We consider an expected-value ranking and selection (R&S) problem where all $k$ solutions' simulation outputs depend on a common parameter whose uncertainty can be modeled by a distribution. We define the most probable best (MPB) to be the solution that has the largest probability of being optimal with respect to the distribution and design an efficient sequential sampling algorithm to learn the MPB when the parameter has a finite support. We derive the large deviations rate of the probability of falsely selecting the MPB and formulate an optimal computing budget allocation problem to find the rate-maximizing static sampling ratios. The problem is then relaxed to obtain a set of optimality conditions that are interpretable and computationally efficient to verify. We devise a series of algorithms that replace the unknown means in the optimality conditions with their estimates and prove that the algorithms' sampling ratios achieve the conditions as the simulation budget increases. Furthermore, we show that the empirical performance of the algorithms can be significantly improved by adopting kernel ridge regression for mean estimation while achieving the same asymptotic convergence results. The algorithms are benchmarked against a state-of-the-art contextual R&S algorithm and demonstrated to have superior empirical performance.
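The definition of the MPB can be illustrated with a small sketch for the finite-support case. It assumes that the mean of every solution under every parameter value is known exactly and that a larger mean is better; both are simplifying assumptions made here for illustration, since the abstract concerns the harder setting where the means must be learned by simulation. The function name `most_probable_best` and the input layout are hypothetical.

```python
def most_probable_best(means, probs):
    """Return (index of the MPB, probability of being best per solution).

    means[i][c]: mean output of solution i under parameter value c
                 (assumed known here; larger is assumed better).
    probs[c]:    probability of parameter value c (finite support).
    Ties within a parameter value go to the lowest-indexed solution.
    """
    k = len(means)
    prob_best = [0.0] * k
    for c, p in enumerate(probs):
        # Conditional best: the solution with the largest mean under value c.
        best = max(range(k), key=lambda i: means[i][c])
        prob_best[best] += p
    mpb = max(range(k), key=lambda i: prob_best[i])
    return mpb, prob_best
```

With `means = [[1.0, 5.0, 0.0], [2.0, 1.0, 1.0], [0.0, 0.0, 0.5]]` and `probs = [0.3, 0.3, 0.4]`, solution 1 is best under the first and third parameter values and is therefore the MPB, even though solution 0 has the larger mean when averaged over the distribution. This is the distinction between the MPB and the solution with the best average performance.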
We study comparison sorting in the evolving data model [AKMU11], where the true total order changes while the sorting algorithm is processing the input. More precisely, each comparison operation of the algorithm is followed by a sequence of evolution steps, where an evolution step perturbs the rank of a random item by a "small" random value. The goal is to maintain an ordering that remains close to the true order over time. Previous works have analyzed adaptations of classic sorting algorithms, assuming that an evolution step changes the rank of an item by just one, and that a fixed constant number $b$ of evolution steps take place between two comparisons. In fact, the only previous result achieving optimal $O(n)$ total deviation from the true order, where $n$ is the number of items, applies just for $b=1$ [BDEGJ18]. We analyze a very simple sorting algorithm suggested in [M14], which samples a random pair of adjacent items in each step and swaps them if they are out of order. We show that the algorithm achieves and maintains, w.h.p., optimal total deviation, $O(n)$, and optimal maximum deviation, $O(\log n)$, under very general model settings. Namely, the perturbation introduced by each evolution step follows a distribution of bounded moment generating function, and over a linear number of steps, on average the number of evolution steps between two sorting steps is bounded by an arbitrary constant. Our proof consists of a novel potential function argument that inserts "gaps" in the list of items, and a general framework which separates the analysis of sorting from that of the evolution steps, and is applicable to a variety of settings for which previous approaches do not apply. Our results settle conjectures by [AKMU11] and [M14], and provide theoretical support for the empirical evidence that simple quadratic algorithms are optimal and robust for sorting evolving data [BDEGJ18].
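The simple algorithm described above can be simulated in a few lines. The sketch below uses the most basic model mentioned in the abstract, in which each evolution step swaps two items of adjacent true rank and exactly `b` evolution steps follow every sorting step; the general perturbation distributions analyzed in the paper are not modeled. All function names are hypothetical, and total deviation is taken here to be the sum over items of the distance between list position and true rank.

```python
import random


def sorting_step(order, rank, i):
    """Compare the adjacent items at positions i, i+1; swap if out of order."""
    if rank[order[i]] > rank[order[i + 1]]:
        order[i], order[i + 1] = order[i + 1], order[i]


def evolution_step(truth, rank, r):
    """Swap the items of true rank r and r+1 (each rank changes by one)."""
    x, y = truth[r], truth[r + 1]
    truth[r], truth[r + 1] = y, x
    rank[x], rank[y] = r + 1, r


def total_deviation(order, rank):
    """Sum over items of |position in the list - true rank|."""
    return sum(abs(pos - rank[item]) for pos, item in enumerate(order))


def simulate(n, steps, b=1, seed=0):
    """Run the random adjacent-swap sorter on evolving data; needs n >= 2."""
    if n < 2:
        raise ValueError("need at least two items")
    rng = random.Random(seed)
    truth = list(range(n))  # truth[r] = item whose true rank is r
    rank = list(range(n))   # rank[item] = true rank of item
    order = list(range(n))  # the list maintained by the algorithm
    rng.shuffle(order)
    for _ in range(steps):
        sorting_step(order, rank, rng.randrange(n - 1))
        for _ in range(b):
            evolution_step(truth, rank, rng.randrange(n - 1))
    return total_deviation(order, rank)
```

Calling `simulate(n, steps)` for growing `n`, with `steps` on the order of `n * n`, gives a rough empirical view of how the total deviation scales, which can be compared informally with the $O(n)$ bound stated above.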
We develop a thermodynamic theory for machine learning (ML) systems. Similar to physical thermodynamic systems, which are characterized by energy and entropy, ML systems possess these characteristics as well. This comparison inspires us to integrate the concept of temperature into ML systems, grounded in the fundamental principles of thermodynamics, and to establish a basic thermodynamic framework for machine learning systems with non-Boltzmann distributions. We introduce the concept of states within an ML system, identify two typical types of state, and interpret model training and refresh as a process of state phase transition. We consider that the initial potential energy of an ML system is described by the model's loss functions, and that the energy adheres to the principle of minimum potential energy. For a variety of energy forms and parameter initialization methods, we derive the temperature of systems during the phase transition both analytically and asymptotically, highlighting temperature as a vital indicator of system data distribution and ML training complexity. Moreover, we view deep neural networks as complex heat engines with both a global temperature and local temperatures in each layer. The concept of work efficiency is introduced within neural networks, which mainly depends on the neural activation functions. We then classify neural networks based on their work efficiency, and describe neural networks as two types of heat engines.
Many neurological conditions, e.g., a stroke, can cause patients to experience upper limb (UL) motor impairments that hinder their daily activities. For such patients, while rehabilitation therapy is key for regaining autonomy and restoring mobility, its long-term nature entails ongoing time commitment and it is often not sufficiently engaging. Virtual reality (VR) can transform rehabilitation therapy into engaging game-like tasks that can be tailored to patient-specific activities, set goals, and provide rehabilitation assessment. Yet, most VR systems lack built-in methods to track progress over time and alter rehabilitation programs accordingly. We propose using arm kinematic modeling and capability maps to allow a VR system to understand a user's physical capability and limitation. Next, we suggest two use cases for the VR system to utilize the user's capability map for tailoring rehabilitation programs. Finally, for one use case, it is shown that the VR system can emphasize and assess the use of specific UL joints.
We give a simpler analysis of the ascending auction of Bikhchandani, de Vries, Schummer, and Vohra to sell a welfare-maximizing base of a matroid at Vickrey prices. The new proofs for economic efficiency and the charge of Vickrey prices only require a few matroid folklore theorems, therefore shortening the analysis of the design goals of the auction significantly.
Multi-view learning methods often focus on improving decision accuracy while neglecting decision uncertainty, which significantly restricts their use in safety-critical applications. To address this issue, researchers have proposed trusted multi-view methods that learn the class distribution for each instance, enabling the estimation of classification probabilities and uncertainty. However, these methods rely heavily on high-quality ground-truth labels. This motivates us to delve into a new generalized trusted multi-view learning problem: how to develop a reliable multi-view learning model under the guidance of noisy labels? We propose a trusted multi-view noise refining method (TMNR) to solve this problem. We first use evidential deep neural networks to construct view-opinions, which consist of belief mass vectors and uncertainty estimates. Subsequently, we design view-specific noise correlation matrices that transform the original opinions into noisy opinions aligned with the noisy labels. Considering that label noise originates from low-quality data features and easily confused classes, we ensure that the diagonal elements of these matrices are inversely proportional to the uncertainty, while incorporating class relations into the off-diagonal elements. Finally, we aggregate the noisy opinions and employ a generalized maximum likelihood loss on the aggregated opinion for model training, guided by the noisy labels. We empirically compare TMNR with state-of-the-art trusted multi-view learning and label noise learning baselines on 5 publicly available datasets. Experimental results show that TMNR outperforms baseline methods in accuracy, reliability, and robustness. We promise to release the code and all datasets on GitHub and show the link here.