We consider the non-preemptive scheduling problem on identical machines where there is a parameter B and each machine in every unit length time interval can process up to B different jobs. The goal function we consider is the makespan minimization and we develop an EPTAS for this problem. Prior to our work a PTAS was known only for the case of one machine and constant values of B, and even the case of non-constant values of B on one machine was not known to admit a PTAS.
Niching methods have been developed to maintain the population diversity, to investigate many peaks in parallel and to reduce the effect of genetic drift. We present the first rigorous runtime analyses of restricted tournament selection (RTS), embedded in a ($\mu$+1) EA, and analyse its effectiveness at finding both optima of the bimodal function ${\rm T{\small WO}M{\small AX}}$. In RTS, an offspring competes against the closest individual, with respect to some distance measure, amongst $w$ (window size) population members (chosen uniformly at random with replacement), to encourage competition within the same niche. We prove that RTS finds both optima on ${\rm T{\small WO}M{\small AX}}$ efficiently if the window size $w$ is large enough. However, if $w$ is too small, RTS fails to find both optima even in exponential time, with high probability. We further consider a variant of RTS selecting individuals for the tournament \emph{without} replacement. It yields a more diverse tournament and is more effective at preventing one niche from taking over the other. However, this comes at the expense of a slower progress towards optima when a niche collapses to a single individual. Our theoretical results are accompanied by experimental studies that shed light on parameters not covered by the theoretical results and support a conjectured lower runtime bound.
A prediction interval covers a future observation from a random process in repeated sampling, and is typically constructed by identifying a pivotal quantity that is also an ancillary statistic. Analogously, a tolerance interval covers a population percentile in repeated sampling and is often based on a pivotal quantity. One approach we consider in non-normal models leverages a link function resulting in a pivotal quantity that is approximately normally distributed. In settings where this normal approximation does not hold we consider a second approach for tolerance and prediction based on a confidence interval for the mean. These methods are intuitive, simple to implement, have proper operating characteristics, and are computationally efficient compared to Bayesian, re-sampling, and machine learning methods. This is demonstrated in the context of multi-site clinical trial recruitment with staggered site initiation, real-world time on treatment, and end-of-study success for a clinical endpoint.
eBPF is a new technology which allows dynamically loading pieces of code into the Linux kernel. It can greatly speed up networking since it enables the kernel to process certain packets without the involvement of a userspace program. So far eBPF has been used for simple packet filtering applications such as firewalls or Denial of Service protection. We show that it is possible to develop a flow based network intrusion detection system based on machine learning entirely in eBPF. Our solution uses a decision tree and decides for each packet whether it is malicious or not, considering the entire previous context of the network flow. We achieve a performance increase of over 20% compared to the same solution implemented as a userspace program.
A strict bramble of a graph $G$ is a collection of pairwise-intersecting connected subgraphs of $G.$ The order of a strict bramble ${\cal B}$ is the minimum size of a set of vertices intersecting all sets of ${\cal B}.$ The strict bramble number of $G,$ denoted by ${\sf sbn}(G),$ is the maximum order of a strict bramble in $G.$ The strict bramble number of $G$ can be seen as a way to extend the notion of acyclicity, departing from the fact that (non-empty) acyclic graphs are exactly the graphs where every strict bramble has order one. We initiate the study of this graph parameter by providing three alternative definitions, each revealing different structural characteristics. The first is a min-max theorem asserting that ${\sf sbn}(G)$ is equal to the minimum $k$ for which $G$ is a minor of the lexicographic product of a tree and a clique on $k$ vertices (also known as the lexicographic tree product number). The second characterization is in terms of a new variant of a tree decomposition called lenient tree decomposition. We prove that ${\sf sbn}(G)$ is equal to the minimum $k$ for which there exists a lenient tree decomposition of $G$ of width at most $k.$ The third characterization is in terms of extremal graphs. For this, we define, for each $k,$ the concept of a $k$-domino-tree and we prove that every edge-maximal graph of strict bramble number at most $k$ is a $k$-domino-tree. We also identify three graphs that constitute the minor-obstruction set of the class of graphs with strict bramble number at most two. We complete our results by proving that, given some $G$ and $k,$ deciding whether ${\sf sbn}(G) \leq k$ is an ${\sf NP}$-complete problem.
We consider the deformation of a geological structure with non-intersecting faults that can be represented by a layered system of viscoelastic bodies satisfying rate- and state-depending friction conditions along the common interfaces. We derive a mathematical model that contains classical Dieterich- and Ruina-type friction as special cases and accounts for possibly large tangential displacements. Semi-discretization in time by a Newmark scheme leads to a coupled system of non-smooth, convex minimization problems for rate and state to be solved in each time step. Additional spatial discretization by a mortar method and piecewise constant finite elements allows for the decoupling of rate and state by a fixed point iteration and efficient algebraic solution of the rate problem by truncated non-smooth Newton methods. Numerical experiments with a spring slider and a layered multiscale system illustrate the behavior of our model as well as the efficiency and reliability of the numerical solver.
Muchnik's paradox says that enumerable betting strategies are not always reducible to enumerable strategies whose bets are restricted to either even rounds or odd rounds. In other words, there are outcome sequences x where an effectively enumerable strategy succeeds, but no such parity-restricted effectively enumerable strategy does. We characterize the effective Hausdorff dimension of such $x$, showing that it can be as low as 1/2 but not less. We also show that such reals that are random with respect to parity-restricted effectively enumerable strategies with packing dimension as low as $\log\sqrt3$. Finally we exhibit Muchnik's paradox in the case of computable integer-valued strategies.
In this paper we present a novel class of asymptotic consistent exponential-type integrators for Klein-Gordon-Schr\"odinger systems that capture all regimes from the slowly varying classical regime up to the highly oscillatory non-relativistic limit regime. We achieve convergence of order one and two that is uniform in $c$ without any time step size restrictions. In particular, we establish an explicit relation between gain in negative powers of the potentially large parameter $c$ in the error constant and loss in derivative.
A core capability of intelligent systems is the ability to quickly learn new tasks by drawing on prior experience. Gradient (or optimization) based meta-learning has recently emerged as an effective approach for few-shot learning. In this formulation, meta-parameters are learned in the outer loop, while task-specific models are learned in the inner-loop, by using only a small amount of data from the current task. A key challenge in scaling these approaches is the need to differentiate through the inner loop learning process, which can impose considerable computational and memory burdens. By drawing upon implicit differentiation, we develop the implicit MAML algorithm, which depends only on the solution to the inner level optimization and not the path taken by the inner loop optimizer. This effectively decouples the meta-gradient computation from the choice of inner loop optimizer. As a result, our approach is agnostic to the choice of inner loop optimizer and can gracefully handle many gradient steps without vanishing gradients or memory constraints. Theoretically, we prove that implicit MAML can compute accurate meta-gradients with a memory footprint that is, up to small constant factors, no more than that which is required to compute a single inner loop gradient and at no overall increase in the total computational cost. Experimentally, we show that these benefits of implicit MAML translate into empirical gains on few-shot image recognition benchmarks.
Machine-learning models have demonstrated great success in learning complex patterns that enable them to make predictions about unobserved data. In addition to using models for prediction, the ability to interpret what a model has learned is receiving an increasing amount of attention. However, this increased focus has led to considerable confusion about the notion of interpretability. In particular, it is unclear how the wide array of proposed interpretation methods are related, and what common concepts can be used to evaluate them. We aim to address these concerns by defining interpretability in the context of machine learning and introducing the Predictive, Descriptive, Relevant (PDR) framework for discussing interpretations. The PDR framework provides three overarching desiderata for evaluation: predictive accuracy, descriptive accuracy and relevancy, with relevancy judged relative to a human audience. Moreover, to help manage the deluge of interpretation methods, we introduce a categorization of existing techniques into model-based and post-hoc categories, with sub-groups including sparsity, modularity and simulatability. To demonstrate how practitioners can use the PDR framework to evaluate and understand interpretations, we provide numerous real-world examples. These examples highlight the often under-appreciated role played by human audiences in discussions of interpretability. Finally, based on our framework, we discuss limitations of existing methods and directions for future work. We hope that this work will provide a common vocabulary that will make it easier for both practitioners and researchers to discuss and choose from the full range of interpretation methods.
Given the rise of a new approach to MT, Neural MT (NMT), and its promising performance on different text types, we assess the translation quality it can attain on what is perceived to be the greatest challenge for MT: literary text. Specifically, we target novels, arguably the most popular type of literary text. We build a literary-adapted NMT system for the English-to-Catalan translation direction and evaluate it against a system pertaining to the previous dominant paradigm in MT: statistical phrase-based MT (PBSMT). To this end, for the first time we train MT systems, both NMT and PBSMT, on large amounts of literary text (over 100 million words) and evaluate them on a set of twelve widely known novels spanning from the the 1920s to the present day. According to the BLEU automatic evaluation metric, NMT is significantly better than PBSMT (p < 0.01) on all the novels considered. Overall, NMT results in a 11% relative improvement (3 points absolute) over PBSMT. A complementary human evaluation on three of the books shows that between 17% and 34% of the translations, depending on the book, produced by NMT (versus 8% and 20% with PBSMT) are perceived by native speakers of the target language to be of equivalent quality to translations produced by a professional human translator.