
A longstanding open problem asks for an aperiodic monotile, also known as an "einstein": a shape that admits tilings of the plane, but never periodic tilings. We answer this problem for topological disk tiles by exhibiting a continuum of combinatorially equivalent aperiodic polygons. We first show that a representative example, the "hat" polykite, can form clusters called "metatiles", for which substitution rules can be defined. Because the metatiles admit tilings of the plane, so too does the hat. We then prove that generic members of our continuum of polygons are aperiodic, through a new kind of geometric incommensurability argument. Separately, we give a combinatorial, computer-assisted proof that the hat must form hierarchical -- and hence aperiodic -- tilings.

Related content

A package query returns a package -- a multiset of tuples -- that maximizes or minimizes a linear objective function subject to linear constraints, thereby enabling in-database decision support. Prior work has established the equivalence of package queries to Integer Linear Programs (ILPs) and developed the SketchRefine algorithm for package query processing. While this algorithm was an important first step toward supporting prescriptive analytics scalably inside a relational database, it struggles when the data size grows beyond a few hundred million tuples or when the constraints become very tight. In this paper, we present Progressive Shading, a novel algorithm for processing package queries that can scale efficiently to billions of tuples and gracefully handle tight constraints. Progressive Shading solves a sequence of optimization problems over a hierarchy of relations, each resulting from an ever-finer partitioning of the original tuples into homogeneous groups until the original relation is obtained. This strategy avoids the premature discarding of high-quality tuples that can occur with SketchRefine. Our novel partitioning scheme, Dynamic Low Variance, can handle very large relations with multiple attributes and can dynamically adapt to both concentrated and spread-out sets of attribute values, provably outperforming traditional partitioning schemes such as KD-Tree. We further optimize our system by replacing our off-the-shelf optimization software with customized ILP and LP solvers, called Dual Reducer and Parallel Dual Simplex respectively, that are highly accurate and orders of magnitude faster.

In this paper we study the relation of two fundamental problems in scheduling and fair allocation: makespan minimization on unrelated parallel machines and max-min fair allocation, also known as the Santa Claus problem. For both of these problems the best approximation factor is a notorious open question; more precisely, whether there is a better-than-2 approximation for the former problem and whether there is a constant approximation for the latter. While the two problems are intuitively related and history has shown that techniques can often be transferred between them, no formal reductions are known. We first show that an affirmative answer to the open question for makespan minimization implies the same for the Santa Claus problem by reducing the latter problem to the former. We also prove that for problem instances with only two input values both questions are equivalent. We then move to a special case called ``restricted assignment'', which is well studied in both problems. Although our reductions do not maintain the characteristics of this special case, we give a reduction in a slight generalization, where the jobs or resources are assigned to multiple machines or players subject to a matroid constraint and in addition we have only two values. This draws a similar picture as before: equivalence for two values and the general case of Santa Claus can only be easier than makespan minimization. To complete the picture, we give an algorithm for our new matroid variant of the Santa Claus problem using a non-trivial extension of the local search method from restricted assignment. Thereby we unify, generalize, and improve several previous results. We believe that this matroid generalization may be of independent interest and provide several sample applications.

We investigate trade-offs in static and dynamic evaluation of hierarchical queries with arbitrary free variables. In the static setting, the trade-off is between the time to partially compute the query result and the delay needed to enumerate its tuples. In the dynamic setting, we additionally consider the time needed to update the query result under single-tuple inserts or deletes to the database. Our approach observes the degree of values in the database and uses different computation and maintenance strategies for high-degree (heavy) and low-degree (light) values. For the latter it partially computes the result, while for the former it computes enough information to allow for on-the-fly enumeration. We define the preprocessing time, the update time, and the enumeration delay as functions of the light/heavy threshold. By appropriately choosing this threshold, our approach recovers a number of prior results when restricted to hierarchical queries. We show that for a restricted class of hierarchical queries, our approach achieves worst-case optimal update time and enumeration delay conditioned on the Online Matrix-Vector Multiplication Conjecture.

Background: Machine Learning (ML) methods are increasingly used to automate Continuous Integration (CI) activities such as Test Case Prioritization (TCP). However, ML models need frequent retraining because of changes in the CI environment, commonly known as data drift, and continuously retraining ML models consumes a lot of time and effort. Hence, there is an urgent need to identify and evaluate approaches that can reduce the retraining effort and time for ML models used for TCP in CI environments. Aims: This study investigates how well data drift detection techniques can automatically detect the retraining points for ML models for TCP in CI environments without requiring detailed knowledge of the software projects. Method: We employed the Hellinger distance to identify changes in both the values and distribution of input data and leveraged these changes as retraining points for the ML model. We evaluated the efficacy of this method on multiple datasets and compared the APFDc and NAPFD evaluation metrics against models that were regularly retrained, with careful consideration of the statistical methods. Results: Our experimental evaluation of the Hellinger distance-based method demonstrated its efficacy and efficiency in detecting retraining points and reducing the associated costs. However, the performance of this method may vary depending on the dataset. Conclusions: Our findings suggest that data drift detection methods can assist in identifying retraining points for ML models in CI environments, while significantly reducing the required retraining time. These methods can be helpful for practitioners who lack specialized knowledge of software projects, enabling them to maintain ML model accuracy.
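As a rough illustration of the drift check described in the abstract above, the following minimal sketch computes the Hellinger distance between binned versions of a reference feature sample and a current one, and flags a retraining point when the distance exceeds a threshold. The function names, the equal-width binning, the bin count, and the threshold of 0.2 are all assumptions made for illustration; the abstract does not specify how the authors bin the data or choose the cut-off.

```python
import math


def hellinger_distance(p, q):
    """Hellinger distance between two discrete distributions.

    p and q are equal-length sequences of probabilities. The result lies
    in [0, 1]: 0 for identical distributions, 1 for disjoint supports.
    """
    if len(p) != len(q):
        raise ValueError("distributions must have the same number of bins")
    squared = sum((math.sqrt(a) - math.sqrt(b)) ** 2 for a, b in zip(p, q))
    return math.sqrt(squared / 2.0)


def histogram(values, bins, lo, hi):
    """Normalized equal-width histogram of values over [lo, hi]."""
    counts = [0] * bins
    width = (hi - lo) / bins
    for v in values:
        # Clamp so that v == hi falls into the last bin.
        idx = max(0, min(int((v - lo) / width), bins - 1))
        counts[idx] += 1
    total = len(values)
    return [c / total for c in counts]


def needs_retraining(reference, current, threshold=0.2, bins=10):
    """Hypothetical drift check: True if the two samples differ enough.

    The threshold and the number of bins are illustrative assumptions,
    not values taken from the study.
    """
    lo = min(min(reference), min(current))
    hi = max(max(reference), max(current))
    if hi == lo:
        # All values identical in both samples: no drift detectable.
        return False
    p = histogram(reference, bins, lo, hi)
    q = histogram(current, bins, lo, hi)
    return hellinger_distance(p, q) > threshold
```

In a CI setting, `reference` would hold a feature's values from the builds the model was last trained on and `current` the values from recent builds. Applying the check per feature and retraining when any feature drifts is one possible policy, not necessarily the one used in the study.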

Randomized trials balance all covariates on average and provide the gold standard for estimating treatment effects. Chance imbalances nevertheless exist to some degree in realized treatment allocations and raise an important question: what should we do when the treatment groups differ with respect to some important baseline characteristics? A common strategy is to conduct a {\it preliminary test} of the balance of baseline covariates after randomization, and invoke covariate adjustment for subsequent inference if and only if the realized allocation fails some prespecified criterion. Although such practice is intuitive and popular among practitioners, the existing literature has so far only evaluated its properties under strong parametric model assumptions in theory and simulation, yielding results of limited generality. To fill this gap, we examine two strategies for conducting preliminary test-based covariate adjustment by regression, and evaluate the validity and efficiency of the resulting inferences from the randomization-based perspective. As it turns out, the preliminary-test estimator based on the analysis of covariance can be even less efficient than the unadjusted difference in means, and risks anticonservative confidence intervals based on normal approximation even with the robust standard error. The preliminary-test estimator based on the fully interacted specification is, on the other hand, less efficient than its counterpart under the {\it always-adjust} strategy, and yields overconservative confidence intervals based on normal approximation. Based on theory and simulation, we echo the existing literature and do not recommend the preliminary-test procedure for covariate adjustment in randomized trials.

This paper focuses on spatial time-optimal motion planning, a generalization of the exact time-optimal path following problem that allows the system to plan within a predefined space. In contrast to state-of-the-art methods, we drop the assumption that a collision-free geometric reference is given. Instead, we present a two-stage motion planning method that solely relies on a goal location and a geometric representation of the environment to compute a time-optimal trajectory that is compliant with system dynamics and constraints. To do so, the proposed scheme first computes an obstacle-free Pythagorean Hodograph parametric spline, and second solves a spatially reformulated minimum-time optimization problem. The spline obtained in the first stage is not a geometric reference, but an extension of the environment representation, and thus, time-optimality of the solution is guaranteed. The efficacy of the proposed approach is benchmarked by a known planar example and validated in a more complex spatial system, illustrating its versatility and applicability.

We study coresets for clustering with capacity and fairness constraints. Our main result is a near-linear time algorithm to construct $\tilde{O}(k^2\varepsilon^{-2z-2})$-sized $\varepsilon$-coresets for capacitated $(k,z)$-clustering, which improves a recent $\tilde{O}(k^3\varepsilon^{-3z-2})$ bound by [BCAJ+22, HJLW23]. As a corollary, we also save a factor of $k \varepsilon^{-z}$ on the coreset size for fair $(k,z)$-clustering compared to them. We fundamentally improve the hierarchical uniform sampling framework of [BCAJ+22] by adaptively selecting sample size on each ring instance, proportional to its clustering cost to an optimal solution. Our analysis relies on a key geometric observation that reduces the number of total ``effective centers'' from [BCAJ+22]'s $\tilde{O}(k^2\varepsilon^{-z})$ to merely $O(k\log \varepsilon^{-1})$ by being able to ``ignore'' all center points that are too far or too close to the ring center.

One of the most studied extensions of the famous Traveling Salesperson Problem (TSP) is the {\sc Multiple TSP}: a set of $m\geq 1$ salespersons collectively traverses a set of $n$ cities by $m$ non-trivial tours, to minimize the total length of their tours. This problem can also be considered to be a variant of {\sc Uncapacitated Vehicle Routing} where the objective function is the sum of all tour lengths. When all $m$ tours start from a single common \emph{depot} $v_0$, then the metric {\sc Multiple TSP} can be approximated equally well as the standard metric TSP, as shown by Frieze (1983). The {\sc Multiple TSP} becomes significantly harder to approximate when there is a \emph{set} $D$ of $d \geq 1$ depots that form the starting and end points of the $m$ tours. For this case only a $(2-1/d)$-approximation in polynomial time is known, as well as a $3/2$-approximation for \emph{constant} $d$ which requires a prohibitive run time of $n^{\Theta(d)}$ (Xu and Rodrigues, \emph{INFORMS J. Comput.}, 2015). A recent work of Traub, Vygen and Zenklusen (STOC 2020) gives another approximation algorithm for {\sc Multiple TSP} running in time $n^{\Theta(d)}$ and reducing the problem to approximating TSP. In this paper we overcome the $n^{\Theta(d)}$ time barrier: we give the first efficient approximation algorithm for {\sc Multiple TSP} with a \emph{variable} number $d$ of depots that yields a better-than-2 approximation. Our algorithm runs in time $(1/\varepsilon)^{\mathcal O(d\log d)}\cdot n^{\mathcal O(1)}$, and produces a $(3/2+\varepsilon)$-approximation with constant probability. For the graphic case, we obtain a deterministic $3/2$-approximation in time $2^d\cdot n^{\mathcal O(1)}$.

Causal disentanglement aims to uncover a representation of data using latent variables that are interrelated through a causal model. Such a representation is identifiable if the latent model that explains the data is unique. In this paper, we focus on the scenario where unpaired observational and interventional data are available, with each intervention changing the mechanism of a latent variable. When the causal variables are fully observed, statistically consistent algorithms have been developed to identify the causal model under faithfulness assumptions. We here show that identifiability can still be achieved with unobserved causal variables, given a generalized notion of faithfulness. Our results guarantee that we can recover the latent causal model up to an equivalence class and predict the effect of unseen combinations of interventions, in the limit of infinite data. We implement our causal disentanglement framework by developing an autoencoding variational Bayes algorithm and apply it to the problem of predicting combinatorial perturbation effects in genomics.

Since real-world objects and their interactions are often multi-modal and multi-typed, heterogeneous networks have been widely used as a more powerful, realistic, and generic superclass of traditional homogeneous networks (graphs). Meanwhile, representation learning (a.k.a. embedding) has recently been intensively studied and shown effective for various network mining and analytical tasks. In this work, we aim to provide a unified framework to deeply summarize and evaluate existing research on heterogeneous network embedding (HNE), which includes but goes beyond a normal survey. Since there is already a broad body of HNE algorithms, as the first contribution of this work, we provide a generic paradigm for systematically categorizing and analyzing the merits of various existing HNE algorithms. Moreover, existing HNE algorithms, though mostly claimed to be generic, are often evaluated on different datasets. Although understandable given the application-driven nature of HNE, such indirect comparisons largely hinder the proper attribution of improved task performance to effective data preprocessing or to novel technical design, especially considering the many possible ways to construct a heterogeneous network from real-world application data. Therefore, as the second contribution, we create four benchmark datasets from different sources, with various properties regarding scale, structure, attribute/label availability, etc., to enable convenient and fair evaluation of HNE algorithms. As the third contribution, we carefully refactor and amend the implementations and create friendly interfaces for 13 popular HNE algorithms, and provide all-around comparisons among them over multiple tasks and experimental settings.
