
The graph parameter of pathwidth can be seen as a measure of the topological resemblance of a graph to a path. A popular definition of pathwidth is given in terms of node search, where we are given a system of tunnels contaminated by some infectious substance and we look for a search strategy that, at each step, either places a searcher on a vertex or removes a searcher from a vertex, and where an edge is cleaned when both of its endpoints are simultaneously occupied by searchers. It was proved that the minimum number of searchers required for a successful cleaning strategy equals the pathwidth of the graph plus one. Two desirable characteristics of a cleaning strategy are monotonicity (no recontamination occurs) and connectedness (clean territories always remain connected). Under these two demands, the minimum number of searchers corresponds to a variant of pathwidth called {\em connected pathwidth}. We prove that connected pathwidth is fixed-parameter tractable; in particular, we design a $2^{O(k^2)}\cdot n$-time algorithm that checks whether the connected pathwidth of $G$ is at most $k$. This resolves an open question of [Dereniowski, Osula, and Rz{\k{a}}{\.{z}}ewski, Finding small-width connected path-decompositions in polynomial time. Theor. Comput. Sci., 794:85-100, 2019]. For our algorithm, we enrich the typical sequence technique so that it is able to deal with the connectivity demand. Typical sequences were introduced in [Bodlaender and Kloks. Efficient and constructive algorithms for the pathwidth and treewidth of graphs. J. Algorithms, 21(2):358-402, 1996] for the design of linear parameterized algorithms for treewidth and pathwidth. The proposed extension is based on an encoding of the connectivity property that is quite versatile and may be adapted so as to deliver linear parameterized algorithms for the connected variants of other width parameters as well.
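
As a minimal, illustrative sketch of the objects involved (not the $2^{O(k^2)}\cdot n$ algorithm itself), the following Python snippet checks whether a given sequence of bags is a path decomposition of a graph, additionally enforcing the connectivity condition used for connected pathwidth, namely that the union of every prefix of bags induces a connected subgraph, and returns the width; the adjacency-list representation and the example graph are our own choices.

    from collections import deque

    def is_connected(vertices, adj):
        """BFS check that `vertices` induces a connected subgraph of `adj`."""
        vertices = set(vertices)
        if not vertices:
            return True
        start = next(iter(vertices))
        seen, queue = {start}, deque([start])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v in vertices and v not in seen:
                    seen.add(v)
                    queue.append(v)
        return seen == vertices

    def connected_path_decomposition_width(bags, adj):
        """Return the width of `bags` if it is a connected path decomposition
        of the graph `adj`, otherwise None."""
        vertices = set(adj)
        # (1) every vertex appears in some bag
        if set().union(*bags) != vertices:
            return None
        # (2) every edge is contained in some bag
        for u in adj:
            for v in adj[u]:
                if not any(u in B and v in B for B in bags):
                    return None
        # (3) the bags containing a fixed vertex are consecutive
        for v in vertices:
            idx = [i for i, B in enumerate(bags) if v in B]
            if idx and idx[-1] - idx[0] + 1 != len(idx):
                return None
        # (4) connectivity: every prefix of bags induces a connected subgraph
        prefix = set()
        for B in bags:
            prefix |= B
            if not is_connected(prefix, adj):
                return None
        return max(len(B) for B in bags) - 1

    # A 4-vertex path a-b-c-d has (connected) pathwidth 1.
    adj = {"a": ["b"], "b": ["a", "c"], "c": ["b", "d"], "d": ["c"]}
    bags = [{"a", "b"}, {"b", "c"}, {"c", "d"}]
    print(connected_path_decomposition_width(bags, adj))  # 1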

Related content

We prove a new generalization bound that shows for any class of linear predictors in Gaussian space, the Rademacher complexity of the class and the training error under any continuous loss $\ell$ can control the test error under all Moreau envelopes of the loss $\ell$. We use our finite-sample bound to directly recover the "optimistic rate" of Zhou et al. (2021) for linear regression with the square loss, which is known to be tight for minimal $\ell_2$-norm interpolation, but we also handle more general settings where the label is generated by a potentially misspecified multi-index model. The same argument can analyze noisy interpolation of max-margin classifiers through the squared hinge loss, and establishes consistency results in spiked-covariance settings. More generally, when the loss is only assumed to be Lipschitz, our bound effectively improves Talagrand's well-known contraction lemma by a factor of two, and we prove uniform convergence of interpolators (Koehler et al. 2021) for all smooth, non-negative losses. Finally, we show that application of our generalization bound using localized Gaussian width will generally be sharp for empirical risk minimizers, establishing a non-asymptotic Moreau envelope theory for generalization that applies outside of proportional scaling regimes, handles model misspecification, and complements existing asymptotic Moreau envelope theories for M-estimation.
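
As a small numeric illustration of the central object (not the paper's bound), the Moreau envelope $\mathrm{env}_\lambda \ell(x) = \min_y \ell(y) + \frac{1}{2\lambda}(x-y)^2$ can be approximated by minimizing over a fine grid; the squared hinge loss, the value of $\lambda$, and the grid resolution below are illustrative choices.

    import numpy as np

    def moreau_envelope(loss, x, lam, grid):
        """env_lam loss(x) = min_y loss(y) + (x - y)^2 / (2 lam), minimized over `grid`."""
        return np.min(loss(grid) + (x - grid) ** 2 / (2.0 * lam))

    hinge_sq = lambda t: np.maximum(0.0, 1.0 - t) ** 2   # squared hinge loss (margin form)
    grid = np.linspace(-5.0, 5.0, 20001)

    for x in [-1.0, 0.0, 0.5, 2.0]:
        env = moreau_envelope(hinge_sq, x, lam=0.5, grid=grid)
        print(f"x={x:5.2f}  loss={hinge_sq(x):.4f}  Moreau envelope={env:.4f}")
    # The envelope never exceeds the loss itself and is smoother near the kink at t = 1.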

We study the training of deep neural networks by gradient descent, where floating-point arithmetic is used to compute the gradients. In this framework, and under realistic assumptions, we demonstrate that it is highly unlikely to find ReLU neural networks that maintain, in the course of training with gradient descent, superlinearly many affine pieces with respect to their number of layers. Virtually all approximation-theoretic arguments that yield high-order polynomial rates of approximation rely on sequences of ReLU neural networks with exponentially many affine pieces relative to their number of layers. As a consequence, we conclude that approximating sequences of ReLU neural networks resulting from gradient descent in practice differ substantially from theoretically constructed sequences. The assumptions and the theoretical results are compared to a numerical study, which yields concurring results.
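
As a toy illustration of the quantity being tracked (not the paper's numerical study), the sketch below counts the affine pieces of a small random ReLU network along a one-dimensional input by detecting slope changes on a fine grid; the architecture, tolerance, and sampling range are illustrative choices.

    import numpy as np

    rng = np.random.default_rng(0)

    def relu_net(x, weights, biases):
        """Evaluate a fully connected ReLU network on a batch of scalar inputs x."""
        h = x.reshape(-1, 1)
        for W, b in zip(weights[:-1], biases[:-1]):
            h = np.maximum(h @ W + b, 0.0)
        return (h @ weights[-1] + biases[-1]).ravel()

    def count_affine_pieces(weights, biases, lo=-3.0, hi=3.0, n=200_000):
        """Count affine pieces of the 1D network on [lo, hi] via slope changes on a grid."""
        x = np.linspace(lo, hi, n)
        y = relu_net(x, weights, biases)
        slopes = np.diff(y) / np.diff(x)
        # A new piece starts wherever the numerical slope changes noticeably.
        changes = np.abs(np.diff(slopes)) > 1e-6
        return int(changes.sum()) + 1

    widths = [1, 8, 8, 8, 1]  # three hidden layers of width 8
    weights = [rng.standard_normal((a, b)) for a, b in zip(widths[:-1], widths[1:])]
    biases = [rng.standard_normal(b) for b in widths[1:]]
    print("affine pieces:", count_affine_pieces(weights, biases))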

Modified Patankar-Runge-Kutta (MPRK) methods preserve the positivity as well as conservativity of a production-destruction system (PDS) of ordinary differential equations for all time step sizes. As a result, higher order MPRK schemes do not belong to the class of general linear methods, i.e. the iterates are generated by a nonlinear map $\mathbf g$ even when the PDS is linear. Moreover, due to the conservativity of the method, the map $\mathbf g$ possesses non-hyperbolic fixed points. Recently, a new theorem for the investigation of stability properties of non-hyperbolic fixed points of a nonlinear iteration map was developed. We apply this theorem to understand the stability properties of a family of second order MPRK methods when applied to a nonlinear PDS of ordinary differential equations. It is shown that the fixed points are stable for all time step sizes and members of the MPRK family. Finally, experiments are presented to numerically support the theoretical claims.
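
For concreteness, the sketch below implements the first-order modified Patankar-Euler scheme, a simpler relative of the second-order MPRK family studied here, on a linear two-constituent PDS; it illustrates how the Patankar weights turn each step into a linear solve whose solution stays positive and conservative for any time step size. The specific test problem and step size are illustrative choices.

    import numpy as np

    def mpe_step(y, dt, P):
        """One step of the first-order modified Patankar-Euler scheme.

        P(y)[i, j] is the production rate of constituent i from constituent j
        (zero diagonal assumed); the destruction terms are D = P.T, so the
        resulting scheme is conservative."""
        p = P(y)
        d = p.T
        n = len(y)
        M = np.eye(n) + np.diag(dt * d.sum(axis=1) / y)   # destruction, weighted by y_i^{n+1}/y_i^n
        M -= dt * p / y[np.newaxis, :]                    # production, weighted by y_j^{n+1}/y_j^n
        return np.linalg.solve(M, y)

    # Linear test PDS: y1' = -a*y1 + b*y2,  y2' = a*y1 - b*y2.
    a, b = 5.0, 1.0
    P = lambda y: np.array([[0.0, b * y[1]],
                            [a * y[0], 0.0]])

    y = np.array([0.9, 0.1])
    for _ in range(20):
        y = mpe_step(y, dt=0.5, P=P)   # large step: iterates stay positive, sum stays 1
        print(y, "sum =", y.sum())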

In this paper, we address the dichotomy between heterogeneous models and simultaneous training in Federated Learning (FL) via a clustering framework. We define a new clustering model for FL based on the (optimal) local models of the users: two users belong to the same cluster if their local models are close; otherwise they belong to different clusters. A standard algorithm for clustered FL, called \texttt{IFCA}, is proposed in \cite{ghosh_efficient_2021}; it requires a \emph{suitable} initialization and knowledge of hyper-parameters, such as the number of clusters (which is often quite difficult to obtain in practical applications), in order to converge. We propose an improved algorithm, \emph{Successive Refine Federated Clustering Algorithm} (\texttt{SR-FCA}), which removes such restrictive assumptions. \texttt{SR-FCA} treats each user as a singleton cluster at initialization and then successively refines the cluster estimates by exploiting the similarity of users belonging to the same cluster. In each intermediate step, \texttt{SR-FCA} uses a robust federated learning algorithm within each cluster to exploit simultaneous training and to correct clustering errors. Furthermore, \texttt{SR-FCA} does not require any \emph{good} initialization (warm start), in either theory or practice. We show that with a proper choice of learning rate, \texttt{SR-FCA} incurs an arbitrarily small clustering error. Additionally, we validate the performance of our algorithm on standard FL datasets in non-convex problems such as neural networks, and we show the benefits of \texttt{SR-FCA} over baselines.
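
As a heavily simplified, non-federated toy of the singleton-cluster initialization idea (not the actual \texttt{SR-FCA} algorithm, which additionally runs robust federated training inside each cluster), the sketch below starts with one cluster per user and merges clusters whose averaged local models are within a distance threshold; the threshold \texttt{lam}, the one-shot merging rule, and the synthetic models are illustrative assumptions.

    import numpy as np

    def merge_singleton_clusters(local_models, lam):
        """Toy sketch: start with one cluster per user and repeatedly merge
        clusters whose (averaged) local models are within distance lam."""
        clusters = [[i] for i in range(len(local_models))]
        merged = True
        while merged:
            merged = False
            for a in range(len(clusters)):
                for b in range(a + 1, len(clusters)):
                    ca = np.mean([local_models[i] for i in clusters[a]], axis=0)
                    cb = np.mean([local_models[i] for i in clusters[b]], axis=0)
                    if np.linalg.norm(ca - cb) <= lam:
                        clusters[a] += clusters.pop(b)
                        merged = True
                        break
                if merged:
                    break
        return clusters

    # Two ground-truth clusters of users, observed through noisy local models.
    rng = np.random.default_rng(1)
    models = [rng.normal(0.0, 0.1, size=5) for _ in range(4)] + \
             [rng.normal(3.0, 0.1, size=5) for _ in range(4)]
    print(merge_singleton_clusters(models, lam=1.0))   # users {0..3} and {4..7} grouped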

The goal of cryptocurrencies is decentralization. In principle, all currencies have equal status. Unlike traditional stock markets, there is no default currency of denomination (fiat), so trading pairs can be set freely. However, it is impractical to set up a trading market between every two currencies. To control management costs and ensure sufficient liquidity, priority must be given to covering large-volume trading pairs while ensuring that all coins remain reachable. We note that this is an optimization problem whose particularity lies in: 1) the trading volume between most (>99.5%) possible trading pairs cannot be directly observed; 2) a connectivity constraint must be satisfied, that is, all currencies are guaranteed to be tradable. To solve this problem, we use a two-stage process: 1) fill in missing values based on a regularized, truncated eigenvalue decomposition, where the regularization term controls to what extent missing values are pushed toward zero; 2) search for the optimal trading pairs via a branch-and-bound process with heuristic search and pruning strategies. The experimental results show that: 1) if the number of denominating coins is not limited, we obtain a more decentralized trading-pair setting, which favors establishing trading pairs directly between large currencies; 2) there is room for optimization in all exchanges, and inappropriate trading pairs mainly arise from subjectively choosing small coins as quote currencies or from failing to track emerging large coins in time; 3) too few trading pairs lead to low coverage, while too many require frequent adjustment as markets change, so exchanges should strike an appropriate balance between the two.
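
As a hedged sketch of stage 1 only (the exact regularization used in the paper may differ), the snippet below imputes unobserved entries of a symmetric volume matrix by iterating a truncated eigendecomposition and shrinking the imputed entries toward zero by a factor \texttt{alpha}; the rank, shrinkage factor, and synthetic data are illustrative choices.

    import numpy as np

    def fill_missing_volumes(V, mask, rank=2, alpha=0.5, iters=50):
        """Toy sketch of stage 1: impute unobserved pair volumes with a truncated
        eigendecomposition of the (symmetric) volume matrix, shrinking the imputed
        entries toward zero by a factor alpha (the regularization knob)."""
        X = np.where(mask, V, 0.0)                       # start missing entries at zero
        for _ in range(iters):
            w, U = np.linalg.eigh(X)                     # symmetric eigendecomposition
            top = np.argsort(np.abs(w))[-rank:]          # keep the `rank` largest eigenvalues
            low_rank = (U[:, top] * w[top]) @ U[:, top].T
            X = np.where(mask, V, (1.0 - alpha) * low_rank)   # observed entries stay fixed
        return X

    # Symmetric rank-1 "volume" matrix with a few unobserved entries.
    v = np.array([10.0, 5.0, 2.0, 1.0])
    V = np.outer(v, v)
    mask = np.ones_like(V, dtype=bool)
    mask[0, 3] = mask[3, 0] = mask[1, 2] = mask[2, 1] = False
    # alpha=0: pure low-rank completion; alpha>0 shrinks imputed volumes toward zero.
    print(fill_missing_volumes(V, mask, rank=1, alpha=0.0)[0, 3])   # close to 10*1 = 10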

While mixed-integer linear programming (MILP) is NP-hard in general, practical MILP solvers have achieved a roughly 100-fold speedup over the past twenty years. Still, many classes of MILPs quickly become unsolvable as their sizes increase, motivating researchers to seek new acceleration techniques. Deep learning has produced strong empirical results here, many of them obtained by applying graph neural networks (GNNs) to decision-making in various stages of the MILP solution process. This work uncovers a fundamental limitation: there exist feasible and infeasible MILPs that all GNNs treat identically, indicating that GNNs lack the power to express general MILPs. We then show that, by restricting the MILPs to unfoldable ones or by adding random features, there exist GNNs that can reliably predict MILP feasibility, optimal objective values, and optimal solutions up to prescribed precision. We conducted small-scale numerical experiments to validate our theoretical findings.
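
As an illustrative sketch of the input representation (not the paper's GNN architecture or proofs), the snippet below encodes a MILP $\min c^\top x$ s.t. $Ax \le b$ as the usual variable-constraint bipartite graph and appends one random feature per node, mirroring the random-feature remedy mentioned above; the feature layout and the toy instance are our own choices.

    import numpy as np

    def milp_to_bipartite_graph(A, b, c, is_integer, rng):
        """Encode a MILP  min c^T x  s.t.  A x <= b  (x_j integer where flagged)
        as a variable-constraint bipartite graph, with one extra random feature
        per node."""
        m, n = A.shape
        # Variable nodes: objective coefficient, integrality flag, random feature.
        var_feat = np.column_stack([c, is_integer.astype(float), rng.random(n)])
        # Constraint nodes: right-hand side, random feature.
        con_feat = np.column_stack([b, rng.random(m)])
        # Edges connect constraint i to variable j when A[i, j] != 0, weighted by A[i, j].
        rows, cols = np.nonzero(A)
        edges = list(zip(rows.tolist(), cols.tolist(), A[rows, cols].tolist()))
        return var_feat, con_feat, edges

    rng = np.random.default_rng(0)
    A = np.array([[1.0, 1.0], [2.0, -1.0]])
    b = np.array([3.0, 1.0])
    c = np.array([-1.0, -2.0])
    var_feat, con_feat, edges = milp_to_bipartite_graph(A, b, c, np.array([True, True]), rng)
    print(var_feat.shape, con_feat.shape, edges)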

Spike-and-slab and horseshoe regression are arguably the most popular Bayesian variable selection approaches for linear regression models. However, their performance can deteriorate if outliers and heteroskedasticity are present in the data, which are common features in many real-world statistics and machine learning applications. In this work, we propose a Bayesian nonparametric approach to linear regression that performs variable selection while accounting for outliers and heteroskedasticity. Our proposed model is an instance of a Dirichlet process scale mixture model with the advantage that we can derive the full conditional distributions of all parameters in closed form, hence producing an efficient Gibbs sampler for posterior inference. Moreover, we show how to extend the model to account for heavy-tailed response variables. The performance of the model is tested against competing algorithms on synthetic and real-world datasets.
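
As a hedged illustration of the kind of closed-form conditionals such a Gibbs sampler relies on (this is plain heteroskedastic Bayesian regression with independent inverse-gamma variances, not the proposed Dirichlet process scale mixture), the sketch below alternates a Gaussian update for the coefficients with per-observation inverse-gamma variance updates; all priors and the synthetic data are illustrative choices.

    import numpy as np

    def gibbs_heteroskedastic(X, y, n_iter=2000, a0=2.0, b0=1.0, tau2=10.0, seed=0):
        """Closed-form Gibbs updates for Bayesian linear regression with a separate
        noise variance per observation (priors: beta ~ N(0, tau2 I), sigma_i^2 ~ InvGamma(a0, b0))."""
        rng = np.random.default_rng(seed)
        n, p = X.shape
        beta = np.zeros(p)
        sigma2 = np.ones(n)
        draws = []
        for _ in range(n_iter):
            # beta | sigma^2, y  ~  N(mu, Sigma)   (weighted ridge posterior)
            W = 1.0 / sigma2
            Sigma = np.linalg.inv((X.T * W) @ X + np.eye(p) / tau2)
            mu = Sigma @ (X.T @ (W * y))
            beta = rng.multivariate_normal(mu, Sigma)
            # sigma_i^2 | beta  ~  InvGamma(a0 + 1/2, b0 + r_i^2 / 2)
            r = y - X @ beta
            sigma2 = 1.0 / rng.gamma(a0 + 0.5, 1.0 / (b0 + 0.5 * r ** 2))
            draws.append(beta)
        return np.array(draws)

    rng = np.random.default_rng(1)
    X = rng.normal(size=(200, 3))
    noise_sd = np.where(rng.random(200) < 0.1, 5.0, 0.5)     # 10% high-variance "outliers"
    y = X @ np.array([1.0, -2.0, 0.5]) + rng.normal(scale=noise_sd)
    print(gibbs_heteroskedastic(X, y)[1000:].mean(axis=0))    # roughly [1, -2, 0.5]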

We consider robust clustering problems in $\mathbb{R}^d$, specifically $k$-clustering problems (e.g., $k$-Median and $k$-Means) with $m$ outliers, where the cost for a given center set $C \subset \mathbb{R}^d$ aggregates the distances from $C$ to all but the furthest $m$ data points, instead of all points as in classical clustering. We focus on the $\epsilon$-coreset for robust clustering, a small proxy of the dataset that preserves the clustering cost within $\epsilon$-relative error for all center sets. Our main result is an $\epsilon$-coreset of size $O(m + \mathrm{poly}(k \epsilon^{-1}))$ that can be constructed in near-linear time. This significantly improves previous results, which either suffer an exponential dependence on $(m + k)$ [Feldman and Schulman, SODA'12] or have a weaker bi-criteria guarantee [Huang et al., FOCS'18]. Furthermore, we show that this dependence on $m$ is nearly optimal, and the fact that it is isolated from other factors may be crucial for dealing with a large number of outliers. We construct our coresets by adapting to the outlier setting a recent framework [Braverman et al., FOCS'22] that was designed for capacity-constrained clustering, overcoming the new challenge that the terms participating in the cost, particularly the $m$ excluded outlier points, depend on the center set $C$. We validate our coresets on various datasets and observe a superior size-accuracy tradeoff compared with popular baselines, including uniform sampling and sensitivity sampling. We also achieve a significant speedup of existing approximation algorithms for robust clustering using our coresets.
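
As a minimal sketch of the objective (not the coreset construction), the snippet below evaluates the robust $k$-Means cost that excludes the $m$ points furthest from the center set, showing how a handful of outliers dominates the classical cost but not the robust one; the synthetic data and the single center are illustrative choices.

    import numpy as np

    def robust_cost(points, centers, m):
        """Robust k-Means cost: sum of squared distances to the nearest center,
        excluding the m points furthest from the center set."""
        d2 = ((points[:, None, :] - centers[None, :, :]) ** 2).sum(-1).min(axis=1)
        return np.sort(d2)[: len(points) - m].sum()

    rng = np.random.default_rng(0)
    inliers = rng.normal(size=(1000, 2))
    outliers = rng.normal(loc=50.0, size=(10, 2))        # far-away points
    data = np.vstack([inliers, outliers])
    centers = np.array([[0.0, 0.0]])

    print("classical cost:", robust_cost(data, centers, m=0))    # dominated by the 10 outliers
    print("robust cost   :", robust_cost(data, centers, m=10))   # roughly the inlier cost only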

We consider and discretize a mixed formulation for linear elasticity with weakly imposed symmetry in two and three dimensions. Whereas existing methods mainly deal with simplicial or polygonal meshes, we take advantage of isogeometric analysis (IGA) and consequently allow for shapes with curved boundaries. To introduce the discrete spaces, we use isogeometric discrete differential forms defined by proper B-spline spaces. For the proposed schemes, we prove well-posedness and derive an error estimate. Furthermore, we illustrate our approach by means of several numerical examples.
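
As a small, self-contained illustration of the building block underlying the isogeometric spaces (the standard Cox-de Boor recursion for B-spline basis functions, not the paper's discrete differential forms), the sketch below evaluates a quadratic B-spline basis on an open knot vector and checks the partition-of-unity property; the knot vector and evaluation points are illustrative choices.

    import numpy as np

    def bspline_basis(i, p, knots, t):
        """Cox-de Boor recursion: value of the i-th B-spline basis function of degree p at t."""
        if p == 0:
            return 1.0 if knots[i] <= t < knots[i + 1] else 0.0
        left = 0.0
        if knots[i + p] > knots[i]:
            left = (t - knots[i]) / (knots[i + p] - knots[i]) * bspline_basis(i, p - 1, knots, t)
        right = 0.0
        if knots[i + p + 1] > knots[i + 1]:
            right = (knots[i + p + 1] - t) / (knots[i + p + 1] - knots[i + 1]) * bspline_basis(i + 1, p - 1, knots, t)
        return left + right

    # Open knot vector for quadratic B-splines on [0, 1]; the basis forms a partition of unity.
    knots = [0, 0, 0, 0.5, 1, 1, 1]
    p = 2
    for t in [0.1, 0.3, 0.7]:
        vals = [bspline_basis(i, p, knots, t) for i in range(len(knots) - p - 1)]
        print(t, np.round(vals, 4), "sum =", round(sum(vals), 10))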

Persistent Memory (PM) technologies enable program recovery to a consistent state in case of failure. To ensure this crash-consistent behavior, programs need to enforce persist ordering by employing mechanisms, such as logging and checkpointing, which introduce additional data movement. Emerging near-data processing (NDP) architectures can effectively reduce this data movement overhead by partitioning persistent programs and executing the crash-consistency mechanisms in the NDP-enabled PM. However, a significant challenge lies in maintaining persist ordering when execution is partitioned between the host CPU and NDP-enabled PM. In this work, we first propose Partitioned Persist Ordering (PPO), which ensures a correct persist ordering between the CPU and NDP devices, as well as among multiple NDP devices. PPO achieves high efficiency by reducing unnecessary synchronization between the CPU and NDP devices. Based on PPO, we prototype an NDP system, NearPM, on an FPGA platform. NearPM executes data-intensive operations in crash-consistency mechanisms with correct ordering guarantees, while the rest of the program runs on the CPU. We evaluate nine PM workloads, each supporting three crash-consistency mechanisms: logging, checkpointing, and shadow paging. Overall, NearPM achieves a 4.3-9.8X speedup in the NDP-offloaded operations and a 1.22-1.35X speedup in end-to-end execution.
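
As a toy, software-only illustration of why persist ordering matters for one of the offloaded mechanisms (undo logging; this is not NearPM's design and involves no real PM or NDP hardware), the sketch below models a persistent memory in which stores become durable only at flush/fence points and shows the log-before-data ordering that makes an update recoverable; all class and function names are hypothetical.

    import random

    class ToyPM:
        """Toy persistent memory: stores become durable only when flushed; a crash
        may persist any subset of the still-pending stores, in no particular order."""
        def __init__(self):
            self.durable = {}
            self.pending = {}
        def store(self, addr, val):
            self.pending[addr] = val
        def flush(self):                      # acts as flush + fence: pending writes persist
            self.durable.update(self.pending)
            self.pending.clear()
        def crash(self):
            for addr, val in list(self.pending.items()):
                if random.random() < 0.5:     # without a fence, ordering is not guaranteed
                    self.durable[addr] = val
            self.pending.clear()

    def update_with_undo_log(pm, addr, new_val):
        """Crash-consistent update: persist the undo-log entry *before* the data write."""
        pm.store(("log", addr), pm.durable.get(addr))
        pm.flush()                            # ordering point: log is durable before data
        pm.store(addr, new_val)
        pm.flush()
        pm.store(("log", addr), None)         # invalidate the log entry
        pm.flush()

    pm = ToyPM()
    pm.store("x", 1); pm.flush()
    update_with_undo_log(pm, "x", 2)
    print(pm.durable["x"])                    # 2; a crash mid-update would still be recoverable to 1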
