
In this paper, we study the problem of computing an edge-coloring in the (one-pass) W-streaming model. In this setting, the edges of an $n$-node graph arrive in an arbitrary order at a machine with relatively small space, and the goal is to design an algorithm that outputs, as a stream, a proper coloring of the edges using as few colors as possible. Behnezhad et al. [Behnezhad et al., 2019] devised the first non-trivial algorithm for this problem, which computes in $\tilde{O}(n)$ space a proper $O(\Delta^2)$-coloring w.h.p. (here $\Delta$ is the maximum degree of the graph). Subsequent papers improved upon this result, the latest of which [Ansari et al., 2022] shows that it is possible to deterministically compute an $O(\Delta^2/s)$-coloring in $O(ns)$ space. However, none of these improvements succeeded in reducing the number of colors to $O(\Delta^{2-\epsilon})$ while keeping the same space bound of $\tilde{O}(n)$. In particular, no progress was made on the question of whether an $O(\Delta)$-coloring can be computed with roughly $O(n)$ space, which [Behnezhad et al., 2019] stated as a major open problem. In this paper, we bypass the quadratic bound by presenting a new randomized $\tilde{O}(n)$-space algorithm that uses $\tilde{O}(\Delta^{1.5})$ colors.
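
As a point of reference for what the output stream must satisfy: a coloring is proper if no two edges sharing an endpoint receive the same color. The sketch below is only a test harness for that guarantee (it stores all colors, so it does not respect the streaming space bound); all names are illustrative.

```python
from collections import defaultdict

def check_proper(colored_edges):
    """Verify that a streamed output of ((u, v), color) pairs is a proper
    edge coloring: no two edges sharing an endpoint get the same color.
    A test harness for the guarantee above, not a streaming algorithm."""
    used = defaultdict(set)  # vertex -> colors already seen on its edges
    for (u, v), c in colored_edges:
        if c in used[u] or c in used[v]:
            return False
        used[u].add(c)
        used[v].add(c)
    return True

# a path 0-1-2: the two edges share vertex 1, so they need distinct colors
print(check_proper([((0, 1), 0), ((1, 2), 0)]))  # False
print(check_proper([((0, 1), 0), ((1, 2), 1)]))  # True
```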

Related Content

In distributed computing, slower nodes (stragglers) usually become a bottleneck. Gradient Coding (GC), introduced by Tandon et al., is an efficient technique that uses principles of error-correcting codes to distribute gradient computation in the presence of stragglers. In this paper, we consider the distributed computation of a sequence of gradients $\{g(1),g(2),\ldots,g(J)\}$, where the processing of each gradient $g(t)$ starts in round $t$ and finishes by round $t+T$. Here $T\geq 0$ denotes a delay parameter. In the GC scheme, coding is only across computing nodes, which results in a solution with $T=0$. On the other hand, allowing $T>0$ makes it possible to design schemes that exploit the temporal dimension as well. In this work, we propose two schemes that improve upon GC. Our first scheme combines GC with selective repetition of previously unfinished tasks and achieves improved straggler mitigation. In our second scheme, which constitutes our main contribution, we apply GC to a subset of the tasks and repetition to the remainder, and then multiplex these two classes of tasks across workers and rounds in an adaptive manner, based on past straggler patterns. Using theoretical analysis, we demonstrate that our second scheme achieves a significant reduction in computational load. In our experiments, we study a practical setting of concurrently training multiple neural networks over an AWS Lambda cluster involving 256 worker nodes, where our framework naturally applies. We demonstrate that the latter scheme can yield a 16\% improvement in runtime over the baseline GC scheme, in the presence of naturally occurring, non-simulated stragglers.
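
As a point of comparison, the $T=0$ baseline can be illustrated with the fractional-repetition instantiation of GC, sketched below under the simplifying assumption that $s+1$ divides the number of workers; this is not the paper's adaptive scheme, only the kind of baseline it multiplexes with repetition.

```python
def fr_assignment(n_workers, s):
    """Fractional-repetition assignment: workers are split into groups of
    s + 1, and all workers in group g compute the same s + 1 data parts,
    so any n_workers - s finishers cover every group."""
    assert n_workers % (s + 1) == 0
    return {w: list(range((w // (s + 1)) * (s + 1),
                          (w // (s + 1) + 1) * (s + 1)))
            for w in range(n_workers)}

def fr_decode(partials, n_workers, s):
    """Recover the full gradient from any >= n_workers - s partial sums
    (each worker reports the sum of gradients over its assigned parts):
    keep one representative per group and add them up."""
    rep = {}
    for w, g in partials.items():
        rep.setdefault(w // (s + 1), g)
    assert len(rep) == n_workers // (s + 1), "more than s stragglers"
    return sum(rep.values())

# toy demo: 6 workers, s = 2 tolerated stragglers, scalar "gradient" of part i is i
n, s = 6, 2
parts = fr_assignment(n, s)
partial = {w: sum(parts[w]) for w in range(n)}    # worker-side partial sums
finished = {w: partial[w] for w in [0, 3, 4, 5]}  # workers 1 and 2 straggle
print(fr_decode(finished, n, s), sum(range(n)))   # both print 15
```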

We study synthetic data release for answering multiple linear queries over a set of database tables in a differentially private way. Two special cases have been considered in the literature: how to release a synthetic dataset for answering multiple linear queries over a single table, and how to release the answer for a single counting (join size) query over a set of database tables. Compared to the single-table case, the join operator makes query answering challenging, since the sensitivity (i.e., by how much an individual data record can affect the answer) could be heavily amplified by complex join relationships. We present an algorithm for the general problem, and prove a lower bound illustrating that our general algorithm achieves parameterized optimality (up to logarithmic factors) on some simple queries (e.g., two-table join queries) in the most commonly-used privacy parameter regimes. For the case of hierarchical joins, we present a data partition procedure that exploits the concept of {\em uniformized sensitivities} to further improve the utility.
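
A minimal sketch of the sensitivity-amplification phenomenon for a single two-table counting query, assuming a public fan-out bound (`max_fanout` is an illustrative parameter, not the paper's mechanism); the algorithm for general queries is considerably more involved.

```python
import numpy as np
from collections import Counter

def dp_join_count(t1_keys, t2_keys, eps, max_fanout):
    """Laplace mechanism for a single two-table join-size query.
    Removing one T1 record changes the count by the number of matching
    T2 rows, so the sensitivity is the fan-out bound rather than 1 --
    the amplification-by-join effect discussed above."""
    c2 = Counter(t2_keys)
    true_count = sum(c2[k] for k in t1_keys)          # |T1 join T2| on the key
    return true_count + np.random.laplace(scale=max_fanout / eps)
```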

Differential Privacy (DP) ensures that training a machine learning model does not leak private data. However, the cost of DP is lower model accuracy or higher sample complexity. In practice, we may have access to auxiliary public data that is free of privacy concerns. This has motivated the recent study of what role public data might play in improving the accuracy of DP models. In this work, we assume access to a given amount of public data and settle the following fundamental open questions: 1. What is the optimal (worst-case) error of a DP model trained over a private data set while having access to side public data? What algorithms are optimal? 2. How can we harness public data to improve DP model training in practice? We consider these questions in both the local and central models of DP. To answer the first question, we prove tight (up to constant factors) lower and upper bounds that characterize the optimal error rates of three fundamental problems: mean estimation, empirical risk minimization, and stochastic convex optimization. We prove that public data reduces the sample complexity of DP model training. Perhaps surprisingly, we show that the optimal error rates can be attained (up to constants) either by discarding the private data and training a public model, or by treating the public data as if it were private and running an optimal DP algorithm. To address the second question, we develop novel algorithms that are "even more optimal" (i.e., with better constants) than the asymptotically optimal approaches described above. For local DP mean estimation with public data, our algorithm is optimal including constants. Empirically, our algorithms show benefits over existing approaches for DP model training with side access to public data.
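
A minimal sketch of the two baselines from the first question, in a one-dimensional mean-estimation setting with an assumed clipping bound; the "even more optimal" algorithms of the second question refine these.

```python
import numpy as np

def pooled_dp_mean(priv, pub, eps, delta, clip=1.0):
    """First baseline: treat public points as if they were private, pool
    everything, and run a standard clipped Gaussian-mechanism mean.
    The clipping bound and 1-d setting are illustrative assumptions."""
    data = np.clip(np.concatenate([priv, pub]), -clip, clip)
    sens = 2.0 * clip / data.size  # replace-one sensitivity of the mean
    sigma = sens * np.sqrt(2.0 * np.log(1.25 / delta)) / eps
    return data.mean() + np.random.normal(scale=sigma)

def public_only_mean(pub):
    """Second baseline: discard the private data entirely; no noise needed."""
    return np.mean(pub)
```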

For many applications involving a sequence of linear systems with slowly changing system matrices, subspace recycling, which exploits relationships among systems and reuses search-space information, can achieve substantial savings in total iterations across the sequence of linear system solves. However, for general (i.e., non-identity) shifted systems with the shift value varying over a wide range, the properties of the linear systems vary widely as well, which makes recycling less effective. If such a sequence of systems is embedded in a nonlinear iteration, the problem is compounded, and special approaches are needed to use recycling effectively. In this paper, we develop new, more efficient Krylov subspace recycling approaches for large-scale image reconstruction and restoration techniques that employ a nonlinear iteration to compute a suitable regularization matrix. For each new regularization matrix, we need to solve regularized linear systems with matrices ${\bf A} + \gamma_\ell {\bf E}_k$, for a sequence of regularization parameters, $\gamma_\ell$, to find the optimally regularized solution that, in turn, will be used to update the regularization matrix. We analyze system and solution characteristics to choose appropriate techniques to solve each system rapidly. Specifically, we use an inner-outer recycling approach with a larger, principal recycle space for each nonlinear step and smaller recycle spaces for each shift. We propose an efficient way to obtain good initial guesses from the principal recycle space and the smaller shift-specific recycle spaces, leading to fast convergence. Our method substantially reduces the total number of matrix-vector products that would arise in a naive approach. Our approach applies more generally to sequences of shifted systems where the matrices in the sum are positive semi-definite.
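
A generic sketch of how a recycle space yields a warm start for one shifted solve, assuming $U$ has orthonormal columns; the inner-outer scheme described above layers a principal space and shift-specific spaces on top of this idea.

```python
import numpy as np

def warm_start_from_recycle_space(A, E, gamma, b, U):
    """Galerkin projection of the shifted system (A + gamma*E) x = b onto
    a recycle space span(U): solve the small projected problem and lift
    the result back as an initial guess for the Krylov solver."""
    M = A + gamma * E
    H = U.T @ (M @ U)                  # small k x k projected matrix
    y = np.linalg.solve(H, U.T @ b)    # projected solve
    return U @ y                       # initial guess in the full space
```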

We present two approximate counting algorithms with $\widetilde{O}(n^{2-c}/\varepsilon^2)$ running time for some constant $c > 0$ and accuracy $\varepsilon$: (1) for the hard-core model when strong spatial mixing (SSM) is sufficiently fast; (2) for spin systems with SSM on planar graphs with quadratic growth, such as $\mathbb{Z}^2$. The latter algorithm also extends to (not necessarily planar) graphs with polynomial growth, such as $\mathbb{Z}^d$, albeit with a running time of the form $\widetilde{O}(n^2\varepsilon^{-2}/2^{c(\log n)^{1/d}})$ for some constant $c > 0$ and $d$ being the exponent of the polynomial growth. Our technique utilizes Weitz's self-avoiding walk tree (STOC, 2006) and the recent marginal sampler of Anand and Jerrum (SIAM J. Comput., 2022).
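
The computational core both results build on is the hard-core tree recursion; a minimal sketch (exact and untruncated, on an explicit rooted tree) is below. Weitz's reduction maps marginals on a general graph to this computation on its SAW tree, and SSM justifies truncating the recursion at logarithmic depth, which is what the running times above exploit.

```python
def hardcore_ratio(children, v, lam):
    """Occupation ratio R_v = P[v occupied] / P[v unoccupied] for the
    hard-core model on a rooted tree, via the standard recursion
    R_v = lam * prod over children u of 1 / (1 + R_u)."""
    prod = 1.0
    for u in children.get(v, []):
        prod *= 1.0 / (1.0 + hardcore_ratio(children, u, lam))
    return lam * prod

children = {0: [1, 2], 1: [3]}            # toy rooted tree as a child map
R = hardcore_ratio(children, 0, lam=0.5)
print(R / (1.0 + R))                      # marginal P[root occupied] = 0.2
```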

We consider the Sobolev embedding operator $E_s : H^s(\Omega) \to L_2(\Omega)$ and its role in the solution of inverse problems. In particular, we collect various properties and investigate different characterizations of its adjoint operator $E_s^*$, which is a common component in both iterative and variational regularization methods. These include variational representations and connections to boundary value problems, Fourier and wavelet representations, as well as connections to spatial filters. Moreover, we consider characterizations in terms of Fourier series, singular value decompositions and frame decompositions, as well as representations in finite dimensional settings. While many of these results are already known to researchers from different fields, a detailed and general overview or reference work containing rigorous mathematical proofs is still missing. Hence, in this paper we aim to fill this gap by collecting, introducing, and generalizing a large number of characterizations of $E_s^*$ and by discussing their use in regularization methods for solving inverse problems. The resulting compilation can serve both as a reference and as a useful guide for efficient numerical implementation in practice.
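
For orientation, the simplest such characterization can be stated compactly in the full-space case $\Omega = \mathbb{R}^d$; the Bessel-potential norm on $H^s$ is an assumed convention here, and on bounded domains boundary conditions enter, as the paper discusses.

```latex
% Defining relation of the adjoint, and its Fourier-multiplier form on
% \Omega = \mathbb{R}^d (assuming the Bessel-potential norm on H^s):
\langle E_s^* y, v \rangle_{H^s} = \langle y, E_s v \rangle_{L_2}
  \quad \text{for all } v \in H^s(\mathbb{R}^d),
\qquad
\widehat{E_s^* y}(\xi) = \bigl(1 + |\xi|^2\bigr)^{-s}\, \widehat{y}(\xi),
\quad \text{i.e.} \quad E_s^* = (I - \Delta)^{-s}.
```

In this form $E_s^*$ acts as a low-pass filter damping high frequencies, which is the "spatial filter" viewpoint mentioned above.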

We analyze classical stochastic projected gradient methods under a general dependent data sampling scheme for constrained smooth nonconvex optimization. We show a worst-case convergence rate of $\tilde{O}(t^{-1/4})$ and a complexity of $\tilde{O}(\varepsilon^{-4})$ for achieving an $\varepsilon$-near stationary point, measured by the norm of the gradient of the Moreau envelope and the gradient mapping. While classical convergence guarantees require i.i.d. data sampled from the target distribution, we require only a mild mixing condition on the conditional distribution, which holds for a wide class of Markov chain sampling algorithms. This improves the existing complexity for constrained smooth nonconvex optimization with dependent data from $\tilde{O}(\varepsilon^{-8})$ to $\tilde{O}(\varepsilon^{-4})$, with a significantly simpler analysis. We illustrate the generality of our approach by deriving convergence results with dependent data for stochastic proximal gradient methods, the adaptive stochastic gradient algorithm AdaGrad, and the stochastic gradient algorithm with heavy-ball momentum. As an application, we obtain the first online nonnegative matrix factorization algorithms for dependent data based on stochastic projected gradient methods with adaptive step sizes and an optimal rate of convergence.
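
The method itself is unchanged from the i.i.d. setting; only the data source differs. A minimal sketch with illustrative signatures:

```python
import numpy as np

def projected_sgd_markov(grad, proj, next_state, x0, xi0, T):
    """Projected SGD with Markovian samples: the update is the classical
    x <- Proj_C(x - eta_t * g(x, xi_t)); the only change is that xi_t is
    the next state of a Markov chain rather than an i.i.d. draw. The
    rate above is stated for the gradient mapping / Moreau envelope."""
    x, xi = x0, xi0
    for t in range(T):
        xi = next_state(xi)                      # dependent sample
        x = proj(x - (t + 1) ** -0.5 * grad(x, xi))
    return x

def proj_ball(z):
    """Example constraint set C: Euclidean projection onto the unit ball."""
    return z / max(1.0, np.linalg.norm(z))
```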

Given a graph, a $k$-plex is a vertex set in which each vertex is non-adjacent to at most $k-1$ other vertices in the set. The maximum $k$-plex problem, which asks for the largest $k$-plex in a given graph, is an important but computationally challenging problem in applications such as graph search and community detection. So far, there are a number of empirical algorithms without sufficient theoretical explanation of their efficiency. We try to bridge this gap by defining a novel parameter of the input instance, $g_k(G)$, the gap between the degeneracy bound and the size of the maximum $k$-plex in the given graph, and presenting an exact algorithm parameterized by $g_k(G)$. In other words, for constant $k$ we design an algorithm with running time polynomial in the size of the input graph and exponential in $g_k(G)$. Usually, $g_k(G)$ is small and bounded by $O(\log{(|V|)})$ in real-world graphs, so the algorithm runs in polynomial time. We also carry out extensive experiments and show that the algorithm is competitive with state-of-the-art solvers. Additionally, for large $k$ values such as $15$ and $20$, our algorithm outperforms existing algorithms.
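
To make the definition concrete, a checker for the $k$-plex property (the adjacency-set representation is illustrative):

```python
def is_k_plex(S, adj, k):
    """S is a k-plex iff every v in S has at least |S| - k neighbours
    inside S, i.e. is non-adjacent to at most k - 1 of the other members.
    adj maps each vertex to its neighbour set."""
    S = set(S)
    return all(len(adj[v] & S) >= len(S) - k for v in S)

# a 4-cycle is a 2-plex (every vertex misses exactly one other vertex),
# but not a 1-plex, since a 1-plex is exactly a clique
adj = {0: {1, 3}, 1: {0, 2}, 2: {1, 3}, 3: {0, 2}}
print(is_k_plex({0, 1, 2, 3}, adj, 2), is_k_plex({0, 1, 2, 3}, adj, 1))
# True False
```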

We study the asymptotic generalization of an overparameterized linear model for multiclass classification under the Gaussian covariates bi-level model introduced in Subramanian et al.~'22, where the number of data points, features, and classes all grow together. We fully resolve the conjecture posed in Subramanian et al.~'22, matching the predicted regimes for generalization. Furthermore, our new lower bounds are akin to an information-theoretic strong converse: they establish that the misclassification rate goes to 0 or 1 asymptotically. One surprising consequence of our tight results is that the min-norm interpolating classifier can be asymptotically suboptimal relative to noninterpolating classifiers in the regime where the min-norm interpolating regressor is known to be optimal. The key to our tight analysis is a new variant of the Hanson-Wright inequality which is broadly useful for multiclass problems with sparse labels. As an application, we show that the same type of analysis can be used to analyze the related multilabel classification problem under the same bi-level ensemble.
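
For concreteness, the min-norm interpolating classifier discussed above, in the standard one-hot formulation (an illustrative setup, not the paper's analysis):

```python
import numpy as np

def min_norm_interpolating_classifier(X, y, n_classes):
    """Minimum-Frobenius-norm interpolator of one-hot labels:
    W = argmin ||W||_F subject to X W = Y, computed as W = pinv(X) Y.
    In the overparameterized regime (more features than points) this fits
    the training data exactly, yet, as shown above, it can be
    asymptotically suboptimal relative to noninterpolating classifiers."""
    Y = np.eye(n_classes)[y]           # n x k one-hot label matrix
    return np.linalg.pinv(X) @ Y       # d x k coefficient matrix

def predict(W, X_new):
    return np.argmax(X_new @ W, axis=1)
```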

We explore the switching-algebraic computation of the Banzhaf indices for voting systems, both general (not necessarily monotone) and monotone. This computation is achieved via (a) two Boolean-quotient formulas that are valid when the voting system is not necessarily monotone (e.g., when coalition formation is restricted), and (b) four Boolean differencing formulas and six Boolean-quotient formulas that are applicable when the decision switching function is positively polarized unate. We also provide switching-algebraic formulas for certain Banzhaf-related indices, including the power-to-initiate index (PII) and the power-to-prevent index (PPI), as well as satisfaction indices. Moreover, we briefly address other Banzhaf-related indices, including the Strict Power Index (SPI) and the Public Good Index (PGI). We illustrate the formulas for the various indices with four examples of voting systems, each considered first as an unrestricted monotone system and then subjected to a restriction on the formation of a coalition between two particular voters.
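
To fix what the indices compute, a brute-force (exponential-time) reference implementation of the unnormalized Banzhaf index; the decision function stands in for the switching function, and restrictions on coalition formation can be folded into it.

```python
from itertools import combinations

def banzhaf(n, wins):
    """Unnormalized Banzhaf index by enumeration: voter i's index is the
    fraction of the 2^(n-1) coalitions of the other voters for which i
    is a swing (joining turns a losing coalition into a winning one).
    `wins` is the decision (switching) function on coalitions."""
    out = []
    for i in range(n):
        others = [v for v in range(n) if v != i]
        swings = sum(1
                     for r in range(len(others) + 1)
                     for S in combinations(others, r)
                     if wins(set(S) | {i}) and not wins(set(S)))
        out.append(swings / 2 ** (n - 1))
    return out

# weighted-majority example [q = 4; 3, 2, 1]: a coalition wins iff its
# total weight reaches the quota 4
w = [3, 2, 1]
print(banzhaf(3, lambda S: sum(w[v] for v in S) >= 4))  # [0.75, 0.25, 0.25]
```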
