
Given an $n$-vertex undirected graph $G=(V,E,w)$, and a parameter $k\geq1$, a path-reporting distance oracle (or PRDO) is a data structure of size $S(n,k)$, that given a query $(u,v)\in V^2$, returns an $f(k)$-approximate shortest $u-v$ path $P$ in $G$ within time $q(k)+O(|P|)$. Here $S(n,k)$, $f(k)$ and $q(k)$ are arbitrary functions. A landmark PRDO due to Thorup and Zwick, with an improvement of Wulff-Nilsen, has $S(n,k)=O(k\cdot n^{1+\frac{1}{k}})$, $f(k)=2k-1$ and $q(k)=O(\log k)$. The size of this oracle is $\Omega(n\log n)$ for all $k$. Elkin and Pettie, and Neiman and Shabat, devised much sparser PRDOs, but their stretch was polynomially larger than the optimal $2k-1$. On the other hand, for non-path-reporting distance oracles, Chechik devised an oracle with $S(n,k)=O(n^{1+\frac{1}{k}})$, $f(k)=2k-1$ and $q(k)=O(1)$. In this paper we make dramatic progress in bridging the gap between path-reporting and non-path-reporting distance oracles. We devise a PRDO with size $S(n,k)=O(\lceil\frac{k\log\log n}{\log n}\rceil\cdot n^{1+\frac{1}{k}})$, stretch $f(k)=O(k)$ and query time $q(k)=O(\log\lceil\frac{k\log\log n}{\log n}\rceil)$. We can also have size $O(n^{1+\frac{1}{k}})$, stretch $O(k\cdot\lceil\frac{k\log\log n}{\log n}\rceil)$ and query time $q(k)=O(\log\lceil\frac{k\log\log n}{\log n}\rceil)$. Our results on PRDOs are based on novel constructions of approximate distance preservers, which we devise in this paper. Specifically, we show that for any $\epsilon>0$, any $k=1,2,\ldots$, any graph $G$, and any collection $\mathcal{P}$ of $p$ vertex pairs, there exists a $(1+\epsilon)$-approximate preserver with $O(\gamma(\epsilon,k)\cdot p+n\log k+n^{1+\frac{1}{k}})$ edges, where $\gamma(\epsilon,k)=(\frac{\log k}{\epsilon})^{O(\log k)}$. These new preservers are significantly sparser than the previous state-of-the-art approximate preservers due to Kogan and Parter.

Related content

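Oracle, in full Oracle Corporation (Oracle Software Systems Co., Ltd.), is the world's largest enterprise software company, headquartered in Redwood Shores, California, USA. It formally entered the Chinese market in 1989. In 2013, Oracle overtook IBM to become the world's second-largest software company after Microsoft.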

We study the convergence of message passing graph neural networks on random graph models to their continuous counterpart as the number of nodes tends to infinity. Until now, this convergence was only known for architectures with aggregation functions in the form of normalized means, or, equivalently, of an application of classical operators like the adjacency matrix or the graph Laplacian. We extend such results to a large class of aggregation functions, that encompasses all classically used message passing graph neural networks, such as attention-based message passing, max convolutional message passing or (degree-normalized) convolutional message passing. Under mild assumptions, we give non-asymptotic bounds with high probability to quantify this convergence. Our main result is based on the McDiarmid inequality. Interestingly, this result does not apply to the case where the aggregation is a coordinate-wise maximum. We treat this case separately and obtain a different convergence rate.

We consider the problem of query-efficient global max-cut on a weighted undirected graph in the value oracle model examined by [RSW18]. This model arises as a natural special case of submodular function maximization: on query $S \subseteq V$, the oracle returns the total weight of the cut between $S$ and $V \backslash S$. For most constants $c \in (0,1]$, we nail down the query complexity of achieving a $c$-approximation, for both deterministic and randomized algorithms (up to logarithmic factors). Analogously to general submodular function maximization in the same model, we observe a phase transition at $c = 1/2$: we design a deterministic algorithm for global $c$-approximate max-cut in $O(\log n)$ queries for any $c < 1/2$, and show that any randomized algorithm requires $\tilde{\Omega}(n)$ queries to find a $c$-approximate max-cut for any $c > 1/2$. Additionally, we show that any deterministic algorithm requires $\Omega(n^2)$ queries to find an exact max-cut (enough to learn the entire graph), and develop a $\tilde{O}(n)$-query randomized $c$-approximation for any $c < 1$. Our approach provides two technical contributions that may be of independent interest. One is a query-efficient sparsifier for undirected weighted graphs (prior work of [RSW18] holds only for unweighted graphs). Another is an extension of the cut dimension to rule out approximation (prior work of [GPRW20] introducing the cut dimension only rules out exact solutions).
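As a minimal sketch of the value oracle model described above, the following Python code builds a cut oracle for a small weighted graph and recovers one edge weight from three queries. The graph `example_weights` and all function names are hypothetical and chosen only for illustration; this is not the algorithm from the paper.

```python
from typing import Callable, Dict, Iterable, Tuple

Edge = Tuple[int, int]


def make_cut_oracle(weights: Dict[Edge, float]) -> Callable[[Iterable[int]], float]:
    """Return a function mapping S to the total weight of edges between S and V minus S."""

    def cut_value(S: Iterable[int]) -> float:
        side = set(S)
        # An edge crosses the cut when exactly one endpoint lies in S.
        return sum(w for (u, v), w in weights.items() if (u in side) != (v in side))

    return cut_value


def edge_weight_from_queries(cut_value: Callable[[Iterable[int]], float], u: int, v: int) -> float:
    """Recover w(u, v) from three cut queries.

    cut({u}) + cut({v}) counts the edge (u, v) twice and every other edge
    incident to u or v once, while cut({u, v}) counts exactly those other edges.
    """
    return (cut_value({u}) + cut_value({v}) - cut_value({u, v})) / 2.0


# Hypothetical 4-vertex weighted graph, used only for illustration.
example_weights: Dict[Edge, float] = {(0, 1): 2.0, (1, 2): 1.0, (2, 3): 3.0, (0, 3): 1.5}
cut_value = make_cut_oracle(example_weights)
```

Applying `edge_weight_from_queries` to every vertex pair learns the whole graph with a number of queries quadratic in $n$, which matches the intuition behind the remark that $\Omega(n^2)$ queries are enough to learn the entire graph.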

We study a fundamental problem in optimization under uncertainty. There are $n$ boxes; each box $i$ contains a hidden reward $x_i$. Rewards are drawn i.i.d. from an unknown distribution $\mathcal{D}$. For each box $i$, we see $y_i$, an unbiased estimate of its reward, which is drawn from a Normal distribution with known standard deviation $\sigma_i$ (and an unknown mean $x_i$). Our task is to select a single box, with the goal of maximizing our reward. This problem captures a wide range of applications, e.g. ad auctions, where the hidden reward is the click-through rate of an ad. Previous work in this model [BKMR12] proves that the naive policy, which selects the box with the largest estimate $y_i$, is suboptimal, and suggests a linear policy, which selects the box $i$ with the largest $y_i - c \cdot \sigma_i$, for some $c > 0$. However, no formal guarantees are given about the performance of either policy (e.g., whether their expected reward is within some factor of the optimal policy's reward). In this work, we prove that both the naive policy and the linear policy are arbitrarily bad compared to the optimal policy, even when $\mathcal{D}$ is well-behaved, e.g. has monotone hazard rate (MHR), and even under a "small tail" condition, which requires that not too many boxes have arbitrarily large noise. On the flip side, we propose a simple threshold policy that gives a constant approximation to the reward of a prophet (who knows the realized values $x_1, \dots, x_n$) under the same "small tail" condition. We prove that when this condition is not satisfied, even an optimal clairvoyant policy (that knows $\mathcal{D}$) cannot get a constant approximation to the prophet, even for MHR distributions, implying that our threshold policy is optimal against the prophet benchmark, up to constants.
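The two baseline policies mentioned above can be sketched in a few lines of Python. The function names, the example estimates, and the choice $c = 1$ are assumptions made for illustration; the threshold policy proposed in the paper is not reproduced here.

```python
from typing import Sequence


def naive_policy(y: Sequence[float]) -> int:
    """Select the index of the box with the largest estimate y_i."""
    return max(range(len(y)), key=lambda i: y[i])


def linear_policy(y: Sequence[float], sigma: Sequence[float], c: float) -> int:
    """Select the index of the box with the largest y_i - c * sigma_i, for c > 0."""
    if len(y) != len(sigma):
        raise ValueError("y and sigma must have the same length")
    return max(range(len(y)), key=lambda i: y[i] - c * sigma[i])


# Hypothetical estimates and noise levels, used only for illustration.
y_example = [1.0, 5.0, 2.0]
sigma_example = [0.1, 10.0, 0.5]
```

On these made-up numbers the naive policy picks box 1, whose large estimate comes with very large noise, while the linear policy with $c = 1$ penalizes that noise and picks box 2.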

This paper studies the message complexity of authenticated Byzantine agreement (BA) in synchronous, fully-connected distributed networks under an honest majority. We focus on the so-called {\em implicit} Byzantine agreement problem where each node starts with an input value and at the end a non-empty subset of the honest nodes should agree on a common input value by satisfying the BA properties (i.e., there can be undecided nodes). We show that a sublinear (in $n$, number of nodes) message complexity BA protocol under honest majority is possible in the standard PKI model when the nodes have access to an unbiased global coin and hash function. In particular, we present a randomized Byzantine agreement algorithm which, with high probability, achieves implicit agreement, uses $\tilde{O}(\sqrt{n})$ messages, and runs in $\tilde{O}(1)$ rounds while tolerating $(1/2 - \epsilon)n$ Byzantine nodes for any fixed $\epsilon > 0$. Here the notation $\tilde{O}$ hides an $O(\mathrm{polylog}\, n)$ factor. The algorithm requires standard cryptographic setup PKI and hash function with a static Byzantine adversary. The algorithm works in the CONGEST model and each node does not need to know the identity of its neighbors, i.e., works in the $KT_0$ model. The message complexity (and also the time complexity) of our algorithm is optimal up to a $\mathrm{polylog}\, n$ factor, as we show an $\Omega(\sqrt{n})$ lower bound on the message complexity.

The standard goal for an effective algebraic multigrid (AMG) algorithm is to develop relaxation and coarse-grid correction schemes that attenuate complementary error modes. In the nonsymmetric setting, coarse-grid correction $\Pi$ will almost certainly be nonorthogonal (and divergent) in any known inner product, meaning $\|\Pi\| > 1$. This introduces a new consideration, that one wants coarse-grid correction to be as close to orthogonal as possible, in an appropriate norm. In addition, due to non-orthogonality, $\Pi$ may actually amplify certain error modes that are in the range of interpolation. Relaxation must then not only be complementary to interpolation, but also rapidly eliminate any error amplified by the non-orthogonal correction, or the algorithm may diverge. This note develops analytic formulae on how to construct ``compatible'' transfer operators in nonsymmetric AMG such that $\|\Pi\| = 1$ in any standard matrix-induced norm. Discussion is provided on different options for norm in the nonsymmetric setting, the relation between ``ideal'' transfer operators in different norms, and insight into the convergence of nonsymmetric reduction-based AMG.

In this paper, we provide a novel framework for the analysis of generalization error of first-order optimization algorithms for statistical learning when the gradient can only be accessed through partial observations given by an oracle. Our analysis relies on the regularity of the gradient w.r.t. the data samples, and allows us to derive near-matching upper and lower bounds for the generalization error of multiple learning problems, including supervised learning, transfer learning, robust learning, distributed learning and communication-efficient learning using gradient quantization. These results hold for smooth and strongly-convex optimization problems, as well as smooth non-convex optimization problems verifying a Polyak-Lojasiewicz assumption. In particular, our upper and lower bounds depend on a novel quantity that extends the notion of conditional standard deviation, and is a measure of the extent to which the gradient can be approximated by having access to the oracle. As a consequence, our analysis provides a precise meaning to the intuition that optimization of the statistical learning objective is as hard as the estimation of its gradient. Finally, we show that, in the case of standard supervised learning, mini-batch gradient descent with increasing batch sizes and a warm start can reach a generalization error that is optimal up to a multiplicative factor, thus motivating the use of this optimization scheme in practical applications.

We consider the randomized communication complexity of the distributed $\ell_p$-regression problem in the coordinator model, for $p\in (0,2]$. In this problem, there is a coordinator and $s$ servers. The $i$-th server receives $A^i\in\{-M, -M+1, \ldots, M\}^{n\times d}$ and $b^i\in\{-M, -M+1, \ldots, M\}^n$ and the coordinator would like to find a $(1+\epsilon)$-approximate solution to $\min_{x\in\mathbb{R}^d} \|(\sum_i A^i)x - (\sum_i b^i)\|_p$. Here $M \leq \mathrm{poly}(nd)$ for convenience. This model, where the data is additively shared across servers, is commonly referred to as the arbitrary partition model. We obtain significantly improved bounds for this problem. For $p = 2$, i.e., least squares regression, we give the first optimal bound of $\tilde{\Theta}(sd^2 + sd/\epsilon)$ bits. For $p \in (1,2)$, we obtain an $\tilde{O}(sd^2/\epsilon + sd/\mathrm{poly}(\epsilon))$ upper bound. Notably, for $d$ sufficiently large, our leading order term only depends linearly on $1/\epsilon$ rather than quadratically. We also show communication lower bounds of $\Omega(sd^2 + sd/\epsilon^2)$ for $p\in (0,1]$ and $\Omega(sd^2 + sd/\epsilon)$ for $p\in (1,2]$. Our bounds considerably improve previous bounds due to (Woodruff et al. COLT, 2013) and (Vempala et al., SODA, 2020).

We consider the problem of latent bandits with cluster structure where there are multiple users, each with an associated multi-armed bandit problem. These users are grouped into \emph{latent} clusters such that the mean reward vectors of users within the same cluster are identical. At each round, a user, selected uniformly at random, pulls an arm and observes a corresponding noisy reward. The goal of the users is to maximize their cumulative rewards. This problem is central to practical recommendation systems and has received wide attention of late \cite{gentile2014online, maillard2014latent}. Now, if each user acts independently, then they would have to explore each arm independently and a regret of $\Omega(\sqrt{\mathsf{MNT}})$ is unavoidable, where $\mathsf{M}, \mathsf{N}$ are the number of arms and users, respectively. Instead, we propose LATTICE (Latent bAndiTs via maTrIx ComplEtion) which allows exploitation of the latent cluster structure to provide the minimax optimal regret of $\widetilde{O}(\sqrt{(\mathsf{M}+\mathsf{N})\mathsf{T}})$, when the number of clusters is $\widetilde{O}(1)$. This is the first algorithm to guarantee such a strong regret bound. LATTICE is based on a careful exploitation of arm information within a cluster while simultaneously clustering users. Furthermore, it is computationally efficient and requires only $O(\log{\mathsf{T}})$ calls to an offline matrix completion oracle across all $\mathsf{T}$ rounds.

We consider error-correction coding schemes for adversarial wiretap channels (AWTCs) in which the channel can a) read a fraction of the codeword bits up to a bound $r$ and b) flip a fraction of the bits up to a bound $p$. The channel can freely choose the locations of the bit reads and bit flips via a process with unbounded computational power. Codes for the AWTC are of broad interest in the area of information security, as they can provide data resiliency in settings where an attacker has limited access to a storage or transmission medium. We investigate a family of non-linear codes known as pseudolinear codes, which were first proposed by Guruswami and Indyk (FOCS 2001) for constructing list-decodable codes independent of the AWTC setting. Unlike general non-linear codes, pseudolinear codes admit efficient encoders and have succinct representations. We focus on unique decoding and show that random pseudolinear codes can achieve rates up to the binary symmetric channel (BSC) capacity $1-H_2(p)$ for any $p,r$ in the less noisy region: $p<1/2$ and $r<1-H_2(p)$ where $H_2(\cdot)$ is the binary entropy function. Thus, pseudolinear codes are the first known optimal-rate binary code family for the less noisy AWTC that admit efficient encoders. The above result can be viewed as a derandomization result of random general codes in the AWTC setting, which in turn opens new avenues for applying derandomization techniques to randomized constructions of AWTC codes. Our proof applies a novel concentration inequality for sums of random variables with limited independence which may be of interest as an analysis tool more generally.
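The rate bound and the less noisy region stated above can be checked numerically with a short sketch. The function names below are hypothetical; the code only evaluates the binary entropy function $H_2$, the BSC capacity $1-H_2(p)$, and the conditions $p<1/2$ and $r<1-H_2(p)$, and says nothing about how pseudolinear codes are constructed.

```python
import math


def binary_entropy(p: float) -> float:
    """Binary entropy H_2(p) in bits, with H_2(0) = H_2(1) = 0 by convention."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must lie in [0, 1]")
    if p == 0.0 or p == 1.0:
        return 0.0
    return -p * math.log2(p) - (1.0 - p) * math.log2(1.0 - p)


def bsc_capacity(p: float) -> float:
    """Capacity 1 - H_2(p) of a binary symmetric channel with flip probability p."""
    return 1.0 - binary_entropy(p)


def in_less_noisy_region(p: float, r: float) -> bool:
    """Check the conditions p < 1/2 and r < 1 - H_2(p) from the text."""
    return p < 0.5 and r < bsc_capacity(p)
```

For instance, a flip fraction of $p = 0.11$ gives a capacity of roughly one half, so a read fraction of $r = 0.4$ falls inside the region while $r = 0.6$ does not.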

Causality can be described in terms of a structural causal model (SCM) that carries information on the variables of interest and their mechanistic relations. For most processes of interest the underlying SCM will only be partially observable, thus causal inference tries to leverage any exposed information. Graph neural networks (GNNs), as universal approximators on structured input, are a viable candidate for causal learning, suggesting a tighter integration with SCMs. To this effect we present a theoretical analysis from first principles that establishes a novel connection between GNNs and SCMs while providing an extended view on general neural-causal models. We then establish a new model class for GNN-based causal inference that is necessary and sufficient for causal effect identification. Our empirical illustration on simulations and standard benchmarks validates our theoretical proofs.
