
In this paper, we consider the problem of maintaining a $(1-\varepsilon)$-approximate maximum weight matching in a dynamic graph $G$ while an adversary makes changes to the edges of the graph. In the fully dynamic setting, where both edge insertions and deletions are allowed, Gupta and Peng gave an algorithm for this problem with an update time of $\tilde{O}_{\varepsilon}(\sqrt{m})$. We study a natural relaxation of this problem, namely the decremental model, where the adversary is only allowed to delete edges. For the cardinality version of this problem in general (possibly non-bipartite) graphs, Assadi, Bernstein, and Dudeja gave a decremental algorithm with update time $O_{\varepsilon}(\text{poly}(\log n))$. However, beating $\tilde{O}_{\varepsilon}(\sqrt{m})$ update time remained an open problem for the \emph{weighted} version in \emph{general graphs}. In this paper, we bridge the gap between unweighted and weighted general graphs in the decremental setting. We give an $O_{\varepsilon}(\text{poly}(\log n))$ update time algorithm that maintains a $(1-\varepsilon)$-approximate maximum weight matching under adversarial deletions. Like the decremental algorithm of Assadi, Bernstein, and Dudeja, our algorithm is randomized but works against an adaptive adversary. It also matches the time bound for the cardinality version up to dependencies on $\varepsilon$ and a $\log R$ factor, where $R$ is the ratio between the maximum and minimum edge weight in $G$.

Related content

Attention computation incurs both $O(n^2)$ time complexity and $O(n^2)$ space complexity simultaneously, which makes deploying Large Language Models (LLMs) in streaming applications with long contexts require substantial computational resources. At its recent DevDay (Nov 6, 2023), OpenAI released a new model that supports a 128K-token document; in this paper, we focus on the memory-efficiency issue when the context length $n$ is much greater than 128K ($n \gg 2^d$). Considering a single-layer self-attention with Query, Key, and Value matrices $Q, K, V \in \mathbb{R}^{n \times d}$, the polynomial method approximates the attention output $T \in \mathbb{R}^{n \times d}$. It accomplishes this by constructing $U_1, U_2 \in \mathbb{R}^{n \times t}$ to expedite the computation of ${\sf Attn}(Q, K, V)$ to $n^{1+o(1)}$ time. Despite this, computing the approximated attention matrix $U_1U_2^\top \in \mathbb{R}^{n \times n}$ still requires $O(n^2)$ space, leading to significant memory usage. In response to these challenges, we introduce a new algorithm that reads the data in a single streaming pass. It uses sublinear space $o(n)$ to store three sketch matrices, eliminating the need to store $K$ and $V$ exactly. Notably, our algorithm is exceptionally memory-efficient for super-long token streams: as the token length $n$ grows, our error guarantee diminishes while the memory usage remains nearly constant. This unique attribute underscores the potential of our technique for efficiently handling LLMs in streaming applications.
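To make the streaming regime concrete, here is a minimal Python sketch of one-pass attention with constant-size state. It uses Performer-style positive random features as a stand-in for the paper's polynomial sketches; the class name, the feature map, and the sketch width $t$ are all illustrative assumptions, not the paper's construction:

```python
import numpy as np

def positive_random_features(x, W):
    # phi(x) = exp(W x - |x|^2 / 2) / sqrt(t): a positive random-feature
    # approximation of the softmax kernel exp(q . k).
    return np.exp(x @ W.T - np.sum(x**2, axis=-1, keepdims=True) / 2) \
        / np.sqrt(W.shape[0])

class StreamingAttention:
    """One-pass attention sketch: state is O(t * d), independent of the
    stream length n, so K and V are never stored exactly."""
    def __init__(self, d, t=64, seed=0):
        rng = np.random.default_rng(seed)
        self.W = rng.standard_normal((t, d))  # shared random projection
        self.S = np.zeros((t, d))             # sketch of sum_i phi(k_i) v_i^T
        self.z = np.zeros(t)                  # sketch of sum_i phi(k_i)

    def update(self, k, v):
        # Absorb one (key, value) token pair, then discard it.
        phi_k = positive_random_features(k[None, :], self.W)[0]
        self.S += np.outer(phi_k, v)
        self.z += phi_k

    def query(self, q):
        # Approximate softmax attention of q over all tokens seen so far.
        phi_q = positive_random_features(q[None, :], self.W)[0]
        return (phi_q @ self.S) / (phi_q @ self.z + 1e-9)
```

As in the abstract, the state size depends only on the sketch width $t$ and the head dimension $d$: growing $n$ leaves the memory footprint unchanged.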

We give a quick survey of the various fixed point theorems in computability theory, partial combinatory algebra, and the theory of numberings, as well as generalizations based on those. We also point out several open problems connected to these.

We consider a variant of matrix completion where entries are revealed in a biased manner, adopting a model akin to that introduced by Ma and Chen. Instead of treating this observation bias as a disadvantage, as is typically the case, the goal is to exploit the shared information between the bias and the outcome of interest to improve predictions. Towards this, we consider a natural model where the observation pattern and the outcome of interest are driven by the same set of underlying latent (unobserved) factors. This leads to a two-stage matrix completion algorithm: first, recover (distances between) the latent factors by applying matrix completion to the fully observed noisy binary matrix corresponding to the observation pattern; second, use the recovered latent factors as features and the sparsely observed noisy outcomes as labels to perform non-parametric supervised learning. Our finite-sample error analysis suggests that, ignoring logarithmic factors, this approach is competitive with the corresponding parametric supervised-learning rates. That is, by exploiting the shared information between the bias and the outcomes, the two-stage method performs comparably to an oracle with access to the unobserved latent factors. In an empirical evaluation on a real-world dataset, the two-stage algorithm produces estimates with 30x smaller mean squared error than traditional matrix completion methods, suggesting the utility of the model and method proposed in this work.
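The two-stage procedure can be sketched in a few lines of Python. This is only a schematic rendering of the pipeline described above: a truncated SVD stands in for the first-stage matrix completion (the binary pattern matrix is fully observed), and k-nearest-neighbour averaging stands in for the non-parametric second stage; the function and parameter names are ours.

```python
import numpy as np

def two_stage_completion(A, Y, mask, rank=5, k=10):
    """Hedged sketch of the two-stage method described above.
    A    : (n, m) fully observed binary observation-pattern matrix
    Y    : (n, m) outcomes, valid only where mask is True
    rank : assumed number of latent factors
    k    : neighbours for the non-parametric second stage
    """
    # Stage 1: recover latent row factors from the observation pattern
    # via a rank-r truncated SVD.
    U, s, Vt = np.linalg.svd(A, full_matrices=False)
    row_factors = U[:, :rank] * np.sqrt(s[:rank])

    # Stage 2: non-parametric supervised learning -- predict each missing
    # entry by averaging the observed outcomes of the k nearest rows
    # (in latent-factor distance) within the same column.
    dist = np.linalg.norm(row_factors[:, None] - row_factors[None, :], axis=-1)
    Yhat = np.array(Y, dtype=float)
    n, m = Y.shape
    for j in range(m):
        obs = np.where(mask[:, j])[0]
        if len(obs) == 0:
            continue
        for i in np.where(~mask[:, j])[0]:
            nn = obs[np.argsort(dist[i, obs])[:k]]
            Yhat[i, j] = Y[nn, j].mean()
    return Yhat
```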

Leroux has proved that unreachability in Petri nets can be witnessed by a Presburger separator, i.e. if a marking $\vec{m}_\text{src}$ cannot reach a marking $\vec{m}_\text{tgt}$, then there is a formula $\varphi$ of Presburger arithmetic such that: $\varphi(\vec{m}_\text{src})$ holds; $\varphi$ is forward invariant, i.e., $\varphi(\vec{m})$ and $\vec{m} \rightarrow \vec{m}'$ imply $\varphi(\vec{m}')$; and $\neg \varphi(\vec{m}_\text{tgt})$ holds. While these separators could be used as explanations and as formal certificates of unreachability, this has not yet been the case due to their worst-case size, which is at least Ackermannian, and the complexity of checking that a formula is a separator, which is at least exponential (in the formula size). We show that, in continuous Petri nets, these two problems can be overcome. We introduce locally closed separators, and prove that: (a) unreachability can be witnessed by a locally closed separator computable in polynomial time; (b) checking whether a formula is a locally closed separator is in NC (so, simpler than unreachability, which is P-complete). We further consider the more general problem of (existential) set-to-set reachability, where two sets of markings are given as convex polytopes. We show that, while our approach does not extend directly, we can efficiently certify unreachability via an altered Petri net.
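As a toy illustration of the separator definition above (our example, for a discrete Petri net, not taken from the paper): consider a net with a single place and one transition $x \rightarrow x + 2$, with $\vec{m}_\text{src} = 0$ and $\vec{m}_\text{tgt} = 3$. The formula $\varphi(x) \equiv (x \bmod 2 = 0)$ is a Presburger separator: $\varphi(0)$ holds; $\varphi$ is forward invariant, since firing the transition preserves parity; and $\neg\varphi(3)$ holds, certifying that $3$ is unreachable from $0$.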

We present an automated technique for computing a map between two genus-zero shapes that matches semantically corresponding regions to one another. The lack of annotated data prohibits direct inference of 3D semantic priors; instead, current state-of-the-art methods predominantly optimize geometric properties or require varying amounts of manual annotation. To overcome the lack of annotated training data, we distill semantic matches from pre-trained vision models: our method renders the pair of 3D shapes from multiple viewpoints; the resulting renders are then fed into an off-the-shelf image-matching method, which leverages a pretrained visual model to produce matched feature points. This yields semantic correspondences, which can be projected back onto the 3D shapes, producing a raw matching that is inaccurate and inconsistent across viewpoints. These correspondences are refined and distilled into an inter-surface map by a dedicated optimization scheme that promotes bijectivity and continuity of the output map. We show that our approach generates semantic surface-to-surface maps without manual annotation or any 3D training data. Furthermore, it proves effective in scenarios with high semantic complexity, where objects are non-isometrically related, as well as in situations where they are nearly isometric.
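As an illustration of the lifting step, the following sketch back-projects 2D feature matches onto the meshes by snapping each keypoint to the nearest projected vertex. This is a simplification of the pipeline described above: it assumes a known camera matrix for both renders and ignores occlusion, which a real implementation would resolve with a depth buffer; all names are ours.

```python
import numpy as np

def lift_matches_to_3d(verts_a, verts_b, cam, matches_2d):
    """Back-project 2D feature matches onto two meshes.
    verts_a, verts_b : (n, 3), (m, 3) mesh vertices
    cam              : (3, 4) camera projection matrix (same viewpoint)
    matches_2d       : (k, 4) rows (xa, ya, xb, yb) from the image matcher
    Returns (k, 2) vertex-index pairs: one raw correspondence per match.
    """
    def project(V):
        Vh = np.c_[V, np.ones(len(V))] @ cam.T  # homogeneous coordinates
        return Vh[:, :2] / Vh[:, 2:3]           # perspective divide

    pa, pb = project(verts_a), project(verts_b)
    # Snap each keypoint to the nearest projected vertex on each mesh.
    ia = np.argmin(np.linalg.norm(pa[None] - matches_2d[:, None, :2], axis=-1), axis=1)
    ib = np.argmin(np.linalg.norm(pb[None] - matches_2d[:, None, 2:], axis=-1), axis=1)
    return np.stack([ia, ib], axis=1)
```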

Recently, an interesting phenomenon called grokking has gained much attention: generalization occurs long after the model has initially overfitted the training data. We try to understand this seemingly strange phenomenon through the robustness of the neural network. From a robustness perspective, we show that the popular $l_2$ weight norm of the neural network is in fact a sufficient condition for grokking. Based on these observations, we propose perturbation-based methods to speed up the generalization process. In addition, we examine the standard training process on the modular addition dataset and find that, before grokking, it hardly learns other basic group operations such as the commutative law. Interestingly, the speed-up in generalization obtained with our proposed method can be explained by the learning of the commutative law, a necessary condition for the model to grok on the test dataset. We also find empirically that the $l_2$ norm does not correlate with grokking on the test data in a timely way; we therefore propose new metrics based on robustness and information theory, and find that these metrics correlate well with the grokking phenomenon and may be used to predict it.
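For readers who want to track such quantities during training, here is a minimal PyTorch sketch of the $l_2$ weight norm together with a generic perturbation-based robustness proxy. The latter is our hedged stand-in; the paper's exact robustness and information-theoretic metrics may be defined differently.

```python
import torch

def l2_weight_norm(model: torch.nn.Module) -> torch.Tensor:
    """Global l2 norm of all parameters (the metric discussed above)."""
    return torch.sqrt(sum((p ** 2).sum() for p in model.parameters()))

@torch.no_grad()
def perturbed_accuracy(model, x, y, sigma=0.01, n_trials=8):
    """Robustness proxy: mean accuracy under Gaussian weight noise."""
    state = {k: v.clone() for k, v in model.state_dict().items()}
    accs = []
    for _ in range(n_trials):
        for p in model.parameters():
            p.add_(sigma * torch.randn_like(p))   # perturb the weights
        accs.append((model(x).argmax(-1) == y).float().mean().item())
        model.load_state_dict(state)              # undo the perturbation
    return sum(accs) / len(accs)
```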

We introduce a test for the conditional independence of random variables $X$ and $Y$ given a random variable $Z$, specifically by sampling from the joint distribution $(X,Y,Z)$, binning the support of the distribution of $Z$, and conducting multiple $p$-Wasserstein two-sample tests. Under a $p$-Wasserstein Lipschitz assumption on the conditional distributions $\mathcal{L}_{X|Z}$, $\mathcal{L}_{Y|Z}$, and $\mathcal{L}_{(X,Y)|Z}$, we show that it is possible to control the Type I and Type II error of this test, and give examples of explicit finite-sample error bounds in the case where the distribution of $Z$ has compact support.
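A simplified variant of the test can be sketched as follows. We use an exact assignment-based $p$-Wasserstein distance between equal-size samples and calibrate by permutation rather than by the paper's finite-sample bounds, so everything below (names, binning rule, calibration) is an illustrative assumption.

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

def empirical_wasserstein(a, b, p=2):
    """Exact p-Wasserstein distance between two equal-size empirical
    measures on R^d, computed via an optimal assignment."""
    cost = np.linalg.norm(a[:, None, :] - b[None, :, :], axis=-1) ** p
    r, c = linear_sum_assignment(cost)
    return cost[r, c].mean() ** (1.0 / p)

def ci_test_pvalue(x, y, z, n_bins=4, n_perm=100, p=2, seed=0):
    """Permutation p-value for H0: X independent of Y given Z.  In each
    Z-bin, the joint (X, Y) sample is compared to a product-of-marginals
    sample obtained by permuting Y within the bin."""
    rng = np.random.default_rng(seed)
    edges = np.quantile(z, np.linspace(0, 1, n_bins + 1)[1:-1])
    bins = np.digitize(z, edges)          # quantile-bin the support of Z
    stat, null = 0.0, np.zeros(n_perm)
    for b in range(n_bins):
        xb, yb = x[bins == b], y[bins == b]
        if len(xb) < 2:
            continue
        joint = np.column_stack([xb, yb])
        prod = np.column_stack([xb, rng.permutation(yb)])
        stat += empirical_wasserstein(joint, prod, p)
        for t in range(n_perm):           # null: two independent products
            null[t] += empirical_wasserstein(
                np.column_stack([xb, rng.permutation(yb)]),
                np.column_stack([xb, rng.permutation(yb)]), p)
    return (1 + np.sum(null >= stat)) / (1 + n_perm)
```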

Let $X$ be a $d$-dimensional simplicial complex. A function $F\colon X(k)\to \{0,1\}^k$ is said to be a direct product function if there exists a function $f\colon X(1)\to \{0,1\}$ such that $F(\sigma) = (f(\sigma_1), \ldots, f(\sigma_k))$ for each $k$-face $\sigma$. In an effort to simplify components of the PCP theorem, Goldreich and Safra introduced the problem of direct product testing, which asks whether one can test if $F\colon X(k)\to \{0,1\}^k$ is correlated with a direct product function by querying $F$ on only $2$ inputs. Dinur and Kaufman conjectured that there exist bounded degree complexes with a direct product test in the small soundness regime. We resolve their conjecture by showing that for all $\delta>0$, there exists a family of high-dimensional expanders with degree $O_{\delta}(1)$ and a $2$-query direct product tester with soundness $\delta$. We use the characterization given by a subset of the authors and independently by Dikstein and Dinur, who showed that some form of non-Abelian coboundary expansion (which they called "Unique-Games coboundary expansion") is a necessary and sufficient condition for a complex to admit such direct product testers. Our main technical contribution is a general technique for showing coboundary expansion of complexes with coefficients in a non-Abelian group. This allows us to prove that the high-dimensional expanders constructed by Chapman and Lubotzky satisfy the necessary conditions, thus admitting a 2-query direct product tester with small soundness.
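For intuition, one standard form of a 2-query agreement test (the precise distribution over face pairs varies between constructions, so this is only a sketch) proceeds as follows: sample a $k$-face $\sigma \in X(k)$ and a sub-face $\tau \subseteq \sigma$ of size $\sqrt{k}$; sample a second $k$-face $\sigma' \in X(k)$ with $\sigma' \supseteq \tau$; query $F(\sigma)$ and $F(\sigma')$, and accept iff the two answers agree on $\tau$, i.e. $F(\sigma)|_{\tau} = F(\sigma')|_{\tau}$. Completeness is immediate for exact direct product functions; soundness $\delta$ means that any $F$ passing with probability at least $\delta$ must be correlated with a genuine direct product function.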

Submodular functions, as well as the subclass of decomposable submodular functions, and their optimization appear in a wide range of applications in machine learning, recommendation systems, and welfare maximization. However, optimizing a decomposable submodular function with millions of component functions is computationally prohibitive. Furthermore, the component functions may be private (they might represent user preference functions, for example) and cannot be widely shared. To address these issues, we propose a {\em federated optimization} setting for decomposable submodular optimization. In this setting, clients have their own preference functions, and a weighted sum of these preferences needs to be maximized. We implement the popular {\em continuous greedy} algorithm in this setting: clients take parallel small local steps towards a local solution, and the local changes are then aggregated at a central server. To handle the large number of clients, the aggregation is performed only on a subsampled set. Further, the aggregation happens only intermittently, between stretches of parallel local steps, which significantly reduces communication cost. We show that our federated algorithm is guaranteed to provide a good approximate solution even in the presence of the above cost-cutting measures. Finally, we show how the federated setting can be incorporated into solving fundamental discrete submodular optimization problems such as Maximum Coverage and Facility Location.
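A minimal sketch of such a scheme, under a cardinality constraint and with illustrative names and schedule (not the paper's exact protocol), might look as follows:

```python
import numpy as np

def federated_continuous_greedy(client_grads, n, k, T=100,
                                local_steps=5, sample_frac=0.1, seed=0):
    """Hedged sketch of federated continuous greedy for maximizing a sum
    of client submodular functions under |S| <= k.  Each callable in
    client_grads estimates the gradient of that client's multilinear
    extension at a fractional point x."""
    rng = np.random.default_rng(seed)
    x = np.zeros(n)
    for _ in range(T // local_steps):
        # Subsample clients so that aggregation cost stays small.
        m = max(1, int(sample_frac * len(client_grads)))
        chosen = rng.choice(len(client_grads), size=m, replace=False)
        endpoints = []
        for c in chosen:                      # runs in parallel in practice
            xc = x.copy()
            for _ in range(local_steps):      # local steps, no communication
                g = client_grads[c](xc)
                v = np.zeros(n)
                v[np.argsort(g)[-k:]] = 1.0   # linear opt over the k-polytope
                xc = np.minimum(xc + v / T, 1.0)
            endpoints.append(xc)
        x = np.mean(endpoints, axis=0)        # intermittent aggregation
    return x                                  # fractional; round to a set
```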

Humans perceive the world by concurrently processing and fusing high-dimensional inputs from multiple modalities such as vision and audio. Machine perception models, in stark contrast, are typically modality-specific and optimised for unimodal benchmarks; hence, late-stage fusion of final representations or predictions from each modality (`late-fusion') is still a dominant paradigm for multimodal video classification. Instead, we introduce a novel transformer-based architecture that uses `fusion bottlenecks' for modality fusion at multiple layers. Compared to traditional pairwise self-attention, our model forces information between different modalities to pass through a small number of bottleneck latents, requiring the model to collate and condense the most relevant information in each modality and share only what is necessary. We find that such a strategy improves fusion performance while at the same time reducing computational cost. We conduct thorough ablation studies and achieve state-of-the-art results on multiple audio-visual classification benchmarks, including Audioset, Epic-Kitchens, and VGGSound. All code and models will be released.
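A minimal PyTorch sketch of one such fusion layer follows. It shows the core mechanism only: each modality self-attends over its own tokens plus the shared bottleneck tokens, and the bottleneck updates are averaged across modalities; layer norms, MLPs, and the dimension names are omitted or assumed.

```python
import torch
import torch.nn as nn

class BottleneckFusionLayer(nn.Module):
    """Sketch of attention-bottleneck fusion: all cross-modal information
    must flow through a small set of shared bottleneck tokens."""
    def __init__(self, dim=256, heads=4):
        super().__init__()
        self.attn_a = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.attn_v = nn.MultiheadAttention(dim, heads, batch_first=True)

    def forward(self, audio, video, bottleneck):
        # audio: (B, Ta, D), video: (B, Tv, D), bottleneck: (B, n_b, D)
        xa = torch.cat([audio, bottleneck], dim=1)
        xv = torch.cat([video, bottleneck], dim=1)
        ya, _ = self.attn_a(xa, xa, xa)   # audio tokens + bottleneck
        yv, _ = self.attn_v(xv, xv, xv)   # video tokens + bottleneck
        audio_out, b_a = ya[:, :audio.size(1)], ya[:, audio.size(1):]
        video_out, b_v = yv[:, :video.size(1)], yv[:, video.size(1):]
        # Average the two modality-specific bottleneck updates.
        return audio_out, video_out, (b_a + b_v) / 2
```

Because the bottleneck holds only a handful of tokens, cross-modal attention cost is capped regardless of how long the audio and video token sequences are.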
