销魂美女一区二区三区AV,欧美日韩性爱视频免费观看,又爽又黄又无遮掩的视频1,高清少妇熟女一区二区,日韩免费视频在线观看

We provide the first online algorithm for spectral hypergraph sparsification. In the online setting, hyperedges with positive weights are arriving in a stream, and upon the arrival of each hyperedge, we must irrevocably decide whether or not to include it in the sparsifier. Our algorithm produces an $(\epsilon, \delta)$-spectral sparsifier with multiplicative error $\epsilon$ and additive error $\delta$ that has $O(\epsilon^{-2} n (\log n)^2 \log(1 + \epsilon W/\delta n))$ hyperedges with high probability, where $\epsilon, \delta \in (0,1)$, $n$ is the number of nodes, and $W$ is the sum of edge weights. The space complexity of our algorithm is $O(n^2)$, while previous algorithms require the space complexity of $\Omega(m)$, where $m$ is the number of hyperedges. This provides an exponential improvement in the space complexity since $m$ can be exponential in $n$.

相關內容

稀疏化

關注 0

準則 · 平穩的 · SC · Neural Networks · 總回報 ·

2023 年 11 月 20 日

High Probability Guarantees for Random Reshuffling

Hengxu Yu,Xiao Li

from arxiv, 21 pages, 3 figures

We consider the stochastic gradient method with random reshuffling ($\mathsf{RR}$) for tackling smooth nonconvex optimization problems. $\mathsf{RR}$ finds broad applications in practice, notably in training neural networks. In this work, we first investigate the concentration property of $\mathsf{RR}$'s sampling procedure and establish a new high probability sample complexity guarantee for driving the gradient (without expectation) below $\varepsilon$, which effectively characterizes the efficiency of a single $\mathsf{RR}$ execution. Our derived complexity matches the best existing in-expectation one up to a logarithmic term while imposing no additional assumptions nor changing $\mathsf{RR}$'s updating rule. Furthermore, by leveraging our derived high probability descent property and bound on the stochastic error, we propose a simple and computable stopping criterion for $\mathsf{RR}$ (denoted as $\mathsf{RR}$-$\mathsf{sc}$). This criterion is guaranteed to be triggered after a finite number of iterations, and then $\mathsf{RR}$-$\mathsf{sc}$ returns an iterate with its gradient below $\varepsilon$ with high probability. Moreover, building on the proposed stopping criterion, we design a perturbed random reshuffling method ($\mathsf{p}$-$\mathsf{RR}$) that involves an additional randomized perturbation procedure near stationary points. We derive that $\mathsf{p}$-$\mathsf{RR}$ provably escapes strict saddle points and efficiently returns a second-order stationary point with high probability, without making any sub-Gaussian tail-type assumptions on the stochastic gradient errors. Finally, we conduct numerical experiments on neural network training to support our theoretical findings.

state-of-the-art · 樣本 · Integration · Analysis · prototype ·

2023 年 11 月 19 日

Symbolic Execution for Quantum Error Correction Programs

Wang Fang,Mingsheng Ying

We define a symbolic execution framework QSE for quantum programs by integrating symbolic variables into quantum states and the outcomes of quantum measurements. The soundness theorem of QSE is proved. We further introduce symbolic stabilizer states, which facilitate the efficient analysis of quantum error correction programs. Within the QSE framework, we can use symbolic expressions to characterize the possible adversarial errors in quantum error correction, providing a significant improvement over existing methods that rely on sampling with simulators. We implement QSE with the support of symbolic stabilizer states in a prototype tool named QuantumSE.jl. With experiments on representative quantum error correction codes, including quantum repetition codes, Kitaev's toric codes, and quantum Tanner codes, we demonstrate the efficiency of QuantumSE.jl for debugging quantum error correction programs with over 1000 qubits. In addition, as a by-product of QSE, QuantumSE.jl's sampling functionality for stabilizer circuits also outperforms the state-of-the-art stabilizer simulator, Google's Stim, in the experiments.

數據選擇 · 語言模型化 · N元 · 哈希學習 · MoDELS ·

2023 年 11 月 18 日

Data Selection for Language Models via Importance Resampling

Sang Michael Xie,Shibani Santurkar,Tengyu Ma,Percy Liang

from arxiv, NeurIPS 2023

Selecting a suitable pretraining dataset is crucial for both general-domain (e.g., GPT-3) and domain-specific (e.g., Codex) language models (LMs). We formalize this problem as selecting a subset of a large raw unlabeled dataset to match a desired target distribution given unlabeled target samples. Due to the scale and dimensionality of the raw text data, existing methods use simple heuristics or require human experts to manually curate data. Instead, we extend the classic importance resampling approach used in low-dimensions for LM data selection. We propose Data Selection with Importance Resampling (DSIR), an efficient and scalable framework that estimates importance weights in a reduced feature space for tractability and selects data with importance resampling according to these weights. We instantiate the DSIR framework with hashed n-gram features for efficiency, enabling the selection of 100M documents from the full Pile dataset in 4.5 hours. To measure whether hashed n-gram features preserve the aspects of the data that are relevant to the target, we define KL reduction, a data metric that measures the proximity between the selected pretraining data and the target on some feature space. Across 8 data selection methods (including expert selection), KL reduction on hashed n-gram features highly correlates with average downstream accuracy (r=0.82). When selecting data for continued pretraining on a specific domain, DSIR performs comparably to expert curation across 8 target distributions. When pretraining general-domain models (target is Wikipedia and books), DSIR improves over random selection and heuristic filtering baselines by 2-2.5% on the GLUE benchmark. Code is available at //github.com/p-lambda/dsir.

GANs · 稀疏 · Performer · DST (Digital Sky Technologies) · 可約的 ·

2023 年 11 月 18 日

Balanced Training for Sparse GANs

Yite Wang,Jing Wu,Naira Hovakimyan,Ruoyu Sun

from arxiv, Accepted at NeurIPS 2023 (//neurips.cc/virtual/2023/poster/70078). Our code will be released at //github.com/YiteWang/ADAPT

Over the past few years, there has been growing interest in developing larger and deeper neural networks, including deep generative models like generative adversarial networks (GANs). However, GANs typically come with high computational complexity, leading researchers to explore methods for reducing the training and inference costs. One such approach gaining popularity in supervised learning is dynamic sparse training (DST), which maintains good performance while enjoying excellent training efficiency. Despite its potential benefits, applying DST to GANs presents challenges due to the adversarial nature of the training process. In this paper, we propose a novel metric called the balance ratio (BR) to study the balance between the sparse generator and discriminator. We also introduce a new method called balanced dynamic sparse training (ADAPT), which seeks to control the BR during GAN training to achieve a good trade-off between performance and computational cost. Our proposed method shows promising results on multiple datasets, demonstrating its effectiveness.

稀疏 · Learning · 機器人 · 查準率/準確率 · 穩健性 ·

2023 年 11 月 17 日

Learning Agile Locomotion on Risky Terrains

Chong Zhang,Nikita Rudin,David Hoeller,Marco Hutter

from arxiv, 8 pages, 11 figures

Quadruped robots have shown remarkable mobility on various terrains through reinforcement learning. Yet, in the presence of sparse footholds and risky terrains such as stepping stones and balance beams, which require precise foot placement to avoid falls, model-based approaches are often used. In this paper, we show that end-to-end reinforcement learning can also enable the robot to traverse risky terrains with dynamic motions. To this end, our approach involves training a generalist policy for agile locomotion on disorderly and sparse stepping stones before transferring its reusable knowledge to various more challenging terrains by finetuning specialist policies from it. Given that the robot needs to rapidly adapt its velocity on these terrains, we formulate the task as a navigation task instead of the commonly used velocity tracking which constrains the robot's behavior and propose an exploration strategy to overcome sparse rewards and achieve high robustness. We validate our proposed method through simulation and real-world experiments on an ANYmal-D robot achieving peak forward velocity of >= 2.5 m/s on sparse stepping stones and narrow balance beams. Video: youtu.be/Z5X0J8OH6z4

Analysis · Performer · 單純形 · 線性的 · 優化器 ·

2023 年 11 月 16 日

Realistic Runtime Analysis for Quantum Simplex Computation

Sabrina Ammann,Maximilian Hess,Debora Ramacciotti,Sándor P. Fekete,Paulina L. A. Goedicke,David Gross,Andreea Lefterovici,Tobias J. Osborne,Michael Perk,Antonio Rotundo,S. E. Skelton,Sebastian Stiller,Timo de Wolff

from arxiv, 39 pages, 8 figures

In recent years, strong expectations have been raised for the possible power of quantum computing for solving difficult optimization problems, based on theoretical, asymptotic worst-case bounds. Can we expect this to have consequences for Linear and Integer Programming when solving instances of practically relevant size, a fundamental goal of Mathematical Programming, Operations Research and Algorithm Engineering? Answering this question faces a crucial impediment: The lack of sufficiently large quantum platforms prevents performing real-world tests for comparison with classical methods. In this paper, we present a quantum analog for classical runtime analysis when solving real-world instances of important optimization problems. To this end, we measure the expected practical performance of quantum computers by analyzing the expected gate complexity of a quantum algorithm. The lack of practical quantum platforms for experimental comparison is addressed by hybrid benchmarking, in which the algorithm is performed on a classical system, logging the expected cost of the various subroutines that are employed by the quantum versions. In particular, we provide an analysis of quantum methods for Linear Programming, for which recent work has provided asymptotic speedup through quantum subroutines for the Simplex method. We show that a practical quantum advantage for realistic problem sizes would require quantum gate operation times that are considerably below current physical limitations.

語言模型化 · MoDELS · Performer · 張成子空間 · GPT-4 ·

2023 年 11 月 16 日

Large Language Models for Propaganda Span Annotation

Maram Hasanain,Fatema Ahmed,Firoj Alam

from arxiv, propaganda, span detection, disinformation, misinformation, fake news, LLMs, GPT-4

The use of propagandistic techniques in online communication has increased in recent years, aiming to manipulate online audiences. Efforts to automatically detect and debunk such content have been made, addressing various modeling scenarios. These include determining whether the content (text, image, or multimodal) (i) is propagandistic, (ii) employs one or more techniques, and (iii) includes techniques with identifiable spans. Significant research efforts have been devoted to the first two scenarios compared to the latter. Therefore, in this study, we focus on the task of detecting propagandistic textual spans. We investigate whether large language models such as GPT-4 can be utilized to perform the task of an annotator. For the experiments, we used an in-house developed dataset consisting of annotations from multiple annotators. Our results suggest that providing more information to the model as prompts improves the annotation agreement and performance compared to human annotations. We plan to make the annotated labels from multiple annotators, including GPT-4, available for the community.

學成 · 分子建模 · 深度學習 · Neural Networks · Processing（編程語言） ·

2021 年 7 月 26 日

Geometric Deep Learning on Molecular Representations

Kenneth Atz,Francesca Grisoni,Gisbert Schneider

Geometric deep learning (GDL), which is based on neural network architectures that incorporate and process symmetry information, has emerged as a recent paradigm in artificial intelligence. GDL bears particular promise in molecular modeling applications, in which various molecular representations with different symmetry properties and levels of abstraction exist. This review provides a structured and harmonized overview of molecular GDL, highlighting its applications in drug discovery, chemical synthesis prediction, and quantum chemistry. Emphasis is placed on the relevance of the learned molecular features and their complementarity to well-established molecular descriptors. This review provides an overview of current challenges and opportunities, and presents a forecast of the future of GDL for molecular sciences.

多峰值 · 模態 · INFORMS · MoDELS · 可約的 ·

2021 年 6 月 30 日

Attention Bottlenecks for Multimodal Fusion

Arsha Nagrani,Shan Yang,Anurag Arnab,Aren Jansen,Cordelia Schmid,Chen Sun

Humans perceive the world by concurrently processing and fusing high-dimensional inputs from multiple modalities such as vision and audio. Machine perception models, in stark contrast, are typically modality-specific and optimised for unimodal benchmarks, and hence late-stage fusion of final representations or predictions from each modality (`late-fusion') is still a dominant paradigm for multimodal video classification. Instead, we introduce a novel transformer based architecture that uses `fusion bottlenecks' for modality fusion at multiple layers. Compared to traditional pairwise self-attention, our model forces information between different modalities to pass through a small number of bottleneck latents, requiring the model to collate and condense the most relevant information in each modality and only share what is necessary. We find that such a strategy improves fusion performance, at the same time reducing computational cost. We conduct thorough ablation studies, and achieve state-of-the-art results on multiple audio-visual classification benchmarks including Audioset, Epic-Kitchens and VGGSound. All code and models will be released.

entity · 圖 · 知識圖譜 · MoDELS · 相似度 ·

2019 年 9 月 11 日

Domain Representation for Knowledge Graph Embedding

Cunxiang Wang,Feiliang Ren,Zhichao Lin,Chenxv Zhao,Tian Xie,Yue Zhang

from arxiv, Acceptted by NLPCC2019

Embedding entities and relations into a continuous multi-dimensional vector space have become the dominant method for knowledge graph embedding in representation learning. However, most existing models ignore to represent hierarchical knowledge, such as the similarities and dissimilarities of entities in one domain. We proposed to learn a Domain Representations over existing knowledge graph embedding models, such that entities that have similar attributes are organized into the same domain. Such hierarchical knowledge of domains can give further evidence in link prediction. Experimental results show that domain embeddings give a significant improvement over the most recent state-of-art baseline knowledge graph embedding models.