一区二区三区四区五区无码_伊人久久大香线蕉精品69_久久久久久国产免费A片_国产又色又爽又黄刺激视_亚洲AV永久无码野狼在线观看_亚洲欧美日韩在线超精品_中文欧美日韩久久

Recent improvements in adder optimization could be achieved by optimizing the AND-trees occurring within the constructed circuits. The overlap of such trees and its potential for pure size optimization has not been taken into account though. Motivated by this, we examine the fundamental problem of minimizing the size of a circuit for multiple AND-functions on intersecting variable sets. Our formulation generalizes the overlapping \AND-trees within adder optimization but is in NP, in contrast to general Boolean circuit optimization which is in $\Sigma_2^p$ (and thus suspected not to be in NP). While restructuring the AND- or XOR-trees simultaneously, we optimize the total number of gates needed for all functions to be computed. We show that this problem is APX-hard already for functions of few variables and present efficient approximation algorithms for the case in which the Boolean functions depend on at most 3 or 4 variables each, achieving guarantees of $\frac 43$ and $1.9$, respectively. To conclude, we give a polynomial approximation algorithm with guarantee $\frac 23k$ for AND-functions of up to $k$ variables. To achieve these results, the key technique is to determine how much overlap among the variable sets makes tree construction cheap and how little makes the optimum solution large.

相關內容

優化器

關注 4

方差 · 幾乎處處 · 決策樹 · 模式識別 · 大學 ·

2024 年 2 月 21 日

Aaronson-Ambainis Conjecture Is True For Random Restrictions

Sreejata Kishor Bhattacharya

In an attempt to show that the acceptance probability of a quantum query algorithm making $q$ queries can be well-approximated almost everywhere by a classical decision tree of depth $\leq \text{poly}(q)$, Aaronson and Ambainis proposed the following conjecture: let $f: \{ \pm 1\}^n \rightarrow [0,1]$ be a degree $d$ polynomial with variance $\geq \epsilon$. Then, there exists a coordinate of $f$ with influence $\geq \text{poly} (\epsilon, 1/d)$. We show that for any polynomial $f: \{ \pm 1\}^n \rightarrow [0,1]$ of degree $d$ $(d \geq 2)$ and variance $\text{Var}[f] \geq 1/d$, if $\rho$ denotes a random restriction with survival probability $\dfrac{\log(d)}{C_1 d}$, $$ \text{Pr} \left[f_{\rho} \text{ has a coordinate with influence} \geq \dfrac{\text{Var}[f]^2 }{d^{C_2}} \right] \geq \dfrac{\text{Var}[f] \log(d)}{50C_1 d}$$ where $C_1, C_2>0$ are universal constants. Thus, Aaronson-Ambainis conjecture is true for a non-negligible fraction of random restrictions of the given polynomial assuming its variance is not too low.

MIMO · Performer · 可約的 · GNN · massive MIMO ·

2024 年 2 月 21 日

Near-Field Multiuser Beam-Training for Extremely Large-Scale MIMO Systems

Wang Liu,Cunhua Pan,Hong Ren,Jiangzhou Wang,Robert Schober,Lajos Hanzo

from arxiv, submitted to IEEE

Extremely large-scale multiple-input multiple-output (XL-MIMO) systems are capable of improving spectral efficiency by employing far more antennas than conventional massive MIMO at the base station (BS). However, beam training in multiuser XL-MIMO systems is challenging. To tackle these issues, we conceive a three-phase graph neural network (GNN)-based beam training scheme for multiuser XL-MIMO systems. In the first phase, only far-field wide beams have to be tested for each user and the GNN is utilized to map the beamforming gain information of the far-field wide beams to the optimal near-field beam for each user. In addition, the proposed GNN-based scheme can exploit the position-correlation between adjacent users for further improvement of the accuracy of beam training. In the second phase, a beam allocation scheme based on the probability vectors produced at the outputs of GNNs is proposed to address the above beam-direction conflicts between users. In the third phase, the hybrid TBF is designed for further reducing the inter-user interference. Our simulation results show that the proposed scheme improves the beam training performance of the benchmarks. Moreover, the performance of the proposed beam training scheme approaches that of an exhaustive search, despite requiring only about 7% of the pilot overhead.

Performer · MoDELS · ForCES · 標記空間 · 相關系數 ·

2024 年 2 月 21 日

Robustness-Guided Image Synthesis for Data-Free Quantization

Jianhong Bai,Yuchen Yang,Huanpeng Chu,Hualiang Wang,Zuozhu Liu,Ruizhe Chen,Xiaoxuan He,Lianrui Mu,Chengfei Cai,Haoji Hu

from arxiv, Accepted at AAAI 2024

Quantization has emerged as a promising direction for model compression. Recently, data-free quantization has been widely studied as a promising method to avoid privacy concerns, which synthesizes images as an alternative to real training data. Existing methods use classification loss to ensure the reliability of the synthesized images. Unfortunately, even if these images are well-classified by the pre-trained model, they still suffer from low semantics and homogenization issues. Intuitively, these low-semantic images are sensitive to perturbations, and the pre-trained model tends to have inconsistent output when the generator synthesizes an image with poor semantics. To this end, we propose Robustness-Guided Image Synthesis (RIS), a simple but effective method to enrich the semantics of synthetic images and improve image diversity, further boosting the performance of downstream data-free compression tasks. Concretely, we first introduce perturbations on input and model weight, then define the inconsistency metrics at feature and prediction levels before and after perturbations. On the basis of inconsistency on two levels, we design a robustness optimization objective to enhance the semantics of synthetic images. Moreover, we also make our approach diversity-aware by forcing the generator to synthesize images with small correlations in the label space. With RIS, we achieve state-of-the-art performance for various settings on data-free quantization and can be extended to other data-free compression tasks.

層 · MoDELS · Learning · 統計量 · 相關系數 ·

2024 年 2 月 20 日

Semantic Image Synthesis via Class-Adaptive Cross-Attention

Tomaso Fontanini,Claudio Ferrari,Giuseppe Lisanti,Massimo Bertozzi,Andrea Prati

from arxiv, Code and models available at //github.com/TFonta/CA2SIS

In semantic image synthesis the state of the art is dominated by methods that use customized variants of the SPatially-Adaptive DE-normalization (SPADE) layers, which allow for good visual generation quality and editing versatility. By design, such layers learn pixel-wise modulation parameters to de-normalize the generator activations based on the semantic class each pixel belongs to. Thus, they tend to overlook global image statistics, ultimately leading to unconvincing local style editing and causing global inconsistencies such as color or illumination distribution shifts. Also, SPADE layers require the semantic segmentation mask for mapping styles in the generator, preventing shape manipulations without manual intervention. In response, we designed a novel architecture where cross-attention layers are used in place of SPADE for learning shape-style correlations and so conditioning the image generation process. Our model inherits the versatility of SPADE, at the same time obtaining state-of-the-art generation quality, as well as improved global and local style transfer. Code and models available at //github.com/TFonta/CA2SIS.

Branch · contrastive · 偽標記 · Learning · INTERACT ·

2024 年 2 月 20 日

Towards Distribution-Agnostic Generalized Category Discovery

Jianhong Bai,Zuozhu Liu,Hualiang Wang,Ruizhe Chen,Lianrui Mu,Xiaomeng Li,Joey Tianyi Zhou,Yang Feng,Jian Wu,Haoji Hu

from arxiv, Accepted at NeurIPS 2023

Data imbalance and open-ended distribution are two intrinsic characteristics of the real visual world. Though encouraging progress has been made in tackling each challenge separately, few works dedicated to combining them towards real-world scenarios. While several previous works have focused on classifying close-set samples and detecting open-set samples during testing, it's still essential to be able to classify unknown subjects as human beings. In this paper, we formally define a more realistic task as distribution-agnostic generalized category discovery (DA-GCD): generating fine-grained predictions for both close- and open-set classes in a long-tailed open-world setting. To tackle the challenging problem, we propose a Self-Balanced Co-Advice contrastive framework (BaCon), which consists of a contrastive-learning branch and a pseudo-labeling branch, working collaboratively to provide interactive supervision to resolve the DA-GCD task. In particular, the contrastive-learning branch provides reliable distribution estimation to regularize the predictions of the pseudo-labeling branch, which in turn guides contrastive learning through self-balanced knowledge transfer and a proposed novel contrastive loss. We compare BaCon with state-of-the-art methods from two closely related fields: imbalanced semi-supervised learning and generalized category discovery. The effectiveness of BaCon is demonstrated with superior performance over all baselines and comprehensive analysis across various datasets. Our code is publicly available.

分解的 · MoDELS · 語音合成 · 知識 (knowledge) · Performer ·

2024 年 2 月 19 日

Bayesian Parameter-Efficient Fine-Tuning for Overcoming Catastrophic Forgetting

Haolin Chen,Philip N. Garner

from arxiv, This work has been submitted to the IEEE for possible publication. Copyright may be transferred without notice, after which this version may no longer be accessible

Although motivated by the adaptation of text-to-speech synthesis models, we argue that more generic parameter-efficient fine-tuning (PEFT) is an appropriate framework to do such adaptation. However, catastrophic forgetting remains an issue with PEFT, damaging the pre-trained model's inherent capabilities. We demonstrate that existing Bayesian learning techniques can be applied to PEFT to prevent catastrophic forgetting as long as the parameter shift of the fine-tuned layers can be calculated differentiably. In a principled series of experiments on language modeling and speech synthesis tasks, we utilize established Laplace approximations, including diagonal and Kronecker factored approaches, to regularize PEFT with the low-rank adaptation (LoRA) and compare their performance in pre-training knowledge preservation. Our results demonstrate that catastrophic forgetting can be overcome by our methods without degrading the fine-tuning performance, and using the Kronecker factored approximations produces a better preservation of the pre-training knowledge than the diagonal ones.

PDE · Better · Learning · 優化器 · MoDELS ·

2024 年 2 月 19 日

Better Neural PDE Solvers Through Data-Free Mesh Movers

Peiyan Hu,Yue Wang,Zhi-Ming Ma

Recently, neural networks have been extensively employed to solve partial differential equations (PDEs) in physical system modeling. While major studies focus on learning system evolution on predefined static mesh discretizations, some methods utilize reinforcement learning or supervised learning techniques to create adaptive and dynamic meshes, due to the dynamic nature of these systems. However, these approaches face two primary challenges: (1) the need for expensive optimal mesh data, and (2) the change of the solution space's degree of freedom and topology during mesh refinement. To address these challenges, this paper proposes a neural PDE solver with a neural mesh adapter. To begin with, we introduce a novel data-free neural mesh adaptor, called Data-free Mesh Mover (DMM), with two main innovations. Firstly, it is an operator that maps the solution to adaptive meshes and is trained using the Monge-Amp\`ere equation without optimal mesh data. Secondly, it dynamically changes the mesh by moving existing nodes rather than adding or deleting nodes and edges. Theoretical analysis shows that meshes generated by DMM have the lowest interpolation error bound. Based on DMM, to efficiently and accurately model dynamic systems, we develop a moving mesh based neural PDE solver (MM-PDE) that embeds the moving mesh with a two-branch architecture and a learnable interpolation framework to preserve information within the data. Empirical experiments demonstrate that our method generates suitable meshes and considerably enhances accuracy when modeling widely considered PDE systems. The code can be found at: //github.com/Peiyannn/MM-PDE.git.

Performer · 變換 · Processing（編程語言） · 點云 · 潛變量/隱變量 ·

2024 年 2 月 19 日

Perceiving Longer Sequences With Bi-Directional Cross-Attention Transformers

Markus Hiller,Krista A. Ehinger,Tom Drummond

from arxiv, Preprint

We present a novel bi-directional Transformer architecture (BiXT) which scales linearly with input size in terms of computational cost and memory consumption, but does not suffer the drop in performance or limitation to only one input modality seen with other efficient Transformer-based approaches. BiXT is inspired by the Perceiver architectures but replaces iterative attention with an efficient bi-directional cross-attention module in which input tokens and latent variables attend to each other simultaneously, leveraging a naturally emerging attention-symmetry between the two. This approach unlocks a key bottleneck experienced by Perceiver-like architectures and enables the processing and interpretation of both semantics (`what') and location (`where') to develop alongside each other over multiple layers -- allowing its direct application to dense and instance-based tasks alike. By combining efficiency with the generality and performance of a full Transformer architecture, BiXT can process longer sequences like point clouds or images at higher feature resolutions and achieves competitive performance across a range of tasks like point cloud part segmentation, semantic image segmentation and image classification.

Cognition · MoDELS · 語言模型化 · Processing（編程語言） · 大語言模型 ·

2024 年 2 月 18 日

Metacognitive Retrieval-Augmented Large Language Models

Yujia Zhou,Zheng Liu,Jiajie Jin,Jian-Yun Nie,Zhicheng Dou

from arxiv, Accepted by WWW 2024

Retrieval-augmented generation have become central in natural language processing due to their efficacy in generating factual content. While traditional methods employ single-time retrieval, more recent approaches have shifted towards multi-time retrieval for multi-hop reasoning tasks. However, these strategies are bound by predefined reasoning steps, potentially leading to inaccuracies in response generation. This paper introduces MetaRAG, an approach that combines the retrieval-augmented generation process with metacognition. Drawing from cognitive psychology, metacognition allows an entity to self-reflect and critically evaluate its cognitive processes. By integrating this, MetaRAG enables the model to monitor, evaluate, and plan its response strategies, enhancing its introspective reasoning abilities. Through a three-step metacognitive regulation pipeline, the model can identify inadequacies in initial cognitive responses and fixes them. Empirical evaluations show that MetaRAG significantly outperforms existing methods.

自動問答 · MoDELS · Networking · Processing（編程語言） · state-of-the-art ·

2018 年 6 月 1 日

An Interpretable Reasoning Network for Multi-Relation Question Answering

Mantong Zhou,Minlie Huang,Xiaoyan Zhu

from arxiv, COLING 2018, 13pages

Multi-relation Question Answering is a challenging task, due to the requirement of elaborated analysis on questions and reasoning over multiple fact triples in knowledge base. In this paper, we present a novel model called Interpretable Reasoning Network that employs an interpretable, hop-by-hop reasoning process for question answering. The model dynamically decides which part of an input question should be analyzed at each hop; predicts a relation that corresponds to the current parsed results; utilizes the predicted relation to update the question representation and the state of the reasoning process; and then drives the next-hop reasoning. Experiments show that our model yields state-of-the-art results on two datasets. More interestingly, the model can offer traceable and observable intermediate predictions for reasoning analysis and failure diagnosis, thereby allowing manual manipulation in predicting the final answer.