Artificial neural networks typically struggle to generalize to out-of-context examples. One reason for this limitation is that datasets incorporate only partial information about the potential correlational structure of the world. In this work, we propose TIDA (Targeted Image-editing Data Augmentation), a targeted data augmentation method focused on improving models' human-like abilities (e.g., gender recognition) by filling the correlational structure gap using a text-to-image generative model. More specifically, TIDA identifies specific skills in captions describing images (e.g., the presence of a specific gender in the image), changes the caption (e.g., "woman" to "man"), and then uses a text-to-image model to edit the image so that it matches the new caption (e.g., changing only a woman to a man while keeping the context identical). Based on the Flickr30K benchmark, we show that, compared with the original dataset, a TIDA-enhanced dataset related to gender, color, and counting abilities induces better performance in several image captioning metrics. Furthermore, in addition to relying on the classical BLEU metric, we conduct a fine-grained analysis of the improvements of our models over the baseline in different ways. We compared text-to-image generative models and found different behaviors of the image captioning models in terms of visual encoding and textual decoding.
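
The caption-editing step described above can be illustrated with a minimal sketch. It covers only the text side (detecting a skill word and swapping it), not the image editing. The swap table `GENDER_SWAPS` and the function `swap_caption` are hypothetical names chosen for illustration; the abstract does not give the actual word lists or implementation used by TIDA.

```python
import re

# Hypothetical swap table for the "gender" skill (an assumption, not the paper's list).
GENDER_SWAPS = {"woman": "man", "man": "woman", "girl": "boy", "boy": "girl"}


def swap_caption(caption, swaps=GENDER_SWAPS):
    """Swap every whole-word skill term in a caption in a single pass.

    Returns (new_caption, changed). A single pass avoids swapping a word
    back after it has already been replaced.
    """
    pattern = re.compile(
        r"\b(" + "|".join(re.escape(word) for word in swaps) + r")\b",
        re.IGNORECASE,
    )

    def replace(match):
        word = match.group(0)
        target = swaps[word.lower()]
        # Keep a leading capital letter if the original word had one.
        return target.capitalize() if word[0].isupper() else target

    new_caption, count = pattern.subn(replace, caption)
    return new_caption, count > 0


if __name__ == "__main__":
    print(swap_caption("A woman rides a bike"))
    print(swap_caption("Two dogs play in the snow"))
```

In a full pipeline, only captions for which `changed` is true would be passed, together with the original image, to a text-to-image editing model.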

Related content

MoDELS

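The 23rd ACM/IEEE International Conference on Model Driven Engineering Languages and Systems (MODELS) is the premier conference series for model-driven software and systems engineering, organized with the support of ACM SIGSOFT and IEEE TCSE. Since 1998, MODELS has covered all aspects of modeling, from languages and methods to tools and applications. Its participants come from diverse backgrounds, including researchers, academics, engineers, and industry professionals. MODELS 2019 is a forum where participants can exchange cutting-edge research results and innovative practical experience around modeling and model-driven software and systems. This year's edition will give the modeling community an opportunity to further advance the foundations of modeling and to present innovative applications of modeling in emerging areas such as cyber-physical systems, embedded systems, socio-technical systems, cloud computing, big data, machine learning, security, open source, and sustainability. Official website:

Invariant · Code · SimPLe · Microsoft Surface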

November 13, 2023

Quotient Space Quantum Codes

JingLei Xia

Quantum error-correcting codes are crucial for quantum computing and communication. Currently, these codes are mainly categorized into additive, non-additive, and surface codes. Additive and non-additive codes utilize one or more invariant subspaces of the stabilizer G to construct quantum codes. Therefore, the selection of these invariant subspaces is a key issue. In this paper, we propose a solution to this problem by introducing quotient space codes and a construction method for quotient space quantum codes. This new framework unifies additive and non-additive quantum codes. We demonstrate the codeword stabilizer codes as a special case within this framework and supplement its error-correction distance. Furthermore, we provide a simple proof of the Singleton bound for this quantum code by establishing the code bound of quotient space codes and discuss the code bounds for pure and impure codes. The quotient space approach offers a concise and clear mathematical form for the study of quantum codes.

Neural Networks · Networking · GROUP · Equivariant · Weight

November 10, 2023

Compact Matrix Quantum Group Equivariant Neural Networks

Edward Pearce-Crump

from arxiv, 15 pages

We derive the existence of a new type of neural network, called a compact matrix quantum group equivariant neural network, that learns from data that has an underlying quantum symmetry. We apply the Woronowicz formulation of Tannaka-Krein duality to characterise the weight matrices that appear in these neural networks for any easy compact matrix quantum group. We show that compact matrix quantum group equivariant neural networks contain, as a subclass, all compact matrix group equivariant neural networks. Moreover, we obtain characterisations of the weight matrices for many compact matrix group equivariant neural networks that have not previously appeared in the machine learning literature.

Language modeling · INFORMS · MoDELS · Knowledge · DATE

November 10, 2023

Large Language Models are Zero Shot Hypothesis Proposers

Biqing Qi,Kaiyan Zhang,Haoxiang Li,Kai Tian,Sihang Zeng,Zhang-Ren Chen,Bowen Zhou

from arxiv, Instruction Workshop @ NeurIPS 2023

Significant scientific discoveries have driven the progress of human civilisation. The explosion of scientific literature and data has created information barriers across disciplines that have slowed the pace of scientific discovery. Large Language Models (LLMs) hold a wealth of global and interdisciplinary knowledge that promises to break down these information barriers and foster a new wave of scientific discovery. However, the potential of LLMs for scientific discovery has not been formally explored. In this paper, we start by investigating whether LLMs can propose scientific hypotheses. To this end, we construct a dataset consisting of background knowledge and hypothesis pairs from biomedical literature. The dataset is divided into training, seen, and unseen test sets based on the publication date to control visibility. We subsequently evaluate the hypothesis generation capabilities of various top-tier instructed models in zero-shot, few-shot, and fine-tuning settings, including both closed and open-source LLMs. Additionally, we introduce an LLM-based multi-agent cooperative framework with different role designs and external tools to enhance the capabilities related to generating hypotheses. We also design four metrics through a comprehensive review to evaluate the generated hypotheses in both ChatGPT-based and human evaluations. Through experiments and analyses, we arrive at the following findings: 1) LLMs surprisingly generate untrained yet validated hypotheses from testing literature. 2) Increasing uncertainty facilitates candidate generation, potentially enhancing zero-shot hypothesis generation capabilities. These findings strongly support the potential of LLMs as catalysts for new scientific discoveries and guide further exploration.

INFORMS · Information retrieval · Server · Linear · Code

November 10, 2023

Single Server Private Information Retrieval Protocols With Codes Over Rings

Şeyma Bodur,Edgar Martínez-Moro,Diego Ruano

A Private Information Retrieval (PIR) protocol based on coding theory for a single server is proposed. It provides computational security against linear algebra attacks, addressing the main drawback of previous PIR proposals based on coding theory. The approach involves two types of codes, each over a different ring: an inner non-free linear code that is used as a distinguisher of some elements added to the query matrix, and an outer code that is used for generating the query matrix. Moreover, it only uses modular arithmetic at the server level and in the recovery stage if the base ring chosen for the inner code is $\mathbb Z_m$.

Learning · Autoencoder · Learner · contrastive · Mask

November 9, 2023

Global Contrast Masked Autoencoders Are Powerful Pathological Representation Learners

Hao Quan,Xingyu Li,Weixing Chen,Qun Bai,Mingchen Zou,Ruijie Yang,Tingting Zheng,Ruiqun Qi,Xinghua Gao,Xiaoyu Cui

Based on digital pathology slide scanning technology, artificial intelligence algorithms represented by deep learning have achieved remarkable results in the field of computational pathology. Compared to other medical images, pathology images are more difficult to annotate, and thus there is an extreme lack of available datasets for supervised training of robust deep learning models. In this paper, we propose a self-supervised learning (SSL) model, the global contrast-masked autoencoder (GCMAE), which trains the encoder to represent local-global features of pathological images and significantly improves the performance of transfer learning across datasets. In this study, the ability of the GCMAE to learn transferable representations was demonstrated through extensive experiments using three different disease-specific hematoxylin and eosin (HE)-stained pathology datasets: Camelyon16, NCTCRC and BreakHis. In addition, this study designed an effective automated pathology diagnosis process based on the GCMAE for clinical applications. The source code of this paper is publicly available at //github.com/StarUniversus/gcmae.

Graph convolutional neural network/Graph convolutional network · Graph · entity · Graph convolution · Convolution

April 23, 2021

Knowledge Embedding Based Graph Convolutional Network

Donghan Yu,Yiming Yang,Ruohong Zhang,Yuexin Wu

from arxiv, WWW 2021

Recently, a considerable literature has grown up around the theme of Graph Convolutional Networks (GCNs). How to effectively leverage the rich structural information in complex graphs, such as knowledge graphs with heterogeneous types of entities and relations, is a primary open challenge in the field. Most GCN methods are either restricted to graphs with a homogeneous type of edges (e.g., citation links only), or focus on representation learning for nodes only instead of jointly propagating and updating the embeddings of both nodes and edges for target-driven objectives. This paper addresses these limitations by proposing a novel framework, namely the Knowledge Embedding based Graph Convolutional Network (KE-GCN), which combines the power of GCNs in graph-based belief propagation and the strengths of advanced knowledge embedding (a.k.a. knowledge graph embedding) methods, and goes beyond. Our theoretical analysis shows that KE-GCN offers an elegant unification of several well-known GCN methods as specific cases, with a new perspective of graph convolution. Experimental results on benchmark datasets show the advantageous performance of KE-GCN over strong baseline methods in the tasks of knowledge graph alignment and entity classification.

Identifiable · Object detection · Reducible · Example · Learned

March 3, 2021

Towards Open World Object Detection

K J Joseph,Salman Khan,Fahad Shahbaz Khan,Vineeth N Balasubramanian

from arxiv, To appear in CVPR 2021 as an ORAL paper. Code is available in //github.com/JosephKJ/OWOD

Humans have a natural instinct to identify unknown object instances in their environments. The intrinsic curiosity about these unknown instances aids in learning about them, when the corresponding knowledge is eventually available. This motivates us to propose a novel computer vision problem called 'Open World Object Detection', where a model is tasked to: 1) identify objects that have not been introduced to it as 'unknown', without explicit supervision to do so, and 2) incrementally learn these identified unknown categories without forgetting previously learned classes, when the corresponding labels are progressively received. We formulate the problem, introduce a strong evaluation protocol and provide a novel solution, which we call ORE: Open World Object Detector, based on contrastive clustering and energy based unknown identification. Our experimental evaluation and ablation studies analyze the efficacy of ORE in achieving Open World objectives. As an interesting by-product, we find that identifying and characterizing unknown instances helps to reduce confusion in an incremental object detection setting, where we achieve state-of-the-art performance, with no extra methodological effort. We hope that our work will attract further research into this newly identified, yet crucial research direction.

Graphics processing unit · Graph · Neural Networks · Networking · Performer

February 13, 2021

How Framelets Enhance Graph Neural Networks

Xuebin Zheng,Bingxin Zhou,Junbin Gao,Yu Guang Wang,Pietro Lio,Ming Li,Guido Montufar

from arxiv, 24 pages, 17 figures, 6 tables

This paper presents a new approach for assembling graph neural networks based on framelet transforms. The latter provides a multi-scale representation for graph-structured data. With the framelet system, we can decompose the graph feature into low-pass and high-pass frequencies as extracted features for network training, which then defines a framelet-based graph convolution. The framelet decomposition naturally induces a graph pooling strategy by aggregating the graph feature into low-pass and high-pass spectra, which considers both the feature values and geometry of the graph data and conserves the total information. The graph neural networks with the proposed framelet convolution and pooling achieve state-of-the-art performance in many types of node and graph prediction tasks. Moreover, we propose shrinkage as a new activation for the framelet convolution, which thresholds the high-frequency information at different scales. Compared to ReLU, shrinkage in framelet convolution improves the graph neural network model in terms of denoising and signal compression: noises in both node and structure can be significantly reduced by accurately cutting off the high-pass coefficients from framelet decomposition, and the signal can be compressed to less than half its original size with the prediction performance well preserved.
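
To make the contrast between shrinkage and ReLU concrete, here is a minimal sketch that assumes the shrinkage takes the common soft-thresholding form, sign(x) * max(|x| - threshold, 0). The abstract does not state the exact formula, so this form, the threshold value, and the function names `soft_shrink` and `relu` are assumptions for illustration, not the paper's implementation.

```python
def soft_shrink(coeffs, threshold):
    """Soft-thresholding: zero out small coefficients and pull the rest toward 0.

    Assumed form of shrinkage: sign(c) * max(|c| - threshold, 0).
    """
    shrunk = []
    for c in coeffs:
        magnitude = abs(c) - threshold
        if magnitude <= 0:
            shrunk.append(0.0)  # coefficient treated as noise and removed
        else:
            shrunk.append(magnitude if c > 0 else -magnitude)
    return shrunk


def relu(coeffs):
    """ReLU for comparison: drops all negative values, keeps small positive ones."""
    return [max(c, 0.0) for c in coeffs]


if __name__ == "__main__":
    # Hypothetical high-pass coefficients: two large entries plus small noise.
    high_pass = [-2.0, -0.3, 0.0, 0.4, 1.5]
    print(soft_shrink(high_pass, 0.5))
    print(relu(high_pass))
```

Unlike ReLU, soft-thresholding keeps large negative coefficients and removes small ones of either sign, which is why it acts as a denoiser and yields sparse, compressible coefficients.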

Matrix calculus · Comprehensibility · Learned · Neural Networks · Networks

July 2, 2018

The Matrix Calculus You Need For Deep Learning

Terence Parr,Jeremy Howard

from arxiv, PDF version of mobile/web friendly version //explained.ai/matrix-calculus/index.html

This paper is an attempt to explain all the matrix calculus you need in order to understand the training of deep neural networks. We assume no math knowledge beyond what you learned in calculus 1, and provide links to help you refresh the necessary math where needed. Note that you do not need to understand this material before you start learning to train and use deep learning in practice; rather, this material is for those who are already familiar with the basics of neural networks, and wish to deepen their understanding of the underlying math. Don't worry if you get stuck at some point along the way; just go back and reread the previous section, and try writing down and working through some examples. And if you're still stuck, we're happy to answer your questions in the Theory category at forums.fast.ai. Note: There is a reference section at the end of the paper summarizing all the key matrix calculus rules and terminology discussed here. See related articles at //explained.ai
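
As one example of the kind of rule this material covers, the sketch below checks the gradient of a single ReLU neuron, max(0, w·x + b), with respect to its weights. The analytic gradient is x when w·x + b > 0 and the zero vector otherwise; a central finite difference confirms it numerically. The function names and the sample values are hypothetical, chosen only for this illustration.

```python
def neuron(w, x, b):
    """Single ReLU neuron: max(0, w . x + b)."""
    z = sum(wi * xi for wi, xi in zip(w, x)) + b
    return max(0.0, z)


def analytic_grad_w(w, x, b):
    """Gradient of the neuron output w.r.t. w: x if the unit is active, else 0."""
    z = sum(wi * xi for wi, xi in zip(w, x)) + b
    return list(x) if z > 0 else [0.0] * len(x)


def numeric_grad_w(w, x, b, eps=1e-6):
    """Central finite-difference approximation of the same gradient."""
    grad = []
    for i in range(len(w)):
        w_plus = list(w)
        w_minus = list(w)
        w_plus[i] += eps
        w_minus[i] -= eps
        grad.append((neuron(w_plus, x, b) - neuron(w_minus, x, b)) / (2 * eps))
    return grad


if __name__ == "__main__":
    w, x, b = [0.5, -1.0], [2.0, 0.5], 0.1  # here w . x + b = 0.6, so the unit is active
    print(analytic_grad_w(w, x, b))
    print(numeric_grad_w(w, x, b))
```

The check is only meaningful away from the kink at w·x + b = 0, where the ReLU is not differentiable.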

BLEU · MoDELS · Attention mechanism · Transformer · Networking

December 6, 2017

Attention Is All You Need

Ashish Vaswani,Noam Shazeer,Niki Parmar,Jakob Uszkoreit,Llion Jones,Aidan N. Gomez,Lukasz Kaiser,Illia Polosukhin

from arxiv, 15 pages, 5 figures

The dominant sequence transduction models are based on complex recurrent or convolutional neural networks in an encoder-decoder configuration. The best performing models also connect the encoder and decoder through an attention mechanism. We propose a new simple network architecture, the Transformer, based solely on attention mechanisms, dispensing with recurrence and convolutions entirely. Experiments on two machine translation tasks show these models to be superior in quality while being more parallelizable and requiring significantly less time to train. Our model achieves 28.4 BLEU on the WMT 2014 English-to-German translation task, improving over the existing best results, including ensembles, by over 2 BLEU. On the WMT 2014 English-to-French translation task, our model establishes a new single-model state-of-the-art BLEU score of 41.8 after training for 3.5 days on eight GPUs, a small fraction of the training costs of the best models from the literature. We show that the Transformer generalizes well to other tasks by applying it successfully to English constituency parsing both with large and limited training data.
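
The abstract does not spell out the attention mechanism itself. As a minimal illustration, the sketch below implements scaled dot-product attention, softmax(Q K^T / sqrt(d_k)) V, the core operation defined in the body of the paper, in plain Python for a single head without masking or learned projections. The function names and the toy inputs are assumptions made for this example.

```python
import math


def softmax(row):
    """Numerically stable softmax over a list of scores."""
    peak = max(row)
    exps = [math.exp(value - peak) for value in row]
    total = sum(exps)
    return [e / total for e in exps]


def scaled_dot_product_attention(Q, K, V):
    """Compute softmax(Q K^T / sqrt(d_k)) V for lists of row vectors.

    Q: n_queries x d_k, K: n_keys x d_k, V: n_keys x d_v.
    """
    d_k = len(K[0])
    output = []
    for q in Q:
        # Similarity of this query to every key, scaled by sqrt(d_k).
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d_k) for k in K]
        weights = softmax(scores)
        # Weighted average of the value vectors.
        output.append(
            [sum(w * v[j] for w, v in zip(weights, V)) for j in range(len(V[0]))]
        )
    return output


if __name__ == "__main__":
    K = [[1.0, 0.0], [0.0, 1.0]]
    V = [[1.0, 2.0], [3.0, 4.0]]
    Q = [[0.0, 0.0], [10.0, 0.0]]  # first query attends uniformly, second favors key 0
    print(scaled_dot_product_attention(Q, K, V))
```

Because each query row is processed independently of the others, the computation parallelizes across positions, which is the property the abstract contrasts with recurrent models.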