Language modeling · MoDELS · SimPLe · Robustness · Prompt ·

November 15, 2023

Frontier Language Models are not Robust to Adversarial Arithmetic, or "What do I need to say so you agree 2+2=5?"

C. Daniel Freeman,Laura Culp,Aaron Parisi,Maxwell L Bileschi,Gamaleldin F Elsayed,Alex Rizkowsky,Isabelle Simpson,Alex Alemi,Azade Nova,Ben Adlam,Bernd Bohnet,Gaurav Mishra,Hanie Sedghi,Igor Mordatch,Izzeddin Gur,Jaehoon Lee,JD Co-Reyes,Jeffrey Pennington,Kelvin Xu,Kevin Swersky,Kshiteej Mahajan,Lechao Xiao,Rosanne Liu,Simon Kornblith,Noah Constant,Peter J. Liu,Roman Novak,Yundi Qian,Noah Fiedel,Jascha Sohl-Dickstein

We introduce and study the problem of adversarial arithmetic, which provides a simple yet challenging testbed for language model alignment. This problem is comprised of arithmetic questions posed in natural language, with an arbitrary adversarial string inserted before the question is complete. Even in the simple setting of 1-digit addition problems, it is easy to find adversarial prompts that make all tested models (including PaLM2, GPT4, Claude2) misbehave, and even to steer models to a particular wrong answer. We additionally provide a simple algorithm for finding successful attacks by querying those same models, which we name "prompt inversion rejection sampling" (PIRS). We finally show that models can be partially hardened against these attacks via reinforcement learning and via agentic constitutional loops. However, we were not able to make a language model fully robust against adversarial arithmetic attacks.

Related content

Language modeling

Better · Origin · Paper · AI · MoDELS ·

January 7, 2024

Turing's Test, a Beautiful Thought Experiment

Bernardo Gonçalves

from arxiv, 8.4K words, 9 pages, 3 figures, 3 text boxes, few minor corrections

In the wake of large language models, there has been a resurgence of claims and questions about the Turing test and its value for AI, which are reminiscent of decades of practical "Turing" tests. If AI were quantum physics, by now several "Schrödinger's" cats could have been killed. Better late than never, it is time for a historical reconstruction of Turing's beautiful thought experiment. In this paper I present a wealth of evidence, including new archival sources, give original answers to several open questions about Turing's 1950 paper, and address the core question of the value of Turing's test.

Statistic · MoDELS · AI · Continuity · ONCE ·

January 5, 2024

Can AI Be as Creative as Humans?

Haonan Wang,James Zou,Michael Mozer,Anirudh Goyal,Alex Lamb,Linjun Zhang,Weijie J Su,Zhun Deng,Michael Qizhe Xie,Hannah Brown,Kenji Kawaguchi

from arxiv, The paper introduces the notion of "Relative Creativity", presents a measurable assessment, and provides AI training guidelines to foster AI's creative capabilities. Project Page: //ai-relative-creativity.github.io/

Creativity serves as a cornerstone for societal progress and innovation, but its assessment remains a complex and often subjective endeavor. With the rise of advanced generative AI models capable of tasks once reserved for human creativity, the study of AI's creative potential becomes imperative for its responsible development and application. This paper addresses the complexities in defining and evaluating creativity by introducing a new concept called Relative Creativity. Instead of trying to define creativity universally, we shift the focus to whether AI can match the creative abilities of a hypothetical human. This perspective draws inspiration from the Turing Test, expanding upon it to address the challenges and subjectivities inherent in evaluating creativity. This methodological shift facilitates a statistically quantifiable evaluation of AI's creativity, which we term Statistical Creativity. This approach allows for direct comparisons of AI's creative abilities with those of specific human groups. Building on this foundation, we discuss the application of statistical creativity in contemporary prompt-conditioned autoregressive models. In addition to defining and analyzing a measure of creativity, we introduce an actionable training guideline, effectively bridging the gap between theoretical quantification of creativity and practical model training. Through these multifaceted contributions, the paper establishes a cohesive, continuously evolving, and transformative framework for assessing and fostering statistical creativity in AI models.

Reducible · MoDELS · Learning · Reviewer · Boosting (a technique for accelerating model training) ·

January 4, 2024

The Compute Divide in Machine Learning: A Threat to Academic Contribution and Scrutiny?

Tamay Besiroglu,Sage Andrus Bergerson,Amelia Michael,Xueyun Luo,Neil Thompson

There are pronounced differences in the extent to which industrial and academic AI labs use computing resources. We provide a data-driven survey of the role of the compute divide in shaping machine learning research. We show that a compute divide has coincided with a reduced representation of academic-only research teams in compute-intensive research topics, especially foundation models. We argue that academia will likely play a smaller role in advancing the associated techniques, providing critical evaluation and scrutiny, and in the diffusion of such models. Concurrent with this change in research focus, there is a noticeable shift in academic research towards embracing open source, pre-trained models developed within the industry. To address the challenges arising from this trend, especially reduced scrutiny of influential models, we recommend approaches aimed at thoughtfully expanding academic insights. Nationally-sponsored computing infrastructure coupled with open science initiatives could judiciously boost academic compute access, prioritizing research on interpretability, safety and security. Structured access programs and third-party auditing may also allow measured external evaluation of industry systems.

Integration · Optimizer · Operation · Channel · SCA ·

January 3, 2024

Bidirectional Integrated Sensing and Communication: Full-Duplex or Half-Duplex?

Zhaolin Wang,Xidong Mu,Yuanwei Liu

from arxiv, 15 pages, 10 figures

A bidirectional integrated sensing and communication (ISAC) system is proposed, in which a pair of transceivers carry out two-way communication and mutual sensing. Both full-duplex and half-duplex operations in narrowband and wideband systems are conceived for the bidirectional ISAC. 1) For the narrowband system, the conventional full-duplex and half-duplex operations are redesigned to take into account sensing echo signals. Then, the transmit beamforming design of both transceivers is proposed for addressing the sensing and communication (S&C) tradeoff. A one-layer iterative algorithm relying on successive convex approximation (SCA) is proposed to obtain Karush-Kuhn-Tucker (KKT) optimal solutions. 2) For the wideband system, the new full-duplex and half-duplex operations are proposed for the bidirectional ISAC. In particular, the frequency-selective fading channel is tackled by delay pre-compensation and path-based beamforming. By redesigning the proposed SCA-based algorithm, the KKT optimal solutions for path-based beamforming for characterizing the S&C tradeoff are obtained. Finally, the numerical results show that: i) For both bandwidth scenarios, the interference introduced by sensing means that full-duplex may not always outperform half-duplex, especially in the sensing-prior regime or when the communication channel is line-of-sight-dominated; and ii) For both duplex operations, it is sufficient to reuse communication signals for sensing in the narrowband system, while an additional dedicated sensing signal is required in the wideband system.

Long short-term memory network · RNN · Networking · Weight · MoDELS ·

June 10, 2020

Do RNN and LSTM have Long Memory?

Jingyu Zhao,Feiqing Huang,Jia Lv,Yanjie Duan,Zhen Qin,Guodong Li,Guangjian Tian

from arxiv, Accepted by ICML 2020. Added references, experiments and acknowledgements

The LSTM network was proposed to overcome the difficulty in learning long-term dependence, and has made significant advancements in applications. With its success and drawbacks in mind, this paper raises the question - do RNN and LSTM have long memory? We answer it partially by proving that RNN and LSTM do not have long memory from a statistical perspective. A new definition for long memory networks is further introduced, and it requires the model weights to decay at a polynomial rate. To verify our theory, we convert RNN and LSTM into long memory networks by making a minimal modification, and their superiority is illustrated in modeling long-term dependence of various datasets.

AdderNet · Neural Networks · Networking · Convolution · Model evaluation ·

December 31, 2019

AdderNet: Do We Really Need Multiplications in Deep Learning?

Hanting Chen,Yunhe Wang,Chunjing Xu,Boxin Shi,Chao Xu,Qi Tian,Chang Xu

Compared with the cheap addition operation, multiplication has much higher computational complexity. The widely-used convolutions in deep neural networks are exactly cross-correlations that measure the similarity between input features and convolution filters, which involves massive multiplications between float values. In this paper, we present adder networks (AdderNets) to trade these massive multiplications in deep neural networks, especially convolutional neural networks (CNNs), for much cheaper additions to reduce computation costs. In AdderNets, we take the $\ell_1$-norm distance between filters and input feature as the output response. The influence of this new similarity measure on the optimization of neural networks has been thoroughly analyzed. To achieve a better performance, we develop a special back-propagation approach for AdderNets by investigating the full-precision gradient. We then propose an adaptive learning rate strategy to enhance the training procedure of AdderNets according to the magnitude of each neuron's gradient. As a result, the proposed AdderNets can achieve 74.9% Top-1 accuracy and 91.7% Top-5 accuracy using ResNet-50 on the ImageNet dataset without any multiplication in the convolution layers.
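
The central idea in the abstract above, using an $\ell_1$ distance instead of a multiply-accumulate as the filter response, can be illustrated with a minimal one-dimensional sketch. This is an assumption-laden toy, not the authors' implementation: the function names `adder_response` and `adder_conv1d` are hypothetical, the response is taken as the negative $\ell_1$ distance so that a closer match gives a larger value, and the special back-propagation and adaptive learning rate described in the abstract are not covered.

```python
def adder_response(patch, filt):
    """Negative L1 distance between an input patch and a filter.

    Hypothetical helper: uses only subtraction, absolute value and
    addition, so no multiplications are needed.
    """
    if len(patch) != len(filt):
        raise ValueError("patch and filter must have the same length")
    return -sum(abs(x - f) for x, f in zip(patch, filt))


def adder_conv1d(signal, filt):
    """Slide the filter over the signal (stride 1, no padding)."""
    k = len(filt)
    return [adder_response(signal[i:i + k], filt)
            for i in range(len(signal) - k + 1)]


signal = [1.0, 2.0, 3.0, 2.0, 1.0]
filt = [2.0, 3.0, 2.0]
responses = adder_conv1d(signal, filt)
print(responses)
```

In this toy input the middle window equals the filter exactly, so its response is zero, the largest possible value, while the two outer windows each differ from the filter by a total of 3 and therefore score lower.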

entity · Link prediction · Performer · Graph · Knowledge graph ·

September 26, 2019

Representation Learning with Ordered Relation Paths for Knowledge Graph Completion

Yao Zhu,Hongzhi Liu,Zhonghai Wu,Yang Song,Tao Zhang

Incompleteness is a common problem for existing knowledge graphs (KGs), and the completion of KG which aims to predict links between entities is challenging. Most existing KG completion methods only consider the direct relation between nodes and ignore the relation paths which contain useful information for link prediction. Recently, a few methods take relation paths into consideration but pay less attention to the order of relations in paths which is important for reasoning. In addition, these path-based models always ignore nonlinear contributions of path features for link prediction. To solve these problems, we propose a novel KG completion method named OPTransE. Instead of embedding both entities of a relation into the same latent space as in previous methods, we project the head entity and the tail entity of each relation into different spaces to guarantee the order of relations in the path. Meanwhile, we adopt a pooling strategy to extract nonlinear and complex features of different paths to further improve the performance of link prediction. Experimental results on two benchmark datasets show that the proposed model OPTransE performs better than state-of-the-art methods.

Text classification · Language modeling · BERT · state-of-the-art · MoDELS ·

May 14, 2019

How to Fine-Tune BERT for Text Classification?

Chi Sun,Xipeng Qiu,Yige Xu,Xuanjing Huang

Language model pre-training has proven to be useful in learning universal language representations. As a state-of-the-art language model pre-training model, BERT (Bidirectional Encoder Representations from Transformers) has achieved amazing results in many language understanding tasks. In this paper, we conduct exhaustive experiments to investigate different fine-tuning methods of BERT on text classification task and provide a general solution for BERT fine-tuning. Finally, the proposed solution obtains new state-of-the-art results on eight widely-studied text classification datasets.

INFORMS · Graph · Reducible · Knowledge graph · Identifiable ·

August 29, 2018

Multi-Task Identification of Entities, Relations, and Coreference for Scientific Knowledge Graph Construction

Yi Luan,Luheng He,Mari Ostendorf,Hannaneh Hajishirzi

We introduce a multi-task setup of identifying and classifying entities, relations, and coreference clusters in scientific articles. We create SciERC, a dataset that includes annotations for all three tasks, and develop a unified framework called Scientific Information Extractor (SciIE) with shared span representations. The multi-task setup reduces cascading errors between tasks and leverages cross-sentence relations through coreference links. Experiments show that our multi-task model outperforms previous models in scientific information extraction without using any domain-specific features. We further show that the framework supports construction of a scientific knowledge graph, which we use to analyze information in scientific literature.

SimPLe · entity · Neural Networks · Question answering · Networking ·

June 5, 2018

Strong Baselines for Simple Question Answering over Knowledge Graphs with and without Neural Networks

Salman Mohammed,Peng Shi,Jimmy Lin

from arxiv, Published in NAACL HLT 2018

We examine the problem of question answering over knowledge graphs, focusing on simple questions that can be answered by the lookup of a single fact. Adopting a straightforward decomposition of the problem into entity detection, entity linking, relation prediction, and evidence combination, we explore simple yet strong baselines. On the popular SimpleQuestions dataset, we find that basic LSTMs and GRUs plus a few heuristics yield accuracies that approach the state of the art, and techniques that do not use neural networks also perform reasonably well. These results show that gains from sophisticated deep learning techniques proposed in the literature are quite modest and that some previous models exhibit unnecessary complexity.
