91婷婷国产精选国产色,精品国产91久久久久久久下载,日韩精品无码免费视频网站观看,日韩一区二区三区无码免费看

We propose a framework to evaluate the random-coding union bound with parameter $s$ on the achievable error probability in the finite-blocklength regime for a pilot-assisted transmission scheme operating over an imperfectly synchronized and memoryless block-fading waveform channel. Unlike previous results, which disregard the effects of imperfect synchronization, our framework utilizes pilots for both synchronization and channel estimation. Specifically, we provide an algorithm to perform joint synchronization and channel estimation and verify its accuracy by observing its tightness in comparison with the Cramer-Rao bound. Then, we develop an RCUs bound on the error probability, which applies for a receiver that treats the estimates provided by the algorithm as accurate. Additionally, we utilize the saddlepoint approximation to provide a numerically efficient method for evaluating the RCUs bound in this scenario. Our numerical experiments verify the accuracy of the proposed approximation. Moreover, when transmission blocks are received synchronously, numerical results indicate that the number of pilot symbols needed to estimate the fading channel gains to the level of accuracy required in ultra-reliable low-latency communication is also sufficient to acquire sufficiently good synchronization. However, when the blocks are received asynchronously, synchronization becomes the bottleneck for the system performance.

相關內容

估計/估計量

關注 0

MoDELS · 樣本 · Performer · 測試樣本 · 秩 ·

2024 年 2 月 28 日

How Much Annotation is Needed to Compare Summarization Models?

Chantal Shaib,Joe Barrow,Alexa F. Siu,Byron C. Wallace,Ani Nenkova

from arxiv, Preprint

Modern instruction-tuned models have become highly capable in text generation tasks such as summarization, and are expected to be released at a steady pace. In practice one may now wish to choose confidently, but with minimal effort, the best performing summarization model when applied to a new domain or purpose. In this work, we empirically investigate the test sample size necessary to select a preferred model in the context of news summarization. Empirical results reveal that comparative evaluation converges quickly for both automatic and human evaluation, with clear preferences for a system emerging from under 100 examples. The human preference data allows us to quantify how well automatic scores can reproduce preference rankings across a variety of downstream summarization tasks. We find that, while automatic metrics are stable at smaller sample sizes, only some automatic metrics are able to moderately predict model win rates according to human preference.

自動問答 · INFORMS · GPT-3 · MoDELS · 可辨認的 ·

2024 年 2 月 28 日

RealTime QA: What's the Answer Right Now?

Jungo Kasai,Keisuke Sakaguchi,Yoichi Takahashi,Ronan Le Bras,Akari Asai,Xinyan Yu,Dragomir Radev,Noah A. Smith,Yejin Choi,Kentaro Inui

from arxiv, RealTime QA Website: //realtimeqa.github.io/

We introduce REALTIME QA, a dynamic question answering (QA) platform that announces questions and evaluates systems on a regular basis (weekly in this version). REALTIME QA inquires about the current world, and QA systems need to answer questions about novel events or information. It therefore challenges static, conventional assumptions in open-domain QA datasets and pursues instantaneous applications. We build strong baseline models upon large pretrained language models, including GPT-3 and T5. Our benchmark is an ongoing effort, and this paper presents real-time evaluation results over the past year. Our experimental results show that GPT-3 can often properly update its generation results, based on newly-retrieved documents, highlighting the importance of up-to-date information retrieval. Nonetheless, we find that GPT-3 tends to return outdated answers when retrieved documents do not provide sufficient information to find an answer. This suggests an important avenue for future research: can an open-domain QA system identify such unanswerable cases and communicate with the user or even the retrieval module to modify the retrieval results? We hope that REALTIME QA will spur progress in instantaneous applications of question answering and beyond.

變換 · Performer · Learning · CASES · AIM ·

2024 年 2 月 27 日

Case-Based or Rule-Based: How Do Transformers Do the Math?

Yi Hu,Xiaojuan Tang,Haotong Yang,Muhan Zhang

Despite the impressive performance in a variety of complex tasks, modern large language models (LLMs) still have trouble dealing with some math problems that are simple and intuitive for humans, such as addition. While we can easily learn basic rules of addition and apply them to new problems of any length, LLMs struggle to do the same. Instead, they may rely on similar "cases" seen in the training corpus for help. We define these two different reasoning mechanisms as "rule-based reasoning" and "case-based reasoning". Since rule-based reasoning is essential for acquiring the systematic generalization ability, we aim to explore exactly whether transformers use rule-based or case-based reasoning for math problems. Through carefully designed intervention experiments on five math tasks, we confirm that transformers are performing case-based reasoning, no matter whether scratchpad is used, which aligns with the previous observations that transformers use subgraph matching/shortcut learning to reason. To mitigate such problems, we propose a Rule-Following Fine-Tuning (RFFT) technique to teach transformers to perform rule-based reasoning. Specifically, we provide explicit rules in the input and then instruct transformers to recite and follow the rules step by step. Through RFFT, we successfully enable LLMs fine-tuned on 1-5 digit addition to generalize to up to 12-digit addition with over 95% accuracy, which is over 40% higher than scratchpad. The significant improvement demonstrates that teaching LLMs to explicitly use rules helps them learn rule-based reasoning and generalize better in length.

視頻描述生成（Video Caption） · 多峰值 · Learning · 知識 (knowledge) · Performer ·

2024 年 2 月 27 日

MCF-VC: Mitigate Catastrophic Forgetting in Class-Incremental Learning for Multimodal Video Captioning

Huiyu Xiong,Lanxiao Wang,Heqian Qiu,Taijin Zhao,Benliu Qiu,Hongliang Li

from arxiv, 13 pages

To address the problem of catastrophic forgetting due to the invisibility of old categories in sequential input, existing work based on relatively simple categorization tasks has made some progress. In contrast, video captioning is a more complex task in multimodal scenario, which has not been explored in the field of incremental learning. After identifying this stability-plasticity problem when analyzing video with sequential input, we originally propose a method to Mitigate Catastrophic Forgetting in class-incremental learning for multimodal Video Captioning (MCF-VC). As for effectively maintaining good performance on old tasks at the macro level, we design Fine-grained Sensitivity Selection (FgSS) based on the Mask of Linear's Parameters and Fisher Sensitivity to pick useful knowledge from old tasks. Further, in order to better constrain the knowledge characteristics of old and new tasks at the specific feature level, we have created the Two-stage Knowledge Distillation (TsKD), which is able to learn the new task well while weighing the old task. Specifically, we design two distillation losses, which constrain the cross modal semantic information of semantic attention feature map and the textual information of the final outputs respectively, so that the inter-model and intra-model stylized knowledge of the old class is retained while learning the new class. In order to illustrate the ability of our model to resist forgetting, we designed a metric CIDER_t to detect the stage forgetting rate. Our experiments on the public dataset MSR-VTT show that the proposed method significantly resists the forgetting of previous tasks without replaying old samples, and performs well on the new task.

MoDELS · 前向 · 變分自編碼 · 推斷 · 多峰值 ·

2024 年 2 月 27 日

Material Microstructure Design Using VAE-Regression with Multimodal Prior

Avadhut Sardeshmukh,Sreedhar Reddy,BP Gautham,Pushpak Bhattacharyya

from arxiv, 12 pages main paper, 9 pages appendix. 10 tables and 11 figures. Accepted for publication in PAKDD 2024

We propose a variational autoencoder (VAE)-based model for building forward and inverse structure-property linkages, a problem of paramount importance in computational materials science. Our model systematically combines VAE with regression, linking the two models through a two-level prior conditioned on the regression variables. The regression loss is optimized jointly with the reconstruction loss of the variational autoencoder, learning microstructure features relevant for property prediction and reconstruction. The resultant model can be used for both forward and inverse prediction i.e., for predicting the properties of a given microstructure as well as for predicting the microstructure required to obtain given properties. Since the inverse problem is ill-posed (one-to-many), we derive the objective function using a multi-modal Gaussian mixture prior enabling the model to infer multiple microstructures for a target set of properties. We show that for forward prediction, our model is as accurate as state-of-the-art forward-only models. Additionally, our method enables direct inverse inference. We show that the microstructures inferred using our model achieve desired properties reasonably accurately, avoiding the need for expensive optimization loops.

MoDELS · CASES · 泛函 · 不變 · 講稿 ·

2024 年 2 月 26 日

Structure-Preserving Numerical Methods for Two Nonlinear Systems of Dispersive Wave Equations

Joshua Lampert,Hendrik Ranocha

We use the general framework of summation by parts operators to construct conservative, entropy-stable and well-balanced semidiscretizations of two different nonlinear systems of dispersive shallow water equations with varying bathymetry: (i) a variant of the coupled Benjamin-Bona-Mahony (BBM) equations and (ii) a recently proposed model by Sv\"ard and Kalisch (2023) with enhanced dispersive behavior. Both models share the property of being conservative in terms of a nonlinear invariant, often interpreted as entropy function. This property is preserved exactly in our novel semidiscretizations. To obtain fully-discrete entropy-stable schemes, we employ the relaxation method. We present improved numerical properties of our schemes in some test cases.

真實值 · 可辨認的 · 數據集 · HTTPS · 計算學習理論 ·

2021 年 12 月 15 日

Do Feature Attribution Methods Correctly Attribute Features?

Yilun Zhou,Serena Booth,Marco Tulio Ribeiro,Julie Shah

from arxiv, AAAI 2022. Video summary at //www.youtube.com/watch?v=kAodFw6jvvo

Feature attribution methods are popular in interpretable machine learning. These methods compute the attribution of each input feature to represent its importance, but there is no consensus on the definition of "attribution", leading to many competing methods with little systematic evaluation, complicated in particular by the lack of ground truth attribution. To address this, we propose a dataset modification procedure to induce such ground truth. Using this procedure, we evaluate three common methods: saliency maps, rationales, and attentions. We identify several deficiencies and add new perspectives to the growing body of evidence questioning the correctness and reliability of these methods applied on datasets in the wild. We further discuss possible avenues for remedy and recommend new attribution methods to be tested against ground truth before deployment. The code is available at \url{//github.com/YilunZhou/feature-attribution-evaluation}.

全局極小值 · 優化器 · 極小值 · 非凸 · 近似 ·

2021 年 3 月 24 日

Why Do Local Methods Solve Nonconvex Problems?

Tengyu Ma

from arxiv, This is the Chapter 21 of the book "Beyond the Worst-Case Analysis of Algorithms"

Non-convex optimization is ubiquitous in modern machine learning. Researchers devise non-convex objective functions and optimize them using off-the-shelf optimizers such as stochastic gradient descent and its variants, which leverage the local geometry and update iteratively. Even though solving non-convex functions is NP-hard in the worst case, the optimization quality in practice is often not an issue -- optimizers are largely believed to find approximate global minima. Researchers hypothesize a unified explanation for this intriguing phenomenon: most of the local minima of the practically-used objectives are approximately global minima. We rigorously formalize it for concrete instances of machine learning problems.

AdderNet · Neural Networks · Networking · 卷積 · 模型評估 ·

2019 年 12 月 31 日

AdderNet: Do We Really Need Multiplications in Deep Learning?

Hanting Chen,Yunhe Wang,Chunjing Xu,Boxin Shi,Chao Xu,Qi Tian,Chang Xu

Compared with cheap addition operation, multiplication operation is of much higher computation complexity. The widely-used convolutions in deep neural networks are exactly cross-correlation to measure the similarity between input feature and convolution filters, which involves massive multiplications between float values. In this paper, we present adder networks (AdderNets) to trade these massive multiplications in deep neural networks, especially convolutional neural networks (CNNs), for much cheaper additions to reduce computation costs. In AdderNets, we take the $\ell_1$-norm distance between filters and input feature as the output response. The influence of this new similarity measure on the optimization of neural network have been thoroughly analyzed. To achieve a better performance, we develop a special back-propagation approach for AdderNets by investigating the full-precision gradient. We then propose an adaptive learning rate strategy to enhance the training procedure of AdderNets according to the magnitude of each neuron's gradient. As a result, the proposed AdderNets can achieve 74.9% Top-1 accuracy 91.7% Top-5 accuracy using ResNet-50 on the ImageNet dataset without any multiplication in convolution layer.

事件抽取 · 學成 · 逆強化學習 · GAN · 估計/估計量 ·

2018 年 4 月 21 日

Event Extraction with Generative Adversarial Imitation Learning

Tongtao Zhang,Heng Ji

We propose a new method for event extraction (EE) task based on an imitation learning framework, specifically, inverse reinforcement learning (IRL) via generative adversarial network (GAN). The GAN estimates proper rewards according to the difference between the actions committed by the expert (or ground truth) and the agent among complicated states in the environment. EE task benefits from these dynamic rewards because instances and labels yield to various extents of difficulty and the gains are expected to be diverse -- e.g., an ambiguous but correctly detected trigger or argument should receive high gains -- while the traditional RL models usually neglect such differences and pay equal attention on all instances. Moreover, our experiments also demonstrate that the proposed framework outperforms state-of-the-art methods, without explicit feature engineering.