亚洲AV午夜成人片精品网站听书,日韩欧美国产AⅤ另类

Accurate precipitation nowcasting is essential for various purposes, including flood prediction, disaster management, optimizing agricultural activities, managing transportation routes and renewable energy. While several studies have addressed this challenging task from a sequence-to-sequence perspective, most of them have focused on a single area without considering the existing correlation between multiple disjoint regions. In this paper, we formulate precipitation nowcasting as a spatiotemporal graph sequence nowcasting problem. In particular, we introduce Graph Dual-stream Convolutional Attention Fusion (GD-CAF), a novel approach designed to learn from historical spatiotemporal graph of precipitation maps and nowcast future time step ahead precipitation at different spatial locations. GD-CAF consists of spatio-temporal convolutional attention as well as gated fusion modules which are equipped with depthwise-separable convolutional operations. This enhancement enables the model to directly process the high-dimensional spatiotemporal graph of precipitation maps and exploits higher-order correlations between the data dimensions. We evaluate our model on seven years of precipitation maps across Europe and its neighboring areas collected from the ERA5 dataset, provided by Copernicus. The model receives a fully connected graph in which each node represents historical observations from a specific region on the map. Consequently, each node contains a 3D tensor with time, height, and width dimensions. Experimental results demonstrate that the proposed GD-CAF model outperforms the other examined models. Furthermore, the averaged seasonal spatial and temporal attention scores over the test set are visualized to provide additional insights about the strongest connections between different regions or time steps. These visualizations shed light on the decision-making process of our model.

相關內容

圖

關注 6

掩碼語言模型化 · 語言模型化 · MoDELS · 掩碼 · 向量化 ·

2024 年 2 月 27 日

NextLevelBERT: Investigating Masked Language Modeling with Higher-Level Representations for Long Documents

Tamara Czinczoll,Christoph H?nes,Maximilian Schall,Gerard de Melo

While (large) language models have significantly improved over the last years, they still struggle to sensibly process long sequences found, e.g., in books, due to the quadratic scaling of the underlying attention mechanism. To address this, we propose NextLevelBERT, a Masked Language Model operating not on tokens, but on higher-level semantic representations in the form of text embeddings. We pretrain NextLevelBERT to predict the vector representation of entire masked text chunks and evaluate the effectiveness of the resulting document vectors on three task types: 1) Semantic Textual Similarity via zero-shot document embeddings, 2) Long document classification, 3) Multiple-choice question answering. We find that next level Masked Language Modeling is an effective technique to tackle long-document use cases and can outperform much larger embedding models as long as the required level of detail is not too high. We make model and code available.

視頻描述生成（Video Caption） · 多峰值 · Learning · 知識 (knowledge) · Performer ·

2024 年 2 月 27 日

MCF-VC: Mitigate Catastrophic Forgetting in Class-Incremental Learning for Multimodal Video Captioning

Huiyu Xiong,Lanxiao Wang,Heqian Qiu,Taijin Zhao,Benliu Qiu,Hongliang Li

from arxiv, 13 pages

To address the problem of catastrophic forgetting due to the invisibility of old categories in sequential input, existing work based on relatively simple categorization tasks has made some progress. In contrast, video captioning is a more complex task in multimodal scenario, which has not been explored in the field of incremental learning. After identifying this stability-plasticity problem when analyzing video with sequential input, we originally propose a method to Mitigate Catastrophic Forgetting in class-incremental learning for multimodal Video Captioning (MCF-VC). As for effectively maintaining good performance on old tasks at the macro level, we design Fine-grained Sensitivity Selection (FgSS) based on the Mask of Linear's Parameters and Fisher Sensitivity to pick useful knowledge from old tasks. Further, in order to better constrain the knowledge characteristics of old and new tasks at the specific feature level, we have created the Two-stage Knowledge Distillation (TsKD), which is able to learn the new task well while weighing the old task. Specifically, we design two distillation losses, which constrain the cross modal semantic information of semantic attention feature map and the textual information of the final outputs respectively, so that the inter-model and intra-model stylized knowledge of the old class is retained while learning the new class. In order to illustrate the ability of our model to resist forgetting, we designed a metric CIDER_t to detect the stage forgetting rate. Our experiments on the public dataset MSR-VTT show that the proposed method significantly resists the forgetting of previous tasks without replaying old samples, and performs well on the new task.

Facebook AI Research · GROUP · Everything（軟件） · 情景 · MoDELS ·

2024 年 2 月 26 日

FRAPPé: A Group Fairness Framework for Post-Processing Everything

Alexandru ?ifrea,Preethi Lahoti,Ben Packer,Yoni Halpern,Ahmad Beirami,Flavien Prost

from arxiv, Presubmission

Despite achieving promising fairness-error trade-offs, in-processing mitigation techniques for group fairness cannot be employed in numerous practical applications with limited computation resources or no access to the training pipeline of the prediction model. In these situations, post-processing is a viable alternative. However, current methods are tailored to specific problem settings and fairness definitions and hence, are not as broadly applicable as in-processing. In this work, we propose a framework that turns any regularized in-processing method into a post-processing approach. This procedure prescribes a way to obtain post-processing techniques for a much broader range of problem settings than the prior post-processing literature. We show theoretically and through extensive experiments that our framework preserves the good fairness-error trade-offs achieved with in-processing and can improve over the effectiveness of prior post-processing methods. Finally, we demonstrate several advantages of a modular mitigation strategy that disentangles the training of the prediction model from the fairness mitigation, including better performance on tasks with partial group labels.

優化器 · 推斷 · 可辨認的 · 賭博機/老虎機 · 在線 ·

2024 年 2 月 26 日

Online Causal Inference for Advertising in Real-Time Bidding Auctions

Caio Waisman,Harikesh S. Nair,Carlos Carrion

Real-time bidding (RTB) systems, which utilize auctions to allocate user impressions to competing advertisers, continue to enjoy success in digital advertising. Assessing the effectiveness of such advertising remains a challenge in research and practice. This paper proposes a new approach to perform causal inference on advertising bought through such mechanisms. Leveraging the economic structure of first- and second-price auctions, we first show that the effects of advertising are identified by the optimal bids. Hence, since these optimal bids are the only objects that need to be recovered, we introduce an adapted Thompson sampling (TS) algorithm to solve a multi-armed bandit problem that succeeds in recovering such bids and, consequently, the effects of advertising while minimizing the costs of experimentation. We derive a regret bound for our algorithm which is order optimal and use data from RTB auctions to show that it outperforms commonly used methods that estimate the effects of advertising.

三維重建 · Processing（編程語言） · 3D · Automator · INTERACT ·

2024 年 2 月 25 日

GenNBV: Generalizable Next-Best-View Policy for Active 3D Reconstruction

Xiao Chen,Quanyi Li,Tai Wang,Tianfan Xue,Jiangmiao Pang

While recent advances in neural radiance field enable realistic digitization for large-scale scenes, the image-capturing process is still time-consuming and labor-intensive. Previous works attempt to automate this process using the Next-Best-View (NBV) policy for active 3D reconstruction. However, the existing NBV policies heavily rely on hand-crafted criteria, limited action space, or per-scene optimized representations. These constraints limit their cross-dataset generalizability. To overcome them, we propose GenNBV, an end-to-end generalizable NBV policy. Our policy adopts a reinforcement learning (RL)-based framework and extends typical limited action space to 5D free space. It empowers our agent drone to scan from any viewpoint, and even interact with unseen geometries during training. To boost the cross-dataset generalizability, we also propose a novel multi-source state embedding, including geometric, semantic, and action representations. We establish a benchmark using the Isaac Gym simulator with the Houses3K and OmniObject3D datasets to evaluate this NBV policy. Experiments demonstrate that our policy achieves a 98.26% and 97.12% coverage ratio on unseen building-scale objects from these datasets, respectively, outperforming prior solutions.

QoS · 非凸 · 變換 · 等變 · 約束 ·

2024 年 2 月 25 日

HPE Transformer: Learning to Optimize Multi-Group Multicast Beamforming Under Nonconvex QoS Constraints

Yang Li,Ya-Feng Liu

This paper studies the quality-of-service (QoS) constrained multi-group multicast beamforming design problem, where each multicast group is composed of a number of users requiring the same content. Due to the nonconvex QoS constraints, this problem is nonconvex and NP-hard. While existing optimization-based iterative algorithms can obtain a suboptimal solution, their iterative nature results in large computational complexity and delay. To facilitate real-time implementations, this paper proposes a deep learning-based approach, which consists of a beamforming structure assisted problem transformation and a customized neural network architecture named hierarchical permutation equivariance (HPE) transformer. The proposed HPE transformer is proved to be permutation equivariant with respect to the users within each multicast group, and also permutation equivariant with respect to different multicast groups. Simulation results demonstrate that the proposed HPE transformer outperforms state-of-the-art optimization-based and deep learning-based approaches for multi-group multicast beamforming design in terms of the total transmit power, the constraint violation, and the computational time. In addition, the proposed HPE transformer achieves pretty good generalization performance on different numbers of users, different numbers of multicast groups, and different signal-to-interference-plus-noise ratio targets.

狀態估計 · 估計/估計量 · 可辨認的 · AIM · Integration ·

2024 年 2 月 24 日

Privacy-Preserving State Estimation in the Presence of Eavesdroppers: A Survey

Xinhao Yan,Guanzhong Zhou,Daniel E. Quevedo,Carlos Murguia,Bo Chen,Hailong Huang

from arxiv, 16 pages, 5 figures, 4 tables

Networked systems are increasingly the target of cyberattacks that exploit vulnerabilities within digital communications, embedded hardware, and software. Arguably, the simplest class of attacks -- and often the first type before launching destructive integrity attacks -- are eavesdropping attacks, which aim to infer information by collecting system data and exploiting it for malicious purposes. A key technology of networked systems is state estimation, which leverages sensing and actuation data and first-principles models to enable trajectory planning, real-time monitoring, and control. However, state estimation can also be exploited by eavesdroppers to identify models and reconstruct states with the aim of, e.g., launching integrity (stealthy) attacks and inferring sensitive information. It is therefore crucial to protect disclosed system data to avoid an accurate state estimation by eavesdroppers. This survey presents a comprehensive review of existing literature on privacy-preserving state estimation methods, while also identifying potential limitations and research gaps. Our primary focus revolves around three types of methods: cryptography, data perturbation, and transmission scheduling, with particular emphasis on Kalman-like filters. Within these categories, we delve into the concepts of homomorphic encryption and differential privacy, which have been extensively investigated in recent years in the context of privacy-preserving state estimation. Finally, we shed light on several technical and fundamental challenges surrounding current methods and propose potential directions for future research.

知識 (knowledge) · 大語言模型 · 語言模型化 · MoDELS · 圖 ·

2024 年 2 月 23 日

KICGPT: Large Language Model with Knowledge in Context for Knowledge Graph Completion

Yanbin Wei,Qiushi Huang,James T. Kwok,Yu Zhang

from arxiv, Accepted to EMNLP 2023 Findings

Knowledge Graph Completion (KGC) is crucial for addressing knowledge graph incompleteness and supporting downstream applications. Many models have been proposed for KGC. They can be categorized into two main classes: triple-based and text-based approaches. Triple-based methods struggle with long-tail entities due to limited structural information and imbalanced entity distributions. Text-based methods alleviate this issue but require costly training for language models and specific finetuning for knowledge graphs, which limits their efficiency. To alleviate these limitations, in this paper, we propose KICGPT, a framework that integrates a large language model (LLM) and a triple-based KGC retriever. It alleviates the long-tail problem without incurring additional training overhead. KICGPT uses an in-context learning strategy called Knowledge Prompt, which encodes structural knowledge into demonstrations to guide the LLM. Empirical results on benchmark datasets demonstrate the effectiveness of KICGPT with smaller training overhead and no finetuning.

TransAct · Processing（編程語言） · 塊 · 控制器 · 區塊鏈 ·

2024 年 2 月 23 日

A Two-Layer Blockchain Sharding Protocol Leveraging Safety and Liveness for Enhanced Performance

Yibin Xu,Jingyi Zheng,Boris Düdder,Tijs Slaats,Yongluan Zhou

from arxiv, The paper has been accepted to Network and Distributed System Security (NDSS) Symposium 2024

Sharding is essential for improving blockchain scalability. Existing protocols overlook diverse adversarial attacks, limiting transaction throughput. This paper presents Reticulum, a groundbreaking sharding protocol addressing this issue, boosting blockchain scalability. Reticulum employs a two-phase approach, adapting transaction throughput based on runtime adversarial attacks. It comprises "control" and "process" shards in two layers. Process shards contain at least one trustworthy node, while control shards have a majority of trusted nodes. In the first phase, transactions are written to blocks and voted on by nodes in process shards. Unanimously accepted blocks are confirmed. In the second phase, blocks without unanimous acceptance are voted on by control shards. Blocks are accepted if the majority votes in favor, eliminating first-phase opponents and silent voters. Reticulum uses unanimous voting in the first phase, involving fewer nodes, enabling more parallel process shards. Control shards finalize decisions and resolve disputes. Experiments confirm Reticulum's innovative design, providing high transaction throughput and robustness against various network attacks, outperforming existing sharding protocols for blockchain networks.

自動問答 · Performer · 多峰值 · MoDELS · 可辨認的 ·

2024 年 2 月 19 日

RJUA-MedDQA: A Multimodal Benchmark for Medical Document Question Answering and Clinical Reasoning

Congyun Jin,Ming Zhang,Xiaowei Ma,Li Yujiao,Yingbo Wang,Yabo Jia,Yuliang Du,Tao Sun,Haowen Wang,Cong Fan,Jinjie Gu,Chenfei Chi,Xiangguo Lv,Fangzhou Li,Wei Xue,Yiran Huang

from arxiv, 15 pages, 13 figures

Recent advancements in Large Language Models (LLMs) and Large Multi-modal Models (LMMs) have shown potential in various medical applications, such as Intelligent Medical Diagnosis. Although impressive results have been achieved, we find that existing benchmarks do not reflect the complexity of real medical reports and specialized in-depth reasoning capabilities. In this work, we introduced RJUA-MedDQA, a comprehensive benchmark in the field of medical specialization, which poses several challenges: comprehensively interpreting imgage content across diverse challenging layouts, possessing numerical reasoning ability to identify abnormal indicators and demonstrating clinical reasoning ability to provide statements of disease diagnosis, status and advice based on medical contexts. We carefully design the data generation pipeline and proposed the Efficient Structural Restoration Annotation (ESRA) Method, aimed at restoring textual and tabular content in medical report images. This method substantially enhances annotation efficiency, doubling the productivity of each annotator, and yields a 26.8% improvement in accuracy. We conduct extensive evaluations, including few-shot assessments of 5 LMMs which are capable of solving Chinese medical QA tasks. To further investigate the limitations and potential of current LMMs, we conduct comparative experiments on a set of strong LLMs by using image-text generated by ESRA method. We report the performance of baselines and offer several observations: (1) The overall performance of existing LMMs is still limited; however LMMs more robust to low-quality and diverse-structured images compared to LLMs. (3) Reasoning across context and image content present significant challenges. We hope this benchmark helps the community make progress on these challenging tasks in multi-modal medical document understanding and facilitate its application in healthcare.