
In this paper, we tackle the incremental maintenance of Datalog inference materialisation when the rule set can be updated. This is particularly relevant in the context of the Internet of Things and Edge computing where smart devices may need to reason over newly acquired knowledge represented as Datalog rules. Our solution is based on an adaptation of a stratification strategy applied to a dependency hypergraph whose nodes correspond to rule sets in a Datalog program. Our implementation supports recursive rules containing both negation and aggregation. We demonstrate the effectiveness of our system on real and synthetic data.
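As a rough illustration of the stratification idea mentioned above, the following sketch computes strata for a Datalog program with negation using the classic predicate-level algorithm. This is not the authors' hypergraph-over-rule-sets construction, which the abstract does not detail; the rule encoding `(head, [(body_predicate, is_negated), ...])` and the predicate names are assumptions made for the example.

```python
def stratify(rules):
    """Assign a stratum to every predicate (classic textbook algorithm).

    `rules` is a list of (head, body) pairs, where body is a list of
    (predicate, is_negated) pairs. This encoding is hypothetical.
    Raises ValueError if negation occurs inside a recursive cycle.
    """
    predicates = set()
    for head, body in rules:
        predicates.add(head)
        predicates.update(pred for pred, _ in body)

    stratum = {pred: 0 for pred in predicates}
    limit = len(predicates)  # a stratifiable program never needs more strata

    changed = True
    while changed:
        changed = False
        for head, body in rules:
            for pred, negated in body:
                # A negated dependency must sit in a strictly lower stratum.
                needed = stratum[pred] + 1 if negated else stratum[pred]
                if stratum[head] < needed:
                    if needed >= limit:
                        raise ValueError("program is not stratifiable")
                    stratum[head] = needed
                    changed = True
    return stratum


# Hypothetical program: reachability plus its negation.
program = [
    ("reachable", [("edge", False)]),
    ("reachable", [("reachable", False), ("edge", False)]),
    ("unreachable", [("node", False), ("reachable", True)]),
]
strata = stratify(program)

# Updating the rule set means the strata must be rechecked: this extra rule
# makes `reachable` depend negatively on itself through `unreachable`.
updated_program = program + [("reachable", [("unreachable", True)])]
```

In this toy setting, a rule update is handled by recomputing the strata from scratch. An incremental system would instead aim to re-evaluate only the strata affected by the change.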

Related content

February 12, 2024

Graphical Proof Theory I: Sequent Systems on Undirected Graphs

Matteo Acclavio

In this paper we explore the design of sequent calculi operating on graphs. For this purpose, we introduce a set of logical connectives allowing us to extend the correspondence between cographs and classical propositional formulas to any graph. We then provide sequent calculi operating on these formulas, and we prove cut-elimination and that formulas encoding the same graph are logically equivalent. We show that these systems provide conservative extensions of multiplicative linear logic (with and without mix) and classical propositional logic. We conclude by showing that one of these systems is equivalent to the graphical logic GS defined via a system of context-free graph rewiring rules, therefore providing an alternative proof of analyticity for this logic over graphs.

February 12, 2024

DART: A Compact Platform For Autonomous Driving Research

Lorenzo Lyons, Thijs Niesten, Laura Ferranti

from arxiv, 8 pages, 10 figures

This paper presents the design of a research platform for autonomous driving applications, the Delft's Autonomous-driving Robotic Testbed (DART). Our goal was to design a small-scale car-like robot equipped with all the hardware needed for on-board navigation and control while keeping it cost-effective and easy to replicate. To develop DART, we built on an existing off-the-shelf model and augmented its sensor suite to improve its capabilities for control and motion planning tasks. We detail the hardware setup and the system identification challenges to derive the vehicle's models. Furthermore, we present some use cases where we used DART to test different motion planning applications to show the versatility of the platform. Finally, we provide a git repository with all the details to replicate DART, complete with a simulation environment and the data used for system identification.

February 12, 2024

Symbolic Numeric Planning with Patterns

Matteo Cardellini, Enrico Giunchiglia, Marco Maratea

from arxiv, Accepted at AAAI24

In this paper, we propose a novel approach for solving linear numeric planning problems, called Symbolic Pattern Planning. Given a planning problem $\Pi$, a bound $n$ and a pattern -- defined as an arbitrary sequence of actions -- we encode the problem of finding a plan for $\Pi$ with bound $n$ as a formula with fewer variables and/or clauses than the state-of-the-art rolled-up and relaxed-relaxed-$\exists$ encodings. More importantly, we prove that for any given bound, it is never the case that the latter two encodings allow finding a valid plan while ours does not. On the experimental side, we consider 6 other planning systems -- including the ones which participated in this year's International Planning Competition (IPC) -- and we show that our planner Patty has remarkably good comparative performances on this year's IPC problems.

February 12, 2024

InfiAgent-DABench: Evaluating Agents on Data Analysis Tasks

Xueyu Hu, Ziyu Zhao, Shuang Wei, Ziwei Chai, Qianli Ma, Guoyin Wang, Xuwu Wang, Jing Su, Jingjing Xu, Ming Zhu, Yao Cheng, Jianbo Yuan, Jiwei Li, Kun Kuang, Yang Yang, Hongxia Yang, Fei Wu

from arxiv, 27 pages, 7 figures, work in progress

In this paper, we introduce InfiAgent-DABench, the first benchmark specifically designed to evaluate LLM-based agents on data analysis tasks. These tasks require agents to solve complex problems end-to-end by interacting with an execution environment. The benchmark contains DAEval, a dataset consisting of 257 data analysis questions derived from 52 CSV files, and an agent framework which incorporates LLMs to serve as data analysis agents for both serving and evaluation. Since data analysis questions are often open-ended and hard to evaluate without human supervision, we adopt a format-prompting technique to convert each question into a closed-form format so that it can be automatically evaluated. Our extensive benchmarking of 34 LLMs uncovers the current challenges encountered in data analysis tasks. In addition, building on top of our agent framework, we develop a specialized agent, DAAgent, which surpasses GPT-3.5 by 3.9% on DABench. Evaluation datasets and toolkits for InfiAgent-DABench are released at https://github.com/InfiAgent/InfiAgent .

February 11, 2024

ODIN: Disentangled Reward Mitigates Hacking in RLHF

Lichang Chen, Chen Zhu, Davit Soselia, Jiuhai Chen, Tianyi Zhou, Tom Goldstein, Heng Huang, Mohammad Shoeybi, Bryan Catanzaro

In this work, we study the issue of reward hacking on the response length, a challenge emerging in Reinforcement Learning from Human Feedback (RLHF) on LLMs. A well-formatted, verbose but less helpful response from an LLM can often deceive LLM or even human evaluators into awarding high scores. The same issue also holds for some reward models in RL. To address the challenges in both training and evaluation, we establish a more reliable evaluation protocol for comparing different training configurations, which inspects the trade-off between LLM evaluation score and response length obtained by varying training hyperparameters. Based on this evaluation, we conduct large-scale studies, whose results shed light on the efficacy of hyperparameters and tricks used in RL for mitigating length bias. We further propose to improve the reward model by jointly training two linear heads on shared feature representations to predict the rewards, one trained to correlate with length, and the other trained to decorrelate with length and therefore focus more on the actual content. We then discard the length head in RL to prevent reward hacking on length. Experiments demonstrate that our approach almost eliminates the reward correlation with length, and improves the obtained policy by a significant margin.
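The two-head idea can be made concrete with a small sketch. It assumes each response is already summarised by a feature vector, and it uses a Pearson-correlation penalty as a stand-in objective. The function names, the toy features and the exact penalty form are assumptions for illustration, not the paper's training loss.

```python
import math


def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    std_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    std_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    if std_x == 0.0 or std_y == 0.0:
        return 0.0
    return cov / (std_x * std_y)


def linear_head(features, weights):
    """Apply one linear head to shared features: one scalar reward per response."""
    return [sum(f * w for f, w in zip(row, weights)) for row in features]


def disentangle_penalty(length_rewards, quality_rewards, lengths):
    """Assumed penalty: push the length head towards correlation with length
    and the quality head towards zero correlation with length."""
    return -pearson(length_rewards, lengths) + abs(pearson(quality_rewards, lengths))


# Toy shared features: column 0 tracks response length, column 1 is a
# content signal constructed to be uncorrelated with length.
lengths = [1.0, 2.0, 3.0, 4.0]
content = [1.0, -1.0, -1.0, 1.0]
features = [[l, c] for l, c in zip(lengths, content)]

length_rewards = linear_head(features, [1.0, 0.0])   # hypothetical length head
quality_rewards = linear_head(features, [0.0, 1.0])  # hypothetical quality head

penalty = disentangle_penalty(length_rewards, quality_rewards, lengths)
```

In this toy setup the quality head's rewards are uncorrelated with length, so only that head would be kept when scoring responses during RL, mirroring the step of discarding the length head.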

February 9, 2024

Mimicking Better by Matching the Approximate Action Distribution

João A. Candido Ramos, Lionel Blondé, Naoya Takeishi, Alexandros Kalousis

In this paper, we introduce MAAD, a novel, sample-efficient on-policy algorithm for Imitation Learning from Observations. MAAD utilizes a surrogate reward signal, which can be derived from various sources such as adversarial games, trajectory matching objectives, or optimal transport criteria. To compensate for the non-availability of expert actions, we rely on an inverse dynamics model that infers a plausible action distribution given the expert's state-state transitions; we regularize the imitator's policy by aligning it to the inferred action distribution. MAAD leads to significantly improved sample efficiency and stability. We demonstrate its effectiveness in a number of MuJoCo environments, both in the OpenAI Gym and the DeepMind Control Suite. We show that it requires considerably fewer interactions to achieve expert performance, outperforming current state-of-the-art on-policy methods. Remarkably, MAAD often stands out as the sole method capable of attaining expert performance levels, underscoring its simplicity and efficacy.

February 9, 2024

InternLM-Math: Open Math Large Language Models Toward Verifiable Reasoning

Huaiyuan Ying, Shuo Zhang, Linyang Li, Zhejian Zhou, Yunfan Shao, Zhaoye Fei, Yichuan Ma, Jiawei Hong, Kuikun Liu, Ziyi Wang, Yudong Wang, Zijian Wu, Shuaibin Li, Fengzhe Zhou, Hongwei Liu, Songyang Zhang, Wenwei Zhang, Hang Yan, Xipeng Qiu, Jiayu Wang, Kai Chen, Dahua Lin

The math abilities of large language models can represent their abstract reasoning ability. In this paper, we introduce and open-source our math reasoning LLMs, InternLM-Math, which are obtained by continued pre-training from InternLM2. We unify chain-of-thought reasoning, reward modeling, formal reasoning, data augmentation, and code interpretation in a single seq2seq format and supervise our model to be a versatile math reasoner, verifier, prover, and augmenter. These abilities can be used to develop the next math LLMs or for self-iteration. InternLM-Math obtains open-sourced state-of-the-art performance under the settings of in-context learning, supervised fine-tuning, and code-assisted reasoning on various informal and formal benchmarks including GSM8K, MATH, Hungary math exam, MathBench-ZH, and MiniF2F. Our pre-trained model achieves 30.3 on the MiniF2F test set without fine-tuning. We further explore how to use LEAN to solve math problems and study its performance under the setting of multi-task learning, which shows the possibility of using LEAN as a unified platform for solving and proving in math. Our models, codes, and data are released at https://github.com/InternLM/InternLM-Math.

May 10, 2021

Poolingformer: Long Document Modeling with Pooling Attention

Hang Zhang, Yeyun Gong, Yelong Shen, Weisheng Li, Jiancheng Lv, Nan Duan, Weizhu Chen

from arxiv, Accepted by ICML 2021

In this paper, we introduce a two-level attention schema, Poolingformer, for long document modeling. Its first level uses a smaller sliding window pattern to aggregate information from neighbors. Its second level employs a larger window to increase receptive fields with pooling attention to reduce both computational cost and memory consumption. We first evaluate Poolingformer on two long sequence QA tasks: the monolingual NQ and the multilingual TyDi QA. Experimental results show that Poolingformer sits atop three official leaderboards measured by F1, outperforming previous state-of-the-art models by 1.9 points (79.8 vs. 77.9) on NQ long answer, 1.9 points (79.5 vs. 77.6) on TyDi QA passage answer, and 1.6 points (67.6 vs. 66.0) on TyDi QA minimal answer. We further evaluate Poolingformer on a long sequence summarization task. Experimental results on the arXiv benchmark continue to demonstrate its superior performance.
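The two-level scheme can be sketched in plain Python. This is a simplified illustration, not the Poolingformer implementation: it uses no learned projections (queries, keys and values are the raw vectors), mean pooling as the pooling operator, and hypothetical window and pool sizes.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attend(query, keys, values):
    """Scaled dot-product attention for a single query vector."""
    scale = math.sqrt(len(query))
    scores = [sum(q * k for q, k in zip(query, key)) / scale for key in keys]
    weights = softmax(scores)
    dim = len(values[0])
    return [sum(w * v[d] for w, v in zip(weights, values)) for d in range(dim)]


def mean_pool(vectors, size):
    """Average consecutive groups of `size` vectors (the last group may be shorter)."""
    pooled = []
    for start in range(0, len(vectors), size):
        chunk = vectors[start:start + size]
        pooled.append([sum(col) / len(chunk) for col in zip(*chunk)])
    return pooled


def two_level_attention(xs, small_window=1, large_window=4, pool_size=2):
    """Level 1: attention over a small sliding window of neighbours.
    Level 2: attention over a larger window whose keys/values are pooled,
    so each token compares against fewer entries."""
    n = len(xs)
    level1 = []
    for i in range(n):
        lo, hi = max(0, i - small_window), min(n, i + small_window + 1)
        level1.append(attend(xs[i], xs[lo:hi], xs[lo:hi]))

    level2 = []
    for i in range(n):
        lo, hi = max(0, i - large_window), min(n, i + large_window + 1)
        pooled = mean_pool(level1[lo:hi], pool_size)
        level2.append(attend(level1[i], pooled, pooled))
    return level2


# Toy input: six identical 2-d token vectors.
tokens = [[1.0, 2.0] for _ in range(6)]
outputs = two_level_attention(tokens)
```

With `large_window=4` and `pool_size=2`, a token in the middle of a long sequence attends to 5 pooled entries at the second level instead of 9 raw tokens, which is the source of the cost reduction described above.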

October 6, 2020

CoDEx: A Comprehensive Knowledge Graph Completion Benchmark

Tara Safavi, Danai Koutra

from arxiv, EMNLP 2020

We present CoDEx, a set of knowledge graph completion datasets extracted from Wikidata and Wikipedia that improve upon existing knowledge graph completion benchmarks in scope and level of difficulty. In terms of scope, CoDEx comprises three knowledge graphs varying in size and structure, multilingual descriptions of entities and relations, and tens of thousands of hard negative triples that are plausible but verified to be false. To characterize CoDEx, we contribute thorough empirical analyses and benchmarking experiments. First, we analyze each CoDEx dataset in terms of logical relation patterns. Next, we report baseline link prediction and triple classification results on CoDEx for five extensively tuned embedding models. Finally, we differentiate CoDEx from the popular FB15K-237 knowledge graph completion dataset by showing that CoDEx covers more diverse and interpretable content, and is a more difficult link prediction benchmark. Data, code, and pretrained models are available at https://bit.ly/2EPbrJs.

March 11, 2020

Imbalance Problems in Object Detection: A Review

Kemal Oksuz, Baris Can Cam, Sinan Kalkan, Emre Akbas

from arxiv, Accepted to IEEE TPAMI; currently in press

In this paper, we present a comprehensive review of the imbalance problems in object detection. To analyze the problems in a systematic manner, we introduce a problem-based taxonomy. Following this taxonomy, we discuss each problem in depth and present a unifying yet critical perspective on the solutions in the literature. In addition, we identify major open issues regarding the existing imbalance problems as well as imbalance problems that have not been discussed before. Moreover, in order to keep our review up to date, we provide an accompanying webpage which catalogs papers addressing imbalance problems, according to our problem-based taxonomy. Researchers can track newer studies on this webpage available at: https://github.com/kemaloksuz/ObjectDetectionImbalance .