
Approaches to recommendation are typically evaluated in one of two ways: (1) via a (simulated) online experiment, often seen as the gold standard, or (2) via some offline evaluation procedure, where the goal is to approximate the outcome of an online experiment. Several offline evaluation metrics have been adopted in the literature, inspired by ranking metrics prevalent in the field of Information Retrieval. (Normalised) Discounted Cumulative Gain (nDCG) is one such metric that has seen widespread adoption in empirical studies, and higher (n)DCG values have been used to present new methods as the state-of-the-art in top-$n$ recommendation for many years. Our work takes a critical look at this approach, and investigates when we can expect such metrics to approximate the gold standard outcome of an online experiment. We formally present the assumptions that are necessary to consider DCG an unbiased estimator of online reward and provide a derivation for this metric from first principles, highlighting where we deviate from its traditional uses in IR. Importantly, we show that normalising the metric renders it inconsistent, in that even when DCG is unbiased, ranking competing methods by their normalised DCG can invert their relative order. Through a correlation analysis between off- and on-line experiments conducted on a large-scale recommendation platform, we show that our unbiased DCG estimates strongly correlate with online reward, even when some of the metric's inherent assumptions are violated. This statement no longer holds for its normalised variant, suggesting that nDCG's practical utility may be limited.
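To make the inconsistency concrete, here is a minimal sketch with our own toy numbers (not the paper's experiments): two hypothetical methods are evaluated on two queries, and averaging per-query nDCG inverts the order given by averaging DCG, because nDCG renormalises each query by its own ideal DCG before averaging.

```python
# Toy illustration: average DCG and average nDCG can rank two
# recommenders in opposite orders.
from math import log2

def dcg(rels):
    """DCG with the standard log2 position discount."""
    return sum(rel / log2(rank + 2) for rank, rel in enumerate(rels))

def ndcg(rels):
    """DCG normalised by the ideal (best-possible) DCG for this query."""
    return dcg(rels) / dcg(sorted(rels, reverse=True))

# Two hypothetical queries: query 1 has high-relevance items available,
# query 2 has a single mildly relevant item. Method A nails the
# "big win available" query 1; method B nails the low-stakes query 2.
method_a = [[3, 2, 0], [0, 0, 1]]   # rankings given as relevance lists
method_b = [[0, 2, 3], [1, 0, 0]]

for name, runs in [("A", method_a), ("B", method_b)]:
    mean_dcg = sum(dcg(r) for r in runs) / len(runs)
    mean_ndcg = sum(ndcg(r) for r in runs) / len(runs)
    print(f"method {name}: mean DCG={mean_dcg:.3f}, mean nDCG={mean_ndcg:.3f}")

# Method A wins on mean DCG (2.381 vs 1.881), but method B wins on mean
# nDCG (0.824 vs 0.750): normalisation inverts the relative order.
```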

Related Content

Discrete & Computational Geometry (DCG) is an international journal of mathematics and computer science covering a broad range of topics in which geometry plays a fundamental role. It publishes geometric papers on topics such as: polytopes, spatial subdivisions, packing, covering and tiling, configurations and arrangements, and geometric graphs; geometric algorithms and their complexity, convex hulls, Voronoi diagrams, Delaunay triangulations, and range searching; solid modeling, computer graphics, image processing, pattern recognition, and motion planning; computational topology, discrete differential geometry, geometric probability, and real algebraic geometry. The journal also accepts papers with a distinct geometric flavor in areas such as graph theory, mathematical programming, combinatorial optimization, algebraic geometry, digital geometry, crystallography, data analysis, machine learning, and robotics. The journal further encourages additional material such as short videos, animated graphics, and similar electronic supplementary material. Official website:

In response to the evolving landscape of data storage, researchers have increasingly explored non-traditional platforms, with DNA-based storage emerging as a cutting-edge solution. Our work is motivated by the potential of in-vivo DNA storage, known for its capacity to store vast amounts of information efficiently and confidentially within an organism's native DNA. While promising, in-vivo DNA storage faces challenges, including susceptibility to errors introduced by mutations. To understand the long-term behavior of such mutation systems, we investigate the frequency of $k$-tuples after multiple mutation applications. Drawing inspiration from related works, we generalize results from the study of mutation systems, particularly focusing on the frequency of $k$-tuples. In this work, we provide a broad analysis through the construction of a specialized matrix and the identification of its eigenvectors. In the context of substitution and duplication systems, we leverage previous results on almost sure convergence, equating the expected frequency to the limiting frequency. Moreover, we demonstrate convergence in probability under certain assumptions.
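As a toy illustration of the eigenvector idea, simplified by us to k=1 (single-symbol frequencies) and assuming numpy (this is not the paper's general k-tuple construction): for a pure substitution system, expected symbol frequencies evolve linearly under the substitution matrix, so the limiting frequency is the eigenvector for eigenvalue 1.

```python
import numpy as np

# Hypothetical per-round substitution probabilities over {A,C,G,T}:
# each row is the distribution of what a symbol becomes in one round.
M = np.array([
    [0.94, 0.02, 0.02, 0.02],
    [0.02, 0.94, 0.02, 0.02],
    [0.01, 0.01, 0.97, 0.01],
    [0.03, 0.03, 0.03, 0.91],
])

# Frequencies are row vectors: f_{t+1} = f_t @ M, so the limiting
# frequency is the left eigenvector of M with eigenvalue 1.
vals, vecs = np.linalg.eig(M.T)
lead = np.argmin(np.abs(vals - 1.0))
stationary = np.real(vecs[:, lead])
stationary /= stationary.sum()            # normalise to a distribution

f = np.array([0.25, 0.25, 0.25, 0.25])    # uniform initial frequencies
for _ in range(500):                      # iterate mutation rounds
    f = f @ M
print(np.allclose(f, stationary, atol=1e-6))  # True: iterates converge
```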

Many real-world optimization problems are formulated as mixed-variable optimization problems (MVOPs), which involve both continuous and discrete variables. MVOPs that include dimensional variables are characterized by a variable-size search space: depending on the values of the dimensional variables, the number and type of the problem's variables can vary dynamically. MVOPs and variable-size MVOPs (VMVOPs) are difficult to solve and raise a number of scientific challenges in the design of metaheuristics. Standard metaheuristics were originally designed to address continuous or discrete optimization problems and cannot tackle (V)MVOPs efficiently. The development of metaheuristics for solving such problems has attracted the attention of many researchers and is increasingly popular. However, to our knowledge there is no well-established taxonomy or comprehensive survey for handling this important family of optimization problems. This paper presents a unified taxonomy of metaheuristic solutions for solving (V)MVOPs in an attempt to provide a common terminology and classification mechanism. It provides a general mathematical formulation and concepts of (V)MVOPs, and identifies the various solving methodologies that can be applied in metaheuristics. The advantages, weaknesses, and limitations of the presented methodologies are discussed. The proposed taxonomy also makes it possible to identify open research issues that need further in-depth investigation.
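As a minimal illustration of what "variable-size" means here (the names and numbers below are hypothetical, not taken from the survey): a dimensional variable controls how many continuous and discrete decision variables a candidate solution actually carries.

```python
import random
from dataclasses import dataclass, field

@dataclass
class BeamDesign:
    n_segments: int                                 # dimensional variable
    lengths: list = field(default_factory=list)     # continuous, one per segment
    materials: list = field(default_factory=list)   # discrete, one per segment

def random_solution(max_segments=5, catalogue=("steel", "alu", "wood")):
    """Sample a candidate: the dimensional variable fixes the sizes of the
    continuous and discrete parts, so the search space is variable-size."""
    n = random.randint(1, max_segments)
    return BeamDesign(
        n_segments=n,
        lengths=[random.uniform(0.1, 2.0) for _ in range(n)],
        materials=[random.choice(catalogue) for _ in range(n)],
    )

print(random_solution())  # e.g. BeamDesign(n_segments=3, lengths=[...], ...)
```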

Video moment retrieval (MR) and highlight detection (HD) based on natural language queries are two highly related tasks, which aim to obtain relevant moments within videos and highlight scores for each video clip. Recently, several methods have been devoted to building DETR-based networks to solve both MR and HD jointly. These methods simply add two separate task heads after multi-modal feature extraction and feature interaction, achieving good performance. Nevertheless, these approaches underutilize the reciprocal relationship between the two tasks. In this paper, we propose a task-reciprocal transformer based on DETR (TR-DETR) that focuses on exploring the inherent reciprocity between MR and HD. Specifically, a local-global multi-modal alignment module is first built to align features from diverse modalities into a shared latent space. Subsequently, a visual feature refinement module is designed to eliminate query-irrelevant information from visual features for modal interaction. Finally, a task cooperation module is constructed to refine the retrieval pipeline and the highlight score prediction process by utilizing the reciprocity between MR and HD. Comprehensive experiments on the QVHighlights, Charades-STA, and TVSum datasets demonstrate that TR-DETR outperforms existing state-of-the-art methods. Code is available at \url{//github.com/mingyao1120/TR-DETR}.
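As a generic illustration of cross-modal alignment into a shared space (our own toy sketch; TR-DETR's actual local-global module is more elaborate): clip-level similarities to the query resemble the HD side, while a pooled video-level similarity resembles the global match.

```python
import numpy as np

def cosine(a, b):
    """Row-wise cosine similarity between two feature matrices."""
    a = a / np.linalg.norm(a, axis=-1, keepdims=True)
    b = b / np.linalg.norm(b, axis=-1, keepdims=True)
    return a @ b.T

rng = np.random.default_rng(0)
clips = rng.normal(size=(8, 256))   # 8 clip features (hypothetical dims)
query = rng.normal(size=(1, 256))   # pooled text-query feature

local_scores = cosine(clips, query).squeeze(-1)        # per-clip (HD-like)
global_score = cosine(clips.mean(0, keepdims=True), query).item()  # video-level

print(local_scores.shape, global_score)
```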

The current landscape of research leveraging large language models (LLMs) is experiencing a surge. Many works harness the powerful reasoning capabilities of these models to comprehend various modalities, such as text, speech, images, and videos. They also utilize LLMs to understand human intention and generate desired outputs like images, videos, and music. However, research that combines both understanding and generation using LLMs is still limited and in its nascent stage. To address this gap, we introduce a Multi-modal Music Understanding and Generation (M$^{2}$UGen) framework that integrates an LLM's abilities to comprehend and generate music for different modalities. The M$^{2}$UGen framework is purpose-built to unlock creative potential from diverse sources of inspiration, encompassing music, images, and video, through the use of pretrained MERT, ViT, and ViViT models, respectively. To enable music generation, we explore the use of AudioLDM 2 and MusicGen. Bridging multi-modal understanding and music generation is accomplished through the integration of the LLaMA 2 model. Furthermore, we use the MU-LLaMA model to generate extensive datasets that support text/image/video-to-music generation, facilitating the training of our M$^{2}$UGen framework. We conduct a thorough evaluation of our proposed framework. The experimental results demonstrate that our model achieves or surpasses the performance of current state-of-the-art models.
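As a hedged sketch of how a frozen modality encoder can be bridged to an LLM (we assume a simple linear adapter with hypothetical dimensions; this is not the released M$^{2}$UGen code): encoder features are projected to the LLM's embedding width so they can be prepended as soft tokens.

```python
import numpy as np

rng = np.random.default_rng(0)
music_feats = rng.normal(size=(120, 768))   # e.g. MERT-style frame features
W = rng.normal(size=(768, 4096)) * 0.02     # trainable projection (hypothetical)

soft_tokens = music_feats @ W               # (120, 4096): LLM-width "tokens"
prefix = soft_tokens[::12]                  # subsample to a short prefix
print(prefix.shape)                         # (10, 4096), prepended to text tokens
```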

In the development of Artificial General Intelligence (AGI), a critical task for large models is interpreting and processing information from multiple image inputs. However, Large Multimodal Models (LMMs) encounter two issues in such scenarios: (1) a lack of fine-grained perception, and (2) a tendency to blend information across multiple images. We first extensively investigate the capability of LMMs to perceive fine-grained visual details when dealing with multiple input images. The research focuses on two aspects: first, image-to-image matching (to evaluate whether LMMs can effectively reason about and pair relevant images), and second, multi-image-to-text matching (to assess whether LMMs can accurately capture and summarize detailed image information). We conduct evaluations on a range of both open-source and closed-source large models, including GPT-4V, Gemini, OpenFlamingo, and MMICL. To enhance model performance, we further develop a Contrastive Chain-of-Thought (CoCoT) prompting approach for multi-input multimodal models. This method requires LMMs to compare the similarities and differences among multiple image inputs, and then guides the models to answer detailed questions about multi-image inputs based on the identified similarities and differences. Our experimental results showcase CoCoT's proficiency in enhancing the multi-image comprehension capabilities of large multimodal models.
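A sketch of what a CoCoT-style prompt might look like, reconstructed from the abstract's description (the paper's exact wording may differ):

```python
def cocot_prompt(question: str, n_images: int) -> str:
    """Build a contrastive chain-of-thought prompt: compare images first,
    then answer the question grounded in those comparisons."""
    image_refs = ", ".join(f"image {i + 1}" for i in range(n_images))
    return (
        f"You are given {n_images} images: {image_refs}.\n"
        "Step 1: Describe the similarities among the images.\n"
        "Step 2: Describe the differences among the images.\n"
        "Step 3: Using the similarities and differences you identified, "
        f"answer the following question about the images:\n{question}"
    )

print(cocot_prompt("Which image shows the red object on the left?", 2))
```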

Observers for PDEs are themselves PDEs. Therefore, producing real-time estimates with such observers is computationally burdensome. For finite-dimensional (ODE) systems, moving-horizon estimators (MHEs) are operators whose output is the state estimate, while their inputs are the initial state estimate at the beginning of the horizon as well as the measured output and input signals over the moving time horizon. In this paper we introduce MHEs for PDEs, which remove the need to solve an observer PDE numerically in real time. We accomplish this using the PDE backstepping method which, for certain classes of both hyperbolic and parabolic PDEs, produces moving-horizon state estimates explicitly. Precisely, to produce the state estimates explicitly, we employ a backstepping transformation of a hard-to-solve observer PDE into a target observer PDE that is explicitly solvable. The MHEs we propose are not new observer designs but simply explicit MHE realizations, over a moving horizon of arbitrary length, of existing backstepping observers. Our PDE MHEs lack the optimality of the MHEs that arose as duals of MPC, but they are given explicitly, even for PDEs. In the paper we provide explicit formulae for MHEs for both hyperbolic and parabolic PDEs, as well as simulation results that illustrate the theoretically guaranteed convergence of the MHEs.
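For intuition, a minimal worked example under textbook assumptions (benchmark heat equation with a homogeneous Dirichlet target system and one common transformation convention; this is not the paper's specific designs): the backstepping map connects the hard-to-solve observer-error PDE to a target PDE whose solution over a horizon of length $T$ is an explicit Fourier series, so the estimate at time $t$ is an explicit operator acting on the horizon-initial data.

```latex
\begin{align*}
  &\text{target system:} && w_t = w_{xx} - c\,w, \qquad w(0,t) = w(1,t) = 0,\\
  &\text{backstepping map:} && \tilde e(x,t) = w(x,t) - \int_x^1 p(x,y)\,w(y,t)\,dy,\\
  &\text{explicit horizon solution:} && w(x,t) = \sum_{n \ge 1} 2\,e^{-(c + n^2\pi^2)T}
    \sin(n\pi x) \int_0^1 \sin(n\pi y)\,w(y,\,t-T)\,dy.
\end{align*}
```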

In this paper, we introduce LLaVA-$\phi$ (LLaVA-Phi), an efficient multi-modal assistant that harnesses the power of the recently advanced small language model Phi-2 to facilitate multi-modal dialogues. LLaVA-Phi marks a notable advancement in the realm of compact multi-modal models. It demonstrates that even smaller language models, with as few as 2.7B parameters, can effectively engage in intricate dialogues that integrate both textual and visual elements, provided they are trained with high-quality corpora. Our model delivers commendable performance on publicly available benchmarks that encompass visual comprehension, reasoning, and knowledge-based perception. Beyond its remarkable performance in multi-modal dialogue tasks, our model opens new avenues for applications in time-sensitive environments and systems that require real-time interaction, such as embodied agents. It highlights the potential of smaller language models to achieve sophisticated levels of understanding and interaction while maintaining greater resource efficiency. The project is available at {//github.com/zhuyiche/llava-phi}.

This article presents the affordances that Generative Artificial Intelligence can offer in the context of disinformation, one of the major threats to our digitalized society. We present a research framework for generating customized agent-based social networks for disinformation simulations that would enable understanding and evaluating the phenomenon, whilst discussing open challenges.

Transformers have the potential to learn longer-term dependencies, but are limited by a fixed-length context in the setting of language modeling. We propose a novel neural architecture, Transformer-XL, that enables learning dependency beyond a fixed length without disrupting temporal coherence. It consists of a segment-level recurrence mechanism and a novel positional encoding scheme. Our method not only enables capturing longer-term dependency, but also resolves the context-fragmentation problem. As a result, Transformer-XL learns dependency that is 80% longer than RNNs and 450% longer than vanilla Transformers, achieves better performance on both short and long sequences, and is up to 1,800+ times faster than vanilla Transformers during evaluation. Notably, we improve the state-of-the-art results in bpc/perplexity to 0.99 on enwik8, 1.08 on text8, 18.3 on WikiText-103, 21.8 on One Billion Word, and 54.5 on Penn Treebank (without finetuning). When trained only on WikiText-103, Transformer-XL manages to generate reasonably coherent, novel text articles with thousands of tokens. Our code, pretrained models, and hyperparameters are available in both TensorFlow and PyTorch.
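A toy sketch of segment-level recurrence (our simplification: a single head, no relative positional encoding, which the full model requires): keys and values span the cached previous segment as well as the current one, extending the effective context beyond a single segment.

```python
import numpy as np

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def attend(h_current, memory, Wq, Wk, Wv):
    """Attention where keys/values cover the cached previous segment plus
    the current one, so queries can look back past the segment boundary."""
    context = np.concatenate([memory, h_current], axis=0)  # [mem + cur, d]
    q, k, v = h_current @ Wq, context @ Wk, context @ Wv
    scores = softmax(q @ k.T / np.sqrt(q.shape[-1]))
    return scores @ v

rng = np.random.default_rng(0)
d, seg = 16, 4
Wq, Wk, Wv = (rng.normal(size=(d, d)) * 0.1 for _ in range(3))
memory = np.zeros((0, d))                 # empty cache before segment 1

for segment in range(3):                  # process a long sequence in segments
    h = rng.normal(size=(seg, d))         # hidden states of this segment
    out = attend(h, memory, Wq, Wk, Wv)
    memory = h.copy()                     # cache; treated as constant (no grad)
print(out.shape)                          # (4, 16)
```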

Most existing work in visual question answering (VQA) is dedicated to improving the accuracy of predicted answers while disregarding explanations. We argue that the explanation for an answer is as important as, or even more important than, the answer itself, since it makes the question-answering process more understandable and traceable. To this end, we propose a new task of VQA-E (VQA with Explanation), where computational models are required to generate an explanation together with the predicted answer. We first construct a new dataset, and then frame the VQA-E problem in a multi-task learning architecture. Our VQA-E dataset is automatically derived from the VQA v2 dataset by intelligently exploiting the available captions. We have conducted a user study to validate the quality of the explanations synthesized by our method. We quantitatively show that the additional supervision from explanations not only produces insightful textual sentences to justify the answers, but also improves the performance of answer prediction. Our model outperforms state-of-the-art methods by a clear margin on the VQA v2 dataset.
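As a toy sketch of the multi-task framing (our own formulation with hypothetical sizes and weighting, not the paper's architecture): a shared fused representation feeds both an answer classifier and an explanation decoder, trained with a weighted joint loss.

```python
import numpy as np

def cross_entropy(logits, target):
    """Negative log-probability of the target class under the logits."""
    logits = logits - logits.max()
    logp = logits - np.log(np.exp(logits).sum())
    return -logp[target]

rng = np.random.default_rng(0)
shared = rng.normal(size=512)                 # fused question+image feature
W_ans = rng.normal(size=(512, 1000)) * 0.02   # 1000 candidate answers
W_tok = rng.normal(size=(512, 30000)) * 0.02  # explanation vocabulary

answer_loss = cross_entropy(shared @ W_ans, target=42)
# Explanation loss: sum of per-token losses (toy decoder omitted here).
expl_tokens = [5, 17, 99]
expl_loss = sum(cross_entropy(shared @ W_tok, t) for t in expl_tokens)

lam = 0.5                                     # hypothetical task weighting
total = answer_loss + lam * expl_loss         # joint multi-task objective
print(float(total))
```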
