东京热加勒比中文无码,亚洲自偷拍狠无码,2021国产精品久久久精品了,日韩在线观看视频第一区,亚洲色中文字幕无码A

Clients often partner with AI experts to develop AI applications tailored to their needs. In these partnerships, careful planning and clear communication are critical, as inaccurate or incomplete specifications can result in misaligned model characteristics, expensive reworks, and potential friction between collaborators. Unfortunately, given the complexity of requirements ranging from functionality, data, and governance, effective guidelines for collaborative specification of requirements in client-AI expert collaborations are missing. In this work, we introduce AINeedsPlanner, a workbook that AI experts and clients can use to facilitate effective interchange and clear specifications. The workbook is based on (1) an interview of 10 completed AI application project teams, which identifies and characterizes steps in AI application planning and (2) a study with 12 AI experts, which defines a taxonomy of AI experts' information needs and dimensions that affect the information needs. Finally, we demonstrate the workbook's utility with two case studies in real-world settings.

相關內容

關注 7030

人工智能雜志AI(Artificial Intelligence)是目前公認的發表該領域最新研究成果的主要國際論壇。該期刊歡迎有關AI廣泛方面的論文，這些論文構成了整個領域的進步，也歡迎介紹人工智能應用的論文，但重點應該放在新的和新穎的人工智能方法如何提高應用領域的性能，而不是介紹傳統人工智能方法的另一個應用。關于應用的論文應該描述一個原則性的解決方案，強調其新穎性，并對正在開發的人工智能技術進行深入的評估。官網地址：

MoDELS · 知識 (knowledge) · 語言模型化 · Performer · Integration ·

2024 年 5 月 26 日

MEMORYLLM: Towards Self-Updatable Large Language Models

Yu Wang,Yifan Gao,Xiusi Chen,Haoming Jiang,Shiyang Li,Jingfeng Yang,Qingyu Yin,Zheng Li,Xian Li,Bing Yin,Jingbo Shang,Julian McAuley

from arxiv, 13 pages, 9 figures

Existing Large Language Models (LLMs) usually remain static after deployment, which might make it hard to inject new knowledge into the model. We aim to build models containing a considerable portion of self-updatable parameters, enabling the model to integrate new knowledge effectively and efficiently. To this end, we introduce MEMORYLLM, a model that comprises a transformer and a fixed-size memory pool within the latent space of the transformer. MEMORYLLM can self-update with text knowledge and memorize the knowledge injected earlier. Our evaluations demonstrate the ability of MEMORYLLM to effectively incorporate new knowledge, as evidenced by its performance on model editing benchmarks. Meanwhile, the model exhibits long-term information retention capacity, which is validated through our custom-designed evaluations and long-context benchmarks. MEMORYLLM also shows operational integrity without any sign of performance degradation even after nearly a million memory updates. Our code and model are open-sourced at //github.com/wangyu-ustc/MemoryLLM.

后驗分布 · MoDELS · 在線 · AIM · Performer ·

2024 年 5 月 25 日

Posterior Probability Matters: Doubly-Adaptive Calibration for Neural Predictions in Online Advertising

Penghui Wei,Weimin Zhang,Ruijie Hou,Jinquan Liu,Shaoguo Liu,Liang Wang,Bo Zheng

from arxiv, SIGIR 2022 (short)

Predicting user response probabilities is vital for ad ranking and bidding. We hope that predictive models can produce accurate probabilistic predictions that reflect true likelihoods. Calibration techniques aim to post-process model predictions to posterior probabilities. Field-level calibration -- which performs calibration w.r.t. to a specific field value -- is fine-grained and more practical. In this paper we propose a doubly-adaptive approach AdaCalib. It learns an isotonic function family to calibrate model predictions with the guidance of posterior statistics, and field-adaptive mechanisms are designed to ensure that the posterior is appropriate for the field value to be calibrated. Experiments verify that AdaCalib achieves significant improvement on calibration performance. It has been deployed online and beats previous approach.

多峰值 · 可理解性 · 查準率/準確率 · 大語言模型 · MoDELS ·

2024 年 5 月 24 日

V-Zen: Efficient GUI Understanding and Precise Grounding With A Novel Multimodal LLM

Abdur Rahman,Rajat Chawla,Muskaan Kumar,Arkajit Datta,Adarsh Jha,Mukunda NS,Ishaan Bhola

In the rapidly evolving landscape of AI research and application, Multimodal Large Language Models (MLLMs) have emerged as a transformative force, adept at interpreting and integrating information from diverse modalities such as text, images, and Graphical User Interfaces (GUIs). Despite these advancements, the nuanced interaction and understanding of GUIs pose a significant challenge, limiting the potential of existing models to enhance automation levels. To bridge this gap, this paper presents V-Zen, an innovative Multimodal Large Language Model (MLLM) meticulously crafted to revolutionise the domain of GUI understanding and grounding. Equipped with dual-resolution image encoders, V-Zen establishes new benchmarks in efficient grounding and next-action prediction, thereby laying the groundwork for self-operating computer systems. Complementing V-Zen is the GUIDE dataset, an extensive collection of real-world GUI elements and task-based sequences, serving as a catalyst for specialised fine-tuning. The successful integration of V-Zen and GUIDE marks the dawn of a new era in multimodal AI research, opening the door to intelligent, autonomous computing experiences. This paper extends an invitation to the research community to join this exciting journey, shaping the future of GUI automation. In the spirit of open science, our code, data, and model will be made publicly available, paving the way for multimodal dialogue scenarios with intricate and precise interactions.

3D · INTERACT · MoDELS · 正則化項 · HTTPS ·

2024 年 5 月 23 日

CraftsMan: High-fidelity Mesh Generation with 3D Native Generation and Interactive Geometry Refiner

Weiyu Li,Jiarui Liu,Rui Chen,Yixun Liang,Xuelin Chen,Ping Tan,Xiaoxiao Long

from arxiv, HomePage: //craftsman3d.github.io/, Code: //github.com/wyysf-98/CraftsMan

We present a novel generative 3D modeling system, coined CraftsMan, which can generate high-fidelity 3D geometries with highly varied shapes, regular mesh topologies, and detailed surfaces, and, notably, allows for refining the geometry in an interactive manner. Despite the significant advancements in 3D generation, existing methods still struggle with lengthy optimization processes, irregular mesh topologies, noisy surfaces, and difficulties in accommodating user edits, consequently impeding their widespread adoption and implementation in 3D modeling software. Our work is inspired by the craftsman, who usually roughs out the holistic figure of the work first and elaborates the surface details subsequently. Specifically, we employ a 3D native diffusion model, which operates on latent space learned from latent set-based 3D representations, to generate coarse geometries with regular mesh topology in seconds. In particular, this process takes as input a text prompt or a reference image and leverages a powerful multi-view (MV) diffusion model to generate multiple views of the coarse geometry, which are fed into our MV-conditioned 3D diffusion model for generating the 3D geometry, significantly improving robustness and generalizability. Following that, a normal-based geometry refiner is used to significantly enhance the surface details. This refinement can be performed automatically, or interactively with user-supplied edits. Extensive experiments demonstrate that our method achieves high efficacy in producing superior-quality 3D assets compared to existing methods. HomePage: //craftsman3d.github.io/, Code: //github.com/wyysf-98/CraftsMan

層 · AI Agent · 大語言模型 · MoDELS · Agent ·

2024 年 5 月 23 日

FinRobot: An Open-Source AI Agent Platform for Financial Applications using Large Language Models

Hongyang Yang,Boyu Zhang,Neng Wang,Cheng Guo,Xiaoli Zhang,Likun Lin,Junlin Wang,Tianyu Zhou,Mao Guan,Runjia Zhang,Christina Dan Wang

from arxiv, FinRobot Whitepaper V1.0

As financial institutions and professionals increasingly incorporate Large Language Models (LLMs) into their workflows, substantial barriers, including proprietary data and specialized knowledge, persist between the finance sector and the AI community. These challenges impede the AI community's ability to enhance financial tasks effectively. Acknowledging financial analysis's critical role, we aim to devise financial-specialized LLM-based toolchains and democratize access to them through open-source initiatives, promoting wider AI adoption in financial decision-making. In this paper, we introduce FinRobot, a novel open-source AI agent platform supporting multiple financially specialized AI agents, each powered by LLM. Specifically, the platform consists of four major layers: 1) the Financial AI Agents layer that formulates Financial Chain-of-Thought (CoT) by breaking sophisticated financial problems down into logical sequences; 2) the Financial LLM Algorithms layer dynamically configures appropriate model application strategies for specific tasks; 3) the LLMOps and DataOps layer produces accurate models by applying training/fine-tuning techniques and using task-relevant data; 4) the Multi-source LLM Foundation Models layer that integrates various LLMs and enables the above layers to access them directly. Finally, FinRobot provides hands-on for both professional-grade analysts and laypersons to utilize powerful AI techniques for advanced financial analysis. We open-source FinRobot at \url{//github.com/AI4Finance-Foundation/FinRobot}.

MoDELS · 秩 · Performer · Better · 語言模型化 ·

2024 年 5 月 23 日

RaFe: Ranking Feedback Improves Query Rewriting for RAG

Shengyu Mao,Yong Jiang,Boli Chen,Xiao Li,Peng Wang,Xinyu Wang,Pengjun Xie,Fei Huang,Huajun Chen,Ningyu Zhang

from arxiv, 16 pages

As Large Language Models (LLMs) and Retrieval Augmentation Generation (RAG) techniques have evolved, query rewriting has been widely incorporated into the RAG system for downstream tasks like open-domain QA. Many works have attempted to utilize small models with reinforcement learning rather than costly LLMs to improve query rewriting. However, current methods require annotations (e.g., labeled relevant documents or downstream answers) or predesigned rewards for feedback, which lack generalization, and fail to utilize signals tailored for query rewriting. In this paper, we propose ours, a framework for training query rewriting models free of annotations. By leveraging a publicly available reranker, ours~provides feedback aligned well with the rewriting objectives. Experimental results demonstrate that ours~can obtain better performance than baselines.

統計量 · 推斷 · 評論員 · 張成子空間 · 設計 ·

2024 年 5 月 22 日

Guarding Multiple Secrets: Enhanced Summary Statistic Privacy for Data Sharing

Shuaiqi Wang,Rongzhe Wei,Mohsen Ghassemi,Eleonora Kreacic,Vamsi K. Potluru

Data sharing enables critical advances in many research areas and business applications, but it may lead to inadvertent disclosure of sensitive summary statistics (e.g., means or quantiles). Existing literature only focuses on protecting a single confidential quantity, while in practice, data sharing involves multiple sensitive statistics. We propose a novel framework to define, analyze, and protect multi-secret summary statistics privacy in data sharing. Specifically, we measure the privacy risk of any data release mechanism by the worst-case probability of an attacker successfully inferring summary statistic secrets. Given an attacker's objective spanning from inferring a subset to the entirety of summary statistic secrets, we systematically design and analyze tailored privacy metrics. Defining the distortion as the worst-case distance between the original and released data distribution, we analyze the tradeoff between privacy and distortion. Our contribution also includes designing and analyzing data release mechanisms tailored for different data distributions and secret types. Evaluations on real-world data demonstrate the effectiveness of our mechanisms in practical applications.

Conformer · MoDELS · 輸出 · 可辨認的 · 預測器/決策函數 ·

2024 年 5 月 21 日

Conformal Alignment: Knowing When to Trust Foundation Models with Guarantees

Yu Gui,Ying Jin,Zhimei Ren

Before deploying outputs from foundation models in high-stakes tasks, it is imperative to ensure that they align with human values. For instance, in radiology report generation, reports generated by a vision-language model must align with human evaluations before their use in medical decision-making. This paper presents Conformal Alignment, a general framework for identifying units whose outputs meet a user-specified alignment criterion. It is guaranteed that on average, a prescribed fraction of selected units indeed meet the alignment criterion, regardless of the foundation model or the data distribution. Given any pre-trained model and new units with model-generated outputs, Conformal Alignment leverages a set of reference data with ground-truth alignment status to train an alignment predictor. It then selects new units whose predicted alignment scores surpass a data-dependent threshold, certifying their corresponding outputs as trustworthy. Through applications to question answering and radiology report generation, we demonstrate that our method is able to accurately identify units with trustworthy outputs via lightweight training over a moderate amount of reference data. En route, we investigate the informativeness of various features in alignment prediction and combine them with standard models to construct the alignment predictor.

Performer · 多峰值 · MINE · MoDELS · 語言表示 ·

2021 年 6 月 25 日

iReason: Multimodal Commonsense Reasoning using Videos and Natural Language with Interpretability

Aman Chadha,Vinija Jain

from arxiv, 12 pages, 1 figure, 7 tables

Causality knowledge is vital to building robust AI systems. Deep learning models often perform poorly on tasks that require causal reasoning, which is often derived using some form of commonsense knowledge not immediately available in the input but implicitly inferred by humans. Prior work has unraveled spurious observational biases that models fall prey to in the absence of causality. While language representation models preserve contextual knowledge within learned embeddings, they do not factor in causal relationships during training. By blending causal relationships with the input features to an existing model that performs visual cognition tasks (such as scene understanding, video captioning, video question-answering, etc.), better performance can be achieved owing to the insight causal relationships bring about. Recently, several models have been proposed that have tackled the task of mining causal data from either the visual or textual modality. However, there does not exist widespread research that mines causal relationships by juxtaposing the visual and language modalities. While images offer a rich and easy-to-process resource for us to mine causality knowledge from, videos are denser and consist of naturally time-ordered events. Also, textual information offers details that could be implicit in videos. We propose iReason, a framework that infers visual-semantic commonsense knowledge using both videos and natural language captions. Furthermore, iReason's architecture integrates a causal rationalization module to aid the process of interpretability, error analysis and bias detection. We demonstrate the effectiveness of iReason using a two-pronged comparative analysis with language representation learning models (BERT, GPT-2) as well as current state-of-the-art multimodal causality models.

state-of-the-art · 可理解性 · BERT · 去噪自編碼器 · Performer ·

2019 年 6 月 19 日

XLNet: Generalized Autoregressive Pretraining for Language Understanding

Zhilin Yang,Zihang Dai,Yiming Yang,Jaime Carbonell,Ruslan Salakhutdinov,Quoc V. Le

from arxiv, Pretrained models and code are available at //github.com/zihangdai/xlnet

With the capability of modeling bidirectional contexts, denoising autoencoding based pretraining like BERT achieves better performance than pretraining approaches based on autoregressive language modeling. However, relying on corrupting the input with masks, BERT neglects dependency between the masked positions and suffers from a pretrain-finetune discrepancy. In light of these pros and cons, we propose XLNet, a generalized autoregressive pretraining method that (1) enables learning bidirectional contexts by maximizing the expected likelihood over all permutations of the factorization order and (2) overcomes the limitations of BERT thanks to its autoregressive formulation. Furthermore, XLNet integrates ideas from Transformer-XL, the state-of-the-art autoregressive model, into pretraining. Empirically, XLNet outperforms BERT on 20 tasks, often by a large margin, and achieves state-of-the-art results on 18 tasks including question answering, natural language inference, sentiment analysis, and document ranking.