秋霞网一区二区三区,欧美PLAY视频海量性欧美,日韩精品综合一区二区

Buildings play a crucial role in human well-being, influencing occupant comfort, health, and safety. Additionally, they contribute significantly to global energy consumption, accounting for one-third of total energy usage, and carbon emissions. Optimizing building performance presents a vital opportunity to combat climate change and promote human flourishing. However, research in building analytics has been hampered by the lack of accessible, available, and comprehensive real-world datasets on multiple building operations. In this paper, we introduce the Building TimeSeries (BTS) dataset. Our dataset covers three buildings over a three-year period, comprising more than ten thousand timeseries data points with hundreds of unique ontologies. Moreover, the metadata is standardized using the Brick schema. To demonstrate the utility of this dataset, we performed benchmarks on two tasks: timeseries ontology classification and zero-shot forecasting. These tasks represent an essential initial step in addressing challenges related to interoperability in building analytics. Access to the dataset and the code used for benchmarking are available here: //github.com/cruiseresearchgroup/DIEF_BTS .

相關內容

數據集

關注 88

數據集，又稱為資料集、數據集合或資料集合，是一種由數據所組成的集合。
Data set（或dataset）是一個數據的集合，通常以表格形式出現。每一列代表一個特定變量。每一行都對應于某一成員的數據集的問題。它列出的價值觀為每一個變量，如身高和體重的一個物體或價值的隨機數。每個數值被稱為數據資料。對應于行數，該數據集的數據可能包括一個或多個成員。

詞元分析器 · 多峰值 · MoDELS · 語言模型化 · Prompt ·

2024 年 7 月 31 日

ControlMLLM: Training-Free Visual Prompt Learning for Multimodal Large Language Models

Mingrui Wu,Xinyue Cai,Jiayi Ji,Jiale Li,Oucheng Huang,Gen Luo,Hao Fei,Xiaoshuai Sun,Rongrong Ji

from arxiv, Code://github.com/mrwu-mac/ControlMLLM

In this work, we propose a training-free method to inject visual referring into Multimodal Large Language Models (MLLMs) through learnable visual token optimization. We observe the relationship between text prompt tokens and visual tokens in MLLMs, where attention layers model the connection between them. Our approach involves adjusting visual tokens from the MLP output during inference, controlling which text prompt tokens attend to which visual tokens. We optimize a learnable visual token based on an energy function, enhancing the strength of referential regions in the attention map. This enables detailed region description and reasoning without the need for substantial training costs or model retraining. Our method offers a promising direction for integrating referential abilities into MLLMs. Our method support referring with box, mask, scribble and point. The results demonstrate that our method exhibits controllability and interpretability.

MoDELS · 語言模型化 · 數據集 · Automator · Performer ·

2024 年 7 月 31 日

LAPIS: Language Model-Augmented Police Investigation System

Heedou Kim,Dain Kim,Jiwoo Lee,Chanwoong Yoon,Donghee Choi,Mogan Gim,Jaewoo Kang

Crime situations are race against time. An AI-assisted criminal investigation system, providing prompt but precise legal counsel is in need for police officers. We introduce LAPIS (Language Model Augmented Police Investigation System), an automated system that assists police officers to perform rational and legal investigative actions. We constructed a finetuning dataset and retrieval knowledgebase specialized in crime investigation legal reasoning task. We extended the dataset's quality by incorporating manual curation efforts done by a group of domain experts. We then finetuned the pretrained weights of a smaller Korean language model to the newly constructed dataset and integrated it with the crime investigation knowledgebase retrieval approach. Experimental results show LAPIS' potential in providing reliable legal guidance for police officers, even better than the proprietary GPT-4 model. Qualitative analysis on the rationales generated by LAPIS demonstrate the model's reasoning ability to leverage the premises and derive legally correct conclusions.

邊 · MoDELS · 語言模型化 · 大語言模型 · 操作 ·

2024 年 7 月 31 日

EdgeLLM: A Highly Efficient CPU-FPGA Heterogeneous Edge Accelerator for Large Language Models

Mingqiang Huang,Ao Shen,Kai Li,Haoxiang Peng,Boyu Li,Hao Yu

The rapid advancements in artificial intelligence (AI), particularly the Large Language Models (LLMs), have profoundly affected our daily work and communication forms. However, the colossal scale of LLM presents significant operational challenges, particularly when attempting to deploy them on resource-constrained edge devices such as smartphones, robots, and embedded systems. In this work, we proposed EdgeLLM, an efficient CPU-FPGA heterogeneous acceleration framework, to markedly enhance the computational efficiency of LLMs on edge. We first analyzed the whole operators within AI models and developed a universal data parallelism scheme, which is generic and can be adapted to any type of AI algorithm. Then, we developed fully-customized hardware operators according to the designated data formats. A multitude of optimization techniques have been integrated in the design, such as approximate FP16*INT4 and FP16*FP16 computation engines, group vector systolic arrays, log-scale structured sparsity, asynchronous between data transfer and processing. Finally, we proposed an end-to-end compilation scheme that can dynamically compile all of the operators and map the whole model on CPU-FPGA heterogeneous system. The design has been deployed on AMD Xilinx VCU128 FPGA, our accelerator achieves 1.67x higher throughput and 7.4x higher energy efficiency than the commercial GPU (NVIDIA A100-SXM4-80G) on ChatGLM2-6B, and shows 10%~20% better performance than state-of-the-art FPGA accelerator of FlightLLM in terms of HBM bandwidth utilization and LLM throughput.

Automator · Prompt · SOTA · INTERACT · Performance ·

2024 年 7 月 30 日

ThinkRepair: Self-Directed Automated Program Repair

Xin Yin,Chao Ni,Shaohua Wang,Zhenhao Li,Limin Zeng,Xiaohu Yang

from arxiv, Accepted By ISSTA'24

Though many approaches have been proposed for Automated Program Repair (APR) and indeed achieved remarkable performance, they still have limitations in fixing bugs that require analyzing and reasoning about the logic of the buggy program. Recently, large language models (LLMs) instructed by prompt engineering have attracted much attention for their powerful ability to address many kinds of tasks including bug-fixing. However, the quality of the prompt will highly affect the ability of LLMs and manually constructing high-quality prompts is a costly endeavor. To address this limitation, we propose a self-directed LLM-based automated program repair, ThinkRepair, with two main phases: collection phase and fixing phase. The former phase automatically collects various chains of thoughts that constitute pre-fixed knowledge by instructing LLMs with the Chain-of-Thought (CoT) prompt. The latter phase targets fixing a bug by first selecting examples for few-shot learning and second automatically interacting with LLMs, optionally appending with feedback of testing information. Evaluations on two widely studied datasets (Defects4J and QuixBugs) by comparing ThinkRepair with 12 SOTA APRs indicate the priority of ThinkRepair in fixing bugs. Notably, ThinkRepair fixes 98 bugs and improves baselines by 27%-344.4% on Defects4J V1.2. On Defects4J V2.0, ThinkRepair fixes 12-65 more bugs than the SOTA APRs. Additionally, ThinkRepair also makes a considerable improvement on QuixBugs (31 for Java and 21 for Python at most).

多峰值 · 語言模型化 · Performer · MoDELS · INTERACT ·

2024 年 7 月 25 日

Dallah: A Dialect-Aware Multimodal Large Language Model for Arabic

Fakhraddin Alwajih,Gagan Bhatia,Muhammad Abdul-Mageed

Recent advancements have significantly enhanced the capabilities of Multimodal Large Language Models (MLLMs) in generating and understanding image-to-text content. Despite these successes, progress is predominantly limited to English due to the scarcity of high quality multimodal resources in other languages. This limitation impedes the development of competitive models in languages such as Arabic. To alleviate this situation, we introduce an efficient Arabic multimodal assistant, dubbed Dallah, that utilizes an advanced language model based on LLaMA-2 to facilitate multimodal interactions. Dallah demonstrates state-of-the-art performance in Arabic MLLMs. Through fine-tuning six Arabic dialects, Dallah showcases its capability to handle complex dialectal interactions incorporating both textual and visual elements. The model excels in two benchmark tests: one evaluating its performance on Modern Standard Arabic (MSA) and another specifically designed to assess dialectal responses. Beyond its robust performance in multimodal interaction tasks, Dallah has the potential to pave the way for further development of dialect-aware Arabic MLLMs.

估計/估計量 · 詞表 · 3D · 情景 · MoDELS ·

2024 年 7 月 24 日

LangOcc: Self-Supervised Open Vocabulary Occupancy Estimation via Volume Rendering

Simon Boeder,Fabian Gigengack,Benjamin Risse

Semantic occupancy has recently gained significant traction as a prominent method for 3D scene representation. However, most existing camera-based methods rely on costly datasets with fine-grained 3D voxel labels or LiDAR scans for training, which limits their practicality and scalability, raising the need for self-supervised approaches in this domain. Moreover, most methods are tied to a predefined set of classes which they can detect. In this work we present a novel approach for open vocabulary occupancy estimation called \textit{LangOcc}, that is trained only via camera images, and can detect arbitrary semantics via vision-language alignment. In particular, we distill the knowledge of the strong vision-language aligned encoder CLIP into a 3D occupancy model via differentiable volume rendering. Our model estimates vision-language aligned features in a 3D voxel grid using only images. It is trained in a self-supervised manner by rendering our estimations back to 2D space, where ground-truth features can be computed. This training mechanism automatically supervises the scene geometry, allowing for a straight-forward and powerful training method without any explicit geometry supervision. LangOcc outperforms LiDAR-supervised competitors in open vocabulary occupancy by a large margin, solely relying on vision-based training. We also achieve state-of-the-art results in self-supervised semantic occupancy estimation on the Occ3D-nuScenes dataset, despite not being limited to a specific set of categories, thus demonstrating the effectiveness of our proposed vision-language training.

MoDELS · 分解的 · 可理解性 · 均值 · Processing（編程語言） ·

2024 年 7 月 24 日

XMeCap: Meme Caption Generation with Sub-Image Adaptability

Yuyan Chen,Songzhou Yan,Zhihong Zhu,Zhixu Li,Yanghua Xiao

from arxiv, Accepted to MM 2024

Humor, deeply rooted in societal meanings and cultural details, poses a unique challenge for machines. While advances have been made in natural language processing, real-world humor often thrives in a multi-modal context, encapsulated distinctively by memes. This paper poses a particular emphasis on the impact of multi-images on meme captioning. After that, we introduce the \textsc{XMeCap} framework, a novel approach that adopts supervised fine-tuning and reinforcement learning based on an innovative reward model, which factors in both global and local similarities between visuals and text. Our results, benchmarked against contemporary models, manifest a marked improvement in caption generation for both single-image and multi-image memes, as well as different meme categories. \textsc{XMeCap} achieves an average evaluation score of 75.85 for single-image memes and 66.32 for multi-image memes, outperforming the best baseline by 3.71\% and 4.82\%, respectively. This research not only establishes a new frontier in meme-related studies but also underscores the potential of machines in understanding and generating humor in a multi-modal setting.

MoDELS · Performer · HTTPS · 監督 · 組合性 ·

2023 年 12 月 4 日

Data Management For Large Language Models: A Survey

Zige Wang,Wanjun Zhong,Yufei Wang,Qi Zhu,Fei Mi,Baojun Wang,Lifeng Shang,Xin Jiang,Qun Liu

from arxiv, Work in progress

Data plays a fundamental role in the training of Large Language Models (LLMs). Effective data management, particularly in the formulation of a well-suited training dataset, holds significance for enhancing model performance and improving training efficiency during pretraining and supervised fine-tuning phases. Despite the considerable importance of data management, the current research community still falls short in providing a systematic analysis of the rationale behind management strategy selection, its consequential effects, methodologies for evaluating curated datasets, and the ongoing pursuit of improved strategies. Consequently, the exploration of data management has attracted more and more attention among the research community. This survey provides a comprehensive overview of current research in data management within both the pretraining and supervised fine-tuning stages of LLMs, covering various noteworthy aspects of data management strategy design: data quantity, data quality, domain/task composition, etc. Looking toward the future, we extrapolate existing challenges and outline promising directions for development in this field. Therefore, this survey serves as a guiding resource for practitioners aspiring to construct powerful LLMs through effective data management practices. The collection of the latest papers is available at //github.com/ZigeW/data_management_LLM.

評論員 · Cognition · 多樣性 · Prompt · 講稿 ·

2022 年 11 月 21 日

Intelligent Computing: The Latest Advances, Challenges and Future

Shiqiang Zhu,Ting Yu,Tao Xu,Hongyang Chen,Schahram Dustdar,Sylvain Gigan,Deniz Gunduz,Ekram Hossain,Yaochu Jin,Feng Lin,Bo Liu,Zhiguo Wan,Ji Zhang,Zhifeng Zhao,Wentao Zhu,Zuoning Chen,Tariq Durrani,Huaimin Wang,Jiangxing Wu,Tongyi Zhang,Yunhe Pan

Computing is a critical driving force in the development of human civilization. In recent years, we have witnessed the emergence of intelligent computing, a new computing paradigm that is reshaping traditional computing and promoting digital revolution in the era of big data, artificial intelligence and internet-of-things with new computing theories, architectures, methods, systems, and applications. Intelligent computing has greatly broadened the scope of computing, extending it from traditional computing on data to increasingly diverse computing paradigms such as perceptual intelligence, cognitive intelligence, autonomous intelligence, and human-computer fusion intelligence. Intelligence and computing have undergone paths of different evolution and development for a long time but have become increasingly intertwined in recent years: intelligent computing is not only intelligence-oriented but also intelligence-driven. Such cross-fertilization has prompted the emergence and rapid advancement of intelligent computing. Intelligent computing is still in its infancy and an abundance of innovations in the theories, systems, and applications of intelligent computing are expected to occur soon. We present the first comprehensive survey of literature on intelligent computing, covering its theory fundamentals, the technological fusion of intelligence and computing, important applications, challenges, and future perspectives. We believe that this survey is highly timely and will provide a comprehensive reference and cast valuable insights into intelligent computing for academic and industrial researchers and practitioners.

Branch · Networking · 示例 · Better · 可理解性 ·

2019 年 4 月 10 日

S4Net: Single Stage Salient-Instance Segmentation

Ruochen Fan,Ming-Ming Cheng,Qibin Hou,Tai-Jiang Mu,Jingdong Wang,Shi-Min Hu

We consider an interesting problem-salient instance segmentation in this paper. Other than producing bounding boxes, our network also outputs high-quality instance-level segments. Taking into account the category-independent property of each target, we design a single stage salient instance segmentation framework, with a novel segmentation branch. Our new branch regards not only local context inside each detection window but also its surrounding context, enabling us to distinguish the instances in the same scope even with obstruction. Our network is end-to-end trainable and runs at a fast speed (40 fps when processing an image with resolution 320x320). We evaluate our approach on a publicly available benchmark and show that it outperforms other alternative solutions. We also provide a thorough analysis of the design choices to help readers better understand the functions of each part of our network. The source code can be found at \url{//github.com/RuochenFan/S4Net}.