
We introduce SuperCLUE-Math6 (SC-Math6), a new benchmark dataset to evaluate the mathematical reasoning abilities of Chinese language models. SC-Math6 is designed as an upgraded Chinese version of the GSM8K dataset with enhanced difficulty, diversity, and application scope. It consists of over 2000 mathematical word problems that require multi-step reasoning and come with natural language solutions. We propose a scheme to quantify the reasoning capability of large models based on their performance on problems with different numbers of reasoning steps. Experiments on 12 representative Chinese models demonstrate a clear stratification of reasoning levels, with top models like GPT-4 showing superior performance. SC-Math6 fills the gap in Chinese mathematical reasoning benchmarks and provides a comprehensive testbed to advance the intelligence of Chinese language models.

Related content

MoDELS


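The ACM/IEEE 23rd International Conference on Model Driven Engineering Languages and Systems (MODELS) is the premier conference series for model-driven software and systems engineering, organized with the support of ACM SIGSOFT and IEEE TCSE. Since 1998, MODELS has covered all aspects of modeling, from languages and methods to tools and applications. MODELS attendees come from diverse backgrounds, including researchers, academics, engineers, and industrial professionals. MODELS 2019 is a forum where participants can exchange cutting-edge research results and innovative practical experience around modeling and model-driven software and systems. This year's edition will give the modeling community the opportunity to further advance the foundations of modeling and to propose innovative applications of modeling in emerging areas such as cyber-physical systems, embedded systems, socio-technical systems, cloud computing, big data, machine learning, security, and open source, as well as sustainability.

LIDAR · SLAM · Learning · Performer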

March 5, 2024

LiSTA: Geometric Object-Based Change Detection in Cluttered Environments

Joseph Rowell,Lintong Zhang,Maurice Fallon

from arxiv, 6+n page limit for (accepted) ICRA 2024 submission

We present LiSTA (LiDAR Spatio-Temporal Analysis), a system to detect probabilistic object-level change over time using multi-mission SLAM. Many applications require such a system, including construction, robotic navigation, long-term autonomy, and environmental monitoring. We focus on the semi-static scenario where objects are added, subtracted, or changed in position over weeks or months. Our system combines multi-mission LiDAR SLAM, volumetric differencing, object instance description, and correspondence grouping using learned descriptors to keep track of an open set of objects. Object correspondences between missions are determined by clustering the object's learned descriptors. We demonstrate our approach using datasets collected in a simulated environment and a real-world dataset captured using a LiDAR system mounted on a quadruped robot monitoring an industrial facility containing static, semi-static, and dynamic objects. Our method demonstrates superior performance in detecting changes in semi-static environments compared to existing methods.

Multimodal · Language modeling · Large language models · MoDELS · Inference

March 5, 2024

Feast Your Eyes: Mixture-of-Resolution Adaptation for Multimodal Large Language Models

Gen Luo,Yiyi Zhou,Yuxin Zhang,Xiawu Zheng,Xiaoshuai Sun,Rongrong Ji

Despite remarkable progress, existing multimodal large language models (MLLMs) are still inferior in granular visual recognition. Contrary to previous works, we study this problem from the perspective of image resolution, and reveal that a combination of low- and high-resolution visual features can effectively mitigate this shortcoming. Based on this observation, we propose a novel and efficient method for MLLMs, termed Mixture-of-Resolution Adaptation (MRA). In particular, MRA adopts two visual pathways for images with different resolutions, where high-resolution visual information is embedded into the low-resolution pathway via the novel mixture-of-resolution adapters (MR-Adapters). This design also greatly reduces the input sequence length of MLLMs. To validate MRA, we apply it to a recent MLLM called LLaVA, and term the new model LLaVA-HR. We conduct extensive experiments on 11 vision-language (VL) tasks, which show that LLaVA-HR outperforms existing MLLMs on 8 VL tasks, e.g., +9.4% on TextVQA. More importantly, both training and inference of LLaVA-HR remain efficient with MRA, e.g., 20 training hours and 3$\times$ faster inference than LLaVA-1.5. Source code is released at: //github.com/luogen1996/LLaVA-HR.

Pruning · Tokenizer · Multimodal · Reducible · Transform

March 5, 2024

MADTP: Multimodal Alignment-Guided Dynamic Token Pruning for Accelerating Vision-Language Transformer

Jianjian Cao,Peng Ye,Shengze Li,Chong Yu,Yansong Tang,Jiwen Lu,Tao Chen

from arxiv, 19 pages, 9 figures, Published in CVPR2024

Vision-Language Transformers (VLTs) have shown great success recently, but come with heavy computation costs, largely attributable to the large number of visual and language tokens. Existing token pruning research for compressing VLTs mainly follows a single-modality-based scheme yet ignores the critical role of aligning different modalities for guiding the token pruning process, causing the important tokens for one modality to be falsely pruned in another modality branch. Meanwhile, existing VLT pruning works also lack the flexibility to dynamically compress each layer based on different input samples. To this end, we propose a novel framework named Multimodal Alignment-Guided Dynamic Token Pruning (MADTP) for accelerating various VLTs. Specifically, we first introduce a well-designed Multi-modality Alignment Guidance (MAG) module that can align features of the same semantic concept from different modalities, to ensure the pruned tokens are less important for all modalities. We further design a novel Dynamic Token Pruning (DTP) module, which can adaptively adjust the token compression ratio in each layer based on different input instances. Extensive experiments on various benchmarks demonstrate that MADTP significantly reduces the computational complexity of various multimodal models while preserving competitive performance. Notably, when applied to the BLIP model on the NLVR2 dataset, MADTP can reduce the GFLOPs by 80% with less than 4% performance degradation.

Score function · Functional · Score · MoDELS · Unsupervised

March 5, 2024

DISYRE: Diffusion-Inspired SYnthetic REstoration for Unsupervised Anomaly Detection

Sergio Naval Marimont,Matthew Baugh,Vasilis Siomos,Christos Tzelepis,Bernhard Kainz,Giacomo Tarroni

from arxiv, 5 pages, 3 figures. Accepted for publication in ISBI 2024

Unsupervised Anomaly Detection (UAD) techniques aim to identify and localize anomalies without relying on annotations, only leveraging a model trained on a dataset known to be free of anomalies. Diffusion models learn to modify inputs $x$ to increase the probability of it belonging to a desired distribution, i.e., they model the score function $\nabla_x \log p(x)$. Such a score function is potentially relevant for UAD, since $\nabla_x \log p(x)$ is itself a pixel-wise anomaly score. However, diffusion models are trained to invert a corruption process based on Gaussian noise and the learned score function is unlikely to generalize to medical anomalies. This work addresses the problem of how to learn a score function relevant for UAD and proposes DISYRE: Diffusion-Inspired SYnthetic REstoration. We retain the diffusion-like pipeline but replace the Gaussian noise corruption with a gradual, synthetic anomaly corruption so the learned score function generalizes to medical, naturally occurring anomalies. We evaluate DISYRE on three common Brain MRI UAD benchmarks and substantially outperform other methods in two out of the three tasks.
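To make the link between a score function and a pixel-wise anomaly score concrete, here is a minimal toy sketch. It assumes, purely for illustration, that each pixel of a healthy image follows an independent Gaussian with known mean and variance, so that $\nabla_x \log p(x) = -(x - \mu)/\sigma^2$ in closed form. The function names are hypothetical, and this is not the DISYRE model, which learns its score function from synthetic corruptions.

```python
def gaussian_score(pixels, means, variances):
    """Per-pixel score d/dx log p(x) for independent Gaussian pixels.

    Assumption for illustration only: p(x) factorizes into N(mean_i, var_i).
    """
    return [-(x - m) / v for x, m, v in zip(pixels, means, variances)]


def anomaly_map(pixels, means, variances):
    """Use the magnitude of the score as a pixel-wise anomaly score."""
    return [abs(s) for s in gaussian_score(pixels, means, variances)]


# A flattened 4-pixel "image": the third pixel is far from its healthy mean.
image = [0.1, -0.2, 3.0, 0.0]
healthy_mean = [0.0, 0.0, 0.0, 0.0]
healthy_var = [1.0, 1.0, 1.0, 1.0]

scores = anomaly_map(image, healthy_mean, healthy_var)
most_anomalous = max(range(len(scores)), key=scores.__getitem__)
print(scores, most_anomalous)
```

Pixels that are unlikely under the healthy distribution get a large score magnitude, which is the property the abstract relies on when it calls the score function a pixel-wise anomaly score.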

prototype · Noise · Identifiable · Performer · Image segmentation

March 5, 2024

ProCNS: Progressive Prototype Calibration and Noise Suppression for Weakly-Supervised Medical Image Segmentation

Y. Liu,L. Lin,K. K. Y. Wong,X. Tang

Weakly-supervised segmentation (WSS) has emerged as a solution to mitigate the conflict between annotation cost and model performance by adopting sparse annotation formats (e.g., point, scribble, block). Typical approaches attempt to exploit anatomy and topology priors to directly expand sparse annotations into pseudo-labels. However, due to a lack of attention to the ambiguous edges in medical images and insufficient exploration of sparse supervision, existing approaches tend to generate erroneous and overconfident pseudo proposals in noisy regions, leading to cumulative model error and performance degradation. In this work, we propose a novel WSS approach, named ProCNS, encompassing two synergistic modules devised with the principles of progressive prototype calibration and noise suppression. Specifically, we design a Prototype-based Regional Spatial Affinity (PRSA) loss to maximize the pair-wise affinities between spatial and semantic elements, providing our model of interest with more reliable guidance. The affinities are derived from the input images and the prototype-refined predictions. Meanwhile, we propose an Adaptive Noise Perception and Masking (ANPM) module to obtain more enriched and representative prototype representations, which adaptively identifies and masks noisy regions within the pseudo proposals, reducing potential erroneous interference during prototype computation. Furthermore, we generate specialized soft pseudo-labels for the noisy regions identified by ANPM, providing supplementary supervision. Extensive experiments on three medical image segmentation tasks involving different modalities demonstrate that the proposed framework significantly outperforms representative state-of-the-art methods.

MoDELS · Black box · Performer · Language modeling · Vectorization

March 5, 2024

Mafin: Enhancing Black-Box Embeddings with Model Augmented Fine-Tuning

Mingtian Zhang,Shawn Lan,Peter Hayes,David Barber

Retrieval Augmented Generation (RAG) has emerged as an effective solution for mitigating hallucinations in Large Language Models (LLMs). The retrieval stage in RAG typically involves a pre-trained embedding model, which converts queries and passages into vectors to capture their semantics. However, a standard pre-trained embedding model may exhibit sub-optimal performance when applied to specific domain knowledge, necessitating fine-tuning. This paper addresses scenarios where the embeddings are only available from a black-box model. We introduce Model augmented fine-tuning (Mafin) -- a novel approach for fine-tuning a black-box embedding model by augmenting it with a trainable embedding model. Our results demonstrate that Mafin significantly enhances the performance of the black-box embeddings by only requiring the training of a small augmented model. We validate the effectiveness of our method on both labeled and unlabeled datasets, illustrating its broad applicability and efficiency.
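The idea of augmenting a frozen black-box embedding with a small trainable one can be sketched as follows. The abstract does not say how the two embeddings are combined, so this sketch assumes concatenation of the two L2-normalized vectors, rescaled to unit length; the function names are hypothetical and the training loop for the small model is omitted.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit length (zero vectors are returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in vec))
    if norm == 0.0:
        return list(vec)
    return [x / norm for x in vec]


def augment(black_box_vec, trainable_vec):
    """Assumed combination rule: concatenate both unit vectors and rescale.

    black_box_vec comes from the frozen black-box model (only its outputs
    are available); trainable_vec comes from the small model that would be
    fine-tuned on domain data.
    """
    scale = 1.0 / math.sqrt(2.0)
    combined = l2_normalize(black_box_vec) + l2_normalize(trainable_vec)
    return [scale * x for x in combined]


def dot(u, v):
    """Inner product, used here as the retrieval similarity."""
    return sum(x * y for x, y in zip(u, v))


# Toy query/passage pair: the black-box vectors agree, the trainable ones do not.
query = augment([1.0, 0.0], [1.0, 0.0])
passage = augment([1.0, 0.0], [0.0, 1.0])
similarity = dot(query, passage)
print(similarity)
```

Under this assumed rule the combined similarity is the average of the black-box cosine similarity and the trainable-model cosine similarity, so the small model can shift retrieval rankings without any access to the black-box model's weights.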

THz · Estimation/estimator · DSS · Channel · MoDELS

March 4, 2024

DSS-o-SAGE: Direction-Scan Sounding-Oriented SAGE Algorithm for Channel Parameter Estimation in mmWave and THz Bands

Yuanbo Li,Chong Han,Yi Chen,Ziming Yu,Xuefeng Yin

from arxiv, 15 pages, 10 figures, 3 tables

Investigation of millimeter-wave (mmWave) and Terahertz (THz) channels relies on channel measurements and estimation of multi-path component (MPC) parameters. As a common measurement technique in the mmWave and THz bands, direction-scan sounding (DSS) resolves angular information and increases the measurable distance. Through mechanical rotation, DSS creates a virtual multi-antenna sounding system. This, however, incurs signal phase instability and large data sizes, which existing estimation algorithms do not fully consider and which make them ineffective. To fill this research gap, this paper proposes a DSS-oriented space-alternating generalized expectation-maximization (DSS-o-SAGE) algorithm for channel parameter estimation in mmWave and THz bands. To appropriately capture the measured data in mmWave and THz DSS, the phase instability is modeled by scanning-direction-dependent signal phases. Based on this signal model, the DSS-o-SAGE algorithm is developed, which not only addresses the problems brought by phase instability, but also achieves ultra-low computational complexity by exploiting the narrow antenna beam property of DSS. Simulations in synthetic channels are conducted to demonstrate the efficacy of the proposed algorithm and explore the applicable region of the far-field approximation in DSS-o-SAGE. Finally, the proposed DSS-o-SAGE algorithm is applied to real measurements in an indoor corridor scenario at 300 GHz. Compared with results using the baseline noise-elimination method, the channel is characterized more accurately and reasonably with DSS-o-SAGE.

3D · 3D reconstruction · ImageNet (dataset) · Diversity · Regularization term

March 1, 2024

G3DR: Generative 3D Reconstruction in ImageNet

Pradyumna Reddy,Ismail Elezi,Jiankang Deng

We introduce a novel 3D generative method, Generative 3D Reconstruction (G3DR) in ImageNet, capable of generating diverse and high-quality 3D objects from single images, addressing the limitations of existing methods. At the heart of our framework is a novel depth regularization technique that enables the generation of scenes with high geometric fidelity. G3DR also leverages a pretrained language-vision model, such as CLIP, to enable reconstruction in novel views and improve the visual realism of generations. Additionally, G3DR designs a simple but effective sampling procedure to further improve the quality of generations. G3DR offers diverse and efficient 3D asset generation based on class or text conditioning. Despite its simplicity, G3DR is able to beat state-of-the-art methods, improving over them by up to 22% in perceptual metrics and 90% in geometry scores, while needing only half of the training time. Code is available at //github.com/preddy5/G3DR

Large language models · Language modeling · MoDELS · Performer · Correlation coefficient

March 1, 2024

Crimson: Empowering Strategic Reasoning in Cybersecurity through Large Language Models

Jiandong Jin,Bowen Tang,Mingxuan Ma,Xiao Liu,Yunfei Wang,Qingnan Lai,Jia Yang,Changling Zhou

from arxiv, 9 pages, 7 figures

We introduce Crimson, a system that enhances the strategic reasoning capabilities of Large Language Models (LLMs) within the realm of cybersecurity. By correlating CVEs with MITRE ATT&CK techniques, Crimson advances threat anticipation and strategic defense efforts. Our approach includes defining and evaluating cybersecurity strategic tasks, alongside implementing a comprehensive human-in-the-loop data-synthetic workflow to develop the CVE-to-ATT&CK Mapping (CVEM) dataset. We further enhance LLMs' reasoning abilities through a novel Retrieval-Aware Training (RAT) process and its refined iteration, RAT-R. Our findings demonstrate that an LLM fine-tuned with our techniques, possessing 7 billion parameters, approaches the performance level of GPT-4, showing markedly lower rates of hallucination and errors, and surpassing other models in strategic reasoning tasks. Moreover, domain-specific fine-tuning of embedding models significantly improves performance within cybersecurity contexts, underscoring the efficacy of our methodology. By leveraging Crimson to convert raw vulnerability data into structured and actionable insights, we bolster proactive cybersecurity defenses.

Similarity · INFORMS · Estimation/estimator · Extensibility · Unsupervised

March 10, 2021

SDD-FIQA: Unsupervised Face Image Quality Assessment with Similarity Distribution Distance

Fu-Zhao Ou,Xingyu Chen,Ruixin Zhang,Yuge Huang,Shaoxin Li,Jilin Li,Yong Li,Liujuan Cao,Yuan-Gen Wang

In recent years, Face Image Quality Assessment (FIQA) has become an indispensable part of the face recognition system to guarantee the stability and reliability of recognition performance in an unconstrained scenario. For this purpose, the FIQA method should consider both the intrinsic property and the recognizability of the face image. Most previous works aim to estimate the sample-wise embedding uncertainty or pair-wise similarity as the quality score, which only considers partial intra-class information. However, these methods ignore the valuable inter-class information, which is useful for estimating the recognizability of a face image. In this work, we argue that a high-quality face image should be similar to its intra-class samples and dissimilar to its inter-class samples. Thus, we propose a novel unsupervised FIQA method that incorporates Similarity Distribution Distance for Face Image Quality Assessment (SDD-FIQA). Our method generates quality pseudo-labels by calculating the Wasserstein Distance (WD) between the intra-class similarity distributions and inter-class similarity distributions. With these quality pseudo-labels, we are capable of training a regression network for quality prediction. Extensive experiments on benchmark datasets demonstrate that the proposed SDD-FIQA surpasses state-of-the-art methods by an impressive margin. Meanwhile, our method shows good generalization across different recognition systems.
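The pseudo-label idea described above can be sketched in a few lines. This is a simplified illustration under stated assumptions, not the authors' implementation: similarities are taken to be cosine similarities between embeddings, the Wasserstein distance is the 1D Wasserstein-1 distance between the two empirical similarity distributions, and all function names are hypothetical. Any sampling or normalization steps the paper may use are omitted.

```python
import math
from bisect import bisect_right


def cosine(u, v):
    """Cosine similarity between two embedding vectors."""
    norm_u = math.sqrt(sum(x * x for x in u))
    norm_v = math.sqrt(sum(x * x for x in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return sum(x * y for x, y in zip(u, v)) / (norm_u * norm_v)


def wasserstein_1d(u, v):
    """Wasserstein-1 distance between two 1D empirical distributions.

    Computed as the integral of |F_u(x) - F_v(x)| over x, where F_u and F_v
    are the empirical CDFs of the two samples.
    """
    u_sorted, v_sorted = sorted(u), sorted(v)
    points = sorted(u_sorted + v_sorted)
    total = 0.0
    for left, right in zip(points, points[1:]):
        cdf_u = bisect_right(u_sorted, left) / len(u_sorted)
        cdf_v = bisect_right(v_sorted, left) / len(v_sorted)
        total += abs(cdf_u - cdf_v) * (right - left)
    return total


def quality_pseudo_label(sample, same_identity, other_identities):
    """Distance between intra-class and inter-class similarity distributions."""
    intra = [cosine(sample, e) for e in same_identity]
    inter = [cosine(sample, e) for e in other_identities]
    return wasserstein_1d(intra, inter)


# Toy 2D embeddings: the sample matches its own identity and is orthogonal to others.
label = quality_pseudo_label(
    [1.0, 0.0],
    same_identity=[[1.0, 0.0], [2.0, 0.0]],
    other_identities=[[0.0, 1.0], [0.0, 3.0]],
)
print(label)
```

A larger value means the sample's intra-class similarities are well separated from its inter-class similarities, which matches the abstract's argument that a high-quality face image is similar to samples of the same identity and dissimilar to samples of other identities.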