国产欧美日韩综合在线,久久一级高潮A免费

As Multimodal Large Language Models (MLLMs) grow in size, adapting them to specialized tasks becomes increasingly challenging due to high computational and memory demands. Indeed, traditional fine-tuning methods are costly, due to the need for extensive, task-specific training. While efficient adaptation methods exist that aim to reduce these costs, in practice they suffer from shallow inter-modal alignment, which severely hurts model effectiveness. To tackle these computational challenges and improve inter-modal alignment, we introduce the MultiWay-Adapter (MWA), a novel framework featuring an 'Alignment Enhancer'. This enhancer deepens inter-modal alignment, enabling high transferability with minimal tuning effort. Our experiments show that unlike prior efficient tuning approaches, MWA maintains model effectiveness, while reducing training time by up-to 57%. MWA is also lightweight, increasing model size by only 2-3% (in terms of parameters) for state-of-the-art foundation models like BEiT-3 Large. These results demonstrate that MWA provides an efficient and effective adaptation method for MLLMs, significantly broadening their applicability.

相關內容

MoDELS

關注 43

ACM/IEEE第23屆模型驅動工程語言和系統國際會議，是模型驅動軟件和系統工程的首要會議系列，由ACM-SIGSOFT和IEEE-TCSE支持組織。自1998年以來，模型涵蓋了建模的各個方面，從語言和方法到工具和應用程序。模特的參加者來自不同的背景，包括研究人員、學者、工程師和工業專業人士。MODELS 2019是一個論壇，參與者可以圍繞建模和模型驅動的軟件和系統交流前沿研究成果和創新實踐經驗。今年的版本將為建模社區提供進一步推進建模基礎的機會，并在網絡物理系統、嵌入式系統、社會技術系統、云計算、大數據、機器學習、安全、開源等新興領域提出建模的創新應用以及可持續性。官網鏈接： · Markov · 隱馬爾科夫模型 · MoDELS · 估計/估計量 ·

2024 年 3 月 19 日

A Bayesian multilevel hidden Markov model with Poisson-lognormal emissions for intense longitudinal count data

S. Mildiner Moraga,E. Aarts

Hidden Markov models (HMMs) are probabilistic methods in which observations are seen as realizations of a latent Markov process with discrete states that switch over time. Moving beyond standard statistical tests, HMMs offer a statistical environment to optimally exploit the information present in multivariate time series, uncovering the latent dynamics that rule them. Here, we extend the Poisson HMM to the multilevel framework, accommodating variability between individuals with continuously distributed individual random effects following a lognormal distribution, and describe how to estimate the model in a fully parametric Bayesian framework. The proposed multilevel HMM enables probabilistic decoding of hidden state sequences from multivariate count time-series based on individual-specific parameters, and offers a framework to quantificate between-individual variability formally. Through a Monte Carlo study we show that the multilevel HMM outperforms the HMM for scenarios involving heterogeneity between individuals, demonstrating improved decoding accuracy and estimation performance of parameters of the emission distribution, and performs equally well when not between heterogeneity is present. Finally, we illustrate how to use our model to explore the latent dynamics governing complex multivariate count data in an empirical application concerning pilot whale diving behaviour in the wild, and how to identify neural states from multi-electrode recordings of motor neural cortex activity in a macaque monkey in an experimental set up. We make the multilevel HMM introduced in this study publicly available in the R-package mHMMbayes in CRAN.

MoDELS · 似然 · 講稿 · 分解的 · 有向 ·

2024 年 3 月 19 日

A maximum penalised likelihood approach for semiparametric accelerated failure time models with time-varying covariates and partly interval censoring

Aishwarya Bhaskaran,Ding Ma,Benoit Liquet,Angela Hong,Serigne N Lo,Stephane Heritier,Jun Ma

from arxiv, 31 pages, 5 figures, 4 tables

Accelerated failure time (AFT) models are frequently used for modelling survival data. This approach is attractive as it quantifies the direct relationship between the time until an event occurs and various covariates. It asserts that the failure times experience either acceleration or deceleration through a multiplicative factor when these covariates are present. While existing literature provides numerous methods for fitting AFT models with time-fixed covariates, adapting these approaches to scenarios involving both time-varying covariates and partly interval-censored data remains challenging. In this paper, we introduce a maximum penalised likelihood approach to fit a semiparametric AFT model. This method, designed for survival data with partly interval-censored failure times, accommodates both time-fixed and time-varying covariates. We utilise Gaussian basis functions to construct a smooth approximation of the nonparametric baseline hazard and fit the model via a constrained optimisation approach. To illustrate the effectiveness of our proposed method, we conduct a comprehensive simulation study. We also present an implementation of our approach on a randomised clinical trial dataset on advanced melanoma patients.

流形 · 泛函 · 近似 · 優化器 · 歐氏空間 ·

2024 年 3 月 18 日

Warped geometric information on the optimisation of Euclidean functions

Marcelo Hartmann,Bernardo Williams,Hanlin Yu,Mark Girolami,Alessandro Barp,Arto Klami

We consider the fundamental task of optimising a real-valued function defined in a potentially high-dimensional Euclidean space, such as the loss function in many machine-learning tasks or the logarithm of the probability distribution in statistical inference. We use Riemannian geometry notions to redefine the optimisation problem of a function on the Euclidean space to a Riemannian manifold with a warped metric, and then find the function's optimum along this manifold. The warped metric chosen for the search domain induces a computational friendly metric-tensor for which optimal search directions associated with geodesic curves on the manifold becomes easier to compute. Performing optimization along geodesics is known to be generally infeasible, yet we show that in this specific manifold we can analytically derive Taylor approximations up to third-order. In general these approximations to the geodesic curve will not lie on the manifold, however we construct suitable retraction maps to pull them back onto the manifold. Therefore, we can efficiently optimize along the approximate geodesic curves. We cover the related theory, describe a practical optimization algorithm and empirically evaluate it on a collection of challenging optimisation benchmarks. Our proposed algorithm, using 3rd-order approximation of geodesics, tends to outperform standard Euclidean gradient-based counterparts in term of number of iterations until convergence.

Integration · 相同 · 相互獨立的 · SimPLe · 確切的 ·

2024 年 3 月 18 日

Shadow Hamiltonians of structure-preserving integrators for Nambu mechanics

Atsushi Horikoshi

from arxiv, 12 pages, 4 figures

Symplectic integrators are widely implemented numerical integrators for Hamiltonian mechanics, which preserve the Hamiltonian structure (symplecticity) of the system. Although the symplectic integrator does not conserve the energy of the system, it is well known that there exists a conserving modified Hamiltonian, called the shadow Hamiltonian. For the Nambu mechanics, which is one of the generalized Hamiltonian mechanics, we can also construct structure-preserving integrators by the same procedure used to construct the symplectic integrators. In the structure-preserving integrator, however, the existence of shadow Hamiltonians is non-trivial. This is because the Nambu mechanics is driven by multiple Hamiltonians and it is non-trivial whether the time evolution by the integrator can be cast into the Nambu mechanical time evolution driven by multiple shadow Hamiltonians. In the present paper we construct structure-preserving integrators for a simple Nambu mechanical system, and derive the shadow Hamiltonians in two ways. This is the first attempt to derive shadow Hamiltonians of structure-preserving integrators for Nambu mechanics. We show that the fundamental identity, which corresponds to the Jacobi identity in Hamiltonian mechanics, plays an important role to calculate the shadow Hamiltonians using the Baker-Campbell-Hausdorff formula. It turns out that the resulting shadow Hamiltonians have indefinite forms depending on how the fundamental identities are used. This is not a technical artifact, because the exact shadow Hamiltonians obtained independently have the same indefiniteness.

大語言模型 · Continuity · CRAFT · 語言模型化 · Processing（編程語言） ·

2024 年 3 月 18 日

LLM^3:Large Language Model-based Task and Motion Planning with Motion Failure Reasoning

Shu Wang,Muzhi Han,Ziyuan Jiao,Zeyu Zhang,Ying Nian Wu,Song-Chun Zhu,Hangxin Liu

from arxiv, Submitted to IROS 2024. Codes available: //github.com/AssassinWS/LLM-TAMP

Conventional Task and Motion Planning (TAMP) approaches rely on manually crafted interfaces connecting symbolic task planning with continuous motion generation. These domain-specific and labor-intensive modules are limited in addressing emerging tasks in real-world settings. Here, we present LLM^3, a novel Large Language Model (LLM)-based TAMP framework featuring a domain-independent interface. Specifically, we leverage the powerful reasoning and planning capabilities of pre-trained LLMs to propose symbolic action sequences and select continuous action parameters for motion planning. Crucially, LLM^3 incorporates motion planning feed- back through prompting, allowing the LLM to iteratively refine its proposals by reasoning about motion failure. Consequently, LLM^3 interfaces between task planning and motion planning, alleviating the intricate design process of handling domain- specific messages between them. Through a series of simulations in a box-packing domain, we quantitatively demonstrate the effectiveness of LLM^3 in solving TAMP problems and the efficiency in selecting action parameters. Ablation studies un- derscore the significant contribution of motion failure reasoning to the success of LLM^3. Furthermore, we conduct qualitative experiments on a physical manipulator, demonstrating the practical applicability of our approach in real-world settings.

小樣本學習 · Learning · F分數 · 線性的 · 穩健性 ·

2024 年 3 月 17 日

Multitask frame-level learning for few-shot sound event detection

Liang Zou,Genwei Yan,Ruoyu Wang,Jun Du,Meng Lei,Tian Gao,Xin Fang

from arxiv, 6 pages, 4 figures, conference

This paper focuses on few-shot Sound Event Detection (SED), which aims to automatically recognize and classify sound events with limited samples. However, prevailing methods methods in few-shot SED predominantly rely on segment-level predictions, which often providing detailed, fine-grained predictions, particularly for events of brief duration. Although frame-level prediction strategies have been proposed to overcome these limitations, these strategies commonly face difficulties with prediction truncation caused by background noise. To alleviate this issue, we introduces an innovative multitask frame-level SED framework. In addition, we introduce TimeFilterAug, a linear timing mask for data augmentation, to increase the model's robustness and adaptability to diverse acoustic environments. The proposed method achieves a F-score of 63.8%, securing the 1st rank in the few-shot bioacoustic event detection category of the Detection and Classification of Acoustic Scenes and Events Challenge 2023.

邊 · 寬度 · 可辨認的 · 平穩的 · SimPLe ·

2024 年 3 月 16 日

Texture Edge detection by Patch consensus (TEP)

Guangyu Cui,Sung Ha Kang

We propose Texture Edge detection using Patch consensus (TEP) which is a training-free method to detect the boundary of texture. We propose a new simple way to identify the texture edge location, using the consensus of segmented local patch information. While on the boundary, even using local patch information, the distinction between textures are typically not clear, but using neighbor consensus give a clear idea of the boundary. We utilize local patch, and its response against neighboring regions, to emphasize the similarities and the differences across different textures. The step of segmentation of response further emphasizes the edge location, and the neighborhood voting gives consensus and stabilize the edge detection. We analyze texture as a stationary process to give insight into the patch width parameter verses the quality of edge detection. We derive the necessary condition for textures to be distinguished, and analyze the patch width with respect to the scale of textures. Various experiments are presented to validate the proposed model.

推斷 · 似然 · 線性的 · 線性映射 · FAST ·

2024 年 3 月 16 日

Fast, accurate and lightweight sequential simulation-based inference using Gaussian locally linear mappings

Henrik H?ggstr?m,Pedro L. C. Rodrigues,Geoffroy Oudoumanessah,Florence Forbes,Umberto Picchini

from arxiv, 60 pages, 55 figures: the only difference with v1 is a different formatting/LaTeX stylefile

Bayesian inference for complex models with an intractable likelihood can be tackled using algorithms performing many calls to computer simulators. These approaches are collectively known as "simulation-based inference" (SBI). Recent SBI methods have made use of neural networks (NN) to provide approximate, yet expressive constructs for the unavailable likelihood function and the posterior distribution. However, they do not generally achieve an optimal trade-off between accuracy and computational demand. In this work, we propose an alternative that provides both approximations to the likelihood and the posterior distribution, using structured mixtures of probability distributions. Our approach produces accurate posterior inference when compared to state-of-the-art NN-based SBI methods, while exhibiting a much smaller computational footprint. We illustrate our results on several benchmark models from the SBI literature.

類別 · 學成 · MoDELS · Performer · Better ·

2021 年 2 月 15 日

OntoZSL: Ontology-enhanced Zero-shot Learning

Yuxia Geng,Jiaoyan Chen,Zhuo Chen,Jeff Z. Pan,Zhiquan Ye,Zonggang Yuan,Yantao Jia,Huajun Chen

from arxiv, Accepted to The Web Conference (WWW) 2021

Zero-shot Learning (ZSL), which aims to predict for those classes that have never appeared in the training data, has arisen hot research interests. The key of implementing ZSL is to leverage the prior knowledge of classes which builds the semantic relationship between classes and enables the transfer of the learned models (e.g., features) from training classes (i.e., seen classes) to unseen classes. However, the priors adopted by the existing methods are relatively limited with incomplete semantics. In this paper, we explore richer and more competitive prior knowledge to model the inter-class relationship for ZSL via ontology-based knowledge representation and semantic embedding. Meanwhile, to address the data imbalance between seen classes and unseen classes, we developed a generative ZSL framework with Generative Adversarial Networks (GANs). Our main findings include: (i) an ontology-enhanced ZSL framework that can be applied to different domains, such as image classification (IMGC) and knowledge graph completion (KGC); (ii) a comprehensive evaluation with multiple zero-shot datasets from different domains, where our method often achieves better performance than the state-of-the-art models. In particular, on four representative ZSL baselines of IMGC, the ontology-based class semantics outperform the previous priors e.g., the word embeddings of classes by an average of 12.4 accuracy points in the standard ZSL across two example datasets (see Figure 4).

Performer · Color · Networking · CRAFT · 均方誤差 ·

2018 年 1 月 25 日

C2MSNet: A Novel approach for single image haze removal

Akshay Dudhane,Subrahmanyam Murala

from arxiv, Accepted in Winter Conference on Applications of Computer Vision (WACV-2018)

Degradation of image quality due to the presence of haze is a very common phenomenon. Existing DehazeNet [3], MSCNN [11] tackled the drawbacks of hand crafted haze relevant features. However, these methods have the problem of color distortion in gloomy (poor illumination) environment. In this paper, a cardinal (red, green and blue) color fusion network for single image haze removal is proposed. In first stage, network fusses color information present in hazy images and generates multi-channel depth maps. The second stage estimates the scene transmission map from generated dark channels using multi channel multi scale convolutional neural network (McMs-CNN) to recover the original scene. To train the proposed network, we have used two standard datasets namely: ImageNet [5] and D-HAZY [1]. Performance evaluation of the proposed approach has been carried out using structural similarity index (SSIM), mean square error (MSE) and peak signal to noise ratio (PSNR). Performance analysis shows that the proposed approach outperforms the existing state-of-the-art methods for single image dehazing.