
The volumetric representation of human interactions is a fundamental domain in the development of immersive media productions and telecommunication applications. Particularly in the context of the rapid advancement of Extended Reality (XR) applications, volumetric data has proven to be an essential technology for future XR development. In this work, we present a new multimodal database to help advance the development of immersive technologies. Our database provides ethically compliant and diverse volumetric data: 27 participants display posed facial expressions and subtle body movements while speaking, and a further 11 participants wear head-mounted displays (HMDs). The recording system consists of a volumetric capture (VoCap) studio comprising 31 synchronized modules with 62 RGB cameras and 31 depth cameras. In addition to textured meshes, point clouds, and multi-view RGB-D data, we simultaneously capture light field (LF) data with a Lytro Illum camera. Finally, we evaluate the dataset on the tasks of facial expression classification, HMD removal, and point cloud reconstruction. The dataset can support the evaluation and performance testing of various XR algorithms, including but not limited to facial expression recognition and reconstruction, facial reenactment, and volumetric video. HEADSET, together with all of its associated raw data and a license agreement, will be made publicly available for research purposes.
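
As a quick illustration of working with the kind of multi-view RGB-D data the database provides, the sketch below fuses one synchronized color/depth pair into a colored point cloud with Open3D. The file paths and camera intrinsics are hypothetical placeholders, not the actual HEADSET layout or calibration.

```python
# Minimal sketch: fusing one synchronized RGB-D pair into a colored point
# cloud with Open3D. Paths and intrinsics are hypothetical placeholders.
import open3d as o3d

color = o3d.io.read_image("module_01/rgb/frame_0000.png")    # hypothetical path
depth = o3d.io.read_image("module_01/depth/frame_0000.png")  # hypothetical path

rgbd = o3d.geometry.RGBDImage.create_from_color_and_depth(
    color, depth, depth_scale=1000.0, convert_rgb_to_intensity=False)

# Placeholder pinhole intrinsics; real values would come from the studio calibration.
intrinsic = o3d.camera.PinholeCameraIntrinsic(
    width=1920, height=1080, fx=1400.0, fy=1400.0, cx=960.0, cy=540.0)

pcd = o3d.geometry.PointCloud.create_from_rgbd_image(rgbd, intrinsic)
o3d.io.write_point_cloud("frame_0000.ply", pcd)
```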

Related content

We present a new second-order accurate structure-preserving finite volume scheme for the solution of the compressible barotropic two-phase model of Romenski et al. in multiple space dimensions. The governing equations fall into the wider class of symmetric hyperbolic and thermodynamically compatible (SHTC) systems and consist of a set of first-order hyperbolic partial differential equations (PDEs). In the absence of algebraic source terms, the model is subject to a curl-free constraint on the relative velocity between the two phases. The main objective of this paper is therefore to preserve this structural property exactly at the discrete level as well. The new numerical method is based on a staggered grid arrangement in which the relative velocity field is stored at the cell vertices while all remaining variables are stored at the cell centers. This allows the definition of discretely compatible gradient and curl operators, which ensure that the discrete curl errors of the relative velocity field remain zero up to machine precision. A set of numerical results confirms this property experimentally as well.
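
To make the compatibility idea concrete, here is a minimal sketch (not the authors' code) of gradient and curl operators on a uniform 2D staggered grid, with scalars at cell centers and the vector field at cell vertices; with these stencils the discrete curl of a discrete gradient cancels identically, so curl errors stay at round-off level. The stencils are plausible guesses for illustration, not the paper's exact operators.

```python
# Compatible staggered-grid operators: curl(grad(phi)) == 0 exactly in floating point.
import numpy as np

nx, ny = 64, 64
dx, dy = 1.0 / nx, 1.0 / ny
phi = np.random.default_rng(0).standard_normal((nx, ny))  # cell-centered scalar

def grad(phi):
    """Gradient of a cell-centered scalar, evaluated at interior vertices."""
    gx = ((phi[1:, :-1] + phi[1:, 1:]) - (phi[:-1, :-1] + phi[:-1, 1:])) / (2 * dx)
    gy = ((phi[:-1, 1:] + phi[1:, 1:]) - (phi[:-1, :-1] + phi[1:, :-1])) / (2 * dy)
    return gx, gy

def curl(vx, vy):
    """z-component of the curl of a vertex-based vector field, at cell centers."""
    dvy_dx = ((vy[1:, :-1] + vy[1:, 1:]) - (vy[:-1, :-1] + vy[:-1, 1:])) / (2 * dx)
    dvx_dy = ((vx[:-1, 1:] + vx[1:, 1:]) - (vx[:-1, :-1] + vx[1:, :-1])) / (2 * dy)
    return dvy_dx - dvx_dy

gx, gy = grad(phi)
print(np.abs(curl(gx, gy)).max())  # zero up to machine precision (round-off only)
```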

The energy-efficient spikformer has been proposed by integrating the biologically plausible spiking neural network (SNN) with the artificial Transformer, whereby Spiking Self-Attention (SSA) is used to achieve both higher accuracy and lower computational cost. However, self-attention is not always necessary, especially under sparse spike-form computation. In this paper, we replace vanilla SSA (which uses dynamic bases computed from Query and Key) with spike-form Fourier Transform, Wavelet Transform, and their combinations (which use fixed triangular or wavelet bases), based on the key hypothesis that both rely on a set of basis functions for information transformation. Hence, the Fourier-or-Wavelet-based spikformer (FWformer) is proposed and verified on visual classification tasks, including both static image and event-based video datasets. Compared to the standard spikformer, the FWformer achieves comparable or even higher accuracy ($0.4\%$-$1.5\%$ improvement), higher running speed ($9\%$-$51\%$ for training and $19\%$-$70\%$ for inference), reduced theoretical energy consumption ($20\%$-$25\%$), and reduced GPU memory usage ($4\%$-$26\%$). Our results indicate that the continued refinement of new Transformers, inspired either by biological discovery (spike-form computation) or by information theory (Fourier or Wavelet Transforms), is promising.
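
The sketch below illustrates the core idea of swapping self-attention for a fixed-basis transform, in the spirit of an FNet-style Fourier mixing block. It is an ordinary (non-spiking) PyTorch illustration, not the paper's spike-form FWformer.

```python
# Fixed-basis token mixing in place of Q/K/V self-attention (FNet-style sketch).
import torch
import torch.nn as nn

class FourierMixer(nn.Module):
    """Parameter-free token mixing via a 2D FFT, keeping the real part."""
    def forward(self, x):              # x: (batch, tokens, dim)
        return torch.fft.fft2(x).real  # mixes along token and feature axes

class FWBlock(nn.Module):
    def __init__(self, dim, hidden):
        super().__init__()
        self.mixer = FourierMixer()
        self.norm1 = nn.LayerNorm(dim)
        self.norm2 = nn.LayerNorm(dim)
        self.mlp = nn.Sequential(nn.Linear(dim, hidden), nn.GELU(), nn.Linear(hidden, dim))

    def forward(self, x):
        x = x + self.mixer(self.norm1(x))   # fixed-basis mixing replaces attention
        return x + self.mlp(self.norm2(x))

x = torch.randn(8, 196, 384)                # e.g. 14x14 patch tokens
print(FWBlock(384, 1536)(x).shape)          # torch.Size([8, 196, 384])
```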

The primary motivation of this study is to address the high hardware and computational demands typically associated with LLMs. Our goal is therefore to strike a balance between model lightness and performance, maximizing performance while using a comparatively lightweight model. Hyacinth6B was developed with this objective in mind, aiming to fully leverage the core capabilities of LLMs without incurring substantial resource costs, effectively pushing the boundaries of smaller models' performance. The training approach involves parameter-efficient finetuning using the LoRA method.
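
As an illustration of what LoRA-based parameter-efficient finetuning looks like in practice, here is a sketch using the Hugging Face `peft` library; the base checkpoint identifier and all hyperparameters are placeholders, not Hyacinth6B's actual configuration.

```python
# Illustrative LoRA setup with Hugging Face peft; values are placeholders.
from transformers import AutoModelForCausalLM
from peft import LoraConfig, get_peft_model

base = AutoModelForCausalLM.from_pretrained("some-6b-base-model")  # placeholder id

config = LoraConfig(
    r=8,                                  # rank of the low-rank update matrices
    lora_alpha=16,                        # scaling factor for the updates
    lora_dropout=0.05,
    target_modules=["q_proj", "v_proj"],  # assumed attention projection names
    task_type="CAUSAL_LM",
)
model = get_peft_model(base, config)
model.print_trainable_parameters()        # only the LoRA adapters are trainable
```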

We propose an unfolded accelerated projected-gradient descent procedure to estimate model and algorithmic parameters for image super-resolution and molecule localization problems in image microscopy. The variational lower-level constraint enforces sparsity of the solution and encodes different noise statistics (Gaussian, Poisson), while the upper-level cost assesses optimality w.r.t. the task considered. In more detail, a standard $\ell_2$ cost is considered for image reconstruction problems (e.g., deconvolution/super-resolution, semi-blind deconvolution), while a smoothed $\ell_1$ cost is employed to assess localization precision in exemplary fluorescence microscopy problems exploiting single-molecule activation. Several numerical experiments are reported to validate the proposed approach on synthetic and realistic ISBI data.
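
For readers unfamiliar with unrolling, the sketch below shows a generic unfolded accelerated proximal-gradient (FISTA-style) network with a learnable step size and sparsity threshold for a least-squares data term; it illustrates the general technique, not the authors' exact architecture.

```python
# Unrolled FISTA for  min_x 0.5*||Ax - y||^2 + lam*||x||_1  with learnable parameters.
import torch
import torch.nn as nn

def soft_threshold(u, lam):
    # Proximal operator of lam*||.||_1, differentiable w.r.t. lam
    return torch.sign(u) * torch.clamp(torch.abs(u) - lam, min=0.0)

class UnrolledFISTA(nn.Module):
    def __init__(self, n_iters=10):
        super().__init__()
        self.step = nn.Parameter(torch.tensor(0.1))   # learnable step size
        self.lam = nn.Parameter(torch.tensor(0.01))   # learnable sparsity weight
        self.n_iters = n_iters

    def forward(self, A, y):
        x = torch.zeros(A.shape[1], device=y.device)
        z, t = x, 1.0
        for _ in range(self.n_iters):
            grad = A.T @ (A @ z - y)                          # gradient of the l2 data term
            x_new = soft_threshold(z - self.step * grad, self.step * self.lam)
            t_new = (1.0 + (1.0 + 4.0 * t * t) ** 0.5) / 2.0  # Nesterov momentum update
            z = x_new + ((t - 1.0) / t_new) * (x_new - x)
            x, t = x_new, t_new
        return x
```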

This article provides a critical review of the main methods used to produce conservative estimators of probabilities of rare events, or critical failures, for reliability and certification studies in the broadest sense. These probabilities must in principle be calculated from simulations of (certified) numerical models, which typically suffer from prohibitive computational costs; this occurs frequently, for instance, with complex and critical industrial systems. We therefore focus on adapting the common use of surrogates to replace these numerical models, the aim being to offer a high level of confidence in the results. We suggest avenues of research to improve the guarantees currently attainable.
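
As a small example of what "conservative" means here, the sketch below combines plain Monte Carlo with an exact Clopper-Pearson upper confidence bound on a toy failure probability; the failure event is a stand-in for an expensive numerical model, and the method shown is a textbook baseline rather than one of the surrogate-based approaches reviewed.

```python
# Conservative one-sided Monte Carlo bound on a small failure probability.
import numpy as np
from scipy.stats import beta

def upper_bound(k, n, confidence=0.95):
    """Exact (Clopper-Pearson) upper confidence bound on p from k failures in n runs."""
    if k == n:
        return 1.0
    return beta.ppf(confidence, k + 1, n - k)

rng = np.random.default_rng(0)
n = 100_000
x = rng.standard_normal((n, 2))
failures = int(np.sum(x.sum(axis=1) > 5.0))    # toy rare failure event
print(failures / n, upper_bound(failures, n))  # point estimate vs conservative bound
```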

This paper presents a new approach to evaluating the stability of approximate Riemann solvers, based on the direct Lyapunov method. The methodology offers a detailed understanding of the origins of numerical shock instability in approximate Riemann solvers. The pressure perturbation feeding the density and transverse momentum perturbations is identified as the cause of numerical shock instabilities in complete approximate Riemann solvers, and the magnitude of these instabilities is found to be proportional to the magnitude of the pressure perturbations. A shock-stable HLLEM scheme is proposed based on the insights obtained from this analysis. A set of numerical test cases shows that the proposed scheme is free from the numerical shock instability problems of the original HLLEM scheme at high Mach numbers.
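
For context, the sketch below implements the basic HLL flux for the 1D Euler equations, the building block that the HLLEM scheme augments with antidiffusive terms; the shock-stable modification proposed in the paper is not reproduced here.

```python
# Basic HLL approximate Riemann flux for the 1D Euler equations.
import numpy as np

GAMMA = 1.4

def euler_flux(U):
    rho, mom, E = U
    u = mom / rho
    p = (GAMMA - 1.0) * (E - 0.5 * rho * u * u)
    return np.array([mom, mom * u + p, (E + p) * u])

def hll_flux(UL, UR):
    """HLL flux with simple Davis wave-speed estimates; U = [rho, rho*u, E]."""
    rhoL, uL = UL[0], UL[1] / UL[0]
    rhoR, uR = UR[0], UR[1] / UR[0]
    pL = (GAMMA - 1.0) * (UL[2] - 0.5 * rhoL * uL**2)
    pR = (GAMMA - 1.0) * (UR[2] - 0.5 * rhoR * uR**2)
    aL, aR = np.sqrt(GAMMA * pL / rhoL), np.sqrt(GAMMA * pR / rhoR)
    sL, sR = min(uL - aL, uR - aR), max(uL + aL, uR + aR)  # wave-speed estimates
    FL, FR = euler_flux(UL), euler_flux(UR)
    if sL >= 0.0:
        return FL
    if sR <= 0.0:
        return FR
    return (sR * FL - sL * FR + sL * sR * (UR - UL)) / (sR - sL)
```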

Since ChatGPT possesses powerful capabilities in natural language processing and code analysis, it has received widespread attention since its launch. Developers have applied these capabilities to various domains through software projects hosted on GitHub, the largest open-source platform worldwide, and these projects have triggered extensive discussion. To understand the research content of these projects and the potential requirements being discussed, we collected ChatGPT-related projects from GitHub and used the LDA topic model to identify the discussion topics. Specifically, we selected 200 projects and categorized them, by analyzing their descriptions, into three primary categories: ChatGPT implementation & training, ChatGPT application, and ChatGPT improvement & extension. We then employed the LDA topic model to identify 10 topics from issue texts and compared the distribution and evolution of the discovered topics across the three primary project categories. Our observations include: (1) the monthly growth in the number of projects in each of the three primary categories is closely associated with the development of ChatGPT; (2) the popularity of each topic varies significantly across the three primary categories; (3) the monthly changes in the absolute impact of each topic differ across the three primary categories, often closely tracking the variation in the number of projects in that category; and (4) over time, the relative impact of each topic exhibits different development trends in the three primary categories. Based on these findings, we discuss implications for developers and users.
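
The topic-identification step can be sketched with scikit-learn's LDA implementation as below; the issue texts are placeholders standing in for the corpus mined from the 200 ChatGPT-related projects.

```python
# LDA topic modeling over issue texts (placeholder corpus).
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.decomposition import LatentDirichletAllocation

issues = [
    "api key error when calling the chat endpoint",
    "add support for streaming responses in the web ui",
    "fine-tuning crashes with out of memory on small gpus",
]  # placeholder issue texts, not the study's actual corpus

counts = CountVectorizer(stop_words="english").fit_transform(issues)
lda = LatentDirichletAllocation(n_components=10, random_state=0).fit(counts)
doc_topics = lda.transform(counts)  # per-issue topic distribution
print(doc_topics.shape)             # (n_issues, 10)
```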

The advent of ChatGPT and similar large language models (LLMs) has revolutionized human-AI interaction and the information-seeking process. Leveraging LLMs as an alternative to search engines, users can now access summarized information tailored to their queries, significantly reducing the cognitive load of navigating vast information resources. This shift underscores the potential of LLMs to redefine information access paradigms. Drawing on the foundations of task-focused information retrieval and LLMs' task planning ability, this research extends the scope of LLM capabilities beyond routine task automation to support users in navigating long-term and significant life tasks. It introduces the GOLF framework (Goal-Oriented Long-term liFe tasks), which focuses on enhancing LLMs' ability to assist in significant life decisions through goal orientation and long-term planning. The methodology encompasses a comprehensive simulation study to test the framework's efficacy, followed by model and human evaluations to develop a benchmark dataset for long-term life tasks, and experiments across different models and settings. By shifting the focus from short-term tasks to the broader spectrum of long-term life goals, this research underscores the transformative potential of LLMs in enhancing human decision-making and task management, marking a significant step forward in the evolution of human-AI collaboration.

Regularization of inverse problems is of paramount importance in computational imaging. The ability of neural networks to learn efficient image representations has recently been exploited to design powerful data-driven regularizers. While state-of-the-art plug-and-play methods rely on an implicit regularization provided by neural denoisers, alternative Bayesian approaches consider Maximum A Posteriori (MAP) estimation in the latent space of a generative model, and thus an explicit regularization. However, state-of-the-art deep generative models require a huge amount of training data compared to denoisers, and their complexity hampers the optimization involved in latent MAP derivation. In this work, we first propose to use compressive autoencoders instead. These networks, which can be seen as variational autoencoders with a flexible latent prior, are smaller and easier to train than state-of-the-art generative models. As a second contribution, we introduce the Variational Bayes Latent Estimation (VBLE) algorithm, which performs latent estimation within the framework of variational inference. Thanks to a simple yet efficient parameterization of the variational posterior, VBLE allows for fast and easy (approximate) posterior sampling. Experimental results on the BSD and FFHQ image datasets demonstrate that VBLE reaches performance similar to state-of-the-art plug-and-play methods while quantifying uncertainties faster than other existing posterior sampling techniques.
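
A schematic sketch of the VBLE idea, under assumed names: fit a Gaussian variational posterior $q(z)$ in the latent space of a pretrained decoder by minimizing a negative ELBO with the reparameterization trick, then sample by decoding draws from $q(z)$. The `decoder` and degradation operator `A` are placeholders, and the parameterization shown is a generic one, not necessarily the paper's.

```python
# Generic latent variational estimation for an inverse problem y = A(D(z)) + noise.
import torch

def vble(y, A, decoder, dim, steps=500, lr=1e-2, noise_std=0.05):
    mu = torch.zeros(dim, requires_grad=True)
    log_sigma = torch.full((dim,), -2.0, requires_grad=True)
    opt = torch.optim.Adam([mu, log_sigma], lr=lr)
    for _ in range(steps):
        z = mu + log_sigma.exp() * torch.randn(dim)           # reparameterization trick
        data_term = ((A(decoder(z)) - y) ** 2).sum() / (2 * noise_std**2)
        # KL divergence between q(z) = N(mu, diag(sigma^2)) and the N(0, I) prior
        kl = 0.5 * (mu**2 + (2 * log_sigma).exp() - 2 * log_sigma - 1).sum()
        loss = data_term + kl                                 # negative ELBO (up to a constant)
        opt.zero_grad()
        loss.backward()
        opt.step()
    # Approximate posterior sample: decode a draw z ~ q(z)
    return decoder(mu + log_sigma.exp() * torch.randn(dim))
```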

Self-supervised learning methods are gaining traction in computer vision due to their recent success in reducing the gap with supervised learning. In natural language processing (NLP), self-supervised learning and transformers are already the methods of choice, and the recent literature suggests that transformers are becoming increasingly popular in computer vision as well. So far, vision transformers have been shown to work well when pretrained either on large-scale supervised data or with some kind of co-supervision, e.g., from a teacher network. These supervised pretrained vision transformers achieve very good results on downstream tasks with minimal changes. In this work, we investigate the merits of self-supervised learning for pretraining vision transformers and subsequently using them for downstream classification tasks. We propose Self-supervised vIsion Transformers (SiT) and discuss several self-supervised training mechanisms for obtaining a pretext model. The architectural flexibility of SiT allows us to use it as an autoencoder and work with multiple self-supervised tasks seamlessly. We show that a pretrained SiT can be finetuned for a downstream classification task on small-scale datasets consisting of a few thousand images rather than several million. The proposed approach is evaluated on standard datasets using common protocols. The results demonstrate the strength of transformers and their suitability for self-supervised learning: we outperform existing self-supervised learning methods by a large margin. We also observe that SiT is well suited to few-shot learning, and show that it learns useful representations by simply training a linear classifier on top of its learned features. Pretraining, finetuning, and evaluation code will be available at: //github.com/Sara-Ahmed/SiT.
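
To illustrate one plausible pretext objective of this kind (transformer-as-autoencoder reconstruction), here is a sketch of a single training step that corrupts random patches and trains the network to reconstruct the clean image; `vit_autoencoder` is a placeholder image-to-image model, and the full SiT method combines several pretext tasks.

```python
# One reconstruction pretext step: corrupt random patches, reconstruct the clean image.
import torch
import torch.nn.functional as F

def pretext_step(vit_autoencoder, images, optimizer, patch=16, drop_ratio=0.3):
    B, C, H, W = images.shape
    n_cols = W // patch
    n_patches = (H // patch) * n_cols
    corrupted = images.clone()
    for b in range(B):
        # Zero out a random subset of patches (a simple stand-in for SiT's corruptions).
        for k in torch.randperm(n_patches)[: int(drop_ratio * n_patches)].tolist():
            i, j = (k // n_cols) * patch, (k % n_cols) * patch
            corrupted[b, :, i:i + patch, j:j + patch] = 0.0
    recon = vit_autoencoder(corrupted)
    loss = F.l1_loss(recon, images)  # reconstruction loss against the clean input
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```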
