Analysis · Large language models · MoDELS · Language modeling · Learning

11 December 2023

Can Large Language Models emulate an inductive Thematic Analysis of semi-structured interviews? An exploration and provocation on the limits of the approach and the model

Stefano De Paoli

Large Language Models (LLMs) have emerged as powerful generative Artificial Intelligence solutions that can be applied to several fields and areas of work. This paper presents the results of, and reflections on, an experiment that used the GPT 3.5-Turbo model to emulate some aspects of an inductive Thematic Analysis. Previous research on this subject has largely focused on deductive analysis. Thematic Analysis is a qualitative method of analysis commonly used in the social sciences; it is based on interpretations made by the human analyst(s) and on the identification of explicit and latent meanings in qualitative data. Attempting an analysis based on human interpretation with an LLM is clearly a provocation, but it is also a way to learn something about how these systems can or cannot be used in qualitative research. The paper presents the motivations for attempting this emulation, reflects on how the six steps of a Thematic Analysis proposed by Braun and Clarke can at least partially be reproduced with the LLM, and reflects on the outputs produced by the model. The paper used two existing open-access datasets of semi-structured interviews, previously analysed with Thematic Analysis by other researchers, and compared the previously produced analysis (and the related themes) with the results produced by the LLM. The results show that the model can infer, at least partially, some of the main themes. The objective of the paper is not to replace human analysts in qualitative analysis but to learn whether some elements of LLM data manipulation can, to an extent, support qualitative research.

Related content

Markov · Markov chain · Estimation/Estimator · Monte Carlo · Analysis

31 January 2024

A non-asymptotic error analysis for parallel Monte Carlo estimation from many short Markov chains

Austin Brown

Single-chain Markov chain Monte Carlo simulates realizations from a Markov chain to estimate expectations with the empirical average. The single-chain simulation is generally of considerable length and restricts many advantages of modern parallel computation. This paper constructs a novel many-short-chains Monte Carlo (MSC) estimator by averaging over multiple independent sums from Markov chains of a guaranteed short length. The computational advantage is that the independent Markov chain simulations can be fast and may be run in parallel. The MSC estimator requires an importance sampling proposal and a drift condition on the Markov chain, without requiring convergence analysis of the Markov chain. A non-asymptotic error analysis is developed for the MSC estimator under both geometric and multiplicative drift conditions. Empirical performance is illustrated on an autoregressive process and on the Pólya-Gamma Gibbs sampler for Bayesian logistic regression to predict cardiovascular disease.
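
The idea of averaging over many independent short chains can be illustrated with a minimal Python sketch on an autoregressive process. This is not the paper's MSC estimator: it omits the importance sampling proposal and the drift condition, and it assumes every chain starts from the stationary distribution so that plain averaging is unbiased. The function names, chain length, and number of chains are illustrative choices.

```python
import random
import statistics


def short_chain_average(rho, length, rng):
    """Run one short AR(1) chain x' = rho * x + noise; return the mean of x**2 along it."""
    # Assumption: start from the stationary law N(0, 1 / (1 - rho^2)).
    x = rng.gauss(0.0, (1.0 / (1.0 - rho ** 2)) ** 0.5)
    total = 0.0
    for _ in range(length):
        x = rho * x + rng.gauss(0.0, 1.0)
        total += x * x
    return total / length


def many_short_chains_estimate(rho, n_chains, length, seed=0):
    """Average the per-chain averages over many independent short chains."""
    # A single generator is used here for simplicity; in a parallel run each
    # chain would get its own independent random stream.
    rng = random.Random(seed)
    per_chain = [short_chain_average(rho, length, rng) for _ in range(n_chains)]
    return statistics.fmean(per_chain)


rho = 0.5
estimate = many_short_chains_estimate(rho, n_chains=4000, length=10)
exact = 1.0 / (1.0 - rho ** 2)  # stationary second moment of the AR(1) process
print(f"estimate={estimate:.3f}, exact={exact:.3f}")
```

Because each chain is short and independent of the others, the list comprehension over chains is the part that could be distributed across workers.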

MoDELS · IR · Performer · Safari · Image restoration

31 January 2024

Spatial-and-Frequency-aware Restoration method for Images based on Diffusion Models

Kyungsung Lee,Donggyu Lee,Myungjoo Kang

Diffusion models have recently emerged as a promising framework for Image Restoration (IR), owing to their ability to produce high-quality reconstructions and their compatibility with established methods. Existing methods for solving noisy inverse problems in IR consider only pixel-wise data fidelity. In this paper, we propose SaFaRI, a spatial-and-frequency-aware diffusion model for IR with Gaussian noise. Our model encourages images to preserve data fidelity in both the spatial and frequency domains, resulting in enhanced reconstruction quality. We comprehensively evaluate the performance of our model on a variety of noisy inverse problems, including inpainting, denoising, and super-resolution. Our thorough evaluation demonstrates that SaFaRI achieves state-of-the-art performance on both the ImageNet and FFHQ datasets, outperforming existing zero-shot IR methods in terms of LPIPS and FID metrics.

GPU · Graph · Networking · Neural Networks · Rectified linear

30 January 2024

Graph Neural Networks with polynomial activations have limited expressivity

Sammy Khalife

The expressivity of Graph Neural Networks (GNNs) can be entirely characterized by appropriate fragments of first-order logic. Namely, any query of the two-variable fragment of graded modal logic (GC2) interpreted over labeled graphs can be expressed using a GNN whose size depends only on the depth of the query. As pointed out by [Barcelo et al., 2020; Grohe, 2021], this description holds for a family of activation functions, leaving open the possibility of a hierarchy of logics expressible by GNNs depending on the chosen activation function. In this article, we show that such a hierarchy indeed exists by proving that GC2 queries cannot be expressed by GNNs with polynomial activation functions. This implies a separation between polynomial and popular non-polynomial activations (such as Rectified Linear Units) and answers an open question formulated by [Grohe, 2021].

Microsoft Surface · Discretization · Analysis · Model evaluation · Robustness

29 January 2024

Mixed-Order Meshes through rp-adaptivity for Surface Fitting to Implicit Geometries

Ketan Mittal,Veselin A. Dobrev,Patrick Knupp,Tzanio Kolev,Franck Ledoux,Claire Roche,Vladimir Z. Tomov

from arXiv, 14 pages, 11 figures

Computational analysis with the finite element method requires geometrically accurate meshes. It is well known that high-order meshes can accurately capture curved surfaces with fewer degrees of freedom than low-order meshes. Existing techniques for high-order mesh generation typically output meshes with the same polynomial order for all elements. However, high-order elements away from curvilinear boundaries or interfaces increase the computational cost of the simulation without increasing geometric accuracy. In prior work, we presented one such approach for generating body-fitted uniform-order meshes, which takes a given mesh and morphs it to align with the surface of interest prescribed as the zero isocontour of a level-set function. We extend this method to generate mixed-order meshes such that curved surfaces of the domain are discretized with high-order elements, while low-order elements are used elsewhere. Numerical experiments demonstrate the robustness of the approach and show that it can be used to generate mixed-order meshes that are much more efficient than high uniform-order meshes. The proposed approach is purely algebraic and extends to different types of elements (quadrilaterals/triangles/tetrahedra/hexahedra) in two and three dimensions.

MoDELS · Decoding · BERT · Masked language modeling · Performer

29 January 2024

DrBERT: Unveiling the Potential of Masked Language Modeling Decoder in BERT pretraining

Wen Liang,Youzhi Liang

BERT (Bidirectional Encoder Representations from Transformers) has revolutionized the field of natural language processing through its exceptional performance on numerous tasks. Yet, the majority of researchers have mainly concentrated on enhancements related to the model structure, such as relative position embedding and more efficient attention mechanisms. Others have delved into pretraining tricks associated with Masked Language Modeling, including whole word masking. DeBERTa introduced an enhanced decoder adapted for BERT's encoder model for pretraining, proving to be highly effective. We argue that the design and research around enhanced masked language modeling decoders have been underappreciated. In this paper, we propose several designs of enhanced decoders and introduce DrBERT (Decoder-refined BERT), a novel method for modeling training. Typically, a pretrained BERT model is fine-tuned for specific Natural Language Understanding (NLU) tasks. In our approach, we utilize the original BERT model as the encoder, making only changes to the decoder without altering the encoder. This approach does not necessitate extensive modifications to the model's architecture and can be seamlessly integrated into existing fine-tuning pipelines and services, offering an efficient and effective enhancement strategy. Compared to other methods, while we also incur a moderate training cost for the decoder during the pretraining process, our approach does not introduce additional training costs during the fine-tuning phase. We test multiple enhanced decoder structures after pretraining and evaluate their performance on the GLUE benchmark. Our results demonstrate that DrBERT, having only undergone subtle refinements to the model structure during pretraining, significantly enhances model performance without escalating the inference time and serving budget.

Processing (programming language) · Estimation/Estimator · Maximum likelihood estimation · Maximum likelihood · Likelihood

28 January 2024

The Hubbert diffusion process: Estimation via simulated annealing and variable neighborhood search procedures. Application to forecasting peak oil production

Istoni da Luz Sant'Ana,Patricia Román-Román,Francisco Torres-Ruiz

from arXiv, 30 pages, 5 figures

Accurately charting the progress of oil production is a problem of great current interest. Oil production is widely known to be cyclical: in any given system, after it reaches its peak, a decline will begin. With this in mind, Marion King Hubbert developed his peak theory in 1956 based on the bell-shaped curve that bears his name. In the present work, we consider a stochastic model based on the theory of diffusion processes and associated with the Hubbert curve. The problem of the maximum likelihood estimation of the parameters for this process is also considered. Since a complex system of equations appears, with a solution that cannot be guaranteed by classical numerical procedures, we suggest the use of metaheuristic optimization algorithms such as simulated annealing and variable neighborhood search. Some strategies are suggested for bounding the space of solutions, and a description is provided for the application of the algorithms selected. In the case of the variable neighborhood search algorithm, a hybrid method is proposed in which it is combined with simulated annealing. In order to validate the theory developed here, we also carry out some studies based on simulated data and consider two real crude oil production scenarios from Norway and Kazakhstan.
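
As a rough illustration of the two ingredients named above, the Python sketch below evaluates the deterministic Hubbert curve (the derivative of a logistic cumulative-production curve) and fits its parameters with a basic simulated annealing loop. It is a simplified stand-in, not the paper's procedure: it minimizes a least-squares error on noise-free synthetic data rather than maximizing the likelihood of a diffusion process, and the cooling schedule, proposal scale, parameter names, and starting point are all assumptions made for the example.

```python
import math
import random


def hubbert_rate(t, q_total, k, t_peak):
    """Hubbert production rate: derivative of q_total / (1 + exp(-k * (t - t_peak)))."""
    # The curve is symmetric about t_peak, so exp(-|z|) gives the same value
    # as exp(-z) while avoiding overflow for large |z|.
    a = math.exp(-abs(k * (t - t_peak)))
    return q_total * k * a / (1.0 + a) ** 2


def squared_error(params, times, observed):
    q_total, k, t_peak = params
    return sum((hubbert_rate(t, q_total, k, t_peak) - y) ** 2
               for t, y in zip(times, observed))


def simulated_annealing(times, observed, start, steps=5000, temp0=1.0, seed=0):
    rng = random.Random(seed)
    current = list(start)
    current_loss = squared_error(current, times, observed)
    best, best_loss = list(current), current_loss
    for step in range(steps):
        temp = temp0 * (1.0 - step / steps) + 1e-9  # linear cooling (an assumption)
        candidate = [p + rng.gauss(0.0, 0.05 * abs(p) + 1e-3) for p in current]
        if candidate[0] <= 0 or candidate[1] <= 0:
            continue  # keep q_total and k positive
        loss = squared_error(candidate, times, observed)
        # Metropolis rule: always accept improvements, sometimes accept worse moves.
        if loss < current_loss or rng.random() < math.exp((current_loss - loss) / temp):
            current, current_loss = candidate, loss
            if loss < best_loss:
                best, best_loss = list(candidate), loss
    return best, best_loss


true_params = (100.0, 0.2, 30.0)  # hypothetical q_total, k, t_peak
times = list(range(61))
observed = [hubbert_rate(t, *true_params) for t in times]
start = (80.0, 0.3, 25.0)
start_loss = squared_error(start, times, observed)
best, best_loss = simulated_annealing(times, observed, start)
print(best, best_loss)
```

The peak of the curve sits at `t_peak` with height `q_total * k / 4`, which is a convenient sanity check on any fitted parameters.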

Knowledge · Language modeling · Large language models · Reducible · MoDELS

28 January 2024

Mitigating Hallucinations of Large Language Models via Knowledge Consistent Alignment

Fanqi Wan,Xinting Huang,Leyang Cui,Xiaojun Quan,Wei Bi,Shuming Shi

from arXiv, work in progress

While Large Language Models (LLMs) have proven to be exceptional on a variety of tasks after alignment, they may still confidently produce responses that contradict the context or world knowledge, a phenomenon known as "hallucination". In this paper, we demonstrate that reducing the inconsistency between the external knowledge encapsulated in the training data and the intrinsic knowledge inherited from the pretraining corpus can mitigate hallucination in alignment. Specifically, we introduce a novel knowledge consistent alignment (KCA) approach, which involves automatically formulating examinations based on external knowledge for assessing the comprehension of LLMs. For data encompassing knowledge inconsistency, KCA implements several simple yet efficient strategies for processing. We illustrate the superior performance of the proposed KCA approach in mitigating hallucinations across six benchmarks using LLMs of different backbones and scales. Furthermore, we confirm the correlation between knowledge inconsistency and hallucination, signifying the effectiveness of reducing knowledge inconsistency in alleviating hallucinations. Our code, model weights, and data are public at https://github.com/fanqiwan/KCA.

Integration · INFORMS · Networking · Statistic · Comprehensibility

26 January 2024

Evolving higher-order synergies reveals a trade-off between stability and information integration capacity in complex systems

Thomas F. Varley,Joshua Bongard

There has recently been an explosion of interest in how "higher-order" structures emerge in complex systems. This "emergent" organization has been found in a variety of natural and artificial systems, although at present the field lacks a unified understanding of what the consequences of higher-order synergies and redundancies are for systems. Typical research treats the presence (or absence) of synergistic information as a dependent variable and reports changes in the level of synergy in response to some change in the system. Here, we attempt to flip the script: rather than treating higher-order information as a dependent variable, we use evolutionary optimization to evolve Boolean networks with significant higher-order redundancies, synergies, or statistical complexity. We then analyse these evolved populations of networks using established tools for characterizing discrete dynamics: the number of attractors, average transient length, and Derrida coefficient. We also assess the capacity of the systems to integrate information. We find that high-synergy systems are unstable and chaotic, but with a high capacity to integrate information. In contrast, evolved redundant systems are extremely stable, but have negligible capacity to integrate information. Finally, the complex systems that balance integration and segregation (known as Tononi-Sporns-Edelman complexity) show features of both chaoticity and stability, with a greater capacity to integrate information than the redundant systems while being more stable than the random and synergistic systems. We conclude that there may be a fundamental trade-off between the robustness of a system's dynamics and its capacity to integrate information (which inherently requires flexibility and sensitivity), and that certain kinds of complexity naturally balance this trade-off.
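
Two of the dynamical measures mentioned above, the number of attractors and the average transient length, can be computed exhaustively for a small Boolean network. The Python sketch below does this for a randomly wired network with synchronous updates. It is only an illustration under assumed settings (6 nodes, 2 inputs per node, random truth tables); it does not evolve networks, and it does not compute the Derrida coefficient or any synergy measure.

```python
import itertools
import random


def random_boolean_network(n_nodes, k_inputs, seed=0):
    """Pick k distinct input nodes and a random truth table for every node."""
    rng = random.Random(seed)
    inputs = [rng.sample(range(n_nodes), k_inputs) for _ in range(n_nodes)]
    tables = [[rng.randint(0, 1) for _ in range(2 ** k_inputs)] for _ in range(n_nodes)]
    return inputs, tables


def step(state, inputs, tables):
    """Synchronously update every node from the current state."""
    nxt = []
    for node_inputs, table in zip(inputs, tables):
        index = 0
        for i in node_inputs:
            index = (index << 1) | state[i]  # pack input bits into a table index
        nxt.append(table[index])
    return tuple(nxt)


def find_attractors(n_nodes, inputs, tables):
    """Follow every initial state until it repeats; collect cycles and transient lengths."""
    attractors = set()
    transients = []
    for state in itertools.product((0, 1), repeat=n_nodes):
        seen = {}
        t = 0
        while state not in seen:
            seen[state] = t
            state = step(state, inputs, tables)
            t += 1
        cycle_start = seen[state]  # steps taken before the trajectory enters its cycle
        cycle = frozenset(s for s, when in seen.items() if when >= cycle_start)
        attractors.add(cycle)
        transients.append(cycle_start)
    return attractors, sum(transients) / len(transients)


n_nodes = 6
inputs, tables = random_boolean_network(n_nodes, k_inputs=2)
attractors, mean_transient = find_attractors(n_nodes, inputs, tables)
print(len(attractors), mean_transient)
```

Exhaustive enumeration visits all 2^n states, so this approach is only practical for small networks.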

Gemini · GPT-4 · Comprehensibility · Modality · Performer

26 January 2024

From GPT-4 to Gemini and Beyond: Assessing the Landscape of MLLMs on Generalizability, Trustworthiness and Causality through Four Modalities

Chaochao Lu,Chen Qian,Guodong Zheng,Hongxing Fan,Hongzhi Gao,Jie Zhang,Jing Shao,Jingyi Deng,Jinlan Fu,Kexin Huang,Kunchang Li,Lijun Li,Limin Wang,Lu Sheng,Meiqi Chen,Ming Zhang,Qibing Ren,Sirui Chen,Tao Gui,Wanli Ouyang,Yali Wang,Yan Teng,Yaru Wang,Yi Wang,Yinan He,Yingchun Wang,Yixu Wang,Yongting Zhang,Yu Qiao,Yujiong Shen,Yurong Mou,Yuxi Chen,Zaibin Zhang,Zhelun Shi,Zhenfei Yin,Zhipin Wang

Multi-modal Large Language Models (MLLMs) have shown impressive abilities in generating reasonable responses with respect to multi-modal contents. However, there is still a wide gap between the performance of recent MLLM-based applications and the expectations of the broad public, even though the most powerful models, OpenAI's GPT-4 and Google's Gemini, have been deployed. This paper strives to enhance understanding of the gap through the lens of a qualitative study on the generalizability, trustworthiness, and causal reasoning capabilities of recent proprietary and open-source MLLMs across four modalities, i.e., text, code, image, and video, ultimately aiming to improve the transparency of MLLMs. We believe these properties are several representative factors that define the reliability of MLLMs in supporting various downstream applications. To be specific, we evaluate the closed-source GPT-4 and Gemini and 6 open-source LLMs and MLLMs. Overall we evaluate 230 manually designed cases, where the qualitative results are then summarized into 12 scores (i.e., 4 modalities times 3 properties). In total, we uncover 14 empirical findings that are useful for understanding the capabilities and limitations of both proprietary and open-source MLLMs, towards more reliable downstream multi-modal applications.

Biased · Estimation/Estimator · Variance · Bootstrap · Linear

26 January 2024

Efficient Bias Correction for Cross-section and Panel Data

Jinyong Hahn,David W. Hughes,Guido Kuersteiner,Whitney K. Newey

Bias correction can often improve the finite sample performance of estimators. We show that the choice of bias correction method has no effect on the higher-order variance of semiparametrically efficient parametric estimators, so long as the estimate of the bias is asymptotically linear. It is also shown that bootstrap, jackknife, and analytical bias estimates are asymptotically linear for estimators with higher-order expansions of a standard form. In particular, we find that for a variety of estimators the straightforward bootstrap bias correction gives the same higher-order variance as more complicated analytical or jackknife bias corrections. In contrast, bias corrections that do not estimate the bias at the parametric rate, such as the split-sample jackknife, result in larger higher-order variances in the i.i.d. setting we focus on. For both a cross-sectional MLE and a panel model with individual fixed effects, we show that the split-sample jackknife has a higher-order variance term that is twice as large as that of the "leave-one-out" jackknife.
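
The bootstrap and leave-one-out jackknife corrections compared above can be sketched in Python on a textbook case. The example is not taken from the paper: the estimator (the plug-in variance, which divides by n and is biased downward), the simulated sample, and the number of bootstrap replications are all assumptions chosen to keep the illustration small.

```python
import random
import statistics


def plug_in_variance(sample):
    """Variance with divisor n (the Gaussian MLE), biased downward by a factor (n - 1) / n."""
    m = sum(sample) / len(sample)
    return sum((x - m) ** 2 for x in sample) / len(sample)


def jackknife_bias_corrected(estimator, sample):
    """Leave-one-out jackknife: n * theta - (n - 1) * mean of leave-one-out estimates."""
    n = len(sample)
    theta = estimator(sample)
    leave_one_out = [estimator(sample[:i] + sample[i + 1:]) for i in range(n)]
    return n * theta - (n - 1) * statistics.fmean(leave_one_out)


def bootstrap_bias_corrected(estimator, sample, n_boot=5000, seed=0):
    """Bootstrap: subtract the estimated bias, mean(theta*) - theta, from theta."""
    rng = random.Random(seed)
    n = len(sample)
    theta = estimator(sample)
    replicates = [estimator(rng.choices(sample, k=n)) for _ in range(n_boot)]
    return 2 * theta - statistics.fmean(replicates)


data_rng = random.Random(1)
sample = [data_rng.gauss(0.0, 1.0) for _ in range(50)]

raw = plug_in_variance(sample)
jackknife = jackknife_bias_corrected(plug_in_variance, sample)
bootstrap = bootstrap_bias_corrected(plug_in_variance, sample)
print(f"plug-in={raw:.4f}, jackknife={jackknife:.4f}, bootstrap={bootstrap:.4f}")
```

For this particular estimator the jackknife correction reproduces the usual unbiased sample variance (divisor n - 1), and the bootstrap correction moves the plug-in estimate in the same direction.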