亚洲男人的天堂2018av,欧美草比,久久久久久免费视频精选,国色天香在线看免费,久久久久亚洲av成人片仓井空

In Large Language Models (LLMs), there have been consistent advancements in task-specific performance, largely influenced by effective prompt design. While recent research on prompting has enhanced the reasoning capabilities of LLMs, a gap remains in further improving their understanding abilities. In this study, we introduce Metacognitive Prompting (MP), a strategy inspired by human introspective reasoning processes. Using MP, LLMs undergo a systematic series of structured, self-aware evaluations, drawing on both their vast inherent knowledge and new insights. Our experiments involve five prevalent LLMs: Llama2, Vicuna, PaLM, GPT-3.5, and GPT-4, all of which span various general natural language understanding (NLU) tasks from the GLUE and SuperGLUE benchmarks. Results indicate that, although GPT-4 consistently excels in most tasks, PaLM, when equipped with MP, approaches its performance level. Furthermore, across models and datasets, MP consistently outperforms existing prompting methods, including standard and chain-of-thought prompting. This study underscores the potential to amplify the understanding abilities of LLMs and highlights the benefits of mirroring human introspective reasoning in NLU tasks.

相關內容

Large Language Models (LLMs) are quickly becoming ubiquitous, but the implications for social science research are not yet well understood. This paper asks whether LLMs can help us analyse large-N qualitative data from open-ended interviews, with an application to transcripts of interviews with Rohingya refugees in Cox's Bazaar, Bangladesh. We find that a great deal of caution is needed in using LLMs to annotate text as there is a risk of introducing biases that can lead to misleading inferences. We here mean bias in the technical sense, that the errors that LLMs make in annotating interview transcripts are not random with respect to the characteristics of the interview subjects. Training simpler supervised models on high-quality human annotations with flexible coding leads to less measurement error and bias than LLM annotations. Therefore, given that some high quality annotations are necessary in order to asses whether an LLM introduces bias, we argue that it is probably preferable to train a bespoke model on these annotations than it is to use an LLM for annotation.

In this paper, we establish a benchmark named HalluQA (Chinese Hallucination Question-Answering) to measure the hallucination phenomenon in Chinese large language models. HalluQA contains 450 meticulously designed adversarial questions, spanning multiple domains, and takes into account Chinese historical culture, customs, and social phenomena. During the construction of HalluQA, we consider two types of hallucinations: imitative falsehoods and factual errors, and we construct adversarial samples based on GLM-130B and ChatGPT. For evaluation, we design an automated evaluation method using GPT-4 to judge whether a model output is hallucinated. We conduct extensive experiments on 24 large language models, including ERNIE-Bot, Baichuan2, ChatGLM, Qwen, SparkDesk and etc. Out of the 24 models, 18 achieved non-hallucination rates lower than 50%. This indicates that HalluQA is highly challenging. We analyze the primary types of hallucinations in different types of models and their causes. Additionally, we discuss which types of hallucinations should be prioritized for different types of models.

In recent years, there has been significant interest in the development of machine learning-based optimization proxies for AC Optimal Power Flow (AC-OPF). Although significant progress has been achieved in predicting high-quality primal solutions, no existing learning-based approach can provide valid dual bounds for AC-OPF. This paper addresses this gap by training optimization proxies for a convex relaxation of AC-OPF. Namely, the paper considers a second-order cone (SOC) relaxation of ACOPF, and proposes a novel dual architecture that embeds a fast, differentiable (dual) feasibility recovery, thus providing valid dual bounds. The paper combines this new architecture with a self-supervised learning scheme, which alleviates the need for costly training data generation. Extensive numerical experiments on medium- and large-scale power grids demonstrate the efficiency and scalability of the proposed methodology.

In multiagent systems (MASs), agents' observation upon system behaviours may improve the overall team performance, but may also leak sensitive information to an observer. A quantified observability analysis can thus be useful to assist decision-making in MASs by operators seeking to optimise the relationship between performance effectiveness and information exposure through observations in practice. This paper presents a novel approach to quantitatively analysing the observability properties in MASs. The concept of opacity is applied to formally express the characterisation of observability in MASs modelled as partially observable multiagent systems. We propose a temporal logic oPATL to reason about agents' observability with quantitative goals, which capture the probability of information transparency of system behaviours to an observer, and develop verification techniques for quantitatively analysing such properties. We implement the approach as an extension of the PRISM model checker, and illustrate its applicability via several examples.

Neural networks often suffer from a feature preference problem, where they tend to overly rely on specific features to solve a task while disregarding other features, even if those neglected features are essential for the task. Feature preference problems have primarily been investigated in classification task. However, we observe that feature preference occurs in high-dimensional regression task, specifically, source separation. To mitigate feature preference in source separation, we propose FEAture BAlancing by Suppressing Easy feature (FEABASE). This approach enables efficient data utilization by learning hidden information about the neglected feature. We evaluate our method in a multi-channel source separation task, where feature preference between spatial feature and timbre feature appears.

Ontology embedding methods are powerful approaches to represent and reason over structured knowledge in various domains. One advantage of ontology embeddings over knowledge graph embeddings is their ability to capture and impose an underlying schema to which the model must conform. Despite advances, most current approaches do not guarantee that the resulting embedding respects the axioms the ontology entails. In this work, we formally prove that normalized ${\cal ELH}$ has the strong faithfulness property on convex geometric models, which means that there is an embedding that precisely captures the original ontology. We present a region-based geometric model for embedding normalized ${\cal ELH}$ ontologies into a continuous vector space. To prove strong faithfulness, our construction takes advantage of the fact that normalized ${\cal ELH}$ has a finite canonical model. We first prove the statement assuming (possibly) non-convex regions, allowing us to keep the required dimensions low. Then, we impose convexity on the regions and show the property still holds. Finally, we consider reasoning tasks on geometric models and analyze the complexity in the class of convex geometric models used for proving strong faithfulness.

Recently, it has been shown that the incorporation of structured knowledge into Large Language Models significantly improves the results for a variety of NLP tasks. In this paper, we propose a method for exploring pre-trained Text-to-Text Language Models enriched with additional information from Knowledge Graphs for answering factoid questions. More specifically, we propose an algorithm for subgraphs extraction from a Knowledge Graph based on question entities and answer candidates. Then, we procure easily interpreted information with Transformer-based models through the linearization of the extracted subgraphs. Final re-ranking of the answer candidates with the extracted information boosts Hits@1 scores of the pre-trained text-to-text language models by 4-6%.

Since it has often been claimed by academics that reconfiguration is essential, many approaches to reconfiguration, especially of robotic systems, have been developed. Accordingly, the literature on robotics is rich in techniques for reconfiguring robotic systems. However, when talking to researchers in the domain, there seems to be no common understanding of what exactly reconfiguration is and how it relates to other concepts such as adaptation. Beyond this academic perspective, robotics frameworks provide mechanisms for dynamically loading and unloading parts of robotics applications. While we have a fuzzy picture of the state-of-the-art in robotic reconfiguration from an academic perspective, we lack a picture of the state-of-practice from a practitioner perspective. To fill this gap, we survey the literature on reconfiguration in robotic systems by identifying and analyzing 98 relevant papers, review how four major robotics frameworks support reconfiguration, and finally investigate the realization of reconfiguration in 48 robotics applications. When comparing the state-of-the-art with the state-of-practice, we observed a significant discrepancy between them, in particular, the scientific community focuses on complex structural reconfiguration, while in practice only parameter reconfiguration is widely used. Based on our observations, we discuss possible reasons for this discrepancy and conclude with a takeaway message for academics and practitioners interested in robotics.

While Reinforcement Learning (RL) achieves tremendous success in sequential decision-making problems of many domains, it still faces key challenges of data inefficiency and the lack of interpretability. Interestingly, many researchers have leveraged insights from the causality literature recently, bringing forth flourishing works to unify the merits of causality and address well the challenges from RL. As such, it is of great necessity and significance to collate these Causal Reinforcement Learning (CRL) works, offer a review of CRL methods, and investigate the potential functionality from causality toward RL. In particular, we divide existing CRL approaches into two categories according to whether their causality-based information is given in advance or not. We further analyze each category in terms of the formalization of different models, ranging from the Markov Decision Process (MDP), Partially Observed Markov Decision Process (POMDP), Multi-Arm Bandits (MAB), and Dynamic Treatment Regime (DTR). Moreover, we summarize the evaluation matrices and open sources while we discuss emerging applications, along with promising prospects for the future development of CRL.

Graph Neural Networks (GNNs) have recently been used for node and graph classification tasks with great success, but GNNs model dependencies among the attributes of nearby neighboring nodes rather than dependencies among observed node labels. In this work, we consider the task of inductive node classification using GNNs in supervised and semi-supervised settings, with the goal of incorporating label dependencies. Because current GNNs are not universal (i.e., most-expressive) graph representations, we propose a general collective learning approach to increase the representation power of any existing GNN. Our framework combines ideas from collective classification with self-supervised learning, and uses a Monte Carlo approach to sampling embeddings for inductive learning across graphs. We evaluate performance on five real-world network datasets and demonstrate consistent, significant improvement in node classification accuracy, for a variety of state-of-the-art GNNs.

北京阿比特科技有限公司