亚洲男人的天堂2018av,欧美草比,久久久久久免费视频精选,国色天香在线看免费,久久久久亚洲av成人片仓井空

Leveraging Input Convex Neural Networks (ICNNs), ICNN-based Model Predictive Control (MPC) successfully attains globally optimal solutions by upholding convexity within the MPC framework. However, current ICNN architectures encounter the issue of vanishing/exploding gradients, which limits their ability to serve as deep neural networks for complex tasks. Additionally, the current neural network-based MPC, including conventional neural network-based MPC and ICNN-based MPC, faces slower convergence speed when compared to MPC based on first-principles models. In this study, we leverage the principles of ICNNs to propose a novel Input Convex LSTM for Lyapunov-based MPC, with the specific goal of reducing convergence time and mitigating the vanishing/exploding gradient problem while ensuring closed-loop stability. From a simulation study of a nonlinear chemical reactor, we observed a mitigation of vanishing/exploding gradient problem and a reduction in convergence time, with a percentage decrease of 46.7%, 31.3%, and 20.2% compared to baseline plain RNN, plain LSTM, and Input Convex Recurrent Neural Network, respectively.

相關內容

長短期記憶網絡(LSTM)是一種用于深度學習領域的人工回歸神經網絡(RNN)結構。與標準的前饋神經網絡不同,LSTM具有反饋連接。它不僅可以處理單個數據點(如圖像),還可以處理整個數據序列(如語音或視頻)。例如,LSTM適用于未分段、連接的手寫識別、語音識別、網絡流量或IDSs(入侵檢測系統)中的異常檢測等任務。

Reconstructing visual stimuli from functional Magnetic Resonance Imaging (fMRI) based on Latent Diffusion Models (LDM) provides a fine-grained retrieval of the brain. A challenge persists in reconstructing a cohesive alignment of details (such as structure, background, texture, color, etc.). Moreover, LDMs would generate different image results even under the same conditions. For these, we first uncover the neuroscientific perspective of LDM-based methods that is top-down creation based on pre-trained knowledge from massive images but lack of detail-driven bottom-up perception resulting in unfaithful details. We propose NeuralDiffuser which introduces primary visual feature guidance to provide detail cues in the form of gradients, extending the bottom-up process for LDM-based methods to achieve faithful semantics and details. We also developed a novel guidance strategy to ensure the consistency of repeated reconstructions rather than a variety of results. We obtain the state-of-the-art performance of NeuralDiffuser on the Natural Senses Dataset (NSD), which offers more faithful details and consistent results.

Large Language Models (LLMs) have revolutionized open-domain dialogue agents but encounter challenges in multi-character role-playing (MCRP) scenarios. To address the issue, we present Neeko, an innovative framework designed for efficient multiple characters imitation. Unlike existing methods, Neeko employs a dynamic low-rank adapter (LoRA) strategy, enabling it to adapt seamlessly to diverse characters. Our framework breaks down the role-playing process into agent pre-training, multiple characters playing, and character incremental learning, effectively handling both seen and unseen roles. This dynamic approach, coupled with distinct LoRA blocks for each character, enhances Neeko's adaptability to unique attributes, personalities, and speaking patterns. As a result, Neeko demonstrates superior performance in MCRP over most existing methods, offering more engaging and versatile user interaction experiences. Code and data are available at //github.com/weiyifan1023/Neeko.

Large Language Models (LLMs) have demonstrated impressive capabilities across a wide range of tasks. However, their proficiency and reliability in the specialized domain of Data Analysis, particularly with a focus on data-driven thinking, remain uncertain. To bridge this gap, we introduce BIBench, a comprehensive benchmark designed to evaluate the data analysis capabilities of LLMs within the context of Business Intelligence (BI). BIBench assesses LLMs across three dimensions: 1) BI foundational knowledge, evaluating the models' numerical reasoning and familiarity with financial concepts; 2) BI knowledge application, determining the models' ability to quickly comprehend textual information and generate analysis questions from multiple views; and 3) BI technical skills, examining the models' use of technical knowledge to address real-world data analysis challenges. BIBench comprises 11 sub-tasks, spanning three categories of task types: classification, extraction, and generation. Additionally, we've developed BIChat, a domain-specific dataset with over a million data points, to fine-tune LLMs. We will release BIBenchmark, BIChat, and the evaluation scripts at \url{//github.com/cubenlp/BIBench}. This benchmark aims to provide a measure for in-depth analysis of LLM abilities and foster the advancement of LLMs in the field of data analysis.

Despite achieving rapid developments and with widespread applications, Large Vision-Language Models (LVLMs) confront a serious challenge of being prone to generating hallucinations. An over-reliance on linguistic priors has been identified as a key factor leading to these hallucinations. In this paper, we propose to alleviate this problem by introducing a novel image-biased decoding (IBD) technique. Our method derives the next-token probability distribution by contrasting predictions from a conventional LVLM with those of an image-biased LVLM, thereby amplifying the correct information highly correlated with image content while mitigating the hallucinatory errors caused by excessive dependence on text. We further conduct a comprehensive statistical analysis to validate the reliability of our method, and design an adaptive adjustment strategy to achieve robust and flexible handling under varying conditions. Experimental results across multiple evaluation metrics verify that our method, despite not requiring additional training data and only with a minimal increase in model parameters, can significantly reduce hallucinations in LVLMs and enhance the truthfulness of the generated response.

Despite the predominance of English in their training data, English-centric Large Language Models (LLMs) like GPT-3 and LLaMA display a remarkable ability to perform multilingual tasks, raising questions about the depth and nature of their cross-lingual capabilities. This paper introduces the decomposed prompting approach to probe the linguistic structure understanding of these LLMs in sequence labeling tasks. Diverging from the single text-to-text prompt, our method generates for each token of the input sentence an individual prompt which asks for its linguistic label. We assess our method on the Universal Dependencies part-of-speech tagging dataset for 38 languages, utilizing both English-centric and multilingual LLMs. Our findings show that decomposed prompting surpasses the iterative prompting baseline in efficacy and efficiency under zero- and few-shot settings. Further analysis reveals the influence of evaluation methods and the use of instructions in prompts. Our multilingual investigation shows that English-centric language models perform better on average than multilingual models. Our study offers insights into the multilingual transferability of English-centric LLMs, contributing to the understanding of their multilingual linguistic knowledge.

The growing availability of Synthetic Aperture Radar (SAR) target datasets allows for the consolidation of different SAR Automatic Target Recognition (ATR) tasks using a foundational model powered by Self-Supervised Learning (SSL). SSL aims to derive supervision signals directly from the data, thereby minimizing the need for costly expert labeling and maximizing the use of the expanding sample pool in constructing a foundational model. This study investigates an effective SSL method for SAR ATR, which can pave the way for building the foundation model. The primary obstacles faced in SSL for SAR ATR are the scale problem of the remote sensing images and speckle noise in SAR images. To overcome these challenges, we present a novel approach called Knowledge-Guided Predictive Architecture (SAR-KPGA), which leverages local masked patches to predict the multi-scale SAR feature representations of unseen context. The key aspect of SAR-KPGA is integrating SAR domain features to ensure high-quality target features for SSL. Furthermore, we employ local masks and multi-scale features to accommodate the large image scale and target scale variations in remote sensing scenarios. By evaluating our framework on three target recognition datasets (vehicle, ship, and aircraft), we demonstrate its outperformance over other SSL methods and its effectiveness with increasing SAR data. This study showcases the potential of SSL for SAR target recognition across diverse targets, scenes, and sensors.

Radar Automated Target Recognition (RATR) for Unmanned Aerial Vehicles (UAVs) involves transmitting Electromagnetic Waves (EMWs) and performing target type recognition on the received radar echo, crucial for defense and aerospace applications. Previous studies highlighted the advantages of multistatic radar configurations over monostatic ones in RATR. However, fusion methods in multistatic radar configurations often suboptimally combine classification vectors from individual radars probabilistically. To address this, we propose a fully Bayesian RATR framework employing Optimal Bayesian Fusion (OBF) to aggregate classification probability vectors from multiple radars. OBF, based on expected 0-1 loss, updates a Recursive Bayesian Classification (RBC) posterior distribution for target UAV type, conditioned on historical observations across multiple time steps. We evaluate the approach using simulated random walk trajectories for seven drones, correlating target aspect angles to Radar Cross Section (RCS) measurements in an anechoic chamber. Comparing against single radar Automated Target Recognition (ATR) systems and suboptimal fusion methods, our empirical results demonstrate that the OBF method integrated with RBC significantly enhances classification accuracy compared to other fusion methods and single radar configurations.

As Graph Neural Networks (GNNs) become popular, libraries like PyTorch-Geometric (PyG) and Deep Graph Library (DGL) are proposed; these libraries have emerged as the de facto standard for implementing GNNs because they provide graph-oriented APIs and are purposefully designed to manage the inherent sparsity and irregularity in graph structures. However, these libraries show poor scalability on multi-core processors, which under-utilizes the available platform resources and limits the performance. This is because GNN training is a resource-intensive workload with high volume of irregular data accessing, and existing libraries fail to utilize the memory bandwidth efficiently. To address this challenge, we propose ARGO, a novel runtime system for GNN training that offers scalable performance. ARGO exploits multi-processing and core-binding techniques to improve platform resource utilization. We further develop an auto-tuner that searches for the optimal configuration for multi-processing and core-binding. The auto-tuner works automatically, making it completely transparent from the user. Furthermore, the auto-tuner allows ARGO to adapt to various platforms, GNN models, datasets, etc. We evaluate ARGO on two representative GNN models and four widely-used datasets on two platforms. With the proposed autotuner, ARGO is able to select a near-optimal configuration by exploring only 5% of the design space. ARGO speeds up state-of-the-art GNN libraries by up to 5.06x and 4.54x on a four-socket Ice Lake machine with 112 cores and a two-socket Sapphire Rapids machine with 64 cores, respectively. Finally, ARGO can seamlessly integrate into widely-used GNN libraries (e.g., DGL, PyG) with few lines of code and speed up GNN training.

Recent developments in Distributed Ledger Technology (DLT), including Blockchain offer new opportunities in the manufacturing domain, by providing mechanisms to automate trust services (digital identity, trusted interactions, and auditable transactions) and when combined with other advanced digital technologies (e.g. machine learning) can provide a secure backbone for trusted data flows between independent entities. This paper presents an DLT-based architectural pattern and technology solution known as SmartQC that aims to provide an extensible and flexible approach to integrating DLT technology into existing workflows and processes. SmartQC offers an opportunity to make processes more time efficient, reliable, and robust by providing two key features i) data integrity through immutable ledgers and ii) automation of business workflows leveraging smart contracts. The paper will present the system architecture, extensible data model and the application of SmartQC in the context of example smart manufacturing applications.

With the extremely rapid advances in remote sensing (RS) technology, a great quantity of Earth observation (EO) data featuring considerable and complicated heterogeneity is readily available nowadays, which renders researchers an opportunity to tackle current geoscience applications in a fresh way. With the joint utilization of EO data, much research on multimodal RS data fusion has made tremendous progress in recent years, yet these developed traditional algorithms inevitably meet the performance bottleneck due to the lack of the ability to comprehensively analyse and interpret these strongly heterogeneous data. Hence, this non-negligible limitation further arouses an intense demand for an alternative tool with powerful processing competence. Deep learning (DL), as a cutting-edge technology, has witnessed remarkable breakthroughs in numerous computer vision tasks owing to its impressive ability in data representation and reconstruction. Naturally, it has been successfully applied to the field of multimodal RS data fusion, yielding great improvement compared with traditional methods. This survey aims to present a systematic overview in DL-based multimodal RS data fusion. More specifically, some essential knowledge about this topic is first given. Subsequently, a literature survey is conducted to analyse the trends of this field. Some prevalent sub-fields in the multimodal RS data fusion are then reviewed in terms of the to-be-fused data modalities, i.e., spatiospectral, spatiotemporal, light detection and ranging-optical, synthetic aperture radar-optical, and RS-Geospatial Big Data fusion. Furthermore, We collect and summarize some valuable resources for the sake of the development in multimodal RS data fusion. Finally, the remaining challenges and potential future directions are highlighted.

北京阿比特科技有限公司