亚洲男人的天堂2018av,欧美草比,久久久久久免费视频精选,国色天香在线看免费,久久久久亚洲av成人片仓井空

In recent years, Large Language Models (LLMs) have garnered significant attention from the research community due to their exceptional performance and generalization capabilities. In this paper, we introduce a novel method for contextualizing speech recognition models incorporating LLMs. Our approach casts speech recognition as a mixed-modal language modeling task based on a pretrained LLM. We provide audio features, along with optional text tokens for context, to train the system to complete transcriptions in a decoder-only fashion. As a result, the system is implicitly incentivized to learn how to leverage unstructured contextual information during training. Our empirical results demonstrate a significant improvement in performance, with a 6% WER reduction when additional textual context is provided. Moreover, we find that our method performs competitively and improve by 7.5% WER overall and 17% WER on rare words against a baseline contextualized RNN-T system that has been trained on more than twenty five times larger speech dataset. Overall, we demonstrate that by only adding a handful number of trainable parameters via adapters, we can unlock contextualized speech recognition capability for the pretrained LLM while keeping the same text-only input functionality.

相關內容

Successfully training Physics Informed Neural Networks (PINNs) for highly nonlinear PDEs on complex 3D domains remains a challenging task. In this paper, PINNs are employed to solve the 3D incompressible Navier-Stokes (NS) equations at moderate to high Reynolds numbers for complex geometries. The presented method utilizes very sparsely distributed solution data in the domain. A detailed investigation on the effect of the amount of supplied data and the PDE-based regularizers is presented. Additionally, a hybrid data-PINNs approach is used to generate a surrogate model of a realistic flow-thermal electronics design problem. This surrogate model provides near real-time sampling and was found to outperform standard data-driven neural networks when tested on unseen query points. The findings of the paper show how PINNs can be effective when used in conjunction with sparse data for solving 3D nonlinear PDEs or for surrogate modeling of design spaces governed by them.

Various first order approaches have been proposed in the literature to solve Linear Programming (LP) problems, recently leading to practically efficient solvers for large-scale LPs. From a theoretical perspective, linear convergence rates have been established for first order LP algorithms, despite the fact that the underlying formulations are not strongly convex. However, the convergence rate typically depends on the Hoffman constant of a large matrix that contains the constraint matrix, as well as the right hand side, cost, and capacity vectors. We introduce a first order approach for LP optimization with a convergence rate depending polynomially on the circuit imbalance measure, which is a geometric parameter of the constraint matrix, and depending logarithmically on the right hand side, capacity, and cost vectors. This provides much stronger convergence guarantees. For example, if the constraint matrix is totally unimodular, we obtain polynomial-time algorithms, whereas the convergence guarantees for approaches based on primal-dual formulations may have arbitrarily slow convergence rates for this class. Our approach is based on a fast gradient method due to Necoara, Nesterov, and Glineur (Math. Prog. 2019); this algorithm is called repeatedly in a framework that gradually fixes variables to the boundary. This technique is based on a new approximate version of Tardos's method, that was used to obtain a strongly polynomial algorithm for combinatorial LPs (Oper. Res. 1986).

Security challenges for Cloud or Fog-based machine learning services pose several concerns. Securing the underlying Cloud or Fog services is essential, as successful attacks against these services, on which machine learning applications rely, can lead to significant impairments of these applications. Because the requirements for AI applications can also be different, we differentiate according to whether they are used in the Cloud or in a Fog Computing network. This then also results in different threats or attack possibilities. For Cloud platforms, the responsibility for security can be divided between different parties. Security deficiencies at a lower level can have a direct impact on the higher level where user data is stored. While responsibilities are simpler for Fog Computing networks, by moving services to the edge of the network, we have to secure them against physical access to the devices. We conclude by outlining specific information security requirements for AI applications.

We propose a new class of physics-informed neural networks, called Physics-Informed Generator-Encoder Adversarial Networks, to effectively address the challenges posed by forward, inverse, and mixed problems in stochastic differential equations. In these scenarios, while the governing equations are known, the available data consist of only a limited set of snapshots for system parameters. Our model consists of two key components: the generator and the encoder, both updated alternately by gradient descent. In contrast to previous approaches of directly matching the approximated solutions with real snapshots, we employ an indirect matching that operates within the lower-dimensional latent feature space. This method circumvents challenges associated with high-dimensional inputs and complex data distributions, while yielding more accurate solutions compared to existing neural network solvers. In addition, the approach also mitigates the training instability issues encountered in previous adversarial frameworks in an efficient manner. Numerical results provide compelling evidence of the effectiveness of the proposed method in solving different types of stochastic differential equations.

This study introduces an efficacious approach, Masked Collaborative Contrast (MCC), to highlight semantic regions in weakly supervised semantic segmentation. MCC adroitly draws inspiration from masked image modeling and contrastive learning to devise a novel framework that induces keys to contract toward semantic regions. Unlike prevalent techniques that directly eradicate patch regions in the input image when generating masks, we scrutinize the neighborhood relations of patch tokens by exploring masks considering keys on the affinity matrix. Moreover, we generate positive and negative samples in contrastive learning by utilizing the masked local output and contrasting it with the global output. Elaborate experiments on commonly employed datasets evidences that the proposed MCC mechanism effectively aligns global and local perspectives within the image, attaining impressive performance. The source code is available at \url{//github.com/fwu11/MCC}.

As many of us in the information retrieval (IR) research community know and appreciate, search is far from being a solved problem. Millions of people struggle with tasks on search engines every day. Often, their struggles relate to the intrinsic complexity of their task and the failure of search systems to fully understand the task and serve relevant results. The task motivates the search, creating the gap/problematic situation that searchers attempt to bridge/resolve and drives search behavior as they work through different task facets. Complex search tasks require more than support for rudimentary fact finding or re-finding. Research on methods to support complex tasks includes work on generating query and website suggestions, personalizing and contextualizing search, and developing new search experiences, including those that span time and space. The recent emergence of generative artificial intelligence (AI) and the arrival of assistive agents, or copilots, based on this technology, has the potential to offer further assistance to searchers, especially those engaged in complex tasks. There are profound implications from these advances for the design of intelligent systems and for the future of search itself. This article, based on a keynote by the author at the 2023 ACM SIGIR Conference, explores these issues and charts a course toward new horizons in information access guided by AI copilots.

Deployment of Internet of Things (IoT) devices and Data Fusion techniques have gained popularity in public and government domains. This usually requires capturing and consolidating data from multiple sources. As datasets do not necessarily originate from identical sensors, fused data typically results in a complex data problem. Because military is investigating how heterogeneous IoT devices can aid processes and tasks, we investigate a multi-sensor approach. Moreover, we propose a signal to image encoding approach to transform information (signal) to integrate (fuse) data from IoT wearable devices to an image which is invertible and easier to visualize supporting decision making. Furthermore, we investigate the challenge of enabling an intelligent identification and detection operation and demonstrate the feasibility of the proposed Deep Learning and Anomaly Detection models that can support future application that utilizes hand gesture data from wearable devices.

Graph Neural Networks (GNNs) have recently become increasingly popular due to their ability to learn complex systems of relations or interactions arising in a broad spectrum of problems ranging from biology and particle physics to social networks and recommendation systems. Despite the plethora of different models for deep learning on graphs, few approaches have been proposed thus far for dealing with graphs that present some sort of dynamic nature (e.g. evolving features or connectivity over time). In this paper, we present Temporal Graph Networks (TGNs), a generic, efficient framework for deep learning on dynamic graphs represented as sequences of timed events. Thanks to a novel combination of memory modules and graph-based operators, TGNs are able to significantly outperform previous approaches being at the same time more computationally efficient. We furthermore show that several previous models for learning on dynamic graphs can be cast as specific instances of our framework. We perform a detailed ablation study of different components of our framework and devise the best configuration that achieves state-of-the-art performance on several transductive and inductive prediction tasks for dynamic graphs.

Incorporating knowledge graph into recommender systems has attracted increasing attention in recent years. By exploring the interlinks within a knowledge graph, the connectivity between users and items can be discovered as paths, which provide rich and complementary information to user-item interactions. Such connectivity not only reveals the semantics of entities and relations, but also helps to comprehend a user's interest. However, existing efforts have not fully explored this connectivity to infer user preferences, especially in terms of modeling the sequential dependencies within and holistic semantics of a path. In this paper, we contribute a new model named Knowledge-aware Path Recurrent Network (KPRN) to exploit knowledge graph for recommendation. KPRN can generate path representations by composing the semantics of both entities and relations. By leveraging the sequential dependencies within a path, we allow effective reasoning on paths to infer the underlying rationale of a user-item interaction. Furthermore, we design a new weighted pooling operation to discriminate the strengths of different paths in connecting a user with an item, endowing our model with a certain level of explainability. We conduct extensive experiments on two datasets about movie and music, demonstrating significant improvements over state-of-the-art solutions Collaborative Knowledge Base Embedding and Neural Factorization Machine.

The present paper surveys neural approaches to conversational AI that have been developed in the last few years. We group conversational systems into three categories: (1) question answering agents, (2) task-oriented dialogue agents, and (3) chatbots. For each category, we present a review of state-of-the-art neural approaches, draw the connection between them and traditional approaches, and discuss the progress that has been made and challenges still being faced, using specific systems and models as case studies.

北京阿比特科技有限公司