
Speaker identity plays a significant role in human communication and is increasingly used in societal applications, many enabled by advances in machine learning. Speaker identity perception is an essential cognitive phenomenon that can be broadly reduced to two main tasks: recognizing a voice and discriminating between voices. Several studies have attempted to identify acoustic correlates of identity perception to pinpoint salient parameters for this task; unlike for other communicative social signals, most of these efforts have yielded inconclusive results. Furthermore, current neurocognitive models of voice identity processing ground perception in acoustic dimensions such as fundamental frequency, harmonics-to-noise ratio, and formant dispersion. However, these findings do not account for naturalistic speech and within-speaker variability. Representational spaces of current self-supervised models have shown significant performance in various speech-related tasks. In this work, we demonstrate that self-supervised representations from different families (e.g., generative, contrastive, and predictive models) are significantly better for speaker identification than acoustic representations. We also show that such a speaker identification task can be used to better understand the nature of acoustic information representation in different layers of these powerful networks. By evaluating speaker identification accuracy across acoustic, phonemic, prosodic, and linguistic variants, we report similarities between model performance and human identity perception. We further examine these similarities by juxtaposing the encoding spaces of models and humans and challenging the use of distance metrics as a proxy for speaker proximity. Lastly, we show that some models can predict brain responses in auditory and language regions during naturalistic stimuli.
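
As a concrete illustration of layer-wise probing for speaker identity, the sketch below extracts mean-pooled hidden states from every layer of a pretrained self-supervised model and fits a linear speaker-ID probe per layer. The checkpoint (facebook/wav2vec2-base), the mean pooling, and the logistic-regression probe are illustrative assumptions, not the paper's exact protocol.

```python
# Layer-wise speaker-ID probing sketch (illustrative, not the paper's exact setup).
import numpy as np
import torch
from transformers import Wav2Vec2FeatureExtractor, Wav2Vec2Model
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

extractor = Wav2Vec2FeatureExtractor.from_pretrained("facebook/wav2vec2-base")
model = Wav2Vec2Model.from_pretrained("facebook/wav2vec2-base").eval()

def layer_embeddings(waveforms, sr=16000):
    """Mean-pool each layer's hidden states into one vector per utterance."""
    per_layer = None
    for wav in waveforms:
        inputs = extractor(wav, sampling_rate=sr, return_tensors="pt")
        with torch.no_grad():
            hidden = model(**inputs, output_hidden_states=True).hidden_states
        pooled = [h.mean(dim=1).squeeze(0).numpy() for h in hidden]
        if per_layer is None:
            per_layer = [[] for _ in pooled]
        for store, vec in zip(per_layer, pooled):
            store.append(vec)
    return [np.stack(layer) for layer in per_layer]

def probe_layers(waveforms, speaker_labels):
    """Linear speaker-ID probe per layer; the accuracy profile localizes
    where speaker information is concentrated in the network."""
    scores = []
    for X in layer_embeddings(waveforms):
        clf = LogisticRegression(max_iter=1000)
        scores.append(cross_val_score(clf, X, speaker_labels, cv=3).mean())
    return scores  # one accuracy per layer (CNN output + each transformer layer)
```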

Related Content

The ACM/IEEE 23rd International Conference on Model Driven Engineering Languages and Systems (MODELS) is the premier conference series for model-driven software and systems engineering, organized with the support of ACM SIGSOFT and IEEE TCSE. Since 1998, MODELS has covered all aspects of modeling, from languages and methods to tools and applications. MODELS attendees come from diverse backgrounds, including researchers, academics, engineers, and industry professionals. MODELS 2019 is a forum in which participants exchange cutting-edge research results and innovative practical experiences around modeling and model-driven software and systems. This year's edition offers the modeling community an opportunity to further advance the foundations of modeling and to propose innovative applications of modeling in emerging areas such as cyber-physical systems, embedded systems, socio-technical systems, cloud computing, big data, machine learning, security, open source, and sustainability.
July 29, 2024

Counterfactual learning to rank (CLTR) can be risky: various circumstances can cause it to produce sub-optimal models that hurt performance when deployed. Safe CLTR was introduced to mitigate these risks when using inverse propensity scoring to correct for position bias. However, the existing safety measure for CLTR is not applicable to state-of-the-art CLTR methods: it cannot handle trust bias, and its guarantees rely on specific assumptions about user behavior. Our contributions are two-fold. First, we generalize the existing safe CLTR approach so that it applies to state-of-the-art doubly robust (DR) CLTR and to trust bias. Second, we propose a novel approach, proximal ranking policy optimization (PRPO), that provides safety in deployment without assumptions about user behavior. PRPO removes incentives for learning ranking behavior that is too dissimilar to a safe ranking model, thereby imposing a limit on how much learned models can degrade performance metrics without relying on any specific user assumptions. Our experiments show that both our novel safe doubly robust method and PRPO provide higher performance than the existing safe inverse propensity scoring approach. However, when circumstances are unexpected, the safe doubly robust approach can become unsafe and degrade performance, whereas PRPO always maintains safety, even in maximally adversarial situations. By avoiding assumptions, PRPO is the first method with unconditional deployment safety, which translates to robust safety for real-world applications.
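
To make the proximal idea concrete, here is a minimal sketch in the spirit of PPO-style clipping, assuming per-document log-propensities under the learned and safe ranking policies plus advantage estimates; the paper's actual PRPO objective may differ in detail.

```python
import torch

def prpo_style_objective(log_p_new, log_p_safe, advantages, eps=0.2):
    """Proximal clipping sketch: cap how far the learned ranking policy's
    per-document propensities may drift from a safe policy's (to maximize)."""
    ratio = torch.exp(log_p_new - log_p_safe)           # learned vs. safe propensity ratio
    clipped = torch.clamp(ratio, 1.0 - eps, 1.0 + eps)
    # Taking the minimum removes any incentive to move outside the trust region,
    # which is what bounds the possible performance degradation at deployment.
    return torch.min(ratio * advantages, clipped * advantages).mean()
```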

With the rapid development of online multimedia services, especially on e-commerce platforms, there is a pressing need for personalised recommendation systems that can effectively encode the diverse multi-modal content associated with each item. However, we argue that existing multi-modal recommender systems typically use isolated processes for both feature extraction and modality modelling. Such isolated processes can harm recommendation performance. First, an isolated extraction process underestimates the importance of effective feature extraction in multi-modal recommendations, potentially incorporating non-relevant information, which is harmful to item representations. Second, an isolated modality modelling process produces disjointed embeddings for item modalities due to the individual processing of each modality, which leads to a suboptimal fusion of user/item representations for effective user preference prediction. We hypothesise that a unified model addressing both of these isolated processes will enable the consistent extraction and cohesive fusion of joint multi-modal features, thereby enhancing the effectiveness of multi-modal recommender systems. In this paper, we propose a novel model, called Unified Multi-modal Graph Transformer (UGT), which first leverages a multi-way transformer to extract aligned multi-modal features from raw data for top-k recommendation. Subsequently, we build a unified graph neural network in our UGT model to jointly fuse the user/item representations with their corresponding multi-modal features. Using the graph transformer architecture of our UGT model, we show that it achieves significant effectiveness gains, especially when jointly optimised with the commonly-used multi-modal recommendation losses.
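
A minimal sketch of the unified idea, with hypothetical module and input names: aligned modality features are fused into a single joint item representation before a step of graph propagation, rather than being processed in separate per-modality pipelines. The item-item adjacency and single propagation step are simplifications for illustration.

```python
import torch
import torch.nn as nn

class UnifiedFusionLayer(nn.Module):
    """Illustrative sketch (hypothetical names): fuse an item's ID embedding
    with its aligned visual/textual features into one joint vector, then run
    one step of graph propagation, instead of isolated per-modality models."""
    def __init__(self, dim):
        super().__init__()
        self.fuse = nn.Linear(3 * dim, dim)  # id + visual + textual -> joint

    def forward(self, id_emb, vis_emb, txt_emb, norm_adj):
        # id_emb/vis_emb/txt_emb: (num_items, dim); norm_adj: normalized adjacency
        joint = self.fuse(torch.cat([id_emb, vis_emb, txt_emb], dim=-1))
        return norm_adj @ joint  # one propagation step; stack layers in practice
```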

In reinforcement learning (RL), exploiting environmental symmetries can significantly enhance efficiency, robustness, and performance. However, ensuring that the deep RL policy and value networks are respectively equivariant and invariant to exploit these symmetries is a substantial challenge. Related works try to design networks that are equivariant and invariant by construction, limiting them to a very restricted library of components, which in turn hampers the expressiveness of the networks. This paper proposes a method to construct equivariant policies and invariant value functions without specialized neural network components, which we term equivariant ensembles. We further add a regularization term that injects this inductive bias during training. In a map-based path planning case study, we show how equivariant ensembles and regularization benefit sample efficiency and performance.
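
The ensemble construction can be made concrete for a map-based environment with four-fold rotational symmetry (the group C4): average the policy over transformed observations, mapping each output back through the inverse transformation, and average the value over transformed observations directly. The action ordering (up, right, down, left) and the roll direction below are illustrative conventions.

```python
import torch

def rot90_state(s, k):
    """Rotate a map observation (C, H, W) by k * 90 degrees."""
    return torch.rot90(s, k, dims=(-2, -1))

def unrotate_action(logits, k):
    """Permute directional action logits to undo a k * 90-degree rotation
    (assumes actions ordered up/right/down/left; convention is illustrative)."""
    return torch.roll(logits, shifts=-k, dims=-1)

def equivariant_policy(policy_net, s, group=range(4)):
    """Symmetrize: average g^{-1} . policy(g . s) over C4; equivariant by construction."""
    outs = [unrotate_action(policy_net(rot90_state(s, k)), k) for k in group]
    return torch.stack(outs).mean(dim=0)

def invariant_value(value_net, s, group=range(4)):
    """Average the value over transformed states; invariant by construction."""
    return torch.stack([value_net(rot90_state(s, k)) for k in group]).mean(dim=0)
```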

Supporting real-time interactions between human controllers and remote devices remains a challenging goal in the Metaverse due to the stringent requirements on computing workload, communication throughput, and round-trip latency. In this paper, we establish a novel framework for real-time interactions through virtual models in the Metaverse. Specifically, we jointly predict the motion of the human controller for 1) proactive rendering in the Metaverse and 2) generating control commands for the real-world remote device in advance. The virtual model is decoupled into two components, for rendering and control respectively. To dynamically adjust the prediction horizons for rendering and control, we develop a two-step human-in-the-loop continuous reinforcement learning approach and use an expert policy to improve training efficiency. An experimental prototype is built to verify our algorithm under different communication latencies. Compared with a baseline policy without prediction, our proposed method significantly reduces 1) the Motion-To-Photon (MTP) latency between human motion and rendering feedback and 2) the root mean squared error (RMSE) between the human motion and the real-world remote device.
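
For intuition, the sketch below shows the decoupling with a constant-velocity extrapolator standing in for the learned predictor: rendering and control use separate horizons, here simply derived from their respective measured latencies. In the paper, the horizons are adjusted by human-in-the-loop RL rather than fixed this way.

```python
import numpy as np

def extrapolate(history, horizon_steps, dt=0.01):
    """Constant-velocity extrapolation of a motion trajectory of shape (T, D).
    Illustrative placeholder for the learned motion predictor."""
    velocity = (history[-1] - history[-2]) / dt
    return history[-1] + velocity * horizon_steps * dt

def decoupled_predictions(history, render_latency_s, control_latency_s, dt=0.01):
    """Separate prediction horizons for rendering and control, e.g., set from
    the measured round-trip latencies of each path (assumed inputs)."""
    h_render = int(round(render_latency_s / dt))
    h_control = int(round(control_latency_s / dt))
    return extrapolate(history, h_render, dt), extrapolate(history, h_control, dt)
```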

Quantum machine learning is in a period of rapid development and discovery; however, it still lacks the resources and diversity of computational models of its classical counterpart. With classical models increasingly requiring extreme hardware and power budgets, and quantum models limited by noisy intermediate-scale quantum (NISQ) hardware, there is an emerging opportunity to address both problems together. Here we introduce a new software model for quantum neuromorphic computing: a quantum leaky integrate-and-fire (QLIF) neuron, implemented as a compact, high-fidelity quantum circuit requiring only 2 rotation gates and no CNOT gates. We use these neurons as building blocks to construct a quantum spiking neural network (QSNN) and a quantum spiking convolutional neural network (QSCNN), the first of their kind. We apply these models to the MNIST, Fashion-MNIST, and KMNIST datasets for a full comparison with other classical and quantum models. We find that the proposed models perform competitively, with comparable accuracy, efficient scaling, and fast computation both in classical simulation and on quantum devices.
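
A hypothetical single-qubit realization of the two-rotation idea is sketched below in Qiskit: one rotation integrates the encoded input, a second models leakage, and the excited-state probability plays the role of the membrane potential. The gate choices, angle semantics, and thresholding are assumptions for illustration, not necessarily the paper's exact circuit.

```python
from qiskit import QuantumCircuit
from qiskit.quantum_info import Statevector

def qlif_neuron(input_angle, leak_angle, threshold=0.5):
    """Hypothetical QLIF sketch: two single-qubit rotations, no CNOTs."""
    qc = QuantumCircuit(1)
    qc.rx(input_angle, 0)   # integrate the weighted, angle-encoded input
    qc.ry(-leak_angle, 0)   # "leak" back toward the ground state
    # Excited-state probability acts as the membrane potential.
    p_excited = Statevector.from_instruction(qc).probabilities()[1]
    return p_excited >= threshold, p_excited  # spike if the potential crosses threshold
```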

We compute equilibrium strategies in multi-stage games with continuous signal and action spaces, as widely used in the management sciences and economics. Examples include sequential sales via auctions, multi-stage elimination contests, and Stackelberg competitions. In sequential auctions, analysts performing equilibrium analysis must derive not just single bids but bid functions for all possible signals or values that a bidder might have in multiple stages. Due to the continuity of the signal and action spaces, these bid functions come from an infinite-dimensional space. While such models are fundamental to game theory and its applications, equilibrium strategies are rarely known: the resulting system of non-linear differential equations is considered intractable for all but elementary models. This has limited progress in game theory and is a barrier to its adoption in the field. We show that Deep Reinforcement Learning and self-play can learn equilibrium bidding strategies for various multi-stage games. We find equilibria in models that have not yet been explored analytically, and new asymmetric equilibrium bid functions for established models of sequential auctions. Verifying equilibrium is challenging in such games due to the continuous signal and action spaces. We introduce a verification algorithm and prove that the error of this verifier decreases for Lipschitz-continuous strategies as the levels of discretization and sample sizes increase.
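
The verification idea can be sketched as a discretized regret estimate: for each signal on a grid, compare the utility of the learned bid against the best discretized deviation, with opponent signals sampled by Monte Carlo. The uniform opponent prior and the utility callback are illustrative assumptions; the paper's Lipschitz-based error bound is what makes such an estimate meaningful.

```python
import numpy as np

def estimated_regret(bid_fn, utility, signal_grid, action_grid, n_samples=1000, seed=0):
    """Verifier sketch: largest estimated gain from deviating from bid_fn,
    over a grid of own signals and discretized alternative bids."""
    rng = np.random.default_rng(seed)
    worst = 0.0
    for v in signal_grid:
        opp = rng.uniform(0.0, 1.0, size=n_samples)  # opponent signals (assumed U[0,1])
        u_play = np.mean([utility(v, bid_fn(v), o) for o in opp])
        u_best = max(np.mean([utility(v, b, o) for o in opp]) for b in action_grid)
        worst = max(worst, u_best - u_play)
    return worst  # approaches 0 at an approximate equilibrium as grids/samples refine
```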

Machine learning applications on signals such as computer vision or biomedical data often face significant challenges due to the variability that exists across hardware devices or session recordings. This variability poses a Domain Adaptation (DA) problem, as training and testing data distributions often differ. In this work, we propose Spatio-Temporal Monge Alignment (STMA) to mitigate these variabilities. This Optimal Transport (OT) based method adapts the cross-power spectral density (cross-PSD) of multivariate signals by mapping them to the Wasserstein barycenter of the source domains (multi-source DA). Predictions for new domains can then be made by filtering, without retraining a model on source data (test-time DA). We also study and discuss two special cases of the method, Temporal Monge Alignment (TMA) and Spatial Monge Alignment (SMA). Non-asymptotic concentration bounds are derived for the mapping estimation, revealing a bias-plus-variance error structure with a variance decay rate of $\mathcal{O}(n_\ell^{-1/2})$, where $n_\ell$ is the signal length. This theoretical guarantee demonstrates the efficiency of the proposed computational scheme. Numerical experiments on multivariate biosignals and image data show that STMA leads to significant and consistent performance gains between datasets acquired with very different settings. Notably, STMA is a pre-processing step complementary to state-of-the-art deep learning methods.
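
For the temporal (1-D) special case with centered stationary Gaussian signals, the Monge map is a linear filter whose frequency response is the square root of the ratio of target to source PSDs, and the barycenter spectrum averages square-root PSDs. The sketch below, with PSDs assumed to be estimated beforehand (e.g., via Welch's method), is a simplified reading of TMA, not the full STMA method.

```python
import numpy as np
from scipy.signal import fftconvolve, welch

def barycenter_psd(psds):
    """Wasserstein barycenter of centered stationary Gaussian spectra:
    average the square-root PSDs, then square."""
    return np.mean(np.sqrt(np.stack(psds)), axis=0) ** 2

def temporal_monge_align(x, psd_src, psd_bar, nfft=256):
    """Map a signal to the barycenter spectrum with the Monge filter
    H(f) = sqrt(psd_bar(f) / psd_src(f)); 1-D sketch of TMA."""
    H = np.sqrt(psd_bar / np.maximum(psd_src, 1e-12))
    h = np.roll(np.fft.irfft(H, n=nfft), nfft // 2)  # centered impulse response
    return fftconvolve(x, h, mode="same")

# Usage sketch: PSDs estimated with Welch's method on each domain, e.g.
# _, psd_src = welch(x_src, nperseg=256)
# psd_bar = barycenter_psd([psd_a, psd_b, psd_c])
# x_aligned = temporal_monge_align(x_src, psd_src, psd_bar)
```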

Contrastive loss has been increasingly used in learning representations from multiple modalities. In the limit, the nature of the contrastive loss encourages modalities to exactly match each other in the latent space. Yet it remains an open question how the modality alignment affects the downstream task performance. In this paper, based on an information-theoretic argument, we first prove that exact modality alignment is sub-optimal in general for downstream prediction tasks. Hence we advocate that the key to better performance lies in meaningful latent modality structures rather than perfect modality alignment. To this end, we propose three general approaches to construct latent modality structures. Specifically, we design 1) a deep feature separation loss for intra-modality regularization; 2) a Brownian-bridge loss for inter-modality regularization; and 3) a geometric consistency loss for both intra- and inter-modality regularization. Extensive experiments are conducted on two popular multi-modal representation learning frameworks: the CLIP-based two-tower model and the ALBEF-based fusion model. We test our model on a variety of tasks including zero/few-shot image classification, image-text retrieval, visual question answering, visual reasoning, and visual entailment. Our method achieves consistent improvements over existing methods, demonstrating the effectiveness and generalizability of our proposed approach on latent modality structure regularization.
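
As one illustration, a geometry-matching regularizer in the spirit of the geometric consistency loss can be sketched as follows; this is one possible reading of the idea, not the paper's exact formulation.

```python
import torch.nn.functional as F

def geometric_consistency_loss(img_emb, txt_emb):
    """Match the pairwise similarity geometry of the two modalities so the
    latent structures agree without forcing exact point-wise alignment."""
    img = F.normalize(img_emb, dim=-1)  # (batch, dim)
    txt = F.normalize(txt_emb, dim=-1)
    return F.mse_loss(img @ img.t(), txt @ txt.t())
```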

Hyperproperties are commonly used in computer security to define information-flow policies and other requirements that reason about the relationship between multiple computations. In this paper, we study a novel class of hyperproperties where the individual computation paths are chosen by the strategic choices of a coalition of agents in a multi-agent system. We introduce HyperATL*, an extension of computation tree logic with path variables and strategy quantifiers. Our logic can express strategic hyperproperties, such as that the scheduler in a concurrent system has a strategy to avoid information leakage. HyperATL* is particularly useful for specifying asynchronous hyperproperties, i.e., hyperproperties where the speed of execution on the different computation paths depends on the choices of the scheduler. Unlike other recent logics for the specification of asynchronous hyperproperties, ours is the first for which model checking of the full logic is decidable. We present a model checking algorithm for HyperATL* based on alternating automata, and show that our algorithm is asymptotically optimal by providing a matching lower bound. We have implemented a prototype model checker for a fragment of HyperATL*, able to check various security properties on small programs.
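
For flavor, the scheduler property mentioned above might be written in a simplified, illustrative HyperATL*-like syntax as follows, where $o$ is a hypothetical atomic proposition for the publicly observable output on each path; the exact syntax in the paper may differ.

```latex
% Illustrative only: the scheduler has strategies for both executions such
% that the observable outputs always agree, i.e., nothing secret leaks.
\langle\!\langle \mathit{sched} \rangle\!\rangle \pi.\;
\langle\!\langle \mathit{sched} \rangle\!\rangle \pi'.\;
\square \, (o_\pi = o_{\pi'})
```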

Due to their significance and value for human-computer interaction and natural language processing, task-oriented dialog systems are attracting increasing attention in both the academic and industrial communities. In this paper, we survey recent advances and challenges in an issue-specific manner. We discuss three critical topics for task-oriented dialog systems: (1) improving data efficiency to facilitate dialog system modeling in low-resource settings, (2) modeling multi-turn dynamics for dialog policy learning to achieve better task-completion performance, and (3) integrating domain ontology knowledge into both pipeline and end-to-end dialog models. We also review recent progress in dialog evaluation and some widely used corpora. We believe that this survey can shed light on future research in task-oriented dialog systems.
