Speaker recognition is a widely used voice-based biometric technology with applications in various industries, including banking, education, recruitment, immigration, law enforcement, healthcare, and well-being. However, while dataset evaluations and audits have improved data practices in face recognition and other computer vision tasks, the data practices in speaker recognition have gone largely unquestioned. Our research aims to address this gap by exploring how dataset usage has evolved over time and what implications this has for bias, fairness, and privacy in speaker recognition systems. Previous studies have demonstrated the presence of historical, representation, and measurement biases in popular speaker recognition benchmarks. In this paper, we present a longitudinal study of speaker recognition datasets used for training and evaluation from 2012 to 2021. We survey close to 700 papers to investigate community adoption of datasets and changes in usage over a crucial time period in which speaker recognition approaches transitioned to the widespread adoption of deep neural networks. Our study identifies the most commonly used datasets in the field, examines their usage patterns, and assesses their attributes that affect bias, fairness, and other ethical concerns. Our findings suggest areas for further research on the ethics and fairness of speaker recognition technology.
Adversarial examples in machine learning have emerged as a focal point of research due to their remarkable ability to deceive models with seemingly inconspicuous input perturbations, potentially resulting in severe consequences. In this study, we embark on a comprehensive exploration of adversarial machine learning models, shedding light on their intrinsic complexity and interpretability. Our investigation reveals intriguing links between machine learning model complexity and Einstein's theory of special relativity, through the concept of entanglement. More specifically, we define entanglement computationally and demonstrate that distant feature samples can exhibit strong correlations, akin to entanglement in the quantum realm. This revelation challenges conventional perspectives on the phenomenon of adversarial transferability observed in contemporary machine learning models. By drawing parallels with the relativistic effects of time dilation and length contraction during computation, we gain deeper insights into adversarial machine learning, paving the way for more robust and interpretable models in this rapidly evolving field.
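The abstract does not spell out how entanglement is computed; purely as an illustration (and not the paper's definition), the sketch below probes one possible operationalization: how strongly the hidden features of two inputs that are far apart in input space co-vary under the same small perturbations. The feature map, the perturbation scale, and the distance construction are all hypothetical stand-ins.

# Illustrative sketch only: correlation of hidden features of two
# "distant" inputs under shared perturbations. Not the paper's method.
import numpy as np

rng = np.random.default_rng(0)
W = rng.standard_normal((64, 32))            # stand-in for a learned feature map

def features(x):
    return np.tanh(x @ W)                    # hypothetical hidden representation

x_a = rng.standard_normal(64)
x_b = x_a + 10.0 * rng.standard_normal(64)   # sample far away in input space

# Apply the same small perturbations to both inputs and record the features.
deltas = 0.01 * rng.standard_normal((200, 64))
fa = np.array([features(x_a + d) for d in deltas])
fb = np.array([features(x_b + d) for d in deltas])

# Per-unit correlation between the two feature trajectories.
fa_c = fa - fa.mean(axis=0)
fb_c = fb - fb.mean(axis=0)
corr = (fa_c * fb_c).sum(axis=0) / (
    np.linalg.norm(fa_c, axis=0) * np.linalg.norm(fb_c, axis=0) + 1e-12)
print("mean feature correlation between distant samples:", corr.mean())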
While 5G networks are already being deployed for commercial applications, academia and industry are focusing their efforts on the development and standardization of the next generations of mobile networks, i.e., 5G-Advanced and 6G. Beyond-5G networks will revolutionize communication systems by providing seamless connectivity, both in time and in space, to a unique ecosystem consisting of the convergence of the digital, physical, and human domains. In this scenario, Non-Terrestrial Networks (NTN) will play a crucial role by providing a ubiquitous, secure, and resilient infrastructure fully integrated into the overall system. The additional network complexity introduced by the third dimension of the architecture requires the interoperability of different network elements, enabled by the disaggregation and virtualization of network components, their interconnection through standard interfaces, and their orchestration by data-driven network artificial intelligence. The disaggregation paradigm foresees the division of the radio access network into different virtualized blocks of functions, introducing the concept of functional split. By wisely selecting the RAN functional split, it is possible to better exploit system resources, obtaining cost savings and increasing system performance. In this paper, we first provide a discussion of current 6G NTN development in terms of architectural solutions and then thoroughly analyze the impact of the typical NTN channel impairments on the available functional splits. Finally, we analyze the benefits of introducing dynamic optimization of the functional split in NTN, together with the foreseen challenges.
Explainable artificial intelligence (XAI) methods are portrayed as a remedy for debugging and trusting statistical and deep learning models, as well as interpreting their predictions. However, recent advances in adversarial machine learning (AdvML) highlight the limitations and vulnerabilities of state-of-the-art explanation methods, calling their security and trustworthiness into question. The possibility of manipulating, fooling, or fairwashing evidence of the model's reasoning has detrimental consequences when applied in high-stakes decision-making and knowledge discovery. This survey provides a comprehensive overview of research concerning adversarial attacks on explanations of machine learning models, as well as on fairness metrics. We introduce a unified notation and taxonomy of methods, facilitating a common ground for researchers and practitioners from the intersecting research fields of AdvML and XAI. We discuss how to defend against attacks and design robust interpretation methods. We contribute a list of existing insecurities in XAI and outline the emerging research directions in adversarial XAI (AdvXAI). Future work should improve explanation methods and evaluation protocols to take into account the reported safety issues.
Semantic communication (SemCom) has recently been considered a promising solution for guaranteeing high resource utilization and transmission reliability in future wireless networks. Nevertheless, the unique demand for background knowledge matching makes it challenging to achieve efficient wireless resource management for multiple users in SemCom-enabled networks (SC-Nets). To this end, this paper investigates SemCom from a networking perspective, where two fundamental problems, user association (UA) and bandwidth allocation (BA), are systematically addressed in the SC-Net. First, considering varying knowledge matching states between mobile users and associated base stations, we identify two general SC-Net scenarios, namely the perfect knowledge matching-based SC-Net and the imperfect knowledge matching-based SC-Net. Afterward, for each SC-Net scenario, we describe its distinctive semantic channel model from the perspective of semantic information theory, whereby a concept of bit-rate-to-message-rate transformation is developed along with a new semantics-level metric, namely system throughput in message (STM), to measure the overall network performance. We then formulate a joint STM-maximization problem of UA and BA for each SC-Net scenario and propose a corresponding optimal solution. Numerical results in both scenarios demonstrate that our solutions significantly outperform two benchmarks in STM performance and reliability.
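For orientation only, a generic joint UA/BA formulation of this kind could take the following shape; the notation (binary association indicators $x_{u,b}$, bandwidth shares $\beta_{u,b}$, per-Hz bit rates $\gamma_{u,b}$, and a knowledge-matching-dependent bit-rate-to-message-rate factor $\lambda_{u,b}$) is assumed for illustration and is not the paper's exact model.
\[
\begin{aligned}
\max_{\{x_{u,b}\},\,\{\beta_{u,b}\}}\quad & \mathrm{STM} \;=\; \sum_{u}\sum_{b} x_{u,b}\,\lambda_{u,b}\,\beta_{u,b}\,\gamma_{u,b}\\
\text{s.t.}\quad & \sum_{b} x_{u,b} \le 1 \quad \forall u, \qquad \sum_{u} x_{u,b}\,\beta_{u,b} \le B_{b} \quad \forall b,\\
& x_{u,b}\in\{0,1\},\qquad \beta_{u,b}\ge 0,
\end{aligned}
\]
where $B_{b}$ denotes the total bandwidth available at base station $b$, and the knowledge matching state determines whether $\lambda_{u,b}$ is known exactly (perfect matching) or only statistically (imperfect matching).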
Vectorial dual-bent functions have recently attracted research interest as they play a significant role in constructing partial difference sets, association schemes, bent partitions, and linear codes. In this paper, we further study vectorial dual-bent functions $F: V_{n}^{(p)}\rightarrow V_{m}^{(p)}$, where $2\leq m \leq \frac{n}{2}$ and $V_{n}^{(p)}$ denotes an $n$-dimensional vector space over the prime field $\mathbb{F}_{p}$. We give new characterizations of certain vectorial dual-bent functions (called vectorial dual-bent functions with Condition A) in terms of amorphic association schemes, linear codes, and generalized Hadamard matrices, respectively. When $p=2$, we characterize vectorial dual-bent functions with Condition A in terms of bent partitions. Furthermore, we characterize certain bent partitions in terms of amorphic association schemes, linear codes, and generalized Hadamard matrices, respectively. For general vectorial dual-bent functions $F: V_{n}^{(p)}\rightarrow V_{m}^{(p)}$ with $F(0)=0$, $F(x)=F(-x)$ and $2\leq m \leq \frac{n}{2}$, we give a necessary and sufficient condition for constructing association schemes. Based on this result, more association schemes are constructed from vectorial dual-bent functions.
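As background (recalling the standard definition from the literature rather than anything specific to this paper, with $\langle\cdot,\cdot\rangle_{m}$ a nondegenerate bilinear form on $V_{m}^{(p)}$), the component functions of $F$ are
\[
F_{c}(x) \;=\; \langle c, F(x)\rangle_{m}, \qquad c \in V_{m}^{(p)}\setminus\{0\},
\]
and $F$ is called vectorial bent if every $F_{c}$ with $c\neq 0$ is bent, and vectorial dual-bent if, in addition, the set of duals $\{F_{c}^{\ast} : c \in V_{m}^{(p)}\setminus\{0\}\}$ is again the set of nonzero component functions of some vectorial bent function $G: V_{n}^{(p)}\rightarrow V_{m}^{(p)}$, called a vectorial dual of $F$.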
Ordered sequences of data, specified with a join operation to combine sequences, serve as a foundation for the implementation of parallel functional algorithms. This abstract data type can be elegantly and efficiently implemented using balanced binary trees, where a join operation is provided to combine two trees and rebalance as necessary. In this work, we present a verified implementation and cost analysis of joinable red-black trees in $\textbf{calf}$, a dependent type theory for cost analysis. We implement red-black trees and auxiliary intermediate data structures in such a way that all correctness invariants are intrinsically maintained. Then, we describe and verify precise cost bounds on the operations, making use of the red-black tree invariants. Finally, we implement standard algorithms on sequences using the simple join-based signature and bound their cost in the case that red-black trees are used as the underlying implementation. All proofs are formally mechanized using the embedding of $\textbf{calf}$ in the Agda theorem prover.
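For readers unfamiliar with the join-based approach, the following sketch (in Python rather than the paper's calf/Agda development, with rebalancing deliberately elided) illustrates how the sequence operations reduce to a single join primitive: split, insert, and union below are derived entirely from it, following the standard join-based style for balanced search trees.

# Minimal sketch of a join-based tree interface. The `join` here only
# builds a node; a real red-black join would rebalance along a spine
# and maintain the red-black invariants.
class Node:
    def __init__(self, left, key, right):
        self.left, self.key, self.right = left, key, right

def expose(t):
    return (t.left, t.key, t.right)

def join(left, key, right):
    # Placeholder join: no rebalancing, illustration only.
    return Node(left, key, right)

def split(t, key):
    # Split t into (subtree with keys < key, key present?, keys > key).
    if t is None:
        return (None, False, None)
    l, k, r = expose(t)
    if key == k:
        return (l, True, r)
    if key < k:
        ll, found, lr = split(l, key)
        return (ll, found, join(lr, k, r))
    rl, found, rr = split(r, key)
    return (join(l, k, rl), found, rr)

def insert(t, key):
    l, _, r = split(t, key)
    return join(l, key, r)

def union(t1, t2):
    if t1 is None:
        return t2
    if t2 is None:
        return t1
    l2, k2, r2 = expose(t2)
    l1, _, r1 = split(t1, k2)
    return join(union(l1, l2), k2, union(r1, r2))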
Energy consumption is a fundamental concern in mobile application development, with substantial significance for both developers and end-users, and it is a critical determinant in a consumer's decision to purchase a smartphone. From a sustainability perspective, it is imperative to explore approaches for mitigating the energy consumption of mobile devices, given the profound environmental impact of the billions of smartphones in use worldwide. Despite the existence of various energy-efficient programming practices within Android, the dominant mobile platform, there remains a need for documented machine learning-based energy prediction algorithms tailored explicitly to mobile app development. Hence, the main objective of this research is to propose a novel neural network-based framework, enhanced by a metaheuristic approach, to achieve robust energy prediction in the context of mobile app development. The metaheuristic approach plays a crucial role not only in identifying suitable learning algorithms and their corresponding parameters but also in determining the optimal number of layers and neurons within each layer. To the best of our knowledge, prior studies have not employed any metaheuristic algorithm to address all these hyperparameters simultaneously. Moreover, because access to certain aspects of a mobile phone is limited, the dataset may contain missing values, which the proposed framework can handle. In addition, we conducted an algorithm selection study over 13 metaheuristic algorithms to identify the best algorithm in terms of accuracy and robustness to missing values. Comprehensive experiments demonstrate that our proposed approach yields strong results for energy consumption prediction.
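As a rough illustration of the idea (not the paper's framework, and none of its 13 metaheuristics), the sketch below runs a simple evolutionary search over the number of layers, the neurons per layer, and the learning algorithm; the fitness function is a synthetic placeholder standing in for training a network and measuring validation error on an energy dataset.

# Hypothetical sketch: evolutionary search over network architecture
# hyperparameters. `evaluate` is a synthetic stand-in for model training.
import random

def evaluate(config):
    # Placeholder fitness: lower is better. In the real setting this would
    # train a network with `config` and return validation error.
    depth_penalty = abs(len(config["layers"]) - 3)
    width_penalty = sum(abs(w - 64) for w in config["layers"]) / 100.0
    return depth_penalty + width_penalty

def random_config(max_layers=5, max_neurons=256):
    n_layers = random.randint(1, max_layers)
    return {"layers": [random.randint(8, max_neurons) for _ in range(n_layers)],
            "optimizer": random.choice(["adam", "sgd", "rmsprop"])}

def mutate(config, max_neurons=256):
    new = {"layers": list(config["layers"]), "optimizer": config["optimizer"]}
    move = random.choice(["add", "drop", "resize", "optimizer"])
    if move == "add":
        new["layers"].insert(random.randrange(len(new["layers"]) + 1),
                             random.randint(8, max_neurons))
    elif move == "drop" and len(new["layers"]) > 1:
        new["layers"].pop(random.randrange(len(new["layers"])))
    elif move == "resize":
        i = random.randrange(len(new["layers"]))
        new["layers"][i] = max(8, min(max_neurons,
                                      new["layers"][i] + random.randint(-32, 32)))
    else:
        new["optimizer"] = random.choice(["adam", "sgd", "rmsprop"])
    return new

def evolve(generations=50, population=10):
    # Simple (mu + lambda)-style evolutionary search.
    pop = [random_config() for _ in range(population)]
    for _ in range(generations):
        children = [mutate(random.choice(pop)) for _ in range(population)]
        pop = sorted(pop + children, key=evaluate)[:population]
    return pop[0]

print(evolve())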
Face recognition technology has advanced significantly in recent years due largely to the availability of large and increasingly complex training datasets for use in deep learning models. These datasets, however, typically comprise images scraped from news sites or social media platforms and, therefore, have limited utility in more advanced security, forensics, and military applications. These applications require lower resolution, longer ranges, and elevated viewpoints. To meet these critical needs, we collected and curated the first and second subsets of a large multi-modal biometric dataset designed for use in the research and development (R&D) of biometric recognition technologies under extremely challenging conditions. Thus far, the dataset includes more than 350,000 still images and over 1,300 hours of video footage of approximately 1,000 subjects. To collect this data, we used Nikon DSLR cameras, a variety of commercial surveillance cameras, specialized long-range R&D cameras, and Group 1 and Group 2 UAV platforms. The goal is to support the development of algorithms capable of accurately recognizing people at ranges up to 1,000 m and from high angles of elevation. These advances will include improvements to the state of the art in face recognition and will support new research in the area of whole-body recognition using methods based on gait and anthropometry. This paper describes methods used to collect and curate the dataset, and the dataset's characteristics at the current stage.
Following unprecedented success on natural language tasks, Transformers have been successfully applied to several computer vision problems, achieving state-of-the-art results and prompting researchers to reconsider the supremacy of convolutional neural networks (CNNs) as de facto operators. Capitalizing on these advances in computer vision, the medical imaging field has also witnessed growing interest in Transformers, which can capture global context, compared to CNNs with local receptive fields. Inspired by this transition, in this survey, we attempt to provide a comprehensive review of the applications of Transformers in medical imaging covering various aspects, ranging from recently proposed architectural designs to unsolved issues. Specifically, we survey the use of Transformers in medical image segmentation, detection, classification, reconstruction, synthesis, registration, clinical report generation, and other tasks. In particular, for each of these applications, we develop a taxonomy, identify application-specific challenges as well as provide insights to solve them, and highlight recent trends. Further, we provide a critical discussion of the field's current state as a whole, including the identification of key challenges, open problems, and promising future directions. We hope this survey will ignite further interest in the community and provide researchers with an up-to-date reference regarding applications of Transformer models in medical imaging. Finally, to cope with the rapid development in this field, we intend to regularly update the relevant latest papers and their open-source implementations at \url{https://github.com/fahadshamshad/awesome-transformers-in-medical-imaging}.
Deep neural networks have revolutionized many machine learning tasks in power systems, ranging from pattern recognition to signal processing. The data in these tasks are typically represented in Euclidean domains. Nevertheless, there is an increasing number of applications in power systems where data are collected from non-Euclidean domains and represented as graph-structured data with high-dimensional features and interdependency among nodes. The complexity of graph-structured data has brought significant challenges to the existing deep neural networks defined in Euclidean domains. Recently, many studies on extending deep neural networks to graph-structured data in power systems have emerged. In this paper, a comprehensive overview of graph neural networks (GNNs) in power systems is provided. Specifically, several classical paradigms of GNN structures (e.g., graph convolutional networks, graph recurrent neural networks, graph attention networks, graph generative networks, spatial-temporal graph convolutional networks, and hybrid forms of GNNs) are summarized, and key applications in power systems such as fault diagnosis, power prediction, power flow calculation, and data generation are reviewed in detail. Furthermore, main issues and research trends concerning the applications of GNNs in power systems are discussed.
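As a concrete anchor for one of the paradigms listed above, the layer-wise propagation rule of a standard graph convolutional network (the common Kipf-Welling form, recalled here purely for illustration and not tied to any specific power-system application in the paper) is
\[
H^{(l+1)} \;=\; \sigma\!\left(\tilde{D}^{-\frac{1}{2}}\,\tilde{A}\,\tilde{D}^{-\frac{1}{2}}\,H^{(l)}\,W^{(l)}\right), \qquad \tilde{A} = A + I,\quad \tilde{D}_{ii} = \sum_{j}\tilde{A}_{ij},
\]
where $A$ is the adjacency matrix of the graph (e.g., buses as nodes and lines as edges in a power grid), $H^{(l)}$ collects the node features at layer $l$, $W^{(l)}$ is a trainable weight matrix, and $\sigma$ is a nonlinearity.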