Artificial Intelligence (AI) is one of the most transformative technologies of the 21st century. The extent and scope of future AI capabilities remain a key uncertainty, with widespread disagreement on timelines and potential impacts. As nations and technology companies race toward greater complexity and autonomy in AI systems, there are growing concerns over the extent of integration and oversight of opaque AI decision processes. This is especially true in the subfield of machine learning (ML), where systems learn to optimize objectives without human supervision. Those objectives can be imperfectly specified, or pursued in unexpected and potentially harmful ways. This becomes more concerning as systems increase in power and autonomy, where an abrupt capability jump could result in unexpected shifts in power dynamics or even catastrophic failures. This study presents a hierarchical complex systems framework to model AI risk and provide a template for alternative futures analysis. Survey data were collected from domain experts in the public and private sectors to classify AI impact and likelihood. The results show increased uncertainty over the powerful AI agent scenario, confidence in multiagent environments, and increased concern over AI alignment failures and influence-seeking behavior.
Artificial Intelligence (AI), particularly through the advent of large-scale generative AI (GenAI) models such as Large Language Models (LLMs), has become a transformative element in contemporary technology. While these models have unlocked new possibilities, they simultaneously present significant challenges, such as concerns over data privacy and the propensity to generate misleading or fabricated content. Current frameworks for Responsible AI (RAI) often fall short in providing the granular guidance necessary for tangible application, especially for Accountability, a principle that is pivotal for ensuring transparent and auditable decision-making, bolstering public trust, and meeting increasing regulatory expectations. This study bridges the accountability gap by introducing our effort towards a comprehensive metrics catalogue, formulated through a systematic multivocal literature review (MLR) that integrates findings from both academic and grey literature. Our catalogue delineates process metrics that underpin procedural integrity, resource metrics that provide necessary tools and frameworks, and product metrics that reflect the outputs of AI systems. This tripartite framework is designed to operationalize Accountability in AI, with a special emphasis on addressing the intricacies of GenAI.
To realize holographic communications, a potential technology for spectrum efficiency improvement in the future sixth-generation (6G) network, antenna arrays comprising numerous antenna elements will be deployed. However, the increase in antenna aperture size places some users in the Fresnel region, leading to a hybrid near-field and far-field communication mode in which conventional far-field channel estimation methods no longer work well. To tackle this challenge, this paper considers channel estimation in a hybrid-field multipath environment, where each user and each scatterer can be in either the far-field or the near-field region. First, a joint angular-polar domain channel transform is designed to capture the hybrid-field channel's near-field and far-field features. We then analyze the power diffusion effect in the hybrid-field channel, which indicates that the power corresponding to one near-field (far-field) path component of the multipath channel may spread to far-field (near-field) paths and cause estimation errors. We design a novel power-diffusion-based orthogonal matching pursuit channel estimation algorithm (PD-OMP), which eliminates the need for prior knowledge of the number of far-field and near-field paths, a requirement of other OMP-based channel estimation algorithms. Simulation results show that PD-OMP outperforms current hybrid-field channel estimation methods.
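For readers unfamiliar with the baseline this abstract builds on, the greedy sparse recovery at the heart of OMP-style channel estimation can be sketched as follows. This is a generic orthogonal matching pursuit over an angular/polar-domain dictionary, not the authors' PD-OMP; the dictionary, dimensions, and stopping rule are illustrative.

```python
import numpy as np

def omp(A, y, max_paths, tol=1e-6):
    """Generic orthogonal matching pursuit: recover a sparse channel
    vector x with y ~= A @ x, where A is a (dictionary) sensing matrix."""
    residual = y.copy()
    support = []
    x = np.zeros(A.shape[1], dtype=complex)
    for _ in range(max_paths):
        # pick the dictionary column most correlated with the residual
        idx = int(np.argmax(np.abs(A.conj().T @ residual)))
        support.append(idx)
        # least-squares re-estimate of coefficients on the current support
        coeffs, *_ = np.linalg.lstsq(A[:, support], y, rcond=None)
        residual = y - A[:, support] @ coeffs
        if np.linalg.norm(residual) < tol:
            break
    x[support] = coeffs
    return x

# toy example: 3-sparse channel, random normalized dictionary
rng = np.random.default_rng(0)
A = rng.standard_normal((128, 256)) + 1j * rng.standard_normal((128, 256))
A /= np.linalg.norm(A, axis=0)
x_true = np.zeros(256, dtype=complex)
x_true[[5, 40, 99]] = [1.0, -0.5j, 0.8]
y = A @ x_true
x_hat = omp(A, y, max_paths=3)
```

Note that `max_paths` is exactly the prior knowledge of path counts that PD-OMP, per the abstract, dispenses with.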
Maximal Extractable Value (MEV) searching has gained prominence on the Ethereum blockchain since the surge in Decentralized Finance activities. In Ethereum, MEV extraction primarily hinges on fee payments to block proposers. However, in First-Come-First-Served (FCFS) blockchain networks, the focus shifts to latency optimizations, akin to High-Frequency Trading in Traditional Finance. This paper illustrates the dynamics of the MEV extraction game in an FCFS network, specifically Algorand. We introduce an arbitrage detection algorithm tailored to the unique time constraints of FCFS networks and assess its effectiveness. Additionally, our experiments investigate potential optimizations in Algorand's network layer to secure optimal execution positions. Our analysis reveals that while the states of relevant trading pools are updated approximately every six blocks at the median, pursuing MEV at the block state level is not viable on Algorand, as arbitrage opportunities are typically executed within the blocks in which they appear. Our algorithm's performance under varying time constraints underscores the importance of timing in arbitrage discovery. Furthermore, our network-level experiments identify critical transaction prioritization strategies for Algorand's FCFS network. Key among these is reducing latency in connections with relays that are well-connected to high-staked proposers.
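To make the notion of "arbitrage detection" concrete, the core check such a detector performs can be sketched on a simplified two-pool cycle. This uses textbook constant-product AMM math with a naive grid search over input sizes; the pool reserves, fee, and search strategy are illustrative and not Algorand-specific or the paper's algorithm.

```python
def amm_out(x_in, r_in, r_out, fee=0.003):
    """Output amount from a constant-product pool for input x_in."""
    x_eff = x_in * (1 - fee)
    return r_out * x_eff / (r_in + x_eff)

def cyclic_profit(x_in, pool_ab, pool_ba):
    """Profit of trading A -> B on pool 1, then B -> A on pool 2."""
    b = amm_out(x_in, *pool_ab)
    a_back = amm_out(b, *pool_ba)
    return a_back - x_in

def best_arbitrage(pool_ab, pool_ba, max_in, steps=1000):
    """Grid-search the input amount that maximizes cyclic profit;
    returns (best_input, best_profit), or (0, 0.0) if unprofitable."""
    best_x, best_p = 0.0, 0.0
    for i in range(1, steps + 1):
        x = max_in * i / steps
        p = cyclic_profit(x, pool_ab, pool_ba)
        if p > best_p:
            best_x, best_p = x, p
    return best_x, best_p

# two pools quoting the same pair at inconsistent prices (toy reserves)
pool_ab = (1_000.0, 2_000.0)  # reserves of (A, B): price ~2 B per A
pool_ba = (1_800.0, 1_000.0)  # reserves of (B, A): price ~0.56 A per B
x_star, profit = best_arbitrage(pool_ab, pool_ba, max_in=100.0)
```

In an FCFS setting the time budget bounds how many such cycles can be evaluated before the opportunity is taken, which is the trade-off the paper's time-constrained detector addresses.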
Simultaneously transmitting and reflecting reconfigurable intelligent surface (STAR-RIS) is a cutting-edge concept for the sixth-generation (6G) wireless networks. In this letter, we propose a novel system that incorporates STAR-RIS with simultaneous wireless information and power transfer (SWIPT) using rate splitting multiple access (RSMA). The proposed system facilitates communication from a multi-antenna base station (BS) to single-antenna users in a downlink transmission. The BS concurrently sends energy and information signals to multiple energy harvesting receivers (EHRs) and information data receivers (IDRs) with the support of a deployed STAR-RIS. Furthermore, a multi-objective optimization is introduced to strike a balance between users' sum rate and the total harvested energy. To achieve this, an optimization problem is formulated to optimize the energy/information beamforming vectors at the BS, the phase shifts at the STAR-RIS, and the common message rate. Subsequently, we employ a meta deep deterministic policy gradient (Meta-DDPG) approach to solve the complex problem. Simulation results validate that the proposed algorithm significantly enhances both data rate and harvested energy in comparison to conventional DDPG.
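One common device for striking such a balance between sum rate and harvested energy is a weighted-sum scalarization, which also yields a scalar reward usable by a DDPG-style agent. The sketch below is a generic illustration of that idea; the weight and normalization references are arbitrary and not the letter's formulation.

```python
def scalarized_reward(sum_rate, harvested_energy, w=0.7,
                      rate_ref=10.0, energy_ref=1.0):
    """Weighted sum of normalized objectives: w trades off the users'
    sum rate (bits/s/Hz) against total harvested energy at the EHRs."""
    return w * sum_rate / rate_ref + (1 - w) * harvested_energy / energy_ref
```

Sweeping `w` over (0, 1) traces out different operating points between the two objectives, one scalarized problem per weight.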
This research investigates the transferability of Automatic Speech Recognition (ASR)-robust Natural Language Understanding (NLU) models from controlled experimental conditions to practical, real-world applications. Focused on smart home automation commands in Urdu, the study assesses model performance under diverse noise profiles, linguistic variations, and ASR error scenarios. Leveraging the UrduBERT model, the research employs a systematic methodology involving real-world data collection, cross-validation, transfer learning, noise variation studies, and domain adaptation. Evaluation metrics encompass task-specific accuracy, latency, user satisfaction, and robustness to ASR errors. The findings contribute insights into the challenges and adaptability of ASR-robust NLU models in transcending controlled environments.
Visual Speech Recognition (VSR) is the task of predicting spoken words from silent lip movements. VSR is regarded as a challenging task because of the insufficient information in lip movements. In this paper, we propose an Audio Knowledge empowered Visual Speech Recognition framework (AKVSR) to complement the insufficient speech information of the visual modality by using the audio modality. Different from previous methods, the proposed AKVSR 1) utilizes rich audio knowledge encoded by a large-scale pretrained audio model, 2) saves the linguistic information of audio knowledge in a compact audio memory by discarding the non-linguistic information from the audio through quantization, and 3) includes an Audio Bridging Module that finds the best-matched audio features from the compact audio memory, enabling training without audio inputs once the compact audio memory has been composed. We validate the effectiveness of the proposed method through extensive experiments, and achieve new state-of-the-art performance on the widely-used LRS3 dataset.
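The memory mechanism described here, quantizing audio features into a compact codebook and later retrieving the best match for a visual query, resembles standard vector quantization followed by nearest-neighbor lookup. The sketch below illustrates that generic pattern with plain k-means; the memory size, feature dimension, and similarity measure are arbitrary choices, not the authors' implementation.

```python
import numpy as np

def build_memory(audio_feats, n_codes, n_iter=20, seed=0):
    """Quantize audio features into a compact codebook (plain k-means)."""
    rng = np.random.default_rng(seed)
    codes = audio_feats[rng.choice(len(audio_feats), n_codes, replace=False)]
    for _ in range(n_iter):
        # assign each feature to its nearest codeword
        d = np.linalg.norm(audio_feats[:, None] - codes[None], axis=-1)
        assign = d.argmin(axis=1)
        # move each codeword to the mean of its assigned features
        for k in range(n_codes):
            if np.any(assign == k):
                codes[k] = audio_feats[assign == k].mean(axis=0)
    return codes

def bridge(visual_query, memory):
    """Retrieve the best-matched codeword for a visual feature vector."""
    sims = memory @ visual_query  # dot-product similarity
    return memory[int(sims.argmax())]

rng = np.random.default_rng(1)
feats = rng.standard_normal((500, 16))   # stand-in for audio features
memory = build_memory(feats, n_codes=8)  # built once, offline
retrieved = bridge(rng.standard_normal(16), memory)  # audio-free inference
```

The key property, mirrored in the abstract, is that once `memory` is built, inference touches only the visual query and the stored codewords, never raw audio.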
College students with ADHD respond positively to simple socially assistive robots (SARs) that monitor attention and provide non-verbal feedback, but studies have been done only in brief in-lab sessions. We present an initial design and evaluation of an in-dorm SAR study companion for college students with ADHD. This work represents the introductory stages of an ongoing user-centered, participatory design process. In a three-week within-subjects user study, university students (N=11) with self-reported symptoms of adult ADHD had a SAR study companion in their dorm room for two weeks and a computer-based system for one week. Toward developing SARs for long-term, in-dorm use, we focus on 1) evaluating the usability and desire for SAR study companions by college students with ADHD and 2) collecting participant feedback about the SAR design and functionality. Participants responded positively to the robot; after one week of regular use, 91% (10 of 11) chose to continue using the robot voluntarily in the second week.
Explainable Artificial Intelligence (XAI) is transforming the field of Artificial Intelligence (AI) by enhancing the trust of end-users in machines. As the number of connected devices continues to grow, the Internet of Things (IoT) market needs to be trustworthy for end-users. However, the existing literature still lacks a systematic and comprehensive survey on the use of XAI for IoT. To bridge this gap, in this paper, we address XAI frameworks with a focus on their characteristics and support for IoT. We illustrate the widely-used XAI services for IoT applications, such as security enhancement, the Internet of Medical Things (IoMT), the Industrial IoT (IIoT), and the Internet of City Things (IoCT). We also suggest the implementation choice of XAI models over IoT systems in these applications with appropriate examples and summarize the key inferences for future works. Moreover, we present the cutting-edge development in edge XAI structures and the support of sixth-generation (6G) communication services for IoT applications, along with key inferences. In a nutshell, this paper constitutes the first holistic compilation on the development of XAI-based frameworks tailored for the demands of future IoT use cases.
Within the rapidly developing Internet of Things (IoT), numerous and diverse physical devices, Edge devices, Cloud infrastructure, and their quality of service (QoS) requirements need to be represented within a unified specification in order to enable rapid IoT application development, monitoring, and dynamic reconfiguration. However, heterogeneities among different configuration knowledge representation models limit the acquisition, discovery, and curation of configuration knowledge for coordinated IoT applications. This paper proposes a unified data model to represent IoT resource configuration knowledge artifacts. It also proposes IoT-CANE (Context-Aware recommendatioN systEm) to facilitate incremental knowledge acquisition and declarative, context-driven knowledge recommendation.
We propose a novel single shot object detection network named Detection with Enriched Semantics (DES). Our motivation is to enrich the semantics of object detection features within a typical deep detector, by a semantic segmentation branch and a global activation module. The segmentation branch is supervised by weak segmentation ground-truth, i.e., no extra annotation is required. In conjunction with that, we employ a global activation module which learns relationships between channels and object classes in a self-supervised manner. Comprehensive experimental results on both the PASCAL VOC and MS COCO detection datasets demonstrate the effectiveness of the proposed method. In particular, with a VGG16-based DES, we achieve an mAP of 81.7 on VOC2007 test and an mAP of 32.8 on COCO test-dev with an inference speed of 31.5 milliseconds per image on a Titan Xp GPU. With a lower-resolution version, we achieve an mAP of 79.7 on VOC2007 with an inference speed of 13.0 milliseconds per image.
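A global activation module that relates channels to semantics, as described above, is structurally similar to a squeeze-and-excitation-style block: globally pool the feature map, pass the pooled vector through a small gating network, and reweight channels. The NumPy sketch below illustrates that generic pattern only, with random weights standing in for learned ones; it is not the DES implementation.

```python
import numpy as np

def global_activation(fmap, w1, w2):
    """Channel recalibration: global pool -> 2-layer MLP -> sigmoid gate.
    fmap: (C, H, W) feature map; w1: (C//r, C); w2: (C, C//r)."""
    z = fmap.mean(axis=(1, 2))               # squeeze: global average pool
    h = np.maximum(w1 @ z, 0.0)              # channel reduction + ReLU
    gate = 1.0 / (1.0 + np.exp(-(w2 @ h)))   # per-channel sigmoid gate
    return fmap * gate[:, None, None]        # reweight each channel

rng = np.random.default_rng(0)
fmap = rng.standard_normal((8, 4, 4))  # toy feature map, C=8
w1 = rng.standard_normal((2, 8)) * 0.1  # reduction ratio r=4
w2 = rng.standard_normal((8, 2)) * 0.1
out = global_activation(fmap, w1, w2)
```

Because the gate lies in (0, 1) per channel, the module can only attenuate or preserve channels, which is how such blocks emphasize class-relevant responses.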