The growing popularity of Machine Learning (ML) has led to its deployment in various sensitive domains, which has resulted in significant research focused on ML security and privacy. However, in some applications, such as autonomous driving, integrity verification of the outsourced ML workload is even more critical, a facet that has received little attention. Existing solutions, such as multi-party computation and proof-based systems, impose significant computational overhead, which makes them unfit for real-time applications. We propose Fides, a novel framework for the real-time validation of outsourced ML workloads. Fides features a novel and efficient distillation technique, Greedy Distillation Transfer Learning, that dynamically distills and fine-tunes a space- and compute-efficient verification model to validate the corresponding service model while running inside a trusted execution environment. Fides also includes a client-side attack detection model that uses statistical analysis and divergence measurements to identify, with high likelihood, whether the service model is under attack, as well as a re-classification functionality that predicts the original class whenever an attack is identified. We devised a generative adversarial network framework for training the attack detection and re-classification models. Our evaluation shows that Fides achieves up to 98% accuracy for attack detection and 94% for re-classification.
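To make the client-side check concrete, the following is a minimal sketch of a divergence-based attack flag of the kind the abstract describes; the KL measure, function names, and threshold are illustrative assumptions, not Fides's actual design.

```python
import numpy as np

def kl_divergence(p, q, eps=1e-12):
    """KL divergence between two class-probability vectors."""
    p = np.clip(np.asarray(p, dtype=float), eps, 1.0)
    q = np.clip(np.asarray(q, dtype=float), eps, 1.0)
    return float(np.sum(p * np.log(p / q)))

def flag_attack(service_probs, verifier_probs, threshold=0.5):
    """Flag a sample as suspicious when the service model's output
    distribution diverges too far from the verification model's.
    The threshold would be calibrated on benign traffic."""
    return kl_divergence(service_probs, verifier_probs) > threshold
```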
Holographic MIMO (HMIMO) is increasingly recognized as a key enabling technology for 6G wireless systems: it deploys an extremely large number of antennas within a compact space to fully exploit the potential of the electromagnetic (EM) channel. Nevertheless, the benefits of HMIMO systems cannot be fully unleashed without an efficient means of estimating the high-dimensional channel, whose distribution becomes increasingly complicated as the near-field region becomes accessible. In this paper, we address the fundamental challenge of designing a low-complexity Bayes-optimal channel estimator for near-field HMIMO systems operating in unknown EM environments. The core idea is to estimate the HMIMO channels solely from the Stein score function of the received pilot signals and an estimated noise level, without relying on priors or supervision that are infeasible to obtain in practical deployments. A neural network is trained with the unsupervised denoising score matching objective to learn the parameterized score function. Meanwhile, a principal component analysis (PCA)-based algorithm is proposed to estimate the noise level by leveraging the low-rank near-field spatial correlation. Building upon these techniques, we develop a closed-form Bayes-optimal score-based channel estimator for fully-digital HMIMO transceivers. The optimal score-based estimator is also extended to hybrid analog-digital HMIMO systems by incorporating it into a low-complexity message passing algorithm. The (quasi-)Bayes-optimality of the proposed estimators is validated both in theory and by extensive simulation results. Beyond optimality, we show that our proposal is robust to various mismatches and, thanks to its unsupervised nature, can quickly adapt to dynamic EM environments in an online manner, demonstrating its potential for real-world deployment.
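The closed-form estimator the abstract alludes to is, in essence, Tweedie's formula: the posterior-mean estimate equals the observation plus the noise variance times the learned score. Below is a minimal NumPy sketch under that reading; the PCA noise routine and its `rank` argument are illustrative simplifications of the paper's algorithm.

```python
import numpy as np

def score_based_estimate(y, score_fn, sigma2):
    """Tweedie-style posterior-mean estimate: given received pilots y,
    a learned score s(y) ~ grad log p(y), and an estimated noise level
    sigma2, the MMSE estimate is y + sigma2 * s(y) (up to the
    complex-Gaussian convention used for the score)."""
    return y + sigma2 * score_fn(y)

def pca_noise_level(Y, rank):
    """Illustrative PCA-based noise estimate: average the trailing
    eigenvalues of the sample covariance, assuming the low-rank
    spatial correlation occupies the leading eigen-subspace."""
    C = Y @ Y.conj().T / Y.shape[1]        # sample covariance of pilots
    eig = np.sort(np.linalg.eigvalsh(C))   # eigenvalues, ascending
    return float(np.mean(eig[: C.shape[0] - rank]))
```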
The purpose of this article is to investigate the viability of Multi-Carrier Modulation (MCM) systems based on the Fast Walsh Hadamard Transform (FWHT). In addition, a nonlinear Joint Low-Complexity Optimized Zero Forcing Successive Interference Cancellation (JLCOZF-SIC) equalizer is proposed. To that end, general equations for the number of floating-point operations (FLOPs) required by the proposed equalizer and various other equalizers are given. The article discusses Banded Matrix Approximation (BMA) as a technique for reducing complexity; the proposed equalizer uses BMA to accomplish both equalization and co-Carrier Frequency Offset (co-CFO) correction. Three cases involving the proposed equalizer were investigated: diagonal compensation, BMA compensation, and complete matrix compensation. Analysis and simulation results show that, in the presence of frequency offset and noise over frequency-selective Rayleigh fading channels, the OFDM-FWHT system with the proposed equalizer outperforms the conventional OFDM system with various linear and nonlinear equalizers.
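For reference, the FWHT at the heart of such MCM systems can be computed in O(N log N) using only additions and subtractions; here is a standard, unnormalized iterative implementation (the normalization convention is left as an assumption).

```python
import numpy as np

def fwht(x):
    """Iterative fast Walsh-Hadamard transform; the length of x must be
    a power of two. Each stage applies butterfly add/subtract pairs,
    which is the complexity advantage FWHT-based MCM exploits over
    FFT-based OFDM (no multiplications needed)."""
    x = np.asarray(x, dtype=float).copy()
    h = 1
    while h < len(x):
        for i in range(0, len(x), h * 2):
            for j in range(i, i + h):
                a, b = x[j], x[j + h]
                x[j], x[j + h] = a + b, a - b
        h *= 2
    return x
```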
We devise a version of Linear Temporal Logic (LTL) on a denotational domain of streams. We investigate this logic in terms of domain theory, (point-free) topology, and geometric logic. This yields the first steps toward an extension of the "Domain Theory in Logical Form" paradigm to temporal liveness properties. We show that the negation-free formulae of LTL induce sober subspaces of streams, but that this is in general not the case in the presence of negation. We propose a direct, inductive translation of negation-free LTL to geometric logic. This translation reflects the approximations used to compute the usual fixpoint representations of LTL modalities. As a motivating example, we handle a natural input-output specification for the usual filter function on streams.
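As an illustration of the fixpoint approximations the translation reflects, the "eventually" modality unfolds, by Kleene iteration, into a countable join, which is exactly the kind of infinitary disjunction geometric logic admits:

```latex
\Diamond\varphi \;=\; \mu X.\,(\varphi \vee \bigcirc X)
\;=\; \bigvee_{n \in \mathbb{N}} \bigcirc^{\,n}\varphi
```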
The rapid adoption of artificial intelligence (AI) and machine learning (ML) has generated growing interest in understanding their environmental impact and the challenges of designing environmentally friendly ML-enabled systems. While Green AI research, i.e., research that tries to minimize the energy footprint of AI, is receiving increasing attention, very few concrete guidelines exist on how ML-enabled systems can be designed to be more environmentally sustainable. In this paper, we fill this gap by providing a catalog of 30 green architectural tactics for ML-enabled systems. An architectural tactic is a high-level design technique for improving a software quality, in our case environmental sustainability. We derived the tactics from an analysis of 51 peer-reviewed publications that primarily explore Green AI, and validated them using a focus group with three experts. The 30 tactics are intended to serve as an initial reference guide for further exploration of Green AI from a software engineering perspective and to assist in designing sustainable ML-enabled systems. To enhance transparency and facilitate their widespread use and extension, we make the tactics available online in easily consumable formats. Widespread adoption of these tactics has the potential to substantially reduce the energy and carbon footprint of ML-enabled systems and, with it, their societal impact.
We investigate two efficient time discretizations for the post-processing technique of discontinuous Galerkin (DG) methods for solving hyperbolic conservation laws. The post-processing technique, applied at the final time of the DG method, can enhance the accuracy of the original DG solution (spatial superconvergence). One main difficulty is that the spatial superconvergence after post-processing needs to be matched by commensurate temporal accuracy: if the semi-discretized system (the ODE system after spatial discretization) is under-resolved in time, the spatial superconvergence will be concealed. In this paper, we focus on the recently designed SDG method and derive its explicit scheme from a correction process based on the DG weak formulation. We also introduce a similar technique, the spectral deferred correction (SDC) method. We compare both proposed time discretization techniques with the standard third-order Runge-Kutta method through several numerical examples and conclude that both the SDG and SDC methods are efficient time discretizations for exploiting the spatial superconvergence of DG methods.
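To fix ideas, here is a minimal sketch of one SDC correction sweep for an ODE system y' = f(t, y); a real implementation would integrate the residual with spectral quadrature over all nodes, whereas this sketch uses the trapezoid rule for brevity.

```python
import numpy as np

def sdc_sweep(f, t_nodes, y_prov):
    """One spectral-deferred-correction sweep: given a provisional
    solution y_prov at the quadrature nodes (e.g. from forward Euler),
    each sweep raises the temporal order by one, up to the order of
    the quadrature, which lets the time integration keep pace with
    spatial superconvergence."""
    y = y_prov.copy()
    for m in range(len(t_nodes) - 1):
        dt = t_nodes[m + 1] - t_nodes[m]
        # integral of f along the provisional solution (trapezoid rule
        # here; spectral quadrature in a full implementation)
        integral = 0.5 * dt * (f(t_nodes[m], y_prov[m])
                               + f(t_nodes[m + 1], y_prov[m + 1]))
        # forward-Euler step on the error equation plus the integral
        y[m + 1] = (y[m]
                    + dt * (f(t_nodes[m], y[m]) - f(t_nodes[m], y_prov[m]))
                    + integral)
    return y
```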
As coding challenges become more complex, recent advancements in Large Language Models (LLMs) have led to notable successes, such as a 94.6% solve rate on the HumanEval benchmark. Concurrently, there is an increasing commercial push for repository-level inline code completion tools, such as GitHub Copilot and Tabnine, aimed at enhancing developer productivity. This paper examines the transition from individual coding problems to repository-scale solutions, presenting a thorough review of the current literature on effective LLM prompting for code generation at the repository level. We examine approaches that work with black-box LLMs, making them useful and applicable to commercial use cases, and assess their ability to interpret code at repository scale. We juxtapose the Repository-Level Prompt Generation technique with RepoCoder, an iterative retrieval and generation method, to highlight the trade-offs inherent in each approach and to establish best practices for their application on cutting-edge coding benchmarks. The interplay between iterative prompt refinement and the development of advanced retrieval systems forms the core of our discussion, offering a pathway to significantly improve LLM performance on code generation tasks. Insights from this study not only guide the application of these methods but also chart a course for future research integrating such techniques into broader software engineering contexts.
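The iterative retrieval-and-generation loop at issue can be summarized in a few lines; `retrieve` and `generate` below are placeholders for a repository retriever and a black-box LLM call, so this is a sketch of the control flow rather than either tool's actual implementation.

```python
def iterative_retrieve_generate(query, retrieve, generate, rounds=2):
    """RepoCoder-style loop: the model's own draft completion is fed
    back into the retrieval query, so later rounds pull repository
    snippets that match the code being written, not just the original
    prompt. `retrieve` returns a list of snippet strings; `generate`
    calls a black-box LLM on the assembled prompt."""
    completion = ""
    for _ in range(rounds):
        snippets = retrieve(query + completion)  # refine query with draft
        prompt = "\n".join(snippets) + "\n" + query
        completion = generate(prompt)
    return completion
```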
We present the design of a mixed reality (MR) telehealth training system that aims to close the gap between in-person and distance training and re-training for medical procedures. Our system uses real-time volumetric capture as a means for communicating and relating spatial information between the non-colocated trainee and instructor. The system's design is based on a requirements elicitation study performed in situ at a medical school simulation training center. The focus is on lightweight real-time transmission of volumetric data, meaning consumer hardware, easy and quick deployment, and low-demand computation. We evaluate the MR system design by analyzing users' workload during medical training, comparing in-person, video, and MR training workloads. The results indicate that the overall workload for central line placement training with MR does not increase significantly compared to video communication. Our work shows that, when designed strategically together with domain experts, an MR communication system can be used effectively for complex medical procedural training without significantly increasing the overall workload for users. Moreover, MR systems offer new opportunities for teaching thanks to spatial information, hand tracking, and augmented communication.
Visual Place Recognition (VPR) is a critical task for performing global re-localization in visual perception systems. It requires the ability to accurately recognize a previously visited location under variations in illumination, occlusion, appearance, and viewpoint. For robotic systems and augmented reality, the target deployment platforms are battery-powered edge devices; therefore, whilst the accuracy of VPR methods is important, so too are memory consumption and latency. Recent works have focused on the recall@1 metric as a performance measure, with limited attention to resource utilization, resulting in methods whose deep learning models are too large to deploy on low-powered edge devices. We hypothesize that these large models are highly over-parameterized and can be optimized to satisfy the constraints of a low-powered embedded system whilst maintaining high recall performance. Our work studies the impact of compact convolutional network architecture design, in combination with full-precision and mixed-precision post-training quantization, on VPR performance. Importantly, we measure not only the recall@1 score but also memory consumption and latency. We characterize the design implications for memory, latency, and recall, and provide a number of design recommendations for VPR systems under these resource limitations.
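As one example of the post-training quantization studied here, PyTorch's dynamic quantization converts a network's linear layers to 8-bit integers with no retraining; a full mixed-precision study would also calibrate and quantize the convolutional layers, which this minimal sketch omits.

```python
import torch

def quantize_for_edge(model):
    """Illustrative post-training dynamic quantization of a VPR
    descriptor network: linear layers are converted to int8, trading
    a (usually small) recall@1 drop for lower memory and latency on
    a low-powered edge device."""
    return torch.quantization.quantize_dynamic(
        model, {torch.nn.Linear}, dtype=torch.qint8
    )
```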
Large Language Models (LLMs) have shown excellent generalization capabilities that have led to the development of numerous models. These models introduce new architectures, tweak existing architectures with refined training strategies, increase context length, use higher-quality training data, and extend training time to outperform baselines. Analyzing new developments is crucial for identifying changes that enhance training stability and improve generalization in LLMs. This survey paper comprehensively analyzes LLM architectures and their categorization, training strategies, training datasets, and performance evaluations, and discusses future research directions. Moreover, the paper covers the basic building blocks and concepts behind LLMs, followed by a complete overview of LLMs, including their important features and functions. Finally, the paper summarizes significant findings from LLM research and consolidates essential architectural and training strategies for developing advanced LLMs. Given the continuous advancements in LLMs, we intend to update this paper regularly by incorporating new sections and featuring the latest LLM models.
Visual Question Answering (VQA) models have so far struggled with counting objects in natural images. We identify the soft attention used in these models as a fundamental cause. To circumvent this problem, we propose a neural network component that enables robust counting from object proposals. Experiments on a toy task show the effectiveness of this component, and we obtain state-of-the-art accuracy on the number category of the VQA v2 dataset without negatively affecting other categories, even outperforming ensemble models with our single model. On a difficult balanced pair metric, the component improves counting over a strong baseline by 6.6%.
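The intuition behind counting from object proposals is deduplication: attended proposals that overlap heavily are likely the same object. The sketch below shows a hard, non-differentiable version of that logic with illustrative thresholds; the proposed component is a differentiable relaxation of this idea so it can train end-to-end with soft attention.

```python
def iou(a, b):
    """Intersection over union of two boxes in (x1, y1, x2, y2) form."""
    ix = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    iy = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = ix * iy
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter + 1e-9)

def count_from_proposals(boxes, attn, attn_thresh=0.5, iou_thresh=0.5):
    """Hard-deduplication counting from attended object proposals:
    keep the proposals the attention selects, then drop any proposal
    that overlaps an already-kept box, and return how many remain."""
    kept = []
    for box, a in sorted(zip(boxes, attn), key=lambda t: -t[1]):
        if a < attn_thresh:
            break
        if all(iou(box, k) < iou_thresh for k in kept):
            kept.append(box)
    return len(kept)
```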