The fifth-generation (5G) mobile/cellular and time-sensitive networking (TSN) technologies are widely recognized as the key to shaping smart manufacturing for Industry 4.0 and beyond. Converged operation of the two offers end-to-end real-time and deterministic connectivity over hybrid wired and wireless segments. On the other hand, the augmented reality (AR) technology provides various benefits for the manufacturing sector. To this end, this demonstration showcases AR-aided remote assistance use-case over a hybrid TSN and 5G system. The demonstration setup comprises off-the-shelf 5G and TSN devices, a near product-grade 5G system, and an AR solution based on smart glasses. The demonstration shows the viability of over-the air transmission of scheduled TSN traffic and real-time assistance for a local user from a remote environments. Performance results from the demonstration setup are also shown.
Reconfigurable intelligent surfaces (RISs) are envisioned as a promising technology for future wireless communication systems due to their ability to control the propagation environment in a hardware- and energy-efficient way. Recently, the concept of RISs has been extended to beyond diagonal RISs (BD-RISs), which unlock the full potential of RISs thanks to the presence of tunable interconnections between RIS elements. While various algorithms have been proposed for specific BD-RIS architectures, a universal optimization framework applicable to arbitrary architectures is still lacking. In this paper, we bridge this research gap by proposing an architecture-independent framework for BD-RIS optimization, with the main focus on sum-rate maximization and transmit power minimization in multiuser multi-input single-output (MU-MISO) systems. Specifically, we first incorporate BD-RIS architectures into the models by connecting the scattering matrix with the admittance matrix and introducing appropriate constraints to the admittance matrix. The formulated problems are then solved by our custom-designed partially proximal alternating direction method of multipliers (pp-ADMM) algorithms. The pp-ADMM algorithms are computationally efficient, with each subproblem either admitting a closed-form solution or being easily solvable. We further explore the extension of the proposed framework to general utility functions and multiuser multi-input multi-output (MU-MIMO) systems. Simulation results demonstrate that the proposed approaches achieve a better trade-off between performance and computational efficiency compared to existing methods. We also compare the performance of various BD-RIS architectures in MU-MISO systems using the proposed approach, which has not been explored before due to the lack of an architecture-independent framework.
Fair and trustworthy AI is becoming ever more important in both machine learning and legal domains. One important consequence is that decision makers must seek to guarantee a 'fair', i.e., non-discriminatory, algorithmic decision procedure. However, there are several competing notions of algorithmic fairness that have been shown to be mutually incompatible under realistic factual assumptions. This concerns, for example, the widely used fairness measures of 'calibration within groups' and 'balance for the positive/negative class'. In this paper, we present a novel algorithm (FAir Interpolation Method: FAIM) for continuously interpolating between these three fairness criteria. Thus, an initially unfair prediction can be remedied to, at least partially, meet a desired, weighted combination of the respective fairness conditions. We demonstrate the effectiveness of our algorithm when applied to synthetic data, the COMPAS data set, and a new, real-world data set from the e-commerce sector. Finally, we discuss to what extent FAIM can be harnessed to comply with conflicting legal obligations. The analysis suggests that it may operationalize duties in traditional legal fields, such as credit scoring and criminal justice proceedings, but also for the latest AI regulations put forth in the EU, like the Digital Markets Act and the recently enacted AI Act.
High-quality benchmarks are the foundation for embodied AI research, enabling significant advancements in long-horizon navigation, manipulation and rearrangement tasks. However, as frontier tasks in robotics get more advanced, they require faster simulation speed, more intricate test environments, and larger demonstration datasets. To this end, we present MS-HAB, a holistic benchmark for low-level manipulation and in-home object rearrangement. First, we provide a GPU-accelerated implementation of the Home Assistant Benchmark (HAB). We support realistic low-level control and achieve over 3x the speed of previous magical grasp implementations at similar GPU memory usage. Second, we train extensive reinforcement learning (RL) and imitation learning (IL) baselines for future work to compare against. Finally, we develop a rule-based trajectory filtering system to sample specific demonstrations from our RL policies which match predefined criteria for robot behavior and safety. Combining demonstration filtering with our fast environments enables efficient, controlled data generation at scale.
Social media platforms have become the hubs for various user interactions covering a wide range of needs, including technical support and services related to brands, products, or user accounts. Unfortunately, there has been a recent surge in scammers impersonating official services and providing fake technical support to users through these platforms. In this study, we focus on scammers engaging in such fake technical support to target users who are having problems recovering their accounts. More specifically, we focus on users encountering access problems with social media profiles (e.g., on platforms such as Facebook, Instagram, Gmail, and X) and cryptocurrency wallets. The main contribution of our work is the development of an automated system that interacts with scammers via a chatbot that mimics different personas. By initiating decoy interactions (e.g., through deceptive tweets), we have enticed scammers to interact with our system so that we can analyze their modus operandi. Our results show that scammers employ many social media profiles asking users to contact them via a few communication channels. Using a large language model (LLM), our chatbot had conversations with 450 scammers and provided valuable insights into their tactics and, most importantly, their payment profiles. This automated approach highlights how scammers use a variety of strategies, including role-playing, to trick victims into disclosing personal or financial information. With this study, we lay the foundation for using automated chat-based interactions with scammers to detect and study fraudulent activities at scale in an automated way.
The development of highly sophisticated neural networks has allowed for fast progress in every field of computer vision, however, applications where annotated data is prohibited due to privacy or security concerns remain challenging. Federated Learning (FL) offers a promising framework for individuals aiming to collaboratively develop a shared model while preserving data privacy. Nevertheless, our findings reveal that variations in data distribution among clients can profoundly affect FL methodologies, primarily due to instabilities in the aggregation process. We also propose a novel FL framework to mitigate the adverse effects of covariate shifts among federated clients by combining individual parameter pruning and regularization techniques to improve the robustness of individual clients' models to aggregate. Each client's model is optimized through magnitude-based pruning and the addition of dropout and noise injection layers to build more resilient decision pathways in the networks and improve the robustness of the model's parameter aggregation step. The proposed framework is capable of extracting robust representations even in the presence of very large covariate shifts among client data distributions and in the federation of a small number of clients. Empirical findings substantiate the effectiveness of our proposed methodology across common benchmark datasets, including CIFAR10, MNIST, SVHN, and Fashion MNIST. Furthermore, we introduce the CelebA-Gender dataset, specifically designed to evaluate performance on a more realistic domain. The proposed method is capable of extracting robust representations even in the presence of both high and low covariate shifts among client data distributions.
The Consumer Internet of Things (CIoT), a notable segment within the IoT domain, involves the integration of IoT technology into consumer electronics and devices, such as smart homes and smart wearables. Compared to traditional IoT fields, CIoT differs notably in target users, product types, and design approaches. While offering convenience to users, it also raises new security and privacy concerns. Network traffic analysis, a widely used technique in the security community, has been extensively applied to investigate these concerns about CIoT. Compared to network traffic analysis in other fields such as mobile apps and websites, CIoT presents unique characteristics, introducing new challenges and research opportunities. Researchers have made significant contributions in this area. To aid researchers in understanding the application of traffic analysis tools for studying CIoT security and privacy risks, this survey reviews 303 publications on traffic analysis within the CIoT security and privacy domain from January 2018 to June 2024, focusing on three research questions. Our work: 1) outlines the CIoT traffic analysis process and highlights its differences from general network traffic analysis. 2) summarizes and classifies existing research into four categories according to its application objectives: device fingerprinting, user activity inference, malicious traffic detection, and measurement. 3) explores emerging challenges and potential future research directions based on each step of the CIoT traffic analysis process. This will provide new insights to the community and guide the industry towards safer product designs.
Recently, the surge of efficient and automated 3D AI-generated content (AIGC) methods has increasingly illuminated the path of transforming human imagination into complex 3D structures. However, the automated generation of 3D content is still significantly lags in industrial application. This gap exists because 3D modeling demands high-quality assets with sharp geometry, exquisite topology, and physically based rendering (PBR), among other criteria. To narrow the disparity between generated results and artists' expectations, we introduce GraphicsDreamer, a method for creating highly usable 3D meshes from single images. To better capture the geometry and material details, we integrate the PBR lighting equation into our cross-domain diffusion model, concurrently predicting multi-view color, normal, depth images, and PBR materials. In the geometry fusion stage, we continue to enforce the PBR constraints, ensuring that the generated 3D objects possess reliable texture details, supporting realistic relighting. Furthermore, our method incorporates topology optimization and fast UV unwrapping capabilities, allowing the 3D products to be seamlessly imported into graphics engines. Extensive experiments demonstrate that our model can produce high quality 3D assets in a reasonable time cost compared to previous methods.
Over the past few years, the rapid development of deep learning technologies for computer vision has greatly promoted the performance of medical image segmentation (MedISeg). However, the recent MedISeg publications usually focus on presentations of the major contributions (e.g., network architectures, training strategies, and loss functions) while unwittingly ignoring some marginal implementation details (also known as "tricks"), leading to a potential problem of the unfair experimental result comparisons. In this paper, we collect a series of MedISeg tricks for different model implementation phases (i.e., pre-training model, data pre-processing, data augmentation, model implementation, model inference, and result post-processing), and experimentally explore the effectiveness of these tricks on the consistent baseline models. Compared to paper-driven surveys that only blandly focus on the advantages and limitation analyses of segmentation models, our work provides a large number of solid experiments and is more technically operable. With the extensive experimental results on both the representative 2D and 3D medical image datasets, we explicitly clarify the effect of these tricks. Moreover, based on the surveyed tricks, we also open-sourced a strong MedISeg repository, where each of its components has the advantage of plug-and-play. We believe that this milestone work not only completes a comprehensive and complementary survey of the state-of-the-art MedISeg approaches, but also offers a practical guide for addressing the future medical image processing challenges including but not limited to small dataset learning, class imbalance learning, multi-modality learning, and domain adaptation. The code has been released at: //github.com/hust-linyi/MedISeg
The incredible development of federated learning (FL) has benefited various tasks in the domains of computer vision and natural language processing, and the existing frameworks such as TFF and FATE has made the deployment easy in real-world applications. However, federated graph learning (FGL), even though graph data are prevalent, has not been well supported due to its unique characteristics and requirements. The lack of FGL-related framework increases the efforts for accomplishing reproducible research and deploying in real-world applications. Motivated by such strong demand, in this paper, we first discuss the challenges in creating an easy-to-use FGL package and accordingly present our implemented package FederatedScope-GNN (FS-G), which provides (1) a unified view for modularizing and expressing FGL algorithms; (2) comprehensive DataZoo and ModelZoo for out-of-the-box FGL capability; (3) an efficient model auto-tuning component; and (4) off-the-shelf privacy attack and defense abilities. We validate the effectiveness of FS-G by conducting extensive experiments, which simultaneously gains many valuable insights about FGL for the community. Moreover, we employ FS-G to serve the FGL application in real-world E-commerce scenarios, where the attained improvements indicate great potential business benefits. We publicly release FS-G, as submodules of FederatedScope, at //github.com/alibaba/FederatedScope to promote FGL's research and enable broad applications that would otherwise be infeasible due to the lack of a dedicated package.
With the advent of 5G commercialization, the need for more reliable, faster, and intelligent telecommunication systems are envisaged for the next generation beyond 5G (B5G) radio access technologies. Artificial Intelligence (AI) and Machine Learning (ML) are not just immensely popular in the service layer applications but also have been proposed as essential enablers in many aspects of B5G networks, from IoT devices and edge computing to cloud-based infrastructures. However, most of the existing surveys in B5G security focus on the performance of AI/ML models and their accuracy, but they often overlook the accountability and trustworthiness of the models' decisions. Explainable AI (XAI) methods are promising techniques that would allow system developers to identify the internal workings of AI/ML black-box models. The goal of using XAI in the security domain of B5G is to allow the decision-making processes of the security of systems to be transparent and comprehensible to stakeholders making the systems accountable for automated actions. In every facet of the forthcoming B5G era, including B5G technologies such as RAN, zero-touch network management, E2E slicing, this survey emphasizes the role of XAI in them and the use cases that the general users would ultimately enjoy. Furthermore, we presented the lessons learned from recent efforts and future research directions on top of the currently conducted projects involving XAI.