With the advancement of IoT technology, many electronic devices are interconnected through networks, communicating with each other and performing specific roles. However, as numerous devices join networks, the threat of cyberattacks also escalates. Preventing and detecting cyber threats are crucial, and one method of preventing such threats involves using attack graphs. Attack graphs are widely used to assess security threats within networks. However, a drawback emerges as the network scales, as generating attack graphs becomes time-consuming. To overcome this limitation, artificial intelligence models can be employed. By utilizing AI models, attack graphs can be created within a short period, approximating optimal outcomes. AI models designed for attack graph generation consist of encoders and decoders, trained using reinforcement learning algorithms. After training the AI models, we confirmed the model's learning effectiveness by observing changes in loss and reward values. Additionally, we compared attack graphs generated by the AI model with those created through conventional methods.
We revisit existing linear computation coding (LCC) algorithms, and introduce a new framework that measures the computational cost of computing multidimensional linear functions, not only in terms of the number of additions, but also with respect to their suitability for parallel processing. Utilizing directed acyclic graphs, which correspond to signal flow graphs in hardware, we propose a novel LCC algorithm that controls the trade-off between the total number of operations and their parallel executability. Numerical evaluations show that the proposed algorithm, constrained to a fully parallel structure, outperforms existing schemes.
Sonification is a data visualization technique which expresses data attributes via psychoacoustic parameters, which are non-speech audio signals used to convey information. This paper investigates the binary estimation of cognitive load induced by psychoacoustic parameters conveying the focus level of an astronomical image via Electroencephalogram (EEG) embeddings. Employing machine learning and deep learning methodologies, we demonstrate that EEG signals are reliable for (a) binary estimation of cognitive load, (b) isolating easy vs difficult visual-to-auditory perceptual mappings, and (c) capturing perceptual similarities among psychoacoustic parameters. Our key findings reveal that (1) EEG embeddings can reliably measure cognitive load, achieving a peak F1-score of 0.98; (2) Extreme focus levels are easier to detect via auditory mappings than intermediate ones, and (3) psychoacoustic parameters inducing comparable cognitive load levels tend to generate similar EEG encodings.
In recent years, finding an effective and efficient strategy for exploiting spatial and temporal information has been a hot research topic in video saliency prediction (VSP). With the emergence of spatio-temporal transformers, the weakness of the prior strategies, e.g., 3D convolutional networks and LSTM-based networks, for capturing long-range dependencies has been effectively compensated. While VSP has drawn benefits from spatio-temporal transformers, finding the most effective way for aggregating temporal features is still challenging. To address this concern, we propose a transformer-based video saliency prediction approach with high temporal dimension decoding network (THTD-Net). This strategy accounts for the lack of complex hierarchical interactions between features that are extracted from the transformer-based spatio-temporal encoder: in particular, it does not require multiple decoders and aims at gradually reducing temporal features' dimensions in the decoder. This decoder-based architecture yields comparable performance to multi-branch and over-complicated models on common benchmarks such as DHF1K, UCF-sports and Hollywood-2.
Cyber-physical systems (CPS) offer immense optimization potential for manufacturing processes through the availability of multivariate time series data of actors and sensors. Based on automated analysis software, the deployment of adaptive and responsive measures is possible for time series data. Due to the complex and dynamic nature of modern manufacturing, analysis and modeling often cannot be entirely automated. Even machine- or deep learning approaches often depend on a priori expert knowledge and labelling. In this paper, an information-based data preprocessing approach is proposed. By applying statistical methods including variance and correlation analysis, an approximation of the sampling rate in event-based systems and the utilization of spectral analysis, knowledge about the underlying manufacturing processes can be gained prior to modeling. The paper presents, how statistical analysis enables the pruning of a dataset's least important features and how the sampling rate approximation approach sets the base for further data analysis and modeling. The data's underlying periodicity, originating from the cyclic nature of an automated manufacturing process, will be detected by utilizing the fast Fourier transform. This information-based preprocessing method will then be validated for process time series data of cyber-physical systems' programmable logic controllers (PLC).
To address intricate real-world tasks, there has been a rising interest in tool utilization in applications of large language models (LLMs). To develop LLM-based agents, it usually requires LLMs to understand many tool functions from different tool documentation. But these documentations could be diverse, redundant or incomplete, which immensely affects the capability of LLMs in using tools. To solve this, we introduce EASYTOOL, a framework transforming diverse and lengthy tool documentation into a unified and concise tool instruction for easier tool usage. EasyTool purifies essential information from extensive tool documentation of different sources, and elaborates a unified interface (i.e., tool instruction) to offer standardized tool descriptions and functionalities for LLM-based agents. Extensive experiments on multiple different tasks demonstrate that EasyTool can significantly reduce token consumption and improve the performance of tool utilization in real-world scenarios. Our code will be available at \url{//github.com/microsoft/JARVIS/} in the future.
The new era of technology has brought us to the point where it is convenient for people to share their opinions over an abundance of platforms. These platforms have a provision for the users to express themselves in multiple forms of representations, including text, images, videos, and audio. This, however, makes it difficult for users to obtain all the key information about a topic, making the task of automatic multi-modal summarization (MMS) essential. In this paper, we present a comprehensive survey of the existing research in the area of MMS.
Search in social networks such as Facebook poses different challenges than in classical web search: besides the query text, it is important to take into account the searcher's context to provide relevant results. Their social graph is an integral part of this context and is a unique aspect of Facebook search. While embedding-based retrieval (EBR) has been applied in eb search engines for years, Facebook search was still mainly based on a Boolean matching model. In this paper, we discuss the techniques for applying EBR to a Facebook Search system. We introduce the unified embedding framework developed to model semantic embeddings for personalized search, and the system to serve embedding-based retrieval in a typical search system based on an inverted index. We discuss various tricks and experiences on end-to-end optimization of the whole system, including ANN parameter tuning and full-stack optimization. Finally, we present our progress on two selected advanced topics about modeling. We evaluated EBR on verticals for Facebook Search with significant metrics gains observed in online A/B experiments. We believe this paper will provide useful insights and experiences to help people on developing embedding-based retrieval systems in search engines.
Recommender systems are widely used in big information-based companies such as Google, Twitter, LinkedIn, and Netflix. A recommender system deals with the problem of information overload by filtering important information fragments according to users' preferences. In light of the increasing success of deep learning, recent studies have proved the benefits of using deep learning in various recommendation tasks. However, most proposed techniques only aim to target individuals, which cannot be efficiently applied in group recommendation. In this paper, we propose a deep learning architecture to solve the group recommendation problem. On the one hand, as different individual preferences in a group necessitate preference trade-offs in making group recommendations, it is essential that the recommendation model can discover substitutes among user behaviors. On the other hand, it has been observed that a user as an individual and as a group member behaves differently. To tackle such problems, we propose using an attention mechanism to capture the impact of each user in a group. Specifically, our model automatically learns the influence weight of each user in a group and recommends items to the group based on its members' weighted preferences. We conduct extensive experiments on four datasets. Our model significantly outperforms baseline methods and shows promising results in applying deep learning to the group recommendation problem.
Recently, ensemble has been applied to deep metric learning to yield state-of-the-art results. Deep metric learning aims to learn deep neural networks for feature embeddings, distances of which satisfy given constraint. In deep metric learning, ensemble takes average of distances learned by multiple learners. As one important aspect of ensemble, the learners should be diverse in their feature embeddings. To this end, we propose an attention-based ensemble, which uses multiple attention masks, so that each learner can attend to different parts of the object. We also propose a divergence loss, which encourages diversity among the learners. The proposed method is applied to the standard benchmarks of deep metric learning and experimental results show that it outperforms the state-of-the-art methods by a significant margin on image retrieval tasks.
Recommender systems play a crucial role in mitigating the problem of information overload by suggesting users' personalized items or services. The vast majority of traditional recommender systems consider the recommendation procedure as a static process and make recommendations following a fixed strategy. In this paper, we propose a novel recommender system with the capability of continuously improving its strategies during the interactions with users. We model the sequential interactions between users and a recommender system as a Markov Decision Process (MDP) and leverage Reinforcement Learning (RL) to automatically learn the optimal strategies via recommending trial-and-error items and receiving reinforcements of these items from users' feedbacks. In particular, we introduce an online user-agent interacting environment simulator, which can pre-train and evaluate model parameters offline before applying the model online. Moreover, we validate the importance of list-wise recommendations during the interactions between users and agent, and develop a novel approach to incorporate them into the proposed framework LIRD for list-wide recommendations. The experimental results based on a real-world e-commerce dataset demonstrate the effectiveness of the proposed framework.