To counter the evolution and continued development of malware, the specific methodologies applied by malware analysts are crucial. Yet this is something often overlooked in the relevant literature and in the formal and informal training of the relevant professionals. There are only two generic and all-encompassing structured methodologies for Malware Analysis (MA): SAMA and MARE. The question is whether they are adequate, so that no further methodology is needed, or whether a new one is required. This paper tries to answer that question and contributes in the following ways: it presents, compares and dissects these two malware analysis methodologies; it demonstrates their capacity for analysing modern malware by applying them to a randomly selected modern specimen; and it concludes on whether the evolution from one methodology to the other represents a procedural optimization of malware analysis.
We present a method for calculating and analyzing stakeholder utilities of processes that arise in, but are not limited to, the social sciences. These areas include business process analysis, healthcare workflow analysis and policy process analysis. This method is quite general and applicable to any situation in which declarative-type constraints of a modal and/or temporal nature play a part. A declarative process is a process in which activities may freely happen while respecting a set of constraints. For such a process, anything may happen so long as it is not explicitly forbidden. Declarative processes have been used and studied as models of business and healthcare workflows by several authors. In considering a declarative process as a model of some system, it is natural to consider how the process behaves with respect to stakeholders. We derive a measure for stakeholder utility that can be applied in a very general setting. This derivation is achieved by listing a collection of properties which we argue such a stakeholder utility function ought to satisfy, and then using these to show that a very specific form must hold for such a utility. The utility measure depends on the set of unique traces of the declarative process, and calculating this set requires a combinatorial analysis of the declarative graph that represents the process. This builds on previous work of the author in which combinatorial diversity metrics for declarative processes were derived for use in policy process analysis. The collection of stakeholder utilities can itself then be used to form a metric with which we can compare different declarative processes to one another. These ideas are illustrated using several examples of declarative processes that already exist in the literature.
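As a rough illustration of the kind of computation involved, the sketch below uses an invented toy constraint set and a placeholder utility (the paper derives the actual axiomatic form): it enumerates the admissible traces of a small declarative process and aggregates a stakeholder's activity valuation over them.

```python
from itertools import product

# Hypothetical illustration (not the paper's exact definitions): enumerate the
# bounded-length traces of a small declarative process and aggregate a toy
# stakeholder utility over them.

ACTIVITIES = ["a", "b", "c"]

def satisfies(trace):
    """Toy constraint set: 'b' may only occur after 'a' (precedence),
    and 'c' may occur at most once."""
    if "b" in trace and ("a" not in trace or trace.index("a") > trace.index("b")):
        return False
    return trace.count("c") <= 1

def unique_traces(max_len):
    """All admissible traces up to max_len: anything not forbidden may happen."""
    traces = [()]  # the empty trace is always admissible
    for n in range(1, max_len + 1):
        traces.extend(t for t in product(ACTIVITIES, repeat=n) if satisfies(t))
    return traces

def stakeholder_utility(traces, activity_value):
    """Placeholder utility: mean per-trace payoff under a stakeholder's
    activity valuation (the paper derives a specific axiomatic form)."""
    payoffs = [sum(activity_value.get(x, 0.0) for x in t) for t in traces]
    return sum(payoffs) / len(payoffs)

traces = unique_traces(max_len=3)
print(len(traces), "unique traces")
print(stakeholder_utility(traces, {"a": 1.0, "b": 2.0, "c": -0.5}))
```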
Digital technologies that help us take care of our dogs are becoming more widespread. Yet little research explores what the role of technology in the human-dog relationship should be. We conducted a mixed-method study incorporating quantitative and qualitative thematic analysis of 155 UK dog owners reflecting on their daily routines and technology's role in them, disentangling the what-where-why of interspecies routines and activities, technological desires, and rationales for technological support across common human-dog activities. We found that increasingly entangled daily routines lead to close multi-species households in which dog owners conceptualize technology as having a role in supporting them in giving care to their dogs. When confronted with the role of technology across various activities, only chores like cleaning up after our dogs elicit largely positive considerations, while activities that benefit us, like walking together, elicit largely negative considerations. For other activities, whether playing, training, or feeding, attitudes remain diverse. In general, across all activities two scenarios arise: a nightmare scenario of technology taking over the human's role and in doing so disentangling the human-dog bond, and a dream scenario of technology augmenting our abilities. We argue that the current trajectory of digital technology for pets, towards allowing dog owners to interact with their dogs while away (feeding, playing, and so on), is an example of this nightmare scenario becoming reality, and that it is important to redirect this trajectory to one of technology predominantly supporting us in becoming better and more informed caregivers.
Learning a graph topology to reveal the underlying relationship between data entities plays an important role in various machine learning and data analysis tasks. Under the assumption that structured data vary smoothly over a graph, the problem can be formulated as a regularised convex optimisation over a positive semidefinite cone and solved by iterative algorithms. Classic methods require an explicit convex function to reflect generic topological priors, e.g. the $\ell_1$ penalty for enforcing sparsity, which limits the flexibility and expressiveness in learning rich topological structures. We propose to learn a mapping from node data to the graph structure based on the idea of learning to optimise (L2O). Specifically, our model first unrolls an iterative primal-dual splitting algorithm into a neural network. The key structural proximal projection is replaced with a variational autoencoder that refines the estimated graph with enhanced topological properties. The model is trained in an end-to-end fashion with pairs of node data and graph samples. Experiments on both synthetic and real-world data demonstrate that our model is more efficient than classic iterative algorithms in learning a graph with specific topological properties.
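A minimal sketch of the unrolling idea, assuming a PyTorch-style implementation and not the authors' exact architecture: a few primal-dual-style iterations on candidate edge weights are unrolled into a network, with the structural proximal step replaced by a small learned module standing in for the variational autoencoder described above. The objective surrogate, module sizes, and data shapes are invented for illustration.

```python
import torch
import torch.nn as nn

class LearnedProx(nn.Module):
    """Learned stand-in for the structural proximal projection."""
    def __init__(self, n_edges, hidden=64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(n_edges, hidden), nn.ReLU(),
            nn.Linear(hidden, n_edges), nn.Softplus(),  # keep edge weights non-negative
        )

    def forward(self, w):
        return self.net(w)

class UnrolledGraphLearner(nn.Module):
    def __init__(self, n_edges, n_iters=10):
        super().__init__()
        self.prox = LearnedProx(n_edges)
        self.step = nn.Parameter(torch.tensor(0.1))   # learned step size
        self.n_iters = n_iters

    def forward(self, z):
        # z: vector of pairwise distances between node signals (one entry per candidate edge)
        w = torch.zeros_like(z)
        for _ in range(self.n_iters):
            grad = 2.0 * z + 0.2 * w                  # gradient of a smoothness + l2 surrogate
            w = self.prox(w - self.step * grad)       # learned structural "projection"
        return w                                      # estimated edge weights

# End-to-end training signal from (node data, graph) pairs: 5 nodes -> 10 candidate edges.
model = UnrolledGraphLearner(n_edges=10)
z = torch.rand(4, 10)                                 # batch of 4 distance vectors
loss = nn.functional.mse_loss(model(z), torch.rand(4, 10))
loss.backward()
```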
With the advances of data-driven machine learning research, a wide variety of prediction problems have been tackled. It has become critical to explore how machine learning and, specifically, deep learning methods can be exploited to analyse healthcare data. A major limitation of existing methods has been the focus on grid-like data; however, physiological recordings often have an irregular and unordered structure that makes it difficult to conceptualise them as a matrix. As such, graph neural networks have attracted significant attention by exploiting implicit information that resides in a biological system, with interacting nodes connected by edges whose weights can be either temporal associations or anatomical junctions. In this survey, we thoroughly review the different types of graph architectures and their applications in healthcare. We provide an overview of these methods in a systematic manner, organized by their domain of application, including functional connectivity, anatomical structure and electrical-based analysis. We also outline the limitations of existing techniques and discuss potential directions for future research.
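For concreteness, one common way such a graph can be built from physiological recordings (an illustrative recipe, not tied to any particular method in the survey) is to treat channels as nodes and use a temporal association such as Pearson correlation as the edge weight:

```python
import numpy as np

# Illustrative only: building a functional-connectivity graph from
# multi-channel recordings (e.g., EEG). Nodes are channels; edge weights are
# the absolute Pearson correlation between their signals.

rng = np.random.default_rng(0)
signals = rng.standard_normal((32, 1000))     # 32 channels, 1000 time samples

adjacency = np.abs(np.corrcoef(signals))      # 32 x 32 connectivity matrix
np.fill_diagonal(adjacency, 0.0)              # remove self-loops

# Sparsify: keep the strongest 10% of connections per node, then symmetrise.
threshold = np.quantile(adjacency, 0.9, axis=1, keepdims=True)
adjacency = np.where(adjacency >= threshold, adjacency, 0.0)
adjacency = np.maximum(adjacency, adjacency.T)

print("graph with", int((adjacency > 0).sum()) // 2, "undirected edges")
```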
Deep learning has enabled a wide range of applications and has become increasingly popular in recent years. The goal of multimodal deep learning is to create models that can process and link information from various modalities. Despite the extensive development of unimodal learning, it still cannot cover all the aspects of human learning. Multimodal learning helps to understand and analyze better when various senses are engaged in the processing of information. This paper focuses on multiple types of modalities, i.e., image, video, text, audio, body gestures, facial expressions, and physiological signals. A detailed analysis of past and current baseline approaches and an in-depth study of recent advancements in multimodal deep learning applications are provided. A fine-grained taxonomy of various multimodal deep learning applications is proposed, elaborating on different applications in more depth. Architectures and datasets used in these applications are also discussed, along with their evaluation metrics. Finally, the main issues are highlighted separately for each domain, along with possible future research directions.
The aim of this paper is to offer the first systematic exploration and definition of equivalent causal models in the context where the two models are not made up of the same variables. The idea is that two models are equivalent when they agree on all "essential" causal information that can be expressed using their common variables. I do so by focusing on the two main features of causal models, namely their structural relations and their functional relations. In particular, I define several relations of causal ancestry and several relations of causal sufficiency, and require that the most general of these relations are preserved across equivalent models.
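Schematically, and using illustrative notation rather than the paper's own definitions, the preservation requirement can be rendered along the following lines, where $V$ is the set of variables common to the two models:

```latex
% Schematic rendering of the preservation condition (notation is illustrative,
% not the paper's): M_1 and M_2 are causal models with common variables V.
\[
  M_1 \equiv M_2
  \quad\text{iff}\quad
  \forall X, Y \in V:\;
  \big( X \rightsquigarrow_{M_1} Y \iff X \rightsquigarrow_{M_2} Y \big)
  \;\wedge\;
  \big( \mathrm{Suf}_{M_1}(X, Y) \iff \mathrm{Suf}_{M_2}(X, Y) \big),
\]
% where $X \rightsquigarrow_{M} Y$ stands for the chosen (most general) relation of
% causal ancestry in $M$ restricted to $V$, and $\mathrm{Suf}_{M}$ for the
% corresponding relation of causal sufficiency.
```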
Since deep neural networks were developed, they have made huge contributions to everyday life. Machine learning provides more rational advice than humans are capable of in almost every aspect of daily life. However, despite this achievement, the design and training of neural networks are still challenging and unpredictable procedures. To lower the technical threshold for common users, automated hyper-parameter optimization (HPO) has become a popular topic in both academic and industrial areas. This paper provides a review of the most essential topics of HPO. The first section introduces the key hyper-parameters related to model training and structure, and discusses their importance and methods for defining their value ranges. Then, the research focuses on major optimization algorithms and their applicability, covering their efficiency and accuracy, especially for deep learning networks. This study next reviews major services and toolkits for HPO, comparing their support for state-of-the-art searching algorithms, compatibility with major deep learning frameworks, and extensibility for new modules designed by users. The paper concludes with the problems that exist when HPO is applied to deep learning, a comparison between optimization algorithms, and prominent approaches for model evaluation with limited computational resources.
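To make the setting concrete, the sketch below shows one of the simplest search strategies such a review covers: random search over hand-defined value ranges. The search space, trial budget, and placeholder evaluation function are invented for illustration.

```python
import random

# Minimal illustration of one simple HPO strategy: random search over a
# hand-defined value range for each hyper-parameter.

search_space = {
    "learning_rate": lambda: 10 ** random.uniform(-5, -1),   # log-uniform range
    "batch_size":    lambda: random.choice([16, 32, 64, 128]),
    "num_layers":    lambda: random.randint(2, 8),
}

def sample_config():
    return {name: draw() for name, draw in search_space.items()}

def evaluate(config):
    """Placeholder for training a model and returning validation accuracy."""
    return random.random()  # replace with a real training/validation run

best_config, best_score = None, float("-inf")
for _ in range(20):                      # fixed trial budget
    config = sample_config()
    score = evaluate(config)
    if score > best_score:
        best_config, best_score = config, score

print("best configuration:", best_config, "score:", round(best_score, 3))
```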
Many learning tasks require dealing with graph data, which contains rich relational information among its elements. Modeling physics systems, learning molecular fingerprints, predicting protein interfaces, and classifying diseases all require a model that learns from graph inputs. In other domains, such as learning from non-structural data like texts and images, reasoning over extracted structures, like the dependency trees of sentences and the scene graphs of images, is an important research topic which also needs graph reasoning models. Graph neural networks (GNNs) are connectionist models that capture the dependence of graphs via message passing between the nodes of a graph. Unlike standard neural networks, graph neural networks retain a state that can represent information from a node's neighborhood with arbitrary depth. Although primitive GNNs were found difficult to train to a fixed point, recent advances in network architectures, optimization techniques, and parallel computation have enabled successful learning with them. In recent years, systems based on variants of graph neural networks such as the graph convolutional network (GCN), graph attention network (GAT), and gated graph neural network (GGNN) have demonstrated ground-breaking performance on many of the tasks mentioned above. In this survey, we provide a detailed review of existing graph neural network models, systematically categorize their applications, and propose four open problems for future research.
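The core message-passing idea can be sketched in a few lines (a generic GCN-style layer on a toy graph, not any specific model from the survey): each node's state is updated by aggregating its neighbours' states, and stacking layers propagates information over larger neighbourhoods.

```python
import numpy as np

# Toy illustration of message passing: a 4-node graph, GCN-style propagation.

A = np.array([[0, 1, 0, 0],              # adjacency matrix of the toy graph
              [1, 0, 1, 1],
              [0, 1, 0, 1],
              [0, 1, 1, 0]], dtype=float)
H = np.eye(4)                             # initial node states (one-hot features)
W = np.random.default_rng(0).standard_normal((4, 4)) * 0.1  # learnable weights

A_hat = A + np.eye(4)                     # add self-loops
D_inv = np.diag(1.0 / A_hat.sum(axis=1))  # degree normalisation

def message_passing_layer(H, W):
    """One GCN-style layer: normalised neighbourhood aggregation + transform + ReLU."""
    return np.maximum(D_inv @ A_hat @ H @ W, 0.0)

H = message_passing_layer(H, W)           # after one layer: 1-hop information
H = message_passing_layer(H, W)           # after two layers: 2-hop information
print(H.shape)
```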
The task of Question Answering has gained prominence in the past few decades for testing the ability of machines to understand natural language. Large datasets for Machine Reading have led to the development of neural models that cater to deeper language understanding compared to information retrieval tasks. Different components in these neural architectures are intended to tackle different challenges. As a first step towards achieving generalization across multiple domains, we attempt to understand and compare the peculiarities of existing end-to-end neural models on the Stanford Question Answering Dataset (SQuAD) by performing quantitative as well as qualitative analysis of the results attained by each of them. We observed that prediction errors reflect certain model-specific biases, which we further discuss in this paper.
We study how to generate captions that are not only accurate in describing an image but also discriminative across different images. The problem is both fundamental and interesting, as most machine-generated captions, despite phenomenal research progress in the past several years, are expressed in a very monotonic and featureless format. While such captions are normally accurate, they often lack important characteristics of human language: distinctiveness for each caption and diversity across different images. To address this problem, we propose a novel conditional generative adversarial network for generating diverse captions across images. Instead of estimating the quality of a caption against a single image, the proposed comparative adversarial learning framework better assesses the quality of captions by comparing a set of captions within the image-caption joint space. By contrasting with human-written captions and image-mismatched captions, the caption generator effectively exploits the inherent characteristics of human language and generates more discriminative captions. We show that our proposed network is capable of producing accurate and diverse captions across images.
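The comparative scoring idea can be illustrated as follows; this is a hedged sketch with invented embedding dimensions and a simple cosine-similarity softmax, not the authors' exact discriminator. A candidate caption is scored relative to a set of reference captions (human-written or image-mismatched) in a joint image-caption space.

```python
import torch
import torch.nn.functional as F

def comparative_score(image_emb, candidate_emb, reference_embs, temperature=0.1):
    """Probability that the candidate is the best match for the image among the
    candidate plus reference captions, via a softmax over joint-space similarities."""
    all_caps = torch.cat([candidate_emb.unsqueeze(0), reference_embs], dim=0)
    sims = F.cosine_similarity(all_caps, image_emb.unsqueeze(0), dim=1)
    probs = F.softmax(sims / temperature, dim=0)
    return probs[0]   # comparative quality of the candidate caption

# Toy usage with random 128-dimensional embeddings.
image_emb = torch.randn(128)
candidate = torch.randn(128)
references = torch.randn(5, 128)   # e.g., human-written and image-mismatched captions
print(comparative_score(image_emb, candidate, references))
```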