Enterprise Application Integration deals with the problem of connecting heterogeneous applications, and is the centerpiece of current on-premise, cloud and device integration scenarios. For integration scenarios, structurally correct composition of patterns into processes and improvements of integration processes are crucial. In order to achieve this, we formalize compositions of integration patterns based on their characteristics, and describe optimization strategies that help to reduce the model complexity, and improve the process execution efficiency using design time techniques. Using the formalism of timed DB-nets - a refinement of Petri nets - we model integration logic features such as control- and data flow, transactional data storage, compensation and exception handling, and time aspects that are present in reoccurring solutions as separate integration patterns. We then propose a realization of optimization strategies using graph rewriting, and prove that the optimizations we consider preserve both structural and functional correctness. We evaluate the improvements on a real-world catalog of pattern compositions, containing over 900 integration processes, and illustrate the correctness properties in case studies based on two of these processes.
In this paper, we investigate the convergence properties of the stochastic gradient descent (SGD) method and its variants, especially in training neural networks built from nonsmooth activation functions. We develop a novel framework that assigns different timescales to stepsizes for updating the momentum terms and variables, respectively. Under mild conditions, we prove the global convergence of our proposed framework in both single-timescale and two-timescale cases. We show that our proposed framework encompasses a wide range of well-known SGD-type methods, including heavy-ball SGD, SignSGD, Lion, normalized SGD and clipped SGD. Furthermore, when the objective function adopts a finite-sum formulation, we prove the convergence properties for these SGD-type methods based on our proposed framework. In particular, we prove that these SGD-type methods find the Clarke stationary points of the objective function with randomly chosen stepsizes and initial points under mild assumptions. Preliminary numerical experiments demonstrate the high efficiency of our analyzed SGD-type methods.
In recent years, self-supervised learning (SSL) has emerged as a promising approach for extracting valuable representations from unlabeled data. One successful SSL method is contrastive learning, which aims to bring positive examples closer while pushing negative examples apart. Many current contrastive learning approaches utilize a parameterized projection head. Through a combination of empirical analysis and theoretical investigation, we provide insights into the internal mechanisms of the projection head and its relationship with the phenomenon of dimensional collapse. Our findings demonstrate that the projection head enhances the quality of representations by performing contrastive loss in a projected subspace. Therefore, we propose an assumption that only a subset of features is necessary when minimizing the contrastive loss of a mini-batch of data. Theoretical analysis further suggests that a sparse projection head can enhance generalization, leading us to introduce SparseHead - a regularization term that effectively constrains the sparsity of the projection head, and can be seamlessly integrated with any self-supervised learning (SSL) approaches. Our experimental results validate the effectiveness of SparseHead, demonstrating its ability to improve the performance of existing contrastive methods.
Machine learning has affected the way in which many phenomena for various domains are modelled, one of these domains being that of structural dynamics. However, because machine-learning algorithms are problem-specific, they often fail to perform efficiently in cases of data scarcity. To deal with such issues, combination of physics-based approaches and machine learning algorithms have been developed. Although such methods are effective, they also require the analyser's understanding of the underlying physics of the problem. The current work is aimed at motivating the use of models which learn such relationships from a population of phenomena, whose underlying physics are similar. The development of such models is motivated by the way that physics-based models, and more specifically finite element models, work. Such models are considered transferrable, explainable and trustworthy, attributes which are not trivially imposed or achieved for machine-learning models. For this reason, machine-learning approaches are less trusted by industry and often considered more difficult to form validated models. To achieve such data-driven models, a population-based scheme is followed here and two different machine-learning algorithms from the meta-learning domain are used. The two algorithms are the model-agnostic meta-learning (MAML) algorithm and the conditional neural processes (CNP) model. The algorithms seem to perform as intended and outperform a traditional machine-learning algorithm at approximating the quantities of interest. Moreover, they exhibit behaviour similar to traditional machine learning algorithms (e.g. neural networks or Gaussian processes), concerning their performance as a function of the available structures in the training population.
The scores of distance-based outlier detection methods are difficult to interpret, making it challenging to determine a cut-off threshold between normal and outlier data points without additional context. We describe a generic transformation of distance-based outlier scores into interpretable, probabilistic estimates. The transformation is ranking-stable and increases the contrast between normal and outlier data points. Determining distance relationships between data points is necessary to identify the nearest-neighbor relationships in the data, yet, most of the computed distances are typically discarded. We show that the distances to other data points can be used to model distance probability distributions and, subsequently, use the distributions to turn distance-based outlier scores into outlier probabilities. Our experiments show that the probabilistic transformation does not impact detection performance over numerous tabular and image benchmark datasets but results in interpretable outlier scores with increased contrast between normal and outlier samples. Our work generalizes to a wide range of distance-based outlier detection methods, and because existing distance computations are used, it adds no significant computational overhead.
Model-driven software engineering is a suitable method for dealing with the ever-increasing complexity of software development processes. Graphs and graph transformations have proven useful for representing such models and changes to them. These models must satisfy certain sets of constraints. An example are the multiplicities of a class structure. During the development process, a change to a model may result in an inconsistent model that must at some point be repaired. This problem is called model repair. In particular, we will consider rule-based graph repair which is defined as follows: Given a graph $G$, a constraint $c$ such that $G$ does not satisfy $c$, and a set of rules $R$, use the rules of $\mathcal{R}$ to transform $G$ into a graph that satisfies $c$. Known notions of consistency have either viewed consistency as a binary property, either a graph is consistent w.r.t. a constraint $c$ or not, or only viewed the number of violations of the first graph of a constraint. In this thesis, we introduce new notions of consistency, which we call consistency-maintaining and consistency-increasing transformations and rules, respectively. This is based on the possibility that a constraint can be satisfied up to a certain nesting level. We present constructions for direct consistency-maintaining or direct consistency-increasing application conditions, respectively. Finally, we present an rule-based graph repair approach that is able to repair so-called \emph{circular conflict-free constraints}, and so-called circular conflict-free sets of constraints. Intuitively, a set of constraint $C$ is circular conflict free, if there is an ordering $c_1, \ldots, c_n$ of all constraints of $C$ such that there is no $j <i$ such that a repair of $c_i$ at all graphs satisfying $c_j$ leads to a graph not satisfying $c_j$.
The last two decades have seen considerable progress in foundational aspects of statistical network analysis, but the path from theory to application is not straightforward. Two large, heterogeneous samples of small networks of within-household contacts in Belgium were collected using two different but complementary sampling designs: one smaller but with all contacts in each household observed, the other larger and more representative but recording contacts of only one person per household. We wish to combine their strengths to learn the social forces that shape household contact formation and facilitate simulation for prediction of disease spread, while generalising to the population of households in the region. To accomplish this, we describe a flexible framework for specifying multi-network models in the exponential family class and identify the requirements for inference and prediction under this framework to be consistent, identifiable, and generalisable, even when data are incomplete; explore how these requirements may be violated in practice; and develop a suite of quantitative and graphical diagnostics for detecting violations and suggesting improvements to candidate models. We report on the effects of network size, geography, and household roles on household contact patterns (activity, heterogeneity in activity, and triadic closure).
Graph theory and enumerative combinatorics are two branches of mathematical sciences that have developed astonishingly over the past one hundred years. It is especially important to point out that graph theory employs combinatorial techniques to solve key problems of characterization, construction, enumeration and classification of an enormous set of different families of graphs. This paper describes the construction of two classes of bigeodetic blocks using balanced incomplete block designs (BIBDs). On the other hand, even though graph theory and combinatorics have a close relationship, the opposite problem, that is, considering certain graph constructions when solving problems of combinatorics is not common, but possible. The construction of the second class of bigeodetic blocks described in this paper represents an example of how graph theory could somehow give a clue to the description of a problem of existence in combinatorics. We refer to the problem of existence for biplanes. A connection between the mentioned construction, the Bruck-Ryser-Chowla theorem and the problem of existence for biplanes is considered.
In recent decades, a number of ways of dealing with causality in practice, such as propensity score matching, the PC algorithm and invariant causal prediction, have been introduced. Besides its interpretational appeal, the causal model provides the best out-of-sample prediction guarantees. In this paper, we study the identification of causal-like models from in-sample data that provide out-of-sample risk guarantees when predicting a target variable from a set of covariates. Whereas ordinary least squares provides the best in-sample risk with limited out-of-sample guarantees, causal models have the best out-of-sample guarantees but achieve an inferior in-sample risk. By defining a trade-off of these properties, we introduce $\textit{causal regularization}$. As the regularization is increased, it provides estimators whose risk is more stable across sub-samples at the cost of increasing their overall in-sample risk. The increased risk stability is shown to lead to out-of-sample risk guarantees. We provide finite sample risk bounds for all models and prove the adequacy of cross-validation for attaining these bounds.
Despite its great success, machine learning can have its limits when dealing with insufficient training data. A potential solution is the additional integration of prior knowledge into the training process which leads to the notion of informed machine learning. In this paper, we present a structured overview of various approaches in this field. We provide a definition and propose a concept for informed machine learning which illustrates its building blocks and distinguishes it from conventional machine learning. We introduce a taxonomy that serves as a classification framework for informed machine learning approaches. It considers the source of knowledge, its representation, and its integration into the machine learning pipeline. Based on this taxonomy, we survey related research and describe how different knowledge representations such as algebraic equations, logic rules, or simulation results can be used in learning systems. This evaluation of numerous papers on the basis of our taxonomy uncovers key methods in the field of informed machine learning.
This paper focuses on the expected difference in borrower's repayment when there is a change in the lender's credit decisions. Classical estimators overlook the confounding effects and hence the estimation error can be magnificent. As such, we propose another approach to construct the estimators such that the error can be greatly reduced. The proposed estimators are shown to be unbiased, consistent, and robust through a combination of theoretical analysis and numerical testing. Moreover, we compare the power of estimating the causal quantities between the classical estimators and the proposed estimators. The comparison is tested across a wide range of models, including linear regression models, tree-based models, and neural network-based models, under different simulated datasets that exhibit different levels of causality, different degrees of nonlinearity, and different distributional properties. Most importantly, we apply our approaches to a large observational dataset provided by a global technology firm that operates in both the e-commerce and the lending business. We find that the relative reduction of estimation error is strikingly substantial if the causal effects are accounted for correctly.