Hyperauthorship, a phenomenon in which a single paper carries a disproportionately large number of authors, is increasingly common in several scientific disciplines, but its consequences for the network metrics used to study scientific collaboration are unknown, and it calls into question the validity of co-authorship as a proxy for scientific collaboration. Using bibliometric data from publications in the field of genomics, we examine the impact of hyperauthorship on metrics of scientific collaboration, propose a method to determine a suitable cutoff threshold for hyperauthored papers, and compare co-authorship networks with and without hyperauthored works. Our analysis reveals that including hyperauthored papers dramatically alters the structural positioning of central authors and the topological characteristics of the network, while having only a minor effect on whole-network cohesion measures. We present two solutions to mitigate the impact of hyperauthorship: excluding hyperauthored papers using a mathematically grounded and reproducible cutoff threshold, or applying fractional counting to weight network results. Our findings confirm the structural influence of hyperauthored papers and suggest that scholars should be mindful when using co-authorship networks to study scientific collaboration.
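As a minimal illustration of the fractional-counting option mentioned above (a sketch only; the toy corpus, author names, and weighting convention below are hypothetical, and the exact weighting used in the study may differ), each paper's unit weight can be split over its co-author pairs so that a hyperauthored paper no longer dominates the network:

```python
from collections import Counter
from itertools import combinations

# Hypothetical toy corpus: each paper is a list of author names.
papers = [
    ["A", "B", "C"],                        # small team
    ["A", "B"],                             # pair
    ["A"] + [f"X{i}" for i in range(100)],  # hyperauthored paper
]

full = Counter()        # full counting: every co-author pair gets weight 1 per paper
fractional = Counter()  # fractional counting: each paper contributes total weight 1

for authors in papers:
    pairs = list(combinations(sorted(set(authors)), 2))
    if not pairs:
        continue
    for pair in pairs:
        full[pair] += 1
        fractional[pair] += 1.0 / len(pairs)  # split the paper's unit weight over its pairs

# Total edge weight: the hyperauthored paper dominates under full counting,
# but contributes the same total weight as any other paper under fractional counting.
print(round(sum(full.values()), 2), round(sum(fractional.values()), 2))
```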
Validation metrics are key for the reliable tracking of scientific progress and for bridging the current chasm between artificial intelligence (AI) research and its translation into practice. However, increasing evidence shows that, particularly in image analysis, metrics are often chosen inadequately in relation to the underlying research problem. This could be attributed to a lack of accessibility of metric-related knowledge: while taking into account the individual strengths, weaknesses, and limitations of validation metrics is a critical prerequisite to making educated choices, the relevant knowledge is currently scattered and poorly accessible to individual researchers. Based on a multi-stage Delphi process conducted by a multidisciplinary expert consortium as well as extensive community feedback, the present work provides the first reliable and comprehensive common point of access to information on pitfalls related to validation metrics in image analysis. Focusing on biomedical image analysis but with the potential for transfer to other fields, the addressed pitfalls generalize across application domains and are categorized according to a newly created, domain-agnostic taxonomy. To facilitate comprehension, illustrations and specific examples accompany each pitfall. As a structured body of information accessible to researchers of all levels of expertise, this work enhances global comprehension of a key topic in image analysis validation.
Clustering of publication networks is an efficient way to obtain classifications of large collections of research publications. Such classifications can be used to, e.g., detect research topics, normalize citation relations, or explore the publication output of a unit. Citation networks can be created using a variety of approaches. Best practices to obtain classifications using clustering have been investigated, in particular the performance of different publication-publication relatedness measures. However, different approaches to the normalization of citation relations have not been evaluated to the same extent. In this paper, we evaluate five approaches to normalization of direct citation relations with respect to clustering solution quality in four data sets; a sixth approach, using no normalization, is also evaluated. To assess the quality of clustering solutions, we use three measures. (1) We compare the clustering solution to the reference lists of a set of publications using the Adjusted Rand Index. (2) Using the Silhouette width measure, we quantify the extent to which publications relate to clusters other than the one they have been assigned to. (3) We propose a measure that captures publications that have probably been inaccurately assigned. The results clearly show that normalization is preferred over unnormalized direct citation relations. Furthermore, the results indicate that the fractional normalization approach, which can be considered the standard approach, causes inaccurate assignments. The geometric normalization approach performs similarly to the fractional approach with respect to the Adjusted Rand Index and Silhouette width, but leads to fewer inaccurate assignments. We therefore believe that the geometric approach may be preferred over the fractional approach.
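For concreteness, here is a small numerical sketch of two of the normalization variants discussed above, in illustrative forms (the exact definitions evaluated in the paper may differ): a fractional-style normalization that splits each publication's weight over its relations, and a geometric-style normalization that divides each relation by the geometric mean of the two publications' numbers of relations.

```python
import numpy as np

# Toy symmetric direct-citation relation matrix (a_ij = 1 if publications i and j
# are linked by a direct citation).
A = np.array([[0, 1, 1, 0],
              [1, 0, 1, 1],
              [1, 1, 0, 0],
              [0, 1, 0, 0]], dtype=float)

deg = A.sum(axis=1)  # number of relations per publication

# Illustrative fractional-style normalization: split each publication's unit
# weight over its relations, then symmetrize (one common convention, assumed here).
frac = (A / deg[:, None] + A / deg[None, :]) / 2.0

# Illustrative geometric-style normalization: divide each relation by the
# geometric mean of the two publications' numbers of relations.
geom = A / np.sqrt(np.outer(deg, deg))

print(np.round(frac, 3))
print(np.round(geom, 3))
```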
The main topic of this paper is algorithms for computing Nash equilibria. We cast our particular methods as instances of a general algorithmic abstraction, namely, a method we call {\em algorithmic boosting}, which is also relevant to other fixed-point computation problems. Algorithmic boosting is the principle of computing fixed points by taking (long-run) averages of iterated maps, and it is a generalization of exponentiation. We first define our method in the setting of nonlinear maps. Secondly, we restrict attention to convergent linear maps (for computing dominant eigenvectors, for example, in the PageRank algorithm) and show that our algorithmic boosting method can set in motion {\em exponential speedups in the convergence rate}. Thirdly, we show that algorithmic boosting can convert a (weak) non-convergent iterator into a (strong) convergent one. We also consider a {\em variational approach} to algorithmic boosting, providing tools to convert a non-convergent continuous flow into a convergent one. Then, by embedding the construction of averages in the design of the iterated map, we constructively prove the existence of Nash equilibria (and, therefore, of Brouwer fixed points). We then discuss implementations of averaging and exponentiation, an important matter even for the scalar case. We finally discuss a relationship between dominant (PageRank) eigenvectors and Nash equilibria.
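A minimal sketch of the averaging idea (not the authors' algorithm; the map and starting point are illustrative): plain iteration of a linear stochastic map can fail to converge, while the running average of the iterates converges to a fixed point of the averaged process.

```python
import numpy as np

# Column-stochastic map of a 2-cycle: plain iteration x_{t+1} = P x_t oscillates
# and never converges, but the running (Cesaro) average of the iterates does
# converge to the stationary distribution [0.5, 0.5].
P = np.array([[0.0, 1.0],
              [1.0, 0.0]])

x = np.array([1.0, 0.0])
avg = np.zeros(2)
for t in range(1, 1001):
    x = P @ x                 # iterated map (non-convergent here)
    avg += (x - avg) / t      # running average of the iterates

print(x)    # still oscillating between [1, 0] and [0, 1]
print(avg)  # close to the fixed point of the averaged process: ~[0.5, 0.5]
```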
Selfadhesivity is a property of entropic polymatroids which guarantees that the polymatroid can be glued to an identical copy of itself along arbitrary restrictions such that the two pieces are independent given the common restriction. We show that positive definite matrices satisfy this condition as well and examine consequences for Gaussian conditional independence structures. New axioms of Gaussian CI are obtained by applying selfadhesivity to the previously known axioms of structural semigraphoids and orientable gaussoids.
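As an illustration of the Gaussian analogue of the gluing (a sketch based on the standard conditional-independence extension of a positive definite matrix, assumed here to match the intended construction): let $\Sigma$ be positive definite over an index set $N$, let $L\subseteq N$ be the common restriction, and set $A=N\setminus L$. Gluing $\Sigma$ to a copy of itself along $L$ yields
\[
\widetilde\Sigma=\begin{pmatrix}
\Sigma_{AA} & \Sigma_{AL} & \Sigma_{AL}\Sigma_{LL}^{-1}\Sigma_{LA}\\
\Sigma_{LA} & \Sigma_{LL} & \Sigma_{LA}\\
\Sigma_{AL}\Sigma_{LL}^{-1}\Sigma_{LA} & \Sigma_{AL} & \Sigma_{AA}
\end{pmatrix},
\]
in which the two copies of $A$ are conditionally independent given $L$, since their conditional cross-covariance $\Sigma_{AL}\Sigma_{LL}^{-1}\Sigma_{LA}-\Sigma_{AL}\Sigma_{LL}^{-1}\Sigma_{LA}$ vanishes, and $\widetilde\Sigma$ is again positive definite.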
Discovering causal relationships from observational data is a fundamental yet challenging task. In some applications, it may suffice to learn the causal features of a given response variable, instead of learning the entire underlying causal structure. Invariant causal prediction (ICP, Peters et al., 2016) is a method for causal feature selection which requires data from heterogeneous settings. ICP assumes that the mechanism for generating the response from its direct causes is the same in all settings and exploits this invariance to output a subset of the causal features. The framework of ICP has been extended to general additive noise models and to nonparametric settings using conditional independence testing. However, nonparametric conditional independence testing often suffers from low power (or poor type I error control), and the aforementioned parametric models are not suitable for applications in which the response is not measured on a continuous scale, but rather reflects categories or counts. To bridge this gap, we develop ICP in the context of transformation models (TRAMs), allowing for continuous, categorical, count-type, and uninformatively censored responses (we show that, in general, these model classes do not allow for identifiability when there is no exogenous heterogeneity). We propose TRAM-GCM, a test for invariance of a subset of covariates based on the expected conditional covariance between environments and score residuals, which satisfies uniform asymptotic level guarantees. For the special case of linear shift TRAMs, we propose an additional invariance test, TRAM-Wald, based on the Wald statistic. We implement both proposed methods in the open-source R package "tramicp" and show in simulations that, under correct model specification, our approach empirically yields higher power than nonparametric ICP based on conditional independence testing.
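A heavily simplified numeric sketch of the GCM idea underlying the invariance test described above (using ordinary least-squares residuals in place of transformation-model score residuals; the data-generating process, sample size, and function name are illustrative, and this is not the tramicp implementation):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 500

# Hypothetical data: environment E, candidate covariate X, response Y.
E = rng.binomial(1, 0.5, n).astype(float)
X = rng.normal(size=n) + E          # X is affected by the environment
Y = 2.0 * X + rng.normal(size=n)    # Y depends on X only -> invariance should hold

def gcm_statistic(Y, X, E):
    """Simplified GCM-type statistic for testing invariance of Y | X across E."""
    Z = np.column_stack([np.ones_like(X), X])
    rY = Y - Z @ np.linalg.lstsq(Z, Y, rcond=None)[0]   # residual of Y given X
    rE = E - Z @ np.linalg.lstsq(Z, E, rcond=None)[0]   # residual of E given X
    prod = rY * rE
    return np.sqrt(len(prod)) * prod.mean() / prod.std(ddof=1)

print(gcm_statistic(Y, X, E))  # approximately standard normal under invariance
```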
In this paper, we propose a fully discrete soft thresholding trigonometric polynomial approximation on $[-\pi,\pi]$, named Lasso trigonometric interpolation. This approximation is an $\ell_1$-regularized discrete least squares approximation under the same conditions as classical trigonometric interpolation on an equidistant grid. Lasso trigonometric interpolation is sparse and, at the same time, an efficient tool for dealing with noisy data. We theoretically analyze Lasso trigonometric interpolation for continuous periodic functions. The principal results show that the $L_2$ error bound of Lasso trigonometric interpolation is smaller than that of classical trigonometric interpolation, which improves the robustness of trigonometric interpolation. This paper also presents numerical results on Lasso trigonometric interpolation on $[-\pi,\pi]$, with and without data errors.
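A minimal sketch of the soft-thresholding mechanism (not the paper's exact formulation): on an equidistant grid the trigonometric basis is orthogonal, so the $\ell_1$-regularized least squares solution reduces to soft thresholding of the classical interpolation coefficients; the grid size, threshold, and test function below are illustrative choices.

```python
import numpy as np

rng = np.random.default_rng(1)
N, lam = 64, 0.05
x = 2 * np.pi * np.arange(N) / N - np.pi       # equidistant grid on [-pi, pi)
f_clean = np.cos(3 * x) + 0.5 * np.sin(5 * x)  # periodic test function
f_noisy = f_clean + 0.3 * rng.normal(size=N)

c = np.fft.fft(f_noisy) / N                    # classical interpolation coefficients
mag = np.abs(c)
shrink = np.maximum(1.0 - lam / np.maximum(mag, 1e-15), 0.0)
c_lasso = shrink * c                           # soft thresholding of complex coefficients

recon_classic = np.real(np.fft.ifft(c * N))
recon_lasso = np.real(np.fft.ifft(c_lasso * N))

print(np.linalg.norm(recon_classic - f_clean))  # classical interpolation keeps the noise
print(np.linalg.norm(recon_lasso - f_clean))    # soft thresholding suppresses much of it
```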
A rigidity circuit (in 2D) is a minimal dependent set in the rigidity matroid, i.e., a minimal graph supporting a non-trivial stress in any generic placement of its vertices in $\mathbb R^2$. Any rigidity circuit on $n\geq 5$ vertices can be obtained from rigidity circuits on fewer vertices by applying the combinatorial resultant (CR) operation. The inverse operation is called a combinatorial resultant decomposition (CR-decomp). Any rigidity circuit on $n\geq 5$ vertices can be successively decomposed into smaller circuits, until the complete graphs $K_4$ are reached. This sequence of CR-decomps has the structure of a rooted binary tree called the combinatorial resultant tree (CR-tree). A CR-tree encodes an elimination strategy for computing circuit polynomials via Sylvester resultants. Different CR-trees lead to elimination strategies that can vary greatly in time and memory consumption. It is an open problem to establish criteria for optimal CR-trees, or at least to characterize those CR-trees that lead to good elimination strategies. In [12] we presented an algorithm for enumerating CR-trees, together with algorithms for decomposing 3-connected rigidity circuits in polynomial time. In this paper we focus on those circuits that are not 3-connected, which we simply call 2-connected. In order to enumerate the CR-decomps of a 2-connected circuit $G$, a brute-force exponential-time search has to be performed among the subgraphs induced by the subsets of $V(G)$; this bottleneck is not present in the 3-connected case. We argue that we do not have to account for all possible CR-decomps of 2-connected rigidity circuits to find a good elimination strategy; we only have to account for those CR-decomps that are a 2-split, all of which can be enumerated in polynomial time. We present algorithms and computational evidence in support of this heuristic.
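As a toy illustration of the elimination step that a CR-tree organizes (the polynomials below are arbitrary examples, not circuit polynomials of actual rigidity circuits), the Sylvester resultant of two polynomials with respect to a shared variable eliminates that variable:

```python
from sympy import symbols, resultant, expand

x, y, z = symbols("x y z")
f = x**2 + y * x + z   # both polynomials depend on x ...
g = x**2 - z * x + y

# ... and the resultant with respect to x eliminates it.
print(expand(resultant(f, g, x)))  # a polynomial in y and z only
```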
In large-scale systems there are fundamental challenges when centralised techniques are used for task allocation. The number of interactions is limited by resource constraints, such as limits on computation, storage, and network communication. We can increase scalability by implementing the system as a distributed task-allocation system, sharing tasks across many agents. However, this also increases the resource cost of communication and synchronisation, and such systems remain difficult to scale. In this paper we present four algorithms to solve these problems. The combination of these algorithms enables each agent to improve its task-allocation strategy through reinforcement learning, while adjusting how much it explores the system in response to how optimal it believes its current strategy is, given its past experience. We focus on distributed agent systems where the agents' behaviours are constrained by resource-usage limits, restricting agents to local rather than system-wide knowledge. We evaluate these algorithms in a simulated environment where agents are given a task composed of multiple subtasks that must be allocated to other agents with differing capabilities, which then carry out those tasks. We also simulate real-life system effects such as networking instability. Our solution is shown to solve the task-allocation problem to within 6.7% of the theoretical optimum for the system configurations considered. It provides 5x better performance recovery than approaches with no knowledge retention when system connectivity is impacted, and is tested against systems of up to 100 agents with less than a 9% impact on the algorithms' performance.
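A toy, bandit-style sketch of the exploration idea described above (not the paper's four algorithms; the peer success rates, believed optimum, and update rule are illustrative): an agent allocates subtasks to peers, learns each peer's success rate from feedback, and explores more when its current strategy appears far from optimal.

```python
import random

random.seed(0)
true_quality = {"peer_a": 0.9, "peer_b": 0.6, "peer_c": 0.3}  # hidden from the agent
estimates = {peer: 0.5 for peer in true_quality}              # prior belief
counts = {peer: 1 for peer in true_quality}
believed_optimum = 1.0                                        # agent's reference point

for step in range(2000):
    best = max(estimates, key=estimates.get)
    # Explore in proportion to how suboptimal the current strategy seems.
    epsilon = max(0.05, believed_optimum - estimates[best])
    peer = random.choice(list(true_quality)) if random.random() < epsilon else best

    reward = 1.0 if random.random() < true_quality[peer] else 0.0
    counts[peer] += 1
    estimates[peer] += (reward - estimates[peer]) / counts[peer]  # incremental mean

print({p: round(v, 2) for p, v in estimates.items()})  # should rank peer_a highest
```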
In this paper we develop a novel neural network model for predicting the implied volatility surface. Prior financial domain knowledge is taken into account. A new activation function that incorporates the volatility smile is proposed and used for the hidden nodes that process the underlying asset price. In addition, financial conditions, such as the absence of arbitrage, the boundaries, and the asymptotic slope, are embedded into the loss function. This is one of the very first studies to discuss a methodological framework that incorporates prior financial domain knowledge into neural network architecture design and model training. The proposed model outperforms the benchmark models on option data for the S&P 500 index spanning 20 years. More importantly, the domain knowledge is satisfied empirically, showing that the model is consistent with the existing financial theories and conditions related to the implied volatility surface.
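A minimal PyTorch sketch of the general idea (an assumed architecture for illustration, not the paper's model): a smile-shaped activation on the branch that processes moneyness, and a soft penalty in the loss that discourages total implied variance from decreasing in maturity. In practice this penalty would be combined with the other no-arbitrage, boundary, and asymptotic-slope conditions mentioned in the abstract.

```python
import torch
import torch.nn as nn

class SmileActivation(nn.Module):
    def forward(self, x):
        return torch.sqrt(1.0 + x ** 2)  # convex, smile-like shape (illustrative choice)

class IVSurfaceNet(nn.Module):
    def __init__(self, hidden=32):
        super().__init__()
        self.moneyness = nn.Sequential(nn.Linear(1, hidden), SmileActivation())
        self.maturity = nn.Sequential(nn.Linear(1, hidden), nn.Softplus())
        self.head = nn.Sequential(nn.Linear(2 * hidden, hidden), nn.ReLU(),
                                  nn.Linear(hidden, 1), nn.Softplus())

    def forward(self, m, tau):
        h = torch.cat([self.moneyness(m), self.maturity(tau)], dim=-1)
        return self.head(h)  # implied volatility, kept positive by Softplus

def loss_fn(model, m, tau, iv_target, penalty_weight=1.0):
    fit = torch.mean((model(m, tau) - iv_target) ** 2)
    # Soft penalty: total implied variance should be non-decreasing in maturity
    # (a calendar-spread-style condition), enforced via autograd.
    tau = tau.clone().requires_grad_(True)
    w = (model(m, tau) ** 2) * tau
    dw_dtau = torch.autograd.grad(w.sum(), tau, create_graph=True)[0]
    penalty = torch.mean(torch.relu(-dw_dtau))
    return fit + penalty_weight * penalty

m = torch.randn(128, 1)                 # toy moneyness inputs
tau = torch.rand(128, 1) + 0.05         # toy maturities
iv_obs = 0.2 + 0.05 * m ** 2            # toy target smile
print(loss_fn(IVSurfaceNet(), m, tau, iv_obs))
```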
Graph representation learning for hypergraphs can be used to extract patterns among the higher-order interactions that are critically important in many real-world problems. Current approaches designed for hypergraphs, however, are unable to handle different types of hypergraphs and are typically not generic across learning tasks. Indeed, models that can predict variable-sized heterogeneous hyperedges have not been available. Here we develop a new self-attention-based graph neural network called Hyper-SAGNN that is applicable to homogeneous and heterogeneous hypergraphs with variable hyperedge sizes. We perform extensive evaluations on multiple datasets, including four benchmark network datasets and two single-cell Hi-C datasets in genomics. We demonstrate that Hyper-SAGNN significantly outperforms state-of-the-art methods on traditional tasks while also achieving strong performance on a new task called outsider identification. Hyper-SAGNN will be useful for graph representation learning to uncover complex higher-order interactions in different applications.
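A simplified sketch of the static/dynamic embedding idea behind a Hyper-SAGNN-style scorer (dimensions and layers are illustrative and not the published architecture): each node in a candidate hyperedge receives a static embedding from a position-wise layer and a dynamic embedding from self-attention over the nodes in the hyperedge, and the hyperedge score reflects how much the two disagree.

```python
import torch
import torch.nn as nn

class HyperedgeScorer(nn.Module):
    def __init__(self, in_dim=16, dim=32, heads=4):
        super().__init__()
        self.static = nn.Sequential(nn.Linear(in_dim, dim), nn.Tanh())  # position-wise
        self.proj = nn.Linear(in_dim, dim)
        self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.score = nn.Linear(dim, 1)

    def forward(self, nodes):            # nodes: (batch, hyperedge_size, in_dim)
        s = self.static(nodes)           # static embeddings
        h = self.proj(nodes)
        d, _ = self.attn(h, h, h)        # dynamic embeddings via self-attention
        per_node = torch.sigmoid(self.score((d - s) ** 2)).squeeze(-1)
        return per_node.mean(dim=-1)     # probability-like hyperedge score

scorer = HyperedgeScorer()
triples = torch.randn(2, 3, 16)          # two candidate hyperedges of size 3
print(scorer(triples))                   # the same module handles any hyperedge size
```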