Modeling whole-heart function involves several complex multi-physics and multi-scale phenomena that are highly computationally demanding, so the development of simpler yet accurate, high-performance computational tools remains a paramount open challenge. Despite the efforts of several research groups worldwide, no software has yet emerged as a standard reference tool for whole-heart, fully coupled cardiac simulations in the scientific community. In this work we present the first publicly released package of the heart module of life$^x$, a high-performance solver for multi-physics and multi-scale problems aimed at cardiac applications. The goal of life$^x$ is twofold. On the one hand, it aims at making in silico experiments easily reproducible and accessible to a wider public, including users with a background in medicine or bio-engineering, thanks to extensive documentation and a user guide. On the other hand, being conceived as an academic research library, life$^x$ can be exploited by scientific computing experts to explore new modeling and numerical methodologies within a robust development framework. life$^x$ has been developed with a modular structure and will be released in separate modules/packages. This initial release includes a generator for myocardial fibers based on Laplace-Dirichlet Rule-Based Methods (LDRBMs). This report comes with extensive technical and mathematical documentation that welcomes new users to the core structure of a prototypical life$^x$ application and provides them with a possible approach to include the generated cardiac fibers into more sophisticated computational pipelines.
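As a rough illustration of the LDRBM idea, the sketch below solves a Laplace problem with Dirichlet data across a toy slab of myocardium and turns the resulting transmural coordinate into fiber directions; the geometry, angles, and finite-difference solver are our own simplifications and not the life$^x$ implementation.

```python
# Minimal LDRBM sketch on a toy 1D slab of myocardium (not the life^x FEM solver):
# solve a Laplace problem with Dirichlet data, then apply a rule that rotates the
# fiber direction across the wall. Angles and geometry are illustrative assumptions.
import numpy as np

n = 21
phi = np.zeros(n)
phi[0], phi[-1] = 0.0, 1.0                 # Dirichlet data: endocardium = 0, epicardium = 1

# Jacobi iterations for the discrete Laplace equation phi'' = 0 across the wall
for _ in range(5000):
    phi[1:-1] = 0.5 * (phi[:-2] + phi[2:])

# In 3D the transmural direction is grad(phi)/|grad(phi)|; on this slab it is the x-axis
t_hat = np.array([1.0, 0.0, 0.0])
c_hat = np.array([0.0, 1.0, 0.0])          # circumferential direction
l_hat = np.cross(t_hat, c_hat)             # apico-basal (longitudinal) direction

# Rule: fiber angle varies linearly from alpha_endo to alpha_epi through the wall
alpha_endo, alpha_epi = np.deg2rad(60.0), np.deg2rad(-60.0)   # hypothetical angles
alpha = alpha_endo * (1.0 - phi) + alpha_epi * phi
fibers = np.outer(np.cos(alpha), c_hat) + np.outer(np.sin(alpha), l_hat)
print(fibers[[0, n // 2, -1]])             # endo, midwall, and epi fiber directions
```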
Interactive machine learning (IML) is a field of research that explores how to leverage both human and computational abilities in decision making systems. IML represents a collaboration between multiple complementary human and machine intelligent systems working as a team, each with their own unique abilities and limitations. This teamwork might mean that both systems take actions at the same time or in sequence. Two major open research questions in the field of IML are: "How should we design systems that can learn to make better decisions over time with human interaction?" and "How should we evaluate the design and deployment of such systems?" A lack of appropriate consideration for the humans involved can lead to problematic system behaviour, and issues of fairness, accountability, and transparency. Thus, our goal with this work is to present a human-centred guide to designing and evaluating IML systems while mitigating risks. This guide is intended to be used by machine learning practitioners who are responsible for the health, safety, and well-being of interacting humans. An obligation of responsibility for public interaction means acting with integrity, honesty, and fairness, and abiding by applicable legal statutes. With these values and principles in mind, we as a machine learning research community can better achieve goals of augmenting human skills and abilities. This practical guide therefore aims to support many of the responsible decisions necessary throughout the iterative design, development, and dissemination of IML systems.
Community detection refers to the problem of clustering the nodes of a network into groups. Existing inferential methods for community structure mainly focus on unweighted (binary) networks. Many real-world networks are nonetheless weighted, and a common practice is to dichotomize a weighted network into an unweighted one, which is known to result in information loss. The literature on hypothesis testing for weighted networks is still missing. In this paper, we study the problem of testing the existence of community structure in weighted networks. Our contributions are threefold: (a) we use the (possibly infinite-dimensional) exponential family to model the weights and derive the sharp information-theoretic limit for the existence of consistent tests; within the limit, any test is inconsistent, and beyond the limit, we propose a useful consistent test. (b) Based on the information-theoretic limits, we provide the first formal way to quantify the loss of information incurred by dichotomizing weighted graphs into unweighted graphs in the context of hypothesis testing. (c) We propose several new and practically useful test statistics. Simulation studies show that the proposed tests perform well. Finally, we apply the proposed tests to an animal social network.
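To fix ideas on the testing problem (in our own notation, not necessarily the paper's), one can think of each edge weight as drawn from a natural exponential family and ask whether a latent two-community partition shifts its natural parameter:
\[
p(w_{ij}\mid\theta_{ij}) = h(w_{ij})\,\exp\{\theta_{ij} w_{ij} - A(\theta_{ij})\},\qquad
H_0:\ \theta_{ij}\equiv\theta_0
\quad\text{vs.}\quad
H_1:\ \theta_{ij}=\begin{cases}\theta_{\mathrm{in}}, & \sigma_i=\sigma_j,\\ \theta_{\mathrm{out}}, & \sigma_i\neq\sigma_j,\end{cases}
\]
where $\sigma_i$ denotes the (unobserved) community label of node $i$. Dichotomization replaces $w_{ij}$ with an indicator that the weight exceeds a threshold, which is where the quantifiable information loss enters.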
Artificial intelligence (AI) and machine learning (ML) techniques have been increasingly used in several fields to improve performance and the level of automation. In recent years, this use has increased dramatically due to advances in high-performance computing and the ever-increasing size of data. One such field is hardware design, specifically the design of digital and analog integrated circuits~(ICs), where AI/ML techniques have been extensively used to address ever-increasing design complexity, aggressive time-to-market, and the growing number of ubiquitous interconnected devices (IoT). However, the security concerns and issues related to IC design have been largely overlooked. In this paper, we summarize the state of the art in AI/ML for circuit design/optimization, the associated security and engineering challenges, research in security-aware CAD/EDA, and future research directions and needs for using AI/ML for security-aware circuit design.
Data collection and research methodology represent a critical part of the research pipeline. On the one hand, it is important that we collect data in a way that maximises the validity of what we are measuring, which may involve the use of long scales with many items. On the other hand, collecting a large number of items across multiple scales results in participant fatigue and expensive, time-consuming data collection. It is therefore important that we use the available resources optimally. In this work, we consider how attention to theory and the associated causal/structural model can help us streamline data collection procedures by not spending time collecting data for variables that are not causally critical for the subsequent analysis. This not only saves time and enables us to redirect resources to other, more important variables, but also increases research transparency and the reliability of theory testing. To achieve this streamlined data collection, we leverage structural models and the Markov conditional independence structures implicit in these models to identify the substructures that are critical for answering a particular research question. We review the relevant concepts and present a number of didactic examples, with the hope that psychologists can use these techniques to streamline their data collection process without invalidating the subsequent analysis. We also provide simulation results to demonstrate the limited analytical impact of this streamlining.
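As a toy illustration of the kind of check involved (our own example, not one from the paper), d-separation in the assumed structural model can indicate when a candidate variable is irrelevant to a given research question and therefore need not be collected:

```python
# A minimal sketch (not the paper's code) of using d-separation to decide whether a
# candidate variable can be dropped from data collection. Assumes networkx >= 2.8.
import networkx as nx

# Hypothetical structural model: C confounds X and Y, M mediates X -> Y,
# and D is an upstream driver of the confounder C.
g = nx.DiGraph([("C", "X"), ("C", "Y"), ("X", "M"), ("M", "Y"), ("D", "C")])

# Research question: the effect of X on Y, adjusting for the confounder C.
# If D is d-separated from Y given {X, C}, collecting D adds nothing to this analysis.
redundant = nx.d_separated(g, {"D"}, {"Y"}, {"X", "C"})
print("D can be dropped from data collection:", redundant)   # True under this model
```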
Linear mixed models (LMMs) are instrumental for regression analysis with structured dependence, such as grouped, clustered, or multilevel data. However, selection among the covariates--while accounting for this structured dependence--remains a challenge. We introduce a Bayesian decision analysis for subset selection with LMMs. Using a Mahalanobis loss function that incorporates the structured dependence, we derive optimal linear coefficients for (i) any given subset of variables and (ii) all subsets of variables that satisfy a cardinality constraint. Crucially, these estimates inherit shrinkage or regularization and uncertainty quantification from the underlying Bayesian model, and apply for any well-specified Bayesian LMM. More broadly, our decision analysis strategy deemphasizes the role of a single "best" subset, which is often unstable and limited in its information content, and instead favors a collection of near-optimal subsets. This collection is summarized by key member subsets and variable-specific importance metrics. Customized subset search and out-of-sample approximation algorithms are provided for more scalable computing. These tools are applied to simulated data and a longitudinal physical activity dataset, and demonstrate excellent prediction, estimation, and selection ability.
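To make the decision-analytic step concrete, in our own notation (not necessarily the paper's) and treating the dependence matrix as fixed at a plug-in value, one natural reading is that for a subset $S$ the coefficients minimize the posterior expected Mahalanobis loss, yielding a generalized-least-squares projection of the posterior predictive mean:
\[
\hat{\boldsymbol\delta}_S
= \arg\min_{\boldsymbol\delta}\ \mathbb E\!\left[(\tilde{\mathbf y}-\mathbf X_S\boldsymbol\delta)^\top\boldsymbol\Sigma^{-1}(\tilde{\mathbf y}-\mathbf X_S\boldsymbol\delta)\,\middle|\,\mathbf y\right]
= \left(\mathbf X_S^\top\boldsymbol\Sigma^{-1}\mathbf X_S\right)^{-1}\mathbf X_S^\top\boldsymbol\Sigma^{-1}\,\mathbb E[\tilde{\mathbf y}\mid\mathbf y],
\]
where $\mathbf X_S$ contains the columns in $S$, $\boldsymbol\Sigma$ encodes the structured (e.g., within-group) dependence implied by the LMM, and the posterior predictive mean $\mathbb E[\tilde{\mathbf y}\mid\mathbf y]$ carries the shrinkage and uncertainty quantification from the underlying Bayesian model.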
Molecular mechanics (MM) potentials have long been a workhorse of computational chemistry. Balancing accuracy and speed, these functional forms find use in a wide variety of applications in biomolecular modeling and drug discovery, from rapid virtual screening to detailed free energy calculations. Traditionally, MM potentials have relied on human-curated, inflexible, and poorly extensible discrete chemical perception rules for applying parameters to small molecules or biopolymers, making it difficult to optimize both types and parameters to fit quantum chemical or physical property data. Here, we propose an alternative approach that uses graph neural networks to perceive chemical environments, producing continuous atom embeddings from which valence and nonbonded parameters can be predicted using invariance-preserving layers. Since all stages are built from smooth neural functions, the entire process is modular and end-to-end differentiable with respect to model parameters, allowing new force fields to be easily constructed, extended, and applied to arbitrary molecules. We show that this approach is not only sufficiently expressive to reproduce legacy atom types, but can also learn to accurately reproduce and extend existing molecular mechanics force fields. Trained with arbitrary loss functions, it can construct entirely new force fields applicable self-consistently to both biopolymers and small molecules directly from quantum chemical calculations, with fidelity superior to traditional atom or parameter typing schemes. When trained on the same quantum chemical small-molecule dataset used to parameterize the openff-1.2.0 small-molecule force field, augmented with a peptide dataset, the resulting espaloma model shows superior accuracy vis-\`a-vis experiments in relative alchemical free energy calculations for a popular benchmark set.
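A schematic of the core idea, with hypothetical layer sizes and a harmonic-bond readout chosen by us rather than taken from the espaloma code, might look like the following: per-atom embeddings from message passing are pooled symmetrically over each bonded pair, so the predicted parameters are invariant to atom ordering and differentiable end to end.

```python
# Minimal sketch (assumptions ours, not the espaloma implementation): a message-passing
# network produces per-atom embeddings, and a symmetric readout maps each bonded atom
# pair to harmonic bond parameters (k, r0), so the whole map is differentiable.
import torch
import torch.nn as nn

class TinyGNN(nn.Module):
    def __init__(self, n_elements=10, dim=32):
        super().__init__()
        self.embed = nn.Embedding(n_elements, dim)        # initial features from element
        self.update = nn.Sequential(nn.Linear(2 * dim, dim), nn.ReLU())
        self.bond_readout = nn.Sequential(nn.Linear(2 * dim, dim), nn.ReLU(), nn.Linear(dim, 2))

    def forward(self, elements, edges):
        h = self.embed(elements)                          # (n_atoms, dim)
        for _ in range(3):                                # a few rounds of message passing
            msgs = torch.zeros_like(h)
            msgs.index_add_(0, edges[:, 0], h[edges[:, 1]])
            msgs.index_add_(0, edges[:, 1], h[edges[:, 0]])
            h = self.update(torch.cat([h, msgs], dim=-1))
        # Symmetrize over the two atoms so (i, j) and (j, i) give the same parameters
        pair = torch.cat([h[edges[:, 0]] + h[edges[:, 1]],
                          h[edges[:, 0]] * h[edges[:, 1]]], dim=-1)
        k, r0 = self.bond_readout(pair).unbind(-1)
        return torch.nn.functional.softplus(k), torch.nn.functional.softplus(r0)

# Toy usage: ethanol-like connectivity with hypothetical element indices
elements = torch.tensor([0, 0, 1, 2, 2, 2, 2, 2, 2])
edges = torch.tensor([[0, 1], [1, 2], [0, 3], [0, 4], [0, 5], [1, 6], [1, 7], [2, 8]])
k, r0 = TinyGNN()(elements, edges)
print(k.shape, r0.shape)                                  # one (k, r0) per bond
```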
Spectral efficiency improvement is a key focus in most wireless communication systems and is achieved by various means, such as using large antenna arrays and/or advanced modulation schemes and signal formats. This work proposes to further improve spectral efficiency by combining non-orthogonal spectrally efficient frequency division multiplexing (SEFDM) systems with index modulation (IM), which efficiently uses the indices of activated subcarriers to convey information. Recent research has verified that IM may be used with SEFDM to alleviate inter-carrier interference (ICI) and improve error performance. This work proposes new SEFDM signal formats based on novel activation pattern designs, which limit the locations of activated subcarriers and enable a variable number of activated subcarriers in each SEFDM subblock. SEFDM-IM system designs are developed by jointly considering activation patterns, modulation schemes and signal waveform formats, with a set of solutions evaluated under different spectral efficiency scenarios. Detailed modelling of coded systems and simulation studies reveal that the proposed designs achieve not only a better bit error rate (BER) but also a lower peak-to-average power ratio (PAPR) and reduced computational complexity relative to other reported index-modulated systems.
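As a bare-bones illustration of how subcarrier indices carry information (a generic fixed-activation example of ours, not the variable-activation patterns designed in the paper):

```python
# Minimal index-modulation sketch: in each subblock of n subcarriers, k are activated;
# the choice of which k carries floor(log2(C(n, k))) index bits in addition to the
# QAM bits on the active subcarriers. Subblock size and k are hypothetical here.
from itertools import combinations
from math import comb, floor, log2

n, k = 4, 2                                   # subblock size / number of active subcarriers
patterns = list(combinations(range(n), k))    # all C(n, k) activation patterns
index_bits = floor(log2(comb(n, k)))          # bits conveyed by the pattern choice

def pattern_for_bits(bits):
    """Map index bits to the subcarriers switched on in this subblock."""
    return patterns[int(bits, 2)]

print(index_bits, pattern_for_bits("10"))     # 2 index bits; "10" selects patterns[2]
```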
Coflow is a network abstraction used to represent communication patterns in data centers. The coflow scheduling problem in large data centers is an important $NP$-hard problem. Previous studies on coflow scheduling mainly focus on the single-core model; however, with the growth of data centers, this single-core model is no longer sufficient. This paper considers the coflow scheduling problem in heterogeneous parallel networks, an architecture based on multiple network cores running in parallel. Two polynomial-time approximation algorithms are developed for scheduling divisible and indivisible coflows in heterogeneous parallel networks, respectively. Both algorithms achieve an approximation ratio of $O(\log m/ \log \log m)$ with arbitrary release times.
We demonstrate that mere analog transmissions and matched filtering can realize the function of an edge server in federated learning (FL). Therefore, a network with massively distributed user equipments (UEs) can achieve large-scale FL without an edge server. We also develop a training algorithm that allows UEs to continuously perform local computing without being interrupted by global parameter uploading, which exploits the full potential of the UEs' processing power. We derive convergence rates for the proposed schemes to quantify their training efficiency. The analyses reveal that when the interference obeys a Gaussian distribution, the proposed algorithm retrieves the convergence rate of server-based FL, but if the interference distribution is heavy-tailed, then the heavier the tail, the slower the algorithm converges. Nonetheless, the system run time can be largely reduced by enabling computation in parallel with communication, and the gain is particularly pronounced when communication latency is high. These findings are corroborated by extensive simulations.
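A toy numerical sketch of the over-the-air aggregation principle (with our own dimensions and noise level, not the paper's system model): simultaneous analog transmissions superpose on the channel, so the received signal, after scaling, already approximates the average that an edge server would otherwise compute.

```python
# Toy over-the-air aggregation: UEs transmit local updates as analog amplitudes on the
# same resource; the superposition plus additive interference, once scaled, recovers
# (a noisy version of) the global average without any explicit server-side averaging.
import numpy as np

rng = np.random.default_rng(0)
n_ues, dim = 50, 8
local_updates = rng.normal(size=(n_ues, dim))          # each UE's local gradient/update

received = local_updates.sum(axis=0)                   # superposition over the air
received += rng.normal(scale=0.1, size=dim)            # additive (Gaussian) interference
ota_average = received / n_ues                         # scaling recovers the average

print(np.allclose(ota_average, local_updates.mean(axis=0), atol=0.05))
```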
We propose a new method for the event extraction (EE) task based on an imitation learning framework, specifically, inverse reinforcement learning (IRL) via a generative adversarial network (GAN). The GAN estimates proper rewards according to the difference between the actions committed by the expert (or ground truth) and the agent among complicated states in the environment. The EE task benefits from these dynamic rewards because instances and labels vary in difficulty and the gains are expected to be diverse (e.g., an ambiguous but correctly detected trigger or argument should receive high gains), whereas traditional RL models usually neglect such differences and pay equal attention to all instances. Moreover, our experiments demonstrate that the proposed framework outperforms state-of-the-art methods, without explicit feature engineering.