Reducing wealth inequality and disparity is a global challenge. Economic systems are commonly divided into (1) gift and reciprocity, (2) power and redistribution, (3) market exchange, and (4) mutual aid without reciprocal obligations. Current inequality stems from a capitalist economy consisting of (2) and (3). To sublimate (1), the human economy, into (4), the concept of a "mixbiotic society" has been proposed in the philosophical realm. This is a society in which free and diverse individuals, "I," mingle with one another, recognize each other's "fundamental incapability," and sublimate it into the solidarity of "WE." The economy in such a society must entail moral responsibility as co-adventurers and consideration for vulnerability to risk. Therefore, I focus on two factors of mind perception, moral responsibility and risk vulnerability, and propose a novel model of wealth distribution following an econophysical approach. Specifically, I develop a joint-venture model, a redistribution model within the joint-venture model, and a "WE economy" model. A simulation comparing combinations of joint ventures and redistribution with WE economies reveals that WE economies have the advantages of effectively reducing inequality and of resilience in normalizing the wealth distribution, but the disadvantage of susceptibility to free riders. This disadvantage, however, can be compensated for by fostering consensus and fellowship and by complementing the WE economy with joint ventures. This study essentially presents the effectiveness of moral responsibility, the complementarity between the WE economy and the joint economy, and a direction for the economy toward reducing inequality. Future challenges are to develop the WE economy model on the basis of real economic analysis and psychology, and to promote WE economy fieldwork in worker cooperatives and platform cooperatives so as to realize a desirable mixbiotic society.
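As a rough illustration of the kind of econophysics-style simulation the abstract describes, the sketch below implements a generic pairwise-exchange economy with a flat-levy redistribution step and reports the Gini coefficient. The agent rules, tax rate, and exchange fraction are hypothetical choices for illustration and do not reproduce the paper's joint-venture or WE-economy models.

```python
# Illustrative sketch only: a generic econophysics-style pairwise-exchange
# simulation with a simple redistribution step. All parameters are hypothetical.
import numpy as np

rng = np.random.default_rng(0)

def gini(w):
    """Gini coefficient of a wealth vector."""
    w = np.sort(w)
    n = w.size
    return (2 * np.arange(1, n + 1) - n - 1) @ w / (n * w.sum())

n_agents, n_steps, tax_rate = 500, 20000, 0.02
wealth = np.ones(n_agents)

for _ in range(n_steps):
    i, j = rng.choice(n_agents, size=2, replace=False)
    stake = 0.1 * min(wealth[i], wealth[j])        # each risks 10% of the poorer agent's wealth
    winner, loser = (i, j) if rng.random() < 0.5 else (j, i)
    wealth[winner] += stake
    wealth[loser] -= stake
    levy = tax_rate * wealth                        # flat levy, redistributed equally
    wealth += levy.mean() - levy

print(f"Gini after simulation: {gini(wealth):.3f}")
```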
Gene set analysis, a popular approach for analysing high-throughput gene expression data, aims to identify sets of genes that show enriched expression patterns between two conditions. In addition to the multitude of methods available for this task, users are typically left with many options when creating the required input and specifying the internal parameters of the chosen method. This flexibility can lead to uncertainty about the 'right' choice, further reinforced by a lack of evidence-based guidance. Especially when their statistical experience is scarce, this uncertainty might tempt users into producing preferable results through a 'trial-and-error' approach. While it may seem unproblematic at first glance, this practice can be viewed as a form of 'cherry-picking' and can cause an optimistic bias, rendering the results non-replicable on independent data. While this problem has attracted considerable attention in the context of classical hypothesis testing, we aim to raise awareness of such over-optimism in the different and more complex context of gene set analysis. We mimic a hypothetical researcher who systematically selects the analysis variants yielding their preferred results, considering three distinct goals they might pursue. Using a selection of popular gene set analysis methods, we tweak the results in this way for two frequently used benchmark gene expression data sets. Our study indicates that the potential for over-optimism is particularly high for a group of methods that are frequently used despite being commonly criticised. We conclude by providing practical recommendations to counter over-optimism in research findings in gene set analysis and beyond.
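The cherry-picking mechanism described above can be illustrated with a minimal simulation, assuming a set of hypothetical preprocessing variants applied to pure-noise data rather than the gene set analysis pipelines studied in the paper: reporting the most favourable variant inflates the type I error well beyond the nominal level.

```python
# Minimal illustration (not the paper's gene set analysis pipelines): picking the
# most favourable result among several analysis variants inflates the false
# positive rate even when there is no true effect.
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
n_rep, n, alpha = 2000, 20, 0.05

variants = [
    lambda x: x,                                          # raw values
    lambda x: np.log1p(x - x.min()),                      # log-type transform
    lambda x: stats.rankdata(x),                          # rank transform
    lambda x: np.clip(x, *np.quantile(x, [0.05, 0.95])),  # winsorised
]

naive_hits = picked_hits = 0
for _ in range(n_rep):
    data = rng.normal(size=2 * n)        # both groups from the same distribution
    pvals = []
    for f in variants:
        transformed = f(data)
        pvals.append(stats.ttest_ind(transformed[:n], transformed[n:]).pvalue)
    naive_hits += pvals[0] < alpha       # single pre-specified analysis
    picked_hits += min(pvals) < alpha    # report the most favourable variant

print(f"type I error, pre-specified analysis: {naive_hits / n_rep:.3f}")
print(f"type I error, cherry-picked variant:  {picked_hits / n_rep:.3f}")
```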
Conventional neural network elastoplasticity models are often perceived as lacking interpretability. This paper introduces a two-step machine learning approach that returns mathematical models interpretable by human experts. In particular, we introduce a surrogate model in which yield surfaces are expressed in terms of a set of single-variable feature mappings obtained from supervised learning. A post-processing step is then used to re-interpret the set of single-variable neural network mapping functions in mathematical form through symbolic regression. This divide-and-conquer approach provides several important advantages. First, it enables us to overcome the scaling issues of symbolic regression algorithms. From a practical perspective, it enhances the portability of learned models for partial differential equation solvers written in different programming languages. Finally, it enables a concrete understanding of material attributes, such as the convexity and symmetries of the models, through automated derivations and reasoning. Numerical examples are provided, along with open-source code to enable third-party validation.
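A minimal sketch of the two-step idea on a toy one-dimensional mapping follows; the target function is hypothetical, a small scikit-learn network stands in for the paper's surrogate model, and a polynomial fit stands in for a full symbolic regression engine in the post-processing step.

```python
# Hedged sketch of the two-step idea on a toy 1-D mapping (not the paper's yield
# surface model): (1) learn a single-variable feature mapping with a small neural
# network, (2) re-interpret the learned mapping in closed form.
import numpy as np
from sklearn.neural_network import MLPRegressor

x = np.linspace(-1.0, 1.0, 400)
y = 0.5 * x**2 + 0.1 * x          # hypothetical single-variable feature mapping

# Step 1: supervised learning of the one-dimensional mapping.
net = MLPRegressor(hidden_layer_sizes=(32, 32), max_iter=5000, random_state=0)
net.fit(x.reshape(-1, 1), y)
y_net = net.predict(x.reshape(-1, 1))

# Step 2: post-process the learned mapping into an interpretable expression
# (a polynomial fit stands in for a dedicated symbolic regression tool).
coeffs = np.polyfit(x, y_net, deg=2)
print("recovered expression: "
      f"{coeffs[0]:+.3f} x^2 {coeffs[1]:+.3f} x {coeffs[2]:+.3f}")
```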
This work has been motivated by a longitudinal data set on CD4+ T cell counts of HIV patients from Livingstone district, Zambia. The corresponding histogram plots indicate a lack of symmetry in the marginal distributions, and the pairwise scatter plots show non-elliptical dependence patterns. The standard linear mixed model for longitudinal data fails to capture these features. Thus it seems appropriate to consider a more general framework for modeling such data. In this article, we consider generalized linear mixed models (GLMMs) for the marginals (e.g. a Gamma mixed model), and the temporal dependence of the repeated measurements is modeled by the copula corresponding to a skew-elliptical distribution (such as the skew-normal or skew-t). Our proposed class of copula-based mixed models simultaneously accounts for asymmetry, between-subject variability and non-standard temporal dependence, and hence can be considered an extension of the standard linear mixed model based on multivariate normality. We estimate the model parameters using the IFM (inference functions for margins) method, and also describe how to obtain standard errors of the parameter estimates. We investigate the finite-sample performance of our procedure through extensive simulation studies involving skewed and symmetric marginal distributions and several choices of copula. We finally apply our models to the HIV data set and report the findings.
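The two-stage IFM logic can be sketched as follows on simulated balanced data, using Gamma margins and a Gaussian copula as a simplified stand-in for the skew-elliptical copulas considered in the paper; all parameter values are hypothetical.

```python
# Rough sketch of the two-stage IFM idea (not the paper's skew-elliptical copula
# model): stage 1 fits the marginal distributions, stage 2 fits the dependence on
# the probability-integral-transformed data. A Gaussian copula is a simplified
# stand-in; all parameter values are hypothetical.
import numpy as np
from scipy import stats

rng = np.random.default_rng(3)
n_subjects, n_times = 200, 3

# Simulate Gamma margins with exchangeable Gaussian-copula dependence.
corr = 0.6 * np.ones((n_times, n_times)) + 0.4 * np.eye(n_times)
z = rng.multivariate_normal(np.zeros(n_times), corr, size=n_subjects)
u = stats.norm.cdf(z)
y = stats.gamma.ppf(u, a=2.0, scale=1.5)      # repeated measurements, subjects x times

# Stage 1 (margins): fit a Gamma distribution at each time point.
margins = [stats.gamma.fit(y[:, t], floc=0) for t in range(n_times)]

# Stage 2 (dependence): transform to uniforms with the fitted margins,
# map to normal scores and estimate the copula correlation matrix.
u_hat = np.column_stack([stats.gamma.cdf(y[:, t], *margins[t]) for t in range(n_times)])
z_hat = stats.norm.ppf(np.clip(u_hat, 1e-6, 1 - 1e-6))
print("estimated copula correlation:\n", np.corrcoef(z_hat, rowvar=False).round(2))
```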
Random probabilities are a key component of many nonparametric methods in Statistics and Machine Learning. To quantify comparisons between different laws of random probabilities, several works have started to use the elegant Wasserstein-over-Wasserstein distance. In this paper we prove that the infinite-dimensionality of the space of probabilities drastically deteriorates its sample complexity, which is slower than any polynomial rate in the sample size. We thus propose a new distance that preserves many desirable properties of the former while achieving a parametric rate of convergence. In particular, our distance 1) metrizes weak convergence; 2) can be estimated numerically from samples with low complexity; 3) can be bounded analytically from above and below. The main ingredients are integral probability metrics, which lead to the name hierarchical IPM.
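For reference, the two standard building blocks referred to above are recalled below; the paper's hierarchical IPM itself is not reproduced here.

```latex
% Integral probability metric generated by a function class F:
\[
  \gamma_{\mathcal{F}}(P,Q) \;=\; \sup_{f \in \mathcal{F}}
  \Bigl| \int f \,\mathrm{d}P - \int f \,\mathrm{d}Q \Bigr|.
\]
% Wasserstein-over-Wasserstein distance between laws Q_1, Q_2 of random
% probability measures, with ground space (P(X), W_p):
\[
  \mathrm{WoW}_p(\mathcal{Q}_1,\mathcal{Q}_2) \;=\;
  \inf_{\pi \in \Pi(\mathcal{Q}_1,\mathcal{Q}_2)}
  \Bigl( \int W_p(\mu,\nu)^p \, \mathrm{d}\pi(\mu,\nu) \Bigr)^{1/p}.
\]
```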
The Shapley value equals a player's contribution to the potential of a game. The potential is a most natural one-number summary of a game, which can be computed as the expected accumulated worth of a random partition of the players. This computation integrates the coalition formation of all players and readily extends to games with externalities. We investigate the potential functions for games with externalities that can be computed in this way. It turns out that the potential corresponding to the MPW solution introduced by Macho-Stadler et al. (2007, J. Econ. Theory 135, 339-356) is unique in the following sense: it is obtained as the expected accumulated worth of a random partition, it generalizes the potential for games without externalities, and it induces a solution that satisfies the null player property even in the presence of externalities.
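For games without externalities, the classical relation between the Shapley value and the (Hart and Mas-Colell) potential referred to in the first sentence can be written as follows; the random-partition representation and its extension to externalities are developed in the paper itself.

```latex
% Hart--Mas-Colell potential of a TU game (N, v) and the Shapley value as a
% player's marginal contribution to it:
\[
  P(\emptyset, v) = 0, \qquad
  P(N, v) \;=\; \frac{1}{|N|} \Bigl( v(N) + \sum_{i \in N} P(N \setminus \{i\}, v) \Bigr),
\]
\[
  \mathrm{Sh}_i(N, v) \;=\; P(N, v) - P(N \setminus \{i\}, v).
\]
```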
In this paper, we provide an analysis of a recently proposed multicontinuum homogenization technique. The analysis differs from that used in classical homogenization methods for several reasons. First, the cell problems in multicontinuum homogenization are constraint problems and cannot be directly substituted into the differential operator. Second, the problem contains high contrast that remains in the homogenized problem: the homogenized problem averages the microstructure while still containing the small parameter. In this analysis, we first build on our previous technique, CEM-GMsFEM, to define a CEM-downscaling operator that maps the multicontinuum quantities to an approximate microscopic solution. Under a regularity assumption on the multicontinuum quantities, we construct the downscaling operator and the homogenized multicontinuum equations using linear approximations of the multicontinuum quantities. The error analysis is obtained from a residual estimate for the homogenized equations together with a well-posedness assumption on them.
Stepped wedge cluster randomized experiments represent a class of unidirectional crossover designs increasingly adopted for comparative effectiveness and implementation science research. Although stepped wedge cluster randomized experiments have become popular, definitions of estimands and robust methods to target clearly defined estimands remain insufficient. To address this gap, we describe a class of estimands that explicitly acknowledges the multilevel data structure in stepped wedge cluster randomized experiments, and we highlight three typical members of the estimand class that are interpretable and of practical interest. We then introduce four possible formulations of analysis of covariance (ANCOVA) working models to achieve estimand-aligned analyses. By exploiting baseline covariates, each ANCOVA model can potentially improve estimation efficiency over the unadjusted estimators. In addition, each ANCOVA estimator is model-assisted, in the sense that its point estimator is consistent for the target estimand even when the working model is misspecified. Under the stepped wedge randomization scheme, we establish a finite-population central limit theorem for each estimator, which motivates design-based variance estimators. Through simulations, we study the finite-sample operating characteristics of the ANCOVA estimators under different data-generating processes. We illustrate their applications via the analysis of the Washington State Expedited Partner Therapy study.
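As a generic illustration of covariate adjustment in cluster-period data (not one of the paper's four ANCOVA formulations, and without its design-based variance estimators), the sketch below fits an ANCOVA-type working model with period fixed effects, a baseline covariate, and cluster-robust standard errors to hypothetical simulated stepped wedge data.

```python
# Generic ANCOVA-type working model for simulated stepped wedge data; the
# data-generating process and model specification below are hypothetical.
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(4)
n_clusters, n_periods, m = 12, 4, 30
rows = []
for c in range(n_clusters):
    step = 1 + c % (n_periods - 1)          # period at which the cluster crosses over
    cluster_effect = rng.normal(0, 0.5)
    for t in range(n_periods):
        treat = int(t >= step)
        for _ in range(m):
            x = rng.normal()                # baseline covariate
            y = 0.4 * treat + 0.3 * t + 0.8 * x + cluster_effect + rng.normal()
            rows.append(dict(cluster=c, period=t, treat=treat, x=x, y=y))
df = pd.DataFrame(rows)

# ANCOVA working model: treatment indicator, period fixed effects, baseline covariate,
# with cluster-robust standard errors.
fit = smf.ols("y ~ treat + C(period) + x", data=df).fit(
    cov_type="cluster", cov_kwds={"groups": df["cluster"]}
)
print(fit.params["treat"], fit.bse["treat"])
```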
Environmental data science for spatial extremes has traditionally relied heavily on max-stable processes. Even though the popularity of these models has perhaps peaked among statisticians, they are still perceived as the 'state of the art' in many applied fields. However, while the asymptotic theory supporting the use of max-stable processes is mathematically rigorous and comprehensive, we think that it has also been overused, if not misused, in environmental applications, to the detriment of more purposeful and meticulously validated models. In this paper, we review the main limitations of max-stable process models and strongly argue against their systematic use in environmental studies. Alternative solutions based on more flexible frameworks using the exceedances of variables above appropriately chosen high thresholds are discussed, and an outlook on future research is given, highlighting recommendations moving forward and the opportunities offered by hybridizing machine learning with extreme-value statistics.
Lattices are architected metamaterials whose properties strongly depend on their geometrical design. The analogy between lattices and graphs enables the use of graph neural networks (GNNs) as a faster surrogate model compared to traditional methods such as finite element modelling. In this work we present a higher-order GNN model trained to predict the fourth-order stiffness tensor of periodic strut-based lattices. The key features of the model are (i) SE(3) equivariance, and (ii) consistency with the thermodynamic law of conservation of energy. We compare the model to non-equivariant models based on a number of error metrics and demonstrate the benefits of the encoded equivariance and energy conservation in terms of predictive performance and reduced training requirements.
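As a hedged aside on how energy-related structure can be hard-wired into a prediction head, the sketch below shows one common device, which is not necessarily the paper's: parameterizing the Voigt stiffness matrix through a Cholesky factor, so that the prediction is symmetric and positive semidefinite (and hence derives from a non-negative strain-energy density) by construction.

```python
# Illustrative sketch, not the paper's architecture: map 21 unconstrained network
# outputs to a symmetric PSD 6x6 stiffness matrix in Voigt notation via C = L L^T.
# The "network output" below is a random stand-in vector.
import numpy as np

def stiffness_from_raw(raw21):
    """Map 21 unconstrained outputs to a symmetric PSD 6x6 stiffness matrix (Voigt)."""
    L = np.zeros((6, 6))
    L[np.tril_indices(6)] = raw21
    return L @ L.T

raw = np.random.default_rng(5).normal(size=21)    # stand-in for a GNN readout
C = stiffness_from_raw(raw)
strain = np.array([1e-3, 0, 0, 0, 0, 0])          # uniaxial strain in Voigt notation
energy_density = 0.5 * strain @ C @ strain
print(np.allclose(C, C.T), energy_density >= 0)   # symmetry and non-negative energy
```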
Knowledge graphs (KGs) of real-world facts about entities and their relationships are useful resources for a variety of natural language processing tasks. However, because knowledge graphs are typically incomplete, it is useful to perform knowledge graph completion or link prediction, i.e., to predict whether a relationship missing from the knowledge graph is likely to be true. This paper serves as a comprehensive survey of embedding models of entities and relationships for knowledge graph completion, summarizing up-to-date experimental results on standard benchmark datasets and pointing out potential future research directions.
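To make the notion of an embedding model concrete, the sketch below scores triples with the classical TransE function, score(h, r, t) = -||h + r - t||, using random stand-in embeddings; it is only one example of the model family the survey covers.

```python
# Toy TransE-style link prediction with random stand-in embeddings (not trained).
import numpy as np

rng = np.random.default_rng(6)
dim, n_entities = 50, 1000
entity_emb = rng.normal(size=(n_entities, dim))
relation_emb = rng.normal(size=dim)               # one relation, for simplicity

def transe_score(head, rel, tail):
    """Negative L2 distance; higher means a more plausible triple."""
    return -np.linalg.norm(entity_emb[head] + rel - entity_emb[tail])

# Link prediction: rank all candidate tails for a given (head, relation, ?) query.
head = 0
scores = -np.linalg.norm(entity_emb[head] + relation_emb - entity_emb, axis=1)
top_tails = np.argsort(scores)[::-1][:5]
print(top_tails, transe_score(head, relation_emb, int(top_tails[0])))
```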