Engineering a sustainable world requires to consider various systems that interact with each other. These systems include ecological systems, economical systems, social systems and tech-nical systems. They are loosely coupled, geographically distributed, evolve permanently and generate emergent behavior. As these are characteristics of systems of systems (SoS), we discuss the engi-neering of a sustainable world from a SoS engineering perspective. We studied SoS engineering in context of a research project, which aims at political recommendations and a research roadmap for engineering dynamic SoS. The project included an exhaustive literature review, interviews and work-shops with representatives from industry and academia from different application domains. Based on these results and observations, we will discuss how suitable the current state-of-the-art in SoS engi-neering is in order to engineer sustainability. Sustainability was a major driver for SoS engineering in all domains, but we argue that the current scope of SoS engineering is too limited in order to engineer sustainability. Further, we argue that mastering dynamics in this larger scope is essential to engineer sustainability and that this is accompanied by dynamic adaptation of technological SoS.
Collaborative problem-solving (CPS) is a vital skill used both in the workplace and in educational environments. CPS is useful in tackling increasingly complex global, economic, and political issues and is considered a central 21st century skill. The increasingly connected global community presents a fruitful opportunity for creative and collaborative problem-solving interactions and solutions that involve diverse perspectives. Unfortunately, women and underrepresented minorities (URMs) often face obstacles during collaborative interactions that hinder their key participation in these problem-solving conversations. Here, we explored the communication patterns of minority and non-minority individuals working together in a CPS task. Group Communication Analysis (GCA), a temporally-sensitive computational linguistic tool, was used to examine how URM status impacts individuals' sociocognitive linguistic patterns. Results show differences across racial/ethnic groups in key sociocognitive features that indicate fruitful collaborative interactions. We also investigated how the groups' racial/ethnic composition impacts both individual and group communication patterns. In general, individuals in more demographically diverse groups displayed more productive communication behaviors than individuals who were in majority-dominated groups. We discuss the implications of individual and group diversity on communication patterns that emerge during CPS and how these patterns can impact collaborative outcomes.
Modern consumer devices must execute multimedia applications that exhibit high resource utilization. In order to efficiently execute these applications, the dynamic memory subsystem needs to be optimized. This complex task can be tackled in two complementary ways: optimizing the application source code or designing custom dynamic memory management mechanisms. Currently, the first approach has been well established, and several automatic methodologies have been proposed. Regarding the second approach, software engineers often write custom dynamic memory managers from scratch, which is a difficult and error-prone work. This paper presents a novel way to automatically generate custom dynamic memory managers optimizing both performance and memory usage of the target application. The design space is pruned using grammatical evolution converging to the best dynamic memory manager implementation for the target application. Our methodology achieves important improvements (62.55\% and 30.62\% better on average in performance and memory usage, respectively) when its results are compared to five different general-purpose dynamic memory managers.
In many application settings, the data have missing entries which make analysis challenging. An abundant literature addresses missing values in an inferential framework: estimating parameters and their variance from incomplete tables. Here, we consider supervised-learning settings: predicting a target when missing values appear in both training and testing data. We show the consistency of two approaches in prediction. A striking result is that the widely-used method of imputing with a constant, such as the mean prior to learning is consistent when missing values are not informative. This contrasts with inferential settings where mean imputation is pointed at for distorting the distribution of the data. That such a simple approach can be consistent is important in practice. We also show that a predictor suited for complete observations can predict optimally on incomplete data,through multiple imputation.Finally, to compare imputation with learning directly with a model that accounts for missing values, we analyze further decision trees. These can naturally tackle empirical risk minimization with missing values, due to their ability to handle the half-discrete nature of incomplete variables. After comparing theoretically and empirically different missing values strategies in trees, we recommend using the "missing incorporated in attribute" method as it can handle both non-informative and informative missing values.
The minimum covariance determinant (MCD) estimator is a popular method for robustly estimating the mean and covariance of multivariate data. We extend the MCD to the setting where the observations are matrices rather than vectors and introduce the matrix minimum covariance determinant (MMCD) estimators for robust parameter estimation. These estimators hold equivariance properties, achieve a high breakdown point, and are consistent under elliptical matrix-variate distributions. We have also developed an efficient algorithm with convergence guarantees to compute the MMCD estimators. Using the MMCD estimators, we can compute robust Mahalanobis distances that can be used for outlier detection. Those distances can be decomposed into outlyingness contributions from each cell, row, or column of a matrix-variate observation using Shapley values, a concept for outlier explanation recently introduced in the multivariate setting. Simulations and examples reveal the excellent properties and usefulness of the robust estimators.
This work explores the dimension reduction problem for Bayesian nonparametric regression and density estimation. More precisely, we are interested in estimating a functional parameter $f$ over the unit ball in $\mathbb{R}^d$, which depends only on a $d_0$-dimensional subspace of $\mathbb{R}^d$, with $d_0 < d$.It is well-known that rescaled Gaussian process priors over the function space achieve smoothness adaptation and posterior contraction with near minimax-optimal rates. Moreover, hierarchical extensions of this approach, equipped with subspace projection, can also adapt to the intrinsic dimension $d_0$ (\cite{Tokdar2011DimensionAdapt}).When the ambient dimension $d$ does not vary with $n$, the minimax rate remains of the order $n^{-\beta/(2\beta +d_0)}$.%When $d$ does not vary with $n$, the order of the minimax rate remains the same regardless of the ambient dimension $d$. However, this is up to multiplicative constants that can become prohibitively large when $d$ grows. The dependences between the contraction rate and the ambient dimension have not been fully explored yet and this work provides a first insight: we let the dimension $d$ grow with $n$ and, by combining the arguments of \cite{Tokdar2011DimensionAdapt} and \cite{Jiang2021VariableSelection}, we derive a growth rate for $d$ that still leads to posterior consistency with minimax rate.The optimality of this growth rate is then discussed.Additionally, we provide a set of assumptions under which consistent estimation of $f$ leads to a correct estimation of the subspace projection, assuming that $d_0$ is known.
We consider the reliable implementation of an adaptive high-order unfitted finite element method on Cartesian meshes for solving elliptic interface problems with geometrically curved singularities. We extend our previous work on the reliable cell merging algorithm for smooth interfaces to automatically generate the induced mesh for piecewise smooth interfaces. An $hp$ a posteriori error estimate is derived for a new unfitted finite element method whose finite element functions are conforming in each subdomain. Numerical examples illustrate the competitive performance of the method.
During the evolution of large models, performance evaluation is necessarily performed to assess their capabilities and ensure safety before practical application. However, current model evaluations mainly rely on specific tasks and datasets, lacking a united framework for assessing the multidimensional intelligence of large models. In this perspective, we advocate for a comprehensive framework of cognitive science-inspired artificial general intelligence (AGI) tests, aimed at fulfilling the testing needs of large models with enhanced capabilities. The cognitive science-inspired AGI tests encompass the full spectrum of intelligence facets, including crystallized intelligence, fluid intelligence, social intelligence, and embodied intelligence. To assess the multidimensional intelligence of large models, the AGI tests consist of a battery of well-designed cognitive tests adopted from human intelligence tests, and then naturally encapsulates into an immersive virtual community. We propose increasing the complexity of AGI testing tasks commensurate with advancements in large models and emphasizing the necessity for the interpretation of test results to avoid false negatives and false positives. We believe that cognitive science-inspired AGI tests will effectively guide the targeted improvement of large models in specific dimensions of intelligence and accelerate the integration of large models into human society.
An approach to optimal actuator design based on shape and topology optimisation techniques is presented. For linear diffusion equations, two scenarios are considered. For the first one, best actuators are determined depending on a given initial condition. In the second scenario, optimal actuators are determined based on all initial conditions not exceeding a chosen norm. Shape and topological sensitivities of these cost functionals are determined. A numerical algorithm for optimal actuator design based on the sensitivities and a level-set method is presented. Numerical results support the proposed methodology.
Evaluating environmental variables that vary stochastically is the principal topic for designing better environmental management and restoration schemes. Both the upper and lower estimates of these variables, such as water quality indices and flood and drought water levels, are important and should be consistently evaluated within a unified mathematical framework. We propose a novel pair of Orlicz regrets to consistently bound the statistics of random variables both from below and above. Here, consistency indicates that the upper and lower bounds are evaluated with common coefficients and parameter values being different from some of the risk measures proposed thus far. Orlicz regrets can flexibly evaluate the statistics of random variables based on their tail behavior. The explicit linkage between Orlicz regrets and divergence risk measures was exploited to better comprehend them. We obtain sufficient conditions to pose the Orlicz regrets as well as divergence risk measures, and further provide gradient descent-type numerical algorithms to compute them. Finally, we apply the proposed mathematical framework to the statistical evaluation of 31-year water quality data as key environmental indicators in a Japanese river environment.
Knowledge graphs (KGs) of real-world facts about entities and their relationships are useful resources for a variety of natural language processing tasks. However, because knowledge graphs are typically incomplete, it is useful to perform knowledge graph completion or link prediction, i.e. predict whether a relationship not in the knowledge graph is likely to be true. This paper serves as a comprehensive survey of embedding models of entities and relationships for knowledge graph completion, summarizing up-to-date experimental results on standard benchmark datasets and pointing out potential future research directions.