This paper introduces an extended tensor decomposition (XTD) method for model reduction. The proposed method is based on a sparse non-separated enrichment to the conventional tensor decomposition, which is expected to improve the approximation accuracy and the reducibility (compressibility) in highly nonlinear and singular cases. The proposed XTD method can be a powerful tool for solving nonlinear space-time parametric problems. The method has been successfully applied to parametric elastic-plastic problems and real time additive manufacturing residual stress predictions with uncertainty quantification. Furthermore, a combined XTD-SCA (self-consistent clustering analysis) strategy has been presented for multi-scale material modeling, which enables real time multi-scale multi-parametric simulations. The efficiency of the method is demonstrated with comparison to finite element analysis. The proposed method enables a novel framework for fast manufacturing and material design with uncertainties.
Advances in large language models (LLMs) have driven an explosion of interest about their societal impacts. Much of the discourse around how they will impact social equity has been cautionary or negative, focusing on questions like "how might LLMs be biased and how would we mitigate those biases?" This is a vital discussion: the ways in which AI generally, and LLMs specifically, can entrench biases have been well-documented. But equally vital, and much less discussed, is the more opportunity-focused counterpoint: "what promising applications do LLMs enable that could promote equity?" If LLMs are to enable a more equitable world, it is not enough just to play defense against their biases and failure modes. We must also go on offense, applying them positively to equity-enhancing use cases to increase opportunities for underserved groups and reduce societal discrimination. There are many choices which determine the impact of AI, and a fundamental choice very early in the pipeline is the problems we choose to apply it to. If we focus only later in the pipeline -- making LLMs marginally more fair as they facilitate use cases which intrinsically entrench power -- we will miss an important opportunity to guide them to equitable impacts. Here, we highlight the emerging potential of LLMs to promote equity by presenting four newly possible, promising research directions, while keeping risks and cautionary points in clear view.
This paper presents the first application of the direct parametrisation method for invariant manifolds to a fully coupled multiphysics problem involving the nonlinear vibrations of deformable structures subjected to an electrostatic field. The formulation proposed is intended for model order reduction of electrostatically actuated resonating Micro-Electro-Mechanical Systems (MEMS). The continuous problem is first rewritten in a manner that can be directly handled by the parametrisation method, which relies upon automated asymptotic expansions. A new mixed fully Lagrangian formulation is thus proposed which contains only explicit polynomial nonlinearities, which is then discretised in the framework of finite element procedures. Validation is performed on the classical parallel plate configuration, where different formulations using either the general framework, or an approximation of the electrostatic field due to the geometric configuration selected, are compared. Reduced-order models along these formulations are also compared to full-order simulations operated with a time integration approach. Numerical results show a remarkable performance both in terms of accuracy and wealth of nonlinear effects that can be accounted for. In particular, the transition from hardening to softening behaviour of the primary resonance while increasing the constant voltage component of the electric actuation, is recovered. Secondary resonances leading to superharmonic and parametric resonances are also investigated with the reduced-order model.
The generative paradigm has become increasingly important in machine learning and deep learning models. Among popular generative models are normalizing flows, which enable exact likelihood estimation by transforming a base distribution through diffeomorphic transformations. Extending the normalizing flow framework to handle time-indexed flows gave dynamic normalizing flows, a powerful tool to model time series, stochastic processes, and neural stochastic differential equations (SDEs). In this work, we propose a novel variant of dynamic normalizing flows, a Time Changed Normalizing Flow (TCNF), based on time deformation of a Brownian motion which constitutes a versatile and extensive family of Gaussian processes. This approach enables us to effectively model some SDEs, that cannot be modeled otherwise, including standard ones such as the well-known Ornstein-Uhlenbeck process, and generalizes prior methodologies, leading to improved results and better inference and prediction capability.
Most categorical models for dependent types have traditionally been heavily set based: contexts form a category, and for each we have a set of types in said context -- and for each type a set of terms of said type. This is the case for categories with families, categories with attributes, and natural models; in particular, all of them can be traced back to certain discrete Grothendieck fibrations. We extend this intuition to the case of general, non necessarily discrete, fibrations, so that over a given context one has not only a set but a category of types. We argue that the added structure can be attributed to a notion of subtyping that shares many features with that of coercive subtyping, in the sense that it is the product of thinking about subtyping as an abbreviation mechanism: we say that a given type $A'$ is a subtype of $A$ if there is a unique coercion from $A'$ to $A$. Whenever we need a term of type $A$, then, it suffices to have a term of type $A'$, which we can `plug-in' into $A$. For this version of subtyping we provide rules, coherences, and explicit models, and we compare and contrast it to coercive subtyping as introduced by Z. Luo and others. We conclude by suggesting how the tools we present can be employed in finding appropriate rules relating subtyping and certain type constructors.
High-order tensor methods for solving both convex and nonconvex optimization problems have generated significant research interest, leading to algorithms with optimal global rates of convergence and local rates that are faster than Newton's method. On each iteration, these methods require the unconstrained local minimization of a (potentially nonconvex) multivariate polynomial of degree higher than two, constructed using third-order (or higher) derivative information, and regularized by an appropriate power of regularization. Developing efficient techniques for solving such subproblems is an ongoing topic of research, and this paper addresses the case of the third-order tensor subproblem. We propose the CQR algorithmic framework, for minimizing a nonconvex Cubic multivariate polynomial with Quartic Regularisation, by minimizing a sequence of local quadratic models that incorporate simple cubic and quartic terms. The role of the cubic term is to crudely approximate local tensor information, while the quartic one controls model regularization and progress. We provide necessary and sufficient optimality conditions that fully characterise the global minimizers of these cubic-quartic models. We then turn these conditions into secular equations that can be solved using nonlinear eigenvalue techniques. We show, using our optimality characterisations, that a CQR algorithmic variant has the optimal-order evaluation complexity of $\mathcal{O}(\epsilon^{-3/2})$ when applied to minimizing our quartically-regularised cubic subproblem, which can be further improved in special cases. We propose practical CQR variants that use local tensor information to construct the local cubic-quartic models. We test these variants numerically and observe them to be competitive with ARC and other subproblem solvers on typical instances and even superior on ill-conditioned subproblems with special structure.
A high-order, degree-adaptive hybridizable discontinuous Galerkin (HDG) method is presented for two-fluid incompressible Stokes flows, with boundaries and interfaces described using NURBS. The NURBS curves are embedded in a fixed Cartesian grid, yielding an unfitted HDG scheme capable of treating the exact geometry of the boundaries/interfaces, circumventing the need for fitted, high-order, curved meshes. The framework of the NURBS-enhanced finite element method (NEFEM) is employed for accurate quadrature along immersed NURBS and in elements cut by NURBS curves. A Nitsche's formulation is used to enforce Dirichlet conditions on embedded surfaces, yielding unknowns only on the mesh skeleton as in standard HDG, without introducing any additional degree of freedom on non-matching boundaries/interfaces. The resulting unfitted HDG-NEFEM method combines non-conforming meshes, exact NURBS geometry and high-order approximations to provide high-fidelity results on coarse meshes, independent of the geometric features of the domain. Numerical examples illustrate the optimal accuracy and robustness of the method, even in the presence of badly cut cells or faces, and its suitability to simulate microfluidic systems from CAD geometries.
Disability insurance claims are often affected by lengthy reporting delays and adjudication processes. The classic multistate life insurance modeling framework is ill-suited to handle such information delays since the cash flow and available information can no longer be based on the biometric multistate process determining the contractual payments. We propose a new individual reserving model for disability insurance schemes which describes the claim evolution in real-time. Under suitable independence assumptions between the available information and the underlying biometric multistate process, we show that these new reserves may be calculated as natural modifications of the classic reserves. We propose suitable parametric estimators for the model constituents and a real data application shows the practical relevance of our concepts and results.
We present ResMLP, an architecture built entirely upon multi-layer perceptrons for image classification. It is a simple residual network that alternates (i) a linear layer in which image patches interact, independently and identically across channels, and (ii) a two-layer feed-forward network in which channels interact independently per patch. When trained with a modern training strategy using heavy data-augmentation and optionally distillation, it attains surprisingly good accuracy/complexity trade-offs on ImageNet. We will share our code based on the Timm library and pre-trained models.
Artificial neural networks thrive in solving the classification problem for a particular rigid task, acquiring knowledge through generalized learning behaviour from a distinct training phase. The resulting network resembles a static entity of knowledge, with endeavours to extend this knowledge without targeting the original task resulting in a catastrophic forgetting. Continual learning shifts this paradigm towards networks that can continually accumulate knowledge over different tasks without the need to retrain from scratch. We focus on task incremental classification, where tasks arrive sequentially and are delineated by clear boundaries. Our main contributions concern 1) a taxonomy and extensive overview of the state-of-the-art, 2) a novel framework to continually determine the stability-plasticity trade-off of the continual learner, 3) a comprehensive experimental comparison of 11 state-of-the-art continual learning methods and 4 baselines. We empirically scrutinize method strengths and weaknesses on three benchmarks, considering Tiny Imagenet and large-scale unbalanced iNaturalist and a sequence of recognition datasets. We study the influence of model capacity, weight decay and dropout regularization, and the order in which the tasks are presented, and qualitatively compare methods in terms of required memory, computation time, and storage.
Machine-learning models have demonstrated great success in learning complex patterns that enable them to make predictions about unobserved data. In addition to using models for prediction, the ability to interpret what a model has learned is receiving an increasing amount of attention. However, this increased focus has led to considerable confusion about the notion of interpretability. In particular, it is unclear how the wide array of proposed interpretation methods are related, and what common concepts can be used to evaluate them. We aim to address these concerns by defining interpretability in the context of machine learning and introducing the Predictive, Descriptive, Relevant (PDR) framework for discussing interpretations. The PDR framework provides three overarching desiderata for evaluation: predictive accuracy, descriptive accuracy and relevancy, with relevancy judged relative to a human audience. Moreover, to help manage the deluge of interpretation methods, we introduce a categorization of existing techniques into model-based and post-hoc categories, with sub-groups including sparsity, modularity and simulatability. To demonstrate how practitioners can use the PDR framework to evaluate and understand interpretations, we provide numerous real-world examples. These examples highlight the often under-appreciated role played by human audiences in discussions of interpretability. Finally, based on our framework, we discuss limitations of existing methods and directions for future work. We hope that this work will provide a common vocabulary that will make it easier for both practitioners and researchers to discuss and choose from the full range of interpretation methods.