In this work, we revisit the marking decisions made in the standard adaptive finite element method (AFEM). Experience shows that a na\"{i}ve marking policy leads to inefficient use of computational resources for adaptive mesh refinement (AMR). Consequently, using AFEM in practice often involves ad-hoc or time-consuming offline parameter tuning to set appropriate parameters for the marking subroutine. To address these practical concerns, we recast AMR as a Markov decision process in which refinement parameters can be selected on-the-fly at run time, without the need for pre-tuning by expert users. In this new paradigm, the refinement parameters are also chosen adaptively via a marking policy that can be optimized using methods from reinforcement learning. We use the Poisson equation to demonstrate our techniques on $h$- and $hp$-refinement benchmark problems, and our experiments suggest that superior marking policies remain undiscovered for many classical AFEM applications. Furthermore, an unexpected observation from this work is that marking policies trained on one family of PDEs are sometimes robust enough to perform well on problems far outside the training family. For illustration, we show that a simple $hp$-refinement policy trained on 2D domains with only a single re-entrant corner can be deployed on far more complicated 2D domains, and even 3D domains, without significant performance loss. For reproduction and broader adoption, we accompany this work with an open-source implementation of our methods.
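As a concrete illustration of the marking step, the sketch below implements Dörfler (bulk) marking with the bulk parameter supplied by a policy at each refinement step instead of being fixed in advance; the `policy`, `observe`, and `refine` names are hypothetical placeholders introduced here for illustration and are not taken from the accompanying implementation.

    import numpy as np

    def dorfler_mark(eta, theta):
        """Return indices of elements whose squared error indicators account
        for at least a fraction theta of the total squared estimated error."""
        order = np.argsort(eta**2)[::-1]               # largest indicators first
        cumulative = np.cumsum(eta[order]**2)
        cutoff = np.searchsorted(cumulative, theta * cumulative[-1]) + 1
        return order[:cutoff]

    # In the reinforcement-learning view of AMR, theta is chosen per step by a
    # trained marking policy from an observation of the current state, e.g.:
    #   theta = policy(observe(mesh, eta))             # hypothetical callables
    #   mesh = refine(mesh, dorfler_mark(eta, theta))  # hypothetical refine routine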
In this paper, we consider several models of pattern recognition. Each model is represented as a composition of two operators: a recognizing operator and a decision rule. Algebraic operations are introduced on the recognizing operators, and by applying these operations a family of recognizing algorithms is constructed. An upper estimate is derived for the model, which guarantees the completeness of the extension.
Cross-validation is the standard approach for tuning parameter selection in many non-parametric regression problems. However, its use is less common in change-point regression, perhaps because its prediction-error-based criterion may appear to permit small spurious changes and hence be less well suited to estimating the number and locations of change-points. We show that, in fact, the problems of cross-validation with squared error loss are more severe and can lead to systematic under- or over-estimation of the number of change-points, and to highly suboptimal estimation of the mean function, in simple settings where changes are easily detectable. We propose two simple approaches to remedy these issues, the first involving the use of absolute error rather than squared error loss, and the second involving modifying the holdout sets used. For the latter, we provide conditions that permit consistent estimation of the number of change-points for a general change-point estimation procedure. We show that these conditions are satisfied for least squares estimation using new results on its performance when supplied with an incorrect number of change-points. Numerical experiments show that our new approaches are competitive with common change-point methods using classical tuning parameter choices when the error distributions are well specified, but can substantially outperform them in misspecified models. An implementation of our methodology is available in the R package crossvalidationCP on CRAN.
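As a minimal sketch of the first remedy, the code below selects the number of change-points by fitting least-squares piecewise-constant means (via a simple dynamic program) on the odd-indexed observations and scoring predictions of the even-indexed observations with absolute rather than squared error; the odd/even split and all function names are illustrative assumptions and do not reproduce the paper's modified holdout construction.

    import numpy as np

    def ls_changepoints(y, k):
        """Least-squares fit of a piecewise-constant mean with exactly k
        change-points via O(k n^2) dynamic programming; returns the indices
        at which new segments start."""
        n = len(y)
        s1 = np.concatenate(([0.0], np.cumsum(y)))
        s2 = np.concatenate(([0.0], np.cumsum(y**2)))
        def cost(i, j):  # squared-error cost of fitting y[i:j] by its mean
            return s2[j] - s2[i] - (s1[j] - s1[i])**2 / (j - i)
        C = np.full((k + 1, n + 1), np.inf)
        arg = np.zeros((k + 1, n + 1), dtype=int)
        C[0, 1:] = [cost(0, j) for j in range(1, n + 1)]
        for m in range(1, k + 1):
            for j in range(m + 1, n + 1):
                vals = [C[m - 1, t] + cost(t, j) for t in range(m, j)]
                best = int(np.argmin(vals))
                C[m, j], arg[m, j] = vals[best], m + best
        bounds, j = [], n
        for m in range(k, 0, -1):        # backtrack the optimal boundaries
            j = arg[m, j]
            bounds.append(j)
        return sorted(bounds)

    def piecewise_mean(y, bounds):
        """Fitted piecewise-constant mean implied by the segment boundaries."""
        fit, prev = np.empty(len(y)), 0
        for b in list(bounds) + [len(y)]:
            fit[prev:b] = y[prev:b].mean()
            prev = b
        return fit

    def cv_choose_k(y, k_max):
        """Train on odd-indexed points, validate on even-indexed points with
        absolute error, and return the k with the smallest validation loss."""
        train, test = y[1::2], y[0::2]
        losses = []
        for k in range(k_max + 1):
            fit = piecewise_mean(train, ls_changepoints(train, k))
            pred = fit[np.minimum(np.arange(len(test)), len(train) - 1)]
            losses.append(np.mean(np.abs(test - pred)))
        return int(np.argmin(losses))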
Autonomous Intelligent Agents are employed in many applications upon which the life and welfare of living beings and vital social functions may depend. Therefore, agents should be trustworthy. A priori certification techniques (i.e., techniques applied prior to a system's deployment) can be useful, but they are not sufficient for agents that evolve, and thus modify their epistemic and belief state, or for open Multi-Agent Systems, where heterogeneous agents can join or leave the system at any stage of its operation. In this paper, we propose, refine, and extend dynamic (runtime) logic-based self-checking techniques devised to ensure agents' trustworthy and ethical behaviour.
In this paper, we propose an adaptive finite element method for computing the first eigenpair of the $p$-Laplacian problem. We prove that, starting from a fine initial mesh, our proposed adaptive algorithm produces a sequence of discrete first eigenvalues converging to the first eigenvalue of the continuous problem, and that the distance, in the $W^{1,p}$-norm, between the discrete eigenfunctions and the set of normalized eigenfunctions associated with the first eigenvalue also tends to zero. Extensive numerical examples are provided to show the effectiveness and efficiency of the method.
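The adaptive algorithm follows the usual solve, estimate, mark, refine loop; the skeleton below only records that structure, with `solve`, `estimate`, `mark`, and `refine` standing in as hypothetical placeholders for a discrete $p$-Laplacian eigensolver, an a posteriori error estimator, a marking strategy, and a conforming refinement routine.

    def adaptive_first_eigenpair(mesh, p, tol, solve, estimate, mark, refine):
        """Skeleton of the adaptive loop; all four callables are placeholders."""
        while True:
            lam, u = solve(mesh, p)           # discrete first eigenpair on the current mesh
            eta = estimate(mesh, p, lam, u)   # elementwise a posteriori error indicators
            if sum(e**2 for e in eta) ** 0.5 < tol:
                return lam, u, mesh
            mesh = refine(mesh, mark(eta))    # refine the marked elements and iterate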
Independent Component Analysis (ICA) was introduced in the 1980s as a model for Blind Source Separation (BSS), which refers to the process of recovering the sources underlying a mixture of signals with little knowledge about the source signals or the mixing process. While there are many sophisticated algorithms for estimation, different methods have different shortcomings. In this paper, we develop a nonparametric score to adaptively pick the right algorithm for ICA with arbitrary Gaussian noise. The novelty of this score is that it assumes only a finite second moment of the data and uses the characteristic function to evaluate the quality of the estimated mixing matrix, without any knowledge of the parameters of the noise distribution. In addition, we propose new contrast functions and algorithms that enjoy the same fast computability as existing algorithms such as FASTICA and JADE, but that work in domains where the former may fail. While these too may have weaknesses, our proposed diagnostic, as shown by our simulations, can remedy them. Finally, we propose a theoretical framework to analyze the local and global convergence properties of our algorithms.
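For intuition about characteristic-function-based diagnostics, the sketch below measures how far recovered sources are from pairwise independence by comparing joint and product-of-marginal empirical characteristic functions on a small grid. This is a generic diagnostic written purely for illustration; it is not the paper's score, which is additionally designed to cope with arbitrary Gaussian noise of unknown parameters.

    import numpy as np

    def cf_independence_score(S, grid=np.linspace(-2.0, 2.0, 9)):
        """Average squared gap between the joint empirical characteristic
        function of each pair of recovered sources and the product of their
        marginal characteristic functions; S has shape (n_samples, n_sources)
        and the score is small when the recovered sources are nearly
        pairwise independent."""
        n, d = S.shape
        score = 0.0
        for a in range(d):
            for b in range(a + 1, d):
                for t1 in grid:
                    for t2 in grid:
                        joint = np.exp(1j * (t1 * S[:, a] + t2 * S[:, b])).mean()
                        marg = np.exp(1j * t1 * S[:, a]).mean() * np.exp(1j * t2 * S[:, b]).mean()
                        score += abs(joint - marg) ** 2
        return score / (d * (d - 1) / 2 * len(grid) ** 2)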
Accelerated life-tests (ALTs) are used for inferring lifetime characteristics of highly reliable products. In particular, step-stress ALTs increase the stress level to which the units under test are subjected at certain pre-fixed times, thus accelerating the product's wear and inducing its failure. In some cases, due to cost or product-nature constraints, continuous monitoring of devices is infeasible, so the units are inspected for failures at particular inspection time points. In such a setup, the ALT response is interval-censored. Furthermore, when a test unit fails, there is often more than one fatal cause of failure, known as competing risks. In this paper, we assume that all competing risks are independent and follow exponential distributions with scale parameters depending on the stress level. Under this setup, we present a family of robust estimators based on the density power divergence, which includes the classical maximum likelihood estimator (MLE) as a particular case. We derive asymptotic and robustness properties of the minimum density power divergence estimator (MDPDE), showing its consistency for large samples. Based on these MDPDEs, estimates of the lifetime characteristics of the product, as well as estimates of cause-specific lifetime characteristics, are then developed. Direct asymptotic, transformed, and bootstrap confidence intervals for the mean lifetime to failure, the reliability at a mission time, and distribution quantiles are proposed, and their performance is compared through Monte Carlo simulations. Moreover, the performance of the MDPDE family is examined through an extensive numerical study, and the methods of inference discussed here are illustrated with a real-data example concerning electronic devices.
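For a sense of how the density power divergence works, here is a minimal sketch of the MDPDE for the rate of a single uncensored exponential sample; the paper's estimators additionally account for interval censoring, competing risks, and stress-dependent parameters, and the function name and optimization bounds used here are illustrative assumptions.

    import numpy as np
    from scipy.optimize import minimize_scalar

    def mdpde_exponential_rate(x, alpha):
        """Minimum density power divergence estimate of the rate of an
        exponential sample x, with tuning parameter alpha > 0."""
        def objective(log_lam):
            lam = np.exp(log_lam)                  # optimize on log scale for positivity
            integral = lam**alpha / (1.0 + alpha)  # integral of f^(1+alpha) for Exp(lam)
            empirical = np.mean((lam * np.exp(-lam * x))**alpha)
            return integral - (1.0 + 1.0 / alpha) * empirical
        res = minimize_scalar(objective, bounds=(np.log(1e-8), np.log(1e8)), method="bounded")
        return float(np.exp(res.x))

As the tuning parameter alpha decreases toward zero, the objective approaches a shifted negative log-likelihood, so the estimate approaches the MLE; larger alpha downweights observations that the fitted model deems unlikely, which is the source of the robustness.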
In this paper, we combine the Smolyak technique for multi-dimensional interpolation with the Filon-Clenshaw-Curtis (FCC) rule for one-dimensional oscillatory integration to obtain a new Filon-Clenshaw-Curtis-Smolyak (FCCS) rule for oscillatory integrals with linear phase over the $d$-dimensional cube $[-1,1]^d$. By combining stability and convergence estimates for the FCC rule with error estimates for the Smolyak interpolation operator, we obtain an error estimate for the FCCS rule, consisting of a Smolyak-type error estimate multiplied by a factor that decays like $\mathcal{O}(k^{-\tilde{d}})$, where $k$ is the wavenumber and $\tilde{d}$ is the number of oscillatory dimensions. If all dimensions are oscillatory, a higher negative power of $k$ appears in the estimate. As an application, we consider the forward problem of uncertainty quantification (UQ) for a one-space-dimensional Helmholtz problem with wavenumber $k$ and a random heterogeneous refractive index depending in an affine way on $d$ i.i.d. uniform random variables. After applying a classical hybrid numerical-asymptotic approximation, expectations of functionals of the solution of this problem can be formulated as a sum of oscillatory integrals over $[-1,1]^d$, which we compute using the FCCS rule. We give numerical results for the FCCS rule and for the UQ algorithm showing that accuracy improves when both $k$ and the order of the rule increase. We also give results for dimension-adaptive sparse-grid FCCS quadrature showing its efficiency as the dimension increases.
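To illustrate the one-dimensional building block, the sketch below implements a simplified Filon-type rule for $\int_{-1}^{1} f(x)e^{ikx}\,dx$: it interpolates $f$ at Clenshaw-Curtis points and integrates the interpolant exactly against the oscillatory kernel using monomial moments. The FCC rule analysed in the paper works with Chebyshev moments and a stabilized recurrence, so this monomial version should only be taken as a sketch for moderate degrees.

    import numpy as np

    def filon_cc(f, k, n=16):
        """Simplified Filon-type rule for int_{-1}^{1} f(x) e^{ikx} dx:
        interpolate f at the n+1 Clenshaw-Curtis points, then integrate the
        interpolating polynomial exactly against e^{ikx}."""
        x = np.cos(np.pi * np.arange(n + 1) / n)   # Clenshaw-Curtis points
        coeffs = np.polyfit(x, f(x), n)[::-1]      # power-series coefficients, low to high
        # moments M_j = int_{-1}^{1} x^j e^{ikx} dx via integration by parts
        M = np.zeros(n + 1, dtype=complex)
        M[0] = 2.0 * np.sin(k) / k
        for j in range(1, n + 1):
            M[j] = (np.exp(1j * k) - (-1)**j * np.exp(-1j * k)) / (1j * k) - j / (1j * k) * M[j - 1]
        return np.dot(coeffs, M)

For example, filon_cc(lambda x: 1.0 / (2.0 + x), k=200.0) remains accurate as $k$ grows, whereas a standard (non-oscillatory) quadrature rule would need many points per oscillation.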
For losses that are potentially heavy-tailed, we consider the task of minimizing the sum of the loss mean and standard deviation, without trying to accurately estimate the variance. By modifying a technique for variance-free robust mean estimation to fit our problem setting, we derive a simple learning procedure that can be easily combined with standard gradient-based solvers and used in traditional machine learning workflows. Empirically, we verify that our proposed approach, despite its simplicity, performs as well as or better than even the best-performing candidates derived from alternative criteria such as CVaR or DRO risks, on a variety of datasets.
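For reference, the plug-in form of the criterion (empirical mean of the per-example losses plus their empirical standard deviation) can be written directly against a standard autograd-based solver, as in the sketch below; the paper's contribution is to replace this plug-in with a robust, variance-free surrogate, which is not reproduced here.

    import torch

    def mean_plus_std_objective(per_example_losses):
        """Naive plug-in criterion: empirical mean of the losses plus their
        empirical standard deviation (not the paper's robust surrogate)."""
        return per_example_losses.mean() + per_example_losses.std(unbiased=False)

    # Usage with any standard gradient-based solver, e.g. for a linear model:
    #   losses = torch.nn.functional.mse_loss(model(X), y, reduction="none")
    #   mean_plus_std_objective(losses.flatten()).backward()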
In large-scale systems, there are fundamental challenges when centralised techniques are used for task allocation. The number of interactions is limited by resource constraints such as those on computation, storage, and network communication. We can increase scalability by implementing the system as a distributed task-allocation system, sharing tasks across many agents. However, this also increases the resource cost of communication and synchronisation, and it is difficult to scale. In this paper, we present four algorithms to address these problems. In combination, these algorithms enable each agent to improve its task-allocation strategy through reinforcement learning, while adjusting how much it explores the system in response to how optimal it believes its current strategy to be, given its past experience. We focus on distributed agent systems where the agents' behaviours are constrained by resource-usage limits, restricting agents to local rather than system-wide knowledge. We evaluate these algorithms in a simulated environment where agents are given a task composed of multiple subtasks that must be allocated to other agents with differing capabilities, which then carry out those subtasks. We also simulate real-life system effects such as networking instability. Our solution is shown to solve the task-allocation problem to within 6.7% of the theoretical optimum for the system configurations considered. It provides 5x better performance recovery than approaches without knowledge retention when system connectivity is impacted, and it is tested on systems of up to 100 agents with less than a 9% impact on the algorithms' performance.
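As a single-agent illustration of the general mechanism (and not of the paper's four algorithms), the sketch below keeps a running value estimate for allocating each subtask type to each peer and scales its exploration rate by how much those estimates are still changing; the class and parameter names are assumptions introduced here.

    import random
    from collections import defaultdict

    class AllocatorAgent:
        """Illustrative sketch: value estimates per (subtask type, peer) pair
        are learned from observed rewards, and exploration shrinks as the
        estimates stop changing."""
        def __init__(self, peers, lr=0.1, eps_min=0.05):
            self.q = defaultdict(float)      # value of allocating a subtask type to a peer
            self.peers, self.lr, self.eps_min = peers, lr, eps_min
            self.recent_updates = 1.0        # moving average of update magnitudes

        def choose_peer(self, subtask_type):
            eps = max(self.eps_min, min(1.0, self.recent_updates))  # explore more while unsettled
            if random.random() < eps:
                return random.choice(self.peers)
            return max(self.peers, key=lambda p: self.q[(subtask_type, p)])

        def update(self, subtask_type, peer, reward):
            key = (subtask_type, peer)
            delta = reward - self.q[key]
            self.q[key] += self.lr * delta
            self.recent_updates = 0.9 * self.recent_updates + 0.1 * abs(delta)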
This paper does not describe a working system. Instead, it presents a single idea about representation which allows advances made by several different groups to be combined into an imaginary system called GLOM. The advances include transformers, neural fields, contrastive representation learning, distillation, and capsules. GLOM answers the question: How can a neural network with a fixed architecture parse an image into a part-whole hierarchy which has a different structure for each image? The idea is simply to use islands of identical vectors to represent the nodes in the parse tree. If GLOM can be made to work, it should significantly improve the interpretability of the representations produced by transformer-like systems when applied to vision or language.