This study presents a novel physics-enhanced machine learning (ML) and optimization framework tailored to address the challenges of designing intricate spinodal metamaterials with customized mechanical properties in scenarios where computational modeling is restricted and experimental data are sparse. By utilizing sparse experimental data directly, our approach facilitates the inverse design of spinodal structures with precise finite-strain mechanical responses. Leveraging physics-based inductive biases to compensate for limited data availability, the framework sheds light on instability-induced pattern formation in periodic metamaterials, attributing it to nonconvex energetic potentials. Inspired by phase transformation modeling, the method integrates multiple partial input convex neural networks to create nonconvex potentials, effectively capturing complex nonlinear stress-strain behavior, even under extreme deformations.
We present MIPS, a novel method for program synthesis based on automated mechanistic interpretability of neural networks trained to perform the desired task, auto-distilling the learned algorithm into Python code. We test MIPS on a benchmark of 62 algorithmic tasks that can be learned by an RNN and find it highly complementary to GPT-4: MIPS solves 32 of them, including 13 that are not solved by GPT-4 (which also solves 30). MIPS uses an integer autoencoder to convert the RNN into a finite state machine, then applies Boolean or integer symbolic regression to capture the learned algorithm. As opposed to large language models, this program synthesis technique makes no use of (and is therefore not limited by) human training data such as algorithms and code from GitHub. We discuss opportunities and challenges for scaling up this approach to make machine-learned models more interpretable and trustworthy.
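As a toy illustration of what distilling a finite state machine into code can look like, the sketch below compares a transition table with a closed-form Boolean update. The task (running parity of a bit sequence), the table, and all function names are hypothetical choices made for illustration; they are not MIPS output and do not reproduce its autoencoder or symbolic regression steps.

```python
from itertools import product

# Hypothetical transition table of a two-state machine:
# (current_state, input_bit) -> next_state
TRANSITIONS = {(0, 0): 0, (0, 1): 1, (1, 0): 1, (1, 1): 0}


def run_fsm(bits):
    """Run the table-driven finite state machine and return the final state."""
    state = 0
    for b in bits:
        state = TRANSITIONS[(state, b)]
    return state


def run_formula(bits):
    """Same machine, with the table replaced by a Boolean expression (XOR)."""
    state = 0
    for b in bits:
        state = state ^ b
    return state


# Exhaustively check that both versions agree on all sequences up to length 5.
agree = all(
    run_fsm(seq) == run_formula(seq)
    for n in range(6)
    for seq in product((0, 1), repeat=n)
)
```

The point of the comparison is that a small lookup table can sometimes be summarized by a short symbolic expression, which is easier to read and verify than the table itself.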
This study introduces a novel machine learning framework, integrating domain knowledge, to accurately predict the bearing capacity of concrete-filled steel tubes (CFSTs), bridging the gap between traditional engineering and machine learning techniques. Utilizing a comprehensive database of 2621 experimental data points on CFSTs, we developed a Domain Knowledge Enhanced Neural Network (DKNN) model. This model incorporates advanced feature engineering techniques, including Pearson correlation, XGBoost, and Random tree algorithms. The DKNN model demonstrated a marked improvement in prediction accuracy, with a Mean Absolute Percentage Error (MAPE) reduction of over 50% compared to existing models. Its robustness was confirmed through extensive performance assessments, maintaining high accuracy even in noisy environments. Furthermore, sensitivity and SHAP analyses were conducted to assess the contribution of each effective parameter to axial load capacity and to propose design recommendations for the cross-section diameter, material strength range, and material combination. This research advances CFST predictive modelling, showcasing the potential of integrating machine learning with domain expertise in structural engineering. The DKNN model sets a new benchmark for accuracy and reliability in the field.
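For readers unfamiliar with the metric, the following minimal sketch shows how MAPE and a relative MAPE reduction are computed. The capacity values and the names `measured`, `baseline`, and `improved` are invented for illustration and are not taken from the study's database.

```python
def mape(actual, predicted):
    """Mean Absolute Percentage Error, returned in percent."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    total = sum(abs((a - p) / a) for a, p in zip(actual, predicted))
    return 100.0 * total / len(actual)


# Hypothetical axial capacities in kN (illustrative numbers only).
measured = [1000.0, 2000.0, 4000.0]
baseline = [1200.0, 1700.0, 4400.0]   # stand-in for an existing model
improved = [1100.0, 1900.0, 4200.0]   # stand-in for a better model

mape_baseline = mape(measured, baseline)
mape_improved = mape(measured, improved)

# Relative reduction of the error metric, as a fraction of the baseline.
reduction = 1.0 - mape_improved / mape_baseline
```

A reduction above 0.5 in this sense corresponds to the kind of statement made in the abstract, namely that the error metric is more than halved relative to a reference model.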
The challenge in biomarker discovery using machine learning from omics data lies in the abundance of molecular features but scarcity of samples. Most feature selection methods in machine learning require evaluating various sets of features (models) to determine the most effective combination. This process, typically conducted using a validation dataset, involves testing different feature sets to optimize the model's performance. Each evaluation carries performance estimation error, and when the selection involves many models, the performance of the best ones is almost certainly overestimated. Biomarker identification with feature selection methods can be addressed as a multi-objective problem with trade-offs between predictive ability and parsimony in the number of features. Genetic algorithms are a popular tool for multi-objective optimization, but because they evolve numerous solutions they are prone to overestimation. Methods have been proposed to reduce the overestimation after a model has already been selected in single-objective problems, but no existing algorithm could reduce the overestimation during the optimization, improve model selection, or be applied in the more general multi-objective domain. We propose DOSA-MO, a novel multi-objective optimization wrapper algorithm that learns how the original estimation, its variance, and the feature set size of the solutions predict the overestimation. DOSA-MO adjusts the expectation of the performance during the optimization, improving the composition of the solution set. We verify that DOSA-MO improves the performance of a state-of-the-art genetic algorithm on left-out or external sample sets, when predicting cancer subtypes and/or patient overall survival, using three transcriptomics datasets for kidney and breast cancer.
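The overestimation effect described above can be reproduced with a small simulation. In this sketch, every candidate model has the same true accuracy, yet the best validation score grows with the number of candidates compared. The function name, the true accuracy of 0.6, and the sample sizes are assumptions chosen for illustration; the code does not implement DOSA-MO.

```python
import random


def best_validation_score(n_models, n_samples, true_acc, rng):
    """Validation accuracy of the best of n_models equally good models."""
    scores = []
    for _ in range(n_models):
        # Each validation sample is classified correctly with prob. true_acc.
        correct = sum(rng.random() < true_acc for _ in range(n_samples))
        scores.append(correct / n_samples)
    return max(scores)


def average_best(n_models, repeats=200, n_samples=50, true_acc=0.6, seed=0):
    """Average the selected (best) score over repeated experiments."""
    rng = random.Random(seed)
    total = sum(
        best_validation_score(n_models, n_samples, true_acc, rng)
        for _ in range(repeats)
    )
    return total / repeats


best_of_1 = average_best(1)      # no selection: close to the true accuracy
best_of_100 = average_best(100)  # selection among 100: optimistic estimate
```

With a single model the average score stays near the true accuracy, while picking the maximum over 100 models yields a clearly inflated estimate, even though no model is actually better than another.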
This paper investigates the impact of multiscale data on machine learning algorithms, particularly in the context of deep learning. A dataset is multiscale if its distribution shows large variations in scale across different directions. This paper reveals multiscale structures in the loss landscape, including its gradients and Hessians inherited from the data. Correspondingly, it introduces a novel gradient descent approach, drawing inspiration from multiscale algorithms used in scientific computing. This approach seeks to transcend empirical learning rate selection, offering a more systematic, data-informed strategy to enhance training efficiency, especially in the later stages.
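To see why scale separation complicates the choice of a learning rate, consider a toy quadratic loss whose curvature differs by a factor of 100 between two directions. The sketch below is only an illustration under that assumption; the curvature values and the per-direction step sizes are hypothetical and do not reproduce the paper's algorithm.

```python
def gradient_descent(x, curvatures, rates, steps):
    """Minimize f(x) = 0.5 * sum(c_i * x_i**2) with per-coordinate rates."""
    for _ in range(steps):
        # The gradient of f along coordinate i is c_i * x_i.
        x = [xi - r * c * xi for xi, c, r in zip(x, curvatures, rates)]
    return x


curvatures = [1.0, 100.0]   # two directions with very different scales
x0 = [1.0, 1.0]

# A single learning rate must stay below 2 / 100 for stability, so the
# low-curvature direction converges slowly.
single_rate = gradient_descent(x0, curvatures, [0.01, 0.01], steps=100)

# Scale-aware rates 1 / c_i solve this quadratic problem in one step.
scale_aware = gradient_descent(
    x0, curvatures, [1.0 / c for c in curvatures], steps=1
)
```

After 100 steps with the single rate, the stiff direction has converged but the flat direction still retains roughly a third of its initial error, which is the kind of late-stage slowdown that data-informed step sizes aim to avoid.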
Based on the expectile loss function and the adaptive LASSO penalty, the paper proposes and studies the estimation methods for the accelerated failure time (AFT) model. In this approach, we need to estimate the survival function of the censoring variable by the Kaplan-Meier estimator. The AFT model parameters are first estimated by the expectile method and afterwards, when the number of explanatory variables can be large, by the adaptive LASSO expectile method which directly carries out the automatic selection of variables. We also obtain the convergence rate and asymptotic normality for the two estimators, while showing the sparsity property for the censored adaptive LASSO expectile estimator. A numerical study using Monte Carlo simulations confirms the theoretical results and demonstrates the competitive performance of the two proposed estimators. The usefulness of these estimators is illustrated by applying them to three survival data sets.
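For reference, the standard expectile (asymmetric squared) loss weights positive and negative residuals differently. The sketch below uses the usual definition, rho_tau(u) = |tau - 1{u < 0}| * u^2; the function name is a hypothetical choice, and the code omits the censoring weights and the adaptive LASSO penalty used in the paper.

```python
def expectile_loss(residual, tau):
    """Asymmetric squared loss: |tau - 1{residual < 0}| * residual**2."""
    if not 0.0 < tau < 1.0:
        raise ValueError("tau must lie strictly between 0 and 1")
    weight = (1.0 - tau) if residual < 0 else tau
    return weight * residual ** 2


# With tau = 0.5 the loss is symmetric (half the squared error).
symmetric = expectile_loss(2.0, 0.5)

# With tau = 0.8, positive residuals are penalized four times as heavily
# as negative residuals of the same magnitude.
positive = expectile_loss(2.0, 0.8)
negative = expectile_loss(-2.0, 0.8)
```

Because the loss is differentiable everywhere, unlike the quantile check loss, it is often convenient for deriving asymptotic results of the kind mentioned in the abstract.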
When applying Hamiltonian operator splitting methods for the time integration of multi-species Vlasov-Maxwell-Landau systems, the reliable and efficient numerical approximation of the Landau equation represents a fundamental component of the entire algorithm. Substantial computational issues arise from the treatment of the physically most relevant three-dimensional case with Coulomb interaction. This work is concerned with the introduction and numerical comparison of novel approaches for the evaluation of the Landau collision operator. In the spirit of collocation, common tools are the identification of fundamental integrals, series expansions of the integral kernel and the density function on the main part of the velocity domain, and interpolation as well as quadrature approximation near the singularity of the kernel. Focusing on the favourable choice of the Fourier spectral method, the practical implementation of these approaches uses the reduction to basic integrals, fast Fourier techniques, and summations along certain directions. Moreover, an important observation is that a significant percentage of the overall computational effort can be transferred to precomputations which are independent of the density function. For the purpose of exposition and numerical validation, the cases of constant, regular, and singular integral kernels are distinguished, and the procedure is adapted accordingly to the increasing complexity of the problem. With regard to the time integration of the Landau equation, the most expedient approach is applied in such a manner that the conservation of mass is ensured.
We study variation in policing outcomes attributable to differential policing practices in New York City (NYC) using geographic regression discontinuity designs (GeoRDDs). By focusing on small geographic windows near police precinct boundaries we can estimate local average treatment effects of police precincts on arrest rates. We propose estimands and develop estimators for the GeoRDD when the data come from a spatial point process. Additionally, standard GeoRDDs rely on continuity assumptions of the potential outcome surface or a local randomization assumption within a window around the boundary. These assumptions, however, can easily be violated in realistic applications. We develop a novel and robust approach to testing whether there are differences in policing outcomes that are caused by differences in police precincts across NYC. Importantly, this approach is applicable to standard regression discontinuity designs with both numeric and point process data. This approach is robust to violations of traditional assumptions made, and is valid under weaker assumptions. We use a unique form of resampling to provide a valid estimate of our test statistic's null distribution even under violations of standard assumptions. This procedure gives substantially different results in the analysis of NYC arrest rates than those that rely on standard assumptions.
Autonomous vehicles rely on accurate trajectory prediction to inform decision-making processes related to navigation and collision avoidance. However, current trajectory prediction models show signs of overfitting, which may lead to unsafe or suboptimal behavior. To address these challenges, this paper presents a comprehensive framework that categorizes and assesses the definitions and strategies used in the literature on evaluating and improving the robustness of trajectory prediction models. This involves a detailed exploration of various approaches, including data slicing methods, perturbation techniques, model architecture changes, and post-training adjustments. In the literature, we see many promising methods for increasing robustness, which are necessary for safe and reliable autonomous driving.
This study introduces a two-scale Graph Neural Operator (GNO), namely, LatticeGraphNet (LGN), designed as a surrogate model for costly nonlinear finite-element simulations of three-dimensional latticed parts and structures. LGN has two networks: LGN-i, learning the reduced dynamics of lattices, and LGN-ii, learning the mapping from the reduced representation onto the tetrahedral mesh. LGN can predict deformation for arbitrary lattices, hence the name operator. Our approach significantly reduces inference time while maintaining high accuracy for unseen simulations, establishing the use of GNOs as efficient surrogate models for evaluating mechanical responses of lattices and structures.
The present article introduces, mathematically analyzes, and numerically validates a new weak Galerkin (WG) mixed-FEM based on Banach spaces for the stationary Navier--Stokes equations in pseudostress-velocity formulation. More precisely, a modified pseudostress tensor, called $\boldsymbol{\sigma}$, depending on the pressure and on the diffusive and convective terms, is introduced in the proposed technique, and a dual-mixed variational formulation is derived in which the aforementioned pseudostress tensor and the velocity are the main unknowns of the system, whereas the pressure is computed via a post-processing formula. Thus, it is sufficient to provide a WG space for the tensor variable and a space of piecewise polynomial vectors of total degree at most $k$ for the velocity. Moreover, in order to define the weak discrete bilinear form, whose continuous version involves the classical divergence operator, the weak divergence operator is proposed as a well-known alternative to the classical divergence operator in a suitable discrete subspace. The well-posedness of the numerical solution is proven using a fixed-point approach and the discrete versions of the Babu\v{s}ka-Brezzi theory and the Banach-Ne\v{c}as-Babu\v{s}ka theorem. Additionally, an a priori error estimate is derived for the proposed method. Finally, several numerical results illustrating the method's good performance and confirming the theoretical rates of convergence are presented.