Computational fluid dynamics (CFD) simulations of a single-cylinder gasoline compression ignition engine are performed to investigate the impact of gasoline-ethanol blending on autoignition, nitrogen oxide (NOx), and soot emissions under low-load conditions. A four-component toluene primary reference fuel (TPRF) + ethanol (ETPRF) surrogate (with 10% ethanol by volume; E10) is employed to represent the test gasoline (RD5-87). A 3D engine CFD model employing finite-rate chemistry with a skeletal kinetic mechanism, adaptive mesh refinement (AMR), and hybrid method of moments (HMOM) is adopted to capture in-cylinder combustion and soot/NOx emissions. The engine CFD model is validated against experimental data for three gasoline-ethanol blends: E10, E30 and E100, with varying ethanol content by volume. Model validation is carried out for multiple start-of-injection (SOI) timings (-21, -27, -36, and -45 crank angle degrees after top-dead-center (aTDC)) with respect to in-cylinder pressure, heat release rate, combustion phasing, NOx and soot emissions. For late injection timings (-21 and -27oaTDC), E30 yields higher soot than E10; while the trend reverses for early injection cases (-36 and -45oaTDC). E100 yields the lowest amount of soot among all fuels irrespective of SOI timing. Further, E10 shows a non-monotonic trend in soot emissions with SOI timing: SOI-36>SOI-45>SOI-21>SOI-27, while soot emissions from E30 exhibit monotonic decrease with advancing SOI timing. NOx emissions from various fuels follow a trend of E10>E30>E100. NOx emissions increase as SOI timing is advanced for all fuels, with an anomaly for E10 and E100 where NOx decreases when SOI is advanced beyond -36oaTDC. Detailed analysis of the numerical results is performed to investigate the emission trends and elucidate the impact of chemical composition and physical properties on autoignition and emissions characteristics.
Federated Learning (FL) allows multiple privacy-sensitive applications to leverage their dataset for a global model construction without any disclosure of the information. One of those domains is healthcare, where groups of silos collaborate in order to generate a global predictor with improved accuracy and generalization. However, the inherent challenge lies in the high heterogeneity of medical data, necessitating sophisticated techniques for assessment and compensation. This paper presents a comprehensive exploration of the mathematical formalization and taxonomy of heterogeneity within FL environments, focusing on the intricacies of medical data. In particular, we address the evaluation and comparison of the most popular FL algorithms with respect to their ability to cope with quantity-based, feature and label distribution-based heterogeneity. The goal is to provide a quantitative evaluation of the impact of data heterogeneity in FL systems for healthcare networks as well as a guideline on FL algorithm selection. Our research extends beyond existing studies by benchmarking seven of the most common FL algorithms against the unique challenges posed by medical data use cases. The paper targets the prediction of the risk of stroke recurrence through a set of tabular clinical reports collected by different federated hospital silos: data heterogeneity frequently encountered in this scenario and its impact on FL performance are discussed.
We propose hardware-oriented models of intrinsic plasticity (IP) and synaptic plasticity (SP) for spiking randomly connected recursive neural network (RNN). Although the potential of RNNs for temporal data processing has been demonstrated, randomness of the network architecture often causes performance degradation. Self-organization mechanism using IP and SP can mitigate the degradation, therefore, we compile these functions in a spiking neuronal model. To implement the function of IP, a variable firing threshold is introduced to each excitatory neuron in the RNN that changes stepwise in accordance with its activity. We also define other thresholds for SP that synchronize with the firing threshold, which determine the direction of stepwise synaptic update that is executed on receiving a pre-synaptic spike. We demonstrate the effectiveness of our model through simulations of temporal data learning and anomaly detection with a spiking RNN using publicly available electrocardiograms. Considering hardware implementation, we employ discretized thresholds and synaptic weights and show that these parameters can be reduced to binary if the RNN architecture is appropriately designed. This contributes to minimization of the circuit of the neuronal system having IP and SP.
We consider a graph coloring algorithm that processes vertices in order taken uniformly at random and assigns colors to them using First-Fit strategy. We show that this algorithm uses, in expectation, at most $(\frac{1}{2} + o(1))\cdot \ln n \,/\, \ln\ln n$ different colors to color any forest with $n$ vertices. We also construct a family of forests that shows that this bound is best possible.
This study delves into the auto-ignition temperature of n-heptane and ethanol mixtures within a counterflow flame configuration under low strain rate, with a particular focus on the impact of ethanol blending on heat release rates. Employing the sensitivity analysis method inspired by Zurada's sensitivity approach for neural network, this study identifies the group of critical species influencing the heat release rate. Further analysis concentration change reveals the intricate interactions among these various radicals across different temperature zones. It is found that, in n-heptane dominant mixtures, inhibition of low-temperature chemistry (LTC) caused by additional ethanol, impacts heat release rate at high temperature zone through diffusion effect of specific radicals such as CH2O, C2H4, C3H6 and H2O2. For ethanol-dominant mixtures, an increase in heat release rate was observed with higher ethanol fraction. Further concentration change analysis elucidated it is primarily attributed to the decomposition of ethanol and its subsequent reactions. This research underscores the significance of incorporating both chemical kinetics and species diffusion effects when analyzing the counterflow configuration of complex fuel mixtures.
This work introduces a new method for selecting the number of components in finite mixture models (FMMs) using variational Bayes, inspired by the large-sample properties of the Evidence Lower Bound (ELBO) derived from mean-field (MF) variational approximation. Specifically, we establish matching upper and lower bounds for the ELBO without assuming conjugate priors, suggesting the consistency of model selection for FMMs based on maximizing the ELBO. As a by-product of our proof, we demonstrate that the MF approximation inherits the stable behavior (benefited from model singularity) of the posterior distribution, which tends to eliminate the extra components under model misspecification where the number of mixture components is over-specified. This stable behavior also leads to the $n^{-1/2}$ convergence rate for parameter estimation, up to a logarithmic factor, under this model overspecification. Empirical experiments are conducted to validate our theoretical findings and compare with other state-of-the-art methods for selecting the number of components in FMMs.
Multi-modal foundation models such as CLIP have showcased impressive zero-shot capabilities. However, their applicability in resource-constrained environments is limited due to their large number of parameters and high inference time. While existing approaches have scaled down the entire CLIP architecture, we focus on training smaller variants of the image encoder, which suffices for efficient zero-shot classification. The use of synthetic data has shown promise in distilling representations from larger teachers, resulting in strong few-shot and linear probe performance. However, we find that this approach surprisingly fails in true zero-shot settings when using contrastive losses. We identify the exploitation of spurious features as being responsible for poor generalization between synthetic and real data. However, by using the image feature-based L2 distillation loss, we mitigate these problems and train students that achieve zero-shot performance which on four domain-specific datasets is on-par with a ViT-B/32 teacher model trained on DataCompXL, while featuring up to 92% fewer parameters.
Software engineers develop, fine-tune, and deploy deep learning (DL) models using a variety of development frameworks and runtime environments. DL model converters move models between frameworks and to runtime environments. Conversion errors compromise model quality and disrupt deployment. However, the failure characteristics of DL model converters are unknown, adding risk when using DL interoperability technologies. This paper analyzes failures in DL model converters. We survey software engineers about DL interoperability tools, use cases, and pain points (N=92). Then, we characterize failures in model converters associated with the main interoperability tool, ONNX (N=200 issues in PyTorch and TensorFlow). Finally, we formulate and test two hypotheses about structural causes for the failures we studied. We find that the node conversion stage of a model converter accounts for ~75% of the defects and 33% of reported failure are related to semantically incorrect models. The cause of semantically incorrect models is elusive, but models with behaviour inconsistencies share operator sequences. Our results motivate future research on making DL interoperability software simpler to maintain, extend, and validate. Research into behavioural tolerances and architectural coverage metrics could be fruitful.
The success of AI models relies on the availability of large, diverse, and high-quality datasets, which can be challenging to obtain due to data scarcity, privacy concerns, and high costs. Synthetic data has emerged as a promising solution by generating artificial data that mimics real-world patterns. This paper provides an overview of synthetic data research, discussing its applications, challenges, and future directions. We present empirical evidence from prior art to demonstrate its effectiveness and highlight the importance of ensuring its factuality, fidelity, and unbiasedness. We emphasize the need for responsible use of synthetic data to build more powerful, inclusive, and trustworthy language models.
Deep neural networks have revolutionized many machine learning tasks in power systems, ranging from pattern recognition to signal processing. The data in these tasks is typically represented in Euclidean domains. Nevertheless, there is an increasing number of applications in power systems, where data are collected from non-Euclidean domains and represented as the graph-structured data with high dimensional features and interdependency among nodes. The complexity of graph-structured data has brought significant challenges to the existing deep neural networks defined in Euclidean domains. Recently, many studies on extending deep neural networks for graph-structured data in power systems have emerged. In this paper, a comprehensive overview of graph neural networks (GNNs) in power systems is proposed. Specifically, several classical paradigms of GNNs structures (e.g., graph convolutional networks, graph recurrent neural networks, graph attention networks, graph generative networks, spatial-temporal graph convolutional networks, and hybrid forms of GNNs) are summarized, and key applications in power systems such as fault diagnosis, power prediction, power flow calculation, and data generation are reviewed in detail. Furthermore, main issues and some research trends about the applications of GNNs in power systems are discussed.
We propose a novel attention gate (AG) model for medical imaging that automatically learns to focus on target structures of varying shapes and sizes. Models trained with AGs implicitly learn to suppress irrelevant regions in an input image while highlighting salient features useful for a specific task. This enables us to eliminate the necessity of using explicit external tissue/organ localisation modules of cascaded convolutional neural networks (CNNs). AGs can be easily integrated into standard CNN architectures such as the U-Net model with minimal computational overhead while increasing the model sensitivity and prediction accuracy. The proposed Attention U-Net architecture is evaluated on two large CT abdominal datasets for multi-class image segmentation. Experimental results show that AGs consistently improve the prediction performance of U-Net across different datasets and training sizes while preserving computational efficiency. The code for the proposed architecture is publicly available.