For finite element (FE) analysis of no-insulation (NI) high-temperature superconducting (HTS) pancake coils, the high aspect ratio of the turn-to-turn contact layer (T2TCL) leads to meshing difficulties that force a choice between poor-quality mesh elements, which reduce solution accuracy, and a high number of degrees of freedom. We propose to mitigate this issue by collapsing the T2TCL volume into a surface and using a so-called thin shell approximation (TSA). Previously, two TSAs have been introduced, one solving the heat equation and the other for an $\vec{H}-\phi$ magnetodynamic formulation. In this work, we combine the magnetodynamic and thermal TSAs to create a coupled magneto-thermal TSA for three-dimensional FE analysis. Particular attention is paid to the detailed derivation of the coupling terms. In the context of NI HTS pancake coils, the TSA represents the electric and thermal contact resistance of the T2TCL. For the HTS coated conductor (CC) itself, an anisotropic homogenization is used to represent its multi-layered structure. In the axial and azimuthal directions, it resolves the current sharing between the HTS and the other layers of the CC. The coupled TSA formulation is verified against a reference model with a volumetric T2TCL. The coupled TSA is shown to significantly reduce the solution time as well as the manual effort required for high-quality meshes of the T2TCL. The implementation is open source and publicly available as a reference implementation.
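To sketch the idea in our own notation (the paper derives the exact coupling terms): collapsing a T2TCL of thickness $d$, electrical conductivity $\sigma_c$, and thermal conductivity $k_c$ onto a surface replaces the thin volume by jump conditions across that surface, e.g.
$$[\![\phi]\!] = \frac{d}{\sigma_c}\,\vec{j}\cdot\vec{n}, \qquad [\![T]\!] = \frac{d}{k_c}\,\vec{q}\cdot\vec{n},$$
and the Joule loss dissipated in the collapsed layer, $p = (\vec{j}\cdot\vec{n})\,[\![\phi]\!]$, enters the heat equation as a surface source; terms of this type are what the magneto-thermal coupling must represent.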
Large Language Models (LLMs) have proven effective at In-Context Learning (ICL), an ability that allows them to build predictors from labeled examples. Few studies have explored the interplay between ICL and specific properties of the functions it attempts to approximate. In our study, we use a formal framework to explore ICL and propose a new task of approximating functions with a varying number of minima. We implement a method for producing functions whose minima lie at prescribed inputs. We find that increasing the number of minima degrades ICL performance. At the same time, our evaluation shows that ICL outperforms a 2-layer Neural Network (2NN) model, and that ICL learns faster than the 2NN in all settings. We validate these findings through a set of few-shot experiments across various hyperparameter configurations.
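One simple way to construct target functions with prescribed minima (a sketch under our own assumptions, not necessarily the paper's construction) is f(x) = prod_i (x - m_i)^2, which is non-negative and vanishes exactly at the chosen points m_i:

```python
import numpy as np

def function_with_minima(minima):
    """Return f(x) = prod_i (x - m_i)^2: f >= 0 everywhere and f(m_i) = 0,
    so the prescribed points m_i are exactly the global minima."""
    minima = np.asarray(minima, dtype=float)
    def f(x):
        return np.prod((np.expand_dims(x, -1) - minima) ** 2, axis=-1)
    return f

# Sample (x, f(x)) pairs to use as in-context examples in the LLM prompt.
rng = np.random.default_rng(0)
f = function_with_minima([-2.0, 0.5, 3.0])        # three prescribed minima
xs = rng.uniform(-4.0, 4.0, size=32)
examples = list(zip(xs, f(xs)))                   # few-shot (input, label) pairs
```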
Due to the ever-increasing complexity of income tax laws in the United States, the number of US taxpayers filing their taxes using tax preparation software (henceforth, tax software) continues to increase. According to the U.S. Internal Revenue Service (IRS), in FY22 nearly 50% of taxpayers filed their individual income taxes using tax software. Given the legal consequences for a taxpayer of incorrectly filing taxes, ensuring the correctness of tax software is of paramount importance. Metamorphic testing has emerged as a leading solution for testing and debugging legal-critical tax software in the absence of correctness requirements and trustworthy datasets. The key idea behind metamorphic testing is to express the properties of a system in terms of the relationship between one input and its slightly metamorphosed twinned input. Extracting metamorphic properties from IRS tax publications is a tedious and time-consuming process. In response, this paper formulates the generation of metamorphic specifications as a translation task from properties extracted from tax documents, expressed in natural language, into a contrastive first-order-logic form. We perform a systematic analysis of the potential and limitations of in-context learning with Large Language Models (LLMs) for this task, and outline a research agenda towards automating the generation of metamorphic specifications for tax preparation software.
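To make the task concrete, here is a hypothetical example of such a property; the monotonicity relation and the `compute_tax` interface are illustrative assumptions, not extracted from an IRS publication:

```python
# Natural-language property (hypothetical): "Holding all other fields fixed,
# a higher income must not yield a lower tax."
# Contrastive first-order-logic form (sketch):
#   forall t, t': (t'.income >= t.income and t'.f = t.f for all other fields f)
#                 => tax(t') >= tax(t)

def check_income_monotonicity(compute_tax, base_return, delta=1000.0):
    """Metamorphic test: perturb one field, then compare the twinned outputs."""
    twin = dict(base_return, income=base_return["income"] + delta)
    return compute_tax(twin) >= compute_tax(base_return)

# Toy system under test; real tax software would be exercised the same way.
flat_tax = lambda r: 0.2 * max(r["income"] - r["deductions"], 0.0)
assert check_income_monotonicity(flat_tax, {"income": 50000.0, "deductions": 12000.0})
```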
Existing graph contrastive learning (GCL) techniques typically require two forward passes per instance to construct the contrastive loss, which is effective for capturing the low-frequency signals of node features. This dual-pass design has shown empirical success on homophilic graphs, but its effectiveness on heterophilic graphs, where directly connected nodes typically have different labels, is unknown. In addition, existing GCL approaches fail to provide strong performance guarantees. Coupled with the unpredictability of GCL approaches on heterophilic graphs, this limits their applicability in real-world contexts. A natural question then arises: can we design a GCL method that works for both homophilic and heterophilic graphs with a performance guarantee? To answer this question, we theoretically study the concentration property of features obtained by neighborhood aggregation on homophilic and heterophilic graphs, introduce a single-pass, augmentation-free graph contrastive learning loss based on this property, and provide performance guarantees for the minimizer of the loss on downstream tasks. As a direct consequence of our analysis, we implement the Single-Pass Graph Contrastive Learning method (SP-GCL). Empirically, on 14 benchmark datasets with varying degrees of homophily, the features learned by SP-GCL match or outperform existing strong baselines with significantly less computational overhead, demonstrating the usefulness of our findings in real-world settings.
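A minimal sketch of the single-pass idea follows; the positive-pair selection here is a nearest-neighbor stand-in for the concentration-based criterion, not SP-GCL's exact rule:

```python
import torch
import torch.nn.functional as F

def single_pass_gcl_loss(z, pos_idx, tau=0.5):
    """One-pass contrastive loss: positives come from the SAME embedding
    matrix z, so no second augmented view or second forward pass is needed.
    z: (N, d) node embeddings; pos_idx: (N, k) indices of positive nodes."""
    z = F.normalize(z, dim=1)
    sim = z @ z.t() / tau                                  # (N, N) logits
    log_prob = sim - torch.logsumexp(sim, dim=1, keepdim=True)
    return -log_prob.gather(1, pos_idx).mean()

# Positives chosen as nearest neighbors of neighborhood-aggregated features.
x_agg = torch.randn(100, 32)                               # aggregated features
pos_idx = (x_agg @ x_agg.t()).topk(4, dim=1).indices       # (N, 4) positives
loss = single_pass_gcl_loss(x_agg, pos_idx)
```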
We conduct a thorough study of different forms of horizontally explicit and vertically implicit (HEVI) time-integration strategies for the compressible Euler equations on spherical domains typical of nonhydrostatic global atmospheric applications. We compare the computational time and complexity of two nonlinear variants (NHEVI-GMRES and NHEVI-LU) and a linear variant (LHEVI). We report on the performance of these three variants for a number of additive Runge-Kutta methods ranging in order of accuracy from second through fifth, and confirm the expected order of accuracy of the HEVI methods for each time-integrator. To gauge the maximum usable time-step of each HEVI method, we run simulations of a nonhydrostatic baroclinic instability for 100 days and then use this time-step to compare the time-to-solution of each method. The results show that NHEVI-LU is 2x faster than NHEVI-GMRES, and LHEVI is 5x faster than NHEVI-LU, for the idealized cases tested. The baroclinic instability and inertia-gravity wave simulations indicate that the optimal choice of time-integrator is LHEVI with either a second- or third-order scheme, as both yield similar time-to-solution and relative L2 error at their maximum usable time-steps. In the future, we will report on whether these results hold for more complex problems using, e.g., real atmospheric data and/or a higher model top typical of space weather applications.
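For intuition, here is a first-order HEVI step on a toy linear splitting (our own illustration; the paper's schemes are higher-order additive Runge-Kutta methods):

```python
import numpy as np

def hevi_step(u, dt, A_h, A_v):
    """One first-order HEVI step for du/dt = A_h u + A_v u: the 'horizontal'
    term A_h is advanced explicitly (forward Euler) while the stiff
    'vertical' term A_v is advanced implicitly (backward Euler), so the
    stiff vertical dynamics do not limit the usable time-step."""
    rhs = u + dt * (A_h @ u)
    return np.linalg.solve(np.eye(len(u)) - dt * A_v, rhs)

# Toy splitting: slow advection-like horizontal term, fast vertical relaxation.
n = 8
A_h = 0.1 * (np.eye(n, k=1) - np.eye(n, k=-1))
A_v = -50.0 * np.eye(n)
u = np.ones(n)
for _ in range(100):
    u = hevi_step(u, dt=0.05, A_h=A_h, A_v=A_v)   # stable despite dt*|A_v| >> 1
```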
We study learning-based design of fair allocation mechanisms for divisible resources, using proportional fairness (PF) as a benchmark. The learning setting is a significant departure from the classic mechanism design literature in that we need to learn fair mechanisms solely from data. In particular, we consider the challenging problem of learning one-shot allocation mechanisms -- without the use of money -- that incentivize strategic agents to be truthful when reporting their valuations. It is well known that the mechanism that directly seeks to optimize PF is not incentive compatible, meaning that agents can potentially misreport their preferences to gain increased allocations. We introduce the notion of "exploitability" of a mechanism to measure the relative gain in utility from misreporting, and make the following contributions: (i) Using techniques inspired by the differentiable convex programming literature, we design a numerically efficient approach for computing the exploitability of the PF mechanism. This enables us to quantify the gap that must be bridged to approximate PF via incentive compatible mechanisms. (ii) We then modify the PF mechanism to introduce a trade-off between fairness and exploitability. By properly controlling this trade-off using data, we show that our proposed mechanism, ExPF-Net, provides a strong approximation to the PF mechanism while maintaining low exploitability. This mechanism, however, comes with a high computational cost. (iii) To address the computational challenge, we propose another mechanism, ExS-Net, which is end-to-end parameterized by a neural network. ExS-Net achieves similar (slightly inferior) performance with significantly faster training and inference. (iv) Extensive numerical simulations demonstrate the robustness and efficacy of the proposed mechanisms.
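For concreteness, the sketch below computes a PF (maximum Nash welfare) allocation for linear utilities and estimates exploitability by a brute-force scan over misreports; it is a crude stand-in for the differentiable convex programming approach described above.

```python
import numpy as np
from scipy.optimize import minimize

def pf_allocation(V):
    """PF allocation: maximize sum_i log(v_i . x_i) over x >= 0 with each
    good fully divided. V: (n_agents, n_goods) reported linear valuations."""
    n, m = V.shape
    obj = lambda x: -np.log((x.reshape(n, m) * V).sum(axis=1) + 1e-9).sum()
    cons = [{"type": "eq", "fun": lambda x, j=j: x.reshape(n, m)[:, j].sum() - 1.0}
            for j in range(m)]
    res = minimize(obj, np.full(n * m, 1.0 / n), bounds=[(0.0, 1.0)] * (n * m),
                   constraints=cons, method="SLSQP")
    return res.x.reshape(n, m)

def exploitability(V, agent=0, grid=21):
    """Largest TRUE-utility gain the agent obtains by misreporting, found by
    scanning two-good reports (w, 1 - w); brute force, for illustration only."""
    truthful = (pf_allocation(V)[agent] * V[agent]).sum()
    best = truthful
    for w in np.linspace(0.05, 0.95, grid):
        R = V.copy()
        R[agent] = [w, 1.0 - w]                    # candidate misreport
        best = max(best, (pf_allocation(R)[agent] * V[agent]).sum())
    return best - truthful

V = np.array([[0.8, 0.2], [0.3, 0.7]])             # true valuations (2 x 2)
print(exploitability(V))                           # nonnegative misreport gain
```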
Agent-based simulation, a powerful tool for analyzing complex systems, faces challenges when integrating geographic elements due to increased computational demands. This study introduces a series of 'agent-in-the-cell' agent-based models (ABMs) to simulate COVID-19 spread in a city, utilizing geographic features and real-world mobility data from SafeGraph. We depart from traditional aggregated transmission probabilities, focusing instead on direct person-to-person contact probabilities informed by physics-based transmission studies. Our approach addresses the resulting computational complexity through several strategies. Agents, termed 'meta-agents', are linked to specific home cells in a tessellation of the city. We explore various tessellations and agent densities, finding that Voronoi diagram tessellations based on specific street-network locations outperform Census Block Group tessellations in preserving dynamics. Additionally, a hybrid tessellation combining Voronoi diagrams and Census Block Groups proves effective with fewer meta-agents while maintaining an accurate representation of city dynamics. Our analysis covers U.S. cities of diverse sizes, offering insights into the effects of agent-count reduction, sensitivity metrics, and city-specific factors. We benchmark our model against an existing ABM, focusing on runtime and the implications of a reduced agent count. Key optimizations include meta-agent usage, advanced tessellation methods, and parallelization techniques. These findings contribute to the field of agent-based modeling, especially for scenarios requiring geographic specificity and high computational efficiency.
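The cell-assignment step is straightforward to sketch: a point's Voronoi cell is its nearest generator, so a k-d tree query maps every agent home to a cell. The seed locations and counts below are made up for illustration:

```python
import numpy as np
from scipy.spatial import cKDTree

rng = np.random.default_rng(1)
seeds = rng.uniform(0.0, 10.0, size=(50, 2))    # hypothetical street-network points
homes = rng.uniform(0.0, 10.0, size=(5000, 2))  # agent home coordinates

# Nearest seed == containing Voronoi cell, so no cell polygons are needed.
cell_of_agent = cKDTree(seeds).query(homes)[1]

# Collapse the agents of each cell into one weighted 'meta-agent'.
pop = np.bincount(cell_of_agent, minlength=len(seeds))
meta_agents = [{"cell": c, "weight": int(w)} for c, w in enumerate(pop) if w > 0]
```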
Large language models (LLMs) have significantly advanced the field of natural language processing (NLP), providing a highly useful, task-agnostic foundation for a wide range of applications. The great promise of LLMs as general task solvers has motivated people to extend their functionality far beyond that of a ``chatbot'', and to use them as assistants or even replacements for domain experts and tools in specific domains such as healthcare, finance, and education. However, directly applying LLMs to sophisticated problems in specific domains meets many hurdles, caused by the heterogeneity of domain data, the sophistication of domain knowledge, the uniqueness of domain objectives, and the diversity of constraints (e.g., various social norms, cultural conformity, religious beliefs, and ethical standards in the domain applications). To fill this gap, a rapidly growing body of research and practice has emerged in recent years on the domain specialization of LLMs, which calls for a comprehensive and systematic review to better summarize and guide this promising field. In this survey paper, we first propose a systematic taxonomy that categorizes LLM domain-specialization techniques based on the level of access to the LLM, and we summarize the framework for all the subcategories as well as their relations and differences to each other. We also present a comprehensive taxonomy of critical application domains that can benefit from specialized LLMs, discussing their practical significance and open challenges. Furthermore, we offer insights into the current research status and future trends in this area.
Graph Convolutional Networks (GCNs) have been widely applied in various fields due to their power in processing graph-structured data. Typical GCNs and their variants work under a homophily assumption (i.e., nodes with the same class are prone to connect to each other), while ignoring the heterophily that exists in many real-world networks (i.e., nodes with different classes tend to form edges). Existing methods deal with heterophily mainly by aggregating higher-order neighborhoods or combining intermediate representations, which introduces noise and irrelevant information into the result. These methods, however, do not change the propagation mechanism itself, which works under the homophily assumption and is a fundamental part of GCNs. This makes it difficult to distinguish the representations of nodes from different classes. To address this problem, we design a novel propagation mechanism that can automatically adapt the propagation and aggregation process according to the homophily or heterophily between node pairs. To adaptively learn the propagation process, we introduce two measurements of the homophily degree between node pairs, learned from topological and attribute information, respectively. We then incorporate the learnable homophily degree into the graph convolution framework, which is trained in an end-to-end scheme, enabling it to go beyond the homophily assumption. More importantly, we theoretically prove that our model can constrain the similarity of representations between nodes according to their homophily degree. Experiments on seven real-world datasets demonstrate that this new approach outperforms state-of-the-art methods under heterophily or low homophily, and achieves competitive performance under homophily.
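A minimal sketch of such a mechanism (our simplification, not the paper's exact layer): a learned per-edge homophily degree in [0, 1] decides whether a neighbor's message is added or subtracted.

```python
import torch
import torch.nn as nn

class AdaptivePropagation(nn.Module):
    """Homophily-aware propagation sketch: an attribute-based score h_uv in
    [0, 1] interpolates between positive (homophilic) and negative
    (heterophilic) aggregation of each neighbor."""
    def __init__(self, dim):
        super().__init__()
        self.score = nn.Linear(2 * dim, 1)     # learnable homophily degree
        self.lin = nn.Linear(dim, dim)

    def forward(self, x, edge_index):
        src, dst = edge_index                  # (E,) source / target indices
        h = torch.sigmoid(self.score(torch.cat([x[src], x[dst]], dim=1)))
        sign = 2.0 * h - 1.0                   # map to [-1, 1]: add vs. subtract
        out = torch.zeros_like(x)
        out.index_add_(0, dst, sign * x[src])  # signed neighbor aggregation
        return torch.relu(self.lin(x + out))

x = torch.randn(6, 16)
edge_index = torch.tensor([[0, 1, 2, 3, 4, 5],
                           [1, 0, 3, 2, 5, 4]])
y = AdaptivePropagation(16)(x, edge_index)
```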
High spectral dimensionality and the shortage of annotations make hyperspectral image (HSI) classification a challenging problem. Recent studies suggest that convolutional neural networks can learn discriminative spatial features, which play a paramount role in HSI interpretation. However, most of these methods ignore the distinctive spectral-spatial characteristics of hyperspectral data. In addition, a large amount of unlabeled data remains an unexploited gold mine for efficient data use. Therefore, we propose an integration of generative adversarial networks (GANs) and probabilistic graphical models for HSI classification. Specifically, we use a spectral-spatial generator and a discriminator to identify the land-cover categories of hyperspectral cubes. Moreover, to take advantage of the large amount of unlabeled data, we adopt a conditional random field to refine the preliminary classification results produced by the GAN. Experimental results on two commonly studied datasets demonstrate that the proposed framework achieves encouraging classification accuracy with a small amount of training data.
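As a sketch of the refinement step, here is iterated conditional modes with a Potts smoothness prior, a simple stand-in for the conditional random field described above:

```python
import numpy as np

def icm_refine(unary, n_iter=5, beta=1.0):
    """Refine per-pixel class scores: each pixel takes the class maximizing
    its unary score plus a reward `beta` for each of its four neighbors
    that currently carries the same label (Potts-style smoothing)."""
    H, W, C = unary.shape
    labels = unary.argmax(-1)                        # initial hard labels
    for _ in range(n_iter):
        agree = np.zeros((H, W, C))
        for dy, dx in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            shifted = np.roll(labels, (dy, dx), axis=(0, 1))
            agree += beta * (np.arange(C) == shifted[..., None])
        labels = (unary + agree).argmax(-1)
    return labels

# e.g., refine preliminary scores over 9 land-cover classes on a 64x64 tile.
refined = icm_refine(np.random.rand(64, 64, 9))
```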
Multi-relation Question Answering is a challenging task, as it requires elaborate analysis of the question and reasoning over multiple fact triples in a knowledge base. In this paper, we present a novel model, the Interpretable Reasoning Network, which employs an interpretable, hop-by-hop reasoning process for question answering. The model dynamically decides which part of the input question should be analyzed at each hop; predicts a relation that corresponds to the currently parsed results; utilizes the predicted relation to update the question representation and the state of the reasoning process; and then drives the next-hop reasoning. Experiments show that our model yields state-of-the-art results on two datasets. More interestingly, the model offers traceable and observable intermediate predictions for reasoning analysis and failure diagnosis.
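A compact sketch of such hop-by-hop control flow (our simplification with assumed module choices, not the paper's exact architecture):

```python
import torch
import torch.nn as nn

class HopByHopReasoner(nn.Module):
    """At every hop: attend over question words, predict a relation from the
    attended summary, then update the question state so later hops focus on
    the not-yet-analyzed part. The attention weights and per-hop relations
    form a traceable intermediate prediction at each step."""
    def __init__(self, dim, n_relations, n_hops=3):
        super().__init__()
        self.attn = nn.Linear(dim, 1)
        self.rel_clf = nn.Linear(dim, n_relations)
        self.rel_emb = nn.Embedding(n_relations, dim)
        self.update = nn.GRUCell(dim, dim)
        self.n_hops = n_hops

    def forward(self, words, state):
        trace = []
        for _ in range(self.n_hops):
            a = torch.softmax(self.attn(words + state).squeeze(-1), dim=-1)
            focus = (a.unsqueeze(-1) * words).sum(0, keepdim=True)
            rel = self.rel_clf(focus).argmax(-1)        # this hop's relation
            state = self.update(self.rel_emb(rel), state)
            trace.append((a.detach(), int(rel)))        # inspectable step
        return trace

words = torch.randn(7, 64)                              # encoded question words
trace = HopByHopReasoner(64, n_relations=20)(words, torch.zeros(1, 64))
```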