This paper studies the discretization of a homogenization and dimension reduction model for the elastic deformation of microstructured thin plates, proposed by Hornung, Neukamm, and Vel\v{c}i\'c in 2014. In this model, a nonlinear bending energy is based on a homogenized quadratic form acting on the second fundamental form associated with the elastic deformation. Convergence is proven for a multi-affine finite element discretization of the involved three-dimensional microscopic cell problems and a discrete Kirchhoff triangle discretization of the two-dimensional isometry-constrained macroscopic problem. Finally, the convergence properties are numerically verified in selected test cases and qualitatively compared with deformation experiments for microstructured sheets of paper.
Seven years ago, researchers proposed a postprocessing method to equalize the error rates of a model across different demographic groups. The work launched hundreds of papers purporting to improve over the postprocessing baseline. We empirically evaluate these claims through thousands of model evaluations on several tabular datasets. We find that the fairness-accuracy Pareto frontier achieved by postprocessing contains all other methods we were feasibly able to evaluate. In doing so, we address two common methodological errors that have confounded previous observations. One relates to the comparison of methods with different unconstrained base models. The other concerns methods achieving different levels of constraint relaxation. At the heart of our study is a simple idea we call unprocessing that roughly corresponds to the inverse of postprocessing. Unprocessing allows for a direct comparison of methods using different underlying models and levels of relaxation.
Large language models (LLMs) can perform in-context learning (ICL) by conditioning on a few demonstrations of a new downstream task. However, this learning paradigm suffers from high instability, stemming from substantial variance induced by factors such as the input distribution of the selected examples, their ordering, and the prompt format. In this work, we demonstrate that even when all these factors are held constant, the random selection of examples still results in high variance. Consequently, we explore the informative ability of data examples by quantifying the Information Gain (IG) obtained in prediction after observing a given example candidate, and propose to sample the examples with maximum IG. Additionally, we identify the presence of template bias, which can lead to unfair evaluations of IG during sampling. To mitigate this bias, we introduce a Calibration Before Sampling strategy. Experimental results show that our proposed method yields an average relative improvement of 14.3% across six classification tasks using three LLMs.
This paper proposes a general optimization framework for rate splitting multiple access (RSMA) in beyond diagonal (BD) reconfigurable intelligent surface (RIS) assisted ultra-reliable low-latency communications (URLLC) systems. This framework can provide a suboptimal solution for a large family of optimization problems in which the objective and/or constraints are linear functions of the rates and/or energy efficiency (EE) of users. Using this framework, we show that RSMA and RIS can be mutually beneficial tools when the system is overloaded, i.e., when the number of users per cell is higher than the number of base station (BS) antennas. Additionally, we show that the benefits of RSMA increase when the packets are shorter and/or the reliability constraint is more stringent. Furthermore, we show that the RSMA benefits increase with the number of users per cell and decrease with the number of BS antennas. Finally, we show that RIS (either diagonal or BD) can substantially improve the system performance, and that BD-RIS outperforms regular RIS.
Case-based reasoning (CBR) as a methodology for problem-solving can use any appropriate computational technique. This position paper argues that CBR researchers have somewhat overlooked recent developments in deep learning and large language models (LLMs). The underlying technical developments that have enabled the recent breakthroughs in AI have strong synergies with CBR and could be used to provide a persistent memory for LLMs to make progress towards Artificial General Intelligence.
An executable binary typically contains a large number of machine instructions. Although the statistics of popular instructions are well known, the distribution of unpopular instructions has been relatively underexplored. Our findings show that an arbitrary group of binaries comes with both i) a similar distribution of common machine instructions, and ii) quite a few rarely appearing instructions (e.g., with fewer than five occurrences) apart from that distribution. Their infrequency may represent the signature of a code chunk or the footprint of a binary. In this work, we investigate such rare instructions with an in-depth analysis at the source level, classifying them into four categories.
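The rarity criterion above can be sketched in a few lines (a minimal illustration, not the authors' tooling; the function name, the five-occurrence threshold, and the instruction stream are hypothetical):

```python
from collections import Counter

def rare_instructions(mnemonics, threshold=5):
    """Return the mnemonics appearing fewer than `threshold` times."""
    counts = Counter(mnemonics)
    return {m for m, c in counts.items() if c < threshold}

# Toy disassembly stream: common instructions dominate, a few appear once.
stream = ["mov"] * 6 + ["add"] * 5 + ["xchg", "cpuid"]
print(sorted(rare_instructions(stream)))  # ['cpuid', 'xchg']
```

In practice the input would be the mnemonic stream produced by a disassembler over the whole binary, and the rare set serves as a fingerprint.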
The study of persistence rests largely on the result that any finitely-indexed persistence module of finite-dimensional vector spaces admits an interval decomposition -- that is, a decomposition as a direct sum of interval modules. This result fails if we replace vector spaces with modules over more general coefficient rings. For example, not every persistence module of finitely-generated free abelian groups admits an interval decomposition. Nevertheless, many interesting examples of such persistence modules have been empirically observed to decompose into intervals. Due to the prevalence of these modules in applied and theoretical settings, it is important to understand the conditions under which interval decomposition is possible. We provide a necessary and sufficient condition, and a polynomial-time algorithm to either (a) compute an interval decomposition of a persistence module of free abelian groups, or (b) certify that no such decomposition exists. This complements earlier work, which characterizes filtered topological spaces whose persistence diagrams are independent of the choice of ground field.
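For readers outside the area, the interval modules referenced above admit a short standard definition (textbook background, not taken from the abstract itself); for a persistence module indexed by $1 \le t \le n$ with coefficient ring $k$:

```latex
% The interval module I^{[b,d]}, for 1 <= b <= d <= n, is k inside the
% interval and 0 outside, with identity structure maps inside:
I^{[b,d]}_t =
  \begin{cases}
    k & b \le t \le d,\\
    0 & \text{otherwise,}
  \end{cases}
\qquad
I^{[b,d]}_{t \to t+1} =
  \begin{cases}
    \mathrm{id}_k & b \le t < d,\\
    0 & \text{otherwise.}
  \end{cases}
```

An interval decomposition expresses a persistence module as a direct sum of such $I^{[b,d]}$; over a field this always exists, which is exactly what fails for general coefficient rings such as $\mathbb{Z}$.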
This paper adopts a tool from computational topology, the Euler characteristic curve (ECC) of a sample, to perform one- and two-sample goodness of fit tests. We call our procedure TopoTests. The presented tests work for samples of arbitrary dimension, having comparable power to the state-of-the-art tests in the one-dimensional case. It is demonstrated that the type I error of TopoTests can be controlled and their type II error vanishes exponentially with increasing sample size. Extensive numerical simulations of TopoTests are conducted to demonstrate their power for samples of various sizes.
The execution of deep neural network (DNN) algorithms suffers from significant bottlenecks due to the separation of the processing and memory units in traditional computer systems. Emerging memristive computing systems introduce an in situ approach that overcomes this bottleneck. The non-volatility of memristive devices, however, may expose the DNN weights stored in memristive crossbars to potential theft attacks. Therefore, this paper proposes a two-dimensional permutation-based protection (TDPP) method that thwarts such attacks. We first introduce the underlying concept that motivates the TDPP method: permuting both the rows and columns of the DNN weight matrices. This contrasts with previous methods, which focused solely on permuting a single dimension of the weight matrices, either the rows or the columns. While it is possible for an adversary to access the matrix values, the original arrangement of rows and columns in the matrices remains concealed. As a result, a DNN model extracted from the accessed matrix values would fail to operate correctly. We consider two different memristive computing systems (designed for layer-by-layer and layer-parallel processing, respectively) and demonstrate how the TDPP method can be embedded into both. Finally, we present a security analysis. Our experiments demonstrate that TDPP can achieve effectiveness comparable to prior approaches, with a high level of security when appropriately parameterized. In addition, TDPP is more scalable than previous methods and results in reduced area and power overheads. The area and power are reduced by, respectively, 1218$\times$ and 2815$\times$ for the layer-by-layer system and by 178$\times$ and 203$\times$ for the layer-parallel system compared to prior works.
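The core idea of permuting both dimensions can be illustrated in a short NumPy sketch (a conceptual illustration only, not the authors' hardware implementation; matrix shape and random seed are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(0)
W = rng.standard_normal((4, 6))        # a DNN weight matrix

row_key = rng.permutation(W.shape[0])  # secret row-permutation key
col_key = rng.permutation(W.shape[1])  # secret column-permutation key

# What an attacker reading the crossbar would see: values intact,
# but the row/column arrangement scrambled by both keys.
W_stored = W[row_key][:, col_key]

# Only a holder of both keys can invert the two permutations.
inv_rows = np.argsort(row_key)
inv_cols = np.argsort(col_key)
W_recovered = W_stored[inv_rows][:, inv_cols]
assert np.allclose(W_recovered, W)
```

Permuting both dimensions means an adversary must guess two independent permutations, rather than the single one required to defeat row-only or column-only schemes.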
Accurate and interpretable prediction of future events in time-series data often requires capturing the representative patterns (referred to as states) underpinning the observed data. To this end, most existing studies focus on the representation and recognition of states but ignore the changing transitional relations among them. In this paper, we present the evolutionary state graph, a dynamic graph structure designed to systematically represent the evolving relations (edges) among states (nodes) over time. We analyze the dynamic graphs constructed from time-series data and show that changes in the graph structure (e.g., edges connecting certain state nodes) can inform the occurrence of events (i.e., time-series fluctuations). Inspired by this, we propose a novel graph neural network model, the Evolutionary State Graph Network (EvoNet), which encodes the evolutionary state graph for accurate and interpretable time-series event prediction. Specifically, EvoNet models both node-level (state-to-state) and graph-level (segment-to-segment) propagation, and captures node-graph (state-to-segment) interactions over time. Experimental results on five real-world datasets show that our approach not only achieves clear improvements over 11 baselines, but also provides more insight into explaining the results of event predictions.
We introduce a multi-task setup of identifying and classifying entities, relations, and coreference clusters in scientific articles. We create SciERC, a dataset that includes annotations for all three tasks, and develop a unified framework called Scientific Information Extractor (SciIE) with shared span representations. The multi-task setup reduces cascading errors between tasks and leverages cross-sentence relations through coreference links. Experiments show that our multi-task model outperforms previous models in scientific information extraction without using any domain-specific features. We further show that the framework supports construction of a scientific knowledge graph, which we use to analyze information in the scientific literature.