Kirigami belong to the larger class of mechanical metamaterials, which exhibit exotic properties. This article focuses on rhombi-slits, a specific type of kirigami. A nonlinear kinematics model was previously proposed in the form of a second-order divergence-form PDE with a possibly degenerate, sign-changing coefficient matrix. We first study the existence of solutions of this equation using the limiting absorption principle. We then propose a numerical method, based on adding a complex dissipation, to approximate the solutions. Finally, simulations are compared with experiments.
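As an illustrative sketch (the domain, right-hand side, and exact coefficient structure below are expository assumptions, not taken from the model itself), the equation and its complex-dissipation approximation can be written as
\[
-\nabla\cdot\big(A(x)\,\nabla u\big)=f \quad\text{in }\Omega,
\qquad
-\nabla\cdot\big((A(x)+i\varepsilon I)\,\nabla u_\varepsilon\big)=f \quad\text{in }\Omega,
\]
where $A(x)$ may degenerate and change sign, and the limiting absorption principle concerns the limit $u_\varepsilon \to u$ as $\varepsilon \downarrow 0$.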
Despite the growing interest in ML-guided EDA tools from RTL to GDSII, there are no standard datasets or prototypical learning tasks defined for the EDA problem domain. Experience from the computer vision community suggests that such datasets are crucial to spur further progress in ML for EDA. Here we describe our experience curating two large-scale, high-quality datasets for Verilog code generation and logic synthesis. The first, VeriGen, is a dataset of Verilog code collected from GitHub and Verilog textbooks. The second, OpenABC-D, is a large-scale, labeled dataset designed to aid ML for logic synthesis tasks. The dataset consists of 870,000 And-Inverter-Graphs (AIGs) produced from 1500 synthesis runs on a large number of open-source hardware projects. In this paper we will discuss challenges in curating, maintaining and growing the size and scale of these datasets. We will also touch upon questions of dataset quality and security, and the use of novel data augmentation tools that are tailored for the hardware domain.
Anomaly detection requires detecting abnormal samples in large unlabeled datasets. While progress in deep learning and the advent of foundation models have produced powerful unsupervised anomaly detection methods, their deployment in practice is often hindered by the lack of labeled data -- without it, the detection accuracy of an anomaly detector cannot be evaluated reliably. In this work, we propose a general-purpose framework for evaluating image-based anomaly detectors with synthetically generated validation data. Our method assumes access to a small support set of normal images, which are processed with a pre-trained diffusion model (our proposed method requires no training or fine-tuning) to produce synthetic anomalies. When mixed with normal samples from the support set, the synthetic anomalies create detection tasks that compose a validation framework for anomaly detection evaluation and model selection. In an extensive empirical study, ranging from natural images to industrial applications, we find that our synthetic validation framework selects the same models and hyper-parameters as selection with a ground-truth validation set. In addition, we find that prompts selected by our method for CLIP-based anomaly detection outperform all other prompt selection strategies and lead to the overall best detection accuracy, even on the challenging MVTec-AD dataset.
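To make the selection procedure concrete, here is a minimal sketch; `diffuse_to_anomaly` (the pre-trained diffusion editing step) and the candidate `detectors` are placeholder names assumed for illustration, not part of any released code.

```python
import numpy as np
from sklearn.metrics import roc_auc_score

def build_validation_set(support_images, diffuse_to_anomaly, n_anomalies=50, seed=0):
    """Mix normal support images with diffusion-generated synthetic anomalies."""
    rng = np.random.default_rng(seed)
    picks = rng.choice(len(support_images), size=n_anomalies, replace=True)
    synthetic = [diffuse_to_anomaly(support_images[i]) for i in picks]
    images = list(support_images) + synthetic
    labels = np.array([0] * len(support_images) + [1] * len(synthetic))  # 1 = anomaly
    return images, labels

def select_detector(detectors, images, labels):
    """Choose the detector whose scores best separate the synthetic validation set."""
    aucs = {name: roc_auc_score(labels, det.score(images)) for name, det in detectors.items()}
    return max(aucs, key=aucs.get), aucs
```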
Maximum likelihood estimation is widely used for statistical inference. In this study, we reformulate the h-likelihood proposed by Lee and Nelder in 1996, whose maximization yields maximum likelihood estimators for fixed parameters and asymptotically best unbiased predictors for random parameters. We establish the statistical foundations for h-likelihood theories, which extend classical likelihood theories to embrace broad classes of statistical models with random parameters. The maximum h-likelihood estimators asymptotically achieve the generalized Cramér-Rao lower bound. Furthermore, we explore the asymptotic theory when the consistency of either fixed parameter estimation or random parameter prediction is violated. The new h-likelihood framework enables likelihood theories to cover inference for a much broader class of models, while also providing computationally efficient fitting algorithms that give asymptotically optimal estimators for fixed parameters and predictors for random parameters.
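For reference (the notation below follows the standard Lee-Nelder formulation and is an assumption rather than the paper's exact definitions), for data $y$, random parameters $v$, and fixed parameters $\theta$, a standard form of the h-likelihood is the joint log-density
\[
h(\theta, v; y) \;=\; \log f_\theta(y \mid v) \;+\; \log f_\theta(v),
\]
whose joint maximization in $(\theta, v)$ yields the estimators for fixed parameters and predictors for random parameters referred to above.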
We consider two classes of natural stochastic processes on finite unlabeled graphs: Euclidean stochastic optimization algorithms on the adjacency matrices of weighted graphs, and a modified version of the Metropolis MCMC algorithm on stochastic block models over unweighted graphs. In both cases we show that, as the size of the graph goes to infinity, the random trajectories of the stochastic processes converge to deterministic curves on the space of measure-valued graphons. Measure-valued graphons, introduced by Lov\'{a}sz and Szegedy in \cite{lovasz2010decorated}, are a refinement of the concept of graphons that can distinguish between two infinite exchangeable arrays that give rise to the same graphon limit. We introduce new metrics on this space which provide a natural notion of convergence for our limit theorems; this notion is equivalent to the convergence of infinite exchangeable arrays. Under suitable assumptions and a specified time-scaling, the Metropolis chain admits a diffusion limit as the number of vertices goes to infinity. We then demonstrate that, in an appropriately formulated zero-noise limit, the stochastic process of adjacency matrices of this diffusion converges to a deterministic gradient flow curve on the space of graphons introduced in \cite{Oh2023}. A novel feature of this approach is that it provides a precise exponential convergence rate for the Metropolis chain in a certain limiting regime. To the best of our knowledge, the connection between a natural Metropolis chain commonly used in exponential random graph models and gradient flows on graphons is also new in the literature.
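For orientation, a vanilla Metropolis edge-flip chain on unweighted graphs looks as follows; the chain studied here is a modified version on stochastic block models, and `log_prob` below is a placeholder target (e.g. an ERGM-style density) assumed for illustration.

```python
import numpy as np

def metropolis_step(adj, log_prob, rng):
    """One Metropolis step: propose flipping a random edge, accept or reject."""
    n = adj.shape[0]
    i, j = rng.choice(n, size=2, replace=False)      # pick a random vertex pair
    proposal = adj.copy()
    proposal[i, j] = proposal[j, i] = 1 - adj[i, j]  # flip the edge
    if np.log(rng.uniform()) < log_prob(proposal) - log_prob(adj):
        return proposal                              # accept
    return adj                                       # reject
```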
Inverse problems are characterized by their inherent non-uniqueness and sensitivity to data perturbations. Their stable solution requires the application of regularization methods, including variational and iterative regularization methods. Superiorization is a heuristic approach that steers basic iterative algorithms toward small values of a given regularization functional while retaining their simplicity and computational cost, and is thereby able to account for additional prior information. In this note, we combine the superiorization methodology with iterative regularization methods and show that the superiorized version of the scheme again yields a regularization method, while accounting for the different prior information.
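A minimal sketch of a superiorized iteration, assuming a generic `basic_step` (one sweep of the underlying iterative method) and a differentiable regularization functional with gradient `grad_R`; both names are placeholders for illustration, not a specific published scheme.

```python
import numpy as np

def superiorize(x0, basic_step, grad_R, n_iter=100, a=0.5):
    """Perturb each iterate in a nonascending direction of R, then apply the basic step."""
    x = x0.copy()
    for k in range(n_iter):
        g = grad_R(x)
        norm = np.linalg.norm(g)
        v = -g / norm if norm > 0 else np.zeros_like(g)  # nonascending direction for R
        beta = a ** k                                    # summable perturbation sizes
        x = basic_step(x + beta * v)                     # perturbation does not spoil the basic method
    return x
```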
Diffusion models have demonstrated compelling generation quality by optimizing the variational lower bound through a simple denoising score matching loss. In this paper, we provide theoretical evidence that the prevailing practice of using a constant loss weight in diffusion models leads to biased estimation during the training phase. Simply optimizing the denoising network to predict Gaussian noise with constant weighting may hinder precise estimation of the original images. To address the issue, we propose an elegant and effective weighting strategy grounded in a theoretically unbiased principle. Moreover, we conduct a comprehensive and systematic exploration to dissect the inherent bias problem arising from the constant-weighting loss, examining its existence, impact, and causes. These analyses are expected to advance our understanding and demystify the inner workings of diffusion models. Through empirical evaluation, we demonstrate that our proposed debiased estimation method significantly enhances sample quality without relying on complex techniques, and exhibits improved efficiency compared to the baseline method in both the training and sampling processes.
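Concretely (written in standard DDPM notation as an assumption, not the paper's exact formulation), the training objective takes the form
\[
\mathcal{L}(\theta) \;=\; \mathbb{E}_{t,\,x_0,\,\epsilon}\Big[\, w(t)\, \big\| \epsilon_\theta\big(\sqrt{\bar\alpha_t}\, x_0 + \sqrt{1-\bar\alpha_t}\,\epsilon,\; t\big) - \epsilon \big\|^2 \Big],
\]
where the prevailing practice sets $w(t) \equiv 1$, whereas a debiased strategy replaces the constant by a time-dependent weight.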
Knowledge base construction entails acquiring structured information to create a knowledge base of factual and relational data, facilitating question answering, information retrieval, and semantic understanding. The challenge "Knowledge Base Construction from Pretrained Language Models" at the International Semantic Web Conference 2023 defines tasks focused on constructing a knowledge base using a language model. Our focus was on Track 1 of the challenge, where the model is constrained to a maximum of 1 billion parameters and the inclusion of entity descriptions within the prompt is prohibited. Although the masked language model offers sufficient flexibility to extend its vocabulary, it is not inherently designed for multi-token prediction. To address this, we present Vocabulary Expandable BERT for knowledge base construction, which expands the language model's vocabulary while preserving semantic embeddings for newly added words. We adopt task-specific re-pre-training of the masked language model to further enhance it. Through experimentation, the results show the effectiveness of our approaches. Our framework achieves an F1 score of 0.323 on the hidden test set and 0.362 on the validation set, both provided by the challenge. Notably, our framework adopts a lightweight language model (BERT-base, 0.13 billion parameters) and surpasses a model that prompts a large language model directly (ChatGPT-3, 175 billion parameters). Besides, Token-Recode achieves performance comparable to Re-pretrain. This research advances language understanding models by enabling the direct embedding of multi-token entities, signifying a substantial step forward in link prediction for knowledge graphs and metadata completion in data management.
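A minimal sketch of the vocabulary-expansion idea using the Hugging Face `transformers` API; the entity list and the initialization-by-averaging below are illustrative assumptions rather than our exact training recipe.

```python
import torch
from transformers import BertTokenizerFast, BertForMaskedLM

new_entities = ["Semantic_Web", "Knowledge_Base"]   # hypothetical entity surface forms

tokenizer = BertTokenizerFast.from_pretrained("bert-base-uncased")
model = BertForMaskedLM.from_pretrained("bert-base-uncased")

# Record each entity's sub-word ids *before* extending the vocabulary.
subword_ids = {e: tokenizer(e, add_special_tokens=False)["input_ids"] for e in new_entities}

tokenizer.add_tokens(new_entities)                  # one vocabulary item per entity
model.resize_token_embeddings(len(tokenizer))

with torch.no_grad():
    emb = model.get_input_embeddings().weight
    for e in new_entities:
        new_id = tokenizer.convert_tokens_to_ids(e)
        emb[new_id] = emb[subword_ids[e]].mean(dim=0)  # preserve semantics of the sub-word pieces
```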
This work aims to numerically construct exactly commuting matrices close to given almost commuting ones, which is equivalent to the joint approximate diagonalization problem. We first prove that almost commuting matrices generically have approximate common eigenvectors that are almost orthogonal to each other. Based on this key observation, we propose a fast and robust vector-wise joint diagonalization (VJD) algorithm, which constructs the orthogonal similarity transform by sequentially finding these approximate common eigenvectors. In doing so, we consider sub-optimization problems over the unit sphere, for which we present a Riemannian quasi-Newton method with rigorous convergence analysis. We also discuss the numerical stability of the proposed VJD algorithm. Numerical examples with applications in independent component analysis are provided to reveal the relation with Huaxin Lin's theorem and to demonstrate that our method compares favorably with the state-of-the-art Jacobi-type joint diagonalization algorithm.
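As a simplified illustration of the sub-problem on the unit sphere (using a generic quasi-Newton solver from SciPy instead of the Riemannian quasi-Newton method analyzed in the paper), one approximate common eigenvector of two nearly commuting symmetric matrices can be sketched as follows.

```python
import numpy as np
from scipy.optimize import minimize

def common_eigvec_residual(v, A, B):
    """Combined eigen-residual of a unit vector for both matrices."""
    v = v / np.linalg.norm(v)
    rA = A @ v - (v @ A @ v) * v
    rB = B @ v - (v @ B @ v) * v
    return rA @ rA + rB @ rB

def approximate_common_eigenvector(A, B, seed=0):
    """Minimize the residual over the (normalized) sphere from a random start."""
    rng = np.random.default_rng(seed)
    v0 = rng.standard_normal(A.shape[0])
    res = minimize(common_eigvec_residual, v0, args=(A, B), method="BFGS")
    return res.x / np.linalg.norm(res.x)
```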
We consider ordinal online problems, i.e., tasks that only require pairwise comparisons between elements of the input. Classic examples are the secretary problem and the game of googol, as well as their multiple combinatorial extensions such as $(J,K)$-secretary, the $2$-sided game of googol, and the ordinal-competitive matroid secretary problem. A natural approach to these tasks is to use ordinal algorithms that at each step only consider the relative ranking among the arrived elements, without looking at the numerical values of the input. We formally study the question of how much cardinal algorithms can improve upon ordinal algorithms. We first give a universal construction of the input distribution for any ordinal online problem, such that the advantage of any cardinal algorithm over the ordinal algorithms is at most $1+\varepsilon$ for arbitrarily small $\varepsilon> 0$. As an implication, previous lower bounds for the aforementioned variants of secretary problems hold not only against ordinal algorithms, but also against arbitrary online algorithms. However, the value range of the input elements in our construction is huge: $N=O\left(\frac{n^3\cdot n!\cdot n!}{\varepsilon}\right)\uparrow\uparrow(n-1)$ (a tower of exponents) for an input sequence of length $n$. As a second result, we identify a class of natural ordinal problems and find a cardinal algorithm with a matching advantage of $1+ \Omega\left(\frac{1}{\log^{(c)}N}\right)$, where $\log^{(c)}N=\log\ldots\log N$ with $c$ iterated logarithms and $c$ an arbitrary constant. Further, we introduce the cardinal complexity of any given ordinal online task: the minimum number $N(\varepsilon)$ of distinct numerical values in the input such that the advantage of cardinal over ordinal algorithms is at most $1+\varepsilon$. As a third result, we show that the game of googol has a much lower cardinal complexity of $N=O\left(\left(\frac{n}{\varepsilon}\right)^n\right)$.
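For intuition, the classical $1/e$-rule for the secretary problem is an ordinal algorithm in the above sense: it uses only comparisons among arrived elements, never their numerical values. A minimal sketch:

```python
import math

def secretary_ordinal(values):
    """Observe the first n/e elements, then stop at the first element beating them all."""
    n = len(values)
    cutoff = max(1, int(n / math.e))
    best_seen = max(values[:cutoff])   # only pairwise comparisons are needed
    for v in values[cutoff:]:
        if v > best_seen:
            return v
    return values[-1]                  # forced to take the last element otherwise
```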
Recently, graph neural networks (GNNs) have been gaining attention for simulating dynamical systems due to their inductive nature, which leads to zero-shot generalizability. Similarly, physics-informed inductive biases in deep-learning frameworks have been shown to give superior performance in learning the dynamics of physical systems. There is a growing volume of literature that attempts to combine these two approaches. Here, we evaluate the performance of thirteen different graph neural networks, namely Hamiltonian and Lagrangian graph neural networks, graph neural ODEs, and their variants with explicit constraints and different architectures. We briefly explain the theoretical formulation, highlighting the similarities and differences in the inductive biases and graph architectures of these systems. We evaluate these models on spring, pendulum, gravitational, and 3D deformable solid systems to compare their performance in terms of rollout error, conserved quantities such as energy and momentum, and generalizability to unseen system sizes. Our study demonstrates that GNNs with additional inductive biases, such as explicit constraints and the decoupling of kinetic and potential energies, exhibit significantly enhanced performance. Further, all the physics-informed GNNs exhibit zero-shot generalizability to system sizes an order of magnitude larger than the training systems, thus providing a promising route to simulating large-scale realistic systems.
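To illustrate the shared Hamiltonian inductive bias (with a plain MLP standing in for the graph neural network, as an assumption for brevity), the dynamics $\dot q = \partial H/\partial p$, $\dot p = -\partial H/\partial q$ can be obtained from a learned scalar $H$ via automatic differentiation:

```python
import torch

class HamiltonianNet(torch.nn.Module):
    """Learned scalar H(q, p); its gradients define energy-conserving dynamics."""
    def __init__(self, dim):
        super().__init__()
        self.H = torch.nn.Sequential(
            torch.nn.Linear(2 * dim, 64), torch.nn.Tanh(), torch.nn.Linear(64, 1)
        )

    def time_derivative(self, q, p):
        q = q.detach().requires_grad_(True)
        p = p.detach().requires_grad_(True)
        H = self.H(torch.cat([q, p], dim=-1)).sum()
        dHdq, dHdp = torch.autograd.grad(H, (q, p), create_graph=True)
        return dHdp, -dHdq   # (dq/dt, dp/dt)
```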