Many scientific and engineering applications require fitting regression models that are nonlinear in the parameters. Advances in computer hardware and software in recent decades have made it easier to fit such models. Relative to fitting regression models that are linear in the parameters, however, fitting nonlinear regression models is more complicated. In particular, software like the $\texttt{nls}$ R function requires care in how the model is parameterized and how initial values are chosen for the maximum likelihood iterations. Special diagnostics are often needed to detect identifiability problems that can arise in such model fitting and to suggest approaches for dealing with them. When using Bayesian inference, there is the added complication of having to specify (often noninformative or weakly informative) prior distributions. Generally, the details of these tasks must be determined for each new nonlinear regression model. This paper provides a step-by-step procedure for specifying these details for any appropriate nonlinear regression model. Following the procedure results in a numerically robust algorithm for fitting the nonlinear regression model. We illustrate the methods with three different nonlinear models that are used in the analysis of experimental fatigue data, and we include two detailed numerical examples.
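To make the starting-value concern concrete, here is a minimal Python sketch; the abstract names R's $\texttt{nls}$, and `scipy.optimize.curve_fit` plays the analogous role. The exponential-decay model and the data are hypothetical illustrations, not one of the paper's fatigue models.

```python
import numpy as np
from scipy.optimize import curve_fit

# Hypothetical nonlinear model y = a * exp(-b * x) + c, used only to show
# why data-derived starting values matter for iterative nonlinear fitting.
def model(x, a, b, c):
    return a * np.exp(-b * x) + c

rng = np.random.default_rng(0)
x = np.linspace(0.0, 10.0, 50)
y = model(x, 2.5, 0.8, 0.3) + rng.normal(scale=0.05, size=x.size)

# Rough, data-derived starting values keep the iterations numerically stable;
# a poor choice (e.g., b far from the truth) can stall or diverge.
p0 = [y.max() - y.min(), 1.0, y.min()]
params, cov = curve_fit(model, x, y, p0=p0)
print(params)  # estimates close to (2.5, 0.8, 0.3)
```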
Spatial join processing techniques that identify intersections between complex geometries (e.g., polygons) commonly follow a two-step filter-and-refine pipeline; the filter step evaluates the query predicate on the minimum bounding rectangles (MBRs) of objects and the refinement step eliminates false positives by applying the query on the exact geometries. We propose a raster intervals approximation of object geometries and introduce a powerful intermediate step in the pipeline. In a preprocessing phase, our method (i) rasterizes each object geometry using a fine grid, (ii) models groups of nearby cells that intersect the polygon as an interval, and (iii) encodes each interval by a bitstring that captures the overlap of each cell in it with the polygon. Going one step further, we improve our approach to approximate each object by two sets of intervals that succinctly capture the raster cells which (i) intersect with the object and (ii) are fully contained in the object. Using this representation, we show that we can verify whether two polygons intersect by a sequence of joins between the interval sets that takes linear time. Our approximations can be effectively compressed and can be customized for use on partitioned data and polygons of varying sizes, rasterized at different granularities. Finally, we propose a novel algorithm that computes the interval approximation of a polygon without fully rasterizing it first, rendering the computation of approximations orders of magnitude faster. Experiments on real data demonstrate the effectiveness and efficiency of our proposal over previous work.
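The linear-time verification step can be sketched as a merge join over two sorted interval lists. The encoding below (one integer bit code per interval, compared with a bitwise AND) is a simplification of the paper's per-cell codes and is meant only to convey the idea.

```python
# Hedged sketch of the interval-join idea: each polygon is approximated by a
# sorted, non-overlapping list of (start, end, bits) raster intervals over a
# linear ordering of grid cells. Two polygons can only intersect if some pair
# of intervals overlaps AND their bit codes share a set bit.
def intervals_may_intersect(a, b):
    i = j = 0
    while i < len(a) and j < len(b):
        sa, ea, bits_a = a[i]
        sb, eb, bits_b = b[j]
        if ea < sb:          # a's interval ends before b's starts
            i += 1
        elif eb < sa:        # b's interval ends before a's starts
            j += 1
        else:                # cell ranges overlap: compare bit codes
            if bits_a & bits_b:
                return True
            # advance whichever interval ends first
            if ea <= eb:
                i += 1
            else:
                j += 1
    return False
```

Because both lists are scanned once, the join runs in time linear in the total number of intervals, which is the property the abstract claims for the verification step.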
Multiple sequence alignment (MSA) is a fundamental algorithm in bioinformatics. In situations where the alignment must be kept secret while other information, such as the input sequences and the alignment score, is revealed, a zero-knowledge proof can be used. In this paper, a validator checks the consistency between the input sequences and the alignment, and between the alignment and the alignment score. The validator is written in the Circom language, which is compiled into a circuit. Using a zero-knowledge proof system called zkSNARK, a cryptographic proof is generated for the circuit and its input. This proof demonstrates that all inputs are consistent without revealing the actual alignment.
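The two consistency checks are easy to state outside the circuit. The following plain-Python sketch mirrors the logic the Circom validator would encode as arithmetic constraints; the scoring parameters are illustrative assumptions, not the paper's.

```python
# Illustrative pairwise scoring scheme (assumed values, not the paper's).
MATCH, MISMATCH, GAP = 2, -1, -2

def validate(seq_a, seq_b, aligned_a, aligned_b, claimed_score):
    # Check 1: removing gaps from each aligned row recovers the input sequence.
    if aligned_a.replace("-", "") != seq_a:
        return False
    if aligned_b.replace("-", "") != seq_b:
        return False
    # Check 2: the column-wise score of the alignment matches the claimed score.
    score = 0
    for x, y in zip(aligned_a, aligned_b):
        if x == "-" or y == "-":
            score += GAP
        elif x == y:
            score += MATCH
        else:
            score += MISMATCH
    return score == claimed_score

print(validate("ACGT", "AGT", "ACGT", "A-GT", 4))  # 2 - 2 + 2 + 2 = 4 -> True
```

In the zkSNARK setting, the aligned rows are private witness inputs, while the sequences and the score are public, so a verifier learns only that some consistent alignment exists.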
We present the application of the physics-informed neural network (PINN) approach in a Bayesian formulation. We adopt the Bayesian neural network framework to obtain posterior densities from the Laplace approximation. For each model or fit, we compute the evidence, a measure used to rank competing hypotheses; the optimal solution is the one with the highest evidence. We propose a modification of the Bayesian algorithm to obtain the hyperparameters of the model. We show that within the Bayesian framework, one can obtain the relative weights between the boundary and equation contributions to the total loss. The presented method leads to predictions comparable to those obtained by sampling from the posterior distribution with the Hybrid Monte Carlo (HMC) algorithm. We solve the heat, wave, and Burgers' equations, and the results agree with the exact solutions, demonstrating the effectiveness of our approach. In the Burgers' equation problem, we demonstrate that the framework can combine information from the differential equation with potential measurements. All solutions are provided with uncertainties (induced by the model's parameter dependence) computed within the Bayesian framework.
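As a hedged illustration of the weighted loss, the sketch below builds a PINN-style objective for the 1-D heat equation in PyTorch. The relative weights `alpha` and `beta` stand in for the hyperparameters the paper infers via the evidence; here they are placeholders.

```python
import torch

# Minimal PINN-style loss for the 1-D heat equation u_t = k * u_xx, with a
# relative weighting of the equation and boundary terms. Column 0 is x,
# column 1 is t. Network size and weights are illustrative.
k = 1.0
net = torch.nn.Sequential(
    torch.nn.Linear(2, 32), torch.nn.Tanh(), torch.nn.Linear(32, 1)
)

def pde_residual(xt):
    xt = xt.requires_grad_(True)
    u = net(xt)
    grads = torch.autograd.grad(u.sum(), xt, create_graph=True)[0]
    u_x, u_t = grads[:, 0:1], grads[:, 1:2]
    u_xx = torch.autograd.grad(u_x.sum(), xt, create_graph=True)[0][:, 0:1]
    return u_t - k * u_xx

xt_interior = torch.rand(128, 2)                  # collocation points
xt_boundary = torch.cat([torch.zeros(32, 1), torch.rand(32, 1)], dim=1)
u_boundary = torch.zeros(32, 1)                   # e.g., u(0, t) = 0

alpha, beta = 1.0, 10.0                           # placeholder weights; the
loss = (alpha * pde_residual(xt_interior).pow(2).mean()   # paper infers these
        + beta * (net(xt_boundary) - u_boundary).pow(2).mean())
```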
Ontology and knowledge graph matching systems are evaluated annually by the Ontology Alignment Evaluation Initiative (OAEI). More and more systems use machine learning-based approaches, including large language models. The training and validation datasets are usually chosen by the system developer, and often a subset of the reference alignments is used. This sampling violates the OAEI rules and makes a fair comparison impossible. Furthermore, those models are trained offline (a trained and optimized model is packaged into the matcher), and the systems are therefore specifically tuned to those tasks. In this paper, we introduce a dataset that contains training, validation, and test sets for most of the OAEI tracks. This makes online model learning possible (the systems must adapt to the given input alignment without human intervention) and enables a fair comparison of ML-based systems. We showcase the usefulness of the dataset by fine-tuning the confidence thresholds of popular systems.
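Confidence-threshold tuning on a validation alignment, the use case the dataset enables, can be sketched in a few lines of Python. The F1 computation and the correspondence format are illustrative; OAEI tooling computes these metrics with more care.

```python
# Hedged sketch: pick the confidence threshold that maximizes F1 on a
# validation reference alignment. Correspondences are (source, target, conf).
def f1(predicted, reference):
    tp = len(predicted & reference)
    if not predicted or not reference or not tp:
        return 0.0
    precision = tp / len(predicted)
    recall = tp / len(reference)
    return 2 * precision * recall / (precision + recall)

def tune_threshold(candidates, reference, steps=100):
    best_t, best_f1 = 0.0, -1.0
    for i in range(steps + 1):
        t = i / steps
        predicted = {(s, o) for s, o, c in candidates if c >= t}
        score = f1(predicted, reference)
        if score > best_f1:
            best_t, best_f1 = t, score
    return best_t
```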
We propose a scheme that leverages reinforcement learning to engineer control fields for generating non-classical states. We exemplify it by preparing spin-squeezed states in an open collective spin model, where a linear control field is designed to govern the dynamics. The reinforcement learning agent determines the temporal sequence of control pulses, starting from a coherent spin state in an environment characterized by dissipation and dephasing. Compared to the constant-control scenario, this approach yields a variety of control sequences that maintain collective spin squeezing and entanglement. We observe that applying the control pulses more densely improves the outcomes, although the marginal gain from additional control actions is small. The proposed strategy becomes more effective for larger systems, while thermal excitations of the reservoir are detrimental to the control outcomes. We suggest feasible experiments to implement the control proposal and discuss the extension to continuous control problems and to other quantum systems. The replaceability of the reinforcement learning module is also emphasized. This research paves the way for applications in manipulating other quantum systems.
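Schematically, the control loop looks as follows. The environment below is a stub with made-up dynamics; the paper's open collective spin model would instead be evolved with a master-equation solver (e.g., QuTiP), and the random action choice stands in for the learned policy.

```python
import random

# Schematic only: an agent picks a control-pulse amplitude at each time step,
# starting from a coherent spin state, and is rewarded for reducing the
# spin-squeezing parameter xi^2. All dynamics here are placeholders.
ACTIONS = [-1.0, 0.0, 1.0]          # candidate pulse amplitudes (illustrative)

class SpinEnvStub:
    def reset(self):
        self.xi2 = 1.0              # squeezing parameter of a coherent state
        return self.xi2

    def step(self, amp):
        # Placeholder: real code would evolve the density matrix under
        # dissipation and dephasing and recompute xi^2.
        self.xi2 = max(0.1, self.xi2 + random.gauss(-0.02 * abs(amp), 0.01))
        return self.xi2, -self.xi2  # smaller xi^2 (more squeezing) -> higher reward

env = SpinEnvStub()
state = env.reset()
for t in range(50):                 # one episode of pulse decisions
    amp = random.choice(ACTIONS)    # stand-in for the trained RL policy
    state, reward = env.step(amp)
```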
Software engineering is a domain characterized by intricate decision-making processes, often relying on nuanced intuition and consultation. Recent advancements in deep learning have started to revolutionize software engineering practices through elaborate designs implemented at various stages of software development. In this paper, we present an innovative paradigm that leverages large language models (LLMs) throughout the entire software development process, streamlining and unifying key processes through natural language communication, thereby eliminating the need for specialized models at each phase. At the core of this paradigm lies ChatDev, a virtual chat-powered software development company that mirrors the established waterfall model, meticulously dividing the development process into four distinct chronological stages: designing, coding, testing, and documenting. Each stage engages a team of agents, such as programmers, code reviewers, and test engineers, fostering collaborative dialogue and facilitating a seamless workflow. The chat chain acts as a facilitator, breaking down each stage into atomic subtasks; within each subtask, two agents in dual roles propose and validate solutions through context-aware communication, leading to efficient resolution of specific subtasks. Our analysis of ChatDev highlights its remarkable efficacy in software generation: the entire software development process completes in under seven minutes at a cost of less than one dollar. ChatDev not only identifies and alleviates potential vulnerabilities but also rectifies potential hallucinations while maintaining commendable efficiency and cost-effectiveness. The potential of ChatDev unveils fresh possibilities for integrating LLMs into the realm of software development.
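The chat-chain decomposition can be sketched as a loop over stages and role pairs. Everything here (stage names, role pairs, `ask_llm`) is an illustrative stand-in, not ChatDev's actual prompts or API.

```python
# Toy sketch of the chat-chain idea: each waterfall stage is decomposed into
# subtasks, each resolved by a two-role dialogue in which one agent proposes
# and the other validates. Roles and pairings are assumptions.
STAGES = {
    "designing":   [("CEO", "CTO")],
    "coding":      [("CTO", "Programmer")],
    "testing":     [("Programmer", "Reviewer"), ("Programmer", "Tester")],
    "documenting": [("CEO", "Programmer")],
}

def ask_llm(role, prompt):
    # Placeholder: substitute a real LLM API call conditioned on the role.
    return f"{role}: response to '{prompt[:40]}...'"

def chat_chain(task):
    context = task
    for stage, role_pairs in STAGES.items():
        for instructor, assistant in role_pairs:
            proposal = ask_llm(instructor, f"[{stage}] {context}")
            context = ask_llm(assistant, f"Validate and refine: {proposal}")
    return context

print(chat_chain("Build a todo-list app"))
```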
Relation prediction for knowledge graphs aims at predicting missing relationships between entities. Despite the importance of inductive relation prediction, most previous works are limited to a transductive setting and cannot process previously unseen entities. Recently proposed subgraph-based relation reasoning models provide an alternative: they predict links inductively from the subgraph structure surrounding a candidate triplet. However, we observe that these methods often neglect the directed nature of the extracted subgraph and weaken the role of relation information in the subgraph modeling. As a result, they fail to effectively handle the asymmetric/anti-symmetric triplets and produce insufficient embeddings for the target triplets. To this end, we introduce a \textbf{C}\textbf{o}mmunicative \textbf{M}essage \textbf{P}assing neural network for \textbf{I}nductive re\textbf{L}ation r\textbf{E}asoning, \textbf{CoMPILE}, that reasons over local directed subgraph structures and has a strong inductive bias to process entity-independent semantic relations. In contrast to existing models, CoMPILE strengthens the message interactions between edges and entities through a communicative kernel and enables a sufficient flow of relation information. Moreover, we demonstrate that CoMPILE can naturally handle asymmetric/anti-symmetric relations, without explosively increasing the number of model parameters, by extracting the directed enclosing subgraphs. Extensive experiments show substantial performance gains over state-of-the-art methods on commonly used benchmark datasets under various inductive settings.
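A stripped-down PyTorch sketch of the edge-entity interaction is shown below: node and directed-edge embeddings update one another each iteration, so relation information keeps flowing through the subgraph. The aggregation and update functions are simplifications, not CoMPILE's communicative kernel.

```python
import torch

# Schematic directed message passing: edges carry relation embeddings, and
# node/edge states update each other. Dimensions are illustrative.
d = 16
W_node = torch.nn.Linear(2 * d, d)
W_edge = torch.nn.Linear(3 * d, d)

def message_pass(h_nodes, h_edges, edges):
    # edges: list of (src, dst) index pairs, one per directed edge
    agg = torch.zeros_like(h_nodes)
    for eid, (src, dst) in enumerate(edges):
        # direction matters: messages flow src -> dst only
        agg[dst] = agg[dst] + h_edges[eid] + h_nodes[src]
    h_nodes = torch.tanh(W_node(torch.cat([h_nodes, agg], dim=-1)))
    new_edges = []
    for eid, (src, dst) in enumerate(edges):
        inp = torch.cat([h_nodes[src], h_edges[eid], h_nodes[dst]], dim=-1)
        new_edges.append(torch.tanh(W_edge(inp)))
    return h_nodes, torch.stack(new_edges)
```

Because the aggregation only follows edge direction, a triplet and its inverse yield different node states, which is how a directed formulation can distinguish asymmetric relations.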
Data augmentation has been widely used to improve the generalizability of machine learning models. However, comparatively little work studies data augmentation for graphs, largely due to their complex, non-Euclidean structure, which limits possible manipulation operations: augmentation operations commonly used in vision and language have no analogs for graphs. Our work studies graph data augmentation for graph neural networks (GNNs) in the context of improving semi-supervised node classification. We discuss practical and theoretical motivations, considerations, and strategies for graph data augmentation. We show that neural edge predictors can effectively encode class-homophilic structure, promoting intra-class edges and demoting inter-class edges in a given graph structure. Our main contribution is the GAug graph data augmentation framework, which leverages these insights to improve performance in GNN-based node classification via edge prediction. Extensive experiments on multiple benchmarks show that augmentation via GAug improves performance across GNN architectures and datasets.
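The edge-prediction idea can be sketched as follows: score all node pairs with an inner-product predictor on node embeddings, add the most likely missing edges, and drop the least likely existing ones. Thresholds and the predictor are illustrative, not GAug's exact design.

```python
import torch

# Hedged sketch of edge-prediction-based graph augmentation: modify the
# adjacency matrix before training the node classifier. Self-pairs and
# symmetry handling are omitted for brevity.
def augment(adj, emb, n_add=10, n_drop=10):
    scores = torch.sigmoid(emb @ emb.t())          # pairwise edge probabilities
    non_edges = (adj == 0).nonzero(as_tuple=False)
    edges = (adj == 1).nonzero(as_tuple=False)
    # promote likely intra-class links: add top-scoring missing edges
    add = non_edges[scores[non_edges[:, 0], non_edges[:, 1]].topk(n_add).indices]
    # demote likely inter-class links: drop lowest-scoring existing edges
    drop = edges[(-scores[edges[:, 0], edges[:, 1]]).topk(n_drop).indices]
    new_adj = adj.clone()
    new_adj[add[:, 0], add[:, 1]] = 1
    new_adj[drop[:, 0], drop[:, 1]] = 0
    return new_adj
```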
It is important to detect anomalous inputs when deploying machine learning systems. The use of larger and more complex inputs in deep learning magnifies the difficulty of distinguishing between anomalous and in-distribution examples. At the same time, diverse image and text data are available in enormous quantities. We propose leveraging these data to improve deep anomaly detection by training anomaly detectors against an auxiliary dataset of outliers, an approach we call Outlier Exposure (OE). This enables anomaly detectors to generalize and detect unseen anomalies. In extensive experiments on natural language processing and small- and large-scale vision tasks, we find that Outlier Exposure significantly improves detection performance. We also observe that cutting-edge generative models trained on CIFAR-10 may assign higher likelihoods to SVHN images than to CIFAR-10 images; we use OE to mitigate this issue. We also analyze the flexibility and robustness of Outlier Exposure, and identify characteristics of the auxiliary dataset that improve performance.
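For a classifier, the OE objective adds a term that pushes the model's posterior toward the uniform distribution on auxiliary outliers. A minimal PyTorch version, with `lambda_oe` treated as a tunable coefficient:

```python
import torch.nn.functional as F

# Outlier Exposure fine-tuning objective: standard cross-entropy on
# in-distribution batches plus cross-entropy to the uniform distribution
# on auxiliary outlier batches.
def oe_loss(logits_in, targets_in, logits_out, lambda_oe=0.5):
    ce = F.cross_entropy(logits_in, targets_in)
    # CE to uniform over K classes: -(1/K) * sum_k log p_k, averaged over batch
    uniform_ce = -(F.log_softmax(logits_out, dim=1)).mean(dim=1).mean()
    return ce + lambda_oe * uniform_ce
```

At test time, the maximum softmax probability (or a similar score) then separates in-distribution inputs from anomalies more sharply, since the outlier-exposed model has learned to be uncertain off-distribution.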
In this paper, we propose a conceptually simple and geometrically interpretable objective function, i.e., the additive margin Softmax (AM-Softmax), for deep face verification. In general, the face verification task can be viewed as a metric learning problem, so learning large-margin face features whose intra-class variation is small and whose inter-class difference is large is of great importance for achieving good performance. Recently, Large-margin Softmax and Angular Softmax have been proposed to incorporate the angular margin in a multiplicative manner. In this work, we introduce a novel additive angular margin for the Softmax loss, which is intuitively appealing and more interpretable than the existing works. We also emphasize and discuss the importance of feature normalization. Most importantly, our experiments on LFW, BLUFR, and MegaFace show that our additive margin Softmax loss consistently performs better than the current state-of-the-art methods using the same network architecture and training dataset. Our code has also been made available at //github.com/happynear/AMSoftmax.
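The loss is compact enough to implement directly: normalize the features and class weights, subtract the additive margin m from the target-class cosine, and scale by s before the softmax. The sketch below uses s = 30 and m = 0.35, values in line with the paper's experiments.

```python
import torch
import torch.nn.functional as F

# AM-Softmax loss: cosine similarities with an additive margin on the
# target class, scaled before the standard cross-entropy.
def am_softmax_loss(features, weights, labels, s=30.0, m=0.35):
    feats = F.normalize(features, dim=1)      # unit-norm embeddings
    w = F.normalize(weights, dim=1)           # unit-norm class centers
    cos = feats @ w.t()                       # cosine similarities (B, C)
    # subtract m only from each sample's target-class cosine
    margin_cos = cos - m * F.one_hot(labels, cos.size(1)).float()
    return F.cross_entropy(s * margin_cos, labels)
```

Because the margin is additive in cosine space rather than multiplicative in angle space, the decision boundary is a simple shift, which is the geometric interpretability the abstract emphasizes.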