Deep learning methods have been employed in gravitational-wave astronomy to accelerate the construction of surrogate waveforms for the inspiral of spin-aligned black hole binaries, among other applications. We face the challenge of modeling the residual error of an artificial neural network that models the coefficients of the surrogate waveform expansion (especially those of the phase of the waveform) which we demonstrate has sufficient structure to be learnable by a second network. Adding this second network, we were able to reduce the maximum mismatch for waveforms in a validation set by 13.4 times. We also explored several other ideas for improving the accuracy of the surrogate model, such as the exploitation of similarities between waveforms, the augmentation of the training set, the dissection of the input space, using dedicated networks per output coefficient and output augmentation. In several cases, small improvements can be observed, but the most significant improvement still comes from the addition of a second network that models the residual error. Since the residual error for more general surrogate waveform models (when e.g., eccentricity is included) may also have a specific structure, one can expect our method to be applicable to cases where the gain in accuracy could lead to significant gains in computational time.
Image captioning is a challenging task involving generating a textual description for an image using computer vision and natural language processing techniques. This paper proposes a deep neural framework for image caption generation using a GRU-based attention mechanism. Our approach employs multiple pre-trained convolutional neural networks as the encoder to extract features from the image and a GRU-based language model as the decoder to generate descriptive sentences. To improve performance, we integrate the Bahdanau attention model with the GRU decoder to enable learning to focus on specific image parts. We evaluate our approach using the MSCOCO and Flickr30k datasets and show that it achieves competitive scores compared to state-of-the-art methods. Our proposed framework can bridge the gap between computer vision and natural language and can be extended to specific domains.
In orthogonal time frequency space (OTFS) systems, the impact of frequency-dependent Doppler which is referred to as the Doppler squint effect (DSE) is accumulated through longer duration, whose negligence has prevented OTFS systems from exploiting the performance superiority. In this paper, practical OFDM system using cyclic prefix time guard interval (CP-OFDM)-based OTFS systems with DSE are adopted. Cyclic prefix (CP) length is analyzed while the input-output relation considering DSE is derived. By deploying two prefix OFDM symbols, the channel estimation can be easily divided into three parts as delay detection, Doppler extraction and gain estimation. The linear equalization scheme is adopted taking the block diagonal property of the channel matrix into account, which completes the low-complexity receiver design. Simulation results confirm the significance of DSE and the considerable performance of the proposed low-complexity receiver scheme considering DSE.
Affine frequency division multiplexing (AFDM) is an emerging multicarrier waveform that offers a potential solution for achieving reliable communication for time-varying channels. This paper proposes two maximum likelihood (ML) estimators of symbol time offset and carrier frequency offset for AFDM systems. The joint ML estimator evaluates the arrival time and frequency offset by comparing the correlations of samples. Moreover, we propose the stepwise ML estimator to reduce the complexity. The proposed estimators exploit the redundant information contained within the chirp-periodic prefix inherent in AFDM symbols, thus dispensing with any additional pilots. To further mitigate the intercarrier interference resulting from the residual frequency offset, we design a mirror-mappingbased scheme for AFDM systems. Numerical results verify the effectiveness of the proposed time and frequency offset estimation criteria and the mirror-mapping-based modulation for AFDM systems.
Continual relation extraction (CRE) aims to solve the problem of catastrophic forgetting when learning a sequence of newly emerging relations. Recent CRE studies have found that catastrophic forgetting arises from the model's lack of robustness against future analogous relations. To address the issue, we introduce rationale, i.e., the explanations of relation classification results generated by large language models (LLM), into CRE task. Specifically, we design the multi-task rationale tuning strategy to help the model learn current relations robustly. We also conduct contrastive rationale replay to further distinguish analogous relations. Experimental results on two standard benchmarks demonstrate that our method outperforms the state-of-the-art CRE models.
Surrogate-assisted evolutionary algorithms have been widely developed to solve complex and computationally expensive multi-objective optimization problems in recent years. However, when dealing with high-dimensional optimization problems, the performance of these surrogate-assisted multi-objective evolutionary algorithms deteriorate drastically. In this work, a novel Classifier-assisted rank-based learning and Local Model based multi-objective Evolutionary Algorithm (CLMEA) is proposed for high-dimensional expensive multi-objective optimization problems. The proposed algorithm consists of three parts: classifier-assisted rank-based learning, hypervolume-based non-dominated search, and local search in the relatively sparse objective space. Specifically, a probabilistic neural network is built as classifier to divide the offspring into a number of ranks. The offspring in different ranks uses rank-based learning strategy to generate more promising and informative candidates for real function evaluations. Then, radial basis function networks are built as surrogates to approximate the objective functions. After searching non-dominated solutions assisted by the surrogate model, the candidates with higher hypervolume improvement are selected for real evaluations. Subsequently, in order to maintain the diversity of solutions, the most uncertain sample point from the non-dominated solutions measured by the crowding distance is selected as the guided parent to further infill in the uncertain region of the front. The experimental results of benchmark problems and a real-world application on geothermal reservoir heat extraction optimization demonstrate that the proposed algorithm shows superior performance compared with the state-of-the-art surrogate-assisted multi-objective evolutionary algorithms. The source code for this work is available at //github.com/JellyChen7/CLMEA.
Just as computational simulations of atoms, molecules and cells have shaped the way we study the sciences, true-to-life simulations of human-like agents can be valuable tools for studying human behavior. We propose Humanoid Agents, a system that guides Generative Agents to behave more like humans by introducing three elements of System 1 processing: Basic needs (e.g. hunger, health and energy), Emotion and Closeness in Relationships. Humanoid Agents are able to use these dynamic elements to adapt their daily activities and conversations with other agents, as supported with empirical experiments. Our system is designed to be extensible to various settings, three of which we demonstrate, as well as to other elements influencing human behavior (e.g. empathy, moral values and cultural background). Our platform also includes a Unity WebGL game interface for visualization and an interactive analytics dashboard to show agent statuses over time. Our platform is available on //www.humanoidagents.com/ and code is on //github.com/HumanoidAgents/HumanoidAgents
We discuss some aspects of our work on the mechanization of syntax and semantics in the UniMath library, based on the proof assistant Coq. We focus on experiences where Coq (as a type-theoretic proof assistant with decidable typechecking) made us use more theory or helped us to see theory more clearly.
This paper presents an alternative approach to dehomogenisation of elastic Rank-N laminate structures based on the computer graphics discipline of phasor noise. The proposed methodology offers an improvement of existing methods, where high-quality single-scale designs can be obtained efficiently without the utilisation of any least-squares problem or pre-trained models. By utilising a continuous and periodic representation of the translation at each intermediate step, appropriate length-scale and thicknesses can be obtained. Numerical tests verifies the performance of the proposed methodology compared to state-of-the-art alternatives, and the dehomogenised designs achieve structural performance within a few percentages of the optimised homogenised solution. The nature of the phasor-based dehomogenisation is inherently mesh-independent and highly parallelisable, allowing for further efficient implementations and future extensions to 3D problems on unstructured meshes.
We propose a method to improve the efficiency and accuracy of amortized Bayesian inference (ABI) by leveraging universal symmetries in the probabilistic joint model $p(\theta, y)$ of parameters $\theta$ and data $y$. In a nutshell, we invert Bayes' theorem and estimate the marginal likelihood based on approximate representations of the joint model. Upon perfect approximation, the marginal likelihood is constant across all parameter values by definition. However, approximation error leads to undesirable variance in the marginal likelihood estimates across different parameter values. We formulate violations of this symmetry as a loss function to accelerate the learning dynamics of conditional neural density estimators. We apply our method to a bimodal toy problem with an explicit likelihood (likelihood-based) and a realistic model with an implicit likelihood (simulation-based).
Graph Neural Networks (GNNs) have received considerable attention on graph-structured data learning for a wide variety of tasks. The well-designed propagation mechanism which has been demonstrated effective is the most fundamental part of GNNs. Although most of GNNs basically follow a message passing manner, litter effort has been made to discover and analyze their essential relations. In this paper, we establish a surprising connection between different propagation mechanisms with a unified optimization problem, showing that despite the proliferation of various GNNs, in fact, their proposed propagation mechanisms are the optimal solution optimizing a feature fitting function over a wide class of graph kernels with a graph regularization term. Our proposed unified optimization framework, summarizing the commonalities between several of the most representative GNNs, not only provides a macroscopic view on surveying the relations between different GNNs, but also further opens up new opportunities for flexibly designing new GNNs. With the proposed framework, we discover that existing works usually utilize naive graph convolutional kernels for feature fitting function, and we further develop two novel objective functions considering adjustable graph kernels showing low-pass or high-pass filtering capabilities respectively. Moreover, we provide the convergence proofs and expressive power comparisons for the proposed models. Extensive experiments on benchmark datasets clearly show that the proposed GNNs not only outperform the state-of-the-art methods but also have good ability to alleviate over-smoothing, and further verify the feasibility for designing GNNs with our unified optimization framework.