The details of second-order partial derivatives of rigid-body inverse and forward dynamics are provided. Several properties and identities using Spatial Vector Algebra are listed, along with their detailed derivations. The expressions build upon previous work by the author on first-order partial derivatives of inverse dynamics. The first- and second-order derivatives are also extended to systems with external forces. Finally, the derivatives of the KKT forward dynamics and impact dynamics are derived.
Generative models, and in particular Generative Adversarial Networks (GANs), have become a very popular and powerful data generation tool. In recent years, major progress has been made in extending this concept into the quantum realm. However, most current methods focus on generating classes of states that were supplied in the input set and seen at training time. In this work, we propose a new hybrid classical-quantum method based on quantum Wasserstein GANs that overcomes this limitation. It learns the function governing the measurement expectations of the supplied states and generates new states that were not part of the input set but whose expectations follow the same underlying function.
The blockchain and distributed ledger communities have been stirred up by the introduction of zero-knowledge proofs (ZKPs). Originally designed to solve privacy issues, ZKPs have now evolved into an effective remedy for scalability concerns and are applied in Zcash (a cryptocurrency similar to Bitcoin). To enable ZKPs, Rank-1 Constraint Systems (R1CS) offer a verifier for bilinear equations. To represent R1CS accurately and efficiently, several language tools such as Circom, Noir, and Snarky have been proposed to automate the compilation of advanced programs into R1CS. However, because the R1CS representation is flexible, circuit language programs with the same underlying semantics can compile to significantly different R1CS forms. To address this issue, this paper uses a data-flow-based R1CS paradigm algorithm, which produces a standardized format for different R1CS instances with identical semantics. Using circuits in the normalized R1CS format reduces the complexity of circuit verification. In addition, this paper presents a benchmark for R1CS normalization algorithms, and our experimental evaluation demonstrates the effectiveness and correctness of our methods.
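As a minimal illustration of why the same semantics can yield different R1CS forms, the sketch below checks the standard R1CS condition (A_i . w) * (B_i . w) = (C_i . w) for two encodings of y = x * x. The field modulus, the witness layout, and the function names are assumptions made for this example; it is not the normalization algorithm of the paper.

```python
# Toy prime field modulus (assumption; real systems use a large prime).
P = 97

def dot(row, w):
    """Inner product of a coefficient row with the witness vector, mod P."""
    return sum(a * b for a, b in zip(row, w)) % P

def satisfies(r1cs, w):
    """Check (A_i . w) * (B_i . w) == (C_i . w) mod P for every constraint i."""
    return all(dot(a, w) * dot(b, w) % P == dot(c, w) for a, b, c in r1cs)

# Hypothetical witness layout: w = [1, x, y], encoding the statement y = x * x.
# Form 1: x * x = y
form1 = [([0, 1, 0], [0, 1, 0], [0, 0, 1])]
# Form 2: (2x) * x = 2y, the same statement with scaled coefficients
form2 = [([0, 2, 0], [0, 1, 0], [0, 0, 2])]

good_witness = [1, 5, 25]  # y equals x * x
bad_witness = [1, 5, 24]   # y does not equal x * x

print(satisfies(form1, good_witness), satisfies(form2, good_witness))
print(satisfies(form1, bad_witness), satisfies(form2, bad_witness))
```

Both forms accept and reject exactly the same witnesses, yet their coefficient rows differ, which is the kind of variation a normalized format is meant to remove.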
In neural network training, RMSProp and ADAM remain widely favoured optimization algorithms. A key to their performance is selecting the right step size: these algorithms' effectiveness can vary considerably depending on the chosen step sizes. Additionally, questions about their theoretical convergence properties continue to be a subject of interest. In this paper, we theoretically analyze a constant step size version of ADAM in the non-convex setting. We show sufficient conditions on the step size to achieve almost sure asymptotic convergence of the gradients to zero with minimal assumptions. We also provide runtime bounds for deterministic ADAM to reach approximate criticality when working with smooth, non-convex functions.
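The setting described above can be sketched as follows: full-gradient (deterministic) ADAM with a constant step size, run until the gradient magnitude falls below a tolerance. The test function, the hyperparameter values, the use of bias correction, and the function names are all assumptions chosen for illustration; the exact variant analyzed in the paper may differ.

```python
import math

def grad(x):
    """Gradient of the smooth, non-convex test function f(x) = x**2 + 3*sin(x)**2."""
    return 2.0 * x + 3.0 * math.sin(2.0 * x)

def adam_until_critical(x0, step_size=0.01, beta1=0.9, beta2=0.999,
                        eps_hat=1e-8, tol=0.1, max_iters=20000):
    """Run full-gradient ADAM with a constant step size.

    Returns (number of updates performed, x) once |grad(x)| <= tol,
    or (None, x) if max_iters updates are exhausted first.
    """
    x, m, v = x0, 0.0, 0.0
    for t in range(1, max_iters + 1):
        g = grad(x)
        if abs(g) <= tol:
            return t - 1, x
        m = beta1 * m + (1.0 - beta1) * g        # first-moment estimate
        v = beta2 * v + (1.0 - beta2) * g * g    # second-moment estimate
        m_hat = m / (1.0 - beta1 ** t)           # bias correction (assumed)
        v_hat = v / (1.0 - beta2 ** t)
        x -= step_size * m_hat / (math.sqrt(v_hat) + eps_hat)
    return None, x

iters, x_final = adam_until_critical(x0=2.5)
print(iters, x_final, abs(grad(x_final)))
```

The returned update count is an empirical counterpart to a runtime bound: it records how many updates this particular run needed to reach approximate criticality at the chosen tolerance.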
The application of Physics-Informed Neural Networks (PINNs) is investigated for the first time in solving the one-dimensional countercurrent spontaneous imbibition (COUCSI) problem at both early and late time (i.e., before and after the imbibition front meets the no-flow boundary). We introduce the use of change of variables as a technique for improving the performance of PINNs. We formulated the COUCSI problem in three equivalent forms by changing the independent variables. The first describes saturation as a function of normalized position X and time T; the second as a function of X and Y=T^0.5; and the third as a sole function of Z=X/T^0.5 (valid only at early time). The PINN model was generated using a feed-forward neural network and trained by minimizing a weighted loss function, including the physics-informed loss term and terms corresponding to the initial and boundary conditions. All three formulations could closely approximate the correct solutions, with water saturation mean absolute errors around 0.019 and 0.009 for the XT and XY formulations and 0.012 for the Z formulation at early time. The Z formulation perfectly captured the self-similarity of the system at early time; the XT and XY formulations captured it less well. The total variation of saturation was preserved in the Z formulation, and it was better preserved with the XY than with the XT formulation. Redefining the problem based on the physics-inspired variables reduced the non-linearity of the problem and allowed higher solution accuracies, a higher degree of loss-landscape convexity, a lower number of required collocation points, smaller network sizes, and more computationally efficient solutions.
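The three sets of independent variables described above can be illustrated with a small sketch. The function names are hypothetical, and only the coordinate maps are shown, not the PINN or the governing equation.

```python
import math

def to_xy(x, t):
    """Map (X, T) to (X, Y) with Y = T**0.5.

    By the chain rule, dS/dT = (1 / (2 * Y)) * dS/dY in these coordinates.
    """
    return x, math.sqrt(t)

def to_z(x, t):
    """Map (X, T) to the similarity variable Z = X / T**0.5 (requires T > 0)."""
    if t <= 0.0:
        raise ValueError("Z is only defined for T > 0")
    return x / math.sqrt(t)

# Two different (X, T) points that share the same Z. In the Z formulation
# (early time only) they are the same input, so they receive the same saturation.
z_a = to_z(0.2, 0.04)
z_b = to_z(0.4, 0.16)
print(z_a, z_b, to_xy(0.3, 0.25))
```

Because every (X, T) pair with equal Z collapses to one input, a network that takes Z alone is self-similar by construction, which is consistent with the Z formulation capturing self-similarity at early time.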
Recently, large-scale pre-trained language-image models like CLIP have shown extraordinary capabilities for understanding spatial contents, but naively transferring such models to video recognition still suffers from unsatisfactory temporal modeling capabilities. Existing methods insert tunable structures into or in parallel with the pre-trained model, which either requires back-propagation through the whole pre-trained model and is thus resource-demanding, or is limited by the temporal reasoning capability of the pre-trained structure. In this work, we present DiST, which disentangles the learning of spatial and temporal aspects of videos. Specifically, DiST uses a dual-encoder structure, where a pre-trained foundation model acts as the spatial encoder, and a lightweight network is introduced as the temporal encoder. An integration branch is inserted between the encoders to fuse spatio-temporal information. The disentangled spatial and temporal learning in DiST is highly efficient because it avoids back-propagation through the massive pre-trained parameters. Meanwhile, we empirically show that disentangled learning with an extra network for integration benefits both spatial and temporal understanding. Extensive experiments on five benchmarks show that DiST outperforms existing state-of-the-art methods by convincing margins. When pre-training on the large-scale Kinetics-710, we achieve 89.7% on Kinetics-400 with a frozen ViT-L model, which verifies the scalability of DiST. Code and models can be found at //github.com/alibaba-mmai-research/DiST.
Distribution-dependent stochastic dynamical systems arise widely in engineering and science. We consider a class of such systems that model the limit behaviors of interacting particles moving in a vector field with random fluctuations. We aim to examine the most likely transition path between stable equilibrium states of the vector field. In the small noise regime, the action functional does not involve the solution of the skeleton equation, which describes the unperturbed deterministic flow of the vector field shifted by the interaction at zero distance. As a result, we are led to study the most likely transition path for a stochastic differential equation without distribution dependency. This enables the computation of the most likely transition path for these distribution-dependent stochastic dynamical systems by the adaptive minimum action method, and we illustrate our approach with two examples.
Markov processes are widely used mathematical models for describing dynamic systems in various fields. However, accurately simulating large-scale systems at long time scales is computationally expensive due to the short time steps required for accurate integration. In this paper, we introduce an inference process that maps complex systems into a simplified representational space and models large jumps in time. To achieve this, we propose Time-lagged Information Bottleneck (T-IB), a principled objective rooted in information theory, which aims to capture relevant temporal features while discarding high-frequency information to simplify the simulation task and minimize the inference error. Our experiments demonstrate that T-IB learns information-optimal representations for accurately modeling the statistical properties and dynamics of the original process at a selected time lag, outperforming existing time-lagged dimensionality reduction methods.
Large Language Models (LLMs) have shown excellent generalization capabilities that have led to the development of numerous models. These models propose new architectures, tweak existing architectures with refined training strategies, increase context length, use high-quality training data, and increase training time to outperform baselines. Analyzing new developments is crucial for identifying changes that enhance training stability and improve generalization in LLMs. This survey paper comprehensively analyzes LLM architectures and their categorization, training strategies, training datasets, and performance evaluations, and discusses future research directions. Moreover, the paper discusses the basic building blocks and concepts behind LLMs, followed by a complete overview of LLMs, including their important features and functions. Finally, the paper summarizes significant findings from LLM research and consolidates essential architectural and training strategies for developing advanced LLMs. Given the continuous advancements in LLMs, we intend to regularly update this paper by incorporating new sections and featuring the latest LLMs.
Pre-trained Language Models (PLMs), which are trained on large text corpora via self-supervised learning, have yielded promising performance on various tasks in Natural Language Processing (NLP). However, although PLMs with huge numbers of parameters can capture rich knowledge from massive training text and benefit downstream tasks at the fine-tuning stage, they still have limitations such as poor reasoning ability due to the lack of external knowledge. Research has been dedicated to incorporating knowledge into PLMs to tackle these issues. In this paper, we present a comprehensive review of Knowledge-Enhanced Pre-trained Language Models (KE-PLMs) to provide a clear insight into this thriving field. We introduce separate taxonomies for Natural Language Understanding (NLU) and Natural Language Generation (NLG) to highlight these two main tasks of NLP. For NLU, we divide the types of knowledge into four categories: linguistic knowledge, text knowledge, knowledge graph (KG), and rule knowledge. The KE-PLMs for NLG are categorized into KG-based and retrieval-based methods. Finally, we point out some promising future directions of KE-PLMs.
Within the rapidly developing Internet of Things (IoT), numerous and diverse physical devices, Edge devices, Cloud infrastructure, and their quality of service (QoS) requirements need to be represented within a unified specification in order to enable rapid IoT application development, monitoring, and dynamic reconfiguration. However, heterogeneities among different configuration knowledge representation models limit the acquisition, discovery, and curation of configuration knowledge for coordinated IoT applications. This paper proposes a unified data model to represent IoT resource configuration knowledge artifacts. It also proposes IoT-CANE (Context-Aware recommendatioN systEm) to facilitate incremental knowledge acquisition and declarative context-driven knowledge recommendation.