The intersection of connection graphs and discrete optimal transport presents a novel paradigm for understanding complex graphs and node interactions. In this paper, we delve into this unexplored territory by focusing on the Beckmann problem within the context of connection graphs. Our study establishes feasibility conditions for the resulting convex optimization problem on connection graphs. Furthermore, we establish strong duality for the conventional Beckmann problem, and extend our analysis to encompass strong duality and duality correspondence for a quadratically regularized variant. To put our findings into practice, we implement the regularized problem using gradient descent, enabling a practical approach to solving this complex problem. We showcase optimal flows and solutions, providing valuable insights into the real-world implications of our theoretical framework.
The ability of deep image prior (DIP) to recover high-quality images from incomplete or corrupted measurements has made it popular in inverse problems in image restoration and medical imaging including magnetic resonance imaging (MRI). However, conventional DIP suffers from severe overfitting and spectral bias effects.In this work, we first provide an analysis of how DIP recovers information from undersampled imaging measurements by analyzing the training dynamics of the underlying networks in the kernel regime for different architectures.This study sheds light on important underlying properties for DIP-based recovery.Current research suggests that incorporating a reference image as network input can enhance DIP's performance in image reconstruction compared to using random inputs. However, obtaining suitable reference images requires supervision, and raises practical difficulties. In an attempt to overcome this obstacle, we further introduce a self-driven reconstruction process that concurrently optimizes both the network weights and the input while eliminating the need for training data. Our method incorporates a novel denoiser regularization term which enables robust and stable joint estimation of both the network input and reconstructed image.We demonstrate that our self-guided method surpasses both the original DIP and modern supervised methods in terms of MR image reconstruction performance and outperforms previous DIP-based schemes for image inpainting.
A major limitation for the broader scope of problems solvable by transformers is the quadratic scaling of computational complexity with input size. In this study, we investigate the recurrent memory augmentation of pre-trained transformer models to extend input context length while linearly scaling compute. Our approach demonstrates the capability to store information in memory for sequences of up to an unprecedented two million tokens while maintaining high retrieval accuracy. Experiments with language modeling tasks show perplexity improvement as the number of processed input segments increases. These results underscore the effectiveness of our method, which has significant potential to enhance long-term dependency handling in natural language understanding and generation tasks, as well as enable large-scale context processing for memory-intensive applications.
We conduct a systematic study of the approximation properties of Transformer for sequence modeling with long, sparse and complicated memory. We investigate the mechanisms through which different components of Transformer, such as the dot-product self-attention, positional encoding and feed-forward layer, affect its expressive power, and we study their combined effects through establishing explicit approximation rates. Our study reveals the roles of critical parameters in the Transformer, such as the number of layers and the number of attention heads, and these insights also provide natural suggestions for alternative architectures.
We describe a version of the FGLM algorithm that can be used to compute generic fibers of positive-dimensional polynomial ideals. It combines the FGLM algorithm with a Hensel lifting strategy. We show that this algorithm has a complexity quasi-linear in the number of lifting steps. Some provided experimental data also demonstrates the practical efficacy of our algorithm. Additionally, we sketch a related Hensel lifting method to compute Gr\"obner bases using so-called tracers.
The development dynamics of digital innovations for industry, business, and society are producing complex system conglomerates that can no longer be designed centrally and hierarchically in classic development processes. Instead, systems are evolving in DevOps processes in which heterogeneous actors act together on an open platform. Influencing and controlling such dynamically and autonomously changing system landscapes is currently a major challenge and a fundamental interest of service users and providers, as well as operators of the platform infrastructures. In this paper, we propose an architecture for such an emergent software service platform. A software platform that implements this architecture with the underlying engineering methodology is demonstrated by a smart parking lot scenario.
We develop an enthalpy-based modeling and computational framework to quantify uncertainty in Stefan problems with an injection boundary. Inspired by airfoil icing studies, we consider a system featuring an injection boundary inducing domain changes and a free boundary separating phases, resulting in two types of moving boundaries. Our proposed enthalpy-based formulation seamlessly integrates thermal diffusion across the domain with energy fluxes at the boundaries, addressing a modified injection condition for boundary movement. Uncertainty then stems from random variations in the injection boundary. The primary focus of our Uncertainty Quantification (UQ) centers on investigating the effects of uncertainty on free boundary propagation. Through mapping to a reference domain, we derive an enthalpy-based numerical scheme tailored to the transformed coordinate system, facilitating a simple and efficient simulation. Numerical and UQ studies in one and two dimensions validate the proposed model and the extended enthalpy method. They offer intriguing insights into ice accretion and other multiphysics processes involving phase transitions.
This paper introduces an innovative guidance and control method for simultaneously capturing and stabilizing a fast-spinning target satellite, such as a spin-stabilized satellite, using a spinning-base servicing satellite equipped with a robotic manipulator, joint locks, and reaction wheels (RWs). The method involves controlling the RWs of the servicing satellite to replicate the spinning motion of the target satellite, while locking the manipulator's joints to achieve spin-matching. This maneuver makes the target stationary with respect to the rotating frame of the servicing satellite located at its center-of-mass (CoM), simplifying the robot capture trajectory planning and eliminating post-capture trajectory planning entirely. In the next phase, the joints are unlocked, and a coordination controller drives the robotic manipulator to capture the target satellite while maintaining zero relative rotation between the servicing and target satellites. The spin stabilization phase begins after completing the capture phase, where the joints are locked to form a single tumbling rigid body consisting of the rigidly connected servicing and target satellites. An optimal controller applies negative control torques to the RWs to dampen out the tumbling motion of the interconnected satellites as quickly as possible, subject to the actuation torque limit of the RWs and the maximum torque exerted by the manipulator's end-effector.
This paper discusses two approaches to the diachronic normalization of Polish texts: a rule-based solution that relies on a set of handcrafted patterns, and a neural normalization model based on the text-to-text transfer transformer architecture. The training and evaluation data prepared for the task are discussed in detail, along with experiments conducted to compare the proposed normalization solutions. A quantitative and qualitative analysis is made. It is shown that at the current stage of inquiry into the problem, the rule-based solution outperforms the neural one on 3 out of 4 variants of the prepared dataset, although in practice both approaches have distinct advantages and disadvantages.
We introduce a multi-task setup of identifying and classifying entities, relations, and coreference clusters in scientific articles. We create SciERC, a dataset that includes annotations for all three tasks and develop a unified framework called Scientific Information Extractor (SciIE) for with shared span representations. The multi-task setup reduces cascading errors between tasks and leverages cross-sentence relations through coreference links. Experiments show that our multi-task model outperforms previous models in scientific information extraction without using any domain-specific features. We further show that the framework supports construction of a scientific knowledge graph, which we use to analyze information in scientific literature.
We propose a novel approach to multimodal sentiment analysis using deep neural networks combining visual analysis and natural language processing. Our goal is different than the standard sentiment analysis goal of predicting whether a sentence expresses positive or negative sentiment; instead, we aim to infer the latent emotional state of the user. Thus, we focus on predicting the emotion word tags attached by users to their Tumblr posts, treating these as "self-reported emotions." We demonstrate that our multimodal model combining both text and image features outperforms separate models based solely on either images or text. Our model's results are interpretable, automatically yielding sensible word lists associated with emotions. We explore the structure of emotions implied by our model and compare it to what has been posited in the psychology literature, and validate our model on a set of images that have been used in psychology studies. Finally, our work also provides a useful tool for the growing academic study of images - both photographs and memes - on social networks.