Accurate forecasting of long-term time series has important applications for decision making and planning. However, it remains challenging to capture the long-term dependencies in time series data. To better extract long-term dependencies, we propose the Multi Scale Dilated Convolution Network (MSDCN), a method that utilizes a shallow dilated convolution architecture to capture the period and trend characteristics of long time series. We design different convolution blocks with exponentially growing dilations and varying kernel sizes to sample time series data at different scales. Furthermore, we utilize a traditional autoregressive model to capture the linear relationships within the data. To validate the effectiveness of the proposed approach, we conduct experiments on eight challenging long-term time series forecasting benchmark datasets. The experimental results show that our approach outperforms prior state-of-the-art approaches and achieves significant inference speed improvements over several strong baseline methods.
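To make the architecture concrete, the following is a minimal, hypothetical sketch of a multi-scale dilated convolution model with a linear autoregressive head, written in PyTorch. The layer counts, kernel sizes, and class names are assumptions for illustration and are not the authors' implementation.

```python
# Hypothetical sketch: dilated convolution blocks with exponentially growing
# dilations at several kernel sizes, plus a linear autoregressive component.
import torch
import torch.nn as nn

class DilatedBlock(nn.Module):
    def __init__(self, channels, kernel_size, num_layers=3):
        super().__init__()
        layers = []
        for i in range(num_layers):
            dilation = 2 ** i  # exponentially growing dilation
            padding = (kernel_size - 1) * dilation // 2
            layers += [nn.Conv1d(channels, channels, kernel_size,
                                 dilation=dilation, padding=padding),
                       nn.ReLU()]
        self.net = nn.Sequential(*layers)

    def forward(self, x):              # x: (batch, channels, time)
        return self.net(x)

class MSDCNSketch(nn.Module):
    def __init__(self, channels, lookback, horizon, kernel_sizes=(3, 5, 7)):
        super().__init__()
        self.blocks = nn.ModuleList(
            [DilatedBlock(channels, k) for k in kernel_sizes])
        self.head = nn.Linear(len(kernel_sizes) * lookback, horizon)
        self.ar = nn.Linear(lookback, horizon)   # linear autoregressive part

    def forward(self, x):              # x: (batch, channels, lookback)
        feats = torch.cat([b(x) for b in self.blocks], dim=-1)
        return self.head(feats) + self.ar(x)
```

A forward pass on a `(batch, channels, lookback)` tensor returns a `(batch, channels, horizon)` forecast, with the convolutional and autoregressive outputs summed.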
We focus on designing Peer-to-Peer (P2P) networks that enable efficient communication. Over the last two decades, there has been substantial algorithmic research on distributed protocols for building P2P networks with various desirable properties such as high expansion, low diameter, and robustness to a large number of deletions. A key underlying theme in all of these works is to distributively build a \emph{random graph} topology that guarantees the above properties. Moreover, the random connectivity topology is widely deployed in many P2P systems today, including those that implement blockchains and cryptocurrencies. However, a major drawback of using a random graph topology for a P2P network is that the random topology does not respect the \emph{underlying} (Internet) communication topology. This creates a large \emph{propagation delay}, which is a major communication bottleneck in modern P2P networks. In this paper, we work towards designing P2P networks that are communication-efficient (having small propagation delay) with provable guarantees. Our main contribution is an efficient, decentralized protocol, $\textsc{Close-Weaver}$, that transforms a random graph topology embedded in an underlying Euclidean space into a topology that also respects the underlying metric. We then present efficient point-to-point routing and broadcast protocols that achieve essentially optimal performance with respect to the underlying space.
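As a generic illustration of routing over a topology that respects an underlying Euclidean embedding, the sketch below performs greedy point-to-point forwarding: each node hands the message to the neighbor geometrically closest to the destination. This is a standard textbook scheme shown only for intuition; it is not the $\textsc{Close-Weaver}$ protocol, and the function and variable names are placeholders.

```python
# Illustrative greedy routing on a graph embedded in Euclidean space.
# Not the Close-Weaver protocol; a generic sketch for intuition only.
import math

def greedy_route(adj, pos, source, target):
    """adj: node -> list of neighbour ids; pos: node -> coordinate tuple."""
    path, current = [source], source
    while current != target:
        # forward to the neighbour geometrically closest to the target
        nxt = min(adj[current], key=lambda v: math.dist(pos[v], pos[target]))
        if math.dist(pos[nxt], pos[target]) >= math.dist(pos[current], pos[target]):
            break  # stuck in a local minimum: no neighbour is closer
        path.append(nxt)
        current = nxt
    return path
```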
In this research, we explore different ways to improve generative adversarial networks (GANs) for video super-resolution, starting from a baseline single-image super-resolution GAN model. Our primary objective is to identify techniques that enhance these models and to analyze which of them yield the most significant improvements. We evaluate our results using Peak Signal-to-Noise Ratio (PSNR) and Structural Similarity Index (SSIM). Our findings indicate that the most effective techniques include temporal smoothing, long short-term memory (LSTM) layers, and a temporal loss function. Integrating these methods yields an 11.97% improvement in PSNR and an 8% improvement in SSIM compared to the baseline video super-resolution GAN model. This substantial improvement suggests potential further applications to enhance current state-of-the-art models.
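One common way to formulate a temporal loss for video super-resolution is to penalize frame-to-frame changes in the prediction that deviate from those in the ground truth. The sketch below shows that formulation as an assumption; the paper's exact loss may differ.

```python
# Hypothetical temporal-consistency loss: compare the motion (frame
# differences) of super-resolved frames against the ground-truth motion.
import torch

def temporal_loss(sr_frames, hr_frames):
    """sr_frames, hr_frames: tensors of shape (batch, time, channels, H, W)."""
    sr_diff = sr_frames[:, 1:] - sr_frames[:, :-1]   # motion in predictions
    hr_diff = hr_frames[:, 1:] - hr_frames[:, :-1]   # motion in ground truth
    return torch.mean(torch.abs(sr_diff - hr_diff))
```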
Dynamic systems described by differential equations often involve feedback among system components. When there are time delays for components to sense and respond to feedback, delay differential equation (DDE) models are commonly used. This paper considers the problem of inferring unknown system parameters, including the time delays, from noisy and sparse experimental data observed from the system. We propose an extension of manifold-constrained Gaussian processes to conduct parameter inference for DDEs, for which the time delay parameters have posed a challenge to existing methods that bypass numerical solvers. Our method uses a Bayesian framework to impose a Gaussian process model over the system trajectory, conditioned on the manifold constraint that satisfies the DDEs. For efficient computation, a linear interpolation scheme is developed to approximate the values of the time-delayed system outputs, along with corresponding theoretical error bounds on the approximated derivatives. Two simulation examples, based on Hutchinson's equation and the lac operon system, together with a real-world application using Ontario COVID-19 data, are used to illustrate the efficacy of our method.
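To illustrate the role of the interpolation scheme, the sketch below linearly interpolates the delayed state $x(t-\tau)$ on a discretization grid and plugs it into Hutchinson's equation $x'(t) = r\,x(t)\,(1 - x(t-\tau)/K)$. The grid handling and the constant pre-history function are assumptions made for this example, not the paper's exact construction.

```python
# Sketch: linear interpolation of the delayed state x(t - tau) on a grid,
# evaluated inside the right-hand side of Hutchinson's equation.
import numpy as np

def delayed_state(t_grid, x_grid, tau, history=lambda t: 1.0):
    """Return x(t - tau) for every t in t_grid via linear interpolation."""
    t_shift = t_grid - tau
    x_delayed = np.interp(t_shift, t_grid, x_grid)
    # before the start of the grid, fall back to the assumed history function
    before = t_shift < t_grid[0]
    x_delayed[before] = [history(t) for t in t_shift[before]]
    return x_delayed

def hutchinson_rhs(t_grid, x_grid, r, K, tau):
    # x'(t) = r * x(t) * (1 - x(t - tau) / K)
    return r * x_grid * (1.0 - delayed_state(t_grid, x_grid, tau) / K)
```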
Object detection is crucial for ensuring safe autonomous driving. However, data-driven approaches face challenges when encountering minority or novel objects in the 3D driving scene. In this paper, we propose VisLED, a language-driven active learning framework for diverse open-set 3D object detection. Our method leverages active learning techniques to query diverse and informative data samples from an unlabeled pool, enhancing the model's ability to detect underrepresented or novel objects. Specifically, we introduce the Vision-Language Embedding Diversity Querying (VisLED-Querying) algorithm, which operates in both open-world exploring and closed-world mining settings. In open-world exploring, VisLED-Querying selects data points most novel relative to existing data, while in closed-world mining, it mines novel instances of known classes. We evaluate our approach on the nuScenes dataset and demonstrate its efficiency compared to random sampling and entropy-querying methods. Our results show that VisLED-Querying consistently outperforms random sampling and offers competitive performance compared to entropy-querying despite the latter's model-optimality, highlighting the potential of VisLED for improving object detection in autonomous driving scenarios. We make our code publicly available at //github.com/Bjork-crypto/VisLED-Querying
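A generic way to select "most novel" samples from an embedding space is farthest-point style querying: repeatedly pick the unlabeled sample whose nearest already-selected embedding is farthest away. The sketch below shows that idea as an assumption; it is not the exact VisLED-Querying algorithm, and all names are placeholders.

```python
# Illustrative diversity-based querying over vision-language embeddings.
import numpy as np

def diversity_query(labeled_emb, unlabeled_emb, budget):
    """Both inputs: (n, d) arrays of embeddings; returns selected pool indices."""
    selected = []
    pool = list(range(len(unlabeled_emb)))
    reference = labeled_emb.copy()
    for _ in range(budget):
        # distance of each pool sample to its nearest reference embedding
        dists = np.linalg.norm(
            unlabeled_emb[pool][:, None, :] - reference[None, :, :], axis=-1)
        novelty = dists.min(axis=1)
        pick = pool[int(novelty.argmax())]        # most novel sample
        selected.append(pick)
        reference = np.vstack([reference, unlabeled_emb[pick]])
        pool.remove(pick)
    return selected
```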
Click-Through Rate (CTR) prediction has become an essential task in digital industries, such as digital advertising and online shopping. Many deep learning-based methods have been implemented and have become state-of-the-art models in the domain. To further improve the performance of CTR models, Knowledge Distillation based approaches have been widely used. However, most current CTR prediction models do not have very complex architectures, so it is hard to call one of them 'cumbersome' and another 'tiny'. Moreover, the performance gap between complex and simple models is not very large, so distilling knowledge from one model to another may not be worth the effort. Under these considerations, Mutual Learning can be a better approach, since all the models can be improved mutually. In this paper, we show how useful the mutual learning algorithm can be when it is applied between equals. In our experiments on the Criteo and Avazu datasets, the mutual learning algorithm improved model performance by up to 0.66% in relative terms.
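For intuition, here is a minimal sketch of one mutual-learning step between two binary CTR models: each model is trained on the label plus a KL term that pulls its predicted click probability towards the other model's (detached) prediction. The loss weighting and function names are assumptions, not the paper's exact training recipe.

```python
# Sketch of deep mutual learning between two CTR models of similar capacity.
import torch
import torch.nn.functional as F

def bernoulli_kl(p, q, eps=1e-7):
    """KL(p || q) between element-wise Bernoulli distributions."""
    p, q = p.clamp(eps, 1 - eps), q.clamp(eps, 1 - eps)
    return (p * (p / q).log() + (1 - p) * ((1 - p) / (1 - q)).log()).mean()

def mutual_learning_step(model_a, model_b, opt_a, opt_b, x, y, alpha=0.5):
    p_a = torch.sigmoid(model_a(x))          # predicted click probabilities
    p_b = torch.sigmoid(model_b(x))

    # each model mimics the other's (fixed) prediction in addition to the label
    loss_a = F.binary_cross_entropy(p_a, y) + alpha * bernoulli_kl(p_b.detach(), p_a)
    loss_b = F.binary_cross_entropy(p_b, y) + alpha * bernoulli_kl(p_a.detach(), p_b)

    opt_a.zero_grad(); loss_a.backward(); opt_a.step()
    opt_b.zero_grad(); loss_b.backward(); opt_b.step()
    return loss_a.item(), loss_b.item()
```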
There has been a growing need to devise processes that can create comprehensive datasets in the world of Computer Algebra, both for accurate benchmarking and for new intersections with machine learning technology. We present here a method to generate integrands that are guaranteed to be integrable, dubbed the LIOUVILLE method. It is based on Liouville's theorem and the Parallel Risch Algorithm for symbolic integration. We show that this data generation method retains the best qualities of previous data generation methods, while overcoming some of the issues built into that prior work. The LIOUVILLE generator is able to generate sufficiently complex and realistic integrands, and could be used for benchmarking or machine learning training tasks related to symbolic integration.
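For context, a much simpler way to obtain integrands that are integrable by construction is to sample a candidate antiderivative and differentiate it, as in the sketch below. This back-differentiation baseline is only an illustration of "guaranteed integrable" data generation; the paper's LIOUVILLE generator instead constructs integrands from the structure guaranteed by Liouville's theorem and the Parallel Risch Algorithm, which this sketch does not reproduce.

```python
# Baseline-style sketch: integrand/antiderivative pairs by differentiation.
import random
import sympy as sp

x = sp.symbols("x")
ATOMS = [x, sp.sin(x), sp.exp(x), sp.log(x + 1)]

def random_antiderivative(depth=2):
    expr = random.choice(ATOMS)
    for _ in range(depth):
        expr = random.choice([expr + random.choice(ATOMS),
                              expr * random.choice(ATOMS)])
    return expr

def generate_pair():
    F = random_antiderivative()
    f = sp.simplify(sp.diff(F, x))   # integrand guaranteed to integrate to F
    return f, F

integrand, antiderivative = generate_pair()
print(integrand, "->", antiderivative)
```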
Machine unlearning is an emerging technology that has attracted widespread attention. A number of factors, including regulations and laws, privacy, and usability concerns, have created the need to allow a trained model to forget some of its training data. Existing studies of machine unlearning mainly focus on unlearning requests that forget a cluster of instances or all instances from one class. While these approaches are effective in removing instances, they do not scale to scenarios where partial targets within an instance need to be forgotten. For example, one may want to unlearn only a person from all instances that simultaneously contain that person and other targets. Directly migrating instance-level unlearning to target-level unlearning either reduces the performance of the model after the unlearning process or fails to erase information completely. To address these concerns, we propose a more effective and efficient unlearning scheme that focuses on removing partial targets from the model, which we name "target unlearning". Specifically, we first construct an essential graph data structure to describe the relationships between all important parameters, which are selected based on a model explanation method. We then filter out parameters that are also important for the remaining targets and apply a pruning-based unlearning method, a simple but effective way to remove information about the target that needs to be forgotten. Experiments with different training models on various datasets demonstrate the effectiveness of the proposed approach.
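The sketch below illustrates the general flavor of importance-guided pruning for forgetting a target: estimate parameter importance from gradient magnitudes for the forget target and for the remaining data, then zero out weights that matter mainly for the target to be forgotten. The essential-graph construction and the specific explanation method from the paper are omitted; the pruning ratio and all names are assumptions.

```python
# Hedged sketch of importance-guided pruning for target unlearning.
import torch

def parameter_importance(model, loss):
    grads = torch.autograd.grad(loss, model.parameters())
    return [g.abs() for g in grads]           # gradient-magnitude importance

def unlearn_by_pruning(model, forget_loss, retain_loss, ratio=0.01):
    imp_forget = parameter_importance(model, forget_loss)
    imp_retain = parameter_importance(model, retain_loss)
    with torch.no_grad():
        for p, f_imp, r_imp in zip(model.parameters(), imp_forget, imp_retain):
            score = f_imp - r_imp             # important for the forget target only
            k = max(1, int(ratio * p.numel()))
            threshold = score.flatten().topk(k).values.min()
            p.masked_fill_(score >= threshold, 0.0)   # prune those weights
```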
We describe ACE0, a lightweight platform for evaluating the suitability and viability of AI methods for behaviour discovery in multiagent simulations. Specifically, ACE0 was designed to explore AI methods for multi-agent simulations used in operations research studies related to new technologies such as autonomous aircraft. Simulation environments used in production are often high-fidelity and complex, require significant domain knowledge, and as a result have high R&D costs. Minimal and lightweight simulation environments can help researchers and engineers evaluate the viability of new AI technologies for behaviour discovery in a more agile and potentially cost-effective manner. In this paper we describe the motivation for the development of ACE0. We provide a technical overview of the system architecture, describe a case study of behaviour discovery in the aerospace domain, and provide a qualitative evaluation of the system. The evaluation includes a brief description of collaborative research projects with academic partners, exploring different AI behaviour discovery methods.
The vast amount of data generated by networks of sensors, wearables, and Internet of Things (IoT) devices underscores the need for advanced modeling techniques that leverage the spatio-temporal structure of data that must remain decentralized, due to edge-computation needs and licensing (data access) issues. While federated learning (FL) has emerged as a framework for model training without requiring direct data sharing and exchange, effectively modeling the complex spatio-temporal dependencies to improve forecasting capabilities still remains an open problem. On the other hand, state-of-the-art spatio-temporal forecasting models assume unfettered access to the data, neglecting constraints on data sharing. To bridge this gap, we propose a federated spatio-temporal model -- Cross-Node Federated Graph Neural Network (CNFGNN) -- which explicitly encodes the underlying graph structure using a graph neural network (GNN)-based architecture under the constraint of cross-node federated learning, which requires that data in a network of nodes is generated locally on each node and remains decentralized. CNFGNN operates by disentangling temporal dynamics modeling on the devices from spatial dynamics modeling on the server, utilizing alternating optimization to reduce the communication cost and facilitate computation on the edge devices. Experiments on the traffic flow forecasting task show that CNFGNN achieves the best forecasting performance in both transductive and inductive learning settings with no extra computation cost on edge devices, while incurring modest communication cost.
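The skeleton below illustrates the device/server split and the alternating updates in simplified form: each node encodes its own series locally, the server models spatial structure over the node embeddings, and the two parts are updated in turn. All class and function names, the placeholder objective, and the GNN interface are assumptions for illustration, not the CNFGNN implementation.

```python
# Simplified sketch of cross-node alternating optimization (device vs. server).
import torch
import torch.nn as nn

class LocalTemporalEncoder(nn.Module):
    """Runs on a single node/device: encodes that node's time series."""
    def __init__(self, in_dim, hidden_dim):
        super().__init__()
        self.gru = nn.GRU(in_dim, hidden_dim, batch_first=True)

    def forward(self, x):                       # x: (1, time, in_dim)
        _, h = self.gru(x)
        return h[-1]                            # (1, hidden_dim) node embedding

def alternating_round(node_encoders, node_data, server_gnn, adj,
                      opt_nodes, opt_server, targets):
    # local step: embeddings computed on each device, only embeddings are sent
    embeddings = torch.cat([enc(x) for enc, x in zip(node_encoders, node_data)])

    # server step: update the spatial GNN with node embeddings held fixed
    pred = server_gnn(embeddings.detach(), adj)
    server_loss = nn.functional.mse_loss(pred, targets)
    opt_server.zero_grad(); server_loss.backward(); opt_server.step()

    # node step: update the local encoders with the server model held fixed
    pred = server_gnn(embeddings, adj)
    node_loss = nn.functional.mse_loss(pred, targets)
    opt_nodes.zero_grad(); node_loss.backward(); opt_nodes.step()
    return node_loss.item()
```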
Multivariate time series forecasting has been extensively studied over the years, with ubiquitous applications in areas such as finance, traffic, and the environment. Still, concerns have been raised that traditional methods are incapable of modeling the complex patterns or dependencies present in real-world data. To address such concerns, various deep learning models, mainly Recurrent Neural Network (RNN) based methods, have been proposed. Nevertheless, capturing extremely long-term patterns while effectively incorporating information from other variables remains a challenge for time-series forecasting. Furthermore, a lack of explainability remains a serious drawback of deep neural network models. Inspired by the Memory Network proposed for the question-answering task, we propose a deep learning based model named Memory Time-series network (MTNet) for time series forecasting. MTNet consists of a large memory component, three separate encoders, and an autoregressive component, which are trained jointly. Additionally, the attention mechanism is designed to make MTNet highly interpretable: we can easily tell which part of the historic data is referenced the most.
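The sketch below shows attention over encoded memory blocks in the spirit of this interpretability claim: the softmax weights indicate which block of historical data the forecast attends to most. Dimensions and names are illustrative assumptions, not MTNet's actual architecture.

```python
# Minimal sketch of interpretable attention over encoded memory blocks.
import torch
import torch.nn.functional as F

def memory_attention(query, memory):
    """query: (batch, d); memory: (batch, n_blocks, d) encoded history blocks."""
    scores = torch.bmm(memory, query.unsqueeze(-1)).squeeze(-1)   # (batch, n_blocks)
    weights = F.softmax(scores, dim=-1)        # which historical block matters
    context = torch.bmm(weights.unsqueeze(1), memory).squeeze(1)  # (batch, d)
    return context, weights

q = torch.randn(2, 16)
mem = torch.randn(2, 5, 16)
ctx, w = memory_attention(q, mem)
print(w)   # attention weights over the 5 historical blocks
```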