In this paper we consider the filtering of a class of partially observed piecewise deterministic Markov processes (PDMPs). In particular, we assume that an ordinary differential equation (ODE) drives the deterministic element and can only be solved numerically via a time discretization. Building upon the approach in [20], we develop a new particle filter and multilevel particle filter (MLPF) to approximate the filter associated with the discretized ODE. We provide a bound on the mean square error of the MLPF which gives guidance on setting the simulation parameters of the algorithm and implies that significant computational gains can be obtained relative to a standard particle filter. Our theoretical claims are confirmed in several numerical examples.
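For intuition, here is a minimal sketch of the single-level building block such a scheme rests on: a bootstrap particle filter whose deterministic flow is advanced by an Euler discretization with step size h. The drift `f`, log observation density `log_g`, and initial sampler `x0_sampler` are hypothetical stand-ins rather than the authors' model, and the multilevel coupling across discretization levels is omitted.

```python
import numpy as np

def euler_flow(x, t0, t1, h, f):
    """Advance the deterministic ODE dx/dt = f(x) from t0 to t1
    with Euler steps of size h (the discretization the filter sees)."""
    t = t0
    while t < t1 - 1e-12:
        x = x + h * f(x)
        t += h
    return x

def bootstrap_pf(y_obs, n_particles, h, f, log_g, x0_sampler, rng):
    """Minimal bootstrap particle filter on the Euler-discretized flow.
    y_obs: observations at unit-spaced times; log_g(y, x): log obs. density."""
    x = x0_sampler(n_particles, rng)                 # initial particle cloud
    for k, y in enumerate(y_obs):
        x = np.array([euler_flow(xi, k, k + 1, h, f) for xi in x])
        logw = np.array([log_g(y, xi) for xi in x])  # importance weights
        w = np.exp(logw - logw.max()); w /= w.sum()
        x = x[rng.choice(n_particles, n_particles, p=w)]  # resampling
    return x
```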
In this paper we consider the problem of obtaining sharp bounds on the performance of temporal difference (TD) methods with linear function approximation for policy evaluation in discounted Markov decision processes. We show that a simple algorithm with a universal, instance-independent step size together with Polyak-Ruppert tail averaging is sufficient to obtain near-optimal variance and bias terms. We also provide the corresponding sample complexity bounds. Our proof technique is based on refined error bounds for linear stochastic approximation together with a novel stability result for products of the random matrices that arise from the TD-type recurrence.
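As a concrete illustration, the following sketch implements TD(0) with linear function approximation, a constant step size, and Polyak-Ruppert tail averaging; the transition sampler `env_step` and feature map `phi` are assumed interfaces, not specified by the paper.

```python
import numpy as np

def td0_tail_averaged(env_step, phi, d, alpha, T, gamma, burn_in_frac=0.5):
    """TD(0) with linear function approximation, a constant (instance-independent)
    step size alpha, and Polyak-Ruppert tail averaging of the final iterates.
    env_step(s, rng) -> (r, s_next) samples one transition under the policy;
    phi(s) -> feature vector in R^d."""
    rng = np.random.default_rng(0)
    theta = np.zeros(d)
    avg, count = np.zeros(d), 0
    s = 0  # hypothetical initial state
    for t in range(T):
        r, s_next = env_step(s, rng)
        delta = r + gamma * phi(s_next) @ theta - phi(s) @ theta  # TD error
        theta = theta + alpha * delta * phi(s)                    # TD(0) step
        if t >= burn_in_frac * T:                                 # tail average
            count += 1
            avg += (theta - avg) / count
        s = s_next
    return avg
```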
This paper is motivated by the question of whether a chaotic sequence can be computed efficiently: for example, is it possible to obtain the $n$-th bit of a bit sequence generated by a chaotic map, such as the $\beta$-expansion, tent map, or logistic map, in $o(n)$ time/space? This paper gives an affirmative answer to the question for the space complexity of the tent map. We prove that an $n$-bit tent code with an initial condition drawn uniformly at random can be generated exactly in $O(\log^2 n)$ space in expectation.
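For reference, a naive exact generator of the tent code looks as follows; because the exact-arithmetic state grows with each iteration, it uses far more than logarithmic space, which is what the $O(\log^2 n)$ result improves on.

```python
from fractions import Fraction

def tent_code(x0, n):
    """Naive baseline: the first n bits of the tent code of x0.
    Each bit records which half of [0, 1] the orbit visits; the exact
    rational state grows with n, so this uses much more than O(log^2 n) space."""
    x = Fraction(x0)
    bits = []
    for _ in range(n):
        bits.append(0 if x < Fraction(1, 2) else 1)
        x = 2 * x if x < Fraction(1, 2) else 2 * (1 - x)  # tent map step
    return bits

print(tent_code(Fraction(3, 10), 8))  # -> [0, 1, 1, 0, 1, 0, 1, 0]
```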
This paper develops a Blue-Green Infrastructure (BGI) performance evaluation approach by integrating the Non-dominated Sorting Genetic Algorithm II (NSGA-II) with a detailed hydrodynamic model. The proposed Cost OptimisatioN Framework for Implementing blue-Green infrastructURE (CONFIGURE), with a simplified problem-framing process and efficient genetic operations, can be coupled with any flood simulation model. In this study, CONFIGURE is integrated with the CityCAT hydrodynamic model to optimise the locations and combinations of permeable surfaces. Permeable zones with four different levels of spatial discretisation are designed to evaluate their efficiency for 100-year and 30-year return period rainstorms. Overall, the framework performs effectively for the given scenarios. The detailed hydrodynamic model explicitly captures the functioning of the permeable features and thus identifies the optimal locations for their deployment. Moreover, the size and location of the permeable surfaces and the intensity of the rainstorm events are the critical parameters for cost-effective BGI deployment.
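As an illustration of the genetic machinery involved, here is a minimal sketch of the fast non-dominated sorting at the core of NSGA-II; the objective tuples (e.g., cost and flood damage) are illustrative, and CONFIGURE's actual encoding of permeable-surface layouts is not shown.

```python
def fast_nondominated_sort(points):
    """Fast non-dominated sort, the ranking step of NSGA-II.
    points: list of objective tuples to be minimised, e.g. (cost, flood_damage).
    Returns a list of fronts of indices; front 0 is the Pareto-optimal set."""
    n = len(points)
    dominated_by = [[] for _ in range(n)]  # solutions that i dominates
    dom_count = [0] * n                    # how many solutions dominate i
    fronts = [[]]
    for i in range(n):
        for j in range(n):
            if i == j or points[i] == points[j]:
                continue
            if all(a <= b for a, b in zip(points[i], points[j])):
                dominated_by[i].append(j)   # i dominates j
            elif all(b <= a for a, b in zip(points[i], points[j])):
                dom_count[i] += 1           # j dominates i
        if dom_count[i] == 0:
            fronts[0].append(i)
    while fronts[-1]:
        nxt = []
        for i in fronts[-1]:
            for j in dominated_by[i]:
                dom_count[j] -= 1
                if dom_count[j] == 0:
                    nxt.append(j)
        fronts.append(nxt)
    return fronts[:-1]
```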
This paper challenges the well-established paradigm of building any-to-any networks for training Large Language Models (LLMs). We show that LLMs exhibit a unique communication pattern in which only small groups of GPUs require high-bandwidth any-to-any communication within them to achieve near-optimal training performance. Across these groups, the communication is insignificant, sparse, and homogeneous. We propose a new network architecture that closely matches the communication requirements of LLMs. Our architecture partitions the cluster into sets of GPUs interconnected by non-blocking any-to-any high-bandwidth interconnects, which we call HB domains. Across the HB domains, the network connects only the GPUs with communication demands. We call this a "rail-only" connection, and show that our proposed architecture reduces network cost by up to 75% compared to state-of-the-art any-to-any Clos networks without compromising the performance of LLM training.
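As a toy illustration of the rail-only idea, the predicate below assumes GPUs are numbered contiguously within each HB domain; the paper's switch counts and cost model are not reproduced.

```python
def connected(g1, g2, domain_size):
    """Rail-only connectivity: two GPUs communicate directly either when they
    sit in the same high-bandwidth (HB) domain, or when they occupy the same
    local rank ("rail") in different domains. Assumes global GPU ids are
    assigned contiguously within each HB domain."""
    same_domain = g1 // domain_size == g2 // domain_size
    same_rail = g1 % domain_size == g2 % domain_size
    return same_domain or same_rail

# With 8-GPU HB domains: GPU 3 reaches GPU 11 (both rail 3) but not GPU 12.
assert connected(3, 11, 8) and not connected(3, 12, 8)
```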
Informatization is a prevailing trend in today's world. The increasing demand for information in decision-making processes poses significant challenges for investigation activities, particularly in allocating limited resources to plan investigation programs effectively. This paper addresses the investigation path planning problem by formulating it as a multiple traveling salesman problem (MTSP). Our objective is to minimize costs, and to achieve this we propose a chaotic artificial fish swarm algorithm based on multiple-population differential evolution (DE-CAFSA). To overcome the limitations of the artificial fish swarm algorithm, such as low optimization accuracy and the inability to balance global and local information, we incorporate adaptive adjustments of the field of view and step size, replace the random behavior with the 2-opt operation, and introduce chaos theory and sub-optimal solutions to enhance optimization accuracy and search performance. We also hybridize the algorithm with differential evolution to exploit the complementary advantages of the two approaches. Experimental results demonstrate that DE-CAFSA outperforms other algorithms on public datasets of various sizes and also performs well on the examples proposed in this study.
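As an illustration of one ingredient, here is a minimal single-tour sketch of the 2-opt operation that replaces the random behavior; the chaotic initialization, adaptive field-of-view/step adjustments, and the DE hybridization are not shown.

```python
def two_opt(tour, dist):
    """One 2-opt local search pass on a single tour: reverse the segment
    between positions i and j whenever swapping edges (a,b),(c,d) for
    (a,c),(b,d) shortens the tour. tour: list of city indices;
    dist[a][b]: distance matrix entry."""
    improved = True
    while improved:
        improved = False
        for i in range(1, len(tour) - 2):
            for j in range(i + 1, len(tour) - 1):
                a, b = tour[i - 1], tour[i]
                c, d = tour[j], tour[j + 1]
                if dist[a][c] + dist[b][d] < dist[a][b] + dist[c][d]:
                    tour[i:j + 1] = reversed(tour[i:j + 1])
                    improved = True
    return tour
```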
This work presents an algorithm for tracking the shape of multiple entangling Deformable Linear Objects (DLOs) from a sequence of RGB-D images. The algorithm runs in real time and improves on previous single-DLO tracking approaches by enabling the tracking of multiple objects, which is achieved using Global-Local Topology Preservation (GLTP). This work uses the geodesic distance in GLTP to define both the distance between separate objects and the distance between different parts of the same object. Tracking of multiple entangling DLOs is demonstrated experimentally. The source code is publicly released.
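As a minimal sketch of the geodesic notion used here, assuming each DLO is represented as an ordered chain of 3D nodes, the along-chain distance can be computed as follows; treating nodes of different objects as infinitely far apart is one simple way to keep separate objects from being pulled together.

```python
import numpy as np

def geodesic_distance(nodes, i, j):
    """Geodesic distance between nodes i and j of one DLO, measured along the
    ordered node chain rather than straight through space.
    nodes: (N, 3) array of tracked 3D node positions for a single DLO."""
    lo, hi = sorted((i, j))
    edges = np.diff(nodes[lo:hi + 1], axis=0)          # consecutive edge vectors
    return float(np.linalg.norm(edges, axis=1).sum())  # sum of edge lengths

# Nodes belonging to different DLOs can be assigned distance np.inf, so the
# topology-preservation term never couples separate objects.
```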
In this paper, a new method is given for counting cycles in the Tanner graph of a (Type-I) quasi-cyclic (QC) low-density parity-check (LDPC) code whose complexity depends mainly on the base matrix and is independent of the CPM size of the constructed code. For large CPM sizes, this is the first approach that efficiently counts the cycles in the Tanner graphs of QC-LDPC codes, in contrast to existing methods. The algorithm recursively counts the cycles in the parity-check matrix column by column by finding all non-isomorphic tailless backtrackless closed (TBC) walks in the base graph and theoretically enumerating their corresponding cycles in the same equivalence class. This approach can also be modified in a few steps to find the cycle distributions of a class of LDPC codes based on affine permutation matrices (APM-LDPC codes). Notably, unlike existing methods, which count cycles only up to length $2g-2$, where $g$ is the girth, the proposed algorithm can enumerate cycles of arbitrary length in the Tanner graph. Finally, the proposed cycle-searching algorithm improves upon various previously known methods in terms of computational complexity and memory requirements.
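For contrast with the proposed method, the following brute-force baseline counts the shortest (length-4) cycles directly on the full parity-check matrix; its cost grows with the CPM size, which is exactly the dependence the base-matrix approach avoids.

```python
import numpy as np
from itertools import combinations
from math import comb

def count_4_cycles(H):
    """Count length-4 cycles in the Tanner graph of parity-check matrix H:
    every pair of rows sharing t >= 2 columns contributes C(t, 2) cycles.
    A brute-force sanity baseline, not the base-graph TBC-walk enumeration."""
    H = np.asarray(H, dtype=int)
    total = 0
    for r1, r2 in combinations(range(H.shape[0]), 2):
        t = int(np.sum(H[r1] & H[r2]))  # columns where both rows have a 1
        total += comb(t, 2)
    return total
```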
In this paper, we propose a novel method for 3D scene and object reconstruction from sparse multi-view images. Unlike previous methods that rely on extra information such as depth or features generalizable across scenes, our approach exploits the scene properties embedded in the multi-view inputs to create precise pseudo-labels for optimization without any prior training. Specifically, we introduce a geometry-guided approach that improves surface reconstruction accuracy from sparse views by using spherical harmonics to predict novel radiance while holistically considering all color observations for a point in the scene. Our pipeline also exploits proxy geometry to correctly handle occlusion when generating the radiance pseudo-labels, which previous image-warping methods fail to do. Our method, dubbed Ray Augmentation (RayAug), achieves superior results on the DTU and Blender datasets without requiring prior training, demonstrating its effectiveness in addressing the problem of sparse-view reconstruction. Our pipeline is flexible and can be integrated into other implicit neural reconstruction methods for sparse views.
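To sketch the spherical-harmonics ingredient, the snippet below evaluates a degree-1 real SH basis and predicts view-dependent radiance from per-point coefficients; fitting those coefficients by least squares over the unoccluded training views is an assumption about the pipeline, not a detail given in the abstract.

```python
import numpy as np

def sh_basis_l1(d):
    """Real spherical-harmonic basis up to degree 1 (standard normalisation)."""
    x, y, z = d / np.linalg.norm(d)
    return np.array([0.2820948,       # l=0
                     0.4886025 * y,   # l=1, m=-1
                     0.4886025 * z,   # l=1, m=0
                     0.4886025 * x])  # l=1, m=1

def predict_radiance(coeffs, view_dir):
    """Predict view-dependent RGB radiance for one scene point.
    coeffs: (4, 3) SH coefficients, e.g. least-squares fitted to the colours
    this point exhibits across the unoccluded training views."""
    return sh_basis_l1(view_dir) @ coeffs
```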
We introduce DISSC, a novel, lightweight method that converts the rhythm, pitch contour, and timbre of a recording to a target speaker in a textless manner. Unlike DISSC, most voice conversion (VC) methods focus primarily on timbre and ignore people's unique speaking style (prosody). The proposed approach uses a pretrained, self-supervised model to encode speech into discrete units, which makes it simple, effective, and fast to train. All conversion modules are trained only on reconstruction-like tasks and are therefore suitable for any-to-many VC with no paired data. We introduce a suite of quantitative and qualitative evaluation metrics for this setup and empirically demonstrate that DISSC significantly outperforms the evaluated baselines. Code and samples are available at //pages.cs.huji.ac.il/adiyoss-lab/dissc/.
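A schematic sketch of how discrete units expose rhythm: deduplicating a unit sequence separates content from durations, which a rhythm module can then re-predict for the target speaker; this illustrates the setup, not DISSC's exact pipeline.

```python
from itertools import groupby

def dedup_units(units):
    """Split a discrete-unit sequence into content and rhythm: the deduplicated
    units carry what was said, and the run lengths (durations) carry the
    speaking rhythm, which a conversion module can re-predict."""
    content, durations = [], []
    for u, run in groupby(units):
        content.append(u)
        durations.append(sum(1 for _ in run))
    return content, durations

# e.g. units 5 5 5 9 9 2  ->  content [5, 9, 2], durations [3, 2, 1]
print(dedup_units([5, 5, 5, 9, 9, 2]))
```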
Deep convolutional neural networks have pushed the state of the art in semantic segmentation, provided that a large number of images with pixel-wise annotations is available. Data collection is expensive, and transfer learning is one way to alleviate it: it reduces the amount of annotated data required to train the network, but it does not eliminate this heavy processing step. We propose a method of transfer learning without annotations on the target task for datasets with redundant content and distinct pixel distributions. Our method takes advantage of the approximate content alignment of the images between two datasets when the approximation error prevents the reuse of annotations from one dataset on the other. Given the annotations for only one dataset, we train a first network in a supervised manner. This network autonomously learns to generate deep data representations relevant to the semantic segmentation. Then, given the images in the new dataset, we train a second network to generate deep data representations that match those of the first network on the previous dataset. The training consists of a regression between feature maps and does not require any annotations on the new dataset. We show that this method reaches performance similar to classic transfer learning on the PASCAL VOC dataset with synthetic transformations.
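A minimal sketch of the annotation-free regression step, assuming hypothetical networks net_a and net_b that return intermediate feature maps of matching shape for roughly content-aligned images img_a and img_b:

```python
import torch
import torch.nn.functional as F

def feature_regression_step(net_a, net_b, img_a, img_b, optimizer):
    """One annotation-free training step: net_b (on the new dataset) regresses
    the feature maps produced by the frozen, supervised net_a on the roughly
    content-aligned image from the annotated dataset."""
    with torch.no_grad():
        target = net_a(img_a)        # deep representation from the labelled domain
    pred = net_b(img_b)              # representation to align
    loss = F.mse_loss(pred, target)  # regression between feature maps
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```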