Simulating showers of particles in highly-granular detectors is a key frontier in the application of machine learning to particle physics. Achieving high accuracy and speed with generative machine learning models would enable them to augment traditional simulations and alleviate a major computing constraint. This work achieves a major breakthrough in this task by, for the first time, directly generating a point cloud of a few thousand space points with energy depositions in the detector in 3D space without relying on a fixed-grid structure. This is made possible by two key innovations: i) using recent improvements in generative modeling we apply a diffusion model to generate ii) an initial even higher-resolution point cloud of up to 40,000 so-called Geant4 steps which is subsequently down-sampled to the desired number of up to 6,000 space points. We showcase the performance of this approach using the specific example of simulating photon showers in the planned electromagnetic calorimeter of the International Large Detector (ILD) and achieve overall good modeling of physically relevant distributions.
When analyzing real-world data it is common to work with event ensembles, which comprise sets of observations that collectively constrain the parameters of an underlying model of interest. Such models often have a hierarchical structure, where "local" parameters impact individual events and "global" parameters influence the entire dataset. We introduce practical approaches for optimal dataset-wide probabilistic inference in cases where the likelihood is intractable, but simulations can be realized via forward modeling. We construct neural estimators for the likelihood(-ratio) or posterior and show that explicitly accounting for the model's hierarchical structure can lead to tighter parameter constraints. We ground our discussion using case studies from the physical sciences, focusing on examples from particle physics (particle collider data) and astrophysics (strong gravitational lensing observations).
Managing divertor plasmas is crucial for operating reactor scale tokamak devices due to heat and particle flux constraints on the divertor target. Simulation is an important tool to understand and control these plasmas, however, for real-time applications or exhaustive parameter scans only simple approximations are currently fast enough. We address this lack of fast simulators using neural PDE surrogates, data-driven neural network-based surrogate models trained using solutions generated with a classical numerical method. The surrogate approximates a time-stepping operator that evolves the full spatial solution of a reference physics-based model over time. We use DIV1D, a 1D dynamic model of the divertor plasma, as reference model to generate data. DIV1D's domain covers a 1D heat flux tube from the X-point (upstream) to the target. We simulate a realistic TCV divertor plasma with dynamics induced by upstream density ramps and provide an exploratory outlook towards fast transients. State-of-the-art neural PDE surrogates are evaluated in a common framework and extended for properties of the DIV1D data. We evaluate (1) the speed-accuracy trade-off; (2) recreating non-linear behavior; (3) data efficiency; and (4) parameter inter- and extrapolation. Once trained, neural PDE surrogates can faithfully approximate DIV1D's divertor plasma dynamics at sub real-time computation speeds: In the proposed configuration, 2ms of plasma dynamics can be computed in $\approx$0.63ms of wall-clock time, several orders of magnitude faster than DIV1D.
Depth estimation is a crucial step for image-guided intervention in robotic surgery and laparoscopic imaging system. Since per-pixel depth ground truth is difficult to acquire for laparoscopic image data, it is rarely possible to apply supervised depth estimation to surgical applications. As an alternative, self-supervised methods have been introduced to train depth estimators using only synchronized stereo image pairs. However, most recent work focused on the left-right consistency in 2D and ignored valuable inherent 3D information on the object in real world coordinates, meaning that the left-right 3D geometric structural consistency is not fully utilized. To overcome this limitation, we present M3Depth, a self-supervised depth estimator to leverage 3D geometric structural information hidden in stereo pairs while keeping monocular inference. The method also removes the influence of border regions unseen in at least one of the stereo images via masking, to enhance the correspondences between left and right images in overlapping areas. Intensive experiments show that our method outperforms previous self-supervised approaches on both a public dataset and a newly acquired dataset by a large margin, indicating a good generalization across different samples and laparoscopes. Code and data are available at //github.com/br0202/M3Depth.
Recent studies indicate that kernel machines can often perform similarly or better than deep neural networks (DNNs) on small datasets. The interest in kernel machines has been additionally bolstered by the discovery of their equivalence to wide neural networks in certain regimes. However, a key feature of DNNs is their ability to scale the model size and training data size independently, whereas in traditional kernel machines model size is tied to data size. Because of this coupling, scaling kernel machines to large data has been computationally challenging. In this paper, we provide a way forward for constructing large-scale general kernel models, which are a generalization of kernel machines that decouples the model and data, allowing training on large datasets. Specifically, we introduce EigenPro 3.0, an algorithm based on projected dual preconditioned SGD and show scaling to model and data sizes which have not been possible with existing kernel methods.
Deterministic methods for motion planning guarantee safety amidst uncertainty in obstacle locations by trying to restrict the robot from operating in any possible location that an obstacle could be in. Unfortunately, this can result in overly conservative behavior. Chance-constrained optimization can be applied to improve the performance of motion planning algorithms by allowing for a user-specified amount of bounded constraint violation. However, state-of-the-art methods rely either on moment-based inequalities, which can be overly conservative, or make it difficult to satisfy assumptions about the class of probability distributions used to model uncertainty. To address these challenges, this work proposes a real-time, risk-aware reachability-based motion planning framework called RADIUS. The method first generates a reachable set of parameterized trajectories for the robot offline. At run time, RADIUS computes a closed-form over-approximation of the risk of a collision with an obstacle. This is done without restricting the probability distribution used to model uncertainty to a simple class (e.g., Gaussian). Then, RADIUS performs real-time optimization to construct a trajectory that can be followed by the robot in a manner that is certified to have a risk of collision that is less than or equal to a user-specified threshold. The proposed algorithm is compared to several state-of-the-art chance-constrained and deterministic methods in simulation, and is shown to consistently outperform them in a variety of driving scenarios. A demonstration of the proposed framework on hardware is also provided.
We examine the problem of variance components testing in general mixed effects models using the likelihood ratio test. We account for the presence of nuisance parameters, i.e. the fact that some untested variances might also be equal to zero. Two main issues arise in this context leading to a non regular setting. First, under the null hypothesis the true parameter value lies on the boundary of the parameter space. Moreover, due to the presence of nuisance parameters the exact location of these boundary points is not known, which prevents from using classical asymptotic theory of maximum likelihood estimation. Then, in the specific context of nonlinear mixed-effects models, the Fisher information matrix is singular at the true parameter value. We address these two points by proposing a shrinked parametric bootstrap procedure, which is straightforward to apply even for nonlinear models. We show that the procedure is consistent, solving both the boundary and the singularity issues, and we provide a verifiable criterion for the applicability of our theoretical results. We show through a simulation study that, compared to the asymptotic approach, our procedure has a better small sample performance and is more robust to the presence of nuisance parameters. A real data application is also provided.
Integrated Sensing and Communication (ISAC) is an emerging technology that integrates wireless sensing and communication into a single system, transforming many applications, including cooperative mobile robotics. However, in scenarios where radio communications are unavailable, alternative approaches are needed. In this paper, we propose a new optical ISAC (OISAC) scheme for cooperative mobile robots by integrating camera sensing and screen-camera communication (SCC). Unlike previous throughput-oriented SCC designs that work with stationary SCC links, our OISAC scheme is designed for real-time control of mobile robots. It addresses new problems such as image blur and long image display delay. As a case study, we consider the leader-follower formation control problem, an essential part of cooperative mobile robotics. The proposed OISAC scheme enables the follower robot to simultaneously acquire the information shared by the leader and sense the relative pose to the leader using only RGB images captured by its onboard camera. We then design a new control law that can leverage all the information acquired by the camera to achieve stable and accurate formations. We design and conduct real-world experiments involving uniform and nonuniform motions to evaluate the proposed system and demonstrate the advantages of applying OISAC over a benchmark approach that uses extended Kalman filtering (EKF) to estimate the leader's states. Our results show that the proposed OISAC-augmented leader-follower formation system achieves better performance in terms of accuracy, stability, and robustness.
Monocular visual-inertial odometry (VIO) is a low-cost solution to provide high-accuracy, low-drifting pose estimation. However, it has been meeting challenges in vehicular scenarios due to limited dynamics and lack of stable features. In this paper, we propose Ground-VIO, which utilizes ground features and the specific camera-ground geometry to enhance monocular VIO performance in realistic road environments. In the method, the camera-ground geometry is modeled with vehicle-centered parameters and integrated into an optimization-based VIO framework. These parameters could be calibrated online and simultaneously improve the odometry accuracy by providing stable scale-awareness. Besides, a specially designed visual front-end is developed to stably extract and track ground features via the inverse perspective mapping (IPM) technique. Both simulation tests and real-world experiments are conducted to verify the effectiveness of the proposed method. The results show that our implementation could dramatically improve monocular VIO accuracy in vehicular scenarios, achieving comparable or even better performance than state-of-art stereo VIO solutions. The system could also be used for the auto-calibration of IPM which is widely used in vehicle perception. A toolkit for ground feature processing, together with the experimental datasets, would be made open-source (//github.com/GREAT-WHU/gv_tools).
This paper describes an efficient unsupervised learning method for a neural source separation model that utilizes a probabilistic generative model of observed multichannel mixtures proposed for blind source separation (BSS). For this purpose, amortized variational inference (AVI) has been used for directly solving the inverse problem of BSS with full-rank spatial covariance analysis (FCA). Although this unsupervised technique called neural FCA is in principle free from the domain mismatch problem, it is computationally demanding due to the full rankness of the spatial model in exchange for robustness against relatively short reverberations. To reduce the model complexity without sacrificing performance, we propose neural FastFCA based on the jointly-diagonalizable yet full-rank spatial model. Our neural separation model introduced for AVI alternately performs neural network blocks and single steps of an efficient iterative algorithm called iterative source steering. This alternating architecture enables the separation model to quickly separate the mixture spectrogram by leveraging both the deep neural network and the multichannel optimization algorithm. The training objective with AVI is derived to maximize the marginalized likelihood of the observed mixtures. The experiment using mixture signals of two to four sound sources shows that neural FastFCA outperforms conventional BSS methods and reduces the computational time to about 2% of that for the neural FCA.