In this article on variational regularization for ill-posed nonlinear problems, we once again discuss the consequences of an oversmoothing penalty term. In our model, this means that the sought solution of the nonlinear operator equation under consideration does not belong to the domain of definition of the penalty functional. In recent years, such variational regularization has been investigated comprehensively in Hilbert scales, but rarely in a Banach space setting. Our present results aim to establish a theoretical justification of oversmoothing regularization in Banach scales. This new study includes convergence rate results for a priori and a posteriori choices of the regularization parameter, both for H\"older-type smoothness and low-order-type smoothness. An illustrative example indicates the specific features of the non-reflexive Banach spaces that occur.
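As a reading aid, the oversmoothing situation can be written down for a generic Tikhonov-type functional. The notation below is an assumption made for illustration (forward operator $F$, noisy data $y^\delta$, penalty $\Omega$, exact solution $x^\dagger$, squared misfit); the abstract does not fix the exact form or exponents used in the article.

```latex
% Assumed notation, not taken from the abstract:
%   F : D(F) \subseteq X \to Y   forward operator
%   y^\delta                     noisy data with \|y - y^\delta\|_Y \le \delta
%   \Omega                       penalty functional with domain D(\Omega) \subset X
T_\alpha^\delta(x) \;=\; \|F(x) - y^\delta\|_Y^2 \;+\; \alpha\,\Omega(x)
  \;\longrightarrow\; \min_{x \in D(F) \cap D(\Omega)},
\qquad
\text{oversmoothing case: } x^\dagger \notin D(\Omega),
  \ \text{i.e. } \Omega(x^\dagger) = \infty .
```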
Computational simulations have the potential to assist in liver resection surgeries by facilitating surgical planning, optimizing resection strategies, and predicting postoperative outcomes. The modeling of liver tissue across multiple length scales constitutes a significant challenge, primarily due to the multiphysics coupling of mechanical response and perfusion within the complex multiscale vascularization of the organ. In this paper, we present a modeling framework that connects continuum poroelasticity and discrete vascular tree structures to model liver tissue across disparate levels of the perfusion hierarchy. The connection is achieved through a series of modeling decisions, which include source terms in the pressure equation to model inflow from the supplying tree, pressure boundary conditions to model outflow into the draining tree, and contact conditions to model surrounding tissue. We investigate the numerical behaviour of our framework and apply it to a patient-specific full-scale liver problem that demonstrates its potential to help assess surgical liver resection procedures.
This paper presents a new distance metric to compare two continuous probability density functions. The main advantage of this metric is that, unlike other statistical measures, it provides an analytic, closed-form expression for mixtures of Gaussian distributions while satisfying all metric properties. These characteristics enable fast, stable, and efficient calculations, which are highly desirable in real-world signal processing applications. The application in mind is Gaussian Mixture Reduction (GMR), which is widely used in density estimation, recursive tracking, and belief propagation. To address this problem, we developed a novel algorithm dubbed the Optimization-based Greedy GMR (OGGMR), which employs our metric as a criterion to approximate a high-order Gaussian mixture by a lower-order one. Experimental results show that the OGGMR algorithm is significantly faster and more efficient than state-of-the-art GMR algorithms while retaining the geometric shape of the original mixture.
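The abstract does not state the metric itself, so the sketch below uses a different, well-known quantity as a stand-in: the L2 (integrated squared error) distance between two one-dimensional Gaussian mixtures. It has a closed form because the integral of a product of two Gaussian densities is a Gaussian density evaluated at the difference of the means, with the variances added. The function names and the `(weight, mean, variance)` tuple layout are assumptions made for illustration, and this is not the OGGMR criterion.

```python
import math


def gauss_pdf(x, mean, var):
    """Density of a 1-D Gaussian with the given mean and variance."""
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)


def cross_term(p, q):
    """Closed-form integral of p(x) * q(x) over the real line.

    Mixtures are lists of (weight, mean, variance) tuples. Uses the identity
    integral N(x; m1, v1) N(x; m2, v2) dx = N(m1; m2, v1 + v2).
    """
    return sum(
        wp * wq * gauss_pdf(mp, mq, vp + vq)
        for wp, mp, vp in p
        for wq, mq, vq in q
    )


def l2_distance(p, q):
    """L2 distance between two Gaussian mixtures (stand-in metric)."""
    squared = cross_term(p, p) - 2.0 * cross_term(p, q) + cross_term(q, q)
    return math.sqrt(max(squared, 0.0))  # clamp tiny negative rounding errors


# A three-component mixture and a hand-picked two-component candidate reduction.
original = [(0.5, 0.0, 1.0), (0.3, 0.2, 1.2), (0.2, 4.0, 0.5)]
reduced = [(0.8, 0.075, 1.1), (0.2, 4.0, 0.5)]
print(l2_distance(original, reduced))
```

In a greedy reduction loop, a distance of this kind could be used to score each candidate merge, keeping the candidate whose reduced mixture stays closest to the original.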
The mean-field Langevin dynamics (MFLD) is a nonlinear generalization of the Langevin dynamics that incorporates a distribution-dependent drift, and it naturally arises from the optimization of two-layer neural networks via (noisy) gradient descent. Recent works have shown that MFLD globally minimizes an entropy-regularized convex functional in the space of measures. However, all prior analyses assumed the infinite-particle or continuous-time limit, and cannot handle stochastic gradient updates. We provide a general framework to prove a uniform-in-time propagation of chaos for MFLD that takes into account the errors due to finite-particle approximation, time-discretization, and stochastic gradient approximation. To demonstrate the wide applicability of this framework, we establish quantitative convergence rate guarantees to the regularized global optimal solution under (i) a wide range of learning problems such as neural networks in the mean-field regime and MMD minimization, and (ii) different gradient estimators including SGD and SVRG. Despite the generality of our results, we achieve an improved convergence rate in both the SGD and SVRG settings when specialized to the standard Langevin dynamics.
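The three error sources named above (finite particles, time discretization, stochastic gradients) all appear in a plain noisy-SGD particle scheme. The following is a minimal sketch under stated assumptions, not the paper's algorithm: a scalar-input two-layer network `f(x) = mean_i tanh(theta_i * x)`, squared loss, an L2 penalty with strength `reg`, and entropy temperature `lam`. All names and default values are hypothetical.

```python
import math
import random


def mfld_sgd(xs, ys, n_particles=50, steps=200, eta=0.05,
             lam=0.01, reg=0.1, batch=4, seed=0):
    """Noisy minibatch gradient descent on the particles of a mean-field network."""
    rng = random.Random(seed)
    theta = [rng.gauss(0.0, 1.0) for _ in range(n_particles)]  # finite-particle approximation
    data = list(zip(xs, ys))
    for _ in range(steps):  # each iteration is one time-discretization step of size eta
        minibatch = rng.sample(data, batch)  # stochastic gradient approximation
        # Residual of the current network output on each sampled point.
        residuals = []
        for x, y in minibatch:
            pred = sum(math.tanh(t * x) for t in theta) / n_particles
            residuals.append((x, pred - y))
        new_theta = []
        for t in theta:
            # Drift: gradient of the first variation of the loss, plus the L2 penalty term.
            drift = sum(r * (1.0 - math.tanh(t * x) ** 2) * x for x, r in residuals) / batch
            drift += reg * t
            # Gaussian noise scaled by the entropy temperature lam.
            noise = math.sqrt(2.0 * eta * lam) * rng.gauss(0.0, 1.0)
            new_theta.append(t - eta * drift + noise)
        theta = new_theta
    return theta


xs = [-2.0, -1.0, 0.0, 1.0, 2.0]
ys = [math.tanh(1.5 * x) for x in xs]  # synthetic targets, for illustration only
particles = mfld_sgd(xs, ys, seed=1)
print(sum(particles) / len(particles))
```

The drift depends on all particles through the network output, which is what makes the dynamics distribution-dependent. Setting `lam=0` and `batch=len(xs)` recovers plain full-batch gradient descent on the same particles.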
A common forecasting setting in real-world applications considers a set of possibly heterogeneous time series of the same domain. Due to different properties of each time series such as length, obtaining forecasts for each individual time series in a straightforward way is challenging. This paper proposes a general framework utilizing a similarity measure in Dynamic Time Warping to find similar time series to build neighborhoods in a k-Nearest Neighbor fashion, and improve forecasts of possibly simple models by averaging. Several ways of performing the averaging are suggested, and theoretical arguments underline the usefulness of averaging for forecasting. Additionally, diagnostic tools are proposed allowing a deep understanding of the procedure.
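The neighborhood-and-averaging idea can be sketched in a few lines. The example below is an illustration under assumptions, not the paper's procedure: it uses the textbook Dynamic Time Warping recursion with absolute-difference cost, a last-value forecast as the "simple model", and an equal-weight average over the target and its `k` nearest neighbors. All function names are hypothetical.

```python
def dtw_distance(a, b):
    """Classic DTW distance between two sequences of possibly different length."""
    n, m = len(a), len(b)
    inf = float("inf")
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            step = abs(a[i - 1] - b[j - 1])
            # Extend the cheapest of the three admissible warping moves.
            cost[i][j] = step + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m]


def naive_forecast(series):
    """Simple base model (an assumption): forecast the last observed value."""
    return series[-1]


def neighbor_averaged_forecast(target, pool, k=2):
    """Average the base forecasts of the target and its k DTW-nearest neighbors."""
    neighbors = sorted(pool, key=lambda s: dtw_distance(target, s))[:k]
    forecasts = [naive_forecast(target)] + [naive_forecast(s) for s in neighbors]
    return sum(forecasts) / len(forecasts)


target = [1.0, 2.0, 3.0]
pool = [[1.0, 2.0, 2.0, 3.5], [10.0, 11.0, 12.0], [1.0, 2.0, 4.0]]
print(neighbor_averaged_forecast(target, pool, k=2))
```

Because DTW aligns sequences of different lengths, the pool can contain series shorter or longer than the target, which matches the heterogeneous setting described above. Weighted averages (for instance, weights decreasing in the DTW distance) would slot into `neighbor_averaged_forecast` in place of the equal-weight mean.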
Many stochastic continuous-state dynamical systems can be modeled as probabilistic programs with nonlinear non-polynomial updates in non-nested loops. We present two methods, one approximate and one exact, to automatically compute, without sampling, moment-based invariants for such probabilistic programs as closed-form solutions parameterized by the loop iteration. The exact method applies to probabilistic programs with trigonometric and exponential updates and is embedded in the Polar tool. The approximate method for moment computation applies to any nonlinear random function as it exploits the theory of polynomial chaos expansion to approximate non-polynomial updates as the sum of orthogonal polynomials. This translates the dynamical system to a non-nested loop with polynomial updates, and thus renders it conformable with the Polar tool that computes the moments of any order of the state variables. We evaluate our methods on an extensive number of examples ranging from modeling monetary policy to several physical motion systems in uncertain environments. The experimental results demonstrate the advantages of our approach with respect to the current state-of-the-art.
Directed acyclic graphs (DAGs) encode a lot of information about a particular distribution in their structure. However, the compute required to infer these structures is typically super-exponential in the number of variables, as inference requires a sweep of a combinatorially large space of potential structures. That is, until recent advances made it possible to search this space using a differentiable metric, drastically reducing search time. While this technique -- named NOTEARS -- is widely considered a seminal work in DAG-discovery, it concedes an important property in favour of differentiability: transportability. To be transportable, the structures discovered on one dataset must apply to another dataset from the same domain. We introduce D-Struct which recovers transportability in the discovered structures through a novel architecture and loss function while remaining fully differentiable. Because D-Struct remains differentiable, our method can be easily adopted in existing differentiable architectures, as was previously done with NOTEARS. In our experiments, we empirically validate D-Struct with respect to edge accuracy and structural Hamming distance in a variety of settings.
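For context, the differentiable measure that NOTEARS uses is the acyclicity function h(W) = tr(exp(W ∘ W)) − d, which is zero exactly when the weighted adjacency matrix W of a d-node graph describes a DAG. The sketch below evaluates it in pure Python, assuming a truncated power series for the matrix exponential. The helper names are illustrative, and this shows only the NOTEARS constraint, not the D-Struct architecture or loss.

```python
def matmul(a, b):
    """Product of two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def notears_acyclicity(w, terms=30):
    """h(W) = tr(exp(W * W)) - d, with * the elementwise (Hadamard) product.

    The matrix exponential is approximated by its power series truncated
    after `terms` terms (an assumption made to avoid external libraries).
    """
    d = len(w)
    a = [[w[i][j] ** 2 for j in range(d)] for i in range(d)]  # Hadamard square
    power = [[1.0 if i == j else 0.0 for j in range(d)] for i in range(d)]  # A^0
    trace = float(d)  # trace of the identity term
    factorial = 1.0
    for k in range(1, terms + 1):
        power = matmul(power, a)  # A^k
        factorial *= k
        trace += sum(power[i][i] for i in range(d)) / factorial
    return trace - d


dag = [[0.0, 1.5, 0.0],
       [0.0, 0.0, -2.0],
       [0.0, 0.0, 0.0]]      # edges 0 -> 1 -> 2, no cycle
cyclic = [[0.0, 1.0, 0.0],
          [0.0, 0.0, 1.0],
          [1.0, 0.0, 0.0]]   # edges 0 -> 1 -> 2 -> 0
print(notears_acyclicity(dag), notears_acyclicity(cyclic))
```

The diagonal of the k-th power of W ∘ W accumulates the weights of closed walks of length k, so the trace exceeds d only when some cycle exists. This is what lets a gradient-based optimizer penalize cycles instead of enumerating structures.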
For ultra-reliable, low-latency communications (URLLC) applications such as mission-critical industrial control and extended reality (XR), it is important to ensure the communication quality of individual packets. Prior studies have considered Probabilistic Per-packet Real-time Communications (PPRC) guarantees for single-cell, single-channel networks, but they have not considered real-world complexities such as inter-cell interference in large-scale networks with multiple communication channels and heterogeneous real-time requirements. To fill the gap, we propose a real-time scheduling algorithm based on \emph{local-deadline-partition (LDP)}, which ensures PPRC guarantees for large-scale, multi-channel networks with heterogeneous real-time constraints. We also address the associated challenge of schedulability testing. In particular, we propose the concept of \emph{feasible set}, identify a closed-form sufficient condition for the schedulability of PPRC traffic, and then propose an efficient distributed algorithm for the schedulability test. We numerically study the properties of the LDP algorithm and observe that it significantly improves the network capacity of URLLC, for instance, by a factor of 5-20 as compared with a typical method. Furthermore, the PPRC traffic supportable by the LDP algorithm is significantly higher than that of state-of-the-art comparison schemes. This demonstrates the potential of fine-grained scheduling algorithms for URLLC wireless systems in interference scenarios.
This paper proposes a framework for generating fast, smooth and predictable braking manoeuvres for a controlled robot. The proposed framework integrates two approaches to obtain feasible modal limits for designing braking trajectories. The first approach is real-time capable but conservative in its use of the available feasible actuator control region, resulting in longer braking times. In contrast, the second approach maximizes the used braking control inputs at the cost of requiring more time to evaluate larger, feasible modal limits via optimization. Both approaches allow for predicting the robot's stopping trajectory online. In addition, we formulated and solved a constrained, nonlinear final-time minimization problem to find optimal torque inputs. The optimal solutions were used as a benchmark to evaluate the performance of the proposed predictable braking framework. A comparative study against a classical optimal controller was carried out in simulation on a 7-DoF robot arm with only three moving joints. The results verified the effectiveness of our proposed framework and its integrated approaches in achieving fast robot braking manoeuvres with accurate online predictions of the stopping trajectories and distances under various braking settings.
Among randomized numerical linear algebra strategies, so-called sketching procedures are emerging as effective reduction means to accelerate the computation of Krylov subspace methods for, e.g., the solution of linear systems, eigenvalue computations, and the approximation of matrix functions. While there is plenty of experimental evidence showing that sketched Krylov solvers may dramatically improve performance over standard Krylov methods, many features of these schemes are still unexplored. We derive new theoretical results that allow us to significantly improve our understanding of sketched Krylov methods, and to identify, among several possible equivalent formulations, the most suitable sketched approximations according to their numerical stability properties. These results are also employed to analyze the error of sketched Krylov methods in the approximation of the action of matrix functions, significantly contributing to the theory available in the current literature.
Hyperparameter optimization, also known as hyperparameter tuning, is a widely recognized technique for improving model performance. Regrettably, when training private ML models, practitioners often overlook the privacy risks associated with hyperparameter optimization, which could potentially expose sensitive information about the underlying dataset. Currently, the only existing approach that allows privacy-preserving hyperparameter optimization is to select hyperparameters uniformly at random for a number of runs and then report the best-performing hyperparameter. In contrast, in non-private settings, practitioners commonly utilize "adaptive" hyperparameter optimization methods such as Gaussian process-based optimization, which select the next candidate based on information gathered from previous outputs. This substantial contrast between private and non-private hyperparameter optimization underscores a critical concern. In our paper, we introduce DP-HyPO, a pioneering framework for "adaptive" private hyperparameter optimization, aiming to bridge the gap between private and non-private hyperparameter optimization. To accomplish this, we provide a comprehensive differential privacy analysis of our framework. Furthermore, we empirically demonstrate the effectiveness of DP-HyPO on a diverse set of real-world and synthetic datasets.