Solutions to many partial differential equations satisfy certain bounds or constraints. For example, the density and pressure are positive in the equations of fluid dynamics, and in the relativistic case the fluid velocity is bounded above by the speed of light. It is widely recognized as crucial to develop bound-preserving numerical methods that preserve such intrinsic constraints. Exploring provably bound-preserving schemes has attracted much attention and has been actively studied in recent years. It remains, however, a challenging task for many systems, especially those involving nonlinear constraints. Based on some key insights from geometry, we systematically propose an innovative and general framework, referred to as geometric quasilinearization (GQL), which paves a new and effective way for studying bound-preserving problems with nonlinear constraints. The essential idea of GQL is to equivalently transform all nonlinear constraints into linear ones by properly introducing free auxiliary variables. We establish the fundamental principle and general theory of GQL via the geometric properties of convex regions, and propose three simple and effective methods for constructing GQL. We apply the GQL approach to a variety of partial differential equations and demonstrate its effectiveness and remarkable advantages for studying bound-preserving schemes, through diverse challenging examples and applications that cannot be easily handled by direct or traditional approaches.
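A standard illustration of this idea (a well-known case the GQL framework covers, not reproduced from the paper itself) is the admissible set of the compressible Euler equations, whose nonlinear internal-energy constraint admits an equivalent description by linear constraints with an auxiliary velocity $\mathbf{v}^*$:
$$ \mathcal{G} = \Big\{ \mathbf{u} = (\rho, \mathbf{m}, E)^\top : \rho > 0,\ E - \tfrac{|\mathbf{m}|^2}{2\rho} > 0 \Big\} = \Big\{ \mathbf{u} : \rho > 0,\ E - \mathbf{m} \cdot \mathbf{v}^* + \tfrac{\rho |\mathbf{v}^*|^2}{2} > 0 \ \ \forall\, \mathbf{v}^* \in \mathbb{R}^d \Big\}, $$
since the quadratic in $\mathbf{v}^*$ attains its minimum $E - |\mathbf{m}|^2/(2\rho)$ at $\mathbf{v}^* = \mathbf{m}/\rho$; for each fixed $\mathbf{v}^*$, the new constraint is linear in the conserved variables $\mathbf{u}$.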
While many works exploiting an existing Lie group structure have been proposed for state estimation, in particular the Invariant Extended Kalman Filter (IEKF), few papers address the construction of a group structure that allows casting a given system into the IEKF framework, namely making the dynamics group affine and the observations invariant. In this paper we introduce a large class of systems encompassing most problems involving a navigating vehicle that are encountered in practice. For those systems we introduce a novel methodology that systematically provides a group structure for the state space, including vectors of the body frame such as biases. We use it to derive observers having properties akin to those of linear observers or filters. The proposed unifying and versatile framework encompasses all systems where the IEKF has proved successful, improves on the state-of-the-art "imperfect" IEKF for inertial navigation with sensor biases, and allows addressing novel examples, such as GNSS antenna lever arm estimation.
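For context, the two structural requirements named above have precise meanings in the IEKF literature: dynamics $\frac{d}{dt}\chi_t = f_{u_t}(\chi_t)$ on a matrix Lie group are group affine when
$$ f_u(\chi_1 \chi_2) = f_u(\chi_1)\,\chi_2 + \chi_1\, f_u(\chi_2) - \chi_1\, f_u(\mathrm{Id})\, \chi_2 \qquad \forall\, \chi_1, \chi_2, $$
and observations are invariant when they take the left- or right-equivariant form $y = \chi\, b$ or $y = \chi^{-1} b$ for a known vector $b$. Under these conditions the estimation error evolves autonomously of the state trajectory, which is what gives the IEKF its linear-observer-like properties.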
We present a discontinuous Galerkin internal-penalty scheme that is applicable to a large class of linear and nonlinear elliptic partial differential equations. The unified scheme can accommodate all second-order elliptic equations that can be formulated in first-order flux form, encompassing problems in linear elasticity, general relativity, and hydrodynamics, including problems formulated on a curved manifold. It allows for a wide range of linear and nonlinear boundary conditions, and accommodates curved and nonconforming meshes. Our generalized internal-penalty numerical flux and our Schur-complement strategy of eliminating auxiliary degrees of freedom make the scheme compact without requiring equation-specific modifications. We demonstrate the accuracy of the scheme for a suite of numerical test problems. The scheme is implemented in the open-source SpECTRE numerical relativity code.
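For readers unfamiliar with interior-penalty fluxes, the classical symmetric scheme that such generalized fluxes extend reads, for the Poisson problem $-\Delta u = f$,
$$ a_h(u, v) = \sum_{K} \int_{K} \nabla u \cdot \nabla v \, \mathrm{d}x - \sum_{F} \int_{F} \Big( \{\!\{ \partial_n u \}\!\} [\![ v ]\!] + [\![ u ]\!] \{\!\{ \partial_n v \}\!\} - \sigma\, [\![ u ]\!] [\![ v ]\!] \Big)\, \mathrm{d}s, $$
where $\{\!\{\cdot\}\!\}$ and $[\![\cdot]\!]$ denote face averages and jumps and the penalty $\sigma \sim p^2/h$ must be chosen large enough for coercivity; this textbook form, not the paper's generalized flux, is quoted here only for orientation.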
The non-convexity of the artificial neural network (ANN) training landscape brings inherent optimization difficulties. While the traditional back-propagation stochastic gradient descent (SGD) algorithm and its variants are effective in certain cases, they can become stuck at spurious local minima and are sensitive to initializations and hyperparameters. Recent work has shown that the training of an ANN with ReLU activations can be reformulated as a convex program, bringing hope to globally optimizing interpretable ANNs. However, naively solving the convex training formulation has exponential complexity, and even an approximation heuristic requires cubic time. In this work, we characterize the quality of this approximation and develop two efficient algorithms that train ANNs with global convergence guarantees. The first algorithm is based on the alternating direction method of multipliers (ADMM). It solves both the exact convex formulation and the approximate counterpart. Linear global convergence is achieved, and the first few iterations often yield a solution with high prediction accuracy. When solving the approximate formulation, the per-iteration time complexity is quadratic. The second algorithm, based on the "sampled convex programs" theory, is simpler to implement. It solves unconstrained convex formulations and converges to an approximately globally optimal classifier. The non-convexity of the ANN training landscape is exacerbated when adversarial training is considered. We apply robust convex optimization theory to convex training and develop convex formulations that train ANNs robust to adversarial inputs. Our analysis explicitly focuses on one-hidden-layer fully connected ANNs, but can be extended to more sophisticated architectures.
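As a minimal sketch of the second, sampled approach (illustrative only: the function name, the squared loss, and the random gate sampling are assumptions following the general shape of such convex reformulations, not the paper's exact setup), one can sample candidate ReLU activation patterns and solve the resulting unconstrained convex program, e.g. with cvxpy:

```python
# A minimal sketch of sampled, unconstrained convex training of a
# one-hidden-layer ReLU network. Names and loss choice are illustrative.
import numpy as np
import cvxpy as cp

def sampled_convex_relu_fit(X, y, n_patterns=20, beta=1e-3, seed=0):
    n, d = X.shape
    rng = np.random.default_rng(seed)
    # Sample candidate activation patterns D_i = diag(1[X g_i >= 0]).
    G = rng.standard_normal((d, n_patterns))
    D = (X @ G >= 0).astype(float)                  # (n, n_patterns) gates
    V = cp.Variable((d, n_patterns))                # "positive" neurons
    W = cp.Variable((d, n_patterns))                # "negative" neurons
    preds = cp.sum(cp.multiply(D, X @ (V - W)), axis=1)
    # Group-lasso penalty over neurons (columns of V and W).
    reg = beta * (cp.mixed_norm(V.T, 2, 1) + cp.mixed_norm(W.T, 2, 1))
    cp.Problem(cp.Minimize(cp.sum_squares(preds - y) + reg)).solve()
    return V.value, W.value, G
```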
Inverse kinematics (IK) is the problem of finding robot joint configurations that satisfy constraints on the position or pose of one or more end-effectors. For robots with redundant degrees of freedom, there is often an infinite, nonconvex set of solutions. The IK problem is further complicated when collision avoidance constraints are imposed by obstacles in the workspace. In general, closed-form expressions yielding feasible configurations do not exist, motivating the use of numerical solution methods. However, these approaches rely on local optimization of nonconvex problems, often requiring an accurate initialization or numerous re-initializations to converge to a valid solution. In this work, we first formulate inverse kinematics with complex workspace constraints as a convex feasibility problem whose low-rank feasible points provide exact IK solutions. We then present \texttt{CIDGIK} (Convex Iteration for Distance-Geometric Inverse Kinematics), an algorithm that solves this feasibility problem with a sequence of semidefinite programs whose objectives are designed to encourage low-rank minimizers. Our problem formulation elegantly unifies the configuration space and workspace constraints of a robot: intrinsic robot geometry and obstacle avoidance are both expressed as simple linear matrix equations and inequalities. Our experimental results for a variety of popular manipulator models demonstrate faster and more accurate convergence than a conventional nonlinear optimization-based approach, especially in environments with many obstacles.
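A sketch of one "convex iteration" step, under stated assumptions (a Gram-matrix variable $Z$, pairwise distance constraints only, and a rank-3 target so that points embed in 3-D; the names and the reduced constraint set are illustrative, not \texttt{CIDGIK}'s full model):

```python
# One convex-iteration step: solve an SDP with distances as linear equalities
# on the Gram matrix, then re-aim the trace objective at low rank.
import numpy as np
import cvxpy as cp

def convex_iteration_step(n, distance_constraints, C):
    """distance_constraints: list of (i, j, d) meaning ||p_i - p_j|| = d."""
    Z = cp.Variable((n, n), PSD=True)
    cons = [Z[i, i] + Z[j, j] - 2 * Z[i, j] == d**2
            for (i, j, d) in distance_constraints]
    cp.Problem(cp.Minimize(cp.trace(C @ Z)), cons).solve()
    # Point the next objective at the eigenspace that should vanish:
    # the n - 3 smallest eigenvalues of a rank-3 Z are zero.
    _, U = np.linalg.eigh(Z.value)       # eigenvalues in ascending order
    U_small = U[:, : n - 3]
    return Z.value, U_small @ U_small.T  # new direction matrix C

# Usage: start from C = np.eye(n) and iterate until trace(C @ Z) is near
# zero, at which point Z factors into 3-D point coordinates.
```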
For the general class of residual distribution (RD) schemes, including many finite element (such as continuous/discontinuous Galerkin) and flux reconstruction methods, an approach to construct entropy conservative/dissipative semidiscretizations by adding suitable correction terms has been proposed by Abgrall (J.~Comp.~Phys. 372: pp. 640--666, 2018). In this work, the correction terms are characterized as solutions of certain optimization problems and are adapted to the SBP-SAT framework, focusing on discontinuous Galerkin methods. Novel generalizations to entropy inequalities, multiple constraints, and kinetic energy preservation for the Euler equations are developed and tested in numerical experiments. For all of these optimization problems, explicit solutions are provided. Additionally, the correction approach is applied for the first time to obtain a fully discrete entropy conservative/dissipative RD scheme. Here, the application of the deferred correction (DeC) method for the time integration is essential. This paper can be seen as describing a systematic method to construct structure-preserving discretizations, at least for the considered examples.
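The shape of the correction is simple to state (following the cited paper, with notation lightly adapted here): given nodal residuals $\Phi_i$ and entropy variables $v_i$ with mean $\bar{v}$, one adds $r_i = \alpha\,(v_i - \bar{v})$ with
$$ \alpha = \frac{E - \sum_i v_i^\top \Phi_i}{\sum_i (v_i - \bar{v})^\top (v_i - \bar{v})}, $$
where $E$ is the target entropy balance; since $\sum_i r_i = 0$, conservation is untouched, while $\sum_i v_i^\top (\Phi_i + r_i) = E$ enforces the entropy condition.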
In this paper, we consider downlink low Earth orbit (LEO) satellite communication systems where multiple LEO satellites are uniformly distributed over a sphere at a certain altitude according to a homogeneous binomial point process (BPP). Based on the characteristics of the BPP, we analyze the distance distributions and the distribution cases for the serving satellite. We analytically derive the exact outage probability, and an approximate expression is obtained using the Poisson limit theorem. With these derived expressions, the system throughput maximization problem is formulated under satellite-visibility and outage constraints. To solve this problem, we reformulate it with bounded feasible sets and propose an iterative algorithm to obtain near-optimal solutions. Simulation results perfectly match the derived exact expressions for the outage probability and system throughput, and the analytical results from the approximate expressions are fairly close to the exact ones. It is also shown that the proposed algorithm for throughput maximization performs very close to the optimum obtained by a two-dimensional exhaustive search.
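One building block behind such analyses (a standard spherical-cap computation consistent with the BPP model above; the notation is ours, not the paper's) is the distribution of the distance to the nearest of $N$ satellites on a sphere of radius $R_S$, seen from a user at distance $R_E$ from the Earth's center:
$$ F_{d_{\min}}(d) = 1 - \left( 1 - \frac{d^2 - (R_S - R_E)^2}{4 R_E R_S} \right)^{N}, \qquad R_S - R_E \le d \le R_S + R_E, $$
since each satellite independently lies within distance $d$ with probability equal to the normalized area of the corresponding spherical cap; visibility constraints then truncate this support.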
Imbalanced datasets are commonplace in modern machine learning problems. The presence of under-represented classes or groups with sensitive attributes raises concerns about generalization and fairness. Such concerns are further exacerbated by the fact that large-capacity deep nets can perfectly fit the training data and appear to achieve perfect accuracy and fairness during training, yet perform poorly at test time. To address these challenges, we propose AutoBalance, a bi-level optimization framework that automatically designs a training loss function to optimize a blend of accuracy and fairness-seeking objectives. Specifically, a lower-level problem trains the model weights, and an upper-level problem tunes the loss function by monitoring and optimizing the desired objective over the validation data. Our loss design enables personalized treatment for classes/groups by employing a parametric cross-entropy loss and individualized data augmentation schemes. We evaluate the benefits and performance of our approach for the application scenarios of imbalanced and group-sensitive classification. Extensive empirical evaluations demonstrate the benefits of AutoBalance over state-of-the-art approaches. Our experimental findings are complemented with theoretical insights on loss function design and the benefits of the train-validation split. All code is available open-source.
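A minimal sketch of the kind of parametric cross-entropy such a bi-level design can tune (parameter names are illustrative; the actual AutoBalance loss and augmentation design follow the paper):

```python
# Per-class multiplicative (delta) and additive (iota) logit adjustments are
# upper-level variables that can be optimized on validation data.
import torch
import torch.nn.functional as F

def parametric_cross_entropy(logits, targets, delta, iota):
    """logits: (B, C); targets: (B,); delta, iota: (C,) loss-design params."""
    return F.cross_entropy(delta * logits + iota, targets)

# Lower level: train model weights against this loss for fixed (delta, iota).
# Upper level: update (delta, iota) to optimize a validation objective such
# as balanced error.
```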
The positive definiteness of discrete time-fractional derivatives is fundamental to the numerical stability (in the energy sense) of time-fractional phase-field models. A novel technique is proposed to estimate the minimum eigenvalue of discrete convolution kernels generated by the nonuniform L1, half-grid-based L1, and time-averaged L1 formulas of the Caputo fractional derivative. The main discrete tools are the discrete orthogonal convolution kernels and discrete complementary convolution kernels. Certain variational energy dissipation laws at discrete levels of the variable-step L1-type methods are then established for the time-fractional Cahn-Hilliard model. They are shown to be asymptotically compatible, in the fractional-order limit $\alpha\rightarrow1$, with the associated energy dissipation law for the classical Cahn-Hilliard equation. Numerical examples, together with an adaptive time-stepping procedure, are provided to demonstrate the effectiveness of the proposed methods.
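For concreteness, the (standard) nonuniform L1 formula referred to above approximates the Caputo derivative $\partial_t^\alpha v(t) = \int_0^t \omega_{1-\alpha}(t-s)\, v'(s)\, \mathrm{d}s$, with $\omega_\beta(t) = t^{\beta-1}/\Gamma(\beta)$, on a mesh $0 = t_0 < t_1 < \cdots$ with steps $\tau_k = t_k - t_{k-1}$ by
$$ (\partial_\tau^\alpha v)^n = \sum_{k=1}^{n} A_{n-k}^{(n)} \big( v^k - v^{k-1} \big), \qquad A_{n-k}^{(n)} = \frac{1}{\tau_k} \int_{t_{k-1}}^{t_k} \omega_{1-\alpha}(t_n - s)\, \mathrm{d}s, $$
and the positive definiteness in question is that of the quadratic form $\sum_{n} w^n (\partial_\tau^\alpha w)^n$ built from these discrete convolution kernels.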
Deep learning is the mainstream technique for many machine learning tasks, including image recognition, machine translation, speech recognition, and so on. It has outperformed conventional methods in various fields and achieved great successes. Unfortunately, the understanding of how it works remains unclear, and laying down the theoretical foundation for deep learning is of central importance. In this work, we give a geometric view to understand deep learning: we show that the fundamental principle behind its success is the manifold structure in data, namely that natural high-dimensional data concentrate close to a low-dimensional manifold, and deep learning learns the manifold and the probability distribution on it. We further introduce the rectified linear complexity of a deep neural network, which measures its learning capability, and the rectified linear complexity of an embedding manifold, which describes the difficulty of learning the manifold. We then show that, for any deep neural network with a fixed architecture, there exists a manifold that cannot be learned by the network. Finally, we propose to apply optimal mass transportation theory to control the probability distribution in the latent space.
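A small numerical illustration of the piecewise-linear picture behind these complexity notions (our toy example, not from the paper): for a one-hidden-layer ReLU map in one dimension, counting the distinct activation patterns realized along an interval counts the linear pieces, which is bounded by the number of hidden units plus one.

```python
# A ReLU net is piecewise linear; count realized activation patterns in 1-D.
import numpy as np

rng = np.random.default_rng(0)
H = 8
W1 = rng.standard_normal((1, H))   # input-to-hidden weights
b1 = rng.standard_normal(H)        # hidden biases

x = np.linspace(-3.0, 3.0, 100001)[:, None]
patterns = (x @ W1 + b1 > 0)                      # (100001, H) gate states
n_pieces = np.unique(patterns, axis=0).shape[0]   # realized linear pieces
print(f"{n_pieces} linear pieces (upper bound {H + 1})")
```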
In this paper, we study the optimal convergence rates for distributed convex optimization problems over networks. We model the communication restrictions imposed by the network as a set of affine constraints and provide optimal complexity bounds for four different setups, namely when the function $F(\mathbf{x}) \triangleq \sum_{i=1}^{m} f_i(\mathbf{x})$ is strongly convex and smooth, strongly convex only, smooth only, or just convex. Our results show that Nesterov's accelerated gradient descent on the dual problem can be executed in a distributed manner and obtains the same optimal rates as in the centralized version of the problem (up to constant or logarithmic factors), with an additional cost related to the spectral gap of the interaction matrix. Finally, we discuss some extensions of the proposed setup, such as proximal-friendly functions, time-varying graphs, and improvements of the condition numbers.
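Concretely, the standard reformulation behind such results (notation here is illustrative) stacks local copies $\mathbf{x}_i$ at the nodes and encodes the network through a gossip matrix $W$ whose kernel is the consensus subspace:
$$ \min_{\mathbf{x}_1, \ldots, \mathbf{x}_m} \sum_{i=1}^{m} f_i(\mathbf{x}_i) \quad \text{s.t.} \quad W^{1/2} \mathbf{x} = 0. $$
The dual of this problem is smooth (respectively strongly convex) whenever each $f_i$ is strongly convex (respectively smooth), and running Nesterov's accelerated gradient on the dual requires only multiplications by $W$, i.e., local communications; the extra factor involving the spectral gap of $W$ in the resulting rate is the price of decentralization.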