A reliable model order reduction process for parametric analysis in electromagnetics is detailed. Special emphasis is placed on certifying the accuracy of the reduced-order model. For this purpose, a sharp state error estimator is proposed. Standard a posteriori state error estimation for model order reduction relies on the inf-sup constant, which is parameter-dependent for parametric systems. A posteriori error estimation for systems with very small or vanishing inf-sup constants is challenging: since the standard estimator is inversely proportional to the inf-sup constant, it becomes overly pessimistic to the point of being useless. Such systems arise in electromagnetics, where the inf-sup constant approaches zero at frequencies near resonances and vanishes at the resonances themselves. We propose a novel a posteriori state error estimator that avoids computing the inf-sup constant altogether. The proposed estimator is compared with the standard one and with an estimator recently proposed in the literature, and is shown to outperform both. Numerical experiments are performed on real-life microwave devices, including narrowband and wideband antennas, two types of dielectric resonator filters, and a dual-mode waveguide filter. These examples demonstrate the capabilities and efficiency of the proposed methodology.
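As a schematic illustration of the bound in question (notation ours, not taken from the abstract): for a parametric linear system $A(\mu)x(\mu)=b(\mu)$ with reduced-order solution $x_r(\mu)$, the standard residual-based estimator reads
\[
\|x(\mu)-x_r(\mu)\| \;\le\; \frac{\|b(\mu)-A(\mu)\,x_r(\mu)\|}{\beta(\mu)},
\qquad
\beta(\mu)=\inf_{v\neq 0}\sup_{w\neq 0}\frac{|w^{H}A(\mu)\,v|}{\|w\|\,\|v\|},
\]
so the bound blows up as $\beta(\mu)\to 0$ near a resonance even when the true error remains small, which is precisely the failure mode the proposed estimator avoids.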
Diffusion models are powerful generative models, but they suffer from slow sampling, often taking 1000 sequential denoising steps to produce one sample. Considerable effort has therefore been directed toward reducing the number of denoising steps, but these methods hurt sample quality. Instead of reducing the number of denoising steps (trading quality for speed), in this paper we explore an orthogonal approach: can we run the denoising steps in parallel (trading compute for speed)? Despite the sequential nature of the denoising steps, we show that, surprisingly, it is possible to parallelize sampling via Picard iterations, by guessing the solution of future denoising steps and iteratively refining until convergence. With this insight, we present ParaDiGMS, a novel method for accelerating the sampling of pretrained diffusion models by denoising multiple steps in parallel. ParaDiGMS is the first diffusion sampling method that enables trading compute for speed, and it is even compatible with existing fast sampling techniques such as DDIM and DPMSolver. Using ParaDiGMS, we improve sampling speed by 2-4x across a range of robotics and image generation models, achieving state-of-the-art sampling speeds of 0.2s on 100-step DiffusionPolicy and 16s on 1000-step StableDiffusion-v2 with no measurable degradation of task reward, FID score, or CLIP score.
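To make the Picard idea concrete, here is a minimal numpy sketch (ours, not the paper's implementation) for a generic sequential update $x_{t+1}=x_t+\mathrm{drift}(x_t,t)$: all drift evaluations within one sweep are batched, which is where the parallelism comes from; a toy linear drift stands in for the pretrained denoiser.

```python
import numpy as np

def picard_sample(drift, x0, T, tol=1e-4, max_iters=100):
    """Solve x[t+1] = x[t] + drift(x[t], t) for t = 0..T-1 via Picard
    iteration: guess the whole trajectory, then refine all steps at once.
    Each sweep evaluates `drift` at every step simultaneously (batched),
    which is where the parallel speedup comes from."""
    x = np.tile(x0, (T + 1, 1))               # initial guess: constant trajectory
    for _ in range(max_iters):
        d = drift(x[:-1], np.arange(T))       # all T drift evals in one batch
        x_new = x.copy()
        x_new[1:] = x0 + np.cumsum(d, axis=0) # x[t] = x0 + sum_{s<t} d[s]
        if np.max(np.abs(x_new - x)) < tol:   # converged to sequential answer
            return x_new
        x = x_new
    return x

# Toy drift standing in for a pretrained denoiser (illustrative only).
drift = lambda x, t: -0.01 * x
traj = picard_sample(drift, x0=np.ones((1, 4)), T=1000)
```

Because step $t$ becomes exact once steps $0,\dots,t-1$ are exact, the iteration reproduces the sequential trajectory in at most $T$ sweeps, and often in far fewer.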
Monte Carlo integration is a commonly used technique for computing intractable integrals and is typically thought to perform poorly in very high dimensions. To show that this is not always the case, we examine Monte Carlo integration through the lens of the high-dimensional statistics literature, allowing the dimension of the integral to grow. In doing so, we derive non-asymptotic bounds on the relative and absolute error of the approximation for some general classes of functions via concentration inequalities. We provide concrete examples in which the number of sample points needed to guarantee a consistent estimate ranges from polynomial to exponential, and show that in theory arbitrarily fast or slow rates are possible; the behaviour of Monte Carlo integration in high dimensions is therefore not uniform. Our methods also yield non-asymptotic confidence intervals for the Monte Carlo estimate that are valid regardless of the number of points sampled.
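For bounded integrands, one standard route to such non-asymptotic confidence intervals is Hoeffding's inequality; the following minimal Python sketch (ours, not the paper's more general construction) illustrates it.

```python
import numpy as np

def mc_estimate_with_ci(f, sampler, n, lo, hi, delta=0.05, rng=None):
    """Monte Carlo estimate of E[f(X)] with a non-asymptotic confidence
    interval from Hoeffding's inequality, valid at level 1-delta for any
    sample size n provided lo <= f(x) <= hi for all x.  (Illustrative
    sketch; the paper's bounds come from more general concentration tools.)"""
    rng = rng or np.random.default_rng()
    xs = sampler(n, rng)                       # n points in d dimensions
    vals = f(xs)
    est = vals.mean()
    half = (hi - lo) * np.sqrt(np.log(2.0 / delta) / (2.0 * n))
    return est, (est - half, est + half)

# Example: integrate f(x) = mean(cos(x_i)) over [0,1]^d with d = 1000.
d = 1000
f = lambda xs: np.cos(xs).mean(axis=1)         # bounded in [-1, 1]
sampler = lambda n, rng: rng.random((n, d))
est, (ci_lo, ci_hi) = mc_estimate_with_ci(f, sampler, n=10_000, lo=-1.0, hi=1.0)
```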
This paper studies delayed stochastic algorithms for weakly convex optimization in a distributed network of workers connected to a master node. More specifically, we consider a structured stochastic weakly convex objective given by the composition of a convex function and a smooth nonconvex function. Recently, Xu et al. (2022) showed that an inertial stochastic subgradient method converges at a rate of $\mathcal{O}(\tau/\sqrt{K})$, which suffers a significant penalty from the maximum information delay $\tau$. To alleviate this issue, we propose a new delayed stochastic prox-linear ($\texttt{DSPL}$) method in which the master performs the proximal update of the parameters and the workers only need to linearly approximate the inner smooth function. Somewhat surprisingly, we show that the delays affect only the higher-order term of the complexity rate and hence become negligible after a certain number of $\texttt{DSPL}$ iterations. Moreover, to further improve empirical performance, we propose a delayed extrapolated prox-linear ($\texttt{DSEPL}$) method that employs Polyak-type momentum to speed up convergence. Building on the tools developed for analyzing $\texttt{DSPL}$, we also derive an improved analysis of the delayed stochastic subgradient method ($\texttt{DSGD}$). In particular, for general weakly convex problems, we show that the convergence of $\texttt{DSGD}$ depends only on the expected delay.
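Schematically, in our own notation (the paper's exact formulation may differ), a delayed prox-linear step solves a convex subproblem built from a stale linearization supplied by a worker:
\[
x_{k+1} \;=\; \operatorname*{arg\,min}_{x}\;
h\Bigl(c\bigl(x_{k-\tau_k}\bigr)+\nabla c\bigl(x_{k-\tau_k}\bigr)^{\top}\bigl(x-x_{k-\tau_k}\bigr)\Bigr)
\;+\;\frac{1}{2\gamma}\,\|x-x_k\|^{2},
\]
where $h$ is the outer convex function, $c$ the inner smooth map (in practice a stochastic estimate), $\tau_k$ the delay of the reporting worker, and $\gamma$ the step size; the master solves the convex subproblem while the workers only compute $c$ and $\nabla c$.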
We study the problem of approximate sampling from non-log-concave distributions, e.g., Gaussian mixtures, which is often challenging even in low dimensions due to multimodality. We focus on performing this task via Markov chain Monte Carlo (MCMC) methods derived from discretizations of the overdamped Langevin diffusion, commonly known as Langevin Monte Carlo algorithms. We are also interested in two nonsmooth settings for which a large class of proximal MCMC methods has been developed: (i) a nonsmooth prior combined with a Gaussian mixture likelihood; (ii) a Laplacian mixture distribution. Such nonsmooth, non-log-concave sampling tasks arise in a wide range of applications in Bayesian inference and imaging inverse problems, such as image deconvolution. We perform numerical simulations to compare the performance of the most commonly used Langevin Monte Carlo algorithms.
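As a concrete reference point, the canonical such discretization is the unadjusted Langevin algorithm (ULA), which iterates $x_{k+1}=x_k-\gamma\nabla U(x_k)+\sqrt{2\gamma}\,\xi_k$ for a target $\pi\propto e^{-U}$; below is a minimal sketch (ours, for illustration) on a toy two-component Gaussian mixture.

```python
import numpy as np

def ula(grad_U, x0, step, n_iters, rng=None):
    """Unadjusted Langevin Algorithm: Euler-Maruyama discretization of the
    overdamped Langevin diffusion targeting pi(x) proportional to exp(-U(x))."""
    rng = rng or np.random.default_rng()
    x, chain = np.asarray(x0, float), []
    for _ in range(n_iters):
        x = x - step * grad_U(x) + np.sqrt(2 * step) * rng.standard_normal(x.shape)
        chain.append(x.copy())
    return np.array(chain)

# Toy 1-D mixture 0.5*N(-2,1) + 0.5*N(2,1); grad_U is -d/dx log pi(x).
def grad_U(x):
    w = 1.0 / (1.0 + np.exp(-4.0 * x))  # posterior weight of the +2 mode
    return (x + 2.0) - 4.0 * w

samples = ula(grad_U, x0=np.zeros(1), step=0.05, n_iters=20_000)
```

Proximal variants replace $\nabla U$ (or its nonsmooth part) with a gradient of the Moreau envelope, which is what makes the nonsmooth settings above tractable.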
In the present paper, we examine a Crouzeix-Raviart approximation of non-linear partial differential equations with $(p(\cdot),\delta)$-structure. We establish a medius error estimate, i.e., a best-approximation result, which holds for uniformly continuous exponents and implies a priori error estimates that apply to H\"older continuous exponents and are optimal for Lipschitz continuous exponents. The theoretical findings are supported by numerical experiments.
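Schematically, and in generic notation not taken from the paper, a medius error estimate bounds the discretization error by the best-approximation error plus data oscillation,
\[
\|u-u_h\| \;\lesssim\; \inf_{v_h\in V_h^{\mathrm{cr}}}\|u-v_h\| \;+\; \operatorname{osc}(f,\mathcal{T}_h),
\]
measured in a problem-adapted distance; it thus sits between an a priori estimate (which it implies) and an a posteriori one, whence the name.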
We study implementations of basic fault-tolerant primitives, such as consensus and registers, in message-passing systems subject to process crashes and a broad range of communication failures. Our results characterize the necessary and sufficient conditions for implementing these primitives as a function of the connectivity constraints and synchrony assumptions. Our main contribution is a new algorithm for partially synchronous consensus that is resilient to process crashes and channel failures and is optimal in its connectivity requirements. In contrast to prior work, our algorithm assumes the most general model of message loss where faulty channels are flaky, i.e., can lose messages without any guarantee of fairness. This failure model is particularly challenging for consensus algorithms, as it rules out standard solutions based on leader oracles and failure detectors. To circumvent this limitation, we construct our solution using a new variant of the recently proposed view synchronizer abstraction, which we adapt to the crash-prone setting with flaky channels.
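For intuition only, the sketch below shows a hypothetical Python interface for a view synchronizer (names and signatures are ours, not the paper's): processes ask to advance through numbered views, and the abstraction guarantees that correct processes eventually overlap in a common view long enough for a consensus attempt to complete, despite crashes and flaky channels.

```python
from abc import ABC, abstractmethod

class ViewSynchronizer(ABC):
    """Hypothetical view synchronizer interface (illustrative only).
    Drives processes through numbered views and guarantees that correct
    processes eventually spend long enough in a common view to complete
    a consensus attempt, despite crashes and unfair message loss."""

    @abstractmethod
    def advance(self) -> None:
        """Signal the wish to leave the current view, e.g., after a
        timeout during which the consensus attempt made no progress."""

    @abstractmethod
    def on_new_view(self, view: int) -> None:
        """Upcall delivered when this process enters `view`; the consensus
        layer then runs one attempt (e.g., a leader-based round) in it."""
```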
We present evidence that language models can learn meaning despite being trained only to perform next-token prediction on text, specifically a corpus of programs. Each program is preceded by a specification in the form of (textual) input-output examples. Working with programs enables us to precisely define concepts relevant to meaning in language (e.g., correctness and semantics), making program synthesis well-suited as an intermediate testbed for characterizing the presence (or absence) of meaning in language models. We first train a Transformer model on the corpus of programs, then probe the trained model's hidden states as it completes a program given a specification. Although training provides no inductive bias toward learning the semantics of the language, we find that a linear probe is able to extract abstractions of both current and future program states from the model's hidden states. Moreover, there is a strong, statistically significant correlation between the accuracy of the probe and the model's ability to generate a program that implements the specification. To evaluate whether the semantics are represented in the model states rather than learned by the probe, we design a novel experimental procedure that intervenes on the semantics of the language while preserving the lexicon and syntax. We also demonstrate that the model learns to generate correct programs that are, on average, shorter than those in the training set, which is evidence that language model outputs may differ from the training distribution in semantically meaningful ways. In summary, this paper does not propose any new techniques for training language models; rather, it develops an experimental framework for, and provides insights into, the acquisition and representation of (formal) meaning in language models.
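The probing step itself is simple; here is a minimal Python sketch (ours; the paper's exact feature extraction and label design may differ) of fitting a linear probe on frozen hidden states and measuring held-out accuracy.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

def linear_probe_accuracy(hidden_states, labels, seed=0):
    """Fit a linear probe mapping frozen LM hidden states to a semantic
    label (e.g., an abstraction of the current program state) and report
    held-out accuracy.  Illustrative sketch only."""
    X_tr, X_te, y_tr, y_te = train_test_split(
        hidden_states, labels, test_size=0.2, random_state=seed)
    probe = LogisticRegression(max_iter=1000).fit(X_tr, y_tr)
    return probe.score(X_te, y_te)

# Toy stand-in data: 512-d "hidden states" with 4 semantic classes.
rng = np.random.default_rng(0)
H = rng.standard_normal((2000, 512))
y = rng.integers(0, 4, size=2000)
print(linear_probe_accuracy(H, y))  # ~chance (0.25) on random features
```

Keeping the probe linear is what makes the result informative: any accuracy above chance must come from structure already present in the hidden states, not from the probe's own capacity.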
The relationship between the number of training data points, the number of parameters in a statistical model, and the generalization capabilities of the model has been widely studied. Previous work has shown that double descent can occur in the over-parameterized regime, and has generally assumed that the standard bias-variance trade-off holds in the under-parameterized regime. In this paper, we present a simple example that provably exhibits double descent in the under-parameterized regime. For simplicity, we study the ridge-regularized least squares denoising problem with data on a line embedded in high-dimensional space. By deriving an asymptotically accurate formula for the generalization error, we observe sample-wise and parameter-wise double descent with the peak in the under-parameterized regime rather than at the interpolation point or in the over-parameterized regime. Furthermore, the peak of the sample-wise double descent curve corresponds to a peak in the curve for the norm of the estimator, and adjusting $\mu$, the strength of the ridge regularization, shifts the location of the peak. We observe that parameter-wise double descent occurs for this model for small $\mu$; for larger values of $\mu$, the curve for the norm of the estimator still has a peak, but this no longer translates into a peak in the generalization error. Moreover, we study the training error for this problem. The considered problem setup allows us to study the interaction between two regularizers: we provide empirical evidence that the model implicitly favors the ridge regularizer over the input-data-noise regularizer. Thus, even though both regularizers regularize the same quantity, namely the norm of the estimator, they are not equivalent.
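For intuition, the following toy Python sketch (our construction, not the paper's exact setup) fits the closed-form ridge-regularized linear denoiser for data on a line and measures its test error; sweeping the number of samples $n$ traces out the sample-wise error curve.

```python
import numpy as np

def ridge_denoiser_error(n, d, mu, noise=0.5, n_test=2000, rng=None):
    """Toy ridge-regularized least-squares denoising with data on a line:
    clean points x = t*v for scalar t, observed as y = x + noise.  Fit the
    linear denoiser W minimizing ||Y W^T - X||_F^2 + mu*||W||_F^2 in closed
    form and return its mean squared test error.  (For intuition only.)"""
    rng = rng or np.random.default_rng(0)
    v = np.zeros(d); v[0] = 1.0                   # the embedded line
    def draw(m):
        t = rng.standard_normal(m)
        X = np.outer(t, v)                        # clean points on the line
        return X, X + noise * rng.standard_normal((m, d))
    X, Y = draw(n)
    W = X.T @ Y @ np.linalg.inv(Y.T @ Y + mu * np.eye(d))  # closed form
    Xt, Yt = draw(n_test)
    return np.mean((Yt @ W.T - Xt) ** 2)

errs = [ridge_denoiser_error(n, d=100, mu=0.1) for n in (10, 50, 100, 200, 500)]
```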
It is well known that the Cauchy problem for the Laplace equation is ill-posed in the sense of Hadamard: small deviations in the Cauchy data may lead to large errors in the solution. It is also known that if a bound is imposed on the solution, a conditional stability estimate holds, which gives a reasonable way to construct stable algorithms. Even so, it is impossible to obtain accurate results at all points of the domain. Although numerical methods for the Cauchy problem for the Laplace equation have been studied for a long time, some questions remain open, for example: how should a numerical solution be evaluated, that is, does it approximate the Cauchy data well while respecting the bound on the solution, and at which points are the numerical results reliable? In this paper, we prove a conditional stability estimate that is quantitatively related to the harmonic measure. The harmonic measure can be used as an indicator function to evaluate the numerical result pointwise, which further enables us to find a reliable subdomain where the local convergence rate is higher than a certain order.
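Schematically, in the spirit of the classical two-constants theorem (our notation; the paper's estimate may be more refined), such a conditional stability estimate takes the form
\[
|u(x)| \;\le\; \varepsilon^{\,\omega(x)}\, M^{\,1-\omega(x)}, \qquad x\in\Omega,
\]
where $\varepsilon$ bounds the Cauchy-data error on the accessible boundary portion $\Gamma$, $M$ is the imposed bound on the solution, and $\omega(x)$ is the harmonic measure of $\Gamma$ with respect to $\Omega$ evaluated at $x$; points where $\omega(x)$ is close to one are precisely those where the numerical solution can be trusted.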
Intelligent reflecting surfaces (IRSs) are being widely investigated as a potential low-cost and energy-efficient alternative to active relays for improving coverage in next-generation cellular networks. However, technical constraints on the configuration of IRSs should be taken into account in the design of scheduling solutions and the assessment of their performance. To this end, we examine an IRS-assisted time division multiple access (TDMA) cellular network where reconfiguring the IRS incurs a communication cost; we therefore aim to limit the number of reconfigurations over time. Along these lines, we propose a clustering-based heuristic scheduling scheme that maximizes the cell sum capacity subject to a fixed number of reconfigurations within a TDMA frame. First, the best configuration for each user equipment (UE), in terms of joint beamforming and optimal IRS configuration, is determined using an iterative algorithm. Then, we propose distance- and capacity-based clustering techniques to divide the UEs into subsets sharing the same sub-optimal IRS configuration. Finally, UEs within the same cluster are scheduled accordingly. We provide extensive numerical results for different propagation scenarios, IRS sizes, and phase-shifter quantization constraints, showing the effectiveness of our approach in supporting multi-user IRS systems under practical constraints.
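As a toy illustration of the distance-based clustering step (a simplification of ours, not the paper's algorithm), the Python sketch below groups UEs by position with k-means so that each cluster shares one IRS configuration, capping the number of reconfigurations per TDMA frame.

```python
import numpy as np
from sklearn.cluster import KMeans

def schedule_with_limited_reconfigs(ue_positions, n_reconfigs, rng_seed=0):
    """Toy distance-based variant of the clustering step: group UEs by
    position so that each cluster shares one IRS configuration, capping
    reconfigurations per TDMA frame at `n_reconfigs`.  Returns cluster
    labels and a cluster-by-cluster scheduling order."""
    km = KMeans(n_clusters=n_reconfigs, n_init=10, random_state=rng_seed)
    labels = km.fit_predict(ue_positions)
    # Schedule UEs cluster by cluster: the IRS is reconfigured only when
    # the scheduler moves from one cluster to the next.
    order = np.argsort(labels)
    return labels, order

ues = np.random.default_rng(0).uniform(0, 100, size=(16, 2))  # 16 UEs in a cell
labels, order = schedule_with_limited_reconfigs(ues, n_reconfigs=4)
```

The capacity-based variant mentioned in the abstract would replace the Euclidean distance with a capacity-driven similarity between per-UE IRS configurations; it is omitted here for brevity.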