In this paper we study the variational method and integral equation methods for a conical diffraction problem for imperfectly conducting gratings modeled by the impedance boundary value problem of the Helmholtz equation in periodic structures. We justify the strong ellipticity of the sesquilinear form corresponding to the variational formulation and prove the uniqueness of solutions at any frequency. Convergence of the finite element method using the transparent boundary condition (Dirichlet-to-Neumann mapping) is verified. The boundary integral equation method is also discussed.
Bilevel optimization reveals the inner structure of otherwise opaque optimization problems, such as hyperparameter tuning and meta-learning. A common goal in bilevel optimization is to find stationary points of the hyper-objective function. Although this hyper-objective approach is widely used, its theoretical properties have not been thoroughly investigated in cases where the lower-level functions lack strong convexity. In this work, we take a step forward and study the hyper-objective approach without the typical lower-level strong convexity assumption. Our hardness results show that the hyper-objective of general convex lower-level functions can be intractable either to evaluate or to optimize. To tackle this challenge, we introduce the gradient dominant condition, which strictly relaxes the strong convexity assumption by allowing the lower-level solution set to be non-singleton. Under the gradient dominant condition, we propose the Inexact Gradient-Free Method (IGFM), which uses the Switching Gradient Method (SGM) as the zeroth-order oracle, to find an approximate stationary point of the hyper-objective. We also extend our results to nonsmooth lower-level functions under the weak sharp minimum condition.
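To make the notion of a gradient-free (zeroth-order) method concrete, the following is a minimal sketch of a generic two-point gradient estimator based on random directions on the unit sphere. It is not the IGFM or SGM of the abstract: the function names, the smoothing radius `delta`, and the averaging helper are all hypothetical choices made for illustration only.

```python
import math
import random


def sample_unit_sphere(dim, rng):
    """Draw a direction uniformly from the unit sphere via normalized Gaussians."""
    while True:
        v = [rng.gauss(0.0, 1.0) for _ in range(dim)]
        norm = math.sqrt(sum(c * c for c in v))
        if norm > 0.0:
            return [c / norm for c in v]


def two_point_gradient_estimate(f, x, delta, rng):
    """Return one two-point zeroth-order gradient estimate of f at x.

    Only function values of f are used; no derivative is evaluated.
    """
    dim = len(x)
    u = sample_unit_sphere(dim, rng)
    x_plus = [xi + delta * ui for xi, ui in zip(x, u)]
    x_minus = [xi - delta * ui for xi, ui in zip(x, u)]
    # Central difference along u, rescaled by the dimension.
    scale = dim * (f(x_plus) - f(x_minus)) / (2.0 * delta)
    return [scale * ui for ui in u]


def averaged_estimate(f, x, delta, num_samples, seed=0):
    """Average several independent estimates to reduce the variance."""
    rng = random.Random(seed)
    total = [0.0] * len(x)
    for _ in range(num_samples):
        g = two_point_gradient_estimate(f, x, delta, rng)
        total = [t + gi for t, gi in zip(total, g)]
    return [t / num_samples for t in total]


if __name__ == "__main__":
    # Hypothetical smooth test function; its true gradient at x is 2 * x.
    def quadratic(x):
        return sum(xi * xi for xi in x)

    print(averaged_estimate(quadratic, [1.0, -0.5], 1e-3, 5000))
```

A single estimate is noisy, so in practice such estimators are averaged or combined with a small step size; the sketch only illustrates how a stationarity search can proceed from function values alone.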
Aiming to improve Automatic Speech Recognition (ASR) outputs with a post-processing step, ASR error correction (EC) techniques have been widely developed due to their efficiency in using parallel text data. Previous works mainly focus on using text and/or speech data, which limits the performance gain when not only text and speech information but also other modalities, such as visual information, are critical for EC. The challenges are mainly twofold: first, previous work fails to emphasize visual information, so it has rarely been explored; second, the community lacks a high-quality benchmark where visual information matters for EC models. Therefore, this paper provides 1) simple yet effective methods, namely gated fusion and image captions as prompts, to incorporate visual information to help EC; 2) large-scale benchmark datasets, namely Visual-ASR-EC, where each item in the training data consists of visual, speech, and text information, and the test data are carefully selected by human annotators to ensure that even humans could make mistakes when visual information is missing. Experimental results show that using captions as prompts can effectively exploit the visual information and surpass state-of-the-art methods by up to 1.2% in Word Error Rate (WER), which also indicates that visual information is critical in our proposed Visual-ASR-EC dataset.
In this paper, a computational method is developed to find an approximate solution of the stochastic Volterra-Fredholm integral equation using the Walsh function approximation and its operational matrix. Moreover, convergence and error analyses of the method are carried out to support its validity. Furthermore, the method is numerically compared to the block pulse function method and the Haar wavelet method for some non-trivial examples.
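As an illustration of Walsh function approximation in general, the sketch below expands a deterministic function on $[0,1)$ in the first $m = 2^k$ Walsh functions, taken in Hadamard (natural) ordering from the Sylvester construction. It covers neither the stochastic integral equation nor the operational matrix of the abstract. The function names are hypothetical, and midpoint samples stand in for exact cell averages as a simplifying assumption.

```python
def hadamard_matrix(k):
    """Sylvester construction of the 2**k x 2**k Hadamard matrix.

    Row i holds the values of the i-th Walsh function (Hadamard ordering)
    on the 2**k equal subintervals of [0, 1).
    """
    h = [[1]]
    for _ in range(k):
        top = [row + row for row in h]
        bottom = [row + [-v for v in row] for row in h]
        h = top + bottom
    return h


def walsh_coefficients(f, k):
    """Approximate Walsh coefficients of f on [0, 1) from midpoint samples."""
    m = 2 ** k
    h = hadamard_matrix(k)
    samples = [f((j + 0.5) / m) for j in range(m)]
    return [sum(h[i][j] * samples[j] for j in range(m)) / m for i in range(m)]


def walsh_evaluate(coeffs, t):
    """Evaluate the piecewise-constant Walsh expansion at t in [0, 1)."""
    m = len(coeffs)
    k = m.bit_length() - 1
    h = hadamard_matrix(k)
    j = min(int(t * m), m - 1)  # index of the subinterval containing t
    return sum(coeffs[i] * h[i][j] for i in range(m))


if __name__ == "__main__":
    coeffs = walsh_coefficients(lambda t: t * t, 3)
    print(walsh_evaluate(coeffs, 0.3))
```

Because the rows of the Hadamard matrix are mutually orthogonal, the full expansion reproduces the midpoint samples exactly; the approximation error comes only from replacing the function by a piecewise-constant one.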
This paper introduces the application of the weak Galerkin (WG) finite element method to solve the Steklov eigenvalue problem, focusing on obtaining lower bounds of the eigenvalues. The nonconforming finite element space of the weak Galerkin finite element method is the key to obtaining lower bounds of the eigenvalues. Arbitrarily high-order lower bound estimates are given, and guaranteed lower bounds of the eigenvalues are also discussed. Numerical results demonstrate the accuracy and lower bound property of the numerical scheme.
The logistic regression estimator is known to inflate the magnitude of its coefficients if the sample size $n$ is small, the dimension $p$ is (moderately) large, or the signal-to-noise ratio $1/\sigma$ is large (probabilities of observing a label are close to 0 or 1). With this in mind, we study the logistic regression estimator with $p\ll n/\log n$, assuming Gaussian covariates and labels generated by the Gaussian link function, with a mild optimization constraint on the estimator's length to ensure existence. We provide finite sample guarantees for its direction, which serves as a classifier, and its Euclidean norm, which is an estimator for the signal-to-noise ratio. We distinguish between two regimes. In the low-noise/small-sample regime ($n\sigma\lesssim p\log n$), we show that the estimator's direction (and consequently the classification error) achieves the rate $(p\log n)/n$, as if the problem were noiseless. In this case, the norm of the estimator is at least of order $n/(p\log n)$. If instead $n\sigma\gtrsim p\log n$, the estimator's direction achieves the rate $\sqrt{\sigma p\log n/n}$, whereas its norm converges to the true norm at the rate $\sqrt{p\log n/(n\sigma^3)}$. As a corollary, the data are not linearly separable with high probability in this regime. The logistic regression estimator allows one to conclude which regime occurs with high probability. Therefore, inference for logistic regression is possible in the regime $n\sigma\gtrsim p\log n$. In either case, logistic regression provides a competitive classifier.
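The coefficient inflation mentioned above can be seen in a toy setting. The sketch below runs plain gradient ascent on the logistic log-likelihood for a hypothetical one-dimensional, linearly separable data set without intercept. On such data the unconstrained maximum likelihood estimator does not exist, so the fitted coefficient keeps growing with the number of iterations. The data, step size, and function names are illustrative assumptions and are not taken from the paper, which works with Gaussian covariates and a length constraint.

```python
import math


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def fit_logistic_1d(xs, ys, steps, lr=0.5):
    """Gradient ascent on the 1-D logistic log-likelihood (no intercept)."""
    w = 0.0
    for _ in range(steps):
        # Average gradient of the log-likelihood with respect to w.
        grad = sum((y - sigmoid(w * x)) * x for x, y in zip(xs, ys)) / len(xs)
        w += lr * grad
    return w


if __name__ == "__main__":
    # Separable toy data: negative inputs have label 0, positive inputs label 1.
    xs = [-2.0, -1.0, 1.0, 2.0]
    ys = [0, 0, 1, 1]
    for steps in (10, 100, 1000):
        print(steps, fit_logistic_1d(xs, ys, steps))
```

The direction of the fitted coefficient (here, its sign) stabilizes immediately, while its magnitude does not, which is why a constraint on the estimator's length is needed for the estimator to be well defined on separable samples.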
Generative flow networks (GFlowNets) are amortized variational inference algorithms that are trained to sample from unnormalized target distributions over compositional objects. A key limitation of GFlowNets until now has been that they are restricted to discrete spaces. We present a theory for generalized GFlowNets, which encompasses both existing discrete GFlowNets and ones with continuous or hybrid state spaces, and perform experiments with two goals in mind. First, we illustrate critical points of the theory and the importance of various assumptions. Second, we empirically demonstrate how observations about discrete GFlowNets transfer to the continuous case and show strong results compared to non-GFlowNet baselines on several previously studied tasks. This work greatly widens the perspectives for the application of GFlowNets in probabilistic inference and various modeling settings.
This study builds on a recent paper by Lai et al. [Appl. Comput. Harmon. Anal., 2018] in which a novel boundary integral formulation is presented for scalar wave scattering analysis in two-dimensional layered and half-spaces. The seminal paper proposes a hybrid integral representation that combines the Sommerfeld integral and layer potential to efficiently deal with boundaries of infinite length. In this work, we modify the integral formulation to eliminate the fictitious eigenvalues by employing Burton-Miller's approach. We also discuss reasonable parameter settings for the hybrid integral equation to ensure efficient and accurate numerical solutions. Furthermore, we extend the modified formulation to scattering from a cavity in a half-space whose boundary is locally perturbed. To address the cavity scattering, we introduce a virtual boundary enclosing the cavity and couple the integral equation on it with the hybrid equation. The effectiveness of the proposed method is demonstrated through numerical examples.
Quantum error-correcting codes (QECCs) can eliminate the negative effects of quantum noise, the major obstacle to the execution of quantum algorithms. However, realizing practical quantum error correction (QEC) requires resolving many challenges to implement a high-performance real-time decoding system. Many decoding algorithms have been proposed and optimized in the past few decades, of which neural network (NN)-based solutions have drawn an increasing amount of attention due to their high efficiency. Unfortunately, previous works on neural decoders are still at an early stage and have only relatively simple architectures, which makes them unsuitable for practical QEC. In this work, we propose a scalable, fast, and programmable neural decoding system to meet the requirements of fault-tolerant QEC (FTQEC) for rotated surface codes (RSC). Firstly, we propose a hardware-efficient NN decoding algorithm with relatively low complexity and high accuracy. Secondly, we develop a customized hardware decoder with architectural optimizations to reduce latency. Thirdly, our proposed programmable architecture boosts the scalability and flexibility of the decoder by maximizing parallelism. Fourthly, we build an FPGA-based decoding system with integrated control hardware for evaluation. Our $L=5$ decoder (where $L$ is the code distance) achieves an extremely low decoding latency of 197 ns, and the $L=7$ configuration requires only 1.136 $\mu$s, both taking $2L$ rounds of syndrome measurements. The accuracy of our system is close to that of minimum weight perfect matching (MWPM). Furthermore, our programmable architecture reduces hardware resource consumption by up to $3.0\times$ with only a small latency loss. We validated our approach in real-world scenarios by conducting a proof-of-concept benchmark with practical noise models, including one derived from experimental data gathered from physical hardware.
This paper proposes and analyzes a novel efficient high-order finite volume method for the ideal magnetohydrodynamics (MHD). As a distinctive feature, the method simultaneously preserves a discretely divergence-free (DDF) constraint on the magnetic field and the positivity-preserving (PP) property, which ensures the positivity of density, pressure, and internal energy. To enforce the DDF condition, we design a new discrete projection approach that projects the reconstructed point values at the cell interface into a DDF space, without using any approximation polynomials. This projection method is highly efficient, easy to implement, and particularly suitable for standard high-order finite volume WENO methods, which typically return only the point values in the reconstruction. Moreover, we also develop a new finite volume framework for constructing provably PP schemes for the ideal MHD system. The framework comprises the discrete projection technique, a suitable approximation to the Godunov--Powell source terms, and a simple PP limiter. We provide rigorous analysis of the PP property of the proposed finite volume method, demonstrating that the DDF condition and the proper approximation to the source terms eliminate the impact of magnetic divergence terms on the PP property. The analysis is challenging due to the internal energy function's nonlinearity and the intricate relationship between the DDF and PP properties. To address these challenges, the recently developed geometric quasilinearization approach is adopted, which transforms a nonlinear constraint into a family of linear constraints. Finally, we validate the effectiveness of the proposed method through several benchmark and demanding numerical examples. The results demonstrate that the proposed method is robust, accurate, and highly effective, confirming the significance of the proposed DDF projection and PP techniques.
It is well known that the Cauchy problem for the Laplace equation is ill-posed in Hadamard's sense. Small deviations in the Cauchy data may lead to large errors in the solutions. It is observed that if a bound is imposed on the solution, there exists a conditional stability estimate. This gives a reasonable way to construct stable algorithms. However, it is impossible to have good results at all points in the domain. Although numerical methods for Cauchy problems for Laplace equations have been widely studied for quite a long time, there are still some unclear points, for example, how to evaluate the numerical solutions, that is, whether we can approximate the Cauchy data well while keeping the bound on the solution, and at which points the numerical results are reliable. In this paper, we prove a conditional stability estimate that is quantitatively related to harmonic measures. The harmonic measure can be used as an indicator function to evaluate the numerical result pointwise, which further enables us to find a reliable subdomain where the local convergence rate is higher than a certain order.
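The ill-posedness referred to above is illustrated by Hadamard's classical example, stated here for the half-plane $y>0$ with Cauchy data on $y=0$. The example is standard background and is not taken from the paper; the index $n$ and the functions $u_n$ are notation introduced only for this illustration.

```latex
\[
  \Delta u_n = 0, \qquad
  u_n(x,0) = 0, \qquad
  \partial_y u_n(x,0) = \frac{\sin(nx)}{n},
  \qquad\text{with solution}\qquad
  u_n(x,y) = \frac{\sin(nx)\,\sinh(ny)}{n^{2}} .
\]
\[
  \sup_x \bigl|\partial_y u_n(x,0)\bigr| = \frac{1}{n} \to 0,
  \qquad\text{but}\qquad
  \sup_x \bigl|u_n(x,y)\bigr| = \frac{\sinh(ny)}{n^{2}} \to \infty
  \quad\text{for every fixed } y>0 .
\]
```

The Cauchy data tend to zero uniformly while the solutions blow up at any positive height, so the solution does not depend continuously on the data. Imposing an a priori bound on the solution excludes such sequences, which is the mechanism behind conditional stability estimates.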