This work proposes a fast iterative method for local steric Poisson--Boltzmann (PB) theories, in which the electrostatic potential is governed by Poisson's equation and the ionic concentrations satisfy equilibrium conditions. To present the method, we focus as an example on a local steric PB theory derived from a lattice-gas model. The efficiency of the proposed method stems from treating the ionic concentrations as scalar implicit functions of the electrostatic potential, even though such functions are available only numerically. The existence, uniqueness, boundedness, and smoothness of these functions are rigorously established. A Newton iteration method with truncation is proposed to solve the nonlinear system discretized from the generalized PB equations. The existence and uniqueness of the solution to the discretized nonlinear system are established by showing that it is the unique minimizer of a constructed convex energy. Thanks to the boundedness of the ionic concentrations, truncation bounds for the potential are obtained from the extremum principle. The truncation step in the iterations is shown to be both energy and error decreasing. To further speed up computations, we propose a novel precomputing-interpolation strategy, which is applicable to other local steric PB theories and makes the proposed methods for steric PB theories as efficient as solvers for the classical PB theory. Analysis of the Newton iteration method with truncation establishes local quadratic convergence of the proposed numerical methods. Applications to realistic biomolecular solvation systems reveal that counterions with steric hindrance stratify in an order prescribed by their ionic valence-to-volume ratios. Finally, we remark that the proposed iterative methods for local steric PB theories can be readily incorporated into well-known classical PB solvers.
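Although the abstract does not specify the discretization, the truncated Newton idea can be sketched generically. The following Python fragment is a minimal sketch, assuming a hypothetical residual `F`, Jacobian `J`, and a priori potential bounds `phi_min`/`phi_max`; in the paper such bounds come from the extremum principle, and the actual solver operates on the specific discretized steric PB system.

```python
import numpy as np

def truncated_newton(F, J, phi0, phi_min, phi_max, tol=1e-10, max_iter=50):
    """Newton iteration with truncation: after every Newton update, the
    potential is clamped into a priori bounds [phi_min, phi_max]
    (hypothetical sketch; the paper derives these bounds from the
    extremum principle and shows truncation is energy/error decreasing)."""
    phi = np.clip(phi0, phi_min, phi_max)
    for _ in range(max_iter):
        r = F(phi)                                  # nonlinear residual
        if np.linalg.norm(r) < tol:
            break
        phi = phi + np.linalg.solve(J(phi), -r)     # Newton step
        phi = np.clip(phi, phi_min, phi_max)        # truncation step
    return phi
```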
We consider an optimal control problem constrained by a parabolic partial differential equation (PDE) with Robin boundary conditions. We use a well-posed space-time variational formulation in Lebesgue--Bochner spaces with minimal regularity. The abstract formulation of the optimal control problem yields the Lagrange function and the Karush--Kuhn--Tucker (KKT) conditions in a natural manner, resulting in space-time variational formulations of the adjoint and gradient equations in Lebesgue--Bochner spaces with minimal regularity. Necessary and sufficient optimality conditions are formulated, and the optimality system is shown to be well-posed. Next, we introduce a conforming, uniformly stable, simultaneous space-time (tensor-product) discretization of the optimality system in these Lebesgue--Bochner spaces. Using finite elements of appropriate orders in space and time for trial and test spaces, this setting is known to be equivalent to a Crank--Nicolson time-stepping scheme for parabolic problems. Differences from existing methods are detailed, and we show numerical comparisons with time-stepping methods. The space-time method shows good stability properties and requires fewer degrees of freedom in time to reach the same accuracy.
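As a point of reference (an illustrative model instance, not the paper's exact functional-analytic setting), a standard parabolic tracking problem with Robin boundary control and its KKT system read as follows:

```latex
% Illustrative model problem (the paper's precise spaces differ):
\min_{u}\ J(y,u)=\tfrac12\,\|y-y_d\|_{L^2(0,T;L^2(\Omega))}^2
               +\tfrac{\lambda}{2}\,\|u\|_{L^2(0,T;L^2(\partial\Omega))}^2
\quad\text{s.t.}\quad
\partial_t y-\Delta y=f\ \text{in }(0,T)\times\Omega,\qquad
\partial_n y+\alpha y=u\ \text{on }(0,T)\times\partial\Omega,\qquad
y(0)=y_0.
% The KKT conditions couple the state equation with the adjoint and
% gradient equations:
-\partial_t p-\Delta p=y-y_d,\qquad \partial_n p+\alpha p=0,\qquad p(T)=0,
\qquad \lambda u+p\big|_{(0,T)\times\partial\Omega}=0.
```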
Test error estimation is a fundamental problem in statistics and machine learning. Correctly assessing the future performance of an algorithm is an essential task, especially with the development of complex predictive algorithms that require data-driven parameter tuning. We propose a new coupled bootstrap estimator for the test error of Poisson-response algorithms, a fundamental model for count data with applications in signal processing, density estimation, and queueing theory. The idea behind our estimator is to generate two carefully designed new random vectors from the original data, one acting as a training sample and the other as a test set. The estimator is unbiased for an intuitive parameter: the out-of-sample error of a Poisson random vector whose mean has been shrunken by a small factor. Moreover, in a limiting regime, the coupled bootstrap estimator recovers an exactly unbiased estimator of the test error. Our framework is applicable to loss functions in the Bregman divergence family, and our analysis and examples focus on two important cases: Poisson likelihood deviance and squared loss. Through a bias-variance decomposition, we analyze the effect of the number of bootstrap samples and of the noise added through the two auxiliary vectors. We then apply our method to different scenarios with both simulated and real data.
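One standard construction consistent with the abstract's description is binomial thinning of a Poisson vector, which splits the data into two independent Poisson vectors, one with a shrunken mean; the paper's exact estimator may differ in its details. A minimal sketch:

```python
import numpy as np

rng = np.random.default_rng(0)

def poisson_thin(X, omega, rng):
    """Split X ~ Poi(lam) into two independent Poisson vectors:
    X_train ~ Poi(lam/(1+omega)) (mean shrunken by a small factor) and
    X_test ~ Poi(lam*omega/(1+omega)) -- a standard thinning identity."""
    X_train = rng.binomial(X, 1.0 / (1.0 + omega))
    X_test = X - X_train
    return X_train, X_test

# toy check of the construction
lam = np.full(1000, 5.0)
X = rng.poisson(lam)
Xtr, Xte = poisson_thin(X, omega=0.1, rng=rng)
print(Xtr.mean(), lam.mean() / 1.1)   # approximately equal
```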
Referred to as the third rung of the causal inference ladder, counterfactual queries typically ask the "What if ?" question retrospectively. The standard approach to estimate counterfactuals resides in using a structural equation model that accurately reflects the underlying data generating process. However, such models are seldom available in practice and one usually wishes to infer them from observational data alone. Unfortunately, the correct structural equation model is in general not identifiable from the observed factual distribution. Nevertheless, in this work, we show that under the assumption that the main latent contributors to the treatment responses are categorical, the counterfactuals can be still reliably predicted. Building upon this assumption, we introduce CounterFactual Query Prediction (CFQP), a novel method to infer counterfactuals from continuous observations when the background variables are categorical. We show that our method significantly outperforms previously available deep-learning-based counterfactual methods, both theoretically and empirically on time series and image data. Our code is available at //github.com/edebrouwer/cfqp.
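The following toy sketch illustrates the general flavor of exploiting a categorical background variable, not the CFQP method itself: it clusters residuals of a pooled outcome model to recover a hidden category, fits per-category models, and answers a retrospective query within the inferred category. All names and the construction are hypothetical.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.linear_model import LinearRegression

# Toy sketch only (invented construction, not the paper's model).
rng = np.random.default_rng(0)
n = 600
X = rng.normal(size=(n, 2))             # observed covariates
t = rng.integers(0, 2, size=n)          # binary treatment
z_true = rng.integers(0, 3, size=n)     # hidden categorical background
y = X @ np.array([1.0, -0.5]) + 2.0 * t * (z_true - 1) + 0.1 * rng.normal(size=n)

Xt = np.column_stack([X, t])
resid = y - LinearRegression().fit(Xt, y).predict(Xt)
z_hat = KMeans(n_clusters=3, n_init=10, random_state=0).fit_predict(resid[:, None])
models = {k: LinearRegression().fit(Xt[z_hat == k], y[z_hat == k]) for k in range(3)}

def counterfactual(i, t_cf):
    """'What if unit i had received t_cf?', conditioning on its inferred class."""
    return models[z_hat[i]].predict(np.column_stack([X[i:i+1], [t_cf]]))[0]

print(counterfactual(0, 1 - t[0]))
```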
The ever-growing size of modern space-time data sets, such as those collected by remote sensing, requires new techniques for their efficient and automated processing, including the gap-filling of missing values. CUDA-based parallelization on GPUs has become a popular way to dramatically increase the computational efficiency of various approaches. Recently, we proposed a computationally efficient and competitive, yet simple, spatial prediction approach inspired by statistical physics models, called the modified planar rotator (MPR) method. Its GPU implementation provided an additional impressive acceleration, exceeding two orders of magnitude in comparison with CPU calculations. In the current study, we propose a rather general approach to modelling spatial heterogeneity in GPU-implemented spatial prediction methods for two-dimensional gridded data by introducing spatial variability into the model parameters. Predictions of unknown values are obtained from non-equilibrium conditional simulations, assuming ``local'' equilibrium conditions. We demonstrate that the proposed method leads to significant improvements in both prediction performance and computational efficiency.
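A toy CPU version conveys the spirit of MPR-style gap filling: observed values are mapped to planar spin angles, and missing sites are relaxed toward a ``local'' equilibrium of the nearest-neighbor cosine interaction. This is a simplified sketch with periodic boundaries and a zero-temperature update; the actual MPR method uses its own value-to-angle transform, conditional Monte Carlo dynamics, and GPU parallelization.

```python
import numpy as np

def mpr_like_fill(grid, mask, n_sweeps=200):
    """Fill missing sites (mask == False) of a 2D grid by relaxing
    planar spin angles toward the circular mean of their 4 neighbors."""
    rng = np.random.default_rng(0)
    lo, hi = grid[mask].min(), grid[mask].max()
    theta = np.where(mask, 2 * np.pi * (grid - lo) / (hi - lo),
                     rng.uniform(0, 2 * np.pi, grid.shape))
    miss = ~mask
    for _ in range(n_sweeps):
        # zero-temperature move: align each missing spin with the
        # circular mean of its neighbors (energy decreasing)
        cx = sum(np.cos(np.roll(theta, s, ax)) for s in (1, -1) for ax in (0, 1))
        sx = sum(np.sin(np.roll(theta, s, ax)) for s in (1, -1) for ax in (0, 1))
        theta[miss] = np.mod(np.arctan2(sx, cx), 2 * np.pi)[miss]
    return np.where(mask, grid, lo + (hi - lo) * theta / (2 * np.pi))
```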
A primary objective of news articles is to establish the factual record for an event, frequently achieved by conveying both the details of the specified event (i.e., the 5 Ws: Who, What, Where, When, and Why) and how people reacted to it (i.e., reported statements). However, existing work on news summarization almost exclusively focuses on the event details. In this work, we propose the novel task of summarizing the reactions of different speakers, as expressed by their reported statements, to a given event. To this end, we create a new multi-document summarization benchmark, SUMREN, comprising 745 summaries of reported statements from various public figures obtained from 633 news articles discussing 132 events. We propose an automatic silver-training-data generation approach for our task, which helps smaller models such as BART achieve GPT-3-level performance on this task. Finally, we introduce a pipeline-based framework for summarizing reported speech, which we empirically show generates summaries that are more abstractive and factual than those of baseline query-focused summarization approaches.
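A crude two-stage sketch in the spirit of "extract reported statements, then summarize them" is shown below; the paper's pipeline and models are more sophisticated, and the attribution step here is a naive placeholder.

```python
import re
from transformers import pipeline

# Off-the-shelf abstractive summarizer (illustrative model choice)
summarizer = pipeline("summarization", model="facebook/bart-large-cnn")

def summarize_reactions(articles, speaker):
    # Stage 1 (toy attribution): keep sentences that mention the speaker
    # and contain quoted material.
    statements = [s for a in articles for s in re.split(r"(?<=[.!?])\s+", a)
                  if speaker in s and '"' in s]
    # Stage 2: abstractive summarization of the collected statements.
    return summarizer(" ".join(statements),
                      max_length=60, min_length=10)[0]["summary_text"]
```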
Learning precoding policies with neural networks enables low-complexity online implementation, robustness to channel impairments, and joint optimization with channel acquisition. However, existing neural networks suffer from high training complexity and poor generalization ability when used to learn precoding policies for mitigating multi-user interference. This impedes their use in practical systems where the number of users is time-varying. In this paper, we propose a graph neural network (GNN) that learns precoding policies by harnessing both the mathematical model and the properties of the policies. We first show that a vanilla GNN cannot learn the pseudo-inverse of the channel matrix well when the numbers of antennas and users are large, and does not generalize to unseen numbers of users. Then, we design a GNN based on the Taylor expansion of the matrix pseudo-inverse, which captures the importance of the neighboring edges to be aggregated, a property crucial for learning precoding policies efficiently. Simulation results show that the proposed GNN learns spectrally efficient and energy-efficient precoding policies in single- and multi-cell multi-user multi-antenna systems with low training complexity, and generalizes well to unseen numbers of users.
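The expansion motivating the GNN design can be illustrated numerically (this is the series approximation itself, not the GNN): the zero-forcing precoder $W = H^H (H H^H)^{-1}$ with the inverse replaced by a truncated Neumann (Taylor) series.

```python
import numpy as np

def zf_precoder_neumann(H, order=8):
    """ZF precoder W = H^H (H H^H)^{-1}, with the inverse replaced by a
    truncated Neumann (Taylor) series: A^{-1} = alpha * sum_k (I - alpha*A)^k."""
    A = H @ H.conj().T
    lam = np.linalg.eigvalsh(A)
    alpha = 2.0 / (lam[0] + lam[-1])          # ensures rho(I - alpha*A) < 1
    K = A.shape[0]
    term = np.eye(K, dtype=A.dtype)
    inv_approx = np.eye(K, dtype=A.dtype)
    for _ in range(order):
        term = term @ (np.eye(K) - alpha * A)
        inv_approx += term
    return H.conj().T @ (alpha * inv_approx)

rng = np.random.default_rng(0)
H = (rng.normal(size=(4, 16)) + 1j * rng.normal(size=(4, 16))) / np.sqrt(2)
print(np.linalg.norm(H @ zf_precoder_neumann(H) - np.eye(4)))  # shrinks as order grows
```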
Quantum machine learning has become an area of growing interest but has certain theoretical and hardware-specific limitations. Notably, the problem of vanishing gradients, or barren plateaus, renders training impossible for circuits with high qubit counts, imposing a limit on the number of qubits that data scientists can use for solving problems. Independently, angle-embedded supervised quantum neural networks have been shown to produce truncated Fourier series whose degree depends on two factors: the depth of the encoding and the number of parallel qubits the encoding is applied to. The degree of the Fourier series limits the model's expressivity. This work introduces two new architectures whose Fourier degrees grow exponentially: the sequential and parallel exponential quantum machine learning architectures. This is achieved by efficiently using the available Hilbert space when encoding, increasing the expressivity of the quantum encoding. The exponential growth therefore allows one to stay in the low-qubit regime while creating highly expressive circuits that avoid barren plateaus. In practice, the parallel exponential architecture was shown to outperform existing linear architectures, reducing the final mean squared error by up to 44.7% on a one-dimensional test problem. Furthermore, the feasibility of the technique was demonstrated on a trapped-ion quantum processing unit.
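A common route to an exponentially growing Fourier degree is to scale the encoding angle differently on each qubit (e.g., by powers of 3); whether this matches the paper's exact architectures is an assumption, but the counting argument is easy to verify: with single-axis Pauli encodings, the reachable frequencies are the integer combinations $\sum_k s_k \omega_k$ with $s_k \in \{-1,0,1\}$.

```python
from itertools import product

def accessible_degrees(scalings):
    """Fourier frequencies reachable by single-axis Pauli encodings with
    per-qubit data scalings w_k: all sums sum_k s_k*w_k, s_k in {-1,0,+1}."""
    return sorted({sum(s * w for s, w in zip(signs, scalings))
                   for signs in product((-1, 0, 1), repeat=len(scalings))})

n = 4
linear = accessible_degrees([1] * n)                  # plain parallel encoding
exponential = accessible_degrees([3**k for k in range(n)])
print(max(linear), max(exponential))                  # 4 vs (3**4 - 1) // 2 == 40
```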
A simultaneously transmitting and reflecting surface (STARS) aided terahertz (THz) communication system is proposed. A novel power consumption model that depends on the type and resolution of the individual elements is proposed for the STARS. Then, the system energy efficiency (EE) and spectral efficiency (SE) are maximized in both narrowband and wideband THz systems. 1) For the narrowband system, an iterative algorithm based on penalty dual decomposition is proposed to jointly optimize the hybrid beamforming at the base station (BS) and the independent phase-shift coefficients at the STARS; the algorithm is then extended to the coupled phase-shift STARS. 2) For the wideband system, to eliminate the beam-split effect, a time-delay (TD) network implemented with true-time delayers is applied in the hybrid beamforming structure, and an iterative algorithm based on the quasi-Newton method is proposed to design the coefficients of the TD network. Finally, our numerical results reveal that i) the coupled phase shifts of the STARS cause only a slight EE and SE performance loss in both narrowband and wideband systems, and ii) conventional hybrid beamforming achieves EE and SE performance close to that of full-digital beamforming in the narrowband system, but not in the wideband system, where the TD-based hybrid beamforming is more efficient.
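The standard wideband motivation for the TD network (illustrative, not the paper's derivation) is that an analog phase shifter applies a frequency-flat phase while a true-time delayer applies a frequency-proportional one:

```latex
w_n^{\mathrm{PS}}=e^{j\theta_n}
\qquad\text{vs.}\qquad
w_n^{\mathrm{TD}}(f)=e^{-j2\pi f\tau_n},
\qquad
\tau_n=\frac{n\,d\sin\varphi}{c},
% With delays tau_n, the steering phase of element n toward angle phi is
% matched at every frequency f across the band; frequency-flat phase
% shifts cannot do this, which causes the beam-split effect.
```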
Much recent work in task-oriented parsing has focused on finding a middle ground between flat slots and intents, which are inexpressive but easy to annotate, and powerful representations such as the lambda calculus, which are expressive but costly to annotate. This paper continues the exploration of task-oriented parsing by introducing a new dataset for parsing pizza and drink orders, whose semantics cannot be captured by flat slots and intents. We perform an extensive evaluation of deep-learning techniques for task-oriented parsing on this dataset, including different flavors of seq2seq systems and RNNGs. The dataset comes in two main versions, one in a recently introduced utterance-level hierarchical notation that we call TOP, and one whose targets are executable representations (EXR). We demonstrate empirically that training the parser to directly generate EXR notation not only solves the problem of entity resolution in one fell swoop and overcomes a number of expressive limitations of TOP notation, but also results in significantly greater parsing accuracy.
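To make the two target notations concrete, here is a hypothetical example (invented for illustration, not taken from the dataset) of one utterance in a TOP-style hierarchical target and an EXR-style executable target:

```python
utterance = "two large pizzas with ham but no onions"

# Hypothetical TOP-style hierarchical, utterance-level target:
top = "(ORDER (PIZZAORDER (NUMBER two) (SIZE large) (TOPPING ham) (NOT (TOPPING onions))))"

# Hypothetical EXR-style executable target with resolved entities
# (numbers and catalog entries normalized, so no separate
# entity-resolution step is needed):
exr = "(ORDER (PIZZAORDER (NUMBER 2) (SIZE LARGE) (TOPPING HAM) (NOT (TOPPING ONIONS))))"
```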
Two combined numerical methods for solving time-varying semilinear differential-algebraic equations (DAEs) are obtained. Such equations are also called degenerate DEs, descriptor systems, operator-differential equations, and DEs on manifolds. The convergence and correctness of the methods are proved. In constructing the methods we use, in particular, time-varying spectral projectors that can be computed numerically. This makes it possible to solve and analyze the considered DAEs numerically in their original form, without additional analytical transformations. To improve the accuracy of the second method, a recalculation (``predictor-corrector'') scheme is used. Note that the developed methods are applicable to DAEs whose continuous nonlinear part may not be continuously differentiable in $t$, and that restrictions of global-Lipschitz type, including the global contractivity condition, are not used in the theorems on the global solvability of the DAEs and on the convergence of the numerical methods. This allows the developed methods to be used for the numerical solution of more general classes of mathematical models; for example, the functions of currents and voltages in electric circuits may be nondifferentiable or may be approximated by nondifferentiable functions. The presented conditions for the global solvability of the DAEs ensure the existence of a unique exact global solution of the corresponding initial value problem, which enables the computation of approximate solutions on any given time interval (provided that the conditions of the theorems or remarks on the convergence of the methods are fulfilled). The paper also carries out a numerical analysis of the mathematical model of a certain electrical circuit, which demonstrates the application of the presented theorems and numerical methods.
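A predictor-corrector step can be sketched for the simpler semi-explicit case $x' = f(t,x,y)$, $0 = g(t,x,y)$ (a narrower setting than the paper's general time-varying DAEs with spectral projectors; all names here are illustrative):

```python
import numpy as np
from scipy.optimize import fsolve

def pc_step(f, g, t, x, y, h):
    """One predictor-corrector step for x' = f(t,x,y), 0 = g(t,x,y).
    Predictor: explicit Euler + constraint solve; corrector ("recalculation"):
    trapezoidal rule on the differential part, then re-enforce the constraint."""
    xp = x + h * f(t, x, y)                              # predict x
    yp = fsolve(lambda yy: g(t + h, xp, yy), y)          # enforce constraint
    xc = x + 0.5 * h * (f(t, x, y) + f(t + h, xp, yp))   # correct x
    yc = fsolve(lambda yy: g(t + h, xc, yy), y)          # re-enforce constraint
    return xc, yc

# Example: x' = -x + y, 0 = y - sin(t); the algebraic variable is y = sin(t)
f = lambda t, x, y: -x + y
g = lambda t, x, y: y - np.sin(t)
x, y, t, h = np.array([1.0]), np.array([0.0]), 0.0, 0.01
for _ in range(100):
    x, y = pc_step(f, g, t, x, y, h)
    t += h
print(y, np.sin(t))   # y tracks the constraint
```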