In this paper, we present an approach to enhance interpolation and approximation error estimates. Based on a previously derived first-order Taylor-like formula, we demonstrate its applicability in improving the $P_1$-interpolation error estimate. Following the same principles, we also develop a novel numerical scheme for the heat equation that yields a better error estimate than the classical implicit finite difference scheme.
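For orientation, the classical implicit finite difference scheme used as the baseline above can be sketched as follows. This is a minimal sketch of the standard backward Euler discretization of $u_t = u_{xx}$ on $[0,1]$ with homogeneous Dirichlet boundary conditions, not the novel scheme of the paper. The function names, grid sizes, and initial condition are assumptions chosen for illustration.

```python
import math


def solve_tridiagonal(lower, diag, upper, rhs):
    """Thomas algorithm for a tridiagonal linear system.

    lower[0] and upper[-1] are ignored.
    """
    n = len(diag)
    c = [0.0] * n
    d = [0.0] * n
    c[0] = upper[0] / diag[0]
    d[0] = rhs[0] / diag[0]
    for i in range(1, n):
        denom = diag[i] - lower[i] * c[i - 1]
        c[i] = upper[i] / denom
        d[i] = (rhs[i] - lower[i] * d[i - 1]) / denom
    x = [0.0] * n
    x[-1] = d[-1]
    for i in range(n - 2, -1, -1):
        x[i] = d[i] - c[i] * x[i + 1]
    return x


def implicit_heat(u0, dx, dt, steps):
    """Backward Euler in time, centered differences in space.

    u0 holds the interior grid values; the boundary values are fixed at 0.
    Each step solves (I - r * D2) u_new = u_old with r = dt / dx**2.
    """
    n = len(u0)
    r = dt / dx ** 2
    lower = [-r] * n
    diag = [1.0 + 2.0 * r] * n
    upper = [-r] * n
    u = list(u0)
    for _ in range(steps):
        u = solve_tridiagonal(lower, diag, upper, u)
    return u


# Illustrative run: u(x, 0) = sin(pi x), exact solution exp(-pi^2 t) sin(pi x).
n, dx, dt, steps = 49, 1.0 / 50, 0.001, 100
u0 = [math.sin(math.pi * (i + 1) * dx) for i in range(n)]
u = implicit_heat(u0, dx, dt, steps)
exact = [math.exp(-math.pi ** 2 * dt * steps) * v for v in u0]
max_error = max(abs(a - b) for a, b in zip(u, exact))
```

The scheme is unconditionally stable, so the time step is not restricted by the grid spacing; its error is first order in the time step and second order in the grid spacing.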
This is the first paper in a series of works we have accomplished over the past three years. In this paper, we construct a complete and compatible formal plane geometry system. This will serve as a crucial bridge between IMO-level plane geometry challenges and readable AI automated reasoning. Within this formal framework, we have been able to seamlessly integrate modern AI models with our formal system. AI is now capable of providing deductive reasoning solutions to IMO-level plane geometry problems, just as it handles other natural languages, and these proofs are readable, traceable, and verifiable. We propose the geometry formalization theory (GFT) to guide the development of the geometry formal system. Based on the GFT, we have established FormalGeo, which consists of 88 geometric predicates and 196 theorems. It can represent, validate, and solve IMO-level geometry problems. We have also developed the FGPS (formal geometry problem solver) in Python. It serves as both an interactive assistant for verifying problem-solving processes and an automated problem solver. We have annotated the formalgeo7k and formalgeo-imo datasets. The former contains 6,981 geometry problems (expanded to 133,818 through data augmentation), while the latter includes 18 IMO-level challenging geometry problems (expanded to 2,627 and continuously increasing). All annotated problems include detailed formal language descriptions and solutions. Implementation of the formal system and experiments validate the correctness and utility of the GFT. The backward depth-first search method yields a problem-solving failure rate of only 2.42%, and deep learning techniques can be incorporated to achieve a lower one. The source code of FGPS and the datasets are available at //github.com/BitSecret/FGPS.
Modeling and formally reasoning about distributed systems with faults is a challenging task. To address this problem, we propose the theory of Validating Labeled State transition and Message production systems (VLSMs). The theory of VLSMs provides a general approach to describing and verifying properties of distributed protocols whose executions are subject to faults, supporting a correct-by-construction system development methodology. The central focus of our investigation is equivocation, a mode of faulty behavior that we formally model, reason about, and then show how to detect from durable evidence that may be available locally to system components. Equivocating components exhibit behavior that is inconsistent with single-trace system executions, while also only interacting with other components by sending and receiving valid messages. Components of a system are called validators for that system if their validity constraints validate that the messages they receive are producible by the system. Our main result shows that for systems of validators, the effect that Byzantine components can have on honest validators is precisely identical to the effect that equivocating components can have on non-equivocating validators. Therefore, for distributed systems of potentially faulty validators, replacing Byzantine components with equivocating components has no material analytical consequences, and forms the basis of a sound alternative foundation to Byzantine fault tolerance analysis. All of the results and examples in the paper have been formalised and checked in the Coq proof assistant.
In this paper, we study the problem of efficiently and effectively embedding the high-dimensional spatio-spectral information of hyperspectral (HS) images, guided by feature diversity. Specifically, based on the theoretical formulation that feature diversity is correlated with the rank of the unfolded kernel matrix, we rectify 3D convolution by modifying its topology to enhance the rank upper-bound. This modification yields a rank-enhanced spatial-spectral symmetrical convolution set (ReS$^3$-ConvSet), which not only learns diverse and powerful feature representations but also saves network parameters. Additionally, we propose a novel diversity-aware regularization (DA-Reg) term that acts directly on the feature maps to maximize independence among elements. To demonstrate the superiority of the proposed ReS$^3$-ConvSet and DA-Reg, we apply them to various HS image processing and analysis tasks, including denoising, spatial super-resolution, and classification. Extensive experiments show that the proposed approaches significantly outperform state-of-the-art methods both quantitatively and qualitatively. The code is publicly available at //github.com/jinnh/ReSSS-ConvSet.
In this paper, we derive a PAC-Bayes bound on the generalisation gap in a supervised time-series setting for a special class of discrete-time non-linear dynamical systems. This class includes stable recurrent neural networks (RNNs), and the motivation for this work was its application to RNNs. To achieve these results, we impose some stability constraints on the allowed models. Here, stability is understood in the sense of dynamical systems. For RNNs, these stability conditions can be expressed in terms of conditions on the weights. We assume the processes involved are essentially bounded and the loss functions are Lipschitz. The proposed bound on the generalisation gap depends on the mixing coefficient of the data distribution and the essential supremum of the data. Furthermore, the bound converges to zero as the dataset size increases. In this paper, we 1) formalize the learning problem, 2) derive a PAC-Bayesian error bound for such systems, 3) discuss various consequences of this error bound, and 4) show an illustrative example, with discussions on computing the proposed bound. Unlike other available bounds, the derived bound holds for non-i.i.d. data (time series) and does not grow with the number of steps of the RNN.
In this paper, we introduce a new approach for constructing robust well-balanced numerical methods for the one-dimensional Saint-Venant system with and without the Manning friction term. Following the idea presented in [R. Abgrall, Commun. Appl. Math. Comput. 5(2023), pp. 370-402], we first combine the conservative and non-conservative (primitive) formulations of the studied conservative hyperbolic system in a natural way. The solution is globally continuous and described by a combination of point values and average values. The point values and average values are then evolved by two different forms of PDEs: a conservative one for the cell averages and a possibly non-conservative one for the point values. We show how to deal with both the conservative and non-conservative forms of PDEs in a well-balanced manner. The developed schemes are capable of exactly preserving both the still-water and moving-water equilibria. Compared with existing well-balanced methods, this new class of schemes is nonlinear-equations-solver-free. This makes the developed schemes less computationally costly and easier to extend to other models. We demonstrate the behavior of the proposed new schemes on several challenging examples.
In this paper, we investigate a novel reconfigurable distributed antennas and reflecting surface (RDARS) aided multi-user massive MIMO system with imperfect CSI and propose a practical two-timescale (TTS) transceiver design to reduce the communication overhead and computational complexity of the system. In the RDARS-aided system, not only distribution gain but also reflection gain can be obtained by a flexible combination of the distributed antennas and reflecting surface, which differentiates the system from the others and also makes the TTS design challenging. To enable the optimal TTS transceiver design, the achievable rate of the system is first derived in closed form. Then the TTS design aiming at the weighted sum rate maximization is considered. To solve the challenging non-convex optimization problem with high-order design variables, i.e., the transmit powers and the phase shifts at the RDARS, a block coordinate descent based method is proposed to find the optimal solutions in semi-closed forms iteratively. Specifically, two efficient algorithms with provable convergence are proposed for the optimal phase shift design, i.e., a Riemannian Gradient Ascent based algorithm that exploits the unit-modulus constraints, and a Two-Tier Majorization-Minimization based algorithm with closed-form optimal solutions in each iteration. Simulation results validate the effectiveness of the proposed algorithm and demonstrate the superiority of deploying RDARS in massive MIMO systems to provide substantial rate improvement with a significantly reduced total number of active antennas/RF chains and lower transmit power when compared to the DAS and RIS-aided systems.
In this paper, we present a comprehensive review of the imbalance problems in object detection. To analyze the problems in a systematic manner, we introduce a problem-based taxonomy. Following this taxonomy, we discuss each problem in depth and present a unifying yet critical perspective on the solutions in the literature. In addition, we identify major open issues regarding the existing imbalance problems as well as imbalance problems that have not been discussed before. Moreover, in order to keep our review up to date, we provide an accompanying webpage which catalogs papers addressing imbalance problems, according to our problem-based taxonomy. Researchers can track newer studies on this webpage available at: //github.com/kemaloksuz/ObjectDetectionImbalance .
In this paper, we propose applying a meta-learning approach to low-resource automatic speech recognition (ASR). We formulate ASR for different languages as different tasks and meta-learn the initialization parameters from many pretraining languages to achieve fast adaptation to an unseen target language, via the recently proposed model-agnostic meta-learning algorithm (MAML). We evaluate the proposed approach using six languages as pretraining tasks and four languages as target tasks. Preliminary results show that the proposed method, MetaASR, significantly outperforms the state-of-the-art multitask pretraining approach on all target languages with different combinations of pretraining languages. In addition, owing to MAML's model-agnostic property, this paper also opens a new research direction of applying meta-learning to more speech-related applications.
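The idea of meta-learning an initialization with MAML can be illustrated on a toy problem. The sketch below is an assumption-laden illustration, not the MetaASR system: each "task" is a one-parameter linear regression $y = a x$ with a different slope $a$, gradients are written analytically, and the same inputs are used for the inner and outer steps for brevity (MAML normally uses separate support and query sets). All function names and hyperparameters are hypothetical.

```python
def loss_grad(w, xs, slope):
    """Gradient w.r.t. w of the mean squared error of y = w * x on one task."""
    return sum(2.0 * x * (w * x - slope * x) for x in xs) / len(xs)


def loss_hess(xs):
    """Second derivative of the same loss (independent of w for this model)."""
    return sum(2.0 * x * x for x in xs) / len(xs)


def maml_init(task_slopes, xs, inner_lr=0.1, outer_lr=0.1, meta_steps=200, w=0.0):
    """Meta-learn an initial weight w with one inner gradient step per task."""
    for _ in range(meta_steps):
        meta_grad = 0.0
        for slope in task_slopes:
            # Inner loop: adapt the shared initialization to this task.
            adapted = w - inner_lr * loss_grad(w, xs, slope)
            # Outer gradient via the chain rule:
            # d(adapted)/dw = 1 - inner_lr * hessian.
            meta_grad += loss_grad(adapted, xs, slope) * (1.0 - inner_lr * loss_hess(xs))
        # Outer loop: update the initialization with the averaged meta-gradient.
        w -= outer_lr * meta_grad / len(task_slopes)
    return w


xs = [-1.0, -0.5, 0.5, 1.0]
w_init = maml_init([1.0, 2.0, 3.0], xs)
```

In this symmetric toy setting the learned initialization settles at the mean of the task slopes, the point from which a single gradient step adapts equally well to every task.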
BERT, a pre-trained Transformer model, has achieved ground-breaking performance on multiple NLP tasks. In this paper, we describe BERTSUM, a simple variant of BERT, for extractive summarization. Our system is the state of the art on the CNN/Dailymail dataset, outperforming the previous best-performing system by 1.65 on ROUGE-L. The code to reproduce our results is available at //github.com/nlpyang/BertSum.
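To make the ROUGE-L figure above concrete, the following is a minimal sketch of a sentence-level ROUGE-L F1 score based on the longest common subsequence (LCS) of tokens. It assumes lowercased whitespace tokenization and equal weighting of precision and recall; the official ROUGE toolkit applies further processing (for example stemming and a summary-level variant), so values from this sketch are not comparable to the reported numbers. Function names are hypothetical.

```python
def lcs_length(a, b):
    """Length of the longest common subsequence of two token lists."""
    prev = [0] * (len(b) + 1)
    for tok_a in a:
        curr = [0]
        for j, tok_b in enumerate(b, start=1):
            if tok_a == tok_b:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def rouge_l_f1(candidate, reference):
    """Sentence-level ROUGE-L F1 on lowercased whitespace tokens."""
    cand = candidate.lower().split()
    ref = reference.lower().split()
    lcs = lcs_length(cand, ref)
    if lcs == 0:
        return 0.0
    precision = lcs / len(cand)
    recall = lcs / len(ref)
    return 2.0 * precision * recall / (precision + recall)


score = rouge_l_f1("the cat is on the mat", "the cat sat on the mat")
```

Scores of this kind lie in $[0, 1]$ and are typically reported multiplied by 100, so a gain of 1.65 corresponds to 1.65 points on that scale.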
In this paper, we propose a novel multi-task learning architecture, which incorporates recent advances in attention mechanisms. Our approach, the Multi-Task Attention Network (MTAN), consists of a single shared network containing a global feature pool, together with task-specific soft-attention modules, which are trainable in an end-to-end manner. These attention modules allow for learning of task-specific features from the global pool, whilst simultaneously allowing for features to be shared across different tasks. The architecture can be built upon any feed-forward neural network, is simple to implement, and is parameter efficient. Experiments on the CityScapes dataset show that our method outperforms several baselines in both single-task and multi-task learning, and is also more robust to the various weighting schemes in the multi-task loss function. We further explore the effectiveness of our method through experiments over a range of task complexities, and show how our method scales well with task complexity compared to baselines.
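The role of a task-specific soft-attention module can be sketched as an element-wise mask in $(0, 1)$ applied to shared features. The sketch below is a deliberately simplified illustration and not the MTAN architecture: it assumes the mask is a sigmoid of a per-element affine function of the shared feature, whereas a real module would compute the mask with learned layers over feature maps. The function names and parameters are hypothetical.

```python
import math


def sigmoid(z):
    """Logistic function mapping any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def task_attention(shared, weights, biases):
    """Select task-specific features from a shared feature vector.

    Each mask entry lies in (0, 1); values near 1 pass the shared feature
    through to the task, values near 0 suppress it.
    """
    masks = [sigmoid(w * f + b) for f, w, b in zip(shared, weights, biases)]
    return [m * f for m, f in zip(masks, shared)]


shared_features = [2.0, -4.0]
# Two hypothetical tasks reading from the same shared features.
task_a = task_attention(shared_features, [0.0, 0.0], [0.0, 0.0])
task_b = task_attention(shared_features, [0.0, 0.0], [50.0, -50.0])
```

Because every task reads from the same shared features and only the masks differ, the per-task parameter cost is limited to the attention modules themselves.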