A seminal palette sparsification result of Assadi, Chen, and Khanna states that in every $n$-vertex graph of maximum degree $\Delta$, sampling $\Theta(\log n)$ colors per vertex from $\{1, \ldots, \Delta+1\}$ allows, with high probability, a proper coloring from the sampled colors. Alon and Assadi extended this work, proving a similar result for $O\left(\Delta/\log \Delta\right)$-coloring triangle-free graphs. Apart from their combinatorial interest, these results have various applications to the design of graph coloring algorithms in different models of computation. In this work, we focus on locally sparse graphs, i.e., graphs with sparse neighborhoods. We say a graph $G = (V, E)$ is $k$-locally-sparse if for each vertex $v \in V$, the subgraph $G[N(v)]$ contains at most $k$ edges. A celebrated result of Alon, Krivelevich, and Sudakov shows that such graphs are $O(\Delta/\log (\Delta/\sqrt{k}))$-colorable. For any $\alpha \in (0, 1)$ and $k \ll \Delta^{2\alpha}$, let $G$ be a $k$-locally-sparse graph. For $q = \Theta\left(\Delta/\log \left(\Delta^\alpha/\sqrt{k}\right)\right)$, we show that sampling $O\left(\Delta^\alpha + \sqrt{\log n}\right)$ colors per vertex is sufficient to obtain a proper $q$-coloring of $G$ from the sampled colors. Setting $k = 1$ recovers the aforementioned result of Alon and Assadi for triangle-free graphs. A key element in our proof is a proposition regarding correspondence coloring in the so-called color-degree setting, which improves upon recent work of Anderson, Kuchukova, and the author and is of independent interest.
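To make the sampling statement concrete, here is a minimal Python sketch of the palette-sparsification experiment: each vertex draws a small list of colors from a $q$-color palette, and a greedy pass then attempts a proper coloring from the lists. The list size and the greedy strategy are illustrative only; the theorem guarantees existence of a proper list coloring, not that greedy finds it.

```python
import random

def sample_palettes(graph, q, list_size, rng=random):
    """Palette sparsification step: each vertex samples `list_size`
    colors uniformly without replacement from {0, ..., q-1}."""
    return {v: set(rng.sample(range(q), list_size)) for v in graph}

def greedy_list_coloring(graph, palettes):
    """Heuristic check: greedily color each vertex from its own sampled
    list, avoiding colors already used by neighbors.  Returns None on
    failure (the existence guarantee is not constructive via greedy)."""
    color = {}
    for v in graph:
        used = {color[u] for u in graph[v] if u in color}
        free = palettes[v] - used
        if not free:
            return None
        color[v] = min(free)
    return color

# Toy usage: a 5-cycle (triangle-free), represented as adjacency sets.
cycle = {i: {(i - 1) % 5, (i + 1) % 5} for i in range(5)}
palettes = sample_palettes(cycle, q=4, list_size=3)
print(greedy_list_coloring(cycle, palettes))
```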
Emotional text-to-speech synthesis (TTS) aims to generate realistic emotional speech from input text. However, quantitatively controlling multi-level emotion rendering remains challenging. In this paper, we propose a diffusion-based emotional TTS framework with a novel approach for emotion intensity modeling to facilitate fine-grained control over emotion rendering at the phoneme, word, and utterance levels. We introduce a hierarchical emotion distribution (ED) extractor that captures a quantifiable ED embedding across different speech segment levels. Additionally, we explore various acoustic features and assess their impact on emotion intensity modeling. During TTS training, the hierarchical ED embedding effectively captures the variance in emotion intensity from the reference audio and correlates it with linguistic and speaker information. The TTS model not only generates emotional speech during inference, but also quantitatively controls the emotion rendering over the speech constituents. Both objective and subjective evaluations demonstrate the effectiveness of our framework in terms of speech quality, emotional expressiveness, and hierarchical emotion control.
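As one plausible reading of the hierarchical ED extraction, the sketch below pools a per-frame emotion-intensity curve into phoneme-, word-, and utterance-level values and broadcasts them back per phoneme. The function names, the per-frame intensity input, and the aligner-provided spans are all assumptions made here for illustration; the paper's extractor operates on the acoustic features it evaluates separately.

```python
import numpy as np

def hierarchical_ed(frame_intensity, phone_frames, word_of_phone):
    """frame_intensity: (T,) per-frame intensity scores in [0, 1].
    phone_frames: (start, end) frame span per phoneme (from an aligner).
    word_of_phone: word index of each phoneme.
    Returns (num_phones, 3): [phoneme, word, utterance] ED per phoneme."""
    phone_ed = np.array([frame_intensity[s:e].mean() for s, e in phone_frames])
    num_words = max(word_of_phone) + 1
    word_ed = np.array([phone_ed[[i for i, w in enumerate(word_of_phone)
                                  if w == j]].mean() for j in range(num_words)])
    utt_ed = frame_intensity.mean()
    return np.stack([phone_ed,
                     word_ed[np.asarray(word_of_phone)],
                     np.full(len(phone_ed), utt_ed)], axis=1)
```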
Several recent works have focused on carrying out non-asymptotic convergence analyses for actor-critic (AC) algorithms. Recently, a two-timescale critic-actor algorithm has been presented for the discounted cost setting in the look-up table case, where the timescales of the actor and the critic are reversed and only asymptotic convergence is shown. In our work, we present the first two-timescale critic-actor algorithm with function approximation in the long-run average reward setting, together with the first finite-time non-asymptotic as well as asymptotic convergence analysis for such a scheme. We obtain optimal learning rates and prove that our algorithm achieves a sample complexity of $\tilde{\mathcal{O}}(\epsilon^{-(2+\delta)})$, with $\delta > 0$ arbitrarily close to zero, for the mean squared error of the critic to be upper bounded by $\epsilon$, which is better than the one obtained for two-timescale AC in a similar setting. A notable feature of our analysis is that we present the asymptotic convergence analysis of our scheme in addition to the finite-time bounds, and show the almost sure asymptotic convergence of the (slower) critic recursion to the attractor of an associated differential inclusion with actor parameters corresponding to local maxima of a perturbed average reward objective. We also show the results of numerical experiments on three benchmark settings and observe that our critic-actor algorithm performs the best among all algorithms.
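A minimal sketch of one reversed-timescale update, under assumptions made here: linear critic features, a score-function actor update, and illustrative step-size exponents chosen so the actor schedule decays more slowly (i.e., the actor is the faster recursion and the critic the slower one, the reversal the abstract describes).

```python
import numpy as np

def critic_actor_step(theta, w, eta, transition, features, grad_log_pi, t):
    """One sketch update: actor on the faster schedule a_t; critic and
    average-reward tracker on the slower schedule b_t.  Step-size
    exponents are illustrative, not the paper's optimized rates."""
    s, action, r, s_next = transition
    a_t = 1.0 / (t + 1) ** 0.55   # faster (actor) step size
    b_t = 1.0 / (t + 1) ** 0.65   # slower (critic) step size
    td = r - eta + features(s_next) @ w - features(s) @ w  # average-reward TD error
    theta = theta + a_t * td * grad_log_pi(s, action)      # actor (fast timescale)
    w = w + b_t * td * features(s)                         # critic (slow timescale)
    eta = eta + b_t * (r - eta)                            # average-reward estimate
    return theta, w, eta
```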
We consider the dunking problem: a solid body at uniform temperature $T_{\text i}$ is placed in an environment characterized by far-field temperature $T_\infty$ and a spatially uniform, time-independent heat transfer coefficient. We permit heterogeneous material composition: spatially dependent density, specific heat, and thermal conductivity. Mathematically, the problem is described by a heat equation with Robin boundary conditions. The crucial parameter is the Biot number -- a nondimensional heat transfer (Robin) coefficient; we consider the limit of small Biot number. We introduce first-order and second-order asymptotic approximations (in Biot number) for several quantities of interest, notably the spatial domain average temperature as a function of time; the first-order approximation is simply the standard engineering `lumped' model. We then provide asymptotic error estimates for the first-order and second-order approximations for small Biot number, and also, for the first-order approximation, alternative strict bounds valid for all Biot number. Companion numerical solutions of the heat equation confirm the effectiveness of the error estimates for small Biot number. The second-order approximation and the first-order and second-order error estimates depend on several functional outputs associated with an elliptic partial differential equation; the latter is derived from Biot-sensitivity analysis of the heat equation eigenproblem in the limit of small Biot number. Most important is $\phi$, the only functional output required for the first-order error estimates; $\phi$ admits a simple physical interpretation in terms of conduction length scale. We investigate the domain and property dependence of $\phi$: most notably, we characterize spatial domains for which the standard lumped-model error criterion -- Biot number (based on volume-to-area length scale) small -- is deficient.
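For reference, the first-order `lumped' model in its standard homogeneous-body form (the paper's setting allows spatially varying $\rho$, $c$, $k$; this special case is included only to fix notation):

```latex
% Standard lumped model: homogeneous body of volume V, surface area A,
% heat transfer coefficient h; a special case of the first-order model.
\begin{align*}
  \rho c V \,\frac{d\bar{T}}{dt} &= -hA\,(\bar{T} - T_\infty),
    \qquad \bar{T}(0) = T_{\mathrm{i}},\\
  \bar{T}(t) &= T_\infty + (T_{\mathrm{i}} - T_\infty)\, e^{-t/\tau},
    \qquad \tau = \frac{\rho c V}{h A},\\
  \mathrm{Bi} &= \frac{h L}{k}, \qquad L = V/A
    \quad\text{(volume-to-area length scale)}.
\end{align*}
```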
Question Answering (QA) in NLP is the task of finding answers to a query within a relevant context retrieved by a retrieval system. Yet, the mix of relevant and irrelevant information in these contexts can hinder performance in QA tasks. To address this, we introduce a context filtering approach that removes non-essential details, summarizing crucial content through Reward Modeling. This method emphasizes keeping vital data while omitting the extraneous during summarization model training. We offer a framework for developing efficient QA models by discerning useful information from dataset pairs, bypassing the need for costly human evaluation. Furthermore, we show that our approach can significantly outperform the baseline, as evidenced by a 6.8-fold increase in the Exact Match Per Token (EPT) metric, which we propose as a measure of token efficiency, indicating a notable efficiency boost for low-resource settings.
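One plausible operationalization of the proposed EPT metric, under the assumption that it divides exact-match accuracy by the average number of context tokens the QA model consumes (the exact normalization is not specified in the abstract):

```python
def em_per_token(predictions, golds, context_token_counts):
    """Hypothetical EM Per Token (EPT): exact-match accuracy divided by
    the mean number of (filtered) context tokens fed to the QA model,
    so shorter contexts that preserve the answer score higher."""
    em = sum(p.strip() == g.strip() for p, g in zip(predictions, golds)) / len(golds)
    avg_tokens = sum(context_token_counts) / len(context_token_counts)
    return em / avg_tokens
```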
In this paper, we design a family of $[n,k,d]$ block circulant codes that consist of many $[n_0 \ll n, k_0 \ll k, d_0]$ local codes and that satisfy three properties: (1) the code supports distributed erasure decoding, (2) $d$ can be scaled above $d_0$ by a given parameter, and (3) it is amenable to low-complexity verification of code symbols using a cryptographic commitment scheme. These properties make the code ideal for use in protocols that address the data availability problem in blockchain networks. Moreover, the code outperforms the currently used 2D Reed-Solomon (RS) code with a larger relative minimum distance $(d/n)$, as desired in the protocol, for a given rate $(k/n)$ in the high-rate regime. The code is designed in two steps. First, we develop the topology, i.e., the structure of linear dependence relations among code symbols, and define it as the block circulant topology $T_{[\mu,\lambda,\omega]}(\rho)$. In this topology, there are $\mu$ local codes, each constrained by $\rho$ parity checks. The set of symbols of a local code intersects with another in a uniform pattern, determined by two parameters, namely the overlap factor $\lambda$ and the overlap width $\omega$. Next, we instantiate the topology, i.e., specify the coefficients of the linear dependence relations, to construct the block circulant codes ${\cal C}_{\text{BC}}[\mu,\lambda,\omega,\rho]$. Every local code is a $[\lambda\omega+\rho,\lambda\omega,\rho+1]$ generalized RS code. The block circulant code has $n=\mu(\rho+\omega)$ and $k=\mu\omega$, and we show that $d=\lambda\rho+1$ under certain conditions. For $\lambda=2$, we prove that $d=2\rho+1$ always, and provide an efficient, parallelizable erasure-correcting decoder that fully recovers the codeword when there are $\leq 2\rho$ erasures. The decoder uses a novel decoding mechanism that iteratively recovers erasures from pairs of local codes.
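A small sketch of the stated parameter relations, useful for checking rate and relative minimum distance for candidate parameters (the example values are arbitrary):

```python
def bc_code_params(mu, lam, omega, rho):
    """Parameters of the block circulant code C_BC[mu, lam, omega, rho]:
    length n = mu*(rho + omega), dimension k = mu*omega, and minimum
    distance d = lam*rho + 1 (proved exact for lam == 2, conditional
    otherwise).  Each local code is a [lam*omega + rho, lam*omega,
    rho + 1] generalized RS code."""
    n = mu * (rho + omega)
    k = mu * omega
    d = lam * rho + 1
    return n, k, d

n, k, d = bc_code_params(mu=16, lam=2, omega=6, rho=2)
print(n, k, d, k / n, d / n)  # 128 96 5, rate 0.75, relative distance ~0.039
```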
We study frequency domain electromagnetic scattering at a bounded, penetrable, and inhomogeneous obstacle $ \Omega \subset \mathbb{R}^3 $. From the Stratton-Chu integral representation, we derive a new representation formula when constant reference coefficients are given for the interior domain. The resulting integral representation contains the usual layer potentials, but also volume potentials on $\Omega$. It is then possible to follow a single-trace approach to obtain boundary integral equations perturbed by traces of compact volume integral operators with weakly singular kernels. The coupled boundary and volume integral equations are discretized with a Galerkin approach using the usual curl-conforming and div-conforming finite elements on the boundary and in the volume. Compression techniques and special quadrature rules for singular integrands are required for an efficient and accurate method. Numerical experiments provide evidence that our new formulation enjoys promising properties.
Is there a fixed dimension $n$ such that translational tiling of $\mathbb{Z}^n$ with a monotile is undecidable? Several recent results support a positive answer to this question. Greenfeld and Tao disprove the periodic tiling conjecture by showing that an aperiodic monotile exists in sufficiently high dimension $n$ [Ann. Math. 200(2024), 301-363]. In another paper [to appear in J. Eur. Math. Soc.], they also show that if the dimension $n$ is part of the input, then translational tiling for subsets of $\mathbb{Z}^n$ with one tile is undecidable. These two results provide strong evidence for the conjecture that translational tiling of $\mathbb{Z}^n$ with a monotile is undecidable for some fixed $n$. This paper gives further support for this conjecture by showing that translational tiling of $4$-dimensional space with a set of three connected tiles is undecidable.
Astronomers often deal with data where the covariates and the dependent variable are measured with heteroscedastic non-Gaussian error. For instance, while TESS and Kepler datasets provide a wealth of information, addressing the challenges of measurement errors and systematic biases is critical for extracting reliable scientific insights and improving machine learning models' performance. Although techniques have been developed for estimating regression parameters for these data, few techniques exist to construct prediction intervals with finite-sample coverage guarantees. To address this issue, we tailor the conformal prediction approach to our application. We empirically demonstrate that this method gives finite-sample control over Type I error probabilities under a variety of assumptions on the measurement errors in the observed data. Further, we demonstrate how the conformal prediction method can be used to construct prediction intervals for unobserved exoplanet masses using established broken power-law relationships between masses and radii found in the literature.
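For concreteness, here is the generic split-conformal recipe that underlies such intervals, sketched for any scikit-learn-style regressor. The plain absolute-residual score below is the textbook baseline; the paper tailors the approach to heteroscedastic, non-Gaussian measurement error.

```python
import numpy as np

def split_conformal_interval(model, X_cal, y_cal, x_new, alpha=0.1):
    """Split conformal prediction: calibrate absolute residuals on a
    held-out set, then return an interval for a new point with
    finite-sample coverage >= 1 - alpha under exchangeability."""
    scores = np.abs(y_cal - model.predict(X_cal))
    n = len(scores)
    level = min(np.ceil((n + 1) * (1 - alpha)) / n, 1.0)
    qhat = np.quantile(scores, level, method="higher")
    pred = model.predict(np.atleast_2d(x_new))[0]
    return pred - qhat, pred + qhat
```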
Named entity recognition (NER) is the task of identifying text spans that mention named entities and classifying them into predefined categories such as person, location, and organization. NER serves as the basis for a variety of natural language applications such as question answering, text summarization, and machine translation. Although early NER systems were successful in producing decent recognition accuracy, they often required considerable human effort in carefully designing rules or features. In recent years, deep learning, empowered by continuous real-valued vector representations and semantic composition through nonlinear processing, has been employed in NER systems, yielding state-of-the-art performance. In this paper, we provide a comprehensive review of existing deep learning techniques for NER. We first introduce NER resources, including tagged NER corpora and off-the-shelf NER tools. Then, we systematically categorize existing works based on a taxonomy along three axes: distributed representations for input, context encoder, and tag decoder. Next, we survey the most representative recent applications of deep learning techniques to new NER problem settings and applications. Finally, we present readers with the challenges faced by NER systems and outline future directions in this area.
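A minimal instance of the survey's three-axis taxonomy, sketched in PyTorch with illustrative sizes (a CRF tag decoder is the common upgrade over the per-token linear layer used here):

```python
import torch
import torch.nn as nn

class BiLSTMTagger(nn.Module):
    """Distributed input representation (embedding), context encoder
    (bidirectional LSTM), and tag decoder (per-token linear layer)."""
    def __init__(self, vocab_size, num_tags, emb_dim=100, hidden=256):
        super().__init__()
        self.emb = nn.Embedding(vocab_size, emb_dim)
        self.encoder = nn.LSTM(emb_dim, hidden // 2, bidirectional=True,
                               batch_first=True)
        self.decoder = nn.Linear(hidden, num_tags)

    def forward(self, token_ids):           # (batch, seq_len)
        h, _ = self.encoder(self.emb(token_ids))
        return self.decoder(h)              # (batch, seq_len, num_tags)
```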
It is important to detect anomalous inputs when deploying machine learning systems. The use of larger and more complex inputs in deep learning magnifies the difficulty of distinguishing between anomalous and in-distribution examples. At the same time, diverse image and text data are available in enormous quantities. We propose leveraging these data to improve deep anomaly detection by training anomaly detectors against an auxiliary dataset of outliers, an approach we call Outlier Exposure (OE). This enables anomaly detectors to generalize and detect unseen anomalies. In extensive experiments on natural language processing and small- and large-scale vision tasks, we find that Outlier Exposure significantly improves detection performance. We also observe that cutting-edge generative models trained on CIFAR-10 may assign higher likelihoods to SVHN images than to CIFAR-10 images; we use OE to mitigate this issue. We also analyze the flexibility and robustness of Outlier Exposure, and identify characteristics of the auxiliary dataset that improve performance.
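The OE objective is simple to state for multiclass classification: the usual cross-entropy on in-distribution data plus a term pushing predictions on auxiliary outliers toward the uniform distribution. A PyTorch sketch of this standard instantiation (the weight `lam` is illustrative):

```python
import torch.nn.functional as F

def outlier_exposure_loss(logits_in, labels_in, logits_out, lam=0.5):
    """Cross-entropy on in-distribution examples plus cross-entropy to
    the uniform distribution on auxiliary outliers, which penalizes
    confident predictions on outlier inputs."""
    ce_in = F.cross_entropy(logits_in, labels_in)
    ce_to_uniform = -F.log_softmax(logits_out, dim=1).mean(dim=1).mean()
    return ce_in + lam * ce_to_uniform
```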