Exploring the semantic context of scene images is essential for indoor scene recognition. However, due to diverse intra-class spatial layouts and coexisting inter-class objects, modeling contextual relationships that adapt to varied image characteristics is a great challenge. Existing contextual modeling methods for indoor scene recognition exhibit two limitations: 1) during training, space-independent information, such as color, may hinder optimizing the network's capacity to represent the spatial context; 2) these methods often overlook the differences in coexisting objects across different scenes, which limits scene recognition performance. To address these limitations, we propose SpaCoNet, which simultaneously models the Spatial relation and Co-occurrence of objects based on semantic segmentation. First, a semantic spatial relation module (SSRM) is designed to explore the spatial relations among objects within a scene. With the help of semantic segmentation, this module decouples spatial information from the image, effectively avoiding the influence of irrelevant features. Second, both the spatial context features from the SSRM and the deep features from the Image Feature Extraction Module are used to distinguish coexisting objects across different scenes. Finally, using these discriminative features, we employ a self-attention mechanism to explore long-range co-occurrence among objects and generate a semantic-guided feature representation for indoor scene recognition. Experimental results on three widely used scene datasets demonstrate the effectiveness and generality of the proposed method. The code will be made publicly available after the blind review process is completed.
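To make the described two-branch design concrete, here is a minimal PyTorch-style sketch; the module names, token sizes, and the 150-class/67-scene defaults are our illustrative assumptions, not the authors' implementation.

```python
# Minimal sketch (not the authors' code): fusing a segmentation-driven spatial
# branch with an image branch via self-attention over region tokens.
import torch
import torch.nn as nn

class SSRMSketch(nn.Module):
    """Toy spatial-relation module: embeds only a one-hot semantic map,
    so colour/texture cannot leak into the spatial representation."""
    def __init__(self, num_classes=150, dim=256):
        super().__init__()
        self.embed = nn.Conv2d(num_classes, dim, kernel_size=3, padding=1)
        self.pool = nn.AdaptiveAvgPool2d(8)      # 8x8 = 64 spatial tokens

    def forward(self, seg_onehot):               # (B, C, H, W) one-hot masks
        x = self.pool(self.embed(seg_onehot))
        return x.flatten(2).transpose(1, 2)      # (B, 64, dim) tokens

class SpaCoNetSketch(nn.Module):
    def __init__(self, num_classes=150, dim=256, num_scenes=67):
        super().__init__()
        self.ssrm = SSRMSketch(num_classes, dim)
        self.image_branch = nn.Sequential(       # stand-in for a CNN backbone
            nn.Conv2d(3, dim, 7, stride=4, padding=3), nn.ReLU(),
            nn.AdaptiveAvgPool2d(8))
        self.attn = nn.MultiheadAttention(dim, num_heads=4, batch_first=True)
        self.head = nn.Linear(dim, num_scenes)

    def forward(self, image, seg_onehot):
        spatial = self.ssrm(seg_onehot)                       # (B, 64, dim)
        deep = self.image_branch(image).flatten(2).transpose(1, 2)
        tokens = torch.cat([spatial, deep], dim=1)            # region tokens
        ctx, _ = self.attn(tokens, tokens, tokens)            # long-range co-occurrence
        return self.head(ctx.mean(dim=1))                     # scene logits

logits = SpaCoNetSketch()(torch.randn(2, 3, 128, 128),
                          torch.rand(2, 150, 128, 128))       # random stand-in masks
print(logits.shape)  # torch.Size([2, 67])
```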
We propose, analyze, and test new iterative solvers for large-scale systems of linear algebraic equations arising from the finite element discretization of reduced optimality systems defining the finite element approximations to the solution of elliptic tracking-type distributed optimal control problems with both the standard $L_2$ and the more general energy regularizations. If we aim at an approximation of the given desired state $y_d$ by the computed finite element state $y_h$ that asymptotically differs from $y_d$ in the order of the best $L_2$ approximation, at acceptable cost for the control, then the optimal choice of the regularization parameter $\varrho$ is linked to the mesh-size $h$ by the relations $\varrho=h^4$ and $\varrho=h^2$ for the $L_2$ and the energy regularization, respectively. For this setting, we can construct efficient parallel iterative solvers for the reduced finite element optimality systems. These results generalize to variable regularization parameters adapted to the local behavior of the mesh-size, which can change strongly in the case of adaptive mesh refinement. Similar results can be obtained for the space-time finite element discretization of the corresponding parabolic and hyperbolic optimal control problems.
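Schematically, for the model Poisson problem (our notation, not taken from the paper beyond the quantities it names), the tracking-type problem and the mesh coupling read
\[
\min_{y,u}\; \tfrac{1}{2}\|y - y_d\|_{L_2(\Omega)}^2 + \tfrac{\varrho}{2}\|u\|_{U}^2
\quad \text{s.t.} \quad -\Delta y = u \ \text{in } \Omega, \quad y = 0 \ \text{on } \partial\Omega,
\]
where $U = L_2(\Omega)$ for the standard regularization and $U = H^{-1}(\Omega)$ for the energy regularization; choosing $\varrho = h^4$ and $\varrho = h^2$, respectively, balances the error $\|y_h - y_d\|_{L_2}$ against the cost of the control.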
We derive and analyze a symmetric interior penalty discontinuous Galerkin scheme for the approximation of the second-order form of the radiative transfer equation in slab geometry. Using appropriate trace lemmas, the analysis can be carried out as for more standard elliptic problems. Numerical examples confirm the accuracy and stability of the method for different polynomial degrees. For discretization, we employ quad-tree grids, which allow for local refinement in phase space, and we show by example that adaptive methods can efficiently approximate discontinuous solutions. We investigate the behavior of hierarchical error estimators and of error estimators based on local averaging.
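For orientation, the symmetric interior penalty bilinear form for a model diffusion operator reads (a standard sketch; the paper applies the analogous construction to the second-order form of the radiative transfer equation)
\[
a_h(u,v) = \sum_{K \in \mathcal{T}_h} \int_K \nabla u \cdot \nabla v
 \;-\; \sum_{F \in \mathcal{F}_h} \int_F \big( \{\!\{\partial_n u\}\!\}\, [\![v]\!] + \{\!\{\partial_n v\}\!\}\, [\![u]\!] \big)
 \;+\; \sum_{F \in \mathcal{F}_h} \frac{\sigma}{h_F} \int_F [\![u]\!]\, [\![v]\!],
\]
with averages $\{\!\{\cdot\}\!\}$ and jumps $[\![\cdot]\!]$ across faces $F$, and a penalty parameter $\sigma > 0$ chosen large enough to ensure coercivity.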
Introducing a coupling framework reminiscent of FETI methods, but here in abstract form, we establish conditions for stability and minimal requirements for well-posedness at the continuous level, as well as conditions on local solvers for the approximation of subproblems. We then discuss the stability of the resulting Lagrange multiplier methods and show stability under a mesh condition relating the local discretizations and the mortar space. If this condition is not satisfied, we show how a stabilization, acting only on the multiplier, can be used to achieve stability. The design of preconditioners for the Schur complement system is discussed in the unstabilized case. Finally, we discuss some applications that fit into the framework.
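In matrix form, the coupled problem with two subdomains leads to a saddle-point system of the familiar shape (our schematic notation, not the paper's)
\[
\begin{pmatrix} A_1 & 0 & B_1^T \\ 0 & A_2 & -B_2^T \\ B_1 & -B_2 & 0 \end{pmatrix}
\begin{pmatrix} u_1 \\ u_2 \\ \lambda \end{pmatrix}
=
\begin{pmatrix} f_1 \\ f_2 \\ 0 \end{pmatrix},
\]
whose Schur complement $S = B_1 A_1^{-1} B_1^T + B_2 A_2^{-1} B_2^T$ is the operator the preconditioners target.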
The arrival of AI techniques in computations, with the potential for hallucinations and non-robustness, has made the trustworthiness of algorithms a focal point. However, the trustworthiness of many classical approaches is not well understood. This is the case for feature selection, a classical problem in the sciences, statistics, machine learning, etc., where the LASSO optimisation problem is standard. Despite its widespread use, it has not been established when the output of algorithms that attempt to compute support sets of minimisers of LASSO, in order to do feature selection, can be trusted. In this paper we establish that no (randomised) algorithm that works on all inputs can determine the correct support sets (with probability $> 1/2$) of minimisers of LASSO when reading approximate input, regardless of precision and computing power. However, we define a LASSO condition number and design an efficient algorithm that computes these support sets, provided the input data is well-posed (has finite condition number), in time polynomial in the dimensions and the logarithm of the condition number. For ill-posed inputs the algorithm runs forever; hence, it will never produce a wrong answer. Furthermore, the algorithm computes an upper bound for the condition number whenever this is finite. Finally, for any algorithm defined on an open set containing a point with infinite condition number, there is an input for which the algorithm will either run forever or produce a wrong answer. Our impossibility results stem from generalised hardness of approximation -- within the Solvability Complexity Index (SCI) hierarchy framework -- which generalises the classical phenomenon of hardness of approximation.
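To fix notation, the LASSO problem is $\min_x \tfrac{1}{2}\|Ax - b\|_2^2 + \lambda\|x\|_1$, and feature selection reads off the support of a minimiser. The sketch below (illustrative only, not the paper's algorithm, with hypothetical data) shows that this step reduces to testing whether coefficients are exactly zero, a decision whose margin is the kind of quantity a condition number in the spirit of the abstract must control:

```python
# Illustrative sketch, not the paper's algorithm: the support of a LASSO
# minimiser is obtained by testing coefficients against zero, so with
# approximate input the decision hinges on the margin between the smallest
# active coefficient and zero.
import numpy as np
from sklearn.linear_model import Lasso

rng = np.random.default_rng(0)
A = rng.standard_normal((50, 10))
x_true = np.zeros(10)
x_true[:3] = [1.0, -0.5, 0.25]
b = A @ x_true + 0.01 * rng.standard_normal(50)   # approximate input

x_hat = Lasso(alpha=0.05, max_iter=100_000).fit(A, b).coef_
support = np.flatnonzero(x_hat)           # exact-zero test on computed output
margin = np.min(np.abs(x_hat[support]))   # how close the decision is to flipping
print(support, margin)
```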
In this article, we study anisotropic singular perturbations for a class of linear elliptic problems. Uniform estimates for the conforming $Q_1$ finite element method are derived, and further convergence and regularity results for the continuous problem are proved.
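A prototypical problem of this type (our example, intended only to illustrate what "anisotropic singular perturbation" typically means here) is
\[
-\varepsilon\, \partial_{x_1}^2 u_\varepsilon - \partial_{x_2}^2 u_\varepsilon = f \ \text{in } \Omega = (0,1)^2,
\qquad u_\varepsilon = 0 \ \text{on } \partial\Omega,
\]
where the diffusion degenerates in the $x_1$-direction as $\varepsilon \to 0$, and one seeks error estimates for the conforming $Q_1$ approximation that are uniform in $\varepsilon$.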
We consider logistic regression including two sets of discrete or categorical covariates that are missing at random (MAR) separately or simultaneously. We examine the asymptotic properties of two multiple imputation (MI) estimators, given in the study of Lee et al. (2023), for the parameters of the logistic regression model with both sets of discrete or categorical covariates that are MAR separately or simultaneously. The proposed estimated asymptotic variances of the two MI estimators address a limitation observed with Rubin's type estimated variances, which tend to underestimate the variances of the two MI estimators (Rubin, 1987). Simulation results demonstrate that our two proposed MI methods outperform the complete-case, semiparametric inverse probability weighting, random forest MI using chained equations, and stochastic approximation of expectation-maximization methods. To illustrate the methodology's practical application, we provide a real data example from a survey conducted in the Feng Chia night market in Taichung City, Taiwan.
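For reference, Rubin's classical combining rules pool $M$ completed-data estimates $\hat\theta^{(m)}$ with estimated variances $\widehat U^{(m)}$ as
\[
\bar\theta = \frac{1}{M}\sum_{m=1}^{M}\hat\theta^{(m)}, \qquad
T = \underbrace{\frac{1}{M}\sum_{m=1}^{M}\widehat U^{(m)}}_{W}
  + \Big(1 + \frac{1}{M}\Big)\underbrace{\frac{1}{M-1}\sum_{m=1}^{M}\big(\hat\theta^{(m)} - \bar\theta\big)^2}_{B},
\]
and it is this total variance $T$ that, per the abstract, can underestimate the true variances of the MI estimators in this setting, motivating the proposed alternative variance estimators.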
We present ParrotTTS, a modularized text-to-speech synthesis model that leverages disentangled self-supervised speech representations. It can train a multi-speaker variant effectively using transcripts from a single speaker. ParrotTTS adapts to a new language in a low-resource setup and generalizes to languages not seen while training the self-supervised backbone. Moreover, without training on bilingual or parallel examples, ParrotTTS can transfer voices across languages while preserving speaker-specific characteristics, e.g., synthesizing fluent Hindi speech using a French speaker's voice and accent. We present extensive results in monolingual and multilingual scenarios. ParrotTTS outperforms state-of-the-art multilingual TTS models while using only a fraction of the paired data required by the latter.
We study hypothesis testing under communication constraints, where each sample is quantized before being revealed to a statistician. Without communication constraints, it is well known that the sample complexity of simple binary hypothesis testing is characterized by the Hellinger distance between the distributions. We show that the sample complexity of simple binary hypothesis testing under communication constraints is at most a logarithmic factor larger than in the unconstrained setting, and that this bound is tight. We develop a polynomial-time algorithm that achieves this sample complexity. Our framework extends to robust hypothesis testing, where the distributions are corrupted in the total variation distance. Our proofs rely on a new reverse data processing inequality and a reverse Markov inequality, which may be of independent interest. For simple $M$-ary hypothesis testing, the sample complexity in the absence of communication constraints has a logarithmic dependence on $M$. We show that communication constraints can cause an exponential blow-up, leading to $\Omega(M)$ sample complexity even for adaptive algorithms.
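For reference, the unconstrained benchmark is the classical characterization
\[
n(p, q) \asymp \frac{1}{d_{\mathrm{H}}^2(p, q)}, \qquad
d_{\mathrm{H}}^2(p, q) = \frac{1}{2}\sum_x \big(\sqrt{p(x)} - \sqrt{q(x)}\big)^2,
\]
i.e., the number of samples needed to test $p$ against $q$ with constant error probability scales inversely with the squared Hellinger distance; the result above says quantization inflates this by at most a logarithmic factor.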
In this article, we study nonparametric inference for a covariate-adjusted regression function. This parameter captures the average association between a continuous exposure and an outcome after adjusting for other covariates. In particular, under certain causal conditions, this parameter corresponds to the average outcome had all units been assigned to a specific exposure level, known as the causal dose-response curve. We propose a debiased local linear estimator of the covariate-adjusted regression function, and demonstrate that our estimator converges pointwise to a mean-zero normal limit distribution. We use this result to construct asymptotically valid confidence intervals for function values and differences thereof. In addition, we use approximation results for the distribution of the supremum of an empirical process to construct asymptotically valid uniform confidence bands. Our methods do not require undersmoothing and permit the use of data-adaptive estimators of nuisance functions, and our estimator attains the optimal rate of convergence for a twice-differentiable function. We illustrate the practical performance of our estimator using numerical studies and an analysis of the effect of air pollution exposure on cardiovascular mortality.
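As a point of reference, here is a minimal local linear smoother in plain NumPy; it is a hypothetical stand-in for the base estimator only, since the paper's method additionally debiases and plugs in covariate-adjustment nuisance estimates:

```python
# Minimal local linear smoother (illustrative sketch, not the paper's
# debiased covariate-adjusted estimator).
import numpy as np

def local_linear(x0, X, Y, h):
    w = np.exp(-0.5 * ((X - x0) / h) ** 2)            # Gaussian kernel weights
    D = np.column_stack([np.ones_like(X), X - x0])    # design: [1, X - x0]
    WD = D * w[:, None]
    a, b = np.linalg.solve(D.T @ WD, WD.T @ Y)        # weighted least squares
    return a                                          # fitted value at x0

rng = np.random.default_rng(1)
X = rng.uniform(0, 1, 500)
Y = np.sin(2 * np.pi * X) + 0.1 * rng.standard_normal(500)
print(local_linear(0.5, X, Y, h=0.05))                # approx sin(pi) = 0
```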
With the increasing demand for intelligent systems capable of operating in different contexts (e.g., users on the move), the correct interpretation of the user's need has become crucial for such systems to give consistent answers to user questions. The most effective applications addressing this task lie in the fields of natural language processing and semantic expansion of terms. These techniques estimate the goal of an input query by reformulating it as an intent, commonly relying on textual resources built by exploiting semantic relations such as \emph{synonymy}, \emph{antonymy}, and many others. The aim of this paper is to generate such resources using the labels of a given taxonomy as the source of information. The obtained resources are integrated into a plain classifier that reformulates a set of input queries as intents while tracking the effect of each relation, in order to quantify the impact of each semantic relation on the classification. As an extension, we evaluate the best trade-off between improvement and noise introduction when combining such relations. The assessment is made by generating the resources and their combinations and using them to tune the classifier, which reformulates the user questions as labels. The evaluation employs a wide and varied taxonomy as a use case, exploiting its labels as the basis for the semantic expansion and producing several corpora with the purpose of enhancing the pseudo-query estimation.
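To illustrate the kind of relation-based expansion described above, here is a small sketch using WordNet as an example lexical resource; the substitution of WordNet for the paper's taxonomy-derived resources is ours, purely for illustration:

```python
# Sketch of relation-based term expansion (synonymy / antonymy), using
# WordNet as a stand-in lexical resource; requires nltk.download('wordnet').
from nltk.corpus import wordnet as wn

def expand(term, relation="synonymy"):
    out = set()
    for syn in wn.synsets(term):
        for lemma in syn.lemmas():
            if relation == "synonymy":
                out.add(lemma.name())
            elif relation == "antonymy":
                out.update(a.name() for a in lemma.antonyms())
    return sorted(out - {term})

print(expand("cheap"))               # synonym-based expansion
print(expand("cheap", "antonymy"))   # antonym-based expansion, e.g. 'expensive'
```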