In this paper we study and relate several invariants connected to the solving degree of a polynomial system. This provides a rigorous framework for estimating the complexity of solving a system of polynomial equations via Groebner bases methods. Our main results include a connection between the solving degree and the last fall degree and one between the degree of regularity and the Castelnuovo-Mumford regularity.
Let $(Z_n)_{n\geq0}$ be a supercritical Galton-Watson process. The Lotka-Nagaev estimator $Z_{n+1}/Z_n$ is a common estimator for the offspring mean.In this paper, we establish some Cram\'{e}r moderate deviation results for the Lotka-Nagaev estimator via a martingale method. Applications to construction of confidence intervals are also given.
This paper deals with the grouped variable selection problem. A widely used strategy is to augment the negative log-likelihood function with a sparsity-promoting penalty. Existing methods include the group Lasso, group SCAD, and group MCP. The group Lasso solves a convex optimization problem but is plagued by underestimation bias. The group SCAD and group MCP avoid this estimation bias but require solving a nonconvex optimization problem that may be plagued by suboptimal local optima. In this work, we propose an alternative method based on the generalized minimax concave (GMC) penalty, which is a folded concave penalty that maintains the convexity of the objective function. We develop a new method for grouped variable selection in linear regression, the group GMC, that generalizes the strategy of the original GMC estimator. We present an efficient algorithm for computing the group GMC estimator and also prove properties of the solution path to guide its numerical computation and tuning parameter selection in practice. We establish error bounds for both the group GMC and original GMC estimators. A rich set of simulation studies and a real data application indicate that the proposed group GMC approach outperforms existing methods in several different aspects under a wide array of scenarios.
We consider the extra degree of freedom offered by the rotation of the reconfigurable intelligent surface (RIS) plane and investigate its potential in improving the performance of RIS-assisted wireless communication systems. By considering radiation pattern modeling at all involved nodes, we first derive the composite channel gain and present a closed-form upper bound for the system ergodic capacity over cascade Rician fading channels. Then, we reconstruct the composite channel gain by taking the rotations at the RIS plane, transmit antenna, and receive antenna into account, and extract the optimal rotation angles after investigating their impacts on the capacity. Moreover, we present a location-dependent expression of the ergodic capacity and investigate the RIS deployment strategy, i.e. the joint rotation adjustment and location selection. Finally, simulation results verify the accuracy of the theoretical analyses and deployment strategy. Although the RIS location has a big impact on the performance, our results showcase that the RIS rotation plays a more important role. In other words, we can obtain a considerable improvement by properly rotating the RIS rather than moving it over a wide area. For instance, we can achieve more than 200\% performance improvement through rotating the RIS by 42.14$^{\circ}$, while an 150\% improvement is obtained by shifting the RIS over 400 meters.
In this study, we focus on nonlinear compression methods in spectral features for speaker verification based on deep neural network. We consider different kinds of channel-dependent (CD) nonlinear compression methods optimized in a data-driven manner. Our methods are based on power nonlinearities and dynamic range compression (DRC). We also propose multi-regime (MR) design on the nonlinearities, at improving robustness. Results on VoxCeleb1 and VoxMovies data demonstrate improvements brought by proposed compression methods over both the commonly-used logarithm and their static counterparts, especially for ones based on power function. While CD generalization improves performance on VoxCeleb1, MR provides more robustness on VoxMovies, with a maximum relative equal error rate reduction of 21.6%.
The Bochner integral is a generalization of the Lebesgue integral, for functions taking their values in a Banach space. Therefore, both its mathematical definition and its formalization in the Coq proof assistant are more challenging as we cannot rely on the properties of real numbers. Our contributions include an original formalization of simple functions, Bochner integrability defined by a dependent type, and the construction of the proof of the integrability of measurable functions under mild hypotheses (weak separability). Then, we define the Bochner integral and prove several theorems, including dominated convergence and the equivalence with an existing formalization of Lebesgue integral for nonnegative functions.
Heged\H{u}s's lemma is the following combinatorial statement regarding polynomials over finite fields. Over a field $\F$ of characteristic $p > 0$ and for $q$ a power of $p$, the lemma says that any multilinear polynomial $P\in \F[x_1,\ldots,x_n]$ of degree less than $q$ that vanishes at all points in $\{0,1\}^n$ of Hamming weight $k\in [q,n-q]$ must also vanish at all points in $\{0,1\}^n$ of weight $k + q$. This lemma was used by Heged\H{u}s (2009) to give a solution to \emph{Galvin's problem}, an extremal problem about set systems; by Alon, Kumar and Volk (2018) to improve the best-known multilinear circuit lower bounds; and by Hrube\v{s}, Ramamoorthy, Rao and Yehudayoff (2019) to prove optimal lower bounds against depth-$2$ threshold circuits for computing some symmetric functions. In this paper, we formulate a robust version of Heged\H{u}s's lemma. Informally, this version says that if a polynomial of degree $o(q)$ vanishes at most points of weight $k$, then it vanishes at many points of weight $k+q$. We prove this lemma and give three different applications.
Graph Convolutional Networks (GCNs) show promising results for semi-supervised learning tasks on graphs, thus become favorable comparing with other approaches. Despite the remarkable success of GCNs, it is difficult to train GCNs with insufficient supervision. When labeled data are limited, the performance of GCNs becomes unsatisfying for low-degree nodes. While some prior work analyze successes and failures of GCNs on the entire model level, profiling GCNs on individual node level is still underexplored. In this paper, we analyze GCNs in regard to the node degree distribution. From empirical observation to theoretical proof, we confirm that GCNs are biased towards nodes with larger degrees with higher accuracy on them, even if high-degree nodes are underrepresented in most graphs. We further develop a novel Self-Supervised-Learning Degree-Specific GCN (SL-DSGC) that mitigate the degree-related biases of GCNs from model and data aspects. Firstly, we propose a degree-specific GCN layer that captures both discrepancies and similarities of nodes with different degrees, which reduces the inner model-aspect biases of GCNs caused by sharing the same parameters with all nodes. Secondly, we design a self-supervised-learning algorithm that creates pseudo labels with uncertainty scores on unlabeled nodes with a Bayesian neural network. Pseudo labels increase the chance of connecting to labeled neighbors for low-degree nodes, thus reducing the biases of GCNs from the data perspective. Uncertainty scores are further exploited to weight pseudo labels dynamically in the stochastic gradient descent for SL-DSGC. Experiments on three benchmark datasets show SL-DSGC not only outperforms state-of-the-art self-training/self-supervised-learning GCN methods, but also improves GCN accuracy dramatically for low-degree nodes.
We study the link between generalization and interference in temporal-difference (TD) learning. Interference is defined as the inner product of two different gradients, representing their alignment. This quantity emerges as being of interest from a variety of observations about neural networks, parameter sharing and the dynamics of learning. We find that TD easily leads to low-interference, under-generalizing parameters, while the effect seems reversed in supervised learning. We hypothesize that the cause can be traced back to the interplay between the dynamics of interference and bootstrapping. This is supported empirically by several observations: the negative relationship between the generalization gap and interference in TD, the negative effect of bootstrapping on interference and the local coherence of targets, and the contrast between the propagation rate of information in TD(0) versus TD($\lambda$) and regression tasks such as Monte-Carlo policy evaluation. We hope that these new findings can guide the future discovery of better bootstrapping methods.
Robust cross-seasonal localization is one of the major challenges in long-term visual navigation of autonomous vehicles. In this paper, we exploit recent advances in semantic segmentation of images, i.e., where each pixel is assigned a label related to the type of object it represents, to solve the problem of long-term visual localization. We show that semantically labeled 3D point maps of the environment, together with semantically segmented images, can be efficiently used for vehicle localization without the need for detailed feature descriptors (SIFT, SURF, etc.). Thus, instead of depending on hand-crafted feature descriptors, we rely on the training of an image segmenter. The resulting map takes up much less storage space compared to a traditional descriptor based map. A particle filter based semantic localization solution is compared to one based on SIFT-features, and even with large seasonal variations over the year we perform on par with the larger and more descriptive SIFT-features, and are able to localize with an error below 1 m most of the time.
In this paper we introduce a covariance framework for the analysis of EEG and MEG data that takes into account observed temporal stationarity on small time scales and trial-to-trial variations. We formulate a model for the covariance matrix, which is a Kronecker product of three components that correspond to space, time and epochs/trials, and consider maximum likelihood estimation of the unknown parameter values. An iterative algorithm that finds approximations of the maximum likelihood estimates is proposed. We perform a simulation study to assess the performance of the estimator and investigate the influence of different assumptions about the covariance factors on the estimated covariance matrix and on its components. Apart from that, we illustrate our method on real EEG and MEG data sets. The proposed covariance model is applicable in a variety of cases where spontaneous EEG or MEG acts as source of noise and realistic noise covariance estimates are needed for accurate dipole localization, such as in evoked activity studies, or where the properties of spontaneous EEG or MEG are themselves the topic of interest, such as in combined EEG/fMRI experiments in which the correlation between EEG and fMRI signals is investigated.