Consider a connected graph $G$ and let $T$ be a spanning tree of $G$. Every edge $e \in G-T$ induces a cycle in $T \cup \{e\}$. The intersection of two distinct such cycles is the set of edges of $T$ that belong to both cycles. The MSTCI problem consists in finding a spanning tree with the least number of such non-empty intersections, and the intersection number is the number of non-empty intersections of a solution. In this article we consider three aspects of the problem in a general context (i.e. for arbitrary connected graphs). The first presents two lower bounds on the intersection number. The second compares the intersection numbers of graphs that differ in one edge. The last is an attempt to generalize a recent result for graphs with a universal vertex.
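As an illustration of the quantity involved (not of the paper's results), the following sketch counts the non-empty pairwise intersections of the fundamental cycles induced by a given spanning tree; the graph, tree, and function name are illustrative assumptions.

```python
import itertools
import networkx as nx

def intersection_count(G, T):
    """Count pairs of fundamental cycles (one per non-tree edge of G) whose
    sets of tree edges have a non-empty intersection."""
    tree_edge_sets = []
    for u, v in G.edges():
        if not T.has_edge(u, v):
            path = nx.shortest_path(T, u, v)  # unique tree path joining u and v
            tree_edge_sets.append({frozenset(e) for e in zip(path, path[1:])})
    return sum(1 for c1, c2 in itertools.combinations(tree_edge_sets, 2) if c1 & c2)

# small illustrative example: a 5-cycle with one chord and an arbitrary spanning tree
G = nx.cycle_graph(5)
G.add_edge(0, 2)
T = nx.minimum_spanning_tree(G)
print(intersection_count(G, T))
```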
We treat the problem of evaluating the Frobenius distance from a given matrix $ A \in \mathbb R^{n\times n} $ with distinct eigenvalues to the manifold of matrices with multiple eigenvalues. Restricting consideration to rank-$1$ real perturbation matrices, we prove that the distance in question equals $ \sqrt{z_{\ast}} $, where $ z_{\ast} $ is a positive (generically, the least positive) zero of the algebraic equation $$ \mathcal F(z) = 0, \ \mbox{where} \ \mathcal F(z):= \mathcal D_{\lambda} \left( \det \left[ (\lambda I - A)(\lambda I - A^{\top})-z I_n \right] \right)/z^n $$ and $ \mathcal D_{\lambda} $ stands for the discriminant of the polynomial treated with respect to $\lambda $. Within this approach we also provide a procedure for finding the matrix with a multiple eigenvalue nearest to $ A $. A generalization of the problem to the case of complex perturbations is also discussed. Several examples are presented clarifying the computational aspects of the approach.
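A minimal SymPy sketch of this computation for a hypothetical $2\times 2$ example matrix (both the matrix and the numerical root filtering are illustrative assumptions):

```python
import sympy as sp

A = sp.Matrix([[0, 1], [-2, 3]])          # hypothetical matrix with distinct eigenvalues
n = A.shape[0]
lam, z = sp.symbols('lambda z')

# det[(lambda*I - A)(lambda*I - A^T) - z*I_n]
P = sp.expand(((lam * sp.eye(n) - A) * (lam * sp.eye(n) - A.T) - z * sp.eye(n)).det())

# F(z) = D_lambda(P) / z^n, the discriminant taken with respect to lambda
F = sp.cancel(sp.discriminant(P, lam) / z**n)

# positive real zeros of F(z); the distance is the square root of the relevant zero
roots = sp.Poly(F, z).nroots()
positive = sorted(float(sp.re(r)) for r in roots
                  if abs(sp.im(r)) < 1e-9 and sp.re(r) > 0)
print(F)
print([zr ** 0.5 for zr in positive])     # candidate values of sqrt(z_*)
```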
In this paper, we study upper bounds on the minimum length of frameproof codes, introduced by Boneh and Shaw to protect copyrighted materials. A $q$-ary $(k,n)$-frameproof code of length $t$ is a $t \times n$ matrix with entries in $\{0,1,\ldots, q-1\}$ and with the property that for any column $\mathbf{c}$ and any other $k$ columns, there exists a row where the symbols of the $k$ columns are all different from the corresponding symbol (in the same row) of the column $\mathbf{c}$. Specifically, we show the existence of $q$-ary $(k,n)$-frameproof codes of length $t = O(\frac{k^2}{q} \log n)$ for $q \leq k$, using the Lov\'asz Local Lemma, and of length $t = O(\frac{k}{\log(q/k)}\log(n/k))$ for $q > k$, using the expurgation method. Remarkably, for the practical case of $q \leq k$ our findings give codes whose length almost matches the lower bound $\Omega(\frac{k^2}{q\log k} \log n)$ on the length of any $q$-ary $(k,n)$-frameproof code and, more importantly, allow us to derive an algorithm of complexity $O(t n^2)$ for the construction of such codes.
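As a complement to the existence bounds, the defining property can be checked by brute force on small instances; the sketch below is such a check, with an assumed toy $3$-ary code matrix (it is not a construction matching the stated lengths).

```python
import itertools
import numpy as np

def is_frameproof(C, k):
    """Check that the t x n code matrix C is (k, n)-frameproof: for every
    column c and every choice of k other columns there is a row in which all
    k symbols differ from the symbol of c in that row."""
    t, n = C.shape
    for c in range(n):
        for S in itertools.combinations((j for j in range(n) if j != c), k):
            if not any(all(C[r, j] != C[r, c] for j in S) for r in range(t)):
                return False
    return True

# assumed toy example: a 3-ary code of length t = 3 with n = 4 columns
C = np.array([[0, 1, 2, 0],
              [1, 0, 0, 2],
              [2, 2, 1, 1]])
print(is_frameproof(C, k=2))
```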
This work sheds some light on the relationship between a distribution's standard deviation and its range, a topic that has been discussed extensively in the literature. While many previous studies have proposed inequalities or relationships that depend on the shape of the population distribution, the approach here is built on a family of bounded probability distributions based on skewing functions. We offer closed-form expressions for the moments of this family and its asymptotic behavior as the support's semi-range tends to zero and to $\infty$. We also establish an inequality of which the well-known Popoviciu inequality is a special case. Finally, we provide an example using US dollar prices in four different currencies traded on foreign exchange markets to illustrate the results developed here.
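For reference, the special case recovered here is Popoviciu's classical inequality relating the variance to the range: for a random variable $X$ whose support is contained in $[m, M]$, $$\operatorname{Var}(X) \leq \frac{(M-m)^2}{4},$$ with equality when $X$ places probability $1/2$ on each of the two endpoints; the more general inequality established in this work is not restated here.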
We consider identification and inference for the average treatment effect and the heterogeneous treatment effect conditional on observable covariates in the presence of unmeasured confounding. Since point identification of these effects is not achievable without strong assumptions, we obtain bounds on both the average and the heterogeneous treatment effect by leveraging differential effects, a tool that allows using a second treatment to learn the effect of the first treatment. The differential effect is the effect of using one treatment in lieu of the other; it can be identified in some observational studies in which treatments are not randomly assigned to units and differences in outcomes may therefore be due to biased assignment rather than treatment effects. With differential effects, we develop a flexible and easy-to-implement semi-parametric framework to estimate the bounds and establish asymptotic properties over the support for conducting statistical inference. We also provide conditions under which the causal estimands are point identified within the proposed framework. The proposed method is examined in a simulation study and two case studies using datasets from the National Health and Nutrition Examination Survey and the Youth Risk Behavior Surveillance System.
We present an approach for testing student learning outcomes in a course on automated reasoning using the Isabelle proof assistant. The approach allows us to test both general understanding of formal proofs in various logical proof systems and understanding of proofs in the higher-order logic of Isabelle/HOL in particular. The use of Isabelle enables almost automatic grading of large parts of the exam. We illustrate our approach through a number of example problems and explain why we believe that each kind of problem we have selected is an adequate measure of our intended learning outcomes. Finally, we discuss our experiences using the approach for the exam of a course on automated reasoning and suggest potential future work.
Electricity is traded on various markets with different time horizons and regulations. Short-term intraday trading is becoming increasingly important due to the growing penetration of renewables. In Germany, the intraday electricity price typically fluctuates around the day-ahead price of the European Power EXchange (EPEX) spot markets in a distinct hourly pattern. This work proposes a probabilistic modeling approach that models the intraday price difference with respect to the day-ahead contracts. The model captures the emerging hourly pattern by considering the four 15-minute intervals in each day-ahead price interval as a four-dimensional joint probability distribution. The resulting nontrivial, multivariate price difference distribution is learned using a normalizing flow, i.e., a deep generative model that combines conditional multivariate density estimation and probabilistic regression. Furthermore, this work discusses the influence of different external impact factors based on literature insights and an impact analysis using explainable artificial intelligence (XAI). The normalizing flow is compared to an informed selection of historical data and to probabilistic forecasts using a Gaussian copula and a Gaussian regression model. Among the different models, the normalizing flow identifies the trends with the highest accuracy and has the narrowest prediction intervals. Both the XAI analysis and the empirical experiments highlight that the immediate history of the price difference realization and the increments of the day-ahead price have the most substantial impact on the price difference.
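The sketch below shows, in PyTorch, the general shape of such a conditional normalizing flow for the four-dimensional price-difference vector: affine coupling layers whose scale and shift networks are conditioned on external features. The layer count, hidden sizes, and the assumed eight-dimensional conditioning vector are illustrative choices, not the exact architecture used in this work.

```python
import torch
import torch.nn as nn

class ConditionalCoupling(nn.Module):
    """Affine coupling layer: the second half of the 4-D vector is scaled and
    shifted by a network that sees the first half plus the conditioning features."""
    def __init__(self, dim=4, cond_dim=8, hidden=64):
        super().__init__()
        self.half = dim // 2
        self.net = nn.Sequential(
            nn.Linear(self.half + cond_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, 2 * (dim - self.half)),
        )

    def forward(self, x, cond):
        x1, x2 = x[:, :self.half], x[:, self.half:]
        s, t = self.net(torch.cat([x1, cond], dim=1)).chunk(2, dim=1)
        s = torch.tanh(s)                       # keep the scales numerically stable
        z2 = x2 * torch.exp(s) + t
        return torch.cat([x1, z2], dim=1), s.sum(dim=1)

class ConditionalFlow(nn.Module):
    """Stack of coupling layers with a fixed permutation in between; the
    conditional log-likelihood uses a standard normal base distribution."""
    def __init__(self, dim=4, cond_dim=8, n_layers=6):
        super().__init__()
        self.layers = nn.ModuleList(
            [ConditionalCoupling(dim, cond_dim) for _ in range(n_layers)])
        self.register_buffer("perm", torch.tensor([2, 3, 0, 1]))

    def log_prob(self, x, cond):
        z, log_det = x, torch.zeros(x.shape[0], device=x.device)
        for layer in self.layers:
            z, ld = layer(z, cond)
            z = z[:, self.perm]                 # swap halves between layers
            log_det = log_det + ld
        base = torch.distributions.Normal(0.0, 1.0)
        return base.log_prob(z).sum(dim=1) + log_det

# training step sketch: maximize the conditional log-likelihood
# flow = ConditionalFlow(); opt = torch.optim.Adam(flow.parameters(), lr=1e-3)
# loss = -flow.log_prob(price_diff_batch, feature_batch).mean(); loss.backward(); opt.step()
```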
This study develops a new statistical model and method for analyzing the precision of binary measurement methods from collaborative studies. The model is based on the beta-binomial distribution: it assumes that the sensitivity of each laboratory obeys a beta distribution and that the binary measured values under a given sensitivity follow a binomial distribution. We propose the key precision measures of repeatability and reproducibility for the model and provide their unbiased estimates. Further, by considering a number of statistical tests for homogeneity of proportions, we propose appropriate methods for determining laboratory effects in the new model. Finally, we apply the results to real-world examples in the fields of food safety and chemical risk assessment and management.
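A minimal simulation sketch of the underlying model (the repeatability and reproducibility estimators of this study are not reproduced here, and all numerical values below are assumed for illustration):

```python
import numpy as np
from scipy.optimize import minimize
from scipy.special import betaln, comb

def neg_loglik(params, x, m):
    """Negative beta-binomial log-likelihood; x[i] positives out of m replicates
    in laboratory i, with per-laboratory sensitivity ~ Beta(a, b)."""
    a, b = np.exp(params)                       # enforce a, b > 0
    ll = np.log(comb(m, x)) + betaln(x + a, m - x + b) - betaln(a, b)
    return -ll.sum()

rng = np.random.default_rng(1)
a_true, b_true, labs, m = 8.0, 2.0, 12, 20      # assumed illustrative values
x = rng.binomial(m, rng.beta(a_true, b_true, size=labs))

res = minimize(neg_loglik, x0=np.zeros(2), args=(x, m), method="Nelder-Mead")
a_hat, b_hat = np.exp(res.x)
print(f"fitted parameters: a = {a_hat:.2f}, b = {b_hat:.2f}")
print(f"mean sensitivity:  {a_hat / (a_hat + b_hat):.3f}")
print(f"between-lab variance of sensitivity: "
      f"{a_hat * b_hat / ((a_hat + b_hat) ** 2 * (a_hat + b_hat + 1)):.4f}")
```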
The heterogeneous, geographically distributed infrastructure of fog computing poses challenges for data replication, data distribution, and data mobility in fog applications. Fog computing is still missing the necessary abstractions to manage application data, so fog application developers need to re-implement data management for every new piece of software. Proposed solutions are limited to certain application domains, such as the IoT, are not flexible with regard to network topology, or do not provide the means for applications to control the movement of their data. In this paper, we present FReD, a data replication middleware for the fog. FReD serves as a building block for configurable fog data distribution and enables low-latency, high-bandwidth, and privacy-sensitive applications. FReD provides a common data access interface across heterogeneous infrastructure and network topologies, offers transparent and controllable data distribution, and can be integrated with applications from different domains. To evaluate our approach, we present a prototype implementation of FReD and show the benefits of developing with FReD through three case studies of fog computing applications.
The Erd\H{o}s-R\'enyi random graph is the simplest model for the node degree distribution, and it is one of the most widely studied. In this model, each pair of the $n$ vertices is connected independently at random with probability $p$; consequently, the degree of a given vertex follows a binomial distribution. If the number of vertices is large, the binomial can be approximated by the normal distribution using the Central Limit Theorem, which is commonly accepted when $\min (np, n(1-p)) > 5$. This holds for every node individually. However, because the degrees of the nodes in a graph are not independent, we aim in this paper to test whether the node degrees of the Erd\H{o}s-R\'enyi graph collectively follow a multivariate normal (MVN) distribution. A chi-square goodness-of-fit test of the hypothesis that the binomial distribution holds for the whole set of nodes is rejected because of the dependence between degrees. Before testing for MVN, we show that the covariance and correlation between the degrees of any pair of nodes in the graph are $p(1-p)$ and $1/(n-1)$, respectively. We test for MVN under two assumptions, independent and dependent degrees, and we base our results on the percentages of rejected chi-square statistics, the $p$-values of the Anderson-Darling test, and a CDF comparison. We consistently achieve a good fit to the multivariate normal distribution for large values of $n$ and $p$, and a very poor fit when $n$ or $p$ is very small. The approximation appears valid when $np \geq 10$. We also compare the maximum likelihood estimates of $p$ in the MVN distribution under the independence and dependence assumptions. The estimators are assessed using bias, variance, and mean squared error.
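A quick simulation sketch of the covariance and correlation claims (the graph size, edge probability, and number of replications are illustrative choices):

```python
import numpy as np
import networkx as nx

# empirical check of Cov(d_i, d_j) = p(1-p) and Corr(d_i, d_j) = 1/(n-1)
n, p, reps = 30, 0.3, 5000            # assumed illustrative parameters
degs = np.empty((reps, n))
for r in range(reps):
    G = nx.fast_gnp_random_graph(n, p, seed=r)
    degs[r] = [d for _, d in sorted(G.degree())]

C = np.cov(degs, rowvar=False)
corr = C[0, 1] / np.sqrt(C[0, 0] * C[1, 1])
print(f"empirical covariance  {C[0, 1]:.4f}  vs theoretical {p * (1 - p):.4f}")
print(f"empirical correlation {corr:.4f}  vs theoretical {1 / (n - 1):.4f}")
```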
Artificial Intelligence (AI) and its applications have sparked extraordinary interest in recent years. This achievement can be ascribed in part to advances in AI subfields including Machine Learning (ML), Computer Vision (CV), and Natural Language Processing (NLP). Deep learning, a subfield of machine learning that employs artificial neural network concepts, has enabled the most rapid growth in these domains. As a result, the integration of vision and language has attracted a lot of attention. The associated tasks have been designed so that they properly exemplify the concepts of deep learning. In this review paper, we provide a thorough and extensive review of state-of-the-art approaches and key model design principles, and discuss existing datasets, methods, problem formulations, and evaluation measures for visual question answering (VQA) and visual reasoning tasks, in order to understand vision-and-language representation learning. We also present some potential future paths in this field of research, with the hope that our study may generate new ideas and novel approaches to handle existing difficulties and develop new applications.