Scientific claims gain credibility through replicability, especially when replication under different circumstances and with varying designs yields equivalent results. Aggregating results over multiple studies is, however, not straightforward: as between-study heterogeneity increases, conventional methods such as (Bayesian) meta-analysis and Bayesian sequential updating become infeasible. *Bayesian Evidence Synthesis*, built upon the foundations of the Bayes factor, makes it possible to aggregate support for conceptually similar hypotheses over studies, regardless of methodological differences. We assess the performance of Bayesian Evidence Synthesis over multiple effect sizes and sample sizes, with a broad set of (inequality-constrained) hypotheses, using Monte Carlo simulations, focusing explicitly on the complexity of the hypotheses under consideration. The simulations show that this method can evaluate complex (informative) hypotheses regardless of methodological differences between studies, and performs adequately if the set of studies considered has sufficient statistical power. Additionally, we pinpoint challenging conditions that can lead to unsatisfactory results, and provide suggestions for handling these situations. Ultimately, we show that Bayesian Evidence Synthesis is a promising tool that can be used when traditional research synthesis methods are not applicable due to insurmountable between-study heterogeneity.
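The core aggregation step of such a Bayes-factor-based synthesis can be sketched in a few lines: per-study Bayes factors for one conceptually shared hypothesis (versus its complement) are combined by multiplication under the assumption of independent studies. The Bayes-factor values and the equal-prior-odds default below are hypothetical, for illustration only:

```python
from math import prod

def combined_bf(study_bfs):
    """Combine per-study Bayes factors for the same informative hypothesis
    (vs. its complement) by multiplication, assuming independent studies."""
    return prod(study_bfs)

def posterior_prob(bf, prior_odds=1.0):
    """Posterior probability of the hypothesis given the combined Bayes
    factor and the prior odds (default: equal prior odds)."""
    odds = bf * prior_odds
    return odds / (1.0 + odds)

# Three hypothetical studies, each mildly supporting the hypothesis:
bfs = [3.2, 1.8, 2.5]
overall_bf = combined_bf(bfs)        # approximately 14.4
overall_prob = posterior_prob(overall_bf)
```

Note how three individually modest Bayes factors can jointly yield strong support, which is the appeal of synthesizing evidence across methodologically diverse studies.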
Resonance-based numerical schemes exploit cancellations in the oscillatory components of the equation in order to reduce the regularity required of the initial data to achieve a particular order of error and convergence. We investigate the potential for deriving resonance-based schemes in the context of nonlinear stochastic PDEs. By comparing the regularity conditions required for error analysis with those of traditional exponential schemes, we demonstrate that at orders less than $ \mathcal{O}(t^2) $ the techniques are successful and provide a significant gain in the regularity of the initial data, while at orders greater than $ \mathcal{O}(t^2) $ the resonance-based techniques achieve no gain. This is due to limitations in the explicit path-wise analysis of stochastic integrals. As examples of applications of the method, we present schemes for the Schr\"odinger equation and the Manakov system, accompanied by local error and stability analyses as well as proofs of global convergence in both the strong and path-wise sense.
Complex system design problems, such as those involved in aerospace engineering, require the use of numerically costly simulation codes in order to predict the performance of the system to be designed. In this context, these codes are often embedded in an optimization process to provide the best design while satisfying the design constraints. Recently, new approaches, called Quality-Diversity, have been proposed to enhance the exploration of the design space and to provide a set of optimal solutions diversified with respect to some feature functions. These functions are useful for assessing trade-offs. Furthermore, complex design problems often involve mixed continuous, discrete, and categorical design variables, which make it possible to account for technological choices in the optimization problem. Existing Bayesian Quality-Diversity approaches suited to intensive high-fidelity simulations are not adapted to constrained optimization problems with mixed variables. To overcome these limitations, a new Quality-Diversity methodology based on a mixed-variables Bayesian optimization strategy is proposed in the context of a limited simulation budget. Using adapted covariance models and a dedicated enrichment strategy for the Gaussian processes in Bayesian optimization, this approach reduces the computational cost by up to two orders of magnitude with respect to classical Quality-Diversity approaches, while handling discrete choices and the presence of constraints. The performance of the proposed method is assessed on a benchmark of analytical problems as well as on two aerospace system design problems, highlighting its efficiency in terms of speed of convergence. The proposed approach provides valuable trade-offs for decision-makers in complex system design.
We propose a new mapping tool for supervised and unsupervised analysis of multivariate binary data with multiple items, questions, or response variables. The mapping assumes an underlying proximity response function, where participants can have multiple reasons to disagree or say ``no'' to a question. The probability of endorsing, or agreeing with, an item depends on an item-specific parameter and on the distance in a joint space between a point representing the item and a point representing the participant. The item-specific parameter defines a circle in the joint space around the location of the item such that for participants positioned within the circle the endorsement probability is larger than 0.5. For map estimation, we develop and test an MM algorithm in which the negative likelihood function is majorized with a weighted least squares function. The weighted least squares function can be minimized with standard algorithms for multidimensional unfolding, except that negative working dissimilarities may occur in the iterative process. To illustrate the new mapping, two empirical data sets are analyzed. The mappings are interpreted in detail, and the unsupervised map is compared to a visualization based on correspondence analysis. In a Monte Carlo study, we test the performance of the algorithm in terms of recovery of population parameters and conclude that this recovery is adequate. A second Monte Carlo study investigates the predictive performance of the new mapping compared to a similar mapping with a monotone response function.
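The proximity response function described above can be sketched with a logistic link in the person-item distance. The logistic form is our assumption for illustration; the key property is only that the item-specific parameter acts as a radius at which the endorsement probability crosses 0.5:

```python
from math import dist, exp

def endorse_prob(person, item, radius):
    """Proximity response: endorsement probability as a decreasing function
    of the person-item distance in the joint space. At distance == radius
    the probability is exactly 0.5; inside the circle it exceeds 0.5."""
    return 1.0 / (1.0 + exp(dist(person, item) - radius))

inside = endorse_prob((0.2, 0.0), (0.0, 0.0), radius=1.0)   # d = 0.2 < 1, so p > 0.5
outside = endorse_prob((3.0, 0.0), (0.0, 0.0), radius=1.0)  # d = 3.0 > 1, so p < 0.5
```

Unlike a monotone (dominance) response function, this probability falls off in every direction away from the item point, which captures the idea that a participant can disagree for multiple reasons.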
Analytical workflows in functional magnetic resonance imaging are highly flexible, with few established best practices for choosing a pipeline. While it has been shown that the use of different pipelines can lead to different results, there is still a lack of understanding of the factors that drive these differences and of the stability of these differences across contexts. We use community detection algorithms to explore the pipeline space and assess the stability of pipeline relationships across different contexts. We show that there are subsets of pipelines that give similar results, especially those sharing specific parameters (e.g., number of motion regressors, software package, etc.). These pipeline-to-pipeline patterns are stable across groups of participants but not across different tasks. By visualizing the differences between communities, we show that the pipeline space is mainly driven by the size of the activation area in the brain and by the scale of the values in the statistic maps.
The evaluation of text-generative vision-language models is a challenging yet crucial endeavor. By addressing the limitations of existing Visual Question Answering (VQA) benchmarks and proposing innovative evaluation methodologies, our research seeks to advance our understanding of these models' capabilities. We propose a novel VQA benchmark based on well-known visual classification datasets which allows a granular evaluation of text-generative vision-language models and their comparison with discriminative vision-language models. To improve the assessment of coarse answers on fine-grained classification tasks, we suggest using the semantic hierarchy of the label space to ask automatically generated follow-up questions about the ground-truth category. Finally, we compare traditional NLP and LLM-based metrics for the problem of evaluating model predictions given ground-truth answers. We perform a human evaluation study, on which we base our choice of the final metric. We apply our benchmark to a suite of vision-language models and provide a detailed comparison of their abilities on object, action, and attribute classification. Our contributions aim to lay the foundation for more precise and meaningful assessments, facilitating targeted progress in the exciting field of vision-language modeling.
We propose a hybrid iterative method for PDEs based on MIONet, which combines a traditional numerical iterative solver with the recent, powerful machine-learning method of neural operators, and we systematically analyze its theoretical properties, including the convergence condition, the spectral behavior, and the convergence rate, in terms of the errors of the discretization and of the model inference. We present theoretical results for the frequently used smoothers, i.e., Richardson (damped Jacobi) and Gauss-Seidel. We give an upper bound on the convergence rate of the hybrid method with respect to the model-correction period, which indicates the period that makes the hybrid iteration converge fastest. Several numerical examples, including the hybrid Richardson (Gauss-Seidel) iteration for the 1-d (2-d) Poisson equation, are presented to verify our theoretical results and also demonstrate an excellent acceleration effect. As a meshless acceleration method, it holds enormous potential for practical applications.
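A minimal sketch of such a hybrid scheme, using plain Richardson iteration on the 1-d Poisson equation: every `period` steps, the iterate is replaced by a surrogate model's prediction. A trained MIONet is outside the scope of a sketch, so the `model` below is a deliberately inexact hypothetical stand-in; the iteration structure is the point:

```python
def hybrid_richardson(matvec, b, model, omega, period, iters, x0):
    """Richardson iteration x <- x + omega*(b - A x), with the iterate
    replaced by the surrogate model's prediction every `period` steps
    (the model-correction step of the hybrid method)."""
    x = list(x0)
    for k in range(1, iters + 1):
        if k % period == 0:
            x = model(b)                      # model correction
        else:
            r = [bi - yi for bi, yi in zip(b, matvec(x))]
            x = [xi + omega * ri for xi, ri in zip(x, r)]
    return x

# 1-d Poisson -u'' = 1 on (0, 1), zero boundary; exact u(s) = s(1 - s)/2.
n = 15
h = 1.0 / (n + 1)
A = lambda x: [(2 * x[i] - (x[i - 1] if i else 0.0)
                - (x[i + 1] if i < n - 1 else 0.0)) / h**2 for i in range(n)]
b = [1.0] * n
exact = [s * (1 - s) / 2 for s in ((i + 1) * h for i in range(n))]
model = lambda rhs: [0.9 * u for u in exact]  # hypothetical inexact surrogate
x = hybrid_richardson(A, b, model, omega=h**2 / 4, period=10, iters=55, x0=[0.0] * n)
```

The correction period trades off the cost of model inference against the smoothing done by the classical iteration, which is exactly the quantity the convergence-rate bound is expressed in.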
We consider the task of constructing confidence intervals with differential privacy. We propose two private variants of the non-parametric bootstrap, which privately compute the median of the results of multiple ``little'' bootstraps run on partitions of the data, and give asymptotic bounds on the coverage error of the resulting confidence intervals. For a fixed differential privacy parameter $\epsilon$, our methods enjoy the same error rates as those of the non-private bootstrap to within logarithmic factors in the sample size $n$. We empirically validate the performance of our methods for mean estimation, median estimation, and logistic regression with both real and synthetic data. Our methods achieve similar coverage accuracy to existing methods (and non-private baselines) while providing notably shorter ($\gtrsim 10$ times) confidence intervals than previous approaches.
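The privacy-critical step, computing a differentially private median of the per-partition "little"-bootstrap results, can be sketched with the standard exponential mechanism over a grid, assuming the statistics are known to lie in a bounded range. The grid, range, and stand-in partition results below are illustrative assumptions, not the paper's exact construction:

```python
import random
from math import exp

def dp_median(values, lo, hi, eps, grid=200, rng=random):
    """Exponential-mechanism median for values assumed to lie in [lo, hi].
    The utility of candidate c is -|#{v < c} - n/2|; changing one value
    shifts it by at most 1 (sensitivity 1), so sampling with weights
    exp(eps * utility / 2) satisfies eps-differential privacy."""
    n = len(values)
    step = (hi - lo) / grid
    candidates = [lo + (i + 0.5) * step for i in range(grid)]
    def utility(c):
        return -abs(sum(v < c for v in values) - n / 2)
    weights = [exp(eps * utility(c) / 2) for c in candidates]
    return rng.choices(candidates, weights=weights, k=1)[0]

# Median of per-partition estimates (hypothetical stand-in results):
partition_stats = [i / 9 for i in range(10)]
private_est = dp_median(partition_stats, 0.0, 1.0, eps=1.0,
                        rng=random.Random(0))
```

Because only the median of the partition results is released, each individual's data influences just one partition, which is what keeps the sensitivity, and hence the noise, small.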
Mendelian randomization (MR) is an instrumental variable (IV) approach for inferring causal relationships between exposures and outcomes from genome-wide association study (GWAS) summary data. However, the multivariable inverse-variance weighting (IVW) approach, which serves as the foundation for most MR methods, cannot yield unbiased causal effect estimates in the presence of many weak IVs. To address this problem, we propose MR using a Bias-corrected Estimating Equation (MRBEE), which can infer unbiased causal relationships with many weak IVs while simultaneously accounting for horizontal pleiotropy. While the practical significance of MRBEE was demonstrated in our parallel work (Lorincz-Comi (2023)), this paper establishes the statistical theory of multivariable IVW and MRBEE with many weak IVs. First, we show that the bias of the multivariable IVW estimate is caused by error-in-variables bias, whose scale is inflated by weak instrument bias and whose direction is influenced by the sample overlap between the exposure and outcome GWAS cohorts. Second, we investigate the asymptotic properties of multivariable IVW and MRBEE, showing that MRBEE outperforms multivariable IVW in terms of unbiasedness of causal effect estimation and asymptotic validity of causal inference. Finally, we apply MRBEE to examine myopia and reveal that education and outdoor activity are causal to myopia whereas indoor activity is not.
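For intuition, the univariable version of the IVW estimator that the multivariable theory generalizes is simply a weighted regression through the origin of outcome associations on exposure associations. This is a textbook illustration with hypothetical summary statistics, not the paper's multivariable estimator:

```python
def ivw_estimate(beta_x, beta_y, se_y):
    """Univariable inverse-variance-weighted causal effect estimate:
    regress outcome GWAS associations (beta_y) on exposure GWAS
    associations (beta_x) through the origin, with weights 1/se_y**2.
    Measurement error in beta_x (many weak IVs) biases this estimate --
    the error-in-variables problem that a bias correction addresses."""
    w = [1.0 / s ** 2 for s in se_y]
    num = sum(wi * bx * by for wi, bx, by in zip(w, beta_x, beta_y))
    den = sum(wi * bx * bx for wi, bx in zip(w, beta_x))
    return num / den

# Hypothetical summary statistics for three independent instruments:
est = ivw_estimate([0.10, 0.20, 0.15], [0.05, 0.10, 0.075], [0.01, 0.01, 0.01])
```

When the exposure associations are estimated with non-negligible noise, as with many weak IVs, this estimator attenuates toward zero, which motivates the bias-corrected estimating equation.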
The characterization of complex networks with tools originating in geometry, for instance through the statistics of so-called Ricci curvatures, is well established in network science. There exist various types of such Ricci curvatures, capturing different aspects of network geometry. In the present work, we investigate Bakry-\'Emery-Ricci curvature, a notion of discrete Ricci curvature that has been studied extensively in geometry but has so far not been applied to networks. We explore, on standard classes of artificial networks as well as on selected empirical ones, to what extent the statistics of this curvature are similar to or different from those of other curvatures, how it correlates with other important network measures, and what it tells us about the underlying network. We observe that most vertices typically have negative curvature. Random and small-world networks exhibit a narrow curvature distribution, whereas other classes and most of the real-world networks possess a wide curvature distribution. When we compare Bakry-\'Emery-Ricci curvature with two other discrete notions of Ricci curvature, Forman-Ricci and Ollivier-Ricci curvature, for both model and real-world networks, we observe a high positive correlation between Bakry-\'Emery-Ricci curvature and both Forman-Ricci and Ollivier-Ricci curvature, in particular with the augmented version of Forman-Ricci curvature. Bakry-\'Emery-Ricci curvature also exhibits a high negative correlation with vertex centrality measures and degree for most of the model and real-world networks. However, it does not correlate with the clustering coefficient. We also investigate the importance of vertices with highly negative curvature values for maintaining communication in the network. The computational time for Bakry-\'Emery-Ricci curvature is shorter than that required for Ollivier-Ricci curvature but longer than for augmented Forman-Ricci curvature.
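For reference, the two comparison curvatures are straightforward to compute on an unweighted graph; Bakry-Émery-Ricci curvature itself requires solving an eigenvalue problem per vertex and is omitted here. The formulas below are the standard unweighted combinatorial special cases, with the graph given as an adjacency dict of neighbor sets:

```python
def forman(adj, u, v):
    """Forman-Ricci curvature of edge (u, v) in an unweighted graph:
    4 - deg(u) - deg(v)."""
    return 4 - len(adj[u]) - len(adj[v])

def augmented_forman(adj, u, v):
    """Augmented Forman-Ricci curvature: additionally counts triangles,
    contributing +3 per triangle supported on the edge."""
    return forman(adj, u, v) + 3 * len(adj[u] & adj[v])

# A triangle: every edge has Forman curvature 0 and augmented curvature 3.
triangle = {0: {1, 2}, 1: {0, 2}, 2: {0, 1}}
```

These local formulas explain the runtime ordering reported above: augmented Forman curvature needs only degrees and common neighbors, whereas Ollivier-Ricci curvature requires solving an optimal transport problem per edge.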
Conventional entity typing approaches are based on independent classification paradigms, which makes it difficult for them to recognize inter-dependent, long-tailed, and fine-grained entity types. In this paper, we argue that the implicitly entailed extrinsic and intrinsic dependencies between labels can provide critical knowledge for tackling these challenges. To this end, we propose the \emph{Label Reasoning Network (LRN)}, which sequentially reasons over fine-grained entity labels by discovering and exploiting the label-dependency knowledge entailed in the data. Specifically, LRN utilizes an auto-regressive network to conduct deductive reasoning and a bipartite attribute graph to conduct inductive reasoning between labels, which can effectively model, learn, and reason over complex label dependencies in a sequence-to-set, end-to-end manner. Experiments show that LRN achieves state-of-the-art performance on standard ultra-fine-grained entity typing benchmarks, and can also effectively resolve the long-tailed label problem.