We show that the anti-ferromagnetic Potts model on trees exhibits strong spatial mixing for a near-optimal range of parameters. Our work complements recent results of Chen, Liu, Mani, and Moitra [arXiv.2304.01954] who showed this to be true in the infinite temperature setting, corresponding to uniform proper colorings. We furthermore prove weak spatial mixing results complementing results in [arXiv.2304.01954].
Fine-tuning large pre-trained models has become the de facto strategy for developing both task-specific and general-purpose machine learning systems, including developing models that are safe to deploy. Despite its clear importance, there has been minimal work that explains how fine-tuning alters the underlying capabilities learned by a model during pretraining: does fine-tuning yield entirely novel capabilities or does it just modulate existing ones? We address this question empirically in synthetic, controlled settings where we can use mechanistic interpretability tools (e.g., network pruning and probing) to understand how the model's underlying capabilities are changing. We perform an extensive analysis of the effects of fine-tuning in these settings, and show that: (i) fine-tuning rarely alters the underlying model capabilities; (ii) a minimal transformation, which we call a 'wrapper', is typically learned on top of the underlying model capabilities, creating the illusion that they have been modified; and (iii) further fine-tuning on a task where such hidden capabilities are relevant leads to sample-efficient 'revival' of the capability, i.e., the model begins reusing this capability after only a few gradient steps. This indicates that practitioners can unintentionally remove a model's safety wrapper merely by fine-tuning it on a downstream task, even a superficially unrelated one. We additionally perform analysis on language models trained on the TinyStories dataset to support our claims in a more realistic setup.
This paper describes an exact solution to the drag-based adjoint Euler equations in two and three dimensions that is valid for irrotational flows.
The tasks of automatic lyrics transcription and lyrics alignment have witnessed significant performance improvements in the past few years. However, most previous work focuses only on English, for which large-scale datasets are available. In this paper, we address lyrics transcription and alignment of polyphonic Mandarin pop music in a low-resource setting. To deal with the data scarcity issue, we adapt the pretrained Whisper model and fine-tune it on a monophonic Mandarin singing dataset. With the use of data augmentation and a source separation model, results show that the proposed method achieves a character error rate of less than 18% on a Mandarin polyphonic dataset for lyrics transcription, and a mean absolute error of 0.071 seconds for lyrics alignment. Our results demonstrate the potential of adapting a pretrained speech model for lyrics transcription and alignment in low-resource scenarios.
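The two metrics quoted in the abstract above can be illustrated with a minimal sketch. It assumes the character error rate is the Levenshtein distance between reference and hypothesis, divided by the reference length, and that the alignment error is averaged over paired onset times in seconds. The function names are hypothetical and the authors' evaluation code may differ in detail (for example in text normalization).

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (characters here)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # delete a reference character
                            curr[j - 1] + 1,      # insert a hypothesis character
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


def character_error_rate(ref, hyp):
    """Edit distance normalized by the reference length (assumed definition)."""
    if not ref:
        raise ValueError("reference must be non-empty")
    return edit_distance(ref, hyp) / len(ref)


def mean_absolute_error(ref_times, pred_times):
    """Mean absolute deviation between paired onset times, in seconds."""
    if not ref_times or len(ref_times) != len(pred_times):
        raise ValueError("need two non-empty lists of equal length")
    return sum(abs(r - p) for r, p in zip(ref_times, pred_times)) / len(ref_times)
```

For Mandarin, operating on characters rather than words avoids the need for word segmentation, which is one common reason to report a character-level rate.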
This paper studies the impact of the bootstrap procedure on the eigenvalue distributions of the sample covariance matrix under a high-dimensional factor structure. We provide asymptotic distributions for the top eigenvalues of the bootstrapped sample covariance matrix under mild conditions. After bootstrap, the spiked eigenvalues, which are driven by common factors, converge weakly to Gaussian limits after proper scaling and centralization. However, the largest non-spiked eigenvalue is mainly determined by the order statistics of the bootstrap resampling weights, and follows an extreme value distribution. Based on the disparate behavior of the spiked and non-spiked eigenvalues, we propose innovative methods to test the number of common factors. Extensive numerical and empirical studies indicate that the proposed methods perform reliably and convincingly in the presence of both weak factors and cross-sectionally correlated errors. Our technical details contribute to random matrix theory on the spiked covariance model with convexly decaying density and unbounded support, or with general elliptical distributions.
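As a rough illustration of the objects discussed in the abstract above, the sketch below forms a bootstrapped sample covariance matrix from multinomial resampling weights and extracts its largest eigenvalue by power iteration. It assumes centered data rows and the standard nonparametric bootstrap. All function names are hypothetical, and none of this reproduces the paper's test statistics.

```python
import random


def bootstrap_weights(n, rng):
    """Multinomial resampling counts: how often each of n rows is drawn."""
    counts = [0] * n
    for _ in range(n):
        counts[rng.randrange(n)] += 1
    return counts


def weighted_covariance(rows, weights):
    """S* = (1/n) * sum_i w_i x_i x_i^T, assuming the rows are centered."""
    n, p = len(rows), len(rows[0])
    cov = [[0.0] * p for _ in range(p)]
    for x, w in zip(rows, weights):
        if w == 0:
            continue  # rows never drawn do not contribute
        for a in range(p):
            for b in range(p):
                cov[a][b] += w * x[a] * x[b] / n
    return cov


def top_eigenvalue(mat, iters=200):
    """Largest eigenvalue of a symmetric PSD matrix via power iteration.

    The uniform start vector is an arbitrary choice; it fails only if it is
    exactly orthogonal to the leading eigenvector.
    """
    p = len(mat)
    v = [1.0 / p ** 0.5] * p
    for _ in range(iters):
        w = [sum(mat[a][b] * v[b] for b in range(p)) for a in range(p)]
        norm = sum(c * c for c in w) ** 0.5
        if norm == 0.0:
            return 0.0
        v = [c / norm for c in w]
    # Rayleigh quotient of the (unit-norm) final iterate
    return sum(v[a] * mat[a][b] * v[b] for a in range(p) for b in range(p))
```

Repeating `bootstrap_weights` and `top_eigenvalue` over many draws gives an empirical distribution of the bootstrapped top eigenvalue. Because the weights enter the covariance directly, a few heavily resampled rows can dominate it, which is consistent with the role of the weights' order statistics described above.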
Ecological Momentary Assessments (EMA) capture real-time thoughts and behaviors in natural settings, producing rich longitudinal data for statistical and physiological analyses. However, the robustness of these analyses can be compromised by the large amount of missing data in EMA data sets. To address this, multiple imputation, a method that replaces missing values with several plausible alternatives, has become increasingly popular. In this paper, we introduce a two-step Bayesian multiple imputation framework which leverages the configuration of mixed models. We adopt the Random Intercept Linear Mixed model, the Mixed-effect Location Scale model which accounts for subject variance influenced by covariates and random effects, and the Shared Parameter Location Scale Mixed Effect model which links the missing data to the response variable through a random intercept logistic model, to complete the posterior distribution within the framework. In the simulation study and an application on data from a study on caregivers of dementia patients, we further adapt this two-step Bayesian multiple imputation strategy to handle simultaneous missing variables in EMA data sets and compare the effectiveness of multiple imputations across different mixed models. The analyses highlight the advantages of multiple imputations over single imputations. Furthermore, we propose two pivotal considerations in selecting the optimal mixed model for the two-step imputation: the influence of covariates as well as random effects on the within-variance, and the nature of missing data in relation to the response variable.
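To make the advantage of multiple over single imputation concrete, the sketch below pools estimates from several imputed data sets with Rubin's rules, the standard combining rule for multiple imputation. Using this rule here is an assumption, since the abstract does not state how results are pooled, and the function name is hypothetical.

```python
from statistics import mean, variance


def pool_rubin(estimates, variances):
    """Combine per-imputation estimates and their variances (Rubin's rules).

    estimates: point estimate from each of the m imputed data sets
    variances: squared standard error of each estimate
    Returns (pooled estimate, total variance).
    """
    m = len(estimates)
    if m < 2 or m != len(variances):
        raise ValueError("need at least two imputations with matching variances")
    pooled = mean(estimates)
    within = mean(variances)       # average sampling variance
    between = variance(estimates)  # spread across imputations (m - 1 denominator)
    total = within + (1.0 + 1.0 / m) * between
    return pooled, total
```

A single imputation yields only the within-imputation term, so it understates uncertainty whenever the imputed values disagree. The between-imputation term is what multiple imputation adds.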
The accurate classification of lymphoma subtypes using hematoxylin and eosin (H&E)-stained tissue is complicated by the wide range of morphological features these cancers can exhibit. We present LymphoML, an interpretable machine learning method that identifies morphologic features that correlate with lymphoma subtypes. Our method applies steps to process H&E-stained tissue microarray cores, segment nuclei and cells, compute features encompassing morphology, texture, and architecture, and train gradient-boosted models to make diagnostic predictions. LymphoML's interpretable models, developed on a limited volume of H&E-stained tissue, achieve non-inferior diagnostic accuracy to pathologists using whole-slide images and outperform black-box deep learning on a dataset of 670 cases from Guatemala spanning 8 lymphoma subtypes. Using SHapley Additive exPlanation (SHAP) analysis, we assess the impact of each feature on model prediction and find that nuclear shape features are most discriminative for DLBCL (F1-score: 78.7%) and classical Hodgkin lymphoma (F1-score: 74.5%). Finally, we provide the first demonstration that a model combining features from H&E-stained tissue with features from a standardized panel of 6 immunostains results in a similar diagnostic accuracy (85.3%) to a 46-stain panel (86.1%).
A fully discrete semi-convex-splitting finite-element scheme with stabilization for a degenerate Cahn-Hilliard cross-diffusion system is analyzed. The system consists of parabolic fourth-order equations for the volume fraction of the fiber phase and the solute concentration, modeling pre-patterning of lymphatic vessel morphology. The existence of discrete solutions is proved, and it is shown that the numerical scheme is energy stable up to stabilization, conserves the solute mass, and preserves the lower and upper bounds of the fiber phase fraction. Numerical experiments in two space dimensions using FreeFEM illustrate the phase segregation and pattern formation.
We present an unstructured geometrical Volume-of-Fluid (VOF) method for handling two-phase flows with a viscoelastic liquid phase. The viscoelastic behavior is modeled via generic rate-type constitutive equations. A one-field formulation is employed, which results from conditional volume averaging of the local instantaneous bulk equations and interface jump conditions. The method builds on the 'plicRDF-isoAdvector' geometrical VOF solver, which is extended and combined with the modular framework 'DeboRheo' for viscoelastic CFD. A piecewise-linear geometrical interface reconstruction technique on general unstructured meshes is employed for discretizing the viscoelastic stresses across the fluid interface in a numerically consistent and highly accurate way. Because of the numerical challenges posed by the high Weissenberg number problem, we apply an appropriate stabilization approach to the constitutive equation of the viscoelastic phase to increase the robustness of the method at higher fluid elasticity. DeboRheo facilitates a flexible combination of different rheological models with appropriate stabilization methods to address the high Weissenberg number problem. We discuss the theoretical formulation and implementation of the proposed method and demonstrate its effectiveness using numerical examples of viscoelastic flows. The results highlight the method's ability to accurately capture the behavior of viscoelastic fluids in various applications. The proposed method holds promise for furthering our understanding and predictive capabilities of viscoelastic flows in various industrial and natural processes.
We study the stability of the Lanczos algorithm run on problems whose eigenvector empirical spectral distribution is close to a reference measure with well-behaved orthogonal polynomials. We give a backward stability result which can be upgraded to a forward stability result when the reference measure has a density supported on a single interval with square root behavior at the endpoints. Our analysis implies the Lanczos algorithm run on many large random matrix models is in fact forward stable, and hence nearly deterministic, even when computations are carried out in finite precision arithmetic. Since the Lanczos algorithm is not forward stable in general, this provides yet another example of the fact that random matrices are far from "any old matrix", and care must be taken when using them to test numerical algorithms.
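For readers less familiar with the algorithm named above, here is a minimal textbook sketch of the Lanczos three-term recurrence, without reorthogonalization. It is not the authors' code, and the function and argument names are hypothetical. The returned coefficients are the diagonal and off-diagonal entries of the tridiagonal (Jacobi) matrix, which are the recurrence coefficients of the orthogonal polynomials for the spectral measure weighted by the starting vector.

```python
import math


def lanczos(matvec, q0, steps):
    """Run the Lanczos recurrence for a symmetric operator.

    matvec: function mapping a vector (list of floats) to A @ vector
    q0:     nonzero starting vector
    steps:  number of iterations k
    Returns (alphas, betas): k diagonal and up to k - 1 off-diagonal entries.
    """
    norm = math.sqrt(sum(c * c for c in q0))
    q = [c / norm for c in q0]
    q_prev = [0.0] * len(q)
    beta = 0.0
    alphas, betas = [], []
    for k in range(steps):
        w = matvec(q)
        alpha = sum(a * b for a, b in zip(w, q))
        # Three-term recurrence: remove components along the last two vectors
        w = [wi - alpha * qi - beta * pi for wi, qi, pi in zip(w, q, q_prev)]
        alphas.append(alpha)
        beta = math.sqrt(sum(c * c for c in w))
        if k == steps - 1 or beta < 1e-14:
            break  # finished, or the Krylov space is invariant
        betas.append(beta)
        q_prev, q = q, [c / beta for c in w]
    return alphas, betas
```

In exact arithmetic the generated vectors are orthonormal. In floating point they can lose orthogonality on larger problems, which is why forward stability fails in general and why results of the kind described above are notable.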
We introduce a time discretization for Wasserstein gradient flows based on the classical Backward Differentiation Formula of order two. The main building block of the scheme is the notion of geodesic extrapolation in the Wasserstein space, which in general is not uniquely defined. We propose several possible definitions for such an operation, and we prove convergence of the resulting scheme to the limit PDE, in the case of the Fokker-Planck equation. For a specific choice of extrapolation we also prove a more general result, namely convergence towards EVI flows. Finally, we propose a variational finite volume discretization of the scheme which numerically achieves second order accuracy in both space and time.
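The role of extrapolation in a BDF2 step is easiest to see in the Euclidean analogue. For the scalar gradient flow x' = -lam * x of the quadratic energy lam * x^2 / 2, the classical BDF2 update (3 x_new - 4 x_n + x_old) / (2 tau) = -lam * x_new can be rewritten as an extrapolation from the two previous iterates followed by an implicit step. The sketch below does exactly that. The quadratic energy, the backward Euler start-up step, and all names are illustrative assumptions, and nothing here involves the Wasserstein setting of the abstract.

```python
import math


def bdf2_quadratic_flow(lam, x0, tau, n_steps):
    """BDF2 for x' = -lam * x, written as 'extrapolate, then implicit step'."""
    if n_steps < 1:
        raise ValueError("need at least one step")
    x_prev = x0
    x_curr = x0 / (1.0 + tau * lam)  # start-up: one backward Euler step
    for _ in range(n_steps - 1):
        # Extrapolate past the current iterate along the last displacement
        x_extra = x_curr + (x_curr - x_prev) / 3.0
        # Implicit step from the extrapolated point with step size 2 * tau / 3
        x_next = x_extra / (1.0 + 2.0 * tau * lam / 3.0)
        x_prev, x_curr = x_curr, x_next
    return x_curr


def error_at_time_one(n_steps):
    """Absolute error at t = 1 for lam = 1, x0 = 1, against exp(-1)."""
    approx = bdf2_quadratic_flow(1.0, 1.0, 1.0 / n_steps, n_steps)
    return abs(approx - math.exp(-1.0))
```

Halving the step size should reduce `error_at_time_one` by roughly a factor of four, the signature of second order accuracy in time.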