We study the problem of aggregating distributions, such as budget proposals, into a collective distribution. An ideal aggregation mechanism would be Pareto efficient, strategyproof, and fair. Most previous work assumes that agents evaluate budgets according to the $\ell_1$ distance to their ideal budget. We investigate and compare different models from the larger class of star-shaped utility functions - a multi-dimensional generalization of single-peaked preferences. For the case of two alternatives, we extend existing results by proving that under very general assumptions, the uniform phantom mechanism is the only strategyproof mechanism that satisfies proportionality - a minimal notion of fairness introduced by Freeman et al. (2021). Moving to the case of more than two alternatives, we establish sweeping impossibilities for $\ell_1$ and $\ell_\infty$ disutilities: no mechanism satisfies efficiency, strategyproofness, and proportionality. We then propose a new kind of star-shaped utilities based on evaluating budgets by the ratios of shares between a given budget and an ideal budget. For these utilities, efficiency, strategyproofness, and fairness become compatible. In particular, we prove that the mechanism that maximizes the Nash product of individual utilities is characterized by group-strategyproofness and a core-based fairness condition.
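For the two-alternative case, the uniform phantom mechanism admits a simple description: each agent reports the share they assign to the first alternative, and the mechanism returns the median of these reports together with $n-1$ phantom values placed uniformly at $k/n$. The sketch below follows this standard description and is only an illustration (the phantom placement and function name are our shorthand, not notation from the paper); with it, proportionality holds by construction, since $k$ agents reporting 1 and the rest 0 yield exactly $k/n$.

```python
import statistics

def uniform_phantom(reports):
    """Aggregate shares for alternative 1 (two-alternative case).

    reports: values in [0, 1], each agent's ideal share for alternative 1;
    the share for alternative 2 is the complement.
    """
    n = len(reports)
    phantoms = [k / n for k in range(1, n)]   # n - 1 uniformly spaced phantoms
    return statistics.median(list(reports) + phantoms)

# If k agents report 1 and the rest report 0, the outcome is k / n,
# which is exactly the proportionality requirement.
print(uniform_phantom([1.0, 1.0, 0.0, 0.0, 0.0]))  # 0.4
```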
Existing works based on molecular knowledge neglect the 3D geometric structure of molecules and fail to learn the high-dimensional information of medications, leading to structural confusion. Moreover, they do not extract key substructures from a single patient visit, and thus fail to identify the medication molecules suitable for the current visit. To address these limitations, we propose a bimodal molecular recommendation framework named BiMoRec, which introduces 3D molecular structures to obtain atomic 3D coordinates and edge indices, overcoming the inherent lack of high-dimensional molecular information in 2D molecular structures. To retain the fast training and prediction efficiency of the recommendation system, we use bimodal graph contrastive pretraining to maximize the mutual information between the two molecular modalities, achieving the fusion of 2D and 3D molecular graphs. We also design a molecular multi-step enhancement mechanism to re-calibrate molecular weights. Specifically, we employ a pre-training method that captures both 2D and 3D molecular structure representations, along with substructure representations, and leverages contrastive learning to extract mutual information. We then use the pre-trained encoder to generate molecular representations and enhance them through a three-step process: intra-visit, molecular per-visit, and latest-visit. Finally, we apply temporal information aggregation to generate the final medication combinations. Experiments on the MIMIC-III and MIMIC-IV datasets demonstrate that our method achieves state-of-the-art performance.
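As an illustration of the bimodal contrastive objective described above, the following hedged sketch pairs a 2D-graph embedding with a 3D-geometry embedding of the same molecule and maximizes their mutual information with an InfoNCE-style loss; the encoders, tensor shapes, and names are placeholders, not the actual BiMoRec architecture.

```python
import torch
import torch.nn.functional as F

def infonce(z2d, z3d, temperature=0.1):
    """Symmetric InfoNCE loss between 2D and 3D embeddings of the same
    batch of molecules (row i of each tensor is the same molecule)."""
    z2d = F.normalize(z2d, dim=-1)
    z3d = F.normalize(z3d, dim=-1)
    logits = z2d @ z3d.t() / temperature      # cosine similarities
    targets = torch.arange(z2d.size(0))       # positives sit on the diagonal
    return 0.5 * (F.cross_entropy(logits, targets)
                  + F.cross_entropy(logits.t(), targets))

# toy usage with placeholder encoder outputs
z2d = torch.randn(32, 128)   # from a 2D molecular graph encoder (placeholder)
z3d = torch.randn(32, 128)   # from a 3D geometric encoder (placeholder)
loss = infonce(z2d, z3d)
```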
Existing regression models tend to fall short in both accuracy and uncertainty estimation when the label distribution is imbalanced. In this paper, we propose a probabilistic deep learning model, dubbed variational imbalanced regression (VIR), which not only performs well in imbalanced regression but also naturally produces reasonable uncertainty estimates as a byproduct. Unlike typical variational autoencoders, which assume i.i.d. representations (a data point's representation is not directly affected by other data points), our VIR borrows data with similar regression labels to compute the latent representation's variational distribution; furthermore, unlike deterministic regression models that produce point estimates, VIR predicts entire normal-inverse-gamma distributions and modulates the associated conjugate distributions to impose probabilistic reweighting on the imbalanced data, thereby providing better uncertainty estimation. Experiments on several real-world datasets show that VIR outperforms state-of-the-art imbalanced regression models in terms of both accuracy and uncertainty estimation. Code will soon be available at //github.com/Wang-ML-Lab/variational-imbalanced-regression.
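The normal-inverse-gamma output mentioned above follows the usual evidential-regression parameterization; below is a minimal, hedged sketch of such a head (the class name and layer sizes are ours, and VIR's neighbor-borrowing and probabilistic reweighting are not shown).

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class NIGHead(nn.Module):
    """Predicts the four normal-inverse-gamma parameters (gamma, nu, alpha, beta)
    from a feature vector, so every prediction carries uncertainty."""
    def __init__(self, dim):
        super().__init__()
        self.fc = nn.Linear(dim, 4)

    def forward(self, h):
        gamma, nu, alpha, beta = self.fc(h).chunk(4, dim=-1)
        nu = F.softplus(nu)              # > 0
        alpha = F.softplus(alpha) + 1.0  # > 1 so the variance is finite
        beta = F.softplus(beta)          # > 0
        return gamma, nu, alpha, beta

h = torch.randn(8, 64)
gamma, nu, alpha, beta = NIGHead(64)(h)
aleatoric = beta / (alpha - 1)           # expected noise variance
epistemic = beta / (nu * (alpha - 1))    # uncertainty about the mean
```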
By leveraging the representation power of deep neural networks, neural upper confidence bound (UCB) algorithms have shown success in contextual bandits. To further balance exploration and exploitation, we propose Neural-$\sigma^2$-LinearUCB, a variance-aware algorithm that utilizes $\sigma^2_t$, an upper bound on the reward noise variance at round $t$, to improve the uncertainty quantification of the UCB and thereby the regret performance. We provide an oracle version of our algorithm, which assumes access to the variance upper bound $\sigma^2_t$, and a practical version with a novel estimator for this bound. Theoretically, we provide rigorous regret analysis for both versions and prove that the oracle algorithm achieves a better regret guarantee than other neural-UCB algorithms in the neural contextual bandit setting. Empirically, the practical method enjoys similar computational efficiency while outperforming state-of-the-art techniques with better calibration and lower regret across multiple standard settings, including synthetic, UCI, MNIST, and CIFAR-10 datasets.
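The variance-aware idea above can be sketched as a weighted ridge-regression confidence bound computed on last-layer neural features, with each observation down-weighted by the variance bound $\sigma^2_t$; the class name, bonus form, and hyperparameters below are illustrative, not the paper's exact algorithm.

```python
import numpy as np

class VarianceAwareLinUCB:
    """LinUCB-style index on (neural) features phi, with each observation
    down-weighted by an upper bound sigma2 on its reward-noise variance."""
    def __init__(self, dim, lam=1.0, beta=1.0):
        self.A = lam * np.eye(dim)   # variance-weighted design matrix
        self.b = np.zeros(dim)
        self.beta = beta             # confidence-width multiplier

    def ucb(self, phi):
        theta = np.linalg.solve(self.A, self.b)
        width = np.sqrt(phi @ np.linalg.solve(self.A, phi))
        return phi @ theta + self.beta * width

    def update(self, phi, reward, sigma2):
        # low-variance rounds contribute more to the estimate
        self.A += np.outer(phi, phi) / sigma2
        self.b += phi * reward / sigma2
```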
In the circuit model of quantum computing, amplitude amplification techniques can be used to find solutions to NP-hard problems defined on $n$-bits in time $\text{poly}(n) 2^{n/2}$. In this work, we investigate whether such general statements can be made for adiabatic quantum optimization, as provable results regarding its performance are mostly unknown. Although a lower bound of $\Omega(2^{n/2})$ has existed in such a setting for over a decade, a purely adiabatic algorithm with this running time has been absent. We show that adiabatic quantum optimization using an unstructured search approach results in a running time that matches this lower bound (up to a polylogarithmic factor) for a broad class of classical local spin Hamiltonians. For this, it is necessary to bound the spectral gap throughout the adiabatic evolution and compute beforehand the position of the avoided crossing with sufficient precision so as to adapt the adiabatic schedule accordingly. However, we show that the position of the avoided crossing is approximately given by a quantity that depends on the degeneracies and inverse gaps of the problem Hamiltonian and is NP-hard to compute even within a low additive precision. Furthermore, computing it exactly (or nearly exactly) is \#P-hard. Our work indicates a possible limitation of adiabatic quantum optimization algorithms, leaving open the question of whether provable Grover-like speed-ups can be obtained for any optimization problem using this approach.
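To make the role of the minimum gap and the avoided crossing concrete, the spectral gap of the interpolating Hamiltonian $H(s) = (1-s)(I - |\psi\rangle\langle\psi|) + s H_P$ can be scanned numerically for a toy instance; the snippet below uses a small random diagonal problem Hamiltonian, purely as an illustration of the quantity whose location the paper argues is hard to compute in general.

```python
import numpy as np

n = 6                                   # number of qubits (toy size)
N = 2 ** n
rng = np.random.default_rng(0)

# diagonal classical problem Hamiltonian with a unique ground state
H_P = np.diag(rng.integers(1, n, size=N).astype(float))
H_P[0, 0] = 0.0

psi = np.full(N, 1 / np.sqrt(N))        # uniform superposition
H_0 = np.eye(N) - np.outer(psi, psi)    # "unstructured search" driver

gaps = []
for s in np.linspace(0, 1, 201):
    H = (1 - s) * H_0 + s * H_P
    ev = np.linalg.eigvalsh(H)
    gaps.append((s, ev[1] - ev[0]))

s_star, g_min = min(gaps, key=lambda t: t[1])
print(f"avoided crossing near s = {s_star:.3f}, minimum gap = {g_min:.2e}")
```

An adapted adiabatic schedule would slow down near the reported value of $s$; at scale, of course, one cannot locate it by diagonalization, which is precisely where the hardness results above apply.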
This research presents FDASynthesis, a novel algorithm designed to generate synthetic GPS trajectory data while preserving privacy. After pre-processing the input GPS data, human mobility traces are modeled as multidimensional curves using Functional Data Analysis (FDA). The synthesis step then identifies the K nearest trajectories and averages their Square-Root Velocity Functions (SRVFs) to generate each synthetic trajectory. The result is synthetic trajectories that maintain the utility of the original data while ensuring privacy. Although applied here to human mobility research, FDASynthesis is highly adaptable to other types of functional data, offering a scalable solution for a variety of application domains.
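The synthesis step rests on the square-root velocity transform $q(t) = \dot{f}(t)/\sqrt{|\dot{f}(t)|}$; below is a hedged sketch of averaging the SRVFs of the K nearest trajectories and mapping back to a curve (registration and alignment, which a full pipeline would include, are omitted, and function names are ours).

```python
import numpy as np

def srvf(f, t):
    """Square-root velocity function of a multidimensional curve f
    sampled at times t; f has shape (len(t), dim)."""
    df = np.gradient(f, t, axis=0)
    speed = np.linalg.norm(df, axis=1, keepdims=True)
    return df / np.sqrt(np.maximum(speed, 1e-12))

def srvf_inverse(q, t, f0):
    """Reconstruct a curve from its SRVF q and a starting point f0."""
    speed = np.linalg.norm(q, axis=1, keepdims=True)
    df = q * speed                                   # undo the square-root scaling
    dt = np.diff(t)[:, None]
    incr = 0.5 * (df[1:] + df[:-1]) * dt             # trapezoidal integration
    return f0 + np.vstack([np.zeros_like(f0), np.cumsum(incr, axis=0)])

def synthesize(neighbors, t):
    """Average the SRVFs of the K nearest trajectories (K x T x dim)."""
    q_bar = np.mean([srvf(f, t) for f in neighbors], axis=0)
    f0 = np.mean([f[0] for f in neighbors], axis=0)
    return srvf_inverse(q_bar, t, f0)
```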
In this paper we address the problem of representing functional data with tools from algebraic topology. We represent functions by means of merge trees, which, like the more commonly used persistence diagrams, are invariant under homeomorphic reparametrizations of the functions they represent, thus allowing for a statistical analysis that is indifferent to functional misalignment. We consider a recently defined metric for merge trees and prove several theoretical results related to its specific implementation when merge trees represent functions, and we establish a class of consistent estimators with convergence rates. To showcase the good properties of our topological approach to functional data analysis, we test it on the Aneurisk65 dataset, replicating from our different perspective the supervised classification analysis that helped establish this dataset as a benchmark for methods dealing with misaligned functional data. In the Appendix we provide an extensive comparison between merge trees and persistence diagrams, highlighting similarities and differences that can guide the analyst in choosing between the two representations.
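For a single real-valued function sampled on a grid, the merge tree records at which function values connected components of sublevel sets appear and merge; below is a hedged union-find sketch of that construction for intuition only (the paper's metric between merge trees is a separate matter, and the function names are ours).

```python
import numpy as np

def merge_events(y):
    """Sublevel-set merge structure of a sampled function y[0..n-1].

    Returns a list of (merge_height, surviving_min, absorbed_min):
    each local minimum starts a branch, and branches merge at saddles."""
    order = np.argsort(y)
    parent = {}                                  # union-find forest
    birth = {}                                   # root -> value of its minimum

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    events = []
    for i in order:
        parent[i] = i
        birth[i] = y[i]
        for j in (i - 1, i + 1):                 # neighbours on the grid
            if j in parent:
                ri, rj = find(i), find(j)
                if ri != rj:
                    old, young = sorted((ri, rj), key=lambda r: birth[r])
                    parent[young] = old          # elder rule: older branch survives
                    events.append((y[i], old, young))
    return events

y = np.array([3.0, 1.0, 2.5, 0.5, 2.0, 1.5, 4.0])
print(merge_events(y))
```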
This work explores the intersection of continual learning (CL) and differential privacy (DP). Crucially, continual learning models must retain knowledge across tasks, but this conflicts with the differential privacy requirement that individual samples not be memorised by the model. We propose using pre-trained models to address the trade-off between privacy and performance in a continual learning setting. More specifically, we present the assumptions necessary to enable privacy preservation and propose combining pre-trained models with parameter-free classifiers and parameter-efficient adapters that are learned under differential privacy. Our experiments demonstrate the effectiveness of this approach and provide insights into balancing the competing demands of continual learning and privacy.
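A hedged sketch of the training recipe above: a frozen pre-trained backbone with a small adapter trained under DP-SGD, written out by hand with per-sample gradient clipping and Gaussian noise (layer sizes, hyperparameters, and the backbone stand-in are placeholders; in practice one would use a DP library and a proper privacy accountant).

```python
import torch
import torch.nn as nn

# frozen stand-in for a pre-trained backbone; only the adapter is trained privately
backbone = nn.Sequential(nn.Flatten(), nn.Linear(784, 256), nn.ReLU())
for p in backbone.parameters():
    p.requires_grad_(False)

adapter = nn.Linear(256, 10)
lr, clip, noise_mult = 0.1, 1.0, 1.1      # DP-SGD hyperparameters (placeholders)

def dp_sgd_step(xs, ys):
    """One DP-SGD step: clip each per-sample gradient of the adapter,
    sum, add Gaussian noise, and take a gradient step."""
    grads = [torch.zeros_like(p) for p in adapter.parameters()]
    for x, y in zip(xs, ys):                          # microbatches of size 1
        adapter.zero_grad()
        loss = nn.functional.cross_entropy(adapter(backbone(x[None])), y[None])
        loss.backward()
        norm = torch.sqrt(sum(p.grad.pow(2).sum() for p in adapter.parameters()))
        scale = torch.clamp(clip / (norm + 1e-12), max=1.0)
        for g, p in zip(grads, adapter.parameters()):
            g += p.grad * scale
    with torch.no_grad():
        for g, p in zip(grads, adapter.parameters()):
            g += noise_mult * clip * torch.randn_like(g)   # Gaussian mechanism
            p -= lr * g / len(xs)

dp_sgd_step(torch.randn(8, 1, 28, 28), torch.randint(0, 10, (8,)))
```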
We introduce a simple, stochastic, a-posteriori turbulence closure model based on a reduced subgrid-scale term. This subgrid-scale term is tailor-made to capture the statistics of a small set of spatially-integrated quantities of interest (QoIs), with only one unresolved scalar time series per QoI. In contrast to other data-driven surrogates, the dimension of the "learning problem" is thus reduced from an evolving field to one scalar time series per QoI. We use an a-posteriori nudging approach to find the distribution of the scalar time series, which has the advantage of taking the interaction between the solver and the surrogate into account. A stochastic surrogate parametrization is then obtained by randomly sampling the scalar time series from this distribution. Compared to an a-priori trained convolutional neural network, the new method is computationally much cheaper to evaluate and gives similar long-term statistics.
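A hedged toy sketch of the resulting stochastic parametrization: at every solver step, the unresolved scalar for each QoI is drawn from the empirical distribution found during the nudged (a-posteriori) training run and added to the resolved tendency. The placeholder dynamics, names, and synthetic "training" series below are ours, for illustration only.

```python
import numpy as np

rng = np.random.default_rng(1)

# one scalar time series of the unresolved term per QoI, as identified
# during the nudged training run (synthetic placeholder data here)
training_series = {"energy": rng.normal(0.0, 0.05, size=5000),
                   "enstrophy": rng.normal(0.0, 0.10, size=5000)}

def subgrid_sample(qoi):
    """Stochastic surrogate: sample the unresolved scalar for one QoI
    from its empirical (training) distribution."""
    return rng.choice(training_series[qoi])

def step(q, dt=0.01):
    """Toy resolved solver step for a single QoI."""
    resolved_tendency = -0.5 * q                 # placeholder resolved dynamics
    return q + dt * (resolved_tendency + subgrid_sample("energy"))

q = 1.0
for _ in range(100):
    q = step(q)
```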
Approaches based on deep neural networks achieve striking performance when the testing and training data share a similar distribution, but can fail significantly otherwise. Eliminating the impact of distribution shifts between training and testing data is therefore crucial for building deep models that perform reliably. Conventional methods assume either that the heterogeneity of the training data is known (e.g., domain labels) or that the different domains have approximately equal capacities. In this paper, we consider a more challenging case where neither assumption holds. We propose to address this problem by removing the dependencies between features via learning weights for training samples, which helps deep models get rid of spurious correlations and, in turn, concentrate more on the true connection between discriminative features and labels. Extensive experiments on distribution generalization benchmarks, including PACS, VLCS, MNIST-M, and NICO, demonstrate the effectiveness of our method compared with state-of-the-art counterparts.
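One way to make the reweighting idea concrete: learn one weight per training sample so that, under the weighted empirical distribution, the features become approximately pairwise decorrelated, and then train the predictor with a weighted loss. The sketch below uses a simple weighted-covariance penalty; it illustrates the idea rather than reproducing the paper's exact procedure, and all names are ours.

```python
import torch

def decorrelation_weights(features, steps=200, lr=0.05):
    """Learn non-negative, mean-one sample weights that shrink the
    off-diagonal entries of the weighted feature covariance."""
    n, d = features.shape
    logits = torch.zeros(n, requires_grad=True)
    opt = torch.optim.Adam([logits], lr=lr)
    for _ in range(steps):
        w = torch.softmax(logits, dim=0) * n          # weights with mean 1
        mean = (w[:, None] * features).sum(0) / n
        centered = features - mean
        cov = (w[:, None] * centered).t() @ centered / n
        off_diag = cov - torch.diag(torch.diag(cov))
        loss = (off_diag ** 2).sum()                  # penalize feature dependence
        opt.zero_grad()
        loss.backward()
        opt.step()
    return (torch.softmax(logits, dim=0) * n).detach()

features = torch.randn(256, 16)                       # placeholder features
w = decorrelation_weights(features)
# downstream training then uses a weighted loss, e.g.
# loss = (w * per_sample_loss).mean()
```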
Multi-relation Question Answering is a challenging task, due to the need for elaborate analysis of questions and reasoning over multiple fact triples in the knowledge base. In this paper, we present a novel model called the Interpretable Reasoning Network, which employs an interpretable, hop-by-hop reasoning process for question answering. The model dynamically decides which part of the input question should be analyzed at each hop; predicts a relation that corresponds to the currently parsed results; uses the predicted relation to update the question representation and the state of the reasoning process; and then drives the next-hop reasoning. Experiments show that our model yields state-of-the-art results on two datasets. More interestingly, the model offers traceable and observable intermediate predictions for reasoning analysis and failure diagnosis, thereby allowing manual intervention in the prediction of the final answer.
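The hop-by-hop loop can be sketched as follows: at each hop the model attends to a part of the question, predicts a relation over the KB relation vocabulary, and uses the predicted relation to update both the question representation and the reasoning state. This is a hedged, simplified sketch; the module names and update rules are placeholders, not the exact Interpretable Reasoning Network.

```python
import torch
import torch.nn as nn

class HopByHopReasoner(nn.Module):
    def __init__(self, dim, num_relations, hops=3):
        super().__init__()
        self.hops = hops
        self.rel_pred = nn.Linear(dim, num_relations)   # relation for the current hop
        self.rel_emb = nn.Embedding(num_relations, dim)
        self.state_update = nn.GRUCell(dim, dim)

    def forward(self, question_words):
        """question_words: (seq_len, dim) word representations of the question."""
        q = question_words.mean(0)                      # question representation
        state = torch.zeros_like(q)                     # reasoning state
        trace = []                                      # interpretable intermediate predictions
        for _ in range(self.hops):
            scores = torch.softmax(question_words @ q, dim=0)  # part of the question analysed now
            focus = scores @ question_words
            rel_logits = self.rel_pred(focus + state)
            rel = rel_logits.argmax()
            trace.append((scores, rel))
            rel_vec = self.rel_emb(rel)
            q = q - rel_vec                             # update question representation (placeholder rule)
            state = self.state_update(rel_vec[None], state[None])[0]
        return state, trace

state, trace = HopByHopReasoner(dim=64, num_relations=20)(torch.randn(7, 64))
```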