We model time-harmonic acoustic scattering by an object composed of piecewise homogeneous parts and an arbitrarily heterogeneous part. We propose and analyze new formulations that couple, adopting a Costabel-type approach, boundary integral equations for the homogeneous subdomains with domain variational formulations for the heterogeneous subdomain. This extends Costabel FEM-BEM coupling to a multi-domain configuration, with cross-points allowed, i.e. points where three or more subdomains abut. While typically only the unbounded exterior subdomain is treated with the BEM, here we wish to exploit the advantages of the BEM wherever it is applicable, that is, for all the homogeneous parts of the scattering object. Our formulation is based on the multi-trace formalism, which was initially introduced for acoustic scattering by piecewise homogeneous objects; here we allow the wavenumber to vary arbitrarily in a part of the domain. We prove that the bilinear form associated with the proposed formulation satisfies a G{\aa}rding coercivity inequality, which ensures stability of the variational problem if it is uniquely solvable. We identify conditions for injectivity and construct modified versions immune to spurious resonances.
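For context, a G{\aa}rding coercivity inequality for a bilinear form $a(\cdot,\cdot)$ on a Hilbert space $X$ typically takes the following generic form (a standard statement, not the paper's exact inequality):
\[
\operatorname{Re}\, a(u,u) \;\geq\; \alpha\, \|u\|_X^2 \;-\; C\, \|u\|_Y^2 \qquad \text{for all } u \in X,
\]
with $\alpha > 0$, $C \geq 0$, and $X$ compactly embedded in a larger space $Y$. By the Fredholm alternative, such an inequality reduces well-posedness to injectivity, which is why the coercivity result is paired with conditions ensuring injectivity.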
Many data insight questions can be viewed as searching a large space of tables and finding the important ones, where the notion of importance is defined in some ad hoc, user-defined manner. This paper presents Holistic Cube Analysis (HoCA), a framework that augments the capabilities of relational queries for such problems. HoCA first augments the relational data model with a new data type, AbstractCube, defined as a function that maps a region-features pair to a relational table (a region is a tuple specifying the values of a set of dimensions). AbstractCube provides a logical form of data, and HoCA operators are cube-to-cube transformations. We describe two basic but fundamental HoCA operators, cube crawling and cube join (with many possible extensions). Cube crawling explores a region space and outputs a cube that maps regions to signal vectors. Cube join, in turn, is critical for composition, allowing one to join information from different cubes for deeper analysis. Cube crawling introduces two novel programming features, (programmable) Region Analysis Models (RAMs) and Multi-Model Crawling. Crucially, a RAM has a notion of population features, which allows one to go beyond analyzing only local features at a region and to program region-population analyses that compare region and population features, capturing a large class of importance notions. HoCA opens a rich algorithmic space, including optimizing crawling and join performance and the physical design of cubes. We have implemented and deployed HoCA at Google. Our early HoCA offering has attracted more than 30 teams building applications with it, across a diverse spectrum of fields including system monitoring, experimentation analysis, and business intelligence. For many applications, HoCA empowers novel and powerful analyses, such as instances of recurrent crawling, which are challenging to achieve otherwise.
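To make the data model concrete, here is a minimal Python sketch of an AbstractCube and of cube crawling with a region-population RAM; all names, types, and signatures are illustrative assumptions of ours, not HoCA's actual API.

```python
from typing import Callable, Dict, List, Tuple

# Illustrative stand-ins for HoCA's types (hypothetical, not the real API).
Region = Tuple[str, ...]            # fixed values for a set of dimensions
Table = List[Dict[str, float]]      # a relational table as a list of rows

# An AbstractCube maps a (region, features) pair to a relational table.
AbstractCube = Callable[[Region, List[str]], Table]

# A Region Analysis Model (RAM) turns a region table plus a population
# table into a signal vector, enabling region-population comparisons.
RAM = Callable[[Table, Table], List[float]]

def crawl(cube: AbstractCube, regions: List[Region],
          features: List[str], ram: RAM) -> Dict[Region, List[float]]:
    """Toy cube crawling: map each region to the RAM's signal vector."""
    population = cube((), features)  # assume the empty region means "everything"
    return {r: ram(cube(r, features), population) for r in regions}
```

The output (regions mapped to signal vectors) is itself cube-like, which is what makes further composition via cube join possible.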
Analytical methods are fundamental in studying acoustics problems. One of the important tools is the Wiener-Hopf method, which can be used to solve many canonical problems with sharp transitions in boundary conditions on a plane/plate. However, there are strict limitations to its use: usually the boundary conditions need to be imposed on parallel lines (after a suitable mapping). Such mappings exist for wedges with continuous boundaries, but for discrete boundaries they have not yet been constructed. In our previous article, we overcame this limitation and studied the diffraction of acoustic waves by a wedge consisting of point scatterers. Here, the problem is generalised to an arbitrary number of periodic semi-infinite arrays with arbitrary orientations. This is done by constructing several coupled systems of equations (one for every semi-infinite array) which are treated independently. The derived systems of equations are solved using the discrete Wiener-Hopf technique, and the resulting matrix equation is inverted using elementary matrix arithmetic. Numerically, this matrix needs to be truncated, but we are able to do so in a way that includes thousands of scatterers on every array in the numerical results. Comparisons with other numerical methods are considered, and their strengths/weaknesses are highlighted.
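As a point of reference for what must be solved numerically, the sketch below assembles and directly inverts a truncated Foldy-type system for a single array of isotropic point scatterers in 2-D; it is a naive dense solve for illustration only, not the paper's discrete Wiener-Hopf technique, and all parameter values are made up.

```python
import numpy as np
from scipy.special import hankel1  # 2-D free-space Green's function uses H0^(1)

k, spacing, N = 2.0, 1.0, 2000           # wavenumber, array period, truncation size
theta = np.pi / 3                         # plane-wave incidence angle (illustrative)
x = spacing * np.arange(N)                # truncated semi-infinite array on a line

u_inc = np.exp(1j * k * x * np.cos(theta))       # incident field at the scatterers

# Interaction matrix G_mn = (i/4) H0^(1)(k |x_m - x_n|), with no self-interaction.
d = np.abs(x[:, None] - x[None, :])
G = 0.25j * hankel1(0, k * d + np.eye(N))        # eye() dodges the singularity at 0
np.fill_diagonal(G, 0.0)

alpha = 1.0                                       # scatterer strength (toy value)
amplitudes = np.linalg.solve(np.eye(N) - alpha * G, u_inc)
```

A dense solve like this scales as O(N^3) in time, which is one reason a semi-analytical Wiener-Hopf treatment of the semi-infinite arrays is attractive when thousands of scatterers per array are needed.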
Probabilistic counters are well-known tools often used for space-efficient set cardinality estimation. In this paper, we investigate probabilistic counters from the perspective of preserving privacy, using the standard, rigorous notion of differential privacy. The intuition is that probabilistic counters do not reveal too much information about individuals but provide only general information about the population; therefore, they can be used safely without violating the privacy of individuals. However, it turns out that providing a precise, formal analysis of the privacy parameters of probabilistic counters is surprisingly difficult and requires advanced techniques and a very careful approach. We demonstrate that probabilistic counters can be used as a privacy protection mechanism without extra randomization: the inherent randomness of the protocol is sufficient for protecting privacy, even if the probabilistic counter is used multiple times. In particular, we present a specific privacy-preserving data aggregation protocol based on the Morris Counter and the MaxGeo Counter. Some of the presented results concern counters that have not previously been investigated from the perspective of privacy protection; others improve on previous results. We show how our results can be used to perform distributed surveys, and we compare the properties of counter-based solutions with those of a standard Laplace method.
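For readers unfamiliar with it, the Morris Counter keeps only a logarithmic-scale state and increments it probabilistically; the sketch below shows the textbook mechanism (it does not reproduce the paper's privacy analysis).

```python
import random

class MorrisCounter:
    """Textbook Morris Counter: store an exponent c, estimate the count as 2^c - 1."""

    def __init__(self) -> None:
        self.c = 0

    def increment(self) -> None:
        # Grow the exponent with probability 2^(-c); with this rule the
        # expected estimate after n increments is exactly n.
        if random.random() < 2.0 ** (-self.c):
            self.c += 1

    def estimate(self) -> float:
        return 2.0 ** self.c - 1.0
```

The privacy intuition above corresponds to the fact that any single increment changes the distribution of the stored state c only slightly; quantifying that effect precisely is the hard part the paper addresses.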
Recommender systems play a crucial role in helping users discover information that aligns with their interests based on their past behaviors. However, developing personalized recommendation systems becomes challenging when historical records of user-item interactions are unavailable, leading to what is known as the system cold-start recommendation problem. This issue is particularly prominent in start-up businesses or platforms with insufficient user engagement history. Previous studies focus on user or item cold-start scenarios, where systems can make recommendations for new users or items but are still trained on historical user-item interactions in the same domain, so they cannot solve our problem. To bridge the gap, our research introduces an innovative and effective approach that capitalizes on the capabilities of pre-trained language models. We transform the recommendation process into sentiment analysis of natural-language statements containing information about user profiles and item attributes, where the sentiment polarity is predicted with prompt learning. By harnessing the extensive knowledge housed within language models, the prediction can be made without historical user-item interaction records. A benchmark is also introduced to evaluate the proposed method under the cold-start setting, and the results demonstrate the effectiveness of our method. To the best of our knowledge, this is the first study to tackle the system cold-start recommendation problem. The benchmark and implementation of the method are available at //github.com/JacksonWuxs/PromptRec.
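The following sketch shows one way to score a user-item pair by prompting a masked language model, in the spirit of the approach described above; the model choice, prompt template, and verbalizer words ("good"/"bad") are our illustrative assumptions, not necessarily those of PromptRec.

```python
import torch
from transformers import AutoModelForMaskedLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModelForMaskedLM.from_pretrained("bert-base-uncased")

def score(user_profile: str, item_attrs: str) -> float:
    """Higher score = more positive predicted sentiment, i.e. recommend."""
    prompt = (f"{user_profile} {item_attrs} Overall, the user feels "
              f"{tokenizer.mask_token} about this item.")
    inputs = tokenizer(prompt, return_tensors="pt")
    mask_pos = (inputs.input_ids == tokenizer.mask_token_id).nonzero()[0, 1]
    with torch.no_grad():
        logits = model(**inputs).logits[0, mask_pos]
    good = logits[tokenizer.convert_tokens_to_ids("good")]
    bad = logits[tokenizer.convert_tokens_to_ids("bad")]
    return (good - bad).item()

print(score("The user is a young gamer.", "The item is a fantasy board game."))
```

Because nothing here is trained on interaction records, the same scoring function works on day one of a brand-new system, which is the essence of the system cold-start setting.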
Car detection, particularly through camera vision, has become a major focus in the field of computer vision and has gained widespread adoption. While current car detection systems achieve good accuracy, reliable detection can still be challenging due to factors such as the proximity between cars, light intensity, and environmental visibility. To address these issues, we propose a cross-domain Car Detection Model with an integrated convolutional block Attention mechanism (CDMA), which we apply to car recognition for autonomous driving and other areas. CDMA includes several novelties: 1) building a complete cross-domain target detection framework; 2) developing an unpaired target-domain image generation module with an integrated convolutional attention mechanism that specifically emphasizes car headlight features; 3) adopting Generalized Intersection over Union (GIoU) as the loss function of the target detection framework; 4) designing an object detection model integrated with a two-headed Convolutional Block Attention Module (CBAM); and 5) utilizing an effective data augmentation method. To evaluate the model's effectiveness, we applied a resolution-reduction process to the data in the SSLAD dataset and used the result as the benchmark dataset for our task. Experimental results show that the performance of the cross-domain car target detection model improves by 40% over the model without our framework, and our improvements have a significant impact on cross-domain car recognition.
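Since GIoU is central to point 3), here is a standard reference implementation of the GIoU loss for axis-aligned boxes (a common formulation, not the authors' exact code):

```python
import torch

def giou_loss(pred: torch.Tensor, target: torch.Tensor) -> torch.Tensor:
    """GIoU loss for boxes given as (x1, y1, x2, y2) rows of shape (N, 4)."""
    # Intersection.
    lt = torch.max(pred[:, :2], target[:, :2])
    rb = torch.min(pred[:, 2:], target[:, 2:])
    wh = (rb - lt).clamp(min=0)
    inter = wh[:, 0] * wh[:, 1]
    # Union.
    area_p = (pred[:, 2] - pred[:, 0]) * (pred[:, 3] - pred[:, 1])
    area_t = (target[:, 2] - target[:, 0]) * (target[:, 3] - target[:, 1])
    union = area_p + area_t - inter
    iou = inter / union.clamp(min=1e-7)
    # Smallest enclosing box: GIoU penalizes its empty area, so the loss
    # still gives a useful gradient when the boxes do not overlap at all.
    lt_c = torch.min(pred[:, :2], target[:, :2])
    rb_c = torch.max(pred[:, 2:], target[:, 2:])
    area_c = (rb_c - lt_c).clamp(min=0).prod(dim=1)
    giou = iou - (area_c - union) / area_c.clamp(min=1e-7)
    return (1.0 - giou).mean()
```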
Physicists routinely need probabilistic models for a number of tasks such as parameter inference or the generation of new realizations of a field. Establishing such models for highly non-Gaussian fields is a challenge, especially when the number of samples is limited. In this paper, we introduce scattering spectra models for stationary fields and show that they provide accurate and robust statistical descriptions of a wide range of fields encountered in physics. These models are based on covariances of scattering coefficients, i.e., of a wavelet decomposition of the field composed with a pointwise modulus. After introducing useful dimension reductions that take advantage of the regularity of a field under rotation and scaling, we validate these models on various multi-scale physical fields and demonstrate that they reproduce standard statistics, including spatial moments up to fourth order. The scattering spectra thus provide a low-dimensional structured representation that captures key properties encountered in a wide range of physical fields. These generic models can be used for data exploration, classification, parameter inference, symmetry detection, and component separation.
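The pipeline behind scattering coefficients (wavelet filtering, pointwise modulus, then covariances) can be illustrated in a few lines; the 1-D toy below uses made-up Gaussian band-pass filters and omits the rotation/scaling reductions and higher-order terms of the actual models.

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.standard_normal(4096)               # stand-in for a stationary field

def filter_bank(n: int, scales: range) -> list:
    """Toy band-pass filters in Fourier space, one per dyadic scale."""
    f = np.abs(np.fft.fftfreq(n))
    return [np.exp(-0.5 * ((f - 2.0 ** (-j)) / 2.0 ** (-j - 2)) ** 2)
            for j in scales]

bank = filter_bank(len(x), range(1, 6))
# Wavelet decomposition coupled with a pointwise modulus.
U = [np.abs(np.fft.ifft(np.fft.fft(x) * h)) for h in bank]
S1 = [u.mean() for u in U]                  # first-order scattering coefficients
# Scattering spectra: covariances of the modulus fields across scales.
C = np.cov(np.stack(U))                     # np.cov centers each row itself
```

Even this toy version hints at why the representation is low-dimensional: a handful of scales yields a small covariance matrix, regardless of the field's size.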
Small targets are often submerged in the cluttered backgrounds of infrared images. Conventional detectors tend to generate false alarms, while CNN-based detectors lose small targets in deep layers. To this end, we propose iSmallNet, a multi-stream densely nested network with label decoupling for infrared small object detection. On the one hand, to fully exploit the shape information of small targets, we decouple the original labeled ground-truth (GT) map into an interior map and a boundary map. The GT map, in collaboration with the two additional maps, tackles the unbalanced distribution of small-object boundaries. On the other hand, two key modules are carefully designed and incorporated into the proposed network to boost the overall performance. First, to maintain small targets in deep layers, we develop a multi-scale nested interaction module to explore a wide range of context information. Second, we develop an interior-boundary fusion module to integrate multi-granularity information. Experiments on NUAA-SIRST and NUDT-SIRST clearly show the superiority of iSmallNet over 11 state-of-the-art detectors.
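One plausible way to realize the label decoupling described above is morphological: erode the GT mask to obtain the interior map and take the set difference for the boundary map. The sketch below implements that reading; the paper's actual construction may differ.

```python
import numpy as np
from scipy.ndimage import binary_erosion

def decouple_label(gt: np.ndarray, width: int = 1):
    """Split a binary GT mask into an interior map and a boundary map."""
    mask = gt.astype(bool)
    interior = binary_erosion(mask, iterations=width)  # shrink by `width` pixels
    boundary = mask & ~interior                        # what erosion removed
    return interior.astype(np.uint8), boundary.astype(np.uint8)
```

Supervising interior and boundary maps separately counteracts the imbalance noted above: for very small targets, boundary pixels dominate the mask, so a single GT map underweights shape cues.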
Large language models (LLMs) have significantly advanced the field of natural language processing (NLP), providing a highly useful, task-agnostic foundation for a wide range of applications. The great promise of LLMs as general task solvers has motivated people to extend their functionality far beyond that of a ``chatbot'' and to use them as assistants or even replacements for domain experts and tools in specific domains such as healthcare, finance, and education. However, directly applying LLMs to solve sophisticated problems in specific domains meets many hurdles, caused by the heterogeneity of domain data, the sophistication of domain knowledge, the uniqueness of domain objectives, and the diversity of constraints (e.g., various social norms, cultural conformity, religious beliefs, and ethical standards in the domain applications). To fill this gap, research and practice on the domain specialization of LLMs have increased explosively in recent years, which calls for a comprehensive and systematic review to better summarize and guide this promising field. In this survey paper, we first propose a systematic taxonomy that categorizes LLM domain-specialization techniques based on the level of access to the LLM, and we summarize the framework for all the subcategories as well as their relations and differences to each other. We also present a comprehensive taxonomy of critical application domains that can benefit from specialized LLMs, discussing their practical significance and open challenges. Furthermore, we offer insights into the current research status and future trends in this area.
In recent years, Graph Neural Networks have reported outstanding performance in tasks like community detection, molecule classification, and link prediction. However, the black-box nature of these models prevents their application in domains like health and finance, where understanding the models' decisions is essential. Counterfactual Explanations (CE) provide these understandings through examples, and the literature on CE is flourishing with novel explanation methods tailored to graph learning. In this survey, we analyse the existing Graph Counterfactual Explanation methods, providing the reader with an organisation of the literature according to a uniform formal notation for definitions, datasets, and metrics, thus simplifying comparison of the methods' advantages and disadvantages. We discuss seven methods and sixteen synthetic and real datasets, providing details on the possible generation strategies. We highlight the most common evaluation strategies and formalise nine of the metrics used in the literature. We also introduce the evaluation framework GRETEL and show how it can be extended and used, providing a further dimension of comparison that encompasses reproducibility aspects. Finally, we discuss how counterfactual explanation interplays with privacy and fairness, before delving into open challenges and future work.
Invariant risk minimization (IRM) has recently emerged as a promising alternative for domain generalization. Nevertheless, its loss function is difficult to optimize for nonlinear classifiers, and the original optimization objective can fail when pseudo-invariant features and geometric skews exist. Inspired by IRM, in this paper we propose a novel formulation for domain generalization, dubbed invariant information bottleneck (IIB). IIB aims at minimizing invariant risks for nonlinear classifiers while simultaneously mitigating the impact of pseudo-invariant features and geometric skews. Specifically, we first present a novel formulation of invariant causal prediction via mutual information. We then adopt the variational formulation of mutual information to develop a tractable loss function for nonlinear classifiers. To overcome the failure modes of IRM, we propose to minimize the mutual information between the inputs and the corresponding representations. IIB significantly outperforms IRM on synthetic datasets where pseudo-invariant features and geometric skews occur, showing the effectiveness of the proposed formulation in overcoming the failure modes of IRM. Furthermore, experiments on DomainBed show that IIB outperforms $13$ baselines by $0.9\%$ on average across $7$ real datasets.
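Schematically, and in our own notation rather than the paper's exact equations, an objective of the kind described reads
\[
\min_{\phi}\; \beta\, I\big(X;\, \phi(X)\big) \;-\; I\big(Y;\, \phi(X)\big)
\quad \text{subject to} \quad Y \perp e \,\big|\, \phi(X),
\]
where $e$ indexes training environments: the term $-I(Y;\phi(X))$ preserves label information, the conditional-independence constraint encodes invariant causal prediction via mutual information, and the compression term $I(X;\phi(X))$ is the one credited above with mitigating pseudo-invariant features and geometric skews; the variational formulation makes both mutual-information terms tractable for nonlinear $\phi$.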