This article argues for a data-centric approach to improving the quality of data representations, particularly when the available data is limited. To address this "small-data" problem, we discuss four methods for generating and aggregating training data: data augmentation, transfer learning, federated learning, and generative adversarial networks (GANs). We also propose knowledge-guided GANs, which incorporate domain knowledge into the generation of training data.
[2212.13591] Knowledge-Guided Data-Centric AI in Healthcare: Progress, Shortcomings, and Future Directions (arxiv.org)