Journal Article DZNE-2024-01006

Towards Biologically Plausible and Private Gene Expression Data Generation


2024
De Gruyter Open Warsaw, Poland

Proceedings on Privacy Enhancing Technologies 2024(2), 531-554 [doi:10.56553/popets-2024-0062]


Please use a persistent id in citations: doi:10.56553/popets-2024-0062

Abstract: Generative models trained with Differential Privacy (DP) are becoming increasingly prominent in the creation of synthetic data for downstream applications. Existing literature, however, primarily focuses on basic benchmarking datasets and tends to report promising results only for elementary metrics and relatively simple data distributions. In this paper, we initiate a systematic analysis of how DP generative models perform in their natural application scenarios, specifically focusing on real-world gene expression data. We conduct a comprehensive analysis of five representative DP generation methods, examining them from various angles, such as downstream utility, statistical properties, and biological plausibility. Our extensive evaluation illuminates the unique characteristics of each DP generation method, offering critical insights into the strengths and weaknesses of each approach, and uncovering intriguing possibilities for future developments. Perhaps surprisingly, our analysis reveals that most methods are capable of achieving seemingly reasonable downstream utility, according to the standard evaluation metrics considered in existing literature. Nevertheless, we find that none of the DP methods are able to accurately capture the biological characteristics of the real dataset. This observation suggests a potential over-optimistic assessment of current methodologies in this field and underscores a pressing need for future enhancements in model design.


Contributing Institute(s):
  1. Clinical Single Cell Omics (CSCO) / Systems Medicine (AG Schultze)
  2. Modular High Performance Computing and Artificial Intelligence (AG Becker)
Research Program(s):
  1. 354 - Disease Prevention and Healthy Aging (POF4-354)

Appears in the scientific report 2024
Database coverage:
Medline ; Creative Commons Attribution CC BY 4.0 ; OpenAccess

The record appears in these collections:
Document types > Articles > Journal Article
Institute Collections > BN DZNE > BN DZNE-AG Schultze
Institute Collections > BN DZNE > BN DZNE-AG Becker
Full Text Collection
Public records
Publications Database

 Record created 2024-08-07, last modified 2024-08-16

