001     269770
005     20240809090116.0
024 7 _ |a 10.1038/s41598-024-62724-6
|2 doi
024 7 _ |a pmid:38789621
|2 pmid
024 7 _ |a pmc:PMC11126405
|2 pmc
024 7 _ |a altmetric:163793919
|2 altmetric
037 _ _ |a DZNE-2024-00612
041 _ _ |a English
082 _ _ |a 600
100 1 _ |a Young, Cameron C
|b 0
245 _ _ |a Development and validation of a reliable DNA copy-number-based machine learning algorithm (CopyClust) for breast cancer integrative cluster classification.
260 _ _ |a [London]
|c 2024
|b Macmillan Publishers Limited, part of Springer Nature
336 7 _ |a article
|2 DRIVER
336 7 _ |a Output Types/Journal article
|2 DataCite
336 7 _ |a Journal Article
|b journal
|m journal
|0 PUB:(DE-HGF)16
|s 1716795558_24509
|2 PUB:(DE-HGF)
336 7 _ |a ARTICLE
|2 BibTeX
336 7 _ |a JOURNAL_ARTICLE
|2 ORCID
336 7 _ |a Journal Article
|0 0
|2 EndNote
520 _ _ |a The Integrative Cluster subtypes (IntClusts) provide a framework for the classification of breast cancer tumors into 10 distinct groups based on copy number and gene expression, each with unique biological drivers of disease and clinical prognoses. Gene expression data are often lacking, and accurate classification of samples into IntClusts with copy number data alone is essential. Current classification methods achieve low accuracy when gene expression data are absent, warranting the development of new approaches to IntClust classification. Copy number data from 1980 breast cancer samples from METABRIC were used to train multiclass XGBoost machine learning algorithms (CopyClust). A piecewise constant fit was applied to the average copy number profile of each IntClust, and unique breakpoints across the 10 profiles were identified and converted into ~500 genomic regions used as features for CopyClust. These models consisted of two approaches: a 10-class model with the final IntClust label predicted by a single multiclass model and a 6-class model with binary reclassification in which four pairs of IntClusts were combined for initial multiclass classification. Performance was validated on the TCGA dataset, with copy number data generated from both SNP arrays and WES platforms. CopyClust achieved 81% and 79% overall accuracy with the TCGA SNP and WES datasets, respectively, a nine-percentage-point or greater improvement in overall IntClust subtype classification accuracy. CopyClust achieves a significant improvement over current methods in classification accuracy of IntClust subtypes for samples without available gene expression data and is an easily implementable algorithm for IntClust classification of breast cancer samples with copy number data.
536 _ _ |a 354 - Disease Prevention and Healthy Aging (POF4-354)
|0 G:(DE-HGF)POF4-354
|c POF4-354
|f POF IV
|x 0
588 _ _ |a Dataset connected to CrossRef, PubMed, Journals: pub.dzne.de
650 _ 2 |a Humans
|2 MeSH
650 _ 2 |a Breast Neoplasms: genetics
|2 MeSH
650 _ 2 |a Breast Neoplasms: classification
|2 MeSH
650 _ 2 |a Machine Learning
|2 MeSH
650 _ 2 |a Female
|2 MeSH
650 _ 2 |a DNA Copy Number Variations: genetics
|2 MeSH
650 _ 2 |a Algorithms
|2 MeSH
650 _ 2 |a Cluster Analysis
|2 MeSH
650 _ 2 |a Gene Expression Profiling: methods
|2 MeSH
700 1 _ |a Eason, Katherine
|b 1
700 1 _ |a Manzano Garcia, Raquel
|b 2
700 1 _ |a Moulange, Richard
|b 3
700 1 _ |a Mukherjee, Sach
|0 P:(DE-2719)2811372
|b 4
|u dzne
700 1 _ |a Chin, Suet-Feung
|b 5
700 1 _ |a Caldas, Carlos
|b 6
700 1 _ |a Rueda, Oscar M
|b 7
773 _ _ |a 10.1038/s41598-024-62724-6
|g Vol. 14, no. 1, p. 11861
|0 PERI:(DE-600)2615211-3
|n 1
|p 11861
|t Scientific reports
|v 14
|y 2024
|x 2045-2322
856 4 _ |y OpenAccess
|u https://pub.dzne.de/record/269770/files/DZNE-2024-00612.pdf
856 4 _ |y OpenAccess
|x pdfa
|u https://pub.dzne.de/record/269770/files/DZNE-2024-00612.pdf?subformat=pdfa
909 C O |o oai:pub.dzne.de:269770
|p openaire
|p open_access
|p VDB
|p driver
|p dnbdelivery
910 1 _ |a Deutsches Zentrum für Neurodegenerative Erkrankungen
|0 I:(DE-588)1065079516
|k DZNE
|b 4
|6 P:(DE-2719)2811372
913 1 _ |a DE-HGF
|b Gesundheit
|l Neurodegenerative Diseases
|1 G:(DE-HGF)POF4-350
|0 G:(DE-HGF)POF4-354
|3 G:(DE-HGF)POF4
|2 G:(DE-HGF)POF4-300
|4 G:(DE-HGF)POF
|v Disease Prevention and Healthy Aging
|x 0
914 1 _ |y 2024
915 _ _ |a DBCoverage
|0 StatID:(DE-HGF)0200
|2 StatID
|b SCOPUS
|d 2023-08-24
915 _ _ |a Creative Commons Attribution CC BY (No Version)
|0 LIC:(DE-HGF)CCBYNV
|2 V:(DE-HGF)
|b DOAJ
|d 2023-04-12T15:11:06Z
915 _ _ |a DBCoverage
|0 StatID:(DE-HGF)1050
|2 StatID
|b BIOSIS Previews
|d 2023-08-24
915 _ _ |a DBCoverage
|0 StatID:(DE-HGF)1190
|2 StatID
|b Biological Abstracts
|d 2023-08-24
915 _ _ |a DBCoverage
|0 StatID:(DE-HGF)0600
|2 StatID
|b Ebsco Academic Search
|d 2023-08-24
915 _ _ |a DBCoverage
|0 StatID:(DE-HGF)1040
|2 StatID
|b Zoological Record
|d 2023-08-24
915 _ _ |a JCR
|0 StatID:(DE-HGF)0100
|2 StatID
|b SCI REP-UK : 2022
|d 2023-08-24
915 _ _ |a DBCoverage
|0 StatID:(DE-HGF)0160
|2 StatID
|b Essential Science Indicators
|d 2023-08-24
915 _ _ |a DBCoverage
|0 StatID:(DE-HGF)0501
|2 StatID
|b DOAJ Seal
|d 2023-04-12T15:11:06Z
915 _ _ |a DBCoverage
|0 StatID:(DE-HGF)0500
|2 StatID
|b DOAJ
|d 2023-04-12T15:11:06Z
915 _ _ |a WoS
|0 StatID:(DE-HGF)0113
|2 StatID
|b Science Citation Index Expanded
|d 2023-08-24
915 _ _ |a Fees
|0 StatID:(DE-HGF)0700
|2 StatID
|d 2023-08-24
915 _ _ |a DBCoverage
|0 StatID:(DE-HGF)0150
|2 StatID
|b Web of Science Core Collection
|d 2023-08-24
915 _ _ |a IF < 5
|0 StatID:(DE-HGF)9900
|2 StatID
|d 2023-08-24
915 _ _ |a OpenAccess
|0 StatID:(DE-HGF)0510
|2 StatID
915 _ _ |a Peer Review
|0 StatID:(DE-HGF)0030
|2 StatID
|b ASC
|d 2023-08-24
915 _ _ |a Article Processing Charges
|0 StatID:(DE-HGF)0561
|2 StatID
|d 2023-08-24
915 _ _ |a DBCoverage
|0 StatID:(DE-HGF)1150
|2 StatID
|b Current Contents - Physical, Chemical and Earth Sciences
|d 2023-08-24
915 _ _ |a DBCoverage
|0 StatID:(DE-HGF)0300
|2 StatID
|b Medline
|d 2023-08-24
915 _ _ |a DBCoverage
|0 StatID:(DE-HGF)0320
|2 StatID
|b PubMed Central
|d 2023-08-24
915 _ _ |a DBCoverage
|0 StatID:(DE-HGF)0199
|2 StatID
|b Clarivate Analytics Master Journal List
|d 2023-08-24
915 _ _ |a Peer Review
|0 StatID:(DE-HGF)0030
|2 StatID
|b DOAJ : Anonymous peer review
|d 2023-04-12T15:11:06Z
920 1 _ |0 I:(DE-2719)1013030
|k AG Mukherjee
|l Statistics and Machine Learning
|x 0
980 _ _ |a journal
980 _ _ |a VDB
980 _ _ |a UNRESTRICTED
980 _ _ |a I:(DE-2719)1013030
980 1 _ |a FullTexts

