PEDIA: prioritization of exome data by image analysis.

December 01, 2019

Hsieh TC, Mensah MA, Pantel JT, Aguilar D, Bar O, Bayat A, Becerra-Solano L, Bentzen HB, Biskup S, Borisov O, Braaten O, Ciaccio C, Coutelier M, Cremer K, Danyel M, Daschkey S, Eden HD, Devriendt K, Wilson S, Douzgou S, Đukić D, Ehmke N, Fauth C, Fischer-Zirnsak B, Fleischer N, Gabriel H, Graul-Neumann L, Gripp KW, Gurovich Y, Gusina A, Haddad N, Hajjir N, Hanani Y, Hertzberg J, Hoertnagel K, Howell J, Ivanovski I, Kaindl A, Kamphans T, Kamphausen S, Karimov C, Kathom H, Keryan A, Knaus A, Köhler S, Kornak U, Lavrov A, Leitheiser M, Lyon GJ, Mangold E, Reina PM, Carrascal AM, Mitter D, Herrador LM, Nadav G, Nöthen M, Orrico A, Ott CE, Park K, Peterlin B, Pölsler L, Raas-Rothschild A, Randolph L, Revencu N, Fagerberg CR, Robinson PN, Rosnev S, Rudnik S, Rudolf G, Schatz U, Schossig A, Schubach M, Shanoon O, Sheridan E, Smirin-Yosef P, Spielmann M, Suk EK, Sznajer Y, Thiel CT, Thiel G, Verloes A, Vrecar I, Wahl D, Weber I, Winter K, Wiśniewska M, Wollnik B, Yeung MW, Zhao M, Zhu N, Zschocke J, Mundlos S, Horn D, Krawitz PM



Phenotype information is crucial for the interpretation of genomic variants. So far, it has been accessible to bioinformatics workflows only after being encoded into clinical terms by expert dysmorphologists.


Here, we introduce an approach driven by artificial intelligence that uses portrait photographs for the interpretation of clinical exome data. We measured the value added by computer-assisted image analysis to the diagnostic yield on a cohort consisting of 679 individuals with 105 different monogenic disorders. For each case in the cohort we compiled frontal photos, clinical features, and the disease-causing variants, and simulated multiple exomes of different ethnic backgrounds.


Adding similarity scores from computer-assisted analysis of frontal photos improved the top-1 accuracy rate for the disease-causing gene by more than 20 percentage points, to 89%, and the top-10 accuracy rate by more than 5 percentage points, to 99%.
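For readers unfamiliar with the metric, the top-k accuracy rate is the fraction of cases in which the disease-causing gene appears among the first k genes of the prioritized list. The following is a minimal sketch of that calculation, not the evaluation code of the study; the function name, the case identifiers, and the toy rankings are all hypothetical, and the gene symbols serve only as placeholders.

```python
from typing import Mapping, Sequence


def top_k_accuracy(
    rankings: Mapping[str, Sequence[str]],
    causal_genes: Mapping[str, str],
    k: int,
) -> float:
    """Return the fraction of cases whose causal gene is among the first k ranked genes."""
    if k < 1:
        raise ValueError("k must be at least 1")
    if not rankings:
        raise ValueError("no cases to evaluate")
    hits = sum(
        1
        for case_id, ranked_genes in rankings.items()
        if causal_genes[case_id] in ranked_genes[:k]
    )
    return hits / len(rankings)


# Hypothetical toy data: one ranked gene list per case, best candidate first.
rankings = {
    "case_a": ["NIPBL", "SMC1A", "HDAC8"],
    "case_b": ["KMT2D", "ARID1B", "KDM6A"],
    "case_c": ["PTPN11", "SOS1", "RAF1"],
    "case_d": ["FGFR2", "FGFR3", "TWIST1"],
}
# Hypothetical ground truth: the disease-causing gene assumed for each toy case.
causal_genes = {
    "case_a": "NIPBL",
    "case_b": "KDM6A",
    "case_c": "SOS1",
    "case_d": "FGFR2",
}

top1 = top_k_accuracy(rankings, causal_genes, k=1)
top2 = top_k_accuracy(rankings, causal_genes, k=2)
top3 = top_k_accuracy(rankings, causal_genes, k=3)
print(f"top-1: {top1:.2f}, top-2: {top2:.2f}, top-3: {top3:.2f}")
```

Computing the same metric on rankings produced with and without the image-derived similarity score, and taking the difference, gives an improvement expressed in percentage points.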


Image analysis by deep-learning algorithms can be used to quantify phenotypic similarity (the PP4 criterion of the American College of Medical Genetics and Genomics guidelines) and to improve the performance of bioinformatics pipelines for exome analysis.