In the near future, each person's sequenced genome will be part of their electronic health record. At that point, genomic medicine will become fundamental to clinical practice as a key component of personalized medicine. The genomic data, together with the other 'omics' and clinical data that personalized medicine requires, are stored in several distributed databases. Research and patient care increasingly require the integration of biomedical data from multiple distributed, heterogeneous data sources.
This work presents a comprehensive review of the most relevant work on biomedical data integration, particularly of genomic medical data, analyzing how integration architectures and techniques, and their usage, have evolved over the last 20 years.
Most of these solutions, based on cross-linking, data warehouse, or federated approaches, are suitable for specific domains. However, none of the models found in the literature is fully appropriate for the general biomedical data integration problem.