Introduction
View this documentation at genomehubs.gitbooks.io
GenomeHubs[1] provide a straightforward way to create a collection of web services that make annotated genome assemblies accessible to a wide community of users. GenomeHubs use Docker containers to package each component tool together with its dependencies. This simplifies setup and the import of data from FASTA and GFF files into:
a custom Ensembl genome browser
a SequenceServer BLAST server
an h5ai-powered downloads server
[1] Challis RJ, Kumar S, Stevens L & Blaxter M (2017) GenomeHubs: simple containerized setup of a custom Ensembl database and web server for any species. Database, 2017:bax039 doi:10.1093/database/bax039.
In the overview diagram, GenomeHubs Docker containers are shown in rounded boxes with a double outline and the hosted sites are shown in plain boxes.
GenomeHubs containers are linked either by common file formats or through a MySQL container via the Ensembl API (arrows show the flow of information). Because these links are the only coupling between containers, it is relatively straightforward to expand the feature set by creating new containers to host additional services.
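The linking described above can be pictured as a small directed graph. The following is a minimal sketch, not part of GenomeHubs itself: the node labels, the exact topology and the `downstream` helper are all assumptions made for illustration, and none of the labels are real image names. It shows that adding a service amounts to adding one more link.

```python
from collections import defaultdict, deque

# Hypothetical labels and topology, for illustration only.
# Each link is (source, target, how the two are connected).
LINKS = [
    ("FASTA/GFF files", "import container", "file"),
    ("import container", "MySQL container", "Ensembl API"),
    ("MySQL container", "Ensembl browser container", "Ensembl API"),
    ("MySQL container", "export container", "Ensembl API"),
    ("export container", "BLAST server container", "file"),
    ("export container", "downloads server container", "file"),
]


def downstream(links, start):
    """Return every node that receives information from `start`,
    in breadth-first order."""
    graph = defaultdict(list)
    for source, target, _via in links:
        graph[source].append(target)

    seen = []
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for nxt in graph.get(node, []):
            if nxt not in seen:
                seen.append(nxt)
                queue.append(nxt)
    return seen


# Adding a new (hypothetical) service only requires one extra link
# that reuses an existing file format or the Ensembl API.
EXTENDED = LINKS + [("export container", "new analysis container", "file")]

if __name__ == "__main__":
    for name in downstream(EXTENDED, "MySQL container"):
        print(name)
```

In this sketch the existing links are left untouched when the new container is added, which mirrors the point that containers only need to agree on a file format or on the Ensembl API.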
The full set of GenomeHubs Docker containers also includes tools that export data from Ensembl databases into standard file formats, and containers that run analyses on these files. The analysis results can be imported back into Ensembl database format for display alongside the sequence and gene model data:
BLASTP against Swiss-Prot to add functional annotations
InterProScan to annotate protein domains
RepeatMasker to identify repetitive elements
CEGMA and BUSCO genome completeness assessments