Creating datasets in ISA-Tab

ToxBank Guide
Step-by-step instructions for creating datasets in ISA-Tab (dose-response and ‘omics).

This guide covers the role of ToxBank and the ToxBank Data Warehouse (sections 1-2); background on the SEURAT-1/ToxBank ISAcreator, the design principles behind the ISA-Tab file format, and some technical information (section 3); background on the toxicity ontologies and keyword hierarchy used to annotate protocols and data in the ToxBank Data Warehouse (section 4); instructions for getting started with creating datasets, using a dose-response study (section 5); and a detailed guide to creating a dose-response gene expression profiling dataset from a publicly available gene expression dataset (section 6). If you are reasonably familiar with ISA-Tab and ISAcreator, you can skip directly to section 5 or section 6.

ISA-Tab is designed to describe all meta-information necessary for reproducibility and downstream analysis (QC, normalisation, association with traits of interest, toxicological predictions). This includes investigation, sample and assay parameters and links to ontology terms.
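
As an illustration of these levels, the following is a minimal Python sketch that groups the files in a dataset folder by ISA-Tab level. It assumes the conventional ISA-Tab file-name prefixes (i_ for investigation, s_ for study/sample, a_ for assay), which are not spelled out in this section; the function name classify_isatab_files is hypothetical and is not part of ISAcreator or the ToxBank Data Warehouse.

```python
from pathlib import Path

# Conventional ISA-Tab file-name prefixes (an assumption of this sketch,
# not something defined in this guide).
ISA_PREFIXES = {"i_": "investigation", "s_": "study", "a_": "assay"}


def classify_isatab_files(folder):
    """Group the files directly inside `folder` by ISA-Tab level.

    Only .txt files with a known prefix are assigned to a level;
    everything else (e.g. raw data files) is listed under "other".
    """
    groups = {level: [] for level in ISA_PREFIXES.values()}
    groups["other"] = []
    for path in sorted(Path(folder).iterdir()):
        if not path.is_file():
            continue  # sub-folders are ignored in this sketch
        level = "other"
        if path.suffix.lower() == ".txt":
            level = ISA_PREFIXES.get(path.name[:2].lower(), "other")
        groups[level].append(path.name)
    return groups
```

Calling classify_isatab_files on a dataset folder returns a dictionary with the keys "investigation", "study", "assay" and "other", which can serve as a quick check that all three levels are present before the dataset is opened in ISAcreator.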

ISA-Tab is a universal data exchange and annotation format for biology-related studies. It is available at: This guide is partly based on an EBI/diXa ISA-Tab guide: (Stathis Kanterakis, EBI, 18/01/2013). It uses publicly available data sets from the large Japanese Toxicogenomics Project, TG-GATEs:

A SEURAT-1 customized version of the ISAcreator software (able to interface with the ToxBank Data Warehouse) is available from the help page of the ToxBank Data Warehouse.


To get started, download the software (see the ToxBank Data Warehouse help page). Running the software requires a Java virtual machine (JVM).

NOTE: Following this guide, you will also include the raw data files of an expression study. While you are practicing loading data into the ToxBank Data Warehouse (TBDW), please remove the .cel files (about 150 MB in compressed format) before uploading, to save time. Alternatively, the raw data files can be uploaded to the ToxBank FTP site and linked from the ISA-Tab fields (contact ToxBank support for details). Unpublished SEURAT-1 studies should include the raw data files as well. TBDW includes mechanisms that keep submitted data secure and restrict access to authorized persons.
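
For practice uploads, the removal of the .cel files can be scripted. The following is a minimal Python sketch that packs a dataset folder into a zip archive while skipping any file whose name ends in .cel (case-insensitive). It assumes that the dataset is a plain folder and that a zip archive is an acceptable package for upload, neither of which is stated in this section; the function name zip_without_cel is hypothetical. Separately compressed raw files (for example names ending in .cel.gz) are not skipped by this sketch.

```python
import zipfile
from pathlib import Path


def zip_without_cel(dataset_dir, out_zip):
    """Zip `dataset_dir` into `out_zip`, leaving out raw .cel files.

    Returns the relative paths of the files that were skipped, so they
    can be uploaded separately (e.g. via FTP) if needed.
    """
    dataset_dir = Path(dataset_dir)
    out_zip = Path(out_zip)
    skipped = []
    with zipfile.ZipFile(out_zip, "w", zipfile.ZIP_DEFLATED) as archive:
        for path in sorted(dataset_dir.rglob("*")):
            # Ignore folders, and the output archive itself in case it
            # was created inside the dataset folder.
            if not path.is_file() or path.resolve() == out_zip.resolve():
                continue
            rel = path.relative_to(dataset_dir).as_posix()
            if path.name.lower().endswith(".cel"):
                skipped.append(rel)  # raw data file: do not include
                continue
            archive.write(path, rel)
    return skipped
```

The original folder is left untouched, so the complete dataset, including the raw files, remains available for a later full submission.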

