Interface ArrayDesignAnnotationService

  • All Known Implementing Classes:
    ArrayDesignAnnotationServiceImpl

    public interface ArrayDesignAnnotationService
    Methods to generate annotations for array designs, based on information already in the database. This can be used to generate annotation files used for ermineJ, for example. The file format:
    • The file is tab-delimited text. Comma-delimited files or Excel spreadsheets (for example) are not supported.
    • There is a one-line header included in the file for readability.
    • The first column contains the probe identifier.
    • The second column contains the gene symbol(s). Clusters are delimited by '|' and genes within clusters are delimited by ','.
    • The third column contains the gene names (or descriptions). Clusters are delimited by '|' and names within clusters are delimited by '$'.
    • The fourth column contains a list of GO identifiers, delimited by '|'. These include the "GO:" prefix and are zero-padded to seven digits, so they read "GO:0494494" and not "494494".

    Note that for backwards compatibility, GO terms are not segregated by gene cluster.
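
As an illustration of the format described above, the following is a minimal sketch of how a consumer might split one data row into its parts. The class name AnnotationLineParser, the nested Row type, and its field names are hypothetical and are not part of this interface. The sketch assumes the four-column layout listed above and that the header line has already been skipped.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

/** Hypothetical helper, not part of Gemma: parses one row of the four-column format. */
final class AnnotationLineParser {

    /** One parsed row. Field names are illustrative only. */
    static final class Row {
        final String probe;
        final List<List<String>> symbolClusters; // outer list: '|' clusters, inner list: ',' genes
        final List<List<String>> nameClusters;   // outer list: '|' clusters, inner list: '$' names
        final List<String> goTerms;              // flat list, not segregated by cluster

        Row(String probe, List<List<String>> symbolClusters,
            List<List<String>> nameClusters, List<String> goTerms) {
            this.probe = probe;
            this.symbolClusters = symbolClusters;
            this.nameClusters = nameClusters;
            this.goTerms = goTerms;
        }
    }

    /** Splits a field on '|' into clusters, then each cluster on the given inner delimiter (a regex). */
    static List<List<String>> splitClusters(String field, String innerDelimiterRegex) {
        List<List<String>> clusters = new ArrayList<>();
        if (field.isEmpty()) {
            return clusters;
        }
        for (String cluster : field.split("\\|", -1)) {
            clusters.add(Arrays.asList(cluster.split(innerDelimiterRegex, -1)));
        }
        return clusters;
    }

    /** Parses a single tab-delimited data row (not the header). */
    static Row parse(String line) {
        String[] cols = line.split("\t", -1);
        if (cols.length < 4) {
            throw new IllegalArgumentException("Expected at least 4 tab-delimited columns, got " + cols.length);
        }
        List<String> go = cols[3].isEmpty()
                ? Collections.<String>emptyList()
                : Arrays.asList(cols[3].split("\\|", -1));
        // '$' is a regex metacharacter, so it is escaped for String.split.
        return new Row(cols[0], splitClusters(cols[1], ","), splitClusters(cols[2], "\\$"), go);
    }
}
```

Because GO terms are not segregated by gene cluster, the sketch keeps them as a single flat list rather than attempting to pair them with individual genes.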

    Author:
    paul
    • Method Detail

      • deleteExistingFiles

        void deleteExistingFiles(ArrayDesign arrayDesign)
      • create

        void create(ArrayDesign inputAd,
                    Boolean useGO,
                    boolean deleteOtherFiles)
             throws IOException
        Create (or update) all the annotation files for the given platform. Side effect: any expression experiment data files that use this platform will be deleted. Format details: There is a one-line header. The columns are:
        1. Probe name
        2. Gene symbol. Genes located at different genome locations are delimited by "|"; multiple genes at the same location are delimited by ",". Both can happen simultaneously.
        3. Gene name, delimited as for the symbol except '$' is used instead of ','.
        4. GO terms, delimited by '|'. Multiple genes are not handled specially (for compatibility with ermineJ). GO terms are included only if useGO is true.
        5. Gemma's gene ids, delimited by '|'
        6. NCBI gene ids, delimited by '|'
        7. Ensembl gene ids, delimited by '|'
        Parameters:
        inputAd - platform to process
        useGO - if true, GO terms will be included
        deleteOtherFiles - if true, other files containing the annotations for this platform will be deleted, such as DEA results and data flat files.
        Throws:
        IOException
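
To make the seven-column layout concrete, here is a minimal sketch of how one row could be assembled from already-collected values. It is not the implementation used by this service: the class AnnotationRowFormatter and its method names are hypothetical. It also assumes that when useGO is false the GO column is left empty rather than dropped, which the description above does not specify.

```java
import java.util.List;
import java.util.stream.Collectors;

/** Hypothetical helper, not part of Gemma: builds one row of the seven-column format. */
final class AnnotationRowFormatter {

    /** Joins genes within a cluster with the inner delimiter, and clusters with '|'. */
    static String joinClusters(List<List<String>> clusters, String innerDelimiter) {
        return clusters.stream()
                .map(cluster -> String.join(innerDelimiter, cluster))
                .collect(Collectors.joining("|"));
    }

    /** Assembles a tab-delimited row in the column order listed above. */
    static String formatRow(String probe,
                            List<List<String>> symbolClusters,
                            List<List<String>> nameClusters,
                            List<String> goTerms,
                            List<String> gemmaIds,
                            List<String> ncbiIds,
                            List<String> ensemblIds,
                            boolean useGO) {
        // Assumption: an empty GO column when useGO is false.
        String go = useGO ? String.join("|", goTerms) : "";
        return String.join("\t",
                probe,
                joinClusters(symbolClusters, ","),  // column 2: ',' within a location
                joinClusters(nameClusters, "$"),    // column 3: '$' instead of ','
                go,
                String.join("|", gemmaIds),
                String.join("|", ncbiIds),
                String.join("|", ensemblIds));
    }
}
```

Keeping the column count fixed when GO terms are omitted is a design choice of this sketch, so that the id columns stay at positions 5 to 7 for any reader that indexes by position.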
      • generateAnnotationFile

        int generateAnnotationFile(Writer writer,
                                   Collection<Gene> genes,
                                   Boolean useGO)
        Generate an annotation for a list of genes, instead of probes. The second column will contain the NCBI id, if available. Will generate the 'short' version.
        Parameters:
        writer - the writer
        genes - genes
        useGO - if true, GO terms will be included
        Returns:
        code