Package ubic.gemma.core.analysis.service
Interface ArrayDesignAnnotationService
-
- All Known Implementing Classes:
ArrayDesignAnnotationServiceImpl
public interface ArrayDesignAnnotationService
Methods to generate annotations for array designs, based on information already in the database. This can be used to generate annotation files used for ermineJ, for example. The file format:
- The file is tab-delimited text. Comma-delimited files or Excel spreadsheets (for example) are not supported.
- There is a one-line header included in the file for readability.
- The first column contains the probe identifier.
- The second column contains the gene symbol(s). Clusters are delimited by '|' and genes within clusters are delimited by ','.
- The third column contains the gene names (or description). Clusters are delimited by '|' and names within clusters are delimited by '$'
- The fourth column contains a delimited list of GO identifiers. These include the "GO:" prefix. Thus they read "GO:00494494" and not "494494". Delimited by '|'.
Note that for backwards compatibility, GO terms are not segregated by gene cluster.
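The format above can be read back with plain string splitting. The following is a minimal sketch, not part of Gemma: the class name AnnotationLineSketch, its method, and the sample line are all hypothetical, and the sample data is made up. It assumes a data line (not the header) laid out exactly as the four columns described above.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

/** Hypothetical helper (not a Gemma class): splits one data line of an annotation file. */
class AnnotationLineSketch {

    /**
     * Splits a clustered field: clusters are separated by '|',
     * members within a cluster by innerDelimiter (',' for symbols, '$' for names).
     */
    static List<List<String>> splitClusters(String field, String innerDelimiter) {
        List<List<String>> clusters = new ArrayList<>();
        if (field.isEmpty()) {
            return clusters; // no genes for this probe
        }
        // Pattern.quote stops '|' and '$' from being treated as regex metacharacters.
        for (String cluster : field.split(Pattern.quote("|"))) {
            clusters.add(Arrays.asList(cluster.split(Pattern.quote(innerDelimiter))));
        }
        return clusters;
    }

    public static void main(String[] args) {
        // Made-up example line: probe, symbols, names, GO identifiers.
        String line = "probe_1\tGENEA,GENEB|GENEC\tgene a name$gene b name|gene c name\tGO:0000001|GO:0000002";
        String[] columns = line.split("\t", -1); // -1 keeps trailing empty columns

        System.out.println("probe:   " + columns[0]);
        System.out.println("symbols: " + splitClusters(columns[1], ","));
        System.out.println("names:   " + splitClusters(columns[2], "$"));
        // GO identifiers are a flat '|'-delimited list, not grouped by gene cluster.
        System.out.println("GO ids:  " + Arrays.asList(columns[3].split(Pattern.quote("|"))));
    }
}
```

Because GO terms are not segregated by gene cluster, the sketch returns them as a single flat list rather than trying to pair them with the clusters in the second column.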
- Author:
- paul
-
Nested Class Summary
Nested Classes (Modifier and Type, Interface, Description):
static class ArrayDesignAnnotationService.OutputType
-
Field Summary
Fields (Modifier and Type, Field, Description):
static String ANNOT_DATA_DIR
static String ANNOTATION_FILE_DIRECTORY_NAME
static String ANNOTATION_FILE_SUFFIX
static String BIO_PROCESS_FILE_SUFFIX
static String NO_PARENTS_FILE_SUFFIX
static String STANDARD_FILE_SUFFIX
String included in file names for standard (default) annotation files.
-
Method Summary
Methods (Modifier and Type, Method, Description):
void create(ArrayDesign inputAd, Boolean useGO, boolean deleteOtherFiles)
Create (or update) all the annotation files for the given platform.
void deleteExistingFiles(ArrayDesign arrayDesign)
int generateAnnotationFile(Writer writer, Collection<Gene> genes, Boolean useGO)
Generate an annotation for a list of genes, instead of probes.
-
Field Detail
-
ANNOTATION_FILE_SUFFIX
static final String ANNOTATION_FILE_SUFFIX
- See Also:
- Constant Field Values
-
BIO_PROCESS_FILE_SUFFIX
static final String BIO_PROCESS_FILE_SUFFIX
- See Also:
- Constant Field Values
-
NO_PARENTS_FILE_SUFFIX
static final String NO_PARENTS_FILE_SUFFIX
- See Also:
- Constant Field Values
-
STANDARD_FILE_SUFFIX
static final String STANDARD_FILE_SUFFIX
String included in file names for standard (default) annotation files. These include GO terms and all parents.
- See Also:
- Constant Field Values
-
ANNOTATION_FILE_DIRECTORY_NAME
static final String ANNOTATION_FILE_DIRECTORY_NAME
- See Also:
- Constant Field Values
-
ANNOT_DATA_DIR
static final String ANNOT_DATA_DIR
-
Method Detail
-
deleteExistingFiles
void deleteExistingFiles(ArrayDesign arrayDesign)
-
create
void create(ArrayDesign inputAd, Boolean useGO, boolean deleteOtherFiles) throws IOException
Create (or update) all the annotation files for the given platform. Side effect: any expression experiment data files that use this platform will be deleted.
Format details: There is a one-line header. The columns are:
- Probe name
- Gene symbol. Genes located at different genome locations are delimited by "|"; multiple genes at the same location are delimited by ",". Both can happen simultaneously.
- Gene name, delimited as for the symbol except '$' is used instead of ','.
- GO terms, delimited by '|'; GO terms are left out if useGO is false. Multiple genes are not handled specially (for compatibility with ermineJ).
- Gemma's gene ids, delimited by '|'
- NCBI gene ids, delimited by '|'
- Ensembl gene ids, delimited by '|'
- Parameters:
inputAd - platform to process
useGO - if true, GO terms will be included
deleteOtherFiles - if true, other files containing the annotations for this platform will be deleted, such as DEA results and data flat files.
- Throws:
IOException
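The seven-column layout described for create can be illustrated by assembling a single row by hand. This is a sketch under stated assumptions, not Gemma's implementation: the class AnnotationRowSketch and its methods are hypothetical, the ids are made up, and passing an empty GO list to stand in for useGO being false is an assumption about how that case looks in the file.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.stream.Collectors;

/** Hypothetical helper (not a Gemma class): builds one tab-delimited annotation row. */
class AnnotationRowSketch {

    /** Joins members within a location by innerDelimiter and locations by '|'. */
    static String joinClusters(List<List<String>> clusters, String innerDelimiter) {
        return clusters.stream()
                .map(cluster -> String.join(innerDelimiter, cluster))
                .collect(Collectors.joining("|"));
    }

    /** Builds a row in the column order listed above: probe, symbols, names, GO, Gemma, NCBI, Ensembl. */
    static String formatRow(String probe,
                            List<List<String>> symbols,
                            List<List<String>> names,
                            List<String> goTerms,
                            List<String> gemmaIds,
                            List<String> ncbiIds,
                            List<String> ensemblIds) {
        return String.join("\t",
                probe,
                joinClusters(symbols, ","),  // ',' within a location, '|' between locations
                joinClusters(names, "$"),    // same shape as symbols, but '$' instead of ','
                String.join("|", goTerms),   // flat list, not grouped per gene
                String.join("|", gemmaIds),
                String.join("|", ncbiIds),
                String.join("|", ensemblIds));
    }

    public static void main(String[] args) {
        // Made-up data: two genes at one location, a third gene at another.
        String row = formatRow(
                "probe_1",
                Arrays.asList(Arrays.asList("GENEA", "GENEB"), Arrays.asList("GENEC")),
                Arrays.asList(Arrays.asList("gene a name", "gene b name"), Arrays.asList("gene c name")),
                Collections.emptyList(), // assumption: models a file written without GO terms
                Arrays.asList("10", "11", "12"),
                Arrays.asList("100", "101", "102"),
                Arrays.asList("ENSG_A", "ENSG_B", "ENSG_C"));
        System.out.println(row);
    }
}
```

Using '$' rather than ',' inside the name column presumably avoids ambiguity with commas that appear in gene names themselves; the source does not state the reason.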
-
generateAnnotationFile
int generateAnnotationFile(Writer writer, Collection<Gene> genes, Boolean useGO)
Generate an annotation for a list of genes, instead of probes. The second column will contain the NCBI id, if available. Will generate the 'short' version.
- Parameters:
writer - the writer
genes - genes
useGO - if true, GO terms will be included
- Returns:
- code