Interface ArrayDesignProbeMapperService

All Known Implementing Classes:
ArrayDesignProbeMapperServiceImpl

public interface ArrayDesignProbeMapperService
Author:
Paul
  • Method Details

    • printResult

      void printResult(CompositeSequence compositeSequence, Collection<BlatAssociation> col)
      Print the BLAT associations for a composite sequence to STDOUT.
      Parameters:
      compositeSequence - composite sequence
      col - blat associations
    • processArrayDesign

      void processArrayDesign(ArrayDesign arrayDesign)
      Do probe mapping, writing the results to the database and using default settings.
      Parameters:
      arrayDesign - the array design (platform) to process
    • processArrayDesign

      void processArrayDesign(ArrayDesign arrayDesign, ProbeMapperConfig config, boolean useDB)
      Parameters:
      arrayDesign - the array design (platform) to process
      config - probe mapper configuration
      useDB - if false, the results will not be written to the database, but printed to stdout instead.
    • processArrayDesign

      void processArrayDesign(ArrayDesign arrayDesign, Taxon taxon, File source, ExternalDatabase sourceDB, boolean ncbiIds) throws IOException
      Annotate an array design using a direct source file. This should only be used if we can't run sequence analysis ourselves. The expected file format is tab-delimited with the following columns:
      • Probe name which must match the probe names Gemma uses for the array design.
      • Sequence name. If blank, it is ignored and the probe is still mapped if possible, but the probe is skipped if it isn't already associated with a sequence. If not blank and the probe has no sequence, the name is used to create one; if the probe already has a sequence, the name is checked for a match.
      • Gene symbol. More than one gene can be specified, delimited by '|'. Genes will only be found if Gemma has an unambiguous match to the name. The gene must already exist in the system.
      Comment lines begin with '#'. Note that all the RNA gene products of the gene will be associated with the sequence. This is necessary because 1) Gemma associates sequences with transcripts, not genes, and 2) if all we get is a gene, we have to assume all gene products are relevant.
      Parameters:
      arrayDesign - the array design (platform) to annotate
      taxon - We require this to ensure correct association of the sequences with the genes.
      source - the tab-delimited annotation file described above
      sourceDB - describes where the annotations came from. Can be null if you really don't know.
      ncbiIds - true if the values provided are NCBI IDs, not gene symbols (NCBI IDs are more reliable)
      Throws:
      IllegalStateException - if the input file doesn't match the array design.
      IOException - when IO problems occur.
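
As an illustration of the tab-delimited file format described above, here is a minimal sketch of how one line could be split into its three columns. It is not Gemma's actual parser: the class `AnnotationLineSketch` and its fields are hypothetical names. Skipping blank lines and rejecting lines with fewer than three columns are assumptions, since the documentation does not say how such lines are handled.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Optional;

/** Hypothetical helper (not part of Gemma): one parsed line of the annotation file. */
final class AnnotationLineSketch {
    final String probeName;     // column 1: must match the probe name used for the array design
    final String sequenceName;  // column 2: may be empty
    final List<String> genes;   // column 3: gene symbols (or NCBI IDs), originally '|'-delimited

    private AnnotationLineSketch(String probeName, String sequenceName, List<String> genes) {
        this.probeName = probeName;
        this.sequenceName = sequenceName;
        this.genes = genes;
    }

    /**
     * Parses a single line. Returns an empty Optional for comment lines starting with '#'
     * and (by assumption) for blank lines.
     */
    static Optional<AnnotationLineSketch> parse(String line) {
        if (line.trim().isEmpty() || line.startsWith("#")) {
            return Optional.empty();
        }
        // A limit of -1 keeps empty fields, so a blank sequence name is preserved.
        String[] fields = line.split("\t", -1);
        if (fields.length < 3) {
            // Assumption: malformed lines are rejected; the real behavior may differ.
            throw new IllegalArgumentException("Expected 3 tab-delimited columns: " + line);
        }
        List<String> genes = new ArrayList<>();
        for (String gene : fields[2].split("\\|")) {
            String trimmed = gene.trim();
            if (!trimmed.isEmpty()) {
                genes.add(trimmed);
            }
        }
        return Optional.of(new AnnotationLineSketch(fields[0].trim(), fields[1].trim(), genes));
    }
}
```

For example, `AnnotationLineSketch.parse("probe_1\t\tBRCA1|TP53")` yields a probe name of `probe_1`, an empty sequence name, and two genes. The sketch only splits the line. Checking the probe and sequence names and resolving genes against the system are left to the service.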
    • processCompositeSequence

      @Transactional Map<String,Collection<BlatAssociation>> processCompositeSequence(ProbeMapperConfig config, Taxon taxon, GoldenPathSequenceAnalysis goldenPathDb, CompositeSequence compositeSequence)
    • deleteOldFiles

      void deleteOldFiles(ArrayDesign arrayDesign)
      Delete outdated annotation and associated experiment files.
      Parameters:
      arrayDesign - platform