Class GeoConverterImpl

  • All Implemented Interfaces:
    GeoConverter, Converter<GeoData, Object>

    @Component
    @Scope("prototype")
    public class GeoConverterImpl
    extends Object
    implements GeoConverter
    Converts GEO domain objects into Gemma objects. Conversion is usually triggered by passing in GeoSeries objects.

    GEO has four basic kinds of objects: Platforms (ArrayDesigns), Samples (BioAssays), Series (Experiments) and DataSets (curated Experiments). A sample can belong to more than one series, and a series can include more than one dataset. GEO also supports the concept of a superseries. See http://www.ncbi.nlm.nih.gov/projects/geo/info/soft2.html.

    A curated expression data set is initially represented by a GEO "GDS" number (a curated dataset), which maps to a series (GSE). However, multiple datasets may together form a single series (GSE). This can happen when the "A" and "B" arrays were both run on the same samples. For this reason we normally go by GSE.

    This service can be used in database-aware or database-unaware states. It has prototype scope because it keeps some 'global' data structures that are used during processing.
    Author:
    keshav, pavlidis
    • Constructor Detail

      • GeoConverterImpl

        public GeoConverterImpl()
    • Method Detail

      • clear

        public void clear()
        Description copied from interface: GeoConverter
        Remove old results. Call this prior to starting conversion of a full dataset.
        Specified by:
        clear in interface GeoConverter
      • convertSubsetToExperimentalFactor

        public void convertSubsetToExperimentalFactor(ExpressionExperiment expExp,
                                                      GeoSubset geoSubSet)
        Description copied from interface: GeoConverter
        Converts GEO subsets to experimental factors. This adds a new factor value to the experimental factor of an experimental design, and adds that factor value to each BioMaterial of a specific BioAssay.
        Specified by:
        convertSubsetToExperimentalFactor in interface GeoConverter
        Parameters:
        expExp - experiment
        geoSubSet - GEO subset
      • getPrimaryArrayTaxon

        public Taxon getPrimaryArrayTaxon(Collection<Taxon> platformTaxa,
                                          Collection<String> probeTaxa)
                                   throws IllegalArgumentException
        Determines the primary taxon of the array. There are four main branches of logic:
          1. If only one platform taxon is defined on the GEO submission, that taxon is the primary taxon.
          2. If multiple taxa are given for the platform, they are checked for a common parent. If they share one, the parent is the primary taxon (e.g. salmonid, where Atlantic salmon and rainbow trout are given).
          3. Otherwise the probeTaxa are examined and the most common probe taxon is taken as the primary taxon.
          4. If no taxon is found, an IllegalArgumentException is thrown.
        Specified by:
        getPrimaryArrayTaxon in interface GeoConverter
        Parameters:
        platformTaxa - Collection of taxa that were given on the GEO array submission as platform taxa
        probeTaxa - Collection of taxa strings defining the taxon of each probe on the array.
        Returns:
        Primary taxon of array as determined by this method
        Throws:
        IllegalArgumentException - if no primary taxon can be determined
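        The branching described for getPrimaryArrayTaxon can be sketched as follows. This is a minimal illustration, not the Gemma implementation: the nested Taxon class (a name plus an optional parent) is a hypothetical stand-in for the real Taxon, and matching probe taxon strings to platform taxa by name is an assumption.

```java
import java.util.Collection;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

class PrimaryTaxonSketch {

    /** Hypothetical stand-in for Gemma's Taxon: a name and an optional parent. */
    static final class Taxon {
        final String name;
        final Taxon parent; // null when the taxon has no parent

        Taxon(String name, Taxon parent) {
            this.name = name;
            this.parent = parent;
        }
    }

    static Taxon primaryTaxon(Collection<Taxon> platformTaxa, Collection<String> probeTaxa) {
        // Branch 1: a single platform taxon is the primary taxon.
        if (platformTaxa.size() == 1) {
            return platformTaxa.iterator().next();
        }

        // Branch 2: several platform taxa that all share one parent.
        Set<Taxon> parents = new HashSet<>();
        boolean allHaveParent = !platformTaxa.isEmpty();
        for (Taxon t : platformTaxa) {
            if (t.parent == null) {
                allHaveParent = false;
                break;
            }
            parents.add(t.parent);
        }
        if (allHaveParent && parents.size() == 1) {
            return parents.iterator().next();
        }

        // Branch 3: pick the platform taxon named most often among the probes
        // (assumption: probe taxon strings match Taxon.name exactly).
        Map<String, Integer> counts = new HashMap<>();
        for (String probeTaxon : probeTaxa) {
            counts.merge(probeTaxon, 1, Integer::sum);
        }
        Taxon best = null;
        int bestCount = 0;
        for (Taxon t : platformTaxa) {
            int count = counts.getOrDefault(t.name, 0);
            if (count > bestCount) {
                best = t;
                bestCount = count;
            }
        }
        if (best != null) {
            return best;
        }

        // Branch 4: nothing matched.
        throw new IllegalArgumentException("No primary taxon could be determined");
    }
}
```

        In this sketch, passing Atlantic salmon and rainbow trout taxa that share a salmonid parent returns the salmonid taxon without consulting the probe taxa at all.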
      • setSplitByPlatform

        public void setSplitByPlatform(boolean splitByPlatform)
        Specified by:
        setSplitByPlatform in interface GeoConverter
        Parameters:
        splitByPlatform - If true, and the series uses more than one platform, split it up. This often isn't necessary/desirable. This is overridden if the series uses more than one species, in which case it is always split up.
      • convertData

        public byte[] convertData(List<Object> vector,
                                  QuantitationType qt)
        Converts a vector of strings into a byte[] for saving in the database. Blanks (missing values) are treated as NaN (double), 0 (integer), false (boolean) or empty strings (string). Other invalid values are treated the same way as missing data, which keeps the parser from failing on unusual GEO files that contain values like "Error" for an expression value.
        Specified by:
        convertData in interface GeoConverter
        Parameters:
        vector - Strings to be converted to primitive values (double, int, etc.)
        qt - The quantitation type for the values to be converted.
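        The missing-value handling described for convertData can be illustrated for the double case. This is a sketch under stated assumptions, not the Gemma implementation: the class and method names are hypothetical, the QuantitationType dispatch is omitted, and the byte layout (8 bytes per value, big-endian, via java.nio.ByteBuffer) is assumed for illustration only.

```java
import java.nio.ByteBuffer;
import java.util.List;

class ConvertDataSketch {

    /** Packs each value as an 8-byte double; blanks and invalid values become NaN. */
    static byte[] doublesToBytes(List<Object> vector) {
        ByteBuffer buffer = ByteBuffer.allocate(vector.size() * Double.BYTES);
        for (Object raw : vector) {
            buffer.putDouble(parseOrNaN(raw));
        }
        return buffer.array();
    }

    /** Treats null, blank and unparseable values (e.g. "Error") as missing data. */
    static double parseOrNaN(Object raw) {
        if (raw == null) {
            return Double.NaN;
        }
        String text = raw.toString().trim();
        if (text.isEmpty()) {
            return Double.NaN;
        }
        try {
            return Double.parseDouble(text);
        } catch (NumberFormatException e) {
            return Double.NaN;
        }
    }
}
```

        Catching NumberFormatException in one place means a stray "Error" cell degrades to a missing value instead of aborting the whole vector.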
      • setForceConvertElements

        public void setForceConvertElements(boolean forceConvertElements)
        Specified by:
        setForceConvertElements in interface GeoConverter
        Parameters:
        forceConvertElements - Set the behaviour when a platform that normally would not be loaded in detail is encountered, such as an Exon array.
      • setElementLimitForStrictness

        public void setElementLimitForStrictness(int tooManyElements)
        Specified by:
        setElementLimitForStrictness in interface GeoConverter
        Parameters:
        tooManyElements - this is here for tests only. The default value should be okay otherwise.
      • makeTitle

        protected String makeTitle(String title,
                                   String appendix)
        Forms the title (which becomes the experiment name) and ensures that it is of valid length.
        Parameters:
        appendix - can be null; e.g. species or platform name added when we are splitting up a record.
        Returns:
        title
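        One way a title could be formed and kept to a valid length is sketched below. Everything specific here is an assumption made for illustration: the 255-character limit, the " - " separator and the "..." truncation marker are hypothetical and are not taken from the Gemma source.

```java
class MakeTitleSketch {

    // Assumed maximum length; the real limit is not stated in this documentation.
    static final int MAX_LENGTH = 255;

    static String makeTitle(String title, String appendix) {
        // The appendix may be null, e.g. when the record is not being split.
        boolean hasAppendix = appendix != null && !appendix.trim().isEmpty();
        String full = hasAppendix ? title + " - " + appendix : title;

        if (full.length() <= MAX_LENGTH) {
            return full;
        }
        // Truncate and mark the cut so the result is exactly MAX_LENGTH characters.
        return full.substring(0, MAX_LENGTH - 3) + "...";
    }
}
```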