Class GeoBrowser
- java.lang.Object
  - ubic.gemma.core.loader.expression.geo.service.GeoBrowser
public class GeoBrowser extends Object
Gets records from GEO and compares them to Gemma. This is used to identify data sets that are new in GEO and not in Gemma. See Programmatic access to GEO for some information.
- Author:
- pavlidis
-
-
Constructor Summary
Constructors
Constructor | Description
GeoBrowser() |
-
Method Summary
All Methods | Instance Methods | Concrete Methods
Modifier and Type | Method | Description
Collection<GeoRecord> | getAllGEOPlatforms() | A bit hacky, can be improved.
Collection<GeoRecord> | getGeoRecords(Collection<String> accessions) | Retrieves records for experiments.
List<GeoRecord> | getGeoRecordsBySearchTerm(String searchTerms, int start, int pageSize, boolean detailed, Collection<String> allowedTaxa, Collection<String> limitPlatforms) | Provides more details than getRecentGeoRecords.
List<GeoRecord> | getRecentGeoRecords(int startPage, int pageSize) | Retrieves and parses a tab-delimited file from GEO.
-
-
-
Method Detail
-
getGeoRecords
public Collection<GeoRecord> getGeoRecords(Collection<String> accessions) throws IOException
Retrieves records for experiments.
- Parameters:
accessions
- accessions of the experiments to retrieve
- Returns:
- collection of records
- Throws:
IOException
-
getAllGEOPlatforms
public Collection<GeoRecord> getAllGEOPlatforms() throws IOException
A bit hacky, can be improved. Limited to human, mouse, and rat. It is not guaranteed to get everything, though as of 7/2021 this is sufficient (~8000 platforms).
- Returns:
- all relevant platforms, up to the single-query limit of NCBI
- Throws:
IOException
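As a rough illustration of what a taxon-restricted platform query can look like, the sketch below builds an Entrez term that selects platform entries for a list of organisms. The class and method names are hypothetical, and the exact term GeoBrowser sends is not documented here; the sketch only relies on the `[ETYP]` (entry type) and `[ORGN]` (organism) field tags of the GEO DataSets database.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical helper, not part of GeoBrowser.
class PlatformQuerySketch {

    /** Builds an Entrez term matching GEO platforms (GPL) for any of the given organisms. */
    static String platformTerm(List<String> organisms) {
        // Quote each organism name and tag it as an organism field, then OR them together.
        String taxa = organisms.stream()
                .map(o -> "\"" + o + "\"[ORGN]")
                .collect(Collectors.joining(" OR "));
        return "gpl[ETYP] AND (" + taxa + ")";
    }

    public static void main(String[] args) {
        // The three taxa named in the method description above.
        System.out.println(platformTerm(
                Arrays.asList("Homo sapiens", "Mus musculus", "Rattus norvegicus")));
    }
}
```

Because a single query is capped by NCBI, a term like this returns at most the per-query maximum, which matches the caveat in the description.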
-
getGeoRecordsBySearchTerm
public List<GeoRecord> getGeoRecordsBySearchTerm(String searchTerms, int start, int pageSize, boolean detailed, Collection<String> allowedTaxa, Collection<String> limitPlatforms) throws IOException
Provides more details than getRecentGeoRecords. Performs an E-utilities query of the GEO database with the given searchTerms (search terms can be omitted). Returns at most pageSize records. Does some screening of results for expression studies and (optionally) taxa. This is used for identifying data sets for loading.
- Parameters:
start
- offset of the first record to retrieve, used to fetch results in batches
pageSize
- maximum number of records to retrieve
searchTerms
- search terms in NCBI Entrez query format
detailed
- if true, additional information is fetched (slower)
allowedTaxa
- if not null, data sets not containing any of these taxa will be skipped
limitPlatforms
- if not null or empty, platforms to limit the query to (combining with searchTerms is not supported yet)
- Returns:
- list of GeoRecords
- Throws:
IOException
- if there is a problem obtaining or manipulating the file (some exceptions are not thrown and just logged)
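To make the roles of searchTerms, start, and pageSize concrete, here is a minimal sketch of how such a paged E-utilities search URL can be assembled. It assumes the public `esearch.fcgi` endpoint with the `gds` (GEO DataSets) database and maps start to `retstart` and pageSize to `retmax`; the class and method names are hypothetical, and GeoBrowser's actual request may differ.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of GeoBrowser.
class EsearchUrlSketch {

    // Public NCBI E-utilities search endpoint.
    static final String ESEARCH =
            "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi";

    /** Builds an esearch URL for one batch of results from the GEO DataSets database. */
    static String buildEsearchUrl(String searchTerms, int start, int pageSize) {
        if (start < 0 || pageSize <= 0) {
            throw new IllegalArgumentException("start must be >= 0 and pageSize > 0");
        }
        // Entrez terms contain spaces and brackets, so they must be URL-encoded.
        String term = URLEncoder.encode(searchTerms, StandardCharsets.UTF_8);
        return ESEARCH + "?db=gds&term=" + term
                + "&retstart=" + start   // offset of the first record (assumed mapping of start)
                + "&retmax=" + pageSize; // batch size (assumed mapping of pageSize)
    }

    public static void main(String[] args) {
        System.out.println(buildEsearchUrl("gse[ETYP] AND Mus musculus[ORGN]", 0, 25));
    }
}
```

Successive batches are requested by advancing start by pageSize while keeping the same term.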
-
getRecentGeoRecords
public List<GeoRecord> getRecentGeoRecords(int startPage, int pageSize) throws IOException
Retrieves and parses a tab-delimited file from GEO. The file contains pageSize GEO records starting from startPage. The retrieved information is pretty minimal.
- Parameters:
startPage
- start page
pageSize
- page size
- Returns:
- list of GeoRecords
- Throws:
IOException
- if there is a problem while manipulating the file
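A caller that wants more than one page has to loop over startPage. The sketch below shows one way to do that, under stated assumptions: `PageSource` is a hypothetical stand-in for `getRecentGeoRecords(startPage, pageSize)`, the index of the first page is passed in because this page does not say whether numbering starts at 0 or 1, and a page shorter than pageSize is taken to mean the end of the data.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, not part of GeoBrowser.
class RecentRecordsPagingSketch {

    /** Stand-in for a paged call such as getRecentGeoRecords(startPage, pageSize). */
    interface PageSource<T> {
        List<T> fetch(int startPage, int pageSize) throws IOException;
    }

    /** Collects pages until a short page is returned or maxPages pages have been read. */
    static <T> List<T> fetchAll(PageSource<T> source, int firstPage, int pageSize, int maxPages)
            throws IOException {
        List<T> all = new ArrayList<>();
        for (int i = 0; i < maxPages; i++) {
            List<T> batch = source.fetch(firstPage + i, pageSize);
            all.addAll(batch);
            if (batch.size() < pageSize) {
                break; // assumed end-of-data signal
            }
        }
        return all;
    }

    /** Runs the loop against an in-memory fake with seven items and pages of three. */
    static List<Integer> demo() {
        List<Integer> data = Arrays.asList(1, 2, 3, 4, 5, 6, 7);
        PageSource<Integer> fake = (page, size) -> {
            int from = Math.min(page * size, data.size());
            int to = Math.min(from + size, data.size());
            return data.subList(from, to);
        };
        try {
            return fetchAll(fake, 0, 3, 10);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

With a real GeoBrowser instance, the source would be a method reference such as `browser::getRecentGeoRecords`, and the maxPages cap keeps the loop from issuing an unbounded number of requests to GEO.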
-
-