Class MexDetector
- All Implemented Interfaces:
ArchiveBasedSingleCellDetector, SeriesAwareSingleCellDetector, SingleCellDetector
This detector is not specific to 10x MEX data, since MEX itself is a widely used format in single-cell studies. Logic
specific to 10x is stored in TenXCellRangerUtils.
Older 10x MEX datasets use genes.tsv.gz instead of features.tsv.gz. Those files are copied into the download
directory using the new naming scheme.
MEX data is only supported at the sample level. Its presence can be detected at the series level, but it cannot be downloaded at that level.
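The copy of legacy genes.tsv.gz files under the new naming scheme can be sketched with plain java.nio.file calls. This is a minimal illustration and not the detector's actual implementation: the class LegacyMexNaming, its method names, and the two suffix constants are hypothetical, chosen only to match the file names mentioned above.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

final class LegacyMexNaming {
    // Assumed suffixes, taken from the file names mentioned in the class description.
    static final String GENES_SUFFIX = "genes.tsv.gz";
    static final String FEATURES_SUFFIX = "features.tsv.gz";

    /** Map a legacy file name to the new naming scheme; other names are returned unchanged. */
    static String toNewName(String fileName) {
        if (fileName.endsWith(GENES_SUFFIX)) {
            String prefix = fileName.substring(0, fileName.length() - GENES_SUFFIX.length());
            return prefix + FEATURES_SUFFIX;
        }
        return fileName;
    }

    /** Copy a file into the download directory under its new-scheme name. */
    static Path copyWithNewName(Path source, Path downloadDir) throws IOException {
        Files.createDirectories(downloadDir);
        Path target = downloadDir.resolve(toNewName(source.getFileName().toString()));
        return Files.copy(source, target, StandardCopyOption.REPLACE_EXISTING);
    }
}
```

With this sketch, a loader that only understands features.tsv.gz never needs to know that the original dataset used the older name.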
- Author:
- poirigui
-
Field Summary
Fields
- static final String DEFAULT_BARCODES_FILE_SUFFIX
- static final String DEFAULT_BARCODE_METADATA_FILE_SUFFIX
- static final String DEFAULT_FEATURES_FILE_SUFFIX
- static final String DEFAULT_GENES_FILE_SUFFIX
- static final String DEFAULT_MATRIX_FILE_SUFFIX
Fields inherited from class ubic.gemma.core.loader.expression.geo.singleCell.AbstractSingleCellDetector
log
Fields inherited from interface ubic.gemma.core.loader.expression.geo.singleCell.ArchiveBasedSingleCellDetector
DEFAULT_MAX_ENTRY_SIZE_IN_ARCHIVE_TO_SKIP, DEFAULT_MAX_NUMBER_OF_ENTRIES_TO_SKIP
-
Constructor Summary
Constructors
- MexDetector()
-
Method Summary
Methods
- downloadSingleCellData(GeoSample sample): Retrieve single-cell data for the given GEO sample to disk.
- downloadSingleCellData(GeoSeries series): Download single-cell data for the given GEO series.
- downloadSingleCellData(GeoSeries series, GeoSample sample): Download a GEO sample within the context of its series.
- getAdditionalSupplementaryFiles: Obtain a list of all additional supplementary files.
- getAdditionalSupplementaryFiles: Obtain a list of all additional supplementary files.
- getAdditionalSupplementaryFiles(GeoSeries series, GeoSample sample)
- getSingleCellDataLoader(GeoSeries series, SingleCellDataLoaderConfig config): Obtain a single-cell data loader for the given GEO series based on previously downloaded data.
- boolean hasSingleCellData(GeoSample sample): Indicate if the given GEO sample has single-cell data.
- boolean hasSingleCellData(GeoSample sample, boolean allowArchiveLookup): Check if a GEO sample contains single-cell data.
- boolean hasSingleCellData(GeoSeries series): Indicate if the given GEO series has single-cell data.
- boolean hasSingleCellData(GeoSeries series, GeoSample sample): Check if a sample contains single-cell data in the context of its series.
- void setBarcodeMetadataFileSuffix(String barcodeMetadataFileSuffix)
- void setBarcodesFileSuffix(String barcodesFileSuffix)
- void setCellRangerPrefix(Path cellRangerPrefix)
- void setFeaturesFileSuffix(String featuresFileSuffix)
- void setGenesFileSuffix(String genesFileSuffix)
- void setMatrixFileSuffix(String matrixFileSuffix)
- void setMaxEntrySizeInArchiveToSkip(long maxEntrySizeInArchiveToSkip): Set the maximum size of an archive entry to skip the supplementary file altogether.
- void setMaxNumberOfEntriesToSkip(long maxNumberOfEntriesToSkip): Set the maximum number of archive entries to skip in order to ignore the supplementary file altogether.
Methods inherited from class ubic.gemma.core.loader.expression.geo.singleCell.AbstractSingleCellDetector
existsAndHasExpectedSize, getDownloadDirectory, getSizeInBytes, openSupplementaryFileAsStream, retry, setDownloadDirectory, setFTPClientFactory, setRetryPolicy
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
Methods inherited from interface ubic.gemma.core.loader.expression.geo.singleCell.SingleCellDetector
setDownloadDirectory, setRetryPolicy
-
Field Details
-
DEFAULT_BARCODES_FILE_SUFFIX
-
DEFAULT_BARCODE_METADATA_FILE_SUFFIX
-
DEFAULT_FEATURES_FILE_SUFFIX
-
DEFAULT_GENES_FILE_SUFFIX
-
DEFAULT_MATRIX_FILE_SUFFIX
-
-
Constructor Details
-
MexDetector
public MexDetector()
-
-
Method Details
-
setMaxEntrySizeInArchiveToSkip
public void setMaxEntrySizeInArchiveToSkip(long maxEntrySizeInArchiveToSkip)
Description copied from interface: ArchiveBasedSingleCellDetector
Set the maximum size of an archive entry to skip the supplementary file altogether.
Use -1 to indicate no limit.
Note that if a relevant file was previously found in the archive, it will not be skipped.
- Specified by:
setMaxEntrySizeInArchiveToSkip in interface ArchiveBasedSingleCellDetector
-
setMaxNumberOfEntriesToSkip
public void setMaxNumberOfEntriesToSkip(long maxNumberOfEntriesToSkip)
Description copied from interface: ArchiveBasedSingleCellDetector
Set the maximum number of archive entries to skip in order to ignore the supplementary file altogether.
Use -1 to indicate no limit.
Note that if a relevant file was previously found in the archive, it will not be ignored.
- Specified by:
setMaxNumberOfEntriesToSkip in interface ArchiveBasedSingleCellDetector
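The two limits above can be read as a single decision made while scanning an archive. The sketch below shows one possible reading and is not the detector's actual code: it assumes that an irrelevant entry larger than the size limit, or more irrelevant entries than the count limit, causes the whole supplementary file to be ignored, unless a relevant file was already seen. The class ArchiveSkipSketch, its Entry type, and the isRelevant predicate are hypothetical.

```java
import java.util.List;
import java.util.function.Predicate;

final class ArchiveSkipSketch {
    /** Minimal stand-in for an archive entry (hypothetical, not a Gemma type). */
    static final class Entry {
        final String name;
        final long size;

        Entry(String name, long size) {
            this.name = name;
            this.size = size;
        }
    }

    /**
     * Decide whether an archive should be ignored altogether.
     * A limit of -1 means "no limit", as in the setters described above.
     */
    static boolean shouldIgnoreArchive(List<Entry> entries, Predicate<String> isRelevant,
            long maxEntrySize, long maxEntriesToSkip) {
        long skipped = 0;
        boolean foundRelevant = false;
        for (Entry entry : entries) {
            if (isRelevant.test(entry.name)) {
                foundRelevant = true;
                continue;
            }
            if (foundRelevant) {
                // Assumption: once a relevant file was found, the archive is never abandoned.
                continue;
            }
            if (maxEntrySize != -1 && entry.size > maxEntrySize) {
                return true; // an oversized irrelevant entry came first
            }
            skipped++;
            if (maxEntriesToSkip != -1 && skipped > maxEntriesToSkip) {
                return true; // too many irrelevant entries before anything useful
            }
        }
        return false;
    }
}
```

The point of such limits is to avoid streaming through a large archive that is unlikely to contain MEX files, while still finishing archives that have already proven useful.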
-
hasSingleCellData
Indicate if the given GEO series has single-cell data.
Downloading MEX data is not supported at the series level, so while this method can return true if barcodes/genes/matrices are present in the series supplementary files, downloadSingleCellData(GeoSeries) will subsequently fail.
- Specified by:
hasSingleCellData in interface SingleCellDetector
-
hasSingleCellData
Check if a sample contains single-cell data in the context of its series.
- Specified by:
hasSingleCellData in interface SeriesAwareSingleCellDetector
-
hasSingleCellData
Description copied from interface: SingleCellDetector
Indicate if the given GEO sample has single-cell data.
- Specified by:
hasSingleCellData in interface SingleCellDetector
-
hasSingleCellData
Check if a GEO sample contains single-cell data.
- Parameters:
allowArchiveLookup - allow looking into archives for MEX files
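The effect of allowArchiveLookup can be illustrated with a suffix-based check. This is a rough sketch under stated assumptions, not the detector's real logic: the suffixes, the requirement that barcodes, features (or legacy genes) and matrix files all be present, the .tar/.tar.gz archive test, and the listEntries callback are all hypothetical. The actual defaults are held in the DEFAULT_*_FILE_SUFFIX constants, whose values are not shown on this page.

```java
import java.util.Collection;
import java.util.function.Function;

final class MexSuffixDetection {
    /** True if the file names include barcodes, features (or legacy genes) and matrix files. */
    static boolean hasMexFiles(Collection<String> fileNames) {
        boolean barcodes = false;
        boolean features = false;
        boolean matrix = false;
        for (String name : fileNames) {
            // Assumed suffixes; the detector's defaults may differ.
            barcodes |= name.endsWith("barcodes.tsv.gz");
            features |= name.endsWith("features.tsv.gz") || name.endsWith("genes.tsv.gz");
            matrix |= name.endsWith("matrix.mtx.gz");
        }
        return barcodes && features && matrix;
    }

    /**
     * Check loose supplementary files first, then look inside archives only if permitted.
     * listEntries is a hypothetical callback returning the entry names of an archive.
     */
    static boolean hasMexData(Collection<String> supplementaryFiles, boolean allowArchiveLookup,
            Function<String, Collection<String>> listEntries) {
        if (hasMexFiles(supplementaryFiles)) {
            return true;
        }
        if (!allowArchiveLookup) {
            return false;
        }
        for (String file : supplementaryFiles) {
            boolean isArchive = file.endsWith(".tar") || file.endsWith(".tar.gz");
            if (isArchive && hasMexFiles(listEntries.apply(file))) {
                return true;
            }
        }
        return false;
    }
}
```

Disabling the archive lookup keeps detection cheap, since no archive has to be opened, at the cost of missing MEX files that are only distributed inside archives.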
-
downloadSingleCellData
Description copied from interface: SingleCellDetector
Download single-cell data for the given GEO series.
- Specified by:
downloadSingleCellData in interface SingleCellDetector
- Returns:
- a directory or file containing the downloaded series data
- Throws:
NoSingleCellDataFoundException - if there is no single-cell data for the given series
-
downloadSingleCellData
public Path downloadSingleCellData(GeoSeries series, GeoSample sample) throws NoSingleCellDataFoundException, IOException
Download a GEO sample within the context of its series.
This will first download the sample with downloadSingleCellData(String, Collection) using the merged supplementary files from the series and the sample, then create a series/sample folder structure, and finally hard-link all the sample files in there. This ensures that if two series mention the same sample, they can reuse the same files.
- Specified by:
downloadSingleCellData in interface SeriesAwareSingleCellDetector
- Throws:
NoSingleCellDataFoundException
IOException
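The hard-linking step described above can be sketched with java.nio.file.Files.createLink. This is only an illustration of the idea, assuming a flat sample directory on a file system that supports hard links; the class SeriesSampleLinker, its method name, and its parameters are hypothetical and do not reflect the detector's internals.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;

final class SeriesSampleLinker {
    /**
     * Hard-link every regular file of an already downloaded sample directory
     * into downloadDir/series/sample and return that directory.
     */
    static Path linkSampleIntoSeries(Path downloadDir, String series, String sample, Path sampleDir)
            throws IOException {
        Path target = downloadDir.resolve(series).resolve(sample);
        Files.createDirectories(target);
        try (DirectoryStream<Path> files = Files.newDirectoryStream(sampleDir)) {
            for (Path file : files) {
                if (!Files.isRegularFile(file)) {
                    continue; // assumption: the sample directory is flat
                }
                Path link = target.resolve(file.getFileName());
                if (!Files.exists(link)) {
                    // Both names point at the same data on disk; nothing is copied.
                    Files.createLink(link, file);
                }
            }
        }
        return target;
    }
}
```

Because a hard link is just another name for the same file, two series that share a sample each get their own series/sample folder without duplicating the downloaded data.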
-
downloadSingleCellData
public Path downloadSingleCellData(GeoSample sample) throws NoSingleCellDataFoundException, IOException
Retrieve single-cell data for the given GEO sample to disk.
- Specified by:
downloadSingleCellData in interface SingleCellDetector
- Returns:
- a directory or file containing the downloaded sample data
- Throws:
NoSingleCellDataFoundException - if no single-cell data is found in the given GEO sample
IOException
-
getAdditionalSupplementaryFiles
Description copied from interface: SingleCellDetector
Obtain a list of all additional supplementary files.
- Specified by:
getAdditionalSupplementaryFiles in interface SingleCellDetector
-
getAdditionalSupplementaryFiles
Description copied from interface: SingleCellDetector
Obtain a list of all additional supplementary files.
- Specified by:
getAdditionalSupplementaryFiles in interface SingleCellDetector
-
getAdditionalSupplementaryFiles
- Specified by:
getAdditionalSupplementaryFiles in interface SeriesAwareSingleCellDetector
-
getSingleCellDataLoader
public SingleCellDataLoader getSingleCellDataLoader(GeoSeries series, SingleCellDataLoaderConfig config) throws NoSingleCellDataFoundException
Description copied from interface: SingleCellDetector
Obtain a single-cell data loader for the given GEO series based on previously downloaded data.
- Specified by:
getSingleCellDataLoader in interface SingleCellDetector
- Throws:
NoSingleCellDataFoundException - if there is no single-cell data for the given series
-
setBarcodesFileSuffix
-
setBarcodeMetadataFileSuffix
-
setFeaturesFileSuffix
-
setGenesFileSuffix
-
setMatrixFileSuffix
-
setCellRangerPrefix
-