Class AbstractMultiAssayExpressionDataMatrix<T>
- All Implemented Interfaces:
BulkExpressionDataMatrix<T>, ExpressionDataMatrix<T>, MultiAssayBulkExpressionDataMatrix<T>
- Direct Known Subclasses:
EmptyExpressionMatrix, ExpressionDataBooleanMatrix, ExpressionDataDoubleMatrix, ExpressionDataIntegerMatrix, ExpressionDataStringMatrix
Implementation note: The underlying DoubleMatrixNamed is indexed by Integers, which are in turn mapped to BioAssays etc. held here. Thus the 'names' of the underlying matrix are just numbers.
- Author:
- pavlidis
-
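The implementation note above describes a matrix whose row and column "names" are plain integers, with separate maps translating between those integers and the domain objects. A minimal sketch of that bookkeeping follows. The class `IntegerIndexedLabels` and its methods are hypothetical and are not part of Gemma; labels are modelled as generic keys rather than `BioAssay` or `BioMaterial` instances.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical stand-in for the integer-keyed bookkeeping; not a Gemma class. */
final class IntegerIndexedLabels<K> {
    private final Map<K, Integer> indexByLabel = new HashMap<>();
    private final Map<Integer, K> labelByIndex = new HashMap<>();

    /** Assigns the next free integer to each distinct label, in iteration order. */
    IntegerIndexedLabels(List<K> labels) {
        for (K label : labels) {
            if (!indexByLabel.containsKey(label)) {
                int index = indexByLabel.size();
                indexByLabel.put(label, index);
                labelByIndex.put(index, label);
            }
        }
    }

    /** Returns the integer "name" used by the underlying matrix, or -1 if unknown. */
    int indexOf(K label) {
        Integer index = indexByLabel.get(label);
        return index == null ? -1 : index;
    }

    /** Returns the object held for an integer index, or null if there is none. */
    K labelAt(int index) {
        return labelByIndex.get(index);
    }
}
```

Keeping both directions of the mapping makes lookups by object and by index equally cheap, which is presumably why the class holds paired maps such as `columnBioMaterialMap` and `columnBioMaterialMapByInteger`.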
Field Summary
Fields
Modifier and Type, Field, Description
protected Map<CompositeSequence, BioAssayDimension>
protected Map<Integer, Collection<BioAssay>>
protected Map<BioMaterial, Integer>
protected Map<Integer, BioMaterial>
protected ExpressionExperiment
protected Collection<QuantitationType>
protected Map<Integer, CompositeSequence>
protected Map<CompositeSequence, Integer>
-
Constructor Summary
Constructors
protected AbstractMultiAssayExpressionDataMatrix()
-
Method Summary
Modifier and Type, Method, Description
protected void addToRowMaps(int row, CompositeSequence designElement)
    Each row is a unique DesignElement.
int columns
    Number of columns that use the given design element.
protected abstract String format(int row, int column)
    Format the value at the provided indices of the matrix.
protected String formatRepresentation
    Produce a string representation of the type of values held in the matrix.
get(CompositeSequence designElement, BioAssay bioAssay)
    Access a single value of the matrix.
getBestBioAssayDimension
    Obtain the largest BioAssayDimension that covers all the biomaterials in this matrix.
getBioAssayDimension
    Obtain the dimension for the columns of this matrix.
getBioAssayDimension(CompositeSequence designElement)
    Produce a BioAssayDimension representing the matrix columns for a specific row.
getBioAssayDimensions
    Obtain all the BioAssayDimensions that are used in this matrix.
getBioAssayForColumn(int index)
    Obtain an assay corresponding to a given column.
getBioAssaysForColumn(int index)
getBioMaterialForColumn(int index)
    Obtain a biomaterial corresponding to a column.
T[] getColumn
    Access a single column of the matrix.
int getColumnIndex(BioAssay bioAssay)
    Obtain the column index of a given assay.
int getColumnIndex(BioMaterial bioMaterial)
protected String getColumnLabel(int j)
    Obtain a label suitable for describing a column of the matrix.
getDesignElementForRow(int index)
    Return a design element for a given index.
getDesignElements
    Obtain all the design elements in this data matrix.
getExpressionExperiment
    Return the expression experiment this matrix is holding data for, if known.
getQuantitationType
    Obtain the quantitation type for this matrix.
getQuantitationTypes
    Return the quantitation types for this matrix.
T[] getRow(CompositeSequence designElement)
    Return a row that 'came from' the given design element.
getRowElement(int index)
getRowElements
int getRowIndex(CompositeSequence designElement)
int[] getRowIndices(CompositeSequence designElement)
    Obtain all the rows that correspond to the given design element, or null if the design element is not found.
protected String getRowLabel(int i)
    Obtain a label suitable for describing a row of the matrix.
protected void selectVectors(Collection<? extends BulkExpressionDataVector> vectors)
    Selects all the vectors passed in (uses them to initialize the data).
protected Collection<BulkExpressionDataVector> selectVectors(Collection<? extends BulkExpressionDataVector> vectors, Collection<QuantitationType> qTypes)
protected Collection<BulkExpressionDataVector> selectVectors(Collection<? extends BulkExpressionDataVector> vectors, List<QuantitationType> qTypes)
protected <R, C, V> void setMatBioAssayValues(AbstractMatrix<R, C, V> mat, Integer rowIndex, V[] vals, Collection<BioAssay> bioAssays, Iterator<BioAssay> it)
protected int setUpColumnElements()
    Note: In the current versions of Gemma, we require that there can be only a single BioAssayDimension.
protected abstract void vectorsToMatrix(Collection<? extends BulkExpressionDataVector> vectors)
Methods inherited from class ubic.gemma.core.datastructure.matrix.AbstractExpressionDataMatrix
format, format, toString
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, wait, wait, wait
Methods inherited from interface ubic.gemma.core.datastructure.matrix.BulkExpressionDataMatrix
getRawMatrix, hasMissingValues
-
Field Details
-
expressionExperiment
-
quantitationTypes
-
bioAssayDimensions
-
columnAssayMap
-
columnBioMaterialMap
-
columnBioAssayMapByInteger
-
columnBioMaterialMapByInteger
-
rowElementMap
-
rowDesignElementMapByInteger
-
-
Constructor Details
-
AbstractMultiAssayExpressionDataMatrix
protected AbstractMultiAssayExpressionDataMatrix()
-
-
Method Details
-
columns
Description copied from interface: MultiAssayBulkExpressionDataMatrix
Number of columns that use the given design element. Useful if the matrix includes data from more than one array design.
- Specified by:
columns in interface MultiAssayBulkExpressionDataMatrix<T>
- Parameters:
el - the design element
- Returns:
- the number of columns that use the design element
-
getBioAssayDimensions
Description copied from interface: MultiAssayBulkExpressionDataMatrix
Obtain all the BioAssayDimensions that are used in this matrix.
- Specified by:
getBioAssayDimensions in interface MultiAssayBulkExpressionDataMatrix<T>
-
getBioAssayDimension
Description copied from interface: MultiAssayBulkExpressionDataMatrix
Obtain the dimension for the columns of this matrix.
- Specified by:
getBioAssayDimension in interface BulkExpressionDataMatrix<T>
- Specified by:
getBioAssayDimension in interface MultiAssayBulkExpressionDataMatrix<T>
-
getBestBioAssayDimension
Description copied from interface: MultiAssayBulkExpressionDataMatrix
Obtain the largest BioAssayDimension that covers all the biomaterials in this matrix.
- Specified by:
getBestBioAssayDimension in interface MultiAssayBulkExpressionDataMatrix<T>
- Returns:
- the best BioAssayDimension for this matrix, or Optional.empty() if no such dimension exists
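One way to read "largest dimension that covers all the biomaterials" is sketched below: keep only the candidate dimensions that contain every biomaterial of the matrix, then take the one with the most entries. This is an illustration of the stated contract only, not the Gemma implementation. The class `BestDimensionSketch` is hypothetical, and each dimension is modelled as a plain set of biomaterial names.

```java
import java.util.Comparator;
import java.util.List;
import java.util.Optional;
import java.util.Set;

/** Hypothetical illustration; dimensions are modelled as sets of biomaterial names. */
final class BestDimensionSketch {
    /**
     * Returns the largest candidate that contains every biomaterial of the matrix,
     * or Optional.empty() if no candidate covers them all.
     */
    static Optional<Set<String>> best(List<Set<String>> dimensions, Set<String> matrixBioMaterials) {
        return dimensions.stream()
                // keep only dimensions that cover all biomaterials of the matrix
                .filter(d -> d.containsAll(matrixBioMaterials))
                // among those, prefer the one with the most entries
                .max(Comparator.comparingInt(Set::size));
    }
}
```

Returning an `Optional` forces callers to handle the case where the dimensions overlap too little for any single one to describe all columns.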
-
getBioAssayDimension
Description copied from interface: MultiAssayBulkExpressionDataMatrix
Produce a BioAssayDimension representing the matrix columns for a specific row. The designElement argument is needed because a matrix can combine data from multiple array designs, each of which will generate its own BioAssayDimension. Note that if this represents a subsetted data set, the return value may be a lightweight 'fake'.
- Specified by:
getBioAssayDimension in interface MultiAssayBulkExpressionDataMatrix<T>
- Parameters:
designElement - the design element
- Returns:
- the BioAssayDimension for the given design element
-
getBioAssaysForColumn
- Specified by:
getBioAssaysForColumn in interface MultiAssayBulkExpressionDataMatrix<T>
- Parameters:
index - the column index
- Returns:
- bioassays that contribute data to the column. There can be multiple bioassays if more than one array was used in the study.
-
getBioAssayForColumn
Description copied from interface: BulkExpressionDataMatrix
Obtain an assay corresponding to a given column.
- Specified by:
getBioAssayForColumn in interface BulkExpressionDataMatrix<T>
-
getBioMaterialForColumn
Description copied from interface: BulkExpressionDataMatrix
Obtain a biomaterial corresponding to a column.
- Specified by:
getBioMaterialForColumn in interface BulkExpressionDataMatrix<T>
- Specified by:
getBioMaterialForColumn in interface MultiAssayBulkExpressionDataMatrix<T>
- Parameters:
index - the column index
- Returns:
- BioMaterial. Note that if this represents a subsetted data set, the BioMaterial may be a lightweight 'fake'.
-
getColumn
Description copied from interface: BulkExpressionDataMatrix
Access a single column of the matrix.
- Specified by:
getColumn in interface BulkExpressionDataMatrix<T>
- Returns:
- a vector for the given column, or null if the column is not present
-
getColumnIndex
- Specified by:
getColumnIndex in interface BulkExpressionDataMatrix<T>
- Specified by:
getColumnIndex in interface MultiAssayBulkExpressionDataMatrix<T>
- Parameters:
bioMaterial - the biomaterial
- Returns:
- the index of the column for the data for the bioMaterial, or -1 if missing
-
getRow
Description copied from interface: ExpressionDataMatrix
Return a row that 'came from' the given design element.
- Specified by:
getRow in interface ExpressionDataMatrix<T>
- Parameters:
designElement - the design element
- Returns:
- the corresponding row or null if the design element is not found in the matrix
-
getDesignElements
Description copied from interface: ExpressionDataMatrix
Obtain all the design elements in this data matrix.
- Specified by:
getDesignElements in interface ExpressionDataMatrix<T>
-
getDesignElementForRow
Description copied from interface: ExpressionDataMatrix
Return a design element for a given index.
- Specified by:
getDesignElementForRow in interface ExpressionDataMatrix<T>
-
get
Description copied from interface: BulkExpressionDataMatrix
Access a single value of the matrix. Note that because there can be multiple bioassays per column and multiple design elements per row, it is possible for this method to retrieve a value that does not come from the bioAssay and/or designElement arguments.
- Specified by:
get in interface BulkExpressionDataMatrix<T>
- Parameters:
designElement - the design element
bioAssay - the bioassay
- Returns:
- the value at the given design element and bioassay, or null if the value is missing
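The caveat about `get` follows from the lookup going through integer indices: both arguments are first resolved to a row and a column, and several bioassays may resolve to the same column. The sketch below illustrates that effect. The class `CellLookupSketch` is hypothetical, design elements and bioassays are modelled as strings, and returning null for an unknown key is an assumption rather than documented behaviour.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical illustration of keyed cell access over an integer-indexed matrix. */
final class CellLookupSketch {
    private final Double[][] values; // a null cell marks a missing value
    private final Map<String, Integer> rowByElement = new HashMap<>();
    private final Map<String, Integer> columnByAssay = new HashMap<>();

    CellLookupSketch(Double[][] values) {
        this.values = values;
    }

    void mapRow(String designElement, int row) {
        rowByElement.put(designElement, row);
    }

    void mapColumn(String bioAssay, int column) {
        columnByAssay.put(bioAssay, column);
    }

    /** Resolves both keys to indices, then reads the cell; null if unknown or missing. */
    Double get(String designElement, String bioAssay) {
        Integer row = rowByElement.get(designElement);
        Integer column = columnByAssay.get(bioAssay);
        if (row == null || column == null) {
            return null; // assumption: unknown keys behave like missing values
        }
        return values[row][column];
    }
}
```

If two assays of the same sample on different platforms are mapped to one column, asking for either returns the same cell, regardless of which assay actually produced the value.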
-
getExpressionExperiment
Description copied from interface: ExpressionDataMatrix
Return the expression experiment this matrix is holding data for, if known.
- Specified by:
getExpressionExperiment in interface ExpressionDataMatrix<T>
-
getQuantitationTypes
Description copied from interface: MultiAssayBulkExpressionDataMatrix
Return the quantitation types for this matrix. Often (usually) there will be just one.
- Specified by:
getQuantitationTypes in interface MultiAssayBulkExpressionDataMatrix<T>
-
getQuantitationType
Description copied from interface: MultiAssayBulkExpressionDataMatrix
Obtain the quantitation type for this matrix.
In the case of multi-assay matrices, more than one quantitation type may be present. When possible, those are merged with QuantitationTypeUtils.mergeQuantitationTypes(Collection).
- Specified by:
getQuantitationType in interface ExpressionDataMatrix<T>
- Specified by:
getQuantitationType in interface MultiAssayBulkExpressionDataMatrix<T>
-
getRowElements
- Specified by:
getRowElements in interface ExpressionDataMatrix<T>
- Returns:
- list of elements representing the row 'labels'.
-
getRowIndex
- Specified by:
getRowIndex in interface ExpressionDataMatrix<T>
- Returns:
- the index for the given design element, or -1 if not found
-
getRowIndices
Description copied from interface: ExpressionDataMatrix
Obtain all the rows that correspond to the given design element, or null if the design element is not found.
- Specified by:
getRowIndices in interface ExpressionDataMatrix<T>
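The row lookups use two different "not found" conventions: `getRowIndex` returns -1, while `getRowIndices` (like `getRow`) returns null. The sketch below mimics both so the difference is visible to callers. The class `RowIndexSketch` is hypothetical, design elements are modelled as strings, and returning the first row from `getRowIndex` when several rows exist is an assumption.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical illustration of the two "not found" conventions. */
final class RowIndexSketch {
    private final Map<String, List<Integer>> rowsByElement = new HashMap<>();

    void addRow(String designElement, int row) {
        rowsByElement.computeIfAbsent(designElement, k -> new ArrayList<>()).add(row);
    }

    /** First row recorded for the element (an assumption), or -1 if not found. */
    int getRowIndex(String designElement) {
        List<Integer> rows = rowsByElement.get(designElement);
        return rows == null ? -1 : rows.get(0);
    }

    /** All rows recorded for the element, or null if not found. */
    int[] getRowIndices(String designElement) {
        List<Integer> rows = rowsByElement.get(designElement);
        if (rows == null) {
            return null;
        }
        int[] out = new int[rows.size()];
        for (int i = 0; i < out.length; i++) {
            out[i] = rows.get(i);
        }
        return out;
    }
}
```

Callers therefore need a negative-index check for the scalar lookup and a null check for the array lookup; iterating over the result of `getRowIndices` without the null check would throw a `NullPointerException`.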
-
getRowElement
- Specified by:
getRowElement in interface ExpressionDataMatrix<T>
-
getColumnIndex
Obtain the column index of a given assay.
- Specified by:
getColumnIndex in interface BulkExpressionDataMatrix<T>
- Returns:
- the index, or -1 if not found
-
format
Format the value at the provided indices of the matrix.
- Specified by:
format in class AbstractExpressionDataMatrix<T>
-
vectorsToMatrix
-
setMatBioAssayValues
protected <R, C, V> void setMatBioAssayValues(AbstractMatrix<R, C, V> mat, Integer rowIndex, V[] vals, Collection<BioAssay> bioAssays, Iterator<BioAssay> it)
-
addToRowMaps
Each row is a unique DesignElement.
- Parameters:
row - The row number to be used by this design element.
-
setUpColumnElements
protected int setUpColumnElements()
Note: In the current versions of Gemma, we require that there can be only a single BioAssayDimension. Thus this code is overly complex. If an experiment has multiple BioAssayDimensions (due to multiple arrays), we merge the vectors (e.g., needed in the last case shown below). However, the issue of having multiple "BioMaterials" per "BioAssay" still exists.
Deals with the fact that the bioassay dimensions can vary in size, and don't even need to overlap in the biomaterials used. In the case where there is a single BioAssayDimension this reduces to simply associating each column with a bioassay (though we are forced to use an integer under the hood).
For example, in the following diagram "-" indicates a biomaterial, while "*" indicates a bioassay. Each row of "*" indicates samples run on a different microarray design (a different bio assay material). In the examples we assume there is just a single biomaterial dimension.
---------------
*****            -- only a few samples run on this platform
**********       -- ditto
           ****  -- these samples were not run on any of the other platforms
A simpler case:
---------------
***************
***********
*******
A more typical and easy case (one microarray design used):
----------------
****************
If every sample was run on two different array designs:
----------------
****************
****************
Every sample was run on a different array design:
-----------------------
******
      *********
               ********
Because there can be limited or no overlap between the bioassay dimensions, we cannot assume the dimensions of the matrix will be defined by the longest BioAssayDimension. Note that later in processing, this possible lack of overlap is fixed by sample matching or vector merging; this class has to deal with the general case.
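The point that the longest BioAssayDimension does not determine the number of columns can be illustrated with a small sketch: assign one column to every distinct biomaterial seen across all dimensions. This is a simplified illustration under stated assumptions, not the actual `setUpColumnElements` logic. The class `ColumnSetupSketch` is hypothetical, and each dimension is modelled as an ordered list of biomaterial names.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical illustration; dimensions are modelled as lists of biomaterial names. */
final class ColumnSetupSketch {
    /**
     * Assigns one integer column per distinct biomaterial, in order of first appearance
     * across all dimensions.
     */
    static Map<String, Integer> assignColumns(List<List<String>> dimensions) {
        Map<String, Integer> columnByBioMaterial = new LinkedHashMap<>();
        for (List<String> dimension : dimensions) {
            for (String bioMaterial : dimension) {
                // the size is read before insertion, so a new biomaterial gets the next free index;
                // a biomaterial seen earlier keeps the column it already has
                columnByBioMaterial.putIfAbsent(bioMaterial, columnByBioMaterial.size());
            }
        }
        return columnByBioMaterial;
    }
}
```

With the dimensions [bm1, bm2], [bm3] and [bm2, bm4], the sketch produces four columns even though the longest dimension has only two entries, which is the situation the last diagram describes.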
-
selectVectors
Selects all the vectors passed in (uses them to initialize the data).
-
selectVectors
protected Collection<BulkExpressionDataVector> selectVectors(Collection<? extends BulkExpressionDataVector> vectors, Collection<QuantitationType> qTypes)
-
selectVectors
protected Collection<BulkExpressionDataVector> selectVectors(Collection<? extends BulkExpressionDataVector> vectors, List<QuantitationType> qTypes)
-
formatRepresentation
Description copied from class: AbstractExpressionDataMatrix
Produce a string representation of the type of values held in the matrix.
- Overrides:
formatRepresentation in class AbstractExpressionDataMatrix<T>
-
getRowLabel
Description copied from class: AbstractExpressionDataMatrix
Obtain a label suitable for describing a row of the matrix.
- Specified by:
getRowLabel in class AbstractExpressionDataMatrix<T>
-
getColumnLabel
Description copied from class: AbstractExpressionDataMatrix
Obtain a label suitable for describing a column of the matrix.
- Specified by:
getColumnLabel in class AbstractExpressionDataMatrix<T>
-