Class AbstractMultiAssayExpressionDataMatrix<T>
- All Implemented Interfaces:
BulkExpressionDataMatrix<T>, ExpressionDataMatrix<T>, MultiAssayBulkExpressionDataMatrix<T>
- Direct Known Subclasses:
EmptyExpressionMatrix, ExpressionDataBooleanMatrix, ExpressionDataDoubleMatrix, ExpressionDataIntegerMatrix, ExpressionDataStringMatrix
Implementation note: The underlying DoubleMatrixNamed is indexed by Integers, which are in turn mapped to BioAssays etc. held here. Thus the 'names' of the underlying matrix are just numbers.
- Author:
- pavlidis
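The implementation note above describes a matrix addressed by integers, with separate maps translating domain objects to and from those integers. A minimal sketch of that pattern follows. It is a toy stand-in, not the Gemma code: the class name, the use of plain strings for assays, and the `double[][]` storage are all assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration of an integer-indexed matrix with lookup maps.
// None of these names are Gemma classes.
class IntegerIndexedMatrixSketch {
    // Raw values, addressed only by integer row and column positions.
    private final double[][] values;
    // Assay (a plain string here) to column position.
    private final Map<String, Integer> columnByAssay = new HashMap<>();
    // Column position back to assay.
    private final List<String> assayByColumn = new ArrayList<>();

    IntegerIndexedMatrixSketch(double[][] values, List<String> assays) {
        this.values = values;
        for (String assay : assays) {
            columnByAssay.put(assay, assayByColumn.size());
            assayByColumn.add(assay);
        }
    }

    // Returns -1 when the assay is unknown, mirroring the documented convention.
    int getColumnIndex(String assay) {
        return columnByAssay.getOrDefault(assay, -1);
    }

    String getAssayForColumn(int column) {
        return assayByColumn.get(column);
    }

    // Returns null when the assay is not part of the matrix.
    Double get(int row, String assay) {
        int column = getColumnIndex(assay);
        if (column < 0) {
            return null;
        }
        return values[row][column];
    }

    public static void main(String[] args) {
        IntegerIndexedMatrixSketch m = new IntegerIndexedMatrixSketch(
                new double[][] { { 1.0, 2.0 }, { 3.0, 4.0 } },
                Arrays.asList("assayA", "assayB"));
        System.out.println(m.getColumnIndex("assayB")); // 1
        System.out.println(m.get(1, "assayA"));         // 3.0
        System.out.println(m.get(0, "missing"));        // null
    }
}
```

Keeping the raw matrix free of domain objects means the same storage can back any value type, at the cost of maintaining the maps in both directions.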
-
Field Summary
Fields (Modifier and Type, Field, Description)
protected Map<CompositeSequence, BioAssayDimension>
protected Map<Integer, Collection<BioAssay>>
protected Map<BioMaterial, Integer>
protected Map<Integer, BioMaterial>
protected ExpressionExperiment
protected Collection<QuantitationType>
protected Map<Integer, CompositeSequence>
protected Map<CompositeSequence, Integer>
-
Constructor Summary
Constructors -
Method Summary
Modifier and Type, Method - Description
protected void addToRowMaps(int row, CompositeSequence designElement) - Each row is a unique DesignElement.
int columns - Number of columns that use the given design element.
protected abstract String format(int row, int column) - Format the value at the provided indices of the matrix.
protected String formatRepresentation - Produce a string representation of the type of values held in the matrix.
get(CompositeSequence designElement, BioAssay bioAssay) - Access a single value of the matrix.
getBestBioAssayDimension - Obtain the largest BioAssayDimension that covers all the biomaterials in this matrix.
getBioAssayDimension - Obtain the dimension for the columns of this matrix.
getBioAssayDimension(CompositeSequence designElement) - Produce a BioAssayDimension representing the matrix columns for a specific row.
getBioAssayDimensions - Obtain all the BioAssayDimensions that are used in this matrix.
getBioAssayForColumn(int index) - Obtain an assay corresponding to a given column.
getBioAssaysForColumn(int index)
getBioMaterialForColumn(int index) - Obtain a biomaterial corresponding to a column.
T[] getColumn - Access a single column of the matrix.
int getColumnIndex(BioAssay bioAssay) - Obtain the column index of a given assay.
int getColumnIndex(BioMaterial bioMaterial)
protected String getColumnLabel(int j) - Obtain a label suitable for describing a column of the matrix.
getDesignElementForRow(int index) - Return a design element for a given index.
getDesignElements - Obtain all the design elements in this data matrix.
getExpressionExperiment - Return the expression experiment this matrix is holding data for, if known.
getQuantitationType - Obtain the quantitation type for this matrix.
getQuantitationTypes - Return the quantitation types for this matrix.
T[] getRow(CompositeSequence designElement) - Return a row that 'came from' the given design element.
getRowElement(int index)
int getRowIndex(CompositeSequence designElement)
int[] getRowIndices(CompositeSequence designElement) - Obtain all the rows that correspond to the given design element, or null if the design element is not found.
protected String getRowLabel(int i) - Obtain a label suitable for describing a row of the matrix.
protected void selectVectors(Collection<? extends BulkExpressionDataVector> vectors) - Selects all the vectors passed in (uses them to initialize the data).
protected Collection<BulkExpressionDataVector> selectVectors(Collection<? extends BulkExpressionDataVector> vectors, Collection<QuantitationType> qTypes)
protected Collection<BulkExpressionDataVector> selectVectors(Collection<? extends BulkExpressionDataVector> vectors, List<QuantitationType> qTypes)
protected <R, C, V> void setMatBioAssayValues(AbstractMatrix<R, C, V> mat, Integer rowIndex, V[] vals, Collection<BioAssay> bioAssays, Iterator<BioAssay> it)
protected int setUpColumnElements() - Note: In the current versions of Gemma, we require that there can be only a single BioAssayDimension.
protected abstract void vectorsToMatrix(Collection<? extends BulkExpressionDataVector> vectors)
Methods inherited from class ubic.gemma.core.datastructure.matrix.AbstractExpressionDataMatrix
format, format, toString
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, wait, wait, wait
Methods inherited from interface ubic.gemma.core.datastructure.matrix.BulkExpressionDataMatrix
getRawMatrix, hasMissingValues
-
Field Details
-
expressionExperiment
-
quantitationTypes
-
bioAssayDimensions
-
columnAssayMap
-
columnBioMaterialMap
-
columnBioAssayMapByInteger
-
columnBioMaterialMapByInteger
-
rowElementMap
-
rowDesignElementMapByInteger
-
-
Constructor Details
-
AbstractMultiAssayExpressionDataMatrix
protected AbstractMultiAssayExpressionDataMatrix()
-
-
Method Details
-
columns
Description copied from interface: MultiAssayBulkExpressionDataMatrix
Number of columns that use the given design element. Useful if the matrix includes data from more than one array design.
- Specified by:
columns in interface MultiAssayBulkExpressionDataMatrix<T>
- Parameters:
el - the design element
- Returns:
- the number of columns that use the given design element
-
getBioAssayDimensions
Description copied from interface: MultiAssayBulkExpressionDataMatrix
Obtain all the BioAssayDimensions that are used in this matrix.
- Specified by:
getBioAssayDimensions in interface MultiAssayBulkExpressionDataMatrix<T>
-
getBioAssayDimension
Description copied from interface: MultiAssayBulkExpressionDataMatrix
Obtain the dimension for the columns of this matrix.
- Specified by:
getBioAssayDimension in interface BulkExpressionDataMatrix<T>
- Specified by:
getBioAssayDimension in interface MultiAssayBulkExpressionDataMatrix<T>
-
getBestBioAssayDimension
Description copied from interface: MultiAssayBulkExpressionDataMatrix
Obtain the largest BioAssayDimension that covers all the biomaterials in this matrix.
- Specified by:
getBestBioAssayDimension in interface MultiAssayBulkExpressionDataMatrix<T>
- Returns:
- the best BioAssayDimension for this matrix, or Optional.empty() if no such dimension exists
-
getBioAssayDimension
Description copied from interface: MultiAssayBulkExpressionDataMatrix
Produce a BioAssayDimension representing the matrix columns for a specific row. The designElement argument is needed because a matrix can combine data from multiple array designs, each of which will generate its own BioAssayDimension. Note that if this represents a subsetted data set, the return value may be a lightweight 'fake'.
- Specified by:
getBioAssayDimension in interface MultiAssayBulkExpressionDataMatrix<T>
- Parameters:
designElement - the design element identifying the row
- Returns:
- the BioAssayDimension for the given design element
-
getBioAssaysForColumn
- Specified by:
getBioAssaysForColumn in interface MultiAssayBulkExpressionDataMatrix<T>
- Parameters:
index - the column index
- Returns:
- bioassays that contribute data to the column. There can be multiple bioassays if more than one array was used in the study.
-
getBioAssayForColumn
Description copied from interface: BulkExpressionDataMatrix
Obtain an assay corresponding to a given column.
- Specified by:
getBioAssayForColumn in interface BulkExpressionDataMatrix<T>
-
getBioMaterialForColumn
Description copied from interface: BulkExpressionDataMatrix
Obtain a biomaterial corresponding to a column.
- Specified by:
getBioMaterialForColumn in interface BulkExpressionDataMatrix<T>
- Specified by:
getBioMaterialForColumn in interface MultiAssayBulkExpressionDataMatrix<T>
- Parameters:
index - the column index
- Returns:
- the BioMaterial. Note that if this represents a subsetted data set, the BioMaterial may be a lightweight 'fake'.
-
getColumn
Description copied from interface: BulkExpressionDataMatrix
Access a single column of the matrix.
- Specified by:
getColumn in interface BulkExpressionDataMatrix<T>
- Returns:
- a vector for the given column, or null if the column is not present
-
getColumnIndex
- Specified by:
getColumnIndex in interface BulkExpressionDataMatrix<T>
- Specified by:
getColumnIndex in interface MultiAssayBulkExpressionDataMatrix<T>
- Parameters:
bioMaterial - the biomaterial
- Returns:
- the index of the column holding the data for the biomaterial, or -1 if missing
-
getRow
Description copied from interface: ExpressionDataMatrix
Return a row that 'came from' the given design element.
- Specified by:
getRow in interface ExpressionDataMatrix<T>
- Parameters:
designElement - the design element
- Returns:
- the corresponding row, or null if the design element is not found in the matrix
-
getDesignElements
Description copied from interface: ExpressionDataMatrix
Obtain all the design elements in this data matrix.
- Specified by:
getDesignElements in interface ExpressionDataMatrix<T>
-
getDesignElementForRow
Description copied from interface: ExpressionDataMatrix
Return a design element for a given index.
- Specified by:
getDesignElementForRow in interface ExpressionDataMatrix<T>
-
get
Description copied from interface: BulkExpressionDataMatrix
Access a single value of the matrix. Note that because there can be multiple bioassays per column and multiple design elements per row, it is possible for this method to retrieve a value that does not come from the bioassay and/or design element arguments.
- Specified by:
get in interface BulkExpressionDataMatrix<T>
- Parameters:
designElement - the design element
bioAssay - the bioassay
- Returns:
- the value at the given design element and bioassay, or null if the value is missing
-
getExpressionExperiment
Description copied from interface: ExpressionDataMatrix
Return the expression experiment this matrix is holding data for, if known.
- Specified by:
getExpressionExperiment in interface ExpressionDataMatrix<T>
-
getQuantitationTypes
Description copied from interface: MultiAssayBulkExpressionDataMatrix
Return the quantitation types for this matrix. Often (usually) there will be just one.
- Specified by:
getQuantitationTypes in interface MultiAssayBulkExpressionDataMatrix<T>
-
getQuantitationType
Description copied from interface: MultiAssayBulkExpressionDataMatrix
Obtain the quantitation type for this matrix.
In the case of multi-assay matrices, more than one quantitation type may be present. When possible, those are merged with QuantitationTypeUtils.mergeQuantitationTypes(Collection).
- Specified by:
getQuantitationType in interface ExpressionDataMatrix<T>
- Specified by:
getQuantitationType in interface MultiAssayBulkExpressionDataMatrix<T>
-
getRowElements
- Specified by:
getRowElements in interface ExpressionDataMatrix<T>
- Returns:
- list of elements representing the row 'labels'
-
getRowIndex
- Specified by:
getRowIndex in interface ExpressionDataMatrix<T>
- Returns:
- the index for the given design element, or -1 if not found
-
getRowIndices
Description copied from interface: ExpressionDataMatrix
Obtain all the rows that correspond to the given design element, or null if the design element is not found.
- Specified by:
getRowIndices in interface ExpressionDataMatrix<T>
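The contract described for getRowIndices (all matching rows, or null when the design element is unknown) can be sketched with a map from element to row list. This is only an illustration under stated assumptions: the class name is hypothetical, design elements are represented as plain strings, and the actual Gemma implementation may store its row maps differently.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for a design-element-to-rows lookup; not Gemma code.
class RowIndexSketch {
    // Design element (a plain string here) to the rows that hold its data.
    private final Map<String, List<Integer>> rowsByElement = new HashMap<>();

    // Records that the given row holds data for the given design element.
    void addRow(int row, String designElement) {
        rowsByElement.computeIfAbsent(designElement, k -> new ArrayList<>()).add(row);
    }

    // Returns every row for the element, or null if the element was never added.
    int[] getRowIndices(String designElement) {
        List<Integer> rows = rowsByElement.get(designElement);
        if (rows == null) {
            return null;
        }
        int[] result = new int[rows.size()];
        for (int i = 0; i < result.length; i++) {
            result[i] = rows.get(i);
        }
        return result;
    }

    public static void main(String[] args) {
        RowIndexSketch sketch = new RowIndexSketch();
        sketch.addRow(0, "probeA");
        sketch.addRow(1, "probeB");
        sketch.addRow(2, "probeA");
        System.out.println(sketch.getRowIndices("probeA").length); // 2
        System.out.println(sketch.getRowIndices("unknown"));       // null
    }
}
```

Callers of a method with this contract need a null check before iterating over the result.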
-
getRowElement
- Specified by:
getRowElement in interface ExpressionDataMatrix<T>
-
getColumnIndex
Obtain the column index of a given assay.
- Specified by:
getColumnIndex in interface BulkExpressionDataMatrix<T>
- Returns:
- the index, or -1 if not found
-
format
Format the value at the provided indices of the matrix.
- Specified by:
format in class AbstractExpressionDataMatrix<T>
-
vectorsToMatrix
-
setMatBioAssayValues
protected <R, C, V> void setMatBioAssayValues(AbstractMatrix<R, C, V> mat, Integer rowIndex, V[] vals, Collection<BioAssay> bioAssays, Iterator<BioAssay> it)
-
addToRowMaps
Each row is a unique DesignElement.
- Parameters:
row - The row number to be used by this design element.
-
setUpColumnElements
protected int setUpColumnElements()
Note: In the current versions of Gemma, we require that there can be only a single BioAssayDimension. Thus this code is overly complex. If an experiment has multiple BioAssayDimensions (due to multiple arrays), we merge the vectors (e.g., needed in the last case shown below). However, the issue of having multiple "BioMaterials" per "BioAssay" still exists.
Deals with the fact that the bioassay dimensions can vary in size, and don't even need to overlap in the biomaterials used. In the case where there is a single BioAssayDimension this reduces to simply associating each column with a bioassay (though we are forced to use an integer under the hood).
For example, in the following diagram "-" indicates a biomaterial, while "*" indicates a bioassay. Each row of "*" indicates samples run on a different microarray design (a different bio assay material). In the examples we assume there is just a single biomaterial dimension.
    ---------------
    *****            -- only a few samples run on this platform
    **********       -- ditto
               ****  -- these samples were not run on any of the other platforms
A simpler case:
    ---------------
    ***************
    ***********
    *******
A more typical and easy case (one microarray design used):
    ----------------
    ****************
If every sample was run on two different array designs:
    ----------------
    ****************
    ****************
Every sample was run on a different array design:
    -----------------------
    ******
          *********
                   ********
Because there can be limited or no overlap between the bioassay dimensions, we cannot assume the dimensions of the matrix will be defined by the longest BioAssayDimension. Note that later in processing, this possible lack of overlap is fixed by sample matching or vector merging; this class has to deal with the general case.
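The general case described above, where dimensions may overlap partially or not at all, can be sketched by giving each distinct biomaterial its own column and collecting every bioassay that used it. The code below is a simplified illustration, not the Gemma implementation: the class name, the use of strings for bioassays and biomaterials, and the representation of a dimension as a list of {bioAssay, bioMaterial} pairs are all assumptions.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch of column assignment across several dimensions.
class ColumnSetupSketch {
    // Biomaterial to column index, in order of first appearance.
    final Map<String, Integer> columnByBioMaterial = new LinkedHashMap<>();
    // Column index to every bioassay that contributes data to that column.
    final Map<Integer, List<String>> bioAssaysByColumn = new LinkedHashMap<>();

    // Each dimension is a list of {bioAssay, bioMaterial} pairs.
    // Returns the total number of columns, which may exceed the size of
    // the longest dimension when the dimensions do not fully overlap.
    int setUp(List<List<String[]>> dimensions) {
        for (List<String[]> dimension : dimensions) {
            for (String[] pair : dimension) {
                String bioAssay = pair[0];
                String bioMaterial = pair[1];
                Integer column = columnByBioMaterial.get(bioMaterial);
                if (column == null) {
                    // First time this biomaterial is seen: open a new column.
                    column = columnByBioMaterial.size();
                    columnByBioMaterial.put(bioMaterial, column);
                    bioAssaysByColumn.put(column, new ArrayList<>());
                }
                bioAssaysByColumn.get(column).add(bioAssay);
            }
        }
        return columnByBioMaterial.size();
    }

    public static void main(String[] args) {
        // Two platforms sharing only biomaterial "bm2".
        List<String[]> platform1 = Arrays.asList(
                new String[] { "a1", "bm1" }, new String[] { "a2", "bm2" });
        List<String[]> platform2 = Arrays.asList(
                new String[] { "b1", "bm2" }, new String[] { "b2", "bm3" });
        ColumnSetupSketch sketch = new ColumnSetupSketch();
        System.out.println(sketch.setUp(Arrays.asList(platform1, platform2))); // 3
        System.out.println(sketch.bioAssaysByColumn); // {0=[a1], 1=[a2, b1], 2=[b2]}
    }
}
```

In this example each platform has two assays but the matrix ends up with three columns, which illustrates why the longest dimension alone cannot define the matrix width.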
-
selectVectors
Selects all the vectors passed in (uses them to initialize the data).
-
selectVectors
protected Collection<BulkExpressionDataVector> selectVectors(Collection<? extends BulkExpressionDataVector> vectors, Collection<QuantitationType> qTypes)
-
selectVectors
protected Collection<BulkExpressionDataVector> selectVectors(Collection<? extends BulkExpressionDataVector> vectors, List<QuantitationType> qTypes)
-
formatRepresentation
Description copied from class: AbstractExpressionDataMatrix
Produce a string representation of the type of values held in the matrix.
- Overrides:
formatRepresentation in class AbstractExpressionDataMatrix<T>
-
getRowLabel
Description copied from class: AbstractExpressionDataMatrix
Obtain a label suitable for describing a row of the matrix.
- Specified by:
getRowLabel in class AbstractExpressionDataMatrix<T>
-
getColumnLabel
Description copied from class: AbstractExpressionDataMatrix
Obtain a label suitable for describing a column of the matrix.
- Specified by:
getColumnLabel in class AbstractExpressionDataMatrix<T>
-