Class AbstractMultiAssayExpressionDataMatrix<T>
- java.lang.Object
-
- ubic.gemma.core.datastructure.matrix.AbstractExpressionDataMatrix<T>
-
- ubic.gemma.core.datastructure.matrix.AbstractMultiAssayExpressionDataMatrix<T>
-
- All Implemented Interfaces:
BulkExpressionDataMatrix<T>, ExpressionDataMatrix<T>, MultiAssayBulkExpressionDataMatrix<T>
- Direct Known Subclasses:
EmptyExpressionMatrix, ExpressionDataBooleanMatrix, ExpressionDataDoubleMatrix, ExpressionDataIntegerMatrix, ExpressionDataStringMatrix
public abstract class AbstractMultiAssayExpressionDataMatrix<T> extends AbstractExpressionDataMatrix<T> implements MultiAssayBulkExpressionDataMatrix<T>, ExpressionDataMatrix<T>
Base class for ExpressionDataMatrix implementations that can deal with multiple BioAssays per BioMaterial.
Implementation note: The underlying DoubleMatrixNamed is indexed by Integers, which are in turn mapped to BioAssays etc. held here. Thus the 'names' of the underlying matrix are just numbers.
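The integer indexing described in the implementation note can be pictured with a pair of maps, one per direction. The following is a minimal sketch only, not Gemma's code: the class ColumnIndexSketch and its methods are hypothetical, and plain strings stand in for BioMaterial objects.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical stand-in: strings play the role of BioMaterial objects.
class ColumnIndexSketch {
    // Forward and reverse lookups, analogous in spirit to columnBioMaterialMap
    // and columnBioMaterialMapByInteger.
    private final Map<String, Integer> indexByMaterial = new LinkedHashMap<>();
    private final Map<Integer, String> materialByIndex = new LinkedHashMap<>();

    /** Registers a biomaterial under the next free column index and returns that index. */
    int add(String material) {
        Integer existing = indexByMaterial.get(material);
        if (existing != null) {
            return existing; // already registered, keep the original column
        }
        int index = indexByMaterial.size();
        indexByMaterial.put(material, index);
        materialByIndex.put(index, material);
        return index;
    }

    /** Returns the column index, or -1 if the biomaterial is unknown. */
    int getColumnIndex(String material) {
        Integer index = indexByMaterial.get(material);
        return index == null ? -1 : index;
    }

    /** Returns the biomaterial for a column, or null if the column does not exist. */
    String getMaterialForColumn(int index) {
        return materialByIndex.get(index);
    }
}
```

Keeping both directions in sync lets the matrix itself store only integers while callers continue to address columns by domain object.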
- Author:
- pavlidis
-
-
Field Summary
Fields
protected Map<CompositeSequence,BioAssayDimension> bioAssayDimensions
protected Map<BioAssay,Integer> columnAssayMap
protected Map<Integer,Collection<BioAssay>> columnBioAssayMapByInteger
protected Map<BioMaterial,Integer> columnBioMaterialMap
protected Map<Integer,BioMaterial> columnBioMaterialMapByInteger
protected ExpressionExperiment expressionExperiment
protected Collection<QuantitationType> quantitationTypes
protected Map<Integer,CompositeSequence> rowDesignElementMapByInteger
protected Map<CompositeSequence,Integer> rowElementMap
-
Constructor Summary
Constructors
protected AbstractMultiAssayExpressionDataMatrix()
-
Method Summary
Methods
protected void addToRowMaps(int row, CompositeSequence designElement)
Each row is a unique DesignElement.
int columns(CompositeSequence el)
Number of columns that use the given design element.
protected abstract String format(int row, int column)
Format the value at the provided indices of the matrix.
protected String formatRepresentation()
Produce a string representation of the type of values held in the matrix.
T get(CompositeSequence designElement, BioAssay bioAssay)
Access a single value of the matrix.
Optional<BioAssayDimension> getBestBioAssayDimension()
Obtain the largest BioAssayDimension that covers all the biomaterials in this matrix.
BioAssayDimension getBioAssayDimension()
Obtain the dimension for the columns of this matrix.
BioAssayDimension getBioAssayDimension(CompositeSequence designElement)
Produce a BioAssayDimension representing the matrix columns for a specific row.
Collection<BioAssayDimension> getBioAssayDimensions()
Obtain all the BioAssayDimensions that are used in this matrix.
BioAssay getBioAssayForColumn(int index)
Obtain an assay corresponding to a given column.
Collection<BioAssay> getBioAssaysForColumn(int index)
BioMaterial getBioMaterialForColumn(int index)
Obtain a biomaterial corresponding to a column.
T[] getColumn(BioAssay bioAssay)
Access a single column of the matrix.
int getColumnIndex(BioAssay bioAssay)
Obtain the column index of a given assay.
int getColumnIndex(BioMaterial bioMaterial)
protected String getColumnLabel(int j)
Obtain a label suitable for describing a column of the matrix.
CompositeSequence getDesignElementForRow(int index)
Return a design element for a given index.
List<CompositeSequence> getDesignElements()
Obtain all the design elements in this data matrix.
ExpressionExperiment getExpressionExperiment()
Return the expression experiment this matrix is holding data for, if known.
QuantitationType getQuantitationType()
Obtain the quantitation type for this matrix.
Collection<QuantitationType> getQuantitationTypes()
Return the quantitation types for this matrix.
T[] getRow(CompositeSequence designElement)
Return a row that 'came from' the given design element.
ExpressionDataMatrixRowElement getRowElement(int index)
List<ExpressionDataMatrixRowElement> getRowElements()
int getRowIndex(CompositeSequence designElement)
int[] getRowIndices(CompositeSequence designElement)
Obtain all the rows that correspond to the given design element, or null if the design element is not found.
protected String getRowLabel(int i)
Obtain a label suitable for describing a row of the matrix.
protected void selectVectors(Collection<? extends BulkExpressionDataVector> vectors)
Selects all the vectors passed in (uses them to initialize the data).
protected Collection<BulkExpressionDataVector> selectVectors(Collection<? extends BulkExpressionDataVector> vectors, Collection<QuantitationType> qTypes)
protected Collection<BulkExpressionDataVector> selectVectors(Collection<? extends BulkExpressionDataVector> vectors, List<QuantitationType> qTypes)
protected <R,C,V> void setMatBioAssayValues(AbstractMatrix<R,C,V> mat, Integer rowIndex, V[] vals, Collection<BioAssay> bioAssays, Iterator<BioAssay> it)
protected int setUpColumnElements()
Note: In the current versions of Gemma, we require that there can be only a single BioAssayDimension.
protected abstract void vectorsToMatrix(Collection<? extends BulkExpressionDataVector> vectors)
-
Methods inherited from class ubic.gemma.core.datastructure.matrix.AbstractExpressionDataMatrix
format, format, toString
-
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, wait, wait, wait
-
Methods inherited from interface ubic.gemma.core.datastructure.matrix.BulkExpressionDataMatrix
getRawMatrix, hasMissingValues
-
-
-
-
Field Detail
-
expressionExperiment
@Nullable protected ExpressionExperiment expressionExperiment
-
quantitationTypes
protected Collection<QuantitationType> quantitationTypes
-
bioAssayDimensions
protected Map<CompositeSequence,BioAssayDimension> bioAssayDimensions
-
columnBioMaterialMap
protected Map<BioMaterial,Integer> columnBioMaterialMap
-
columnBioAssayMapByInteger
protected Map<Integer,Collection<BioAssay>> columnBioAssayMapByInteger
-
columnBioMaterialMapByInteger
protected Map<Integer,BioMaterial> columnBioMaterialMapByInteger
-
rowElementMap
protected Map<CompositeSequence,Integer> rowElementMap
-
rowDesignElementMapByInteger
protected Map<Integer,CompositeSequence> rowDesignElementMapByInteger
-
-
Method Detail
-
columns
public int columns(CompositeSequence el)
Description copied from interface: MultiAssayBulkExpressionDataMatrix
Number of columns that use the given design element. Useful if the matrix includes data from more than one array design.
- Specified by:
columns
in interface MultiAssayBulkExpressionDataMatrix<T>
- Parameters:
el
- the design element
- Returns:
- the number of columns that use the given design element
-
getBioAssayDimensions
public Collection<BioAssayDimension> getBioAssayDimensions()
Description copied from interface: MultiAssayBulkExpressionDataMatrix
Obtain all the BioAssayDimensions that are used in this matrix.
- Specified by:
getBioAssayDimensions
in interface MultiAssayBulkExpressionDataMatrix<T>
-
getBioAssayDimension
public BioAssayDimension getBioAssayDimension()
Description copied from interface: MultiAssayBulkExpressionDataMatrix
Obtain the dimension for the columns of this matrix.
- Specified by:
getBioAssayDimension
in interface BulkExpressionDataMatrix<T>
- Specified by:
getBioAssayDimension
in interface MultiAssayBulkExpressionDataMatrix<T>
-
getBestBioAssayDimension
public Optional<BioAssayDimension> getBestBioAssayDimension()
Description copied from interface: MultiAssayBulkExpressionDataMatrix
Obtain the largest BioAssayDimension that covers all the biomaterials in this matrix.
- Specified by:
getBestBioAssayDimension
in interface MultiAssayBulkExpressionDataMatrix<T>
- Returns:
- the best BioAssayDimension for this matrix, or Optional.empty() if no such dimension exists
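One plausible reading of "largest dimension that covers all the biomaterials" is sketched below. This is an illustration under stated assumptions, not the library's implementation: BestDimensionSketch and best are hypothetical names, each dimension is modelled as a list of biomaterial names, and "largest" is taken to mean the most entries.

```java
import java.util.Collection;
import java.util.List;
import java.util.Optional;
import java.util.Set;

// Hypothetical model: a dimension is a list of biomaterial names.
class BestDimensionSketch {
    /**
     * Returns the largest candidate dimension that contains every required
     * biomaterial, or Optional.empty() if no candidate covers them all.
     */
    static Optional<List<String>> best(Collection<List<String>> dimensions, Set<String> required) {
        List<String> best = null;
        for (List<String> dimension : dimensions) {
            boolean covers = dimension.containsAll(required);
            if (covers && (best == null || dimension.size() > best.size())) {
                best = dimension;
            }
        }
        return Optional.ofNullable(best);
    }
}
```

Callers would then handle the empty case explicitly, for example with orElseThrow or ifPresent, rather than checking for null.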
-
getBioAssayDimension
public BioAssayDimension getBioAssayDimension(CompositeSequence designElement)
Description copied from interface: MultiAssayBulkExpressionDataMatrix
Produce a BioAssayDimension representing the matrix columns for a specific row. The designElement argument is needed because a matrix can combine data from multiple array designs, each of which will generate its own BioAssayDimension. Note that if this represents a subsetted data set, the return value may be a lightweight 'fake'.
- Specified by:
getBioAssayDimension
in interface MultiAssayBulkExpressionDataMatrix<T>
- Parameters:
designElement
- the design element
- Returns:
- the BioAssayDimension for the given design element
-
getBioAssaysForColumn
public Collection<BioAssay> getBioAssaysForColumn(int index)
- Specified by:
getBioAssaysForColumn
in interface MultiAssayBulkExpressionDataMatrix<T>
- Parameters:
index
- the column index
- Returns:
- bioassays that contribute data to the column. There can be multiple bioassays if more than one array was used in the study.
-
getBioAssayForColumn
public BioAssay getBioAssayForColumn(int index)
Description copied from interface: BulkExpressionDataMatrix
Obtain an assay corresponding to a given column.
- Specified by:
getBioAssayForColumn
in interface BulkExpressionDataMatrix<T>
-
getBioMaterialForColumn
public BioMaterial getBioMaterialForColumn(int index)
Description copied from interface: BulkExpressionDataMatrix
Obtain a biomaterial corresponding to a column.
- Specified by:
getBioMaterialForColumn
in interface BulkExpressionDataMatrix<T>
- Specified by:
getBioMaterialForColumn
in interface MultiAssayBulkExpressionDataMatrix<T>
- Parameters:
index
- the column index
- Returns:
- BioMaterial. Note that if this represents a subsetted data set, the BioMaterial may be a lightweight 'fake'.
-
getColumn
public T[] getColumn(BioAssay bioAssay)
Description copied from interface: BulkExpressionDataMatrix
Access a single column of the matrix.
- Specified by:
getColumn
in interface BulkExpressionDataMatrix<T>
- Returns:
- a vector for the given column, or null if the column is not present
-
getColumnIndex
public int getColumnIndex(BioMaterial bioMaterial)
- Specified by:
getColumnIndex
in interface BulkExpressionDataMatrix<T>
- Specified by:
getColumnIndex
in interface MultiAssayBulkExpressionDataMatrix<T>
- Parameters:
bioMaterial
- the biomaterial
- Returns:
- the index of the column for the data for the bioMaterial, or -1 if missing
-
getRow
public T[] getRow(CompositeSequence designElement)
Description copied from interface: ExpressionDataMatrix
Return a row that 'came from' the given design element.
- Specified by:
getRow
in interface ExpressionDataMatrix<T>
- Parameters:
designElement
- the design element
- Returns:
- the corresponding row or null if the design element is not found in the matrix
-
getDesignElements
public List<CompositeSequence> getDesignElements()
Description copied from interface: ExpressionDataMatrix
Obtain all the design elements in this data matrix.
- Specified by:
getDesignElements
in interface ExpressionDataMatrix<T>
-
getDesignElementForRow
public CompositeSequence getDesignElementForRow(int index)
Description copied from interface: ExpressionDataMatrix
Return a design element for a given index.
- Specified by:
getDesignElementForRow
in interface ExpressionDataMatrix<T>
-
get
public T get(CompositeSequence designElement, BioAssay bioAssay)
Description copied from interface: BulkExpressionDataMatrix
Access a single value of the matrix. Note that because there can be multiple bioassays per column and multiple design elements per row, this method may retrieve a value that does not come from the given bioAssay and/or designElement arguments.
- Specified by:
get
in interface BulkExpressionDataMatrix<T>
- Parameters:
designElement
- the design element
bioAssay
- the bioassay
- Returns:
- the value at the given design element and bioassay, or null if the value is missing
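The caveat about shared columns is easier to see with a small model. The sketch below is hypothetical (CellLookupSketch is not a Gemma class, and strings stand in for CompositeSequence and BioAssay); returning null for an unknown design element or bioassay is an assumption of the sketch, not documented behaviour.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical model of a lookup that resolves domain objects to integer indices first.
class CellLookupSketch {
    final Map<String, Integer> rowByElement = new HashMap<>();
    final Map<String, Integer> columnByAssay = new HashMap<>();
    final Double[][] values;

    CellLookupSketch(Double[][] values) {
        this.values = values;
    }

    /** Returns the cell value, or null if the cell is empty or a key is unknown (sketch assumption). */
    Double get(String designElement, String bioAssay) {
        Integer row = rowByElement.get(designElement);
        Integer column = columnByAssay.get(bioAssay);
        if (row == null || column == null) {
            return null;
        }
        // Two bioassays mapped to the same column resolve to the same cell.
        return values[row][column];
    }
}
```

If two bioassays are registered under the same column, asking for either one returns the same stored value, which is why the result cannot always be attributed to the bioassay passed in.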
-
getExpressionExperiment
public ExpressionExperiment getExpressionExperiment()
Description copied from interface: ExpressionDataMatrix
Return the expression experiment this matrix is holding data for, if known.
- Specified by:
getExpressionExperiment
in interface ExpressionDataMatrix<T>
-
getQuantitationTypes
public Collection<QuantitationType> getQuantitationTypes()
Description copied from interface: MultiAssayBulkExpressionDataMatrix
Return the quantitation types for this matrix. Often (usually) there will be just one.
- Specified by:
getQuantitationTypes
in interface MultiAssayBulkExpressionDataMatrix<T>
-
getQuantitationType
public QuantitationType getQuantitationType()
Description copied from interface: MultiAssayBulkExpressionDataMatrix
Obtain the quantitation type for this matrix.
In the case of multi-assay matrices, more than one quantitation type may be present. When possible, those are merged with QuantitationTypeConversionUtils.mergeQuantitationTypes(Collection).
- Specified by:
getQuantitationType
in interface ExpressionDataMatrix<T>
- Specified by:
getQuantitationType
in interface MultiAssayBulkExpressionDataMatrix<T>
-
getRowElements
public List<ExpressionDataMatrixRowElement> getRowElements()
- Specified by:
getRowElements
in interface ExpressionDataMatrix<T>
- Returns:
- list of elements representing the row 'labels'.
-
getRowIndex
public int getRowIndex(CompositeSequence designElement)
- Specified by:
getRowIndex
in interface ExpressionDataMatrix<T>
- Returns:
- the index for the given design element, or -1 if not found
-
getRowIndices
public int[] getRowIndices(CompositeSequence designElement)
Description copied from interface: ExpressionDataMatrix
Obtain all the rows that correspond to the given design element, or null if the design element is not found.
- Specified by:
getRowIndices
in interface ExpressionDataMatrix<T>
-
getRowElement
public ExpressionDataMatrixRowElement getRowElement(int index)
- Specified by:
getRowElement
in interface ExpressionDataMatrix<T>
-
getColumnIndex
public int getColumnIndex(BioAssay bioAssay)
Obtain the column index of a given assay.
- Specified by:
getColumnIndex
in interface BulkExpressionDataMatrix<T>
- Returns:
- the index, or -1 if not found
-
format
protected abstract String format(int row, int column)
Format the value at the provided indices of the matrix.
- Specified by:
format
in class AbstractExpressionDataMatrix<T>
-
vectorsToMatrix
protected abstract void vectorsToMatrix(Collection<? extends BulkExpressionDataVector> vectors)
-
setMatBioAssayValues
protected <R,C,V> void setMatBioAssayValues(AbstractMatrix<R,C,V> mat, Integer rowIndex, V[] vals, Collection<BioAssay> bioAssays, Iterator<BioAssay> it)
-
addToRowMaps
protected void addToRowMaps(int row, CompositeSequence designElement)
Each row is a unique DesignElement.
- Parameters:
row
- The row number to be used by this design element.
-
setUpColumnElements
protected int setUpColumnElements()
Note: In the current versions of Gemma, we require that there can be only a single BioAssayDimension. Thus this code is overly complex. If an experiment has multiple BioAssayDimensions (due to multiple arrays), we merge the vectors (e.g., needed in the last case shown below). However, the issue of having multiple "BioMaterials" per "BioAssay" still exists.
Deals with the fact that the bioassay dimensions can vary in size, and don't even need to overlap in the biomaterials used. In the case where there is a single BioAssayDimension this reduces to simply associating each column with a bioassay (though we are forced to use an integer under the hood).
For example, in the following diagram "-" indicates a biomaterial, while "*" indicates a bioassay. Each row of "*" indicates samples run on a different microarray design (a different bio assay material). In the examples we assume there is just a single biomaterial dimension.
---------------
*****              -- only a few samples run on this platform
**********         -- ditto
           ****    -- these samples were not run on any of the other platforms
A simpler case:
---------------
***************
***********
*******
A more typical and easy case (one microarray design used):
----------------
****************
If every sample was run on two different array designs:
----------------
****************
****************
Every sample was run on a different array design:
-----------------------
******
      *********
               ********
Because there can be limited or no overlap between the bioassay dimensions, we cannot assume the dimensions of the matrix will be defined by the longest BioAssayDimension. Note that later in processing, this possible lack of overlap is fixed by sample matching or vector merging; this class has to deal with the general case.
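The general case described above, where dimensions differ in size and may not overlap, can be handled by assigning one column per distinct biomaterial and collecting every bioassay that measured it. The following is a rough sketch under stated assumptions, not the actual method body: ColumnSetupSketch and setUp are hypothetical names, and each dimension is modelled as a list of {assayName, materialName} pairs.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical model: one column per distinct biomaterial across all dimensions.
class ColumnSetupSketch {
    final Map<String, Integer> columnByMaterial = new LinkedHashMap<>();
    final Map<String, Integer> columnByAssay = new LinkedHashMap<>();
    final Map<Integer, Collection<String>> assaysByColumn = new LinkedHashMap<>();

    /**
     * @param dimensions one list per platform; each entry is {assayName, materialName}
     * @return the number of columns, which equals the number of distinct biomaterials
     */
    int setUp(List<List<String[]>> dimensions) {
        for (List<String[]> dimension : dimensions) {
            for (String[] pair : dimension) {
                String assay = pair[0];
                String material = pair[1];
                Integer column = columnByMaterial.get(material);
                if (column == null) {
                    // First time this biomaterial is seen: open a new column.
                    column = columnByMaterial.size();
                    columnByMaterial.put(material, column);
                    assaysByColumn.put(column, new ArrayList<String>());
                }
                columnByAssay.put(assay, column);
                assaysByColumn.get(column).add(assay);
            }
        }
        return columnByMaterial.size();
    }
}
```

With this layout the column count is the size of the union of biomaterials rather than the length of the longest dimension, which matches the point made in the closing paragraph.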
-
selectVectors
protected void selectVectors(Collection<? extends BulkExpressionDataVector> vectors)
Selects all the vectors passed in (uses them to initialize the data)
-
selectVectors
protected Collection<BulkExpressionDataVector> selectVectors(Collection<? extends BulkExpressionDataVector> vectors, Collection<QuantitationType> qTypes)
-
selectVectors
protected Collection<BulkExpressionDataVector> selectVectors(Collection<? extends BulkExpressionDataVector> vectors, List<QuantitationType> qTypes)
-
formatRepresentation
protected String formatRepresentation()
Description copied from class: AbstractExpressionDataMatrix
Produce a string representation of the type of values held in the matrix.
- Overrides:
formatRepresentation
in class AbstractExpressionDataMatrix<T>
-
getRowLabel
protected String getRowLabel(int i)
Description copied from class: AbstractExpressionDataMatrix
Obtain a label suitable for describing a row of the matrix.
- Specified by:
getRowLabel
in class AbstractExpressionDataMatrix<T>
-
getColumnLabel
protected String getColumnLabel(int j)
Description copied from class: AbstractExpressionDataMatrix
Obtain a label suitable for describing a column of the matrix.
- Specified by:
getColumnLabel
in class AbstractExpressionDataMatrix<T>
-
-