Class BaseExpressionDataMatrix<T>
- java.lang.Object
-
- ubic.gemma.core.datastructure.matrix.BaseExpressionDataMatrix<T>
-
- All Implemented Interfaces:
Serializable, BulkExpressionDataMatrix<T>, ExpressionDataMatrix<T>
- Direct Known Subclasses:
EmptyExpressionMatrix, ExpressionDataBooleanMatrix, ExpressionDataDoubleMatrix, ExpressionDataIntegerMatrix, ExpressionDataStringMatrix
public abstract class BaseExpressionDataMatrix<T> extends Object implements BulkExpressionDataMatrix<T>, Serializable
Base class for ExpressionDataMatrix implementations. Implementation note: The underlying DoubleMatrixNamed is indexed by Integers, which are in turn mapped to BioAssays etc. held here. Thus the 'names' of the underlying matrix are just numbers.
- Author:
- pavlidis
- See Also:
- Serialized Form
-
-
Field Summary
Fields
Modifier and Type Field Description
protected Map<CompositeSequence,BioAssayDimension>
bioAssayDimensions
protected Map<BioAssay,Integer>
columnAssayMap
protected Map<Integer,Collection<BioAssay>>
columnBioAssayMapByInteger
protected Map<BioMaterial,Integer>
columnBioMaterialMap
protected Map<Integer,BioMaterial>
columnBioMaterialMapByInteger
protected ExpressionExperiment
expressionExperiment
protected Collection<QuantitationType>
quantitationTypes
protected Map<Integer,CompositeSequence>
rowDesignElementMapByInteger
protected Map<CompositeSequence,Integer>
rowElementMap
-
Constructor Summary
Constructors
Constructor Description
BaseExpressionDataMatrix()
-
Method Summary
Modifier and Type Method Description
protected void
addToRowMaps(int row, CompositeSequence designElement)
Each row is a unique DesignElement.
int
columns(CompositeSequence el)
Number of columns that use the given design element.
BioAssayDimension
getBestBioAssayDimension()
BioAssayDimension
getBioAssayDimension(CompositeSequence designElement)
Produce a BioAssayDimension representing the matrix columns for a specific row.
BioAssay
getBioAssayForColumn(int index)
Obtain a single assay for a column.
Collection<BioAssay>
getBioAssaysForColumn(int index)
BioMaterial
getBioMaterialForColumn(int index)
int
getColumnIndex(BioAssay bioAssay)
Obtain the column index of a given assay.
int
getColumnIndex(BioMaterial bioMaterial)
CompositeSequence
getDesignElementForRow(int index)
Return a design element for a given index.
List<CompositeSequence>
getDesignElements()
Obtain all the design elements in this data matrix.
ExpressionExperiment
getExpressionExperiment()
The experiment this matrix is associated with, if known.
QuantitationType
getQuantitationType()
Obtain the single quantitation type for this matrix.
Collection<QuantitationType>
getQuantitationTypes()
Return the quantitation types for this matrix.
ExpressionDataMatrixRowElement
getRowElement(int index)
ExpressionDataMatrixRowElement
getRowElement(CompositeSequence designElement)
List<ExpressionDataMatrixRowElement>
getRowElements()
int
getRowIndex(CompositeSequence designElement)
protected void
init()
protected void
selectVectors(Collection<? extends BulkExpressionDataVector> vectors)
Selects all the vectors passed in (uses them to initialize the data).
protected Collection<BulkExpressionDataVector>
selectVectors(Collection<? extends BulkExpressionDataVector> vectors, Collection<QuantitationType> qTypes)
protected Collection<BulkExpressionDataVector>
selectVectors(Collection<? extends BulkExpressionDataVector> vectors, List<QuantitationType> qTypes)
protected Collection<BulkExpressionDataVector>
selectVectors(ExpressionExperiment ee, QuantitationType quantitationType)
protected <R,C,V> void
setMatBioAssayValues(AbstractMatrix<R,C,V> mat, Integer rowIndex, V[] vals, Collection<BioAssay> bioAssays, Iterator<BioAssay> it)
protected int
setUpColumnElements()
Note: In the current versions of Gemma, we require that there can be only a single BioAssayDimension.
protected abstract void
vectorsToMatrix(Collection<? extends BulkExpressionDataVector> vectors)
-
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
-
Methods inherited from interface ubic.gemma.core.datastructure.matrix.BulkExpressionDataMatrix
get, get, getColumn, getColumns, getRawMatrix, hasMissingValues, set
-
-
-
-
Field Detail
-
expressionExperiment
@Nullable protected ExpressionExperiment expressionExperiment
-
quantitationTypes
protected Collection<QuantitationType> quantitationTypes
-
bioAssayDimensions
protected Map<CompositeSequence,BioAssayDimension> bioAssayDimensions
-
columnBioMaterialMap
protected Map<BioMaterial,Integer> columnBioMaterialMap
-
columnBioAssayMapByInteger
protected Map<Integer,Collection<BioAssay>> columnBioAssayMapByInteger
-
columnBioMaterialMapByInteger
protected Map<Integer,BioMaterial> columnBioMaterialMapByInteger
-
rowElementMap
protected Map<CompositeSequence,Integer> rowElementMap
-
rowDesignElementMapByInteger
protected Map<Integer,CompositeSequence> rowDesignElementMapByInteger
-
-
Method Detail
-
columns
public int columns(CompositeSequence el)
Description copied from interface: BulkExpressionDataMatrix
Number of columns that use the given design element. Useful if the matrix includes data from more than one array design.
- Specified by:
columns
in interface BulkExpressionDataMatrix<T>
- Parameters:
el
- the design element
- Returns:
- the number of columns that use the given design element
-
getBestBioAssayDimension
public BioAssayDimension getBestBioAssayDimension()
- Specified by:
getBestBioAssayDimension
in interface BulkExpressionDataMatrix<T>
- Returns:
- a BioAssayDimension that covers all the biomaterials in this matrix.
-
getBioAssayDimension
public BioAssayDimension getBioAssayDimension(CompositeSequence designElement)
Description copied from interface: BulkExpressionDataMatrix
Produce a BioAssayDimension representing the matrix columns for a specific row. The designElement argument is needed because a matrix can combine data from multiple array designs, each of which will generate its own BioAssayDimension. Note that if this represents a subsetted data set, the return value may be a lightweight 'fake'.
- Specified by:
getBioAssayDimension
in interface BulkExpressionDataMatrix<T>
- Parameters:
designElement
- the design element of the row
- Returns:
- the BioAssayDimension for that row
-
getBioAssaysForColumn
public Collection<BioAssay> getBioAssaysForColumn(int index)
- Specified by:
getBioAssaysForColumn
in interface BulkExpressionDataMatrix<T>
- Parameters:
index
- the column index
- Returns:
- bioassays that contribute data to the column. There can be multiple bioassays if more than one array was used in the study.
-
getBioAssayForColumn
public BioAssay getBioAssayForColumn(int index)
Description copied from interface: BulkExpressionDataMatrix
Obtain a single assay for a column.
- Specified by:
getBioAssayForColumn
in interface BulkExpressionDataMatrix<T>
- Returns:
-
getBioMaterialForColumn
public BioMaterial getBioMaterialForColumn(int index)
- Specified by:
getBioMaterialForColumn
in interface BulkExpressionDataMatrix<T>
- Parameters:
index
- the column index
- Returns:
- BioMaterial. Note that if this represents a subsetted data set, the BioMaterial may be a lightweight 'fake'.
-
getColumnIndex
public int getColumnIndex(BioMaterial bioMaterial)
- Specified by:
getColumnIndex
in interface BulkExpressionDataMatrix<T>
- Parameters:
bioMaterial
- the biomaterial to look up
- Returns:
- the index of the column for the data for the bioMaterial, or -1 if missing
-
getDesignElements
public List<CompositeSequence> getDesignElements()
Description copied from interface: ExpressionDataMatrix
Obtain all the design elements in this data matrix.
- Specified by:
getDesignElements
in interface ExpressionDataMatrix<T>
-
getDesignElementForRow
public CompositeSequence getDesignElementForRow(int index)
Description copied from interface: ExpressionDataMatrix
Return a design element for a given index.
- Specified by:
getDesignElementForRow
in interface ExpressionDataMatrix<T>
-
getExpressionExperiment
@Nullable public ExpressionExperiment getExpressionExperiment()
Description copied from interface: BulkExpressionDataMatrix
The experiment this matrix is associated with, if known.
- Specified by:
getExpressionExperiment
in interface BulkExpressionDataMatrix<T>
- Specified by:
getExpressionExperiment
in interface ExpressionDataMatrix<T>
-
getQuantitationTypes
public Collection<QuantitationType> getQuantitationTypes()
Description copied from interface: BulkExpressionDataMatrix
Return the quantitation types for this matrix. Usually there will be just one.
- Specified by:
getQuantitationTypes
in interface BulkExpressionDataMatrix<T>
- Returns:
- the quantitation types
-
getQuantitationType
public QuantitationType getQuantitationType()
Description copied from interface: BulkExpressionDataMatrix
Obtain the single quantitation type for this matrix.
- Specified by:
getQuantitationType
in interface BulkExpressionDataMatrix<T>
-
getRowElements
public List<ExpressionDataMatrixRowElement> getRowElements()
- Specified by:
getRowElements
in interface ExpressionDataMatrix<T>
- Returns:
- list of elements representing the row 'labels'.
-
getRowIndex
public int getRowIndex(CompositeSequence designElement)
- Specified by:
getRowIndex
in interface ExpressionDataMatrix<T>
- Returns:
- the index for the given design element, or -1 if not found
-
getRowElement
public ExpressionDataMatrixRowElement getRowElement(int index)
- Specified by:
getRowElement
in interface ExpressionDataMatrix<T>
-
getRowElement
@Nullable public ExpressionDataMatrixRowElement getRowElement(CompositeSequence designElement)
- Specified by:
getRowElement
in interface ExpressionDataMatrix<T>
-
vectorsToMatrix
protected abstract void vectorsToMatrix(Collection<? extends BulkExpressionDataVector> vectors)
-
getColumnIndex
public int getColumnIndex(BioAssay bioAssay)
Obtain the column index of a given assay.
- Specified by:
getColumnIndex
in interface BulkExpressionDataMatrix<T>
- Returns:
- the index, or -1 if not found
-
init
protected void init()
-
setMatBioAssayValues
protected <R,C,V> void setMatBioAssayValues(AbstractMatrix<R,C,V> mat, Integer rowIndex, V[] vals, Collection<BioAssay> bioAssays, Iterator<BioAssay> it)
-
addToRowMaps
protected void addToRowMaps(int row, CompositeSequence designElement)
Each row is a unique DesignElement.
- Parameters:
row
- The row number to be used by this design element.
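As a rough illustration of how the two row maps (rowElementMap and rowDesignElementMapByInteger) could be kept in sync, here is a minimal self-contained sketch. It is not Gemma's implementation: RowIndexSketch is a hypothetical class, and a plain String stands in for CompositeSequence. The -1 result for an unknown design element mirrors the documented contract of getRowIndex.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for the row bookkeeping; String replaces CompositeSequence.
public class RowIndexSketch {
    // design element -> row number
    private final Map<String, Integer> rowElementMap = new HashMap<>();
    // row number -> design element
    private final Map<Integer, String> rowDesignElementMapByInteger = new HashMap<>();

    // Register a design element under the given row number in both maps.
    public void addToRowMaps(int row, String designElement) {
        rowElementMap.put(designElement, row);
        rowDesignElementMapByInteger.put(row, designElement);
    }

    // Returns the row for the design element, or -1 if it was never registered.
    public int getRowIndex(String designElement) {
        Integer row = rowElementMap.get(designElement);
        return row == null ? -1 : row;
    }

    // Returns the design element for the row, or null if the row is unknown.
    public String getDesignElementForRow(int index) {
        return rowDesignElementMapByInteger.get(index);
    }

    public static void main(String[] args) {
        RowIndexSketch rows = new RowIndexSketch();
        rows.addToRowMaps(0, "probe_A");
        rows.addToRowMaps(1, "probe_B");
        System.out.println(rows.getRowIndex("probe_B"));     // prints 1
        System.out.println(rows.getRowIndex("not_present")); // prints -1
        System.out.println(rows.getDesignElementForRow(0));  // prints probe_A
    }
}
```

Callers of the real class would likewise check for -1 before using a row index to read from the underlying matrix.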
-
setUpColumnElements
protected int setUpColumnElements()
Note: In the current versions of Gemma, we require that there can be only a single BioAssayDimension. Thus this code is overly complex. If an experiment has multiple BioAssayDimensions (due to multiple arrays), we merge the vectors (e.g., needed in the last case shown below). However, the issue of having multiple "BioMaterials" per "BioAssay" still exists.
Deals with the fact that the bioassay dimensions can vary in size, and don't even need to overlap in the biomaterials used. In the case where there is a single BioAssayDimension this reduces to simply associating each column with a bioassay (though we are forced to use an integer under the hood).
For example, in the following diagram "-" indicates a biomaterial, while "*" indicates a bioassay. Each row of "*" indicates samples run on a different microarray design (a different bio assay material). In the examples we assume there is just a single biomaterial dimension.
---------------
*****  -- only a few samples run on this platform
**********  -- ditto
****  -- these samples were not run on any of the other platforms.
A simpler case:
---------------
***************
***********
*******
A more typical and easy case (one microarray design used):
----------------
****************
If every sample was run on two different array designs:
----------------
****************
****************
Every sample was run on a different array design:
-----------------------
******
*********
********
Because there can be limited or no overlap between the bioassay dimensions, we cannot assume the dimensions of the matrix will be defined by the longest BioAssayDimension. Note that later in processing, this possible lack of overlap is fixed by sample matching or vector merging; this class has to deal with the general case.
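The general case described above can be sketched as follows: columns are defined by the union of biomaterials across all bioassay dimensions, not by the longest dimension. This is only an illustration under stated assumptions, not the actual Gemma code. ColumnSetupSketch and its nested Assay class are hypothetical, a String stands in for BioMaterial, and the column order (first appearance) is an arbitrary choice made for the example.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: one column per distinct biomaterial across all dimensions.
public class ColumnSetupSketch {
    // Stand-in for BioAssay: an assay name plus the biomaterial (sample) it measured.
    public static final class Assay {
        public final String name;
        public final String bioMaterial;

        public Assay(String name, String bioMaterial) {
            this.name = name;
            this.bioMaterial = bioMaterial;
        }
    }

    // biomaterial -> column index
    public final Map<String, Integer> columnBioMaterialMap = new LinkedHashMap<>();
    // column index -> all assays that contribute data to that column
    public final Map<Integer, List<Assay>> columnBioAssayMapByInteger = new LinkedHashMap<>();

    // Each inner list plays the role of one bioassay dimension. Returns the column count.
    public int setUpColumnElements(List<List<Assay>> dimensions) {
        for (List<Assay> dimension : dimensions) {
            for (Assay assay : dimension) {
                Integer column = columnBioMaterialMap.get(assay.bioMaterial);
                if (column == null) {
                    // First time this biomaterial is seen: give it the next free column.
                    column = columnBioMaterialMap.size();
                    columnBioMaterialMap.put(assay.bioMaterial, column);
                }
                columnBioAssayMapByInteger
                        .computeIfAbsent(column, k -> new ArrayList<>())
                        .add(assay);
            }
        }
        return columnBioMaterialMap.size();
    }
}
```

With two dimensions that share only one sample, for example {s1, s2, s3} and {s3, s4}, this sketch yields four columns, and the column for s3 holds two assays. That matches the case in which more than one bioassay contributes to a column, as described for getBioAssaysForColumn.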
-
selectVectors
protected void selectVectors(Collection<? extends BulkExpressionDataVector> vectors)
Selects all the vectors passed in (uses them to initialize the data)
-
selectVectors
protected Collection<BulkExpressionDataVector> selectVectors(Collection<? extends BulkExpressionDataVector> vectors, Collection<QuantitationType> qTypes)
-
selectVectors
protected Collection<BulkExpressionDataVector> selectVectors(Collection<? extends BulkExpressionDataVector> vectors, List<QuantitationType> qTypes)
-
selectVectors
protected Collection<BulkExpressionDataVector> selectVectors(ExpressionExperiment ee, QuantitationType quantitationType)
-
-