Interface BulkExpressionDataMatrix<T>
-
- All Superinterfaces:
ExpressionDataMatrix<T>
- All Known Implementing Classes:
BaseExpressionDataMatrix, EmptyExpressionMatrix, ExpressionDataBooleanMatrix, ExpressionDataDoubleMatrix, ExpressionDataIntegerMatrix, ExpressionDataStringMatrix
public interface BulkExpressionDataMatrix<T> extends ExpressionDataMatrix<T>
Interface for bulk expression data matrices. In a bulk expression data matrix, each column represents a sample.
Expression data is rather complex, so we have to handle some messy cases.
The key problem is how to unambiguously identify rows and columns in the matrix. This is greatly complicated by the fact that experiments can combine data from multiple array designs in various ways.
Put together, the result is that there can be more than one BioAssay per column, and the same BioMaterial can be used in multiple columns (supported implicitly). There can also be more than one BioMaterial in one column (we don't support this yet). The same BioSequence can be found in multiple rows. A row can contain data from more than one CompositeSequence.
There are a few constraints: a particular CompositeSequence can only be used once, in a single row. At the moment we do not directly support technical replicates, though this should be possible. A BioAssay can only appear in a single column.
For some operations an ExpressionDataMatrixRowElement object is offered, which encapsulates a combination of a CompositeSequence, a BioSequence, and an index. The list of these can be useful for iterating over the rows of the matrix.
- Author:
- pavlidis, keshav
- See Also:
BioAssayDimension, BulkExpressionDataVector
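The column constraints described above (several assays may share a column, but each assay belongs to exactly one column) can be illustrated with a small reverse index. This is a minimal sketch, not Gemma code: the class ColumnIndexSketch is hypothetical, and plain strings stand in for BioAssay objects.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in: assays are plain strings, not Gemma BioAssay objects.
class ColumnIndexSketch {
    private final Map<String, Integer> columnByAssay = new HashMap<>();

    /** @param assaysByColumn for each column, the assays whose data it holds */
    ColumnIndexSketch(List<List<String>> assaysByColumn) {
        for (int col = 0; col < assaysByColumn.size(); col++) {
            for (String assay : assaysByColumn.get(col)) {
                // putIfAbsent returns the earlier column if the assay was already registered
                Integer previous = columnByAssay.putIfAbsent(assay, col);
                if (previous != null) {
                    // An assay may appear only once; repeats within a column are rejected too
                    throw new IllegalArgumentException(
                            "Assay " + assay + " appears in columns " + previous + " and " + col);
                }
            }
        }
    }

    /** Column holding the assay's data, or -1 if the assay is not in the matrix. */
    int getColumnIndex(String assay) {
        return columnByAssay.getOrDefault(assay, -1);
    }
}
```

Rejecting a repeated assay at construction time is a choice made for this sketch; how the real implementations enforce the constraint is not stated here.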
-
-
Method Summary
int columns(CompositeSequence el)
Number of columns that use the given design element.
T[][] get(List<CompositeSequence> designElements, List<BioAssay> bioAssays)
Access a submatrix.
T get(CompositeSequence designElement, BioAssay bioAssay)
Access a single value of the matrix.
BioAssayDimension getBestBioAssayDimension()
BioAssayDimension getBioAssayDimension(CompositeSequence designElement)
Produce a BioAssayDimension representing the matrix columns for a specific row.
BioAssay getBioAssayForColumn(int index)
Obtain a single assay for a column.
Collection<BioAssay> getBioAssaysForColumn(int index)
BioMaterial getBioMaterialForColumn(int index)
T[] getColumn(BioAssay bioAssay)
Access a single column of the matrix.
int getColumnIndex(BioAssay bioAssay)
int getColumnIndex(BioMaterial bioMaterial)
T[][] getColumns(List<BioAssay> bioAssays)
Access a submatrix slice by columns.
ExpressionExperiment getExpressionExperiment()
The experiment this matrix is associated with, if known.
static BulkExpressionDataMatrix<?> getMatrix(Collection<? extends BulkExpressionDataVector> vectors)
Create a matrix using all the vectors, which are assumed to all be of the same quantitation type.
QuantitationType getQuantitationType()
Obtain the single quantitation type for this matrix.
Collection<QuantitationType> getQuantitationTypes()
Return the quantitation types for this matrix.
T[][] getRawMatrix()
Access the entire matrix.
boolean hasMissingValues()
void set(int row, int column, T value)
Set a value in the matrix, by index.
Methods inherited from interface ubic.gemma.core.datastructure.matrix.ExpressionDataMatrix
columns, get, getColumn, getDesignElementForRow, getDesignElements, getRow, getRow, getRowElement, getRowElement, getRowElements, getRowIndex, rows
-
-
-
-
Method Detail
-
getMatrix
static BulkExpressionDataMatrix<?> getMatrix(Collection<? extends BulkExpressionDataVector> vectors)
Create a matrix using all the vectors, which are assumed to all be of the same quantitation type.
- Parameters:
vectors - raw vectors
- Returns:
- matrix of appropriate type.
-
getExpressionExperiment
@Nullable ExpressionExperiment getExpressionExperiment()
The experiment this matrix is associated with, if known.
- Specified by:
getExpressionExperiment in interface ExpressionDataMatrix<T>
-
getQuantitationTypes
Collection<QuantitationType> getQuantitationTypes()
Return the quantitation types for this matrix. Usually there will be just one.
- Returns:
- the quantitation types
-
getQuantitationType
QuantitationType getQuantitationType()
Obtain the single quantitation type for this matrix.
- Throws:
IllegalStateException - if there is more than one quantitation type.
-
getBestBioAssayDimension
BioAssayDimension getBestBioAssayDimension()
- Returns:
- a BioAssayDimension that covers all the biomaterials in this matrix.
- Throws:
IllegalStateException - if there isn't a single BioAssayDimension that encapsulates all the biomaterials used in the experiment.
-
hasMissingValues
boolean hasMissingValues()
- Returns:
- true if any values are null or NaN (for Doubles); all other values are considered non-missing.
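The missing-value rule above (null, or NaN for Doubles) can be sketched for a Double matrix as follows. MissingValueSketch is a hypothetical helper written for illustration; it is not how any particular implementing class is known to do it.

```java
// Hypothetical helper, not part of Gemma.
class MissingValueSketch {
    /** Returns true if any cell is null or NaN; every other value counts as present. */
    static boolean hasMissingValues(Double[][] matrix) {
        for (Double[] row : matrix) {
            for (Double value : row) {
                // Test for null first so isNaN() is never called on a null reference
                if (value == null || value.isNaN()) {
                    return true;
                }
            }
        }
        return false;
    }

    public static void main(String[] args) {
        Double[][] complete = {{1.0, 2.0}, {3.0, 4.0}};
        Double[][] withNaN = {{1.0, Double.NaN}, {3.0, 4.0}};
        System.out.println(hasMissingValues(complete)); // prints false
        System.out.println(hasMissingValues(withNaN));  // prints true
    }
}
```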
-
get
T get(CompositeSequence designElement, BioAssay bioAssay)
Access a single value of the matrix. Note that because there can be multiple bioassays per column and multiple design elements per row, it is possible for this method to retrieve data that does not come from the bioassay and/or design element arguments.
- Parameters:
designElement - the design element identifying the row
bioAssay - the bioassay identifying the column
- Returns:
- the value at that row and column
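The note about shared rows and columns is easier to see with a toy lookup. In the sketch below, two assays map to the same column, so asking for either returns the same cell. CellLookupSketch, the string keys, and the choice to return null for unknown keys are all assumptions made for illustration.

```java
import java.util.Map;

// Hypothetical stand-in: strings replace CompositeSequence and BioAssay.
class CellLookupSketch {
    private final Double[][] data;
    private final Map<String, Integer> rowByElement;
    private final Map<String, Integer> columnByAssay;

    CellLookupSketch(Double[][] data,
                     Map<String, Integer> rowByElement,
                     Map<String, Integer> columnByAssay) {
        this.data = data;
        this.rowByElement = rowByElement;
        this.columnByAssay = columnByAssay;
    }

    /** Value at the element's row and the assay's column; null if either key is unknown (sketch choice). */
    Double get(String designElement, String bioAssay) {
        Integer row = rowByElement.get(designElement);
        Integer col = columnByAssay.get(bioAssay);
        if (row == null || col == null) {
            return null;
        }
        return data[row][col];
    }

    public static void main(String[] args) {
        Double[][] data = {{1.5, 2.5}};
        // Assays "a1" and "a2" share column 0, so both resolve to the same cell
        CellLookupSketch m = new CellLookupSketch(
                data, Map.of("probe1", 0), Map.of("a1", 0, "a2", 0, "a3", 1));
        System.out.println(m.get("probe1", "a1")); // prints 1.5
        System.out.println(m.get("probe1", "a2")); // prints 1.5
    }
}
```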
-
get
T[][] get(List<CompositeSequence> designElements, List<BioAssay> bioAssays)
Access a submatrix.
- Parameters:
designElements - the design elements (rows) to include
bioAssays - the bioassays (columns) to include
- Returns:
- T[][]
-
getRawMatrix
T[][] getRawMatrix()
Access the entire matrix.
- Returns:
- T[][]
-
getColumn
T[] getColumn(BioAssay bioAssay)
Access a single column of the matrix.
- Parameters:
bioAssay - the bioassay whose column to retrieve
- Returns:
- T[]
-
getColumns
T[][] getColumns(List<BioAssay> bioAssays)
Access a submatrix slice by columns.
- Parameters:
bioAssays - the bioassays whose columns to include
- Returns:
- T[][]
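Column access can be sketched on a plain Double[][] once the bioassays have been resolved to column indices. The helper below is hypothetical, and it assumes the result keeps rows as the first dimension; the orientation of the arrays returned by the real implementations is not stated in this documentation.

```java
// Hypothetical helper, not part of Gemma; columns are given as already-resolved indices.
class ColumnSliceSketch {
    /** Copies the requested columns in the order given, keeping the row order. */
    static Double[][] getColumns(Double[][] matrix, int[] columnIndices) {
        Double[][] result = new Double[matrix.length][columnIndices.length];
        for (int r = 0; r < matrix.length; r++) {
            for (int j = 0; j < columnIndices.length; j++) {
                result[r][j] = matrix[r][columnIndices[j]];
            }
        }
        return result;
    }

    /** A single column as a one-dimensional array with one entry per row. */
    static Double[] getColumn(Double[][] matrix, int columnIndex) {
        Double[] result = new Double[matrix.length];
        for (int r = 0; r < matrix.length; r++) {
            result[r] = matrix[r][columnIndex];
        }
        return result;
    }
}
```

For a matrix with rows {1, 2, 3} and {4, 5, 6}, requesting columns 2 and 0 yields rows {3, 1} and {6, 4}.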
-
columns
int columns(CompositeSequence el)
Number of columns that use the given design element. Useful if the matrix includes data from more than one array design.
- Parameters:
el - the design element
- Returns:
- the number of columns
-
getBioMaterialForColumn
BioMaterial getBioMaterialForColumn(int index)
- Parameters:
index - the column index
- Returns:
- BioMaterial. Note that if this represents a subsetted data set, the BioMaterial may be a lightweight 'fake'.
-
getColumnIndex
int getColumnIndex(BioMaterial bioMaterial)
- Parameters:
bioMaterial - the biomaterial to look up
- Returns:
- the index of the column for the data for the bioMaterial, or -1 if missing
-
getColumnIndex
int getColumnIndex(BioAssay bioAssay)
- Returns:
- the index of the column for the data for the bioAssay, or -1 if missing
-
getBioAssayDimension
BioAssayDimension getBioAssayDimension(CompositeSequence designElement)
Produce a BioAssayDimension representing the matrix columns for a specific row. The design element argument is needed because a matrix can combine data from multiple array designs, each of which will generate its own BioAssayDimension. Note that if this represents a subsetted data set, the return value may be a lightweight 'fake'.
- Parameters:
designElement - the design element identifying the row
- Returns:
- the BioAssayDimension for that row
-
getBioAssaysForColumn
Collection<BioAssay> getBioAssaysForColumn(int index)
- Parameters:
index - the column index
- Returns:
- bioassays that contribute data to the column. There can be multiple bioassays if more than one array was used in the study.
-
getBioAssayForColumn
BioAssay getBioAssayForColumn(int index)
Obtain a single assay for a column.
- Parameters:
index - the column index
- Returns:
- the single assay for the column
- Throws:
IllegalStateException - if there is more than one assay for the column.
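The relationship between getBioAssaysForColumn and getBioAssayForColumn can be sketched as a "single element or fail" accessor. ColumnAssaySketch is hypothetical, strings stand in for BioAssay objects, and the sketch assumes every column has at least one assay.

```java
import java.util.Collection;
import java.util.List;

// Hypothetical stand-in: assays are plain strings, not Gemma BioAssay objects.
class ColumnAssaySketch {
    // For each column, the assays contributing data to it (assumed non-empty)
    private final List<List<String>> assaysByColumn;

    ColumnAssaySketch(List<List<String>> assaysByColumn) {
        this.assaysByColumn = assaysByColumn;
    }

    /** All assays that contribute data to the column. */
    Collection<String> getBioAssaysForColumn(int index) {
        return assaysByColumn.get(index);
    }

    /** The only assay for the column; fails if the column combines several assays. */
    String getBioAssayForColumn(int index) {
        Collection<String> assays = getBioAssaysForColumn(index);
        if (assays.size() > 1) {
            throw new IllegalStateException(
                    "Column " + index + " has " + assays.size() + " assays");
        }
        return assays.iterator().next();
    }
}
```

Callers that may encounter merged columns would use the collection-returning method; the single-assay accessor suits matrices known to come from one array design.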
-
set
void set(int row, int column, T value)
Set a value in the matrix, by index.
- Parameters:
row - the row index
column - the column index
value - the value to set
-
-