Interface BulkExpressionDataMatrix<T>
- All Superinterfaces:
ExpressionDataMatrix<T>
- All Known Subinterfaces:
BulkExpressionDataPrimitiveDoubleMatrix, BulkExpressionDataPrimitiveIntMatrix, MultiAssayBulkExpressionDataMatrix<T>
- All Known Implementing Classes:
AbstractBulkExpressionDataMatrix, AbstractMultiAssayExpressionDataMatrix, BulkExpressionDataDoubleMatrix, BulkExpressionDataIntMatrix, EmptyBulkExpressionDataMatrix, EmptyExpressionMatrix, ExpressionDataBooleanMatrix, ExpressionDataDoubleMatrix, ExpressionDataIntegerMatrix, ExpressionDataStringMatrix
In a bulk expression data matrix, each column represents a sample.
Expression data is rather complex, so we have to handle some messy cases.
The key problem is how to unambiguously identify rows and columns in the matrix. This is greatly complicated by the fact that experiments can combine data from multiple array designs in various ways.
Taken together, this means that there can be more than one BioAssay per column, and the same BioMaterial
can be used in multiple columns (supported implicitly). There can also be more than one BioMaterial in one column
(we don't support this yet). The same BioSequence can be found in multiple rows, and a row can contain
data from more than one CompositeSequence. These cases are handled by the MultiAssayBulkExpressionDataMatrix
interface and its corresponding implementations. This interface assumes the simplest case, where each column is
represented by a single BioAssay and each row is represented by a single CompositeSequence.
There are a few constraints: a particular CompositeSequence can only be used once, in a single row. At the
moment we do not directly support technical replicates, though this should be possible. A BioAssay can only
appear in a single column.
For some operations an ExpressionDataMatrixRowElement object is offered, which encapsulates a combination of
a CompositeSequence, a BioSequence, and an index. The list of these can be useful for iterating over
the rows of the matrix.
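The constraints above (each CompositeSequence in a single row, each BioAssay in a single column) can be illustrated with a minimal sketch. It is not Gemma's implementation: plain strings stand in for CompositeSequence and BioAssay, and the class and method names (UniqueAxisSketch, indexUnique) are hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Toy illustration of the "used only once" constraint on rows and columns. */
final class UniqueAxisSketch {

    /** Assigns consecutive indices to labels, rejecting any label that appears twice. */
    static Map<String, Integer> indexUnique(List<String> labels, String axisName) {
        Map<String, Integer> index = new LinkedHashMap<>();
        for (String label : labels) {
            // putIfAbsent returns the existing index when the label was already seen
            if (index.putIfAbsent(label, index.size()) != null) {
                throw new IllegalArgumentException(axisName + " used more than once: " + label);
            }
        }
        return index;
    }

    public static void main(String[] args) {
        // Strings are stand-ins for CompositeSequence (rows) and BioAssay (columns)
        Map<String, Integer> rows = indexUnique(List.of("probe_1", "probe_2"), "design element");
        Map<String, Integer> columns = indexUnique(List.of("assay_A", "assay_B", "assay_C"), "assay");
        System.out.println(rows.size() + " rows x " + columns.size() + " columns");
    }
}
```

Keeping one index map per axis makes the row and column identity unambiguous, which is the property the simple case relies on.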
- Author:
- pavlidis, keshav
- See Also:
-
Method Summary
Modifier and Type, Method, Description:
- get(CompositeSequence designElement, BioAssay bioAssay): Access a single value of the matrix.
- BioAssayDimension getBioAssayDimension(): Obtain the dimension for the columns of this matrix.
- getBioAssayForColumn(int index): Obtain an assay corresponding to a given column.
- getBioMaterialForColumn(int index): Obtain a biomaterial corresponding to a column.
- T[] getColumn: Access a single column of the matrix.
- int getColumnIndex(BioAssay bioAssay)
- int getColumnIndex(BioMaterial bioMaterial)
- static BulkExpressionDataMatrix<?> getMatrix(Collection<? extends BulkExpressionDataVector> vectors): Create a bulk expression data matrix from a collection of vectors.
- T[][] getRawMatrix(): Access the entire matrix.
- boolean hasMissingValues()
Methods inherited from interface ubic.gemma.core.datastructure.matrix.ExpressionDataMatrix
columns, get, getColumn, getDesignElementForRow, getDesignElements, getExpressionExperiment, getQuantitationType, getRow, getRow, getRowElement, getRowElements, getRowIndex, getRowIndices, rows, sliceRows
-
Method Details
-
getMatrix
static BulkExpressionDataMatrix<?> getMatrix(Collection<? extends BulkExpressionDataVector> vectors)
Create a bulk expression data matrix from a collection of vectors. All vectors must share the same QuantitationType.
-
getBioAssayDimension
BioAssayDimension getBioAssayDimension()
Obtain the dimension for the columns of this matrix.
-
hasMissingValues
boolean hasMissingValues()
- Returns:
- true if any value is null, NaN (for doubles and floats), or any other value that is considered missing.
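As a rough illustration of this contract for double data, the check below treats null and NaN as missing. It is a sketch over a plain Double[][] array, not the actual implementation; the class name MissingValuesSketch is hypothetical, and "any other value that is considered missing" is not modeled because the source does not specify it.

```java
/** Toy illustration of a missing-value scan over boxed doubles. */
final class MissingValuesSketch {

    /** Returns true if any cell is null or NaN. */
    static boolean hasMissingValues(Double[][] matrix) {
        for (Double[] row : matrix) {
            for (Double value : row) {
                // null check first, so isNaN() is never called on a null reference
                if (value == null || value.isNaN()) {
                    return true;
                }
            }
        }
        return false;
    }

    public static void main(String[] args) {
        Double[][] complete = { { 1.0, 2.0 }, { 3.0, 4.0 } };
        Double[][] withGap = { { 1.0, Double.NaN }, { 3.0, 4.0 } };
        System.out.println(hasMissingValues(complete));
        System.out.println(hasMissingValues(withGap));
    }
}
```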
-
get
Access a single value of the matrix. Note that because there can be multiple bioassays per column and multiple design elements per row, it is possible for this method to retrieve data that does not come from the bioAssay and/or designElement arguments.
- Parameters:
- designElement - de
- bioAssay - ba
- Returns:
- the value at the given design element and bioassay, or null if the value is missing
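A keyed lookup of this kind might be sketched as follows. Strings stand in for CompositeSequence and BioAssay, the class name CellLookupSketch is hypothetical, and returning null for an unknown key is an assumption made for the example; the documentation above only states that null is returned when the value is missing.

```java
import java.util.Map;

/** Toy illustration of a cell lookup keyed by design element and assay. */
final class CellLookupSketch {

    /** Returns the cell for the given keys, or null if the cell is missing. */
    static Double get(Map<String, Integer> rowIndex, Map<String, Integer> columnIndex,
                      Double[][] values, String designElement, String bioAssay) {
        Integer row = rowIndex.get(designElement);
        Integer column = columnIndex.get(bioAssay);
        if (row == null || column == null) {
            return null; // assumption: an unknown key is treated like a missing value
        }
        return values[row][column]; // may itself be null when the value is missing
    }

    public static void main(String[] args) {
        Map<String, Integer> rows = Map.of("probe_1", 0, "probe_2", 1);
        Map<String, Integer> columns = Map.of("assay_A", 0, "assay_B", 1);
        Double[][] values = { { 1.5, null }, { 2.5, 3.5 } };
        System.out.println(get(rows, columns, values, "probe_2", "assay_B"));
        System.out.println(get(rows, columns, values, "probe_1", "assay_B"));
    }
}
```

Callers of such a method need a null check before unboxing the result.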
-
getRawMatrix
T[][] getRawMatrix()
Access the entire matrix.
- Returns:
- T[][]
-
getColumn
Access a single column of the matrix.
- Returns:
- a vector for the given column, or null if the column is not present
-
getColumnIndex
- Returns:
- the index of the column for the data for the bioAssay, or -1 if missing
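The -1 convention for getColumnIndex and the null return documented for getColumn combine naturally, as the sketch below suggests. It works on a plain list of assay names and a Double[][] array rather than Gemma's types, and ColumnAccessSketch and its methods are hypothetical stand-ins.

```java
import java.util.List;

/** Toy illustration of column lookup with a -1 sentinel and a null column. */
final class ColumnAccessSketch {

    /** Returns the position of the assay, or -1 if it is not a column of the matrix. */
    static int getColumnIndex(List<String> assays, String assay) {
        return assays.indexOf(assay); // List.indexOf already returns -1 when absent
    }

    /** Copies one column out of a row-major matrix, or returns null for a negative index. */
    static Double[] getColumn(Double[][] matrix, int columnIndex) {
        if (columnIndex < 0) {
            return null; // guard against the -1 sentinel before indexing
        }
        Double[] column = new Double[matrix.length];
        for (int row = 0; row < matrix.length; row++) {
            column[row] = matrix[row][columnIndex];
        }
        return column;
    }

    public static void main(String[] args) {
        List<String> assays = List.of("assay_A", "assay_B");
        Double[][] matrix = { { 1.0, 2.0 }, { 3.0, 4.0 } };
        Double[] column = getColumn(matrix, getColumnIndex(assays, "assay_B"));
        System.out.println(java.util.Arrays.toString(column));
        System.out.println(getColumn(matrix, getColumnIndex(assays, "assay_Z")));
    }
}
```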
-
getColumnIndex
-
getBioAssayForColumn
Obtain an assay corresponding to a given column.
-
getBioMaterialForColumn
Obtain a biomaterial corresponding to a column.
-