Interface BulkExpressionDataMatrix<T>
-
- All Superinterfaces:
ExpressionDataMatrix<T>
- All Known Subinterfaces:
BulkExpressionDataPrimitiveDoubleMatrix, BulkExpressionDataPrimitiveIntMatrix, MultiAssayBulkExpressionDataMatrix<T>
- All Known Implementing Classes:
AbstractBulkExpressionDataMatrix, AbstractMultiAssayExpressionDataMatrix, BulkExpressionDataDoubleMatrix, BulkExpressionDataIntMatrix, EmptyBulkExpressionDataMatrix, EmptyExpressionMatrix, ExpressionDataBooleanMatrix, ExpressionDataDoubleMatrix, ExpressionDataIntegerMatrix, ExpressionDataStringMatrix
public interface BulkExpressionDataMatrix<T> extends ExpressionDataMatrix<T>
Interface for bulk expression data matrices.
In a bulk expression data matrix, each column represents a sample.
Expression data is rather complex, so we have to handle some messy cases.
The key problem is how to unambiguously identify rows and columns in the matrix. This is greatly complicated by the fact that experiments can combine data from multiple array designs in various ways.
Put together, the result is that there can be more than one BioAssay per column, and the same BioMaterial can be used in multiple columns (supported implicitly). There can also be more than one BioMaterial in one column (not yet supported). The same BioSequence can be found in multiple rows. A row can contain data from more than one CompositeSequence. These cases are handled by the MultiAssayBulkExpressionDataMatrix interface and its corresponding implementations.
This interface assumes the simplest case, where each column is represented by a BioAssay and each row is represented by a CompositeSequence.
There are a few constraints: a particular CompositeSequence can only be used once, in a single row. At the moment we do not directly support technical replicates, though this should be possible. A BioAssay can only appear in a single column.
For some operations an ExpressionDataMatrixRowElement object is offered, which encapsulates a combination of a CompositeSequence, a BioSequence, and an index. The list of these can be useful for iterating over the rows of the matrix.
- Author:
- pavlidis, keshav
- See Also:
BioAssayDimension, BulkExpressionDataVector, MultiAssayBulkExpressionDataMatrix
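The constraints in the description above (one row per design element, one column per bioassay) can be illustrated with a minimal sketch. It is not the library's implementation: the class name `SimpleBulkMatrixSketch` is hypothetical, rows and columns are keyed by plain strings instead of the real `CompositeSequence` and `BioAssay` entities, and returning `null` for an unknown key is an assumption made for the illustration.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in: strings replace CompositeSequence and BioAssay.
class SimpleBulkMatrixSketch {
    private final Map<String, Integer> rowIndex;
    private final Map<String, Integer> columnIndex;
    private final Double[][] values; // assumed layout: values[row][column]

    SimpleBulkMatrixSketch(List<String> designElements, List<String> bioAssays, Double[][] values) {
        // Each design element may appear in only one row,
        // and each bioassay in only one column.
        this.rowIndex = index(designElements, "design element");
        this.columnIndex = index(bioAssays, "bioassay");
        this.values = values;
    }

    // Maps each key to its position and rejects duplicates.
    private static Map<String, Integer> index(List<String> keys, String what) {
        Map<String, Integer> result = new HashMap<>();
        for (int i = 0; i < keys.size(); i++) {
            if (result.put(keys.get(i), i) != null) {
                throw new IllegalArgumentException("Duplicate " + what + ": " + keys.get(i));
            }
        }
        return result;
    }

    // Returns the stored value, or null if the value is missing.
    // Returning null for an unknown key is an assumption of this sketch.
    Double get(String designElement, String bioAssay) {
        Integer i = rowIndex.get(designElement);
        Integer j = columnIndex.get(bioAssay);
        return (i == null || j == null) ? null : values[i][j];
    }
}
```

Keeping a separate index map per axis is what makes the uniqueness constraints easy to enforce at construction time.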
-
-
Method Summary
Modifier and Type | Method | Description
T | get(CompositeSequence designElement, BioAssay bioAssay) | Access a single value of the matrix.
BioAssayDimension | getBioAssayDimension() | Obtain the dimension for the columns of this matrix.
BioAssay | getBioAssayForColumn(int index) | Obtain an assay corresponding to a given column.
BioMaterial | getBioMaterialForColumn(int index) | Obtain a biomaterial corresponding to a column.
T[] | getColumn(BioAssay bioAssay) | Access a single column of the matrix.
int | getColumnIndex(BioAssay bioAssay) |
int | getColumnIndex(BioMaterial bioMaterial) |
static BulkExpressionDataMatrix<?> | getMatrix(Collection<? extends BulkExpressionDataVector> vectors) | Create a bulk expression data matrix from a collection of vectors.
T[][] | getRawMatrix() | Access the entire matrix.
boolean | hasMissingValues() |
-
Methods inherited from interface ubic.gemma.core.datastructure.matrix.ExpressionDataMatrix
columns, get, getColumn, getDesignElementForRow, getDesignElements, getExpressionExperiment, getQuantitationType, getRow, getRow, getRowElement, getRowElements, getRowIndex, getRowIndices, rows
-
-
-
-
Method Detail
-
getMatrix
static BulkExpressionDataMatrix<?> getMatrix(Collection<? extends BulkExpressionDataVector> vectors)
Create a bulk expression data matrix from a collection of vectors. All vectors must share the same QuantitationType.
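The shared-QuantitationType requirement could be checked along the following lines. This is only a sketch under stated assumptions: `QuantitationTypeCheckSketch`, its nested `Vector` class, and `requireSharedQuantitationType` are hypothetical names, the quantitation type is reduced to a string, and throwing `IllegalArgumentException` (including for an empty collection) is a choice made here, not documented behaviour of `getMatrix`.

```java
import java.util.Collection;
import java.util.HashSet;
import java.util.Set;

class QuantitationTypeCheckSketch {
    // Hypothetical stand-in for a data vector; only the quantitation type matters here.
    static class Vector {
        final String quantitationType;

        Vector(String quantitationType) {
            this.quantitationType = quantitationType;
        }
    }

    // Returns the single quantitation type shared by all vectors,
    // or throws if the collection is empty or mixes several types.
    static String requireSharedQuantitationType(Collection<Vector> vectors) {
        Set<String> types = new HashSet<>();
        for (Vector v : vectors) {
            types.add(v.quantitationType);
        }
        if (types.size() != 1) {
            throw new IllegalArgumentException(
                "Expected exactly one quantitation type, found " + types.size());
        }
        return types.iterator().next();
    }
}
```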
-
getBioAssayDimension
BioAssayDimension getBioAssayDimension()
Obtain the dimension for the columns of this matrix.
-
hasMissingValues
boolean hasMissingValues()
- Returns:
- true if any value is missing: null, NaN (for doubles and floats), or any other value that is considered missing
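For a matrix of boxed doubles, the null-or-NaN part of this check might look like the sketch below. `MissingValueSketch` is a hypothetical helper, not part of the library, and it deliberately ignores the "any other value that is considered missing" case, which depends on the element type.

```java
class MissingValueSketch {
    // Returns true as soon as a null or NaN entry is found.
    static boolean hasMissingValues(Double[][] matrix) {
        for (Double[] row : matrix) {
            for (Double value : row) {
                if (value == null || value.isNaN()) {
                    return true;
                }
            }
        }
        return false;
    }
}
```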
-
get
@Nullable T get(CompositeSequence designElement, BioAssay bioAssay)
Access a single value of the matrix. Note that because there can be multiple bioassays per column and multiple design elements per row, this method may return data that does not come from the given bioassay and/or design element.
- Parameters:
designElement - the design element identifying the row
bioAssay - the bioassay identifying the column
- Returns:
- the value at the given design element and bioassay, or null if the value is missing
-
getRawMatrix
T[][] getRawMatrix()
Access the entire matrix.
- Returns:
- T[][]
-
getColumn
@Nullable T[] getColumn(BioAssay bioAssay)
Access a single column of the matrix.
- Returns:
- a vector for the given column, or null if the column is not present
-
getColumnIndex
int getColumnIndex(BioAssay bioAssay)
- Returns:
- the index of the column for the data for the bioAssay, or -1 if missing
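Because a missing column is signalled by -1 and not by an exception, callers need to check the index before using it. The sketch below shows one way to combine such an index with a raw two-dimensional array to extract a column. `ColumnLookupSketch` is hypothetical, and the `raw[row][column]` layout is an assumption, not something this page states about `getRawMatrix()`.

```java
class ColumnLookupSketch {
    // Extracts one column from a raw matrix, or returns null when the
    // index is the -1 "missing" sentinel (mirroring getColumn's contract).
    static Double[] column(Double[][] raw, int columnIndex) {
        if (columnIndex < 0) {
            return null;
        }
        Double[] result = new Double[raw.length];
        for (int row = 0; row < raw.length; row++) {
            result[row] = raw[row][columnIndex]; // assumed layout: raw[row][column]
        }
        return result;
    }
}
```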
-
getColumnIndex
int getColumnIndex(BioMaterial bioMaterial)
-
getBioAssayForColumn
BioAssay getBioAssayForColumn(int index)
Obtain an assay corresponding to a given column.
-
getBioMaterialForColumn
BioMaterial getBioMaterialForColumn(int index)
Obtain a biomaterial corresponding to a column.
-
-