Interface ExpressionDataMatrix<T>
-
- All Known Implementing Classes:
BaseExpressionDataMatrix, EmptyExpressionMatrix, ExpressionDataBooleanMatrix, ExpressionDataDoubleMatrix, ExpressionDataIntegerMatrix, ExpressionDataStringMatrix
public interface ExpressionDataMatrix<T>
Represents a matrix of data from an expression experiment. Expression data is rather complex, so we have to handle some messy cases. The key problem is how to unambiguously identify rows and columns in the matrix. This is greatly complicated by the fact that experiments can combine data from multiple array designs in various ways.
Put together, the result is that there can be more than one BioAssay per column, and the same BioMaterial can be used in multiple columns (supported implicitly). There can also be more than one BioMaterial in one column (we don't support this yet). The same BioSequence can be found in multiple rows. A row can contain data from more than one DesignElement.
There are a few constraints: a particular DesignElement can only be used once, in a single row, and a BioAssay can only appear in a single column. At the moment we do not directly support technical replicates, though this should be possible.
For some operations an ExpressionDataMatrixRowElement object is offered, which encapsulates a combination of DesignElements, a BioSequence, and an index. The list of these can be useful for iterating over the rows of the matrix.
- Author:
- pavlidis, keshav
-
-
Method Summary
Modifier and Type | Method | Description
int | columns() | Total number of columns.
int | columns(CompositeSequence el) | Number of columns that use the given design element.
T | get(int row, int column) | Access a single value of the matrix.
T[][] | get(List<CompositeSequence> designElements, List<BioAssay> bioAssays) | Access a submatrix.
T | get(CompositeSequence designElement, BioAssay bioAssay) | Access a single value of the matrix.
BioAssayDimension | getBestBioAssayDimension() |
BioAssayDimension | getBioAssayDimension(CompositeSequence designElement) | Produce a BioAssayDimension representing the matrix columns for a specific row.
Collection<BioAssay> | getBioAssaysForColumn(int index) |
BioMaterial | getBioMaterialForColumn(int index) |
T[] | getColumn(Integer column) | Access a single column of the matrix.
T[] | getColumn(BioAssay bioAssay) | Access a single column of the matrix.
int | getColumnIndex(BioMaterial bioMaterial) |
T[][] | getColumns(List<BioAssay> bioAssays) | Access a submatrix slice by columns.
CompositeSequence | getDesignElementForRow(int index) |
List<CompositeSequence> | getDesignElements() | Obtain all the design elements in this data matrix.
ExpressionExperiment | getExpressionExperiment() | Return the expression experiment this matrix is holding data for.
Collection<QuantitationType> | getQuantitationTypes() | Return the quantitation types for this matrix.
T[][] | getRawMatrix() | Access the entire matrix.
T[] | getRow(Integer index) | Access a single row of the matrix, by index.
T[] | getRow(CompositeSequence designElement) | Return a row that 'came from' the given design element.
List<ExpressionDataMatrixRowElement> | getRowElements() |
int | getRowIndex(CompositeSequence designElement) |
T[][] | getRows(List<CompositeSequence> designElements) | Access a submatrix.
boolean | hasMissingValues() |
int | rows() |
void | set(int row, int column, T value) | Set a value in the matrix, by index.
-
-
-
Method Detail
-
columns
int columns()
Total number of columns.
- Returns:
- int
-
columns
int columns(CompositeSequence el)
Number of columns that use the given design element. Useful if the matrix includes data from more than one array design.
- Parameters:
el - the design element
- Returns:
- int
-
get
T get(CompositeSequence designElement, BioAssay bioAssay)
Access a single value of the matrix. Note that because there can be multiple BioAssays per column and multiple DesignElements per row, this method can return a value that does not come from the given BioAssay and/or DesignElement.
- Parameters:
designElement - the design element (row)
bioAssay - the bioassay (column)
- Returns:
- T t
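The caveat above can be illustrated with a minimal sketch. It is not the library's implementation: `KeyedLookupSketch`, its string identifiers, and the `null` result for unknown keys are all assumptions made for illustration. The sketch resolves each key to an index and then reads the cell, so two assays mapped to the same column return the same stored value.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in: strings replace CompositeSequence and BioAssay.
final class KeyedLookupSketch {
    private final Map<String, Integer> rowByElement = new HashMap<>();
    private final Map<String, Integer> columnByAssay = new HashMap<>();
    private final Double[][] values;

    KeyedLookupSketch(Double[][] values) {
        this.values = values;
    }

    void mapElement(String elementId, int row) {
        rowByElement.put(elementId, row);
    }

    void mapAssay(String assayId, int column) {
        columnByAssay.put(assayId, column);
    }

    // Resolve both keys to indices, then read the cell.
    // Returning null for an unknown key is an assumption of this sketch.
    Double get(String elementId, String assayId) {
        Integer row = rowByElement.get(elementId);
        Integer column = columnByAssay.get(assayId);
        if (row == null || column == null) {
            return null;
        }
        return values[row][column];
    }

    // One row, one column, two assays sharing that column.
    static KeyedLookupSketch example() {
        KeyedLookupSketch m = new KeyedLookupSketch(new Double[][] { { 7.5 } });
        m.mapElement("probe1", 0);
        m.mapAssay("assayA", 0);
        m.mapAssay("assayB", 0);
        return m;
    }

    public static void main(String[] args) {
        KeyedLookupSketch m = example();
        System.out.println(m.get("probe1", "assayA"));
        System.out.println(m.get("probe1", "assayB"));
    }
}
```

Both lookups read the same cell, so the caller cannot tell from the value alone which assay actually produced it.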
-
get
T get(int row, int column)
Access a single value of the matrix. This is generally the easiest way to do it.
- Parameters:
row - row index
column - column index
- Returns:
- t
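As a rough sketch of index-based access, the following walks every cell using the rows(), columns(), get(int, int) and set(int, int, T) shapes documented here. `IndexAccessSketch` is a hypothetical array-backed stand-in fixed to `Double`, not one of the implementing classes, and `nonNullValues` is a helper invented for the example.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in exposing only the index-based methods.
final class IndexAccessSketch {
    private final Double[][] values;

    IndexAccessSketch(Double[][] values) {
        this.values = values;
    }

    int rows() {
        return values.length;
    }

    int columns() {
        return values.length == 0 ? 0 : values[0].length;
    }

    Double get(int row, int column) {
        return values[row][column];
    }

    void set(int row, int column, Double value) {
        values[row][column] = value;
    }

    // Visit cells in row-major order and collect the non-null values.
    static List<Double> nonNullValues(IndexAccessSketch m) {
        List<Double> out = new ArrayList<>();
        for (int i = 0; i < m.rows(); i++) {
            for (int j = 0; j < m.columns(); j++) {
                Double v = m.get(i, j);
                if (v != null) {
                    out.add(v);
                }
            }
        }
        return out;
    }

    public static void main(String[] args) {
        IndexAccessSketch m = new IndexAccessSketch(new Double[][] {
            { 1.0, 2.0 },
            { null, 4.0 }
        });
        m.set(1, 0, 3.0); // fill the empty cell by index
        System.out.println(nonNullValues(m));
    }
}
```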
-
get
T[][] get(List<CompositeSequence> designElements, List<BioAssay> bioAssays)
Access a submatrix.
- Parameters:
designElements - the design elements (rows) to include
bioAssays - the bioassays (columns) to include
- Returns:
- T[][]
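One way to picture submatrix access is shown below, assuming the design elements and bioassays have already been resolved to row and column indices. `SubmatrixSketch` and its `submatrix` helper are hypothetical; how the implementing classes perform the selection is not stated here.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical helper: selects rows and columns by index, in the order given.
final class SubmatrixSketch {
    static Double[][] submatrix(Double[][] values,
                                List<Integer> rowIndices,
                                List<Integer> columnIndices) {
        Double[][] out = new Double[rowIndices.size()][columnIndices.size()];
        for (int i = 0; i < rowIndices.size(); i++) {
            for (int j = 0; j < columnIndices.size(); j++) {
                out[i][j] = values[rowIndices.get(i)][columnIndices.get(j)];
            }
        }
        return out;
    }

    public static void main(String[] args) {
        Double[][] full = {
            { 1.0, 2.0, 3.0 },
            { 4.0, 5.0, 6.0 },
            { 7.0, 8.0, 9.0 }
        };
        // Keep rows 0 and 2, columns 1 and 2.
        Double[][] sub = submatrix(full, Arrays.asList(0, 2), Arrays.asList(1, 2));
        System.out.println(Arrays.deepToString(sub));
    }
}
```

The result follows the order of the requested indices, not the order of the original matrix, which is a choice made in this sketch only.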
-
getBestBioAssayDimension
BioAssayDimension getBestBioAssayDimension()
- Returns:
- The BioAssayDimension that covers all the BioMaterials in this matrix.
- Throws:
IllegalStateException
- if there isn't a single bioassaydimension that encapsulates all the biomaterials used in the experiment.
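The contract described here (return a dimension covering every biomaterial, otherwise throw) might look roughly like the sketch below. It is an illustration under stated assumptions: each dimension is modelled as a set of biomaterial identifiers, `BestDimensionSketch.best` is a hypothetical helper, and picking the first covering candidate is a simplification, since the actual selection rule is not documented here.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical model: a "dimension" is just the set of biomaterial ids it spans.
final class BestDimensionSketch {
    static Set<String> best(List<Set<String>> dimensions, Set<String> matrixBioMaterials) {
        for (Set<String> dimension : dimensions) {
            // Assumption: the first dimension that covers everything is good enough.
            if (dimension.containsAll(matrixBioMaterials)) {
                return dimension;
            }
        }
        throw new IllegalStateException(
                "No single dimension covers all biomaterials in the matrix");
    }

    public static void main(String[] args) {
        Set<String> inMatrix = new HashSet<>(Arrays.asList("bm1", "bm2"));
        Set<String> partial = new HashSet<>(Arrays.asList("bm1"));
        Set<String> full = new HashSet<>(Arrays.asList("bm1", "bm2", "bm3"));

        System.out.println(best(Arrays.asList(partial, full), inMatrix));
        try {
            best(Arrays.asList(partial), inMatrix);
        } catch (IllegalStateException e) {
            System.out.println("No covering dimension: " + e.getMessage());
        }
    }
}
```

Callers that cannot guarantee a single covering dimension may want to catch the IllegalStateException, as the usage above does.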
-
getBioAssayDimension
BioAssayDimension getBioAssayDimension(CompositeSequence designElement)
Produce a BioAssayDimension representing the matrix columns for a specific row. The designElement argument is needed because a matrix can combine data from multiple array designs, each of which will generate its own BioAssayDimension. Note that if this represents a subsetted data set, the return value may be a lightweight 'fake'.
- Parameters:
designElement - the design element identifying the row
- Returns:
- the BioAssayDimension
-
getBioAssaysForColumn
Collection<BioAssay> getBioAssaysForColumn(int index)
- Parameters:
index - column index
- Returns:
- bioassays that contribute data to the column. There can be multiple bioassays if more than one array was used in the study.
-
getBioMaterialForColumn
BioMaterial getBioMaterialForColumn(int index)
- Parameters:
index - column index
- Returns:
- BioMaterial. Note that if this represents a subsetted data set, the BioMaterial may be a lightweight 'fake'.
-
getColumn
T[] getColumn(BioAssay bioAssay)
Access a single column of the matrix.
- Parameters:
bioAssay - the bioassay whose column is returned
- Returns:
- T[]
-
getColumn
T[] getColumn(Integer column)
Access a single column of the matrix.
- Parameters:
column - column index
- Returns:
- T[]
-
getColumnIndex
int getColumnIndex(BioMaterial bioMaterial)
- Parameters:
bioMaterial - the biomaterial
- Returns:
- the index of the column for the data for the bioMaterial.
-
getColumns
T[][] getColumns(List<BioAssay> bioAssays)
Access a submatrix slice by columns.
- Parameters:
bioAssays - the bioassays (columns) to include
- Returns:
- T[][]
-
getDesignElements
List<CompositeSequence> getDesignElements()
Obtain all the design elements in this data matrix.
-
getDesignElementForRow
CompositeSequence getDesignElementForRow(int index)
- Parameters:
index - row index
- Returns:
- the CompositeSequence for the row
-
getExpressionExperiment
ExpressionExperiment getExpressionExperiment()
Return the expression experiment this matrix is holding data for.
- Returns:
- the ExpressionExperiment
-
getQuantitationTypes
Collection<QuantitationType> getQuantitationTypes()
Return the quantitation types for this matrix. Often (usually) there will be just one.
- Returns:
- the quantitation types
-
getRawMatrix
T[][] getRawMatrix()
Access the entire matrix.
- Returns:
- T[][]
-
getRow
T[] getRow(CompositeSequence designElement)
Return a row that 'came from' the given design element.
- Parameters:
designElement - the design element
- Returns:
- T[]
-
getRow
T[] getRow(Integer index)
Access a single row of the matrix, by index. A complete row is returned.
- Parameters:
index - row index
- Returns:
- T[]
-
getRowElements
List<ExpressionDataMatrixRowElement> getRowElements()
- Returns:
- list of elements representing the row 'labels'.
-
getRowIndex
int getRowIndex(CompositeSequence designElement)
-
getRows
T[][] getRows(List<CompositeSequence> designElements)
Access a submatrix.
- Parameters:
designElements - the design elements (rows) to include
- Returns:
- T[][]
-
hasMissingValues
boolean hasMissingValues()
- Returns:
- true if any values are null or NaN (for Doubles); all other values are considered non-missing.
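The rule stated above (null, or NaN for Doubles, counts as missing; everything else does not) can be sketched as follows. `MissingValuesSketch` is a hypothetical helper operating on a plain array, not the code used by the implementing classes.

```java
// Hypothetical helper applying the documented missing-value rule to a raw array.
final class MissingValuesSketch {
    static boolean hasMissingValues(Object[][] values) {
        for (Object[] row : values) {
            for (Object v : row) {
                if (v == null) {
                    return true; // null is missing for every element type
                }
                if (v instanceof Double && ((Double) v).isNaN()) {
                    return true; // NaN is missing only for Doubles
                }
            }
        }
        return false;
    }

    public static void main(String[] args) {
        System.out.println(hasMissingValues(new Double[][] { { 1.0, Double.NaN } }));
        System.out.println(hasMissingValues(new String[][] { { "NaN", "" } }));
    }
}
```

Under this rule a String such as "NaN" or an empty String is still a value, so a String matrix only reports missing data when a cell is null.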
-
rows
int rows()
- Returns:
- int
-
set
void set(int row, int column, T value)
Set a value in the matrix, by index.
- Parameters:
row - row index
column - column index
value - the value to set
-
-