Interface ExpressionDataMatrix<T>

  • All Known Implementing Classes:
    BaseExpressionDataMatrix, EmptyExpressionMatrix, ExpressionDataBooleanMatrix, ExpressionDataDoubleMatrix, ExpressionDataIntegerMatrix, ExpressionDataStringMatrix

    public interface ExpressionDataMatrix<T>
    Represents a matrix of data from an expression experiment. Expression data is rather complex, so we have to handle some messy cases. The key problem is how to unambiguously identify rows and columns in the matrix. This is greatly complicated by the fact that experiments can combine data from multiple array designs in various ways. As a result, there can be more than one BioAssay per column, and the same BioMaterial can be used in multiple columns (supported implicitly). There can also be more than one BioMaterial in one column (we don't support this yet). The same BioSequence can be found in multiple rows. A row can contain data from more than one DesignElement.
    There are a few constraints: a particular DesignElement can only be used once, in a single row. At the moment we do not directly support technical replicates, though this should be possible. A BioAssay can only appear in a single column.
    For some operations an ExpressionDataMatrixRowElement object is offered, which encapsulates a combination of DesignElements, a BioSequence, and an index. The list of these can be useful for iterating over the rows of the matrix.
    Author:
    pavlidis, keshav
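    The layout rules in the description above (several design elements per row, several bioassays per column, but each design element in exactly one row and each bioassay in exactly one column) can be illustrated with a minimal sketch. Everything in it is hypothetical: MatrixLayoutSketch and its methods are not part of the library, and plain String identifiers stand in for DesignElement and BioAssay objects.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/**
 * Hypothetical bookkeeping for the layout rules described above.
 * String identifiers stand in for DesignElement and BioAssay objects;
 * none of these names come from the documented library.
 */
final class MatrixLayoutSketch {
    private final Map<String, Integer> rowByDesignElement = new HashMap<>();
    private final Map<String, Integer> columnByBioAssay = new HashMap<>();

    /** A row may hold several design elements, but each design element belongs to one row only. */
    void addRow(int row, List<String> designElements) {
        for (String designElement : designElements) {
            register(rowByDesignElement, designElement, row, "Design element");
        }
    }

    /** A column may hold several bioassays, but each bioassay belongs to one column only. */
    void addColumn(int column, List<String> bioAssays) {
        for (String bioAssay : bioAssays) {
            register(columnByBioAssay, bioAssay, column, "BioAssay");
        }
    }

    /** Row index for a design element, or -1 if it is not in the layout. */
    int rowOf(String designElement) {
        return rowByDesignElement.getOrDefault(designElement, -1);
    }

    /** Column index for a bioassay, or -1 if it is not in the layout. */
    int columnOf(String bioAssay) {
        return columnByBioAssay.getOrDefault(bioAssay, -1);
    }

    /** Records key -> position and rejects a second, different position for the same key. */
    private static void register(Map<String, Integer> index, String key, int position, String kind) {
        Integer previous = index.putIfAbsent(key, position);
        if (previous != null && previous != position) {
            throw new IllegalArgumentException(
                    kind + " " + key + " is already assigned to index " + previous);
        }
    }
}
```

    Keeping one map per axis makes the "used only once" constraints cheap to check when the layout is built, while still allowing many keys to share the same row or column index.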
    • Method Detail

      • columns

        int columns()
        Total number of columns.
        Returns:
        int
      • columns

        int columns(CompositeSequence el)
        Number of columns that use the given design element. Useful if the matrix includes data from more than one array design.
        Parameters:
        el - the design element
        Returns:
        the number of columns that use the design element
      • get

        T get(CompositeSequence designElement,
              BioAssay bioAssay)
        Access a single value of the matrix. Note that because there can be multiple bioassays per column and multiple design elements per row, it is possible for this method to retrieve a value that does not come from the bioassay and/or design element given as arguments.
        Parameters:
        designElement - the design element identifying the row
        bioAssay - the bioassay identifying the column
        Returns:
        the value at that position
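        The caveat above follows from how keys resolve to cells: several design elements can share a row and several bioassays can share a column, so different key pairs can land on the same value. The following is a minimal sketch of that resolution, not the library's implementation. KeyedLookupSketch is a hypothetical class, String identifiers stand in for CompositeSequence and BioAssay, and returning null for an unknown key is an assumption made for the example.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical keyed lookup; String keys stand in for CompositeSequence and BioAssay. */
final class KeyedLookupSketch {
    private final Map<String, Integer> rowIndex = new HashMap<>();
    private final Map<String, Integer> columnIndex = new HashMap<>();
    private final Double[][] values;

    KeyedLookupSketch(Double[][] values) {
        this.values = values;
    }

    /** Several design elements may be mapped to the same row. */
    void mapDesignElement(String designElement, int row) {
        rowIndex.put(designElement, row);
    }

    /** Several bioassays may be mapped to the same column. */
    void mapBioAssay(String bioAssay, int column) {
        columnIndex.put(bioAssay, column);
    }

    /**
     * Resolves both keys to indices and returns the cell they point at.
     * Returns null when either key is unknown (an assumption of this sketch).
     */
    Double get(String designElement, String bioAssay) {
        Integer row = rowIndex.get(designElement);
        Integer column = columnIndex.get(bioAssay);
        if (row == null || column == null) {
            return null;
        }
        return values[row][column];
    }
}
```

        With two bioassays mapped to one column, asking for either of them returns the same cell, even though the stored value originated from only one of them.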
      • get

        T get(int row,
              int column)
        Access a single value of the matrix. This is generally the easiest way to do it.
        Parameters:
        row - the row index
        column - the column index
        Returns:
        the value at that position
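        A minimal sketch of index-based traversal with rows(), columns(), and get(int, int). The real interface depends on the library's domain classes, so the sketch declares a hypothetical cut-down interface, IndexedMatrix, that keeps only those three methods. ArrayBackedMatrix and rowMeans are illustrative helpers and are not part of the API.

```java
import java.util.Arrays;

/** Hypothetical stand-in that keeps only the index-based methods of ExpressionDataMatrix<T>. */
interface IndexedMatrix<T> {
    int rows();

    int columns();

    T get(int row, int column);
}

/** Illustrative array-backed implementation; not a class from the documented library. */
final class ArrayBackedMatrix implements IndexedMatrix<Double> {
    private final Double[][] values;

    ArrayBackedMatrix(Double[][] values) {
        this.values = values;
    }

    @Override
    public int rows() {
        return values.length;
    }

    @Override
    public int columns() {
        return values.length == 0 ? 0 : values[0].length;
    }

    @Override
    public Double get(int row, int column) {
        return values[row][column];
    }
}

final class IndexedAccessSketch {
    /** Mean of each row, skipping null and NaN entries; NaN when a row has no usable values. */
    static double[] rowMeans(IndexedMatrix<Double> matrix) {
        double[] means = new double[matrix.rows()];
        for (int i = 0; i < matrix.rows(); i++) {
            double sum = 0.0;
            int count = 0;
            for (int j = 0; j < matrix.columns(); j++) {
                Double value = matrix.get(i, j);
                if (value != null && !value.isNaN()) {
                    sum += value;
                    count++;
                }
            }
            means[i] = count == 0 ? Double.NaN : sum / count;
        }
        return means;
    }

    public static void main(String[] args) {
        IndexedMatrix<Double> matrix = new ArrayBackedMatrix(new Double[][] {
                { 1.0, 3.0, null },
                { 2.0, Double.NaN, 4.0 }
        });
        System.out.println(Arrays.toString(rowMeans(matrix)));
    }
}
```

        Index-based access sidesteps the ambiguity of design element and bioassay keys, which is why it suits whole-matrix passes such as this one.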
      • getBestBioAssayDimension

        BioAssayDimension getBestBioAssayDimension()
        Returns:
        The bioassaydimension that covers all the biomaterials in this matrix.
        Throws:
        IllegalStateException - if there isn't a single bioassaydimension that encapsulates all the biomaterials used in the experiment.
      • getBioAssayDimension

        BioAssayDimension getBioAssayDimension(CompositeSequence designElement)
        Produce a BioAssayDimension representing the matrix columns for a specific row. The design element argument is needed because a matrix can combine data from multiple array designs, each of which will generate its own BioAssayDimension. Note that if this represents a subsetted data set, the return value may be a lightweight 'fake'.
        Parameters:
        designElement - the design element identifying the row
        Returns:
        the BioAssayDimension for that row
      • getBioAssaysForColumn

        Collection<BioAssay> getBioAssaysForColumn(int index)
        Parameters:
        index - the column index
        Returns:
        bioassays that contribute data to the column. There can be multiple bioassays if more than one array was used in the study.
      • getBioMaterialForColumn

        BioMaterial getBioMaterialForColumn(int index)
        Parameters:
        index - the column index
        Returns:
        BioMaterial. Note that if this represents a subsetted data set, the BioMaterial may be a lightweight 'fake'.
      • getColumn

        T[] getColumn(BioAssay bioAssay)
        Access a single column of the matrix.
        Parameters:
        bioAssay - the bioassay identifying the column
        Returns:
        T[]
      • getColumn

        T[] getColumn(Integer column)
        Access a single column of the matrix.
        Parameters:
        column - index
        Returns:
        T[]
      • getColumnIndex

        int getColumnIndex(BioMaterial bioMaterial)
        Parameters:
        bioMaterial - the biomaterial to look up
        Returns:
        the index of the column for the data for the bioMaterial.
      • getColumns

        T[][] getColumns(List<BioAssay> bioAssays)
        Access a submatrix slice by columns.
        Parameters:
        bioAssays - the bioassays identifying the columns
        Returns:
        T[][]
      • getDesignElements

        List<CompositeSequence> getDesignElements()
        Obtain all the design elements in this data matrix.
      • getDesignElementForRow

        CompositeSequence getDesignElementForRow(int index)
        Parameters:
        index - the row index
        Returns:
        the design element for that row
      • getExpressionExperiment

        ExpressionExperiment getExpressionExperiment()
        Return the expression experiment this matrix is holding data for.
        Returns:
        the expression experiment
      • getQuantitationTypes

        Collection<QuantitationType> getQuantitationTypes()
        Return the quantitation types for this matrix. Often (usually) there will be just one.
        Returns:
        the quantitation types
      • getRawMatrix

        T[][] getRawMatrix()
        Access the entire matrix.
        Returns:
        T[][]
      • getRow

        T[] getRow(CompositeSequence designElement)
        Return a row that 'came from' the given design element.
        Parameters:
        designElement - the design element identifying the row
        Returns:
        T[]
      • getRow

        T[] getRow(Integer index)
        Access a single row of the matrix, by index. A complete row is returned.
        Parameters:
        index - the row index
        Returns:
        T[]
      • getRows

        T[][] getRows(List<CompositeSequence> designElements)
        Access a submatrix slice by rows.
        Parameters:
        designElements - the design elements identifying the rows
        Returns:
        T[][]
      • hasMissingValues

        boolean hasMissingValues()
        Returns:
        true if any values are null or NaN (for Doubles); all other values are considered non-missing.
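        The rule stated above (null or NaN counts as missing, everything else as present) can be sketched as a generic scan over a raw T[][] array. This is an illustration under that reading of the contract, not the library's implementation; MissingValueSketch is a hypothetical class.

```java
final class MissingValueSketch {
    /**
     * True if any cell is null or is a Double holding NaN.
     * Every other value, including the String "NaN", counts as present.
     */
    static <T> boolean hasMissingValues(T[][] matrix) {
        for (T[] row : matrix) {
            for (T value : row) {
                if (value == null) {
                    return true;
                }
                if (value instanceof Double && ((Double) value).isNaN()) {
                    return true;
                }
            }
        }
        return false;
    }
}
```

        The NaN check is limited to Double because NaN is only defined for floating-point values; for Boolean, Integer, or String matrices only null can be missing under this rule.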
      • rows

        int rows()
        Returns:
        the total number of rows
      • set

        void set(int row,
                 int column,
                 T value)
        Set a value in the matrix, by index.
        Parameters:
        row - the row index
        column - the column index
        value - the value to store