Class TwoChannelMissingValuesImpl

java.lang.Object
ubic.gemma.core.analysis.preprocess.TwoChannelMissingValuesImpl
All Implemented Interfaces:
TwoChannelMissingValues

@Component public class TwoChannelMissingValuesImpl extends Object implements TwoChannelMissingValues
Computes a missing value matrix for ratiometric data sets.

Supported formats and special cases:

  • Genepix: CH1B_MEDIAN etc (various versions)
  • Incyte GEMTools: RAW_DATA etc (no background values)
  • Quantarray: CH1_BKD etc
  • F635.Median / F532.Median (Genepix as rendered in some data sets)
  • CH1_SMTM (found in GPL230)
  • Caltech (GPL260)
  • Agilent (Ch2BkgMedian etc or CH2_SIG_MEAN etc)
  • GSE3251 (ch1.Background etc)
  • GPL560 (*_CY3 vs *CY5)
  • GSE1501 (NormCH2)

Missing values are computed according to the following rules, depending on which data are available:

  1. If the preferred quantitation type data is a missing value, then the data are considered missing (for consistency).
  2. We then do additional checks if there is 'signal' data available.
  3. If there are background values, they are used to compute signal-to-noise ratios.
  4. If the signal values already contain missing data, these are still considered missing.
  5. If there are no background values, we try to compute a threshold based on a quantile of the signal.
  6. Otherwise, values will be considered 'present' unless the signal values are zero or missing.
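The rules above can be sketched for a single spot as follows. This is only an illustration under stated assumptions, not the code of this class: the names MissingValueSketch, channelPasses, isPresent and quantile are hypothetical, NaN stands in for a missing value or an unavailable background, and the quantile level and the strict ">" comparisons are guesses, since the description does not specify them.

```java
import java.util.Arrays;

// Hypothetical sketch of the per-spot decision rules; not the Gemma implementation.
class MissingValueSketch {

    // Assumed helper: value at quantile q (0..1) of the non-missing signals.
    // Which quantile the real code uses is not stated in the description.
    static double quantile(double[] signals, double q) {
        double[] sorted = Arrays.stream(signals)
                .filter(v -> !Double.isNaN(v))
                .sorted()
                .toArray();
        if (sorted.length == 0) {
            return Double.NaN; // no usable signal, so no threshold
        }
        return sorted[(int) Math.floor(q * (sorted.length - 1))];
    }

    // Checks one channel. NaN means "missing" (signal) or "not available"
    // (background, quantileThreshold).
    static boolean channelPasses(double signal, double background,
                                 double snThreshold, double quantileThreshold) {
        if (Double.isNaN(signal)) {
            return false; // rule 4: a missing signal stays missing
        }
        if (!Double.isNaN(background)) {
            return signal > snThreshold * background; // rule 3: signal-to-noise
        }
        if (!Double.isNaN(quantileThreshold)) {
            return signal > quantileThreshold; // rule 5: quantile-based threshold
        }
        return signal != 0.0; // rule 6: present unless zero
    }

    // A spot is present if the preferred value exists and at least one channel passes.
    static boolean isPresent(double preferred,
                             double signalA, double backgroundA,
                             double signalB, double backgroundB,
                             double snThreshold, double quantileThreshold) {
        if (Double.isNaN(preferred)) {
            return false; // rule 1: missing preferred value means missing
        }
        return channelPasses(signalA, backgroundA, snThreshold, quantileThreshold)
                || channelPasses(signalB, backgroundB, snThreshold, quantileThreshold);
    }
}
```

For a platform without background values, a caller of this sketch would compute quantile(...) once over the signal values and pass the result as quantileThreshold; passing NaN falls through to the last rule.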
Author:
pavlidis
  • Constructor Details

    • TwoChannelMissingValuesImpl

      public TwoChannelMissingValuesImpl()
  • Method Details

    • computeMissingValues

      @Transactional public Collection<RawExpressionDataVector> computeMissingValues(ExpressionExperiment ee)
      Specified by:
      computeMissingValues in interface TwoChannelMissingValues
      Parameters:
      ee - The expression experiment to analyze. The quantitation types to use are selected automatically. For more control, use one of the other computeMissingValues methods.
      Returns:
      collection of raw expression data vectors
    • computeMissingValues

      @Transactional public Collection<RawExpressionDataVector> computeMissingValues(ExpressionExperiment ee, double signalToNoiseThreshold, @Nullable Collection<Double> extraMissingValueIndicators)
      Specified by:
      computeMissingValues in interface TwoChannelMissingValues
      Parameters:
      ee - The expression experiment to analyze. The quantitation types to use are selected automatically.
      signalToNoiseThreshold - A value such as 1.5 or 2.0; a spot is considered present only if the signal in at least ONE channel is more than signalToNoiseThreshold*background (and the preferred data are not missing).
      extraMissingValueIndicators - Values that should be considered missing. For example, some data sets use '0'. This can be null or empty, in which case it is ignored.
      Returns:
      DesignElementDataVectors corresponding to a new PRESENTCALL quantitation type for the experiment.
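
The handling of extraMissingValueIndicators described above might look roughly like this. It is a minimal sketch, not this class's code: ExtraMissingIndicatorSketch and isIndicatedMissing are hypothetical names, and using NaN for an already-missing value is an assumption.

```java
import java.util.Collection;

// Hypothetical sketch of how extra missing-value indicators could be applied.
class ExtraMissingIndicatorSketch {

    // Returns true if the value should be treated as missing.
    static boolean isIndicatedMissing(double value,
                                      Collection<Double> extraMissingValueIndicators) {
        if (Double.isNaN(value)) {
            return true; // assumed encoding of an already-missing value
        }
        if (extraMissingValueIndicators == null || extraMissingValueIndicators.isEmpty()) {
            return false; // null or empty: the parameter is ignored
        }
        // Boxed comparison via Double.equals, so 0.0 and -0.0 are distinct here.
        return extraMissingValueIndicators.contains(value);
    }
}
```

With a collection containing 0.0, a value of 0.0 would be flagged as missing, while the same value passes through when the collection is null or empty.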