Class ExperimentalDesignUtils
- java.lang.Object
  - ubic.gemma.model.expression.experiment.ExperimentalDesignUtils
@ParametersAreNonnullByDefault public class ExperimentalDesignUtils extends Object
- Author: paul
Field Summary
Fields (modifier and type, field, description)
- static List<Category> BATCH_FACTOR_CATEGORIES
  A list of all categories considered to be batch.
- static String BATCH_FACTOR_NAME
  Name used by a batch factor.
- static String BIO_MATERIAL_RNAME_PREFIX
- static String FACTOR_RNAME_PREFIX
- static String FACTOR_VALUE_BASELINE_SUFFIX
- static String FACTOR_VALUE_RNAME_PREFIX
-
Constructor Summary
Constructors
- ExperimentalDesignUtils()
-
Method Summary
Methods (modifier and type, method, description)
- static ObjectMatrix<BioMaterial,ExperimentalFactor,Object> buildDesignMatrix(List<ExperimentalFactor> factors, List<BioMaterial> samplesUsed, boolean allowMissingValues)
  Build a design matrix for the given factors and samples.
- static ObjectMatrix<String,String,Object> buildRDesignMatrix(List<ExperimentalFactor> factors, List<BioMaterial> samplesUsed, boolean allowMissingValues)
  Build an R-friendly design matrix.
- static ObjectMatrix<String,String,Object> buildRDesignMatrix(List<ExperimentalFactor> factors, List<BioMaterial> samplesUsed, Map<ExperimentalFactor,FactorValue> baselines, boolean allowMissingValues)
  A variant of buildRDesignMatrix(List, List, boolean) that allows for reusing baselines for repeated calls.
- static Map<ExperimentalFactor,FactorValue> getBaselineConditions(List<BioMaterial> samplesUsed, List<ExperimentalFactor> factors)
- static Map<ExperimentalFactor,FactorValue> getBaselineLevels(Collection<ExperimentalFactor> factors)
  Identify the FactorValue that should be treated as 'Baseline' for each of the given factors.
- static Map<ExperimentalFactor,FactorValue> getBaselineLevels(Collection<ExperimentalFactor> factors, List<BioMaterial> samplesUsed)
  Identify the FactorValue that should be treated as 'Baseline' for each of the given factors.
- static List<ExperimentalFactor> getOrderedFactors(Collection<ExperimentalFactor> factors)
  Sort factors in a consistent way.
- static Map<BioMaterial,Set<FactorValue>> getSampleToFactorValuesMap(ExperimentalFactor factor, Collection<BioMaterial> samplesUsed)
  Create a sample to factor value mapping.
- static boolean isBatchFactor(ExperimentalFactor ef)
  Check if a factor is a batch factor.
- static boolean isBatchFactor(ExperimentalFactorValueObject ef)
  Check if a given factor VO is a batch factor.
- static boolean isComplete(ExperimentalFactor factor, List<BioMaterial> samplesUsed)
  Check if a factor has missing values (samples that lack an assigned value).
- static double measurement2double(Measurement measurement)
  Convert a measurement to a double.
- static String nameForR(BioMaterial sample)
  Create a name for a sample suitable for R.
- static String nameForR(ExperimentalFactor experimentalFactor)
  Create a name for the factor that is suitable for R.
- static String nameForR(FactorValue fv, boolean isBaseline)
  Create a name for the factor value that is suitable for R.
-
-
-
Field Detail
-
BATCH_FACTOR_CATEGORIES
public static final List<Category> BATCH_FACTOR_CATEGORIES
A list of all categories considered to be batch.
-
BATCH_FACTOR_NAME
public static final String BATCH_FACTOR_NAME
Name used by a batch factor. This is used only if the factor lacks a category.
- See Also:
- Constant Field Values
-
BIO_MATERIAL_RNAME_PREFIX
public static final String BIO_MATERIAL_RNAME_PREFIX
- See Also:
- Constant Field Values
-
FACTOR_RNAME_PREFIX
public static final String FACTOR_RNAME_PREFIX
- See Also:
- Constant Field Values
-
FACTOR_VALUE_RNAME_PREFIX
public static final String FACTOR_VALUE_RNAME_PREFIX
- See Also:
- Constant Field Values
-
FACTOR_VALUE_BASELINE_SUFFIX
public static final String FACTOR_VALUE_BASELINE_SUFFIX
- See Also:
- Constant Field Values
-
-
Method Detail
-
isComplete
public static boolean isComplete(ExperimentalFactor factor, List<BioMaterial> samplesUsed)
Check whether a factor is complete, i.e. that none of the given samples lacks an assigned value for it.
- Parameters:
factor - the factor
samplesUsed - the samples used
- Returns:
true if every sample has a value for the factor, false if there are any missing values.
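The completeness check can be illustrated with a minimal sketch that uses plain Java stand-ins instead of the Gemma model classes. Samples are represented by their names, and the hypothetical map valueBySample holds the value assigned to each sample for a single factor. The class and method names below are assumptions made for illustration, not the library's implementation.

```java
import java.util.List;
import java.util.Map;

// Stand-in sketch: samples are plain Strings and valueBySample plays the role of
// one factor's assignments. None of these types come from Gemma.
final class CompletenessSketch {

    // Returns false as soon as one of the samples has no assigned value.
    static boolean isComplete(Map<String, String> valueBySample, List<String> samplesUsed) {
        for (String sample : samplesUsed) {
            if (valueBySample.get(sample) == null) {
                return false;
            }
        }
        return true;
    }
}
```

Under these assumptions, a sample that is absent from the map counts as a missing value, so the result depends on which samples are passed in.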
-
getSampleToFactorValuesMap
public static Map<BioMaterial,Set<FactorValue>> getSampleToFactorValuesMap(ExperimentalFactor factor, Collection<BioMaterial> samplesUsed)
Create a sample to factor value mapping. Under normal circumstances, there should be only one factor value per sample.
-
buildDesignMatrix
public static ObjectMatrix<BioMaterial,ExperimentalFactor,Object> buildDesignMatrix(List<ExperimentalFactor> factors, List<BioMaterial> samplesUsed, boolean allowMissingValues)
Build a design matrix for the given factors and samples.
- Parameters:
factors - the factors
samplesUsed - the samples used
allowMissingValues - whether to allow missing values; if set to true, the returned matrix may contain nulls
- Returns:
the experimental design matrix
- Throws:
IllegalStateException - if missing values are found and allowMissingValues is false
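The contract of allowMissingValues can be sketched with plain Java collections. This is a simplified stand-in, not the Gemma code: the matrix is a nested map keyed by sample name and then factor name, and valuesByFactor is a hypothetical lookup from factor name to the value assigned to each sample.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Stand-in sketch of a samples-by-factors design matrix. All names here are
// hypothetical; the real method returns an ObjectMatrix of model objects.
final class DesignMatrixSketch {

    static Map<String, Map<String, Object>> build(List<String> factors, List<String> samples,
            Map<String, Map<String, Object>> valuesByFactor, boolean allowMissingValues) {
        Map<String, Map<String, Object>> matrix = new LinkedHashMap<>();
        for (String sample : samples) {
            Map<String, Object> row = new LinkedHashMap<>();
            for (String factor : factors) {
                Map<String, Object> assigned = valuesByFactor.get(factor);
                Object value = assigned == null ? null : assigned.get(sample);
                if (value == null && !allowMissingValues) {
                    // Mirrors the documented behaviour: missing values are an error unless allowed.
                    throw new IllegalStateException(
                            "Missing value for sample " + sample + " and factor " + factor);
                }
                row.put(factor, value); // may be null when missing values are allowed
            }
            matrix.put(sample, row);
        }
        return matrix;
    }
}
```

With allowMissingValues set to true, callers of such a matrix have to handle null cells themselves.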
-
buildRDesignMatrix
public static ObjectMatrix<String,String,Object> buildRDesignMatrix(List<ExperimentalFactor> factors, List<BioMaterial> samplesUsed, boolean allowMissingValues)
Build an R-friendly design matrix. Rows and columns use names derived from nameForR(BioMaterial), nameForR(ExperimentalFactor) and nameForR(FactorValue, boolean) such that the resulting matrix can be passed to R for analysis. It is otherwise identical to buildDesignMatrix(List, List, boolean).
-
buildRDesignMatrix
public static ObjectMatrix<String,String,Object> buildRDesignMatrix(List<ExperimentalFactor> factors, List<BioMaterial> samplesUsed, Map<ExperimentalFactor,FactorValue> baselines, boolean allowMissingValues)
A variant of buildRDesignMatrix(List, List, boolean) that allows for reusing baselines for repeated calls. This is used for subset analysis.
-
getBaselineConditions
public static Map<ExperimentalFactor,FactorValue> getBaselineConditions(List<BioMaterial> samplesUsed, List<ExperimentalFactor> factors)
-
getBaselineLevels
public static Map<ExperimentalFactor,FactorValue> getBaselineLevels(Collection<ExperimentalFactor> factors)
Identify the FactorValue that should be treated as 'Baseline' for each of the given factors. This is done heuristically, and if all else fails we choose arbitrarily.
- Parameters:
factors - the factors
- Returns:
map of factors to the baseline FactorValue for that factor.
-
getBaselineLevels
public static Map<ExperimentalFactor,FactorValue> getBaselineLevels(Collection<ExperimentalFactor> factors, @Nullable List<BioMaterial> samplesUsed)
Identify the FactorValue that should be treated as 'Baseline' for each of the given factors. This is done heuristically, and if all else fails we choose arbitrarily. For continuous factors, the minimum value is treated as baseline.
- Parameters:
factors - the factors
samplesUsed - these are used to make sure we don't bother using factor values as baselines if they are not used by any of the samples. This is important for subsets. If null, this is ignored.
- Returns:
map of factors to the baseline FactorValue for that factor.
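The rule for continuous factors, combined with the samplesUsed restriction, can be sketched as follows. This covers only the continuous case and uses stand-in types: valueBySample is a hypothetical map from sample name to the numeric value of one factor, and the heuristics applied to categorical factors are not shown because this page does not describe them.

```java
import java.util.Collection;
import java.util.Map;
import java.util.OptionalDouble;

// Stand-in sketch: picks the minimum continuous value among the samples in use.
// The types and names are hypothetical and do not come from Gemma.
final class BaselineSketch {

    // samplesUsed may be null, in which case every sample in the map is considered.
    static OptionalDouble minimumBaseline(Map<String, Double> valueBySample,
            Collection<String> samplesUsed) {
        return valueBySample.entrySet().stream()
                // ignore values that no sample in the current subset carries
                .filter(e -> samplesUsed == null || samplesUsed.contains(e.getKey()))
                .mapToDouble(e -> e.getValue())
                .min();
    }
}
```

Restricting to the samples in use matters for subsets: a value that is the overall minimum may not occur in the subset at all, in which case it cannot serve as that subset's baseline.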
-
measurement2double
public static double measurement2double(Measurement measurement)
Convert a measurement to a double. Missing values are treated as NaNs.
- Throws:
UnsupportedOperationException - if the measurement representation is not supported
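The documented behaviour (NaN for missing values, an exception for unsupported representations) can be illustrated with a small stand-in. The Representation enum and the string-valued input below are assumptions for this sketch; the page does not say how Measurement stores its value or which representations are supported.

```java
// Stand-in sketch of converting a stored value to a double. The enum and the
// choice of supported cases are hypothetical, not taken from Gemma.
final class MeasurementSketch {

    enum Representation { DOUBLE, INT, BOOLEAN }

    static double toDouble(String value, Representation representation) {
        if (value == null || value.trim().isEmpty()) {
            return Double.NaN; // missing values become NaN
        }
        switch (representation) {
            case DOUBLE:
                return Double.parseDouble(value.trim());
            case INT:
                return Integer.parseInt(value.trim());
            default:
                throw new UnsupportedOperationException(
                        "Unsupported representation: " + representation);
        }
    }
}
```

Because NaN compares unequal to everything, including itself, callers should test the result with Double.isNaN rather than with ==.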
-
isBatchFactor
public static boolean isBatchFactor(ExperimentalFactor ef)
Check if a factor is a batch factor.
-
isBatchFactor
public static boolean isBatchFactor(ExperimentalFactorValueObject ef)
Check if a given factor VO is a batch factor.
-
nameForR
public static String nameForR(BioMaterial sample)
Create a name for a sample suitable for R.
-
nameForR
public static String nameForR(ExperimentalFactor experimentalFactor)
Create a name for the factor that is suitable for R.
-
nameForR
public static String nameForR(FactorValue fv, boolean isBaseline)
Create a name for the factor value that is suitable for R.
-
getOrderedFactors
public static List<ExperimentalFactor> getOrderedFactors(Collection<ExperimentalFactor> factors)
Sort factors in a consistent way. For this to work, the factors must be persistent as the order will be based on the numerical ID.
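Ordering by numerical ID can be sketched with a stand-in Factor class. The class, its fields, and the decision to reject a null ID with an IllegalArgumentException are assumptions for illustration; the page only states that the factors must be persistent.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.Comparator;
import java.util.List;

// Stand-in sketch: sorts hypothetical factors by their numerical ID.
final class OrderedFactorsSketch {

    static final class Factor {
        final Long id;      // null would mean "not persistent" in this sketch
        final String name;

        Factor(Long id, String name) {
            this.id = id;
            this.name = name;
        }
    }

    static List<Factor> ordered(Collection<Factor> factors) {
        List<Factor> sorted = new ArrayList<>(factors);
        for (Factor f : sorted) {
            if (f.id == null) {
                // Assumed behaviour: without an ID there is nothing consistent to sort on.
                throw new IllegalArgumentException("Factor has no ID: " + f.name);
            }
        }
        sorted.sort(Comparator.comparing((Factor f) -> f.id));
        return sorted;
    }
}
```

Sorting a copy leaves the caller's collection untouched, and the same input always produces the same order regardless of how the collection iterates.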
-
-