Class DiffExAnalyzerUtils
java.lang.Object
ubic.gemma.core.analysis.expression.diff.DiffExAnalyzerUtils
-
Field Summary
Fields -
Constructor Summary
Constructors -
Method Summary
Modifier and Type, Method, Description

static ObjectMatrix<BioMaterial, ExperimentalFactor, Object> buildDesignMatrix(List<ExperimentalFactor> factors, List<BioMaterial> samplesUsed, boolean allowMissingValues)
Build a design matrix for the given factors and samples.

static ObjectMatrix<String, String, Object> buildRDesignMatrix(List<ExperimentalFactor> factors, List<BioMaterial> samplesUsed, boolean allowMissingValues)
Build an R-friendly design matrix.

static ObjectMatrix<String, String, Object> buildRDesignMatrix(List<ExperimentalFactor> factors, List<BioMaterial> samplesUsed, Map<ExperimentalFactor, FactorValue> baselines, boolean allowMissingValues)
A variant of buildRDesignMatrix(List, List, boolean) that allows for reusing baselines for repeated calls.

static BioAssayDimension createBADMap(List<BioMaterial> columnsToUse)
This bioAssayDimension shouldn't get persisted; it is only for dealing with subset diff ex. analyses.

static AnalysisType determineAnalysisType(BioAssaySet bioAssaySet, Collection<ExperimentalFactor> experimentalFactors, ExperimentalFactor subsetFactor, boolean includeInteractionsIfPossible)
Determines the analysis to execute based on the experimental factors, factor values, and block design.

static AnalysisType determineAnalysisType(BioAssaySet bioAssaySet, DifferentialExpressionAnalysisConfig config)
FIXME: this should probably deal with the case of outliers and also the LinearModelAnalyzer's EXCLUDE_CHARACTERISTICS_VALUES.

static String formatInteraction
Format an interaction of factors.

static DoubleMatrix<String, String> makeDataMatrix(ObjectMatrix<String, String, Object> designMatrix, DoubleMatrix<CompositeSequence, BioMaterial> namedMatrix)
Convert the data into a string-keyed matrix.

static String nameForR(BioMaterial sample)
Create a name for a sample suitable for R.

static String nameForR

static String nameForR(ExperimentalFactor experimentalFactor)
Create a name for the factor that is suitable for R.

static String nameForR(FactorValue fv, boolean isBaseline)
Create a name for the factor value that is suitable for R.

static void populateFactorValuesFromBASet(BioAssaySet ee, ExperimentalFactor f, Collection<FactorValue> fvs)

static void writeConfig(DifferentialExpressionAnalysisConfig config, Writer writer)
-
Field Details
-
BIO_MATERIAL_RNAME_PREFIX
-
FACTOR_RNAME_PREFIX
-
FACTOR_VALUE_RNAME_PREFIX
-
FACTOR_VALUE_BASELINE_SUFFIX
-
-
Constructor Details
-
DiffExAnalyzerUtils
public DiffExAnalyzerUtils()
-
-
Method Details
-
createBADMap
public static BioAssayDimension createBADMap(List<BioMaterial> columnsToUse)
This BioAssayDimension should not be persisted; it is only for dealing with subset differential expression analyses.
Parameters:
columnsToUse - columns to use
Returns:
bio assay dimension
-
populateFactorValuesFromBASet
public static void populateFactorValuesFromBASet(BioAssaySet ee, ExperimentalFactor f, Collection<FactorValue> fvs)
-
makeDataMatrix
public static DoubleMatrix<String, String> makeDataMatrix(ObjectMatrix<String, String, Object> designMatrix, DoubleMatrix<CompositeSequence, BioMaterial> namedMatrix)
Convert the data into a string-keyed matrix. Assumes that the row names of the designMatrix are concordant with the column names of the namedMatrix.
-
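The concordance assumption stated for makeDataMatrix can be checked explicitly before conversion. The following is a minimal sketch, not Gemma code: plain lists of names stand in for the matrix row and column names, ConcordanceSketch and requireConcordant are hypothetical, and "concordant" is assumed here to mean the same names in the same order.

```java
import java.util.List;

final class ConcordanceSketch {

    /**
     * Hypothetical helper: verifies that the design matrix row names match the
     * data matrix column names, position by position.
     */
    static void requireConcordant(List<String> designRowNames, List<String> dataColumnNames) {
        if (designRowNames.size() != dataColumnNames.size()) {
            throw new IllegalArgumentException("Design has " + designRowNames.size()
                    + " rows but data has " + dataColumnNames.size() + " columns");
        }
        for (int i = 0; i < designRowNames.size(); i++) {
            if (!designRowNames.get(i).equals(dataColumnNames.get(i))) {
                throw new IllegalArgumentException("Mismatch at position " + i + ": "
                        + designRowNames.get(i) + " vs " + dataColumnNames.get(i));
            }
        }
    }

    public static void main(String[] args) {
        // Same names in the same order: passes silently.
        requireConcordant(List.of("s1", "s2"), List.of("s1", "s2"));
        try {
            // Same names in a different order: rejected under this sketch's assumption.
            requireConcordant(List.of("s1", "s2"), List.of("s2", "s1"));
        } catch (IllegalArgumentException e) {
            System.out.println("Rejected: " + e.getMessage());
        }
    }
}
```

Failing fast on a mismatch keeps a silently misaligned design from reaching the model fit; whether the real method performs such a check is not stated in this page.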
nameForR
-
determineAnalysisType
public static AnalysisType determineAnalysisType(BioAssaySet bioAssaySet, DifferentialExpressionAnalysisConfig config)
FIXME: this should probably deal with the case of outliers and also the LinearModelAnalyzer's EXCLUDE_CHARACTERISTICS_VALUES.
Returns:
selected type of analysis such as t-test, two-way ANOVA, etc.
-
determineAnalysisType
public static AnalysisType determineAnalysisType(BioAssaySet bioAssaySet, Collection<ExperimentalFactor> experimentalFactors, @Nullable ExperimentalFactor subsetFactor, boolean includeInteractionsIfPossible)
Determines the analysis to execute based on the experimental factors, factor values, and block design.
FIXME: this should probably deal with the case of outliers and also the LinearModelAnalyzer's EXCLUDE_CHARACTERISTICS_VALUES.
Parameters:
bioAssaySet - experiment or subset to determine the analysis type for
experimentalFactors - which factors to use, or null to use all factors from the experiment
subsetFactor - can be null
includeInteractionsIfPossible - include interactions among the provided experimental factors if possible
Returns:
an appropriate analysis type
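To illustrate the kind of decision determineAnalysisType describes, here is a simplified sketch that picks an analysis from the number of factors and their level counts. It is not the Gemma implementation: AnalysisTypeSketch, the Kind enum, and its constant names are hypothetical, the rule is the conventional textbook mapping, and the subset factor and block design completeness are ignored.

```java
import java.util.List;

final class AnalysisTypeSketch {

    /** Hypothetical stand-in for AnalysisType; constant names are illustrative only. */
    enum Kind {
        T_TEST,
        ONE_WAY_ANOVA,
        TWO_WAY_ANOVA_NO_INTERACTION,
        TWO_WAY_ANOVA_WITH_INTERACTION,
        GENERIC_LINEAR_MODEL
    }

    /**
     * Chooses an analysis from the number of levels of each factor.
     * Assumption of this sketch: every factor must have at least two levels.
     */
    static Kind choose(List<Integer> levelsPerFactor, boolean includeInteractionsIfPossible) {
        if (levelsPerFactor.isEmpty()) {
            throw new IllegalArgumentException("At least one factor is required");
        }
        for (int levels : levelsPerFactor) {
            if (levels < 2) {
                throw new IllegalArgumentException("Each factor needs at least two levels in this sketch");
            }
        }
        if (levelsPerFactor.size() == 1) {
            // One factor: two levels is a two-sample comparison, more is a one-way ANOVA.
            return levelsPerFactor.get(0) == 2 ? Kind.T_TEST : Kind.ONE_WAY_ANOVA;
        }
        if (levelsPerFactor.size() == 2) {
            return includeInteractionsIfPossible
                    ? Kind.TWO_WAY_ANOVA_WITH_INTERACTION
                    : Kind.TWO_WAY_ANOVA_NO_INTERACTION;
        }
        // Three or more factors: fall back to a general linear model.
        return Kind.GENERIC_LINEAR_MODEL;
    }

    public static void main(String[] args) {
        System.out.println(choose(List.of(2), false));
        System.out.println(choose(List.of(3, 2), true));
    }
}
```

A real implementation would also need to decide whether interactions are actually estimable, which is what the "if possible" in includeInteractionsIfPossible suggests.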
-
writeConfig
public static void writeConfig(DifferentialExpressionAnalysisConfig config, Writer writer) throws IOException
Throws:
IOException
-
formatInteraction
Format an interaction of factors.
-
buildDesignMatrix
public static ObjectMatrix<BioMaterial, ExperimentalFactor, Object> buildDesignMatrix(List<ExperimentalFactor> factors, List<BioMaterial> samplesUsed, boolean allowMissingValues)
Build a design matrix for the given factors and samples.
Parameters:
factors - factors
samplesUsed - the samples used
allowMissingValues - whether to allow missing values; if set to true, the returned matrix may contain nulls
Returns:
the experimental design matrix
Throws:
IllegalStateException - if missing values are found and allowMissingValues is false
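The allowMissingValues contract described above (nulls when allowed, IllegalStateException otherwise) can be sketched with plain collections. This is an illustration only: DesignMatrixSketch, build, and the nested-map representation of factor assignments are hypothetical stand-ins for ObjectMatrix, ExperimentalFactor, and BioMaterial.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

final class DesignMatrixSketch {

    /**
     * Builds a samples-by-factors matrix of factor value labels.
     * assignments maps sample name to (factor name to value label).
     */
    static String[][] build(List<String> factors, List<String> samples,
            Map<String, Map<String, String>> assignments, boolean allowMissingValues) {
        String[][] matrix = new String[samples.size()][factors.size()];
        for (int i = 0; i < samples.size(); i++) {
            Map<String, String> values = assignments.getOrDefault(samples.get(i), Map.of());
            for (int j = 0; j < factors.size(); j++) {
                String value = values.get(factors.get(j));
                if (value == null && !allowMissingValues) {
                    throw new IllegalStateException("Sample " + samples.get(i)
                            + " has no value for factor " + factors.get(j));
                }
                matrix[i][j] = value; // stays null when the value is missing and allowed
            }
        }
        return matrix;
    }

    public static void main(String[] args) {
        Map<String, Map<String, String>> assignments = new HashMap<>();
        assignments.put("s1", Map.of("sex", "male", "treatment", "drug"));
        assignments.put("s2", Map.of("sex", "female")); // treatment is missing
        List<String> factors = List.of("sex", "treatment");
        List<String> samples = List.of("s1", "s2");
        System.out.println(Arrays.deepToString(build(factors, samples, assignments, true)));
        try {
            build(factors, samples, assignments, false);
        } catch (IllegalStateException e) {
            System.out.println("Rejected: " + e.getMessage());
        }
    }
}
```

Callers that pass allowMissingValues as true need to handle null cells downstream, for example by dropping the affected samples before fitting.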
-
buildRDesignMatrix
public static ObjectMatrix<String, String, Object> buildRDesignMatrix(List<ExperimentalFactor> factors, List<BioMaterial> samplesUsed, boolean allowMissingValues)
Build an R-friendly design matrix.
Rows and columns use names derived from nameForR(BioMaterial), nameForR(ExperimentalFactor) and nameForR(FactorValue, boolean) such that the resulting matrix can be passed to R for analysis. It is otherwise identical to buildDesignMatrix(List, List, boolean).
-
buildRDesignMatrix
public static ObjectMatrix<String, String, Object> buildRDesignMatrix(List<ExperimentalFactor> factors, List<BioMaterial> samplesUsed, Map<ExperimentalFactor, FactorValue> baselines, boolean allowMissingValues)
A variant of buildRDesignMatrix(List, List, boolean) that allows for reusing baselines for repeated calls. This is used for subset analysis.
-
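nameForR
public static String nameForR(BioMaterial sample)
Create a name for a sample suitable for R.
-
nameForR
public static String nameForR(ExperimentalFactor experimentalFactor)
Create a name for the factor that is suitable for R.
-
nameForR
public static String nameForR(FactorValue fv, boolean isBaseline)
Create a name for the factor value that is suitable for R.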
-