Class ProbeMapperConfig
- java.lang.Object
-
- ubic.gemma.core.analysis.sequence.ProbeMapperConfig
-
public class ProbeMapperConfig extends Object
Holds parameters for how mapping should be done.
- Author:
- paul
-
-
Field Summary
Fields
Modifier and Type    Field    Description
static boolean
DEFAULT_ALLOW_PARS
static boolean
DEFAULT_ALLOW_PREDICTED
static double
DEFAULT_IDENTITY_THRESHOLD
Sequence identity below which we throw hits away (expressed as a fraction).
static double
DEFAULT_MINIMUM_EXON_OVERLAP_FRACTION
Fraction of bases which must overlap with an annotated exon.
static double
DEFAULT_SCORE_THRESHOLD
BLAT score threshold below which we do not consider hits.
static boolean
DEFAULT_TRIM_NONCANONICAL_CHROMOSOMES
static int
MAX_WARNINGS
static int
NON_REPEAT_NON_SPECIFIC_SITE_THRESHOLD
Sequences which hybridize to this many or more sites in the genome are candidates to be considered non-specific.
static int
NON_SPECIFIC_SITE_THRESHOLD
Sequences which hybridize to this many or more sites in the genome are candidates to be considered non-specific.
static double
REPEAT_FRACTION_MAXIMUM
Sequences which have more than this fraction accounted for by repeats (via RepeatMasker) will not be examined if they produce multiple alignments to the genome, regardless of the alignment quality.
-
Constructor Summary
Constructors
Constructor    Description
ProbeMapperConfig()
-
Method Summary
Modifier and Type    Method    Description
double
getBlatScoreThreshold()
double
getIdentityThreshold()
double
getMaximumRepeatFraction()
double
getMinimumExonOverlapFraction()
double
getNonRepeatNonSpecificSiteCountThreshold()
double
getNonSpecificSiteCountThreshold()
protected int
getWarnings()
protected void
incrementWarnings()
boolean
isAllowPredictedGenes()
boolean
isTrimNonCanonicalChromosomehits()
boolean
isUseEnsembl()
boolean
isUseEsts()
boolean
isUseKnownGene()
boolean
isUseMiRNA()
boolean
isUseMrnas()
boolean
isUseRefGene()
void
setAllowPredictedGenes(boolean allowPredictedGenes)
void
setAllTracksOff()
Set to use no tracks.
void
setAllTracksOn()
Set to use all tracks, including ESTs.
void
setBlatScoreThreshold(double blatScoreThreshold)
void
setIdentityThreshold(double identityThreshold)
void
setMaximumRepeatFraction(double maximumRepeatFraction)
void
setMinimumExonOverlapFraction(double minimumExonOverlapFraction)
void
setNonRepeatNonSpecificSiteCountThreshold(double nonRepeatNonSpecificSiteCountThreshold)
void
setNonSpecificSiteCountThreshold(double nonSpecificSiteCountThreshold)
void
setTrimNonCanonicalChromosomeHits(boolean trimNonCanonicalChromosomeHits)
void
setUseEnsembl(boolean useEnsembl)
void
setUseEsts(boolean useEsts)
void
setUseKnownGene(boolean useKnownGene)
void
setUseMiRNA(boolean useMiRNA)
void
setUseMrnas(boolean useMrnas)
void
setUseRefGene(boolean useRefGene)
String
toString()
-
-
-
Field Detail
-
DEFAULT_TRIM_NONCANONICAL_CHROMOSOMES
public static final boolean DEFAULT_TRIM_NONCANONICAL_CHROMOSOMES
- See Also:
- Constant Field Values
-
DEFAULT_ALLOW_PARS
public static final boolean DEFAULT_ALLOW_PARS
- See Also:
- Constant Field Values
-
DEFAULT_ALLOW_PREDICTED
public static final boolean DEFAULT_ALLOW_PREDICTED
- See Also:
- Constant Field Values
-
MAX_WARNINGS
public static final int MAX_WARNINGS
- See Also:
- Constant Field Values
-
DEFAULT_IDENTITY_THRESHOLD
public static final double DEFAULT_IDENTITY_THRESHOLD
Sequence identity below which we throw hits away (expressed as a fraction).
- See Also:
- Constant Field Values
-
DEFAULT_MINIMUM_EXON_OVERLAP_FRACTION
public static final double DEFAULT_MINIMUM_EXON_OVERLAP_FRACTION
Fraction of bases which must overlap with an annotated exon. This should probably be higher than zero, to avoid "pure intron" hits, but setting it too high can cause loss of sensitivity.
- See Also:
- Constant Field Values
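As a rough illustration of what an exon overlap fraction measures, the sketch below computes the fraction of an aligned interval that falls inside annotated exons and compares it to a minimum. The class name, the half-open coordinate convention and the 0.05 cutoff are assumptions made for this example; none of them is taken from Gemma.

```java
/** Hypothetical illustration; not part of Gemma. */
class ExonOverlapSketch {

    /**
     * Fraction of the aligned interval [start, end) covered by exons.
     * Each exon is a half-open {start, end} pair; exons are assumed
     * not to overlap one another.
     */
    static double overlapFraction(int start, int end, int[][] exons) {
        if (end <= start) {
            return 0.0;
        }
        int overlap = 0;
        for (int[] exon : exons) {
            int lo = Math.max(start, exon[0]);
            int hi = Math.min(end, exon[1]);
            if (hi > lo) {
                overlap += hi - lo;
            }
        }
        return (double) overlap / (end - start);
    }

    /** A hit passes when its overlap fraction is at least the minimum. */
    static boolean passes(double fraction, double minimumExonOverlapFraction) {
        return fraction >= minimumExonOverlapFraction;
    }

    public static void main(String[] args) {
        int[][] exons = { { 50, 120 }, { 180, 300 } };
        double partial = overlapFraction(100, 200, exons);
        double intronic = overlapFraction(130, 170, exons);
        // 0.05 is a placeholder cutoff, not the class default.
        System.out.println("partial passes: " + passes(partial, 0.05));
        System.out.println("intronic passes: " + passes(intronic, 0.05));
        // With a minimum of zero, even the purely intronic hit passes.
        System.out.println("intronic passes at 0.0: " + passes(intronic, 0.0));
    }
}
```

The last line of `main` shows why a minimum of exactly zero lets "pure intron" hits through in this sketch: an overlap of 0.0 still satisfies `>= 0.0`.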
-
DEFAULT_SCORE_THRESHOLD
public static final double DEFAULT_SCORE_THRESHOLD
BLAT score threshold below which we do not consider hits. This reflects the fraction of aligned bases.
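The score and identity thresholds both act as lower bounds on a hit. A minimal sketch of that kind of filter follows; the `Hit` class, its field names and the 0.75 and 0.80 values are hypothetical and chosen only for the example, and the actual filtering code in Gemma may differ.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical illustration; not part of Gemma. */
class HitThresholdSketch {

    /** Minimal stand-in for an alignment result; field names are assumptions. */
    static final class Hit {
        final String name;
        final double score;    // fraction of aligned bases, 0..1
        final double identity; // sequence identity as a fraction, 0..1

        Hit(String name, double score, double identity) {
            this.name = name;
            this.score = score;
            this.identity = identity;
        }
    }

    /** Keeps hits whose score and identity are both at or above the thresholds. */
    static List<Hit> filter(List<Hit> hits, double scoreThreshold, double identityThreshold) {
        List<Hit> kept = new ArrayList<>();
        for (Hit h : hits) {
            if (h.score >= scoreThreshold && h.identity >= identityThreshold) {
                kept.add(h);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        List<Hit> hits = new ArrayList<>();
        hits.add(new Hit("a", 0.95, 0.99));
        hits.add(new Hit("b", 0.40, 0.99)); // below the score threshold
        hits.add(new Hit("c", 0.95, 0.70)); // below the identity threshold
        // 0.75 and 0.80 are placeholder thresholds, not the class defaults.
        for (Hit h : filter(hits, 0.75, 0.80)) {
            System.out.println(h.name);
        }
    }
}
```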
-
NON_REPEAT_NON_SPECIFIC_SITE_THRESHOLD
public static final int NON_REPEAT_NON_SPECIFIC_SITE_THRESHOLD
Sequences which hybridize to this many or more sites in the genome are candidates to be considered non-specific. This is used even if the sequence does not contain a repeat.
- See Also:
- Constant Field Values
-
NON_SPECIFIC_SITE_THRESHOLD
public static final int NON_SPECIFIC_SITE_THRESHOLD
Sequences which hybridize to this many or more sites in the genome are candidates to be considered non-specific. This is used in combination with REPEAT_FRACTION_MAXIMUM. Note that many sequences which contain repeats nonetheless align to only a few sites in the genome. Similarly, some sequences map to multiple sites but are not repeats. This value also does not take into account whether the alignments are in known genes. Thus, setting it too low could result in over-stringent filtering.
- See Also:
- Constant Field Values
-
REPEAT_FRACTION_MAXIMUM
public static final double REPEAT_FRACTION_MAXIMUM
Sequences which have more than this fraction accounted for by repeats (via RepeatMasker) will not be examined if they produce multiple alignments to the genome, regardless of the alignment quality.
- See Also:
- Constant Field Values
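One possible reading of how the two site-count thresholds and the repeat fraction combine is sketched below. This is an assumption drawn from the field descriptions above, not a copy of Gemma's logic, and the numeric values (0.3, 3 and 10) are placeholders, since the actual constant values are not given here.

```java
/** Hypothetical illustration; not part of Gemma. */
class NonSpecificSketch {

    /**
     * Returns true when a sequence would be treated as non-specific under
     * the assumed rule:
     *   - it aligns to at least nonRepeatThreshold sites, repeats or not; or
     *   - its repeat fraction exceeds maxRepeatFraction and it aligns to
     *     at least nonSpecificThreshold sites.
     */
    static boolean isNonSpecific(int numSites, double repeatFraction,
            double maxRepeatFraction, int nonSpecificThreshold, int nonRepeatThreshold) {
        if (numSites >= nonRepeatThreshold) {
            return true;
        }
        return repeatFraction > maxRepeatFraction && numSites >= nonSpecificThreshold;
    }

    public static void main(String[] args) {
        // Placeholder values, not the class constants.
        double maxRepeat = 0.3;
        int nonSpecific = 3;
        int nonRepeat = 10;

        // Repeat-rich and multiply aligned.
        System.out.println(isNonSpecific(4, 0.5, maxRepeat, nonSpecific, nonRepeat));
        // Same number of sites, but few repeats.
        System.out.println(isNonSpecific(4, 0.1, maxRepeat, nonSpecific, nonRepeat));
        // Repeat-rich, but only one alignment.
        System.out.println(isNonSpecific(1, 0.9, maxRepeat, nonSpecific, nonRepeat));
        // Many sites without any repeats.
        System.out.println(isNonSpecific(12, 0.0, maxRepeat, nonSpecific, nonRepeat));
    }
}
```

Under this rule a repeat-rich sequence with a single alignment is still examined, which matches the note that many repeat-containing sequences align to very few sites.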
-
-
Method Detail
-
getBlatScoreThreshold
public double getBlatScoreThreshold()
- Returns:
- the blatScoreThreshold
-
setBlatScoreThreshold
public void setBlatScoreThreshold(double blatScoreThreshold)
- Parameters:
blatScoreThreshold
- the blatScoreThreshold to set
-
getIdentityThreshold
public double getIdentityThreshold()
- Returns:
- the identityThreshold
-
setIdentityThreshold
public void setIdentityThreshold(double identityThreshold)
- Parameters:
identityThreshold
- the identityThreshold to set
-
getMaximumRepeatFraction
public double getMaximumRepeatFraction()
- Returns:
- the maximumRepeatFraction
-
setMaximumRepeatFraction
public void setMaximumRepeatFraction(double maximumRepeatFraction)
- Parameters:
maximumRepeatFraction
- the maximumRepeatFraction to set
-
getMinimumExonOverlapFraction
public double getMinimumExonOverlapFraction()
-
setMinimumExonOverlapFraction
public void setMinimumExonOverlapFraction(double minimumExonOverlapFraction)
-
getNonRepeatNonSpecificSiteCountThreshold
public double getNonRepeatNonSpecificSiteCountThreshold()
-
setNonRepeatNonSpecificSiteCountThreshold
public void setNonRepeatNonSpecificSiteCountThreshold(double nonRepeatNonSpecificSiteCountThreshold)
-
getNonSpecificSiteCountThreshold
public double getNonSpecificSiteCountThreshold()
- Returns:
- the nonSpecificSiteCountThreshold
-
setNonSpecificSiteCountThreshold
public void setNonSpecificSiteCountThreshold(double nonSpecificSiteCountThreshold)
- Parameters:
nonSpecificSiteCountThreshold
- the nonSpecificSiteCountThreshold to set
-
isAllowPredictedGenes
public boolean isAllowPredictedGenes()
-
setAllowPredictedGenes
public void setAllowPredictedGenes(boolean allowPredictedGenes)
-
isTrimNonCanonicalChromosomehits
public boolean isTrimNonCanonicalChromosomehits()
-
isUseEnsembl
public boolean isUseEnsembl()
- Returns:
- the useEnsembl
-
setUseEnsembl
public void setUseEnsembl(boolean useEnsembl)
- Parameters:
useEnsembl
- the useEnsembl to set
-
isUseEsts
public boolean isUseEsts()
- Returns:
- the useEsts
-
setUseEsts
public void setUseEsts(boolean useEsts)
- Parameters:
useEsts
- the useEsts to set
-
isUseKnownGene
public boolean isUseKnownGene()
- Returns:
- the useKnownGene
-
setUseKnownGene
public void setUseKnownGene(boolean useKnownGene)
- Parameters:
useKnownGene
- the useKnownGene to set
-
isUseMiRNA
public boolean isUseMiRNA()
- Returns:
- the useMiRNA
-
setUseMiRNA
public void setUseMiRNA(boolean useMiRNA)
- Parameters:
useMiRNA
- the useMiRNA to set
-
isUseMrnas
public boolean isUseMrnas()
- Returns:
- the useMrnas
-
setUseMrnas
public void setUseMrnas(boolean useMrnas)
- Parameters:
useMrnas
- the useMrnas to set
-
isUseRefGene
public boolean isUseRefGene()
- Returns:
- the useRefGene
-
setUseRefGene
public void setUseRefGene(boolean useRefGene)
- Parameters:
useRefGene
- the useRefGene to set
-
setAllTracksOff
public void setAllTracksOff()
Set to use no tracks. Nothing will be found in this state, so switch on at least one track afterwards.
-
setAllTracksOn
public void setAllTracksOn()
Set to use all tracks, including ESTs.
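A typical pattern with these two methods is to switch every track off and then enable only the ones wanted. The sketch below uses a small stand-in class so that it compiles without Gemma on the classpath. The stand-in copies the documented method names, but its method bodies and the `enabledTrackCount()` helper are assumptions made for illustration.

```java
/**
 * Stand-in for illustration only. It mirrors the track-related method
 * names documented for ProbeMapperConfig; the bodies are assumed.
 */
class TrackSwitchSketch {
    private boolean useEnsembl, useEsts, useKnownGene, useMiRNA, useMrnas, useRefGene;

    void setAllTracksOff() { setAll(false); }

    void setAllTracksOn() { setAll(true); }

    private void setAll(boolean on) {
        useEnsembl = on;
        useEsts = on;
        useKnownGene = on;
        useMiRNA = on;
        useMrnas = on;
        useRefGene = on;
    }

    void setUseRefGene(boolean useRefGene) { this.useRefGene = useRefGene; }

    void setUseKnownGene(boolean useKnownGene) { this.useKnownGene = useKnownGene; }

    boolean isUseRefGene() { return useRefGene; }

    boolean isUseKnownGene() { return useKnownGene; }

    boolean isUseEsts() { return useEsts; }

    /** Hypothetical helper: number of tracks currently enabled. */
    int enabledTrackCount() {
        boolean[] flags = { useEnsembl, useEsts, useKnownGene, useMiRNA, useMrnas, useRefGene };
        int count = 0;
        for (boolean flag : flags) {
            if (flag) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        TrackSwitchSketch config = new TrackSwitchSketch();
        config.setAllTracksOff();     // nothing would be found in this state
        config.setUseRefGene(true);   // so enable the tracks of interest
        config.setUseKnownGene(true);
        System.out.println("enabled tracks: " + config.enabledTrackCount());
    }
}
```

With the real class, the same sequence of calls would presumably be made on an instance created with `new ProbeMapperConfig()`.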
-
setTrimNonCanonicalChromosomeHits
public void setTrimNonCanonicalChromosomeHits(boolean trimNonCanonicalChromosomeHits)
-
getWarnings
protected int getWarnings()
-
incrementWarnings
protected void incrementWarnings()
-
-