Class NcbiGeneHistoryParser

  • All Implemented Interfaces:
    LineParser<NcbiGeneHistory>, Parser<NcbiGeneHistory>

    public class NcbiGeneHistoryParser
    extends BasicLineMapParser<String,NcbiGeneHistory>
    Parses the NCBI "gene_history" file.
    File format: tax_id, GeneID, Discontinued_GeneID, Discontinued_Symbol, Discontinue_Date. Tab is used as the separator, and a pound sign (#) marks the start of a comment.
    The file is obtained from ftp.ncbi.nih.gov/gene/DATA. See the NCBI README.
    There are two kinds of lines. Lines with a "-" for the GeneID (the majority) seem to be used when the record was withdrawn (the field is defined as "the current unique identifier for a gene"). Lines with a symbol mean the record was replaced, so far as I can tell.
    Author:
    paul
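
The line format described above can be illustrated with a minimal sketch. This is not the actual implementation of NcbiGeneHistoryParser: the class GeneHistoryLineSketch, its nested Row type, and the methods parseLine and hasCurrentGeneId are hypothetical names introduced only for this example, and the sample values in main are made up. It assumes five tab-separated columns, skips lines starting with a pound sign, and treats "-" in the GeneID column as "no current gene ID".

```java
import java.util.Optional;

// Hypothetical sketch of gene_history line handling; not the real parser class.
final class GeneHistoryLineSketch {

    /** One parsed row; fields follow the column order given in the file description. */
    static final class Row {
        final String taxId;
        final String geneId;              // "-" when no current gene ID is listed
        final String discontinuedGeneId;
        final String discontinuedSymbol;
        final String discontinueDate;

        Row(String taxId, String geneId, String discontinuedGeneId,
            String discontinuedSymbol, String discontinueDate) {
            this.taxId = taxId;
            this.geneId = geneId;
            this.discontinuedGeneId = discontinuedGeneId;
            this.discontinuedSymbol = discontinuedSymbol;
            this.discontinueDate = discontinueDate;
        }

        /** True when the GeneID column holds something other than "-". */
        boolean hasCurrentGeneId() {
            return !"-".equals(geneId);
        }
    }

    /** Returns an empty Optional for blank lines and comment lines (starting with '#'). */
    static Optional<Row> parseLine(String line) {
        if (line == null || line.trim().isEmpty() || line.startsWith("#")) {
            return Optional.empty();
        }
        // limit -1 keeps trailing empty fields instead of silently dropping them
        String[] fields = line.split("\t", -1);
        if (fields.length < 5) {
            throw new IllegalArgumentException(
                "Expected 5 tab-separated fields, got " + fields.length);
        }
        return Optional.of(new Row(fields[0], fields[1], fields[2], fields[3], fields[4]));
    }

    public static void main(String[] args) {
        // Made-up sample values, used only to exercise the sketch.
        String withdrawn = "9606\t-\t12345\tABC1\t20050510";
        String replaced = "9606\t678\t12346\tABC2\t20050511";
        for (String line : new String[] {"#tax_id\tGeneID", withdrawn, replaced}) {
            Optional<Row> row = parseLine(line);
            if (row.isPresent()) {
                System.out.println(row.get().discontinuedGeneId
                    + " has current GeneID: " + row.get().hasCurrentGeneId());
            }
        }
    }
}
```

Returning an Optional for comment lines keeps the caller free of null checks; whether the real parser handles comments this way is not stated in this page.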