Class SAMSequenceDictionary

    • Field Detail

      • DEFAULT_DICTIONARY_EQUAL_TAG

        public static final List<String> DEFAULT_DICTIONARY_EQUAL_TAG
    • Constructor Detail

      • SAMSequenceDictionary

        public SAMSequenceDictionary()
    • Method Detail

      • setSequences

        public void setSequences​(List<SAMSequenceRecord> list)
        Replaces the existing list of SAMSequenceRecords with the given list and resets the aliases.
        Parameters:
        list - This value is copied and validated.
      • getSequence

        public SAMSequenceRecord getSequence​(int sequenceIndex)
        Returns:
        The SAMSequenceRecord with the given index, or null if index is out of range.
      • getSequenceIndex

        public int getSequenceIndex​(String sequenceName)
        Returns:
        The index for the given sequence name, or -1 if the name is not found.
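
        The null and -1 sentinels documented for getSequence and getSequenceIndex can be illustrated with a small stand-alone model. This is only a sketch: LookupSketch and its fields are hypothetical names, sequences are reduced to plain strings, and it is not the htsjdk implementation.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for a sequence dictionary; not the htsjdk class.
class LookupSketch {
    private final List<String> names = new ArrayList<>();
    private final Map<String, Integer> indexByName = new HashMap<>();

    // Appends a sequence name; its index is its position in insertion order.
    void add(String name) {
        indexByName.put(name, names.size());
        names.add(name);
    }

    // Mirrors the documented contract: null when the index is out of range.
    String getSequence(int sequenceIndex) {
        if (sequenceIndex < 0 || sequenceIndex >= names.size()) {
            return null;
        }
        return names.get(sequenceIndex);
    }

    // Mirrors the documented contract: -1 when the name is not found.
    int getSequenceIndex(String sequenceName) {
        Integer index = indexByName.get(sequenceName);
        return index == null ? -1 : index;
    }
}
```

        Callers should therefore check for null or -1 before using the result, rather than expecting an exception.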
      • size

        public int size()
        Returns:
        number of SAMSequenceRecord(s) in this dictionary
      • getReferenceLength

        public long getReferenceLength()
        Returns:
        The sum of the lengths of the sequences in this dictionary
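
        The long return type matters because the summed length of a large reference can exceed the int range. A minimal sketch of such a sum, assuming each individual sequence length fits in an int (ReferenceLengthSketch is a hypothetical name, not part of the library):

```java
import java.util.List;

// Hypothetical helper; illustrates why the total is accumulated as a long.
class ReferenceLengthSketch {
    // Sums int lengths into a long so the running total cannot overflow int.
    static long referenceLength(List<Integer> sequenceLengths) {
        long total = 0L;
        for (int length : sequenceLengths) {
            total += length;
        }
        return total;
    }
}
```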
      • isEmpty

        public boolean isEmpty()
        Returns:
        true if the dictionary is empty
      • equals

        public boolean equals​(Object o)
        Returns true if the two dictionaries are the same.

        NOTE: Aliases are NOT considered, but alternative sequence names (AN tag) ARE.

        Overrides:
        equals in class Object
      • addSequenceAlias

        public SAMSequenceRecord addSequenceAlias​(String originalName,
                                                  String altName)
        Adds an alias to a SAMSequenceRecord. This can be used to provide alternate names for a given contig, e.g. 1, chr1, chr01, 01, CM000663, NC_000001.10, or MT, chrM.

        NOTE: this method does not add the alias to the alternative sequence name tag (AN) in the SAMSequenceRecord. If you would like to add it to the AN tag, use addAlternativeSequenceName(String, String) instead.

        Parameters:
        originalName - existing contig name
        altName - new contig name
        Returns:
        the contig associated to the 'originalName/altName'
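
        The alias mechanism can be pictured as a second lookup table that maps extra names onto an existing contig. The following is a sketch under assumptions, not the htsjdk code: AliasSketch, addSequence and resolve are hypothetical names, contigs are reduced to their canonical name, and the IllegalArgumentException thrown for unknown or conflicting names is an assumed behaviour.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical model of alias resolution; not the htsjdk class.
class AliasSketch {
    // Maps every known name (canonical or alias) to the canonical contig name.
    private final Map<String, String> canonicalByName = new HashMap<>();

    void addSequence(String name) {
        canonicalByName.put(name, name);
    }

    // Registers altName as another way to look up the contig named originalName.
    String addSequenceAlias(String originalName, String altName) {
        String canonical = canonicalByName.get(originalName);
        if (canonical == null) {
            // Assumed behaviour: aliasing an unknown contig is an error.
            throw new IllegalArgumentException("Unknown contig: " + originalName);
        }
        String existing = canonicalByName.putIfAbsent(altName, canonical);
        if (existing != null && !existing.equals(canonical)) {
            // Assumed behaviour: an alias may not point at two different contigs.
            throw new IllegalArgumentException("Alias already in use: " + altName);
        }
        return canonical;
    }

    // Returns the canonical contig name, or null if the name is unknown.
    String resolve(String nameOrAlias) {
        return canonicalByName.get(nameOrAlias);
    }
}
```

        In this model, after addSequenceAlias("1", "chr1"), both "1" and "chr1" resolve to the same contig, while the contig's own tags are left unchanged.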
      • addAlternativeSequenceName

        public SAMSequenceRecord addAlternativeSequenceName​(String originalName,
                                                            String altName)
        Adds an alternative sequence name (AN tag) to a SAMSequenceRecord and includes it in the aliases used to retrieve contigs (as with addSequenceAlias(String, String)).

        This can be used to provide alternate names for a given contig, e.g. 1, chr1, chr01, 01, CM000663 or MT, chrM.

        Parameters:
        originalName - existing contig name
        altName - new contig name
        Returns:
        the contig associated to the 'originalName/altName', with the AN tag including the altName
      • md5

        public String md5()
        Returns an MD5 sum for this dictionary. The checksum is recomputed each time this method is called.
         md5( (seq1.md5_if_available) + ' '+(seq2.name+seq2.length) + ' '+...)
         
        Returns:
        an MD5 checksum for this dictionary, or the empty string if it is empty
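
        One way to read the formula above: each sequence contributes its own MD5 if it has one, otherwise its name followed by its length; the contributions are joined with single spaces and the result is hashed. The sketch below follows that reading. DictionaryMd5Sketch and Seq are hypothetical names, and the UTF-8 encoding and zero-padded lowercase hex output are assumptions, so the digests may not match the library's byte for byte.

```java
import java.math.BigInteger;
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.List;

// Hypothetical sketch of a dictionary-level checksum; not the htsjdk code.
class DictionaryMd5Sketch {
    // Minimal record: md5 is null when no per-sequence checksum is available.
    static final class Seq {
        final String name;
        final int length;
        final String md5;

        Seq(String name, int length, String md5) {
            this.name = name;
            this.length = length;
            this.md5 = md5;
        }
    }

    static String md5(List<Seq> sequences) {
        if (sequences.isEmpty()) {
            return ""; // documented: empty dictionary yields the empty string
        }
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < sequences.size(); i++) {
            if (i > 0) {
                sb.append(' ');
            }
            Seq s = sequences.get(i);
            if (s.md5 != null) {
                sb.append(s.md5);                      // prefer the per-sequence MD5
            } else {
                sb.append(s.name).append(s.length);    // fall back to name + length
            }
        }
        try {
            MessageDigest digest = MessageDigest.getInstance("MD5");
            byte[] hash = digest.digest(sb.toString().getBytes(StandardCharsets.UTF_8));
            // Assumed output format: 32 lowercase hex digits, zero-padded.
            return String.format("%032x", new BigInteger(1, hash));
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("MD5 digest not available", e);
        }
    }
}
```

        Because nothing is cached, every call repeats the full pass over the sequences; callers that need the value repeatedly may want to store it themselves.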
      • hashCode

        public int hashCode()
        Overrides:
        hashCode in class Object
      • mergeDictionaries

        public static SAMSequenceDictionary mergeDictionaries​(SAMSequenceDictionary dict1,
                                                              SAMSequenceDictionary dict2,
                                                              List<String> tagsToMatch)
        Merges the tags of two dictionaries into one, focusing on merging the tags rather than the sequences. Requires that both dictionaries have the same SAMSequenceRecords in the same order. For each sequence index, the union of the tags from both sequences is added to the new sequence. Mismatching values (for tags present in both) generate a warning, and the value from dict1 is used. For tags listed in tagsToMatch, an unequal value generates an error (an IllegalArgumentException is thrown). tagsToMatch must include LN and M5 (the MD5 checksum tag).
        Parameters:
        dict1 - first dictionary
        dict2 - second dictionary
        tagsToMatch - list of tags that must be equal if present in both sequences. Must contain M5 and LN
        Returns:
        dictionary consisting of the same sequences as the two inputs with the merged values of tags.
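
        The per-sequence merge rule described for mergeDictionaries can be sketched for a single pair of sequences as follows. This is an illustration under assumptions, not the library's implementation: TagMergeSketch and mergeTags are hypothetical names, tags are modelled as a plain map with non-null values, the warning is simply written to standard error, and the check that tagsToMatch contains the required tags is omitted.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch of the tag-merging rule for one sequence index.
class TagMergeSketch {
    static Map<String, String> mergeTags(Map<String, String> tags1,
                                         Map<String, String> tags2,
                                         List<String> tagsToMatch) {
        // Start from the first sequence's tags so its values win on conflict.
        Map<String, String> merged = new LinkedHashMap<>(tags1);
        for (Map.Entry<String, String> entry : tags2.entrySet()) {
            String tag = entry.getKey();
            String value1 = tags1.get(tag);
            if (value1 == null) {
                // Tag only present in the second sequence: add it (union).
                merged.put(tag, entry.getValue());
            } else if (!value1.equals(entry.getValue())) {
                if (tagsToMatch.contains(tag)) {
                    // Mismatch on a tag that must match is an error.
                    throw new IllegalArgumentException(
                            "Tag " + tag + " differs: " + value1 + " vs " + entry.getValue());
                }
                // Other mismatches only warn; the first sequence's value is kept.
                System.err.println("Warning: tag " + tag + " differs, keeping " + value1);
            }
        }
        return merged;
    }
}
```

        Applying this rule at every sequence index, after confirming that both dictionaries list the same sequences in the same order, gives the behaviour described above.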