S
- type of the base sequence.final class HistogramDiffIndex<S extends Sequence>
extends java.lang.Object
HistogramDiff
by computing occurrence counts of elements.
Each element in the range being considered is put into a hash table, tracking the number of times that distinct element appears in the sequence. Once all elements have been inserted from sequence A, each element of sequence B is probed in the hash table and the longest common subsequence with the lowest occurrence count in A is used as the result.
Modifier and Type | Field and Description |
---|---|
private HashedSequence<S> |
a |
private HashedSequence<S> |
b |
private HashedSequenceComparator<S> |
cmp |
private int |
cnt |
private boolean |
hasCommon |
private int |
keyShift
Number of low bits to discard from a key to index
table . |
private Edit |
lcs |
private static int |
MAX_CNT |
private static int |
MAX_PTR |
private int |
maxChainLength |
private int[] |
next
For
ptr , next[ptr - ptrShift] has subsequent index. |
private int |
ptrShift
Value to subtract from element indexes to key
next array. |
private static int |
REC_CNT_MASK |
private static int |
REC_NEXT_SHIFT |
private static int |
REC_PTR_MASK |
private static int |
REC_PTR_SHIFT |
private int |
recCnt
Number of elements in
recs ; also is the unique element count. |
private int[] |
recIdx
For element
ptr in A, index of the record in recs array. |
private long[] |
recs
Describes a unique element in sequence A.
|
private Edit |
region |
private int[] |
table
Keyed by
hash(HashedSequence, int) for recs index. |
Constructor and Description |
---|
HistogramDiffIndex(int maxChainLength,
HashedSequenceComparator<S> cmp,
HashedSequence<S> a,
HashedSequence<S> b,
Edit r) |
Modifier and Type | Method and Description |
---|---|
(package private) Edit |
findLongestCommonSequence() |
private int |
hash(HashedSequence<S> s,
int idx) |
private static int |
recCnt(long rec) |
private static long |
recCreate(int next,
int ptr,
int cnt) |
private static int |
recNext(long rec) |
private static int |
recPtr(long rec) |
private boolean |
scanA() |
private static int |
tableBits(int sz) |
private int |
tryLongestCommonSequence(int bPtr) |
private static final int REC_NEXT_SHIFT
private static final int REC_PTR_SHIFT
private static final int REC_PTR_MASK
private static final int REC_CNT_MASK
private static final int MAX_PTR
private static final int MAX_CNT
private final int maxChainLength
private final HashedSequenceComparator<S extends Sequence> cmp
private final HashedSequence<S extends Sequence> a
private final HashedSequence<S extends Sequence> b
private final Edit region
private final int[] table
hash(HashedSequence, int)
for recs
index.private final int keyShift
table
.private long[] recs
MAX_CNT
, as the field is only
a few bits wide. Elements that occur more frequently will have their
count capped.private int recCnt
recs
; also is the unique element count.private int[] next
ptr
, next[ptr - ptrShift]
has subsequent index.
For the sequence element ptr
, the value stored at location
next[ptr - ptrShift]
is the next occurrence of the exact same
element in the sequence.
Chains always run from the lowest index to the largest index. Therefore
the array will store next[1] = 2
, but never next[2] = 1
.
This allows a chain to terminate with 0
, as 0
would never
be a valid next element.
The array is sized to be region.getLengthA()
and element indexes
are converted to array indexes by subtracting ptrShift
, which is
just a cached version of region.beginA
.private int[] recIdx
ptr
in A, index of the record in recs
array.
The record at recs[recIdx[ptr - ptrShift]]
is the record
describing all occurrences of the element appearing in sequence A at
position ptr
. The record is needed to get the occurrence count of
the element, or to locate all other occurrences of that element within
sequence A. This index provides constant-time access to the record, and
avoids needing to scan the hash chain.private int ptrShift
next
array.private Edit lcs
private int cnt
private boolean hasCommon
HistogramDiffIndex(int maxChainLength, HashedSequenceComparator<S> cmp, HashedSequence<S> a, HashedSequence<S> b, Edit r)
Edit findLongestCommonSequence()
private boolean scanA()
private int tryLongestCommonSequence(int bPtr)
private int hash(HashedSequence<S> s, int idx)
private static long recCreate(int next, int ptr, int cnt)
private static int recNext(long rec)
private static int recPtr(long rec)
private static int recCnt(long rec)
private static int tableBits(int sz)