public class BlockReader extends BaseTermsEnum implements Accountable
Reads the block fully into blockReadBuffer, then scans the block terms in memory. The details region is lazily decoded with termStatesReadBuffer, which shares the same byte array as blockReadBuffer.
See BlockWriter and BlockLine for the block format.
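The idea of two read buffers sharing one byte array can be illustrated with a minimal, self-contained sketch. It does not use the Lucene classes: `SharedBlockBuffers`, the toy block layout (a length prefix, then the terms, then the details) and the helper names are all assumptions made for illustration, and `java.nio.ByteBuffer` stands in for ByteArrayDataInput.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

/** Hypothetical illustration only; not the Lucene implementation. */
class SharedBlockBuffers {
    /** Builds a toy block laid out as [termsLength:int][terms bytes][details bytes]. */
    static byte[] buildBlock(String terms, String details) {
        byte[] t = terms.getBytes(StandardCharsets.UTF_8);
        byte[] d = details.getBytes(StandardCharsets.UTF_8);
        return ByteBuffer.allocate(4 + t.length + d.length)
                .putInt(t.length).put(t).put(d).array();
    }

    /** Reads `length` bytes at the buffer's current position as UTF-8 text. */
    static String readString(ByteBuffer buffer, int length) {
        byte[] bytes = new byte[length];
        buffer.get(bytes);
        return new String(bytes, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        byte[] block = buildBlock("apple,banana", "df=3;df=7");
        // Two cursors over the same array: no copy, independent positions.
        ByteBuffer termsCursor = ByteBuffer.wrap(block);
        ByteBuffer detailsCursor = termsCursor.duplicate();

        int termsLength = termsCursor.getInt();
        // The details cursor is only positioned here; it is read on demand.
        detailsCursor.position(4 + termsLength);

        System.out.println(readString(termsCursor, termsLength));
        System.out.println(readString(detailsCursor, detailsCursor.remaining()));
    }
}
```

Because both cursors wrap the same array, scanning the terms never forces the details region to be decoded or copied.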
Nested classes/interfaces inherited from class TermsEnum: TermsEnum.SeekStatus
Modifier and Type | Field and Description |
---|---|
protected BlockDecoder | blockDecoder |
protected int | blockFirstLineStart - Offset of the start of the first line of the current block (just after the header), relative to the block start. |
protected BlockHeader | blockHeader - Current block header. |
protected IndexInput | blockInput - IndexInput on the block file. |
protected BlockLine | blockLine - Current block line. |
protected BlockLine.Serializer | blockLineReader |
protected ByteArrayDataInput | blockReadBuffer - In-memory read buffer for the current block. |
protected long | blockStartFP - Current block start file pointer, absolute in the block file. |
protected IndexDictionary.Browser | dictionaryBrowser - Holds the IndexDictionary.Browser once loaded. |
protected Supplier<IndexDictionary.Browser> | dictionaryBrowserSupplier - IndexDictionary.Browser supplier for lazy loading. |
protected FieldMetadata | fieldMetadata |
protected BytesRefBuilder | forcedTerm - Set when seekExact(BytesRef, TermState) is called. |
protected int | lineIndexInBlock - Current line index in the block. |
protected PostingsReaderBase | postingsReader |
protected BytesRef | scratchBlockBytes |
protected BlockTermState | scratchTermState |
protected BlockTermState | termState - Current block line details. |
protected boolean | termStateForced - Whether the current TermState has been forced with a call to seekExact(BytesRef, TermState). |
protected DeltaBaseTermStateSerializer | termStateSerializer |
protected ByteArrayDataInput | termStatesReadBuffer - In-memory read buffer for the details region of the current block. |
Modifier | Constructor and Description |
---|---|
protected | BlockReader(Supplier<IndexDictionary.Browser> dictionaryBrowserSupplier, IndexInput blockInput, PostingsReaderBase postingsReader, FieldMetadata fieldMetadata, BlockDecoder blockDecoder) |
Modifier and Type | Method and Description |
---|---|
protected void | clearTermState() |
protected int | compareToMiddleAndJump(BytesRef searchedTerm) - Compares the searched term to the middle term of the block. |
protected BytesRef | decodeBlockBytesIfNeeded(int numBlockBytes) |
int | docFreq() - Returns the number of documents containing the current term. |
protected IndexDictionary.Browser | getOrCreateDictionaryBrowser() |
ImpactsEnum | impacts(int flags) - Returns an ImpactsEnum. |
protected void | initializeBlockReadLazily() |
protected void | initializeHeader(BytesRef searchedTerm, long targetBlockStartFP) - Reads and sets blockHeader. |
protected boolean | isBeyondLastTerm(BytesRef searchedTerm, long blockStartFP) - Indicates whether the searched term is beyond the last term of the field. |
protected boolean | isCurrentTerm(BytesRef searchedTerm) |
BytesRef | next() - Increments the iteration to the next BytesRef in the iterator. |
protected BytesRef | nextTerm() - Moves to the next term line and reads it; it may be in the next block. |
long | ord() - Returns ordinal position for current term. |
PostingsEnum | postings(PostingsEnum reuse, int flags) - Get PostingsEnum for the current term, with control over whether freqs, positions, offsets or payloads are required. |
long | ramBytesUsed() - Returns the memory usage of this object in bytes. |
protected BlockHeader | readHeader() - Reads the block header. |
protected BlockLine | readLineInBlock() - Reads the current block line. |
protected BlockTermState | readTermState() - Reads the BlockTermState on the current line. |
protected BlockTermState | readTermStateIfNotRead() - Reads the BlockTermState if it is not already set. |
TermsEnum.SeekStatus | seekCeil(BytesRef searchedTerm) - Seeks to the specified term, if it exists, or to the next (ceiling) term. |
boolean | seekExact(BytesRef searchedTerm) - Attempts to seek to the exact term, returning true if the term is found. |
void | seekExact(BytesRef term, TermState state) - Positions this BlockReader without re-seeking the term dictionary. |
void | seekExact(long ord) - Not supported. |
protected TermsEnum.SeekStatus | seekInBlock(BytesRef searchedTerm) - Seeks to the provided term in this block. |
protected TermsEnum.SeekStatus | seekInBlock(BytesRef searchedTerm, long blockStartFP) - Seeks to the provided term in the block starting at the provided file pointer. |
BytesRef | term() - Returns current term. |
TermState | termState() - Expert: Returns the TermsEnum's internal state to position the TermsEnum without re-seeking the term dictionary. |
long | totalTermFreq() - Returns the total number of occurrences of this term across all documents (the sum of the freq() for each doc that has this term). |
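Methods inherited from class BaseTermsEnum: attributes
Methods inherited from class Object: clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
Methods inherited from interface Accountable: getChildResources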
protected IndexInput blockInput
IndexInput on the block file.

protected final PostingsReaderBase postingsReader

protected final FieldMetadata fieldMetadata

protected final BlockDecoder blockDecoder

protected BlockLine.Serializer blockLineReader

protected ByteArrayDataInput blockReadBuffer
In-memory read buffer for the current block.

protected ByteArrayDataInput termStatesReadBuffer
In-memory read buffer for the details region of the current block. It shares the same byte array as blockReadBuffer, with a different position.

protected DeltaBaseTermStateSerializer termStateSerializer

protected final Supplier<IndexDictionary.Browser> dictionaryBrowserSupplier
IndexDictionary.Browser supplier for lazy loading.

protected IndexDictionary.Browser dictionaryBrowser
Holds the IndexDictionary.Browser once loaded.

protected long blockStartFP
Current block start file pointer, absolute in the block file.

protected BlockHeader blockHeader
Current block header.

protected BlockLine blockLine
Current block line.

protected BlockTermState termState
Current block line details.

protected int blockFirstLineStart
Offset of the start of the first line of the current block (just after the header), relative to the block start.

protected int lineIndexInBlock
Current line index in the block.

protected boolean termStateForced
Whether the current TermState has been forced with a call to seekExact(BytesRef, TermState).
See Also: forcedTerm

protected BytesRefBuilder forcedTerm
Set when seekExact(BytesRef, TermState) is called.
This optimizes the use case where the caller first calls seekExact(BytesRef, TermState) and then postings(PostingsEnum, int). In this case we don't access the terms block file (we don't seek) but go directly to the postings file, because we already have the TermState with the file pointers to the postings file.
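The shortcut described for forcedTerm can be sketched with a toy reader that counts how often it touches its terms block. Everything here is hypothetical (`ForcedStateReader`, `ToyTermState`, `readBlockIfNeeded` and the `blockReads` counter are illustration-only names); it only mirrors the behavior stated above, not the actual Lucene code.

```java
/** Hypothetical sketch of the "forced term state" shortcut; names are illustrative. */
class ForcedStateReader {
    /** Toy stand-in for a TermState: it only carries a postings file pointer. */
    static final class ToyTermState {
        final long postingsFP;
        ToyTermState(long postingsFP) { this.postingsFP = postingsFP; }
    }

    int blockReads = 0;            // counts simulated accesses to the terms block file
    private String forcedTerm;     // set by seekExact(term, state)
    private ToyTermState termState;
    private boolean termStateForced;
    private boolean blockLoaded;

    /** Positions the reader from a saved state; the block is not read here. */
    void seekExact(String term, ToyTermState state) {
        forcedTerm = term;
        termState = state;
        termStateForced = true;
        blockLoaded = false;
    }

    /** Returns the current term, known from the forced state alone. */
    String term() {
        return forcedTerm;
    }

    /** Uses the forced state directly, so the terms block is never touched. */
    long postingsFilePointer() {
        if (!termStateForced) throw new IllegalStateException("no forced term state");
        return termState.postingsFP;
    }

    /** Operations that need the block content (such as iterating) load it lazily. */
    void readBlockIfNeeded() {
        if (!blockLoaded) {
            blockReads++;          // simulated block read
            blockLoaded = true;
        }
    }
}
```

In this sketch, calling `seekExact` followed by `postingsFilePointer` leaves `blockReads` at zero; the counter only increases once something calls `readBlockIfNeeded`.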
protected BytesRef scratchBlockBytes
protected final BlockTermState scratchTermState
protected BlockReader(Supplier<IndexDictionary.Browser> dictionaryBrowserSupplier, IndexInput blockInput, PostingsReaderBase postingsReader, FieldMetadata fieldMetadata, BlockDecoder blockDecoder) throws IOException
Parameters:
dictionaryBrowserSupplier - to load the IndexDictionary.Browser lazily in seekCeil(BytesRef).
blockDecoder - Optional block decoder, may be null if none. It can be used for decompression or decryption.
Throws: IOException
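Lazy loading through a Supplier, as described for dictionaryBrowserSupplier, usually follows a simple load-once pattern. The sketch below is a generic, assumed version of that pattern (`LazyHolder` and `getOrCreate` are hypothetical names, and it is not thread-safe); it is not taken from the BlockReader source.

```java
import java.util.function.Supplier;

/** Hypothetical generic holder illustrating load-on-first-use through a Supplier. */
class LazyHolder<T> {
    private final Supplier<T> supplier;
    private T value; // stays null until the first call to getOrCreate()

    LazyHolder(Supplier<T> supplier) {
        this.supplier = supplier;
    }

    /** Invokes the supplier on the first call only, then reuses the result. */
    T getOrCreate() {
        if (value == null) {
            value = supplier.get();
        }
        return value;
    }

    public static void main(String[] args) {
        int[] loads = {0};
        LazyHolder<String> holder = new LazyHolder<>(() -> {
            loads[0]++;            // count how many times the expensive load runs
            return "browser";
        });
        holder.getOrCreate();
        holder.getOrCreate();
        System.out.println("loads = " + loads[0]);
    }
}
```

The benefit of this design is that a reader which is only ever positioned by term state never pays the cost of loading the dictionary.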
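public TermsEnum.SeekStatus seekCeil(BytesRef searchedTerm) throws IOException
Description copied from class: TermsEnum
Seeks to the specified term, if it exists, or to the next (ceiling) term.
Specified by: seekCeil in class TermsEnum
Throws: IOException

public boolean seekExact(BytesRef searchedTerm) throws IOException
Description copied from class: TermsEnum
Attempts to seek to the exact term, returning true if the term is found. If this returns false, the enum is unpositioned. For some codecs, seekExact may be substantially faster than TermsEnum.seekCeil(org.apache.lucene.util.BytesRef).
Overrides: seekExact in class BaseTermsEnum
Throws: IOException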
protected boolean isCurrentTerm(BytesRef searchedTerm)

protected boolean isBeyondLastTerm(BytesRef searchedTerm, long blockStartFP)
Indicates whether the searched term is beyond the last term of the field.
Parameters:
blockStartFP - The current block start file pointer.

protected TermsEnum.SeekStatus seekInBlock(BytesRef searchedTerm, long blockStartFP) throws IOException
Seeks to the provided term in the block starting at the provided file pointer.
Throws: IOException

protected TermsEnum.SeekStatus seekInBlock(BytesRef searchedTerm) throws IOException
Seeks to the provided term in this block.
Does not exceed this block; TermsEnum.SeekStatus.END is returned if the searched term follows the block.
Compares the line terms with the searchedTerm, taking advantage of the incremental encoding properties.
Scans the terms linearly. Updates the current block line with the current term.
Throws: IOException
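A linear scan over incrementally encoded terms, as described for seekInBlock(BytesRef), can be sketched as follows. This is a simplified illustration under stated assumptions: `PrefixBlockScan`, `Line` and `encode` are hypothetical, each line stores only the prefix length shared with the previous term plus a suffix, terms are Java strings rather than byte sequences, and the comparison shortcuts of the real implementation are not reproduced (each term is rebuilt and compared in full).

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch of a linear seek inside one incrementally encoded block. */
class PrefixBlockScan {
    enum SeekStatus { FOUND, NOT_FOUND, END }

    /** One encoded line: length of the prefix shared with the previous term, plus the suffix. */
    static final class Line {
        final int prefixLength;
        final String suffix;
        Line(int prefixLength, String suffix) {
            this.prefixLength = prefixLength;
            this.suffix = suffix;
        }
    }

    /** Encodes sorted terms incrementally (assumed toy format). */
    static List<Line> encode(List<String> sortedTerms) {
        List<Line> lines = new ArrayList<>();
        String previous = "";
        for (String term : sortedTerms) {
            int p = 0;
            int max = Math.min(previous.length(), term.length());
            while (p < max && previous.charAt(p) == term.charAt(p)) p++;
            lines.add(new Line(p, term.substring(p)));
            previous = term;
        }
        return lines;
    }

    String currentTerm; // term of the line the scan stopped on; null after END

    /** Scans the block linearly and never looks beyond it. */
    SeekStatus seekInBlock(List<Line> block, String searchedTerm) {
        StringBuilder term = new StringBuilder();
        for (Line line : block) {
            term.setLength(line.prefixLength); // keep the prefix shared with the previous term
            term.append(line.suffix);
            int cmp = term.toString().compareTo(searchedTerm);
            if (cmp >= 0) {
                currentTerm = term.toString();
                return cmp == 0 ? SeekStatus.FOUND : SeekStatus.NOT_FOUND;
            }
        }
        currentTerm = null; // the searched term follows every term of this block
        return SeekStatus.END;
    }
}
```

For a block holding car, card, care and dog, seeking "card" stops on that line with FOUND, seeking "cat" stops on "dog" with NOT_FOUND, and seeking "zebra" yields END.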
protected int compareToMiddleAndJump(BytesRef searchedTerm) throws IOException
Compares the searched term to the middle term of the block.
Throws: IOException

protected BlockLine readLineInBlock() throws IOException
Reads the current block line. Sets blockLine and increments lineIndexInBlock.
Returns: The BlockLine; or null if there is no more line in the block.
Throws: IOException
public void seekExact(BytesRef term, TermState state)
Positions this BlockReader without re-seeking the term dictionary.
The block containing the term is not read by this method. It will be read lazily only if needed, for example if next() is called. Calling postings(org.apache.lucene.index.PostingsEnum, int) after this method does not require the block to be read.
Overrides: seekExact in class BaseTermsEnum
Parameters:
term - the term the TermState corresponds to
state - the TermState
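public BytesRef next() throws IOException
Description copied from interface: BytesRefIterator
Increments the iteration to the next BytesRef in the iterator. Returns the resulting BytesRef or null if the end of the iterator is reached. The returned BytesRef may be re-used across calls to next. After this method returns null, do not call it again: the results are undefined.
Specified by: next in interface BytesRefIterator
Returns: the next BytesRef in the iterator or null if the end of the iterator is reached.
Throws: IOException - If there is a low-level I/O error.

protected BytesRef nextTerm() throws IOException
Moves to the next term line and reads it; it may be in the next block. The term details are not read by this method; they are read only when needed, with readTermStateIfNotRead().
Throws: IOException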
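The next() contract described above (null marks the end, and the returned object may be reused between calls) shapes how callers should iterate. The sketch below uses a hypothetical `ToyIterator` that hands out one shared StringBuilder, as a stand-in for a reused BytesRef; none of these names come from Lucene.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical illustration of iterating when the returned buffer is reused. */
class ReusedBufferIteration {
    /** Toy iterator returning the same mutable buffer on every call. */
    static final class ToyIterator {
        private final String[] terms;
        private final StringBuilder shared = new StringBuilder();
        private int index = 0;

        ToyIterator(String... terms) {
            this.terms = terms;
        }

        /** Returns the shared buffer holding the next term, or null at the end. */
        StringBuilder next() {
            if (index >= terms.length) return null;
            shared.setLength(0);
            shared.append(terms[index++]);
            return shared;
        }
    }

    /** Collects all terms, copying each one because the buffer is overwritten by the next call. */
    static List<String> collect(ToyIterator it) {
        List<String> out = new ArrayList<>();
        // Stop at the first null and do not call next() again afterwards.
        for (StringBuilder term = it.next(); term != null; term = it.next()) {
            out.add(term.toString()); // defensive copy of the current content
        }
        return out;
    }
}
```

With Lucene's own types, the equivalent defensive copy is typically made with BytesRef.deepCopyOf(term) before the next call to next().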
protected void initializeHeader(BytesRef searchedTerm, long targetBlockStartFP) throws IOException
Reads and sets blockHeader. Sets it to null if there is no block for the field anymore.
Parameters:
searchedTerm - The searched term; or null if none.
targetBlockStartFP - The file pointer of the block to read.
Throws: IOException

protected void initializeBlockReadLazily()

protected BlockHeader readHeader() throws IOException
Reads the block header. Sets blockHeader.
Throws: IOException

protected BytesRef decodeBlockBytesIfNeeded(int numBlockBytes) throws IOException
Throws: IOException

protected BlockTermState readTermStateIfNotRead() throws IOException
Reads the BlockTermState if it is not already set. Sets termState.
Throws: IOException

protected BlockTermState readTermState() throws IOException
Reads the BlockTermState on the current line. Sets termState.
An overriding method may return null if there is no BlockTermState (in this case the extending class must support a null termState).
Returns: The BlockTermState; or null if none.
Throws: IOException
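public BytesRef term()
Description copied from class: TermsEnum
Returns current term.

public long ord()
Description copied from class: TermsEnum
Returns ordinal position for current term. This is an optional method (the codec may throw UnsupportedOperationException). Do not call this when the enum is unpositioned.

public int docFreq() throws IOException
Description copied from class: TermsEnum
Returns the number of documents containing the current term. Do not call this when the enum is unpositioned; see TermsEnum.SeekStatus.END.
Specified by: docFreq in class TermsEnum
Throws: IOException

public long totalTermFreq() throws IOException
Description copied from class: TermsEnum
Returns the total number of occurrences of this term across all documents (the sum of the freq() for each doc that has this term).
Specified by: totalTermFreq in class TermsEnum
Throws: IOException

public TermState termState() throws IOException
Description copied from class: TermsEnum
Expert: Returns the TermsEnum's internal state to position the TermsEnum without re-seeking the term dictionary.
NOTE: A seek by TermState might not capture the AttributeSource's state. Callers must maintain the AttributeSource states separately.
Overrides: termState in class BaseTermsEnum
Throws: IOException
See Also: TermState, TermsEnum.seekExact(BytesRef, TermState)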
public PostingsEnum postings(PostingsEnum reuse, int flags) throws IOException
Description copied from class: TermsEnum
Get PostingsEnum for the current term, with control over whether freqs, positions, offsets or payloads are required. Do not call this when the enum is unpositioned. This method will not return null.
NOTE: the returned iterator may return deleted documents, so deleted documents have to be checked on top of the PostingsEnum.
Specified by: postings in class TermsEnum
Parameters:
reuse - pass a prior PostingsEnum for possible reuse
flags - specifies which optional per-document values you require; see PostingsEnum.FREQS
Throws: IOException

public ImpactsEnum impacts(int flags) throws IOException
Description copied from class: TermsEnum
Returns an ImpactsEnum.
Specified by: impacts in class TermsEnum
Throws: IOException
See Also: TermsEnum.postings(PostingsEnum, int)
public long ramBytesUsed()
Description copied from interface: Accountable
Returns the memory usage of this object in bytes.
Specified by: ramBytesUsed in interface Accountable
protected IndexDictionary.Browser getOrCreateDictionaryBrowser()
protected void clearTermState()
Copyright © 2000–2020 The Apache Software Foundation. All rights reserved.