org.apache.lucene.index.TermsEnum.seekExact Java code examples

@Override
public boolean seekExact(BytesRef text) throws IOException {
 return actualEnum.seekExact(text);
}

@Override
public void seekExact(long ord) throws IOException {
 actualEnum.seekExact(ord);
}

@Override
public void seekExact(long ord) throws IOException {
 in.seekExact(ord);
}

@Override
public void seekExact(BytesRef term, TermState state) throws IOException {
 actualEnum.seekExact(term, state);
}

/**
 * Expert: Seeks a specific position by {@link TermState} previously obtained
 * from {@link #termState()}. Callers should maintain the {@link TermState} to
 * use this method. Low-level implementations may position the TermsEnum
 * without re-seeking the term dictionary.
 * <p>
 * Seeking by {@link TermState} should be used only if the state was obtained
 * from the same {@link TermsEnum} instance.
 * <p>
 * NOTE: Using this method with an incompatible {@link TermState} might leave
 * this {@link TermsEnum} in undefined state. On a segment level
 * {@link TermState} instances are compatible only iff the source and the
 * target {@link TermsEnum} operate on the same field. If operating on segment
 * level, TermState instances must not be used across segments.
 * <p>
 * NOTE: A seek by {@link TermState} might not restore the
 * {@link AttributeSource}'s state. {@link AttributeSource} states must be
 * maintained separately if this method is used.
 * @param term the term the TermState corresponds to
 * @param state the {@link TermState}
 * */
public void seekExact(BytesRef term, TermState state) throws IOException {
 if (!seekExact(term)) {
  throw new IllegalArgumentException("term=" + term + " does not exist");
 }
}
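
The contract described in the Javadoc above (save a state once, then re-position from it without searching the dictionary again) can be sketched with a toy model. This is not Lucene code: `StatefulToyEnum`, `ToyState`, and the `dictionaryLookups` counter are hypothetical names invented for illustration, and a sorted array stands in for the term dictionary. The consistency check in the two-argument `seekExact` is a convenience of the toy; the real API makes no such promise for incompatible states.

```java
import java.util.Arrays;

/** Toy model only: a sorted array stands in for the term dictionary. */
final class StatefulToyEnum {
    /** Hypothetical analogue of TermState: an opaque saved position. */
    static final class ToyState {
        private final int index;
        private ToyState(int index) { this.index = index; }
    }

    private final String[] sortedTerms;
    private int pos = -1;       // -1 means "unpositioned"
    int dictionaryLookups = 0;  // counts binary searches, for demonstration only

    StatefulToyEnum(String... terms) {
        sortedTerms = terms.clone();
        Arrays.sort(sortedTerms);
    }

    /** Seeks by searching the dictionary. */
    boolean seekExact(String text) {
        dictionaryLookups++;
        int i = Arrays.binarySearch(sortedTerms, text);
        pos = (i >= 0) ? i : -1;
        return i >= 0;
    }

    /** Captures the current position so it can be restored later. */
    ToyState termState() {
        if (pos < 0) throw new IllegalStateException("enum is unpositioned");
        return new ToyState(pos);
    }

    /** Re-positions from a saved state without searching the dictionary. */
    void seekExact(String text, ToyState state) {
        if (!sortedTerms[state.index].equals(text)) {
            throw new IllegalArgumentException("state does not belong to term=" + text);
        }
        pos = state.index;
    }

    String term() {
        if (pos < 0) throw new IllegalStateException("enum is unpositioned");
        return sortedTerms[pos];
    }

    public static void main(String[] args) {
        StatefulToyEnum te = new StatefulToyEnum("apple", "banana", "cherry");
        te.seekExact("banana");            // first dictionary lookup
        ToyState saved = te.termState();
        te.seekExact("apple");             // second dictionary lookup
        te.seekExact("banana", saved);     // restored from state, no lookup
        System.out.println(te.term() + " after " + te.dictionaryLookups + " lookups");
        // prints "banana after 2 lookups"
    }
}
```

In the toy, the saved state is only meaningful for the enum it came from, which mirrors the Javadoc's warning about reusing a state across fields or segments.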

@Override
public BytesRef lookupOrd(int ord) throws IOException {
 termsEnum.seekExact(ord);
 return termsEnum.term();
}

@Override
public BytesRef lookupOrd(long ord) throws IOException {
 termsEnum.seekExact(ord);
 return termsEnum.term();
}
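
The two `lookupOrd` snippets above rely on seeking by ordinal rather than by term bytes. As a rough illustration of that idea, the sketch below treats an ordinal as the index of a term in sorted order. `ToyOrdLookup` is a hypothetical class made up for this example; it is not how any Lucene codec stores terms.

```java
import java.util.Arrays;

/** Toy model only: the ordinal of a term is its index in sorted order. */
final class ToyOrdLookup {
    private final String[] sortedTerms;
    private int pos = -1; // -1 means "unpositioned"

    ToyOrdLookup(String... terms) {
        sortedTerms = terms.clone();
        Arrays.sort(sortedTerms);
    }

    /** Positions directly on the term with the given ordinal. */
    void seekExact(long ord) {
        if (ord < 0 || ord >= sortedTerms.length) {
            throw new IndexOutOfBoundsException("ord=" + ord);
        }
        pos = (int) ord;
    }

    String term() {
        if (pos < 0) throw new IllegalStateException("enum is unpositioned");
        return sortedTerms[pos];
    }

    /** Same shape as the lookupOrd snippets: seek by ordinal, then read the term. */
    String lookupOrd(long ord) {
        seekExact(ord);
        return term();
    }

    public static void main(String[] args) {
        ToyOrdLookup lookup = new ToyOrdLookup("cherry", "apple", "banana");
        System.out.println(lookup.lookupOrd(0)); // prints "apple"
        System.out.println(lookup.lookupOrd(2)); // prints "cherry"
    }
}
```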

/** Returns {@link PostingsEnum} for the specified
 *  field and term, with control over whether offsets and payloads are
 *  required.  Some codecs may be able to optimize
 *  their implementation when offsets and/or payloads are not
 *  required. This will return null if the field or term does not
 *  exist. See {@link TermsEnum#postings(PostingsEnum,int)}. */
public static PostingsEnum getTermPositionsEnum(IndexReader r, String field, BytesRef term, int flags) throws IOException {
 assert field != null;
 assert term != null;
 final Terms terms = getTerms(r, field);
 if (terms != null) {
  final TermsEnum termsEnum = terms.iterator();
  if (termsEnum.seekExact(term)) {
   return termsEnum.postings(null, flags);
  }
 }
 return null;
}

/** Returns {@link PostingsEnum} for the specified field and
 *  term, with control over whether freqs are required.
 *  Some codecs may be able to optimize their
 *  implementation when freqs are not required.  This will
 *  return null if the field or term does not exist.  See {@link
 *  TermsEnum#postings(PostingsEnum,int)}.*/
public static PostingsEnum getTermDocsEnum(IndexReader r, String field, BytesRef term, int flags) throws IOException {
 assert field != null;
 assert term != null;
 final Terms terms = getTerms(r, field);
 if (terms != null) {
  final TermsEnum termsEnum = terms.iterator();
  if (termsEnum.seekExact(term)) {
   return termsEnum.postings(null, flags);
  }
 }
 return null;
}

@Override
public Scorer scorer(LeafReaderContext context) throws IOException {
 Similarity.SimScorer simScorer = similarity.simScorer(simWeight, context);
 // we use termscorers + disjunction as an impl detail
 List<Scorer> subScorers = new ArrayList<>();
 for (int i = 0; i < terms.length; i++) {
  TermState state = termContexts[i].get(context.ord);
  if (state != null) {
   TermsEnum termsEnum = context.reader().terms(terms[i].field()).iterator();
   termsEnum.seekExact(terms[i].bytes(), state);
   PostingsEnum postings = termsEnum.postings(null, PostingsEnum.FREQS);
   subScorers.add(new TermScorer(this, postings, simScorer));
  }
 }
 if (subScorers.isEmpty()) {
  return null;
 } else if (subScorers.size() == 1) {
  // we must optimize this case (term not in segment), disjunctionscorer requires >= 2 subs
  return subScorers.get(0);
 } else {
  return new SynonymScorer(simScorer, this, subScorers);
 }
}

@Override
public final int docFreq(Term term) throws IOException {
 final Terms terms = terms(term.field());
 if (terms == null) {
  return 0;
 }
 final TermsEnum termsEnum = terms.iterator();
 if (termsEnum.seekExact(term.bytes())) {
  return termsEnum.docFreq();
 } else {
  return 0;
 }
}

/** Returns the total number of occurrences of the term
 * <code>t</code> across all documents.  This method returns 0 if the term or
 * field does not exist.  This method does not take into
 * account deleted documents that have not yet been merged
 * away. */
@Override
public final long totalTermFreq(Term term) throws IOException {
 final Terms terms = terms(term.field());
 if (terms == null) {
  return 0;
 }
 final TermsEnum termsEnum = terms.iterator();
 if (termsEnum.seekExact(term.bytes())) {
  return termsEnum.totalTermFreq();
 } else {
  return 0;
 }
}

/**
 * Create a {@link DisjunctionMatchesIterator} over a list of terms extracted from a {@link BytesRefIterator}
 *
 * Only terms that have at least one match in the given document will be included
 */
static MatchesIterator fromTermsEnum(LeafReaderContext context, int doc, Query query, String field, BytesRefIterator terms) throws IOException {
 Objects.requireNonNull(field);
 List<MatchesIterator> mis = new ArrayList<>();
 Terms t = context.reader().terms(field);
 if (t == null)
  return null;
 TermsEnum te = t.iterator();
 PostingsEnum reuse = null;
 for (BytesRef term = terms.next(); term != null; term = terms.next()) {
  if (te.seekExact(term)) {
   PostingsEnum pe = te.postings(reuse, PostingsEnum.OFFSETS);
   if (pe.advance(doc) == doc) {
    mis.add(new TermMatchesIterator(query, pe));
    reuse = null;
   }
   else {
    reuse = pe;
   }
  }
 }
 return fromSubIterators(mis);
}

/** Returns {@link PostingsEnum} for the specified term.
 *  This will return null if either the field or
 *  term does not exist.
 *  <p><b>NOTE:</b> The returned {@link PostingsEnum} may contain deleted docs.
 *  @see TermsEnum#postings(PostingsEnum) */
public final PostingsEnum postings(Term term, int flags) throws IOException {
 assert term.field() != null;
 assert term.bytes() != null;
 final Terms terms = terms(term.field());
 if (terms != null) {
  final TermsEnum termsEnum = terms.iterator();
  if (termsEnum.seekExact(term.bytes())) {
   return termsEnum.postings(null, flags);
  }
 }
 return null;
}

@Override
public Scorer scorer(LeafReaderContext context) throws IOException {
 Terms terms = context.reader().terms(fieldName);
 if (terms == null) {
  return null;
 }
 TermsEnum termsEnum = terms.iterator();
 if (termsEnum.seekExact(new BytesRef(featureName)) == false) {
  return null;
 }
 SimScorer scorer = function.scorer(fieldName, boost);
 PostingsEnum postings = termsEnum.postings(null, PostingsEnum.FREQS);
 return new Scorer(this) {
  @Override
  public int docID() {
   return postings.docID();
  }
  @Override
  public float score() throws IOException {
   return scorer.score(postings.docID(), postings.freq());
  }
  @Override
  public DocIdSetIterator iterator() {
   return postings;
  }
 };
}

/**
 * Returns a {@link TermsEnum} positioned at this weight's Term, or null if
 * the term does not exist in the given context
 */
private TermsEnum getTermsEnum(LeafReaderContext context) throws IOException {
 if (termStates != null) {
  // TermQuery either used as a Query or the term states have been provided at construction time
  assert termStates.wasBuiltFor(ReaderUtil.getTopLevelContext(context)) : "The top-reader used to create Weight is not the same as the current reader's top-reader (" + ReaderUtil.getTopLevelContext(context);
  final TermState state = termStates.get(context.ord);
  if (state == null) { // term is not present in that reader
   assert termNotInReader(context.reader(), term) : "no termstate found but term exists in reader term=" + term;
   return null;
  }
  final TermsEnum termsEnum = context.reader().terms(term.field()).iterator();
  termsEnum.seekExact(term.bytes(), state);
  return termsEnum;
 } else {
  // TermQuery used as a filter, so the term states have not been built up front
  Terms terms = context.reader().terms(term.field());
  if (terms == null) {
   return null;
  }
  final TermsEnum termsEnum = terms.iterator();
  if (termsEnum.seekExact(term.bytes())) {
   return termsEnum;
  } else {
   return null;
  }
 }
}

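te.seekExact(t.bytes(), state);
// PostingsEnum.POSITIONS has the value 24; the named constant replaces the magic number
PostingsEnum postingsEnum = te.postings(null, PostingsEnum.POSITIONS);
postingsFreqs[i] = new CustomPhraseQuery.PostingsAndFreq(postingsEnum, query.positions[i], t);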

@Override
public Explanation explain(LeafReaderContext context, int doc) throws IOException {
 String desc = "weight(" + getQuery() + " in " + doc + ") [" + function + "]";
 Terms terms = context.reader().terms(fieldName);
 if (terms == null) {
  return Explanation.noMatch(desc + ". Field " + fieldName + " doesn't exist.");
 }
 TermsEnum termsEnum = terms.iterator();
 if (termsEnum.seekExact(new BytesRef(featureName)) == false) {
  return Explanation.noMatch(desc + ". Feature " + featureName + " doesn't exist.");
 }
 PostingsEnum postings = termsEnum.postings(null, PostingsEnum.FREQS);
 if (postings.advance(doc) != doc) {
  return Explanation.noMatch(desc + ". Feature " + featureName + " isn't set.");
 }
 return function.explain(fieldName, featureName, boost, doc, postings.freq());
}

/**
 * Creates a {@link TermContext} from a top-level {@link IndexReaderContext} and the
 * given {@link Term}. This method will look up the given term in all context's leaf readers
 * and register each of the readers containing the term in the returned {@link TermContext}
 * using the leaf reader's ordinal.
 * <p>
 * Note: the given context must be a top-level context.
 */
public static TermContext build(IndexReaderContext context, Term term)
  throws IOException {
 assert context != null && context.isTopLevel;
 final String field = term.field();
 final BytesRef bytes = term.bytes();
 final TermContext perReaderTermState = new TermContext(context);
 //if (DEBUG) System.out.println("prts.build term=" + term);
 for (final LeafReaderContext ctx : context.leaves()) {
  //if (DEBUG) System.out.println("  r=" + leaves[i].reader);
  final Terms terms = ctx.reader().terms(field);
  if (terms != null) {
   final TermsEnum termsEnum = terms.iterator();
   if (termsEnum.seekExact(bytes)) { 
    final TermState termState = termsEnum.termState();
    //if (DEBUG) System.out.println("    found");
    perReaderTermState.register(termState, ctx.ord, termsEnum.docFreq(), termsEnum.totalTermFreq());
   }
  }
 }
 return perReaderTermState;
}
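
The `build` method above walks every leaf, seeks the term, and registers a per-leaf state together with aggregated statistics. A minimal sketch of that bookkeeping follows, under loud assumptions: `ToyTermStates` is a hypothetical class, each segment is modelled as a plain map from term to `{docFreq, totalTermFreq}`, and a map lookup stands in for `seekExact` plus `termState()`.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Map;

/** Toy model only: collects per-segment state and summed statistics for one term. */
final class ToyTermStates {
    final int[][] states;  // states[ord] stays null when the segment lacks the term
    int docFreq;
    long totalTermFreq;

    ToyTermStates(int segmentCount) {
        states = new int[segmentCount][];
    }

    /** Records the state for one segment and adds its statistics to the totals. */
    void register(int[] state, int ord, int df, long ttf) {
        states[ord] = state;
        docFreq += df;
        totalTermFreq += ttf;
    }

    /** Each map is a stand-in segment: term -> {docFreq, totalTermFreq}. */
    static ToyTermStates build(List<Map<String, int[]>> segments, String term) {
        ToyTermStates result = new ToyTermStates(segments.size());
        for (int ord = 0; ord < segments.size(); ord++) {
            int[] stats = segments.get(ord).get(term); // stands in for seekExact + termState
            if (stats != null) {
                result.register(stats, ord, stats[0], stats[1]);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<Map<String, int[]>> segments = Arrays.asList(
            Collections.singletonMap("lucene", new int[] {2, 5}),
            Collections.singletonMap("solr", new int[] {1, 1}),
            Collections.singletonMap("lucene", new int[] {3, 4}));
        ToyTermStates ts = build(segments, "lucene");
        System.out.println(ts.docFreq + " docs, " + ts.totalTermFreq + " occurrences");
        // prints "5 docs, 9 occurrences"
    }
}
```

A null entry in `states` plays the same role as the `state == null` checks in the scorer snippets: it marks a segment that can be skipped without touching its term dictionary.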

@Override
public Spans getSpans(final LeafReaderContext context, Postings requiredPostings) throws IOException {
 assert termContext.wasBuiltFor(ReaderUtil.getTopLevelContext(context)) : "The top-reader used to create Weight is not the same as the current reader's top-reader (" + ReaderUtil.getTopLevelContext(context);
 final TermState state = termContext.get(context.ord);
 if (state == null) { // term is not present in that reader
  assert context.reader().docFreq(term) == 0 : "no termstate found but term exists in reader term=" + term;
  return null;
 }
 final Terms terms = context.reader().terms(term.field());
 if (terms == null)
  return null;
 if (terms.hasPositions() == false)
  throw new IllegalStateException("field \"" + term.field() + "\" was indexed without position data; cannot run SpanTermQuery (term=" + term.text() + ")");
 final TermsEnum termsEnum = terms.iterator();
 termsEnum.seekExact(term.bytes(), state);
 final PostingsEnum postings = termsEnum.postings(null, requiredPostings.getRequiredPostings());
 float positionsCost = termPositionsCost(termsEnum) * PHRASE_TO_SPAN_TERM_POSITIONS_COST;
 return new TermSpans(getSimScorer(context), postings, term, positionsCost);
}

Javadoc

Attempts to seek to the exact term, returning true if the term is found. If this returns false, the enum is unpositioned. For some codecs, seekExact may be substantially faster than #seekCeil.
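
To make the difference between the two seek styles concrete, here is a small self-contained sketch. It is a toy, not Lucene: `ToyTermsEnum` is a hypothetical class over a sorted string array, and its `SeekStatus` enum only mirrors the names of Lucene's `TermsEnum.SeekStatus` values. It shows `seekExact` leaving the enum unpositioned on a miss, while `seekCeil` lands on the next larger term.

```java
import java.util.Arrays;

/** Toy stand-in for a terms dictionary; it only models the seek semantics. */
final class ToyTermsEnum {
    enum SeekStatus { FOUND, NOT_FOUND, END }

    private final String[] sortedTerms;
    private int pos = -1; // -1 means "unpositioned"

    ToyTermsEnum(String... terms) {
        sortedTerms = terms.clone();
        Arrays.sort(sortedTerms);
    }

    /** Positions on the term only if it exists; otherwise leaves the enum unpositioned. */
    boolean seekExact(String text) {
        int i = Arrays.binarySearch(sortedTerms, text);
        pos = (i >= 0) ? i : -1;
        return i >= 0;
    }

    /** Positions on the term if present, else on the smallest term greater than it. */
    SeekStatus seekCeil(String text) {
        int i = Arrays.binarySearch(sortedTerms, text);
        if (i >= 0) {
            pos = i;
            return SeekStatus.FOUND;
        }
        int insertionPoint = -i - 1;
        if (insertionPoint == sortedTerms.length) {
            pos = -1; // no term is greater than the target
            return SeekStatus.END;
        }
        pos = insertionPoint;
        return SeekStatus.NOT_FOUND;
    }

    boolean isPositioned() {
        return pos >= 0;
    }

    /** Current term; only valid while the enum is positioned. */
    String term() {
        if (pos < 0) throw new IllegalStateException("enum is unpositioned");
        return sortedTerms[pos];
    }

    public static void main(String[] args) {
        ToyTermsEnum te = new ToyTermsEnum("apple", "cherry", "banana");
        System.out.println(te.seekExact("banana"));    // prints "true"
        System.out.println(te.seekExact("blueberry")); // prints "false"
        System.out.println(te.seekCeil("blueberry") + " -> " + te.term());
        // prints "NOT_FOUND -> cherry"
    }
}
```

The toy uses the same binary search for both methods, so it says nothing about the speed difference the Javadoc mentions; that depends on the codec.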

Popular methods of TermsEnum

next
docFreq
Returns the number of documents containing the current term. Do not call this when the enum is unpos
totalTermFreq
Returns the total number of occurrences of this term across all documents (the sum of the freq() for
term
Returns current term. Do not call this when the enum is unpositioned.
postings
seekCeil
Seeks to the specified term, if it exists, or to the next (ceiling) term. Returns SeekStatus to indi
ord
Returns ordinal position for current term. This is an optional method (the codec may throw Unsupport
attributes
Returns the related attributes.
termState
Expert: Returns the TermsEnums internal state to position the TermsEnum without re-seeking the term
docs
Get DocsEnum for the current term, with control over whether freqs are required. Do not call this wh
docsAndPositions
Get DocsAndPositionsEnum for the current term, with control over whether offsets and payloads are re
getComparator

How to use the seekExact method in org.apache.lucene.index.TermsEnum

Best Java code snippets using org.apache.lucene.index.TermsEnum.seekExact (Showing top 20 results out of 315)