TextDocumentStatistics.avgNumWords

How to use the avgNumWords method in de.l3s.boilerpipe.document.TextDocumentStatistics

Best Java code snippets using de.l3s.boilerpipe.document.TextDocumentStatistics.avgNumWords (Showing top 4 results out of 315)

origin: de.l3s.boilerpipe/boilerpipe

/**
 * Given the statistics of the document before and after applying the {@link BoilerpipeExtractor},
 * can we regard the extraction quality (too) low?
 * 
 * Works well with {@link DefaultExtractor}, {@link ArticleExtractor} and others.
 * 
 * @param dsBefore statistics of the document before extraction
 * @param dsAfter statistics of the document after extraction
 * @return true if low quality is to be expected.
 */
public boolean isLowQuality(final TextDocumentStatistics dsBefore, final TextDocumentStatistics dsAfter) {
  if (dsBefore.getNumWords() < 90 || dsAfter.getNumWords() < 70) {
    return true;
  }
  if (dsAfter.avgNumWords() < 25) {
    return true;
  }
  return false;
}
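
To see how the three thresholds interact, the sketch below applies the same checks to plain numbers. The class QualityCheckSketch and its parameter names are hypothetical and exist only for illustration; the thresholds 90, 70, and 25 are copied from the method above, and passing the average as a float is an assumption.

```java
// Hypothetical stand-in for illustration; not part of boilerpipe.
final class QualityCheckSketch {

    // Mirrors the checks of isLowQuality, but takes raw numbers instead of
    // TextDocumentStatistics objects.
    static boolean isLowQuality(int wordsBefore, int wordsAfter, float avgWordsPerBlockAfter) {
        // Too little text overall, either before or after extraction.
        if (wordsBefore < 90 || wordsAfter < 70) {
            return true;
        }
        // Enough text, but spread over very short blocks.
        return avgWordsPerBlockAfter < 25;
    }

    public static void main(String[] args) {
        System.out.println(isLowQuality(500, 300, 60f));   // false
        System.out.println(isLowQuality(80, 75, 40f));     // true: fewer than 90 words before extraction
        System.out.println(isLowQuality(500, 300, 12.5f)); // true: average block is shorter than 25 words
    }
}
```

Note that the average is only consulted once both word-count checks have passed, so a short document is flagged regardless of how its words are distributed across blocks.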
origin: com.syncthemall/boilerpipe, pvdlg/boilerpipe, Netbreeze-GmbH/boilerpipe

These three repositories contain the same isLowQuality method, with an identical body and identical thresholds, as the de.l3s.boilerpipe/boilerpipe snippet above.

Javadoc

Returns the average number of words at block-level (= overall number of words divided by the number of blocks).
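
The ratio described above can be sketched as follows. This is a minimal illustration, assuming a document is represented only by a list of per-block word counts; BlockStatsSketch is a hypothetical stand-in, not the boilerpipe class. The float return type and the handling of a document with zero blocks (this sketch returns 0) are assumptions, since the page does not show them.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-in for illustration; not the boilerpipe implementation.
final class BlockStatsSketch {
    private final int numWords;
    private final int numBlocks;

    BlockStatsSketch(List<Integer> wordsPerBlock) {
        int sum = 0;
        for (int words : wordsPerBlock) {
            sum += words;
        }
        this.numWords = sum;
        this.numBlocks = wordsPerBlock.size();
    }

    // Overall number of words in all blocks.
    int getNumWords() {
        return numWords;
    }

    // Overall number of words divided by the number of blocks.
    float avgNumWords() {
        if (numBlocks == 0) {
            return 0f; // assumption: avoid dividing by zero for an empty document
        }
        // Cast before dividing so the result is not truncated to an integer.
        return numWords / (float) numBlocks;
    }

    public static void main(String[] args) {
        BlockStatsSketch stats = new BlockStatsSketch(Arrays.asList(40, 10, 25));
        System.out.println(stats.getNumWords()); // 75
        System.out.println(stats.avgNumWords()); // 25.0
    }
}
```

With three blocks of 40, 10, and 25 words the average is exactly 25, which sits right at the threshold used by isLowQuality: the check is a strict less-than, so this document would not be flagged on the average alone.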

Popular methods of TextDocumentStatistics

  • getNumWords
    Returns the overall number of words in all blocks.
