HTMLHighlighter.setOutputHighlightOnly

How to use the setOutputHighlightOnly method in de.l3s.boilerpipe.sax.HTMLHighlighter

Best Java code snippets using de.l3s.boilerpipe.sax.HTMLHighlighter.setOutputHighlightOnly (Showing top 5 results out of 315)

origin: de.l3s.boilerpipe/boilerpipe

private HTMLHighlighter(final boolean extractHTML) {
  if (extractHTML) {
    setOutputHighlightOnly(true);
    setExtraStyleSheet("");
    setPreHighlight("");
    setPostHighlight("");
  }
}
origin: com.syncthemall/boilerpipe

private HTMLHighlighter(final boolean extractHTML) {
  if (extractHTML) {
    setOutputHighlightOnly(true);
    setExtraStyleSheet("\n<style type=\"text/css\">\n"
        + "A:before { content:' '; } \n" //
        + "A:after { content:' '; } \n" //
        + "SPAN:before { content:' '; } \n" //
        + "SPAN:after { content:' '; } \n" //
        + "</style>\n");
    setPreHighlight("");
    setPostHighlight("");
  }
}
origin: pvdlg/boilerpipe

private HTMLHighlighter(final boolean extractHTML) {
  if (extractHTML) {
    setOutputHighlightOnly(true);
    setExtraStyleSheet("\n<style type=\"text/css\">\n"
        + "A:before { content:' '; } \n" //
        + "A:after { content:' '; } \n" //
        + "SPAN:before { content:' '; } \n" //
        + "SPAN:after { content:' '; } \n" //
        + "</style>\n");
    setPreHighlight("");
    setPostHighlight("");
  }
}
origin: Netbreeze-GmbH/boilerpipe

private HTMLHighlighter(final boolean extractHTML) {
  if (extractHTML) {
    setOutputHighlightOnly(true);
    setExtraStyleSheet("\n<style type=\"text/css\">\n"
        + "A:before { content:' '; } \n" //
        + "A:after { content:' '; } \n" //
        + "SPAN:before { content:' '; } \n" //
        + "SPAN:after { content:' '; } \n" //
        + "</style>\n");
    setPreHighlight("");
    setPostHighlight("");
  }
}
origin: Netbreeze-GmbH/boilerpipe

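/**
 * Returns the article from a document with its basic HTML structure.
 *
 * @param htmlDoc the HTML document to extract the article from
 * @param docUri the URI of the document, used to resolve relative anchors in the document to absolute anchors
 * @param extractor the extractor used to identify the article content
 * @return the extracted article HTML, or null if processing fails
 */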
public String process(HTMLDocument htmlDoc, URI docUri, final BoilerpipeExtractor extractor) {
  final HTMLHighlighter hh = HTMLHighlighter.newExtractingInstance();
  hh.setOutputHighlightOnly(true);
  TextDocument doc;
  String text = "";
  try {
    doc = new BoilerpipeSAXInput(htmlDoc.toInputSource()).getTextDocument();
    extractor.process(doc);
    final InputSource is = htmlDoc.toInputSource();
    text = hh.process(doc, is);
  } catch (Exception ex) {
    return null;
  }
  return removeNotAllowedTags(text, docUri);
}
de.l3s.boilerpipe.sax.HTMLHighlighter.setOutputHighlightOnly

Javadoc

Sets whether only HTML enclosed within highlighted content will be returned, or the whole HTML document.
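To make the effect of this flag concrete, here is a minimal, self-contained sketch of the idea. It is a toy model, not boilerpipe's implementation: `ToyHighlighter` and `Block` are hypothetical names invented for illustration, and the real `HTMLHighlighter` works on SAX events rather than a list of pre-classified blocks. The sketch only assumes what the Javadoc above states: with the flag off the whole document is returned, and with the flag on only the highlighted content is returned.

```java
import java.util.Arrays;
import java.util.List;

/**
 * Toy model of an "output highlight only" switch.
 * Hypothetical class for illustration; NOT part of boilerpipe.
 */
class ToyHighlighter {

    /** A fragment of HTML plus a flag saying whether it was classified as content. */
    static final class Block {
        final String html;
        final boolean isContent;

        Block(String html, boolean isContent) {
            this.html = html;
            this.isContent = isContent;
        }
    }

    private boolean outputHighlightOnly = false;
    private String preHighlight = "<span class=\"hl\">";
    private String postHighlight = "</span>";

    void setOutputHighlightOnly(boolean outputHighlightOnly) {
        this.outputHighlightOnly = outputHighlightOnly;
    }

    void setPreHighlight(String preHighlight) {
        this.preHighlight = preHighlight;
    }

    void setPostHighlight(String postHighlight) {
        this.postHighlight = postHighlight;
    }

    /** Content blocks are always emitted (wrapped); other blocks only when the flag is off. */
    String process(List<Block> blocks) {
        StringBuilder out = new StringBuilder();
        for (Block b : blocks) {
            if (b.isContent) {
                out.append(preHighlight).append(b.html).append(postHighlight);
            } else if (!outputHighlightOnly) {
                out.append(b.html);
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        List<Block> page = Arrays.asList(
                new Block("<nav>Menu</nav>", false),
                new Block("<p>Article text</p>", true));

        ToyHighlighter hh = new ToyHighlighter();
        System.out.println(hh.process(page)); // whole document, content wrapped

        // Mirror the "extracting" setup seen in the snippets above:
        // highlight-only output with empty pre/post markers.
        hh.setOutputHighlightOnly(true);
        hh.setPreHighlight("");
        hh.setPostHighlight("");
        System.out.println(hh.process(page)); // content only
    }
}
```

The snippets above pair `setOutputHighlightOnly(true)` with empty pre- and post-highlight strings for the same reason the sketch does: once only the content is returned, wrapping it in highlight markup adds nothing.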

Popular methods of HTMLHighlighter

  • <init>
  • process
    Fetches the given URL using HTMLFetcher and processes the retrieved HTML using the specified Boilerp
  • setExtraStyleSheet
    Sets the extra stylesheet definition that will be inserted in the HEAD element. To disable, set it t
  • setPostHighlight
    Sets the string that will be inserted after any highlighted HTML block. To disable, set it to the em
  • setPreHighlight
    Sets the string that will be inserted prior to any highlighted HTML block. To disable, set it to the
  • newExtractingInstance
    Creates a new HTMLHighlighter, which is set-up to return only the extracted HTML text, including enc
