BoilerpipeHTMLParser

How to use BoilerpipeHTMLParser in de.l3s.boilerpipe.sax

Best Java code snippets using de.l3s.boilerpipe.sax.BoilerpipeHTMLParser (Showing top 12 results out of 315)

origin: pvdlg/boilerpipe

/**
 * Retrieves the {@link TextDocument} using a default HTML parser.
 */
public TextDocument getTextDocument() throws BoilerpipeProcessingException {
  return getTextDocument(new BoilerpipeHTMLParser());
}

origin: Netbreeze-GmbH/boilerpipe

/**
 * Constructs a {@link BoilerpipeHTMLParser} using the given {@link BoilerpipeHTMLContentHandler}.
 *
 * @param contentHandler
 */
public BoilerpipeHTMLParser(BoilerpipeHTMLContentHandler contentHandler) {
  super(new HTMLConfiguration());
  setContentHandler(contentHandler);
}

origin: de.l3s.boilerpipe/boilerpipe

/**
 * Retrieves the {@link TextDocument} using the given HTML parser.
 * 
 * @param parser The parser used to transform the input into boilerpipe's internal representation.
 * @return The retrieved {@link TextDocument}
 * @throws BoilerpipeProcessingException
 */
public TextDocument getTextDocument(final BoilerpipeHTMLParser parser) throws BoilerpipeProcessingException {
  try {
    parser.parse(is);
  } catch (IOException e) {
    throw new BoilerpipeProcessingException(e);
  } catch (SAXException e) {
    throw new BoilerpipeProcessingException(e);
  }
  
  return parser.toTextDocument();
}
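
The method above funnels both parser failure types into a single checked exception. A minimal sketch of the same wrapping pattern follows, using only the JDK's built-in SAX parser rather than boilerpipe. `ProcessingException` and `WrappedParse` are hypothetical names made up for this illustration (they stand in for `BoilerpipeProcessingException` and the enclosing input class), and the multi-catch form is an assumption about style, not the library's code.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

/** Hypothetical checked exception playing the role of BoilerpipeProcessingException. */
class ProcessingException extends Exception {
    ProcessingException(Throwable cause) {
        super(cause);
    }
}

/** Hypothetical helper; uses the JDK XML parser as a stand-in for an HTML parser. */
class WrappedParse {
    /** Parses the input and rethrows any parser failure as one exception type. */
    static void parse(InputSource source) throws ProcessingException {
        try {
            SAXParserFactory.newInstance().newSAXParser().parse(source, new DefaultHandler());
        } catch (IOException | SAXException | ParserConfigurationException e) {
            // Callers only need to handle ProcessingException; the cause is preserved.
            throw new ProcessingException(e);
        }
    }

    public static void main(String[] args) {
        try {
            // Deliberately malformed input: the elements are never closed.
            parse(new InputSource(new StringReader("<html><body>unclosed")));
        } catch (ProcessingException e) {
            System.out.println(e.getCause() instanceof SAXException);
        }
    }
}
```

Keeping the original exception as the cause means the caller can still inspect what went wrong while only declaring one exception type.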

origin: com.syncthemall/boilerpipe

/**
 * Retrieves the {@link TextDocument} using the given HTML parser.
 * 
 * @param parser The parser used to transform the input into boilerpipe's internal representation.
 * @return The retrieved {@link TextDocument}
 * @throws BoilerpipeProcessingException
 */
public TextDocument getTextDocument(final BoilerpipeHTMLParser parser) throws BoilerpipeProcessingException {
  try {
    parser.parse(is);
  } catch (IOException e) {
    throw new BoilerpipeProcessingException(e);
  } catch (SAXException e) {
    throw new BoilerpipeProcessingException(e);
  }
  
  return parser.toTextDocument();
}

origin: pvdlg/boilerpipe

/**
 * Retrieves the {@link TextDocument} using the given HTML parser.
 * 
 * @param parser The parser used to transform the input into boilerpipe's internal representation.
 * @return The retrieved {@link TextDocument}
 * @throws BoilerpipeProcessingException
 */
public TextDocument getTextDocument(final BoilerpipeHTMLParser parser) throws BoilerpipeProcessingException {
  try {
    parser.parse(is);
  } catch (IOException e) {
    throw new BoilerpipeProcessingException(e);
  } catch (SAXException e) {
    throw new BoilerpipeProcessingException(e);
  }
  
  return parser.toTextDocument();
}

origin: de.l3s.boilerpipe/boilerpipe

/**
 * Retrieves the {@link TextDocument} using a default HTML parser.
 */
public TextDocument getTextDocument() throws BoilerpipeProcessingException {
  return getTextDocument(new BoilerpipeHTMLParser());
}

origin: com.syncthemall/boilerpipe

/**
 * Constructs a {@link BoilerpipeHTMLParser} using the given {@link BoilerpipeHTMLContentHandler}.
 *
 * @param contentHandler
 */
public BoilerpipeHTMLParser(BoilerpipeHTMLContentHandler contentHandler) {
  super(new HTMLConfiguration());
  setContentHandler(contentHandler);
}

origin: Netbreeze-GmbH/boilerpipe

/**
 * Retrieves the {@link TextDocument} using the given HTML parser.
 * 
 * @param parser The parser used to transform the input into boilerpipe's internal representation.
 * @return The retrieved {@link TextDocument}
 * @throws BoilerpipeProcessingException
 */
public TextDocument getTextDocument(final BoilerpipeHTMLParser parser) throws BoilerpipeProcessingException {
  try {
    parser.parse(is);
  } catch (IOException e) {
    throw new BoilerpipeProcessingException(e);
  } catch (SAXException e) {
    throw new BoilerpipeProcessingException(e);
  }
  
  return parser.toTextDocument();
}

origin: com.syncthemall/boilerpipe

/**
 * Retrieves the {@link TextDocument} using a default HTML parser.
 */
public TextDocument getTextDocument() throws BoilerpipeProcessingException {
  return getTextDocument(new BoilerpipeHTMLParser());
}

origin: de.l3s.boilerpipe/boilerpipe

/**
 * Constructs a {@link BoilerpipeHTMLParser} using the given {@link BoilerpipeHTMLContentHandler}.
 *
 * @param contentHandler
 */
public BoilerpipeHTMLParser(BoilerpipeHTMLContentHandler contentHandler) {
  super(new HTMLConfiguration());
  this.contentHandler = contentHandler;
  setContentHandler(contentHandler);
}

origin: Netbreeze-GmbH/boilerpipe

/**
 * Retrieves the {@link TextDocument} using a default HTML parser.
 */
public TextDocument getTextDocument() throws BoilerpipeProcessingException {
  return getTextDocument(new BoilerpipeHTMLParser());
}

origin: pvdlg/boilerpipe

/**
 * Constructs a {@link BoilerpipeHTMLParser} using the given {@link BoilerpipeHTMLContentHandler}.
 *
 * @param contentHandler
 */
public BoilerpipeHTMLParser(BoilerpipeHTMLContentHandler contentHandler) {
  super(new HTMLConfiguration());
  setContentHandler(contentHandler);
}

de.l3s.boilerpipe.sax.BoilerpipeHTMLParser

Javadoc

A simple SAX Parser, used by BoilerpipeSAXInput. The parser uses CyberNeko to parse HTML content.

Most used methods

  • <init>
  • parse
  • setContentHandler
  • toTextDocument
    Returns a TextDocument containing the extracted TextBlocks. NOTE: Only call this after #parse(org.xm
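
The methods listed above imply a two-step lifecycle: run `parse` so the content handler sees the SAX events, then ask for the result. The sketch below illustrates that lifecycle with the JDK's built-in XML SAX parser instead of boilerpipe and CyberNeko, so it only accepts well-formed markup. `TextBlockCollector` and `toBlocks` are hypothetical names invented for this example, and the explicit "parse first" check is a design choice of the sketch, not a documented behaviour of `BoilerpipeHTMLParser`.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

/** Hypothetical stand-in for a SAX-based text extractor; not part of boilerpipe. */
class TextBlockCollector extends DefaultHandler {
    private final List<String> blocks = new ArrayList<>();
    private final StringBuilder current = new StringBuilder();
    private boolean parsed = false;

    /** Step 1: run the SAX parse; the callbacks below fill the block list. */
    void parse(InputSource source)
            throws IOException, SAXException, ParserConfigurationException {
        SAXParserFactory.newInstance().newSAXParser().parse(source, this);
        parsed = true;
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        // Text may arrive in several chunks, so accumulate until a tag boundary.
        current.append(ch, start, length);
    }

    @Override
    public void startElement(String uri, String localName, String qName, Attributes atts) {
        flush();
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        flush();
    }

    /** Closes the current block at every tag boundary, skipping whitespace-only text. */
    private void flush() {
        String text = current.toString().trim();
        if (!text.isEmpty()) {
            blocks.add(text);
        }
        current.setLength(0);
    }

    /** Step 2: return the collected blocks; only meaningful after parse(). */
    List<String> toBlocks() {
        if (!parsed) {
            throw new IllegalStateException("call parse() before toBlocks()");
        }
        return new ArrayList<>(blocks);
    }

    public static void main(String[] args) throws Exception {
        TextBlockCollector collector = new TextBlockCollector();
        collector.parse(new InputSource(new StringReader(
                "<html><body><h1>Title</h1><p>Hello world</p></body></html>")));
        System.out.println(collector.toBlocks()); // prints [Title, Hello world]
    }
}
```

With boilerpipe itself, the corresponding calls would be `parse` followed by `toTextDocument`, as the snippets earlier on this page show.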
