@Override public Tree next() { if (line == null) { throw new NoSuchElementException(); } Reader lineReader = new StringReader(line); line = null; List<Word> words; if (tokenized) { words = WhitespaceTokenizer.newWordWhitespaceTokenizer(lineReader).tokenize(); } else { words = PTBTokenizer.newPTBTokenizer(lineReader).tokenize(); } if (!words.isEmpty()) { // the parser throws an exception if told to parse an empty sentence. Tree parseTree = lp.apply(words); return parseTree; } else { return new SimpleTree(); } }
// NOTE(review): fragment — `line`, `lp`, and `tb` are declared elsewhere in the file;
// presumably `tb` is a Treebank collecting results — confirm against surrounding code.
System.out.println("Processing sentence: " + line);
// Tokenize the raw sentence with the Penn Treebank tokenizer.
PTBTokenizer<Word> ptb = PTBTokenizer.newPTBTokenizer(new StringReader(line));
List<Word> words = ptb.tokenize();
// Parse the token list and add the resulting tree to the collection.
Tree parseTree = lp.parseTree(words);
tb.add(parseTree);
// Fragment: drain the tokenizer into a list of CoreLabels (`ptb` is created elsewhere).
List<CoreLabel> words = ptb.tokenize();
// Tokenize `text` without PTB escaping (raw characters kept: ptb3Escaping=false),
// split the token stream into sentences, and render each sentence back to a string.
// NOTE(review): raw PTBTokenizer type here — unchecked; consider PTBTokenizer<CoreLabel>.
PTBTokenizer ptbt = new PTBTokenizer(
    new StringReader(text), new CoreLabelTokenFactory(), "ptb3Escaping=false");
List<List<CoreLabel>> sents = (new WordToSentenceProcessor()).process(ptbt.tokenize());
Vector<String> sentences = new Vector<String>();
for (List<CoreLabel> sent : sents) {
  StringBuilder sb = new StringBuilder("");
  // Joins tokens with single spaces via CoreLabel.toString(); leaves a trailing space.
  for (CoreLabel w : sent) sb.append(w + " ");
  sentences.add(sb.toString());
}
} // NOTE(review): closes an enclosing scope that begins outside this snippet.
@Override public Tree next() { if (line == null) { throw new NoSuchElementException(); } Reader lineReader = new StringReader(line); line = null; List<Word> words; if (tokenized) { words = WhitespaceTokenizer.newWordWhitespaceTokenizer(lineReader).tokenize(); } else { words = PTBTokenizer.newPTBTokenizer(lineReader).tokenize(); } if (!words.isEmpty()) { // the parser throws an exception if told to parse an empty sentence. Tree parseTree = lp.apply(words); return parseTree; } else { return new SimpleTree(); } }
/**
 * Tokenizes the given string with the PTB tokenizer.
 *
 * <p>If PTB tokenization throws, logs the message and falls back to splitting
 * the output of {@code pennTokenizer} on whitespace.
 *
 * @param string the raw text to tokenize
 * @return the PTB token list, or the whitespace-split fallback on failure
 */
public List<Word> tokenize(String string) {
  this.tokenizer = new PTBTokenizer<Word>(
      new StringReader(string),
      new WordTokenFactory(),
      "untokenizable=noneDelete,ptb3Escaping=true");
  try {
    return tokenizer.tokenize();
  } catch (Exception e) {
    // Best-effort fallback: report and degrade to whitespace tokenization.
    System.err.println(e.getMessage());
    final List<Word> fallback = new ArrayList<Word>();
    String[] pieces = pennTokenizer.tokenize(string).split("\\s+");
    for (String piece : pieces) {
      fallback.add(new Word(piece));
    }
    return fallback;
  }
}
@Override public Tree next() { if (line == null) { throw new NoSuchElementException(); } Reader lineReader = new StringReader(line); line = null; List<Word> words; if (tokenized) { words = WhitespaceTokenizer.newWordWhitespaceTokenizer(lineReader).tokenize(); } else { words = PTBTokenizer.newPTBTokenizer(lineReader).tokenize(); } if (!words.isEmpty()) { // the parser throws an exception if told to parse an empty sentence. Tree parseTree = lp.apply(words); return parseTree; } else { return new SimpleTree(); } }
// Fragment: tokenize, then guard against an empty sentence before parsing
// (the parser throws on empty input). The brace is closed outside this snippet.
List<Word> words = ptb.tokenize(); if (!words.isEmpty()) {
@Override public Tree next() { if (line == null) { throw new NoSuchElementException(); } Reader lineReader = new StringReader(line); line = null; List<Word> words; if (tokenized) { words = WhitespaceTokenizer.newWordWhitespaceTokenizer(lineReader).tokenize(); } else { words = PTBTokenizer.newPTBTokenizer(lineReader).tokenize(); } if (!words.isEmpty()) { // the parser throws an exception if told to parse an empty sentence. Tree parseTree = lp.apply(words); return parseTree; } else { return new SimpleTree(); } }
// NOTE(review): fragment — `line` and `lp` are declared elsewhere in the file.
System.out.println("Processing sentence: " + line);
// Tokenize the raw sentence with the Penn Treebank tokenizer.
PTBTokenizer<Word> ptb = PTBTokenizer.newPTBTokenizer(new StringReader(line));
List<Word> words = ptb.tokenize();
// Two-step parser API: run the parse, then fetch the best tree separately.
lp.parse(words);
Tree parseTree = lp.getBestParse();