WordTokenizer.getTokenizingCharacters

How to use the getTokenizingCharacters method in org.languagetool.tokenizers.WordTokenizer

Best Java code snippets using org.languagetool.tokenizers.WordTokenizer.getTokenizingCharacters (Showing top 5 results out of 315)

origin: languagetool-org/languagetool

@Override
public List<String> tokenize(String text) {
 List<String> l = new ArrayList<>();
 StringTokenizer st = new StringTokenizer(text, getTokenizingCharacters(), true);
 while (st.hasMoreElements()) {
  l.add(st.nextToken());
 }
 return joinEMailsAndUrls(l);
}
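
The snippet above passes `true` as the third argument to `java.util.StringTokenizer`, so the delimiter characters are returned as tokens instead of being dropped. A minimal standalone sketch of that behaviour follows. The class name `ReturnDelimsSketch` and the delimiter string `DELIMS` are assumptions made for illustration; they are not LanguageTool's actual class or character set, and the URL/e-mail joining step is left out.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.StringTokenizer;

final class ReturnDelimsSketch {

  // Hypothetical delimiter set, for illustration only.
  // It is NOT the string returned by WordTokenizer.getTokenizingCharacters().
  static final String DELIMS = " .,!?";

  // Splits text on the given delimiter characters and keeps each
  // delimiter as its own one-character token.
  static List<String> tokenize(String text, String delims) {
    List<String> tokens = new ArrayList<>();
    // Third argument (returnDelims) = true: delimiters are returned as tokens.
    StringTokenizer st = new StringTokenizer(text, delims, true);
    while (st.hasMoreTokens()) {
      tokens.add(st.nextToken());
    }
    return tokens;
  }

  public static void main(String[] args) {
    List<String> tokens = tokenize("Hi, you!", DELIMS);
    // Print one token per line, in brackets so whitespace tokens stay visible.
    for (String token : tokens) {
      System.out.println("[" + token + "]");
    }
  }
}
```

Because every delimiter is kept, concatenating the tokens reproduces the input text exactly, which is useful when later steps need character positions.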
origin: org.languagetool/language-tl

@Override
public String getTokenizingCharacters() {
 return super.getTokenizingCharacters() + "-";
}
origin: org.languagetool/language-pl

public PolishWordTokenizer() {
 plTokenizing = super.getTokenizingCharacters() + "–";   // n-dash
}
origin: org.languagetool/language-nl

public DutchWordTokenizer() {
 //remove the apostrophe etc. from the standard tokenizing characters:
 String chars = super.getTokenizingCharacters();
 for (String quote : QUOTES) {
  chars = chars.replace(quote, "");
 }
 nlTokenizingChars = chars;
}
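
The language-specific snippets above follow two patterns: appending a character to the inherited set, or stripping characters from it once in the constructor. The sketch below shows both against a stand-in base class. `BaseTokenizer`, its character set, the subclass names, and the `QUOTES` contents are all hypothetical; the Dutch excerpt does not show how `nlTokenizingChars` is exposed, so the overriding getter here is an assumption.

```java
final class TokenizingCharsSketch {

  // Stand-in for a base tokenizer; the character set is hypothetical
  // and much smaller than any real tokenizer's set.
  static class BaseTokenizer {
    public String getTokenizingCharacters() {
      return " .,'\"";
    }
  }

  // Pattern 1: extend the inherited set, here with a hyphen.
  static class HyphenSplittingTokenizer extends BaseTokenizer {
    @Override
    public String getTokenizingCharacters() {
      return super.getTokenizingCharacters() + "-";
    }
  }

  // Pattern 2: remove characters once, in the constructor, and cache the result.
  static class QuoteKeepingTokenizer extends BaseTokenizer {
    // Hypothetical list of quote characters that should no longer split words.
    private static final String[] QUOTES = {"'", "\""};
    private final String tokenizingChars;

    QuoteKeepingTokenizer() {
      String chars = super.getTokenizingCharacters();
      for (String quote : QUOTES) {
        chars = chars.replace(quote, "");
      }
      tokenizingChars = chars;
    }

    // Assumed override that returns the cached, reduced set.
    @Override
    public String getTokenizingCharacters() {
      return tokenizingChars;
    }
  }

  public static void main(String[] args) {
    System.out.println("[" + new HyphenSplittingTokenizer().getTokenizingCharacters() + "]");
    System.out.println("[" + new QuoteKeepingTokenizer().getTokenizingCharacters() + "]");
  }
}
```

Caching the reduced string in a field avoids repeating the `replace` loop on every call, at the cost of fixing the set at construction time.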
origin: org.languagetool/languagetool-core

@Override
public List<String> tokenize(String text) {
 List<String> l = new ArrayList<>();
 StringTokenizer st = new StringTokenizer(text, getTokenizingCharacters(), true);
 while (st.hasMoreElements()) {
  l.add(st.nextToken());
 }
 return joinEMailsAndUrls(l);
}

Popular methods of WordTokenizer

  • tokenize
  • isEMail
  • isUrl
  • <init>
  • getProtocols
    Get the protocols that the tokenizer knows about.
  • isProtocol
  • joinEMails
  • joinEMailsAndUrls
  • joinUrls
  • urlEndsAt
  • urlStartsAt
