edu.stanford.nlp.process.WordShapeClassifier.wordShapeDan2 Java code examples

/**
 * Returns a fine-grained word shape classifier, that equivalence classes
 * lower and upper case and digits, and collapses sequences of the
 * same type, but keeps all punctuation.  This adds an extra recognizer
 * for a greek letter embedded in the String, which is useful for bio.
 */
private static String wordShapeDan2Bio(String s, Collection<String> knownLCWords) {
 if (containsGreekLetter(s)) {
  return wordShapeDan2(s, knownLCWords) + "-GREEK";
 } else {
  return wordShapeDan2(s, knownLCWords);
 }
}
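
The wrapper above only decorates the result of another shaper with a suffix. A minimal, self-contained sketch of that pattern follows. The class name, the abbreviated list of Greek letter names, and the toy base shaper are all assumptions made for illustration; the library's own `containsGreekLetter` list is not reproduced here.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Locale;
import java.util.function.UnaryOperator;

class GreekSuffixSketch {

  // Hypothetical, abbreviated list of spelled-out Greek letter names.
  private static final List<String> GREEK_NAMES =
      Arrays.asList("alpha", "beta", "gamma", "delta", "kappa");

  // True if any of the listed names occurs anywhere in the token (case-insensitive).
  static boolean containsGreekName(String s) {
    String lower = s.toLowerCase(Locale.ROOT);
    for (String name : GREEK_NAMES) {
      if (lower.contains(name)) {
        return true;
      }
    }
    return false;
  }

  // Computes the base shape once, then appends "-GREEK" when a name is found.
  static String withGreekSuffix(String s, UnaryOperator<String> baseShaper) {
    String shape = baseShaper.apply(s);
    return containsGreekName(s) ? shape + "-GREEK" : shape;
  }

  public static void main(String[] args) {
    // Toy base shaper (an assumption): encodes only the token length.
    UnaryOperator<String> toy = w -> "LEN" + w.length();
    System.out.println(withGreekSuffix("TGF-beta1", toy)); // LEN9-GREEK
    System.out.println(withGreekSuffix("kinase", toy));    // LEN6
  }
}
```

Computing the base shape once and appending the suffix conditionally avoids repeating the call in both branches, while producing the same strings as the if/else form.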

 return wordShapeChris1(inStr);
case WORDSHAPEDAN2:
 return wordShapeDan2(inStr, knownLCWords);
case WORDSHAPEDAN2USELC:
 return wordShapeDan2(inStr, knownLCWords);
case WORDSHAPEDAN2BIO:
 return wordShapeDan2Bio(inStr, knownLCWords);
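
In the excerpt above, `WORDSHAPEDAN2` and `WORDSHAPEDAN2USELC` call the same method with the same arguments. One plausible reading, consistent with the `dontUseLC` description further down, is that the caller discards the known-word list before the switch for shapers that should not use it. The sketch below illustrates that reading only; the constant values, the `ignoresKnownWords` helper, and the toy shaper are hypothetical and are not the library's code.

```java
import java.util.Collection;
import java.util.Collections;
import java.util.Locale;

class ShaperDispatchSketch {

  // Hypothetical ids for illustration; not the library's actual constant values.
  static final int DAN2 = 1;
  static final int DAN2_USE_LC = 2;

  // Assumed rule: the plain variant ignores the known-word list.
  static boolean ignoresKnownWords(int shaper) {
    return shaper == DAN2;
  }

  // Toy shaper: length-based shape, plus "k" if the lowercased word is known.
  static String toyShape(String s, Collection<String> knownLC) {
    String shape = "LEN" + s.length();
    if (knownLC != null && knownLC.contains(s.toLowerCase(Locale.ROOT))) {
      shape += "k";
    }
    return shape;
  }

  static String shape(String s, int shaper, Collection<String> knownLC) {
    // Drop the list up front so both cases can share one call.
    Collection<String> effective = ignoresKnownWords(shaper) ? null : knownLC;
    switch (shaper) {
      case DAN2:
      case DAN2_USE_LC:
        return toyShape(s, effective);
      default:
        throw new IllegalArgumentException("Unknown shaper id: " + shaper);
    }
  }

  public static void main(String[] args) {
    Collection<String> known = Collections.singleton("the");
    System.out.println(shape("The", DAN2, known));        // LEN3
    System.out.println(shape("The", DAN2_USE_LC, known)); // LEN3k
  }
}
```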

Javadoc

A fine-grained word shape classifier that groups lowercase letters, uppercase letters, and digits into equivalence classes, collapses sequences of the same type, and keeps all punctuation.

Note: We treat '_' as a lowercase letter, much as many programming languages do. We do this because some applications, such as RTE, join tokens with '_'.
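
The behavior described in this Javadoc can be sketched in a few lines. This is a simplified illustration, not the library's implementation: the class name and the class symbols `X`, `x`, and `d` are assumptions, and any prefix, length marker, or known-word marker the real method may add is left out.

```java
class ShapeSketch {

  // Maps each character to a class symbol and collapses runs of the same symbol.
  static String shape(String s) {
    StringBuilder sb = new StringBuilder();
    char last = '\0';
    for (int i = 0; i < s.length(); i++) {
      char c = s.charAt(i);
      char m = c; // punctuation and other characters are kept as-is
      if (Character.isDigit(c)) {
        m = 'd';
      } else if (Character.isLowerCase(c) || c == '_') {
        m = 'x'; // '_' counts as a lowercase letter
      } else if (Character.isUpperCase(c)) {
        m = 'X';
      }
      if (i == 0 || m != last) {
        sb.append(m); // emit only when the class changes
      }
      last = m;
    }
    return sb.toString();
  }

  public static void main(String[] args) {
    System.out.println(shape("Stanford"));   // Xx
    System.out.println(shape("IL-2"));       // X-d
    System.out.println(shape("new_york"));   // x
    System.out.println(shape("CoreNLP3.9")); // XxXd.d
  }
}
```

Because `_` falls into the lowercase class, an underscore-joined token such as `new_york` gets the same shape as a single lowercase word.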

Popular methods of WordShapeClassifier

chris4equivalenceClass
containsGreekLetter
Somewhat ad-hoc list of only greek letters that bio people use, partly to avoid false positives on s
dontUseLC
Returns true if the specified word shaper doesn't use known lower case words, even if a list of them
lookupShaper
Look up a shaper by a short String name.
wordShape
Specify the string and the int identifying which word shaper to use and this returns the result of u
wordShapeChris1
This one equivalence classes all strings into one of 24 semantically informed classes, somewhat simi
wordShapeChris2
This one picks up on Dan2 ideas, but seeks to make less distinctions mid sequence by sorting for lon
wordShapeChris2Long
wordShapeChris2Short
wordShapeChris4
This one picks up on Dan2 ideas, but seeks to make less distinctions mid sequence by sorting for lon
wordShapeChris4Long
wordShapeChris4Short

How to use the wordShapeDan2 method in edu.stanford.nlp.process.WordShapeClassifier

Best Java code snippets using edu.stanford.nlp.process.WordShapeClassifier.wordShapeDan2 (Showing top 10 results out of 315)
