How to use StringSearch in com.ibm.icu.text

Best Java code snippets using com.ibm.icu.text.StringSearch (Showing top 9 results out of 315)

  /**
   * Returns the index of the first occurrence of <code>subtext</code> within
   * <code>text</code>; or <code>-1</code> if <code>subtext</code> does not
   * occur within <code>text</code>.
   *
   * @param text String in which to locate <code>subtext</code>
   * @return the index of the first occurrence of <code>subtext</code> within
   *      <code>text</code>; or <code>-1</code>
   * @throws IllegalStateException if no subtext has been set
   */
  public int indexOf(String text) {
    if (pattern == null)
      throw new IllegalStateException("setSubtext must be called with a valid value before this method can operate");

    if (text.length() == 0)
      return -1;

    final int index = new StringSearch(pattern, new StringCharacterIterator(text), COLLATOR).first();
    return mode == TextMatcherEditor.STARTS_WITH && index != 0 ? -1 : index;
  }

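  public static int indexOfIgnoreCase(String haystack, String needle) {
    StringSearch stringSearch = new StringSearch(needle, haystack);
    // PRIMARY strength ignores case and accent differences.
    stringSearch.getCollator().setStrength(Collator.PRIMARY);
    // reset() is needed so that StringSearch picks up the changed collator attributes.
    stringSearch.reset();
    return stringSearch.first();
  }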

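public static void main(String[] args) {
  // Note: this StringSearch appears to be a user-defined class, not
  // com.ibm.icu.text.StringSearch, which has no no-argument constructor.
  StringSearch stringSearch = new StringSearch();
  stringSearch.search();
  System.out.println("Welcome!  The strings you started with are:\n" + stringSearch.s1 + "\n" + stringSearch.s2 + "\n" + stringSearch.s3);
}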

private boolean handleNextCommonImpl() {
  int textOffset = textIter_.getOffset();
  Match match = new Match();
  if (search(textOffset, match)) {
    search_.matchedIndex_ = match.start_;
    search_.setMatchedLength(match.limit_ - match.start_);
    return true;
  } else {
    setMatchNotFound();
    return false;
  }
}

int newce = getCE(ce);
if (newce != CollationElementIterator.IGNORABLE /* 0 */) {
  int[] temp = addToIntArray(cetable, offset, cetablesize, newce,
      patternlength - coleiter.getOffset() + 1);
  offset++;

long[] temp = addToLongArray(pcetable, offset, pcetablesize, pce, patternlength - coleiter.getOffset() + 1);
offset++;
pcetable = temp;

public static boolean containsIgnoreCase(String haystack, String needle) {
  return indexOfIgnoreCase(haystack, needle) >= 0;
}

public static int indexOfIgnoreCase(String haystack, String needle) {
  StringSearch stringSearch = new StringSearch(needle, haystack);
  // PRIMARY strength ignores case and accent differences.
  stringSearch.getCollator().setStrength(Collator.PRIMARY);
  // reset() is needed so that StringSearch picks up the changed collator attributes.
  stringSearch.reset();
  return stringSearch.first();
}

  /**
   * Returns the index of the first occurrence of <code>subtext</code> within
   * <code>text</code>; or <code>-1</code> if <code>subtext</code> does not
   * occur within <code>text</code>.
   *
   * @param text String in which to locate <code>subtext</code>
   * @return the index of the first occurrence of <code>subtext</code> within
   *      <code>text</code>; or <code>-1</code>
   * @throws IllegalStateException if no subtext has been set
   */
  @Override
  public int indexOf(String text) {
    if (pattern == null) {
      throw new IllegalStateException("setSubtext must be called with a valid value before this method can operate");
    }

    if (text.length() == 0) {
      return -1;
    }

    final int index = new StringSearch(pattern, new StringCharacterIterator(text), COLLATOR).first();
    return mode == TextMatcherEditor.STARTS_WITH && index != 0 ? -1 : index;
  }

Javadoc

StringSearch is a SearchIterator that provides language-sensitive text searching based on the comparison rules defined in a RuleBasedCollator object. StringSearch ensures that language eccentricity can be handled, e.g. for the German collator, characters ß and SS will be matched if case is chosen to be ignored. See the "ICU Collation Design Document" for more information.

There are 2 match options for selection:
Let S' be the sub-string of a text string S between the offsets start and end [start, end].
A pattern string P matches a text string S at the offsets [start, end] if

option 1. Some canonical equivalent of P matches some canonical equivalent of S'
option 2. P matches S' and if P starts or ends with a combining mark, there exists no non-ignorable combining mark before or after S' in S respectively.

Option 2 is the default.

This search has APIs similar to those of other text iteration mechanisms, such as the break iterators in BreakIterator. Using these APIs, it is easy to scan through text looking for all occurrences of a given pattern. This search iterator allows changing of direction by calling #reset followed by #next or #previous. Though a direction change can occur without calling #reset first, this operation comes with some speed penalty. The matches found in the forward direction are the same as those found in the backward direction, in reverse order.
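
The scanning loop described above can be sketched as follows. This is a minimal illustration rather than code from the ICU sources: the class name FindAllMatches and the helper findAll are hypothetical, and the two-argument constructor is assumed to use the default locale's collator.

```java
import com.ibm.icu.text.SearchIterator;
import com.ibm.icu.text.StringSearch;
import java.util.ArrayList;
import java.util.List;

public class FindAllMatches {
    // Hypothetical helper: collects the start offset of every match of
    // 'pattern' in 'text', scanning in the forward direction.
    static List<Integer> findAll(String pattern, String text) {
        StringSearch search = new StringSearch(pattern, text);
        List<Integer> starts = new ArrayList<>();
        // first() positions the iterator on the first match; next() advances
        // until SearchIterator.DONE signals that no further match exists.
        for (int pos = search.first(); pos != SearchIterator.DONE; pos = search.next()) {
            starts.add(pos);
        }
        return starts;
    }

    public static void main(String[] args) {
        System.out.println(findAll("fox", "The fox saw another fox."));
    }
}
```

Inside the loop, getMatchLength() and getMatchedText() can be used to inspect the current match.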

SearchIterator provides APIs to specify the starting position within the text string to be searched, e.g. SearchIterator#setIndex, SearchIterator#preceding and SearchIterator#following. Since the starting position will be set as it is specified, please take note that there are some danger points at which the search may render incorrect results:

In the midst of a substring that requires normalization.
If the following match is to be found, the position should not be the second character which requires swapping with the preceding character. Vice versa, if the preceding match is to be found, the position to search from should not be the first character which requires swapping with the next character. E.g. certain Thai and Lao characters require swapping.
If a following pattern match is to be found, any position within a contracting sequence except the first will fail. Vice versa if a preceding pattern match is to be found, an invalid starting point would be any character within a contracting sequence except the last.
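
As a small sketch of choosing a starting position, the example below uses following to search forward from a caller-supplied offset. The class SearchFromPosition and the helper firstMatchFrom are hypothetical names, and the sample text is plain ASCII, so none of the danger points listed above apply to it.

```java
import com.ibm.icu.text.StringSearch;

public class SearchFromPosition {
    // Hypothetical helper: returns the start of the first match found when
    // searching forward from 'position', or SearchIterator.DONE (-1) if none.
    static int firstMatchFrom(String pattern, String text, int position) {
        StringSearch search = new StringSearch(pattern, text);
        return search.following(position);
    }

    public static void main(String[] args) {
        // Starting inside the first "fox" skips it and finds the second one.
        System.out.println(firstMatchFrom("fox", "The fox saw another fox.", 5));
    }
}
```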

A BreakIterator can be used if only matches at logical breaks are desired. Using a BreakIterator will only give you results that exactly match the boundaries given by the BreakIterator. For instance, the pattern "e" will not be found in the string "\u00e9" if a character break iterator is used.
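
One way to restrict matches to logical breaks is to pass a word BreakIterator to the four-argument constructor. The sketch below assumes an English collator and word break rules; the class WholeWordSearch and the helper findWholeWords are hypothetical names chosen for illustration.

```java
import com.ibm.icu.text.BreakIterator;
import com.ibm.icu.text.Collator;
import com.ibm.icu.text.RuleBasedCollator;
import com.ibm.icu.text.SearchIterator;
import com.ibm.icu.text.StringSearch;
import com.ibm.icu.util.ULocale;
import java.text.StringCharacterIterator;
import java.util.ArrayList;
import java.util.List;

public class WholeWordSearch {
    // Hypothetical helper: returns the offsets of matches whose start and end
    // both fall on word boundaries reported by the BreakIterator.
    static List<Integer> findWholeWords(String pattern, String text) {
        RuleBasedCollator collator =
                (RuleBasedCollator) Collator.getInstance(ULocale.ENGLISH);
        BreakIterator words = BreakIterator.getWordInstance(ULocale.ENGLISH);
        StringSearch search = new StringSearch(
                pattern, new StringCharacterIterator(text), collator, words);
        List<Integer> starts = new ArrayList<>();
        for (int pos = search.first(); pos != SearchIterator.DONE; pos = search.next()) {
            starts.add(pos);
        }
        return starts;
    }

    public static void main(String[] args) {
        System.out.println(findWholeWords("cat", "concatenate the cat"));
    }
}
```

With this setup the "cat" embedded in "concatenate" should be rejected, because neither of its ends lies on a word boundary, leaving only the standalone word at offset 16.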

Options are provided to handle overlapping matches. E.g. in English, overlapping matches produce the results 0 and 2 for the pattern "abab" in the text "ababab", whereas mutually exclusive matches produce only the result 0.
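
The "abab" example can be reproduced with setOverlapping. This is a sketch under the assumption of an English locale; the class OverlappingMatches and the helper findAll are hypothetical names.

```java
import com.ibm.icu.text.SearchIterator;
import com.ibm.icu.text.StringSearch;
import java.text.StringCharacterIterator;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

public class OverlappingMatches {
    // Hypothetical helper: collects match offsets with overlapping matches
    // either enabled or disabled.
    static List<Integer> findAll(String pattern, String text, boolean overlapping) {
        StringSearch search = new StringSearch(
                pattern, new StringCharacterIterator(text), Locale.ENGLISH);
        search.setOverlapping(overlapping);
        List<Integer> starts = new ArrayList<>();
        for (int pos = search.first(); pos != SearchIterator.DONE; pos = search.next()) {
            starts.add(pos);
        }
        return starts;
    }

    public static void main(String[] args) {
        System.out.println(findAll("abab", "ababab", true));   // overlapping
        System.out.println(findAll("abab", "ababab", false));  // mutually exclusive
    }
}
```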

Options are also provided to implement "asymmetric search" as described in UTS #10 Unicode Collation Algorithm, specifically the ElementComparisonType values.

Though collator attributes will be taken into consideration while performing matches, there are no APIs here for setting and getting the attributes. These attributes can be set by getting the collator from #getCollator and using the APIs in RuleBasedCollator. Lastly to update StringSearch to the new collator attributes, #reset has to be called.
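
The sequence described above (get the collator, change an attribute, then reset) might look like the following sketch. The class CollatorStrengthReset and the helper searchBeforeAndAfter are hypothetical names, and SECONDARY strength is chosen here only as one example of an attribute change that makes the search ignore case.

```java
import com.ibm.icu.text.Collator;
import com.ibm.icu.text.StringSearch;
import com.ibm.icu.util.ULocale;
import java.text.StringCharacterIterator;

public class CollatorStrengthReset {
    // Hypothetical helper: returns the first match offset before and after
    // lowering the collator strength on the same StringSearch instance.
    static int[] searchBeforeAndAfter(String pattern, String text) {
        StringSearch search = new StringSearch(
                pattern, new StringCharacterIterator(text), ULocale.ENGLISH);
        // Default (tertiary) strength distinguishes upper and lower case.
        int before = search.first();

        // SECONDARY strength ignores case differences but still honors accents.
        search.getCollator().setStrength(Collator.SECONDARY);
        // Without reset(), the search keeps using the old attribute values.
        search.reset();
        int after = search.first();
        return new int[] {before, after};
    }

    public static void main(String[] args) {
        int[] result = searchBeforeAndAfter("hello", "Say HELLO");
        System.out.println(result[0] + " " + result[1]);
    }
}
```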

Restriction:
Currently there are no composite characters that consist of a character with combining class > 0 before a character with combining class == 0. However, if such a character exists in the future, StringSearch does not guarantee the results for option 1.

Consult the SearchIterator documentation for information on and examples of how to use instances of this class to implement text searching.

Note, StringSearch is not to be subclassed.

Most used methods

<init>
Initializes the iterator to use the language-specific rules and break iterator rules defined in the
first
getCollator
Gets the RuleBasedCollator used for the language rules. Since StringSearch depends on the returned R
search
addToIntArray
Direct port of ICU4C static int32_t * addTouint32_tArray(...) in usearch.cpp. This is used for appen
addToLongArray
Direct port of ICU4C static int64_t * addTouint64_tArray(...) in usearch.cpp. This is used for appen
checkIdentical
Checks for identical match
codePointAt
codePointBefore
compareCE64s
getCE
Getting the modified collation elements taking into account the collation attributes.
getIndex
