How to use EncodingDetector in com.mucommander.commons.io

Best Java code snippets using com.mucommander.commons.io.EncodingDetector (Showing top 7 results out of 315)

/**
 * This method is a shorthand for {@link #detectEncoding(byte[], int, int) detectEncoding(bytes, 0, bytes.length)}.
 *
 * @param bytes the bytes for which to detect the encoding
 * @return the best guess at the character encoding, null if there is none (not enough data or confidence)
 */
public static String detectEncoding(byte[] bytes) {
  return detectEncoding(bytes, 0, bytes.length);
}

/**
 * Lists all detectable encodings as returned by {@link #getDetectableEncodings()} to the standard output.
 * @param args command line arguments.
 */
public static void main(String[] args) {
  String[] encodings = getDetectableEncodings();

  for (String encoding : encodings)
    System.out.println(encoding);
}

public synchronized void processDied(int returnValue) {
  String encoding;
  String oldEncoding;
  // Abort if there is no need to identify the encoding anymore.
  if(out == null)
    return;
  // Attempts to guess at the encoding. If no guess can be made, ignore.
  if((encoding = EncodingDetector.detectEncoding(out.toByteArray())) == null)
    return;
  // Checks whether the detected charset is supported.
  if(Charset.isSupported(encoding)) {
    oldEncoding = MuConfigurations.getPreferences().getVariable(MuPreference.SHELL_ENCODING);
    // If no encoding was previously set, or we have found a new encoding, change the current shell encoding.
    if((oldEncoding == null) || !encoding.equals(oldEncoding))
      MuConfigurations.getPreferences().setVariable(MuPreference.SHELL_ENCODING, encoding);
    // Stop listening for new byte input if we have gathered a large enough sample set.
    if(out.size() >= EncodingDetector.MAX_RECOMMENDED_BYTE_SIZE)
      out = null;
  }
}

/**
 * Tries to detect the character encoding in which the bytes contained by the given <code>InputStream</code> are
 * encoded, and returns the best guess or <code>null</code> if there is none (not enough data or confidence).
 * Note that the returned character encoding may or may not be available on the Java runtime -- use
 * <code>java.nio.charset.Charset#isSupported(String)</code> to determine if it is available.
 *
 * <p>A maximum of {@link #MAX_RECOMMENDED_BYTE_SIZE} bytes will be read from the <code>InputStream</code>. The
 * stream will not be closed and will not be repositioned after the bytes have been read. It is up to the calling
 * method to use the <code>InputStream#mark(int)</code> and <code>InputStream#reset()</code> methods (if supported)
 * or reopen the stream if needed.
 * </p>
 *
 * @param in the InputStream that supplies the bytes
 * @return the best guess at the character encoding, null if there is none (not enough data or confidence)
 * @throws IOException if an error occurred while reading the stream
 */
public static String detectEncoding(InputStream in) throws IOException {
  byte[] buf = BufferPool.getByteArray(MAX_RECOMMENDED_BYTE_SIZE);
  try {
    return detectEncoding(buf, 0, StreamUtils.readUpTo(in, buf));
  }
  finally {
    BufferPool.releaseByteArray(buf);
  }
}
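
Because the stream is not repositioned after detection, a caller that also wants to decode the content has to rewind it. The sketch below shows one way to do that with mark and reset, using only the JDK. The detectEncoding method here is a stand-in that recognizes byte order marks only, not the real EncodingDetector; the SAMPLE_LIMIT value and the DetectThenRead and readText names are assumptions made for illustration.

```java
import java.io.BufferedInputStream;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class DetectThenRead {
    // Hypothetical sample limit; the real MAX_RECOMMENDED_BYTE_SIZE value is not shown here.
    static final int SAMPLE_LIMIT = 4096;

    // Stand-in for EncodingDetector.detectEncoding(InputStream): it only recognizes
    // byte order marks, and it neither closes nor rewinds the stream.
    public static String detectEncoding(InputStream in) throws IOException {
        byte[] buf = new byte[SAMPLE_LIMIT];
        int n = in.readNBytes(buf, 0, buf.length); // requires Java 9 or later
        if (n >= 3 && (buf[0] & 0xFF) == 0xEF && (buf[1] & 0xFF) == 0xBB && (buf[2] & 0xFF) == 0xBF)
            return "UTF-8";
        if (n >= 2 && (buf[0] & 0xFF) == 0xFE && (buf[1] & 0xFF) == 0xFF)
            return "UTF-16BE";
        if (n >= 2 && (buf[0] & 0xFF) == 0xFF && (buf[1] & 0xFF) == 0xFE)
            return "UTF-16LE";
        return null; // no guess
    }

    // Marks the stream, lets the detector consume a sample, rewinds, then decodes everything.
    public static String readText(InputStream raw, Charset fallback) throws IOException {
        // Wrap streams that cannot be rewound on their own.
        InputStream in = raw.markSupported() ? raw : new BufferedInputStream(raw);
        in.mark(SAMPLE_LIMIT);
        String name = detectEncoding(in);
        in.reset();
        // The detected name may not be available on this runtime, so check before using it.
        Charset charset = (name != null && Charset.isSupported(name)) ? Charset.forName(name) : fallback;
        return new String(in.readAllBytes(), charset);
    }

    public static void main(String[] args) throws IOException {
        byte[] utf16le = {(byte) 0xFF, (byte) 0xFE, 'h', 0, 'i', 0};
        System.out.println(readText(new ByteArrayInputStream(utf16le), StandardCharsets.ISO_8859_1));
    }
}
```

The mark limit matches the largest number of bytes the detector can read, so the reset call stays valid. When the stream cannot be wrapped or rewound, reopening it is the remaining option.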

in = file.getInputStream();
String encoding = EncodingDetector.detectEncoding(in);

comment = getString(commentBytes, defaultEncoding != null ? defaultEncoding : EncodingDetector.detectEncoding(commentBytes));

String guessedEncoding = EncodingDetector.detectEncoding(encodingAccumulator.toByteArray());

Javadoc

This class guesses the encoding in which an array of bytes is encoded. Detection is by no means exact, because it relies on heuristics that are imprecise by nature. Accuracy improves with the number of bytes supplied: a small sample (say 10 bytes) has little chance of being guessed correctly, whereas a larger one (say 1000 bytes) is likely to give a good result. Supplying a very large amount of data improves accuracy only marginally and is not worth the extra effort, since encoding detection is a costly operation that involves many comparisons per byte.

The MAX_RECOMMENDED_BYTE_SIZE field sets that threshold: if a supplied byte array is larger than this value, the additional bytes are not processed by the detectEncoding methods. Take this value into account when fetching bytes specifically for the purpose of detecting the encoding.
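
One way to respect that threshold when bytes arrive in chunks is to stop collecting once the sample is full, much as the processDied snippet above stops listening for input. The following is a minimal sketch using only the JDK. SampleAccumulator, its offer method, and the limit passed to the constructor are hypothetical; in real use the limit would presumably be EncodingDetector.MAX_RECOMMENDED_BYTE_SIZE, whose value is not given on this page.

```java
import java.io.ByteArrayOutputStream;

public class SampleAccumulator {
    // Hypothetical cap standing in for EncodingDetector.MAX_RECOMMENDED_BYTE_SIZE.
    private final int limit;
    private final ByteArrayOutputStream sample = new ByteArrayOutputStream();

    public SampleAccumulator(int limit) {
        this.limit = limit;
    }

    /**
     * Appends as much of the chunk as still fits under the limit.
     * Returns true while more bytes would still be useful, false once the sample is full.
     */
    public boolean offer(byte[] chunk, int off, int len) {
        int room = limit - sample.size();
        if (room > 0)
            sample.write(chunk, off, Math.min(len, room));
        return sample.size() < limit;
    }

    /** Returns a copy of the bytes gathered so far, never more than the limit. */
    public byte[] toByteArray() {
        return sample.toByteArray();
    }

    public static void main(String[] args) {
        SampleAccumulator acc = new SampleAccumulator(8);
        acc.offer(new byte[5], 0, 5);
        acc.offer(new byte[5], 0, 5); // only 3 of these 5 bytes are kept
        System.out.println(acc.toByteArray().length);
    }
}
```

The gathered array can then be passed to a detectEncoding(byte[]) call without handing it bytes that would be ignored anyway.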

EncodingDetector uses ICU4J under the hood. Here's a list of encodings that can currently be detected:

 
UTF-8 
UTF-16BE 
UTF-16LE 
UTF-32BE 
UTF-32LE 
Shift_JIS 
ISO-2022-JP 
ISO-2022-CN 
ISO-2022-KR 
GB18030 
EUC-JP 
EUC-KR 
Big5 
ISO-8859-1 
ISO-8859-2 
ISO-8859-5 
ISO-8859-6 
ISO-8859-7 
ISO-8859-8 
windows-1251 
windows-1256 
KOI8-R 
ISO-8859-9
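
Since a detected encoding may not be available on the Java runtime, it can help to check the names above against Charset.isSupported before relying on them. This is a small sketch using only the JDK; the SupportedEncodings class and its supported method are made up for illustration, and the array simply copies the list above rather than calling getDetectableEncodings().

```java
import java.nio.charset.Charset;
import java.util.ArrayList;
import java.util.List;

public class SupportedEncodings {
    // Copied from the list above; not obtained from EncodingDetector itself.
    public static final String[] DETECTABLE = {
        "UTF-8", "UTF-16BE", "UTF-16LE", "UTF-32BE", "UTF-32LE", "Shift_JIS",
        "ISO-2022-JP", "ISO-2022-CN", "ISO-2022-KR", "GB18030", "EUC-JP", "EUC-KR",
        "Big5", "ISO-8859-1", "ISO-8859-2", "ISO-8859-5", "ISO-8859-6", "ISO-8859-7",
        "ISO-8859-8", "windows-1251", "windows-1256", "KOI8-R", "ISO-8859-9"
    };

    /** Returns the subset of names that this Java runtime can actually decode. */
    public static List<String> supported(String[] names) {
        List<String> result = new ArrayList<>();
        for (String name : names) {
            if (Charset.isSupported(name))
                result.add(name);
        }
        return result;
    }

    public static void main(String[] args) {
        // The result depends on the runtime; only a handful of charsets are guaranteed everywhere.
        for (String name : supported(DETECTABLE))
            System.out.println(name);
    }
}
```

Only a few charsets, such as UTF-8, UTF-16BE, UTF-16LE and ISO-8859-1, are guaranteed on every Java platform, so the output can differ between runtimes.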

Most used methods

detectEncoding
Try and detect the character encoding in which the given bytes are encoded, and returns the best gue
getDetectableEncodings
Returns an array of encodings that can be detected by the detectEncoding methods. Note that some of

