HuffmanCodec

How to use HuffmanCodec in it.unimi.dsi.compression

Best Java code snippets using it.unimi.dsi.compression.HuffmanCodec (Showing top 17 results out of 315)

origin: blazegraph/database

/**
 * Verifies that a code book constructed from a given set of frequencies
 * may be reconstructed from the code word bit lengths, given in
 * non-decreasing order, together with the symbols in a correlated array.
 * 
 * @param frequency the symbol frequencies.
 */
public void doRoundTripTest(final int[] frequency) {
  
  final DecoderInputs decoderInputs = new DecoderInputs();
  
  final HuffmanCodec codec = new HuffmanCodec(frequency, decoderInputs);
  if (log.isDebugEnabled()) {
    log.debug(printCodeBook(codec.codeWords()) + "\nlength[]="
        + Arrays.toString(decoderInputs.getLengths()) + "\nsymbol[]="
        + Arrays.toString(decoderInputs.getSymbols()));
  }
  
  final CanonicalFast64CodeWordDecoder actualDecoder = new CanonicalFast64CodeWordDecoder(
      decoderInputs.getLengths(), decoderInputs.getSymbols());
  for (int i = 0; i < frequency.length; i++) {
    final BooleanIterator coded = codec.coder().encode(i/*symbol*/);
    
    assertEquals(i, actualDecoder.decode(coded));
    
  }
}
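The round trip above relies only on the prefix-free property of the code book. Here is a minimal, self-contained sketch of the same idea (hypothetical names, not the blazegraph/dsiutils API): encode symbols by concatenating fixed code words, then decode by greedy prefix matching, which is unambiguous because no code word is a prefix of another.

```java
import java.util.Arrays;

// Illustrative sketch only (hypothetical class, not the library API).
// CODE is a tiny canonical prefix-free code book: symbol i -> CODE[i].
public class RoundTripSketch {

    static final String[] CODE = { "0", "10", "110", "111" };

    // Encode a symbol sequence by concatenating code words.
    static String encodeAll(int[] symbols) {
        StringBuilder sb = new StringBuilder();
        for (int s : symbols) sb.append(CODE[s]);
        return sb.toString();
    }

    // Decode `count` symbols by greedy prefix matching; valid because the
    // code book is prefix-free, so exactly one code word matches at any
    // code word boundary.
    static int[] decodeAll(String bits, int count) {
        int[] out = new int[count];
        int pos = 0;
        for (int i = 0; i < count; i++) {
            for (int s = 0; s < CODE.length; s++) {
                if (bits.startsWith(CODE[s], pos)) {
                    out[i] = s;
                    pos += CODE[s].length();
                    break;
                }
            }
        }
        return out;
    }

    public static void main(String[] args) {
        int[] symbols = { 2, 0, 3, 1 };
        String coded = encodeAll(symbols);  // "110011110"
        System.out.println(Arrays.equals(decodeAll(coded, symbols.length), symbols)); // true
    }
}
```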
origin: blazegraph/database

/**
 * @param expected the expected code words.
 * @param shortestCodeWord the shortest code word.
 * @param length the code word bit lengths in non-decreasing order.
 * @param symbol the symbols in correlated order.
 */
private void doCoderRoundTripTest(final BitVector[] expected,
    final BitVector shortestCodeWord, final int[] length,
    final int[] symbol) {
  final PrefixCoder newCoder = HuffmanCodec.newCoder(shortestCodeWord,
      length, symbol);
  final BitVector[] actual = newCoder.codeWords();
  assertEquals("codeWord[]", expected, actual);
  if (log.isDebugEnabled()) {
    log.debug("\nexpected: " + Arrays.toString(expected)
        + "\nactual  : " + Arrays.toString(actual));
  }
}
 
origin: it.unimi.dsi/dsiutils

/** Creates a new Huffman codec using the given vector of frequencies.
 *
 * @param frequency a vector of nonnegative frequencies.
 */
public HuffmanCodec(final int[] frequency) {
  this(intArray2LongArray(frequency));
}
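For intuition about what the frequency vector determines, the following hedged, self-contained sketch derives code word bit lengths with the textbook priority-queue construction; the library itself uses the more efficient in-place Moffat–Katajainen procedure, and the class and method names below are hypothetical.

```java
import java.util.PriorityQueue;

// Hypothetical illustration, not the dsiutils implementation: compute
// Huffman code word bit lengths from a frequency vector by repeatedly
// merging the two lowest-frequency nodes.
public class HuffmanLengths {

    static final class Node {
        final long freq; final int symbol; final Node left, right;
        Node(long f, int s, Node l, Node r) { freq = f; symbol = s; left = l; right = r; }
    }

    // Record each leaf's depth as its code word length (minimum 1 bit,
    // to handle the degenerate single-symbol alphabet).
    static void fill(Node node, int depth, int[] out) {
        if (node.left == null) { out[node.symbol] = Math.max(depth, 1); return; }
        fill(node.left, depth + 1, out);
        fill(node.right, depth + 1, out);
    }

    static int[] codeLengths(long[] frequency) {
        PriorityQueue<Node> pq = new PriorityQueue<>((a, b) -> Long.compare(a.freq, b.freq));
        for (int i = 0; i < frequency.length; i++) pq.add(new Node(frequency[i], i, null, null));
        while (pq.size() > 1) {
            Node a = pq.poll(), b = pq.poll();
            pq.add(new Node(a.freq + b.freq, -1, a, b)); // internal node
        }
        int[] len = new int[frequency.length];
        fill(pq.poll(), 0, len);
        return len;
    }

    public static void main(String[] args) {
        // frequencies 1, 1, 2, 4 -> code lengths 3, 3, 2, 1
        System.out.println(java.util.Arrays.toString(codeLengths(new long[] {1, 1, 2, 4})));
    }
}
```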
origin: blazegraph/database

// Discontinuous excerpt: the snippet extractor elided intervening lines.
  : getSumCodedValueBitLengths(setup.codec().codeWords(),
      raba, (Byte2Symbol) setup);
  : new long[size + 1]);
final long sumCodedValueBitLengths2 = writeCodedValues(setup
    .codec().coder(), raba, (Byte2Symbol) setup,
    codedValueOffset, obs);
assert sumCodedValueBitLengths == sumCodedValueBitLengths2 : "sumCodedValueBitLengths="
return new CodedRabaImpl(slice, setup.codec().decoder(),
    decoderInputsBitLength);
origin: blazegraph/database

// Discontinuous excerpt: the snippet extractor elided intervening lines.
codec = new HuffmanCodec(packedFrequency, decoderInputs);
    + printCodeBook(codec.codeWords(), this/* Symbol2Byte */));
origin: blazegraph/database

/**
 * Unit test for processing an {@link IRaba} representing B+Tree values,
 * suitable for setting up the data for compression.
 * 
 * @throws IOException
 * 
 * @todo test w/ nulls.
 */
public void test_valueRabaSetup() throws IOException {
  final int n = 3;
  final byte[][] a = new byte[n][];
  a[0] = new byte[]{2,3};
  a[1] = new byte[]{3,5};
  a[2] = new byte[]{'m','i','k','e'};
  
  final IRaba raba = new ReadOnlyValuesRaba(a);
  final RabaCodingSetup setup = new RabaCodingSetup(raba);
  
  // verify that we can re-create the decoder.
  doDecoderInputRoundTripTest(setup.getSymbolCount(), setup
      .decoderInputs());
  // verify that we can re-create the coder.
  doCoderRoundTripTest(setup.codec().codeWords(), setup.decoderInputs()
      .getShortestCodeWord(), setup.decoderInputs().getLengths(),
      setup.decoderInputs().getSymbols());
}
origin: blazegraph/database

/**
 * Unit test for processing an {@link IRaba} representing B+Tree keys,
 * suitable for setting up the data for compression.
 * 
 * @throws IOException
 */
public void test_keyRabaSetup() throws IOException {
  final int n = 8;
  final byte[][] a = new byte[n][];
  a[0] = new byte[]{1,2};
  a[1] = new byte[]{1,2,3};
  a[2] = new byte[]{1,3};
  a[3] = new byte[]{1,3,1};
  a[4] = new byte[]{1,3,3};
  a[5] = new byte[]{1,3,7};
  a[6] = new byte[]{1,5};
  a[7] = new byte[]{1,6,0};
  
  final IRaba raba = new ReadOnlyKeysRaba(a);
  final AbstractCodingSetup setup = new RabaCodingSetup(raba);
  doDecoderInputRoundTripTest(setup.getSymbolCount(), setup
      .decoderInputs());
  // verify that we can re-create the coder.
  doCoderRoundTripTest(setup.codec().codeWords(), setup.decoderInputs()
      .getShortestCodeWord(), setup.decoderInputs().getLengths(),
      setup.decoderInputs().getSymbols());
}
origin: blazegraph/database

final HuffmanCodec codec = new HuffmanCodec(frequency, decoderInputs);
final PrefixCoder expected = codec.coder();
final PrefixCoder actual = new Fast64CodeWordCoder(codec.codeWords());
// Discontinuous excerpt: the snippet extractor elided intervening lines.
  log.debug(printCodeBook(codec.codeWords()));
origin: blazegraph/database

/**
 * (Re-)constructs the canonical Huffman code from the shortest code word,
 * the non-decreasing bit lengths of each code word, and the permutation of
 * the symbols corresponding to those bit lengths. This information is
 * necessary and sufficient to reconstruct a canonical Huffman code.
 * 
 * @param decoderInputs
 *            This contains the necessary and sufficient information to
 *            recreate the {@link PrefixCoder}.
 * 
 * @return A new {@link PrefixCoder} instance for the corresponding
 *         canonical Huffman code.
 */
static public PrefixCoder newCoder(final DecoderInputs decoderInputs) {
  return newCoder(decoderInputs.getShortestCodeWord(), decoderInputs
      .getLengths(), decoderInputs.getSymbols());
}
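The reconstruction that newCoder performs can be illustrated with a short, self-contained sketch (hypothetical names, not the dsiutils implementation). Given the code word bit lengths in non-decreasing order, the canonical rule "next code = (previous code + 1), left-shifted by the length difference" regenerates every code word; pairing the results with the correlated symbol[] then restores the full code book.

```java
import java.util.Arrays;

// Hypothetical illustration: rebuild canonical code words from their
// non-decreasing bit lengths alone. length[] must describe a valid
// prefix-free (Kraft-satisfying) code.
public class CanonicalRebuild {

    static String[] rebuild(int[] length) {
        String[] code = new String[length.length];
        long c = 0;
        for (int i = 0; i < length.length; i++) {
            // canonical step: increment, then shift by the length delta
            if (i > 0) c = (c + 1) << (length[i] - length[i - 1]);
            // left-pad the binary representation to length[i] bits
            String bits = Long.toBinaryString(c);
            StringBuilder sb = new StringBuilder();
            for (int k = bits.length(); k < length[i]; k++) sb.append('0');
            code[i] = sb.append(bits).toString();
        }
        return code;
    }

    public static void main(String[] args) {
        // lengths 1, 2, 3, 3 -> code words 0, 10, 110, 111
        System.out.println(Arrays.toString(rebuild(new int[] {1, 2, 3, 3})));
    }
}
```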
it.unimi.dsi.compression.HuffmanCodec

Javadoc

An implementation of Huffman optimal prefix-free coding.

A Huffman coder is built starting from an array of frequencies corresponding to each symbol. Frequency 0 symbols are allowed, but they will degrade the resulting code.

Instances of this class compute a canonical Huffman code (Eugene S. Schwartz and Bruce Kallick, “Generating a Canonical Prefix Encoding”, Commun. ACM 7(3), pages 166−169, 1964), which can be decoded by a CanonicalFast64CodeWordDecoder. The construction uses the most efficient one-pass in-place codelength computation procedure described by Alistair Moffat and Jyrki Katajainen in “In-Place Calculation of Minimum-Redundancy Codes”, Algorithms and Data Structures, 4th International Workshop, number 955 in Lecture Notes in Computer Science, pages 393−402, Springer-Verlag, 1995.

We note in passing that this codec uses a CanonicalFast64CodeWordDecoder, which does not support codelengths above 64. However, since the worst case for codelengths is given by Fibonacci numbers, and frequencies are to be provided as integers, no codeword longer than the base-[(√5 + 1)/2] logarithm of √5 · 2³¹ (less than 47) will ever be generated.
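The 47-bit bound quoted above can be checked numerically; this small sketch simply evaluates the logarithm, base φ = (√5 + 1)/2, of √5 · 2³¹.

```java
// Verify the worst-case codeword-length bound: with 32-bit integer
// frequencies (worst case distributed like Fibonacci numbers), no
// canonical Huffman codeword can exceed log_phi(sqrt(5) * 2^31) bits.
public class CodeLengthBound {
    public static void main(String[] args) {
        double phi = (Math.sqrt(5.0) + 1.0) / 2.0;
        double bound = Math.log(Math.sqrt(5.0) * Math.pow(2.0, 31)) / Math.log(phi);
        System.out.println(bound);  // ~46.3, i.e. less than 47
    }
}
```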

Modifications

  1. This class has been modified to define an alternative constructor which exposes the symbol[] in correlated order with the code word bitLength[] and the shortest code word of the generated canonical code.
  2. A method has been added to recreate the PrefixCoder from the shortest code word, the code word length[], and the symbol[].

Most used methods

  • <init>
    Creates a new Huffman codec using the given vector of frequencies.
  • codeWords
  • coder
  • newCoder
    (Re-)constructs the canonical Huffman code from the shortest code word, the non-decreasing bit lengths, and the correlated symbols.
  • decoder
  • intArray2LongArray
