inSet = new CuckooSetBytes(inListValues.length);
inSet.load(inListValues);

if (!(inSet.lookup(vector[0], start[0], len[0]))) {

for (int j = 0; j != n; j++) {
  int i = sel[j];
  if (inSet.lookup(vector[i], start[i], len[i])) {
    sel[newSize++] = i;

int newSize = 0;
for (int i = 0; i != n; i++) {
  if (inSet.lookup(vector[i], start[i], len[i])) {
    sel[newSize++] = i;

if (!inSet.lookup(vector[0], start[0], len[0])) {

int i = sel[j];
if (!nullPos[i]) {
  if (inSet.lookup(vector[i], start[i], len[i])) {
    sel[newSize++] = i;

for (int i = 0; i != n; i++) {
  if (!nullPos[i]) {
    if (inSet.lookup(vector[i], start[i], len[i])) {
      sel[newSize++] = i;
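The fragments above all follow one pattern: walk the rows of a batch, test each value against the set, and compact the surviving row indices into `sel`. A minimal sketch of that pattern is given here. The names `SelectionFilterSketch`, `RangeSet`, `filter` and the `selectedInUse` flag are hypothetical and introduced only for illustration; the surrounding batch class is not shown in the excerpts, so this is an assumption about how the pieces fit together rather than the original code.

```java
import java.util.Arrays;

// Illustrative sketch only; names are hypothetical, not from the original code.
class SelectionFilterSketch {

  // Minimal stand-in for a set that can test a byte range for membership.
  interface RangeSet {
    boolean lookup(byte[] b, int start, int len);
  }

  // Keeps in sel only the indices of rows that are non-null and in the set.
  // n is the number of candidate rows; returns the new number of selected rows.
  static int filter(RangeSet set, byte[][] vector, int[] start, int[] len,
                    boolean[] nullPos, int[] sel, int n, boolean selectedInUse) {
    int newSize = 0;
    for (int j = 0; j != n; j++) {
      // With a selection vector in use, only rows listed in sel are candidates.
      int i = selectedInUse ? sel[j] : j;
      if (!nullPos[i] && set.lookup(vector[i], start[i], len[i])) {
        // Safe in place: newSize never exceeds j, so unread entries survive.
        sel[newSize++] = i;
      }
    }
    return newSize;
  }

  public static void main(String[] args) {
    byte[] ab = {'a', 'b'};
    byte[] cd = {'c', 'd'};
    byte[][] vector = {ab, cd, ab, ab};
    int[] start = {0, 0, 0, 0};
    int[] len = {2, 2, 2, 2};
    boolean[] nullPos = {false, false, true, false};
    int[] sel = new int[4];

    // Toy set that contains only the two-byte value "ab".
    RangeSet onlyAb = (b, s, l) -> l == 2 && b[s] == 'a' && b[s + 1] == 'b';

    int size = filter(onlyAb, vector, start, len, nullPos, sel, 4, false);
    System.out.println(size + " " + Arrays.toString(Arrays.copyOf(sel, size)));
  }
}
```

Row 1 is dropped because its value is not in the set, and row 2 is dropped because it is marked null, so only rows 0 and 3 remain selected.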
public void insert(byte[] x) {
  byte[] temp;
  if (lookup(x, 0, x.length)) {
    return;
  }

  // Try to insert up to n times. Rehash if that fails.
  for (int i = 0; i != n; i++) {
    int hash1 = h1(x, 0, x.length);
    if (t1[hash1] == null) {
      t1[hash1] = x;
      return;
    }

    // swap x and t1[h1(x)]
    temp = t1[hash1];
    t1[hash1] = x;
    x = temp;

    int hash2 = h2(x, 0, x.length);
    if (t2[hash2] == null) {
      t2[hash2] = x;
      return;
    }

    // swap x and t2[h2(x)]
    temp = t2[hash2];
    t2[hash2] = x;
    x = temp;
  }
  rehash();
  insert(x);
}
/**
 * Return true if and only if the value made up of the len bytes of
 * byte array b beginning at offset start is present in the set.
 */
public boolean lookup(byte[] b, int start, int len) {
  return entryEqual(t1, h1(b, start, len), b, start, len)
      || entryEqual(t2, h2(b, start, len), b, start, len);
}
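To see how `insert` and `lookup` cooperate, here is a minimal self-contained sketch of a two-table cuckoo set over byte ranges. It is not the original class: `MiniCuckooSet`, its `rebuild` method, the stand-in polynomial `hash`, the initial sizing, and the policy of doubling the tables on failure are all assumptions made for illustration. Only the eviction loop and the two-probe lookup mirror the code above.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

// Illustrative sketch only: a two-table cuckoo set over byte ranges.
// The hash function and the grow-on-failure policy are assumptions.
class MiniCuckooSet {
  private byte[][] t1;
  private byte[][] t2;
  private int n;        // slots per table
  private int salt = 1; // varies the second hash after a rebuild

  MiniCuckooSet(int expectedSize) {
    n = Math.max(4, 2 * expectedSize);
    t1 = new byte[n][];
    t2 = new byte[n][];
  }

  // Stand-in polynomial hash, NOT the hash used by the code above.
  private static int hash(byte[] b, int start, int len, int seed) {
    int h = seed * 0x9E3779B9 + 17;
    for (int i = start; i < start + len; i++) {
      h = 31 * h + b[i];
    }
    return h ^ (h >>> 15);
  }

  private int h1(byte[] b, int start, int len) {
    return (hash(b, start, len, 0) & 0x7FFFFFFF) % n;
  }

  private int h2(byte[] b, int start, int len) {
    return (hash(b, start, len, salt) & 0x7FFFFFFF) % n;
  }

  private static boolean entryEqual(byte[][] t, int idx, byte[] b, int start, int len) {
    byte[] e = t[idx];
    if (e == null || e.length != len) {
      return false;
    }
    for (int i = 0; i < len; i++) {
      if (e[i] != b[start + i]) {
        return false;
      }
    }
    return true;
  }

  // A value can only live in one of two slots, so lookup probes exactly two.
  boolean lookup(byte[] b, int start, int len) {
    return entryEqual(t1, h1(b, start, len), b, start, len)
        || entryEqual(t2, h2(b, start, len), b, start, len);
  }

  void insert(byte[] x) {
    if (lookup(x, 0, x.length)) {
      return;
    }
    byte[] leftover = tryInsert(x);
    if (leftover != null) {
      rebuild(leftover);
    }
  }

  // Returns null on success, or the displaced value still in hand after n rounds.
  private byte[] tryInsert(byte[] x) {
    for (int i = 0; i != n; i++) {
      int i1 = h1(x, 0, x.length);
      byte[] temp = t1[i1];
      t1[i1] = x;
      if (temp == null) {
        return null;
      }
      x = temp; // evicted from t1, try its slot in t2
      int i2 = h2(x, 0, x.length);
      temp = t2[i2];
      t2[i2] = x;
      if (temp == null) {
        return null;
      }
      x = temp; // evicted from t2, go around again
    }
    return x;
  }

  // Collects every stored value plus the leftover, then retries with
  // larger tables and a new salt until all of them fit.
  private void rebuild(byte[] leftover) {
    List<byte[]> all = new ArrayList<>();
    all.add(leftover);
    for (byte[] v : t1) if (v != null) all.add(v);
    for (byte[] v : t2) if (v != null) all.add(v);
    boolean ok;
    do {
      n *= 2;
      salt++;
      t1 = new byte[n][];
      t2 = new byte[n][];
      ok = true;
      for (byte[] v : all) {
        if (tryInsert(v) != null) { ok = false; break; }
      }
    } while (!ok);
  }

  public static void main(String[] args) {
    MiniCuckooSet s = new MiniCuckooSet(3);
    for (String v : new String[] {"foo", "bar", "baz"}) {
      s.insert(v.getBytes(StandardCharsets.UTF_8));
    }
    byte[] buf = "thewordfooisinhere".getBytes(StandardCharsets.UTF_8);
    System.out.println(s.lookup(buf, 7, 3)); // "foo" is in the set: true
    System.out.println(s.lookup(buf, 0, 3)); // "the" is not: false
  }
}
```

Collecting all values before rebuilding means a failed attempt never loses an entry, at the cost of a temporary list. The original code instead bounds the number of rehash attempts and throws when the limit is exceeded.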
@Test
public void testSetBytes() {
  String[] strings = {"foo", "bar", "baz", "a", "", "x1341", "Z"};
  String[] negativeStrings = {"not", "in", "the", "set", "foobar"};
  byte[][] values = getByteArrays(strings);
  byte[][] negatives = getByteArrays(negativeStrings);

  // load set
  CuckooSetBytes s = new CuckooSetBytes(strings.length);
  for (byte[] v : values) {
    s.insert(v);
  }

  // test that the values we added are there
  for (byte[] v : values) {
    assertTrue(s.lookup(v, 0, v.length));
  }

  // test that values that we know are missing are shown to be absent
  for (byte[] v : negatives) {
    assertFalse(s.lookup(v, 0, v.length));
  }

  // Test that we can search correctly using a buffer and pulling
  // a sequence of bytes out of the middle of it. In this case it
  // is the 3 letter sequence "foo".
  byte[] buf = getUTF8Bytes("thewordfooisinhere");
  assertTrue(s.lookup(buf, 7, 3));
}
throw new RuntimeException("Too many rehashes");

updateHashSalt();

for (byte[] v : prev1) {
  if (v != null) {
    byte[] x = tryInsert(v);
    if (x != null) {
      rehash();
      return;

byte[] x = tryInsert(v);
if (x != null) {
  rehash();
  return;
@Test
public void testSetBytesLargeRandom() {
  byte[][] values;
  Random gen = new Random(98763537);
  for (int i = 0; i < 200;) {

    // Make a random array of byte arrays
    int size = gen.nextInt() % MAX_SIZE;
    if (size <= 0) { // ensure size is >= 1, otherwise try again
      continue;
    }
    i++;
    values = new byte[size][];
    loadRandomBytes(values, gen);

    // load them into a set
    CuckooSetBytes s = new CuckooSetBytes(size);
    loadSet(s, values);

    // look them up to make sure they are all there
    for (int j = 0; j != size; j++) {
      assertTrue(s.lookup(values[j], 0, values[j].length));
    }
  }
}
/**
 * Insert all values in the input array into the set.
 */
public void load(byte[][] a) {
  for (byte[] x : a) {
    insert(x);
  }
}
/**
 * second hash function
 */
private int h2(byte[] b, int start, int len) {

  // AND hash with mask to 0 out sign bit to make sure it's positive.
  // Then we know taking the result mod n is in the range (0..n-1).
  // Include salt as argument so this hash function can be varied
  // if we need to rehash.
  return (hash(b, start, len, salt) & 0x7FFFFFFF) % n;
}
a = (a - c) & INT_MASK; a ^= rot(c, 4); c = (c + b) & INT_MASK;
b = (b - a) & INT_MASK; b ^= rot(a, 6); a = (a + c) & INT_MASK;
c = (c - b) & INT_MASK; c ^= rot(b, 8); b = (b + a) & INT_MASK;
a = (a - c) & INT_MASK; a ^= rot(c,16); c = (c + b) & INT_MASK;
b = (b - a) & INT_MASK; b ^= rot(a,19); a = (a + c) & INT_MASK;
c = (c - b) & INT_MASK; c ^= rot(b, 4); b = (b + a) & INT_MASK;

c ^= b; c = (c - rot(b,14)) & INT_MASK;
a ^= c; a = (a - rot(c,11)) & INT_MASK;
b ^= a; b = (b - rot(a,25)) & INT_MASK;
c ^= b; c = (c - rot(b,16)) & INT_MASK;
a ^= c; a = (a - rot(c,4)) & INT_MASK;
b ^= a; b = (b - rot(a,14)) & INT_MASK;
c ^= b; c = (c - rot(b,24)) & INT_MASK;
/**
 * first hash function
 */
private int h1(byte[] b, int start, int len) {

  // AND hash with mask to 0 out sign bit to make sure it's positive.
  // Then we know taking the result mod n is in the range (0..n-1).
  return (hash(b, start, len, 0) & 0x7FFFFFFF) % n;
}
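The comments in h1 and h2 explain why the sign bit is masked off before taking the remainder. The short sketch below illustrates that reasoning in isolation. The class `NonNegativeIndex` and its method names are hypothetical; `Math.abs` and `Math.floorMod` are standard Java methods shown only for comparison and are not used by the code above.

```java
// Illustrative sketch only; class and method names are hypothetical.
class NonNegativeIndex {

  // Same expression shape as h1/h2: clear the sign bit, then reduce mod n.
  static int maskedIndex(int hash, int n) {
    return (hash & 0x7FFFFFFF) % n;
  }

  // Plain remainder keeps the sign of the dividend, so it can be negative.
  static int plainRemainder(int hash, int n) {
    return hash % n;
  }

  // Math.abs(Integer.MIN_VALUE) is still Integer.MIN_VALUE, so this
  // common alternative can also yield a negative index.
  static int absIndex(int hash, int n) {
    return Math.abs(hash) % n;
  }

  public static void main(String[] args) {
    int n = 10;
    int worstCase = Integer.MIN_VALUE;
    System.out.println(maskedIndex(worstCase, n));    // 0
    System.out.println(plainRemainder(worstCase, n)); // -8
    System.out.println(absIndex(worstCase, n));       // -8
    System.out.println(Math.floorMod(worstCase, n));  // 2, also a valid index
  }
}
```

Masking and `Math.floorMod` both produce an index in the range 0 to n-1 for positive n, but they map negative hashes to different slots, so the two are not interchangeable once values have been stored.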