private void snapshotActiveBuckets(
        final long checkpointId,
        final ListState<byte[]> bucketStatesContainer) throws Exception {

    for (Bucket<IN, BucketID> bucket : activeBuckets.values()) {
        final BucketState<BucketID> bucketState = bucket.onReceptionOfCheckpoint(checkpointId);

        final byte[] serializedBucketState = SimpleVersionedSerialization
                .writeVersionAndSerialize(bucketStateSerializer, bucketState);

        bucketStatesContainer.add(serializedBucketState);

        if (LOG.isDebugEnabled()) {
            LOG.debug("Subtask {} checkpointing: {}", subtaskIndex, bucketState);
        }
    }
}

@VisibleForTesting
void serializeV1(BucketState<BucketID> state, DataOutputView out) throws IOException {
    SimpleVersionedSerialization.writeVersionAndSerialize(bucketIdSerializer, state.getBucketId(), out);
    out.writeUTF(state.getBucketPath().toString());
    out.writeLong(state.getInProgressFileCreationTime());

    // put the current open part file
    if (state.hasInProgressResumableFile()) {
        final RecoverableWriter.ResumeRecoverable resumable = state.getInProgressResumableFile();
        out.writeBoolean(true);
        SimpleVersionedSerialization.writeVersionAndSerialize(resumableSerializer, resumable, out);
    } else {
        out.writeBoolean(false);
    }

    // put the map of pending files per checkpoint
    final Map<Long, List<RecoverableWriter.CommitRecoverable>> pendingCommitters =
            state.getCommittableFilesPerCheckpoint();

    // manually keep the version here to save some bytes
    out.writeInt(commitableSerializer.getVersion());

    out.writeInt(pendingCommitters.size());
    for (Entry<Long, List<RecoverableWriter.CommitRecoverable>> resumablesForCheckpoint : pendingCommitters.entrySet()) {
        List<RecoverableWriter.CommitRecoverable> resumables = resumablesForCheckpoint.getValue();

        out.writeLong(resumablesForCheckpoint.getKey());
        out.writeInt(resumables.size());

        for (RecoverableWriter.CommitRecoverable resumable : resumables) {
            byte[] serialized = commitableSerializer.serialize(resumable);
            out.writeInt(serialized.length);
            out.write(serialized);
        }
    }
}

@Test
public void testSerializationRoundTrip() throws IOException {
    final SimpleVersionedSerializer<String> utfEncoder = new SimpleVersionedSerializer<String>() {

        private static final int VERSION = Integer.MAX_VALUE / 2; // version should occupy many bytes

        @Override
        public int getVersion() {
            return VERSION;
        }

        @Override
        public byte[] serialize(String str) throws IOException {
            return str.getBytes(StandardCharsets.UTF_8);
        }

        @Override
        public String deserialize(int version, byte[] serialized) throws IOException {
            assertEquals(VERSION, version);
            return new String(serialized, StandardCharsets.UTF_8);
        }
    };

    final String testString = "dugfakgs";
    final DataOutputSerializer out = new DataOutputSerializer(32);
    SimpleVersionedSerialization.writeVersionAndSerialize(utfEncoder, testString, out);
    final byte[] outBytes = out.getCopyOfBuffer();

    final byte[] bytes = SimpleVersionedSerialization.writeVersionAndSerialize(utfEncoder, testString);
    assertArrayEquals(bytes, outBytes);

    final DataInputDeserializer in = new DataInputDeserializer(bytes);
    final String deserialized = SimpleVersionedSerialization.readVersionAndDeSerialize(utfEncoder, in);
    final String deserializedFromBytes = SimpleVersionedSerialization.readVersionAndDeSerialize(utfEncoder, outBytes);
    assertEquals(testString, deserialized);
    assertEquals(testString, deserializedFromBytes);
}

@Test
public void testSerializationEmpty() throws IOException {
    final File testFolder = tempFolder.newFolder();
    final FileSystem fs = FileSystem.get(testFolder.toURI());
    final RecoverableWriter writer = fs.createRecoverableWriter();

    final Path testBucket = new Path(testFolder.getPath(), "test");

    final BucketState<String> bucketState = new BucketState<>(
            "test", testBucket, Long.MAX_VALUE, null, new HashMap<>());

    final SimpleVersionedSerializer<BucketState<String>> serializer =
            new BucketStateSerializer<>(
                    writer.getResumeRecoverableSerializer(),
                    writer.getCommitRecoverableSerializer(),
                    SimpleVersionedStringSerializer.INSTANCE
            );

    byte[] bytes = SimpleVersionedSerialization.writeVersionAndSerialize(serializer, bucketState);
    final BucketState<String> recoveredState = SimpleVersionedSerialization.readVersionAndDeSerialize(serializer, bytes);

    Assert.assertEquals(testBucket, recoveredState.getBucketPath());
    Assert.assertNull(recoveredState.getInProgressResumableFile());
    Assert.assertTrue(recoveredState.getCommittableFilesPerCheckpoint().isEmpty());
}

SimpleVersionedSerialization.writeVersionAndSerialize(emptySerializer, "abc", out);
final byte[] outBytes = out.getCopyOfBuffer();
final byte[] bytes = SimpleVersionedSerialization.writeVersionAndSerialize(emptySerializer, "abc");
assertArrayEquals(bytes, outBytes);

);
byte[] bytes = SimpleVersionedSerialization.writeVersionAndSerialize(serializer, bucketState);

stream.close();
byte[] bytes = SimpleVersionedSerialization.writeVersionAndSerialize(serializer, bucketState);

@Test
public void testSerializationOnlyInProgress() throws IOException {
    final File testFolder = tempFolder.newFolder();
    final FileSystem fs = FileSystem.get(testFolder.toURI());

    final Path testBucket = new Path(testFolder.getPath(), "test");

    final RecoverableWriter writer = fs.createRecoverableWriter();
    final RecoverableFsDataOutputStream stream = writer.open(testBucket);
    stream.write(IN_PROGRESS_CONTENT.getBytes(Charset.forName("UTF-8")));

    final RecoverableWriter.ResumeRecoverable current = stream.persist();

    final BucketState<String> bucketState = new BucketState<>(
            "test", testBucket, Long.MAX_VALUE, current, new HashMap<>());

    final SimpleVersionedSerializer<BucketState<String>> serializer =
            new BucketStateSerializer<>(
                    writer.getResumeRecoverableSerializer(),
                    writer.getCommitRecoverableSerializer(),
                    SimpleVersionedStringSerializer.INSTANCE
            );

    final byte[] bytes = SimpleVersionedSerialization.writeVersionAndSerialize(serializer, bucketState);

    // to simulate that everything is over for file.
    stream.close();

    final BucketState<String> recoveredState = SimpleVersionedSerialization.readVersionAndDeSerialize(serializer, bytes);

    Assert.assertEquals(testBucket, recoveredState.getBucketPath());

    FileStatus[] statuses = fs.listStatus(testBucket.getParent());
    Assert.assertEquals(1L, statuses.length);
    Assert.assertTrue(
            statuses[0].getPath().getPath().startsWith(
                    (new Path(testBucket.getParent(), ".test.inprogress")).toString())
    );
}