TimePartitionedFileSetArguments.setOutputPartitionTime(sinkArgs, outputPartitionTime);
if (!Strings.isNullOrEmpty(tpfsSinkConfig.filePathFormat)) {
  // the third argument (an assumed tpfsSinkConfig.timeZone field) is the time-zone ID
  // used when resolving the date-format pattern into an output path
  TimePartitionedFileSetArguments.setOutputPathFormat(sinkArgs, tpfsSinkConfig.filePathFormat,
                                                      tpfsSinkConfig.timeZone);
}
@Override
public void run(JavaSparkExecutionContext sec) throws Exception {
  JavaSparkContext jsc = new JavaSparkContext();
  String input = sec.getRuntimeArguments().get("input");
  String output = sec.getRuntimeArguments().get("output");

  // read the dataset
  JavaPairRDD<Long, String> inputData = sec.fromDataset(input);
  JavaPairRDD<String, Integer> stringLengths = transformRDD(inputData);

  // write the character count to the dataset
  sec.saveAsDataset(stringLengths, output);

  String inputPartitionTime = sec.getRuntimeArguments().get("inputKey");
  String outputPartitionTime = sec.getRuntimeArguments().get("outputKey");

  // read and write datasets with dataset arguments
  if (inputPartitionTime != null && outputPartitionTime != null) {
    Map<String, String> inputArgs = new HashMap<>();
    TimePartitionedFileSetArguments.setInputStartTime(inputArgs, Long.parseLong(inputPartitionTime) - 100);
    TimePartitionedFileSetArguments.setInputEndTime(inputArgs, Long.parseLong(inputPartitionTime) + 100);

    // read the dataset with custom user dataset arguments
    JavaPairRDD<Long, String> customPartitionData = sec.fromDataset(input, inputArgs);

    // create a new RDD with the same key but with a new value which is the length of the string
    JavaPairRDD<String, Integer> customPartitionStringLengths = transformRDD(customPartitionData);

    // write the character count to the dataset with custom user dataset arguments
    Map<String, String> outputArgs = new HashMap<>();
    TimePartitionedFileSetArguments.setOutputPartitionTime(outputArgs, Long.parseLong(outputPartitionTime));
    sec.saveAsDataset(customPartitionStringLengths, output, outputArgs);
  }
}
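The body of transformRDD is not shown above; the comments state that it maps each (id, line) pair to (line, line length). A minimal, Spark-free sketch of that transformation, assuming only the behavior described in the comments (the name `transform` and the plain-collections signature are illustrative, not CDAP's API):

```java
import java.util.AbstractMap.SimpleEntry;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

public class TransformSketch {
  // Plain-Java analog of the transformRDD step: drop the numeric key and
  // pair each string with its length (the "character count" written above).
  static List<Map.Entry<String, Integer>> transform(Map<Long, String> input) {
    return input.entrySet().stream()
        .map(e -> new SimpleEntry<>(e.getValue(), e.getValue().length()))
        .collect(Collectors.toList());
  }

  public static void main(String[] args) {
    // a single record, so the output order is deterministic
    System.out.println(transform(Map.of(1L, "hello")));  // [hello=5]
  }
}
```

In the Spark version the same logic would be a `mapToPair` over the input JavaPairRDD.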
TimePartitionedFileSetArguments.setOutputPartitionTime(args, date.getTime());
TimeZone timeZone = Calendar.getInstance().getTimeZone();
TimePartitionedFileSetArguments.setOutputPathFormat(args, "yyyy-MM-dd/HH_mm", timeZone.getID());
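The path format passed above is a Java date-format pattern, resolved in the given time zone against the partition time. A small sketch of that mapping using only the standard library (the `partitionPath` helper is hypothetical, shown to illustrate how a pattern such as "yyyy-MM-dd/HH_mm" becomes an output directory):

```java
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.TimeZone;

public class PathFormatSketch {
  // Format a partition time (epoch millis) with a date pattern and time zone,
  // mirroring how the output path format above shapes the partition directory.
  static String partitionPath(long partitionTime, String pattern, String timeZoneId) {
    SimpleDateFormat format = new SimpleDateFormat(pattern);
    format.setTimeZone(TimeZone.getTimeZone(timeZoneId));
    return format.format(new Date(partitionTime));
  }

  public static void main(String[] args) {
    // midnight Jan 1 2020 UTC with the pattern from the snippet above
    System.out.println(partitionPath(1577836800000L, "yyyy-MM-dd/HH_mm", "UTC"));  // 2020-01-01/00_00
  }
}
```

Passing the time-zone ID explicitly (rather than relying on the JVM default) keeps partition paths stable across machines in different zones.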
TimePartitionedFileSetArguments.setInputEndTime(inputArgs, inputTime + 100);
Map<String, String> outputArgs = new HashMap<>();
TimePartitionedFileSetArguments.setOutputPartitionTime(outputArgs, outputTime);
Map<String, String> args = new HashMap<>();
args.putAll(RuntimeArguments.addScope(Scope.DATASET, "tpfs", inputArgs));
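The addScope call above namespaces the arguments so they apply only to the "tpfs" dataset. A sketch of the scoping convention, assuming keys take the form "&lt;scope&gt;.&lt;name&gt;.&lt;key&gt;" (this `addScope` re-implementation is illustrative, not CDAP's actual code):

```java
import java.util.HashMap;
import java.util.Map;

public class ScopeSketch {
  // Prefix every argument key with "<scope>.<name>." so a generic runtime
  // argument map can carry per-dataset arguments side by side.
  static Map<String, String> addScope(String scope, String name, Map<String, String> args) {
    Map<String, String> scoped = new HashMap<>();
    for (Map.Entry<String, String> e : args.entrySet()) {
      scoped.put(scope + "." + name + "." + e.getKey(), e.getValue());
    }
    return scoped;
  }

  public static void main(String[] args) {
    Map<String, String> inputArgs = new HashMap<>();
    inputArgs.put("input.start.time", "1000");
    System.out.println(addScope("dataset", "tpfs", inputArgs));
    // prints {dataset.tpfs.input.start.time=1000}
  }
}
```

Because the scoped keys are distinct, input arguments for one dataset and output arguments for another can be merged into a single runtime-argument map without collisions.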
TimePartitionedFileSetArguments.setOutputPartitionTime(outputArgs, time);
final ImmutableMap<String, String> assignedMetadata = ImmutableMap.of("region", "13",
                                                                      "data.source.name", "input");
TimePartitionedFileSetArguments.setOutputPartitionTime(outputArgs, time5);
runtimeArguments.putAll(RuntimeArguments.addScope(Scope.DATASET, TIME_PARTITIONED, outputArgs));