Gets the maximum default range size of a file being split during GGFS task execution. When a GGFS task is about to
be executed, it first requests file block locations. Each location is defined as a GridGgfsFileRange, which has a
length. If this parameter is set to a positive value, GGFS will split a single file range into smaller
ranges with length not greater than this parameter. The only exception is when the maximum task range
length is smaller than the file block size. In this case the maximum task range size will be overridden and set to
the file block size.
Note that this parameter is applied when the task is split into jobs, before GridGgfsRecordResolver is
applied. Therefore, the final file ranges assigned to particular jobs could be greater than the value of this
parameter, depending on file data layout and the selected resolver type.
Setting this parameter might be useful when a file is highly colocated and has very long consecutive data chunks,
so that task execution suffers from insufficient parallelism. E.g., if you have one GGFS node in the topology
and want to process a 1 GB file, then only a single range of length 1 GB will be returned. This will result in
a single job which will be processed in one thread. But if you provide this configuration parameter and set the
maximum range length to 16 MB, then 64 ranges will be returned, resulting in 64 jobs which could be executed in
parallel.
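The splitting rule and the arithmetic of the example above can be sketched as follows. This is only an illustration, not the GGFS implementation: the class RangeSplitSketch, its methods, and the 64 KB block size are assumptions made for the sketch, ranges are modelled as plain {start, length} arrays instead of GridGgfsFileRange, and block alignment and the record resolver are ignored.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only; none of these names belong to the GGFS API.
class RangeSplitSketch {
    /**
     * Returns the range length limit that would actually be applied,
     * or 0 if splitting is disabled (parameter is 0 or negative).
     */
    static long effectiveMaxLen(long maxRangeLen, long blockSize) {
        if (maxRangeLen <= 0)
            return 0; // Parameter is ignored.

        // A limit smaller than the block size is overridden by the block size.
        return Math.max(maxRangeLen, blockSize);
    }

    /** Splits one range into pieces of at most the effective limit. Each piece is {start, length}. */
    static List<long[]> split(long start, long len, long maxRangeLen, long blockSize) {
        List<long[]> res = new ArrayList<>();
        long limit = effectiveMaxLen(maxRangeLen, blockSize);

        if (limit == 0 || len <= limit) {
            res.add(new long[] {start, len}); // Nothing to split.
            return res;
        }

        long end = start + len;

        for (long pos = start; pos < end; pos += limit)
            res.add(new long[] {pos, Math.min(limit, end - pos)});

        return res;
    }

    public static void main(String[] args) {
        long fileLen = 1L << 30;      // 1 GB file returned as a single range.
        long maxRangeLen = 16L << 20; // Maximum task range length of 16 MB.
        long blockSize = 64L << 10;   // Assumed block size of 64 KB.

        System.out.println(split(0, fileLen, maxRangeLen, blockSize).size()); // prints 64
    }
}
```

With the parameter left at 0, the same call returns the original single range, which corresponds to the single-job case described above.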
Note that some GridGgfs.execute() methods can override the value of this parameter.
If the value of this parameter is set to 0 or a negative value, it is simply ignored. The default value is 0.