org.apache.flink.streaming.api.transformations.StreamTransformation Java code examples

// Fall back to the globally configured max parallelism when none was set on the transformation.
if (transform.getMaxParallelism() <= 0) {
  transform.setMaxParallelism(globalMaxParallelismFromConfig);
}
transform.getOutputType();
if (transform.getBufferTimeout() >= 0) {
  streamGraph.setBufferTimeout(transform.getId(), transform.getBufferTimeout());
}
if (transform.getUid() != null) {
  streamGraph.setTransformationUID(transform.getId(), transform.getUid());
}
if (transform.getUserProvidedNodeHash() != null) {
  streamGraph.setTransformationUserHash(transform.getId(), transform.getUserProvidedNodeHash());
}
if (transform.getMinResources() != null && transform.getPreferredResources() != null) {
  streamGraph.setResources(transform.getId(), transform.getMinResources(), transform.getPreferredResources());
}
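
The first check in the snippet above treats a non-positive max parallelism as "not set" and falls back to a job-wide value. A minimal, dependency-free sketch of that rule follows; the class `MaxParallelismFallback` and its `resolve` method are hypothetical names introduced for illustration and are not part of Flink.

```java
/** Hypothetical helper, not part of Flink: mirrors the "<= 0 means unset" check above. */
final class MaxParallelismFallback {

    private MaxParallelismFallback() {}

    /**
     * Returns the transformation's own max parallelism if it was set (> 0),
     * otherwise the job-wide value taken from the configuration.
     */
    static int resolve(int transformationMaxParallelism, int globalMaxParallelismFromConfig) {
        if (transformationMaxParallelism <= 0) {
            return globalMaxParallelismFromConfig;
        }
        return transformationMaxParallelism;
    }

    public static void main(String[] args) {
        // An unset value (-1) falls back to the global setting; an explicit one is kept.
        System.out.println(resolve(-1, 128));
        System.out.println(resolve(42, 128));
    }
}
```

The sketch returns a value instead of mutating the transformation, which keeps it easy to test in isolation.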

/**
 * Creates a new {@code SplitTransformation} from the given input and {@code OutputSelector}.
 *
 * @param input The input {@code StreamTransformation}
 * @param outputSelector The output selector
 */
public SplitTransformation(StreamTransformation<T> input,
    OutputSelector<T> outputSelector) {
  super("Split", input.getOutputType(), input.getParallelism());
  this.input = input;
  this.outputSelector = outputSelector;
}

/**
 * Adds a type information hint about the return type of this operator. This method
 * can be used in cases where Flink cannot determine automatically what the produced
 * type of a function is. That can be the case if the function uses generic type variables
 * in the return type that cannot be inferred from the input type.
 *
 * <p>In most cases, the methods {@link #returns(Class)} and {@link #returns(TypeHint)}
 * are preferable.
 *
 * @param typeInfo type information as a return type hint
 * @return This operator with a given return type hint.
 */
public SingleOutputStreamOperator<T> returns(TypeInformation<T> typeInfo) {
  requireNonNull(typeInfo, "TypeInformation must not be null");
  transformation.setOutputType(typeInfo);
  return this;
}

/**
 * Sets the parallelism and maximum parallelism of this operator to one,
 * and marks the operator so that its parallelism cannot be set to any other value.
 *
 * @return The operator with only one parallelism.
 */
@PublicEvolving
public SingleOutputStreamOperator<T> forceNonParallel() {
  transformation.setParallelism(1);
  transformation.setMaxParallelism(1);
  nonParallel = true;
  return this;
}

/**
 * Gets the parallelism for this operator.
 *
 * @return The parallelism set for this operator.
 */
public int getParallelism() {
  return transformation.getParallelism();
}

/**
 * Returns the ID of the {@link DataStream} in the current {@link StreamExecutionEnvironment}.
 *
 * @return ID of the DataStream
 */
@Internal
public int getId() {
  return transformation.getId();
}

/**
 * Gets the minimum resources for this operator.
 *
 * @return The minimum resources set for this operator.
 */
@PublicEvolving
public ResourceSpec getMinResources() {
  return transformation.getMinResources();
}

/**
 * Sets the name of the current data stream. This name is
 * used by the visualization and logging during runtime.
 *
 * @return The named operator.
 */
public SingleOutputStreamOperator<T> name(String name) {
  transformation.setName(name);
  return this;
}

/**
 * Sets the parallelism for this operator.
 *
 * @param parallelism
 *            The parallelism for this operator.
 * @return The operator with set parallelism.
 */
public SingleOutputStreamOperator<T> setParallelism(int parallelism) {
  Preconditions.checkArgument(canBeParallel() || parallelism == 1,
      "The parallelism of non parallel operator must be 1.");
  transformation.setParallelism(parallelism);
  return this;
}
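
The interplay between `forceNonParallel()` and the argument check in `setParallelism(int)` can be sketched without Flink on the classpath. `OperatorSketch` below is a hypothetical stand-in, not a Flink class; it assumes that the failed check surfaces as an `IllegalArgumentException` and uses an arbitrary starting parallelism of 4.

```java
/** Hypothetical stand-in for an operator handle; not a Flink class. */
final class OperatorSketch {
    private int parallelism = 4;       // arbitrary starting value for the demo
    private int maxParallelism = -1;   // -1 is used here to mean "not set"
    private boolean nonParallel = false;

    /** Pins parallelism and max parallelism to 1 and remembers that choice. */
    OperatorSketch forceNonParallel() {
        parallelism = 1;
        maxParallelism = 1;
        nonParallel = true;
        return this;
    }

    /** Rejects any value other than 1 once the operator was forced non-parallel. */
    OperatorSketch setParallelism(int parallelism) {
        if (nonParallel && parallelism != 1) {
            throw new IllegalArgumentException(
                "The parallelism of non parallel operator must be 1.");
        }
        this.parallelism = parallelism;
        return this;
    }

    int getParallelism() {
        return parallelism;
    }

    int getMaxParallelism() {
        return maxParallelism;
    }

    public static void main(String[] args) {
        OperatorSketch op = new OperatorSketch().forceNonParallel();
        try {
            op.setParallelism(2);
        } catch (IllegalArgumentException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```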

/**
 * Sets the {@link ChainingStrategy} for the given operator affecting the
 * way operators will possibly be co-located on the same thread for
 * increased performance.
 *
 * @param strategy
 *            The selected {@link ChainingStrategy}
 * @return The operator with the modified chaining strategy
 */
@PublicEvolving
private SingleOutputStreamOperator<T> setChainingStrategy(ChainingStrategy strategy) {
  this.transformation.setChainingStrategy(strategy);
  return this;
}

/**
 * Sets an ID for this operator.
 *
 * <p>The specified ID is used to assign the same operator ID across job
 * submissions (for example when starting a job from a savepoint).
 *
 * <p><strong>Important</strong>: this ID needs to be unique per
 * transformation and job. Otherwise, job submission will fail.
 *
 * @param uid The unique user-specified ID of this transformation.
 * @return The operator with the specified ID.
 */
@PublicEvolving
public SingleOutputStreamOperator<T> uid(String uid) {
  transformation.setUid(uid);
  return this;
}
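
The uniqueness requirement stated in the Javadoc above can be illustrated with a small registry that rejects a repeated ID. This is only a sketch of the constraint: `UidRegistry` is a hypothetical class, and how and where Flink itself performs the check is not shown in the snippet.

```java
import java.util.HashSet;
import java.util.Set;

/** Hypothetical illustration of "unique per transformation and job"; not Flink code. */
final class UidRegistry {
    private final Set<String> seen = new HashSet<>();

    /**
     * Records a user-specified ID. A null ID means "none specified" and is ignored;
     * a repeated ID is rejected, standing in for a failed job submission.
     */
    void register(String uid) {
        if (uid == null) {
            return;
        }
        if (!seen.add(uid)) {
            throw new IllegalStateException("Duplicate uid: " + uid);
        }
    }

    int size() {
        return seen.size();
    }

    public static void main(String[] args) {
        UidRegistry registry = new UidRegistry();
        registry.register("source-orders");
        registry.register("map-enrich");
        System.out.println(registry.size() + " unique IDs registered");
    }
}
```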

/**
 * Sets the slot sharing group of this operation. Parallel instances of
 * operations that are in the same slot sharing group will be co-located in the same
 * TaskManager slot, if possible.
 *
 * <p>Operations inherit the slot sharing group of input operations if all input operations
 * are in the same slot sharing group and no slot sharing group was explicitly specified.
 *
 * <p>Initially an operation is in the default slot sharing group. An operation can be put into
 * the default group explicitly by setting the slot sharing group to {@code "default"}.
 *
 * @param slotSharingGroup The slot sharing group name.
 */
@PublicEvolving
public SingleOutputStreamOperator<T> slotSharingGroup(String slotSharingGroup) {
  transformation.setSlotSharingGroup(slotSharingGroup);
  return this;
}
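
The inheritance rule described in the Javadoc above can be written as a small resolution function. The sketch below is hypothetical (`SlotSharingGroupResolver` is not a Flink class) and makes two assumptions the Javadoc does not spell out: inputs that disagree fall back to `"default"`, and every input group name is non-null.

```java
import java.util.Arrays;
import java.util.List;

/** Hypothetical sketch of the slot sharing group rules described above; not Flink code. */
final class SlotSharingGroupResolver {
    static final String DEFAULT_GROUP = "default";

    private SlotSharingGroupResolver() {}

    /**
     * @param specifiedGroup group set explicitly on the operation, or null if none
     * @param inputGroups    groups of all input operations (assumed non-null entries)
     */
    static String resolve(String specifiedGroup, List<String> inputGroups) {
        if (specifiedGroup != null) {
            return specifiedGroup;            // an explicit setting always wins
        }
        String inherited = null;
        for (String group : inputGroups) {
            if (inherited == null) {
                inherited = group;
            } else if (!inherited.equals(group)) {
                return DEFAULT_GROUP;         // assumption: mixed inputs use the default group
            }
        }
        // No inputs at all (for example a source) also ends up in the default group.
        return inherited == null ? DEFAULT_GROUP : inherited;
    }

    public static void main(String[] args) {
        System.out.println(resolve(null, Arrays.asList("io", "io")));
        System.out.println(resolve(null, Arrays.asList("io", "cpu")));
        System.out.println(resolve("cpu", Arrays.asList("io", "io")));
    }
}
```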

Assert.assertEquals(-1, operator.getTransformation().getMaxParallelism());
Assert.assertEquals(-1, operator.getTransformation().getMaxParallelism());
Assert.assertEquals(42, operator.getTransformation().getMaxParallelism());
Assert.assertEquals(1, operator.getTransformation().getMaxParallelism());
Assert.assertEquals(1 << 15, operator.getTransformation().getMaxParallelism());
Assert.assertEquals(1 << 15, operator.getTransformation().getMaxParallelism());

public SideOutputTransformation(StreamTransformation<?> input, final OutputTag<T> tag) {
  super("SideOutput", tag.getTypeInfo(), requireNonNull(input).getParallelism());
  this.input = input;
  this.tag = requireNonNull(tag);
}

@Override
public DataStreamSource<T> setParallelism(int parallelism) {
  if (parallelism != 1 && !isParallel) {
    throw new IllegalArgumentException("Source: " + transformation.getId() + " is not a parallel source");
  } else {
    super.setParallelism(parallelism);
    return this;
  }
}

/**
 * Sets the name of this sink. This name is
 * used by the visualization and logging during runtime.
 *
 * @return The named sink.
 */
public CassandraSink<IN> name(String name) {
  if (useDataStreamSink) {
    getSinkTransformation().setName(name);
  } else {
    getStreamTransformation().setName(name);
  }
  return this;
}

/**
 * Sets the parallelism for this sink. The degree must be higher than zero.
 *
 * @param parallelism The parallelism for this sink.
 * @return The sink with set parallelism.
 */
public CassandraSink<IN> setParallelism(int parallelism) {
  if (useDataStreamSink) {
    getSinkTransformation().setParallelism(parallelism);
  } else {
    getStreamTransformation().setParallelism(parallelism);
  }
  return this;
}

/**
 * Turns off chaining for this operator so thread co-location will not be
 * used as an optimization.
 *
 * <p>Chaining can be turned off for the whole job by
 * {@link org.apache.flink.streaming.api.environment.StreamExecutionEnvironment#disableOperatorChaining()};
 * however, this is not advised for performance reasons.
 *
 * @return The sink with chaining disabled
 */
public CassandraSink<IN> disableChaining() {
  if (useDataStreamSink) {
    getSinkTransformation().setChainingStrategy(ChainingStrategy.NEVER);
  } else {
    getStreamTransformation().setChainingStrategy(ChainingStrategy.NEVER);
  }
  return this;
}

Javadoc

A StreamTransformation represents the operation that creates an org.apache.flink.streaming.api.datastream.DataStream. Every org.apache.flink.streaming.api.datastream.DataStream has an underlying StreamTransformation that is the origin of said DataStream.

API operations such as org.apache.flink.streaming.api.datastream.DataStream#map create a tree of StreamTransformations underneath. When the stream program is to be executed, this graph is translated to a StreamGraph using org.apache.flink.streaming.api.graph.StreamGraphGenerator.

A StreamTransformation does not necessarily correspond to a physical operation at runtime. Some operations are only logical concepts. Examples are union, split/select on a data stream, and partitioning.

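The following graph of StreamTransformations:

   Source              Source
      +                   +
      |                   |
      v                   v
  Rebalance          HashPartition
      +                   +
      |                   |
      |                   |
      +------>Union<------+
                +
                |
                v
              Split
                +
                |
                v
              Select
                +
                v
               Map

Would result in this graph of operations at runtime:

  Source              Source
    +                   +
    |                   |
    |                   |
    +------->Map<-------+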

The information about partitioning, union, and split/select ends up being encoded in the edges that connect the sources to the map operation.
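
How logical transformations disappear into edges can be sketched with a toy model. Everything below is hypothetical and is not Flink's `StreamGraphGenerator`: nodes are flagged as physical or virtual, and each virtual node met on the way from a source to the map is recorded as a label on the resulting edge. The node names (`Rebalance`, `HashPartition`, `Union`, `Split`, `Select`) are assumptions chosen to match the operations the Javadoc lists.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical model, not Flink code: virtual nodes are folded into edge labels. */
final class LogicalToPhysicalSketch {

    /** A node of the transformation tree. Only physical nodes become runtime operators. */
    static final class Node {
        final String name;
        final boolean physical;
        final List<Node> inputs;

        Node(String name, boolean physical, Node... inputs) {
            this.name = name;
            this.physical = physical;
            this.inputs = Arrays.asList(inputs);
        }
    }

    /** Returns one edge description per physical upstream node feeding {@code target}. */
    static List<String> edgesInto(Node target) {
        List<String> edges = new ArrayList<>();
        for (Node input : target.inputs) {
            walk(input, new ArrayList<String>(), target.name, edges);
        }
        return edges;
    }

    private static void walk(Node node, List<String> labels, String targetName, List<String> edges) {
        if (node.physical) {
            edges.add(node.name + " -" + labels + "-> " + targetName);
            return;
        }
        List<String> extended = new ArrayList<>(labels);
        extended.add(0, node.name); // prepend so labels read from upstream to downstream
        for (Node input : node.inputs) {
            walk(input, extended, targetName, edges);
        }
    }

    /** Builds the example tree (two sources merged into one map) and returns the map node. */
    static Node exampleMap() {
        Node source1 = new Node("Source1", true);
        Node source2 = new Node("Source2", true);
        Node rebalance = new Node("Rebalance", false, source1);
        Node hash = new Node("HashPartition", false, source2);
        Node union = new Node("Union", false, rebalance, hash);
        Node split = new Node("Split", false, union);
        Node select = new Node("Select", false, split);
        return new Node("Map", true, select);
    }

    public static void main(String[] args) {
        for (String edge : edgesInto(exampleMap())) {
            System.out.println(edge);
        }
    }
}
```

In this model the runtime graph has only three operators (two sources and one map); the five virtual nodes survive solely as labels on the two edges.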

Most used methods

getMaxParallelism
Gets the maximum parallelism for this stream transformation.
getParallelism
Returns the parallelism of this StreamTransformation.
setOutputType
Tries to fill in the type information. Type information can be filled in later when the program uses
getId
Returns the unique ID of this StreamTransformation.
getMinResources
Gets the minimum resources of this stream transformation.
setChainingStrategy
Sets the chaining strategy of this StreamTransformation.
setName
Changes the name of this StreamTransformation.
setParallelism
Sets the parallelism of this StreamTransformation.
setSlotSharingGroup
Sets the slot sharing group of this transformation. Parallel instances of operations that are in the
setUid
Sets an ID for this StreamTransformation. This will later be hashed to a uidHash which is then us
setUidHash
Sets a user-provided hash for this operator. This will be used AS IS to create the JobVertexID. The
getBufferTimeout
Returns the buffer timeout of this StreamTransformation.

How to use StreamTransformation in org.apache.flink.streaming.api.transformations

Best Java code snippets using org.apache.flink.streaming.api.transformations.StreamTransformation (Showing top 20 results out of 315)