About Data Manipulation Tool Stages

Data Manipulation Tool is based on StreamSets Data Collector, an open source, low-latency ingest infrastructure application used to import, transfer, load, and process data for later use or storage in a database. It lets you create workflows that fit the complexity of your data.

Data Manipulation Tool provides you with a user interface where you define a pipeline to describe the flow of data from the origin system to destination systems. Along the pipeline, you add stages, which can be of the following types:

  • Origin stages - also called origins, they are the entry points into the pipeline.
  • Processor stages - also called processors, they allow you to transform the data.
  • Executor stages - also called executors, they allow you to perform actions such as sending an email, executing a query, or running a custom shell command.
  • Destination stages - also called destinations, they are the exit points from the pipeline.
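
The relationship between the four stage types above can be pictured with a small conceptual model. This is only an illustrative sketch in plain Python, not the Data Manipulation Tool or StreamSets Data Collector API: the Pipeline class, the Record type, and the sample stage functions are all hypothetical names, and running every executor once per record is a simplifying assumption that the real tool may not follow.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, Iterable, List

# Hypothetical record type: a flat mapping of field names to values.
Record = Dict[str, object]


@dataclass
class Pipeline:
    """Conceptual model only: one origin, then processors, then outputs."""
    origin: Callable[[], Iterable[Record]]                    # entry point
    processors: List[Callable[[Record], Record]] = field(default_factory=list)
    executors: List[Callable[[Record], None]] = field(default_factory=list)
    destinations: List[Callable[[Record], None]] = field(default_factory=list)

    def run(self) -> int:
        """Push every record from the origin through the stages; return the count."""
        count = 0
        for record in self.origin():
            for process in self.processors:      # transform the data
                record = process(record)
            for write in self.destinations:      # exit points
                write(record)
            for execute in self.executors:       # side-effect actions (assumed per record)
                execute(record)
            count += 1
        return count


def sample_origin() -> Iterable[Record]:
    # Stand-in for an origin system; yields two hard-coded records.
    yield {"name": "alice", "amount": 10}
    yield {"name": "bob", "amount": 25}


def uppercase_name(record: Record) -> Record:
    # Example processor: returns a copy with the name in upper case.
    return {**record, "name": str(record["name"]).upper()}


written: List[Record] = []   # stand-in for a destination system
actions: List[str] = []      # stand-in for an action such as sending an email

pipeline = Pipeline(
    origin=sample_origin,
    processors=[uppercase_name],
    executors=[lambda r: actions.append(f"notified about {r['name']}")],
    destinations=[written.append],
)
processed = pipeline.run()
```

In this sketch each record enters through the single origin, is changed only by processors, and leaves through the destinations, while the executor performs an action without altering the data.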

To find out more about pipeline concepts and design, see the StreamSets Data Collector documentation.

Data Manipulation Tool also provides stages that are tailored for use with the core data structures: cubes and reports. The following sections describe these stages.