IBM Information Server 8.X (DataStage): Parallel transformer stage

DataStage: What is Transformer Stage?

DataStage provides several stages for extracting, transforming, and loading data into data warehouses or data marts. The stages are grouped into categories such as General, Database, Development and Debugging, File, Processing, and Real Time. Each stage is also classified as either active or passive.

The Transformer stage is a processing stage.

It lets you create transformations to apply to your data, based on the given business rules.

It can have a single input link and any number of output links. It can also have a reject link that takes rows that were not written to any of the output links because of a write error, an expression evaluation error, or null handling.

The Transformer stage editor is divided into two areas:

1. Link area

  • Define column definitions
  • Define stage variables

2. Metadata Area

  • Define column metadata for input and output

Output links:

  1. Pass some data straight through the Transformer stage unaltered.
  2. Modify the derivation by entering a transformation expression.
  3. Specify constraints that operate on entire output links.
  4. You can also specify a constraint otherwise link, which is an output link that carries all rows that were not output on any other link, that is, rows that did not meet the criteria.

A constraint is an expression that specifies the criteria that the data must meet before it can be passed to the output link.
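As an illustration only, the routing behaviour of constraints and an otherwise link can be modelled in plain Python. This is a sketch, not DataStage code: the function `route_rows`, the link names `small` and `large`, and the sample rows are all hypothetical, and the sketch assumes that a row is written to every link whose constraint it satisfies.

```python
from typing import Callable, Dict, List

Row = Dict[str, object]


def route_rows(
    rows: List[Row], constraints: Dict[str, Callable[[Row], bool]]
) -> Dict[str, List[Row]]:
    """Write each row to every link whose constraint it meets.

    Rows that meet no constraint go to the hypothetical "otherwise" link.
    """
    outputs: Dict[str, List[Row]] = {name: [] for name in constraints}
    outputs["otherwise"] = []
    for row in rows:
        matched = False
        for name, constraint in constraints.items():
            if constraint(row):
                outputs[name].append(row)
                matched = True
        if not matched:
            outputs["otherwise"].append(row)
    return outputs


# Hypothetical sample data and constraints.
rows = [
    {"id": 1, "amount": 50},
    {"id": 2, "amount": 500},
    {"id": 3, "amount": -5},
]
constraints = {
    "small": lambda r: 0 <= r["amount"] < 100,
    "large": lambda r: r["amount"] >= 100,
}
result = route_rows(rows, constraints)
```

Here row 1 lands on `small`, row 2 on `large`, and row 3, which meets neither constraint, lands on `otherwise`.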

Reject link:

You can also specify a link that takes rows that were not written to any other link because of a write error or an expression evaluation error. This is set up outside the stage editor by adding a link and converting it to a reject link. Any rows that are rejected because of null handling are also written to the reject link.
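The idea of a reject link can be sketched in Python as well. Again this is only a model under stated assumptions: `transform`, the `ratio` output column, and the sample rows are hypothetical, a Python exception stands in for an expression evaluation error, and `None` stands in for a null value.

```python
from typing import Callable, Dict, List, Tuple

Row = Dict[str, object]


def transform(
    rows: List[Row], derivation: Callable[[Row], float]
) -> Tuple[List[Row], List[Row]]:
    """Apply a derivation to each row; failed rows go to the reject list."""
    output: List[Row] = []
    reject: List[Row] = []
    for row in rows:
        # Null handling: a row containing a null (None) is rejected.
        if any(value is None for value in row.values()):
            reject.append(row)
            continue
        try:
            value = derivation(row)
        except (ArithmeticError, KeyError, TypeError, ValueError):
            # Stand-in for an expression evaluation error.
            reject.append(row)
            continue
        output.append({**row, "ratio": value})
    return output, reject


# Hypothetical sample data: row 2 divides by zero, row 3 contains a null.
rows = [
    {"id": 1, "num": 10, "den": 2},
    {"id": 2, "num": 1, "den": 0},
    {"id": 3, "num": None, "den": 4},
]
output, reject = transform(rows, lambda r: r["num"] / r["den"])
```

Only row 1 reaches the output; rows 2 and 3 end up in the reject list.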

If runtime column propagation is enabled, no metadata is required for the outputs.

The Find and Replace facilities let you find a particular string within an expression, search for column names, or find empty expressions.

Defining output column derivations:

  • Use drag and drop or copy and paste to copy an input column to an output link.
  • Use the automatic column matching facility to set output column derivations from matching input columns.

Automatic column matching

  1. From the drop-down list, choose the output link whose columns you want to match with the input link columns.
  2. In the Match type area, choose one of:
    • Location match: sets the output column derivations to the input link columns in the equivalent positions.
    • Name match: sets the output column derivations to the input link columns with matching names.
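The difference between the two match types can be sketched in Python. The functions `location_match` and `name_match` and the column names are hypothetical; each function returns a mapping from output column to the input column used as its derivation.

```python
from typing import Dict, List


def location_match(input_cols: List[str], output_cols: List[str]) -> Dict[str, str]:
    """Pair each output column with the input column at the same position."""
    return {out: inp for out, inp in zip(output_cols, input_cols)}


def name_match(input_cols: List[str], output_cols: List[str]) -> Dict[str, str]:
    """Pair each output column with the input column of the same name, if any."""
    available = set(input_cols)
    return {out: out for out in output_cols if out in available}


# Hypothetical column lists for one input link and one output link.
input_cols = ["cust_id", "name", "city"]
output_cols = ["id", "name", "country"]

by_location = location_match(input_cols, output_cols)
by_name = name_match(input_cols, output_cols)
```

With these columns, a location match derives all three output columns by position, while a name match derives only `name`, because it is the only column name shared by both links.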

CONSTRAINTS and OTHERWISE/LOG

A constraint is an expression that specifies the criteria that the data must meet before it can be passed to the output link.

  • Click the Otherwise/Log field so that a check mark appears, and leave the Constraint field blank. The link then catches rows that did not meet the constraints on any of the previous output links.
  • Setting Otherwise/Log on a link that has a constraint logs the number of rows written to that link (i.e. rows that satisfy the constraint) to the job log as a warning message.

In addition, you can define local stage variables, use system variables, and set partitioning methods and sort operations.
