Variable Port Initialization - default values are not NULL

Variable Initialization 

The Integration Service does not set the initial value for variables to NULL. 
The Integration Service uses the following guidelines to set initial values for variables:

• Zero for numeric ports
• Empty strings for string ports
• 01/01/0001 for Date/Time ports

This makes variables suitable as counters, which need a non-NULL initial value. For example, you can create a numeric variable port VAR1 with the following expression:
VAR1 + 1

This expression counts the rows that pass through the transformation. If the initial value of the variable were NULL, then NULL + 1 would evaluate to NULL, and the expression would return NULL for every row. Because the initial value is zero, the counter works.
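
A minimal sketch of the counter pattern (port names are illustrative; the initial value of VAR1 defaults to zero):

    VAR1       variable, integer   expression: VAR1 + 1   -- holds 1 after the first row, 2 after the second, ...
    O_ROW_NUM  output, integer     expression: VAR1       -- running row count passed downstream

Because output ports are evaluated after variable ports, O_ROW_NUM always carries the already-incremented value.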


Execution Order of Ports in an Expression Transformation?

Port Order

All ports in an Expression transformation are evaluated from top to bottom, and the evaluation order by port type is as follows (a short sketch follows the list):



  1. Input ports. Informatica calculates all input ports first, as they do not depend on any other ports, so you can create input ports in any order.
  2. Variable ports. Variable ports can reference input ports and variable ports, but not output ports. Because variable ports can reference input ports, Informatica calculates variable ports after input ports. Likewise, since a variable port can reference other variables, the display order for variable ports is the same as the order in which the PowerCenter Server evaluates each variable.
  3. Output ports. As output ports can reference both input ports and variable ports, Informatica evaluates output ports last. The display order for output ports does not matter, as output ports cannot reference other output ports. Output ports should appear at the bottom of the list of ports.
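
A minimal sketch of this evaluation order (port names and the tax calculation are illustrative):

    IN_PRICE   input                                       -- evaluated first
    V_TAX      variable   expression: IN_PRICE * 0.08      -- may reference input ports
    V_TOTAL    variable   expression: IN_PRICE + V_TAX     -- may reference variables above it
    O_TOTAL    output     expression: V_TOTAL              -- evaluated last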



Informatica - How to store a previous record value - Passing Previous Row Value



How to Capture the Previous Value Using a Variable Port




You can store previous record values by creating variable ports in the Expression transformation.
This lets you compare previous record values with current record values.


The important thing to understand is that variable port values in an Expression transformation are retained from one row to the next, and that the expressions are evaluated in order from top to bottom.
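
A minimal sketch of the pattern (port names are illustrative). V_PREV_SALARY must appear above V_CURR_SALARY, so that it reads the value left over from the previous row before that value is overwritten:

    IN_SALARY       input
    V_PREV_SALARY   variable   expression: V_CURR_SALARY   -- still holds the previous row's value
    V_CURR_SALARY   variable   expression: IN_SALARY       -- now overwritten with the current row's value
    O_PREV_SALARY   output     expression: V_PREV_SALARY

On the first row, V_PREV_SALARY carries the default initial value (zero for a numeric port), not NULL.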

----------------------Documentation------------------------------


The Integration Service evaluates ports in the following order:

Input ports. The Integration Service evaluates all input ports first since they do not depend on any other ports. Therefore, you can create input ports in any order. Since they do not reference other ports, the Integration Service does not order input ports.

Variable ports. Variable ports can reference input ports and variable ports, but not output ports. Because variable ports can reference input ports, the Integration Service evaluates variable ports after input ports. Likewise, since variables can reference other variables, the display order for variable ports is the same as the order in which the Integration Service evaluates each variable.

For example, if you calculate the original value of a building and then adjust for depreciation, you might create the original value calculation as a variable port. This variable port needs to appear before the port that adjusts for depreciation.

Output ports. Because output ports can reference input ports and variable ports, the Integration Service evaluates output ports last. The display order for output ports does not matter since output ports cannot reference other output ports. Be sure output ports display at the bottom of the list of ports.


[Video: Current record to previous record comparison - a mapping comparing records using variable ports]


Informatica - Pipeline

Understanding Pipeline


A pipeline consists of a source qualifier and all the transformations and targets that receive data from that source qualifier.
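
As an illustration (transformation names are hypothetical), every row that originates in the source qualifier below belongs to the same pipeline:

    SQ_ORDERS -> EXP_CALC_TOTALS -> AGG_BY_CUSTOMER -> TGT_ORDER_SUMMARY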

What is the use of the Associated Expression / NewLookupRow in a dynamic Lookup



Requirement: match each incoming column value from the source to the corresponding column value in the reference table; if a match is found, fetch the value, and if there is no match, insert the row.


You need the NewLookupRow port to check what happened in the cache.
Place an Update Strategy transformation after the Lookup with the dynamic cache.

If the value in the NewLookupRow port is 1, set the record to DD_INSERT;
if it is 2, set it to DD_UPDATE;
if it is 0, either set it to DD_REJECT or use a Router to route it somewhere else.
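
Put together, the Update Strategy expression is a sketch like this (NewLookupRow is the port supplied by the dynamic Lookup):

    IIF(NewLookupRow = 1, DD_INSERT,
        IIF(NewLookupRow = 2, DD_UPDATE, DD_REJECT))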

The Associated Expression needs to name the port you want to compare each lookup port with; otherwise the Lookup would not know what to insert or update in the cache.

Ports used in the join condition are ignored in the comparison. You can also ignore columns that you know will keep changing, such as a timestamp or date column.

Informatica - Handling duplicate rows while reading from Source

Problem Summary

Duplicate rows in the source that cause primary key violation errors on the target table are a common issue when running PowerCenter sessions.
You can use a Lookup transformation to find duplicate data in a target based on a key field (or a composite key). This works when comparing source rows to rows that already exist in the target table.
But this technique does not work when the source itself contains duplicate rows that do not yet exist in the target.

You may attempt to use an uncached Lookup against the target and set the target commit interval to one, so that the target table is queried for each row. Because of the way the PowerCenter Server processes rows (as described in the Lookup Strategies section below), this approach does not work. Even if it did work, it would be very inefficient and performance would be severely affected.


Lookup Strategies 

If you have duplicate rows in your source data, using a Lookup transformation will not reliably detect them.
A cached Lookup transformation caches the target rows at the beginning of the session, so it can detect duplicate rows that are already in the target, but not new duplicate rows being loaded in the same session.

Turning off caching will not reliably catch duplicates either. It ensures the target is re-queried for each row, but uncommitted rows temporarily held in buffers are not visible to the Lookup.

Setting the commit point to one also does not work reliably. A database commit takes some milliseconds to complete, and the transformation engine does not pause; it keeps processing and transforming rows and buffering them in memory. By the time the commit completes, 100 rows might have been processed. If the 101st row is a duplicate, the Lookup will find it, but not if the duplicate was among the first 100 rows.

A dynamic lookup can be used to overcome this limitation: with a dynamic cache, each source row is inserted into the cache as it is processed, so a later duplicate is found in the cache immediately, without waiting for a commit.
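
A minimal sketch of the dynamic Lookup configuration (property names as they appear in the Designer; details vary by version):

    Dynamic Lookup Cache      = enabled
    Insert Else Update        = enabled (optional, to also update changed rows in the cache)
    NewLookupRow output port  : 0 = no change, 1 = row inserted into cache, 2 = row updated in cache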


The following examples demonstrate some of the most efficient ways to deal with duplicate source data.

Rank() Function in Source Qualifier SQL
Eliminate the duplicates at the time you read them from the source (see the SQL sketch below).
Alternatively, if you want to load the duplicate entries as older versions of the record, pull all the data with its rank and use a Router transformation: route rank = 1 rows through your main logic, and load rank != 1 rows as older versions of the records.
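
A sketch of a Source Qualifier SQL override using an analytic function (table, column, and key names are illustrative; ROW_NUMBER() is used rather than RANK() so that ties still produce exactly one rank-1 row per key):

    SELECT s.*,
           ROW_NUMBER() OVER (PARTITION BY cust_id ORDER BY updated_dt DESC) AS rnk
    FROM   src_customers s

A downstream Router group with rnk = 1 then carries the current version of each record, and a group with rnk != 1 carries the older versions.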

Aggregator Mapping - essentially the same as above, but done within Informatica.

When to use: any time an Aggregator transformation has no major impact on performance, either because the source is relational, or because the source is a file and output row order does not matter.

Pros: Ease of implementation. Works with all types of sources.

Cons: Limited control over which of the duplicate rows is output. The Aggregator cache can slow performance of large loads. Duplicates are silently removed, so there is no audit trail.
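
A minimal sketch of the Aggregator setup (port names and the choice of aggregate functions are illustrative):

    Group-by port:  CUST_ID
    O_CUST_NAME     output   expression: FIRST(CUST_NAME)   -- keep one value per key
    O_UPDATED_DT    output   expression: MAX(UPDATED_DT)    -- e.g., keep the latest timestamp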






Informatica - Cannot start a connector in an input only port

Edit the transformation -> Ports tab -> enable the Output (O) check box on every port that you want to connect to the next transformation.

Limitations of Pushdown Optimization - PDO

Limitations of pushdown optimization: the source, target, and lookups must be on the same database server for full pushdown. Expressions w...