The answer lies in understanding an ETL process. An ETL process, at its core, reads data, applies a transformation to it, and then loads the data. This can be represented by the following simplistic equation:

Input Data + Transformation = Output Data

This equation gives us the information needed to create physical tests that validate an ETL process. I will use a simple example below to explain the ETL testing mechanism.

A source table contains both individual and corporate customers. The requirement is that an ETL process should take only the corporate customers and populate a target table; the transformation rule therefore specifies that the output should contain corporate customers only. The test cases must validate the ETL process by reconciling the source (input) and target (output) data.

Physical Test 1: Count the corporate customers in the source, then count the customers in the target table. If the ETL transformation is correct, the counts should be an exact match. This is a pretty good test; however, the count can be misleading if the same record is loaded more than once, because a count cannot distinguish between individual customers.

Physical Test 2: Compare each corporate customer in the source to the corresponding customer in the target. This kind of reconciliation can be done at the row or attribute level. It will not only validate the counts but also prove that each customer is exactly the same on both sides.

There are various permutations and combinations of these types of rules, with increasing complexity. This example also shows that the concepts of ETL testing are totally different from those of GUI-based software testing.

Source-data validation matters as well. Even though the downstream ETL processes are not responsible for the incoming upstream data, it is still important to validate the source data: the data is stored in a file, so it is possible that the source system did not send it in the correct format. For example, make sure the "Name" attribute is NOT NULL, since a record with a missing customer name might not be useful, and check that the Date of Birth is in YYYYMMDD format. This rule tests whether the data in the source is valid, and this validation of the customer file is needed to ensure that ETL1 gets correct data.
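The two physical tests and the source-data checks above can be sketched in plain Python against in-memory records. This is a minimal illustration, not an ETL tool: the sample rows, the field names ("name", "dob", "type"), and the filter value "CORPORATE" are all assumptions made for the example.

```python
import re

# Hypothetical source file contents: a mix of individual and corporate customers.
source = [
    {"name": "Acme Ltd", "dob": "19870312", "type": "CORPORATE"},
    {"name": "Jane Doe", "dob": "19910405", "type": "INDIVIDUAL"},
    {"name": "Globex",   "dob": "19751120", "type": "CORPORATE"},
]

# Source-data validation: "Name" must be NOT NULL (non-empty) and
# Date of Birth must be in YYYYMMDD format (eight digits).
def is_valid(row):
    return bool(row["name"]) and re.fullmatch(r"\d{8}", row["dob"]) is not None

assert all(is_valid(row) for row in source), "source file failed validation"

# The transformation under test: load corporate customers only.
target = [row for row in source if row["type"] == "CORPORATE"]

# Physical Test 1: corporate count in source must equal the target count.
corporate_count = sum(1 for row in source if row["type"] == "CORPORATE")
assert corporate_count == len(target), "count reconciliation failed"

# Physical Test 2: row-level reconciliation. Every corporate source row must
# appear in the target exactly as-is; unlike a bare count, this catches
# duplicate loads and attribute-level corruption.
corporate_rows = [row for row in source if row["type"] == "CORPORATE"]
assert corporate_rows == target, "row-level reconciliation failed"
```

In a real project the same reconciliation would typically run as SQL against the source and target tables, but the logic is identical: validate the input, compare counts, then compare rows.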