1)Explain the architecture of DWH.
Ans)Sourceà(ODS)stagingàIntermediate stagingàWarehouseàdata martàcube or reporting
2)What is ODS ?
Ans)It is an area b/w source and Staging where all records are consolidated and clients use it for reporting purpouse.
Full form is Operational Data Store.
3)How would you compare b/w source and target? Ways?
Ans)
*Reconciliation –source and target count comparison
*Null/Not null validation
*Data completeness by using minus query
*Mapping b/w source and target using mapping document.
*Domain correctness and business code validation.
*Business logic or transformation validation(Sampling techniques to validate data transformed)
*Truncation of records by checking the data type size
*Rejected data analysis
*Referential check when data is inserted in foreign key table.
*check sum or record counts to be compared.
*Range validation using boundary value analysis
*Regression testing(keeping source and target results and comparing with the new runs).
*Negative testing
*Integration testing to check whether the ETL performs well with up and downstream systems.
4)Find out duplicate rows in a table?
Ans) select (all_col_names) from tab_name group by all_col_names having count(*)>1
5)What is Full load, initial load and incremental load?
Ans)Initial load (or)full load is first time load and
incremental is incrementing to the existing ones.
Full load is nothing but SCD 1 and
Incremental load is SCD2 .
Refresh load is erasing the contents of one or more tables and reloading it.
6)What is CDC (CHANGE DATA CAPTURE)?
Ans)It is SCD2 where delta records are captured using SCD2 or SCD3.
7)Look up transformation? How to validate?
Ans)Lookup on the target and compare with the source and either update or insert.
Combine different tables in source side using inner joins and using minus query compare with the target.
Mainly validate the data type between source and target using the lookup or the document.
8)DEFECT LIFECYCLE
Ans)New open assign test verify deferred reopened duplicates rejected closed
9)
INPUT 3 tables as below
| |||||
Product_Table
| |||||
Prod_Id
|
Prod_Name
| ||||
1
|
A
| ||||
2
|
B
| ||||
3
|
C
|
Target_Table
| |||
Prod_Id
|
Prod_Dept
|
Prod_Name
| |||
1
|
10
|
A
| |||
DEPT_Table
|
2
|
20
|
B
| ||
Prod_Id
|
Prod_Dept
|
3
|
30
|
C
| |
1
|
10
|
4
|
40
|
D
| |
2
|
20
| ||||
3
|
30
| ||||
OUTPUT
| |||||
Target_Table
| |||||
Prod_Id
|
Prod_Dept
|
Prod_Name
| |||
4
|
?
|
?
|
1.
Query to display the target record row number 4 which is not present in Source tables using Outer Joins.
SELECT T.* FROM
target_table T LEFT OUTER JOIN /*Target table is left table and the source table is right table*/
(SELECT p. Prod_Id, P. Prod_Name ,D. Prod_Dept /*Joining 2 source tables*/
FROM Product_Table P, DEPT_Table D
Where p. Prod_Id= D. Prod_Id
) Source_Table S
On T. Prod_Id= S. Prod_Id
Where T. Prod_Dept is null;/*Source table do not have record no 4 so it displays nULL*/
10)What is error log table ??
Ans)emp(source) table is to be loaded into EMP1 (target)
In informatica, for this mapping session properties specify the dbname to where it has to send the reject records.
Suppose 4 records are getting rejected then those will go into 4 tables of that particular db.Below are those.
1. PMERR_MSG
2. PMERR_SESS
3. PMERR_TRANS
4. PMERR_DATA
11)SCD2 Validation:
Product st_date end_date
Watch 1/7 5/7
Watch 6/7 13/7
Watch 11/7 15/7
Here the highlighted records are violating scd condition (i.e)end date of 2nd record is after the start date of 3 rd record.How will you validate?
Ans)Using lead function (not sure please check).
Win Exciting and Cool Prizes Everyday @ www.2vin.com, Everyone can win by answering simple questions.Earn points for referring your friends and exchange your points for cool gifts.
ReplyDelete