Thursday, March 13, 2014

Abinitio Beginning

  1. What is Ab Initio?
Ans) "Ab initio" is a Latin phrase meaning "from the beginning". Ab Initio is a powerful ETL tool used in data warehousing; ETL stands for extract, transform, and load. Its main objective is to process enterprise data.

The software follows a client-server model.

The client is the GDE, i.e. the Graphical Development Environment. The server is the Co>Operating System. Parallelism and integration are central to data warehousing. Ab Initio code is called a graph, and a graph file has the extension .mp.

Ab Initio groups its built-in components into 13 categories, which are used to carry out these operations. They are as follows:

  • Sort
  • Compress
  • Deprecated
  • Partition
  • Transform
  • Continuous
  • Dataset
  • FTP
  • Miscellaneous
  • Translate
  • Validate
  • Database
  • Departition


Important Questions:


Q1) Which component is used to reduce the size of a file?
Ans) Deflate and Compress are the components that can be used to reduce the size of a file.
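
The idea behind this answer can be sketched outside Ab Initio. The example below is only an illustration in Python, using the standard zlib module (which implements the DEFLATE algorithm); it is not the Ab Initio component itself, and the sample data is made up.

```python
import zlib

# Hypothetical sample data: repetitive text, which compresses well.
original = b"customer_id,amount\n" * 1000

# Compress with DEFLATE at the highest compression level (9).
compressed = zlib.compress(original, 9)

# Decompress again to confirm that no data was lost.
restored = zlib.decompress(compressed)

print(len(original), "bytes before,", len(compressed), "bytes after")
```

The compressed data is smaller than the input, and decompressing it gives back exactly the original bytes.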

Q2) Can a graph be run infinitely? If yes, how?
Ans) Yes. A graph can run infinitely by calling its own .ksh script again at the end of the script, so that each run starts the next one.

Q3) What does lock mean in Ab Initio?
Ans) A developer must lock a graph (or other object) before editing it. If another developer then tries to change the same object, he will get a warning that it has already been locked by another user. Locking is basically a protection mechanism.

Q4) What is EME?
Ans) EME stands for Enterprise Meta>Environment. It is basically a repository that stores all the objects. It also acts as a version controller: it keeps track of changes to graphs and other objects.

Q5) What role does xfr play in Ab Initio?
Ans) An xfr file is used to store the mapping. It is useful because rewriting the code takes time, and reusing a stored xfr saves that effort.

Q6) What is the difference between phase and checkpoint?
Ans) At a phase boundary, the intermediate (temporary) files are deleted before the new phase begins. A checkpoint is different: the temporary files are kept until the end of the graph, so a failed graph can be restarted from the last good checkpoint.

Q7) How much memory do we need for a graph?
Ans) Rough calculations start at about 8 MB. Beyond that, the requirement depends on the MAX_CORE settings and on the size of the data handled in each phase.

Q8) How can the term standard environment be defined?
Ans) The term standard environment is used when an environment includes more than one project, i.e. both private and public projects.

Q9) What is the difference between DB config and cfg?
Ans) Both are used for database connectivity. The difference is that cfg is used for Informix databases, whereas DB config is used for SQL Server and Oracle.

Q10) What is the difference between the Scan and Rollup components?
Ans) Scan produces a cumulative (running) summary, with one output record for each input record. Rollup produces summarized records, with one output record for each group.
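
The difference can be shown with a small Python sketch. This is not Ab Initio code: the function names `scan` and `rollup` and the sample records are assumptions chosen only to mirror the two components, and the input is assumed to be already sorted by key.

```python
from itertools import accumulate, groupby
from operator import itemgetter

# Hypothetical (key, amount) records, already sorted by key.
records = [("A", 10), ("A", 20), ("B", 5), ("B", 15)]

def rollup(rows):
    """One summarized record per key group: (key, total)."""
    return [
        (key, sum(amount for _, amount in group))
        for key, group in groupby(rows, key=itemgetter(0))
    ]

def scan(rows):
    """One record per input record, carrying the running total within its group."""
    out = []
    for key, group in groupby(rows, key=itemgetter(0)):
        amounts = [amount for _, amount in group]
        out.extend((key, total) for total in accumulate(amounts))
    return out

print(rollup(records))  # [('A', 30), ('B', 20)]
print(scan(records))    # [('A', 10), ('A', 30), ('B', 5), ('B', 20)]
```

Rollup returns two records for the four inputs, while scan returns four, each holding the cumulative total so far for its key.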

Q11) What layouts are supported in Ab Initio?
Ans) Ab Initio supports both serial and parallel layouts. A parallel layout corresponds to the degree of parallelism.

Q12) What is the definition of multistage components?
Ans) Multistage components are transform components that include different packages.

Q13) What can we say about partition by key and partition by round robin?
Ans) Partition by key is also known as hash partitioning and is used when the data has diverse keys. Records with the same key always go to the same partition, which makes it suitable for processing keyed data in parallel. Round robin is the technique that distributes the data uniformly across all partitions.
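
The two strategies can be sketched in Python as follows. This is a simplified illustration, not how Ab Initio implements its partition components: the function names, the use of CRC32 as the hash, and the sample records are all assumptions.

```python
import zlib

def partition_by_key(records, key, n):
    """Hash partitioning: records with the same key value land in the same partition."""
    partitions = [[] for _ in range(n)]
    for record in records:
        # CRC32 is used here only as a simple, deterministic hash (an assumption).
        index = zlib.crc32(str(record[key]).encode()) % n
        partitions[index].append(record)
    return partitions

def partition_round_robin(records, n):
    """Round robin: records are dealt to the partitions in turn, so sizes stay even."""
    partitions = [[] for _ in range(n)]
    for i, record in enumerate(records):
        partitions[i % n].append(record)
    return partitions

# Hypothetical sample records.
records = [
    {"customer": c, "amount": a}
    for c, a in [("A", 1), ("B", 2), ("A", 3), ("C", 4), ("B", 5)]
]

by_key = partition_by_key(records, "customer", 3)
by_turn = partition_round_robin(records, 3)
print([len(p) for p in by_turn])  # [2, 2, 1]
```

With hash partitioning the partition sizes depend on how the keys are distributed, whereas round robin keeps the sizes within one record of each other but scatters records with the same key.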

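Q14) What is a driving port?
Ans) The driving port is used to improve the performance of a graph. In an in-memory join, for example, the driving port is the one fed by the largest input, while the smaller inputs are held in memory.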

Q15) Why would a database contain stored procedures?
Ans) The main reason is to reduce network traffic. Stored procedures are precompiled SQL blocks, so execution time is reduced: because the procedure is stored in the database, the application simply calls it, and it runs faster than SQL that has not already been compiled. This improves application performance. Stored procedures can also be reused by other applications.

Q16) What are the reasons for using parameterized graphs?
Ans) When we want to use the same graph many times for different files, we set parameters in the graph. This lets us keep a single generic graph instead of one graph per file.

Q17) What is the difference between API and utility mode?
Ans) API and utility mode are both connection interfaces used to perform the specific tasks required by the user. The difference between the two is that API mode is slower but provides a higher degree of flexibility. API mode is also considered to offer better diagnostics.

Q18) What methods exist for performance tuning?
Ans) The main approach is to use a join when two tables need to be brought together. Alternatively, we can write the query so that the join is performed at the database level. The advantage is that the database is hit only once, which improves performance.
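
As a rough illustration of joining at the database level, the Python sketch below uses the standard sqlite3 module with an in-memory database. The table names, columns, and rows are hypothetical; the point is only that one query returns the joined result, instead of one lookup per record.

```python
import sqlite3

# In-memory database with two small, hypothetical tables.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, amount REAL);
    INSERT INTO customers VALUES (1, 'Asha'), (2, 'Ravi');
    INSERT INTO orders VALUES (10, 1, 25.0), (11, 1, 40.0), (12, 2, 15.0);
""")

# A single query: the database performs the join and is hit only once.
rows = conn.execute("""
    SELECT c.name, o.amount
    FROM orders AS o
    JOIN customers AS c ON c.id = o.customer_id
    ORDER BY o.id
""").fetchall()
conn.close()

print(rows)  # [('Asha', 25.0), ('Asha', 40.0), ('Ravi', 15.0)]
```

The alternative of reading one table and then querying the other once per record would issue a separate query for every row.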
