Thursday, 20 October 2011

Explain Architecture of Teradata

Teradata's biggest strength is its parallelism, and its architecture is designed around that strength. This architecture is what sets Teradata apart from other databases.

The main components of the Teradata architecture are the PE (Parsing Engine), the AMP (Access Module Processor) and the BYNET. We look at each of these components in detail after looking at the logical view of the architecture.


The logical view of Teradata architecture is given below -

Parsing Engine (PE) – Whenever a user logs in to Teradata, the session actually connects to a Parsing Engine (PE). When the user submits a query, the PE takes action: it creates a plan and instructs the AMPs what to do in order to get the result of the query. The PE knows how many AMPs are connected to the Teradata system and how many rows are in the table, and from that it works out the best available plan to execute the query.

This is why the PE is also called the ‘OPTIMIZER’ (strictly speaking, the optimizer is the part of the PE that builds the plan).

Besides building the plan for query execution, the PE also checks the access rights of the user, that is, whether the user has the privilege to execute the query or not.
In this way the PE also enforces security for the users.

Access Module Processor (AMP) – Each AMP attached to the Teradata system listens to the PE via the BYNET for instructions. Each AMP is connected to its own disk and can read and write data on that disk. An AMP is best thought of as a computer processor with its own disk attached to it. Whenever it receives instructions from the PE, it fetches the data from its disk and sends it back to the PE through the BYNET.

Each AMP is allowed to read and write on its own disk ONLY. This is known as the ‘SHARED NOTHING ARCHITECTURE’.

Teradata spreads the rows of a table as evenly as it can across all the AMPs, so when the PE asks for data, all the AMPs work simultaneously, each reading records from its own disk. This is known as parallelism. Because the result is complete only when every AMP has finished its share, a query is only as fast as the slowest AMP in the system.
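
The idea of spreading rows over AMPs and scanning them in parallel can be illustrated with a small Python sketch. This is only a toy model and not Teradata code: the number of AMPs, the CRC32 hash used to pick an AMP, and all the function names (`amp_for`, `distribute`, `parallel_select`, `elapsed`) are assumptions made up for the example.

```python
import zlib
from concurrent.futures import ThreadPoolExecutor

NUM_AMPS = 4  # hypothetical system size


def amp_for(key: str, num_amps: int = NUM_AMPS) -> int:
    """Pick the AMP that owns a row by hashing its key (toy stand-in, not Teradata's hash)."""
    return zlib.crc32(key.encode("utf-8")) % num_amps


def distribute(rows, num_amps: int = NUM_AMPS):
    """Place each (key, value) row on exactly one AMP's private 'disk'."""
    disks = [[] for _ in range(num_amps)]
    for key, value in rows:
        disks[amp_for(key, num_amps)].append((key, value))
    return disks


def scan(disk, predicate):
    """One AMP reads only its own disk and returns the matching rows."""
    return [row for row in disk if predicate(row)]


def parallel_select(disks, predicate):
    """All AMPs scan at the same time; the partial results are merged afterwards."""
    with ThreadPoolExecutor(max_workers=len(disks)) as pool:
        partials = list(pool.map(lambda disk: scan(disk, predicate), disks))
    return [row for part in partials for row in part]


def elapsed(amp_times):
    """The query finishes only when the slowest AMP finishes."""
    return max(amp_times)


rows = [(f"cust{i}", i) for i in range(1000)]
disks = distribute(rows)
result = parallel_select(disks, lambda row: row[1] % 100 == 0)
```

In this sketch no AMP ever looks at another AMP's list, which is the shared-nothing idea, and `elapsed([1.0, 1.2, 3.5, 0.9])` returns `3.5`, the time of the slowest AMP.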

BYNET – The BYNET is the communication channel between the PE and the AMPs. It makes sure that the messages between the PE and the AMPs are delivered correctly.
In a Teradata system there are always two BYNETs.
They are called ‘BYNET 0’ and ‘BYNET 1’, but we refer to them as a single BYNET system. There are two reasons why two BYNETs exist on a Teradata system –

1)      If one BYNET fails, the second BYNET takes over its place.
2)      Two BYNETs improve the performance of the system: the PE and the AMPs can talk to each other over both BYNETs, which speeds up the communication.
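
The two reasons above (failover and sharing the traffic) can be sketched in Python as follows. This is a toy model only: the `Bynet` and `DualBynet` classes and the simple round-robin choice of link are assumptions made for illustration and do not describe how the real BYNET schedules messages.

```python
from itertools import count


class Bynet:
    """Toy model of one BYNET link (hypothetical class, not a Teradata API)."""

    def __init__(self, name: str):
        self.name = name
        self.up = True
        self.delivered = []

    def send(self, message: str) -> None:
        if not self.up:
            raise ConnectionError(f"{self.name} is down")
        self.delivered.append(message)


class DualBynet:
    """Presents two links as one channel: share the load, fail over if a link is down."""

    def __init__(self):
        self.links = [Bynet("BYNET 0"), Bynet("BYNET 1")]
        self._next = count()

    def send(self, message: str) -> str:
        start = next(self._next) % len(self.links)
        # Try the round-robin choice first, then fall back to the other link.
        for offset in range(len(self.links)):
            link = self.links[(start + offset) % len(self.links)]
            if link.up:
                link.send(message)
                return link.name  # report which link carried the message
        raise ConnectionError("both BYNETs are down")


net = DualBynet()
used_while_healthy = [net.send(f"step {i}") for i in range(4)]
net.links[0].up = False  # simulate a failure of BYNET 0
used_after_failure = [net.send(f"step {i}") for i in range(4, 6)]
```

While both links are up, the messages alternate between ‘BYNET 0’ and ‘BYNET 1’; once ‘BYNET 0’ is marked as down, every message goes over ‘BYNET 1’, and the caller still sees a single channel.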

Short summary –
    * The PE checks the syntax of the query and the access rights of the user.
    * The PE then comes up with the best optimized plan for the execution of the query.
    * The PE passes this plan through the BYNET to the AMPs.
    * The AMPs follow the plan and retrieve the data from their own disks.
    * The AMPs then pass the data back to the PE through the BYNET.
    * The PE then passes the data to the user.
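
The steps in this summary can be put together in one small Python sketch. It is a toy model under stated assumptions, not how Teradata is implemented: the function names `parsing_engine` and `amp_execute` are made up, the only query form accepted is `SELECT * FROM <table>`, the "plan" is just a full scan on every AMP, and the BYNET hop is reduced to a plain function call.

```python
def parsing_engine(user, query, rights, amps):
    """Toy PE: check syntax, check access rights, build a plan, dispatch, collect."""
    words = query.split()
    # 1. Syntax check (only 'SELECT * FROM <table>' is accepted in this toy).
    if len(words) != 4 or [w.upper() for w in words[:3]] != ["SELECT", "*", "FROM"]:
        raise SyntaxError(f"cannot parse: {query!r}")
    table = words[3]
    # 2. Access-rights check.
    if table not in rights.get(user, set()):
        raise PermissionError(f"{user} may not read {table}")
    # 3. The 'plan' here is simply: every AMP scans its own slice of the table.
    plan = {"op": "scan", "table": table}
    # 4. Send the plan to every AMP (the BYNET is just a function call here).
    partials = [amp_execute(amp_disk, plan) for amp_disk in amps]
    # 5. Merge the AMP results and hand them back to the user.
    return [row for part in partials for row in part]


def amp_execute(amp_disk, plan):
    """Toy AMP: read only its own disk, as instructed by the plan."""
    return list(amp_disk.get(plan["table"], []))


# Three hypothetical AMPs, each holding its own slice of an 'orders' table.
amps = [{"orders": [1, 2]}, {"orders": [3]}, {"orders": []}]
rights = {"alice": {"orders"}}
answer = parsing_engine("alice", "SELECT * FROM orders", rights, amps)
```

Here `answer` is `[1, 2, 3]`, while the same query from a user without rights on `orders` raises `PermissionError` before any AMP is contacted.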
