Thursday, 29 September 2011

Explain Primary Index

It is used for data distribution among the AMPs. There is always one index in a table and it is must for every table (even if you don't define it explicitly).

A primary index should be defined to provide a nearly uniform distribution of rows among the  AMPs, the more unique the index, the more even the distribution of rows and the better  space utilization.
The index should be defined on as few columns as possible. 
Primary index can be either Unique or non-unique. A unique index must have a unique value in the corresponding fields of every row.

A non-unique index permits the insertion of duplicate field values. The unique primary index is more efficient. Once created, the primary index cannot be dropped or modified, the  index must be changed by recreating the table. 

Once created, the primary index cannot be dropped or modified, the index must be changed by recreating the table. 

Creating index with create table command
create table organic (serial_No integer,
                                   organic_name char(15),
                                   Carbon_number smallint,
                                    amount smallint)  
unique primary index (serial_No);

create table inorganic (serial_No integer,
                                      inorgnic_name char(15),
                                      anion char(5),
                                      cation char(6),
                                      amount smallint)  
unique primary index (serial_No); 

non unique primary index:
create table student (student_ID char(15),
                                  student_name char(20),
                                  Department char(10))  
primary index (student_ID);

Thursday, 15 September 2011

Teradata Utilities

Teradata loading utilities
The article contains comparison and main features of the data loading tools provided by Teradata. The tutorial illustrates main features of Teradata Multiload , FastLoad and TPump (Parallel Data Pump) and provides sample real-life uses of those tools.

Fast Load

  • Main use=> To load empty tables at high speed.
  • The target tables must be empty in order to use FastLoad.
  • It supports inserts only(not possible to perform updates or deletes in FastLoad).
  • Although Fastload uses multiple sessions to load the data, only one target table 
  •    can be processed at a time.
  • Teradata Fastload does not support join indexes, foreign key references in target 
  •     tables and tables with secondary index defined.
  • It is necessary to drop any of the constraints listed before loading and recreate
  •     them afterwards.
  • The maximum number of concurrent Teradata Fastload tasks can be adjusted 
  •     by a system administrator.
  • Fastload runs in two operating modes: Interactive and Batch.
  • Duplicate rows will not be loaded. 

  • Multi Load

  • Main use: Load, update and delete large tables in Teradata in a bulk mode
  • Efficient in loading very large tables.
  • Multiple tables can be loaded at a time.
  • Updates data in a database in a block mode (one physical write can update 
  •    multiple rows.
  • Uses table-level locks.
  • Resource consumption: loading at the highest possible throughput.
  • Duplicate rows allowed .

  • TPump
  • Main use: to load or update a small amount of target table rows
  • Sends data to a database as a statement which is much slower than using bulk mode
  • TPump uses row-level hash locks
  • Resource consumption: loading speed can be adjusted using a built-in resource 
  •    consumption management utility. The throughput can be turned down in peak periods.
  • TPump does not support MULTI-SET tables. 

  • FastLoad(script)

    ===========================


    The following script attached below will load a sample fixed-length columns extract into a Teradata database using FastLoad.
    
    
    SESSIONS 4;
    ERRLIMIT 25;
    logon tdpid/username,password;
    
    create table gg_cli (
    wh_cust_no integer not null,
    cust_name varchar(200),
    bal_amt decimal(15,3) format �ZZZ,ZZ9.999�
    )
    unique primary index( wh_cust_no ) ;
    
    SET RECORD UNFORMATTED;
    
    define
                    wh_cust_no(char(10)), delim1(char(1)),
                    cust_name(char(200)), delim2(char(1)),
                    bal_amt(char(18)), delim3(char(1))
                    newlinechar(char(1))
    file=insert.input;
    
    SHOW;
    
    BEGIN LOADING gg_cli errorfiles error_1, error_2;
    
    insert into gg_cli (
                    :wh_cust_no,
                    :cust_name,
                    :bal_amt
    );
    
    END LOADING;
    
    logoff;
    

    MultiLoad(script)

    ===========================

    The following script attached below will load a sample fixed-length columns extract into a Teradata database using MultiLoad. Use the following command to run load the ggclients.mload file using Teradata FastLoad script:

    .logtable inslogtable;
    .logon tdpid/username,password;
    
    create table gg_cli (
    wh_cust_no integer not null,
    cust_name varchar(200),
    bal_amt decimal(15,3) format �ZZZ,ZZ9.999�
    )
    unique primary index( wh_cust_no ) ;
    
    .BEGIN IMPORT MLOAD tables gg_cli;
    
    .layout ggclilayout;
                    .field wh_cust_no 1 char(10);
                    .field cust_name 12 char(200);
                    .field bal_amt 213 char(18);
    .dml label insertclidml;
    
    insert into gg_cli.*;
    
    .import infile insert.input
                    format text
                    layout ggclilayout
                    apply insertclidml;
    .END MLOAD;
    
    .logoff;
    
    

    TPump(script)

    ===========================

    The sample script attached below loads a sample fixed-length columns extract into a Teradata database using Parallel Data Pump - Teradata TPump. 

    .logtable tpumplogtable;
    .logon tdpid/username,password;
    
    .BEGIN LOAD SESSION 4;
    
    .layout ggclilayout;
                    .field wh_cust_no 1 char(10);
                    .field cust_name 12 char(200);
                    .field bal_amt 213 char(18);
    .dml label insertclidml;
    
    insert into gg_cli.*;
    
    .IMPORT INFILE insert.input
                    layout ggclilayout
                    apply insertclidml;
    .END LOAD;
    
    .logoff;




    Monday, 5 September 2011

    Stats in Teradata

    Collect stats is a mechanism which helps optimizer to get the least cost execution plan for each requested query.

    Collect statistics on table
    collect stats on TABLE_NAME;

    Collect statistics on column of table
    collect stats on TABLE_NAME column(COL1);
    collect stats on TABLE_NAME column(COL2);

    To get detail about statistics of a table
    Help stat DBName.TableName;


    Friday, 2 September 2011

    Macro in Teradata


    What is macro. What is the difference between macro and procedure.
    A macro is the set of sql statements which can be run as a single statement.
              Macro                                                                    Procedure
    Macro returns set of data/rows to the user.     It does not return data/rows to the user.
    It is allows only input values.                            It provides Input/Output parameter.
    It is stored in DBC Perm space.                    It is stored in DATABASE/USER Perm space.

    To create a macro:
    CREATE MACRO macro_name AS
    (SELECT eid, ename  FROM emp WHERE deptno=20;);
     To execute a macro:
     EXEC macro_name; 

    To create a parametrized macro:
    CREATE MACRO macro_name( in_param DATATYPE)
    (SELECT eid, ename  FROM emp WHERE deptno= :in_param;);
    To execute a macro:
    EXEC <macro_name> [parameter value list];