recent

Titulo

Sql*Loader Optimization

SQL*Loader is an Oracle utility to load data into a database. This utility is used for small or bulk data load which is much faster compared to a traditional SQL INSERT statement. INSERT may not work in all instances of data loading for instance when you are trying to load data from a file. External table or SQL*Loader is the only available means to load into database. External file has few limitations which requires some additional work therefore SQL*Loader is your go-to tool for data loading. Remember, this is just a loading utility not an unloading tool unlike in other DBMS systems.

SQL*Loader loads data from external sources into database that has a powerful data parsing engine. This tool is more efficient where you have more control and customize according to your need like DIRECT INSERT, NOLOGGING, PARALLEL and COMMIT INTERVAL etc. This is something very hard to implement with the SQL INSERT statement.

Next, we will have some practical demos on loading data from an external text file to an employee table using SQL*Loader utility. For this demo, we will need a table and a data file. We will first create employee table and then mock some data for employee table on employee_data.txt.
Employee DDL:
CREATE TABLE employees (
    employee_id    NUMBER(6),
    first_name     VARCHAR2(20),
    last_name      VARCHAR2(25) CONSTRAINT emp_last_name_nn NOT NULL,
    email          VARCHAR2(25) CONSTRAINT emp_email_nn NOT NULL,
    phone_number   VARCHAR2(20),
    hire_date      DATE CONSTRAINT emp_hire_date_nn NOT NULL,
    job_id         VARCHAR2(10) CONSTRAINT emp_job_nn NOT NULL,
    salary         NUMBER(8,2),
    commission_pct NUMBER(2,2),
    manager_id     NUMBER(6),
    department_id  NUMBER(4),
    CONSTRAINT emp_salary_min CHECK (salary > 0),
    CONSTRAINT emp_email_uk UNIQUE (email));
Employee_Load.txt: 
109,"Daniel","Faviet","DFAVIET","515.124.4169",16-AUG-94,"ACCOUNT",9000,,108,100
110,"John","Chen","JCHEN","515.124.4269",28-SEP-97,"ACCOUNT",8200,,108,100
111,"Ismael","Sciarra","ISCIARRA","515.124.4369",30-SEP-97,"ACCOUNT",7700,,108,100
112,"Jose Manuel","Urman","JMURMAN","515.124.4469",07-MAR-98"ACCOUNT",7800,,108,100
This employee.txt has header/column information followed by employee data. It has 4 employee records.
Control File: employee_data.clt:
OPTION (SKIP 1, DIRECT=TRUE, ROWS=1) 
LOAD DATA
INFILE '/home/pkbaniya/employee_data.txt'
REPLACE
INTO TABLE employee
FIELDS TERMINITED BY ","(
     EMPLOYEE_ID,
     FIRST_NAME,
     LAST_NAME,
     EMAIL,
     PHONE_NUMBER,
     HIRE_DATE,
     JOB_ID,
     SALARY,
     COMMISSION_PCT,
     MANAGER_ID,
     DEPARTMENT_ID)
The above code is stored inside an employee_data.ctl file. The code utilizes employee_data.txt as data file. When the loading goes through, it creates employee_data.log and employee_data.bad files. The log files all the loading details and .bad stores invalid data for review.

Below, we will go over some of the options used inside control file. INFILE: Specify the location of data file. LOG: Contains a detailed summary of load, including a description of any errors that occurred during the load. BADFILE: Stores records that cause errors during insert which are mostly improperly formatted.
OPTIONS:
SKIP 1:                  Telling Sql*Loader to ignore 1st line from data file.
DIRECT=TRUE              Load data faster using direct load.
PARALLEL=TRUE            Parallel processing if system is enabled
APPEND | REPLACE         Either replace or append the records
TERMINATED BY:           WHITESPACE  (for new line as separator);
                         Field separator     used in   data file. “,” COMMA
Executing SQL*Loader:
sqlldr username/password@dbarepublic control=/home/dbarepublic/employee_load.ctl

We just loaded data from a file to a table. So far, we are only concern on how to load data from a file? Once you know how it works, the next thing we are now concerns about how to make this load faster and efficient? A good DBA always look for ways to make the process faster, secure and efficient way of doing things.

Here are few tips that I recommend to make Sql*Loader goes faster. Always, test them before implementation in production. These below tips are option parameters from Loader itself and knowing them how to choose these parameters is very important for performance. Let's explore them.
  1. DIRECT=TRUE: The direct path loader (direct=true) loads directly into the Oracle data files and creates blocks in Oracle database block format. Much faster compared to traditional Insert.
  2. PARALLEL =TRUE: Multiple load process happening at the same time.
  3. UNRECOVERABLE: disables the writing of redo logs.
  4. ROWS=n:
  5. Larger Bind Array: This applies to conventional data load only, larger bind array limit the number of calls to the database and increase the performance.
Disabling table Constraints, Triggers, and Indexes help load data fast for any kind of INSERT or SQL Loader utility. Now, you can load data from a file and share your experience on Sql*Loader utility.

Interested in working with me? I can be reached at pbaniya04[at]gmail.com for any questions, consulting opportunities or you may drop a line to say HELLO. Thank your again for visiting my blog and looking forward to serving you more.

Have a Database-ious Day!

1 comment

  1. I have looked couple of days and found some hypnotizing framed work on it. Regardless, it is the best of all. SEO in New Jersey

    ReplyDelete

Powered by Blogger.