recent

Titulo

Bulk Collect and FORALL

In this article, we will conduct two or more experiments to check the speed of data Insert. Both the experiments will read data from employee_temp table and Inserts into employee_perm table. We will compare the load times on these experiments and explain why one is faster? Yes, you might have guessed it right, the speed of Insert is faster with BULK Collect but we don't know how much faster. Is it fast enough to go through the long code to implement Bulk Collect. Let's be patient and find out the exact script elapse times on these scripts.
The first PL/SQL script uses the traditional INSERT using cursor and the next script will apply BULK COLLECT along with FORALL statement. Before, we  begin to  write code and run these scripts, we will need to do a prep work for this exercise. We will create two tables modeled after the world famous employee table.  Employee Table has 14 records and these many records won't do justice to our exercise therefore, we will load records from employee table to employee_temp 200K times (yes, 200,000 times) which makes a total of 2.8 Mill records for us to use.

Table DDL:
--ddl employee_temp 
CREATE TABLE employee_temp AS 
  SELECT * 
  FROM   employee
  WHERE  1 = 2; 
--ddl employee_perm 
CREATE TABLE employee_perm AS 
  SELECT * 
  FROM   employee
  WHERE  1 = 2;
Data Load Script: employee_temp
DECLARE
  commit_count NUMBER := 0;
BEGIN
  FOR x IN 1 .. 200000
  LOOP
    INSERT INTO employee_temp
    SELECT * FROM employee;
    commit_count := commit_count + 1;
    IF MOD(commit_count, 10000) = 0 THEN
      COMMIT;
    END IF;
  END LOOP;
  COMMIT;
END;
/
The above PL/SQL script copies data from employee to employee_temp around 200,000 times. The employee table has 14 records, therefore it loaded 2.8 million records. We now have source and destination tables created. The source (employee_temp) has 2.8 millions records which we will move to employee_perm table using both BULK Collect and traditional INSERT. Let the coding begins.

TRADITIONAL INSERT:
DECLARE
CURSOR employee_temp_cursor
IS
  SELECT * FROM employee_temp;
commit_count NUMBER := 0;
start_time   NUMBER;
end_time     NUMBER;
BEGIN
  --truncate destination table for fresh insert
  EXECUTE IMMEDIATE 'truncate table employee_perm';
  start_time := DBMS_UTILITY.get_time;
  FOR cur_record IN employee_temp_cursor
  LOOP
    INSERT
    INTO employee_perm VALUES
      (
        cur_record.empno,
        cur_record.ename,
        cur_record.job,
        cur_record.mgr,
        cur_record.hiredate,
        cur_record.sal,
        cur_record.comm,
        cur_record.dept
      );
    commit_count := commit_count + 1;
    IF MOD(commit_count, 1000) = 0 THEN
      COMMIT;
    END IF;
  END LOOP;
  COMMIT;
  end_time := DBMS_UTILITY.get_time;
  DBMS_OUTPUT.PUT_LINE('REGULAR INSERT: '||TO_CHAR(end_time-start_time));
END;
/
PL/SQL procedure successfully completed.
Elapsed: 00:02:42.758
REGULAR INSERT: 16267 

BULK COLLECT FORALL:
set timing on
DECLARE
 c_limit PLS_INTEGER :=1000;
  CURSOR employee_temp_curr
  IS
    SELECT * FROM employee_temp;
TYPE employee_temp_att
IS
  TABLE OF employee_temp_curr%ROWTYPE INDEX BY PLS_INTEGER;
  l_employee_temp  employee_temp_att;
  start_time number;
  end_time number;
BEGIN
  --truncate destination table for fresh insert
  EXECUTE IMMEDIATE 'truncate table employee_perm';
  OPEN employee_temp_curr;
  start_time := DBMS_UTILITY.get_time;
  LOOP
    FETCH employee_temp_curr 
    BULK COLLECT INTO l_employee_temp 
    LIMIT c_limit;
    
    FORALL l_row IN 1 .. l_employee_temp.count
    INSERT INTO employee_perm 
      values(
      l_employee_temp(l_row).empno,
      l_employee_temp(l_row).ename,
      l_employee_temp(l_row).job,
      l_employee_temp(l_row).mgr,
      l_employee_temp(l_row).hiredate,
      l_employee_temp(l_row).sal,
      l_employee_temp(l_row).comm,
      l_employee_temp(l_row).dept);
    dbms_output.put_line(l_employee_temp.count);
    EXIT WHEN employee_temp_curr%NOTFOUND;
    COMMIT;
  END LOOP;
  COMMIT;
  end_time := DBMS_UTILITY.get_time;
  DBMS_OUTPUT.PUT_LINE('BULK COLLECT INSERT: '||to_char(end_time-start_time));
  CLOSE employee_temp_curr;
END;
/
PL/SQL procedure successfully completed.
Elapsed: 00:00:12.767
BULK COLLECT INSERT: 1247 

Based on the stats collected from these scripts, BULK Collect or the second script should be your choice for large INSERT/DELETE/UPDATE. Now, you may be wondering why BULK Collect is fassssster? To find out why, you will need to know the concept of Context Switching. Oracle Database has SQL engine to process your SQL statements and PL/SQL engine to process PL/SQL code. A PL/SQL is a cocktail of SQL and PL/SQL code. To process PL/SQL code, the  database engine switches between these two engines. To get the best performance, we need to reduce the number of context switching to a minimum, how do you do that?. In the first script, there were 280 K context switches whereas on the second one, there were only 2800 context switches therefore, the BULK Collect is faster. You can view the context switch counts if you enable some db logging which is outside the scope of this article.

We learned, reducing the number of context switches makes the DML operations faster. In our case, we are Bulk Loading 1000 records and processing them. Increasing the size of cursor limit helps us reduce the number of switches. To gain performance, there must be overload somewhere in the database, don't your think so? We need to be careful while selecting the cursor limit not to take the database down. This is something that should be discussed with a Database Administrator. The default is 100 rows if the limit is not defined.

Other BULK Collect Features:
  • EXIT WHEN bulk_collect_curr%NOTFOUND: Loop ends when no record are found inside the cursor. Alternately, you can also write EXIT WHEN l_employee_temp.COUNT =0;
  • INDEX By PLS_INTEGER: Data type for storing signed integer which use machine arithmetic and are generally faster than NUMBER and INTEGER operations.
You must be thinking why are we not using  INSERT INTO SELECT FROM what your learnt in first day of Database class? This is my personal favorite and highly recommend whenever it fits your criteria but life is not always that simple. There may be instance where you may need to perform some business checks before performing DML operations or may need to COMMIT the changes frequently when dealing with big tables. The biggest drawback of this is the lack of commit interval settings. 
BEGIN
 INSERT INTO employee_perm
 SELECT * FROM employee_temp;
 COMMIT;
END;
/
2,800,000 rows inserted.
Elapsed: 00:00:05.586

Script #1 is a Honda Civic, which is very reliable, gas efficient and affordable whereas Script #2 is 2017 Ferrari that is expensive but extremely powerful. Choice is yours, pick your ride!

------
Interested in working with me? I can be reached at pbaniya04[at]gmail.com for any questions, consulting opportunities or you may drop a line to say HELLO. Thank your again for visiting my blog and looking forward to serving you more.

Have a Database-ious Day!

No comments

Powered by Blogger.