recent

Titulo

Table Partitioning 101

Oracle Table partition uses an old concept of divide-and-conquer strategy where a large table is divided into smaller tables based on partition key column to make the data storage and retrieve process easier and faster. Working on big tables are hard in many ways which you may have experienced it. Creating Index might take a while and removing unwanted data may take days to months depending on the table size. To prevent these nightmares of large table maintenance, Oracle has came up with a concept of table partitioning. It is the process of dividing big table into smaller tables storing likes data on each partition. This is a very old concept which Oracle applied for partitioning table to speed up the query operations that works like a magic when used properly.

Partitioning of tables and Indexes can benefit the performance and maintenance in several ways. You can do a maintenance on one partition while all other partitions are online and serving your applications. This is not possible on a single table where you may need to take the whole table offline. The performance of a query is improved as it has to read from a partition not from a whole table. A parallel process is possible which greatly improves the speed of operation on a partitioned table.

With Oracle Table partition 101, I am going to cover 4 basic table partition types that are available on Oracle. There are other advanced partition types that are not covered in this article.
  1. RANGE Partitioning
  2. RANGE INTERVAL Partitioning
  3. LIST Partitioning 
  4. HASH Partitioning.
RANGE Partitioning: Range Partitioning is the most common type of table partitioning based on the range of values of the partitioning key that is set for each partition. Partitions are created manually and on demand.

SYNTAX:
CREATE TABLE BANIYA.SALES_RANGE_PART (
 sales_id NUMBER PRIMARY KEY,
 sales_date DATE NOT NULL,
 sales_qty NUMBER,
 sales_amt NUMBER(5,2))

PARTITION BY RANGE(sales_date)
(PARTITION sales_2010 VALUES LESS THAN (to_date('01-JAN-2011','DD-MON-YYYY')),  -- stores data for 2010
 PARTITION sales_2011 VALUES LESS THAN (to_date('01-JAN-2012','DD-MON-YYYY')) ,  -- stores data for 2011
 PARTITION sales_2012 VALUES LESS THAN (to_date('01-JAN-2013','DD-MON-YYYY')) , -- stores data for 2011
 PARTITION sales_2013 VALUES LESS THAN (to_date('01-JAN-2014','DD-MON-YYYY')) , -- stores data for 2011
 PARTITION sales_2014 VALUES LESS THAN (to_date('01-JAN-2015','DD-MON-YYYY')) , -- stores data for 2011
 PARTITION sales_2015 VALUES LESS THAN (to_date('01-JAN-2016','DD-MON-YYYY')) , -- stores data for 2011
 PARTITION sales_2016 VALUES LESS THAN (to_date('01-JAN-2017','DD-MON-YYYY')) , -- stores data for 2011
 PARTITION sales_2017 VALUES LESS THAN (to_date('01-JAN-2018','DD-MON-YYYY')) , -- stores data for 2011
 PARTITION sales_2018 VALUES LESS THAN (to_date('01-JAN-2019','DD-MON-YYYY')) , -- stores data for 2011
 PARTITION sales_2019 VALUES LESS THAN (to_date('01-JAN-2020','DD-MON-YYYY')),  -- stores data for 2011
 PARTITION JUNK VALUES LESS THAN (MAXVALUE));

In our example above, if we insert record with a sales_date that has a value of 02-MAY-2014 lies on sales_2014 partition. Similarly, the sales_date of 06-JAN-2015 falls on sales_2015 partition.

Each partition has a VALUE LESS THAN clause that specifies the upper bound for the partitions. Any key value equal to or higher are added to next higher partition. All except the sales_2010 partition has an implicit lower bound. A MAXVVALUE literal is used to define the highest partition which represents a virtual infinite value that stores data higher than any other possible values for partitioning key, including NULL value. In a simple word, it catches everything that does not fall into existing table partitions.

The partitioning key consists of one or more columns that determine the partition where each rows are stored. Oracle automatically directs Insert, Update and Delete operations to the appropriate partition. You cannot insert data for future date that is not defined on a Range Partitioned table. Wouldn’t it be nice if Oracle can create a partition automatically for dates not specified on existing partition? The partition key column should be a DATE or NUMBER only. It does not accept string or varchar2 datatype for partitioning key column

RANGE INTERVAL Partitioning: The Interval partitioning concept was first introduced in Oracle 11g. The INTERVAL feature answered the limitation of Range partition table. The DBMS creates new partition automatically based on the data inserted to the table. There is no need for DBA or Developer to create new table partition manually to accommodate new data.

SYNTAX:
CREATE TABLE BANIYA.SALES_INTERVAL_RANGE (
 sales_id NUMBER Primary KEY,
 sales_date DATE NOT NULL,
 sales_qty NUMBER,
 sales_amt NUMBER(5,2))
PARTITION BY RANGE(sales_date)
INTERVAL(NUMTOYMINTERVAL(1, 'MONTH'))
(PARTITION sales_JAN_2016 VALUES LESS THAN(to_date('01-FEB-2016','DD-MON-YYYY')),-- stores data for JAN 2016
 PARTITION sales_FEB_2016 VALUES LESS THAN(to_date('01-MAR-2016','DD-MON-YYYY')),-- stores data for FEB 2016
 PARTITION sales_MAR_2016 VALUES LESS THAN(to_date('01-APR-2016','DD-MON-YYYY')),-- stores data for MAR 2016
 PARTITION sales_APR_2016 VALUES LESS THAN(to_date('01-MAY-2016','DD-MON-YYYY')));-- stores data for APR 2016


This is the most popular partition type I have seen and used frequently at my work and in my personal projects. Interval partitioning is an extension of Range partition which instructs database engine to automatically create new partition of a specified interval when data inserts into the table exceed all of the existing range partitions. You must specify at least one range when creating a partitioned table. My personal choice is have at least four partitions to begin with to avoid confusion regarding intervals. You can’t have multiple columns as partition key; you just need one DATE column for your key column which does not get updated at all. The partitioning key column cannot be a VARCHAR2 column or any other type except DATE.

What are your interval options for Range-Interval partition? Your interval options are MONTH and YEARS but did you know we also have DAY, HOUR, MINUTE, SECOND as an interval options as well.

INTERVAL YEAR TO MONTH:
 INTERVAL(NUMTOYMINTERVAL(1, 'MONTH'))  -- Monthly
 INTERVAL(NUMTOYMINTERVAL(3,'MONTH')) -- Quarterly
 INTERVAL(NUMTOYMINTERVAL(6,'MONTH')) -- 1/2 Yearly
 INTERVAL(NUMTOYMINTERVAL(1, 'YEAR'))  -- Yearly

This only takes YEAR or MONTH  as interval. Well, the above function only supports MONTHS or YEAR how do you partition a table daily, weekly, hourly if you have a business need for that.

INTERVAL DAY TO SECOND: 
 INTERVAL(NUMTODSINTERVAL(1, 'DAY'))  -- Daily Partition
 INTERVAL(NUMTODSINTERVAL(7, 'DAY')) -- Weekly Partition
 INTERVAL(NUMTODSINTERVAL(1, 'HOUR')) -- Hourly Partition – Not recommended
 INTERVAL(NUMTODSINTERVAL(1, 'MINUTE'))  -- Minute Partition –- Not recommended
 INTERVAL(NUMTODSINTERVAL(1, 'SECOND'))  -- Second Partition – Not recommended

LIST Partitioning: List partitioning is the best way to organize data that cannot be ordered. This is basically a grouping key value data into each partition. The default partition keeps anything that does not belong to existing partitions. If DEFAULT is not defined, anything that does not fall on the list will throw an error and the insert operation fails. It is a best practice to specify default keyword and check the stuff inside the default periodically.

SYNTAX:
CREATE TABLE BANIYA.SALES_LIST_PARTITION (
 sales_id NUMBER Primary KEY,
 sales_date DATE NOT NULL,
 sales_qty NUMBER,
 sales_amt NUMBER(5,2),
 sales_region  VARCHAR2(20))
PARTITION BY LIST (sales_region)
(PARTITION P1_NORTHEAST VALUES ('NORTHEAST'),
 PARTITION P2_MIDWEST  VALUES  ('MIDWEST'),
 PARTITION P3_SOUTH VALUES ('SOUT'),
 PARTITION P4_WEST VALUES ('WEST'),
 PARTITION P4_NULL VALUES (NULL)
 PARTITION P4_DEFAULT VALUES (DEFAULT)); 

This partition is pretty much self-explanatory where data organizes itself based on the sales region. DEFAULT partition will keep data that does not belong to existing partitions. DEFAULT is an optional, but great to have it to avoid ORA-14400 "error: ORA-14400: inserted partition key does not map to any partition" error. The record with NULL partition key column data gets stored in Default partition as well.

HASH Partitioning: What is a HASH Partition? What value do you get out of this partitioning strategy are some questions I asked myself when I was first introduced to Hash partition concept. Honestly, I never had to use this kind of partitioning at work except when learning and writing about this article. May be I am not yet exposed to those kinds of business or environment.  The spread of data into different disk to balance IO contentions is one benefit I am aware of it.

SYNTAX:
CREATE TABLE BANIYA.SALES_HASH_PARTITION (
 sales_id NUMBER Primary KEY,
 sales_qty NUMBER,
 sales_amt NUMBER(5,2))
PARTITION BY HASH (sales_id)
(PARTITION part1 tablespace sales_hash_partition_1
 PARTITION part2 tablespace sales_hash_partition_2
 PARTITION part3 tablespace sales_hash_partition_3
 PARTITION part4 tablespace sales_hash_partition_4);

Anyway, there are few more benefits which I have collected from the ORACLE expert Tom Kyte:

Instead of having a 100 gig tablespace to backup, you have 100, 1 gig tablespace. Each tablespace spends less time in backup mode, reduces the amount of potential extra redo, reduces the amount of manual recovery you need to do if the instance fails during backup. Same thing applies with restores. You can analyze each partition quickly, instead of running an analyze on a 100 gig table.

You can reorg each partition independent of any other partition which can easily redistribute the load over many disks, you now have evenly distributed the data into 100 1 gig partitions, move them at will. If you did this manually, you would have a heck of a time moving the stuff around and so.

Well, this is end of article Oracle Partition 101 and you now know these basic Oracle partitions concepts. Also note, there are more advanced type of partitions that Oracle supports which I recommend only after mastering these partitions.  

Next, I will show you how to find table partitions type, partitioning key column, high value, selecting data from a partition and any other frequently used DDL or DML statements. These SQL queries apply to RANGE, INTERVAL, HASH and LIST partitioning table.

--Displays the partition key column
SELECT * 
FROM   dba_part_key_columns 
WHERE  owner ='BANIYA' 
AND    name = 'SALES_PART_RANGE'; 

--Display the Partition type
SELECT owner, 
       table_name, 
       partitioning_type, 
       ref_ptn_constraint_name 
FROM   dba_part_tables 
WHERE  owner = 'BANIYA' 
AND    table_name ='SALES_PART_RANGE'; 

--Checking partition name and high value.
SELECT partition_name, 
       high_value 
FROM   dba_tab_partitions 
WHERE  table_owner ='BANIYA' 
AND    table_name = 'SALES_PART_RANGE'; 

--Selecting data from a  partition
SELECT * 
FROM   baniya.sale_range_part PARTITION(sales_2019);

Now, your assignment is to find out the large table in your organization and start partitioning to make your report and application running much faster. 

Interested in working with me? I can be reached at pbaniya04[at]gmail.com for any questions, consulting opportunities or you may drop a line to say HELLO. Thank your again for visiting my blog and looking forward to serving you more.

Have a Database-ious Day!

No comments

Powered by Blogger.