recent

Titulo

Selectivity Vs Cardinality

Cardinality refers to a number of unique values in a row. You can find the cardinally of any column using the SQL keyword DISTINCT.

Example:
Cardinality: select count(distinct (deptno)) from employee;

The above SQL statement displays the number of department in a company or in other words the number of unique value that appears on a deptno column of an employee table.

Selectivity is the ratio of cardinality to the number of records of an Indexed column.
Selectivity = (Distinct Values/Total number of records)

If the ratio or selectivity is 1, it means all the records in a row are unique and is also referred to as high selectivity. An Indexed column with high selectivity is the most optimized column and is normally faster for select from a table. Column with Primary Key or Unique constraints have a selectivity of 1.

Employee Table column “gender” stores gender information for 20,000 employees of our company. This column stores just two unique records (M/F). This is a solid example of low selectivity column. Column that stores Boolean records like Y/N, 1/0, T/F is another example of low selectivity. Whenever, there is low or bad selectivity, the optimizer utilizes Full Table Scan (FTS) to fetch data. FTS is bad especially with large table causing a lot of overhead to the database engine.

Selectivity = (total unique count/total records)
                  = (2/20,000)
                  = 0.0001

As a database developer or dba, you cannot modify company’s data to raise the selectivity for a performance. How do you improve selectivity for a column with low selectivity for performance on such case? Adding a composite Index is the best way to make a low selectivity high. How does it improve the selectivity? Using more than one column makes the Index more unique which in turns improves Index selectivity.

How to measure Index Selectivity in Oracle?

Syntax:-
SELECT (distinct_keys / num_rows) AS Selectivity
FROM dba_indexes
WHERE index_name like 'Index_Name' and Owner ='Table_Owner';
Example:
SELECT (distinct_keys / num_rows) AS Selectivity
FROM dba_indexes
WHERE index_name like 'PK_EMP' and Owner ='BANIYA';

Using our standard employee table, we are going to demonstrate you how to improve the selectivity? What is the selectivity of deptno column on a table? The company has 3 departments and 14 employees. The selectivity is for deptno is around 0.22 and it considered low selectivity. As stated above, creating a composite Index makes the data unique and improves the selectivity. Let’s create composite Index and measure the selectivity again?

--Composite Index
CREATE INDEX composite_index ON baniya.employee(
            deptno,sal, comm);
Selectivity:
SELECT (distinct_keys / num_rows) AS Selectivity, distinct_keys, num_rows
FROM dba_indexes
WHERE index_name like 'COMPOSITE_INDEX' and Owner ='BANIYA';

Output:


The selectivity has now changed from 0.22 to nearly 1 or almost 1.  We have just increased the selectivity and the performance of a query where deptno column is used.

The difference between these two keywords are clear.The term selectivity is not the same as cardinality. These words are often used interchangeably by those who do not know the difference. Now, you know the difference, you are always going to pick the right keyword going forward.

Interested in working with me? I can be reached at pbaniya04[at]gmail.com for any questions, consulting opportunities or you may drop a line to say HELLO. Thank your again for visiting my blog and looking forward to serving you more.

Have a Database-ious Day!

2 comments

  1. ok, but how to improve the scalability of index, e.g. index has scalability 0.00002761500267865525982956020346733973633195 how to improve it?

    ReplyDelete

Powered by Blogger.