In my previous post I talked about bitmap indexes, where they should be used and where they shouldn’t be used or what you have to keep in mind when using them. Today I show you a example where they are very helpful. Remember I said that bitmap indexes where designed for read intensive environments and that they are useful on columns with low cardinality. Lets assume you have a table for all your employees in your company. The company has three different locations, lets say 1=New York; 2=London; 3=Paris:
SQL> CREATE TABLE employees ( empId NUMBER, deptId NUMBER, name VARCHAR2(50), gender VARCHAR2(6) ); Table created.
Now I insert lets say 100,000 employees distributed over the three locations. The distribution on the locations is based on a random number out of DBMS_RANDOM. Same for the gender:
SQL> BEGIN 2 FOR n IN 1..100000 LOOP 3 INSERT INTO employees VALUES (n, ROUND(DBMS_RANDOM.VALUE(0.5,3.5),0), 'Employee ' || n, DECODE(ROUND(DBMS_RANDOM.VALUE(1,2),0),1,'MALE','FEMALE')); 4 END LOOP; 5 COMMIT; 6 END; 7 / PL/SQL procedure successfully completed.
A quick look at the distributions:
SQL> SELECT gender, COUNT(*) FROM employees GROUP BY gender; GENDER COUNT(*) ------ ---------- FEMALE 49985 MALE 50015 SQL> SELECT deptId, COUNT(*) from employees GROUP BY deptId; DEPTID COUNT(*) ---------- ---------- 1 33532 2 33456 3 33012
So as you can see the distribution is good and the cardinality is “bad”. As the next step I create a normal index on it and look how it behaves.
SQL> CREATE INDEX EMPLOYEES_GENDER_I001 ON employees (gender); Index created. SQL> CREATE INDEX EMPLOYEES_DEPT_I001 ON employees (deptId); Index created. SQL> exec DBMS_STATS.GATHER_TABLE_STATS(USER, 'EMPLOYEES', CASCADE=>TRUE); PL/SQL procedure successfully completed.
Now lets have a closer look at some selects and their explain plans. First I just count the number of female employees and the number of employees in the New York (deptId=1) office. As I have indexes on both columns Oracle should retrieve the data directly from them – no need to go to the table:
SQL> set autotrace traceonly explain;
SQL> SELECT COUNT(*) FROM employees WHERE gender = 'FEMALE';
Execution Plan
----------------------------------------------------------
Plan hash value: 938208788
-----------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
-----------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 7 | 88 (18)| 00:00:01 |
| 1 | SORT AGGREGATE | | 1 | 7 | | |
|* 2 | INDEX FAST FULL SCAN| EMPLOYEES_GENDER_I001 | 50108 | 342K| 88 (18)| 00:00:01 |
-----------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
2 - filter("GENDER"='FEMALE')
SQL> SELECT COUNT(*) FROM employees WHERE deptId = 1;
Execution Plan
----------------------------------------------------------
Plan hash value: 3728864332
-----------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
-----------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 3 | 74 (9)| 00:00:01 |
| 1 | SORT AGGREGATE | | 1 | 3 | | |
|* 2 | INDEX RANGE SCAN| EMPLOYEES_DEPT_I001 | 34186 | 100K| 74 (9)| 00:00:01 |
-----------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
2 - access("DEPTID"=1)
I got an INDEX FAST FULL SCAN and an INDEX RANGE SCAN – fair enough. Costs are 88 and 74. Now what if I have a more complex statement? What if I want to have all male employees who are either in the New York or the London office:
SQL> SELECT COUNT(*) FROM employees WHERE gender = 'MALE' AND deptId IN (1,2);
Execution Plan
----------------------------------------------------------
Plan hash value: 1712853197
--------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
--------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 9 | 177 (16)| 00:00:01 |
| 1 | SORT AGGREGATE | | 1 | 9 | | |
|* 2 | TABLE ACCESS FULL| EMPLOYEES | 34047 | 299K| 177 (16)| 00:00:01 |
--------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
2 - filter("GENDER"='MALE' AND ("DEPTID"=1 OR "DEPTID"=2))
This leads already to a TABLE ACCESS FULL with costs of 177. Now the same story with bitmap indexes on the columns:
SQL> DROP INDEX EMPLOYEES_GENDER_I001; Index dropped. SQL> DROP INDEX EMPLOYEES_DEPT_I001; Index dropped. SQL> CREATE BITMAP INDEX EMPLOYEES_GENDER_I001 ON employees (gender); Index created. SQL> CREATE BITMAP INDEX EMPLOYEES_DEPT_I001 ON employees (deptId); Index created. SQL> exec DBMS_STATS.GATHER_TABLE_STATS(USER, 'EMPLOYEES', CASCADE=>TRUE); PL/SQL procedure successfully completed.
First lets do the counts again:
SQL> SELECT COUNT(*) FROM employees WHERE gender = 'FEMALE';
Execution Plan
----------------------------------------------------------
Plan hash value: 3004312514
-----------------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
-----------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 7 | 3 (0)| 00:00:01 |
| 1 | SORT AGGREGATE | | 1 | 7 | | |
| 2 | BITMAP CONVERSION COUNT | | 51145 | 349K| 3 (0)| 00:00:01 |
|* 3 | BITMAP INDEX SINGLE VALUE| EMPLOYEES_GENDER_I001 | | | | |
-----------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
3 - access("GENDER"='FEMALE')
SQL> SELECT COUNT(*) FROM employees WHERE deptId = 1;
Execution Plan
----------------------------------------------------------
Plan hash value: 2661849472
---------------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
---------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 3 | 3 (0)| 00:00:01 |
| 1 | SORT AGGREGATE | | 1 | 3 | | |
| 2 | BITMAP CONVERSION COUNT | | 34039 | 99K| 3 (0)| 00:00:01 |
|* 3 | BITMAP INDEX SINGLE VALUE| EMPLOYEES_DEPT_I001 | | | | |
---------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
3 - access("DEPTID"=1)
As you can see the optimizer chose another plan now for both tables. Instead of an INDEX FAST FULL SCAN and an INDEX RANGE SCAN I got now both times a BITMAP INDEX SINGLE VALUE and a BITMAP CONVERSION COUNT. Costs dropped from 88 and 74 to 3! Now the complexer statement:
SQL> SELECT COUNT(*) FROM employees WHERE gender = 'MALE' AND deptId IN (1,2);
Execution Plan
----------------------------------------------------------
Plan hash value: 597175463
-------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
-------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 9 | 9 (0)| 00:00:01 |
| 1 | SORT AGGREGATE | | 1 | 9 | | |
| 2 | BITMAP CONVERSION COUNT | | 33998 | 298K| 9 (0)| 00:00:01 |
| 3 | BITMAP AND | | | | | |
|* 4 | BITMAP INDEX SINGLE VALUE | EMPLOYEES_GENDER_I001 | | | | |
| 5 | BITMAP OR | | | | | |
|* 6 | BITMAP INDEX SINGLE VALUE| EMPLOYEES_DEPT_I001 | | | | |
|* 7 | BITMAP INDEX SINGLE VALUE| EMPLOYEES_DEPT_I001 | | | | |
-------------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
4 - access("GENDER"='MALE')
6 - access("DEPTID"=1)
7 - access("DEPTID"=2)
I got again another plan and here you can see the power of bitmap indexes now. Oracle takes the department ids and does now a BINARY OR on the bitmap. Out of the result it does a BINARY AND with the gender bitmap and look at the result: The costs dropped from 177 on a FULL TABLE SCAN to 9 on some bitmap operations! So as you can see on this short demonstration bitmap indexes are quite powerful!