Handling high OLTP workload with RAC

Recently I had to face a high OLTP workload with RAC. This workload was a massive insert with high concurrency in one table. When used RAC, the performance drops down because of the concurrency management needed to maintain index : this creates a lot of roundtrips between the inserting nodes to agree which one should write the index updates, even with only one index to maintain. Fortunately, it’s possible to overcome this issue, and we’ll see beetween several solutions that partitioning can be helpful even with OLTP workload.
Let’s look at the solutions offered:
1) Keep the writes on one node only and distribute the reads on the other nodes ; this is the most simple way and works great, but we loose the RAC-iness, the load balancing, the scalability of the architecture and need an application rework ;
2) As recommended by Oracle in « Troubleshooting ‘enq: TX – index contention’ Waits in a RAC Environment. (Doc ID 873243.1) » and « Top 5 Database and/or Instance Performance Issues in RAC Environment (Doc ID 1373500.1) », create a reverse index; this works great! But as we’ll see later, I’m doubtful about the scalability especially when dealing with sequences (there are only 10 different starting points for the index). The BIG drawback is that a reverse index cannot be picked by the optimizer for a range scan ; then if you’ve got high read load on top of your write load, this can be a serious issue ; for example forget about having good performance when filtering on timestamp columns to identify recent records (unless you got an Exadata box I admit) ;
3) My preferred one: hash-partition the table on the primary key and use LOCAL indexes ; then when the inserts are done, the rows gets balanced beetween partitions and avoid roundtrips between nodes for index maintenance ; moreover the solution is really scalable as you’ll see in the tests, you can trustfully go up to 1000 partitions to handle your workload.
OK, stop the theory, let’s go for experimentation. I’ve done it using the following:
  – create test table using dba_objects as model ;
  – create PK on object_id column ;
  – implement the test tool that runs the following :
    * create 100 connections to the RAC cluster (2 nodes with 16 procs eachs) ;
    * create a thread pool of 100 resident threads ;
    * loop 50000 times in the thread pool to:
      + pick an available connection to RAC cluster ;
      + sleep some reasonable amount of time to simulate real workload (50ms here) ;
      + insert a line in the test table using a sequence ;
      + commit.
Here are the results. Two criterias for analyzing them: the time spent in inserting and the run queue size (number of active sessions in the cluster).

Test Time spent inserting (s) Maximum run queue size
no RAC, no partition and regular index on PK 480 6
RAC, no partition and regular index on PK 2000 45
RAC, no partition and REVERSE index on PK 450 8
RAC, 5 partitions with regular local index on PK 510 15
RAC, 10 partitions with regular local index on PK 450 8
RAC, 20 partitions with regular local index on PK 450 8
RAC, 30 partitions with regular local index on PK 450 6
RAC, 50 partitions with regular local index on PK 400 6
RAC, 70 partitions with regular local index on PK 400 6
RAC, 100 partitions with regular local index on PK 380 6
RAC, 1000 partitions with regular local index on PK 530 8
RAC, 100 partitions with regular local index on PK and regular local index on « created » column 460 6

We can see that hash-partitioning using the PK is a great way to handle the concurrency in RAC. You will be able to scale correctly the write workload even in RAC.
As we’ll see in next blog, we’ll also could to handle the read workload without reworking the application by using local indexes on the other columns of the PK: the number of small range scans is balanced with one big range scan (in a certain range of partition number).
Traditionnaly, local indexes are mostly used in DWH for maintenance purpose (partition exchange …) and parallelism, and global indexes in OLTP. But as we just see, global indexes kill OTLP write performance in RAC (this is the same behaviour as for non-partitionned tables), except if you use global partitionned indexes, as we’ll also see in a next post.
Last tip, please note that the number of partitions should be a power of 2 (4, 8, 16 …) to get the best balance of the rows in the different partitions when hash-partitioning your tables (ref: Expert Oracle Database Architecture by Thomas Kyte).
Thanks for reading, see you soon.