On my recent project we have a large package holding constants values. From a design point of view this is a very elegant solution because this way the constants are all defined in a centralized way. But the usage of this package lead to enormous performance issues. This post tells why we had these problems and how we solved them.
An example of a constants package could be:
create or replace package my_constants is my_name constant varchar2(30) := 'Andre'; end; /
One of the problems is that package constants cannot be used in Oracle Forms or directly from SQL. So a function was written that dynamically retrieved the constants value from the package. This function can be used in Forms and SQL,
so that problem seemed to be solved. An implementation of a function that retrieves the constants value can be:
function get_my_constant (p_constant in varchar2 ) return varchar2 is l_return varchar2(2000); begin execute immediate ('begin :1 := '||p_constant||'; end;') using in out l_return; return l_return; end get_my_constant;
And now the horror story starts. This function was being used inappropriately. Supposed to be used only in Forms, but used in a lot of packages. The statspack report counted well over 30 million calls to a single value from the package using this function. (Granted the Statspack report covered 28 hours, but still)
Performance of the system was becoming so poor that production problems were beginning to occur. Every call to the function results in a hard parse which is very expensive. The best way, of course, was to remove the function call whenever possible. But the application is so big, that this is not possible in a short timeframe.
The best way was to rewrite the function. Was there a way to make the function perform better? This way we didn’t have to touch the rest of the code. I shortly thought of using deterministic functions. But that is impossible in combination with execute immediate.
The final solution was as follows:
- Lookup the requested constant value in a database table. If the value exists, return this value. This way no dynamic SQL is needed.
- When we can’t find the value in the database table. perform the Dynamic SQL and store it in the table for later use.
The table looks like:
SQL> desc my_constants Name Null? Type ----------------------------------------- -------- ---------------------------- CONST NOT NULL VARCHAR2(61) VALUE VARCHAR2(2000)
I used an private procedure to insert the constant value in the table. This way we could use an autonomous transaction to commit the data that was inserted in the table without affecting the main transaction. The rewritten function looks like:
function get_my_constant (p_constant in varchar2 ) return varchar2 is l_return my_constants.value%type; procedure create_entry (p_con in varchar2 ,p_val in varchar2 ) is pragma autonomous_transaction; begin insert into my_constants (const, value) values (p_con, p_val); commit; exception when dup_val_on_index then null; end create_entry; begin select value into l_return from my_constants where const = p_constant; return l_return; exception when no_data_found then execute immediate ('begin :1 := '||p_constant||'; end;') using in out l_return; create_entry (p_constant, l_return); return l_return; end get_my_constant;
The new function “copies†the constant values from the package to our “cache†table. It’s important to delete the contents from the table whenever the constants package is modified.
During the Steve Adams seminar in the Utrecht, the Netherlands, Steve opted to use index organized tables (IOT) instead of “normal†heap tables for lookup tables. The my_constants table will not be a big table, so my colleague Alex and I investigated if there was something to gain by implementing the my_constants table as an IOT instead of a heap table.
Creating an IOT is not very different from creating a normal heap table:
CREATE TABLE CAD_CONSTANTS (CONST VARCHAR2(61) NOT NULL ,VALUE VARCHAR2(2000) ,CONSTRAINT CON_PK PRIMARY KEY (CONST) ) ORGANIZATION INDEX /
Only the organization index clause is extra. This makes sure that the complete table is actually an index. We looked at the number of consistent gets to check if the IOT was more efficient compared to a heap table.
The “normal†heap table results in:
0 recursive calls 0 db block gets 3 consistent gets 0 physical reads 0 redo size 391 bytes sent via SQL*Net to client 426 bytes received via SQL*Net from client 2 SQL*Net roundtrips to/from client 0 sorts (memory) 0 sorts (disk) 1 rows processed
The IOT results in:
0 recursive calls 0 db block gets 1 consistent gets 0 physical reads 0 redo size 391 bytes sent via SQL*Net to client 426 bytes received via SQL*Net from client 2 SQL*Net roundtrips to/from client 0 sorts (memory) 0 sorts (disk) 1 rows processed
So I chose the IOT because it’s a little efficient than a heap table when the table holds a small number of rows.
Finally we checked if the new function was faster. We created a test script in which we retrieved the value of 30 constants. This was looped 5000 times. So we did 150000 calls to the function.
We started this script twice in two simultaneous sessions in order to check if the sessions were blocking each other.
The following is a comparison of the two functions. We used tkprof to analyze the trace files.
The old function has the following statistics:
call count cpu elapsed disk query current rows ------- ------ -------- ---------- ---------- ---------- ---------- ---------- Parse 1 0.00 10.24 0 0 0 0 Execute 1 354000.39 394291.20 0 0 0 1 Fetch 0 0.00 0.00 0 0 0 0 ------- ------ -------- ---------- ---------- ---------- ---------- ---------- total 2 354000.39 394301.44 0 0 0 1
The new function has the following statistics:
call count cpu elapsed disk query current rows ------- ------ -------- ---------- ---------- ---------- ---------- ---------- Parse 1 0.00 0.00 0 0 0 0 Execute 1 177500.05 190986.24 0 0 0 1 Fetch 0 0.00 0.00 0 0 0 0 ------- ------ -------- ---------- ---------- ---------- ---------- ---------- total 2 177500.05 190986.24 0 0 0 1
These are big number because the test was done on a 4 CPU machine. I believe you have to divide the number of CPU seconds by 4 (the number of CPU’s) to get the real number of CPU seconds. The new function is around 50% faster than the old one. This is a lot considering that the function is called around 5,575,762 times an hour in production.
On production this change was dramatically. Overall CPU load is now 20 to 30% less, which is enough the keep the system happy for now.