Defining constants in Pl/pgSQL, with a twist

For the past few months i have been thinking about how to implement constants (like for example order status codes) in your code so that you maintain readability even when the constants are defined as integers and you can’t do anything about it. Of course i would prefer having status codes as text (eg. order CREATED, DELIVERED, REVERSED) as in PostgreSQL this doesn’t create noticeable performance penalties and there are no limitations on text column sizes. but you don’t just refactor all your database code and do berserk ALTER COLUMNS as this is way too much work. So still you have the integer status codes in your table and have to define lengthy CONSTANT C_ORDER_DELIVERED int :=1; blocks in your functions declare section to keep the code readable. This gets extremly annoying when there are let’s say 20 different statuses, should i declare them all in the DECLARE section or just the ones used by the function? However this can be avoided by a small trick…

Handling constants as record types attributes

Then one day i discovered that our JAVA team implements constants as a simple SQL function that returns one record where the statuses textual code is a out parameter and the integer value of the status is kept inside that parameter.



CREATE OR REPLACE FUNCTION order_statuses(

    OUT CREATED int,

    OUT DELIVERED int,

    OUT CANCELLED int

) RETURNS record AS

$$

BEGIN

     CREATED := 1;

     DELIVERED := 2;

     CANCELLED := 3;

END;

$$ LANGUAGE plpgsql;

This is a really neat idea, you only have to select the output of the function into a record type variable and Voilà!



C_ORDER = order_statuses();

SELECT * FROM orders WHERE key_user = i_username AND key_status = C_ORDER.DELIVERED;


Note that this kind of approach also has it’s drawbacks: The query is unable to use partial indexes in the plan and when you are scanning a table solely based on this key then hardcoding a constant will give you a more correct plan as it is able to use the statistics collected for the column.

No need to declare anything besides the C_ORDER record and code is readable, no integer constants inside.
However such an approach has it’s shortcomings, when you add a new status you need to drop the previous constant returning function and create a new one, this is quite annoying when you have all the same statuses stored inside a table and could fetch them from there. However the limitations that a function must always return a predefined fixed count of output variables will not enable you to return a record with dynamical length.

creating records with variable length

Then i figured out that this approach actually works if we do this one little exception to our “no SQL should be dynamically created inside DB” policy. It’s perfectly ok to create dynamic SQL if you are not doing any data access within it – just for converting data. (Doing data fetching inside dynamically generated SQL makes monitoring the database performance and access patterns a pain in the ass).

So here is what i came up with:



CREATE TABLE classificator.classificator(

    id_classificator serial PRIMARY KEY,

    classificator text,

    code int,

    lookup text,

    description text

);

CREATE INDEX idx_classificator ON classificator.classificator(classificator);INSERT INTO classificator.classificator (classificator, code, lookup, description) VALUES ('order_status',1,'CREATED','Order created');

INSERT INTO classificator.classificator (classificator, code, lookup, description) VALUES ('order_status',2,'DELIVERED','Order delivered');

INSERT INTO classificator.classificator (classificator, code, lookup, description) VALUES ('order_status',3,'CANCELLED','Order cancelled');

CREATE AGGREGATE array_accum (anyelement)

(

sfunc = array_append,

stype = anyarray,

initcond = '{}'

);

CREATE OR REPLACE FUNCTION classificator.get_constant_structure_sql(

    IN i_classificator text,

    OUT exec_str text

) RETURNS text AS $$

DECLARE

    _lookup text[];

    _key_integer text[];

    _keys text;

    _vals text;

BEGIN

    SELECT array_accum('"'||c.lookup||'"'),

           array_accum(c.code)

      FROM classificator.classificator c

    WHERE c.classificator = i_classificator

      INTO _lookup,

           _key_integer;

    IF NOT FOUND THEN

         RETURN;

    END IF;

_keys = array_to_string(_lookup, ', ');

    _vals = array_to_string(_key_integer, ', ');

    exec_str = 'SELECT * FROM (VALUES ('||_vals||')) AS t ('||_keys||')';

    RETURN;

END

$$ LANGUAGE plpgsql SECURITY DEFINER;

As you can see i don’t care much about the 3NF (Third Normalized Form) standards to keep all status codes inside separate tables. I personally am not a big fan of 3NF as it has it’s own drawbacks. Having a swarm of small status codes & other similar tables in your system makes it too difficult to locate anything. However if you like you can create this function in a way that it handles only one table (eg. it creates the status codes record only for the table order_statuses).

The implementation into a function is really simple as you can see from the following example:



...

DECLARE

    _rec_struct text;

    C_ORDER record;

BEGIN

    _rec_struct = classificator.get_constant_structure_sql('order_status');

    EXECUTE _rec_struct INTO C_ORDER;

    SELECT * FROM orders WHERE key_status = C_ORDER."DELIVERED";

...

Code is clean, simple, readable and consumes only 3 rows for every constant set that you need defined. Works also nicely as a eval check so you can not implement constants in your code that are not yet in the table.

Here’s a small data dump to visualize whats going on inside these functions:



select * from classificator.classificator;

 id_classificator | classificator | code |  lookup   |   description

------------------+---------------+------+-----------+-----------------

                1 | order_status  |    1 | CREATED   | Order created

                2 | order_status  |    2 | DELIVERED | Order delivered

                3 | order_status  |    3 | CANCELLED | Order cancelled

(3 rows)select * from classificator.get_constant_structure_sql('order_status');

                                  exec_str

-----------------------------------------------------------------------------

 SELECT * FROM (VALUES (3, 2, 1)) AS t ("CANCELLED", "DELIVERED", "CREATED")

(1 row)

SELECT * FROM (VALUES (3, 2, 1)) AS t ("CANCELLED", "DELIVERED", "CREATED");

 CANCELLED | DELIVERED | CREATED

-----------+-----------+---------

         3 |         2 |       1

(1 row)

So now we have the possibility to create a record inside every Pl/PgSQL function that holds all the constants, changes dynamically on adding new constants and spares us a lot of code in function DECLARE parts. Hope this idea was useful, i for sure plan to implement it wherever possible 😉

Promo corner

skype is looking for a DB QA engineer, the job is located in Tallinn. The team is small and brilliant, this is a great opportunity if you are a quick learner and are interested how big database clusters actually work!

Advertisements

About this entry