Jens Krämer

PostgreSQL Array Columns and Indices

 |  sql, rails, postgresql

Here’s a quick one to give you a nice potential speed up your PostgreSQL Array queries.

Suppose you have a sites table with an aliases Array column and you defined an index like this:

create index idx_sites_on_aliases on sites using GIN(aliases);

So far so good. Except that a (otherwise perfectly working) query like

select * from sites where 'asdf.com' = ANY(aliases);

to find out the site for a given alias name will never use that index. Instead you have to use one of the PostgreSQL Array functions like so:

select * from sites where aliases @> ARRAY['asdf.com']::varchar[];
-- or, equivalent:
select * from sites where aliases @> '{asdf.com}'::varchar[];

If this were a Rails app, you might query your Site model like that:

Site.where('aliases @> ARRAY[?]::varchar[]', hostname)

The shorter curly brackets syntax is not easily used with Rails due to Rails’ automatic quoting.

If you want to try this out on your own with a small data set, don’t forget to enforce index usage with SET enable_seqscan TO off; in the psql shell you are using, otherwise the optimizer will choose sequential scans over index usage due to the small table size. With this set, have PostgreSQL explain what it does:

explain analyze select * from sites where 'asd.com' = ANY(aliases);

 Seq Scan on sites  (cost=10000000000.00..10000000001.16 rows=1 width=176) (actual t
   Filter: ('asd.com'::text = ANY ((aliases)::text[]))
 Total runtime: 0.031 ms


explain analyze select * from sites where aliases @> ARRAY['asd.com']::varchar[];

 Bitmap Heap Scan on sites  (cost=12.00..16.01 rows=1 width=176) (actual time=0.040.
   Recheck Cond: (aliases @> '{asd.com}'::character varying[])
   ->  Bitmap Index Scan on idx_sites_on_aliases  (cost=0.00..12.00 rows=1 width=0)
         Index Cond: (aliases @> '{asd.com}'::character varying[])
 Total runtime: 0.095 ms