PostgreSQL Array Columns and Indices
Here’s a quick one to give you a nice potential speed up your PostgreSQL Array queries.
Suppose you have a sites
table with an aliases
Array column and you defined
an index like this:
create index idx_sites_on_aliases on sites using GIN(aliases);
So far so good. Except that a (otherwise perfectly working) query like
select * from sites where 'asdf.com' = ANY(aliases);
to find out the site for a given alias name will never use that index. Instead you have to use one of the PostgreSQL Array functions like so:
select * from sites where aliases @> ARRAY['asdf.com']::varchar[];
-- or, equivalent:
select * from sites where aliases @> '{asdf.com}'::varchar[];
If this were a Rails app, you might query your Site model like that:
Site.where('aliases @> ARRAY[?]::varchar[]', hostname)
The shorter curly brackets syntax is not easily used with Rails due to Rails’ automatic quoting.
If you want to try this out on your own with a small data set, don’t forget to
enforce index usage with SET enable_seqscan TO off;
in the psql shell you are using, otherwise the optimizer
will choose sequential scans over index usage due to the small table size.
With this set, have PostgreSQL explain what it does:
explain analyze select * from sites where 'asd.com' = ANY(aliases);
Seq Scan on sites (cost=10000000000.00..10000000001.16 rows=1 width=176) (actual t
Filter: ('asd.com'::text = ANY ((aliases)::text[]))
Total runtime: 0.031 ms
explain analyze select * from sites where aliases @> ARRAY['asd.com']::varchar[];
Bitmap Heap Scan on sites (cost=12.00..16.01 rows=1 width=176) (actual time=0.040.
Recheck Cond: (aliases @> '{asd.com}'::character varying[])
-> Bitmap Index Scan on idx_sites_on_aliases (cost=0.00..12.00 rows=1 width=0)
Index Cond: (aliases @> '{asd.com}'::character varying[])
Total runtime: 0.095 ms