TL;DR: Underscore is a matching character in LIKE queries to match a single character.

I’m using Postgresql for one of my projects to store some python package metadata. Canonical python packages must be all lowercase and should not use underscores (_) but instead should use hyphens (-) as delimiters. It is just a recommendation and not enforced by PyPI or pip.

I wanted to check how many packages in my database were using underscores.

SELECT count(*) FROM packages WHERE filename LIKE "nflx_%"

Little did I know that an underscore in a LIKE query is treated as a placeholder to match exactly one character in Postgresql. So this query matched every package that had nflx_ or nflx- as a prefix in the name. 🤦

It took me a good 15 minutes of pulling my hair before I searched for it online.

Of course, Postgresql being an exemplary project had this well-documented.

Update1: Turns out this is not just a Postgresql quirk but it is a feature of SQL, so SQLite, MySQL etc all behave this way.

Update2: You can escape the underscore using a back-slash.

SELECT count(*) FROM packages WHERE filename LIKE "nflx\_%"