Knock helps teams deliver great notification experiences to end users. One of our most popular features is batching: Instead of sending many notifications to a user in a short time frame, Knock can batch multiple notifications into a single message, delivered to the user at the right time.
Sometimes a batch accumulates far more notifications than a single message can handle. If a tweet goes viral and gets thousands of likes, it wouldn’t make sense for a single notification to list the name of every user that liked your tweet.
When we batch notifications at Knock, we produce a number of variables you can use to give your users helpful information about the batch, without listing every item contained within it.
These variables include the count of items in the batch and the count of unique users in the batch. We also provide an array of the first ten items in the batch, which you can use to include details like a user's name or information about the subject of the notification. This is similar to how Twitter delivers its own notifications.
Recently, a customer asked us if, instead of the first ten activities, we could provide the last ten (that is, the most recent activities). Naturally, we took up the challenge. How hard could it be?
It turned out that it was hard to design and then easy to implement, all thanks to the power of Postgres.
tl;dr: Take a good look at this chunk of SQL and then tell us: why would shoving two CASE statements with mixed ordering and fixed dates into an ORDER BY ever be a good idea? What if we told you that this is part of a WINDOW function?
ORDER BY
  CASE
    WHEN sort_order = 'asc' THEN inserted_at
    ELSE '1900-01-01'
  END ASC,
  CASE
    WHEN sort_order = 'desc' THEN inserted_at
    ELSE '1900-01-01'
  END DESC
Intrigued? Good - let's talk about how this bit of SQL enabled a cool new feature at Knock.
A short aside on window functions
SQL window functions are the beating heart of our batch functions. A window function makes it possible to fetch rows from the database and perform operations on them in groups. When resolving a group of batches, we partition over each batch and, within each partition, take the first ten activities. Instead of doing all this work in our backend code, SQL will load, process, and return each batch in just the right way.
Instead of performing an N+1 query for each batch returned and then processing the results in our backend to build our batches, we can send just one query, and the batches come back loaded with at most ten items.
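As a quick illustration, here is a minimal sketch of that idea (using simplified table and column names of our own, not our exact production schema): number each activity within its batch, then keep only the first ten.

SELECT ranked.activity_id, ranked.batch_id
FROM (
  SELECT a.id AS activity_id, a.batch_id,
         -- Number activities within each batch, oldest first
         row_number() OVER (PARTITION BY a.batch_id ORDER BY a.inserted_at ASC) AS rn
  FROM activity a
) ranked
WHERE ranked.rn <= 10;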
When each batch is ordered differently
The hard part is that we need to load multiple batches in one query, where some batches would want the first ten items (aka ORDER BY ASC) and some batches would want the last ten items (aka ORDER BY DESC). Our customers set this using our dashboard editor, which saves the setting on the batch step itself.
So the question is: How do we selectively order each set of items in a batch - some ascending, some descending - using a single query?
Other approaches we tried
Our initial research concluded without finding a good way to solve this problem using window functions. Instead, we thought about shifting from a “resolve on read” approach to a “resolve on write” approach:
- Resolve on read: As activities arrive, store them quickly. When it’s time to load the batch, figure out what is in the batch (more work done in one shot).
- Resolve on write: As activities arrive, build the batch (spread the work out over each activity). When it’s time to load the batch, no further work is required.
Using a “resolve on write” approach, we looked at building a list of each activity that belonged in the finished batch. If a batch needed just the first ten items, we would add items to the batch until we hit ten, and any remaining items would not be delivered when the notification is built. If a batch needed the last ten items, we would add items to the end of the batch and trim the start of the batch to ensure only ten items are delivered when the notification is built.
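As a rough sketch of what the "last ten" trimming step might have looked like on the write path (hypothetical SQL against the simplified tables used later in this post; $1 stands for the batch in question):

-- Keep only the ten newest activities for a batch; delete the rest
DELETE FROM activity
WHERE batch_id = $1
  AND id NOT IN (
    SELECT id
    FROM activity
    WHERE batch_id = $1
    ORDER BY inserted_at DESC -- newest first
    LIMIT 10
  );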
Although we started down this road, we revisited the window query to see if there was any way to make it work…and there was!
ORDER BY 🤝 CASE statements
Postgres ORDER BY clauses have a few interesting properties:
- Typically, an ORDER BY clause has a list of columns by which to sort, and optionally the direction to sort (ASC by default, or DESC): ORDER BY inserted_at ASC
- Sometimes an ORDER BY clause will have more than one column: ORDER BY updated_at DESC, inserted_at ASC
- You can also use CASE statements in an ORDER BY clause: ORDER BY CASE WHEN sort_by = 'inserted_at' THEN inserted_at ELSE updated_at END DESC. This makes it possible to selectively choose which column to sort by! It gets us part of the way to our desired outcome, but we still need a way to choose whether we are sorting ASC or DESC.
- A quirk of ORDER BY is that you can order by columns or by a fixed value in the query. Take this example: ORDER BY '1900-01-01' DESC, inserted_at ASC. Here there are two sort keys. The first is effectively ignored because every row has the same value, so sorting by it does nothing; instead, all rows are sorted by inserted_at ASC. Although this would be a waste in most code, it enables us to selectively sort ASC or DESC per batch in a single SQL query.
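You can see these pieces working together in a self-contained query (a toy example of ours, not from our production code):

WITH items (grp, sort_order, n) AS (
  VALUES
    ('x', 'asc', 1), ('x', 'asc', 2), ('x', 'asc', 3),
    ('y', 'desc', 1), ('y', 'desc', 2), ('y', 'desc', 3)
)
SELECT grp, n
FROM items
ORDER BY
  grp,
  CASE WHEN sort_order = 'asc' THEN n ELSE 0 END ASC,
  CASE WHEN sort_order = 'desc' THEN n ELSE 0 END DESC;
-- Group 'x' comes back as 1, 2, 3; group 'y' comes back as 3, 2, 1.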
Using these building blocks, we can actually solve our selective-sort batching problem with just a few lines of SQL!
Bringing it all together
If you’ve been following along this far, congratulations: you spend too much time writing SQL. Even so, we hope the following snippet has been worth the journey:
ORDER BY
  CASE
    WHEN sort_order = 'asc' THEN inserted_at
    ELSE '1900-01-01'
  END ASC,
  CASE
    WHEN sort_order = 'desc' THEN inserted_at
    ELSE '1900-01-01'
  END DESC
This ORDER BY clause will do the following:
- For each row, if the sort_order is asc, it will use the inserted_at column when sorting. If sort_order is not asc, it will use the fixed value 1900-01-01, which effectively does nothing.
- Then, if the sort_order is desc, it will use the inserted_at column when sorting. If sort_order is not desc, it will use the fixed value 1900-01-01, which effectively does nothing.
Here is a simple table structure and corresponding SQL query that shows it all in action:
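The original table definitions aren't reproduced here, but a minimal sketch consistent with the query below might look like this (column types and constraints are our assumptions):

CREATE TABLE batch (
  id         bigint PRIMARY KEY,
  sort_order text NOT NULL DEFAULT 'asc' -- 'asc' or 'desc'
);

CREATE TABLE activity (
  id          bigint PRIMARY KEY,
  batch_id    bigint NOT NULL REFERENCES batch (id),
  inserted_at timestamp NOT NULL
);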
SELECT t.activity_id
FROM (
  SELECT a.id AS activity_id, a.batch_id, a.inserted_at,
         row_number() OVER activities_partition AS row_number -- See window definition below
  FROM activity a
  INNER JOIN batch b ON b.id = a.batch_id
  WINDOW activities_partition AS (
    PARTITION BY b.id
    ORDER BY
      -- This is the good part
      CASE
        WHEN b.sort_order = 'asc' THEN a.inserted_at
        ELSE '1900-01-01'
      END ASC,
      CASE
        WHEN b.sort_order = 'desc' THEN a.inserted_at
        ELSE '1900-01-01'
      END DESC
  )
) t
WHERE t.row_number <= 10 -- For 'asc' batches, these are the first 10 items; for 'desc' batches, the last 10
ORDER BY t.batch_id, t.inserted_at ASC -- Restores the natural sort order within each batch
Conclusion
In the end, the changes to our code involved adding a join (to get the sort_order setting) and changing our WINDOW ORDER BY from just inserted_at to a couple of weird but very readable CASE statements.
Performance-wise, our query is a little more expensive than it used to be, but still within tolerances for our biggest, most complex batches (and there may be further improvements to our table and index structure to speed things up even more). A little more query complexity in exchange for more maintainable, understandable code is a tradeoff well worth making in this case.
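For instance (our speculation, not a change described above), a composite index covering the join, partition, and final sort columns would be a natural starting point for that tuning:

CREATE INDEX activity_batch_id_inserted_at_idx
  ON activity (batch_id, inserted_at);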