Dry, Flat and Technically Correct
On algorithms, aesthetics, and the stubborn irreducibility of human taste
This will probably be short.[1]
A cynic is a man who knows the price of everything and the value of nothing. - Oscar Wilde
This is a quote that recently resurfaced in my mind. One of the people at my uni made this very aesthetic collage of five photos of similar colors (and she made five such collages).
The caption had the following line: “My eyes and hands kinda hurt from finding these pics and arranging them, but aaahhh, look at it! I am sooo happy after making this 🤌”
As an algo person, my brain almost automatically went: well, I could write code that does that faster! I basically need to choose five colors and then choose 10-12 photos per color (I can choose the ones I like most for the collage from them manually).
Wait, how do we choose the colors? If I choose them arbitrarily, they may not necessarily exist in my photos. And exact colors are seldom repeated between photos.
So we want some way to find the dominant color in each photo. One idea is to take the average color; after all, colors are just tuples of red, green and blue. As it turns out, that doesn’t work: averaging blends unrelated hues together and gives a variety of greyish, muddy tones.
We now have the choice of either moving to a better representation of color (costly, as we need to convert every pixel) or choosing some statistic that follows the typical values instead of being dragged around by the extreme ones. If you remember from your math class, one way to do the latter is the median. And as it turns out, there is an efficient algorithm to get the median of a list.[2]
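To make the difference concrete, here is a minimal sketch comparing the two. It assumes the median is taken per channel (the post does not say how exactly it was applied), and the pixel list is a made-up toy photo, not real data. It also uses `statistics.median`, which sorts the values, rather than a linear-time selection algorithm.

```python
from statistics import median

def mean_color(pixels):
    """Average each channel (R, G, B) separately over all pixels."""
    n = len(pixels)
    return tuple(round(sum(p[c] for p in pixels) / n) for c in range(3))

def median_color(pixels):
    """Take the median of each channel separately (an assumption, see above)."""
    return tuple(round(median(p[c] for p in pixels)) for c in range(3))

# Hypothetical photo: a mostly red subject in front of some blue background.
pixels = [(220, 30, 30)] * 6 + [(20, 40, 200)] * 4

print(mean_color(pixels))    # (140, 34, 98): a muddy purple that is in neither region
print(median_color(pixels))  # (220, 30, 30): the red that covers most of the photo
```

Note that a per-channel median is not guaranteed to be a color that actually occurs in the photo; it just tends to land much closer to one than the mean does.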
Now that we have associated every photo with one color, we need to choose 5 colors for the collages. This could be done by something called k-means.
We basically choose five colors (say, randomly) and then partition the photos into groups based on which of the five each photo is closest to. We then go to each group and find the most dominant color in it. If these five new colors are significantly different from the previous five, we repeat the above step with them. Otherwise, we are done.[3]
This surprisingly simple algorithm is one of the standard tools for sorting things into groups when nobody has labelled them for us. An observation I shall mention in some algorithms tutorial is that the Sorting Hat, if it were real, would probably just be using k-means.[4]
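The loop described above can be sketched roughly like this. It is the textbook version, not necessarily the code I ran: the "dominant color" of a group is taken to be its per-channel mean, closeness is plain Euclidean distance in RGB, and `tol`, `max_iter` and `seed` are parameters invented for the sketch.

```python
import random

def dist2(a, b):
    """Squared Euclidean distance between two RGB tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def kmeans(colors, k, tol=1.0, max_iter=100, seed=0):
    """Group colors into k clusters. Returns (centers, labels)."""
    rng = random.Random(seed)
    centers = rng.sample(colors, k)  # start from k randomly chosen photos
    for _ in range(max_iter):
        # Step 1: assign every photo's color to its nearest center.
        labels = [min(range(k), key=lambda j: dist2(c, centers[j])) for c in colors]
        # Step 2: recompute each center as the mean color of its group.
        new_centers = []
        for j in range(k):
            group = [c for c, lab in zip(colors, labels) if lab == j]
            if not group:
                new_centers.append(centers[j])  # empty group: keep the old center
                continue
            new_centers.append(tuple(sum(ch) / len(group) for ch in zip(*group)))
        # Step 3: stop once no center has moved by more than tol.
        moved = max(dist2(a, b) for a, b in zip(centers, new_centers))
        centers = new_centers
        if moved <= tol ** 2:
            break
    return centers, labels

# Toy input: two reddish and two bluish photos, so k=2 instead of 5.
colors = [(250, 10, 10), (240, 20, 20), (10, 10, 250), (20, 20, 240)]
centers, labels = kmeans(colors, k=2)
print(centers, labels)
```

For the collages, `colors` would be the one color per photo from the previous step and `k` would be 5.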
So now, I was done, right? I was using well-known techniques and my code was correct. I had outsourced the ‘eyes and hands hurt’ to code and jumped straight to ‘aaahhh, look at it’.
Well, the collages my code generated were shit. They were dry and flat and technically correct. It sometimes confused the background and the subject, sometimes chose a terrible color for the collage and sometimes, well, the classification was plain weird.
I had spent more than an hour coding and the code produced… nothing.
After some thought, I realized this has been staring me in the face for quite some time. Certain tasks seem to be so distinctly human that algorithms, whether of the ‘normal’ or the AI kind, keep failing at them: choosing the best episodes or moments of a series, finding papers relevant to a problem, looking for examples to illustrate a concept or, well, curating photos for a collage.
So if there is anything to take away, it is that computers are still far, far away from replacing taste. If I want to make such collages, I will probably have to scour my gallery manually.[5]
[1] I will use the word algorithms quite a lot. Here I don’t refer to whatever black box drives Instagram feeds or ChatGPT. I use the word in its purest sense: a series of steps to perform a specified task. This definition covers everything from recipes to IKEA manuals to how Google Maps calculates the distance and path in milliseconds.
[2] There is a linear-time algorithm for the median. Someone might argue that converting the pixels to a different color space is also O(n). True, but remember that the constant factor exists, and in this case it can become the bottleneck.
[3] Technically, I am being a bit coy here. We make one change: we start off with initial points that are as far apart as possible, with the goal of getting final colors that are further apart. This is called Extremal K-Means.
[4] To be complete, it would be using Online K-Means. Here ‘online’ means that the data is not all available at the start and keeps arriving later. The algorithm has to stand by the decisions it has already made, so it has to adapt to new information without completely throwing away the old classification. Basically, it can’t completely change the meaning of the houses based on the new batch.
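For the curious, the two variations from the last two notes can be sketched as below. Both are my guesses at reasonable versions, not exact recipes: the initialization is a greedy "farthest-first" pick that arbitrarily starts from the first color in the list, and the online step is the simple sequential update where only the nearest center moves, by a running-mean step. The function names are made up for this sketch.

```python
def dist2(a, b):
    """Squared Euclidean distance between two RGB tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def farthest_first(colors, k):
    """Pick k spread-out starting centers, greedily.

    Starts from colors[0] (an arbitrary choice), then repeatedly adds the
    color whose nearest already-chosen center is farthest away.
    """
    centers = [colors[0]]
    while len(centers) < k:
        centers.append(max(colors, key=lambda c: min(dist2(c, m) for m in centers)))
    return centers

def online_update(centers, counts, color):
    """Sort one newly arrived color and nudge only its nearest center.

    Modifies centers and counts in place and returns the index of the
    chosen group. The other centers are left untouched, so earlier
    decisions are never reshuffled wholesale.
    """
    j = min(range(len(centers)), key=lambda i: dist2(color, centers[i]))
    counts[j] += 1
    # Running mean: move the center 1/counts[j] of the way towards the new color.
    centers[j] = tuple(m + (x - m) / counts[j] for m, x in zip(centers[j], color))
    return j

colors = [(0, 0, 0), (10, 0, 0), (255, 255, 255), (250, 250, 250), (255, 0, 0)]
print(farthest_first(colors, 3))  # [(0, 0, 0), (255, 255, 255), (255, 0, 0)]

centers = [(0.0, 0.0, 0.0), (100.0, 100.0, 100.0)]
counts = [1, 1]
print(online_update(centers, counts, (10, 0, 0)), centers)
```

In Sorting Hat terms, `online_update` is what runs each time a new student sits on the stool: they get a house, and that house’s "typical student" shifts a little.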



Maybe try changing from RGB/HSV to CIELAB? That won't solve the issue of background colors, but should improve color classification.... and with that I defeated the purpose of this article.