How old is this newspaper?
The Google News Archive contains an archive of The Indian Express newspaper from 1932 to 1994. And every once in a while I go there and open an edition to see what was happening on this same day 50 or 60 years ago.
It had been in the back of my mind to use this dataset for a project, but I couldn’t come up with any ideas. A few months ago, while playing a variant of the game GeoGuessr, I wondered if I could use the same game format for the old Indian Express newspaper archive - guessing the publication date using hints from headlines and articles on the front page. I wasn’t sure how I would go about it, so I put the idea on the back burner.
Around the same time last year, I made a daily Bollywood trivia game along the lines of NYT Connections, which I had a lot of fun making and playing. With some time off last week, I decided to revisit the newspaper idea, turn it into an actual game and publish it.
And I’m pretty happy with how it has turned out. The game is tougher to play than I imagined, but it was a lot of fun to build, and to read through so many old front pages along the way.
You can read a few brief notes on the game and the Indian Express newspaper archive dataset below. Or you can skip all of that and just play the game:
If you’re still on the fence, a few screenshots might help.
In case you missed it above, here’s the same link in the form of a slightly rounded pink button.
Thank you for indulging me by playing this silly little game. If you think someone you know might enjoy it too, please share the link with them. 🙏
A few brief notes on the game and the Indian Express newspaper archive
The first step in building the game was getting a copy of the dataset. Google News Archive makes it incredibly hard to download the archive, because it loads newspapers through iframes and tiled images. Luckily, the front page thumbnails were available as JPG files at a high enough resolution for the purposes of the game. I wrote a little script to fetch the image URLs of all the front pages from 1932 to 1994. There were thousands!
For the newer editions, the only data source for The Indian Express I could find was PressReader, which contained editions from 2006 onwards. The archive itself was behind a paywall, but luckily the front pages were all publicly available. A few thousand more front pages.
Even though a few years were missing in between, I now had a collection of 13,000+ front pages from 1932-2025, enough to build a game around.
The next challenge was blurring out the publication date on each front page. I was hesitant to use LLMs or other AI tools for this, as I had no reliable way to verify that they wouldn’t tamper with the newspaper content or introduce unintended edits. Because the date usually appeared in roughly the same place at the top of the page, I used a simple image processing library to programmatically blur out those coordinates. The design and layout of the newspaper have changed over the years, so this process did require some manual oversight and didn’t always blur the dates correctly.
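The fixed-coordinate blur described above can be sketched roughly like this. It is only an illustration: the post doesn't name the image library, so this version works on a plain grid of grayscale values to stay dependency-free, and the `DATE_BOXES` coordinates and year ranges are made-up placeholders rather than the real layout positions.

```python
# Hypothetical layout table: (first_year, last_year) -> (left, top, right, bottom).
# Placeholder coordinates for illustration, not the values used for the real archive.
DATE_BOXES = {
    (1932, 1994): (2, 0, 6, 2),
    (2006, 2025): (0, 1, 4, 3),
}


def date_box_for(year):
    """Pick the rectangle that is assumed to cover the date for a given year."""
    for (start, end), box in DATE_BOXES.items():
        if start <= year <= end:
            return box
    raise ValueError(f"no layout known for {year}")


def box_blur_region(pixels, box, radius=1):
    """Return a copy of `pixels` (rows of grayscale ints) with only `box` blurred.

    `right` and `bottom` are exclusive. Everything outside the box is copied
    through untouched, so the rest of the page cannot be altered.
    """
    left, top, right, bottom = box
    height, width = len(pixels), len(pixels[0])
    blurred = [row[:] for row in pixels]
    for y in range(max(top, 0), min(bottom, height)):
        for x in range(max(left, 0), min(right, width)):
            # Average the neighbourhood, clipped to the image bounds.
            # Always read from the original so the blur does not compound.
            ys = range(max(y - radius, 0), min(y + radius + 1, height))
            xs = range(max(x - radius, 0), min(x + radius + 1, width))
            window = [pixels[j][i] for j in ys for i in xs]
            blurred[y][x] = sum(window) // len(window)
    return blurred


# Tiny checkerboard standing in for a scanned front page.
page = [[255 if (x + y) % 2 else 0 for x in range(8)] for y in range(4)]
masked = box_blur_region(page, date_box_for(1950))
```

With a real image library the same idea would be to crop the rectangle, blur the crop and paste it back. Keeping one rectangle per layout era is what makes the manual oversight manageable: when the masthead moves, only the table changes.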
Another annoyance was full page ads. From 1932 to 1994, there were no full page ads on any of the front pages of The Indian Express. Starting around 2007, they began to appear more frequently. Between 2009 and 2017, 11.57% of front pages had full page ads. That number more than triples, to 42.33% of all front pages, between 2023 and 2025. And this data doesn’t even include ads that covered 60-80% of the front page.
I removed those front pages from the game. And even though I’m certain some full page ads slipped through the cracks, and the publication date is still visible on some editions, I was satisfied with the final dataset.
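For anyone curious how numbers like these can be computed, here is a minimal sketch. It assumes each front page has already been tagged with a year and a full page ad flag; the `FrontPage` record, the function names and the sample data are all hypothetical, not the actual pipeline.

```python
from dataclasses import dataclass


@dataclass
class FrontPage:
    year: int
    full_page_ad: bool  # assumed to come from an earlier review step


def ad_share(pages, start_year, end_year):
    """Percentage of front pages in [start_year, end_year] flagged as full page ads."""
    in_range = [p for p in pages if start_year <= p.year <= end_year]
    if not in_range:
        return 0.0
    return 100 * sum(p.full_page_ad for p in in_range) / len(in_range)


def playable(pages):
    """Keep only the front pages that are usable in the game."""
    return [p for p in pages if not p.full_page_ad]


# Made-up sample data, only to show the shape of the calculation.
sample = [
    FrontPage(2010, False),
    FrontPage(2012, True),
    FrontPage(2024, True),
    FrontPage(2024, False),
]
share_2009_2017 = ad_share(sample, 2009, 2017)
game_pages = playable(sample)

# Ratio between the two real percentages quoted in the post: about 3.66.
growth = 42.33 / 11.57
```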
After cleaning the data, building the game was fairly straightforward and also enjoyable. I decided to name the game PressGuessr, after the game that inspired the format, and bought the .com domain name.
The link, once again for your convenience: pressguessr.com
While we are on the topic of games, apart from the usual NYT Games, here are a few daily web games that I enjoyed this year: