What gets to the front page of Hacker News?

Thursday, June 29th, 2023

In my job as a technical writer / marketer[1], the most common question I get from companies I work with is “how do we get to the front page of Hacker News?” And as someone whose writing has been on said front page many times, I’ll tell you: I have no clue!

Sometimes it seems like good, high-quality writing always finds its way to the front page; other times, it feels like the mods are out to get you. So I started (very manually) collecting data on which posts are in the top 30 on HN at the end of any given day.

Here are the highlights (FP = Front Page):

  • Blog posts (45%) are the most popular type of content on the FP
  • A blog post from a corporate entity only has an 8% shot at making the FP
  • 25% of FP posts are blog posts from engineers on their personal blogs/sites or OSS
  • 36% of FP posts are news articles, and the (slight) majority of them are actually not about software/hardware
  • ShowHN posts almost never make the front page (<2%)

Here’s the breakdown more visually:

Before some more analysis and a few more charts, I want to preface this post by saying that I don’t mean to comment on the inherent value of getting your content to the front page of Hacker News. Whether this is the right goal, or whether you should pursue a different one like number of upvotes, or maybe comments, or maybe angry comments, is a discussion for another post.

How I gathered and categorized the data

I gathered this (small) data set by manually[2] combing through the top 30 posts on Hacker News using the “past” feature. For each post, I clicked on the link to see what the content was about, who published it, and where. I defined some categories in advance and added others on the fly as I saw more posts, then classified each post into one. Since my main focus is technical writing and marketing, the categories I chose relate to that lens:

  • News / opinion articles in media publications
  • Academic journals and papers
  • Blog posts
    • Personal blog vs. a corporate blog vs. an open source entity
    • Types of content: tutorials, thought leadership, etc.
  • Hiring announcements
  • ShowHN
  • Misc.
    • Repo links
    • Non-blog websites
    • Tweets, Reddit posts, etc.

These categories have excellent coverage (Misc. <5%) despite being oddly specific. You can see that I wasn’t particularly concerned with the subject matter per se (everything is about Rust anyway) but more with the format and the authoring entity.
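For anyone who wants to repeat this kind of tally, here is a minimal sketch of turning hand-labeled rows into category shares. The rows, the category names, and the `category_shares` and `rollup` helpers are all hypothetical; they are not taken from my actual spreadsheet.

```python
from collections import Counter

# Hypothetical hand-labeled rows: (day, rank, category).
# These values are made up for illustration, not real data.
labeled_posts = [
    ("2023-06-01", 1, "blog/personal"),
    ("2023-06-01", 2, "news"),
    ("2023-06-01", 3, "blog/corporate"),
    ("2023-06-01", 4, "news"),
    ("2023-06-01", 5, "paper"),
    ("2023-06-02", 1, "blog/personal"),
    ("2023-06-02", 2, "show_hn"),
    ("2023-06-02", 3, "blog/oss"),
]


def category_shares(rows):
    """Return each category's share of all labeled front-page posts."""
    counts = Counter(category for _day, _rank, category in rows)
    total = sum(counts.values())
    return {category: count / total for category, count in counts.items()}


def rollup(shares, prefix):
    """Sum the shares of every subcategory that starts with `prefix`."""
    return sum(
        share for category, share in shares.items() if category.startswith(prefix)
    )


shares = category_shares(labeled_posts)
blog_share = rollup(shares, "blog/")

for category, share in sorted(shares.items(), key=lambda item: -item[1]):
    print(f"{category:15s} {share:.1%}")
print(f"all blog posts  {blog_share:.1%}")
```

Keeping the subcategory in the label (for example `blog/personal`) makes it easy to report both the top-level share and the breakdown inside it.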

The astute reader will note several limitations of the data set.

First, the data I collected represents what finished the day on the front page of Hacker News. But many items will be on the front page over the course of a given day, and then end the day somewhere else (perhaps number 35, or 67). What “gets to” the front page (a group that contains, and exceeds the size of, what “ends” on the front page) is a richer data set, but I do not have access to it, and it may not exist.

Second, and perhaps more importantly, the dataset doesn’t record the attempts made to get to the front page, i.e. all posts submitted to Hacker News in a given day. It’s possible that orders of magnitude more blog posts are submitted but only a small fraction make the FP, whereas 95% of academic papers submitted make the front page (extreme figures used for illustrative purposes). So for simplicity, when I say “the likelihood of making the FP,” I’m assuming a constant rate of conversion from post to FP across different categories.
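To make that second limitation concrete, here is a small sketch with invented numbers. The submission counts below are hypothetical (my dataset has no submission counts at all); the point is only that a category can dominate the front page while having a much lower conversion rate.

```python
# Hypothetical counts, invented for illustration only:
# category -> (posts submitted, posts that ended the day on the FP).
attempts = {
    "blog posts": (2000, 400),
    "academic papers": (60, 57),
}


def fp_share(data):
    """Share of front-page slots per category (what my dataset measures)."""
    total_fp = sum(fp for _submitted, fp in data.values())
    return {name: fp / total_fp for name, (_submitted, fp) in data.items()}


def conversion_rate(data):
    """Share of submissions that reach the FP (what my dataset cannot measure)."""
    return {name: fp / submitted for name, (submitted, fp) in data.items()}


shares = fp_share(attempts)
rates = conversion_rate(attempts)

for name in attempts:
    print(f"{name:16s} FP share {shares[name]:.0%}, conversion {rates[name]:.0%}")
```

With these made-up inputs, blog posts take most of the front-page slots even though an individual paper is far more likely to get there.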

Limitations aside, the results started to converge very clearly after only 5 or 6 days of data, although there were a few outlier days with spikes in a particular category. In total I collected and sifted through 30 days’ worth of data, and hope to add more in the future.

What kinds of blog posts get to the front page?

Statistically, your best shot at getting your writing to the front page of Hacker News is to write something (with nothing to promote) on your personal website or blog. 26% of total FP posts are blog posts like this, while only 11% of total FP posts (or 24% of FP blog posts) came from corporate[3] entities on their corporate blogs or websites. 20% of FP blog posts are from some sort of open source entity (usually launches).
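Since those figures mix two denominators (all FP posts vs. FP blog posts only), here is a quick sketch of the conversion between them. It uses only the rounded percentages quoted in this post, so small rounding differences are expected; the variable names are mine.

```python
# Rounded figures quoted in this post.
blog_share_of_fp = 0.45        # blog posts as a share of all FP posts
corporate_share_of_fp = 0.11   # corporate blog posts as a share of all FP posts
oss_share_of_blogs = 0.20      # open source posts as a share of FP blog posts

# Share of all FP posts -> share of FP blog posts.
corporate_share_of_blogs = corporate_share_of_fp / blog_share_of_fp

# Share of FP blog posts -> share of all FP posts.
oss_share_of_fp = oss_share_of_blogs * blog_share_of_fp

print(f"corporate: {corporate_share_of_blogs:.0%} of FP blog posts")
print(f"open source: {oss_share_of_fp:.0%} of all FP posts")
```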

On the subject of what to post, the Hacker News guidelines say:

On-Topic: Anything that good hackers would find interesting. That includes more than hacking and startups. If you had to reduce it to a sentence, the answer might be: anything that gratifies one's intellectual curiosity.

and this is pretty much the story with what blog posts make the FP.

Of those corporate blog posts, about 40% are product announcements or launches; the rest are less promotional content formats like technical tutorials. A common question I get is “how do we get our product launch on the front page of Hacker News?” and the answer is that it’s statistically[4] highly unlikely (~4%) for that to happen. And of those product announcements that made the FP, a good deal of them are (a) from established companies and products like Apple, (b) posted organically by the community, and (c) about hardware and gaming. Not your software startup.

Personal blog posts, though, are highly popular on the FP. They span the gamut from tutorials to “how I built ___” type posts, and of course the perennial “I made a thing.” Here are a few examples of personal blog posts that made the FP:

The common thread is that they’re non-promotional, and typically focus on a personal pursuit of the author[5].

My personal experience writing for corporate entities says that useful tutorials and interesting stories do the best. The most recent post I wrote that made it to #1 was a tutorial for PlanetScale about how database sharding works (HN post here). A few others I wrote that made the FP were all non-promotional:

  • My two stories for Retool about why Accenture (link) and Oracle (link) are worth so much money
  • My blog post for WorkOS about best practices for building webhooks (link)
  • My “thought leadership” for PlanetScale about DBA experience (link)

These successes live next door to a massive graveyard of blog posts I’ve written that I thought were really good, but Hacker News did not. Or perhaps randomness just reared its ugly head.

What kinds of news gets to the front page?

The second biggest category of posts that make the front page of Hacker News is (shocker) news, which I define here as a story or opinion piece published by a media organization. 36% of FP items are news, which is a lot!

While almost all of the news that makes the FP is STEM related, the majority of it (well, by a few percentage points) doesn’t relate to software or hardware. There are a lot of articles about space exploration and rockets, biology and chemistry, and physics, but fewer about code and SaaS and things like that.

It’s worth noting that I didn’t see a single article from TechCrunch in the entire dataset I gathered, despite there being plenty of articles from places like The Verge, Wired, etc. A cursory search using Algolia’s Hacker News search tool, ordered by number of upvotes, reveals that Hacker News really does not like TechCrunch very much.
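If you want to repeat that kind of cursory check for another publication, here is a sketch that builds a query URL for Algolia’s public HN Search API. The endpoint and parameter names reflect my understanding of that API’s documentation, so treat them as assumptions and verify before relying on them; `build_search_url` is a hypothetical helper, and the snippet only constructs the URL without sending a request.

```python
from urllib.parse import urlencode

# Assumed endpoint of Algolia's public HN Search API.
API_BASE = "https://hn.algolia.com/api/v1/search"


def build_search_url(domain, min_points=0, hits_per_page=20):
    """Build a search URL for stories whose link matches `domain`."""
    params = {
        "query": domain,
        "tags": "story",                        # stories only, no comments
        "restrictSearchableAttributes": "url",  # match against the link, not the title
        "hitsPerPage": hits_per_page,
    }
    if min_points > 0:
        # Optional filter: only stories above a given number of points.
        params["numericFilters"] = f"points>{min_points}"
    return f"{API_BASE}?{urlencode(params)}"


print(build_search_url("techcrunch.com"))
print(build_search_url("theverge.com", min_points=100))
```

Opening the printed URLs in a browser (or fetching them with any HTTP client) should return JSON that you can compare across publications.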

Miscellaneous findings and other things

6% of items on the front page are academic papers, which is more than I thought.

ShowHN is very valuable, but is not likely to land your product on the front page.

Tweets and tweet threads sometimes make the front page.

It’s uncommon for hiring or launch posts (the two types of posts that are reserved for YC companies, and “artificially” promoted by moderators) to make the front page.

You can access the underlying dataset here.

I want to thank Max Woolf (who may recall interviewing me for a Data Science job at Buzzfeed that I thankfully did not take) for the excellent “Hacker News Undocumented” resource. It was tremendously helpful.


Footnotes

1. With an undergrad Data Science degree for some reason

2. For the empathetic reader wondering why I did this manually when there are tons of ways for the author (who, as mentioned in the previous footnote, has a Data Science degree) to access historical Hacker News data via BigQuery, API, etc. – the amount of effort it would have taken to train a classifier would have been highly impractical, plus I’d need to manually label the data anyway.

3. By this I mean closed source, basically. Anything written on the blog of a company that is selling something.

4. Noting once again that this framing isn’t entirely fair, since the dataset is missing attempts.

5. Which makes me wonder if the best “front page strategy” might be to encourage your engineering team to work on their personal blogs?