The Bookmarking Data Model Is Wrong For Highlighting

The data model for bookmarking and highlighting services past and present can generally be distilled down to the following:

* URL
    * Title
    * Scraped Content
    * User Highlight
    * User Annotation
    * ... Other Metadata

The bookmark itself is tied to a URL, and anything else related to the bookmark, such as the title, the scraped content (if the service scrapes on your behalf), highlights, and annotations, is stored as additional metadata linked to that URL.
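As a rough illustration, the URL-anchored model above could be sketched in Python like this. It is not taken from any particular service; the names `Bookmark`, `bookmarks`, and `add_highlight` are hypothetical and exist only for this example.

```python
from dataclasses import dataclass, field


@dataclass
class Bookmark:
    """A URL-anchored record: every other field hangs off the URL."""
    url: str
    title: str = ""
    scraped_content: str = ""
    highlights: list[str] = field(default_factory=list)
    annotations: list[str] = field(default_factory=list)


# The store is keyed by URL, so the URL is the only identity a bookmark has.
bookmarks: dict[str, Bookmark] = {}


def add_highlight(url: str, text: str) -> Bookmark:
    """Attach a highlight to the bookmark for `url`, creating it if needed."""
    bookmark = bookmarks.setdefault(url, Bookmark(url=url))
    bookmark.highlights.append(text)
    return bookmark
```

In this sketch a highlight cannot exist on its own: it is only ever an entry in a list owned by a URL.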

There are some unfortunate restrictions that come with this data model.

Let’s take the example of comments.

Comments by their nature are distributed; the same article can be shared on any number of websites for any number of different users and communities to discuss. Especially in the case of tightly focused communities, the commentary on an article is often just as valuable as the article itself.

When the data model is anchored around the URL, highlights become tightly coupled to that URL, and this tight coupling ignores the distributed reality of comments, leaving them with no real place to exist in the data model.

Ideally, highlights made on an article and comments saved about an article should be easy to connect and view together, because it is the article itself, and not the URL, that is the common denominator (the URL is an imperfect proxy for the article).

What if there is nothing of value in the article itself to highlight, but the discussion of the article contains the real information of value that you want to save? This is often the case when a commenter debunks bogus claims published in an article shared on link aggregation websites while simultaneously explaining their method.

If you think about this for long enough, you may come to the same conclusion that I have: saving comments is also a form of highlighting, and both comments and highlights can be described in a common data model as “content”.

A content-first data model solves all of these issues and more, but it requires deliberate thought and planning when designing a service, and it is not something that can be tacked on at a later stage after starting out with a bookmarking data model.
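To make the idea more concrete, here is a minimal Python sketch of what a content-first model might look like. It is not Notado's actual schema. The names `Content`, `Kind`, `article_id`, and `about_article` are assumptions made for this example, and how an `article_id` would be derived for a given article is deliberately left out.

```python
from dataclasses import dataclass, field
from enum import Enum


class Kind(Enum):
    HIGHLIGHT = "highlight"
    COMMENT = "comment"


@dataclass
class Content:
    """A saved piece of text is the primary record, not the URL."""
    text: str
    kind: Kind
    source_url: str  # where this text was saved from
    article_id: str  # hypothetical identifier for the article being discussed
    tags: set[str] = field(default_factory=set)


def about_article(items: list[Content], article_id: str) -> list[Content]:
    """Return every highlight and comment saved about one article."""
    return [item for item in items if item.article_id == article_id]
```

Under these assumptions, a highlight saved from the article and a comment saved from a discussion thread elsewhere are the same kind of record, so they can be viewed together even though their `source_url` values differ.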

In addition to solving these issues, a content-first data model also allows for new and exciting possibilities in the sharing of the information that we save, especially if you are a believer in the idea that knowledge is social.

If you have been a subscriber of Instapaper or Pinboard in the past, you may have subscribed to RSS feeds filled with URLs that friends and peers were saving. There is limited utility in this kind of feed, which essentially acts as a firehose and leaves the significant burden of filtering to the individual subscriber, who has no way of knowing why an item was saved.

If you have been a subscriber of Readwise in the past, you may have used the link sharing feature to share a scraped version of a free or paywalled article with others, complete with your highlights and annotations. Again, there is limited utility in this kind of sharing, which requires significant manual intervention every time: first to generate a shareable link, and then to share it with N individuals.

Earlier this month I wrote about embedding RSS feeds on a static website, but I didn’t get too much into how the feeds themselves were being populated.

With a content-first data model, each individual piece of content can be tagged and classified independently, both as a piece of text in its own right and independently of its source material. This can be leveraged along with content-driven tagging rules to publish self-populating RSS feeds, which is how the feeds listed under the “Recent Highlights” section of the homepage of this website are populated.
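The general shape of content-driven tagging rules feeding an RSS feed can be sketched as follows. This is only an illustration under my own assumptions, not how Notado implements it: the keyword-matching `TaggingRule`, the `apply_rules` and `feed_for` helpers, and the simplified `render_rss` output are all hypothetical.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass, field


@dataclass
class Content:
    text: str
    source_url: str
    tags: set[str] = field(default_factory=set)


@dataclass
class TaggingRule:
    """Hypothetical rule: tag any content whose text contains a keyword."""
    keyword: str
    tag: str

    def matches(self, content: Content) -> bool:
        return self.keyword.lower() in content.text.lower()


def apply_rules(items: list[Content], rules: list[TaggingRule]) -> None:
    """Tag each item based on its own text, regardless of where it came from."""
    for item in items:
        for rule in rules:
            if rule.matches(item):
                item.tags.add(rule.tag)


def feed_for(items: list[Content], tag: str) -> list[Content]:
    """Select the items that belong in the feed for one tag."""
    return [item for item in items if tag in item.tags]


def render_rss(title: str, link: str, items: list[Content]) -> str:
    """Render a simplified RSS 2.0 document for the selected items."""
    rss = ET.Element("rss", version="2.0")
    channel = ET.SubElement(rss, "channel")
    ET.SubElement(channel, "title").text = title
    ET.SubElement(channel, "link").text = link
    ET.SubElement(channel, "description").text = f"Saved content: {title}"
    for item in items:
        entry = ET.SubElement(channel, "item")
        ET.SubElement(entry, "description").text = item.text
        ET.SubElement(entry, "link").text = item.source_url
    return ET.tostring(rss, encoding="unicode")
```

Because the rules look at the text of each saved item rather than at its URL, newly saved highlights and comments end up in the matching feed without any manual sharing step.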

The "Recent Highlights" section of the homepage, powered by Notado Feeds

If reading any of this has flipped a switch in your head, I encourage you to try out Notado. It is a simple, text-focused knowledge management and publishing service with no clunky JavaScript frameworks slowing your machine to a crawl.

If you liked the ethos and design of Pinboard, but feel like the feature set has not kept up with the competition, there’s a pretty good chance you’ll like Notado. If you’re a hacker who likes to build, Notado also has an extensive and fully documented GraphQL API for you to play with.

In addition to web highlights on desktop and mobile devices, Notado also supports Kindle highlights (without having to reauthorize with your Amazon account periodically) and has first-class support for saving and importing comments from popular link aggregator and microblogging platforms such as Reddit, Mastodon, Hacker News, Twitter, YouTube, and more.

Notado comes with a 30-day free trial (no billing details required), after which it is very moderately priced, with a single tier that includes all features (tagging rules, smart feeds, full-text search, unlimited highlights, comments support, etc.).

| Service | Monthly Cost (full feature set) |
| --- | --- |
| Notado | $1.99 |
| Readwise | $7.99 |
| Pinboard | $1.83 |
| Instapaper | $2.99 |
| Raindrop | $2.33 |

Notado will be around at least as long as I am, as my one and only knowledge management tool for the rest of my life. In the event that I’m no longer able to run Notado, plans have been made to open source the codebase for people to continue running their own instances.

Discuss this article on /r/LGUG2Z