Preview: Article API

Results

Thumbnail
Title A New Parser
Date Jan 14, 2016
Summary 2015 was a big year for Instapaper and in addition to all of our updates and feature developments, we were also quietly working on rewriting Instapaper’s website parser with a more modern web in mind. Today we’re really excited to finally launch our brand new and vastly improved parser! We approached the rewrite with the idea that the parser is a core product and we’ll be treating its development as such from now on. As a core product, we’ve...
HTML

2015 was a big year for Instapaper and in addition to all of our updates and feature developments, we were also quietly working on rewriting Instapaper’s website parser with a more modern web in mind. Today we’re really excited to finally launch our brand new and vastly improved parser!

We approached the rewrite with the idea that the parser is a core product and we’ll be treating its development as such from now on. As a core product, we’ve given the parser a name of its own, and will be maintaining and updating the parser with new versions and feature releases. After some deliberation, the name we went with is the somewhat obvious but entirely accurate “Instaparser,” and this launch is Instaparser 1.0.

Here are some of the ways that Instaparser will mark a massive improvement in your day-to-day Instapaper usage!

Enhanced Video Support

Instaparser supports inline YouTube and Vimeo videos from any website. With the new parser and today’s release of the Instapaper iOS 7.1.1 and Instapaper Android 4.3.2 updates you’ll now be able to view inline videos, and articles with inline videos will show up in the Videos section for easy access.

Additionally, the new parser allows us to make the changes necessary to launch the Videos section on instapaper.com, which is something we’ve gotten a lot of requests for!
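To make the idea of "inline videos from any website" concrete, here is a minimal sketch of how embedded YouTube and Vimeo players could be spotted in a page. It is not Instaparser's actual code: the VIDEO_HOSTS list, the VideoEmbedFinder class and the find_video_embeds function are all hypothetical names chosen for illustration, and the sketch only looks at iframe embeds.

```python
from html.parser import HTMLParser
from urllib.parse import urlparse

# Hosts treated as video embeds. This list is an assumption made for the
# example, not a list taken from Instaparser.
VIDEO_HOSTS = {
    "www.youtube.com",
    "youtube.com",
    "www.youtube-nocookie.com",
    "player.vimeo.com",
}


class VideoEmbedFinder(HTMLParser):
    """Collects the src of every <iframe> that points at a known video host."""

    def __init__(self):
        super().__init__()
        self.video_urls = []

    def handle_starttag(self, tag, attrs):
        if tag != "iframe":
            return
        # Attribute values can be None (e.g. a bare "src"), so fall back to "".
        src = dict(attrs).get("src") or ""
        # urlparse also handles protocol-relative URLs such as "//player.vimeo.com/...".
        host = urlparse(src).netloc.lower()
        if host in VIDEO_HOSTS:
            self.video_urls.append(src)


def find_video_embeds(html):
    """Return the embed URLs found in an HTML string, in document order."""
    finder = VideoEmbedFinder()
    finder.feed(html)
    return finder.video_urls
```

An article for which find_video_embeds returns a non-empty list could then be flagged for a Videos section, which is roughly the behavior described above.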

Way Better Image Handling

By default, Instaparser is far superior at parsing images on websites. The parser does a better job of handling “lazy-loaded” web images and supports more modern web image attributes like “srcset.” Additionally, Instaparser is more proactive about finding cover images and inserting them at the top of the body text, and we’ve added special rules that let us manually denote cover images when needed, to make sure you’re getting the best reading experience possible.
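For readers curious what handling lazy-loaded images and "srcset" involves, the sketch below shows one plausible approach: prefer the largest "srcset" candidate, then fall back to a lazy-load attribute, then to plain "src". It is an illustration under stated assumptions, not Instaparser's implementation. The LAZY_ATTRS names are conventions used by some lazy-loading scripts, the function names are hypothetical, and the srcset parsing assumes URLs contain no commas.

```python
from __future__ import annotations

from typing import Optional

# Attribute names used by some lazy-loading scripts. This list is an
# assumption for the example, not Instaparser's rule set.
LAZY_ATTRS = ("data-src", "data-original", "data-lazy-src")


def parse_srcset(srcset: str) -> list[tuple[str, float]]:
    """Split a srcset value into (url, weight) pairs.

    Simplification: assumes URLs contain no commas.
    """
    candidates = []
    for part in srcset.split(","):
        tokens = part.split()
        if not tokens:
            continue
        url = tokens[0]
        weight = 1.0  # a candidate without a descriptor counts as 1x
        if len(tokens) > 1 and tokens[1][-1] in "wx":
            try:
                weight = float(tokens[1][:-1])
            except ValueError:
                pass  # malformed descriptor: keep the default weight
        candidates.append((url, weight))
    return candidates


def best_image_url(attrs: dict[str, str]) -> Optional[str]:
    """Pick the most useful image URL from an <img> tag's attributes."""
    srcset = attrs.get("srcset") or attrs.get("data-srcset")
    if srcset:
        candidates = parse_srcset(srcset)
        if candidates:
            # Largest width (w) or pixel density (x) wins.
            return max(candidates, key=lambda c: c[1])[0]
    for name in LAZY_ATTRS:
        if attrs.get(name):
            return attrs[name]
    return attrs.get("src") or None
```

Checking the lazy-load attributes before "src" matters because lazy-loading pages often put a tiny placeholder image in "src" and the real URL elsewhere.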

More Aggressive Parsing

The new parser’s default parsing algorithm is a bit more aggressive about stripping navigation and other cruft from the top and bottom of articles, ensuring that your Instapaper experience is even cleaner and less cluttered than before.
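As an illustration of what trimming cruft from the top and bottom of an article can look like, here is a small sketch based on link density, a heuristic common in readability-style extractors. We are not claiming this is the algorithm Instaparser uses: the Block type, the function names and the 0.5 threshold are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class Block:
    text: str       # all visible text in the block
    link_text: str  # the part of that text that sits inside <a> tags


def link_density(block):
    """Fraction of a block's text that is link text (0.0 to 1.0)."""
    total = len(block.text.strip())
    if total == 0:
        return 1.0  # empty blocks are treated as cruft
    return len(block.link_text.strip()) / total


def trim_cruft(blocks, threshold=0.5):
    """Drop link-heavy blocks from the start and end of a block list.

    Blocks in the middle are left alone, even if they are link-heavy.
    """
    start, end = 0, len(blocks)
    while start < end and link_density(blocks[start]) > threshold:
        start += 1
    while end > start and link_density(blocks[end - 1]) > threshold:
        end -= 1
    return blocks[start:end]
```

Lowering the threshold makes the trimming more aggressive, at the risk of cutting a legitimate opening paragraph that happens to contain many links.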

Performance Improvements

In our production testing we found that Instaparser was roughly five times faster than Instapaper’s older parser and over ten times faster than Diffbot, an external parsing service that we used to supplement the old parser and which we hope to replace entirely with Instaparser. The difference in speed is dramatic enough that you should notice a marked improvement when downloading articles. Here’s a graph that demonstrates the difference between parsing time before and after launching the new parser (Y-axis in milliseconds):

In addition to all of the great new enhancements, we made sure to preserve the original parser features that make Instapaper so great, like multi-page and footnote support. If you encounter any problems with the new parser, please be sure to use the “Report a Problem” tools at the bottom of your articles, or email us at support@help.instapaper.com with the URL in question and we will resolve it as soon as possible.

Thanks for using Instapaper!
— Instapaper Team

Site Name Instapaper Blog
Words 507