Pinterest currently does not have an official webservice API. It seems kind of crazy in this day and age. They really should have one. I can’t think what the business reasons might be for not having one.
They’ve not had one for long enough that it’s high time we write our own. It’ll be surprisingly easy with a few choice tools.
Webservice API on NodeJs
NodeJs is just a fun platform to write IO-heavy applications for the web. We’re going to write a quick RESTful endpoint using the Express library that allows us to consume real Pinterest content that’s not available via a pre-existing service.
Screen Scrape Pinterest
Given no API, we’re left to our own devices. The data for Pinterest is only exposed via the UI on the website. We’re going to have our service visit that UI and grab the data that we need as a user of a web browser would see it. This is screen scraping. There are a lot of downsides here, but we wouldn’t be trying it if there were an API already.
One downside is that our service will be brittle. If Pinterest ever changes the layout of the page, our service won’t be able to bring back the right data. Our solution will be simple, so it’ll be easy to update, but this should be a red flag not to do anything mission-critical via screen scraping unless you’re giving it your full attention.
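One cheap way to at least notice a layout change is to sanity-check whatever the scrape returns. The following is a minimal sketch, not part of the original solution: the `checkScrape` name and the `image` and `link` fields are assumptions made up for illustration. It fails loudly when the result is empty or mostly incomplete, which is the usual symptom of stale selectors. (A user with no pins at all would also trip it, so a real service would want to tell those cases apart.)

```javascript
// Hypothetical guard: throw when a scrape result looks wrong, which is the
// typical symptom of the page layout having changed underneath us.
function checkScrape(pins, requiredFields = ['image', 'link']) {
  if (!Array.isArray(pins) || pins.length === 0) {
    throw new Error('Scrape returned no pins; the page layout may have changed');
  }
  // Count pins that are missing at least one of the fields we rely on.
  const incomplete = pins.filter((pin) =>
    requiredFields.some((field) => !pin[field])
  );
  if (incomplete.length > pins.length / 2) {
    throw new Error(
      `${incomplete.length} of ${pins.length} pins are missing fields; selectors may be stale`
    );
  }
  return pins;
}
```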
Another downside might be speed. Screen scraping a UI is not the fastest way to get data. We’ll try to mitigate this with the fastest tools that we have. NodeJs makes for a blasted fast web server. A library called cheerio is supposedly best-in-class for screen scraping (advertised as 8x faster than jsdom).
To make this retrieval even faster for repeat use, caching could be very helpful. We could cache what we get back from Pinterest in our service via some datastore, or we could cache in our client. Best practices here will be very dependent on your use case. These kinds of enhancements have been made over and over again and would only clutter the simple Pinterest interaction, so I will exclude them for now.
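For the curious, service-side caching doesn’t have to be much code. Here’s a minimal sketch of an in-memory cache with a time-to-live, assuming a single Node process and no external datastore. The `createCache` name and its parameters are hypothetical, and the injectable `now` clock is only there to make it easy to test.

```javascript
// Hypothetical in-memory cache: each entry expires ttlMs milliseconds
// after it was stored. `now` is injectable so tests can control the clock.
function createCache(ttlMs, now = Date.now) {
  const entries = new Map();
  return {
    get(key) {
      const hit = entries.get(key);
      if (!hit) return undefined;
      if (now() - hit.storedAt >= ttlMs) {
        entries.delete(key); // expired, so drop it and report a miss
        return undefined;
      }
      return hit.value;
    },
    set(key, value) {
      entries.set(key, { value, storedAt: now() });
    },
  };
}
```

A route could check `cache.get(username)` before hitting Pinterest and call `cache.set(username, pins)` after a successful scrape. How long entries should live depends entirely on how fresh the data needs to be.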
Getting Pinterest Data
Here’s the final solution in all its glory. This snippet includes only the code inside the Express route.
When I wrote it out for my own use, I was surprised at the brevity. I love it. Granted, there’s no handling of any errors or attempts to make this semi-robust. This just gets us the data on a good day.
The final JSON that’s exposed at our chosen Express endpoint looks like this:
It’s ready for use by a JSON-ready client. So stinkin’ easy. We’re connecting the web together, and it’s awesome! Now the world will know of the baked goods and flower arrangements that we love the most.