Making Live Blog search engine-friendly
Today, Sourcefabric is releasing a beta version of Live Blog 2.0. It includes a major new feature, developed in partnership with Zeit Online, that many of our clients and users have been asking for: You now have an easy way to make your blog content visible to search engines.
In this blog post, I explain why that wasn’t possible before, how Sourcefabric solved the problem, and what you need to do to integrate our SEO solution in your CMS.
As always, we're grateful for any feedback you have. Please get in touch at [email protected] if you need help with the install or want us to provide you with a demo instance.
The problem with search engines and dynamic content
Traditionally, search engine crawlers were designed to index static content such as HTML, plain text, and office document formats. As long as content remained static and was rendered by the web browser exactly as it was received from the server, that was fine.
But in today’s web applications, data is continuously exchanged between the browser and the server, thanks in large part to the development of the ‘AJAX’ group of technologies. Live Blog is an example of this type of dynamically generated web application that works without the need to reload the page.
In response, some search engine crawlers (notably Googlebot) have been developing their ability to index dynamically generated content by emulating what happens in the web browser when a user visits and interacts with a page of dynamic content. Progress has been made, but support is still incomplete, so developers still have to do extra search engine optimization (SEO) to ensure that content generated by their applications gets indexed.
Sourcefabric’s SEO solution for Live Blog
With the release of Live Blog 2.0, we provide a solution for ensuring that blog content is indexable by search engines. A static HTML version of each blog is now generated on the server. As a publisher, you can set your CMS to request this HTML from an API and insert it into an article page before that page is delivered to the browser. When search engine crawlers visit the page, they will see the latest posts from the embedded blog and index them.
Human readers might notice an improvement as well: They will see the live blog without delay when they visit the page where it’s embedded, because the blog content arrives together with the page itself. Any new posts published after they open the page will still appear automatically, with no need to reload; they are added to the blog using AJAX requests.
How we implemented the SEO solution
We have implemented the following technical changes to Live Blog to achieve server-side HTML generation and to make blog content indexable by search engines:
- The embedding of Live Blog has been refactored with Backbone.js to make it compatible with Node.js for server-side HTML generation.
- The static HTML of the blog content is generated using Node.js on the server.
- Backend API services have been built to make this generated HTML accessible to the user’s CMS.
- A new section has been added to the blog configuration to allow users to configure a) the number of posts initially contained in the HTML generated on the server, and b) the refresh rate for HTML on the server.
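To illustrate how the two configuration settings interact, here is a minimal sketch. It is written in Python for brevity and is not the actual Node.js implementation. The names `SeoConfig`, `StaticBlogRenderer`, `initial_posts` and `refresh_seconds` are hypothetical, as is the HTML markup. The sketch assumes posts are passed in newest first.

```python
import html
import time
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class SeoConfig:
    # Hypothetical names for the two settings described above.
    initial_posts: int = 10        # a) posts contained in the generated HTML
    refresh_seconds: float = 60.0  # b) how often the HTML is regenerated


@dataclass
class StaticBlogRenderer:
    config: SeoConfig
    clock: Callable[[], float] = time.monotonic
    _cached_html: Optional[str] = None
    _generated_at: Optional[float] = None

    def render(self, posts: List[str]) -> str:
        """Return static HTML, regenerating it only when the cache is stale."""
        now = self.clock()
        stale = (
            self._generated_at is None
            or now - self._generated_at >= self.config.refresh_seconds
        )
        if stale:
            # Assumes `posts` is ordered newest first.
            latest = posts[: self.config.initial_posts]
            items = "".join(f"<li>{html.escape(p)}</li>" for p in latest)
            self._cached_html = f'<ol class="liveblog-posts">{items}</ol>'
            self._generated_at = now
        return self._cached_html
```

A lower refresh rate means less work for the server, at the cost of crawlers and first-time visitors occasionally seeing slightly older posts until the AJAX updates arrive.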
How to integrate Live Blog in your CMS
The user’s CMS can request the following data from the Live Blog REST API:
- HTML for any given blog
- the time when that blog was last updated
- the time when the server-side HTML was last generated
There are two ways of integrating with the Live Blog REST API:
- The CMS can make regular requests to the Live Blog API to retrieve the latest version of the HTML content.
- In the Live Blog admin interface, the user can specify an optional callback URL. If this URL is specified, it will be called every time there’s an update to the HTML of the blog.
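On the CMS side, both approaches can share the same logic. The following Python sketch is one possible way to structure it, not the Live Blog API itself. `BlogSnapshot`, `BlogHtmlCache` and the field names are hypothetical, and the `fetch` function stands in for whatever HTTP request your CMS makes to the Live Blog REST API. It also assumes the callback carries no payload, so the handler simply fetches the HTML again.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class BlogSnapshot:
    # Hypothetical container for the three values the API exposes.
    html: str
    last_updated: str    # time when the blog was last updated
    html_generated: str  # time when the server-side HTML was last generated


class BlogHtmlCache:
    """Keeps the most recent blog HTML for the CMS to embed."""

    def __init__(self, fetch: Callable[[], BlogSnapshot]) -> None:
        # `fetch` is supplied by the CMS; it performs the actual API request.
        self._fetch = fetch
        self.snapshot: Optional[BlogSnapshot] = None

    def poll(self) -> bool:
        """Option 1: run on a schedule. Returns True if the stored HTML changed."""
        fresh = self._fetch()
        unchanged = (
            self.snapshot is not None
            and fresh.html_generated == self.snapshot.html_generated
        )
        if unchanged:
            return False
        self.snapshot = fresh
        return True

    def on_callback(self) -> bool:
        """Option 2: run from the handler behind the callback URL."""
        return self.poll()
```

Polling is simpler to set up, while the callback avoids unnecessary requests and keeps the embedded HTML closer to real time.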
Once the HTML has been requested from the Live Blog API, it has to be included in an article page in the CMS at the desired position.
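How the HTML gets into the article page depends on your CMS and its templating system. As a minimal illustration in Python, assuming the article template contains a placeholder comment at the desired position (the marker `<!-- liveblog -->` and the function `embed_blog_html` are made up for this example):

```python
PLACEHOLDER = "<!-- liveblog -->"  # hypothetical marker in the article template


def embed_blog_html(article_html: str, blog_html: str,
                    placeholder: str = PLACEHOLDER) -> str:
    """Insert the server-generated blog HTML at the marked position."""
    if placeholder not in article_html:
        raise ValueError("article template has no live blog placeholder")
    # Replace only the first occurrence so the blog is embedded once.
    return article_html.replace(placeholder, blog_html, 1)
```

Because the substitution happens before the page is delivered, crawlers receive the blog posts as part of the article's own HTML.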
The HTML for each blog can be retrieved from a URL that is structured as follows:
Users will be able to copy this URL for each blog from the backend interface of Live Blog.
Links to get you started
- See Live Blog in action at Zeit Online
- GitHub repository for the new SEO-friendly embed
Do you want to use Live Blog but not worry about the code? Get in touch at [email protected] to learn more about our managed hosting and support services. We can install and maintain a Live Blog instance for you on our servers, so you can focus on blogging and leave the rest to us.