I'm becoming a fan of API-based communication and content loading. In this post I'll briefly explain why.
Consider a blog page a bit like this one. It has the following components:
- Menu
- Content (this text+title)
- Comments
Let’s first look at the life cycles of these parts:
- Comments - they change whenever someone posts a comment, so on a popular blog they change quite often. Each blog entry has its own comments.
- Menu - it changes when new content is published or titles are updated. The menu is practically the same on every page.
- Content - every page has its own content, and it rarely changes after it has been published. In most cases it doesn't change at all. (Well, maybe some typo fixes, but not much more than that.)
First, consider the traditional architecture that e.g. WordPress uses. It has no API: the server constructs the whole page and returns it, so every request loads the menu, the content and the comments together. You can't cache any of this data easily without risking that people miss new comments. And if you do build a cache and invalidate it whenever anything changes, the process is quite complex. In pseudocode:
- If the menu changes -> invalidate every page that contains the menu. This is the expensive loop: the invalidation process must know which pages contain the menu.
- If the content changes -> invalidate that page.
- If a comment is posted -> invalidate that page.
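A minimal sketch of these invalidation rules, assuming a hypothetical in-memory page cache (the names `page_cache` and `pages_with_menu` are illustrative, not any CMS's internals):

```python
# Rendered-page cache: URL path -> finished HTML.
page_cache = {}
# The invalidation process must track every page that embeds the menu.
pages_with_menu = set()

def on_menu_change():
    # The expensive case: loop over all pages that contain the menu.
    for path in list(pages_with_menu):
        page_cache.pop(path, None)

def on_content_change(path):
    page_cache.pop(path, None)

def on_new_comment(path):
    # A new comment invalidates the whole rendered page too.
    page_cache.pop(path, None)
```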
Menu changes are expensive: after one, every page load hits the backend for a while.
What if we use API-based communication instead? The 'static' web page is a bit of HTML without any content, plus JavaScript and CSS files, and the APIs are Menu, Content and Comments. Below is an architecture picture of the system. The user's cache can be e.g. the browser's internal cache or an Internet Service Provider's proxy.
There's a good chance the Content API never has to hit the real storage again after the first load. The content's TTL in our local cache can be "forever", because we can easily invalidate it ourselves. The story is different for remote caches: their TTL can be e.g. 30 seconds, so the user's cache doesn't hold the data for long, but once it expires the request hits our local cache instead of the Content service.
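One way to express these two TTLs with standard HTTP headers. This is a sketch: the values are examples, and `s-maxage` assumes a shared cache such as a reverse proxy or CDN sits between the user and the Content service:

```python
def content_cache_headers():
    # max-age: the user's cache revalidates after 30 seconds, after which
    # the request hits our local (shared) cache, not the Content service.
    # s-maxage: the shared cache keeps the entry practically "forever"
    # (a year here); we purge it explicitly on the rare content edit.
    return {"Cache-Control": "public, max-age=30, s-maxage=31536000"}
```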
When the menu data changes, we no longer need a complex invalidation loop. A single call invalidates the cached menu for all pages, which simplifies our rules a lot. The TTL for the local cache can again be "forever", while for users' caches it can be e.g. 30 seconds or even shorter.
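With a Menu API the invalidation really is one call, because there is a single cached resource instead of one copy per page. A sketch, with a plain dict standing in for whatever local cache you use:

```python
# Local cache: API path -> cached response body (illustrative).
local_cache = {"/api/menu": '{"items": ["Home", "Blog"]}'}

def on_menu_change():
    # One key, one call -- no need to know which pages embed the menu.
    local_cache.pop("/api/menu", None)
```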
How the Comments API can be cached depends on its features. If it lets users modify or delete their own comments, its responses cannot be cached for logged-in users. So the rules can be more complex: logged-in user -> never cache; anonymous user -> always cache, but invalidate when a new comment is written.
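These rules can be sketched as a header decision. The function and parameter names here are illustrative, not a specific framework's API:

```python
def comments_cache_headers(logged_in):
    if logged_in:
        # Logged-in users can modify or delete their own comments,
        # so their responses must never be cached.
        return {"Cache-Control": "private, no-store"}
    # Anonymous users always get the cached response; the cache entry
    # is purged whenever a new comment is written.
    return {"Cache-Control": "public, max-age=30"}
```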
A good microservice architecture can improve performance through good caching policies. Each API has its own life cycle, and its caching rules should follow it. In many cases it's enough that the component sets proper caching headers, but to separate the rules for the local cache from those for the user's cache, the caching layer must be able to modify those headers.
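Putting it together, each API can declare a policy that matches its life cycle. Using `s-maxage` for the shared (local) cache and `max-age` for the user's cache is one standard way to keep the two sets of rules separate. The values below are examples, not recommendations:

```python
# Per-API caching policies, matching each component's life cycle.
POLICIES = {
    # Menu and Content: kept "forever" in the shared cache and purged
    # explicitly; users' caches revalidate every 30 seconds.
    "menu":     "public, max-age=30, s-maxage=31536000",
    "content":  "public, max-age=30, s-maxage=31536000",
    # Comments (anonymous responses): short-lived everywhere.
    "comments": "public, max-age=30",
}

def cache_headers(api):
    return {"Cache-Control": POLICIES[api]}
```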
P.S. Good caching also lowers infrastructure costs and increases the reliability of the system.