Posts Tagged ‘memcache’

Memcache and database are out of sync

Wednesday, May 20th, 2009

We heavily use caching in our models using cache_fu.  However, we started to notice that a small number of objects in our system were out of sync between MySQL and memcached.
In one case, this led to model validation exceptions: a user appeared unregistered in memcache, but when we went to register the user, the db complained that the user was already registered.

What was the bug?

Let's look at the log file…

Ex) user.save!
BEGIN
Update (0.7ms) UPDATE `users` SET `token` = '665456562', `updated_at` = '2009-05-13 17:22:26' WHERE `id` = 4
MemCache Delete (0.000285)  users:2
COMMIT

Uh oh!
We use InnoDB in MySQL to get transaction support, and Rails wraps everything inside an ActiveRecord save call in a transaction. This includes the after_save callback that calls expire_cache, so the memcache delete in the log above runs before the COMMIT.

This leads to the following scenario with two threads where one is reading and another writing:

t1: BEGIN t1
t1: update user in db
t1: expire user in memcache
t2: get user from memcache
t2: cache miss -> get user from db
t2: set user in memcache
t1: COMMIT

Since t2 reads from the database before t1 has committed, it reads the old value and saves that old value into memcache. The db is updated, but memcache is now out of sync.
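The interleaving above can be replayed deterministically with plain Ruby hashes. This is only a sketch: `committed`, `pending`, and `cache` are made-up stand-ins for the rows other connections can see, t1's uncommitted write, and memcached, and it assumes t2 cannot see uncommitted data.

```ruby
# Stand-ins (assumptions, not real clients): committed rows visible to
# every connection, t1's uncommitted write, and the memcached contents.
committed = { 4 => "old_token" }
pending   = {}
cache     = { 4 => "old_token" }

pending[4] = "new_token"      # t1: update user in db (not yet visible to t2)
cache.delete(4)               # t1: expire user in memcache
value = cache[4]              # t2: get user from memcache (miss, so nil)
value ||= committed[4]        # t2: cache miss -> get user from db (old row)
cache[4] = value              # t2: set user in memcache
committed.merge!(pending)     # t1: COMMIT

puts "db:    #{committed[4]}"
puts "cache: #{cache[4]}"
```

After the replay the database holds the new token while the cache still holds the old one, and nothing will expire it until the next save.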

What’s the solution?
We need the expire_cache call to run after the transaction commits instead of inside it. Then t2's read either hits the still-cached value before the expiry, or misses the cache after the expiry and loads the committed row from the db. The three possible cases are listed below. In each case, memcache and the database end up in sync.

Case 1: t2 is before the COMMIT
t1: BEGIN t1
t1: update user in db
t2: get user from memcache
t1: COMMIT
t1: expire user in memcache

Case 2: t2 is after the COMMIT, but before cache expiry
t1: BEGIN t1
t1: update user in db
t1: COMMIT
t2: get user from memcache
t1: expire user in memcache

Case 3: t2 is after the COMMIT and cache expiry
t1: BEGIN t1
t1: update user in db
t1: COMMIT
t1: expire user in memcache
t2: get user from memcache
t2: cache miss -> get user from db
t2: set user in memcache (new user)

Luckily, there is a great plugin called after_commit that does what we need. It adds callbacks such as “after_commit”, which run after the save but outside the transaction, once the COMMIT has completed.
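To illustrate the idea behind such a callback (not the plugin's actual code), here is a minimal plain-Ruby sketch. `TinyDb` and its methods are hypothetical: writes stay pending until the transaction block finishes, and queued hooks run only after the commit. Rollback handling is left out to keep it short.

```ruby
# Hypothetical stand-in for a database connection with commit hooks.
class TinyDb
  attr_reader :committed

  def initialize(rows)
    @committed = rows
    @pending = {}
    @after_commit = []
  end

  # Record a write that is invisible to readers until COMMIT.
  def update(id, value)
    @pending[id] = value
  end

  # Queue a block to run once the transaction has committed.
  def after_commit(&hook)
    @after_commit << hook
  end

  def transaction
    yield self
    @committed.merge!(@pending)   # COMMIT
    hooks = @after_commit
    @pending = {}
    @after_commit = []
    hooks.each(&:call)            # runs outside the transaction
  end
end

cache = { 4 => "old_token" }
db = TinyDb.new({ 4 => "old_token" })

db.transaction do |tx|
  tx.update(4, "new_token")
  tx.after_commit { cache.delete(4) }   # queued, not executed yet
end

# A reader arriving now misses the cache and reloads the committed row (Case 3).
cache[4] ||= db.committed[4]
puts cache[4]
```

Because the expiry is queued and only fires after the merge, a reader can never repopulate the cache from a row that is about to be overwritten by this transaction.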

It’s all about the cache money

Thursday, November 1st, 2007

Cache is money to MedHelp: it helps us scale by reducing CPU, I/O, round-trip times, etc. This is critical as we continue to grow from millions of users to tens of millions of users and beyond.

There are five popular ways to cache:

  1. page caching
  2. action caching
  3. fragment caching
  4. model caching
  5. in-memory computational cache

Page Cache

Page caching caches static pages so the request does not need to hit your Rails server. I imagine every page on Wikipedia is page cached and invalidated on update. This has huge performance benefits, as we can configure the web server in front of Rails to return the cached HTML page. We could use page caching for our medical dictionary pages. However, almost all our pages have or will soon have dynamic content, so the value of page caching appears low.

However, an interesting use of page caching is to leverage the fact that every page looks the same to all non-logged-in users. Since the majority of our traffic is new users arriving from Google, this will have a large impact. The trick is that our front-end web server has to identify a request as coming from a non-logged-in user by inspecting the cookie.
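The cookie check itself would live in the front-end web server's configuration, but the decision it makes can be sketched in Ruby. The cookie name `auth_token` and both helper methods are assumptions made for illustration; the post does not name the real cookie.

```ruby
# Hypothetical cookie name; the real login cookie is not named in the post.
LOGIN_COOKIE = "auth_token"

# Parse a raw Cookie header into a name => value hash.
def parse_cookies(header)
  header.to_s.split(";")
        .map { |pair| pair.strip.split("=", 2) }
        .reject { |name, _| name.nil? || name.empty? }
        .to_h { |name, value| [name, value.to_s] }
end

# Serve the cached page only when no login cookie is present.
def serve_cached_page?(cookie_header)
  value = parse_cookies(cookie_header)[LOGIN_COOKIE]
  value.nil? || value.empty?
end

puts serve_cached_page?("theme=dark")                   # anonymous visitor
puts serve_cached_page?("theme=dark; auth_token=abc")   # logged-in user
```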

Methods: caches_page and expire_page or sweepers

Action Caching

In action caching, the request hits the Rails server and passes through all the filters. This is useful when you need an authorization filter to differentiate logged-in from logged-out users. I cannot think of a case where we would use this form of caching, since we use dynamic content on almost every page.

Methods: caches_action, expire_action

Fragment Caching

Fragment caching saves rendering of portions of your view. This method is interesting as we could cache the middle div (user journals) in the people page and invalidate using a time to live.
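Invalidating by time to live can be sketched without Rails. `TtlFragmentCache` below is a hypothetical in-process store, not a Rails API; the clock is injectable only so the expiry can be shown without sleeping.

```ruby
# Minimal fragment store with a time to live (hypothetical, not a Rails API).
class TtlFragmentCache
  Entry = Struct.new(:html, :expires_at)

  def initialize(clock: -> { Time.now })
    @entries = {}
    @clock = clock
  end

  # Return the cached fragment, or render it with the block and store it.
  def fetch(key, ttl:)
    now = @clock.call
    entry = @entries[key]
    return entry.html if entry && entry.expires_at > now

    html = yield
    @entries[key] = Entry.new(html, now + ttl)
    html
  end
end

renders = 0
now = Time.at(0)
cache = TtlFragmentCache.new(clock: -> { now })
render = -> { renders += 1; "<div>journals v#{renders}</div>" }

first  = cache.fetch("people/42/journals", ttl: 300, &render)
second = cache.fetch("people/42/journals", ttl: 300, &render)  # served from cache
now += 301                                                     # let the entry expire
third  = cache.fetch("people/42/journals", ttl: 300, &render)  # rendered again
puts first, second, third
```

The trade-off is that readers may see content up to `ttl` seconds old, in exchange for not needing an explicit expiry call on every write.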

Another interesting technique is to use identifiers on the cache fragments so that the controller can skip the code related to a fragment that is already cached (i.e., save db calls and computations). This does poke a hole in the MVC model, as it creates an additional coupling between view and controller. Here is a code example:

View:
<% cache(:controller => :post, :action => :show, :subject_id => @post.subject_id) do %>
  <% # beautifully written medhelp code %>
<% end %>
Controller:
def show
  # Skip the expensive queries when the fragment is already cached.
  unless read_fragment(:controller => :post, :action => :show, :subject_id => @post.subject_id)
    # lots of db calls
  end
  # code you need for other fragments on the page
end
def edit
  ...
  # Invalidate the cached fragment once the post changes.
  expire_fragment(:controller => :post, :action => :show, :subject_id => @post.subject_id)
end

Methods: <% cache do %> in the view, read_fragment and expire_fragment in the controller

Model Caching

Model caching is when we store ActiveRecord objects and db results to save db calls, which are typically the bottleneck of our web application. We currently use this everywhere, and we are very strict about reviewing it in our bottom-up design reviews and code reviews prior to check-in. For example, we cache all users. Soon, we will be caching all posts, subjects, and forums. All hail memcache.
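The read-through pattern behind model caching looks roughly like this. `ModelCache` is a hypothetical class that uses hashes in place of memcached and the users table; the `users:<id>` key format is an assumption.

```ruby
# Hash-backed stand-ins for memcached and the users table (both hypothetical).
class ModelCache
  attr_reader :db_reads

  def initialize(table)
    @table = table
    @store = {}
    @db_reads = 0
  end

  # Return the cached row, or load it from the table and cache it.
  def get(id)
    key = "users:#{id}"
    return @store[key] if @store.key?(key)

    @db_reads += 1
    row = @table[id]
    @store[key] = row unless row.nil?   # do not cache missing rows
    row
  end

  # Drop the cached row so the next get reloads it.
  def expire(id)
    @store.delete("users:#{id}")
  end
end

users = ModelCache.new({ 4 => { name: "alice" } })
users.get(4)            # miss: reads the table
users.get(4)            # hit: no table read
users.expire(4)
users.get(4)            # miss again after expiry
puts users.db_reads
```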

Methods: CACHE get/set, act_as_cachable

In-memory computational cache

This is where we save our computation in a data structure. I almost forgot this since it is second nature. An example of this is caching the key words -> links data structure that is used to link-ify user generated text. See our medical terms highlighting in user journals and forum posts.
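A minimal sketch of the link-ifying example, assuming a tiny made-up term list (the real dictionary and URLs are not shown in the post). The compiled pattern is the cached computation: it is built once per process and held at the class level, here in a class-level instance variable.

```ruby
class Linkifier
  # Hypothetical terms and paths, for illustration only.
  TERMS = {
    "asthma"   => "/dictionary/asthma",
    "migraine" => "/dictionary/migraine"
  }.freeze

  class << self
    attr_reader :builds

    # Build the combined regexp once and reuse it on later calls.
    def pattern
      @pattern ||= begin
        @builds = (@builds || 0) + 1
        Regexp.union(TERMS.keys.map { |term| /\b#{Regexp.escape(term)}\b/i })
      end
    end

    # Wrap every known term in a link, keeping the original casing.
    def linkify(text)
      text.gsub(pattern) do |match|
        %(<a href="#{TERMS[match.downcase]}">#{match}</a>)
      end
    end
  end
end

puts Linkifier.linkify("My Asthma and migraine")
puts Linkifier.linkify("No terms here")
```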

Methods: CACHE get/set, session set/get, class variables