Hints on the nginx fair load balancer

September 11th, 2009

The default load balancing for nginx is simple round robin. If you are proxying to a set of peers, and one request for whatever reason is very slow, nginx will continue to give the busy peer additional requests. The result is that requests that might ordinarily be quickly fulfilled will stack up behind the slow request.

There is an add-on module for nginx called the fair load balancer. There isn’t much documentation available, and hopefully this post will fill some of that need. This is of necessity a summary: the fair load balancer has three weight modes and two scan modes, and can fit a large number of site requirements.

The fair load balancer is initialized in the upstream declaration:


upstream mongrel {
    fair;
    server 192.168.1.77:4000;
    server 192.168.1.77:4001;
    server 192.168.1.77:4002;
}

This declaration will use fair in its default mode. In this mode, fair assigns requests to idle peers first. When all peers are busy, it assigns each request using a score that depends mainly on the number of requests already assigned to each peer and, when those counts are equal, favors the peer that had the earliest assignment.
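
The selection order described above can be sketched in plain Ruby. This is only an illustration of the ordering, not the module’s actual code: the `Peer` struct, its fields, and `choose_peer` are hypothetical names invented for the example.

```ruby
# Hypothetical in-memory model of a peer; not the module's real data structure.
Peer = Struct.new(:name, :active_requests, :last_assigned_at)

# Idle peers win outright. Otherwise prefer the peer with the fewest active
# requests, breaking ties by the earliest previous assignment.
def choose_peer(peers)
  idle = peers.select { |p| p.active_requests.zero? }
  candidates = idle.empty? ? peers : idle
  candidates.min_by { |p| [p.active_requests, p.last_assigned_at] }
end

peers = [
  Peer.new("192.168.1.77:4000", 2, 10),
  Peer.new("192.168.1.77:4001", 1, 30),
  Peer.new("192.168.1.77:4002", 1, 20),
]

# No peer is idle, two are tied on one request, and :4002 was assigned earlier.
puts choose_peer(peers).name
```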

Peers may be assigned weights. A weight is an integer value that is related to the number of requests that can be assigned to each peer, although the interpretation is a little different for each weight mode.

Fair also keeps track of peer failures. A failure is defined as an I/O error connecting to the peer, but not an HTTP error response and, most importantly, not a timeout. If a peer fails once, the load balancer will only consider the peer for a request if all others are busy. Moreover, when a peer reaches “max_fails” failures, the software will not assign it a request for fail_timeout seconds.

Here is an example of a configuration that takes advantage of these features. It uses weight_mode=peak to prevent upstream overload, and it pauses assignment to a peer for 60 seconds after a single backend error.


upstream mongrel {
    fair weight_mode=peak;
    server 192.168.1.77:4000 max_fails=1 fail_timeout=60 weight=6;
    server 192.168.1.78:4000 max_fails=1 fail_timeout=60 weight=6;
    server 192.168.1.79:4000 max_fails=1 fail_timeout=60 weight=6;
}

The weight_mode=peak option limits each peer to at most weight requests at a time, in this case 6. If all peers already have 6 requests, the load balancer returns busy (nginx will display the “site is temporarily unavailable” page). Obviously, this sort of configuration only makes sense if you have a large number of servers and would rather have an error page than an overloaded backend.
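
As a rough model of the peak cap, the Ruby sketch below rejects a request once every peer is at its weight. The names `PeakPeer` and `choose_peak_peer` are hypothetical, and picking the least-loaded peer among those with spare capacity is an assumption made for the example, not a statement about how the module scores peers in this mode.

```ruby
# Hypothetical model of weight_mode=peak: each peer accepts at most `weight`
# requests at a time.
PeakPeer = Struct.new(:name, :weight, :active_requests)

# Returns a peer with spare capacity, or nil when every peer is at its cap.
# nil stands in for the balancer returning "busy".
def choose_peak_peer(peers)
  available = peers.select { |p| p.active_requests < p.weight }
  return nil if available.empty?

  # Assumption for this sketch: prefer the least-loaded available peer.
  available.min_by(&:active_requests)
end

full = [
  PeakPeer.new("192.168.1.77:4000", 6, 6),
  PeakPeer.new("192.168.1.78:4000", 6, 6),
  PeakPeer.new("192.168.1.79:4000", 6, 6),
]
puts choose_peak_peer(full).inspect   # every peer is at its cap
```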

Bug: There is an important bug in the fair load balancer that affects 64 bit systems. ngx_http_upstream_choose_fair_peer_busy has an initialization that is incorrect. The problem line reads:


ngx_uint_t best_sched_score = ~0U;

On a 64 bit system this sets only the low 32 bits of the 64 bit integer to ones; all 64 bits need to be ones. One simple way to fix the problem is to change the line to:


ngx_uint_t best_sched_score = ULONG_MAX;
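
To see why the starting value matters, here is a small Ruby model of a find-the-minimum loop. It assumes the function searches for the lowest score starting from best_sched_score; the module’s real loop is not reproduced here, and `best_index` and the sample scores are made up for the illustration.

```ruby
UINT32_MAX = 0xFFFF_FFFF            # what ~0U becomes once widened to 64 bits
UINT64_MAX = 0xFFFF_FFFF_FFFF_FFFF  # the intended "larger than any score" value

# Return the index of the lowest score strictly below the sentinel, or nil
# when no score beats the sentinel.
def best_index(scores, sentinel)
  best = sentinel
  index = nil
  scores.each_with_index do |score, i|
    if score < best
      best = score
      index = i
    end
  end
  index
end

scores = [0x1_0000_0005, 0x1_0000_0002]  # hypothetical scores above 2**32

puts best_index(scores, UINT32_MAX).inspect  # no score is below the sentinel
puts best_index(scores, UINT64_MAX).inspect  # the second peer is selected
```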

Memcache and database are out of sync

May 20th, 2009

We heavily use caching in our models using cache_fu. However, we started to notice that a small number of objects in our system were out of sync between MySQL and memcached.
In one case, this led to model validation exceptions: a user appeared unregistered in memcache, but when we went to register the user, the db would complain that the user was already registered.

What was the bug?

Let’s look at the log file…

Ex) user.save!
BEGIN
Update (0.7ms) UPDATE `users` SET `token` = '665456562', `updated_at` = '2009-05-13 17:22:26' WHERE `id` = 4
MemCache Delete (0.000285)  users:2
COMMIT

Uh oh!
We use InnoDB in MySQL to get transaction support, and Rails wraps a transaction around everything that happens within an ActiveRecord method call such as save. This includes the after_save callback that calls expire_cache.

This leads to the following scenario with two threads where one is reading and another writing:

t1: BEGIN t1
t1: update user in db
t1: expire user in memcache
t2: get user from memcache
t2: cache miss -> get user from db
t2: set user in memcache
t1: COMMIT

Since t2 reads from the database before t1 has committed, it reads the old value and saves that old value into memcache. The db gets updated, but memcache is left out of sync.
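
The interleaving can be replayed deterministically with plain hashes standing in for MySQL and memcached. This is a toy model: the hash names and token values are made up, and only the ordering of the steps comes from the scenario above.

```ruby
# Committed database state, t1's uncommitted write, and the cache.
db_committed = { 4 => "old_token" }
t1_pending   = {}
cache        = { 4 => "old_token" }

# t1: BEGIN, then update user in db (not yet visible to other connections)
t1_pending[4] = "new_token"
# t1: expire user in memcache (after_save runs inside the transaction)
cache.delete(4)
# t2: get user from memcache -> miss, so read the committed row and cache it
cache[4] ||= db_committed[4]
# t1: COMMIT
db_committed.merge!(t1_pending)

puts "db=#{db_committed[4]} cache=#{cache[4]}"
# prints "db=new_token cache=old_token"
```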

What’s the solution?
We need a way for the expire_cache calls after save to run outside the transaction, after the COMMIT. This way, t2’s read will happen either before the cache expiry, in which case the stale entry is removed shortly afterwards, or after it, in which case the cache miss reads the committed row. The three possible cases are listed below. In each case, memcache and the database end up in sync.

Case 1: t2 is before the COMMIT
t1: BEGIN t1
t1: update user in db
t2: get user from memcache
t1: COMMIT
t1: expire user in memcache

Case 2: t2 is after the COMMIT, but before cache expiry
t1: BEGIN t1
t1: update user in db
t1: COMMIT
t2: get user from memcache
t1: expire user in memcache

Case 3: t2 is after the COMMIT and cache expiry
t1: BEGIN t1
t1: update user in db
t1: COMMIT
t1: expire user in memcache
t2: get user from memcache
t2: cache miss -> get user from db
t2: set user in memcache (new user)

Luckily, there is a great plugin called after_commit that does what we need. This plugin implements additional callbacks like “after_commit” which run after save, but outside the transaction.
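
The effect of moving the expiry after the COMMIT can be modeled with a tiny stand-in transaction class. `TinyTransaction` and its methods are hypothetical and exist only for this sketch; it does not show the after_commit plugin’s real API.

```ruby
# Hypothetical stand-in for a transaction that supports after-commit hooks.
class TinyTransaction
  def initialize(db)
    @db = db
    @pending = {}
    @hooks = []
  end

  # Buffer a write; other readers do not see it until commit.
  def write(key, value)
    @pending[key] = value
  end

  # Queue a block to run once the writes are committed.
  def after_commit(&block)
    @hooks << block
  end

  def commit
    @db.merge!(@pending)
    @hooks.each(&:call)
  end
end

db    = { 4 => "old_token" }
cache = { 4 => "old_token" }

t1 = TinyTransaction.new(db)
t1.write(4, "new_token")
t1.after_commit { cache.delete(4) }  # expiry is queued, not run yet

stale_hit = cache[4]   # t2 reads before COMMIT: still a cache hit (Case 1)
t1.commit              # COMMIT, then expire
cache[4] ||= db[4]     # t2 reads again: miss, reloads the committed row (Case 3)

puts "db=#{db[4]} cache=#{cache[4]}"
```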

Kawaii - an amazing web console

October 3rd, 2008

The Ruby on Rails console is one of the most useful tools a web developer could have. For those not familiar with Ruby on Rails, it is the backend equivalent of Firebug. But every now and then I find myself wishing that I could actually connect to a running instance of Ruby on Rails and debug it.

Today, I decided that I was desperate enough to go look for one (though not desperate enough to build one yet). I googled “ruby on rails web console” and there it was on the first page: KAWAII (which means “cute” in Japanese).

I had to make some changes to get it working with the tree where I need it, since the tree is on Rails 1.2.6. Here is a list of the changes in case you need them:

  • Rename all html.erb files back to the rhtml extension
  • Download the prototype 1.6 javascript file and replace the “javascript_include_tag :defaults” with an explicit link to the new prototype file (Rails 1.2.6 includes prototype 1.5)
  • Copy the contents of config/initializers/kawaii.rb into config/environments/development.rb

Enjoy!

MedHelp seeking superstar developers

June 29th, 2008

We are always looking for great developers to join the team. Here is our job description:

Are you looking for a challenge? Do you want to work on cutting-edge technologies for one of the hottest startups in tech? Are you passionate about technology and ready to make an impact? If so, MedHelp is the place for you!

MedHelp is the world’s largest online health community. We are a social network where we connect people with similar interests, a destination where doctors from the best hospitals answer medical questions, and a place where users can find and use medical applications related to their health.
We are bringing health to web2.0. We have over 5.5 million users and are growing rapidly. We are a top 10 health destination.

We are looking for top-notch engineers. We are located in San Francisco. We offer competitive salary, full benefits, and stock options…and cool co-workers, a pool table, and a fully stocked fridge :)

Some of our recent projects include:
* building an application platform for tracking and charting medical information: http://www.medhelp.org/land/ovulation-calendar
* implementing real-time activity feeds that scale to handle millions of user activities
* developing social networking features that apply to medical information and our health-interested users

Responsibilities:
MedHelp is a fast-paced startup looking for motivated and passionate developers.
We look for candidates who are ready to take on a problem and drive it end to end from concept to completion.

Desired Skills:
• Ruby on Rails or related MVC framework development experience
• Database experience (MySQL)
• AJAX, JS, prototype, CSS
• Experience building scalable and performant applications

Send your resume to jobs@medhelp.org

Home page: http://www.medhelp.org
Typical user: http://www.medhelp.org/user_profiles/show/193137

Ruby Meetup at MedHelp

June 9th, 2008

In February, we hosted the SF Ruby Meetup.

We had a great time hosting this event and meeting other local developers building apps on Ruby. We are big-time Rails proponents, and we gave a presentation covering our top lessons learned in scaling MedHelp.

Ruby on Rails: A look back

May 30th, 2008

It has been a little over a year since we started rewriting MedHelp’s software and had to answer a very simple question: which platform should we use?

After much exploration and deliberation, I decided that Ruby on Rails was the way to go. At that time, the debate on whether RoR was scalable or mature enough was raging (and still is), with a few high-profile stories adding to the drama (a Twitter dev dissing RoR for what seemed to be architecture failures was a classic).

Just like anyone making an investment decision, I followed the various blogs talking about why RoR is such a terrible platform, why it couldn’t scale and how it is obviously a bad choice, starting with Twitter of course and going through the various people for and against.

To my surprise (or not), the issues people faced as they scaled RoR were not specific to RoR. In fact, they were issues I had seen people struggle with for years: bottlenecked (and sometimes not truly stateless) app servers, expensive database queries, single points of failure, centralized databases.

For some reason many people in the debate assumed that there are platforms that scale and others that don’t. And that by picking the right platform you will be able to serve millions of users. Unfortunately, it is never that simple. Scaling is a continuous exercise of understanding the bottlenecks in your system and the limitations of your architecture and finding ways to gracefully get beyond them.

Another argument against Ruby on Rails was that Ruby is a slow language or that it consumes too much memory. But wasn’t this the argument against Java when the world was dominated by C++ fanatics? Wait, wasn’t this also the argument against C when Assembly developers were the coolest kids on the block? What about machine code? You get the picture!

The answer to this argument is twofold. The first part is an economic one. Developers are way more expensive than hardware. This statement has held true for years, and is truer every minute than the minute before. The other part of the answer is that today’s architecture (thanks to the 90’s) puts completely stateless software at the heart of your system, allowing you to scale horizontally. So it is not really that important how fast each machine is (as long as it is not noticeable to the end user); you can always add another piece of hardware and increase your capacity.

So not finding any challenges with RoR that I didn’t expect to face with any other platform, and having been sold on its design philosophy (long live conventions), the elegance of its architecture and the elasticity of the Ruby language, I decided that MedHelp is going to be a Ruby on Rails shop.

Fast forward one year, and you will notice that MedHelp is up and running. We were able to rewrite the entire application in RoR in about four weeks. We transformed the site from a simple forum application into a vibrant community. We added tons of features, some of which are complex Ajax applications such as trackers. We swapped out the site’s interface in favor of better flow and aesthetics. And we did all that while growing our traffic from 2 million unique visitors to 5.5 million.

Our average team size during this year was 3.5 people (we are 6 now). And while all of them are seasoned engineers with a lot of experience in building and scaling server software (people I knew or worked with prior to MedHelp, and am proud to continue working with today), all of them learned Ruby on Rails on the job.

After all this, I am now taking a deep breath and asking myself again: have I made the right choice? The answer for me is clearly yes.

The ride was not an easy one. And we had our share of emergencies, head scratching and nervous moments. But none of the mistakes made or the bugs found were caused by Ruby on Rails except in the sense that the platform’s flexibility made it easy to make some mistakes. But the mistakes were ours. When made, they often showed a misunderstanding of how a certain feature worked, a flaw in our database schema or how our components are distributed across our servers.

Now that we’ve gone through those pains to grow the site, I think I am ready to share many of the things that we learned or had to re-learn as we grew MedHelp. Each week or two I will share one of the big pitfalls that we managed to fall into, and what lessons we learned as we climbed out of it and started marching for the next pitfall.

ActiveRecord and includes maxing my ethernet

February 5th, 2008

We recently ran into an issue where using multiple includes produced a huge join on the backend that returned thousands of rows, which took all the bandwidth between servers.

Now, as we start to aggressively cache computed data, we may run into a similar problem. For example, we cache some HTML pages in the db that change rarely. These pages can be huge. We would not want to return tens of MBs of data per view of a list of pages (i.e., a table of contents or index).

The solution is that ActiveRecord has a :select option, which does what SQL SELECT does: it restricts the columns returned. We should consider using it when the amount of data returned is very large.
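
In Rails of that era, the call would look something like `Page.find(:all, :select => "id, title")`, where `Page` is a hypothetical model. The plain-Ruby sketch below does not talk to a database; it just uses made-up rows to show how much smaller the payload gets when the large column is left out of a listing.

```ruby
# Hypothetical rows standing in for a table of cached HTML pages.
pages = Array.new(50) { |i| { id: i, title: "Page #{i}", body: "x" * 200_000 } }

# Total bytes that would cross the wire if only `columns` were selected.
def payload_bytes(rows, columns)
  rows.sum { |row| columns.sum { |col| row[col].to_s.bytesize } }
end

full  = payload_bytes(pages, [:id, :title, :body])
index = payload_bytes(pages, [:id, :title])
puts "all columns: #{full} bytes, id and title only: #{index} bytes"
```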

Caching Lessons Learned

January 22nd, 2008

We have a set of bugs with caching:

Versioning:

  • We must version whenever we cache so that when we upgrade, the app uses the updated revision of the cached object
  • In acts_as_cached, there is a version. However, in page caching and fragment caching there is no version number.
  • In page caching, the cached page is saved on disk. Our deployment method overwrites the directory, which refreshes this cache.
  • For caches without a version number, we should use the same workaround we use in CSS, icon, and JS includes, where we define the key as name?<version_number>
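
A versioned key along the lines of name?<version_number> could be built like this. `CACHE_VERSION` and `versioned_key` are hypothetical names for the sketch; how the version is chosen and bumped is left as an assumption.

```ruby
# Hypothetical version constant; bump it on any deploy that changes what is cached.
CACHE_VERSION = 3

# Build a cache key of the form name?<version_number>.
def versioned_key(name, version = CACHE_VERSION)
  "#{name}?#{version}"
end

puts versioned_key("users/4/sidebar")
# After a deploy that bumps the version, old entries are simply never read again.
puts versioned_key("users/4/sidebar", CACHE_VERSION + 1)
```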

Includes:

  • Early on, we used :include in acts_as_cached to minimize the number of database calls. However, by over-including you can accidentally max out your network. The include is implemented as one big join, so if you have m includes that each match about n rows, and one column holds a large amount of data, you will transfer on the order of n^m rows of that data.
  • We have seen this where we had a column that returns results in the order of 10K, but instead of transferring 60 rows of 10K (600K), we were transferring 10^6 * 10K rows (100MB). Now, that’s a huge difference!
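
The row counts can be checked with a little arithmetic. The breakdown below (6 includes matching 10 rows each) is an assumption chosen because it reproduces the 60 and 10^6 figures above; the real associations are not described in this post.

```ruby
# Assumed breakdown: 6 included associations, each matching 10 rows.
rows_per_include = [10, 10, 10, 10, 10, 10]

# Fetching each association separately returns the sum of the row counts.
separate_queries = rows_per_include.sum

# One big join returns every combination, i.e. the product of the row counts.
single_join = rows_per_include.reduce(1, :*)

puts "separate queries: #{separate_queries} rows"
puts "single join:      #{single_join} rows"
```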

Expiry:

  • We use fragment caching to save rendering time in our views. However, the base implementation from Rails does not support expiry.
  • Thus, we need to either expire fragments explicitly or use another technique, such as TTL-based expiry or sweepers
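
TTL-based expiry can be sketched with a small in-memory cache. `TtlCache` is a hypothetical class written for this illustration, not part of Rails or any plugin; a real setup would more likely rely on the expiry support of the cache store itself.

```ruby
# Hypothetical in-memory cache whose entries expire after a time-to-live.
class TtlCache
  # `clock` is injectable so expiry can be exercised without sleeping.
  def initialize(clock = -> { Time.now.to_f })
    @clock = clock
    @store = {}
  end

  # Return the cached value if it is still fresh; otherwise run the block,
  # store its result for `ttl_seconds`, and return it.
  def fetch(key, ttl_seconds)
    now = @clock.call
    entry = @store[key]
    return entry[:value] if entry && entry[:expires_at] > now

    value = yield
    @store[key] = { value: value, expires_at: now + ttl_seconds }
    value
  end
end

fragments = TtlCache.new
html = fragments.fetch("sidebar", 300) { "<ul>...</ul>" }  # rendered once, reused for 5 minutes
puts html
```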

Server down 12/28

December 28th, 2007

Santa Claus’s gift to us this Christmas was four hours of downtime. The issue was caused by deleting all stale notifications in the queued_notifications table. The table had a little over 3 million records, all of which were stale and needed to be deleted (they were occupying more than 25% of the sql db footprint, or about 0.5GB).

Following is the sequence of events for the record:

  • At around 4am I kicked off a delete on all records in that table and went to bed.
  • The deletion finished about 3 hours later (the mysql client process exited successfully after finishing the delete).
  • At around 8:48am John managed to get hold of me to tell me that the server was down. At that stage here’s what was happening:
    • mysql was taking an unusual amount of CPU (40-50%)
    • simple queries were taking many seconds to finish, sometimes tens of seconds
  • I put up the maintenance screen to bring the database back to an idle state, but the database still used 10% of CPU on average, and 70-80% of CPU time was in the iowait state. This is highly unusual, especially since the database was not being used at this point.
  • I also noticed something. While querying for count(*) on queued_notifications returned about 39k records, show table status showed the original number of records from before the delete was started (above 3 million records). This led me to believe that the database was still re-arranging data after the large delete (not sure what that means yet, but I will be investigating further later).
  • I dropped the queued_notifications table and recreated it with its index, and the database started behaving.
  • I brought all the app servers back online and all seemed to work fine.

What should we learn from this:

  • We should avoid creating or deleting large amounts of data in a single operation
  • We should ensure that tables holding transient data get cleaned periodically (currently notifications and feeds)
  • A slave DB would have saved us a lot of downtime in a case like this.
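
One possible way to avoid a single huge delete is to remove rows in small batches. The sketch below is an assumption about how that could look, not something we ran at the time: `purge_in_batches` is a hypothetical helper, and `delete_batch` is a stand-in for whatever removes up to `limit` stale rows and reports how many it removed (the role a `DELETE ... LIMIT n` statement would play in MySQL).

```ruby
# Repeatedly call `delete_batch` until it removes fewer rows than the limit.
# `pause` is an optional number of seconds to sleep between batches.
def purge_in_batches(delete_batch, limit, pause: 0)
  total = 0
  loop do
    removed = delete_batch.call(limit)
    total += removed
    break if removed < limit
    sleep(pause) if pause > 0
  end
  total
end

# In-memory stand-in for the table of stale rows.
table = Array.new(2500) { |i| i }
delete_batch = ->(limit) { table.shift(limit).size }

removed = purge_in_batches(delete_batch, 1000)
puts "removed #{removed} rows, #{table.size} left"
```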

Sigh!

IE6 border bug with vertical scroll

December 12th, 2007

In IE6, we have seen two unusual behaviors solved by position relative.

One is known as the peekaboo bug, where elements to the right of a float div disappear and reappear. You will see this technique used in communities and mymedhelp layout files.

The new one is the border flash bug, where the page renders correctly, but the border disappears when you use the vertical scroll bar to move up and down. This was found in the community members page.

We do not have this bug on tags, which uses the exact same partial, so I hypothesize that the vertical scroll bar makes IE6 hit this error path.

#community_members {

overflow: hidden;
+ position: relative;

}