Archive for December, 2007

Server down 12/28

Friday, December 28th, 2007

Santa Clause’s gift to us this Christmas was four hours of down time. The issue was caused by a deletion of all stale notifications in the queued_notifications table. The table had a little bit over 3 million records, all of which were stale and needed to be deleted (they were occupying more than 25% of the sql db footprint or about 0.5GB).

Following is the sequence of events for the record:

  • At around 4am I kicked off a delete on all records in that table and went to bed.
  • The deletion finished about 3 hours later (the mysql client process exited successfully after finishing the delete)
  • at around 8:48am John managed to get a hold of me to tell me that the server was down.At that stage here’s what was happening:
    • mysql was taking an unusual amount of CPU (40-50%)
    • simple queries were taking many seconds to finish sometimes tens of seconds
  • I put up the maintenance screen to bring the database back to idle state, but the database still used 10% of CPU on average and show 70-80% of CPU was in iowait state. This is highly unusual especially that the database was not being used at this point.
  • I also noticed something. While querying for count(*) on queued_notifications resulted about 39k records, show table status showed the original number of records before the delete was started (above 3 million records). This led me to believe that the database was still re-arranging data based on the large delete (not sure what that means yet, but will be investigating further later)
  • I dropped the queued_notifications table and recreated it with its index and the database started behaving.
  • Brought all the app servers back online and all seemed to work fine.

What should we learn from this:

  • We should avoid large data creation/deletion
  • We should insure tables that hold transient data get cleaned periodically (currently notifications and feeds)
  • A slave DB would have avoided us a lot of downtime in a case like this.

Sigh!

IE6 border bug with vertical scroll

Wednesday, December 12th, 2007

In IE6, we have seen two unusual behaviors solved by position relative.

One is known as the peekaboo bug, where elements to the right of a float div disappear and reappear. You will see this technique used in communities and mymedhelp layout files.

The new one is the border flash bug, where the page renders correctly, but the border disappears when you use the vertical scroll bar to move up and down. This was found in the community members page.

We do not have this bug on tags which uses the exact same partial…so I hypothesize that the vertical scroll bar makes IE6 hit this error path

#community_members {

overflow: hidden;
+ position: relative;

}

Mongrel Misleading Me to Mistakes

Wednesday, December 12th, 2007

I got the error below recently. Good luck debugging the error message below.

The bug was a bad require path. This can also happen with missing gems in your deployed application.

=> Booting Mongrel (use ’script/server webrick’ to force WEBrick)
=> Rails application starting on http://0.0.0.0:3000
=> Call with -d to detach
=> Ctrl-C to shutdown server
** Starting Mongrel listening at 0.0.0.0:3000
** Starting Rails with development environment…
Exiting
c:/medhelp/InstantRails/ruby/lib/ruby/gems/1.8/gems/rails-2.0.1/lib/commands/ser
vers/mongrel.rb:16: warning: already initialized constant OPTIONS
c:/medhelp/InstantRails/ruby/lib/ruby/gems/1.8/gems/rails-2.0.1/lib/commands/ser
vers/mongrel.rb:19: undefined method `options’ for []:Array (NoMethodError)
from c:/medhelp/InstantRails/ruby/lib/ruby/site_ruby/1.8/rubygems/custom
_require.rb:32:in `gem_original_require’
from c:/medhelp/InstantRails/ruby/lib/ruby/site_ruby/1.8/rubygems/custom
_require.rb:32:in `require’
from c:/medhelp/InstantRails/ruby/lib/ruby/gems/1.8/gems/activesupport-2
.0.1/lib/active_support/dependencies.rb:496:in `require’
from c:/medhelp/InstantRails/ruby/lib/ruby/gems/1.8/gems/activesupport-2
.0.1/lib/active_support/dependencies.rb:342:in `new_constants_in’
from c:/medhelp/InstantRails/ruby/lib/ruby/gems/1.8/gems/activesupport-2
.0.1/lib/active_support/dependencies.rb:496:in `require’
from c:/medhelp/InstantRails/ruby/lib/ruby/gems/1.8/gems/rails-2.0.1/lib
/commands/server.rb:39
from c:/medhelp/InstantRails/ruby/lib/ruby/site_ruby/1.8/rubygems/custom
_require.rb:27:in `gem_original_require’
from c:/medhelp/InstantRails/ruby/lib/ruby/site_ruby/1.8/rubygems/custom
_require.rb:27:in `require’
from c:/medhelp/InstantRails/ruby/lib/ruby/gems/1.8/gems/activesupport-2
.0.1/lib/active_support/dependencies.rb:496:in `require’
from c:/medhelp/InstantRails/ruby/lib/ruby/gems/1.8/gems/activesupport-2
.0.1/lib/active_support/dependencies.rb:342:in `new_constants_in’
from c:/medhelp/InstantRails/ruby/lib/ruby/gems/1.8/gems/activesupport-2
.0.1/lib/active_support/dependencies.rb:496:in `require’
from ./script/server:3