recent

Titulo

Why is my query running like a champ the second time?

Doesn't matter how long you are in database business, the slow running query will always get you. I have been doing SQL query tuning for last few years and this one long running job got me today so bad that I have to blog about it.

Let me give you a background of my problem in SQL Job. We have a daily job scheduled to run at 5:30 am. The job normally takes 5-6 min to complete. For past few weeks, it is taking 45 minutes longer. This is not an  acceptable S.L.A to our client. Also, there are few other jobs depending on this and the customer wasn't happy with this sluggish job. They wanted me to fix it now. Yes, I mean NOW!

Last month, I have blogged on how to optimize a query and make it run damn fast. I followed all the steps I listed there. I made sure there were no drastic change in data size, examined the execution plan, no full table scan, proper indexes use. Did index rebuild and updated the table statistics. Oh, yeah, the query inside the job hasn't changed in years. What else can you think of?  The result stayed the same, F-I-F-T-Y freaking minutes.

What is more interesting? When I run the same query in Development db box, it runs under 5 min. And YES, the table structure and data size is same on both Prod and Dev environments. Our Production has better hardware, memory and processing power. What was I doing wrong here? I set up job to run in development box and scheduled it to run the same time as production thinking that something is going on during that time in our network, no luck.  The Dev Job would still run at 5 min and the prod at 50 min. Really?

I am so freaking tired and frustrated. Followed all the directions and instructions that was available online. I was all over the forum looking for a solution. When I ran the job twice, the second one took 5 min in production. I have no idea what made me run it twice back to back. Do you know what is going on here? Why it is taking way less time during second time? Why didn't I run it twice earlier?

I had to scheduled a downtime to re-run the job during day time and you can only  request 1 hour of downtime each day. The job ran faster the second time because it is re-using the cached plan. Everyone knows the second time the query uses the cached plan.  When it comes to tuning, you completely forget all about cache rule.

The execution time for the job is always 50 min, but the cache was making it run faster. With  new application going live with our prod db box, we might have max the memory which might have caused the cache to refresh before we run it the job next day.  Requesting more memory definitely wont hurt so I have requested additional 16 GB which we were going to do it anyway. How are you going to prove your manger or upper management that you need more memory?

OK, let's do some fun exercise to prove that caching was an issue here. Remember, the  job took 5 min the second time you ran it. Again, it took 5 min again on your third re-run. Now, we will  try some different approach this time.

Run your query using CHECKPOING and DBCC DROPCLEANBUFFERS Three times and record the execution time.

First Run:
USE [Your DB NAME]
GO
checkpoint;
DECLARE @RC int
-- TODO: Set parameter values here.
EXECUTE @RC = [dbo].[daily_usage_report]
DBCC DROPCLEANBUFFERS;
GO
Execution Time: 50 min.

Second Run:
USE [Your DB NAME]
GO
checkpoint;
DECLARE @RC int
-- TODO: Set parameter values here.
EXECUTE @RC = [dbo].[daily_usage_report]
DBCC DROPCLEANBUFFERS;
GO
Execution Time: 5 min.

Third Run:
USE [Your DB NAME]
GO
checkpoint;
DECLARE @RC int
-- TODO: Set parameter values here.
EXECUTE @RC = [dbo].[daily_usage_report]
DBCC DROPCLEANBUFFERS;
GO Execution Time: 50 min

Conclusion:The real execution time is always 50 min, not 5 min. It is taking 5 min the next time because the query inside a job is using a cached data or reusing the cached plan. If you are working on tuning a query or database. You should clear your cache and test your query or else the result or stats you gather after first execution is based on the cache and may not be a good stats to use.

Warning: DBCC DROPCLEANBUFFERS is something I won't run in production database, this will clear the cache immediately and the applications will see a huge performance problem. You tell me why you see a slowness on application after running this?

Interested in working with me? I can be reached at pbaniya04[at]gmail.com for any questions, consulting opportunities or you may drop a line to say HELLO. Thank your again for visiting my blog and looking forward to serving you more.

Have a Database-ious Day!

4 comments

  1. Hey Prabin,

    I liked your blog. good stuff :)

    Even i had the similar issues once, I resolved it by creating non clustered indexers and same cache concepts.

    Keep posted such good stuffs :)

    great job :)

    Thanks,
    Praveen Nelge

    ReplyDelete
  2. Hi Prabin,

    I got a doubt here. Why this query taking 50 mins while 3rd run? Shouldn't it use cached plan again? AM I missing something here...

    Nice blog BTW.

    Thanks,
    Varun Mehta

    ReplyDelete
  3. Thank you Varun for pointing out the mistake. I will fix it ASAP and I am sorry for the late response.

    ReplyDelete

Powered by Blogger.