Friday, March 23, 2012

Rebuilding a huge clustered index

I have a client that has a 800GB table. The current clustered index on this table is of low selectivity and is causing index scans on queries. I wish to drop the current index and create a new one that is of more use. What i really want to know is, what is the fastest method of rebuilding such a huge index? What would be the storage requirements to process this?

That is indeed a very big table. Perhaps, you want to use index defrag (dbcc indexdefrag) instead of a full rebuild.

A good article on index defrag (sql2k) is here:

http://www.microsoft.com/technet/prodtechnol/sql/2000/maintain/ss2kidbp.mspx

If you're on sql2k5, take a look at this:

http://www.microsoft.com/technet/prodtechnol/sql/2005/onlineindex.mspx

|||

Defragmenting the current index won't help, since you want a new index on different column(s).

If you are running SQL Server 2005 Enterprise Edition, you might try creating a new, non-clustered index with the ONLINE option and with MAXDOP set to something like 25% of your number of available CPU's (so it does not peg all the CPU's while building the index). This would be a lot less disruptive than dropping the clustered index and building a new one. This is assuming you need to stay online while you are doing this.

Without knowing the table schema, its hard to estimate the storage requirements for this.

|||The fastest way to rebuild a new index or create a new clustered index is to build/rebuild it in OFFLINE mode. If you rebuildin OFFLINE mode, we do minimally logging in the engine (we only log complete pages instead of individual inserts).

The space requirements for ONLINE / OFFLINE are simular: you need space for the original index, you need space for the new index (simular size to the original index), and if you build the clustered index on new columns, you need space in tempdb as SQL Server will be doing a SORT before creating the new indexes. This totals on about 2.2 times the size of the original index (but could be larger, the 2.5 is just a rough guess)

So if perf is your only concern , use OFFLINE. The drawback is the table is offline, so other clients cannot access the table while the rebuild is in progress.

Also, the MAXDOP option will work for OFFLINE rebuild as well, so you can specify the number of CPU's you want to use during the rebuild.

Thanks,

No comments:

Post a Comment