Counter of Festivals

Ashok Blog for SQL Learners and Beginners and Experts

Tuesday 28 May 2013

Weekly Rebuilding Indexes


 Weekly Rebuilding Indexes 

Source: https://www.simple-talk.com/content/article.aspx?article=1119
  • What Rebuild Index does and the problems that can arise if it is not used
  • Considerations when using the task, and possible alternatives
  • How to configure and schedule the task using the Wizard

Rebuild Index Task:

As indexes are subjected to data modifications, index fragmentation can occur in the form of:
  • Gaps in data pages – this creates wasted empty space.
  • Logical fragmentation – this is a logical ordering of the data that no longer matches the physical ordering of the data
Gaps in data pages can reduce the number of rows that can be stored in SQL Server's data cache, leading to increased disk I/O. Logical fragmentation can cause extra disk activity as the disk subsystem has to work harder to find the data on disk and move it to the data cache. The only way to remove wasted space and logical fragmentation is to rebuild or reorganize the indexes on a regular basis. This is one of the most useful and powerful maintenance tasks that you can perform on a database, because the steps it performs can greatly boost database performance.
If you configure the Rebuild Index task using all the default settings, then when the task runs, it physically drops and rebuilds every index in your selected databases, removing both wasted empty space and logical fragmentation. As a byproduct of rebuilding all the indexes, index and column statistics are also recreated anew and fully updated.
The T-SQL command that is generated from these default settings is as follows:
ALTER INDEX index_name ON table_name REBUILD PARTITION = ALL WITH ( PAD_INDEX  = OFF, STATISTICS_NORECOMPUTE  = OFF, ALLOW_ROW_LOCKS  = ON, ALLOW_PAGE_LOCKS  = ON, ONLINE = OFF, SORT_IN_TEMPDB = OFF )
While this command looks complicated, the bulk of the code is simply turning off various options. The ALTER INDEX command has a lot of options, some of which you can configure using the Maintenance Plan Wizard, but many more that you cannot. We will discuss all of the available configuration options as we work through this article.
If you review the text file report from this task, it looks something similar to the following:
Retrieving Text file reports
Unless you specified otherwise in the "Select Report Options" screen of the Wizard, text reports created by the Wizard are, by default, located in this folder: C:\Program Files\Microsoft SQL Server\MSSQL10.MSSQLSERVER\MSSQL\JOBS
Microsoft(R) Server Maintenance Utility (Unicode) Version
Report was generated on "HAWAII".
Maintenance Plan: User Databases Maintenance Plan
Duration: 00:00:23
Status: Succeeded.
Details:
Rebuild Index (HAWAII)
Rebuild index on Local server connection
Databases: AdventureWorks
Object: Tables and views
Original amount of free space
Task start: 2009-07-29T16:01:48.
Task end: 2009-07-29T16:02:09.
Success
Command:USE [AdventureWorks]
GO
ALTER INDEX [PK_AWBuildVersion_SystemInformationID] ON [dbo].[AWBuildVersion] REBUILD PARTITION = ALL WITH ( PAD_INDEX  = OFF, STATISTICS_NORECOMPUTE  = OFF, ALLOW_ROW_LOCKS  = ON, ALLOW_PAGE_LOCKS  = ON, ONLINE = OFF, SORT_IN_TEMPDB = OFF )
GO
USE [AdventureWorks]
GO
ALTER INDEX [PK_DatabaseLog_DatabaseLogID] ON [dbo].[DatabaseLog] REBUILD PARTITION = ALL WITH ( PAD_INDEX  = OFF, STATISTICS_NORECOMPUTE  = OFF, ALLOW_ROW_LOCKS  = ON, ALLOW_PAGE_LOCKS  = ON, ONLINE = OFF, SORT_IN_TEMPDB = OFF )
GO
Once the header information is displayed, note that the command executes for each index in the database. I have abbreviated the report to only show two indexes being rebuilt, as there were a lot more that were actually rebuilt in the full report.

When and how often should you Rebuild Indexes?

The performance of your indexes, and therefore your database queries, will degrade as you indexes become fragmented. The Rebuild Index task does a very good job of rebuilding indexes to remove logical fragmentation and empty space, and updating statistics. As such, it is very important that you schedule this task to run regularly.
On the other hand, the Rebuild Index task is a resource intensive task. In addition, as an index is being rebuilt, locks will be placed on the index, preventing anyone from accessing it while the rebuilding occurs. Any queries trying to access this index in order to return the required results will be temporarily blocked, until the rebuild is complete. As such, the Rebuild Index task is considered an offline activity, to be run when as few people as possible are accessing a database. In general, this means during a scheduled maintenance window.
It is quite difficult to offer general advice with regard to when and how often to rebuild indexes, using the Maintenance Plan Wizard, as it is so dependent on the nature of the data, the indexes and the queries that use them. However, take a look at my general advice with regard to index rebuilding, and then we'll consider the advice in a little more detail over the coming sections:
  • Nightly, if required. If your indexes fragment rapidly, and you have a nightly maintenance window that allows you to run the Rebuild Index task, along with all the other maintenance tasks; then do so. Index fragmentation will degrade the performance of your indexes. Assuming that you have a maintenance window, rebuilding every night can't do any harm, and can very well boost the performance of your server.
  • Weekly, at minimum. If you can't perform this task nightly, then, at a minimum, it should be run once a week, during a maintenance window. If you wait much longer than a week, you risk hurting your SQL Server's performance due to the negative impact of wasted empty space and logical fragmentation.
  • Consider alternatives, otherwise. If you don't have a maintenance window long enough to run this task at least once a week, then you need to consider the following alternatives:
  • Use the online version of the Rebuild Index task – available only with the Enterprise Edition of SQL Server.
  • Use the Reorganize Index task followed by the Update Statistics task – if you're using the Standard Edition of SQL Server. This is your only real alternative when using the Maintenance Plan Wizard if want to avoid the Rebuide Index task.
  • Avoid the Maintenance Plan Wizard – T-SQL or PowerShell scripts offer greater control and flexibility over the exact nature and duration of this task.

Tracking Index Fragmentation

The question of exactly how often to rebuild indexes is a difficult one to answer, and the Maintenance Plan Wizard doesn't offer any guidance. The speed and degree to which an index fragments depends on how it is used and will vary wildly from database-to-database.
It is beyond the scope of this article to enter a full discussion of measuring index fragmentation, and therefore deciding how often you should rebuild your database's indexes. However, it is worth noting that the sys.dm_db_index_physical_stats Dynamic Management Function contains two columns that store valuable information regarding index fragmentation:
  • avg_page_space_used_in_percent – this column stores the average amount of space that is used on a page. For example, a particular index might have 50% space used, which means that only half of the space on a data page, on average, is used to store rows of data.
  • avg_fragmentation_in_percent – this column stores the degree of logical fragmentation of an index, as a percentage. For example, a particular index might be 80% fragmented, which means that, on average, 80% of the data pages physical ordering does not match their logical ordering.
If you were to track this data over a period of time, you will be able to better gauge how quickly your indexes fragment, and so how often you should consider rebuilding them. However, if you are at this level, then the chances are high that you'll be using scripting techniques to rebuild your indexes rather than the Maintenance Plan Wizard.

Offline Index Maintenance

While it is not a requirement to perform the Rebuild Index task offline, while the database is not being accessed, it is certainly a strong recommendation, especially for large databases with many users. If your tables are relatively small, rebuilding will be fast and most users who happen to be accessing the database at the same time probably won't notice any performance degradation as a result of the locking required by Rebuild Index task. On the other hand, if your tables are big, or if you have lots of concurrent users, the Rebuild Index task can negatively affects your user's experience, greatly slowing down their access to the database, and potentially causing some queries to time out from their client application.
Generally speaking, if you have a large enough maintenance window to accommodate running your Rebuild Index task offline, then I recommend you use this task, and run it during that window.

Online Index Maintenance

If you don't have a maintenance window, or it is not long enough to accommodate an offline Rebuild Index task, then you have one or two possible alternatives available to you, when using the Maintenance Plan Wizard:
  • Use the online version of the Rebuild Index task
  • Use the Reorganize Index task followed by the Update Statistics task
If you have the Enterprise Edition of SQL Server, the Maintenance Plan Wizard offers a Keep index online while reindexing option, which means that the index will continue to be available to users even while it is being rebuilt. Even though this is an online activity, you will still want to schedule this task during a time of the day when the server is less busy as it is still a resource intensive activity. Performing this online task during busy times of the day can affect your users' ability to access the database in a timely manner, especially if your SQL Server already has performance bottlenecks.
If you don't have Enterprise Edition, and your maintenance widow is too short to accommodate an offline Rebuild Index task, then you should consider using the Reorganize Index task instead, and then running the Update Statistics task immediately thereafter. The Reorganize Index task is an online operation, which means that it can run while users are accessing the database. While this is an online process, it is still resource intensive, and you should schedule the task during a time of the day when the server is less busy.
The downside to using the Reorganize Index task is that its index defragmentation capability is not as thorough and complete as the Rebuild Index task. In addition, it can take longer to run than the Rebuild Index task, and you have to run the Update Statistics task as a separate step.
Reorganizing Indexes
This topic, as well as subsequent updating of statistics, is also covered in detail in my book, Brad's Sure Guide to SQL Server maintenance Plans, available as a free download from here.

Scripting Index Rebuilds

If you have the Enterprise Edition of SQL Server, chances are that your databases may be very large, and using the Maintenance Plan Wizard to maintain your databases may not be a great choice in the first place. You can obtain more flexibility and control by creating your own custom maintenance plans using T-SQL or PowerShell scripts.
For example, you can measure and track fragmentation using sys.dm_db_index_physical_stats and then build a script to defragment only those indexes that really need it.

Configuring the Rebuild Index Task

Now that we know a little about the Rebuild Index task, and when it should be run, let's take a look at its configuration screen from the Maintenance Plan Wizard, shown in Figure 1.1.
Figure 1.1: We are ready to configure the "Rebuild Index" task.
Our first choice is to select which databases we want to run this task against..

Database Selection

First, notice the Databases drop-down box appears on the screen. Second, notice that directly below the Databases drop-down box are two more drop-down boxes: Object and Selection. These two drop-down boxes appear for some tasks, and not others. We will talk about what they do in a moment.

Selecting Several Databases

As a general rule, you want to keep the number of separate Maintenance Plans to a minimum, so ideally you'd create a single Plan and apply the Rebuild Index task to all indexes in a given set of databases, for example in all user databases. Also, in order to ease maintenance and avoid confusion, each task in the Plan should be applied to the same set of databases.
However, there may be special cases where you'd need to create separate Plans to deal with the specific index maintenance requirements of different databases. For example, let's assume that on a single SQL Server instance you have 25 small databases, each less than 1 GB in size, and one large database, say 50 GB. Let's also assume that few if any users will need access to the small databases during your maintenance windows, but that many users may need to access the 50 GB database during this time. In this case, you might consider creating a special Maintenance Plan for the 50 GB database that uses the Reorganize Index and Update Statistics tasks, and another Maintenance Plan that applies the Rebuild Index task to the smaller databases.
For this example, we are going to keep things simple, so let's assume we want to perform the Rebuild Index task for all user databases. In this case, we can choose the option shown in Figure 1.2, and then click OK.
Figure 1.2: To keep things simple, select the "All user databases" option.
The Define Rebuild Index Task screen reappears, and the two drop-down boxes I referred to earlier are displayed below the Databases drop-down box, but they are grayed out, as shown in Figure 1.3.
Figure 1.3: The "Object" and "Selection" drop down boxes are not available.
So what's going on? Why are these two options grayed out? The reason is that these two options are only available if you select one database on which to run the Rebuild Index task. Since we selected All user databases, these two options are not available.

Selecting a Specific Database

Although it does not apply to our example, let's take a look at what happens if we select only a single database, such as AdventureWorks, for the task. To do this, select These databases from the screen shown in Figure 1.2 and then check the checkbox for AdventureWorks. When you click OK, this section of the Define Rebuild Index Task screen will look as shown in Figure 1.4.
Figure 1.4: When a single database is selected, then the "Object" drop-down box becomes available.
Notice that Specific databases now appears in the Databases drop-down box, the Object box is now available, and the Selection box is still, for the time being, grayed out.
What the Object and Selection options allow us to do is to selectively rebuild some of the indexes in your database, and not others. If we click on the Object drop-down box, we'll see the choices shown in Figure 1.5.
Figure 1.5: You must select either "Table" or "View".
Notice that there are three choices for Object. If you leave the default option selected, Tables and views, then the Rebuild Index task will be applied to the indexes associated with all tables, and all indexed views, in the selected database. In other words, you haven't changed anything. In order to narrow the scope of the task to specific objects, you need to choose either Table or View. Having done this, the Selection drop-down box becomes available. For example, let's choose Table, and then click on "Select one or more" in the now available Selection drop-down box, as shown in Figure 1.6.
Figure 1.6: You can select which tables you want to rebuild with the Rebuild Index task.
Now, we get the option of selecting specific tables within the AdventureWorks database, to which this task should apply. For example, we could choose to rebuild only the indexes associated with the dbo.ErrorLog table, or we could select some combination of tables, by checking each of the relevant checkboxes.
Why would we want to rebuild the indexes for some tables and not others? Actually, there is a very good reason for this. In most databases, there are some tables that are virtually static; they rarely if ever change, and so there is no benefit in rebuilding their associated indexes as they don't, over time, develop wasted empty space or become logically fragmented. By selecting only those indexes that really need defragmenting, you can reduce the time it takes to perform the Rebuild Index task, and at the same time, reduce the resource overhead associated with this task.
The problem I see is that most people who are using the Maintenance Wizard won't have the knowledge to determine which indexes are relatively static and which are subject to a lot of wasted space and logical fragmentation. If you are at the level where you know how to evaluate each index using the sys.dm_db_index_physical_stats DMF, in order to apply a selective rebuild process, then the chances are you are probably better off implementing this process using T-SQL or PowerShell scripts, and avoiding use of the Maintenance Plan Wizard in the first place.
Before we move on, let's briefly consider the View option that is available in the Object drop-down box, as shown in Figure 1.7.
Figure 1.7: You can select which indexed views you want to rebuild with the Rebuild Index task.
In this case, "view" doesn't refer to conventional views, but to indexed views. Indexed views are physical views, unlike regular views, which are only materialized when they are called by a query. Because indexed views are physical, they need rebuilding just like regular indexes. As per my advice with regard to the Table option, if you need this kind of granularity for the maintenance of your indexes, you shouldn't be using the Maintenance Plan Wizard for this task.
While I have taken a little time to explain what the Object and Selection drop-down boxes do, I am recommending that you don't use then, as they just make Maintenance Plans overly complicated, defeating the benefit of using them in the first place.

Free space options

We still have several more choices to make before we are done configuring this task. Note that the discussion of these options assumes that each of your tables has a clustered index, and is not a heap. A heap is a table without a clustered index. As a best practice, all tables should have a clustered index.
The first two choices are listed under Free space options and include "Reorganize pages with the default amount of free space" and "Change free space per page percentage to", as shown in Figure 1.8. You can choose one option or the other, but not both.
Figure 1.8: These options can have a significant impact on the Rebuild Index task.
The default option of "Reorganize pages with the default amount of free space" is a little confusing. First, it says reorganize, not rebuild. Remember, we are working on the Rebuild Index task, not the Reorganize Index task. Don't let this confuse you into thinking that selecting this option reorganizes indexes, rather than rebuild them. It does the latter, and this is actually a mistake in the user interface. It really should say "rebuild," not "reorganize".
The second part of this first option says "default amount of free space". What does that mean? When creating a SQL Server index, there is an option to create the index with a certain amount of free space on each data page. This setting is known as the fill factor. If an index is created without specifying a fill factor, then the default fill factor is used, which is 100 (actually 0, but 0 means the same thing as a 100% fill factor). This means that no free space is created for the data pages of an index.
The potential problem with a fill factor of 100 arises when data is added to a table as a result of an INSERT or UPDATE, and a new row needs to be added to a data page. If there is no room for it, then SQL Server will reorganize the rows, moving some of the rows onto a new data page, and leaving some on the old data page. This is known as page splitting. While page splitting is a normal SQL Server activity, too much page splitting can cause performance issues because it results in index fragmentation, the very thing we are trying to eliminate with the Rebuild Index task. In order to mitigate this problem, DBAs often decrease the fill factor to, say, 90, meaning that data pages be 90% full, leaving 10% free space.
For more information regarding fill factors and page splitting refer to Books Online. A full discussion of these topics is beyond the scope of this article, but I needed to include a little background so you better understand what is happening when you make particular selections within the Wizard. Also, don't assume that the example of 90 I give in this example for a fill factor is appropriate for your indexes. It may be, or it may not be.
What is really confusing is that the phrase "default amount of free space" in the Wizard does not mean the same thing as the "default fill factor" that can be set for the entire server. Some people confuse the two.
In the Rebuild Index task, "default amount of free space" refers to the fill factor that was used when a specific index was first built, or last rebuilt. In other words, if you choose the option "Reorganize pages with the default amount of free space", what happens is that each index is rebuilt using whatever fill factor value was used the last time it was rebuilt. This may be the same as the server-wide default, or it may be a specific value that was specified for that index, or it may be a value set using the second "Change free space per page percentage to" option (discussed next).
In almost all cases the "default amount of free space" option is the one you want to use, as it means the index will be rebuilt using the fill factor that was originally specified when the index was created.
With the second option, "Change free space per page percentage to", you specify a single fill factor value to be used for every index when it is rebuilt. For example, if you choose "Change free space per page percentage to" and set it to 10%, this is the same thing as setting all of the indexes in your database to a fill factor of 90, regardless of what the value was when the index was created. It is rarely a good idea for every index in your database to have the same fill factor. The appropriate fill factor is specific to an index, and you can't generalize a fill factor that will work well for every index in your database. While this setting might be beneficial for some indexes, it could cause performance problems with other indexes. As a result, I advise against using this option.
Of course, the choice of the default "Reorganize pages with the default amount of free space" options assumes that the fill factors of all of your indexes have been ideally set when they were originally created, or last rebuilt. If they aren't, then it's a tossup as to which option is really the best. But assuming that you don't know if the fill factors are ideal or not, which you probably don't, I would still recommend using this default option.

Advanced Options

The two options under Advanced options are shown in Figure 1.9.
Figure 1.9: The Advanced options section of the Define Rebuild Index Task screen
By default, both options are turned off. The first one is "Sort results in tempdb". If you don't choose this option, then when an index is rebuilt, all of the rebuilding activity is performed in the database file itself. If you select the "Sort results in tempdb" option, then some of the activity is still performed in the database, but some of it is also performed in tempdb. The benefit is that this can often speed up the index rebuild process. The drawback is that it also takes up a little more overall disk space, as space in tempdb is required, in addition to some space in the database where the indexes are being rebuilt.
The benefit you get out of this option depends on where tempdb is located on your server. If tempdb is located on the same drive or array as the database file that is having its indexes rebuilt, then the benefit may be minimal, if any. However, if tempdb is located on its own isolated drive spindles, then the benefit will be greater because there is less disk I/O contention.
So, should you use this option? If your databases are small, you probably won't be able to discern much performance benefit, but if you have large databases, with large tables and indexes, and if tempdb is located on its own spindles, then turning this feature on will probably boost index rebuild performance.
The second advanced option is one we've discussed previously: "Keep index online while reindexing". This option is only available if you have the Enterprise Edition of SQL Server. By selecting this option, index rebuilding becomes an online, rather than offline task. If you are using Enterprise Edition, you will probably want to select this option. I say "probably" because there are pros and cons of performing an online index rebuild; a topic that is beyond the scope of this article.

Creating the Job Schedule

Our final step is to define an appropriate schedule on which to run our Rebuild Index job. With the previous advice in mind, the best option would be to run the job within a nightly maintenance window. However, in many organizations, only a single weekly maintenance window is available, which is often entire day of Sunday.
Therefore, let's schedule the Rebuild Index task to occur on Sunday, right after the Database Integrity task completes. As such the screen will look as shown in Figure 1.10.
Figure 1.10: The schedule for the Rebuild Index job
The only question we need to consider is how soon after running the Check Database Integrity task should we schedule a Database Rebuild task? That depends on how long the Check Database Integrity task takes to complete, and we won't know until we try it.
Since this is a new Maintenance Plan, we don't have any experience with regard to how long each task runs yet, and so we have to guess. In this example, I'll guess that the first Check Database Integrity task will take an hour, starting at 1:00 AM, so I will schedule the Rebuild Index task to start at 2:00 AM. If I'm wrong, the two jobs will overlap, which could cause some performance problems.
As a DBA, the first time you run any Maintenance Plan, you need to check how long each job takes to run in order to avoid overlapping jobs. If your guess is wrong, and jobs overlap, you can use the Maintenance Plan Designer (covered in my book) to alter the schedule for the next time it runs.
I recommend that you run the Rebuild Index task before any of the backup tasks are performed. This way, if you have to restore a backup, your backup will be of the latest, index rebuilt version.

Summary

Index fragmentation is an issue all databases experience, and if it is not removed on a regular basis, it can lead to query performance problems. One way to remove index fragmentation is to run the Rebuild Index task on a regular basis, which drops and rebuilds every index in a database. While the Rebuild Index task is very effective at what it does, it is considered an off-line activity, and it is very resource intensive. As such, using this task may not always be appropriate for all of your databases.

No comments:

Post a Comment