What is Fragmentation: - Fragmentation may be defined as a state when data is scattered and non contiguous. Fragmentation can be divided into two broad categories, which are following.
- Physical fragmentation of data files on Disk
- Index Level fragmentation
Physical fragmentation of data files on Disk:-Physical fragmentation of database file is quite similar to the other files fragmentation in windows file system. When operating system file system is not able to allocate contiguous space to database files then physical fragmentation comes into picture.
Reason for Physical fragmentation: -
- Physical fragmentation happens in those scenarios where database files are kept in the same disk space where other application’s files are kept.
- When database file growth is frequent and in smaller extent.
How can we determine Physical Fragmentation: - There is no SQL query or DMV (dynamic management view) that can give information about physical fragmentation. To check physical fragmentation there is free tool provided by Microsoft which is called sysinternal’s contig tool. You can download this tool from the below link.
How we can avoid physical fragmentation: - To minimize the chances of physical fragmentation we can pay attention on below points.
- Keeping database files on separate disk from the other applications and files.
- Choosing appropriate growth rate for databases while creating the database.
Index Level Fragmentation: - Index level fragmentation can further divided into two categories.
- Internal Fragmentation
- External Fragmentation
Internal Fragmentation: - When records (rows) are stored in the data pages non-contiguously that is called internal fragmentation. In other words, Internal fragmentation happens when there is unused space between the records within a page and the reason of Internal fragmentation is data modification (Insert, Update, Delete) done against a table.
Let’s try to visualize internal fragmentation by taking an example.
I have taken a clustered table as an example, where there are three Indexes pages and each index page is containing 100 keys values. At the leaf level we have 3 data pages which are fully filled by 100 records per page. Now, if we delete 50 records starting from (3 to 52) from the first page, (104 to 153) records from the second page and (204 to 253) records from the third page respectively. Earlier we had 300 records on three pages, now we have 150 records on 3 pages those can be stored on only two data pages. Due to deletion of the records, lot of unused space has been created. Due to fragmentation, same amount of I/O cost will be occurred even we have lesser no of records on the data pages but SQL optimizer has to read all 3 data pages in case of reading all the records.
External Fragmentation:-When physical storage of data pages and extents are non-contiguous on the disk then it is called External fragmentation. In other words, if a table has been allocated space on non-contiguous extents then it is called External Fragmentation. Or when logical order of the pages doesn’t match with the physical order of the pages, these pages are called out of order pages.
Let’s take an example to understand, what the out of order pages are. In the below screen shot, I have shown three pages where first two pages are completely full and third page is not completely full. In below image physical ordering of the pages is same with logical ordering of the pages.
Page Split Example: - Now we will insert a record value “5”, that will go to the first page and since the first page is already full, hence page split event will occur. As a result of page split half of the records will reside on the first page and half of the records will be moved to a new page (page 4). We can see in the below image that logical ordering of the pages are not similar to the physical ordering of the pages.
Impact on performance: - If we fetch individual records, fragmentation has no impact on those cases, as it goes directly to the page containing the matching records and fetch the data, External fragmentation will impact the performance of the query where we are going to read records in sequential order, In those cases disk drive head has to jump back and forth order since the pages are not in contiguous order.
In my next post we will learn how we can find fragmentation details using SQL scripts in different versions of SQL servers and how we can do de-fragmentation to improve performance.