Tag Archives | backups

Database slow to open after Instance restart or failover due to too many VLFs in transaction log.

A client we do occasional work for hosts a pretty busy half-terabyte Dynamics CRM database on a SQL Server cluster. The database has been in place for a while, everyone's happy with the performance, and on the surface everything looks fine. They don't have a full-time DBA.

Due to a hardware failure they had an instance failover on the cluster. All the databases in the instance popped back up nice and quickly, but company_mscrm was stuck IN RECOVERY. As this is the core database for Dynamics CRM, their system was down. Thinking it was just a size problem, they decided to leave it a bit longer. After 10 minutes they decided it was 'stuck' and restarted the instance. Same problem.

Now people were getting worried, so they started a restore onto a secondary server, and called us.

On connecting to their ailing server, a quick sp_who2 showed a number of processes executing the command DB STARTUP, so something was still working on the database.
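
If you prefer the DMVs, the same thing is visible in sys.dm_exec_requests (a quick sketch; the session ids and wait types will obviously differ on your system):

SELECT session_id, command, status, wait_type, wait_time, percent_complete
FROM sys.dm_exec_requests
WHERE command = 'DB STARTUP';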

Using the scripts provided by Microsoft SQL Server Customer Support – http://blogs.msdn.com/b/psssql/archive/2010/12/29/tracking-database-recovery-progress-using-information-from-dmv.aspx – I started collecting execution stats so we could see what the processes were doing (or not doing, as the case might be). By the way, I tend to set the capture interval to 5 seconds to start with, so I can get a feel for what's happening without swamping myself with data.

From the tbl_recovery_tracking table:

SELECT total_elapsed_time, wait_type, reads, writes, cpu_time
FROM [dbo].[tbl_recovery_tracking]
WHERE session_id = 23

total_elapsed_time wait_type reads writes cpu_time
762441 IO_COMPLETION 109966 0 2418
767445 IO_COMPLETION 110811 0 2418
772448 IO_COMPLETION 111726 0 2449
777451 IO_COMPLETION 112753 0 2480
782453 IO_COMPLETION 113625 0 2480
787457 IO_COMPLETION 114565 0 2496
792460 IO_COMPLETION 115527 0 2527
797462 IO_COMPLETION 116303 0 2574
802465 IO_COMPLETION 117106 0 2589
807469 IO_COMPLETION 117880 0 2589
812471 IO_COMPLETION 118499 0 2589

So, much as you'd expect, we see lots and lots of reads. Then all of a sudden the reads stop and we start seeing a bigger increase in cpu_time:

total_elapsed_time wait_type reads writes cpu_time
2103341 IO_COMPLETION 317957 0 7129
2108344 IO_COMPLETION 318662 0 7144
2113346 IO_COMPLETION 319523 0 7207
2118350 NULL 320016 0 9672
2123353 NULL 320016 0 14523
2128355 NULL 320016 0 19390
2133359 NULL 320016 0 24398

So nearly there, you might think. Not a bit of it. This carried on for a long time; in fact, twice as long as it had taken to complete the reads:

total_elapsed_time wait_type reads writes cpu_time
6787241 NULL 320016 0 4456636
6792244 NULL 320016 0 4461316
6797247 NULL 320016 0 4465856
6802251 NULL 320016 0 4470505
6807262 NULL 320016 0 4475341
6812265 NULL 320016 0 4480177
6817267 NULL 320016 0 4484950

Bearing in mind that total_elapsed_time is measured in milliseconds, the database recovery consisted of roughly 35 minutes of reading data and then 78 minutes of processing. Which is a lot of time, when the recovery stats in the error log look this small:

LogDate ProcessInfo Text
18/04/2013 08:50 spid23s 16 transactions rolled forward in database ‘company_mscrm’ (8). This is an informational message only. No user action is required.
18/04/2013 08:50 spid13s 0 transactions rolled back in database ‘company_mscrm’ (8). This is an informational message only. No user action is required.
18/04/2013 08:50 spid13s Recovery is writing a checkpoint in database ‘company_mscrm’ (8). This is an informational message only. No user action is required.
18/04/2013 08:50 spid13s Recovery completed for database company_mscrm (database ID 8) in 2 second(s) (analysis 720 ms, redo 698 ms, undo 196 ms.) This is an informational message only. No user action is required.
18/04/2013 08:50 spid13s Recovery is complete. This is an informational message only. No user action is required.

And the restore on the secondary box? It hadn't got anywhere.

This all pointed to a ridiculous number of VLFs in their transaction log. Lo and behold, a quick DBCC LOGINFO in the offending database revealed a shockingly high count of ~28,000 VLFs in a 70GB transaction log file. That's a new record for me.
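
If you want to check your own databases, DBCC LOGINFO returns one row per VLF, so the row count returned is the VLF count. Run it in the database in question (swap in your own database name):

USE company_mscrm;
go
DBCC LOGINFO;
go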

So to fix this we agreed that they'd shrink the log file and recreate it at a more realistic size, lay off shrinking it in future, and set autogrowth to a more sensible value for the occasions it's needed, by running:

USE company_mscrm
go
-- Flush everything out of the log first
BACKUP LOG company_mscrm TO DISK = N'\\backup\share\company_mscrm\backup.trn'
-- Shrink the log file down as far as it will go
DBCC SHRINKFILE ('company_mscrm_log', 0, TRUNCATEONLY)
DBCC SHRINKFILE ('company_mscrm_log', 0)
-- Resize the log and set a sensible autogrowth increment
ALTER DATABASE company_mscrm MODIFY FILE (NAME = 'company_mscrm_log', SIZE = 70GB, FILEGROWTH = 2GB)

First we back up the transaction log to 'flush' everything out. Then we shrink the log file as much as we can, and resize it to 70GB, which was the size we determined would cope with pretty much all of their usage. Finally we set a sensible autogrowth increment for the few occasions when the transaction log file isn't big enough.

A simple solution to what was quite a major problem. The lesson to take away is that just because your production database appears to be running perfectly there can still be issues under the surface just waiting to catch you out. There are a number of tools and scripts available that allow you to monitor your log files and their VLF counts.

If you need any help setting them up, or require advice on sizing your transaction logs, then Leaf Node are happy to help; please get in touch.


What impact do SQL Server data and log file sizes have on backup size?

Many SQL Server DBAs seem to be very keen to shrink their database files at every opportunity. There seems to be a worry that any spare space inside a data file or transaction log is a problem. Is it?

One of the common misconceptions is that a large data file translates to a large backup file, so let's try to lay that myth to rest. We'll run through a couple of scenarios and compare the backup sizes of databases of different sizes.

These examples were all done on SQL Server 2012 Developer Edition, but will apply to any other version as well.

Also, for all these examples there are no transactions occurring during the backup. If the data in the data files changes during a backup, SQL Server will include the relevant portions of the transaction log in the backup so that you get a consistent backup.

First we’ll create 2 databases of different sizes, and populate them with the same data:

use master;
go

CREATE DATABASE dbsmall
ON 
( NAME = small_dat,
    FILENAME = 'C:\DATA\smalldat.mdf',
    SIZE = 10,
    MAXSIZE = 50,
    FILEGROWTH = 5 )
LOG ON
( NAME = small_log,
    FILENAME = 'C:\DATA\smalllog.ldf',
    SIZE = 5MB,
    MAXSIZE = 25MB,
    FILEGROWTH = 5MB ) ;
GO


CREATE DATABASE dbbig
ON 
( NAME = big_dat,
    FILENAME = 'C:\DATA\bigdat.mdf',
    SIZE = 100,
    MAXSIZE = 150,
    FILEGROWTH = 5 )
LOG ON
( NAME = big_log,
    FILENAME = 'C:\DATA\biglog.ldf',
    SIZE = 5MB,
    MAXSIZE = 25MB,
    FILEGROWTH = 5MB ) ;
GO

use dbsmall;
go
create table tbl_junk
(junk varchar(max)
);
go
insert into tbl_junk (junk) select REPLICATE('Junk',1000);
go 50

use dbbig;
go
create table tbl_junk
(junk varchar(max)
);
go
insert into tbl_junk (junk) select REPLICATE('Junk',1000);
go 50

So dbsmall has a primary data file of 10MB and dbbig has a primary data file of 100MB; both contain a single table holding roughly 200KB of data.

So now let’s back them up to see if there’s any difference. And just to make sure we’ll try it with compression as well:

use master;
go

backup database dbsmall to disk=N'c:\backup\dbsmall_normal.bak';
go
backup database dbbig to disk=N'c:\backup\dbbig_normal.bak';
go
backup database dbsmall to disk=N'c:\backup\dbsmall_compressed.bak' with compression;
go
backup database dbbig to disk=N'c:\backup\dbbig_compressed.bak' with compression;
go

exec xp_cmdshell 'dir c:\backup\';

And the results from the xp_cmdshell are:

01/16/2013  04:49 PM           371,200 dbbig_compressed.bak
01/16/2013  04:49 PM         3,232,256 dbbig_normal.bak
01/16/2013  04:49 PM           371,200 dbsmall_compressed.bak
01/16/2013  04:49 PM         3,232,256 dbsmall_normal.bak
               4 File(s)      7,206,912 bytes
               2 Dir(s)  16,019,734,528 bytes free

So that shows that all the backups are the same size. Which is to be expected, as it’s well documented that SQL Server only backs up the pages in use within the data files, and as both databases contain exactly the same amount of data the backups will be the same size.
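
You can see that for yourself with sp_spaceused (a quick check rather than anything rigorous): the database_size figures differ, but the reserved and data figures are the same for both databases.

use dbsmall;
go
exec sp_spaceused;
go
use dbbig;
go
exec sp_spaceused;
go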

So that's the data files checked. But what about transaction logs, I hear you ask. Well, let's have a look. In these examples we're only looking at full database backups; transaction log backups are a different thing, and I'll be looking at those in another post.

So let’s increase the size of the transaction log for dbbig and see what that does for the backup size:

alter database dbbig modify file (Name=big_log, size=200MB);
go

backup database dbbig to disk=N'c:\backup\dbbig_log_normal.bak';
go

backup database dbbig to disk=N'c:\backup\dbbig_log_compressed.bak' with compression;
go

exec xp_cmdshell 'dir c:\backup';

And this time xp_cmdshell tells us:

01/16/2013  05:10 PM           375,296 dbbig_log_compressed.bak
01/16/2013  05:10 PM         3,240,448 dbbig_log_normal.bak
01/16/2013  04:49 PM           371,200 dbbig_compressed.bak
01/16/2013  04:49 PM         3,232,256 dbbig_normal.bak

Hang on, that's 8KB more. Are we missing something here? No, not really. The 8KB is an artefact of the ALTER DATABASE statement; we get it whatever size we grow the transaction log to:

drop database dbbig;
go

restore database dbbig from disk=N'c:\backup\dbbig_normal.bak' with recovery;
go

alter database dbbig modify file (Name=big_log, size=100MB);
go

backup database dbbig to disk=N'c:\backup\dbbig_1log_normal.bak';
go

backup database dbbig to disk=N'c:\backup\dbbig_1log_compressed.bak' with compression;
go

drop database dbbig;
go

restore database dbbig from disk=N'c:\backup\dbbig_normal.bak' with recovery;
go

alter database dbbig modify file (Name=big_log, size=500MB);
go

backup database dbbig to disk=N'c:\backup\dbbig_5log_normal.bak';
go

backup database dbbig to disk=N'c:\backup\dbbig_5log_compressed.bak' with compression;
go

drop database dbbig;
go

restore database dbbig from disk=N'c:\backup\dbbig_normal.bak' with recovery;
go

alter database dbbig modify file (Name=big_log, size=1000MB);
go

backup database dbbig to disk=N'c:\backup\dbbig_10log_normal.bak';
go

backup database dbbig to disk=N'c:\backup\dbbig_10log_compressed.bak' with compression;
go

exec xp_cmdshell 'dir c:\backup';
go

Which gives:

01/16/2013  05:30 PM         3,240,448 dbbig_10log_normal.bak
01/16/2013  05:29 PM         3,240,448 dbbig_1log_normal.bak
01/16/2013  05:29 PM         3,240,448 dbbig_5log_normal.bak

So we take that small 8KB hit for any size of transaction log. Now the eagle-eyed will have spotted that those are empty transaction logs, so what happens when we have a full transaction log?

We'll reset dbbig, increase the transaction log and then run something to fill it up, but then empty the data file back to where it was:

use master;
go
drop database dbbig;
go

restore database dbbig from disk=N'c:\backup\dbbig_normal.bak' with recovery;
go

alter database dbbig modify file ( name = N'big_log', maxsize = 150MB , FILEGROWTH = 25MB );
go

create table dbbig.dbo.tbl_junk2
(junk varchar(max)
);
go

insert into dbbig.dbo.tbl_junk2 (junk) select REPLICATE('Junk',200) from sys.all_columns;
go 4
drop table dbbig.dbo.tbl_junk2;
go

backup database dbbig to disk=N'c:\backup\dbbig_full_log.bak';
go

backup database dbbig to disk=N'c:\backup\dbbig_compressed_full_log.bak' with compression;
go

exec xp_cmdshell 'dir c:\backup';

Which gives us:

01/16/2013  06:14 PM           375,808 dbbig_compressed_full_log.bak
01/16/2013  06:14 PM        29,585,920 dbbig_full_log.bak

Hang on a sec, that’s a much larger backup there. What’s going on?

Well, this is an example of the Ghost Cleanup process in action. The short version is that SQL Server doesn't instantly deallocate the pages in a data file when you drop or truncate a table; it just marks the pages as deletable, and the Ghost Cleanup task actually clears them the next time it runs. (Lots of good detail from Paul Randal (who wrote the code) in this post: Inside the Storage Engine: Ghost cleanup in depth).
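
If you'd rather watch the pages actually being freed than just wait, sys.dm_db_file_space_usage reports allocated pages for the current database on SQL Server 2012 and later (earlier versions only expose it for tempdb). A rough sketch:

use dbbig;
go
-- allocated_extent_page_count should fall back towards its pre-insert level once cleanup has finished
select file_id, allocated_extent_page_count, unallocated_extent_page_count
from sys.dm_db_file_space_usage;
go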

So if we rerun the above SQL with this appended:

waitfor delay '00:02:00';
go

backup database dbbig to disk=N'c:\backup\dbbig_full_logaq.bak';
go

exec xp_cmdshell 'dir c:\backup';

Then we get:

01/16/2013  06:14 PM           375,808 dbbig_compressed_full_log.bak
01/16/2013  06:14 PM        29,585,920 dbbig_full_log.bak
01/16/2013  06:16 PM         3,240,448 dbbig_full_logaq.bak

Which is the same size as we've been seeing in all the other examples. By the way, the 2 minute wait for Ghost Cleanup to kick in is very generous, just to make sure; it's usually much quicker than that, though the actual time will depend on how busy your system is.


Never credit SQL Server with intelligence it doesn’t have

and that goes double for SQL Server Management Studio (SSMS)

Yet another example of someone having problems with backups and SQL Server, mostly caused by not checking exactly what’s happening and not having a full written procedure in place.

The DBA was notified of a problem with a key production system. After a bit of digging they realised that a misbehaving SSIS package had truncated a table. Not fancying the chances of the package running perfectly a second time without causing more problems, the DBA decided to restore the database to a temporary location from a backup and then migrate the data across. The package had run at 19:30, before the full backup kicked off at 20:00. Being a good DBA they had plenty of hourly transaction log backups as well, so a few clicks in SSMS and a point in time restore was underway. Feet up, a coffee while waiting, and then off to the pub having pocketed some nice overtime.

Unfortunately, when the restore had completed our brave DBA found that the table in the restored DB was also empty. OK, maybe the table was corrupted by something else and not the SSIS package, so they set up a point in time restore for close of business at 17:00, as the users hadn't complained then. Hoping for a quick restore as last orders was fast approaching, our brave DBA set it off.

Restore finishes, and the table is still empty. Getting slightly worried by now the DBA tries the night before’s backup as a last resort, though they can’t imagine how the system could have run a day with an empty table. Guess what, still no luck.

At this point we get called in, as no one can work out what's going on. A quick look in msdb.dbo.backupset and we had our answer:

select
    a.backup_start_date,
    b.physical_device_name
from
    msdb.dbo.backupset a
    inner join msdb.dbo.backupmediafamily b on a.media_set_id = b.media_set_id
where
    a.type = 'D'
    and a.database_name = 'dbname'
order by a.last_lsn desc

Our DBA had been diligently backing up the database every day like they should, but they'd been backing it up to the same dbname.bak file every night, complete with INIT, so every night the same file was being overwritten. SQL Server will only do what you tell it to.

But why didn't the point in time restores complain? Well, again, SQL Server went off to look in the msdb backup history (msdb.dbo.backupset and friends) and worked out which files it needed for the restore. Guess which file it used every time? And then, as the subsequent transaction log backups were for later LSNs, SQL Server happily ignored them.

Once we'd pulled the earlier version of dbname.bak off tape and manually built up the T-SQL statements to do the required point in time restore, everything worked as expected and we could migrate the table data across.

The first moral of the story is to be careful about what you tell SQL Server to do, as it will do exactly that. Just like a petulant 5 year old.

And the second is that SSMS can be pretty dumb about what it lets you set when building up backups and restores. If the restore here had been done with T-SQL in the first instance, the DBA would have seen error messages as SQL Server refused to apply transaction log backups that didn't fit the LSN chain.
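
For reference, a hand-built point in time restore looks something like this (the file names and STOPAT time here are made up for illustration; substitute your own):

restore database dbname from disk = N'\\backup\share\dbname_full.bak' with norecovery, replace;
go
restore log dbname from disk = N'\\backup\share\dbname_1800.trn' with norecovery;
go
restore log dbname from disk = N'\\backup\share\dbname_1900.trn' with stopat = '2013-04-17 19:25:00', recovery;
go

If the full backup doesn't match the log chain, the first RESTORE LOG fails straight away with an LSN error rather than being silently skipped.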


Problems migrating and upgrading SQL Server Reporting Services DBs

Ever upgraded a SQL Server Reporting Services instance after migrating to a new SQL Server, and everything looks good until you click on the Report Server URL, only to be greeted with:

The version of the report server database is either in a format that is not valid, or it cannot be read. The found version is ‘C.0.8.54’. The expected version is ‘147’. (rsInvalidReportServerDatabase)

You diligently read through all the Google and Bing results you can find. But there's nothing useful in there, and all the guides say that connecting the Reporting Services instance should upgrade the database schema to the correct version. So why isn't it working for you?

By now you've started to delve into the depths of the Reporting Services logfiles (always a great place to look for any Reporting Services issue; often quicker than googling). And you've spotted what looks like the smoking gun:

library!WindowsService_0!b84!07/23/2012-11:28:31:: i INFO: Current DB Version C.0.8.54, Instance Version 147.
library!WindowsService_0!b84!07/23/2012-11:28:31:: i INFO: Starting upgrade DB version from C.0.8.54 to 147.
library!WindowsService_0!b84!07/23/2012-11:28:32:: e ERROR: Throwing Microsoft.ReportingServices.Diagnostics.Utilities.ReportServerStorageException: , An error occurred within the report server database.  This may be due to a connection failure, timeout or low disk condition within the database.;
library!WindowsService_0!b84!07/23/2012-11:28:32:: e ERROR: ServiceStartThread: Exception caught while starting service. Error: Microsoft.ReportingServices.Diagnostics.Utilities.ReportServerStorageException: An error occurred within the report server database.  This may be due to a connection failure, timeout or low disk condition within the database. ---> System.Data.SqlClient.SqlException: Cannot find the object "dbo.SnapshotData" because it does not exist or you do not have permissions.
Changed database context to 'ReportServerTempDB'.

So you dutifully go and check whether dbo.SnapshotData exists, check the permissions, maybe even give your service user sysadmin rights as a last-chance hope?

And still none of it works. So what is the problem then? Well, did you move over ReportServerTempDB as well as ReportServer? Wait a sec, why should you? Didn't some guy on the internet say you didn't need to back that one up, just like TempDB? Well, they may have said it, and sometimes they might even have been right. But when doing an upgrade you need ReportServerTempDB as well. So now you need to find it; hopefully you've still got the original kicking around to make life easier. If not, you'll need to find one of the same original version and then sort the permissions out.
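
When migrating, the simplest approach is to take both databases with you. A minimal sketch, assuming the default database names and matching file paths on the new instance (add WITH MOVE clauses if the paths differ):

-- On the old instance
backup database ReportServer to disk = N'C:\backup\ReportServer.bak';
backup database ReportServerTempDB to disk = N'C:\backup\ReportServerTempDB.bak';

-- On the new instance
restore database ReportServer from disk = N'C:\backup\ReportServer.bak';
restore database ReportServerTempDB from disk = N'C:\backup\ReportServerTempDB.bak';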

This is a very common issue we see when we get called in to look at Reporting Services upgrades gone bad. For most instances, the space taken by a ReportServerTempDB backup isn't a worthwhile saving compared to the trouble it avoids.


Syncing SQL Server databases ready for DB Mirroring

Whilst database mirroring is deprecated as of SQL Server 2012 (http://msdn.microsoft.com/en-us/library/ms143729%28v=sql.110%29.aspx), it's still something I'm going to be using heavily with 2008 and 2008 R2, as it's a simple way of implementing failover between discrete clusters or instances. They'll move to Availability Groups eventually, but not in the short term, as there's no direct way to do that without building a new Windows cluster that contains the failover clusters.

However, with a database that has frequent transaction log backups (and if it's important enough to mirror, it's almost certainly important enough to deserve frequent log backups) it can take a bit of work to get everything synced up again, or to catch back up if you've had to break the mirror for any reason.

The first script syncs everything up from scratch. It connects back to the primary server to get the backup information. It assumes that:

  • You're running it as an account that exists on both instances with the sysadmin role granted
  • You don't already have a linked server on the secondary with the name of the primary instance
  • The file structure is identical on the two instances
  • All backup files (log and full) are available in their original location (i.e. not already swept off to tape and deleted)

declare @last_lsn numeric(25,0), @first_lsn numeric(25,0), @@tmp_lsn numeric(25,0)
declare @dbname varchar(50), @restorefile varchar(255), @oserver nvarchar(50), @lserver nvarchar(50)

-- Set the server name twice. The first is used for building the db link, the second is used in the actual query to hide special characters
set @oserver='server1\instance'
set @lserver='[server1\instance]'
set @dbname='bigdb1'

EXEC master.dbo.sp_addlinkedserver @oserver, @srvproduct=@oserver, @provider=N'SQLNCLI', @datasrc=@oserver
EXEC master.dbo.sp_addlinkedsrvlogin @rmtsrvname=@oserver, @useself=N'True', @locallogin=NULL, @rmtuser=NULL, @rmtpassword=NULL

exec(N'select c.physical_device_name, a.last_lsn, a.first_lsn into ##mirror
from '+@lserver+'.msdb.dbo.backupset a, '+@lserver+'.msdb.dbo.backupmediaset b, '+@lserver+'.msdb.dbo.backupmediafamily c
where a.database_name='''+@dbname+'''
and a.media_set_id=b.media_set_id
and b.media_set_id=c.media_set_id
and a.last_lsn > (select max(first_lsn) from '+@lserver+'.msdb.dbo.backupset where database_name='''+@dbname+''' and type=''D'')
order by last_lsn asc')

declare restore_cursor cursor for
    select physical_device_name, last_lsn, first_lsn from ##mirror order by last_lsn asc
open restore_cursor
fetch next from restore_cursor into @restorefile, @last_lsn, @first_lsn
while @@FETCH_STATUS = 0
begin
    -- comment out the exec line and uncomment the print line if you want to check the restore commands first


Please backup your SQL Server Express databases

Really, please back them up. The number of non-trivial apps that now ship with a SQL Server Express backend database is increasing, and that increases the risk that you'll end up with one that's corrupted or broken. Without a good backup there's no guarantee that your precious data will be recoverable, or that what you can recover will be valid for your application. Having spent a day recovering a SQL Express database that had a couple of corrupted pages in a data file, I'd like to offer this up to hopefully reduce the number of times I come across the problem.

Unfortunately SQL Server Express doesn’t come with all the nice functionality of the other SQL Server versions, so implementing backups isn’t as easy as setting up a maintenance plan (not that that’s all you need to do for good backups (See
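
The usual workaround, as SQL Server Express has no SQL Server Agent, is a small T-SQL script run by sqlcmd from Windows Task Scheduler. A minimal sketch, with the database name, paths and instance name as placeholders, covering full backups only:

-- backup_myappdb.sql
declare @file nvarchar(260);
set @file = N'C:\Backups\MyAppDB_' + convert(nvarchar(8), GETDATE(), 112) + N'.bak';
backup database MyAppDB to disk = @file with init, checksum;

-- scheduled nightly in Task Scheduler as:
-- sqlcmd -S .\SQLEXPRESS -E -i "C:\Scripts\backup_myappdb.sql"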


A transaction is a transaction is a transaction

Or at least as far as SQL Server’s transaction logs are concerned.

I recently had to fettle a SQL Server database that was, as is unfortunately common, running out of space on a regular basis. And, as is also depressingly common, it was running out of space because of a hugely growing transaction log. The workaround had been to take an ad-hoc transaction log backup and then shrink the file, and repeat the next time it fell in a heap.

But what about transaction log backups, I hear you cry. Yep, they were being done. But only every 6 hours. Yes, just 4 transaction log backups a day for a database that was generating enough transaction log to fill a decent sized filesystem in less than 6 hours.

So I volunteered the following non-controversial solution:

  • More frequent transaction log backups (every ~10 minutes in this case)
  • Size the transaction log based on how frequently you back those logs up, and leave it at a fixed size (~1GB in this case).

Which cues the following discourse:

“But we can’t do that because we’ll fill up our transaction log backup area”

“Do you do that at the moment?”

“No, but if we back up 15GB every 20 minutes we will”

“But you won’t, as there won’t be 15GB of transactions in 20 minutes”

“But our transaction log backups are 15GB each”

Penny drops on my side, and leads to the main point of this post.

If your system is performing x transactions per day, then that’s the number of transactions you do during a day. Exactly that number, no matter what you do. And if you want to be able to restore to a point in time, then that’s the number of transactions you need to backup every day.

Sounds sensible, doesn't it? But how many transaction log problems are caused because the owner doesn't follow that through logically?

Let's say you're doing 30,000,000 transactions a day, and each one of those writes a certain amount of data to the transaction log, because that's what SQL Server needs to record. Let's say that amount is 8KB, so your database is writing out ~230GB of transaction log per day.

So, assuming those transactions come in nice and evenly throughout the day (oh, if only they would), then depending on how often you back up and truncate the transaction log you'll need the following space to hold it:

  • once per day – 230GB
  • twice per day – 115GB
  • every six hours – 57GB
  • every 4 hours – 39GB
  • every 2 hours – 19GB
  • every hour – 10GB
  • every 30 minutes – 5GB
  • every 15 minutes – 2.5GB

OK, you get the idea. But the main thing to remember is that you’re still backing up exactly the same volume of transaction log data, just doing it in smaller chunks, more frequently. So you need less space on your fast DB disks, and if you’re moving your transaction log backups to a safe secondary location you’ve less critical business data sat waiting to be lost when the server dies.

If you're generating 230GB of transaction logs a day, have 45GB of space for transaction logs, and only back up and truncate the logs 4 times a day, you're going to end up with a mighty big mess on your hands the 5 times a day the database falls over. But you could fix it at a stroke by backing them up 8 times a day.
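
If you're not sure how much log your database actually generates, the backup history in msdb gives a rough daily figure (a sketch; swap in your own database name):

select convert(date, backup_start_date) as backup_day,
       sum(backup_size) / 1024 / 1024 / 1024 as log_backed_up_gb
from msdb.dbo.backupset
where database_name = 'dbname'
and type = 'L'
group by convert(date, backup_start_date)
order by backup_day;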

And remember, if you think you’re seeing too much transaction log being generated then you’ll need to investigate why it’s


Backups – where it all starts

Dear DBAs of the world,

We need to talk about backups. Ever wonder why every DBA blog for any RDBMS product bangs on about backups? (ok, you NoSQL guys can shut up at this point).

Because if you’ve got good, and I mean good
