How to: Restore nodes from a SQL backup
We had an odd issue the other day where a bunch of nodes disappeared from one of our Umbraco installs. There didn't appear to be any reason behind why they were gone and the customer assured me that they didn't delete them but they were gone for sure. No worries I thought, they'll be in the recycle bin but after a quick check I found they weren't… I had a feeling things were about to get tricky -and I was right.
Thankfully all our databases are backed up over night so I reached out for the latest backup to see if I could find the missing data. I had to go back a week or two but I finally I found it -but now what? It's an active CMS so we couldn't just blitz one database with the other so I had to turn to SQL.
Those of you who are familiar with Umbraco's database will know it's a little tricky to get your head around at the best of times let alone when you're trying to fix an issue. There wasn't much on Google so we had to write up some steps to transfer the content ourselves. After a little battling I think the resulting script should be usable by others however this is the first time I've seen this in about 8+ years of working with Umbraco so I hope no-one else ever needs it!
Anyway, once you have your databases restored you can use this script. It's quite long so I've split it out with a little explanation into each step. In short though the main tables you need to restore are: umbracoNode, cmsContent, cmsContentVersion, cmsPropertyData and cmsContentXml
Find the missing nodes
This basically does a quick comparison between the two databases to find any missing nodes. This will include any content, media as well as document types etc so may be a little too inclusive but you can filter out the ones you don't want. The "INTO #ToRestore" pops the records into a temporary table which is dropped later as we'll be referencing them a few times throughout the script and if there are a lot of nodes that could get quite costly.
-- Identify missing nodes between the two databases SELECT r.id, r.text AS [NodeName], r.[path], (SELECT TOP 1 cv.VersionId FROM [BackupDB].dbo.cmsContentVersion cv WHERE cv.ContentId=r.id) AS [LatestVersion] INTO #ToRestore FROM [BackupDB].dbo.umbracoNode r LEFT JOIN [RestoreTo].dbo.umbracoNode m ON r.id = m.id WHERE m.id IS NULL ORDER BY r.id -- The list of nodes to restore SELECT * FROM #ToRestore r ORDER BY r.[path]
umbracoNode
The table we're probably all most familiar with, this one contains the relationships between the different nodes. Without the records here nothing will work. It firstly changes the table so we can insert the ids (normally these are created by SQL) and it then inserts all missing nodes from the BackupDB.
-- Restore the nodes SET IDENTITY_INSERT [RestoreTo].dbo.umbracoNode ON INSERT INTO [RestoreTo].dbo.umbracoNode (id, trashed, parentID, nodeUser, level, path, sortOrder, uniqueID, text, nodeObjectType, createDate) SELECT id, trashed, parentID, nodeUser, level, path, sortOrder, uniqueID, text, nodeObjectType, createDate FROM [BackupDB].dbo.umbracoNode iin WHERE iin.id IN (SELECT tr.id FROM #ToRestore tr) AND NOT EXISTS (SELECT TOP 1 1 FROM [RestoreTo].dbo.cmsContent ic WHERE ic.nodeId=iin.id) SET IDENTITY_INSERT [RestoreTo].dbo.umbracoNode OFF
cmsContent
I wasn't too familiar with this table when this started but it's basically an important relationship table which contains information on the type of node each node is.
-- Restore the node to content joins INSERT INTO [RestoreTo].dbo.cmsContent (nodeId, contentType) SELECT DISTINCT c.nodeId, c.contentType FROM [BackupDB].dbo.umbracoNode iin LEFT JOIN [BackupDB].dbo.cmsPropertyData ipd ON iin.id = ipd.contentNodeId LEFT JOIN [BackupDB].dbo.cmsContent c ON iin.id = c.nodeId WHERE iin.id IN (SELECT tr.id FROM #ToRestore tr) AND NOT EXISTS (SELECT * FROM [RestoreTo].dbo.cmsContent ic WHERE ic.nodeId=iin.id)
cmsContentVersion
This table is the version record for the node. This script only restores the most recent version of the content, you may need more but we didn't.
-- Restore the content versions INSERT INTO [RestoreTo].dbo.cmsContentVersion (ContentId, VersionId, VersionDate, LanguageLocale) SELECT DISTINCT cv.ContentId, cv.VersionId, cv.VersionDate, cv.LanguageLocale FROM [BackupDB].dbo.umbracoNode iin LEFT JOIN [BackupDB].dbo.cmsPropertyData ipd ON iin.id = ipd.contentNodeId LEFT JOIN [BackupDB].dbo.cmsContent c ON iin.id = c.nodeId LEFT JOIN [BackupDB].dbo.cmsContentVersion cv ON c.nodeId = cv.ContentId WHERE iin.id IN (SELECT tr.id FROM #ToRestore tr) AND NOT EXISTS (SELECT TOP 1 1 FROM [RestoreTo].dbo.cmsContentVersion ic WHERE ic.ContentId=iin.id)
cmsPropertyData
This will restore the actual values for each of the nodes.
-- Restore the property data INSERT INTO [RestoreTo].dbo.cmsPropertyData (contentNodeId, versionId, propertytypeid, dataInt, dataDate, dataNvarchar, dataNtext) SELECT DISTINCT ipd.contentNodeId, ipd.versionId, ipd.propertytypeid, ipd.dataInt, ipd.dataDate, ipd.dataNvarchar, ipd.dataNtext FROM [BackupDB].dbo.umbracoNode iin LEFT JOIN [BackupDB].dbo.cmsPropertyData ipd ON iin.id = ipd.contentNodeId WHERE iin.id IN (SELECT tr.id FROM #ToRestore tr) AND NOT EXISTS (SELECT * FROM [RestoreTo].dbo.cmsPropertyData ic WHERE ic.contentNodeId=iin.id)
cmsContentXml
These are the snippets of XML that are used to build the larger umbraco.config file, we're not sure whether they were needed in the end but it saves re-saving everything!
-- Restore the XML INSERT INTO [RestoreTo].dbo.cmsContentXml (nodeId, [xml]) SELECT cx.nodeId, cx.[xml] FROM [BackupDB].dbo.umbracoNode iin LEFT JOIN [BackupDB].dbo.cmsPropertyData ipd ON iin.id = ipd.contentNodeId LEFT JOIN [BackupDB].dbo.cmsContentXml cx ON iin.id=cx.nodeId WHERE iin.id IN (SELECT tr.id FROM #ToRestore tr) AND NOT EXISTS (SELECT TOP 1 1 FROM [RestoreTo].dbo.cmsContentXml ic WHERE ic.nodeId=iin.id) AND cx.nodeId is not null
The entire script
To use it, replace BackupDB with the name of the restored database i.e. the one with the missing nodes in. Then replace RestoreTo with the one you want to add the missing nodes to. Once you're happy it's working, you can change the ROLLBACK TRAN to COMMIT TRAN and run one list time...
Note: You won't be able to see the nodes in Umbraco until you have run it with COMMIT in place. Also, don't forget to take a backup of your live database before running scripts from the interwebs... You know... just in case little Bobby Drop Tables is around ;)/p>
BEGIN TRAN -- Identify missing nodes between the two databases SELECT r.id, r.text AS [NodeName], r.[path], (SELECT TOP 1 cv.VersionId FROM [BackupDB].dbo.cmsContentVersion cv WHERE cv.ContentId=r.id) AS [LatestVersion] INTO #ToRestore FROM [BackupDB].dbo.umbracoNode r LEFT JOIN [RestoreTo].dbo.umbracoNode m ON r.id = m.id WHERE m.id IS NULL ORDER BY r.id -- The list of nodes to restore SELECT * FROM #ToRestore r ORDER BY r.[path] -- Restore the nodes SET IDENTITY_INSERT [RestoreTo].dbo.umbracoNode ON INSERT INTO [RestoreTo].dbo.umbracoNode (id, trashed, parentID, nodeUser, level, path, sortOrder, uniqueID, text, nodeObjectType, createDate) SELECT id, trashed, parentID, nodeUser, level, path, sortOrder, uniqueID, text, nodeObjectType, createDate FROM [BackupDB].dbo.umbracoNode iin WHERE iin.id IN (SELECT tr.id FROM #ToRestore tr) AND NOT EXISTS (SELECT TOP 1 1 FROM [RestoreTo].dbo.cmsContent ic WHERE ic.nodeId=iin.id) SET IDENTITY_INSERT [RestoreTo].dbo.umbracoNode OFF -- Restore the node to content joins INSERT INTO [RestoreTo].dbo.cmsContent (nodeId, contentType) SELECT DISTINCT c.nodeId, c.contentType FROM [BackupDB].dbo.umbracoNode iin LEFT JOIN [BackupDB].dbo.cmsPropertyData ipd ON iin.id = ipd.contentNodeId LEFT JOIN [BackupDB].dbo.cmsContent c ON iin.id = c.nodeId WHERE iin.id IN (SELECT tr.id FROM #ToRestore tr) AND NOT EXISTS (SELECT * FROM [RestoreTo].dbo.cmsContent ic WHERE ic.nodeId=iin.id) -- Restore the content versions INSERT INTO [RestoreTo].dbo.cmsContentVersion (ContentId, VersionId, VersionDate, LanguageLocale) SELECT DISTINCT cv.ContentId, cv.VersionId, cv.VersionDate, cv.LanguageLocale FROM [BackupDB].dbo.umbracoNode iin LEFT JOIN [BackupDB].dbo.cmsPropertyData ipd ON iin.id = ipd.contentNodeId LEFT JOIN [BackupDB].dbo.cmsContent c ON iin.id = c.nodeId LEFT JOIN [BackupDB].dbo.cmsContentVersion cv ON c.nodeId = cv.ContentId WHERE iin.id IN (SELECT tr.id FROM #ToRestore tr) AND NOT EXISTS (SELECT TOP 1 1 FROM [RestoreTo].dbo.cmsContentVersion ic WHERE ic.ContentId=iin.id) -- Restore the property data INSERT INTO [RestoreTo].dbo.cmsPropertyData (contentNodeId, versionId, propertytypeid, dataInt, dataDate, dataNvarchar, dataNtext) SELECT DISTINCT ipd.contentNodeId, ipd.versionId, ipd.propertytypeid, ipd.dataInt, ipd.dataDate, ipd.dataNvarchar, ipd.dataNtext FROM [BackupDB].dbo.umbracoNode iin LEFT JOIN [BackupDB].dbo.cmsPropertyData ipd ON iin.id = ipd.contentNodeId WHERE iin.id IN (SELECT tr.id FROM #ToRestore tr) AND NOT EXISTS (SELECT * FROM [RestoreTo].dbo.cmsPropertyData ic WHERE ic.contentNodeId=iin.id) -- Restore the XML INSERT INTO [RestoreTo].dbo.cmsContentXml (nodeId, [xml]) SELECT cx.nodeId, cx.[xml] FROM [BackupDB].dbo.umbracoNode iin LEFT JOIN [BackupDB].dbo.cmsPropertyData ipd ON iin.id = ipd.contentNodeId LEFT JOIN [BackupDB].dbo.cmsContentXml cx ON iin.id=cx.nodeId WHERE iin.id IN (SELECT tr.id FROM #ToRestore tr) AND NOT EXISTS (SELECT TOP 1 1 FROM [RestoreTo].dbo.cmsContentXml ic WHERE ic.nodeId=iin.id) AND cx.nodeId is not null -- Clean up DROP TABLE #ToRestore ROLLBACK TRAN
Liked this post? Got a suggestion? Leave a comment