Search every page in Umbraco for a string
As part of a rebrand, a customer had changed their email address in Umbraco and although it was instantly reflected on the website (yay for CMSs!), they wanted us to double check that the old email address had been completely removed.
Although we can't remove it from the internet entirely, we could reassure the customer that it had at least been removed from the data using the handy SQL script I posted a while ago - Search every table and field in a SQL Server Database. However, what if you wanted to find which pages/fields in Umbraco have the search string? That's a little easier as we can check a couple of tables.
I needed to do it in v7 but I've popped together a v9 version as well as the Umbraco database schema is slightly different. Hopefully it's of help to someone else.
Umbraco v9
DECLARE @SearchStr nvarchar(100)
SET @SearchStr = '## YOUR STRING HERE ##'
DECLARE @SearchStringLen INT = LEN(@SearchStr)
SELECT
d.PageId,
d.NodeName,
d.FieldName,
CASE
WHEN LEN(d.[Value]) = @SearchStringLen THEN d.[Value]
ELSE SUBSTRING(d.[Value], CHARINDEX(@SearchStr, d.[Value]) - (@SearchStringLen / 2), (@SearchStringLen * 2))
END AS MatchPreview
FROM (
SELECT
n.text AS [PageId],
n.text AS [NodeName],
pt.Name AS [FieldName],
COALESCE(CAST(pd.varcharValue AS NVARCHAR(MAX)), CAST(pd.textValue AS NVARCHAR(MAX)), CAST(pd.intValue AS NVARCHAR(MAX)), CAST(pd.dateValue AS NVARCHAR(MAX))) AS [Value],
VersionDate
FROM
dbo.umbracoPropertyData pd
LEFT JOIN cmsPropertyType pt ON pd.propertytypeid=pt.id
LEFT JOIN dbo.umbracoContentVersion cv ON pd.versionId=cv.Id
LEFT JOIN dbo.umbracoNode n ON cv.nodeId=n.id
WHERE
COALESCE(CAST(pd.varcharValue AS NVARCHAR(MAX)), CAST(pd.textValue AS NVARCHAR(MAX)), CAST(pd.intValue AS NVARCHAR(MAX)), CAST(pd.dateValue AS NVARCHAR(MAX))) LIKE '%' + @SearchStr + '%'
AND cv.[current]=1
AND n.trashed=0
) d
Umbraco v7.4
In earlier versions of Umbraco, we have to search using the version date to get the most recent version of the content which is a bit of a pain but not too difficult if you're on a more recent version of SQL Server. This should give you back your results you're after.
DECLARE @SearchStr nvarchar(100)
SET @SearchStr = '## YOUR STRING HERE ##'
DECLARE @SearchStringLen INT = LEN(@SearchStr)
SELECT
d.PageId,
d.NodeName,
d.FieldName,
CASE
WHEN LEN(d.[Value]) = @SearchStringLen THEN d.[Value]
ELSE SUBSTRING(d.[Value], CHARINDEX(@SearchStr, d.[Value]) - (@SearchStringLen / 2), (@SearchStringLen * 2))
END AS MatchPreview
FROM (
SELECT
n.text AS [PageId],
n.text AS [NodeName],
pt.Name AS [FieldName],
COALESCE(CAST(pd.dataNvarchar AS NVARCHAR(MAX)), CAST(pd.dataNtext AS NVARCHAR(MAX)), CAST(pd.dataInt AS NVARCHAR(MAX)), CAST(pd.dataDate AS NVARCHAR(MAX))) AS [Value],
ROW_NUMBER() OVER (PARTITION BY cv.ContentId ORDER BY cv.VersionDate DESC) AS Row_Num,
VersionDate,
(SELECT TOP 1 MAX(icv.VersionDate) FROM dbo.cmsContentVersion icv WHERE icv.ContentId=cv.ContentId) AS [LatestDate]
FROM
dbo.cmsPropertyData pd
LEFT JOIN cmsPropertyType pt ON pd.propertytypeid=pt.id
LEFT JOIN dbo.cmsContentVersion cv ON pd.versionId=cv.VersionId
LEFT JOIN dbo.umbracoNode n ON cv.ContentId=n.id
WHERE
COALESCE(CAST(pd.dataNvarchar AS NVARCHAR(MAX)), CAST(pd.dataNtext AS NVARCHAR(MAX)), CAST(pd.dataInt AS NVARCHAR(MAX)), CAST(pd.dataDate AS NVARCHAR(MAX))) LIKE '%' + @SearchStr + '%'
AND n.trashed=0
) d
WHERE
Row_Num = 1
AND LatestDate = VersionDate