Search every page in Umbraco for a string

As part of a rebrand, a customer had changed their email address in Umbraco and although it was instantly reflected on the website (yay for CMSs!), they wanted us to double check that the old email address had been completely removed.

Although we can't remove it from the internet entirely, we could reassure the customer that it had at least been removed from the data using the handy SQL script I posted a while ago - Search every table and field in a SQL Server Database. However, what if you wanted to find which pages/fields in Umbraco have the search string? That's a little easier as we can check a couple of tables.

I needed to do it in v7 but I've popped together a v9 version as well as the Umbraco database schema is slightly different. Hopefully it's of help to someone else.

Umbraco v9

DECLARE @SearchStr nvarchar(100)
SET @SearchStr = '## YOUR STRING HERE ##'

DECLARE @SearchStringLen INT = LEN(@SearchStr)

SELECT
    d.PageId,
    d.NodeName,
    d.FieldName,
    CASE 
        WHEN LEN(d.[Value]) = @SearchStringLen THEN d.[Value]
        ELSE SUBSTRING(d.[Value], CHARINDEX(@SearchStr, d.[Value]) - (@SearchStringLen / 2), (@SearchStringLen * 2))
    END AS MatchPreview
FROM (
    SELECT
        n.text AS [PageId],
        n.text AS [NodeName],
        pt.Name AS [FieldName],
        COALESCE(CAST(pd.varcharValue AS NVARCHAR(MAX)), CAST(pd.textValue AS NVARCHAR(MAX)), CAST(pd.intValue AS NVARCHAR(MAX)), CAST(pd.dateValue AS NVARCHAR(MAX))) AS [Value],
        VersionDate
    FROM 
        dbo.umbracoPropertyData pd
            LEFT JOIN cmsPropertyType pt ON pd.propertytypeid=pt.id
            LEFT JOIN dbo.umbracoContentVersion cv ON pd.versionId=cv.Id
            LEFT JOIN dbo.umbracoNode n ON cv.nodeId=n.id
    WHERE 
        COALESCE(CAST(pd.varcharValue AS NVARCHAR(MAX)), CAST(pd.textValue AS NVARCHAR(MAX)), CAST(pd.intValue AS NVARCHAR(MAX)), CAST(pd.dateValue AS NVARCHAR(MAX))) LIKE '%' + @SearchStr + '%'
        AND cv.[current]=1
        AND n.trashed=0
) d

Umbraco v7.4

In earlier versions of Umbraco, we have to search using the version date to get the most recent version of the content which is a bit of a pain but not too difficult if you're on a more recent version of SQL Server. This should give you back your results you're after.

DECLARE @SearchStr nvarchar(100)
SET @SearchStr = '## YOUR STRING HERE ##'

DECLARE @SearchStringLen INT = LEN(@SearchStr)

SELECT
    d.PageId,
    d.NodeName,
    d.FieldName,
    CASE 
        WHEN LEN(d.[Value]) = @SearchStringLen THEN d.[Value]
        ELSE SUBSTRING(d.[Value], CHARINDEX(@SearchStr, d.[Value]) - (@SearchStringLen / 2), (@SearchStringLen * 2))
    END AS MatchPreview
FROM (
    SELECT
        n.text AS [PageId],
        n.text AS [NodeName],
        pt.Name AS [FieldName],
        COALESCE(CAST(pd.dataNvarchar AS NVARCHAR(MAX)), CAST(pd.dataNtext AS NVARCHAR(MAX)), CAST(pd.dataInt AS NVARCHAR(MAX)), CAST(pd.dataDate AS NVARCHAR(MAX))) AS [Value],
        ROW_NUMBER() OVER (PARTITION BY cv.ContentId ORDER BY cv.VersionDate DESC) AS Row_Num,
        VersionDate,
        (SELECT TOP 1 MAX(icv.VersionDate) FROM dbo.cmsContentVersion icv WHERE icv.ContentId=cv.ContentId) AS [LatestDate]
    FROM 
        dbo.cmsPropertyData pd
            LEFT JOIN cmsPropertyType pt ON pd.propertytypeid=pt.id
            LEFT JOIN dbo.cmsContentVersion cv ON pd.versionId=cv.VersionId
            LEFT JOIN dbo.umbracoNode n ON cv.ContentId=n.id
    WHERE 
        COALESCE(CAST(pd.dataNvarchar AS NVARCHAR(MAX)), CAST(pd.dataNtext AS NVARCHAR(MAX)), CAST(pd.dataInt AS NVARCHAR(MAX)), CAST(pd.dataDate AS NVARCHAR(MAX))) LIKE '%' + @SearchStr + '%'
        AND n.trashed=0
) d
WHERE 
    Row_Num = 1
    AND LatestDate = VersionDate

Subscribe to TSD

Don’t miss out on the latest posts. Sign up now to get access to the library of members-only posts.
jamie@example.com
Subscribe