Last week I pair programmed with fellow Sitecore MVP Akshay Sura on a class that would serve as an <httpRequestBegin> pipeline processor to serve up ‘Page Not Found’ content along with a 404 status code when a user requests a page that does not exist as an Item in the Sitecore XP.
In this solution, the page does not redirect to the ‘Not Found’ page since this results in a 302 status code which isn’t ideal for SEO. Instead, the ‘Page Not Found’ content should appear on the page with the ‘Not Found’ request.
We decided not to have our <httpRequestBegin> pipeline processor class inherit from Sitecore.Pipelines.HttpRequest.ExecuteRequest (which lives in Sitecore.Kernel.dll), unlike the solutions in the following blog posts, which do subclass it:
- Return a 404 Not Found status code when the ItemNotFound page is loaded by Sitecore MVP Ruud van Falier
- Return a 404 Not Found for invalid Sitecore item by someone at NTT Data
Why? The solutions in the above are a bit fragile given that they are subclassing Sitecore.Pipelines.HttpRequest.ExecuteRequest which is an example of tight coupling — code changes in Sitecore.Pipelines.HttpRequest.ExecuteRequest could potentially break code within the subclasses.
Further, the implementations of the RedirectOnItemNotFound() method in the above blog posts don’t redirect unless an Exception is encountered which is a bit awkward given the name of the method.
I’m not going to share the exact solution that Akshay and I built in this blog post. Instead, I’m going to share one that is quite similar. The solution below is actually an enhancement of what we came up with: I added some caching and moved more settings into Sitecore configuration so that the solution is more extensible and easier to change:
using System;
using System.Collections.Generic;
using System.Linq;
using System.Web;

using Sitecore.Data;
using Sitecore.Data.Items;
using Sitecore.Diagnostics;
using Sitecore.Links;
using Sitecore.Pipelines.HttpRequest;
using Sitecore.Web;

using Sitecore.Sandbox.Caching;

namespace Sitecore.Sandbox.Pipelines.HttpRequest
{
    public class HandleItemNotFound : HttpRequestProcessor
    {
        // These properties are populated from the processor's child elements in the patch configuration file.
        private string TargetWebsite { get; set; }

        private string StatusDescription { get; set; }

        private List<string> RelativeUrlPrefixesToIgnore { get; set; }

        protected ICacheProvider CacheProvider { get; private set; }

        protected string CacheKey { get; private set; }

        public HandleItemNotFound()
        {
            RelativeUrlPrefixesToIgnore = new List<string>();
        }

        public override void Process(HttpRequestArgs args)
        {
            Assert.ArgumentNotNull(args, "args");

            // Exit when an Item was resolved, the site is not the target site, or the url should be ignored.
            bool shouldExit = Sitecore.Context.Item != null
                                || !string.Equals(Context.Site.Name, TargetWebsite, StringComparison.CurrentCultureIgnoreCase)
                                || StartsWithPrefixToIgnore(args.Url.FilePath);
            if (shouldExit)
            {
                return;
            }

            // The path to the 'Page Not Found' Item is an attribute on the site node in configuration.
            string notFoundPageItemPath = Sitecore.Context.Site.Properties["notFoundPageItemPath"];
            if (string.IsNullOrWhiteSpace(notFoundPageItemPath))
            {
                return;
            }

            Database database = GetDatabase();
            if (database == null)
            {
                return;
            }

            Item notFoundItem = database.GetItem(notFoundPageItemPath);
            if (notFoundItem == null)
            {
                return;
            }

            string notFoundContent = GetNotFoundPageContent(args, database, notFoundPageItemPath);
            if (!string.IsNullOrWhiteSpace(notFoundContent))
            {
                // Serve the 'Page Not Found' content in place with a 404 status code; no redirect.
                args.Context.Response.TrySkipIisCustomErrors = true;
                args.Context.Response.StatusCode = 404;
                if (!string.IsNullOrWhiteSpace(StatusDescription))
                {
                    args.Context.Response.StatusDescription = StatusDescription;
                }

                args.Context.Response.Write(notFoundContent);
                args.Context.Response.End();
                return;
            }

            // Log.Warn() does not format its message, so the message is formatted before it is passed in.
            Log.Warn(string.Format("The 'Not Found' page {0} shows no content when rendered!", notFoundItem.Paths.FullPath), this);
        }

        protected virtual bool StartsWithPrefixToIgnore(string url)
        {
            return !string.IsNullOrWhiteSpace(url)
                    && RelativeUrlPrefixesToIgnore.Any(prefix => url.StartsWith(prefix));
        }

        protected virtual Database GetDatabase()
        {
            return Context.ContentDatabase ?? Context.Database;
        }

        protected virtual string GetNotFoundPageContent(HttpRequestArgs args, Database database, string notFoundPageItemPath)
        {
            Assert.ArgumentNotNull(args, "args");
            Assert.ArgumentNotNull(database, "database");
            Assert.ArgumentNotNullOrEmpty(notFoundPageItemPath, "notFoundPageItemPath");

            // Return the cached content when it is available.
            string content = GetNotFoundPageContentFromCache();
            if (!string.IsNullOrWhiteSpace(content))
            {
                return content;
            }

            Item notFoundItem = database.GetItem(notFoundPageItemPath);
            if (notFoundItem == null)
            {
                return string.Empty;
            }

            string domain = GetDomain(args);
            string url = LinkManager.GetItemUrl(notFoundItem);
            try
            {
                // Request the rendered 'Page Not Found' page and cache its markup.
                content = WebUtil.ExecuteWebPage(string.Concat(domain, url));
                AddNotFoundPageContentFromCache(content);
                return content;
            }
            catch (Exception ex)
            {
                Log.Error(string.Format("{0} Error - domain: {1}, url: {2}", ToString(), domain, url), ex, this);
            }

            return string.Empty;
        }

        protected virtual string GetNotFoundPageContentFromCache()
        {
            Assert.IsNotNull(CacheProvider, "CacheProvider must be set in configuration!");
            return CacheProvider[GetCacheKey()] as string;
        }

        // Despite its name, this method adds the content to the cache.
        protected virtual void AddNotFoundPageContentFromCache(string content)
        {
            Assert.IsNotNull(CacheProvider, "CacheProvider must be set in configuration!");
            if (string.IsNullOrWhiteSpace(content))
            {
                return;
            }

            CacheProvider.Add(GetCacheKey(), content);
        }

        protected virtual string GetCacheKey()
        {
            Assert.IsNotNullOrEmpty(CacheKey, "CacheKey must be set in configuration!");
            return CacheKey;
        }

        protected virtual string GetDomain(HttpRequestArgs args)
        {
            Assert.ArgumentNotNull(args, "args");
            return args.Context.Request.Url.GetComponents(UriComponents.Scheme | UriComponents.Host, UriFormat.Unescaped);
        }
    }
}
The Process() method above first determines whether it should run at all. It runs only when Sitecore.Context.Item is null (meaning that previous <httpRequestBegin> pipeline processors could not determine which Sitecore Item should be served for the request), when the current site is the configured target website, and when the relative url does not start with one of the prefixes to ignore. For example, we don’t want this code to run for media library Item requests, which all start with /~/ in a stock Sitecore instance.
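To make the exit conditions easier to reason about, here is a minimal sketch of the same checks as a pure function with no Sitecore dependencies. The NotFoundGuard class and its ShouldHandle() method are hypothetical names that are not part of the processor above, and the sketch uses ordinal, case-insensitive comparisons, which is an assumption that differs slightly from the culture-sensitive comparisons in the class:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper for illustration only; it mirrors the exit checks in Process().
public static class NotFoundGuard
{
    // Returns true when the 'Page Not Found' handling should run for the request.
    public static bool ShouldHandle(
        bool contextItemResolved,
        string currentSiteName,
        string targetSiteName,
        string relativeUrl,
        IEnumerable<string> prefixesToIgnore)
    {
        // An earlier processor already resolved an Item, so there is nothing to handle.
        if (contextItemResolved)
        {
            return false;
        }

        // Only handle requests for the configured target website.
        if (!string.Equals(currentSiteName, targetSiteName, StringComparison.OrdinalIgnoreCase))
        {
            return false;
        }

        // Skip urls that start with an ignored prefix, such as /~/ for media requests.
        bool ignored = !string.IsNullOrWhiteSpace(relativeUrl)
                        && prefixesToIgnore.Any(prefix => relativeUrl.StartsWith(prefix, StringComparison.OrdinalIgnoreCase));

        return !ignored;
    }
}
```

With the configuration used in this post, a request for /nope on the site named website would be handled, while a request starting with /~/ would be skipped.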
Further, the path to the ‘Page Not Found’ Item must be set on the site node within Sitecore configuration. If this is not set, then the code will not execute.
If the code should execute, it tries to grab the ‘Page Not Found’ content from cache — the class above reuses the CacheProvider class which I wrote for my post on storing data outside of the Sitecore XP but using the Sitecore API.
If this does not exist in cache, we basically make a request to the ‘Page Not Found’ Item using Sitecore.Web.WebUtil.ExecuteWebPage; put this content in cache; and then return it to the Process() method.
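The caching flow described above can be sketched in isolation as follows. This is only an illustration under stated assumptions: NotFoundContentCache is a hypothetical class backed by a ConcurrentDictionary, not the CacheProvider class used by the processor, and the fetchContent delegate stands in for the call to Sitecore.Web.WebUtil.ExecuteWebPage:

```csharp
using System;
using System.Collections.Concurrent;

// Hypothetical stand-in for the cache used by the processor; for illustration only.
public sealed class NotFoundContentCache
{
    private readonly ConcurrentDictionary<string, string> _entries =
        new ConcurrentDictionary<string, string>();

    // Returns cached content when present; otherwise fetches it, caches it and returns it.
    public string GetOrFetch(string cacheKey, Func<string> fetchContent)
    {
        if (_entries.TryGetValue(cacheKey, out string cached) && !string.IsNullOrWhiteSpace(cached))
        {
            return cached;
        }

        string content;
        try
        {
            // In the processor, this is where the 'Page Not Found' page is requested.
            content = fetchContent();
        }
        catch (Exception)
        {
            // The processor logs the error here; the sketch just reports "no content".
            return string.Empty;
        }

        // Empty content is never cached, so the next request tries again.
        if (string.IsNullOrWhiteSpace(content))
        {
            return string.Empty;
        }

        _entries[cacheKey] = content;
        return content;
    }
}
```

One consequence of this design is that the rendered markup is fetched only once per cache lifetime, so changes to the ‘Page Not Found’ Item will not show up until the cache entry is cleared.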
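If there is content to display, we set the 404 status code (and the status description, if one is configured), write the content to the response stream and end the response. If there is no content, a warning is written to the Sitecore log.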
I then glued everything together using the following patch configuration file:
<?xml version="1.0" encoding="utf-8" ?>
<configuration xmlns:patch="http://www.sitecore.net/xmlconfig/">
  <sitecore>
    <pipelines>
      <httpRequestBegin>
        <processor patch:before="processor[@type='Sitecore.Pipelines.HttpRequest.ExecuteRequest, Sitecore.Kernel']"
                   type="Sitecore.Sandbox.Pipelines.HttpRequest.HandleItemNotFound, Sitecore.Sandbox">
          <TargetWebsite>website</TargetWebsite>
          <StatusDescription>Page Not Found</StatusDescription>
          <RelativeUrlPrefixesToIgnore hint="list">
            <Prefix>/~/</Prefix>
          </RelativeUrlPrefixesToIgnore>
          <CacheProvider type="Sitecore.Sandbox.Caching.CacheProvider, Sitecore.Sandbox">
            <param desc="cacheName">[404]</param>
            <param desc="cacheSize">500KB</param>
          </CacheProvider>
          <CacheKey>404Content</CacheKey>
        </processor>
      </httpRequestBegin>
    </pipelines>
    <sites>
      <site name="website">
        <patch:attribute name="notFoundPageItemPath">/sitecore/content/Home/404</patch:attribute>
      </site>
    </sites>
  </sitecore>
</configuration>
In the above configuration file, I am injecting this <httpRequestBegin> pipeline processor to execute before the Sitecore.Pipelines.HttpRequest.ExecuteRequest <httpRequestBegin> pipeline processor.
Let’s see this in action.
I set up an Item in Sitecore to serve as my ‘Page Not Found’ page Item:
After publishing and navigating to a page url that does not exist in my instance, I get the following:
As you can see, we get the rendered page content for the 404 Item yet stay on the original requested nonexistent page (/nope).
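If you would rather verify this from code than from the browser, something like the following rough sketch could be used. NotFoundCheck and FixedResponseHandler are hypothetical names that I made up for illustration; the handler only exists so that the check can be tried without a running Sitecore instance:

```csharp
using System;
using System.Net;
using System.Net.Http;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical helper for illustration only.
public static class NotFoundCheck
{
    // Returns true when the url answers with a 404 status code and a non-empty body.
    public static async Task<bool> ServesNotFoundInPlaceAsync(HttpClient client, string url)
    {
        using (HttpResponseMessage response = await client.GetAsync(url))
        {
            string body = await response.Content.ReadAsStringAsync();
            return response.StatusCode == HttpStatusCode.NotFound
                    && !string.IsNullOrWhiteSpace(body);
        }
    }
}

// Hypothetical stub that returns a fixed response, so the check can run offline.
public sealed class FixedResponseHandler : HttpMessageHandler
{
    private readonly HttpStatusCode _statusCode;
    private readonly string _body;

    public FixedResponseHandler(HttpStatusCode statusCode, string body)
    {
        _statusCode = statusCode;
        _body = body;
    }

    protected override Task<HttpResponseMessage> SendAsync(HttpRequestMessage request, CancellationToken cancellationToken)
    {
        var response = new HttpResponseMessage(_statusCode)
        {
            Content = new StringContent(_body)
        };
        return Task.FromResult(response);
    }
}
```

Against a real instance, the client should be created with new HttpClient(new HttpClientHandler { AllowAutoRedirect = false }). Otherwise a 302 redirect to a ‘Not Found’ page would be followed silently and could look the same as the in-place 404 described in this post.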
If you have any comments or thoughts on this, please share in a comment.
