Caching Data
Introduction
In this article I provide one solution for generating and sharing data across multiple nodes of a Sitefinity website.
There are a few options for caching data with Sitefinity and I thought I would share one which is not exactly out of the box.
The Considerations
- I needed to provide an API of JSON data based on the data stored in Sitefinity.
- The structure and design of this data were optimised for efficient client-side consumption. The process of generating this data took about 45 seconds.
- The process was database intensive and needed to be refreshed every hour.
- The final JSON size was around 500 KB.
- There were three nodes in the production environment.
When we were on Sitefinity 11.2 this process took 160 seconds. Upgrading to 12.2 (with its newer release of OpenAccess) brought this down to 45 seconds.
The first approach was to create a Web API that would generate the data and cache it for an hour.
But anyone who hits the API when the cache is cold has to wait 45 seconds, and with three nodes that means three unlucky people every hour. It also means a very intense database hit every hour as all three nodes regenerate the data. Worse, each node could easily end up serving different data, since there is no control over when the API is first called on each node.
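Roughly sketched (GenerateFinderResults is a hypothetical stand-in for the 45-second generation code), that first approach looked like this:

[HttpGet]
[Route("api/finder-data")]
public IHttpActionResult GetFinderData()
{
    // Each node keeps its own in-memory copy, so every node pays
    // the ~45 second generation cost once per hour.
    var response = CacheManager["FinderData"] as FinderResults;
    if (response == null)
    {
        response = GenerateFinderResults(); // the slow, database-heavy call
        CacheManager.Add("FinderData", response, CacheItemPriority.Normal, null, new AbsoluteTime(TimeSpan.FromHours(1)));
    }
    return this.Ok(response);
}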
I'll skip the iterations and thought process and get straight to the solution.
The Solution
To reduce the database impact, the data generation was moved to a Sitefinity scheduled task set to run every hour.
The task saves the generated data as a physical file in Sitefinity's document library. The idea is twofold: first, the library is a central store for all instances; second, Sitefinity automatically caches the document and expires that cache when the document is updated.
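As a minimal sketch of the task class (registering it to run hourly via the SchedulingManager is omitted here; GenerateFinderResults is a hypothetical stand-in for the real generation code, and UpdateDataFile is shown further down):

public class FinderDataTask : Telerik.Sitefinity.Scheduling.ScheduledTask
{
    public override string TaskName
    {
        get { return typeof(FinderDataTask).FullName; }
    }

    public override void ExecuteTask()
    {
        // Runs on a single node, so the database-heavy generation
        // happens once per hour instead of once per node.
        var results = GenerateFinderResults();
        var json = JsonConvert.SerializeObject(results);
        using (var dataStream = new MemoryStream(Encoding.UTF8.GetBytes(json)))
        {
            UpdateDataFile(dataStream);
        }
    }
}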
Below is the code that I use. First I serialise the data set and put it into a MemoryStream. I then either create a new document, in case this is the first run (or some schmuck deleted the file from the library), or update the file if it exists.
var json = JsonConvert.SerializeObject(results);
var dataStream = new MemoryStream(Encoding.UTF8.GetBytes(json));
UpdateDataFile(dataStream);
public readonly static Guid documentMasterId = new Guid("b87c31b6-fc09-48c0-989b-24a2d2bc1d69");

private void UpdateDataFile(MemoryStream dataStream)
{
    LibrariesManager librariesManager = LibrariesManager.GetManager();
    Document master = librariesManager.GetDocuments().Where(d => d.Id == documentMasterId).FirstOrDefault();

    if (master == null)
    {
        // First run (or the file was deleted): create the document.
        CreateMasterDocument(dataStream);
    }
    else
    {
        // Check out a temp version, upload the new content, then check it back in.
        // documentTitle is a class-level field (not shown).
        Document temp = librariesManager.Lifecycle.CheckOut(master) as Document;
        temp.Title = documentTitle;
        temp.LastModified = DateTime.UtcNow;
        temp.Urls.Clear();
        temp.UrlName = documentTitle;
        temp.MediaFileUrlName = documentTitle;
        librariesManager.Upload(temp, dataStream, ".json");
        librariesManager.RecompileAndValidateUrls(temp);
        master = librariesManager.Lifecycle.CheckIn(temp) as Document;
        librariesManager.SaveChanges();

        // Publish the new version through the workflow.
        var bag = new Dictionary<string, string>();
        bag.Add("ContentType", typeof(Document).FullName);
        WorkflowManager.MessageWorkflow(documentMasterId, typeof(Document), null, "Publish", false, bag);
    }
}

private void CreateMasterDocument(MemoryStream dataStream)
{
    LibrariesManager librariesManager = LibrariesManager.GetManager();
    Document document = librariesManager.CreateDocument(documentMasterId);

    DocumentLibrary documentLibrary = librariesManager.GetDocumentLibraries().Where(d => d.Title == "[Library Title]").SingleOrDefault();
    document.Parent = documentLibrary;
    document.Title = documentTitle;
    document.DateCreated = DateTime.UtcNow;
    document.PublicationDate = DateTime.UtcNow;
    document.LastModified = DateTime.UtcNow;
    document.UrlName = documentTitle;
    document.MediaFileUrlName = documentTitle;

    librariesManager.Upload(document, dataStream, ".json");
    librariesManager.RecompileAndValidateUrls(document);
    librariesManager.SaveChanges();

    // Publish the new document through the workflow.
    var bag = new Dictionary<string, string>();
    bag.Add("ContentType", typeof(Document).FullName);
    WorkflowManager.MessageWorkflow(documentMasterId, typeof(Document), null, "Publish", false, bag);
}
Then in my Web API method, I retrieve the file and return the result to the calling client.
[HttpGet]
[Route("api/finder-data")]
public IHttpActionResult GetFinderData()
{
    var response = new FinderResults();

    LibrariesManager librariesManager = LibrariesManager.GetManager();
    Document document = librariesManager.GetDocuments().Where(d => d.Id == documentMasterId).FirstOrDefault();
    if (document == null)
    {
        return this.NotFound();
    }

    // Download the stored JSON and deserialise it for the client.
    using (var stream = librariesManager.Download(document))
    using (var reader = new StreamReader(stream))
    {
        string json = reader.ReadToEnd();
        response = JsonConvert.DeserializeObject<FinderResults>(json);
    }

    return this.Ok(response);
}
Out of the box, Sitefinity has an internal cache, so the retrieved document won't always be pulled from storage (blob or database), and internal cache dependencies and notifications take care of clearing it across multiple nodes when you update the file.
You may be thinking that I should cache these results in the API using Sitefinity's CacheManager and avoid that retrieval and processing time altogether. Good, you should be thinking this. But it needs a little more thought first.
If you are on a single node, there are no issues and it is a definite yes. But if you will be running on multiple nodes, we have to account for the fact that the Sitefinity CacheManager does not notify other nodes.
Our cache code will look something like this.
string cacheKey = "FinderGeneration";
var response = CacheManager[cacheKey] as FinderResults;
if (response == null)
{
    // Code to get the data (the document download shown above) goes here
    CacheManager.Add(cacheKey, response, CacheItemPriority.Normal, null, new AbsoluteTime(TimeSpan.FromHours(1)));
}
return this.Ok(response);
Our scheduled task runs on one node in the group, and we add a final step to clear or update the cache when it finishes. On the node where this happens the API will serve the latest data, but all the other nodes will keep serving the previous version until their cache times out.
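That final step can be as simple as dropping the key once the document has been updated (a sketch, using the same CacheManager as in the API code):

// Last step of the scheduled task: remove the cached copy on this node
// so the next API call re-reads the freshly generated document.
CacheManager.Remove("FinderGeneration");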
Remember, the Sitefinity cache is an in-memory cache and thus stored in the physical memory of the server it is running on. Fast, but local.
Sitefinity does not support (or recommend) extending the cache notification service to communicate with other nodes and tell them to clear a cache item such as this.
Depending on your usage and requirements this may or may not be an issue.
If it is, one option would be to use a centralised Redis cache instance and maintain the cached data there.
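As a rough sketch using the StackExchange.Redis client (this is not something Sitefinity provides out of the box, and the host name and key are placeholders):

// Every node reads and writes the same Redis instance, so an update
// from the scheduled task is immediately visible everywhere.
var redis = ConnectionMultiplexer.Connect("[your-redis-host]:6379");
var db = redis.GetDatabase();

// Scheduled task: store the freshly generated JSON with a one-hour TTL.
db.StringSet("FinderData", json, TimeSpan.FromHours(1));

// API method: read it back on any node.
string cached = db.StringGet("FinderData");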
Another option is to create another API allowing us to call the other nodes and tell them to clear a certain cache key. The main thing here is to ensure this endpoint is not publicly accessible. You can use Sitefinity's Load Balancing configuration to get the node URLs you need to contact. A rough sketch follows.
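Something like this, where nodeUrls comes from the Load Balancing configuration (the exact configuration property varies by Sitefinity version) and /internal-api/clear-cache is a hypothetical endpoint each node exposes, protected here by a shared-secret header:

// Hypothetical fan-out: ask every node to drop a cache key.
private static async Task ClearCacheOnAllNodes(IEnumerable<string> nodeUrls, string cacheKey)
{
    using (var client = new HttpClient())
    {
        // Shared secret so the endpoint cannot be abused publicly.
        client.DefaultRequestHeaders.Add("X-Cache-Secret", "[your shared secret]");
        foreach (var url in nodeUrls)
        {
            // Each node's clear-cache endpoint removes the key from its local CacheManager.
            await client.PostAsync(url.TrimEnd('/') + "/internal-api/clear-cache?key=" + cacheKey, null);
        }
    }
}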
But here is what I did. I added a timestamp to the result set to indicate when it was generated. I also added another API that returned just this timestamp, and cached that. On the client, the large data set was stored in local storage. When the page loaded, an AJAX call checked the generation-date API. If the server returned a later date, the client fetched a fresh copy of the larger data set; otherwise it continued to use what was in local storage.
I am not going to show the code for this, as it still doesn't really solve the underlying problem, but I mention it to give you some ideas and help you weigh the problem against your own requirements.
I have created a new feature request for Sitefinity to extend their current cache notification service. You can find it at https://sitefinity.ideas.aha.io/ideas/SF-I-2441. If you agree, please vote it up.
Thanks for reading and feel free to comment - Darrin Robertson
If this was really helpful and you'd like to buy me a coffee, yay! You can.