The last two holiday seasons have been unlike those of previous years, prompting retailers to adopt a different approach to ‘holiday readiness’ in 2022. Throughout the pandemic, retailers became increasingly dependent on their digital channels as the dominant, or potentially only, source of revenue.
Adding to the digital pressure, customers’ expectations have increased and their patience has decreased. Opportunities for competitive shopping are more abundant than ever. So, while the rules haven’t necessarily changed for this holiday season, the stakes for your retail business have never been higher.
Today, customers anticipate the holiday season earlier and expect it to last longer. With an expected duration of weeks or months, rather than days, we can no longer approach holiday readiness as preparation for a single large event, with special processes, 24/7 situation rooms, and other one-time activities.
We must create a sustainable operational cadence that allows our people, processes and technologies to respond to a series of peak events with little or no warning, and without causing undue impact to customers or business operations.
The most important step in planning for peak events is to clearly identify your business requirements. Having these requirements written down and agreed upon ahead of time will make decisions easier and streamline the response in times of uncertainty.
Requirements should be stated in business terms – X number of guest checkouts per second, Y number of logins within the first five minutes, or Z number of store locator lookups at midnight on national holidays – rather than in technical terms, such as number of servers, amount of storage, or available bandwidth. You should also clearly understand the priority of each capability and, in the event of a crisis, which ones can be allowed to fail. For example, a rule could be set to allow real-time inventory checking to be disabled if the checkout rate exceeds 1,000 per second.
Of course, it is possible that your business is simply unable or unwilling to make these trade-offs. Regardless, the conversation should take place so that these kinds of decisions can be made before the event, rather than under pressure.
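One way to keep such pre-agreed trade-offs unambiguous is to record them as data rather than as tribal knowledge. The sketch below is illustrative only: the class, the metric name `checkouts_per_second` and the feature name `real_time_inventory_check` are hypothetical, and the only value taken from the example above is the 1,000-per-second threshold.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DegradationRule:
    """One pre-agreed trade-off: shed `feature` when `metric` passes `threshold`."""
    feature: str       # capability that is allowed to fail
    metric: str        # business metric that triggers the rule
    threshold: float   # value above which the feature is switched off


# Hypothetical rule set, agreed with the business before the event.
RULES = [
    DegradationRule("real_time_inventory_check", "checkouts_per_second", 1000),
]


def features_to_disable(metrics: dict[str, float], rules=RULES) -> list[str]:
    """Return the features whose trigger metric currently exceeds its threshold."""
    return [
        rule.feature
        for rule in rules
        if metrics.get(rule.metric, 0.0) > rule.threshold
    ]
```

Calling `features_to_disable({"checkouts_per_second": 1200})` would return the inventory check as a candidate to switch off, while a reading of 800 would return an empty list. How the feature is actually disabled depends on your own platform and is outside this sketch.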
Now that you clearly know what requirements your business has, you must ensure that you are gathering and tracking all the signals that will indicate whether your systems are meeting those needs. Observability can be broken down into three primary categories: monitoring, alerting, and logging.
Monitoring encompasses the time series data that we typically collect, such as logins, checkouts or search queries, along with the telemetry of our underlying infrastructure, such as CPU levels, bandwidth, and storage. Beyond these volumetric indicators, you also need to track the qualitative experience of the customer. Use load testing tools before the event to establish where in your usage curve you start to see performance degradation.
During your event, use synthetic monitoring to detect when you start to deviate from the baseline performance. Finally, leverage real user monitoring to understand what your customers are actually seeing and, in the event of an incident, to quantify any negative impact on the customer experience. Akamai tools for monitoring include Event Center, Event Viewer, Reporting, mPulse, CloudTest, Test Center, and Web Security Analytics.
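To make the load-testing step concrete, the following sketch finds the point in a usage curve where response time starts to degrade. It assumes you have exported (load, response time) pairs from whichever load testing tool you use; the function name, the sample figures and the 1.5x tolerance are all illustrative assumptions, not part of any Akamai product.

```python
from typing import Optional


def degradation_point(samples: list[tuple[float, float]],
                      tolerance: float = 1.5) -> Optional[float]:
    """Find where performance starts to degrade in load-test results.

    samples: (load, response_time) pairs, in any order.
    The baseline is the response time at the lowest load tested.
    Returns the lowest load whose response time exceeds
    baseline * tolerance, or None if no sample degrades that far.
    """
    ordered = sorted(samples)
    if not ordered:
        return None
    baseline = ordered[0][1]
    for load, response_time in ordered:
        if response_time > baseline * tolerance:
            return load
    return None


# Illustrative figures only: requests per second and seconds per response.
results = [(100, 0.20), (200, 0.21), (400, 0.24), (800, 0.45), (1600, 1.90)]
knee = degradation_point(results)
```

With these made-up figures, the baseline is 0.20 seconds, so the first load level to exceed 0.30 seconds is 800 requests per second. The same baseline can then serve as the reference for synthetic monitoring during the event.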
Alerting simply means creating thresholds for your monitoring data to indicate when action is needed. Ideally, you can leverage adaptive thresholds, which can detect variations from the norm, rather than static thresholds. However, you should recognise that peak events by their very nature are variations from the norm, so you must still have well-defined absolute thresholds for the boundaries of safe operations for your application. Akamai tools for alerting include Control Center alerts, mPulse alerts, and Web Security Analytics alerts.
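The combination of adaptive and absolute thresholds described above can be sketched in a few lines. This is a minimal illustration that assumes a simple mean-and-standard-deviation test for "variation from the norm"; real alerting products use their own, usually more sophisticated, methods, and the function name and default of three standard deviations are assumptions.

```python
from statistics import mean, stdev


def should_alert(history, current, absolute_limit, sigmas=3.0):
    """Return (alert, reason) for the latest reading of a metric.

    history: recent readings that define "the norm".
    absolute_limit: the hard boundary of safe operation; it always wins,
    however normal the reading looks compared with recent history.
    """
    if current > absolute_limit:
        return True, "absolute"
    if len(history) >= 2:
        spread = stdev(history)
        # Adaptive check: flag readings far from the recent average.
        if spread > 0 and abs(current - mean(history)) > sigmas * spread:
            return True, "adaptive"
    return False, "ok"
```

During a peak event the adaptive check is likely to fire often, because the traffic itself is abnormal, so the absolute limit is the one that should page the team.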
Logging is often overlooked, but it is a critical piece of peak event management. If your monitoring detects something out of the ordinary, and your alerting wakes up the entire team, how will they know what needs to be done? It pays to ensure that you are logging the right things ahead of time to better anticipate what could go wrong and what to debug first when issues arise.
Ideally, you will place logs into a unified dataset that can be queried in one place, rather than having to look across multiple systems. You should also ensure that your logging pipelines can scale to meet or exceed the expected volume as defined by your business requirements. Akamai tools for logging include DataStream 2, SIEM Integration, and Edge Diagnostics.
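One common way to make logs from different systems combinable is to have every service emit the same structured fields. The sketch below uses Python's standard `logging` module to write JSON lines; the field names (`service`, `request_id` and so on) are assumptions chosen for illustration, and shipping the output to a central store is left to your own pipeline.

```python
import json
import logging


class JsonFormatter(logging.Formatter):
    """Format every record as one JSON object with a shared set of fields."""

    def format(self, record: logging.LogRecord) -> str:
        entry = {
            "timestamp": self.formatTime(record),
            "level": record.levelname,
            "service": record.name,
            "message": record.getMessage(),
            # Hypothetical correlation field, supplied via `extra=`.
            "request_id": getattr(record, "request_id", None),
        }
        return json.dumps(entry)


def get_logger(service: str) -> logging.Logger:
    """Return a logger for `service` that writes JSON lines to stderr."""
    logger = logging.getLogger(service)
    if not logger.handlers:
        handler = logging.StreamHandler()
        handler.setFormatter(JsonFormatter())
        logger.addHandler(handler)
    logger.setLevel(logging.INFO)
    return logger
```

A checkout service might then call `get_logger("checkout").info("payment authorised", extra={"request_id": "abc123"})`, and a search service would produce lines with the same keys, so both can be filtered on `request_id` once they land in the same dataset.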
During a peak event, the impact of small failures can quickly be exacerbated by load and is felt more acutely by your customers. Thorough preparation will make sure you are ready to thrive in these busy times.
For more, see Akamai’s Holiday Readiness blog series.
Jeff Darnton is commerce strategic engagement manager at Akamai.
This article was originally published in the Autumn 2022 issue of Technology Record.