Retail technology is becoming increasingly complex as the number of consumer interactions continues to rise. From point-of-sale (POS) systems to quick response (QR) systems, building management, mobile devices, and IoT, retailers across Australia are under pressure to provide seamless experiences to meet growing customer expectations and stay ahead of the competition.

The influx of modern technologies also brings in a wealth of data from both digital and physical retail channels, making it essential for retailers to update their IT infrastructures and operations to leverage this data effectively while serving customers efficiently.

We’ve often heard that building operational resilience is key, but what does this resilience look like, and why should retailers care?

Decoding operational resilience

Operational resilience is not to be confused with reliability. Reliability focuses on the ability of systems and processes to function without failure under normal conditions, whereas operational resilience is about building for the fact that they will fail.

Operational resilience is an organisation’s ability to resist, adapt to, and recover from any type of disruption. As such, it is not limited to technical issues: it also depends on the organisation having the processes and culture in place to deal with adversity.

If a retail establishment is not truly resilient, even a small disruption in workflow can have ripple effects across several different areas of an organisation. For example, an incident in a retailer’s supply chain can impact how, when and if a customer receives their purchases, which can have wider impacts on how they view and interact with the retailer in future transactions.

We’ve seen this first-hand with the recent global IT outage. When it comes to unplanned downtime and disruptions, PagerDuty research on Australian customer-facing incidents shows that costs can reach $7,011 per minute, or more than $1 million per incident on average.
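To put those two figures side by side, here is a back-of-the-envelope sketch in Python. It assumes, purely for illustration, that cost accrues linearly at the quoted per-minute rate; the research cited above does not say how the average was derived, and the function names are hypothetical.

```python
import math

# Per-minute figure quoted in the article; linear accrual is an assumption.
COST_PER_MINUTE = 7_011


def incident_cost(minutes, cost_per_minute=COST_PER_MINUTE):
    """Estimated cost of an incident lasting `minutes`, assuming linear cost."""
    return minutes * cost_per_minute


def minutes_to_reach(total_cost, cost_per_minute=COST_PER_MINUTE):
    """Whole minutes of downtime needed to reach `total_cost`."""
    return math.ceil(total_cost / cost_per_minute)


if __name__ == "__main__":
    minutes = minutes_to_reach(1_000_000)
    print(minutes, incident_cost(minutes))
```

Under that linear assumption, roughly two hours and twenty minutes of downtime at the quoted rate would pass the $1 million mark.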

As retailers evolve and engage with customers through various digital and physical touchpoints, operational resilience becomes even more critical. Retail channels are now so interconnected that a slightly longer wait time on a retailer’s website can affect sales in its physical brick-and-mortar stores, and vice versa.

Retailers who prioritise building operational resilience may not see instant results, but they can reap the benefits over the longer term, whether during peak retail periods (like Black Friday), when online traffic surges pose a risk, or when a store’s POS system fails unexpectedly.

Where should retailers start?

According to PagerDuty research, 40% of retail IT leaders say that customer-impacting incidents have increased. This is why retailers should focus on investing in modern and robust incident response processes and systems.

For successful incident response, having meaningful and targeted insights that cut through the noise of modern complex systems is crucial. This allows retailers to orchestrate the most effective response and resolve incidents faster and more efficiently, ultimately improving key operational metrics such as mean-time-to-resolution (MTTR). To achieve this, retail organisations should look at the end-to-end incident lifecycle and where it can be streamlined.
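As a rough illustration of the metric mentioned above, MTTR is commonly calculated as the average time between an incident being opened and being resolved. The sketch below assumes each incident is recorded as a simple (opened, resolved) pair; the function name and record shape are hypothetical, not any particular product’s API.

```python
from datetime import datetime, timedelta


def mean_time_to_resolution(incidents):
    """Average duration of `incidents`, a list of (opened, resolved) datetimes."""
    if not incidents:
        raise ValueError("at least one resolved incident is required")
    # Sum the individual durations, starting from a zero timedelta.
    total = sum((resolved - opened for opened, resolved in incidents), timedelta())
    return total / len(incidents)


if __name__ == "__main__":
    start = datetime(2024, 11, 29, 9, 0)
    sample = [
        (start, start + timedelta(minutes=30)),
        (start, start + timedelta(minutes=90)),
    ]
    print(mean_time_to_resolution(sample))
```

Tracking this figure over time, rather than for a single incident, is what shows whether changes to the incident lifecycle are actually helping.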

Three things to keep in mind: assess, resolve and learn.

  1. When an incident occurs, responders must assess the impact to both the business and the customer, and mobilise the relevant stakeholders accordingly. 
  2. Based on the operational resilience of the retailer, resolving an incident could mean anything from just restoring service to fixing the underlying problem. A clear and robust plan to resolution should be well defined and well known.
  3. Most importantly, they must learn through incident post-mortems and aim to continuously improve their ability to deal with operational adversity.

Real-time response is key

Building operational resilience requires retailers, in part, to adopt an “always-on” approach, characterised by proactive, intelligent, and automated responses to manage operations effectively.

The speed of risk detection and resolution is critical to incident management, which is where real-time response comes into play. To keep up with customer expectations, retailers now require diverse monitoring and observability solutions capable of collecting real-time data signals from all components of their retail ecosystem. This complexity drives the need for a central context engine that can correlate signals and filter the important ones from the noise.
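To make “correlate and filter” concrete, here is a deliberately simplified sketch. It drops low-severity signals and groups the rest by service when they arrive close together in time. The Signal fields, the severity scale, and the five-minute window are all assumptions made for illustration; real event-intelligence tooling is considerably more sophisticated.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Signal:
    source: str       # hypothetical: which monitoring tool raised it
    service: str      # hypothetical: e.g. "pos" or "web"
    severity: int     # assumed scale: 1 (low) to 5 (critical)
    timestamp: float  # seconds since some reference point


def correlate(signals, window_seconds=300, min_severity=3):
    """Group related signals per service; ignore anything below min_severity."""
    groups = []
    latest_group = {}  # service -> most recent group for that service
    for signal in sorted(signals, key=lambda s: s.timestamp):
        if signal.severity < min_severity:
            continue  # filtered out as noise
        group = latest_group.get(signal.service)
        if group and signal.timestamp - group[-1].timestamp <= window_seconds:
            group.append(signal)  # same burst, same likely incident
        else:
            group = [signal]
            latest_group[signal.service] = group
            groups.append(group)
    return groups
```

Each resulting group can then be treated as one candidate incident rather than several separate alerts, which is the point of centralising context.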

Additionally, to further mitigate these risks, retailers need to be sure that their incident management systems and workflows enable a real-time response powered by actionable insights so that service is restored as quickly as possible.

Where automation can help

Lastly, automation plays a pivotal role in achieving the operational efficiency necessary for long-term resilience. By integrating AI and automation into strategic operational processes, retailers can reduce reliance on manual intervention in incident management.

Automation can allow for faster issue detection and triage, streamline notifications to relevant support teams, remove the manual response to known issues, and ultimately reduce incident cost and duration.
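A minimal sketch of that idea follows: known issues trigger a predefined automated fix, and anything else is routed to the team that owns the affected service. Every name in it (the issue codes, runbook names, and team names) is hypothetical and chosen only for illustration.

```python
# Hypothetical mapping of known issue codes to automated runbooks.
KNOWN_ISSUE_RUNBOOKS = {
    "pos_printer_offline": "restart_print_service",
    "web_cache_stale": "flush_cdn_cache",
}

# Hypothetical mapping of services to the team that supports them.
TEAM_BY_SERVICE = {
    "pos": "store-systems",
    "web": "ecommerce",
}


def triage(service, issue_code):
    """Decide the first response step for an incoming issue."""
    runbook = KNOWN_ISSUE_RUNBOOKS.get(issue_code)
    if runbook is not None:
        # Known issue: no manual response needed for the first step.
        return {"action": "run_runbook", "target": runbook}
    # Unknown issue: notify the owning team, with a fallback on-call group.
    team = TEAM_BY_SERVICE.get(service, "operations-on-call")
    return {"action": "notify_team", "target": team}
```

For example, triage("pos", "pos_printer_offline") selects the automated runbook, while an unrecognised issue on the "web" service is sent to the ecommerce team.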

With the power of automation, PagerDuty helped a major Australian retailer transform and streamline its operations during the launch of its in-house website. PagerDuty used both automation and event intelligence to give the retailer situational awareness, help it gain insight into the root cause of incidents, and provide intelligent recommendations over time.

David Ridge is head of solutions consulting for Asia Pacific & Japan at PagerDuty.