Degraded performance: customer portals
Incident Report for 12return
Postmortem

During the night of Monday 11 December to Tuesday 12 December, our storage hosting provider (Softlayer) performed what should have been non-disruptive maintenance on the storage solution used by our Europe region. This storage solution stores and serves, among other things, the shipping labels.

During the maintenance window there were some small disruptions of the service. These were not investigated further, since the service returned to a stable state after the maintenance window. However, the next day new issues with the storage solution started to appear. A small percentage of the write attempts to this storage solution failed, and as a result a number of shipping labels could not be created.

11-12-2017 – 19:00 UTC Softlayer starts maintenance on the storage solution. Within minutes, storage errors are detected.

11-12-2017 – 22:00 UTC The Softlayer maintenance window ends and no further errors are detected. No further investigation is deemed necessary.

12-12-2017 – 08:35 UTC The first label creation errors start to appear, and we start to monitor and investigate the issue. Because of the long intervals between errors, it takes some time to pinpoint the root cause of the issues.

12-12-2017 – 11:00 UTC The cause and scope have been determined, and we are investigating further with the storage provider.

12-12-2017 – 12:30 UTC Official acknowledgement from the storage provider. Due to the increase in failures, the portal status is updated to ‘Degraded performance’. Still, only a small percentage of all RMA creations and processing actions is affected.

12-12-2017 – 13:50 UTC The hosting provider implements a fix and reports that the situation has stabilized. After testing it ourselves and monitoring 20 minutes of creations and processing actions on the system, we detect no more issues and update the portal status to “Operational”.

12-12-2017 – 14:30 UTC Issues appear again, and we escalate the issue with the storage provider. The portal status is updated to ‘Degraded performance’.

12-12-2017 – 17:50 UTC Softlayer plans an extra maintenance window starting at 19:00 to implement a fix.

12-12-2017 – 19:00 UTC Softlayer starts to implement the fix.

12-12-2017 – 21:19 UTC Softlayer reports that the fix has been implemented, and the issues stop appearing. We decide to keep the official status at ‘Degraded performance’ and continue monitoring the systems.

13-12-2017 – 07:00 UTC No errors have been detected since the fix was implemented, and all services are considered stable again. We update the portal status to “Operational”.

Posted Dec 13, 2017 - 18:25 CET

Resolved
Due to degraded storage services at our hosting partner, some shipping labels could not be created. The issues have been resolved and the service has returned to a stable status.
Posted Dec 13, 2017 - 08:26 CET
Identified
Due to degraded storage services at our hosting partner, some shipping labels cannot be created.
Posted Dec 12, 2017 - 15:44 CET