While it’s immediately clear from the vast coverage and immense site traffic how popular Black Friday is in South Africa, it doesn’t automatically equate to thousands of sales.
For a company like Raru, buying millions of Rands of stock for a single day adds risk.
The sheer volume of traffic could also bring the site down, meaning it would take much longer to sell that stock.
While the folks at Raru expected twice as much traffic as last year, upgrading hardware wasn’t an option – so how did they do it?
It helps that the site is primarily automated, but with prices and stocks changing by the second, the technical team had to be prepared for the increased traffic well in advance.
Raru’s Renier Crause revealed how they kept the site up during the madness.
First, Raru upgraded to the latest version of Ubuntu Linux, which includes an upgrade to PHP 7.
The site’s benchmarks showed that this gave it a 130% increase in performance under load. The Ubuntu upgrade also brought the latest versions of Nginx and MySQL.
The database was stress-tested and many configuration settings were adjusted for the site’s specific hardware and expected load.
Raru runs its big deal event days with new products appearing at regular intervals. This causes extreme spikes in traffic at those intervals, as many people refresh their browsers at the same time.
For example, on Raru’s 2nd birthday sale, the concurrent connections at the spike were approaching the limits that Nginx could handle on the site’s hardware.
To mitigate this, Raru implemented the following:
- Switched to HTTP/2. This new protocol reduces the number of connections your browser makes to the server and fetches multiple images and scripts on a single connection. This increases performance for the user and reduces concurrent connections on the server.
- Set up load balancing to serve images from the fail-over server as load increases.
- Prepared to run a growing share of SQL queries from the slave database on the fail-over server should the need arise.
- Added the option to switch off some images on the deals page in the case of total overload. This is not ideal, but better than total downtime. Fortunately, this was never needed.
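The last three measures amount to a staged fallback plan: offload more work as load climbs, and drop deal images only as a last resort. The article does not say how Raru triggered each stage, so the following is only a minimal Python sketch of the idea. The `DegradationPlan` class, the `plan_for_load` function, the 0-to-1 load scale and both thresholds are hypothetical and not taken from Raru's setup.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DegradationPlan:
    images_from_failover: float  # fraction of image requests sent to the fail-over server
    reads_from_replica: float    # fraction of read-only SQL queries sent to the slave database
    deal_images_enabled: bool    # False only in the "total overload" case


# Hypothetical thresholds on a 0..1 load scale (assumptions, not Raru's values).
OFFLOAD_START = 0.5
TOTAL_OVERLOAD = 0.95


def plan_for_load(load: float) -> DegradationPlan:
    """Map a load reading between 0 and 1 to a staged fallback plan."""
    if not 0.0 <= load <= 1.0:
        raise ValueError("load must be between 0 and 1")
    if load < OFFLOAD_START:
        # Normal operation: the primary server handles everything.
        return DegradationPlan(0.0, 0.0, True)
    if load < TOTAL_OVERLOAD:
        # Ramp the offloaded share linearly between the two thresholds.
        share = (load - OFFLOAD_START) / (TOTAL_OVERLOAD - OFFLOAD_START)
        return DegradationPlan(share, share, True)
    # Last resort: offload everything and switch off deal images.
    return DegradationPlan(1.0, 1.0, False)


if __name__ == "__main__":
    for reading in (0.2, 0.725, 0.99):
        print(reading, plan_for_load(reading))
```

In this sketch the last stage is only reached at the highest load readings, which mirrors the article's point that switching off images was a safety net rather than part of normal operation.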
On the day
At 06:00 it was already clear traffic was going to be big.
By 07:00 Raru was already past a normal day’s peak traffic even though its Black Friday deals only started at 08:00.
At 08:01 the site went down, only to reappear at 08:02. With so many people refreshing at the same time, the database had to update the new prices and deals on the products, which pushed the site beyond its one-minute timeout. Had it used a two-minute timeout, there would have been no downtime.
The site’s big deals sold out in minutes. As a result, Raru went past an average day’s turnover in just 14 minutes.
At 10:00, when the next batch of deals came online, some people experienced another timeout, but it lasted only a minute.
By 11:00 Raru had passed last year’s Black Friday total in sales. The site broke its all-time record of new customer registrations in a day even though most people pre-registered earlier in the week.
At 15:00 it had passed its target for the day with five hours still to go.
At 20:00 people were still buying last-minute deals, so Raru extended the deals to 22:00.
After Black Friday ended, the site automatically reverted to its normal home page and switched off the black colours.
With the huge volume of transactions countrywide, the company that handles 3D Secure for credit cards couldn’t cope and many people were unable to complete their credit card payments.
Fortunately, Raru offers other payment methods, as well as the ability to switch your payment method or retry your card later, once you have secured your stock.
By the afternoon, 3D Secure improved and most people could complete their payments, but this meant that they might not have made the cut-off for their purchase to be shipped on Friday.
- Page views were double compared to Black Friday 2015.
- Page views were up 700% compared to an average November day.
- 100% growth in turnover compared to last Black Friday.
- Turnover was up 1100% compared to an average November day.
- Total uptime of 99.8% (2 minutes down in total for some people).
- Peak of 20,000 concurrent connections (even with HTTP/2 reducing this a lot).
- Peak of 6,000 web-server requests per second.
- Peak CPU usage was 38% for 5 minutes at 8am. Average usage was 18%.
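To put the percentages above in perspective, the short Python sketch below converts "up N%" figures into multiples of the baseline and works out uptime from minutes of downtime. It assumes that "up 700%" is meant strictly (the baseline plus seven times the baseline) and that uptime is measured over a 24-hour day. Neither assumption is confirmed by the article, and the function names are illustrative only.

```python
def multiple_of_baseline(percent_increase: float) -> float:
    """Convert an 'up N%' figure into a multiple of the baseline value."""
    return 1 + percent_increase / 100


def uptime_percent(minutes_down: float, window_minutes: float = 24 * 60) -> float:
    """Uptime as a percentage of the window (assumed here to be one full day)."""
    return 100 * (1 - minutes_down / window_minutes)


if __name__ == "__main__":
    print(multiple_of_baseline(700))     # page views vs. an average November day: 8.0
    print(multiple_of_baseline(1100))    # turnover vs. an average November day: 12.0
    print(multiple_of_baseline(100))     # turnover vs. last Black Friday: 2.0
    print(round(uptime_percent(2), 2))   # two minutes down in 24 hours: 99.86
```

Under these assumptions, two minutes of downtime works out to about 99.86% uptime, which is in line with the quoted 99.8% if that figure was truncated rather than rounded.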