With all the latest trends on the web such as “Social Sharing”, “distributed marketing campaigns” or just plain website tracking, people tend to forget about one tiny, simple fact:
If your website is your sales channel or even your only product, you must have the ambition to keep it running and fully functional 24/7, with 99.99% uptime per year. Everything below that is a significant loss of revenue. My employer's shop (www.otto.de) takes 2 orders per second on average. If it had 99.9% (instead of 99.99%) uptime per year, well, calculate yourself. A year has 31,536,000 seconds. Talking about an extra 0.09% of downtime a year thus means roughly 7.9 hours or 28,382.4 seconds of unavailability. Sounds fair. But 28,382.4 seconds of downtime at 2 orders per second, this would mean a loss of 56,764 orders!
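If you want to redo that back-of-the-envelope math with your own numbers (the 2 orders per second is just the average mentioned above), here it is as a tiny snippet:

<script>
  // Extra downtime between 99.99% and 99.9% uptime, and what it costs at a given order rate.
  var secondsPerYear  = 365 * 24 * 60 * 60;      // 31,536,000
  var extraDowntime   = secondsPerYear * 0.0009; // 0.09% -> 28,382.4 seconds (~7.9 hrs)
  var ordersPerSecond = 2;                       // plug in your own rate here
  var lostOrders      = extraDowntime * ordersPerSecond;
  console.log(Math.round(extraDowntime) + ' seconds of downtime, ~' + Math.round(lostOrders) + ' orders lost');
</script>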
Now we get closer to the point I want to make. Just 0.09% of our website being “not fully available” can mean a loss of thousands of orders and thus a lot of revenue. And that is only the loss you are aware of! Because red lights are popping up in your data centre and your ops are running around like mad chickens trying to get stuff back online again.
That is the really interesting part.
Downtimes caused by your own infrastructure are easy to spot and to measure. Your VP SiteOps may already generate a precise report for the C-level guys saying “we were up and running this month at 99.97%”.
Fine.
But after everyone has congratulated themselves for being so available, the marketing department requires a multi-channel tracking JavaScript from company XYZ, the CorpCom guys want some fancy new G+ share buttons and your business intelligence department requires a new tracking lib served from the Foo CDN. So you just stabilized your own infrastructure and are proud of the 99.97%, but introduced several new points of failure: third party content.
Let me give you three rules you need to be totally aware of:
1) Third party content is not within your control.
2) In general, servers will fail. Every server fails. There is no 100% uptime. And uptime at 98% CPU load is also uptime.
3) Thus third party servers will fail, and if you haven't done your homework, you will fail too. No matter how fancy your infrastructure failovers are. And then think of 1).
So let's do our homework and understand what will fail!
Let's assume we have two different types of third party code: JavaScript and CSS. We leave out backend stuff here, because it usually comes with good test coverage and failovers. If, for example, you want to use some marketing tracking stuff on your website, the “marketing tracking provider” usually asks you to put their <script> block right below the opening <html> tag in order to work properly.
Now we mix some ingredients together:
A very simple, abstracted code example would be:
<html>
  <head>
    <script src="http://www.thirdparty.com/tracking.js"></script>
  </head>
  <body>
    Your website's content
    <script> affiliate.trackAndGenerateMoney(); </script>
  </body>
</html>
Now, as mentioned above, let's imagine thirdparty.com or any of the magic between the client and the server of thirdparty.com is broken and the HTTP request does not succeed. This leads to the first script block loading for something like 30-60 seconds (the default browser timeout). Until the browser aborts the page load, the user gets to see a blank page with a loading indicator.
Repeat: just because you embedded third party JavaScript and THEY are down, YOUR users sit in front of a blank page.
Usually users wait an average of 10 seconds before they leave; the users that get to see such a failure simply abort.
The situation is not much different when including third party CSS:
<html>
<head>
<link rel="stylesheet" href="http://www.thirdparty.com/some_widget_magic.css">
</head>
<body>
Your website's content
<script> affiliate.trackAndGenerateMoney();</script>
</body>
</html>
In the common browsers, when loading a stylesheet from an unavailable server, the browser won't even start rendering the page at all (see e.g. http://www.phpied.com/rendering-styles/) until some browser timeout triggers (commonly 30 seconds). This gives the user the bad experience of a white screen. Again, you won't even notice, as your tracking relies on dom:ready and thus won't fire. An interesting question would be: what happens if a third party webfont is referenced from your own stylesheet? But that would be too much here.
Here is a tiny video I made of the very popular website www.smashingmagazine.com. It gives you a good visualization of the effect of a broken third party server.
[wpvideo 8v1h94L9]
On the left-hand side you see the page with everything working (your website and the third party webservers), on the right-hand side the situation where two third party servers (an affiliate partner and Twitter) are down and don't respond.
Got it?!
Not so nice, right?!
So how can we be safe with regards to SPOFs (Single Points of Failure) and third party fails?
1) Choose your third party providers wisely! Ask them whether their script snippet, CSS include or webfont loads *async* (a minimal sketch of what that means follows below this list). If the reply is something like “uhm, what?” or “well, this is not possible”, choose another partner. Seriously.
2) Think twice about embedding such code into your platform. Do you really need it? Can you provide the feature yourself? Could you at least host the assets in your own infrastructure?
3) Install a browser plugin such as SPOF-O-MATIC (https://chrome.google.com/webstore/detail/spof-o-matic/plikhggfbplemddobondkeogomgoodeg). You can easily see if your page has the potential to fail. And it is fun to browse around the web and see how blind website owners are. Even companies where the website is the only revenue channel.
4) Browse your website's code (locally, in your dev environment) for external references such as the above. Replace any occurrence of a third party reference with http://blackhole.webpagetest.org. That host never responds, so every request hangs for about 30 seconds and magically simulates a third party downtime.
5) Pro and advanced tip: change your /etc/hosts file and redirect requests to facebook, googleplus, twitter and urlofyour3rdpartyprovider.com to blackhole.webpagetest.org (see the hosts file sketch below this list). Honestly, while at work you shouldn't browse FB and G+ anyway, so why not work all day while simulating they are down?! You will be astonished how many websites appear to be broken or even down while we only simulate that FB and G+ are down.
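To illustrate #1: this is roughly what an async embed of the dummy tracking script from the example above could look like. It is a minimal sketch of the common dynamic-injection pattern, not your provider's real snippet:

<script>
  // Sketch of an async embed: the third party script is injected dynamically,
  // so a dead thirdparty.com can no longer block the rendering of your page.
  (function () {
    var s = document.createElement('script');
    s.src = 'http://www.thirdparty.com/tracking.js';
    s.async = true;
    var first = document.getElementsByTagName('script')[0];
    first.parentNode.insertBefore(s, first);
  })();
</script>

If thirdparty.com is down, the page still renders and only the tracking calls are lost, instead of the blank page shown above.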
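And for #5, keep in mind that /etc/hosts maps hostnames to IP addresses, not URLs, so you point the third party hostnames at the IP behind blackhole.webpagetest.org. The hostnames below are just examples, and you should resolve the blackhole IP yourself (e.g. with ping) before copying anything:

# /etc/hosts -- send third party hosts into a black hole to simulate their downtime.
# 72.66.115.13 was the IP of blackhole.webpagetest.org at the time of writing; double-check it.
72.66.115.13   www.facebook.com
72.66.115.13   connect.facebook.net
72.66.115.13   apis.google.com
72.66.115.13   platform.twitter.com
72.66.115.13   urlofyour3rdpartyprovider.com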
My personal favs are #1 and #5.
1) I want to work with awesome people. And if someone gives me code that could crash my site, they are not trustworthy.
5) It is ever so great to see the impact of a simulated SPOF, and if you browse your product/website frequently during the day, you will immediately spot SPOFs before they take down your site.
In the end, don't trust anyone but your own devs, ops and devops, and talk with your third party vendors about SPOFs.