No matter the size of your business, you’re likely to rely on traffic coming to your website. Even if you’re a brick-and-mortar business, your presence on the web can bring targeted traffic to your site, where you can answer questions and provide your phone number, business address, features, specials and so on. For the growing number of businesses that rely on their site for income, one of the most important decisions is how to handle that presence on the web. Every feature comes with a price tag, and this includes the amount of traffic a site is able to handle at any given time. As a company grows, it may have to consider the expense of features that need to grow with it. Watch our video to learn how this may apply to your site, the kind of server you use and the services you should look for.
One of the things people are concerned about is: how do I get my infrastructure to scale?
Why slow happens
There are really four reasons:
- The server can have a problem,
- The pipe can have a problem (connection between you and the machine),
- You can have too many requests (that’s the denial of service attack scenario). Your own clients can effectively give you a denial of service attack. We’ve had times when a promotion was so successful that it slowed everything down to a crawl. That’s a nightmare: you’ve slowed yourself down at exactly the time you didn’t want to.
- Or you can have just a small number of people using it but you’re asking them to do something hard. There can just be complex maths in the scenario.
Some tips around this.
- Firstly, define what ‘big’ is for you. For different clients ‘big’, as in a big number of users, can be very different. You want to have that conversation with your developer.
- How likely is a surge? Business to business platforms typically get fewer surges than business to consumer. With business to consumer platforms, when something gets popular it gets really popular, and every business is different on this. Think about how likely a surge in traffic is. We’ve got one client who, on Thursday or Friday at 4pm, gets 80% of their business inside an hour. It’s just the nature of their business, so that is the surge that they have to manage every week.
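To get a feel for what a surge like that means for capacity, here is a rough back-of-the-envelope sketch in Python. The weekly order count is a made-up figure for illustration, not the client’s real number; only the "80% inside one hour" shape comes from the example above.

```python
# Hypothetical figure: 10,000 orders per week (an assumption for illustration).
WEEKLY_ORDERS = 10_000
HOURS_PER_WEEK = 7 * 24

def surge_ratio(share_in_peak: float, peak_hours: float = 1.0) -> float:
    """How many times busier the peak is than an evenly spread week."""
    average_per_hour = WEEKLY_ORDERS / HOURS_PER_WEEK
    peak_per_hour = WEEKLY_ORDERS * share_in_peak / peak_hours
    return peak_per_hour / average_per_hour

peak_orders = WEEKLY_ORDERS * 0.8   # orders landing inside the busy hour
ratio = surge_ratio(0.8)
print(f"Peak hour: {peak_orders:.0f} orders, about {ratio:.0f}x the average hourly rate")
```

Under these assumptions the busy hour runs at roughly 134 times the average hourly rate, which is why sizing for the average week would leave this kind of business badly short at exactly the wrong moment.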
There are four techniques we can use to solve this.
Technique #1 – More Hosting Capacity
We can get more hosting capacity. We can have what is called vertical scaling, just make it a bigger server. I’ve seen scenarios where that has been helpful. A really common approach now is to get more servers (horizontal scaling). Google does not run one big server. Google runs thousands and thousands of pretty basic servers and the requests are routed between them.
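The idea of routing requests across many basic servers can be sketched in a few lines of Python. This is a minimal round-robin illustration, assuming three hypothetical servers named `web-1` to `web-3`; it is not how Google or any particular load balancer works, and real ones also handle health checks, weighting and failures.

```python
from collections import Counter
from itertools import cycle

class RoundRobinBalancer:
    """Hands each incoming request to the next server in the pool, in turn."""

    def __init__(self, servers):
        if not servers:
            raise ValueError("need at least one server")
        self._pool = cycle(list(servers))

    def route(self, request_id):
        """Return the (request, server) pair chosen for this request."""
        return request_id, next(self._pool)

# Hypothetical server names, chosen only for this sketch.
balancer = RoundRobinBalancer(["web-1", "web-2", "web-3"])
assignments = [balancer.route(i) for i in range(9)]
load = Counter(server for _, server in assignments)
print(load)
```

Nine requests end up spread evenly, three per server. Adding a fourth name to the list is the horizontal-scaling step: more boxes, not a bigger box.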
Four types of hosting
This is going to be the most techy bit of today and we’ve used cartoons, so hopefully it will go down well. I think it’s important to know the different ways you can host your platform. Once you understand this, you can have some really good conversations with your developers.
Let me do this as a history lesson.
1. Dedicated servers
Starting off, we just had dedicated servers. It was a box, you owned it and you had it and it was your box. On that server we would put your operating system, Windows, Linux, whatever it might be. We’d put your databases on it too, so everything lived on your server. There are plenty of people for whom this is exactly the right infrastructure still today.
2. Shared hosting
There is a problem and that is these things were expensive. So what we did was we created shared hosting. Shared hosting is we put lots of people together. We had one big, beefy box and we put one hundred or five hundred people on the one box. The only problem is when somebody does something evil, it can affect everybody else. If it goes evil, the process that you’re running can affect everyone else on the server.
3. Virtual private server
So we came up with a third solution called a virtual private server. A virtual private server is like we take one big server and under the hood there are a couple of ways we can do this. We can think of it as one big server, cut into six, eight, nine pieces and each acts like its own server. It has its own operating system. If someone goes rogue, nobody else cares. It’s like sharing the cost of a big one, but really carving things up.
4. Cloud services
Then, we have a fourth option. The most recent move is this thing called cloud services. Cloud services, I could talk about for three days. There are lots of things that happen in here. But essentially by magic, a whole bunch of machines working together really cleverly, we can have what looks like our own servers plus we can buy these other services.
We can buy a database. So we might buy a little mini server and we can have control over it. We can go into it as a dial and say, I’ll have some more power please, and just crank it up. That server grows or we get more of them instantly. In fact we can tell the software to do it on our behalf. We can buy a database and we can buy other magical services like queues which I’ll introduce to you in just a moment.
This is the slide I ummed and ahhed about putting up. Everything on this slide is mostly true. I can give you a counter argument where each of these is not true. But we’re talking about the service offering of companies all around the world and I wanted to give you some general ideas. Please don’t ring me back and say, I found a hosting company that will do this for a quarter of the price. Great, I did too, but this is what our general experience is.
1. Shared hosting
Shared hosting is going to be pretty cheap. It’s not made to scale. You’re going to get capped pretty quickly. Their disaster recovery is generally slow: if it falls over, it typically takes a while to recover. You don’t have to pay extra for management. They often don’t have a lot of extra services. We like WP Engine and Anchor at the moment for that service. It’s good at the cheap end.
2. Virtual private servers
With virtual private servers, you can see that the price goes up but they are built to scale. You can often scale a virtual private server really quickly. You can ring up a hosting provider and say we’re busy, can you scale it up for me. Depending on your provider, they might do it instantly or they might take a few days. Disaster recovery is slow. You still have to rebuild a lot of things when they go down, so they take a while to recover. Because it’s like having your own server, they will charge you a lot for management.
I’ll jump over to dedicated servers.
3. Dedicated servers
Dedicated servers are like a virtual private server just beefed up.
4. Cloud services
The interesting one is this one in the middle, which is cloud services. Cloud services typically start at low cost, but they can very quickly get a lot more expensive. We did some consulting for some people who were spending $400,000 a year in cloud, so don’t think it’s always cheap. But you start at a low cost. You can scale them up nicely. You can often scale them instantly. Your disaster recovery can be really fast. You actually recover the whole instance, so you don’t have to put the code and the database back separately. They come back really quickly.
Management depends on whether you’re paying extra for it, and you get lots of extra services. There are lots of really useful little services coming out in the cloud environment that can do specific jobs. I’m going to give you an example of one of those, something we didn’t have a while ago, that can get you out of a particular problem.
So appropriate hosting certainly helps with scale.
This one I just put in for fun. A little while ago, the best hardware that you could buy, value for money was a PlayStation. You could put whatever operating system you liked on the PlayStations. So the US Department of Defense bought a whole bunch of them. They stuck them all together and they built a supercomputer out of PlayStations. Supercomputers are actually lots of computers strung together and controlled in an effective way. I just think it is cool there is a place where you can go and the US Department of Defense is analyzing data from around the world to keep America safe on a whole bunch of PlayStations.
Technique #2 – Caching
Caching is another technique we can use for website scalability. Caching is the act of storing something for later use to make things faster.
So precalculation – if you’ve got a whole bunch of really complex maths – if that’s done on the fly each time, that can be really slow and expensive. Sometimes you can precalculate that.
Content distribution network (CDN)
I’m going to teach you with pictures what a content distribution network is. We’ve got a server and it’s happy. It has requests coming from all around the world, from different devices, and that is fine. Now it is less happy because it is getting hit with more than it has the capacity to handle. When you serve up a web page, you serve up the page plus its images. What we can do is take the images, the videos, a lot of the material that stresses these sites out, and put them on these other friendly servers around the world.
We can have one in every country in the world. When people come to our website, they think they’re coming to our website but they’re actually picking up most of what they want from an intermediary server along the way. The number of requests actually getting back to the mother ship is dramatically reduced. This can make sites run a lot faster. Your Australian site might look good to you in Australia. Then you go on holidays and you realize in America it is actually slow and it is really bad in the UK.
If we can distribute your content around the world, we can make your site appear to be a lot faster and to run a lot faster. The beauty of this is that these servers used to be really expensive, and the cost has come down significantly. We’ve got people running CDNs (content distribution networks). There are actually some good free offerings, and at $20 to $100 a month it’s not expensive to get a good content distribution network.
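A toy simulation can show why so few requests get back to the mother ship. The sketch below is an assumption-heavy illustration, not a real CDN: the class names, regions and file paths are all made up, and real networks also deal with expiry, invalidation and routing visitors to the nearest node.

```python
class Origin:
    """The 'mother ship': the one server that holds the real files."""

    def __init__(self):
        self.hits = 0

    def fetch(self, path):
        self.hits += 1
        return f"contents of {path}"

class EdgeServer:
    """A CDN node that keeps its own copy of anything it has served before."""

    def __init__(self, origin):
        self.origin = origin
        self.cache = {}

    def get(self, path):
        if path not in self.cache:      # first request: go back to the origin
            self.cache[path] = self.origin.fetch(path)
        return self.cache[path]         # later requests: answer locally

origin = Origin()
edges = {region: EdgeServer(origin) for region in ("AU", "US", "UK")}

# 300 visitors per region, each asking for the same two assets (made-up paths).
for edge in edges.values():
    for _ in range(300):
        edge.get("/img/logo.png")
        edge.get("/video/intro.mp4")

total_requests = len(edges) * 300 * 2
print(f"{total_requests} requests, {origin.hits} reached the origin")
```

In this sketch 1,800 requests turn into 6 trips to the origin: one per asset per region. Everything else is answered by a server close to the visitor.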
Technique #3 – Queuing
We can put requests into a queue and we can get to them in turn. This is a great trick when it works. Imagine I’ve got a server and this is the normal load that comes through that server. Then I don’t like it, I’m getting too much. I can put a queue in front of it and I can slow down my requests so that my server is happy.
Let me give an example where that is useful. We built a platform in the SMS space. These people send SMSs on behalf of businesses. Other websites connect to them, give them a bunch of requests and they go and send out SMSs. It just so happens, Friday afternoons a lot of their customers, and I mean customers who send them big requests, hundreds of thousands of messages come flooding in and they’ve got to be able to handle them.
Now for those ten minutes, and it is probably only ten minutes a month when the big requests all cross over, normal infrastructure would fall over. But we can use a queue to sit in front of it and absorb all the requests, and the worst case is that a message might take ten minutes to get out. This queuing infrastructure I can buy from cloud services very cost effectively. I can push all my material into it and I can get to it when I get a chance.
Essentially queuing works when you’re dealing with other services. It doesn’t work so well with humans because if I’ve got a webpage and I have to wait ten minutes for a response, it’s not so much fun. But when they work, they’re great.
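Here is a minimal in-memory sketch of that pattern in Python, using `collections.deque` as the queue. The burst size and sending rate are hypothetical numbers picked so the worst case matches the ten-minute figure above; a real system would use a hosted queue service rather than a list in memory.

```python
from collections import deque

def drain(burst_size: int, per_minute: int):
    """Queue a burst, then send at a steady rate the server can handle.

    Returns a list of (message_id, minute_sent) pairs in sending order.
    """
    queue = deque(range(burst_size))    # everything arrives at once
    sent = []
    minute = 0
    while queue:
        minute += 1
        for _ in range(min(per_minute, len(queue))):
            sent.append((queue.popleft(), minute))
    return sent

# Hypothetical numbers: 100,000 messages arrive together, 10,000 go out per minute.
sent = drain(burst_size=100_000, per_minute=10_000)
worst_wait = max(minute for _, minute in sent)
print(f"{len(sent)} messages sent, longest wait {worst_wait} minutes")
```

Nothing is dropped and the sender never sees more than its steady rate. The cost is delay: the last message in this sketch waits ten minutes, which is fine for a machine and not much fun for a person staring at a web page.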
Technique #4 – Improving algorithms
Levi regularly says this to me: “premature optimisation is the root of all evil”. Let me give you the example of somebody in the room. These are rough numbers, but they have a few pages on their site that require this much processing power to run, not much. Then they’ve got a data entry page which requires heaps of processing power to run, and another report that requires heaps. In response the natural inclination is: do we go and give them a bigger, beefier server to do all of this maths?
What we did was do some optimisation work on just those two pages. We left these ones completely alone and we were able to bring the amount of processing required significantly down.
You might be asking the question, why didn’t you just do that upfront? Why did you have to wait for all of this?
There are some problems with optimisation.
- It takes a while. That bit of work, there were days and sometimes weeks of work to get that to happen.
- These pages are now internally more complex. They will take more time to maintain.
So typically we don’t try to preoptimise too much. So there are lots of problems. Let me give you one of our best examples.
We built an Australian directory that has every business in Australia within it and we created a search algorithm to search across that entire database using geographic identifiers. When we first built it, it took two minutes to run. It gave you the correct answer but it took two minutes to run. The competitor in that space had a farm of servers. The thought was we were going to buy a whole bunch of hardware and there was a whole bunch of costs.
What we did was get a really good programmer and we put him in a dark room for two weeks. We fed him a lot of pizza. In two weeks he got this algorithm, which we’d already worked on a bit, to run in under two seconds. It’s a pain to keep. When we want to change it there is a lot of work, but we changed the internal structures of queries. We did a whole bunch of dark arts and black magic.
That happened, I think, six years ago, and tens of thousands of dollars in accumulated costs have been saved since because we were able to optimise an algorithm. The thing engineers know is that people often rush to buy better hardware when they’d be better off investing in a better algorithm.
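The talk doesn’t say what was actually changed in that search, so the Python below is not that algorithm. It is a generic sketch of the same kind of win: replacing "look at every business" with a simple grid index so a geographic search only touches nearby records. The data is random, the cell size is arbitrary, and the search area is a square box in degrees rather than a true distance.

```python
import random
from collections import defaultdict

CELL = 0.1  # grid cell size in degrees (an arbitrary choice for this sketch)

def cell_of(lat, lon):
    """Map a coordinate to the grid cell that contains it."""
    return (int(lat // CELL), int(lon // CELL))

def build_index(businesses):
    """Group businesses by grid cell once, up front."""
    index = defaultdict(list)
    for name, lat, lon in businesses:
        index[cell_of(lat, lon)].append((name, lat, lon))
    return index

def in_box(lat, lon, lat0, lon0, radius):
    return abs(lat - lat0) <= radius and abs(lon - lon0) <= radius

def search_scan(businesses, lat0, lon0, radius):
    """Slow version: check every business in the country."""
    return sorted(n for n, lat, lon in businesses
                  if in_box(lat, lon, lat0, lon0, radius))

def search_indexed(index, lat0, lon0, radius):
    """Faster version: only check grid cells that overlap the search box."""
    lo_r, lo_c = cell_of(lat0 - radius, lon0 - radius)
    hi_r, hi_c = cell_of(lat0 + radius, lon0 + radius)
    found = []
    for r in range(lo_r, hi_r + 1):
        for c in range(lo_c, hi_c + 1):
            for n, lat, lon in index.get((r, c), []):
                if in_box(lat, lon, lat0, lon0, radius):
                    found.append(n)
    return sorted(found)

# Made-up data: 20,000 random points in a rough Australia-sized rectangle.
random.seed(42)
businesses = [(f"biz-{i}", random.uniform(-44, -10), random.uniform(113, 154))
              for i in range(20_000)]
index = build_index(businesses)

query = (-33.87, 151.21, 0.25)  # roughly Sydney, with a 0.25 degree box
slow = search_scan(businesses, *query)
fast = search_indexed(index, *query)
print(slow == fast, len(fast))
```

Both functions return the same answer, but the indexed version inspects a few dozen cells instead of every record. It also shows the cost mentioned above: there is now an index to build, keep in sync and understand whenever the search needs to change.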
So rather than optimising everything up front:
- We typically buy a bit more capacity than we want, and
- We monitor and resolve issues as we go through.
That is the end of this session. That was the heaviest one of the lot. That was the learning one and the rest are a lot easier.
You can understand how important it is to make the right decision about the kind of service you choose for your website, at whatever scale works best for you. A company can place itself in unwanted difficulty by saving money on servers that are unable to handle its growth. It can also overpay for services that it could never make use of. Either extreme can make the difference between success and failure, so make sure you consider what needs you have.