In the first Behind the Wi-Fi blog we looked at some of the physical aspects of building out a large-scale temporary network; this time we look at how it all comes together as a ‘logical network’, or more simply how all of the networking components work together. With some event networks servicing 10,000+ simultaneous users and consuming anywhere between 100Mbps and 1Gbps of internet connectivity, chaos would ensue unless everything were carefully designed and implemented.
Although a network is often thought of as one big entity, in reality it is broken down into many ‘virtual networks’ which operate independently and are isolated from each other. This approach is very important from a management, security, reliability and performance point of view. For example, you would not want public users to be able to access a network that is being used for payment transactions.
All of our events are rated with a complexity score, and this helps define how the network is designed. Larger and more complex events are designed using a fully routed topology rather than a simple flat design. This approach provides the best performance and resilience, operating a bit like the electricity ‘grid’: a number of nodes are connected together in a resilient manner to provide a multipath backbone, and the customer services are then connected to those nodes. Each node is therefore given a level of isolation and protection which is not possible on a simpler flat network.
This isolation becomes important as a network grows because, when devices connect, they are designed to send out ‘broadcasts’ to everyone on the network. With a large number of devices these broadcasts can become overwhelming on a flat network, but on a routed network they can be filtered out at the appropriate node. Faulty or incorrectly configured equipment can sometimes cause ‘network storms’, where huge amounts of network traffic are created in milliseconds, reducing performance for all users; a routed topology offers much more protection against this, isolating any problem to a small subsection of the network.
Every site has different network requirements, so there may be anywhere between 5 and 50 virtual networks, known as VLANs, to ensure all the appropriate users and network traffic are kept separate. Traffic shaping rules are applied to these different networks to prioritise the most important ones, along with filtering and logging as required.
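As a simplified sketch, a site’s VLAN plan might be modelled like this – the VLAN numbers, names and priority values below are entirely hypothetical, not a real site configuration:

```python
# Hypothetical VLAN plan for an event site; IDs, names and
# priorities are illustrative only.
VLAN_PLAN = {
    10: {"name": "production", "priority": 1},  # event operations
    20: {"name": "payments",   "priority": 1},  # tills and card terminals
    30: {"name": "voip",       "priority": 2},  # on-site phones
    40: {"name": "media",      "priority": 3},  # press uploads
    50: {"name": "public",     "priority": 4},  # visitor Wi-Fi
}

def highest_priority_first(vlans):
    """Order VLAN IDs so shaping rules can favour the most important traffic."""
    return sorted(vlans, key=lambda vid: vlans[vid]["priority"])
```

The shaping engine would then walk the list in priority order, guaranteeing bandwidth to payments and operations before the public network.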
At the heart of this is what we call the ‘core’, the set of components which control the key aspects of the network such as the internet access, filtering, firewall, authentication, routing, wireless management, remote access and monitoring.
With several different connections to the internet, traffic is distributed across the different connections – this may be by load balancing, bonding, or policy routing. This is a complex area as different types of network traffic may only be suitable for certain types of connection. For example, voice traffic and encrypted VPNs do not work well over a satellite link due to the high latency (delay) of satellite.
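The policy-routing decision can be sketched as a simple rule: pick the lowest-latency live link, but keep latency-sensitive traffic off satellite entirely. The link names, latency figures and the 150ms threshold below are assumptions for illustration, not measurements from a real site:

```python
# Illustrative multi-uplink policy routing; link names, latencies
# and thresholds are made-up examples.
UPLINKS = {
    "fibre":     {"latency_ms": 10,  "up": True},
    "microwave": {"latency_ms": 25,  "up": True},
    "satellite": {"latency_ms": 600, "up": True},
}

LATENCY_SENSITIVE = {"voip", "vpn"}  # classes that suffer over satellite

def pick_uplink(traffic_class, uplinks=UPLINKS):
    """Prefer the lowest-latency live link; never put real-time traffic on satellite."""
    candidates = [name for name, link in uplinks.items() if link["up"]]
    if traffic_class in LATENCY_SENSITIVE:
        candidates = [n for n in candidates if uplinks[n]["latency_ms"] < 150]
    if not candidates:
        return None  # nowhere suitable to route this traffic
    return min(candidates, key=lambda n: uplinks[n]["latency_ms"])
```

A real deployment does this in the routers with policy-based routing rules, but the logic is the same: the type of traffic decides which link it is allowed to use.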
The core routers also contain a firewall; this is the protection between the external internet and the internal network. Protecting against intrusion and hacking is sadly a very important factor, with all internet-connected systems subject to a constant stream of attacks from remote hackers in places such as China and Russia.
Additional firewalls also exist to control traffic across the internal networks. By default, everything is blocked between networks, but some services need limited access across VLANs, so specific rules are added – an approach known as pin-holing. Filtering can be used to block particular websites or protocols (such as BitTorrent and other peer-to-peer networking); this may be done to protect users from undesirable content or to ensure the performance of the network is maintained.
Rate shaping and queuing are additional important controls to manage bandwidth for specific groups and users, ensuring everyone gets the speeds they asked for. This is especially important for real-time services such as voice calls and video streaming. Traffic is managed at a user and network level using dynamic allowances so that all available bandwidth is utilised in the most effective manner without impacting any critical services. Users or networks may be given a guaranteed amount of bandwidth, but this may be exceeded in a ‘burst’ mode provided there is spare capacity on the incoming internet links.
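The guaranteed-rate-plus-burst behaviour is classically implemented with a token bucket: tokens refill at the guaranteed rate, and the bucket’s capacity is the burst allowance. This toy version (the rates are made up, and a real shaper runs inside the router, not in Python) shows the idea:

```python
# Toy token-bucket shaper illustrating guaranteed rate plus burst
# headroom; figures are illustrative only.
class TokenBucket:
    def __init__(self, rate_bps, burst_bytes):
        self.rate = rate_bps / 8      # refill rate in bytes per second
        self.capacity = burst_bytes   # maximum burst size
        self.tokens = burst_bytes     # start with a full bucket

    def refill(self, elapsed_s):
        """Add tokens for elapsed time, never exceeding the burst capacity."""
        self.tokens = min(self.capacity, self.tokens + self.rate * elapsed_s)

    def try_send(self, packet_bytes):
        """Send only if enough tokens are available; otherwise queue or drop."""
        if packet_bytes <= self.tokens:
            self.tokens -= packet_bytes
            return True
        return False
```

A user can briefly ‘burst’ above their guaranteed rate while the bucket has tokens, then gets throttled back to the refill rate once it empties.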
The core also houses the PBX, the onsite telephone exchange which manages all the phones and calls, with big sites having as many as 200 phones and generating thousands of calls. All the features of a typical office telephone system are implemented: ring groups, voicemail, call forwarding, IVR, etc. As all of the phones are Voice over IP (VoIP), they are connected via standard network cabling and so can easily be moved between locations. Additional numbers and handsets can also be added very quickly.
The vast majority of users these days are connected via the Wi-Fi network which requires careful management and design. The detail behind this would run to several pages so for the purposes of this blog we will keep things relatively simple and look at a few key aspects.
Frequency/Standard – Wi-Fi currently operates in two frequency bands, 2.4 GHz and 5 GHz. As discussed in previous blogs there are many issues around 2.4 GHz, so all primary access we provide is focussed on 5 GHz, with only public access and some legacy devices connected via 2.4 GHz. All of the Wi-Fi access points we use are at least 802.11n capable, with the majority now 802.11ac enabled to provide the highest speeds and capacity.
Wireless Network Names – When you look for a wireless network on a device you see a list of available networks; these identifiers are known as SSIDs and control the connection method to the network. Different SSIDs are used for different audiences, with some SSIDs hidden so that you can only connect if you already know the name. Wireless access points can broadcast multiple SSIDs at the same time, but there are limits and best practice as to how many should be used. Some SSIDs may be available across the entire network, whereas others may be limited to specific areas.
Encryption & Authentication – These two areas are sometimes confused but relate to two very different aspects. Encryption deals with the way the information sent wirelessly is scrambled to avoid any unauthorised access. It is similar to using a website starting with ‘https’, but in this case all information between the device and the wireless access point is encrypted. There are several standards for doing this and we use WPA2, the current leader. Not all networks are encrypted and, as is the case with most public Wi-Fi hotspots, public access is generally unencrypted.
Authentication deals with whether a user is allowed to use a particular network, and ranges from ‘open access’, where a user just clicks an accept button for the terms and conditions, through classic username/password credentials, to RADIUS or certificate-based systems which offer the highest levels of protection. One common approach is the use of a pre-shared key or pass-phrase as part of the WPA standard: knowing the pass-phrase is in effect an authentication challenge. The pass-phrase is also the seed for the encryption, and the longer the pass-phrase the harder it is for a hacker to crack. The pass-phrase approach is simple to manage but has an inherent weakness: it is easily compromised by sharing between users with no control.
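The ‘seed for the encryption’ part is quite literal: the WPA/WPA2-PSK standard derives the 256-bit master key by running PBKDF2-HMAC-SHA1 over the pass-phrase, salted with the SSID, for 4096 rounds. That many rounds is what makes brute-forcing a long pass-phrase expensive. Python’s standard library can show the derivation (the pass-phrase and SSID here are obviously just examples):

```python
import hashlib

def wpa2_psk(passphrase: str, ssid: str) -> bytes:
    """Derive the 256-bit pre-shared key as defined for WPA/WPA2-PSK:
    PBKDF2-HMAC-SHA1(passphrase, salt=SSID, 4096 iterations, 32 bytes)."""
    return hashlib.pbkdf2_hmac(
        "sha1", passphrase.encode(), ssid.encode(), 4096, dklen=32
    )

# Example derivation with made-up credentials
key = wpa2_psk("correct horse battery staple", "EventWiFi")
```

Because the SSID is the salt, the same pass-phrase produces a different key on a different network name, which is why precomputed cracking tables have to be built per SSID.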
On top of this various other services are employed to protect and manage the Wi-Fi. Client isolation for example stops a user on the network from seeing any network traffic from another user, whereas band steering & load balancing seamlessly move users between frequencies and wireless access points to ensure each user gets the best experience.
The rise of the smartphone has had a major impact on Wi-Fi networks at events due to the way they behave. If a smartphone has its Wi-Fi turned on, then it constantly hunts and probes for Wi-Fi networks so even in this ‘un-associated’ state it still creates an element of load on the network. Mechanisms have to be employed to drop the devices from the network unless they are truly connected (‘associated’) and active (accessing a web page for example). Even connected devices are typically dropped fairly quickly once they cease to be active so that other users can connect. This all happens very fast and transparently to the user with the device reconnecting automatically when it needs to.
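The aging-out behaviour amounts to tracking each device’s last activity and pruning anything idle for too long. The five-minute timeout and data structures below are an illustrative sketch, not a vendor feature:

```python
# Sketch of idle-client aging; the timeout and structures are
# illustrative, not taken from real controller firmware.
IDLE_TIMEOUT_S = 300  # drop clients with no traffic for five minutes

def age_out_clients(clients, now):
    """Keep only recently active clients.

    `clients` maps a device identifier (e.g. MAC address) to the
    timestamp, in seconds, of its last observed traffic."""
    return {mac: last for mac, last in clients.items()
            if now - last < IDLE_TIMEOUT_S}
```

Running this periodically frees up capacity for active users, and a dropped device simply re-associates the next time it actually needs the network.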
This array of logical controls processes millions of pieces of information every second, routing them like letters to the correct address, discarding damaged or undesirable ones and acknowledging when they have been received. Each of the components has to work in harmony, with sites having anywhere up to around 30 routers, 200 network switches and 200 Wi-Fi access points. To manage this, pre-tested standard configurations and builds are used, as this reduces the risk of introducing a problem via a new firmware or configuration change.
Next time in the final part of this series we will look at how this all comes together to deliver the end services for the users and the impact it all has on the event.