Here is some context for you
So, last week was a bit busy because of a customer issue: high channel utilization was causing disconnections on the customer’s devices. Let me paint a picture of the deployment.
This is a 10-floor, multi-building campus where all the buildings connect to each other through a pentagonal atrium and, of course, the atrium was built mostly out of glass. Honestly, it was beautiful to look at. The rest of the atrium was an elevator, a stairwell and open space: every wireless engineer’s dream (sarcasm). It was small enough that the APs on each floor could hear each other, and big enough that coverage did not bleed more than one floor up or down.
So far this does not sound particularly hard, does it? Well, I promise it gets better. The customer was leasing part of the building to other companies, creating a shared space. We had around 5 companies in this space, including the owner of the space, who was my customer. There were more in total, but let’s call it 5 in the atrium section, where we had the major drops and disconnections.
So, what we did here was a validation survey of the area to see how difficult the issue was going to be to fix, and the results were worse than expected. Some companies were broadcasting their SSIDs on 80 MHz bonded channels at full power, others at 40 MHz, and just one was being conscientious and using the right settings. We tried to call a meeting with the IT personnel of all the companies to review their settings but, as you can guess, they were remote and simply didn’t care. After a couple of attempts to persuade them, it was clear that they were not going to fix their settings.
Now is a good time to tell you that most of the issues were on the guest network, and that my customer’s corporate network worked like a charm, even with all the misconfigurations from the other companies. As part of my approach I asked what we could change: we could reposition the customer’s APs but not the leasing companies’ APs (rogues from here on), we couldn’t manage any setting on the rogues, and we couldn’t regulate their power levels or change the physical architecture of the building. Also, why was the guest network so important? Like something out of a bad-practices manual, the corporate phones connected to the guest network, and people took calls and worked from there. Yes… on a guest network configured with Silver QoS.
So, what would you have done?
Over the air QoS
Let’s take a step back to review a basic concept of QoS over the air. If you’re a network engineer you’re probably familiar with wired QoS, where we can control the tags on a frame/packet and do different things with its “priority”, queues, etc. But if you’re not a wireless engineer, and sometimes even if you are, there are a few key differences in how QoS works over the air. Let’s start with something you might know already.
Precious metal QoS
Depending on how long you have been in the wireless business, you might be familiar with this setting. Platinum, Gold, Silver and Bronze are terms you might have seen, and pretty much any deployment guide from any vendor will tell you to use Platinum whenever you have voice on your SSIDs, Gold for video, Silver for best effort and sometimes even Bronze for guest traffic, right? But what does it do?
As you know, or as I’ll assume you know for now (let me know in the comments), when a client tries to communicate over wireless it has to contend for the medium. To do so it uses CSMA/CA: it senses the medium and, if it is clear, it goes ahead and transmits, right? In a simplified way, yes. Before transmitting, a few mechanisms come into play: you have RTS/CTS, the IFS, and the part we want to focus on, the contention window.
With the QoS setting we get more granular control over our access to the medium by moving from DCF to EDCA, which was introduced in 802.11e. With DCF we have something similar, in theory, to a FIFO queue, where no priority is assigned to any client on the medium. With EDCA we can assign priorities by manipulating the contention window: the precious metal assigned determines the values of our wait window, that is, the contention window, and a higher priority gets a smaller window. As a result, the client with the higher priority tends to access the medium sooner than the others.
It looks like this.
Precious metal | CwMin | CwMax
Platinum | 3 | 7
Gold | 7 | 15
Silver | 15 | 1023
Bronze | 15 | 1023
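To make the table a bit more concrete, here is a minimal Python sketch of what those numbers mean for the first transmission attempt. It assumes the simplified model used in this post: the initial backoff is a random integer between 0 and CwMin, counted in slots, and nothing else (AIFS, for example) is taken into account. The names METAL_CW, mean_initial_backoff and draw_initial_backoff are made up for this example.

```python
import random

# CwMin / CwMax per profile, copied from the table above.
METAL_CW = {
    "Platinum": (3, 7),
    "Gold": (7, 15),
    "Silver": (15, 1023),
    "Bronze": (15, 1023),
}


def mean_initial_backoff(metal):
    """Average first-attempt backoff in slots: the mean of a uniform draw from 0..CwMin."""
    cw_min, _cw_max = METAL_CW[metal]
    return cw_min / 2


def draw_initial_backoff(metal, rng=random):
    """Pick one random first-attempt backoff (in slots) for the given profile."""
    cw_min, _cw_max = METAL_CW[metal]
    return rng.randint(0, cw_min)  # randint includes both ends


for metal in METAL_CW:
    print(metal, "waits on average", mean_initial_backoff(metal), "slots")
```

In this simplified view, a Gold station waits on average about half as many slots as a Silver one before its first attempt, which is the whole idea behind the priority.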
And in case you’re curious about what happens when a transmission fails, here is a diagram of how the backoff timer can grow as the frame is retried. Basically, the contention window roughly doubles on each retry until it reaches CwMax, so the backoff timer is drawn from a wider and wider range.
Image from Revolution Wi-Fi
Note: Quoting the CWNP study guides: “The phrase contention window has caused so much confusion. This ‘window’ is a range of integers from which one is chosen at random to become the backoff timer for the immediate frame queued for transmission. Think of it as a contention range instead.”
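If you prefer numbers to diagrams, the growth on retries can be sketched in a few lines of Python. This is only an illustration: it assumes the usual binary exponential backoff rule, where the upper end of the range goes to 2 × (CW + 1) − 1 after each failed attempt and is capped at CwMax. The function name contention_windows is made up for this example.

```python
def contention_windows(cw_min, cw_max, retries):
    """Upper end of the contention range for the first attempt and each retry."""
    windows = [cw_min]
    cw = cw_min
    for _ in range(retries):
        # Roughly double the range after a failed attempt, never going past CwMax.
        cw = min(2 * (cw + 1) - 1, cw_max)
        windows.append(cw)
    return windows


# Silver/Bronze values from the table: 15 up to 1023.
print(contention_windows(15, 1023, 7))
# Platinum values from the table: 3 up to 7.
print(contention_windows(3, 7, 3))
```

For each entry, the backoff timer is then a random integer between 0 and that value, which is why “contention range” is the better mental picture.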
The solution
OK, so that was a long detour to get to the solution, but I wanted to walk you through my train of thought. With all my restrictions in mind, and with a VIP coming to the site soon, we had to do some lateral thinking. First of all, we needed a quick solution and then a permanent one. Curiously enough, the permanent solution came to me more easily: I suggested installing the certificates needed for the corporate network on the corporate phones and stopping the use of the guest network. Since that was going to take some time, I came up with the following quick patch.
We removed the channel bonding in 5 GHz and started using 20 MHz channels. As you might have figured, we couldn’t contain legitimate APs or clients (even though I keep thinking of them as rogues), and we couldn’t do anything about our power levels or theirs without making things worse. So, we set the guest QoS to Gold on the controller. Against best practices and best judgment, we needed our guest network to be “faster” at accessing the medium than the rogues.
Now, why is this just a temporary solution? Well, as you might know, we might not be hurting our medium (that much) because corporate was set to Platinum, but having two SSIDs with high priority trying to transmit at the same time can be troublesome and may even cause interference from time to time.
Our big event has come and gone, and the guest network passed with flying colors. Now it’s time to take it back to Silver and move the corporate smartphones to the corporate network. Wish us luck.
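To get a feeling for how much “faster” Gold is, here is a rough back-of-the-envelope sketch in Python. It rests on several assumptions that were not measured on site: the rogue traffic contends with a Silver-like CwMin of 15, everybody is on a first attempt, each station draws its backoff uniformly between 0 and its CwMin, AIFS differences are ignored, and a tie counts as a loss (the frames would collide). The function name win_probability is made up for this example.

```python
from fractions import Fraction


def win_probability(my_cw, other_cws):
    """Chance that my backoff draw is strictly smaller than every rival's draw.

    Each station draws uniformly from 0..cw, where cw is its CwMin.
    """
    total = Fraction(0)
    for mine in range(my_cw + 1):
        p_all_later = Fraction(1)
        for cw in other_cws:
            # How many values in 0..cw are strictly greater than my draw.
            later = max(0, cw - mine)
            p_all_later *= Fraction(later, cw + 1)
        total += Fraction(1, my_cw + 1) * p_all_later
    return total


rivals = [15, 15, 15, 15]  # four rival stations, assumed Silver-like
print("Guest on Silver:", float(win_probability(15, rivals)))
print("Guest on Gold:  ", float(win_probability(7, rivals)))
```

Against a single Silver-like rival, this toy model gives the Gold station the first shot 23 times out of 32, versus 15 out of 32 when both sit on Silver. Real airtime behavior depends on much more than this, so treat it as intuition rather than a prediction.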
P.S.
I wanted to share this anecdote because the solution came to me in a second. I remembered some of the training for the certs, and how I played with the settings and did packet captures to verify them. Don’t skip your labs, guys! Let me know what you think or what you would have done.
Thanks for reading!
Dan
Reference book: CWAP-403 Certified Wireless Analysis Professional by Tom Carpenter, published by Certitrek.