Understanding the Difference Between TCP and MAPI Latency
Over the past couple of months, I’ve received almost identical queries from our prospects regarding Outlook performance. Outlook performance can mean different things to different people, so let me start with a definition. For the purpose of this blog post, I’m talking about the amount of time it takes for an email to leave the “Outbox.”
End-user experience and lost productivity
Prior to cache mode, sending an email in Outlook was a “real time” event. What this means is that if the message had a large attachment, it could cause Outlook to stop responding. With cache mode, the sending of the email happens in the background and the user can keep using Outlook for other tasks. This does help in terms of end-user experience, but it’s just masking the problem. I’ve heard people claim that nobody cares if an email takes an extra couple of minutes to send, but the reality is that this does affect productivity. This is particularly true when users are trying to collaborate in real time.
When best practices are not enough
Going back to the queries from our prospects: all of them have implemented the network connectivity principles for connecting to Office 365 (O365). They are using local breakout, and when they ping outlook.office365.com, the latency is typically within 5-15ms. The question I get asked is: why are my users still complaining about Outlook performance even though the latency is low and there is sufficient bandwidth?
What I’ve realized is that some people are not aware of the difference between TCP latency and MAPI latency.
When a user pings outlook.office365.com, Microsoft returns an IP address that is closest to the end user’s location. For example, when I ping from our Singapore office, the IP address belongs to a server (Client Access Front End, or CAFÉ) in the Southeast Asia region. It may not necessarily be in Singapore, but it will be an IP address in the Southeast Asia region. At a TCP level, the client PC terminates its TCP connection in that region. From our Singapore office, the latency is about 2ms. This is what I call TCP latency.
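If you want to look at this from the TCP side rather than with ping, a minimal Python sketch along the following lines times a single TCP handshake to the front-end server. The function name `tcp_connect_ms` and the choice of port 443 are assumptions made for illustration, and the measured time also includes the DNS lookup, so treat the result as a rough upper bound on one round trip to the CAFÉ server rather than a precise figure.

```python
import socket
import time


def tcp_connect_ms(host: str, port: int = 443, timeout: float = 3.0) -> float:
    """Return the time in milliseconds to open one TCP connection.

    Hypothetical helper for illustration. The timing covers name
    resolution plus the TCP three-way handshake, nothing more: no TLS
    and no MAPI traffic is exchanged.
    """
    start = time.perf_counter()
    # create_connection resolves the host name and completes the handshake.
    with socket.create_connection((host, port), timeout=timeout):
        elapsed = time.perf_counter() - start
    return elapsed * 1000.0
```

Calling `tcp_connect_ms("outlook.office365.com")` only tells you how far away the nearest front-end server is. It says nothing about where the mailbox lives.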
The actual mailbox location is likely to be elsewhere, depending on the location of the O365 tenant. For example, for a US-based tenant, the users’ mailboxes are all located in the US. This means that a second connection is made between the CAFÉ server and the actual server hosting the mailbox for that user. This inter-server connection rides across the Microsoft backbone, so it’s less prone to congestion and packet loss. However, the speed of light is something that cannot be changed.
Continuing with my example, this means that when Outlook talks to Exchange Online, the latency is the sum of client-to-CAFÉ server + CAFÉ server-to-Exchange Server. In my example here, that would be 2ms + 200ms = 202ms. The application latency is therefore 202ms, and this is what I call MAPI latency.
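The same arithmetic as a tiny Python sketch, using the two figures from the Singapore example. The function and parameter names are made up for illustration.

```python
def mapi_latency_ms(client_to_cafe_ms: float, cafe_to_mailbox_ms: float) -> float:
    """Application-level latency: both legs are crossed on every exchange."""
    return client_to_cafe_ms + cafe_to_mailbox_ms


tcp_latency = 2        # ms, Singapore office to the regional CAFÉ server
backbone_latency = 200  # ms, CAFÉ server to the US-hosted mailbox server

total = mapi_latency_ms(tcp_latency, backbone_latency)
print(f"TCP latency:  {tcp_latency} ms")
print(f"MAPI latency: {total} ms")
print(f"MAPI latency is {total / tcp_latency:.0f}x what the client-to-CAFÉ figure suggests")
```

With these numbers the application sees roughly a hundred times the latency that the client-to-CAFÉ measurement reports.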
You can probably see where I’m going with this. Even though the CAFÉ server is close to the end user and therefore provides low TCP latency, the actual MAPI latency remains high, and that is the silent performance killer here. I can upgrade my Singapore office to 10Gbps but it would make zero difference, simply because it’s not a bandwidth issue.
I ran a test earlier from my test machine in Singapore and it took 3m38s to send a 14MB file via Outlook using a US-based O365 tenant.
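To put that result in perspective, here is a rough back-of-the-envelope sketch in Python. It rests on a few assumptions that are not measurements: 1MB is taken as 10^6 bytes, the 202ms MAPI latency from the earlier example is assumed to apply to this test, and the "round trips" figure assumes the transfer was purely latency-bound, which is a simplification. The function names are illustrative only.

```python
FILE_MB = 14                 # attachment size from the test (1 MB taken as 10**6 bytes)
SEND_SECONDS = 3 * 60 + 38   # 3m38s observed send time
MAPI_RTT_SECONDS = 0.202     # assumed: 2ms + 200ms from the earlier example


def effective_mbps(size_mb: float, seconds: float) -> float:
    """Throughput actually achieved, in megabits per second."""
    return size_mb * 8 / seconds


def line_rate_seconds(size_mb: float, link_mbps: float) -> float:
    """Time the payload alone would need if only bandwidth mattered."""
    return size_mb * 8 / link_mbps


def implied_round_trips(seconds: float, rtt_seconds: float) -> float:
    """Round trips that would fit in the send time (latency-bound assumption)."""
    return seconds / rtt_seconds


print(f"effective throughput: {effective_mbps(FILE_MB, SEND_SECONDS):.2f} Mbps")
print(f"time at 10Gbps line rate: {line_rate_seconds(FILE_MB, 10_000) * 1000:.1f} ms")
print(f"round trips if latency-bound: {implied_round_trips(SEND_SECONDS, MAPI_RTT_SECONDS):.0f}")
```

The send achieved only about half a megabit per second, while a 10Gbps link could carry the same payload in roughly 11 milliseconds. The gap is consistent with the time being spent waiting on round trips rather than on bandwidth.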
Ensuring Outlook performance
So what options are available to address this performance problem? The first option is to use the multi-geo feature. Officially, Microsoft claims that the multi-geo feature was not designed with performance improvement in mind. However, the reality is that this could help with Outlook performance in general. It makes sense because the mailboxes are closer to the end users, which reduces the overall MAPI latency. However, multi-geo is not free and it doesn’t do anything to solve your last mile issues either. Also, bear in mind that multi-geo makes no guarantee about the location of the user’s mailbox. As such, a user in Singapore could have their mailbox in Hong Kong even though O365 has infrastructure in Singapore.
The second option is to deploy the SteelHead solution. Back in 2004, Riverbed was the first company to offer layer 7 specific optimization for the MAPI-over-TCP protocol. Over the years, we’ve introduced optimization for encrypted MAPI-over-TCP and MAPI-over-RPC (Outlook Anywhere). Today, Riverbed is the only vendor in the industry to offer layer 7 specific optimization for the MAPI-over-HTTP protocol (MoH). MoH is the only protocol O365 supports for communication between Outlook and Exchange Online.
Effectiveness of SaaS optimization
How effective is the SteelHead optimization? I ran the same test and a cold transfer took 17s while a warm transfer took 14s. Interestingly, a client machine in the US took 13s for the same operation. What this means is that the SteelHead effectively made the WAN latency disappear, providing an end-user experience similar to that of someone sitting in the US.
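For a quick side-by-side view, the short Python sketch below works out the ratios from the four timings quoted in this post (the 3m38s unoptimized send and the three results above). The labels and the helper name `speedup` are only for illustration.

```python
def speedup(baseline_s: float, measured_s: float) -> float:
    """How many times faster a measured run is than the baseline."""
    return baseline_s / measured_s


timings_s = {
    "Singapore, unoptimized": 3 * 60 + 38,  # 3m38s
    "Singapore, SteelHead cold": 17,
    "Singapore, SteelHead warm": 14,
    "US client": 13,
}

baseline = timings_s["Singapore, unoptimized"]
us_time = timings_s["US client"]

for label, seconds in timings_s.items():
    print(
        f"{label:27s} {seconds:4d}s  "
        f"{speedup(baseline, seconds):5.1f}x vs unoptimized  "
        f"{seconds / us_time:5.2f}x the US time"
    )
```

On these figures the cold transfer is roughly 13 times faster than the unoptimized send, the warm transfer is more than 15 times faster, and the warm transfer lands within about one second of the US client.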
I remember when I first started at Riverbed, one of our marketing slides said, “Making 3,000 miles feel like 30 feet.” Well, here is a perfect example of that.