Unified Confusion: October 2013

Wednesday, October 23, 2013

Some Cisco UC Basics, part Deux

Cisco's call control product is UC Manager, which used to be named CallManager and is also referred to as Communications Manager or CUCM. It's designed to be scalable, potentially up to 10,000 endpoints registered per node. The per-node scalability is determined by the VMware OVA template that gets applied in the initial build-out of the system. Systems Integrators choose the correct OVA by adding the current number of endpoints in the environment to a target growth number, which is normally provided by the customer. The reason not every Systems Integrator goes straight to the 10,000-user OVA is that its virtual machine specifications are heavier. A larger OVA reserves more of the physical host's resources, which reduces the host's capacity to hold other virtual machines.
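The sizing decision above (current endpoints plus growth, then the smallest template that fits) can be sketched in a few lines of Python. This is only an illustration: the tier sizes are placeholders rather than Cisco's published OVA list (only the 10,000-endpoint ceiling comes from this post), and `pick_ova` is a hypothetical helper name.

```python
# Hypothetical per-node OVA tiers (maximum endpoints per node).
# Only the 10,000 ceiling comes from the post; the smaller tiers are
# placeholders chosen for illustration.
OVA_TIERS = [1000, 2500, 7500, 10000]


def pick_ova(current_endpoints, growth_endpoints, tiers=OVA_TIERS):
    """Return the smallest tier that covers current endpoints plus growth."""
    target = current_endpoints + growth_endpoints
    for size in sorted(tiers):
        if target <= size:
            return size
    raise ValueError(f"{target} endpoints exceeds the largest per-node tier")


# 1,800 phones today plus 500 of planned growth fits the 2,500 tier.
print(pick_ova(1800, 500))
```

Picking the smallest tier that fits, rather than the largest available, is what leaves room on the physical host for other virtual machines.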

UC Manager nodes can be clustered together using a Cisco proprietary protocol that passes call and endpoint state between nodes within the cluster. This protocol requires that every node maintain a connection to every other node, so the connections are fully meshed. This affects the bandwidth required between nodes in the cluster, as a single connection requires up to 1.5 Mbps of bandwidth. That is a big deal if you are planning to cluster nodes over the WAN and have limited bandwidth to work with. The protocol allows the entire cluster to act as one big registrar for endpoints. Cisco allows up to 8 call processing nodes in a cluster, and caps the number of phones per cluster at 40,000.
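To get a feel for how quickly a full mesh grows, here is a rough Python sketch. The helper names are made up for illustration, and the bandwidth figure is a worst-case upper bound that simply multiplies the "up to 1.5" per-connection number quoted above by the connection count. Real usage depends on call volume.

```python
def mesh_connections(nodes):
    """Number of node-to-node connections in a full mesh of `nodes` nodes."""
    return nodes * (nodes - 1) // 2


def worst_case_bandwidth(nodes, per_connection=1.5):
    """Upper bound on intra-cluster bandwidth, in the same unit as
    `per_connection` (the post's per-connection ceiling of 1.5)."""
    return mesh_connections(nodes) * per_connection


for n in (2, 4, 8):
    print(n, "nodes:", mesh_connections(n), "connections,",
          "up to", worst_case_bandwidth(n), "total")
```

Each node holds `nodes - 1` connections, so going from 4 to 8 call processing nodes takes the mesh from 6 connections to 28. What matters for a WAN design is the subset of those connections that actually cross the WAN link.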

In SIP terms, UC Manager is a B2BUA; however, media never flows through it. Media always flows directly between endpoints, and there isn't even a configuration option to change this. That makes the scalability of the system a little more predictable. Nowadays, SIP is definitely the protocol of choice for CallManager, both line-side to endpoints and trunk-side to PSTN gateways, although the protocol used to the gateway can vary depending on which Cisco voice engineer you talk to.

The gateway is almost always a Cisco router. Voice functionality is built into Cisco's networking OS (IOS), and is enabled by a "UC" license. The gateway runs one of four possible protocols: SCCP (Skinny), MGCP, H.323, or SIP. The protocol used is usually determined by what the customer is connecting to the gateway. Analog lines, for instance, usually work better if SCCP or MGCP is used. PRIs are going to entail usage of MGCP, H.323, or SIP. If the customer is setting up a SIP trunk to a telco, SIP would be the protocol used from the gateway to CUCM. MGCP relies on CallManager for call control and dial-plan decisions. H.323 and SIP are effectively peer-to-peer protocols, meaning there are separate dial-plans and call routing decisions made on both CallManager and the gateway. The gateway can also provide transcoding, ad-hoc conferencing, and Media Termination Point (for media relaying and DTMF termination) services. These services register to CUCM using SCCP, and can easily be co-located with the other protocols.
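The rules of thumb in that paragraph boil down to a small lookup table. The Python sketch below only restates them. The dictionary keys and the `gateway_options` helper are names invented for this example, not anything from Cisco, and the post doesn't say where dial-plan decisions live for SCCP, so that entry is left as "not stated".

```python
# Candidate gateway protocols per connection type, as described in the post.
PROTOCOLS_BY_CONNECTION = {
    "analog": ["SCCP", "MGCP"],
    "pri": ["MGCP", "H.323", "SIP"],
    "sip_trunk": ["SIP"],
}

# Where dial-plan and call-routing decisions are made for each protocol.
DIAL_PLAN_LOCATION = {
    "MGCP": "CUCM only",
    "H.323": "CUCM and gateway",
    "SIP": "CUCM and gateway",
}


def gateway_options(connection):
    """Return (protocol, dial-plan location) pairs for a connection type."""
    protocols = PROTOCOLS_BY_CONNECTION[connection]
    return [(p, DIAL_PLAN_LOCATION.get(p, "not stated")) for p in protocols]


for protocol, location in gateway_options("pri"):
    print(f"{protocol}: dial-plan on {location}")
```

The second table is the practical takeaway: choosing H.323 or SIP means maintaining call routing in two places, while MGCP keeps it all on CallManager.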

In the first part of this series, I mentioned that Cisco had chosen IBM Informix for the database component. CallManager passes configuration data between nodes through a database Publisher/Subscriber model, and there are a few challenges to this approach. The first node installed becomes the Publisher server. This role is locked in to the first installed node, and cannot be changed later without more or less backing up the Publisher, re-installing, and restoring the backup. Cisco has not provided the capability to promote a Subscriber server to a Publisher. So if you find yourself with a downed Publisher and no backup, you'll have a headache in front of you, because the environment will need to be rebuilt. The Subscriber nodes will continue functioning, but no configuration changes can be made to the servers without that Publisher. Also, for sometimes unknown reasons (and quite a few known ones), database replication will break. There's a list of CLI commands to go through when this happens. That resolves the problem 80% of the time, and the remaining 20% of the time you will be on the phone with Cisco's Technical Assistance Center (TAC).

Stay tuned for part 3, where I'll talk about some of the other Cisco UC applications (client and server).

Monday, October 21, 2013

Some Cisco UC Basics, part 1

In part one of this series, I'm going to explain the overall Cisco UC landscape as it exists today (with a bit of historical information thrown in).

The Cisco UC Product Suite consists of a slew of different appliance virtual machines, providing functions like traditional call control (dial-tone), voicemail, presence, E911 (specifically, providing more specific location information to the Public Safety Answering Point than just physical address), conferencing (think web-based content sharing and audio bridging), and telepresence (video conferencing). All of those functions run separately, so there are no co-resident applications on a single virtual server instance. Multiple virtual machines are allowed on a physical host, but over-subscription of resources is not permitted. For instance, if each virtual machine takes 2 processor cores and 4 GB of RAM, a physical host that has 2 quad-core processors (8 total cores) and 32 GB of RAM would hold a maximum of 4 virtual machines. Even though there is more RAM available, the physical host runs out of cores first, and thus cannot hold additional virtual machines.
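The no-over-subscription arithmetic in that example is simply "whichever resource runs out first wins". Here is a minimal Python sketch of it, assuming identical virtual machines and ignoring any resources the hypervisor itself needs (`max_vms` is just an illustrative name):

```python
def max_vms(host_cores, host_ram_gb, vm_cores, vm_ram_gb):
    """Largest number of identical VMs a host can hold when neither
    cores nor RAM may be over-subscribed."""
    by_cores = host_cores // vm_cores
    by_ram = host_ram_gb // vm_ram_gb
    return min(by_cores, by_ram)


# The example from the text: 8 cores and 32 GB of RAM on the host,
# 2 cores and 4 GB of RAM per VM. RAM would allow 8 VMs, cores only 4.
print(max_vms(8, 32, 2, 4))  # 4
```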

All the individual services within the virtual machines run on top of a re-packaged Red Hat Enterprise Linux. Even though this is the RHEL that a lot of companies know and love, the normal Linux bash shell is not accessible. Cisco built a CLI wrapper that mimics some of the traditional IOS functionality and allows an administrator to perform basic OS administrative functions. This re-packaged RHEL is referred to as the Cisco Telephony OS. Cisco uses an IBM Informix database for storing all configuration, call states, and Call Detail Records. Depending on the application, you may or may not be able to directly access the database, and might be limited to an API for accessing and/or modifying relevant information. The graphical administrative interfaces and web-based APIs all run on top of Apache Tomcat.

As far as scalability goes, Cisco is very rigid in their product system requirements. This means you can't just go willy-nilly and throw a product instance on any old server or hypervisor. The general rule of thumb is that their products should be on UCS servers, virtualized on VMware 5.x, and if there is any SAN connected to the servers it should be connected via Fibre Channel. However, they will allow customers to go outside of these requirements to some degree. You may use any brand of server you want as long as the hardware specs meet the requirements, and you may also use iSCSI or NFS for storage. The caveat is that the level of support you get from Cisco will not be as good as it would be if you had stayed on UCS servers and Fibre Channel or direct-attached storage.

At first, customers didn't have the option to virtualize, so every appliance ran on bare-metal machines that Cisco OEM'ed from HP or IBM. Version 8.0 introduced the option of hardware abstraction, and in the upcoming 10.0 release virtualization is going to be the only way these appliances are supported. There have been rumors that Cisco would open up the hypervisor requirement to allow something else. Personally, I'm hoping for some KVM support, because today VMware is just another thing that customers have to purchase.

Thanks for reading, and stay tuned for part 2!