Saturday, November 20, 2010

DR, Redundancy, Backup, Fail-over: What does it all mean?

So many customers, when choosing a new phone system, ask about redundancy and disaster recovery, but what they are really asking about is backup or fail-over.  Most have no concept of what true redundancy and DR entail or how to implement them in a communications system.  Start with trunking.

If you look at your trunking, how do you have redundancy at any level when most connections come in through a single LEC, even though you may have separate CLECs?  One option is SIP, but the data connection underneath is usually still copper or fiber from a single LEC.  If you can divide your services among wireless, copper, and fiber, you have a better chance of redundancy, but fail-over then becomes an issue, as most secondary carriers will not be able to send you the same DNIS digits as the original carrier.  Having a single carrier will allow this across different trunks, even in separate locations, so that is an option, as is SIP over different carriers, but retargeting SIP trunks is not easily accomplished with most providers.  Some customers believe that if they put all their eggs into a data center, they are protected and don’t need to think about redundancy, but the question of business continuity arises when the MPLS link to the data center fails.
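One quick way to see the single-LEC problem is to list your trunks by who actually delivers the last mile rather than who sends the bill.  The sketch below is only an illustration: the trunk names, carriers, and LEC labels are all made up, and a real inventory would come from your carrier records.

```python
from collections import defaultdict

# Hypothetical inventory: (trunk, billing carrier, last-mile LEC, medium).
# None of these names come from a real deployment.
TRUNKS = [
    ("PRI-1", "CLEC-A", "LEC-X", "copper"),
    ("PRI-2", "CLEC-B", "LEC-X", "copper"),
    ("SIP-1", "ITSP-C", "LEC-X", "fiber"),
    ("LTE-1", "Wireless-D", "Wireless-D", "wireless"),
]


def shared_last_mile(trunks):
    """Group trunks by last-mile LEC; keep only LECs carrying more than one."""
    by_lec = defaultdict(list)
    for name, _carrier, lec, _medium in trunks:
        by_lec[lec].append(name)
    return {lec: names for lec, names in by_lec.items() if len(names) > 1}


def survives_loss_of(trunks, failed_lec):
    """Return the trunks that stay up if one last-mile LEC goes dark."""
    return [name for name, _carrier, lec, _medium in trunks if lec != failed_lec]


if __name__ == "__main__":
    print(shared_last_mile(TRUNKS))
    print(survives_loss_of(TRUNKS, "LEC-X"))
```

In this made-up inventory, three trunks from three different carriers all ride on the same LEC, so losing that LEC leaves only the wireless trunk standing.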

Phone sets are another issue.  If you are still deploying digital sets and haven’t moved to SIP, good luck!  While there are systems out there that can redirect calls to other units in a network, for DR, not having IP sets is the same as expecting agents to carry their computers home to work when the corporate site is down.  SIP sets can register to multiple locations, meaning that if your main site goes down, sets can use alternate routing to other servers or even be taken to other sites and brought up on the DR servers.  With soft phones and remote number login, all you need is a web browser and a phone line or cell phone to stay connected to the office and your customers through a secondary DR server.
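The idea of a set registering to more than one location boils down to a priority list: try the main server first, then fall back to the DR server.  Here is a minimal sketch of that logic.  The host names are hypothetical, and the reachability check is passed in as a plain function because how a given phone actually probes its server depends on the vendor; in practice this is normally handled by the phone’s own configuration, not by custom code.

```python
# Hypothetical registrar list in priority order: main site, then DR site.
REGISTRARS = ["pbx-main.example.com", "pbx-dr.example.com"]


def pick_registrar(registrars, is_reachable):
    """Return the first registrar, in priority order, that answers.

    is_reachable is any function that takes a host name and returns
    True or False; it is a stand-in for the phone's real probe.
    Returns None when no registrar responds.
    """
    for host in registrars:
        if is_reachable(host):
            return host
    return None


def main_site_down(host):
    """Simulated outage: everything except the main site responds."""
    return host != "pbx-main.example.com"


if __name__ == "__main__":
    print(pick_registrar(REGISTRARS, main_site_down))
```

With the main site simulated as down, the set lands on the DR server; if nothing answers, the function returns None and the phone is out of service until a registrar comes back.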

Phone systems of the past were not very redundant or able to provide full DR.  Some had simple fail-over and most offered backup, but instantaneous redundancy with full mirroring of information has only been around for a few years at most.  Today’s servers can be stacked, dispersed, and stored (virtually) anywhere and can take over automatically or with very little intervention from system administrators.  One important factor to look at when choosing a phone system is whether the application servers, such as voicemail, IVR, or speech servers, have full redundancy that works with the DR capabilities of the actual phone system.  With systems like the ININ CIC server, though, all applications are in one box, so there are consistent, replicated databases and functions no matter what server you are on.  If a switchover to a backup server does occur, planning must be in place to redirect trunks, re-register phones, and allow clients to reconnect to the new server.  Without this, no amount of DR work will help.
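Those three switchover tasks make a handy checklist.  As a rough sketch, a DR plan can be checked for gaps by confirming that each task has both an owner and a written procedure.  The step names, plan layout, and sample entries below are my own illustration, not part of any vendor’s tooling.

```python
# The three switchover tasks named above, as hypothetical plan keys.
REQUIRED_STEPS = ("redirect_trunks", "reregister_phones", "reconnect_clients")

# Hypothetical DR plan: one entry is incomplete and one is missing entirely.
PLAN = {
    "redirect_trunks": {
        "owner": "carrier NOC",
        "procedure": "call carrier to retarget trunks to the DR site",
    },
    "reregister_phones": {"owner": "telecom admin", "procedure": ""},
}


def missing_steps(plan):
    """Return required steps that lack an owner or a written procedure."""
    gaps = []
    for step in REQUIRED_STEPS:
        entry = plan.get(step) or {}
        if not entry.get("owner") or not entry.get("procedure"):
            gaps.append(step)
    return gaps


if __name__ == "__main__":
    print(missing_steps(PLAN))
```

In this sample plan, phone re-registration has an owner but no procedure and client reconnection is not covered at all, so both are flagged.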

At a hardware level, DR and redundancy include ensuring good off-site backups, RAID configurations on the hard drives, redundant power supplies, dual NICs, and all the other protection you can give your servers so they survive as long as possible in a disaster situation.  If you look outside of the actual phone platform, what about the network?  Do you have dual switch fabric in place?  Do you use spanning between multiple switches and routers?  Do you have dual routers in place with complete capabilities to route your inbound and outbound traffic?

Obviously, thinking about DR at this level can give any IT manager an ulcer in no time at all.  It is important to include experts at all levels in your DR plan, starting with your phone vendor, since telecommunications is the lifeblood of any company: no one can survive without communicating with the outside world and customers.  For an inbound call center, it is far more important to have people answering questions off the top of their heads than to have a web site up in the middle of a disaster.  So many companies take the other approach: they build Fort Knox for their data silos and plan for complete redundancy at a DR site, but still count on their single PRI, single voicemail computer, and single phone system to get them through a disaster or major outage.

There are many consultants who can help you think through your DR and redundancy plans, but don’t go overboard.  Remember that people in a disaster will not be thinking about how they should answer the next call, but about how to reach their loved ones the fastest.  Plan all you like, but remember the people factor in the equation.

Robert Wakefield-Carl, QoS Telesys

