
I have installed and configured FreeSWITCH, and it is up and running perfectly. Now I need to achieve high availability. My FreeSWITCH is deployed on an AWS Ubuntu EC2 instance. The AWS docs for HA describe the floating IP concept. I tried this, but I can't create a virtual IP in AWS. I also tried EFS, and that failed too. Are there any other possible solutions?

2 Answers


  1. The FreeSWITCH docs for "HA" describe a trivial (and practically useless) scenario from which FreeSWITCH can recover. That’s not really giving you HA. There is no sensing of different types of failures, no coordination between nodes in case both/neither are trying to become active, no synchronization of data, no control of nodes if promotion fails, no response to events/failures, etc. If this is for a home system then just create a standby node and keep it powered off until you need it. If it’s for a commercial VoIP installation read on.

    Start by having a look at this voip-info web page: high availability design. Although a couple of parts refer to Asterisk instead of FreeSWITCH, it does a great job of explaining what high availability is and isn’t. (It’s easy to confuse high availability with load balancing.) If you think a floating IP is the biggest challenge, then you might have overlooked what it takes to make VoIP HA work in the real world.

    Start by defining a VoIP ‘failure’. In the most simplistic terms, it’s the FreeSWITCH process dying. But often the FreeSWITCH process is alive and just not bridging calls (so avoid simplistic process monitoring scripts). What if the network connection goes out (or a firewall fails), or a route between your AZ and your trunks/phone sets becomes unstable? Your HA solution should be able to consider environmental factors such as upstream routes to determine whether a peer can no longer offer telephony service. Some solutions use generic Linux heartbeat software, which has no deep visibility into FreeSWITCH or into the environment.

    What about keeping data in sync between peers? That covers everything from voicemails, to configuration data, to phone set firmware, etc. Solutions like RDS or DRBD make it easy, but corruption by one peer immediately corrupts the other. For example, if a corrupt process on one peer damages critical FreeSWITCH files, will the other peer start? (If they use DRBD, then no.) So avoid RDS- or DRBD-based ‘solutions’.

    If you introduce load balancing (i.e. multiple active peers), which one ‘wins’ in the event that 2 peers each receive voicemail #1 for user 123 at the same time? This requires you to introduce front-end servers for call bridging, back-end servers for voicemail, etc. And you still have single points of failure or shared components.

    If you recover from a failure and the cluster needs to re-assemble, what happens if each peer wrote data to its copy of the shared ‘disk’? Do you manually start reconciling? What if 2 peers come up at once (dual active) – which one wins and takes over? If you introduce a shared disk solution (DRBD, NFS, iSCSI) then you eliminate one of the biggest and most important elements of an HA solution: peer autonomy. So look for ‘synchronization’, not ‘shared disk’.

    The cheapest ‘HA’ solutions for FreeSWITCH tend to use a shared virtual disk (e.g. DRBD/NFS/SMB) and/or a shared channel bank. As described above, real HA solutions (like the ones used in 911/PSAP call centers) require completely autonomous peers and call paths. There are solutions which start adding redundancy everywhere (e.g. front end, database, role servers, etc.), but all they really do is create multiple points of failure that cannot handle most real-world failure scenarios. And they are difficult to maintain.

    So now decide if you want free or commercial, and what the tradeoffs are. On the high end is HAfs, a (free/commercial) product which has no shared components, uses sophisticated health detection, and is compatible with all FreeSWITCH distributions, but it requires more Linux skills to install and can be more expensive depending on the edition (more for enterprise or mission-critical phone systems). On the other end is flipit, a free script that is simple to install, but it’s a stretch to call it ‘HA’. Although you asked about AWS, there is also VMware, which offers generic HA (but it’s not PBX/trunk/SIP/etc. aware), and you will also find some vendors offering RAID 1 as "HA" for a PBX, but that’s a stretch too. And there are more products in this spectrum. No vendor ‘approves of’ or ‘endorses’ or ‘certifies’ any other product, so you have to try before you buy.

    You will also find people offering ‘containers’ as HA solutions, but that isn’t really HA. Containers are a convenient way to deploy software, and you can have a spare PBX container ready to deploy, but you get no synchronization of settings/voicemails/etc., no detection of failures, and so on.

    Finally, you will also find references to Kamailio, OpenSIPS, proxies, etc. But all of these create a new single point (or many points) of failure in front of your "cluster". Then you have to add HA to your HA solution, etc.

    Just be sure to ask the right questions when you do evaluate products! No single solution is right for everyone, but the voip-info HA design page will help you understand the trade-offs. If you need to meet 911/PSAP standards or are building for a high-volume call center, have a look at the high-end HAfs product. If it’s for home use, try flipit (I saw an AWS-specific version somewhere but can’t find it now) or generic Linux clustering (combining packages for heartbeat, rsync, monitoring, etc., all available on AWS too). In the end you will discover that moving a floating IP is one of the easiest things to get working.
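
The first answer's point about failure detection (a live process that no longer handles calls, or a broken upstream route) can be sketched in Python. This is only a minimal illustration and not how HAfs or flipit work: it sends a SIP OPTIONS ping over UDP, treats any SIP response as a sign that the SIP stack is responsive, and declares a failure only after several consecutive bad rounds. The names `sip_probe` and `FailureDetector`, the 2-second timeout, and the threshold of 3 are all assumptions made for the example.

```python
import socket
import uuid
from typing import Optional


def build_options_request(target_host: str, target_port: int,
                          local_host: str, local_port: int) -> bytes:
    """Build a minimal SIP OPTIONS request (RFC 3261) to use as a liveness probe."""
    branch = "z9hG4bK" + uuid.uuid4().hex  # RFC 3261 branch magic cookie
    lines = [
        f"OPTIONS sip:{target_host}:{target_port} SIP/2.0",
        f"Via: SIP/2.0/UDP {local_host}:{local_port};branch={branch}",
        "Max-Forwards: 70",
        f"From: <sip:probe@{local_host}>;tag={uuid.uuid4().hex[:8]}",
        f"To: <sip:{target_host}:{target_port}>",
        f"Call-ID: {uuid.uuid4().hex}@{local_host}",
        "CSeq: 1 OPTIONS",
        "Content-Length: 0",
        "", "",  # blank line that terminates the header section
    ]
    return "\r\n".join(lines).encode("ascii")


def parse_status_code(response: bytes) -> Optional[int]:
    """Return the status code from a SIP response's first line, or None if malformed."""
    first_line = response.split(b"\r\n", 1)[0].decode("ascii", errors="replace")
    parts = first_line.split(" ", 2)
    if len(parts) >= 2 and parts[0] == "SIP/2.0" and parts[1].isdigit():
        return int(parts[1])
    return None


def sip_probe(target_host: str, target_port: int = 5060, timeout: float = 2.0) -> bool:
    """Send one OPTIONS ping over UDP; True if any SIP response arrives in time."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        try:
            sock.connect((target_host, target_port))  # fixes the local address used in Via
            local_host, local_port = sock.getsockname()
            sock.send(build_options_request(target_host, target_port,
                                            local_host, local_port))
            return parse_status_code(sock.recv(4096)) is not None
        except OSError:  # timeout, unreachable port, name resolution failure, ...
            return False


class FailureDetector:
    """Declare a peer failed only after `threshold` consecutive unhealthy rounds."""

    def __init__(self, threshold: int = 3) -> None:
        self.threshold = threshold
        self.consecutive_failures = 0

    def record(self, sip_ok: bool, upstream_ok: bool) -> bool:
        """Record one probe round; return True once the peer should be treated as failed."""
        if sip_ok and upstream_ok:
            self.consecutive_failures = 0
        else:
            self.consecutive_failures += 1
        return self.consecutive_failures >= self.threshold
```

A monitor would call `sip_probe` on a timer against the peer and against something upstream (for example a trunk address, hypothetical here) and feed both results into `record`. An OPTIONS reply still does not prove that calls are being bridged, so a real check would need deeper signals than this.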

  2. In FreeSWITCH, there is no single solution for HA.
    If you want a VIP, there is [0], but it is not really recommended: a VIP setup has to decide whether the server is up or down, and that is not easy to get right.
    There are multiple solutions for this, but generally you should think of FreeSWITCH as an application that runs some business logic, and put a reverse proxy in front of it. The proxy is lightweight, and it offers much more in terms of HA and scalability.

    OpenSIPS or Kamailio can handle at least ten times as many concurrent sessions as any softswitch (FreeSWITCH, Asterisk, etc.).
    So your solution is to put a SIP proxy in front of FreeSWITCH. Then, if FreeSWITCH is down, the proxy will not dispatch calls to it.
    The proxy itself has a dedicated protocol for multi-master operation.
    Still, you need to think about edge cases like BLF, conferences, etc.

    Generally, with VoIP HA, don’t try to save the live calls. Just make sure the system is up and running for the next calls.

    UPDATE: see https://developer.signalwire.com/freeswitch/FreeSWITCH-Explained/Configuration/High-Availability/

    [0]: (outdated URL https://freeswitch.org/confluence/pages/viewpage.action?pageId=7143926)
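
The dispatching behaviour described in the second answer (the proxy stops sending calls to a FreeSWITCH node that is down) can be illustrated with a small Python sketch. It is not Kamailio or OpenSIPS configuration, only a model of the selection logic: round-robin over the backends, skipping any that a health check has marked down. The `Dispatcher` class and the backend names are hypothetical.

```python
from typing import Dict, List, Optional


class Dispatcher:
    """Round-robin over backends, skipping any currently marked down."""

    def __init__(self, backends: List[str]) -> None:
        self.backends = list(backends)
        self.up: Dict[str, bool] = {b: True for b in self.backends}
        self._next = 0  # index where the next search starts

    def mark(self, backend: str, is_up: bool) -> None:
        """Record the latest health-check result for one backend."""
        self.up[backend] = is_up

    def pick(self) -> Optional[str]:
        """Return the next healthy backend, or None if every backend is down."""
        for offset in range(len(self.backends)):
            idx = (self._next + offset) % len(self.backends)
            candidate = self.backends[idx]
            if self.up[candidate]:
                self._next = (idx + 1) % len(self.backends)
                return candidate
        return None
```

New calls go only to nodes that are currently marked up, which matches the advice above to protect the next calls instead of the live ones. Calls already in progress on a failed node are not recovered by this logic.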
