Building a Heavy-Load Messaging System

How we delivered a powerful and resilient SMS platform that can scale to meet future traffic demands

About the project

Intelli Messaging simplifies mobile communication so you can cost-effectively build it into your business processes, marketing, or operations. Whether you are an application provider, a telecommunication services company, or a large business, we have the service and application offerings to support your needs.

Duration

  • 2+ years

Industry

  • Telecom
  • Business

Technology

  • Scala
  • Akka
  • Groovy
  • AngularJS
  • Java
  • Drools
  • MongoDB

Challenges

IntelliSMS has been present in the SMS market for a long time and, through the years, evolved to offer a wide range of SMS-related services:
• a wholesale enterprise SMS gateway,
• bulk and individual SMS sends,
• account and billing management for resellers,
• REST API for integrating with applications,
• 2-way SMS (replying to unique messages), and many others.

A couple of generations of the MessageCore platform were developed, backing the services mentioned above.

The previous system, MessageCore4 (MC4), worked well initially, but as more customers signed up and message volumes increased, scalability problems started to surface and hindered the business. The code base had also grown and took a lot of work to maintain, because several features had been added and removed over the years. That's when IntelliSMS turned to SoftwareMill to take part in developing the next-generation system, MC5. The core requirement was a scalable architecture that could accommodate the increasing messaging volume.

Another essential requirement was reliable messaging. A large part of the messaging traffic is alert messaging, so messages mustn't be lost. Combining replicated, persistent message storage, which delivers messages even if a downstream service is unavailable or a server crashes permanently, with the scalability and performance requirements gave our team a challenging task. Another important factor when creating the new system was to maintain a clean, extensible code base where new features could be implemented and plugged in relatively quickly without significantly altering existing code.

Technology used

  • #Java
  • #MongoDB
  • #Scala
  • #Akka
  • #SQL
  • #Groovy
  • #Drools
  • #AngularJS

How we met the client’s needs

We started by drafting the system architecture of MC5, dividing it into independent components:
• a submission server, where users submit SMS send requests and query SMS status;
• workers, which, through a series of pluggable handlers, provide message routing, billing, and reactions to reply SMS messages;
• SMSC communication servers, which send the messages downstream to specific SMS providers;
• reporting servers, which provide statistics and insights into SMS sending and delivery patterns.
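
To illustrate what "pluggable handlers" in a worker could look like, here is a minimal sketch in Java. It is not the MC5 code: the names SmsMessage, MessageHandler, and HandlerPipeline, and the routing and billing rules, are all hypothetical and chosen only to show the idea of adding a feature by registering a new handler.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical message type; the real MC5 model is not described in the article.
final class SmsMessage {
    final String recipient;
    final String text;
    String route;      // filled in by the routing handler
    int costInCents;   // filled in by the billing handler

    SmsMessage(String recipient, String text) {
        this.recipient = recipient;
        this.text = text;
    }
}

// One pluggable processing step inside a worker.
interface MessageHandler {
    void handle(SmsMessage message);
}

// Runs handlers in registration order; a new feature is a new handler.
final class HandlerPipeline {
    private final List<MessageHandler> handlers = new ArrayList<>();

    HandlerPipeline register(MessageHandler handler) {
        handlers.add(handler);
        return this;
    }

    void process(SmsMessage message) {
        for (MessageHandler handler : handlers) {
            handler.handle(message);
        }
    }
}

final class HandlerPipelineDemo {
    // Illustrative routing rule (an assumption): choose a provider by number prefix.
    static final MessageHandler ROUTING =
        m -> m.route = m.recipient.startsWith("+61") ? "provider-au" : "provider-default";

    // Illustrative billing rule (a simplification): 5 cents per started block of 160 characters.
    static final MessageHandler BILLING =
        m -> m.costInCents = 5 * ((m.text.length() + 159) / 160);

    static SmsMessage run(String recipient, String text) {
        SmsMessage message = new SmsMessage(recipient, text);
        new HandlerPipeline().register(ROUTING).register(BILLING).process(message);
        return message;
    }

    public static void main(String[] args) {
        SmsMessage m = run("+61400000000", "Server disk usage above 90%");
        System.out.println(m.route + " " + m.costInCents); // prints: provider-au 5
    }
}
```

Because the pipeline only depends on the MessageHandler interface, a handler for reply SMS messages could be registered without touching the routing or billing code.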

Each component can be scaled independently, providing flexibility and simplicity. The components are written in pure Java and communicate asynchronously through a persistent, replicated message queue. MongoDB was chosen for its ease of use and performance.
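
The article does not say how the queue guarantees that messages survive a crash. One common approach, assumed here purely for illustration, is acknowledgement-based redelivery: a message handed to a consumer is kept until the consumer confirms it, and is put back if no confirmation arrives. The sketch below is an in-memory stand-in (no persistence or replication), and AckQueue and its methods are hypothetical names.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Optional;

// In-memory stand-in for a persistent queue with acknowledgements (assumed design).
final class AckQueue {
    // A delivery pairs a message with the id the consumer must acknowledge.
    static final class Delivery {
        final long id;
        final String message;

        Delivery(long id, String message) {
            this.id = id;
            this.message = message;
        }
    }

    private final Deque<String> pending = new ArrayDeque<>();
    private final Map<Long, String> inFlight = new LinkedHashMap<>();
    private long nextDeliveryId = 0;

    synchronized void send(String message) {
        pending.addLast(message);
    }

    // Hands out the next message but keeps a copy until it is acknowledged.
    synchronized Optional<Delivery> receive() {
        String message = pending.pollFirst();
        if (message == null) {
            return Optional.empty();
        }
        long id = nextDeliveryId++;
        inFlight.put(id, message);
        return Optional.of(new Delivery(id, message));
    }

    // The consumer finished its work, so the message can be forgotten.
    synchronized void acknowledge(long deliveryId) {
        inFlight.remove(deliveryId);
    }

    // After a consumer crash or timeout, unacknowledged messages return to the
    // front of the queue in their original order. Returns how many were re-queued.
    synchronized int redeliverUnacknowledged() {
        List<String> messages = new ArrayList<>(inFlight.values());
        for (int i = messages.size() - 1; i >= 0; i--) {
            pending.addFirst(messages.get(i));
        }
        inFlight.clear();
        return messages.size();
    }
}

final class AckQueueDemo {
    public static void main(String[] args) {
        AckQueue queue = new AckQueue();
        queue.send("alert-1");
        queue.receive();                  // consumer "crashes" before acknowledging
        queue.redeliverUnacknowledged();  // the message becomes available again
        AckQueue.Delivery retry = queue.receive().get();
        queue.acknowledge(retry.id);
        System.out.println(retry.message);
    }
}
```

The trade-off of this style is that a message may be delivered more than once, so consumers need to tolerate duplicates.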

For SMS information persistence, events are stored in a MongoDB collection, forming a natural stream. The reporting system uses an SQL database, offering familiarity and query flexibility. To ensure performance and scalability, we conducted automated stress tests.
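
The idea of storing SMS information as a stream of events can be sketched as follows. This is an in-memory illustration only: it does not use MongoDB, and the SmsEventLog class and the Status values are hypothetical, since the article does not describe the real event schema. The point is that events are only appended, and the current state of a message is derived by reading its events.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Optional;

// Append-only event log; a plain list stands in for the MongoDB collection.
final class SmsEventLog {
    // Hypothetical lifecycle states; the article does not list the real ones.
    enum Status { SUBMITTED, ROUTED, SENT, DELIVERED, FAILED }

    static final class SmsEvent {
        final String messageId;
        final Status status;
        final long timestampMillis;

        SmsEvent(String messageId, Status status, long timestampMillis) {
            this.messageId = messageId;
            this.status = status;
            this.timestampMillis = timestampMillis;
        }
    }

    private final List<SmsEvent> events = new ArrayList<>();

    // Events are only ever appended, never updated in place.
    void append(String messageId, Status status, long timestampMillis) {
        events.add(new SmsEvent(messageId, status, timestampMillis));
    }

    // The current status of a message is the status of its most recently appended event.
    Optional<Status> currentStatus(String messageId) {
        Status latest = null;
        for (SmsEvent event : events) {
            if (event.messageId.equals(messageId)) {
                latest = event.status;
            }
        }
        return Optional.ofNullable(latest);
    }

    // Full history of one message, in the order the events were recorded.
    List<Status> history(String messageId) {
        List<Status> result = new ArrayList<>();
        for (SmsEvent event : events) {
            if (event.messageId.equals(messageId)) {
                result.add(event.status);
            }
        }
        return result;
    }
}

final class SmsEventLogDemo {
    public static void main(String[] args) {
        SmsEventLog log = new SmsEventLog();
        log.append("msg-1", SmsEventLog.Status.SUBMITTED, 1000L);
        log.append("msg-1", SmsEventLog.Status.SENT, 1200L);
        System.out.println(log.currentStatus("msg-1").get());
        System.out.println(log.history("msg-1"));
    }
}
```

Keeping the full history also gives a reporting component everything it needs to compute sending and delivery statistics later.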

Designing for performance and scalability is one thing; ensuring the system is actually performant and scalable is another. That is why one of the first tasks we completed, once a skeleton of the system was in place, was setting up an overnight, automated stress test in which the system sends and receives simulated SMS messages over 4 hours. It let us iron out many integration issues at the operating system, load balancer, Mongo, and application levels. Because the test runs every night, we know quickly, after a day of work, whether we have degraded the system's performance.

IntelliSMS is a truly distributed team; we work remotely across two different time zones, so we kept the formalism of our cooperation as low as possible.

Results

Code quality and maintainability were vital when developing the project. We used many now-standard practices: (fast) unit tests and (slower) integration tests, adherence to "clean code" rules, attention to naming, small classes with a single responsibility, and so on. Each change is also code-reviewed by other team members to catch bugs and design flaws and to spread knowledge about the system. We had no strict requirement documents. Instead, we tried to communicate as frequently as possible to deliver the highest value for IntelliSMS and hence the highest value for IntelliSMS users.

The system has been in production for almost two years; it has proved to work well under load, and the architecture met the original requirements. The code base is kept in good shape, allowing existing developers to modify the code quickly and new developers to join the project without problems. We have delivered a performant and resilient SMS platform, which can be scaled to meet future traffic demands, and evolved to include new features to meet emerging business needs.

Peter Humphries
Executive Director at Intelli Messaging

"With a very difficult development task with some very high benchmarks SoftwareMill has delivered a great result and done so very cost effectively."
