We are building a large-scale e-commerce website to serve over 100,000 users, and we expect the number of users to grow rapidly over the first year. In general, the site functions very much like eBay: users can create, update, and remove listings. Users can also search listings and purchase an item of interest. The system has both transactional and non-transactional requirements:
**Transactional**
- Create a listing (multi-record update)
- Remove a listing
- Update a listing
- Purchase a listing (multi-record update)

**Non-Transactional**
- Search listings
- View a listing
We want to leverage the power of scalable, document-based NoSQL data stores such as CouchDB or MongoDB, but at the same time we need a relational store to support our ACID transactional requirements. So we have come up with a hybrid solution that uses both technologies.
Since the site is "read mostly", we set up a MongoDB data store to meet the scalability needs. For the transactional needs, we set up a MySQL Cluster. As the middleware component, we use a JBoss application server cluster.
When a “search” request comes in, JBoss directs the request to Mongo to handle the search which should produce very quick results while not burdening MySQL. When a listing is created, updated, removed, or purchased, JBoss services the transactions against MySQL. To keep MongoDB and MySQL synchronized, all transactional requests handled by JBoss against MySQL would include a final step in the business logic that updates the corresponding document in MongoDB via the listing id; we plan to use the MongoDB Java API to facilitate this integration of updating the document.
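To make the synchronization step concrete, here is a minimal sketch of the "commit to MySQL, then update the MongoDB document by listing id" flow. It is not the real integration: `RelationalStore` and `DocumentStore` are hypothetical in-memory stand-ins for MySQL and MongoDB, and the `Listing` fields, the `pendingSync` set, and `retryPending` are assumptions added for illustration. The point it tries to show is that the two writes are not one atomic unit, so the middleware needs some way to notice and repair a document update that fails after the relational commit.

```java
import java.util.HashMap;
import java.util.Iterator;
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Optional;
import java.util.Set;

/** Sketch of the dual-write step; both stores are in-memory stand-ins (assumptions). */
class ListingSync {
    /** Hypothetical listing shape; a real listing would carry more fields. */
    static final class Listing {
        final long id;
        final String title;
        final long priceCents;
        Listing(long id, String title, long priceCents) {
            this.id = id;
            this.title = title;
            this.priceCents = priceCents;
        }
    }

    /** Stand-in for the MySQL side, treated as the system of record. */
    static class RelationalStore {
        private final Map<Long, Listing> rows = new HashMap<>();
        void saveInTransaction(Listing l) { rows.put(l.id, l); }
        Optional<Listing> find(long id) { return Optional.ofNullable(rows.get(id)); }
    }

    /** Stand-in for the MongoDB side, used only for search and view. */
    static class DocumentStore {
        private final Map<Long, Listing> docs = new HashMap<>();
        boolean failNextWrite = false; // lets the sketch simulate an outage
        void upsert(Listing l) {
            if (failNextWrite) {
                failNextWrite = false;
                throw new IllegalStateException("document store unavailable");
            }
            docs.put(l.id, l);
        }
        Optional<Listing> find(long id) { return Optional.ofNullable(docs.get(id)); }
    }

    private final RelationalStore sql;
    private final DocumentStore docs;
    // Ids whose document copy is stale. Kept in memory here; a real system
    // would need to persist this so it survives a middleware restart.
    private final Set<Long> pendingSync = new LinkedHashSet<>();

    ListingSync(RelationalStore sql, DocumentStore docs) {
        this.sql = sql;
        this.docs = docs;
    }

    /** Commits to the relational store first, then refreshes the document copy. */
    boolean updateListing(Listing l) {
        sql.saveInTransaction(l);
        try {
            docs.upsert(l);
            pendingSync.remove(l.id);
            return true;
        } catch (RuntimeException e) {
            pendingSync.add(l.id); // relational commit stands; search results are stale
            return false;
        }
    }

    /** Replays stale ids from the system of record into the document store. */
    void retryPending() {
        Iterator<Long> it = pendingSync.iterator();
        while (it.hasNext()) {
            long id = it.next();
            try {
                sql.find(id).ifPresent(docs::upsert);
                it.remove();
            } catch (RuntimeException e) {
                // leave the id queued for the next retry round
            }
        }
    }

    int pendingCount() { return pendingSync.size(); }
}
```

Between a failed `upsert` and the next `retryPending`, searches would return the old version of the listing, which is the window the design has to tolerate.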
So, in essence, since the site is read mostly, the architecture allows us to scale out MongoDB horizontally to accommodate more users. Using MySQL allows us to leverage the ACID properties of relational databases while keeping our MongoDB store updated through the JBoss middleware.
Is there anything wrong with this architecture? No platform can offer consistency, availability, and partition tolerance at the same time (NoSQL systems usually give up consistency), but we believe this hybrid approach lets us realize all three at the cost of additional complexity in the system. We are OK with that, since all of our requirements are being met.
2 Answers
If you have already built it, there isn’t too much wrong with the architecture aside from being a little too enterprisey. Starting from scratch on a system like this though, I’d probably leave out SQL and the middleware.
The loss of consistency in NoSQL data stores isn't as complete as you suggest. Aside from the fact that many of them do support transactions and can be set up for immediate consistency on particular queries, I suspect some of your requirements are simply an artefact of designing things relationally. Your concern seems to be around operations that require updates to multiple records. Is a listing really multiple records, or is it just set up that way because SQL records have to have a flat structure?
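As a rough illustration of that question, the sketch below builds a listing as one nested document instead of a parent row plus child rows. The field names (`images`, `shipping`, `priceCents`) and the `EmbeddedListing` helper are hypothetical, since the question doesn't say what a listing contains. In MongoDB a write to a single document is atomic, so a listing stored in this shape could be created or updated without a multi-record transaction. A purchase that also touches a buyer or order record may still need more than that.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Sketch: one nested listing document versus several normalized rows. */
class EmbeddedListing {
    /** Builds a single document holding the listing and its child data. */
    static Map<String, Object> toDocument(long id, String title,
                                          List<String> imageUrls,
                                          Map<String, Long> shippingCents) {
        Map<String, Object> doc = new LinkedHashMap<>();
        doc.put("_id", id);
        doc.put("title", title);
        doc.put("images", new ArrayList<>(imageUrls)); // embedded array
        List<Map<String, Object>> shipping = new ArrayList<>();
        for (Map.Entry<String, Long> e : shippingCents.entrySet()) {
            Map<String, Object> option = new LinkedHashMap<>();
            option.put("method", e.getKey());
            option.put("priceCents", e.getValue());
            shipping.add(option); // embedded sub-document
        }
        doc.put("shipping", shipping);
        return doc;
    }

    /** Rows the same data would occupy when normalized: one parent plus one per child. */
    static int relationalRowCount(List<String> imageUrls,
                                  Map<String, Long> shippingCents) {
        return 1 + imageUrls.size() + shippingCents.size();
    }
}
```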
Also, if search and view are handled outside of MySQL, you have effectively set up an eventual consistency system anyway.
There is nothing wrong with this approach.
In fact, I am currently working on an e-commerce application that uses both SQL and NoSQL. Ours is a Rails application: about 90% of the data is stored in MongoDB, and only transactional and inventory items are stored in MySQL. All transactions are handled in MySQL, and everything else goes to MongoDB.