Is there anything wrong with this approach in MongoDB?

Question

Is there anything wrong with this approach in MongoDB?

1 Answer

SakshiSharma · Answer 1 · 2020-06-03T06:12:05+0000

MongoDB sounds like a fine fit for your problem, but there are plenty of reasons why you might not use it.

MongoDB would not be well suited for applications that need:

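Multi-Object Transactions: Historically, MongoDB only supported ACID transactions for a single document. Multi-document transactions arrived in version 4.0 for replica sets and in 4.2 for sharded clusters, so check which version you will be running.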

SQL: SQL is well known, and a lot of people know how to write very complex queries with it. This knowledge is transferable across many implementations, whereas MongoDB's query language is specific to it.

Strong ACID guarantees: MongoDB allows for things like inconsistent reads, which is fine in some applications but not in all.

Traditional BI: A lot of very powerful tools exist for OLAP and other BI applications, and those run against traditional SQL databases.

Possible downsides:

You work in an organization that has only used SQL relational databases. You have no approval or support for using a NoSQL database yet.

You’ve never administered a MongoDB cluster; there’s a learning curve, as with all technologies.

Your data is really relational (e.g., one User has many Questions; a Question has many Answers), and you've overlooked that fact.
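To make the "really relational" case concrete, here is a minimal sketch in plain Python, with no database or driver involved. The field names (`name`, `title`, `body`) and the helper `flatten` are hypothetical; the only point is that the same User/Question/Answer data can be held as one nested document or as three flat row lists linked by ids.

```python
# One nested document, as it might be stored in a document database.
# All field names here are made up for illustration.
user_doc = {
    "_id": 1,
    "name": "alice",
    "questions": [
        {
            "_id": 10,
            "title": "Is MongoDB right for me?",
            "answers": [
                {"_id": 100, "body": "It depends."},
                {"_id": 101, "body": "Check your access patterns."},
            ],
        },
        {"_id": 11, "title": "How do I model comments?", "answers": []},
    ],
}


def flatten(user):
    """Split a nested user document into three relational-style row lists."""
    users = [{"id": user["_id"], "name": user["name"]}]
    questions, answers = [], []
    for q in user["questions"]:
        questions.append(
            {"id": q["_id"], "user_id": user["_id"], "title": q["title"]}
        )
        for a in q["answers"]:
            answers.append(
                {"id": a["_id"], "question_id": q["_id"], "body": a["body"]}
            )
    return users, questions, answers


users, questions, answers = flatten(user_doc)
print(len(users), len(questions), len(answers))
```

If most of your queries cut across users (say, all answers on a given day), the flat shape is usually the easier one to query. If you nearly always load one user with everything attached, the nested shape may serve you well.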

MongoDB is a fine solution and a good alternative for those situations where it applies. If you can use it, why not?

MongoDB is a brilliant database and I enjoy using it. That said, it has a few gotchas if you come from the world of SQL.

Apart from ACID and other things that are well documented (and in other answers too), these things have caught us by surprise:

1. MongoDB expects you to have memory. Lots of memory. If you can’t fit your working set in memory, you can forget about it. This is different from most relational DBs, which use memory only as a cache. To be more specific: with its original memory-mapped storage engine (MMAPv1), MongoDB treats RAM as primary storage and “swaps” the unneeded parts out to disk, leaving the decision over which parts get “swapped” to the kernel. A traditional RDBMS works the other way around: it uses disk as primary storage and RAM as a caching mechanism. So in general MongoDB uses more RAM. This is not a bad thing by itself, but as a consequence “real” RAM consumption is difficult to predict, which can lead to serious and unexpected degradation of performance once the working set grows over the (hard to predict) limit. Newer releases default to the WiredTiger engine, which manages its own cache, so check which engine you are running.
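A rough way to reason about the working-set limit is simple arithmetic. The sketch below is a back-of-the-envelope check in plain Python, not anything MongoDB provides: the function name `working_set_fits`, the `headroom` parameter, and all the sizes are assumptions chosen for illustration.

```python
def working_set_fits(hot_data_bytes, index_bytes, ram_bytes, headroom=0.2):
    """Return True if hot data plus indexes fit in RAM with headroom left.

    headroom is the fraction of RAM reserved for the OS and other processes;
    0.2 is an arbitrary illustrative default, not a MongoDB recommendation.
    """
    usable = ram_bytes * (1 - headroom)
    return hot_data_bytes + index_bytes <= usable


GIB = 1024 ** 3

# 20 GiB of frequently touched data + 4 GiB of indexes on a 32 GiB machine.
print(working_set_fits(20 * GIB, 4 * GIB, 32 * GIB))

# The same machine after the hot data grows to 26 GiB.
print(working_set_fits(26 * GIB, 4 * GIB, 32 * GIB))
```

The hard part in practice is knowing the "hot" figure at all, which is exactly the unpredictability described above.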

2. Storage does not auto-shrink when you remove records. The space allocated per collection stays allocated until you either repair the database or drop the collection. It is allocated in huge chunks at the database level (data files), which are then handed out to collections as needed (extents). That said, inside a collection’s allocated space, removed documents DO release their space for other documents in the same collection. This is a good explanation of the concepts: https://www.10gen.com/presentations/storage-engine-internals
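The behaviour described in point 2 can be pictured with a toy model. This is only a sketch of the idea (space grows in whole extents, deletes feed a free list, nothing is handed back to the filesystem); the class `ToyCollectionStorage` and its fixed-size slots are invented for illustration and do not mirror MongoDB's actual on-disk format.

```python
class ToyCollectionStorage:
    """Toy model: fixed-size slots, allocated in extents, never shrunk."""

    def __init__(self, extent_slots=4):
        self.extent_slots = extent_slots  # slots added per new extent
        self.allocated_slots = 0          # total slots ever allocated on "disk"
        self.free_slots = []              # slot numbers available for reuse
        self.records = {}                 # slot number -> document

    def insert(self, doc):
        if not self.free_slots:
            # No free space left: grow the allocation by one whole extent.
            start = self.allocated_slots
            self.allocated_slots += self.extent_slots
            self.free_slots.extend(range(start, self.allocated_slots))
        slot = self.free_slots.pop(0)
        self.records[slot] = doc
        return slot

    def remove(self, slot):
        # Space goes back to the collection's free list, not to the filesystem.
        del self.records[slot]
        self.free_slots.append(slot)


store = ToyCollectionStorage(extent_slots=4)
slots = [store.insert({"n": i}) for i in range(8)]  # fills exactly two extents
for s in slots[:5]:
    store.remove(s)

# Five of eight records are gone, but the allocation is unchanged.
print(store.allocated_slots, len(store.records))

# A new insert reuses a freed slot instead of growing the allocation.
reused = store.insert({"n": 99})
print(store.allocated_slots, reused)
```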

3. In contrast to SQL, which is sent as a string and parsed server-side, in Mongo you pass data structures to the query and CRUD functions. The consequence is that each driver provides a slightly different syntax, which is a bit annoying. For instance, where the shell takes a document for a sort specification, PyMongo has traditionally taken a list of tuples (probably because a dict in Python did not historically preserve the order of keys). To be fair, that was probably the only sane way to do it, but it is a consequence of not using a string-based language such as SQL:

a. MongoDB shell: db.test.find().sort({a: 1, b: -1})

b. PyMongo: db.test.find().sort([("a", 1), ("b", -1)])
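To see why a driver might insist on a list of tuples wherever key order matters, here is a sketch in plain Python with no driver involved. The helper `apply_sort` is hypothetical and only mimics a multi-key sort over in-memory dictionaries; the `(key, direction)` pair shape is the one PyMongo's `sort` accepts.

```python
def apply_sort(docs, spec):
    """Sort docs by a list of (key, direction) pairs; direction is 1 or -1.

    Hypothetical helper: it mimics a multi-key sort in memory and is not part
    of any driver. Earlier pairs take priority over later ones.
    """
    result = list(docs)
    # Sort by the least significant key first; Python's sort is stable,
    # so the earlier (more significant) keys end up dominating.
    for key, direction in reversed(spec):
        result.sort(key=lambda d: d[key], reverse=(direction == -1))
    return result


docs = [{"a": 1, "b": 1}, {"a": 2, "b": 2}, {"a": 1, "b": 3}]

# Same two keys, different order: the results differ.
by_a_then_b = apply_sort(docs, [("a", 1), ("b", -1)])
by_b_then_a = apply_sort(docs, [("b", -1), ("a", 1)])
print(by_a_then_b)
print(by_b_then_a)
```

Because swapping the two pairs changes the result, a container that guarantees order is the safe choice for this kind of argument.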

This should not be viewed as a criticism of MongoDB – I enjoy using it and it has proven to be a reliable and performant tool. But to use it properly you need to learn about its space management.
