
We store deeply nested objects in MongoDB, and certain fields in these objects can have different types. For instance, an object can have a field that holds an InterfaceType (or a collection of them), and there are different implementations of that interface with different fields, say ConcreteTypeA and ConcreteTypeB:

package foo.bar

sealed interface InterfaceType
data class ConcreteTypeA(val aField1: String, val aField2: Int) : InterfaceType
data class ConcreteTypeB(val bField: List<String>) : InterfaceType

When using Spring Data MongoDB to persist, say:

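package bar.foo

import foo.bar.InterfaceType
import org.springframework.data.annotation.Id
import org.springframework.data.mongodb.core.mapping.Document

@Document("myCollection")
data class MyDocument(
    @Id val id: String,        // data class constructor parameters must be val or var
    val value: InterfaceType   // polymorphic field: ConcreteTypeA or ConcreteTypeB
)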

Spring will automatically add fields called _class that help to instantiate the correct class on read. See also: Type Mapping :: Spring Data MongoDB

{
  "value" : {
    "aField1" : "some value",
    "aField2" : 1234,
    "_class" : "foo.bar.ConcreteTypeA"
  },
  "_class" : "bar.foo.MyDocument"
}

This all works fine, but of course it is not very refactoring-safe: after renaming a class or moving it to a different package, reading existing documents will fail. We could use @TypeAlias and register the aliased classes in the config, which would allow renaming or moving a class (as long as we don't touch the constant value in the annotation). Alternatively, we could have Document classes with lots of nullable fields and handle everything in the Repository.
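
For reference, the @TypeAlias option could look roughly like the sketch below. The alias strings, the MongoConfig class, and the database name are made up for illustration. The sketch assumes a configuration that extends AbstractMongoClientConfiguration and adds the aliased classes to the initial entity set, so that the aliases can be resolved even before an instance has been saved.

```kotlin
package foo.bar

import org.springframework.context.annotation.Configuration
import org.springframework.data.annotation.TypeAlias
import org.springframework.data.mongodb.config.AbstractMongoClientConfiguration

sealed interface InterfaceType

// The alias is written to _class instead of the fully qualified class name.
@TypeAlias("concreteTypeA") // hypothetical alias
data class ConcreteTypeA(val aField1: String, val aField2: Int) : InterfaceType

@TypeAlias("concreteTypeB") // hypothetical alias
data class ConcreteTypeB(val bField: List<String>) : InterfaceType

@Configuration
class MongoConfig : AbstractMongoClientConfiguration() {
    override fun getDatabaseName(): String = "myDatabase" // assumed name

    // Make the mapping context aware of the aliased types up front.
    override fun getInitialEntitySet(): Set<Class<*>> =
        super.getInitialEntitySet() + setOf<Class<*>>(
            ConcreteTypeA::class.java,
            ConcreteTypeB::class.java,
        )
}
```

With this in place, the nested object would be stored with "_class" : "concreteTypeA", so renaming or moving ConcreteTypeA would no longer affect existing documents.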

What’s the recommended approach to handle this type of data structure in MongoDB/Spring?

Same question exists in MongoDB Community Forum

2 Answers


  1. Chosen as BEST ANSWER

    Thank you AndrewL and qsi for your inputs. We decided to take the @TypeAlias route.

    We also added a test using ArchUnit that ensures that classes that implement an interface and are used within a @Document annotated class are annotated with @TypeAlias.

    It's not highly sophisticated. While it wouldn't be an issue for us to migrate the database and fix the fully qualified class names in _class, we're more afraid of accidentally missing these changes and then running into runtime issues.
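
    A simplified sketch of what such an ArchUnit check might look like (not our actual test): it assumes JUnit 5, hard-codes InterfaceType and the foo.bar package from the question instead of discovering every type used within a @Document class, and the test class name is made up.

    ```kotlin
    import com.tngtech.archunit.core.importer.ClassFileImporter
    import com.tngtech.archunit.lang.syntax.ArchRuleDefinition.classes
    import foo.bar.InterfaceType
    import org.junit.jupiter.api.Test
    import org.springframework.data.annotation.TypeAlias

    class TypeAliasArchTest { // hypothetical test class

        @Test
        fun `implementations of InterfaceType declare a TypeAlias`() {
            // Scan the package that holds the persisted types.
            val importedClasses = ClassFileImporter().importPackages("foo.bar")

            // Every class implementing the interface must carry @TypeAlias;
            // check() throws an AssertionError listing the violations otherwise.
            classes()
                .that().implement(InterfaceType::class.java)
                .should().beAnnotatedWith(TypeAlias::class.java)
                .check(importedClasses)
        }
    }
    ```

    A new implementation added without the annotation then fails the build instead of surfacing later as a read error.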


  2. I use this default _class behaviour with Spring & Mongo.

    In my experience, object repackaging/renaming is not a frequent problem. The much more frequent issue is the addition of new non-nullable fields, or field renaming, removal, or type changes.

    So even if you solve the _class matter, you will still have many other data issues that are vulnerable to refactoring. I see this as unavoidable, so you need to tightly control your domain objects and manage them in your data design (and change) process.

    We have scripts that upgrade (but do not downgrade) a DB each time we make a structural change, and from a quick skim I see that fewer than 3% of them deal with _class changes. These scripts are written by hand; nothing as sophisticated as Liquibase or one of these database schema managers.

    For the case of changing the class location, it is as simple as:

    db.channel.updateMany({"_class" : "old.package.MemberChannel"}, {$set:{"_class" : "new.package.MemberChannel"}})
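
    In the question's layout the type hint of the polymorphic field is nested under value, so the same kind of update would target value._class instead. A minimal sketch using Spring's MongoTemplate, assuming the collection and field names from the question; the function name and the new package foo.baz are hypothetical.

    ```kotlin
    import org.springframework.data.mongodb.core.MongoTemplate
    import org.springframework.data.mongodb.core.query.Criteria
    import org.springframework.data.mongodb.core.query.Query
    import org.springframework.data.mongodb.core.query.Update

    // Rewrites the nested type hint in "myCollection" and returns the number
    // of modified documents. "foo.baz" is a made-up target package.
    fun migrateNestedClassName(mongoTemplate: MongoTemplate): Long {
        val query = Query(Criteria.where("value._class").`is`("foo.bar.ConcreteTypeA"))
        val update = Update().set("value._class", "foo.baz.ConcreteTypeA")
        return mongoTemplate.updateMulti(query, update, "myCollection").modifiedCount
    }
    ```

    If the field holds a collection of InterfaceType rather than a single value, each array element carries its own _class, and the update would need an array-aware form instead.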
    