Mitchell Herrijgers

Originally published at blog.codecentric.de

Dealing with personal data in Axon Framework

Welcome to Axon Framework 102, where we will be deep diving into many interesting challenges you might encounter when working with Axon Framework. This blog post will dive into dealing with personal data in your Event Store. Or even better: How NOT to.

About the Axon Framework series

I have been working with Axon Framework for two years now. I wrote my thesis about strangling a remaining monolith at the Port of Rotterdam Authority, and I am currently doing exactly that with Axon Framework. We have seen interesting challenges, such as a great number of events, long replays and privacy regulations. I believe those challenges are relevant to everyone using Axon Framework, or maybe even everyone using Event-Sourcing! This is why I’m sharing these challenges and possible solutions with you, so you can be inspired to build a solution just as good, or even better.

This blog series will require some base experience with Axon Framework. If you have no experience with it yet, it is a very cool framework and you will probably enjoy learning and using it. So get your hands dirty and come back in a bit. This blog series will be waiting for you.

Privacy regulations are a pain

By now we all know about GDPR, right? It’s the privacy regulation of the EU that gives customers certain rights over their personal data. For instance, they have the right to retrieve all data related to them, or to have some or all of it deleted.

This presents us with a dilemma. Let’s consider the following event to be in our event store:

{
  "payloadType":"com.insidion.CustomerPlacedOrderEvent",
  "payload":{
    "customerName":"C. Boyle",
    "items":[
      {
        "sku":"4823023",
        "amount":5
      }
    ]
  }
}

Now C. Boyle calls our company. He wants his data removed, but our event store is immutable. We now have three options:

  • Ignore the request, knowing that it might incur a fine by the authorities.
  • Delete the event (and all possible other data) leaving a gap.
  • Alter the event, masking the value.

That leaves altering the event as the only viable option, if it is possible at all: changing the customer’s name to a masked value such as ***. However, modifying the event store is not a good thing to do, since you are altering the past. Furthermore, the name is only erased from the event store; all projections still hold the value in their database tables.

Luckily for us, we have found a better way to do this. Some people have gone with a crypto-shredding approach and Axon has a commercial data regulation library that takes care of it for you. Personally, I prefer a more extreme approach.

Ignorance is bliss

You can keep your event store in the dark about the personal data in your system, without losing access to it. We can achieve that with a cool Jackson feature: custom serializers. Let’s dive in.

When you publish an event from an aggregate, Axon stores it in the event store. Before an event is stored, a Serializer first converts it into an appropriate format. Serializers also convert events back from that representation when Axon reads them from the store. Events are not the only thing they process: the same goes for snapshots, commands (when using Axon Server), and stored metadata.

Axon offers three serializer implementations: XStream, Jackson, and the Java serializer. XStream is the one enabled by default. You could also write one yourself (for example, to serialize to YAML). However, simply because I like JSON more than I like XML, I have configured Axon to use Jackson with the following Spring Boot config.

axon:
  serializer:
    events: jackson
    general: jackson
    messages: jackson

Now Jackson is in charge! And Jackson has just the feature we need for our cause: custom serializers. You can write your own serializers so that certain Java classes are serialized the way you want them to be. By creating a PersonalData wrapper for a String, we can tell Jackson not to serialize the value of the String, but anything we want instead. You can see the wrapper in the following code.

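data class PersonalData(
    val value: String,
    // Id of the database row holding the value; set by the serializer, -1 for a masked value
    var storedId: Long? = null,
)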

data class UserRealNameChangedEvent(
    val realName: PersonalData?,
)

We can now instruct Jackson to serialize every PersonalData object it encounters, whether in an event, in metadata, or anywhere else, in a different way. In our serializer we will look up the value in a database table, or write it there, and store its id in the JSON instead.
This way personal data never even enters the event store, while we can still access the value whenever we want. It also allows us to delete or mask the personal data without altering the event store in any way. Let’s get the serializer to work:

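import com.fasterxml.jackson.core.JsonGenerator
import com.fasterxml.jackson.databind.SerializerProvider
import com.fasterxml.jackson.databind.jsontype.TypeSerializer
import com.fasterxml.jackson.databind.ser.std.StdSerializer
import org.springframework.stereotype.Component

@Component
class PersonalDataJacksonSerializer(private val store: PersonalDataStore) : StdSerializer<PersonalData>(PersonalData::class.java) {
    // Skip the type information so the output is the same with or without default typing
    override fun serializeWithType(value: PersonalData?, gen: JsonGenerator, serializers: SerializerProvider, typeSer: TypeSerializer) {
        this.serialize(value, gen, serializers)
    }

    override fun serialize(personalData: PersonalData?, gen: JsonGenerator, provider: SerializerProvider) {
        if (personalData == null || personalData.value.isBlank()) {
            gen.writeNull()
            return
        }

        // Ask the store for the id that belongs to this value and remember it on the wrapper
        personalData.storedId = store.retrieveDataId(personalData.value)
        if (personalData.storedId == null) {
            gen.writeNull()
        } else {
            // Only the id ends up in the JSON, never the value itself
            gen.writeObject(SerializedPersonalData(personalData.storedId!!))
        }
    }
}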

data class SerializedPersonalData(
    val id: Long,
)

As you can see, it is pretty simple. Whenever this serializer encounters a PersonalData object, it writes the value to a database using the PersonalDataStore, gets its id, and writes that id to the JSON instead. We can use the same principle to reverse the process and access the data again. This is what our deserializer does:

import com.fasterxml.jackson.core.JsonParser
import com.fasterxml.jackson.databind.DeserializationContext
import com.fasterxml.jackson.databind.deser.std.StdDeserializer
import com.fasterxml.jackson.databind.jsontype.TypeDeserializer
import org.springframework.stereotype.Component

@Component
class PersonalDataJacksonDeserializer(private val store: PersonalDataStore) : StdDeserializer<PersonalData>(PersonalData::class.java) {
    override fun deserializeWithType(p: JsonParser, ctxt: DeserializationContext, typeDeserializer: TypeDeserializer, intoValue: PersonalData) = this.deserialize(p, ctxt)

    override fun deserialize(p: JsonParser, ctxt: DeserializationContext) = try {
        p.readValueAs(SerializedPersonalData::class.java)?.let {
            // Retrieve the actual value using the id found in the serialized (JSON) value
            store.retrieveDataValue(it.id)
        }
    } catch (e: Exception) {
        // Something went wrong: the value does not exist or the lookup failed. Return a masked value
        PersonalData("***", -1)
    }
}

In conclusion, this approach enables you to keep personal data out of your event store while still being able to see the data, delete the data or mask the data. The only thing you have to do is wrap it in a PersonalData class. Great, isn’t it?
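
The PersonalDataStore itself is not shown in this post. As a rough illustration of the idea, here is a minimal in-memory sketch of what such a store could look like. Only the names retrieveDataId and retrieveDataValue come from the code above; the class name InMemoryPersonalDataStore, the erase function, and returning a plain String? instead of a PersonalData are assumptions made to keep the sketch self-contained. A real implementation would be backed by a database table.

```kotlin
// Hypothetical in-memory stand-in for the PersonalDataStore used by the (de)serializer.
// Not thread-safe and not persistent; it only illustrates the id <-> value bookkeeping.
class InMemoryPersonalDataStore {
    private val valuesById = mutableMapOf<Long, String>()
    private val idsByValue = mutableMapOf<String, Long>()
    private var nextId = 1L

    // Returns the id for a value, storing the value first if it is not known yet.
    fun retrieveDataId(value: String): Long =
        idsByValue.getOrPut(value) {
            val id = nextId++
            valuesById[id] = value
            id
        }

    // Returns the stored value, or null once it has been erased.
    fun retrieveDataValue(id: Long): String? = valuesById[id]

    // Assumed helper: erases the value behind an id, e.g. for a deletion request.
    fun erase(id: Long) {
        valuesById.remove(id)?.let { idsByValue.remove(it) }
    }
}

fun main() {
    val store = InMemoryPersonalDataStore()
    val id = store.retrieveDataId("C. Boyle")      // this id is what ends up in the event JSON
    println(store.retrieveDataValue(id))           // the value can still be looked up
    store.erase(id)                                // the customer asks for deletion
    println(store.retrieveDataValue(id) ?: "***")  // the event is untouched, the value is gone
}
```

The point of the sketch is that erasing a value only touches the store: every event that references the id stays exactly as it was written.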

Demo time

Let’s take a look at the following aggregate. The aggregate keeps a user’s real name, wrapped in a PersonalData object. This allows access to the value while preventing it from being stored in the event store or in snapshots. In all other ways, it works the same as a String value would.

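import org.axonframework.commandhandling.CommandHandler
import org.axonframework.eventsourcing.EventSourcingHandler
import org.axonframework.modelling.command.AggregateIdentifier
import org.axonframework.modelling.command.AggregateLifecycle
import org.axonframework.spring.stereotype.Aggregate

@Aggregate
class ProfileAggregate {
    @AggregateIdentifier
    private lateinit var username: String
    private lateinit var realName: PersonalData

    @CommandHandler
    constructor(command: CreateProfileCommand) {
        // Wrap the real name so the serializer keeps it out of the event store
        AggregateLifecycle.apply(ProfileCreatedEvent(command.username, PersonalData(command.realName)))
    }

    @EventSourcingHandler
    fun onEvent(event: ProfileCreatedEvent) {
        this.username = event.username
        this.realName = event.realName
    }

    constructor() {
        // No-arg constructor required by Axon
    }
}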

When we now create an account through the REST-endpoint, the following event is published by the Aggregate.


{
  "username": "axon102",
  "realName": {
    "id": 1
  }
}

As you can see, there is no personal data present in the event. Of course, the personal data is still there, but it is stored in a database table, as you can see below.

Personal data is now kept in a separate table

More code?

I cannot post all the code here, so I selected the important bits and pieces. You can find the full source code of the demo application here: https://github.com/Morlack/axon-102/tree/main/personal-data/src/main/java/com/insidion/axon102
Take a look and try it out for yourself!

Caveats

All power comes at a price. Each time events containing personal data are read, it is necessary to consult a database table. That makes the application a little slower when reading and writing events, but since the lookup is very fast, I think the impact is negligible unless you have huge amounts of personal data in your events.

The serializer approach presented here only covers events written after it is in place, which makes it the easiest fit for new projects. Projects that already have personal data in their event store are at a disadvantage: you either have to get that data out first, or accept that only new data is written to the database table. Still, it is never too late to implement it and write an upcaster to take advantage of it!
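
To give an idea of what such a migration has to do per event, here is a minimal sketch that uses plain maps instead of Axon's upcaster API. The function name externalizeField, the store lambda, and the map-based payload are all assumptions for illustration; in a real project this logic would live in an upcaster or a one-off migration that works on the serialized event payload.

```kotlin
// Replaces a raw personal-data field in an old event payload with a reference object,
// mirroring the {"id": ...} shape that the custom serializer writes for new events.
// `store` is a hypothetical function that saves the value and returns its id.
fun externalizeField(
    payload: Map<String, Any?>,
    field: String,
    store: (String) -> Long,
): Map<String, Any?> {
    // Leave the payload alone if the field is absent, null, or already a reference
    val raw = payload[field] as? String ?: return payload
    return payload + (field to mapOf("id" to store(raw)))
}

fun main() {
    // Toy store: remembers the values and hands out 1-based ids
    val saved = mutableListOf<String>()
    val store: (String) -> Long = { value -> saved.add(value); saved.size.toLong() }

    val oldPayload = mapOf("username" to "axon102", "realName" to "C. Boyle")
    val migrated = externalizeField(oldPayload, "realName", store)

    println(migrated) // the real name is replaced by a reference to the store
}
```

Keep in mind that an upcaster only changes what the application reads. The raw value remains in the stored event until that event itself is rewritten.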

Conclusion

You can use Jackson to your advantage to keep personal data out of your event store. This saves you the hassle of (illegally) editing your event store or deleting events, if either is even possible. The next Axon Framework 102 blog will focus on using metadata in your aggregate and projections in an efficient manner. Stay tuned!
