The Single View Challenges: 2 Data Standardisation

 

11 May 2021 | Emma Benham

What is your biggest challenge when seeking to achieve Single View? The idea of pulling data to one location for the purpose of reuse is powerful, and central to a Single View Modernization strategy.

However, what appears simple often is not, because systems represent data in different ways. For example, a client’s information may be stored in several systems, each with its own schema and a varying level of detail. There may also be duplicates: multiple systems holding similar data in different formats and at different granularity.

Another challenge is entity references: a client’s ID may vary from system to system. The same may be true for product, service, lookup, and type identifiers. When moving data from its local location to a central one, you need to understand who is who and what is what across those systems.

So, what are the possible solutions to this Single View challenge?

There are a number of options to explore when considering data standardisation; the right one, or the right combination, depends on individual business objectives and strategy.

The first option we consider is the creation of a schema-less store in which to land all data. This means the challenge of generalising the data (shaping it into a canonical model) can be deferred to a post-processing step.

In this approach, keeping the original data alongside its generalised form allows new canonical views of older data to be built retrospectively.
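To make the idea concrete, here is a minimal sketch of landing raw records and deriving canonical views afterwards. It uses an in-memory list as a stand-in for a schema-less store, and the source names ("crm", "billing"), field names, and view names are all hypothetical, chosen only for illustration.

```python
from typing import Any, Callable

# In-memory stand-in for a schema-less landing store (an assumption for this sketch).
# Each record keeps the untouched source payload next to any derived canonical views.
landing_store: list[dict[str, Any]] = []


def land(source: str, payload: dict[str, Any]) -> dict[str, Any]:
    """Land a record exactly as the source system supplied it."""
    record = {"source": source, "raw": dict(payload), "canonical": {}}
    landing_store.append(record)
    return record


def build_view(view_name: str, mapper: Callable[[str, dict[str, Any]], dict[str, Any]]) -> None:
    """Derive (or re-derive) a canonical view for every landed record."""
    for record in landing_store:
        record["canonical"][view_name] = mapper(record["source"], record["raw"])


def client_name_view(source: str, raw: dict[str, Any]) -> dict[str, Any]:
    """Hypothetical mapping of two source schemas onto one canonical shape."""
    if source == "crm":
        return {"full_name": f"{raw['first_name']} {raw['last_name']}"}
    if source == "billing":
        return {"full_name": raw["customer_name"].title()}
    raise ValueError(f"no mapping for source {source!r}")


def contact_view(source: str, raw: dict[str, Any]) -> dict[str, Any]:
    """A view defined later, built from the same preserved raw data."""
    return {"email": raw.get("email")}


land("crm", {"first_name": "Jane", "last_name": "Doe", "email": "jane@example.com"})
land("billing", {"customer_name": "JANE DOE", "acct": "B-77"})

build_view("client_name_v1", client_name_view)
build_view("client_contact_v1", contact_view)  # added retrospectively
```

Because the raw payload is never overwritten, the second view can be introduced long after the data was landed without going back to the source systems.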

The second option requires more explanation: an event-driven architecture, which complements a microservice design approach. In an event-driven architecture, data is in many ways a by-product of activity: a customer signed up, a purchase was made, an address was changed, and so on. As data is associated with each activity, a full view is built up.

Event-driven architectures move data based on activity. An event is raised when something meaningful happens (a client account is created, for example). The vocabulary of events provides a system-agnostic data model, based on activity, through which systems communicate with one another. Consumers listen for the events they are interested in and update their own state (data) based on the change.
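The publish and consume flow described above can be sketched with a small in-memory event bus. This is only an illustration under stated assumptions: a real deployment would normally use a message broker, and the event names (ClientAccountCreated, ClientAddressChanged) and the consumer are hypothetical.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Any, Callable


@dataclass(frozen=True)
class Event:
    """A system-agnostic description of something that happened."""
    type: str
    data: dict[str, Any]


class EventBus:
    """Minimal in-process publish/subscribe (stand-in for a real broker)."""

    def __init__(self) -> None:
        self._handlers: dict[str, list[Callable[[Event], None]]] = defaultdict(list)

    def subscribe(self, event_type: str, handler: Callable[[Event], None]) -> None:
        self._handlers[event_type].append(handler)

    def publish(self, event: Event) -> None:
        # Only consumers that registered interest in this event type are called.
        for handler in self._handlers.get(event.type, []):
            handler(event)


# A hypothetical consumer that keeps its own state: the latest address per client.
addresses: dict[str, str] = {}


def on_address_changed(event: Event) -> None:
    addresses[event.data["client_id"]] = event.data["address"]


bus = EventBus()
bus.subscribe("ClientAddressChanged", on_address_changed)

bus.publish(Event("ClientAccountCreated", {"client_id": "c-1"}))  # nobody listening
bus.publish(Event("ClientAddressChanged", {"client_id": "c-1", "address": "1 High Street"}))
```

The publisher knows nothing about the consumer's storage; it only states what happened, which is what keeps the model independent of any one system.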

Approaches such as digital decoupling allow legacy systems to take part in the event conversation in a non-intrusive way. The event-driven approach is more flexible and powerful but carries additional foundational complexity. The entity reference problem, when separated and isolated as an individual concern, becomes easier to solve: a simple mapping capability can hold the identifier maps between systems.
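As a rough sketch of such a mapping capability, the class below links each system's local identifier to a shared canonical identifier and translates between systems. The class name, system names, and identifier formats are hypothetical; a production service would also need persistence and conflict handling, which are omitted here.

```python
from typing import Optional


class IdentifierMap:
    """Holds identifier maps between systems via a shared canonical ID."""

    def __init__(self) -> None:
        self._to_canonical: dict[tuple[str, str], str] = {}  # (system, local_id) -> canonical_id
        self._to_local: dict[tuple[str, str], str] = {}      # (canonical_id, system) -> local_id

    def link(self, canonical_id: str, system: str, local_id: str) -> None:
        """Record that `local_id` in `system` refers to `canonical_id`."""
        self._to_canonical[(system, local_id)] = canonical_id
        self._to_local[(canonical_id, system)] = local_id

    def canonical(self, system: str, local_id: str) -> Optional[str]:
        return self._to_canonical.get((system, local_id))

    def translate(self, from_system: str, local_id: str, to_system: str) -> Optional[str]:
        """Answer "who is this client in the other system?", or None if unknown."""
        canonical_id = self.canonical(from_system, local_id)
        if canonical_id is None:
            return None
        return self._to_local.get((canonical_id, to_system))


id_map = IdentifierMap()
id_map.link("client-001", "crm", "CRM-9931")
id_map.link("client-001", "billing", "B-77")

billing_id = id_map.translate("crm", "CRM-9931", "billing")
```

Routing every translation through the canonical ID means each new system needs only one link per entity, rather than a pairwise map to every other system.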

In addition, a taxonomy that allows data types to be extended and relationships between entities to be modelled at a canonical level brings a great deal of flexibility to the granularity and logical mapping problem. Model mapping and taxonomies as microservices, coupled with an event-driven architecture, are particularly effective. However, consideration must be given to the size and access patterns of these services: data storage technology and caching are important.
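One simple way to picture such a taxonomy is as a set of types, each pointing to its parent. The sketch below is an assumption about how this might look, not a prescribed design, and the type names are invented for illustration. It shows how systems working at different granularity can still agree on a common ancestor type.

```python
from typing import Optional

# Hypothetical taxonomy: each canonical type points to its parent (None for the root).
taxonomy: dict[str, Optional[str]] = {
    "Party": None,
    "Client": "Party",
    "RetailClient": "Client",
    "CorporateClient": "Client",
}


def add_type(name: str, parent: str) -> None:
    """Extend the taxonomy with a new, more specific type."""
    if parent not in taxonomy:
        raise KeyError(f"unknown parent type {parent!r}")
    taxonomy[name] = parent


def is_a(type_name: str, ancestor: str) -> bool:
    """True if `type_name` is `ancestor` or descends from it."""
    current: Optional[str] = type_name
    while current is not None:
        if current == ancestor:
            return True
        current = taxonomy[current]
    return False


# One system only knows "Client"; another distinguishes retail from corporate.
# Both kinds of record can still be treated as clients at the canonical level.
add_type("TrustClient", "Client")
```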

An additional note on event-driven architectures: storing events in their original sequence (the time they were raised) can provide very useful insights. Knowing what happened when, and having that data available to query, opens up future analytics, particularly predictive analytics. Events are immutable, meaning they are factual and should only ever be created, not updated or deleted. MongoDB is a good fit for storing this type of information, given that it is schema-less and can hold vast numbers of entries.
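The append-only principle can be illustrated with a small in-memory event log. This is a stand-in for a real store such as MongoDB rather than an example of its API, and the class names, event types, and timestamps are hypothetical.

```python
from dataclasses import dataclass, FrozenInstanceError
from datetime import datetime, timezone
from typing import Any


@dataclass(frozen=True)
class StoredEvent:
    """An immutable fact: attributes cannot be reassigned after creation."""
    sequence: int
    raised_at: datetime
    type: str
    data: dict[str, Any]


class EventLog:
    """Append-only log: events can be added and read, never updated or deleted."""

    def __init__(self) -> None:
        self._events: list[StoredEvent] = []

    def append(self, type: str, data: dict[str, Any], raised_at: datetime) -> StoredEvent:
        event = StoredEvent(len(self._events) + 1, raised_at, type, dict(data))
        self._events.append(event)
        return event

    def between(self, start: datetime, end: datetime) -> list[StoredEvent]:
        """Events raised in [start, end), in the order they were appended."""
        return [e for e in self._events if start <= e.raised_at < end]


def at(day: int) -> datetime:
    """Helper for the example: a timestamp on a given day of May 2021 (UTC)."""
    return datetime(2021, 5, day, tzinfo=timezone.utc)


log = EventLog()
log.append("ClientAccountCreated", {"client_id": "c-1"}, at(1))
log.append("PurchaseMade", {"client_id": "c-1", "amount": 40}, at(3))
log.append("ClientAddressChanged", {"client_id": "c-1"}, at(9))

first_week = [e.type for e in log.between(at(1), at(8))]
```

Because the log exposes no update or delete operation, the history stays intact for later analysis; corrections would be recorded as further events rather than edits.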