By Kristian Di Gaetano, VP of Data & Analytics
As many readers likely know, data lakes act as archives that store vast amounts of structured and unstructured data (information) at any scale. This flexibility enables users to leverage the information to provide useful insights for a variety of business needs and requirements.
However, a data lake becomes less valuable when the environment isn’t properly governed or managed. Poor interaction rules, technological choices, and inadequate approaches to data communication and integration contribute to this decline.
Central to understanding a proper information governance and management program is an introduction to the key roles enabling a data lake’s functionality – the producer, the consumer, and the preparer.
The Producer
The producer is responsible for providing a complete description of what is entering the data lake. This information should, ultimately, provide the consumer with what they require. However, any information required of the producer by the consumer must be provided to the preparer.
The Consumer
The consumer is responsible for providing a complete description of what information is being asked of the data lake. The description must be precise and to the point so that it accurately directs the producers to provide what’s needed by the consumer. Any information required of the producer by the consumer must be provided to the preparer.
The Preparer
The preparer is the middleman that communicates with both the producer and consumer. The role of the preparer is key to the functionality and management of a data lake, as they guide how information is organized for better use by the consumer.
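As a rough illustration of how these three roles might interact, here is a minimal Python sketch. Everything in it is hypothetical and not drawn from any specific product: the class names (ProducerDescription, ConsumerRequest, Preparer), the dataset names, and the field names are all assumptions. It shows one possible arrangement in which the producer and the consumer each hand a description to the preparer, and the preparer reports which requested fields are covered and which are missing.

```python
from dataclasses import dataclass, field


@dataclass
class ProducerDescription:
    """What a producer says is entering the lake (hypothetical shape)."""
    dataset: str
    fields: set


@dataclass
class ConsumerRequest:
    """What a consumer asks of the lake (hypothetical shape)."""
    purpose: str
    required_fields: set


@dataclass
class Preparer:
    """Collects descriptions from producers and matches consumer requests to them."""
    catalog: list = field(default_factory=list)

    def register(self, description: ProducerDescription) -> None:
        # Producers hand their descriptions to the preparer, not to consumers.
        self.catalog.append(description)

    def match(self, request: ConsumerRequest) -> dict:
        # Map each requested field to the first registered dataset that offers it.
        covered = {}
        for description in self.catalog:
            for name in sorted(description.fields & request.required_fields):
                covered.setdefault(name, description.dataset)
        # Anything left over is a gap to raise with producers.
        missing = sorted(request.required_fields - set(covered))
        return {"covered": covered, "missing": missing}


preparer = Preparer()
preparer.register(ProducerDescription("orders", {"order_id", "customer_id", "total"}))
preparer.register(ProducerDescription("customers", {"customer_id", "region"}))

request = ConsumerRequest("revenue by region", {"total", "region", "discount"})
result = preparer.match(request)
print(result)
```

In this sketch the "missing" list is the useful output: it tells the preparer what to ask producers for, which is the communication gap the role is meant to close.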
To better understand the relationship between these three parties, let’s look at an example outside of a technical data lake – online shopping and order fulfilment via Amazon. In this case, the producer provides all of the products consumers shop for, the consumer accesses the store to find the items provided by producers, and the preparer is Amazon, bridging the communication and interaction gap between producer and consumer.
Information governance is what maintains the relationship between the producer, consumer, and preparer. It ensures that all three parties follow the rules and processes for data lake utilization. It also provides standards and templates whose use enables effective, efficient, and consistent interaction among the parties.
Information governance and proper management are, as delineated here, key for ensuring the value of a data lake. When this is run correctly by the preparer, a data lake becomes more valuable for both the producer and consumer, as they are able to give and get exactly what’s required.
Curious to learn more about how Paradigm can enable your organization with business insights via a well-governed and well-managed data lake? Send us a note today at [email protected].