Comment: How to cross the data chasm

From artificial intelligence (AI) to blockchain, business leaders are seeking new ways to get ahead of their competitors, says Toby Mills, founder and CEO of Entopy.

Digital transformation is a hot topic in the world of business. Embracing digital has become vital for companies keen to gain and maintain competitive advantage – so it's no surprise that many new technologies are coming to the fore. From artificial intelligence (AI) to blockchain, business leaders are seeking new ways to get ahead of their competitors.

The supply chain industry is a good example. Given the complexity of modern supply chains, the number of influencing factors and the number of separate organisations involved, next-generation technologies are extremely attractive. Blockchain's ability to give many stakeholders access to trusted data across a supply chain network, and AI's potential to automate processes and predict outcomes, could deliver profound advances.

Analysts agree. Over the next five years, blockchain is predicted to have a compound annual growth rate (CAGR) of 51.3 per cent in the supply chain market – and AI 41.5 per cent. But harnessing these next-generation technologies is not straightforward.

Technologies such as blockchain and AI need to be fed data. There is no shortage of data, of course. But there is a gap – in fact, a huge chasm – between the amount of data being generated and the ability of businesses to harness and use it. Blockchain and AI cannot bridge this gap because, quite simply, it’s not what they are designed to do.

There are two central parts to the challenge: access to the data needed and the ability to use that data.

Accessing data in the supply chain is very challenging. Relevant data could reside in a separate business unit or an external organisation. Without the whole picture, it’s difficult to maximise value. There have been several attempts to address this challenge. Data lakes, for example, enable huge amounts of data to be centralised – meaning stakeholders can connect to the data lake and access the data they require. However, this approach can cause data privacy issues between organisations.

Some blockchain vendors are offering an enhancement whereby participants can upload data and use permission-based rules to control which participants can see their data. However, as there is so much data, companies struggle to know and manage which data to share with which organisation.

But the bigger issue with the mass centralisation approach is the range of data types in play – it’s for this reason that most data lake efforts focus on a particular type of data. For this challenge to be overcome, the data must first be structured to allow many different data types to be brought together in a coherent way. For example, all data regarding a consignment must be accessible as a single data object or data product. This data product would include a range of data from several systems, with an ontology that allows the data to fit nicely together. Access to structured data products, as opposed to many streams of different data types, enables users to access and analyse all relevant data simply and quickly – allowing more complex queries to be run and much greater insights to be generated.
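To make the idea of a data product concrete, here is a minimal sketch in Python. It assumes a very small, hypothetical ontology that maps each source system's field names onto one shared vocabulary; the system names (`erp`, `telematics`, `customs`), the field names and the `ConsignmentProduct` class are illustrative assumptions, not a description of any particular vendor's platform.

```python
from dataclasses import dataclass, field
from typing import Any

# Hypothetical ontology: for each source system, map its own field names
# onto a shared vocabulary. All names here are illustrative assumptions.
ONTOLOGY = {
    "erp": {"order_no": "order_id", "cust": "customer"},
    "telematics": {"lat": "latitude", "lon": "longitude", "temp_c": "temperature_c"},
    "customs": {"clearance": "customs_status"},
}


@dataclass
class ConsignmentProduct:
    """A single data object holding everything known about one consignment."""

    consignment_id: str
    attributes: dict[str, Any] = field(default_factory=dict)
    sources: dict[str, str] = field(default_factory=dict)  # attribute -> origin system

    def ingest(self, system: str, record: dict[str, Any]) -> None:
        """Translate a source record into the shared vocabulary and merge it."""
        mapping = ONTOLOGY[system]
        for source_field, value in record.items():
            if source_field not in mapping:
                continue  # fields outside the ontology are not pulled in
            name = mapping[source_field]
            self.attributes[name] = value
            self.sources[name] = system


product = ConsignmentProduct("CON-001")
product.ingest("erp", {"order_no": "PO-7781", "cust": "Acme Ltd", "internal_margin": 0.2})
product.ingest("telematics", {"lat": 51.5, "lon": -0.12, "temp_c": 4.1})
product.ingest("customs", {"clearance": "cleared"})
```

After the three calls, `product.attributes` holds order, location, temperature and customs data under one consistent set of names, while the unmapped `internal_margin` field never leaves its source system. A real ontology would be far richer; the point is only that the mapping, rather than the raw data type, defines what fits together.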

Digital twin technology is another emerging innovation, with a projected CAGR of 60.6 per cent over the next five years. It enables real-time models of real-world objects to be created in the digital world and is traditionally associated with AI – creating a foundation from which simulations can be run.

However, the concept can be used in a more abstract way to cross today's 'data chasm'. Continuing the supply chain example, each real-world 'thing' would have its own, unique digital twin, accessible by all participants in a supply chain via a central platform. The digital twins would comprise all relevant data for the thing, dynamically updating in real time with new data as the thing progresses through its lifecycle. Digital twin technology offers a way to combine data from many disparate sources to deliver a single data product which can be accessed, on a permission basis, by stakeholders across the supply chain network – enabling them to access all data simply and effectively.
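The permission-based access described above might be sketched as follows. This is an illustration under stated assumptions: the `DigitalTwin` class, the participant names and the attribute names are all hypothetical, and a real platform would need authentication, auditing and far finer-grained rules.

```python
from dataclasses import dataclass, field
from typing import Any


@dataclass
class DigitalTwin:
    """Hypothetical twin of one real-world thing, shared across participants."""

    thing_id: str
    state: dict[str, Any] = field(default_factory=dict)
    # participant name -> set of attribute names that participant may read
    permissions: dict[str, set[str]] = field(default_factory=dict)

    def grant(self, participant: str, *attributes: str) -> None:
        """Allow a participant to read the named attributes."""
        self.permissions.setdefault(participant, set()).update(attributes)

    def update(self, **changes: Any) -> None:
        """Merge new data into the twin as the thing moves through its lifecycle."""
        self.state.update(changes)

    def view(self, participant: str) -> dict[str, Any]:
        """Return only the attributes this participant is permitted to see."""
        allowed = self.permissions.get(participant, set())
        return {k: v for k, v in self.state.items() if k in allowed}


twin = DigitalTwin("CON-001")
twin.grant("haulier", "location", "eta")
twin.grant("retailer", "eta", "temperature_c")

twin.update(location="Dover", eta="14:00", temperature_c=4.1)
twin.update(location="Calais")  # a later update overwrites the earlier location
```

Here the haulier sees location and ETA, the retailer sees ETA and temperature, and a participant with no grants sees nothing, even though all three are reading the same underlying twin.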

Unlike a record in a traditional object-oriented database, the digital twin is temporary: it exists only for the duration of the associated thing's lifecycle, after which it disappears from the central platform. The lifecycle record of the associated thing could, of course, easily be warehoused separately. But because the twin itself is only temporary on the central platform, this approach can be delivered relatively cost-effectively and at scale.
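The temporary nature of the twin can be illustrated with a short sketch. The `TwinPlatform` class and its method names are assumptions made for this example, and the `archive` list merely stands in for whatever separate warehouse an organisation might use.

```python
from typing import Any


class TwinPlatform:
    """Hypothetical central platform that holds twins only while they are live."""

    def __init__(self) -> None:
        self.active: dict[str, dict[str, Any]] = {}
        self.archive: list[dict[str, Any]] = []  # stand-in for a separate warehouse

    def open(self, thing_id: str) -> None:
        """Create a twin when the thing's lifecycle begins."""
        self.active[thing_id] = {"thing_id": thing_id, "events": []}

    def record(self, thing_id: str, source: str, payload: dict[str, Any]) -> None:
        """Append data from a connected system to the live twin."""
        self.active[thing_id]["events"].append({"source": source, **payload})

    def close(self, thing_id: str) -> None:
        """End the lifecycle: archive the record and remove the live twin."""
        twin = self.active.pop(thing_id)
        self.archive.append(twin)


platform = TwinPlatform()
platform.open("CON-001")
platform.record("CON-001", "telematics", {"location": "Dover"})
platform.record("CON-001", "customs", {"status": "cleared"})
platform.close("CON-001")
```

Once `close` has run, the platform no longer carries the twin, but its full lifecycle record survives in the archive. The live store therefore grows with the number of things in flight rather than with the number ever handled.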

The technology to make this possible is intelligent data orchestration. Using the central digital twin concept, only relevant data is sourced from connected systems – with little or no effort required from the respective domains. The automated technology brings the disparate data together, structuring it to form a complete data product and dissolving it at the end of its operational lifecycle – providing a data foundation to unlock the next generation of digital transformation.

Toby Mills, founder and CEO of Entopy