Data as a living asset: Rethinking legacy data



In the days of pen and paper, organizations collected, transcribed, and stored data in physical file cabinets. These cabinets retained information for a minimum of five years to comply with potential audit requirements. With advancements in digital data capture, organizations use forms software, spreadsheets, or sensors to gather information. However, many have not updated their processes significantly, limiting progress in effective data management. Now, the physical file cabinet has been replaced by outdated spreadsheets or an aging Access database. Organizations often accumulate large amounts of legacy data, justifying costs based on operational requirements. Due to the complexity of data management, organizations frequently re-collect the same data unnecessarily. This redundant collection results in added expenses without delivering any real benefits to the organization. Once entered into corporate systems and used initially, much of this data becomes forgotten and unused.
That data is the living, institutional knowledge you've invested in over years or decades of doing business, and it is full of latent value.
But there are a number of challenges that stand in the way when trying to make use of historical data:
If you consider these issues up front as you're designing a data collection workflow, you'll make life much simpler down the road when your future colleagues are trying to leverage historical data assets.
Let's dive deeper into each of these issues.
I call this the "Lotus 1-2-3" problem, which happens whenever data is stored in a format that dies off and loses tool compatibility. Imagine the staggering amount of historical corporate data locked up in formats that no one can open anymore. This is one area where paper can be an advantage: if stored properly, you can always open the file.
Of course, there's no way to know the future potential of a data format on the day you select it. We don't have the luxury of that kind of hindsight. I'm sure no one would've selected Lotus's .123 format back in '93 had they known that Excel would come to dominate the world of spreadsheets. Look for well-supported open standards like CSV or JSON for long-term archival. Another good practice is to revisit your data archives every few years as general "hygiene". Are your old files still usable? The faster you can convert dead formats into something more future-proof, the better.
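As a rough illustration of that hygiene pass, here is a minimal Python sketch that walks an archive folder, counts files by extension, and lists anything that isn't in an open format. The `audit_archive` function and the `OPEN_FORMATS` allowlist are my own assumptions for the example, not a standard tool, and a real audit would also need to check that the files actually open.

```python
from collections import Counter
from pathlib import Path

# Assumption: extensions we treat as open, long-lived formats.
# Adjust this allowlist to match your own archival policy.
OPEN_FORMATS = {".csv", ".json", ".txt"}


def audit_archive(root):
    """Return (file counts by extension, files not in an open format)."""
    counts = Counter()
    needs_review = []
    for path in sorted(Path(root).rglob("*")):
        if not path.is_file():
            continue
        ext = path.suffix.lower()  # normalize so ".CSV" and ".csv" match
        counts[ext] += 1
        if ext not in OPEN_FORMATS:
            needs_review.append(path)
    return counts, needs_review
```

Calling `audit_archive("path/to/archive")` returns the counts and a list of candidates for conversion. It only inspects file extensions, so treat the result as a starting point for a manual review rather than a verdict.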
This is one of the most important issues when it comes to using archives of historical data. Presuming a user can open files of 10-year-old data because you've stored it effectively in open formats, is the data somewhere that staff can get to it? Is it published in a shared workspace for easy access? Most often data isn't squirreled away in a hard-to-reach place intentionally. It's often done for the sake of organization, cleanliness, or savings on storage.
Anyone who works frequently with data has heard of "data silos", which arise when data is holed up in a place where it doesn't get shared and is only accessible to individual departments or groups. Avoiding this issue can also involve internal corporate policy shifts or revisiting your data security policies. In the larger organizations I've worked in, however, the tendency is toward over-securing data to the point of uselessness. In some cases it might as well be deleted, since it's effectively invisible to the entire company. This is a mistake and a waste of large past investments in collecting that data in the first place.
Look for publishing tools that make your data easy to get to without sacrificing controls over access and security. But resist the urge to continuously wall off past data from your team.
Now, assuming your data is in a useful format and it's easily accessible, you're almost there. When working with years of historical records it can be difficult to extract the valuable bits of information, but that's often because the first two challenges (compatibility and accessibility) have already been standing in your way. If your data collection process is built around your data as an evergreen asset rather than a single-purpose resource, it becomes much easier to think of areas where a dataset could be useful five or six years down the road.
For instance, if your data collection process includes documenting inspections with thorough before-and-after photographs, those could be indispensable in the event of a dispute or an issue that surfaces years later. With ease of access and an open format, it could take two clicks to resolve a potentially thorny issue with a past client. That is, if you've planned your process around your data becoming a valuable corporate resource.
I'm currently working with a construction company on re-roofing my house, and they've been in business for 50+ years. Over that time span, they've performed site visits and accurately measured so many roofs in the area that when they get calls for quotes, they can often pull a file from 35 years ago when they went out and measured a property. That simple case is an excellent example of realizing latent value in a prior investment in data: if they didn't organize, archive, and store that information effectively, they'd be redoing field visits every week. Though most of their process isn't digital, they've nailed a workflow that works for them. They use formats that work, make that data accessible to their people, and know exactly what information they'll find useful over the long term.
Data has value beyond its immediate use case, but you have to consider this up front. Design sustainable workflows that allow you to continuously update data and make use of archival data over time. You've spent a lot to create it, so you should be leveraging it to its fullest extent.