One of our recent commercial production jobs, a single project, produced upwards of 60GB of data. A simple web project can range from 1.5 to 3GB. As CPU and RAM speed and capability expand and hard drive space increases, software developers create software to take advantage of those specs. You fill the room you have, right? Greater processing power allows designers to create more complex work. Screens are hungry for content, and screen resolution is a market differentiator. For a design company, the byproduct of increased power and pixel resolution is increased data storage needs. And we’re not alone. It turns out the whole world is awash in a rising sea of data. The International Data Corporation estimates that the amount of digital data being stored is more than doubling every 24 months and could grow by 50 times by 2020.
Let’s look at the challenge here. A project starts with brainstorming and asset collection, but even after the client takes delivery of the files, it never truly ends. Once a project is complete, it enters the storage phase of its existence.
Project Data Storage Life Cycle
- For the first year a project needs to be highly accessible. It may be revised, alternate versions may need to be created, artwork may need to be pulled and provided, and it may need to be referenced for new work.
- Then the demands on the data slow, although they don’t entirely go away. The project still needs to be available for reference and for pulling assets (“do you still have that logo from 3 years ago?”).
- It then enters the archive for long-term storage. For the next 4+ years assets may still be pulled, but eventually logos will be revised and stock will go out of style.
- From there it is a historical archive. The project still exists if ever needed, but is generally not accessed.
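The lifecycle above maps naturally onto storage tiers. A minimal sketch of that mapping; the tier names and the 3- and 7-year thresholds are illustrative assumptions, not our actual system:

```python
def storage_tier(age_in_years: float) -> str:
    """Map a project's age to a storage tier, following the
    lifecycle described above. Tier names and thresholds beyond
    the first year are illustrative."""
    if age_in_years < 1:
        return "active"      # highly accessible, may be revised
    elif age_in_years < 3:
        return "reference"   # still available for asset pulls
    elif age_in_years < 7:
        return "archive"     # long-term storage, occasional pulls
    else:
        return "historical"  # retained but generally not accessed
```

A policy like this makes it mechanical to decide where any given project belongs at backup time.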
The storage evolution
When Paragon first started back in 2001, we were fresh out of art school and our file management would have made Picasso proud. Our backup strategy consisted of burning CDs, and we created about 30 or so of those, I’m sure in some haphazard fashion. Still, having backups gave great comfort. That was about 19.5GB worth of data.
From there we moved to DVDs; we had about 90 of those in total, another 423GB of data.
As we continued to grow and produce more work, along with its associated digital byproducts, we moved to hard drives. We had made the big time, folks!
From there we built a network with a centralized server for localized project work and backups, as everyone was in the same location. This worked for a number of years, and in very quick succession we filled about 5 hard drives with 10TB of data, along with frequent onsite and offsite server backups.
Hard drive juggling became a constant chore, so when the server began to falter for long-term storage and external drives became impractical, we moved to NAS (network-attached storage: a file-level data storage server connected to a computer network). The NAS was used for long-term backup, and the onsite server was used as the localized workspace.
As Paragon continued to grow, our team began to relocate. With this relocation a centralized server workspace would no longer work for us, so we moved to a cloud workspace that synced centralized storage to multiple locations. There were other issues to consider too: internet speed varied among locations, localized server syncing was slow due to server OS compatibility issues, and we continued to face constant hard drive replacements in the NAS.
At this point we knew we needed to rethink our data management strategy. For added difficulty, we had to research, strategize, test, and roll out a brand new data management strategy and system, all while in full operation. A daunting task that I liken to changing all the tires on a bus while it continues to drive its route.
How best could we:
- Share data across all locations
- Keep it secure
- Back it up when it was ready to archive
- Access it when needed
- Move data to long-term storage when it was no longer frequently needed
For location-based data sharing we selected Dropbox for Business. This allowed us to share a workspace everyone could access, with multi-factor security and top-level administration. We strengthened our already stringent security policies through Dropbox itself, system-level security, and sharing controls. We also implemented a need-to-know syncing policy that gave users access to only the projects they needed to touch.
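The heart of a need-to-know syncing policy is a simple mapping from users to the projects they are assigned. The enforcement itself happens in Dropbox's admin controls, but the policy can be sketched like this; the user and project names are hypothetical:

```python
# Hypothetical need-to-know access map: each user syncs only the
# projects they are assigned to. Names are illustrative.
access = {
    "designer_a": {"client_x/project_1", "client_y/project_4"},
    "designer_b": {"client_x/project_1"},
}

def can_sync(user: str, project: str) -> bool:
    """True only if the user is explicitly assigned to the project;
    anyone not in the map gets no access by default."""
    return project in access.get(user, set())
```

Defaulting to no access keeps the policy fail-safe: a new user sees nothing until someone deliberately grants a project.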
But wait, there’s more.
It turned out this wasn’t enough. (Of course not!) We had a workspace in place, but we still needed to figure out how we would back up the work. We decided on a project “lift” backup strategy and mindset. By lift, we mean that a project should be able to be lifted out of the Workspace and into storage with all elements of the project intact, and doing so should have zero effect on any other projects within the Workspace. This was a new strategy; what we had until then could be better summarized as extract.
This required a brand new way of thinking about our project folder/file structure and nomenclature. While client segmentation had been implemented for years, shared client assets and projects referencing assets in other projects were still widespread. That was efficient and helped keep project sizes more manageable, but it made project backup messy, requiring projects to be extracted, and the effort was time-consuming.
With our lift-and-backup strategy outlined, we now needed to define long-term storage processes. We decided on Amazon’s S3. This was a bit of a no-brainer, as we had begun using it about a year or so before this restructure. The missing pieces were a fast and effective way to move work from a cloud-based workspace into S3, and a policy outlining what needs to be backed up and when, and what kind of storage is appropriate for a backed-up project.
We started with an internal policy defining when a project was deemed complete. At that point, the project was moved to a special location within our workspace that was then immediately synced to S3 through a third-party service. Once it was synced, it was removed from the workspace.
We then used S3’s bucket lifecycle policies to move work to Glacier after a 3-year period.
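A lifecycle rule of the kind described looks like the following; the rule ID and `archive/` prefix are illustrative assumptions, and 1095 days corresponds to the 3-year period:

```python
# Illustrative S3 lifecycle configuration: transition objects under an
# assumed "archive/" prefix to Glacier after three years (1095 days).
lifecycle_rules = {
    "Rules": [
        {
            "ID": "archive-to-glacier-after-3-years",
            "Filter": {"Prefix": "archive/"},
            "Status": "Enabled",
            "Transitions": [
                {"Days": 1095, "StorageClass": "GLACIER"}
            ],
        }
    ]
}
```

Applied with boto3's `put_bucket_lifecycle_configuration`, a rule like this moves data to colder storage automatically, so no one has to remember to do it.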
Old dogs, new tricks
The final step was implementation. We had a company meeting to review the entire strategy, I held one-on-one meetings with everyone to discuss their specific roles within the strategy, and, over a weekend, we rolled out the new workflow. I made cheat sheets. I had to hold a couple of reminder meetings. It’s not easy to get people, even creative, flexible people, to change. But change is good, and overall there were very few hiccups. Now the work of data management, maintenance, and archiving, along with the stress, has been drastically reduced. And the data is safely stored and easily accessed as needed.
Was it worth it? Absolutely. A solid data storage strategy is vital to any business. Our advice?
- Plan carefully
- Take your time
- Expect a learning curve for your team
In a way, the most interesting part of this journey is how it reflects the changes that have occurred at Paragon over the years. When we first started Paragon we were purely designers. We are now so much more, offering creative, strategic, and conceptual support. And just as our company’s focus has evolved, so has our internal tech strategy. Now it is easy to see how a company like Amazon, known by most of the public as a great place to shop, can naturally become a technology service company to those in the tech industry. It is a natural evolution.