Content storage management
Content storage management (CSM) is an evolution of traditional media archive technology, used by media companies and content owners to store and protect valuable file-based media assets. CSM solutions focus on the active management of content and media assets regardless of format, type and source, and on interfacing proprietary content source and destination devices with any format and type of commodity IT-centric storage technology. These digital media files most often contain video, but may also be still pictures or sound. A CSM system may be directed manually but is more often directed by upper-level systems such as media asset management (MAM), automation, or traffic systems.
Typically, CSM systems are server-based software applications that sit between the media network, which connects the various broadcast or manipulation devices, and the storage network, which connects the nearline and archive storage tiers. The most basic function of CSM is the automated retrieval of high-resolution digital content from a data tape library or a data server, and the delivery of that content to a workstation, playout device or editing device. CSM also performs this process in reverse, moving content back to storage. In a given media operation, CSM may be used to facilitate content manipulation and repurposing, systems interoperability through high and low bit rate content transcoding, and site-to-site content replication for disaster recovery.
CSM solutions comply with the Reference Model for an Open Archival Information System (OAIS), which is fundamental to long-term archiving and content preservation in file-based environments. They are characterized by a set of application-specific features, which can include:
- Interfaces to proprietary media creation and consumption devices regardless of interface and storage topology
- Transcoding and rewrapping technologies to ensure compatibility of content regardless of its source, format, encoding rate, aspect ratio, or container/wrapping standard
- Direct integration with any type and format of storage device, typically categorized as IT-centric storage, allowing limitless or near-limitless storage expansion
- Integration with various network technologies, such as Ethernet and Fibre Channel, and with protocols such as SCSI and TCP/IP
- Full compliance with the well-established OAIS model, allowing reference metadata, media and any other key elements that make up an overall asset to be combined for storage and preservation
- Subjective content analysis for file-based audio/video content entering and exiting the system
- Data integrity and validation checks, such as checksums
- Analytics engines for capturing, measuring and reporting on all internal aspects of the system, including network bandwidth, read/write error rates and data storage profiles
- Features for local and geographically distributed content storage for file-based distribution as well as disaster recovery applications
- Extensibility through incremental addition of features and resources to allow limitless or near limitless expansion of the system
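As an illustration of the data integrity checks listed above, the following minimal sketch compares the checksum of an archived copy against that of its source. The function names and the choice of SHA-256 are assumptions made for illustration; the article does not specify which algorithms or interfaces any particular CSM product uses.

```python
import hashlib


def file_checksum(path, algorithm="sha256", chunk_size=1024 * 1024):
    """Return the hex digest of a file, reading it in chunks.

    Chunked reads keep memory use flat for large media files.
    The SHA-256 default is an assumption, not a CSM requirement.
    """
    digest = hashlib.new(algorithm)
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_copy(source_path, copy_path):
    """Return True when the copy is byte-for-byte identical to the source."""
    return file_checksum(source_path) == file_checksum(copy_path)
```

In practice a checksum would more likely be computed once at ingest and stored with the asset's metadata, so that later copies can be validated without rereading the original.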
History of CSM
The most significant difference between CSM and hierarchical storage management (HSM) solutions is that HSM is built around migration between storage tiers, whereas CSM is built around active storage management. Conceptually, HSM solutions age content to less expensive tiers of storage based on static policies that examine parameters such as least recent access, file size, or specific directories and paths, and they treat each file as a separate and unique entity. CSM solutions also manage tiered storage, but rather than relying on static policies they use living policies that can be assigned dynamically to content entering the system. These policies govern replication, storage persistence and age-based migration, as well as more advanced content-aware processing steps such as transcoding, rewrapping, reformatting and subjective quality analysis. In some cases, the content stored within the CSM system is intentionally different from the content that initially entered the system. For example, content entering a CSM solution in a legacy media format such as Pinnacle MPEG2 may be intentionally transcoded to MPEG2 IMX50 and wrapped in an industry-standard format such as MXF to ensure longer-term compatibility with other systems connected to the CSM solution. These media-centric workflows are not inherently supported by HSM solutions.
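The contrast between a static HSM rule and a content-aware CSM policy can be sketched roughly as follows. All class names, function names and thresholds here are hypothetical and chosen only to mirror the description above; they do not represent the interface of any real HSM or CSM product.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import List, Optional


@dataclass
class StoredFile:
    name: str
    size_bytes: int
    last_accessed: datetime


def hsm_should_migrate(item: StoredFile, now: datetime,
                       max_idle: timedelta = timedelta(days=90),
                       min_size: int = 100 * 1024 * 1024) -> bool:
    """Static HSM-style rule: the same thresholds apply to every file."""
    return (now - item.last_accessed) > max_idle and item.size_bytes >= min_size


@dataclass
class ContentPolicy:
    """Hypothetical per-asset policy assigned when content enters the system."""
    copies: int = 1
    target_format: Optional[str] = None
    wrapper: Optional[str] = None
    quality_check: bool = False


def csm_plan_ingest(source_format: str, policy: ContentPolicy) -> List[str]:
    """List the content-aware steps that the assigned policy implies."""
    steps = []
    if policy.quality_check:
        steps.append("analyze quality")
    if policy.target_format and policy.target_format != source_format:
        steps.append(f"transcode {source_format} to {policy.target_format}")
    if policy.wrapper:
        steps.append(f"rewrap as {policy.wrapper}")
    steps.append(f"store {policy.copies} instance(s)")
    return steps
```

For the legacy-format example above, `csm_plan_ingest("Pinnacle MPEG2", ContentPolicy(copies=2, target_format="MPEG2 IMX50", wrapper="MXF", quality_check=True))` would list a quality analysis, a transcode, a rewrap and a two-instance store, whereas the HSM rule only ever answers whether a file should move to a cheaper tier.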
Because of their content-centric focus, CSM solutions do not rely on legacy HSM traits such as stub files, which must remain on the original file system from which the file was migrated to cheaper tiers of storage. Although stub files are beneficial in environments where HSM is used simply as a “disk extender” for economic reasons, they are a significant limitation in the active file-based content workflows typical of media, entertainment and preservation applications, where CSM solutions are key. Because broadcast devices cannot maintain these HSM-specific stub files on their internal disks, using an HSM system in these applications requires that the media content first be copied from online storage to some other tier of spinning disk before the HSM system takes ownership and migrates the content to other tiers. Because HSM systems cannot interface directly with broadcast and media devices such as encoders, video servers and editing systems, this additional step is usually handled by another software application, such as a media asset management system or an application-specific utility.
CSM solutions extend content control and management from the internals of the content source and destination devices through any number and type of mass storage devices, all managed by configurable, intelligent policies that are similar in concept to information lifecycle policies but enhanced to become “content aware.”
CSM solutions are not meant to age content into less expensive storage as its perceived value diminishes over time. Instead, they play an active part in a fairly symmetrical content lifecycle, in which content stored within the CSM system yesterday is just as likely to be requested for restore as content stored five years ago. All content within a CSM storage infrastructure is equally eligible for restore operations at any point in time, and restore requests often cannot be reliably predicted.
Augmenting overall content storage capacity with less expensive storage technologies is one advantage of CSM solutions, but preservation is also a key driver. For preservation, high-value content can be assigned different information lifecycle management (ILM) policies that govern the number of copies or instances of the content maintained by the system, as well as geographical distribution to other CSM solutions over WAN connections, which provides both distribution and disaster recovery functionality.
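A preservation policy of this kind could be modelled, very roughly, as a count of local instances plus a list of remote sites. The sketch below is illustrative only: `PreservationPolicy`, `plan_placements` and the site names are hypothetical and not taken from any CSM product.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class PreservationPolicy:
    """Hypothetical policy: instances kept locally and sites replicated to."""
    local_copies: int
    remote_sites: Tuple[str, ...] = ()


def plan_placements(asset_id: str, policy: PreservationPolicy,
                    local_site: str = "local") -> List[Tuple[str, str, int]]:
    """Return (asset, site, instance number) for every copy the policy requires."""
    placements = [(asset_id, local_site, n)
                  for n in range(1, policy.local_copies + 1)]
    # One instance per remote site, replicated over the WAN in a real system.
    placements += [(asset_id, site, 1) for site in policy.remote_sites]
    return placements
```

Assigning a richer policy to high-value content then simply produces more placements, for example two local instances plus one at a disaster recovery site.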
Example of CSM in Broadcast
An example of the role of CSM within a broadcast setting is as follows. A movie is ingested into the content owner’s workflow using an encoding device and a control system such as automation or media asset management (MAM). The control system instructs the CSM system to simultaneously analyze the content to ensure acceptable subjective quality, copy the original material into the archive library, make two copies or instances of it for protection, and generate a low-resolution Windows Media proxy version for Web access. An editor uses the MAM system to view the proxy generated by the CSM solution and, from a desktop workstation, selects shots for use in a promo for the movie. The editor then sends an edit decision list to the CSM system, which restores to the editing system only the desired broadcast-quality segments, based on the mark-in and mark-out timecode values defined in the list. The CSM system then transcodes or rewraps the segments as necessary so that the editor can use them in creating the promo. Once the editor completes the promo, the editing system sends it via the CSM system to the on-air video servers for playout to air.
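The partial restore step depends on turning mark-in and mark-out timecodes into a range of frames. A minimal sketch of that arithmetic is shown below, assuming non-drop-frame `HH:MM:SS:FF` timecode at an integer frame rate (25 frames per second by default) and an exclusive mark-out; the function names are hypothetical, and a real system would also need to handle drop-frame timecode and the file's start timecode.

```python
def timecode_to_frames(timecode: str, fps: int = 25) -> int:
    """Convert non-drop-frame HH:MM:SS:FF timecode to an absolute frame count."""
    hours, minutes, seconds, frames = (int(part) for part in timecode.split(":"))
    if frames >= fps:
        raise ValueError(f"frame field {frames} is not valid at {fps} fps")
    return ((hours * 60 + minutes) * 60 + seconds) * fps + frames


def segment_frames(mark_in: str, mark_out: str, fps: int = 25):
    """Return (first frame, frame count) for a segment; mark-out is exclusive."""
    start = timecode_to_frames(mark_in, fps)
    end = timecode_to_frames(mark_out, fps)
    if end <= start:
        raise ValueError("mark-out must be later than mark-in")
    return start, end - start
```

For instance, a segment marked from `00:01:00:00` to `00:01:10:12` at 25 frames per second starts at frame 1500 and runs for 262 frames, so only that portion of the archived file would need to be restored.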