In the Plan phase of the life cycle, you will map out the processes and resources for the entire data life cycle. Begin by outlining the project goals, including desired outputs, outcomes, and impacts, and work backwards to build a data management plan, supporting data policies, and sustainability plans.
The following points will guide development of a well-planned data project:
A backup policy helps manage users’ expectations and provides specific guidance on the “who, what, when, and how” of the data backup and restore process. There are several benefits to documenting your data backup policy:
Keeping this information in one place makes it easier for anyone who needs it to find it. In addition, once a backup policy is in place, anyone new to the project or office can be given the documentation for information and guidance.
In the planning process, researchers should carefully consider what data will be produced in the course of their project.
Consider the following:
When preparing a data management plan, defining the types of data that will be generated helps in planning for short-term organization, the analyses to be conducted, and long-term data storage.
In addition to the primary researcher(s), others involved in the research process may take part in aspects of data management. When the roles and responsibilities of the parties involved are clearly defined, data are more likely to be available for use by the primary researchers and by anyone re-using the data. Roles and responsibilities should be stated explicitly rather than assumed; this is especially important for collaborative projects that involve many researchers, institutions, and/or groups.
Examples of roles in data management:
Steps for assigning data management responsibilities:
A data model documents and organizes data, how it is stored and accessed, and the relationships among different types of data. The model may be abstract or concrete.
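As an illustration only, a concrete data model for a hypothetical field study could be sketched in Python as below. The entity names (`Site`, `Observation`), their fields, and the sample values are assumptions made up for this example, not part of any prescribed model. The sketch shows two types of data, the relationship between them (each observation refers to a site), and a simple check that the relationship holds.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Site:
    """A sampling location (hypothetical entity for illustration)."""
    site_id: str
    name: str
    latitude: float   # decimal degrees
    longitude: float  # decimal degrees


@dataclass(frozen=True)
class Observation:
    """One measured value taken at a site (hypothetical entity)."""
    observation_id: str
    site_id: str      # relationship: refers to Site.site_id
    observed_on: date
    variable: str
    value: float
    unit: str


def orphaned_observations(sites, observations):
    """Return the observations whose site_id matches no known site."""
    known_ids = {site.site_id for site in sites}
    return [obs for obs in observations if obs.site_id not in known_ids]


# Made-up sample records, used only to exercise the model.
sites = [Site("S1", "North Pond", 44.05, -123.09)]
observations = [
    Observation("O1", "S1", date(2020, 6, 1), "water_temperature", 18.4, "degC"),
    Observation("O2", "S9", date(2020, 6, 1), "water_temperature", 17.9, "degC"),
]

# O2 refers to site "S9", which is not defined, so it is reported.
print([obs.observation_id for obs in orphaned_observations(sites, observations)])
```

Writing the relationships down explicitly, even in a sketch this small, makes it possible to check that stored records actually conform to the model.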
Use these guidelines to create a data model:
The steps for identifying the sensitivity of data and determining the appropriate security or privacy level are:
Tailoring the data management plan to a specific repository increases the likelihood that the data will be accepted into that repository and improves their discoverability within it. When beginning a data management plan:
Multimedia data include still images, moving images, and sound. They present unique challenges for data discovery, accessibility, and metadata formatting and should be thoughtfully managed. Researchers should establish their own requirements for managing multimedia during and after a research project using the following guidelines. The Library of Congress has a set of web pages discussing many of the issues to be considered when creating and working with multimedia data. Researchers should consider quality, functionality, and formats for multimedia data. Transcriptions and captioning are particularly important for improving discovery and accessibility.
Storage of images solely on local hard drives or servers is not recommended. Unaltered images should be preserved at the highest resolution possible. Store original images in separate locations to limit the chance of overwriting and losing the original image.
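One way to follow this advice is to copy each unaltered original into a separate archive location, verify the copy, and protect it from being overwritten. The sketch below is a minimal illustration using only the Python standard library; the function names and the choice of SHA-256 checksums and read-only permissions are assumptions for this example, not a prescribed procedure, and the archive directory should point at storage that is separate from the working copy.

```python
import hashlib
import shutil
import stat
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Return the SHA-256 checksum of a file, read in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(1024 * 1024), b""):
            digest.update(chunk)
    return digest.hexdigest()


def archive_original(source: Path, archive_dir: Path) -> Path:
    """Copy an unaltered original into archive_dir without overwriting.

    The copy is verified by checksum and then marked read-only.
    """
    archive_dir.mkdir(parents=True, exist_ok=True)
    target = archive_dir / source.name
    if target.exists():
        # Never replace an archived original with a later version.
        raise FileExistsError(f"{target} already exists; refusing to overwrite")
    shutil.copy2(source, target)  # copy2 also preserves file timestamps
    if sha256_of(source) != sha256_of(target):
        target.unlink()
        raise OSError(f"checksum mismatch while copying {source}")
    # Remove write permission so the archived copy is not edited by accident.
    target.chmod(stat.S_IRUSR | stat.S_IRGRP | stat.S_IROTH)
    return target
```

A call such as `archive_original(Path("IMG_0001.tif"), Path("/mnt/archive/originals"))` (both paths are hypothetical) would leave the working file untouched and place a verified, read-only copy in the archive. Edits are then made only to the working copy.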
Ensure that the policies of the multimedia repository are consistent with your general data management plan.
Several metadata options exist for multimedia data, including the many MPEG standards (http://mpeg.chiariglione.org/) and other standards such as PBCore (http://pbcore.org).
As a best practice, one must first acknowledge that the process of managing data will incur costs. Researchers should plan to address these costs and the allocation of resources in the early planning phases of the project. This best practice focuses on data management costs during the life cycle of the project, and does not aim to address costs of data beyond the end of the project.
Budgeting and costing for your project depend on institutional resources, services, and policies. We recommend that you check with your sponsored project office, your office of research, tech transfer resources, and other appropriate entities at your institution to understand the resources available to you.
There are a variety of approaches to budgeting for data management costs. All approaches should address the following costs in each phase:
Methods for Managing Costs
Phases of the Data Life Cycle (see Primer on Data Management on the DataONE website for a description of the life cycle)
The plan will be created at the conceptual stage of the project. It should be considered a living document and a road map for the project, and should be closely followed. Any changes to the data management plan should be made deliberately, and the plan should be updated throughout the data life cycle.
Data management planning provides crucial guidance to all stages of the data life cycle. It provides continuity for operations within the research group. The data management plan will define roles for all project participants and workflows for data collection, quality assurance, description, and deposit for preservation and access. The data management plan is a tool to communicate requirements and restrictions to all members of the project team, including researchers, archivists, librarians, IT staff and repository managers. The plan governs the active research phase of the project life cycle and makes provisions for the hand-off to a repository for preservation and data delivery.
Many funding agencies and institutions require a data management plan as a condition of project funding or approval.