Decide what to archive
Where file sizes are small, best practice is to publicly archive quality-controlled raw data rather than calculated fields.
- Exceptions may be made for large datasets, fine-scale temporal data, model outputs, and other products derived from complex calculations.
- Regardless of what you publicly archive, keep a copy of all your scanned datasheets and raw files (prior to cleaning or reformatting) on your own computer.
Data will be published in data packages. A single data package can consist of multiple entities (tables, shapefiles, code, etc.). In deciding what to lump or split, consider which groups of data would be usefully downloaded together (e.g., you could package all the measurements from a single experiment together).
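One way to think through lumping and splitting is to list your files alongside the experiment they belong to and see what groups fall out. The sketch below is only an illustration: the experiment labels, file names, and the `group_into_packages` helper are all hypothetical, and it is not an EDI tool or a required workflow.

```python
from collections import defaultdict


def group_into_packages(entities):
    """Group (experiment, filename) pairs into one candidate package per experiment."""
    packages = defaultdict(list)
    for experiment, filename in entities:
        packages[experiment].append(filename)
    return dict(packages)


# Hypothetical file inventory: (experiment label, file name).
entities = [
    ("warming_plots", "soil_temperature.csv"),
    ("warming_plots", "plot_boundaries.shp"),
    ("warming_plots", "clean_soil_temperature.R"),
    ("stream_survey", "water_chemistry.csv"),
]

packages = group_into_packages(entities)
for experiment, files in packages.items():
    print(f"{experiment}: {len(files)} entities")
```

Grouping by experiment is just one possible rule. If users would more often want a single measurement type across many experiments, grouping on that instead may make more sense.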
For datasets where a specialized repository offers unique tools or reaches a particular audience, it may be appropriate to archive your data in a non-EDI repository (e.g., GenBank, AmeriFlux). If you believe your data would be better served by a non-EDI repository, contact the IM.