Formats for Data Storage
Store files in non-proprietary formats whenever possible (e.g., .txt, .csv, .asc, .html, .xml) to enable more open, longer-term access, storage, and preservation of your data.
Text files - use TXT, XML, PDF/A, HTML, ASCII
Databases and Tabular data - use XML, CSV
Statistical data - use ASCII, DTA, POR, SAS, SAV
Movies - use AVI, MOV, MPEG, MXF
Images - use TIFF, JPEG 2000, PDF, PNG, GIF, BMP
Other Examples of Commonly Used File Formats:
Proprietary | Non-proprietary/Preferred |
---|---|
Excel (.xls, .xlsx) | Comma-separated values (.csv) or tab-separated values (.tsv), ASCII text |
Word (.doc, .docx) | Plain text (.txt), or PDF/A (.pdf) |
PowerPoint (.ppt, .pptx) | PDF/A (.pdf) |
Photoshop (.psd) | TIFF (.tif, .tiff) |
QuickTime (.mov) | MPEG-4 (.mp4) |
See the Library of Congress Recommended Formats for a more extensive, regularly updated list.
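As an illustration of the Excel-to-CSV recommendation in the table above, here is a minimal Python sketch that writes tabular rows to a UTF-8 CSV file and reads them back using only the standard library. The function names, the file name, and the sample rows are hypothetical placeholders; the sketch assumes you have already exported or loaded your rows from the spreadsheet by some other means.

```python
import csv
from pathlib import Path


def write_csv(rows, path):
    """Write an iterable of rows (lists of values) to a UTF-8 CSV file."""
    # newline="" lets the csv module manage line endings itself.
    with open(path, "w", newline="", encoding="utf-8") as handle:
        csv.writer(handle).writerows(rows)


def read_csv(path):
    """Read a CSV file back into a list of rows (every value is a string)."""
    with open(path, newline="", encoding="utf-8") as handle:
        return list(csv.reader(handle))


if __name__ == "__main__":
    # Hypothetical sample data; replace with rows from your own dataset.
    sample = [["sample_id", "note"], ["S1", "control, batch 2"]]
    out = Path("example_data.csv")  # hypothetical output file name
    write_csv(sample, out)
    print(read_csv(out))
```

Reading the file back is a quick way to confirm that values containing commas were quoted correctly. Note that CSV stores only the cell values, so formulas, formatting, and multiple worksheets in a spreadsheet are not preserved.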
Data Encryption: Although encryption may make your data more difficult for collaborators and future users to access, sensitive data (e.g., data related to medical records or human subjects) may need to be encrypted.
If you need assistance with encrypting your data, please contact Information Technology at 516-367-8390 or at helpdesk@cshl.edu.
Depositing your data in a repository is an important step in conducting responsible science: it enhances data FAIRness (findability, accessibility, interoperability, and reusability) and makes it easier for other researchers to access and potentially reuse your data. The NIH and other funding agencies either require or encourage depositing data collected in sponsored research in both local/institutional and external repositories. Here is a list of NIH-supported repositories.
There are a number of external disciplinary and multi-disciplinary repositories to choose from when submitting your data. The CSHL Library can help you select a suitable data repository.
Commonly used data repositories: