Data Management: Data Storage

How to manage data.

Data Storage

Formats for Data Storage

Store files in non-proprietary formats whenever possible (e.g., .txt, .csv, .asc, .html, .xml) to enable more open, longer-term access, storage, and preservation of your data.

  • Text files - use TXT, XML, PDF/A, HTML, ASCII

  • Databases and Tabular data - use XML, CSV

  • Statistical data - use ASCII, DTA, POR, SAS, SAV

  • Movies - use AVI, MOV, MPEG, MXF

  • Images - use TIFF, JPEG 2000, PDF, PNG, GIF, BMP
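
As an illustration of the tabular-data recommendation above, the following is a minimal Python sketch that writes records to CSV using only the standard library. The function names, column names, and sample values are hypothetical placeholders, not part of any CSHL workflow.

```python
import csv
import io


def rows_to_csv(rows, fieldnames):
    """Serialize a list of dict records to CSV text with a header row."""
    buffer = io.StringIO(newline="")
    writer = csv.DictWriter(buffer, fieldnames=fieldnames, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


def save_csv(path, rows, fieldnames):
    """Write the records to a UTF-8 encoded .csv file at the given path."""
    with open(path, "w", newline="", encoding="utf-8") as handle:
        handle.write(rows_to_csv(rows, fieldnames))


# Hypothetical example records
records = [
    {"sample_id": "S1", "value": 1.5},
    {"sample_id": "S2", "value": 2.0},
]
csv_text = rows_to_csv(records, ["sample_id", "value"])
```

Plain CSV with an explicit header row and UTF-8 encoding can be opened by most statistical packages and spreadsheet tools without the original software.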

Other Examples of Commonly Used File Formats: 

Proprietary | Non-proprietary/Preferred
Excel (.xls, .xlsx) | Comma-separated values (.csv), tab-separated values (.tsv), or ASCII
Word (.doc, .docx) | Plain text (.txt) or PDF/A (.pdf)
PowerPoint (.ppt, .pptx) | PDF/A (.pdf)
Photoshop (.psd) | TIFF (.tif, .tiff)
QuickTime (.mov) | MPEG-4 (.mp4)

See the Library of Congress Recommended Formats for a more extensive, regularly updated list.
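
The proprietary-to-preferred pairs listed above can be turned into a quick check of a project's file names. The following is a minimal Python sketch; the function and variable names are hypothetical, and the mapping covers only the formats named in this guide, not the full Library of Congress list.

```python
from pathlib import Path
from typing import Dict, Iterable, Optional

# Proprietary extension -> preferred alternative, as listed in this guide.
PREFERRED = {
    ".xls": "CSV (.csv), TSV (.tsv), or ASCII",
    ".xlsx": "CSV (.csv), TSV (.tsv), or ASCII",
    ".doc": "Plain text (.txt) or PDF/A (.pdf)",
    ".docx": "Plain text (.txt) or PDF/A (.pdf)",
    ".ppt": "PDF/A (.pdf)",
    ".pptx": "PDF/A (.pdf)",
    ".psd": "TIFF (.tif, .tiff)",
    ".mov": "MPEG-4 (.mp4)",
}


def suggest_format(filename: str) -> Optional[str]:
    """Return the preferred alternative for a proprietary file, else None."""
    return PREFERRED.get(Path(filename).suffix.lower())


def audit(filenames: Iterable[str]) -> Dict[str, str]:
    """Map each proprietary-format file name to its suggested alternative."""
    report = {}
    for name in filenames:
        suggestion = suggest_format(name)
        if suggestion is not None:
            report[name] = suggestion
    return report


# Hypothetical file names for illustration
flagged = audit(["figure1.psd", "measurements.csv", "Results.XLSX"])
```

Here `flagged` would contain entries for `figure1.psd` and `Results.XLSX` only, since `.csv` is already a preferred format. A check like this only flags candidates; the conversion itself still has to be done in the originating software or with a suitable tool.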

Data Security

Data Encryption: Although encryption may make your data more difficult for collaborators and future users to access, sensitive data (e.g., data related to medical records or human subjects) may need to be encrypted.

If you need assistance with encrypting your data, please contact Information Technology at 516-367-8390 or at helpdesk@cshl.edu.

Depositing Data in a Repository

Depositing your data in a repository is an important step in conducting responsible science: it enhances the FAIRness (findability, accessibility, interoperability, and reusability) of your data and makes it easier for other researchers to access and potentially reuse it. The NIH and other funding agencies either require or encourage depositing data collected in sponsored research in both local/institutional and external repositories. Here is a list of NIH-supported repositories.

External Repositories

There are a number of external disciplinary and multi-disciplinary repositories to choose from when submitting your data. The CSHL Library can help you select a suitable data repository.

Commonly used data repositories:

  • Zenodo - Zenodo allows users to upload any file format and accepts figures, datasets, media, papers, posters, presentations, and filesets. The limit is 50 GB per dataset, but more can be requested. Note: there is a CSHL Zenodo Community.
  • Figshare - Figshare allows users to upload any file format and accepts figures, datasets, media, papers, posters, presentations, and filesets. The limit is 100 GB per user, but more can be requested.
  • Dryad - Dryad welcomes data files associated with any published article in the sciences or medicine, as well as software scripts and other files important to the article.
  • GitHub - GitHub gives users version control as they develop code and lets them deposit the code for each project in its own repository.
  • OMERO - OMERO has a variety of features, with an emphasis on managing microscope images.