Format description: overview

General Info

  • The format is organized in multiple individual Tables.

  • The only mandatory table is the DNA-Spot/Trace Data core table.

  • All other tables are either conditionally required, depending on experiment design and type, or optional but recommended for all experiment types.

  • Each file must contain a single table.

  • Accepted file formats for storing tables are txt, csv and tsv.

  • An underscore _ must be used as a word separator in header field names and column headers to improve readability while not violating common name restrictions in coding environments (dash - may be mistaken as subtraction of variables).

  • Each file has two parts: file header and data columns.
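As a rough illustration of the two-part layout, the sketch below splits the text of one table file into header lines and data lines. It assumes that every header line starts with # and that the header precedes the data; the function name split_header_and_data is hypothetical and not part of the format.

```python
def split_header_and_data(text):
    """Split the text of one table file into (header_lines, data_lines).

    Assumption: header lines start with '#' and appear before any data line.
    Blank lines are skipped.
    """
    header, data = [], []
    for line in text.splitlines():
        if not line.strip():
            continue  # ignore empty lines
        if line.startswith("#") and not data:
            header.append(line)
        else:
            # Once the first data line is seen, everything else is treated as data.
            data.append(line)
    return header, data
```

Parsing the data lines themselves is left out here, since the delimiter depends on whether the file is stored as txt, csv or tsv.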

Warning

All MANDATORY header fields and column names are indicated in bold. All conditionally required header fields and column names are indicated in italics.

Tip

Except for the DNA-Spot/Trace Data core table, Spot Demultiplexing table, RNA Spot Data table and Cell/ROI Mapping table, all included Tables MUST contain at least one Optional Column.

File Header

  • All tables have to contain a mandatory header section.

  • In the file header, each line contains only one field.

  • Header lines are denoted by #. In particular:

    • ## denotes machine-readable header lines. These lines must use the format ##Key_A=Value_1 (e.g., ##FOF-CT_Version=v0.1).

    • # denotes human-readable header lines. These lines should use the format #Term_X: free text description (e.g., #Lab_Name: name of the lab where the experiment was performed).

    • #^ denotes lines that define user-specified Optional Columns. These lines provide the name of the column header and a description of the column content. Descriptions must be understandable and sufficient to ensure the interpretation and reproducibility of the results. These lines should use the format #^Term_X: free text description (e.g., #^Optional_Column_1: optional column 1 description).

  • Header names MUST use the underscore as a word separator (e.g., RNA_A_intensity).

  • The file header contains required, conditionally-required, and optional fields.

  • Conditionally-required fields are fields that are required when certain conditions are met (e.g., ##Intensity_Unit= is required any time an intensity metric is reported).
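The three header prefixes described above can be told apart with a few string checks. The following is a minimal sketch, not a reference parser: the function name classify_header_line and the labels "machine", "human" and "optional_column" are hypothetical names chosen for this illustration.

```python
def classify_header_line(line):
    """Return (kind, key, value) for a header line, or None for a non-header line.

    kind is one of the illustrative labels "machine" (##Key=Value),
    "optional_column" (#^Term: description) or "human" (#Term: description).
    """
    line = line.rstrip("\n")
    if line.startswith("##"):
        # Machine-readable line: ##Key_A=Value_1
        key, sep, value = line[2:].partition("=")
        if not sep:
            raise ValueError(f"machine-readable line without '=': {line!r}")
        return ("machine", key.strip(), value.strip())
    if line.startswith("#^"):
        # User-specified Optional Column: #^Term_X: free text description
        key, _, value = line[2:].partition(":")
        return ("optional_column", key.strip(), value.strip())
    if line.startswith("#"):
        # Human-readable line: #Term_X: free text description
        key, _, value = line[1:].partition(":")
        return ("human", key.strip(), value.strip())
    return None
```

The order of the checks matters: ## and #^ have to be tested before the bare # prefix, because all three kinds of line start with #.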

Mandatory header lines (all tables)

##FOF-CT_Version= Data format version number. E.g., v0.2

##Table_Namespace= Identifier for this type of table. Value must be as in the example. E.g., 4dn_FOF-CT_core

#Lab_Name: Name of the lab where the experiment was performed

#Experimenter_Name: Name of the person performing the experiment

#Experimenter_Contact: Email address of the person performing the experiment

#Description: A free-text description of the experiment and of the data recorded in this table. This description should provide a clear understanding of the process used to produce the data and contain sufficient detail to ensure interpretation and reproducibility.

#Additional_Tables: List of the additional tables being submitted. Note: use a comma to separate each table name from the next. E.g., AddTable1, AddTable2, AddTableN

##Columns:= List of the data column headers used in the table. Note: enclose the column headers in parentheses and use a comma to separate each header name from the next. E.g., (C1, C2, C3, Cn)
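A simple completeness check for the mandatory header lines listed above could look like the sketch below. It is only an illustration: the function name missing_mandatory_fields is hypothetical, key matching is assumed to be case-sensitive, and a trailing colon on a machine-readable key is tolerated so that a line written as ##Columns:= is recognized.

```python
# Keys taken from the list of mandatory header lines above.
MANDATORY_MACHINE = ("FOF-CT_Version", "Table_Namespace", "Columns")
MANDATORY_HUMAN = (
    "Lab_Name",
    "Experimenter_Name",
    "Experimenter_Contact",
    "Description",
    "Additional_Tables",
)


def missing_mandatory_fields(header_lines):
    """Return the mandatory header fields that do not appear in header_lines."""
    machine, human = set(), set()
    for line in header_lines:
        if line.startswith("##"):
            # Assumption: a trailing ':' before '=' is tolerated (##Columns:=).
            key = line[2:].partition("=")[0].strip().rstrip(":")
            machine.add(key)
        elif line.startswith("#") and not line.startswith("#^"):
            human.add(line[1:].partition(":")[0].strip())
    missing = [f"##{key}=" for key in MANDATORY_MACHINE if key not in machine]
    missing += [f"#{key}:" for key in MANDATORY_HUMAN if key not in human]
    return missing
```

An empty result only means that the fields are present; it says nothing about whether their values are valid.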

Additional conditionally required header lines

DNA-Spot/Trace Data core table and RNA Spot Data table

In addition to all of the above, the following header line is required for the DNA-Spot/Trace Data core table and the RNA Spot Data table.

##Genome_Assembly= Genome build. E.g., GRCh38

Note

(1) The 4DN Data Portal only accepts GRCh38 for human and GRCm38 for mouse. For other species, see the list of all 4DN allowable genome builds. (2) If the genome under study contains an INSERTION or a DELETION, report it as described in DNA-Spot/Trace Data core table.

DNA-Spot/Trace Data core table, Spot Demultiplexing table, Spot Biological Data table, RNA Spot Data table, RNA Spot Biological Data table, and Cell/ROI Mapping table

Further, the following header line is required for the DNA-Spot/Trace Data core table, Spot Demultiplexing table, Spot Biological Data table, RNA Spot Data table, RNA Spot Biological Data table, and Cell/ROI Mapping table.

##XYZ_Unit= The unit used to represent XYZ locations or distances. Note: use micron to avoid problems with special Greek symbols. Other allowed values should be drawn from SI units of length. Examples: ‘nm’, ‘micron’, ‘mm’, etc.

Note

Other unit-related header lines are also conditionally required for all other Tables when the relevant metrics are reported (e.g., the ##Time_Unit= field is required if a time measure is reported).

DNA-Spot/Trace Data core table, Spot Demultiplexing table, RNA Spot Data table, Spot Quality table and RNA Spot Quality table

Finally, the following header lines are required for the DNA-Spot/Trace Data core table, Spot Demultiplexing table, RNA Spot Data table, Spot Quality table and RNA Spot Quality table.

#Software_Title: The name of the Software used to localize individual FISH-omics bright Spots and/or to produce three-dimensional (3D) polymeric chromatin Traces.

#Software_Type: The type of this Software. Allowed values: SpotLoc, Tracing, SpotLoc+Tracing, Other

#Software_Authors: The name(s) of the individual author(s) of this Software. If there is more than one author, individual names should be listed as follows: Doe, John; Smith, Jane; etc.

#Software_Description: A free-text description of this Software. This description should provide a detailed understanding of the algorithm and of the analysis parameters that were used, in order to guarantee interpretation and reproducibility.

#Software_Repository: The URL of any repository or archive where the Software executable release can be obtained.

#Software_PreferredCitationID: The unique identifier for the preferred/primary publication describing this Software. Examples include Digital Object Identifier (DOI), PubMed Central Identifier (PMCID), ArXiv.org ID, etc.
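Two of the Software header lines above have a constrained value format: #Software_Type has a fixed set of allowed values, and #Software_Authors uses a semicolon-separated list of "Last, First" names. The sketch below shows one way to check and split them. The function names are hypothetical, and the case-sensitive comparison is an assumption rather than a stated rule of the format.

```python
# Allowed values as listed for #Software_Type above.
ALLOWED_SOFTWARE_TYPES = {"SpotLoc", "Tracing", "SpotLoc+Tracing", "Other"}


def is_allowed_software_type(value):
    """Check a #Software_Type value (assumption: comparison is case-sensitive)."""
    return value.strip() in ALLOWED_SOFTWARE_TYPES


def split_authors(value):
    """Split a #Software_Authors value such as 'Doe, John; Smith, Jane'
    into a list of (last_name, first_name) tuples."""
    authors = []
    for entry in value.split(";"):
        entry = entry.strip()
        if not entry:
            continue  # tolerate a trailing semicolon
        last, _, first = entry.partition(",")
        authors.append((last.strip(), first.strip()))
    return authors
```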

Note

All Software-related header lines are also conditionally required for all other Tables when Software is used to produce the reported results.

Data Columns

  • Tables contain required, conditionally-required, and optional columns.

  • Conditionally-required columns are columns that are required when certain conditions are met (e.g., Cell_ID is required any time the experiment involves the identification of Cell boundaries).

  • Column names should use the underscore _ as a word separator (e.g., Spot_ID).

  • The first column is always either Spot_ID or another relevant ID (i.e., Trace_ID, Cell_ID, etc.).

  • The following tables have additional mandatory columns that do need to be specified in the header: DNA-Spot/Trace Data core table, Spot Demultiplexing table, RNA Spot Data table, and Cell/ROI Mapping table.

  • Unless otherwise specified, the order of all Optional Columns is at the user’s discretion.

  • The order of the rows is at the user’s discretion.
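The column rules above lend themselves to a small automated check. The sketch below parses a parenthesized column list such as (C1, C2, C3, Cn) and flags names that break the underscore convention. Several details are assumptions made for illustration only: the function names, the regular expression (letters and digits joined by single underscores), and the test that the first column name ends in _ID, which merely approximates "Spot_ID or another relevant ID".

```python
import re

# Assumption: a valid name is made of alphanumeric words joined by single underscores.
NAME_PATTERN = re.compile(r"[A-Za-z0-9]+(?:_[A-Za-z0-9]+)*")


def parse_columns_value(value):
    """Turn '(C1, C2, C3)' into ['C1', 'C2', 'C3']."""
    inner = value.strip()
    if inner.startswith("(") and inner.endswith(")"):
        inner = inner[1:-1]
    return [name.strip() for name in inner.split(",") if name.strip()]


def check_column_names(columns):
    """Return a list of human-readable problems; an empty list means no problem found."""
    if not columns:
        return ["no columns listed"]
    problems = []
    # Assumption: ID columns end in '_ID' (Spot_ID, Trace_ID, Cell_ID, ...).
    if not columns[0].endswith("_ID"):
        problems.append(f"first column {columns[0]!r} does not look like an ID column")
    for name in columns:
        if not NAME_PATTERN.fullmatch(name):
            problems.append(f"column {name!r} does not use '_' as its only word separator")
    return problems
```

For example, check_column_names(parse_columns_value("(Spot_ID, Trace_ID, Cell_ID)")) reports no problems, whereas a name such as Spot-ID would be flagged.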