Introduction

A key output of the 4D Nucleome (4DN) project is the open publication of datasets on the structure of the human cell nucleus and the genome within. Recent years have seen a rapid expansion of FISH-omics methods, which quantify the spatial organization of DNA, RNA and protein in the cell and expand our understanding of how higher-order chromosome structure relates to transcriptional activity and cell development. Despite this progress, FISH-based image data are not yet routinely made publicly available upon publication because common specifications for data exchange are lacking. This challenge is shared across the bioimaging community; as a result, a solution built, tested and proven within 4DN can have a wide impact well beyond the project.

This document describes the 4DN FISH Omics Format - Chromatin Tracing (FOF-CT), a community data format designed for capturing and exchanging the results of chromosome imaging experiments produced within the context of the 4D Nucleome project. FOF-CT is directly compatible with several FISH-omics techniques including, but not limited to, Optical Reconstruction of Chromatin Architecture (ORCA), Multiplexed Imaging of Nucleome Architectures (MINA), Hi-M, DNA Sequential Fluorescence In Situ Hybridization (seqFISH+), Oligonucleotide Fluorescent In Situ Sequencing (OligoFISSEQ), DNA Multiplexed Error-Robust Fluorescence In Situ Hybridization (DNA-MERFISH), and In-situ Genomic Sequencing (IGS). In addition, the format is designed to be consistent with planned future extensions that will encompass single-molecule localization methods for volumetric imaging, such as OligoSTORM and OligoDNA-PAINT.

In chromatin tracing experiments, polymer tracing algorithms string together the localizations of individual DNA bright Spots to reconstruct the three-dimensional (3D) path of chromatin fibers. Accordingly, the format is organized around multiple tables, the core of which is a Spot/Trace table that defines chromatin Traces as ensembles of individual DNA-FISH bright Spot localizations.
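To make the Spot/Trace relationship concrete, the following minimal sketch groups Spot rows into Traces by a shared trace identifier. The field names (`spot_id`, `trace_id`, `x`, `y`, `z`) and the sample coordinates are illustrative assumptions, not the column names or values defined by the format.

```python
from collections import defaultdict

# Hypothetical Spot rows: (spot_id, trace_id, x, y, z).
# Field names and values are illustrative only, not taken from the specification.
spots = [
    (1, "trace_A", 0.0, 0.0, 0.0),
    (2, "trace_A", 3.0, 4.0, 0.0),
    (3, "trace_B", 1.0, 1.0, 1.0),
]


def group_spots_by_trace(rows):
    """Return {trace_id: [(spot_id, x, y, z), ...]}, preserving input order."""
    traces = defaultdict(list)
    for spot_id, trace_id, x, y, z in rows:
        traces[trace_id].append((spot_id, x, y, z))
    return dict(traces)


traces = group_spots_by_trace(spots)
```

Each Trace here is simply the ordered list of Spot localizations that share a trace identifier, which mirrors the idea of a Trace as an ensemble of Spots.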

Additional tables extend this core with further properties, such as quality metrics, physical coordinates placing the Spot/Trace in the context of cellular space, and multiplexed RNA-FISH results. They also carry data that is better captured at the level of the whole Trace (e.g., expression level of nascent RNA transcripts associated with a given Trace, or overall localization of the Trace with respect to cellular or nuclear landmarks), the Cell (e.g., boundaries and volume), a sub-cellular Region of Interest (ROI; e.g., Nuclear feature or Nucleolus), or an extracellular ROI (e.g., Tissue).

_images/FOF-CT_graph.png

Figure 1: Schematic representation of the 10 tables composing the FISH Omics Format for Chromatin Tracing.

Tables

| Number | Extended Name | Short Name | Namespace | Requirement Level |
|---|---|---|---|---|
| 1 | DNA-Spot/Trace Data core table | core | 4dn_FOF-CT_core | required |
| 2 | RNA-Spot Data table | rna | 4dn_FOF-CT_rna | conditionally required |
| 3 | Spot Quality table | quality | 4dn_FOF-CT_quality | recommended |
| 4 | Spot Biological Data table | bio | 4dn_FOF-CT_bio | recommended |
| 5 | Spot Demultiplexing table | demultiplexing | 4dn_FOF-CT_demultiplexing | optional |
| 6 | Trace Data table | trace | 4dn_FOF-CT_trace | optional |
| 7 | Cell Data table | cell | 4dn_FOF-CT_cell | conditionally required |
| 8 | Sub-Cell ROI Data table | subcell | 4dn_FOF-CT_subcell | conditionally required |
| 9 | Extra-Cell ROI Data table | extracell | 4dn_FOF-CT_extracell | conditionally required |
| 10 | Cell/ROI Mapping table | mapping | 4dn_FOF-CT_mapping | conditionally required |
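In the listing above, every namespace is the prefix `4dn_FOF-CT_` followed by the table's short name. The sketch below encodes that pattern and the requirement levels so they can be queried programmatically. The names `NAMESPACE_PREFIX`, `TABLES`, `namespace` and `tables_with_level` are hypothetical helpers introduced for illustration; they are not part of the format.

```python
# Prefix shared by all namespaces in the table listing.
NAMESPACE_PREFIX = "4dn_FOF-CT_"

# (short name, requirement level), in the order the tables are listed.
TABLES = [
    ("core", "required"),
    ("rna", "conditionally required"),
    ("quality", "recommended"),
    ("bio", "recommended"),
    ("demultiplexing", "optional"),
    ("trace", "optional"),
    ("cell", "conditionally required"),
    ("subcell", "conditionally required"),
    ("extracell", "conditionally required"),
    ("mapping", "conditionally required"),
]


def namespace(short_name):
    """Build a table namespace from its short name."""
    return NAMESPACE_PREFIX + short_name


def tables_with_level(level):
    """Return the namespaces of all tables with the given requirement level."""
    return [namespace(name) for name, lvl in TABLES if lvl == level]


required_tables = tables_with_level("required")
```

A validator built along these lines could, for example, start by checking that every namespace returned by `tables_with_level("required")` is present in a submission.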