ReplicationDomain - Documentation


Replication Timing Analysis

Replication timing data were obtained by hybridizing early and late replication intermediates to Nimblegen oligonucleotide arrays. Briefly, cells are pulse-labeled with BrdU and sorted into early and late stages of S-phase by flow cytometry. The BrdU-substituted (nascent) replication intermediates that were synthesized either early or late during S-phase are then recovered by anti-BrdU immunoprecipitation. After unbiased amplification of the recovered DNA, the samples are differentially labeled with Cy3 and Cy5 and hybridized to Nimblegen CGH arrays. Raw data are loess-normalized and scaled to have the same median absolute deviation using the limma package (R/Bioconductor). Finally, the data are smoothed with a weighted moving average (loess: local polynomial smoothing). In some cases averages of multiple replicates are shown; in other cases individual replicates are shown. This information can be found in the DataSet Details by clicking on the Chip ID in the display page or the File Name in the Database page. Further details can be found in Hiratani et al. [PLoS Biology (2008) 6: e245], and the full procedure is described in our online protocol.
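The scaling step can be illustrated with a minimal sketch. This is not the limma implementation: the function names, the use of NumPy, and the choice of the geometric mean of the per-array values as the common target are all assumptions made for illustration. It only shows what it means for several arrays of log ratios to share the same median absolute deviation (MAD).

```python
import numpy as np


def mad(values):
    """Median absolute deviation of a 1-D array (no consistency constant)."""
    values = np.asarray(values, dtype=float)
    return np.median(np.abs(values - np.median(values)))


def scale_to_common_mad(log_ratios):
    """Rescale each column so that all columns share one MAD.

    log_ratios: 2-D array, rows = probes, columns = arrays (replicates).
    The common target is the geometric mean of the per-column MADs,
    which is an assumption of this sketch, not a documented setting.
    """
    log_ratios = np.asarray(log_ratios, dtype=float)
    mads = np.array([mad(column) for column in log_ratios.T])
    if np.any(mads == 0):
        raise ValueError("a column with zero MAD cannot be rescaled")
    target = np.exp(np.mean(np.log(mads)))
    # Multiplying a column by a positive constant multiplies its MAD
    # by the same constant, so every column ends up with MAD == target.
    return log_ratios * (target / mads)
```

Because the rescaling is multiplicative, it changes the spread of each array but not the sign or rank order of its log ratios.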

Transcription Analysis (Steady-State Transcript Levels)

Transcription microarray analysis is performed on either an Affymetrix or a Nimblegen platform, as indicated in the data set details.

Affymetrix: This is standard Affymetrix GeneChip analysis of steady-state transcript levels from total RNA, performed in triplicate on cells grown under conditions identical to those used to evaluate replication timing (GeneChip Mouse Genome 430 2.0, analyzing 45,037 single-copy murine genes and EST clusters). Each data set was first subjected to Affymetrix quality control measures using GeneChip® Operating Software (GCOS) according to the manufacturer’s protocols. All three data sets passed this quality control test and were normalized with the Probe Logarithmic Intensity Error (PLIER) algorithm, developed by Affymetrix for calculating probe signals. For each Affymetrix “probe set,” the signal intensities of the three biological replicates were averaged (i.e., average intensity). Genes are often represented by multiple probe sets. In such cases, the probe set with the highest total intensity (i.e., the sum of the ESC and NPC/EBM9 average intensities) was defined as the representative probe set, and the other probe sets were not used. We did so because the highest-intensity probe sets were empirically the most consistent with reverse transcriptase (RT)-PCR analysis and can be defined objectively. Present (transcriptionally active) and absent (inactive) calls are generated by MAS5.0 (Affymetrix) per replicate per probe set, which yields multiple present–absent calls for a given gene [= 3 x (total number of probe sets for a gene)]. We defined “present” genes as those for which more than 50% of all probe set calls were present. A total of 15,143 (81%) of the 18,679 RefSeq genes for which replication-timing ratios were obtained were represented on the Affymetrix GeneChip microarrays and were assigned transcription levels and present–absent calls. The transcription array results were consistent with previously published transcription analysis under the same conditions. Further details can be found in Hiratani et al. [PLoS Biology (2008) 6: e245].
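The two per-gene rules above (choosing a representative probe set and making a present call) can be sketched as follows. The data layout, the function names, and the example values are hypothetical and are not taken from the actual pipeline. The sketch also assumes that only a "P" call counts as present.

```python
from statistics import mean


def representative_probe_set(probe_sets):
    """Return the probe set with the highest total intensity.

    Total intensity = sum over conditions of the replicate-averaged
    intensity (e.g. ESC average + NPC average).
    """
    def total_intensity(probe_set):
        return sum(mean(reps) for reps in probe_set["intensity"].values())

    return max(probe_sets, key=total_intensity)


def is_present(probe_sets, condition):
    """True if more than 50% of all calls for the gene are present.

    Calls are pooled over every probe set and replicate, giving
    3 x (number of probe sets) calls for triplicate data.
    """
    calls = [call for ps in probe_sets for call in ps["calls"][condition]]
    return sum(call == "P" for call in calls) > 0.5 * len(calls)


# Hypothetical gene represented by two probe sets, three replicates each.
gene = [
    {"id": "probe_a",
     "intensity": {"ESC": [100, 110, 90], "NPC": [50, 60, 40]},
     "calls": {"ESC": ["P", "P", "P"], "NPC": ["A", "A", "P"]}},
    {"id": "probe_b",
     "intensity": {"ESC": [200, 210, 190], "NPC": [10, 10, 10]},
     "calls": {"ESC": ["P", "A", "A"], "NPC": ["A", "A", "A"]}},
]
```

Note that the threshold is strict: a gene with exactly half of its calls present is not called present under this reading of "more than 50%".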

Nimblegen: Total cellular RNA was isolated with the RNeasy kit (Qiagen) or, for low-yield samples such as mesoderm and endoderm RNA derived from cells collected by FACS, with the RNAqueous-Micro kit (Ambion). For microarray analysis, RNA specimens were converted to double-stranded cDNA, labeled with Cy3, and hybridized according to standard procedures by NimbleGen Systems using a mouse expression microarray representing 42,586 transcripts (Roche NimbleGen Inc., 2006-08-03_MM8_60mer_expr). Further details can be found in Hiratani et al. [Ichiros GR paper].

Definitions of Data Sets and Data Entry Terms

Data sets are defined by a combination of data entry terms, as described below. Upon uploading data sets, users can either select terms from the dropdown list, or create a new term by filling in the blank. This information will then be found in the DataSet Details by clicking on the Chip ID in the display page or the File Name in the Database page.

 

  • Species: Supported species are Arabidopsis, Drosophila, Gallus gallus, Homo sapiens, Mus musculus, and Schizosaccharomyces pombe. Contact us to create a page for any new species.
  • Company: Microarray product supplier name.
  • Chip ID: This is the unique identifier for each data set. While a “Chip ID” normally represents a single replicate experiment (e.g., one microarray hybridization), most data sets currently displayed on the Data Display Page are averages of multiple replicates. We have therefore redefined Chip ID as a string of characters that combines the individual “Chip ID” numbers with a description of the data set identity. Its main use is to identify a particular experiment when communicating comments about it. On the Data Display Page, the Chip ID for each data set also links to the "Data Set Details" (also accessible from the "Database" link on the main menu). The Chip ID is also useful for identifying data sets when downloading entire data sets through the "Download Data" link in the main menu.
  • Build: This indicates the version of the genomic sequence assembly that was used to design the microarray chip in the particular experiment (for example, mm7 and mm8 for the mouse). Builds change slightly as sequence information is updated, so the exact base pair position of any given DNA sequence can differ from one build to the next. The build indicated in each data set is the one used for the chromosomal coordinates of probes on the particular array type used.
  • GEO Accession Number: If you have published your data, you should have deposited them in the Gene Expression Omnibus (GEO) database. Supplying the accession number gives users additional information about your data sets.
  • Order ID: For Nimblegen data sets only.
  • Cell Line: This indicates the designated name of the cell lines employed.
  • Differentiation State: This is the tissue or tissue type represented by the cell line used and grown under the indicated conditions. In some cases it represents a cell line derived from a particular tissue; in other cases it represents a stage in a stem cell differentiation protocol.
  • Array Design Name: Microarray supplier and catalog number.
  • Data Type: Indicates the property being measured in the indicated experiment, such as replication timing, transcription, ChIP-chip, ChIP-seq, etc.
  • Reference: For a data set to be displayed publicly, you must include a reference to an in-press or published paper.
  • Comments: We provide detailed microarray design information here but any additional comments can be added.
  • Present or Absent Column: When uploading transcription data sets that contain present-absent calls, specify the column that holds them here.
  • Partial Dataset: Data sets that do not contain information for all chromosomes.
  • Data Security Level: Users can select Public, Private, or Über Private. Users can make their published or “in press” data sets publicly available by selecting “Public” and providing a reference under the entry term, "Reference." Private data sets are viewable by all registered users with a ReplicationDomain account, while Über Private data sets are viewable only by the user who uploaded the data set and any other users that s/he wishes to designate.
  • Data Starts on Line: Data usually start on line 2, with line 1 holding the column names, but files without column names are also acceptable (i.e., data start on line 1).
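As an illustration of how the Reference, Data Security Level, and Data Starts on Line terms interact, here is a minimal sketch. The class, field, and function names are hypothetical and do not describe the site's actual implementation. The sketch assumes that visitors without an account can see only Public data sets, and that the column names sit on line 1 whenever the data start later.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Set, Tuple


@dataclass
class DataSetEntry:
    chip_id: str
    owner: str                              # account that uploaded the data set
    security_level: str = "Private"         # "Public", "Private", or "Über Private"
    reference: Optional[str] = None         # in-press or published paper
    data_starts_on_line: int = 2            # 1-based line number
    shared_with: Set[str] = field(default_factory=set)


def validate(entry: DataSetEntry) -> None:
    """Reject entries that break the rules described in the list above."""
    if entry.security_level == "Public" and not entry.reference:
        raise ValueError("a Public data set needs a Reference")
    if entry.data_starts_on_line < 1:
        raise ValueError("Data Starts on Line must be 1 or greater")


def can_view(entry: DataSetEntry, user: Optional[str]) -> bool:
    """user is an account name, or None for a visitor without an account."""
    if entry.security_level == "Public":
        return True
    if user is None:
        return False
    if entry.security_level == "Private":
        return True                         # any registered user
    # Über Private: only the uploader and the users they designated.
    return user == entry.owner or user in entry.shared_with


def split_header(lines: List[str], data_starts_on_line: int) -> Tuple[Optional[str], List[str]]:
    """Return (column-name line or None, data lines)."""
    start = data_starts_on_line - 1         # convert to a 0-based index
    header = lines[0] if start >= 1 else None
    return header, lines[start:]
```

For example, `split_header(lines, 2)` treats the first line as column names, while `split_header(lines, 1)` returns every line as data.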