Help Page - AcDs Tagging
This is a guide to using AcDs resources and tools at PlantGDB. If you have questions that you would expect or hope to be answered here, please let one of us know as described on the AcDs Contact Page. We will give you an answer, and modify this section if appropriate.
fDs Sequence in Genome Context (ZmGDB)
When viewed at ZmGDB, the Ds flanking sequence (aqua blue) is shown in its genome context, aligned to the maize B73 genome with sequence length as indicated. The predicted Ds insertion point is always at the proximal end of the fDs glyph image. Distal fDs sequences (with "_dir" appended to the clone name) are always a discrete distance from the actual Ds insertion point, and the gap region is shown in solid aqua blue. When adjacent and distal sequences from the same barcode are present, the endpoints should both align with the predicted point of Ds insertion (see diagram).
Cloning and Placement Methods
Flanking DNA is cloned by inverse PCR using Ds- (or Ac-) specific primers, and the resulting insert is sequenced from both 5' and 3' ends, using either adjacent or distal primers. The resulting Ds-flanking (fDs) or Ac-flanking (fAc) sequences are classified as 5d, 5a, 3a, or 3d (see diagram below). They are deposited in GenBank, and each fDs sequence is matched to the maize pseudomolecules (RefGen_v1) using NCBI BLAST with 1e-5 threshold. The BLAST output is parsed to determine which are the best matches for each sequence, and matches meeting certain criteria for % identity and coverage (see below) are rated as "placed" and reported on the Ds Insertions page. These matches are also displayed on the ZmGDB genome browser and the ZmGDB Protein Alignments table.
Ds insertion point
The first base of an Adjacent fDs sequence (5a or 3a) is the base immediately next to the point of Ds insertion, whereas a Distal fDs sequence (5d or 3d) is located some unknown (but small) distance from the site of the Ds insertion. Both are presented in (+) strand format so that the first base of any sequence is the base closest to the site of the Ds insertion. Note that the adjacent sequence could be adjacent to the 3' OR 5' side of Ds, but sequence direction is always away from the point of Ds insertion. See diagram.
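As a rough illustration of the rule above, the sketch below reads the flanking genome base for an Adjacent fDs sequence off a single blastn alignment. It assumes BLAST-style 1-based coordinates in which the subject start is the base aligned to the query start; the `Hit` class and function name are hypothetical and are not part of the PlantGDB pipeline. It does not apply to Distal sequences, whose offset from the insertion is unknown.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Hit:
    """One blastn alignment (hypothetical container, 1-based coordinates)."""
    qstart: int  # first aligned base of the fDs query
    qend: int    # last aligned base of the fDs query
    sstart: int  # genome base aligned to qstart
    send: int    # genome base aligned to qend


def adjacent_flank_base(hit: Hit) -> Optional[int]:
    """Return the genome coordinate next to the Ds insertion.

    Only meaningful for Adjacent (5a or 3a) sequences. If the alignment
    does not begin at the first query base, the flanking base cannot be
    read off directly, so None is returned.
    """
    if hit.qstart != 1:
        return None
    return hit.sstart


# Alignment running left to right on the genome
print(adjacent_flank_base(Hit(1, 300, 5000, 5299)))
# Alignment running right to left (sstart greater than send)
print(adjacent_flank_base(Hit(1, 300, 5299, 5000)))
```

In both cases the sequence extends away from the insertion, so the flanking base is the subject coordinate paired with query base 1 regardless of orientation.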
Since the Ds insertions are maintained in maize inbred W22, some sequence divergence versus the maize reference genome (inbred B73) sequence needs to be taken into account in the placement process. See next section for more details.
Placement is determined by querying the fDs sequence against the maize pseudomolecule sequences using NCBI's blastn and determining heuristically which matches indicate a true placement. Since Ds-flanking sequences are from inbred W22, some mismatch with the B73 genome is tolerated (expressed as Quality score). Only elements with one (Single Placement; SP) or two (Dual Placement; DP) unambiguous locations meeting a Quality threshold are scored as placed. The following placement rules are applied:
- Not Placed (no hits or too many hits). These accessions are not displayed here, but they are available in the AcDs BLAST datasets (see below)
- SP (single placement; one unambiguous location);
- DP (dual placement, two equally likely locations, not resolved);
- (not placed - multiple locations) - more than two equally likely locations if good match (MQ2-HQ); more than two equally likely locations of less good match (MQ3, MQ4)
- (not placed, no good hits) - falls below identity/coverage threshold (see below) or no match at 1e-10
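The placement rules above can be sketched as a small function. This is only an illustration: it assumes that "equally likely" means "sharing the best Quality tier" (HQ through MQ4, as defined on this page), which the page does not state explicitly, and the function name is hypothetical.

```python
# Quality tiers from best to worst, as named on this page.
QUALITY_ORDER = ["HQ", "MQ1", "MQ2", "MQ3", "MQ4"]


def classify_placement(location_qualities):
    """Classify one fDs sequence from the tiers of its candidate locations.

    location_qualities: list with one Quality label per candidate genome
    location; labels outside QUALITY_ORDER are treated as failing the
    threshold. Assumption: locations tied for the best tier count as
    "equally likely".
    """
    ranked = [q for q in location_qualities if q in QUALITY_ORDER]
    if not ranked:
        return "not placed (no good hits)"
    best = min(ranked, key=QUALITY_ORDER.index)
    n_best = ranked.count(best)
    if n_best == 1:
        return "SP"
    if n_best == 2:
        return "DP"
    return "not placed (multiple locations)"


print(classify_placement(["HQ", "MQ2"]))
print(classify_placement(["MQ1", "MQ1"]))
print(classify_placement(["HQ", "HQ", "HQ"]))
```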
Quality refers to the blastn % identity/coverage score and is applied hierarchically (once a sequence satisfies a higher-level criterion, lower-level criteria are not applied):
- HQ (high quality): >= 95% coverage and >= 95% identity with no large gaps.
- MQ1 (medium quality 1): > 90% and < 95% coverage and >= 95% identity with no large gaps.
- MQ2 (medium quality 2): > 90% coverage and 95% cumulative identity, with one or more gap openings in the alignment.
- MQ3 (medium quality 3): >= 95% identity over at least 200 bp of sequence; only invoked when there is only one MQ hit.
- MQ4 (medium quality 4): > 90% identity over at least 200 bp; only invoked when there is only one MQ hit.
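The hierarchical Quality rules can be read as a cascade of checks, sketched below. Several points are assumptions made for illustration and are not stated on this page: what counts as a "large gap" is left to the caller (`has_large_gap`), MQ2 is treated as the gapped counterpart of HQ/MQ1 with identity of at least 95%, and `only_hit` stands in for "there is only one MQ hit". The function name and parameters are hypothetical.

```python
from typing import Optional


def quality_tier(coverage: float, identity: float, aligned_bp: int,
                 has_large_gap: bool, only_hit: bool) -> Optional[str]:
    """Return the first (highest) Quality tier that the match satisfies.

    coverage and identity are percentages; aligned_bp is the length of the
    matching region. Returns None when no tier applies.
    """
    if coverage >= 95 and identity >= 95 and not has_large_gap:
        return "HQ"
    if 90 < coverage < 95 and identity >= 95 and not has_large_gap:
        return "MQ1"
    # Reached only for gapped alignments, since ungapped ones with these
    # values were already caught as HQ or MQ1 above.
    if coverage > 90 and identity >= 95:
        return "MQ2"
    # MQ3 and MQ4 ignore coverage but require at least 200 bp and a single hit.
    if only_hit and aligned_bp >= 200:
        if identity >= 95:
            return "MQ3"
        if identity > 90:
            return "MQ4"
    return None


print(quality_tier(97, 98, 400, has_large_gap=False, only_hit=False))
print(quality_tier(60, 92, 250, has_large_gap=False, only_hit=True))
```

Because the checks run top to bottom and return at the first success, a sequence that qualifies for a higher tier is never tested against the lower ones.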
The Ds Insertions (Chr) table is a good place to start for browsing or searching Ds (and Ac)-flanking sequences that have been provisionally placed on the Zea mays pseudomolecules as described above. Insertions are listed according to their W22 line of origin (barcode; left column), and they are sorted by chromosome and then by position, based on the latest pseudomolecule assembly provided by maizesequence.org. You can search on any column, sort any column, or export results to a .csv file.
Explanation of columns
- Each barcode ID (column 1) is linked to a "Placed" barcode record for that accession (see below); click to access it.
- fDs-5d, 5a, 3d, or 3a refers to the fDs position relative to Ds; these columns display GenBank IDs of any fDs sequence in that position. Click to view GenBank record.
- Placement for each barcode is scored as either single chromosome location (SP), or dual placement not resolved (DP). Non-placement can be due to multiple, ambiguous potential locations or to insufficient identity by blastn. See below for more details.
- Quality refers to the % identity/coverage score, and ranges from HQ (high quality; >= 95% coverage and >= 95% identity with no large gaps) to MQ4 (> 90% identity over at least 200 bp). Note that only a single placement location per barcode is displayed in this table even though more than one may be indicated. See below.
- Chromosome and Ds Coordinate columns refer to the predicted location of Ds insertion in the genome. The closest gene to Ds location is shown, and distance and polarity are shown. Red (negative) numbers indicate predicted insertion within a gene transcript region.
- Closest gene - The maizesequence.org gene model closest to the Ds insertion site is shown, together with distance and polarity (negative distance denotes insertion in a gene). Click the ID to view the gene model record at ZmGDB.
- Confirmation - Denotes data from re-PCR or other confirmation experiment. See barcode record for details.
- Redundancy - Ds insertions determined to be at an identical molecular location are grouped together under an RS number.
PLEASE NOTE that barcodes appended with an A, B, or C (e.g. B.S05.0813A) represent W22 lines for which more than one insertion site was identified (e.g. B.S05.0813B). Multiple sites may be linked or unlinked.
Things to keep in mind:
- Placement is a best guess based on available data, but in light of the complexity of the maize genome, the draft nature of the B73 reference genome and its divergence from W22 (the line used for AcDs tagging), it is possible some placements will be missed and others will be incorrect.
- If more than one flanking sequence is present (e.g. 5a and 3a sequences), it is possible they will map to divergent locations, or else one may be non-placed and the other placed.
- Barcodes appended with an A, B, or C (e.g. B.S05.0813A) represent W22 lines for which more than one insertion site was identified (e.g. B.S05.0813B). Multiple sites may be linked or unlinked.
- If you don't see a particular Barcode of interest in this table, it is because it has not been placed according to our criteria.
We also provide a Browse Ds Insertions (BAC) page showing placement of Ds flanking regions in the maize BAC dataset. These data may not be as up to date as the chromosome placement data, however.
Ds Insertion Sites Page
The Barcode record page (see example) is accessed by clicking a barcode ID in the Browse Insertions barcode column. Displayed are current maize genome placement data, fDs sequence and genetic data for this barcode line.
The Placement Table displays all available Ds placement data for this barcode, arranged in columns according to which flanking sequence(s) were successfully placed. Placement is based on BLASTN similarity of W22 Ds flanking sequence (fDs) to the maize B73 genome. Flanking sequence data are listed according to sequence type: 5d, 5a, 3a, or 3d.
fDs Placement Alternative (if any)
For some accessions, two equally likely, distinct locations are found, and these are classified as "DP". A second set of columns is used to display all data and links for the second potential placement.
Placement Table Links:
- fDs gi - links to Blast output for that gi
- Likely Ds insertion point - links to ZmGDB genome browser centered on the Ds insertion point.
- Closest Gene - links to the UniRef record for the top BLAST hit (if any).
- Blast @ ZmGDB - Opens a BLAST page loaded with the fDs sequence as query, for querying the maize genome assembly or other dataset.
A group number (RS) is assigned to multiple barcodes that have been determined to map to the same genetic location. Click header to view details including a list of Ds elements in the same group.
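The grouping idea can be illustrated with a short sketch that collects barcodes sharing the same predicted location. This is not the site's actual procedure: treating "same location" as an identical (chromosome, coordinate) pair is an assumption, and the barcode names and the "RS1", "RS2" label format used here are hypothetical.

```python
from collections import defaultdict


def group_redundant(insertions):
    """Group barcodes that map to the same (chromosome, coordinate).

    insertions: iterable of (barcode, chromosome, coordinate) tuples.
    Returns a dict mapping a hypothetical group label to its sorted
    barcodes; locations hit by a single barcode get no group.
    """
    by_site = defaultdict(list)
    for barcode, chrom, coord in insertions:
        by_site[(chrom, coord)].append(barcode)

    shared_sites = sorted(site for site, codes in by_site.items()
                          if len(codes) > 1)
    return {f"RS{i}": sorted(by_site[site])
            for i, site in enumerate(shared_sites, start=1)}


# Hypothetical barcodes, for illustration only
example = [("bc1", 1, 100), ("bc2", 1, 100), ("bc3", 2, 50)]
print(group_redundant(example))
```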
If present, data from molecular confirmation (e.g. re-PCR based on flanking sequence) are shown. Click header to view details.
Sequence and blast
Click Show FASTA Sequence to view FASTA; or click BLAST@ZmGDB to open a blast window pre-loaded with the fDs sequence.
Other barcode information
Scroll down the page for additional barcode data (Sequence, Genetics, Southern Blots, IPCR, Sequencing Methods).
The Source Data table shows the BLASTN output and placement scores for all fDs sequences that had a match (>1e-10) to the maize pseudomolecules.
Our placement pipeline uses the Solar program to aggregate BLASTN output HSP (high scoring pair) blocks that are likely to be part of the same query-subject match, but are separated by large or small gaps. This can be seen in "Total Query Start/End" and "Total Subject Start/End" columns, which aggregate the HSP block coordinates into an overall span, according to block number ("Number of Blocks"). Individual HSP start/stop coordinates can be seen in the "Query Start/Stop" and "Subject Start/Stop" columns.
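To make the relationship between the per-block and "Total" columns concrete, the sketch below computes an overall span from HSP blocks that have already been assigned to one aggregate. It does not reproduce how Solar decides which blocks belong together; the function name and tuple layout are assumptions for illustration.

```python
def total_span(blocks):
    """Summarize HSP blocks belonging to one query-subject match.

    blocks: non-empty list of (qstart, qend, sstart, send) tuples.
    Returns the block count plus the overall query and subject spans,
    taken as the smallest and largest coordinate across all blocks.
    """
    if not blocks:
        raise ValueError("at least one HSP block is required")
    q_coords = [c for b in blocks for c in b[:2]]
    s_coords = [c for b in blocks for c in b[2:]]
    return {
        "n_blocks": len(blocks),
        "query": (min(q_coords), max(q_coords)),
        "subject": (min(s_coords), max(s_coords)),
    }


# Two blocks separated by a gap on both query and subject
print(total_span([(1, 120, 1000, 1119), (130, 300, 1500, 1670)]))
```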
The last series of columns shows the computed placement quality and score for each block aggregate (See below).
How to use this table? If you arrived here from a Barcode Record page, you will see all aggregated blocks for your fDs sequence of interest, and their genomic coordinates. You can now get a picture of how the fDs sequence matched to the genome, and the quality of the match for each block.
To view all fDs matching blocks, click Clear Search. You can now browse the entire dataset or do other searches and sorts as desired.
- Search by Query ID: Enter an fDs GI to view all blocks for that gi
- Search by Subject ID: Enter a chromosome number to view all blocks for that chromosome
- To view more records at a time, select a larger value for "# per Page" and then click "Submit".
- To sort the returned results, click on any column header. Multiple column sorts are possible using [shift + click]
Ds - Gene Distance table
This table lists current maize gene models and their estimated distance to the nearest mapped AcDs insertion mapping to the same chromosome. Each gene ID is also linked to a table showing the 10 closest Ds insertions. Note that Ds placement in the genome is provisional, and users are encouraged to take additional steps to confirm any conclusions about proximity.
- GeneId: (left column). This is the official ID for each gene model.
Hint: click ID to view a table of additional Ds insertions in the neighborhood of this gene.
- Description - based on blastp similarity to UniRef90 proteins. You can search on keywords in this column using the Search tool.
- Chromosome, From, To: The chromosome and left-right coordinates for this gene.
Hint: click in the "Chr" column to view this region in genome context at ZmGDB.
- Closest Ds Insertion: The barcode ID of the AcDs insertion that maps closest to this gene. But see the following caveats:
- There may be multiple insertions at this location but only one is shown.
- The "closest Ds" may not actually be the closest, due to the unordered nature of subcontigs in the assembly
- Ds Insertion Coordinate: The estimated coordinate of the AcDs insertion that maps closest to this gene.
Hint: click to view the region including this coordinate at ZmGDB
- Distance: Distance (kb) to the nearest Ds insertion. Entry will be red if insertion point is within gene.
- 5' or 3': Refers to the polarity (5' or 3' with respect to the gene) of the closest AcDs insertion.
- Gene-Ds span: The left and right coordinates of the region spanning the gene and the Ds insertion point.
Hint: click in this cell to view this entire region in genome context at ZmGDB
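The Distance, 5' or 3', and Gene-Ds span columns can be related to one another with a small calculation, sketched below. It assumes the gene strand is known and that distance is measured in bp from the nearest gene boundary; how the table itself encodes insertions within a gene (shown in red) is not reproduced here. The function name and return format are hypothetical.

```python
def gene_ds_relation(gene_from, gene_to, strand, ds_coord):
    """Relate one gene model to one Ds insertion on the same chromosome.

    strand is "+" or "-". Returns distance in bp to the nearest gene
    boundary (0 if inside the gene), whether the insertion lies within
    the gene, which side of the gene it is on, and the spanning region.
    """
    left, right = sorted((gene_from, gene_to))
    if left <= ds_coord <= right:
        return {"distance_bp": 0, "within_gene": True,
                "side": None, "span": (left, right)}

    ds_is_left = ds_coord < left
    distance = left - ds_coord if ds_is_left else ds_coord - right
    # On the + strand the 5' end of the gene is its left coordinate;
    # on the - strand it is the right coordinate.
    side = "5'" if ds_is_left == (strand == "+") else "3'"
    span = (min(left, ds_coord), max(right, ds_coord))
    return {"distance_bp": distance, "within_gene": False,
            "side": side, "span": span}


print(gene_ds_relation(1000, 2000, "+", 500))
print(gene_ds_relation(1000, 2000, "-", 500))
```

Dividing `distance_bp` by 1000 gives a value in kb, the unit used in the Distance column.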
Tips on using the table:
- To search on a table column (description, geneId, barcode, chromosome) select a search field from the dropdown, enter search text and click "Search".
- To sort the returned results, click on any column header. Multiple column sorts are possible using [shift + click]
- To view more records at a time, select a larger value for "# per Page" and then click "Submit". Note that slower browsers/connections may take a long time to load and display a table with a large number of records, so patience is advised.
- To return to original sort order, refresh your browser page
- To download a .csv (comma separated values) version of the table or any search set, click Download .csv
- Questions, comments? Please email Kevin Ahern.
Seed Order Form
To order seed, use the online Seed Request Form. Users are urged to research thoroughly the most likely chromosomal location of desired barcode accessions before ordering, and to read the order instructions and important information pages.
About Ds insertion seed stocks:
We maintain all Ds insertions as single heterozygous testcross ears (Ds/+ x +/+). The lines are maintained in the W22 inbred and are homozygous for the Ds reporter at either the r1-sc:m3 or a1-m3 locus. Thus, half of the seed we send you will contain the Ds insertion and half will not. It is to your advantage to identify the plants containing the insertion in advance of pollinations. In most cases, each Ds insertion is confirmed as segregating in the stock via a PCR (preferred) or DNA blot assay in advance of shipping or billing. Currently, we are verifying over 80% of our insertion lines. However, in some instances, a Ds that was isolated and subsequently annotated in our screens was found to be a transposition event that occurred in just a single kernel, and therefore is not recoverable in sibling progeny. Thus, it is important that we verify seed stocks before we ship. For more information, see seed order information.
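Because half of the kernels from a Ds/+ x +/+ testcross ear are expected to carry the insertion, a quick calculation can suggest how many kernels to plant. The sketch below assumes each kernel independently carries the Ds with probability 0.5 and ignores germination losses and the rare single-kernel transposition case mentioned above; the function name is hypothetical.

```python
def kernels_needed(confidence, carrier_fraction=0.5):
    """Smallest number of kernels giving at least `confidence` probability
    that one or more of them carries the Ds insertion.

    Assumes independent kernels, each a carrier with probability
    `carrier_fraction` (0.5 for a Ds/+ x +/+ testcross).
    """
    if not 0 < confidence < 1:
        raise ValueError("confidence must be between 0 and 1")
    if not 0 < carrier_fraction <= 1:
        raise ValueError("carrier_fraction must be in (0, 1]")
    n = 1
    # P(at least one carrier among n kernels) = 1 - (1 - f)^n
    while 1 - (1 - carrier_fraction) ** n < confidence:
        n += 1
    return n


print(kernels_needed(0.95))  # 5
print(kernels_needed(0.99))  # 7
```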