Institute Genomic Analysis Laboratory
sequence indexed TDNA insertion
1. Q: Which Arabidopsis ecotype was used
for the T-DNA collection?
A: Columbia-0 (CS60000, the sequenced genome)
2. Q: Which T-DNA border flanking sequences were
amplified and sequenced? Which primers do you use for left border PCR?
A: The T-DNA left border sequence was used
for PCR amplification of plant flanking sequences.
For PCR 1, we use LBa1 primer: 5' tggttcacgtagtgggccatcg 3'
For PCR 2 and sequencing, we use LBb1 primer: 5' gcgtggaccgcttgctgcaact 3'
3. Q: Is there any way to design a Right Border primer for
PCR at the other end of the insertion? Is the right border less predictable in its insertion pattern?
A: Please use the RB in the pBIN-pROK2 insertion sequence (the T-DNA that is
actually transferred), or try using that sequence with our iSect primer design tool.
4. Q: What is the number of T-DNA inserts per line?
A: Approximately 50% of the lines contain
a single insert; the other 50% contain two or more inserts.
5. Q: Are multiple T-DNA insertions amplified
and detectable by sequencing?
A: In most cases, we have identified only 1 T-DNA flanking
sequence per insertion line.
6. Q: How can I obtain seeds for the Salk insertion lines?
A: Seeds for all sequence-indexed insertion
lines are made available through ABRC and NASC. Seed requests should be
directed to these stock centers. The Salk Institute Genomic Analysis Laboratory
will not distribute seeds to individual laboratories.
7. Q: The ABRC or NASC has sent me seeds for a Salk insertion
mutant. What generation (posttransformation) are these seeds?
A: The sequence-indexed lines available
from ABRC and NASC are segregating T3 lines.
8. Q: What is the plant selectable marker used
for the transformation?
A: The marker is NPTII (kanamycin
resistance). However, after several generations of growth, some of the lines
show silencing of this gene. Thus, it is not unusual for a mutant line
to fail to express the drug resistance phenotype.
9. Q: What is the T-DNA transformation vector
used to generate the mutant?
A: pROK2 (REF: Baulcombe DC, Saunders GR, Bevan MW, Mayo MA, Harrison BD.
Expression of biologically-active viral satellite RNA from the nuclear genome of
transformed plants. Nature 321 (6068): 446-449, May 22 1986). This
vector is a derivative of pBIN19. See our web site for a restriction map
of the pROK2 vector and the sequence of pBIN19.
10. Q: Which T-DNA border is used for plant
flanking sequence isolation?
A: Currently we are sequencing only
from the left border of the T-DNA.
11. Q: Given that each Salk line
contains an average of 1.5-2.0 T-DNA insertions, how many sequenced products
are generated per line?
A: The T-DNA may insert in more than one
place within the Arabidopsis genome. Due to the nature of the T-DNA integration
event and the limitations of the PCR border recovery protocol, PCR products
are sequenced directly without separating the products of each reaction.
Thus, some sequencing reactions may contain two or more overlapping sequences.
Our BLAST search uses the best quality sequence to generate a single BLAST
hit in the genome. It is up to the individual researcher to verify the
sequences of each PCR product from any line.
12. Q: On the T-DNA Express website
what are the arrows above the T-DNA line?
A: The arrow indicates the orientation of
the T-DNA insert within the chromosome. The arrow points in the 5' to 3'
direction, beginning with the left border DNA and running into the plant flanking
sequence.
13. Q: What is the DNA sequence around
the left border of pROK2 and where do the PCR isolation and DNA sequencing
A: The sequence of the
left border of pROK2 and the position of the PCR amplification and sequencing
14. Q: What is the sequence profile of the T-DNA inserts in the SIGnAL T-DNAExpress
database (and which has been submitted to GenBank, GSS division)?
A: We use the LBb1 nested primer for determining the DNA sequence of the PCR-amplified
plant flanking DNA (see the above sequence for the precise position of LBb1 within the T-DNA left border).
The unprocessed DNA sequence profile looks like this:
5' T-DNA left-border sequence (the last base of the left border T-DNA is position 6117),
followed by genomic plant DNA sequence,
followed by the linker/adaptor sequence used for PCR amplification (the adaptor
sequence begins with 5'GCGTGCCC...).
The DNA sequence file shown in the TDNAExpress database and submitted to TAIR and
GenBank contains only plant genomic DNA sequences. The T-DNA vector and
linker/adaptor sequences have been removed.
15. Q: Is sequencing of the entire T-DNA collection finished?
A: Yes. The funding ended on Sept 1, 2003,
so we will not be sending more lines to ABRC.
The project funding began September 1, 2001 and continued for 24 months.
Monthly deposits of sequences in GenBank and Salk mutant lines in ABRC
and NASC were made.
16. Q: On the T-DNA Express website,
some insertion lines are very close in number or have the same number.
For example, SALK_005299, 005300, 005301, and 005305 have the same sequence.
How can this be?
A: We have found that the ABI 384-well sequence
precipitation method can lead to cross-contamination of samples. A new
protocol was devised to reduce the number of contaminating sequences. When
verifying insertions of interest you should order and test all lines listed
for that gene.
The most definitive test for
determining whether the line contains the expected gene insertion is
Southern blotting. PCR can often give a false negative result. We
strongly suggest that you carry out Southern blotting
and hybridization with the labeled gene as a probe to determine whether the line
you received contains an insertion in the expected gene.
17. Q: What are the items labeled
RAFL_xxxx, Ceres_xxxx, AF_xxxx or AY_xxxx within the TDNAExpress database?
A: These are partial or full length cDNA
sequences. These sequences are being produced by the Salk, Stanford, PGEC
Consortium in collaboration with the RIKEN Genome Science Center. In the
case of Ceres_xxxx, these are cDNA sequences that have been released to
the public by Ceres Inc. Please note, the SSP is not distributing
cDNA clones. This function has been assumed by the ABRC and RIKEN. Ceres
cDNAs are only available from Ceres. To obtain cDNA clones, please refer
to this page for details: http://signal.salk.edu/SSP/index.html
18. Q: I searched the TDNAExpress database
using a protein/DNA sequence of interest to my lab and got the following
result (below). What is the name of the Salk insertion line and how can
I order it?
TITL hypothetical protein
A: The sequence-indexed line (or "hit")
that corresponds to your query sequence is called SALK_013537.54.95.x.
The first part of the name, "SALK_013537", corresponds to the ABRC stock
number for this line. The "54.95" is the average of the sequence quality scores over the best continuous 20 bases. The "x" means that vector sequence was found and removed.
An "n" instead of "x" means that no vector sequence was found. An "f" means that the hit is
the best match in the genome but the sequence
quality is low, so it may not be correct. Use at your own risk!
The approximate location of the insertion site is
indicated (chromosome 1-position 127,748 bases) along with any associated
gene annotation. In this example, the TDNA insertion is found within a
predicted exon in this hypothetical protein. Investigators should confirm
these results by PCR amplification of plant genomic DNA from Salk_013537
using a left-border primer and genomic sequences that correspond to the
plant flanking DNA displayed within the TDNAExpress database.
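The naming scheme described in this answer can be unpacked mechanically. The following is a minimal Python sketch, assuming every hit name follows the "stock.score.flag" pattern shown above; the function name and the dictionary keys are illustrative and are not part of any SIGnAL tool.

```python
# Meaning of the trailing letter, as described in the answer above.
FLAG_MEANINGS = {
    "x": "vector sequence found and removed",
    "n": "no vector sequence found",
    "f": "best genome match, but low sequence quality",
}


def parse_hit_name(hit_name):
    """Split a T-DNA Express hit name such as 'SALK_013537.54.95.x'.

    Assumes the layout <stock number>.<quality score>.<flag>, where the
    quality score itself contains a decimal point.
    """
    stock, rest = hit_name.split(".", 1)        # 'SALK_013537', '54.95.x'
    score_text, flag = rest.rsplit(".", 1)      # '54.95', 'x'
    return {
        "stock": stock,                         # name to use when ordering from ABRC
        "quality": float(score_text),           # average score over the best 20 bases
        "flag": flag,
        "flag_meaning": FLAG_MEANINGS.get(flag, "unknown flag"),
    }


if __name__ == "__main__":
    print(parse_hit_name("SALK_013537.54.95.x"))
```

Only the "stock" part is needed when searching for seed at the stock centers; the score and flag describe the individual sequence read.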
19. Q: Hi! I would like to ask you a couple of things, and for doing so I took one example from the T-DNA Express page:
1) 55.50 is the avg of the 20 best continuous base quality scores. Is
this a good quality value? how would you compare this value with a value,
let's say, of 13.80?
A: We use a program, Phred, to read DNA sequencer trace data. It calls
bases and assigns quality values to the bases. The quality score ranges
from 0 (low quality) to 56 (high quality). 55.50 is definitely a good
score, while 13.80 might not be. Nevertheless, sequences with low quality
may still be true and map to the genome nicely.
2) .x means vector sequence found and removed. Does this mean that some of the
binary vector used for transformation goes along with the T DNA and it comes
after sequencing with LB primer? let's say, in order: LB T-DNA, vector
sequence, genomic sequence?
A: The vector sequence is
either the T-DNA sequence or the adaptor sequence, or both. The order should be:
LB -- T-DNA -- genomic sequence -- adaptor sequence.
3) L300-3' This means that the insertion is located in the last 300 bases
of the gene, at its 3' end. Could be that is actually in the last exon?
or in the 3'UTR? And then to double check i need to use PCR?
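For the quality values discussed in part 1) above, a rough feel for the numbers comes from the standard Phred definition, Q = -10 log10(P), where P is the estimated probability that a base call is wrong. The Python sketch below applies that formula. Note that the score in a hit name is an average over 20 bases, so treating it as a single per-base Q value is a simplifying assumption, and the function name is ours rather than part of Phred or any SIGnAL tool.

```python
def phred_to_error_probability(q):
    """Convert a Phred quality score to an estimated base-call error probability.

    Uses the standard Phred relation Q = -10 * log10(P), i.e. P = 10 ** (-Q / 10).
    """
    return 10 ** (-q / 10)


if __name__ == "__main__":
    # The two scores compared in the question above.
    for score in (55.50, 13.80):
        p = phred_to_error_probability(score)
        print(f"Q = {score:5.2f}  ->  estimated error probability {p:.2e}")
```

Under this reading, a score near 55 corresponds to a very small error probability per base, while a score near 14 corresponds to an error rate of a few percent, which is why the low-scoring read deserves more caution.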
20. Q: How do I acknowledge Salk sequence-indexed
insertion lines in talks or publications?
A: Much of the data from The Salk
Institute Genomic Analysis Laboratory (SIGnAL) that is available from our
web site or other outlets (such as GenBank or TAIR) has not been published
in the traditional sense of peer-reviewed papers. This raises the issue
of the appropriate way to cite this data in your publications. It is our
intent that you be able to fully utilize SIGnAL data in your work and we
are committed to distributing this data widely and rapidly. In return,
we ask that you cite The Salk Institute Genomic Analysis Laboratory as
the source of Salk sequence indexed insertion line data in any publication.
Until such time as these data are
published, we suggest the following acknowledgement: "We thank the Salk
Institute Genomic Analysis Laboratory for providing the sequence-indexed
Arabidopsis T-DNA insertion mutants."
While the ABRC and NASC are distributing
these materials, they are not the primary source of the material. We suggest
that authors separately acknowledge the ABRC or NASC for providing them
with materials developed in our laboratory.
21. Q: Are there any other ways to access the information
in addition to SIGnAL T-DNAExpress page?
A: Yes, there are two ways to access the information.
1) We submit all sequences to GenBank (GSS division,
http://www.ncbi.nlm.nih.gov:80/dbGSS/). The identifier "TDNA" can be
used to locate all entries in a text search.
2) We provide all sequences to TAIR, which provides a BLAST service
with links to the available stocks.
22. Q: What method do you use to recover the plant flanking T-DNA
sequences in the Salk insertion lines?
A: While several PCR methods are available for recovery of insertion
site flanking sequences, we have found the method of Siebert et al.
to work best. Once the flanking sequence has been determined,
investigators can confirm the insertion site using PCR and two
primers, one derived from our flanking sequence and the other from the
T-DNA left border sequence (see previous question for details of LB
primers).
(Siebert, P.D., Chenchik, A., Kellogg, D.E., Lukyanov, K.A. and
Lukyanov, S.A. (1995) An improved PCR method for walking in uncloned
genomic DNA. NAR 23: 1087-1088)
23. Q: Where can I get the pROK vector?
A: The pROK vector is at the ABRC now and ready to request.
Please order the stock through the TAIR web site (www.arabidopsis.org), the stock number is CD3-445.
24. Q: Is the insertion point in the tdnaexpress page the exact insertion site?
A: The exact insertion point may be
located at a distance from the sequence that we are providing due to
overlapping reads of two or more sequences in these T-DNA lines, or other reasons. The
first base we provide is the first high quality base in the sequence
trace and not necessarily the first base at the insertion site. The actual insertion site may be within 0-300 bps from the arrow direction of the point that we are providing.
25. Q: We have been using the SALK lines to obtain WRKY knockout lines. In the
course of our analyses we often have encountered that the LBb1 primer
alone gives a clear 450 bp PCR product even under relatively stringent
PCR conditions (60°). Naturally, we assumed that we had some
contamination problem. However, yesterday a completely independent lab
at our institute, working with another gene family and with their own
primers, told us they observe exactly the same (I saw their gels and I am
certain it is the same product). Have you seen this, or have others
reported this to you?
A: This 450 bp fragment is an artifact of the LBb1 primer and
occurs in wt DNA under certain conditions. You can amplify with the LBa1 primer, but it will require different
conditions for pcr and lies farther in the T-DNA (about 180 bp away from
Another suggestion is to choose another polymerase. We use a hot start
taq polymerase for better specificity. We noticed a highly robust enzyme
such as Takara's ex-taq has presented problems for us when used with LBb1
26. Q: First, there are two SALK_009273 entries: 009273.19.95.x and 009273.56.00.x.
They are both at the same position on At5g65430, but according to T-DNA
Express, the T-DNA insertions are in opposite directions. Does it mean that
I will have to run PCRs in both directions?
A: In many cases the T-DNA insert has two left borders, one at each
end of the insertion site (LB-RB-RB-LB). In this line, it looks like
we have recovered both flanking sequences. You could use either
side for PCR to confirm the insertion.
27. Q: I received T-DNA express SALK-006294.
I don't know what high e-value means. Please let me know how to interpret "Caution: high e-value".
A: For a small percentage of the Salk insertion lines, a high
quality flanking sequence could not be obtained. Rather than withhold
the information (and the associated mutant seed), we
provide the results obtained along with a note of caution: "high
e-value" (a cutoff of 1e-04 is used). It means that there is low similarity of the flanking
sequence to the genome sequence, indicating a possibility that the
insertion may not lie in that gene (and could possibly be due to an
insertion in a gene family member). PCR verification of the
insertion is necessary; see our new primer verification tool.
VALU 7e-07 Caution: high e-value.
TITL unknown protein; protein id: At1g01020.1 [Arabidopsis thaliana]
28. Q: while looking for T-DNA insertions for CODE At1g62830,
I obtained names of different hits with the same clone number. For example:
SALK_048276.55.75.x, SALK_048276.23.95.x and SALK_048276.40.90.x.
A: These are 3 reads from the same mutant with different quality scores.
29. Q: I have used the software in the web page to find suitable RP and LP and it gave me this result:
SALK_029701 PRODUCT_SIZE: 977:
LP: gcagctgcatcaggttcgtct LENGTH: 21 TM: ....
RP: ccccttttcttcgttcgcatc LENGTH: 21 TM: ....
When I take my whole gene sequence, I cannot find the
regions where the primers were designed by the software (I
should find at least one of them). But when I paste the result of the primer
design into the T-DNA Express search, it gives me the right thing.
A: The primer design on the web is based on the
insertion location and the genomic sequence. Therefore, LP or RP is not necessarily within the
gene sequence. In this example, the RP is in the gene's promoter or intergenic region.
30. Q: Is it correct that MAX N should be 500 instead of 300 for design of primers to work with LBa1? I would think that MAX N should be 100 in
this case to preserve the size of the band for HM, which will be 300 + 310 + N = approx. 700. The 310 in this equation is the distance between LBa1 and the left border (110 + 200) if LBa1 lies 200 bp farther than LBb1. Is that right?
A: When using LBa1, the MAX N no longer stands for the distance
between the actual insertion and flanking sequence site. It is the
distance + 200, that is, plus the distance from LBa1 to LBb1.
If HM and using LBa1, the new product size is estimated:
200+110+(N-200)+300 = 410+N ~= 910
LBa1 LBb1 LB FS RP
| | | | |
| 200 | 110 | N-200 | 300 |
If no insertion, the genomic product size is estimated:
300+N+300 = 300+200+(N-200)+300 = 600+N ~= 1100
LP LB FS RP
| | | | |
| 300 | 200 | N-200 | 300 |
| N |
LP -- left primer
RP -- right primer
FS -- flanking sequence start point
LB -- the insertion site (Left Border)
LBa1, LBb1 -- the LB primer
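The two size estimates above can be reproduced with a few lines of arithmetic. The Python sketch below simply encodes the segment lengths quoted in this answer (200, 110 and 300 bp) for the LBa1 case; the constant and function names are illustrative assumptions and do not come from the iSect tool.

```python
# Segment lengths in bp, taken from the answer above.
LBA1_TO_LBB1 = 200   # LBa1 lies about 200 bp farther into the T-DNA than LBb1
LBB1_TO_LB = 110     # typical distance from LBb1 to the left border
PRIMER_FLANK = 300   # distance from LP or RP to the edge of the N window


def insertion_product_size(max_n):
    """Estimated LBa1 + RP product size for a chromosome carrying the insertion."""
    return LBA1_TO_LBB1 + LBB1_TO_LB + (max_n - LBA1_TO_LBB1) + PRIMER_FLANK


def wild_type_product_size(max_n):
    """Estimated LP + RP product size from a chromosome without the insertion."""
    return PRIMER_FLANK + max_n + PRIMER_FLANK


if __name__ == "__main__":
    n = 500  # the MAX N value discussed in the question
    print("insertion band:", insertion_product_size(n), "bp")
    print("wild-type band:", wild_type_product_size(n), "bp")
```

With MAX N = 500 this gives 910 bp for the insertion band and 1100 bp for the wild-type band, matching the estimates in the answer. These are upper-end estimates; actual bands depend on where the insertion really lies.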
31. Q: When I was looking for T-DNA insertions for CODE At3g28290, I obtained
names of different hits with the same clone number, SALK_070418.25.45.x, SALK_070418.35.40.x.
But when I looking for At5g41800 ,I also found SALK_070418.20.10.x. Now I have the seeds of SALK_070418,
but how I know which gene the seeds should belong to? Use PCR or other methods?
A: Some lines were sequenced more than once because their sequencing
quality was poor the first time. A low sequencing score is usually caused by a line having two or
more insertions, or by reaction failure. We usually put all sequences on our web site, provided they have
sufficient hits, in order to give users full information and let them make their own judgment.
We sequenced line SALK_070418 three times because its
sequencing scores were lower than 30 the first and second times. This line might have two
insertion sites, one in chr3 and the other in chr5. You could use http://signal.salk.edu/tdnaprimers.html to generate
primers. The page would return 2 pairs of primers (two of the sequences share one pair). You could use them
with LBa1 or LBb1 to set up two PCR reactions. By combining the results from the two reactions, you could
determine whether the line has two insertions or one, as well as whether it is HM, HZ or WT.
32. Q: Are SAIL lines (formerly GARLIC) now deposited in your database? If so, what is the binary vector sequence, and which LB primers are used for
sequencing? If not, where would I obtain information in that regard?
A: Please click any Sail name and follow the "about Sail" link.
33. Q: I have some SALK T-DNA insertion lines that I am trying to genotype
using PCR. I have been using the isect toolbox website to design
primers. I am a little confused about the distance from the primer
LBb1 to the insertion site. On the isect website it describes the
distance as 110 bp. However, when I look at the pBIN-pROK2 map from
the website it looks like LBb1 locates 216 bp from the left border. I
was just curious where the 110 bp number came from?
A: We sequenced the pROK2 vector, and the locations of the left border and the LBb1
primer shown in the vector map are correct: LBb1 should be 216 bp
from the left border. Experimentally, though, the distance from LBb1 to the
left border is variable. Across the whole T-DNA set,
we found that in the majority of lines LBb1 lies 110 bp from the left border or the flanking sequence.
Score = 58.2 bits (23), Expect = 5e-12
Query: 1 tggttcacgtagtgggccatcg 22 LBa1
Sbjct: 6481 tggttcacgtagtgggccatcg 6458
Score = 55.7 bits (22), Expect = 3e-11
Query: 1 tcaaacaggattttcgcctgct 22 LB6313
Sbjct: 6313 tcaaacaggattttcgcctgct 6292
Score = 55.7 bits (22), Expect = 3e-11
Query: 1 gcgtggaccgcttgctgcaact 22 LBb1
Sbjct: 6280 gcgtggaccgcttgctgcaact 62
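The subject start coordinates in the alignments above give the spacing between the three primers on the vector. The Python sketch below only subtracts those coordinates (6481, 6313 and 6280); it says nothing about the distance to the left border itself, and the dictionary and function names are illustrative.

```python
# 5' start position of each primer on the vector, read from the "Sbjct" lines above.
PRIMER_START = {
    "LBa1": 6481,
    "LB6313": 6313,
    "LBb1": 6280,
}


def primer_spacing(primer_a, primer_b):
    """Distance in bp between the 5' start positions of two primers."""
    return abs(PRIMER_START[primer_a] - PRIMER_START[primer_b])


if __name__ == "__main__":
    print("LBa1 to LBb1:", primer_spacing("LBa1", "LBb1"), "bp")
    print("LB6313 to LBb1:", primer_spacing("LB6313", "LBb1"), "bp")
```

The LBa1 to LBb1 spacing of about 200 bp is the same offset that is added when designing verification primers for use with LBa1 instead of LBb1.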
34. Q: I have used the TDNA verification primer design tool in the past and it has worked great. However it won't design primers for: SAIL_748_E04 and SAIL_748_B04.
Do you know why or have any suggestions?
A: This usually happens because the region(s) picked for primer design are tandem repeat region(s) and are not suitable for primer design. You can see this by checking the "Format" box. In this case, you could change the ext5, ext3 and pZone values to choose suitable regions for primer design. For example, set ext5 = 100, ext3 = 100, pZone = 200. Nevertheless, you might have to pay attention to product sizes if you run the LP+RP and LB+RP products in one well. It does not matter if they are run in separate wells.
SAIL_748_E04 Insertion chr3 4467442 No Primer Found. Please edit parameters and try again.
PRIMER PICKING RESULTS FOR SAIL_748_E04
Using 0-based sequence positions
NO PRIMERS FOUND
SEQUENCE SIZE: 1101
INCLUDED REGION SIZE: 1101
EXCLUDED REGIONS (start, len)*: 101,899
KEYS (in order of precedence):
XXXXXX excluded region
35. Q: Can I share my experimental analysis data for specific lines with other scientists?
A: Yes, you can click [analysis] for a specific line and enter your experimental data to share it with others. For example: http://signal.salk.edu/cgi-bin/tdnaexpress?JOB=TEXT&TDNA=SALK_112098 or http://signal.salk.edu/cgi-bin/tdnainfo?TDNA=SALK_112098.
36. Q: Do all SALK homozygous lines contain only one T-DNA insertion?
A: No. They are only selected for being homozygous for a single insertion; about 1/2 of all lines have more than one insertion. So it is critical to look at two independent insertions when examining phenotypes.
37. Q: I was asking one of our users to give us the accession ID of the mutant allele they had been using. They said it is SAIL_390_C01. However, in the process of retrieving the info, they found out that this line is reported on the SIGnAL site to be located either in intron 2 or exon 4. The information they had published (which had been obtained from Syngenta through a BLAST search) said that the insertion was in intron 6. Moreover, the researcher had sequenced the insertion and found that it was indeed in intron 6. Do you know why it appears at a different location in the SIGnAL DB?
A: Our database shows that the insertion lies in exon 4 - tdnaexpress?LOCATION=23464478&CHROMOSOME=chr3&INTERVAL=1
As to the source of the information, we got all the insertion site sequences directly from Syngenta. The difference is most likely due to the sequences having been trimmed for quality, so that the actual site of insertion may be close to (usually within 0-300 bp of), but not exactly at, the mapped insertion site. This is true of all sequence-indexed insertion lines.
Nevertheless, you could retrieve the FST from our site. If the FST was not trimmed, you could BLAST-align it against the genome and then add the start position of the mapped region to estimate the actual insertion site. Because we lack information on whether the FSTs are trimmed, we do not provide an estimated site.
In your case, the region 504-737 bp of the FST was mapped on the genome; that is, the actual insertion site is at least 500 bp away, downstream in the genome, if the FST was not trimmed and the vector sequence was removed. The insertion site is estimated as 23464585+504 = 23465089, which falls in the 6th intron of the gene
( W/23463915-23464130, 23464212-23464282, 23464381-23464395, 23464512-23464623, 23464712-23464811,
38. Q: Dear SIGnal team, I need three T-DNA lines, but I can't find these three lines anywhere:
Is it possible to get them from you directly?
A: You can order them from ABRC, but use the shorter name for your search. You sent the sequence name, but the line (plant) name is
the see means homozygous. .but you can ask for segregating populations too...
39. Q: I am trying to design primers for a SAIL line. In regards to the
insert primer, you provide 3 primer sequences (LB1-3). How do I determine which one to use for my line?
There is no info on which "C/" group a particular line belongs to. If these were the primers used for
Syngenta's TAIL PCR, should we use LB3 for any SAIL line?
A: The SAIL lines use TWO T-DNAs; however, they both have the same left border region sequences, even though the right borders are different.
Therefore, any of LB1, LB2 or LB3 can be used for ALL SAIL lines. Note, however, that Syngenta used LB3 to sequence all their lines to get their FSTs.
The c/ group does not differ from the w/ group with respect to the LB primers, LP or RP. The iSect Primer tool has already been adjusted to recognize the insertion direction, so LB + RP will always be the combination for the insertion PCR, and LP + RP for the Wild Type.
40. Q: I am using the primers suggested in the web page:
http://signal.salk.edu/tdnaprimers.2.html and I encountered a problem.
There is a description in the web page:
"By using the three primers (LBb1+LP+RP) for SALK lines, users for
WT (Wild Type - no insertion) should get a product of about 900-1100
bps ( from LP to RP ), for HM (Homozygous lines - insertions in both
chromosomes) will get a band of 410+N bps ( from RP to insertion
site 300+N bases, plus 110 bases from LBb1 to the left border of the
vector), and for HZ (Heterozygous lines - one of the pair
chromosomes with insertion) will get both bands."
However, I got 2 bands for the homozygous mutant instead of one when
I used the three primers together. There is always a strange band
with a size of nearly 800 bp in the mutant sample. I wonder if
the primers suggested here have some problem (I am using
LBb1), or whether it is just normal to get three bands of different sizes
in the reaction.
A: On the site you referred to:
there is another primer, LBb1.3, that is suggested as a better primer than
LBb1. LBb1 can sometimes produce an extra band around 800 bp (pretty
similar to what you are describing), which the LBb1.3 primer does not.
In addition to switching to LBb1.3, I would suggest that you don't use
all three primers in a single reaction. This saves an extra reaction, but
can also result in artifact bands. Instead, you might want to run a
LeftPrimer-RightPrimer pair and a RightPrimer-LBb1.3 pair as two
separate reactions on the DNA sample that you are genotyping.
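When the two reactions are run separately as suggested, the genotype call follows from which of them gives a band of the expected size, using the WT / HM / HZ definitions quoted in the question. The Python sketch below is only an illustration of that logic; the function name and the "no call" outcome for a sample with no bands are our assumptions, and real gels still need size checks and controls.

```python
def call_genotype(wild_type_band, insertion_band):
    """Call a genotype from two separate PCR reactions on one DNA sample.

    wild_type_band: True if the LP + RP reaction gave the expected genomic band.
    insertion_band: True if the RP + LB primer reaction gave the expected band.
    """
    if wild_type_band and insertion_band:
        return "HZ"       # heterozygous: one chromosome carries the insertion
    if insertion_band:
        return "HM"       # homozygous: insertion on both chromosomes
    if wild_type_band:
        return "WT"       # wild type: no insertion
    return "no call"      # neither reaction worked; repeat the PCR


if __name__ == "__main__":
    for wt, ins in [(True, False), (False, True), (True, True), (False, False)]:
        print(f"LP+RP band: {wt!s:5}  LB+RP band: {ins!s:5}  ->  {call_genotype(wt, ins)}")
```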
41. Q: I need to ask some questions about the SALK insertion lines.
1. Is the insertion of the T-DNA in only ONE position within the affected
gene or can there be multiple insertions within the affected gene?
2. Each SALK line corresponds to a specific gene, is this correct?
I know that some genes have more than one SALK line associated with them.
3. Is it correct to assume that insertions can be present in other
parts of the genome of a given SALK line besides the affected gene?
A: 1. Only one in that gene,
but there could be others in that same line.
2. Yes, any line can have more than one insertion that is sequenced.
3. Yes, this is why analysis of two independent alleles is required.
42. Q: We have recently used your lab's adapter ligation-mediated PCR
technique to explore several unusual T-DNA insertions, and I had a
few questions about how the protocol would need to be modified in
order to investigate T-DNA right-border junctions. Have you ever
used this technique in this manner (using right-border T-DNA
primers)? Can you recommend any primers which were particularly
successful? If you have not, can you recommend any changes to the
protocol that would be necessary for this to work?
A: The primers we used for the right border rescues were:
JMRB2 "TGATAGTGACCTTAGGCGACTTTTGAACGC" (this is the primer for the
1st PCR)
JMRB1 "GCTCATGATCAGATTGTCGTTTCCCGCCTT" (this is the primer for the
2nd PCR, nested to JMRB2)
We did not get very good results for the rescue of the right
borders. In our experience we used to get tons of nice PCR
products, but MOST of them contained no Arabidopsis DNA (they were
just vector sequences).
After a few hundred sequences, and having tried a few different
restriction enzymes, we decided that it was not worth it for our
This does not mean you should not try. I think you can increase your
chances of getting the RB insertion sites by using not one but
several restriction enzymes (one at a time).