Instructions and Help
This document contains information about search options implemented in WebFR3D. A separate tutorial describes how to set up WebFR3D searches. If you can’t find an answer to your question, please contact us.
Changes in 2021, most recent first
- Distance constraints may be separated by semicolons or spaces, for example, "=1;>" for sequentially adjacent nucleotides in increasing nucleotide order. Exact distance constraints are specified by separating possible distances by commas, for example, "=4,5" for distance of 4 or 5 in either direction, or "=4,5 >" for distance 4 or 5 and in increasing nucleotide order.
- Type "syn" or "anti" or "intermediate syn" in the white diagonal boxes for those glycosidic bond orientations. "is" can be used as an abbreviation for "intermediate syn". These annotations are only shown in the results when you ask for them. You can force them to appear by typing "anti syn is" in one of the white diagonal boxes. That does not actually impose any constraint at all, but forces WebFR3D to load those interactions. Hopefully in the future we'll have a less clumsy way of doing that.
- Note that constraints on basepairs (like tSH) and base stacking (like s35) need to go in the yellow boxes. They can be written in the opposite order (like tHS and s53) so that you can find an appropriate box to put them in. On the other hand, base-backbone interactions (like BPh and BR) cannot be written in opposite order, so you can specify those in either the yellow boxes or blue boxes.
- When giving multiple interaction constraints in the yellow boxes, they are interpreted as having logical "or" between them, unless the keyword "and" is used. Use that between groups of different types of mutually exclusive constraints.
- cWW cWH cWS ... will be interpreted as cWW or cWH or cWS, which makes sense because these are mutually exclusive
- tHH BPh ... will be interpreted as tHH or BPh
- tHH and BPh ... will be interpreted as tHH and BPh
- Nucleotides and amino acids are listed using unit ids, which are explained on the page about unit ids.
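The "or"/"and" rule for the yellow boxes described in the list above can be sketched in a few lines of Python. This is only an illustration of the stated rule, not WebFR3D's actual code: the function names `parse_interaction_box` and `pair_matches` are hypothetical, and the sketch assumes that a pair's annotations are available as a set of exact labels (it does not expand general labels such as BPh into specific categories).

```python
def parse_interaction_box(text):
    """Split a yellow-box entry into 'and' groups of alternative labels.

    Labels inside one group are alternatives ("or"); the keyword "and"
    starts a new group. Hypothetical helper, for illustration only.
    """
    groups = [[]]
    for token in text.split():
        if token.lower() == "and":
            groups.append([])
        else:
            groups[-1].append(token)
    # Drop empty groups caused by a leading, trailing, or doubled "and".
    return [group for group in groups if group]


def pair_matches(text, annotations):
    """Return True if a pair with the given set of labels satisfies the entry.

    Every "and" group must be satisfied, and a group is satisfied when at
    least one of its labels is present. An empty entry imposes no constraint.
    """
    groups = parse_interaction_box(text)
    return all(any(label in annotations for label in group) for group in groups)
```

For example, under these assumptions `pair_matches("tHH and BPh", {"tHH"})` is false because the BPh group is not satisfied, while `pair_matches("tHH BPh", {"tHH"})` is true.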
This diagram summarizes most of the available search options that can be entered in the Query Specification Matrix on the search webpages. Detailed descriptions can be found in the text below.
Sequential Distance Constraints
Set limits on the difference between nucleotide numbers using the boxes below the diagonal. (Actually, what is used is the difference between the index of nucleotides in the file, not NDB nucleotide number.)
- To put an upper limit on the difference, type something like <5 or <=5.
- To put a lower limit on the difference, type something like >5 or >=5.
- To put both limits at once, type something like >5 <=12.
- To insist that the nucleotide in the given row have a lower nucleotide number than the nucleotide in the given column, type <, separated by a space from other specifications. For greater, type >.
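As a rough illustration of how these entries could be read, the following Python sketch checks one below-diagonal box against a pair of nucleotide indices. It is one interpretation of the rules described in this document, not WebFR3D's implementation: the function name `satisfies_distance` is hypothetical, numeric limits are assumed to apply to the absolute difference of file indices, and a bare `<` or `>` is assumed to compare the row nucleotide with the column nucleotide.

```python
import re

# Operator followed by an optional value, e.g. "<=12", "=4,5", or a bare ">".
_TOKEN = re.compile(r"^(<=|>=|<|>|=)(.*)$")


def satisfies_distance(spec, row_index, col_index):
    """Check a sequential-distance entry against two nucleotide indices.

    Tokens may be separated by semicolons or spaces; all must hold.
    """
    diff = abs(row_index - col_index)
    for token in re.split(r"[;\s]+", spec.strip()):
        if not token:
            continue
        match = _TOKEN.match(token)
        if match is None:
            raise ValueError(f"unrecognized token: {token!r}")
        op, value = match.groups()
        if value == "":
            # Bare "<" or ">" constrains the order of the two nucleotides.
            if op == "<":
                ok = row_index < col_index
            elif op == ">":
                ok = row_index > col_index
            else:
                raise ValueError(f"operator needs a value: {token!r}")
        elif op == "=":
            # "=4,5" lists the exact distances that are allowed.
            ok = diff in {int(v) for v in value.split(",")}
        else:
            limit = int(value)
            ok = {"<": diff < limit, "<=": diff <= limit,
                  ">": diff > limit, ">=": diff >= limit}[op]
        if not ok:
            return False
    return True
```

With these assumptions, `satisfies_distance("=1;>", 11, 10)` holds for adjacent nucleotides in increasing order, and `satisfies_distance(">5 <=12", 3, 10)` holds because the difference of 7 lies within both limits.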
Basepair, base stacking, or letter pair constraints are specified above the diagonal. To specify that all candidate motifs must have a tWH basepair between the nucleotides corresponding to the first and second nucleotides in the query motif, type tWH in the first row, second column. This means that the nucleotide in the first row must use its Watson-Crick edge, and the nucleotide in the second column must use its Hoogsteen edge. Base phosphate and base ribose constraints can be included either above or below the diagonal.
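Because the first edge letter belongs to the row nucleotide and the second to the column nucleotide, the same basepair viewed from the other nucleotide is written with the two letters swapped (tSH becomes tHS, s35 becomes s53). A minimal Python sketch of that swap follows; the function name `reverse_label` is hypothetical, and the sketch handles only plain three-character basepair and stacking labels, rejecting everything else, including base-backbone labels such as BPh that cannot be written in the opposite order.

```python
def reverse_label(label):
    """Return the label as seen from the other nucleotide of the pair.

    Supports basepairs like "tWH" and stacking like "s35" only.
    Hypothetical helper, for illustration.
    """
    if len(label) == 3:
        kind, first, second = label
        # Basepair: c/t followed by two edges out of W, H, S.
        if kind in "ct" and first in "WHS" and second in "WHS":
            return kind + second + first
        # Stacking: s followed by two faces out of 3, 5.
        if kind == "s" and first in "35" and second in "35":
            return kind + second + first
    raise ValueError(f"label cannot be reversed: {label!r}")
```

Symmetric labels such as cWW or s33 are returned unchanged.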
Valid basepair specifications are: cWW, tWW, cWH, cHW, tWH, tHW, cWS, cSW, tWS, tSW, cHH, tHH, cHS, cSH, tHS, tSH, cSS, tSS. Note, however, that the cSS and tSS interactions are not, in fact, symmetric, because each base can use the sugar edge differently. Following Leontis, Stombaugh, Westhof (NAR 2002), type cSs to specify that the first base has priority, csS for the second, or cSS for either. (Note: this feature is not currently enabled as of October 2021.)
Specifying multiple interactions allows more ways a candidate can satisfy the constraints; for example, typing cWH cHW requires a cis Watson-Crick/Hoogsteen basepair, but either base can use the Watson-Crick edge, and the other uses the Hoogsteen edge.
The abbreviation trans matches all trans basepair categories, and cis matches all cis categories.
Type bif for bifurcated basepairs (see NAR 2002).
Type ~cWW to exclude candidates having a cWW basepair.
Some pairs of bases are close to, say, cWW, but do not meet the strict criteria for membership in the cWW classification. Type ncWW (“near cWW”) to get basepairs that are not classified into any category, but for which the cWW category is the closest match, up to a certain fairly generous limit. Type cWW ncWW to get both cWW and near cWW pairs. Type ntrans to get all pairs nearest to a trans pair.
Type s35 for stacking in which the first base uses its 3 face, and the second base uses its 5 face. Similarly, type s53, s33, or s55. Type stack to allow all stacking interactions. The prefixes “n” and “~” work with stacking, as above.
To specify that the nucleotides must match a certain pattern, type, for example, cWW CG GC to get only CG or GC cWW pairs.
To require that two nucleotides make a base-phosphate interaction, enter BPh in the corresponding yellow box. This will select pairs of nucleotides in which the first nucleotide’s base is a hydrogen bond donor and the second nucleotide’s phosphate is an acceptor. To specify particular base-phosphate categories, type 0BPh, 1BPh, 2BPh, ..., 9BPh. For near base-phosphate interactions, type nBPh, n1BPh, etc. See the original paper about classification of base-phosphate interactions for more information.
One can restrict to pairs that play a certain role in the secondary and tertiary structure. For pairs that are nested, type “N” or “nested”. For pairs that cross nested interactions but involve nucleotides in the same branch of the RNA, type local or “L”. For long-range or distant interactions, between different branches of the RNA, type long-range, distant, “D”, or “LR”. Note that “nested”, “local”, and “distant” are mutually exclusive. They can be negated with ~, but ~local only returns distant interactions, not nested ones.
Currently not enabled: To find bases which are in the same plane and are close enough that they may hydrogen bond in some way, type coplanar or cp. Near and not coplanar can be obtained with the "n" and "~" prefixes, respectively.
To specify bases that participate in cWW pairs and that delimit a single-stranded region such as a hairpin loop or one strand in an internal or junction loop, type "bSS" or "borderSS" or "flankss" or "flank". Note: for internal and junction loops, flanking nucleotides will be on the same strand, one on each side of the loop. Such flanking nucleotides usually do not interact with one another. In a hairpin, however, the nucleotides in the closing basepair simultaneously make a cWW pair and satisfy the borderSS relation.
Nucleotide identity constraints
To impose a nucleotide identity constraint (nucleotide mask), type the allowed nucleotides in the white text boxes on the diagonal of the Interaction Matrix. Typing A, for instance, means that only candidate motifs with an A in the corresponding position will be kept. Typing AG allows either A or G, etc.
The program uses these standard abbreviations for other combinations:
- M for A or C
- R for A or G
- W for A or U
- S for C or G
- Y for C or U
- K for G or U
- V for A, C, or G
- H for A, C, or U
- D for A, G, or U
- B for C, G, or U
- N for A, C, G, or U
Note that N is the default. One may also exclude a given base using the syntax ~G, for instance, to exclude candidates with a G in the corresponding position.
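The abbreviations above can be expanded mechanically. The following Python sketch shows one way to turn a diagonal-box entry into the set of allowed bases. The function name `allowed_bases` is hypothetical, and two details are assumptions rather than documented behavior: a leading ~ is taken to negate the whole entry, and an empty entry is treated as N.

```python
# Standard abbreviations listed above, plus the four single bases.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "U": "U",
    "M": "AC", "R": "AG", "W": "AU", "S": "CG", "Y": "CU", "K": "GU",
    "V": "ACG", "H": "ACU", "D": "AGU", "B": "CGU", "N": "ACGU",
}


def allowed_bases(mask):
    """Expand a nucleotide mask such as "AG", "R", or "~G" into a set of bases."""
    mask = mask.strip().upper()
    if not mask:
        # Assumption: an empty box means the default, N.
        return set("ACGU")
    negate = mask.startswith("~")
    if negate:
        mask = mask[1:]
    bases = set()
    for code in mask:
        bases.update(IUPAC[code])  # raises KeyError for an unknown letter
    return set("ACGU") - bases if negate else bases
```

Under these assumptions, `allowed_bases("R")` and `allowed_bases("AG")` both give A and G, and `allowed_bases("~G")` gives A, C, and U.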
Nucleotides must be separated by commas in the input page. Use unit ids as described on the unit id page.
RNA-containing PDB files
The list of PDB files is updated weekly (on Wednesdays) to include all available RNA-containing PDB files. WebFR3D also includes several representative sets of PDB files at various resolutions. More information about the representative sets can be found on the representative set website and in the NAR 2009 paper.
Geometric discrepancy is a measure of how similar RNA structures are. Higher geometric discrepancy corresponds to more dissimilar structures. Identical structures have discrepancy zero. Searches with high geometric discrepancy cutoffs take significantly longer than those with lower cutoffs.
Geometric discrepancy is an entirely geometric measure that takes into account the general shape of the candidate motif and the orientations of its bases. First, we determine the shift vector and rotation matrix which map the geometric centers of the bases of each candidate motif onto the corresponding base centers in the query motif with the smallest error, called the fitting error. After the rigid body operations are performed, we compute the angles of rotation needed to align each base of the candidate with the corresponding base of the query motif. The square root of the sum of the squares (RMS sum) of these angles (in radians) is called the orientation error. The geometric discrepancy is defined to be the RMS sum of the fitting and orientation errors, divided by the number of bases in the query motif.
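The final combination step described above can be written out as a short Python sketch. It assumes the fitting error has already been obtained from the superposition and that one rotation angle per base (in radians) is available. The function names are hypothetical, and the superposition itself is not shown.

```python
import math


def orientation_error(angles):
    """Square root of the sum of squared base rotation angles (radians)."""
    return math.sqrt(sum(a * a for a in angles))


def geometric_discrepancy(fitting_error, angles):
    """Combine fitting and orientation errors as described in the text.

    The two errors are combined as the square root of the sum of their
    squares, then divided by the number of bases in the query motif
    (taken here to be the number of angles supplied).
    """
    num_bases = len(angles)
    if num_bases == 0:
        raise ValueError("at least one base is required")
    return math.hypot(fitting_error, orientation_error(angles)) / num_bases
```

As a sanity check on the definition, a fitting error of zero together with all-zero angles gives a discrepancy of zero, matching the statement that identical structures have discrepancy zero.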
For more information about geometric discrepancy, please see the original FR3D paper.
You can optionally specify your email to receive a notification once your search has completed.