Ambiguous Matches


This option allows ambiguous bases in input sequences to match the bases they stand for in recognition patterns. For example, GAATYC in an input sequence will match GAATTC, the recognition pattern of enzymes such as EcoRI.

Occasionally the input sequences may contain ambiguous or unknown bases such as Y, S, D, etc. By default, an ambiguous base in an input sequence can match only a compatible ambiguous base in a recognition pattern, while an ambiguous base in a recognition pattern can match any base of its category in an input sequence. For example, the string GGYRCC in an input sequence will be recognized by enzymes such as BshNI, BanI, etc., whose recognition pattern (GGYRCC) has compatible ambiguous bases at the corresponding positions. The same string, however, is not allowed to match the recognition pattern GGCGCC of enzymes such as AccB1I: Y means either C or T and R means either A or G, but those enzymes recognize only C at the Y position and only G at the R position. Sometimes such matches are desirable, for example when searching for possibly missed restriction sites in new sequence data that contain ambiguous bases.
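
The two matching modes described above can be sketched in a few lines of Python. This is only an illustration, not the program's actual implementation: the names IUPAC, base_matches and find_sites are hypothetical, and the default mode is modelled on the assumption that "compatible" means every base the sequence symbol can stand for is accepted by the pattern symbol. With the option set, a single shared base is assumed to be enough.

```python
# Standard IUPAC nucleotide codes and the bases each one stands for.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def base_matches(seq_base, pat_base, ambiguous=False):
    """Return True if a sequence base may match a pattern base.

    ambiguous=False (assumed default behaviour): every base the sequence
    symbol can stand for must be allowed by the pattern symbol.
    ambiguous=True (assumed option behaviour): one shared base is enough.
    """
    seq_set = set(IUPAC[seq_base])
    pat_set = set(IUPAC[pat_base])
    if ambiguous:
        return bool(seq_set & pat_set)
    return seq_set <= pat_set


def find_sites(sequence, pattern, ambiguous=False):
    """Return the 0-based start positions where pattern matches sequence."""
    n = len(pattern)
    return [
        i
        for i in range(len(sequence) - n + 1)
        if all(
            base_matches(s, p, ambiguous)
            for s, p in zip(sequence[i:i + n], pattern)
        )
    ]


# GGYRCC in the sequence matches the pattern GGYRCC in either mode,
# but matches GGCGCC only when ambiguous matching is switched on.
default_same = find_sites("GGYRCC", "GGYRCC")
default_strict = find_sites("GGYRCC", "GGCGCC")
option_strict = find_sites("GGYRCC", "GGCGCC", ambiguous=True)
option_ecori = find_sites("AAGAATYCAA", "GAATTC", ambiguous=True)
```

Here default_same and option_strict each contain one hit, while default_strict is empty, which mirrors the GGYRCC example in the text.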

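Note that if you set this option and your sequence contains a run of consecutive Ns (N stands for any base), most or all enzymes in the database may be matched, because such a run is compatible with any recognition pattern of the same or shorter length.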
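
The effect of a run of Ns can be seen in a small sketch. It assumes that, with the option set, a sequence base matches a pattern base whenever the two share at least one possible base; the helper names and the three-enzyme table are hypothetical stand-ins for the real enzyme database.

```python
# Standard IUPAC nucleotide codes and the bases each one stands for.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def overlaps(seq_base, pat_base):
    """Assumed option behaviour: the two symbols share at least one base."""
    return bool(set(IUPAC[seq_base]) & set(IUPAC[pat_base]))


def matches_anywhere(sequence, pattern):
    """Return True if pattern matches at any position in sequence."""
    n = len(pattern)
    return any(
        all(overlaps(s, p) for s, p in zip(sequence[i:i + n], pattern))
        for i in range(len(sequence) - n + 1)
    )


# A tiny illustrative enzyme table (not the program's database).
patterns = {"EcoRI": "GAATTC", "BamHI": "GGATCC", "BanI": "GGYRCC"}

# Six consecutive Ns are compatible with every 6-base pattern.
sequence = "ACGT" + "N" * 6 + "ACGT"
hits = [name for name, pat in patterns.items() if matches_anywhere(sequence, pat)]

# The same flanking bases without the N run match none of the patterns.
hits_without_n = [
    name for name, pat in patterns.items() if matches_anywhere("ACGTACGT", pat)
]
```

In this sketch every enzyme in the table appears in hits, while hits_without_n is empty.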


Webgenetics™ Copyright (c) 1996-2014 TQiSoft. All rights reserved.