TIsigner FAQ
Detailed guide to optimizing translation initiation
Core Concepts
What is TIsigner and how does it work?
TIsigner (Translation Initiation coding region designer) optimizes gene expression by targeting the rate-limiting step of protein synthesis: translation initiation. It calculates the opening energy $\Delta G_{\mathrm{open}}$ of a segment $[i, j]$, which is defined as $\Delta G_{\mathrm{open}} = -\frac{1}{\beta} \ln \frac{Z_{u}}{Z}$, where $\beta$ is the thermodynamic beta, $Z_{u}$ is the partition function over all possible structures in which $[i, j]$ is unpaired, and $Z$ is the total partition function. The ratio of these two partition functions is also the probability that the region is unpaired. A lower opening energy means the ribosome binding site is more accessible, which typically increases protein yield. By making the mRNA more open, TIsigner helps the ribosome bind more efficiently.
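To make the relationship between opening energy and the unpaired probability concrete, here is a minimal sketch. It is not TIsigner's implementation: the function name `opening_energy` and the default temperature are assumptions, the molar form RT is used in place of 1/β, and the unpaired probability is taken as an input rather than computed by an RNA folding package.

```python
import math

GAS_CONSTANT = 0.0019872  # kcal/(mol*K)


def opening_energy(p_unpaired, temperature_c=37.0):
    """Hypothetical helper: return -RT * ln(P_unpaired) in kcal/mol."""
    if not 0.0 < p_unpaired <= 1.0:
        raise ValueError("p_unpaired must be in the interval (0, 1]")
    rt = GAS_CONSTANT * (temperature_c + 273.15)
    return -rt * math.log(p_unpaired)
```

A region that is always unpaired (probability 1) has an opening energy of zero, and the energy grows as the region becomes less likely to be unpaired.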
Why focus on the specific window around the translation initiation region?
For Escherichia coli, the window -24:24 corresponds to the ribosome footprint and spans 48 nucleotides around the start codon (24 upstream and 24 downstream). Using 11,430 recombinant protein expression experiments in E. coli from DNASU, we found that the opening energy in this specific window is the most significant predictor of protein expression. TIsigner focuses its computation here to maximize impact while minimizing changes to the gene. In datasets for other organisms, we found different regions to be the most important: -7:89 for Saccharomyces cerevisiae and -8:11 for Mus musculus. The appropriate region is selected automatically when you select the host. Please see our paper DOI:10.1371/journal.pcbi.1009461 for more details. For Other hosts, we set the region to -24:89, which covers all of the above cases. In every case, users can also enter their own region of interest.
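The host-specific windows can be pictured as slices across the junction between the 5′ UTR and the coding sequence. The sketch below is illustrative only: the names `HOST_WINDOWS` and `extract_window` are hypothetical, and it assumes that a window `start:end` means the last `|start|` nucleotides of the UTR followed by the first `end` nucleotides of the coding sequence. TIsigner's exact coordinate convention may differ.

```python
# Default windows (start, end) relative to the start codon, as listed in this FAQ.
HOST_WINDOWS = {
    "Escherichia coli": (-24, 24),
    "Saccharomyces cerevisiae": (-7, 89),
    "Mus musculus": (-8, 11),
    "Other": (-24, 89),
}


def extract_window(utr, cds, host="Escherichia coli", window=None):
    """Hypothetical helper: return the region of interest as one string."""
    start, end = window if window is not None else HOST_WINDOWS[host]
    if not (start <= 0 < end):
        raise ValueError("this sketch expects start <= 0 < end")
    upstream = -start
    if len(utr) < upstream or len(cds) < end:
        raise ValueError("sequence too short for the selected window")
    # Last `upstream` nucleotides of the UTR, then first `end` of the CDS.
    return utr[len(utr) - upstream:] + cds[:end]
```

Passing `window=(-10, 30)` overrides the host default, mirroring the option to enter a custom region.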
What are Synonymous Codons?
TIsigner uses 'silent' mutations. It changes the DNA sequence to improve mRNA folding properties without altering the amino acid sequence of the protein. Your protein remains identical; only its expression efficiency changes. Check the Translation Verified button in the results to confirm your sequence's integrity.
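The idea behind the translation check can be sketched in a few lines: translate both sequences with the standard genetic code and compare the proteins. The function names `translate` and `is_synonymous` are hypothetical and this is not the code behind the Translation Verified button.

```python
from itertools import product

# Standard genetic code, with codons enumerated in TCAG order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna):
    """Translate a coding sequence; '*' marks a stop codon."""
    dna = dna.upper().replace("U", "T")
    if len(dna) % 3:
        raise ValueError("sequence length must be a multiple of 3")
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna), 3))


def is_synonymous(original, optimized):
    """True if both sequences encode the same protein."""
    return translate(original) == translate(optimized)
```

For example, `is_synonymous("ATGGCTAAA", "ATGGCCAAG")` compares two sequences that both encode Met-Ala-Lys.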
What is SoDoPE analysis?
SoDoPE (Soluble Domain for Protein Expression) predicts protein solubility using the Solubility-Weighted Index (SWI). It helps estimate whether your protein will remain soluble in E. coli or aggregate into inclusion bodies. For more details, please see our paper DOI:10.1093/bioinformatics/btaa578 or the SoDoPE FAQ page.
What data were used?
The training data for SWI are curated by DNASU and can be downloaded here. The data consist of sequences expressed from the pET21 vector, with expression labelled as 1 (success) or 0 (failure). The promoter is T7 lac: GGGGAATTGTGAGCGGATAACAATTCCCCTCTAGAAATAATTTTGTTTAACTTTAAGAAGGAGATATACAT.
How do I cite the tools?
Please visit our general FAQ page using the button below for the citation details.
Technical Parameters
What is expression score?
The expression score is a rescaled value (10 to 100) derived from a logistic regression on the opening energy distribution obtained from PSI:Biology (8,780 'success' and 2,650 'failure' experiments using an E. coli T7 lac promoter system). A score of 90 is the recommended default. Note that an expression score of 100 does not guarantee 100% success, as extreme overexpression can sometimes lead to cell toxicity or aggregation. This score is currently calibrated only for the E. coli T7 lac promoter system. For other systems, you can either minimize or maximize expression. The graph below shows the relationship between the expression score and the posterior probability of success, conditional on the PSI:Biology data, together with its 95% confidence interval. The dashed/faded regions are extrapolated values, because opening energies in the PSI:Biology experiments range from ~3 to 30 kcal/mol.
Posterior Probability & Expression Score
Why is a 5′ UTR (promoter) sequence required?
The 5′ UTR (the sequence preceding the start codon) significantly affects mRNA folding around the initiation site. For accurate calculations, TIsigner includes this sequence. If using a custom vector, provide at least 71 nucleotides of the UTR to ensure the computational window is fully covered. Shorter UTRs might cause an error as we may not be able to compute the opening energy in the regions of interest.
What is substitution type?
- Translation Initiation Region: Optimizes only the first few codons. This is often sufficient and allows for low-cost optimization via nested PCR using a forward primer derived from the optimized sequence.
- Full-length: Optimizes the entire gene. This is useful for removing internal repeats, unwanted restriction sites, or internal Shine-Dalgarno sequences.
How should I use Restriction Site constraints?
If you plan to clone your optimized gene using restriction enzymes, add those enzymes to the 'Avoid' list. TIsigner will ensure the optimized sequence does not contain those specific motifs, checking both the forward and reverse strands. We also preselect some restriction modification sites (RMS) based on the host. For E. coli, we preselect RMS for universal Type IIS/Golden Gate Assembly (BsaI, BsmBI, BbsI) and the classic BioBricks enzymes (EcoRI, PstI, XbaI, SpeI). For S. cerevisiae, we preselect the three universal enzymes and the MoClo/YTK standard (SapI). For M. musculus and Other hosts, we preselect only the three universal enzymes.
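Checking both strands amounts to searching for each recognition site and for its reverse complement. The sketch below illustrates that idea for the enzymes named above; the names `RECOGNITION_SITES` and `find_sites` are hypothetical and this is not TIsigner's own code.

```python
# Recognition sequences (5'->3') for the enzymes mentioned in this FAQ.
RECOGNITION_SITES = {
    "BsaI": "GGTCTC", "BsmBI": "CGTCTC", "BbsI": "GAAGAC",
    "EcoRI": "GAATTC", "PstI": "CTGCAG", "XbaI": "TCTAGA",
    "SpeI": "ACTAGT", "SapI": "GCTCTTC",
}

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    return seq.translate(_COMPLEMENT)[::-1]


def find_sites(seq, enzymes):
    """Return {enzyme: [(position, strand), ...]} for every site found."""
    seq = seq.upper()
    hits = {}
    for enzyme in enzymes:
        site = RECOGNITION_SITES[enzyme]
        patterns = {"+": site}
        rc = reverse_complement(site)
        if rc != site:  # palindromic sites need only one search
            patterns["-"] = rc
        found = []
        for strand, pattern in patterns.items():
            pos = seq.find(pattern)
            while pos != -1:
                found.append((pos, strand))
                pos = seq.find(pattern, pos + 1)
        if found:
            hits[enzyme] = sorted(found)
    return hits
```

Type IIS sites such as BsaI are not palindromic, so a match on the reverse strand appears in the forward sequence as the reverse complement (GAGACC for BsaI).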
What is the 'Quick' optimization strategy?
- Quick: A fast heuristic (simulated annealing) that finds high-quality solutions in seconds.
- Deep: An intensive iterative process designed to satisfy multiple complex constraints (like removing both terminators and restriction sites). Due to the nature of constrained optimization, if a perfect solution isn't found, you may need to relax some constraints or try a different Random Seed.
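For readers unfamiliar with simulated annealing, the toy sketch below shows the general technique applied to synonymous codon choice. It is not TIsigner's algorithm. In particular, `toy_cost` is a stand-in objective (GC fraction) chosen only so that the example runs without an RNA folding library; the real objective is the opening energy. All names, the cooling schedule, and the parameter defaults are assumptions.

```python
import math
import random
from itertools import product

BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}
SYNONYMS = {}
for _codon, _aa in CODON_TABLE.items():
    SYNONYMS.setdefault(_aa, []).append(_codon)


def toy_cost(seq):
    """Stand-in objective (GC fraction); NOT the opening energy."""
    return (seq.count("G") + seq.count("C")) / len(seq)


def anneal(cds, n_codons=9, steps=500, t_start=1.0, t_end=0.01, seed=0, cost=toy_cost):
    """Minimize `cost` by swapping synonymous codons among the first n_codons."""
    rng = random.Random(seed)
    current = [cds[i:i + 3] for i in range(0, len(cds), 3)]
    n = min(n_codons, len(current))
    if n < 2:
        raise ValueError("need at least two codons")
    current_cost = cost("".join(current))
    best, best_cost = list(current), current_cost
    for step in range(steps):
        # Geometric cooling from t_start down to t_end.
        temp = t_start * (t_end / t_start) ** (step / max(steps - 1, 1))
        pos = rng.randrange(1, n)  # index 0 (start codon) stays fixed
        candidate = list(current)
        candidate[pos] = rng.choice(SYNONYMS[CODON_TABLE[current[pos]]])
        cand_cost = cost("".join(candidate))
        delta = cand_cost - current_cost
        # Metropolis rule: always accept improvements, sometimes accept worse moves.
        if delta <= 0 or rng.random() < math.exp(-delta / temp):
            current, current_cost = candidate, cand_cost
            if current_cost < best_cost:
                best, best_cost = list(current), current_cost
    return "".join(best), best_cost
```

The input is expected to be an uppercase DNA coding sequence whose length is a multiple of three. Changing `seed` changes the search path, which is the same reason a different Random Seed can yield a different solution.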
Where is Codon Adaptation Index and other similar metrics?
Metrics like the Codon Adaptation Index (CAI) often have low predictive power (AUC ≈ 0.50) for protein yield. Instead, TIsigner focuses on opening energy, which has the highest predictive power (AUC ≈ 0.70). We do, however, show the overall GC content and the local GC content in a 19 bp sliding window to help ensure your gene is easy to synthesize and express. Local GC below 30% (unstable/low melting point) or above 70% (prone to PCR failure) falls outside the warning limits shown on the local GC comparison plot.
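The local GC calculation is easy to reproduce. The following sketch uses the 19 bp window and the 30%/70% limits stated above; the function names are hypothetical and the plot itself is not reproduced.

```python
def gc_fraction(seq):
    """Fraction of G and C in a sequence."""
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq)


def local_gc(seq, window=19):
    """GC fraction of every window of the given size, one value per start position."""
    if len(seq) < window:
        raise ValueError("sequence is shorter than the window")
    return [gc_fraction(seq[i:i + window]) for i in range(len(seq) - window + 1)]


def gc_warnings(seq, window=19, low=0.30, high=0.70):
    """Return (start, gc) for windows outside the [low, high] range."""
    return [
        (start, gc)
        for start, gc in enumerate(local_gc(seq, window))
        if gc < low or gc > high
    ]
```

An empty list from `gc_warnings` means every 19 bp window lies within the limits.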
Feature Importance (AUC)
Troubleshooting & Privacy
Why did I get a 'Job Not Found' error?
To protect user privacy and manage server load, job results are automatically deleted after 7 days. If you need to keep your results longer, please download the optimized sequences in FASTA or CSV format or take a screenshot of the analysis.
What does a 'NaN' value in the results mean?
This usually occurs if the input sequence is too short for the selected window or if the sequence contains non-standard characters. Ensure your input is a valid DNA/RNA sequence and that the optimization window is within the sequence bounds.
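Both causes can be checked before submitting a job. The sketch below is a hypothetical pre-flight check, not part of TIsigner: the name `check_input` and the message wording are assumptions, and the window is interpreted as nucleotides upstream and downstream of the start codon.

```python
VALID_CHARACTERS = set("ACGTU")


def check_input(utr, cds, window=(-24, 24)):
    """Return a list of problems; an empty list means the input looks usable."""
    problems = []
    for name, seq in (("UTR", utr), ("CDS", cds)):
        bad = sorted(set(seq.upper()) - VALID_CHARACTERS)
        if bad:
            problems.append(f"{name} contains non-standard characters: {''.join(bad)}")
    start, end = window
    if len(utr) < -start:
        problems.append(f"UTR is shorter than the {-start} nt required upstream")
    if len(cds) < end:
        problems.append(f"CDS is shorter than the {end} nt required downstream")
    return problems
```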
What does '3 predicted terminators, 2 restriction sites and 1 synthesis complexity issue (1 Internal RBS (Shine-Dalgarno)) found in the sequence' mean?
During optimization, we also check the input and the variants for terminators, restriction modification sites (RMS), and synthesis complexity issues. Depending on your expression system, you may need to exclude some of the preselected RMS or add your own. For synthesis complexity issues, we check for long homopolymers and for sequences, such as internal Shine-Dalgarno sequences, that make DNA synthesis or expression difficult for many commercial vendors. We also check for terminators using RMfam. All of these features, if found, are displayed on the sequence topology plot. Depending on your expression system, some of these issues may not matter; for example, a checked RMS may not be applicable to your cloning system.
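Two of the synthesis complexity checks can be illustrated with simple pattern searches. This is a sketch under stated assumptions, not TIsigner's detection logic: the run-length threshold `min_run=8` is hypothetical, and the Shine-Dalgarno search uses only the consensus core AGGAGG, whereas a real check may allow shorter or degenerate matches.

```python
import re


def find_homopolymers(seq, min_run=8):
    """Return (start, run) for single-base runs of at least min_run (assumed threshold)."""
    pattern = re.compile(r"([ACGT])\1{%d,}" % (min_run - 1))
    return [(m.start(), m.group()) for m in pattern.finditer(seq.upper())]


def find_internal_sd(cds, motif="AGGAGG"):
    """Return start positions of the Shine-Dalgarno consensus core inside a CDS."""
    seq = cds.upper()
    hits = []
    pos = seq.find(motif)
    while pos != -1:
        hits.append(pos)
        pos = seq.find(motif, pos + 1)
    return hits
```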
What does 'Optimized variants were found but contained remaining constraints which were filtered out in Deep mode' mean?
In Deep mode, we discard any variant that fails any of your constraints (e.g., a variant that has a high score but still contains a forbidden BsaI site). If no variants appear, please try the Quick mode or remove non-essential constraints such as some restriction sites to broaden the search space.
I am getting unexpected results?
TIsigner saves your settings in the browser for convenience. If you are starting a new project, please use the Reset button in Advanced Settings to clear old parameters and return to the defaults.
I still get an error. What should I do?
Please contact us and include the job ID if possible, or a screenshot of the error/results page.
I found a bug 🐛 !
Please contact us or open an issue in our GitHub repo.