Reporter gene assays guide — Part 6: Choosing the right vector
Choosing the Right Reporter Vector
The vector that carries your reporter is not just a delivery vehicle. It is the foundation that determines how your data is normalised, how reliably your signal reflects biology, and how flexible your experiment is when you inevitably need to change something six months in. Most published reporter assays use off-the-shelf vectors that have been refined over decades, and the differences between them are real and worth understanding.
The standard architecture
A typical mammalian reporter vector has six elements: a bacterial origin of replication and an antibiotic resistance marker for propagation in E. coli, a mammalian selection marker (optional, and often a separate expression cassette on the same plasmid), the multiple cloning site (MCS) or Gateway cassette where you put your promoter of interest, the reporter gene, and a polyadenylation signal. The last three always sit in that order; the position of the other elements varies between backbones. The arrangement matters more than people often realise.
Promoter-reporter orientation. The standard arrangement is 5' to 3' promoter, then reporter, then polyA. This gives a single mRNA with proper 3' end formation. Reverse-orientation reporters (reporter in the antisense direction) are occasionally used as background controls but are not the default for most applications.
PolyA signal. A strong polyA signal (SV40 late, bovine growth hormone, rabbit β-globin) is essential for nuclear export, stability, and translation of the mRNA. Weaker polyA signals give lower expression, which is sometimes desirable for studying strong promoters at near-endogenous levels but is rarely the deliberate choice.
Bacterial elements. The origin of replication determines plasmid copy number. High-copy origins (pUC-based) give you roughly 500 to 700 copies per E. coli cell, which is great for DNA yield but can promote recombination between repetitive elements. Low-copy origins (p15A, pSC101) are used when the plasmid contains sequences that are unstable at high copy number, which occasionally comes up with certain viral promoters or repetitive enhancer elements.
Promoterless versus promoter-containing backbones
Most modern vectors are "promoterless", with an MCS upstream of the reporter and no promoter in place. You clone your promoter of interest into the MCS, and the reporter expression reflects that promoter's activity. This is the right design for studying a specific promoter.
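Some vectors come with a "minimal promoter" already in place next to the MCS, typically a TATA box or a synthetic core promoter. This is useful when you are testing enhancer activity rather than a full promoter. You clone your putative enhancer upstream of the minimal promoter, and the reporter tells you how much the enhancer boosts basal transcription. This is the standard format for most enhancer-bashing experiments. STARR-seq uses a variant of it: the candidate sequence sits downstream of the minimal promoter, inside the transcribed region, so that an active enhancer drives its own transcription and can be quantified by sequencing.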
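The enhancer readout described above is usually reported as fold activation over the minimal promoter alone. A minimal sketch of that arithmetic follows; the function name `fold_over_minimal` and the replicate values are hypothetical, chosen only for illustration.

```python
from statistics import mean


def fold_over_minimal(enhancer_signal, minimal_signal):
    """Mean signal with the enhancer divided by mean signal of the minimal promoter alone."""
    baseline = mean(minimal_signal)
    if baseline <= 0:
        raise ValueError("minimal-promoter baseline must be positive")
    return mean(enhancer_signal) / baseline


# Hypothetical replicate readings (relative light units), for illustration only.
minimal_only = [100.0, 120.0, 110.0]
with_enhancer = [880.0, 900.0, 860.0]

fold = fold_over_minimal(with_enhancer, minimal_only)
print(f"{fold:.1f}-fold over the minimal promoter")
```

In practice each reading would first be normalised to a transfection control before the fold change is taken.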
A third option is the "promoter-containing" backbone with a specific promoter already driving the reporter, for example a CMV-firefly construct. These are useful as positive controls and as normalisation standards, but they are not the right tool for studying promoter activity in the standard sense.
Selection markers and how to use them
Selection markers are genes that allow you to enrich for cells that have taken up the plasmid. They fall into two categories:
Drug-selectable markers (geneticin/G418, puromycin, hygromycin, blasticidin) work by killing cells that do not carry the resistance gene. You add the drug 24 to 72 hours after transfection, wait for the untransfected cells to die, and what remains is a population that has stably integrated your plasmid. This gives you a stable cell line but takes 2 to 4 weeks of selection and expansion.
Fluorescent or metabolic markers work differently. Fluorescent proteins (GFP, mCherry, and others) let you sort live cells that carry the plasmid. Enzyme-based markers such as thymidine kinase or hypoxanthine-guanine phosphoribosyltransferase are used in specific metabolic selection systems such as HAT selection.
For most reporter assays, you have two choices: transient transfection (no selection, read 24 to 48 hours after transfection) or stable integration (select, expand, validate, freeze down a cell line). Transient is faster and works for most promoter-bashing work. Stable is essential when you need a consistent, reproducible cellular background, for example when testing many compounds or doing time courses that extend beyond a single passage.
Bicistronic and IRES constructs
A common design challenge is normalising your reporter to a transfection control. The classical solution is co-transfection of two separate plasmids, but this has a problem: the two plasmids do not necessarily enter the same cell, and even when they do, the ratio of their expression varies cell to cell.
A better approach is a bicistronic construct where both reporters are translated from a single mRNA. This is done with an internal ribosome entry site (IRES) between the two coding sequences. The cap-dependent first cistron and the IRES-driven second cistron are translated from the same mRNA, so every cell that makes the mRNA makes both proteins. The trade-off is that IRES-driven translation is usually less efficient than cap-dependent translation, so the second reporter is dimmer. Ratios of 2:1 to 5:1 (first cistron:second cistron) are common.
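Whether the control reporter comes from a co-transfected plasmid or from the second cistron, the normalisation arithmetic is the same: divide the experimental signal by the control signal well by well, then compare constructs. The sketch below assumes firefly is the experimental reporter and Renilla the control, which is a common arrangement but an assumption here; the function names and readings are hypothetical.

```python
def normalised_ratios(experimental, control):
    """Per-well ratio of experimental reporter signal to control reporter signal."""
    if len(experimental) != len(control):
        raise ValueError("need one control reading per experimental reading")
    return [e / c for e, c in zip(experimental, control)]


def relative_activity(test_ratios, reference_ratios):
    """Mean normalised ratio of the test construct relative to the reference construct."""
    test_mean = sum(test_ratios) / len(test_ratios)
    reference_mean = sum(reference_ratios) / len(reference_ratios)
    return test_mean / reference_mean


# Hypothetical triplicate readings; wells differ in transfection efficiency.
firefly_test = [5000.0, 6000.0, 4000.0]
renilla_test = [1000.0, 1200.0, 800.0]
firefly_ref = [1000.0, 1500.0, 500.0]
renilla_ref = [1000.0, 1500.0, 500.0]

test = normalised_ratios(firefly_test, renilla_test)
reference = normalised_ratios(firefly_ref, renilla_ref)
print(relative_activity(test, reference))
```

The raw firefly readings vary by up to threefold between wells, but the per-well ratios are constant, which is the point of the control reporter.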
A more modern alternative is a 2A peptide, which causes ribosomal skipping and produces two separate proteins from a single ORF. The 2A peptides (P2A, T2A, E2A, F2A) leave a short residual peptide on the C-terminus of the upstream protein and a single proline on the N-terminus of the downstream protein. They are highly efficient, with 1:1 to 1:3 ratios depending on the peptide and sequence context. 2A peptides are now the standard for expressing two reporters from a single construct and are particularly useful in viral vectors where you need all infected cells to express both.
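To see what the residual peptide looks like at the protein level, the sketch below splits a fusion protein at a single 2A site. It assumes the commonly cited P2A amino acid sequence, which should be checked against your own construct, and uses short placeholder strings in place of real reporter sequences. The function name `split_at_2a` is hypothetical.

```python
# Commonly cited P2A sequence (assumption: verify against your construct).
P2A = "ATNFSLLKQAGDVEENPGP"


def split_at_2a(polyprotein, peptide=P2A):
    """Return (upstream, downstream) products for a single 2A site.

    Skipping occurs between the final glycine and proline of the 2A peptide,
    so the upstream product keeps the 2A tail and the downstream one gains a proline.
    """
    start = polyprotein.find(peptide)
    if start == -1:
        raise ValueError("2A peptide not found in sequence")
    cut = start + len(peptide) - 1  # index of the final proline
    return polyprotein[:cut], polyprotein[cut:]


# Placeholder sequences standing in for two reporters (not real proteins).
upstream_reporter = "MAAAK"
downstream_reporter = "MSSSK"

up, down = split_at_2a(upstream_reporter + P2A + downstream_reporter)
print(up)    # upstream reporter plus the 2A tail ending in ...NPG
print(down)  # downstream reporter with an extra N-terminal proline
```

This is why the reporter that tolerates a C-terminal extension is usually placed upstream.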
Reporter-specific vector considerations
Firefly vectors are mature and largely interchangeable. Promega's pGL3 and pGL4 series are the workhorses. The pGL4 vectors (pGL4.10 onwards) carry a synthetic luciferase gene (luc2) with codon optimisation, removed cryptic regulatory elements, and improved consistency across cell types. Use pGL4 for any new work. The old pGL3 vectors still work, but cryptic transcription factor binding sites in the backbone and reporter gene can produce anomalous expression in some cell lines.
Renilla vectors come in two flavours: the native Renilla reniformis luciferase gene (Rluc, in the pRL series from Promega) and a codon-optimised synthetic version of the same enzyme (hRluc, in Promega's pGL4.70 series). The native gene is still used in most dual-luciferase work for historical reasons, but the synthetic version gives higher expression in mammalian cells.
NanoLuc vectors are newer and still consolidating around a few standard backbones. Promega's pNL vectors cover the main use cases (intracellular, secreted, PEST-fused, fusion-tagged). Third-party vectors with NanoLuc under various promoters are widely available through Addgene. The main thing to verify is the presence of a degradation tag if you need one: not all NanoLuc vectors include a PEST domain by default.
Fluorescent protein vectors are the most varied. EGFP, mCherry, EYFP, mNeonGreen, and the iRFP series each have their own optimised coding sequences, and the best vector for one may be mediocre for another. The key specification is whether the protein is monomeric (most modern variants) and whether it is codon-optimised for your expression system.
Dual-secreted reporter vectors carrying Gaussia luciferase (GLuc) and Cypridina luciferase (CLuc) on the same backbone are available from several sources, some with built-in bicistronic arrangements. They are useful for secreted dual-reporter normalisation but require careful validation.
Inducible systems
If your experiment involves turning your promoter of interest on or off experimentally, you have three main options:
Tet-on and Tet-off systems use a tetracycline-controlled transactivator (tTA in Tet-off, rtTA in Tet-on) to drive a minimal CMV promoter placed downstream of Tet operator repeats. tTA binds the operator only in the absence of doxycycline, so adding the drug switches a Tet-off system off. rtTA binds only in its presence, so adding the drug switches a Tet-on system on. The kinetics are slow (hours) but the induction is robust. This is the most-used inducible system in mammalian cells.
Hormone-regulated systems (e.g. estrogen receptor fusions, modified progesterone ligand-binding domains) give faster kinetics (minutes to hours) but require ligand addition and have their own background issues.
Synthetic inducible systems (dTAG, auxin-inducible degron, rapamycin-induced dimerisation) are primarily used to control protein stability rather than transcription and are more relevant for destabilised-reporter work.
For most reporter assays, the choice is straightforward: if you need a chemical on/off switch for transcription of the whole system, use Tet-on or Tet-off. If you only need to control when the reporter protein is degraded, fuse the reporter to a dTAG or auxin-inducible degron.
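The doxycycline logic of the two Tet variants is easy to invert when planning an experiment. The sketch below encodes it as a small truth table. It is an idealised model that ignores leakiness and induction kinetics, and the function name `reporter_is_on` is hypothetical.

```python
def reporter_is_on(system, dox_present):
    """Idealised on/off state of a Tet-regulated reporter (no leakiness modelled)."""
    if system == "tet-off":
        # tTA binds the operator only when doxycycline is absent.
        return not dox_present
    if system == "tet-on":
        # rtTA binds the operator only when doxycycline is present.
        return dox_present
    raise ValueError(f"unknown system: {system!r}")


for system in ("tet-on", "tet-off"):
    for dox in (False, True):
        state = "ON" if reporter_is_on(system, dox) else "off"
        print(f"{system:8s} dox={dox!s:5s} -> {state}")
```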
What to look for in a published vector sequence
If you are reverse-engineering a vector from a paper, the things that matter are: the MCS sequence (to know what restriction sites or recombination arms to use), the polyA signal (to know whether to add one if subcloning), any degradation tags on the reporter (often overlooked in figure legends), the selection marker (to know how to select), and any insulator or boundary elements. Insulators like the cHS4 element are occasionally included to reduce position-effect variegation in stable cell lines and are worth knowing about.
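A first pass over a published sequence can be automated with plain string matching. The sketch below scans one strand for a few common restriction sites and for the canonical AATAAA polyadenylation hexamer. The site panel, the function names, and the toy sequence are illustrative assumptions; a real annotation should use a dedicated tool and the full feature list from the paper or supplier.

```python
import re

# A small panel of common palindromic restriction sites (illustrative, not exhaustive).
SITES = {
    "EcoRI": "GAATTC",
    "HindIII": "AAGCTT",
    "KpnI": "GGTACC",
    "XhoI": "CTCGAG",
    "NheI": "GCTAGC",
    "BglII": "AGATCT",
}
POLYA_HEXAMER = "AATAAA"  # canonical hexamer; strand-specific, so only this strand is checked


def find_all(seq, motif):
    """0-based start positions of motif in seq, allowing overlapping matches."""
    return [m.start() for m in re.finditer(f"(?={motif})", seq)]


def annotate(seq):
    """Map each feature name to its hit positions, omitting features with no hits."""
    seq = seq.upper()
    hits = {name: find_all(seq, site) for name, site in SITES.items()}
    hits["polyA_hexamer"] = find_all(seq, POLYA_HEXAMER)
    return {name: positions for name, positions in hits.items() if positions}


# Toy sequence for illustration only; not taken from any real vector.
toy = "ccGGTACCttGCTAGCaaCTCGAGggAAGCTTatgAATAAAgg"
for name, positions in annotate(toy).items():
    print(name, positions)
```

A hexamer hit on its own does not prove a functional polyA signal, so treat it as a pointer to the region worth checking against the vector map.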