How to use the BondLab ENCODE Enabled Promoter (B.E.E.P.) analysis tools
Open the tool: B.E.E.P. at thebondlab.net/beep/. Load it over http/https rather than from a local file copy, so that ENCODE and Gene Ontology requests work.
B.E.E.P. (BondLab ENCODE Enabled Promoters) is a free browser tool on The Bond Lab site. It stitches together three jobs that people often run as separate command-line steps: define promoter regions for a list of human genes, pull transcription-factor ChIP-seq binding data from ENCODE, and find which promoters overlap those binding sites.
What is B.E.E.P.?
B.E.E.P. runs in your browser and works with coordinates on the human genome (GRCh38 / hg38): chromosome, start, and end. It does not fetch DNA sequence. That keeps the workflow fast and makes outputs easy to drop into a genome browser, overlap utilities, or downstream pipelines.
The promoter side uses a pre-built cache of transcription start sites from UCSC RefSeq (curated RefSeq, the same general idea as our Promoter Grabber tool). The ChIP side talks to the ENCODE Project REST API. Overlap is done in JavaScript in your browser once the BED files are loaded.
BED files: what are they?
A BED file is a text table of genomic intervals: stretches of chromosome with a start and end position. It is one of the common currency formats in genomics, especially for UCSC, IGV, and Galaxy.
Coordinates are 0-based and half-open. For a line like:
Example: chr1 1000 2000 MyRegion 0 +
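the interval covers bases 1001 to 2000 in 1-based counting (the first base included is start+1 in everyday human-speak). In 0-based terms the end coordinate is exclusive, which is why it equals the last included base in 1-based terms, and why the interval length is simply end minus start (1000 bp here). When you talk to biologists, add one to the start and leave the end unchanged.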
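The coordinate convention above can be checked with a few lines of Python. This is an illustrative sketch only, not code from B.E.E.P. (which runs in JavaScript); the names `BedInterval`, `parse_bed_line`, and `to_one_based` are hypothetical.

```python
from typing import NamedTuple, Tuple


class BedInterval(NamedTuple):
    chrom: str
    start: int          # 0-based, inclusive
    end: int            # 0-based, exclusive
    name: str = "."
    score: str = "."
    strand: str = "."


def parse_bed_line(line: str) -> BedInterval:
    """Parse a BED3 to BED6 line; fields may be tab- or space-separated."""
    fields = line.split()
    if len(fields) < 3:
        raise ValueError("a BED line needs at least chrom, start, end")
    chrom, start, end = fields[0], int(fields[1]), int(fields[2])
    # Columns 4-6 (name, score, strand) are optional and fall back to ".".
    return BedInterval(chrom, start, end, *fields[3:6])


def to_one_based(interval: BedInterval) -> Tuple[int, int]:
    """Return (first_base, last_base), both inclusive, in 1-based counting."""
    return interval.start + 1, interval.end


region = parse_bed_line("chr1 1000 2000 MyRegion 0 +")
print(to_one_based(region))          # (1001, 2000)
print(region.end - region.start)     # 1000 bases long
```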
BED3 is the minimal version: chromosome, start, end. B.E.E.P. can export that, or a richer BED6 with a name column (e.g. TP53_NM_000546) and strand. Names help you see which gene or TF track a row came from when you load the file in IGV.
ChIP peak files from ENCODE are often narrowPeak, which is BED-like with extra columns for peak strength and summit position. B.E.E.P. reads those and can filter by IDR score before overlap.
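For readers who want to inspect a narrowPeak file themselves, the sketch below parses the ten standard narrowPeak columns and computes the absolute summit position. The source does not say which column B.E.E.P. treats as the IDR score; filtering on the score column (column 5) here is an assumption made purely for illustration, and all function names are hypothetical.

```python
from typing import List, NamedTuple, Optional


class NarrowPeak(NamedTuple):
    chrom: str
    start: int            # 0-based, inclusive
    end: int              # 0-based, exclusive
    name: str
    score: int            # 0-1000 display score
    strand: str
    signal: float         # enrichment signal value
    p_value: float        # -log10 p-value, or -1 if not assigned
    q_value: float        # -log10 q-value, or -1 if not assigned
    summit_offset: int    # summit offset from start, or -1 if not called


def parse_narrowpeak_line(line: str) -> NarrowPeak:
    f = line.rstrip("\n").split("\t")
    if len(f) < 10:
        raise ValueError("narrowPeak lines have 10 tab-separated columns")
    return NarrowPeak(f[0], int(f[1]), int(f[2]), f[3], int(f[4]), f[5],
                      float(f[6]), float(f[7]), float(f[8]), int(f[9]))


def summit_position(peak: NarrowPeak) -> Optional[int]:
    """Absolute 0-based summit coordinate, or None if no summit was called."""
    if peak.summit_offset < 0:
        return None
    return peak.start + peak.summit_offset


def filter_by_score(peaks: List[NarrowPeak], min_score: int) -> List[NarrowPeak]:
    # Assumption: the filter threshold applies to column 5 (score).
    return [p for p in peaks if p.score >= min_score]
```

The `min_score` threshold plays the role of the tool's minimum IDR score option only by analogy; check the ENCODE file documentation before relying on a particular column.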
bigWig files: what are they?
A bigWig is not a list of intervals. It is a compressed signal track: ChIP enrichment (or RNA-seq, methylation, etc.) along each chromosome. ENCODE publishes bigWigs so you can see binding in IGV without loading every called peak.
Peak calls are a separate, sparser product (often IDR peaks in BED/narrowPeak form). Regions that look bound in the bigWig are sometimes missing from the peak BED. B.E.E.P. handles that gap: you can overlap on IDR peaks, scan bigWig signal inside promoter intervals (closer to what you see by eye in IGV), or optionally add bigWig-derived intervals into the ChIP BED when you build it.
Reading bigWigs in the browser is heavier than downloading a peak BED, which is why B.E.E.P. can limit scanning to your promoter regions only (much faster than scanning whole chromosomes).
What can you use B.E.E.P. for?
- You have a gene list from RNA-seq or a pathway (e.g. inflammatory response GO term) and want promoter coordinates for all of them in one BED for a screen or for sharing with a collaborator.
- You want to know which of those promoters show binding of particular transcription factors in ENCODE ChIP in a particular cell line, without writing custom bedtools commands.
- You need combinatorial logic: promoters with MYC and SP1, or MYC without SP1, with an optional maximum distance between two factor sites in the same promoter.
- You are preparing figures or supplementary tables: gene tables, CSV with per-TF site coordinates, overlap BEDs for IGV, and promoter schematics with TF positions relative to the TSS.
- You already have a ChIP BED from elsewhere and want to append it to an ENCODE-built set, manage multiple tracks, and run the same overlap workflow.
It is a practical bridge tool, not a replacement for full peak-calling pipelines or motif discovery on sequence (see our TF binding scanner for motifs on DNA).
How it works — the three tabs
1. Promoter BED
You supply genes in one of three ways:
- Paste symbols or upload a list (up to 5000 per batch).
- Search a Gene Ontology term via QuickGO, with an option to include child GO terms (which usually increases the gene count).
- Load the full cached human gene set (~28k) for a genome-wide promoter BED.
For each gene, B.E.E.P. looks up the RefSeq TSS and strand, then builds a promoter interval from two numbers you set:
- Bases 5′ of TSS — how far upstream to extend (classic promoter length, often 2000 bp).
- 3′ boundary vs TSS — where the downstream edge sits. Zero means the interval ends at the TSS; a positive value pushes into the gene body; negative pulls the downstream edge further upstream.
Plus-strand and minus-strand genes are handled correctly with respect to genomic coordinates; on the overlap plots, promoters are drawn with 5′ on the left and TSS on the right so you do not have to mentally flip minus genes.
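The strand logic can be sketched as follows. This is a minimal illustration, not B.E.E.P.'s own code: it assumes the TSS is given as a 0-based boundary coordinate, and how the tool treats the exact off-by-one at the TSS is not stated in this guide. The function and parameter names are hypothetical.

```python
from typing import Tuple


def promoter_interval(tss: int, strand: str,
                      upstream: int = 2000, boundary: int = 0) -> Tuple[int, int]:
    """Return a 0-based, half-open (start, end) promoter interval.

    upstream : bases 5' of the TSS
    boundary : 3' edge relative to the TSS (0 = ends at the TSS,
               positive = into the gene body, negative = further upstream)
    """
    if strand == "+":
        start, end = tss - upstream, tss + boundary
    elif strand == "-":
        # On the minus strand, 5' of the TSS means higher genomic coordinates.
        start, end = tss - boundary, tss + upstream
    else:
        raise ValueError("strand must be '+' or '-'")
    start = max(0, start)  # do not run off the start of the chromosome
    if end <= start:
        raise ValueError("settings produce an empty promoter interval")
    return start, end


def bed6_line(chrom: str, start: int, end: int, name: str, strand: str) -> str:
    """Format one BED6 row; the score column is filled with 0."""
    return "\t".join([chrom, str(start), str(end), name, "0", strand])


print(promoter_interval(10000, "+"))        # (8000, 10000)
print(promoter_interval(10000, "-"))        # (10000, 12000)
print(promoter_interval(10000, "-", 2000, 100))  # (9900, 12000)
```

The coordinates in the usage lines are made up; only the relationship between strand, upstream length, and 3' boundary is the point.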
Output: annotated BED6 or plain BED3. Genes missing from the cache are reported in the status line.
2. ENCODE ChIP BED
Here you assemble binding data. Search by transcription factor and optional cell line, or paste ENCODE file accessions (ENCFF…). The tool lists matching peak BEDs and bigWigs; tick the tracks you want and click Build ChIP BED.
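If you want to reproduce a similar search outside the tool, an ENCODE search URL can be assembled as in the sketch below. The exact query B.E.E.P. sends is not documented here; the field names (`target.label`, `biosample_ontology.term_name`, and so on) are the ones commonly used with the ENCODE search endpoint and should be checked against the ENCODE REST API documentation. `build_encode_search_url` is a hypothetical helper, and the sketch only builds the URL without making a request.

```python
from typing import Optional
from urllib.parse import urlencode

ENCODE_SEARCH = "https://www.encodeproject.org/search/"


def build_encode_search_url(tf_symbol: str, cell_line: Optional[str] = None) -> str:
    """Build a JSON search URL for released GRCh38 TF ChIP-seq experiments."""
    params = [
        ("type", "Experiment"),
        ("assay_title", "TF ChIP-seq"),
        ("target.label", tf_symbol),
        ("assembly", "GRCh38"),
        ("status", "released"),
        ("format", "json"),
    ]
    if cell_line:
        # Assumed field name for the biosample (cell line) filter.
        params.append(("biosample_ontology.term_name", cell_line))
    return ENCODE_SEARCH + "?" + urlencode(params)


print(build_encode_search_url("MYC", "K562"))
```

ENCODE rate-limits requests, so space out calls if you script many searches.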
For each track it prefers the IDR peak BED. If you only select a bigWig, it tries to find the matching peak file from the same experiment. If you want peaks that match what you see when viewing a bigWig track in IGV, tick Add bigWig signal regions to scan the bigWig signal above a threshold inside your promoters (or genome-wide if you untick promoter-only scan).
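The idea of turning bigWig signal into intervals can be sketched as a threshold scan restricted to one promoter. This is an assumption-laden illustration rather than the tool's algorithm: it takes signal already decoded into sorted, non-overlapping `(start, end, value)` tuples, and the threshold and `max_gap` parameters are hypothetical stand-ins for the stringency presets.

```python
from typing import List, Tuple

Signal = Tuple[int, int, float]   # (start, end, value), 0-based half-open


def signal_islands(signal: List[Signal], region_start: int, region_end: int,
                   threshold: float, max_gap: int = 0) -> List[Tuple[int, int]]:
    """Return merged intervals inside the region where value >= threshold."""
    islands: List[Tuple[int, int]] = []
    for start, end, value in signal:
        # Clip each signal interval to the promoter region.
        s, e = max(start, region_start), min(end, region_end)
        if s >= e or value < threshold:
            continue
        if islands and s - islands[-1][1] <= max_gap:
            # Abutting (or near enough) to the previous island: extend it.
            islands[-1] = (islands[-1][0], max(islands[-1][1], e))
        else:
            islands.append((s, e))
    return islands


# Made-up signal values for illustration.
track = [(900, 1000, 1.0), (1000, 1100, 6.0), (1100, 1200, 8.0),
         (1200, 1300, 2.0), (1300, 1400, 7.0), (1400, 1600, 9.0)]
print(signal_islands(track, 1050, 1500, threshold=5.0))
# [(1050, 1200), (1300, 1500)]
```

Restricting the scan to promoter intervals means only the signal blocks that intersect each promoter need to be read, which is why the promoter-only option is so much faster than a genome-wide scan.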
Useful options include:
- Minimum IDR score.
- Separate BED per track vs one merged BED.
- Stringency presets for bigWig scanning.
- Append imported BED from your machine.
- Add a second search to the same set (append vs replace).
- A file list at the bottom to display, append into, or delete individual tracks.
3. Overlap
With promoters and ChIP BED ready, overlap finds promoters that intersect ChIP intervals.
Peak overlap uses the BED intervals you built (IDR/narrowPeak, plus any bigWig islands you merged in).
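Peak overlap boils down to a half-open interval intersection test per chromosome. The sketch below shows that test in Python; B.E.E.P. itself does this in JavaScript, and the function names and tuple layouts here are hypothetical.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

Promoter = Tuple[str, int, int, str]   # (chrom, start, end, gene name)
Peak = Tuple[str, int, int, str]       # (chrom, start, end, TF label)


def intervals_overlap(a_start: int, a_end: int, b_start: int, b_end: int) -> bool:
    """True if two 0-based half-open intervals share at least one base."""
    return a_start < b_end and b_start < a_end


def promoters_overlapping_peaks(
    promoters: List[Promoter], peaks: List[Peak]
) -> Dict[str, List[Tuple[int, int, str]]]:
    """Map each promoter name to the peaks that intersect it."""
    peaks_by_chrom = defaultdict(list)
    for chrom, start, end, label in peaks:
        peaks_by_chrom[chrom].append((start, end, label))

    hits: Dict[str, List[Tuple[int, int, str]]] = {}
    for chrom, p_start, p_end, name in promoters:
        matched = [(s, e, label)
                   for s, e, label in peaks_by_chrom.get(chrom, [])
                   if intervals_overlap(p_start, p_end, s, e)]
        if matched:
            hits[name] = matched
    return hits
```

Because BED intervals are half-open, a peak that starts exactly where a promoter ends does not count as overlapping. For very large inputs a sorted sweep or interval tree would be faster than this nested loop.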
bigWig signal overlap scans signal in each promoter for selected bigWig tracks — useful when IGV shows clear binding but the peak call is absent.
If several ChIP files are in the set, you can tick which ones count for this analysis. The TF boolean filter lets you write expressions like NFYA AND SREBP1 or (NFYA OR SREBP1) AND NOT POLR2A. Optional max TF peak distance (with AND) requires two factors' sites inside the promoter to be within that many base pairs (edge-to-edge).
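To make the filter semantics concrete, here is a small sketch of how such an expression could be evaluated against the set of TFs bound at one promoter, plus an edge-to-edge distance helper. It is not the tool's parser: the precedence (NOT, then AND, then OR) is the conventional one and is assumed here, and all names are hypothetical.

```python
import re
from typing import Iterable, Tuple


def evaluate_tf_expression(expression: str, bound_tfs: Iterable[str]) -> bool:
    """Evaluate e.g. '(NFYA OR SREBP1) AND NOT POLR2A' for one promoter."""
    # Characters outside parentheses and TF-name characters are ignored.
    tokens = re.findall(r"\(|\)|[A-Za-z0-9_.\-]+", expression)
    bound = {tf.upper() for tf in bound_tfs}
    pos = 0

    def peek():
        return tokens[pos].upper() if pos < len(tokens) else None

    def advance():
        nonlocal pos
        pos += 1

    def parse_or():
        value = parse_and()
        while peek() == "OR":
            advance()
            rhs = parse_and()      # parse first so short-circuiting never skips tokens
            value = value or rhs
        return value

    def parse_and():
        value = parse_not()
        while peek() == "AND":
            advance()
            rhs = parse_not()
            value = value and rhs
        return value

    def parse_not():
        if peek() == "NOT":
            advance()
            return not parse_not()
        return parse_atom()

    def parse_atom():
        tok = peek()
        if tok is None or tok in {")", "AND", "OR", "NOT"}:
            raise ValueError("expected a TF name or '('")
        advance()
        if tok == "(":
            value = parse_or()
            if peek() != ")":
                raise ValueError("missing ')'")
            advance()
            return value
        return tok in bound

    result = parse_or()
    if pos != len(tokens):
        raise ValueError("unexpected trailing tokens")
    return result


def edge_to_edge_distance(a: Tuple[int, int], b: Tuple[int, int]) -> int:
    """Gap in bp between two half-open intervals; 0 if they touch or overlap."""
    return max(0, max(a[0], b[0]) - min(a[1], b[1]))


print(evaluate_tf_expression("(NFYA OR SREBP1) AND NOT POLR2A", {"SREBP1"}))  # True
print(edge_to_edge_distance((100, 200), (250, 300)))                           # 50
```

A promoter would then pass a max-distance filter of, say, 100 bp only if at least one pair of sites from the two factors has an edge-to-edge distance of 100 or less.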
Results include:
- Sortable gene table.
- CSV with one row per TF site.
- TSV gene list and overlap BEDs.
- Promoter schematics (TF ticks coloured by factor).
- Histograms of TF position relative to the TSS (bin size adjustable).
- When max distance is set, a second plot shows where pairs of different TFs tend to sit relative to the TSS.
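The position histograms rest on two small calculations: a strand-aware offset from the TSS and a binning step. The sketch below assumes the site midpoint is used as its position (the tool may use the summit or another reference point) and that negative values mean upstream; the function names are hypothetical.

```python
from collections import Counter
from typing import Dict, Iterable


def position_relative_to_tss(site_start: int, site_end: int,
                             tss: int, strand: str) -> int:
    """Offset of the site midpoint from the TSS; negative means 5' (upstream)."""
    midpoint = (site_start + site_end) // 2
    offset = midpoint - tss
    # Flip the sign for minus-strand genes so upstream is always negative.
    return offset if strand == "+" else -offset


def histogram(offsets: Iterable[int], bin_size: int) -> Dict[int, int]:
    """Count offsets per bin, keyed by the lower edge of each bin."""
    counts = Counter((offset // bin_size) * bin_size for offset in offsets)
    return dict(sorted(counts.items()))


# Made-up coordinates: the same relative position on either strand.
print(position_relative_to_tss(9800, 9900, 10000, "+"))    # -150
print(position_relative_to_tss(10100, 10200, 10000, "-"))  # -150
print(histogram([-150, -120, -20, 30], bin_size=100))
# {-200: 2, -100: 1, 0: 1}
```

Python's floor division rounds toward negative infinity, so upstream offsets land in the correct bin without special handling.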
Under the hood (short version)
- Assembly: human GRCh38 only for promoters and ENCODE file matching.
- TSS cache: bundled RefSeq table (~2 MB), loaded once; no live NCBI call per gene (unlike Promoter Grabber for sequence).
- ENCODE: browser calls encodeproject.org (rate-limited); some requests may go via a small server proxy to avoid CORS issues.
- Overlap: interval intersection in JavaScript; filtering and distance logic in the bed-utils module.
- bigWig: parsed in the browser; scanning is the slow step — leave the tab open until the progress bar finishes.
Limitations worth knowing
- One genome build (GRCh38). Mixing hg19 ENCODE files without noticing will give nonsense.
- Promoter definition is TSS-centric RefSeq; not CAGE, not tissue-specific TSS unless you use another TSS source elsewhere.
- ChIP overlap is only as good as the ENCODE experiment you pick (cell line, antibody, and how replicates were merged in ENCODE's peak calls).
- Very large gene lists and genome-wide bigWig scans can take time; promoter-limited bigWig scan is the intended workaround.
- .bed.gz import for ChIP append is not supported in-browser; decompress first.
Related tools on the site
- Promoter Grabber — single-gene promoter sequence from NCBI.
- Genome browser — view ENCODE bigWig tracks in IGV in the browser.
- TF binding scanner — motif scanning on DNA sequence, not ChIP coordinates.