How to use the BondLab ENCODE Enabled Promoter (B.E.E.P.) analysis tools

Open the tool: B.E.E.P. at thebondlab.net/beep/. Load it over http/https rather than from a local file copy, so that the ENCODE and Gene Ontology requests work.

B.E.E.P. (BondLab ENCODE Enabled Promoters) is a free browser tool on The Bond Lab site. It stitches together three jobs that people often run as separate command-line steps: define promoter regions for a list of human genes, pull transcription-factor ChIP-seq binding data from ENCODE, and find which promoters overlap those binding sites.

What is B.E.E.P.?

B.E.E.P. runs in your browser and works with coordinates on the human genome (GRCh38 / hg38): chromosome, start, and end. It does not fetch DNA sequence. That keeps the workflow fast and makes outputs easy to drop into a genome browser, overlap utilities, or downstream pipelines.

The promoter side uses a pre-built cache of transcription start sites from UCSC RefSeq (curated RefSeq, the same general idea as our Promoter Grabber tool). The ChIP side talks to the ENCODE Project REST API. Overlap is done in JavaScript in your browser once the BED files are loaded.

BED files: what are they?

A BED file is a text table of genomic intervals: stretches of chromosome with a start and end position. It is one of the standard interchange formats in genomics, widely used with UCSC, IGV, and Galaxy.

Coordinates are 0-based and half-open. For a line like:

Example: chr1 1000 2000 MyRegion 0 +

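the interval covers bases 1001 to 2000 in 1-based counting, 1000 bases in total. The start is 0-based, so the first base included is start + 1 in 1-based terms. The end is exclusive in 0-based terms, which makes it equal to the last included base in 1-based terms. When you convert for a 1-based tool, or for a colleague who counts from 1, add one to the start and leave the end unchanged.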
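The conversion can be sketched in a few lines of Python. The function names here are illustrative only and are not part of B.E.E.P.

```python
def bed_to_one_based(start: int, end: int) -> tuple[int, int]:
    """Convert a 0-based, half-open BED interval to 1-based, inclusive."""
    if start < 0 or end < start:
        raise ValueError("expected 0 <= start <= end")
    # Only the start shifts; the exclusive 0-based end equals the
    # inclusive 1-based end.
    return start + 1, end


def bed_length(start: int, end: int) -> int:
    """Number of bases covered by a BED interval."""
    return end - start


# The example line from the text, split on whitespace.
fields = "chr1 1000 2000 MyRegion 0 +".split()
chrom, start, end = fields[0], int(fields[1]), int(fields[2])
first, last = bed_to_one_based(start, end)
print(f"{chrom}:{first}-{last} ({bed_length(start, end)} bp)")
# prints: chr1:1001-2000 (1000 bp)
```

A handy consequence of the half-open convention is that the length is simply end minus start, with no off-by-one correction.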

BED3 is the minimal version: chromosome, start, end. B.E.E.P. can export that, or a richer BED6 with a name column (e.g. TP53_NM_000546) and strand. Names help you see which gene or TF track a row came from when you load the file in IGV.
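As a rough illustration of the difference between the two layouts, the sketch below writes the same interval as BED3 and as BED6. The helper names are hypothetical, the name pattern simply follows the TP53_NM_000546 example above, and the coordinates are placeholders rather than the real TP53 locus.

```python
def make_name(gene: str, accession: str) -> str:
    """Join gene symbol and RefSeq accession, as in TP53_NM_000546."""
    return f"{gene}_{accession}"


def format_bed(chrom, start, end, name=None, score=0, strand="."):
    """Return a tab-separated BED3 line, or a BED6 line when a name is given."""
    fields = [chrom, str(start), str(end)]
    if name is not None:
        fields += [name, str(score), strand]
    return "\t".join(fields)


# Placeholder coordinates, for illustration only.
bed3 = format_bed("chr1", 1000, 2000)
bed6 = format_bed("chr1", 1000, 2000,
                  name=make_name("TP53", "NM_000546"), strand="-")
print(bed3)
print(bed6)
```

BED files are conventionally tab-separated, which is why the fields are joined with a tab rather than a space.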

ChIP peak files from ENCODE are often narrowPeak, which is BED-like with extra columns for peak strength and summit position. B.E.E.P. reads those and can filter by IDR score before overlap.
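For readers who want to handle narrowPeak files outside the browser, here is a minimal parsing sketch based on the standard ten-column narrowPeak layout. The example lines are invented, and the filter on the score column is only a stand-in: the text does not say which column or cut-off B.E.E.P. uses for its IDR filter.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class NarrowPeak:
    chrom: str
    start: int           # 0-based, inclusive
    end: int             # 0-based, exclusive
    name: str
    score: int
    strand: str
    signal_value: float
    p_value: float       # -log10 scale, -1 if not assigned
    q_value: float       # -log10 scale, -1 if not assigned
    summit_offset: int   # offset from start, -1 if no summit was called

    @property
    def summit(self) -> Optional[int]:
        """Absolute 0-based summit position, or None if absent."""
        if self.summit_offset < 0:
            return None
        return self.start + self.summit_offset


def parse_narrowpeak(line: str) -> NarrowPeak:
    f = line.rstrip("\n").split("\t")
    if len(f) < 10:
        raise ValueError("narrowPeak lines need 10 tab-separated columns")
    return NarrowPeak(f[0], int(f[1]), int(f[2]), f[3], int(f[4]), f[5],
                      float(f[6]), float(f[7]), float(f[8]), int(f[9]))


def filter_by_score(peaks, min_score):
    """Hypothetical filter: keep peaks whose score column is >= min_score."""
    return [p for p in peaks if p.score >= min_score]


# Invented example lines, not real ENCODE data.
lines = [
    "chr1\t1000\t2000\tpeak_a\t900\t.\t25.0\t-1\t-1\t400",
    "chr1\t5000\t5600\tpeak_b\t300\t.\t4.2\t-1\t-1\t-1",
]
peaks = [parse_narrowpeak(line) for line in lines]
strong = filter_by_score(peaks, min_score=500)
print([p.name for p in strong], peaks[0].summit)
```

Because the first three columns are ordinary BED coordinates, a narrowPeak file can be cut down to BED3 without any coordinate conversion.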

bigWig files: what are they?

A bigWig is not a list of intervals. It is a compressed signal track: ChIP enrichment (or RNA-seq, methylation, etc.) along each chromosome. ENCODE publishes bigWigs so you can see binding in IGV without loading every called peak.

Peak calls are a separate, sparser product (often IDR peaks in BED/narrowPeak form). Regions that look bound in the bigWig are sometimes missing from the peak BED. B.E.E.P. handles that gap: you can overlap on IDR peaks, scan bigWig signal inside promoter intervals (closer to what you see by eye in IGV), or optionally add bigWig-derived intervals into the ChIP BED when you build it.

Reading bigWigs in the browser is heavier than downloading a peak BED, which is why B.E.E.P. can limit scanning to your promoter regions only (much faster than scanning whole chromosomes).

What can you use B.E.E.P. for?

It is a practical bridge tool, not a replacement for full peak-calling pipelines or motif discovery on sequence (see our TF binding scanner for motifs on DNA).

How it works — the three tabs

1. Promoter BED

You supply genes in one of three ways:

For each gene, B.E.E.P. looks up the RefSeq TSS and strand, then builds a promoter interval from two numbers you set:

Plus-strand and minus-strand genes are both handled correctly in genomic coordinates. On the overlap plots, promoters are drawn with 5′ on the left and the TSS on the right, so you do not have to mentally flip minus-strand genes.
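The strand handling can be sketched as follows. This assumes the two numbers are an upstream and a downstream distance in base pairs, that the TSS is given as a 0-based position, and that the TSS base itself counts as the first downstream base. None of these details are confirmed by the text, so treat the function as an illustration rather than B.E.E.P.'s exact rule.

```python
def promoter_interval(tss: int, strand: str, upstream: int, downstream: int):
    """Return a 0-based, half-open (start, end) promoter interval.

    tss is the 0-based position of the TSS base (an assumption of this
    sketch). "Upstream" means lower coordinates on the plus strand and
    higher coordinates on the minus strand.
    """
    if strand == "+":
        start, end = tss - upstream, tss + downstream
    elif strand == "-":
        start, end = tss - downstream + 1, tss + upstream + 1
    else:
        raise ValueError("strand must be '+' or '-'")
    # Clamp at the chromosome start so the interval stays valid.
    return max(start, 0), end


plus = promoter_interval(10_000, "+", upstream=1000, downstream=100)
minus = promoter_interval(10_000, "-", upstream=1000, downstream=100)
print(plus, minus)
# prints: (9000, 10100) (9901, 11001)
```

Both intervals are 1100 bp long and both contain the TSS base. The minus-strand interval is the mirror image of the plus-strand one around the TSS.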

Output: annotated BED6 or plain BED3. Genes missing from the cache are reported in the status line.


2. ENCODE ChIP BED

Here you assemble binding data. Search by transcription factor and optional cell line, or paste ENCODE file accessions (ENCFF…). The tool lists matching peak BEDs and bigWigs; tick the tracks you want and click Build ChIP BED.
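A search of this kind can be approximated against the ENCODE portal by building a query URL. The sketch below only constructs the URL and makes no request. The field names (assay_title, target.label, biosample_ontology.term_name) are my reading of the ENCODE schema and should be checked against the ENCODE REST API documentation. The query B.E.E.P. actually sends is not described in the text.

```python
from urllib.parse import urlencode

ENCODE_SEARCH = "https://www.encodeproject.org/search/"


def encode_search_url(tf, cell_line=None, limit=25):
    """Build a JSON search URL for released TF ChIP-seq experiments.

    Field names are assumptions about the ENCODE schema, not taken
    from B.E.E.P. itself.
    """
    params = [
        ("type", "Experiment"),
        ("assay_title", "TF ChIP-seq"),
        ("target.label", tf),
        ("status", "released"),
    ]
    if cell_line:
        params.append(("biosample_ontology.term_name", cell_line))
    params += [("format", "json"), ("limit", str(limit))]
    return ENCODE_SEARCH + "?" + urlencode(params)


url = encode_search_url("NFYA", cell_line="HepG2")
print(url)
```

The cell line is left optional to mirror the search form, where only the transcription factor is required.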

For each track it prefers the IDR peak BED. If you select only a bigWig, it tries to find the matching peak file from the same experiment. If you want intervals that match what you see when viewing a bigWig track in IGV, tick Add bigWig signal regions. The tool then scans for bigWig signal above a threshold inside your promoters, or genome-wide if you untick promoter-only scan.
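The idea of turning signal above a threshold into intervals can be illustrated with a simple run-finding function. This sketch assumes the per-base signal for a region has already been read into a list. The min_width parameter and the numbers are invented for the example, and B.E.E.P.'s own thresholding rule may differ.

```python
def signal_islands(values, offset, threshold, min_width=1):
    """Find runs of consecutive positions with signal >= threshold.

    values[i] is the signal at genomic position offset + i (0-based).
    Returns half-open (start, end) intervals in genomic coordinates.
    """
    islands = []
    run_start = None
    for i, value in enumerate(values):
        if value >= threshold:
            if run_start is None:
                run_start = i          # a new run begins here
        elif run_start is not None:
            if i - run_start >= min_width:
                islands.append((offset + run_start, offset + i))
            run_start = None           # the run has ended
    # Close a run that extends to the end of the region.
    if run_start is not None and len(values) - run_start >= min_width:
        islands.append((offset + run_start, offset + len(values)))
    return islands


# Invented signal values for positions 100..110.
signal = [0, 0, 5, 6, 7, 0, 0, 9, 0, 8, 8]
islands = signal_islands(signal, offset=100, threshold=5, min_width=2)
print(islands)
# prints: [(102, 105), (109, 111)]
```

The single-base spike at position 107 is dropped because it is narrower than min_width, which is one simple way to ignore isolated noise.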

Useful options include:


3. Overlap

With promoters and ChIP BED ready, overlap finds promoters that intersect ChIP intervals.
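In its simplest form, the intersection test on half-open intervals looks like the sketch below. B.E.E.P. does this in JavaScript in the browser. The Python version here is only an illustration with invented gene names and coordinates, and it uses a plain nested scan rather than an indexed search.

```python
def overlaps(a_start, a_end, b_start, b_end):
    """True if two 0-based, half-open intervals share at least one base."""
    return a_start < b_end and b_start < a_end


def promoters_with_peaks(promoters, peaks):
    """Map promoter name -> names of peaks that overlap it.

    Both inputs are lists of (chrom, start, end, name) tuples.
    """
    by_chrom = {}
    for chrom, start, end, name in peaks:
        by_chrom.setdefault(chrom, []).append((start, end, name))
    hits = {}
    for chrom, start, end, name in promoters:
        matched = [peak_name
                   for peak_start, peak_end, peak_name in by_chrom.get(chrom, [])
                   if overlaps(start, end, peak_start, peak_end)]
        if matched:
            hits[name] = matched
    return hits


# Invented example data.
promoters = [("chr1", 9000, 10100, "GENE_A"),
             ("chr1", 50000, 51100, "GENE_B"),
             ("chr2", 9000, 10100, "GENE_C")]
peaks = [("chr1", 10050, 10300, "NFYA"),
         ("chr1", 10100, 10400, "SREBP1"),
         ("chr2", 100, 200, "NFYA")]
hits = promoters_with_peaks(promoters, peaks)
print(hits)
# prints: {'GENE_A': ['NFYA']}
```

The SREBP1 peak starts exactly where the GENE_A promoter ends, so under half-open coordinates the two abut without sharing a base and are not counted as overlapping.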

Peak overlap uses the BED intervals you built (IDR/narrowPeak, plus any bigWig islands you merged in).

bigWig signal overlap scans signal in each promoter for selected bigWig tracks — useful when IGV shows clear binding but the peak call is absent.

If several ChIP files are in the set, you can tick which ones count for this analysis. The TF boolean filter lets you write expressions like NFYA AND SREBP1 or (NFYA OR SREBP1) AND NOT POLR2A. The optional max TF peak distance setting (used with AND) requires the two factors' sites inside the promoter to lie within that many base pairs of each other, measured edge to edge.
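One way to evaluate such expressions is a small recursive-descent parser, sketched below together with an edge-to-edge distance helper. The precedence (NOT binds tighter than AND, which binds tighter than OR), the upper-case operator keywords, and the function names are assumptions made for illustration. The text does not describe how B.E.E.P. parses its filter.

```python
import re

TOKEN = re.compile(r"\(|\)|[A-Za-z0-9_.\-]+")
OPERATORS = {"AND", "OR", "NOT"}


def matches(expression, bound):
    """Evaluate a TF expression against the set of TFs bound at a promoter."""
    tokens = TOKEN.findall(expression)
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def take():
        nonlocal pos
        tok = peek()
        pos += 1
        return tok

    def parse_or():
        value = parse_and()
        while peek() == "OR":
            take()
            rhs = parse_and()      # always parse, so tokens are consumed
            value = value or rhs
        return value

    def parse_and():
        value = parse_not()
        while peek() == "AND":
            take()
            rhs = parse_not()
            value = value and rhs
        return value

    def parse_not():
        if peek() == "NOT":
            take()
            return not parse_not()
        return parse_atom()

    def parse_atom():
        tok = take()
        if tok == "(":
            value = parse_or()
            if take() != ")":
                raise ValueError("missing closing parenthesis")
            return value
        if tok is None or tok == ")" or tok in OPERATORS:
            raise ValueError(f"unexpected token: {tok!r}")
        return tok in bound        # a TF name is true if it is bound

    result = parse_or()
    if peek() is not None:
        raise ValueError(f"unexpected token: {peek()!r}")
    return result


def edge_gap(a_start, a_end, b_start, b_end):
    """Edge-to-edge gap in bp between half-open intervals (0 if they touch)."""
    return max(0, max(a_start, b_start) - min(a_end, b_end))


print(matches("(NFYA OR SREBP1) AND NOT POLR2A", {"NFYA"}))
print(edge_gap(100, 200, 250, 300))
```

With a distance limit of, say, 100 bp, a promoter would pass an AND filter only if a pair of sites for the two factors had an edge_gap of at most 100.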

Results include:

Under the hood (short version)

Limitations worth knowing

About the author: Written by Dr Mark Bond, The Bond Lab, University of Bristol. Questions about these methods: contact us or email mark.bond@bristol.ac.uk.