From 39cec4017cef73330981cecfd4ab589e8210e74f Mon Sep 17 00:00:00 2001 From: arq5x <arq5x@virginia.edu> Date: Thu, 20 Nov 2014 09:42:46 -0500 Subject: [PATCH] better tutorial --- tutorial/answers.html | 8 +++-- tutorial/answers.md | 8 +++-- tutorial/bedtools.html | 43 ++++++++++++++++++++++-- tutorial/bedtools.md | 2 +- tutorial/template.html | 75 ++++++++++++++++++++++++++++++++++++++++++ 5 files changed, 129 insertions(+), 7 deletions(-) create mode 100644 tutorial/template.html diff --git a/tutorial/answers.html b/tutorial/answers.html index 769de7c7..81ee0d88 100644 --- a/tutorial/answers.html +++ b/tutorial/answers.html @@ -25,10 +25,14 @@ <div class="span12"> <h1 id="puzzles-to-help-teach-you-more-bedtools.">Puzzles to help teach you more bedtools.</h1> <ol style="list-style-type: decimal"> -<li>Create a BED file representing all of the intervals in the genome that are NOT exonic.</li> +<li>Create a BED file representing all of the intervals in the genome that are NOT exonic and are not Promoters (based on the promoters in the hESC file).</li> </ol> <p>Answer:</p> -<pre><code>bedtools complement -i exons.bed -g genome.txt > notexons.bed</code></pre> +<pre><code>grep Promoter hesc.chromHmm.bed > hesc.promoters.bed + +cat exons.bed hesc.promoters.bed | sort -k1,1 -k2,2n > exons.and.promoters.bed + +bedtools complement -i exons.and.promoters.bed -g genome.txt > notexonsorpromoters.bed</code></pre> <ol start="2" style="list-style-type: decimal"> <li>What is the average distance from GWAS SNPs to the closest exon? (Hint - have a look at the <a href="http://bedtools.readthedocs.org/en/latest/content/tools/closest.html">closest</a> tool.)</li> </ol> diff --git a/tutorial/answers.md b/tutorial/answers.md index 95ef89cb..35cffe1c 100644 --- a/tutorial/answers.md +++ b/tutorial/answers.md @@ -6,11 +6,15 @@ Puzzles to help teach you more bedtools. ======================================== 1. 
Create a BED file representing all of the intervals in the genome -that are NOT exonic. +that are NOT exonic and are not Promoters (based on the promoters in the hESC file). Answer: + + grep Promoter hesc.chromHmm.bed > hesc.promoters.bed + + cat exons.bed hesc.promoters.bed | sort -k1,1 -k2,2n > exons.and.promoters.bed - bedtools complement -i exons.bed -g genome.txt > notexons.bed + bedtools complement -i exons.and.promoters.bed -g genome.txt > notexonsorpromoters.bed 2. What is the average distance from GWAS SNPs to the closest exon? (Hint - have a look at the [closest](http://bedtools.readthedocs.org/en/latest/content/tools/closest.html) tool.) diff --git a/tutorial/bedtools.html b/tutorial/bedtools.html index 0f32657b..a4d4b8b8 100644 --- a/tutorial/bedtools.html +++ b/tutorial/bedtools.html @@ -22,7 +22,46 @@ </div> <div class="container"> <div class="row"> - <div class="span12"> + <div id="TOC" class="span3"> + <div class="well toc"> + <ul> + <li class="nav-header">Table of Contents</li> + </ul> + <ul> + <li><a href="#synopsis">Synopsis</a></li> + <li><a href="#setup">Setup</a></li> + <li><a href="#what-are-these-files">What are these files?</a></li> + <li><a href="#the-bedtools-help">The bedtools help</a></li> + <li><a href="#bedtools-intersect">bedtools “intersect”</a><ul> + <li><a href="#default-behavior">Default behavior</a></li> + <li><a href="#reporting-the-original-feature-in-each-file.">Reporting the original feature in each file.</a></li> + <li><a href="#how-many-base-pairs-of-overlap-were-there">How many base pairs of overlap were there?</a></li> + <li><a href="#counting-the-number-of-overlapping-features.">Counting the number of overlapping features.</a></li> + <li><a href="#find-features-that-do-not-overlap">Find features that DO NOT overlap</a></li> + <li><a href="#require-a-minimal-fraction-of-overlap.">Require a minimal fraction of overlap.</a></li> + <li><a href="#faster-analysis-via-sorted-data.">Faster analysis via sorted data.</a></li> + 
<li><a href="#intersecting-multiple-files-at-once.">Intersecting multiple files at once.</a></li> + </ul></li> + <li><a href="#bedtools-merge">bedtools “merge”</a><ul> + <li><a href="#input-must-be-sorted">Input must be sorted</a></li> + <li><a href="#merge-intervals.">Merge intervals.</a></li> + <li><a href="#count-the-number-of-overlapping-intervals.">Count the number of overlapping intervals.</a></li> + <li><a href="#merging-features-that-are-close-to-one-another.">Merging features that are close to one another.</a></li> + <li><a href="#listing-the-name-of-each-of-the-exons-that-were-merged.">Listing the name of each of the exons that were merged.</a></li> + </ul></li> + <li><a href="#bedtools-complement">bedtools “complement”</a></li> + <li><a href="#bedtools-genomecov">bedtools “genomecov”</a><ul> + <li><a href="#producing-bedgraph-output">Producing BEDGRAPH output</a></li> + </ul></li> + <li><a href="#sophistication-through-chaining-multiple-bedtools">Sophistication through chaining multiple bedtools</a></li> + <li><a href="#principal-component-analysis">Principal component analysis</a><ul> + <li><a href="#a-jaccard-statistic-for-all-400-pairwise-comparisons.">A Jaccard statistic for all 400 pairwise comparisons.</a></li> + </ul></li> + <li><a href="#puzzles-to-help-teach-you-more-bedtools.">Puzzles to help teach you more bedtools.</a></li> + </ul> + </div> + </div> + <div class="span9"> <h1 id="synopsis">Synopsis</h1> <p>Our goal is to work through examples that demonstrate how to explore, process and manipulate genomic interval files (e.g., BED, VCF, BAM) with the <code>bedtools</code> software package.</p> <p>Some of our analysis will be based upon the Maurano et al exploration of DnaseI hypersensitivity sites in hundreds of primary tissue types.</p> @@ -523,7 +562,7 @@ heatmap.2(jaccard_matrix, col = brewer.pal(9,"Blues"), margins = c(14, <p><br /></p> <h1 id="puzzles-to-help-teach-you-more-bedtools.">Puzzles to help teach you more bedtools.</h1> <ol 
style="list-style-type: decimal"> -<li><p>Create a BED file representing all of the intervals in the genome that are NOT exonic.</p></li> +<li><p>Create a BED file representing all of the intervals in the genome that are NOT exonic and not Promoters (based on the promoters in the hESC file).</p></li> <li><p>What is the average distance from GWAS SNPs to the closest exon? (Hint - have a look at the <a href="http://bedtools.readthedocs.org/en/latest/content/tools/closest.html">closest</a> tool.)</p></li> <li><p>Count how many exons occur in each 500kb interval (“window”) in the human genome. (Hint - have a look at the <code>makewindows</code> tool.)</p></li> <li><p>Are there any exons that are completely overlapped by an enhancer? If so, how many?</p></li> diff --git a/tutorial/bedtools.md b/tutorial/bedtools.md index 985afcf7..56bf2678 100644 --- a/tutorial/bedtools.md +++ b/tutorial/bedtools.md @@ -691,7 +691,7 @@ Puzzles to help teach you more bedtools. 1. Create a BED file representing all of the intervals in the genome -that are NOT exonic. +that are NOT exonic and not Promoters (based on the promoters in the hESC file). 2. What is the average distance from GWAS SNPs to the closest exon? (Hint - have a look at the [closest](http://bedtools.readthedocs.org/en/latest/content/tools/closest.html) tool.) 
diff --git a/tutorial/template.html b/tutorial/template.html new file mode 100644 index 00000000..8d2f7194 --- /dev/null +++ b/tutorial/template.html @@ -0,0 +1,75 @@ +<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd"> +<html xmlns="http://www.w3.org/1999/xhtml"$if(lang)$ lang="$lang$" xml:lang="$lang$"$endif$> +<head> + <meta http-equiv="Content-Type" content="text/html; charset=utf-8" /> + <meta http-equiv="Content-Style-Type" content="text/css" /> + <meta name="generator" content="pandoc" /> +$for(author-meta)$ + <meta name="author" content="$author-meta$" /> +$endfor$ +$if(date-meta)$ + <meta name="date" content="$date-meta$" /> +$endif$ + <title>$if(title-prefix)$$title-prefix$ - $endif$$pagetitle$</title> + <style type="text/css">code{white-space: pre;}</style> +$if(quotes)$ + <style type="text/css">q { quotes: "“" "”" "‘" "’"; }</style> +$endif$ +$if(highlighting-css)$ + <style type="text/css"> +$highlighting-css$ + </style> +$endif$ +$for(css)$ + <link rel="stylesheet" href="$css$" $if(html5)$$else$type="text/css" $endif$/> +$endfor$ +$if(math)$ + $math$ +$endif$ +$for(header-includes)$ + $header-includes$ +$endfor$ +</head> +<body> + $if(title)$ + <div class="navbar navbar-static-top"> + <div class="navbar-inner"> + <div class="container"> + <span class="doc-title">$title$</span> + <ul class="nav pull-right doc-info"> + $for(author)$ + <li><p class="navbar-text">$author$</p></li> + $endfor$ + $if(date)$ + <li><p class="navbar-text">$date$</p></li> + $endif$ + </ul> + </div> + </div> + </div> + $endif$ + <div class="container"> + <div class="row"> + $if(toc)$ + <div id="$idprefix$TOC" class="span3"> + <div class="well toc"> + <ul> + <li class="nav-header">Table of Contents</li> + </ul> + $toc$ + </div> + </div> + $endif$ + <div class="span$if(toc)$9$else$12$endif$"> + $for(include-before)$ + $include-before$ + $endfor$ + $body$ + $for(include-after)$ + $include-after$ + $endfor$ + 
</div> + </div> + </div> +</body> +</html> -- GitLab
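
The first two steps of the puzzle 1 answer in this patch can be sketched as a small script that runs without bedtools. This is only an illustration under stated assumptions: the toy coordinates, feature names, and the scratch directory are made up and are not the tutorial's data. It shows the filtered promoters and the exons being concatenated, sorted by chromosome and start, and written out with a `>` redirect. The final `bedtools complement` step is left as a comment because it needs bedtools and a real `genome.txt`.

```shell
#!/bin/sh
set -eu

# Work in a scratch directory so no real tutorial files are touched.
workdir=$(mktemp -d)
cd "$workdir"

# Hypothetical toy inputs (tab-separated BED); NOT the tutorial's data.
printf 'chr1\t500\t600\texonB\nchr1\t100\t200\texonA\n' > exons.bed
printf 'chr1\t300\t400\t1_Active_Promoter\nchr1\t700\t800\t8_Insulator\n' > hesc.chromHmm.bed

# Step 1: keep only the promoter segments.
grep Promoter hesc.chromHmm.bed > hesc.promoters.bed

# Step 2: combine with the exons, then sort by chromosome and numeric start.
cat exons.bed hesc.promoters.bed | sort -k1,1 -k2,2n > exons.and.promoters.bed

# Show the combined, sorted intervals.
cat exons.and.promoters.bed

# Step 3 (needs bedtools and genome.txt, so it is not run here):
# bedtools complement -i exons.and.promoters.bed -g genome.txt > notexonsorpromoters.bed
```

Sorting before the complement step matters because the tutorial's own table of contents notes that sorted input is expected for several bedtools operations; the sort keys here match the ones used in the patch.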