diff --git a/docs/content/images/memory-comparo.png b/docs/content/images/memory-comparo.png
new file mode 100644
index 0000000000000000000000000000000000000000..46940ec53a2826f41e8ceaf33922b385119b9659
Binary files /dev/null and b/docs/content/images/memory-comparo.png differ
diff --git a/docs/content/images/speed-comparo.png b/docs/content/images/speed-comparo.png
new file mode 100644
index 0000000000000000000000000000000000000000..34a5a3e2fc604543d9c249bd248c6a56e6567b3f
Binary files /dev/null and b/docs/content/images/speed-comparo.png differ
diff --git a/docs/index.rst b/docs/index.rst
index c9df3e6ffea3ab461bc198c9c4f3fd2ebe20d291..6e8ba652d8aa592fd69225928bb585da51e2adb2 100755
--- a/docs/index.rst
+++ b/docs/index.rst
@@ -32,6 +32,44 @@ Table of contents
    content/related-tools
 
 
+=================
+Performance
+=================
+As of version 2.18, ``bedtools`` is substantially more scalable thanks to improvements we have made to the algorithm used to process datasets that are pre-sorted
+by chromosome and start position. As the plots below show, both speed and memory consumption scale well
+with sorted data, in contrast to the poor scaling for unsorted data. The current version of ``bedtools intersect`` is as fast as (or slightly faster than) the ``bedops`` package's ``bedmap``, which uses a similar algorithm for sorted data. The plots below represent counting the number of intersecting alignments from exome capture BAM files against CCDS exons.
+The alignments have been converted to BED to facilitate comparisons to ``bedops``. We compare to the ``bedmap`` ``--ec`` option because similar error checking is enforced by ``bedtools``.
+
+.. image:: content/images/speed-comparo.png
+    :width: 300pt
+.. image:: content/images/memory-comparo.png
+    :width: 300pt
+
+Commands used:
+
+.. code-block:: bash
+
+    # bedtools unsorted
+    $ bedtools intersect \
+               -a ccds.exons.bed -b aln.bam.bed \
+               -c
+
+    # bedtools sorted
+    $ bedtools intersect \
+               -a ccds.exons.bed -b aln.bam.bed \
+               -c \
+               -sorted
+
+    # bedmap (no error checking)
+    $ bedmap --echo --count --bp-ovr 1 \
+             ccds.exons.bed aln.bam.bed
+
+    # bedmap (error checking)
+    $ bedmap --ec --echo --count --bp-ovr 1 \
+             ccds.exons.bed aln.bam.bed
+
+
+
 =================
 Brief example
 =================