reproducibility.md 2.36 KB
Newer Older
Vilem Ded's avatar
Vilem Ded committed
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
# Reproducibility
* ensures credibility
* key requirement for follow-up and collaborative studies

<div style="position:absolute">
<img src="slides/img/reproducibility_nature.png" height="650px">
</div>

<div class="fragment" style="position:relative;left:50%">

## Why is our workflow not reproducible?

 Lack of provenance:
  * Input data downloaded from “some website”
  * Copy & paste operations
  * Manual text entry
  * Analysis not coded
  * Intermediate and final data not generated by deterministic processes
</div>



# Reproducibility
## Learning to code workflows and analyses
<div style="display:inline-grid;grid-gap: 40px;grid-template-columns: auto auto;position:relative;left:12%">
<div class="fragment">
<div class="content-box">
<div class="box-title red">Excel alone</div>
<div class="content">

  * Is great for looking at data.
  * Data entry is fast.
  * Analysis flow is hidden and not in focus.
  
</div>
</div>
Vilem Ded's avatar
Vilem Ded committed
37
38
39
<div style="text-align:center">
<img src="slides/img/excel_data-sheet.png" height="280px">
</div>
Vilem Ded's avatar
Vilem Ded committed
40
41
42
43
44
45
46
47
48
49
50
51
</div>

<div class="fragment">
<div class="content-box">
<div class="box-title">Coding</div>
<div class="content">

  * Is great for controlling analysis
  * Data is hidden.
  * Flow is visible.
</div> 
</div> 
Vilem Ded's avatar
Vilem Ded committed
52
<img src="slides/img/code-example.png"  height="280px">
Vilem Ded's avatar
Vilem Ded committed
53
54
55
</div>
</div>

Vilem Ded's avatar
Vilem Ded committed
56
<div class="content-box fragment" style="left:15%;width:60%;position:relative">
Vilem Ded's avatar
Vilem Ded committed
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
<div class="box-title green">Develop data science skills</div>
<div class="content">

  * Develop good data management and analysis habits.
  * Start coding your analysis within Excel.
  * Make yourself familiar with a statistics environment such as R, Python or Matlab
  * No need to learn a high level programming language such as C++ or Java.

</div>
</div>

</div>



# Table
<div style="position:absolute">
"Tabular format of data"

### Header

 * one line!
 * **good** names of columns
 
### Rows
 * represent observations/entities
 
### Columns
  * represent property of the observations
  * one data type
</div>
<div style="left:50%; position:relative; top:-2em">
<img src="slides/img/excel_data-sheet.png" width="600px">
<div class="fragment" data-fragment-index="3" style="position:absolute">
<img src="slides/img/excel_analyses-sheet.jpeg" width="600px"><br>
</div>
<div class="fragment" data-fragment-index="4" style="position:relative">
<img src="slides/img/red-cross.png" width="600px"><br>
</div>
</div>