Commit 7dd440ee authored by Laurent Heirendt

Merge branch '2022_02_22_it101_dm' into 'develop'

small changes, new slide about physical security

See merge request !114
parents da58170d 7b503290
Pipeline #53888 passed with stages
in 4 minutes and 45 seconds
../../../../2021/2021-07-27_IT101-DM/slides/img/red-cross.png
\ No newline at end of file
../../../../2021/2021-07-27_IT101-DM/slides/img/reproducibility_nature.png
\ No newline at end of file
../../../../2021/2021-07-27_IT101-DM/slides/img/rudi_balling.jpg
\ No newline at end of file
../../../../2021/2021-07-27_IT101-DM/slides/img/scripts/
\ No newline at end of file
../../../../2021/2021-07-27_IT101-DM/slides/img/undraw_secure_server_s9u8.png
\ No newline at end of file
../../../../2021/2021-07-27_IT101-DM/slides/img/wordcloud.png
\ No newline at end of file
# IT101 - Working with computers
<br>IT101 - Working with computers<br>
## Feb 22nd, 2022
<div style="top: 6em; left: 0%; position: absolute;">
<img src="theme/img/lcsb_bg.png">
</div>
<div style="top: 5em; left: 60%; position: absolute;">
<img src="slides/img/r3-training-logo.png" height="200px">
<br><br><br><br>
<h3></h3>
<br><br><br>
<h4>
Nene Barry/Vilem Ded<br>
Data Steward<br>
nene.barry@uni.lu/lcsb-datastewards@uni.lu<br>
<i>Luxembourg Centre for Systems Biomedicine</i>
</h4>
</div>
# Data housekeeping
## Available data storage
<div class='fragment' style="position:absolute">
<img src="slides/img/LCSB_storages_full.png" height="750px">
</div>
<div class='fragment' style="position:relative">
<img src="slides/img/LCSB_storages_personal-crossed.png" height="750px">
<div style="position:absolute;left:65%;top:60%">
* Unless consortium/project has formally agreed to use a secure commercial cloud
</div>
</div>
<div style="position:absolute; width:45%; left:50%; top:28em; text-align:right">
<a href="https://howto.lcsb.uni.lu/?policies:LCSB-POL-BIC-02" style="color:grey; font-size:0.8em;">Data Storage and Backup Policy</a>
</div>
# Data ingestion: Transfer and Integrity
* When sending data: <font color="red">Do not use email; use secure platforms (Cloud, Aspera, Atlas share...)!</font>
<div class="fragment">
Data can be corrupted:
* (non-)malicious modification
* faulty file transfer
* disk corruption
</div>
<div class="fragment">
### Solution
* disable write access to the source data
* generate checksums!
<div style="position:absolute;left:40%;top:30%">
<img src="slides/img/checksum.png" width="500px">
</div>
</div>
<div class="fragment" style="position:relative; left:0%">
## When to generate checksums?
* before data transfer
- new dataset from collaborator
- upload to remote repository
* long term storage
- master version of dataset
- snapshot of data for publication
</div>
<div style="position:absolute; width:45%; left:50%; top:28em; text-align:right">
<a href="https://howto.lcsb.uni.lu/?policies:LCSB-POL-BIC-02" style="color:grey; font-size:0.8em;">Data Storage and Backup Policy</a>
</div>
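As an illustration of generating and checking checksums, here is a minimal sketch using Python's standard `hashlib` module. The choice of SHA-256 and the function names `sha256_of_file` and `verify_checksum` are assumptions made for this example; the slides do not prescribe a specific algorithm or tool.

```python
import hashlib


def sha256_of_file(path, chunk_size=1024 * 1024):
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # Read fixed-size chunks so large datasets do not need to fit in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_checksum(path, expected_hex):
    """Return True if the file's current digest matches the recorded one."""
    return sha256_of_file(path) == expected_hex.strip().lower()
```

A typical use would be to record the digest before a transfer and call `verify_checksum` on the receiving side; a mismatch indicates that the file changed in transit or in storage.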
# Data ingestion/Integrity
## Encryption
<div class='fragment' style="position:relative;left:25%;top:60%">
<img align="middle" height="300px" src="slides/img/encryption.png">
</div>
<div class='fragment'>
* Guaranteed confidentiality
</div>
<div class='fragment'>
* Encryption keys need to be kept safe
</div>
<div class='fragment'>
* <font color="red">Losing your encryption key means losing your data!</font>
</div>
<div class='fragment'>
* Make an off-site backup of your data
</div>
# Introduction
<div class="fragment" style="position:absolute">
## Learning objectives
* How to manage your data
* How to look at and analyze your data
* Solving issues with computers
* Reproducibility in the research data life cycle
</div>
<div class="fragment" style="position:relative;top:80%;left:60%">
## Pertains to practically all people at LCSB
* Scientists
* PhD candidates
* Technicians
* Administrators
</div>
<center>
<img height="450px" src="slides/img/wordcloud.png"><br>
</center>
\ No newline at end of file
[
{ "filename": "index.md" },
{ "filename": "introduction.md" },
{ "filename": "access_management.md" },
{ "filename": "data-introduction.md" },
{ "filename": "data_flow.md" },
{ "filename": "ingestion.md" },
{ "filename": "storage_setup.md" },
{ "filename": "physical_security.md" },
{ "filename": "data-housekeeping.md" },
{ "filename": "howtos.md" },
{ "filename": "reproducibility.md" },
{ "filename": "code_versioning.md" },
{ "filename": "visualization.md" },
{ "filename": "data_life_cycle.md" },
{ "filename": "problem_solving.md" },
{ "filename": "fair-principles.md" },
{ "filename": "r3_group.md" },
{ "filename": "thanks.md" }
]
\ No newline at end of file
## Overview
0. Introduction - learning objectives + targeted audience
1. Data workflow
1. Ingestion:
* receiving/sending/sharing data
* file naming
* checksums
* backup
1. Making data tidy
* what is a table
*
1. Learning to code workflows and analyses - Excel files, coding
1. Code versioning and reproducibility
1. Visualization
* see the data
1. Problem solving
* guide
* rubberducking
* Google for help
* oracle
1. R3 team
1. Acknowledgment
1. Data minimization
# Physical Security
<div >
<center>"*Physical security describes security measures that are designed to deny unauthorized access to facilities, equipment and resources and to protect personnel and property from damage or harm (such as espionage, theft, or terrorist attacks)*"</center>
<center> <img height="230px" src="slides/img/physical_security.jpg"> </center>
<div style="position:absolute;top:30%;left:2%">
## LCSB offices
* Rouden Eck offices are locked by default
* Technical measures exist to individually control access to the building
* Physical access is limited to a minimal set of authorized personnel
* Access badges must be returned when a personnel contract is terminated
* Access to the data center requires approval (CIO)
* Visitor and external personnel access is monitored
</div>
<div style="position:relative;top:80%;left:60%">
## Home office: new security challenges
* Separate your work life from your home life
* Secure your home office
* Secure your home router
* Use a VPN to access university applications
* Encrypt your devices
* Keep your operating systems up to date
* Enable automatic locking
</div>
</div>
# Problem solving
A guide for solving computing issues
1. Express the problem
* Write down what you want to achieve
2. Search for help
* Read **FAQs**, **help pages** and the **official documentation** well before turning to Google
* Use Stack Exchange, forums and related resources (carefully)
3. Ask an expert
* Submit the problem in writing
* Make the question interesting
# Responsible and Reproducible Research (R<sup>3</sup>)
## What is R<sup>3</sup>?
A multi-faceted change management
process built on 3 pillars:
- R3 pathfinder
- R3 school
- R3 accelerator
Common link module: R3 clinic
<div style="top: -1em; left: 50%; position: absolute;">
<img src="slides/img/3pillars-full.png">
</div>
<br>
<br>
<aside class="notes">
Pathfinder - policies, finding optimal data management changes<br>
School - courses, howtos, trainings<br>
Accelerator - advanced teams and their boost/support, CI/CD setup<br>
Clinic - hands-on, meetings in groups, code review + suggestions<br>
</aside>
## R<sup>3</sup> Training
* LCSB's Monthly Data Management and Data Protection training
* ELIXIR Luxembourg's trainings <br>
https://elixir-luxembourg.org/training
* First steps with R (2021-07-27)
* Statistical tests and statistical learning for omics data (2021-09-14)
* R<sup>3</sup> school Git basics - every 4 months
<aside class="notes">
Direct newcomers to this monthly training
</aside>
# Responsible and Reproducible Research (R<sup>3</sup>)
<section data-transition="none" data-background-image="slides/img/r3-training-logo.png" data-background-size="1000px" data-background-opacity="0.1">
</section>
<div style="display:block;text-align:center;position:relative;" >
<div class="profile-container">
* Reinhard Schneider
* <img src="slides/img/R3_profile_pictures/reinhard_schneider.png">
* Head of Bioinformatics Core
</div>
<div class="profile-container">
* Pinar Alper
* <img src="slides/img/R3_profile_pictures/pinar_alper.png">
* Datasteward
</div>
<div class="profile-container">
* Yohan Yarosz
* <img src="slides/img/R3_profile_pictures/yohan_yarosz.png">
* Development
</div>
<div class="profile-container">
* Laurent Heirendt
* <img src="slides/img/R3_profile_pictures/laurent_heirendt.png">
* Git, CI
</div>
<div class="profile-container">
* Sarah Peter
* <img src="slides/img/R3_profile_pictures/sarah_peter.png">
* Infrastructure
</div>
<div class="profile-container">
* Valentin Grouès
* <img src="slides/img/R3_profile_pictures/valentine_groues.png">
* Development
</div>
<div class="profile-container">
* Vilem Ded
* <img src="slides/img/R3_profile_pictures/vilem_ded.png">
* Datasteward
</div>
<div class="profile-container">
* Noua Toukourou
* <img src="slides/img/R3_profile_pictures/noua_toukourou.png">
* Infrastructure
</div>
<div class="profile-container">
* Alexey Kolodkin
* <img src="slides/img/R3_profile_pictures/alexey_kolodkin.png">
* Datasteward
</div>
<div class="profile-container">
* Maharshi Vyas
* <img src="slides/img/R3_profile_pictures/maharshi_vyas.png">
* Infrastructure
</div>
<div class="profile-container">
* Nene Barry
* <img src="slides/img/R3_profile_pictures/nene_barry.png">
* Datasteward
</div>
<div class="profile-container">
* Karim Chaouch
* <img src="slides/img/R3_profile_pictures/karim_chaouch.png">
* Development
</div>
<div class="profile-container">
* Christophe Trefois
* <img src="slides/img/R3_profile_pictures/christophe_trefois.png">
* R<sup>3</sup> team lead
</div>
</div>
# Reproducibility
* ensures credibility
* key requirement for follow-up and collaborative studies
<div style="position:absolute">
<img src="slides/img/reproducibility_nature.png" height="650px">
</div>
<div class="fragment" style="position:relative;left:50%">
## Why is our workflow not reproducible?
Lack of provenance:
* Input data downloaded from “some website”
* Copy & paste operations
* Manual text entry
* Analysis not coded
</div>
# Reproducibility
## Learning to code workflows and analyses
<div style="display:inline-grid;grid-gap: 40px;grid-template-columns: auto auto;position:relative;left:12%">
<div class="fragment">
<div class="content-box">
<div class="box-title red">Spreadsheets alone</div>
<div class="content">
* Is great for looking at data.
* Data entry is fast.
* Analysis flow is hidden and not in focus.
</div>
</div>
<div style="text-align:center">
<img src="slides/img/excel_data-sheet.png" height="280px">
</div>
</div>
<div class="fragment">
<div class="content-box">
<div class="box-title">Coding</div>
<div class="content">
* Is great for controlling analysis
* Data is hidden.
* Flow is visible.
</div>
</div>
<img src="slides/img/code-example.png" height="280px">
</div>
</div>
<div class="content-box fragment" style="left:15%;width:60%;position:relative">
<div class="box-title green">Develop data science skills</div>
<div class="content">
* Develop good data management and analysis habits.
* Start coding your analysis within spreadsheets.
* Make yourself familiar with a statistics environment such as R, Python or MATLAB.
* No need to learn a high level programming language such as C++ or Java.
</div>
</div>
</div>
# Table
<div style="position:absolute">
"Tabular format of data"
### Header
* one line!
* **good** names of columns
### Rows
* represent observations/entities
### Columns
* represent properties of the observations
* one data type
</div>
<div style="left:50%; position:relative; top:-2em">
<img src="slides/img/excel_data-sheet.png" width="700px">
<div class="fragment" data-fragment-index="3" style="position:absolute">
<img src="slides/img/excel_analyses-sheet.jpeg" width="700px"><br>
</div>
<div class="fragment" data-fragment-index="4" style="position:relative">
<img src="slides/img/red-cross.png" width="700px"><br>
</div>
</div>
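The "one data type per column" rule can be checked automatically. The following is a minimal sketch, assuming the table is available as CSV text with a single header line; the helper names `infer_type` and `mixed_type_columns` and the two-way split into "number" and "text" are simplifications chosen for this example.

```python
import csv
import io


def infer_type(value):
    """Classify a cell as 'number' or 'text' (a deliberately simple rule)."""
    try:
        float(value)
        return "number"
    except ValueError:
        return "text"


def mixed_type_columns(csv_text):
    """Return the names of columns whose non-empty cells mix numbers and text."""
    reader = csv.DictReader(io.StringIO(csv_text))  # first line is the header
    seen = {name: set() for name in (reader.fieldnames or [])}
    for row in reader:
        for name in seen:
            cell = row[name]
            if cell not in ("", None):  # ignore empty or missing cells
                seen[name].add(infer_type(cell))
    return [name for name, types in seen.items() if len(types) > 1]
```

For example, a `weight_kg` column containing both `70.5` and `unknown` would be reported, which is a hint that missing values should be encoded consistently.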
# Storage set-up
* Download anti-virus software
* Regularly update your SW/OS
* Encrypt removable media
<div class="fragment" >
### Backup
* take care of your own backups!
* don't work on your backup copy!
* the minimum is the <b>3-2-1 backup rule</b>
<div style="position:absolute;right:10%;top:10%">
<img src="slides/img/undraw_secure_server_s9u8.png" height="750px">
</div>
<div style="position:absolute; width:45%; left:50%; top:28em; text-align:right">
<a href="https://howto.lcsb.uni.lu/?policies:LCSB-POL-BIC-02" style="color:grey; font-size:0.8em;">Data Storage and Backup Policy</a>
</div>
</div>
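The 3-2-1 backup rule is commonly stated as: at least 3 copies of the data, on at least 2 different types of storage media, with at least 1 copy off-site. The sketch below encodes that definition as a simple check; the `BackupCopy` class, its fields, and the example labels are hypothetical and only serve the illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BackupCopy:
    label: str      # hypothetical name of the copy, e.g. "laptop"
    medium: str     # type of storage medium, e.g. "ssd" or "server"
    offsite: bool   # True if stored at a different physical location


def satisfies_3_2_1(copies):
    """Check for >= 3 copies, >= 2 distinct media, and >= 1 off-site copy."""
    enough_copies = len(copies) >= 3
    enough_media = len({copy.medium for copy in copies}) >= 2
    has_offsite = any(copy.offsite for copy in copies)
    return enough_copies and enough_media and has_offsite
```

Listing the copies of a dataset explicitly in this way makes it easy to spot when, for instance, all copies sit on the same kind of medium or in the same building.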
<div class="fragment">
### Passwords
* Strong passwords
* Password manager
* Safe password exchange channels
* Expiration time on password share
</div>
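A password manager normally generates strong passwords for you. Purely as an illustration of what "strong" can mean, the sketch below uses Python's standard `secrets` module; the default length of 20 and the requirement of all four character classes are assumptions for this example, not an LCSB policy.

```python
import secrets
import string


def generate_password(length=20):
    """Generate a random password containing all four character classes."""
    if length < 4:
        raise ValueError("length must be at least 4 to fit every character class")
    alphabet = string.ascii_letters + string.digits + string.punctuation
    while True:
        # secrets.choice draws from a cryptographically secure random source.
        candidate = "".join(secrets.choice(alphabet) for _ in range(length))
        if (
            any(c.islower() for c in candidate)
            and any(c.isupper() for c in candidate)
            and any(c.isdigit() for c in candidate)
            and any(c in string.punctuation for c in candidate)
        ):
            return candidate
```

The `secrets` module is used instead of `random` because the latter is not intended for security-sensitive values.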
# Storage set-up
## Password exchange channels
<div style="position:relative">
<img src="slides/img/privateBin.png" height="350px">
</div>
<div style="position:absolute;left:65%;top:85%">
* Free service provided by LCSB at <a href="https://privatebin.lcsb.uni.lu" style="color:blue; font-size:0.8em;">privatebin.lcsb.uni.lu</a>
* **LUMS** account is required
* Set expiry period
* Can expire upon first access
* Password only accessible by sender and recipient
</div>
# Storage set-up
## Backup - Central IT/LCSB
<div style="position:relative">
<img src="slides/img/LCSB_storages_backed-up.png" height="750px">
</div>
<div style="position:absolute;left:65%;top:60%">
Server administrators take care of:
* server backups
* LCSB OwnCloud backups
* group/application server backups (not always)
</div>
# Storage set-up
## Backup - personal research data
<div style="position:relative">
<img src="slides/img/LCSB_storages_backup.png" height="750px">
</div>
<div style="position:absolute;left:55%;top:70%">
<font color="red">One version should reside on Atlas!</font>
</div>
# Thank you.<sup> </sup>
<center><img src="slides/img/r3-training-logo.png" height="200px"></center>
<br>
<br>
<br>
<br>
<center>
Contact us if you need help:
<a href="mailto:lcsb-r3@uni.lu">lcsb-r3@uni.lu</a>
</center>
<div style="position:absolute">
Links:
HowTo Cards / Policies: https://howto.lcsb.uni.lu/
Course Slides: https://courses.lcsb.uni.lu/
Internal Presentations: https://presentations.lcsb.uni.lu/
LCSB GitLab: https://gitlab.lcsb.uni.lu/
HPC: https://hpc.uni.lu/
Service Portal: https://service.uni.lu/sp
LCSB intranet: https://intranet.uni.lux
</div>
<div style="position:relative;top:1.5em;left:55%;width:45%">
Available SW and tools:
<div style="margin-left: 20px;">
SIU managed:
&ensp; - Service Portal > All Catalogs > IT > Softwares
</div>
<div style="margin-left: 20px;">
LCSB managed: