From 98e1c6b02677cfcf2fad71c842f79dba30707f86 Mon Sep 17 00:00:00 2001
From: Natalie Elphick
Date: Tue, 3 Dec 2024 17:11:42 -0800
Subject: [PATCH] update for 2024, closes #18

---
 docs/Working_on_Wynton_Part_1.html | 138 ++---
 docs/Working_on_Wynton_Part_2.html | 491 +++++++++++-------
 .../Working_on_Wynton_Part_1.Rmd | 111 ++--
 .../Working_on_Wynton_Part_2.Rmd | 274 +++++-----
 working-on-wynton-hpc/renv.lock | 2 +-
 .../file_system_node_relationship.png | Bin 0 -> 76026 bytes
 6 files changed, 543 insertions(+), 473 deletions(-)
 create mode 100644 working-on-wynton-hpc/slide_materials/file_system_node_relationship.png

diff --git a/docs/Working_on_Wynton_Part_1.html b/docs/Working_on_Wynton_Part_1.html
index 320d301..580573f 100644
--- a/docs/Working_on_Wynton_Part_1.html
+++ b/docs/Working_on_Wynton_Part_1.html
@@ -574,7 +574,7 @@ document.addEventListener('DOMContentLoaded', function(e) {

Working on Wynton

Part 1

Natalie Elphick

-

April 15th, 2024

+

December 5th, 2024

@@ -591,10 +591,8 @@ document.addEventListener('DOMContentLoaded', function(e) {

TAs:

      Alex Pico
      Bioinformatics Core Director
-      Ayushi Agrawal
-      Bioinformatician III
-      Min-Gyoung Shin
-      Bioinformatician III

+      Michela Traglia
+      Senior Statistician

Target Audience

@@ -626,6 +624,10 @@ a fast local network

HPC Diagram

+
+

HPC File System

+

HPC File System

+

Wynton

Names:

log1, log2 and plog1 (for PHI users)

@@ -738,78 +738,6 @@ container

Storage

-
-
-

The File System

-
    -
  • A file system how information is stored and retrieved on a computer -
      -
    • Consists of files and directories
    • -
  • -
  • A local file system is function of the operating system and only -accessible from a single computer
  • -
  • A shared file system is accessible from multiple computers
  • -
-
-
-

BeeGFS

-
    -
  • Wynton uses a parallel shared file system called BeeGFS -
      -
    • The files are stored as “chunks” spread across many different -servers
    • -
  • -
  • BeeGFS has multiple services that work together to manage the file -system -
      -
    • Storage (stores the chunks)
    • -
    • Metadata (tracks the chunks and information about their file)
    • -
    • Management (tracks all of the services)
    • -
    • Client (provides linux access to the file system)
    • -
  • -
-
-
-

BeeGFS - Advantages

-
    -
  • High throughput
  • -
  • Redundancy can be built in by mirroring services
  • -
  • Adding new storage is fast and does not require downtime
  • -
-
-
-

BeeGFS - Caveats

-
    -
  • For any client node, performance is limited by the network bandwidth -of that node
  • -
  • Network latency becomes extremely important for all metadata -requests
  • -
  • Certain input/output patterns can be problematic
  • -
-
-
-

BeeGFS - I/O patterns

-
    -
  • Anything that requires lots of metadata operations can feel slow -
      -
    • e.g: lots of writes to the same directory and lots of file lookups -and directory searches (conda)
    • -
  • -
  • Keep the number of reads and writes to a single directory to a -reasonable number
  • -
-
-
-

BeeGFS - Takehome Message

-
    -
  • Prefer fewer, large files over many small ones
  • -
  • Distribute reading and writing over several directories
  • -
  • Use local scratch (/scratch) when possible
  • -
  • Don’t include anything in /wynton in your default -LD_LIBRARY_PATH
  • -
  • If using conda, putting the conda application inside a Apptainer -(formerly singularity) container will result in better performance
  • -

Storage

@@ -1009,20 +937,36 @@ from source
[alice@dev1 ~]$ mkdir -p "/scratch/$USER"
 [alice@dev1 ~]$ cd "/scratch/$USER"
-[alice@dev1 alice]$ wget https://github.com/samtools/samtools/releases/download/1.19.2/samtools-1.19.2.tar.bz2
-[alice@dev1 alice]$ tar -x -f samtools-1.19.2.tar.bz2
+[alice@dev1 alice]$ wget https://github.com/samtools/samtools/releases/download/1.21/samtools-1.21.tar.bz2
+[alice@dev1 alice]$ tar -x -f samtools-1.21.tar.bz2
  1. Create install location and configure
-
[alice@dev1 ~]$ mkdir -p $HOME/software/samtools-1.14
-[alice@dev1 ~]$ cd samtools-1.19.2
-[alice@dev1 ~]$ ./configure --prefix=$HOME/software/samtools-1.14
+
[alice@dev1 ~]$ mkdir -p $HOME/software/samtools-1.21
+[alice@dev1 ~]$ cd samtools-1.21
+[alice@dev1 ~]$ ./configure --prefix=$HOME/software/samtools-1.21
  1. Build and install
[alice@dev1 ~]$ make
 [alice@dev1 ~]$ make install
+
+

Install Samtools from Source

+
    +
  1. Add to PATH
  2. +
+
[alice@dev1 ~]$ echo "export PATH=$HOME/software/samtools-1.21/bin:\$PATH" >> $HOME/.bashrc
+[alice@dev1 ~]$ source $HOME/.bashrc
+
    +
  1. Test Installation
  2. +
+
[alice@dev1 ~]$ samtools --help
+
Program: samtools (Tools for alignments in the SAM format)
+Version: 1.21 (using htslib 1.21)
+
+Usage:   samtools <command> [options]
+
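The "Add to PATH" step above appends a line to `$HOME/.bashrc` each time it is run, so repeating the install leaves duplicate entries. A minimal sketch of one way to avoid that, assuming Bash and a hypothetical helper named `add_to_path_once` (not part of the workshop materials or of samtools):

```shell
#!/usr/bin/env bash
# Hypothetical helper (an illustration, not from the slides):
# append "export PATH=<bin_dir>:$PATH" to an rc file only if that
# exact line is not already present.
add_to_path_once() {
    local bin_dir="$1"
    local rc_file="${2:-$HOME/.bashrc}"
    # \$PATH stays literal so it is expanded when the rc file is sourced.
    local line="export PATH=${bin_dir}:\$PATH"
    # grep -q: quiet, -x: match the whole line, -F: fixed string (no regex).
    # A missing rc file makes grep fail, so the line is appended and the
    # file is created.
    if ! grep -qxF -- "$line" "$rc_file" 2>/dev/null; then
        printf '%s\n' "$line" >> "$rc_file"
    fi
}
```

With this sketch, `add_to_path_once "$HOME/software/samtools-1.21/bin"` followed by `source $HOME/.bashrc` would stand in for the `echo ... >> $HOME/.bashrc` line, and running it a second time changes nothing.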

Install Nextflow

    @@ -1065,12 +1009,13 @@ Rocky 8 Linux

    Definitions

      -
    • Containers: An isolated environment for running -software that is created from an image file, preventing -conflicts with the host system.
    • -
    • Images: An ordered collection of root filesystem -changes that contain all necessary dependencies, ensuring software run -identically across various computing platforms.
    • +
    • Containers: An isolated environment for running +software that avoids conflicts with the host system. Containers are +stored, shared and executed as image files with a .sif +extension.
    • +
    • Images: Built from definition files (or +Dockerfiles), which are sets of instructions that you specify for your +environment.
    @@ -1150,20 +1095,13 @@ run:

    Upcoming Data Science Training Program Workshops

    -

    Introduction -to Linear Mixed Effects Models
    -April 25-April 26, 2024 1-3pm PDT

    -

    Single -Cell RNA-Seq Data Analysis
    -April 29-April 30, 2024 9am-4pm PDT

    -

    Single -Cell ATAC-Seq Data Analysis Part 1
    -May 6-May 7, 2024 1-4pm PDT

    +

    This is our last workshop for 2024. Please check the link below for +future workshop dates.

    Complete Schedule

diff --git a/docs/Working_on_Wynton_Part_2.html b/docs/Working_on_Wynton_Part_2.html
index 1984afa..4022ae4 100644
--- a/docs/Working_on_Wynton_Part_2.html
+++ b/docs/Working_on_Wynton_Part_2.html
@@ -45,6 +45,69 @@ span.underline{text-decoration: underline;}
 div.column{display: inline-block; vertical-align: top; width: 50%;}
 div.hanging-indent{margin-left: 1.5em; text-indent: -1.5em;}
 ul.task-list{list-style: none;}
+pre > code.sourceCode { white-space: pre; position: relative; }
+pre > code.sourceCode > span { line-height: 1.25; }
+pre > code.sourceCode > span:empty { height: 1.2em; }
+.sourceCode { overflow: visible; }
+code.sourceCode > span { color: inherit; text-decoration: inherit; }
+div.sourceCode { margin: 1em 0; }
+pre.sourceCode { margin: 0; }
+@media screen {
+div.sourceCode { overflow: auto; }
+}
+@media print {
+pre > code.sourceCode { white-space: pre-wrap; }
+pre > code.sourceCode > span { display: inline-block; text-indent: -5em; padding-left: 5em; }
+}
+pre.numberSource code
+{ counter-reset: source-line 0; }
+pre.numberSource code > span
+{ position: relative; left: -4em; counter-increment: source-line; }
+pre.numberSource code > span > a:first-child::before
+{ content: counter(source-line);
+position: relative; left: -1em; text-align: right; vertical-align: baseline;
+border: none; display: inline-block;
+-webkit-touch-callout: none; -webkit-user-select: none;
+-khtml-user-select: none; -moz-user-select: none;
+-ms-user-select: none; user-select: none;
+padding: 0 4px; width: 4em;
+color: #aaaaaa;
+}
+pre.numberSource { margin-left: 3em; border-left: 1px solid #aaaaaa; padding-left: 4px; }
+div.sourceCode
+{ }
+@media screen {
+pre > code.sourceCode > span > a:first-child::before { text-decoration: underline; }
+}
+code span.al { color: #ff0000; font-weight: bold; }
+code span.an { color: #60a0b0; font-weight: bold; font-style: italic; }
+code span.at { color: #7d9029; }
+code span.bn { color: #40a070; }
+code span.bu { color: #008000; }
+code span.cf { color: #007020; font-weight: bold; }
+code span.ch { color: #4070a0; }
+code span.cn { color: #880000; }
+code span.co { color: #60a0b0; font-style: italic; }
+code span.cv { color: #60a0b0; font-weight: bold; font-style: italic; }
+code span.do { color: #ba2121; font-style: italic; }
+code span.dt { color: #902000; }
+code span.dv { color: #40a070; }
+code span.er { color: #ff0000; font-weight: bold; }
+code span.ex { }
+code span.fl { color: #40a070; }
+code span.fu { color: #06287e; }
+code span.im { color: #008000; font-weight: bold; }
+code span.in { color: #60a0b0; font-weight: bold; font-style: italic; }
+code span.kw { color: #007020; font-weight: bold; }
+code span.op { color: #666666; }
+code span.ot { color: #007020; }
+code span.pp { color: #bc7a00; }
+code span.sc { color: #4070a0; }
+code span.ss { color: #bb6688; }
+code span.st { color: #4070a0; }
+code span.va { color: #19177c; }
+code span.vs { color: #4070a0; }
+code span.wa { color: #60a0b0; font-weight: bold; font-style: italic; }