Course Outline

I. Introduction and Preliminaries

1. Overview

  • Making R more user-friendly, including available GUIs
  • RStudio
  • Related software and documentation
  • R and statistics
  • Using R interactively
  • An introductory session
  • Getting help with functions and features
  • R commands, case sensitivity, etc.
  • Recall and correction of previous commands
  • Executing commands from or diverting output to a file
  • Data permanency and removing objects
  • Good programming practice: Self-contained scripts, good readability (e.g., structured scripts, documentation, markdown)
  • Installing packages; CRAN and Bioconductor

2. Reading Data

  • TXT files (read.delim)
  • CSV files

3. Simple Manipulations; Numbers and Vectors + Arrays

  • Vectors and assignment
  • Vector arithmetic
  • Generating regular sequences
  • Logical vectors
  • Missing values
  • Character vectors
  • Index vectors; selecting and modifying subsets of a data set
  • Arrays
  • Array indexing. Subsections of an array
  • Index matrices
  • The array() function + simple operations on arrays (e.g., multiplication, transposition)
  • Other types of objects
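
To give a flavor of the topics in this unit, here is a minimal sketch in base R. The values and variable names are illustrative assumptions, not taken from the course material.

```r
# Illustrative values only; not taken from the course material.
x <- c(10.4, 5.6, NA, 3.1)        # vector assignment with a missing value
s <- seq(2, 10, by = 2)           # regular sequence: 2 4 6 8 10
ok <- !is.na(x)                   # logical vector marking non-missing entries
x_clean <- x[ok]                  # a logical index vector selects a subset

a <- array(1:6, dim = c(2, 3))    # 2 x 3 array, filled column by column
a_t <- t(a)                       # transposition gives a 3 x 2 array
elem <- a[2, 3]                   # array indexing: row 2, column 3
```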

4. Lists and Data Frames

  • Lists
  • Constructing and modifying lists
    • Concatenating lists
  • Data frames
    • Making data frames
    • Working with data frames
    • Attaching arbitrary lists
    • Managing the search path
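
The list and data frame topics above might look like the following in practice. This is a small sketch with hypothetical data; the names `person` and `df` are assumptions made for illustration.

```r
# Hypothetical example data; all names are illustrative.
person <- list(name = "Ana", scores = c(90, 85))
person$age <- 31                          # modify a list by adding a component
both <- c(person, list(city = "Oslo"))    # concatenate lists with c()

df <- data.frame(id = 1:3, height = c(1.70, 1.82, 1.65))
df$tall <- df$height > 1.75               # add a derived logical column
tall_ids <- df$id[df$tall]                # ids of the rows where tall is TRUE
```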

5. Data Manipulation

  • Selecting, subsetting observations and variables
  • Filtering, grouping
  • Recoding, transformations
  • Aggregation, combining data sets
  • Forming partitioned matrices, cbind() and rbind()
  • The concatenation function, c(), with arrays
  • Character manipulation, stringr package
  • Short introduction to grep and regexpr

6. More on Reading Data

  • XLS, XLSX files
  • readr and readxl packages
  • Data in SPSS, SAS, Stata, and other formats
  • Exporting data to txt, csv, and other formats

7. Grouping, Loops, and Conditional Execution

  • Grouped expressions
  • Control statements
  • Conditional execution: if statements
  • Repetitive execution: for loops, repeat, and while
  • Introduction to apply, lapply, sapply, tapply
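
As a rough illustration of how an explicit loop compares with the apply family, consider the sketch below. The data are made up, and only base R functions are used.

```r
values <- c(3, 8, 1, 6)

# for loop with conditional execution: sum the values greater than 2
total <- 0
for (v in values) {
  if (v > 2) {
    total <- total + v
  }
}

# sapply applies a function to each element and simplifies the result
squares <- sapply(values, function(v) v^2)

# tapply applies a function within groups defined by a factor
groups <- factor(c("a", "b", "a", "b"))
group_means <- tapply(values, groups, mean)
```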

8. Functions

  • Creating functions
  • Optional arguments and default values
  • Variable number of arguments
  • Scope and its consequences
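
The function topics above can be sketched as follows. `rescale` and `bump` are hypothetical helpers written for this outline, not functions from any package.

```r
# Hypothetical helper: rescale a vector; `center` is optional with a default.
rescale <- function(x, center = TRUE, ...) {
  if (center) {
    x <- x - mean(x, ...)   # extra arguments (e.g. na.rm) pass through `...`
  }
  x / max(abs(x), ...)
}

# Scope: assignment inside a function does not change the caller's variable.
counter <- 0
bump <- function() {
  counter <- counter + 1    # creates a local `counter`
  counter
}
bumped <- bump()
```

Calling `rescale(c(1, NA, 3), na.rm = TRUE)` forwards `na.rm` to both `mean()` and `max()`, and after `bump()` runs the global `counter` is still 0.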

9. Simple Graphics in R

  • Creating a Graph
  • Density Plots
  • Bar Plots
  • Line Charts
  • Pie Charts
  • Boxplots
  • Scatter Plots
  • Combining Plots

II. Statistical Analysis in R

1. Probability Distributions

  • R as a set of statistical tables
  • Examining the distribution of a set of data

2. Testing of Hypotheses

  • Tests about a Population Mean
  • Likelihood Ratio Test
  • One- and two-sample tests
  • Chi-Square Goodness-of-Fit Test
  • Kolmogorov-Smirnov One-Sample Statistic
  • Wilcoxon Signed-Rank Test
  • Two-Sample Test
  • Wilcoxon Rank Sum Test
  • Mann-Whitney Test
  • Kolmogorov-Smirnov Test

3. Multiple Testing of Hypotheses

  • Type I Error and FDR
  • ROC curves and AUC
  • Multiple Testing Procedures (BH, Bonferroni, etc.)
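
A minimal sketch of the Bonferroni and Benjamini-Hochberg (BH) adjustments using base R's `p.adjust()`. The p-values are made up for illustration, and 0.05 is an assumed threshold.

```r
# Made-up p-values for illustration only.
p <- c(0.001, 0.012, 0.02, 0.045, 0.30)

p_bonf <- p.adjust(p, method = "bonferroni")  # controls the family-wise error rate
p_bh   <- p.adjust(p, method = "BH")          # controls the false discovery rate

n_bonf <- sum(p_bonf < 0.05)  # rejections at 0.05 after Bonferroni
n_bh   <- sum(p_bh < 0.05)    # rejections at 0.05 after BH
```

With these values BH rejects more hypotheses than Bonferroni, which reflects its less conservative error criterion.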

4. Linear Regression Models

  • Generic functions for extracting model information
  • Updating fitted models
  • Generalized linear models
    • Families
    • The glm() function
  • Classification
    • Logistic Regression
    • Linear Discriminant Analysis
  • Unsupervised learning
    • Principal Components Analysis
    • Clustering Methods (k-means, hierarchical clustering, k-medoids)

5. Survival Analysis (survival package)

  • Survival objects in R
  • Kaplan-Meier estimate, log-rank test, parametric regression
  • Confidence bands
  • Censored (interval censored) data analysis
  • Cox PH models, constant covariates
  • Cox PH models, time-dependent covariates
  • Simulation: Model comparison (Comparing regression models)

6. Analysis of Variance

  • One-Way ANOVA
  • Two-Way ANOVA
  • MANOVA

III. Worked Problems in Bioinformatics

  • Short introduction to the limma package
  • Microarray data analysis workflow
  • Data download from GEO: https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE1397
  • Data processing (QC, normalization, differential expression)
  • Volcano plot
  • Clustering examples + heatmaps

Duration: 28 hours
