Course Outline

I. Introduction and Preliminaries

1. Overview

  • Enhancing user-friendliness of R, including available graphical user interfaces (GUIs)
  • RStudio as a development environment for government
  • Related software and documentation resources for government
  • The role of R in statistical analysis for government
  • Interactive use of R for government operations
  • A basic session guide for government users
  • Accessing help with functions and features for government
  • R commands, case sensitivity, and other syntax considerations
  • Recalling and correcting previous commands for efficient workflow in government
  • Executing commands from or diverting output to a file for documentation and record-keeping in government
  • Data permanency and object management for government
  • Best practices in programming: self-contained scripts, readability, structured scripts, documentation, and markdown for government
  • Installing packages from CRAN and Bioconductor for government applications

2. Reading Data

  • TXT files (read.delim)
  • CSV files
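
As a minimal sketch of the two import routes listed above, the example below writes two small temporary files and reads them back with base R. The file contents and column names (id, score) are invented for illustration only.

```r
# Write two small temporary files so the example is self-contained.
# The column names and values are made up for illustration.
txt_path <- tempfile(fileext = ".txt")
csv_path <- tempfile(fileext = ".csv")
writeLines(c("id\tscore", "1\t3.5", "2\t4.0"), txt_path)
writeLines(c("id,score", "1,3.5", "2,4.0"), csv_path)

# read.delim() defaults to tab-separated input with a header row.
from_txt <- read.delim(txt_path)

# read.csv() defaults to comma-separated input with a header row.
from_csv <- read.csv(csv_path)

# Inspect the column types that R inferred.
str(from_txt)
```

Both calls return a data frame, so the rest of a script does not need to know which file format the data came from.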

3. Simple Manipulations: Numbers, Vectors, and Arrays

  • Vectors and assignment operations for government data
  • Vector arithmetic in government applications
  • Generating regular sequences for government datasets
  • Logical vectors for conditional operations in government
  • Handling missing values in government data
  • Character vectors for text manipulation in government reports
  • Index vectors; selecting and modifying subsets of a dataset
  • Arrays for multidimensional data management in government
  • Array indexing: working with subsections of an array for government analysis
  • Index matrices for advanced data manipulation in government
  • The array() function and simple operations on arrays, such as multiplication and transposition, for government use
  • Other types of objects for comprehensive data handling in government
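
The vector and array topics above can be sketched in a few lines of base R. The values are arbitrary and the object names (x, a, im) are chosen only for this illustration.

```r
# Vector assignment, arithmetic, and a regular sequence.
x <- c(10, 20, NA, 40)
x_doubled <- x * 2               # arithmetic is element-wise; NA stays NA
idx <- seq(1, 10, by = 3)        # regular sequence: 1 4 7 10

# Logical vectors and missing values.
is_missing <- is.na(x)           # TRUE where a value is missing
x_mean <- mean(x, na.rm = TRUE)  # mean of the non-missing values
x[is_missing] <- 0               # logical index vector used to modify a subset

# A 2 x 3 x 4 array filled in column-major order.
a <- array(1:24, dim = c(2, 3, 4))
slice <- a[, , 2]                # subsection: the second 2 x 3 layer
slice_t <- t(slice)              # transposition gives a 3 x 2 matrix
prod_mat <- slice %*% slice_t    # matrix multiplication gives a 2 x 2 matrix

# An index matrix selects one element per row: here a[1, 1, 1] and a[2, 3, 4].
im <- cbind(c(1, 2), c(1, 3), c(1, 4))
picked <- a[im]
```

An index matrix needs one column per array dimension, which is why im has three columns here.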

4. Lists and Data Frames

  • Lists for structured data storage in government
  • Constructing and modifying lists
    • Concatenating lists for integrated data management in government
  • Data frames for tabular data representation in government
    • Making data frames from various sources for government datasets
    • Working with data frames for efficient data analysis in government
    • Attaching arbitrary lists to expand data scope in government
    • Managing the search path for seamless data access in government
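
A short sketch of the list and data frame topics above, using invented values. The names office, combined, and df are hypothetical. The example uses with() to evaluate an expression inside a data frame; this is one alternative to attach() that leaves the search path unchanged.

```r
# Construct a list, then modify it by adding a component.
office <- list(name = "Records", staff = 12L)
office$open <- TRUE

# Concatenate two lists with c().
combined <- c(office, list(region = "North"))

# Build a data frame from vectors and add a derived column.
df <- data.frame(dept = c("A", "B", "C"),
                 budget = c(100, 250, 175))
df$over_150 <- df$budget > 150

# Evaluate an expression using the data frame's columns directly.
avg_budget <- with(df, mean(budget))

# Show the current search path.
search()
```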

5. Data Manipulation

  • Selecting and subsetting observations and variables for targeted analysis in government
  • Filtering and grouping data for focused insights in government
  • Recoding and transformations for accurate data representation in government
  • Aggregation and combining datasets for comprehensive reporting in government
  • Forming partitioned matrices using cbind() and rbind() for structured data presentation in government
  • The concatenation function, c(), with arrays for flexible data manipulation in government
  • Character manipulation using the stringr package for text-based data in government
  • A short introduction to grep and regexpr for pattern matching in government
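
The manipulation steps above can be illustrated with base R alone. This is a sketch on made-up data; the stringr package named in the outline offers similar string helpers, but it is not used here so that the example runs without extra packages.

```r
# Invented example data: spending by region and year.
df <- data.frame(region = c("North", "South", "North", "South"),
                 year   = c(2022, 2022, 2023, 2023),
                 spend  = c(10, 20, 30, 40))

# Select observations (rows) and variables (columns).
recent <- subset(df, year == 2023, select = c(region, spend))

# Aggregate: total spend per region.
totals <- aggregate(spend ~ region, data = df, FUN = sum)

# Partitioned matrix: bind columns first, then add a row.
m <- rbind(cbind(1:2, 3:4), c(9, 9))

# Pattern matching on character data.
codes  <- c("DOC-001", "memo", "DOC-042")
is_doc <- grepl("^DOC-[0-9]+$", codes)   # TRUE where the whole string matches
pos    <- regexpr("[0-9]+", codes)       # start of first digit run, -1 if none
```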

6. More on Reading Data

  • XLS, XLSX files for government spreadsheets
  • The readr and readxl packages for efficient data import in government
  • SPSS, SAS, Stata, and other formats for interoperability in government
  • Exporting data to TXT, CSV, and other formats for sharing and archiving in government

7. Grouping, Loops, and Conditional Execution

  • Grouped expressions for organized data processing in government
  • Control statements for structured programming in government
  • Conditional execution: if statements for decision-making in government
  • Repetitive execution: for loops, repeat, and while for iterative tasks in government
  • An introduction to apply, lapply, sapply, tapply functions for batch processing in government

8. Functions

  • Creating functions for reusable code in government
  • Optional arguments and default values for flexible function design in government
  • Variable number of arguments for dynamic function behavior in government
  • Scope and its consequences for data integrity in government

9. Simple Graphics in R

  • Creating a graph for visual data representation in government
  • Density plots for distribution visualization in government
  • Bar plots for categorical data comparison in government
  • Line charts for trend analysis in government
  • Pie charts for proportional data representation in government
  • Boxplots for statistical summary visualization in government
  • Scatter plots for relationship exploration in government
  • Combining plots for comprehensive visual reports in government

II. Statistical Analysis in R

1. Probability Distributions

  • R as a set of statistical tables for government research
  • Examining the distribution of a dataset for informed decision-making in government
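
To illustrate "R as a set of statistical tables", the sketch below uses the d/p/q/r function families in place of printed tables and then examines a simulated sample. The sample parameters (mean 50, standard deviation 10) are arbitrary assumptions.

```r
# Quantile and cumulative probability for the standard normal.
upper <- qnorm(0.975)        # two-sided 95% critical value, about 1.96
prob  <- pnorm(upper)        # recovers 0.975

# Probability mass: exactly 3 successes in 10 fair trials.
p3 <- dbinom(3, size = 10, prob = 0.5)

# Simulate a sample and examine its distribution.
set.seed(42)
sample_data <- rnorm(200, mean = 50, sd = 10)
five_num <- fivenum(sample_data)          # min, lower hinge, median, upper hinge, max
normality <- shapiro.test(sample_data)    # Shapiro-Wilk normality test

summary(sample_data)
```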

2. Testing of Hypotheses

  • Tests about a population mean for government studies
  • Likelihood Ratio Test for model comparison in government
  • One- and two-sample tests for hypothesis verification in government
  • Chi-Square Goodness-of-Fit Test for distribution validation in government
  • Kolmogorov-Smirnov One-Sample Statistic for distribution testing in government
  • Wilcoxon Signed-Rank Test for non-parametric analysis in government
  • Two-Sample Test for comparative studies in government
  • Wilcoxon Rank Sum Test for independent samples in government
  • Mann-Whitney Test for non-parametric comparison in government
  • Kolmogorov-Smirnov Test for distribution similarity in government
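
Most of the tests listed above are available in base R's stats package. The following is a sketch on simulated data, so the numbers carry no real-world meaning. Note that wilcox.test() with two samples performs the Wilcoxon rank sum test, which is equivalent to the Mann-Whitney test.

```r
set.seed(1)
x <- rnorm(30, mean = 0.5)   # simulated sample 1
y <- rnorm(30)               # simulated sample 2

one_sample  <- t.test(x, mu = 0)        # test about a population mean
two_sample  <- t.test(x, y)             # two-sample (Welch) t-test
signed_rank <- wilcox.test(x, mu = 0)   # Wilcoxon signed-rank test
rank_sum    <- wilcox.test(x, y)        # Wilcoxon rank sum / Mann-Whitney
ks_one      <- ks.test(x, "pnorm")      # one-sample Kolmogorov-Smirnov
ks_two      <- ks.test(x, y)            # two-sample Kolmogorov-Smirnov

# Chi-square goodness of fit: observed counts against equal proportions.
gof <- chisq.test(c(18, 22, 20), p = rep(1 / 3, 3))

# Each result is an "htest" object; the p-value is one of its components.
one_sample$p.value
```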

3. Multiple Testing of Hypotheses

  • Type I Error and False Discovery Rate (FDR) for robust statistical inference in government
  • ROC curves and AUC for performance evaluation in government
  • Multiple Testing Procedures (BH, Bonferroni, etc.) for comprehensive hypothesis testing in government
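
The Bonferroni and Benjamini-Hochberg (BH) procedures named above are both implemented by p.adjust() in base R. The p-values below are invented for illustration.

```r
# Five made-up raw p-values from five hypothetical tests.
p <- c(0.001, 0.01, 0.02, 0.04, 0.30)

# Bonferroni: multiply each p-value by the number of tests, capped at 1.
bonf <- p.adjust(p, method = "bonferroni")

# Benjamini-Hochberg: controls the false discovery rate.
bh <- p.adjust(p, method = "BH")

# Compare how many hypotheses each method rejects at the 0.05 level.
n_reject_bonf <- sum(bonf <= 0.05)
n_reject_bh   <- sum(bh <= 0.05)

data.frame(raw = p, bonferroni = bonf, BH = bh)
```

In this example BH rejects more hypotheses than Bonferroni at the same threshold, which reflects that it controls a less strict error criterion.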

4. Linear Regression Models

  • Generic functions for extracting model information in government
  • Updating fitted models for iterative improvement in government
  • Generalized linear models
    • Families of distributions for diverse data types in government
    • The glm() function for generalized linear modeling in government
  • Classification techniques
    • Logistic Regression for binary outcomes in government
    • Linear Discriminant Analysis for multiclass classification in government
  • Unsupervised learning methods
    • Principal Components Analysis (PCA) for dimensionality reduction in government
    • Clustering Methods (k-means, hierarchical clustering, k-medoids) for data segmentation in government
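
A compact sketch of the modeling topics above, using the mtcars dataset that ships with R. The choice of variables is arbitrary and is meant only to show the function calls, not to suggest a sensible model.

```r
# Linear regression and generic extractor functions.
fit <- lm(mpg ~ wt, data = mtcars)
coef(fit)                    # estimated coefficients
head(residuals(fit))         # first few residuals

# Update a fitted model by adding a predictor.
fit2 <- update(fit, . ~ . + hp)

# Logistic regression via glm() with the binomial family.
logit <- glm(am ~ wt, data = mtcars, family = binomial)

# Principal components analysis on standardized variables.
pca <- prcomp(mtcars[, c("mpg", "wt", "hp")], scale. = TRUE)

# k-means clustering with three clusters on the same standardized variables.
set.seed(123)
km <- kmeans(scale(mtcars[, c("mpg", "wt", "hp")]), centers = 3)

# Hierarchical clustering on Euclidean distances.
hc <- hclust(dist(scale(mtcars[, c("mpg", "wt", "hp")])))
```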

5. Survival Analysis (survival package)

  • Survival objects in R for time-to-event analysis in government
  • Kaplan-Meier estimate, log-rank test, and parametric regression for survival data in government
  • Confidence bands for uncertainty quantification in government
  • Censored (interval censored) data analysis for handling incomplete information in government
  • Cox Proportional Hazards (PH) models with constant (time-fixed) covariates for risk assessment in government
  • Cox PH models with time-dependent covariates for dynamic risk factors in government
  • Simulation: Model comparison for evaluating different survival models in government

6. Analysis of Variance (ANOVA)

  • One-Way ANOVA for single-factor analysis in government
  • Two-Way Classification of ANOVA for multi-factor analysis in government
  • Multivariate Analysis of Variance (MANOVA) for multiple response variables in government
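
The three ANOVA variants above can be sketched with datasets bundled with R (PlantGrowth, ToothGrowth, iris). The datasets are chosen only because they are available in every R installation.

```r
# One-way ANOVA: plant weight by treatment group.
one_way <- aov(weight ~ group, data = PlantGrowth)

# Two-way ANOVA with interaction: dose is converted to a factor first.
tg <- transform(ToothGrowth, dose = factor(dose))
two_way <- aov(len ~ supp * dose, data = tg)

# MANOVA: two response variables modeled jointly by species.
mv <- manova(cbind(Sepal.Length, Petal.Length) ~ Species, data = iris)

summary(one_way)
summary(two_way)
summary(mv)
```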

III. Worked Problems in Bioinformatics

  • A short introduction to the limma package for gene expression analysis in government
  • Microarray data analysis workflow for comprehensive biological studies in government
  • Data download from GEO: https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE1397 for reproducible research in government
  • Data processing (QC, normalization, differential expression) for accurate results in government
  • Volcano plot for visualizing significant changes in gene expression in government
  • Clustering examples and heatmaps for pattern recognition in biological data for government

Duration: 28 hours