Course Outline

Day One: Language Basics

  • Course Introduction
  • About Data Science
    • Definition of Data Science
    • Process of Conducting Data Science
  • Introducing the R Language for Government Applications
  • Variables and Types
  • Control Structures (Loops / Conditionals)
  • Scalars, Vectors, and Matrices in R
    • Defining R Vectors
    • Matrices
  • String and Text Manipulation
    • Character Data Type
    • File Input/Output Operations
  • Lists
  • Functions
    • Introduction to Functions
    • Closures
    • lapply/sapply Functions
  • DataFrames
  • Labs for All Sections

Day Two: Intermediate R Programming

  • DataFrames and File Input/Output
  • Reading Data from Files
  • Data Preparation
  • Built-in Datasets
  • Visualization
    • Graphics Package
    • plot() / barplot() / hist() / boxplot() / Scatter Plot
    • Heat Map
    • ggplot2 Package (qplot(), ggplot())
  • Data Exploration with dplyr
  • Labs for All Sections

Day Three: Advanced Programming with R

  • Statistical Modeling with R
    • Statistical Functions
    • Handling Missing Values (NA)
    • Distributions (Binomial, Poisson, Normal)
  • Regression Analysis
    • Introduction to Linear Regression
  • Recommendations
  • Text Processing (tm Package / Wordclouds)
  • Clustering Techniques
    • Introduction to Clustering
    • K-Means Algorithm
  • Classification Methods
    • Introduction to Classification
    • Naive Bayes
    • Decision Trees
    • Training with the caret Package
    • Evaluating Algorithms
  • R and Big Data Integration
    • Connecting R to Databases
    • Overview of the Big Data Ecosystem
  • Labs for All Sections

Requirements

  • A basic programming background is preferred.

Setup

  • A modern laptop for government use.
  • The latest versions of R and RStudio installed.

Duration

  • 21 hours
