Course Outline

Statistics & Probabilistic Programming in Julia

Basic Statistics

  • Statistics
    • Summary statistics using the Statistics package
  • Distributions and StatsBase packages
    • Univariate and multivariate distributions
    • Moments
    • Probability functions
    • Sampling and random number generation (RNG)
    • Histograms
    • Maximum likelihood estimation
    • Product, truncated, and censored distributions
    • Robust statistics
    • Correlation and covariance

DataFrames

(DataFrames package)

  • Data input/output (I/O)
  • Creating data frames
  • Data types, including categorical and missing data
  • Sorting and joining data
  • Reshaping and pivoting data

Hypothesis Testing

(HypothesisTests package)

  • Principles of hypothesis testing
  • Chi-squared test
  • Z-test and t-test
  • F-test
  • Fisher's exact test
  • Analysis of variance (ANOVA)
  • Tests for normality
  • Kolmogorov-Smirnov test
  • Hotelling's T-squared test

Regression & Survival Analysis

(GLM and Survival packages)

  • Principles of linear regression and the exponential family
  • Linear regression
  • Generalized linear models (GLMs)
    • Logistic regression
    • Poisson regression
    • Gamma regression
    • Other GLM models
  • Survival analysis
    • Events
    • Kaplan-Meier estimator
    • Nelson-Aalen estimator
    • Cox proportional hazards model

Distances

(Distances package)

  • Definition of a distance metric
  • Euclidean distance
  • Cityblock (Manhattan) distance
  • Cosine similarity
  • Correlation distance
  • Mahalanobis distance
  • Hamming distance
  • Mean absolute deviation (MAD)
  • Root mean squared deviation (RMSD)
  • Mean squared deviation

Multivariate Statistics

(MultivariateStats, Lasso, and Loess packages)

  • Ridge regression
  • Lasso regression
  • Local regression (Loess)
  • Linear discriminant analysis
  • Principal component analysis (PCA)
    • Linear PCA
    • Kernel PCA
    • Probabilistic PCA
    • Independent component analysis (ICA)
  • Principal component regression (PCR)
  • Factor analysis
  • Canonical correlation analysis
  • Multidimensional scaling

Clustering

(Clustering package)

  • K-means clustering
  • K-medoids clustering
  • Density-based spatial clustering of applications with noise (DBSCAN)
  • Hierarchical clustering
  • Markov cluster algorithm (MCL)
  • Fuzzy C-means clustering

Bayesian Statistics & Probabilistic Programming

(Turing package)

  • Markov chain Monte Carlo (MCMC)
  • Hamiltonian Monte Carlo
  • Gaussian mixture models
  • Bayesian linear regression
  • Bayesian exponential family regression
  • Bayesian neural networks
  • Hidden Markov models
  • Particle filtering
  • Variational inference

Requirements

This course is intended for participants who already have a background in data science and statistics, particularly those working in government roles.
Duration: 21 hours