Big Data

Learn how to find insight within data to make better business decisions.


Data Courses

Big Data Modules:

Big Data Analytics: This module covers data analytics in a distributed computing environment using Apache Spark.

Big Data Systems: This module covers distributed computing environments for “big data.” Big data systems are platforms for parallel computation or cluster computing. Topics include shared memory, map-reduce, clustering, concurrency, task-parallel versus data-parallel processing, database integration, multi-threading, and networked file systems.

Blockchain: This module is an introduction to the basics of Blockchain and Distributed Ledger technology.

Cloud-Scale Machine Learning: Development 2021

Data Science Architecture: This module presents a general overview of frameworks and orchestration schemes for big data system processing, storage, and reporting.

Data Science Infrastructure: Development 2021

Data Warehousing (OLAP, OLTP, MDX, SSAS): Development 2021

Dimensional Reduction Methods (High Dimensional Data): This module explains what dimensionality reduction means and why it is used. It covers several dimensionality reduction methods, including linear discriminant analysis, principal component analysis, independent component analysis, factor analysis, and singular value decomposition. The module shows implementation examples using the Python scikit-learn library.

HADOOP, Fundamentals: Development 2021

HDFS GDFS, Fundamentals: Development 2021

Internet of Things (IoT), Fundamentals: This module introduces students to the Internet of Things. Students will learn the definition of IoT as well as its building blocks and its human components.

Logistic Regression: This module covers estimating a logistic regression model, testing hypotheses about the model's coefficients, and interpreting the results of a logistic regression analysis.

MR Fundamentals: Development 2021

NoSQL Databases: This module provides an overview of the generally accepted techniques used with NoSQL databases. NoSQL is defined as either “Not Structured Query Language” or “Not only SQL.”

Resilient Distributed Datasets (RDDs), Fundamentals: Development 2021

Resampling Methods: This module covers various resampling methods using R.

Shrinkage Methods: Development 2021

SPARK, Fundamentals: Development 2021



Access:

1.) Click below to be redirected to D2L and sign in with your StarID and password. Then select Self-Registration and choose from the ITCOE courses listed.

2.) If you are external to the Minnesota State System, please complete the Interest Form below.

Click Here to Go to D2L