This course is designed to meet the growing need for skilled professionals in Data Science and Big Data technologies.
Syllabus
- Introduction to Data Science
- Introduction to Hadoop
- Working with HDFS and Hive
- Introduction to Apache Spark 2
- Reading and writing data
- Inspecting data quality
- Cleansing and transforming data
- Summarizing and grouping data
- Combining, splitting and reshaping data
- Exploring data
- Configuring, monitoring and troubleshooting Spark applications
- Overview of machine learning in Spark MLlib
- Extracting, transforming and selecting features
- Building and evaluating regression, classification and clustering models
- Cross-validating models and tuning hyperparameters
- Building machine learning pipelines
- Deploying machine learning models