Sail(Data Scientist)
Computational data scientists analyze, process, and model data, then interpret the results to create actionable plans for companies and other organizations. The goal of this course is to develop the skills needed to become a computational data scientist.
Course Description:
Learners will gain knowledge and develop hands-on experience solving real-world problems in the field of computational data science. The course systematically covers the entire data science project lifecycle, starting with problem analysis, problem representation, and solution vision. First, learners practice gathering requirements and analyzing and exploring the domain. Next, they work through data gathering, wrangling, and preparation. Several projects expose learners to exploratory data analysis. The focus then switches to building machine learning models. Learners also experiment with model deployment and comparison. Finally, learners practice evaluating deployed models and optimizing the evaluation.
Prerequisites:
- Practical Programming with Python
- Cloud Administrator
- AI User
Duration:
- 8 weeks per quarter
- 15 weeks per semester
Learning Objectives:
Learners who complete the Data Scientist course should be able to:
- Define analytic requirements and develop appropriate questions to guide the solution design process.
- Design a data gathering plan that incorporates principles of data governance and sovereignty to ensure usability, integrity, security and availability of data.
- Use univariate and multivariate graphical and non-graphical techniques to identify trends, patterns and outliers in large datasets.
- Build and deploy models using appropriate analytic algorithms (such as linear and logistic regression, k-nearest neighbors, naive Bayes, k-means, and hierarchical clustering, among others) to gain understanding from data, make predictions that solve business problems, and inform decision making.
- Assess the goodness of fit between a model and data, using evaluation metrics and cross-validation frameworks to evaluate predictive models.
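
To illustrate the last two objectives, here is a minimal sketch of k-fold cross-validation applied to a k-nearest neighbors classifier, written in plain Python with no external libraries. It is not taken from the course materials: the function names (knn_predict, k_fold_accuracy), the toy dataset, and the choice of accuracy as the evaluation metric are all assumptions made for illustration.

```python
from collections import Counter
import math
import random


def knn_predict(train_X, train_y, x, k=3):
    """Predict the label of x by majority vote among its k nearest training points."""
    nearest = sorted(range(len(train_X)), key=lambda i: math.dist(train_X[i], x))[:k]
    votes = Counter(train_y[i] for i in nearest)
    return votes.most_common(1)[0][0]


def k_fold_accuracy(X, y, n_folds=5, k=3, seed=0):
    """Return the accuracy of k-NN on each held-out fold of an n_folds split."""
    indices = list(range(len(X)))
    random.Random(seed).shuffle(indices)  # shuffle so folds mix both classes
    folds = [indices[i::n_folds] for i in range(n_folds)]

    scores = []
    for held_out in folds:
        held = set(held_out)
        train = [i for i in indices if i not in held]
        train_X = [X[i] for i in train]
        train_y = [y[i] for i in train]
        # Score only on points the model has not seen during "training".
        correct = sum(knn_predict(train_X, train_y, X[i], k) == y[i] for i in held_out)
        scores.append(correct / len(held_out))
    return scores


# Hypothetical toy data: two well-separated groups of 2-D points.
X = [(0.1 * i, 0.1 * (i % 3)) for i in range(20)]
X += [(5 + 0.1 * i, 5 + 0.1 * (i % 3)) for i in range(20)]
y = [0] * 20 + [1] * 20

scores = k_fold_accuracy(X, y)
mean_accuracy = sum(scores) / len(scores)
print("per-fold accuracy:", scores)
print("mean accuracy:", mean_accuracy)
```

Averaging the per-fold scores gives a less optimistic estimate than scoring a model on the same data it was fitted to, which is the main reason cross-validation is used to compare predictive models. In practice a library implementation would usually replace this hand-written loop.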