Python for Data Science
Course number: CGIPDS40
The Python for Data Science Course teaches you to master the concepts of Python programming. Through this Python Data Science training, you will gain knowledge in data analysis, Machine Learning, data visualization, web scraping, and Natural Language Processing. Upon course completion, you will master the essential tools of Data Science with Python.
What skills will you learn?
By the end of this training, you will:
• Gain an in-depth understanding of data science processes, data wrangling, data exploration, data visualization, and hypothesis building and testing, along with the basics of statistics
• Install the required Python environment and other auxiliary tools and libraries
• Understand the essential concepts of Python programming such as data types, tuples, lists, dicts, basic operators and functions
• Perform high-level mathematical computing using the NumPy package and its large library of mathematical functions
• Perform scientific and technical computing using the SciPy package and its sub-packages such as Integrate, Optimize, Statistics, IO and Weave
• Perform data analysis and manipulation using data structures and tools provided in the Pandas package
• Gain expertise in machine learning using the Scikit-Learn package
• Gain an in-depth understanding of supervised learning and unsupervised learning models such as linear regression, logistic regression, clustering, dimensionality reduction, K-NN and pipeline
• Use the Scikit-Learn package for natural language processing
• Use the matplotlib library of Python for data visualization
• Extract useful data from websites by performing web scraping using Python
• Integrate Python with Hadoop, Spark and MapReduce
Prerequisites
To get the most out of the Data Science with Python course, it is recommended that you first take the Python Basics, Math Refresher, Data Science in Real Life, and Statistics Essentials for Data Science courses. These courses are offered as free modules with this program.
Exam and Certification
No certifying body issues a Python certification. To receive our school’s certificate, however, you must:
• Complete 100% of the course
• Complete any one project out of the 14 provided in the course. You will submit the project deliverables in the student portal, which will be evaluated by our lead trainer
• Score a minimum of 60% in any one of the two simulation tests
• Pass the online exam with a minimum score of 80%
Course Content
Lesson Objective: This introductory lesson gives an overview of data science and where it is being used. It also explains the components and purpose of Python.
Topics:
• Introduction to Data Science
• Different Sectors Using Data Science
• Purpose and Components of Python
• Quiz
• Key Takeaways
Lesson Objective: This lesson on Data Analytics overview explains the data analytics process in detail and also covers data types for plotting. Exploratory Data Analysis (EDA) techniques are also covered.
Topics:
• Data Analytics Process
• Knowledge Check
• Exploratory Data Analysis (EDA)
• EDA – Quantitative Technique
• EDA – Graphical Technique
• Data Analytics Conclusion or Predictions
• Data Analytics Communication
• Data Types for Plotting
• Data Types and Plotting
• Knowledge Check
• Quiz
• Key Takeaways
Lesson Objective: This lesson introduces you to statistics and the statistical analysis process, including data distribution, dispersion, histograms, and testing.
Topics:
• Introduction to Statistics
• Statistical and Non-statistical Analysis
• Major Categories of Statistics
• Statistical Analysis Considerations
• Population and Sample
• Statistical Analysis Process
• Data Distribution
• Dispersion
• Knowledge Check
• Histogram
• Knowledge Check
• Testing
• Knowledge Check
• Correlation and Inferential Statistics
• Quiz
• Key Takeaways
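The dispersion and correlation topics in this lesson can be illustrated with a short sketch. It uses only Python's standard `statistics` module, and the sample values (`hours`, `scores`) and the helper `pearson_r` are made up for illustration; they are not taken from the course material.

```python
import statistics

# Hypothetical sample: hours studied and exam scores for five students.
hours = [1, 2, 3, 4, 5]
scores = [52, 58, 65, 70, 75]

mean_score = statistics.mean(scores)       # central tendency
stdev_score = statistics.stdev(scores)     # sample standard deviation (dispersion)
score_range = max(scores) - min(scores)    # simplest measure of dispersion


def pearson_r(xs, ys):
    """Pearson correlation coefficient, computed from its definition."""
    mx, my = statistics.mean(xs), statistics.mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sum((x - mx) ** 2 for x in xs) ** 0.5
    sy = sum((y - my) ** 2 for y in ys) ** 0.5
    return cov / (sx * sy)


r = pearson_r(hours, scores)
print(mean_score, round(stdev_score, 2), score_range, round(r, 3))
```

A value of `r` close to 1 indicates a strong positive linear relationship between the two variables in this toy sample.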
Lesson Objective: This lesson teaches you how to install Anaconda and introduces the data types, operators, and functions available in Python.
Topics:
• Anaconda
• Installation of Anaconda Python Distribution
• Data Types with Python
• Basic Operators and Functions
• Quiz
• Key Takeaways
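As a rough illustration of the data types, operators, and functions named in this lesson, the sketch below uses only built-in Python features. The variable names and the `in_stock` helper are hypothetical examples, not part of the course.

```python
# Core built-in data types.
count = 3                              # int
price = 19.99                          # float
name = "widget"                        # str
point = (2, 5)                         # tuple: immutable sequence
sizes = ["S", "M", "L"]                # list: mutable sequence
stock = {"S": 4, "M": 0, "L": 7}       # dict: key-value mapping

# Lists and dicts can be changed in place; tuples cannot.
sizes.append("XL")
stock["XL"] = 2


def in_stock(inventory):
    """Return the sizes whose quantity is greater than zero."""
    return [size for size, qty in inventory.items() if qty > 0]


total = count * price                  # arithmetic operator
available = in_stock(stock)            # function call
print(type(point).__name__, name, total, available)
```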
Lesson Objective: This lesson begins with an introduction to NumPy, then progresses to the ndarray and the mathematical functions of NumPy.
Topics:
• Introduction to NumPy
• Activity-Sequence it Right
• Demo 01-Creating and Printing an ndarray
• Knowledge Check
• Class and Attributes of ndarray
• Basic Operations
• Activity-Slice It
• Copy and Views
• Mathematical Functions of NumPy
• Assignment 01: Evaluate the datasets containing GDPs of different countries
• Demo: Assignment 01
• Assignment 02: Evaluate the datasets of Summer Olympics, 2012
• Demo: Assignment 02
• Quiz
• Key Takeaways
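The ndarray topics in this lesson (attributes, basic operations, slicing, copies versus views, and mathematical functions) might look like the following minimal sketch. The array contents are arbitrary and chosen only for illustration.

```python
import numpy as np

# Create a 3x4 ndarray holding the integers 0..11 and inspect its attributes.
arr = np.arange(12).reshape(3, 4)
print(arr.ndim, arr.shape, arr.size, arr.dtype)

# Basic operations are applied element by element.
doubled = arr * 2
col_sums = arr.sum(axis=0)          # sum down each column

# Basic slicing returns a view that shares memory with the original array;
# .copy() creates independent data.
view = arr[0, :]
copy = arr[0, :].copy()
view[0] = 100                        # also changes arr[0, 0]
copy[1] = -1                         # leaves arr untouched

# Mathematical functions operate on whole arrays at once.
roots = np.sqrt(np.array([1.0, 4.0, 9.0]))
print(arr[0], roots)
```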
Lesson Objective: This lesson will give you a detailed overview of SciPy and its sub-packages.
Topics:
• Introduction to SciPy
• SciPy Sub Package – Integration and Optimization
• Knowledge Check
• SciPy sub package
• Demo – Calculate Eigenvalues and Eigenvectors
• Knowledge Check
• SciPy Sub Package – Statistics, Weave and IO
• Assignment 01: Use SciPy to solve a linear algebra problem
• Demo: Assignment 01
• Assignment 02: Use SciPy to define 20 random variables for random values
• Demo: Assignment 02
• Quiz
• Key Takeaways
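A small sketch of the integration, optimization, and linear algebra sub-packages covered in this lesson is shown below. The functions being integrated and minimized, and the matrix `A`, are arbitrary examples chosen so the results are easy to check by hand.

```python
import numpy as np
from scipy import integrate, linalg, optimize

# Integration: the integral of x^2 from 0 to 1 is 1/3.
area, abs_err = integrate.quad(lambda x: x ** 2, 0.0, 1.0)

# Optimization: (x - 2)^2 has its minimum at x = 2.
result = optimize.minimize_scalar(lambda x: (x - 2.0) ** 2)

# Linear algebra: solve A x = b, then compute eigenvalues and eigenvectors of A.
A = np.array([[2.0, 0.0],
              [0.0, 3.0]])
b = np.array([4.0, 9.0])
x = linalg.solve(A, b)
eigenvalues, eigenvectors = linalg.eig(A)

print(area, result.x, x, eigenvalues.real)
```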
Lesson Objective: You will learn about data manipulation with Pandas in this lesson. It covers data frames, data demos, data operations, file read and write support, and SQL operations.
Topics:
• Introduction to Pandas
• Knowledge Check
• Understanding Data Frame
• View and Select Data Demo
• Missing Values
• Data Operations
• Knowledge Check
• File Read and Write Support
• Knowledge Check-Sequence it Right
• Pandas SQL Operation
• Assignment 01: Analyze the Federal Aviation Authority (FAA) dataset using Pandas
• Demo: Assignment 01
• Assignment 02: Analyze the fire department dataset provided in CSV format
• Demo: Assignment 02
• Quiz
• Key Takeaways
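The data frame, missing value, data operation, and file read/write topics above can be sketched as follows. The CSV text is invented for illustration and is not one of the course datasets; an in-memory buffer stands in for a file on disk.

```python
import io

import pandas as pd

# Hypothetical CSV data; the second row has a missing value.
csv_text = """city,year,incidents
Austin,2020,12
Austin,2021,
Dallas,2020,7
Dallas,2021,9
"""

# File read support: load CSV text into a DataFrame.
df = pd.read_csv(io.StringIO(csv_text))

# View and select data.
first_rows = df.head(2)
austin = df[df["city"] == "Austin"]

# Missing values: count them, then fill them with 0.
missing = int(df["incidents"].isna().sum())
filled = df.fillna({"incidents": 0})

# Data operations: total incidents per city.
totals = filled.groupby("city")["incidents"].sum()

# File write support: serialize the result back to CSV text.
out = io.StringIO()
totals.to_csv(out)
print(out.getvalue())
```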
Lesson Objective: This lesson covers the Machine Learning approach and how it works, along with supervised and unsupervised learning models.
Topics:
• Machine Learning Approach
• Steps 1 and 2: Understand the dataset and extract its features
• Steps 3 and 4: Identify the problem type and learning model
• How it Works
• Steps 5 and 6: Train, test, and optimize the models
• Supervised Learning Model Considerations
• Knowledge Check
• Scikit-Learn
• Knowledge Check
• Supervised Learning Models – Linear Regression
• Supervised Learning Models – Logistic Regression
• Unsupervised Learning Models
• Pipeline
• Model Persistence and Evaluation
• Knowledge Check
• Assignment 01: Evaluate a dataset to find the features or media channels used by a firm and sales figures for each channel
• Demo: Assignment 01
• Assignment 02: Analyze a dataset to find its features and response label
• Demo: Assignment 02
• Quiz
• Key Takeaways
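The train, test, and evaluate steps listed above, together with the pipeline and model persistence topics, can be sketched with Scikit-Learn as follows. The data is synthetic (a made-up relationship between spend on two media channels and sales), and `pickle` is used here as one possible persistence mechanism; the course may use a different one.

```python
import pickle

import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import r2_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic features (spend on two hypothetical media channels) and response (sales).
rng = np.random.default_rng(0)
X = rng.uniform(0, 100, size=(200, 2))
y = 3.0 * X[:, 0] + 0.5 * X[:, 1] + rng.normal(0, 1.0, size=200)

# Hold out a test set so the model is evaluated on data it has not seen.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

# A pipeline chains preprocessing and the estimator into one object.
model = Pipeline([
    ("scale", StandardScaler()),
    ("regress", LinearRegression()),
])
model.fit(X_train, y_train)

# Evaluation: R^2 on the held-out test set.
score = r2_score(y_test, model.predict(X_test))

# Persistence: serialize the fitted pipeline and restore it.
restored = pickle.loads(pickle.dumps(model))
print(round(score, 4))
```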
Lesson Objective: This lesson covers Natural Language Processing, including an overview, applications, libraries, and model training with Scikit-Learn.
Topics:
• NLP Overview
• NLP Applications
• Knowledge Check
• NLP Libraries-Scikit
• Extraction Considerations
• Scikit Learn-Model Training and Grid Search
• Assignment 01: Analyze a given spam collection dataset
• Demo: Assignment 01
• Assignment 02: Analyze the sentiment dataset using NLP
• Demo: Assignment 02
• Quiz
• Key Takeaways
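The model training and grid search topics in this lesson could be sketched as a small spam classifier in Scikit-Learn. The eight messages below are invented and far too few for a real model; the choice of `CountVectorizer` with `MultinomialNB` and the parameter grid are assumptions made for illustration, not the course's prescribed setup.

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.model_selection import GridSearchCV
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import Pipeline

# Tiny made-up corpus: four spam and four non-spam ("ham") messages.
messages = [
    "win a free prize now",
    "free cash offer click now",
    "claim your free prize today",
    "win cash now click here",
    "are we meeting for lunch today",
    "please send the report by friday",
    "see you at the meeting tomorrow",
    "can you call me after lunch",
]
labels = ["spam"] * 4 + ["ham"] * 4

# Feature extraction (bag of words) followed by a Naive Bayes classifier.
pipeline = Pipeline([
    ("vect", CountVectorizer()),
    ("clf", MultinomialNB()),
])

# Grid search tries every parameter combination with 2-fold cross-validation.
param_grid = {
    "vect__ngram_range": [(1, 1), (1, 2)],
    "clf__alpha": [0.1, 1.0],
}
search = GridSearchCV(pipeline, param_grid, cv=2)
search.fit(messages, labels)

prediction = search.predict(["win a free prize"])[0]
print(search.best_params_, prediction)
```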
Lesson Objective: This lesson teaches you to visualize and plot data in Python using matplotlib.
Topics:
• Introduction to Data Visualization
• Knowledge Check
• Line Properties
• (x, y) Plot and Subplots
• Knowledge Check
• Types of Plots
• Assignment 01: Analyze the “auto mpg data” and draw a pair plot using seaborn library for mpg, weight, and origin
• Demo: Assignment 01
• Assignment 02: Draw a pie chart to visualize a dataset
• Demo: Assignment 02
• Quiz
• Key Takeaways
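Line properties, (x, y) plots, subplots, and a pie chart can be combined in one short matplotlib sketch. The data values are arbitrary, and the non-interactive `Agg` backend with an in-memory buffer is an assumption made so the example runs without a display or a file on disk.

```python
import io

import matplotlib

matplotlib.use("Agg")  # non-interactive backend; no display needed
import matplotlib.pyplot as plt

x = [0, 1, 2, 3, 4]
y = [v ** 2 for v in x]

# Two subplots side by side: a line plot and a pie chart.
fig, (ax_line, ax_pie) = plt.subplots(1, 2, figsize=(8, 3))

# (x, y) plot with explicit line properties.
(line,) = ax_line.plot(
    x, y, color="tab:blue", linestyle="--", linewidth=2, marker="o", label="y = x^2"
)
ax_line.set_xlabel("x")
ax_line.set_ylabel("y")
ax_line.set_title("Line plot")
ax_line.legend()

# Pie chart of three made-up category shares.
ax_pie.pie([50, 30, 20], labels=["A", "B", "C"], autopct="%1.0f%%")
ax_pie.set_title("Pie chart")

# Render the figure to PNG bytes in memory.
fig.tight_layout()
buffer = io.BytesIO()
fig.savefig(buffer, format="png")
plt.close(fig)
```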
Lesson Objective: This lesson covers web scraping and parsing, including how to search, navigate, and modify a parse tree.
Topics:
• Web Scraping and Parsing
• Knowledge Check
• Understanding and Searching the Tree
• Navigating options
• Demo 03 – Navigating a Tree
• Knowledge Check
• Modifying the Tree
• Parsing and Printing the Document
• Assignment 01: Scrape the sample website page to perform some tasks
• Demo: Assignment 01
• Assignment 02: Scrape the sample website page to perform some tasks
• Demo: Assignment 02
• Quiz
• Key Takeaways
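Searching, navigating, and modifying a parse tree might look like the sketch below. The outline does not name a parsing library, so the use of Beautiful Soup (`bs4`) here is an assumption, and the HTML snippet is made up rather than fetched from a real website.

```python
from bs4 import BeautifulSoup

# Made-up HTML document standing in for a downloaded web page.
html = """
<html><head><title>Sample Page</title></head>
<body>
  <h1 id="main">Course List</h1>
  <ul>
    <li class="course"><a href="/python">Python</a></li>
    <li class="course"><a href="/stats">Statistics</a></li>
  </ul>
</body></html>
"""

# Parse the document into a tree using Python's built-in HTML parser.
soup = BeautifulSoup(html, "html.parser")

# Searching the tree.
title = soup.title.string
links = [a["href"] for a in soup.find_all("a")]
names = [li.get_text(strip=True) for li in soup.find_all("li", class_="course")]

# Navigating the tree: move to a sibling and up to the parent.
heading = soup.find("h1", id="main")
list_tag = heading.find_next_sibling("ul")
parent_name = heading.parent.name

# Modifying the tree: change text and append a new element.
heading.string = "Available Courses"
new_item = soup.new_tag("li")
new_item.string = "NumPy"
list_tag.append(new_item)

# Printing the document.
print(soup.prettify())
```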
Topics:
• Why Big Data Solutions are Provided for Python
• Hadoop Core Components
• Python Integration with HDFS using Hadoop Streaming
• Demo 01 – Using Hadoop Streaming for Calculating Word Count
• Knowledge Check
• Python Integration with Spark using PySpark
• Demo 02 – Using PySpark to Determine Word Count
• Knowledge Check
• Assignment 01: Determine the word count for Amazon dataset
• Demo: Assignment 01
• Assignment 02: Count and display all airports present in New York using PySpark
• Demo: Assignment 02
• Quiz
• Key Takeaways
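The word count demos in this lesson follow the map, shuffle, and reduce pattern. The sketch below imitates that pattern in plain Python so it runs without a Hadoop or Spark cluster; the function names and sample lines are hypothetical, and in a real Hadoop Streaming job the mapper and reducer would be separate scripts reading from standard input.

```python
from itertools import groupby


def mapper(lines):
    """Map step: emit a (word, 1) pair for every word in every line."""
    for line in lines:
        for word in line.lower().split():
            yield word, 1


def reducer(sorted_pairs):
    """Reduce step: sum the counts for each word.

    Expects pairs sorted by word, as the shuffle phase would deliver them.
    """
    for word, group in groupby(sorted_pairs, key=lambda pair: pair[0]):
        yield word, sum(count for _, count in group)


def run_word_count(lines):
    """Run map, shuffle (sort by key), and reduce locally."""
    shuffled = sorted(mapper(lines))
    return dict(reducer(shuffled))


sample = ["big data with python", "python with spark", "big big data"]
counts = run_word_count(sample)
print(counts)
```

Sorting the mapper output stands in for the shuffle phase, which is what guarantees that the reducer sees all pairs for one word together.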