Curriculum
- Overfitting and Underfitting
- Regularization
- Cross-validation
- Early Stopping
- Parameter Tuning: Grid Search and Randomized Search
- Simple Linear Regression
- Optimization Algorithms – Gradient Descent: Batch Gradient Descent and Stochastic Gradient Descent
- Multiple Regression
- Polynomial Regression
- Regularized Regression – Lasso and Ridge Regression
- Evaluation Metrics
- Support Vector Machine
- K-Nearest Neighbors
- Data pre-processing
- Data scaling and Normalization
- Feature scaling
- Dealing with Missing and Skewed data
- Handling text and categorical attributes
- Transformation Pipelines
- Introduction to scikit-learn
- Different types of data
- Background
- Types of Machine learning
- Machine learning pipeline
- Parametric and Non-parametric ML Algorithms
- Matplotlib
- Plotting functions in Pandas
NumPy
- Multi-dimensional Array (ndarray)
- Operations – Indexing, slicing, transpose
- Broadcasting
- File input and output
Pandas
- Data Structures: Series and DataFrame
- Indexing, Selection, Filtering, Sorting, Ranking and Summarization
- Data Aggregation
- Data loading, storage and file formats
- Python Installation
- IDE and packages - Anaconda, PyCharm and Jupyter
- Variables and data types
- Conditions and Loops
- Strings
- Data Structures – List, Dictionary, Tuple
- File Handling
- Basic data types and RStudio
- Control Structures and Functions
- Loop Functions
- File Operations
- Simulation and Profiling
- Data structures
- Need for learning data science
- Need for data scientists in companies
- Who can become a data scientist
- Data science vs. machine learning
- Data science vs. deep learning
- Real-time process of data science
- Applications of data science
- Technologies used in data science
- Prerequisites to learn data science