Machine Learning With Python Full Course 2026 | Python Machine Learning For Beginners | Simplilearn
Chapters: 14
Introduces the Python-based ML course, its beginner-friendly approach, and the topics to be covered.
A thorough, beginner-friendly deep dive into Python-based machine learning, covering regression, classification, evaluation metrics, regularization, cross-validation, and practical workflows with Scikit-Learn, Pandas, and NumPy.
Summary
Simplilearn’s comprehensive course walks beginners through the complete pipeline of machine learning with Python. You’ll start with the basics of supervised learning, distinguishing regression from classification, and then explore linear, polynomial, and non-linear models. The instructor emphasizes practical evaluation metrics (MSE, MAE, RMSE, R²) and the essentials of overfitting vs. underfitting, using training vs. testing splits. Real-world datasets anchor concepts: TV ads sales (simple/multiple linear regression), housing prices, and the Breast Cancer Wisconsin dataset for classification. You’ll learn about regularization (Ridge and Lasso), cross-validation (KFold, StratifiedKFold, Leave-One-Out), and hyperparameter tuning (grid search, random search). The course also covers data prep, feature engineering, and pipelines (ColumnTransformer, StandardScaler, OneHotEncoder) to prevent data leakage and streamline modeling. By the end, you’ll grasp how to build, evaluate, and compare models—regression and classification alike—using Scikit-Learn in a reproducible Python workflow. The narrator also references Anaconda and Google Colab as convenient environments for running notebooks and highlights the career benefits of a certificate from Simplilearn.
Key Takeaways
- Regression predicts continuous targets (e.g., house prices, future sales) using linear, polynomial, or regularized models like Ridge and Lasso.
- Classification predicts discrete labels (e.g., malignant vs. benign, spam vs. not spam) and relies on metrics such as confusion matrix, ROC-AUC, precision, recall, and F1 score.
- Cross-validation (KFold, StratifiedKFold, Leave-One-Out) provides robust model evaluation and hyperparameter tuning, reducing overfitting and data leakage.
- Regularization (L1/Lasso and L2/Ridge) controls model complexity to improve generalization, with L1 enabling feature selection by shrinking some coefficients to zero.
- Pipelines and ColumnTransformer streamline preprocessing (imputation, scaling, one-hot encoding) and modeling in a single workflow, minimizing data leakage and simplifying deployment.
- Key datasets mentioned for practice: TV ads vs. sales, housing prices, and the Breast Cancer Wisconsin diagnostic dataset for binary classification.
- ROC-AUC measures a classifier’s ability to separate classes across thresholds; an AUC of 1.0 indicates perfect separation, while 0.5 is no better than random.
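The pipeline-plus-ColumnTransformer workflow described in the takeaways can be sketched as follows. This is a minimal illustration, not code from the course; the column names ("age", "income", "city") and the toy data are assumptions made for the example:

```python
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.impute import SimpleImputer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler

# Toy dataset with a missing numeric value and a categorical column.
X = pd.DataFrame({
    "age": [25, 32, None, 51],
    "income": [40000, 52000, 61000, 75000],
    "city": ["NY", "SF", "NY", "LA"],
})
y = [0, 1, 0, 1]

numeric = ["age", "income"]
categorical = ["city"]

preprocess = ColumnTransformer([
    # Impute missing numbers, then scale them.
    ("num", Pipeline([("impute", SimpleImputer()),
                      ("scale", StandardScaler())]), numeric),
    # One-hot encode the categorical column.
    ("cat", OneHotEncoder(handle_unknown="ignore"), categorical),
])

model = Pipeline([("prep", preprocess), ("clf", LogisticRegression())])
model.fit(X, y)  # preprocessing statistics are learned from training data only
print(model.predict(X))
```

Because the imputer and scaler live inside the pipeline, they are fit only on the training portion during cross-validation, which is how pipelines help prevent data leakage.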
Who Is This For?
Essential viewing for aspiring data scientists and Python developers who want a practical, hands-on foundation in ML with Scikit-Learn—covering both regression and classification, regularization, cross-validation, and end-to-end workflows.
Notable Quotes
"Machine learning is a subset of AI that assists systems to learn and improve automatically from experience without being explicitly programmed."
—Definition of ML within AI and its practical, data-driven learning loop.
"R² value will always lie between 0 and 1, with 1 indicating a perfect fit and 0 indicating no linear relationship."
—Explanation of R-squared as a regression metric.
"There should be a limit of accepting and rejecting the results."
—Clarifying the inevitability of errors and the need to evaluate training vs. testing performance.
"ROC-AUC measures a classifier’s ability to distinguish between classes across thresholds; higher is better, with 1.0 being perfect."
—Introducing ROC curves and the meaning of AUC in classification.
"Cross-validation provides a robust estimate of model performance by averaging scores across folds."
—Justification for using KFold/StratifiedKFold in model evaluation.
Questions This Video Answers
- How do I choose between Ridge and Lasso regularization for my regression problem in Python?
- What is the difference between cross-validation and a simple train/test split, and when should I use each?
- What are ROC-AUC and confusion matrix, and how do I interpret them for a binary classifier?
- How can I implement an end-to-end ML workflow in Python with Scikit-Learn pipelines and ColumnTransformer?
- What real-world datasets can I practice regression and classification on to solidify ML concepts?
Python Machine Learning, Supervised Learning, Regression, Classification, Linear Regression, Polynomial Regression, Ridge Regression, Lasso Regression, Cross-Validation, K-Fold Cross-Validation (KFold), StratifiedKFold, Leave-One-Out
Full Transcript
Hey everyone, welcome to this machine learning using Python course. Today machine learning is being used everywhere. It helps businesses predict sales, recommend products, detect fraud, understand customer behavior, and make better decisions using data. So if you want to understand how machine learning works and how Python is used to build these models, then you are at the right place. In this course, we are going to learn machine learning in a very simple and beginner-friendly way, step by step. Here's what we'll be covering. First, we will start with the basics of machine learning and understand what supervised learning means.
Next, we will focus on regression, which is one of the most important concepts in machine learning, and learn how models are used to predict values. Third, we'll understand the difference between linear regression, multiple linear regression, and nonlinear regression with easy examples. Fourth, we will learn how to check whether a model is performing well using metrics like MSE, MAE, RMSE, and R squared. Next, we'll also talk about common problems like overfitting and underfitting, and understand why training and testing performance both matter. We'll also move into polynomial regression and understand the basics of gradient descent and how it helps a model learn.
We'll also work on real datasets in Python and go through the full process step by step, from reading the data and doing basic analysis to training the model and checking the results. And finally, we'll also learn about regularization techniques like lasso and ridge regression, which will help you improve model performance. By the end of this course, you will have a strong, beginner-friendly understanding of machine learning regression using Python and how it is applied to real-world problems. Also, if you are looking to dive deeper into machine learning and unlock the power of Python, I highly recommend checking out the machine learning using Python courses by Simplilearn.
This course will help you build a solid foundation in machine learning, starting with regression techniques and moving on to advanced topics like gradient descent and regularization. You'll also work hands-on with real-world datasets, learn how to evaluate your models using key metrics like MSE and R², and gain practical skills in data wrangling, visualization, and feature engineering. So whether you are just starting out or looking to enhance your skills, this course is perfect for anyone looking to enter the field of machine learning. And upon successful completion, you will earn a course completion certificate from Simplilearn, which will help you stand out in your career and prove your skills to employers.
Check the description below for the link and start your learning journey with Simplilearn today. So before we get started, here's a quick quiz question for you. Which of these is used to predict continuous values in machine learning? Your options are regression, classification, clustering, or tokenization. Let me know your answers in the comment section below. What is machine learning, and why is there so much buzz around it? Let me first go through what all we would be covering in this particular course, and then we begin the discussion. So if we talk about the learning path, the first topic that we will be covering today is an introduction to machine learning, which focuses on the basics of machine learning.
Second is supervised learning, regression, and applications, which focuses on supervised learning with an emphasis on understanding and implementing different types of regression models. Third is supervised learning classification and applications. Basically, we're trying to cover the different learning techniques available in machine learning. So the first technique that we would be covering is supervised learning. Under supervised learning we would cover regression as well as classification. Then we would move on to ensemble learning methods, which focus on advanced ensemble techniques to enhance the performance and robustness of models. Then we would move on to unsupervised learning and finally recommender systems along with their applications.
What is machine learning, and why is there so much buzz? Why are you here to learn machine learning? Let me put this another way. Basically, we want the machine to get trained with our data. We want the machine to learn from the data so that it can predict from data. What kind of data do we want to predict? Why do we want these predictions? Because we are now living in the digital age, where data is all around us. Even this session is data, right?
All the material that has been sent to you is data. Anything coming on the news is data. Anything you do for entertainment is data. E-commerce is data. Your work profile is data. We are living amid huge amounts of data; data is all around us. Do you think there is any escape from data now? No, not now. When I used to take sessions five or seven years ago, the scenario was still adapting, but I don't think there is any survival without data now. Can you survive without this data, without moving on social media? As they say, if the internet stops, your life also stops; if your mobile phone is lost, everything is lost. Hasn't data become a lifeline? And now we see several applications and concepts that are based on data. Automatic translation has become easy: if you want to convert something from English to German or Spanish, any language translation is right there. We speak to the machine and we get the translation.
Virtual personal assistants are there. Image recognition. Email spam filtering actually comes under the domain of machine learning: based on the algorithm, the pattern, or the words that appear, it is able to filter out whether a mail is spam or not. So what would be the criteria? Generally, the mails flagged as spam say that you have won a lottery or there is a bonus. So that becomes your email spam filtering. Then we have text and speech recognition, medical diagnosis, and online fraud detection, where we want to detect how online fraud is happening.
Web search and recommendation engines, and of course traffic prediction. Not only traffic prediction on the roads: it also relates to the traffic going onto a particular website, whether that website is going to crash or not. Data is now becoming difficult to handle because it's digital data. Understanding everything numerically is difficult. Earlier the data was this much and it could be handled. Now it becomes this much and this much and this much, and it keeps increasing. So we need certain algorithms, we need technologies which can help us analyze, so that we can improve our performance.
But if you look at the overall scenario, there is still a lot of confusion about AI, machine learning, and deep learning. What is the subset relationship? What are artificial intelligence, machine learning, and deep learning, and are you able to distinguish between the three? So this is one of the initial chess programs developed by computers: the IBM Deep Blue chess program, developed in 1997. When was it developed? It was developed in 1997 by IBM, and this particular program was strong enough to defeat the world chess champion at that point.
His name is Garry Kasparov. Okay. It was capable of defeating the world chess champion. Getting my point? Right. And then we have IBM Watson under machine learning. So AI is the bigger branch. All right. Artificial intelligence is the bigger branch, under which we have the machine learning subset, which we are going to study. Machine learning basically works on statistical algorithms; that is why we say that before learning machine learning it is important to have good concepts and knowledge of statistics. So based on that statistical foundation, the machine is capable of things like the Google search algorithm, Amazon recommendations, and email spam filtering.
Just as we saw that it is capable of filtering email. And then finally we have deep learning, which is also under machine learning, in which we have AlphaGo, natural speech recognition, and level-four automated driving systems. The scenario has changed and we are into the AI revolution. So the bigger branch is still AI. We are witnessing this revolution in front of us. Under AI we have machine learning, under machine learning we have deep learning, and under deep learning we have gen AI tasks. Right? So when we talk about gen AI tasks: generation, fine-tuning, agents, automation, and virtual assistants.
And this gen AI is now capable of doing a lot of text generation. We ask a lot of things of ChatGPT and other models like Gemini and Copilot; we also want to generate images and videos. All these things are being made possible. The AI that we are talking about at the moment is only related to software; through software we are able to give intelligent answers. But what is basically the difference between traditional programming and machine learning programming? So let me explain with this particular example.
So in traditional programming, again we have this data, right? This is my data, which is 1, 2, 3, and 4. And over here we have the machine learning algorithm, where we have 1, 2, 3, and 4. Okay, the data has not changed; I'm giving you a very simple example. Now if we talk about the program: when we talk about C, Pascal, Fortran, even C++ and Java and even Python, they are capable of logic. Now, if I want to distinguish the numbers, what is the logic behind whether these numbers are even or odd?
So simply we understand the logic: in Python, if n % 2 == 0, that means if the remainder when divided by two is zero, then the number is even; else it is odd. Getting my point? Yes or no? Is this point clear? So now even if I give the number 10, it would automatically tell me that this is going to be an even number. And if I give the number 57, it is going to be odd. But now in machine learning, how do things happen? I give the output along with it.
I say that one is odd, two is even, three is odd, and four is even. Got it? Now the computer, based on certain statistical concepts and algorithms, will try to detect that when I feed the number 10, it is going to be even. Can it predict it as odd also? Yes, the prediction can be wrong also. Clear? But the algorithm has to be good enough that it predicts that 10 is even and 57 is odd. So coming to the concept: the first kind of learning is known as supervised learning. What is it known as?
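The traditional-programming side of this even/odd example can be sketched in Python as a hand-written rule (a minimal illustration, not code from the course):

```python
# Traditional programming: we supply the rule explicitly.
def parity(n):
    # A remainder of 0 when divided by 2 means the number is even.
    return "even" if n % 2 == 0 else "odd"

print(parity(10))  # even
print(parity(57))  # odd
```

In the machine learning approach, by contrast, we would feed the pairs (1, odd), (2, even), (3, odd), (4, even) to an algorithm and let it infer the rule, with no guarantee it gets 10 or 57 right.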
Supervised learning. If this is my known data, these are known images, then this is my input and I feed the output. When I feed the input as well as the output to the machine, it becomes my labeled data, right? That means this is an image of an apple. I feed it into the machine, and when I feed this apple it predicts whether this is an apple or not, and it predicts it's an apple. But it can predict wrong also. You might have seen that sometimes even ChatGPT gives wrong answers; the image generation, Ghibli-style and all, can give wrong answers.
It's the biggest fear, you know, that if wrong or unreliable data is fed to them, they can give false, nonsensical, and fabricated information. So that is the fear as we move ahead. But if we talk about traditional systems, expert systems, they worked like this: we have a user, we give the query, and then we try to get the output out of it, and then we try to infer from it. But the inference is not coming from the data; it comes from a knowledge base created by an expert. What do I mean by that? Let me explain. An expert can be a cardiologist, a lawyer, someone from the finance domain, with maybe 30-plus years of experience, whose if-else inference knowledge has been fed in: whether you are eligible for a loan or not, or for a cardiologist, whether your BP reading matches certain conditions. If your input matches the inference engine, that's how it gives the output. So, for example, you are a user and you enter information into the user interface.
Maybe all your blood reports, your BP readings, and your test reports are given through the user interface, then the inference engine runs, and based on this knowledge base we get the output. So what has been replaced now? Rather than a hand-built knowledge base, it is completely based on the original data, and things have become complicated with images and unstructured data. So, do we understand supervised learning? Now can I say: the more the data, the more intelligent my system becomes? If it is only capable of recognizing a hexagon, a triangle, and a square, and I add the shape of a circle, maybe a parallelogram, maybe a rectangle, it becomes capable of analyzing that too.
So that is where the systems are getting modeled, because the data is becoming huge day by day, right? So this is supervised learning. Under supervised learning we have labeled data. What is labeled data? We have the input as well as the output, with which we try to train the model. After the model has been trained, based on the test data we try to do the prediction, whether it is a square or a triangle. Clear? We will initially start with supervised learning, where we have the input and the output. We will train the model.
So where will we train the model? That means 70% of the data will be used for training and 20 to 30 percent will be used for testing. Then it will predict the output. Okay. And as I've been telling you, there is a very strong relationship between machine learning and statistics. Right? So, statistics is a field of mathematics. But when it is combined with computer science, it becomes statistics and machine learning. The idea of statistics is that it helps us to draw inferences and relationships between variables.
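A 70/30 split like the one described here can be sketched with scikit-learn's train_test_split; the toy arrays below are illustrative:

```python
import numpy as np
from sklearn.model_selection import train_test_split

X = np.arange(20).reshape(10, 2)   # 10 samples, 2 features (toy data)
y = np.arange(10)                  # toy targets

# test_size=0.3 reserves 30% of the rows for testing, 70% for training.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=42)

print(len(X_train), len(X_test))   # 7 3
```

The model is then fit on X_train/y_train only, and its predictions are checked against the held-out y_test.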
Whereas machine learning gives optimization, prediction, accuracy, etc. Right? And then: prior assumptions about the data. Some knowledge about the population is usually required in statistics, but none in machine learning. Dimensionality: statistics is usually applied to low-dimensional data. Knowledge overlap: there's no ML knowledge required when we study statistics, but in machine learning, some statistics knowledge is usually needed, as it becomes the foundation for a few algorithms. So if you do not know much about statistics, there is nothing to worry about; it's not very difficult. Definitely, concepts of probability will be required. So I would just request you learners to get familiar with the concepts of probability, conditional probability, Bayes' theorem, and probability distributions.
This is what I expect from you all. Got it? Right. And to make the picture a little more clear, the boundaries are not very crisp now, because there is data everywhere. But when we start with the initial Python course, we are doing visualization, exploratory data analysis, maths, and statistics, and when we overlap AI, machine learning, and deep learning, we get data science. So data science becomes the foundation for AI, ML, and deep learning. If I technically ask you what learning is (and now you might have been hearing this word "agent": it could be a human agent, a robotic agent, or an AI agent), then a slightly more technical definition of learning is that we are trying to improve behavior based on experience. When we say you are a good learner, your experience builds different types of knowledge.
The range of behaviors is expanded: the agent can do more. The accuracy on the task is improved: the agent can do things better. And the speed is improved: the agent can do things faster. So what is the idea of learning? Whether we talk about learning from the machine's or the human's point of view, this definition is valid: it is the ability to improve our behavior based on our experience. It's not always about acquiring new skills; that is one part of it, the expanded range of behavior: driving, swimming, stitching, cooking. Of course the range is increased, but also, by doing the thing again and again, learning increases my accuracy as well as speed.
And if we now look technically at machine learning, the different types of learning techniques are supervised learning, unsupervised learning, semi-supervised learning, and reinforcement learning. All these types of learning actually come from how we as humans learn. So do you agree we learn through training and testing? Training and testing happen when we go to school: we are trained and then we are tested, in universities, colleges, or even this session. Is there any other way we learn? Do we learn through observations? Do we learn through our mistakes?
There are different kinds of learning possible, and exactly the same things we try to replicate in our machines. So one of the learning methods is observation. Now, a basic difference between supervised learning and unsupervised learning: what do we mean by supervised and unsupervised learning? Supervised learning says that this is my data, these are my apples, and I tell the machine that these images are apples. 70% of my data is used for training and 30% of the data is used for testing. Labeled data is given with absolutely correct labels; labeled data means supervised learning, and then the model predicts for me that it is an apple. And what is unsupervised learning?
Do I have labeled data? No, I have not said that this is an apple or this is a banana or this is a peach. But the model, or rather the algorithm, is capable enough to distinguish between the three of them: this is my apple, this is my peach, and this is my banana. Right? So initially we will build concepts on supervised as well as unsupervised learning. In unsupervised learning, the machine is given huge sets of data that are not labeled as inputs to analyze. The machine needs to figure out the output on its own, where it identifies the patterns.
And the two types of algorithms which come under unsupervised learning are association and clustering: K-means for clustering problems and the Apriori algorithm for association rule learning problems. Right? And then we have supervised learning. The input is in the form of raw data that is labeled. The machine is already fed with the required feature set to classify inputs, divided into two types of problems: regression and classification. And then we will try to understand different regression algorithms. This is the first stage that we are going to work on. Clear? So now let's understand what semi-supervised learning is.
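Clustering unlabeled data, as described above, can be sketched with K-means; the two synthetic blobs of points and the choice of k=2 are illustrative assumptions:

```python
import numpy as np
from sklearn.cluster import KMeans

# Two well-separated synthetic groups of 2-D points, with no labels given.
rng = np.random.default_rng(0)
blob_a = rng.normal(loc=0.0, scale=0.5, size=(20, 2))
blob_b = rng.normal(loc=5.0, scale=0.5, size=(20, 2))
X = np.vstack([blob_a, blob_b])    # 40 unlabeled points

# K-means groups the points into 2 clusters on its own.
km = KMeans(n_clusters=2, n_init=10, random_state=0).fit(X)
print(km.labels_)  # each blob ends up in its own cluster
```

Note the cluster numbers (0/1) are arbitrary; the algorithm discovers the grouping, not the names, which is exactly the apple/peach/banana situation described above.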
Here we have the input data. Is it labeled? No, this is not labeled data. But we have partially labeled data: this is an orange and this is a banana, and its quantity is small. Here both types of data are mixed into the machine learning model, and this unlabeled data is used to predict the output: it's an apple. So this is my input data, this is my partially labeled data, and when I combine them together and prepare the model, I get the output. Clear? And then reinforcement learning: the most effective way of learning, from which we as humans learn a lot.
That means from our mistakes, from our feedback, from our punishments, from our rewards, right? So if this is the input given to the machine and the machine predicts that it is a mango, and I give the feedback "wrong, it's an apple", it notes it down. And now when I feed the apple again to the machine, it says that it is an apple. So, making a quick recap of the overall picture: if we talk about machines, there are three types of learning. We have supervised learning, where we have the input and output, and supervised learning is capable of calculating error, that is, the predicted output minus the actual output. Why are we capable of calculating error in supervised learning? Because we have the actual output over here, right?
And based on the actual output, is my machine predicting the correct result? If it is an apple, is it actually telling me it's an apple, or is it telling me this is a cherry? So I can calculate my error. So the biggest advantage, or the simplest way to learn, is supervised learning, where we are capable of calculating the errors. Right? Then we have unsupervised learning, where we do not know the output; it's just trying to do clustering and association between the objects. And the other type is reinforcement learning, which is not going to be part of this journey; it's generally taken up in deep learning, where the machine learns from its punishments and rewards, and there too, of course, we are capable of calculating error.
Again, I'm telling you, in this particular course we are going to cover supervised and unsupervised learning concepts and algorithms in detail. Now moving ahead to supervised learning. There are two types of supervised learning: we have regression and then we have classification. Now what is the difference between the two? Please try to understand. Under supervised learning, it's all about the output. If the output is numerical, then it is known as regression, and if the output is categorical, then it is known as classification. Getting my point, learners?
What are we trying to achieve? What is going to be the temperature tomorrow? If it gives me that tomorrow it is going to be 84° Fahrenheit, or any other value, then this kind of algorithm is regression. But if I want to predict whether the temperature is going to be cold or hot, this is categorical data. Clear? So regression works on numerical data and classification works on categorical data. There are two types of numerical data: one is discrete and the other one is continuous. Do we understand that? And if we talk about categorical data, do we understand nominal and ordinal data?
This is what is covered in the data science class. So basically, data types are divided into qualitative and quantitative; this is very, very important. Qualitative data is categorical. Ordinal data is something with an order, like ratings and rankings; feedback on movies comes under ordinal. Nominal means there is no order as such, like the color of the eyes or nationality. And quantitative data has numerical values: continuous, which can be subdivided, such as distance, salary, or price, and discrete, which cannot be subdivided, such as counts. So now I hope the concept of regression and classification is clear to everybody, right?
So, making your concept more clear: regression is the task of predicting a continuous quantity. Say I want to predict the price of a house: the price of the house in 2014 was this much, then in 2024 it is this much, right? Then what will be the price in 2034? Clear? Since price is a numerical quantity, it comes under regression. And classification, as I told you: if I want to separate whether a mail is spam or not, that comes under a classification problem.
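The regression-versus-classification distinction can be sketched with two tiny models; the house-price and spam numbers below are made up purely for illustration:

```python
import numpy as np
from sklearn.linear_model import LinearRegression, LogisticRegression

# Regression: predict a continuous value (a price from a size).
sizes = np.array([[50], [80], [120], [200]])
prices = np.array([150, 240, 360, 600])        # toy data: price = 3 * size
reg = LinearRegression().fit(sizes, prices)
print(reg.predict([[100]]))                    # a numeric prediction (~300)

# Classification: predict a discrete label (spam vs. not spam).
word_counts = np.array([[0], [1], [8], [10]])  # "lottery" mentions per mail
labels = np.array([0, 0, 1, 1])                # 0 = not spam, 1 = spam
clf = LogisticRegression().fit(word_counts, labels)
print(clf.predict([[9]]))                      # a class label
```

Same supervised workflow in both cases; only the type of output (a number versus a category) changes, and with it the choice of model and metrics.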
Clear? So now we will begin our journey in machine learning through supervised learning. First we will try to complete the regression algorithms: simple linear regression, multiple linear regression, polynomial regression, support vector regression, decision tree, random forest. We are not covering neural networks; all the others will be covered. Similarly, in classification we will cover logistic regression, K-nearest neighbors, support vector machines, Naive Bayes, decision tree, random forest. Again, neural networks are not covered. Yeah. So with this we come to the end of the introduction, and if you will now look at your slides, the ebooks, that is lesson number two.
Now we can begin with lesson number two; I've just prepared the foundation for that. So let me show you lesson number two. All right. So we are now starting with this. I hope now you would be able to locate this particular file in your LMS, in your material. Please look at that. So: analyze the distinctions and applications of machine learning, deep learning, and artificial intelligence through real-world examples of various technical applications. Differentiate among various machine learning models and explore how each model learns from data to predict outcomes. Explore Python libraries for effective data manipulation, visualization, and implementation of machine learning algorithms.
So the business scenario says that ABC is an e-commerce company which is struggling with a surge in fraudulent transactions on its website. The manual review process for transactions has caused delays in order processing and led to negative customer experiences. To address this, ABC will use machine learning algorithms to detect fraudulent transactions in real time. So machine learning algorithms are capable of detecting fraudulent transactions, and these algorithms will be integrated into the company's transaction processing systems to flag suspicious transactions and prevent fraud. Additionally, the company will use these algorithms to predict customer behavior based on past purchase history, thereby improving the recommendation engine's performance.
Now, everybody is using a mixture. Everybody wants the best: the more you learn, the better the output you get. And when I say best output, you want the prediction to be highly accurate. So when your report goes to a machine learning or AI system, it has to give an accurate result; it should be accurate enough to predict whether you have a tendency toward cancer or not. So not only one technique will be used; it will use a mixture of techniques to get the best results.
Got it? Now, are we clear what machine learning is? Machine learning is a subset of AI that enables systems to learn and improve automatically from experience without being explicitly programmed. Arthur Samuel coined the term machine learning in 1959. It enables programs to learn automatically, making computers more intelligent without human intervention. But human feedback is extremely important, because ultimately machines are not geniuses; we have to tell them whether a result is acceptable or not. So who is known as the father of machine learning? It's Arthur Samuel, and he coined the term machine learning in 1959.
The father of AI: that's John McCarthy, in 1956. Yes, John McCarthy in 1956. Are we now clearer on the difference between the traditional approach, where we had the data and were explicitly programming the output, and the machine learning approach, where, based on the algorithm and the patterns in the data, the system predicts the output? That is why statistical techniques and algorithms are the foundation of machine learning: it automatically learns the features and reduces the need for manual feature engineering. Clear? It handles complex and unstructured data such as images, text, and audio without requiring extensive pre-processing.
Performance improves with more data and learning iterations, enhancing accuracy and generalization. Right? And now do we understand this Venn diagram, showing that machine learning, deep learning, and AI are often used interchangeably? AI encompasses the simulation of human intelligence in machines. So self-driving cars are AI; all of robotics comes under the category of AI. Machine learning is something like Amazon Alexa, where we give it specific instructions and it gives us output. And when I talk about deep learning, deep learning involves neural networks, which will be your next stage after machine learning: understanding how neural networks work.
How are they capable of complex pattern recognition, such as recognizing patterns in images, speech, text, and so on? So what are the examples of machine learning in a chess game between a computer and a person? Now why do you think the chess game always comes into the picture? Well, among human beings, a person who plays chess well is said to be intelligent; it's an intelligent game. Agreed? And even if the results go wrong, there's no harm.
There are no catastrophic results: even if the person is winning, or the machine is winning, it is not harmful. Yeah. So that is why AI first exploded in intelligent gaming systems and theorem solving; that is why chess and AlphaGo are the games you will read about or see when we talk about a little history of AI. So in a chess game between a computer and a person, the computer uses AI to analyze the game, predict moves, and make its decisions.
AI uses machine learning to figure out whether the opponent is at a beginner, intermediate, or advanced level. How does the AI use machine learning to identify whether you are a beginner, intermediate, or advanced player? By predicting our moves: based on our moves, it will immediately judge whether you are a beginner, an intermediate, or an advanced player, right? And you might be playing a lot of games where there is AI and rich graphics, especially the young generation, right? Well, I don't, but you can see how smartly they are capable of hiding things, and how they become smarter.
The level changes: as you become smarter, the level of the game becomes smarter too, right? We all observe that. So AI decides its next move against the opponent using a complex neural network that learns various features and patterns from the data. Then of course there are many applications; we see machine learning all around us in spam filtering (spam filtering typically uses a Naive Bayes classifier), social media analysis, customer service chatbots, and online recommendations. We all see sentiment analysis. How does sentiment analysis happen? Anybody have an idea? Based on the words, and especially the emojis.
Is it a smiley? Is it an angry face? A sad face? A crying face? All these things are taken into account. So these processes allow computers to learn patterns from the data, make predictions, predict outcomes, classify target features, and improve performance. That's what we want from machine learning: to help predict the outcome (what is going to be the price of a house, or anything else, ten years down the line), to classify target features based on similarities, and to improve the overall performance of the system. And we have seen there is a very strong relationship between the data and the output.
Yes: if the amount of data is greater, then of course the quality of the output or the prediction also increases (the red line here is the quantity), and if it is high-quality data we get better results, better insights, and better predictions of the output. So maintaining the quality and authenticity of the data and removing errors all have to be taken into account when we are talking about data and machine learning algorithms. Right. So when we talk about types of machine learning, ML can be divided into four main categories, each characterized by its capacity to predict outcomes or identify patterns: supervised learning, unsupervised learning, semi-supervised learning, and reinforcement learning.
Now I think the distinction is clear. Supervised learning: we've done this. What are the three main points under supervised learning? First is labelled data. What do we mean by labelled data? It has the input as well as the output. The second point is splitting of the data into training and testing. Right? Generally training happens on most of the data and testing on the rest of the data. And third is the calculation of the error, because in supervised learning we know the actual output too, so we can check whether the model is predicting it right or wrong. Clear? Some commonly known supervised learning algorithms are linear regression, decision trees, logistic regression, support vector machines, etc.
So some examples of supervised learning: predicting temperature rise based on yearly temperature trends. Why is this supervised learning? Predicting temperature rise comes under regression, because it's a numerical problem with a numerical output. Predicting crop yield based on seasonal crop quality changes: again regression. Sorting waste based on known waste items corresponding to a waste type: this comes under classification, or filtering. So under supervised learning, calculation of the error is also going to be an important aspect of machine learning, because we have to be very clear that the output of machine learning will not always be 100% correct.
Clear? Now these examples are making the picture more clear. And if we talk about unsupervised learning, it is very much used to identify different parts of an object (image segmentation for object detection), identification of user groups based on commonalities, and identification of anomalies over geographical landscapes based on data patterns. The unlabelled dataset is provided to an unsupervised learning algorithm to discover hidden patterns and to recognize relationships. So it's not that unsupervised learning is not important; it is equally important for analyzing different features and different relationships in the data. Rather, this is the algorithm which helps us discover the hidden patterns and relationships.
Clear? And now coming to the unsupervised learning example: it automatically groups the images based on similarity, and this is unlabelled data. Based on that it is capable of distinguishing middle-aged people, old people, the young, infants, teenagers, etc. Got it. And what is semi-supervised learning? As I've already told you, it uses a combination of a small amount of labelled data (sometimes the data is not completely labelled; that's one of the constraints we see) and a large amount of unlabelled data for training. Like supervised learning, it aims to learn a function that can accurately predict the output variable from the input variables.
It uses the unlabelled input to assist the learning process by collecting more information and improving model generalization. So it falls between supervised and unsupervised learning. Suppose this is my raw data and I have partial labels: these are adults and these are kids. Then the machine automatically distinguishes between babies, teens, and tweens, and also distinguishes between senior citizens, youth, and adults. Got it? So it automatically learns the correct groupings of the kids into teens, tweens, and babies, and of the adults into their categories. Another example of semi-supervised learning which we see practically (that was your question, Mega) is Google Photos, a popular example of semi-supervised learning. When a picture is taken it gets stored in the Google cloud platform, and slowly Google tracks whose picture it is and at what place it was taken. In various instances, uploaders label images; despite Google's lack of knowledge regarding image names, its algorithm can identify them by analyzing visual features, shapes, and colors. It does a lot for me: it is able to identify my friends and my family, and in which location I was. And reinforcement learning is a type of machine learning where algorithms learn from the environment by performing actions and receiving either rewards or penalties as feedback.
If the program finds the correct solution, the interpreter rewards the algorithm. If the outcome is incorrect, the algorithm is penalized for incorrect predictions, and it must iterate until it finds a better result. Right? So ultimately reinforcement learning involves an agent: it interacts with an environment, learning from the rewards and states, and based on the outcome it chooses the best action; if there is an error, it learns again. All right. So this is my input raw data, and based on the environment, reward, state, and action, it's capable of handling them separately.
Okay. So the example: this type of learning is seen in YouTube recommendations, where the user searches for a particular song and the program shows the list of available songs. When a user selects a specific song, the system trains itself to remember and deliver similar results for future searches, based on the user's interactions like views, shares, etc. This is the concept on which recommendation systems work. Other examples of reinforcement learning are games where players can play with bots, autocorrect tools, search recommendation engines, and self-driving cars. And then there are the Python packages that we will be using for machine learning.
We are aware of NumPy: it's a very powerful tool for numerical computing in Python. Matplotlib is for data visualization, and pandas for data manipulation. I hope you are all aware of NumPy, Matplotlib, and pandas, but we will be working more with the scikit-learn library, which provides the different algorithms; the pre-processing and the other steps are also taken care of before we feed the data into an algorithm. So, a quick recap: machine learning refers to a machine's ability to learn from data and replicate human behavior. AI includes machine learning and deep learning, each with unique capabilities for simulating intelligence.
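As a minimal sketch of these packages in action (the numbers, city names, and column names below are invented purely for illustration; Matplotlib is mentioned above but omitted here so the snippet runs without a display):

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LinearRegression

# NumPy: fast numerical arrays and vectorized math
temps = np.array([21.0, 23.5, 19.8, 25.1])
print(temps.mean())              # average of the four temperatures

# pandas: labelled, tabular data built on top of NumPy
df = pd.DataFrame({"city": ["A", "B"], "sales": [120, 95]})
print(df["sales"].sum())         # total sales across both rows

# scikit-learn: the machine learning algorithms themselves
model = LinearRegression()       # an untrained estimator, ready for .fit(X, y)
```

The pattern is always the same: NumPy/pandas hold and prepare the data, and a scikit-learn estimator learns from it.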
There are four main types of machine learning: supervised, unsupervised, semi-supervised, and reinforcement learning. Python packages are folders with modules that organize code for easy reuse and maintenance, improving development efficiency. Now clear? So let's go in for a knowledge check. Question number one. Yes, learners, are you there? Which of the following best describes machine learning: A, B, C, or D? Question number two: which example illustrates the use of machine learning to enhance customer experience in an e-commerce company? What distinguishes deep learning from machine learning and AI? Yes: it is a subset of ML that uses multiple layers for complex pattern recognition, such as recognizing patterns in images, speech, and text.
Right? So with this we come to the end of the introduction and basics of machine learning. Yeah. What I'm showing now is the overall picture of statistics; the learners who have already done data science are aware of it, and this is for the ones who are not. The different types of statistics we have are descriptive and inferential. Under descriptive we have measures of central tendency and measures of variability. Under measures of central tendency we have the mean, mode, and median. Under measures of variability we have the range, variance, and dispersion.
And here we have inferential statistics: how we infer the results. This is the important part we are looking at; it includes confidence intervals and hypothesis testing. You can do a lot of reading on inferential statistics: statistical inference, constructing confidence intervals for a population, hypothesis testing. That is what I'm pointing at. And what is the advantage of this particular program? Ultimately, probability plays a very major role in data science and AI: understanding uncertainty, predicting outcomes, and estimating how probably correct the output is. Even the LLM models, like ChatGPT, predict based on probability: "this is the next word, and if its probability is above 70%, let's predict that output." How do we model complex systems, enhancing AI? It is statistics and probability together which help in exploratory data analysis and give meaningful insights and features.
So as we go through this flow, we understand we are going to start with supervised learning. Before supervised learning, I would like to cover the basic concepts of machine learning, which I have prepared in my PPT, and then we will move on to what they have shared. Okay. So the first thing is regression. What are the two main families of algorithms which come under supervised learning? Yes: regression and classification. What is the difference between regression and classification? Regression happens when the output is numerical, and classification happens when the output is categorical.
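That numerical-versus-categorical split can be sketched in a few lines of scikit-learn. The tiny dataset below is invented for illustration: the same inputs, but once with a continuous target (regression) and once with a discrete label (classification):

```python
import numpy as np
from sklearn.linear_model import LinearRegression, LogisticRegression

X = np.array([[1.0], [2.0], [3.0], [4.0]])   # one feature, four samples

# Regression: the target is a continuous number (e.g. a price).
y_reg = np.array([10.0, 20.0, 30.0, 40.0])
reg = LinearRegression().fit(X, y_reg)
print(reg.predict([[5.0]]))   # a numerical prediction, approximately 50

# Classification: the target is a discrete label (e.g. spam vs. not spam).
y_cls = np.array([0, 0, 1, 1])
clf = LogisticRegression().fit(X, y_cls)
print(clf.predict([[5.0]]))   # a class label, here 0 or 1
```

Same input shape, same `.fit`/`.predict` workflow; the only difference is the type of target and therefore the type of estimator.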
All right. So I start with my PPT again, so that it helps and gives you a strong foundation and clear concepts, and we can move along with that. If we talk about the first knowledge check, what is machine learning? The correct answer: it is the autonomous acquisition of knowledge through the use of computer programs. Second question, on the key difference: A and B are both correct. What's the key benefit of using deep learning for tasks like recognizing images? Okay, you've written yes.
Yeah: they can learn complex details from data on their own. That's the idea of deep learning; that's the use of neural networks. Absolutely correct. All right. So now we start with the concepts of machine learning under supervised learning, which are valid for both regression and classification algorithms. Whenever you say machine learning, these are the basic questions that will be asked, right? So if I talk about machine learning, or supervised learning, what are we trying to do? What is the main aim, our main objective, of supervised learning? Prediction of the data.
Okay, we would call it prediction of future data based on given data. Right? We want to predict. And is this prediction always correct, or can it be wrong? It can be wrong; there can be errors. Right? And so there should be a threshold for accepting the errors or rejecting the result; some way of accepting and rejecting the results. We all understand this in a conceptual, subjective manner. Now let's try to understand it on the basis of mathematical functions. Right? So here we have the plus signs over here, and here we have the minus signs over here.
Right? This is my dataset. This is my x-axis and this is my y-axis. These are given as my data. Right? And what are we trying to do over here? We are trying to predict the output. So what are we trying to predict? What is the value of this question mark? All right. Is this question mark a plus sign or a minus sign? Can you tell me?
What about this one? And this one? So this is my first question mark, second question mark, third, and fourth. Based on your observation, can you tell me about the first one: do you think it is a plus or a minus sign? What does this particular data point represent? Because this particular data point is closer to the plus signs, there are good chances that it is going to be a plus sign. Yes, it could be negative too, but we can say there is a 70 to 75%, or even a 90 to 95%, chance that this is going to be a plus.
What about the fourth one? This is negative. Now what about the second and the third? These are going to be a little difficult to call: they could be plus or minus depending on which neighbors I select. Agreed? But technically, if I use a straight line function (mathematically, a straight line function is y = mx + c), then all the points which lie on the left side of this line are known as plus; they all come under the category of plus data points.
Now, all the points lying on the right side of the line will all be classified negative, because most of the points there are minus, or red, points. Can I say that all the points lying on the left side are plus and all the points lying on the right side are negative? There are very few exceptions: there is only one plus sign on the right, and apart from it all the points belong to the red, negative class, so with that one point I have to do something. But again the question is: why this straight line? How do I decide on the straight line?
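The "one side is plus, the other side is minus" rule can be written as a tiny function. The slope m, intercept c, and sample points below are invented for illustration, and I test whether a point lies above the line y = m*x + c; which side counts as plus depends on how the line is oriented in the slide:

```python
def classify(x, y, m=1.0, c=0.0):
    """Label a point '+' if it lies above the line y = m*x + c, else '-'.

    The line acts as a decision boundary: the same comparison is applied
    to every point, including new, unseen question-mark points.
    """
    return "+" if y > m * x + c else "-"

print(classify(1.0, 5.0))   # well above the line, so labelled '+'
print(classify(5.0, 1.0))   # well below the line, so labelled '-'
```

A point sitting very close to the boundary is exactly the "difficult to call" case from the discussion: a tiny shift in m or c flips its label.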
One point to understand is that when data points are given, the set of candidate models is known as the hypothesis space, represented by capital H, and a particular line is represented by small h. Right? And it is going to be this red line which divides the data points into two different classes. So there can be more than one solution to a problem, even infinite solutions: I can draw infinite straight lines. But which one to accept? The line which will be accepted is the one which gives me the minimum error. Right? The word "error" will mean different things for different algorithms.
But I will accept the straight line which gives me the minimum error, or in other words the maximum accuracy. We will calculate it; as I told you, in supervised learning we keep track of how to achieve maximum accuracy and minimum error. So what is the capital H over here? The hypothesis space is the set of all possible legal hypotheses. This is the set from which the machine learning algorithm determines the single best hypothesis describing the target function, and small h is the hypothesis function that best describes the target in a supervised learning algorithm.
So now, technically, what does supervised learning mean? The hypothesis, the small h, is what an algorithm comes up with; there can be several small h, depending on the data and on the restrictions and bias we have imposed. So the next thing is that we need to find mathematical functions which give relationships between the data points. Right? Basically, the idea is that we want to find a mathematical function for which the error is minimum, or the result has maximum accuracy. Again I'm repeating: we want to find a simple mathematical function which gives me minimum error or maximum accuracy.
Clear? Thank you for understanding, learners. So now, what are the different stages we are looking at? Data points can be of any kind. Suppose these are my data points over here, and I want to fit them with a simple linear function; a straight line is a simple mathematical form, y = mx + c. Okay. So when y = mx + c, and these are my data points, how do I calculate my error?
The error is the actual minus the predicted. Here it will be very high, and this particular situation is known as underfitting. Do you think this straight line is consistent, covering all the data points? No. Right? So the simple straight-line equation or function is not capable of covering all the data points; that is underfitting. The other extreme is that if I draw a function which passes through every data point, it becomes a very highly complicated mathematical function of high degree. Right? So it makes a complicated, high-degree mathematical function, and that also we don't want.
Why don't we want it? Because if a new data point falls outside all the training data points, this function is going to give me my maximum error. So we want a mathematical function which is simple and which covers my data points well, something like this exponential curve, and this gives me a good fit, a good balance. Now clear? So AJ says no, the model even fails to predict labels beyond what it learned. Right: it will predict the labels, but with very low accuracy; the error will be high. So we will not accept models with high errors or low accuracy.
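The underfit / good-fit / overfit progression can be sketched by fitting polynomials of increasing degree to the same noisy curve. The exp(x) ground truth, the noise level, and the degrees 1, 3, and 9 are all illustrative choices, not anything prescribed by the course:

```python
import numpy as np

rng = np.random.default_rng(0)
x = np.linspace(0, 3, 30)
y = np.exp(x) + rng.normal(0, 1.0, size=x.shape)   # non-linear truth plus noise

errors = {}
for degree in (1, 3, 9):
    coeffs = np.polyfit(x, y, degree)      # least-squares polynomial fit
    y_hat = np.polyval(coeffs, x)
    errors[degree] = np.mean((y - y_hat) ** 2)   # training mean squared error
    print(f"degree {degree}: training MSE = {errors[degree]:.3f}")
```

Degree 1 (the straight line) leaves a large training error: underfitting. Degree 3 tracks the curve well. Degree 9 drives the training error even lower by chasing the noise, which is exactly the overfitting risk: its error on new points would be worse, not better.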
So how do I know whether it is underfitting or overfitting? Is there a technical way to tell? Yes. Generally, the error decomposes as bias squared plus variance: error = bias² + variance. A learner asks whether the good-balance equation has to be linear. No, it's not necessary that it be a linear equation; we want an equation which is simple and fits well. So how do we know? If the variance of the model goes very high, then I know it is an overfitting model; if the bias goes very high, I know it is an underfitting model. And to keep my error low, I want both the bias and the variance to be low. Getting my point? Right. Now, what do we mean by bias and variance?
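The formula quoted in the lecture covers the reducible part of the error; the full bias-variance decomposition of the expected squared error also carries an irreducible noise term:

```latex
\mathbb{E}\big[(y - \hat{f}(x))^2\big]
  = \underbrace{\big(\mathrm{Bias}[\hat{f}(x)]\big)^2}_{\text{high when underfitting}}
  + \underbrace{\mathrm{Var}[\hat{f}(x)]}_{\text{high when overfitting}}
  + \underbrace{\sigma^2}_{\text{irreducible noise}}
```

No model choice can remove the σ² term; tuning complexity only trades bias against variance.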
A very typical example. Bias means how far we are from the original data points; low bias means we are close to the center. And variance: what does variance mean? The spread of the predictions. When the spread is also low, this is the ideal situation we always want, where we will say a good fit has been achieved: low bias and low variance. Okay. And then there is the case of low bias and high variance, where the predictions are near my actual target points, but spread out.
So what case is that? That is known as the overfitting case. And what about high bias and low variance? Over here these are my data points, and this is high bias: the underfitting situation. So in the whole of machine learning, we want neither underfitting nor overfitting. And of course the worst case is where the bias and the variance are both high. The idea is to have low bias as well as low variance. Clear? Okay. Now I'll explain the previous example with this explanation; this will make things more clear.
So these are my data points over here. A linear model is not a good fit: it underfits, which means that over here my bias is high. What is bias? It means my predictions are far away from the actual points. Right? This is high bias. And what is variance? Variance is when the spread of the predictions is large. When the spread is large, or the mathematical function is too complex, it is overfitting. When my mathematical function is too simple, it is underfitting, and when there is a balance, it gives me the right fit.
So now when I talk about complexity: overfitting means more complexity, and generally linear models give me underfitting, right? We start with simple models, but generally they underfit. Then we move on to non-linear models: support vector machines, tree-based models, deep learning models. Why are we doing this? Because they give better, more accurate results with less error. But what is the cost? My model has become complicated, and I'm losing the interpretability of the model. What do I mean by interpretability? That if this is my input, right?
How do I know that this is going to be my output? That's possible in a linear model, because I know y = mx + c: if this is my input, I know what output I will get; otherwise I do not. Clear? And to make the overall picture more clear, please look at this slide. Ultimately, where is the trade-off? Where are we fighting in the whole of machine learning? If life were so easy, wouldn't the problems have been solved by now? But no, there is still a trade-off going on. A learner asks: "How will I know it is underfitting, beyond what the diagram shows; is there any mathematical way?" Yes, there are mathematical ways, but first let's get a grasp of the idea. Ultimately the idea is to get the highest accuracy, right? And interpretability is also important. Why?
Because we want to see why we are getting this output. So we always start with linear regression or logistic regression problems. That is what we are going to do throughout this journey. And why linear regression? Because it is linear, with a smooth, well-defined relationship, and easier to compute. As we move along this journey: decision trees provide good accuracy with high interpretability. Clustering and the KNN algorithm are mid-range; interpretability is okay and accuracy is also okay. But if I want really accurate results, that is why new, complex algorithms were built: kernel-based approaches for support vector machines, ensemble methods, and of course neural networks. Now clear? Is this slide clear to everybody? That is why complicated neural networks are capable of handling non-linear, non-smooth relationships, at the cost of long computation time. Clear? Okay. If I want interpretability of the model, then I might go in for decision trees, but I will have to compromise on my accuracy. But if I am not interested in interpretability (and we always want the best result), that is why neural networks and deep learning have taken over the market. It is we as users who decide. What is the difference between accuracy and interpretability? Interpretability is how I am going to find out the relationship of the output given a particular input.
So if I know the mathematical equation, say the output is related to β₁x₁ + β₂x₂, then I know why I am getting the result. But if it is some hard-to-interpret model like a neural network, working in layers, where in some layer differentiation, integration, addition, or subtraction is happening, I will not be able to interpret the result. But if it is giving me the best result, then of course I will use it, especially if I have a very mathematically complicated system. See, now try to understand the whole story. Let me just repeat the story, because this is very important in the whole of machine learning, and these concepts are valid for deep learning also.
Okay. The story says that we want to predict the output from the data. Clear? The first point says we want to predict the output with minimum error, or maximum accuracy. The first two points are clear. The threshold, Simon, is maximum accuracy and minimum error. That point is clear, right? And which is the simplest mathematical function that we have? We always talk about a straight line. If that straight line is giving me minimum error, fine; but will we always apply only one algorithm, or are there different algorithms and different hit-and-trial methods that we have to apply?
It is different hit-and-trial methods and algorithms that we will try in order to get the output. It's not just "okay, regression and decision trees are working, this is the end of it". No, you might go in for advanced algorithms, and that's where the research is going. Why do you think the problems are not solved? Because every algorithm has its own pros and cons, and the research is going on at each and every level; it's still going on. We are just trying to improve on the algorithms all the time. Why?
Because we still see the LLMs, ChatGPT, giving us wrong answers and wrong predictions. So there is still a lot of scope for improvement where we have to work. Clear? If the thing were so easy, I think by now everything would have been solved by AI and machine learning algorithms. No. Why is it not simple? First, data is variable and uncertain; it keeps on changing. Secondly, every algorithm has its pros and cons. So the errors, the accuracy, and my requirements on the product, the data, or the algorithm keep on changing, and that is why this change happens. Right.
So there is a plethora of algorithms which have been built. In machine learning we will always start with linear regression algorithms, because of the simple mathematical function and high interpretability, but they have lower accuracy. As we move along the journey of decision trees, clustering, and kernel-based methods, we gain accuracy, but the trade-off is that we lose interpretability of the model. Right? So, as I have been telling you, when splitting the data for machine learning in supervised learning, we always and only want to select the data randomly. There is a function, train_test_split, that we will be using to split the data.
train_test_split is part of the scikit-learn library. This is my data, and we will be randomly selecting and dividing it into training and testing data. Now clear? Usually around 70 to 80% of the data goes to training and around 20 to 30% to testing. Punit says: "Just wondering if interpretability is more important as we go to complex problems." Not really, Punit, because we are always interested in the final result. It's like my example: suppose a magician turns a pigeon into a flower. Are we interested in how he does it, or are we interested in the output, the flower? We are always interested in the output. Got it? Now, better. "So if logistic regression has minimum accuracy, is it a no-no?" The accuracy is lower, but it doesn't mean it's a wrong prediction. Logistic regression might not give you the best results for the prediction, and maybe using a neural network would; it's something like logistic regression giving you 70% accuracy while a neural network gives you 97 to 99%.
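The random split described above looks like this in code. The dummy data, the 80/20 ratio, and the seed are conventional illustrative choices, not requirements:

```python
import numpy as np
from sklearn.model_selection import train_test_split

X = np.arange(20).reshape(10, 2)   # 10 samples, 2 features (dummy data)
y = np.arange(10)                  # 10 matching targets

X_train, X_test, y_train, y_test = train_test_split(
    X, y,
    test_size=0.2,     # hold out 20% of the samples for testing
    random_state=42,   # fix the seed so the random split is reproducible
)
print(X_train.shape, X_test.shape)   # 8 training samples, 2 test samples
```

Note that the rows are shuffled before splitting, which is exactly the "select the data randomly" requirement; fixing random_state only makes that random choice repeatable across runs.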
So that's the difference, and of course that means neural networks are better, right? Got it. So yeah, there are a lot of terms which you have to understand in the correct sense, and I'm giving you time for doing that. Now coming to the part that you are all interested in: ma'am, what is the mathematical equation, the mathematical term which is used to evaluate the output? First of all, we are going to calculate error. In regression, the error is known as mean squared error (MSE).
Now let's start understanding this. So we have this mean squared error. Okay, where the name comes from I'll explain technically when we move on to regression. Now, in supervised learning, whether it's classification or regression, we know the data is divided into training and testing. Now tell me which error is more important. Let me put it this way: you go to a class, you are pursuing some course, you make errors while training, you are learning, and still you don't perform well in the test. And there is somebody who doesn't attend even one single class and performs well in the test.
So which error is more important: the training error or the testing error? Of course both are important, but among the two, which one matters more? What is more important, to have the training error low or the testing error low? If the training error is low but the testing error is very high, is that a good thing? It's an even bigger failure. So again I'm asking you this question: is the training error important or the testing error important? Of course the final test. If you're not performing well on the final day of the test, then the whole training is useless.
Agreed? So when my training MSE is high — of course I can try to reduce it by giving in more data — what is this case? This is the case of underfitting: the training error is high because the model is too simple for the data, and then I will say the model is underfitting. Now clear? That was the question, I think. So now, mathematically, this is how I will judge, based on my training and testing errors, that my model is underfitting, and I will then not use a straight line to fit my data points. The other case is when my training MSE is zero — excellent — but my testing MSE is very high. Absolutely not an acceptable case.
So this is the case of overfitting, which is high variance. And what do you think most models in machine learning go through: more of an underfitting problem or an overfitting problem? Reducing the training error is easy, but failing on the testing is a total failure, and that is why, for most of the algorithms, we will look at how to prevent overfitting. That is why I was telling you these concepts are very, very common to machine learning and deep learning models. So are you getting a grasp of it? I don't straight away start with regression.
You have to have a good foundation of machine learning concepts so that you understand the other material, the in-depth knowledge of how we accept and reject a particular model. When the data is there, we will split it into training and testing and then calculate the training error and the testing error separately. Okay? Suppose I said 1,000 rows: 700 rows will be used for training, then we will have 300 rows used for testing. Then calculate the errors for training and testing separately, and then decide whether it's underfitting or overfitting.
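The recipe above — split, compute both errors, compare — can be sketched end to end. The dataset below is synthetic (a quadratic target, my assumption, chosen so a straight line is deliberately too simple), so the exact MSE values are illustrative only:

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split

# Synthetic non-linear data: y depends on x squared, plus noise.
rng = np.random.default_rng(0)
X = rng.uniform(-3, 3, size=(1000, 1))
y = X.ravel() ** 2 + rng.normal(scale=0.5, size=1000)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0
)

# A straight line is too simple for quadratic data.
model = LinearRegression().fit(X_train, y_train)
train_mse = mean_squared_error(y_train, model.predict(X_train))
test_mse = mean_squared_error(y_test, model.predict(X_test))

# Both errors come out high and similar -> diagnose underfitting (high bias).
print(f"train MSE: {train_mse:.2f}, test MSE: {test_mse:.2f}")
```

If instead the training MSE were near zero while the testing MSE stayed high, the same comparison would diagnose overfitting.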
So the best case is where my training error is also low and my testing error is also low, or they are almost equal to each other. So again, looking at the graph: on the x-axis we have the model complexity and on the y-axis the prediction error. Over here on the left is underfitting: when I start with a simple model, my errors are high, both the test as well as the training. As the model complexity increases, the training error decreases, but the test error will dip and then increase because of high variance. So how do I decide which model is best for my data? Where the training and the testing error are minimum and close to each other. This is what I'm trying to explain: when the model complexity is low, the bias is high; as the complexity increases, the bias decreases. When the interpretability is high, the variance is low, and as the complexity increases, the variance increases. We are looking at the point where these two curves intersect to find the optimal model complexity.
Variance is a statistical term which means the spread of the data. So as the complexity of the model increases, the variance also increases. Got it? That's the relationship. Again, we are drawing a graph of error (accuracy would be the opposite of this) against model complexity. We always start with the minimum-complexity model; that is the case of underfitting. Agreed? When my model is simple, my bias is high. As the complexity of the model increases, my bias decreases but my variance increases. I don't want the overfitting situation or the underfitting situation; I am looking at the situation where my error is minimum. So the test error is U-shaped as a curve, and where its value is minimum, that becomes my optimal model complexity.
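A minimal sketch of that U-shaped curve, using polynomial degree as the model-complexity axis. The sine-plus-noise data and the specific degrees (1, 3, 15) are my assumptions, not from the lecture:

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures

# Synthetic data: a sine curve with noise, so a straight line underfits.
rng = np.random.default_rng(1)
X = rng.uniform(-3, 3, size=(200, 1))
y = np.sin(X).ravel() + rng.normal(scale=0.3, size=200)

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=1)

train_mse, test_mse = {}, {}
for degree in (1, 3, 15):
    # Higher degree = more model complexity.
    model = make_pipeline(PolynomialFeatures(degree), LinearRegression())
    model.fit(X_tr, y_tr)
    train_mse[degree] = mean_squared_error(y_tr, model.predict(X_tr))
    test_mse[degree] = mean_squared_error(y_te, model.predict(X_te))
    print(degree, round(train_mse[degree], 3), round(test_mse[degree], 3))
```

Training error can only fall as the degree grows, while the test error first drops (leaving underfitting) and, with enough complexity and little data, typically rises again — tracing out the U shape whose minimum marks the optimal complexity.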
Now clear to everyone? Okay. Now, since the topic has been covered, let's do one thing. If you have downloaded the ebooks material, I want everybody to make a folder on the desktop and move under lesson number three, something like this. Let me share my screen and show you how I work with it. So once you are in, we are starting with lesson number three. Do you see this 3.1 and 3.2, learners? So how do you go about opening it? Open the Jupyter notebook environment.
Install and open the file along with me. So, chapter number three: we have to start with lesson three, supervised learning regression and its application. Yeah. Are we all able to open this? Let me guide you on how to go about it. See, the prerequisite for the course, as I've been telling you, is Python. There are a lot of tools like PyCharm and VS Code you can use, but generally we use Anaconda. Please go to the Anaconda site, fill in your details like the email ID, and download Anaconda based on your system, whether it's Mac, Linux, or Windows, and go in for the full distribution. Or there is Google Colab, if you're aware of Google Colab.
That's also one of the online tools available which helps you run the code in a Jupyter-like environment. Let me show it to you. Anaconda distribution: go and download it, install it. Or the other way is to go to Google Colab over here and just log in through your Google account. That's the other way to go about it. And from here I will upload the file. Which file? The file which I have already downloaded from my ebooks, that's there on my desktop: machine learning, third chapter, 3.1, and I open it. Yeah.
So there are two or three methods; whichever one you are comfortable with, that's not an issue. If you're comfortable, Anush, on VS Code, that's not an issue. Got it? Now everybody is there with me. So this is one of the safest tools: you don't need to download Anaconda on your desktop; through Google Colab, with a good internet connection, you will be able to run the code. Yeah. So for this particular file, 3.1, again I'm repeating: go to your learning management system, and in the reference material please download the ebooks and the Jupyter notebooks, and then try to open them.
So if we are back to 3.1, let's quickly go through the file. That file is completely theoretical, okay, nothing great in that file. It talks about what supervised learning is and the two types of algorithms in supervised learning: regression and classification. What is the difference between regression and classification? In supervised learning, in regression the output is always numeric, and in classification the output is always categorical. So when the target variable is categorical, we do classification. With this example we are very clear: when we are predicting a numerical value, that is regression.
When we are predicting a categorical outcome, like determining whether tomorrow is going to be hot or cold — the visualization shows a thermometer scale divided into two colored regions, cold and hot, with a threshold separating them. And if you look at the applications of supervised learning: do you think HR people use machine learning to identify the different job profiles, for shortlisting of the résumés? HR is also using machine learning now. Yeah. In finance, what is the use of machine learning? Predicting fraudulent data: are you ready for that loan approval or not?
Right. And this is similar to how a credit card company determines your creditworthiness before issuing a card. Then emails: this is about spam filtering, whether it's spam or not. Manufacturing — anybody in manufacturing? In manufacturing, supervised learning is also used to inspect quality and classify products into different grades. For example, a factory might use a machine learning model to check for defects in products and ensure they meet quality standards, much like a quality control inspector. In the maritime industry, it helps in forecasting from historical events and weather conditions, as a precaution against accidents; it's capable of predicting weather, which might also go wrong, as we have seen. Supervised learning techniques like regression models can be used to predict tidal currents and forecast demand and supply, reducing inventory losses. Think of it as how a weather forecast predicts rain based on past weather patterns — broad predictions. We use this in the agriculture field too, and in automating parking. So under supervised learning there are a lot of algorithms that we will begin tomorrow. As you understand my style, we'll start with regression, then linear regression; logistic regression, Naive Bayes, and KNN all come under classification. Under linear regression we would be working on multiple linear regression, and before we move on to classification there are a lot of concepts that we will study, such as:
cross-validation, regularization, hyperparameter tuning, and sklearn pipelines. So there is a lot more to be explored. Tomorrow we will be concentrating on linear regression and its concepts. Tomorrow, right? And linear regression is now very clear: where the output is numerical, like predicting housing prices, we will use regression. After doing regression in supervised learning, we will move on to classification algorithms. Under classification there is a plethora of algorithms that need to be done; initially we'll start with logistic regression, Naive Bayes, KNN, decision trees, random forests, and support vector machines. So it's a long journey. Tomorrow we'll do linear regression.
Next weekend we will be doing classification, and then we will move on to the next stages: ensemble learning, then unsupervised learning. So I think you are all excited about this journey. It's a long journey, and I don't want to burden you today with all the concepts, so let's go slow and steady — that's my rule, slow and steady wins the race. But I hope you got the crux of today's class. So this is 3.1, and we'll be beginning with 3.2 too. We started with the very first session of machine learning, where we understood what machine learning is and what the different types of machine learning techniques are: supervised learning, unsupervised learning, and reinforcement learning.
What is supervised learning? Supervised learning means it consists of labeled data, which consists of input and output. We split the data and calculate the errors. The two main algorithms under supervised learning are regression and classification. Then we studied unsupervised learning. Unsupervised learning consists of unlabeled data, where groups and clusters are formed based on the similarity of the data. So the two main algorithms are clustering and association. And last but not least is reinforcement learning, where the machine learns based on positive and negative feedback. Are we good to go? Everybody is clear with these basic concepts of machine learning. And then we started with the concepts of overfitting, underfitting, and a good fit.
Yes, learners, now you will tell me: what do we mean by underfitting in machine learning? A very, very important question — it will definitely be asked if you say that you know the concepts of underfitting and overfitting. So, learners, can you tell me what we mean by underfitting and overfitting? Underfitting is the case where the error is high. What error is high? The bias is high; we are very far from the actual data points. But we are talking about a simple mathematical function. What is the advantage of a simple mathematical function? That it gives us more interpretability of the model.
What do we mean by more interpretability? That it gives us the relationship between the inputs and the output. And if we talk about overfitting, again the error becomes high because the variance goes very high, right? And it definitely makes the model more complex. So we don't want an algorithm or model to be underfitted or overfitted; we want a good, balanced fit. How do we judge that? The training error and the testing error should be similar. Right? So today we will start with the first supervised learning technique: the regression model.
Are we good to go? As you are getting more familiar with my style, you know that first we will try to cover the concepts and then move on to the practical aspect, because unless and until you understand what is happening, how are we going to interpret the model? There is no point moving on to the code — but definitely we'll move on to the code. So let's start with the concept of regression. Here we go. What is regression? Now if I ask you, what is regression? Tell me. When we're talking about regression, learners, please be clear with this concept: the target variable, or the dependent variable, is always numeric or continuous.
Don't just say the term "data." See, everything is data, but if you're not very specific, technically you go very wrong. So you have to understand: we're talking about structured data in tabular form, which consists of input as well as output values. Whatever the input and output values are, x over here is referring to your independent variables. But whenever we are choosing any model or algorithm, we always and always take into consideration the dependent variable. So it is very important that the dependent variable has to be numerical or continuous.
So basically, what are we trying to find out? We are trying to find out the relationship between the independent variables and the output. Clear? So let's try to break up the term "linear regression." Linear we understand from simple mathematics: any function of degree one is said to be linear. This means progressing from one stage to another in a single series of steps, sequential, extending along a straight line or nearly a straight line. So we all understand the term linear. And, as very rightly said by Nanda, regression is a statistical terminology which is used to find out the relationship of one dependent variable with the independent variables.
We always have one output. If the output is categorical, we will go in for a classification algorithm; and if the output is continuous or numerical, we will go in for regression algorithms. So, moving ahead: when I combine the two terms, linear regression means a linear approach for modeling the relationship between the dependent and independent quantitative variables. It will plot a straight line, because that's the simplest linear approach, as a best fit along the data points to predict the target value.
Right? So what are the different types of linear regression available to us? Simple linear regression: obviously the dependent variable is always one — that's the target or the output — but if the independent variable is also one, it is simple linear regression, and it is always represented by a straight line, y = β0 + β1x1, where β0 (beta-naught) is the intercept and β1 refers to the slope. Getting my point? And then we have multiple linear regression: when there is more than one independent variable, this straight line turns into a hyperplane.
So a line in two dimensions becomes a plane, and in higher dimensions it becomes a hyperplane. Are you all getting the concept — the mathematical equation along with the graphical view, and of course with the code? I'm trying to explain it to you from all aspects. And when someone asks about polynomial regression — sure, I can do that too. Over here we are talking about linear regression; linear means something of degree one. Do you see that all the variables have degree one? And these β values are known as the regression coefficients.
So the term linear regression means we are trying to find the relationship between the independent variables and the dependent variable. Simple linear regression is y = β0 + β1x1 — clear? — that is, when we have one input variable and one output variable. Mathematically it is represented by a straight line, where β0 represents the intercept and β1 represents the slope of the line. Then we have multiple linear regression: graphically, the straight line now becomes a hyperplane.
Of course, practically we will generally have more than one independent variable with the output variable. And then we have polynomial regression, that is, y = β0 + β1x + β2x² + β3x³, and so on. Clear now? So let's try to understand the same thing from the machine learning point of view. In simple terms, it is finding the best straight line fitting the given data set — a straight line which tries to find the best linear relationship between the independent and the dependent variable.
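A minimal sketch of fitting y = β0 + β1x1 with Scikit-Learn, as described above. The four data points are made up so that the line y = 1 + 2x fits them exactly:

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Made-up points lying exactly on y = 1 + 2x.
X = np.array([[1.0], [2.0], [3.0], [4.0]])
y = np.array([3.0, 5.0, 7.0, 9.0])

model = LinearRegression().fit(X, y)

# beta0: value of y when x = 0; beta1: slope of the fitted line.
print(model.intercept_)  # ~1.0
print(model.coef_[0])    # ~2.0
```

With real, noisy data the fitted β0 and β1 would instead be the values that minimize the squared errors discussed next.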
How do I decide the term "best"? Something with minimum error or maximum accuracy, right? That's it in simple terms. And if I talk about technical terms, it is a supervised machine learning algorithm that finds the best-fit relationship between the independent and dependent variables on the given data set — almost the same thing. But the basic algorithm it works on is OLS, the ordinary least squares method, also known as minimizing the sum of squared residuals. So to explain these terms graphically and mathematically, I want everybody to concentrate here on the graph on the shared slide. This is an equation of a straight line.
What is β0? β0 is the intercept. What do I mean by intercept? It is the value of y when x = 0. Are you all getting this point? Is it clear to everyone? And if I talk about β1, β1 refers to the slope of the line. Is this term clear to everybody? Since I have only one input or independent variable, it represents a straight line. Now, once we have understood the equation of the straight line, what do these blue dots represent?
The actual data points available, right? Say the first data point has x equal to one and y equal to three, and so on for the others. Now this is how we are going to calculate the error. What is the error? You can consider this y_i as the actual value, and the red dot on the straight line is the predicted value, if I want to fit a straight line on these data points. So this is my actual output and this is my predicted output. Clear? The error is clear.
How are we calculating the error? What algorithm are we technically using here? The ordinary least squares method, based on the residuals. A residual means an error, and we take the sum of squared errors. So what does this mean? It is equal to the sum of the squares of the errors, where each error is the actual minus the predicted value. Getting my point? And why are we taking the square of the error? There is a story behind that too. Suppose one error is +2 and another error is -2; if I add them, the total shows me zero error.
But it is not actually zero error. So I take the square of each value: it becomes 4 + 4, that is, the total error is 8. So the errors are always taken as squared errors, and we are looking for the minimum sum of squared errors. That is why it is known as the ordinary least squares method: we want the sum of the squared errors to be the least. This sum is known as the RSS, the residual sum of squares. Getting my point, learners? For this particular data point, this is going to be my first error.
For this data point, my second error; then the third one, fourth one, fifth one. And however many points there are, the error is calculated for every one of them. Clear to everybody? Because this is my actual y1, and what is my predicted y1? The predicted value comes from this function — yes, Nanda — and predicted values are represented as ŷ (y-hat). So for the first one it will be ŷ1 = β0 + β1x1, then ŷ2 = β0 + β1x2, and so on for each data point. Clear? This is how, based on this mathematical function, I am going to predict my output for each data point.
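The hand calculation above — +2 and -2 cancelling to zero, while the squares give 8 — can be reproduced directly. The three actual/predicted values are made up to match the example's errors:

```python
import numpy as np

# Made-up actual and predicted values whose errors are -2, 0, and +2.
y_actual = np.array([3.0, 4.0, 6.0])
y_pred = np.array([5.0, 4.0, 4.0])

errors = y_actual - y_pred
print(errors.sum())        # 0.0 -- positive and negative errors cancel out

rss = np.sum(errors ** 2)  # squaring prevents the cancellation
print(rss)                 # 8.0 -- the residual sum of squares (RSS)
```

OLS chooses β0 and β1 to make exactly this RSS quantity as small as possible.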
Clear? So, the formal statement of simple linear regression — I'm talking about simple linear regression with one input and one output. y_i is the value of the response variable. Again: response, target, predicted, dependent — they all mean the same thing; do not get confused by the terminologies. β0 and β1 are the parameters, x_i is the value of the predictor variable in the i-th trial, and ε_i is the random error with mean E[ε_i] = 0 and variance σ². Clear? Okay. Now, this random error is very important to understand. Are we talking about the whole data set, or a part of the data set, a sample of the data set, over here?
This is my population, and I take a part of it for training as well as testing. Do I always consider the whole data set or a part of it? It's always a sample, and we want the sample to be a true representative of the population. But since it is only a part of the data, there will be some error associated with it which is irreducible; that is represented by ε (epsilon). So ε_i is the random error which is always going to be introduced. It is not reducible like bias and variance or overfitting, but it is part of the whole linear regression model, where the mean of that error is zero and its variance is σ². Now clear? So the idea of the whole linear regression model is that it is highly interpretable.
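A small simulation of the irreducible error ε described above: even when we predict with the true line itself, the MSE does not drop below the noise variance σ². The line y = 1 + 2x and σ = 1 are my assumptions for illustration:

```python
import numpy as np

rng = np.random.default_rng(7)
x = rng.uniform(0, 10, 10_000)
eps = rng.normal(0.0, 1.0, 10_000)  # random error: mean 0, variance sigma^2 = 1
y = 1 + 2 * x + eps                 # assumed "true" relationship plus noise

y_hat = 1 + 2 * x                   # predictions from the true line itself
mse = np.mean((y - y_hat) ** 2)     # equals mean(eps**2), approximately sigma^2
print(mse)
```

No amount of extra data or model complexity can push the error below this floor; bias and variance are the only parts a modeler can trade off.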
What is the idea behind this? That it is why the error is zero.…
Transcript truncated.