Machine Learning With Python Full Course 2026 | Python Machine Learning For Beginners | Simplilearn
Introduces what machine learning is, how it uses data to learn and make predictions, and outlines the course focus on core concepts, regression, classification, evaluation, and practical Python-based modeling.
A thorough, practitioner-friendly intro to Python-based machine learning with Scikit-Learn, covering supervised/unsupervised learning, regression, classification, model evaluation, and practical pipelines.
Summary
Simplilearn’s machine learning course with Python builds a solid, hands-on foundation for anyone working with data. The instructor explains how ML uses data to learn and make decisions, then compares machine learning to broader AI and deep learning, highlighting where non-neural models fit in. You’ll learn key concepts like supervised vs. unsupervised learning, overfitting vs. underfitting, and the role of data quality (garbage in, garbage out). The course moves through regression and classification, including linear regression, logistic regression, SVMs, decision trees, and ensemble methods, with emphasis on training, evaluation, and hyperparameter tuning. Practical workflows are demonstrated using Scikit-Learn, including train-test splits, cross-validation (hold-out, K-fold, repeated K-fold), and model pipelines that streamline preprocessing (scaling, one-hot encoding) and modeling. The curriculum also covers regularization (lasso, ridge, elastic net), their impact on bias-variance trade-offs, and how to select hyperparameters through grid search. Real-world examples anchor the ideas: predicting housing prices, classifying spam, and more, plus a capstone project using a bike-rental dataset to practice regression with preprocessing pipelines. Overall, the video doubles as both a conceptual primer and a practical bootcamp for becoming proficient with Python ML tooling.
Key Takeaways
- Data quality matters: high-quality labeled data dramatically improves model accuracy, while garbage data yields garbage predictions.
- Supervised learning relies on labeled examples to learn input–output mappings, enabling tasks like predicting prices or spam detection.
- Linear regression serves as a baseline for regression tasks and scales to multiple features (multiple linear regression) with weights learned via least squares.
- Cross-validation (hold-out, K-fold, repeated K-fold) provides robust estimates of model performance and guards against overfitting.
- Regularization (lasso, ridge, elastic net) introduces penalties to reduce model complexity and mitigate overfitting, with alpha and L1 ratio controlling strength and balance.
- Pipelines simplify workflow by bundling preprocessing (scaling, imputation, one-hot encoding) with model training and prediction, ensuring consistent handling of new data.
- Grid search (and alternatives like random search) methodically tunes hyperparameters to find the best model configuration for the data.
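The cross-validation, regularization, pipeline, and grid-search takeaways above fit together in a few lines of Scikit-Learn. This is a minimal sketch, not the course's exact code: the synthetic data, the three features, and the alpha grid are invented for illustration.

```python
# A minimal sketch of the pipeline + grid-search workflow summarized
# above. The synthetic data and the alpha grid are made up; a real
# project would substitute its own features and parameter grid.
import numpy as np
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.linear_model import Ridge
from sklearn.model_selection import GridSearchCV

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))          # 200 rows, 3 numeric features
y = X @ np.array([2.0, -1.0, 0.5]) + rng.normal(scale=0.1, size=200)

# Bundle preprocessing and modeling so the scaling learned on the
# training folds is applied identically to validation folds and,
# later, to new data.
pipe = Pipeline([("scale", StandardScaler()), ("model", Ridge())])

# Exhaustively try each regularization strength with 5-fold CV.
grid = GridSearchCV(pipe, {"model__alpha": [0.01, 0.1, 1.0, 10.0]}, cv=5)
grid.fit(X, y)
print(grid.best_params_, round(grid.best_score_, 3))
```

Note the `model__alpha` key: Scikit-Learn addresses a hyperparameter inside a pipeline as `stepname__paramname`, which is what lets grid search tune the model without unbundling the preprocessing.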
Who Is This For?
Essential viewing for data scientists and analysts who want to master Python ML end-to-end, from data prep to model deployment, using Scikit-Learn and pipelines. Great for beginners who want practical code patterns and for developers ready to optimize models with regularization and cross-validation.
Notable Quotes
"Machine learning helps systems learn from data and make better decisions."
—Intro defining ML as data-driven decision making.
"Supervised learning is any type of machine learning that involves learning from labeled data in order to predict outcomes."
—Core definition used to distinguish supervised learning.
"We will study a lot about those algorithms like the different models that we can build and what their differences are, what their strengths are, what their weaknesses are."
—Overview of modeling variety and evaluation.
"Grid search is the method to exhaustively test hyperparameters and find the best model configuration on your data."
—Hyperparameter tuning technique explained.
"Pipelines bundle preprocessing and modeling so you don’t forget steps and can apply the same transformations to training and new data."
—Key practicality of using pipelines in ML workflows.
Questions This Video Answers
- What is the difference between supervised and unsupervised learning in practical terms?
- How do you evaluate a regression model and what metrics matter most?
- How can I use Scikit-Learn pipelines to preprocess data and train a model in one go?
Python Machine Learning · Scikit-Learn · Supervised Learning · Unsupervised Learning · Regression · Classification · Cross-Validation · Regularization · Lasso · Ridge Regression
Full Transcript
Hey everyone, welcome to this course on machine learning using Python. Today machine learning is becoming a part of almost everything around us, from movie recommendations and fraud detection to price prediction and smart assistants. Machine learning helps systems learn from data and make better decisions, and that is exactly why this skill has become so important. It's no longer just for researchers or data scientists; it is now one of the most useful and practical skills for anyone who wants to work with data and intelligent systems. So in this course you will understand how machine learning works using Python, which is one of the most popular languages for building machine learning models.
We will start with the core concepts, then move into regression and classification, and also cover how to evaluate and improve models properly. So if you want to build a strong foundation in machine learning in a simple and practical way, this course is a great place to begin. Let's talk about what we have covered in today's video. First, we'll understand what machine learning is, how it learns from data, and how the basic workflow works using features, labels, training data, and test data. Next, we'll look at some important concepts like overfitting and underfitting, so you can understand why some models perform well and others do not.
Then we'll move into regression, where you will learn how models predict continuous values using techniques like linear regression, lasso, ridge, and polynomial regression. After that you will explore classification, where you will understand how models predict categories using methods like logistic regression, KNN, decision trees, and SVM. We'll also cover model evaluation using metrics like accuracy, the confusion matrix, and mean squared error. And finally, you will see how to improve model performance using grid search, cross-validation, and pipelines. Also, if you are interested in mastering the future of technology, then the professional certificate course in generative AI and machine learning is the perfect opportunity for you.
This is offered in collaboration with the E&ICT Academy, IIT Kanpur, and it's an 11-month live interactive program providing you hands-on expertise in cutting-edge areas like generative AI, machine learning, and AI tools like ChatGPT, DALL·E, and Hugging Face. You'll also gain practical experience through 15-plus projects, integrated labs, and live masterclasses delivered by esteemed IIT Kanpur faculty. Alongside earning a prestigious certificate from IIT Kanpur, you'll receive official Microsoft badges for Azure AI courses and career support through Simplilearn's JobAssist program. So what are you waiting for? Hurry up and enroll now.
The course link is mentioned below. Now, before getting started, here's a quick quiz question for you. Which type of machine learning problem is used to predict a continuous value like house price? Your options are classification, regression, clustering, or filtering. Let me know your answers in the comment section below. So without any further ado, let's get started. What is machine learning? It's a subset of artificial intelligence: essentially, machines learning from data in order to make decisions. This was a big departure from the rules-based systems at the time, which were explicitly programmed to make decisions.
So think of an example like a really big chain of "if this, then that, else if this...": a bunch of rules that had to be pre-programmed in order to come out with some final answer. With machine learning it's the exact opposite of that: we're actually training something from examples, from existing data, in order to predict something or make some type of decision. And so we're going to learn about the various ways we can do machine learning. If you remember, we talked about some of this before: the differences between rules-based approaches and learning-from-data approaches.
Included in that is complex unstructured data, things like images, text, and audio. What handles those really well is deep learning, which we will get to in the course after this, but those certainly count as learning from data, even complex data. So we had this picture, and I think this is around where we left off last time: distinguishing between those three terms. We see artificial intelligence, deep learning, and machine learning used interchangeably, but this is really how they fit together.
Artificial intelligence is broad: anything mimicking human intelligence, which doesn't have to involve learning from data. Machine learning is part of that, and one way to accomplish machine learning is to use neural nets, which is the focus of deep learning. Deep learning has found a lot of success, especially recently, with those complex data types like images, speech, and text, so it's used all over the place. Even in modern generative AI we see deep learning used quite a bit; really, anything using neural nets is going to be deep learning.
Again, we'll focus on that later, but we're going to be mainly focused on machine learning for this course, primarily machine learning that does not use neural networks. So models that are not neural networks will be our focus. In machine learning we had an example of a game essentially learning what decisions to make based on the current state of the board. This could be a machine learning example that learns from many previous examples: a lot of data around these games is used to train bots that can play the games at a very high level.
There have actually been a lot of successes in machine learning and deep learning around playing games like chess or Go using machine learning algorithms, which is pretty cool. All right. I think this is where we ended last time: we said there are a bunch of different use cases for machine learning. A recommendation system is a big one, and we will actually study that in one of the final lessons of this course. Chatbots, like generative AI doing sentiment analysis, we'll study later, but those are certainly an application of learning from data in order to generate responses to text prompts.
Spam filtering is a good example: classifying an email as spam or not spam. That gets trained from examples, learning from data such as previous emails. Social media post analysis is another text-data use case; you can do a lot with that text, like predicting the sentiment or predicting the category of what the post is talking about. All of those can be done with machine learning, along with many other use cases not on this list that we will cover as we go further.
Okay, so this is where we left off. What's doing all the hard work here? Machine learning algorithms. These are the things that learn from the data. They are mathematical rules and formulas (not formal rules in the sense of a rule system) that help us learn from the data. They correlate the data to some type of outcome.
That outcome could be a number, as we will see, like a price, demand, or sales; or it could be a category, like whether a transaction is fraud or not fraud, or the probability that it is fraud. So we have different kinds of predictions we can make with machine learning, and we will study the differences between them coming up. But machine learning algorithms are really the models, right? They're the models that power machine learning to actually learn from data.
So we're going to spend a lot of time in this course studying those algorithms: the different models that we can build, what their differences are, what their strengths are, and what their weaknesses are. Okay. You can imagine everything is very data dependent, right? We're learning from data, so it makes sense that the quality of the data really matters in determining how strong the model can be. You see this graph here charting high-quality data versus just any old data in a decent enough quantity.
You can see that performance, measured by some evaluation metric (think of it as something like accuracy: if we were predicting fraud or not fraud, how accurate our model gets at actually detecting fraud), gets better and better the higher the quality of data we have. There's a saying in machine learning: garbage in, garbage out. It means that with poor data, even the best model in the world is not going to be accurate and perform well.
So the data needs to be high quality, meaning there needs to be a decent amount of it, it needs to be labeled appropriately (as we will talk about), it needs to not have significant outliers, and it needs to be clean, without missing values. You have a good chance at deriving good predictions from higher-quality data, as this chart shows. Okay. One thing we're going to learn as we go along is that quantity matters as well: not only quality, but a decent amount of it.
We're going to learn rules of thumb, like how much data we need for certain algorithms. One thing we will see is that the basic machine learning models we'll study don't need as much data as a neural network would; neural networks require a lot more than a basic machine learning model. So with each model we study, we'll discuss how much data we actually need to produce a high-quality model.
Okay, any questions so far? Okay, let's talk about the different types of machine learning we're going to discuss. There are two primary ones that we will study in this course, and then a couple of others that are a little more advanced that we won't get to but are worth knowing about. So there are four total that we'll talk about, on this list here, and the first two are supervised learning and unsupervised learning. I'd say the majority of our focus will probably be on supervised learning, and we'll talk about what that means, but we'll also cover unsupervised learning as well.
We'll look at the most popular techniques in each of these types of machine learning. We'll mention the other two, but not really study them, because they're more advanced topics beyond the scope of what we'll do. These are different styles of machine learning characterized by what kinds of predictions they make, what kind of data they need and require, and what kind of outcomes they actually produce. So let's get into each of these; the one we'll spend the majority of our time on is supervised learning, but we will study unsupervised learning as well.
We'll study both, and we're going to define both of those coming up. Again, the other two will be more advanced topics that we won't spend too much time on, but we'll discuss their relevance in machine learning and give a good definition. All right, let's start with supervised learning. This is a term that refers to using labeled examples, labeled data, to help our model train. In other words, it helps our model be able to predict, guided by specific input-output pairs.
"Supervised" really refers to the fact that we have answers. We have examples with answers attached, and we use that collection of data to build our model so that we can predict things like a price or a category, like spam versus not spam. In this slide, we would be predicting whether a shape is a square, a triangle, or a circle. When we build a model for that, we have data that has an answer attached to it, right? We've talked about this before a little bit with labels.
So there's a guide there toward building our model: every example has an answer, and that answer is really critical for building our model. It's almost like having a bunch of exercises in a math textbook along with the answers, so you can check your work. As we're going to discover, a lot of the model training process is basically checking our work against these answers in our training data.
Okay. So supervised learning is any type of machine learning that involves learning from labeled data in order to predict outcomes. The outcomes can be numerical, like a price, temperature, demand, sales, or revenue, but they can also be categorical: spam or not spam, fraud or not fraud, cancer or not cancer, dog, cat, giraffe, those kinds of categories. It's some type of outcome. The key is that we're using labeled examples to guide our model building.
That's why it's called supervised learning. In our data we know what the inputs are; think of the inputs as all of our columns. Then we have a special label column that represents the output we're trying to predict. So if you think about housing price data, the label could be the price. That's something we would build a model to predict, but we have answers for all of the examples in our rows, answers that help guide and tweak our model, because we know the answer ahead of time.
So those are really good examples to build our model from. Okay, that's supervised learning. [Question: in this example, is "circle" not in the prediction because it's not part of the test data, even though it's in the labeled data?] Not necessarily. It just means that we learn against all of these examples that have answers, and then, when we observe new examples, we try to predict what they would be based on what we've seen before. It's a coincidence that we only have two examples in our test data; we could have a circle there, in which case we would hope our model predicts circle. It may or may not get it right. In reality, we would test against a lot more than two examples.
And we're actually going to see why we do that: why we train our model and then use additional data to evaluate it. It's really important to do that step to get a sense of how good our model is before we take it out into the real world. If we apply the model we built on our labeled data to a set of test data it hasn't been exposed to before, it gives us a sense of how good the model is. So the test set is usually used for evaluation; that's something we'll study. [Question: how do we train?] It depends on the model. Training will be an algorithm that updates the model according to the data, those labeled examples. Every model is different in exactly how it trains, so we'll talk about that when we get to the individual models. But loosely speaking, they use the data to adjust themselves.
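The hold-out evaluation step just described can be sketched with Scikit-Learn's `train_test_split`. The toy data, the simple decision rule, and the 80/20 split ratio below are illustrative assumptions, not the course's code.

```python
# Hold out part of the labeled data so the model can be scored on
# examples it never saw during training (toy data, for illustration).
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(42)
X = rng.normal(size=(300, 2))
y = (X[:, 0] + X[:, 1] > 0).astype(int)   # a simple, learnable rule

# 80% of rows for training, 20% held out purely for evaluation.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0)

model = LogisticRegression().fit(X_train, y_train)
print("held-out accuracy:", model.score(X_test, y_test))
```

Because the test rows played no part in fitting, the score above is an honest estimate of how the model would do on genuinely new shapes, prices, or emails.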
Imagine adjusting the model like tuning a bunch of knobs. The best example I can give you, and I think I did this one last week, is a function that predicts the price: say it has weight one with feature one, weight two with feature two, and weight three with feature three. So imagine we had three input features and we built an answer from them. Essentially, what we do to train the model is adjust these weights in order to get the answer correct, based on our labeled examples.
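The "knobs" analogy, one weight per feature, can be made concrete with a least-squares fit. The three features and their true weights below are invented for the sketch.

```python
# For a linear model, "turning the knobs" amounts to solving least
# squares for one weight per feature (toy numbers, invented here).
import numpy as np

rng = np.random.default_rng(1)
X = rng.normal(size=(100, 3))                      # three input features
true_w = np.array([3.0, -2.0, 0.5])                # knob settings we hope to recover
y = X @ true_w + rng.normal(scale=0.05, size=100)  # labeled examples

# Prepend a column of ones so the model can also learn an intercept.
X1 = np.column_stack([np.ones(len(X)), X])
w, *_ = np.linalg.lstsq(X1, y, rcond=None)
print("fitted intercept and weights:", np.round(w, 2))
```

With enough labeled examples, the recovered weights land very close to the true ones, which is exactly the "checking our work against the answers" idea from above.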
That's something we're going to learn about shortly, when we actually dive into models. Every model is slightly different in how it trains, but at a high level it uses the training data, the labeled examples, to guide the formula to adjust itself and generate the proper model. These weights get adjusted according to the data in order to produce the correct output, so think of them as knobs that we turn. [Question: which type of machine learning is used most?]
Probably supervised, which is what we're talking about now, because most people want to build some type of model to predict something. And yes, we are definitely going to learn how to train; we'll do the code, and I'll tell you how it's done. But as I was saying, it's on a model-by-model basis, so I want to wait until we get into the individual models, and then we'll talk about how they're trained.
But yes, we'll learn how to do that. And supervised learning is used all over the place. Even generative models use supervised learning: an LLM uses labeled examples in order to train, to learn how to generate responses to prompts. It needs to learn against a lot of text examples, so supervised learning is what results in that model learning from those labeled examples. And yes, a lot of image processing is supervised too, like object detection.
The YOLO model is an object detection model, yes, and it has to be trained on images with labels, such as which object is in this image and where the box around the object is. If a model ever uses labeled data to train and build itself, it is supervised, so YOLO is definitely supervised. We will actually cover the YOLO model later on in our deep learning course, when we talk about object detection.
So we'll study that, but yes, it's supervised. Okay. On the slide we have some common supervised learning algorithms that we will study. We'll understand what they do and how they work, but just to name them: linear regression is the one I just drew out. It's the prototypical, easiest-to-understand model: exactly as shown, a weight times a feature, plus a weight times a feature, plus a weight times a feature, and so on.
You can have as many terms as you want; that is a linear regression. It's a supervised model because we need the label value and all of our inputs in order to train it and generate all those weights; it uses labeled examples to tune all those knobs. The same goes for the other models: we're going to talk about decision trees, logistic regression, and SVMs, which are support vector machines. We'll talk about all of those, and they're all examples of supervised learning.
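As a quick illustration, here is a hedged sketch fitting two of the models just named, logistic regression and a decision tree, on the same toy labeled data. The data and its circular class boundary are invented, chosen to show that different supervised models have different strengths.

```python
# Two of the supervised models named above, trained on the same toy
# labeled examples (the data and its circular boundary are invented).
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(7)
X = rng.normal(size=(200, 2))
y = (X[:, 0] ** 2 + X[:, 1] ** 2 < 1).astype(int)  # inside/outside a circle

log_reg = LogisticRegression().fit(X, y)            # linear decision boundary
tree = DecisionTreeClassifier(max_depth=4, random_state=0).fit(X, y)

# The tree can carve out the circular region; the linear model cannot.
print("logistic:", log_reg.score(X, y), " tree:", tree.score(X, y))
```

Both train from the same labeled examples, but the tree scores much higher here because its axis-aligned splits can approximate the circular boundary, exactly the kind of strengths-and-weaknesses comparison the course promises.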
They're all supervised because they all require labeled examples in order to train them and then subsequently use them. Okay. So what are some use case examples? In supervised learning we may be predicting temperature based on yearly temperature trends: we would have that yearly data as our labeled examples, and it would supervise the learning of a model that predicts temperature. Same thing with predicting crop yield based on seasonal crop quality changes: maybe we have a bunch of features relating to crop quality, and we could predict crop yield.
We would just need historical examples with those labels, the crop yield for each time period. With those supervised examples we could easily build a model. The last one, sorting waste based on known waste items and their corresponding waste types, is basically like spam filtering. Think of it like the shapes example, sorting things into squares, circles, and triangles. Same idea here: we have a bunch of labeled examples of which category those waste items belong to, like which waste bin they would go in.
With those labels we can learn what category of waste each item belongs to. Same thing with spam: something is fraud or not fraud, spam or not spam, cancer or not cancer. All of those are supervised learning examples because, in order to train them, they require data that has those labels. Anything that has labels is going to be supervised learning. Again, this is where we will spend the majority of our time: supervised learning problems.
These are problems where we have labeled data; we build a model and we predict those labels. Okay, before we go to unsupervised, any questions about supervised? All right. So supervised requires labels in order to have examples to build your model from, because you're predicting those kinds of outcomes, like spam or not spam, cancer or not cancer. Now, unsupervised learning is completely different; it's the opposite. Unsupervised learning is where we do not use labels whatsoever. The data can be completely unlabeled, and even if it is labeled, we don't use the labels in any way.
Primarily we'd say it's unlabeled data. We have no guidance to predict anything, but that's because we're not really predicting anything in unsupervised learning. Generally what we're doing is looking for some structure or pattern. One very popular example is the second one on the slide: identification of user groups based on similarities or commonalities. This is a problem known as clustering, and it's one we will study quite a bit.
There turn out to be lots of different algorithms that can accomplish clustering. What clustering attempts to do is basically say: we have data over here, data over there, and data over there, so let's group each of those together. This should be one group, this should be one group, and this should be one group, and we can find those structures and say this is group one, this is group two, this is group three. We build what we call clusters of data based on how close together the points are located, in cluster zones like the boxes I've drawn.
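The grouping idea just described can be sketched with k-means, one standard clustering algorithm. The three blob locations below are arbitrary, invented for the example.

```python
# Clustering groups nearby points with no labels attached (a sketch;
# the three blob centers are arbitrary, invented for illustration).
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(3)
X = np.vstack([                               # three well-separated groups
    rng.normal(loc=(0, 0), scale=0.3, size=(50, 2)),
    rng.normal(loc=(5, 5), scale=0.3, size=(50, 2)),
    rng.normal(loc=(0, 5), scale=0.3, size=(50, 2)),
])

# k-means only ever sees the coordinates, never any labels.
kmeans = KMeans(n_clusters=3, n_init=10, random_state=0).fit(X)
print("points per cluster:", np.bincount(kmeans.labels_))
```

Note that `y` never appears anywhere: the algorithm recovers the three groups purely from how close the points sit to one another, which is the defining property of unsupervised learning.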
Okay. Now, that doesn't require any labels, which is really fascinating: in unsupervised learning you don't need any labels at all to run the algorithm. So clustering is one good example. Finding outliers or anomalies is another: we don't necessarily have any label for what counts as an outlier or an anomaly; we derive that from the features alone. There's no guidance, no label, for outlier detection or anomaly detection. One that's not listed here but is also really important, and that we will study, is something known as dimensionality reduction.
What dimensionality reduction focuses on is basically compressing the data set a bit. We take our data and compress it, but in such a way that we retain as much information as we can; it's a very smart compression. It lowers the dimension, and you can think of the dimension as the number of columns. So imagine we had 100 columns in a data frame. We could reduce that down to 10, so 10% of the original.
And those 10 columns are not just 10 of the originals with the other 90 chopped out; we smartly compress all of that information into 10 new columns that are compressed versions of the hundred we used to have. So dimensionality reduction is another unsupervised technique: it requires no guidance and no labels, but it's a really useful way to reduce the size of your data. We'll study how to do it and the details behind the algorithms.
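One standard technique for the 100-columns-to-10 compression described here is PCA (principal component analysis). A hedged sketch, using synthetic data deliberately built to contain only 10 underlying signals:

```python
# Compress 100 correlated columns into 10 new ones while keeping the
# information (synthetic data, built to have 10 underlying signals).
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(5)
signals = rng.normal(size=(200, 10))   # 10 hidden source columns
mixing = rng.normal(size=(10, 100))
X = signals @ mixing                   # 200 rows x 100 redundant columns

pca = PCA(n_components=10)
X_small = pca.fit_transform(X)         # 200 rows x 10 compressed columns
print(X_small.shape, "variance kept:",
      round(pca.explained_variance_ratio_.sum(), 3))
```

Because the 100 columns were really mixtures of 10 signals, the 10 new columns retain essentially all of the variance; no labels were used anywhere, which is what makes this unsupervised.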
So in unsupervised learning we'll probably spend the most time on clustering and dimensionality reduction. [Question: in supervised learning, if some data is present but we didn't label it, for example we had circle, triangle, and square in the training data and we add a pentagon but didn't label it?] In supervised learning, every row in our data frame needs to have a label associated with it; it needs a column that represents the label. So if we've never seen "pentagon" before, I can't use that as a label.
The pentagon has to exist in the data if I'm going to be able to predict it, right? If we've never seen it before, we have no examples to go from and no guidance, so how could we predict it? We can't. If we do have labels of pentagon, then yes, we could predict pentagon. [Question: would we remove it?] No, let me go back to that page.
We wouldn't remove it. It's just that if it's not in our labels, we're not going to be able to predict it. Pentagon is a good example here: it is not currently in our data set as one of the labels. We only have data that's either a triangle, a circle, or a square; we don't have pentagon. So I would never be able to predict pentagon if I haven't seen examples of it before. But let's say we had it in there.
So if we had Pentagon, um we could have an example of it in our labels. And then yeah, we it could be then we could predict it. Yeah. Yeah. The the don't get worried. Don't worry about the test data. So the test data is just saying here's a new here's a shape. What is it? Okay, that's a square. Here's a shape. What is it? Okay, that's a triangle. And we could have as many of those examples as we want in our test data. So we could have a circle and say, okay, what's this should be circle, right?
The test data can be whatever it whatever it wants. But yeah, if if we've never seen Pentagon before, we're never going to be able to predict it. These are the the label data and labels are basically the talking about the same thing. The labels just mean what are the categories that are present in our data. So in this data we only have three labels that are present. So the labels is are relative to our label data, right? It's saying what labels, excuse me, what labels uh do we have in our data and we only have those three circle, triangle, square.
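You can see this directly in scikit-learn: a trained classifier can only ever predict the labels it saw during training. The shape features below are made up purely for illustration.

```python
# A trained classifier exposes its known labels via `classes_`,
# and predictions are always drawn from that set.
from sklearn.tree import DecisionTreeClassifier

# Toy features: [number of corners, has curved edge (0/1)], invented for this example
X_train = [[0, 1], [3, 0], [4, 0], [0, 1], [3, 0], [4, 0]]
y_train = ["circle", "triangle", "square", "circle", "triangle", "square"]

clf = DecisionTreeClassifier(random_state=0).fit(X_train, y_train)
print(clf.classes_)           # only circle, square, triangle -- no pentagon

# A 5-cornered shape still gets mapped to one of the three known labels;
# the model can never answer "pentagon" because it never saw that label.
print(clf.predict([[5, 0]]))
```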
So pentagon would not be part of those labels, and we couldn't predict it. And no, unsupervised is not going to make a prediction; that's the big difference with unsupervised. It does something different: it says these points are similar, these are similar, this is a cluster, this is a cluster, this is a cluster. It doesn't make a prediction; that's what supervised learning does. And yes, clustering is unsupervised: clustering does not require any labels.
Unsupervised just means we don't have, and don't require, any labels. The other thing unsupervised might do, again without labels, is say that this point is an outlier, because there's only one of those and it's not like the others. That's something unsupervised can do. And does it label a cluster? Kind of, in the sense that it assigns a number: this is cluster one, this is cluster two, this is cluster three.
It'll assign a number, but that's not a prediction label in the traditional sense. It provides a numerical index for the cluster, because what we want to know is: this point belongs to cluster one, this one belongs to cluster one, this one belongs to cluster two. There needs to be some index of which cluster each point belongs to. So it's kind of like a label, but not in the traditional prediction sense.
Very good. So again: unsupervised, no labels. You're doing things like identifying clusters, identifying outliers, doing dimensionality reduction. These are all structure- and pattern-oriented tasks; they're not predictions of a label, which is what we'd see in supervised learning. An example: we put in the data and group items such as images into categories based on similarities, which would be the clusters. These are groups we don't have any labels for ahead of time. We don't say this image should belong here and that image should belong there; we derive that from the characteristics of the data.
A good example is customer groups. We'd characterize customers by questions like: do they have similar spending levels? How many days a week do they go shopping? How much money do they spend? And we could group together customers with similar qualities. Clustering will discover those groups based on the similarities in the data, but there are no labels ahead of time saying this person should be in this group and that person in that one.
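The customer-grouping idea can be sketched with k-means clustering. The spending and visit numbers here are synthetic, invented only to make two groups for the algorithm to find:

```python
# A minimal sketch of customer segmentation with k-means.
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(42)
# Two made-up customer features: [weekly spend ($), shopping trips per week]
low_spenders  = rng.normal(loc=[40, 1],  scale=[5, 0.5],  size=(50, 2))
high_spenders = rng.normal(loc=[200, 4], scale=[20, 1.0], size=(50, 2))
X = np.vstack([low_spenders, high_spenders])

# No labels are passed in -- k-means derives the groups from the data itself
kmeans = KMeans(n_clusters=2, n_init=10, random_state=0).fit(X)

print(kmeans.labels_[:5])       # cluster index (0 or 1) for each customer
print(kmeans.cluster_centers_)  # the "average customer" in each group
```

Notice the output is just a cluster index per row, not a meaningful category name: exactly the "label, but not in the traditional prediction sense" distinction above.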
There are no labels of that kind; the grouping gets derived during the algorithm. It's unsupervised, and unsupervised literally means no guidance. We derive the groups from the structure of the data, which is the similarities. All right, a couple more for you. We had supervised, which uses labels. We have unsupervised, which uses no labels and looks for structure. And then we have something in between, known as semi-supervised learning, where you use a little bit of labeled data while most of your data is unlabeled.
You try to get some use out of that labeled data in order to build a model. The labeled data provides the guidance: usually what happens in semi-supervised learning is that you use your labeled data to predict what the labels should be for the unlabeled data. You create artificial labels for the unlabeled data, and once everything has been labeled you can use all of it in a supervised learning approach.
So semi-supervised refers to the fact that you start with most of your data unlabeled but with some labeled examples, and you extrapolate those labels into the unlabeled dataset to provide artificial labels. Then everything has a label and you can do supervised learning. It falls between supervised and unsupervised. This is pretty rare, though; most of the time you won't do this, and you'll prefer to start with all labeled data.
That's usually the preferred approach: most of the time you'll just be doing supervised learning, not semi-supervised. But yes, going back to the pentagon example: if we had a lot of unlabeled examples of pentagon, tried to guess what kind of shape they were, provided artificial labels, and then used the whole dataset to build a model, that would fall into this category. And going back to another question: don't they still use some kind of labeled data, like age and gender?
Those aren't really labels; those are features. So yes, they still use the core features of the data; they just don't have labels in the traditional sense. Think of a label as something we're trying to predict. Whether that's a price, or a category like spam/not spam or cancer/not cancer, it's something we're interested in predicting. In our data, we'd have an answer for every row: one of our columns would be the outcome we're trying to predict.
That's the label. In unsupervised learning, we don't have any labels; we just have the regular features like gender, age, income, square footage, bedrooms, bathrooms. So semi-supervised falls in between. The reason is that a majority of the data is unlabeled, but we can try to label it: take what we know from our existing labels, predict an artificial label, and then use all that data together in a supervised fashion for a model down the road.
That's what this picture shows: we have some labeled data, most of our data is unlabeled, and we try to infer labels for it. Maybe we have categories like babies, tweens, teens, youth, and adults. We take our labels and extrapolate them into artificial labels for the unlabeled data, so that everything has a label, and then we can go ahead and do supervised learning from there.
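scikit-learn ships a version of this "extrapolate the labels" idea called self-training. This is a minimal sketch with a synthetic dataset, not course material: unlabeled rows are marked with -1, and the model propagates artificial labels to them from the few labeled rows.

```python
# A minimal sketch of semi-supervised self-training in scikit-learn.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.semi_supervised import SelfTrainingClassifier

X, y = make_classification(n_samples=200, n_features=5, random_state=0)

# Pretend only the first 20 rows are labeled; the rest get -1 (= unlabeled)
y_partial = np.full_like(y, -1)
y_partial[:20] = y[:20]

model = SelfTrainingClassifier(LogisticRegression())
model.fit(X, y_partial)  # fills in artificial labels, then trains on everything

# After fitting, the model predicts labels for anything, like a normal classifier
print(model.predict(X[:5]))
```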
What we would prefer to do, and what we'll do in this course, is just start with supervised. We'll start with the labels and won't usually try to derive artificial ones. One real-world example of semi-supervised learning is Google Photos: whenever you take a picture, it can provide labels based on previous images in your library, producing tags or labels for it. Generally, when you take a picture it's unlabeled, unless you go in and specifically provide some tags and labels.
But even if you don't, it can still artificially create a label based on the other labeled data you already have. So that's an example. All right, last one in terms of machine learning. We have supervised, unsupervised, and semi-supervised, which is somewhere in between, a mixture of labeled and unlabeled data. Now we're going to talk about reinforcement learning, which is completely different from the other three. It's a type of machine learning where we learn from interaction with an environment.
You might ask: what are we learning? We're learning what actions to take in the environment, and the way we do that is by reinforcing positive actions that lead to a reward. That's where the word reinforcement comes from. Imagine a child learning from trial and error: they're trying to crawl, trying to walk, and they keep falling down. Eventually they learn how to do it through trial and error. They might get a reward that reinforces the positive movements that lead them to walk or crawl, or they might learn from the penalties.
They learn from feedback, like from falling down: "Oh, that hurts. I should support myself a little better," or be a little more coordinated. They learn from their actions and their interaction with the environment. This is a complex family of algorithms. It deals a lot with taking actions: usually when you take an action, something changes in the environment, and then you observe some type of feedback. Think of a board game where you're trying to figure out what move to make, or a robot trying to navigate a maze.
What route should the robot take? Should it move forward, backward, left, or right? Those are different actions it can take. Or a self-driving car: should it turn, speed up, slow down? Those are all good examples of things that have been trained with reinforcement learning. Real-world examples of rewards: in a board game, a reward would be winning the game, or capturing a piece in checkers or chess.
A penalty would be losing the game or losing one of your pieces. In a robot navigation task, the robot could get rewards for moving in the right direction, toward the exit. Say you wanted to train a robot to open a door and navigate a room: you'd penalize it for bumping into the wall and reward it for making progress. As for the reward itself, it's usually a discrete function of the state. Let's go back to the maze example: say this is the entrance and this is the exit.
If they make it to the exit, they get a numerical reward of, say, plus 100. It's just a number. And if they go into, say, a pit, a kind of death trap, that would be minus 100. So the rewards can be discrete numerical values. If they're moving in the right direction, say we want to encourage going this way, we can give smaller intermediate rewards: a plus 5 for this step, then a plus 10, a plus 15. And if they're moving in the wrong direction, away from the exit,
that would be a minus 5 or minus 10. Does that make sense? The rewards are numerical in nature, and what you're trying to do is collect the most reward, the largest total you can, through trial and error. You try this out many, many times; you basically simulate running through the maze over and over. And what dictates where I should go is based on what I've observed in the past. It's like a child remembering: okay, what move should I make from this space?
If I'm here, which way should I go? Down, right, left? You know that from experience: based on the rewards I've seen in the past, when I've moved down I've gotten a higher reward than moving left or right. So yes, the reward is a numerical value. Great question from the chat: how does it differentiate rewards based on gain and loss, for example in chess? It's a complicated answer, but essentially on the chess board you can think of every configuration as a state.
I could be in this state or that state, and it's not only where my piece is but where all the other pieces are, so there are lots of possible states. There's a way to quantify the value of taking a certain action, like moving my piece left, right, up, or down, given the rest of the state. And you're right, it may be beneficial to sacrifice a piece. But we'd learn through experience that the best move in this situation is to sacrifice.
We'd have to learn that through trial and error, many, many times: if I'm in this current state of the world, with all the pieces distributed this way, the best move for me in the long run, to get the most reward, is actually to sacrifice my piece and move it into a theoretically bad position, because we know from experience that's where the most long-term reward comes from.
So what you learn is how to take actions. Actions are usually things like move right, move left, move up, move down. For a self-driving car, they'd be things like slow down, speed up, turn the wheel 10°. The short answer is there's a calculation where you learn the long-term value of every unique state, and then you ask: what action should I take given the current state of the world? And I really like reinforcement learning.
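That "long-term value of every state" calculation can be sketched as tabular Q-learning on a tiny one-dimensional maze. The maze layout and reward numbers are invented for illustration; this is not part of the course material, just a taste of the idea.

```python
# A minimal sketch of tabular Q-learning on a tiny 1-D "maze".
# States 0..4 in a line; state 4 is the exit (+100), state 0 is a pit (-100).
# Actions: 0 = move left, 1 = move right. All numbers are illustrative.
import random

random.seed(0)
N_STATES, ACTIONS = 5, [0, 1]
Q = [[0.0, 0.0] for _ in range(N_STATES)]  # Q[state][action] = long-term value
alpha, gamma, eps = 0.5, 0.9, 0.2          # learning rate, discount, exploration

def step(s, a):
    s2 = max(0, min(N_STATES - 1, s + (1 if a == 1 else -1)))
    if s2 == N_STATES - 1: return s2, 100, True   # reached the exit
    if s2 == 0:            return s2, -100, True  # fell into the pit
    return s2, -1, False                          # small penalty per step

for _ in range(500):                              # many trial-and-error episodes
    s, done = 2, False                            # start in the middle
    while not done:
        # mostly act greedily, sometimes explore at random
        a = random.choice(ACTIONS) if random.random() < eps \
            else max(ACTIONS, key=lambda x: Q[s][x])
        s2, r, done = step(s, a)
        # nudge Q toward: reward + discounted value of the best next action
        Q[s][a] += alpha * (r + gamma * max(Q[s2]) - Q[s][a])
        s = s2

# After training, the greedy action in each interior state is "move right"
print([max(ACTIONS, key=lambda a: Q[s][a]) for s in range(1, N_STATES - 1)])
```

The update line is the whole trick: each action's value is pulled toward the immediate reward plus the discounted value of the best follow-up, so the +100 at the exit propagates backward through the states over many episodes.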
It's actually probably my favorite field of machine learning. Unfortunately, we won't be covering it in our main course. We've offered electives around reinforcement learning in the past, so stay tuned: maybe when we get to the end of this program we'll offer an elective on it, and if enough people sign up, we'll run it. But we don't cover reinforcement learning as part of our main topics; it's a more advanced topic than what we'll cover. I really enjoy it, though.
I find it very fascinating. Okay. All of this illustrates what I was saying: the thing interacting with the environment, like the robot, the car, or the human moving a chess piece, is known as the agent. It interacts with the environment by taking actions, which updates the state of the environment. That's why you see the word state here; it gets updated every time you take an action. Ultimately, what reinforcement learning is trying to do is learn the best action.
What would be the best action to take? The best action is the one that leads to the most long-term reward. You have to learn what leads to a good reward by experiencing this over and over through trial and error. So there's a lot of simulation, letting the robot try something many times, in order to learn what's rewarding and what's not. Again, a good analogy is children:
you have to let them try things until they learn on their own what they can and can't do, and what the best actions are. Reinforcement learning has made its way into other places, too. I said good examples are self-driving cars and robotics, where a lot of reinforcement learning is used. One place it's found its way into recently is recommendation systems, which have merged with reinforcement learning, because you can imagine there's a built-in reward in you clicking on a video and watching it.
That reinforces the recommendation, and then you can recommend a similar thing and see if that's rewarding, whether it generates a click, some view time, some watch time. So reinforcement learning has found its way into a lot of areas, recommendations being one of them, because it's a natural fit for the question of what to recommend next to generate the most reward. Here the reward is correlated with whether they clicked, or how long they watched: the longer, the more rewarding.
Other places reinforcement learning has been used: self-driving cars, and games. One of the most famous examples, if you want to look it up, is AlphaGo in 2016. AlphaGo was a reinforcement-learning bot that beat some of the world's best Go players. Go, if you're unfamiliar, is a board game that's a bit more complex than chess: it has a larger board and more pieces.
There was a reinforcement-learning-powered bot that learned to play the game so effectively it could beat world masters at it, which was pretty amazing. That was AlphaGo, by Google's DeepMind, in 2016, only about ten years ago, not that long. We also mentioned recommendations, and even autocorrect: learning to predict the best correction, where you accepting the correction is a reward and rejecting it is a penalty.
Reinforcement learning has been adapted to these kinds of problems very successfully. Now let's take a look at the packages we'll use throughout. Of course, we'll rely on these three, which we've already used a lot: NumPy for numerical manipulations and calculations, and Matplotlib, and maybe Seaborn as well, for plotting. Pandas is a big one, because that's where all of our data will be manipulated and prepped before it goes into modeling.
All of that stuff we learned in pandas is definitely going to be applied in this course as we actually build models. These familiar packages are still going to be useful in the modeling stage, though mainly for different reasons: mostly to get our data prepared for modeling, or to visualize it beforehand to get a sense of what it looks like. In unsupervised learning we'll also use SciPy a little, for example to help with dimensionality reduction.
So SciPy will be used here and there. We've seen it before with hypothesis testing: the t-test and z-test came from there, and some of the unsupervised learning material will come out of it too. But the package we'll use by far the most in this course is scikit-learn. We've already seen a little of scikit-learn's preprocessing capability: we used the MinMaxScaler and StandardScaler from its preprocessing module. But it has many different models built in that we can use for training and predictions.
It's an incredibly useful machine learning library, the industry-standard machine learning library. If you're going to do anything in machine learning, it would be expected that you know how to use scikit-learn. What's really lucky is that scikit-learn is an easy package to get used to: nearly everything we do in it follows the same pattern, so the code will be extremely simple. They did a great job of making the package user-friendly, and we're going to get a lot of practice with it as we go along.
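That common pattern is worth previewing now: almost every scikit-learn model is constructed, fit on training data, then used to predict and score. A minimal sketch with a built-in dataset:

```python
# The scikit-learn pattern nearly every model follows:
#   1. construct the estimator, 2. fit(X, y), 3. predict(X_new), 4. score/evaluate
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

model = LogisticRegression(max_iter=1000)  # 1. construct
model.fit(X_train, y_train)                # 2. train
preds = model.predict(X_test)              # 3. predict
print(model.score(X_test, y_test))         # 4. evaluate (accuracy here)
```

Swap `LogisticRegression` for nearly any other scikit-learn estimator and the same four lines still work; that uniformity is what makes the library so easy to learn.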
Every model we build will essentially be from scikit-learn, and not only the models: the training, the predictions, and the evaluation will all come from different scikit-learn modules. So we'll get good exposure to that package throughout the course. If nothing else, we'll come away from this course as scikit-learn experts, which will be very nice. It'll be the new package for us, but we'll get a lot of practice with it. All right, so just to recap that lesson before we move on to lesson three.
We talked about machine learning as learning from data, which sits under the AI umbrella; deep learning sits under machine learning, because it's still learning from data, but learning using neural networks. We talked about the four different types of machine learning: supervised, unsupervised, semi-supervised, and reinforcement. And then we talked about some of the Python packages we'll use, the main one being scikit-learn, along with our older ones like pandas to manipulate our data and pass it into model training.
But scikit-learn will be our go-to for anything machine learning. All right, I have some check questions for you. Let me know in the chat: which of the following best describes machine learning? Very good. I see a lot of choices for A, and A is the correct choice. Machine learning is definitely a subset of AI, underneath that AI umbrella, where we learn from experience, and that experience is recorded in the data, without being explicitly programmed.
It's the exact opposite of B and C: we're definitely not learning from hand-written rules, and it's definitely not just used for image and speech recognition; it can be used for many things beyond those. So A is the best choice there. Next: which example illustrates the use of machine learning to enhance customer experience in an e-commerce company? In other words, what would be some typical use cases of machine learning? Good. I think C is going to be the best answer here.
Definitely C: using machine learning to detect fraudulent transactions. That would be a prediction, probably supervised learning: fraud or not fraud. And then maybe some customer-behavior analysis, which might be unsupervised, grouping customers together, clustering them based on data like their shopping behavior and characteristics. Either way, it's machine learning. Okay, final one: what distinguishes deep learning from machine learning and artificial intelligence? What's unique about deep learning? Very good. Yep, deep learning uses neural networks, so you're right on top of that.
Neural networks are what make deep learning unique. Machine learning would be part A: machine learning is focused on learning from data, and underneath that is learning from data using neural networks, which is deep learning. Very good. All right, let's go to lesson three. Lesson 3 has two notebooks, and we're starting with 3.1, so you'll want to open that notebook. I'm going over to it now; I'll give you a moment to open it up.
We'll see if we can get into the second notebook today; we probably will. But we're going to start with the 3.1 notebook, which should be in your materials for this course. Let me give you a moment to open it. Do you have it? All right. We're going to start by talking about supervised learning in our machine learning journey. Remember, we'll talk about unsupervised after we do supervised. There's going to be a lot to cover with supervised, mainly because there are two different types of problems we can tackle, predicting different kinds of values, which we'll talk about in a moment.
Let's talk about what we're hoping to learn here: the different kinds of problems we'll study, which are the two categories of supervised learning. Those two categories are called classification and regression. We'll talk about them and their differences, then some applications and example algorithms, and that's just within this notebook. In 3.2 we'll get into regression in particular, which will be very interesting. Our first models will be built over there in 3.2.
Okay. So, if you remember, supervised learning is where we learn from labeled data. We have inputs and outputs in our dataset, and you train a model on data that includes input features and corresponding outputs, which are the labels. The goal is to learn a relationship between the input and the output; that's what any model is trying to do. What this allows us to do is then take that model and use it to make predictions on never-before-seen data.
So we get a predictive model that we can use going forward on new examples. Remember, in our data we'll have a bunch of features, which are columns, and generally one of those columns will be the label we're trying to predict. Our model will try to learn some relationship between those inputs and the output label. The output label could be fraud/not fraud, cancer/not cancer, a price, a temperature, those kinds of things. So let's talk about that.
Inside supervised learning there are two different types of learning we can do, and they're based on the label, sometimes known as the target, that we're trying to predict. Depending on its type, we get two different categories of learning. One is known as regression: that's when we're predicting something continuous, a numerical value. Think of price, think of temperature, think of revenue. We're trying to predict something like that.
Versus something categorical: predicting something categorical would be like fraud/not fraud, spam/not spam. Those are discrete categories, and the problem of predicting categories is known as classification, because we're trying to classify examples as belonging to one category or another. So we have two main types of supervised learning problems: regression and classification. They're going to be handled slightly differently, for many reasons we'll uncover. One of the primary reasons is, of course, that we're predicting something continuous in the regression case versus something discrete.
The models have to be slightly different to account for that, and a step beyond that, the evaluation has to be different too. I alluded to this last week: when you're doing regression, it's very difficult to get the exact numerical answer. Generally we don't care about getting it exactly right; we just care about getting close enough. Whereas with classification, we do care about getting it exactly right, because it's a discrete category.
So we'll evaluate classification by asking: did we get the answer right or wrong? And regression by asking: did we get close? We assume it's going to be nearly impossible to predict an exact continuous number; that's very hard to do. Any questions on those two differences? Let me give you some examples; maybe that will help too. Again, classification is predicting something categorical; regression is predicting something continuous.
So think about trying to predict the price of a house based on those other features we talked about before like square footage, bedrooms, bathrooms, all of those things we predict the price. That would be a regression problem because the price is a continuous value. Let's take a look at an example here. Um, imagine we were trying to uh predict the temperature tomorrow. That's going to be a regression problem, a a supervised learning kind of regression problem because we're trying to predict a numerical temperature. Okay? And versus a category like a discrete category would be this would be a classification.
So this is a regression on the left. This is a classification on the right classification um because we are um predicting one of two categories. Is it just hot or cold? Now, we're not saying exactly where that threshold is on what's hot or cold. That would be a decision on on what we want to what our discrete categories actually mean. But, um we only have two choices, hot or cold. versus predicting the entire temperature which would be um a numerical prediction of some exact number. Right? So that'd be a regression and then on the right would be a classification.
Now again, why is this so different? You can see the types of predictions we're making are completely different: one's a number, one's a category. And with evaluation, if the true answer in our labels was 84 and we predicted 83, that's a pretty good result; it's still pretty close, so from an evaluation perspective it's pretty good. Whereas if I predicted cold and it's actually hot, that's a wrong answer. So they're evaluated slightly differently, and that's something we're going to see as we talk about evaluating our models once we build them: depending on whether it's classification or regression, there are going to be different ways of evaluating them.
You can see why it's very difficult to say we got exactly 84 when the prediction could be any number. Our model is going to be predicting a number, and it's really hard to pin down an exact floating-point number. So the best we can do is ask: how close did I get? If my prediction lands all the way down here, that's a long distance from the truth, and that's a bad prediction. But if I get something really close, that's a decent prediction.
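The two evaluation styles above can be sketched with scikit-learn metrics. This is a hypothetical illustration with made-up numbers, not from the course notebooks: regression is scored by average closeness, classification by the fraction of exactly right answers.

```python
# Toy illustration: regression scored by closeness, classification by exactness.
from sklearn.metrics import accuracy_score, mean_absolute_error

# Regression: true temperature 84, predicted 83 is a good (close) result.
y_true_reg = [84.0, 70.0, 65.0]
y_pred_reg = [83.0, 72.0, 60.0]
mae = mean_absolute_error(y_true_reg, y_pred_reg)  # average distance from truth
print(mae)

# Classification: predicting "cold" when the label is "hot" is simply wrong.
y_true_cls = ["hot", "hot", "cold"]
y_pred_cls = ["hot", "cold", "cold"]
acc = accuracy_score(y_true_cls, y_pred_cls)  # fraction exactly right
print(acc)
```

The regression score stays respectable even though no prediction was exactly right, while the classification score only credits exact matches.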
Of course, being perfect would be getting it exactly right, but that would be nearly impossible to do. All right, any questions on this? Does regression versus classification make sense? We're going to use those words quite a bit as we go along: regression predicts a continuous value, classification predicts a category. And there are going to be different models used for regression versus different models used for classification. All right, let's talk about supervised learning applications. Just to name a few, we have HR operations: imagine a recruiter tasked with finding the best candidates.
Supervised learning can help by recommending candidates to accept or reject, and this is something that happens quite a bit even today. It's kind of like how recommendations happen: this resume should be recommended, this one should not, out of a whole pool of applications. Those use cases of predicting a category, like should we accept or reject the candidate, would be a classification. Finance is another one; you see this all the time with things like risk and loan approvals. You can predict the category of whether we should accept or reject the loan application.
That would be a classification. What's interesting about classifications, by the way: it says here that we can predict the likelihood of a loan being repaid, and while we say classifications predict a category, under the hood a lot of them actually predict a probability, and we turn that probability into a category. So we could say the likelihood of the loan being repaid is very low, say less than 50% probability; then we could label this as reject.
If it's greater than 50%, then we could label it as accept. So we can set a threshold there: truly, our model spits out a probability, but we turn that into a category by saying we should reject if it's less than 50% and accept if it's greater. That's something we will see with some of our classification models: they actually produce a probability, and we turn that probability into a category label by doing something simple like putting a threshold on it.
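The thresholding step just described can be written in a few lines. This is a sketch with a made-up helper name and made-up probabilities, just to make the accept/reject logic concrete:

```python
# Hypothetical helper: turn a predicted repayment probability into a label.
def label_from_probability(p_repaid, threshold=0.5):
    """Accept the loan application if the predicted probability of repayment
    clears the threshold; otherwise reject it."""
    return "accept" if p_repaid >= threshold else "reject"

print(label_from_probability(0.83))  # high repayment probability -> accept
print(label_from_probability(0.31))  # low repayment probability -> reject
```

The threshold itself is a business decision; 50% is just the default starting point, and in practice you might move it depending on how costly false accepts are.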
So finance uses this all over the place: not only loans, but also fraud, which we talked about, fraud or not fraud. Then predicting sales revenue would be a regression, right? What is the revenue going to be in the next two quarters? That's a regression problem. Emails, like spam or not spam, is going to be a classification: it takes the text input and predicts whether the email is spam or not spam. That's a supervised learning problem, but it's a classification problem.
In manufacturing, supervised learning is used to inspect quality and classify products into different grades. For example, a factory might use a model to check for defects. This actually happens: you look at images of products as they go through the assembly line and predict whether each one is high, medium, or low quality. This is a classification, since products are going into different categories of quality, much like a manual inspection by a QA or quality control specialist.
Okay, but that's a classification. In the maritime industry, supervised learning can be used to predict current levels, and that can be used to forecast supply and demand. Those would be regression models, used to predict something like temperature, but in this case tidal levels. We talked about fraud already, so that's there. Okay, any questions on these examples? Of course, there are many more. Recommendation is kind of like a supervised learning problem, where you take examples of things people have viewed or reviewed in the past and use that to predict what they would want to watch in the future.
So recommendation is supervised learning, and it's like a classification: trying to predict a certain number of categories of shows or movies that you would want to watch. That's something we will study in the future; we'll have a whole lesson dedicated to recommendation as well. All right. When it comes down to the actual models themselves, there are lots of different models that we are going to cover, and they differ in their purpose and in what kinds of problems they're used for.
How they actually train is going to be different too. But at a high level, they're all trying to do the same thing, which is learn some sort of relationship between the input data and the label. That's really what they're trying to do, because they're all supervised: they have those labels and are trying to build some relationship there. They just do it differently. And what we're going to study is the pros and cons of a lot of these models, like when I would use one versus another.
We'll try to talk about that as we go along. But they're all trying to learn some relationship between the input features and the output, so we have to keep that in mind. They're trying to model that relationship; they just do it in different ways. As we go along and learn about new models, we will learn the details, the ins and outs, and those pros and cons, but no matter what, they're all trying to learn that relationship and be able to make predictions on new data.
So here's a list of models that we will cover and work on throughout the sessions that we have. We're not going to do them all in one sitting, but the first one that we're going to start with, and that we'll cover today, is linear regression. Then we'll cover the rest of these mostly in the context of classification. What's interesting is that some of these models can actually be used for both regression and classification, as long as you make certain adjustments to them.
They have variations that can be used to do both classification and regression, which is very interesting. But we're going to start with linear regression today and then work our way through the rest of these models; we're going to do a separate lesson four on classification, so all of those will come from lesson four. And we will do linear regression in lesson three, in the 3.2 notebook. Yes, logistic regression is a classification, which is kind of strange, since its name says regression but it's doing classification.
The reason is that logistic regression computes a probability. It does a regression to predict a number, but that number is actually a probability, so it produces a result between zero and one. Then we take that probability and turn it into a category, like spam or not spam, fraud or not fraud. So logistic regression is kind of special: it's sort of like a regression, but it's predicting a very specific type of value, which is a probability.
For that reason, it's primarily a classification algorithm, and we'll study it in lesson four. That's why it falls under the umbrella of classification: it produces a probability as its main output, which we can then turn into a category as long as we interpret that probability the right way, like the probability of spam versus not spam. Okay. So let me focus on linear regression. I'm not going to go through all of these other use cases, because we haven't learned those models yet.
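The probability-then-threshold behavior of logistic regression can be seen directly in scikit-learn. The tiny one-feature dataset below is made up purely for illustration; 0 and 1 stand in for two categories such as not spam and spam:

```python
# Minimal sketch: logistic regression predicts a probability, and a 50%
# threshold turns that probability into a class label.
import numpy as np
from sklearn.linear_model import LogisticRegression

X = np.array([[1.0], [2.0], [3.0], [8.0], [9.0], [10.0]])
y = np.array([0, 0, 0, 1, 1, 1])  # e.g. 0 = not spam, 1 = spam

clf = LogisticRegression().fit(X, y)
proba = clf.predict_proba([[9.5]])[0, 1]  # probability of class 1
label = int(proba >= 0.5)                 # threshold the probability
print(proba, label)
```

`predict_proba` exposes the underlying probability; calling `predict` instead would apply the 0.5 threshold for you.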
I don't think it's good to read about them until we've covered them. Once we cover them in lesson four, I'll come back and describe these examples, and we'll see why they make sense. But for linear regression, which is what we'll cover next, let me talk about that example. A prototypical example would be predicting house prices, like in the house price dataset we've seen. If we wanted to estimate the market value of a house, the price, we could do that by using features such as the number of bedrooms, square footage, location, and age of the property.
Then when a new house comes on the market, we could estimate what the price should be based on those features. So linear regression is a good model for predicting a price, like a housing price, and we'll actually practice that in the next notebook. There are descriptions of these other models too, but again, we haven't covered them yet, so I don't want to go through those until we get to those models. When we get there, I'll come back and mention the examples.
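The house-price use case just described can be sketched end to end. The numbers below are fabricated for illustration and are not the course's house price dataset:

```python
# Toy sketch: estimate a house price from square footage and bedroom count.
import numpy as np
from sklearn.linear_model import LinearRegression

# columns: [square_feet, bedrooms]; prices are made up
X = np.array([[1000, 2], [1500, 3], [2000, 3], [2500, 4]])
y = np.array([200_000, 280_000, 350_000, 430_000])

model = LinearRegression().fit(X, y)

# A new listing comes on the market: 1800 sq ft, 3 bedrooms.
estimate = model.predict([[1800, 3]])[0]
print(round(estimate))
```

The model learns one weight per feature plus an intercept, so the estimate for the new listing falls sensibly between the prices of the comparable training houses.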
Can KNN be used for clustering? No. The clustering model is going to be different: that's k-means, the primary clustering model, not k-nearest neighbors. K-nearest neighbors can be used for regression, and it can be used for classification, and we'll talk about KNN, k-nearest neighbors, in lesson four. Yes, it sounds really similar, but k-means is a clustering algorithm, which is different: there are no labels used at all.
K-nearest neighbors is a supervised learning algorithm that uses labels. Good. Any other questions so far? That being said, let's move on to the 3.2 notebook, which will be our first discussion around regression: supervised learning and regression. I'll give you a moment to pull up this notebook, but yes, you want to pull up 3.2. We'll do this one next. Our plan is to do regression first, and then we'll talk about classification in lesson four, where we will cover all those other models on that list which you could use for classification.
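The k-means versus k-nearest-neighbors distinction can be made concrete in code. This is a hedged sketch with toy one-dimensional data: the key difference is that `KMeans.fit` receives only `X`, while `KNeighborsClassifier.fit` also requires labels `y`.

```python
# KMeans clusters unlabeled data; KNeighborsClassifier needs labels.
import numpy as np
from sklearn.cluster import KMeans
from sklearn.neighbors import KNeighborsClassifier

X = np.array([[0.0], [0.2], [0.1], [5.0], [5.2], [5.1]])

# Unsupervised: no labels are passed to fit.
km = KMeans(n_clusters=2, n_init=10, random_state=0).fit(X)
print(km.labels_)  # cluster assignments discovered from X alone

# Supervised: labels are required.
y = np.array([0, 0, 0, 1, 1, 1])
knn = KNeighborsClassifier(n_neighbors=3).fit(X, y)
print(knn.predict([[4.9]]))  # nearest neighbors are all labeled 1
```

Note that k-means invents its own cluster numbering, while KNN predicts the label names you supplied.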
But first we're going to talk about linear regression. All right, we have a big agenda; this is a big notebook with a lot of material surrounding regression. We're going to start with linear regression and see how we actually perform it and what that model is doing, which we've already seen the idea of a little bit, so it should be somewhat familiar. Then we'll talk about how to adapt that linear regression idea to what's called nonlinear regression, which uses things like polynomial features.
We'll talk about how to do that. Then a big topic for us is going to be evaluating the model. Actually building the model will be really easy, but evaluating and interpreting it will be a lot of interesting work, because once we have the model built, we want to know its performance. How good a model is it? Is it worth using, or do we need to retrain it, get new data, or change the model? We're going to talk about how you determine what to do based on that performance. And then there are a couple more things.
We may not get to this today, but regularization is used to boost performance in certain situations, namely when the model performs poorly against test data even though it performs pretty well on training data. In that scenario, you can use offshoots of linear regression that do what's called regularization. We'll talk about that. Then we'll talk about hyperparameter tuning generally as a strategy, which is something you generally do want to do when you're training machine learning models. Again, we may not get to those two today, but there's quite a lot to get to before that, mainly centered around evaluation and building linear regression.
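As a hedged preview of the regularization just mentioned (toy data; ridge is one member of the lasso/ridge/elastic-net family the course covers): scikit-learn's `Ridge` is linear regression with a penalty on large weights, which shrinks the fitted slope compared with plain least squares.

```python
# Ridge = linear regression plus a penalty that shrinks the coefficients.
import numpy as np
from sklearn.linear_model import Ridge

# Points lying near the line y = 2x + 1 (made-up numbers).
X = np.array([[1.0], [2.0], [3.0], [4.0]])
y = np.array([3.0, 5.1, 6.9, 9.0])

ridge = Ridge(alpha=1.0).fit(X, y)  # alpha controls the penalty strength
print(ridge.coef_[0], ridge.intercept_)
```

With `alpha=0` this reduces to ordinary least squares; increasing `alpha` trades a little training accuracy for smaller weights, which is the bias-variance trade-off in action.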
So, pretty cool, we'll get to our first model here, this linear regression, to start with. Okay, so let's start with linear regression. Really what linear regression is attempting to do, and I want to show you this in this picture, is draw this line, sometimes known as the line of best fit. This is our model that goes through the data, and it's generally a good predictor, because let's say you give me a new feature that's right here.
You say, okay, I have a feature with this value on the x-axis. Then all I have to do is plug that into my line equation, and I will generate a value that's right here, on the line at that input. That's going to be my prediction for what the output variable should be: just something on that line. And you can see this line is a decent estimate for this data, because it slices through it pretty evenly. So it's a good guess as to what the output should be given any one of these inputs.
This line is a good estimator. And so our goal in building a linear regression is to build the equation of this line; that equation is going to be our model. Yes, it's going to look just like mx plus b, or mx plus c. It's going to look exactly like that, except that it's going to be more than just mx, because we generally have more than one feature. So if you think of x as a feature, it will be more than just mx.
It will generally look like a sum of weighted features, plus maybe a bias term, an intercept. So yes, you're exactly right: mx plus b is the right idea. Can it be nonlinear? It can be adapted to nonlinear, yes. If we transform all of our features in a nonlinear way, which we're going to talk about, we can apply linear regression to the result, and that would be a nonlinear regression. So yes, we can do nonlinear things too. We'll talk about that.
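Written out, the multi-feature version of mx plus b takes the standard form below (a general formulation, not copied from the slides), with one learned weight per feature plus an intercept:

```latex
% Linear model with n features: each feature x_i gets a learned weight w_i,
% and b is the intercept (bias) term.
\hat{y} = w_1 x_1 + w_2 x_2 + \cdots + w_n x_n + b
```

With a single feature this collapses back to the familiar \(\hat{y} = w_1 x_1 + b\), which is exactly mx plus b.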
Okay. So linear regression, again, is the exact science of finding the equation of the line that fits through this data. Now, one thing you should be thinking about is why this line is a good predictor, and the argument is this: if you look at the distance from these blue points, which are our actual data points, the line is found such that it minimizes the distance from the points to the line.
We want this distance to be at a minimum. It would be bad to draw a line all the way out here, because then there's a lot of distance, and that would be a lot of error contributed from not being able to predict the points in our training data very well. That's why we have labels: they guide us in building this line. So our goal is to build the line so that this error, this distance, is as small as possible.
Right, meaning all of these distances from the points to the line: we want those to be as small as possible. So our goal is to find the equation of the line such that our error is minimal. And what is the error? The error is the distance of our data points to the line that we build. Essentially, what we'll do to train this is adjust the parameters, and I think it's really good that you brought up mx plus c.
Basically, the m and the c will adjust; we adjust them accordingly to make the distance as small as possible, to minimize it as much as possible. So where is regression used? We've already seen some examples; here are some more. Advertising, like predicting sales. Oil production and demand, which are forecasts, and those are regression problems. Retail, like demand forecasting for inventory. Healthcare, such as predicting levels of certain blood markers. Real estate, predicting prices based on things we've talked about, like square footage, bedrooms, and bathrooms.
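The adjust-m-and-c idea above is exactly what `LinearRegression` does when you call `fit`. Here is a minimal sketch with made-up points lying near the line y = 2x + 1; the learned slope and intercept come back close to those true values:

```python
# Fitting y ~ m*x + b with scikit-learn: fit() finds the best m and b.
import numpy as np
from sklearn.linear_model import LinearRegression

# Points that lie (almost) on y = 2x + 1.
X = np.array([[1.0], [2.0], [3.0], [4.0]])
y = np.array([3.0, 5.1, 6.9, 9.0])

model = LinearRegression().fit(X, y)
print(model.coef_[0], model.intercept_)  # learned slope m and intercept b

# Predicting just evaluates the line at a new input.
print(model.predict([[5.0]]))
```

After fitting, `coef_` holds the slope(s) and `intercept_` holds the bias, and `predict` simply plugs new inputs into the learned line.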
So regression is used whenever we want to predict a numerical output. And the kind of regression we're talking about here, when that equation is linear, is known as linear regression. Go back to that picture: when the equation of the line we find is a linear equation, it is exactly the form I've been telling you. It's something like weight times feature plus weight times feature, and then maybe some intercept term, some bias term.
This is a linear equation because all of the features are raised to the first power, and it's a linear combination of features with those different weights. So this is a linear model because it's what in math we would call a linear equation: everything is to the first power, and it resembles mx plus b. When we talk about linear regression, that is a regression model, predicting some continuous target, which assumes our model is formed from this kind of equation, a linear equation.
So this is going to be our model for a linear regression. When you train a linear regression, your goal is to learn these weights so that you can plug in any one of your input features and generate a prediction, which is going to be something on that line. It's going to be a value sitting on the line: we put in all of our features and we end up somewhere on the line, at this output. Okay, let me pause there.
Any questions on the linear model here, or on why it's called linear regression? Okay. By the way, in these notes, the bullet point that says it uses the least squares criterion to estimate the coefficients is exactly what I said earlier about the distance. The distance is based on the square of this quantity: how far away you are from the line is measured as this squared distance, here and here and here. So what we're trying to do is find the least squares, which is that minimum total squared distance.
That's how we find all of these weights: we tune them using our labels (here's our label, which is the y), plugging in our data and adjusting the weights enough to minimize the error. It's an optimization problem: we're trying to find the minimum of this quantity, which gives us that best-fit line. So we have linear regression, and we…
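The least-squares criterion itself can be written down directly. This sketch (same made-up points as before) computes the sum of squared vertical distances for a candidate line, the quantity the optimization minimizes:

```python
# The training objective: sum of squared vertical distances to the line.
import numpy as np

x = np.array([1.0, 2.0, 3.0, 4.0])
y = np.array([3.0, 5.1, 6.9, 9.0])

def sum_squared_error(m, b):
    residuals = y - (m * x + b)  # vertical distance from each point to the line
    return float(np.sum(residuals ** 2))

# A near-optimal line gives a small error; a line far from the data does not.
print(sum_squared_error(1.98, 1.05))  # close to the least-squares fit
print(sum_squared_error(0.0, 10.0))   # a deliberately bad line
```

Training a linear regression amounts to searching over m and b for the pair that makes this quantity as small as possible, which least squares can solve exactly.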
Transcript truncated.