Applied Data Science With Python Full Course 2026 [Free] | Python For Data Science | Simplilearn

Simplilearn | 07:36:59 | Jun 10, 2026
Chapters: 14
An overview of how data science underpins modern decisions and how Python, with libraries like NumPy, Pandas, and Matplotlib, is used for data handling, analysis, visualization, and basic machine learning. The chapter also emphasizes real-world application projects and the value of completing the course for practical data skills.

A practical, beginner-friendly tour of Python for data science: NumPy, Pandas, visualization, and intro machine learning with hands-on notebooks and real-world datasets.

Summary

Simplilearn’s Applied Data Science With Python course distills the essentials a data scientist needs to start solving real problems. The instructor walks through core libraries: NumPy for numerical arrays, Pandas for 1D Series and 2D DataFrames, and visualization tools such as Matplotlib, Seaborn, Plotly, and the plotting flow in Pandas. You’ll see how to build and manipulate arrays, shape and reshape data, perform arithmetic and statistical operations, and handle missing values. The course also covers DataFrames, indexing, filtering with boolean masks, and the “describe” and “info” summaries for quickly understanding datasets. A substantial portion is devoted to Pandas workflows: reading CSV/Excel files, inspecting with head/tail, and performing common transformations (sorting, dropping columns, handling time series, and basic grouping). The transcript emphasizes practical patterns (vectorized operations over explicit loops, one-hot encoding with get_dummies, and label encoding concepts) and ends with a teaser of modeling-ready steps and how these tools set the stage for Scikit-Learn, while previewing future lessons on advanced plotting and data preparation. The course is designed to give you a strong, applied grounding so you can turn raw data into actionable insights and ready-to-model features. Reading material and notebooks are highlighted as you proceed, with a nudge to explore hands-on projects like sales analysis and data cleaning tasks that mirror real-world workflows.

Key Takeaways

  • NumPy arrays (np.array) are the foundation for fast numeric work; they support multidimensional shapes and built-in statistics like mean, median, and standard deviation.
  • Pandas Series (1D) and DataFrames (2D) layer on top of NumPy, enabling labeled indices, flexible data manipulation, and easy column/row selection.
  • Get familiar with head/tail, info, describe, and shape in Pandas to quickly understand data structure, summary statistics, and data types.
  • One-hot encoding with pd.get_dummies transforms categorical features into numeric columns, enabling machine-learning models to ingest non-numeric data.
  • Vectorized operations (e.g., df['col'] * 2) are faster and more scalable than explicit Python loops; apply() is convenient for custom logic but is generally slower than a true vectorized operation.
  • Resampling and time-series handling (pd.date_range, .resample) let you aggregate data to daily, weekly, monthly, etc., which is critical for summaries and modeling.
  • Plotting with Pandas (df.plot) provides quick visual checks, while more advanced plotting is reserved for libraries like Matplotlib, Seaborn, and Plotly for richer visual storytelling.
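
A few of the takeaways above can be tried in one short sketch. The data and column names here are made up for illustration (they are not from the course notebooks), and the snippet assumes pandas is installed:

```python
import pandas as pd

# Hypothetical toy data; column names are illustrative only.
df = pd.DataFrame({
    "region": ["north", "south", "north", "east"],
    "units": [10, 4, 7, 3],
    "price": [2.5, 4.0, 2.5, 6.0],
})

# Vectorized arithmetic: works on whole columns, no explicit Python loop.
df["revenue"] = df["units"] * df["price"]

# Boolean-mask filtering: keep only rows where the condition is True.
big_orders = df[df["revenue"] > 20]

# One-hot encoding: the categorical column becomes one column per category.
encoded = pd.get_dummies(df, columns=["region"])

# Time-series resampling: aggregate six daily values into two-day sums.
daily = pd.Series(range(6), index=pd.date_range("2024-01-01", periods=6, freq="D"))
two_day = daily.resample("2D").sum()

print(df.shape)
print(big_orders)
print(encoded.columns.tolist())
print(two_day.tolist())
```

The same calls scale to real datasets loaded with pd.read_csv; only the toy DataFrame would change.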

Who Is This For?

Essential viewing for beginners and practitioners who want an actionable, hands-on foundation in data science with Python. It’s especially useful for those moving from spreadsheets to Python for data cleaning, feature engineering, and basic visualization before diving into machine learning with Scikit-Learn.

Notable Quotes

“Data science is working behind the scenes.”
Opening framing of why Python and data science matter in real-world decisions.
“NumPy is the first place we’re going to start just because it’s so important for working with data.”
Intro to NumPy as foundation for array-based computing.
“A column is a Series, and every column in a DataFrame is a Series.”
Fundamental Pandas concept linking Series to DataFrame structure.
“Get dummies does the one-hot encoding for us.”
Practical technique to convert categories into model-ready numeric features.
“Vectorized operations are faster than looping through every row.”
Performance best practices when transforming data in Pandas.

Questions This Video Answers

  • how do I start with NumPy and Pandas for data analysis in Python?
  • what is one-hot encoding and when should I use get_dummies in pandas?
  • how can I read CSV and Excel files into a Pandas DataFrame?
  • what’s the difference between a Pandas Series and a DataFrame?
  • how do I perform time-series resampling in pandas for daily or monthly aggregations?
Python for Data Science, NumPy, Pandas, Matplotlib, Seaborn, Plotly, Scikit-Learn, DataFrame, Series, One-Hot Encoding (Get Dummies), Label Encoding, Time Series, Resampling, Data Cleaning, Data Visualization
Full Transcript
Hey everyone, welcome to this Applied Data Science with Python course by Simplilearn. Today, data is behind almost every smart decision we see around us. When an app recommends what you might like, when a company studies sales trends, when a bank detects unusual transactions, or when a business uses numbers to plan its next move, data science is working behind the scenes. And one of the best ways to start learning data science is with Python. In this course, we will understand how Python is used in real-world data science and why it has become one of the most popular languages for data analysis, visualization, and machine learning. We will begin with the important Python libraries used by data professionals, including NumPy, Pandas, Matplotlib, Seaborn, and scikit-learn. Then we'll move into working with real-world data, where you will learn how to create arrays, use DataFrames, clean the data, handle missing values, analyze patterns, and prepare data for meaningful insight. After that, we'll explore data visualization, where you will learn how to convert raw numbers into clear charts and graphs that are easier to understand and present. We'll also cover key foundations like statistics, linear algebra, categorical data handling, and basic machine learning concepts, so you can understand how data is used to make predictions and support better business decisions. By the end of this course, you'll have a strong practical understanding of how Python is used in applied data science, from data handling and data analysis to visualization and machine learning basics. So let's get started and learn how to use Python to turn data into useful insights. Also, if you are ready to take your skills in Python and data science to the next level, check out Simplilearn's Data Science with Python course. This course is perfect for everyone who wants to master Python programming for data science.
You'll learn how to work with key Python libraries like NumPy, Pandas, and Matplotlib for data wrangling and analytics. You'll dive deep into data visualization, feature engineering, and statistical methods that are crucial in the field of data science. Plus, you'll get hands-on experience through real-world projects like sales analysis, marketing campaign analysis, and more. Upon completing the course, you will earn a course completion certificate from Simplilearn, which can boost your career and showcase your skills to employers. Check the description below for the link and start your data science journey today with Simplilearn. Before we get started, here's a quick beginner-friendly quiz for you. Which Python library is mainly used for data manipulation and working with DataFrames? Is it Matplotlib, Pandas, Seaborn, or scikit-learn? Comment your answers below. You know, we spent some time on it already, but just to reiterate, Python will be our friend here when we're doing data science. It's going to be the preferred programming language for anything data science, and that's true in the industry. Python is widely used mainly because it has so many great packages to help us work with data, namely NumPy and Pandas, which are the first two we will look at. Then it has many others to help us build models, like scikit-learn, which we'll get familiar with, and it has others to do visualization that we'll study. So it basically has packages that do most of the tasks that we're interested in doing. That's why we'll stick with Python. It's really great for data science. So we've talked about this before. Why people prefer to use Python is because it's open source and interpreted, and it has so many great packages that are oriented toward data science and can help us do data science really easily.
A lot of people used to use R to do data science, but there's been a shift over towards Python because of its flexibility. Python can integrate with other systems pretty easily, whereas R is more difficult to use. R is like a scientific analysis language. It's used a lot in statistics, and a lot of statistics people like using R, but data science is almost exclusively done in Python. So you don't see R too often. I've really never seen it. I've only seen Python in the industry. So, no worries about R. Historically, R has been around for a while, but Python is by far and away the most used data science language, for sure. Okay. So, I want to briefly tell you about some of the Python packages that we are going to study in our course and use to do data science. I just want to briefly talk about them, and then of course we're going to have a couple of lessons dedicated to going into those, like NumPy and Pandas and all the visualization libraries. So the first one is NumPy. NumPy is short for Numerical Python. That's why it's called NumPy. It is a Python package for scientific computing using the array structures that NumPy has created. So many things are built off of NumPy arrays and the ability to operate on them. NumPy came around and created these multi-dimensional arrays, which are essentially like matrices, and also a lot of computing tools around the matrices and arrays that so many other packages are built off of. So, we're going to learn Pandas. Pandas is built off of NumPy. So is Matplotlib, which is for plotting, and so many other packages are built off of NumPy.
So it's a really foundational package for working with data, because data will be stored in NumPy arrays. The NumPy array is the foundational data type of NumPy, and so many things work with NumPy arrays. Okay. So we're going to have a whole lesson dedicated to NumPy coming up next. NumPy is going to be the first place we're going to start just because it's so important for working with data. Its multi-dimensional arrays are so useful for storing and manipulating data. So it's pretty important. What is a Fourier transform? It is a transformation of data, basically a signal transformation. You go from basically a time series into a frequency series. It's used in signal processing. Yeah. Analog to digital. Yeah, pretty much analog to digital. Yeah, it's used in signal processing. Okay. So the second package that we will study. We'll start with NumPy today. Right after this lesson, we'll dive right into NumPy and start working with examples of NumPy arrays. But right after that is the library Pandas, which we'll spend a lot of time with. Pandas is a library built off of NumPy. So it depends on NumPy, and it basically comes around and provides more structure for manipulating data. Like I said earlier, if you're familiar with Excel, Pandas has a lot of functionality that mimics what you would do with a spreadsheet. Structured row-column data is what Pandas excels at. So Pandas is going to be a really fundamental package for us to manipulate data that's structured in a row-column matrix format, but it's built off of NumPy. It uses NumPy under the hood to do all the manipulation, but Pandas provides its own data structures to put data into almost a spreadsheet format that we can manipulate. Okay.
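
As a rough illustration of the point that Pandas holds spreadsheet-like row-column data on top of NumPy, here is a minimal sketch. The table contents and column names are invented for the example, and it assumes both pandas and NumPy are installed:

```python
import numpy as np
import pandas as pd

# Hypothetical spreadsheet-like data (names and values are made up).
table = pd.DataFrame({
    "quarter": [1, 2, 3, 4],
    "sales": [120.0, 135.5, 150.0, 160.25],
})

# Each column is a Series; its values can be exposed as a NumPy array.
sales_array = table["sales"].to_numpy()

print(type(sales_array))   # <class 'numpy.ndarray'>
print(table.shape)         # (rows, columns)
print(sales_array.mean())  # NumPy statistics work on the column's values
```

The DataFrame adds the labels (column names and a row index); the numeric work on a numeric column like this one is handled by NumPy arrays underneath.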
So really Pandas is going to be really, really powerful for us to manipulate data, and we'll use it all the time. If anything, coming out of this course you guys will be Pandas experts. I mean, of course you'll learn more than that, but I think you'll come away as really, really good users of Pandas, and NumPy for that matter, but certainly Pandas. So we're going to study NumPy first, and then we'll have a lesson dedicated to Pandas right after NumPy. We'll have a lot more to say about it, but I just wanted to preview that. It's a really important package in the data science ecosystem because it helps us manipulate structured data that's in a row-column format, like a table. Okay. Then another package is the SciPy package, which is short for Scientific Python. It is another open-source library that's built on top of NumPy, so it uses NumPy arrays as its underlying data structures to do the manipulations. SciPy contains a lot of scientific formulas and scientific computing tools that we'll use, especially when we get into hypothesis testing. It contains things like z-tests, t-tests, and distributions, so it's tailored for that. It also has things like the Fourier transform, and it has different linear algebra manipulations as well. So SciPy will be really useful when we get into our hypothesis testing and A/B testing. It has the distributions that we'll need to do our tests, like a Student's t-test or a z-test. Those kinds of things we'll use SciPy for. So it's a really important package that we'll see later when we do hypothesis testing. Another one that is going to be useful from time to time is the statsmodels package. It basically has a lot of statistics-oriented things. It has some basic models in there, like linear regression or logistic regression.
We will generally favor a different package to do those kinds of models, but just calling out that statsmodels does have some useful stuff when it comes to doing statistical testing. There are some chi-square tests or ANOVA tests that we can borrow from statsmodels. So we will use it when we get into hypothesis testing as well. So these last two, SciPy and statsmodels, are two packages that we'll use when we get into A/B and hypothesis testing. Okay, so that brings us to scikit-learn. Now this is going to be our primary package for doing machine learning. This will be the one that we'll build all of our models and machine learning off of when we get into our machine learning course. So we won't really use scikit-learn in this current course, but when we get into machine learning, it will be our go-to package to do all of our machine learning with. It is a fantastic library that's been developed over years to contain all the basic models that we would ever want to build. Scikit-learn is really awesome. It can build models for so many different use cases, and it's a really easy package to use. It has a really nice, really easy interface. We will see that later on when we get into our next course on machine learning. But just calling out that scikit-learn is a very popular data science library. So when we get into our modeling, we will use scikit-learn. When we do our data prep manipulations, we'll be using NumPy and Pandas. Finally, for visualization, we will be using a library called Matplotlib. It is the foundational Python plotting library, and it borrows inspiration from MATLAB. So if you guys have ever used the MATLAB plotting, it's actually very inspired by that, hence the name Matplotlib, from MATLAB. It's going to be our main tool for building graphs.
Okay, so it's a foundational library for building graphs. Almost every other library that does visualizations is built off of this one, built off of Matplotlib. So when we get into our visualization course, we will come back and talk a lot about Matplotlib and practice with Matplotlib quite a bit. Excuse me. What is that course called? Machine learning. The next course is called machine learning. Okay. And then another visualization library that we will lean on heavily is the Seaborn library. This is one that is built on top of Matplotlib. So Matplotlib is kind of like NumPy. It's the foundation, and then a lot of things are built on top of it, Seaborn being one of them. Excuse me. Seaborn being one of them. It provides not only better aesthetics than basic Matplotlib, it also has more scientific plots and more interesting plots than the regular ones you get out of the box with Matplotlib. So it has really interesting histograms, violin plots, heat maps. It can do statistical error bars, like confidence interval bars. So it just builds better plots than basic Matplotlib. It's really easy to use. You can build a lot of plots with it, as we're going to learn, but Seaborn is really nice. It makes things aesthetically pleasing. So we'll also use Seaborn from time to time. It's another plotting library that we'll get some practice with when we get into visualization. And another one of those is Plotly. So we have Seaborn and Plotly, both built off of Matplotlib, in order to do plots. Now Plotly's specialty is building interactive graphs. When you build a Plotly graph, it'll pop up in your web browser, kind of like Jupyter notebooks do, and you can click around in the graph and mark down points. You can zoom in, you can zoom out. Excuse me, I'm just getting over a cold here.
So, don't mind the coughs. But you know, you can zoom in, you can zoom out, you can do a lot of interactions with Plotly. So, if you want to build an interactive graph, Plotly is a good package. Again, it's built off of Matplotlib. So we'll get some practice with Plotly. So these three we're going to practice in visualization. Matplotlib is the basic foundation. Seaborn and Plotly both build off of it. We'll get some practice with all three of those when we get into visualization. The rest of these slides just go through some plots that we will be building later on when we get into our visualization lesson. I just wanted to briefly go through those to show you some of the different types of plots we'll do. The easiest kind is basically a line plot that connects different points. This would be like if we were plotting out something over time, like a stock price, or a sales value over different quarters or weeks, or temperatures over time, something like that. It's a basic kind of plot, and we'll be able to build that no problem. We can even mark different points on those. That'll be really easy to do with Matplotlib or Seaborn or Plotly. Again, we'll show you how to build these with code later on when we get into visualizations, but I'm just showing you the possibilities right now. Scatter plots. We'll do these, which have different points scattered throughout on two axes. This is usually helpful to figure out how the data is maybe clustered together, or to figure out if there are relationships between two variables, like if they tend to trend in the same direction or in the opposite direction, or if they're just distributed all over the place. So, we'll be able to build scatter plots. That'll be helpful. Area plots, which show cumulative areas on top of each other. We'll be able to show that.
That'll be pretty easy to graph, maybe for tracking total sales over successive quarters, showing the contributions of different categories. So we'll be able to do area plots. Basic bar plots we'll be able to do. Again, all of these examples were built using Matplotlib, so we'll be able to do that, but they have equivalent versions in Plotly and Seaborn, since those are built off of Matplotlib. We can even put grids in the background to assist the viewing of it, to give an idea of where the different points are in the grid. So that will be easy to do. Histograms. Now histograms are going to be extremely useful for us. We'll build histograms a lot because they will help us visualize how data is distributed, which is extremely important to know. Is it distributed like in this picture, which is kind of like a bell curve, or is it flat? Does it have two peaks to it? Knowing this distribution will be extremely useful to us. So we will often build a histogram that looks like this. Histograms will be extremely useful. We can build pie graphs, which show different percentages. There may be certain situations where that makes sense. If we're telling a story of our data and it makes sense to use a pie graph, that'll be easy to do. Again, these are all just examples of what's possible. We have to show you how to build these, and we will when we get into the data visualization lesson, which is what this note says at the bottom. Once we get into that lesson, we will show you the code to build these. So just to wrap up this first introductory lesson, we have shown you what data science is, which is the extraction of insight, deriving insight from data. And we have a bunch of different packages that are going to help us do that.
We also have a process which is going to help us do that, which is usually defining a problem, collecting data, doing data preparation, and then doing modeling after that. So those are the basic foundations. At this point, what we're going to do is go into NumPy. We're going to start with NumPy and then go into Pandas after that. So we're going to start studying the packages that are going to help us do some of these different tasks in data science. Any questions at this point? Okay, I'm going to open up our next lesson then. If you guys notice, the next lesson, lesson three, is broken into several different notebooks. So, we're going to be transitioning into notebooks for some of these. Do you guys have those notebooks? The lesson three notebooks. I can try to share them with you guys if you don't, or does anybody have the folder and want to share it for those that don't have it? It should be a collection of several notebooks. Yeah, let me download it. Okay. So, give me a moment. I will upload it. I have it right here. Okay. I just uploaded it. So, you guys should have it. You want to open those notebooks. Again, you can open them wherever you want. You could do it in your own local Jupyter. You could do it in the lab environment. You could do it in Colab. I recommend Colab just because it's so easy to work with, right? So, I recommend opening them up in Colab. That's what I'm going to do. I think it's just so easy to work with, so that's what I'm going to be using. All right. Can everybody see the screen? I'm on the first notebook. The 3.01 notebook is the one we're going to start with, the introduction to NumPy. So if you have a moment, you want to open that one up. Again, if you're working in Colab, you can upload the notebook.
Once you've extracted that folder, you can upload the notebook into Colab using File, Upload notebook. That should work. Or if you have Google Drive, you can just upload that folder into your Google Drive and then launch it through your Google Drive, and it should open in Colab. That works too. Are people able to open the notebook? Yeah. Again, it doesn't matter where you open it, just as long as you can run some of the cells, because we're going to be running them. Good. Good. Good. Okay. All right. So, let's talk about NumPy. Now remember, NumPy is the open-source library that is used for doing math and scientific computing on these arrays. So we are going to take a look at the NumPy array object as the first thing that we'll look at. Now the NumPy array object behaves very similarly to a list. We learned about lists in our previous course, and the NumPy array is very similar to a list. We can slice it like a list. We can access elements like a list. It's ordered like a list. But it's a lot faster to do mathematics with the array, and it comes with a bunch of built-in functions like mean, median, mode, all these special things on the array that we don't get with a list. For instance, Python lists do not have a notion of an average. You can't calculate the average of a list without doing a manual calculation. But a NumPy array has a mean function that comes built in, so we can take the average of an array, or a median or a mode. So, arrays are really advantageous to work with inside of NumPy. So let's take a look at some examples of a NumPy array. In the first cell here I want to point your attention to two things. One is that in order to use NumPy we import it.
Do you see how we import numpy and we do this thing called "as np"? What this is is an alias. We alias numpy as np. We basically shorthand it to np, which is an industry standard. So anytime you're looking at code and you see np-dot-something, that is short for numpy. In the industry, everyone is going to shorthand numpy to np. That's just what people do. So "as" is the way to alias an import so that when we use numpy in our code we don't have to type out the full word numpy. We can just do np. That's why you see np here, and really throughout our code we use np. You see it all over the place. It's a shorthand alias for the numpy package that we're using. So we are importing this package, meaning that we are going to use it in our code, but we are aliasing it to np. This is the industry standard. Most of these packages have a nice alias, like Pandas will have an alias and Matplotlib will have an alias, just to make it shorter. Do you have to import in VS Code? If you're using VS Code to run your notebooks, yes, you have to upload it there. You want to open it in VS Code? Yeah, you're going to have to open the folder where it exists. But that's if you're using VS Code. You don't have to, but yeah, if you want to. Yes. All right. So the next thing I want to point our attention to is building a NumPy array. Notice that we can build this NumPy array by doing np.array. So np.array builds a NumPy array object, and what we're passing in is just a list of data. Okay. So we have a list of integers that we pass into np.array, which will build a NumPy array out of this list. NumPy arrays can be built out of lists, they can be built out of tuples, and they can be built out of other NumPy arrays.
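
The aliasing and construction steps described here look roughly like the following. This is a small sketch rather than the notebook's actual cell; the variable names and values are illustrative:

```python
import numpy as np  # "np" is the conventional alias for numpy

# np.array accepts several kinds of input.
from_list = np.array([1, 2, 3, 4])    # most common: built from a list
from_tuple = np.array((1, 2, 3, 4))   # built from a tuple
from_array = np.array(from_list)      # built from another array (copies by default)

# Because np.array copies by default, changing the copy leaves the original alone.
from_array[0] = 99

print(from_list)
print(from_tuple)
print(from_array)
```

Without the alias, every call would have to be spelled numpy.array(...); the alias only shortens the name and changes nothing about behavior.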
There are many ways to build a NumPy array, but the most common is to pass in a list to convert it into a NumPy array. So here we are building a NumPy array from a list, which is pretty typical. By the way, you guys remember I said that I'm going to be writing a lot of comments. I'll share these notebooks in our Slack after the classes, but I encourage you guys to do the same thing and write comments in your notebooks. Okay, try to write comments in your notebooks to outline what the code's actually doing. How do you install NumPy? Inside of a cell you can run this command, pip install numpy. Try running that inside of a Jupyter cell. Yeah, this command is not going to work for you because this is a generic Windows command. Are you inside of a notebook, Mariel? If you're inside of a notebook, just run this inside of a cell. It should install it. Yeah, that works too. You can open your terminal and do pip install. If you do that, you'll probably have to restart your kernel. No. So Colab comes with NumPy already installed. Yeah. That's another advantage of Colab: it already has that installed. We don't need to worry about it. So if it says "requirement already satisfied", which is what this is going to say if I run it, it's because it's already installed. This means that I already have it installed. Yeah, you already have it installed. Yep. So in Colab it already exists. This one new cell and this command. If you can't get it to work in your VS Code, I really encourage you to use Colab as much as you can, just to get something that works, because NumPy is already installed in Colab. So there's really nothing extra you need to do. Thanks, Tim. That'd be great. That'd be great. All right.
So going back to this, this builds a NumPy array off of a list. If we run this code, what's happening is we are building an array and storing it in this array variable, and we can print the array. Now look at what the array looks like. It kind of looks like a list when we print it, except the way we can tell this is a NumPy array is that when we print it, it does not have the commas. Notice that the data in there does not have the commas. That's because it's being treated as a NumPy array. At that point it's an array, not a list, so it looks slightly different. And you can even see when we print out the type that this array is actually a NumPy n-dimensional array, an ndarray, which is the foundational data type of NumPy. So we have created a NumPy array, and now you can see that its type is numpy.ndarray. Were you guys able to run this first cell? If you run it, it's not going to do anything but show this. You should see the array, and then you should see it printing out the type. You should see those two things. And do we see how this is creating a NumPy array? So np.array is the function we use to build a NumPy array, and we're passing in a list of data to build that array. Yeah, it might take a moment to start up the kernel. All right. Any questions on this so far? What was ndarray? It's short for n-dimensional array. It's the NumPy array object type. So that is the data type that we are working with now, a NumPy n-dimensional array. It's the generic NumPy array data type, and you can see that because we call type and we get numpy.ndarray as the type of this thing when we create a NumPy array. So, ndarray is short for n-dimensional array.
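
To see the printing difference and the ndarray type for yourself, a sketch like the one below can be run in any notebook cell. The values are arbitrary. It also touches the statistics mentioned earlier: mean is available as an array method and median as a NumPy function, while for a mode this sketch uses np.unique with counts, which is one common workaround and an assumption of the example, not something shown in the course:

```python
import numpy as np

values = [3, 1, 4, 1, 5]
array = np.array(values)

print(values)       # a list prints with commas
print(array)        # an ndarray prints without commas
print(type(array))  # <class 'numpy.ndarray'>

# Built-in statistics on the array.
print(array.mean())
print(np.median(array))

# Most frequent value via np.unique (a workaround chosen for this sketch).
unique_values, counts = np.unique(array, return_counts=True)
mode = unique_values[counts.argmax()]
print(mode)
```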
What I wanted to do is go to the next cell and talk about how we can create matrices, essentially multi-dimensional arrays. The array we've created so far is a one-dimensional array: it has one list of data. But a lot of the time we're interested in multi-dimensional data, because that's typically what a spreadsheet has: rows and columns.

Just to give you an example, NumPy actually supports zero dimensions, which is basically a constant. A single value is considered a 0D array. So if we built a NumPy array and passed in a single integer, or a float like 24.6, that would be a zero-dimensional array. "It is a scalar?" Yes, we would call that a scalar. Perfect. We've already built a 1D array: a single flat list of values is a one-dimensional array.

What gets interesting is when we use a list that has lists as its elements. A list of lists is a 2D array. We're still building a NumPy array out of a list, but look at what the elements of the list are: they're lists themselves. Within this overall list, the first element is a list, 1, 1, 1. That mimics a row. Think of each inner list as a row in a matrix. So this two-dimensional array is really like a matrix, right?
Of course we could have more than just these two. We could add a third list here, like 4, 5, 6, and that would be valid as well. That would be a matrix with three rows, and each row has three items in it, so we would think of it as having three columns. It's a 3x3 matrix, but it is still two dimensions: it has rows and columns. Do we see how this has two dimensions to it? It's a list of lists.

"Do all the lists need to have the same number of elements?" What do you think would happen? Let's try it: let's make this one shorter. Do you think this is going to be allowed? So yeah, this gives us an error. You're exactly right: the dimensions do not match, so this is not allowed. If I add the six back in, this should be okay. And there it is, no more error. So it's going to be an error whenever the shapes do not match. The message tells us the array has an inhomogeneous shape, meaning the rows are not all the same length. So we should correct that and make sure they match.

Okay, so that is a two-dimensional array: a list of lists. And it doesn't have to stop there. Now we have a three-dimensional array, which is a list of lists of lists. It has one of these matrices as each element.
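The progression from 0D to 3D described above can be sketched as follows. Most of the values are assumed, since the transcript only names a few of them. The ragged-list behaviour shown is that of recent NumPy releases (1.24 and later), which raise a ValueError.

```python
import numpy as np

zero_d = np.array(24.6)                 # a single value (scalar): 0 dimensions
one_d = np.array([1, 2, 3])             # a flat list: 1 dimension
two_d = np.array([[1, 1, 1],
                  [2, 2, 2],
                  [4, 5, 6]])           # a list of lists: 3 rows x 3 columns
three_d = np.array([[[1, 2], [3, 4]],
                    [[5, 6], [7, 8]]])  # a list of matrices: 3 dimensions

for arr in (zero_d, one_d, two_d, three_d):
    print(arr.ndim, arr.shape)

# Rows of different lengths cannot form a regular 2D array.
try:
    np.array([[1, 2, 3], [4, 5]])
except ValueError as err:
    print("Error:", err)
```

Counting the opening brackets at the start of the printed array is a quick way to read off the number of dimensions.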
See how this has an overall list where each element is a 2D array, a matrix. Here is one of those matrices as the first element, and here is another 2D matrix as the next element. Together they form a three-dimensional array. If we print this out, we get this 3D array where this matrix is the first item, this matrix is the second item, and so on.

"Didn't understand how 2D is different from 3D." Okay, do you see how we take this whole 2D thing and that's just one element of the 3D array? This 2D matrix is one element, and then we have another 2D matrix as the next element. Matrices are now the elements of the 3D array, whereas the elements of the 2D array are just lists. It doesn't have to be two matrices, that's just the example we have. By the way, one thing that gives away the dimensions is how many opening brackets we have. See how this has two brackets and this has three? Oh, you got it, Roberto. Perfect.

"What's an example of using a 3D array?" Something that uses a 3D array would be a batch of images. An image is like a 2D array because it's broken down into pixels at whatever resolution. A collection of those, say a hundred images, is a 3D array, because every image is a two-dimensional matrix of pixels. Yeah, a 3D array is a collection of matrices. Exactly. Does this example make sense?
I'm glad you asked, because I think it's a good one for thinking about what a 3D array is: every element is a 2D matrix. "What is the maximum 2D element in a 3D array?" I'm not sure what you mean by that. Oh, how many can you fit in a 3D array? As many as your memory will allow. You can have as many matrices in a 3D array as you want until you run out of memory.

Perfect. So, just to recap: we are able to build NumPy arrays using np.array, taking a list of data and populating it into an array. Next we're going to build off of this to learn how to manipulate that array and do different things with it. Before we go to the next notebook, any other questions about building an array?

Perfect. So let's go to our next notebook, 3.02. Do you have it up? Take a moment to pull that one up. Yep, I see some thumbs up. Nice. We're going to build off of that NumPy array by looking at some attributes of arrays. Assuming we have an array, no matter how many dimensions it has, what are some attributes that are useful to us? Let's take a look at an example where again we import numpy, and initially we make a 2D array. Then we're going to print out a bunch of attributes of this array and explain what they mean. The first: if we ever want to know how many dimensions an array has, there's an attribute for that, which is called ndim.
By the way, let me call this out: we access attributes of an object using the syntax object.attribute. In this case our array is the object, so for example array.shape accesses the shape attribute of the array. That's our syntax for accessing different attributes. So the first attribute we're going to learn about is ndim, which gives us the number of dimensions. When we print that out, it's going to be two, and that makes sense: it's a two-dimensional array. "Could the 2D array be...?" Yeah, that's fine, you could use that. That's still two-dimensional. We know it's two-dimensional because our elements are 1D lists: we have a list of lists.

Now, shape gives us the number of rows and columns. In this case it's saying we have two rows and three elements in each row. Shape gives us an idea of what the shape of this matrix is; that's why it's called shape. Here we have a 2D array with 1, 2, 3 in the first row and then a second row of three values. Two rows and three columns, hence we get a shape of 2x3. Does that make sense? Yes, that should be true. Shape is going to be incredibly useful as we go forward, because a lot of the time when we have an array we just want to know how many rows and columns it has, which is the shape.
So the shape is a good attribute to know. "Does the bracket define it as a 2D versus 3D array?" It's the fact that it has two levels of brackets that defines it as a 2D array: it's a list of lists. And sorry, yes, a 2D array can have more than two rows. What I meant to say is that in the shape we're going to see two entries. Let me correct that: the number of dimensions matches how many entries we have in the shape. Sorry, Roberto, I was wrong on what I told you. It's two-dimensional because there are two entries in the shape, not because the first entry happens to be a two.

By the way, how can we get a third row? We could just add another list here with three elements, like 7, 8, 9. This is still a two-dimensional array, but what I want you to notice is that the shape and the size are going to change while ndim stays at two. See how the shape went from 2x3 to 3x3? Yeah, the size is going to be the product. That's true, I was going to get to that next. "What is the shape with three rows?" It's just 3x3. Do you see it now, Marielle? Three rows with three elements each, in a 2D array. Perfect.

So now I want to talk about size. The size is the total number of elements in the array. In this case we have 3x3, so nine total elements. It will always be the product of the entries in the shape. Lots of questions. Let me see, I'm trying to keep up.
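A minimal sketch of the three attributes discussed so far. The second row of the first array is assumed, because the transcript does not give it clearly.

```python
import numpy as np

# First row is from the lecture; the second row is an assumed example.
array = np.array([[1, 2, 3],
                  [4, 5, 6]])

print(array.ndim)   # 2  -> number of dimensions
print(array.shape)  # (2, 3) -> 2 rows, 3 columns
print(array.size)   # 6  -> total elements, 2 * 3

# Adding a third row changes shape and size, but ndim stays at 2.
bigger = np.array([[1, 2, 3],
                   [4, 5, 6],
                   [7, 8, 9]])

print(bigger.ndim, bigger.shape, bigger.size)  # 2 (3, 3) 9

# ndim always equals the number of entries in shape.
print(len(bigger.shape) == bigger.ndim)        # True
```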
Yeah, you can have as many rows and columns as you want. No limitation. "Do you have a real-world example of a 3D array?" It's the one I gave earlier: a batch of images, which we're going to work with quite a bit when we get into deep learning. It's an array whose elements are pixel matrices, and because each of those is 2D, the collection is a 3D array. "Do all the arrays in the same dimension need to be of equal shape?" Yeah, they do. We saw that earlier: even in the 2D case, if we got rid of this element, this would give us an error. The rows need to match in size in order to build the array.

Do we see what size is? Size just gives us the total number of elements, which is really just a multiplication of the shape. There are three rows and three items in each row, so a 3x3 is nine total.

Now, dtype tells us what every element's data type is. Maybe not that interesting, but this is kind of the default. In this case they're all integers, stored as int64. We can also grab how many bytes each one takes up, which is the itemsize attribute. That is each element's memory footprint, which is going to be eight bytes here. And, although it's very rare that we would actually need it, we can get the memory reference for the array data.
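The remaining attributes can be sketched like this. The dtype is set explicitly to int64 here as an assumption, because the default integer type can differ by platform and NumPy version.

```python
import numpy as np

# dtype is pinned to int64 so the output matches the lecture on any platform.
array = np.array([[1, 2, 3],
                  [4, 5, 6],
                  [7, 8, 9]], dtype=np.int64)

print(array.dtype)     # int64 -> the one data type shared by every element
print(array.itemsize)  # 8     -> bytes used by each element
print(array.nbytes)    # 72    -> size * itemsize
print(array.data)      # a memoryview over the raw buffer, e.g. <memory at 0x...>

# NumPy picks one dtype for the whole array, even when the input mixes types.
mixed = np.array([1, 2.5, 3])
print(mixed.dtype)     # float64
```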
So, array.data. We won't really ever need to worry about this, but it matters for libraries like pandas, which is built on top of NumPy and works with that underlying memory to do its calculations. We generally will never need to know what that memory address is. Any questions about these attributes? I think the one we'll use the most is shape.

"Do all elements in an array have to have the same data type?" You don't have to pass in values of the same type: just like with a list, you can mix them and it will still build the array. What NumPy does is convert them to one common data type, so the array itself ends up with a single dtype. When we're working with data, they typically will be the same type anyway. Try it out for yourself.

All right, I want to show you a couple of functions we can use to manipulate the shape of an array. The first one is array.reshape. We call array.reshape and pass in a tuple with the new shape. In this case we're putting in (4, 3), which is to say we want to take this existing one-dimensional array and turn it into a two-dimensional array that is 4x3, meaning four rows and three columns. But first, let me ask you: do you think this is even possible? What needs to be true in order to reshape this properly? If I want to take a flat 1D array and put it into something that's 4x3, how many elements do I need? Perfect, you're right on top of it: 12. I need 12 elements.
So what happens if I don't have 12? Do we think this is going to work? Let's try it. Error. And look at what the error tells us: cannot reshape array of size 11 into shape (4,3). It tells us directly that we can't do that. So that is definitely a prerequisite for using reshape: the total number of elements in the new shape has to match the number of elements you start with. As long as that holds, you can reshape into any shape you want. We could reshape this into 3x4, because that totals 12. We could do 2x6. But could we do 2x7? No, that needs 14 elements and we only have 12. 2x3x2? Sure, that'd be three-dimensional. Now we have a 3D array: each matrix is 3x2 and we have two of those. So reshape takes an existing array and moves it into a new shape, assuming the sizes line up. That's incredibly useful, and we'll use reshape from time to time.

We can also do the reverse of reshape: we can always take something and flatten it out into 1D. No matter what shape we start with, the flatten function gives us a one-dimensional version of it. It's a particular reshape that completely flattens the array. You can see it takes this three-dimensional array and flattens it into a one-dimensional array. So flatten always returns a 1D array. Pretty straightforward.
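Here is a small sketch of reshape and flatten as described above. The 12 values come from np.arange and are assumed, since the notebook's actual data is not shown.

```python
import numpy as np

array = np.arange(1, 13)            # assumed data: 12 elements in a 1D array

m43 = array.reshape((4, 3))         # 4 rows x 3 columns
m34 = array.reshape((3, 4))         # also valid: 3 * 4 == 12
cube = array.reshape((2, 3, 2))     # 3D: two matrices, each 3x2

print(m43.shape, m34.shape, cube.shape)

# flatten always returns a 1D array, whatever the starting shape.
flat = cube.flatten()
print(flat.shape)                   # (12,)

# Reshape fails when the element counts do not match.
try:
    np.arange(11).reshape((4, 3))
except ValueError as err:
    print("Error:", err)
```

The only rule reshape enforces is that the product of the new shape equals the number of elements in the array.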
"Are there any benefits?" Yeah. Sometimes we want to take something that is in one shape and move it into another because the code that manipulates it assumes a particular shape. We'll see that later on when we get into deep learning. Especially when we work with images or text, it's going to be important to reshape things from time to time. Maybe not right away, but when we get into deep learning we'll reshape.

Okay, one more I want to show you and then we'll take a break: there is a transpose function. What it does is swap rows and columns. Whatever was a row, this 1, 2, 3, now becomes a column. This was a 2x3 matrix, and it transposes into a 3x2 matrix. That's going to be useful down the road for different linear algebra calculations, where we may need to transpose from time to time. Any questions on any of these functions? Hopefully they're straightforward. We have reshape, flatten, and transpose, and they all just change the shape of the array.

So, we're going to start doing some arithmetic operations. When you have data in NumPy arrays you can do elementwise operations, meaning operations that go element by element, match them up, and do some mathematical operation between them: addition, subtraction, multiplication, division. For instance, we have these two arrays of the same shape, and we can add them together. This position is added to this position, this position to this position, and so on.
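For the transpose step described above, a minimal sketch might look like this. The second row is an assumed value.

```python
import numpy as np

# 2x3 matrix; the first row 1, 2, 3 is from the lecture, the second is assumed.
matrix = np.array([[1, 2, 3],
                   [4, 5, 6]])

transposed = matrix.transpose()   # matrix.T is a common shorthand

print(matrix.shape)       # (2, 3)
print(transposed.shape)   # (3, 2)
print(transposed)         # the old first row is now the first column
```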
When we do that, we get 40 in every slot, because 30 + 10 is 40, 20 + 20 is 40, and 10 + 30 is also 40. But look at the syntax. There are actually two ways to do it. One way is np.add, where you pass in a and b, array one and array two. We add those two and store it in result, and notice that the result is an array of the same shape, just with each pair of elements from the original arrays added together. The alternative is to just use standard arithmetic operators: a + b. It doesn't really matter which one you use; I've seen both, and both produce the same array. So we could write result = a + b and then print the result, and we get the same array as before. Either np.add(a, b) or a + b will do that elementwise addition.

We also have the same thing for subtract, multiply, and divide. Here we have 2D arrays, but the exact same thing happens. np.subtract(a, b) is the same as a - b: it takes a and subtracts b from it. So 30 minus 10 is 20, 40 minus 20 is 20, 60 minus 30 is 30, and then 50 minus 40 is 10, and so on. We subtract the elements in the same positions and get a 2D array as the result. Again, you could write a - b or np.subtract; either way works. "2D plus 3D, could you do it?"
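The addition and subtraction examples above can be sketched as follows. The 1D arrays use the values mentioned in the lecture. The 2D arrays are only partly given in the transcript, so the last two entries of each second row are assumed.

```python
import numpy as np

a = np.array([30, 20, 10])
b = np.array([10, 20, 30])

print(np.add(a, b))   # [40 40 40]
print(a + b)          # same result with the plain operator

# 2D case: elements in matching positions are subtracted.
a2 = np.array([[30, 40, 60],
               [50, 70, 90]])   # 70 and 90 are assumed values
b2 = np.array([[10, 20, 30],
               [40, 30, 20]])   # the trailing 30 and 20 are assumed values

print(np.subtract(a2, b2))      # same as a2 - b2
```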
Well, you could try it out. This one is 2D. Let's copy it and build a 3D array: one of these matrices, and then another one where we just change up the numbers: 10, 15, 20, 25, 30, 45. Then let's do np.subtract(a, b) and see what we get. We actually do get a result, and the reason is something called broadcasting. It's an interesting idea in NumPy: when you do a mathematical operation on arrays of different shapes, NumPy stretches the smaller one to line up with the larger one, as long as the shapes are compatible. But you might get some unintended consequences. For instance, we get zeros for this first matrix, because it's 30 minus 30, 40 minus 40, 60 minus 60. But notice that the same 2D array is also applied to the second matrix: 30 - 10 is 20, 40 - 15 is 25, and so on. So the dimensions don't have to be the same, but be careful: NumPy will make the shapes fit by applying the 2D array to each 2D matrix in the 3D array, and you may get results that are valid in NumPy but aren't what you intended.

"If you subtract a bigger value from a smaller one, will it error?" No, it'll just give you a negative. In this example, try putting b first and you get negatives. It's still possible.

Okay, multiplication works the same way. We can take 30 * 10, 20 * 20, 10 * 30, and we get this array: 300, 400, 300.
These are just elementwise multiplications. Again, the alternative is a * b, so result = a * b would be the same thing. But we can use np.multiply to make the operation more explicit. This is not to be confused with matrix multiplication. Traditional matrix multiplication is different and will be covered later: it requires the matrices to have compatible shapes, and it's a completely different operation from elementwise multiplication, which is just a * b or np.multiply.

We can do division too. Notice that this is again a scenario where the shapes are not the same: we have a 2D array and we're dividing it by a 1D array. What happens is we take this first row and divide it by this, then take the second row and divide it by this, and get the second row of the result. The result's shape matches the larger of the two shapes. The first array is two-dimensional with shape 2x3, and the second is just a three-element 1D array. So when we do the division, NumPy does that broadcasting I mentioned earlier: it looks at which shape has more dimensions and applies the smaller array across it.

What do you think is going to happen if I reduce this down to only two elements? Do you think this division would work? We could try it. No, it doesn't broadcast that way.
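To make the multiplication and broadcasting behaviour above concrete, here is a sketch. The multiplication uses the lecture's values. The division arrays are assumed values chosen so the results are easy to check by hand.

```python
import numpy as np

# Elementwise multiplication (not matrix multiplication).
a = np.array([30, 20, 10])
b = np.array([10, 20, 30])
print(np.multiply(a, b))   # [300 400 300], same as a * b

# Broadcasting: a (2, 3) array divided by a (3,) array.
# The 1D array is applied to every row of the 2D array.
m = np.array([[10.0, 20.0, 30.0],
              [40.0, 50.0, 60.0]])
d = np.array([10.0, 5.0, 2.0])
result = m / d
print(result.shape)        # (2, 3): the larger of the two shapes
print(result)

# A (2,) array cannot be lined up with rows of length 3.
try:
    m / np.array([10.0, 5.0])
    broadcast_failed = False
except ValueError as err:
    broadcast_failed = True
    print("Error:", err)
```

Broadcasting compares shapes from the last dimension backwards, so a trailing length of 2 against a trailing length of 3 is rejected.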
You can even see it in the error: it says the operands could not be broadcast together. So even with broadcasting, the shapes still need to align to some degree. The smaller array needs to have a valid shape to broadcast across each of these dimensions.

"Can we divide b by a?" Let's try it. Yes, we can. Again it matches up the shapes and does this divided by this, this divided by this. But remember that a / b is not the same as b / a.

"Can you get a division by zero?" Try making one of these zeros. Let's say this was all zeros and we did a divided by b. Do you think this will work? It technically returns a result, but the values are all infinity, and we get a warning that we're dividing by zero. So sure, you could do it, but you're going to get infinities all over the place. It's a warning rather than an error.

All right, let's go to exponents. We can do elementwise exponents, where every element in a is raised to the power of the corresponding element in b. For instance, 2 squared, 2 cubed, 2 to the 4th, 2 to the 5th, 2 to the 6th, and you can see what each one of those is. That may be useful from time to time, when we have a reason to raise the elements of one array to the powers in another. There's a power function here to do that. Before we move on to the statistics functions, which are going to be really interesting, any questions about these? Are they kind of straightforward?
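The exponent and divide-by-zero cases above might be sketched like this. The division values are assumed. np.errstate is used here only to silence the RuntimeWarning that NumPy would otherwise print.

```python
import numpy as np

# Elementwise power: each element of a is raised to the matching element of b.
a = np.array([2, 2, 2, 2, 2])
b = np.array([2, 3, 4, 5, 6])
print(np.power(a, b))   # [ 4  8 16 32 64], same as a ** b

# Dividing by zero returns infinities and, by default, a RuntimeWarning.
numerators = np.array([1.0, 2.0, 3.0])   # assumed values
zeros = np.zeros(3)
with np.errstate(divide="ignore"):       # suppress the warning for this sketch
    quotient = numerators / zeros
print(quotient)         # [inf inf inf]
```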
Do they make sense in terms of the arrays? Hopefully they're very much like regular arithmetic. "Can you build arrays from a set of arrays?" You mean like a Python set? Oh, from arrays. Yeah, you can build arrays from arrays. We haven't done that yet, but it's not too hard, so let me show you a quick example. Inside of a list we could have np.array with 1, 2, 3 and then another np.array with 4, 5, 6, and pass that list to np.array. Let me display x. So this makes a 2D array. Does this make sense? I'm using arrays as the input elements to build a new array. So yeah, you absolutely can do that. Good question. Any others before we go on to the statistics functions?

All right. What's great about NumPy is that it can easily do statistics on arrays. It can find the median, the average, the standard deviation, and the variance. I know we haven't technically defined what each of those is, but that's okay. It's useful to know that NumPy can easily compute these statistical functions on data that's inside of an array. For instance, if we have this 2D array, we can compute the median of all elements in the array with np.median, a built-in NumPy function.
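The quick array-from-arrays example answered earlier in this passage can be written out as a short sketch, using the values the instructor mentions.

```python
import numpy as np

# A list whose elements are themselves NumPy arrays.
x = np.array([np.array([1, 2, 3]),
              np.array([4, 5, 6])])

print(x)
print(x.ndim, x.shape)   # 2 (2, 3): two 1D arrays stack into a 2D array
```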
Again, we will get to what the median is in our statistics overview later on, but the median is the 50th percentile: we order the values from least to greatest and find the middle one. In this case the median is four, out of all the elements that we have. We can also take the average, which is the mean. np.mean averages all the elements together, and we get 6.3333. That's very convenient.

Now, this seems really straightforward, but I want you to see how powerful it is. Lists, for example, do not have this ability: there is no built-in mean method on a list. To find the average of a list ourselves, we'd have to total up all the elements and divide by the length of the list. Not that that's hard to do, but it is something manual we would have to write. With NumPy, np.mean is a simple function we can apply to arrays, and it's really optimized, really fast, and the same goes for the median.

Same thing with standard deviation and variance. Those are more advanced statistical functions that describe the spread of the data around the average. Again, we're going to learn about these later on, but np.std computes the standard deviation and np.var computes the variance. And really, the standard deviation is the square root of the variance.
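A minimal sketch of the four statistics functions named above. The data is assumed, because the lecture's array is not shown, so the numbers differ from the 4 and 6.3333 quoted by the instructor.

```python
import numpy as np

values = [1, 2, 3, 4, 5, 6]              # assumed data
data = np.array(values).reshape((2, 3))  # a 2D array, as in the lecture

print(np.median(data))   # middle of all elements, ignoring the 2D layout
print(np.mean(data))     # average of all elements
print(np.std(data))      # standard deviation
print(np.var(data))      # variance

# The manual version needed for a plain list.
manual_mean = sum(values) / len(values)
print(manual_mean)

# Standard deviation is the square root of the variance.
print(np.isclose(np.std(data) ** 2, np.var(data)))   # True
```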
So if you took the standard deviation and raised it to the second power, you would get the variance. But you can compute them separately this way. It's pretty convenient that NumPy provides those statistical operations, and we're going to be using them quite a bit, especially when we explore our data and want to find an average or a standard deviation. Any questions about these? I know we haven't defined exactly what they are. We will later, so no worries if you're wondering exactly how to calculate them. But any questions about the NumPy functions themselves? Very good.

Let's talk about percentile, then. NumPy also has a percentile function. We can take an array and compute the 50th percentile, which is the median. Again, we haven't learned what a percentile is, but think about ordering all the elements: the median is at the 50th percentile, with half the data below it. Likewise there's a value that 25% of the data falls below, a value that 75% falls below, a value that 95% falls below. The percentile ranges from 0 to 100, so the 99th percentile means that 99% of the data falls below that value if we sorted it. We can compute any percentile we want by passing in the array and then a number between 0 and 100. It's usually a whole number, although np.percentile accepts decimals as well.
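As a sketch of np.percentile with assumed data (the lecture's array is not shown):

```python
import numpy as np

values = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9])   # assumed data

p25 = np.percentile(values, 25)
p50 = np.percentile(values, 50)
p100 = np.percentile(values, 100)
p995 = np.percentile(values, 99.5)   # non-integer percentiles are allowed

print(p25, p50, p100, p995)
print(p50 == np.median(values))      # True: the 50th percentile is the median
print(p100 == values.max())          # True: the 100th percentile is the max
```

By default np.percentile interpolates linearly between neighbouring values, which is why a percentile can land on a number that is not itself in the array.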
For instance, we can compute the 99th percentile. Oops, I have to actually run this. Let me run that. There we go. So 99% of the values are below 22.8, which kind of makes sense because most of the numbers are pretty low. There's really only one number above that value, which is the 24. If I did the 100th percentile, that would basically be the max, that 24, because 100% of the numbers are at or below the max. And if I did the 50th percentile, that would be the median: half the values are below it and half are above. That matches the median that we found earlier, which was four. Okay, so that's percentile. Any questions about percentile? All right. Finally, I wanted to mention that you can manipulate strings in NumPy. This is less often used, because typically when we're working with NumPy data we don't have strings in there. We're usually dealing with numerical data, hence the name NumPy, numerical Python. But it is possible to work with strings and do different string manipulations. For instance, say we have a NumPy array that has two strings, hello and world, and another array that has welcome and learners. These are two 1D arrays. We can concatenate the strings elementwise by using the np.char (character) module. The reason it's inside the character module is that instead of a numerical addition, the typical arithmetic, it's doing a string addition, which is concatenation. So hello and welcome end up concatenated together, and world and learners get concatenated together. That's elementwise string concatenation, from the np.char module within NumPy. Okay. Very rare.
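The elementwise concatenation described above can be sketched like this. The exact strings, including their capitalization, are assumptions based on what is said in the session, and the variable names are made up.

```python
import numpy as np

# Two 1D arrays of strings (contents assumed from the spoken description).
first = np.array(["Hello", "World"])
second = np.array(["Welcome", "Learners"])

# np.char.add joins the strings position by position:
# first[0] with second[0], first[1] with second[1].
combined = np.char.add(first, second)

print(combined)
```

No separator is inserted between the joined strings, so a space has to be added explicitly if one is wanted.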
It's very rare we would have to do this, but I'm just pointing out that it does exist. If for some reason we need to manipulate strings, we have that ability. All right. Then we can replace substrings with new strings. If we have this original string, hello, how are you, we can print it out, and we can use a character replacement to replace hello with hi within the string. Once we do that, we can print out the new string, and we get hi, how are you as the new string. So this does a string replacement if it can find the substring, that is, if hello exists. We should test this out: we could put in something that doesn't exist in there, and it's not going to be replaced. This is saying, let's try to replace something with hi, but something doesn't exist in the string, so it's not going to replace anything. It just returns the original string. But hello does exist, so that gets replaced with hi. If we want to, we can also manipulate strings to uppercase everything or lowercase everything. This one is all lowercase, but if we call upper and pass in the string, it will uppercase everything. And this one is all uppercase, and we can lowercase everything. You can see how that all works. Again, it's very rare we would ever need to manipulate strings, but if we did, there's this character module that can help us. It's rare because most of our data inside a NumPy array is going to be numerical. Any questions there? All right. So, just to recap, we have our arithmetic operations. Basic arithmetic between NumPy arrays is pretty straightforward.
We can use np.add or we can use a + b, np.subtract or a - b, multiply, divide. We can use all of those basic arithmetic operations. We also have a power function, which raises things to exponents. And the ones that are incredibly useful, that we'll use all the time going forward, are the statistical functions like average, standard deviation, variance, and median, which we can compute really easily on an array. All right. So, let's go to our next notebook and continue working with NumPy. Go to the 3.04 notebook. The whole point of this one is to practice accessing data within the NumPy array. And truly, this is going to be great, because it works the same way as a list. We're going to be able to access elements by their position and also slice, just as we did with lists. The one difference is that NumPy arrays can have multiple dimensions. That's where we need to be careful: if we want to access an element in, say, the second row, third column position, how do we do that? It's actually really easy if we think of the index as a coordinate that says exactly what we want to access. If you take a look at this picture, it's a really great picture to break down a 2D NumPy array. Imagine we had a 2D NumPy array whose shape was two rows by three columns, so it's a 2x3. We have elements 1, 2, 3 and 4, 5, 6 in our array. So we have two rows, and each row has three elements. Okay, a 2x3 shape. Now, the element that is right here is in the first row, so it's at index zero in terms of the row. There are two rows, so the row coordinate is either going to be index zero or index one.
So it's at index zero for the row. But which column is it in? It's in the first column, so it is at index zero for the column. The coordinate for this guy would be 0, 0: if we passed in 0, 0 as our index, we could access that element right there, because what this is signaling is that we are at row zero and column zero. So with multi-dimensional arrays, we have to be careful: things are accessed by their index coordinate rather than by a single index like we saw with lists. With a list, we could grab something at index zero, index one, index two, maybe index minus one. But with a 2D array, we're grabbing things at their coordinate. So row zero, column zero should give me the element one, which is this guy, if I were to access that element. All right, let's take a look at this next guy. He's in the same row, so he's still at row zero, but now at column one. So we should be able to access that two sitting at the row zero, column one index. This item should equal two. And then lastly, from this same example, we have this three. We're still within row zero, but now we're at column index two, and this should equal three. So in a 2D array, the first entry is what row we want to go to. Imagine scanning over this grid. What row do we want to go to? Row zero, that's the first row. What column do we want to go to? Column two, that's the last column, the third column. Let me ask you guys, does that make sense, thinking about it like a grid and a coordinate for how we access elements? Any questions on that?
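The grid-and-coordinate idea from the picture can be sketched in code as follows. The 2x3 values come from the example above, while the variable name `grid` is made up for this sketch.

```python
import numpy as np

# The 2x3 array from the picture: two rows, three columns.
grid = np.array([[1, 2, 3],
                 [4, 5, 6]])

shape = grid.shape       # (rows, columns)

# Each index is a (row, column) coordinate.
top_left = grid[0, 0]    # row 0, column 0
top_middle = grid[0, 1]  # row 0, column 1
top_right = grid[0, 2]   # row 0, column 2

print(shape, top_left, top_middle, top_right)
```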
Are the i and j just random? No. I mean, in this example it's random, it's just a 2x3. But they represent how many rows and columns we have, right? So i represents how many rows we have, and j represents how many columns we have. Oh, sure. Yeah, they could be any letters. We're just using i and j because those are the traditional indices for row and column. It's just kind of a tradition to do that, but you could use any letters there. No worries. Okay, so let's see how this works inside of code. Let's create some arrays to practice with. We create a 1D array, a 2D array, and a 3D array so we can practice accessing certain elements. If we look at the 1D array, this behaves exactly like a list. In fact, I'm going to write that down: this behaves just like a list. With the 1D array we can access the third element or the first element, or, in this case, the fourth element, which is at index three, and that returns the four. So we're doing that just like a list. We can also get the very last element, which is at the minus one index, and that is the six. So a 1D array behaves exactly like a list. It's not much different in terms of accessing the elements. We don't need to worry about coordinates because there are no rows and columns. It's just a single dimension, basically like a list. Very easy to access. And, for instance, we can add two elements from these positions together. This adds the position one and position zero elements together, and that ends up being three. We can see that because it's 2 + 1, which is three. So pretty easy to do. All right.
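Here is a rough sketch of the 1D accesses just described. The notebook's actual array is not shown in the transcript, so `arr_1d` below is an assumption chosen to match the values mentioned (index 3 gives 4, the last element is 6, and positions 1 and 0 add up to 3).

```python
import numpy as np

# Assumed contents, consistent with the values read out in the session.
arr_1d = np.array([1, 2, 3, 4, 5, 6])

fourth = arr_1d[3]              # fourth element, at index 3
last = arr_1d[-1]               # last element, at index -1
total = arr_1d[1] + arr_1d[0]   # adding two individual elements

print(fourth, last, total)
```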
So here's another picture of everything we just drew earlier, where we're thinking of the elements at these positions as coordinates. The element right here in the first row, first column is at coordinate 0, 0, and then the elements next to it are at coordinates 0, 1 and 0, 2. If we go down to the next row, this element would be at row one, column zero, then row one, column one, then row one, column two, and on and on for however many rows and columns we have. So in a 2D array we think of accessing elements by their row and column index. All right. For example, let's get the element that's in the first row, because this is index zero, and the third column, because this is index two. If we go back to our 2D array, the first row is this guy, and in the third column it should be a three. And it is: if we print that out, we get a three. How do we feel about that example? Do we see how we're accessing it from inside these brackets, just like we would a list, but now it's a coordinate? Do you see that one? Good. Now we can also grab something from the second row, because this is now one, and the second column, because this is also one. If we go back to our array, second row, second column should be this guy here, right? Second row is this guy. Second column is this guy. So this should be the five, which it is. We print that out and we get the five. That's what that coordinate represents: second row, second column. Very good. I want to take a look at a 3D array now, which is going to be a little interesting in that we're just going to add an extra coordinate to the mix. With the 3D array, we basically need to know which matrix we're talking about. Remember, a 3D array is going to look like this. We basically have rows and columns.
Row, column, row, column, row, column. We basically have an array of matrices as our 3D array. So the first coordinate is going to say which matrix we're at. Are we at this one? Are we at this one? Are we at this one? Roberto, I'm just having a little trouble with the comments. If we already know the value we're looking for, then why do we need to use the coordinates? Yeah. So, it's because we need to get familiar with how to access different elements of our data. For instance, maybe we need to access only the last row or only the last column in a collection of data, and we're going to practice slicing coming up in a minute. We're going to need to be able to access entire collections of data within a matrix, so it's going to be important to be able to grab that collection of data from a large data set. I think I see what you're saying: it seems redundant to do that right now, when we can clearly see exactly what value it is. But in a large data set it wouldn't be that obvious, and we need a programmatic way to select those contents. All right. In the 3D example, what I wanted us to see is that the first coordinate says whether we're talking about this matrix or this matrix. So we're actually going to have three coordinates. The first coordinate relates to which matrix we're talking about. Once we know that, like this zero would say we're talking about the first matrix, then it's the same as usual: these two coordinates here are the row and column within that matrix. So with 3D, that first coordinate is which matrix it is. Is it the first, the second, the third, the fourth? Because remember, in a 3D array every element is a matrix.
Every element is a 2D array. So the first coordinate is saying which matrix we're talking about. And once we know which matrix it is, we can use the next two coordinates to figure out what row and column of that matrix we're talking about. So let's see an example. Look at how this 3D index has three coordinates. If we break this down, the first coordinate is saying that we want to access the second matrix. Within the second matrix, I want to access the element that's at row zero, column zero. In the setup it's like we have a matrix here, a matrix here, a matrix... actually, we only have two in our example. So this means we want to look at that second matrix and grab the element that's at coordinate 0, 0. Let's go back to our array and see what that should be. What is the second matrix? It's this guy. This is the second matrix within our 3D array. And then we want to access row zero, column zero. That should be this seven, right? Row zero, column zero of that second matrix should be that seven. And that's exactly what we get: if we print out this coordinate, we get seven. So matrix refers to row? Kind of, yeah. It's because in a 3D array every element is a 2D matrix. Okay. All right. How do we feel about this three-dimensional indexing? Does that make sense? This is the second matrix, and then this is row zero, column zero of that second matrix. Finally, I want to mention that we can do negative indexing for all of this, so that still applies. If we did minus three, that would be third from the last. If we go back to our 1D array, we could do minus one, we could do minus three. Minus one, minus two, minus three should be the four here. So we can still use our negative indexing as we would with any list.
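A minimal sketch of the 3D and negative indexing described above. The arrays are assumptions: the transcript only tells us that the 3D array holds two matrices, that row 0, column 0 of the second matrix is 7, and that index -3 of the 1D array is 4. The rest of the values are filled in for illustration.

```python
import numpy as np

# Assumed 3D array: two 2x3 matrices stacked together.
arr_3d = np.array([[[1, 2, 3],
                    [4, 5, 6]],
                   [[7, 8, 9],
                    [10, 11, 12]]])

# First coordinate picks the matrix, the next two pick row and column.
second_matrix = arr_3d[1]   # the whole second matrix, a 2D array
element = arr_3d[1, 0, 0]   # second matrix, row 0, column 0

# Negative indexing counts from the end, just as with a list.
arr_1d = np.array([1, 2, 3, 4, 5, 6])  # assumed contents
third_from_last = arr_1d[-3]

print(arr_3d.shape, element, third_from_last)
```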
So that does not change. We can still use minus indexing. We can even use it in the 2D array. Here it's in terms of the column: this is saying that in a 2D array we want to grab the element that's in the second row but in the last column. That's what the minus one means in this case. Second row, but the last column, because we have a minus one index…

Transcript truncated. Watch the full video for the complete content.
