
# Predicting with Linear Models

- Let's do a couple of problems where we estimate or predict
- answers to questions using linear models.
- It sounds very fancy.
- But a linear model is just a line that we use to describe
- some trend in data.
- With our linear models we will do a little bit of
- interpolation and a little bit of extrapolation.
- You might have already heard these words before in your
- everyday life.
- Interpolation means trying to estimate what happened between
- two data points.
- We're going to do a couple of examples of
- that in this video.
- Extrapolation means we'll see what the last few data points
- were, and see what that trend would look like, and then keep
- on continuing that trend and see what might happen if that
- trend were to continue.
- I'll show you some examples of that.
- So let's do some example questions here.
- So I have this chart here.
- Median age, remember median is the middle.
- So if I have the numbers 3, 7, and 9, the median is the 7.
- It's the middle age.
- So this is the median age of males and females at first
- marriage by year.
- So if we just look at this data right here, if we look at
- 1900, the average man, or rather not the
- average, the median male,
- so the middle man, was getting married at what looks like
- around the age of a little over 25, almost 26, while the
- median woman was getting married at what looks like around 22 or
- 23 years old.
- As we went through the century, those ages got lower,
- and lower, and lower, until we got to about 1960.
- This problem is interesting irrespective
- of the actual problem.
- It's just interesting to see this trend that at the
- beginning of the century, people got married at ages reasonably
- similar to when they're getting married right now.
- But there was a minimum point.
- Around the 1960s, people got married at a younger age.
- This was true for males and females.
- You can also see back here the age difference between the
- median male and the median female.
- That's just gotten smaller and smaller over time.
- Now it's really small.
- So these are just all interesting
- things from this chart.
- This is the actual data from that chart.
- So actually I didn't even have to estimate it here.
- In 1900, the median age of males getting married was
- 25.9, that's that right there, and the median age of women
- getting married was 21.9, that's that over there.
- So this is just a scatter plot of this data.
- We've just plotted each of these points over here, these
- in blue, these in red.
- Let's do the questions.
- Use the data from example 1-- this is example 1 up here-- to
- estimate the age at marriage for females in 1946.
- Fit a line by hand to the data before 1970.
- So what they want to do, is they want me to look at all of
- the data before 1970.
- So let's see, this is 1970 right here.
- That's 1970 right there.
- They want to fit a line by hand to the data before 1970.
- What they mean is-- and I'm just going to eyeball it-- but
- they want me to draw a line that gets as close to all of
- this data before 1970 as possible.
- So it can't go through all of these points.
- Because all these points don't sit exactly on one line.
- But let me try to draw a line, or a linear model, for the
- data from, what is this, 1890 all the way to 1970.
- So I'm going to try my best to draw a line here that gets
- close to all of these points.
- It won't be able to go through all of them.
- So all the way to 1970, maybe a line would look
- something like that.
- Let me draw a better one than that.
- I'll draw it in red because we're dealing
- with the red data.
- So I want to go all the way to 1970.
- It would have been nice to have a line drawing tool.
- Maybe the line might look something like that.
- There are actual mathematical ways to figure out the best
- fit line on this.
- We won't go into that right now.
- That's called linear regression.
- But I just eyeballed it.
- That line looks like a pretty good fit for all of that data.
- None of the data is too far away from that line.
- So that's what they mean by fit a line by hand to the data
- before 1970.
- This is the data before 1970.
- If we assume that this is a good linear model for that
- data, we can use it to estimate the age at marriage
- for females in 1946.
- Let's see, this is 1950.
- This would be 1945.
- 1945 would be right over here.
- So 1946 would be right over there.
- So if we wanted to estimate that, if I were to figure out
- where that hits the vertical axis, it seems to hit the
- vertical axis at around-- I don't know, my best guess
- is, I should have zoomed in more-- it hits
- it right about there.
- This is 22.5.
- So this looks like maybe 20.5.
- Because this would be maybe 20.5 years.
- So I'll call it 20.5 years.
- Obviously I can't do it exact, but you get the idea.
- I am interpolating, in a sense.
- Here I'm using the model to make the estimate.
- I won't necessarily use the word interpolation yet.
- We're going to do a much more direct form of interpolation
- later on in this video.
- But here I fit a line, and then I just use that line as a
- model to estimate what was the median age of women in 1946.
- Let's do part 2.
- Use the data from example 1 to estimate the age of marriage
- for females in 1984.
- Fit a line by hand to the data from 1970 on in order to
- estimate this accurately.
- So now they want us to fit a line to the data from 1970 on.
- So let me make a line.
- So fit a line by hand to the data from 1970 on.
- So let me draw a line that gets close to all of these
- points, that gets close to describing
- all of these points.
- Maybe it looks something like that.
- Not all of them sit exactly on the line, but they're all
- pretty close.
- If we assume that this is a good linear model-- I
- haven't done this mathematically, I'm eyeballing
- it-- but let's see what it would be in 1984.
- This is 1990.
- So 1984 is going to be right around there.
- Actually it looks amazingly similar to the median age of
- men in 1970.
- So we can actually get the exact number.
- It's about 23.2.
- So this looks to be about 23.2 years.
- Part 3, use the data from example 1 to estimate the age
- at marriage for males in 1995.
- Use linear interpolation.
- Now we're going to do a more direct form of interpolation.
- We're not going to use a linear model or we're not
- going to fit a line.
- We're going to use linear interpolation between the 1990 and
- 2000 data points.
- We're looking at men now.
- So I'll do it in blue.
- Between 1990, that is 1990 and that is 2000 for men.
- So if I were to draw just a line there, we could assume
- that the trend would have looked something like that
- between 1990 and 2000.
- They want us to interpolate what happened in 1995.
- So 1995 is that point right there.
- So this is pure interpolation.
- We're using the data point in 1990 and the
- data point in 2000.
- We're drawing a line between them and assuming that 1995
- would have been right in between them.
- We're interpolating between the 1990 data
- and the 2000 data.
- So if you were to try to estimate where that is, it
- looks like it's a little bit over 26.
- Actually if we say that it's right in between these two
- values right there, it'll be 26.45 years.
- If it's exactly in between, 1995 is exactly
- 5 years from each.
- So 26.45 looks pretty good.
- So let's do another example. So here's the percentage of women
- smokers by year.
- I have to say these charts are interesting in and of
- themselves.
- You'll notice in 1990 there were a good number of women
- who were pregnant who smoked.
- Almost 18% or 19% of pregnant women smoked in 1990.
- Now that's down.
- Well this is 2004 so hopefully this trend is continuing to go
- down eventually to 0%.
- But it went down to 10%.
- So there still is a percentage of pregnant women
- who unfortunately smoke.
- Let's do these problems. Use the data from example 2--
- that's this right here-- to estimate the percentage of
- pregnant smokers in 1997.
- Use linear interpolation between the
- 1996 and 2000 data.
- We're going to use linear interpolation.
- So let me draw a line.
- You could always do this much more exact if you have better
- tools or a spreadsheet.
- But I'm going to draw a line, and I want to get 1997.
- This is 1997.
- It's going to be right over there.
- If we use linear interpolation, we are
- interpolating between that data point
- and that data point.
- It's a linear interpolation.
- We assume it's a linear trend between the two.
- So this looks like it's-- let me go all the way over here--
- looks a little bit more than 12, maybe
- 13% is my best estimate.
- Now part 5.
- Use the data from example 2-- same example right here-- to
- estimate the percentage of pregnant smokers in 2006.
- There isn't any 2006.
- This chart ends at 2004.
- So we have to use linear extrapolation with the final
- two data points.
- So this here you're going to see the difference between
- interpolation and extrapolation.
- Interpolation, you are estimating between two data
- points you already knew.
- Extrapolation, we're going to take the final two data points
- and we're going to continue that line.
- We're going to extrapolate that line.
- So if you were to just continue this line between the
- final two data points, it might look
- something like that.
- So if you had 2006 out here-- these are going up by two, so
- 2004, 2006-- we extrapolate this line.
- The percentage of women smoking might be, well it
- looks a little bit under 10%.
- So maybe 9.5%.
- Once again, it's just an estimate.
- But we're just extrapolating the trend.
- We're assuming the trend of the last
- two data points continues.
- Then using that, it would be around 9.5% in 2006.
- Now, that's definitely not the only prediction
- you could make.
- You could extrapolate based on all of the data and get a line
- that looks something-- let me try to do it a little bit
- better than that-- that looks something like that.
- Then this line would predict a lower smoking rate.
- So it depends on what data you use, what you're using to do
- your linear model with, and how you're extrapolating.
- But anyway, hopefully you found this useful.
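
The interpolation and extrapolation steps from the video can be sketched in a few lines of Python. This is only an illustration, not part of the original lesson. The function name `linear_estimate` is made up for this sketch, and the endpoint values are assumptions: the video never reads out the 1990 and 2000 male ages or the last two smoking percentages, so the numbers below are placeholders chosen to agree with the estimates given (26.45 years and about 9.5%).

```python
def linear_estimate(x0, y0, x1, y1, x):
    """Return the value at x on the straight line through (x0, y0) and (x1, y1).

    If x lies between x0 and x1 this is linear interpolation;
    if x lies outside that range it is linear extrapolation.
    """
    slope = (y1 - y0) / (x1 - x0)
    return y0 + slope * (x - x0)


# Interpolation: median male age at first marriage in 1995.
# ASSUMED endpoints (not read out in the video): 26.1 in 1990, 26.8 in 2000.
male_1995 = linear_estimate(1990, 26.1, 2000, 26.8, 1995)

# Extrapolation: percentage of pregnant smokers in 2006.
# ASSUMED final two data points (illustrative only): 10.5% in 2002, 10.0% in 2004.
smokers_2006 = linear_estimate(2002, 10.5, 2004, 10.0, 2006)

print(round(male_1995, 2))
print(round(smokers_2006, 2))
```

The same formula serves both cases. The only difference is whether the year being estimated falls between the two known data points or beyond them, which is why an extrapolated value depends so heavily on which two points are picked.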
