Math Problems to Save for Later: Statistics

Sunday, August 17, 2025

Tiny Pop Stars

This is why we need to learn about accurate scaling of graphs.

Friday, September 4, 2015

Some statistics are useful, and some are not. I don't know if this one is useful, but it's kind of fun:

The average book has 64,500 words.

I don't know exactly what I'd do with this in class, but it seems like an interesting opportunity to discuss why we use different statistical measures to describe data, and to get kids thinking about which measures feel useful in which situations. Is mean really the right measure of central tendency for this statistic? Is comparing a measure of central tendency even useful? Why do we care?

There also might be something (less exciting, more practice-y) about using all the stats for each book to work backwards to calculate the standard deviation. That also raises the question of whether mean and standard deviation are really the right descriptors. I am very curious whether book length (in words) is normally distributed. It might depend on what genres of books you include (children's books seem like they have the potential to skew the data).
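Here is a minimal sketch of how kids might compare the descriptors. The word counts are made up for illustration (they are not the real data behind the 64,500 figure); I just threw in a couple of very short children's books to see what they do to the mean.

```python
import statistics

# Hypothetical word counts (NOT real data): mostly novels plus two short children's books.
word_counts = [500, 1_000, 45_000, 60_000, 70_000, 80_000, 90_000, 100_000, 120_000]

mean_words = statistics.mean(word_counts)      # pulled down by the two tiny books
median_words = statistics.median(word_counts)  # middle value, less sensitive to them
sd_words = statistics.stdev(word_counts)       # sample standard deviation

print(f"mean:   {mean_words:,.0f}")
print(f"median: {median_words:,.0f}")
print(f"stdev:  {sd_words:,.0f}")
```

With a list like this the mean lands below the median, which is one concrete way to start the "is mean really the right measure?" conversation.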

What other statistical questions might kids generate? How could they use info about the books they're reading in English class to do some further exploration?

Counting Trees

The "Counting Trees" Formative Assessment Lesson from MARS/Shell Centre is one of my favorites. I think it's open ended in an interesting way and I love that kids need to be okay with not knowing the exact right answer.
The other day I heard this story on NPR about how many trees there are in the entire world.

Of course my first thought was the FAL and how I would use this story in conjunction with that lesson. I wonder how I would structure it. Would it be a hook to the lesson or a "beyond"? What parts would I have kids listen to or read? I haven't read the Nature article yet, so I wonder what's in there.

I loved that the story went through a whole process of asking a question, making a conjecture, revising the conjecture, and so on--exactly the kind of thinking process that I want to highlight for kids.

It also raises some interesting questions about rate (how long it would take to plant 1 billion trees), density (if so much forest has been depleted, what did forests used to look like?), and large numbers (what does 3 trillion trees even mean?!).
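A rough sketch of the rate and large-number questions. The planting rate (one tree per second, nonstop) and the world population figure (7.3 billion) are my own assumptions for illustration, not numbers from the story.

```python
SECONDS_PER_YEAR = 60 * 60 * 24 * 365

def years_to_plant(trees, trees_per_second):
    """Years needed to plant `trees` at a constant rate, with no breaks."""
    return trees / trees_per_second / SECONDS_PER_YEAR

# Assumption: one tree planted every second, around the clock.
years_for_a_billion = years_to_plant(1_000_000_000, trees_per_second=1)

# Assumption: roughly 7.3 billion people on Earth.
trees_per_person = 3_000_000_000_000 / 7_300_000_000

print(f"{years_for_a_billion:.1f} years to plant 1 billion trees")
print(f"{trees_per_person:.0f} trees per person")
```

Kids could swap in their own planting rate and argue about whether it is realistic.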

Saturday, July 18, 2015

Could you be an Olympian?

http://www.theguardian.com/sport/datablog/2012/aug/07/olympics-2012-athletes-age-weight-height

The Guardian has stats on height, weight, and age across athletes in the 2012 Olympics. Some questions I would want kids to ask/investigate:

For your height/weight/age, what sport are you most likely to play? That is, for what sport are you most "normal"?

I don't know if I'd include age; it depends on how old my kids are. If I mostly had 14-year-olds, it would be tough because most athletes are older. But it would probably be fine for 17-year-olds.
I might also leave out weight because that's a touchy subject, but maybe I'd give kids the choice.
It would be interesting to see how kids combined all three variables.

If you played ___, what percentile would you be in for height/weight/age? (I think this would require the assumption that the variables are normally distributed)
Some kind of comparison of shape, center, spread across sports. Which sport has the longest window of time where you can reasonably play (we could discuss whether this referred to range or standard deviation)? For which sports is mean a better measure of center and vice versa?
For which sport are athletes the most different from the general American population?
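A possible sketch of the percentile and "most normal" questions, assuming (as noted above) that height within each sport is roughly normally distributed. The means and standard deviations here are placeholders I made up, not values from the Guardian spreadsheet.

```python
from statistics import NormalDist

# Hypothetical (mean, standard deviation) of athlete height in cm for each sport.
sports = {
    "basketball": (195.0, 9.0),
    "gymnastics": (162.0, 8.0),
    "rowing": (185.0, 8.0),
}

def percentile(value, mean, sd):
    """Percentile of `value` under a normal model with the given mean and sd."""
    return 100 * NormalDist(mean, sd).cdf(value)

def most_typical_sport(value, sports):
    """Sport where `value` is the fewest standard deviations from the mean."""
    return min(sports, key=lambda s: abs(value - sports[s][0]) / sports[s][1])

my_height = 170.0
best_fit = most_typical_sport(my_height, sports)
print(best_fit, round(percentile(my_height, *sports[best_fit]), 1))
```

Measuring "most normal" by distance in standard deviations is only one choice; kids might reasonably argue for something else, especially once they try to combine all three variables.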

What's best is that all the data is available in a spreadsheet, so you can do whatever you want with it.

Tuesday, September 23, 2014

Estimation 180 and Confidence Intervals

I love Andrew Stadel's Estimation 180 collection and sequence, for all the reasons why lots of people have been praising it. Estimation is an undervalued skill! Kids are terrible with units of measurement! It's super-accessible across multiple grade-levels! It's quick! It's fun! Etc!

One of the features is that Stadel always first asks for a guess that is too low and a guess that is too high, before asking for the final estimate. I initially liked this because of how it increases accessibility for students and also trains them, in the long-term, to get more specific and accurate with their estimates and reasoning. Now I found a new reason why I like this practice: preparing kids for confidence intervals. Confidence intervals are just a more systematic way of making estimations, and really what the confidence interval is saying is "here is my range of guesses that are neither too high nor too low." I imagine that when having kids share too high/low guesses for Estimation 180, there will sometimes be guesses that are actually correct, especially as students get better at their estimation and try to get "just a little bit" off in their too high/low values. That, in a lot of ways, is like a confidence interval! A 95% CI is saying that it is possible--5% of the time--that our interval doesn't capture the true population parameter (mean, proportion, etc). The too high/low guesses for whatever population parameter fall in those outer 2.5% tails. They're possibly correct, but it's highly unlikely. You'd be really surprised if a value from those tails turned out to be the true population parameter.

Somewhere in here there has to be a lesson/activity where we collect estimations from a bunch of people and find out that the mean estimation is actually pretty close to the actual value. This is true for people guessing about the number of jelly beans in a jar, and so on. Can we use estimations to set up a confidence interval for the real number of jelly beans in a jar? Is that legitimate statistics?
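To make the question concrete, here is a sketch of a 95% interval built from a class's guesses. The guesses are invented, and I'm using the normal (z) approximation, which is rough for a sample this small. Note what it actually estimates: the mean guess. It only says something about the real jelly bean count if guesses are unbiased, which is exactly the "is that legitimate?" part.

```python
from math import sqrt
from statistics import NormalDist, mean, stdev

# Hypothetical guesses for the number of jelly beans in a jar.
guesses = [620, 750, 810, 900, 540, 1000, 700, 850, 780, 660, 950, 720]

def mean_confidence_interval(data, confidence=0.95):
    """Normal-approximation confidence interval for the mean of `data`."""
    z = NormalDist().inv_cdf(0.5 + confidence / 2)  # about 1.96 for 95%
    center = mean(data)
    margin = z * stdev(data) / sqrt(len(data))
    return center - margin, center + margin

low, high = mean_confidence_interval(guesses)
print(f"95% interval for the mean guess: {low:.0f} to {high:.0f}")
```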

Wednesday, July 2, 2014

Guess My Weight

I've always been interested in those people at amusement parks and fairs who guess people's weight or age or birthday month or whatever. One interesting question from that situation is which variable you should have the person guess for the best chance of winning. On one hand, birthday month feels nearly impossible for someone to guess by just looking at you, but the guesser does have a 1/12 chance of being correct. I can't remember the usual ranges for age and weight that let the guesser win, but it would also be interesting to think about how the amusement park sets those and if they're fair. To complicate things even more, how do social factors change what the guesser guesses (e.g. does the guesser under-guess age and weight for older people and women respectively, because that's what our society says is better?)
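A small sketch of the birthday month case. Since I can't remember the real ranges, the "window" here is a made-up rule (the guesser wins if they land within that many months of the true month, wrapping around the year), and it assumes the guesser is guessing blind and birthdays are spread evenly across months.

```python
from fractions import Fraction

def month_guess_win_chance(window):
    """Chance a blind guess lands within `window` months of the true month."""
    winning_months = min(12, 2 * window + 1)  # the true month plus `window` on each side
    return Fraction(winning_months, 12)

for window in range(4):
    print(window, month_guess_win_chance(window))
```

Kids could then ask what window would make the game "fair," and whether the same idea carries over to age and weight, where the guesser is definitely not guessing blind.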

This problem is super-interesting:
http://nrich.maths.org/6957

I like that there is a lot of open-endedness to the solution and "correct answer." Unfortunately I am not sure what unit it might fall in because it involves a ton of different possibilities. Just a fun math problem? That's okay with me too!

--------------------

Update: I tried this task with a group of approximately 80 secondary math teachers (6th-12th grade). My version was slightly modified to (1) give it a little bit more of a hook and make it look pretty; and (2) obscure the task name so no one could google it... Teachers are sneakier than students. They shouldn't get to do what I did and just go straight to the sample student solutions!

I haven't looked at feedback from the session yet, but I personally enjoyed listening to what people came up with. There was a great deal of disagreement in the room about who should "win" and lots of different takes on a scoring system. Unfortunately we didn't have as much time as I wanted, so I didn't get as much of an opportunity as I would have liked to push on some of the justification aspects, especially about why a scoring system is fair or ideal.

A lot came out about mean and standard deviation that I also didn't get a chance to make sense of. I wanted to ask people: mean and standard deviation of what? What's their sample? Is mean or median a better measure of center? Part of that last question might rely on an assumption about guessing whole number weights. What happens when this task changes from discrete to continuous mathematics? I doubt the answer changes, but the questions you ask will definitely change.

Saturday, May 31, 2014

Baby Name Distributions

http://fivethirtyeight.com/features/how-to-tell-someones-age-when-all-you-know-is-her-name/

I am fascinated by baby name trends, but I don't know if students are. What I am most interested in with this is the intuition students will have about names, and how that helps set them up to understand distributions, especially bimodal distributions. I know that I have fairly set ideas about what names come from what eras (I hear "Agnes" and I picture an elderly woman; I hear "Kaylee" and I picture a young girl), so it helps with thinking about when median may or may not be the best measure of center. Also, there's a nice graph with interquartile ranges, which demonstrates why we care about the interquartile range. Finally, there's so much baby name data out there that kids could definitely research and construct their own graphs based on the questions that (I expect) will come up from looking at all this.

Saturday, May 10, 2014

Who's Lurking behind These?

42 strange things that correlate:
http://tylervigen.com/

Obviously, it's interesting fuel for the "correlation is not causation" discussion, particularly because it's interesting to think about what the lurking or confounding variables might be.

What I also think is interesting is that some of the graphs seem to follow each other closely, but don't really have that high of a correlation coefficient. For example, Number of people who drowned by falling into a swimming-pool vs. Number of films Nicolas Cage appeared in. Most of the graphs on the site have an r above .9, which is good, but I think it would be interesting for kids to talk about why the curves on that graph seem to rise and fall together even though the correlation coefficient is not that convincing as evidence of a statistical correlation.
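If kids want to check a correlation coefficient themselves, here is a minimal sketch of Pearson's r computed by hand. The two series are invented (they are not the drowning or Nicolas Cage data); they were picked so that they loosely rise and fall together without lining up perfectly.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    # Sum of products of deviations, and each series' root sum of squared deviations.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)

# Hypothetical yearly counts for two unrelated things.
series_a = [1, 2, 3, 2, 4, 3, 5]
series_b = [2, 1, 4, 3, 3, 5, 4]

print(round(pearson_r(series_a, series_b), 2))
```

Plotting both series on the same axes and then comparing the picture to the computed r could be a nice way into the "looks related vs. is correlated" discussion.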

Also cool: if you click on one of the variables, you can see how it correlates with a whole mess of other variables. This site could clearly be a huge time suck for stats teachers trying to find interesting data to work from.

Friday, March 14, 2014

Mario's a Baller

http://www.supercompressor.com/tech/13-things-you-probably-didn-t-know-about-nintendo

See fact #9: Mario has a 27’ vertical leap.

This seems like a fun addition to "How High Can Your Teacher Jump?" or any proportional reasoning task.

What would we look like if we measured human heights in pixels? What would that mean for how tall Mario is compared to a human? How much bigger is Big Mario vs. Little (pre-mushroom) Mario? How big would YOU be if you ate a mushroom (or alternatively, if you're full size now, how tall would you be after running into a goomba)?

Do kids even recognize pixelated Mario anymore these days?

Tuesday, February 11, 2014

Lies, Damned Lies, Beautiful Lies

https://visualisingadvocacy.org/blog/disinformation-visualization-how-lie-datavis

I am in love with this article. Obviously data visualization can be just as persuasive as data provided in different ways (raw data vs. percentage vs. percent increase, etc.), but I like how this article specifically calls out the visual persuasion tactics.

Sunday, February 9, 2014

When are Babies Born?

Are births evenly distributed across time of day, day of the week, and time of year?

http://journals.lww.com/greenjournal/Fulltext/2004/04000/Timing_of_Birth_After_Spontaneous_Onset_of_Labor.8.aspx

The raw data is in the tables. I think kids would be interested in this question, and the results are kind of surprising.

Wednesday, February 5, 2014

Population Density Redistribution

http://www.washingtonpost.com/blogs/the-fix/wp/2014/01/30/what-the-u-s-map-would-look-like-if-population-matched-state-size/

I wonder what else you could have kids redistribute.

Monday, December 9, 2013

Headlines from a Mathematically Illiterate World

http://mathwithbaddrawings.com/2013/12/02/headlines-from-a-mathematically-literate-world/

The longer I teach, the more I think that the math that feels most important for students to take away from my class is about learning to read, interpret, and critically evaluate the logical/illogical statements that float around every day. I do actively enjoy pure math kinds of things, but if it came down to a choice of residue, I'd give up a robust understanding of derivatives for my kids to leave being able to read a graph in the newspaper and evaluate the reasonableness of the latest study's claim.

I think it would be really fun for kids to find ridiculous statements in news articles and correct them like this. It would be interesting to develop that critical eye for poorly worded statements, both from a language and a mathematical perspective.

Love, and Love Lost

Visualizations of Love:
http://love.seebytouch.com/#LetMeShowYou

Clearly some are more mathy than others, but I like the variation in the types of visualizations.

-------------------

And quantifications of love no more:
http://quantifiedbreakup.tumblr.com/

Sometimes when kids tell me that they're in a bad mood or not feeling well, I respond that doing math problems always makes you feel better. Looks like I wasn't making it up. Doing math as therapy is real!

Wednesday, November 20, 2013

Where's Waldo?

If this isn't real-world math, I don't know what is.

http://www.slate.com/articles/arts/culturebox/2013/11/where_s_waldo_a_new_strategy_for_locating_the_missing_man_in_martin_hanford.html

I like what you can do with this around probability and area. It also makes me think about what kids hear/understand when they use and read the word "random." Do they think that Waldo is placed randomly? Why or why not? What would the pages look like if he were to be placed randomly?

I also like that it takes something that looks like it has little order, and uses math to help you see something you wouldn't otherwise be able to notice.

Friday, September 27, 2013

Poor Pete Tries Data Visualization

http://wtfviz.net/

The title may not be school-appropriate, but the awful data representations are a goldmine of "What's wrong with this?" problems.

What Number Are You?

http://www.bbc.co.uk/news/world-15391515

According to this website, of all the people living on Earth, I was the 4,611,347,584th person to be born. Can you figure out how old I am???

There are lots of other interesting mathematical things about this. First, just the fact that the population of the Earth increased by about 2.5 billion people in my lifetime--more than half of what it was when I was born--is just staggering. I know that large numbers are hard for kids to conceptualize (adults too! me (non-kid, non-adult) too!), but there's something in here that makes you say WOAH. Even if you put in a kid's birthday (I picked a random 15 year old), there were still about 1 billion people born in their lifetime.

Of course there's also interesting stuff with exponential growth, how we would calculate this number, and so on. I'm also interested in the statistic of how many people have been born on Earth since the beginning of time. This website gives an interesting summary of that calculation, including this video:

The World Bank has a shorter video on the same topic:

I like the stat in this video that 7% of all the people who have ever lived are alive today. Holy smokes!

Tuesday, September 17, 2013

Reminiscing about the Good Old Days of Gas Prices

http://consumerist.com/2013/09/17/you-will-probably-never-pay-less-than-3-for-a-gallon-of-gas-ever-again/

The Consumerist says that we will never again in our lifetimes pay less than $3/gallon for gas. When I first got my driver's license, I remember paying less than a dollar! I also remember when gas prices started to go up and I swore I would never pay more than $2/gallon. (Maybe the math problem in here is figuring out how old I am...). The Consumerist's logic seems to be that because today was the 1000th consecutive day that average gas prices are over $3, we can safely say that we've reached the point of no return.

Here are my mathematical questions about this situation:

Are you convinced the 1000th day is "enough" of a pattern to say that we're never going back?
Would you be as convinced if the 1000 days were not consecutive? What kind of pattern of consecutive days would be "enough" for you?
Does it matter that this is about the national average of gas prices?
Does it matter what that average price is for you to be convinced? For example, if the national average price of gas is $3.89, I'll probably be convinced that <$3 is unlikely. But if the national average is $3.02, I feel confident in saying that I won't have too much trouble driving around and finding a gas station where I can pay $2.99. --> This seems like a perfect opportunity to talk about the importance of standard deviation!
How could looking at a graph of gas prices help add to an argument? I feel like there's a very powerful visual argument of showing the $3 line and looking at when the graph stops dropping below that.
If you wanted to convince someone with a graph, what data would you show?

National averages only, or the range?
How often would you want to chart data points? Daily? Weekly? Monthly?
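The standard deviation question above can be sketched like this. I'm assuming station prices are roughly normally distributed around the national average, and the $0.20 standard deviation is a number I made up; real price data would be needed to do this honestly.

```python
from statistics import NormalDist

def share_of_stations_below(threshold, national_mean, sd):
    """Estimated share of stations charging less than `threshold` under a normal model."""
    return NormalDist(national_mean, sd).cdf(threshold)

ASSUMED_SD = 0.20  # hypothetical spread of station prices, in dollars

for average in (3.89, 3.02):
    share = share_of_stations_below(3.00, average, ASSUMED_SD)
    print(f"average ${average:.2f}: {share:.4%} of stations under $3.00")
```

Under these assumptions, an average of $3.02 still leaves close to half of stations under $3, while an average of $3.89 leaves essentially none, which matches the gut feeling in the question.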

Thursday, March 14, 2013

Percents of Percents

Percent increase and percent decrease still confuse the heck out of me, no matter how many problems I do or how many times I teach the topic. The language gets

Has the high school graduation rate of black males increased by 6.6% or 5.1%?
Has their high school dropout rate decreased by 37.9% or 5.8%?
Has their college enrollment rate increased by 32.7% or 1.7%?
Has their incarceration rate decreased by 25.3% or 2.1%?

Two bigger questions:

Would the positive changes for black males highlighted by this list of statistics still be as powerful if they had cited the change in percentage rather than the percent change (of the percent)? Either way, they still show increases where we would want there to be increases (high school graduation rate, college enrollment, college "by the numbers") and decreases where we would want there to be decreases (dropout rate, "incarcerated").
In what other ways could this information be presented (pure numbers, different types of graphs, etc.) that would make them more or less powerful? How do you think the author chose this table?

An underlying question:

What is the difference between "net increase" and "percent increase"? Is there a difference? How does this very, very subtle difference change how we present and interpret statistics about changes that are measured in and by percents? The Wikipedia article on percents has some interesting things to say, including how the use of the term "percentage points" can help clear up confusion.
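A tiny sketch of the two calculations side by side. The rates are hypothetical (a rate going from 10% to 12%), not the figures from the table.

```python
def percentage_point_change(old_rate, new_rate):
    """Difference between two rates that are already expressed in percent."""
    return new_rate - old_rate

def percent_change(old_rate, new_rate):
    """Relative change of the rate itself, as a percent of the old rate."""
    return 100 * (new_rate - old_rate) / old_rate

# Hypothetical example: a rate that moves from 10% to 12%.
old_rate, new_rate = 10.0, 12.0
print(percentage_point_change(old_rate, new_rate), "percentage points")
print(percent_change(old_rate, new_rate), "percent")
```

Same data, two very different-sounding headlines: "up 2 percentage points" versus "up 20 percent."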

As teachers, especially teachers of English Language Learners, how do we support students in navigating this very tricky language? It seems particularly important/frustrating given that percent increase/decrease problems are a not insignificant part of the California High School Exit Exam. (Really, no concept is insignificant when one or two questions could make the difference in whether you earn a high school diploma.)

Monday, November 29, 2010

Stories vs. Statistics

http://opinionator.blogs.nytimes.com/2010/10/24/stories-vs-statistics/

For one day when I get to do my interdisciplinary unit with an English class about how to persuade people using numbers. Brings up the good point of what do you want to sucker people in with? Will they be drawn in more by data or by a person's story?