Class Notes: Thursday 9/5/02
- Check new material on student pages (from Class Home Page)
- Give introduction to Excel and Simple graphics (from Computing Tips)
- Review Histograms (from last meeting)
- Excel construction of histograms (from Computing Tips)
Analysis of Buffalo Snowfall data...
Background: City of Buffalo, N.Y., known for heavy snows
Data: Time series of annual accumulated snowfall (inches)
Recall Excel default histogram constructed in: Toy Example Excel File
Comments:
- Excel chose binwidth = ~14
- Only 8 bins chosen, binwidth too large?
- Too few bins for "serious structure"?
- Note one year unusually small
Binwidth deliberately "too small"
- Tried binwidth = 3
- Requires many bins to include all the data
- Histogram looks "very bumpy"
- Hard to see "large scale features of distribution"
Binwidth "clearly too big"
- Tried binwidth = 30
- 10 times as big as above
- Averages taken over too big a range
- Obscures potentially interesting population structure
Binwidth "about right"???
- Tried binwidth = 10
- "In between" the two above?
- Large enough to remove "sampling artifacts"?
- Small enough to suggest 3 modes?
- Interesting question: are modes "important underlying structure"???
Again highlights important issue for histograms: choice of binwidth
Recommendation: try several binwidths, including both too big and too small
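The binwidth comparison above can also be tried outside Excel. The following is a minimal Python sketch, not the Excel procedure used in class. The function names histogram_counts and print_histogram are made up for illustration, and the data are randomly generated stand-ins, NOT the Buffalo snowfall series. It prints a text histogram for each of the binwidths 3, 10 and 30 tried above.

```python
import math
import random


def histogram_counts(data, binwidth):
    """Count observations in bins [start + k*binwidth, start + (k+1)*binwidth).

    The first bin edge is the largest multiple of binwidth at or below min(data).
    Returns (start, counts).
    """
    start = math.floor(min(data) / binwidth) * binwidth
    # Number of bins needed so that the maximum falls in the last bin.
    nbins = int((max(data) - start) // binwidth) + 1
    counts = [0] * nbins
    for x in data:
        counts[int((x - start) // binwidth)] += 1
    return start, counts


def print_histogram(data, binwidth):
    """Print one row of '#' marks per bin."""
    start, counts = histogram_counts(data, binwidth)
    print(f"binwidth = {binwidth} ({len(counts)} bins)")
    for k, count in enumerate(counts):
        left = start + k * binwidth
        print(f"{left:6.1f} to {left + binwidth:6.1f} | {'#' * count}")


# Hypothetical stand-in data (assumption): 60 simulated yearly totals in inches.
random.seed(0)
snowfall = [random.gauss(80, 20) for _ in range(60)]

# Same three binwidths as in the notes: too small, about right, too big.
for bw in (3, 10, 30):
    print_histogram(snowfall, bw)
    print()
```

Running the loop shows the pattern discussed above: the small binwidth needs many bins and looks bumpy, while the large one averages over a wide range and leaves only a few bars.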
Third Class Assignment: Explore a new data set with histograms
- Start with data in spreadsheet StudyHabitsIndexData.xls
* A number that attempts to quantify "quality of study habits"
* Measured for 18 females and 20 males
* How do the populations compare???
- Address this question by an Excel analysis based on histograms
* Just try something, then we compare and discuss
- Display your results and conclusions on a new web page
* Linked to your home page
* You select format and style of presentation
* But insert some graphics generated by Excel
- Some graphics ideas to consider:
* Look at two separate histos, or some "combined version"???
* I.e. single graphic showing both "together" (experiment with Excel)
* Answers depend on binwidth, how to effectively display several?
- Some additional questions (answer on your web page, w/ discussion):
* Which group "looks better on average"?
* Can you "quantify this idea"? (e.g. give numerical measures)
* Which group "looks more spread" (i.e. has "greater variation")?
* Quantify this idea by using the STDEV function in Excel
* Suppose you are an employer who must hire
somebody from one of the two groups.
Would you hire a female or a male, if:
+ You are forced to choose "at random"
+ You can carefully select from a large group of each type
Why?
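For the "quantify this idea" questions above, one possible pair of numerical measures is the mean (for "average") and the sample standard deviation (for "spread"). The sketch below is in Python rather than Excel. statistics.stdev divides by n - 1, as Excel's STDEV function does. The function name summarize and the two score lists are hypothetical, NOT the values in StudyHabitsIndexData.xls.

```python
from statistics import mean, stdev


def summarize(scores):
    """Return the count, mean, and sample standard deviation (n - 1 divisor)."""
    return {"n": len(scores), "mean": mean(scores), "stdev": stdev(scores)}


# Hypothetical index values (assumption), only to show the calculation.
group_a = [52, 61, 58, 70, 66, 49, 73, 64]
group_b = [55, 57, 60, 62, 59, 61, 58, 63]

for name, scores in (("group A", group_a), ("group B", group_b)):
    s = summarize(scores)
    print(f"{name}: n = {s['n']}, mean = {s['mean']:.2f}, stdev = {s['stdev']:.2f}")
```

These two numbers are a starting point only. The histograms may show features (such as several modes or an unusual value) that a mean and standard deviation do not capture.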