Basic Statistics
Terms of Endearment
pop'n vs. sample, parameter (true value) vs. statistic (estimate)
mean, median, mode
standard deviation vs. coefficient of variation
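A minimal sketch of the terms above, using Python's standard `statistics` module. The plot counts are made-up numbers for illustration only, not course data.

```python
import statistics

# Hypothetical sample: stems counted on 8 plots (made-up numbers).
stems = [4, 6, 5, 7, 5, 3, 5, 5]

mean = statistics.mean(stems)      # arithmetic average
median = statistics.median(stems)  # middle value of the sorted data
mode = statistics.mode(stems)      # most frequent value
sd = statistics.stdev(stems)       # sample standard deviation (n - 1 divisor)
cv = sd / mean * 100               # coefficient of variation, as % of the mean

print(f"mean={mean}, median={median}, mode={mode}")
print(f"SD={sd:.2f} (same units as the data), CV={cv:.1f}% (unitless)")
```

SD is in the units of the data; CV rescales it by the mean, so variation can be compared between data sets measured in different units.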
Middle Aged Spread (or, does this graph make me look fat?)
shape (distribution)
sample means
normal
what normal "looks like"
what exactly is normal?
if nature is not normal ...
use properties of normal
centre
typically the mean
THE most important statistic
spread
variation
... linked to sampling error & confidence
SD & CV - for pop'n
SE & SE% - for the sample
pop'n data & sample data
likely distr'bns
skewed
bimodal
strange
helps to understand the pop'n
sample data distr'bn will be similar
It's Magic
SD & CV
SE & SE%
the "magic table"
column 1: estimates of pop'n; too big? ... "live with it"
column 2: indicates "uncertainty" of the sample; too big? ... do more plots
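The two columns of the "magic table" can be sketched as below. The function name `magic_table` and the four plot values are hypothetical; the formulas are the usual SE = SD / sqrt(n), with CV and SE% expressed as a percent of the mean.

```python
import math
import statistics

def magic_table(values):
    """Return SD, CV%, SE and SE% for a list of plot values."""
    n = len(values)
    mean = statistics.mean(values)
    sd = statistics.stdev(values)   # column 1: describes the pop'n
    se = sd / math.sqrt(n)          # column 2: describes the sample
    return {
        "n": n,
        "mean": mean,
        "SD": sd,
        "CV%": sd / mean * 100,
        "SE": se,
        "SE%": se / mean * 100,
    }

# Made-up plot values, for illustration only.
table = magic_table([10, 12, 8, 10])
for name, value in table.items():
    print(f"{name:>4}: {value:.2f}")
```

Only column 2 contains n, which is why more plots shrink SE and SE% but leave SD and CV roughly where they were.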
Final vs. True
What's the diff?
e.g. consider a small sample size ... likely the real answer?
what happens if I add a couple more samples?
what if I have a very large sample size?
what if I add a couple more samples?
technique? eqpt? sample selection?
What happens as more samples taken
mean
SD (CV)
SE
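A rough simulation of what happens as more samples are taken. The population here (roughly normal, mean 1000, SD 300) is an assumption chosen for illustration, and the seed is fixed only so the run is repeatable.

```python
import math
import random
import statistics

rng = random.Random(1)  # fixed seed so the run is repeatable

# Assumed population: roughly normal, true mean 1000, true SD 300.
TRUE_MEAN, TRUE_SD = 1000, 300

results = {}
for n in (5, 20, 80, 320):
    sample = [rng.gauss(TRUE_MEAN, TRUE_SD) for _ in range(n)]
    mean = statistics.mean(sample)
    sd = statistics.stdev(sample)
    se = sd / math.sqrt(n)
    results[n] = (mean, sd, se)
    print(f"n={n:>3}  mean={mean:7.1f}  SD={sd:6.1f}  "
          f"CV={sd / mean * 100:5.1f}%  SE={se:6.1f}")
```

The mean and SD (CV) should wander toward the population values and then settle, while SE keeps shrinking, roughly halving each time n is quadrupled.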
Bias
UNbiased def'n: long run average from repeated samples = real (pop'n) value
biased: sampling process results in final <> true
reasons for bias
measures (technique, eqpt)
sample selection
calculation/ improper weighting
How to ensure Final ~ True
Accuracy & Precision
Accuracy
how close sample mean is to True
stats give NO clue
"check cruise"
Precision
how close sample mean is to Final
this IS stats (SE & E)
accurate but NOT precise
precise but NOT accurate
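The two cases above can be illustrated with made-up numbers. The "true" value of 1000 and both samples are hypothetical; in practice the true value is unknown, which is why the stats alone give no clue about accuracy.

```python
import math
import statistics

TRUE_VALUE = 1000  # hypothetical; never known in a real survey

# Made-up samples for illustration only.
samples = {
    "accurate but NOT precise": [700, 1300, 900, 1100, 1000],
    "precise but NOT accurate": [1190, 1210, 1200, 1205, 1195],
}

summary = {}
for label, values in samples.items():
    mean = statistics.mean(values)
    se = statistics.stdev(values) / math.sqrt(len(values))
    summary[label] = (mean, se)
    print(f"{label}: mean={mean}, SE={se:.1f}, "
          f"distance from true={abs(mean - TRUE_VALUE)}")
```

The second sample has the smaller SE, so it "looks" better on paper, yet its mean is further from the true value. Only an independent check (e.g. a check cruise) would reveal that.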
I'll Have a Guinness
± 1SE ~68%, ± 2 SE ~ 95%
magic table - add 3rd column
beer??
small sample size? ... ± 2 SE ~ 95% NO longer true
Student's t-table ... Guinness
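A short sketch of why "± 2 SE ~ 95%" breaks down for small samples. The values are two-sided 95% critical values from a standard Student's t-table; only a handful of rows are included, and the dictionary name `T_95` is just a label used here.

```python
# Two-sided 95% Student's t critical values from a standard t-table,
# keyed by degrees of freedom (df = n - 1). Only a few rows are shown.
T_95 = {2: 4.303, 4: 2.776, 9: 2.262, 19: 2.093, 29: 2.045, 60: 2.000}

for df, t in T_95.items():
    n = df + 1
    extra = (t / 2 - 1) * 100  # how much wider than the "+/- 2 SE" shortcut
    print(f"n={n:>2}  t={t:.3f}  interval is {extra:4.1f}% wider than +/- 2 SE")
```

With 5 plots the 95% interval is ± 2.776 SE rather than ± 2 SE; by about 30 plots the shortcut is close enough for most purposes.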
What does ±15% @ 95% mean??
2 parts to statement of certainty
sampling error (range)
confidence level
Sampling Error vs. Confidence Interval
sampling error = std error scaled to a stated confidence level (e = t × SE)
confidence interval
a) sampling error in units for silv surveys (can be %)
b) expressed as a range
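A sketch of the two ways of stating the same thing: sampling error (in units or %) and the confidence interval as a range. The function name `sampling_error` and the plot values are hypothetical; the t-value is passed in and must match the sample's n - 1 and the chosen confidence level.

```python
import math
import statistics

def sampling_error(values, t):
    """Return (mean, e in units, e as % of mean, (low, high))."""
    n = len(values)
    mean = statistics.mean(values)
    se = statistics.stdev(values) / math.sqrt(n)
    e = t * se  # sampling error at the confidence level implied by t
    return mean, e, e / mean * 100, (mean - e, mean + e)

# Made-up data: 4 plots, so df = 3; t = 3.182 is the two-sided 95% value.
mean, e, e_pct, (low, high) = sampling_error([10, 12, 8, 10], t=3.182)
print(f"mean = {mean} ± {e:.2f} (± {e_pct:.1f}%) @ 95%")
print(f"confidence interval: {low:.2f} to {high:.2f}")
```

A statement such as "±15% @ 95%" is the `e_pct` and the confidence level together; quoting one without the other is incomplete.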
Assignment
ave wt of 2nd yr forestry students
calc: mean, sd, cv, se and e (units & %)
values on board
hand-in lab when done
Stop Already!
when to stop sampling
becomes inefficient, extra plots don't help (see graph)
becomes too expensive, not worth $$/effort to do more
you already answered the question to your satisfaction
"you met stats" - met necessary precision (not accuracy)
OR reached the required maximum plot density (e.g. 1.5 plots/ ha)
Rule of Thumb for Silv Surveys
80% of strata need 10 or fewer plots
15% of strata need 11-25 plots
5% may need >25 plots
Meeting Stats
sampling error < threshold
eq'n for sampling error ...
rearrange for n
but
what about t-value?
process
use t=2 (or existing t if sample didn't meet stats)
get a value for n
revise t-value, and calculate again
then add 2 plots or 10% for good measure (remember CV is an estimate)
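The process above can be sketched as follows. Rearranging e% = t × CV / sqrt(n) gives n = (t × CV / e%)². Several things here are assumptions rather than part of the notes: the names `t_95` and `plots_needed`, the use of two-sided 95% t-values, the approximation t = 2.0 beyond 30 degrees of freedom, and reading "2 or 10%" as "whichever is larger".

```python
import math

# Two-sided 95% Student's t values for df = 1..30 (standard t-table).
T_95 = [12.706, 4.303, 3.182, 2.776, 2.571, 2.447, 2.365, 2.306, 2.262,
        2.228, 2.201, 2.179, 2.160, 2.145, 2.131, 2.120, 2.110, 2.101,
        2.093, 2.086, 2.080, 2.074, 2.069, 2.064, 2.060, 2.056, 2.052,
        2.048, 2.045, 2.042]

def t_95(n):
    """t-value for a sample of n plots (df = n - 1)."""
    df = n - 1
    if df < 1:
        raise ValueError("need at least 2 plots")
    # Assumption: beyond the table, t is approximated as 2.0.
    return T_95[df - 1] if df <= 30 else 2.0

def plots_needed(cv_pct, e_pct, t_start=2.0):
    """Estimate plots needed to reach sampling error e_pct given CV."""
    # Pass 1: use t = 2 (or the existing t if the sample didn't meet stats).
    n1 = max(2, math.ceil((t_start * cv_pct / e_pct) ** 2))
    # Pass 2: revise t for that n and calculate again.
    n2 = math.ceil((t_95(n1) * cv_pct / e_pct) ** 2)
    # Padding: assumed to be the larger of 2 plots or 10%.
    return n2 + max(2, math.ceil(n2 / 10))

print(plots_needed(cv_pct=40, e_pct=15))
```

The padding matters because the CV going into the formula is itself only an estimate from the plots done so far.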
Break It Up!
What
divide pop'n into homogeneous groups
each stratum is (typically) less variable than the pop'n as a whole
each stratum is sampled independently, then ...
... the results are combined (inventory & cruising)
... the results are considered independently (silv. surveys)
typically based on geography (space), but ...
could be other parameter, e.g. species
Why
reduce variation ... leads to reduced std. error (& sampling error)
gymnasium of "basketball players"
high volume rich site vs. low volume scrub
patch of "happy" seedlings vs. patch of marginal survival
variation btwn groups "disappears" from the calculations
simply to recognize that separate sub-pop'ns exist
i.e. HB & CH types
even though they have similar stocking (or volumes ...)
... "typing" provides a better description
isolate a variable (or problem) sub-pop'n
e.g. OG patch in a 2nd growth cutblock
e.g. patch of "borderline NSR scrub" adjacent to "typical stand"
allows for different sampling methods
3P in riparian corridor
prism for rest of block
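A small illustration of how the variation between groups "disappears" once a block is stratified. The volumes are made-up numbers for a rich site and a scrub patch, not real cruise data.

```python
import statistics

# Hypothetical plot volumes (m3/ha) from two very different stands.
rich_site = [620, 580, 600, 640, 560]
scrub = [110, 90, 100, 120, 80]

pooled = rich_site + scrub  # the block treated as one pop'n
groups = (("whole block", pooled), ("rich site", rich_site), ("scrub", scrub))

sds = {}
for name, plots in groups:
    mean = statistics.mean(plots)
    sd = statistics.stdev(plots)
    sds[name] = sd
    print(f"{name:<12} mean={mean:6.1f}  SD={sd:6.1f}  CV={sd / mean * 100:5.1f}%")
```

Most of the SD in the unstratified block comes from the difference between the two stands. Within each stratum only the plot-to-plot variation remains, so the standard error (and sampling error) drops for the same number of plots.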
Silviculture Surveys & Stats
Number of Plots
5 plot minimum per stratum
maximum density of 1.5 plots/ ha - regardless of stats
Rule of Thumb for Silv Surveys
80% of strata need 10 or fewer plots
15% of strata need 11-25 plots
5% may need >25 plots
Filling in the card (FS 1138A)
Opening - "the block"
Strata - usually an SU
Area - important for max. plots
No. of Plots
mean (stems per plot & per ha)
std dev (S)
std error (Sx)
t (@90%) n-1
CI (t* Sx)
LCL (x - CI)
MSS (sph)
e (precision, <0.5 or 100 sph)
(new) n
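The statistical fields on the card can be sketched as below. The function name `card_stats` is hypothetical, and the per-hectare plot multiplier of 200 assumes 50 m² plots; adjust it to the plot size actually used. The t-value is passed in rather than computed: read it from the table specified for the card, at 90% and n - 1. The example t is only a placeholder.

```python
import math
import statistics

def card_stats(stems_per_plot, t, plot_multiplier=200):
    """Compute mean, S, Sx, CI and LCL from per-plot stem counts.

    plot_multiplier converts stems/plot to stems/ha
    (200 assumes 50 m2 plots; this is an assumption).
    """
    n = len(stems_per_plot)
    mean = statistics.mean(stems_per_plot)
    s = statistics.stdev(stems_per_plot)  # std dev (S)
    sx = s / math.sqrt(n)                 # std error (Sx)
    ci = t * sx                           # CI (t * Sx)
    return {
        "n": n,
        "mean_per_plot": mean,
        "mean_sph": mean * plot_multiplier,
        "S": s,
        "Sx": sx,
        "t": t,
        "CI": ci,
        "LCL": mean - ci,                 # LCL (x - CI)
    }

# Made-up counts from 5 plots; t = 2.132 is a placeholder for df = 4.
card = card_stats([4, 6, 5, 7, 3], t=2.132)
for field, value in card.items():
    print(f"{field:>13}: {value:.3f}")
```

Everything here is in stems per plot except `mean_sph`; MSS and e must be expressed in the same units before they are compared against the mean, CI or LCL.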
Decision Rules
1) if mean & LCL > MSS ... SR/FG
2) if mean < MSS ... NSR/NFG
3) if MSS is btwn mean and LCL ... oh crap, gotta think
a) if CI < e (0.5 or 100 sph), use mean ... SR/FG
b) if CI > e ... need more plots
calculate new n
new n - plots already done = additional plots (to a max of 1.5/ha)
re-calculate mean with extra plots
compare mean to MSS to decide (no longer look at LCL)
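The decision rules can be written out as a sketch. The function names `decide` and `final_call` are hypothetical, and several details are assumptions: new n is taken as (t × S / e)² from rearranging CI = t × S / sqrt(n), the plot cap is rounded down, and ties (CI equal to e, or mean equal to MSS) are resolved as noted in the comments. All inputs must be in the same units (all per plot, or all sph).

```python
import math

def decide(mean, ci, mss, e, s, t, n_done, area_ha):
    """Apply rules 1-3; return (decision, additional plots needed)."""
    lcl = mean - ci
    if lcl > mss:           # rule 1: mean & LCL both above MSS
        return "SR/FG", 0
    if mean < mss:          # rule 2: mean below MSS
        return "NSR/NFG", 0
    # Rule 3: MSS lies between LCL and mean.
    if ci < e:              # 3a: precise enough, so use the mean
        return "SR/FG", 0
    # 3b: need more plots (assumption: CI equal to e also lands here).
    new_n = math.ceil((t * s / e) ** 2)
    max_plots = math.floor(1.5 * area_ha)  # cap of 1.5 plots/ha
    extra = max(0, min(new_n, max_plots) - n_done)
    return "MORE PLOTS", extra

def final_call(mean, mss):
    """After the extra plots: compare mean to MSS only (LCL ignored).

    Assumption: a mean exactly equal to MSS counts as SR/FG.
    """
    return "SR/FG" if mean >= mss else "NSR/NFG"

# Made-up example in stems per plot: 5 plots done on a 10 ha stratum.
print(decide(mean=5, ci=1.51, mss=4, e=0.5, s=1.58, t=2.132,
             n_done=5, area_ha=10))
```

In the example the formula asks for far more plots than the 1.5 plots/ha cap allows, so the cap sets the number of additional plots.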