Class Notes 9/17/01
Last Time:
- Finished (?) study of heavy tails
- Began study of "Long Range Dependence"
- via correlation analysis (sensible??)
- in context of:
Course Goals:
- Explore Internet Traffic from several viewpoints
- Highlight interesting open problems
- Promote possible joint research
- Maximize understanding by all class members
Wednesday's meeting:
(Sound Bite) Introduction to Time Series Analysis
- Autocorrelation
- ARMA process
- Periodogram
- Partial Correlation
- ARIMA processes
- Long Range dependence
- Fractional ARIMA processes
Investigation II: Long Range Dependence?
Question 1:
Is it really there?
- Early conceptions: no
(renders classical queueing theory useless?)
- Current thought: yes
- Very recent work (Cleveland et al.): not important
- Motivated zooming autocorrelation view.
- Revealed "both viewpoints correct" depending on scale
- Surprised at "how dependence comes in"?
- Expected "lump of dependence" coming in from right??
Investigation II: Long Range Dependence? (cont.)
Notion of large lump on right (in autocorr.):
consistent with “periodicities”?
Caution 1: periodicities imply large lump,
but not clear that large lump implies periodicity
Caution 2: TCP has its periodicities
An aside about aggregation
A tempting idea:
"packet loss effects will kill independence at small scales"
BUT: aggregated data say something different
AN EXPLANATION: depends on where loss occurs:
- loss at link where measuring? then YES
- far away from measurement point? then NO
Recall simple view of the Internet:
Current situation:
- Backbone is "over-provisioned"
(working at 5-10% capacity)
- Loss occurs mostly at "edges"
(or between backbones)
- Thus aggregation of these could be independent
(since loss is happening at many different places)
Investigation II: Long Range Dependence? (cont.)
Observed effects due to data sparsity?
Time Series 2: For increasing seq’s of 10,000 bins
- time scales
- # obs’s / bin
- total length
Major Problem: assumes “stationarity”
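The construction above (same number of bins, increasing time scale, obs's per bin and total length) can be sketched as follows. This is a minimal sketch, not the code used for the class plots: it assumes the series are counts of packet arrival times per bin, that the bin width doubles from scale to scale, and the names sample_autocorr, bin_counts and zooming_autocorr are hypothetical. The sample autocorrelation itself assumes stationarity, which is the problem noted above.

```python
import numpy as np

def sample_autocorr(x, max_lag):
    """Sample autocorrelation of x at lags 1..max_lag (treats x as stationary)."""
    x = np.asarray(x, dtype=float)
    xc = x - x.mean()
    denom = np.dot(xc, xc)
    return np.array([np.dot(xc[:-k], xc[k:]) / denom
                     for k in range(1, max_lag + 1)])

def bin_counts(arrival_times, bin_width, n_bins):
    """Number of arrivals in each of the first n_bins bins of width bin_width."""
    edges = np.arange(n_bins + 1) * bin_width
    counts, _ = np.histogram(arrival_times, bins=edges)
    return counts

def zooming_autocorr(arrival_times, base_width, n_bins=10_000,
                     n_scales=3, max_lag=5):
    """Autocorrelations of binned counts; bin width doubles at each scale.

    n_bins is fixed, so total length and obs's per bin grow with the scale.
    """
    result = {}
    for j in range(n_scales):
        width = base_width * 2 ** j
        counts = bin_counts(arrival_times, width, n_bins)
        result[width] = sample_autocorr(counts, max_lag)
    return result

# Toy input (assumption): homogeneous Poisson arrivals, about 100 per time unit.
rng = np.random.default_rng(0)
arrivals = np.cumsum(rng.exponential(scale=0.01, size=500_000))
acfs = zooming_autocorr(arrivals, base_width=0.1)
for width, acf in acfs.items():
    print(width, np.round(acf, 3))
```

For Poisson arrivals the counts in disjoint bins are independent, so every scale should show autocorrelations near 0; real traffic traces are where the scales are expected to differ.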
Investigation II: Long Range Dependence? (cont.)
- can’t distinguish from indep. at small scales
- strong dependence at larger scales
- “vertical lifting of dependence”
- not “coming in from right”
Questions:
- looking at too narrow a lag range?
- where are “times” in zooming auto-correlation?
Investigation II: Long Range Dependence? (cont.)
Larger lag range & “time markers”
- cyan bar shows old lag boundary
- yellow bars show how time zooms
- vertical lift not completely level
- but still doesn’t “move in from right”
- instead “lifts first on left”
Investigation II: Long Range Dependence? (cont.)
Zooming Autocorrelation 4: Time invariant view
Rescale to fix yellow time bars
- expect “curve follows mountains of dependence”
- from “dependence at time scale” model
- instead see “dependence increasing with scale”
Explanation: simple cross scale calculation
Hannig, J., Marron, J. S. and Riedi, R. H. (2001) Zooming statistics: Inference across scales, Journal of the Korean Statistical Society, 30, 327-353.
Idea: Compare autocorr’n when adjacent bins are combined:
Relate autocorr’n at a given lag at the combined (coarser) scale
to autocorr’n at nearby lags at the original (finer) scale
Can show:
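The displayed formula is not preserved in these notes. What follows is a reconstruction of the standard calculation, not necessarily in the notation of the paper. The symbols are assumptions: X_i are the fine-scale bin counts, taken to be stationary with autocovariance \gamma(k) and autocorrelation \rho(k), and Y_j is the sum of two adjacent bins.

```latex
\begin{align*}
Y_j &= X_{2j-1} + X_{2j}, \\
\operatorname{Cov}(Y_j, Y_{j+k}) &= \gamma(2k-1) + 2\,\gamma(2k) + \gamma(2k+1), \qquad k \ge 1, \\
\operatorname{Var}(Y_j) &= 2\,\gamma(0) + 2\,\gamma(1), \\
\rho_Y(k) &= \frac{\rho(2k-1) + 2\,\rho(2k) + \rho(2k+1)}{2\,\bigl(1 + \rho(1)\bigr)} .
\end{align*}
```

So lag k at the coarser scale draws on lags 2k-1, 2k and 2k+1 at the finer scale, normalized by the lag one autocorrelation at the finer scale.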
Explanation (cont.)
Notes:
- when really uncorr’d, always stays at 0
- slight positive autocorr. magnified by 2
- big lift for small lag one autocorr.
- small lift for large lag one autocorr.
- small scale Poisson model is not correct
- but still OK as a fine scale approximation???
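A small numerical check of the first four notes. It is a sketch under assumptions: it uses the standard identity for summing adjacent pairs of a stationary series, rho_Y(k) = [rho(2k-1) + 2 rho(2k) + rho(2k+1)] / [2 (1 + rho(1))], and a hypothetical fine-scale autocorrelation that is flat (the same value r at every lag >= 1), chosen only for illustration. The function names are made up.

```python
def aggregated_autocorr(rho, k):
    """Autocorr. at lag k after summing adjacent pairs of bins.

    rho(m) is the fine-scale autocorr. at lag m; stationarity is assumed.
    """
    return (rho(2 * k - 1) + 2 * rho(2 * k) + rho(2 * k + 1)) / (2 * (1 + rho(1)))

def flat(r):
    """Hypothetical fine-scale autocorr.: 1 at lag 0, r at every other lag."""
    return lambda m: 1.0 if m == 0 else r

# Lag one autocorr. before and after one doubling of the bin width.
for r in (0.0, 0.01, 0.5, 0.9):
    print(r, aggregated_autocorr(flat(r), 1))
```

In this flat case the lag one value after combining bins is 2r / (1 + r): it stays at 0 when r = 0, is close to 2r when r is small, and can only move a little toward 1 when r is already large.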
Investigation III: Zooming SiZer
Idea: Study "dependence" in terms of "non-stationarity in mean"
Recall SiZer finds "significant slopes"
Need for zooming: to view wide range of scales
SiZer Background
- settings: scatterplot smoothing and histograms
- Fossils data
- Incomes data
- Central Question: Which features are “really there”?
- Solution Part I, Scale Space
- Solution Part II, SiZer
SiZer Background (cont.)
Smoothing Setting 1: Scatterplots
E.g. Fossil Data
- from T. Bralower, Dept. Geological Sciences, UNC
- Strontium Ratio in fossil shells
- reflects global sea level
- surrogate for climate
- over millions of years
SiZer Background (cont.)
Smooths of Fossil Data (details given later)
- dotted line: undersmoothed (feels sampling variability)
- dashed line: oversmoothed (important features missed?)
- solid line: smoothed about right?
Central question: Which features are “really there”?
SiZer Background (cont.)
My scatterplot smoothing method (others disagree):
local linear smoothing
Main idea: (illustrated by toy example)
use kernel window to “determine neighborhood”
then “fit a line within the window”
then “slide window along”
Window Width, h, is critical
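The three steps (kernel window, line within the window, slide along) can be sketched as below. This is a minimal sketch, not the code behind the class figures: the Gaussian kernel, the toy sine data and the name local_linear_smooth are assumptions.

```python
import numpy as np

def local_linear_smooth(x, y, grid, h):
    """Local linear smooth of (x, y) evaluated at grid, window width h."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    fitted = np.empty(len(grid))
    for i, x0 in enumerate(grid):
        # Kernel window "determines the neighborhood" of x0.
        w = np.exp(-0.5 * ((x - x0) / h) ** 2)
        # Weighted least squares line within the window, centered at x0.
        X = np.column_stack([np.ones_like(x), x - x0])
        WX = X * w[:, None]
        beta = np.linalg.solve(X.T @ WX, WX.T @ y)
        # Intercept of the centered line is the smooth at x0.
        fitted[i] = beta[0]
    return fitted

# Toy example (assumption): noisy sine curve, three window widths.
rng = np.random.default_rng(1)
x = np.sort(rng.uniform(0, 1, 200))
y = np.sin(2 * np.pi * x) + rng.normal(0, 0.3, 200)
grid = np.linspace(0, 1, 101)
smooths = {h: local_linear_smooth(x, y, grid, h) for h in (0.02, 0.1, 0.5)}
```

With these toy settings h = 0.02 should give a wiggly (undersmoothed) curve and h = 0.5 a nearly straight (oversmoothed) one, which is the sense in which h is critical.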
SiZer Background (cont.)
Smoothing Setting 2: Histograms
Family Income Data: British Family Expenditure Survey, 1975
- Distribution of Incomes
- ~ 7000 families
- Again under- and over-smoothing issues
- Perhaps 2 modes in data?
- Histogram Problem 1: Binwidth (well known)
Central question: Which features are “really there”?
(e.g. 2 modes?)
SiZer Background (cont.)
Why not use (conventional) histograms?
Histogram Problem 2: Bin shift (less well known)
- For same binwidth
- get much different impression
- by only “shifting grid location”
- 1st peak all in one bin: bimodal
- 1st peak split between bins: unimodal
Solution to binshift problem: average over all shifts
Smooth histogram provides understanding, so should use for data analysis
Another name: Kernel Density Estimate
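"Average over all shifts" can be sketched numerically as follows. This is a sketch under assumptions: synthetic two-component data stand in for the incomes data, a finite number of equally spaced shifts approximates "all shifts", and the function names are hypothetical.

```python
import numpy as np

def shifted_histogram(data, binwidth, shift, grid):
    """Histogram density with bin edges at shift + j * binwidth, read off at grid."""
    data_bin = np.floor((np.asarray(data) - shift) / binwidth).astype(int)
    grid_bin = np.floor((np.asarray(grid) - shift) / binwidth).astype(int)
    lo = min(data_bin.min(), grid_bin.min())
    counts = np.bincount(data_bin - lo, minlength=grid_bin.max() - lo + 1)
    return counts[grid_bin - lo] / (len(data) * binwidth)

def average_shifted_histogram(data, binwidth, grid, n_shifts=20):
    """Average of histograms over n_shifts equally spaced grid locations."""
    shifts = binwidth * np.arange(n_shifts) / n_shifts
    return np.mean([shifted_histogram(data, binwidth, s, grid) for s in shifts],
                   axis=0)

# Synthetic stand-in data (assumption): mixture of two normal components.
rng = np.random.default_rng(2)
data = np.concatenate([rng.normal(-1.5, 0.5, 250), rng.normal(1.5, 1.0, 250)])
grid = np.linspace(-6, 6, 1201)
one_shift = shifted_histogram(data, 0.5, 0.0, grid)
other_shift = shifted_histogram(data, 0.5, 0.25, grid)
ash = average_shifted_histogram(data, 0.5, grid)
```

Comparing one_shift and other_shift shows the bin shift effect for a single binwidth, while ash no longer depends on the grid location. As the number of shifts grows, the average approaches a kernel density estimate with a triangular kernel.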
SiZer Background (cont.)
Kernel density estimation
View 1: Smooth histogram
View 2: Distribute probability mass, according to data
E.g. Chondrite data (from how many sources?)
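View 2 in code: each data point gets probability mass 1/n, spread out by a kernel centered at that point. This is a minimal sketch, assuming a Gaussian kernel and synthetic data (the Chondrite values are not reproduced here); kde is a hypothetical name.

```python
import numpy as np

def kde(data, grid, h):
    """Gaussian kernel density estimate at grid, with bandwidth h."""
    data = np.asarray(data, dtype=float)
    grid = np.asarray(grid, dtype=float)
    # One column per data point: its 1/n share of mass, spread with s.d. h.
    z = (grid[:, None] - data[None, :]) / h
    bumps = np.exp(-0.5 * z ** 2) / (h * np.sqrt(2 * np.pi))
    return bumps.mean(axis=1)

# Synthetic data (assumption), only to exercise the function.
rng = np.random.default_rng(3)
data = rng.normal(0.0, 1.0, 100)
grid = np.linspace(-8, 8, 1601)
density = kde(data, grid, h=0.3)
```

The estimate is the average of the individual bumps, so it is nonnegative and integrates to 1 whatever h is used.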
SiZer Background (cont.)
Kernel density estimation (cont.)
Central Issue: width of window, i.e. “bandwidth”, h
Controls critical amount of smoothing
Old Approach: data based bandwidth selection
Jones, M. C., Marron, J. S. and Sheather, S. J. (1996) A brief survey of bandwidth selection for density estimation, Journal of the American Statistical Association, 91, 401-407.
New Approach: "scale space" (look at all of them)
SiZer Background (cont.)
“Scale Space” – idea from Computer Vision
Conceptual basis:
- Oversmoothing = “view from afar” (macroscopic)
- Undersmoothing = “zoomed in view” (microscopic)
Main idea: all smooths contain useful information,
so study “full spectrum” (i.e. all smoothing levels)
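Studying the "full spectrum" can be sketched as a family of smooths indexed by bandwidth. This is a minimal sketch under assumptions: a Gaussian kernel density estimate, synthetic bimodal data, log-spaced bandwidths, and mode counting on a grid as a crude summary of each level. It shows only the scale space part, not SiZer's significance step, and the names are hypothetical.

```python
import numpy as np

def kde(data, grid, h):
    """Gaussian kernel density estimate at grid, with bandwidth h."""
    z = (grid[:, None] - data[None, :]) / h
    return np.exp(-0.5 * z ** 2).mean(axis=1) / (h * np.sqrt(2 * np.pi))

def count_modes(f):
    """Number of strict interior local maxima of f on its grid."""
    return int(np.sum((f[1:-1] > f[:-2]) & (f[1:-1] > f[2:])))

# Synthetic bimodal data (assumption).
rng = np.random.default_rng(4)
data = np.concatenate([rng.normal(-2.0, 0.5, 150), rng.normal(2.0, 0.5, 150)])
grid = np.linspace(-6, 6, 601)

# Scale space: one smooth per bandwidth, from zoomed in to view from afar.
bandwidths = np.logspace(np.log10(0.05), np.log10(5.0), 9)
family = np.array([kde(data, grid, h) for h in bandwidths])
mode_counts = [count_modes(f) for f in family]
for h, m in zip(bandwidths, mode_counts):
    print(round(float(h), 3), m)
```

The smallest bandwidths should show many spurious bumps and the largest a single mode, with the two real modes visible in between; no single row of family is singled out as "the" estimate.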