The upper graphic on the next page was produced because Vaske provided data that included the number of years in life spent hunting deer or elk. The data are weighted (see Vaske’s book Survey Research and Analysis, p. 214, 2008, Venture). The “wavy” deheaped curve below could not be appropriately estimated until deheaping could be done using case weights. Program 2 will include two weighting options. The first allows a single “inflator,” such as 2000, that inflates responses so that estimates apply to the population (e.g., one in 2000 hunters provided data). The second allows an average weight for each x-value: one computes the average case weight for each x-value (e.g., using proc summary as below), and the program uses these averages. Detailed documentation will be available.

 

Once the program was run, it became clear that the regressions used to establish a smooth curve under the heaps could be fitted over ranges that were too short. New parameters in the upgraded release of Program 2 will allow adjusting the number of points covered by each regression and the overlap between regressions. One will be able to adjust the following within limits: %LET OVERLAP=4; %LET LENGTH=9; %LET DIF=%EVAL(&LENGTH-1); The program in the first release had an overlap of 3 and a length of 6. This means every regression included 6 non-heap points and shared 3 of those points with the next or previous regression.
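To make the length and overlap settings concrete, here is a minimal sketch of how overlapping regression windows could be laid out over a run of non-heap points. It is not taken from Program 2. The macro variable NPOINTS, the data set name windows, and the variables win_start and win_end are hypothetical. The sketch also assumes that each window ends DIF points after it starts and that successive windows start LENGTH minus OVERLAP points apart.

```sas
%LET OVERLAP=4;
%LET LENGTH=9;
%LET DIF=%EVAL(&LENGTH-1);
%LET NPOINTS=30;   /* hypothetical number of non-heap points */

data windows;
  /* assumed spacing between the starts of successive regressions */
  step = &LENGTH - &OVERLAP;
  do win_start = 1 to &NPOINTS - &DIF by step;
    win_end = win_start + &DIF;   /* each window covers LENGTH points */
    output;
  end;
  keep win_start win_end;
run;

proc print data=windows noobs;
run;
```

With these settings each window covers 9 points and starts 5 points after the previous one, so neighboring windows share 4 points. The sketch makes no attempt to handle points left over at the end of the range.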

 

The graphics you see on the next page are automatic (thank Jen Schmidt). The waves you see in two of the functions largely result from such matters as low use of responses ending in 1 and 9. People appear to be rounding to multiples of 10 as if a prototype of 10x±1 were being used (and it probably is being used). The fascinating thing is that this seems to occur around numbers ending in 0 but not around those ending in 5. Handling of the 10x±1 prototype will likely be added in a later program release.

 

data deheap4.colodeerelk_uf2;
  /* No BY statement, so the two data sets are merged one-to-one, row by row. */
  merge deheap4.colodeerelk_nf(keep=hunt2003 harvest2003 sex--income deerelk residency yearsinstate yearsinlife hunt2004 hunt2005 bucksbulls doescows)
        colodeerelkweight(keep=staterecode weight_sample weight_pop);
  /* Round each weight to a whole number, with a minimum of 1. */
  weightp=round(weight_pop,1); if weightp<1 then weightp=1;
  weights=round(weight_sample,1); if weights<1 then weights=1;
run;

proc gchart data=deheap4.colodeerelk_uf2;vbar yearsinstate yearsinlife/ sumvar=weights discrete;run;

proc gchart data=deheap4.colodeerelk_uf2;vbar yearsinstate yearsinlife/  discrete;run;

proc sort data=deheap4.colodeerelk_uf2 out=x;by yearsinlife;run;

proc summary data=x;by yearsinlife;var weights;output out=deheap4.codeerelkYRnLife sum=count mean=weight;run;

data deheap4.codeerelkYRnLife;set deheap4.codeerelkYRnLife;count=round(count,1);if count>0 and yearsinlife>0 then output;keep yearsinlife count weight;run;
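The low use of responses ending in 1 and 9 noted earlier can be checked directly by tabulating weighted counts by last digit. The following is a minimal sketch, not part of Program 2. It assumes yearsinlife holds whole numbers and that the data set deheap4.codeerelkYRnLife has been created as above. The data set name digits and the variable last_digit are hypothetical.

```sas
data digits;
  set deheap4.codeerelkYRnLife;
  /* last digit of the reported number of years */
  last_digit = mod(yearsinlife, 10);
run;

/* weighted frequency of each last digit, using the summed case weights */
proc freq data=digits;
  tables last_digit / nocum;
  weight count;
run;
```

If the 10x±1 prototype is in use, digits 1 and 9 should show noticeably lower weighted counts than their neighbors 2 and 8.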

 

The following was used to generate case weights for testing the program:

 

data deheap.xwt_frequency;set deheap.cntl_data(keep=&x &y);

case_weight=round(.5+abs(5+normal(-1)),.5);

put &x= &y= case_weight=;

Run;

 

Plotting uses:

ods html;

 ods graphics on;

PROC SGPLOT DATA = &lib..&out_data._smooth_out6

(rename=(dist_smth=Deheaped_function _p_mean=Smooth_Fn_under_Heaps

 &y=Observed_Frequency &x=%upcase(&x)));

series X= &x Y=Observed_Frequency / LINEATTRS=(Color= "black" Pattern= Solid Thickness= 1);

series X= &x Y=Smooth_Fn_under_Heaps / LINEATTRS=(Color= "black" Pattern= DashDotDot Thickness= 1) ;

series X= &x Y=Deheaped_Function / LINEATTRS=(Color= "black" Pattern= Dash Thickness= 1);

TITLE "Figure for run labeled &out_data with frequencies for variable %upcase(&x) with frequencies in variable %upcase(&y).

Weight option is: %upcase(&weighting)";

RUN;

ods html;

 ods graphics off;
