Extra Credit 1

Tutorial/Extra Credit Assignment

How do you partition the Births data set¹ by gender?

How do you create a histogram of birth weights for each gender?

Load the data set using the following syntax:

BirthsD <- read.csv(file.choose(), header = TRUE)

attach(BirthsD)
head(BirthsD, 3)
##                         FACILITY         INSURANCE GENDER..1.M.
## 1 Albany Medical Center Hospital Insurance Company            0
## 2 Albany Medical Center Hospital        Blue Cross            1
## 3 Albany Medical Center Hospital        Blue Cross            0
##   LENGTH.OF.STAY ADMITTED DISCHARGED BIRTH.WEIGHT TOTAL.CHARGES
## 1              2      FRI        SUN         3500       13985.7
## 2              2      FRI        SUN         3900        3632.5
## 3             36      WED        THU          800      359091.0
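The column name GENDER..1.M. looks unusual because read.csv() cleans header text with make.names() by default (check.names = TRUE). The sketch below illustrates that step, assuming the raw CSV headers were something like "GENDER (1=M)"; the actual header row is not shown in this tutorial, so raw.headers is a guess used only for illustration.

```r
# Assumed raw headers (hypothetical; the real CSV header row is not shown here)
raw.headers <- c("GENDER (1=M)", "LENGTH OF STAY", "BIRTH WEIGHT")

# make.names() replaces characters that are not valid in R names with dots;
# read.csv() applies it to the header row when check.names = TRUE (the default)
clean.headers <- make.names(raw.headers)
clean.headers
# gives "GENDER..1.M.", "LENGTH.OF.STAY", "BIRTH.WEIGHT"
```

If that assumption holds, the column is coded 1 for boys and 0 for girls, which matches the way the subsets are selected in this tutorial.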
Select all rows of the BirthsD data set that pertain to girls’ births, keeping all variables (columns).
Name this subset girlD (girls’ data).
# GENDER..1.M. is coded 0 for girls and 1 for boys
girlD <- BirthsD[GENDER..1.M. == 0, ]
attach(girlD)
## The following objects are masked from BirthsD:
## 
##     ADMITTED, BIRTH.WEIGHT, DISCHARGED, FACILITY, GENDER..1.M.,
##     INSURANCE, LENGTH.OF.STAY, TOTAL.CHARGES
head(girlD)
##                          FACILITY         INSURANCE GENDER..1.M.
## 1  Albany Medical Center Hospital Insurance Company            0
## 3  Albany Medical Center Hospital        Blue Cross            0
## 6  Albany Medical Center Hospital        Blue Cross            0
## 7  Albany Medical Center Hospital          Medicaid            0
## 9  Albany Medical Center Hospital Insurance Company            0
## 13 Albany Medical Center Hospital Insurance Company            0
##    LENGTH.OF.STAY ADMITTED DISCHARGED BIRTH.WEIGHT TOTAL.CHARGES
## 1               2      FRI        SUN         3500       13985.7
## 3              36      WED        THU          800      359091.0
## 6               4      FRI        TUE         2400        6406.0
## 7               3      TUE        FRI         4200        4778.0
## 9               2      SAT        MON         3100        3860.0
## 13              4      SUN        THU         2000        6986.9
summary(BIRTH.WEIGHT)
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##     300    2700    3100    3037    3500    4700
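After attach(girlD), the name BIRTH.WEIGHT refers to the girls' column only because girlD masks BirthsD, which is easy to lose track of. As a hedged alternative, the same selection can be written without attach() by referring to columns through the data frame. The sketch below uses a small data frame, toyD (a hypothetical name), built from the nine rows printed in this tutorial rather than the full file.

```r
# Toy data built from rows 1-7, 9 and 13 of BirthsD as printed in this tutorial;
# toyD is a hypothetical name and only two columns are kept
toyD <- data.frame(
  GENDER..1.M. = c(0, 1, 0, 1, 1, 0, 0, 0, 0),
  BIRTH.WEIGHT = c(3500, 3900, 800, 2800, 3700, 2400, 4200, 3100, 2000)
)

# Refer to the column through the data frame, so no attach() is needed
girl.toy <- toyD[toyD$GENDER..1.M. == 0, ]

# The summary is now explicit about which subset it describes
summary(girl.toy$BIRTH.WEIGHT)
nrow(girl.toy)
```

With the real data, toyD would simply be replaced by BirthsD.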
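Create a frequency table for girls’ birth weights.
# Class boundaries: 0, 500, ..., 5000 grams
breaks <- seq(0, 5000, by = 500)
# Assign each birth weight to a class, then count the weights in each class
BIRTH.WEIGHT.cut <- cut(BIRTH.WEIGHT, breaks)
BIRTH.WEIGHT.freq <- table(BIRTH.WEIGHT.cut)
# Convert the table to a data frame with one row per class
frequency.table <- transform(BIRTH.WEIGHT.freq)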
frequency.table
##    BIRTH.WEIGHT.cut Freq
## 1           (0,500]    1
## 2       (500,1e+03]    5
## 3   (1e+03,1.5e+03]    1
## 4   (1.5e+03,2e+03]   12
## 5   (2e+03,2.5e+03]   19
## 6   (2.5e+03,3e+03]   50
## 7   (3e+03,3.5e+03]   75
## 8   (3.5e+03,4e+03]   33
## 9   (4e+03,4.5e+03]    7
## 10  (4.5e+03,5e+03]    2
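The class labels above appear in scientific notation (for example, 1e+03) because cut() formats break points with three significant digits by default. A minimal sketch of one way to get plain labels is the dig.lab argument of cut(); the weights below are the six girls' values printed by head(girlD), not the full subset.

```r
# Girls' birth weights from the six rows printed by head(girlD)
w <- c(3500, 800, 2400, 4200, 3100, 2000)
breaks <- seq(0, 5000, by = 500)

# dig.lab = 4 lets cut() use up to four digits when formatting the break labels
w.cut <- cut(w, breaks, dig.lab = 4)

# as.data.frame() turns the table of counts into a two-column data frame
freq.toy <- as.data.frame(table(w.cut))
freq.toy
```

The labels now read (500,1000], (1000,1500], and so on, while the counts are unaffected.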
Create a histogram of birth weights for the girls’ data subset using breaks and the hist command.
breaks <- seq(0, 5000, by = 500)
# Pass breaks explicitly so the histogram uses the same classes as the frequency table
hist(BIRTH.WEIGHT, breaks = breaks,
     xlab = "Birth Weight in [grams]", ylab = "Frequency", ylim = c(0, 80),
     main = "Distribution of Birth Weights for Girls", col = "pink", border = "blue")
detach(girlD)
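To confirm that a histogram uses the same classes as the frequency table, hist() can be called with an explicit breaks argument and plot = FALSE, which returns the bar heights without drawing. The sketch below is an illustration on the six girls' weights printed by head(girlD), not on the full subset.

```r
# Girls' birth weights from the six rows printed by head(girlD)
w <- c(3500, 800, 2400, 4200, 3100, 2000)
breaks <- seq(0, 5000, by = 500)

# plot = FALSE returns the histogram object instead of drawing it
h <- hist(w, breaks = breaks, plot = FALSE)
h$counts

# Counts from cut() and table() on the same breaks
tab <- table(cut(w, breaks))

# TRUE when every bar height equals the matching frequency-table count
all(h$counts == as.vector(tab))
```

Both hist() and cut() use right-closed intervals by default, which is why the counts agree here.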
Now select all rows of the BirthsD data set that pertain to boys’ births, keeping all variables (columns).

Name this subset boyD (boys’ data).

boyD <- BirthsD[GENDER..1.M. == 1, ]
attach(boyD)
## The following objects are masked from BirthsD:
## 
##     ADMITTED, BIRTH.WEIGHT, DISCHARGED, FACILITY, GENDER..1.M.,
##     INSURANCE, LENGTH.OF.STAY, TOTAL.CHARGES
head(boyD,3)
##                         FACILITY         INSURANCE GENDER..1.M.
## 2 Albany Medical Center Hospital        Blue Cross            1
## 4 Albany Medical Center Hospital Insurance Company            1
## 5 Albany Medical Center Hospital Insurance Company            1
##   LENGTH.OF.STAY ADMITTED DISCHARGED BIRTH.WEIGHT TOTAL.CHARGES
## 2              2      FRI        SUN         3900        3632.5
## 4              5      MON        SAT         2800        8536.5
## 5              2      FRI        SUN         3700        3632.5
summary(BIRTH.WEIGHT)
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##     300    2900    3400    3273    3650    4900
Create a frequency table for boys’ birth weights.
BIRTH.WEIGHT.cut <- cut(BIRTH.WEIGHT, breaks)
BIRTH.WEIGHT.freq <- table(BIRTH.WEIGHT.cut)
transform(BIRTH.WEIGHT.freq)
##    BIRTH.WEIGHT.cut Freq
## 1           (0,500]    1
## 2       (500,1e+03]    2
## 3   (1e+03,1.5e+03]    2
## 4   (1.5e+03,2e+03]    5
## 5   (2e+03,2.5e+03]    8
## 6   (2.5e+03,3e+03]   39
## 7   (3e+03,3.5e+03]   69
## 8   (3.5e+03,4e+03]   57
## 9   (4e+03,4.5e+03]   10
## 10  (4.5e+03,5e+03]    2
Create a histogram of birth weights for the boys’ data subset using breaks and the hist command.
breaks <- seq(0, 5000, by = 500)
# Pass breaks explicitly so the histogram uses the same classes as the frequency table
hist(BIRTH.WEIGHT, breaks = breaks,
     xlab = "Birth Weight in [grams]", ylab = "Frequency", ylim = c(0, 80),
     main = "Distribution of Birth Weights for Boys", col = "blue", border = "yellow")
detach(boyD)
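The two subsets can also be obtained in a single step. As a rough sketch, split() partitions one column by the values of another and returns a list with one element per gender; toyD is a hypothetical data frame built from the nine rows printed in this tutorial, not the full file.

```r
# Toy data built from rows 1-7, 9 and 13 of BirthsD as printed in this tutorial;
# toyD is a hypothetical name and only two columns are kept
toyD <- data.frame(
  GENDER..1.M. = c(0, 1, 0, 1, 1, 0, 0, 0, 0),
  BIRTH.WEIGHT = c(3500, 3900, 800, 2800, 3700, 2400, 4200, 3100, 2000)
)

# split() returns a list with elements named "0" (girls) and "1" (boys)
by.gender <- split(toyD$BIRTH.WEIGHT, toyD$GENDER..1.M.)

# Apply the same statistic to each gender at once
medians <- sapply(by.gender, median)
medians
# 2750 for girls ("0") and 3700 for boys ("1") in this toy sample
```

This avoids the attach() and detach() bookkeeping, at the cost of working with a list rather than two named data frames.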

Extra Credit Assignment

1. Use parts of the above syntax to create two box-and-whisker plots (one for each gender), describe the variability in each subset, and compare the variability between genders.
2. Write a report using any word-processing program. Use full and complete sentences; remember to include numerical and graphical summaries in your report. In addition, attach the R printout with input and output.

1. Data Set 4, extracted from M. F. Triola, Essentials of Statistics, Sixth Edition, Pearson.