
### R Result Analysis - Simple Reporting

posted Jun 18, 2019, 12:14 AM by MUHAMMAD MUN`IM AHMAD ZABIDI   [ updated Jul 20, 2019, 11:09 PM ]
This post is the continuation of R Result Analysis - Preparing the Data. Commands that you enter follow the > prompt.

Some recap: In the previous post, the total marks for each student were computed. Each student was given a grade, and the passing students were copied from the full data frame my to a subset called x.

    > table(my$grade)
     A A- A+  B B- B+  C C- C+ D+  E
    73 42 48 42 40 42 22  4 25  5  1
    > x = my[my$total >=40,]
    > table(x$grade)
     A A- A+  B B- B+  C C- C+ D+
    73 42 48 42 40 42 22  4 25  5

Columns clo1, clo2 and clo3 report the students' achievement on the course learning outcomes (CLOs), normalized to 1. The optional options(digits=2) command limits the output to 2 significant digits.

    > x$clo1 <- (x$tests+x$final)/70
    > x$clo2 <- x$mile1/15
    > x$clo3 <- x$mile4/15
    > options(digits=2)
    > head(x,2)
          name  matric tests final mile1 mile4  prog sect total grade clo1 clo2 clo3
    1 STUDENT1 MATRIC1   9.1    32    14  15.0 2SKEE    1    70    B+ 0.58 0.95 1.00
    2 STUDENT2 MATRIC2  12.3    22    13   9.2 2SKEE    1    57    C+ 0.50 0.87 0.62

CLO results for different sections: use the last 3 columns.

    > aggregate(x,by=list(x$sect),mean)
       Group.1 name matric tests final mile1 mile4 prog sect total grade clo1 clo2 clo3
    1        1   NA     NA    12    30    14    14   NA    1    70    NA 0.61 0.91 0.93
    2        2   NA     NA    12    31    14    14   NA    2    71    NA 0.62 0.94 0.90
    3        3   NA     NA    13    30    15    13   NA    3    71    NA 0.62 1.00 0.85
    4        4   NA     NA    12    29    15    14   NA    4    70    NA 0.58 0.99 0.94
    5        5   NA     NA    13    36    13     9   NA    5    71    NA 0.70 0.86 0.60
    6        6   NA     NA    15    40    14    13   NA    6    82    NA 0.79 0.90 0.85
    7        7   NA     NA    12    33    12    12   NA    7    68    NA 0.63 0.80 0.78
    8        8   NA     NA    11    31    12    12   NA    8    66    NA 0.60 0.82 0.80
    9        9   NA     NA    14    35    14    13   NA    9    77    NA 0.70 0.95 0.89
    10      10   NA     NA    14    35    14    14   NA   10    77    NA 0.70 0.94 0.96
    11      11   NA     NA    16    41    14    13   NA   11    83    NA 0.80 0.91 0.85
    12      12   NA     NA    12    32    14    13   NA   12    71    NA 0.63 0.91 0.87
    13      13   NA     NA    12    29    14    13   NA   13    68    NA 0.58 0.92 0.89
    There were 50 or more warnings (use warnings() to see the first 50)

The warnings come from taking the mean of the non-numeric columns (the ones shown as NA) and can be ignored. We got all the numbers we need.

CLO results for all sections:

    > mean(x$clo1)
    0.66
    > mean(x$clo2)
    0.92
    > mean(x$clo3)
    0.87

Find how many students achieved the KPI of 0.4:

    > sum(x$clo1>.4)
    314
    > sum(x$clo2>.4)
    343
    > sum(x$clo3>.4)
    329

The academic programs differ by the 5th character in the prog column. We can create the e, l, and m data frames (for SKEE, SKEL and SKEM) as subsets of the x data frame. The following commands extract the analysis for SKEE.

    > e <- subset(x,'E'==substr(x$prog,5,5))
    > mean(e$clo1)
    0.62
    > mean(e$clo2)
    0.94
    > mean(e$clo3)
    0.87

The SKEL and SKEM programs have 3 and 4 subgroups, respectively. The SKEL and SKEM subsets can be extracted using the subset or filter function. Two more ways to perform the same operation are shown here.

    > l <- filter(x, x$prog=="1SKEL" | x$prog == "2SKEL" | x$prog=="4SKEL")
    > m = subset(x, substr(x$prog,5,5) == 'M')

The filter function does not seem to work all the time. One likely cause is that base R's stats package has its own filter function, which is the one that gets called if dplyr has not been loaded. The two functions also have subtle differences: subset is part of base R, while filter is part of the dplyr package. After subsetting, find the means for SKEL and SKEM (one example shown below):

    > mean(m$clo2)
    0.92

and so on.

We can also perform subgroup analysis without first extracting a subset, by using the x data frame directly. To find the mean of clo1 for the SKEE cohort, select the rows that have the character 'E' in the 5th position of prog.

    > mean(x$clo1) # mean CLO1 for all students
    0.66
    > mean(x$clo1['E'==substr(x$prog,5,5)]) # mean CLO1 for SKEE only
    0.62
    > mean(x$clo1['L'==substr(x$prog,5,5)])
    0.69
    > mean(x$clo1['M'==substr(x$prog,5,5)])
    0.68

Repeat for CLO2 and CLO3.
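
The per-program means can also be computed in a single call instead of one mean() per program. The following is a minimal sketch, assuming a small made-up data frame toy that stands in for x (its values are illustrative, not the real class results) and a hypothetical helper column code that holds the 5th character of prog. It uses the formula interface of aggregate on the numeric CLO columns only, so no text columns are averaged.

```r
# Toy stand-in for the x data frame (values are made up for illustration)
toy <- data.frame(
  prog = c("2SKEE", "2SKEE", "1SKEL", "2SKEL", "3SKEM", "4SKEM"),
  clo1 = c(0.50, 0.70, 0.60, 0.80, 0.40, 0.60),
  clo2 = c(0.90, 1.00, 0.80, 0.90, 0.70, 0.90),
  clo3 = c(0.80, 0.90, 0.70, 0.90, 0.60, 1.00),
  stringsAsFactors = FALSE
)

# Hypothetical helper column: program letter taken from the 5th character
toy$code <- substr(toy$prog, 5, 5)

# Mean of every CLO column for each program letter, in one call
by_prog <- aggregate(cbind(clo1, clo2, clo3) ~ code, data = toy, FUN = mean)
print(by_prog)

# Same idea for a single column, using tapply
clo1_by_prog <- tapply(toy$clo1, toy$code, mean)
print(clo1_by_prog)
```

Replacing code with sect in the formula would give the per-section breakdown in the same way.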
To get the mean for only 2SKEE students, the command is:

    > mean(x$clo1[x$prog=='2SKEE'])
    0.62

All the numbers required for reporting are now ready for presentation using PowerPoint. If you want R to create plots, use the ggplot2 package. That will be my homework for the future.
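
Before copying the numbers into a slide, it may help to collect them in one small table. The sketch below is one possible way to do that with base R only. It again uses a made-up stand-in data frame toy, and the names kpi and summarise_clo are hypothetical, not part of the original analysis. For each CLO it reports the mean, the number of students above the KPI, and the percentage above the KPI.

```r
# Toy stand-in for the x data frame (values are made up for illustration)
toy <- data.frame(
  clo1 = c(0.50, 0.70, 0.30, 0.80),
  clo2 = c(0.90, 1.00, 0.80, 0.90),
  clo3 = c(0.80, 0.35, 0.70, 0.20)
)

kpi <- 0.4  # same threshold as in the post

# Hypothetical helper: summary numbers for one CLO column
summarise_clo <- function(v, threshold) {
  c(mean      = mean(v),
    n_above   = sum(v > threshold),
    pct_above = 100 * mean(v > threshold))
}

# One column per CLO, one row per statistic
report <- sapply(toy[c("clo1", "clo2", "clo3")], summarise_clo, threshold = kpi)

# Transpose so that each CLO becomes a row, then round for display
print(round(t(report), 2))
```

On the real data, toy would be replaced by x, or by one of the e, l and m subsets for a per-program table.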