Part 1: Data

The data collected by the Behavioral Risk Factor Surveillance System (BRFSS) seems to be a randomly selected sample of the US population. Between 2013 and 2014, the population of the US was approximately 316 to 318 million. The sample of 491,775 people interviewed is therefore roughly 0.15% to 0.16% of that population. I have some reservations about the nature of the selection process, which was not specified completely.
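As a quick check of the sampling fraction, the arithmetic can be sketched in R. The figures are the ones quoted above; the variable names are only illustrative.

```r
# Figures quoted in the text above.
sample_size   <- 491775
us_population <- c(low = 316e6, high = 318e6)

# Share of the US population that was interviewed, in percent.
sample_share <- 100 * sample_size / us_population
round(sample_share, 3)
```

Both ends of the population range give a share between 0.15% and 0.16%.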

First, approximately 2.5% of US households may not have been represented, as “Overall, an estimated 97.5% of US households had telephone service in 2012.”

Second, the distribution of interviews across states may not reflect the population. The vast majority of states had over 5,000 residents interviewed. If the sample were proportional to population, the states with the most interviews would be, in order, California, Texas, New York, and Florida, the most populous states in 2013. Instead, Florida is vastly overrepresented with 34,186 interviews, followed by Kansas (23,282), Nebraska (17,139), and Massachusetts (15,071).

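This was an observational study: respondents were interviewed rather than randomly assigned to treatments, and no hypotheses, controls, or confounding variables were specified beforehand. We can use this data to generalize to the US population, subject to the reservations above, but not to make causal arguments.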


Part 2: Research questions

Research question 1:

Interview Month Frequency ranges from 34,172 in January to 44,452 in March. It seems to me that people would have less time in the spring and summer months, as they are more likely to have kids at home and to have prior engagements. In the fall and winter months I would guess people have more time to complete the interview, although some would again spend more time with their kids or shopping for gifts.

Is there a relationship between the Final Disposition (whether or not the interview was completed), the Interview Month, and number of Children?

Research question 2:

I am genuinely curious as to how aware people are of their own health. Are they accurate and not in denial?

Is there a relationship between opinion of one's General Health and having been diagnosed with High Blood Cholesterol and a Heart Attack?

Research question 3:

Numerous studies link sleep with diabetes, as hormones play an important role during rest, influencing glucose regulation. I have sourced two such studies.

Impact of sleep and sleep loss on glucose homeostasis and appetite regulation

Role of sleep duration in the regulation of glucose metabolism and appetite

Is this relationship of Duration of Sleep and Diabetes consistent in the BRFSS dataset?


Part 3: Exploratory data analysis

Research question 1:

Is there a relationship between the Final Disposition (whether or not the interview was completed), the Interview Month, and number of Children?

I first selected only the three columns I am observing, imonth, dispcode, and children. Here is the summary.

      imonth                                dispcode         children      
 March   : 44476   Completed interview          :433222   Min.   : 0.0000  
 July    : 43667   Partially completed interview: 58548   1st Qu.: 0.0000  
 April   : 42936   NA's                         :     5   Median : 0.0000  
 February: 42867                                          Mean   : 0.5167  
 August  : 42301                                          3rd Qu.: 1.0000  
 (Other) :275525                                          Max.   :47.0000  
 NA's    :     3                                          NA's   :2274     

Of the NAs, 3 are from imonth, 5 are from dispcode, and 2,274 are from children (Refused and [Missing] responses). I removed these entries. From the tally of children, we see that the vast majority of households interviewed have 0 to 3 children, so households with 4 or more children are grouped into a single category.
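The cleaning steps just described can be sketched in base R. This is a minimal illustration on a made-up toy data frame, not the author's actual code: only the column names imonth, dispcode, and children come from the report, and the values are invented.

```r
# Toy stand-in for the three selected BRFSS columns (values are made up).
toy <- data.frame(
  imonth   = c("January", "January", "March", NA, "March", "March"),
  dispcode = c("Completed interview", "Partially completed interview",
               "Completed interview", "Completed interview", NA,
               "Completed interview"),
  children = c(0, 5, 2, 1, 3, NA),
  stringsAsFactors = FALSE
)

# Keep only the rows where all three columns are present.
toy <- toy[complete.cases(toy), ]

# Collapse 4 or more children into a single category.
toy$children <- ifelse(toy$children >= 4, "4 or More",
                       as.character(toy$children))

# Tally interviews by month, number of children, and disposition.
counts <- as.data.frame(
  table(Month = toy$imonth, Children = toy$children,
        Disposition = toy$dispcode),
  responseName = "Count"
)
# Drop the combinations that never occur in the toy data.
counts <- counts[counts$Count > 0, ]
counts
```

The table that follows is the report's own output from the full dataset, not the output of this sketch.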

        Month  Children                   Disposition Count
1     January         0           Completed interview 22529
2     January         0 Partially completed interview  2457
3     January         1           Completed interview  3290
4     January         1 Partially completed interview   471
5     January         2           Completed interview  2866
6     January         2 Partially completed interview   404
7     January         3           Completed interview  1145
8     January         3 Partially completed interview   178
9     January 4 or More           Completed interview   617
10    January 4 or More Partially completed interview   103
11   February         0           Completed interview 27817
12   February         0 Partially completed interview  3271
13   February         1           Completed interview  4133
14   February         1 Partially completed interview   593
15   February         2           Completed interview  3596
16   February         2 Partially completed interview   585
17   February         3           Completed interview  1542
18   February         3 Partially completed interview   252
19   February 4 or More           Completed interview   789
20   February 4 or More Partially completed interview   137
21      March         0           Completed interview 28662
22      March         0 Partially completed interview  3446
23      March         1           Completed interview  4382
24      March         1 Partially completed interview   709
25      March         2           Completed interview  3777
26      March         2 Partially completed interview   657
27      March         3           Completed interview  1458
28      March         3 Partially completed interview   274
29      March 4 or More           Completed interview   746
30      March 4 or More Partially completed interview   145
31      April         0           Completed interview 27909
32      April         0 Partially completed interview  3317
33      April         1           Completed interview  4051
34      April         1 Partially completed interview   681
35      April         2           Completed interview  3524
36      April         2 Partially completed interview   640
37      April         3           Completed interview  1501
38      April         3 Partially completed interview   258
39      April 4 or More           Completed interview   725
40      April 4 or More Partially completed interview   128
41        May         0           Completed interview 25909
42        May         0 Partially completed interview  3395
43        May         1           Completed interview  3828
44        May         1 Partially completed interview   667
45        May         2           Completed interview  3300
46        May         2 Partially completed interview   594
47        May         3           Completed interview  1394
48        May         3 Partially completed interview   271
49        May 4 or More           Completed interview   704
50        May 4 or More Partially completed interview   136
51       June         0           Completed interview 25179
52       June         0 Partially completed interview  2838
53       June         1           Completed interview  3507
54       June         1 Partially completed interview   581
55       June         2           Completed interview  2911
56       June         2 Partially completed interview   503
57       June         3           Completed interview  1257
58       June         3 Partially completed interview   226
59       June 4 or More           Completed interview   633
60       June 4 or More Partially completed interview   107
61       July         0           Completed interview 28506
62       July         0 Partially completed interview  3396
63       July         1           Completed interview  4054
64       July         1 Partially completed interview   622
65       July         2           Completed interview  3652
66       July         2 Partially completed interview   535
67       July         3           Completed interview  1511
68       July         3 Partially completed interview   247
69       July 4 or More           Completed interview   789
70       July 4 or More Partially completed interview   130
71     August         0           Completed interview 27700
72     August         0 Partially completed interview  3207
73     August         1           Completed interview  3964
74     August         1 Partially completed interview   659
75     August         2           Completed interview  3419
76     August         2 Partially completed interview   582
77     August         3           Completed interview  1439
78     August         3 Partially completed interview   229
79     August 4 or More           Completed interview   789
80     August 4 or More Partially completed interview   143
81  September         0           Completed interview 25223
82  September         0 Partially completed interview  3014
83  September         1           Completed interview  3527
84  September         1 Partially completed interview   545
85  September         2           Completed interview  3134
86  September         2 Partially completed interview   508
87  September         3           Completed interview  1329
88  September         3 Partially completed interview   231
89  September 4 or More           Completed interview   742
90  September 4 or More Partially completed interview   120
91    October         0           Completed interview 27887
92    October         0 Partially completed interview  3271
93    October         1           Completed interview  3721
94    October         1 Partially completed interview   602
95    October         2           Completed interview  3351
96    October         2 Partially completed interview   578
97    October         3           Completed interview  1476
98    October         3 Partially completed interview   278
99    October 4 or More           Completed interview   755
100   October 4 or More Partially completed interview   145
101  November         0           Completed interview 27160
102  November         0 Partially completed interview  3595
103  November         1           Completed interview  3727
104  November         1 Partially completed interview   676
105  November         2           Completed interview  3334
106  November         2 Partially completed interview   643
107  November         3           Completed interview  1416
108  November         3 Partially completed interview   318
109  November 4 or More           Completed interview   676
110  November 4 or More Partially completed interview   153
111  December         0           Completed interview 26359
112  December         0 Partially completed interview  3431
113  December         1           Completed interview  3584
114  December         1 Partially completed interview   634
115  December         2           Completed interview  3025
116  December         2 Partially completed interview   564
117  December         3           Completed interview  1342
118  December         3 Partially completed interview   256
119  December 4 or More           Completed interview   740
120  December 4 or More Partially completed interview   152

The plot below graphs the data from the summary above. This representation highlights that most of the households interviewed had no children, and that as the number of children increases, the number of households decreases. The number of interviews conducted seems to be spread fairly evenly among the months. The share of interviews that were not completed seems to be a consistently small percentage, regardless of children and month.

I was not satisfied with the above plot, so I decided to explore the data again. The Counts of Completed Interviews and Partially Completed Interviews were used to calculate a Percent. The data is shown below.

       Month  Children   Percent
1    January         0 0.9016649
2    January         1 0.8747673
3    January         2 0.8764526
4    January         3 0.8654573
5    January 4 or More 0.8569444
6   February         0 0.8947826
7   February         1 0.8745239
8   February         2 0.8600813
9   February         3 0.8595318
10  February 4 or More 0.8520518
11     March         0 0.8926747
12     March         1 0.8607346
13     March         2 0.8518268
14     March         3 0.8418014
15     March 4 or More 0.8372615
16     April         0 0.8937744
17     April         1 0.8560862
18     April         2 0.8463016
19     April         3 0.8533258
20     April 4 or More 0.8499414
21       May         0 0.8841455
22       May         1 0.8516129
23       May         2 0.8474576
24       May         3 0.8372372
25       May 4 or More 0.8380952
26      June         0 0.8987044
27      June         1 0.8578767
28      June         2 0.8526655
29      June         3 0.8476062
30      June 4 or More 0.8554054
31      July         0 0.8935490
32      July         1 0.8669803
33      July         2 0.8722235
34      July         3 0.8594994
35      July 4 or More 0.8585419
36    August         0 0.8962371
37    August         1 0.8574519
38    August         2 0.8545364
39    August         3 0.8627098
40    August 4 or More 0.8465665
41 September         0 0.8932606
42 September         1 0.8661591
43 September         2 0.8605162
44 September         3 0.8519231
45 September 4 or More 0.8607889
46   October         0 0.8950189
47   October         1 0.8607449
48   October         2 0.8528888
49   October         3 0.8415051
50   October 4 or More 0.8388889
51  November         0 0.8831084
52  November         1 0.8464683
53  November         2 0.8383203
54  November         3 0.8166090
55  November 4 or More 0.8154403
56  December         0 0.8848271
57  December         1 0.8496918
58  December         2 0.8428532
59  December         3 0.8397997
60  December 4 or More 0.8295964

The plot below graphs the data from the summary above. Some trends are now clearly visible. The households with 0 children have the highest percentage of completion, and it seems that as the number of children increases, the percentage drops slightly.

November has a noticeably lower rate of completion than the other months, which could possibly be attributed to traveling for Thanksgiving and preparation for Christmas. December also has a low rate of completion, possibly because of Christmas and New Year. January and February have relatively high rates of completion, which could possibly be attributed to the end of the holiday season, colder weather, and people mainly spending time at home.

It should be emphasized that this completion rate is between 81.5% and 90.2%, regardless of number of children and month.
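The completion percentages for research question 1 can be spot-checked from the count table. The sketch below recomputes the January rows in base R, using the January counts from the first summary. The data frame and column names are illustrative only.

```r
# January counts copied from the first summary table.
jan <- data.frame(
  Children  = c("0", "1", "2", "3", "4 or More"),
  Completed = c(22529, 3290, 2866, 1145, 617),
  Partial   = c(2457, 471, 404, 178, 103)
)

# Completion rate = completed / (completed + partially completed).
jan$Percent <- jan$Completed / (jan$Completed + jan$Partial)
round(jan$Percent, 7)
```

The results agree with the January rows of the percent table (0.9016649 for 0 children down to 0.8569444 for 4 or more).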

Research question 2:

Is there a relationship between opinion of one's General Health and having been diagnosed with High Blood Cholesterol and a Heart Attack?

I first selected only the three columns I am observing, genhlth, toldhi2, and cvdinfr4. Here is the summary.

      genhlth       toldhi2       cvdinfr4     
 Excellent: 85482   Yes :183501   Yes : 29284  
 Very good:159076   No  :236612   No  :459904  
 Good     :150555   NA's: 71662   NA's:  2587  
 Fair     : 66726                              
 Poor     : 27951                              
 NA's     :  1985                              

The 1,985 NAs in genhlth, the 71,662 NAs in toldhi2, and the 2,587 NAs in cvdinfr4 all come from Don’t know/Not Sure, Refused, and [Missing] responses. I removed these entries. toldhi2 and cvdinfr4 were combined into one column, Ch_HA, in which the first value is the high cholesterol diagnosis and the second is the heart attack diagnosis (for example, Yes_No means high cholesterol but no heart attack). The columns were renamed.

      Health   Ch_HA Count
1  Excellent   No_No 51752
2  Excellent  No_Yes   471
3  Excellent  Yes_No 18439
4  Excellent Yes_Yes   589
5  Very good   No_No 81654
6  Very good  No_Yes  1478
7  Very good  Yes_No 50502
8  Very good Yes_Yes  2590
9       Good   No_No 63254
10      Good  No_Yes  2629
11      Good  Yes_No 55098
12      Good Yes_Yes  5985
13      Fair   No_No 22152
14      Fair  No_Yes  2222
15      Fair  Yes_No 27334
16      Fair Yes_Yes  5705
17      Poor   No_No  7889
18      Poor  No_Yes  1444
19      Poor  Yes_No 10770
20      Poor Yes_Yes  4579
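The combined Ch_HA column in the table above could be built along the following lines. This is a hedged sketch on an invented toy data frame, not the author's code: the column names come from the report, but the rows are made up.

```r
# Toy stand-in for the three selected columns (values are made up).
toy <- data.frame(
  genhlth  = c("Excellent", "Good", "Poor", "Good"),
  toldhi2  = c("No", "Yes", "Yes", NA),
  cvdinfr4 = c("No", "No", "Yes", "No"),
  stringsAsFactors = FALSE
)

# Remove rows with a missing answer in any of the three columns.
toy <- toy[complete.cases(toy), ]

# Combine the two diagnoses: cholesterol first, heart attack second.
toy$Ch_HA <- paste(toy$toldhi2, toy$cvdinfr4, sep = "_")
toy$Ch_HA
```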

I was not satisfied with the above plot, so I decided to explore the data again and created a stacked bar chart by percent.

We can clearly see that interviewees’ opinion of their health corresponds consistently with their diagnoses: the better the self-rated health, the higher the share diagnosed with neither high cholesterol nor a heart attack. Conversely, the worse the self-rated health, the higher the share diagnosed with both.

Without doing an in-depth analysis, this graph is very telling. People seem to be somewhat aware of their own health.
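The pattern described for research question 2 can be recomputed from the count table. The sketch below, in base R, turns the counts into within-category shares. The matrix layout and object names are assumptions made for illustration.

```r
# Counts copied from the Health / Ch_HA summary table.
health <- c("Excellent", "Very good", "Good", "Fair", "Poor")
counts <- rbind(
  No_No   = c(51752, 81654, 63254, 22152, 7889),
  No_Yes  = c(471, 1478, 2629, 2222, 1444),
  Yes_No  = c(18439, 50502, 55098, 27334, 10770),
  Yes_Yes = c(589, 2590, 5985, 5705, 4579)
)
colnames(counts) <- health

# Share of each diagnosis combination within each health category.
shares <- prop.table(counts, margin = 2)

# Percent with neither diagnosis and with both diagnoses.
round(100 * shares[c("No_No", "Yes_Yes"), ], 1)
```

Reading across from Excellent to Poor, the No_No share falls at every step and the Yes_Yes share rises at every step.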

Research question 3:

Is this relationship of Duration of Sleep and Diabetes consistent in the BRFSS dataset?

I first selected only the two columns I am observing, sleptim1 and diabete3. Here is the summary.

    sleptim1                                             diabete3     
 Min.   :  0.000   Yes                                       : 62363  
 1st Qu.:  6.000   Yes, but female told only during pregnancy:  4602  
 Median :  7.000   No                                        :415374  
 Mean   :  7.052   No, pre-diabetes or borderline diabetes   :  8604  
 3rd Qu.:  8.000   NA's                                      :   832  
 Max.   :450.000                                                      
 NA's   :7387                                                         

The 7,387 NAs in sleptim1 come from Don’t know/Not Sure and Refused responses, and the 832 NAs in diabete3 come from Don’t know/Not Sure, Refused, and [Missing] responses. I removed these entries. There was also one interview each where the recorded entry for sleep was 103 and 450 hours. Both of these observations were eliminated, since a day has only 24 hours. The columns were renamed.

   Sleep                                   Diabetes  Count
1      1                                        Yes     49
2      1 Yes, but female told only during pregnancy      1
3      1                                         No    170
4      1    No, pre-diabetes or borderline diabetes      5
5      2                                        Yes    251
6      2 Yes, but female told only during pregnancy     14
7      2                                         No    766
8      2    No, pre-diabetes or borderline diabetes     34
9      3                                        Yes    749
10     3 Yes, but female told only during pregnancy     42
11     3                                         No   2588
12     3    No, pre-diabetes or borderline diabetes    102
13     4                                        Yes   2783
14     4 Yes, but female told only during pregnancy    154
15     4                                         No  10908
16     4    No, pre-diabetes or borderline diabetes    374
17     5                                        Yes   5272
18     5 Yes, but female told only during pregnancy    355
19     5                                         No  27045
20     5    No, pre-diabetes or borderline diabetes    686
21     6                                        Yes  13282
22     6 Yes, but female told only during pregnancy   1115
23     6                                         No  89692
24     6    No, pre-diabetes or borderline diabetes   1948
25     7                                        Yes  13574
26     7 Yes, but female told only during pregnancy   1322
27     7                                         No 125286
28     7    No, pre-diabetes or borderline diabetes   2114
29     8                                        Yes  17440
30     8 Yes, but female told only during pregnancy   1182
31     8                                         No 120013
32     8    No, pre-diabetes or borderline diabetes   2279
33     9                                        Yes   3379
34     9 Yes, but female told only during pregnancy    203
35     9                                         No  19754
36     9    No, pre-diabetes or borderline diabetes    433
37    10                                        Yes   2512
38    10 Yes, but female told only during pregnancy     88
39    10                                         No   9185
40    10    No, pre-diabetes or borderline diabetes    285
41    11                                        Yes    160
42    11 Yes, but female told only during pregnancy     12
43    11                                         No    634
44    11    No, pre-diabetes or borderline diabetes     25
45    12                                        Yes    887
46    12 Yes, but female told only during pregnancy     30
47    12                                         No   2671
48    12    No, pre-diabetes or borderline diabetes     80
49    13                                        Yes     33
50    13 Yes, but female told only during pregnancy      3
51    13                                         No    163
52    14                                        Yes    107
53    14 Yes, but female told only during pregnancy      6
54    14                                         No    321
55    14    No, pre-diabetes or borderline diabetes     10
56    15                                        Yes    113
57    15 Yes, but female told only during pregnancy      2
58    15                                         No    241
59    15    No, pre-diabetes or borderline diabetes     10
60    16                                        Yes     92
61    16 Yes, but female told only during pregnancy      2
62    16                                         No    266
63    16    No, pre-diabetes or borderline diabetes      8
64    17                                        Yes      5
65    17 Yes, but female told only during pregnancy      1
66    17                                         No     28
67    17    No, pre-diabetes or borderline diabetes      1
68    18                                        Yes     46
69    18 Yes, but female told only during pregnancy      2
70    18                                         No    111
71    18    No, pre-diabetes or borderline diabetes      5
72    19                                        Yes      4
73    19                                         No      9
74    20                                        Yes     23
75    20 Yes, but female told only during pregnancy      1
76    20                                         No     37
77    20    No, pre-diabetes or borderline diabetes      3
78    21                                        Yes      2
79    21                                         No      1
80    22                                        Yes      4
81    22                                         No      6
82    23                                        Yes      1
83    23                                         No      2
84    23    No, pre-diabetes or borderline diabetes      1
85    24                                        Yes      9
86    24 Yes, but female told only during pregnancy      1
87    24                                         No     24
88    24    No, pre-diabetes or borderline diabetes      1

       Sleep                                   Diabetes  Count
1  5 or Less                                        Yes   9104
2  5 or Less Yes, but female told only during pregnancy    566
3  5 or Less                                         No  41477
4  5 or Less    No, pre-diabetes or borderline diabetes   1201
5          6                                        Yes  13282
6          6 Yes, but female told only during pregnancy   1115
7          6                                         No  89692
8          6    No, pre-diabetes or borderline diabetes   1948
9          7                                        Yes  13574
10         7 Yes, but female told only during pregnancy   1322
11         7                                         No 125286
12         7    No, pre-diabetes or borderline diabetes   2114
13         8                                        Yes  17440
14         8 Yes, but female told only during pregnancy   1182
15         8                                         No 120013
16         8    No, pre-diabetes or borderline diabetes   2279
17 9 or More                                        Yes   7377
18 9 or More Yes, but female told only during pregnancy    351
19 9 or More                                         No  33453
20 9 or More    No, pre-diabetes or borderline diabetes    862

Upon inspection of the first summary above, one can clearly see that the vast majority of interviewees reported 6-8 hours of sleep. Therefore, I grouped 1-5 hours as “5 or Less” and 9-24 hours as “9 or More”. The second output above is the resulting modified summary.
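The outlier removal and the grouping of sleep hours can be sketched as follows. This is a minimal base R illustration on a made-up vector of reported hours, not the author's code. The cut-offs (drop values above 24, group 1-5 and 9-24) are the ones stated above.

```r
# Made-up reported hours of sleep, including the two impossible values.
sleep_hours <- c(1, 5, 6, 7, 8, 9, 24, 103, 450)

# Drop entries that exceed the 24 hours in a day.
sleep_hours <- sleep_hours[sleep_hours <= 24]

# Group the tails; keep 6, 7 and 8 hours as their own categories.
sleep_group <- ifelse(sleep_hours <= 5, "5 or Less",
               ifelse(sleep_hours >= 9, "9 or More",
                      as.character(sleep_hours)))

# Fix the category order for tables and plots.
sleep_group <- factor(sleep_group,
                      levels = c("5 or Less", "6", "7", "8", "9 or More"))
table(sleep_group)
```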

I created two plots below, corresponding to the two summaries. On the left is the unmodified data, by percent. On the right is the modified data, with only five categories for hours of sleep, in absolute number of interviews.

From the graph on the left, one can see that the lowest rate of diabetes corresponds to those who reported 7 hours of sleep. Coincidentally, this is also the most commonly reported number of hours. After 7 hours, 6 and 8 hours of sleep have the next-lowest rates of diabetes and are also the next most common. One can see a sharp increase in the rate of diabetes for those who reported 5 hours or less. The trend for 9 or more hours follows no obvious pattern, but overall it seems that too much sleep (or being sedentary) is worse than too little sleep.

It is important to note that “9 or More” contains the smallest number of interviews, spread across the widest range of hours of sleep.

Without doing any specific, in-depth analysis, the preliminary exploratory plots here, comparing sleep to diabetes, agree with the common advice of 7-8 hours of sleep a night for adults.
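The diabetes rates discussed for research question 3 can be recomputed from the grouped summary table. The base R sketch below uses the counts from that table. The row labels are shortened for illustration, and only the plain "Yes" row is treated as a diabetes diagnosis, which is an assumption of this sketch.

```r
# Counts copied from the grouped Sleep / Diabetes summary table.
sleep <- c("5 or Less", "6", "7", "8", "9 or More")
counts <- rbind(
  Yes            = c(9104, 13282, 13574, 17440, 7377),
  Yes_pregnancy  = c(566, 1115, 1322, 1182, 351),
  No             = c(41477, 89692, 125286, 120013, 33453),
  No_prediabetes = c(1201, 1948, 2114, 2279, 862)
)
colnames(counts) <- sleep

# Share of interviews in each sleep group that answered "Yes".
rate <- counts["Yes", ] / colSums(counts)
round(100 * rate, 1)

# Sleep group with the lowest diabetes rate.
names(which.min(rate))
```

In this grouped view the 7-hour group has the lowest rate, with 6 and 8 hours next, and both tail groups noticeably higher.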


Appendix

All code for each research question is included here.

Research question 1 :

Plot : Number of Household Interviews by Children and Month

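#Packages used by the code in this appendix.
library(ggplot2)   #Plotting.
library(dplyr)     #filter(), mutate(), select(), %>%.
library(tidyr)     #spread().
library(scales)    #comma and percent label formatters.
library(cowplot)   #ggdraw(), draw_label(), plot_grid().
library(gridExtra) #grid.arrange().

#Helper Function to create a two line plot of completed and partially completed interviews based on number
#of children entered.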
plot_Q1 <- function(df, children, boolean) {
  g <- ggplot(df %>% filter(Children == children), aes(x = Month, y = Count, color = Disposition))
  g <- g + geom_point(size = 5) + geom_line(aes(group = Disposition), size = 1.2)
  #Viridis Color Scheme.
  g <- g + scale_color_viridis_d()
  #Boolean is FALSE for the first 4 graphs.  No X-axis labels and no legend. The last graph defines those.
  if (boolean == FALSE) {
    #Y-axis.
    g <- g + scale_y_continuous(name = paste(children, "Children"),
                              labels = comma)
    #Modify labels and text.
    g <- g + theme(axis.text.x = element_blank(),
                   axis.title.x = element_blank(),
                   axis.title.y = element_text(face = "bold"),
                   legend.position = "none")
  } else {
    #Y-axis.
    g <- g + scale_y_continuous(name = children,
                                labels = comma)
    #Modify labels and text.
    g <- g + theme(axis.text.x = element_text(hjust = 1, size = 12, angle = 45),
                   axis.title.x = element_blank(),
                   axis.title.y = element_text(face = "bold"),
                   legend.title = element_text(face = "bold"),
                   legend.key = element_blank(),
                   legend.position = "bottom")
  }
  #Return the finished plot explicitly.
  g
}
#Main Title.
title <- ggdraw() +
    draw_label("Number of Household Interviews by Children and Month",
               fontface = 'bold', hjust = 0.45, size = 16)

#Alignment of 5 plots.
graphs <- plot_grid(plot_Q1(month_Code_Children, 0, FALSE),
                    plot_Q1(month_Code_Children, 1, FALSE),
                    plot_Q1(month_Code_Children, 2, FALSE),
                    plot_Q1(month_Code_Children, 3, FALSE),
                    plot_Q1(month_Code_Children, "4 or More", TRUE),
                    align = "v", nrow = 5,
                    rel_heights = c(2/13, 2/13, 2/13, 2/13, 5/13))
#Add Title.  
plot_grid(title, graphs, ncol = 1, rel_heights = c(.05, .95))

#Spread Disposition into two new columns, values are the Counts.
month_Code_Children <- month_Code_Children %>%
  spread(key = Disposition, value = Count)
#Two new columns are Yes (Completed) and No (Partially Completed)
colnames(month_Code_Children)  <- c("Month", "Children", "Yes", "No")
#Create a new column Percent from Yes and No.
month_Code_Children <- month_Code_Children %>%
  mutate(Percent = Yes / (Yes + No)) %>%
  select(1,2,5)

#% Completion grouped by Month and Children Summary.
month_Code_Children

Plot : % of Completed Interviews by Children and Month

Research question 2 :

Plot : Cholesterol and Heart Attack Diagnoses vs Opinion of Health

Research question 3 :

Plot : Comparing Quantity of Sleep and Diabetes Diagnosis

#Stacked Barchart, total = 100%.
g5 <- ggplot(sleep_Diabetes1, aes(x = Sleep, y = Count, fill = Diabetes))
g5 <- g5 + geom_bar(position = "fill", stat = "identity")
#No title, but leave space at top.
g5 <- g5 + ggtitle(label = "")
#X-axis.
g5 <- g5 + scale_x_continuous(name = "",
                              breaks = c(6, 12, 18, 24),
                              labels = c("6", "12", "18", "24"),
                              expand = c(0, 0))
#Y-axis.
g5 <- g5 + scale_y_continuous(name = "Percent of Interviews",
                              labels = percent,
                              expand = c(0, 0))
#Viridis Fill Scheme.
g5 <- g5 + scale_fill_viridis_d()
#Modify labels and text. Remove legend from left plot.
g5 <- g5 + theme(plot.title = element_text(hjust = 1, size = 16, face = "bold"),
                 axis.text.x = element_text(size = 12),
                 axis.title.x = element_text(size = 14, face = "bold"),
                 axis.text.y = element_text(size = 12),
                 axis.title.y = element_text(size = 14, face = "bold"),
                 legend.position = "none")

#Stacked Barchart, total = absolute count.
g6 <- ggplot(sleep_Diabetes2, aes(x = Sleep, y = Count, fill = Diabetes))
g6 <- g6 + geom_bar(position = "stack", stat = "identity")
#Shared Title.
g6 <- g6 + ggtitle(label = "Comparing Quantity of Sleep and Diabetes Diagnosis")
#X-axis.
g6 <- g6 + scale_x_discrete(name = "Reported Hours of Sleep Each Night",
                            expand = c(0, 0))
#Y-axis.
g6 <- g6 + scale_y_continuous(name = "Number of Interviews",
                              position = "right",
                              labels = comma,
                              expand = c(0, 0))
#Viridis Fill Scheme.
g6 <- g6 + scale_fill_viridis_d()
#Modify labels and text.
g6 <- g6 + theme(plot.title = element_text(hjust = 1.1, size = 16, face = "bold"),
                 axis.text.x = element_text(size = 12),
                 axis.title.x = element_text(hjust = 3, size = 14, face = "bold"),
                 axis.text.y = element_text(size = 12),
                 axis.title.y = element_text(size = 14, face = "bold"))
#Modify legend orientation.
g6 <- g6 + guides(fill = guide_legend(title = "Diabetes Diagnosis",
                                      title.position = "left",
                                      nrow = 2, byrow = TRUE))
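#Modify legend text.  Save as object.
#g_legend() is a helper that extracts the legend from a plot; its definition is not included in this appendix.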
legend3 <- g_legend(g6 + theme(legend.title = element_text(size = 12, face = "bold"),
                               legend.text = element_text(size = 10),
                               legend.position = "right"))
#Remove legend from right plot.
g6 <- g6 + theme(legend.position = "none")
#Arrange left plot, right plot, and legend on the bottom.
grid.arrange(g5, g6, legend3, layout_matrix = matrix(rbind(c(1, 1, 1, 1, 1, 1, NA, 2, 2, 2, 2, 2, 2),
                                                           c(1, 1, 1, 1, 1, 1, NA, 2, 2, 2, 2, 2, 2),
                                                           c(1, 1, 1, 1, 1, 1, NA, 2, 2, 2, 2, 2, 2),
                                                           c(1, 1, 1, 1, 1, 1, NA, 2, 2, 2, 2, 2, 2),
                                                           c(1, 1, 1, 1, 1, 1, NA, 2, 2, 2, 2, 2, 2),
                                                           c(NA, NA, NA, NA, NA, 3, 3, 3, NA, NA, NA, NA, NA)),
                                                     ncol = 13))