Introduction

In this script, we will analyze correlation between ordination axes values and foods.


Load functions and packages

Name the path to DietR directory where input files are pulled.

main_wd <- "~/GitHub/DietR"

Load necessary packages.

library(ggplot2)

Load the necessary functions.

source("lib/specify_data_dir.R")
source("lib/corr.axes.foods.R")
source("lib/ggplot2themes.R")

Load the distinct 100 colors for use.

distinct100colors <- readRDS("lib/distinct100colors.rda")

You can come back to the main directory by:

setwd(main_wd)

Specify the directory where the data is.

SpecifyDataDirectory(directory.name = "eg_data/NHANES/Laboratory_data/Ordination/")


Weighted unifrac distance ordination results

From sorted IFC table, generate a table of total amount of food consumed by all the individuals, and a table with correlation coefficients, p-values, and q-values with a desired threshold between food items and Axes that were saved in the ordination section. This function will print out the food items’ names while calculating the correlation measures.

Be careful not to confuse WEIGHTED and UNweighted unifrac distances as you name the files.

CorrAxesFood(food_ifc_soted = "../Foodtree/Food_D12_FC_QC_demo_QCed_males60to79_3Lv.food.ifc_sorted.txt", 
             AmountSums_out_fn =          "Food_D12_FC_QC_demo_QCed_males60to79_3Lv_AmountSums.txt",
             qval_threshold = 0.05,
             meta_users =            "Food_D12_FC_QC_demo_QCed_males60to79_3Lv_ord_WEIGHTED_meta_users.txt",
             corr_axes_foods_outfn = "Food_D12_FC_QC_demo_QCed_males60to79_3Lv_ord_WEIGHTED_corr_axes_foods_thr0.05.txt")
Argument Description
food_ifc_soted xxx.food.ifc.sorted.txt file, saved in the ordination section
AmountSums_out_fn Output filename to be saved which has the total consumption amount of each food
qval_threshold q-value threshold to call a correlation significant
meta_users xxx.meta_users.txt file, waved in the ordination section
corr_axes_foods_outfn Output filename to be saved which has the correlation between foods and Axes



Load and analyze the output

dat <- read.delim("Food_D12_FC_QC_demo_QCed_males60to79_3Lv_ord_WEIGHTED_corr_axes_foods_thr0.05.txt")

Check the number of food items with significant q-values.

nrow(subset(dat, Significance=="*"))
## [1] 31

There are 31 food items correlated with particular food items.

Show only food items that are significantly correlated with one of the axes.

subset(dat, Significance=="*")
##                                              Food    Axis correlation
## 1                                       Water tap  Axis.1  -0.8487995
## 2                       Water bottled unsweetened  Axis.1   0.8097436
## 3                                   Coffee brewed  Axis.3  -0.7107605
## 4                                 Soft drink cola  Axis.5  -0.5557616
## 5                                   Coffee brewed  Axis.2   0.5292725
## 6                                      Beer light  Axis.6   0.4848544
## 7                                      Milk whole  Axis.2   0.4815899
## 8                             Milk reduced fat 2   Axis.2   0.4609677
## 9                                      Beer light  Axis.5   0.4542099
## 10                                  Coffee brewed  Axis.4   0.4453274
## 11                                Soft drink cola  Axis.4  -0.3908675
## 12                                Soft drink cola  Axis.6   0.3809187
## 13                      Water bottled unsweetened  Axis.4   0.3594993
## 14                                           Beer  Axis.7   0.3520281
## 15 Ice cream regular flavors other than chocolate Axis.10  -0.3476261
## 16                            Milk reduced fat 2   Axis.9  -0.3410168
## 17                            Milk fat free skim   Axis.8   0.3205856
## 18 Tea iced brewed black pre sweetened with sugar  Axis.6  -0.3165401
## 19                                     Banana raw  Axis.7  -0.3154898
## 20                                     Milk whole Axis.10  -0.3105564
## 21                                           Beer  Axis.9  -0.3065722
## 22                            Milk fat free skim   Axis.5  -0.2857583
## 23                                Soft drink cola  Axis.8  -0.2854799
## 24                                      Water tap  Axis.3   0.2784664
## 25                                      Water tap  Axis.4   0.2763216
## 26                           Soft drink cola diet  Axis.9  -0.2737740
## 27                            Milk reduced fat 2   Axis.3   0.2719744
## 28 Orange juice 100 canned bottled or in a carton Axis.10  -0.2654601
## 29              Tea iced brewed black unsweetened  Axis.6  -0.2622820
## 30                                Milk low fat 1   Axis.9   0.2607144
## 31                            Milk fat free skim   Axis.7  -0.2610995
##            pval         qval Significance
## 1  1.049973e-66 1.761854e-62            *
## 2  4.280100e-56 3.591004e-52            *
## 3  1.299079e-37 7.266184e-34            *
## 4  1.571715e-20 6.593345e-17            *
## 5  1.938004e-18 6.503943e-15            *
## 6  2.560240e-15 7.160137e-12            *
## 7  4.171242e-15 9.999063e-12            *
## 8  8.100359e-14 1.699050e-10            *
## 9  2.051963e-13 3.825770e-10            *
## 10 6.751299e-13 1.132868e-09            *
## 11 4.907704e-10 7.486480e-07            *
## 12 1.445241e-09 2.020928e-06            *
## 13 1.311234e-08 1.692501e-05            *
## 14 2.726032e-08 3.267344e-05            *
## 15 4.158947e-08 4.652476e-05            *
## 16 7.747876e-08 8.125585e-05            *
## 17 4.848080e-07 4.785340e-04            *
## 18 6.862093e-07 6.396995e-04            *
## 19 7.503549e-07 6.626818e-04            *
## 20 1.136569e-06 9.535811e-04            *
## 21 1.580760e-06 1.263103e-03            *
## 22 8.198166e-06 6.108955e-03            *
## 23 8.373419e-06 6.108955e-03            *
## 24 1.416126e-05 9.901082e-03            *
## 25 1.658287e-05 1.113042e-02            *
## 26 1.996896e-05 1.288766e-02            *
## 27 2.274451e-05 1.413529e-02            *
## 28 3.615393e-05 2.166653e-02            *
## 29 4.513158e-05 2.611407e-02            *
## 30 5.029718e-05 2.722538e-02            *
## 31 4.897894e-05 2.722538e-02            *

It is also possible to view each axis separately.

# Select Axis 1 rows
dat_1 <- subset(dat, Axis=="Axis.1")
head(dat_1[order(dat_1$qval), ], 10)
##                            Food   Axis correlation         pval         qval
## 1                     Water tap Axis.1 -0.84879953 1.049973e-66 1.761854e-62
## 2     Water bottled unsweetened Axis.1  0.80974358 4.280100e-56 3.591004e-52
## 51                Coffee brewed Axis.1 -0.21150806 1.079057e-03 3.550310e-01
## 62  Egg whole boiled or poached Axis.1 -0.19837925 2.199611e-03 5.953140e-01
## 68                     Milk NFS Axis.1  0.09053595 1.656542e-01 8.020252e-01
## 78          Milk fat free skim  Axis.1 -0.13590236 3.694459e-02 8.020252e-01
## 87    Buttermilk reduced fat 2  Axis.1  0.10484862 1.081434e-01 8.020252e-01
## 90             Buttermilk whole Axis.1  0.10484862 1.081434e-01 8.020252e-01
## 93   Kefir NS as to fat content Axis.1 -0.10772118 9.876618e-02 8.020252e-01
## 109                   Rice milk Axis.1  0.10389110 1.114203e-01 8.020252e-01
##     Significance
## 1              *
## 2              *
## 51              
## 62              
## 68              
## 78              
## 87              
## 90              
## 93              
## 109
# Select Axis 2 rows and sort by qval.  
dat_2 <- subset(dat, Axis=="Axis.2")
head(dat_2[order(dat_2$qval), ], 10)
##                              Food   Axis correlation         pval         qval
## 5                   Coffee brewed Axis.2   0.5292725 1.938004e-18 6.503943e-15
## 7                      Milk whole Axis.2   0.4815899 4.171242e-15 9.999063e-12
## 8             Milk reduced fat 2  Axis.2   0.4609677 8.100359e-14 1.699050e-10
## 40            Milk fat free skim  Axis.2  -0.2317225 3.308411e-04 1.387878e-01
## 45         Tea iced bottled black Axis.2   0.2211288 6.228534e-04 2.322551e-01
## 53 Sugar white granulated or lump Axis.2   0.2063469 1.435065e-03 4.543468e-01
## 58          Coffee creamer powder Axis.2   0.2008213 1.932983e-03 5.592321e-01
## 66                      Water tap Axis.2  -0.1920650 3.051289e-03 7.757670e-01
## 85          Buttermilk low fat 1  Axis.2   0.1086787 9.578786e-02 8.020252e-01
## 98                       Soy milk Axis.2  -0.0818681 2.101663e-01 8.020252e-01
##    Significance
## 5             *
## 7             *
## 8             *
## 40             
## 45             
## 53             
## 58             
## 66             
## 85             
## 98
# Select Axis 3 rows and sort by qval.  
dat_3 <- subset(dat, Axis=="Axis.3")
head(dat_3[order(dat_3$qval), ], 10)
##                               Food   Axis correlation         pval         qval
## 3                    Coffee brewed Axis.3 -0.71076052 1.299079e-37 7.266184e-34
## 24                       Water tap Axis.3  0.27846636 1.416126e-05 9.901082e-03
## 27             Milk reduced fat 2  Axis.3  0.27197439 2.274451e-05 1.413529e-02
## 47                       Apple raw Axis.3  0.22009987 6.613055e-04 2.361001e-01
## 50       Water bottled unsweetened Axis.3  0.21549415 8.618017e-04 2.896430e-01
## 70                      Milk whole Axis.3  0.08554006 1.903598e-01 8.020252e-01
## 75                 Milk low fat 1  Axis.3 -0.10695949 1.011874e-01 8.020252e-01
## 117      Yogurt low fat milk plain Axis.3  0.10484862 1.081434e-01 8.020252e-01
## 124 Yogurt Greek nonfat milk plain Axis.3  0.08761323 1.797989e-01 8.020252e-01
## 144 Yogurt Greek nonfat milk fruit Axis.3  0.08761323 1.797989e-01 8.020252e-01
##     Significance
## 3              *
## 24             *
## 27             *
## 47              
## 50              
## 70              
## 75              
## 117             
## 124             
## 144
# Select Axis 4 rows and sort by qval.  
dat_4 <- subset(dat, Axis=="Axis.4")
head(dat_4[order(dat_4$qval), ], 10)
##                                Food   Axis correlation         pval
## 10                    Coffee brewed Axis.4  0.44532739 6.751299e-13
## 11                  Soft drink cola Axis.4 -0.39086753 4.907704e-10
## 13        Water bottled unsweetened Axis.4  0.35949933 1.311234e-08
## 25                        Water tap Axis.4  0.27632163 1.658287e-05
## 36                       Beer light Axis.4 -0.24342502 1.589345e-04
## 65                   Muffin English Axis.4  0.19229201 3.016107e-03
## 71                       Milk whole Axis.4 -0.09027923 1.668623e-01
## 94       Kefir NS as to fat content Axis.4  0.08378314 1.996592e-01
## 119 Yogurt Greek low fat milk plain Axis.4 -0.07708049 2.381587e-01
## 129       Yogurt low fat milk fruit Axis.4  0.08509615 1.926789e-01
##             qval Significance
## 10  1.132868e-09            *
## 11  7.486480e-07            *
## 13  1.692501e-05            *
## 25  1.113042e-02            *
## 36  7.408112e-02             
## 65  7.757670e-01             
## 71  8.020252e-01             
## 94  8.020252e-01             
## 119 8.020252e-01             
## 129 8.020252e-01


Unweighted unifrac distance ordination results

xxx_AmountSums.txt will be generated again, but its content will be the same regardless of which distance method (weighted or unweighted unifrac or else) was used, as long as the food.ifc_sorted is the same.

CorrAxesFood(food_ifc_soted = "../Foodtree/Food_D12_FC_QC_demo_QCed_males60to79_3Lv.food.ifc_sorted.txt",
             AmountSums_out_fn = "Food_D12_FC_QC_demo_QCed_males60to79_3Lv_AmountSums.txt",
             qval_threshold = 0.05,
             meta_users =            "Food_D12_FC_QC_demo_QCed_males60to79_3Lv_ord_UNweighted_meta_users.txt",
             corr_axes_foods_outfn = "Food_D12_FC_QC_demo_QCed_males60to79_3Lv_ord_UNweighted_corr_axes_foods_thr0.05.txt")



Load and analyze the output

UNweighted can be viewed in the same way.

dat <- read.delim("Food_D12_FC_QC_demo_QCed_males60to79_3Lv_ord_UNweighted_corr_axes_foods_thr0.05.txt")

Check the number of food items with significant q-values.

nrow(subset(dat, Significance=="*"))
## [1] 78

There are as many as 78 food items at alpha=0.05, so it will be probably better to look at most significant food items or a particular Principal Coordinate Axis of your interest.

Show the first 10 food items that are significantly correlated with one of the axes.

head(subset(dat, Significance=="*"), 10)
##                                              Food    Axis correlation
## 1                       Water bottled unsweetened  Axis.3   0.7070852
## 2                                    Tomatoes raw  Axis.4  -0.6803295
## 3  Ice cream regular flavors other than chocolate  Axis.9   0.5728066
## 4                                     Lettuce raw  Axis.4  -0.5633409
## 5                  Sugar white granulated or lump  Axis.2   0.5475776
## 6                             Milk reduced fat 2   Axis.1  -0.5377079
## 7                                 Cheese American  Axis.2   0.5362031
## 8                                   Coffee brewed  Axis.3  -0.5349615
## 9                                      Milk whole  Axis.1  -0.5293084
## 10          Ham prepackaged or deli luncheon meat Axis.10  -0.5131655
##            pval         qval Significance
## 1  4.448087e-37 7.463889e-33            *
## 2  2.010864e-33 1.687115e-29            *
## 3  5.613733e-22 3.139948e-18            *
## 4  3.657487e-21 1.534316e-17            *
## 5  7.281465e-20 2.443660e-16            *
## 6  4.379006e-19 1.224662e-15            *
## 7  5.727158e-19 1.372882e-15            *
## 8  7.139692e-19 1.497550e-15            *
## 9  1.925940e-18 3.590808e-15            *
## 10 2.964411e-17 4.974282e-14            *



Come back to the main directory.

setwd(main_wd)