mschecht
5/8/2018 - 5:32 PM

subsample_ord

## 7. Permuted subsampling of larger matrices

Now let's subsample the "all" and "knowns" matrices so that each has the same number of components (variables) as the "unknowns" matrix.

### Matrix subsampling function
```{r}
phyloseq_subsample <- function(phyloseq_obj) {
  # Note: relies on `unk_physeq` and `contex` already existing in the global environment
  subsample_size <- ntaxa(unk_physeq) # number of variables (taxa) in the unknowns matrix
  otu_mat <- phyloseq:::veganifyOTU(phyloseq_obj) # pull out the samples x taxa matrix from the phyloseq object
  sub <- sample(x = seq_len(ncol(otu_mat)), size = subsample_size, replace = FALSE) # column indices subsampled from the larger matrix
  otu_sub <- otu_mat[, sub, drop = FALSE] # subset the matrix, keeping it a matrix
  new_phyloseq <- phyloseq(otu_table(otu_sub, taxa_are_rows = FALSE), sample_data(contex)) # remake the phyloseq object
  ord <- ordinate(new_phyloseq, method = "RDA") # unconstrained RDA, i.e. PCA
  PCs <- ord$CA$eig %>% as.matrix() # extract the eigenvalues of the principal components
  PC1 <- PCs[1] # eigenvalue of PC1
  return(PC1)
}
```
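
To make the subsampling step easier to follow without the phyloseq objects, here is a minimal base R sketch of the same idea. It assumes a toy samples x taxa count matrix (`toy_mat`, made up here) and a hypothetical helper `subsample_pc1()`, and it uses `prcomp()` as a stand-in for the unconstrained RDA above, so it is an illustration rather than the function used in this analysis.

```r
# Toy stand-in for a samples x taxa abundance matrix (hypothetical data)
set.seed(42)
toy_mat <- matrix(rpois(20 * 50, lambda = 5), nrow = 20, ncol = 50)

# Hypothetical helper: subsample columns, run a PCA, return the PC1 eigenvalue
subsample_pc1 <- function(mat, subsample_size) {
  # draw column indices without replacement
  keep <- sample(x = seq_len(ncol(mat)), size = subsample_size, replace = FALSE)
  mat_sub <- mat[, keep, drop = FALSE] # keep the result a matrix
  pca <- prcomp(mat_sub, center = TRUE, scale. = FALSE) # PCA on the subsampled columns
  pca$sdev[1]^2 # variance (eigenvalue) of PC1
}

pc1 <- subsample_pc1(toy_mat, subsample_size = 10)
pc1
```

Each call draws a different set of columns, so the returned eigenvalue varies from call to call; that variation is what the permutations below summarize.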

### Permute the "Knowns" and "All" matrices
```{r}
known_sub <- replicate(n = 10, phyloseq_subsample(known_physeq))
all_sub <- replicate(n = 10, phyloseq_subsample(all_physeq))
```
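
Because each replicate calls `sample()`, the values above change on every run. If reproducible results are wanted, one option is to set a seed right before `replicate()`. The sketch below shows the pattern with a hypothetical stand-in function (`draw_stat()`) in place of `phyloseq_subsample()`; the seed value is arbitrary.

```r
# Hypothetical stand-in for phyloseq_subsample(): any function that calls sample()
draw_stat <- function() {
  mean(sample(x = seq_len(100), size = 10, replace = FALSE))
}

set.seed(2018) # arbitrary seed, chosen only for illustration
run_a <- replicate(n = 10, draw_stat())

set.seed(2018) # same seed, so the same subsamples are drawn again
run_b <- replicate(n = 10, draw_stat())

identical(run_a, run_b) # check that both runs match
```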

## Explore permutation: Knowns
```{r}
summary(known_sub)
as_tibble(known_sub) %>%
  gghistogram(x = "value", fill = "lightgray", title = "Knowns")
```

## Explore permutation: All categories
```{r}
summary(all_sub)
as_tibble(all_sub) %>%
  gghistogram(x = "value", fill = "lightgray", title = "All categories")
```