luca2
7/7/2017 - 1:27 PM

Tricks and trivial ways of doing things that I tend to forget, for the most used R packages

glimpse(dt)

dt %>%
    filter(condition) %>%
    select() %>%
    mutate()
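
A filled-in pipeline of this shape, as a minimal sketch. The toy table and its column names (id, grp, val, val2) are made up for illustration.

```r
library(dplyr)

# Toy data; the column names are hypothetical
dt <- data.frame(id  = 1:6,
                 grp = c('a', 'a', 'b', 'b', 'c', 'c'),
                 val = c(10, 20, 30, 40, 50, 60))

res <- dt %>%
    filter(val > 15) %>%        # keep rows matching a condition
    select(id, grp, val) %>%    # keep only the needed columns
    mutate(val2 = val * 2)      # add a derived column

glimpse(res)                    # compact overview of the result
```

Note that the last step must not end with %>%, otherwise R keeps waiting for another expression.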

delete multiple columns at once

cols2del <- c('v1', 'v2', 'v3')
dt[, (cols2del) := NULL]
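
A small runnable sketch of the same idiom on a made-up table (the columns id and keep are only there for illustration):

```r
library(data.table)

# Toy table; column names are hypothetical
dt <- data.table(id = 1:3, v1 = 1:3, v2 = 4:6, v3 = 7:9, keep = letters[1:3])

cols2del <- c('v1', 'v2', 'v3')

# The parentheses make data.table evaluate cols2del as a vector of column
# names instead of treating it as a column literally named "cols2del"
dt[, (cols2del) := NULL]

names(dt)   # "id" "keep"
```

The deletion happens by reference, so there is no need to assign the result back to dt.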

summarizing multiple columns

  • sum all columns
    dt[, lapply(.SD, sum, na.rm = TRUE), .SDcols = names(dt)]
    
  • sum all columns except the id:
    dt[, lapply(.SD, sum, na.rm = TRUE), .SDcols = setdiff(names(dt), 'id')]
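
    A quick sketch on a toy table (column names id, grp, a, b are assumptions), also showing the same idiom per group:

```r
library(data.table)

# Toy table; column names are hypothetical
dt <- data.table(id  = 1:3,
                 grp = c('x', 'x', 'y'),
                 a   = c(1, NA, 3),
                 b   = c(10, 20, 30))

# Columns to summarize: everything except the identifiers
value_cols <- setdiff(names(dt), c('id', 'grp'))

# One row with the column totals, ignoring NAs
totals <- dt[, lapply(.SD, sum, na.rm = TRUE), .SDcols = value_cols]

# Same summary, one row per group
by_grp <- dt[, lapply(.SD, sum, na.rm = TRUE), by = grp, .SDcols = value_cols]
```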
    

count number of NA in dt (or whatever other condition)

  • only one column: dt[, sum(is.na(var))]
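  • some columns: dt[, lapply(.SD, function(x) sum(is.na(x))), .SDcols = c('v1', 'v2')]
  • all table: dt[, lapply(.SD, function(x) sum(is.na(x)))] per column, or sum(is.na(dt)) for the grand total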
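
A minimal sketch of counting NAs at the three levels, on a made-up table (v1 and v2 are hypothetical column names). Any other condition can be swapped in for is.na().

```r
library(data.table)

# Toy table; column names are hypothetical
dt <- data.table(id = 1:4,
                 v1 = c(1, NA, 3, NA),
                 v2 = c(NA, 'b', 'c', 'd'))

# Single column
n_one <- dt[, sum(is.na(v1))]

# Selected columns, one count per column
n_some <- dt[, lapply(.SD, function(x) sum(is.na(x))), .SDcols = c('v1', 'v2')]

# Whole table, grand total
n_all <- sum(is.na(dt))
```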

eliminate duplicates keeping only the first (last) given the order

  • dt[order(V1, V2)][, .SD[1], id]
  • dt[order(V1, V2)][, .SD[.N], id]
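
A runnable sketch on toy data (values are made up). It also shows unique() with by, which should give the same rows and is an alternative rather than the method used above.

```r
library(data.table)

# Toy table with duplicated ids
dt <- data.table(id = c(1, 1, 2, 2, 2),
                 V1 = c(2, 1, 5, 3, 4),
                 V2 = c('b', 'a', 'e', 'c', 'd'))

# Keep the first / last row of each id after sorting by V1, V2
first_rows <- dt[order(V1, V2)][, .SD[1], id]
last_rows  <- dt[order(V1, V2)][, .SD[.N], id]

# Alternative: unique() keeps the first (or last) row per id
first_alt <- unique(dt[order(V1, V2)], by = 'id')
last_alt  <- unique(dt[order(V1, V2)], by = 'id', fromLast = TRUE)
```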

select first n rows within group

There are different methods:

  • dt[, head(.SD, n), grp]
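  • dt[, .SD[1:n], grp] (groups with fewer than n rows get padded with NA rows)
  • dt[dt[, .I[1:n], grp]$V1] is the fastest, despite being a bit weird to write down; use head(.I, n) instead of .I[1:n] if some groups have fewer than n rows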
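
A small sketch comparing the variants on a toy table where one group is shorter than n (grp and val are made-up names). No timings are included, so the speed claim is not checked here.

```r
library(data.table)

# Toy table: group 'a' has three rows, group 'b' only one
dt <- data.table(grp = c('a', 'a', 'a', 'b'), val = 1:4)
n <- 2

# head() stops at the end of a short group
r1 <- dt[, head(.SD, n), grp]

# 1:n indexes past the end of short groups, which yields NA rows
r2 <- dt[, .SD[1:n], grp]

# Row-index variant: collect the row numbers per group, then subset once.
# head(.I, n) is used here instead of .I[1:n] to avoid NA indices.
r3 <- dt[dt[, head(.I, n), grp]$V1]
```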

View all 35 RColorBrewer palettes, grouped by their main use:

display.brewer.all()

Uses of palettes

  • Sequential
    Suited to ordered data that progress from low to high. Lightness steps dominate the look of these schemes, with light colors for low data values to dark colors for high data values.

  • Qualitative
    Best suited to nominal or categorical data. These schemes do not imply magnitude differences between legend classes; hues create the primary visual differences between classes.

  • Diverging
    Put equal emphasis on mid-range critical values and extremes at both ends of the data range. The critical class or break in the middle of the legend is emphasized with light colors, and low and high extremes are emphasized with dark colors that have contrasting hues.
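
A short sketch of picking one palette from each family with RColorBrewer. The palette names 'Blues', 'Set2' and 'RdBu' are just examples chosen here.

```r
library(RColorBrewer)

# Five colors from one palette of each type
seq_cols  <- brewer.pal(5, 'Blues')   # sequential: light to dark
qual_cols <- brewer.pal(5, 'Set2')    # qualitative: distinct hues
div_cols  <- brewer.pal(5, 'RdBu')    # diverging: light middle, dark extremes

# Show only one family of palettes at a time
display.brewer.all(type = 'seq')

# Table of palette names with their category and maximum number of colors
head(brewer.pal.info)
```

brewer.pal() returns a character vector of hex colors, so it can be passed directly to the col argument of base plots or to scale_*_manual() in ggplot2.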