Chapelle and Li (2011) use simulations as well as real data set to demonstrate empirically that Thompson Sampling asymptotically outperforms Upper Confidence Bounds for solving Multi-Armed Bandit (MAB) problems. In the following, I will replicate their findings using synthetic data.
The empirical exercises have been implemented in R. To boost computational performance, bottleneck problems have been implemented in C++ through Rcpp. Where appropriate I use object-oriented programming (S3 classes). Code chunks in this document can be shown or hidden through buttons in their top right corner. All code is contained in the ./R folder.
To start off with, let us create the MAB. As in Chapelle and Li (2011) we will consider a Bernoulli MAB: at each step \(T\) the bandit is faced with the problem of choosing one of \(K\) arms, all of which yield rewards that are generated by a Bernoulli distribution. For simulation purposes, two approaches come to mind:
The first approach is perhaps more conventional in the context of benchmarking, since it provides a fair performance comparison of different algorithms with respect to the same data set. But since we will run the simulation many times over random realizations of the Bernoulli MAB, there is no need to have different algorithms work on the exact same realizations (asymptotically they are working with the same bandit problem).
The second approach is arguably more closely resembles real-world applications of reinforcement learning: at each step \(T\) the bandit chooses an action and only then they observe the reward. It also has computational advantages, some of which are obvious and others that will become clearer below. An obvious benefit of generating rewards on the fly is the smaller memory costs associated with this approach. Storing a \((T \times K)\) matrix consumes a fair bit of working memory and the for the simulations we will need as much of that as possible.
In the end I have therefore opted for the second approach. The mab function below takes basic input parameters concerning the time horizon of the game, the (Bernoulli) probabilities of each arm and their distribution (only implemented for Bernoulli, but easily extended). It returns an S3 class that will be used as the input for the different algorithms below.
#' Multi-Armed Bandit
#'
#' @param prob Bernoulli probabilities.
#' @param horizon Number of trials.
#' @param method Method to use. Only Bernoulli is implemented.
#'
#' @author Patrick Altmeyer
mab <- function(prob, horizon=1000, method="bernoulli") {
# Parameters: ----
K <- length(prob)
v_star <- max(prob)
# Reward generating function: ----
if (method=="bernoulli") {
generate_rewards <- function(prop) {
sapply(
prob,
function(p) {
rbinom(1,1,p)
}
)
}
}
# Output: ----
output <- list(
prob = prob,
generate_rewards = generate_rewards,
v_star = v_star,
K = K,
horizon = horizon,
method = method
)
class(output) <- "mab"
return(output)
}
unpack.mab <- function(mab, envir=parent.env) {
list2env(mab, envir = envir(environment()))
}
unpack <- function(mab, envir=parent.env) {
UseMethod("unpack")
}
The code chunk below shows the implementation of the UCB algorithm in R. This implementation does not make use of Rcpp and is only shown for illustrative purposes. The code is thoroughly commented, so I here only talk to the high-level details. Like all other algorithms that we will see, ucb_R takes an object of class mab as its main input. The optional action_values argument can be used to supply initial values, should we have prior knowledge. By default all action values are initialized as zero. Regret is computed in each period as \(V^*-Q(a_T)\) where \(V^*\) is the optimal choice and \(Q(a_T)\) represents the true reward probability of the arm chosen in period \(T\). In particular, following Chapelle and Li (2011) here we have \(V^*=0.5=Q(a^{(1)})\) and \(Q(a^{(j)})=0.5-\varepsilon\) for all \(j\ne1\), \(j \in [1,K]\). It is easy to see that with this implementation, we can simply take the cumulative sum of the resulting vector of single-period regret to recover the total cumulative regret in period \(T\) as specified in Auer, Cesa-Bianchi, and Fischer (2002), namely
\[ \begin{equation} \begin{aligned} && V^*T- \sum_{j=1}^{K}Q(a^{(j)})N_j(T)&= \\ \end{aligned} \tag{1.1} \end{equation} \]
where \(N_j(T)\) denotes the number of times that arm \(j\) has been chosen up until (and including) time \(T\).
The function outputs an object of class policy, which contains the proposed solution to the MAB. The policy class uses a set of methods (e.g. print and plot) which will be used below.
#' UCB
#'
#' @param mab An object of class `mab`.
#' @param update_every Integer specifying length of interval after which parameters will be updated.
#' @param action_values Initial action value estimates (optional).
#'
#' @author Patrick Altmeyer
ucb_R <- function(
mab,
update_every = 1,
action_values = NULL
) {
# Parameters: ----
unpack(mab) # unpack bandit problem
# Uncertainty implemented according to Chapelle et al. (2011):
uncertainty <- function(action_value,m,delta) {
ifelse(m==0, Inf, sqrt((2 * action_value * log(1/delta)) / m) + (2*log(1/delta)) / m)
}
# If unambiguous, select arm with maximum UCB, else draw uniformly:
select_arm <- function(ucb) {
ifelse(sum(ucb==max(ucb))==1, which.max(ucb), sample(which(ucb==max(ucb)),1))
}
# Initialization: ----
T_ <- 1 # iteration counter
delta <- sqrt(1/T_)
policy <- rep(0, horizon) # vector to keep track of choices
regret <- rep(v_star, horizon) # initialize regret as v_star (maximal true action value)
times_chosen <- rep(0,K) # count of times that actions were chosen
if (is.null(action_values)) {
action_values <- rep(0,K)
}
ucb <- action_values + uncertainty(action_values, times_chosen, delta) # initial upper confidence bounds
# Recursion: ----
while (T_ <= horizon) {
# Select arm:
arm <- select_arm(ucb)
# Observe reward:
r <- generate_rewards(prob)[arm]
# Update choices in T:
policy[T_] <- arm # update policy
regret[T_] <- regret[T_] - prob[arm] # compute regret in this period
times_chosen[arm] <- times_chosen[arm] + 1 # update counter
q <- action_values[arm] # action value at T
m <- times_chosen[arm] # times chose at T
action_values[arm] <- q + 1/m * (r - q) # action value increment
# Update parameters and estimates for T+1:
T_ <- T_ + 1
update_this_round <- T_ %% update_every == 0 # update only if modulus equal to zero
if (update_this_round) {
delta <- sqrt(1/T_)
# action_values[arm] <- q + 1/m * (r - q) # action value increment
ucb[arm] <- action_values[arm] + uncertainty(action_values[arm], m, delta) # UCB increment
}
}
# Output: ----
output <- list(
policy = policy,
regret = regret,
times_chosen = times_chosen,
action_values = action_values,
algo = list(
type = "ucb",
upper_confidence_bounds = ucb
),
mab = mab
)
class(output) <- "policy"
return(output)
}
Before moving on to Thompson sampling, let us run a single game to see how the algorithm performs. The code chunk below generates the game using the mab function introduced above.
eps <- 0.1
K <- 10
T_ <- 10000
prob <- c(0.5,rep(0.5-eps,K-1))
bernoulli_mab <- mab(prob, horizon = T_)
Below we use the Rcpp version of the UCB algorithm to fit our model. For \(T=10^{4}\) it only takes a few seconds to solve the problem. The ucb_Rcpp function is a wrapper function that executes the compiled C++ code under the hood and returns a policy object. The C++ code will not be discussed here but details can be found in the annex. For \(K=10\) and \(\varepsilon=0.1\), the cumulative regret after \(T=10^{4}\) periods is around the same levels as the average cumulative regret for that specification in Chapelle and Li (2011) (Figure 1.1).
set.seed(42)
fitted <- ucb_Rcpp(bernoulli_mab)
plot(fitted)
Figure 1.1: An example run of the UCB algorithm.
The implementation of Thompson sampling is very similar to UCB. Once again commented R code is shown below.
#' Thompson Sampling
#'
#' @param mab An object of class `mab`.
#' @param update_every Integer specifying length of interval after which parameters will be updated.
#' @param successes Initial success counts (optional).
#' @param failures Initial failure counts (optional).
#'
#' @author Patrick Altmeyer
thompson_R <- function(
mab,
update_every = 1,
successes = NULL,
failures = NULL
) {
# Parameters: ----
unpack(mab) # unpack bandit problem
# Function to compute posterior means (action values):
posterior_means <- function(successes, failures, method=method) {
K <- length(successes)
if (method=="bernoulli") {
theta <- rbeta(K,successes,failures)
}
return(theta)
}
# Function to select based on action values:
select_arm <- function(theta) {
arm <- which.max(theta)
return(arm)
}
# Initialization: ----
T_ <- 1 # iteration counter
policy <- rep(0, horizon) # vector to keep track of choices
regret <- rep(v_star, horizon) # initialize regret as v_star (maximal true action value)
times_chosen <- rep(0,K) # count of times that actions were chosen
if (is.null(successes)) {
successes <- rep(1,K) # initialize success count
}
if (is.null(failures)) {
failures <- rep(1,K) # initialize failure count
}
theta <- posterior_means(successes = successes, failures = failures) # initilize action values
# Recursion: ----
while (T_ <= horizon) {
# Select arm:
arm <- select_arm(theta = theta)
# Observe reward:
r <- generate_rewards(prob)[arm]
# Update choices in T:
policy[T_] <- arm # update policy
regret[T_] <- regret[T_] - prob[arm] # compute regret in this period
times_chosen[arm] <- times_chosen[arm] + 1 # update counter
if (r==1) {
successes[arm] <- successes[arm] + 1
} else {
failures[arm] <- failures[arm] + 1
}
# Update parameters and estimates for T+1:
T_ <- T_ + 1
update_this_round <- T_ %% update_every == 0 # update only if modulus equal to zero
if (update_this_round) {
theta <- posterior_means(successes = successes, failures = failures)
}
}
# Output: ----
output <- list(
policy = policy,
regret = regret,
times_chosen = times_chosen,
action_values = theta,
algo = list(
type = "thompson",
successes = successes,
failures = failures
),
mab = mab
)
class(output) <- "policy"
return(output)
}
Let us also look at an example run with Thompson sampling using the Rcpp implementation as before. We will use the same bandit as for the UCB above. As evident from Figure 1.2 the cumulative regret after \(T=10^{4}\) trials is much lower than for UCB. The curve describing the trajectory is also smoother than for UCB, perhaps indicating that when learning from posteriour probabilities, the bandit is less likely to stick with any particular choice throughout multiple consecutive periods.
set.seed(42)
fitted <- thompson_Rcpp(bernoulli_mab)
plot(fitted)
Figure 1.2: An example run of Thompson sampling.
The choice to compile the algorithms in C++ through Rcpp was driven by the fact that we are looking at computationally very intensive simulations.1 This implementation leads to massive performance boosts: Figure 1.3 shows a micro benchmark of the computing times of the different implementations.
library(microbenchmark)
library(ggplot2)
mb <- microbenchmark(
"UCB (R)" = {ucb_R(bernoulli_mab)},
"UCB (Rcpp)" = {ucb_Rcpp(bernoulli_mab)},
"Thompson (R)" = {thompson_R(bernoulli_mab)},
"Thompson (Rcpp)" = {thompson_Rcpp(bernoulli_mab)},
times = 100
)
p <- autoplot(mb)
ggsave("www/mb.png", width = 6, height = 4)
Figure 1.3: Reduction in computing time due to compiling through Rcpp.
First, let us set up a grid for the simulation, which will use both to draw the different parameters (\(K\), \(\varepsilon\)):
library(data.table)
t_0 <- 1
T_ <- 1e7
K <- c(10,100)
eps <- c(0.02,0.1)
algos <- list(
ucb = sim_ucb,
thompson = sim_thompson
)
param_grid <- CJ(K=K,eps=eps,algo=names(algos))
The functions going into the list algos are compiled through Rcpp to also speed up that part of the process. Details can again be found in the annex.
We will also set up a corresponding table to store the results from the simulation and eventually plot them. This involves performing a cross-join across the different parameters and the trials (\(T=[1,10^{4}]\)), which is efficiently implemented in data.table.
simulation <- CJ(K=K,eps=eps,T_=t_0:T_)
The first variable we will add to simulation is the lower bound for the different parameter specifications. The lower bound is implemented according to the formulation in Chapelle and Li (2011):
#' Lower bound
#'
#' @param t Time T.
#' @param prob Action values.
#' @param kl Function to compute KL-Divergence.
#'
#' @author Patrick Altmeyer
lb <- function(t, prob, kl) {
p_star <- max(prob)
log(t) * sum(
sapply(
prob,
function(p) {
ifelse(p_star-p>0,(p_star-p)/kl(p,p_star),0)
}
)
)
}
Next we apply this function for each \(K\) and \(\varepsilon\). Following Chapelle and Li (2011) we will shift the lower bound such that it passes through the origin at \(T=100\). Again these computation can be done very efficiently using data.table:
simulation[,value:=lb(T_,c(0.5,rep(0.5-eps,K-1)),kl_bernoulli) - lb(100,c(0.5,rep(0.5-eps,K-1)),kl_bernoulli),by=.(K,eps)]
simulation[,variable:="lower_bound"]
Now let us finally turn to the simulation exercise. While the code Rcpp code runs fast, running multiple simulations of Bernoulli bandits with \(K=100\) arms over \(T=10^7\) trials is computationally challenging.2 The code chunk below was used to run n_sim\(=25\) simulations as in Chapelle and Li (2011). While the authors chose to only update parameters every 100 iterations presumably to speed up computations, we do not use that simplification here and instead update every round.
start <- Sys.time()
simulation <- rbind(
simulation,
rbindlist(
lapply(
1:nrow(param_grid),
function(param_spec) {
list2env(c(param_grid[param_spec,]), envir = environment()) # unpack variables
prob <- c(0.5,rep(0.5-eps,K-1)) # compute corresponding probabilities
bernoulli_mab <- mab(prob, horizon = T_) # generate MAB
unpack(bernoulli_mab) # unpack objects
# initialize table to store output for parameter specification:
simulation_spec <- data.table(K=K, eps=eps, T_=t_0:T_, value=0, variable=algo)
# run simulation:
message(sprintf(
"Starting with: %s, %i arms, eps=%0.2f, T=%i, n_sim=%i",
algo,
K,
eps,
T_,
n_sim
))
regret <- algos[[algo]](
n = n_sim,
horizon = horizon,
v_star = v_star,
K = K,
prob = prob,
update_every = update_every
)
message(sprintf(
"Done with: %s, %i arms, eps=%0.2f, T=%i, n_sim=%i",
algo,
K,
eps,
T_,
n_sim
))
simulation_spec[,value:=cumsum(regret)] # store results
return(simulation_spec)
}
)
)
)
time_passed <- Sys.time() - start
print(time_passed)
saveRDS(simulation, file="results/simulation.rds")
The results are consistent with the findings in Chapelle and Li (2011): for all parameter specifications both the UCB and Thompson Sampling attain sub-linearity before reaching \(10^7\) trials (Figure 2.1). As expected, Thompson Sampling consistently outperforms UCB. Compared to the results in Chapelle and Li (2011), both the UCB and to a lesser extent Thompson Sampling yield somewhat lower absolute levels of cumulative regret for any given \(T\). This is likely due to the fact that here we have updated parameters continuously in each round, while Chapelle and Li (2011) only update every 100 trials.
Figure 2.1: Comparison of algorithms as in Chapelle and Li (2011).
For the complete picture we also run the simulation with updates at discrete intervals of 100 trials. The results shown in Figure 2.2 are broadly consistent with the results from above.
n_sim <- 25
run_simulation(n_sim=n_sim,horizon=1e7,update_every = 100,file="ps1_sim_discrete.png")
Figure 2.2: Comparison of algorithms as in Chapelle and Li (2011). Parameter updates every 100 trials.
Rcpp code#include <Rcpp.h>
using namespace Rcpp;
// Helper function to generate rewards:
// [[Rcpp::export]]
NumericVector generate_rewards(NumericVector prob, String method="bernoulli") {
int n = prob.size();
NumericVector v = no_init(n);
if (method=="bernoulli") {
for (int i=0; i<n; i++) {v[i] = as<double>(rbinom(1, 1, prob[i]));}
}
return(v);
}
// Helper function to select arm:
// [[Rcpp::export]]
int select_arm(NumericVector ucb) {
int out;
if (sum(ucb==max(ucb))==1) {
// if there is an unambiguous maximal value, choose that one
out = which_max(ucb);
} else {
// else draw uniformly from maximal values
IntegerVector indices = seq_along(ucb)-1; // indices of values
LogicalVector is_max = ucb==max(ucb); // boolean vector identifying maximal values
IntegerVector candidates = indices[is_max]; // indices of maximal values
IntegerVector out_v = sample(candidates, 1); // sample from those
out = out_v[0];
}
return out;
}
// Helper function to quantify uncertainty:
// [[Rcpp::export]]
NumericVector uncertainty(NumericVector action_values, NumericVector times_chosen, double delta) {
NumericVector out=ifelse(
times_chosen==0,
INFINITY, // if arm has never been selected, set to infinity
sqrt(2 * log(1/delta) * action_values / times_chosen) + (2*log(1/delta)) / times_chosen
);
return out;
}
// Main UCB function:
// [[Rcpp::export]]
List ucb(
int horizon,
double v_star,
int K,
NumericVector prob,
int update_every=1,
String method="bernoulli",
Nullable<NumericVector> action_values_ = R_NilValue
) {
// Initialize:
int T = 1;
double delta = sqrt(1/T);
NumericVector policy (horizon); // vector to keep track of choices
NumericVector regret (horizon, v_star); // initialize regret as v_star (maximal true action value)
NumericVector times_chosen (K); // counts of times selected
NumericVector action_values (K); // action_values initialized to zeros
// If initial action values have been supplied, then overwrite:
if (action_values_.isNotNull()) {
NumericVector action_values(action_values_); // casting to underlying type
}
NumericVector u = uncertainty(action_values, times_chosen, delta); // uncertainty
NumericVector ucb = action_values + u; // upper confidence bound
// Recursion:
while(T <= horizon) {
// Rcout << "UCB: " << ucb << "\n";
// Select arm:
int arm=select_arm(ucb);
// Observe reward:
NumericVector rewards = generate_rewards(prob, method);
int r = rewards[arm];
// Update choices in T:
policy[T-1] = arm+1; //update policy
regret[T-1] = regret[T-1] - prob[arm]; // compute regret in this period
times_chosen[arm] = times_chosen[arm] + 1; // update counter
double q = action_values[arm]; // action value at T
int m = times_chosen[arm]; // times chose at T
action_values[arm] = q + 1.0/m * (r - q); // action value increment for selected arm
// Update:
T++;
// Check if modulus is zero:
int update_this_round = T % update_every == 0;
if (update_this_round) {
double delta = sqrt(1.0/T);
// action_values[arm] = q + 1.0/m * (r - q); // action value increment for selected arm
u = uncertainty(action_values, times_chosen, delta); // update uncertainty
ucb = action_values + u; // update UCB
}
}
// Output
List output = List::create(
Named("policy") = policy,
Named("regret") = regret,
Named("times_chosen") = times_chosen,
Named("action_values") = action_values,
Named("ucb") = ucb
);
return output;
}
// Function to run simulation:
// [[Rcpp::export]]
NumericVector sim_ucb(
double n,
int horizon,
double v_star,
int K,
NumericVector prob,
int update_every=1,
String method="bernoulli",
Nullable<NumericVector> action_values_ = R_NilValue
) {
NumericVector regret (horizon); // allocate memory
double sim_counter=1.0; // counter
while (sim_counter <= n) {
Rcout << sim_counter << "/" << n << "\n";
// Run algorithm:
List policy = ucb(
horizon,
v_star,
K,
prob,
update_every,
method,
action_values_
);
NumericVector reg_sim=policy["regret"];
regret = regret + reg_sim / n; // increment regret
sim_counter++;
}
return(regret);
}
// You can include R code blocks in C++ files processed with sourceCpp
// (useful for testing and development). The R code will be automatically
// run after the compilation.
//
//
// /*** R
// prob <- c(0.5,rep(0.4,9))
// bernoulli_mab <- mab(prob, horizon = 10000)
// unpack(bernoulli_mab)
// sim_ucb(
// n = 50,
// horizon = horizon,
// v_star = v_star,
// K = K,
// prob = prob,
// update_every=100,
// method = method,
// action_values_ = NULL
// )
// */
#include <Rcpp.h>
using namespace Rcpp;
// Helper function to generate rewards:
// [[Rcpp::export]]
NumericVector generate_rewards(NumericVector prob, String method="bernoulli") {
int n = prob.size();
NumericVector v = no_init(n);
if (method=="bernoulli") {
for (int i=0; i<n; i++) {v[i] = as<double>(rbinom(1, 1, prob[i]));}
}
return(v);
}
// Helper function to compute posterior means (action values):
// [[Rcpp::export]]
NumericVector posterior_means(IntegerVector successes, IntegerVector failures, String method="bernoulli") {
int K=successes.size();
NumericVector theta = no_init(K);
if (method=="bernoulli") {
for (int i=0; i<K; i++) {theta[i] = as<double>(rbeta(1, successes[i], failures[i]));}
}
return(theta);
}
// Helper function to select arm:
// [[Rcpp::export]]
int select_arm(NumericVector theta) {
int arm = which_max(theta);
return arm;
}
// Main UCB function:
// [[Rcpp::export]]
List thompson(
int horizon,
double v_star,
int K,
NumericVector prob,
int update_every=1,
String method="bernoulli",
Nullable<IntegerVector> successes_ = R_NilValue,
Nullable<IntegerVector> failures_ = R_NilValue
) {
// Initialize:
int T = 1;
NumericVector policy (horizon); // vector to keep track of choices
NumericVector regret (horizon, v_star); // initialize regret as v_star (maximal true action value)
NumericVector times_chosen (K); // counts of times selected
IntegerVector successes (K,1); // successes initialized to one
IntegerVector failures (K,1); // failures initialized to one
// If initial success/failure values have been supplied, then overwrite:
if (successes_.isNotNull()) {
IntegerVector successes(successes_); // casting to underlying type
}
if (failures_.isNotNull()) {
IntegerVector failures(failures_); // casting to underlying type
}
NumericVector theta=posterior_means(successes, failures, method);
// Recursion:
while(T <= horizon) {
// Rcout << "Round: " << T << "\n";
// Select arm:
int arm=select_arm(theta);
// Observe reward:
NumericVector rewards = generate_rewards(prob, method);
int r = rewards[arm];
// Update choices in T:
policy[T-1] = arm+1; //update policy
regret[T-1] = regret[T-1] - prob[arm]; // compute regret in this period
times_chosen[arm] = times_chosen[arm] + 1; // update counter
// Update:
T++;
if (r==1) {
successes[arm]++;
} else {
failures[arm]++;
}
// Check if modulus is zero:
int update_this_round = T % update_every == 0;
if (update_this_round) {
theta=posterior_means(successes, failures, method);
}
}
// Output
List output = List::create(
Named("policy") = policy,
Named("regret") = regret,
Named("times_chosen") = times_chosen,
Named("action_values") = theta,
Named("successes") = successes,
Named("failures") = failures
);
return output;
}
// Function to run simulation:
// [[Rcpp::export]]
NumericVector sim_thompson(
double n,
int horizon,
double v_star,
int K,
NumericVector prob,
int update_every=1,
String method="bernoulli",
Nullable<IntegerVector> successes_ = R_NilValue,
Nullable<IntegerVector> failures_ = R_NilValue
) {
NumericVector regret (horizon); // allocate memory
double sim_counter=1.0; // counter
while (sim_counter <= n) {
Rcout << sim_counter << "/" << n << "\n";
// Run algorithm:
List policy = thompson(
horizon,
v_star,
K,
prob,
update_every,
method,
successes_,
failures_
);
NumericVector reg_sim=policy["regret"];
regret = regret + reg_sim / n; // increment regret
sim_counter++;
}
return(regret);
}
// You can include R code blocks in C++ files processed with sourceCpp
// (useful for testing and development). The R code will be automatically
// run after the compilation.
//
//
// /*** R
// prob <- c(0.5,rep(0.4,9))
// bernoulli_mab <- mab(prob, horizon = 10000)
// unpack(bernoulli_mab)
// thompson(
// # n = 10,
// horizon = horizon,
// v_star = v_star,
// K = K,
// prob = prob,
// update_every = 1,
// method = method,
// successes_ = NULL,
// failures_ = NULL
// )
// */
utils::sessionInfo()
## R version 4.0.3 (2020-10-10)
## Platform: x86_64-apple-darwin17.0 (64-bit)
## Running under: macOS High Sierra 10.13.6
##
## Matrix products: default
## BLAS: /Library/Frameworks/R.framework/Versions/4.0/Resources/lib/libRblas.dylib
## LAPACK: /Library/Frameworks/R.framework/Versions/4.0/Resources/lib/libRlapack.dylib
##
## locale:
## [1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
##
## attached base packages:
## [1] stats graphics grDevices utils datasets methods base
##
## other attached packages:
## [1] Rcpp_1.0.6 scales_1.1.1 ggplot2_3.3.3 data.table_1.14.0
##
## loaded via a namespace (and not attached):
## [1] pillar_1.6.0 compiler_4.0.3 highr_0.8 tools_4.0.3
## [5] digest_0.6.27 evaluate_0.14 lifecycle_1.0.0 tibble_3.1.1
## [9] gtable_0.3.0 pkgconfig_2.0.3 png_0.1-7 rlang_0.4.10
## [13] yaml_2.2.1 xfun_0.20 withr_2.4.2 stringr_1.4.0
## [17] dplyr_1.0.2 knitr_1.30 generics_0.1.0 vctrs_0.3.7
## [21] grid_4.0.3 tidyselect_1.1.0 glue_1.4.2 R6_2.5.0
## [25] fansi_0.4.2 rmarkdown_2.6 bookdown_0.20 purrr_0.3.4
## [29] magrittr_2.0.1 ellipsis_0.3.1 htmltools_0.5.1.1 colorspace_2.0-0
## [33] utf8_1.2.1 stringi_1.5.3 munsell_0.5.0 crayon_1.4.1