treepplr is an R interface to the TreePPL program.
All functions start with tp_ so that they are easy to distinguish
from functions in other packages.
An analysis with TreePPL needs three parts: a model, data, and inference machinery.
Model
You can choose a model written in the TreePPL language from our library, or you can write your own.
To list all models available in the library, use
tp_model_library(), which retrieves the models in the TreePPL
GitHub repository.
model_lib <- tp_model_library()
To use one of these models, you just need its name in
model_lib$model_name.
If you want to use your own custom model, you will need to write it
in the TreePPL language and create an R object that contains either the
full path to the .tppl file containing the model, or a
string with the full model.
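The two options can be sketched with base R alone. This is a minimal illustration, not part of treepplr: the temporary file and its two lines of text are placeholders (not a valid TreePPL model), and the names model_file and model_string are hypothetical.

```r
# Stand-in file so the sketch runs on its own; the contents are placeholder
# text, not a valid TreePPL model.
model_file <- tempfile(fileext = ".tppl")
writeLines(c("line one of the model", "line two of the model"), model_file)

# Option 1: keep the full path, after confirming the file exists.
stopifnot(file.exists(model_file))
model_path <- normalizePath(model_file)

# Option 2: read the whole file into a single string.
model_string <- paste(readLines(model_file), collapse = "\n")
```

Either object can then be supplied wherever a custom model is expected.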
# import a model from file
model_path <- "path/to/my_model.tppl"
Data
TreePPL only reads a custom JSON format, so treepplr
converts a variety of input data to this format and writes it to a file,
which TreePPL then reads. Here are some examples:
# for models that only need a phylogenetic tree
phylo <- ape::read.tree(file = "path/to/your/file.tre")
data_path <- tp_data(data_input = phylo)
# or sequence data
fasta_file <- "path/to/your/file.fasta"
data_path <- tp_data(data_input = fasta_file)
As with models, you can also use test datasets from the TreePPL library by passing the name of the model (we’ll come back to this later).
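To try the tree case above without a file on disk, a phylo object can also be built from a Newick string. The sketch below assumes only that the ape package is installed; the three-tip tree is a made-up toy example, and the checks are ordinary ape helpers rather than anything treepplr requires.

```r
# Assumes the ape package is installed; the tree below is a toy example.
library(ape)

newick <- "((A:1,B:1):1,C:2);"
phylo <- read.tree(text = newick)

# Basic sanity checks before handing the object to tp_data()
stopifnot(inherits(phylo, "phylo"))
n_tips <- Ntip(phylo)          # number of tips in the tree
tip_names <- phylo$tip.label   # tip labels in the order ape stores them
rooted <- is.rooted(phylo)     # whether ape treats the tree as rooted
```

The resulting phylo object can then be passed as data_input to tp_data(), as in the tree example above.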
Inference method
TreePPL offers a variety of inference methods, and different methods work best for different models. See the model library for our recommendations on which inference method to use for each model.
Compilation
Once you have chosen the model and the inference method you want to use, you can compile your model to an executable that also contains the machinery needed to run the chosen inference method.
# Using a model from the library and a Sequential Monte Carlo method
exe_path <- tp_compile(model = "crbd", method = "smc-apf", particles = 10000)
# Using a custom model and a Markov chain Monte Carlo method
exe_path <- tp_compile(model = model_path, method = "mcmc-lightweight",
                       iterations = 10000)
Running
Now you are ready to run your analysis. All you have to do is to pass your data to the compiled executable and choose how many independent runs you want to do.
output <- tp_run(compiled_model = exe_path, data = data_path, n_runs = 4)
Convergence
Then you can parse your output to produce a data frame and check for convergence.
# If using SMC
output_df_smc <- tp_parse_smc(output)
tp_smc_convergence(output_df_smc)
# If using MCMC
output_df_mcmc <- tp_parse_mcmc(output)