How likely are we to make the observations in *data* if the parameters θ of our model are correct?

When you have your model, it is usually straightforward to calculate this likelihood. For a simple example, if we model the tosses of a fair coin and see two heads in a row, $p(data | \theta)$ is given by:

$$p(data \mid \theta) = \theta \times \theta = 0.5 \times 0.5 = 0.25$$
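As a minimal sketch, the likelihood of a run of heads under an assumed head probability `theta` (the function name and values here are illustrative, not from the book):

```python
def likelihood(num_heads, theta):
    """p(data | theta) for num_heads consecutive heads,
    assuming independent tosses with P(heads) = theta."""
    return theta ** num_heads

# Fair coin (theta = 0.5), two heads in a row:
print(likelihood(2, 0.5))  # 0.25
```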

Picking a good model that you can parameterise is a key part of this process. Ben Lambert's book gives you this framework for picking a good model:

- Write down the real life behaviour that the model should be capable of explaining.
- Write down the assumptions that you believe are reasonable to achieve step 1.
- Look for probability models that are based on these assumptions. If necessary, combine multiple models to capture all the assumptions.
- After fitting the model to the data, test its ability to explain the behaviour identified in step 1. If unsuccessful, go back to step 2 and assess which of your assumptions are invalidated. Then choose a new, more general probability model that encompasses the revised assumptions.
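The fit-then-check loop in the last step can be sketched for the coin example. The data and the variance-based check below are assumptions for illustration: if a binomial model's predicted variance is far from what we observe, the constant-`theta` assumption from step 2 is suspect and a more general model (e.g. beta-binomial) may be needed:

```python
# Hypothetical head counts from six batches of 10 tosses each.
data = [7, 3, 8, 2, 9, 1]
n = 10

# Fit: maximum-likelihood estimate of theta under a binomial model.
theta_hat = sum(data) / (len(data) * n)

# Check: compare the binomial model's variance with the observed variance.
expected_var = n * theta_hat * (1 - theta_hat)
mean = sum(data) / len(data)
observed_var = sum((x - mean) ** 2 for x in data) / len(data)

# Much larger observed variance (overdispersion) suggests the
# assumptions are invalidated -- time to revisit step 2.
print(theta_hat, expected_var, observed_var)
```

Here the observed variance greatly exceeds the binomial prediction, which is exactly the kind of failed check that sends you back to step 2.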