Journal of Statistics Applications & Probability

Author Country (or Countries)

South Africa


The internet has become a digital marketplace that offers goods and services globally, compelling enterprises to optimize their websites and online strategies. This paper employed Markov chain models to predict the webpage a visitor is most likely to view next. A website comprises several pages (such as “Home” and “About-Us”), and visitors transition from one page to another by clicking on the respective links. The study was conducted on the website of “TEKmation”, a South African engineering and engineering training company. The transition probabilities represent the likelihood of moving to a certain webpage, given that the visitor is currently on a specific webpage within the studied website. A key Markov chain assumption is that the “next state” depends solely on the “present state” and is independent of “previous states” (the “memoryless” property). However, chi-squared tests on the observed data showed that the “future” state depended on “previous” states, because a visitor was less likely to revisit a page than to visit an unseen page within the same visit. The aim of this study was to explore a tiered approach to the Markov models to minimize the impact of the “memoryless” assumption. Each visit was split into a tier one portion (the first two pages viewed) and a tier two portion (the third and subsequent pages viewed). The tiered approach (accuracy = 62%) fitted the data better than the standard Markov model (accuracy = 53%). On average, the tiered model also predicted drop-off events (the movement from the “current state” to exiting the website) more accurately. In conclusion, the tiered Markov models reduced the impact of the “memoryless” assumption on the studied data.
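
The tiered idea described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: it assumes that tier one covers the transition out of the first page viewed and that tier two covers every later transition, which is one possible reading of the split. The toy visits, the "Contact" page, the "Exit" label for drop-off, and all function names are hypothetical and chosen only for illustration.

```python
from collections import Counter, defaultdict

EXIT = "Exit"  # hypothetical label for the drop-off (leaving the website) state


def fit_transitions(pairs):
    """Estimate P(next | current) from (current, next) pairs by relative frequency."""
    counts = defaultdict(Counter)
    for current, nxt in pairs:
        counts[current][nxt] += 1
    return {
        current: {nxt: n / sum(row.values()) for nxt, n in row.items()}
        for current, row in counts.items()
    }


def split_pairs(visits):
    """Split each visit's transitions into tier one and tier two (assumed split)."""
    tier_one, tier_two = [], []
    for visit in visits:
        path = list(visit) + [EXIT]  # every visit ends with a drop-off
        for i in range(len(path) - 1):
            pair = (path[i], path[i + 1])
            # Transition 0 leaves the first page (tier one); later ones are tier two.
            (tier_one if i == 0 else tier_two).append(pair)
    return tier_one, tier_two


def predict_next(model, current):
    """Return the most probable next state, or None if the state was never observed."""
    row = model.get(current)
    return max(row, key=row.get) if row else None


def predict_tiered(tier_one_model, tier_two_model, pages_viewed, current):
    """Use the tier one model after the first page, the tier two model afterwards."""
    model = tier_one_model if pages_viewed == 1 else tier_two_model
    return predict_next(model, current)


# Toy visits (invented data, not taken from the study).
visits = [
    ["Home", "About-Us", "Contact"],
    ["Home", "About-Us", "Contact"],
    ["Home", "About-Us"],
    ["About-Us", "Home", "Contact"],
    ["Home"],
]

tier_one_pairs, tier_two_pairs = split_pairs(visits)
standard_model = fit_transitions(tier_one_pairs + tier_two_pairs)
tier_one_model = fit_transitions(tier_one_pairs)
tier_two_model = fit_transitions(tier_two_pairs)
```

In this toy data the standard model gives a single set of probabilities for leaving "Home", whereas the tiered models distinguish a visitor who has just arrived on "Home" from one who returns to it later in the visit. This is the kind of dependence on earlier pages that a single memoryless chain cannot express.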

Digital Object Identifier (DOI)