We created an estimator tool to demonstrate the potential relationship between web application performance and business impact, such as increased revenue.
The revenue increase estimate is based on a model built from published case studies in which companies reported both performance improvements and the associated revenue impact. The model takes percentage improvements in several web metrics as input and outputs a percentage revenue increase. It does not depend on characteristics specific to the site or brand, such as current revenue or business domain.
While the actual impact for any given company would depend on their specific situation, the data generally shows that across different types of websites, improving speed and responsiveness is tied to increased business results.
The estimate is provided for informational purposes only and is subject to change. It is not intended to be, and should not be construed as, a guarantee, representation, or assurance of actual outcomes.
The data is sourced from various case studies around the web that relate performance improvements to revenue improvements. For example, Vodafone improved LCP by 31% and increased sales by 8%. From a collection of these case studies, metric improvements and their respective revenue increases were pulled into a dataset. From there, a subset of metrics was chosen to train the model, primarily because of their relative abundance in the dataset.
Because the data was pulled from case studies, some studies contained some metrics but not others, depending on what data each study made public. To address this for training, some values were imputed using basic methods. Page load time improvement, first contentful paint (FCP) time improvement, and cumulative layout shift (CLS) score improvement were imputed using the average of existing values. Largest contentful paint (LCP) time improvement was imputed using a linear regression on cumulative layout shift score improvement.
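The two imputation approaches described above can be sketched in a few lines of plain Python. This is a minimal illustration, not the code used to build the dataset: the function names, the sample values, and the choice to mean-impute the layout shift column before fitting the regression are all assumptions made for the example.

```python
from statistics import mean

def impute_mean(values):
    """Replace None entries with the mean of the observed entries."""
    observed = [v for v in values if v is not None]
    fill = mean(observed)
    return [fill if v is None else v for v in values]

def fit_line(xs, ys):
    """Ordinary least squares fit of y = slope * x + intercept."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, y_bar - slope * x_bar

def impute_from_regression(target, predictor):
    """Fill None entries in target from a line fitted on rows where target is observed."""
    pairs = [(p, t) for p, t in zip(predictor, target) if t is not None]
    slope, intercept = fit_line([p for p, _ in pairs], [t for _, t in pairs])
    return [slope * p + intercept if t is None else t
            for p, t in zip(predictor, target)]

# Hypothetical percentage improvements; None marks a value a study did not report.
cls_improvement = [10.0, None, 20.0, 30.0]
lcp_improvement = [12.0, 22.0, None, 32.0]

cls_filled = impute_mean(cls_improvement)                         # mean imputation
lcp_filled = impute_from_regression(lcp_improvement, cls_filled)  # regression imputation
```

Note that the regression-imputed values inherit any error in the mean-imputed predictor, which is one reason imputation on a sparse dataset should be treated with caution.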
The dependent variable was set as percentage revenue increase.
Using this dataset, the model was trained using XGBoost, a gradient-boosted decision tree machine learning library. The model was trained with repeated k-fold cross-validation, using 10 folds repeated 3 times. Additionally, the model was monotonically constrained with respect to all independent variables.
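To make the cross-validation scheme concrete, the sketch below generates the train/test index splits for 10 folds repeated 3 times using only the standard library. It is an illustration of the splitting logic, not the authors' training code: the function name, seed, and sample count are hypothetical, and the XGBoost fit and monotonic constraints are deliberately left out. Libraries such as scikit-learn provide an equivalent splitter (`RepeatedKFold`); whether it was used here is not stated.

```python
import random

def repeated_k_fold(n_samples, n_splits=10, n_repeats=3, seed=0):
    """Yield (train_indices, test_indices) for each fold of each repeat."""
    rng = random.Random(seed)
    for _ in range(n_repeats):
        indices = list(range(n_samples))
        rng.shuffle(indices)  # a fresh shuffle per repeat gives different folds
        for k in range(n_splits):
            test = indices[k::n_splits]  # every n_splits-th index, offset by k
            held_out = set(test)
            train = [i for i in indices if i not in held_out]
            yield train, test

# Hypothetical dataset of 40 case-study rows: 3 repeats x 10 folds = 30 splits.
splits = list(repeated_k_fold(n_samples=40))
```

Each split would be used to fit the model on the training rows and score it on the held-out rows; averaging the 30 scores gives a steadier estimate than a single 10-fold pass, which matters when the dataset is small.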
To determine a given site's estimated revenue increase from improving performance, model inference is performed. For Econiscore, Econify's work with MotorTrend is used as a reference for potential performance improvement. For example, if a site's FCP is 4.2 sec and 2.4 sec is used as the reference for potential improvement, then (4.2 - 2.4) / 4.2, or roughly 43%, is used as the FCP improvement metric in inference.
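The improvement percentage fed into inference can be computed as below. This is a small sketch of the arithmetic in the FCP example; the function name and the guard against non-positive inputs are assumptions added for illustration.

```python
def percent_improvement(current, reference):
    """Percentage reduction needed to move a timing metric from current to reference."""
    if current <= 0:
        raise ValueError("current must be positive")
    return (current - reference) / current * 100

# FCP example from the text: 4.2 sec today, 2.4 sec as the reference value.
fcp_improvement = percent_improvement(current=4.2, reference=2.4)
print(round(fcp_improvement))  # 43
```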
Because the data is sourced from various case studies and blog posts, the training dataset is rather small and sparse. A great deal of imputation is used in order to allow for training. As such, the dataset suffers from the typical problems introduced by imputation, such as understated variance and distorted relationships between variables.
Additionally, it's important to note that the given case studies may suffer from selection bias. A company that improves its performance and does not see any revenue increase may not be inclined to share this data publicly. As a result, the model is likely biased toward positive outcomes.
Percentage improvement was used for the independent variables, as opposed to raw metric values. This means the model is blind to the scale of these improvements. For example, an improvement from 60 seconds to 1 second is a 98% improvement. However, 60 milliseconds to 1 millisecond is also a 98% improvement. The model cannot distinguish between the two and will provide the same revenue estimate for both. This constraint is primarily due to the dataset. The issue is tempered by the fact that inference is performed using reference values, so for the most part improvements will be on the same order of magnitude. Nonetheless, this limitation should be kept in mind.
Finally, no company-specific data was used to train this model. Case studies ranged from a 7% revenue increase all the way up to a 66% revenue increase. While it is believed that these case studies controlled for other factors when reporting these numbers, this model does not account for other variables across companies or case studies. For example, revenue increase may differ drastically depending on the site's business model (e.g. e-commerce, advertising). Additionally, a company with $100 million in revenue may expect a different ROI on performance than a company with $1 million in revenue. These variables are not included in the model due to the complexity of sourcing them and the size of the dataset.