I was able to obtain a data set that describes the characteristics of 184 gas stations. I eliminated many of the variables because I do not know what they describe. Of the variables that remain, the main ones of interest relate to marketing campaigns. I also included demographic information. I want to see whether the demand for gas (gas volume) can be predicted from the variables I kept in the data set. It will be interesting to see whether one of the marketing campaigns is a good predictor. That knowledge would help a manager decide whether spending money on a similar marketing campaign is justified by the expected increase in demand for gasoline.
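For readers who want to try the idea outside STATISTICA, here is a minimal sketch of a multiple regression that predicts gas volume from one marketing variable and one demographic variable. It is not the model from this post. The data are synthetic, and the variable names (`promo_spend`, `median_income`, `gas_volume`) and the coefficients used to generate them are assumptions made up for illustration.

```python
import random


def fit_ols(rows, targets):
    """Fit y = b0 + b1*x1 + ... by solving the normal equations (X'X) b = X'y."""
    X = [[1.0] + list(r) for r in rows]  # prepend a column of ones for the intercept
    k = len(X[0])
    # Augmented matrix [X'X | X'y]
    A = [
        [sum(x[i] * x[j] for x in X) for j in range(k)]
        + [sum(x[i] * y for x, y in zip(X, targets))]
        for i in range(k)
    ]
    # Gauss-Jordan elimination with partial pivoting
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        for r in range(k):
            if r != col:
                factor = A[r][col] / A[col][col]
                A[r] = [a - factor * b for a, b in zip(A[r], A[col])]
    return [A[i][k] / A[i][i] for i in range(k)]


def r_squared(rows, targets, coefs):
    """Share of the variance in the target explained by the fitted model."""
    preds = [coefs[0] + sum(c * v for c, v in zip(coefs[1:], r)) for r in rows]
    mean_y = sum(targets) / len(targets)
    ss_res = sum((y - p) ** 2 for y, p in zip(targets, preds))
    ss_tot = sum((y - mean_y) ** 2 for y in targets)
    return 1 - ss_res / ss_tot


# Synthetic stand-in for the 184 stations (hypothetical variables and effects).
rng = random.Random(0)
rows, gas_volume = [], []
for _ in range(184):
    promo_spend = rng.uniform(0, 10)      # assumed marketing variable
    median_income = rng.uniform(30, 90)   # assumed demographic variable
    rows.append((promo_spend, median_income))
    gas_volume.append(50 + 4 * promo_spend + 0.5 * median_income + rng.gauss(0, 2))

coefs = fit_ols(rows, gas_volume)
r2 = r_squared(rows, gas_volume, coefs)
print("intercept, promo_spend, median_income:", [round(c, 2) for c in coefs])
print("R-squared:", round(r2, 3))
```

The coefficient on the marketing variable is the number a manager would care about: it estimates the change in gas volume per unit of campaign spend, holding the other variable fixed. On real data you would also want to check its standard error before drawing conclusions.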
I used the new Data Health Node in a STATISTICA Data Mining Workspace to eliminate some of the redundant data. I recommend this new node to everyone. The report does not take long to run, and the resulting information is very useful. Recreating all of the output manually within STATISTICA would take a considerable amount of time. It was great that the node could remove the redundant data automatically so I could move on to model building right away. I was so impressed that I plan to devote a whole blog post to the Data Health Node in the near future.
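To give a feel for what screening out redundant data can look like, here is a rough sketch that drops constant columns and columns that are almost perfectly correlated with one already kept. This is an assumption about one common approach, not a description of how the Data Health Node works internally. The column names, the values, and the 0.95 threshold are all hypothetical.

```python
from math import sqrt


def pearson(a, b):
    """Pearson correlation between two equal-length lists of numbers."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / sqrt(var_a * var_b)


def drop_redundant(columns, threshold=0.95):
    """Keep columns that are neither constant nor near-duplicates of a kept column."""
    kept = {}
    for name, values in columns.items():
        if len(set(values)) <= 1:
            continue  # a constant column carries no information
        if any(abs(pearson(values, other)) >= threshold for other in kept.values()):
            continue  # nearly the same signal as a column we already kept
        kept[name] = values
    return kept


# Hypothetical columns for five stations.
columns = {
    "promo_spend": [1, 2, 3, 4, 5],
    "promo_spend_cents": [100, 200, 300, 400, 500],  # same information, rescaled
    "region_code": [7, 7, 7, 7, 7],                  # constant
    "median_income": [52, 48, 61, 45, 58],
}

kept = drop_redundant(columns)
print(list(kept))
```

Columns are checked in order, so when two columns are near-duplicates the one that appears first is the one that survives.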
Please refer to the following YouTube video where I describe the gas station data set along with the regression model to predict the demand for gasoline.
If you have questions about the model I built or would like a copy of the data set, please post a comment and I will get in touch with you. Thanks for reading, and I hope you had a great weekend!