How To Make The Best Of Your Data Set?

The predictions we make from a data set are probably our best bet when it comes to making business decisions. Even after choosing the right data set, cleaning the data and applying the right analytics model, we might still have trouble getting accurate predictions. Let us take a look at some of the reasons that could be behind it.

We might be failing despite using the correct techniques and variables. As a beginner, it is quite difficult to gauge exactly why this is happening. There can be several reasons, but one probable reason is giving the same priority to every element in the data set.

Suppose you run a Big Bazaar store and you have to predict the average sale that a new customer is likely to bring in. Your target variable is therefore the sale amount. This target variable affects several business decisions that you need to make, such as the quantity and types of products to order, the additional employees to hire, etc.

To compute this target, you may use the following data set or something similar: your present sales sheet, which contains the customer details that affect your product sales. Here, even the correct data and a linear regression model might end up giving you an inaccurate prediction.

Here is why. Your list probably contains all sorts of customers: the established, long-standing ones (let’s call them Type A), relatively new clients (Type C) and those somewhere in between (Type B). In this case, the data from long-established customers is likely to be more accurate than that from the newer clients, yet you’re giving the same priority to each element in the data set. To get a more accurate prediction, higher priority needs to be given to the data from the more established clients.

This priority can be implemented in the analytic model simply by creating a new column that gives the weightage of each row of data. Each row that contains information from established customers is given a higher weightage, and each row from newer clients a lower weightage. The model then uses this column as the weight of each row when it is fitted. This can be implemented in the following manner:

[Table: company and weightage]
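The weightage column described above can be sketched in a few lines of Python. This is only a minimal illustration under stated assumptions, not a definitive implementation: the sample rows, the single predictor (visits per month), the weight values in WEIGHTAGE and the helper weighted_linear_fit are all hypothetical and do not come from a real sales sheet. The helper fits a straight line by weighted least squares, so rows with a higher weightage pull the line harder.

```python
# Hypothetical sales sheet: (customer_type, visits_per_month, sale_amount).
# All values are invented for illustration only.
rows = [
    ("A", 8, 4200.0),
    ("A", 6, 3100.0),
    ("B", 5, 2900.0),
    ("B", 3, 1500.0),
    ("C", 4, 3600.0),
    ("C", 2, 400.0),
]

# Assumed weightage per customer type: established customers count more.
WEIGHTAGE = {"A": 3.0, "B": 2.0, "C": 1.0}


def weighted_linear_fit(xs, ys, ws):
    """Fit y = intercept + slope * x by weighted least squares."""
    total_w = sum(ws)
    # Weighted means of the predictor and the target.
    mean_x = sum(w * x for w, x in zip(ws, xs)) / total_w
    mean_y = sum(w * y for w, y in zip(ws, ys)) / total_w
    # Weighted covariance and variance give the slope.
    cov_xy = sum(w * (x - mean_x) * (y - mean_y) for w, x, y in zip(ws, xs, ys))
    var_x = sum(w * (x - mean_x) ** 2 for w, x in zip(ws, xs))
    slope = cov_xy / var_x
    intercept = mean_y - slope * mean_x
    return intercept, slope


xs = [visits for _, visits, _ in rows]
ys = [sale for _, _, sale in rows]
ws = [WEIGHTAGE[ctype] for ctype, _, _ in rows]  # the new weightage column

intercept, slope = weighted_linear_fit(xs, ys, ws)
plain_intercept, plain_slope = weighted_linear_fit(xs, ys, [1.0] * len(rows))

# Predicted sale for a new customer expected to visit 5 times a month.
predicted_sale = intercept + slope * 5

print(f"weighted:   sale = {intercept:.1f} + {slope:.1f} * visits")
print(f"unweighted: sale = {plain_intercept:.1f} + {plain_slope:.1f} * visits")
print(f"predicted sale at 5 visits: {predicted_sale:.1f}")
```

Comparing the weighted and unweighted lines shows how much the newer, noisier rows were influencing the fit. If you work with a library instead, many regression implementations accept per-row weights directly; for example, scikit-learn's LinearRegression.fit takes a sample_weight argument.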

As more emphasis is placed on the data obtained from long-established companies, your prediction is likely to be more accurate, leading to better decisions.

As data analysts or even entrepreneurs, we must realise that different factors can have different effects on the predictions. Not all factors have the same effect, and our prediction models must reflect this. I hope this helps you make better decisions when creating and evaluating your analytical model. I would really like to hear about your experiences in creating such analytic models in the comments.