First Steps as a Data Analyst!

Mayank Jha
2 min read · Jan 29, 2021


Working on: Health Insurance Cross Sell Prediction
Predicting which health insurance policyholders will be interested in vehicle insurance

I submitted my first task on Kaggle today. The task was to build a model that predicts whether a customer who already holds health insurance would also be interested in vehicle insurance. Such a model is extremely helpful for the company: it can plan its communication strategy to reach out to those customers and optimise its business model and revenue accordingly.
Breaking down the problem, my first task was to identify the business objective. An insurance policy is an arrangement by which a company undertakes to provide a guarantee of compensation for specified loss, damage, illness, or death in return for the payment of a specified premium. A premium is a sum of money that the customer needs to pay regularly to an insurance company for this guarantee.

Vehicle insurance works just like medical insurance: every year the customer pays a premium of a certain amount to the insurance provider, so that in case of an unfortunate accident involving the vehicle, the provider pays compensation (called the ‘sum assured’) to the customer. To predict whether a customer would be interested in vehicle insurance, the dataset provides information about demographics (gender, age, region code), vehicles (vehicle age, damage), policy (premium, sourcing channel), etc.

I applied Logistic Regression to the problem and encoded the categorical (non-numeric) columns as indicator variables using get_dummies from the pandas module. The model achieved an accuracy score of 87.28%.
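
The encode-then-fit approach described above can be sketched roughly as follows. This is not the notebook's actual code: the data is randomly generated toy data, the column names (Gender, Age, Vehicle_Age, Vehicle_Damage, Response) are assumptions loosely based on the features listed earlier, and the train/test split and max_iter setting are my own choices, so the printed accuracy will not match the score reported here.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

# Toy, randomly generated data. Column names are assumptions for
# illustration and are not taken from the original notebook.
rng = np.random.default_rng(0)
n = 200
df = pd.DataFrame({
    "Gender": rng.choice(["Male", "Female"], size=n),
    "Age": rng.integers(20, 80, size=n),
    "Vehicle_Age": rng.choice(["< 1 Year", "1-2 Year", "> 2 Years"], size=n),
    "Vehicle_Damage": rng.choice(["Yes", "No"], size=n),
})
# Synthetic target: customers with past vehicle damage are more likely
# to respond (a made-up rule, only so the model has something to learn).
df["Response"] = ((df["Vehicle_Damage"] == "Yes") & (rng.random(n) < 0.7)).astype(int)

# One-hot encode the categorical columns; numeric columns pass through unchanged.
X = pd.get_dummies(df.drop(columns="Response"), drop_first=True, dtype=int)
y = df["Response"]

# Hold out a test set so accuracy is measured on unseen rows.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y
)

model = LogisticRegression(max_iter=1000)
model.fit(X_train, y_train)

accuracy = accuracy_score(y_test, model.predict(X_test))
print(f"Accuracy: {accuracy:.4f}")
```

After encoding, each text column such as Vehicle_Damage is replaced by one or more 0/1 columns (for example Vehicle_Damage_Yes), which is the numeric form that Logistic Regression needs.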

The source code is available on my GitHub page: https://github.com/mayankjha-purdue/data_science/blob/master/insurance_cross_sale.ipynb

You can also access it on Kaggle: https://www.kaggle.com/mjboilermaker/notebookadcecd409c


Mayank Jha

Hi, I am a Data Analytics student at Purdue University. I intend to use this platform to showcase and learn from people in the data science community.