Applying ML over the simpler problems isn’t that much difficult you just need to remember some steps to perform.
You’re a Pirate or Captain of Marine but I think Pirate will suit you and planning to attack the Diamond mine under water near some country, that country has protected the Diamond mine with some Mines(explosion) and Rocks it’s difficult to differentiate to find out which one is Mine and which one is Rock. But the luck here is you have data and you can different among them which one is Mine and which one is Rock.
Just import all the required dependencies you can read about individual library, Just by going through the documentation.
Read the data through file by using pandas. Our data is stored in csv format so we will use .read-csv. It will simply read the values from CSV formatted file and stored it as a Data frame.
Our data look something like this
Know about your data the more you know the better you will handle the handle. However the following dataset don’t contain any information about the columns but still you can note the distribution of data and summary of the data and detect outliers in dataset.
Lets find the distribution of the target variable. If the data isn’t distributed then take appropriate steps to equally distribute the data.
You can see that distributed is slightly imbalance but its acceptable. Almost none of the data is equally distributed.
R in target variable represent Rock and M represent Mine.
Separate the target and features in a dataset.
Split the data into train and test set. The train set will be utilized for the training the model and test set will be used to validate the model.
Select your model there isn’t any hard and fast rule to select model you have to try multiple models and select the one which outperform among others. However we’re just using Logistic regression you can find the intuition behind Logistic Regression through google.
Model is fitted over the data and then making predictions over the test data.
Model evaluate helps you in figuring out the best model there’re different metrics used for Classification and Regression. As this problem is related to classification so we can use different metrics like confusion matrix, F1-Score and some others. You can read about the classification metrics here
Tada!!! That’s all it you’re done with your beginner level ML Project.
This project was related to classification of Rocks vs Mines. You simply perform some statistical operation and data separation to train and evaluate your model.