Decision tree using sklearn

 

What is a Decision tree algorithm?

The decision tree algorithm belongs to the family of supervised machine learning algorithms. It can be used for both classification and regression problems.
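As a minimal sketch of the two uses, the snippet below fits scikit-learn's DecisionTreeClassifier and DecisionTreeRegressor on the same tiny input. The six data points are made up purely for illustration and are not taken from the diabetes dataset used later in this post.

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier, DecisionTreeRegressor

# Toy data (invented for illustration): one feature, two obvious groups.
X = np.array([[1.0], [2.0], [3.0], [10.0], [11.0], [12.0]])
y_class = np.array([0, 0, 0, 1, 1, 1])                # discrete labels
y_reg = np.array([1.1, 1.9, 3.2, 9.8, 11.1, 12.3])    # continuous targets

# Classification: each leaf predicts a class label.
classifier = DecisionTreeClassifier(random_state=0).fit(X, y_class)

# Regression: each leaf predicts the mean target of its training samples.
# max_depth=1 keeps the tree to a single split so the result is easy to follow.
regressor = DecisionTreeRegressor(max_depth=1, random_state=0).fit(X, y_reg)

new_points = [[2.5], [10.5]]
class_pred = classifier.predict(new_points)
reg_pred = regressor.predict(new_points)

print("Predicted classes:", class_pred)
print("Predicted values:", reg_pred)
```

The classifier returns one of the labels seen in training, while the regressor returns a number (here, the average of the training targets that fall in the same leaf).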

The goal of the algorithm is to build a model that predicts the value of a target variable. It represents the problem as a tree: each internal node tests an attribute, and each leaf node corresponds to a class label (or, for regression, a predicted value).
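The tree structure described above can be inspected directly. The sketch below is only an illustration: it uses the Iris dataset bundled with scikit-learn (not the dataset from this post) and limits the depth to 2 so the printout stays short. The export_text helper prints each internal node as an attribute test and each leaf as a class.

```python
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier, export_text

# Iris is used here only because it ships with scikit-learn.
iris = load_iris()

# A shallow tree keeps the printed structure readable.
clf = DecisionTreeClassifier(max_depth=2, random_state=0)
clf.fit(iris.data, iris.target)

# Internal nodes appear as "feature <= threshold" tests,
# leaf nodes appear as "class: <label>".
tree_text = export_text(clf, feature_names=list(iris.feature_names))
print(tree_text)
```

Reading the output from top to bottom follows one path from the root to a leaf, which is exactly how the model makes a prediction for a new sample.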






scikit-learn provides a ready-made decision tree implementation, which makes the algorithm very easy to use.

CODE:

import pandas as pd
from sklearn.tree import DecisionTreeClassifier 
from sklearn.model_selection import train_test_split 
from sklearn import metrics 

data = pd.read_csv("diabetes.csv") #Dataset

data.head()
Pregnancies  Glucose  BloodPressure  SkinThickness  Insulin   BMI  Pedigree  Age  Outcome
          6      148             72             35        0  33.6     0.627   50        1
          1       85             66             29        0  26.6     0.351   31        0
          8      183             64              0        0  23.3     0.672   32        1
          1       89             66             23       94  28.1     0.167   21        0
          0      137             40             35      168  43.1     2.288   33        1

# Feature columns used as model input; Outcome is the target
features = ['Pregnancies', 'Insulin', 'BMI', 'Age', 'Glucose', 'BloodPressure', 'Pedigree']
X = data[features]
y = data.Outcome

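# Hold out 30% of the rows for testing
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=1)

# Train the classifier on the training split, then predict on the test split
clf = DecisionTreeClassifier()
clf = clf.fit(X_train, y_train)
y_pred = clf.predict(X_test)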

print("Accuracy:",metrics.accuracy_score(y_test, y_pred))
>>Accuracy: 0.6796536796536796
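
The accuracy reported above is simply the fraction of test samples predicted correctly. The sketch below shows that calculation by hand next to metrics-style scoring, plus a confusion matrix. Since diabetes.csv is not bundled with scikit-learn, it assumes a synthetic stand-in dataset from make_classification with the same number of rows and features, so its numbers will differ from the ones in this post. Passing random_state to the classifier is an addition here to make repeated runs give the same tree.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.metrics import accuracy_score, confusion_matrix
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for the diabetes data (assumption): 768 rows, 7 features.
X, y = make_classification(n_samples=768, n_features=7, random_state=1)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=1
)

# Fixing random_state makes the fitted tree reproducible between runs.
clf = DecisionTreeClassifier(random_state=1)
clf.fit(X_train, y_train)
y_pred = clf.predict(X_test)

# Accuracy by hand: share of predictions that match the true labels.
manual_accuracy = np.mean(y_pred == y_test)
library_accuracy = accuracy_score(y_test, y_pred)

# Rows are true classes, columns are predicted classes.
cm = confusion_matrix(y_test, y_pred)

print("Manual accuracy: ", manual_accuracy)
print("Library accuracy:", library_accuracy)
print("Confusion matrix:")
print(cm)
```

The diagonal of the confusion matrix counts the correct predictions, so dividing its sum by the total number of test samples gives the same accuracy again.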

