Classical Dances of India
Image Classifier that classifies the different classical dances of India
India has many classical dance forms, each specific to a region's culture. For example, Tamil Nadu follows a classical dance called Bharathanatyam, whereas its neighbouring state Kerala has the classical dance of Kathakali.
Not being particularly knowledgeable about any of these classical dances, I was intrigued by Jeremy Howard's suggestion that we can solve deep learning problems even if we ourselves aren't domain experts. So I decided to find out whether I could build a model that performs well at classifying the classical dances by name.
I have done this before, but this time it is on the latest version of FastAi (V2) and is part of the ongoing course on it, which will be public by June 2020. If you have checked out the previous blog post on this, you may consider this a version-updated blog.
Steps
To build a classifier of our own that runs end to end, we need to follow the steps below.
- Download the dataset
- Do preprocessing of data, if any
- Create a model to detect the classical dance
- Deploy it using a single GUI
To download a dataset, we need an Image Search API. Bing is one such API with good image search capabilities, but we have to register for Bing Image Search to get an access key before we can use it.
from azure.cognitiveservices.search.imagesearch import ImageSearchClient as api
from msrest.authentication import CognitiveServicesCredentials as auth
from fastai2.vision.all import *
from fastai2.vision.widgets import *
def search_images_bing(key, term, min_sz=128, count=150):
    client = api('https://api.cognitive.microsoft.com', auth(key))
    return L(client.images.search(query=term, count=count, min_height=min_sz, min_width=min_sz).value)
After signing up, add the key in the cell below and list the classes that you want to classify.
key='xxx'
classes = ['Bharathanatyam','Kathakali','Kathak','jagoi dance']
Select a path where the dataset will be stored.
path=Path('./data/image-classification')
Download images for each class and put them in separate folders, where each folder name specifies the class name.
path.mkdir(exist_ok=True)
for o in classes:
    dest = (path/o)
    dest.mkdir(exist_ok=True)
    results = search_images_bing(key, o)
    download_images(dest, urls=results.attrgot('content_url'))
After downloading the images, verify that the files were created as expected.
fns = get_image_files(path)
fns
Now that we have downloaded the images, our next step is to train the model to identify classical dance images. Before that, we need to check whether the download produced any corrupt images. These files will not open for some reason and hence break when we try to use them. We don't want to train our network on these images.
fastai comes with the verify_images method to find them.
failed = verify_images(fns)
failed
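For intuition, verify_images essentially tries to open every file and collects the paths that fail. The following is only a rough stdlib sketch of that idea, checking file headers against a few well-known image signatures (fastai's real check goes further and actually decodes each image):

```python
# Magic-byte prefixes for a few common image formats (illustrative, not exhaustive)
SIGNATURES = (b"\x89PNG\r\n\x1a\n", b"\xff\xd8\xff", b"GIF87a", b"GIF89a")

def looks_like_image(head: bytes) -> bool:
    """Heuristic: does this byte prefix match a known image signature?"""
    return any(head.startswith(sig) for sig in SIGNATURES)

def find_suspects(files: dict) -> list:
    """Given a mapping of file name -> first bytes, return names matching no signature."""
    return [name for name, head in files.items() if not looks_like_image(head)]
```

Note that a truncated download often still has a valid header, which is why fully decoding each image is the more reliable check.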
Delete the corrupt images using the unlink method on Path.
failed.map(Path.unlink)
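Here failed is a fastai L (list-like object), and map(Path.unlink) simply calls unlink on every path in it. The equivalent in plain Python, sketched against a throwaway temporary directory:

```python
import tempfile
from pathlib import Path

# Set up a throwaway directory with two stand-in "corrupt" files
tmp = Path(tempfile.mkdtemp())
failed = [tmp / "bad1.jpg", tmp / "bad2.jpg"]
for f in failed:
    f.touch()

# Equivalent of failed.map(Path.unlink): delete each listed file
for f in failed:
    f.unlink()

remaining = list(tmp.iterdir())  # the directory is empty again
```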
classical_dances_db = DataBlock(
    blocks=(ImageBlock, CategoryBlock),
    get_items=get_image_files,
    splitter=RandomSplitter(valid_pct=0.3, seed=42),
    get_y=parent_label,
    item_tfms=Resize(128))
In the above code:
- blocks specifies how to generate the independent and dependent variables
- get_items provides a way to get each item from the path; in this case, individual image file paths
- splitter provides a way to split the data into training and validation sets. The fixed seed makes sure the same images end up in each set every time we randomly split the dataset
- get_y is a method to get the Y value out of an individual path. Since getting Y from the path is a little tricky (we have to look at the parent folder name), we specify parent_label
- item_tfms specifies transformations to be done at item level, such as resizing to a 128x128 image (usually done through centre cropping)
Some transforms, such as moving data to the GPU and normalisation, are done implicitly.
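To make these pieces concrete, here is a small stdlib sketch (with hypothetical file names) of what parent_label and RandomSplitter(valid_pct=0.3, seed=42) boil down to:

```python
import random
from pathlib import Path

# Hypothetical image paths, laid out folder-per-class as downloaded earlier
files = [Path(f"data/image-classification/{cls}/{i}.jpg")
         for cls in ("Bharathanatyam", "Kathakali", "Kathak", "jagoi dance")
         for i in range(5)]

# parent_label: the class of an image is simply its parent folder's name
labels = [p.parent.name for p in files]

# A seeded random split: shuffle indices reproducibly, hold out 30% for validation
rng = random.Random(42)
idxs = list(range(len(files)))
rng.shuffle(idxs)
cut = int(len(files) * 0.3)
valid_idxs, train_idxs = idxs[:cut], idxs[cut:]
```

Because the seed is fixed, rerunning the split always produces the same validation set, which is what makes results comparable across runs.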
Now that we have specified the shell of the data, we load it for training using dataloaders. The dataloaders conveniently collect all the data provided and apply the transformations so that the data can be fed into the model for training.
dls = classical_dances_db.dataloaders(path)
Now that we have transformed the data, let's look at a batch to check that it is transformed right.
dls.show_batch(max_n=4, nrows=1)
After some analysis we find that these images look ok. So, let's start training the machine.
Train the model with ResNet-18 and finetune 4 epochs
As we are all set for training, we can continue to train the machine to learn the type of dance from an image.
learn = cnn_learner(dls, resnet18, metrics=error_rate)
learn.fine_tune(4)
Here we created a cnn_learner which takes:
- dls - the data to be loaded
- arch - a pretrained ResNet model (resnet18) so that we can train faster
- metrics - the metrics to be displayed after training
We will analyse the model to get more information out of it.
A Confusion Matrix tabulates the actual vs predicted output. If both are the same, then our model did a great job identifying the image. Let's see how our model performed.
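Building such a matrix is simple enough to sketch by hand; the rows are actual classes and the columns are predicted classes (the labels below are made up for illustration):

```python
from collections import Counter

def confusion_matrix(actual, predicted, classes):
    """Rows are actual classes, columns are predicted classes."""
    counts = Counter(zip(actual, predicted))
    return [[counts[(a, p)] for p in classes] for a in classes]

classes = ["Bharathanatyam", "Kathak"]
actual    = ["Bharathanatyam", "Kathak", "Kathak", "Bharathanatyam"]
predicted = ["Bharathanatyam", "Bharathanatyam", "Kathak", "Bharathanatyam"]

# Diagonal entries are correct predictions; off-diagonal entries are confusions
cm = confusion_matrix(actual, predicted, classes)  # [[2, 0], [1, 1]]
```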
Look at the confusion matrix to see which one goes wrong mostly
interp = ClassificationInterpretation.from_learner(learn)
interp.plot_confusion_matrix()
The confusion is mostly around Kathak. The model often mistakes Kathak for Bharathanatyam and jagoi dance.
Look at the most confused images
interp.plot_top_losses(4, nrows=2)
There can be cases where some images are messy. For example, a search result in which a dancer famous in, say, Bharathanatyam is performing Kathak. Such images can confuse the model while learning. Let's look at whether that's the case.
This is also a place where we should probably consult domain experts. They might help us confirm whether the images we call Kathak are actually of the Kathak dance per se.
FastAI comes with this very cool widget which is very useful in these cases
cleaner = ImageClassifierCleaner(learn)
cleaner
The output of the previous one is an interactive widget which allows you to select images that were wrongly classified or images that are to be deleted.
Some common images that had to be removed include:
- Photos of probably famous dancers taken during some interview or award ceremony
- The agenda of a dance festival
- The ornaments and dresses associated with that dance
- Images of artists doing their makeup or dressing while preparing for the dance
We had one image of each type. The widget only records our selections; to apply them, we delete the flagged files and move the relabelled ones using the results of the delete and change methods.
for idx in cleaner.delete(): cleaner.fns[idx].unlink()
for idx,cat in cleaner.change(): shutil.move(str(cleaner.fns[idx]), path/cat)
Now that the confusing images are removed, the model might understand the data better when retrained. So we essentially repeat the steps from earlier.
classical_dances_db_2 = DataBlock(
    blocks=(ImageBlock, CategoryBlock),
    get_items=get_image_files,
    splitter=RandomSplitter(valid_pct=0.3, seed=42),
    get_y=parent_label,
    item_tfms=Resize(128))
dls = classical_dances_db_2.dataloaders(path)
dls.show_batch(max_n=4, nrows=1)
learn = cnn_learner(dls, resnet18, metrics=error_rate)
learn.fine_tune(4)
interp = ClassificationInterpretation.from_learner(learn)
interp.plot_confusion_matrix()
Next we apply data augmentation. As this is a very basic piece of work, we use the default augmentations from aug_transforms and try the process one more time.
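aug_transforms applies batch-level augmentations such as flips, rotations, and zooms, so the model sees a slightly different version of each image every epoch. As a toy illustration of the simplest of these (not fastai's implementation), here is a horizontal flip of a tiny pixel grid:

```python
def hflip(image):
    """Horizontally flip an image represented as a list of pixel rows."""
    return [list(reversed(row)) for row in image]

img = [[1, 2, 3],
       [4, 5, 6]]
flipped = hflip(img)  # [[3, 2, 1], [6, 5, 4]]
```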
classical_dances_db_3 = DataBlock(
    blocks=(ImageBlock, CategoryBlock),
    get_items=get_image_files,
    splitter=RandomSplitter(valid_pct=0.3, seed=42),
    get_y=parent_label,
    item_tfms=Resize(128),
    batch_tfms=aug_transforms())
dls = classical_dances_db_3.dataloaders(path)
dls.show_batch(max_n=4, nrows=1)
learn = cnn_learner(dls, resnet18, metrics=error_rate)
learn.fine_tune(4)
interp = ClassificationInterpretation.from_learner(learn)
interp.plot_top_losses(4, nrows=2)
There is obviously more that can be done here, but with some good defaults we get around 91% accuracy. Since we are satisfied with that, let's export the model by calling learn.export()
learn.export()
path = Path('.')
path.ls(file_exts='.pkl')
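path.ls(file_exts='.pkl') is a fastai convenience; plain pathlib's glob does the same job, sketched here in a temporary directory:

```python
import tempfile
from pathlib import Path

tmp = Path(tempfile.mkdtemp())
(tmp / "export.pkl").touch()
(tmp / "notes.txt").touch()

# Keep only the pickle files, like ls(file_exts='.pkl')
pkls = sorted(p.name for p in tmp.glob("*.pkl"))  # ['export.pkl']
```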
We can load the model using the load_learner method.
learn_inf = load_learner(path/'export.pkl')
Now we can load a random image; it can even be from outside our dataset. For ease, I am choosing one from the dataset itself.
path=Path('./data/image-classification')
paths=(path/'Bharathanatyam').ls()
paths[0]
Now we predict by calling the predict method, passing in the path of the image.
learn_inf.predict(str(paths[0]))
As you can see, the file was in the Bharathanatyam folder and the prediction was also Bharathanatyam, with a confidence of 99.98%. Hence we got the correct prediction.
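predict returns three things: the decoded label, its index in the vocabulary, and the per-class probabilities; the 99.98% confidence is just the probability of the winning class. A stdlib sketch (with hypothetical probabilities) of reading the top prediction out of such output:

```python
vocab = ["Bharathanatyam", "Kathak", "Kathakali", "jagoi dance"]
probs = [0.9998, 0.0001, 0.00005, 0.00005]  # hypothetical per-class probabilities

# The prediction is the class with the highest probability
idx = max(range(len(probs)), key=probs.__getitem__)
label, confidence = vocab[idx], probs[idx]
```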
If you have real-world data and want to see how the model performs on your own set of images, test it here.
It might take some time for the app to start, as we are using Binder to get it running.