X-Ray Diffractions Detector
A Multi-Label Image Classification
- Tracking Experiments
- Extract Data
- Setup the notebook
- Analysis
- Data Munching
- Preparing the data
- Training
- Results
- Prediction
- Conclusion
This program attempts something whose correctness I have no real way to verify. I found a peculiar dataset, the X-Ray Diffraction image set, thanks to Czyzewski et al. The image set used for training was downloaded from the link available here. I love trying out something new as a hobby and seeing how it works out.
Czyzewski, Adam, Krawiec, Faustyna, Brzezinski, Dariusz, & Porebski, Przemyslaw J. (2019). RefleX: X-ray diffraction images dataset (Version 1.0.0) [Data set]. Zenodo. http://doi.org/10.5281/zenodo.2605120
X-Ray Diffraction image set from here
Now we are going to explore options for getting our accuracy high enough. Various tools exist to track the experiments that were run. Having such records helps us understand which hyperparameters to choose and builds intuition about which experiments to try next.
We are going to use WandB, which is among the most widely used of these tools.
I use it for logging my experiments. However, since only the run that gave the best result is shown below, it makes little difference here whether you use WandB and WandbCallback.
#collapse
import os
os.environ['WANDB_BASE_URL'] = 'http://localhost:8080'
from fastai2.callback.wandb import wandb,WandbCallback
If you haven't extracted the zip file yet, the first step is to extract it. The extraction can be done manually or with Python.
The next step is to set up the notebook and set the path correctly. This removes possible hindrances later on.
Let's import the required libraries from fastai2. As this is a vision task, we import fastai2.vision.
The file that maps the labels to the images is a CSV. pandas is good at tabular data and is the most convenient way to handle CSV data, so we import that as well.
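The Python route can be sketched with the standard-library zipfile module. The helper name `extract_archive` and the destination folder are illustrative assumptions, not part of the original notebook.

```python
import zipfile
from pathlib import Path

def extract_archive(zip_path, dest_dir):
    """Extract every member of zip_path into dest_dir and return the member names."""
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)  # create the target folder if needed
    with zipfile.ZipFile(zip_path) as archive:
        archive.extractall(dest)
        return archive.namelist()

# Hypothetical usage, adjust both paths to your own layout:
# extract_archive("path/to/downloaded.zip", "path/to/data/")
```

Returning the member names makes it easy to confirm that the expected folders came out of the archive.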
from fastai2.vision.all import *
from fastai2.vision.widgets import *
import pandas as pd
We create a path variable that points to the location where the dataset was downloaded.
#collapse
path=Path("../../data/Multi-label/")
Path.BASE_PATH = path
Path from pathlib by itself doesn't have a convenience function to list all child files and folders. We would have to call list(path.iterdir()) to get that. fastai, however, provides a convenient .ls() method that gets patched onto Path when we import the library.
files=path.ls()
files
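For comparison, the same listing can be done with pathlib alone. This is a minimal sketch; `list_children` is a hypothetical helper name, and sorting is added only to make the output stable.

```python
from pathlib import Path

def list_children(folder):
    """Return the immediate children of folder, sorted by name, using only pathlib."""
    return sorted(Path(folder).iterdir())

# Example: names of everything directly inside the current directory
print([child.name for child in list_children(".")])
```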
Here we see that the images are available in the images folder and the label-image mapping is in labels.csv. The original downloaded zip file, reflex_img_1024_inter_nearest.zip, is also present, but it is never used in this program.
I have no idea how the data is laid out in the CSV or the images. Let's load the data and have a look at it.
labels_df=pd.read_csv(path/'labels.csv')
labels_df
Now we see that the CSV has image as the first column, followed by one column per label, each holding 1 or 0 for True or False.
There are 6311 samples in total. We also see that there are 8 columns: 1 input (the image file name) and 7 possible classes.
All the label columns hold 0 or 1. One option is to feed these directly to the model as a one-hot encoded vector, but then interpreting the 0s and 1s and mapping them back to columns becomes inconvenient.
If we let fastai2 do the one-hot encoding, interpreting the model output becomes more convenient. So I prefer to store the labels as a string of label names separated by commas. I do that in the following steps.
- Get all label columns
- For each row, if the label is marked as present (i.e. its value is 1), add its name to the string
- Add a new column labels holding the names of the labels that are present
y_cols=list(labels_df.columns[1:])
def get_labels(row):
    return ', '.join([col for col in y_cols if row[col]==1])
labels_df['labels']=labels_df.apply(get_labels,axis=1)
We have created a new column labels which lists the labels for each image. Let's see if we got it right.
labels_df
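One way to check the conversion without eyeballing the table is a round trip: rebuild the 0/1 values from the joined string and compare them with the original columns. The sketch below uses plain dictionaries in place of dataframe rows and two made-up rows; the helper names are hypothetical, and the three column names are label names from this dataset used only for illustration.

```python
def join_labels(row, label_cols):
    """Comma-separated names of the columns whose value is 1 (same idea as get_labels)."""
    return ', '.join(col for col in label_cols if row[col] == 1)

def split_labels(text, label_cols, delim=', '):
    """Inverse of join_labels: rebuild the 0/1 value for every label column."""
    present = set(text.split(delim)) if text else set()
    return {col: int(col in present) for col in label_cols}

label_cols = ['loop_scattering', 'non-uniform_detector', 'strong_background']
# Made-up rows for illustration, not taken from labels.csv
rows = [
    {'loop_scattering': 1, 'non-uniform_detector': 0, 'strong_background': 1},
    {'loop_scattering': 0, 'non-uniform_detector': 0, 'strong_background': 0},
]
for row in rows:
    text = join_labels(row, label_cols)
    assert split_labels(text, label_cols) == row  # round trip must be lossless
    print(repr(text))
```

A row with no labels produces an empty string, which is worth keeping in mind when choosing the delimiter handling later.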
There are 6311 rows, which means there should be at least 6311 images in the images folder.
len(Path(path/"images").ls())
We have one more than required. Let's see if there are any non-.png files or folders.
Path(path/"images").ls().filter(lambda x: x.suffix!='.png')
OK. There seems to be an empty models folder inside the images folder.
Now let's look at a random image.
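Counting files only shows that the totals line up. A stricter check is to confirm that every name listed in the CSV has a matching file on disk. This is a minimal sketch using pathlib; `missing_images` is a hypothetical helper, and it assumes the CSV stores file names without the extension.

```python
from pathlib import Path

def missing_images(names, image_dir, suffix='.png'):
    """Return the names that have no matching file name+suffix inside image_dir."""
    on_disk = {p.stem for p in Path(image_dir).iterdir() if p.suffix == suffix}
    return [name for name in names if str(name) not in on_disk]

# Hypothetical usage with the objects from this notebook:
# missing_images(labels_df['image'], path/'images')  # an empty list means every row has a file
```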
from random import randint
image_paths=Path(path/"images").ls().filter(lambda x: x.suffix=='.png') # skips the models folder
index=randint(0,len(image_paths)-1) # randint includes both endpoints
image_path=image_paths[index]
im = Image.open(image_path)
im.thumbnail((100,100))
im
Rerunning the above cell shows enough samples to get an idea of the dataset. The images appear to be grayscale.
We need DataLoaders to specify the transformations that prepare the data for training. Luckily, ImageDataLoaders.from_df is a handy method for generating the dataloaders. However, it has no provision for single-channel images, so we write an edited copy of that method.
There are a couple of things to note about the choice of parameters:
- The first parameter is the dataframe itself
- label_col is the column which has the list of labels
- path must specify the path where the images are stored
- suff is the suffix that must be added to all the filenames; in short, the file extension
- label_delim specifies the delimiter that separates two labels (in the label_col)
- item_tfms specifies the item-level transformations. Here we are just resizing the original image to 224*224. I use Squish to resize because I am guessing that cropping might remove some of the features, especially after looking at label names like loop_scattering, non-uniform_detector or strong_background. For loop_scattering, there seems to be some disturbance in the rings that is not that obvious in the inner rings, so cropping might not be that useful.
Note: The default augmentations are left out on purpose, as brightness and contrast changes can affect the strong_background feature.
def generate_dataloaders(df, path='.', valid_pct=0.2, seed=None, fn_col=0, folder=None, suff='', label_col=1, label_delim=None,
                         y_block=None, x_block=None, valid_col=None, item_tfms=None, batch_tfms=None, **kwargs):
    "Create from `df` using `fn_col` and `label_col`"
    pref = f'{Path(path) if folder is None else Path(path)/folder}{os.path.sep}'
    if y_block is None:
        is_multi = (is_listy(label_col) and len(label_col) > 1) or label_delim is not None
        y_block = MultiCategoryBlock if is_multi else CategoryBlock
    if x_block is None:
        x_block = ImageBlock
    splitter = RandomSplitter(valid_pct, seed=seed) if valid_col is None else ColSplitter(valid_col)
    dblock = DataBlock(blocks=(x_block, y_block),
                       get_x=ColReader(fn_col, pref=pref, suff=suff),
                       get_y=ColReader(label_col, label_delim=label_delim),
                       splitter=splitter,
                       item_tfms=item_tfms,
                       batch_tfms=batch_tfms)
    return DataLoaders.from_dblock(dblock, df, path=path, **kwargs)
We now create the dataloaders with this dataframe:
- x_block is set to a single-channel (black and white) image block
- path specifies the location of the images
- label_col specifies the column where the dependent variable is available
- label_delim specifies the delimiter that splits the labels in label_col
- suff adds the suffix to the image file name
- item_tfms specifies the transformation to be done at item level. Here we resize to a 48*48 image by squishing so that the entire image is kept
- bs specifies the batch size to be used
dataloaders=generate_dataloaders(labels_df,x_block=ImageBlock(PILImageBW),label_col='labels',path=path/'images',suff='.png',label_delim=', ',item_tfms=Resize(48,ResizeMethod.Squish),bs=128)
Let's get a batch of data and check that the input has the shape we expect.
dataloaders.one_batch()[0].shape
With bs set to 128 and single-channel images resized to 48*48, a batch of 128 images with 1 channel and 48*48 pixels each is what we expect, so the shape seems reasonable.
Let's look at some random images from the batch to check that everything is right.
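The numbers behind that shape can be worked out by hand. The sketch below assumes the random split puts int(valid_pct * n) items in the validation set, which is my understanding of how RandomSplitter behaves, and that the training loader drops the last incomplete batch. Treat both as assumptions to verify against your installed version; the helper names are hypothetical.

```python
def split_sizes(n_items, valid_pct=0.2):
    """Training and validation set sizes for a random split (validation size is truncated)."""
    n_valid = int(valid_pct * n_items)
    return n_items - n_valid, n_valid

def batch_shape(bs, channels, size):
    """Shape of one image batch in (batch, channels, height, width) order."""
    return (bs, channels, size, size)

n_train, n_valid = split_sizes(6311, valid_pct=0.2)
print(n_train, n_valid)         # 5049 1262
print(batch_shape(128, 1, 48))  # (128, 1, 48, 48)
print(n_train // 128)           # 39 full training batches per epoch
```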
dataloaders.show_batch(max_n=9)
Now we take a pretrained ResNet18 and train it on this particular dataset.
learn = cnn_learner(dataloaders, resnet18, metrics=partial(accuracy_multi, thresh=0.5), cbs=WandbCallback())
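To make the metric concrete, here is a plain-Python sketch of how I understand accuracy_multi with thresh=0.5: apply a sigmoid to each raw output, threshold it, and count how many individual label decisions match the targets. This is an illustration with made-up numbers and a hypothetical function name, not fastai's own code.

```python
import math

def accuracy_multi_sketch(logits, targets, thresh=0.5):
    """Fraction of (sample, label) pairs where the thresholded sigmoid matches the 0/1 target."""
    correct = total = 0
    for row_logits, row_targets in zip(logits, targets):
        for logit, target in zip(row_logits, row_targets):
            prob = 1.0 / (1.0 + math.exp(-logit))  # sigmoid turns a raw score into a probability
            correct += int((prob > thresh) == bool(target))
            total += 1
    return correct / total

# Two made-up samples with three labels each
logits = [[2.0, -1.0, 0.5], [-3.0, 0.2, -0.7]]
targets = [[1, 0, 0], [0, 1, 1]]
print(accuracy_multi_sketch(logits, targets))  # 4 of the 6 label decisions match
```

Because every label of every image is scored separately, this number can look high even when few images have all of their labels right.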
lr_find is a useful function for finding a good learning rate for training. It outputs a graph and two values: one derived from the point where the loss was at its lowest, and the other from the point of steepest slope. The steepest slope is the one of interest to us in most cases.
learn.lr_find()
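The idea of the steepest slope can be illustrated with a few lines of plain Python: given the learning rates tried and the loss recorded at each, pick the point where the loss falls fastest per decade of learning rate. This is a simplified stand-in with made-up numbers and a hypothetical function name, not the algorithm fastai uses internally.

```python
import math

def steepest_lr(lrs, losses):
    """Learning rate at the start of the interval where loss falls fastest per decade of lr."""
    best_lr, best_slope = None, 0.0
    for i in range(len(lrs) - 1):
        d_log_lr = math.log10(lrs[i + 1]) - math.log10(lrs[i])
        slope = (losses[i + 1] - losses[i]) / d_log_lr
        if slope < best_slope:  # more negative means a steeper descent
            best_lr, best_slope = lrs[i], slope
    return best_lr  # None if the loss never decreases

# Made-up sweep: loss drops fastest between 1e-3 and 1e-2, then blows up
lrs = [1e-5, 1e-4, 1e-3, 1e-2, 1e-1]
losses = [0.70, 0.69, 0.60, 0.35, 0.90]
print(steepest_lr(lrs, losses))  # 0.001
```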
fine_tune trains the newly added layers for one epoch while the pretrained layers stay frozen, then unfreezes the earlier layers and trains the whole network for the remaining epochs. That is why we see two tables of losses and metrics.
learn.fine_tune(10, 4e-2)
I ran through various experiments. The code below loads a previously saved model. I save the models that reach better accuracy and use them as the starting point for further training.
learn=learn.load(run_name+'.h5')
For this experiment I didn't change the architecture much. I tried changing the image size, batch size and learning rate. Choosing the learning rate after the initial training is a little hard. Thanks to W&B, I could see the learning rate at each epoch. I chose the lr at which accuracy was improving most steeply towards the end of training and updated it.
learn.fit_one_cycle(10,1e-6)
We have to save the model for later inference.
learn.save(run_name+'.h5')
wandb.save(run_name+'.h5')
Of the experiments we ran, some of the results were as follows:
Run name | Image Size | Accuracy (accuracy_multi) | Batch Size |
---|---|---|---|
Basic Resnet 18 | 224 | 0.941193 | Default |
Basic Resnet 18 - 48 (256) | 48 | 0.926874 | 256 |
Basic Resnet 18 - 48 (128) | 48 | 0.938307 | 128 |
Basic Resnet 18 - 48 (64) | 48 | 0.936382 | 64 |
Basic Resnet 18 - 48 (32) | 48 | 0.931741 | 32 |
Which model to choose:
The table leaves out the experiments that used 3 channels as input, simply because they gave the same result while needing more weights in the model and possibly slightly more time to train or infer.
Run 1 is around 0.3 percentage points more accurate than Run 3, but it uses a bigger image. This means:
- Run 1 processes far more pixels per image than Run 3, so it needs more computation and memory
- Run 1 might take more time to train or infer than Run 3
Tip: If the use case allows a slightly less accurate model, always prefer the smaller model.
X-ray diffraction images are not easy to come by. So I am choosing an image from the validation set to check how the model predicts. To do so, let's first look at the validation images.
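The trade-off between Run 1 and Run 3 can be put into numbers using only the table values. The pixel ratio is a rough proxy for per-image compute that I am adding here for illustration, not a measured timing.

```python
# Values copied from the results table
acc_run1, size_run1 = 0.941193, 224   # Basic Resnet 18
acc_run3, size_run3 = 0.938307, 48    # Basic Resnet 18 - 48 (128)

gap_points = (acc_run1 - acc_run3) * 100            # accuracy gap in percentage points
pixel_ratio = (size_run1 ** 2) / (size_run3 ** 2)   # pixels per image, Run 1 vs Run 3

print(f"accuracy gap: {gap_points:.2f} percentage points")  # accuracy gap: 0.29 percentage points
print(f"pixels per image: {pixel_ratio:.1f}x")              # pixels per image: 21.8x
```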
learn.dls.valid_ds.items[['image','labels']]
I am choosing the second one as an arbitrary image to test.
learn.predict(path/'images/9172_1_E1_001.png')
Here it returns the predicted labels, the inferred one-hot encoded values, and the actual tensor values output by the model.
We also see that this label prediction is indeed correct
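How the first two outputs relate to the third can be sketched in a few lines: each per-label probability is compared against a threshold, giving the mask, and the label names where the mask is true form the predicted labels. The vocab subset, the probabilities and the function name below are made up for illustration, and the 0.5 threshold is an assumption.

```python
def decode_prediction(probs, vocab, thresh=0.5):
    """Return (labels, mask): names whose probability exceeds thresh, and the True/False mask."""
    mask = [p > thresh for p in probs]
    labels = [name for name, keep in zip(vocab, mask) if keep]
    return labels, mask

# Hypothetical vocab subset and made-up probabilities, for illustration only
vocab = ['loop_scattering', 'non-uniform_detector', 'strong_background']
probs = [0.91, 0.08, 0.64]
labels, mask = decode_prediction(probs, vocab)
print(labels)  # ['loop_scattering', 'strong_background']
print(mask)    # [True, False, True]
```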
The idea of this blog post is to walk through a simple multi-label image classifier. There are lots of other options we could have experimented with, but since the dataset was relatively simple, we stop here.
We can try a different dataset that demands more accuracy in a future blog post.
Kindly let me know your comments on this post.