MobileNet for Photos
Following the last article, this time we would like to know what a photo is about before it's opened. If we could get this information, it could also feed a dashboard filter that lets users pick what they would like to see. The question then becomes how to get this information. Luckily, in this tech-booming world, many AI tools come well packaged, so we can use them directly without building them ourselves.
Here, I chose MobileNet as the tool to solve our problem. MobileNet can be used for classification, detection, and other CNN tasks. The results it provides may not be the best, but they are quite good in terms of cost-performance: the model is small, yet it still delivers decent quality. From what I have read, the drop in accuracy compared with larger models is quite small. I won't go further into the reasons, as that is really not my field; here we focus on how to apply this tool to our case.
In the same vein as the last article, we use Colab as our editor. First of all, we have to import the packages we need.
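The article doesn't list the exact imports, so here is a plausible set for a Colab notebook using MobileNet with TensorFlow 2.x:

```python
# Assumed import list for this walkthrough; TensorFlow and pandas
# come preinstalled in Colab.
import numpy as np
import pandas as pd
import tensorflow as tf
from tensorflow.keras.applications import imagenet_utils
```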
Then we create a function that takes an image path, loads the image, and hands it over to the tf.keras.applications.mobilenet.preprocess_input() function, so the image can later be fed into MobileNet.
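A minimal sketch of that loading function; the function name, the 224x224 target size (MobileNet's default ImageNet input), and the use of `tf.keras.utils.load_img` are my assumptions, not the article's exact code:

```python
import numpy as np
import tensorflow as tf

def load_and_preprocess(image_path, target_size=(224, 224)):
    # Load the image file and resize it to MobileNet's expected input size.
    img = tf.keras.utils.load_img(image_path, target_size=target_size)
    arr = tf.keras.utils.img_to_array(img)
    # Add a batch dimension: (224, 224, 3) -> (1, 224, 224, 3).
    arr = np.expand_dims(arr, axis=0)
    # MobileNet's preprocess_input scales pixel values to [-1, 1].
    return tf.keras.applications.mobilenet.preprocess_input(arr)
```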
In the next step, we read the data saved in Google Drive and filter it. Following the last article, a photo's DateTime is set to 1900-12-30 13:32:00 when the DateTime metadata can't be read, and data without a DateTime usually has a format issue. Therefore, we filter those rows out while cleaning the data.
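The cleaning step could look like the sketch below. The column name `DateTime` and the sentinel value come from the article; the Drive path and function name are placeholders:

```python
import pandas as pd

# In Colab, Drive is usually mounted first:
# from google.colab import drive; drive.mount("/content/drive")
# df = pd.read_csv("/content/drive/MyDrive/photo_data.csv")  # placeholder path

# Sentinel used in the last article for photos whose DateTime
# metadata could not be read.
SENTINEL = pd.Timestamp("1900-12-30 13:32:00")

def clean_photo_data(df):
    out = df.copy()
    out["DateTime"] = pd.to_datetime(out["DateTime"], errors="coerce")
    # Drop the rows carrying the sentinel (i.e. photos with format issues).
    return out[out["DateTime"] != SENTINEL].reset_index(drop=True)
```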
After the data is cleaned, we create a new column, “mobile_net_result”, to store MobileNet's output. We then use imagenet_utils.decode_predictions to decode the prediction of the ImageNet model, and write the result back to the data frame. In the current design of the photo analysis, I only plan to use this column for keyword search, so I simply put the whole result in each cell.
Finally, let’s spot-check MobileNet’s performance: I randomly pick two photos and look at their analysis results. The first photo’s top prediction is book_jacket; the photo is actually a book, so the result is pretty close. The second photo’s prediction, dock, is exactly right. Though I can’t say every photo would get such a good result, the overall performance is good enough for what I need.
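The spot-check itself could be as simple as the helper below; the function name is hypothetical, and it assumes the `mobile_net_result` column built in the previous step:

```python
import pandas as pd

def spot_check(df, n=2, seed=None):
    # Sample n random rows and print each photo's stored analysis result.
    for _, row in df.sample(n=n, random_state=seed).iterrows():
        print(row["image_path"], "->", row["mobile_net_result"])
```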
This article focuses more on the application of MobileNet than on the theory or optimization of the model itself. I hope it is useful for people with the same pain point of handling photos as me. :)