Mehrab Mashrafi

Avian Odyssey: A Revolution in Bird Classification with AI

Topics Covered: Data Collection, Data Preprocessing, Image Augmentation, Transfer Learning, Computer Vision, Machine Learning

The vast diversity of bird species within the Aves class of the Animal Kingdom has long fascinated scientists and naturalists alike. Traditionally, the classification of birds has relied on meticulous examination of physical attributes, ranging from size and coloration to dietary habits and ecological niches. This age-old practice has formed the foundation of our understanding of avian biodiversity.

In an era where artificial intelligence (AI) has revolutionized numerous domains, we find ourselves presented with a unique opportunity. What if we could harness the power of AI to enhance our ability to classify and identify birds within this rich tapestry of avian life?

This project seeks to address this question by endeavoring to construct an innovative image classifier model capable of recognizing any bird species on Earth, categorizing them according to their respective orders. In the realm of ornithology, there are a total of 42 distinct orders within the avian class, each order representing a fascinating mosaic of avian diversity, distinct from the others in various ways. This ambitious project aims to harness the capabilities of AI and machine learning to create a tool that not only simplifies the classification process but also opens new avenues for research and understanding within the field of ornithology.

Data Collection

Data Sources

The dataset used for training the model consists of a total of 12,600 bird images, with each order represented by approximately 250-300 images. The images were collected from various sources on the web using the search_images_ddg() function of fastai and carefully labeled with the corresponding bird order.

Data Cleaning & Model Training

Data Cleaning

To combat potential biases, we implemented a thoughtful strategy during data collection. Leveraging the versatile search_images_ddg() function of fastai, we carefully selected diverse keywords to download images, transcending the conventional "bird sitting on trees" archetype. By incorporating various positions and behaviors, including "flying bird" and other dynamic scenarios, we aimed to capture a holistic view of each bird species, fostering data consistency and minimizing inherent biases.

The result? An intricately curated dataset, where each order is represented by approximately 250-300 images, ensuring a balanced and comprehensive foundation for our image classifier model. Our commitment to data integrity and inclusivity is embedded in every pixel, setting the stage for a cutting-edge exploration of avian biodiversity through the lens of artificial intelligence.
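The keyword strategy described above can be sketched roughly as follows. This is a minimal illustration, not the project's actual script: the order names, the modifier keywords, and the helper names build_queries and collect_urls are hypothetical, and the search function is passed in as a parameter on the assumption that it accepts a search term and a maximum image count (as search_images_ddg() is commonly called).

```python
from typing import Callable, Dict, List

# Hypothetical examples; the article does not list the exact orders or keywords used.
ORDERS = ["Passeriformes", "Strigiformes"]
MODIFIERS = ["bird", "flying bird", "bird on water"]


def build_queries(orders: List[str], modifiers: List[str]) -> Dict[str, List[str]]:
    """Combine every order with every modifier so each order is searched in several poses."""
    return {order: [f"{order} {modifier}" for modifier in modifiers] for order in orders}


def collect_urls(
    queries: Dict[str, List[str]],
    search_fn: Callable[[str, int], List[str]],
    per_query: int,
) -> Dict[str, List[str]]:
    """Run each query through search_fn and keep unique URLs per order, in first-seen order."""
    urls: Dict[str, List[str]] = {}
    for order, terms in queries.items():
        seen = set()
        unique: List[str] = []
        for term in terms:
            for url in search_fn(term, per_query):
                if url not in seen:
                    seen.add(url)
                    unique.append(url)
        urls[order] = unique
    return urls
```

Spreading the per-order image budget across several keywords, rather than running one large query, is what keeps a single pose from dominating an order's folder. The returned URLs would still need to be downloaded and checked before training.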

Iterative Refinement: Enhancing Model Precision

Following the initial fine-tuning phase of our image classifier model, we embarked on a meticulous journey of iterative refinement to elevate the precision and reliability of our classification system. Recognizing that the devil often resides in the details, we turned our focus to the images causing the most loss to the model.

This process required a careful dissection of misclassified and problematic images. With surgical precision, we identified and rectified misplaced images while discerningly removing unwanted elements that could compromise the model's accuracy. This phase proved to be both time-consuming and challenging, demanding a keen eye for detail and a commitment to the highest standards of quality.

Yet, in the pursuit of excellence, we understood the imperative of this thorough approach. By addressing and rectifying discrepancies at the image level, we fortified the foundation of our model, ensuring that it not only met but exceeded the expectations of precision and reliability. This dedication to fine-tuning, though arduous, stands as a testament to our unwavering commitment to delivering a cutting-edge and robust solution in the realm of avian species classification.
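The idea of reviewing "the images causing the most loss" can be illustrated with a small, library-free sketch. It assumes that predicted class probabilities and true labels are already available for each image; the function name top_losses and the sample numbers in the usage note are hypothetical. fastai offers its own tools for this kind of review (for example ClassificationInterpretation.plot_top_losses and ImageClassifierCleaner), though the article does not state which were used.

```python
import math
from typing import List, Sequence, Tuple


def top_losses(
    probs: Sequence[Sequence[float]],
    labels: Sequence[int],
    k: int,
) -> List[Tuple[int, float]]:
    """Return (image index, cross-entropy loss) for the k highest-loss images."""
    losses = []
    for index, (row, label) in enumerate(zip(probs, labels)):
        # Cross-entropy for one sample is -log of the probability given to the true class.
        # The clamp avoids log(0) when the model assigns zero probability.
        losses.append((index, -math.log(max(row[label], 1e-12))))
    losses.sort(key=lambda pair: pair[1], reverse=True)
    return losses[:k]
```

For example, with probabilities [[0.9, 0.1], [0.2, 0.8], [0.6, 0.4]] and labels [0, 0, 1], the second image has the highest loss because the model gave its true class only 0.2. Images at the top of this ranking are either mislabeled, irrelevant, or genuinely hard, which is why they are the most productive ones to inspect by hand.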

Benchmarking: Unveiling Model Performance

Training Details:

  • Batch Size: 16
  • Learning Rate: Not used
  • Model Freezing: No
  • Epochs: 5

Performance Metrics:

Model        Train Loss  Valid Loss  Error Rate  Accuracy
ResNet101    0.295220    0.618716    0.148737    0.851263
ResNet152    0.352030    0.439929    0.126900    0.873100
DenseNet201  0.262946    0.570337    0.145833    0.854167
VGG16        0.899876    0.768018    0.220257    0.779743

Insights:

  • Training Loss: DenseNet201 achieved the lowest training loss (0.262946), indicating an excellent fit to the training data. ResNet101 (0.295220) and ResNet152 (0.352030) demonstrated commendable performance in terms of training loss. VGG16 had the highest training loss (0.899876), suggesting potential challenges in fitting the training data.
  • Validation Loss: ResNet152 obtained the lowest validation loss (0.439929), showcasing strong generalization to unseen data. DenseNet201 (0.570337) and ResNet101 (0.618716) also performed reasonably well, although both sit well above their training losses. VGG16 had the highest validation loss (0.768018); because its training loss is higher still, this points to underfitting rather than overfitting.
  • Error Rate: ResNet152 had the lowest error rate (0.126900), demonstrating superior accuracy on the validation data. ResNet101 (0.148737) and DenseNet201 (0.145833) also exhibited low error rates, indicating strong predictive performance. VGG16, despite having a higher training and validation loss, still achieved a reasonable error rate of 0.220257.
  • Accuracy: ResNet152 achieved the highest accuracy (0.873100), indicating its effectiveness in correctly classifying data instances. DenseNet201 (0.854167) and ResNet101 (0.851263) also demonstrated high accuracy levels, though slightly lower than ResNet152. VGG16, despite its higher losses, still managed to achieve an accuracy of 0.779743.
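The comparisons above can be reproduced directly from the benchmark table. The sketch below uses only the reported numbers; the gap between validation and training loss is one simple, informal indicator of overfitting and is an addition for illustration, not a metric the project reports.

```python
# (train_loss, valid_loss, error_rate, accuracy) copied from the benchmark table.
RESULTS = {
    "ResNet101": (0.295220, 0.618716, 0.148737, 0.851263),
    "ResNet152": (0.352030, 0.439929, 0.126900, 0.873100),
    "DenseNet201": (0.262946, 0.570337, 0.145833, 0.854167),
    "VGG16": (0.899876, 0.768018, 0.220257, 0.779743),
}

# Error rate and accuracy are complements, so each pair should sum to 1.
for name, (_, _, error_rate, accuracy) in RESULTS.items():
    assert abs(error_rate + accuracy - 1.0) < 1e-6, name

# Validation loss minus training loss: a large positive gap hints at overfitting,
# a negative gap means the model fits the training data worse than the validation data.
gaps = {name: round(valid - train, 6) for name, (train, valid, _, _) in RESULTS.items()}

# Rank models from most to least accurate on the validation set.
ranking = sorted(RESULTS, key=lambda name: RESULTS[name][3], reverse=True)

for name in ranking:
    print(f"{name:12s} accuracy={RESULTS[name][3]:.6f} loss gap={gaps[name]:+.6f}")
```

Read this way, ResNet152 combines the highest accuracy with the smallest positive gap, ResNet101 and DenseNet201 show the widest gaps, and VGG16 is the only model whose gap is negative.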

Discussions

Overall Performance

Overall, ResNet152 emerges as the top-performing model in terms of accuracy, error rate, and validation loss. It shows a strong ability to generalize and make accurate predictions on unseen data. However, it's essential to consider other factors, such as model complexity, training time, and resource constraints, when selecting the most suitable model for a specific application.

ResNet101

ResNet101, while not the highest in accuracy, is a shallower network than ResNet152 and is therefore cheaper to train and deploy, so it may be favored when computational resources are limited. Its validation loss (0.618716) is roughly twice its training loss (0.295220), the widest gap among the four models, so overfitting should still be monitored.

DenseNet201

DenseNet201 also performs well, with the second-highest accuracy (0.854167) and the lowest training loss (0.262946), and can be a solid choice when sufficient computational resources are available.

VGG16

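VGG16, while providing reasonable results, is the weakest of the four: it has both the highest validation loss (0.768018) and the highest training loss (0.899876), which suggests it struggled to fit the data within 5 epochs rather than merely failing to generalize.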

Selected Model

Model Selection: ResNet101

Based on our benchmarking results, we chose ResNet101 as the classification model for this project. ResNet101 is a deep convolutional neural network known for its excellent performance in image classification tasks. The model was pre-trained on a large dataset and then fine-tuned on the bird image dataset to improve its accuracy in classifying bird orders.

Performance Metrics

As training progressed, the loss and error rate decreased while accuracy increased, indicating that the model was learning to classify the bird orders more accurately. The recorded results are:

Epoch  Train Loss  Valid Loss  Error Rate  Accuracy
0      0.381994    0.207986    0.060047    0.939953

Promising Performance

The model shows promising performance, with a low validation loss (0.207986), a low error rate (0.060047), and high accuracy (0.939953). It demonstrates its ability to accurately classify birds by order on the provided dataset.

Conclusion

Project Achievement

In the pursuit of harnessing the power of artificial intelligence for avian classification, our project has achieved a significant milestone. By combining the intricate art of ornithology with cutting-edge technology, we've developed an image classifier using the ResNet101 model, fine-tuned to classify birds by order on a diverse dataset of 12,600 images spanning 42 distinct orders.

Model Evaluation

Our benchmarking placed ResNet152 first on accuracy, error rate, and validation loss, with ResNet101 and DenseNet201 close behind. We selected ResNet101 as a balance between performance and computational cost, and after fine-tuning it reached an accuracy of 0.939953 with an error rate of 0.060047 on the validation data. The model's ability to learn and adapt, as evidenced by decreasing loss and increasing accuracy during training, underscores its effectiveness in classifying bird orders.

Future Directions

While we celebrate our current achievements, the journey doesn't end here. Future endeavors may involve expanding the dataset, collaborating with ornithologists for further refinement, and exploring advancements in neural network architectures. The application of AI in ornithology opens doors to new realms of understanding and research within the intricate world of avian biodiversity.

Impact and Utility

The culmination of our efforts is not merely a technical triumph but a tool with tangible implications for ornithological research. Our image classifier, driven by ResNet101, has the potential to streamline bird identification at the order level, contributing to conservation efforts and ecological studies, and fostering a deeper appreciation for the diverse tapestry of avian life.