Status: Completed
Tags: python ML OpenCV numpy flask librosa
Automating music classification with machine learning to make song selection quick and less cumbersome.
python
flask
librosa
ml
numpy
heroku
Introduction
To classify songs manually, one has to listen to many of them and then assign each a genre. This is both time-consuming and difficult. Automating music classification aims to make the selection of songs quick and less cumbersome.
Description
A music genre classifier is a software program that predicts the genre of a piece of music in audio format. Such classifiers are used for tasks such as automatically tagging music for distributors like Spotify and Billboard and choosing appropriate background music for events.
The ambiguity of genre classification makes machine intelligence well-suited to this task. Given enough audio data, which can be harvested in large amounts from online music, machine learning can pick up on these ill-defined patterns and use them to make predictions.
This project aims to build a proof-of-concept music genre classifier, using a deep learning approach, that can correctly predict the genre of a piece of Western music, together with a confidence level, from popular candidate genres (classical, jazz, rap, rock, etc.).
Tech Stack and Libraries
These data analysis and visualization libraries form the base of the exploratory data analysis, such as transforming the training, testing, and validation sets.
Dataset Used:
The dataset used was GTZAN.
The dataset is the most-used public dataset for evaluation in machine listening research for music genre recognition (MGR). The files were collected in 2000-2001 from various sources, including personal CDs, radio, and microphone recordings, to represent various recording conditions.
The Steps Involved:
1. Preprocessing:
2. First, we tried a CNN model on the Mel spectrogram data provided with the parent dataset.
a) It did not give good accuracy because of overfitting, as the dataset was too small to train a CNN.
b) Data augmentation was tried but did not work because the dimensions of the spectrograms created by the librosa library were incompatible.
3. We then trained the final model, described in detail below, on the CSV data provided with the dataset and achieved an accuracy of ~90%.
4. For prediction, we created a custom function (named getdataf) that creates an exact replica of the feature table provided with the dataset.
5. The predictions were then mapped to genres using the genre map dictionary we created, which holds the label encoding.
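Step 4 can be sketched roughly as follows. This is an illustration only, not the project's `getdataf`: the function names `frame_signal` and `feature_row`, the frame and hop sizes, and the choice of just two features (RMS energy and zero-crossing rate) are assumptions, and plain NumPy is used here in place of librosa to keep the sketch dependency-light. The column names are assumed to follow the `<feature>_mean` / `<feature>_var` pattern of the dataset's feature CSV.

```python
import numpy as np


def frame_signal(y, frame_length=2048, hop_length=512):
    """Split a 1-D signal into overlapping frames (one frame per row)."""
    if len(y) < frame_length:
        raise ValueError("signal is shorter than one frame")
    n_frames = 1 + (len(y) - frame_length) // hop_length
    # Row i holds samples [i * hop_length, i * hop_length + frame_length)
    idx = np.arange(frame_length)[None, :] + hop_length * np.arange(n_frames)[:, None]
    return y[idx]


def feature_row(y, frame_length=2048, hop_length=512):
    """Summarise frame-wise features as mean/variance columns (hypothetical helper)."""
    frames = frame_signal(y, frame_length, hop_length)
    # Root-mean-square energy of each frame
    rms = np.sqrt(np.mean(frames ** 2, axis=1))
    # Fraction of adjacent sample pairs in each frame whose sign differs
    signs = np.signbit(frames)
    zcr = np.mean(signs[:, 1:] != signs[:, :-1], axis=1)
    return {
        "rms_mean": float(rms.mean()),
        "rms_var": float(rms.var()),
        "zero_crossing_rate_mean": float(zcr.mean()),
        "zero_crossing_rate_var": float(zcr.var()),
    }


# Usage: a synthetic 3-second 440 Hz tone stands in for a recorded clip.
sr = 22050
t = np.arange(sr * 3) / sr
clip = 0.5 * np.sin(2 * np.pi * 440.0 * t)
row = feature_row(clip)
print(row)
```

Whatever features are used, the point of this step is that the row built at prediction time has the same columns, in the same order, as the table the model was trained on.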
ML MODEL BREAKDOWN
The dataset we used to train the model was the GTZAN dataset. The model is an ensemble of two independent classifiers:
A)
B)
Finally, using the Stacking Classifier library in Python, an ensemble of the SVC and XGBoost classifiers was fitted to the training set. The trained model produced a decent accuracy of around 85%; using the hyperparameter-tuned SVC classifier raised the accuracy and F1 scores on the validation set to about 90%.
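A stacked SVC + gradient-boosting ensemble of this kind can be sketched as below. Several things here are assumptions rather than the project's actual setup: the "Stacking Classifier" is taken to be scikit-learn's `StackingClassifier`, the hyperparameters are placeholders, synthetic data from `make_classification` stands in for the GTZAN feature CSV, and `GENRE_MAP` assumes the ten GTZAN genres were encoded in alphabetical order. If `xgboost` is not installed, the sketch falls back to scikit-learn's `GradientBoostingClassifier` as a stand-in.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier, StackingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

try:
    from xgboost import XGBClassifier
    booster = XGBClassifier(n_estimators=50, max_depth=3, random_state=0)
except ImportError:
    # Stand-in when xgboost is unavailable (not the project's classifier)
    booster = GradientBoostingClassifier(n_estimators=50, max_depth=3, random_state=0)

# Hypothetical label encoding: GTZAN genres in alphabetical order
GENRE_MAP = dict(enumerate([
    "blues", "classical", "country", "disco", "hiphop",
    "jazz", "metal", "pop", "reggae", "rock",
]))

# Synthetic stand-in for the feature table (10 classes, 20 numeric features)
X, y = make_classification(
    n_samples=600, n_features=20, n_informative=12, n_classes=10,
    n_clusters_per_class=1, class_sep=2.0, random_state=0,
)
X_train, X_val, y_train, y_val = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# Base learners feed their cross-validated predictions to a final estimator
stack = StackingClassifier(
    estimators=[
        ("svc", make_pipeline(StandardScaler(), SVC(C=10, probability=True, random_state=0))),
        ("boost", booster),
    ],
    final_estimator=LogisticRegression(max_iter=1000),
    cv=3,
)
stack.fit(X_train, y_train)

# Predict one sample and decode it to a genre name plus a confidence value
proba = stack.predict_proba(X_val[:1])[0]
label = int(stack.classes_[proba.argmax()])
genre, confidence = GENRE_MAP[label], float(proba.max())
val_accuracy = stack.score(X_val, y_val)
print(genre, round(confidence, 3), round(val_accuracy, 3))
```

The numbers printed here describe the synthetic data only and say nothing about the accuracy reported for the real model.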
WebApp details
In this project, we built our app using the Python web framework Flask.
→ The main page of the app provides a UI for uploading an audio file (.wav) and returns the predicted genre of that file.
→ The main page also links to a page where users can record their own audio for genre prediction by our model.
After recording, they can listen to the recording in the player provided. Clicking the Confirm button saves the audio as a static file. After waiting a few seconds, the user can click Predict to get the genre prediction.
We were able to get accurate predictions even on a short clip recorded by the user (any part of a song); the predicted genre depends on what that clip contains rather than on the genre of the whole song.
→ Apart from genre prediction, our web app recommends other popular songs of the predicted genre that the user might want to listen to.
The recommendations are fetched from the Spotify API in real time. As a further development, the app could let users search for and play music of their preferred genre within the web app itself.
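The upload-and-predict flow described above could look roughly like this in Flask. It is a minimal sketch, not the app's actual code: the `/predict` route, the `audio` form field, and the `predict_genre` stub are all hypothetical names, and the stub returns a fixed placeholder where the real app would call the trained model.

```python
import io

from flask import Flask, jsonify, request

app = Flask(__name__)


def predict_genre(wav_bytes):
    """Hypothetical stand-in for feature extraction plus the trained model."""
    return {"genre": "unknown", "confidence": 0.0}


@app.route("/predict", methods=["POST"])
def predict():
    upload = request.files.get("audio")
    # Reject requests with no file or with a non-.wav file
    if upload is None or upload.filename == "":
        return jsonify(error="no file uploaded"), 400
    if not upload.filename.lower().endswith(".wav"):
        return jsonify(error="only .wav files are accepted"), 400
    return jsonify(predict_genre(upload.read()))


# Usage: exercise the route with Flask's built-in test client.
client = app.test_client()
ok = client.post(
    "/predict",
    data={"audio": (io.BytesIO(b"RIFF0000WAVE"), "clip.wav")},
    content_type="multipart/form-data",
)
bad = client.post(
    "/predict",
    data={"audio": (io.BytesIO(b"hello"), "notes.txt")},
    content_type="multipart/form-data",
)
print(ok.status_code, ok.get_json(), bad.status_code)
```

Checking only the file extension is the simplest possible validation; a production app would also want to verify that the upload really is decodable audio.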
The Predictions and Recommendations page:
Source Code
Github: https://github.com/roboclub-mnnit/Music_Genre_classification-2022-23-Project
Video: https://www.youtube.com/watch?v=1MGkDt1iHKA
Research paper referred:
Resources
Official Python Documentation:
Librosa Module(audio analysis library)
https://librosa.org/doc/latest/index.html
GTZAN Dataset on Kaggle:
https://www.kaggle.com/datasets/andradaolteanu/gtzan-dataset-music-genre-classification
Audio Analysis Using Python Tutorial:
1. The Sound of AI official YouTube channel
https://www.youtube.com/@ValerioVelardoTheSoundofAI/playlists
2. https://www.youtube.com/playlist?list=PL-wATfeyAMNrtbkCNsLcpoAyBBRJZVlnf
Spotify API Documentation:
https://developer.spotify.com/documentation/web-api
Real-life Applications
The project has a wide range of real-life applications.
CONTRIBUTORS
Name | Branch | Reg. no.
Alok Kumar Singh | CSE | 20214240
Shreyansh Sinha | CHEM | 20218002
Siddhant Bhardwaj | MECH | 20213067
MENTORS