ICL-Project15-ELP


Meeting Minutes

Overview of KEY decisions taken:

27/04/2021

Agenda:

  1. Write a single-page A4 summary of the meeting with the client
  2. Technical questions we may need to ask Microsoft:
    • Azure ML Studio (no-code solution; can we convert to spectrograms using this, or feed in spectrogram images?) vs Azure Notebooks (using Keras)
    • Custom Vision Azure https://azure.microsoft.com/en-gb/services/cognitive-services/custom-vision-service/, https://www.customvision.ai/
    • Migrating the MS Teams dataset (from Peter) from Cornell to Azure storage
    • Storing clips in Azure cloud storage and passing them to Azure Video Indexer to obtain quick insights into a video
  3. Technical questions we have amongst ourselves
    • Splitting and labelling the dataset (pydub)
    • Spectrograms (use a Python library)
    • Using a convolutional neural network (CNN) and other computer vision techniques
  4. Check everyone is up to date on using Azure services, and has been added to the subscription

Minutes:

  1. Write a single page – done, checked, all good
  2. Set up a recurring meeting with Microsoft and the client at 5.30pm on Thursdays
  3. Spectrograms – use librosa for MFCCs; SciPy uses multiple Fourier transforms, but MFCC uses a different method (see the sketch after these minutes)
  4. Everyone can see the subscription
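
A minimal sketch of the librosa MFCC extraction mentioned in item 3, assuming a hypothetical clip gunshot_clip.wav and 13 coefficients:

```python
import librosa

# Load the clip (hypothetical file), resampling to librosa's 22.05 kHz default
signal, sr = librosa.load("gunshot_clip.wav", sr=22050)

# 13 MFCCs per frame; librosa builds the mel spectrogram internally
mfccs = librosa.feature.mfcc(y=signal, sr=sr, n_mfcc=13)
print(mfccs.shape)  # (13, n_frames)
```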

Questions for Microsoft:

28/04/2021

Agenda

  1. Set up a meeting with the IC supervisor for next week - signal processing for machine learning research
  2. Any useful resources people have found
  3. Benefits of RNN and MFCC

Minutes

  1. Spectrograms are visual representations of the frequency content and amplitude of sound over a specified span of time, constructed from a series of Fourier transforms over short windows.
  2. Use librosa, an open-source Python library known for its convenience and versatility in audio analysis and manipulation. librosa lets us pass our samples to a function that computes the Fourier transforms needed to compose a valid spectrogram (see the sketch after this list).
  3. Spectrogram vs Mel-spectrogram vs MFCC – we can ask the IC supervisor's opinion. https://medium.com/prathena/the-dummys-guide-to-mfcc-aceab2450fd
  4. The advantage of MFCC is that it reduces error well and produces robust features when the signal is not affected by noise. An SVD/PCA technique is used to extract the important features from the B-distribution representation.
  5. The reason to choose an RNN is that it takes time into account; we can test it out.
  6. The toughest part is processing and splitting the data and forming a good dataset.
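
A sketch of item 2, assuming a hypothetical clip sample.wav: librosa's STFT gives the per-window Fourier transforms, and converting the magnitudes to decibels yields a spectrogram for display.

```python
import librosa
import librosa.display
import matplotlib.pyplot as plt
import numpy as np

signal, sr = librosa.load("sample.wav")               # hypothetical clip

# Short-time Fourier transform: one Fourier transform per window
stft = librosa.stft(signal, n_fft=2048, hop_length=512)

# Magnitude in decibels, relative to the loudest bin
spectrogram_db = librosa.amplitude_to_db(np.abs(stft), ref=np.max)

librosa.display.specshow(spectrogram_db, sr=sr, hop_length=512,
                         x_axis="time", y_axis="log")
plt.colorbar(format="%+2.0f dB")
plt.show()
```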

29/04/2021

Agenda

  1. Discussion
  2. Meet with Microsoft

Minutes

  1. Custom Vision – a tool to create custom computer vision models; very simple, and only for images
  2. Notebook vs running a project file – the best way to run a Python project other than using a notebook: https://docs.microsoft.com/en-us/azure/machine-learning/tutorial-1st-experiment-sdk-train
  3. Training works well on Azure ML Studio, with deployment on a container
  4. Hosting only supports GPU and needs Kubernetes (AKS) to host it; instead we can publish it on container instances, which are good for training but not good for hosting
  5. https://docs.microsoft.com/en-us/azure/machine-learning/how-to-train-keras
  6. Make sure we shut down the instances, otherwise it will cost more
  7. Batch and real-time inference (real time gives a response straight away; with batch you wait until you have a certain amount of values). Batch can be done with ML compute and can use a GPU. Good for us because the data arrives every 4 months as a batch (using AKS).
  8. Data migration from Cornell to Azure:
    • https://docs.microsoft.com/en-us/azure/storage/files/storage-how-to-create-file-share?tabs=azure-portal
    • https://docs.microsoft.com/en-us/azure/storage/common/storage-use-azcopy-blobs-upload?toc=/azure/storage/blobs/toc.json

30/04/2021

Agenda

  1. Look at the data and how we will label it

Minutes

  1. Met up, set up the notebook to see how it works, and uploaded a dataset to test it out
  2. MFCC is not very good with noisy signals but we will test it out; all of us to watch this series: https://www.youtube.com/playlist?list=PLhA3b2k8R3t2Ng1WW_7MiXeh1pfQJQi_P

01/05/2021

Agenda:

  1. Look through code Yiming made for MFCC - periodogram
  2. Compare against spectrogram

Minutes:

  1. Extract key data from the signals and put it into a JSON file, mapping the y-axis onto the x-axis and the energy onto the y-axis.
  2. We think we can only generate a periodogram from the time-series signal; once you move to MFCC you cannot really generate a periodogram, since you have a 3D graph of time, frequency, and amplitude.
  3. Can use scipy.signal.periodogram if we want a periodogram representation (see the sketch after this list).
  4. Kapre does the same thing as our current MFCC extractor, so all good
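
A small sketch of item 3, using a synthetic tone in place of real audio (the sample rate and frequency are arbitrary choices):

```python
import numpy as np
from scipy.signal import periodogram

sr = 8000                                   # arbitrary sample rate
t = np.arange(0, 1.0, 1.0 / sr)
signal = np.sin(2 * np.pi * 440 * t)        # toy 440 Hz tone

# Power spectral density estimate of the time-series signal
freqs, psd = periodogram(signal, fs=sr)
print(freqs[np.argmax(psd)])                # ~440.0, the dominant frequency
```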

4/05/2021

Agenda:

  1. Discuss the feature engineering and the features we can extract from our data; can we use the non-cepstrum values (i.e. the graph), normalise them, and then use a CNN?
  2. Look at the MLP model
  3. Discuss how to prevent false positives

Minutes:

  1. We have got new data with different types of gunshots; we can use it for classification and see how well MFCC-based classification performs
  2. Figuring out how to import blob storage into Azure ML Studio

6/05/2021

Agenda:

  1. Look at the classifier we made for the different gun types
  2. Look at the CNN we made for gunshot detection. We made two: one uses the raw values, but that means every audio clip has to be the same length; the other generates PNG images, then reimports them and converts them to arrays of a fixed size.
  3. Decide between spectrograms and cepstrums; can we extract any other features?

Minutes:

  1. Name, sound file, data length, and label in a CSV file
  2. Main task is data normalisation – we will try two methods: the PNG method and data normalisation of the MFCCs
  3. We want to split the large 24-hour audio clips into smaller clips (see the splitting sketch after this list), then pass the data into the model and see what comes up as false positives; when we find this data we save the filename so we can use it for training
  4. Use mounting to access each file through a network drive: https://docs.microsoft.com/en-us/azure/machine-learning/how-to-train-with-datasets#mount-vs-download
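
A minimal sketch of the splitting in item 3 using pydub (mentioned back in the 27/04 agenda); the file name is hypothetical and the 12-second clip length follows the later minutes:

```python
from pydub import AudioSegment

recording = AudioSegment.from_wav("full_day_recording.wav")  # hypothetical file
clip_ms = 12 * 1000                                          # 12-second clips

# pydub slices by milliseconds; export each window as its own file
for i, start in enumerate(range(0, len(recording), clip_ms)):
    clip = recording[start:start + clip_ms]
    clip.export(f"clip_{i:05d}.wav", format="wav")
```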

7/05/2021

Agenda:

  1. What is everyone doing and what to do next?

Minutes:

What did we accomplish last week?

08/05/2021

Agenda:

  1. Brainstorming

Minutes:

  1. Produced brainstorm of ideas

11/05/2021

Agenda

  1. Discuss developments of the tasks set on 07/05/2021
  2. Show the CNN model using spectrograms; suggest it as a demo
  3. Look through the leaflet/flyer examples together

Minutes

  1. Normalising the data will be done by sound clip length
  2. Meet the IC supervisor on Wednesday at 10am
  3. https://www.youtube.com/playlist?list=PL-wATfeyAMNrtbkCNsLcpoAyBBRJZVlnf 10-15 – look at pre-processing; save the MFCCs as a JSON file (see the sketch after this list)
  4. Dynamic time warping as an idea
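
A sketch of the JSON step in item 3, assuming a hypothetical list of labelled clips (1 = gunshot, 0 = background):

```python
import json
import librosa

data = {"mfcc": [], "labels": []}

# Hypothetical file names and labels
for path, label in [("gunshot_01.wav", 1), ("background_01.wav", 0)]:
    signal, sr = librosa.load(path)
    mfcc = librosa.feature.mfcc(y=signal, sr=sr, n_mfcc=13)
    data["mfcc"].append(mfcc.T.tolist())   # JSON needs plain lists, not arrays
    data["labels"].append(label)

with open("dataset.json", "w") as f:
    json.dump(data, f)
```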

12/05/2021

Agenda:

  1. Meet with IC supervisor, discuss different methods of audio signal processing, MFCC, Spectrograms etc

Minutes:

  1. The cloud allows users to access this capability easily, broadening the impact so more people around the world can use it, including people who do not have the infrastructure to support all these tasks; tune it towards this end (developing countries can access it more easily)
  2. Pipeline of building blocks, keeping costs down; an efficient and economical solution for users
  3. Understand the data we have and the goal we want to achieve
  4. Pre-process the data: cut the sound files into smaller clips, then use MFCC feature extraction to train a network on the features. Currently we are using a CNN (a deep learning method) and storing the MFCCs in a JSON file; we could use a decision tree or an SVM, which would reduce compute power and costs. We could also use an RNN or LSTM (see the CNN sketch after this list).
  5. Find a suitable network and suitable weightings for the models; we can use transfer learning on our existing model
  6. Ask Peter (the client) if padding the signal is OK (Peter has a good background in signals)
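
A sketch of the kind of small Keras CNN over MFCC inputs described in item 4; the input shape (130 frames x 13 coefficients) and layer sizes are assumptions, not the project's actual architecture:

```python
from tensorflow import keras

model = keras.Sequential([
    keras.layers.Input(shape=(130, 13, 1)),        # frames x MFCCs x 1 channel
    keras.layers.Conv2D(32, (3, 3), activation="relu"),
    keras.layers.MaxPooling2D((2, 2)),
    keras.layers.Conv2D(64, (3, 3), activation="relu"),
    keras.layers.MaxPooling2D((2, 2)),
    keras.layers.Flatten(),
    keras.layers.Dense(64, activation="relu"),
    keras.layers.Dropout(0.3),
    keras.layers.Dense(1, activation="sigmoid"),   # gunshot vs not gunshot
])
model.compile(optimizer="adam", loss="binary_crossentropy",
              metrics=["accuracy"])
model.summary()
```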

13/05/2021

Agenda:

  1. Team discussion
  2. Ask Peter if padding the signal is OK for spectrograms and MFCCs
  3. Ask about a recommended frame length
  4. Talk about the increase in FNs and decrease in FPs, causing higher precision but lower recall (may be due to an unbalanced dataset)

Minutes:

  1. Need to find an even split of our data, and generate some more data using data augmentation
  2. Demo’d the CNN and recorded it for Peter to watch
  3. We can generate data using time stretching; pitch changing and amplitude changing are not as effective (see the sketch after this list)
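
A sketch of the time-stretching augmentation in item 3; file names and stretch rates are hypothetical:

```python
import librosa
import soundfile as sf

signal, sr = librosa.load("gunshot_01.wav")            # hypothetical clip

# Slightly slower and slightly faster copies of the same gunshot
for rate in (0.9, 1.1):
    stretched = librosa.effects.time_stretch(signal, rate=rate)
    sf.write(f"gunshot_01_x{rate}.wav", stretched, sr)
```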

14/05/2021

Agenda:

  1. Discuss JSON generation
  2. Split up tasks, and talk about how completing a good training dataset will let us start testing models
  3. Talk about good methods of data augmentation
  4. Meet with Peter

Minutes:

  1. Augmentation: time shifting
  2. Pad the ecoguns and pnnn guns to 12 seconds, padding on either side; could use white noise or rainforest sounds (see the padding sketch after this list)
  3. Write a script to find FPs in the meantime (Peter has now provided FP samples)
  4. Overlap the 12s clips of sound for detection
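
A sketch of the padding in item 2 using pydub; silence stands in here for the white noise or rainforest background mentioned above, and the file names are hypothetical:

```python
from pydub import AudioSegment

clip = AudioSegment.from_wav("ecoguns_01.wav")   # hypothetical short clip
target_ms = 12 * 1000
pad_ms = max(0, target_ms - len(clip))

# Split the padding evenly on either side of the clip
left = AudioSegment.silent(duration=pad_ms // 2, frame_rate=clip.frame_rate)
right = AudioSegment.silent(duration=pad_ms - pad_ms // 2,
                            frame_rate=clip.frame_rate)

(left + clip + right).export("ecoguns_01_12s.wav", format="wav")
```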

17/05/2021

Agenda:

  1. Explain the false positive data
  2. Discuss the number of cepstrums in the MFCC we want; possibly generate identical training sets with different numbers of cepstrums

Minutes:

  1. Added batch normalisation; may normalise when generating
  2. Added LSTM testing (see the sketch after this list)
  3. Overlapping the data to see if it adds some time shifting augmentation
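
A sketch of the LSTM variant in item 2, treating the MFCC frames as a sequence; the input shape and layer sizes are assumptions:

```python
from tensorflow import keras

model = keras.Sequential([
    keras.layers.Input(shape=(130, 13)),      # frames x MFCC coefficients
    keras.layers.BatchNormalization(),        # batch normalisation, per item 1
    keras.layers.LSTM(64, return_sequences=True),
    keras.layers.LSTM(64),
    keras.layers.Dense(64, activation="relu"),
    keras.layers.Dense(1, activation="sigmoid"),
])
model.compile(optimizer="adam", loss="binary_crossentropy",
              metrics=["accuracy"])
```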

18/05/2021

Agenda:

  1. Work on training the model

Minutes:

  1. Worked out how to generate the test set using JSON files

20/05/2021

Agenda:

  1. Meet with Microsoft

Minutes

What did you accomplish last week?

  1. Prepared dataset
  2. Tested out different models (CNN, LSTM/RNN)
  3. Analysis of the dataset (the distributions of test and train data seemed to differ)
  4. RNN

    What will you accomplish this week?

  5. Testing different representations of our data (number of MFCC coefficients) and seeing what data works well on our models
  6. Standardising our code into .py rather than .ipynb (create functions)
  7. Continue training and testing models

Is anything blocking you?

  1. How would Peter prefer the system to run: upload all the data to the cloud and then run it through the ML model, OR process the data locally and then use the cloud to determine gunshot or not (full cloud solution vs hybrid solution)?

21/05/2021

Agenda

  1. Updates

Minutes:

  1. Want to change the data representation because we are plateauing at 0.9 precision, which will still throw a lot of false positives; therefore we are doing a three-way split: background noise (0), hard negative (1), gunshot (2)
  2. For tomorrow: generate the dataset

25/05/2021

Agenda

  1. Re-evaluate the MFCC method
  2. Talk about the distribution difference between test and train data; MFCC is extremely affected by noise
  3. Other options?

Minutes

  1. Decided to take an object detection approach due to the noisy dataset; we can find the gunshot ‘shape’
  2. Tested MFCC when merging train and test data and shuffling; it still trained to the background noises!
  3. Found the Azure data labelling service; also testing out Azure Custom Vision

26/05/2021

Agenda:

  1. Meet and label the data on Azure Data Labelling service

Minutes:

What did you accomplish last week?

  1. Tested different representations to see what is best, found that MFCC is not picking up the gunshots properly
  2. Testing object detection techniques using Azure Custom Vision
  3. Drew the bounding boxes for the gunshots

    What will you accomplish this week?

  4. Pre-process data, train YOLOv3, and see its results
  5. Test out more Azure Custom Vision methods
  6. Continue standardising code

01/06/2021

Agenda:

  1. Team discussion
  2. Discuss different object detection architectures we are testing

Minutes:

  1. Want to export the gunshot locations using the bbox ratio (see the sketch after this list)
  2. Spoke about the Custom Vision model and the Faster-RCNN model; Faster-RCNN is slower but better
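
A sketch of the bbox-ratio export in item 1: converting a normalised box position on the spectrogram back to a time offset in the clip (the 12-second clip length and the example box values are assumptions):

```python
def bbox_to_seconds(box_left, box_width, clip_seconds=12.0):
    """box_left and box_width are fractions of the image width in [0, 1]."""
    start = box_left * clip_seconds
    end = (box_left + box_width) * clip_seconds
    return start, end

print(bbox_to_seconds(0.25, 0.10))  # (3.0, 4.2): gunshot at 3.0-4.2 s
```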

03/06/2021

What did you accomplish last week?

  1. Tested and trained our custom vision models (iteration 6, high precision, OK recall)
  2. Labelled lots of data
  3. Built a Faster-RCNN network that works much better: it can detect the difference between an AK-47/automatic rifle and shotguns, and it also knows when something is a mangabey (monkey) alarm call or a raindrop.

What will you accomplish this week?

  1. Continue testing out the Faster-RCNN model by adding more labels and training data
  2. Testing the F-RCNN model on 24 hour sound clips to see real-time performance

Is anything blocking you?

09/06/2021

Minutes

  1. Met with Dmitry (Microsoft), who told us the use of MFCC could be key, with a smaller sliding window; however, our dataset would require manual labelling, which is not possible in the time available.
  2. Another solution we proposed: we can check the bbox coordinates, and we know that the lower y-value of the bbox will not be above a certain value; if it is, we don’t consider it a gunshot (a high-frequency event will most likely not be a gunshot – they mostly all start from the bottom of the spectrogram). See the sketch after this list.
  3. Our ML solution actually generates more data that can be used in training, and we also generate the bbox with it; this can be used for a more accurate dataset and, in the future, to train a 2s sliding-window model using MFCC
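
A sketch of the frequency filter in item 2, assuming normalised boxes with y measured from the top of the image; the 0.5 threshold is a placeholder, not a tuned value:

```python
def plausible_gunshot(box_top, box_height, min_bottom_y=0.5):
    """Reject a detection whose box sits entirely in the high-frequency band."""
    box_bottom = box_top + box_height
    return box_bottom >= min_bottom_y   # gunshots reach the bottom of the image

print(plausible_gunshot(0.05, 0.20))    # False: purely high-frequency event
print(plausible_gunshot(0.05, 0.90))    # True: extends to low frequencies
```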

10/06/2021

Agenda

  1. Meet with Microsoft
  2. Mention the solution of running our model after the current detector; we will then have all the amplitude-spike locations, and our model can act as a filter.

Minutes

  1. The first pass could be current detector, and second pass could be our model.
  2. Found that our model doesn’t detect every single gunshot when there are multiple gunshots going off within 0.5 seconds of each other, but it will get at least one – usually 33% of the closely packed gunshots from a 12s selection – so they are not missed, as the gunshots are essentially overlapping.
  3. Our model can detect gunshots which are far away and do not have as big of an amplitude spike which might be missed by first pass
  4. Discussed how we train our model against typical false positives such as raindrops, electrical ticks, thunder, and monkey alarm calls.

What did you accomplish last week?

  1. Tested our models – YOLOv4, Faster-RCNN, and Custom Vision – and compared them against Peter’s detector
  2. Labelled lots of data
  3. Trained the models against false positives we found it getting

What will you accomplish this week?

  1. Finish off training the models and continue testing
  2. Clean up code and file structure ready for deployment
  3. Prepare for the demonstration of our model

Important development

  1. The current detector can filter all the 24 hour sound clips and give us the important sound events.
  2. The second pass can be done by taking the key sound-event selections and running them through our detector. Essentially, instead of a human checking each one individually after the first pass, our detector filters the list down even more, and then a human can check again manually (see the sketch below).
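
A sketch of the two-pass idea, where `first_pass_events` and `model.predict_clip` are hypothetical stand-ins for the current detector's output and our model:

```python
def second_pass(first_pass_events, model, threshold=0.5):
    """Keep only the candidate clips our model scores as likely gunshots."""
    survivors = []
    for clip in first_pass_events:
        if model.predict_clip(clip) >= threshold:
            survivors.append(clip)
    return survivors  # a much shorter list for a human to review manually
```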