AI Powered Classroom Occupation Meter
by Atvars in Circuits > Raspberry Pi
Hello! I am a first-year Creative Tech and AI student at Howest, and for Project One I had to think of something that uses AI.
I came up with the idea of creating a Classroom Occupation Meter. Basically, it's an app that records how many people go in and out of a room by tracking their heads and checking whether they cross a horizontal line that the user draws across the doorway.
The video feed comes from a webcam connected to a laptop, and the AI model makes predictions on each frame of the video. After the user stops recording, the number of people going in and out, the timestamps of each measurement, and the line coordinates are sent to a Raspberry Pi (RPi) connected by an Ethernet cable. On the RPi, I created an SQLite database that stores this information, along with info about various classrooms in my college.
All of the code is available at my GitHub repository: https://github.com/howest-mct/2023-2024-projectone-ctai-ApinisAtvars
Here are the datasets that I made to train my model on (not all of the pictures in them were taken by me):
Supplies
Hardware:
For the project itself:
- Raspberry Pi 5
- GJD 1602IIC LCD Display
- Laptop (Or any computer to train the model)
- 4x wires to connect the LCD display
- Webcam
Software:
- VS Code
- PuTTY
- VcXsrv
(OPTIONAL) Find and Preprocess the Training Data
1. Find training data for the model.
I used both Kaggle and Roboflow to find the data to train my model on, though Roboflow has a lot more datasets for computer vision.
I found that by searching "class: head", a lot more datasets that already have heads labelled show up.
2. Gather your own data.
The conditions your model will operate in will likely differ from anyone else's, which is why it is really important to gather your own data.
Think about where you will put your camera, and take photos of different-looking people at various times of day and in different lighting conditions.
My model got confused by hands, so consider taking videos where both your head and your hands are in the frame.
The most important thing, though, is to get permission if you take pictures or videos in public, as in many places not doing so is illegal.
3. Label the data.
This step is usually needed regardless of whether you use pre-existing data or your own, as many datasets contain pictures of people but only annotate their whole bodies.
That, of course, will not do, as my project specifically needs heads to be tracked, so you need to relabel the data.
For all of my labelling needs, I used Roboflow. You can create a free account and build a dataset of up to 10,000 images.
Roboflow also lets you add various pre-processing and image augmentation steps to increase the size of your dataset and your model's robustness.
4. Export the dataset.
When you are happy with the data you have labelled, export the dataset in the YOLOv8 format. When given the choice, select "show download code" and paste the snippet into a Jupyter notebook.
Running the code they provide downloads your dataset automatically into a sibling folder of the one your notebook file is in.
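For reference, the snippet Roboflow generates looks roughly like this (the API key, workspace, project name and version number below are placeholders that Roboflow fills in for you):

```python
from roboflow import Roboflow

# placeholders: the generated snippet contains your real values
rf = Roboflow(api_key="YOUR_API_KEY")
project = rf.workspace("your-workspace").project("your-project")
dataset = project.version(1).download("yolov8")  # downloads into a sibling folder
```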
(OPTIONAL) Train Your Model
The code for training a model is in my repository, in the Training_model folder, in the file first_model.ipynb.
1. Select the right model for you.
Ultralytics provides you with many types of YOLOv8 models.
For starters, there are pretrained models - models that have already been trained by the Ultralytics team, and there are untrained models - models that haven't been trained on anything.
On top of that, YOLOv8 comes in several sizes: Nano, Small, Medium, Large and Extra Large. The difference between them is the number of parameters, so larger models can recognize more patterns (if there are any), but are also slower to train and make predictions with.
I have an RTX 4050 Laptop GPU, and the largest model I could train was YOLOv8m (medium). It took around one and a half days to train for 36 epochs, and I didn't see any further increase in accuracy.
2. Train the model.
If you have a powerful GPU that is compatible with CUDA, I strongly suggest training your model with it. Otherwise, you can train on your CPU; in the end, this only changes the time needed for training.
When I started training my first models, I just estimated the number of epochs I could complete in one night, so when I woke up the model was either still training or had already been done for a couple of hours. As I found out (of course) only after training my final model, you can instead specify the number of hours to train for, so that no time is wasted. All of this is to say that the YOLO.train() function has many useful parameters that I highly recommend you look into.
The necessary information is available at this link: https://docs.ultralytics.com/modes/train/#resuming-interrupted-trainings under the sub-header "Train Settings".
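As a rough sketch (the dataset path and parameter values here are assumptions, not my exact settings), training with a time limit instead of a fixed epoch count looks like this:

```python
from ultralytics import YOLO

# start from the pretrained medium model
model = YOLO("yolov8m.pt")

# data.yaml is the file exported by Roboflow; device=0 selects the first
# CUDA GPU (use device="cpu" if you have none); in recent Ultralytics
# versions, time caps training at the given number of hours and overrides epochs
model.train(data="path/to/data.yaml", epochs=36, imgsz=640, device=0, time=8)
```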
(OPTIONAL) Update My Code With Your Model
If you have trained your own model, you need to update my code.
At line 19, where the path to the old model is, replace it with the absolute path to your model's best weights.
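For illustration (the path below is hypothetical), loading your own weights instead of mine looks like this:

```python
from ultralytics import YOLO

# hypothetical path; point this at your own training run's best.pt
model = YOLO("C:/Users/you/runs/detect/train/weights/best.pt")
```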
Install VcXsrv and PuTTY
What is X11 forwarding?
X11 forwarding is a feature of the SSH protocol that lets you interact with a remote machine's applications through their graphical user interfaces.
Why you (might) need it
When you start the Raspberry Pi code, you are greeted by a graphical user interface that I made using CustomTkinter.
The UI is a menu where you select the classroom to which the measurements will be assigned. Since this user interface runs at the start of the code, nothing else happens until the user exits it, and if you're not using X11 forwarding, you'll be stuck looking at an empty terminal.
As mentioned in the title, you only might need it. The user interface is actually displayed on the Raspberry Pi's desktop, so if you have a remote desktop viewing tool (like RealVNC Viewer), you can just connect to your RPi through that and everything will work just fine.
So why do this step? It means a little less work later, as the GUI will open automatically on your Windows desktop.
How to install VcXsrv and PuTTY
Please follow this tutorial: https://linuxhint.com/linux_graphical_windows_x11_forwarding/
Notes
Remember that each time you restart your computer, you will need to start the X11 server again (run xlaunch.exe).
When you enable X11 forwarding in PuTTY, save the settings before opening the connection; otherwise you will need to re-enable it every time you open PuTTY.
Create a HeadTracker Class
While we already have a model that can detect heads, it has no idea whether the head detected in the last frame is the same head in this frame. Also, because we will have a counter line that needs to be crossed to change the number of people in the room, we need to draw a centroid: a point in the middle of the head.
The HeadTracker class does precisely this. In broad strokes, it calculates the centroid of each bounding box the model predicts, computes the Euclidean distance between the new points and the previously calculated ones, and then decides whether the new points are actually new heads or already-recognised heads that have moved since the previous frame.
In order to implement this algorithm, I followed this tutorial: https://pyimagesearch.com/2018/07/23/simple-object-tracking-with-opencv/
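To give you an idea, here is a bare-bones sketch of the centroid-matching idea from that tutorial. My real class in the repository is more complete (it also handles heads that disappear for a few frames), and the max_distance threshold here is an assumption:

```python
import numpy as np
from scipy.spatial import distance as dist

class SimpleHeadTracker:
    """Bare-bones centroid tracker in the spirit of the PyImageSearch tutorial."""

    def __init__(self, max_distance=50):
        self.next_id = 0
        self.heads = {}  # head ID -> (x, y) centroid
        self.max_distance = max_distance

    def update(self, boxes):
        # boxes: list of (x1, y1, x2, y2) bounding boxes from the model
        new = np.array([((x1 + x2) // 2, (y1 + y2) // 2)
                        for (x1, y1, x2, y2) in boxes])
        if len(self.heads) == 0:
            for c in new:  # first frame: every centroid is a new head
                self.heads[self.next_id] = tuple(c)
                self.next_id += 1
            return self.heads
        ids = list(self.heads.keys())
        old = np.array([self.heads[i] for i in ids])
        if len(new) > 0:
            D = dist.cdist(old, new)  # pairwise Euclidean distances
            used = set()
            # greedily match each existing head to its nearest new centroid
            for row in D.min(axis=1).argsort():
                col = int(D[row].argmin())
                if col in used or D[row, col] > self.max_distance:
                    continue  # too far away: probably a different head
                self.heads[ids[row]] = tuple(new[col])
                used.add(col)
            # unmatched centroids are treated as brand-new heads
            for col in set(range(len(new))) - used:
                self.heads[self.next_id] = tuple(new[col])
                self.next_id += 1
        return self.heads
```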
Note
The code for my HeadTracker class is also available in the file AI/models/head_tracker.py
Implement Kalman Filter
When I was testing my project while coding, I usually just drew the counter line in the middle of the screen and bobbed my head up and down. In that scenario it worked perfectly; however, when I started testing the model in real-life scenarios, with people walking in and out of the classroom, it only worked if a person walked as slowly as they possibly could. In other words, it didn't work. While searching for a solution, I came across an algorithm called the Kalman Filter.
What is a Kalman Filter?
"The Kalman Filter algorithm is a powerful tool for estimating and predicting system states in the presence of uncertainty and is widely used as a fundamental component in applications such as target tracking, navigation, and control."
(Quoted from: Alex Becker (www.kalmanfilter.net), “Online kalman filter tutorial.” https://www.kalmanfilter.net/default.aspx)
How to implement Kalman Filtering?
I gave ChatGPT my code and asked it to implement this; it created a class for the filter and modified my app.py code so that the Kalman Filter is used to predict my centroid coordinates.
The end result is that my points move a lot more smoothly; however, I did run into a problem (more on that in the next step).
Note
The code for this is both in the KalmanFilter.py file in AI/models folder and the track_heads function in app.py in the AI folder.
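If you want to see the core idea, here is a minimal constant-velocity Kalman filter for a single centroid using OpenCV's built-in cv2.KalmanFilter. The noise values are assumptions to tune for your own camera, and the actual class in AI/models/KalmanFilter.py is structured differently:

```python
import cv2
import numpy as np

# state: (x, y, vx, vy); measurement: (x, y)
kf = cv2.KalmanFilter(4, 2)
kf.transitionMatrix = np.array([[1, 0, 1, 0],
                                [0, 1, 0, 1],
                                [0, 0, 1, 0],
                                [0, 0, 0, 1]], np.float32)
kf.measurementMatrix = np.array([[1, 0, 0, 0],
                                 [0, 1, 0, 0]], np.float32)
# noise covariances picked by feel; tune them for your setup
kf.processNoiseCov = np.eye(4, dtype=np.float32) * 1e-2
kf.measurementNoiseCov = np.eye(2, dtype=np.float32) * 1e-1
kf.errorCovPost = np.eye(4, dtype=np.float32)

def smooth(cx, cy):
    """Predict the next position, then correct it with the measured centroid."""
    kf.predict()
    est = kf.correct(np.array([[np.float32(cx)], [np.float32(cy)]]))
    return int(est[0, 0]), int(est[1, 0])
```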
Capture Webcam Feed and Overlay Bounding Boxes, Centroids and Counter Line
For capturing footage and overlays, I used the cv2 library in Python.
Capturing video
You can capture your webcam footage with cv2.VideoCapture(0) (the 0 selects the first connected video device, so you may need to change it if you use multiple webcams).
Additionally, you can resize the frame; I set the resolution to 1920x1080.
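Put together, the capture loop looks roughly like this (a sketch; the window name and quit key are my choices here, not necessarily what app.py uses):

```python
import cv2

cap = cv2.VideoCapture(0)  # 0 = first video device; change if you have several
cap.set(cv2.CAP_PROP_FRAME_WIDTH, 1920)
cap.set(cv2.CAP_PROP_FRAME_HEIGHT, 1080)

while True:
    ok, frame = cap.read()
    if not ok:
        break
    # ...run the model and draw the overlays on `frame` here...
    cv2.imshow("feed", frame)
    if cv2.waitKey(1) & 0xFF == ord("q"):  # press q to stop recording
        break

cap.release()
cv2.destroyAllWindows()
```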
Overlaying bounding boxes
Drawing the bounding boxes is relatively simple. The first picture shows my function for drawing them. Note that it draws a single box, so the function needs to be called in a loop that iterates over all bounding boxes.
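If you can't see the picture, a minimal equivalent of such a function looks like this (the colour and label text are my assumptions):

```python
import cv2

def draw_bounding_box(frame, x1, y1, x2, y2, confidence):
    """Draw ONE bounding box; call this in a loop over all detections."""
    cv2.rectangle(frame, (int(x1), int(y1)), (int(x2), int(y2)), (0, 255, 0), 2)
    cv2.putText(frame, f"head {confidence:.2f}", (int(x1), int(y1) - 5),
                cv2.FONT_HERSHEY_SIMPLEX, 0.5, (0, 255, 0), 1)
```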
Drawing counter line
This also wasn't that hard. I chose to let the user draw the counter line themselves with mouse clicks.
For finding out how callback functions work in cv2, I used this tutorial: https://docs.opencv.org/4.x/db/d5b/tutorial_py_mouse_handling.html
I just made a simple function that draws a line between the two points the user left-clicks on; it's shown in the second picture of this step. Remember, though, that this function needs to be attached to your cv2 window using the cv2.setMouseCallback function.
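A minimal version of that idea (the window name is a placeholder) could look like this:

```python
import cv2

line_points = []  # will hold the two endpoints of the counter line

def set_line_point(event, x, y, flags, param):
    # collect the first two left clicks as the line's endpoints
    if event == cv2.EVENT_LBUTTONDOWN and len(line_points) < 2:
        line_points.append((x, y))

cv2.namedWindow("feed")
cv2.setMouseCallback("feed", set_line_point)

# then, inside the main loop:
# if len(line_points) == 2:
#     cv2.line(frame, line_points[0], line_points[1], (0, 0, 255), 2)
```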
Overlaying centroids
As I mentioned in the previous step, implementing the Kalman Filter introduced a new problem.
When a person entered the frame from below, the centroid was initialised at (0, 0) (the top-left corner), then jumped to the head, crossing the counter line and incrementing the number of people by 1, and only then followed the head. So if someone walked out of the classroom, the number of people wouldn't change. I fiddled for a long time with the arguments of the various functions ChatGPT gave me, but to no avail.
The solution I found was to create a new variable: a dictionary called on_screen_for. The keys are centroid IDs, and the values are the number of frames each centroid has been on screen. Every frame, a loop over all centroids checks whether each one is in the dictionary; if so, it increments its value by 1, and if not, it adds it with a value of 0 and immediately increments it. A centroid is only displayed, and only able to change the number of people, once it has been recognised for more than 5 frames.
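Inside the main loop, the idea looks roughly like this (tracked_heads is the dictionary returned by the tracker; draw_centroid and check_line_crossing are hypothetical helpers standing in for my actual overlay and counting logic):

```python
on_screen_for = {}  # centroid ID -> number of frames it has been tracked
MIN_FRAMES = 5      # a centroid must survive this many frames before it counts

for head_id, centroid in tracked_heads.items():
    on_screen_for.setdefault(head_id, 0)  # insert with 0 on first sight
    on_screen_for[head_id] += 1           # then immediately increment
    if on_screen_for[head_id] > MIN_FRAMES:
        draw_centroid(frame, centroid)           # hypothetical helper
        check_line_crossing(head_id, centroid)   # hypothetical helper
```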
Sending the Number of People to Your RPi
The number of people is sent using Python's socket library.
I HIGHLY advise you to watch a video (like this one: https://www.youtube.com/watch?v=Lbfe3-v7yE0&t=2s) to learn how socket communication is done in Python, as I spent a lot of time on trial and error trying to modify my lecturer's code.
In my code, the host is the RPi and my laptop is the client. The code is available in AI/models/client.py for the laptop, and in RPi/app.py, lines 25 through 113.
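For orientation, a stripped-down version of that setup looks like this (the IP address and port are placeholders, not my actual values):

```python
import socket

# --- RPi side (host/server) ---
server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
server.bind(("0.0.0.0", 8000))    # placeholder port
server.listen(1)
conn, addr = server.accept()       # blocks until the laptop connects
data = conn.recv(1024).decode()    # e.g. the current number of people

# --- laptop side (client) ---
client = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
client.connect(("192.168.168.169", 8000))    # placeholder RPi address
client.sendall(str(people_number).encode())  # people_number from the counting loop
```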
Write People Number to .csv File
In my code, while the RPi is connected to my laptop, it sits in a loop that constantly checks whether the number of people has changed. If so, it shows the new number on the LCD display (next step) and writes it to a .csv file.
This, again, is a simple function that I had already written for a Sensors and Interfacing assignment during the second semester. You can see it in the picture.
If you are running the code on your own RPi, you will probably need to change the filename, as it's an absolute path to where the file will be stored. The os module checks whether this file exists.
If it does exist, it just writes a new row of the current time and the number of people.
If it doesn't exist, it writes a header row first.
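The function in the picture does roughly the following (the path below is a placeholder for my absolute path):

```python
import csv
import os
from datetime import datetime

FILENAME = "/home/user/people_count.csv"  # placeholder absolute path

def write_measurement(people_number):
    file_exists = os.path.exists(FILENAME)  # os checks whether the file exists
    with open(FILENAME, "a", newline="") as f:
        writer = csv.writer(f)
        if not file_exists:
            writer.writerow(["timestamp", "people_number"])  # header row first
        writer.writerow([datetime.now().isoformat(), people_number])
```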
Power the LCD Display and Show the Number of People on It
I had already worked with this particular LCD display in my Sensors and Interfacing course, so I had written a class for it that handles sending all of the instructions. All I have to do is call the display_text method with the string of text to display.
This is why I encourage you to take a look at it. It's located in the RPi/models/lcd.py file, and, if you're a CTAI or MCT student, you should have little to no trouble understanding it.
If you're still struggling though, I also recommend looking at the datasheet for this display. Here's the link to it: https://cdn.soselectronic.com/productdata/7f/6b/5f682254/bc-1604a-bnheh-1.pdf
What you need to do to power it:
Make sure that I2C is enabled on your RPi.
To do this, SSH into it and run sudo raspi-config, then go to Interface Options -> I2C -> Yes -> Ok -> [tab] -> [tab] -> [enter].
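Once I2C is enabled, using the class is as simple as this sketch suggests (the import path and constructor arguments are assumptions; check RPi/models/lcd.py for the real signature):

```python
from models.lcd import LCD  # assumed import path

lcd = LCD()  # the class handles all the I2C instructions internally
people_number = 12  # example value from the counting loop
lcd.display_text(f"People: {people_number}")
```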
SQLite Database on RPi
After I had done all of the prior steps, I decided to create a SQLite database using the sqlite3 module.
My database consists of two tables, but it isn't fully normalised, so I recommend creating a separate "Teachers" table so that you don't have to repeat the teacher's name every time you create a new "Class" entry.
Attached, you will find a picture of my database structure.
Here is the link to the official SQLite documentation page, which I used to figure out the syntax: https://www.sqlite.org/docs.html
I created a DatabaseRepository class with functions to create and delete classes and measurements, update them, change the coordinates, get all classes and measurements, get the last class ID, and create and delete the tables themselves.
I noticed during testing that if the path to your database is incorrect, a database will still silently be created somewhere and you will not get an error, so pay attention to your path string.
After creating the DatabaseRepository, I also created a DatabaseService class just to give a layer of abstraction for the end user.
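A stripped-down sketch of the repository idea is shown below; the table columns are assumptions based on the description, not my exact schema (see the attached picture for that):

```python
import sqlite3

class DatabaseRepository:
    def __init__(self, db_path):
        # careful: sqlite3 silently creates a new file if the path is wrong!
        self.conn = sqlite3.connect(db_path)

    def create_tables(self):
        self.conn.execute("""CREATE TABLE IF NOT EXISTS classes (
            id INTEGER PRIMARY KEY AUTOINCREMENT,
            name TEXT, teacher TEXT)""")
        self.conn.execute("""CREATE TABLE IF NOT EXISTS measurements (
            id INTEGER PRIMARY KEY AUTOINCREMENT,
            class_id INTEGER REFERENCES classes(id),
            timestamp TEXT, people_in INTEGER, people_out INTEGER)""")
        self.conn.commit()

    def get_last_class_id(self):
        row = self.conn.execute("SELECT MAX(id) FROM classes").fetchone()
        return row[0]
```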
Sending Final Data to RPi
As you saw in the previous step's picture, there is a lot of data that my RPi hadn't received yet.
In my app, when the main communication (the sending of people_number) is done, a dictionary containing the people going in, the people going out, the timestamps of each measurement and the counter line coordinates is sent to the RPi as JSON.
To achieve this, I used the socket module again, but I ultimately had to change the structure of the message that I wanted to send.
Since the length of the message can vary, I prefixed every message with a header containing its length in characters. On the receiving end, I kept checking whether the received data had reached the full length, and once it had, I parsed the JSON and got a dictionary back.
All I needed to do then was to insert the final data in the database.
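The core of that scheme can be sketched like this (the header size is an assumption):

```python
import json

HEADER_SIZE = 10  # assumed fixed-width header holding the body length

def send_json(sock, payload):
    body = json.dumps(payload).encode()
    header = f"{len(body):<{HEADER_SIZE}}".encode()  # length, padded to 10 chars
    sock.sendall(header + body)

def recv_json(sock):
    length = int(sock.recv(HEADER_SIZE).decode())
    body = b""
    while len(body) < length:  # keep reading until the full message is in
        body += sock.recv(4096)
    return json.loads(body.decode())
```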
Create GUI for Selecting Classroom
I decided that I wanted a GUI for selecting my classroom, so that running the code would be as simple as it can be for the end user.
To do this, I used the customtkinter module in Python. The difference from the regular tkinter module is that customtkinter looks a bit more modern and has a dark theme.
The code for this UI is in RPi/models/custom_tkinter.py
I have also added a diagram of it here.
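A minimal version of such a selection menu could look like this (the classroom names here are made up):

```python
import customtkinter as ctk

ctk.set_appearance_mode("dark")
app = ctk.CTk()
app.title("Select classroom")

selected = {"classroom": None}

def choose(name):
    selected["classroom"] = name
    app.destroy()  # close the menu so the rest of the script can continue

for name in ["A1.01", "A1.02", "B2.03"]:  # made-up classroom names
    ctk.CTkButton(app, text=name, command=lambda n=name: choose(n)).pack(pady=5)

app.mainloop()
print("Chosen classroom:", selected["classroom"])
```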
Send Coordinates to Laptop
I made a loop at the start of each app.py file (on the laptop and the RPi) that waits until there is a connection.
As soon as there is one, the RPi sends a 0 or a 1; this number indicates whether the RPi will use existing coordinates (0) or save new ones (1).
If it's 0, the RPi sends the coordinates and the counter line is drawn from the get-go.
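Sketched out, the handshake looks something like this (conn and client are the sockets from the earlier socket step; saved_coordinates is a hypothetical variable loaded from the database):

```python
import json

# --- RPi side ---
if saved_coordinates is not None:
    conn.sendall(b"0")  # 0 = reuse the stored counter line
    conn.sendall(json.dumps(saved_coordinates).encode())
else:
    conn.sendall(b"1")  # 1 = the laptop will draw and send a new line

# --- laptop side ---
flag = client.recv(1)
if flag == b"0":
    line_points = json.loads(client.recv(1024).decode())
    # the counter line can now be drawn from the very first frame
```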
Create a Website for Displaying Database Data and Editing It
I also created a website that displays useful statistics using the Matplotlib pyplot and seaborn libraries, and creates a Pandas data frame for each table. The user can also edit the entries of the database by entering a password.
It's hosted on my RPi, but it can be accessed from any device, just with a different link. The website is even compatible with small screens, like the ones on phones (this comes by default).
For creating the website, I used the streamlit library. It is a great way of displaying data frames and plots, and it provides a high-level interface for coding UIs.
I followed tutorials on their website for everything I did: https://docs.streamlit.io/
The code I wrote is available at RPi/models/streamlit.py
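To give an idea of how little code Streamlit needs, here is a minimal sketch (the database path, table name and column names are assumptions):

```python
import sqlite3
import pandas as pd
import streamlit as st

conn = sqlite3.connect("/home/user/classrooms.db")  # placeholder path
measurements = pd.read_sql_query("SELECT * FROM measurements", conn)

st.title("Classroom Occupation")
st.dataframe(measurements)  # interactive table of every measurement
st.line_chart(measurements.set_index("timestamp")["people_in"])
```

Starting it with streamlit run prints both a local and a network URL, which is why other devices can reach the site with a different link.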
FINAL: Running the Code
That's it! This is the final stretch. All you need to do now is run the code.
First, you need to run the RPi code. Do this by opening a connection in PuTTY; the host name should be [your_username]@[your RPi's IP address].
To actually run the code, run DISPLAY=:0 python [path to the RPi's app.py] if you don't have X11 forwarding enabled (this sends the GUI to the Pi's own desktop); with X11 forwarding, plain python [path to the RPi's app.py] is enough.
Now, navigate to the GUI. It's either on your laptop's desktop or your RPi's desktop (depending on whether you enabled X11 forwarding).
Next, run the laptop's app.py, and everything should work.
Thank you for following this tutorial, and I hope it helps you do something important, or at least fun.