Toll Gate Vehicle Counting
Do toll gates already have a system for counting incoming vehicles? Of course; toll gates have been around for a long time. But what if we approach the problem with computer vision?
So the idea is this: with computer vision we can detect objects (cars, buses, trucks, or vans). Then we can track each object (give it an ID so it won't be mistaken for another object). If an object with an ID enters the toll gate, we count it (register its ID and class). And boom, we've come up with a solution.
How is it implemented in code? Is it easy? Absolutely. To recap, there are three things we have to build (see the picture above for details):
- Object detection model.
- Object tracking model.
- A function to record the ID and class.
🌟 YOLOv5 + SORT
For the object detection model, we will choose YOLOv5 (Why? Why not). It is one of the easiest object detection models to get running. You don't need to understand rocket science to use it. We will also use a pre-trained model, so we don't need to do any training.
For the object tracking model, we will use SORT (Simple Online and Realtime Tracking). Luckily, some people have made the SORT algorithm easy to use; visit here.
Lastly, the function to record the ID and class. This is the part that we will build in detail. Put simply, imagine we are using Excel to enter the ID and class into their respective columns. We just replace Excel with pandas.
Nothing seems to have been missed. Let's look at the video that will be used as project material. The source of the video can be seen here. If we take a look, the camera views the gate from the side (ideally it would be from the front). So, how can we know whether an object has passed/entered the toll gate? The answer is simple. We just need to:
- Draw a straight line
- Track the objects and check whether they have crossed the line
- Enter the ID and class into the database
🪢 Line Intersection Method
The second step becomes a problem when viewed from a code perspective. From each object, we can get its midpoint (x, y). An object can be said to have crossed the line if its (x, y) value is greater or less than the (x, y) value of the line (depending on the context). The problem is that the line's (x, y) value changes along its length because the line is drawn at an angle. This problem can be solved with the line intersection method.
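To make the idea concrete, here is a minimal sketch of one common way to test for a line crossing: check whether the segment between an object's previous and current centroid intersects the counting line, using the sign of cross products. The function names, the line endpoints, and the centroid values are assumptions made up for illustration; this is not the code used in the project.

```python
def orientation(p, q, r):
    """Cross product (q - p) x (r - p); its sign tells which side of p->q the point r is on."""
    return (q[0] - p[0]) * (r[1] - p[1]) - (q[1] - p[1]) * (r[0] - p[0])


def segments_intersect(a1, a2, b1, b2):
    """Return True if segment a1-a2 strictly crosses segment b1-b2."""
    d1 = orientation(b1, b2, a1)
    d2 = orientation(b1, b2, a2)
    d3 = orientation(a1, a2, b1)
    d4 = orientation(a1, a2, b2)
    # The endpoints of each segment must lie on opposite sides of the other one.
    return d1 * d2 < 0 and d3 * d4 < 0


# Hypothetical counting line drawn at an angle across the frame.
line_start, line_end = (0, 195), (390, 292)

# Centroid of one tracked object in two consecutive frames (made-up values).
prev_centroid, curr_centroid = (200, 260), (200, 230)

crossed = segments_intersect(prev_centroid, curr_centroid, line_start, line_end)
print(crossed)
```

Note that this check needs the previous centroid of every tracked ID, so that bookkeeping has to be kept per object.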
🔭 Perspective Transform
Unfortunately, the line intersection method is challenging to implement (in code) when we have a lot of objects (or maybe I'm the stupid one). Therefore, we will use a perspective transformation to turn the angled line into an axis-aligned line (a constant value on either the y-axis or the x-axis), so that crossing it becomes a simple comparison.
We will refine the flowchart from the previous diagram by adding the perspective transformation step.
🦾 Implementing in Code
We now have everything needed to tackle this problem. It's time for execution. We're not going to get into how YOLOv5 and SORT work; I can't really explain them. What we do know is that, for inference (detection and tracking), we can run the script below:
    python track.py \
        --source ./data/toll_gate.mp4 \
        --yolo-weights ./weights/yolov5s.pt \
        --strong-sort-weights ./weights/osnet_x0_25_msmt17.pt \
        --classes 2 5 7 \
        --save-vid
We will focus on creating a perspective transformation and vehicle entry function. Let's start with the perspective transformation first.
🔭 Perspective Transformation Code
A perspective transformation changes the perspective from one view to another. In this case, we go from the video view to a top-down/bird's-eye view (BEV). Luckily, OpenCV already provides the necessary functionality.
    import cv2
    import numpy as np


    def get_trans_mtx(frame, src=None, dst=None):
        """Get transformation matrix"""
        # frame.shape is (height, width, channels)
        H, W = frame.shape[0], frame.shape[1]
        if src is None and dst is None:
            # default to toll_gate.mp4 frame
            src = [(0, 195), (390, 292), (485, 262), (75, 175)]  # counter-clock (bottom-left)
            dst = [[0, 125], [W, 125], [W, 25], [0, 25]]  # counter-clock (bottom-left)
        src = np.float32(np.array(src))
        dst = np.float32(dst)
        matrix = cv2.getPerspectiveTransform(src, dst)
        return matrix
get_trans_mtx() returns the transformation matrix. It takes two inputs: four source points and four destination points. The four source points are picked from a frame of the video. The four destination points are the corners that those source points should map to in the transformed frame, so they effectively set its size.
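    def get_trans_point(xyxy, matrix):
        """Get transformed point with trans mtx"""
        # centroid of the bounding box (x1, y1, x2, y2)
        cx = (xyxy[2] - xyxy[0]) / 2 + xyxy[0]
        cy = (xyxy[3] - xyxy[1]) / 2 + xyxy[1]
        # perspectiveTransform expects a float array of shape (N, 1, 2)
        centroid = np.array([[[cx, cy]]], dtype=np.float32)
        trans_cent = cv2.perspectiveTransform(centroid, matrix)
        return (int(trans_cent[0][0][0]), int(trans_cent[0][0][1]))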
get_trans_point() transforms the center point of an object (its centroid) from the video view into another view. It takes the object's bounding-box coordinates (in xyxy format) and a transformation matrix. There are other functions in script/plot.py that we will not describe because they are easy to understand.
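For intuition about what happens to the centroid, the sketch below applies a 3x3 perspective matrix to a point by hand: multiply by the homogeneous point (x, y, 1), then divide by the third component. The helper names and the matrix values are made up for illustration; in the project the matrix comes from get_trans_mtx() and the mapping is done by OpenCV.

```python
def bbox_centroid(xyxy):
    """Center of a bounding box given as (x1, y1, x2, y2)."""
    x1, y1, x2, y2 = xyxy
    return (x1 + x2) / 2, (y1 + y2) / 2


def apply_homography(matrix, point):
    """Map (x, y) through a 3x3 perspective matrix given as nested lists."""
    x, y = point
    # Multiply the matrix by the homogeneous point (x, y, 1).
    xh = matrix[0][0] * x + matrix[0][1] * y + matrix[0][2]
    yh = matrix[1][0] * x + matrix[1][1] * y + matrix[1][2]
    w = matrix[2][0] * x + matrix[2][1] * y + matrix[2][2]
    # Divide by w to return to ordinary 2-D coordinates.
    return xh / w, yh / w


# Made-up matrix: scale by 2, shift by (10, 20), plus a small perspective term.
demo_matrix = [
    [2.0, 0.0, 10.0],
    [0.0, 2.0, 20.0],
    [0.0, 0.001, 1.0],
]

centroid = bbox_centroid((100, 50, 140, 90))
transformed = apply_homography(demo_matrix, centroid)
print(centroid, transformed)
```

The last row of the matrix is what makes this a perspective transform rather than a plain scale and shift: it makes the divisor w depend on where the point is in the frame.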
🏎️ Vehicle Entry Code
Next, let's shift to the vehicle entry function. The Counting class takes a line argument, which is the position of the counting line along the y-axis.
    import pandas as pd


    class Counting:
        """Counting vehicle entrance gate"""
        def __init__(self, line, classes=None):
            self.line = line
            ...
            self.df = pd.DataFrame(columns=['ids', 'cls'])

        def count(self, cxcy, id, cls):
            """Count vehicle entrance gate"""
            cx, cy = cxcy[0], cxcy[1]
            # count only once: past the line and ID not registered yet
            if cy < self.line and id not in self.df['ids'].unique():
                # append id to dataframe
                self.df = pd.concat(
                    [self.df, pd.DataFrame([[id, self.classes[cls]]], columns=['ids', 'cls'])],
                    ignore_index=True)
            ...
The Counting class has a count() function, which registers vehicles by their ID and class. A vehicle is counted if it has crossed the line and its ID does not yet exist in the database. The database is a pandas DataFrame.
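The count-once rule can be tried out without pandas. The sketch below is a hypothetical stand-in (SimpleCounter, its totals() helper, the line value, and the sample observations are all assumptions, not part of the project) that keeps the same condition: register an ID only the first time its transformed centroid is past the line.

```python
from collections import Counter


class SimpleCounter:
    """Hypothetical pandas-free stand-in for the count-once logic."""

    def __init__(self, line):
        self.line = line    # y-coordinate of the counting line
        self.entries = {}   # track ID -> class name

    def count(self, cxcy, track_id, cls_name):
        _, cy = cxcy
        # Register the vehicle once: past the line and not seen before.
        if cy < self.line and track_id not in self.entries:
            self.entries[track_id] = cls_name

    def totals(self):
        """Number of registered vehicles per class."""
        return Counter(self.entries.values())


counter = SimpleCounter(line=75)

# (transformed centroid, track ID, class name) per frame; made-up values.
observations = [
    ((50, 90), 1, "car"),     # not past the line yet
    ((50, 70), 1, "car"),     # crosses: registered
    ((50, 60), 1, "car"),     # already registered: ignored
    ((80, 72), 2, "truck"),   # crosses: registered
    ((30, 110), 3, "bus"),    # never crosses
]
for cxcy, track_id, cls_name in observations:
    counter.count(cxcy, track_id, cls_name)

totals = counter.totals()
print(dict(totals))
```

A dictionary keyed by track ID makes the "already counted" check a constant-time lookup, while the DataFrame version is convenient when the entries need to be exported or summarized later.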
🖲️ Source Code
The source code can be found in ruhyadi/Toll-Gate-Vehicle-Counting