
Pedestrian and Car Detection

Introduction

Pedestrian detection in self-driving cars involves the use of various sensors, computer vision techniques, and machine learning algorithms.

 

How it works

Here's a general overview of the process:

 

Sensor Inputs: Self-driving cars are equipped with a combination of sensors such as cameras, lidar (light detection and ranging), radar, and sometimes ultrasonic sensors. These sensors provide different types of data, including visual images, depth information, and object detection measurements.

 

 

Preprocessing: The sensor data is preprocessed to enhance the quality, remove noise, and prepare it for further analysis. For visual data from cameras, preprocessing may involve tasks such as image resizing, color space conversion, and image enhancement.
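As a minimal sketch of those preprocessing steps with OpenCV (the file name and parameter values below are illustrative assumptions, not part of any particular vehicle's pipeline):

import cv2

image = cv2.imread('camera_frame.jpg')              # hypothetical camera frame

resized = cv2.resize(image, (640, 360))             # resize to a fixed working resolution
gray = cv2.cvtColor(resized, cv2.COLOR_BGR2GRAY)    # color space conversion
denoised = cv2.GaussianBlur(gray, (5, 5), 0)        # simple noise reduction
enhanced = cv2.equalizeHist(denoised)               # basic contrast enhancement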

 

Object Detection: Object detection algorithms are applied to the sensor data to identify and locate potential pedestrians in the scene. The algorithms analyze the visual data, looking for patterns, shapes, and motion cues that indicate the presence of pedestrians.

  • Machine learning-based approaches: Many pedestrian detection algorithms use machine learning techniques, such as deep learning, to train models on large datasets of labeled pedestrian images. These models learn to recognize pedestrian features and can classify regions of interest in the input data as either pedestrians or non-pedestrians.

  • Haar cascades and HOG (Histogram of Oriented Gradients): These are traditional computer vision techniques that use handcrafted features and classifiers to detect pedestrians based on visual patterns and gradients in the image.
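OpenCV, for example, ships a HOG descriptor together with a pre-trained people detector. A minimal sketch of using it (the input image path is a placeholder):

import cv2

# HOG descriptor with OpenCV's built-in pedestrian (people) detector
hog = cv2.HOGDescriptor()
hog.setSVMDetector(cv2.HOGDescriptor_getDefaultPeopleDetector())

image = cv2.imread('street.jpg')                    # hypothetical input image
boxes, weights = hog.detectMultiScale(image, winStride=(8, 8))

for (x, y, w, h) in boxes:
    cv2.rectangle(image, (x, y), (x + w, y + h), (0, 0, 255), 2)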

Tracking and Fusion: Once a pedestrian is detected in a single frame, tracking algorithms are employed to maintain continuity and track the pedestrian's position across subsequent frames. This helps to handle occlusions and track the pedestrian's movement accurately. Sensor fusion techniques combine the information from different sensors, such as cameras and lidar, to obtain a more reliable and complete understanding of the pedestrian's position, size, and motion.
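As an illustration of the tracking idea only, a toy centroid-based matcher could associate detections across frames as sketched below. Production systems use far more robust trackers and fuse camera, lidar and radar data; every name and threshold here is hypothetical.

import numpy as np

def match_detections(prev_tracks, detections, max_dist=50.0):
    # prev_tracks: dict {track_id: (cx, cy)} from the previous frame
    # detections:  list of (x, y, w, h) boxes from the current frame
    # returns an updated {track_id: (cx, cy)}; unmatched detections get new ids
    updated = {}
    next_id = max(prev_tracks, default=-1) + 1
    for (x, y, w, h) in detections:
        cx, cy = x + w / 2.0, y + h / 2.0
        best_id, best_dist = None, max_dist
        for tid, (px, py) in prev_tracks.items():
            d = np.hypot(cx - px, cy - py)
            if d < best_dist and tid not in updated:
                best_id, best_dist = tid, d
        if best_id is None:            # no nearby track: start a new one
            best_id = next_id
            next_id += 1
        updated[best_id] = (cx, cy)    # matched (or new) track keeps this centre
    return updated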

 

Decision-making: The detected pedestrian information is used by the self-driving car's decision-making system to determine appropriate actions. For example, if a pedestrian is detected crossing the road, the car's system may decide to slow down or stop to avoid a collision.

 

It's important to note that pedestrian detection is a challenging task due to various factors such as lighting conditions, occlusions, different pedestrian appearances, and complex environments. Therefore, a combination of sensor inputs, robust algorithms, and continuous improvements in machine learning models is necessary to achieve accurate and reliable pedestrian detection in self-driving cars.


Tutorial 1: Pedestrian Detection using Haar Cascade

 

import numpy as np
import cv2
import matplotlib.pyplot as plt
import time

# load the full-body Haar cascade and open the test video
ped_cascade = cv2.CascadeClassifier('media/M4/haarcascade_fullbody.xml')
cap = cv2.VideoCapture('media/M4/vtest.avi')
cv2.namedWindow('img', cv2.WINDOW_NORMAL)

while True:
    ret, image = cap.read()
    if not ret:
        break

    # resize to 50% to speed up detection
    image = cv2.resize(image, (0, 0), fx=0.5, fy=0.5)
    pedestrians = ped_cascade.detectMultiScale(image)

    # draw a red box around each detected pedestrian
    for (x, y, w, h) in pedestrians:
        cv2.rectangle(image, (x, y), (x + w, y + h), (0, 0, 255), 2)

    cv2.imshow('img', image)
    k = cv2.waitKey(1)
    if k == ord('q'):
        break

cap.release()
cv2.destroyAllWindows()


Output


Tutorial 2: Cat Detection using Haar Cascade

import numpy as np
import cv2
import matplotlib.pyplot as plt
import time

# load the cat-face Haar cascade and open the test video
cat_cascade = cv2.CascadeClassifier('media/M4/haarcascade_frontalcatface.xml')
cap = cv2.VideoCapture('media/M4/catrec.wmv')
cv2.namedWindow('img', cv2.WINDOW_NORMAL)

while True:
    ret, image = cap.read()
    if not ret:
        break

    # just resizing to 70% of the size to increase speed
    image = cv2.resize(image, (0, 0), fx=0.7, fy=0.7)
    cat_faces = cat_cascade.detectMultiScale(image)

    # draw a red box and a label under each detected cat face
    for (x, y, w, h) in cat_faces:
        cv2.rectangle(image, (x, y), (x + w, y + h), (0, 0, 255), 2)
        cv2.putText(image, 'Cat Detected', (x, y + h + 15), cv2.FONT_HERSHEY_SIMPLEX, 0.5, (0, 0, 255), 2, cv2.LINE_AA)

    cv2.imshow('img', image)
    k = cv2.waitKey(25)
    if k == ord('q'):
        break

cap.release()
cv2.destroyAllWindows()


Output


Car Detection with FPS

 

Here’s how you can do car detection in a similar way, while also measuring the processing frame rate (FPS).

# imports repeated here so this block runs on its own
import cv2
import time

car_cascade = cv2.CascadeClassifier('media/M4/carshaar.xml')
cap = cv2.VideoCapture('media/M4/carsvid.wmv')
cv2.namedWindow('img', cv2.WINDOW_NORMAL)
fps = 0  # set initial fps variable to 0

while True:
    start_time = time.time()  # note the current time at the start of the loop
    ret, image = cap.read()
    if not ret:
        break

    # detect on a half-size copy to increase speed
    resized_image = cv2.resize(image, (0, 0), fx=0.5, fy=0.5)
    cars = car_cascade.detectMultiScale(resized_image)

    # scale the detections back up to the original image size
    ratio = 1 / 0.5
    for (x, y, w, h) in cars:
        x = int(x * ratio)
        y = int(y * ratio)
        w = int(w * ratio)
        h = int(h * ratio)

        cv2.rectangle(image, (x, y), (x + w, y + h), (0, 0, 255), 2)
        cv2.putText(image, 'Car Detected', (x, y - 10), cv2.FONT_HERSHEY_SIMPLEX, 0.3, (0, 255, 0), 1, cv2.LINE_AA)

    # draw the FPS once per frame (outside the per-car loop)
    cv2.putText(image, 'FPS: {:.2f}'.format(fps), (20, 50), cv2.FONT_HERSHEY_SIMPLEX, 0.7, (0, 255, 0), 1, cv2.LINE_AA)

    cv2.imshow('img', image)

    # subtract the start time from the current time to get the total time for this loop;
    # 1 frame divided by that time gives the FPS, displayed on the next frame
    total_time = time.time() - start_time
    fps = 1.0 / total_time

    k = cv2.waitKey(1)
    if k == ord('q'):
        break

cap.release()
cv2.destroyAllWindows()

Output


Lane Detection

Introduction

Traffic accidents have become one of the most serious problems in today’s world. The increase in the number of vehicles, human errors with respect to traffic rules, and the difficulty drivers have in overseeing situational dangers contribute to the majority of accidents on the road. Lane detection is an important component of advanced driver assistance systems (ADAS) and autonomous vehicles: it provides information about the road layout and the position of the vehicle within the lane, which is crucial for navigation and safety. A lane detection system uses cameras or sensors to identify and track the lane markings on the road. Its primary purpose is to assist drivers in staying within their designated lanes and avoiding unintentional lane departures. The system analyzes the captured images or sensor data in real time and provides feedback to the driver through visual, auditory, or haptic alerts.


Types of Lane Markings

 

 

Types of lane boundaries: (a) Dashed. (b) Dashed-solid. (c) Solid-dashed. (d) Single solid. (e) Double solid.

A road is any area the public is reasonably allowed to drive on, including streets, highways, riverbeds, beaches, wharves and car parks. But in terms of what we conventionally consider to be a road, there are different types:

 

 

 

Single-lane roads

 

Unmarked: These tend to be called lanes, alleys, backroads and drives. They consist of a narrow road, often barely wide enough for two vehicles to pass, with no or few markings. They might be sealed, but they could be gravel or dirt.

 

 

Marked: These tend to be called streets, roads, routes, single carriageways, highways, or boulevards. They consist of one lane in either direction separated by a centre line.

Multi-lane roads

 

Three lanes: a road with a marked overtaking lane in one direction has a passing lane. Traffic in the opposite direction can use it if the way is clear and there is no restriction such as bollards, fences or a solid yellow line. Some ordinary streets or roads in urban areas have two lanes in one direction and one in the other.

Four lanes: These tend to be either dual carriageways (usually roads in urban areas) or expressways (high-speed roads that don't qualify for motorway status). Some 'arterial roads' are dual carriageways, but they could be single carriageways in places. Sometimes the opposing lanes are separated with a median barrier or median strip.

Four or more lanes: a street or road (when used in a 50 km/h urban setting), or a motorway when it is high-speed with grade-separated access (i.e. on-ramps and off-ramps).

 

How it works

 

Sensor/Camera Acquisition: The system uses one or more cameras or sensors to capture the view of the road ahead. These sensors are typically mounted on the vehicle's front windshield or in other strategic positions.

Image Processing: The captured images or sensor data are processed using computer vision algorithms. These algorithms analyze the pixels or sensor readings to identify lane markings, such as solid lines, dashed lines, or other road boundaries.

 

 

Lane Marking Extraction: The system extracts relevant lane markings from the processed data. It distinguishes between different types of lines, such as lane dividers, centerlines, or edge lines, to accurately determine the vehicle's position within the lanes.

Lane Tracking and Positioning: Once the lane markings are identified, the system tracks their position relative to the vehicle's current position. It calculates the lateral deviation of the vehicle from the center of the detected lane and provides feedback to the driver accordingly.

Lane Departure Warning: If the system detects that the vehicle is drifting out of its lane without the driver signaling an intention to change lanes, it issues a warning to alert the driver. The warning can be in the form of visual alerts on the dashboard, audible alarms, or vibrations in the steering wheel or seat.
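A rough sketch of the positioning and departure-warning logic described above, assuming the x-positions of the two detected lane lines near the bottom of the image are already available (the 25% threshold is an arbitrary illustrative value):

def lane_departure_warning(left_x, right_x, frame_width, threshold=0.25):
    # left_x, right_x: x-positions (pixels) of the detected lane lines near the
    # bottom of the image; the camera is assumed to be mounted at the lateral
    # centre of the vehicle, so the image centre stands in for the vehicle centre
    lane_centre = (left_x + right_x) / 2.0
    vehicle_centre = frame_width / 2.0
    lane_width = float(right_x - left_x)

    # lateral deviation as a fraction of the lane width (0 = perfectly centred)
    deviation = (vehicle_centre - lane_centre) / lane_width

    if deviation > threshold:
        return 'WARNING: drifting towards the right lane boundary'
    if deviation < -threshold:
        return 'WARNING: drifting towards the left lane boundary'
    return 'OK: vehicle centred in lane (deviation {:.2f})'.format(deviation)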

Lane detection systems can enhance road safety by preventing accidents caused by unintended lane departures, drowsy driving, or distraction. They are often integrated with other ADAS features, such as adaptive cruise control, collision warning, or automatic emergency braking, to provide a comprehensive safety package for vehicles.

 

 

Tutorial: Lane Detection using OpenCV

 

https://data-flair.training/blogs/road-lane-line-detection/


Algorithm of Lane detection

  • Feed live images frame by frame

  • Convert the camera view to world coordinates

  • Apply color selection

  • Remove distortion and blur

  • Edge detection and perspective transformation (Canny edge)

  • Color-based threshold detection (for lane color detection)


Hough transformation
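The Hough transform detects straight lines in an edge image: each edge pixel votes for every line that could pass through it, and strong peaks in the vote space correspond to line candidates, so dashed or partially occluded lane markings can still be recovered. A minimal sketch of the probabilistic variant used in the tutorials below (the input image path is a placeholder, and the parameter values simply mirror those tutorials):

import cv2
import numpy as np

gray = cv2.imread('road.jpg', cv2.IMREAD_GRAYSCALE)   # hypothetical road image
edges = cv2.Canny(gray, 50, 150)                      # binary edge map

lines = cv2.HoughLinesP(
    edges,
    rho=1,                 # distance resolution of the accumulator (pixels)
    theta=np.pi / 180,     # angular resolution of the accumulator (radians)
    threshold=50,          # minimum number of votes to accept a line
    minLineLength=50,      # discard segments shorter than this
    maxLineGap=100         # bridge gaps up to this size between segments
)

if lines is not None:
    for x1, y1, x2, y2 in lines[:, 0]:
        print('segment from ({}, {}) to ({}, {})'.format(x1, y1, x2, y2))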

 

 

ROI Extraction

Given that the vehicle-mounted camera is installed in the middle of the roof of the smart car at a certain depression angle, the detected image contains useless background information, such as the sky, trees and hillsides on both sides of the road. The effective detection portion, such as the lane, accounts for approximately two-thirds of the area of the detected image; this region is called the ROI. Extracting the ROI from the detected image reduces unnecessary computation and shortens the time consumed by the subsequent image-processing steps.
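A minimal sketch of ROI extraction by cropping away the upper part of the frame (the fraction kept follows the two-thirds figure above; the image path is a placeholder):

import cv2

frame = cv2.imread('road.jpg')            # hypothetical input frame
height, width = frame.shape[:2]

roi_top = height // 3                     # discard the top third (sky, trees, hillsides)
roi = frame[roi_top:height, 0:width]      # subsequent processing runs on 'roi' only

cv2.imshow('ROI', roi)
cv2.waitKey(0)
cv2.destroyAllWindows()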

 

Inverse Perspective Transformation

Perspective is a property of camera images: the closer an object is to the camera, the bigger it looks, and vice versa. Parallel lane lines appear to merge into a single point in the distant field of view, called the vanishing point. The distance between adjacent lanes decreases gradually near the vanishing point, which is detrimental to effective lane detection. Inverse perspective transformation is based on the inverse coordinate transformation from the world coordinate system to the image coordinate system; it transforms the perspective image into an aerial (bird's-eye) view and restores the parallel relationship between the lane lines.

 

Once camera calibration is completed, the internal and external parameters of the camera are obtained. The internal parameters include the focal length and the optical centre, whereas the external parameters include the elevation angle, the yaw angle and the height of the camera relative to the ground. From these, the inverse coordinate transformation from the world coordinate system to the image coordinate system can be derived. Rather than simply removing image distortion, inverse perspective transformation maps the position points in the detected image to new position points in the overlooking perspective, producing an aerial view of the road image such as the one below.

 

 

The aerial view of the lane.
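A minimal sketch of obtaining such an aerial view with OpenCV's perspective-warp functions. The four source points below are illustrative assumptions; in practice they come from camera calibration or from manually marking a straight, flat lane segment:

import cv2
import numpy as np

frame = cv2.imread('road.jpg')                     # hypothetical input frame
height, width = frame.shape[:2]

# four points on the road in the camera image (a trapezoid), chosen by hand here
src = np.float32([[width * 0.45, height * 0.60],   # top-left
                  [width * 0.55, height * 0.60],   # top-right
                  [width * 0.90, height * 0.95],   # bottom-right
                  [width * 0.10, height * 0.95]])  # bottom-left

# where those points should land in the bird's-eye view (a rectangle)
dst = np.float32([[width * 0.25, 0],
                  [width * 0.75, 0],
                  [width * 0.75, height],
                  [width * 0.25, height]])

M = cv2.getPerspectiveTransform(src, dst)          # 3x3 perspective matrix
birds_eye = cv2.warpPerspective(frame, M, (width, height))

cv2.imshow('aerial view', birds_eye)
cv2.waitKey(0)
cv2.destroyAllWindows()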

 

Mask operation

A mask operation is often used to define a region of interest (ROI) within the input image. The purpose of this operation is to limit the analysis and processing to the area where the lane markings are expected to be present. By applying a mask, irrelevant parts of the image outside the ROI can be ignored, improving the efficiency and accuracy of the lane detection algorithm.

 

  1. Mask Creation: A mask is created to define the ROI within the image. The mask is typically a binary image of the same size as the input image, where the region inside the ROI is set to white (255) and the region outside the ROI is set to black (0).

  2. Applying the Mask: The mask is applied by performing an element-wise bitwise AND operation between the input image and the mask image. This operation preserves the pixels in the input image that correspond to the white (255) pixels in the mask image, while setting the pixels outside the ROI to black (0), effectively masking them out.
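A minimal sketch of this mask operation, using the same triangular ROI as the tutorials below (the image path is a placeholder):

import cv2
import numpy as np

gray = cv2.imread('road.jpg', cv2.IMREAD_GRAYSCALE)   # hypothetical road image
edges = cv2.Canny(gray, 50, 150)
height, width = edges.shape[:2]

# 1. Mask creation: black image with the ROI polygon filled in white (255)
mask = np.zeros_like(edges)
roi_vertices = np.array([[(0, height), (width // 2, height // 2), (width, height)]], dtype=np.int32)
cv2.fillPoly(mask, roi_vertices, 255)

# 2. Applying the mask: keep edge pixels inside the ROI, zero out the rest
masked_edges = cv2.bitwise_and(edges, mask)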

 

 

Tutorials

Detection of lanes on a live webcam video feed

import cv2
import numpy as np

# Function to detect lane lines in a frame
def detect_lanes(frame):
    # Convert frame to grayscale
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    
    # Apply Gaussian blur
    blurred = cv2.GaussianBlur(gray, (5, 5), 0)
    
    # Apply Canny edge detection
    edges = cv2.Canny(blurred, 50, 150)
    
    # Mask region of interest (ROI)
    mask = np.zeros_like(edges)
    height, width = frame.shape[:2]
    vertices = np.array([[(0, height), (width // 2, height // 2), (width, height)]], dtype=np.int32)
    cv2.fillPoly(mask, vertices, 255)
    masked_edges = cv2.bitwise_and(edges, mask)
    
    # Apply Hough transform to detect lines
    lines = cv2.HoughLinesP(masked_edges, 1, np.pi/180, 50, minLineLength=50, maxLineGap=100)
    
    # Draw detected lines on the frame
    if lines is not None:
        for line in lines:
            x1, y1, x2, y2 = line[0]
            cv2.line(frame, (x1, y1), (x2, y2), (0, 255, 0), 3)
    
    return frame

# Initialize webcam
cap = cv2.VideoCapture(0)  # 0 for the default webcam, change if necessary

# Check if webcam is opened successfully
if not cap.isOpened():
    print("Error opening webcam")
else:
    while True:
        ret, frame = cap.read()
        if not ret:
            break
        
        # Detect lanes in the frame
        frame_with_lanes = detect_lanes(frame)
        
        # Display frame with detected lanes
        cv2.imshow('Lane Detection', frame_with_lanes)
        
        # Press 'q' to exit
        if cv2.waitKey(1) & 0xFF == ord('q'):
            break
    
    # Release the webcam and close all windows
    cap.release()
    cv2.destroyAllWindows()


Output
 


Detection of lanes in a video file


import cv2
import numpy as np

# Function to detect lane lines in a frame
def detect_lanes(frame):
    # Convert frame to grayscale
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    
    # Apply Gaussian blur
    blurred = cv2.GaussianBlur(gray, (5, 5), 0)
    
    # Apply Canny edge detection
    edges = cv2.Canny(blurred, 50, 150)
    
    # Mask region of interest (ROI)
    mask = np.zeros_like(edges)
    height, width = frame.shape[:2]
    vertices = np.array([[(0, height), (width // 2, height // 2), (width, height)]], dtype=np.int32)
    cv2.fillPoly(mask, vertices, 255)
    masked_edges = cv2.bitwise_and(edges, mask)
    
    # Apply Hough transform to detect lines
    lines = cv2.HoughLinesP(masked_edges, 1, np.pi/180, 50, minLineLength=50, maxLineGap=100)
    
    # Draw detected lines on the frame
    if lines is not None:
        for line in lines:
            x1, y1, x2, y2 = line[0]
            cv2.line(frame, (x1, y1), (x2, y2), (0, 255, 0), 3)
    
    return frame

# Read video file
video_path = 'your_video_path.mp4'  # Replace 'your_video_path.mp4' with your video file path
cap = cv2.VideoCapture(video_path)

# Check if video is opened successfully
if not cap.isOpened():
    print("Error opening video file")
else:
    while True:
        ret, frame = cap.read()
        if not ret:
            break
        
        # Detect lanes in the frame
        frame_with_lanes = detect_lanes(frame)
        
        # Display frame with detected lanes
        cv2.imshow('Lane Detection', frame_with_lanes)
        
        # Press 'q' to exit
        if cv2.waitKey(1) & 0xFF == ord('q'):
            break
    
    # Release the video file and close all windows
    cap.release()
    cv2.destroyAllWindows()
 

Output

