Making Machine See Notes - Class 12 AI (843)

Master every concept from Unit-3 Making Machines See with these Class 12 AI notes. We’ve simplified Computer Vision as per the latest CBSE curriculum into clear, easy-to-understand points designed specifically for your board exams. Simple explanations, focused content—everything you need to score full marks.

Computer Vision is a branch of Artificial Intelligence (AI) that uses sensing devices and deep learning techniques to enable machines to understand and interpret visual information.

Contents hide

1. What is Computer Vision?

2. Working Of Computer Vision

3. Computer Vision – Process:

4. Applications of Computer Vision

5. Challenges of Computer Vision

6. Future of Computer Vision

What is Computer Vision?

It is similar to human vision, using cameras, data, and algorithms like the human retina, optic nerves, and visual cortex.
CV extracts meaningful information from images, videos, and other visual inputs.
Computer Vision systems are used to inspect products, monitor infrastructure, and analyze production processes in real-time.
Deep learning models in CV achieve high accuracy in tasks like facial recognition, object detection, and image classification.

Working Of Computer Vision

Computer Vision focuses on processing and analysing digital images and videos to understand their content.
Understanding digital images is a fundamental part of computer vision.

Basics of Digital Images

A digital image is a picture stored as a sequence of numbers that computer can interpret.
Digital images can be created by several ways such as
- Design software (e.g., Paint, Photoshop)
- Digital cameras
- Scanner

Interpretation of Image in Digital Form

A computer sees an image as a collection of tiny units called pixels (picture elements).
Each pixel represents a specific color value. These pixels together form the digital image.
During digitization, an image is converted into a grid of pixels.
Resolution of image is determined by number of pixels:
- More pixels → Higher resolution → More detail
In monochrome (black & white) images: Pixel values range from 0 to 255 where 0 = Black and 255 = White

Computer Vision – Process:

Computer Vision process involves five stages which are explained below.

Image Acquisition

Image Acquisition is the first stage.
It involves capturing digital images or videos and provides the raw data for further analysis.
The quality of acquired images affects further processing and analysis.
Imaging device capabilities and resolution impact image quality.
Higher resolution captures more details and clearer images.
Image acquisition is affected by factors like:
- Lighting conditions
- Angle of capture
In scientific and medical fields, advanced techniques are used. For example: MRI and CT scans

Preprocessing:

Preprocessing in Computer Vision improves the quality of the acquired image.
It uses various techniques to make images suitable for analysis.

Common Preprocessing Techniques

Noise Reduction
- Removes unwanted elements like blurriness, spots, or distortions and makes the image clearer.
- Example: Removing grainy effects in low-light images.
Image Normalization
- Standardizes pixel values across images. Adjusts values to a fixed range (e.g., 0–1 or -1 to 1).
- Ensures consistency in datasets.
- Example: Converting pixel values from 0–255 to 0–1.
Resizing/Cropping
- Changes image size or aspect ratio and ensures all images have the same dimensions.
- Example: Resizing images to 224 × 224 pixels.
Histogram Equalization
- Adjusts brightness and contrast.
- Spreads pixel intensity evenly and enhances details in dark or bright areas.
- Example: Improving low-contrast images.
Goal of Preprocessing
- Remove noise or disturbances.
- Highlight important features.
- Maintain consistency and uniformity in images.

Feature Extraction

Feature Extraction involves identifying and extracting important visual patterns or attributes from a pre-processed image.
The methods used for feature extraction depend on:
- Complexity of the image
- Available computational resources
- Requirements of the application
Types of Feature Extraction
- Edge Detection
  - Identifies boundaries between regions.
  - Detects sharp changes in intensity.
- Corner Detection
  - Identifies points where two or more edges meet.
  - Focuses on areas with high curvature or sharp changes.
- Texture Analysis
  - Extracts patterns like smoothness, roughness, or repetition.
- Colour-Based Feature Extraction
  - Analyses colour distribution in an image.
  - Helps differentiate objects based on colour.
In deep learning approaches, feature extraction is done automatically using Convolutional Neural Networks (CNNs) during training.

Detection/Segmentation

Detection/Segmentation focuses on identifying objects or regions of interest in an image.
It is important for applications like:
- Autonomous driving
- Medical imaging
- Object tracking
This stage is divided into:
- Single Object Tasks
- Multiple Object Tasks

Single Object Tasks

Focus on analysing or identifying a single object in an image with two main objectives:
Classification
- Determines the category or class of an object and help identify the object. It can uses algorithms KNN (K-Nearest Neighbour) – supervised or K-means clustering – unsupervised
Classification + Localization
- Classifies the object and also locates it. It uses bounding boxes to show object position.

Multiple Object Tasks

Multiple Object Tasks deal with images containing multiple objects or different classes.
Aim to identify and distinguish between various objects in an image.

Types of Multiple Object Tasks

Object Detection
- Identifies and locates multiple objects in an image.
- Draws bounding boxes around each object.Assigns class labels to detected objects.Uses features and learning algorithms for recognition.
- Common algorithms: R-CNN, R-FCN, YOLO (You Only Look Once), SSD (Single Shot Detector)
Image Segmentation
- Image Segmentation creates masks around pixels with similar characteristics.
- It assigns a class to pixels in the input image.
- Helps in understanding the image at a detailed (granular) level.
- Each object is represented using a pixel-wise mask.
- Makes it easier to identify and separate different objects.Uses techniques like edge detection (detecting changes in brightness).
- There are different types of image segmentation available.
Types of Image Segmentation
- Semantic Segmentation
  - Classifies pixels of the same class together.
  - Does not differentiate between objects of the same class.
- Instance Segmentation
  - Classifies pixels for each individual object.
  - Differentiates between objects even if they belong to the same class.
High-Level Processing
- High-Level Processing is the final stage of computer vision.
- It interprets and extracts meaningful information from detected objects or regions in images.
- Helps computers achieve a deeper understanding of visual content.
- Enables informed decision-making based on visual data.
- Tasks include:
  - Object recognition
  - Scene understanding
  - Context analysis

Applications of Computer Vision

Facial Recognition
- Used by social media platforms to detect and tag users.
Healthcare
- Detects diseases and abnormalities.
- Helps in identifying cancerous tumours.
Self-Driving Vehicles
- Understand surroundings using cameras.
- Detect vehicles, objects, traffic signals, and pedestrians.
Optical Character Recognition (OCR)
- Extracts text from images or documents.
- Used in invoices, bills, and articles.
Machine Inspection
- Detects defects and flaws in machines or products.
- Helps in quality control during manufacturing.
3D Model Building
- Creates 3D models from real-world objects.
- Used in robotics, autonomous driving, 3D tracking, and AR/VR.
Surveillance
- Uses CCTV footage to monitor activities.
- Identifies suspicious behaviour and prevents crimes.
Fingerprint Recognition and Biometrics
- Identifies individuals using fingerprints or biometric data.
- Used for authentication and security.

Challenges of Computer Vision

Reasoning and Analytical Issues
- Requires accurate interpretation, not just image identification.
- Needs strong reasoning and analytical capabilities.
- Without these, extracting meaningful insights becomes difficult.
Difficulty in Image Acquisition
- Affected by lighting variations, different perspectives, and scales.
- Complex scenes with multiple objects increase difficulty.Occlusion (objects blocking each other) adds complexity.
- High-quality image data is necessary for accurate analysis.
Privacy and Security Concerns
- Surveillance systems may violate privacy rights.
- Facial recognition raises ethical issues.
- Requires careful handling and regulations.
Duplicate and False Content
- Risk of fake or misleading images/videos.
- Exploitation of system vulnerabilities by malicious users.Data breaches can spread duplicate or false content.
- Leads to misinformation and reputational damage.

Future of Computer Vision

Computer Vision has evolved from basic image processing to advanced systems with human-like precision.
Growth is driven by:
- Advances in deep learning algorithms
- Availability of large amounts of labelled data
Machines can now perceive and analyse images and videos effectively.
Future possibilities of computer vision are vast and impactful.
It will have a significant and far-reaching impact on societ
Computer vision can bring transformative benefits for humanity.