
Making Machines See Notes – Class 12 AI (843) | Unit-3 Computer Vision

Master every concept from Unit-3 Making Machines See with these Class 12 AI notes. We’ve simplified Computer Vision as per the latest CBSE curriculum into clear, easy-to-understand points designed specifically for your board exams. Simple explanations, focused content—everything you need to score full marks.

Computer Vision is a branch of Artificial Intelligence (AI) that uses sensing devices and deep learning techniques to enable machines to understand and interpret visual information.

What is Computer Vision?

  • It works like human vision: cameras, data, and algorithms play the role of the human retina, optic nerves, and visual cortex.
  • CV extracts meaningful information from images, videos, and other visual inputs.
  • Computer Vision systems are used to inspect products, monitor infrastructure, and analyze production processes in real-time.
  • Deep learning models in CV achieve high accuracy in tasks like facial recognition, object detection, and image classification.

Working Of Computer Vision

  • Computer Vision focuses on processing and analysing digital images and videos to understand their content.
  • Understanding digital images is a fundamental part of computer vision.

Basics of Digital Images

  • A digital image is a picture stored as a sequence of numbers that a computer can interpret.
  • Digital images can be created in several ways, such as:
    • Design software (e.g., Paint, Photoshop)
    • Digital cameras
    • Scanners

Interpretation of Image in Digital Form

  • A computer sees an image as a collection of tiny units called pixels (picture elements).
  • Each pixel represents a specific colour value. These pixels together form the digital image.
  • During digitization, an image is converted into a grid of pixels.
  • The resolution of an image is determined by the number of pixels:
    • More pixels → Higher resolution → More detail
  • In grayscale (monochrome) images: pixel values range from 0 to 255, where 0 = black, 255 = white, and values in between are shades of grey.
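As a quick illustration (NumPy is used here only for demonstration; it is not part of the syllabus), a grayscale digital image is simply a grid of numbers, one per pixel:

```python
import numpy as np

# A tiny 4x4 grayscale "image": each number is one pixel's intensity,
# 0 = black, 255 = white, values in between are shades of grey.
image = np.array([
    [  0,  64, 128, 255],
    [ 32,  96, 160, 224],
    [  0, 255,   0, 255],
    [128, 128, 128, 128],
], dtype=np.uint8)

print(image.shape)               # (4, 4) -> a 4x4 grid of pixels
print(image[0, 3])               # 255   -> the top-right pixel is white
print(image.min(), image.max())  # 0 255 -> full black-to-white range
```

A higher-resolution image is the same idea with a larger grid, which is why more pixels means more detail.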

Computer Vision – Process

The Computer Vision process involves five stages, which are explained below.

Image Acquisition

  • Image Acquisition is the first stage.
  • It involves capturing digital images or videos and provides the raw data for further analysis.
  • The quality of acquired images affects further processing and analysis.
  • Imaging device capabilities and resolution impact image quality.
  • Higher resolution captures more details and clearer images.
  • Image acquisition is affected by factors like:
    • Lighting conditions
    • Angle of capture
  • In scientific and medical fields, advanced techniques are used. For example: MRI and CT scans

Preprocessing

  • Preprocessing in Computer Vision improves the quality of the acquired image.
  • It uses various techniques to make images suitable for analysis.

Common Preprocessing Techniques

  • Noise Reduction
    • Removes unwanted elements like blurriness, spots, or distortions and makes the image clearer.
    • Example: Removing grainy effects in low-light images.
  • Image Normalization
    • Standardizes pixel values across images. Adjusts values to a fixed range (e.g., 0–1 or -1 to 1).
    • Ensures consistency in datasets.
    • Example: Converting pixel values from 0–255 to 0–1.
  • Resizing/Cropping
    • Changes image size or aspect ratio and ensures all images have the same dimensions.
    • Example: Resizing images to 224 × 224 pixels.
  • Histogram Equalization
    • Adjusts brightness and contrast.
    • Spreads pixel intensity evenly and enhances details in dark or bright areas.
    • Example: Improving low-contrast images.
  • Goal of Preprocessing
    • Remove noise or disturbances.
    • Highlight important features.
    • Maintain consistency and uniformity in images.
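Two of the preprocessing techniques above (normalization and resizing) can be sketched in a few lines of NumPy; this is only an illustration with a made-up 4×4 image, not the exact method any particular library uses:

```python
import numpy as np

# Hypothetical 4x4 grayscale image with pixel values in 0-255.
img = np.array([
    [ 10,  50,  90, 130],
    [ 20,  60, 100, 140],
    [ 30,  70, 110, 150],
    [ 40,  80, 120, 160],
], dtype=np.uint8)

# Image normalization: rescale pixel values from 0-255 to 0-1.
normalized = img.astype(np.float32) / 255.0
print(normalized.min(), normalized.max())  # values now lie in [0, 1]

# Resizing (crude 2x downsample): keep every second pixel, so the
# 4x4 image becomes 2x2. Real libraries interpolate instead.
resized = img[::2, ::2]
print(resized.shape)                       # (2, 2)
```

After steps like these, every image in a dataset has the same value range and dimensions, which is the consistency that preprocessing aims for.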

Feature Extraction

  • Feature Extraction involves identifying and extracting important visual patterns or attributes from a pre-processed image.
  • The methods used for feature extraction depend on:
    • Complexity of the image
    • Available computational resources
    • Requirements of the application
  • Types of Feature Extraction
    • Edge Detection
      • Identifies boundaries between regions.
      • Detects sharp changes in intensity.
    • Corner Detection
      • Identifies points where two or more edges meet.
      • Focuses on areas with high curvature or sharp changes.
    • Texture Analysis
      • Extracts patterns like smoothness, roughness, or repetition.
    • Colour-Based Feature Extraction
      • Analyses colour distribution in an image.
      • Helps differentiate objects based on colour.
  • In deep learning approaches, feature extraction is done automatically using Convolutional Neural Networks (CNNs) during training.
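The edge-detection idea above (detecting sharp changes in intensity) can be demonstrated with a toy NumPy example; this is a simplified sketch, not a full algorithm like Sobel or Canny:

```python
import numpy as np

# Hypothetical image: black left half, white right half.
img = np.zeros((4, 8), dtype=np.float32)
img[:, 4:] = 255.0

# Edge detection via intensity change: the difference between each
# pixel and its left neighbour is large only at the black/white border.
gradient = np.abs(np.diff(img, axis=1))
edges = gradient > 128          # True where intensity jumps sharply

print(np.argwhere(edges[0]))    # the single edge column in row 0
```

Here the only large intensity jump is at the boundary between the two halves, so that is where the edge is detected.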

Detection/Segmentation

  • Detection/Segmentation focuses on identifying objects or regions of interest in an image.
  • It is important for applications like:
    • Autonomous driving
    • Medical imaging
    • Object tracking
  • This stage is divided into:
    • Single Object Tasks
    • Multiple Object Tasks

Single Object Tasks

  • Focus on analysing or identifying a single object in an image with two main objectives:
  • Classification
    • Determines the category or class of an object, helping to identify what it is. It can use algorithms such as KNN (K-Nearest Neighbour, a supervised method) or K-means clustering (an unsupervised method).
  • Classification + Localization
    • Classifies the object and also locates it. It uses bounding boxes to show object position.
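A minimal sketch of how KNN classification works, written from scratch in NumPy with made-up 2-D feature vectors (real systems would use features extracted from images and a library implementation):

```python
import numpy as np

# Toy training data: each row is a feature vector (hypothetical 2-D
# features extracted from images), with a class label (0 or 1).
features = np.array([[1.0, 1.0], [1.2, 0.8], [8.0, 8.0], [7.5, 8.5]])
labels   = np.array([0, 0, 1, 1])

def knn_classify(x, k=3):
    """Classify x by majority vote among its k nearest training points."""
    distances = np.linalg.norm(features - x, axis=1)
    nearest = labels[np.argsort(distances)[:k]]
    # Majority vote over the k nearest labels.
    return np.bincount(nearest).argmax()

print(knn_classify(np.array([1.1, 0.9])))  # 0 -> near the class-0 samples
print(knn_classify(np.array([8.0, 8.2])))  # 1 -> near the class-1 samples
```

Because KNN compares a new sample against labelled training samples, it is a supervised method, unlike K-means clustering, which groups unlabelled data.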

Multiple Object Tasks

  • Multiple Object Tasks deal with images containing multiple objects or different classes.
  • Aim to identify and distinguish between various objects in an image.

Types of Multiple Object Tasks

  • Object Detection
    • Identifies and locates multiple objects in an image.
    • Draws bounding boxes around each object.
    • Assigns class labels to detected objects.
    • Uses features and learning algorithms for recognition.
    • Common algorithms: R-CNN, R-FCN, YOLO (You Only Look Once), SSD (Single Shot Detector)
  • Image Segmentation
    • Image Segmentation creates masks around pixels with similar characteristics.
    • It assigns a class to pixels in the input image.
    • Helps in understanding the image at a detailed (granular) level.
    • Each object is represented using a pixel-wise mask.
    • Makes it easier to identify and separate different objects.
    • Uses techniques like edge detection (detecting changes in brightness).
    • There are different types of image segmentation available.
  • Types of Image Segmentation
    • Semantic Segmentation
      • Classifies pixels of the same class together.
      • Does not differentiate between objects of the same class.
    • Instance Segmentation
      • Classifies pixels for each individual object.
      • Differentiates between objects even if they belong to the same class.

High-Level Processing

  • High-Level Processing is the fifth and final stage of computer vision.
  • It interprets and extracts meaningful information from detected objects or regions in images.
  • Helps computers achieve a deeper understanding of visual content.
  • Enables informed decision-making based on visual data.
  • Tasks include:
    • Object recognition
    • Scene understanding
    • Context analysis
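The segmentation idea described above can be sketched with simple thresholding in NumPy; this toy example only illustrates the concept of a pixel-wise mask, not a real segmentation algorithm:

```python
import numpy as np

# Hypothetical grayscale image: two bright objects on a dark background.
img = np.zeros((6, 6), dtype=np.uint8)
img[1:3, 1:3] = 200   # object 1
img[4:6, 4:6] = 220   # object 2

# Semantic-segmentation sketch: assign a class to every pixel by
# thresholding -- 1 = "object", 0 = "background". Both objects get the
# same class, so they are not distinguished from each other.
semantic_mask = (img > 128).astype(np.uint8)
print(semantic_mask.sum())   # 8 -> eight pixels labelled "object"

# Instance segmentation would additionally give each object its own
# id (e.g. 1 and 2), which requires connected-component labelling.
```

This shows the key difference in miniature: a semantic mask labels *what* each pixel is, while instance segmentation would also separate the two blobs into distinct objects.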

Applications of Computer Vision

  • Facial Recognition
    • Used by social media platforms to detect and tag users.
  • Healthcare
    • Detects diseases and abnormalities.
    • Helps in identifying cancerous tumours.
  • Self-Driving Vehicles
    • Understand surroundings using cameras.
    • Detect vehicles, objects, traffic signals, and pedestrians.
  • Optical Character Recognition (OCR)
    • Extracts text from images or documents.
    • Used in invoices, bills, and articles.
  • Machine Inspection
    • Detects defects and flaws in machines or products.
    • Helps in quality control during manufacturing.
  • 3D Model Building
    • Creates 3D models from real-world objects.
    • Used in robotics, autonomous driving, 3D tracking, and AR/VR.
  • Surveillance
    • Uses CCTV footage to monitor activities.
    • Identifies suspicious behaviour and prevents crimes.
  • Fingerprint Recognition and Biometrics
    • Identifies individuals using fingerprints or biometric data.
    • Used for authentication and security.

Challenges of Computer Vision

  • Reasoning and Analytical Issues
    • Requires accurate interpretation, not just image identification.
    • Needs strong reasoning and analytical capabilities.
    • Without these, extracting meaningful insights becomes difficult.
  • Difficulty in Image Acquisition
    • Affected by lighting variations, different perspectives, and scales.
    • Complex scenes with multiple objects increase difficulty.
    • Occlusion (objects blocking each other) adds complexity.
    • High-quality image data is necessary for accurate analysis.
  • Privacy and Security Concerns
    • Surveillance systems may violate privacy rights.
    • Facial recognition raises ethical issues.
    • Requires careful handling and regulations.
  • Duplicate and False Content
    • Risk of fake or misleading images/videos.
    • Exploitation of system vulnerabilities by malicious users.
    • Data breaches can spread duplicate or false content.
    • Leads to misinformation and reputational damage.

Future of Computer Vision

  • Computer Vision has evolved from basic image processing to advanced systems with human-like precision.
  • Growth is driven by:
    • Advances in deep learning algorithms
    • Availability of large amounts of labelled data
  • Machines can now perceive and analyse images and videos effectively.
  • Future possibilities of computer vision are vast and impactful.
  • It will have a significant and far-reaching impact on society.
  • Computer vision can bring transformative benefits for humanity.
