Task #181: Generic Web Scraping Engine for Vidyarti - VIDYARTI - sita softwares

open

Status: In Progress
Priority: High
Assignee:
Start date: 03/17/2026
Due date:
% Done:
Estimated time:

Description

Develop a centralized web scraping module that fetches data from multiple external sources and maps it into the Vidyarti modules it targets, such as current affairs, syllabus, mock tests, and study material.

The system should be configurable, reusable, and scalable.
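One way to read "configurable and reusable" is that every source carries its own parsing rules and a single generic routine applies them. The sketch below is only an illustration of that idea using the Python standard library. The rule keys `item_tag` and `item_class`, the class `RuleBasedExtractor`, and the function `extract` are hypothetical names; the task does not define the format of the rules.

```python
import json
from html.parser import HTMLParser


class RuleBasedExtractor(HTMLParser):
    """Collects the text of every element matching a tag/class rule."""

    def __init__(self, tag, css_class=None):
        super().__init__()
        self.tag = tag
        self.css_class = css_class
        self.items = []
        self._depth = 0  # > 0 while inside a matching element
        self._buffer = []

    def handle_starttag(self, tag, attrs):
        if self._depth:
            # Track nested elements with the same tag so the end tag pairs up.
            if tag == self.tag:
                self._depth += 1
            return
        if tag != self.tag:
            return
        classes = (dict(attrs).get("class") or "").split()
        if self.css_class is None or self.css_class in classes:
            self._depth = 1
            self._buffer = []

    def handle_endtag(self, tag):
        if self._depth and tag == self.tag:
            self._depth -= 1
            if self._depth == 0:
                self.items.append("".join(self._buffer).strip())

    def handle_data(self, data):
        if self._depth:
            self._buffer.append(data)


def extract(html, module_type, parsing_rules_json):
    """Apply one source's JSON rules to a page; return records for a module."""
    rules = json.loads(parsing_rules_json)  # hypothetical rule format
    parser = RuleBasedExtractor(rules["item_tag"], rules.get("item_class"))
    parser.feed(html)
    parser.close()
    return [{"module_type": module_type, "title": text} for text in parser.items]


if __name__ == "__main__":
    # Made-up sample page and rules, for illustration only.
    page = '<h2 class="headline">First item</h2><h2>Advert</h2>'
    print(extract(page, "current_affairs", '{"item_tag": "h2", "item_class": "headline"}'))
```

Adding a new source would then mean adding a row of rules rather than new code, which is the property the description asks for. A real implementation would likely need richer selectors than a single tag and class.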

Tables

vid_scraping_source_master

Column         Type                                                           Description
id             INT (PK)                                                       Source ID
source_name    VARCHAR(150)                                                   Website name
base_url       VARCHAR(255)                                                   Website URL
module_type    ENUM('current_affairs','syllabus','mock_test','study_material')  Target module
parsing_rules  TEXT                                                           JSON rules for scraping
status         BOOLEAN                                                        Active/Inactive
created_at     DATETIME                                                       Created date
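The column list above could be prototyped as follows. This is a minimal sketch that uses SQLite through Python's `sqlite3` module purely for illustration; the task does not name the target database. SQLite has no ENUM type, so a CHECK constraint stands in for it. The defaults on `status` and `created_at`, and the helper names `add_source` and `active_sources`, are assumptions and not part of the task.

```python
import json
import sqlite3

# Columns follow the task's table; the CHECK emulates the ENUM, and the
# DEFAULT clauses are assumptions added for this sketch.
DDL = """
CREATE TABLE vid_scraping_source_master (
    id            INTEGER PRIMARY KEY,
    source_name   VARCHAR(150),
    base_url      VARCHAR(255),
    module_type   TEXT CHECK (module_type IN
                  ('current_affairs', 'syllabus', 'mock_test', 'study_material')),
    parsing_rules TEXT,
    status        BOOLEAN DEFAULT 1,
    created_at    DATETIME DEFAULT CURRENT_TIMESTAMP
)
"""


def add_source(conn, source_name, base_url, module_type, parsing_rules):
    """Insert one source; parsing_rules is a dict stored as JSON text."""
    cur = conn.execute(
        "INSERT INTO vid_scraping_source_master "
        "(source_name, base_url, module_type, parsing_rules) VALUES (?, ?, ?, ?)",
        (source_name, base_url, module_type, json.dumps(parsing_rules)),
    )
    return cur.lastrowid


def active_sources(conn, module_type):
    """Return (id, source_name, base_url, rules) for active sources of a module."""
    rows = conn.execute(
        "SELECT id, source_name, base_url, parsing_rules "
        "FROM vid_scraping_source_master "
        "WHERE status = 1 AND module_type = ?",
        (module_type,),
    ).fetchall()
    return [(r[0], r[1], r[2], json.loads(r[3]) if r[3] else {}) for r in rows]


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    conn.execute(DDL)
    # Placeholder source, for illustration only.
    add_source(conn, "Example", "https://example.com", "syllabus", {"item_tag": "h2"})
    print(active_sources(conn, "syllabus"))
```

Storing `parsing_rules` as JSON text keeps the schema stable while rules evolve, at the cost of validating the JSON in application code rather than in the database.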

vid_scraped_data_staging

vid_scraping_logs

Validations

Backend

Frontend

Required fields:
