Phenology/Code/Supervised_learning/resnet/specs/1-phenology-classifier/tasks.md
2025-11-06 14:16:49 +01:00

Tasks: ResNet Phenology Classifier

Input: Design documents from /specs/1-phenology-classifier/

Prerequisites: plan.md (required), spec.md (required for user stories), research.md, data-model.md, contracts/

Tests: Included as per constitution testing standards.

Organization: Tasks are grouped by user story to enable independent implementation and testing of each story.

Format: [ID] [P?] [Story] Description

  • [P]: Can run in parallel (different files, no dependencies)
  • [Story]: Which user story this task belongs to (e.g., US1, US2, US3)
  • Include exact file paths in descriptions
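
The task-line format above can be checked mechanically. The following is a minimal sketch of a parser, assuming three-digit task IDs and story tags of the form US<number> as used in this file; the names TASK_RE, Task, and parse_task are hypothetical and not part of the project.

```python
import re
from typing import NamedTuple, Optional

# Pattern for "[ID] [P?] [Story] Description", e.g. "T011 [P] [US1] Implement ...".
TASK_RE = re.compile(
    r"^(?P<id>T\d{3})"               # task ID such as T011
    r"(?:\s+\[(?P<par>P)\])?"        # optional parallel marker [P]
    r"(?:\s+\[(?P<story>US\d+)\])?"  # optional story tag such as [US1]
    r"\s+(?P<desc>.+)$"              # the rest is the description
)


class Task(NamedTuple):
    id: str
    parallel: bool
    story: Optional[str]
    description: str


def parse_task(line: str) -> Task:
    """Parse one task line; a leading bullet and whitespace are ignored."""
    m = TASK_RE.match(line.strip().lstrip("•").strip())
    if m is None:
        raise ValueError(f"not a task line: {line!r}")
    return Task(m["id"], m["par"] is not None, m["story"], m["desc"])
```

Calling parse_task on a setup task such as "T001 Create project directory structure per plan.md" yields a Task with parallel set to False and story set to None, since both markers are optional.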

Path Conventions

  • Single project: src/, tests/ at repository root

Dependencies

  • US1 (Train Model) must complete before US2 (Evaluate) and US3 (Classify)
  • US2 and US3 can run in parallel after US1

Parallel Execution Examples

  • Setup tasks: T001-T005 can run in parallel
  • US1 tasks: T011-T014 can run in parallel, except that training depends on data loading (T006); T015 integrates them and runs last
  • US2 and US3: Can run in parallel after US1 completion
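
The story-level ordering described above can be expressed as a small dependency graph. This is an illustrative sketch using the standard-library graphlib module (Python 3.9+); the story_deps mapping and the parallel_batches helper are hypothetical names introduced here.

```python
from graphlib import TopologicalSorter

# Each key depends on the stories in its value set (US1 precedes US2 and US3).
story_deps = {"US1": set(), "US2": {"US1"}, "US3": {"US1"}}


def parallel_batches(deps):
    """Return lists of stories that can run concurrently, in dependency order."""
    sorter = TopologicalSorter(deps)
    sorter.prepare()
    batches = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())  # everything whose prerequisites are done
        batches.append(ready)
        sorter.done(*ready)
    return batches


print(parallel_batches(story_deps))  # [['US1'], ['US2', 'US3']]
```

The same helper would work at task granularity if the mapping listed task IDs instead of stories.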

Implementation Strategy

  • MVP: Complete US1 for basic training capability
  • Incremental: Add US2 for evaluation, then US3 for inference
  • Each user story delivers independently testable value

Phase 1: Setup (Shared Infrastructure)

Purpose: Project initialization and basic structure

  • T001 Create project directory structure per plan.md
  • T002 Create requirements.txt with PyTorch, torchvision, pandas, scikit-learn
  • T003 Create data/ directory and subdirectories for images and labels
  • T004 Create models/ directory for saved models
  • T005 Create src/__init__.py and basic module structure
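
As a rough sketch, the setup tasks could be scripted as below. The subdirectory names data/images and data/labels are assumptions (the tasks only say "subdirectories for images and labels"), the requirements are left unpinned, and scaffold is a hypothetical helper name.

```python
from pathlib import Path

# Directory names follow the task list; data/images and data/labels are
# assumed names for the "images and labels" subdirectories.
DIRECTORIES = ["src", "tests", "data/images", "data/labels", "models"]

# Package names as published on PyPI ("torch" is the PyTorch package).
REQUIREMENTS = ["torch", "torchvision", "pandas", "scikit-learn"]


def scaffold(root: Path) -> None:
    """Create the project skeleton under root; safe to run repeatedly."""
    for name in DIRECTORIES:
        (root / name).mkdir(parents=True, exist_ok=True)
    (root / "src" / "__init__.py").touch()
    req = root / "requirements.txt"
    if not req.exists():  # do not overwrite a hand-edited file
        req.write_text("\n".join(REQUIREMENTS) + "\n")
```

Running scaffold(Path(".")) from the repository root would create the layout in place; plan.md remains the authority on the actual structure.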

Phase 2: Foundational (Blocking Prerequisites)

Purpose: Core components needed by all user stories

  • T006 [P] Implement data loader in src/data_loader.py for CSV labels and image loading
  • T007 [P] Define ResNet50 model in src/model.py with classification head
  • T008 [P] Create src/utils.py with preprocessing and helper functions
  • T009 [P] Set up test framework with pytest configuration
  • T010 [P] Create unit tests for data loader in tests/test_data_loader.py

Phase 3: US1 - Train ResNet Model

Purpose: Enable model training on labeled datasets

Independent Test: Train on subset and verify model learns (accuracy improves)

  • T011 [P] [US1] Implement training script in src/train.py with data loading and ResNet training loop
  • T012 [P] [US1] Add model saving functionality to src/train.py
  • T013 [P] [US1] Implement data augmentation in src/utils.py for training
  • T014 [P] [US1] Create unit tests for training components in tests/test_train.py
  • T015 [US1] Integrate training pipeline and test end-to-end training

Phase 4: US2 - Evaluate Model Performance

Purpose: Provide evaluation metrics for trained models

Independent Test: Run evaluation on test set and verify metrics output

  • T016 [P] [US2] Implement evaluation script in src/evaluate.py with accuracy and F1-score calculation
  • T017 [P] [US2] Add per-class metrics and confusion matrix in src/evaluate.py
  • T018 [P] [US2] Create unit tests for evaluation in tests/test_evaluate.py
  • T019 [US2] Integrate evaluation and test on trained model
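
The metrics named in T016 and T017 reduce to simple counting over a confusion matrix. The sketch below is pure Python to show the arithmetic; scikit-learn, which is listed in requirements.txt, provides equivalent functions and would likely be the practical choice. Function names are hypothetical, and class labels are assumed to be integer indices starting at 0.

```python
def confusion_matrix(y_true, y_pred, num_classes):
    """matrix[t][p] counts samples with true class t predicted as class p."""
    matrix = [[0] * num_classes for _ in range(num_classes)]
    for t, p in zip(y_true, y_pred):
        matrix[t][p] += 1
    return matrix


def accuracy(matrix):
    """Fraction of samples on the diagonal; 0.0 for an empty matrix."""
    total = sum(sum(row) for row in matrix)
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    return correct / total if total else 0.0


def per_class_f1(matrix):
    """F1 = 2*TP / (2*TP + FP + FN) per class.

    Returns 0.0 for a class that appears in neither labels nor predictions.
    """
    n = len(matrix)
    scores = []
    for c in range(n):
        tp = matrix[c][c]
        fp = sum(matrix[r][c] for r in range(n)) - tp  # column total minus TP
        fn = sum(matrix[c]) - tp                       # row total minus TP
        denom = 2 * tp + fp + fn
        scores.append(2 * tp / denom if denom else 0.0)
    return scores
```

For example, with true labels [0, 0, 1, 1, 2, 2] and predictions [0, 1, 1, 1, 2, 0], the matrix is [[1, 1, 0], [0, 2, 0], [1, 0, 1]], accuracy is 4/6, and the per-class F1 scores are 0.5, 0.8, and 2/3. A macro-averaged F1 is the plain mean of the per-class scores.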

Phase 5: US3 - Classify New Images

Purpose: Enable inference on new plant images

Independent Test: Classify sample image and verify output format

  • T020 [P] [US3] Implement inference script in src/inference.py for single image classification
  • T021 [P] [US3] Create API endpoint in src/api.py using FastAPI for /classify POST
  • T022 [P] [US3] Add input validation and error handling in src/api.py
  • T023 [P] [US3] Create unit tests for inference in tests/test_inference.py
  • T024 [US3] Integrate API and test classification endpoint
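
Input validation for the /classify endpoint (T022) can be kept independent of the web framework. The following sketch checks uploaded bytes against the JPEG and PNG file signatures. The accepted formats, the 10 MiB limit, and the names MAX_BYTES, InvalidImage, and validate_image_bytes are all assumptions made for illustration; contracts/ should define the real limits.

```python
MAX_BYTES = 10 * 1024 * 1024  # assumed upload limit, not taken from the spec

# Leading "magic" bytes that identify each accepted image format.
SIGNATURES = {
    "jpeg": b"\xff\xd8\xff",
    "png": b"\x89PNG\r\n\x1a\n",
}


class InvalidImage(ValueError):
    """Raised when an upload is rejected; an API layer can map it to HTTP 400."""


def validate_image_bytes(data: bytes) -> str:
    """Return the detected format name, or raise InvalidImage."""
    if not data:
        raise InvalidImage("empty upload")
    if len(data) > MAX_BYTES:
        raise InvalidImage(f"file exceeds {MAX_BYTES} bytes")
    for name, magic in SIGNATURES.items():
        if data.startswith(magic):
            return name
    raise InvalidImage("unsupported format; expected JPEG or PNG")
```

A FastAPI handler could call this on the uploaded file's bytes and translate InvalidImage into a client-error response. A signature check only identifies the container format, so decoding the image can still fail and needs its own error handling.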

Final Phase: Polish & Cross-Cutting Concerns

Purpose: Quality assurance and production readiness

  • T025 Add logging and monitoring to all scripts
  • T026 Implement CI/CD pipeline with linting and testing
  • T027 Add comprehensive documentation and README updates
  • T028 Optimize performance and memory usage
  • T029 Final integration testing and validation