Feature Specification: ResNet Phenology Classifier
Feature Branch: 1-phenology-classifier
Created: 2025-11-04
Status: Draft
Input: User description: "Build a ResNet model capable of classifying plant images by phenological phase. The images are provided in datasets and labeled according to their phase."
User Scenarios & Testing (mandatory)
User Story 1 - Train ResNet Model (Priority: P1)
As a researcher, I want to train a ResNet model on a dataset of labeled plant images to learn phenological phase classification.
Why this priority: This is the core functionality required to build the classifier.
Independent Test: Can be fully tested by training on a subset of the dataset and validating that the model learns to classify phases accurately.
Acceptance Scenarios:
- Given a dataset of plant images labeled by phenological phase, When the model is trained, Then it achieves at least 90% accuracy on a held-out test set.
- Given training data, When training completes, Then the model can be saved and loaded for inference.
User Story 2 - Evaluate Model Performance (Priority: P2)
As a researcher, I want to evaluate the trained model's performance on unseen data to ensure reliability.
Why this priority: Evaluation is essential to validate the model's effectiveness before deployment.
Independent Test: Can be tested by running evaluation on the test data and checking that the required metrics (accuracy, recall, macro-F1, and confusion matrix) are reported.
Acceptance Scenarios:
- Given a trained model and a test dataset, When evaluation is run, Then detailed metrics are provided, including accuracy, recall, macro-F1, and a confusion matrix.
- Given evaluation results, When performance is below the required threshold (90% accuracy, per SC-001), Then the model is flagged for retraining.
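The evaluation scenarios above can be illustrated with a minimal, standard-library-only sketch. It computes accuracy, per-class recall, macro-F1, and a confusion matrix from lists of true and predicted labels, then sets a retraining flag. The function names, the phase names in the example, and the choice of accuracy as the flagged metric are assumptions made for illustration; the specification does not prescribe an implementation.

```python
def confusion_matrix(y_true, y_pred, labels):
    """Build a matrix where rows are true phases and columns are predicted phases."""
    index = {label: i for i, label in enumerate(labels)}
    matrix = [[0] * len(labels) for _ in labels]
    for true_label, predicted_label in zip(y_true, y_pred):
        matrix[index[true_label]][index[predicted_label]] += 1
    return matrix


def evaluate(y_true, y_pred, labels, accuracy_threshold=0.90):
    """Return accuracy, per-class recall, macro-F1, confusion matrix, and a retraining flag."""
    matrix = confusion_matrix(y_true, y_pred, labels)
    total = sum(sum(row) for row in matrix)
    correct = sum(matrix[i][i] for i in range(len(labels)))

    recalls = {}
    f1_scores = []
    for i, label in enumerate(labels):
        tp = matrix[i][i]
        fn = sum(matrix[i]) - tp                    # true phase i, predicted as another phase
        fp = sum(row[i] for row in matrix) - tp     # another phase, predicted as phase i
        recall = tp / (tp + fn) if tp + fn else 0.0
        precision = tp / (tp + fp) if tp + fp else 0.0
        denom = precision + recall
        f1_scores.append(2 * precision * recall / denom if denom else 0.0)
        recalls[label] = recall

    accuracy = correct / total if total else 0.0
    return {
        "accuracy": accuracy,
        "recall": recalls,
        "macro_f1": sum(f1_scores) / len(f1_scores) if f1_scores else 0.0,
        "confusion_matrix": matrix,
        # Assumption: the "threshold" in the scenario is the 90% accuracy target.
        "needs_retraining": accuracy < accuracy_threshold,
    }


# Hypothetical phase names; the real ones come from the dataset's labels file.
phases = ["budding", "flowering", "fruiting"]
truth = ["budding", "budding", "flowering", "fruiting", "fruiting"]
preds = ["budding", "flowering", "flowering", "fruiting", "fruiting"]
print(evaluate(truth, preds, phases))
```

Macro-F1 averages the per-class F1 scores with equal weight, so a rarely seen phase counts as much as a common one.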
User Story 3 - Classify New Images (Priority: P3)
As a user, I want to use the trained model to classify new plant images by phenological phase.
Why this priority: This enables practical use of the model for monitoring or analysis.
Independent Test: Can be tested by providing a new image and verifying the predicted phase matches expectations.
Acceptance Scenarios:
- Given a new plant image, When classification is requested, Then the model returns the predicted phenological phase with visual output.
- Given an image, When classified, Then response time is under 1 second.
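One way to check the response-time scenario is to time the inference call directly. The sketch below is only an illustration: `measure_latency` and `dummy_predict` are hypothetical names, and the dummy function stands in for whatever inference entry point the trained model eventually exposes.

```python
import time


def measure_latency(predict, image, repeats=5):
    """Return the slowest wall-clock time in seconds over several calls to predict."""
    timings = []
    for _ in range(repeats):
        start = time.perf_counter()
        predict(image)
        timings.append(time.perf_counter() - start)
    return max(timings)


def dummy_predict(image):
    # Hypothetical stand-in for the trained model's inference call.
    return "flowering"


worst = measure_latency(dummy_predict, image=None)
print(f"worst-case latency: {worst:.6f} s, within 1 s budget: {worst < 1.0}")
```

Reporting the slowest of several runs is a deliberately conservative choice; a mean or percentile would also be reasonable if the specification is later refined.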
Clarifications
Session 2025-11-04
- Q: How should the dataset be split for training, validation, and testing? → A: 70/15/15
- Q: What evaluation metrics must be included? → A: Accuracy, recall, macro-F1, and confusion matrix; additional metrics are accepted
- Q: How should classification results be returned? → A: Visually
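The 70/15/15 split agreed above could be sketched as follows, using only the standard library. The function name, the fixed seed, and the example file names are assumptions for illustration. A plain shuffled split is shown; a per-class (stratified) split may be preferable if phases are imbalanced.

```python
import random


def split_dataset(items, train=0.70, val=0.15, seed=42):
    """Shuffle items reproducibly and split them into train/validation/test lists."""
    items = list(items)
    random.Random(seed).shuffle(items)      # fixed seed keeps the split repeatable
    n_train = round(len(items) * train)
    n_val = round(len(items) * val)
    train_items = items[:n_train]
    val_items = items[n_train:n_train + n_val]
    test_items = items[n_train + n_val:]    # the remainder (about 15%) is the test set
    return train_items, val_items, test_items


# Hypothetical file names standing in for the real image paths.
images = [f"img_{i:03d}.jpg" for i in range(100)]
train_set, val_set, test_set = split_dataset(images)
print(len(train_set), len(val_set), len(test_set))
```

Splitting once with a fixed seed and reusing the result keeps the held-out test set truly unseen across training runs.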
Edge Cases
- What happens when an image is of poor quality or not a plant?
- How does the system handle images with multiple plants or unclear phases?
- What if the dataset has imbalanced classes for certain phases?
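For the class-imbalance case, one possible mitigation (not something this specification has decided) is to weight each phase inversely to its frequency when computing the training loss. The sketch below shows that heuristic; `class_weights` and the phase names are hypothetical.

```python
from collections import Counter


def class_weights(labels):
    """Inverse-frequency weights: total / (number_of_classes * class_count)."""
    counts = Counter(labels)
    total = len(labels)
    n_classes = len(counts)
    return {label: total / (n_classes * count) for label, count in counts.items()}


# Hypothetical, deliberately imbalanced label list.
labels = ["flowering"] * 8 + ["fruiting"] * 2
print(class_weights(labels))
```

With this formula a perfectly balanced dataset gives every phase a weight of 1.0, and rarer phases receive proportionally larger weights.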
Requirements (mandatory)
Functional Requirements
- FR-001: System MUST load and preprocess labeled plant image datasets.
- FR-002: System MUST train a ResNet model on the dataset to classify phenological phases specified in the dataset's .csv labels file.
- FR-003: System MUST evaluate the model using metrics including accuracy, recall, macro-F1, and confusion matrix. Additional metrics are accepted.
- FR-004: System MUST provide inference capability to classify new images.
- FR-005: System MUST save and load trained models for reuse.
- FR-006: System MUST split the dataset located at C:\Users\sof12\Desktop\ML\Datasets\Nocciola\GBIF into 70% training, 15% validation, 15% testing.
- FR-007: System MUST provide visual output of classification results.
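As an illustration of FR-001 and FR-002, the sketch below reads a labels file with the standard `csv` module and derives a phase-to-index mapping that a classifier's output layer could use. The column names `image` and `phase` are assumptions, since the specification does not describe the layout of the .csv labels file; an in-memory file is used so the example runs without the real dataset.

```python
import csv
import io


def load_labels(csv_file, image_column="image", phase_column="phase"):
    """Read (image, phase) pairs and build a stable phase -> class index mapping."""
    reader = csv.DictReader(csv_file)
    samples = [(row[image_column], row[phase_column]) for row in reader]
    phases = sorted({phase for _, phase in samples})   # sorted for a reproducible order
    phase_to_index = {phase: i for i, phase in enumerate(phases)}
    return samples, phase_to_index


# Hypothetical file contents; column names and phases are placeholders.
example = io.StringIO(
    "image,phase\n"
    "img_001.jpg,flowering\n"
    "img_002.jpg,budding\n"
    "img_003.jpg,flowering\n"
)
samples, phase_to_index = load_labels(example)
print(phase_to_index)  # {'budding': 0, 'flowering': 1}
```

Deriving the class list from the labels file, rather than hard-coding it, keeps the number of output classes in step with whatever phases the dataset defines.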
Key Entities (include if feature involves data)
- Dataset: Located at C:\Users\sof12\Desktop\ML\Datasets\Nocciola\GBIF, containing plant images and labels.
- Plant Image: Represents an image of a plant, with attributes like image data and associated phenological phase label.
- Phenological Phase: Represents a growth stage of the plant, with attributes like phase name and description.
Success Criteria (mandatory)
Measurable Outcomes
- SC-001: Model achieves at least 90% accuracy on a held-out test set.
- SC-002: Training process completes within 1 hour for a standard dataset size.
- SC-003: Inference on a single image takes less than 1 second.
- SC-004: System handles datasets with up to 10,000 images without performance degradation.