# Feature Specification: ResNet Phenology Classifier

**Feature Branch**: `1-phenology-classifier`

**Created**: 2025-11-04

**Status**: Draft

**Input**: User description: "Build a ResNet model capable of classifying images of a plant by phenological phase. The images are provided in datasets and labeled according to their phase."

## User Scenarios & Testing *(mandatory)*

### User Story 1 - Train ResNet Model (Priority: P1)

As a researcher, I want to train a ResNet model on a dataset of labeled plant images to learn phenological phase classification.

**Why this priority**: This is the core functionality required to build the classifier.

**Independent Test**: Can be fully tested by training on a subset of the dataset and validating that the model learns to classify phases accurately.

**Acceptance Scenarios**:

1. **Given** a dataset of plant images labeled by phenological phase, **When** the model is trained, **Then** it achieves at least 90% accuracy on a held-out test set.
2. **Given** training data, **When** training completes, **Then** the model can be saved and loaded for inference.

---

### User Story 2 - Evaluate Model Performance (Priority: P2)

As a researcher, I want to evaluate the trained model's performance on unseen data to ensure reliability.

**Why this priority**: Evaluation is essential to validate the model's effectiveness before deployment.

**Independent Test**: Can be tested by running evaluation on test data and checking metrics like accuracy, precision, and recall.

**Acceptance Scenarios**:

1. **Given** a trained model and test dataset, **When** evaluation is run, **Then** detailed metrics are provided including accuracy, recall, macro-f1, and confusion matrix.
2. **Given** evaluation results, **When** performance is below the required threshold (see SC-001), **Then** the model is flagged for retraining.

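
The evaluation scenarios above can be sketched in plain Python. This is a minimal illustration, not the project's implementation: the function names, the phase names in any example data, and the use of the accuracy target from SC-001 as the retraining threshold are all assumptions.

```python
def confusion_matrix(y_true, y_pred, labels):
    """Rows are true phases, columns are predicted phases."""
    index = {label: i for i, label in enumerate(labels)}
    matrix = [[0] * len(labels) for _ in labels]
    for true_label, predicted_label in zip(y_true, y_pred):
        matrix[index[true_label]][index[predicted_label]] += 1
    return matrix


def evaluate(y_true, y_pred, labels, accuracy_threshold=0.90):
    """Compute accuracy, per-class recall, macro-F1 and a retraining flag.

    accuracy_threshold is an assumed reading of SC-001, not a spec value
    for this function.
    """
    cm = confusion_matrix(y_true, y_pred, labels)
    n = len(labels)
    total = sum(sum(row) for row in cm)
    correct = sum(cm[i][i] for i in range(n))

    recalls, f1_scores = [], []
    for i in range(n):
        tp = cm[i][i]
        fn = sum(cm[i]) - tp                        # true class i, predicted otherwise
        fp = sum(cm[r][i] for r in range(n)) - tp   # predicted i, true class otherwise
        recall = tp / (tp + fn) if tp + fn else 0.0
        precision = tp / (tp + fp) if tp + fp else 0.0
        denom = precision + recall
        f1_scores.append(2 * precision * recall / denom if denom else 0.0)
        recalls.append(recall)

    accuracy = correct / total if total else 0.0
    return {
        "accuracy": accuracy,
        "recall": dict(zip(labels, recalls)),
        "macro_f1": sum(f1_scores) / n,
        "confusion_matrix": cm,
        "needs_retraining": accuracy < accuracy_threshold,
    }
```
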
---

### User Story 3 - Classify New Images (Priority: P3)

As a user, I want to use the trained model to classify new plant images by phenological phase.

**Why this priority**: This enables practical use of the model for monitoring or analysis.

**Independent Test**: Can be tested by providing a new image and verifying the predicted phase matches expectations.

**Acceptance Scenarios**:

1. **Given** a new plant image, **When** classification is requested, **Then** the model returns the predicted phenological phase with visual output.
2. **Given** an image, **When** classified, **Then** response time is under 1 second.

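
One way to check the response-time scenario is to wrap the classification call in a timer. The sketch below assumes a hypothetical `classify` callable that takes an image and returns a phase name; the spec does not define such an interface.

```python
import time


def timed_classification(classify, image, budget_seconds=1.0):
    """Run a classifier and report whether it met the latency budget.

    `classify` is a hypothetical callable (image -> phase name).
    Returns (phase, elapsed_seconds, within_budget).
    """
    start = time.perf_counter()
    phase = classify(image)
    elapsed = time.perf_counter() - start
    return phase, elapsed, elapsed < budget_seconds
```
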
---

## Clarifications

### Session 2025-11-04

- Q: How should the dataset be split for training, validation, and testing? → A: 70/15/15
- Q: What evaluation metrics must be included? → A: Accuracy, recall, macro-f1, confusion matrix; other metrics accepted
- Q: How should classification results be returned? → A: Visually

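
The 70/15/15 split agreed above could be implemented along these lines. The sketch splits each phase separately (a stratified split), which is an assumption added here to keep phase proportions similar across the three subsets; the spec only fixes the ratios. The function name and the `(image_path, phase)` sample format are also hypothetical.

```python
import random
from collections import defaultdict


def stratified_split(samples, train_pct=70, val_pct=15, seed=0):
    """Split (image_path, phase) pairs into train/val/test per phase.

    Integer percentages avoid floating-point rounding surprises; whatever
    remains after train and validation goes to the test subset.
    """
    by_phase = defaultdict(list)
    for path, phase in samples:
        by_phase[phase].append((path, phase))

    rng = random.Random(seed)  # fixed seed keeps the split reproducible
    train, val, test = [], [], []
    for phase in sorted(by_phase):
        group = by_phase[phase]
        rng.shuffle(group)
        n_train = len(group) * train_pct // 100
        n_val = len(group) * val_pct // 100
        train += group[:n_train]
        val += group[n_train:n_train + n_val]
        test += group[n_train + n_val:]
    return train, val, test
```

Because the counts are rounded down per phase, very small phases may contribute slightly more than 15% of their images to the test subset.
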
### Edge Cases

- What happens when an image is of poor quality or not a plant?
- How does the system handle images with multiple plants or unclear phases?
- What if the dataset has imbalanced classes for certain phases?

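
The spec leaves these edge cases open. For the class-imbalance question, one common option (an assumption here, not a decision recorded in this spec) is to weight each phase inversely to its frequency when computing the training loss. A minimal sketch with a hypothetical function name:

```python
from collections import Counter


def inverse_frequency_weights(phase_labels):
    """Return a weight per phase: total / (number_of_phases * phase_count).

    Rare phases get weights above 1, frequent phases below 1, and a
    perfectly balanced dataset gives every phase a weight of 1.
    """
    counts = Counter(phase_labels)
    total = len(phase_labels)
    n_phases = len(counts)
    return {phase: total / (n_phases * count) for phase, count in counts.items()}
```

For example, with 80 images of one phase and 20 of another, the weights come out to 0.625 and 2.5 respectively.
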
## Requirements *(mandatory)*

### Functional Requirements

- **FR-001**: System MUST load and preprocess labeled plant image datasets.
- **FR-002**: System MUST train a ResNet model on the dataset to classify the phenological phases specified in the dataset's `.csv` labels file.
- **FR-003**: System MUST evaluate the model using metrics including accuracy, recall, macro-f1, and confusion matrix. Other metrics are accepted.
- **FR-004**: System MUST provide inference capability to classify new images.
- **FR-005**: System MUST save and load trained models for reuse.
- **FR-006**: System MUST split the dataset located at `C:\Users\sof12\Desktop\ML\Datasets\Nocciola\GBIF` into 70% training, 15% validation, 15% testing.
- **FR-007**: System MUST provide visual output of classification results.

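
As an illustration of FR-001 and FR-002, the labels file could be read with the standard `csv` module. The spec does not describe the file's layout, so the column names `filename` and `phase` below are purely hypothetical, as is the function name.

```python
import csv


def load_labels(csv_file, image_column="filename", phase_column="phase"):
    """Read (image name, phase) pairs from an open CSV file object.

    Column names are assumptions; adjust them to the real labels file.
    Returns the samples and the sorted list of distinct phases, which
    fixes the class order for the classifier's output layer.
    """
    reader = csv.DictReader(csv_file)
    samples = [(row[image_column], row[phase_column]) for row in reader]
    phases = sorted({phase for _, phase in samples})
    return samples, phases
```

With a real file, the function would be called as `load_labels(open(path, newline=""))`, ideally inside a `with` block so the file is closed afterwards.
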
### Key Entities *(include if feature involves data)*

- **Dataset**: Located at `C:\Users\sof12\Desktop\ML\Datasets\Nocciola\GBIF`, containing plant images and labels.
- **Plant Image**: Represents an image of a plant, with attributes like image data and associated phenological phase label.
- **Phenological Phase**: Represents a growth stage of the plant, with attributes like phase name and description.

## Success Criteria *(mandatory)*

### Measurable Outcomes

- **SC-001**: Model achieves >90% accuracy on a held-out test set.
- **SC-002**: Training process completes within 1 hour for a standard dataset size.
- **SC-003**: Inference on a single image takes less than 1 second.
- **SC-004**: System handles datasets with up to 10,000 images without performance degradation.