Phenology/Code/Supervised_learning/resnet/specs/1-phenology-classifier/spec.md

# Feature Specification: ResNet Phenology Classifier

**Feature Branch**: `1-phenology-classifier`
**Created**: 2025-11-04
**Status**: Draft
**Input**: User description: "Build a ResNet model capable of classifying images of a plant by phenological phase. The images are provided in datasets and labeled according to their phase."

## User Scenarios & Testing *(mandatory)*

### User Story 1 - Train ResNet Model (Priority: P1)

As a researcher, I want to train a ResNet model on a dataset of labeled plant images to learn phenological phase classification.

**Why this priority**: This is the core functionality required to build the classifier.

**Independent Test**: Can be fully tested by training on a subset of the dataset and validating that the model learns to classify phases accurately.

**Acceptance Scenarios**:

1. **Given** a dataset of plant images labeled by phenological phase, **When** the model is trained, **Then** it achieves at least 90% accuracy on a held-out test set.
2. **Given** training data, **When** training completes, **Then** the model can be saved and loaded for inference.

---

### User Story 2 - Evaluate Model Performance (Priority: P2)

As a researcher, I want to evaluate the trained model's performance on unseen data to ensure reliability.

**Why this priority**: Evaluation is essential to validate the model's effectiveness before deployment.

**Independent Test**: Can be tested by running evaluation on test data and confirming that the report includes accuracy, recall, macro-F1, and a confusion matrix.

**Acceptance Scenarios**:

1. **Given** a trained model and a test dataset, **When** evaluation is run, **Then** the system reports accuracy, recall, macro-F1, and a confusion matrix.
2. **Given** evaluation results, **When** test accuracy is below the 90% threshold (SC-001), **Then** the model is flagged for retraining.
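
The two scenarios above could be checked with a small, dependency-free helper. This is a minimal sketch, not the project's evaluation code: the function names, the phase names used in the usage note, and the rule "flag when accuracy is below 0.90" are assumptions for illustration.

```python
def confusion_matrix(y_true, y_pred, labels):
    """Count predictions: rows are true phases, columns are predicted phases."""
    index = {label: i for i, label in enumerate(labels)}
    matrix = [[0] * len(labels) for _ in labels]
    for true_label, predicted_label in zip(y_true, y_pred):
        matrix[index[true_label]][index[predicted_label]] += 1
    return matrix


def evaluate(y_true, y_pred, labels, accuracy_threshold=0.90):
    """Return accuracy, per-class recall, macro-F1, the confusion matrix,
    and a retraining flag (assumed rule: accuracy below the threshold).
    Assumes `labels` is a non-empty list of all phase names."""
    cm = confusion_matrix(y_true, y_pred, labels)
    total = sum(sum(row) for row in cm)
    recall, f1_scores = {}, []
    for i, label in enumerate(labels):
        tp = cm[i][i]
        fn = sum(cm[i]) - tp                    # true phase i, predicted otherwise
        fp = sum(row[i] for row in cm) - tp     # predicted phase i, true otherwise
        class_recall = tp / (tp + fn) if tp + fn else 0.0
        class_precision = tp / (tp + fp) if tp + fp else 0.0
        denom = class_precision + class_recall
        f1_scores.append(2 * class_precision * class_recall / denom if denom else 0.0)
        recall[label] = class_recall
    accuracy = sum(cm[i][i] for i in range(len(labels))) / total if total else 0.0
    return {
        "accuracy": accuracy,
        "recall": recall,
        "macro_f1": sum(f1_scores) / len(labels),  # unweighted mean over phases
        "confusion_matrix": cm,
        "needs_retraining": accuracy < accuracy_threshold,
    }
```

For example, `evaluate(y_true, y_pred, ["bud", "flower", "fruit"])` (hypothetical phase names) returns a dictionary whose `needs_retraining` entry corresponds to scenario 2.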

---

### User Story 3 - Classify New Images (Priority: P3)

As a user, I want to use the trained model to classify new plant images by phenological phase.

**Why this priority**: This enables practical use of the model for monitoring or analysis.

**Independent Test**: Can be tested by providing a new image and verifying the predicted phase matches expectations.

**Acceptance Scenarios**:

1. **Given** a new plant image, **When** classification is requested, **Then** the model returns the predicted phenological phase and presents the result visually (FR-007).
2. **Given** an image, **When** classified, **Then** response time is under 1 second.
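
The latency scenario could be verified with a small timing harness like the one below. It is a sketch only: `measure_latency` and the `classify` callable are hypothetical names, and `classify` stands in for whatever inference function the trained model eventually exposes.

```python
import time


def measure_latency(classify, image, runs=5):
    """Call `classify(image)` several times and return the last prediction
    together with the slowest observed wall-clock time in seconds."""
    worst = 0.0
    prediction = None
    for _ in range(runs):
        start = time.perf_counter()
        prediction = classify(image)
        worst = max(worst, time.perf_counter() - start)
    return prediction, worst
```

Taking the worst of several runs is a deliberately conservative choice; a test for the one-second limit would then assert `worst < 1.0`.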

---

## Clarifications

### Session 2025-11-04

- Q: How should the dataset be split for training, validation, and testing? → A: 70/15/15
- Q: What evaluation metrics must be included? → A: Accuracy, recall, macro-F1, and a confusion matrix; additional metrics are optional
- Q: How should classification results be returned? → A: Visually
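
The 70/15/15 split agreed above could be implemented as follows. This is a minimal sketch under two assumptions not stated in the spec: the split is stratified per phase, and a fixed seed is used for reproducibility. The function name and the `(image_path, phase_label)` sample layout are illustrative.

```python
import random
from collections import defaultdict


def stratified_split(samples, ratios=(0.70, 0.15, 0.15), seed=42):
    """Split (image_path, phase_label) pairs into train/val/test lists,
    applying the ratios within each phase so every split sees every phase."""
    by_label = defaultdict(list)
    for sample in samples:
        by_label[sample[1]].append(sample)

    rng = random.Random(seed)  # local generator keeps the split reproducible
    train, val, test = [], [], []
    for label in sorted(by_label):
        group = by_label[label]
        rng.shuffle(group)
        n_train = round(len(group) * ratios[0])
        n_val = round(len(group) * ratios[1])
        train += group[:n_train]
        val += group[n_train:n_train + n_val]
        test += group[n_train + n_val:]  # remainder goes to the test split
    return train, val, test
```

Stratifying keeps rare phases represented in the validation and test splits, which matters for the class-imbalance edge case; for very small phases the rounded counts will deviate from the exact ratios.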

### Edge Cases

- What happens when an image is of poor quality or not a plant?
- How does the system handle images with multiple plants or unclear phases?
- What if the dataset has imbalanced classes for certain phases?
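
For the class-imbalance case, one common mitigation is to weight the training loss by inverse class frequency. The sketch below is not a decision recorded in this spec; it uses the widely used "balanced" heuristic `n_samples / (n_classes * class_count)`, and the function name is hypothetical.

```python
from collections import Counter


def inverse_frequency_weights(labels):
    """Map each phase to a weight that grows as the phase gets rarer."""
    counts = Counter(labels)
    n_samples = len(labels)
    n_classes = len(counts)
    return {label: n_samples / (n_classes * count) for label, count in counts.items()}
```

With this scheme a perfectly balanced dataset yields a weight of 1.0 for every phase, so the weights only change training when the phases are unevenly represented.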

## Requirements *(mandatory)*

### Functional Requirements

- **FR-001**: System MUST load and preprocess labeled plant image datasets.
- **FR-002**: System MUST train a ResNet model on the dataset to classify the phenological phases specified in the dataset's `.csv` labels file.
- **FR-003**: System MUST evaluate the model using accuracy, recall, macro-F1, and a confusion matrix. Additional metrics are optional.
- **FR-004**: System MUST provide inference capability to classify new images.
- **FR-005**: System MUST save and load trained models for reuse.
- **FR-006**: System MUST split the dataset located at `C:\Users\sof12\Desktop\ML\Datasets\Nocciola\GBIF` into 70% training, 15% validation, and 15% testing.
- **FR-007**: System MUST provide visual output of classification results.
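
As an illustration of the label-loading part of FR-001 and FR-002, the sketch below parses a labels CSV into `(image_path, phase_label)` pairs. The spec does not define the CSV layout, so the column names `filename` and `phase` are assumptions, as is the choice to skip rows with a missing value.

```python
import csv


def read_labels(lines, path_column="filename", label_column="phase"):
    """Parse CSV text lines (an open file or a list of strings) into
    (image_path, phase_label) pairs. Column names are hypothetical."""
    samples = []
    for row in csv.DictReader(lines):
        path = (row.get(path_column) or "").strip()
        label = (row.get(label_column) or "").strip()
        if path and label:  # skip rows with a missing path or label
            samples.append((path, label))
    return samples
```

In practice the function would be called on an open file, for example `with open(labels_csv, newline="", encoding="utf-8") as f: samples = read_labels(f)`, where `labels_csv` is the path to the dataset's labels file.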

### Key Entities *(include if feature involves data)*

- **Dataset**: Located at `C:\Users\sof12\Desktop\ML\Datasets\Nocciola\GBIF`, containing plant images and labels.
- **Plant Image**: Represents an image of a plant, with attributes like image data and associated phenological phase label.
- **Phenological Phase**: Represents a growth stage of the plant, with attributes like phase name and description.

## Success Criteria *(mandatory)*

### Measurable Outcomes

- **SC-001**: Model achieves at least 90% accuracy on a held-out test set.
- **SC-002**: Training process completes within 1 hour for a standard dataset size.
- **SC-003**: Inference on a single image takes less than 1 second.
- **SC-004**: System handles datasets with up to 10,000 images without performance degradation.