sofia/Phenology

SOFIA GARCIA s321387 a1e046c1ae Supervised Learning models

2025-11-06 14:16:49 +01:00

ResNet Phenology Classifier - Development Guide

Development Setup

Prerequisites

Python 3.11+
CUDA-capable GPU (recommended)
8GB+ RAM
Git

Environment Setup

Clone the repository:

git clone <repository_url>
cd resnet

Create virtual environment:

python -m venv venv
source venv/bin/activate  # Windows: venv\Scripts\activate

Install dependencies:

pip install -r requirements.txt

Install development dependencies:

pip install black flake8 mypy pylint pytest-cov

Code Quality Standards

PEP 8 Compliance

All code must follow PEP 8. Check compliance with flake8:

flake8 src/ tests/

Type Hints

Use type hints for all functions:

def train_model(epochs: int, lr: float) -> dict:
    ...

Docstrings

All modules, classes, and functions must have docstrings:

def function(arg: str) -> int:
    """
    Brief description.
    
    Args:
        arg: Description
        
    Returns:
        Description
    """
    pass

Testing

Running Tests

# All tests
pytest tests/ -v

# With coverage
pytest tests/ --cov=src --cov-report=html

# Specific markers
pytest tests/ -m unit
pytest tests/ -m integration
pytest tests/ -m slow

Writing Tests

Unit tests for all utility functions
Integration tests for data pipelines
Model validation tests
Use fixtures for common setup
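
As an illustration of the points above, here is a minimal sketch of a unit test that uses a fixture and the unit marker. The helper split_items and the file names are hypothetical and stand in for whatever utility is under test. The sketch also assumes the unit, integration, and slow markers are registered in the project's pytest configuration.

```python
import pytest


def split_items(items: list[str], val_fraction: float) -> tuple[list[str], list[str]]:
    """Hypothetical utility: split items into (train, validation) lists."""
    n_val = int(len(items) * val_fraction)
    # The last n_val items go to validation, the rest to training.
    return items[: len(items) - n_val], items[len(items) - n_val:]


@pytest.fixture
def image_names() -> list[str]:
    """Common setup shared by several tests (hypothetical file names)."""
    return [f"img_{i:03d}.jpg" for i in range(10)]


@pytest.mark.unit
def test_split_is_disjoint_and_complete(image_names: list[str]) -> None:
    train, val = split_items(image_names, val_fraction=0.2)
    assert len(val) == 2
    assert set(train).isdisjoint(val)
    assert sorted(train + val) == sorted(image_names)
```

pytest injects the fixture by matching the test's parameter name, so several tests can share the same setup without repeating it.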

Test Coverage

Minimum 80% code coverage
100% coverage for critical paths

Continuous Integration

Pre-commit Checks

Before committing:

Run linter: flake8 src/ tests/
Run type checker: mypy src/
Run tests: pytest tests/ -v
Check formatting: black --check src/ tests/
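
The four checks above can also be run from a single helper. The following is a sketch, not part of the project: the names CHECKS and run_checks are hypothetical, and it assumes the tools are installed and on PATH.

```python
import subprocess

# The four checks listed above, in the same order.
CHECKS: list[list[str]] = [
    ["flake8", "src/", "tests/"],
    ["mypy", "src/"],
    ["pytest", "tests/", "-v"],
    ["black", "--check", "src/", "tests/"],
]


def run_checks(checks: list[list[str]]) -> list[str]:
    """Run each command and return the ones that failed, as display strings."""
    failed = []
    for cmd in checks:
        try:
            ok = subprocess.run(cmd, check=False).returncode == 0
        except FileNotFoundError:
            # The tool is not installed or not on PATH; count it as a failure.
            ok = False
        if not ok:
            failed.append(" ".join(cmd))
    return failed
```

Calling run_checks(CHECKS) from the repository root returns an empty list when every check passes, so a wrapper script can exit non-zero whenever the list is non-empty.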

CI Pipeline

The CI/CD pipeline runs:

Linting (flake8, pylint)
Type checking (mypy)
Unit tests
Integration tests
Coverage report

Model Development

Training Best Practices

Always set random seed
Use validation set for hyperparameter tuning
Save checkpoints regularly
Monitor training metrics
Use early stopping
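
Early stopping can be sketched without any framework code. The class below is an illustrative assumption, not the project's implementation: the names EarlyStopping, patience, and min_delta are hypothetical, and the loss values are made up to stand in for a real training loop.

```python
class EarlyStopping:
    """Signal a stop when validation loss has not improved for `patience` epochs."""

    def __init__(self, patience: int = 5, min_delta: float = 0.0) -> None:
        self.patience = patience
        self.min_delta = min_delta
        self.best_loss = float("inf")
        self.epochs_without_improvement = 0

    def step(self, val_loss: float) -> bool:
        """Record one epoch's validation loss; return True if training should stop."""
        if val_loss < self.best_loss - self.min_delta:
            # Improvement: remember it and reset the counter.
            self.best_loss = val_loss
            self.epochs_without_improvement = 0
        else:
            self.epochs_without_improvement += 1
        return self.epochs_without_improvement >= self.patience


# Made-up validation losses standing in for a real training loop.
stopper = EarlyStopping(patience=2)
losses = [0.90, 0.70, 0.65, 0.66, 0.67, 0.60]
stopped_at = None
for epoch, loss in enumerate(losses):
    if stopper.step(loss):
        stopped_at = epoch
        break
```

With these values the loop stops at epoch index 4, after two consecutive epochs without improvement over the best loss of 0.65. In practice the checkpoint saved at the best epoch would be the one kept.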

Evaluation

Evaluate on independent test set
Report multiple metrics (accuracy, precision, recall, F1)
Analyze confusion matrix
Check for bias
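
To make the metrics above concrete, the sketch below derives per-class precision, recall, and F1 plus overall accuracy from a confusion matrix in plain Python. The function names and the example counts are hypothetical. The sketch assumes the row index is the true class and the column index is the predicted class.

```python
def per_class_metrics(confusion: list[list[int]]) -> list[dict[str, float]]:
    """Compute precision, recall, and F1 for each class.

    confusion[i][j] counts samples whose true class is i and predicted class is j.
    """
    n = len(confusion)
    metrics = []
    for k in range(n):
        tp = confusion[k][k]
        fp = sum(confusion[i][k] for i in range(n)) - tp  # predicted k, true class differs
        fn = sum(confusion[k]) - tp  # true k, predicted class differs
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        metrics.append({"precision": precision, "recall": recall, "f1": f1})
    return metrics


def accuracy(confusion: list[list[int]]) -> float:
    """Fraction of all samples that lie on the diagonal."""
    total = sum(sum(row) for row in confusion)
    correct = sum(confusion[k][k] for k in range(len(confusion)))
    return correct / total if total else 0.0


confusion = [[8, 2], [1, 9]]  # made-up counts for two classes
report = per_class_metrics(confusion)
overall = accuracy(confusion)
```

Large differences between classes in the per-class values are one simple signal of a model biased toward particular classes.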

Versioning

Version models with a timestamp
Track hyperparameters
Save class mappings
Document training data
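
One way to cover all four points is to write a small metadata file next to each saved model. This is a sketch under assumptions: the function name, the JSON layout, and the file naming scheme are hypothetical and not taken from the project.

```python
import json
from datetime import datetime, timezone
from pathlib import Path


def save_model_metadata(
    out_dir: Path,
    hyperparameters: dict,
    class_to_idx: dict[str, int],
    dataset_note: str,
) -> Path:
    """Write a timestamped JSON metadata file and return its path."""
    # A UTC timestamp doubles as the model version.
    timestamp = datetime.now(timezone.utc).strftime("%Y%m%dT%H%M%SZ")
    metadata = {
        "version": timestamp,
        "hyperparameters": hyperparameters,
        "class_to_idx": class_to_idx,
        "training_data": dataset_note,
    }
    out_dir.mkdir(parents=True, exist_ok=True)
    path = out_dir / f"model_{timestamp}.json"
    path.write_text(json.dumps(metadata, indent=2, sort_keys=True), encoding="utf-8")
    return path
```

Storing the class mapping with the model matters at inference time, because predicted indices can only be turned back into labels with the exact mapping used during training.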

Git Workflow

Branching Strategy

master: Production-ready code
1-phenology-classifier: Feature branch
Feature branches for new capabilities

Commit Messages

Follow conventional commits:

feat: add confusion matrix visualization
fix: correct data loader split logic
docs: update README with API examples
test: add unit tests for inference
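
The format of these subject lines can be checked mechanically. The sketch below is a simplified assumption, not the full Conventional Commits grammar: it accepts only the four types used in the examples above, allows an optional scope, and ignores the breaking-change marker. The names ALLOWED_TYPES and is_conventional are hypothetical.

```python
import re

# Types taken from the examples above; a real project may allow more.
ALLOWED_TYPES = ("feat", "fix", "docs", "test")

# <type>(optional scope): <description>
_PATTERN = re.compile(r"^(?P<type>[a-z]+)(\([\w\-]+\))?: (?P<desc>\S.*)$")


def is_conventional(subject: str) -> bool:
    """Return True if the subject line matches `<type>[(scope)]: <description>`."""
    match = _PATTERN.match(subject)
    return bool(match) and match.group("type") in ALLOWED_TYPES
```

A check like this could run in a commit-msg hook or in CI. Extending ALLOWED_TYPES is the only change needed to permit further types.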

Performance Optimization

Training

Use mixed precision training
Optimize data loading (num_workers)
Use GPU if available
Tune the batch size

Inference

Model quantization
Batch predictions
Cache loaded models
Optimize image preprocessing

Troubleshooting

Common Issues

CUDA out of memory:

Reduce batch size
Use gradient accumulation
Release cached, unused GPU memory: torch.cuda.empty_cache() (tensors that are still referenced are not freed)

Slow data loading:

Increase num_workers
Use SSD for dataset
Preprocess images offline

Poor accuracy:

Check data quality
Increase training epochs
Try different learning rates
Use data augmentation

Documentation

Code Documentation

Docstrings for all public APIs
Inline comments for complex logic
Type hints throughout

Project Documentation

Update README for new features
Document API changes
Maintain changelog

Release Process

Update version number
Run full test suite
Build documentation
Create release notes
Tag release in git
Deploy to production

Contact

For questions or issues, refer to the project specifications in specs/1-phenology-classifier/.
