Machine Learning & Deep Learning
Algorithms, architectures, and workflows for supervised, unsupervised, and deep learning in scientific applications
Overview
Machine learning (ML) encompasses algorithms that learn patterns from data without explicit programming. Supervised learning (regression, classification) maps inputs to known outputs; unsupervised learning (clustering, dimensionality reduction) discovers hidden structure; deep learning uses multi-layer neural networks to learn hierarchical representations. In ocean science, ML powers satellite image classification, species distribution modeling, ocean state forecasting, and anomaly detection. Key Python frameworks include scikit-learn, TensorFlow, PyTorch, and XGBoost.
| Category | Methods | Ocean Science Application |
|---|---|---|
| Supervised – Regression | Linear, SVR, RF, Gradient Boosting, MLP | SST prediction, Chl-a retrieval |
| Supervised – Classification | SVM, RF, XGBoost, CNN | Habitat mapping, species ID |
| Unsupervised – Clustering | K-Means, DBSCAN, GMM, SOM | Water mass classification, regime detection |
| Deep Learning – CNN | ResNet, U-Net, VGG | Semantic segmentation of satellite imagery |
| Deep Learning – RNN/LSTM | LSTM, GRU, ConvLSTM | Time series forecasting, surge prediction |
| Deep Learning – Transformer | Vision Transformer, GPT-style | Foundation models for Earth observation |
| Physics-Informed | PINNs, Neural ODEs | Constrained ocean dynamics learning |
Core Concepts
Bias-Variance Tradeoff
Expected prediction error decomposes as Bias² + Variance + irreducible noise. Simple models (high bias) underfit; overly flexible models (high variance) overfit. Cross-validation, regularization (L1/L2), and ensemble methods help balance this tradeoff.
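One way to see the tradeoff is to score models of increasing flexibility with cross-validation. The sketch below (synthetic data; the noisy sinusoid and the specific degrees are illustrative assumptions, not from the text) fits polynomial regressions of degree 1, 5, and 15 with a small Ridge penalty: the linear model underfits, while the mid-degree model captures the signal.

```python
import numpy as np
from sklearn.linear_model import Ridge
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures

rng = np.random.default_rng(0)
x = rng.uniform(0, 1, 80)
y = np.sin(2 * np.pi * x) + rng.normal(0, 0.3, 80)  # noisy sinusoid
X = x.reshape(-1, 1)

# Mean 5-fold CV R^2 for increasing polynomial degree: low degree underfits
# (high bias); high degree overfits (high variance); the Ridge (L2) penalty
# tempers the variance of the flexible models.
scores = {}
for degree in (1, 5, 15):
    model = make_pipeline(PolynomialFeatures(degree), Ridge(alpha=1e-3))
    scores[degree] = cross_val_score(model, X, y, cv=5).mean()
print(scores)
```

Plotting these scores against degree traces out the classic U-shaped validation curve.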
Neural Network Architecture
A neural network is a composition of linear transformations with nonlinear activations. Each layer computes z = Wx + b followed by a = σ(z), where σ is a nonlinearity such as ReLU. Backpropagation applies the chain rule to compute the gradient of the loss with respect to every weight, which the optimizer uses for weight updates. Dropout, batch normalization, and skip connections stabilize and accelerate training.
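The per-layer computation and the chain rule can be sketched in a few lines of NumPy. This is a minimal two-layer example (layer sizes and the squared loss are illustrative assumptions), showing one forward pass and the backpropagated gradients:

```python
import numpy as np

rng = np.random.default_rng(42)

def relu(z):
    return np.maximum(0.0, z)

# Forward pass: each layer applies z = W x + b, then a = sigma(z).
# Sizes here (4 inputs -> 8 hidden -> 1 output) are illustrative.
x = rng.normal(size=4)
W1, b1 = 0.5 * rng.normal(size=(8, 4)), np.zeros(8)
W2, b2 = 0.5 * rng.normal(size=(1, 8)), np.zeros(1)

z1 = W1 @ x + b1
a1 = relu(z1)                    # hidden activations
y_hat = W2 @ a1 + b2             # linear output layer

# Backpropagation for squared loss L = (y_hat - y)^2 / 2, via the chain rule:
y = 1.0
d_out = y_hat - y                # dL/dy_hat
grad_W2 = np.outer(d_out, a1)    # dL/dW2
d_hidden = (W2.T @ d_out) * (z1 > 0)  # dL/dz1, gated by ReLU's derivative
grad_W1 = np.outer(d_hidden, x)  # dL/dW1
print(grad_W1.shape, grad_W2.shape)
```

A gradient-descent update would then subtract a small multiple of each gradient from the corresponding weight matrix.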
Convolutional Neural Networks
CNNs use learnable spatial filters (kernels) to extract local features from images. Pooling reduces spatial dimensions; stacked layers learn hierarchical features. U-Net is popular for semantic segmentation of satellite/sonar imagery.
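The two core CNN operations, spatial filtering and pooling, can be demonstrated without a deep learning framework. Below, a hand-written valid-mode convolution applies a Sobel-like vertical-edge kernel to a toy image (the 8×8 image and the kernel are illustrative assumptions), and max pooling then halves each spatial dimension:

```python
import numpy as np

def conv2d(image, kernel):
    """Valid-mode 2D cross-correlation, the sliding-window op used in CNN layers."""
    kh, kw = kernel.shape
    oh, ow = image.shape[0] - kh + 1, image.shape[1] - kw + 1
    out = np.empty((oh, ow))
    for i in range(oh):
        for j in range(ow):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

def max_pool(fmap, size=2):
    """Non-overlapping max pooling: reduces each spatial dimension by `size`."""
    h, w = fmap.shape[0] // size, fmap.shape[1] // size
    return fmap[:h * size, :w * size].reshape(h, size, w, size).max(axis=(1, 3))

# Toy image: dark left half, bright right half -> a single vertical edge.
image = np.zeros((8, 8))
image[:, 4:] = 1.0
kernel = np.array([[-1, 0, 1],
                   [-2, 0, 2],
                   [-1, 0, 1]], dtype=float)  # vertical-edge detector

features = conv2d(image, kernel)  # strongest response along the edge
pooled = max_pool(features)
print(features.shape, pooled.shape)
```

In a trained CNN the kernels are learned rather than hand-set, and stacking many such filter-plus-pooling layers yields the hierarchical features the text describes.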
Recurrent & Sequence Models
LSTMs and GRUs process sequential data with gating mechanisms that control information flow, mitigating the vanishing gradient problem that plagues plain RNNs. ConvLSTM combines spatial and temporal learning for spatiotemporal forecasting (e.g., SST fields).
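The gating mechanism is easiest to see in a single LSTM cell step written out explicitly. This NumPy sketch (hidden and input sizes, random weights, and the toy sequence are illustrative assumptions) shows how the input, forget, and output gates modulate an additive cell-state update:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x, h_prev, c_prev, W, b):
    """One LSTM cell step. W stacks the four gate weight matrices
    (input, forget, candidate, output) acting on [h_prev, x]."""
    n = h_prev.size
    zs = W @ np.concatenate([h_prev, x]) + b
    i = sigmoid(zs[0 * n:1 * n])   # input gate: what to write
    f = sigmoid(zs[1 * n:2 * n])   # forget gate: what to keep
    g = np.tanh(zs[2 * n:3 * n])   # candidate cell update
    o = sigmoid(zs[3 * n:4 * n])   # output gate: what to expose
    c = f * c_prev + i * g         # additive update -> gradients flow through time
    h = o * np.tanh(c)
    return h, c

rng = np.random.default_rng(0)
n_hidden, n_input = 5, 3
W = rng.normal(scale=0.1, size=(4 * n_hidden, n_hidden + n_input))
b = np.zeros(4 * n_hidden)

h, c = np.zeros(n_hidden), np.zeros(n_hidden)
for t in range(10):                # unroll over a toy 10-step sequence
    h, c = lstm_step(rng.normal(size=n_input), h, c, W, b)
print(h.shape, c.shape)
```

The additive `c = f * c_prev + i * g` update is the key design choice: gradients pass through the cell state largely unattenuated, unlike the repeated matrix multiplications of a vanilla RNN.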
Ensemble Methods
Random Forest (bagging decision trees) and Gradient Boosting (XGBoost, LightGBM) combine many weak learners into a powerful predictor. Feature importance measures aid interpretability.
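A minimal bagging example with scikit-learn, on synthetic data (the three features and their "lat"/"lon"/"noise" labels are illustrative assumptions): each tree in the forest is fit on a bootstrap sample, predictions are averaged, and the impurity-based feature importances recover which inputs actually drive the target.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(1)
n = 500
# Synthetic regression: the target depends on the first two features;
# the third is pure noise.
X = rng.normal(size=(n, 3))
y = 2.0 * X[:, 0] + np.sin(3 * X[:, 1]) + rng.normal(0, 0.1, n)

# Bagging: each of the 200 trees sees a bootstrap resample of the data.
rf = RandomForestRegressor(n_estimators=200, random_state=0).fit(X, y)

for name, imp in zip(["lat", "lon", "noise"], rf.feature_importances_):
    print(f"{name}: {imp:.3f}")
```

The noise feature receives near-zero importance, which is the kind of diagnostic that makes ensembles useful for variable screening in, e.g., Chl-a retrieval.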
Hyperparameter Tuning
Grid search, random search, and Bayesian optimization (Optuna, Hyperopt) search the hyperparameter space for well-performing configurations. Nested cross-validation keeps tuning and evaluation on separate folds, preventing information leakage that would inflate performance estimates.
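Nested cross-validation can be sketched with scikit-learn by wrapping a `GridSearchCV` (the inner loop, which tunes) inside `cross_val_score` (the outer loop, which evaluates on folds never seen during tuning). The Ridge model, alpha grid, and synthetic data below are illustrative assumptions:

```python
import numpy as np
from sklearn.linear_model import Ridge
from sklearn.model_selection import GridSearchCV, KFold, cross_val_score

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))
y = X @ np.array([1.0, -2.0, 0.5, 0.0, 0.0]) + rng.normal(0, 0.5, 200)

# Inner loop: tune the regularization strength alpha by 5-fold CV.
inner = GridSearchCV(
    Ridge(),
    {"alpha": [0.01, 0.1, 1.0, 10.0]},
    cv=KFold(5, shuffle=True, random_state=1),
)

# Outer loop: score the *tuned* estimator on held-out folds, so the
# reported performance never touches data used to pick alpha.
outer_scores = cross_val_score(inner, X, y, cv=KFold(5, shuffle=True, random_state=2))
print(outer_scores.mean())
```

A single (non-nested) CV score from the inner search alone would be optimistically biased, since the same folds would select alpha and report performance.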
Interactive Visualizations
- Learning curves: bias vs. variance
- Confusion matrix: habitat classification
- Training and validation loss over epochs
Key References
- Goodfellow, I., Bengio, Y. & Courville, A. (2016). Deep Learning. MIT Press.
- Hastie, T., Tibshirani, R. & Friedman, J. (2009). The Elements of Statistical Learning. Springer.
- LeCun, Y., Bengio, Y. & Hinton, G. (2015). Deep learning. Nature, 521, 436–444.
- Reichstein, M. et al. (2019). Deep learning and process understanding for data-driven Earth system science. Nature, 566, 195–204.
- Chen, T. & Guestrin, C. (2016). XGBoost: A scalable tree boosting system. KDD, 785–794.