Machine Learning & Deep Learning

Algorithms, architectures, and workflows for supervised, unsupervised, and deep learning in scientific applications

Overview

Machine learning (ML) encompasses algorithms that learn patterns from data without explicit programming. Supervised learning (regression, classification) maps inputs to known outputs; unsupervised learning (clustering, dimensionality reduction) discovers hidden structure; deep learning uses multi-layer neural networks to learn hierarchical representations. In ocean science, ML powers satellite image classification, species distribution modeling, ocean state forecasting, and anomaly detection. Key Python frameworks include scikit-learn, TensorFlow, PyTorch, and XGBoost.

Category                    | Methods                                   | Ocean Science Application
Supervised – Regression     | Linear, SVR, RF, Gradient Boosting, MLP   | SST prediction, Chl-a retrieval
Supervised – Classification | SVM, RF, XGBoost, CNN                     | Habitat mapping, species ID
Unsupervised – Clustering   | K-Means, DBSCAN, GMM, SOM                 | Water mass classification, regime detection
Deep Learning – CNN         | ResNet, U-Net, VGG                        | Semantic segmentation of satellite imagery
Deep Learning – RNN/LSTM    | LSTM, GRU, ConvLSTM                       | Time series forecasting, surge prediction
Deep Learning – Transformer | Vision Transformer, GPT-style             | Foundation models for Earth observation
Physics-Informed            | PINNs, Neural ODEs                        | Constrained ocean dynamics learning

Core Concepts

Bias-Variance Tradeoff

Model error = Bias² + Variance + Irreducible noise. Simple models (high bias) underfit; complex models (high variance) overfit. Cross-validation, regularization (L1/L2), and ensemble methods balance this tradeoff.

MSE = Bias²(f̂) + Var(f̂) + σ²
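The tradeoff can be seen directly by fitting models of increasing complexity to noisy data. Below is a minimal sketch using NumPy polynomial fits to a noisy sine signal; the degrees, noise level, and sample size are illustrative choices, not prescribed by any particular study.

```python
# Sketch: bias-variance tradeoff via polynomial fits to a noisy sine.
import numpy as np

rng = np.random.default_rng(0)
x = np.linspace(0.0, 1.0, 40)
y_true = np.sin(2 * np.pi * x)                  # underlying signal
y = y_true + rng.normal(0.0, 0.3, size=x.size)  # noisy observations

def fit_mse(degree):
    """Fit a degree-d polynomial to noisy y; score against the clean signal."""
    coefs = np.polyfit(x, y, degree)
    y_hat = np.polyval(coefs, x)
    return np.mean((y_hat - y_true) ** 2)

errs = {d: fit_mse(d) for d in (1, 5, 15)}
# Degree 1 underfits (high bias); degree 15 chases noise (high variance);
# an intermediate degree tracks the signal best.
```

In practice, cross-validated error rather than error against a known clean signal is used to locate the sweet spot, since the true function is unavailable.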

Neural Network Architecture

A neural network is a composition of linear transformations with nonlinear activations. Each layer computes a linear transformation z = Wx + b followed by a nonlinear activation a = σ(z). Backpropagation computes gradients for weight updates via the chain rule. Dropout, batch normalization, and skip connections improve training.
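A forward pass through this composition takes only a few lines of NumPy. The sketch below uses illustrative layer sizes (3 inputs, 8 hidden units, 1 output) and tanh/identity activations; the random weights stand in for trained parameters.

```python
# Minimal sketch of a forward pass through a two-layer MLP in NumPy.
import numpy as np

rng = np.random.default_rng(42)
W1, b1 = rng.normal(size=(8, 3)), np.zeros(8)   # hidden layer: 3 -> 8
W2, b2 = rng.normal(size=(1, 8)), np.zeros(1)   # output layer: 8 -> 1

def forward(x):
    z1 = W1 @ x + b1      # linear transformation
    a1 = np.tanh(z1)      # nonlinear activation
    z2 = W2 @ a1 + b2     # output layer (identity activation)
    return z2

y = forward(np.array([0.1, -0.5, 2.0]))
```

Backpropagation would then apply the chain rule through these same operations in reverse to obtain gradients of a loss with respect to W1, b1, W2, b2.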

Convolutional Neural Networks

CNNs use learnable spatial filters (kernels) to extract local features from images. Pooling reduces spatial dimensions; stacked layers learn hierarchical features. U-Net is popular for semantic segmentation of satellite/sonar imagery.
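The core operation is straightforward to write out. The sketch below applies one fixed filter (a Sobel edge detector, standing in for a learned kernel) to a tiny synthetic image; in a real CNN the kernel values are learned and many filters are stacked.

```python
# Sketch: one convolutional filter extracting a local spatial feature.
import numpy as np

def conv2d(image, kernel):
    """Valid-mode 2D cross-correlation, as used in CNN layers."""
    kh, kw = kernel.shape
    oh, ow = image.shape[0] - kh + 1, image.shape[1] - kw + 1
    out = np.empty((oh, ow))
    for i in range(oh):
        for j in range(ow):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

image = np.zeros((6, 6))
image[:, 3:] = 1.0                        # vertical edge at column 3
sobel_x = np.array([[-1, 0, 1],
                    [-2, 0, 2],
                    [-1, 0, 1]], dtype=float)
fmap = conv2d(image, sobel_x)             # strong response along the edge
```

The feature map responds only where the window straddles the edge, which is exactly the locality that makes stacked convolutions build hierarchical features.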

Recurrent & Sequence Models

LSTMs and GRUs process sequential data with gating mechanisms that control information flow, mitigating the vanishing gradient problem that plagues plain RNNs. ConvLSTM combines spatial and temporal learning for spatiotemporal forecasting (e.g., SST fields).
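The gating idea can be sketched with a simplified GRU-style update: a gate z in (0, 1) blends the previous hidden state with a candidate, so information (and gradients) can pass through largely unchanged when z is small. This is a minimal illustration, assuming random untrained weights and omitting the GRU's reset gate.

```python
# Sketch: simplified GRU-style gated update (reset gate omitted).
import numpy as np

rng = np.random.default_rng(1)
H, D = 4, 3                                # hidden size, input size
Wz, Uz = rng.normal(scale=0.1, size=(H, D)), rng.normal(scale=0.1, size=(H, H))
Wh, Uh = rng.normal(scale=0.1, size=(H, D)), rng.normal(scale=0.1, size=(H, H))

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def gru_step(h, x):
    z = sigmoid(Wz @ x + Uz @ h)           # update gate in (0, 1)
    h_cand = np.tanh(Wh @ x + Uh @ h)      # candidate state
    return (1 - z) * h + z * h_cand        # gated blend of old and new

h = np.zeros(H)
for x in rng.normal(size=(5, D)):          # run over a short sequence
    h = gru_step(h, x)
```

Because the update is a convex combination, the state stays bounded and the gate learns when to overwrite versus preserve memory.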

Ensemble Methods

Random Forest (bagging decision trees) and Gradient Boosting (XGBoost, LightGBM) combine many weak learners into a powerful predictor. Feature importance measures aid interpretability.
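A short scikit-learn sketch shows both pieces: an ensemble fit and its feature importances. The synthetic regression task below (one dominant predictor, one weak one, two irrelevant ones) is an illustrative stand-in for something like SST prediction from environmental covariates.

```python
# Sketch: Random Forest regression with feature importances (scikit-learn).
import numpy as np
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(0)
X = rng.normal(size=(300, 4))
y = 3.0 * X[:, 0] + 0.5 * X[:, 1] + rng.normal(scale=0.1, size=300)

model = RandomForestRegressor(n_estimators=100, random_state=0).fit(X, y)
importances = model.feature_importances_   # sum to 1; higher = more used
```

Impurity-based importances like these are convenient but can be biased toward high-cardinality features; permutation importance is a common cross-check.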

Hyperparameter Tuning

Grid search, random search, and Bayesian optimization (Optuna, Hyperopt) find optimal hyperparameters. Nested cross-validation prevents information leakage between tuning and evaluation.
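The nested scheme is simple to express in scikit-learn: an inner GridSearchCV handles tuning, and an outer cross_val_score evaluates the tuned estimator on folds the search never saw. The grid values, fold counts, and Ridge model below are illustrative choices.

```python
# Sketch: nested cross-validation with scikit-learn.
from sklearn.datasets import make_regression
from sklearn.linear_model import Ridge
from sklearn.model_selection import GridSearchCV, cross_val_score

X, y = make_regression(n_samples=200, n_features=10, noise=5.0, random_state=0)

# Inner loop: tune the regularization strength on each training split.
inner = GridSearchCV(Ridge(), param_grid={"alpha": [0.1, 1.0, 10.0]}, cv=3)

# Outer loop: score the whole tune-then-fit procedure on held-out folds.
outer_scores = cross_val_score(inner, X, y, cv=5)
mean_r2 = outer_scores.mean()
```

Reporting the inner search's best score instead of the outer scores is the leakage the paragraph warns about: the same folds would then be used for both tuning and evaluation.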

Interactive Visualizations

[Interactive figures omitted: learning curves (bias vs. variance); confusion matrix for habitat classification; training and validation loss over epochs.]

Key References

  1. Goodfellow, I., Bengio, Y. & Courville, A. (2016). Deep Learning. MIT Press.
  2. Hastie, T., Tibshirani, R. & Friedman, J. (2009). The Elements of Statistical Learning. Springer.
  3. LeCun, Y., Bengio, Y. & Hinton, G. (2015). Deep learning. Nature, 521, 436–444.
  4. Reichstein, M. et al. (2019). Deep learning and process understanding for data-driven Earth system science. Nature, 566, 195–204.
  5. Chen, T. & Guestrin, C. (2016). XGBoost: A scalable tree boosting system. KDD, 785–794.