Deep Learning Frameworks: PyTorch vs TensorFlow
Compare PyTorch and TensorFlow for deep learning with tensor operations, model building, training loops, and Keras examples
PyTorch vs TensorFlow: The Two Giants
PyTorch (created at Meta, now governed by the PyTorch Foundation) and TensorFlow (by Google) dominate deep learning. PyTorch leads in research thanks to its Pythonic feel and dynamic computation graphs; TensorFlow leads in production deployment with TF Serving and TF Lite. Most engineers benefit from knowing both.
PyTorch
- Dynamic computation graphs (eager mode; see the sketch after these lists)
- Pythonic, intuitive API
- Dominates research (80%+ of recent papers)
- Hugging Face built on PyTorch
- TorchServe for deployment
TensorFlow
- Static + eager execution modes
- Keras high-level API (simpler)
- Strong production ecosystem
- TF Lite for mobile/edge
- TF.js for browser ML
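The execution-model difference is easiest to see in code. Below is a minimal sketch (function and variable names are illustrative): PyTorch runs every operation eagerly as ordinary Python, while TensorFlow, though also eager by default since 2.x, can trace a function into a reusable static graph with tf.function.

import torch
import tensorflow as tf

# PyTorch: eager by default -- each line executes immediately
x = torch.tensor([1.0, 2.0])
print(x * 2)  # computed right away; ordinary Python control flow works

# TensorFlow: eager by default too, but tf.function traces the Python
# code once into a static graph that is reused on later calls
@tf.function
def double(t):
    return t * 2

print(double(tf.constant([1.0, 2.0])))  # first call traces, later calls reuse the graph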
Tensor Operations: Side by Side
# ===== PyTorch Tensor Operations =====
import torch
# Create tensors
a = torch.tensor([[1, 2], [3, 4]], dtype=torch.float32)
b = torch.randn(2, 2) # random normal
c = torch.zeros(3, 3) # all zeros
d = torch.ones(2, 2) # all ones
# Operations
result = a @ b # matrix multiplication
result = a * b # element-wise multiply
result = a + b # addition
result = a.T # transpose
result = a.reshape(1, 4) # reshape
# GPU support
if torch.cuda.is_available():
    a_gpu = a.to('cuda')               # move to GPU
    result_gpu = a_gpu @ a_gpu         # computation on GPU
    result_cpu = result_gpu.to('cpu')  # back to CPU
print(f"PyTorch tensor shape: {a.shape}")
print(f"PyTorch tensor dtype: {a.dtype}")
# ===== TensorFlow Tensor Operations =====
import tensorflow as tf
# Create tensors
a = tf.constant([[1, 2], [3, 4]], dtype=tf.float32)
b = tf.random.normal((2, 2))
c = tf.zeros((3, 3))
d = tf.ones((2, 2))
# Operations
result = a @ b # matrix multiplication
result = a * b # element-wise multiply
result = a + b # addition
result = tf.transpose(a) # transpose
result = tf.reshape(a, (1, 4)) # reshape
# GPU support (automatic in TF)
# TF automatically places ops on GPU if available
with tf.device('/GPU:0'):
    result = a @ a
print(f"TF tensor shape: {a.shape}")
print(f"TF tensor dtype: {a.dtype}")
Building a Model: PyTorch
import torch
import torch.nn as nn
import torch.optim as optim
from torch.utils.data import DataLoader, TensorDataset
# Define model
class ImageClassifier(nn.Module):
    def __init__(self, num_classes=10):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 32, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.MaxPool2d(2),
            nn.Conv2d(32, 64, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.MaxPool2d(2),
        )
        self.classifier = nn.Sequential(
            nn.Flatten(),
            nn.Linear(64 * 7 * 7, 128),  # two 2x pools: 28 -> 14 -> 7
            nn.ReLU(),
            nn.Dropout(0.5),
            nn.Linear(128, num_classes),
        )

    def forward(self, x):
        x = self.features(x)
        x = self.classifier(x)
        return x
# Training loop (PyTorch gives you full control)
model = ImageClassifier(num_classes=10)
criterion = nn.CrossEntropyLoss()
optimizer = optim.Adam(model.parameters(), lr=0.001)
# Simulated data (normally use torchvision datasets)
X_train = torch.randn(1000, 1, 28, 28)
y_train = torch.randint(0, 10, (1000,))
dataset = TensorDataset(X_train, y_train)
loader = DataLoader(dataset, batch_size=32, shuffle=True)
# Training
model.train()
for epoch in range(5):
    total_loss = 0
    for batch_X, batch_y in loader:
        optimizer.zero_grad()               # clear gradients
        outputs = model(batch_X)            # forward pass
        loss = criterion(outputs, batch_y)  # compute loss
        loss.backward()                     # backpropagation
        optimizer.step()                    # update weights
        total_loss += loss.item()
    print(f"Epoch {epoch+1}: Loss = {total_loss/len(loader):.4f}")
# Evaluation
model.eval()
with torch.no_grad():
    test_output = model(X_train[:10])
    predictions = test_output.argmax(dim=1)
    print(f"Predictions: {predictions.tolist()}")
Building the Same Model: Keras (TensorFlow)
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import layers
# Define model (much less code with Keras!)
model = keras.Sequential([
    keras.Input(shape=(28, 28, 1)),  # explicit Input layer (preferred over input_shape= in recent Keras)
    layers.Conv2D(32, 3, padding='same', activation='relu'),
    layers.MaxPooling2D(2),
    layers.Conv2D(64, 3, padding='same', activation='relu'),
    layers.MaxPooling2D(2),
    layers.Flatten(),
    layers.Dense(128, activation='relu'),
    layers.Dropout(0.5),
    layers.Dense(10, activation='softmax'),
])
# Compile (one line to configure training)
model.compile(
    optimizer='adam',
    loss='sparse_categorical_crossentropy',
    metrics=['accuracy']
)
# View model summary
model.summary()
# Simulated data (normally use keras.datasets or a tf.data pipeline)
import numpy as np
X_train = np.random.randn(1000, 28, 28, 1).astype('float32')
y_train = np.random.randint(0, 10, (1000,))
# Training: one fit() call handles batching, shuffling, and logging
history = model.fit(
    X_train, y_train,
    epochs=5,
    batch_size=32,
    validation_split=0.2,  # hold out 20% as a validation set
    verbose=1
)
# Evaluate (on a slice of the training data here, just for illustration)
loss, accuracy = model.evaluate(X_train[:100], y_train[:100])
print(f"Accuracy: {accuracy:.4f}")
# Save and load
model.save('my_model.keras')
loaded_model = keras.models.load_model('my_model.keras')
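When fit() is too restrictive, TensorFlow also supports a PyTorch-style explicit loop with tf.GradientTape. A minimal sketch reusing the model and data above:

# Custom training loop: PyTorch-style control in TensorFlow
optimizer = keras.optimizers.Adam(learning_rate=0.001)
loss_fn = keras.losses.SparseCategoricalCrossentropy()  # model outputs softmax probabilities
dataset = tf.data.Dataset.from_tensor_slices((X_train, y_train)).batch(32)

for epoch in range(5):
    for batch_X, batch_y in dataset:
        with tf.GradientTape() as tape:
            preds = model(batch_X, training=True)  # forward pass
            loss = loss_fn(batch_y, preds)         # compute loss
        grads = tape.gradient(loss, model.trainable_variables)            # backprop
        optimizer.apply_gradients(zip(grads, model.trainable_variables))  # update weights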
When to Use Which Framework
Choose PyTorch When:
Research and prototyping, working with Hugging Face models, writing custom training loops, or whenever you need maximum flexibility; most new papers ship PyTorch code.
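Since Hugging Face comes up repeatedly: its transformers library (assumed installed) serves PyTorch models by default. A minimal sketch:

# Hugging Face pipelines load PyTorch weights by default
from transformers import pipeline

classifier = pipeline('sentiment-analysis')  # downloads a default model on first use
print(classifier('PyTorch makes research code feel like plain Python.'))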
Choose TensorFlow/Keras When:
Production deployment, mobile/edge inference (TF Lite), browser ML (TF.js), rapid prototyping with Keras, or when you need mature serving infrastructure (TF Serving); a conversion sketch follows.
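A minimal sketch of the TF Lite path mentioned above, converting the Keras model from earlier (the output file name is illustrative):

import tensorflow as tf

# Convert a trained Keras model for mobile/edge deployment
converter = tf.lite.TFLiteConverter.from_keras_model(model)
tflite_model = converter.convert()
with open('model.tflite', 'wb') as f:
    f.write(tflite_model)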
The Trend:
PyTorch dominates research. Most LLM and Hugging Face work is PyTorch. TensorFlow remains strong in production ML systems at Google-scale companies.
Key Takeaways
- PyTorch is the research standard; TensorFlow/Keras excels in production deployment
- Tensor operations are nearly identical between frameworks
- PyTorch requires explicit training loops; Keras abstracts them with model.fit()
- Both support GPU acceleration with minimal code changes
- Learn PyTorch first if working with Hugging Face or modern AI research