Intermediate · 35 min · Full Guide

Neural Networks

Understanding neural networks, neurons, layers, and how they process information

What are Neural Networks?

Neural Networks are computing systems inspired by the biological neural networks in animal brains. They consist of interconnected nodes (neurons) organized in layers that can learn to perform tasks by considering examples, without being programmed with task-specific rules.

🧠 Biological Inspiration:

Just like neurons in the brain receive signals, process them, and pass them on, artificial neurons receive inputs, apply transformations, and produce outputs.

Neural Network Architecture

📥 Input Layer

Receives the raw data (features). Each neuron represents one feature.

[Diagram: input layer with 3 neurons]
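For instance, with a made-up three-feature example (house size, bedrooms, age), each feature maps one-to-one onto an input neuron:

// Hypothetical example: three features feed a three-neuron input layer
const house = { size: 120, bedrooms: 3, age: 15 }; // assumed example data
const inputLayer = [house.size, house.bedrooms, house.age]; // one neuron per feature
console.log(inputLayer); // [120, 3, 15]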

🔄 Hidden Layers

Process and transform the data. A network can have multiple hidden layers (deep learning).

[Diagram: hidden layer with 4 neurons]

📤 Output Layer

Produces the final prediction or classification.

[Diagram: output layer with 2 outputs]

How a Neuron Works

Neuron Components:

  1. Inputs (x): Data from the previous layer or raw features
  2. Weights (w): Learnable parameters that determine each input's importance
  3. Bias (b): Learnable offset parameter
  4. Weighted Sum: z = (w₁×x₁ + w₂×x₂ + ... + wₙ×xₙ) + b
  5. Activation Function: Introduces non-linearity, a = f(z)
  6. Output: The activation a passes to the next layer
// Single Neuron Implementation
class Neuron {
  constructor(inputSize) {
    // Initialize weights randomly
    this.weights = Array(inputSize).fill(0).map(() => Math.random() * 2 - 1);
    this.bias = Math.random() * 2 - 1;
  }

  // Activation functions
  sigmoid(x) {
    return 1 / (1 + Math.exp(-x));
  }

  relu(x) {
    return Math.max(0, x);
  }

  tanh(x) {
    return Math.tanh(x);
  }

  // Forward pass
  forward(inputs, activationFunc = 'sigmoid') {
    // Calculate weighted sum
    const z = inputs.reduce((sum, input, i) => {
      return sum + input * this.weights[i];
    }, this.bias);

    // Apply activation function
    let activation;
    switch(activationFunc) {
      case 'relu':
        activation = this.relu(z);
        break;
      case 'tanh':
        activation = this.tanh(z);
        break;
      default:
        activation = this.sigmoid(z);
    }

    return { output: activation, weightedSum: z };
  }
}

// Example usage
const neuron = new Neuron(3);
console.log("Neuron weights:", neuron.weights);
console.log("Neuron bias:", neuron.bias);

const inputs = [0.5, 0.8, 0.2];
const result = neuron.forward(inputs, 'sigmoid');
console.log("Input:", inputs);
console.log("Weighted sum:", result.weightedSum.toFixed(4));
console.log("Output:", result.output.toFixed(4));

Activation Functions Visualized

Sigmoid

σ(x) = 1/(1+e⁻ˣ)

Range: (0, 1)

Use: Binary classification (output layer)

Property: Probability-like output

ReLU

f(x) = max(0, x)

Range: [0, ∞)

Use: Hidden layers

Property: Fast to compute

Tanh

f(x) = (eˣ-e⁻ˣ)/(eˣ+e⁻ˣ)

Range: (-1, 1)

Use: Hidden layers

Property: Zero-centered output
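To see these properties concretely, here is a quick sketch that evaluates all three functions at a few sample points, reusing the same formulas as the Neuron class above:

// Compare activation functions at sample inputs
const sigmoid = x => 1 / (1 + Math.exp(-x));
const relu = x => Math.max(0, x);
const tanh = x => Math.tanh(x);

for (const x of [-2, -1, 0, 1, 2]) {
  console.log(
    `x=${x}  sigmoid=${sigmoid(x).toFixed(3)}  relu=${relu(x).toFixed(3)}  tanh=${tanh(x).toFixed(3)}`
  );
}
// sigmoid stays in (0, 1), relu clips negatives to 0, tanh is zero-centered in (-1, 1)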

Complete Neural Network Implementation

Let's build a multi-layer neural network from scratch:

// Full Neural Network Implementation
class NeuralNetwork {
  constructor(inputSize, hiddenSize, outputSize, learningRate = 0.1) {
    this.learningRate = learningRate;
    
    // Initialize weights for hidden layer
    this.weightsInputHidden = this.initializeWeights(inputSize, hiddenSize);
    this.biasHidden = Array(hiddenSize).fill(0);
    
    // Initialize weights for output layer
    this.weightsHiddenOutput = this.initializeWeights(hiddenSize, outputSize);
    this.biasOutput = Array(outputSize).fill(0);
  }

  initializeWeights(rows, cols) {
    return Array(rows).fill(0).map(() =>
      Array(cols).fill(0).map(() => Math.random() * 2 - 1)
    );
  }

  sigmoid(x) {
    return 1 / (1 + Math.exp(-x));
  }

  // Derivative of sigmoid in terms of its output:
  // if a = sigmoid(z), then da/dz = a * (1 - a).
  // Callers pass the activated value, not the raw weighted sum.
  sigmoidDerivative(x) {
    return x * (1 - x);
  }

  // Forward propagation
  forward(inputs) {
    // Input to hidden layer
    this.hidden = [];
    for (let i = 0; i < this.weightsInputHidden[0].length; i++) {
      let sum = this.biasHidden[i];
      for (let j = 0; j < inputs.length; j++) {
        sum += inputs[j] * this.weightsInputHidden[j][i];
      }
      this.hidden.push(this.sigmoid(sum));
    }

    // Hidden to output layer
    this.output = [];
    for (let i = 0; i < this.weightsHiddenOutput[0].length; i++) {
      let sum = this.biasOutput[i];
      for (let j = 0; j < this.hidden.length; j++) {
        sum += this.hidden[j] * this.weightsHiddenOutput[j][i];
      }
      this.output.push(this.sigmoid(sum));
    }

    return this.output;
  }

  // Backward propagation (training)
  backward(inputs, target) {
    // Calculate output layer error
    const outputErrors = this.output.map((out, i) => target[i] - out);
    const outputDeltas = outputErrors.map((error, i) =>
      error * this.sigmoidDerivative(this.output[i])
    );

    // Calculate hidden layer error
    const hiddenErrors = this.hidden.map((_, i) => {
      let error = 0;
      for (let j = 0; j < this.output.length; j++) {
        error += outputDeltas[j] * this.weightsHiddenOutput[i][j];
      }
      return error;
    });
    const hiddenDeltas = hiddenErrors.map((error, i) =>
      error * this.sigmoidDerivative(this.hidden[i])
    );

    // Update weights and biases (hidden to output)
    for (let i = 0; i < this.weightsHiddenOutput.length; i++) {
      for (let j = 0; j < this.weightsHiddenOutput[i].length; j++) {
        this.weightsHiddenOutput[i][j] +=
          this.learningRate * outputDeltas[j] * this.hidden[i];
      }
    }
    for (let i = 0; i < this.biasOutput.length; i++) {
      this.biasOutput[i] += this.learningRate * outputDeltas[i];
    }

    // Update weights and biases (input to hidden)
    for (let i = 0; i < this.weightsInputHidden.length; i++) {
      for (let j = 0; j < this.weightsInputHidden[i].length; j++) {
        this.weightsInputHidden[i][j] +=
          this.learningRate * hiddenDeltas[j] * inputs[i];
      }
    }
    for (let i = 0; i < this.biasHidden.length; i++) {
      this.biasHidden[i] += this.learningRate * hiddenDeltas[i];
    }
  }

  // Train the network
  train(trainingData, epochs) {
    for (let epoch = 0; epoch < epochs; epoch++) {
      let totalLoss = 0;
      
      for (const data of trainingData) {
        const output = this.forward(data.input);
        this.backward(data.input, data.target);
        
        // Calculate loss (MSE)
        const loss = output.reduce((sum, out, i) =>
          sum + Math.pow(data.target[i] - out, 2), 0) / output.length;
        totalLoss += loss;
      }
      
      if (epoch % 1000 === 0) {
        console.log(`Epoch ${epoch}: Loss = ${(totalLoss / trainingData.length).toFixed(6)}`);
      }
    }
  }

  predict(inputs) {
    const output = this.forward(inputs);
    return output.map(o => Math.round(o));
  }
}

// Example: XOR Problem (not linearly separable)
const trainingData = [
  { input: [0, 0], target: [0] },
  { input: [0, 1], target: [1] },
  { input: [1, 0], target: [1] },
  { input: [1, 1], target: [0] }
];

console.log("Training Neural Network on XOR problem...");
const nn = new NeuralNetwork(2, 4, 1, 0.5);
nn.train(trainingData, 10000);

console.log("\nTesting:");
trainingData.forEach(data => {
  const prediction = nn.forward(data.input);
  console.log(`Input: [${data.input}] → Prediction: ${prediction[0].toFixed(4)} (Target: ${data.target[0]})`);
});

🎯 Why XOR is Important:

XOR cannot be solved by a single neuron (linear classifier). It requires at least one hidden layer, demonstrating the power of neural networks to learn non-linear patterns!
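To see the contradiction directly, suppose a single neuron with threshold t could compute XOR from its weighted sum z = w₁×x₁ + w₂×x₂ + b. The four cases would require:

  • Input (0, 0) → 0: b ≤ t
  • Input (0, 1) → 1: w₂ + b > t
  • Input (1, 0) → 1: w₁ + b > t
  • Input (1, 1) → 0: w₁ + w₂ + b ≤ t

Adding the two middle conditions gives w₁ + w₂ + 2b > 2t, while adding the first and last gives w₁ + w₂ + 2b ≤ 2t. No choice of weights satisfies both, so a hidden layer is required.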

Training Process: Backpropagation

How Neural Networks Learn:

  1. Forward Pass: Input flows through the network and produces an output
  2. Calculate Loss: Compare the prediction with the actual target
  3. Backward Pass: The error propagates backward through the layers
  4. Update Weights: Adjust each weight using gradient descent
  5. Repeat: Iterate until the network converges
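Step 4 is the same single-weight update rule that backward() applies to every weight in the network. As a minimal sketch with made-up numbers (η is the learning rate, δ the receiving neuron's error term):

// One gradient-descent step for a single weight: w ← w + η × δ × input
const eta = 0.1;    // learning rate η
let w = 0.5;        // current weight
const input = 0.8;  // activation feeding this weight
const delta = 0.12; // error term δ of the receiving neuron (made-up value)
w += eta * delta * input;
console.log(w.toFixed(4)); // 0.5096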

Common Neural Network Architectures

Feedforward Neural Network (FNN)

Information flows in one direction from input to output.

Use cases: Classification, regression, pattern recognition

Convolutional Neural Network (CNN)

Specialized for processing grid-like data (images).

Use cases: Image recognition, object detection, computer vision

Recurrent Neural Network (RNN)

Has memory of previous inputs (feedback loops).

Use cases: Text generation, time series, speech recognition

Transformer

Uses attention mechanism to process sequential data.

Use cases: Language models (GPT), translation, NLP

💡 Key Takeaways

  • Neural networks are inspired by biological neurons
  • Layers: Input, hidden (processing), and output
  • Neurons apply weights, bias, and activation functions
  • Backpropagation is how networks learn from errors
  • Activation functions introduce non-linearity (Sigmoid, ReLU, Tanh)
  • Multiple layers enable learning complex, non-linear patterns

📚 Next Steps

Continue your neural network journey:

  • Deep Learning: Networks with many hidden layers
  • Convolutional Networks: Specialized for image processing
  • Optimization Techniques: Adam, RMSprop, learning rate schedules