# 🚀 VERTICAL SCALING - COMPLETE GUIDE (0 to 100)
# Everything You Need to Know About Vertical Scaling in JavaScript/TypeScript
================================================================================
TABLE OF CONTENTS
================================================================================
1. WHAT IS VERTICAL SCALING?
2. VERTICAL VS HORIZONTAL SCALING
3. NODEJS CLUSTERING
4. LOAD BALANCING
5. MEMORY OPTIMIZATION
6. CPU OPTIMIZATION
7. DATABASE OPTIMIZATION
8. CACHING STRATEGIES
9. CONNECTION POOLING
10. RATE LIMITING
11. COMPRESSION & GZIP
12. WORKER THREADS
13. MONITORING & PROFILING
14. PRODUCTION DEPLOYMENT
15. COMPLETE EXAMPLES
16. TROUBLESHOOTING
================================================================================
1. WHAT IS VERTICAL SCALING?
================================================================================
VERTICAL SCALING (Scale Up):
Increasing the power of a single server.
More CPU cores, more RAM, faster storage, better network.
Example: 4GB RAM → 16GB RAM, 2 CPUs → 8 CPUs
HOW IT WORKS:
Single Server with more power
↓
Can handle more requests simultaneously
↓
Better throughput and performance
VERTICAL SCALING BENEFITS:
✅ Simpler deployment (one server)
✅ No data consistency issues
✅ Easier session management
✅ Lower latency (everything on one machine)
✅ Easier debugging
✅ No network communication overhead
VERTICAL SCALING LIMITATIONS:
❌ Limited by hardware (can't scale infinitely)
❌ Single point of failure
❌ Higher cost per unit (diminishing returns)
❌ Eventually you run out of resources
❌ No true fault tolerance
WHEN TO USE VERTICAL SCALING:
✅ Early stage (startup, MVP)
✅ Simple applications
✅ High performance needed locally
✅ Database-heavy applications
✅ Budget-conscious (initially cheaper)
WHEN NOT TO USE VERTICAL SCALING:
❌ Need redundancy/high availability
❌ Application hitting hardware limits
❌ Need geographic distribution
❌ Large-scale traffic spikes
❌ Long-term growth strategy
================================================================================
2. VERTICAL VS HORIZONTAL SCALING
================================================================================
VERTICAL SCALING (Scale Up):
- Increase power of single server
- More CPU, RAM, Storage
- Easy to implement
- Simpler architecture
- Limited ceiling
- Single point of failure
HORIZONTAL SCALING (Scale Out):
- Add more servers
- Distributed system
- Complex to implement
- Load balancing needed
- Much higher ceiling (keep adding servers)
- Built-in redundancy
COMPARISON TABLE:
Feature | Vertical | Horizontal
---------------------|-----------------|------------------
Complexity | Simple | Complex
Cost | High per unit | Low per unit
Fault Tolerance | No | Yes
Latency | Low | Higher
Setup Time | Fast | Slower
Database Scaling | Limited | Needs replication
Session Management | Simpler | Complex
Debugging | Easier | Harder
Learning Curve | Easy | Steep
Scalability Limit | Hardware ceiling | Practically unbounded
HYBRID APPROACH (RECOMMENDED):
1. Start with vertical scaling (1-2 servers)
2. Add clustering on single server (Node.js cluster)
3. Add caching (Redis)
4. Add connection pooling
5. When hitting limits, go horizontal (multiple servers)
TYPICAL GROWTH PATH:
Single Server
↓
Single Server + Clustering + Caching
↓
Load Balanced Servers + Database Replication
↓
Microservices + Kubernetes
================================================================================
3. NODEJS CLUSTERING
================================================================================
WHAT IS NODEJS CLUSTERING?
Multiple Node.js processes running on single machine.
Utilizes all CPU cores efficiently.
Each process handles separate requests.
No shared memory between processes.
WHY NODEJS CLUSTERING?
Node.js runs JavaScript on a single thread, so one process keeps only one CPU core busy.
Modern servers have 4, 8, 16+ cores.
Clustering allows using all cores.
Massive performance improvement.
NODEJS CLUSTERING ARCHITECTURE:
Client Requests
↓
Primary Process (distributes connections, round robin by default)
├─ Worker 1 (shares port 8008)
├─ Worker 2 (shares port 8008)
├─ Worker 3 (shares port 8008)
└─ Worker 4 (shares port 8008)
CLUSTERING BASICS:
Import cluster module:
import cluster from 'cluster';
import os from 'os';
Get number of CPUs:
const numCPUs = os.cpus().length;
console.log(`Number of CPUs: ${numCPUs}`);
Basic clustering code:
if (cluster.isPrimary) {
// Master process
console.log(`Primary process ${process.pid} running`);
// Fork workers (one per CPU)
for (let i = 0; i < numCPUs; i++) {
cluster.fork();
}
// Handle worker exit
cluster.on('exit', (worker, code, signal) => {
console.log(`Worker ${worker.process.pid} died`);
// Respawn worker
cluster.fork();
});
} else {
// Worker process
console.log(`Worker ${process.pid} running`);
startServer();
}
COMPLETE CLUSTERING EXAMPLE:
import cluster from 'cluster';
import os from 'os';
import express from 'express';
import dotenv from 'dotenv';
dotenv.config();
const port = process.env.PORT || 8008;
const numCPUs = os.cpus().length;
if (cluster.isPrimary) {
console.log(`Primary ${process.pid} starting with ${numCPUs} workers`);
// Fork workers
for (let i = 0; i < numCPUs; i++) {
cluster.fork();
}
// Track workers
const workers = [];
cluster.on('fork', (worker) => {
workers.push(worker);
console.log(`Worker ${worker.process.pid} forked`);
});
// Handle worker death - auto respawn
cluster.on('exit', (worker, code, signal) => {
console.log(`Worker ${worker.process.pid} died (${signal || code})`);
const newWorker = cluster.fork();
console.log(`Respawned worker ${newWorker.process.pid}`);
});
// Graceful shutdown
process.on('SIGTERM', () => {
console.log('Shutting down workers...');
for (const worker of workers) {
worker.kill();
}
process.exit(0);
});
} else {
// Worker code
const app = express();
app.get('/', (req, res) => {
res.json({
message: 'Hello from worker',
pid: process.pid,
timestamp: new Date().toISOString()
});
});
app.listen(port, () => {
console.log(`Worker ${process.pid} listening on port ${port}`);
});
}
CLUSTERING WITH GRACEFUL SHUTDOWN:
import cluster from 'cluster';
import os from 'os';
import express from 'express';
const numCPUs = os.cpus().length;
const port = 8008;
if (cluster.isPrimary) {
console.log(`Primary ${process.pid} started`);
// Fork workers
for (let i = 0; i < numCPUs; i++) {
cluster.fork();
}
const workers = Object.values(cluster.workers); // cluster.workers is a plain object, not a Map
// Graceful shutdown
let shuttingDown = false;
const gracefulShutdown = (signal) => {
shuttingDown = true;
console.log(`Received ${signal}, shutting down gracefully...`);
// Stop accepting new connections
for (const worker of workers) {
worker.send('shutdown');
}
// Wait for workers to finish
setTimeout(() => {
console.log('Force killing workers...');
for (const worker of workers) {
worker.kill();
}
process.exit(0);
}, 30000); // 30 seconds timeout
};
process.on('SIGTERM', () => gracefulShutdown('SIGTERM'));
process.on('SIGINT', () => gracefulShutdown('SIGINT'));
// Respawn dead workers (but not during shutdown)
cluster.on('exit', (worker, code, signal) => {
if (shuttingDown) return;
if (signal) {
console.log(`Worker killed by signal: ${signal}`);
}
console.log('Spawning a new worker');
cluster.fork();
});
} else {
// Worker process
const app = express();
app.get('/', (req, res) => {
res.json({ pid: process.pid });
});
const server = app.listen(port, () => {
console.log(`Worker ${process.pid} listening on ${port}`);
});
// Handle shutdown message from master
process.on('message', (msg) => {
if (msg === 'shutdown') {
console.log(`Worker ${process.pid} received shutdown`);
// Stop accepting new connections
server.close(() => {
console.log(`Worker ${process.pid} closed`);
process.exit(0);
});
// Force exit after timeout
setTimeout(() => {
process.exit(1);
}, 10000);
}
});
}
CLUSTERING BENEFITS:
✅ Uses all CPU cores
✅ 4-8x performance improvement (on 4-8 core machine)
✅ Automatic load balancing (round robin by the primary, or by the OS on Windows)
✅ Auto respawn dead workers
✅ Graceful reload possible
✅ No code changes needed (mostly)
CLUSTERING LIMITATIONS:
❌ No shared state between workers (use Redis - see the sketch below)
❌ More complex debugging
❌ Higher memory usage (each worker is separate process)
❌ Still limited by single machine
❌ Not truly fault tolerant (single machine failure = everything down)
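Because workers share nothing, any state that must be visible to all of them (counters, sessions, caches) has to live in an external store. A minimal sketch, assuming node-redis v4+ and a Redis server on localhost; the /hits route and key name are placeholders, and this code belongs in the worker branch of the cluster setup above:
import { createClient } from 'redis';
import express from 'express';

// Each worker connects to the same Redis instance, so the counter is shared
const redis = createClient({ url: 'redis://localhost:6379' });
await redis.connect();

const app = express();

app.get('/hits', async (req, res) => {
  // INCR is atomic, so concurrent workers never lose an update
  const hits = await redis.incr('global:hits');
  res.json({ hits, pid: process.pid });
});

app.listen(8008);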
CLUSTERING WITH TYPESCRIPT:
tsconfig.json:
{
"compilerOptions": {
"target": "ES2020",
"module": "ESNext",
"moduleResolution": "node",
"esModuleInterop": true,
"allowSyntheticDefaultImports": true
}
}
package.json:
{
"type": "module",
"scripts": {
"dev": "tsx src/app.ts",
"build": "tsc",
"start": "node dist/app.js"
},
"dependencies": {
"express": "^5.0.0"
},
"devDependencies": {
"@types/node": "^20",
"typescript": "^5.0",
"tsx": "^4.0"
}
}
src/app.ts:
import cluster from 'cluster';
import os from 'os';
import express from 'express';
const numCPUs = os.cpus().length;
const port = process.env.PORT || 8008;
if (cluster.isPrimary) {
console.log(`Primary ${process.pid} starting`);
for (let i = 0; i < numCPUs; i++) {
cluster.fork();
}
cluster.on('exit', (worker) => {
console.log(`Worker ${worker.process.pid} died`);
cluster.fork();
});
} else {
const app = express();
app.get('/', (req, res) => {
res.json({ pid: process.pid });
});
app.listen(port, () => {
console.log(`Worker ${process.pid} listening on ${port}`);
});
}
================================================================================
4. LOAD BALANCING
================================================================================
WHAT IS LOAD BALANCING?
Distributing incoming requests across multiple servers/workers.
Ensures no single server is overwhelmed.
Improves overall performance and availability.
LOAD BALANCING STRATEGIES:
1. ROUND ROBIN (Most Common)
Request 1 → Worker 1
Request 2 → Worker 2
Request 3 → Worker 3
Request 4 → Worker 1 (cycles back)
Best for: Equal server power
Code:
const workers = [];
let currentWorker = 0;
function getNextWorker() {
const worker = workers[currentWorker];
currentWorker = (currentWorker + 1) % workers.length;
return worker;
}
2. LEAST CONNECTIONS
Track active connections per worker
Send next request to worker with least connections
Best for: Long-lived connections
Code:
interface WorkerStats {
worker: any;
activeConnections: number;
}
function getLeastBusyWorker(workers: WorkerStats[]) {
return workers.reduce((prev, current) =>
current.activeConnections < prev.activeConnections ? current : prev
).worker;
}
3. IP HASH
Hash client IP address
Same client always goes to same worker
Best for: Session affinity (sticky sessions)
Code:
import crypto from 'crypto';
function getWorkerForIP(ip: string, workers: any[]) {
const hash = crypto.createHash('md5').update(ip).digest('hex');
const index = parseInt(hash.slice(0, 8), 16) % workers.length; // use a hash prefix to stay within safe integer range
return workers[index];
}
4. RANDOM
Random worker selection
Best for: Testing
Code:
function getRandomWorker(workers: any[]) {
return workers[Math.floor(Math.random() * workers.length)];
}
5. WEIGHTED ROUND ROBIN
Some workers handle more load than others
Best for: Mixed hardware
Code:
interface WeightedWorker {
worker: any;
weight: number;
currentWeight: number; // starts at 0
}
// Smooth weighted round robin (the scheme nginx uses)
function getWeightedWorker(workers: WeightedWorker[]) {
const totalWeight = workers.reduce((sum, w) => sum + w.weight, 0);
let selected = workers[0];
for (const w of workers) {
w.currentWeight += w.weight;
if (w.currentWeight > selected.currentWeight) {
selected = w;
}
}
selected.currentWeight -= totalWeight;
return selected.worker;
}
NODEJS CLUSTER LOAD BALANCING:
By default the cluster primary accepts connections and hands them to workers
round robin (SCHED_RR) on every platform except Windows, where the default is
SCHED_NONE and the OS decides which worker gets each connection.
(See the snippet after the list below for switching the policy.)
When to use different strategies:
- Round Robin: Default, works well for most cases
- Least Connections: WebSocket, Server-Sent Events
- IP Hash: Sessions stored locally (not recommended, use Redis)
- Random: Load testing
- Weighted: Different server power
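For reference, the scheduling policy can be switched explicitly; it has to be set before the first worker is forked (or via the NODE_CLUSTER_SCHED_POLICY environment variable). A small sketch:
import cluster from 'cluster';

// Default is cluster.SCHED_RR (round robin in the primary) except on Windows.
// Switch to OS-level distribution before forking:
cluster.schedulingPolicy = cluster.SCHED_NONE;
// or: NODE_CLUSTER_SCHED_POLICY=none node app.js

if (cluster.isPrimary) {
  cluster.fork();
}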
MONITORING LOAD BALANCE:
import cluster from 'cluster';
import http from 'http';
if (cluster.isPrimary) {
const stats = {};
// Track requests per worker
Object.values(cluster.workers).forEach((worker) => {
worker.on('message', (msg) => {
if (msg.type === 'request') {
stats[worker.process.pid] = (stats[worker.process.pid] || 0) + 1;
}
});
});
// Print stats every 10 seconds
setInterval(() => {
console.log('Request distribution:', stats);
}, 10000);
} else {
// Worker: report each handled request to the primary
http.createServer((req, res) => {
process.send({ type: 'request' });
res.end(JSON.stringify({ pid: process.pid }));
}).listen(8008);
}
================================================================================
5. MEMORY OPTIMIZATION
================================================================================
WHY MEMORY OPTIMIZATION?
The V8 heap has a size limit (a few GB by default on 64-bit)
Memory leaks cause crashes
Efficient memory usage = more requests handled
MEMORY LIMITS:
Default: depends on Node version and available RAM (historically ~1.5GB of old space on 64-bit; newer versions size the heap from available memory)
Set custom: node --max-old-space-size=4096 app.js
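The limit actually in effect (including any --max-old-space-size override) can be read at runtime from the built-in v8 module:
import v8 from 'v8';

// heap_size_limit reflects the configured maximum for this process
const { heap_size_limit } = v8.getHeapStatistics();
console.log(`Heap limit: ${(heap_size_limit / 1024 / 1024).toFixed(0)}MB`);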
MEMORY PROFILING:
Check current memory:
console.log(process.memoryUsage());
// Output:
// {
// rss: 51492864, // Resident Set Size (total allocated)
// heapTotal: 9682944, // Total heap size
// heapUsed: 5005824, // Heap currently used
// external: 1003737, // External memory
// arrayBuffers: 0 // ArrayBuffers memory
// }
Memory profiling middleware:
app.use((req, res, next) => {
const start = process.memoryUsage().heapUsed;
res.on('finish', () => {
const end = process.memoryUsage().heapUsed;
const used = (end - start) / 1024 / 1024; // Convert to MB
console.log(`Request memory delta: ${used.toFixed(2)}MB`);
});
next();
});
Monitor heap size:
setInterval(() => {
const mem = process.memoryUsage();
const heapUsedPercent = (mem.heapUsed / mem.heapTotal) * 100;
console.log(`Heap: ${heapUsedPercent.toFixed(2)}%`);
if (heapUsedPercent > 90) {
console.warn('High memory usage!');
}
}, 5000);
MEMORY LEAK DETECTION:
Simple leak detector:
let prevMemory = process.memoryUsage().heapUsed;
let leakCount = 0;
setInterval(() => {
const currentMemory = process.memoryUsage().heapUsed;
const increase = currentMemory - prevMemory;
prevMemory = currentMemory; // compare against the previous sample, not the value at startup
if (increase > 10 * 1024 * 1024) { // grew by more than 10MB since last check
leakCount++;
console.warn(`Potential leak detected! Count: ${leakCount}`);
if (leakCount > 5) {
console.error('Restarting due to memory leak');
process.exit(1); // Respawn in cluster
}
} else {
leakCount = 0;
}
}, 30000); // Check every 30 seconds
MEMORY OPTIMIZATION TECHNIQUES:
1. Object pooling (reuse objects instead of creating new):
class ObjectPool {
private pool: any[] = [];
private size: number;
private createFn: () => any;
constructor(createFn: () => any, size: number = 100) {
this.createFn = createFn;
this.size = size;
for (let i = 0; i < size; i++) {
this.pool.push(createFn());
}
}
acquire() {
return this.pool.pop() || this.createFn();
}
release(obj: any) {
if (this.pool.length < this.size) {
// Reset state before reuse (how to reset depends on the object type;
// for Buffers, reuse as-is or zero-fill instead of nulling keys)
Object.keys(obj).forEach(key => obj[key] = null);
this.pool.push(obj);
}
}
}
Usage:
const bufferPool = new ObjectPool(
() => Buffer.allocUnsafe(1024),
100
);
const buffer = bufferPool.acquire();
// Use buffer
bufferPool.release(buffer);
2. Limit data structures:
// Bad: Unbounded cache
const cache = {};
app.get('/data/:id', (req, res) => {
if (!cache[req.params.id]) {
cache[req.params.id] = expensiveOperation();
}
res.json(cache[req.params.id]);
});
// Good: Bounded cache (LRU)
import LRU from 'lru-cache';
const cache = new LRU({ max: 1000, ttl: 1000 * 60 * 5 }); // 5min TTL
app.get('/data/:id', (req, res) => {
let data = cache.get(req.params.id);
if (!data) {
data = expensiveOperation();
cache.set(req.params.id, data);
}
res.json(data);
});
3. Stream large data instead of buffering:
// Bad: Buffers entire file in memory
app.get('/download', (req, res) => {
const data = fs.readFileSync('large-file.bin');
res.send(data);
});
// Good: Streams file
app.get('/download', (req, res) => {
res.setHeader('Content-Type', 'application/octet-stream');
fs.createReadStream('large-file.bin').pipe(res);
});
4. Clean up event listeners:
// Bad: Memory leak
const array = [];
emitter.on('event', () => {
array.push(new Array(1000000)); // Never removed
});
// Good: Clean up
const handler = () => {
array.push(new Array(1000000));
};
emitter.on('event', handler);
// Later:
emitter.off('event', handler);
5. Use generators for large datasets:
// Bad: Loads all in memory
function* getAllUsers() {
const users = db.all(); // Entire dataset in memory
for (const user of users) {
yield user;
}
}
// Good: Lazy load in batches (async generator, since db.query is async)
async function* getAllUsers() {
let offset = 0;
const batchSize = 100;
while (true) {
const users = await db.query(
`SELECT * FROM users ORDER BY id LIMIT ${batchSize} OFFSET ${offset}`
);
if (users.length === 0) break;
for (const user of users) {
yield user;
}
offset += batchSize;
}
}
MEMORY OPTIMIZATION CHECKLIST:
✅ Set appropriate max heap size
✅ Profile memory usage regularly
✅ Detect and fix memory leaks
✅ Use object pooling for frequently created objects
✅ Implement bounded caches (LRU)
✅ Stream large data instead of buffering
✅ Clean up event listeners
✅ Avoid global state
✅ Use generators for large datasets
✅ Monitor memory in production
================================================================================
6. CPU OPTIMIZATION
================================================================================
WHY CPU OPTIMIZATION?
CPU is limiting factor in many apps
Better CPU usage = more throughput
Node.js is single-threaded (need clustering or workers)
CPU PROFILING:
Measure execution time:
const start = process.hrtime.bigint();
// Do expensive operation
const end = process.hrtime.bigint();
const duration = Number(end - start) / 1000000; // Convert to ms
console.log(`Operation took ${duration}ms`);
CPU profiling middleware:
app.use((req, res, next) => {
const start = process.hrtime.bigint();
res.on('finish', () => {
const end = process.hrtime.bigint();
const durationMs = Number(end - start) / 1000000;
// Log slow requests
if (durationMs > 100) {
console.warn(`Slow request: ${req.method} ${req.url} took ${durationMs}ms`);
}
});
next();
});
CPU OPTIMIZATION TECHNIQUES:
1. Caching results (avoid recomputation):
// Bad: Recomputes every time
app.get('/factorial/:n', (req, res) => {
const result = factorial(req.params.n);
res.json({ result });
});
// Good: Cache results
const cache = new Map();
function factorialCached(n) {
if (cache.has(n)) return cache.get(n);
const result = factorial(n);
cache.set(n, result);
return result;
}
2. Use faster algorithms:
// Slow: O(n²)
function contains(arr, val) {
for (let i = 0; i < arr.length; i++) {
for (let j = i + 1; j < arr.length; j++) {
if (arr[i] + arr[j] === val) return true;
}
}
return false;
}
// Fast: O(n) with Set
function containsFast(arr, val) {
const seen = new Set();
for (const num of arr) {
if (seen.has(val - num)) return true;
seen.add(num);
}
return false;
}
3. Lazy load modules:
// Bad: Loads all modules at startup
import heavyLibrary from 'heavy-library'; // 50MB
import anotherHeavy from 'another-heavy';
import moreStuff from 'more-stuff';
// Good: Lazy load with dynamic import (works in ESM)
let heavyLibrary;
async function getHeavyLibrary() {
if (!heavyLibrary) {
heavyLibrary = (await import('heavy-library')).default; // loaded on first use
}
return heavyLibrary;
}
app.get('/feature1', async (req, res) => {
const lib = await getHeavyLibrary();
res.json(await lib.process());
});
4. Use worker threads for CPU-intensive tasks:
// Bad: Blocks event loop
app.post('/process', (req, res) => {
const result = heavyCPUOperation(req.body);
res.json(result);
});
// Good: Use worker threads
import { Worker } from 'worker_threads';
app.post('/process', (req, res) => {
const worker = new Worker('./worker.js');
worker.on('message', (result) => {
res.json(result);
worker.terminate();
});
worker.on('error', (err) => {
res.status(500).json({ error: err.message });
worker.terminate();
});
worker.postMessage(req.body);
});
// worker.js
import { parentPort } from 'worker_threads';
function heavyCPUOperation(data) {
// Expensive computation
return data.map(x => x * x).reduce((a, b) => a + b);
}
parentPort.on('message', (data) => {
const result = heavyCPUOperation(data);
parentPort.postMessage(result);
});
5. Optimize JSON parsing/stringifying:
// Bad: No optimization
app.post('/data', express.json({ limit: '50mb' }));
// Better: Set limits
app.post('/data', express.json({
limit: '1mb',
strict: true,
reviver: null // Custom reviver if needed
}));
// For large datasets, parse incrementally:
import JSONStream from 'JSONStream';
app.post('/bulk', (req, res) => {
let count = 0;
req.pipe(JSONStream.parse('*'))
.on('data', (obj) => {
count++;
processObject(obj);
if (count % 1000 === 0) {
console.log(`Processed ${count} objects`);
}
})
.on('end', () => {
res.json({ count });
});
});
6. Regex optimization:
// Regexes (especially ones that backtrack) can be costly on hot paths
const regex = /(\w+)@(\w+)\.(\w+)/;
// For simple checks, plain string methods are often faster
function isEmail(str) {
const atIndex = str.indexOf('@');
if (atIndex < 1) return false;
const dotIndex = str.indexOf('.', atIndex);
if (dotIndex <= atIndex + 2) return false;
return str.length > dotIndex + 2;
}
7. Use async/await properly (avoid unnecessary delays):
// Bad: Sequential when could be parallel
async function getData() {
const user = await getUser();
const posts = await getPosts();
const comments = await getComments();
return { user, posts, comments };
}
// Good: Parallel
async function getData() {
const [user, posts, comments] = await Promise.all([
getUser(),
getPosts(),
getComments()
]);
return { user, posts, comments };
}
CPU OPTIMIZATION CHECKLIST:
✅ Profile code to find bottlenecks
✅ Use clustering to utilize all cores
✅ Cache computation results
✅ Use better algorithms (O(n) vs O(n²))
✅ Lazy load modules
✅ Use worker threads for CPU-intensive tasks
✅ Optimize JSON operations
✅ Optimize regex patterns
✅ Run operations in parallel
✅ Avoid blocking event loop
================================================================================
7. DATABASE OPTIMIZATION
================================================================================
WHY DATABASE OPTIMIZATION?
Database queries are often slowest part of app
Optimization can provide 10-100x improvement
Proper indexing is crucial
DATABASE OPTIMIZATION TECHNIQUES:
1. CONNECTION POOLING:
Without pooling: New connection per request = SLOW
With pooling: Reuse connections = FAST
// Bad: New connection each time
app.get('/users', async (req, res) => {
const conn = await mysql.createConnection({
host: 'localhost',
user: 'root',
password: 'password'
});
const result = await conn.query('SELECT * FROM users');
conn.end();
res.json(result);
});
// Good: Connection pool
import { Pool } from 'pg';
const pool = new Pool({
host: 'localhost',
port: 5432,
database: 'mydb',
user: 'postgres',
password: 'password',
max: 20, // Max connections in pool
idleTimeoutMillis: 30000,
connectionTimeoutMillis: 2000,
});
app.get('/users', async (req, res) => {
const client = await pool.connect();
try {
const result = await client.query('SELECT * FROM users');
res.json(result.rows);
} finally {
client.release();
}
});
2. QUERY OPTIMIZATION:
Use indexes:
CREATE INDEX idx_user_email ON users(email);
CREATE INDEX idx_post_user_id ON posts(user_id);
CREATE INDEX idx_post_created ON posts(created_at DESC);
Select only needed columns:
// Bad: Selects everything
SELECT * FROM users;
// Good: Select specific columns
SELECT id, name, email FROM users;
Use JOINs instead of multiple queries:
// Bad: N+1 query problem
const users = await db.query('SELECT * FROM users');
const userDetails = [];
for (const user of users) {
const details = await db.query(`SELECT * FROM user_details WHERE user_id = ${user.id}`);
userDetails.push(details);
}
// Good: Single JOIN query
const result = await db.query(`
SELECT u.*, ud.* FROM users u
LEFT JOIN user_details ud ON u.id = ud.user_id
`);
3. PAGINATION:
// Bad: Load all data
app.get('/posts', async (req, res) => {
const posts = await db.query('SELECT * FROM posts');
res.json(posts);
});
// Good: Paginate
app.get('/posts', async (req, res) => {
const page = req.query.page || 1;
const limit = req.query.limit || 20;
const offset = (page - 1) * limit;
const posts = await db.query(
'SELECT * FROM posts ORDER BY created_at DESC LIMIT $1 OFFSET $2',
[limit, offset]
);
const total = await db.query('SELECT COUNT(*) FROM posts');
res.json({
posts,
total: total.rows[0].count,
page,
pages: Math.ceil(total.rows[0].count / limit)
});
});
4. CACHING QUERY RESULTS:
import { createClient } from 'redis';
const redisClient = createClient();
await redisClient.connect();
async function getUserCached(id) {
// Try cache first
const cached = await redisClient.get(`user:${id}`);
if (cached) {
return JSON.parse(cached);
}
// Not in cache, query database
const user = await db.query('SELECT * FROM users WHERE id = $1', [id]);
// Store in cache for 5 minutes
await redisClient.set(`user:${id}`, JSON.stringify(user), { EX: 300 });
return user;
}
5. BATCH OPERATIONS:
// Bad: Multiple inserts
for (const item of items) {
await db.query(
'INSERT INTO items (name, value) VALUES ($1, $2)',
[item.name, item.value]
);
}
// Good: Single batched insert
// (string interpolation shown for brevity - it is vulnerable to SQL injection;
// prefer the parameterized version below)
const values = items.map(item => `('${item.name}', ${item.value})`).join(',');
await db.query(
`INSERT INTO items (name, value) VALUES ${values}`
);
// Or with prepared statements:
const values = [];
const placeholders = items.map((item, i) => {
values.push(item.name, item.value);
return `($${i*2+1}, $${i*2+2})`;
}).join(',');
await db.query(
`INSERT INTO items (name, value) VALUES ${placeholders}`,
values
);
6. USE EXPLAIN to analyze queries:
EXPLAIN SELECT * FROM posts WHERE user_id = 1 AND created_at > NOW() - INTERVAL '7 days';
Look for:
- Seq Scan (reads the whole table - slow, usually means a missing index)
- Index Scan / Index Only Scan (fast, uses an index)
7. DATABASE CONFIGURATION:
PostgreSQL example:
max_connections = 200
shared_buffers = 256MB
effective_cache_size = 1GB
maintenance_work_mem = 64MB
checkpoint_completion_target = 0.9
wal_buffers = 16MB
default_statistics_target = 100
random_page_cost = 1.1
effective_io_concurrency = 200
work_mem = 4MB
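To confirm which values the running server actually uses (rather than what the config file says), the settings can be inspected with plain SQL, for example:
-- Current values as seen by the running server
SHOW max_connections;
SELECT name, setting, unit FROM pg_settings
WHERE name IN ('shared_buffers', 'work_mem', 'effective_cache_size');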
8. PRISMA OPTIMIZATION:
// Use select to get only needed fields
const user = await prisma.user.findUnique({
where: { id: userId },
select: { id: true, name: true, email: true } // Not all fields
});
// Connection pooling
// .env
DATABASE_URL="postgresql://user:password@localhost:5432/db?schema=public&connection_limit=20"
// Use include wisely
const user = await prisma.user.findUnique({
where: { id: userId },
include: {
posts: true, // Gets all posts
profile: true
}
});
// Better: Limit included data
const user = await prisma.user.findUnique({
where: { id: userId },
include: {
posts: {
take: 5, // Get only 5 posts
select: { // Get only needed fields
id: true,
title: true
}
}
}
});
DATABASE OPTIMIZATION CHECKLIST:
✅ Use connection pooling
✅ Create appropriate indexes
✅ Select only needed columns
✅ Use JOINs instead of N+1 queries
✅ Implement pagination
✅ Cache query results
✅ Use batch operations
✅ Analyze queries with EXPLAIN
✅ Optimize database configuration
✅ Monitor query performance
================================================================================
8. CACHING STRATEGIES
================================================================================
WHY CACHING?
Reduces database load
Faster response times (1ms vs 100ms)
Improves scalability
CACHING LAYERS:
Client Cache (Browser)
↓ (If miss)
CDN Cache
↓ (If miss)
Server Cache (Redis)
↓ (If miss)
Database
REDIS CACHING:
Installation:
npm install redis
Basic setup:
import { createClient } from 'redis';
const client = createClient({
socket: {
host: 'localhost',
port: 6379,
reconnectStrategy: (retries) => Math.min(retries * 50, 500)
}
});
client.on('error', err => console.log('Redis error:', err));
await client.connect();
Basic operations:
// Set cache (expires in 5 minutes)
await client.set('user:1', JSON.stringify(user), { EX: 300 });
// Get cache
const cached = await client.get('user:1');
const user = JSON.parse(cached);
// Delete cache
await client.del('user:1');
// Clear by pattern (KEYS blocks Redis - fine for small datasets, prefer SCAN in production)
const keys = await client.keys('user:*');
if (keys.length > 0) await client.del(keys);
Cache-Aside Pattern (Most Common):
async function getUserWithCache(id) {
// Try cache first
const cached = await client.get(`user:${id}`);
if (cached) return JSON.parse(cached);
// Cache miss, fetch from DB
const user = await db.query('SELECT * FROM users WHERE id = $1', [id]);
// Update cache
await client.set(`user:${id}`, JSON.stringify(user), { EX: 300 });
return user;
}
Write-Through Pattern:
async function updateUser(id, data) {
// Update DB
const user = await db.query(
'UPDATE users SET ... WHERE id = $1 RETURNING *',
[id]
);
// Update cache immediately
await client.set(`user:${id}`, JSON.stringify(user), { EX: 300 });
return user;
}
Write-Behind Pattern (Asynchronous):
const writeQueue = [];
async function updateUserAsync(id, data) {
// Update cache immediately
const user = { id, ...data };
await client.set(`user:${id}`, JSON.stringify(user), { EX: 300 });
// Queue DB update for later
writeQueue.push({ id, data });
return user;
}
// Process queue periodically
setInterval(async () => {
while (writeQueue.length > 0) {
const { id, data } = writeQueue.shift();
await db.query('UPDATE users SET ... WHERE id = $1', [id, data]);
}
}, 5000);
CACHE INVALIDATION:
Time-based expiration (TTL):
await client.set('key', value, { EX: 300 }); // 5 minutes
Event-based invalidation:
app.post('/users/:id', async (req, res) => {
const user = await updateUser(req.params.id, req.body);
// Invalidate cache
await client.del(`user:${req.params.id}`);
res.json(user);
});
Pattern-based invalidation:
// Clear all user caches
const userKeys = await client.keys('user:*');
if (userKeys.length > 0) await client.del(userKeys);
// Clear a user and their related keys
async function deleteUser(id) {
await db.query('DELETE FROM users WHERE id = $1', [id]);
const keys = await client.keys(`user:${id}:*`);
if (keys.length > 0) await client.del(keys);
}
LRU CACHE (Local Memory):
import LRU from 'lru-cache';
const memoryCache = new LRU({
max: 500, // Max items
maxSize: 50 * 1024 * 1024, // 50MB max (maxSize requires sizeCalculation)
sizeCalculation: (value) => JSON.stringify(value).length,
ttl: 1000 * 60 * 5, // 5 minutes
updateAgeOnGet: true
});
app.get('/user/:id', async (req, res) => {
const id = req.params.id;
let user = memoryCache.get(id);
if (!user) {
user = await db.query('SELECT * FROM users WHERE id = $1', [id]);
memoryCache.set(id, user);
}
res.json(user);
});
HTTP CACHING HEADERS:
app.use((req, res, next) => {
// Cache for 1 hour
res.set('Cache-Control', 'public, max-age=3600');
// Or for development (no cache)
res.set('Cache-Control', 'no-store, no-cache, must-revalidate');
next();
});
// Cache specific routes
app.get('/static/*', (req, res) => {
res.set('Cache-Control', 'public, max-age=86400'); // 1 day
// Serve static file
});
app.get('/api/data', (req, res) => {
res.set('Cache-Control', 'private, max-age=300'); // 5 min
// Serve API data
});
// ETag for validation
import crypto from 'crypto';
app.get('/data', (req, res) => {
const data = { /* ... */ };
const etag = crypto.createHash('md5').update(JSON.stringify(data)).digest('hex');
res.set('ETag', etag);
if (req.get('If-None-Match') === etag) {
res.status(304).end(); // Not Modified
} else {
res.json(data);
}
});
CACHE WARMING:
Preload cache at startup:
async function warmCache() {
const users = await db.query('SELECT * FROM users LIMIT 1000');
for (const user of users) {
await client.set(`user:${user.id}`, JSON.stringify(user), { EX: 3600 });
}
console.log(`Warmed cache with ${users.length} users`);
}
app.listen(3000, async () => {
await warmCache();
console.log('Server started');
});
MULTI-LAYER CACHING EXAMPLE:
interface CacheConfig {
local?: { ttl: number; max: number };
redis?: { ttl: number };
}
class MultiLayerCache {
private localCache: LRU;
private redis: any;
constructor(redis: any) {
this.redis = redis;
this.localCache = new LRU({ max: 1000, ttl: 1000 * 60 * 5 });
}
async get(key: string, config: CacheConfig) {
// Try local cache first
let value = this.localCache.get(key);
if (value) return value;
// Try Redis
if (config.redis) {
value = await this.redis.get(key);
if (value) {
this.localCache.set(key, JSON.parse(value));
return JSON.parse(value);
}
}
return null;
}
async set(key: string, value: any, config: CacheConfig) {
// Set local cache
if (config.local) {
this.localCache.set(key, value);
}
// Set Redis
if (config.redis) {
await this.redis.set(key, JSON.stringify(value), { EX: config.redis.ttl });
}
}
async invalidate(pattern: string) {
// Clear local cache matching pattern
for (const key of this.localCache.keys()) {
if (key.match(new RegExp(pattern))) {
this.localCache.delete(key);
}
}
// Clear Redis
const keys = await this.redis.keys(pattern);
if (keys.length > 0) {
await this.redis.del(...keys);
}
}
}
CACHING CHECKLIST:
✅ Use Redis for distributed cache
✅ Implement cache-aside pattern
✅ Set appropriate TTL
✅ Invalidate cache on updates
✅ Use LRU cache for local data
✅ Set HTTP cache headers
✅ Warm cache at startup
✅ Monitor cache hit rate
✅ Use multi-layer caching
✅ Handle cache stampedes
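The last item, cache stampedes (many requests miss the same key at once and all hit the database), can be softened by sharing one in-flight load per key. A minimal sketch built on the getUserWithCache helper from the cache-aside example above:
// Concurrent misses for the same key share a single DB/cache load
const inflight = new Map();

async function getUserDeduped(id) {
  const key = `user:${id}`;
  if (inflight.has(key)) return inflight.get(key);
  const promise = getUserWithCache(id)
    .finally(() => inflight.delete(key)); // clear once settled
  inflight.set(key, promise);
  return promise;
}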
================================================================================
9. CONNECTION POOLING
================================================================================
WHY CONNECTION POOLING?
Creating connections is expensive
Reuse connections across requests
Massive performance improvement
CONNECTION POOL ARCHITECTURE:
Client Request
↓
Pool (Available connections?)
├─ Yes: Use connection
└─ No (and space): Create new
↓
Execute query
↓
Return connection to pool
↓
Client receives response
DATABASE CONNECTION POOLING:
PostgreSQL with pg:
import { Pool } from 'pg';
const pool = new Pool({
user: 'postgres',
password: 'password',
host: 'localhost',
port: 5432,
database: 'mydb',
max: 20, // Max connections
idleTimeoutMillis: 30000, // Close idle after 30s
connectionTimeoutMillis: 2000, // Timeout after 2s
});
app.get('/users', async (req, res) => {
const client = await pool.connect();
try {
const result = await client.query('SELECT * FROM users');
res.json(result.rows);
} catch (err) {
res.status(500).json({ error: err.message });
} finally {
client.release(); // Return to pool
}
});
MySQL with mysql2/promise:
import mysql from 'mysql2/promise';
const pool = mysql.createPool({
host: 'localhost',
user: 'root',
password: 'password',
database: 'mydb',
waitForConnections: true,
connectionLimit: 10,
queueLimit: 0,
enableKeepAlive: true,
keepAliveInitialDelayMs: 0,
});
app.get('/users', async (req, res) => {
const connection = await pool.getConnection();
try {
const [rows] = await connection.query('SELECT * FROM users');
res.json(rows);
} finally {
connection.release();
}
});
Prisma connection pooling:
// .env
DATABASE_URL="postgresql://user:pass@localhost/db?schema=public&connection_limit=20"
// Prisma automatically manages pool
REDIS CONNECTION POOLING:
Single shared client (usually enough):
import { createClient } from 'redis';
const client = createClient();
await client.connect();
// node-redis queues and pipelines commands over one connection,
// so a single shared client covers most workloads.
Pool of clients (useful for blocking commands or very high concurrency).
A sketch using the generic-pool package (not part of node-redis itself):
import { createClient } from 'redis';
import genericPool from 'generic-pool';
const redisPool = genericPool.createPool({
create: async () => {
const c = createClient({ url: 'redis://localhost:6379' });
await c.connect();
return c;
},
destroy: (c) => c.quit()
}, { max: 10, min: 2 });
app.get('/data', async (req, res) => {
const client = await redisPool.acquire();
try {
const data = await client.get('key');
res.json(JSON.parse(data));
} finally {
await redisPool.release(client);
}
});
MONITORING CONNECTION POOL:
PostgreSQL:
import { Pool } from 'pg';
const pool = new Pool({ /* ... */ });
setInterval(() => {
console.log({
total: pool.totalCount,
idle: pool.idleCount,
waiting: pool.waitingCount
});
}, 10000);
Custom pool wrapper with stats:
class PoolWithStats {
private pool: Pool;
private stats = { acquired: 0, released: 0, errors: 0 };
constructor(config) {
this.pool = new Pool(config);
}
async acquire() {
try {
const client = await this.pool.connect();
this.stats.acquired++;
return client;
} catch (err) {
this.stats.errors++;
throw err;
}
}
async release(client) {
client.release();
this.stats.released++;
}
getStats() {
return this.stats;
}
}
CONNECTION POOL SETTINGS:
Recommended settings:
max: 20-50 // Max connections
idleTimeoutMillis: 30000 // Close idle after 30s
connectionTimeoutMillis: 2000 // Timeout waiting for connection
For high-traffic app:
max: 50-100
idleTimeoutMillis: 60000
connectionTimeoutMillis: 5000
For low-traffic app:
max: 5-10
idleTimeoutMillis: 10000
connectionTimeoutMillis: 2000
CONNECTION POOL ISSUES:
Connection leak:
// Bad: Doesn't release connection
app.get('/users', async (req, res) => {
const client = await pool.connect();
const result = await client.query('SELECT * FROM users');
res.json(result.rows);
// Never releases!
});
// Good: Always release
app.get('/users', async (req, res) => {
const client = await pool.connect();
try {
const result = await client.query('SELECT * FROM users');
res.json(result.rows);
} finally {
client.release();
}
});
Connection timeout:
// Too few connections for load
const pool = new Pool({ max: 5 }); // Too small!
// Load test: 100 requests, only 5 connections available
// Result: Queue of 95 requests waiting
// Solution: Increase max
const pool = new Pool({ max: 50 });
CONNECTION POOLING BEST PRACTICES:
✅ Always release connections in finally block
✅ Set appropriate max connections
✅ Monitor pool stats
✅ Use connection timeouts
✅ Close idle connections
✅ Handle connection errors
✅ Don't hoard connections
✅ Use health checks (see the sketch below)
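A health check can be as small as a periodic trivial query against the pool; a minimal sketch using the pg pool from above:
// Periodically verify the pool can still reach the database
setInterval(async () => {
  try {
    await pool.query('SELECT 1');
  } catch (err) {
    console.error('Database health check failed:', err.message);
    // e.g. mark /health as degraded or trigger an alert here
  }
}, 15000);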
================================================================================
10. RATE LIMITING
================================================================================
WHY RATE LIMITING?
Prevents abuse and DDoS attacks
Protects resources from overuse
Fair usage for all users
RATE LIMITING STRATEGIES:
1. FIXED WINDOW:
Divide time into fixed windows (1 minute)
Allow N requests per window
Simple but can have spike at window boundary
Implementation:
class FixedWindowRateLimiter {
private windows = new Map(); // note: prune old windows periodically or this map grows without bound
isAllowed(key: string, limit: number): boolean {
const now = Date.now();
const windowStart = Math.floor(now / 60000) * 60000; // 1 min window
const windowKey = `${key}:${windowStart}`;
const count = (this.windows.get(windowKey) || 0) + 1;
this.windows.set(windowKey, count);
return count <= limit;
}
}
2. SLIDING WINDOW:
Track timestamps of requests
More accurate than fixed window
Implementation:
class SlidingWindowRateLimiter {
private requests = new Map<string, number[]>();
isAllowed(key: string, limit: number, windowMs: number): boolean {
const now = Date.now();
const windowStart = now - windowMs;
let requests = this.requests.get(key) || [];
// Remove old requests outside window
requests = requests.filter(ts => ts > windowStart);
if (requests.length < limit) {
requests.push(now);
this.requests.set(key, requests);
return true;
}
return false;
}
}
3. TOKEN BUCKET:
Tokens replenish at fixed rate
Burst allowed if tokens available
Best for variable rate limiting
Implementation:
class TokenBucketLimiter {
private buckets = new Map<string, {
tokens: number;
lastRefillTime: number;
}>();
constructor(
private refillRate: number = 10, // tokens per second
private maxTokens: number = 100
) {}
isAllowed(key: string, tokensNeeded: number = 1): boolean {
const now = Date.now();
let bucket = this.buckets.get(key) || {
tokens: this.maxTokens,
lastRefillTime: now
};
// Refill tokens based on time passed
const timePassed = (now - bucket.lastRefillTime) / 1000;
bucket.tokens = Math.min(
this.maxTokens,
bucket.tokens + timePassed * this.refillRate
);
bucket.lastRefillTime = now;
if (bucket.tokens >= tokensNeeded) {
bucket.tokens -= tokensNeeded;
this.buckets.set(key, bucket);
return true;
}
this.buckets.set(key, bucket);
return false;
}
}
REDIS-BASED RATE LIMITING (Production):
import { createClient } from 'redis';
const redis = createClient();
await redis.connect();
async function rateLimit(key: string, limit: number, windowSeconds: number) {
const current = await redis.incr(key);
if (current === 1) {
// First request in window, set expiry
await redis.expire(key, windowSeconds);
}
return current <= limit;
}
app.use(async (req, res, next) => {
const key = `rate:${req.ip}`;
const allowed = await rateLimit(key, 100, 60); // 100 requests per 60 seconds
if (!allowed) {
return res.status(429).json({ error: 'Too many requests' });
}
next();
});
RATE LIMITING MIDDLEWARE:
import rateLimit from 'express-rate-limit';
import RedisStore from 'rate-limit-redis';
import { createClient } from 'redis';
const redis = createClient();
await redis.connect();
const limiter = rateLimit({
store: new RedisStore({
sendCommand: (cmd, ...args) => redis.sendCommand([cmd, ...args])
}),
windowMs: 15 * 60 * 1000, // 15 minutes
max: 100, // Limit each IP to 100 requests per windowMs
standardHeaders: true, // Return rate limit info in headers
legacyHeaders: false, // Disable X-RateLimit-* headers
});
app.use(limiter);
// Or apply to specific routes:
app.post('/api/login', limiter, (req, res) => {
// ...
});
DIFFERENT LIMITS FOR DIFFERENT ENDPOINTS:
const loginLimiter = rateLimit({
windowMs: 15 * 60 * 1000,
max: 5, // 5 login attempts per 15 minutes
});
const apiLimiter = rateLimit({
windowMs: 60 * 1000,
max: 100, // 100 API calls per minute
});
const downloadLimiter = rateLimit({
windowMs: 60 * 60 * 1000,
max: 10, // 10 downloads per hour
});
app.post('/login', loginLimiter, (req, res) => {
// Login logic
});
app.get('/api/data', apiLimiter, (req, res) => {
// API logic
});
app.get('/download', downloadLimiter, (req, res) => {
// Download logic
});
USER-BASED RATE LIMITING:
const userLimiter = rateLimit({
keyGenerator: (req) => req.user?.id || req.ip, // Use user ID if logged in
windowMs: 60 * 60 * 1000,
max: (req) => {
// Different limits for different user types
if (req.user?.isPremium) return 1000;
if (req.user?.isAdmin) return 10000;
return 100;
}
});
app.use(userLimiter);
SLIDING WINDOW REDIS IMPLEMENTATION:
async function slidingWindowRateLimit(
key: string,
limit: number,
windowSeconds: number
) {
const now = Date.now();
const windowStart = now - windowSeconds * 1000;
// Remove entries that fell out of the window (node-redis v4 method names)
await redis.zRemRangeByScore(key, '-inf', windowStart);
// Count requests still inside the window
const count = await redis.zCard(key);
if (count < limit) {
// Record the current request
await redis.zAdd(key, { score: now, value: `${now}-${Math.random()}` });
// Expire the whole key shortly after the window passes
await redis.expire(key, windowSeconds + 1);
return true;
}
return false;
}
CUSTOM RATE LIMITER CLASS:
class RateLimiter {
private redis: any;
constructor(redis: any) {
this.redis = redis;
}
async check(
key: string,
limit: number,
windowSeconds: number
): Promise<{ allowed: boolean; remaining: number; resetAt: number }> {
const current = await this.redis.incr(key);
let ttl = await this.redis.ttl(key);
if (current === 1) {
await this.redis.expire(key, windowSeconds);
ttl = windowSeconds;
}
const allowed = current <= limit;
const remaining = Math.max(0, limit - current);
const resetAt = Date.now() + ttl * 1000;
return { allowed, remaining, resetAt };
}
}
// Usage:
const limiter = new RateLimiter(redis);
app.use(async (req, res, next) => {
const key = `rate:${req.ip}`;
const result = await limiter.check(key, 100, 60);
res.set({
'RateLimit-Limit': '100',
'RateLimit-Remaining': result.remaining.toString(),
'RateLimit-Reset': Math.floor(result.resetAt / 1000).toString()
});
if (!result.allowed) {
return res.status(429).json({
error: 'Too many requests',
resetAt: new Date(result.resetAt)
});
}
next();
});
RATE LIMITING BEST PRACTICES:
✅ Use Redis for distributed systems
✅ Set appropriate limits per endpoint
✅ Different limits for authenticated users
✅ Return rate limit headers
✅ Graceful degradation (warn before limit)
✅ DDoS protection
✅ Monitor rate limit hits
✅ Adjust limits based on traffic
================================================================================
11. COMPRESSION & GZIP
================================================================================
WHY COMPRESSION?
Reduces response size by 60-80%
Faster transfer over network
Minimal CPU cost
GZIP COMPRESSION:
Basic setup:
import compression from 'compression';
app.use(compression());
Advanced configuration:
app.use(compression({
level: 6, // 1-9 (default 6, higher = more compression, slower)
threshold: 1000, // Only compress responses > 1KB
filter: (req, res) => {
// Don't compress if request has 'no-compression' header
if (req.headers['x-no-compression']) {
return false;
}
return compression.filter(req, res);
}
}));
COMPRESSION SETTINGS:
Level 1: Fastest, least compression
Level 6: Default (good balance)
Level 9: Slowest, best compression
Choose based on:
- High traffic: Level 1-4 (CPU matters)
- Normal traffic: Level 6 (default)
- Low traffic: Level 9 (compression matters)
Example:
// High traffic, prioritize speed
app.use(compression({ level: 1 }));
// Normal traffic
app.use(compression({ level: 6 }));
// Low traffic, prioritize size
app.use(compression({ level: 9 }));
COMPRESSION WITH BROTLI (better ratios than GZIP):
The old iltorb package is deprecated; modern Node ships Brotli in the built-in zlib module.
A minimal per-route sketch that negotiates Brotli and falls back to the gzip middleware
(buildReport() is a placeholder for whatever produces the response body):
import zlib from 'zlib';
import compression from 'compression';
app.get('/api/report', (req, res) => {
const acceptEncoding = req.get('accept-encoding') || '';
const payload = JSON.stringify(buildReport());
if (acceptEncoding.includes('br')) {
// Compress the whole payload with Brotli (quality 4 is a good speed/size trade-off)
const compressed = zlib.brotliCompressSync(payload, {
params: { [zlib.constants.BROTLI_PARAM_QUALITY]: 4 }
});
res.set('Content-Encoding', 'br');
res.set('Content-Type', 'application/json');
res.set('Vary', 'Accept-Encoding');
return res.send(compressed);
}
// Otherwise let the regular gzip middleware handle it
return compression()(req, res, () => res.json(JSON.parse(payload)));
});
In practice it is often simpler to let a reverse proxy (nginx with the brotli module, or a CDN)
handle Brotli and keep plain gzip in the app.
SELECTIVE COMPRESSION:
app.use(compression({
filter: (req, res) => {
// Don't compress these content types
const noCompress = ['image/jpeg', 'image/png', 'video/mp4'];
const contentType = res.get('Content-Type') || '';
if (noCompress.some(type => contentType.includes(type))) {
return false;
}
// Compress everything else
return compression.filter(req, res);
}
}));
COMPRESSION BENCHMARKS:
Response size comparison:
- Original JSON: 100KB
- GZIP Level 1: 15KB (85% reduction, 2ms compression)
- GZIP Level 6: 12KB (88% reduction, 10ms compression)
- GZIP Level 9: 11KB (89% reduction, 50ms compression)
- Brotli Level 4: 9KB (91% reduction, 30ms compression)
Network transfer with 10Mbps connection:
- Uncompressed 100KB: 80ms
- GZIP 12KB: 10ms (8x faster)
- Brotli 9KB: 7ms (11x faster)
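The easiest way to verify numbers like these on a real endpoint is curl; the URL below is a placeholder for one of your routes:
# Confirm the server negotiates compression (look for Content-Encoding in the headers)
curl -s -D - -o /dev/null -H 'Accept-Encoding: gzip' http://localhost:8008/api/data
# Compare transferred bytes with and without compression
curl -s -H 'Accept-Encoding: gzip' http://localhost:8008/api/data | wc -c
curl -s http://localhost:8008/api/data | wc -c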
COMPRESSION HEADERS:
// Server sends compressed response
res.set('Content-Encoding', 'gzip');
res.set('Content-Type', 'application/json');
res.set('Vary', 'Accept-Encoding'); // Tell caches to vary by encoding
// Client accepts compression
req.headers['accept-encoding']; // 'gzip, deflate, br'
COMPRESSION WITH NGINX (Recommended for production):
nginx.conf:
http {
gzip on;
gzip_vary on;
gzip_proxied any;
gzip_comp_level 6;
gzip_types text/plain text/css text/xml text/javascript application/json application/javascript application/xml+rss application/rss+xml font/truetype font/opentype application/vnd.ms-fontobject image/svg+xml;
gzip_disable "msie6";
}
COMPRESSION CHECKLIST:
✅ Enable compression for all responses > 1KB
✅ Use GZIP or Brotli
✅ Set appropriate compression level
✅ Don't compress already compressed (images, video)
✅ Set Cache-Control headers
✅ Measure actual performance improvement
✅ Monitor CPU usage
✅ Use Nginx for production
✅ Test on real devices
================================================================================
12. WORKER THREADS
================================================================================
WHY WORKER THREADS?
CPU-intensive tasks block event loop
Worker threads allow parallel CPU processing
Better than clustering for CPU work
WORKER THREADS VS CLUSTERING:
Clustering:
- Separate Node.js processes
- Each has own memory
- Good for I/O operations
- Can use all CPU cores
- More memory overhead
Worker Threads:
- Lightweight threads within single process
- Shared memory available
- Better for CPU-intensive work
- Lower memory overhead than clustering
- Pool size is usually capped at the CPU count (extra threads just add contention)
BASIC WORKER THREAD:
worker.js:
import { parentPort } from 'worker_threads';
function fibonacci(n) {
if (n <= 1) return n;
return fibonacci(n - 1) + fibonacci(n - 2);
}
parentPort.on('message', (n) => {
const result = fibonacci(n);
parentPort.postMessage(result);
});
main.js:
import { Worker } from 'worker_threads';
const worker = new Worker('./worker.js');
worker.on('message', (result) => {
console.log(`Fibonacci result: ${result}`);
worker.terminate();
});
worker.postMessage(45);
WORKER POOL:
import { Worker } from 'worker_threads';
class WorkerPool {
private workers: Worker[] = [];
private queue: any[] = [];
private activeWorkers = new Set();
constructor(workerPath: string, poolSize: number = 4) {
for (let i = 0; i < poolSize; i++) {
const worker = new Worker(workerPath);
worker.on('message', (result) => {
this.activeWorkers.delete(worker);
if (this.queue.length > 0) {
const { task, resolve } = this.queue.shift();
this.executeTask(worker, task, resolve);
}
});
this.workers.push(worker);
}
}
private executeTask(worker: Worker, task: any, resolve: any) {
this.activeWorkers.add(worker);
worker.once('message', resolve);
worker.postMessage(task);
}
async run(task: any): Promise<any> {
return new Promise((resolve) => {
// Find available worker
const available = this.workers.find(w => !this.activeWorkers.has(w));
if (available) {
this.executeTask(available, task, resolve);
} else {
// Queue task if no workers available
this.queue.push({ task, resolve });
}
});
}
terminate() {
return Promise.all(this.workers.map(w => w.terminate()));
}
}
// Usage:
const pool = new WorkerPool('./worker.js', 4);
app.post('/compute', async (req, res) => {
try {
const result = await pool.run(req.body);
res.json({ result });
} catch (err) {
res.status(500).json({ error: err.message });
}
});
SHARED MEMORY WITH WORKER THREADS:
shared-memory.js (one file, branching on isMainThread):
import { Worker, isMainThread, parentPort } from 'worker_threads';
if (isMainThread) {
// Create shared memory
const sharedBuffer = new SharedArrayBuffer(4);
const sharedArray = new Int32Array(sharedBuffer);
sharedArray[0] = 10;
const worker = new Worker(new URL(import.meta.url)); // re-run this same file as the worker
worker.on('message', () => {
console.log(`Shared value after worker: ${sharedArray[0]}`);
});
worker.postMessage({ sharedBuffer });
setTimeout(() => {
console.log(`Shared value from main: ${sharedArray[0]}`);
}, 100);
} else {
parentPort.on('message', ({ sharedBuffer }) => {
const sharedArray = new Int32Array(sharedBuffer);
sharedArray[0] *= 2; // Modify shared value
parentPort.postMessage('done');
});
}
EXPRESS WITH WORKER THREADS:
app.post('/heavy-computation', async (req, res) => {
const worker = new Worker('./compute.js');
worker.once('message', (result) => {
res.json(result);
worker.terminate();
});
worker.once('error', (err) => {
res.status(500).json({ error: err.message });
worker.terminate();
});
worker.once('exit', (code) => {
if (code !== 0) {
res.status(500).json({ error: `Worker stopped with exit code ${code}` });
}
});
worker.postMessage(req.body);
});
WORKER THREAD BEST PRACTICES:
✅ Use for CPU-intensive operations
✅ Implement worker pool for reusability
✅ Handle worker errors
✅ Terminate workers when done
✅ Use shared memory for large data
✅ Monitor worker performance
✅ Limit pool size (usually CPU count)
================================================================================
13. MONITORING & PROFILING
================================================================================
WHY MONITORING?
Identify performance bottlenecks
Detect memory leaks early
Track system health
MEMORY PROFILING:
Print memory usage:
setInterval(() => {
const mem = process.memoryUsage();
console.log({
rss: `${(mem.rss / 1024 / 1024).toFixed(2)}MB`,
heapTotal: `${(mem.heapTotal / 1024 / 1024).toFixed(2)}MB`,
heapUsed: `${(mem.heapUsed / 1024 / 1024).toFixed(2)}MB`,
external: `${(mem.external / 1024 / 1024).toFixed(2)}MB`
});
}, 5000);
Memory middleware:
app.use((req, res, next) => {
const start = process.memoryUsage().heapUsed;
res.on('finish', () => {
const end = process.memoryUsage().heapUsed;
const delta = (end - start) / 1024 / 1024;
console.log(`Memory delta: ${delta.toFixed(2)}MB`);
});
next();
});
CPU PROFILING:
Request timing:
app.use((req, res, next) => {
const start = process.hrtime.bigint();
res.on('finish', () => {
const end = process.hrtime.bigint();
const duration = Number(end - start) / 1000000; // ms
if (duration > 100) {
console.warn(`Slow: ${req.method} ${req.url} - ${duration.toFixed(2)}ms`);
}
});
next();
});
PRODUCTION MONITORING TOOLS:
1. Node.js built-in tools:
- node --inspect app.js (Chrome DevTools)
- node --prof app.js (CPU profiling)
- node --heap-prof app.js (Heap profiling)
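--prof writes an isolate-*.log file next to the process; it becomes readable with --prof-process:
node --prof app.js                                   # run under load, produces isolate-0x....log
node --prof-process isolate-0x*.log > profile.txt    # human-readable tick report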
2. APM Solutions:
- New Relic
- DataDog
- SignalFx
- Elastic APM
3. Prometheus metrics:
npm install prom-client
import prometheus from 'prom-client';
// Create metrics
const httpDuration = new prometheus.Histogram({
name: 'http_request_duration_seconds',
help: 'Duration of HTTP requests in seconds',
labelNames: ['method', 'route', 'status_code'],
buckets: [0.1, 0.5, 1, 2, 5]
});
const httpRequests = new prometheus.Counter({
name: 'http_requests_total',
help: 'Total HTTP requests',
labelNames: ['method', 'route', 'status_code']
});
// Middleware
app.use((req, res, next) => {
const start = Date.now();
res.on('finish', () => {
const duration = (Date.now() - start) / 1000;
httpDuration.observe({
method: req.method,
route: req.route?.path || req.path,
status_code: res.statusCode
}, duration);
httpRequests.inc({
method: req.method,
route: req.route?.path || req.path,
status_code: res.statusCode
});
});
next();
});
// Expose metrics
app.get('/metrics', async (req, res) => {
res.set('Content-Type', prometheus.register.contentType);
res.end(await prometheus.register.metrics());
});
HEALTH CHECK ENDPOINT:
app.get('/health', (req, res) => {
const mem = process.memoryUsage();
const uptime = process.uptime();
const health = {
status: 'ok',
uptime,
memory: {
rss: mem.rss,
heapTotal: mem.heapTotal,
heapUsed: mem.heapUsed,
heapPercent: (mem.heapUsed / mem.heapTotal * 100).toFixed(2)
},
timestamp: new Date().toISOString()
};
// Check for unhealthy state
if (mem.heapUsed / mem.heapTotal > 0.9) {
health.status = 'degraded';
}
const statusCode = health.status === 'ok' ? 200 : 503;
res.status(statusCode).json(health);
});
LOG AGGREGATION:
Simple logger:
class Logger {
log(level: string, message: string, meta: any = {}) {
console.log(JSON.stringify({
timestamp: new Date().toISOString(),
level,
message,
...meta
}));
}
info(message: string, meta?: any) { this.log('INFO', message, meta); }
error(message: string, meta?: any) { this.log('ERROR', message, meta); }
warn(message: string, meta?: any) { this.log('WARN', message, meta); }
}
const logger = new Logger();
app.use((req, res, next) => {
res.on('finish', () => {
logger.info('HTTP request completed', {
method: req.method,
url: req.url,
status: res.statusCode,
duration: res.getHeader('X-Response-Time')
});
});
next();
});
MONITORING CHECKLIST:
✅ Monitor memory usage continuously
✅ Monitor CPU usage
✅ Track response times
✅ Detect memory leaks
✅ Health check endpoint
✅ Request counting
✅ Error tracking
✅ Log aggregation
✅ Alerting on thresholds
================================================================================
14. PRODUCTION DEPLOYMENT
================================================================================
PRODUCTION SCALING CHECKLIST:
Before deploying:
✅ Enable clustering
✅ Implement connection pooling
✅ Setup caching (Redis)
✅ Add rate limiting
✅ Enable compression
✅ Add monitoring/health checks
✅ Optimize database queries
✅ Set resource limits
✅ Configure logging
✅ Test under load
DOCKER WITH CLUSTERING:
Dockerfile:
FROM node:20-alpine
WORKDIR /app
COPY package*.json ./
RUN npm ci --omit=dev
COPY . .
EXPOSE 8008
CMD ["node", "dist/app.js"]
docker-compose.yml (clustering happens inside the app container via the cluster code above):
version: '3.9'
services:
app:
build: .
environment:
NODE_ENV: production
PORT: 8008
DB_POOL_MAX: 20
REDIS_URL: redis://redis:6379
ports:
- "8008:8008"
depends_on:
- redis
- postgres
restart: unless-stopped
deploy:
resources:
limits:
cpus: '2'
memory: 2G
reservations:
cpus: '1'
memory: 1G
redis:
image: redis:7-alpine
ports:
- "6379:6379"
volumes:
- redis_data:/data
restart: unless-stopped
postgres:
image: postgres:16-alpine
environment:
POSTGRES_PASSWORD: password
ports:
- "5432:5432"
volumes:
- postgres_data:/var/lib/postgresql/data
restart: unless-stopped
volumes:
redis_data:
postgres_data:
PRODUCTION SERVER SETUP:
1. Ubuntu/Linux:
sudo apt-get update
sudo apt-get install nodejs npm redis-server postgresql
2. Node.js process manager (PM2):
npm install -g pm2
pm2 start app.js -i max --name "myapp"
pm2 save
pm2 startup
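pm2 -i max runs the app in PM2's cluster mode, so the manual cluster module code is not needed in that setup. The same configuration can be kept in an ecosystem file; a minimal sketch (names and paths are placeholders):
// ecosystem.config.cjs
module.exports = {
  apps: [{
    name: 'myapp',
    script: './dist/app.js',
    instances: 'max',          // one process per CPU core
    exec_mode: 'cluster',      // PM2 handles the clustering
    max_memory_restart: '1G',  // restart a worker that exceeds 1GB
    env_production: { NODE_ENV: 'production', PORT: 8008 }
  }]
};
// Start with: pm2 start ecosystem.config.cjs --env production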
3. Nginx reverse proxy:
/etc/nginx/sites-available/default:
upstream app {
server localhost:8008;
server localhost:8009;
server localhost:8010;
server localhost:8011;
keepalive 64;
}
server {
listen 80;
server_name example.com;
client_max_body_size 10M;
location / {
proxy_pass http://app;
proxy_http_version 1.1;
proxy_set_header Upgrade $http_upgrade;
proxy_set_header Connection 'upgrade'; # needed for WebSockets; use "" if you rely on upstream keepalive
proxy_set_header Host $host;
proxy_cache_bypass $http_upgrade;
proxy_set_header X-Real-IP $remote_addr;
proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
proxy_set_header X-Forwarded-Proto $scheme;
}
gzip on;
gzip_types text/plain text/css text/javascript application/json;
}
sudo nginx -t
sudo systemctl restart nginx
KUBERNETES DEPLOYMENT:
deployment.yaml:
apiVersion: apps/v1
kind: Deployment
metadata:
  name: myapp
spec:
  replicas: 3
  selector:
    matchLabels:
      app: myapp
  template:
    metadata:
      labels:
        app: myapp
    spec:
      containers:
        - name: myapp
          image: myapp:1.0.0
          ports:
            - containerPort: 8008
          env:
            - name: NODE_ENV
              value: "production"
          resources:
            requests:
              memory: "512Mi"
              cpu: "250m"
            limits:
              memory: "2Gi"
              cpu: "1000m"
          livenessProbe:
            httpGet:
              path: /health
              port: 8008
            initialDelaySeconds: 30
            periodSeconds: 10
          readinessProbe:
            httpGet:
              path: /health
              port: 8008
            initialDelaySeconds: 5
            periodSeconds: 5
---
apiVersion: v1
kind: Service
metadata:
  name: myapp
spec:
  selector:
    app: myapp
  ports:
    - protocol: TCP
      port: 80
      targetPort: 8008
  type: LoadBalancer
ENVIRONMENT VARIABLES:
.env.production:
NODE_ENV=production
PORT=8008
LOG_LEVEL=warn
DB_HOST=postgres.example.com
DB_PORT=5432
DB_USER=postgres
DB_PASSWORD=securepassword
DB_NAME=production_db
DB_POOL_MAX=50
REDIS_URL=redis://redis.example.com:6379
API_TIMEOUT=30000
RATE_LIMIT_WINDOW=60000
RATE_LIMIT_MAX=1000
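A small sketch of validating these variables at startup, so a missing value fails fast instead of surfacing later as a confusing runtime error (the requireEnv helper and config shape are illustrative, not library functions):
// src/config.ts
function requireEnv(name: string): string {
  const value = process.env[name];
  if (!value) {
    throw new Error(`Missing required environment variable: ${name}`);
  }
  return value;
}

export const config = {
  port: Number(process.env.PORT) || 8008,
  dbHost: requireEnv('DB_HOST'),
  dbUser: requireEnv('DB_USER'),
  dbPassword: requireEnv('DB_PASSWORD'),
  dbName: requireEnv('DB_NAME'),
  dbPoolMax: Number(process.env.DB_POOL_MAX) || 20,
  redisUrl: requireEnv('REDIS_URL'),
  rateLimitWindowMs: Number(process.env.RATE_LIMIT_WINDOW) || 60_000,
  rateLimitMax: Number(process.env.RATE_LIMIT_MAX) || 1000,
};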
================================================================================
15. COMPLETE EXAMPLES
================================================================================
EXAMPLE 1: COMPLETE SCALED APP (Node.js + TypeScript)
src/app.ts:
import cluster from 'cluster';
import os from 'os';
import express from 'express';
import compression from 'compression';
import rateLimit from 'express-rate-limit';
import { Pool } from 'pg';
import { createClient } from 'redis';
import dotenv from 'dotenv';

dotenv.config();

const port = Number(process.env.PORT) || 8008;
const numCPUs = process.env.NODE_ENV === 'production' ? os.cpus().length : 2;

if (cluster.isPrimary) {
  console.log(`Primary ${process.pid} starting with ${numCPUs} workers`);
  for (let i = 0; i < numCPUs; i++) {
    cluster.fork();
  }
  cluster.on('exit', (worker) => {
    console.log(`Worker ${worker.process.pid} died, respawning...`);
    cluster.fork();
  });
  process.on('SIGTERM', () => {
    console.log('Shutting down...');
    for (const worker of Object.values(cluster.workers || {})) {
      worker?.kill();
    }
    process.exit(0);
  });
} else {
  // Each worker gets its own database pool and Redis connection
  const pool = new Pool({
    host: process.env.DB_HOST,
    port: Number(process.env.DB_PORT),
    database: process.env.DB_NAME,
    user: process.env.DB_USER,
    password: process.env.DB_PASSWORD,
    max: Number(process.env.DB_POOL_MAX) || 20,
    idleTimeoutMillis: 30000,
    connectionTimeoutMillis: 2000,
  });

  const redis = createClient({ url: process.env.REDIS_URL });
  redis.on('error', (err) => console.error('Redis error:', err));
  redis.connect().catch((err) => {
    console.error('Failed to connect to Redis:', err);
    process.exit(1);
  });

  const app = express();

  // Middleware
  app.use(compression());
  app.use(express.json({ limit: '1mb' }));

  // Rate limiting
  const limiter = rateLimit({
    windowMs: 60 * 1000,
    max: 100,
    standardHeaders: true,
    legacyHeaders: false,
  });
  app.use('/api/', limiter);

  // Performance tracking: warn on any request slower than 100ms
  app.use((req, res, next) => {
    const start = process.hrtime.bigint();
    res.on('finish', () => {
      const end = process.hrtime.bigint();
      const duration = Number(end - start) / 1_000_000; // ns -> ms
      if (duration > 100) {
        console.warn(`Slow: ${req.method} ${req.url} - ${duration.toFixed(2)}ms`);
      }
    });
    next();
  });

  // Routes
  app.get('/health', (req, res) => {
    const mem = process.memoryUsage();
    res.json({
      status: 'ok',
      pid: process.pid,
      uptime: process.uptime(),
      memory: {
        heapUsed: (mem.heapUsed / 1024 / 1024).toFixed(2),
        heapTotal: (mem.heapTotal / 1024 / 1024).toFixed(2)
      }
    });
  });

  app.get('/api/users/:id', async (req, res) => {
    try {
      const cacheKey = `user:${req.params.id}`;
      // Try the cache first
      const cached = await redis.get(cacheKey);
      if (cached) {
        return res.json({ cached: true, data: JSON.parse(cached) });
      }
      // Fall back to the database
      const result = await pool.query('SELECT * FROM users WHERE id = $1', [req.params.id]);
      if (result.rows.length === 0) {
        return res.status(404).json({ error: 'User not found' });
      }
      const user = result.rows[0];
      // Cache for 5 minutes
      await redis.set(cacheKey, JSON.stringify(user), { EX: 300 });
      res.json({ cached: false, data: user });
    } catch (err) {
      console.error(err);
      res.status(500).json({ error: 'Internal server error' });
    }
  });

  // Workers started by cluster.fork() share the same port
  const server = app.listen(port, () => {
    console.log(`Worker ${process.pid} listening on port ${port}`);
  });

  // Graceful shutdown: stop accepting connections, then close pool and Redis
  process.on('SIGTERM', () => {
    console.log(`Worker ${process.pid} shutting down...`);
    server.close(async () => {
      await pool.end();
      await redis.quit();
      process.exit(0);
    });
  });
}
EXAMPLE 2: LOAD BALANCING MULTIPLE SERVERS
nginx.conf:
upstream backend {
    least_conn;   # Load balancing strategy
    server backend1.example.com:8008 weight=1;
    server backend2.example.com:8008 weight=1;
    server backend3.example.com:8008 weight=1;
    server backend4.example.com:8008 weight=1;
    keepalive 32;
}
server {
    listen 80;
    location / {
        proxy_pass http://backend;
        proxy_http_version 1.1;
        proxy_set_header Connection "";
        proxy_connect_timeout 60s;
        proxy_send_timeout 60s;
        proxy_read_timeout 60s;
    }
    gzip on;
    gzip_types application/json text/plain text/css;
    gzip_min_length 1000;
}
================================================================================
16. TROUBLESHOOTING
================================================================================
COMMON ISSUES:
1. HIGH MEMORY USAGE
Problem: Heap keeps growing
Solution:
- Enable profiling: node --heap-prof app.js
- Check for memory leaks
- Restart worker periodically
- Use memory limits
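To confirm a leak, one approach is to capture heap snapshots before and after the heap has grown and compare them (v8.writeHeapSnapshot is part of Node's built-in v8 module; using SIGUSR2 as the trigger is an illustrative choice). The old-space heap can also be capped with the --max-old-space-size flag.
import v8 from 'v8';

// Send SIGUSR2 to the process (kill -USR2 <pid>) to dump a heap snapshot,
// then open the .heapsnapshot file in Chrome DevTools > Memory.
process.on('SIGUSR2', () => {
  const file = v8.writeHeapSnapshot();
  console.log(`Heap snapshot written to ${file}`);
});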
2. SLOW REQUESTS
Problem: Some requests take 1-10 seconds
Solution:
- Add profiling middleware
- Check database queries (use EXPLAIN)
- Add caching
- Use connection pooling
- Optimize algorithms
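One way to find the offending queries is to time them centrally. A sketch (the 100 ms threshold and the timedQuery helper name are assumptions, not a pg API):
import { Pool } from 'pg';

// Wrap pg queries so anything slower than 100 ms is logged with its SQL text
async function timedQuery(pool: Pool, sql: string, params: any[] = []) {
  const start = process.hrtime.bigint();
  const result = await pool.query(sql, params);
  const ms = Number(process.hrtime.bigint() - start) / 1_000_000;
  if (ms > 100) {
    console.warn(`Slow query (${ms.toFixed(1)}ms): ${sql}`);
  }
  return result;
}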
3. HIGH CPU USAGE
Problem: CPU at 100%
Solution:
- Use clustering
- Profile with: node --prof app.js (then node --prof-process isolate-*.log for a readable report)
- Offload to worker threads
- Optimize hot code paths
- Consider horizontal scaling
4. DATABASE BOTTLENECK
Problem: Database connections exhausted
Solution:
- Increase connection pool size
- Reduce connection timeout
- Optimize queries
- Add Redis caching
- Use read replicas
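A sketch of the read-replica idea, assuming a REPLICA_HOST environment variable and that all writes still go to the primary (remaining connection settings come from the usual PG* environment variables or pg defaults):
import { Pool } from 'pg';

// Separate pools: writes go to the primary, reads go to a replica
const primaryPool = new Pool({ host: process.env.DB_HOST, max: 20 });
const replicaPool = new Pool({ host: process.env.REPLICA_HOST, max: 30 });

export function readQuery(sql: string, params?: any[]) {
  return replicaPool.query(sql, params);
}

export function writeQuery(sql: string, params?: any[]) {
  return primaryPool.query(sql, params);
}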
5. CONNECTION POOL TIMEOUT
Problem: "Error: Client has already been released"
Solution:
- Always release connections in finally block
- Check for connection leaks
- Increase pool size
- Reduce idle timeout
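The release-in-finally pattern, sketched with pg's manual client checkout (the accounts table and transfer logic are illustrative):
import { Pool } from 'pg';

const pool = new Pool({ max: 20 });

async function transferFunds(fromId: number, toId: number, amount: number) {
  const client = await pool.connect();   // checks a client out of the pool
  try {
    await client.query('BEGIN');
    await client.query('UPDATE accounts SET balance = balance - $1 WHERE id = $2', [amount, fromId]);
    await client.query('UPDATE accounts SET balance = balance + $1 WHERE id = $2', [amount, toId]);
    await client.query('COMMIT');
  } catch (err) {
    await client.query('ROLLBACK');
    throw err;
  } finally {
    client.release();                    // always returned, even on error
  }
}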
6. RATE LIMITING ISSUES
Problem: Legitimate users get rate limited
Solution:
- Increase rate limit
- Use user-based limits
- Whitelist IPs
- Implement gradual backoff
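A sketch of user-based limits plus an IP allowlist with express-rate-limit (the req.user shape and the allowlist entries are assumptions specific to this example):
import express from 'express';
import rateLimit from 'express-rate-limit';

const app = express();
const allowlist = ['10.0.0.5', '10.0.0.6'];   // e.g. internal services

const apiLimiter = rateLimit({
  windowMs: 60 * 1000,
  max: 1000,
  standardHeaders: true,
  legacyHeaders: false,
  // Count per authenticated user when available, otherwise per IP
  keyGenerator: (req) => (req as any).user?.id ?? req.ip,
  // Skip rate limiting entirely for allowlisted IPs
  skip: (req) => allowlist.includes(req.ip ?? ''),
});

app.use('/api/', apiLimiter);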
PRODUCTION MONITORING COMMAND:
# Monitor everything
watch 'ps aux | grep node; echo "---"; free -h; echo "---"; df -h'
# Monitor network
nethogs -d 1
# Monitor processes
top
# Monitor disk I/O
iostat -x 1
PERFORMANCE BENCHMARK SCRIPT:
#!/bin/bash
echo "Warming up..."
ab -n 100 -c 10 http://localhost:8008/
echo "Running benchmark..."
ab -n 10000 -c 100 http://localhost:8008/health
# Output: requests/sec, mean response time, etc.
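If Apache Bench is not available, a similar measurement can be scripted from Node itself. A sketch using the autocannon package (assuming it is installed as a dev dependency):
import autocannon from 'autocannon';

async function runBenchmark() {
  const result = await autocannon({
    url: 'http://localhost:8008/health',
    connections: 100,   // concurrent connections
    duration: 10,       // seconds
  });
  console.log(`Requests/sec: ${result.requests.average}`);
  console.log(`Mean latency: ${result.latency.average} ms`);
}

runBenchmark().catch(console.error);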
FINAL VERTICAL SCALING CHECKLIST:
✅ Enable Node.js clustering
✅ Setup database connection pooling (20-50 connections)
✅ Implement Redis caching layer
✅ Setup rate limiting (100-1000 req/min)
✅ Enable Gzip/Brotli compression
✅ Optimize database queries (indexes, JOINs)
✅ Use worker threads for CPU tasks
✅ Add monitoring and health checks
✅ Configure logging
✅ Setup graceful shutdown
✅ Test under load
✅ Document configuration
EXPECTED IMPROVEMENTS:
Before optimization:
- 100 requests/sec
- 100ms response time
- 2GB memory usage
After vertical scaling:
- 1000-2000 requests/sec (10-20x, workload-dependent)
- 10-20ms response time (5-10x faster)
- Similar memory footprint, used more efficiently
Cost: mostly software optimization, so little or no extra hardware spend
================================================================================
END OF VERTICAL SCALING A TO Z GUIDE
================================================================================
You now have complete knowledge of vertical scaling in Node.js/TypeScript!
Key Takeaways:
✅ Vertical scaling increases server power
✅ Node.js clustering utilizes all CPU cores
✅ Connection pooling is essential
✅ Caching (Redis) reduces database load
✅ Rate limiting prevents abuse
✅ Monitoring identifies issues early
✅ Combine multiple techniques for best results
✅ Careful optimization can improve throughput 10-20x
✅ Know when to switch to horizontal scaling
Scaling Journey:
Single Server → Clustered + Caching → Load Balanced → Horizontal → Kubernetes
Happy Scaling! 🚀