
How I Recovered Lost MongoDB Data Without Backup Using Node.js and Puppeteer
Let me confess something painful:
I once deleted my entire MongoDB database. On a live project.
Yep — all my blog posts, user data, and analytics — gone in the blink of an overly confident db.dropDatabase() command.
If you’re reading this because you just nuked your MongoDB too… take a deep breath. I’ve been there. It’s not fun, but it’s recoverable (sort of).
The Great MongoDB Disaster
It all started when I was “cleaning up” my collections.
One wrong database selection later — boom.
Empty.
I opened MongoDB Atlas — no snapshots.
Checked my /data/db folder — nothing.
My dynamic sitemap? Of course it was also stored in MongoDB.
I basically deleted my site and the map to find it again.
At that moment, I considered a career in gardening.
My “Wait, I Can Fix This” Moment
After 10 minutes of staring blankly at my screen (and one cup of panic coffee), I remembered:
👉 Google Search Console still had my site’s URLs!
Each indexed URL was like a breadcrumb from my old website.
So I thought — what if I re-scraped all my posts from Google’s cached pages or any live mirrors that still existed?
That became my insane-but-successful recovery plan.
Step-by-Step: How I Recovered MongoDB Data Without Backup
Here’s how you can try it too if you ever lose your MongoDB data.
1. Export URLs from Google Search Console
Go to Search Console → Pages → Indexed Pages.
Export the list as a CSV or Excel file.
Clean it up — keep only the URLs you need to rebuild.
You can also grab URLs from Google Search like this:
site:yourdomain.com
Copy those links manually or scrape them (more on that next).
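If you went the CSV route, a quick script can do the cleanup for you. Here's a rough sketch; it assumes your Search Console export is named indexed-pages.csv with the URL in the first column, so adjust the file name and column index to match your actual export:

```js
// extractUrls.js: rough cleanup sketch, adjust to your export format.
import fs from "fs";

const csv = fs.readFileSync("indexed-pages.csv", "utf-8");

const urls = csv
  .split("\n")
  .slice(1)                                     // skip the header row
  .map(line => line.split(",")[0])              // take the first column
  .map(url => url.trim().replace(/^"|"$/g, "")) // strip surrounding quotes
  .filter(url => url.startsWith("http"));       // keep only real URLs

// Save as JSON so the scraper in the next step can read it
fs.writeFileSync("urls.json", JSON.stringify(urls, null, 2));
console.log(`Extracted ${urls.length} URLs`);
```

From there you can paste the list straight into the urls array in the next step, or have the scraper read urls.json.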
2. Scrape Page Content Using Node.js + Puppeteer
Once you’ve got your URLs, you can use Puppeteer (a headless Chrome library for Node.js) to scrape and rebuild your data.
Here’s a simple script I wrote during my caffeine-fueled recovery mission:
```js
// scrape.js
import fs from "fs";
import puppeteer from "puppeteer";

async function scrapeUrls(urls) {
  const browser = await puppeteer.launch({ headless: true });
  const page = await browser.newPage();
  const results = [];

  for (const url of urls) {
    try {
      console.log(`Scraping: ${url}`);
      await page.goto(url, { waitUntil: "domcontentloaded", timeout: 60000 });

      const title = await page.$eval("h1", el => el.textContent.trim());
      const content = await page.$eval("article", el => el.innerHTML.trim());
      const date = await page.$eval("time", el => el.getAttribute("datetime") || "unknown");

      results.push({ title, content, date, url });
    } catch (err) {
      console.error(`Failed to scrape ${url}:`, err.message);
    }
  }

  await browser.close();

  // Save the results to a JSON file
  fs.writeFileSync("recoveredData.json", JSON.stringify(results, null, 2));
  console.log("✅ Scraping complete! Data saved to recoveredData.json");
}

const urls = [
  "https://yourwebsite.com/post1",
  "https://yourwebsite.com/post2",
  // ...add all your URLs here
];

scrapeUrls(urls);
```
**What this does:**
Opens each URL in a headless browser
Extracts your post title, content, and publish date
Saves everything as structured JSON (ready to re-import into MongoDB)
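One practical note: the script uses ES module imports, so either add "type": "module" to your package.json or rename the file to scrape.mjs, and install Puppeteer first with npm install puppeteer.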
3. Rebuild Your MongoDB Collection
Once you’ve got your recovered JSON file, just feed it back into your MongoDB:
```js
// restore.js
import { MongoClient } from "mongodb";
import fs from "fs";

const MONGO_URI = "mongodb+srv://<your-connection-string>";
const client = new MongoClient(MONGO_URI);

async function restoreData() {
  const data = JSON.parse(fs.readFileSync("recoveredData.json", "utf-8"));

  await client.connect();
  const db = client.db("yourdbname");
  const collection = db.collection("posts");

  await collection.insertMany(data);
  console.log(`✅ Inserted ${data.length} documents successfully!`);

  await client.close();
}

restoreData().catch(console.error);
```
This script reads the JSON file created by Puppeteer and inserts it into your MongoDB collection.
And just like that — your content lives again!
Bonus Recovery Tips
✅ Check MongoDB’s Local Data Folder
If MongoDB was running locally, look inside /data/db. Sometimes data files (.wt, .ns) can still be repaired using:
mongod --repair --dbpath /data/db
✅ Use mongorestore If You Ever Did a mongodump
If you’ve got any old .bson files, restore them with:
mongorestore --db yourdb path/to/dump
✅ Try Atlas Snapshots (if you’re lucky)
MongoDB Atlas offers automatic continuous backups on higher tiers.
Go to Backups → Snapshots in your cluster — you might have one waiting.
✅ Use Wayback Machine or Google Cache
Search:
cache:yourdomain.com/blog-post
or check archive.org/web to retrieve snapshots.
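Google’s cache has become hit-or-miss, so the Wayback Machine is usually the better bet. If you have a long list of URLs to check, the archive exposes a simple availability endpoint you can call from Node. Here's a minimal sketch: it assumes Node 18+ (for the built-in fetch), runs as an ES module, and uses placeholder URLs.

```js
// waybackCheck.js: minimal sketch using the Wayback Machine availability API.
const urls = [
  "https://yourwebsite.com/post1",
  "https://yourwebsite.com/post2",
];

for (const url of urls) {
  const res = await fetch(
    `https://archive.org/wayback/available?url=${encodeURIComponent(url)}`
  );
  const data = await res.json();
  const snapshot = data.archived_snapshots?.closest;

  if (snapshot?.available) {
    console.log(`Snapshot found: ${snapshot.url} (captured ${snapshot.timestamp})`);
  } else {
    console.log(`No snapshot for ${url}`);
  }
}
```

Any snapshot URLs it finds can go straight into the scrape.js script from step 2.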
Recovery Tools That Might Help
If you’re dealing with corrupted files, try:
Kernel for MongoDB Recovery
Stellar Repair for MongoDB
EaseUS Data Recovery
They can sometimes extract raw BSON data even from broken .wt files.
What I Learned (The Hard Way)
After that fiasco, I started automating backups:
mongodump --uri="your_connection_string" --out=/backups/$(date +%F)
Then set it to run daily with a cron job.
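For reference, a daily crontab entry might look like this (the schedule and paths are just an example, and the % has to be escaped inside crontab or the line gets truncated):

```bash
# Run every day at 2 AM; adjust the connection string and backup path.
0 2 * * * mongodump --uri="your_connection_string" --out=/backups/$(date +\%F)
```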
Now my backups are safer than my lunch leftovers.
Moral of the Story
Never run MongoDB commands without backups.
Google Search Console is your unexpected hero.
Puppeteer can become your best friend in digital archaeology.
And most importantly — always test your commands on staging first.
Found this useful? You might also like my post on how to remove the trailing slash error.
Final Thoughts
Losing a database feels like watching your digital life flash before your eyes.
But if you think creatively — scraping, caching, or even Google indexing — you can rebuild almost anything.