The Insider’s Guide to GridFS

  • Cubet
  • Web App Development
  • May 22 2015

What is GridFS

GridFS is a virtual file system for storing files in MongoDB. It lets you store and retrieve files such as images, audio files, and video files that exceed the BSON document size limit of 16 MB.

Each file is broken down into smaller pieces called chunks, which are placed in a bucket named fs by default. The bucket consists of two collections, one for file metadata and one for the chunks:

  • fs.files
  • fs.chunks

fs.chunks stores the chunks themselves. Each document contains a files_id field, an n field, and the binary data of the chunk:

  • files_id is the _id of the chunk’s “parent” document in fs.files.
  • n is the sequence number of the chunk, starting with 0.

The following is a sample document from the fs.chunks collection:

    {
      "files_id": ObjectId("534a75d19f54bfec8a2fe44b"),
      "n": NumberInt(0),
      "data": "Mongo Binary Data"
    }

fs.files stores the metadata of each file.
The following is a sample document from the fs.files collection:

    {
      "filename": "test.txt",
      "chunkSize": NumberInt(261120),
      "uploadDate": ISODate("2014-04-13T11:32:33.557Z"),
      "md5": "7b762939321e146569b07f72c62cca4f",
      "length": NumberInt(646)
    }
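The relationship between length, chunkSize, and the n field can be sketched with a little arithmetic. This is only an illustration, not code taken from a driver; the helper names chunkCount and lastChunkLength are hypothetical.

```javascript
// Number of fs.chunks documents needed for a file of `length` bytes
// (hypothetical helper, for illustration only).
function chunkCount(length, chunkSize) {
  return Math.ceil(length / chunkSize);
}

// Size in bytes of the final chunk; every earlier chunk is exactly chunkSize.
function lastChunkLength(length, chunkSize) {
  var count = chunkCount(length, chunkSize);
  if (count === 0) return 0;
  return length - (count - 1) * chunkSize;
}

// The sample fs.files document above: 646 bytes with a 261120-byte chunk size.
console.log(chunkCount(646, 261120));                   // 1
// A 20 MiB file, which is over the 16 MB document limit.
console.log(chunkCount(20 * 1024 * 1024, 261120));      // 81
console.log(lastChunkLength(20 * 1024 * 1024, 261120)); // 81920
```

The sample file therefore fits in a single chunk with n equal to 0, while the 20 MiB file would be stored as chunks numbered 0 through 80.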

When to use GridFS
With a relational database system, user-uploaded files are usually stored on the file system, separate from the database, and the database only keeps a reference to them. This creates numerous problems when we:

  • Try to replicate the files to all the servers that need them
  • Delete the files from the database
  • Back up the files for safety and disaster recovery

We can overcome these problems by using GridFS, since the files are stored in the database itself. Hence, it is easy to back up, replicate, and delete files along with the rest of the data.

GridFS is also very useful when we are dealing with large media content that needs to be selectively read or edited. When we read only a certain range of bytes of a file, only the chunks covering that range are brought into memory, not the whole file.
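As a rough sketch of why a range read touches only a few chunks, the chunk sequence numbers covering a byte range can be computed directly from the chunk size. The helpers chunksForRange and chunkFilter are hypothetical names used for illustration; real drivers do this work internally.

```javascript
// Sequence numbers (n) of the first and last chunk covering the inclusive
// byte range [startPos, endPos]. Hypothetical helper, for illustration only.
function chunksForRange(startPos, endPos, chunkSize) {
  return {
    first: Math.floor(startPos / chunkSize),
    last: Math.floor(endPos / chunkSize)
  };
}

// A MongoDB query filter selecting just those chunks from fs.chunks.
function chunkFilter(filesId, startPos, endPos, chunkSize) {
  var range = chunksForRange(startPos, endPos, chunkSize);
  return { files_id: filesId, n: { $gte: range.first, $lte: range.last } };
}

// Bytes 300000..600000 with a 261120-byte chunk size fall in chunks 1 and 2.
console.log(chunksForRange(300000, 600000, 261120)); // { first: 1, last: 2 }
```

For a large video file stored in dozens of chunks, a read like this would load two chunk documents instead of all of them.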

Integrate GridFS with Mongoose:

How to install

    $ npm install gridfs-stream
    $ npm install busboy

How to use

    var Busboy = require('busboy');      // 0.2.9
    var express = require('express');    // 4.12.3
    var mongo = require('mongodb');      // 2.0.31
    var Grid = require('gridfs-stream'); // 1.1.1

    var app = express();
    var server = app.listen(9002);

    // Connect to the "test" database on the local MongoDB server
    var db = new mongo.Db('test', new mongo.Server('127.0.0.1', 27017));
    var gfs;

    db.open(function(err, db) {
      if (err) throw err;
      // Wrap the open connection so gridfs-stream can read and write files
      gfs = Grid(db, mongo);
    });

The routing

    var upload = require('./controllers/upload.server.controller');

    // The read handler looks the file up by req.params.id
    app.route('/upload/:id')
      .get(upload.read);
    app.route('/upload/')
      .post(upload.create);

Storing the data stream from a POST request

    // Busboy, mongo and gfs are the objects created in the setup code above
    exports.create = function(req, res) {
      var busboy = new Busboy({ headers: req.headers });
      var fileId = new mongo.ObjectId();

      busboy.on('file', function(fieldname, file, filename, encoding, mimetype) {
        console.log('got file', filename, mimetype, encoding);
        // Stream the uploaded file straight into GridFS
        var writeStream = gfs.createWriteStream({
          _id: fileId,
          filename: filename,
          mode: 'w',
          content_type: mimetype
        });
        file.pipe(writeStream);
      }).on('finish', function() {
        // show a link to the uploaded file
        res.writeHead(200, { 'content-type': 'text/html' });
        res.end('<a href="/upload/' + fileId.toString() + '">download file</a>');
      });

      req.pipe(busboy);
    };

Retrieving the data stream from a GET request

    // gfs is the gridfs-stream object created in the setup code above
    exports.read = function(req, res) {
      gfs.findOne({ _id: req.params.id }, function(err, file) {
        if (err) return res.status(400).send(err);
        if (!file) return res.status(404).send(' ');

        res.set('Content-Type', file.contentType);
        res.set('Content-Disposition', 'attachment; filename="' + file.filename + '"');

        // Stream the file from GridFS to the HTTP response
        var readstream = gfs.createReadStream({ _id: file._id });
        readstream.on("error", function(err) {
          console.log("Got error while processing stream " + err.message);
          res.end();
        });
        readstream.pipe(res);
      });
    };
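The handler above always streams the whole file. To serve only part of a file, as discussed earlier for large media content, the request's Range header first has to be turned into byte positions. The sketch below is one possible way to do that; parseRange is a hypothetical helper, and it handles only a single range. The gridfs-stream README describes a range option with startPos and endPos for createReadStream, but treat that as an assumption and check it (including whether endPos is inclusive) against the version you use.

```javascript
// Parse a single-range HTTP Range header such as "bytes=0-99", "bytes=600-"
// or "bytes=-46" into byte positions for a file of `length` bytes.
// Returns null when the header is missing, malformed or unsatisfiable.
// Hypothetical helper, for illustration only.
function parseRange(header, length) {
  var match = /^bytes=(\d*)-(\d*)$/.exec(header || '');
  if (!match || (match[1] === '' && match[2] === '')) return null;

  var startPos, endPos;
  if (match[1] === '') {
    // Suffix form: the last N bytes of the file
    var suffix = parseInt(match[2], 10);
    if (suffix === 0) return null;
    startPos = Math.max(length - suffix, 0);
    endPos = length - 1;
  } else {
    startPos = parseInt(match[1], 10);
    endPos = match[2] === ''
      ? length - 1
      : Math.min(parseInt(match[2], 10), length - 1);
  }

  if (startPos > endPos) return null;
  return { startPos: startPos, endPos: endPos };
}

console.log(parseRange('bytes=600-', 646)); // { startPos: 600, endPos: 645 }
```

Inside the read handler, the result could then be passed along as gfs.createReadStream({ _id: file._id, range: range }), with the response status set to 206 and a matching Content-Range header.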

Modules
Several GridFS plugin modules are available to serve the file data stored in MongoDB directly from a web server or the file system:

  • GridFS-Fuse: plugs GridFS into the file system
  • GridFS-Nginx: serves GridFS files directly from Nginx

Limitations
  • Serving files from the same server as our database content may significantly churn the memory working set. We can avoid this by using a separate MongoDB server dedicated to GridFS storage.
  • GridFS does not provide atomic updates of a file. If we need them, we have to maintain multiple versions of our files and pick the right version.
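One way to read the versioning workaround is to upload every revision as a new file under the same filename and treat the document with the most recent uploadDate as the current one. The sketch below applies that rule to an in-memory array of fs.files-style documents; latestVersion and the sample data are hypothetical, and against a real database the same choice would normally be made with a query sorted by uploadDate.

```javascript
// Pick the newest fs.files-style document for a filename, or null if the
// filename has never been uploaded. Hypothetical helper, for illustration.
function latestVersion(files, filename) {
  var latest = null;
  files.forEach(function(file) {
    if (file.filename !== filename) return;
    if (latest === null || file.uploadDate > latest.uploadDate) {
      latest = file;
    }
  });
  return latest;
}

// Made-up sample metadata: two revisions of report.pdf and one other file.
var files = [
  { _id: 1, filename: 'report.pdf', uploadDate: new Date('2015-05-01T10:00:00Z') },
  { _id: 2, filename: 'logo.png', uploadDate: new Date('2015-05-10T10:00:00Z') },
  { _id: 3, filename: 'report.pdf', uploadDate: new Date('2015-05-20T10:00:00Z') }
];

console.log(latestVersion(files, 'report.pdf')._id); // 3
```

Readers keep seeing the previous revision until the new upload has finished, and older revisions can be deleted once they are no longer needed.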

Conclusion
GridFS has some limitations, but we can overcome them at the planning stage by properly estimating the amount of data we are storing and how we are accessing it. GridFS will then be a good fit for our application’s file storage needs.

Next Step
Meet and hire professional MongoDB GridFS developers at Cubet Techno Labs and get your ideas converted to result-oriented applications within your budget.
