The Insider's Guide to GridFS
What is GridFS

GridFS is a virtual file system for file storage with MongoDB. It enables you to store and retrieve files such as images, audio files, and video files that exceed the BSON document size limit of 16 MB.

Files are broken down into smaller pieces called chunks, which are placed in a bucket that is named fs by default. The bucket consists of two collections, one for the chunks and one for the file metadata:

  • fs.files
  • fs.chunks

fs.chunks stores the chunks themselves. Besides the binary data, each chunk document has a files_id field and an n field:

  • files_id is the _id of the chunk’s “parent” document in fs.files.
  • n is the sequence number of the chunk, starting with 0.

Following is a sample fs.chunks document:

{
  "files_id": ObjectId("534a75d19f54bfec8a2fe44b"),
  "n": NumberInt(0),
  "data": "Mongo Binary Data"
}
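
To make the role of the n field concrete, here is a minimal sketch in plain Node.js, with no database involved, of how a file's bytes could be put back together from its chunk documents. The helper name assembleChunks and the in-memory chunk objects are assumptions made for illustration; in practice the driver or a library such as gridfs-stream does this for you.

```javascript
// Hypothetical helper: rebuild a file from chunk documents held in memory.
// Each object mirrors the fs.chunks shape: { files_id, n, data }.
function assembleChunks(chunks) {
  // Sort a copy by n, the 0-based sequence number, so the input is not mutated
  var ordered = chunks.slice().sort(function (a, b) {
    return a.n - b.n;
  });
  // Concatenate the binary payloads in sequence order
  return Buffer.concat(ordered.map(function (chunk) {
    return chunk.data;
  }));
}

// Chunks may arrive in any order; n restores the original sequence.
var sampleChunks = [
  { files_id: 'abc', n: 1, data: Buffer.from('world') },
  { files_id: 'abc', n: 0, data: Buffer.from('hello ') }
];

console.log(assembleChunks(sampleChunks).toString()); // prints "hello world"
```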

fs.files stores the metadata of each file.
Following is a sample fs.files document:

{
  "filename": "test.txt",
  "chunkSize": NumberInt(261120),
  "uploadDate": ISODate("2014-04-13T11:32:33.557Z"),
  "md5": "7b762939321e146569b07f72c62cca4f",
  "length": NumberInt(646)
}
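
The length and chunkSize fields together determine how many chunk documents a file occupies. The following is a small sketch of that arithmetic; the helper name chunkCount is an assumption for illustration and is not part of any driver API.

```javascript
// Hypothetical helper: number of fs.chunks documents needed for a file,
// given its length and the chunkSize recorded in fs.files (both in bytes).
function chunkCount(length, chunkSize) {
  // Every chunk is full except possibly the last one, so round up
  return Math.ceil(length / chunkSize);
}

// The sample document above: 646 bytes with a 261120-byte chunk size
console.log(chunkCount(646, 261120)); // prints 1

// A 20 MB file, which would not fit into a single 16 MB BSON document
console.log(chunkCount(20 * 1024 * 1024, 261120)); // prints 81
```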

When to use GridFS
User-uploaded files are often handled alongside a relational database system.
In that setup the files themselves are stored on the file system, separate from the database. This creates a number of problems when we:

  • Try to replicate the files to all the servers that need them
  • Delete the files from the database
  • Back up the files for safety and disaster recovery

We can overcome these problems by using GridFS, since the files are stored in the database itself. Hence it is easy to back up, replicate, and delete files.
GridFS is also very useful when we are dealing with large media content that needs to be read selectively. To read a certain range of bytes, only the chunks covering that range are brought into memory, not the whole file.
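
As a rough illustration of why a ranged read touches only part of a file, the sketch below works out which chunk numbers cover a given byte range. The helper name chunksForRange is an assumption for illustration; the chunk size is the one from the sample fs.files document.

```javascript
// Hypothetical helper: which chunk numbers (the n field) hold the bytes
// from start to end, both inclusive and 0-based.
function chunksForRange(start, end, chunkSize) {
  var first = Math.floor(start / chunkSize);
  var last = Math.floor(end / chunkSize);
  var result = [];
  for (var n = first; n <= last; n++) {
    result.push(n);
  }
  return result;
}

var sampleChunkSize = 261120; // chunkSize from the sample fs.files document

// Reading bytes 500000..600000 of a large file needs only two chunks
console.log(chunksForRange(500000, 600000, sampleChunkSize)); // prints [ 1, 2 ]
```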

Integrate GridFS with Node.js (gridfs-stream):

How to install

$ npm install gridfs-stream@1.1.1
$ npm install busboy@0.2.9 express@4.12.3 mongodb@2.0.31

How to use

var Busboy = require('busboy');      // 0.2.9
var express = require('express');    // 4.12.3
var mongo = require('mongodb');      // 2.0.31
var Grid = require('gridfs-stream'); // 1.1.1
var app = express();
var server = app.listen(9002);

// Connect to the "test" database on the local MongoDB server
var db = new mongo.Db('test', new mongo.Server('127.0.0.1', 27017));
var gfs; // assigned once the connection is open
db.open(function(err, db) {
  if (err)
    throw err;
  // Wrap the open connection so files can be streamed in and out of GridFS
  gfs = Grid(db, mongo);
});

The routing

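var upload = require('./controllers/upload.server.controller');
// GET streams a stored file back; the handler reads the file id from req.params.id
app.route('/upload/:id')
  .get(upload.read);
// POST stores a newly uploaded file
app.route('/upload/')
  .post(upload.create);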

Storing datastream from post

exports.create = function(req, res) {
  var busboy = new Busboy({ headers: req.headers });
  var fileId = new mongo.ObjectId();
  busboy.on('file', function(fieldname, file, filename, encoding, mimetype) {
    console.log('got file', filename, mimetype, encoding);
    // Stream the uploaded file straight into GridFS under the new id
    var writeStream = gfs.createWriteStream({
      _id: fileId,
      filename: filename,
      mode: 'w',
      content_type: mimetype
    });
    file.pipe(writeStream);
  }).on('finish', function() {
    // show a link to the uploaded file
    res.writeHead(200, { 'content-type': 'text/html' });
    res.end('<a href="/upload/' + fileId.toString() + '">download file</a>');
  });
  req.pipe(busboy);
};

Retrieve datastream from get

exports.read = function(req, res) {
  // Look up the file's metadata by the id taken from the URL
  gfs.findOne({ _id: req.params.id }, function(err, file) {
    // Keep "return" and the response on one line; otherwise automatic
    // semicolon insertion makes the response run unconditionally
    if (err) return res.status(400).send(err);
    if (!file) return res.status(404).send('File not found');
    res.set('Content-Type', file.contentType);
    res.set('Content-Disposition', 'attachment; filename="' + file.filename + '"');
    // Stream the stored chunks back to the client
    var readstream = gfs.createReadStream({ _id: file._id });
    readstream.on('error', function(err) {
      console.log('Got error while processing stream ' + err.message);
      res.end();
    });
    readstream.pipe(res);
  });
};

Modules
Several GridFS plugin modules are available for serving the file data stored in MongoDB directly from a web server or the file system:

  • GridFS-Fuse: plugs GridFS into the file system
  • GridFS-Nginx: serves GridFS files directly from Nginx

Limitations
Serving files along with our database content may significantly churn the memory working set. We can avoid this by using a separate MongoDB server dedicated to GridFS storage.
GridFS does not provide atomic updates of a file. If we need them, we have to maintain multiple versions of our files and pick the right version.
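
One way to "pick the right version" is to keep every upload under the same filename and treat the most recent uploadDate as current. The sketch below shows that selection on plain objects shaped like fs.files documents. The helper name latestVersion, the numeric _id values, and the newest-upload-wins rule are assumptions for illustration, not something GridFS prescribes.

```javascript
// Hypothetical helper: among fs.files-style documents that share a filename,
// return the one with the most recent uploadDate, or null if none match.
function latestVersion(files, filename) {
  var matches = files.filter(function (f) {
    return f.filename === filename;
  });
  if (matches.length === 0) {
    return null;
  }
  return matches.reduce(function (newest, f) {
    return f.uploadDate > newest.uploadDate ? f : newest;
  });
}

var storedFiles = [
  { _id: 1, filename: 'test.txt', uploadDate: new Date('2014-04-13T11:32:33.557Z') },
  { _id: 2, filename: 'test.txt', uploadDate: new Date('2014-05-01T09:00:00.000Z') },
  { _id: 3, filename: 'other.txt', uploadDate: new Date('2014-06-01T09:00:00.000Z') }
];

console.log(latestVersion(storedFiles, 'test.txt')._id); // prints 2
```

Because a new version is written as a separate file, readers keep seeing the previous version until the new upload is complete.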

Conclusion
GridFS has some limitations, but they can be overcome at the planning stage with a proper estimate of the amount of data we are storing and how we access it. GridFS will then be a good practice for our application’s file storage needs.

Next Step
Meet and hire professional MongoDB GridFS developers at Cubet Techno Labs and get your ideas converted to result-oriented applications within your budget.
