Mongo-DB Introduction

MongoDB is a cross-platform document-oriented database. Classified as a NoSQL database, MongoDB eschews the traditional table-based relational database structure in favor of JSON-like documents with dynamic schemas (MongoDB calls the format BSON), making the integration of data in certain types of applications easier and faster.

History
Development of MongoDB began in 2007, when the company (then named 10gen) was building a platform as a service similar to Windows Azure or Google App Engine.In 2009, MongoDB was open sourced as a stand-alone product with an AGPL license.

Main features

Ad hoc queries
MongoDB supports search by field, range queries, regular expression searches. Queries can return specific fields of documents and also include user-defined JavaScript functions.

Indexing
Any field in a MongoDB document can be indexed (indices in MongoDB are conceptually similar to those in RDBMSes). Secondary indices are also available.

Replication
MongoDB provides high availability with replica sets.A replica set consists of two or more copies of the data. Each replica set member may act in the role of primary or secondary replica at any time. The primary replica performs all writes and reads by default. Secondary replicas maintain a copy of the data on the primary using built-in replication. When a primary replica fails, the replica set automatically conducts an election process to determine which secondary should become the primary. Secondaries can also perform read operations, but the data is eventually consistent by default.

Load balancing
MongoDB scales horizontally using sharding.The user chooses a shard key, which determines how the data in a collection will be distributed. The data is split into ranges (based on the shard key) and distributed across multiple shards. (A shard is a master with one or more slaves.)
MongoDB can run over multiple servers, balancing the load and/or duplicating data to keep the system up and running in case of hardware failure. Automatic configuration is easy to deploy, and new machines can be added to a running database.

File storage
MongoDB can be used as a file system, taking advantage of load balancing and data replication features over multiple machines for storing files.
This function, called GridFS, is included with MongoDB drivers and available with no difficulty for development languages (see “Language Support” for a list of supported languages). MongoDB exposes functions for file manipulation and content to developers. GridFS is used, for example, in plugins for NGINX and lighttpd. Instead of storing a file in a single document, GridFS divides a file into parts, or chunks, and stores each of those chunks as a separate document.
In a multi-machine MongoDB system, files can be distributed and copied multiple times between machines transparently, thus effectively creating a load-balanced and fault-tolerant system.

Aggregation
MapReduce can be used for batch processing of data and aggregation operations. The aggregation framework enables users to obtain the kind of results for which the SQL GROUP BY clause is used.

Server-side JavaScript execution
JavaScript can be used in queries, aggregation functions (such as MapReduce), and sent directly to the database to be executed.

Capped collections
MongoDB supports fixed-size collections called capped collections. This type of collection maintains insertion order and, once the specified size has been reached, behaves like a circular queue.

Language support
MongoDB has official drivers for a variety of popular programming languages and development environments. Web programming language Opa also has built-in support for MongoDB, which is tightly integrated in the language and offers a type-safety layer on top of MongoDB.There are also a large number of unofficial or community-supported drivers for other programming languages and frameworks.

MongoDB tools
In a MongoDB installation the following commands are available:

mongo
MongoDB offers an interactive shell called mongo,which lets developers view, insert, remove, and update data in their databases, as well as get replication information, set up sharding, shut down servers, execute JavaScript, and more.
Administrative information can also be accessed through a web interface,a simple webpage that serves information about the current server status. By default, this interface is 1000 ports above the database port (28017).

mongostat
mongostat is a command-line tool that displays a summary list of status statistics for a currently running MongoDB instance: how many inserts, updates, removes, queries, and commands were performed, as well as what percentage of the time the database was locked and how much memory it is using. This tool is similar to the UNIX/Linux vmstat utility.

mongotop
mongotop is a command-line tool providing a method to track the amount of time a MongoDB instance spends reading and writing data. mongotop provides statistics on the per-collection level. By default, mongotop returns values every second. This tool is similar to the UNIX/Linux top utility.

mongosniff
mongosniff is a command-line tool providing a low-level tracing/sniffing view into database activity by monitoring (or “sniffing”) network traffic going to and from MongoDB. mongosniff requires the Libpcap network library and is only available for Unix-like systems. A cross-platform alternative is the open source Wireshark packet analyzer which has full support for the MongoDB wire protocol.

mongoimport, mongoexport
mongoimport is a command-line utility to import content from a JSON, CSV, or TSV export created by mongoexport or potentially other third-party data exports.

mongodump, mongorestore
mongodump is a command-line utility for creating a binary export of the contents of a Mongo database; mongorestore can be used to reload a database dump.

 

Sample document
Below given example shows the document structure of a blog site which is simply a comma separated key value pair.

_id is a 12 bytes hexadecimal number which assures the uniqueness of every document. You can provide _id while inserting the document. If you didn’t provide then MongoDB provide a unique id for every document. These 12 bytes first 4 bytes for the current timestamp, next 3 bytes for machine id, next 2 bytes for process id of mongodb server and remaining 3 bytes are simple incremental value.