In the last post, we finished setting up our server and installed Mongo. In this post, we create a database and a collection, and then add documents. Note that neither security nor SSL has been set up for this server. Those steps will be covered in a later blog post to show how things tend to run in the real world. For now, download Robo 3T from the RoboMongo website (it’s free!) and start it. When you do, it will ask you to create a connection. The only information you should need for this Mongo server is the IP address or DNS entry (if you created one). Click “create connection” and enter it like so:

create_connection1

You will need to give your connection a name and add the IP address. It is highly recommended to click “Test” to make sure your connection works correctly. Save when you are done. The next step is to create a database for the pictures we are going to upload. Right-click on the name you gave the connection (left pane) and then click “Create database.”

create_database1

Add a name and click “Create.”

create_database2

The database created in the left pane (called PictureTest) will contain blob data that we want to be able to export and import if there is an issue. When you click on the arrow next to the database (PictureTest), it expands to show where the collections, functions, and users for the database live:

create_database3

Right-click on “Collections” and left-click “Create Collection.”  This will open a window for you to type in the name of your collection.  In this example, we created one called “blogpictures.”

create_collection1

The final setup should look like the following:

create_collection2

To complete the setup on the Mongo server, we need data! We are going to load quite a few pictures and documents for later use, so a script will be necessary. More importantly, the script needs to be re-usable so testing can be done whenever we want. I am a fan of Python and everything that goes with it, so this example will use Python. However, it is just as easy to write something in Java, Node, or a dozen other languages to insert your data. The following script is pretty basic, but it will get the job done:

# Written for Python 3
import base64
import os

import pymongo
from PIL import Image  # Pillow; imported in the original script but not used below

# Make connection to mongo server
mongoclient = pymongo.MongoClient("mongodb://192.168.4.37:27017/")
testdb = mongoclient["PictureTest"]
testcollection = testdb["blogpictures"]

# File system connection
filelist = []
directory = "C:\\Users\\lazyitdude\\Pictures"

# Read each file (skipping desktop.ini) and base64-encode its contents
for filename in os.listdir(directory):
    filepath = os.path.join(directory, filename)
    if os.path.isfile(filepath) and filename != "desktop.ini":
        with open(filepath, "rb") as imageFile:
            # Decode to text so the Document field is stored as a string
            strfile = base64.b64encode(imageFile.read()).decode("ascii")
        filelist.append(strfile)
        print(filename)

# Get length of array
filelength = len(filelist)
print("filelist array length: " + str(filelength))

# Collect the _id of each inserted record
insertedids = []

# Loop through filelist, create titles, and insert records
for keyinit in range(filelength):
    key = "Picture" + str(keyinit)
    print(key)
    tempdict = {"Title": key, "Document": filelist[keyinit]}
    ins = testcollection.insert_one(tempdict)
    insertedids.append(ins.inserted_id)

# Print ids to show that new records have been inserted
print(insertedids)

There are a couple of things you will need installed in your Python environment to make sure this works. First, you will need to install PyMongo. Directions can be found here, but you can simply type python -m pip install pymongo if you are using pip and not easy_install. Second, you will want to install Pillow, as the original PIL project has been deprecated. Instructions can be found here. Like PyMongo, Pillow can be installed with pip or as an executable.

A few notes about the script:

  1. Make sure to change the mongoclient line to reflect your IP address or DNS entry. The port should remain 27017.
  2. The names in quotes on the testdb and testcollection lines should reflect your database and collection respectively. If you are following this tutorial, the names should already match.
  3. The directory line should reflect where you are pulling your binary documents from. In the case of this tutorial, we pulled a series of pictures down from Google, along with some PDFs. All of them were put in the Pictures directory.
  4. The Pictures directory on the test machine had a hidden desktop.ini file in it that needed to be accounted for, thus the check for it in the if statement above.
  5. The line key = "Picture" + str(keyinit) uses the word “Picture” plus a reference ID number as the title for each picture. This was only to allow a key filter for later testing if needed. You can change the word “Picture” to anything you want.
  6. Remember that this is a rudimentary script, so please feel free to change it as you need!
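Since the script is meant to be re-usable, one option is to pull the file-reading step into a function that can be exercised without a Mongo server. The following is only a minimal sketch under that assumption, not the script used above: the name build_documents and its skip and prefix parameters are made up for illustration, and it assumes Python 3.

```python
import base64
import os


def build_documents(directory, skip=("desktop.ini",), prefix="Picture"):
    """Return one {'Title', 'Document'} dict per regular file in directory.

    build_documents, skip, and prefix are illustrative names; they are not
    part of PyMongo or of the original script.
    """
    documents = []
    # Sort the names so titles are assigned in a repeatable order between runs
    for filename in sorted(os.listdir(directory)):
        filepath = os.path.join(directory, filename)
        if not os.path.isfile(filepath) or filename in skip:
            continue
        with open(filepath, "rb") as handle:
            # base64-encode the raw bytes and keep the result as text
            encoded = base64.b64encode(handle.read()).decode("ascii")
        title = prefix + str(len(documents))
        documents.append({"Title": title, "Document": encoded})
    return documents
```

The returned list could then be inserted one record at a time with testcollection.insert_one, as in the script, or in a single call with PyMongo's insert_many when the list is not empty.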

When the script is run, it should insert one record for each file in the “Pictures” directory on the machine, and each record should have a “Title” and a “Document” field. This can be verified by looking at Robo 3T:

collection_verification
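The same check can also be scripted. Below is a rough sketch that compares the number of records in the collection with the number of files on disk. The helper names are hypothetical, and it assumes the collection was empty before the script ran (running the script twice would double the record count). count_documents is a PyMongo Collection method in recent PyMongo releases.

```python
import os


def count_uploadable_files(directory, skip=("desktop.ini",)):
    """Count the regular files in directory, ignoring any names in skip."""
    return sum(
        1
        for name in os.listdir(directory)
        if os.path.isfile(os.path.join(directory, name)) and name not in skip
    )


def upload_is_complete(collection, directory):
    """Return True when the collection holds exactly one record per file.

    collection is expected to behave like a PyMongo collection, for
    example testcollection from the script above.
    """
    return collection.count_documents({}) == count_uploadable_files(directory)
```

For the setup in this post, the call would look like upload_is_complete(testcollection, directory).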

There should be three fields for each record (document) in the collection. The _id field is auto-generated. The Document field is a base64-encoded representation of each picture and PDF that was loaded, and the Title field gives a way to search for the documents. The query db.getCollection('blogpictures').find({}) can be run directly, or you can right-click on the collection and then left-click “View Documents.”
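To get a file back out, the Document field has to be base64-decoded again. Here is a small sketch of that round trip, assuming records shaped like the ones described above. export_document is a hypothetical helper name, while find_one is the PyMongo call for fetching a single record that matches a filter.

```python
import base64


def export_document(collection, title, output_path):
    """Look up a record by Title and write its decoded bytes to output_path.

    Returns False when no record has that Title, True otherwise.
    collection is expected to behave like a PyMongo collection.
    """
    record = collection.find_one({"Title": title})
    if record is None:
        return False
    with open(output_path, "wb") as handle:
        # Reverse the base64 encoding applied at upload time
        handle.write(base64.b64decode(record["Document"]))
    return True
```

A call such as export_document(testcollection, "Picture0", "restored.jpg") would write the first uploaded file back to disk. Note that the records do not store the original file name or extension, so the output name here is a guess on the caller's part.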

You now have a fully operational Mongo server along with a database, a collection, and a set of documents to experiment with.  Feel free to leave comments or suggestions since this is a basic setup document.  We will go into how to set up a replication server and shard our collections in another post.  Good luck and thanks for reading!