Lecture 8: Working with Geolocation Data

The Art of Analyzing Big Data - The Data Scientist’s Toolbox

By Dr. Michael Fire


0. Package Setup

For this lecture, we are going to use the TuriCreate, spaCy, Cartopy, imageio, PyMongo, GeoPandas, descartes, GeoPy, and Folium packages. Let's set them up:

In [37]:
!pip install turicreate
!pip install spaCy
!pip install pymongo
!pip install geopandas
!pip install descartes
!pip install geopy
!pip install folium
Requirement already satisfied: turicreate in /anaconda3/envs/massivedata/lib/python3.6/site-packages (6.1)
Requirement already satisfied: numpy in /anaconda3/envs/massivedata/lib/python3.6/site-packages (from turicreate) (1.17.2)
Requirement already satisfied: pillow>=5.2.0 in /anaconda3/envs/massivedata/lib/python3.6/site-packages (from turicreate) (6.2.0)
Requirement already satisfied: requests>=2.9.1 in /anaconda3/envs/massivedata/lib/python3.6/site-packages (from turicreate) (2.22.0)
Requirement already satisfied: prettytable==0.7.2 in /anaconda3/envs/massivedata/lib/python3.6/site-packages (from turicreate) (0.7.2)
Requirement already satisfied: coremltools==3.3 in /anaconda3/envs/massivedata/lib/python3.6/site-packages (from turicreate) (3.3)
Requirement already satisfied: six>=1.10.0 in /anaconda3/envs/massivedata/lib/python3.6/site-packages (from turicreate) (1.12.0)
Requirement already satisfied: pandas>=0.23.2 in /anaconda3/envs/massivedata/lib/python3.6/site-packages (from turicreate) (0.25.1)
Requirement already satisfied: decorator>=4.0.9 in /anaconda3/envs/massivedata/lib/python3.6/site-packages (from turicreate) (4.4.0)
Requirement already satisfied: scipy>=1.1.0 in /anaconda3/envs/massivedata/lib/python3.6/site-packages (from turicreate) (1.3.1)
Requirement already satisfied: tensorflow>=2.0.0 in /anaconda3/envs/massivedata/lib/python3.6/site-packages (from turicreate) (2.1.0)
Requirement already satisfied: resampy==0.2.1 in /anaconda3/envs/massivedata/lib/python3.6/site-packages (from turicreate) (0.2.1)
Requirement already satisfied: idna<2.9,>=2.5 in /anaconda3/envs/massivedata/lib/python3.6/site-packages (from requests>=2.9.1->turicreate) (2.8)
Requirement already satisfied: certifi>=2017.4.17 in /anaconda3/envs/massivedata/lib/python3.6/site-packages (from requests>=2.9.1->turicreate) (2019.9.11)
Requirement already satisfied: urllib3!=1.25.0,!=1.25.1,<1.26,>=1.21.1 in /anaconda3/envs/massivedata/lib/python3.6/site-packages (from requests>=2.9.1->turicreate) (1.24.2)
Requirement already satisfied: chardet<3.1.0,>=3.0.2 in /anaconda3/envs/massivedata/lib/python3.6/site-packages (from requests>=2.9.1->turicreate) (3.0.4)
Requirement already satisfied: protobuf>=3.1.0 in /anaconda3/envs/massivedata/lib/python3.6/site-packages (from coremltools==3.3->turicreate) (3.11.3)
Requirement already satisfied: python-dateutil>=2.6.1 in /anaconda3/envs/massivedata/lib/python3.6/site-packages (from pandas>=0.23.2->turicreate) (2.8.0)
Requirement already satisfied: pytz>=2017.2 in /anaconda3/envs/massivedata/lib/python3.6/site-packages (from pandas>=0.23.2->turicreate) (2019.3)
Requirement already satisfied: opt-einsum>=2.3.2 in /anaconda3/envs/massivedata/lib/python3.6/site-packages (from tensorflow>=2.0.0->turicreate) (3.1.0)
Requirement already satisfied: absl-py>=0.7.0 in /anaconda3/envs/massivedata/lib/python3.6/site-packages (from tensorflow>=2.0.0->turicreate) (0.9.0)
Requirement already satisfied: astor>=0.6.0 in /anaconda3/envs/massivedata/lib/python3.6/site-packages (from tensorflow>=2.0.0->turicreate) (0.8.1)
Requirement already satisfied: keras-preprocessing>=1.1.0 in /anaconda3/envs/massivedata/lib/python3.6/site-packages (from tensorflow>=2.0.0->turicreate) (1.1.0)
Requirement already satisfied: google-pasta>=0.1.6 in /anaconda3/envs/massivedata/lib/python3.6/site-packages (from tensorflow>=2.0.0->turicreate) (0.1.8)
Requirement already satisfied: wrapt>=1.11.1 in /anaconda3/envs/massivedata/lib/python3.6/site-packages (from tensorflow>=2.0.0->turicreate) (1.11.2)
Requirement already satisfied: termcolor>=1.1.0 in /anaconda3/envs/massivedata/lib/python3.6/site-packages (from tensorflow>=2.0.0->turicreate) (1.1.0)
Requirement already satisfied: gast==0.2.2 in /anaconda3/envs/massivedata/lib/python3.6/site-packages (from tensorflow>=2.0.0->turicreate) (0.2.2)
Requirement already satisfied: tensorboard<2.2.0,>=2.1.0 in /anaconda3/envs/massivedata/lib/python3.6/site-packages (from tensorflow>=2.0.0->turicreate) (2.1.0)
Requirement already satisfied: wheel>=0.26; python_version >= "3" in /anaconda3/envs/massivedata/lib/python3.6/site-packages (from tensorflow>=2.0.0->turicreate) (0.33.6)
Requirement already satisfied: keras-applications>=1.0.8 in /anaconda3/envs/massivedata/lib/python3.6/site-packages (from tensorflow>=2.0.0->turicreate) (1.0.8)
Requirement already satisfied: grpcio>=1.8.6 in /anaconda3/envs/massivedata/lib/python3.6/site-packages (from tensorflow>=2.0.0->turicreate) (1.27.2)
Requirement already satisfied: tensorflow-estimator<2.2.0,>=2.1.0rc0 in /anaconda3/envs/massivedata/lib/python3.6/site-packages (from tensorflow>=2.0.0->turicreate) (2.1.0)
Requirement already satisfied: numba>=0.32 in /anaconda3/envs/massivedata/lib/python3.6/site-packages (from resampy==0.2.1->turicreate) (0.45.1)
Requirement already satisfied: setuptools in /anaconda3/envs/massivedata/lib/python3.6/site-packages (from protobuf>=3.1.0->coremltools==3.3->turicreate) (41.4.0)
Requirement already satisfied: google-auth<2,>=1.6.3 in /anaconda3/envs/massivedata/lib/python3.6/site-packages (from tensorboard<2.2.0,>=2.1.0->tensorflow>=2.0.0->turicreate) (1.11.2)
Requirement already satisfied: markdown>=2.6.8 in /anaconda3/envs/massivedata/lib/python3.6/site-packages (from tensorboard<2.2.0,>=2.1.0->tensorflow>=2.0.0->turicreate) (3.2.1)
Requirement already satisfied: werkzeug>=0.11.15 in /anaconda3/envs/massivedata/lib/python3.6/site-packages (from tensorboard<2.2.0,>=2.1.0->tensorflow>=2.0.0->turicreate) (0.16.0)
Requirement already satisfied: google-auth-oauthlib<0.5,>=0.4.1 in /anaconda3/envs/massivedata/lib/python3.6/site-packages (from tensorboard<2.2.0,>=2.1.0->tensorflow>=2.0.0->turicreate) (0.4.1)
Requirement already satisfied: h5py in /anaconda3/envs/massivedata/lib/python3.6/site-packages (from keras-applications>=1.0.8->tensorflow>=2.0.0->turicreate) (2.9.0)
Requirement already satisfied: llvmlite>=0.29.0dev0 in /anaconda3/envs/massivedata/lib/python3.6/site-packages (from numba>=0.32->resampy==0.2.1->turicreate) (0.29.0)
Requirement already satisfied: pyasn1-modules>=0.2.1 in /anaconda3/envs/massivedata/lib/python3.6/site-packages (from google-auth<2,>=1.6.3->tensorboard<2.2.0,>=2.1.0->tensorflow>=2.0.0->turicreate) (0.2.8)
Requirement already satisfied: rsa<4.1,>=3.1.4 in /anaconda3/envs/massivedata/lib/python3.6/site-packages (from google-auth<2,>=1.6.3->tensorboard<2.2.0,>=2.1.0->tensorflow>=2.0.0->turicreate) (4.0)
Requirement already satisfied: cachetools<5.0,>=2.0.0 in /anaconda3/envs/massivedata/lib/python3.6/site-packages (from google-auth<2,>=1.6.3->tensorboard<2.2.0,>=2.1.0->tensorflow>=2.0.0->turicreate) (4.0.0)
Requirement already satisfied: requests-oauthlib>=0.7.0 in /anaconda3/envs/massivedata/lib/python3.6/site-packages (from google-auth-oauthlib<0.5,>=0.4.1->tensorboard<2.2.0,>=2.1.0->tensorflow>=2.0.0->turicreate) (1.3.0)
Requirement already satisfied: pyasn1<0.5.0,>=0.4.6 in /anaconda3/envs/massivedata/lib/python3.6/site-packages (from pyasn1-modules>=0.2.1->google-auth<2,>=1.6.3->tensorboard<2.2.0,>=2.1.0->tensorflow>=2.0.0->turicreate) (0.4.8)
Requirement already satisfied: oauthlib>=3.0.0 in /anaconda3/envs/massivedata/lib/python3.6/site-packages (from requests-oauthlib>=0.7.0->google-auth-oauthlib<0.5,>=0.4.1->tensorboard<2.2.0,>=2.1.0->tensorflow>=2.0.0->turicreate) (3.1.0)
Requirement already satisfied: spaCy in /anaconda3/envs/massivedata/lib/python3.6/site-packages (2.2.4)
Requirement already satisfied: srsly<1.1.0,>=1.0.2 in /anaconda3/envs/massivedata/lib/python3.6/site-packages (from spaCy) (1.0.2)
Requirement already satisfied: setuptools in /anaconda3/envs/massivedata/lib/python3.6/site-packages (from spaCy) (41.4.0)
Requirement already satisfied: blis<0.5.0,>=0.4.0 in /anaconda3/envs/massivedata/lib/python3.6/site-packages (from spaCy) (0.4.1)
Requirement already satisfied: wasabi<1.1.0,>=0.4.0 in /anaconda3/envs/massivedata/lib/python3.6/site-packages (from spaCy) (0.6.0)
Requirement already satisfied: numpy>=1.15.0 in /anaconda3/envs/massivedata/lib/python3.6/site-packages (from spaCy) (1.17.2)
Requirement already satisfied: catalogue<1.1.0,>=0.0.7 in /anaconda3/envs/massivedata/lib/python3.6/site-packages (from spaCy) (1.0.0)
Requirement already satisfied: requests<3.0.0,>=2.13.0 in /anaconda3/envs/massivedata/lib/python3.6/site-packages (from spaCy) (2.22.0)
Requirement already satisfied: thinc==7.4.0 in /anaconda3/envs/massivedata/lib/python3.6/site-packages (from spaCy) (7.4.0)
Requirement already satisfied: tqdm<5.0.0,>=4.38.0 in /anaconda3/envs/massivedata/lib/python3.6/site-packages (from spaCy) (4.46.0)
Requirement already satisfied: murmurhash<1.1.0,>=0.28.0 in /anaconda3/envs/massivedata/lib/python3.6/site-packages (from spaCy) (1.0.2)
Requirement already satisfied: cymem<2.1.0,>=2.0.2 in /anaconda3/envs/massivedata/lib/python3.6/site-packages (from spaCy) (2.0.3)
Requirement already satisfied: preshed<3.1.0,>=3.0.2 in /anaconda3/envs/massivedata/lib/python3.6/site-packages (from spaCy) (3.0.2)
Requirement already satisfied: plac<1.2.0,>=0.9.6 in /anaconda3/envs/massivedata/lib/python3.6/site-packages (from spaCy) (1.1.3)
Requirement already satisfied: importlib-metadata>=0.20; python_version < "3.8" in /anaconda3/envs/massivedata/lib/python3.6/site-packages (from catalogue<1.1.0,>=0.0.7->spaCy) (0.23)
Requirement already satisfied: idna<2.9,>=2.5 in /anaconda3/envs/massivedata/lib/python3.6/site-packages (from requests<3.0.0,>=2.13.0->spaCy) (2.8)
Requirement already satisfied: certifi>=2017.4.17 in /anaconda3/envs/massivedata/lib/python3.6/site-packages (from requests<3.0.0,>=2.13.0->spaCy) (2019.9.11)
Requirement already satisfied: urllib3!=1.25.0,!=1.25.1,<1.26,>=1.21.1 in /anaconda3/envs/massivedata/lib/python3.6/site-packages (from requests<3.0.0,>=2.13.0->spaCy) (1.24.2)
Requirement already satisfied: chardet<3.1.0,>=3.0.2 in /anaconda3/envs/massivedata/lib/python3.6/site-packages (from requests<3.0.0,>=2.13.0->spaCy) (3.0.4)
Requirement already satisfied: zipp>=0.5 in /anaconda3/envs/massivedata/lib/python3.6/site-packages (from importlib-metadata>=0.20; python_version < "3.8"->catalogue<1.1.0,>=0.0.7->spaCy) (0.6.0)
Requirement already satisfied: more-itertools in /anaconda3/envs/massivedata/lib/python3.6/site-packages (from zipp>=0.5->importlib-metadata>=0.20; python_version < "3.8"->catalogue<1.1.0,>=0.0.7->spaCy) (7.2.0)
Requirement already satisfied: pymongo in /anaconda3/envs/massivedata/lib/python3.6/site-packages (3.10.1)
Requirement already satisfied: geopandas in /anaconda3/envs/massivedata/lib/python3.6/site-packages (0.7.0)
Requirement already satisfied: fiona in /anaconda3/envs/massivedata/lib/python3.6/site-packages (from geopandas) (1.8.13.post1)
Requirement already satisfied: pandas>=0.23.0 in /anaconda3/envs/massivedata/lib/python3.6/site-packages (from geopandas) (0.25.1)
Requirement already satisfied: pyproj>=2.2.0 in /anaconda3/envs/massivedata/lib/python3.6/site-packages (from geopandas) (2.3.1)
Requirement already satisfied: shapely in /anaconda3/envs/massivedata/lib/python3.6/site-packages (from geopandas) (1.6.4.post2)
Requirement already satisfied: click-plugins>=1.0 in /anaconda3/envs/massivedata/lib/python3.6/site-packages (from fiona->geopandas) (1.1.1)
Requirement already satisfied: attrs>=17 in /anaconda3/envs/massivedata/lib/python3.6/site-packages (from fiona->geopandas) (19.2.0)
Requirement already satisfied: munch in /anaconda3/envs/massivedata/lib/python3.6/site-packages (from fiona->geopandas) (2.5.0)
Requirement already satisfied: cligj>=0.5 in /anaconda3/envs/massivedata/lib/python3.6/site-packages (from fiona->geopandas) (0.5.0)
Requirement already satisfied: click<8,>=4.0 in /anaconda3/envs/massivedata/lib/python3.6/site-packages (from fiona->geopandas) (7.0)
Requirement already satisfied: six>=1.7 in /anaconda3/envs/massivedata/lib/python3.6/site-packages (from fiona->geopandas) (1.12.0)
Requirement already satisfied: pytz>=2017.2 in /anaconda3/envs/massivedata/lib/python3.6/site-packages (from pandas>=0.23.0->geopandas) (2019.3)
Requirement already satisfied: python-dateutil>=2.6.1 in /anaconda3/envs/massivedata/lib/python3.6/site-packages (from pandas>=0.23.0->geopandas) (2.8.0)
Requirement already satisfied: numpy>=1.13.3 in /anaconda3/envs/massivedata/lib/python3.6/site-packages (from pandas>=0.23.0->geopandas) (1.17.2)
Requirement already satisfied: descartes in /anaconda3/envs/massivedata/lib/python3.6/site-packages (1.1.0)
Requirement already satisfied: matplotlib in /anaconda3/envs/massivedata/lib/python3.6/site-packages (from descartes) (3.1.1)
Requirement already satisfied: cycler>=0.10 in /anaconda3/envs/massivedata/lib/python3.6/site-packages (from matplotlib->descartes) (0.10.0)
Requirement already satisfied: kiwisolver>=1.0.1 in /anaconda3/envs/massivedata/lib/python3.6/site-packages (from matplotlib->descartes) (1.1.0)
Requirement already satisfied: pyparsing!=2.0.4,!=2.1.2,!=2.1.6,>=2.0.1 in /anaconda3/envs/massivedata/lib/python3.6/site-packages (from matplotlib->descartes) (2.4.2)
Requirement already satisfied: python-dateutil>=2.1 in /anaconda3/envs/massivedata/lib/python3.6/site-packages (from matplotlib->descartes) (2.8.0)
Requirement already satisfied: numpy>=1.11 in /anaconda3/envs/massivedata/lib/python3.6/site-packages (from matplotlib->descartes) (1.17.2)
Requirement already satisfied: six in /anaconda3/envs/massivedata/lib/python3.6/site-packages (from cycler>=0.10->matplotlib->descartes) (1.12.0)
Requirement already satisfied: setuptools in /anaconda3/envs/massivedata/lib/python3.6/site-packages (from kiwisolver>=1.0.1->matplotlib->descartes) (41.4.0)
Requirement already satisfied: geopy in /anaconda3/envs/massivedata/lib/python3.6/site-packages (1.21.0)
Requirement already satisfied: geographiclib<2,>=1.49 in /anaconda3/envs/massivedata/lib/python3.6/site-packages (from geopy) (1.50)
Collecting folium
  Downloading https://files.pythonhosted.org/packages/a4/f0/44e69d50519880287cc41e7c8a6acc58daa9a9acf5f6afc52bcc70f69a6d/folium-0.11.0-py2.py3-none-any.whl (93kB)
     |████████████████████████████████| 102kB 615kB/s ta 0:00:01
Collecting branca>=0.3.0 (from folium)
  Downloading https://files.pythonhosted.org/packages/13/fb/9eacc24ba3216510c6b59a4ea1cd53d87f25ba76237d7f4393abeaf4c94e/branca-0.4.1-py3-none-any.whl
Requirement already satisfied: jinja2>=2.9 in /anaconda3/envs/massivedata/lib/python3.6/site-packages (from folium) (2.10.3)
Requirement already satisfied: requests in /anaconda3/envs/massivedata/lib/python3.6/site-packages (from folium) (2.22.0)
Requirement already satisfied: numpy in /anaconda3/envs/massivedata/lib/python3.6/site-packages (from folium) (1.17.2)
Requirement already satisfied: MarkupSafe>=0.23 in /anaconda3/envs/massivedata/lib/python3.6/site-packages (from jinja2>=2.9->folium) (1.1.1)
Requirement already satisfied: certifi>=2017.4.17 in /anaconda3/envs/massivedata/lib/python3.6/site-packages (from requests->folium) (2019.9.11)
Requirement already satisfied: chardet<3.1.0,>=3.0.2 in /anaconda3/envs/massivedata/lib/python3.6/site-packages (from requests->folium) (3.0.4)
Requirement already satisfied: idna<2.9,>=2.5 in /anaconda3/envs/massivedata/lib/python3.6/site-packages (from requests->folium) (2.8)
Requirement already satisfied: urllib3!=1.25.0,!=1.25.1,<1.26,>=1.21.1 in /anaconda3/envs/massivedata/lib/python3.6/site-packages (from requests->folium) (1.24.2)
Installing collected packages: branca, folium
Successfully installed branca-0.4.1 folium-0.11.0
In [66]:
!conda install -y -c conda-forge imageio
!conda install -y -c conda-forge cartopy
Collecting package metadata (repodata.json): done
Solving environment: done


==> WARNING: A newer version of conda exists. <==
  current version: 4.8.2
  latest version: 4.8.3

Please update conda by running

    $ conda update -n base conda



# All requested packages already installed.

Collecting package metadata (repodata.json): done
Solving environment: done


==> WARNING: A newer version of conda exists. <==
  current version: 4.8.2
  latest version: 4.8.3

Please update conda by running

    $ conda update -n base conda



# All requested packages already installed.

Let's install and set up the Kaggle package:

In [10]:
# Installing the Kaggle package
import json
import os
!pip install kaggle 

# Important note: complete this with your own key. After running this for the first time, remember to **remove** your API key
api_token = {"username":"<Insert Your Kaggle User Name>","key":"<Insert Your Kaggle API key>"}

# creating kaggle.json file with the personal API-Key details 
# You can also put this file on your Google Drive
kaggle_dir = os.path.expanduser('~/.kaggle')  # open() does not expand '~' by itself
os.makedirs(kaggle_dir, exist_ok=True)        # make sure the directory exists
with open(os.path.join(kaggle_dir, 'kaggle.json'), 'w') as file:
    json.dump(api_token, file)
!chmod 600 ~/.kaggle/kaggle.json
Requirement already satisfied: kaggle in /anaconda3/envs/massivedata/lib/python3.6/site-packages (1.5.6)
Requirement already satisfied: python-dateutil in /anaconda3/envs/massivedata/lib/python3.6/site-packages (from kaggle) (2.8.0)
Requirement already satisfied: python-slugify in /anaconda3/envs/massivedata/lib/python3.6/site-packages (from kaggle) (4.0.0)
Requirement already satisfied: requests in /anaconda3/envs/massivedata/lib/python3.6/site-packages (from kaggle) (2.22.0)
Requirement already satisfied: tqdm in /anaconda3/envs/massivedata/lib/python3.6/site-packages (from kaggle) (4.46.0)
Requirement already satisfied: urllib3<1.25,>=1.21.1 in /anaconda3/envs/massivedata/lib/python3.6/site-packages (from kaggle) (1.24.2)
Requirement already satisfied: certifi in /anaconda3/envs/massivedata/lib/python3.6/site-packages (from kaggle) (2019.9.11)
Requirement already satisfied: six>=1.10 in /anaconda3/envs/massivedata/lib/python3.6/site-packages (from kaggle) (1.12.0)
Requirement already satisfied: text-unidecode>=1.3 in /anaconda3/envs/massivedata/lib/python3.6/site-packages (from python-slugify->kaggle) (1.3)
Requirement already satisfied: idna<2.9,>=2.5 in /anaconda3/envs/massivedata/lib/python3.6/site-packages (from requests->kaggle) (2.8)
Requirement already satisfied: chardet<3.1.0,>=3.0.2 in /anaconda3/envs/massivedata/lib/python3.6/site-packages (from requests->kaggle) (3.0.4)
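
The cell above asks you to remove the API key by hand after the first run. One way to avoid pasting the key into the notebook at all is to read it from environment variables and write the credentials file with owner-only permissions. The following is a minimal sketch under stated assumptions: `write_kaggle_credentials` is a hypothetical helper (not part of the Kaggle package), the variables `KAGGLE_USERNAME` and `KAGGLE_KEY` are assumed to be set in your shell, and the demo writes to a temporary directory rather than to `~/.kaggle`.

```python
import json
import os
import stat
import tempfile


def write_kaggle_credentials(username, key, kaggle_dir):
    """Write kaggle.json into kaggle_dir, readable and writable only by the owner."""
    os.makedirs(kaggle_dir, exist_ok=True)
    path = os.path.join(kaggle_dir, "kaggle.json")
    with open(path, "w") as f:
        json.dump({"username": username, "key": key}, f)
    # Same effect as `chmod 600` on POSIX systems.
    os.chmod(path, stat.S_IRUSR | stat.S_IWUSR)
    return path


# Demo in a throwaway directory; in the notebook, kaggle_dir would be
# os.path.expanduser("~/.kaggle"). The placeholders are used only when
# the environment variables are not set.
demo_dir = tempfile.mkdtemp()
demo_path = write_kaggle_credentials(
    os.environ.get("KAGGLE_USERNAME", "<user>"),
    os.environ.get("KAGGLE_KEY", "<key>"),
    demo_dir,
)
print(demo_path)
```

With this approach the key never appears in a saved notebook cell, so there is nothing to remove afterwards.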

1. Working with PyMongo

To work with MongoDB, we first need to download and install it. Because this notebook talks to a MongoDB server directly, I prefer to run it locally on my laptop, where the server listens on localhost. Another option is to use MongoDB Atlas, the hosted MongoDB service.

Now let's run MongoDB and test the connection to it:

In [5]:
import pymongo
client = pymongo.MongoClient("mongodb://localhost:27017/")
client.list_database_names()
Out[5]:
['admin', 'config', 'local', 'locations']

Now let's create a new collection and load the US baby names dataset into it:

In [13]:
# -p also creates ./datasets if it does not exist yet
!mkdir -p ./datasets/us-baby-name

# download the dataset from Kaggle and unzip it
!kaggle datasets download kaggle/us-baby-names -f StateNames.csv -p ./datasets/

!unzip ./datasets/StateNames.csv.zip  -d ./datasets/us-baby-name/
Downloading StateNames.csv.zip to ./datasets
 98%|█████████████████████████████████████▍| 30.0M/30.5M [00:04<00:00, 6.79MB/s]
100%|██████████████████████████████████████| 30.5M/30.5M [00:04<00:00, 6.42MB/s]
Archive:  ./datasets/StateNames.csv.zip
  inflating: ./datasets/us-baby-name/StateNames.csv  
In [15]:
import turicreate as tc
import turicreate.aggregate as agg


sf = tc.SFrame.read_csv("./datasets/us-baby-name/StateNames.csv")
sf
Finished parsing file /Users/michael/Dropbox (BGU)/massive data mining/ 2020/notebooks/datasets/us-baby-name/StateNames.csv
Parsing completed. Parsed 100 lines in 1.31749 secs.
------------------------------------------------------
Inferred types from first 100 line(s) of file as 
column_type_hints=[int,str,int,str,str,int]
If parsing fails due to incorrect types, you can correct
the inferred type list above and pass it to read_csv in
the column_type_hints argument
------------------------------------------------------
Read 1940415 lines. Lines per second: 1.30741e+06
Finished parsing file /Users/michael/Dropbox (BGU)/massive data mining/ 2020/notebooks/datasets/us-baby-name/StateNames.csv
Parsing completed. Parsed 5647426 lines in 2.93838 secs.
Out[15]:
Id Name Year Gender State Count
1 Mary 1910 F AK 14
2 Annie 1910 F AK 12
3 Anna 1910 F AK 10
4 Margaret 1910 F AK 8
5 Helen 1910 F AK 7
6 Elsie 1910 F AK 6
7 Lucy 1910 F AK 6
8 Dorothy 1910 F AK 5
9 Mary 1911 F AK 12
10 Margaret 1911 F AK 7
[5647426 rows x 6 columns]
Note: Only the head of the SFrame is printed.
You can use print_rows(num_rows=m, num_columns=n) to print more rows and columns.
In [16]:
sf['Year_Count'] =  sf.apply(lambda r: (r['Year'], r['Count']))
g = sf.groupby(["State", "Name", "Gender"], {"years_list" : agg.CONCAT("Year_Count")})
g
Out[16]:
Gender Name State years_list
M Ferman LA [[1926.0, 5.0]]
M Holton NC [[2013.0, 5.0]]
F Faiga NJ [[1995.0, 5.0], [1999.0,
5.0], [2007.0, 17.0], ...
F Carlie MD [[1990.0, 6.0], [1997.0,
7.0], [1998.0, 10.0], ...
M Beau SC [[2014.0, 16.0], [1981.0,
6.0], [2012.0, 7.0], ...
F Kaitlynn WA [[1995.0, 19.0], [1996.0,
26.0], [2003.0, 15.0], ...
F Tiny NC [[1917.0, 5.0], [1936.0,
5.0], [1920.0, 5.0], ...
F Cayla NJ [[1993.0, 6.0], [1998.0,
12.0], [1999.0, 14.0], ...
M Renardo NC [[1983.0, 6.0]]
M Auther SC [[1936.0, 5.0], [1937.0,
5.0], [1920.0, 6.0], ...
[304918 rows x 4 columns]
Note: Only the head of the SFrame is printed.
You can use print_rows(num_rows=m, num_columns=n) to print more rows and columns.
In [17]:
g['YearsCountDict'] = g['years_list'].apply(lambda l: {str(int(y)): int(c) for y,c in l})
g = g.remove_column("years_list")
g
Out[17]:
Gender Name State YearsCountDict
M Ferman LA {'1926': 5}
M Holton NC {'2013': 5}
F Faiga NJ {'1995': 5, '1999': 5,
'2007': 17, '2008': 8, ...
F Carlie MD {'1990': 6, '1997': 7,
'1998': 10, '1992': 12, ...
M Beau SC {'2014': 16, '1981': 6,
'2012': 7, '2008': 7, ...
F Kaitlynn WA {'1995': 19, '1996': 26,
'2003': 15, '2008': 18, ...
F Tiny NC {'1917': 5, '1936': 5,
'1920': 5, '1932': 5, ...
F Cayla NJ {'1993': 6, '1998': 12,
'1999': 14, '2003': 12, ...
M Renardo NC {'1983': 6}
M Auther SC {'1936': 5, '1937': 5,
'1920': 6, '1939': 5, ...
[304918 rows x 4 columns]
Note: Only the head of the SFrame is printed.
You can use print_rows(num_rows=m, num_columns=n) to print more rows and columns.
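
The last two cells collapse the per-year rows into one record per (State, Name, Gender) with a year-to-count dictionary. The same reshaping can be sketched in plain Python, which may make the target document shape easier to see. This is an illustration only, not the TuriCreate implementation: the rows are a handful copied from the StateNames.csv preview above, and the variable names are assumptions. Note that the years are converted to strings because MongoDB document keys must be strings.

```python
from collections import defaultdict

# A few rows copied from the StateNames.csv preview:
# (Name, Year, Gender, State, Count)
rows = [
    ("Mary", 1910, "F", "AK", 14),
    ("Annie", 1910, "F", "AK", 12),
    ("Margaret", 1910, "F", "AK", 8),
    ("Mary", 1911, "F", "AK", 12),
    ("Margaret", 1911, "F", "AK", 7),
]

# Group by (State, Name, Gender) and collect {year: count} per group.
grouped = defaultdict(dict)
for name, year, gender, state, count in rows:
    # Document keys must be strings, hence str(year).
    grouped[(state, name, gender)][str(year)] = count

# One dictionary per group, in the shape that is later stored in MongoDB.
documents = [
    {"State": state, "Name": name, "Gender": gender, "YearsCountDict": years}
    for (state, name, gender), years in grouped.items()
]
for doc in documents:
    print(doc)
```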

Let's insert data from the SFrame object into a MongoDB collection:

In [18]:
db = client['baby_names'] # Get the baby_names DB (MongoDB creates it on the first write)
collection = db['names']  # Get the names collection (also created on the first write)

# insert each row as a document 
for r in g[:100]:
    collection.insert_one(r) 

print(f"Total documents {collection.count_documents({})}")
Total documents 100

Let's delete all the documents in the collection:

In [19]:
collection.delete_many({}) # delete all documents
print(f"Total documents {collection.count_documents({})}")
Total documents 0
In [20]:
collection.insert_many(g)
print(f"Total documents {collection.count_documents({})}")
Total documents 304918
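
The cell above passes the whole SFrame to insert_many in a single call. If you want progress reporting, or want to limit how many rows are held as Python dictionaries at once, the rows can be inserted in fixed-size batches instead. Below is a minimal sketch under stated assumptions: `insert_in_batches` and `batch_size` are hypothetical names, and `FakeCollection` is a stand-in used only so the example runs without a MongoDB server. With a real PyMongo collection you would pass `collection` and `g` instead.

```python
from itertools import islice


def insert_in_batches(collection, rows, batch_size=10_000):
    """Insert rows through collection.insert_many in chunks; return the number inserted."""
    it = iter(rows)
    total = 0
    while True:
        # Take up to batch_size rows and copy each one into a plain dict.
        batch = [dict(r) for r in islice(it, batch_size)]
        if not batch:
            break
        collection.insert_many(batch)
        total += len(batch)
    return total


class FakeCollection:
    """Stand-in for a PyMongo collection; it only records what was inserted."""

    def __init__(self):
        self.docs = []
        self.calls = 0

    def insert_many(self, docs):
        self.calls += 1
        self.docs.extend(docs)


fake = FakeCollection()
inserted = insert_in_batches(fake, ({"Name": f"n{i}"} for i in range(25)), batch_size=10)
print(inserted, fake.calls)
```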

Let's search for documents:

In [21]:
collection.find_one({ "Name": "Mary", "Gender": "F" })
Out[21]:
{'_id': ObjectId('5eb51b889993d0175fb5f810'),
 'Gender': 'F',
 'Name': 'Mary',
 'State': 'CA',
 'YearsCountDict': {'1910': 295,
  '1915': 998,
  '1918': 1252,
  '1928': 1787,
  '1930': 1851,
  '1964': 2709,
  '1971': 1056,
  '1983': 751,
  '2004': 292,
  '1912': 534,
  '1916': 1091,
  '1921': 1697,
  '1929': 1713,
  '1945': 3019,
  '1947': 3460,
  '1953': 3400,
  '1980': 831,
  '1984': 756,
  '1985': 706,
  '1988': 707,
  '1991': 723,
  '1997': 446,
  '1999': 438,
  '2003': 304,
  '2005': 282,
  '2014': 171,
  '1944': 2975,
  '1946': 3147,
  '1950': 3134,
  '1963': 2608,
  '1965': 2240,
  '1966': 1852,
  '1974': 742,
  '1979': 806,
  '1996': 500,
  '1998': 428,
  '2010': 177,
  '2011': 201,
  '2012': 166,
  '1917': 1149,
  '1925': 1890,
  '1931': 1626,
  '1939': 1699,
  '1941': 1951,
  '1949': 3217,
  '1954': 3718,
  '1961': 2716,
  '1967': 1621,
  '1968': 1447,
  '1970': 1278,
  '1975': 729,
  '1977': 710,
  '1987': 649,
  '1990': 678,
  '1992': 706,
  '1993': 639,
  '1995': 550,
  '2009': 201,
  '1911': 390,
  '1913': 584,
  '1914': 773,
  '1932': 1498,
  '1935': 1484,
  '1940': 1819,
  '1956': 3414,
  '1959': 3192,
  '1960': 3105,
  '2008': 220,
  '1923': 1829,
  '1934': 1590,
  '1936': 1541,
  '1942': 2441,
  '1952': 3422,
  '1958': 3158,
  '1962': 2755,
  '1972': 928,
  '1976': 722,
  '2001': 381,
  '1922': 1732,
  '1924': 1958,
  '1933': 1470,
  '1948': 3426,
  '1981': 845,
  '1982': 837,
  '1994': 567,
  '2000': 416,
  '2007': 239,
  '2013': 190,
  '1919': 1204,
  '1920': 1554,
  '1926': 1719,
  '1927': 1817,
  '1937': 1712,
  '1938': 1876,
  '1943': 2929,
  '1951': 3184,
  '1955': 3389,
  '1957': 3461,
  '1969': 1346,
  '1973': 795,
  '1978': 700,
  '1986': 671,
  '1989': 682,
  '2002': 339,
  '2006': 270}}
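
The years in 'YearsCountDict' come back in no particular order, and the keys are strings. A short sketch of how such a dictionary could be sorted chronologically and scanned for its peak year follows. The values are a few entries copied from the document above (Mary, F, CA), and the variable names are assumptions made for illustration.

```python
# A few entries copied from the 'YearsCountDict' of the document above.
years_count = {"1910": 295, "1964": 2709, "2004": 292, "1954": 3718, "2014": 171}

# Keys are strings, so convert them to int to sort chronologically.
by_year = sorted((int(year), count) for year, count in years_count.items())

# Year with the highest count among the sampled entries.
peak_year, peak_count = max(by_year, key=lambda pair: pair[1])

print(by_year)
print(peak_year, peak_count)
```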
In [22]:
c = list(collection.find({ "Name": "Mary", "Gender": "F" }))
c
Out[22]:
[{'_id': ObjectId('5eb51b889993d0175fb5f810'),
  'Gender': 'F',
  'Name': 'Mary',
  'State': 'CA',
  'YearsCountDict': {'1910': 295,
   '1915': 998,
   '1918': 1252,
   '1928': 1787,
   '1930': 1851,
   '1964': 2709,
   '1971': 1056,
   '1983': 751,
   '2004': 292,
   '1912': 534,
   '1916': 1091,
   '1921': 1697,
   '1929': 1713,
   '1945': 3019,
   '1947': 3460,
   '1953': 3400,
   '1980': 831,
   '1984': 756,
   '1985': 706,
   '1988': 707,
   '1991': 723,
   '1997': 446,
   '1999': 438,
   '2003': 304,
   '2005': 282,
   '2014': 171,
   '1944': 2975,
   '1946': 3147,
   '1950': 3134,
   '1963': 2608,
   '1965': 2240,
   '1966': 1852,
   '1974': 742,
   '1979': 806,
   '1996': 500,
   '1998': 428,
   '2010': 177,
   '2011': 201,
   '2012': 166,
   '1917': 1149,
   '1925': 1890,
   '1931': 1626,
   '1939': 1699,
   '1941': 1951,
   '1949': 3217,
   '1954': 3718,
   '1961': 2716,
   '1967': 1621,
   '1968': 1447,
   '1970': 1278,
   '1975': 729,
   '1977': 710,
   '1987': 649,
   '1990': 678,
   '1992': 706,
   '1993': 639,
   '1995': 550,
   '2009': 201,
   '1911': 390,
   '1913': 584,
   '1914': 773,
   '1932': 1498,
   '1935': 1484,
   '1940': 1819,
   '1956': 3414,
   '1959': 3192,
   '1960': 3105,
   '2008': 220,
   '1923': 1829,
   '1934': 1590,
   '1936': 1541,
   '1942': 2441,
   '1952': 3422,
   '1958': 3158,
   '1962': 2755,
   '1972': 928,
   '1976': 722,
   '2001': 381,
   '1922': 1732,
   '1924': 1958,
   '1933': 1470,
   '1948': 3426,
   '1981': 845,
   '1982': 837,
   '1994': 567,
   '2000': 416,
   '2007': 239,
   '2013': 190,
   '1919': 1204,
   '1920': 1554,
   '1926': 1719,
   '1927': 1817,
   '1937': 1712,
   '1938': 1876,
   '1943': 2929,
   '1951': 3184,
   '1955': 3389,
   '1957': 3461,
   '1969': 1346,
   '1973': 795,
   '1978': 700,
   '1986': 671,
   '1989': 682,
   '2002': 339,
   '2006': 270}},
 {'_id': ObjectId('5eb51b889993d0175fb60b00'),
  'Gender': 'F',
  'Name': 'Mary',
  'State': 'LA',
  'YearsCountDict': {'1993': 154,
   '1996': 139,
   '2003': 113,
   '1927': 1379,
   '1944': 1639,
   '1959': 1128,
   '1980': 250,
   '1983': 227,
   '1999': 119,
   '2000': 110,
   '2002': 114,
   '2006': 79,
   '2010': 61,
   '2013': 57,
   '1912': 658,
   '1919': 1164,
   '1926': 1340,
   '1934': 1243,
   '1940': 1412,
   '1942': 1623,
   '1943': 1603,
   '1963': 872,
   '1965': 683,
   '1982': 251,
   '2004': 88,
   '1913': 634,
   '1921': 1195,
   '1925': 1304,
   '1929': 1306,
   '1949': 1557,
   '1969': 411,
   '1970': 448,
   '1976': 236,
   '1988': 169,
   '1995': 146,
   '2008': 75,
   '2011': 57,
   '1910': 586,
   '1914': 747,
   '1931': 1228,
   '1941': 1555,
   '1952': 1510,
   '1953': 1468,
   '1954': 1564,
   '1960': 1087,
   '1967': 535,
   '1978': 228,
   '1984': 236,
   '1985': 219,
   '1986': 195,
   '1989': 153,
   '1990': 170,
   '1916': 881,
   '1918': 1111,
   '1920': 1301,
   '1922': 1269,
   '1924': 1288,
   '1928': 1265,
   '1932': 1312,
   '1938': 1327,
   '1939': 1485,
   '1950': 1582,
   '1957': 1318,
   '1958': 1214,
   '1971': 355,
   '1991': 175,
   '1992': 161,
   '1997': 123,
   '1998': 144,
   '2001': 104,
   '1911': 502,
   '1915': 817,
   '1923': 1263,
   '1930': 1198,
   '1935': 1213,
   '1936': 1250,
   '1948': 1485,
   '1955': 1343,
   '1956': 1348,
   '1964': 936,
   '1966': 622,
   '1975': 240,
   '1977': 225,
   '1994': 184,
   '2007': 62,
   '2009': 64,
   '2012': 63,
   '1962': 963,
   '1968': 441,
   '1973': 262,
   '1979': 230,
   '1987': 164,
   '2005': 97,
   '2014': 65,
   '1917': 957,
   '1933': 1141,
   '1937': 1299,
   '1945': 1467,
   '1946': 1602,
   '1947': 1639,
   '1951': 1511,
   '1961': 1012,
   '1972': 282,
   '1974': 221,
   '1981': 230}},
 {'_id': ObjectId('5eb51b889993d0175fb63fc5'),
  'Gender': 'F',
  'Name': 'Mary',
  'State': 'SD',
  'YearsCountDict': {'1912': 108,
   '1918': 229,
   '1920': 229,
   '1924': 250,
   '1928': 224,
   '1947': 310,
   '1952': 299,
   '1953': 313,
   '1955': 315,
   '1960': 217,
   '1973': 25,
   '1986': 24,
   '1988': 23,
   '1998': 11,
   '2000': 16,
   '2003': 13,
   '2006': 10,
   '1914': 128,
   '1923': 269,
   '1937': 228,
   '1946': 300,
   '1962': 171,
   '1964': 177,
   '1965': 115,
   '1968': 85,
   '1976': 27,
   '1979': 30,
   '1985': 17,
   '1992': 25,
   '1997': 13,
   '2002': 15,
   '2009': 6,
   '2012': 7,
   '1925': 265,
   '1926': 227,
   '1938': 225,
   '1950': 338,
   '1956': 269,
   '1959': 225,
   '1980': 35,
   '1981': 34,
   '1982': 38,
   '1990': 16,
   '1996': 9,
   '2010': 11,
   '2011': 5,
   '2013': 5,
   '1922': 249,
   '1929': 232,
   '1935': 249,
   '1949': 357,
   '1958': 229,
   '1967': 99,
   '1970': 55,
   '1974': 37,
   '1989': 26,
   '1993': 21,
   '2001': 13,
   '2007': 11,
   '1917': 228,
   '1919': 208,
   '1933': 244,
   '1934': 242,
   '1940': 209,
   '1943': 271,
   '1957': 239,
   '1966': 115,
   '1969': 66,
   '1971': 52,
   '1978': 26,
   '1991': 18,
   '1994': 19,
   '1999': 12,
   '1910': 58,
   '1913': 111,
   '1927': 238,
   '1930': 238,
   '1932': 274,
   '1963': 190,
   '1983': 16,
   '2005': 7,
   '1915': 177,
   '1921': 273,
   '1931': 229,
   '1941': 216,
   '1948': 317,
   '1951': 307,
   '1954': 337,
   '1961': 187,
   '1972': 36,
   '1975': 26,
   '1977': 27,
   '1995': 21,
   '1911': 68,
   '1916': 192,
   '1936': 226,
   '1939': 215,
   '1942': 225,
   '1944': 273,
   '1945': 225,
   '1984': 23,
   '1987': 22,
   '2014': 6}},
 {'_id': ObjectId('5eb51b889993d0175fb64d97'),
  'Gender': 'F',
  'Name': 'Mary',
  'State': 'WV',
  'YearsCountDict': {'1916': 1120,
   '1923': 1522,
   '1924': 1647,
   '1936': 1085,
   '1944': 888,
   '1953': 765,
   '1963': 374,
   '1964': 436,
   '1978': 140,
   '1994': 44,
   '1998': 57,
   '2000': 35,
   '2006': 19,
   '1912': 519,
   '1921': 1494,
   '1925': 1549,
   '1930': 1252,
   '1931': 1185,
   '1942': 1042,
   '1943': 1022,
   '1945': 840,
   '1952': 825,
   '1977': 159,
   '1979': 154,
   '1980': 147,
   '1982': 106,
   '1991': 55,
   '2003': 33,
   '2004': 31,
   '2009': 16,
   '1918': 1220,
   '1919': 1225,
   '1920': 1379,
   '1927': 1467,
   '1937': 1129,
   '1947': 1082,
   '1956': 646,
   '1958': 491,
   '1995': 41,
   '1996': 53,
   '1997': 35,
   '1926': 1436,
   '1934': 1163,
   '1949': 957,
   '1976': 151,
   '1985': 103,
   '1993': 60,
   '2002': 26,
   '2011': 16,
   '1911': 441,
   '1922': 1541,
   '1932': 1184,
   '1933': 1111,
   '1941': 991,
   '1946': 1016,
   '1954': 715,
   '1959': 492,
   '1965': 357,
   '1975': 143,
   '1984': 98,
   '1987': 75,
   '1988': 72,
   '1992': 61,
   '2001': 21,
   '2005': 15,
   '2008': 14,
   '2013': 14,
   '1938': 1087,
   '1951': 843,
   '1955': 626,
   '1957': 559,
   '1962': 422,
   '1966': 285,
   '1967': 228,
   '1968': 248,
   '1969': 229,
   '1970': 224,
   '1981': 127,
   '1986': 78,
   '1914': 790,
   '1928': 1412,
   '1929': 1286,
   '1935': 1149,
   '1940': 1045,
   '1948': 1015,
   '1960': 449,
   '1974': 157,
   '1989': 80,
   '1990': 62,
   '2007': 21,
   '2010': 13,
   '2014': 10,
   '1910': 380,
   '1913': 641,
   '1915': 1054,
   '1917': 1230,
   '1939': 1026,
   '1950': 889,
   '1961': 468,
   '1971': 218,
   '1972': 189,
   '1973': 173,
   '1983': 115,
   '1999': 37}},
 {'_id': ObjectId('5eb51b889993d0175fb64ff9'),
  'Gender': 'F',
  'Name': 'Mary',
  'State': 'HI',
  'YearsCountDict': {'1938': 63,
   '1949': 87,
   '1965': 60,
   '1967': 57,
   '1982': 19,
   '1984': 16,
   '1987': 14,
   '1990': 15,
   '2004': 5,
   '1915': 92,
   '1916': 86,
   '1930': 122,
   '1937': 69,
   '1962': 82,
   '1976': 26,
   '1979': 22,
   '1981': 22,
   '1992': 12,
   '1998': 12,
   '1925': 100,
   '1926': 113,
   '1957': 85,
   '1958': 114,
   '1960': 82,
   '1969': 53,
   '1970': 43,
   '1975': 25,
   '1977': 22,
   '1991': 17,
   '2013': 9,
   '1922': 101,
   '1923': 103,
   '1924': 119,
   '1928': 111,
   '1929': 108,
   '1933': 92,
   '1934': 71,
   '1940': 66,
   '1944': 82,
   '1945': 89,
   '1950': 96,
   '1953': 81,
   '1959': 89,
   '1963': 92,
   '1983': 21,
   '1917': 88,
   '1935': 81,
   '1941': 52,
   '1946': 76,
   '1951': 85,
   '1961': 102,
   '1973': 31,
   '1974': 30,
   '1986': 18,
   '1993': 16,
   '2000': 6,
   '2008': 7,
   '2010': 8,
   '1910': 47,
   '1911': 56,
   '1912': 62,
   '1913': 60,
   '1920': 102,
   '1927': 95,
   '1931': 114,
   '1932': 102,
   '1943': 73,
   '1947': 96,
   '1954': 85,
   '1966': 63,
   '1968': 39,
   '1978': 31,
   '1980': 29,
   '1988': 15,
   '1989': 15,
   '1995': 16,
   '2001': 12,
   '1918': 103,
   '1921': 121,
   '1936': 59,
   '1939': 90,
   '1952': 80,
   '1955': 69,
   '1971': 36,
   '1972': 37,
   '1985': 13,
   '1914': 79,
   '1919': 100,
   '1942': 69,
   '1948': 98,
   '1956': 95,
   '1964': 99,
   '1994': 14,
   '1996': 9,
   '1999': 9,
   '2005': 7,
   '2012': 5,
   '2014': 7}},
 {'_id': ObjectId('5eb51b889993d0175fb6625a'),
  'Gender': 'F',
  'Name': 'Mary',
  'State': 'NY',
  'YearsCountDict': {'1911': 2322,
   '1917': 5502,
   '1933': 3932,
   '1937': 3805,
   '1951': 4732,
   '1970': 1469,
   '1980': 596,
   '1984': 463,
   '1988': 434,
   '1991': 468,
   '2002': 246,
   '2013': 103,
   '1919': 5061,
   '1926': 4779,
   '1940': 3662,
   '1948': 4720,
   '1959': 4869,
   '1967': 2205,
   '1976': 669,
   '1982': 553,
   '1983': 537,
   '1996': 305,
   '1997': 334,
   '1998': 291,
   '1999': 276,
   '1916': 5496,
   '1923': 5160,
   '1942': 4169,
   '1943': 4449,
   '1946': 4797,
   '1947': 4966,
   '1949': 4597,
   '1950': 4734,
   '1963': 3782,
   '1964': 3623,
   '2010': 146,
   '2011': 92,
   '1918': 5526,
   '1920': 5296,
   '1921': 5413,
   '1922': 5303,
   '1925': 4916,
   '1929': 4765,
   '1935': 3766,
   '1938': 3622,
   '1941': 3879,
   '1961': 4426,
   '1972': 951,
   '1974': 797,
   '1981': 585,
   '2001': 247,
   '2006': 153,
   '1912': 2909,
   '1915': 5342,
   '1927': 5094,
   '1928': 4905,
   '1930': 4854,
   '1934': 3701,
   '1952': 4940,
   '1953': 4920,
   '1957': 5265,
   '1958': 5010,
   '1962': 3823,
   '1975': 689,
   '1989': 461,
   '1993': 444,
   '1994': 380,
   '2007': 149,
   '2008': 141,
   '2009': 128,
   '2012': 100,
   '1910': 1923,
   '1914': 4244,
   '1936': 3689,
   '1939': 3570,
   '1954': 5467,
   '1955': 5114,
   '1956': 5063,
   '1968': 1796,
   '1971': 1141,
   '1979': 570,
   '1986': 418,
   '1995': 373,
   '2004': 214,
   '2014': 121,
   '1931': 4591,
   '1932': 4257,
   '1945': 3907,
   '1965': 3076,
   '1985': 455,
   '2000': 293,
   '2003': 206,
   '1913': 3267,
   '1924': 5144,
   '1944': 4100,
   '1960': 4630,
   '1966': 2546,
   '1969': 1555,
   '1973': 820,
   '1977': 665,
   '1978': 539,
   '1987': 453,
   '1990': 465,
   '1992': 464,
   '2005': 207}},
 {'_id': ObjectId('5eb51b899993d0175fb67cd1'),
  'Gender': 'F',
  'Name': 'Mary',
  'State': 'NM',
  'YearsCountDict': {'1912': 155,
   '1919': 270,
   '1921': 416,
   '1922': 377,
   '1924': 400,
   '1926': 342,
   '1942': 519,
   '1971': 86,
   '2008': 18,
   '1928': 380,
   '1931': 390,
   '1932': 402,
   '1943': 528,
   '1946': 557,
   '1950': 512,
   '1953': 488,
   '1956': 425,
   '1966': 170,
   '1967': 144,
   '1974': 73,
   '1976': 60,
   '1977': 72,
   '1983': 48,
   '1989': 51,
   '1992': 25,
   '1999': 13,
   '2006': 16,
   '2012': 8,
   '2013': 6,
   '1915': 202,
   '1916': 250,
   '1927': 394,
   '1935': 475,
   '1941': 505,
   '1949': 522,
   '1964': 257,
   '1969': 97,
   '1975': 60,
   '1997': 14,
   '2002': 21,
   '1917': 239,
   '1923': 382,
   '1933': 370,
   '1936': 432,
   '1937': 481,
   '1944': 509,
   '1945': 469,
   '1947': 589,
   '1951': 516,
   '1952': 508,
   '1963': 264,
   '1973': 72,
   '1993': 29,
   '2001': 22,
   '2014': 12,
   '1910': 98,
   '1911': 111,
   '1914': 140,
   '1918': 319,
   '1929': 394,
   '1938': 453,
   '1954': 509,
   '1955': 455,
   '1961': 289,
   '1968': 120,
   '1972': 77,
   '1984': 45,
   '1987': 38,
   '1990': 43,
   '2003': 15,
   '2004': 16,
   '1925': 392,
   '1948': 547,
   '1959': 366,
   '1994': 26,
   '1998': 28,
   '2010': 14,
   '1913': 127,
   '1920': 307,
   '1934': 433,
   '1939': 504,
   '1958': 386,
   '1960': 342,
   '1979': 72,
   '1981': 58,
   '1982': 69,
   '1988': 34,
   '1995': 22,
   '2005': 15,
   '2007': 17,
   '2009': 11,
   '1930': 389,
   '1940': 481,
   '1957': 440,
   '1962': 285,
   '1965': 208,
   '1970': 111,
   '1978': 61,
   '1980': 57,
   '1985': 46,
   '1986': 46,
   '1991': 33,
   '1996': 34,
   '2000': 29}},
 {'_id': ObjectId('5eb51b899993d0175fb68908'),
  'Gender': 'F',
  'Name': 'Mary',
  'State': 'ME',
  'YearsCountDict': {'1910': 92,
   '1916': 261,
   '1941': 211,
   '1942': 265,
   '1943': 245,
   '1947': 282,
   '1953': 233,
   '1957': 275,
   '1968': 112,
   '1969': 89,
   '1970': 74,
   '1976': 40,
   '1979': 40,
   '1985': 32,
   '1990': 34,
   '1997': 28,
   '1915': 237,
   '1917': 249,
   '1927': 261,
   '1938': 208,
   '1952': 219,
   '1961': 217,
   '1963': 160,
   '1977': 42,
   '1986': 35,
   '1988': 33,
   '1992': 34,
   '2002': 16,
   '2005': 11,
   '1934': 219,
   '1936': 223,
   '1940': 223,
   '1950': 264,
   '1954': 249,
   '1971': 69,
   '1973': 38,
   '1980': 47,
   '1987': 32,
   '1989': 24,
   '2001': 19,
   '1911': 83,
   '1914': 182,
   '1918': 262,
   '1920': 311,
   '1945': 226,
   '1967': 105,
   '1978': 37,
   '1983': 38,
   '1993': 26,
   '2010': 6,
   '2012': 7,
   '1913': 140,
   '1922': 296,
   '1924': 289,
   '1931': 256,
   '1939': 211,
   '1946': 263,
   '1994': 28,
   '1998': 20,
   '2000': 14,
   '1925': 286,
   '1928': 217,
   '1935': 230,
   '1948': 262,
   '1955': 256,
   '1964': 191,
   '1965': 148,
   '1982': 49,
   '2004': 12,
   '2014': 6,
   '1912': 131,
   '1926': 260,
   '1929': 220,
   '1930': 213,
   '1932': 248,
   '1933': 224,
   '1937': 236,
   '1944': 223,
   '1956': 239,
   '1958': 227,
   '1966': 133,
   '1972': 63,
   '1974': 46,
   '1975': 47,
   '1981': 41,
   '1984': 31,
   '1999': 15,
   '2003': 18,
   '2006': 9,
   '2007': 11,
   '2008': 8,
   '2013': 9,
   '1919': 281,
   '1921': 293,
   '1923': 257,
   '1949': 275,
   '1951': 225,
   '1959': 214,
   '1960': 210,
   '1962': 201,
   '1991': 26,
   '1995': 21,
   '1996': 16,
   '2009': 9}},
 {'_id': ObjectId('5eb51b899993d0175fb69f9c'),
  'Gender': 'F',
  'Name': 'Mary',
  'State': 'MI',
  'YearsCountDict': {'1915': 1447,
   '1917': 1774,
   '1938': 2118,
   '1962': 2106,
   '1972': 556,
   '1978': 345,
   '1990': 303,
   '1994': 217,
   '1996': 196,
   '2000': 151,
   '1933': 1935,
   '1935': 1847,
   '1944': 2492,
   '1953': 3056,
   '1955': 3143,
   '1965': 1584,
   '1970': 779,
   '1971': 659,
   '1983': 348,
   '1984': 321,
   '1986': 285,
   '1992': 249,
   '1922': 1772,
   '1929': 2208,
   '1936': 2066,
   '1943': 2734,
   '1948': 2933,
   '1949': 2982,
   '1950': 2936,
   '1954': 3435,
   '1957': 3197,
   '1959': 2772,
   '1960': 2518,
   '1964': 1983,
   '1976': 367,
   '1991': 265,
   '1993': 222,
   '1999': 191,
   '2014': 56,
   '1912': 628,
   '1926': 2114,
   '1932': 2002,
   '1941': 2289,
   '1942': 2455,
   '1946': 2811,
   '1951': 3088,
   '1967': 1127,
   '1973': 507,
   '1981': 406,
   '1997': 195,
   '2001': 171,
   '2003': 137,
   '2006': 94,
   '1923': 1962,
   '1927': 2156,
   '1939': 1999,
   '1966': 1384,
   '1968': 949,
   '1977': 389,
   '1979': 379,
   '1980': 431,
   '1995': 222,
   '2008': 87,
   '1911': 456,
   '1916': 1587,
   '1919': 1554,
   '1920': 1809,
   '1924': 1946,
   '1925': 1932,
   '1930': 2291,
   '1934': 2034,
   '1963': 2028,
   '1975': 407,
   '1985': 300,
   '1988': 284,
   '1989': 295,
   '2002': 130,
   '2005': 104,
   '2012': 60,
   '1910': 349,
   '1913': 744,
   '1918': 1787,
   '1921': 1958,
   '1928': 2250,
   '1947': 2961,
   '1952': 2963,
   '1956': 3152,
   '1958': 2744,
   '1969': 843,
   '1982': 369,
   '1987': 287,
   '2007': 105,
   '1914': 885,
   '1931': 2092,
   '1937': 1958,
   '1940': 2017,
   '1945': 2503,
   '1961': 2344,
   '1974': 414,
   '1998': 193,
   '2004': 145,
   '2009': 80,
   '2010': 70,
   '2011': 60,
   '2013': 74}},
 {'_id': ObjectId('5eb51b899993d0175fb6a2e9'),
  'Gender': 'F',
  'Name': 'Mary',
  'State': 'MN',
  'YearsCountDict': {'1912': 379,
   '1923': 893,
   '1928': 889,
   '1937': 962,
   '1950': 1812,
   '1972': 206,
   '1977': 155,
   '1978': 150,
   '1981': 173,
   '1989': 125,
   '1998': 85,
   '2009': 43,
   '1910': 216,
   '1914': 501,
   '1920': 825,
   '1925': 805,
   '1933': 895,
   '1952': 1755,
   '1953': 1680,
   '1954': 1920,
   '1962': 1242,
   '1975': 141,
   '1985': 144,
   '1995': 105,
   '1917': 756,
   '1921': 872,
   '1936': 1024,
   '1943': 1400,
   '1947': 1753,
   '1949': 1641,
   '1960': 1468,
   '1973': 183,
   '1986': 160,
   '2005': 65,
   '1911': 267,
   '1916': 810,
   '1919': 733,
   '1926': 942,
   '1930': 930,
   '1931': 944,
   '1944': 1316,
   '1946': 1692,
   '1948': 1768,
   '1955': 1720,
   '1961': 1327,
   '1965': 821,
   '1970': 342,
   '1971': 242,
   '1974': 182,
   '1991': 124,
   '1993': 141,
   '2000': 86,
   '2001': 113,
   '2003': 82,
   '2008': 56,
   '2010': 52,
   '1913': 430,
   '1918': 754,
   '1922': 850,
   '1924': 855,
   '1927': 913,
   '1934': 931,
   '1940': 1135,
   '1945': 1348,
   '1951': 1855,
   '1956': 1725,
   '1957': 1731,
   '1959': 1634,
   '1964': 1113,
   '1979': 175,
   '1980': 177,
   '1987': 116,
   '1994': 106,
   '1997': 107,
   '2007': 40,
   '2011': 35,
   '2013': 34,
   '1929': 942,
   '1938': 1009,
   '1963': 1141,
   '1967': 533,
   '1976': 157,
   '1982': 224,
   '1983': 140,
   '1988': 129,
   '1999': 92,
   '2012': 44,
   '1932': 980,
   '1939': 1052,
   '1941': 1148,
   '1969': 336,
   '1992': 138,
   '1996': 116,
   '2002': 82,
   '1915': 731,
   '1935': 981,
   '1942': 1293,
   '1958': 1595,
   '1966': 669,
   '1968': 443,
   '1984': 149,
   '1990': 134,
   '2004': 86,
   '2006': 73,
   '2014': 42}},
 {'_id': ObjectId('5eb51b899993d0175fb6b480'),
  'Gender': 'F',
  'Name': 'Mary',
  'State': 'CT',
  'YearsCountDict': {'1916': 1078,
   '1922': 836,
   '1923': 863,
   '1928': 643,
   '1947': 665,
   '1949': 586,
   '1950': 611,
   '1953': 665,
   '1955': 772,
   '1958': 702,
   '1961': 661,
   '1971': 198,
   '1981': 105,
   '1985': 92,
   '1986': 81,
   '1994': 100,
   '2001': 49,
   '2004': 50,
   '1924': 760,
   '1926': 751,
   '1931': 552,
   '1935': 428,
   '1940': 450,
   '1942': 581,
   '1954': 749,
   '1957': 762,
   '1964': 592,
   '1967': 336,
   '1969': 256,
   '1970': 236,
   '1992': 84,
   '2008': 27,
   '1920': 970,
   '1930': 625,
   '1956': 752,
   '1963': 590,
   '1966': 381,
   '1974': 89,
   '1982': 98,
   '1984': 73,
   '2003': 50,
   '2005': 44,
   '2006': 30,
   '2010': 29,
   '2012': 20,
   '2014': 22,
   '1913': 606,
   '1914': 851,
   '1921': 1010,
   '1933': 493,
   '1934': 459,
   '1939': 419,
   '1946': 643,
   '1965': 519,
   '1976': 103,
   '1999': 68,
   '2007': 38,
   '2011': 22,
   '1918': 1119,
   '1936': 422,
   '1937': 438,
   '1948': 619,
   '1959': 763,
   '1972': 136,
   '1979': 80,
   '1995': 80,
   '1996': 73,
   '1911': 382,
   '1938': 390,
   '1952': 632,
   '1960': 682,
   '1968': 277,
   '1973': 124,
   '1997': 65,
   '2002': 59,
   '2009': 27,
   '1915': 989,
   '1919': 969,
   '1925': 729,
   '1932': 493,
   '1941': 509,
   '1944': 520,
   '1945': 532,
   '1951': 623,
   '1987': 76,
   '1991': 109,
   '1993': 94,
   '1998': 68,
   '2000': 66,
   '2013': 24,
   '1910': 304,
   '1912': 471,
   '1917': 1138,
   '1927': 668,
   '1929': 624,
   '1943': 589,
   '1962': 576,
   '1975': 89,
   '1977': 83,
   '1978': 95,
   '1980': 120,
   '1983': 99,
   '1988': 94,
   '1989': 93,
   '1990': 96}},
 {'_id': ObjectId('5eb51b899993d0175fb7174c'),
  'Gender': 'F',
  'Name': 'Mary',
  'State': 'AL',
  'YearsCountDict': {'1913': 1125,
   '1919': 2223,
   '1925': 2694,
   '1933': 2106,
   '1936': 2017,
   '1952': 1720,
   '1957': 1223,
   '1960': 982,
   '1962': 821,
   '1967': 590,
   '1980': 321,
   '1981': 298,
   '1985': 274,
   '2001': 234,
   '2011': 138,
   '1912': 1041,
   '1916': 1722,
   '1923': 2420,
   '1939': 2073,
   '1955': 1354,
   '1968': 496,
   '1982': 314,
   '1987': 246,
   '1989': 257,
   '2004': 204,
   '1922': 2384,
   '1927': 2610,
   '1928': 2501,
   '1944': 2079,
   '1958': 1044,
   '1965': 715,
   '1991': 266,
   '1998': 243,
   '2002': 265,
   '2008': 183,
   '1917': 1825,
   '1924': 2596,
   '1930': 2397,
   '1943': 2308,
   '1959': 1048,
   '1963': 781,
   '1969': 465,
   '1971': 444,
   '1986': 240,
   '1990': 294,
   '2005': 221,
   '2007': 191,
   '2010': 147,
   '1910': 875,
   '1911': 804,
   '1920': 2357,
   '1921': 2318,
   '1947': 2117,
   '1951': 1704,
   '1954': 1447,
   '1964': 800,
   '1966': 645,
   '1973': 320,
   '1979': 253,
   '1997': 264,
   '2009': 144,
   '1914': 1429,
   '1932': 2339,
   '1935': 2214,
   '1937': 2154,
   '1938': 2193,
   '1942': 2351,
   '1945': 2081,
   '1956': 1247,
   '1972': 361,
   '1975': 254,
   '1976': 290,
   '1977': 268,
   '1983': 283,
   '2012': 123,
   '2013': 126,
   '1926': 2535,
   '1929': 2339,
   '1934': 2287,
   '1941': 2099,
   '1946': 2107,
   '1948': 2074,
   '1949': 1976,
   '1961': 930,
   '1970': 455,
   '1978': 291,
   '1984': 266,
   '1988': 275,
   '1992': 259,
   '1994': 284,
   '1995': 282,
   '1999': 269,
   '2000': 261,
   '2003': 204,
   '2006': 150,
   '2014': 112,
   '1915': 1552,
   '1918': 1914,
   '1931': 2274,
   '1940': 2074,
   '1950': 1839,
   '1953': 1503,
   '1974': 339,
   '1993': 243,
   '1996': 276}},
 {'_id': ObjectId('5eb51b899993d0175fb72926'),
  'Gender': 'F',
  'Name': 'Mary',
  'State': 'TN',
  'YearsCountDict': {'1938': 1827,
   '1942': 1865,
   '1957': 1104,
   '1965': 636,
   '1967': 559,
   '1968': 504,
   '1969': 525,
   '1970': 520,
   '1995': 270,
   '2005': 193,
   '1916': 1723,
   '1923': 2371,
   '1927': 2329,
   '1930': 2102,
   '1932': 2023,
   '1936': 1792,
   '1943': 1927,
   '1953': 1463,
   '1963': 767,
   '1973': 408,
   '1975': 342,
   '1985': 320,
   '1990': 281,
   '1993': 325,
   '2002': 231,
   '2004': 178,
   '1913': 1029,
   '1917': 1744,
   '1948': 1778,
   '1956': 1207,
   '1959': 973,
   '1972': 413,
   '1974': 349,
   '1994': 302,
   '1996': 290,
   '2000': 258,
   '2001': 248,
   '2003': 215,
   '2010': 127,
   '2011': 103,
   '2014': 101,
   '1921': 2254,
   '1926': 2151,
   '1928': 2115,
   '1941': 1926,
   '1946': 1876,
   '1947': 1928,
   '1954': 1321,
   '1960': 907,
   '1971': 469,
   '1977': 327,
   '1986': 287,
   '1998': 257,
   '1911': 721,
   '1922': 2245,
   '1924': 2474,
   '1925': 2442,
   '1934': 1957,
   '1937': 1742,
   '1939': 1734,
   '1958': 1037,
   '1962': 797,
   '1964': 755,
   '1976': 337,
   '1979': 350,
   '1983': 344,
   '1987': 320,
   '1988': 313,
   '1989': 302,
   '1992': 308,
   '1999': 229,
   '2006': 162,
   '1914': 1158,
   '1920': 2076,
   '1944': 1810,
   '1945': 1646,
   '1949': 1670,
   '1950': 1555,
   '1952': 1488,
   '1961': 874,
   '1978': 352,
   '1991': 297,
   '2013': 102,
   '1910': 735,
   '1915': 1515,
   '1929': 2028,
   '1931': 2042,
   '1933': 1844,
   '1935': 1952,
   '1951': 1547,
   '1955': 1284,
   '1966': 563,
   '1981': 365,
   '2007': 173,
   '2008': 169,
   '2012': 111,
   '1912': 869,
   '1918': 1909,
   '1919': 2001,
   '1940': 1780,
   '1980': 363,
   '1982': 364,
   '1984': 314,
   '1997': 268,
   '2009': 165}},
 {'_id': ObjectId('5eb51b899993d0175fb74f71'),
  'Gender': 'F',
  'Name': 'Mary',
  'State': 'MS',
  'YearsCountDict': {'1913': 899,
   '1959': 1010,
   '1977': 232,
   '1985': 171,
   '2000': 153,
   '2004': 114,
   '2013': 78,
   '1910': 762,
   '1914': 970,
   '1922': 1719,
   '1943': 1715,
   '1947': 1742,
   '1950': 1639,
   '1952': 1536,
   '1955': 1312,
   '1961': 833,
   '1962': 817,
   '1964': 721,
   '1967': 512,
   '1970': 406,
   '1973': 284,
   '1980': 231,
   '2001': 148,
   '2002': 124,
   '2007': 83,
   '1912': 806,
   '1932': 1568,
   '1933': 1497,
   '1948': 1714,
   '1956': 1225,
   '1960': 941,
   '1998': 141,
   '2003': 113,
   '1917': 1215,
   '1919': 1520,
   '1923': 1617,
   '1931': 1463,
   '1939': 1565,
   '1942': 1652,
   '1989': 145,
   '1990': 161,
   '1999': 144,
   '2010': 74,
   '2012': 89,
   '1916': 1263,
   '1918': 1348,
   '1928': 1680,
   '1929': 1624,
   '1945': 1500,
   '1951': 1587,
   '1954': 1377,
   '1958': 1002,
   '1971': 353,
   '1975': 237,
   '1978': 192,
   '1988': 179,
   '1995': 194,
   '1996': 145,
   '2008': 104,
   '1915': 1141,
   '1920': 1556,
   '1924': 1734,
   '1927': 1840,
   '1940': 1559,
   '1944': 1545,
   '1965': 617,
   '1974': 254,
   '1976': 213,
   '1987': 160,
   '1992': 153,
   '1994': 159,
   '2006': 88,
   '1911': 606,
   '1921': 1526,
   '1930': 1662,
   '1935': 1504,
   '1937': 1498,
   '1946': 1621,
   '1949': 1765,
   '1953': 1496,
   '1963': 764,
   '1972': 314,
   '1979': 222,
   '1982': 261,
   '1986': 160,
   '1993': 147,
   '2005': 121,
   '2009': 90,
   '2014': 82,
   '1925': 1778,
   '1926': 1713,
   '1934': 1491,
   '1936': 1549,
   '1938': 1538,
   '1941': 1628,
   '1957': 1164,
   '1966': 565,
   '1968': 424,
   '1969': 374,
   '1981': 207,
   '1983': 178,
   '1984': 193,
   '1991': 162,
   '1997': 145,
   '2011': 96}},
 {'_id': ObjectId('5eb51b899993d0175fb769d2'),
  'Gender': 'F',
  'Name': 'Mary',
  'State': 'AK',
  'YearsCountDict': {'1930': 35,
   '1931': 41,
   '1932': 26,
   '1933': 25,
   '1934': 31,
   '1982': 29,
   '1984': 31,
   '2011': 15,
   '1954': 93,
   '1976': 23,
   '1978': 29,
   '1981': 32,
   '1985': 42,
   '1992': 21,
   '1994': 21,
   '1995': 31,
   '2000': 18,
   '2005': 18,
   '1951': 77,
   '1974': 35,
   '2006': 9,
   '2007': 11,
   '1910': 14,
   '1911': 12,
   '1912': 9,
   '1913': 21,
   '1914': 22,
   '1915': 23,
   '1916': 18,
   '1917': 21,
   '1918': 27,
   '1919': 22,
   '1952': 75,
   '1962': 86,
   '1973': 29,
   '1979': 25,
   '1991': 31,
   '1996': 14,
   '2002': 16,
   '1939': 28,
   '1940': 43,
   '1941': 41,
   '1947': 63,
   '1948': 71,
   '1957': 99,
   '1959': 61,
   '1960': 78,
   '1961': 80,
   '1971': 46,
   '1977': 26,
   '1987': 32,
   '1997': 25,
   '2008': 12,
   '1942': 43,
   '1943': 47,
   '1944': 42,
   '1956': 96,
   '1963': 78,
   '1966': 67,
   '1967': 59,
   '1968': 36,
   '1969': 51,
   '1970': 47,
   '1980': 38,
   '1983': 36,
   '1989': 29,
   '1993': 26,
   '1999': 19,
   '2014': 6,
   '1925': 24,
   '1926': 39,
   '1927': 30,
   '1928': 27,
   '1929': 25,
   '1935': 29,
   '1936': 33,
   '1937': 41,
   '1938': 37,
   '1945': 47,
   '1946': 47,
   '1949': 79,
   '1950': 71,
   '1953': 80,
   '1975': 29,
   '1988': 28,
   '2003': 9,
   '2004': 11,
   '1920': 38,
   '1921': 36,
   '1922': 29,
   '1923': 26,
   '1924': 41,
   '1955': 91,
   '1958': 84,
   '1964': 84,
   '1965': 68,
   '1972': 31,
   '1986': 27,
   '1990': 27,
   '1998': 19,
   '2001': 16,
   '2009': 6,
   '2010': 15,
   '2012': 15,
   '2013': 14}},
 {'_id': ObjectId('5eb51b899993d0175fb76eab'),
  'Gender': 'F',
  'Name': 'Mary',
  'State': 'PA',
  'YearsCountDict': {'1921': 7771,
   '1932': 4536,
   '1934': 4088,
   '1935': 3929,
   '1937': 4037,
   '1938': 4059,
   '1940': 3942,
   '1941': 4085,
   '1953': 4025,
   '1956': 3872,
   '1970': 1051,
   '1972': 739,
   '1982': 497,
   '2002': 264,
   '1911': 3188,
   '1944': 3952,
   '1963': 2577,
   '1964': 2531,
   '1969': 1145,
   '1977': 526,
   '1985': 398,
   '1995': 367,
   '1997': 303,
   '2012': 130,
   '1926': 6004,
   '1939': 3878,
   '1946': 4455,
   '1951': 4208,
   '1958': 3697,
   '1959': 3578,
   '1967': 1418,
   '1987': 365,
   '1991': 415,
   '1992': 430,
   '1910': 2913,
   '1912': 4106,
   '1923': 7034,
   '1924': 7200,
   '1925': 6565,
   '1930': 5186,
   '1949': 4350,
   '1974': 593,
   '1980': 531,
   '1998': 286,
   '2007': 191,
   '1914': 5981,
   '1920': 7651,
   '1931': 4916,
   '1936': 3932,
   '1943': 4471,
   '1950': 4074,
   '1965': 2049,
   '1966': 1695,
   '1973': 611,
   '1976': 476,
   '1981': 502,
   '1984': 430,
   '1999': 319,
   '2001': 299,
   '2008': 165,
   '2014': 124,
   '1915': 7970,
   '1922': 7303,
   '1928': 5739,
   '1955': 4071,
   '1961': 3035,
   '1968': 1275,
   '1983': 455,
   '1993': 437,
   '2004': 223,
   '2005': 199,
   '2006': 183,
   '2010': 158,
   '1913': 4738,
   '1927': 6228,
   '1942': 4474,
   '1947': 4845,
   '1952': 4171,
   '1960': 3295,
   '1962': 2635,
   '1978': 468,
   '1979': 500,
   '1986': 388,
   '2003': 225,
   '2011': 128,
   '1916': 7730,
   '1917': 7987,
   '1918': 8184,
   '1919': 7428,
   '1929': 5303,
   '1933': 4122,
   '1945': 3658,
   '1948': 4440,
   '1954': 4394,
   '1957': 3968,
   '1971': 807,
   '1975': 580,
   '1988': 400,
   '1989': 447,
   '1990': 414,
   '1994': 430,
   '1996': 358,
   '2000': 310,
   '2009': 160,
   '2013': 123}},
 {'_id': ObjectId('5eb51b899993d0175fb7848e'),
  'Gender': 'F',
  'Name': 'Mary',
  'State': 'WA',
  'YearsCountDict': {'1914': 281,
   '1915': 368,
   '1918': 479,
   '1930': 401,
   '1938': 382,
   '1944': 684,
   '1955': 735,
   '1967': 326,
   '1988': 112,
   '1923': 475,
   '1931': 377,
   '1952': 828,
   '1954': 791,
   '1957': 684,
   '1969': 231,
   '1970': 228,
   '1972': 130,
   '1977': 133,
   '1990': 129,
   '1999': 78,
   '2013': 39,
   '2014': 32,
   '1927': 463,
   '1929': 432,
   '1935': 375,
   '1950': 818,
   '1958': 621,
   '1961': 495,
   '1965': 353,
   '1980': 193,
   '1985': 123,
   '1992': 114,
   '1994': 108,
   '2001': 90,
   '2005': 62,
   '2007': 60,
   '2011': 38,
   '1913': 235,
   '1940': 366,
   '1951': 736,
   '1960': 548,
   '1964': 443,
   '1973': 166,
   '1986': 122,
   '1996': 108,
   '1910': 112,
   '1912': 207,
   '1925': 436,
   '1928': 400,
   '1937': 378,
   '1956': 702,
   '1968': 226,
   '1971': 177,
   '1974': 122,
   '1975': 134,
   '1978': 123,
   '1995': 103,
   '2012': 35,
   '1917': 408,
   '1919': 419,
   '1922': 531,
   '1926': 434,
   '1934': 361,
   '1941': 423,
   '1945': 655,
   '1953': 757,
   '1959': 683,
   '1979': 161,
   '1987': 134,
   '2002': 74,
   '2003': 80,
   '2004': 71,
   '2009': 47,
   '1920': 544,
   '1932': 398,
   '1942': 571,
   '1943': 600,
   '1946': 753,
   '1949': 796,
   '1962': 514,
   '1976': 119,
   '1981': 179,
   '1982': 149,
   '1984': 145,
   '1989': 140,
   '1993': 83,
   '1997': 93,
   '2006': 61,
   '2010': 40,
   '1911': 131,
   '1916': 364,
   '1921': 531,
   '1924': 464,
   '1933': 388,
   '1936': 351,
   '1939': 349,
   '1947': 790,
   '1948': 805,
   '1963': 465,
   '1966': 325,
   '1983': 147,
   '1991': 117,
   '1998': 85,
   '2000': 73,
   '2008': 50}},
 {'_id': ObjectId('5eb51b899993d0175fb7aaf5'),
  'Gender': 'F',
  'Name': 'Mary',
  'State': 'NC',
  'YearsCountDict': {'1914': 1581,
   '1921': 2811,
   '1945': 2224,
   '1948': 2402,
   '1954': 1700,
   '1957': 1505,
   '1959': 1256,
   '1978': 358,
   '1979': 381,
   '1997': 292,
   '2008': 161,
   '1917': 2129,
   '1918': 2344,
   '1925': 2915,
   '1931': 2441,
   '1932': 2674,
   '1940': 2281,
   '1962': 1084,
   '1964': 999,
   '1965': 893,
   '1966': 764,
   '1967': 705,
   '1980': 423,
   '1984': 337,
   '1986': 333,
   '2010': 141,
   '1913': 1180,
   '1916': 2056,
   '1919': 2577,
   '1920': 2675,
   '1924': 3126,
   '1929': 2665,
   '1933': 2387,
   '1934': 2442,
   '1937': 2305,
   '1947': 2571,
   '1974': 415,
   '1989': 355,
   '1992': 355,
   '2001': 265,
   '2006': 214,
   '2013': 115,
   '1912': 1086,
   '1915': 1946,
   '1936': 2254,
   '1951': 2115,
   '1953': 1779,
   '1968': 650,
   '1969': 635,
   '1976': 369,
   '1987': 369,
   '1991': 386,
   '1994': 339,
   '2011': 130,
   '1910': 837,
   '1911': 838,
   '1922': 2852,
   '1927': 2967,
   '1949': 2236,
   '1955': 1681,
   '1963': 1062,
   '2000': 280,
   '2005': 221,
   '1935': 2410,
   '1946': 2413,
   '1971': 555,
   '1973': 434,
   '1977': 384,
   '1982': 377,
   '1983': 357,
   '1998': 285,
   '2002': 228,
   '2007': 181,
   '2012': 110,
   '1928': 2898,
   '1942': 2413,
   '1943': 2447,
   '1950': 2032,
   '1952': 1960,
   '1956': 1519,
   '1958': 1353,
   '1970': 647,
   '1988': 336,
   '2004': 262,
   '1923': 2891,
   '1926': 2912,
   '1930': 2483,
   '1938': 2347,
   '1939': 2287,
   '1941': 2250,
   '1944': 2312,
   '1960': 1165,
   '1961': 1176,
   '1972': 471,
   '1975': 383,
   '1981': 417,
   '1985': 369,
   '1990': 358,
   '1993': 346,
   '1995': 285,
   '1996': 291,
   '1999': 312,
   '2003': 244,
   '2009': 145,
   '2014': 114}},
 {'_id': ObjectId('5eb51b899993d0175fb7d408'),
  'Gender': 'F',
  'Name': 'Mary',
  'State': 'VT',
  'YearsCountDict': {'1918': 139,
   '1921': 154,
   '1935': 99,
   '1939': 106,
   '1949': 173,
   '1968': 42,
   '1973': 18,
   '1988': 14,
   '2008': 8,
   '1920': 132,
   '1926': 114,
   '1929': 141,
   '1936': 112,
   '1953': 173,
   '1954': 187,
   '1957': 156,
   '1969': 39,
   '1974': 20,
   '1975': 20,
   '1978': 20,
   '1980': 32,
   '1986': 19,
   '1999': 13,
   '1910': 45,
   '1911': 62,
   '1912': 95,
   '1914': 106,
   '1922': 129,
   '1927': 114,
   '1930': 126,
   '1937': 104,
   '1940': 123,
   '1947': 183,
   '1951': 147,
   '1959': 126,
   '1960': 101,
   '1961': 98,
   '1962': 81,
   '1964': 96,
   '1976': 23,
   '1977': 22,
   '1982': 13,
   '2002': 11,
   '1915': 120,
   '1916': 144,
   '1928': 123,
   '1952': 155,
   '1965': 73,
   '1967': 56,
   '1970': 35,
   '1983': 18,
   '1990': 22,
   '1991': 13,
   '1917': 126,
   '1919': 121,
   '1931': 114,
   '1945': 120,
   '1946': 132,
   '1950': 155,
   '1956': 123,
   '1963': 84,
   '1971': 44,
   '1972': 31,
   '1987': 9,
   '1995': 9,
   '2000': 13,
   '2001': 10,
   '1913': 80,
   '1923': 119,
   '1932': 127,
   '1943': 121,
   '1944': 118,
   '1948': 146,
   '1955': 144,
   '1966': 61,
   '1989': 13,
   '1992': 17,
   '1997': 6,
   '1998': 10,
   '2003': 15,
   '1925': 141,
   '1933': 117,
   '1938': 105,
   '1942': 128,
   '1958': 117,
   '1981': 23,
   '1984': 18,
   '1985': 20,
   '1996': 15,
   '1924': 124,
   '1934': 114,
   '1941': 111,
   '1979': 27,
   '1993': 11,
   '1994': 14,
   '2004': 5}},
 {'_id': ObjectId('5eb51b8a9993d0175fb828f7'),
  'Gender': 'F',
  'Name': 'Mary',
  'State': 'OR',
  'YearsCountDict': {'1910': 54,
   '1931': 217,
   '1950': 429,
   '1951': 450,
   '1954': 459,
   '1957': 383,
   '1961': 305,
   '1962': 233,
   '1969': 132,
   '1972': 98,
   '1984': 87,
   '1992': 63,
   '1997': 62,
   '1999': 66,
   '2004': 41,
   '2006': 44,
   '1912': 129,
   '1919': 256,
   '1928': 252,
   '1932': 198,
   '1933': 188,
   '1978': 97,
   '1985': 82,
   '1987': 63,
   '1998': 54,
   '1918': 230,
   '1921': 326,
   '1936': 210,
   '1939': 235,
   '1944': 368,
   '1947': 504,
   '1949': 435,
   '1952': 439,
   '1955': 417,
   '1971': 112,
   '1974': 83,
   '1976': 94,
   '1981': 109,
   '1993': 74,
   '2001': 39,
   '2014': 20,
   '1914': 149,
   '1915': 248,
   '1922': 271,
   '1935': 190,
   '1938': 253,
   '1942': 351,
   '1946': 421,
   '1948': 442,
   '1956': 373,
   '1966': 160,
   '1979': 99,
   '2003': 44,
   '2010': 22,
   '1911': 73,
   '1913': 136,
   '1923': 321,
   '1925': 274,
   '1926': 238,
   '1958': 362,
   '1968': 144,
   '1975': 94,
   '1983': 80,
   '1988': 68,
   '1996': 60,
   '1916': 198,
   '1917': 249,
   '1927': 259,
   '1943': 395,
   '1960': 323,
   '1967': 165,
   '1973': 90,
   '1980': 118,
   '1989': 69,
   '1991': 80,
   '2009': 32,
   '2013': 20,
   '1920': 271,
   '1924': 291,
   '1929': 244,
   '1930': 268,
   '1937': 238,
   '1953': 486,
   '1959': 295,
   '1963': 222,
   '1965': 229,
   '1970': 122,
   '1982': 121,
   '1990': 85,
   '1995': 49,
   '2000': 60,
   '2005': 27,
   '2007': 31,
   '2011': 20,
   '1934': 206,
   '1940': 250,
   '1941': 255,
   '1945': 371,
   '1964': 252,
   '1977': 84,
   '1986': 68,
   '1994': 56,
   '2002': 39,
   '2008': 40,
   '2012': 14}},
 {'_id': ObjectId('5eb51b8a9993d0175fb85300'),
  'Gender': 'F',
  'Name': 'Mary',
  'State': 'SC',
  'YearsCountDict': {'1911': 508,
   '1912': 711,
   '1918': 1412,
   '1923': 1646,
   '1939': 1483,
   '1946': 1570,
   '1954': 1300,
   '1977': 237,
   '1982': 225,
   '1986': 162,
   '1999': 162,
   '2001': 174,
   '2002': 159,
   '1920': 1615,
   '1921': 1687,
   '1924': 1742,
   '1976': 240,
   '1979': 243,
   '1994': 168,
   '1996': 197,
   '1914': 908,
   '1941': 1626,
   '1949': 1595,
   '1953': 1357,
   '1957': 995,
   '1978': 228,
   '1990': 188,
   '1991': 207,
   '1992': 209,
   '2009': 95,
   '1925': 1670,
   '1929': 1419,
   '1934': 1528,
   '1935': 1463,
   '1947': 1710,
   '1948': 1639,
   '1951': 1458,
   '1952': 1402,
   '1981': 225,
   '1987': 178,
   '1995': 188,
   '1997': 171,
   '2005': 141,
   '2006': 123,
   '2008': 98,
   '2014': 92,
   '1915': 1098,
   '1919': 1616,
   '1922': 1634,
   '1926': 1628,
   '1933': 1465,
   '1937': 1548,
   '1938': 1478,
   '1944': 1603,
   '1956': 1097,
   '1959': 877,
   '1966': 469,
   '1967': 475,
   '2003': 142,
   '2007': 103,
   '1916': 1243,
   '1936': 1435,
   '1942': 1635,
   '1960': 805,
   '1963': 644,
   '1969': 365,
   '1972': 277,
   '1985': 214,
   '2000': 167,
   '2013': 93,
   '1928': 1539,
   '1930': 1450,
   '1932': 1565,
   '1940': 1570,
   '1958': 897,
   '1962': 715,
   '1964': 617,
   '1968': 422,
   '1971': 338,
   '1973': 273,
   '1974': 245,
   '1975': 233,
   '1988': 189,
   '1989': 196,
   '1910': 602,
   '1913': 725,
   '1917': 1263,
   '1927': 1638,
   '1931': 1365,
   '1943': 1699,
   '1945': 1548,
   '1950': 1455,
   '1955': 1144,
   '1961': 750,
   '1965': 531,
   '1970': 364,
   '1980': 235,
   '1983': 203,
   '1984': 202,
   '1993': 179,
   '1998': 175,
   '2004': 124,
   '2010': 72,
   '2011': 97,
   '2012': 79}},
 {'_id': ObjectId('5eb51b8a9993d0175fb8564f'),
  'Gender': 'F',
  'Name': 'Mary',
  'State': 'IA',
  'YearsCountDict': {'1987': 58,
   '2009': 27,
   '2012': 21,
   '1957': 966,
   '1962': 729,
   '2001': 48,
   '2002': 43,
   '2008': 31,
   '2013': 27,
   '2014': 16,
   '1936': 1072,
   '1945': 1079,
   '1953': 1216,
   '1968': 309,
   '1996': 59,
   '2000': 54,
   '2006': 36,
   '1911': 279,
   '1913': 480,
   '1918': 1070,
   '1927': 1170,
   '1929': 1025,
   '1933': 1059,
   '1937': 1100,
   '1943': 1242,
   '1967': 344,
   '1969': 270,
   '1991': 80,
   '1998': 53,
   '1999': 52,
   '2005': 34,
   '2010': 26,
   '1910': 239,
   '1914': 624,
   '1915': 867,
   '1920': 1213,
   '1921': 1215,
   '1926': 1051,
   '1934': 1091,
   '1935': 1025,
   '1940': 1098,
   '1944': 1173,
   '1952': 1220,
   '1956': 1082,
   '1960': 789,
   '1977': 112,
   '1980': 134,
   '1982': 117,
   '1984': 94,
   '1990': 62,
   '2004': 40,
   '1928': 1100,
   '1939': 1028,
   '1941': 1091,
   '1949': 1290,
   '1954': 1275,
   '1955': 1153,
   '1963': 685,
   '1979': 112,
   '1995': 65,
   '1997': 54,
   '2003': 31,
   '2011': 21,
   '1922': 1144,
   '1923': 1204,
   '1950': 1312,
   '1958': 831,
   '1961': 780,
   '1965': 495,
   '1973': 121,
   '1974': 119,
   '1976': 116,
   '1983': 110,
   '1989': 64,
   '1992': 64,
   '2007': 22,
   '1917': 1077,
   '1919': 1076,
   '1931': 1118,
   '1932': 1077,
   '1938': 1087,
   '1942': 1192,
   '1947': 1397,
   '1948': 1354,
   '1951': 1362,
   '1964': 615,
   '1966': 449,
   '1972': 156,
   '1975': 129,
   '1981': 118,
   '1986': 72,
   '1988': 73,
   '1993': 73,
   '1994': 53,
   '1912': 403,
   '1916': 982,
   '1924': 1158,
   '1925': 1123,
   '1930': 1140,
   '1946': 1296,
   '1959': 832,
   '1970': 246,
   '1971': 217,
   '1978': 133,
   '1985': 91}},
 {'_id': ObjectId('5eb51b8a9993d0175fb85e6a'),
  'Gender': 'F',
  'Name': 'Mary',
  'State': 'OK',
  'YearsCountDict': {'1911': 348,
   '1925': 1452,
   '1941': 1056,
   '2000': 78,
   '2002': 57,
   '2004': 55,
   '1910': 326,
   '1921': 1338,
   '1922': 1358,
   '1950': 814,
   '1955': 607,
   '1956': 546,
   '1957': 541,
   '1963': 358,
   '1977': 155,
   '1982': 168,
   '1990': 97,
   '1994': 83,
   '2001': 63,
   '2003': 51,
   '2006': 42,
   '2014': 32,
   '1920': 1313,
   '1923': 1446,
   '1933': 1254,
   '1937': 1094,
   '1939': 1082,
   '1945': 851,
   '1948': 819,
   '1949': 809,
   '1951': 746,
   '1953': 660,
   '1962': 364,
   '1965': 298,
   '1967': 227,
   '1972': 171,
   '1978': 126,
   '1980': 167,
   '1986': 127,
   '1989': 94,
   '2005': 51,
   '2012': 22,
   '1917': 931,
   '1919': 1051,
   '1931': 1237,
   '1932': 1224,
   '1934': 1223,
   '1935': 1154,
   '1943': 1047,
   '1947': 914,
   '1966': 254,
   '1968': 210,
   '1976': 126,
   '1988': 94,
   '1913': 525,
   '1918': 1036,
   '1926': 1410,
   '1928': 1341,
   '1930': 1318,
   '1946': 877,
   '1954': 669,
   '1961': 413,
   '1981': 166,
   '1983': 169,
   '1992': 102,
   '1915': 763,
   '1916': 815,
   '1940': 1042,
   '1942': 1067,
   '1944': 997,
   '1952': 710,
   '1959': 460,
   '1969': 204,
   '1971': 192,
   '1975': 126,
   '1984': 144,
   '1987': 97,
   '1996': 70,
   '1998': 70,
   '2007': 39,
   '2009': 39,
   '2010': 29,
   '2011': 29,
   '1912': 431,
   '1914': 572,
   '1924': 1461,
   '1929': 1350,
   '1960': 470,
   '1974': 138,
   '1979': 162,
   '1985': 136,
   '1991': 92,
   '1993': 91,
   '1995': 82,
   '1997': 65,
   '2008': 46,
   '2013': 29,
   '1927': 1502,
   '1936': 1097,
   '1938': 1115,
   '1958': 468,
   '1964': 347,
   '1970': 193,
   '1973': 129,
   '1999': 58}},
 {'_id': ObjectId('5eb51b8a9993d0175fb85f05'),
  'Gender': 'F',
  'Name': 'Mary',
  'State': 'FL',
  'YearsCountDict': {'1910': 239,
   '1911': 223,
   '1922': 706,
   '1929': 823,
   '1932': 837,
   '1935': 781,
   '1942': 1066,
   '1947': 1237,
   '1955': 1256,
   '1962': 1003,
   '1977': 348,
   '1979': 341,
   '1994': 327,
   '1995': 291,
   '1999': 234,
   '2001': 219,
   '1934': 788,
   '1954': 1244,
   '1965': 849,
   '1971': 546,
   '1973': 445,
   '2002': 195,
   '2006': 146,
   '2011': 75,
   '1920': 626,
   '1923': 794,
   '1931': 782,
   '1933': 679,
   '1938': 803,
   '1948': 1224,
   '1953': 1222,
   '1956': 1348,
   '1963': 966,
   '1964': 1052,
   '1968': 640,
   '1970': 577,
   '1984': 321,
   '1988': 340,
   '1992': 328,
   '2007': 153,
   '2013': 84,
   '1913': 326,
   '1914': 427,
   '1918': 585,
   '1930': 801,
   '1950': 1157,
   '1958': 1256,
   '1967': 704,
   '1978': 363,
   '1982': 418,
   '1985': 356,
   '1993': 322,
   '1998': 232,
   '2003': 175,
   '2009': 93,
   '2010': 100,
   '2012': 80,
   '2014': 83,
   '1912': 290,
   '1921': 717,
   '1927': 1042,
   '1936': 783,
   '1943': 1215,
   '1949': 1107,
   '1951': 1222,
   '1959': 1195,
   '1961': 1133,
   '1972': 455,
   '1976': 386,
   '1987': 349,
   '2000': 223,
   '2005': 166,
   '2008': 114,
   '1916': 537,
   '1917': 581,
   '1919': 620,
   '1926': 1068,
   '1928': 916,
   '1937': 816,
   '1946': 1123,
   '1952': 1229,
   '1969': 549,
   '1981': 422,
   '1991': 371,
   '1997': 221,
   '1915': 482,
   '1924': 846,
   '1940': 902,
   '1941': 907,
   '1944': 1175,
   '1960': 1201,
   '1966': 779,
   '1974': 417,
   '1975': 379,
   '1980': 407,
   '1989': 361,
   '1990': 357,
   '2004': 200,
   '1925': 972,
   '1939': 888,
   '1945': 1216,
   '1957': 1239,
   '1983': 360,
   '1986': 326,
   '1996': 253}},
 {'_id': ObjectId('5eb51b8a9993d0175fb861bc'),
  'Gender': 'F',
  'Name': 'Mary',
  'State': 'IL',
  'YearsCountDict': {'1910': 1076,
   '1918': 3381,
   '1919': 3396,
   '1921': 3674,
   '1926': 3336,
   '1931': 3118,
   '1933': 2753,
   '1935': 2570,
   '1936': 2680,
   '1946': 3492,
   '1963': 2658,
   '1969': 1169,
   '1975': 577,
   '1978': 491,
   '1980': 621,
   '1991': 439,
   '1999': 278,
   '2006': 172,
   '2007': 157,
   '1916': 3287,
   '1942': 3205,
   '1949': 3728,
   '1951': 3778,
   '1959': 3700,
   '1982': 519,
   '1983': 523,
   '1987': 457,
   '2000': 293,
   '2001': 246,
   '2005': 196,
   '1914': 2332,
   '1924': 3553,
   '1927': 3575,
   '1955': 4100,
   '1966': 1706,
   '1970': 1046,
   '1971': 884,
   '1974': 629,
   '1985': 448,
   '1988': 392,
   '1989': 415,
   '1998': 292,
   '2002': 261,
   '1923': 3607,
   '1950': 3810,
   '1952': 3871,
   '1954': 4341,
   '1960': 3571,
   '1964': 2556,
   '1965': 2102,
   '1967': 1439,
   '1973': 637,
   '1976': 504,
   '1981': 598,
   '1984': 426,
   '1995': 356,
   '1996': 314,
   '2008': 127,
   '2010': 96,
   '1911': 1207,
   '1930': 3444,
   '1945': 2931,
   '1962': 2780,
   '1968': 1319,
   '1972': 692,
   '1992': 395,
   '2003': 238,
   '1912': 1594,
   '1928': 3415,
   '1943': 3352,
   '1958': 3659,
   '1961': 3157,
   '1979': 536,
   '1986': 394,
   '1994': 347,
   '2004': 205,
   '2009': 106,
   '2013': 105,
   '2014': 85,
   '1922': 3519,
   '1929': 3274,
   '1938': 2733,
   '1940': 2770,
   '1944': 3109,
   '1948': 3646,
   '1956': 4008,
   '1977': 546,
   '1990': 393,
   '1993': 420,
   '1913': 1956,
   '1915': 3043,
   '1917': 3474,
   '1920': 3472,
   '1925': 3498,
   '1932': 2903,
   '1934': 2670,
   '1937': 2772,
   '1939': 2634,
   '1941': 2797,
   '1947': 3938,
   '1953': 3806,
   '1957': 3993,
   '1997': 320,
   '2011': 92,
   '2012': 84}},
 {'_id': ObjectId('5eb51b8a9993d0175fb87b99'),
  'Gender': 'F',
  'Name': 'Mary',
  'State': 'MD',
  'YearsCountDict': {'1912': 562,
   '1915': 963,
   '1923': 1226,
   '1926': 1146,
   '1942': 1064,
   '1950': 1028,
   '1961': 849,
   '1963': 827,
   '1968': 441,
   '1972': 267,
   '1979': 169,
   '1928': 1132,
   '1938': 890,
   '1953': 1125,
   '1965': 713,
   '1966': 540,
   '1980': 193,
   '1986': 180,
   '1990': 215,
   '1996': 146,
   '2013': 41,
   '1919': 1069,
   '1922': 1189,
   '1927': 1200,
   '1937': 877,
   '1946': 1184,
   '1958': 1015,
   '1960': 920,
   '1974': 218,
   '1982': 199,
   '1985': 166,
   '1993': 149,
   '2002': 92,
   '1913': 617,
   '1930': 1043,
   '1931': 931,
   '1932': 995,
   '1943': 1191,
   '1949': 1023,
   '1951': 1063,
   '1952': 1087,
   '1988': 167,
   '1997': 149,
   '2004': 85,
   '2010': 54,
   '2012': 55,
   '1914': 718,
   '1916': 961,
   '1929': 1050,
   '1933': 903,
   '1941': 988,
   '1945': 965,
   '1956': 1108,
   '1962': 815,
   '1970': 371,
   '1975': 166,
   '1976': 181,
   '1978': 151,
   '1983': 168,
   '1984': 159,
   '1989': 168,
   '1994': 185,
   '2006': 76,
   '2008': 82,
   '1918': 1076,
   '1921': 1302,
   '1924': 1248,
   '1940': 906,
   '1955': 1103,
   '1957': 1013,
   '1959': 1001,
   '1969': 385,
   '1973': 223,
   '1992': 190,
   '1995': 149,
   '2000': 141,
   '2001': 123,
   '2003': 96,
   '2011': 52,
   '2014': 49,
   '1910': 393,
   '1911': 425,
   '1917': 1031,
   '1934': 886,
   '1939': 846,
   '1964': 729,
   '1967': 528,
   '1971': 317,
   '1977': 177,
   '1981': 191,
   '1987': 144,
   '1991': 202,
   '1998': 149,
   '1920': 1208,
   '1925': 1181,
   '1935': 882,
   '1936': 875,
   '1944': 1083,
   '1947': 1255,
   '1948': 1042,
   '1954': 1207,
   '1999': 97,
   '2005': 90,
   '2007': 61,
   '2009': 62}},
 {'_id': ObjectId('5eb51b8a9993d0175fb8a0e4'),
  'Gender': 'F',
  'Name': 'Mary',
  'State': 'RI',
  'YearsCountDict': {'1916': 414,
   '1919': 410,
   '1923': 426,
   '1924': 414,
   '1929': 269,
   '1931': 216,
   '1937': 144,
   '1945': 215,
   '1946': 253,
   '1948': 243,
   '1960': 241,
   '1971': 66,
   '1972': 37,
   '1974': 29,
   '1979': 31,
   '1992': 29,
   '1996': 14,
   '2000': 16,
   '1914': 350,
   '1922': 461,
   '1925': 382,
   '1930': 278,
   '1936': 167,
   '1941': 171,
   '1956': 292,
   '1957': 267,
   '2002': 14,
   '2007': 7,
   '2014': 6,
   '1921': 485,
   '1935': 172,
   '1938': 163,
   '1939': 180,
   '1949': 226,
   '1952': 249,
   '1953': 259,
   '1958': 268,
   '1964': 188,
   '1976': 29,
   '1984': 30,
   '1991': 21,
   '1993': 24,
   '1995': 19,
   '1999': 12,
   '2009': 6,
   '2011': 8,
   '1928': 319,
   '1951': 256,
   '1954': 300,
   '1961': 227,
   '1962': 235,
   '1968': 79,
   '1982': 21,
   '1985': 23,
   '1986': 27,
   '1989': 24,
   '2001': 11,
   '2003': 17,
   '1911': 168,
   '1913': 269,
   '1926': 309,
   '1942': 173,
   '1944': 195,
   '1950': 226,
   '1980': 33,
   '1981': 26,
   '1988': 23,
   '1997': 12,
   '2010': 5,
   '1918': 458,
   '1932': 225,
   '1940': 177,
   '1943': 234,
   '1955': 278,
   '1959': 258,
   '1963': 203,
   '1966': 116,
   '1967': 101,
   '1969': 66,
   '1983': 31,
   '1987': 25,
   '2004': 13,
   '2006': 12,
   '2008': 8,
   '1910': 141,
   '1915': 415,
   '1970': 77,
   '1973': 34,
   '1975': 23,
   '2013': 8,
   '1912': 266,
   '1917': 442,
   '1920': 452,
   '1927': 338,
   '1933': 187,
   '1934': 164,
   '1947': 281,
   '1965': 150,
   '1977': 31,
   '1978': 24,
   '1990': 31,
   '1994': 21,
   '1998': 20,
   '2005': 14}},
 {'_id': ObjectId('5eb51b8a9993d0175fb8ae1e'),
  'Gender': 'F',
  'Name': 'Mary',
  'State': 'WI',
  'YearsCountDict': {'1921': 950,
   '1933': 1042,
   '1945': 1780,
   '1946': 1972,
   '1947': 2255,
   '1952': 2246,
   '1955': 2198,
   '1958': 1864,
   '1975': 174,
   '1983': 172,
   '1992': 113,
   '1994': 132,
   '2004': 64,
   '2009': 36,
   '1914': 541,
   '1928': 1130,
   '1931': 1089,
   '1957': 2128,
   '1965': 1065,
   '1969': 450,
   '1991': 135,
   '1995': 108,
   '1996': 100,
   '1997': 91,
   '2005': 57,
   '1910': 260,
   '1918': 873,
   '1924': 975,
   '1925': 1010,
   '1934': 1178,
   '1950': 2061,
   '1981': 186,
   '1982': 150,
   '1984': 141,
   '2007': 53,
   '1917': 838,
   '1920': 920,
   '1922': 970,
   '1929': 1130,
   '1932': 1118,
   '1942': 1583,
   '1943': 1723,
   '1960': 1791,
   '1966': 877,
   '1970': 399,
   '1972': 278,
   '1976': 172,
   '1985': 136,
   '1988': 113,
   '1999': 79,
   '2006': 57,
   '1923': 928,
   '1938': 1308,
   '1951': 2247,
   '1954': 2399,
   '1959': 1775,
   '1961': 1669,
   '1968': 571,
   '1973': 213,
   '1989': 130,
   '1990': 129,
   '1998': 80,
   '2000': 68,
   '2002': 63,
   '2003': 70,
   '1913': 458,
   '1926': 960,
   '1930': 1128,
   '1935': 1115,
   '1949': 2069,
   '1953': 2126,
   '1967': 683,
   '1974': 217,
   '1980': 195,
   '1912': 394,
   '1915': 759,
   '1927': 1120,
   '1936': 1189,
   '1937': 1274,
   '1940': 1461,
   '1962': 1464,
   '1963': 1363,
   '1971': 333,
   '1978': 154,
   '1979': 178,
   '2001': 78,
   '2013': 36,
   '2014': 36,
   '1911': 278,
   '1916': 838,
   '1919': 771,
   '1939': 1288,
   '1941': 1313,
   '1944': 1699,
   '1948': 2170,
   '1956': 2013,
   '1964': 1309,
   '1977': 183,
   '1986': 136,
   '1987': 110,
   '1993': 123,
   '2008': 52,
   '2010': 41,
   '2011': 34,
   '2012': 28}},
 {'_id': ObjectId('5eb51b8a9993d0175fb8affe'),
  'Gender': 'F',
  'Name': 'Mary',
  'State': 'UT',
  'YearsCountDict': {'1921': 235,
   '1942': 189,
   '1950': 240,
   '1954': 230,
   '1955': 181,
   '1973': 83,
   '1978': 101,
   '2008': 54,
   '2013': 25,
   '1910': 57,
   '1911': 71,
   '1925': 189,
   '1929': 177,
   '1939': 179,
   '1941': 168,
   '1945': 195,
   '1946': 220,
   '1952': 230,
   '1967': 89,
   '1970': 97,
   '1991': 71,
   '1992': 64,
   '1994': 53,
   '2011': 40,
   '1931': 187,
   '1932': 164,
   '1944': 218,
   '1947': 240,
   '1949': 233,
   '1953': 230,
   '1956': 205,
   '1993': 62,
   '1997': 60,
   '1999': 62,
   '2009': 47,
   '1914': 123,
   '1915': 183,
   '1938': 165,
   '1961': 145,
   '1964': 129,
   '1965': 125,
   '1966': 110,
   '1971': 77,
   '1981': 112,
   '1987': 56,
   '1990': 68,
   '1995': 56,
   '2003': 64,
   '2005': 51,
   '2006': 52,
   '1913': 82,
   '1927': 200,
   '1936': 185,
   '1960': 148,
   '1963': 137,
   '1972': 77,
   '1977': 98,
   '1986': 53,
   '2001': 57,
   '2007': 48,
   '1912': 86,
   '1923': 219,
   '1924': 187,
   '1926': 202,
   '1930': 161,
   '1934': 158,
   '1958': 183,
   '1962': 171,
   '1979': 113,
   '1982': 104,
   '2002': 60,
   '1916': 172,
   '1917': 203,
   '1918': 206,
   '1943': 240,
   '1957': 207,
   '1968': 81,
   '1969': 79,
   '1974': 70,
   '1983': 79,
   '1985': 83,
   '1998': 86,
   '2000': 58,
   '2004': 50,
   '2012': 36,
   '1919': 195,
   '1920': 222,
   '1922': 223,
   '1928': 179,
   '1933': 164,
   '1935': 160,
   '1937': 161,
   '1940': 168,
   '1948': 256,
   '1951': 234,
   '1959': 164,
   '1975': 89,
   '1976': 88,
   '1980': 109,
   '1984': 84,
   '1988': 61,
   '1989': 60,
   '1996': 65,
   '2010': 30,
   '2014': 32}},
 {'_id': ObjectId('5eb51b8a9993d0175fb8e1e2'),
  'Gender': 'F',
  'Name': 'Mary',
  'State': 'CO',
  'YearsCountDict': {'1920': 576,
   '1929': 522,
   '1948': 583,
   '1954': 557,
   '1960': 438,
   '1985': 117,
   '1997': 89,
   '1998': 79,
   '2001': 70,
   '2003': 74,
   '2004': 82,
   '2005': 49,
   '1910': 193,
   '1919': 472,
   '1934': 509,
   '1938': 458,
   '1951': 575,
   '1952': 562,
   '1957': 474,
   '1959': 448,
   '1967': 219,
   '1970': 173,
   '1971': 156,
   '1977': 96,
   '1980': 106,
   '1992': 76,
   '2000': 90,
   '1941': 474,
   '1943': 534,
   '1947': 618,
   '1961': 418,
   '1983': 114,
   '1994': 76,
   '2009': 44,
   '1912': 234,
   '1916': 457,
   '1917': 462,
   '1928': 539,
   '1936': 487,
   '1939': 476,
   '1953': 574,
   '1956': 567,
   '1964': 355,
   '1965': 292,
   '1976': 122,
   '1984': 119,
   '1990': 75,
   '1991': 91,
   '1995': 79,
   '1996': 92,
   '2007': 39,
   '2011': 34,
   '1915': 387,
   '1918': 539,
   '1922': 612,
   '1937': 528,
   '1942': 500,
   '1944': 522,
   '1945': 476,
   '1950': 549,
   '1963': 347,
   '1969': 171,
   '1974': 111,
   '1982': 127,
   '1989': 102,
   '2002': 57,
   '1911': 169,
   '1925': 533,
   '1926': 542,
   '1927': 512,
   '1932': 496,
   '1946': 553,
   '1949': 559,
   '1968': 161,
   '1972': 119,
   '1978': 110,
   '1986': 98,
   '1999': 87,
   '2008': 50,
   '2010': 30,
   '2013': 40,
   '1913': 258,
   '1921': 627,
   '1923': 626,
   '1924': 551,
   '1930': 515,
   '1935': 472,
   '1966': 241,
   '1973': 105,
   '1979': 113,
   '1981': 125,
   '1987': 111,
   '1988': 97,
   '1993': 75,
   '2006': 61,
   '2014': 38,
   '1914': 338,
   '1931': 511,
   '1933': 469,
   '1940': 486,
   '1955': 498,
   '1958': 462,
   '1962': 343,
   '1975': 112,
   '2012': 32}},
 {'_id': ObjectId('5eb51b8a9993d0175fb8ef3d'),
  'Gender': 'F',
  'Name': 'Mary',
  'State': 'IN',
  'YearsCountDict': {'1917': 2231,
   '1935': 1406,
   '1939': 1278,
   '1963': 975,
   '1969': 504,
   '1978': 300,
   '1983': 250,
   '1986': 198,
   '1992': 185,
   '1995': 177,
   '2003': 125,
   '2005': 115,
   '1911': 612,
   '1912': 935,
   '1922': 2376,
   '1925': 2220,
   '1936': 1370,
   '1949': 1559,
   '1954': 1601,
   '1964': 956,
   '1974': 313,
   '1977': 300,
   '1980': 288,
   '1984': 208,
   '1996': 195,
   '2010': 69,
   '1915': 1977,
   '1943': 1586,
   '1947': 1611,
   '1948': 1664,
   '1960': 1257,
   '1962': 1016,
   '1966': 684,
   '1973': 320,
   '1979': 282,
   '2000': 157,
   '2002': 127,
   '1913': 1105,
   '1921': 2633,
   '1928': 2050,
   '1929': 1795,
   '1932': 1616,
   '1933': 1497,
   '1937': 1437,
   '1944': 1408,
   '1946': 1610,
   '1950': 1517,
   '1968': 536,
   '1989': 206,
   '1993': 181,
   '1997': 165,
   '1998': 160,
   '2001': 137,
   '2009': 75,
   '2011': 81,
   '1910': 619,
   '1926': 2063,
   '1938': 1459,
   '1941': 1360,
   '1961': 1151,
   '1967': 647,
   '1976': 273,
   '1919': 2209,
   '1920': 2394,
   '1924': 2400,
   '1942': 1547,
   '1953': 1525,
   '1957': 1476,
   '1965': 844,
   '1970': 516,
   '1971': 422,
   '1981': 277,
   '1987': 186,
   '1991': 204,
   '1999': 154,
   '2014': 72,
   '1914': 1349,
   '1916': 2223,
   '1923': 2397,
   '1930': 1884,
   '1931': 1581,
   '1951': 1621,
   '1952': 1540,
   '1956': 1489,
   '1959': 1284,
   '1975': 284,
   '1982': 291,
   '1988': 193,
   '1994': 200,
   '2006': 102,
   '2012': 63,
   '2013': 59,
   '1918': 2316,
   '1927': 2186,
   '1934': 1466,
   '1940': 1286,
   '1945': 1331,
   '1955': 1471,
   '1958': 1354,
   '1972': 358,
   '1985': 210,
   '1990': 215,
   '2004': 145,
   '2007': 92,
   '2008': 90}},
 {'_id': ObjectId('5eb51b8a9993d0175fb8f475'),
  'Gender': 'F',
  'Name': 'Mary',
  'State': 'MO',
  'YearsCountDict': {'1911': 626,
   '1917': 1882,
   '1918': 1950,
   '1923': 2115,
   '1924': 2261,
   '1925': 2154,
   '1927': 2315,
   '1945': 1682,
   '1947': 2062,
   '1956': 1724,
   '1963': 1060,
   '1968': 492,
   '1971': 440,
   '1981': 269,
   '1996': 164,
   '1910': 611,
   '1931': 1884,
   '1972': 357,
   '1975': 295,
   '1986': 215,
   '2010': 69,
   '1932': 1972,
   '1946': 1946,
   '1951': 1783,
   '1954': 1814,
   '1965': 913,
   '1970': 468,
   '1976': 277,
   '1982': 259,
   '1989': 224,
   '1993': 203,
   '2004': 98,
   '2012': 62,
   '1912': 834,
   '1920': 2022,
   '1921': 2122,
   '1930': 1990,
   '1935': 1668,
   '1937': 1680,
   '1943': 1850,
   '1953': 1666,
   '1958': 1536,
   '1959': 1409,
   '1974': 291,
   '1978': 243,
   '1980': 278,
   '1987': 195,
   '2001': 138,
   '2002': 114,
   '2003': 130,
   '2007': 94,
   '2011': 74,
   '1914': 1126,
   '1929': 1872,
   '1936': 1681,
   '1940': 1707,
   '1950': 1816,
   '1952': 1794,
   '1957': 1609,
   '1967': 628,
   '1991': 233,
   '1992': 232,
   '2006': 117,
   '1916': 1735,
   '1933': 1724,
   '1941': 1743,
   '1948': 1933,
   '1960': 1350,
   '1964': 1043,
   '1969': 518,
   '1985': 239,
   '2000': 163,
   '1913': 971,
   '1915': 1562,
   '1919': 1817,
   '1928': 2088,
   '1934': 1797,
   '1938': 1688,
   '1942': 1853,
   '1944': 1794,
   '1955': 1692,
   '1962': 1133,
   '1988': 227,
   '1994': 199,
   '1995': 208,
   '1999': 138,
   '1922': 2107,
   '1926': 2097,
   '1939': 1618,
   '1949': 1862,
   '1961': 1278,
   '1966': 710,
   '1973': 296,
   '1977': 290,
   '1979': 246,
   '1983': 282,
   '1984': 238,
   '1990': 218,
   '1997': 173,
   '1998': 164,
   '2005': 107,
   '2008': 77,
   '2009': 103,
   '2013': 70,
   '2014': 73}},
 {'_id': ObjectId('5eb51b8a9993d0175fb8fd77'),
  'Gender': 'F',
  'Name': 'Mary',
  'State': 'GA',
  'YearsCountDict': {'1933': 2113,
   '1943': 2382,
   '1946': 2253,
   '1950': 2062,
   '1955': 1625,
   '1964': 978,
   '1983': 342,
   '1999': 377,
   '2001': 315,
   '1915': 1683,
   '1918': 2203,
   '1938': 2183,
   '1940': 2173,
   '1949': 2251,
   '1956': 1507,
   '1960': 1169,
   '1962': 1020,
   '1970': 609,
   '1978': 342,
   '1982': 372,
   '1984': 364,
   '2004': 297,
   '2007': 195,
   '2010': 175,
   '1911': 893,
   '1912': 1170,
   '1913': 1223,
   '1928': 2213,
   '1945': 2100,
   '1967': 690,
   '1972': 452,
   '1974': 427,
   '1980': 404,
   '1989': 365,
   '1990': 358,
   '1997': 348,
   '2002': 348,
   '2013': 142,
   '1919': 2386,
   '1929': 2153,
   '1930': 2171,
   '1939': 2197,
   '1941': 2175,
   '1965': 843,
   '2006': 243,
   '1921': 2474,
   '1923': 2401,
   '1924': 2502,
   '1931': 2133,
   '1947': 2350,
   '1952': 1969,
   '1953': 1823,
   '1963': 897,
   '1996': 341,
   '2000': 364,
   '2003': 296,
   '2005': 244,
   '2011': 148,
   '1917': 2025,
   '1932': 2212,
   '1936': 1923,
   '1942': 2314,
   '1957': 1456,
   '1958': 1319,
   '1977': 383,
   '1979': 367,
   '1985': 369,
   '2008': 199,
   '1916': 1947,
   '1920': 2538,
   '1922': 2614,
   '1926': 2377,
   '1934': 2219,
   '1937': 2096,
   '1951': 1988,
   '1954': 1777,
   '1959': 1277,
   '1973': 399,
   '1975': 341,
   '1976': 363,
   '1991': 378,
   '1992': 358,
   '1993': 366,
   '1994': 352,
   '2014': 147,
   '1910': 841,
   '1914': 1471,
   '1925': 2558,
   '1927': 2354,
   '1935': 2086,
   '1944': 2314,
   '1948': 2285,
   '1961': 1103,
   '1966': 791,
   '1968': 591,
   '1969': 585,
   '1971': 583,
   '1981': 380,
   '1986': 366,
   '1987': 395,
   '1988': 346,
   '1995': 344,
   '1998': 351,
   '2009': 194,
   '2012': 133}},
 {'_id': ObjectId('5eb51b8a9993d0175fb9046e'),
  'Gender': 'F',
  'Name': 'Mary',
  'State': 'DE',
  'YearsCountDict': {'1917': 161,
   '1918': 178,
   '1925': 101,
   '1928': 116,
   '1945': 139,
   '1948': 145,
   '1958': 158,
   '1960': 131,
   '1963': 130,
   '1971': 46,
   '1979': 24,
   '1980': 32,
   '1983': 27,
   '1985': 18,
   '1987': 39,
   '1989': 32,
   '2001': 24,
   '2003': 14,
   '2006': 5,
   '1922': 162,
   '1923': 149,
   '1933': 71,
   '1934': 93,
   '1935': 92,
   '1939': 98,
   '1961': 142,
   '1962': 139,
   '1965': 98,
   '1968': 62,
   '1986': 31,
   '1992': 21,
   '1919': 181,
   '1920': 165,
   '1929': 115,
   '1930': 105,
   '1931': 100,
   '1936': 102,
   '1937': 98,
   '1944': 135,
   '1946': 124,
   '1956': 165,
   '1974': 44,
   '1978': 34,
   '1981': 30,
   '1990': 34,
   '1997': 21,
   '2000': 24,
   '1921': 141,
   '1924': 132,
   '1940': 115,
   '1941': 111,
   '1949': 135,
   '1952': 135,
   '1977': 23,
   '1984': 24,
   '1999': 11,
   '2002': 18,
   '2004': 13,
   '2008': 10,
   '1926': 115,
   '1927': 120,
   '1932': 97,
   '1938': 96,
   '1951': 132,
   '1953': 162,
   '1954': 166,
   '1959': 166,
   '1988': 24,
   '1993': 28,
   '1994': 35,
   '1995': 21,
   '1912': 77,
   '1913': 80,
   '1914': 101,
   '1942': 123,
   '1947': 139,
   '1955': 189,
   '1957': 166,
   '1966': 93,
   '1972': 55,
   '1975': 42,
   '1976': 39,
   '1982': 28,
   '1991': 33,
   '1996': 16,
   '2005': 8,
   '2009': 5,
   '1915': 141,
   '1916': 114,
   '1950': 149,
   '1964': 128,
   '1970': 55,
   '1973': 30,
   '2010': 6,
   '2011': 7,
   '1910': 59,
   '1911': 49,
   '1943': 133,
   '1967': 66,
   '1969': 56,
   '1998': 25,
   '2007': 5,
   '2014': 5}},
 {'_id': ObjectId('5eb51b8a9993d0175fb90946'),
  'Gender': 'F',
  'Name': 'Mary',
  'State': 'AR',
  'YearsCountDict': {'1911': 386,
   '1927': 1340,
   '1944': 1114,
   '1954': 718,
   '1965': 339,
   '1991': 119,
   '1996': 69,
   '1998': 86,
   '1999': 87,
   '2012': 34,
   '2014': 34,
   '1913': 527,
   '1921': 1190,
   '1930': 1322,
   '1938': 1159,
   '1952': 842,
   '1953': 762,
   '1955': 682,
   '1956': 642,
   '1961': 441,
   '1972': 183,
   '1973': 139,
   '1992': 89,
   '1993': 92,
   '1995': 96,
   '1997': 97,
   '2003': 79,
   '1916': 904,
   '1924': 1363,
   '1932': 1249,
   '1943': 1227,
   '1945': 971,
   '1948': 1131,
   '1949': 1017,
   '1958': 551,
   '1959': 502,
   '1968': 210,
   '1976': 120,
   '1977': 145,
   '1989': 112,
   '1990': 95,
   '2005': 68,
   '2007': 54,
   '1912': 491,
   '1915': 837,
   '1920': 1206,
   '1928': 1172,
   '1931': 1172,
   '1935': 1227,
   '1939': 1111,
   '1940': 1189,
   '1942': 1208,
   '1966': 295,
   '1974': 165,
   '1975': 149,
   '1979': 120,
   '1982': 123,
   '1983': 130,
   '1987': 97,
   '1917': 957,
   '1922': 1283,
   '1923': 1257,
   '1951': 893,
   '1964': 379,
   '1969': 195,
   '1970': 228,
   '1981': 133,
   '1984': 121,
   '1986': 119,
   '2000': 85,
   '2013': 46,
   '1914': 645,
   '1937': 1148,
   '1950': 975,
   '1957': 669,
   '1962': 438,
   '1963': 389,
   '1978': 132,
   '2002': 100,
   '1918': 1048,
   '1919': 1101,
   '1933': 1149,
   '1934': 1211,
   '1941': 1223,
   '1946': 1074,
   '1947': 1188,
   '1971': 177,
   '1980': 139,
   '1988': 128,
   '1994': 95,
   '2001': 88,
   '2010': 45,
   '1910': 408,
   '1925': 1403,
   '1926': 1321,
   '1929': 1240,
   '1936': 1088,
   '1960': 478,
   '1967': 255,
   '1985': 131,
   '2004': 86,
   '2006': 57,
   '2008': 39,
   '2009': 44,
   '2011': 33}},
 {'_id': ObjectId('5eb51b8a9993d0175fb92004'),
  'Gender': 'F',
  'Name': 'Mary',
  'State': 'NH',
  'YearsCountDict': {'1913': 98,
   '1915': 186,
   '1919': 181,
   '1922': 171,
   '1924': 167,
   '1926': 137,
   '1937': 101,
   '1953': 128,
   '1970': 56,
   '1977': 27,
   '1978': 25,
   '1981': 37,
   '1991': 27,
   '1998': 23,
   '1999': 27,
   '2002': 27,
   '2010': 10,
   '2014': 16,
   '1932': 124,
   '1936': 90,
   '1943': 117,
   '1949': 146,
   '1951': 136,
   '1954': 121,
   '1956': 135,
   '1963': 103,
   '1982': 37,
   '1988': 25,
   '1917': 139,
   '1928': 151,
   '1933': 113,
   '1941': 121,
   '1948': 158,
   '1962': 102,
   '1964': 104,
   '1980': 39,
   '1983': 22,
   '1987': 31,
   '1989': 37,
   '1993': 26,
   '1995': 32,
   '2004': 15,
   '1910': 50,
   '1925': 154,
   '1931': 110,
   '1935': 96,
   '1939': 104,
   '1947': 160,
   '1950': 135,
   '1992': 31,
   '1916': 157,
   '1927': 136,
   '1930': 124,
   '1934': 100,
   '1940': 112,
   '1957': 153,
   '1959': 127,
   '1967': 77,
   '1969': 35,
   '1976': 24,
   '2000': 16,
   '2001': 25,
   '2008': 5,
   '2012': 5,
   '1942': 119,
   '1944': 113,
   '1952': 110,
   '1965': 99,
   '1966': 63,
   '1972': 35,
   '1973': 33,
   '1985': 32,
   '1990': 31,
   '1994': 29,
   '2003': 22,
   '2006': 14,
   '2007': 7,
   '1911': 68,
   '1912': 95,
   '1920': 156,
   '1923': 148,
   '1938': 107,
   '1945': 125,
   '1946': 146,
   '1955': 131,
   '1958': 120,
   '1961': 107,
   '1971': 37,
   '1975': 34,
   '1979': 34,
   '1986': 30,
   '1996': 12,
   '2009': 10,
   '1914': 138,
   '1918': 158,
   '1921': 182,
   '1929': 107,
   '1960': 119,
   '1968': 54,
   '1974': 44,
   '1984': 27,
   '1997': 25,
   '2005': 26}},
 {'_id': ObjectId('5eb51b8a9993d0175fb92d0d'),
  'Gender': 'F',
  'Name': 'Mary',
  'State': 'NV',
  'YearsCountDict': {'1959': 46,
   '1962': 51,
   '1983': 28,
   '1997': 22,
   '2008': 11,
   '2014': 9,
   '1910': 10,
   '1911': 6,
   '1912': 5,
   '1913': 21,
   '1914': 12,
   '1915': 27,
   '1925': 36,
   '1926': 33,
   '1927': 34,
   '1928': 25,
   '1934': 23,
   '1935': 29,
   '1936': 32,
   '1937': 35,
   '1950': 49,
   '1952': 60,
   '1953': 61,
   '1956': 59,
   '1967': 52,
   '1973': 24,
   '1982': 29,
   '1992': 25,
   '2004': 15,
   '2010': 13,
   '1920': 28,
   '1921': 33,
   '1922': 30,
   '1923': 29,
   '1924': 29,
   '1947': 61,
   '1954': 65,
   '1957': 69,
   '1965': 69,
   '1976': 20,
   '1977': 19,
   '1978': 19,
   '1980': 28,
   '1985': 21,
   '1987': 22,
   '1990': 24,
   '2002': 24,
   '2012': 17,
   '2013': 19,
   '1938': 14,
   '1939': 30,
   '1940': 22,
   '1993': 24,
   '1995': 33,
   '1999': 29,
   '2000': 18,
   '2003': 17,
   '2007': 16,
   '1929': 22,
   '1930': 25,
   '1931': 21,
   '1932': 23,
   '1933': 21,
   '1945': 55,
   '1946': 45,
   '1963': 53,
   '1964': 58,
   '1966': 40,
   '1968': 32,
   '1970': 32,
   '1971': 36,
   '1943': 50,
   '1944': 50,
   '1951': 54,
   '1975': 25,
   '1981': 28,
   '1994': 29,
   '1998': 21,
   '2001': 22,
   '1955': 67,
   '1984': 27,
   '1991': 33,
   '1996': 27,
   '2005': 20,
   '2006': 23,
   '2009': 13,
   '1916': 40,
   '1917': 31,
   '1918': 31,
   '1919': 37,
   '1941': 23,
   '1942': 48,
   '1948': 48,
   '1949': 57,
   '1958': 40,
   '1960': 50,
   '1961': 46,
   '1969': 41,
   '1972': 36,
   '1974': 16,
   '1979': 28,
   '1986': 21,
   '1988': 25,
   '1989': 26,
   '2011': 10}},
 {'_id': ObjectId('5eb51b8b9993d0175fb99714'),
  'Gender': 'F',
  'Name': 'Mary',
  'State': 'OH',
  'YearsCountDict': {'1918': 4202,
   '1927': 4026,
   '1929': 3554,
   '1931': 3265,
   '1952': 3561,
   '1957': 3459,
   '1962': 2434,
   '1987': 377,
   '1991': 388,
   '1993': 358,
   '1996': 327,
   '2007': 153,
   '1911': 1277,
   '1933': 2917,
   '1946': 3333,
   '1949': 3377,
   '1966': 1638,
   '1992': 406,
   '2000': 237,
   '2001': 234,
   '2013': 96,
   '2014': 99,
   '1912': 1868,
   '1916': 3736,
   '1941': 2722,
   '1944': 2885,
   '1958': 3088,
   '1964': 2247,
   '1980': 565,
   '1995': 348,
   '1999': 319,
   '2010': 113,
   '1914': 2731,
   '1922': 4384,
   '1935': 2753,
   '1939': 2721,
   '1940': 2692,
   '1945': 2680,
   '1953': 3374,
   '1959': 3053,
   '1961': 2568,
   '1969': 1126,
   '1973': 679,
   '1977': 559,
   '1979': 525,
   '1982': 504,
   '2005': 186,
   '1913': 2181,
   '1924': 4200,
   '1936': 2767,
   '1951': 3522,
   '1963': 2285,
   '1967': 1403,
   '1978': 553,
   '1994': 349,
   '2004': 174,
   '2006': 180,
   '1919': 3814,
   '1921': 4386,
   '1928': 4003,
   '1942': 3035,
   '1950': 3318,
   '1985': 428,
   '2012': 112,
   '1917': 3948,
   '1925': 3812,
   '1930': 3585,
   '1932': 3090,
   '1934': 2901,
   '1955': 3557,
   '1970': 1097,
   '1972': 758,
   '1975': 627,
   '1986': 399,
   '1989': 371,
   '1990': 388,
   '1998': 294,
   '2003': 204,
   '2009': 119,
   '1910': 1099,
   '1915': 3653,
   '1920': 4232,
   '1923': 4128,
   '1926': 3806,
   '1937': 2759,
   '1938': 2788,
   '1943': 3166,
   '1947': 3563,
   '1948': 3463,
   '1954': 3754,
   '1956': 3467,
   '1960': 2904,
   '1965': 1881,
   '1968': 1177,
   '1971': 963,
   '1974': 701,
   '1976': 547,
   '1981': 591,
   '1983': 487,
   '1984': 445,
   '1988': 373,
   '1997': 323,
   '2002': 225,
   '2008': 114,
   '2011': 125}},
 {'_id': ObjectId('5eb51b8b9993d0175fb9a6fc'),
  'Gender': 'F',
  'Name': 'Mary',
  'State': 'TX',
  'YearsCountDict': {'1911': 906,
   '1929': 3421,
   '1932': 3181,
   '1935': 3180,
   '1938': 3382,
   '1941': 3619,
   '1945': 3778,
   '1947': 4211,
   '1958': 3261,
   '1959': 3078,
   '1962': 2435,
   '1964': 2333,
   '1969': 1352,
   '1980': 905,
   '1993': 551,
   '1915': 2051,
   '1923': 3247,
   '1924': 3529,
   '1933': 2996,
   '1942': 3772,
   '1954': 4081,
   '1960': 2769,
   '1961': 2614,
   '1971': 1257,
   '1992': 615,
   '1998': 425,
   '2002': 381,
   '2009': 248,
   '1925': 3535,
   '1928': 3422,
   '1934': 3254,
   '1937': 3122,
   '1955': 3726,
   '1972': 968,
   '1975': 875,
   '1995': 557,
   '1910': 895,
   '1916': 2328,
   '1920': 3192,
   '1931': 3302,
   '2011': 211,
   '1927': 3607,
   '1949': 4141,
   '1950': 4061,
   '1952': 4153,
   '1957': 3617,
   '1963': 2342,
   '1968': 1423,
   '1977': 794,
   '1979': 867,
   '1984': 748,
   '1985': 803,
   '1986': 728,
   '1988': 659,
   '1991': 595,
   '2006': 315,
   '2013': 203,
   '1912': 1179,
   '1917': 2504,
   '1919': 2740,
   '1936': 3110,
   '1939': 3254,
   '1940': 3375,
   '1946': 4203,
   '1951': 3992,
   '1956': 3621,
   '1973': 973,
   '1974': 915,
   '1976': 770,
   '1978': 755,
   '1997': 445,
   '1999': 477,
   '2001': 370,
   '2004': 332,
   '2007': 270,
   '2012': 204,
   '1918': 2589,
   '1922': 3370,
   '1926': 3442,
   '1930': 3395,
   '1944': 4139,
   '1948': 4177,
   '1953': 4154,
   '1966': 1720,
   '1967': 1562,
   '1970': 1352,
   '1981': 820,
   '1987': 650,
   '1996': 463,
   '2005': 314,
   '2008': 263,
   '1913': 1425,
   '1914': 1707,
   '1921': 3413,
   '1943': 4065,
   '1965': 1891,
   '1982': 902,
   '1983': 783,
   '1989': 665,
   '1990': 663,
   '1994': 526,
   '2000': 428,
   '2003': 346,
   '2010': 196,
   '2014': 220}},
 {'_id': ObjectId('5eb51b8b9993d0175fb9d386'),
  'Gender': 'F',
  'Name': 'Mary',
  'State': 'NE',
  'YearsCountDict': {'1911': 142,
   '1915': 371,
   '1925': 485,
   '1928': 513,
   '1933': 476,
   '1940': 460,
   '1963': 399,
   '1973': 87,
   '1981': 74,
   '1999': 40,
   '2008': 21,
   '1920': 516,
   '1932': 514,
   '1934': 561,
   '1948': 658,
   '1949': 644,
   '1952': 639,
   '1985': 56,
   '1988': 52,
   '1991': 51,
   '2003': 38,
   '1910': 161,
   '1926': 483,
   '1931': 546,
   '1936': 503,
   '1958': 478,
   '1962': 326,
   '1968': 152,
   '1998': 50,
   '2006': 37,
   '1913': 255,
   '1938': 510,
   '1946': 563,
   '1953': 630,
   '1972': 108,
   '1975': 64,
   '1994': 43,
   '2001': 33,
   '2004': 31,
   '2011': 16,
   '2014': 13,
   '1914': 288,
   '1947': 624,
   '1950': 665,
   '1957': 512,
   '1966': 197,
   '1967': 177,
   '1982': 81,
   '1987': 67,
   '1990': 46,
   '1992': 44,
   '1993': 58,
   '2000': 38,
   '2007': 16,
   '2012': 22,
   '1921': 575,
   '1924': 580,
   '1941': 421,
   '1942': 454,
   '1955': 552,
   '1956': 618,
   '1969': 133,
   '1979': 64,
   '1980': 78,
   '1986': 47,
   '1989': 54,
   '1995': 47,
   '1996': 40,
   '1997': 47,
   '1916': 413,
   '1919': 435,
   '1922': 531,
   '1923': 533,
   '1929': 545,
   '1935': 480,
   '1939': 458,
   '1960': 452,
   '1964': 395,
   '1965': 284,
   '1970': 145,
   '1971': 120,
   '1976': 72,
   '1977': 64,
   '1984': 53,
   '2005': 22,
   '2010': 12,
   '1912': 215,
   '1917': 439,
   '1918': 486,
   '1927': 577,
   '1930': 535,
   '1937': 466,
   '1943': 506,
   '1944': 530,
   '1945': 498,
   '1951': 619,
   '1954': 620,
   '1959': 493,
   '1961': 433,
   '1974': 78,
   '1978': 56,
   '1983': 68,
   '2002': 28,
   '2009': 26,
   '2013': 19}},
 {'_id': ObjectId('5eb51b8b9993d0175fba06b3'),
  'Gender': 'F',
  'Name': 'Mary',
  'State': 'NJ',
  'YearsCountDict': {'1910': 593,
   '1919': 1869,
   '1923': 1786,
   '1929': 1399,
   '1933': 1141,
   '1936': 964,
   '1938': 1001,
   '1949': 1324,
   '1953': 1557,
   '1965': 1073,
   '2012': 42,
   '2013': 44,
   '1911': 733,
   '1922': 1801,
   '1928': 1521,
   '1937': 1057,
   '1966': 868,
   '1983': 200,
   '1984': 202,
   '2004': 103,
   '1915': 1923,
   '1921': 1922,
   '1927': 1567,
   '1931': 1272,
   '1932': 1265,
   '1947': 1448,
   '1960': 1620,
   '1964': 1242,
   '1975': 262,
   '1988': 189,
   '1999': 148,
   '2002': 127,
   '2005': 82,
   '2007': 70,
   '1914': 1449,
   '1917': 2014,
   '1920': 1885,
   '1939': 1011,
   '1943': 1324,
   '1944': 1240,
   '1951': 1407,
   '1955': 1709,
   '1958': 1725,
   '1961': 1462,
   '1970': 491,
   '1976': 229,
   '1978': 210,
   '1987': 193,
   '1996': 163,
   '1997': 156,
   '1998': 135,
   '2006': 81,
   '1916': 1882,
   '1935': 1038,
   '1942': 1331,
   '1959': 1567,
   '1962': 1273,
   '1968': 608,
   '1971': 473,
   '1973': 301,
   '1974': 263,
   '1981': 210,
   '1992': 214,
   '1994': 169,
   '2000': 120,
   '2009': 44,
   '2011': 47,
   '2014': 44,
   '1918': 1990,
   '1930': 1388,
   '1940': 1013,
   '1941': 1128,
   '1945': 1203,
   '1950': 1331,
   '1956': 1772,
   '1972': 367,
   '1977': 272,
   '1985': 199,
   '1986': 187,
   '2001': 118,
   '1912': 1024,
   '1925': 1617,
   '1934': 1062,
   '1946': 1420,
   '1980': 225,
   '1989': 203,
   '1990': 202,
   '2008': 63,
   '2010': 49,
   '1913': 1078,
   '1924': 1771,
   '1926': 1466,
   '1948': 1398,
   '1952': 1467,
   '1954': 1823,
   '1957': 1779,
   '1963': 1300,
   '1967': 776,
   '1969': 566,
   '1979': 225,
   '1982': 221,
   '1991': 233,
   '1993': 190,
   '1995': 134,
   '2003': 112}},
 {'_id': ObjectId('5eb51b8b9993d0175fba2674'),
  'Gender': 'F',
  'Name': 'Mary',
  'State': 'MT',
  'YearsCountDict': {'1912': 108,
   '1919': 287,
   '1927': 201,
   '1931': 184,
   '1944': 192,
   '1950': 265,
   '1952': 252,
   '1953': 237,
   '1955': 252,
   '1959': 211,
   '1966': 79,
   '1970': 49,
   '1976': 33,
   '1997': 11,
   '1998': 12,
   '2010': 8,
   '1916': 257,
   '1932': 192,
   '1943': 204,
   '1949': 248,
   '1961': 170,
   '1962': 135,
   '1989': 26,
   '1991': 22,
   '1917': 289,
   '1918': 342,
   '1930': 208,
   '1942': 209,
   '1958': 187,
   '1964': 107,
   '1969': 48,
   '1978': 33,
   '1979': 40,
   '1984': 35,
   '1995': 16,
   '2000': 21,
   '2005': 10,
   '2012': 9,
   '2013': 6,
   '1929': 211,
   '1935': 184,
   '1940': 185,
   '1945': 169,
   '1947': 221,
   '1963': 126,
   '1981': 41,
   '1999': 16,
   '2003': 15,
   '2009': 7,
   '1913': 140,
   '1920': 311,
   '1922': 281,
   '1923': 256,
   '1926': 216,
   '1928': 200,
   '1936': 184,
   '1941': 186,
   '1975': 30,
   '1977': 29,
   '1987': 19,
   '1990': 24,
   '1996': 17,
   '2001': 11,
   '2014': 8,
   '1910': 81,
   '1911': 72,
   '1921': 326,
   '1924': 261,
   '1933': 170,
   '1948': 228,
   '1965': 121,
   '1973': 38,
   '1974': 31,
   '1994': 23,
   '2002': 7,
   '2004': 5,
   '2007': 10,
   '1925': 213,
   '1934': 232,
   '1937': 177,
   '1938': 177,
   '1946': 240,
   '1951': 238,
   '1954': 254,
   '1957': 239,
   '1967': 62,
   '1968': 68,
   '1980': 31,
   '1983': 29,
   '1988': 27,
   '2008': 10,
   '1914': 187,
   '1915': 244,
   '1939': 164,
   '1956': 235,
   '1960': 191,
   '1971': 51,
   '1972': 35,
   '1982': 32,
   '1985': 16,
   '1986': 27,
   '1992': 23,
   '1993': 18,
   '2006': 9,
   '2011': 5}},
 {'_id': ObjectId('5eb51b8b9993d0175fba4219'),
  'Gender': 'F',
  'Name': 'Mary',
  'State': 'KY',
  'YearsCountDict': {'1911': 827,
   '1914': 1322,
   '1918': 2139,
   '1920': 2338,
   '1934': 1881,
   '1945': 1434,
   '1947': 1814,
   '1979': 310,
   '1980': 352,
   '1985': 221,
   '1990': 188,
   '1991': 187,
   '1996': 151,
   '1917': 1913,
   '1922': 2442,
   '1923': 2345,
   '1925': 2341,
   '1932': 1950,
   '1948': 1673,
   '1959': 1131,
   '1964': 862,
   '1966': 596,
   '1972': 369,
   '1973': 338,
   '1999': 132,
   '2002': 119,
   '2008': 71,
   '2014': 40,
   '1913': 1189,
   '1930': 2001,
   '1939': 1731,
   '1954': 1464,
   '1968': 511,
   '1970': 462,
   '1981': 302,
   '1983': 247,
   '2007': 85,
   '2009': 72,
   '1924': 2537,
   '1926': 2312,
   '1933': 1852,
   '1937': 1725,
   '1940': 1787,
   '1944': 1599,
   '1949': 1593,
   '1952': 1385,
   '1953': 1409,
   '1958': 1127,
   '1974': 344,
   '1975': 339,
   '1978': 305,
   '1986': 212,
   '2000': 143,
   '2011': 62,
   '1916': 1834,
   '1921': 2511,
   '1936': 1690,
   '1941': 1697,
   '1946': 1734,
   '1957': 1198,
   '1965': 742,
   '1967': 586,
   '1971': 461,
   '1976': 310,
   '1977': 318,
   '1992': 185,
   '2003': 100,
   '2010': 50,
   '1928': 2165,
   '1956': 1245,
   '1961': 966,
   '1962': 945,
   '1969': 446,
   '1984': 261,
   '1987': 202,
   '1997': 160,
   '1998': 149,
   '2001': 120,
   '1915': 1807,
   '1931': 1875,
   '1963': 841,
   '1988': 174,
   '1989': 211,
   '1993': 182,
   '1994': 179,
   '1995': 162,
   '2004': 80,
   '2006': 105,
   '1910': 793,
   '1912': 984,
   '1919': 2161,
   '1927': 2352,
   '1929': 1994,
   '1935': 1811,
   '1938': 1783,
   '1942': 1747,
   '1943': 1678,
   '1950': 1560,
   '1951': 1464,
   '1955': 1311,
   '1960': 997,
   '1982': 256,
   '2005': 97,
   '2012': 59,
   '2013': 71}},
 {'_id': ObjectId('5eb51b8b9993d0175fba45e4'),
  'Gender': 'F',
  'Name': 'Mary',
  'State': 'ID',
  'YearsCountDict': {'1924': 180,
   '1927': 142,
   '1970': 46,
   '1974': 43,
   '1979': 49,
   '1980': 56,
   '1995': 30,
   '1996': 22,
   '1915': 149,
   '1918': 214,
   '1928': 147,
   '1939': 144,
   '1940': 144,
   '1947': 188,
   '1952': 186,
   '1958': 154,
   '1972': 42,
   '1982': 45,
   '1987': 21,
   '2001': 18,
   '2003': 18,
   '1916': 171,
   '1933': 135,
   '1937': 158,
   '1941': 126,
   '1944': 174,
   '1945': 169,
   '1946': 180,
   '1951': 170,
   '1959': 134,
   '1965': 98,
   '1973': 34,
   '1975': 51,
   '1992': 24,
   '1999': 24,
   '2000': 28,
   '2013': 9,
   '1912': 77,
   '1913': 91,
   '1938': 156,
   '1943': 162,
   '1953': 184,
   '1955': 167,
   '1961': 110,
   '1962': 94,
   '1964': 101,
   '1967': 54,
   '1990': 29,
   '1991': 21,
   '1993': 33,
   '2002': 22,
   '2008': 11,
   '1910': 53,
   '1911': 50,
   '1922': 223,
   '1926': 135,
   '1954': 159,
   '1957': 149,
   '1963': 99,
   '1981': 47,
   '1994': 23,
   '1997': 16,
   '1998': 15,
   '2005': 17,
   '1919': 185,
   '1920': 209,
   '1921': 207,
   '1929': 151,
   '1932': 145,
   '1948': 177,
   '1950': 186,
   '1960': 121,
   '1976': 37,
   '1978': 42,
   '1983': 36,
   '1985': 37,
   '1989': 23,
   '2009': 12,
   '2010': 7,
   '2011': 10,
   '1917': 199,
   '1931': 159,
   '1934': 120,
   '1935': 136,
   '1936': 145,
   '1942': 150,
   '1984': 35,
   '1988': 32,
   '2006': 17,
   '2007': 13,
   '2012': 11,
   '2014': 17,
   '1914': 93,
   '1923': 195,
   '1925': 169,
   '1930': 125,
   '1949': 198,
   '1956': 156,
   '1966': 66,
   '1968': 56,
   '1969': 44,
   '1971': 49,
   '1977': 54,
   '1986': 24,
   '2004': 25}},
 {'_id': ObjectId('5eb51b8b9993d0175fba5735'),
  'Gender': 'F',
  'Name': 'Mary',
  'State': 'KS',
  'YearsCountDict': {'1925': 926,
   '1926': 905,
   '1928': 860,
   '1936': 711,
   '1948': 727,
   '1950': 681,
   '1959': 537,
   '1970': 153,
   '1971': 138,
   '1975': 107,
   '1976': 83,
   '1982': 119,
   '1983': 86,
   '1991': 64,
   '1994': 81,
   '2001': 49,
   '1920': 890,
   '1943': 746,
   '1974': 90,
   '2003': 56,
   '2006': 29,
   '2013': 32,
   '1912': 369,
   '1927': 987,
   '1929': 845,
   '1931': 838,
   '1934': 814,
   '1937': 649,
   '1940': 662,
   '1952': 724,
   '1954': 655,
   '1961': 427,
   '1962': 425,
   '1964': 358,
   '1973': 110,
   '1989': 85,
   '1999': 50,
   '2000': 67,
   '2005': 39,
   '2014': 33,
   '1911': 278,
   '1922': 979,
   '1933': 771,
   '1935': 673,
   '1942': 688,
   '1946': 739,
   '1951': 744,
   '1955': 718,
   '1958': 552,
   '1977': 93,
   '1981': 118,
   '1990': 62,
   '1993': 84,
   '2008': 45,
   '2009': 27,
   '1910': 251,
   '1916': 757,
   '1919': 828,
   '1921': 1058,
   '1923': 982,
   '1968': 198,
   '1972': 105,
   '1986': 83,
   '1992': 63,
   '1914': 527,
   '1917': 856,
   '1932': 792,
   '1939': 712,
   '1944': 745,
   '1957': 617,
   '1960': 514,
   '1965': 273,
   '1967': 212,
   '1969': 188,
   '1978': 110,
   '1987': 64,
   '1995': 53,
   '1996': 73,
   '1998': 60,
   '2004': 51,
   '2010': 25,
   '1915': 671,
   '1918': 884,
   '1924': 952,
   '1930': 860,
   '1941': 642,
   '1945': 644,
   '1947': 812,
   '1953': 733,
   '1963': 396,
   '1966': 235,
   '1984': 98,
   '1985': 87,
   '1988': 73,
   '2011': 23,
   '1913': 437,
   '1938': 737,
   '1949': 796,
   '1956': 631,
   '1979': 89,
   '1980': 98,
   '1997': 66,
   '2002': 44,
   '2007': 31,
   '2012': 34}},
 {'_id': ObjectId('5eb51b8c9993d0175fba7430'),
  'Gender': 'F',
  'Name': 'Mary',
  'State': 'MA',
  'YearsCountDict': {'1916': 2754,
   '1925': 2505,
   '1938': 1275,
   '1939': 1303,
   '1941': 1439,
   '1972': 305,
   '1973': 232,
   '1974': 214,
   '1975': 208,
   '1981': 179,
   '1983': 193,
   '1995': 188,
   '1996': 164,
   '1999': 152,
   '2012': 49,
   '1914': 2497,
   '1927': 2260,
   '1945': 1418,
   '1949': 1732,
   '1969': 508,
   '1982': 196,
   '1987': 176,
   '1997': 147,
   '1998': 146,
   '2001': 130,
   '2004': 107,
   '2010': 56,
   '2013': 45,
   '2014': 52,
   '1920': 2971,
   '1931': 1797,
   '1977': 209,
   '1979': 207,
   '1988': 194,
   '1989': 193,
   '1994': 161,
   '2000': 117,
   '2003': 104,
   '2011': 50,
   '1913': 1874,
   '1923': 2778,
   '1926': 2329,
   '1928': 2152,
   '1933': 1513,
   '1934': 1506,
   '1935': 1424,
   '1937': 1365,
   '1940': 1322,
   '1942': 1623,
   '1947': 1859,
   '1950': 1710,
   '1953': 1751,
   '1955': 1881,
   '1957': 1990,
   '1959': 1754,
   '1968': 575,
   '1971': 346,
   '1986': 134,
   '1990': 194,
   '1992': 208,
   '2009': 61,
   '1917': 2818,
   '1919': 2769,
   '1922': 2847,
   '1944': 1520,
   '1961': 1555,
   '1962': 1373,
   '1964': 1251,
   '1970': 437,
   '1985': 178,
   '1991': 203,
   '2002': 135,
   '1910': 989,
   '1915': 2794,
   '1921': 3035,
   '1930': 1959,
   '1936': 1429,
   '1946': 1665,
   '1948': 1817,
   '1954': 2133,
   '1956': 1970,
   '1958': 1835,
   '1960': 1693,
   '1963': 1309,
   '1965': 1120,
   '1967': 759,
   '1976': 168,
   '1929': 1938,
   '1932': 1727,
   '1952': 1774,
   '1980': 198,
   '1984': 167,
   '2006': 70,
   '2007': 66,
   '1911': 1248,
   '1912': 1636,
   '1918': 3006,
   '1924': 2747,
   '1943': 1703,
   '1951': 1704,
   '1966': 820,
   '1978': 206,
   '1993': 162,
   '2005': 83,
   '2008': 54}},
 {'_id': ObjectId('5eb51b8c9993d0175fba82d9'),
  'Gender': 'F',
  'Name': 'Mary',
  'State': 'ND',
  'YearsCountDict': {'1910': 85,
   '1915': 196,
   '1937': 182,
   '1947': 298,
   '1953': 284,
   '1987': 22,
   '1990': 12,
   '1919': 220,
   '1922': 226,
   '1926': 215,
   '1939': 199,
   '1940': 230,
   '1943': 214,
   '1946': 295,
   '1948': 315,
   '1964': 133,
   '1965': 117,
   '1968': 71,
   '1969': 66,
   '1972': 43,
   '1981': 20,
   '1995': 20,
   '1998': 10,
   '2005': 6,
   '1913': 114,
   '1927': 238,
   '1932': 200,
   '1945': 227,
   '1954': 269,
   '1958': 217,
   '1963': 143,
   '1971': 43,
   '1973': 24,
   '1977': 34,
   '1988': 21,
   '2001': 6,
   '2006': 6,
   '1911': 96,
   '1916': 271,
   '1920': 230,
   '1921': 241,
   '1929': 213,
   '1952': 269,
   '1957': 247,
   '1970': 43,
   '1975': 30,
   '1979': 23,
   '1984': 21,
   '1997': 8,
   '2004': 7,
   '2014': 5,
   '1914': 142,
   '1917': 234,
   '1933': 194,
   '1934': 202,
   '1935': 189,
   '1950': 290,
   '1956': 289,
   '1960': 202,
   '1980': 27,
   '1992': 11,
   '1996': 8,
   '2002': 9,
   '1918': 199,
   '1923': 220,
   '1924': 231,
   '1928': 216,
   '1941': 222,
   '1951': 250,
   '1959': 231,
   '1961': 182,
   '1966': 94,
   '1982': 24,
   '2012': 9,
   '1912': 100,
   '1930': 210,
   '1936': 236,
   '1938': 204,
   '1942': 228,
   '1944': 256,
   '1955': 246,
   '1962': 169,
   '1967': 64,
   '1983': 24,
   '1985': 23,
   '1991': 9,
   '1993': 15,
   '1994': 8,
   '2000': 6,
   '2003': 6,
   '2007': 7,
   '1925': 214,
   '1931': 198,
   '1949': 269,
   '1974': 32,
   '1976': 37,
   '1978': 30,
   '1986': 17,
   '1989': 13,
   '1999': 7,
   '2009': 8}},
 {'_id': ObjectId('5eb51b8c9993d0175fba8ac8'),
  'Gender': 'F',
  'Name': 'Mary',
  'State': 'WY',
  'YearsCountDict': {'1937': 121,
   '1938': 105,
   '1952': 116,
   '1957': 108,
   '1961': 59,
   '1966': 48,
   '1972': 19,
   '1989': 13,
   '1998': 11,
   '2014': 5,
   '1910': 27,
   '1911': 30,
   '1923': 127,
   '1926': 116,
   '1945': 102,
   '1949': 114,
   '1955': 97,
   '1960': 84,
   '1963': 68,
   '1979': 31,
   '1980': 24,
   '2007': 7,
   '1928': 108,
   '1929': 106,
   '1941': 109,
   '1942': 96,
   '1956': 96,
   '1965': 47,
   '1975': 13,
   '1982': 28,
   '1990': 12,
   '1997': 6,
   '1917': 89,
   '1925': 132,
   '1967': 41,
   '1976': 16,
   '1978': 24,
   '1995': 5,
   '1996': 8,
   '1999': 5,
   '1919': 104,
   '1922': 130,
   '1924': 142,
   '1932': 100,
   '1933': 76,
   '1934': 105,
   '1935': 97,
   '1962': 52,
   '1968': 28,
   '1977': 17,
   '1981': 36,
   '1987': 13,
   '1988': 9,
   '1991': 13,
   '2000': 12,
   '2001': 5,
   '2002': 9,
   '2004': 5,
   '1915': 57,
   '1916': 95,
   '1930': 109,
   '1943': 103,
   '1948': 113,
   '1959': 83,
   '1973': 14,
   '1983': 24,
   '1985': 14,
   '2008': 6,
   '1920': 112,
   '1921': 149,
   '1944': 104,
   '1946': 119,
   '1947': 127,
   '1950': 137,
   '1953': 127,
   '1964': 66,
   '1974': 31,
   '1984': 17,
   '1912': 44,
   '1913': 50,
   '1914': 55,
   '1918': 106,
   '1927': 124,
   '1931': 96,
   '1936': 107,
   '1939': 95,
   '1940': 94,
   '1951': 114,
   '1954': 135,
   '1958': 93,
   '1969': 28,
   '1970': 31,
   '1971': 25,
   '1986': 18,
   '1992': 11,
   '2009': 7,
   '2010': 8}},
 {'_id': ObjectId('5eb51b8c9993d0175fba8cbf'),
  'Gender': 'F',
  'Name': 'Mary',
  'State': 'AZ',
  'YearsCountDict': {'1914': 133,
   '1924': 281,
   '1940': 319,
   '1946': 385,
   '1948': 441,
   '1950': 414,
   '1961': 378,
   '1962': 360,
   '1971': 141,
   '1987': 95,
   '2004': 54,
   '2013': 32,
   '2014': 30,
   '1910': 74,
   '1918': 272,
   '1921': 266,
   '1922': 252,
   '1927': 310,
   '1931': 298,
   '1932': 288,
   '1937': 354,
   '1959': 420,
   '1967': 220,
   '1983': 103,
   '1990': 89,
   '1992': 99,
   '1935': 310,
   '1944': 383,
   '1953': 493,
   '1957': 475,
   '1964': 368,
   '1966': 246,
   '1989': 86,
   '1994': 76,
   '2001': 56,
   '2005': 40,
   '1919': 290,
   '1928': 283,
   '1930': 286,
   '1938': 334,
   '1951': 438,
   '1955': 492,
   '1958': 458,
   '1965': 279,
   '1974': 117,
   '1976': 107,
   '1978': 115,
   '1984': 81,
   '1985': 90,
   '1997': 84,
   '1999': 78,
   '2002': 64,
   '2003': 55,
   '2007': 44,
   '2010': 43,
   '1913': 103,
   '1915': 158,
   '1947': 426,
   '1963': 328,
   '1972': 121,
   '1975': 106,
   '1980': 121,
   '2011': 26,
   '1936': 297,
   '1942': 353,
   '1945': 366,
   '1949': 411,
   '1952': 472,
   '1956': 449,
   '1960': 409,
   '1981': 112,
   '1986': 89,
   '1988': 113,
   '1998': 83,
   '1911': 67,
   '1912': 100,
   '1916': 191,
   '1920': 280,
   '1923': 272,
   '1926': 301,
   '1934': 287,
   '1968': 195,
   '1969': 181,
   '1970': 169,
   '1973': 151,
   '1977': 111,
   '1979': 95,
   '1982': 108,
   '1991': 98,
   '1995': 101,
   '2009': 35,
   '1917': 218,
   '1925': 293,
   '1929': 303,
   '1933': 265,
   '1939': 347,
   '1941': 349,
   '1943': 431,
   '1954': 465,
   '1993': 106,
   '1996': 90,
   '2000': 63,
   '2006': 66,
   '2008': 54,
   '2012': 32}},
 {'_id': ObjectId('5eb51b8c9993d0175fba8cd5'),
  'Gender': 'F',
  'Name': 'Mary',
  'State': 'VA',
  'YearsCountDict': {'1916': 1652,
   '1918': 1918,
   '1919': 1940,
   '1922': 2010,
   '1928': 1698,
   '1932': 1580,
   '1942': 1621,
   '1945': 1517,
   '1950': 1447,
   '1957': 1195,
   '1963': 909,
   '1983': 320,
   '1984': 310,
   '1986': 284,
   '2002': 191,
   '2009': 98,
   '1925': 1940,
   '1926': 1844,
   '1933': 1449,
   '1939': 1349,
   '1943': 1627,
   '1956': 1290,
   '1977': 327,
   '1985': 317,
   '1994': 274,
   '2006': 139,
   '2007': 115,
   '2012': 94,
   '1947': 1598,
   '1949': 1600,
   '1958': 1072,
   '1966': 774,
   '1988': 318,
   '1991': 308,
   '1997': 233,
   '2010': 90,
   '1911': 747,
   '1923': 2020,
   '1930': 1658,
   '1936': 1377,
   '1941': 1500,
   '1946': 1561,
   '1955': 1260,
   '1959': 1079,
   '1962': 945,
   '1976': 286,
   '1981': 346,
   '1982': 359,
   '1990': 327,
   '1993': 327,
   '2004': 165,
   '2005': 176,
   '2008': 122,
   '1910': 848,
   '1931': 1558,
   '1940': 1426,
   '1951': 1402,
   '1964': 896,
   '1974': 350,
   '1980': 365,
   '1987': 267,
   '1996': 249,
   '2003': 166,
   '1912': 998,
   '1921': 2060,
   '1934': 1538,
   '1937': 1452,
   '1938': 1387,
   '1948': 1487,
   '1954': 1368,
   '1967': 619,
   '1969': 508,
   '1992': 313,
   '1998': 228,
   '1999': 217,
   '2011': 99,
   '2013': 82,
   '1913': 1060,
   '1920': 2046,
   '1924': 2076,
   '1927': 1789,
   '1935': 1464,
   '1944': 1547,
   '1960': 1098,
   '1961': 985,
   '1965': 824,
   '1968': 552,
   '1970': 537,
   '1971': 543,
   '1978': 297,
   '1979': 300,
   '2014': 91,
   '1914': 1265,
   '1915': 1576,
   '1917': 1783,
   '1929': 1600,
   '1952': 1389,
   '1953': 1361,
   '1972': 395,
   '1973': 396,
   '1975': 340,
   '1989': 298,
   '1995': 251,
   '2000': 219,
   '2001': 209}},
 {'_id': ObjectId('5eb51b8c9993d0175fba8d64'),
  'Gender': 'F',
  'Name': 'Mary',
  'State': 'DC',
  'YearsCountDict': {'1929': 286,
   '1954': 520,
   '1956': 433,
   '1959': 407,
   '1963': 307,
   '1967': 175,
   '1971': 102,
   '1975': 65,
   '1979': 54,
   '1983': 45,
   '1988': 38,
   '1992': 46,
   '2009': 11,
   '2010': 17,
   '1921': 362,
   '1930': 293,
   '1931': 258,
   '1934': 288,
   '1940': 365,
   '1941': 400,
   '1942': 515,
   '1945': 493,
   '1948': 519,
   '1951': 537,
   '1960': 367,
   '1985': 54,
   '1990': 51,
   '2003': 22,
   '2004': 26,
   '2007': 19,
   '2011': 16,
   '1918': 247,
   '1922': 347,
   '1926': 295,
   '1937': 330,
   '1962': 319,
   '1965': 251,
   '1969': 137,
   '1981': 60,
   '1984': 54,
   '1987': 55,
   '1991': 48,
   '1998': 28,
   '2012': 18,
   '1932': 309,
   '1933': 294,
   '1943': 546,
   '1947': 533,
   '1976': 49,
   '1982': 50,
   '1989': 23,
   '1993': 46,
   '1995': 30,
   '2001': 26,
   '2002': 27,
   '2013': 14,
   '1911': 88,
   '1912': 121,
   '1913': 136,
   '1914': 176,
   '1916': 200,
   '1917': 234,
   '1925': 346,
   '1949': 491,
   '1952': 513,
   '1953': 510,
   '1955': 526,
   '1970': 112,
   '1972': 73,
   '1974': 50,
   '2005': 16,
   '1910': 80,
   '1920': 255,
   '1935': 325,
   '1939': 367,
   '1946': 559,
   '1957': 472,
   '1958': 435,
   '1973': 50,
   '1980': 50,
   '2008': 13,
   '1915': 195,
   '1924': 345,
   '1928': 335,
   '1944': 520,
   '1961': 374,
   '1964': 287,
   '1966': 221,
   '1978': 53,
   '1994': 48,
   '1997': 19,
   '2006': 17,
   '2014': 17,
   '1919': 274,
   '1923': 350,
   '1927': 348,
   '1936': 320,
   '1938': 347,
   '1950': 520,
   '1968': 144,
   '1977': 59,
   '1986': 44,
   '1996': 26,
   '1999': 31,
   '2000': 20}}]

We can also return only part of each document by passing a projection as the second argument to find():

In [23]:
c = list(collection.find({"Name": "Mary", "Gender": "F"}, {'YearsCountDict': 0}))  # Exclude the 'YearsCountDict' field
c[1]
Out[23]:
{'_id': ObjectId('5eb51b889993d0175fb60b00'),
 'Gender': 'F',
 'Name': 'Mary',
 'State': 'LA'}
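
The projection can also list the fields to keep instead of the fields to drop. The following is a minimal sketch, assuming the same collection object used above; the helper name find_name_states is hypothetical and is not part of pymongo. It relies on standard MongoDB projection semantics: a value of 1 includes a field, a value of 0 excludes it, and apart from '_id' the two styles cannot be mixed in one projection.

```python
def find_name_states(collection, name, gender):
    """Return the 'State' value of every document matching name and gender.

    `collection` is assumed to be a pymongo Collection, or any object
    exposing a compatible find(filter, projection) method.
    """
    query = {"Name": name, "Gender": gender}
    # Inclusion projection: keep only 'State' and explicitly drop '_id',
    # which MongoDB would otherwise return by default.
    projection = {"State": 1, "_id": 0}
    # Assumes every matching document has a 'State' field,
    # as in the outputs shown above.
    return [doc["State"] for doc in collection.find(query, projection)]
```

Calling find_name_states(collection, "Mary", "F") should then give a plain list of state codes rather than full documents, which is usually easier to inspect than the long YearsCountDict output.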
In [24]:
# Regex query: return only names that start with 'M' and end with 'y'
query = { "Name": { "$regex": "^M.*y$" } } 
list(collection.find(query, {'YearsCountDict':0}))
Out[24]:
[{'_id': ObjectId('5eb51b889993d0175fb5e8a9'),
  'Gender': 'F',
  'Name': 'Macy',
  'State': 'CA'},
 {'_id': ObjectId('5eb51b889993d0175fb5e9ea'),
  'Gender': 'F',
  'Name': 'Melody',
  'State': 'NH'},
 {'_id': ObjectId('5eb51b889993d0175fb5ea0d'),
  'Gender': 'F',
  'Name': 'Marely',
  'State': 'NV'},
 {'_id': ObjectId('5eb51b889993d0175fb5eb44'),
  'Gender': 'F',
  'Name': 'Margery',
  'State': 'NJ'},
 {'_id': ObjectId('5eb51b889993d0175fb5ebd2'),
  'Gender': 'F',
  'Name': 'Mahogany',
  'State': 'SC'},
 {'_id': ObjectId('5eb51b889993d0175fb5ebd9'),
  'Gender': 'F',
  'Name': 'Marley',
  'State': 'NC'},
 {'_id': ObjectId('5eb51b889993d0175fb5ed4d'),
  'Gender': 'F',
  'Name': 'Mazzy',
  'State': 'CA'},
 {'_id': ObjectId('5eb51b889993d0175fb5ed56'),
  'Gender': 'F',
  'Name': 'Mallory',
  'State': 'SC'},
 {'_id': ObjectId('5eb51b889993d0175fb5edf7'),
  'Gender': 'F',
  'Name': 'Mindy',
  'State': 'WA'},
 {'_id': ObjectId('5eb51b889993d0175fb5efaf'),
  'Gender': 'F',
  'Name': 'Mandy',
  'State': 'LA'},
 {'_id': ObjectId('5eb51b889993d0175fb5f0de'),
  'Gender': 'F',
  'Name': 'May',
  'State': 'LA'},
 {'_id': ObjectId('5eb51b889993d0175fb5f127'),
  'Gender': 'F',
  'Name': 'Milly',
  'State': 'WA'},
 {'_id': ObjectId('5eb51b889993d0175fb5f195'),
  'Gender': 'M',
  'Name': 'Mackey',
  'State': 'SC'},
 {'_id': ObjectId('5eb51b889993d0175fb5f1b5'),
  'Gender': 'F',
  'Name': 'Merry',
  'State': 'CA'},
 {'_id': ObjectId('5eb51b889993d0175fb5f277'),
  'Gender': 'F',
  'Name': 'Maizy',
  'State': 'CA'},
 {'_id': ObjectId('5eb51b889993d0175fb5f287'),
  'Gender': 'F',
  'Name': 'Melany',
  'State': 'SC'},
 {'_id': ObjectId('5eb51b889993d0175fb5f4bd'),
  'Gender': 'F',
  'Name': 'Maudry',
  'State': 'LA'},
 {'_id': ObjectId('5eb51b889993d0175fb5f4cc'),
  'Gender': 'F',
  'Name': 'Mabry',
  'State': 'AR'},
 {'_id': ObjectId('5eb51b889993d0175fb5f4e2'),
  'Gender': 'F',
  'Name': 'Mckinley',
  'State': 'NC'},
 {'_id': ObjectId('5eb51b889993d0175fb5f556'),
  'Gender': 'F',
  'Name': 'Makenzy',
  'State': 'NC'},
 {'_id': ObjectId('5eb51b889993d0175fb5f59d'),
  'Gender': 'F',
  'Name': 'Margy',
  'State': 'CA'},
 {'_id': ObjectId('5eb51b889993d0175fb5f5b1'),
  'Gender': 'F',
  'Name': 'Macey',
  'State': 'AR'},
 {'_id': ObjectId('5eb51b889993d0175fb5f5b7'),
  'Gender': 'F',
  'Name': 'Marley',
  'State': 'WA'},
 {'_id': ObjectId('5eb51b889993d0175fb5f5f8'),
  'Gender': 'F',
  'Name': 'May',
  'State': 'DE'},
 {'_id': ObjectId('5eb51b889993d0175fb5f61e'),
  'Gender': 'F',
  'Name': 'Melany',
  'State': 'IA'},
 {'_id': ObjectId('5eb51b889993d0175fb5f735'),
  'Gender': 'F',
  'Name': 'Mickey',
  'State': 'IA'},
 {'_id': ObjectId('5eb51b889993d0175fb5f7ae'),
  'Gender': 'F',
  'Name': 'Marley',
  'State': 'AR'},
 {'_id': ObjectId('5eb51b889993d0175fb5f810'),
  'Gender': 'F',
  'Name': 'Mary',
  'State': 'CA'},
 {'_id': ObjectId('5eb51b889993d0175fb5f967'),
  'Gender': 'F',
  'Name': 'Mickey',
  'State': 'SC'},
 {'_id': ObjectId('5eb51b889993d0175fb5fa11'),
  'Gender': 'F',
  'Name': 'Maisy',
  'State': 'FL'},
 {'_id': ObjectId('5eb51b889993d0175fb5fb11'),
  'Gender': 'F',
  'Name': 'Maisy',
  'State': 'IL'},
 {'_id': ObjectId('5eb51b889993d0175fb5fbfd'),
  'Gender': 'F',
  'Name': 'Mandy',
  'State': 'WA'},
 {'_id': ObjectId('5eb51b889993d0175fb5fccc'),
  'Gender': 'F',
  'Name': 'May',
  'State': 'CA'},
 {'_id': ObjectId('5eb51b889993d0175fb5fd9c'),
  'Gender': 'F',
  'Name': 'Macy',
  'State': 'NH'},
 {'_id': ObjectId('5eb51b889993d0175fb5ffa5'),
  'Gender': 'F',
  'Name': 'Malory',
  'State': 'IL'},
 {'_id': ObjectId('5eb51b889993d0175fb6004f'),
  'Gender': 'F',
  'Name': 'Marjory',
  'State': 'FL'},
 {'_id': ObjectId('5eb51b889993d0175fb600bb'),
  'Gender': 'M',
  'Name': 'Marty',
  'State': 'NC'},
 {'_id': ObjectId('5eb51b889993d0175fb6018d'),
  'Gender': 'M',
  'Name': 'Marley',
  'State': 'CA'},
 {'_id': ObjectId('5eb51b889993d0175fb602e4'),
  'Gender': 'F',
  'Name': 'Mabry',
  'State': 'NC'},
 {'_id': ObjectId('5eb51b889993d0175fb60364'),
  'Gender': 'F',
  'Name': 'Mandy',
  'State': 'NC'},
 {'_id': ObjectId('5eb51b889993d0175fb603e7'),
  'Gender': 'F',
  'Name': 'Marely',
  'State': 'CA'},
 {'_id': ObjectId('5eb51b889993d0175fb60496'),
  'Gender': 'F',
  'Name': 'Mahogany',
  'State': 'FL'},
 {'_id': ObjectId('5eb51b889993d0175fb605fa'),
  'Gender': 'F',
  'Name': 'Misty',
  'State': 'NV'},
 {'_id': ObjectId('5eb51b889993d0175fb6067a'),
  'Gender': 'M',
  'Name': 'Mickey',
  'State': 'IA'},
 {'_id': ObjectId('5eb51b889993d0175fb606a1'),
  'Gender': 'F',
  'Name': 'Madeley',
  'State': 'CA'},
 {'_id': ObjectId('5eb51b889993d0175fb60710'),
  'Gender': 'F',
  'Name': 'Miley',
  'State': 'MD'},
 {'_id': ObjectId('5eb51b889993d0175fb60718'),
  'Gender': 'F',
  'Name': 'Melaney',
  'State': 'CA'},
 {'_id': ObjectId('5eb51b889993d0175fb60773'),
  'Gender': 'F',
  'Name': 'Marjory',
  'State': 'NJ'},
 {'_id': ObjectId('5eb51b889993d0175fb607f5'),
  'Gender': 'F',
  'Name': 'Molly',
  'State': 'IA'},
 {'_id': ObjectId('5eb51b889993d0175fb60812'),
  'Gender': 'F',
  'Name': 'Mackenzy',
  'State': 'IL'},
 {'_id': ObjectId('5eb51b889993d0175fb60912'),
  'Gender': 'M',
  'Name': 'Murphy',
  'State': 'AR'},
 {'_id': ObjectId('5eb51b889993d0175fb60a47'),
  'Gender': 'F',
  'Name': 'Miley',
  'State': 'IL'},
 {'_id': ObjectId('5eb51b889993d0175fb60b00'),
  'Gender': 'F',
  'Name': 'Mary',
  'State': 'LA'},
 {'_id': ObjectId('5eb51b889993d0175fb60b56'),
  'Gender': 'M',
  'Name': 'Marley',
  'State': 'NC'},
 {'_id': ObjectId('5eb51b889993d0175fb60b7e'),
  'Gender': 'F',
  'Name': 'Marely',
  'State': 'NC'},
 {'_id': ObjectId('5eb51b889993d0175fb60c21'),
  'Gender': 'F',
  'Name': 'Mindy',
  'State': 'NV'},
 {'_id': ObjectId('5eb51b889993d0175fb60c50'),
  'Gender': 'F',
  'Name': 'Magaly',
  'State': 'WA'},
 {'_id': ObjectId('5eb51b889993d0175fb60ca1'),
  'Gender': 'M',
  'Name': 'Murray',
  'State': 'MD'},
 {'_id': ObjectId('5eb51b889993d0175fb60d36'),
  'Gender': 'M',
  'Name': 'Mckinley',
  'State': 'AR'},
 {'_id': ObjectId('5eb51b889993d0175fb60e00'),
  'Gender': 'F',
  'Name': 'Mendy',
  'State': 'IA'},
 {'_id': ObjectId('5eb51b889993d0175fb60e0a'),
  'Gender': 'M',
  'Name': 'Mccoy',
  'State': 'FL'},
 {'_id': ObjectId('5eb51b889993d0175fb60eb3'),
  'Gender': 'F',
  'Name': 'Mahogany',
  'State': 'NJ'},
 {'_id': ObjectId('5eb51b889993d0175fb60fe6'),
  'Gender': 'M',
  'Name': 'Mary',
  'State': 'NJ'},
 {'_id': ObjectId('5eb51b889993d0175fb610ad'),
  'Gender': 'F',
  'Name': 'Melony',
  'State': 'CA'},
 {'_id': ObjectId('5eb51b889993d0175fb611ef'),
  'Gender': 'M',
  'Name': 'Manny',
  'State': 'FL'},
 {'_id': ObjectId('5eb51b889993d0175fb61246'),
  'Gender': 'F',
  'Name': 'Mallary',
  'State': 'CA'},
 {'_id': ObjectId('5eb51b889993d0175fb61323'),
  'Gender': 'F',
  'Name': 'Misty',
  'State': 'DE'},
 {'_id': ObjectId('5eb51b889993d0175fb613c1'),
  'Gender': 'M',
  'Name': 'Montgomery',
  'State': 'WA'},
 {'_id': ObjectId('5eb51b889993d0175fb6144e'),
  'Gender': 'M',
  'Name': 'Marty',
  'State': 'AR'},
 {'_id': ObjectId('5eb51b889993d0175fb61494'),
  'Gender': 'F',
  'Name': 'Marely',
  'State': 'WA'},
 {'_id': ObjectId('5eb51b889993d0175fb614d9'),
  'Gender': 'M',
  'Name': 'My',
  'State': 'CA'},
 {'_id': ObjectId('5eb51b889993d0175fb6167d'),
  'Gender': 'M',
  'Name': 'Monty',
  'State': 'NJ'},
 {'_id': ObjectId('5eb51b889993d0175fb61682'),
  'Gender': 'F',
  'Name': 'My',
  'State': 'LA'},
 {'_id': ObjectId('5eb51b889993d0175fb61718'),
  'Gender': 'F',
  'Name': 'Marjory',
  'State': 'IL'},
 {'_id': ObjectId('5eb51b889993d0175fb6172a'),
  'Gender': 'F',
  'Name': 'Mariely',
  'State': 'FL'},
 {'_id': ObjectId('5eb51b889993d0175fb61a56'),
  'Gender': 'M',
  'Name': 'Manny',
  'State': 'NJ'},
 {'_id': ObjectId('5eb51b889993d0175fb61adf'),
  'Gender': 'F',
  'Name': 'Mallory',
  'State': 'OK'},
 {'_id': ObjectId('5eb51b889993d0175fb61b65'),
  'Gender': 'F',
  'Name': 'Mallory',
  'State': 'FL'},
 {'_id': ObjectId('5eb51b889993d0175fb61c6f'),
  'Gender': 'F',
  'Name': 'Meily',
  'State': 'CA'},
 {'_id': ObjectId('5eb51b889993d0175fb61dde'),
  'Gender': 'F',
  'Name': 'Merry',
  'State': 'NC'},
 {'_id': ObjectId('5eb51b889993d0175fb61e8f'),
  'Gender': 'F',
  'Name': 'Molly',
  'State': 'MD'},
 {'_id': ObjectId('5eb51b889993d0175fb61f42'),
  'Gender': 'M',
  'Name': 'Mickey',
  'State': 'FL'},
 {'_id': ObjectId('5eb51b889993d0175fb61f43'),
  'Gender': 'F',
  'Name': 'Melany',
  'State': 'OK'},
 {'_id': ObjectId('5eb51b889993d0175fb61f65'),
  'Gender': 'M',
  'Name': 'Monty',
  'State': 'MD'},
 {'_id': ObjectId('5eb51b889993d0175fb62116'),
  'Gender': 'F',
  'Name': 'Makenzy',
  'State': 'CA'},
 {'_id': ObjectId('5eb51b889993d0175fb62207'),
  'Gender': 'F',
  'Name': 'Molly',
  'State': 'OK'},
 {'_id': ObjectId('5eb51b889993d0175fb622a4'),
  'Gender': 'F',
  'Name': 'Melany',
  'State': 'IL'},
 {'_id': ObjectId('5eb51b889993d0175fb622c3'),
  'Gender': 'F',
  'Name': 'Mickey',
  'State': 'OK'},
 {'_id': ObjectId('5eb51b889993d0175fb62446'),
  'Gender': 'F',
  'Name': 'Misty',
  'State': 'NC'},
 {'_id': ObjectId('5eb51b889993d0175fb62469'),
  'Gender': 'M',
  'Name': 'Monty',
  'State': 'IL'},
 {'_id': ObjectId('5eb51b889993d0175fb624fa'),
  'Gender': 'F',
  'Name': 'Mindy',
  'State': 'LA'},
 {'_id': ObjectId('5eb51b889993d0175fb62504'),
  'Gender': 'F',
  'Name': 'Marcy',
  'State': 'LA'},
 {'_id': ObjectId('5eb51b889993d0175fb62559'),
  'Gender': 'F',
  'Name': 'Margery',
  'State': 'MD'},
 {'_id': ObjectId('5eb51b889993d0175fb6255c'),
  'Gender': 'F',
  'Name': 'Marshay',
  'State': 'FL'},
 {'_id': ObjectId('5eb51b889993d0175fb62617'),
  'Gender': 'M',
  'Name': 'Mary',
  'State': 'OK'},
 {'_id': ObjectId('5eb51b889993d0175fb62684'),
  'Gender': 'F',
  'Name': 'Macy',
  'State': 'NC'},
 {'_id': ObjectId('5eb51b889993d0175fb62749'),
  'Gender': 'F',
  'Name': 'Maily',
  'State': 'FL'},
 {'_id': ObjectId('5eb51b889993d0175fb6279a'),
  'Gender': 'M',
  'Name': 'Mary',
  'State': 'FL'},
 {'_id': ObjectId('5eb51b889993d0175fb62917'),
  'Gender': 'F',
  'Name': 'Mendy',
  'State': 'IL'},
 {'_id': ObjectId('5eb51b889993d0175fb62955'),
  'Gender': 'F',
  'Name': 'Melody',
  'State': 'WA'},
 {'_id': ObjectId('5eb51b889993d0175fb62983'),
  'Gender': 'F',
  'Name': 'Misty',
  'State': 'CA'},
 {'_id': ObjectId('5eb51b889993d0175fb629c5'),
  'Gender': 'F',
  'Name': 'Melody',
  'State': 'CA'},
 {'_id': ObjectId('5eb51b889993d0175fb62a9d'),
  'Gender': 'F',
  'Name': 'Mckinley',
  'State': 'LA'},
 {'_id': ObjectId('5eb51b889993d0175fb62b1a'),
  'Gender': 'F',
  'Name': 'Makenzy',
  'State': 'LA'},
 {'_id': ObjectId('5eb51b889993d0175fb62b4b'),
  'Gender': 'F',
  'Name': 'Marty',
  'State': 'CA'},
 {'_id': ObjectId('5eb51b889993d0175fb62bc4'),
  'Gender': 'M',
  'Name': 'Mickey',
  'State': 'MD'},
 {'_id': ObjectId('5eb51b889993d0175fb62d61'),
  'Gender': 'F',
  'Name': 'Misty',
  'State': 'WA'},
 {'_id': ObjectId('5eb51b889993d0175fb62d73'),
  'Gender': 'F',
  'Name': 'Mindy',
  'State': 'AR'},
 {'_id': ObjectId('5eb51b889993d0175fb62e27'),
  'Gender': 'F',
  'Name': 'Mendy',
  'State': 'MO'},
 {'_id': ObjectId('5eb51b889993d0175fb62e3c'),
  'Gender': 'F',
  'Name': 'Margery',
  'State': 'ND'},
 {'_id': ObjectId('5eb51b889993d0175fb62f2e'),
  'Gender': 'F',
  'Name': 'Mallory',
  'State': 'CO'},
 {'_id': ObjectId('5eb51b889993d0175fb630a8'),
  'Gender': 'M',
  'Name': 'Mckinley',
  'State': 'WV'},
 {'_id': ObjectId('5eb51b889993d0175fb630b3'),
  'Gender': 'F',
  'Name': 'Molly',
  'State': 'AZ'},
 {'_id': ObjectId('5eb51b889993d0175fb63133'),
  'Gender': 'F',
  'Name': 'Miley',
  'State': 'IN'},
 {'_id': ObjectId('5eb51b889993d0175fb632c2'),
  'Gender': 'F',
  'Name': 'Melany',
  'State': 'CO'},
 {'_id': ObjectId('5eb51b889993d0175fb63317'),
  'Gender': 'M',
  'Name': 'Mary',
  'State': 'CO'},
 {'_id': ObjectId('5eb51b889993d0175fb63332'),
  'Gender': 'M',
  'Name': 'Maury',
  'State': 'MO'},
 {'_id': ObjectId('5eb51b889993d0175fb63360'),
  'Gender': 'F',
  'Name': 'Misty',
  'State': 'VT'},
 {'_id': ObjectId('5eb51b889993d0175fb6342c'),
  'Gender': 'F',
  'Name': 'Malory',
  'State': 'GA'},
 {'_id': ObjectId('5eb51b889993d0175fb634fa'),
  'Gender': 'F',
  'Name': 'Mickey',
  'State': 'CO'},
 {'_id': ObjectId('5eb51b889993d0175fb63575'),
  'Gender': 'F',
  'Name': 'Molly',
  'State': 'PA'},
 {'_id': ObjectId('5eb51b889993d0175fb63676'),
  'Gender': 'F',
  'Name': 'Missy',
  'State': 'MA'},
 {'_id': ObjectId('5eb51b889993d0175fb63810'),
  'Gender': 'M',
  'Name': 'Mary',
  'State': 'IN'},
 {'_id': ObjectId('5eb51b889993d0175fb63827'),
  'Gender': 'F',
  'Name': 'Marjory',
  'State': 'GA'},
 {'_id': ObjectId('5eb51b889993d0175fb63979'),
  'Gender': 'F',
  'Name': 'Melody',
  'State': 'WV'},
 {'_id': ObjectId('5eb51b889993d0175fb63985'),
  'Gender': 'M',
  'Name': 'Mary',
  'State': 'VA'},
 {'_id': ObjectId('5eb51b889993d0175fb63a85'),
  'Gender': 'M',
  'Name': 'Mary',
  'State': 'DC'},
 {'_id': ObjectId('5eb51b889993d0175fb63aab'),
  'Gender': 'F',
  'Name': 'Mercy',
  'State': 'HI'},
 {'_id': ObjectId('5eb51b889993d0175fb63b06'),
  'Gender': 'F',
  'Name': 'Melany',
  'State': 'PA'},
 {'_id': ObjectId('5eb51b889993d0175fb63b94'),
  'Gender': 'F',
  'Name': 'Missy',
  'State': 'CO'},
 {'_id': ObjectId('5eb51b889993d0175fb63c06'),
  'Gender': 'F',
  'Name': 'Merry',
  'State': 'SD'},
 {'_id': ObjectId('5eb51b889993d0175fb63c82'),
  'Gender': 'F',
  'Name': 'Marley',
  'State': 'VT'},
 {'_id': ObjectId('5eb51b889993d0175fb63cbf'),
  'Gender': 'M',
  'Name': 'Mccoy',
  'State': 'AZ'},
 {'_id': ObjectId('5eb51b889993d0175fb63cdc'),
  'Gender': 'F',
  'Name': 'Marty',
  'State': 'SD'},
 {'_id': ObjectId('5eb51b889993d0175fb63d95'),
  'Gender': 'F',
  'Name': 'Mickey',
  'State': 'PA'},
 {'_id': ObjectId('5eb51b889993d0175fb63e12'),
  'Gender': 'M',
  'Name': 'Mary',
  'State': 'GA'},
 {'_id': ObjectId('5eb51b889993d0175fb63ed4'),
  'Gender': 'F',
  'Name': 'Miley',
  'State': 'AK'},
 {'_id': ObjectId('5eb51b889993d0175fb63f90'),
  'Gender': 'F',
  'Name': 'Mckinley',
  'State': 'WV'},
 {'_id': ObjectId('5eb51b889993d0175fb63fc5'),
  'Gender': 'F',
  'Name': 'Mary',
  'State': 'SD'},
 {'_id': ObjectId('5eb51b889993d0175fb6400b'),
  'Gender': 'F',
  'Name': 'Misty',
  'State': 'WV'},
 {'_id': ObjectId('5eb51b889993d0175fb64037'),
  'Gender': 'M',
  'Name': 'Monty',
  'State': 'WY'},
 {'_id': ObjectId('5eb51b889993d0175fb640d1'),
  'Gender': 'F',
  'Name': 'Molly',
  'State': 'CO'},
 {'_id': ObjectId('5eb51b889993d0175fb6410b'),
  'Gender': 'M',
  'Name': 'Mccoy',
  'State': 'GA'},
 {'_id': ObjectId('5eb51b889993d0175fb6423f'),
  'Gender': 'M',
  'Name': 'Monty',
  'State': 'GA'},
 {'_id': ObjectId('5eb51b889993d0175fb64317'),
  'Gender': 'F',
  'Name': 'Melany',
  'State': 'AZ'},
 {'_id': ObjectId('5eb51b889993d0175fb6438f'),
  'Gender': 'F',
  'Name': 'Melany',
  'State': 'VA'},
 {'_id': ObjectId('5eb51b889993d0175fb64429'),
  'Gender': 'F',
  'Name': 'Melany',
  'State': 'DC'},
 {'_id': ObjectId('5eb51b889993d0175fb6450e'),
  'Gender': 'F',
  'Name': 'Marjory',
  'State': 'ND'},
 {'_id': ObjectId('5eb51b889993d0175fb64604'),
  'Gender': 'F',
  'Name': 'Mendy',
  'State': 'GA'},
 {'_id': ObjectId('5eb51b889993d0175fb64680'),
  'Gender': 'F',
  'Name': 'Macy',
  'State': 'HI'},
 {'_id': ObjectId('5eb51b889993d0175fb64765'),
  'Gender': 'F',
  'Name': 'Marjory',
  'State': 'IN'},
 {'_id': ObjectId('5eb51b889993d0175fb647df'),
  'Gender': 'F',
  'Name': 'Maisy',
  'State': 'MA'},
 {'_id': ObjectId('5eb51b889993d0175fb648e2'),
  'Gender': 'F',
  'Name': 'Mahogany',
  'State': 'MO'},
 {'_id': ObjectId('5eb51b889993d0175fb649de'),
  'Gender': 'F',
  'Name': 'Macy',
  'State': 'WV'},
 {'_id': ObjectId('5eb51b889993d0175fb64a55'),
  'Gender': 'F',
  'Name': 'Miley',
  'State': 'VA'},
 {'_id': ObjectId('5eb51b889993d0175fb64a6f'),
  'Gender': 'F',
  'Name': 'Marcy',
  'State': 'HI'},
 {'_id': ObjectId('5eb51b889993d0175fb64a85'),
  'Gender': 'F',
  'Name': 'Mindy',
  'State': 'HI'},
 {'_id': ObjectId('5eb51b889993d0175fb64c50'),
  'Gender': 'M',
  'Name': 'Mccoy',
  'State': 'MO'},
 {'_id': ObjectId('5eb51b889993d0175fb64ccd'),
  'Gender': 'F',
  'Name': 'Melany',
  'State': 'MO'},
 {'_id': ObjectId('5eb51b889993d0175fb64cd4'),
  'Gender': 'M',
  'Name': 'Monty',
  'State': 'MO'},
 {'_id': ObjectId('5eb51b889993d0175fb64d97'),
  'Gender': 'F',
  'Name': 'Mary',
  'State': 'WV'},
 {'_id': ObjectId('5eb51b889993d0175fb64d9e'),
  'Gender': 'M',
  'Name': 'Mccoy',
  'State': 'ND'},
 {'_id': ObjectId('5eb51b889993d0175fb64daf'),
  'Gender': 'F',
  'Name': 'Mackenzy',
  'State': 'MO'},
 {'_id': ObjectId('5eb51b889993d0175fb64db0'),
  'Gender': 'F',
  'Name': 'Macy',
  'State': 'VT'},
 {'_id': ObjectId('5eb51b889993d0175fb64e6e'),
  'Gender': 'F',
  'Name': 'Marley',
  'State': 'HI'},
 {'_id': ObjectId('5eb51b889993d0175fb64fef'),
  'Gender': 'F',
  'Name': 'Mandy',
  'State': 'WV'},
 {'_id': ObjectId('5eb51b889993d0175fb64ff9'),
  'Gender': 'F',
  'Name': 'Mary',
  'State': 'HI'},
 {'_id': ObjectId('5eb51b889993d0175fb650d3'),
  'Gender': 'F',
  'Name': 'Makinley',
  'State': 'VA'},
 {'_id': ObjectId('5eb51b889993d0175fb65160'),
  'Gender': 'F',
  'Name': 'Mindy',
  'State': 'SD'},
 {'_id': ObjectId('5eb51b889993d0175fb653ad'),
  'Gender': 'F',
  'Name': 'Marjory',
  'State': 'MA'},
 {'_id': ObjectId('5eb51b889993d0175fb653e0'),
  'Gender': 'F',
  'Name': 'May',
  'State': 'HI'},
 {'_id': ObjectId('5eb51b889993d0175fb65501'),
  'Gender': 'M',
  'Name': 'Murray',
  'State': 'VA'},
 {'_id': ObjectId('5eb51b889993d0175fb6550b'),
  'Gender': 'F',
  'Name': 'Miley',
  'State': 'CO'},
 {'_id': ObjectId('5eb51b889993d0175fb6553b'),
  'Gender': 'M',
  'Name': 'Maury',
  'State': 'VA'},
 {'_id': ObjectId('5eb51b889993d0175fb6566a'),
  'Gender': 'F',
  'Name': 'Maisy',
  'State': 'VA'},
 {'_id': ObjectId('5eb51b889993d0175fb65976'),
  'Gender': 'F',
  'Name': 'Mallory',
  'State': 'AZ'},
 {'_id': ObjectId('5eb51b889993d0175fb65986'),
  'Gender': 'M',
  'Name': 'Mickey',
  'State': 'MO'},
 {'_id': ObjectId('5eb51b889993d0175fb65988'),
  'Gender': 'F',
  'Name': 'Missy',
  'State': 'VA'},
 {'_id': ObjectId('5eb51b889993d0175fb659b3'),
  'Gender': 'F',
  'Name': 'Missy',
  'State': 'AZ'},
 {'_id': ObjectId('5eb51b889993d0175fb65c5a'),
  'Gender': 'F',
  'Name': 'Marcy',
  'State': 'SD'},
 {'_id': ObjectId('5eb51b889993d0175fb65ca9'),
  'Gender': 'F',
  'Name': 'Mckinsey',
  'State': 'VA'},
 {'_id': ObjectId('5eb51b889993d0175fb65d56'),
  'Gender': 'F',
  'Name': 'Missy',
  'State': 'GA'},
 {'_id': ObjectId('5eb51b889993d0175fb65f3b'),
  'Gender': 'F',
  'Name': 'Melany',
  'State': 'NY'},
 {'_id': ObjectId('5eb51b889993d0175fb65f9a'),
  'Gender': 'F',
  'Name': 'Mabry',
  'State': 'AL'},
 {'_id': ObjectId('5eb51b889993d0175fb65fb9'),
  'Gender': 'M',
  'Name': 'Mary',
  'State': 'WI'},
 {'_id': ObjectId('5eb51b889993d0175fb6616d'),
  'Gender': 'M',
  'Name': 'Murray',
  'State': 'WI'},
 {'_id': ObjectId('5eb51b889993d0175fb6625a'),
  'Gender': 'F',
  'Name': 'Mary',
  'State': 'NY'},
 {'_id': ObjectId('5eb51b889993d0175fb663bb'),
  'Gender': 'F',
  'Name': 'Marley',
  'State': 'WI'},
 {'_id': ObjectId('5eb51b889993d0175fb663e3'),
  'Gender': 'M',
  'Name': 'Marley',
  'State': 'OH'},
 {'_id': ObjectId('5eb51b889993d0175fb66413'),
  'Gender': 'M',
  'Name': 'Mccoy',
  'State': 'UT'},
 {'_id': ObjectId('5eb51b889993d0175fb664ba'),
  'Gender': 'F',
  'Name': 'Mindy',
  'State': 'WI'},
 {'_id': ObjectId('5eb51b889993d0175fb664cf'),
  'Gender': 'F',
  'Name': 'May',
  'State': 'OH'},
 {'_id': ObjectId('5eb51b889993d0175fb66544'),
  'Gender': 'F',
  'Name': 'Murphy',
  'State': 'TX'},
 {'_id': ObjectId('5eb51b889993d0175fb6658d'),
  'Gender': 'F',
  'Name': 'Missy',
  'State': 'TX'},
 {'_id': ObjectId('5eb51b889993d0175fb665d3'),
  'Gender': 'F',
  'Name': 'Mindy',
  'State': 'UT'},
 {'_id': ObjectId('5eb51b889993d0175fb667c5'),
  'Gender': 'F',
  'Name': 'Melody',
  'State': 'NY'},
 {'_id': ObjectId('5eb51b889993d0175fb66815'),
  'Gender': 'F',
  'Name': 'Maizy',
  'State': 'UT'},
 {'_id': ObjectId('5eb51b889993d0175fb6687a'),
  'Gender': 'F',
  'Name': 'Mckinsey',
  'State': 'TX'},
 {'_id': ObjectId('5eb51b889993d0175fb668bc'),
  'Gender': 'F',
  'Name': 'Mandy',
  'State': 'NY'},
 {'_id': ObjectId('5eb51b899993d0175fb66a3c'),
  'Gender': 'F',
  'Name': 'Malillany',
  'State': 'TX'},
 {'_id': ObjectId('5eb51b899993d0175fb66b95'),
  'Gender': 'M',
  'Name': 'Monty',
  'State': 'WI'},
 {'_id': ObjectId('5eb51b899993d0175fb66c6c'),
  'Gender': 'M',
  'Name': 'Mickey',
  'State': 'MS'},
 {'_id': ObjectId('5eb51b899993d0175fb66caa'),
  'Gender': 'M',
  'Name': 'Mary',
  'State': 'AL'},
 {'_id': ObjectId('5eb51b899993d0175fb66cca'),
  'Gender': 'F',
  'Name': 'Macey',
  'State': 'NY'},
 {'_id': ObjectId('5eb51b899993d0175fb66ccf'),
  'Gender': 'M',
  'Name': 'Micky',
  'State': 'AL'},
 {'_id': ObjectId('5eb51b899993d0175fb66ef5'),
  'Gender': 'F',
  'Name': 'Mickey',
  'State': 'UT'},
 {'_id': ObjectId('5eb51b899993d0175fb66f24'),
  'Gender': 'F',
  'Name': 'Merrily',
  'State': 'NY'},
 {'_id': ObjectId('5eb51b899993d0175fb66fc1'),
  'Gender': 'M',
  'Name': 'Mary',
  'State': 'MS'},
 {'_id': ObjectId('5eb51b899993d0175fb67095'),
  'Gender': 'F',
  'Name': 'Mackenzy',
  'State': 'OH'},
 {'_id': ObjectId('5eb51b899993d0175fb670b1'),
  'Gender': 'F',
  'Name': 'Mckay',
  'State': 'UT'},
 {'_id': ObjectId('5eb51b899993d0175fb670c9'),
  'Gender': 'M',
  'Name': 'Monty',
  'State': 'UT'},
 {'_id': ObjectId('5eb51b899993d0175fb670e0'),
  'Gender': 'F',
  'Name': 'Mallary',
  'State': 'OH'},
 {'_id': ObjectId('5eb51b899993d0175fb67168'),
  'Gender': 'F',
  'Name': 'Melany',
  'State': 'TX'},
 {'_id': ObjectId('5eb51b899993d0175fb6718d'),
  'Gender': 'F',
  'Name': 'Mickey',
  'State': 'WI'},
 {'_id': ObjectId('5eb51b899993d0175fb6719d'),
  'Gender': 'M',
  'Name': 'Motty',
  'State': 'NY'},
 {'_id': ObjectId('5eb51b899993d0175fb671c9'),
  'Gender': 'F',
  'Name': 'Melony',
  'State': 'MS'},
 {'_id': ObjectId('5eb51b899993d0175fb67280'),
  'Gender': 'F',
  'Name': 'Merrily',
  'State': 'OH'},
 {'_id': ObjectId('5eb51b899993d0175fb672fb'),
  'Gender': 'M',
  'Name': 'Murphy',
  'State': 'OH'},
 {'_id': ObjectId('5eb51b899993d0175fb67330'),
  'Gender': 'F',
  'Name': 'Malory',
  'State': 'TN'},
 {'_id': ObjectId('5eb51b899993d0175fb67597'),
  'Gender': 'M',
  'Name': 'Mckinley',
  'State': 'MS'},
 {'_id': ObjectId('5eb51b899993d0175fb67645'),
  'Gender': 'F',
  'Name': 'Mercy',
  'State': 'TN'},
 {'_id': ObjectId('5eb51b899993d0175fb676a8'),
  'Gender': 'M',
  'Name': 'Murphy',
  'State': 'NY'},
 {'_id': ObjectId('5eb51b899993d0175fb67819'),
  'Gender': 'F',
  'Name': 'Makinsey',
  'State': 'TX'},
 {'_id': ObjectId('5eb51b899993d0175fb6784f'),
  'Gender': 'M',
  'Name': 'Mckinley',
  'State': 'UT'},
 {'_id': ObjectId('5eb51b899993d0175fb678b8'),
  'Gender': 'M',
  'Name': 'Mckay',
  'State': 'UT'},
 {'_id': ObjectId('5eb51b899993d0175fb67954'),
  'Gender': 'F',
  'Name': 'Merry',
  'State': 'OH'},
 {'_id': ObjectId('5eb51b899993d0175fb679f5'),
  'Gender': 'F',
  'Name': 'Margery',
  'State': 'TX'},
 {'_id': ObjectId('5eb51b899993d0175fb67b99'),
  'Gender': 'F',
  'Name': 'Miley',
  'State': 'TN'},
 {'_id': ObjectId('5eb51b899993d0175fb67c0c'),
  'Gender': 'M',
  'Name': 'Mickey',
  'State': 'UT'},
 {'_id': ObjectId('5eb51b899993d0175fb67c48'),
  'Gender': 'F',
  'Name': 'Marely',
  'State': 'TX'},
 {'_id': ObjectId('5eb51b899993d0175fb67cd1'),
  'Gender': 'F',
  'Name': 'Mary',
  'State': 'NM'},
 {'_id': ObjectId('5eb51b899993d0175fb67cf7'),
  'Gender': 'F',
  'Name': 'Miley',
  'State': 'ID'},
 {'_id': ObjectId('5eb51b899993d0175fb67d70'),
  'Gender': 'F',
  'Name': 'Misty',
  'State': 'ME'},
 {'_id': ObjectId('5eb51b899993d0175fb67e28'),
  'Gender': 'F',
  'Name': 'Marjory',
  'State': 'OH'},
 {'_id': ObjectId('5eb51b899993d0175fb67eba'),
  'Gender': 'M',
  'Name': 'Michaelanthony',
  'State': 'NY'},
 {'_id': ObjectId('5eb51b899993d0175fb67ef8'),
  'Gender': 'F',
  'Name': 'Misty',
  'State': 'NY'},
 {'_id': ObjectId('5eb51b899993d0175fb67f34'),
  'Gender': 'M',
  'Name': 'Murray',
  'State': 'KY'},
 {'_id': ObjectId('5eb51b899993d0175fb67f3f'),
  'Gender': 'M',
  'Name': 'Manley',
  'State': 'ME'},
 {'_id': ObjectId('5eb51b899993d0175fb6801a'),
  'Gender': 'F',
  'Name': 'Molly',
  'State': 'OH'},
 {'_id': ObjectId('5eb51b899993d0175fb68290'),
  'Gender': 'M',
  'Name': 'Micky',
  'State': 'KS'},
 {'_id': ObjectId('5eb51b899993d0175fb682c3'),
  'Gender': 'F',
  'Name': 'Marley',
  'State': 'KY'},
 {'_id': ObjectId('5eb51b899993d0175fb683d6'),
  'Gender': 'F',
  'Name': 'Mallory',
  'State': 'ME'},
 {'_id': ObjectId('5eb51b899993d0175fb684d8'),
  'Gender': 'F',
  'Name': 'Mindy',
  'State': 'KY'},
 {'_id': ObjectId('5eb51b899993d0175fb68526'),
  'Gender': 'F',
  'Name': 'Marely',
  'State': 'OH'},
 {'_id': ObjectId('5eb51b899993d0175fb68586'),
  'Gender': 'F',
  'Name': 'Magaly',
  'State': 'TX'},
 {'_id': ObjectId('5eb51b899993d0175fb686c5'),
  'Gender': 'M',
  'Name': 'Montgomery',
  'State': 'NY'},
 {'_id': ObjectId('5eb51b899993d0175fb6873f'),
  'Gender': 'M',
  'Name': 'Monty',
  'State': 'KS'},
 {'_id': ObjectId('5eb51b899993d0175fb688e4'),
  'Gender': 'F',
  'Name': 'Marely',
  'State': 'NY'},
 {'_id': ObjectId('5eb51b899993d0175fb68908'),
  'Gender': 'F',
  'Name': 'Mary',
  'State': 'ME'},
 {'_id': ObjectId('5eb51b899993d0175fb689dc'),
  'Gender': 'F',
  'Name': 'Makinley',
  'State': 'MS'},
 {'_id': ObjectId('5eb51b899993d0175fb689f1'),
  'Gender': 'F',
  'Name': 'Melaney',
  'State': 'NY'},
 {'_id': ObjectId('5eb51b899993d0175fb68a6e'),
  'Gender': 'M',
  'Name': 'Moishy',
  'State': 'NY'},
 {'_id': ObjectId('5eb51b899993d0175fb68a89'),
  'Gender': 'M',
  'Name': 'Mikey',
  'State': 'NY'},
 {'_id': ObjectId('5eb51b899993d0175fb68b2f'),
  'Gender': 'F',
  'Name': 'Milady',
  'State': 'NY'},
 {'_id': ObjectId('5eb51b899993d0175fb68ba4'),
  'Gender': 'M',
  'Name': 'Mallory',
  'State': 'KY'},
 {'_id': ObjectId('5eb51b899993d0175fb68bdf'),
  'Gender': 'F',
  'Name': 'Majesty',
  'State': 'NY'},
 {'_id': ObjectId('5eb51b899993d0175fb68c18'),
  'Gender': 'F',
  'Name': 'Marly',
  'State': 'TX'},
 {'_id': ObjectId('5eb51b899993d0175fb68c2e'),
  'Gender': 'M',
  'Name': 'Marty',
  'State': 'KY'},
 {'_id': ObjectId('5eb51b899993d0175fb68c69'),
  'Gender': 'M',
  'Name': 'Mickey',
  'State': 'KY'},
 {'_id': ObjectId('5eb51b899993d0175fb68c86'),
  'Gender': 'M',
  'Name': 'Matty',
  'State': 'NY'},
 {'_id': ObjectId('5eb51b899993d0175fb68ccb'),
  'Gender': 'M',
  'Name': 'Mackey',
  'State': 'TX'},
 {'_id': ObjectId('5eb51b899993d0175fb68d44'),
  'Gender': 'F',
  'Name': 'Mckinzy',
  'State': 'TX'},
 {'_id': ObjectId('5eb51b899993d0175fb68d50'),
  'Gender': 'F',
  'Name': 'Macey',
  'State': 'NM'},
 {'_id': ObjectId('5eb51b899993d0175fb68ea7'),
  'Gender': 'M',
  'Name': 'Monty',
  'State': 'TN'},
 {'_id': ObjectId('5eb51b899993d0175fb68f48'),
  'Gender': 'M',
  'Name': 'Murray',
  'State': 'MS'},
 {'_id': ObjectId('5eb51b899993d0175fb68fd4'),
  'Gender': 'M',
  'Name': 'Marty',
  'State': 'ID'},
 {'_id': ObjectId('5eb51b899993d0175fb690a6'),
  'Gender': 'F',
  'Name': 'Magaby',
  'State': 'NY'},
 {'_id': ObjectId('5eb51b899993d0175fb690c5'),
  'Gender': 'F',
  'Name': 'Mickey',
  'State': 'KY'},
 {'_id': ObjectId('5eb51b899993d0175fb690d8'),
  'Gender': 'F',
  'Name': 'Misty',
  'State': 'TX'},
 {'_id': ObjectId('5eb51b899993d0175fb690fb'),
  'Gender': 'F',
  'Name': 'Marley',
  'State': 'AL'},
 {'_id': ObjectId('5eb51b899993d0175fb69142'),
  'Gender': 'M',
  'Name': 'Monty',
  'State': 'ID'},
 {'_id': ObjectId('5eb51b899993d0175fb69220'),
  'Gender': 'F',
  'Name': 'Melony',
  'State': 'KY'},
 {'_id': ObjectId('5eb51b899993d0175fb69325'),
  'Gender': 'M',
  'Name': 'Mickey',
  'State': 'KS'},
 {'_id': ObjectId('5eb51b899993d0175fb6945b'),
  'Gender': 'F',
  'Name': 'Modesty',
  'State': 'TX'},
 {'_id': ObjectId('5eb51b899993d0175fb69615'),
  'Gender': 'F',
  'Name': 'Miley',
  'State': 'UT'},
 {'_id': ObjectId('5eb51b899993d0175fb6961f'),
  'Gender': 'M',
  'Name': 'Mckinley',
  'State': 'KY'},
 {'_id': ObjectId('5eb51b899993d0175fb6965c'),
  'Gender': 'F',
  'Name': 'Meloney',
  'State': 'TX'},
 {'_id': ObjectId('5eb51b899993d0175fb696ad'),
  'Gender': 'F',
  'Name': 'Melany',
  'State': 'OH'},
 {'_id': ObjectId('5eb51b899993d0175fb6983f'),
  'Gender': 'M',
  'Name': 'Mary',
  'State': 'KY'},
 {'_id': ObjectId('5eb51b899993d0175fb69b4d'),
  'Gender': 'F',
  'Name': 'Mckinley',
  'State': 'NE'},
 {'_id': ObjectId('5eb51b899993d0175fb69c37'),
  'Gender': 'F',
  'Name': 'Marley',
  'State': 'PA'},
 {'_id': ObjectId('5eb51b899993d0175fb69c8c'),
  'Gender': 'F',
  'Name': 'Molly',
  'State': 'RI'},
 {'_id': ObjectId('5eb51b899993d0175fb69e57'),
  'Gender': 'F',
  'Name': 'Makenzy',
  'State': 'GA'},
 {'_id': ObjectId('5eb51b899993d0175fb69ee8'),
  'Gender': 'M',
  'Name': 'Marty',
  'State': 'WY'},
 {'_id': ObjectId('5eb51b899993d0175fb69f9c'),
  'Gender': 'F',
  'Name': 'Mary',
  'State': 'MI'},
 {'_id': ObjectId('5eb51b899993d0175fb69fff'),
  'Gender': 'F',
  'Name': 'Melody',
  'State': 'OR'},
 {'_id': ObjectId('5eb51b899993d0175fb6a162'),
  'Gender': 'F',
  'Name': 'Marty',
  'State': 'VA'},
 {'_id': ObjectId('5eb51b899993d0175fb6a16d'),
  'Gender': 'M',
  'Name': 'Marty',
  'State': 'DC'},
 {'_id': ObjectId('5eb51b899993d0175fb6a170'),
  'Gender': 'F',
  'Name': 'May',
  'State': 'MI'},
 {'_id': ObjectId('5eb51b899993d0175fb6a195'),
  'Gender': 'F',
  'Name': 'Merry',
  'State': 'CT'},
 {'_id': ObjectId('5eb51b899993d0175fb6a21c'),
  'Gender': 'F',
  'Name': 'Mercy',
  'State': 'CO'},
 {'_id': ObjectId('5eb51b899993d0175fb6a2e9'),
  'Gender': 'F',
  'Name': 'Mary',
  'State': 'MN'},
 {'_id': ObjectId('5eb51b899993d0175fb6a346'),
  'Gender': 'F',
  'Name': 'Misty',
  'State': 'MN'},
 {'_id': ObjectId('5eb51b899993d0175fb6a453'),
  'Gender': 'F',
  'Name': 'Miley',
  'State': 'WV'},
 {'_id': ObjectId('5eb51b899993d0175fb6a456'),
  'Gender': 'F',
  'Name': 'Murphy',
  'State': 'MI'},
 {'_id': ObjectId('5eb51b899993d0175fb6a4db'),
  'Gender': 'F',
  'Name': 'Milly',
  'State': 'GA'},
 {'_id': ObjectId('5eb51b899993d0175fb6a5b4'),
  'Gender': 'F',
  'Name': 'Melony',
  'State': 'VA'},
 {'_id': ObjectId('5eb51b899993d0175fb6a602'),
  'Gender': 'F',
  'Name': 'Marley',
  'State': 'AZ'},
 {'_id': ObjectId('5eb51b899993d0175fb6a628'),
  'Gender': 'F',
  'Name': 'Miley',
  'State': 'HI'},
 {'_id': ObjectId('5eb51b899993d0175fb6a640'),
  'Gender': 'F',
  'Name': 'Mindy',
  'State': 'VA'},
 {'_id': ObjectId('5eb51b899993d0175fb6a67b'),
  'Gender': 'F',
  'Name': 'Marely',
  'State': 'NE'},
 {'_id': ObjectId('5eb51b899993d0175fb6a681'),
  'Gender': 'M',
  'Name': 'Micky',
  'State': 'GA'},
 {'_id': ObjectId('5eb51b899993d0175fb6a7b9'),
  'Gender': 'F',
  'Name': 'Marty',
  'State': 'MO'},
 {'_id': ObjectId('5eb51b899993d0175fb6a7dc'),
  'Gender': 'F',
  'Name': 'Melody',
  'State': 'MN'},
 {'_id': ObjectId('5eb51b899993d0175fb6a8e4'),
  'Gender': 'F',
  'Name': 'Marcy',
  'State': 'NE'},
 {'_id': ObjectId('5eb51b899993d0175fb6a9c4'),
  'Gender': 'F',
  'Name': 'Miley',
  'State': 'SD'},
 {'_id': ObjectId('5eb51b899993d0175fb6aa91'),
  'Gender': 'M',
  'Name': 'Marty',
  'State': 'MA'},
 {'_id': ObjectId('5eb51b899993d0175fb6abf1'),
  'Gender': 'F',
  'Name': 'Mindy',
  'State': 'ND'},
 {'_id': ObjectId('5eb51b899993d0175fb6ac09'),
  'Gender': 'F',
  'Name': 'Macey',
  'State': 'MN'},
 {'_id': ObjectId('5eb51b899993d0175fb6ad58'),
  'Gender': 'F',
  'Name': 'May',
  'State': 'NE'},
 {'_id': ObjectId('5eb51b899993d0175fb6ae18'),
  'Gender': 'F',
  'Name': 'Mindy',
  'State': 'AK'},
 {'_id': ObjectId('5eb51b899993d0175fb6aea0'),
  'Gender': 'F',
  'Name': 'Misty',
  'State': 'OR'},
 {'_id': ObjectId('5eb51b899993d0175fb6aede'),
  'Gender': 'F',
  'Name': 'Melony',
  'State': 'MA'},
 {'_id': ObjectId('5eb51b899993d0175fb6afc8'),
  'Gender': 'M',
  'Name': 'Murphy',
  'State': 'MN'},
 {'_id': ObjectId('5eb51b899993d0175fb6b2e4'),
  'Gender': 'F',
  'Name': 'Marely',
  'State': 'OR'},
 {'_id': ObjectId('5eb51b899993d0175fb6b30d'),
  'Gender': 'F',
  'Name': 'Melanny',
  'State': 'AZ'},
 {'_id': ObjectId('5eb51b899993d0175fb6b318'),
  'Gender': 'F',
  'Name': 'Marley',
  'State': 'ND'},
 {'_id': ObjectId('5eb51b899993d0175fb6b477'),
  'Gender': 'M',
  'Name': 'Micky',
  'State': 'IN'},
 {'_id': ObjectId('5eb51b899993d0175fb6b480'),
  'Gender': 'F',
  'Name': 'Mary',
  'State': 'CT'},
 {'_id': ObjectId('5eb51b899993d0175fb6b501'),
  'Gender': 'F',
  'Name': 'Macy',
  'State': 'NE'},
 {'_id': ObjectId('5eb51b899993d0175fb6b51f'),
  'Gender': 'F',
  'Name': 'Mazzy',
  'State': 'OR'},
 {'_id': ObjectId('5eb51b899993d0175fb6b5b4'),
  'Gender': 'F',
  'Name': 'Misty',
  'State': 'CT'},
 {'_id': ObjectId('5eb51b899993d0175fb6b6fb'),
  'Gender': 'F',
  'Name': 'Marcey',
  'State': 'PA'},
 {'_id': ObjectId('5eb51b899993d0175fb6b746'),
  'Gender': 'F',
  'Name': 'Merry',
  'State': 'NE'},
 {'_id': ObjectId('5eb51b899993d0175fb6b7d4'),
  'Gender': 'F',
  'Name': 'Mindy',
  'State': 'IN'},
 {'_id': ObjectId('5eb51b899993d0175fb6b7f4'),
  'Gender': 'M',
  'Name': 'Mickey',
  'State': 'SD'},
 {'_id': ObjectId('5eb51b899993d0175fb6b810'),
  'Gender': 'F',
  'Name': 'Mercy',
  'State': 'MA'},
 {'_id': ObjectId('5eb51b899993d0175fb6b8a6'),
  'Gender': 'F',
  'Name': 'Mercy',
  'State': 'PA'},
 {'_id': ObjectId('5eb51b899993d0175fb6b8e4'),
  'Gender': 'M',
  'Name': 'Mickey',
  'State': 'WV'},
 {'_id': ObjectId('5eb51b899993d0175fb6b92e'),
  'Gender': 'M',
  'Name': 'Marley',
  'State': 'MN'},
 {'_id': ObjectId('5eb51b899993d0175fb6b9e0'),
  'Gender': 'F',
  'Name': 'Misty',
  'State': 'MI'},
 {'_id': ObjectId('5eb51b899993d0175fb6bac1'),
  'Gender': 'F',
  'Name': 'Mercy',
  'State': 'GA'},
 {'_id': ObjectId('5eb51b899993d0175fb6bb1f'),
  'Gender': 'F',
  'Name': 'Makenzy',
  'State': 'IN'},
 {'_id': ObjectId('5eb51b899993d0175fb6bd62'),
  'Gender': 'M',
  'Name': 'Mary',
  'State': 'WA'},
 {'_id': ObjectId('5eb51b899993d0175fb6bd65'),
  'Gender': 'F',
  'Name': 'Miley',
  'State': 'NH'},
 {'_id': ObjectId('5eb51b899993d0175fb6be7b'),
  'Gender': 'F',
  'Name': 'Mindy',
  'State': 'SC'},
 {'_id': ObjectId('5eb51b899993d0175fb6bf18'),
  'Gender': 'F',
  'Name': 'Macy',
  'State': 'IA'},
 {'_id': ObjectId('5eb51b899993d0175fb6bfd6'),
  'Gender': 'F',
  'Name': 'Mey',
  'State': 'CA'},
 {'_id': ObjectId('5eb51b899993d0175fb6c081'),
  'Gender': 'F',
  'Name': 'Makinley',
  'State': 'AR'},
 {'_id': ObjectId('5eb51b899993d0175fb6c1b0'),
  'Gender': 'M',
  'Name': 'Mccoy',
  'State': 'CA'},
 {'_id': ObjectId('5eb51b899993d0175fb6c253'),
  'Gender': 'F',
  'Name': 'Mackenzy',
  'State': 'CA'},
 {'_id': ObjectId('5eb51b899993d0175fb6c439'),
  'Gender': 'F',
  'Name': 'Missy',
  'State': 'NC'},
 {'_id': ObjectId('5eb51b899993d0175fb6c61d'),
  'Gender': 'F',
  'Name': 'Magaly',
  'State': 'FL'},
 {'_id': ObjectId('5eb51b899993d0175fb6c67f'),
  'Gender': 'F',
  'Name': 'Marley',
  'State': 'IA'},
 {'_id': ObjectId('5eb51b899993d0175fb6c708'),
  'Gender': 'F',
  'Name': 'Marcy',
  'State': 'SC'},
 {'_id': ObjectId('5eb51b899993d0175fb6c735'),
  'Gender': 'F',
  'Name': 'Macey',
  'State': 'SC'},
 {'_id': ObjectId('5eb51b899993d0175fb6c84d'),
  'Gender': 'F',
  'Name': 'Maleny',
  'State': 'CA'},
 {'_id': ObjectId('5eb51b899993d0175fb6cae1'),
  'Gender': 'F',
  'Name': 'May',
  'State': 'SC'},
 {'_id': ObjectId('5eb51b899993d0175fb6cc36'),
  'Gender': 'F',
  'Name': 'Mickey',
  'State': 'WA'},
 {'_id': ObjectId('5eb51b899993d0175fb6cdf8'),
  'Gender': 'F',
  'Name': 'Mahogany',
  'State': 'LA'},
 {'_id': ObjectId('5eb51b899993d0175fb6d1a0'),
  'Gender': 'F',
  'Name': 'Marty',
  'State': 'IL'},
 {'_id': ObjectId('5eb51b899993d0175fb6d235'),
  'Gender': 'F',
  'Name': 'Marty',
  'State': 'OK'},
 {'_id': ObjectId('5eb51b899993d0175fb6d26b'),
  'Gender': 'F',
  'Name': 'Mickey',
  'State': 'CA'},
 {'_id': ObjectId('5eb51b899993d0175fb6d2b6'),
  'Gender': 'F',
  'Name': 'Misty',
  'State': 'MD'},
 {'_id': ObjectId('5eb51b899993d0175fb6d2cb'),
  'Gender': 'F',
  'Name': 'Marty',
  'State': 'FL'},
 {'_id': ObjectId('5eb51b899993d0175fb6d33a'),
  'Gender': 'F',
  'Name': 'May',
  'State': 'IA'},
 {'_id': ObjectId('5eb51b899993d0175fb6d378'),
  'Gender': 'M',
  'Name': 'Murray',
  'State': 'LA'},
 {'_id': ObjectId('5eb51b899993d0175fb6d3e2'),
  'Gender': 'F',
  'Name': 'Misty',
  'State': 'IL'},
 {'_id': ObjectId('5eb51b899993d0175fb6d470'),
  'Gender': 'F',
  'Name': 'Marykay',
  'State': 'IL'},
 {'_id': ObjectId('5eb51b899993d0175fb6d60a'),
  'Gender': 'F',
  'Name': 'Marshay',
  'State': 'CA'},
 {'_id': ObjectId('5eb51b899993d0175fb6d661'),
  'Gender': 'M',
  'Name': 'Markanthony',
  'State': 'CA'},
 {'_id': ObjectId('5eb51b899993d0175fb6d669'),
  'Gender': 'M',
  'Name': 'Monty',
  'State': 'NC'},
 {'_id': ObjectId('5eb51b899993d0175fb6d786'),
  'Gender': 'F',
  'Name': 'Molly',
  'State': 'AR'},
 {'_id': ObjectId('5eb51b899993d0175fb6d8cb'),
  'Gender': 'F',
  'Name': 'Mercy',
  'State': 'IL'},
 {'_id': ObjectId('5eb51b899993d0175fb6d8f3'),
  'Gender': 'F',
  'Name': 'Macy',
  'State': 'OK'},
 {'_id': ObjectId('5eb51b899993d0175fb6da59'),
  'Gender': 'M',
  'Name': 'Mary',
  'State': 'NC'},
 {'_id': ObjectId('5eb51b899993d0175fb6da73'),
  'Gender': 'F',
  'Name': 'Mercy',
  'State': 'OK'},
 {'_id': ObjectId('5eb51b899993d0175fb6da7d'),
  'Gender': 'F',
  'Name': 'Melody',
  'State': 'FL'},
 {'_id': ObjectId('5eb51b899993d0175fb6da95'),
  'Gender': 'F',
  'Name': 'Mendy',
  'State': 'NC'},
 {'_id': ObjectId('5eb51b899993d0175fb6dbae'),
  'Gender': 'F',
  'Name': 'Mily',
  'State': 'FL'},
 {'_id': ObjectId('5eb51b899993d0175fb6dcd8'),
  'Gender': 'F',
  'Name': 'Mily',
  'State': 'IL'},
 {'_id': ObjectId('5eb51b899993d0175fb6df5c'),
  'Gender': 'F',
  'Name': 'Maggy',
  'State': 'CA'},
 {'_id': ObjectId('5eb51b899993d0175fb6e04a'),
  'Gender': 'F',
  'Name': 'Magaby',
  'State': 'NC'},
 {'_id': ObjectId('5eb51b899993d0175fb6e0d2'),
  'Gender': 'F',
  'Name': 'Molly',
  'State': 'LA'},
 {'_id': ObjectId('5eb51b899993d0175fb6e3d8'),
  'Gender': 'F',
  'Name': 'Marley',
  'State': 'FL'},
 {'_id': ObjectId('5eb51b899993d0175fb6e527'),
  'Gender': 'F',
  'Name': 'Margery',
  'State': 'WA'},
 {'_id': ObjectId('5eb51b899993d0175fb6e555'),
  'Gender': 'M',
  'Name': 'Murry',
  'State': 'AR'},
 {'_id': ObjectId('5eb51b899993d0175fb6e572'),
  'Gender': 'M',
  'Name': 'Mckinley',
  'State': 'SC'},
 {'_id': ObjectId('5eb51b899993d0175fb6e5d2'),
  'Gender': 'F',
  'Name': 'Marely',
  'State': 'IA'},
 {'_id': ObjectId('5eb51b899993d0175fb6e61c'),
  'Gender': 'F',
  'Name': 'Mellany',
  'State': 'CA'},
 {'_id': ObjectId('5eb51b899993d0175fb6e6d7'),
  'Gender': 'F',
  'Name': 'Majesty',
  'State': 'CA'},
 {'_id': ObjectId('5eb51b899993d0175fb6e7ea'),
  'Gender': 'F',
  'Name': 'Marelly',
  'State': 'CA'},
 {'_id': ObjectId('5eb51b899993d0175fb6e83f'),
  'Gender': 'F',
  'Name': 'Marcy',
  'State': 'MD'},
 {'_id': ObjectId('5eb51b899993d0175fb6e8b9'),
  'Gender': 'F',
  'Name': 'Mallary',
  'State': 'IA'},
 {'_id': ObjectId('5eb51b899993d0175fb6e92e'),
  'Gender': 'F',
  'Name': 'Marley',
  'State': 'MD'},
 {'_id': ObjectId('5eb51b899993d0175fb6eb0b'),
  'Gender': 'F',
  'Name': 'Mckinley',
  'State': 'FL'},
 {'_id': ObjectId('5eb51b899993d0175fb6ec32'),
  'Gender': 'F',
  'Name': 'Margery',
  'State': 'DE'},
 {'_id': ObjectId('5eb51b899993d0175fb6ec3d'),
  'Gender': 'M',
  'Name': 'Murphy',
  'State': 'IL'},
 {'_id': ObjectId('5eb51b899993d0175fb6ec6a'),
  'Gender': 'F',
  'Name': 'Makenzy',
  'State': 'IL'},
 {'_id': ObjectId('5eb51b899993d0175fb6ece6'),
  'Gender': 'F',
  'Name': 'Marcy',
  'State': 'NJ'},
 {'_id': ObjectId('5eb51b899993d0175fb6ed09'),
  'Gender': 'F',
  'Name': 'Mindy',
  'State': 'FL'},
 {'_id': ObjectId('5eb51b899993d0175fb6ef00'),
  'Gender': 'F',
  'Name': 'Milly',
  'State': 'OK'},
 {'_id': ObjectId('5eb51b899993d0175fb6ef95'),
  'Gender': 'F',
  'Name': 'Milly',
  'State': 'FL'},
 {'_id': ObjectId('5eb51b899993d0175fb6f0f3'),
  'Gender': 'F',
  'Name': 'Mckinley',
  'State': 'SC'},
 {'_id': ObjectId('5eb51b899993d0175fb6f18f'),
  'Gender': 'F',
  'Name': 'Marigny',
  'State': 'LA'},
 {'_id': ObjectId('5eb51b899993d0175fb6f190'),
  'Gender': 'M',
  'Name': 'Malachy',
  'State': 'IL'},
 {'_id': ObjectId('5eb51b899993d0175fb6f1e9'),
  'Gender': 'F',
  'Name': 'Melody',
  'State': 'SC'},
 {'_id': ObjectId('5eb51b899993d0175fb6f214'),
  'Gender': 'M',
  'Name': 'Marley',
  'State': 'NJ'},
 {'_id': ObjectId('5eb51b899993d0175fb6f2ab'),
  'Gender': 'F',
  'Name': 'Mariely',
  'State': 'CA'},
 {'_id': ObjectId('5eb51b899993d0175fb6f318'),
  'Gender': 'F',
  'Name': 'Melody',
  'State': 'IA'},
 {'_id': ObjectId('5eb51b899993d0175fb6f54b'),
  'Gender': 'M',
  'Name': 'Moody',
  'State': 'NC'},
 {'_id': ObjectId('5eb51b899993d0175fb6f602'),
  'Gender': 'M',
  'Name': 'Marty',
  'State': 'IL'},
 {'_id': ObjectId('5eb51b899993d0175fb6f7f7'),
  'Gender': 'F',
  'Name': 'Margery',
  'State': 'LA'},
 {'_id': ObjectId('5eb51b899993d0175fb6f951'),
  'Gender': 'F',
  'Name': 'Mahogany',
  'State': 'NC'},
 {'_id': ObjectId('5eb51b899993d0175fb6fa25'),
  'Gender': 'F',
  'Name': 'Melanny',
  'State': 'FL'},
 {'_id': ObjectId('5eb51b899993d0175fb6faa6'),
  'Gender': 'F',
  'Name': 'Molly',
  'State': 'NV'},
 {'_id': ObjectId('5eb51b899993d0175fb6faed'),
  'Gender': 'F',
  'Name': 'Marely',
  'State': 'MD'},
 {'_id': ObjectId('5eb51b899993d0175fb6fb54'),
  'Gender': 'F',
  'Name': 'May',
  'State': 'FL'},
 {'_id': ObjectId('5eb51b899993d0175fb6fbd1'),
  'Gender': 'F',
  'Name': 'Makinsey',
  'State': 'FL'},
 {'_id': ObjectId('5eb51b899993d0175fb6fc17'),
  'Gender': 'M',
  'Name': 'Marty',
  'State': 'NJ'},
 {'_id': ObjectId('5eb51b899993d0175fb6fc3d'),
  'Gender': 'F',
  'Name': 'Melony',
  'State': 'FL'},
 {'_id': ObjectId('5eb51b899993d0175fb6fceb'),
  'Gender': 'F',
  'Name': 'Marty',
  'State': 'IA'},
 {'_id': ObjectId('5eb51b899993d0175fb6fe0d'),
  'Gender': 'F',
  'Name': 'Macey',
  'State': 'IL'},
 {'_id': ObjectId('5eb51b899993d0175fb6ff4b'),
  'Gender': 'F',
  'Name': 'Misty',
  'State': 'IA'},
 {'_id': ObjectId('5eb51b899993d0175fb6ff51'),
  'Gender': 'M',
  'Name': 'Mckinley',
  'State': 'OK'},
 {'_id': ObjectId('5eb51b899993d0175fb6ff61'),
  'Gender': 'F',
  'Name': 'Marely',
  'State': 'FL'},
 {'_id': ObjectId('5eb51b899993d0175fb6ffe3'),
  'Gender': 'M',
  'Name': 'Mckinley',
  'State': 'FL'},
 {'_id': ObjectId('5eb51b899993d0175fb70047'),
  'Gender': 'M',
  'Name': 'Macy',
  'State': 'NC'},
 {'_id': ObjectId('5eb51b899993d0175fb7006d'),
  'Gender': 'F',
  'Name': 'My',
  'State': 'OK'},
 {'_id': ObjectId('5eb51b899993d0175fb700b4'),
  'Gender': 'F',
  'Name': 'Marely',
  'State': 'IL'},
 {'_id': ObjectId('5eb51b899993d0175fb700cf'),
  'Gender': 'F',
  'Name': 'Mahogany',
  'State': 'CA'},
 {'_id': ObjectId('5eb51b899993d0175fb70115'),
  'Gender': 'M',
  'Name': 'Mckinley',
  'State': 'IL'},
 {'_id': ObjectId('5eb51b899993d0175fb70148'),
  'Gender': 'M',
  'Name': 'Marley',
  'State': 'OK'},
 {'_id': ObjectId('5eb51b899993d0175fb7014a'),
  'Gender': 'F',
  'Name': 'Marely',
  'State': 'OK'},
 {'_id': ObjectId('5eb51b899993d0175fb70152'),
  'Gender': 'M',
  'Name': 'Macaulay',
  'State': 'CA'},
 {'_id': ObjectId('5eb51b899993d0175fb70173'),
  'Gender': 'M',
  'Name': 'Marcanthony',
  'State': 'NV'},
 {'_id': ObjectId('5eb51b899993d0175fb701eb'),
  'Gender': 'F',
  'Name': 'Mandy',
  'State': 'MT'},
 {'_id': ObjectId('5eb51b899993d0175fb701f5'),
  'Gender': 'M',
  'Name': 'Murray',
  'State': 'CT'},
 {'_id': ObjectId('5eb51b899993d0175fb70203'),
  'Gender': 'F',
  'Name': 'Maisy',
  'State': 'MN'},
 {'_id': ObjectId('5eb51b899993d0175fb70235'),
  'Gender': 'F',
  'Name': 'May',
  'State': 'MT'},
 {'_id': ObjectId('5eb51b899993d0175fb7035c'),
  'Gender': 'M',
  'Name': 'Mary',
  'State': 'MI'},
 {'_id': ObjectId('5eb51b899993d0175fb7039a'),
  'Gender': 'F',
  'Name': 'Melany',
  'State': 'MI'},
 {'_id': ObjectId('5eb51b899993d0175fb704c0'),
  'Gender': 'F',
  'Name': 'Miley',
  'State': 'OR'},
 {'_id': ObjectId('5eb51b899993d0175fb70831'),
  'Gender': 'M',
  'Name': 'Mickey',
  'State': 'CT'},
 {'_id': ObjectId('5eb51b899993d0175fb7084e'),
  'Gender': 'M',
  'Name': 'Mary',
  'State': 'NE'},
 {'_id': ObjectId('5eb51b899993d0175fb70915'),
  'Gender': 'F',
  'Name': 'Marcy',
  'State': 'RI'},
 {'_id': ObjectId('5eb51b899993d0175fb7096f'),
  'Gender': 'F',
  'Name': 'Mindy',
  'State': 'RI'},
 {'_id': ObjectId('5eb51b899993d0175fb709b2'),
  'Gender': 'F',
  'Name': 'Marjory',
  'State': 'MN'},
 {'_id': ObjectId('5eb51b899993d0175fb709d0'),
  'Gender': 'F',
  'Name': 'Missy',
  'State': 'OR'},
 {'_id': ObjectId('5eb51b899993d0175fb70a07'),
  'Gender': 'F',
  'Name': 'Margery',
  'State': 'CT'},
 {'_id': ObjectId('5eb51b899993d0175fb70a77'),
  'Gender': 'F',
  'Name': 'Merry',
  'State': 'MT'},
 {'_id': ObjectId('5eb51b899993d0175fb70afb'),
  'Gender': 'M',
  'Name': 'Manley',
  'State': 'MN'},
 {'_id': ObjectId('5eb51b899993d0175fb70b07'),
  'Gender': 'F',
  'Name': 'Melany',
  'State': 'NE'},
 {'_id': ObjectId('5eb51b899993d0175fb70b18'),
  'Gender': 'F',
  'Name': 'Maisy',
  'State': 'MI'},
 {'_id': ObjectId('5eb51b899993d0175fb70b80'),
  'Gender': 'F',
  'Name': 'Mickey',
  'State': 'NE'},
 {'_id': ObjectId('5eb51b899993d0175fb70b9e'),
  'Gender': 'M',
  'Name': 'Mary',
  'State': 'CT'},
 {'_id': ObjectId('5eb51b899993d0175fb70bdc'),
  'Gender': 'F',
  'Name': 'Maly',
  'State': 'MN'},
 {'_id': ObjectId('5eb51b899993d0175fb70bf4'),
  'Gender': 'F',
  'Name': 'Molly',
  'State': 'MI'},
 {'_id': ObjectId('5eb51b899993d0175fb70c05'),
  'Gender': 'M',
  'Name': 'Murray',
  'State': 'MN'},
 {'_id': ObjectId('5eb51b899993d0175fb70de3'),
  'Gender': 'F',
  'Name': 'Missy',
  'State': 'MN'},
 {'_id': ObjectId('5eb51b899993d0175fb70e1b'),
  'Gender': 'F',
  'Name': 'Marney',
  'State': 'MI'},
 {'_id': ObjectId('5eb51b899993d0175fb710ca'),
  'Gender': 'F',
  'Name': 'Macy',
  'State': 'RI'},
 {'_id': ObjectId('5eb51b899993d0175fb7112c'),
  'Gender': 'M',
  'Name': 'Mickey',
  'State': 'MN'},
 {'_id': ObjectId('5eb51b899993d0175fb711df'),
  'Gender': 'M',
  'Name': 'Manny',
  'State': 'MI'},
 {'_id': ObjectId('5eb51b899993d0175fb7122c'),
  'Gender': 'F',
  'Name': 'Miley',
  'State': 'MI'},
 {'_id': ObjectId('5eb51b899993d0175fb7129c'),
  'Gender': 'F',
  'Name': 'Mindy',
  'State': 'MT'},
 {'_id': ObjectId('5eb51b899993d0175fb712a9'),
  'Gender': 'F',
  'Name': 'Mallory',
  'State': 'NE'},
 {'_id': ObjectId('5eb51b899993d0175fb712cf'),
  'Gender': 'F',
  'Name': 'Marcy',
  'State': 'MT'},
 {'_id': ObjectId('5eb51b899993d0175fb712de'),
  'Gender': 'F',
  'Name': 'Marjory',
  'State': 'CT'},
 {'_id': ObjectId('5eb51b899993d0175fb71444'),
  'Gender': 'F',
  'Name': 'Margery',
  'State': 'MN'},
 {'_id': ObjectId('5eb51b899993d0175fb715f3'),
  'Gender': 'F',
  'Name': 'Missy',
  'State': 'NE'},
 {'_id': ObjectId('5eb51b899993d0175fb715f9'),
  'Gender': 'F',
  'Name': 'Marley',
  'State': 'TX'},
 {'_id': ObjectId('5eb51b899993d0175fb71649'),
  'Gender': 'F',
  'Name': 'Mendy',
  'State': 'KS'},
 {'_id': ObjectId('5eb51b899993d0175fb716f5'),
  'Gender': 'M',
  'Name': 'Monty',
  'State': 'NY'},
 {'_id': ObjectId('5eb51b899993d0175fb7174c'),
  'Gender': 'F',
  'Name': 'Mary',
  'State': 'AL'},
 {'_id': ObjectId('5eb51b899993d0175fb7182a'),
  'Gender': 'M',
  'Name': 'Mccoy',
  'State': 'TX'},
 {'_id': ObjectId('5eb51b899993d0175fb71866'),
  'Gender': 'F',
  'Name': 'Mckinley',
  'State': 'MS'},
 {'_id': ObjectId('5eb51b899993d0175fb719d3'),
  'Gender': 'F',
  'Name': 'Miley',
  'State': 'NM'},
 {'_id': ObjectId('5eb51b899993d0175fb71b7a'),
  'Gender': 'F',
  'Name': 'Marcy',
  'State': 'UT'},
 {'_id': ObjectId('5eb51b899993d0175fb71bbd'),
  'Gender': 'F',
  'Name': 'Mickey',
  'State': 'NY'},
 {'_id': ObjectId('5eb51b899993d0175fb71d35'),
  'Gender': 'M',
  'Name': 'Macaulay',
  'State': 'OH'},
 {'_id': ObjectId('5eb51b899993d0175fb71d37'),
  'Gender': 'F',
  'Name': 'Merry',
  'State': 'WI'},
 {'_id': ObjectId('5eb51b899993d0175fb71fdf'),
  'Gender': 'F',
  'Name': 'Mallory',
  'State': 'UT'},
 {'_id': ObjectId('5eb51b899993d0175fb72059'),
  'Gender': 'M',
  'Name': 'Maury',
  'State': 'TN'},
 {'_id': ObjectId('5eb51b899993d0175fb7207a'),
  'Gender': 'M',
  'Name': 'Murry',
  'State': 'MS'},
 {'_id': ObjectId('5eb51b899993d0175fb7208c'),
  'Gender': 'M',
  'Name': 'Mary',
  'State': 'NY'},
 {'_id': ObjectId('5eb51b899993d0175fb72123'),
  'Gender': 'M',
  'Name': 'Majesty',
  'State': 'NY'},
 {'_id': ObjectId('5eb51b899993d0175fb72144'),
  'Gender': 'F',
  'Name': 'Mindy',
  'State': 'TX'},
 {'_id': ObjectId('5eb51b899993d0175fb72252'),
  'Gender': 'F',
  'Name': 'Makinley',
  'State': 'OH'},
 {'_id': ObjectId('5eb51b899993d0175fb724b3'),
  'Gender': 'M',
  'Name': 'Mary',
  'State': 'OH'},
 {'_id': ObjectId('5eb51b899993d0175fb725a0'),
  'Gender': 'F',
  'Name': 'Misty',
  'State': 'KY'},
 {'_id': ObjectId('5eb51b899993d0175fb72604'),
  'Gender': 'F',
  'Name': 'Melody',
  'State': 'KS'},
 {'_id': ObjectId('5eb51b899993d0175fb72734'),
  'Gender': 'F',
  'Name': 'Maisy',
  'State': 'NY'},
 {'_id': ObjectId('5eb51b899993d0175fb72756'),
  'Gender': 'F',
  'Name': 'Molly',
  'State': 'UT'},
 {'_id': ObjectId('5eb51b899993d0175fb727c2'),
  'Gender': 'F',
  'Name': 'Marily',
  'State': 'TX'},
 {'_id': ObjectId('5eb51b899993d0175fb72873'),
  'Gender': 'F',
  'Name': 'Mendy',
  'State': 'MS'},
 {'_id': ObjectId('5eb51b899993d0175fb72875'),
  'Gender': 'M',
  'Name': 'Murray',
  'State': 'NM'},
 {'_id': ObjectId('5eb51b899993d0175fb728e1'),
  'Gender': 'M',
  'Name': 'Marley',
  'State': 'UT'},
 {'_id': ObjectId('5eb51b899993d0175fb72926'),
  'Gender': 'F',
  'Name': 'Mary',
  'State': 'TN'},
 {'_id': ObjectId('5eb51b899993d0175fb72a18'),
  'Gender': 'F',
  'Name': 'Murphy',
  'State': 'AL'},
 {'_id': ObjectId('5eb51b899993d0175fb72a51'),
  'Gender': 'F',
  'Name': 'Maisy',
  'State': 'OH'},
 {'_id': ObjectId('5eb51b899993d0175fb72a63'),
  'Gender': 'F',
  'Name': 'Missy',
  'State': 'AL'},
 {'_id': ObjectId('5eb51b899993d0175fb72b1c'),
  'Gender': 'F',
  'Name': 'Marjory',
  'State': 'KY'},
 {'_id': ObjectId('5eb51b899993d0175fb72b35'),
  'Gender': 'F',
  'Name': 'May',
  'State': 'UT'},
 {'_id': ObjectId('5eb51b899993d0175fb72be5'),
  'Gender': 'F',
  'Name': 'Mabry',
  'State': 'TX'},
 {'_id': ObjectId('5eb51b899993d0175fb72c78'),
  'Gender': 'F',
  'Name': 'Macy',
  'State': 'TN'},
 {'_id': ObjectId('5eb51b899993d0175fb72ca8'),
  'Gender': 'F',
  'Name': 'Misty',
  'State': 'ID'},
 {'_id': ObjectId('5eb51b899993d0175fb72cb2'),
  'Gender': 'M',
  'Name': 'Markanthony',
  'State': 'TX'},
 {'_id': ObjectId('5eb51b899993d0175fb72cbc'),
  'Gender': 'F',
  'Name': 'Marely',
  'State': 'WI'},
 {'_id': ObjectId('5eb51b899993d0175fb72d49'),
  'Gender': 'F',
  'Name': 'Misty',
  'State': 'KS'},
 {'_id': ObjectId('5eb51b899993d0175fb72ea9'),
  'Gender': 'M',
  'Name': 'Mary',
  'State': 'TX'},
 {'_id': ObjectId('5eb51b899993d0175fb72f5c'),
  'Gender': 'F',
  'Name': 'Melony',
  'State': 'TX'},
 {'_id': ObjectId('5eb51b899993d0175fb730ec'),
  'Gender': 'F',
  'Name': 'Maily',
  'State': 'TX'},
 {'_id': ObjectId('5eb51b899993d0175fb7313f'),
  'Gender': 'F',
  'Name': 'Merry',
  'State': 'AL'},
 {'_id': ObjectId('5eb51b899993d0175fb7314c'),
  'Gender': 'M',
  'Name': 'Micky',
  'State': 'TX'},
 {'_id': ObjectId('5eb51b899993d0175fb732a5'),
  'Gender': 'M',
  'Name': 'Macy',
  'State': 'KY'},
 {'_id': ObjectId('5eb51b899993d0175fb7334c'),
  'Gender': 'M',
  'Name': 'Mckay',
  'State': 'TX'},
 {'_id': ObjectId('5eb51b899993d0175fb73466'),
  'Gender': 'M',
  'Name': 'Murry',
  'State': 'KY'},
 {'_id': ObjectId('5eb51b899993d0175fb73530'),
  'Gender': 'F',
  'Name': 'Marty',
  'State': 'NY'},
 {'_id': ObjectId('5eb51b899993d0175fb7354b'),
  'Gender': 'F',
  'Name': 'Misty',
  'State': 'AL'},
 {'_id': ObjectId('5eb51b899993d0175fb73569'),
  'Gender': 'F',
  'Name': 'Missy',
  'State': 'TN'},
 {'_id': ObjectId('5eb51b899993d0175fb737f7'),
  'Gender': 'M',
  'Name': 'Marty',
  'State': 'NY'},
 {'_id': ObjectId('5eb51b899993d0175fb7385f'),
  'Gender': 'F',
  'Name': 'Maebry',
  'State': 'TX'},
 {'_id': ObjectId('5eb51b899993d0175fb73918'),
  'Gender': 'F',
  'Name': 'Marny',
  'State': 'NY'},
 {'_id': ObjectId('5eb51b899993d0175fb73931'),
  'Gender': 'M',
  'Name': 'Murray',
  'State': 'ME'},
 {'_id': ObjectId('5eb51b899993d0175fb7398b'),
  'Gender': 'F',
  'Name': 'Missy',
  'State': 'KY'},
 {'_id': ObjectId('5eb51b899993d0175fb739cd'),
  'Gender': 'F',
  'Name': 'Marry',
  'State': 'MS'},
 {'_id': ObjectId('5eb51b899993d0175fb739f1'),
  'Gender': 'M',
  'Name': 'Mallory',
  'State': 'NY'},
 {'_id': ObjectId('5eb51b899993d0175fb73abf'),
  'Gender': 'M',
  'Name': 'Marley',
  'State': 'MS'},
 {'_id': ObjectId('5eb51b899993d0175fb73cf4'),
  'Gender': 'F',
  'Name': 'Melany',
  'State': 'KS'},
 {'_id': ObjectId('5eb51b899993d0175fb73e3d'),
  'Gender': 'F',
  'Name': 'Mckinley',
  'State': 'ID'},
 {'_id': ObjectId('5eb51b899993d0175fb73f4a'),
  'Gender': 'F',
  'Name': 'Marty',
  'State': 'OH'},
 {'_id': ObjectId('5eb51b899993d0175fb73f65'),
  'Gender': 'M',
  'Name': 'Murphy',
  'State': 'TN'},
 {'_id': ObjectId('5eb51b899993d0175fb74214'),
  'Gender': 'M',
  'Name': 'Mickey',
  'State': 'NM'},
 {'_id': ObjectId('5eb51b899993d0175fb7422a'),
  'Gender': 'F',
  'Name': 'Margy',
  'State': 'KS'},
 {'_id': ObjectId('5eb51b899993d0175fb742a9'),
  'Gender': 'F',
  'Name': 'Marely',
  'State': 'TN'},
 {'_id': ObjectId('5eb51b899993d0175fb74335'),
  'Gender': 'F',
  'Name': 'Mahogany',
  'State': 'MS'},
 {'_id': ObjectId('5eb51b899993d0175fb7439d'),
  'Gender': 'F',
  'Name': 'Marley',
  'State': 'NY'},
 {'_id': ObjectId('5eb51b899993d0175fb743f9'),
  'Gender': 'F',
  'Name': 'Marcy',
  'State': 'AL'},
 {'_id': ObjectId('5eb51b899993d0175fb7451c'),
  'Gender': 'F',
  'Name': 'Mandy',
  'State': 'KS'},
 {'_id': ObjectId('5eb51b899993d0175fb7451d'),
  'Gender': 'F',
  'Name': 'Marjory',
  'State': 'WI'},
 {'_id': ObjectId('5eb51b899993d0175fb74571'),
  'Gender': 'F',
  'Name': 'Marty',
  'State': 'TX'},
 {'_id': ObjectId('5eb51b899993d0175fb7476b'),
  'Gender': 'F',
  'Name': 'Mahaley',
  'State': 'TN'},
 {'_id': ObjectId('5eb51b899993d0175fb747ed'),
  'Gender': 'F',
  'Name': 'Magaly',
  'State': 'TN'},
 {'_id': ObjectId('5eb51b899993d0175fb7487b'),
  'Gender': 'F',
  'Name': 'Marcey',
  'State': 'TX'},
 {'_id': ObjectId('5eb51b899993d0175fb748e4'),
  'Gender': 'F',
  'Name': 'Mindy',
  'State': 'OH'},
 {'_id': ObjectId('5eb51b899993d0175fb7490d'),
  'Gender': 'M',
  'Name': 'Mccoy',
  'State': 'OH'},
 {'_id': ObjectId('5eb51b899993d0175fb7498b'),
  'Gender': 'F',
  'Name': 'Milly',
  'State': 'OH'},
 {'_id': ObjectId('5eb51b899993d0175fb74a8f'),
  'Gender': 'F',
  'Name': 'Makenzy',
  'State': 'NY'},
 {'_id': ObjectId('5eb51b899993d0175fb74bc4'),
  'Gender': 'F',
  'Name': 'Marly',
  'State': 'WI'},
 {'_id': ObjectId('5eb51b899993d0175fb74ceb'),
  'Gender': 'F',
  'Name': 'Mariely',
  'State': 'NY'},
 {'_id': ObjectId('5eb51b899993d0175fb74e40'),
  'Gender': 'F',
  'Name': 'Milly',
  'State': 'NY'},
 {'_id': ObjectId('5eb51b899993d0175fb74ee7'),
  'Gender': 'F',
  'Name': 'Macy',
  'State': 'UT'},
 {'_id': ObjectId('5eb51b899993d0175fb74f3d'),
  'Gender': 'M',
  'Name': 'Marty',
  'State': 'ME'},
 {'_id': ObjectId('5eb51b899993d0175fb74f4a'),
  'Gender': 'F',
  'Name': 'Matty',
  'State': 'NY'},
 {'_id': ObjectId('5eb51b899993d0175fb74f71'),
  'Gender': 'F',
  'Name': 'Mary',
  'State': 'MS'},
 {'_id': ObjectId('5eb51b899993d0175fb74f98'),
  'Gender': 'M',
  'Name': 'Mikey',
  'State': 'KY'},
 {'_id': ObjectId('5eb51b899993d0175fb74ffc'),
  'Gender': 'M',
  'Name': 'Marley',
  'State': 'TN'},
 {'_id': ObjectId('5eb51b899993d0175fb75107'),
  'Gender': 'F',
  'Name': 'Melody',
  'State': 'UT'},
 {'_id': ObjectId('5eb51b899993d0175fb75109'),
  'Gender': 'F',
  'Name': 'Marely',
  'State': 'ID'},
 {'_id': ObjectId('5eb51b899993d0175fb7516c'),
  'Gender': 'M',
  'Name': 'Montgomery',
  'State': 'TN'},
 {'_id': ObjectId('5eb51b899993d0175fb751a1'),
  'Gender': 'F',
  'Name': 'Mckinley',
  'State': 'WY'},
 {'_id': ObjectId('5eb51b899993d0175fb75208'),
  'Gender': 'F',
  'Name': 'Margy',
  'State': 'PA'},
 {'_id': ObjectId('5eb51b899993d0175fb7526e'),
  'Gender': 'F',
  'Name': 'Mindy',
  'State': 'MI'},
 {'_id': ObjectId('5eb51b899993d0175fb752a6'),
  'Gender': 'F',
  'Name': 'Mckenzy',
  'State': 'MO'},
 {'_id': ObjectId('5eb51b899993d0175fb752c5'),
  'Gender': 'F',
  'Name': 'Marcy',
  'State': 'PA'},
 {'_id': ObjectId('5eb51b899993d0175fb753eb'),
  'Gender': 'M',
  'Name': 'Montgomery',
  'State': 'MO'},
 {'_id': ObjectId('5eb51b899993d0175fb7547b'),
  'Gender': 'F',
  'Name': 'Molly',
  'State': 'VT'},
 {'_id': ObjectId('5eb51b899993d0175fb7547c'),
  'Gender': 'F',
  'Name': 'Merry',
  'State': 'CO'},
 {'_id': ObjectId('5eb51b899993d0175fb75569'),
  'Gender': 'F',
  'Name': 'Miley',
  'State': 'MT'},
 {'_id': ObjectId('5eb51b899993d0175fb755a2'),
  'Gender': 'F',
  'Name': 'Mckinley',
  'State': 'GA'},
 {'_id': ObjectId('5eb51b899993d0175fb75602'),
  'Gender': 'F',
  'Name': 'Marely',
  'State': 'AZ'},
 {'_id': ObjectId('5eb51b899993d0175fb75609'),
  'Gender': 'F',
  'Name': 'Misty',
  'State': 'GA'},
 {'_id': ObjectId('5eb51b899993d0175fb75673'),
  'Gender': 'F',
  'Name': 'Misty',
  'State': 'WY'},
 {'_id': ObjectId('5eb51b899993d0175fb757b6'),
  'Gender': 'M',
  'Name': 'Murphy',
  'State': 'PA'},
 {'_id': ObjectId('5eb51b899993d0175fb7580a'),
  'Gender': 'F',
  'Name': 'Marty',
  'State': 'NE'},
 {'_id': ObjectId('5eb51b899993d0175fb75866'),
  'Gender': 'F',
  'Name': 'Mandy',
  'State': 'CO'},
 {'_id': ObjectId('5eb51b899993d0175fb758d3'),
  'Gender': 'F',
  'Name': 'Misty',
  'State': 'AZ'},
 {'_id': ObjectId('5eb51b899993d0175fb75b7e'),
  'Gender': 'F',
  'Name': 'Mallory',
  'State': 'SD'},
 {'_id': ObjectId('5eb51b899993d0175fb75c62'),
  'Gender': 'F',
  'Name': 'Maizy',
  'State': 'MI'},
 {'_id': ObjectId('5eb51b899993d0175fb75c7a'),
  'Gender': 'F',
  'Name': 'Marjory',
  'State': 'HI'},
 {'_id': ObjectId('5eb51b899993d0175fb75c8f'),
  'Gender': 'F',
  'Name': 'Mindy',
  'State': 'NE'},
 {'_id': ObjectId('5eb51b899993d0175fb75d13'),
  'Gender': 'F',
  'Name': 'Mercy',
  'State': 'MN'},
 {'_id': ObjectId('5eb51b899993d0175fb75d27'),
  'Gender': 'M',
  'Name': 'Marley',
  'State': 'MA'},
 {'_id': ObjectId('5eb51b899993d0175fb75d90'),
  'Gender': 'F',
  'Name': 'Mandy',
  'State': 'MA'},
 {'_id': ObjectId('5eb51b899993d0175fb75dcc'),
  'Gender': 'F',
  'Name': 'Macey',
  'State': 'AZ'},
 {'_id': ObjectId('5eb51b899993d0175fb75dfc'),
  'Gender': 'F',
  'Name': 'Macey',
  'State': 'VA'},
 {'_id': ObjectId('5eb51b899993d0175fb75e60'),
  'Gender': 'F',
  'Name': 'Mckinley',
  'State': 'ND'},
 {'_id': ObjectId('5eb51b899993d0175fb75ec0'),
  'Gender': 'F',
  'Name': 'May',
  'State': 'MA'},
 {'_id': ObjectId('5eb51b899993d0175fb75ef3'),
  'Gender': 'F',
  'Name': 'Marley',
  'State': 'NE'},
 {'_id': ObjectId('5eb51b899993d0175fb75f56'),
  'Gender': 'F',
  'Name': 'Macey',
  'State': 'GA'},
 {'_id': ObjectId('5eb51b899993d0175fb75f71'),
  'Gender': 'F',
  'Name': 'Marcy',
  'State': 'WY'},
 {'_id': ObjectId('5eb51b899993d0175fb75f7f'),
  'Gender': 'F',
  'Name': 'May',
  'State': 'PA'},
 {'_id': ObjectId('5eb51b899993d0175fb760db'),
  'Gender': 'F',
  'Name': 'Mckinley',
  'State': 'VA'},
 {'_id': ObjectId('5eb51b899993d0175fb761ff'),
  'Gender': 'F',
  'Name': 'Marjory',
  'State': 'WV'},
 {'_id': ObjectId('5eb51b899993d0175fb762dd'),
  'Gender': 'F',
  'Name': 'Melody',
  'State': 'GA'},
 {'_id': ObjectId('5eb51b899993d0175fb7649b'),
  'Gender': 'F',
  'Name': 'Mandy',
  'State': 'WY'},
 {'_id': ObjectId('5eb51b899993d0175fb764a5'),
  'Gender': 'F',
  'Name': 'Mellody',
  'State': 'IN'},
 {'_id': ObjectId('5eb51b899993d0175fb764b1'),
  'Gender': 'M',
  'Name': 'Marley',
  'State': 'GA'},
 {'_id': ObjectId('5eb51b899993d0175fb764b5'),
  'Gender': 'F',
  'Name': 'Macy',
  'State': 'MO'},
 {'_id': ObjectId('5eb51b899993d0175fb764fd'),
  'Gender': 'F',
  'Name': 'Marely',
  'State': 'CO'},
 {'_id': ObjectId('5eb51b899993d0175fb764ff'),
  'Gender': 'M',
  'Name': 'Mckinley',
  'State': 'MI'},
 {'_id': ObjectId('5eb51b899993d0175fb76510'),
  'Gender': 'F',
  'Name': 'Marty',
  'State': 'OR'},
 {'_id': ObjectId('5eb51b899993d0175fb76541'),
  'Gender': 'F',
  'Name': 'Macy',
  'State': 'ND'},
 {'_id': ObjectId('5eb51b899993d0175fb7662d'),
  'Gender': 'M',
  'Name': 'Marley',
  'State': 'VA'},
 {'_id': ObjectId('5eb51b899993d0175fb76769'),
  'Gender': 'F',
  'Name': 'Mallary',
  'State': 'GA'},
 {'_id': ObjectId('5eb51b899993d0175fb76786'),
  'Gender': 'F',
  'Name': 'Marcy',
  'State': 'MO'},
 {'_id': ObjectId('5eb51b899993d0175fb768af'),
  'Gender': 'F',
  'Name': 'Molly',
  'State': 'SD'},
 {'_id': ObjectId('5eb51b899993d0175fb768f9'),
  'Gender': 'F',
  'Name': 'Mandy',
  'State': 'ND'},
 {'_id': ObjectId('5eb51b899993d0175fb769c4'),
  'Gender': 'F',
  'Name': 'Melody',
  'State': 'AZ'},
 {'_id': ObjectId('5eb51b899993d0175fb769d2'),
  'Gender': 'F',
  'Name': 'Mary',
  'State': 'AK'},
 {'_id': ObjectId('5eb51b899993d0175fb76b47'),
  'Gender': 'F',
  'Name': 'Melony',
  'State': 'MI'},
 {'_id': ObjectId('5eb51b899993d0175fb76c3b'),
  'Gender': 'F',
  'Name': 'Merry',
  'State': 'PA'},
 {'_id': ObjectId('5eb51b899993d0175fb76d86'),
  'Gender': 'F',
  'Name': 'Mandy',
  'State': 'MO'},
 {'_id': ObjectId('5eb51b899993d0175fb76da8'),
  'Gender': 'F',
  'Name': 'Macy',
  'State': 'GA'},
 {'_id': ObjectId('5eb51b899993d0175fb76eab'),
  'Gender': 'F',
  'Name': 'Mary',
  'State': 'PA'},
 {'_id': ObjectId('5eb51b899993d0175fb76eb2'),
  'Gender': 'F',
  'Name': 'Merry',
  'State': 'GA'},
 {'_id': ObjectId('5eb51b899993d0175fb76ec1'),
  'Gender': 'F',
  'Name': 'Marykay',
  'State': 'PA'},
 {'_id': ObjectId('5eb51b899993d0175fb76fa2'),
  'Gender': 'F',
  'Name': 'Melody',
  'State': 'CO'},
 {'_id': ObjectId('5eb51b899993d0175fb77034'),
  'Gender': 'F',
  'Name': 'Merry',
  'State': 'VA'},
 {'_id': ObjectId('5eb51b899993d0175fb77194'),
  'Gender': 'F',
  'Name': 'Misty',
  'State': 'CO'},
 {'_id': ObjectId('5eb51b899993d0175fb772eb'),
  'Gender': 'F',
  'Name': 'Mckinley',
  'State': 'IN'},
 {'_id': ObjectId('5eb51b899993d0175fb773b5'),
  'Gender': 'M',
  'Name': 'Monty',
  'State': 'MT'},
 {'_id': ObjectId('5eb51b899993d0175fb77403'),
  'Gender': 'F',
  'Name': 'Macy',
  'State': 'WA'},
 {'_id': ObjectId('5eb51b899993d0175fb7773d'),
  'Gender': 'F',
  'Name': 'Majesty',
  'State': 'FL'},
 {'_id': ObjectId('5eb51b899993d0175fb77758'),
  'Gender': 'M',
  'Name': 'Marcanthony',
  'State': 'FL'},
 {'_id': ObjectId('5eb51b899993d0175fb77781'),
  'Gender': 'F',
  'Name': 'May',
  'State': 'NH'},
 {'_id': ObjectId('5eb51b899993d0175fb77802'),
  'Gender': 'M',
  'Name': 'Mary',
  'State': 'IA'},
 {'_id': ObjectId('5eb51b899993d0175fb77990'),
  'Gender': 'F',
  'Name': 'Macey',
  'State': 'WA'},
 {'_id': ObjectId('5eb51b899993d0175fb779b1'),
  'Gender': 'M',
  'Name': 'Manley',
  'State': 'SC'},
 {'_id': ObjectId('5eb51b899993d0175fb77a6b'),
  'Gender': 'M',
  'Name': 'Marley',
  'State': 'LA'},
 {'_id': ObjectId('5eb51b899993d0175fb77aab'),
  'Gender': 'F',
  'Name': 'Marcy',
  'State': 'CA'},
 {'_id': ObjectId('5eb51b899993d0175fb77b3f'),
  'Gender': 'F',
  'Name': 'Maisy',
  'State': 'NJ'},
 {'_id': ObjectId('5eb51b899993d0175fb77ba4'),
  'Gender': 'M',
  'Name': 'Marty',
  'State': 'NV'},
 {'_id': ObjectId('5eb51b899993d0175fb77d3a'),
  'Gender': 'F',
  'Name': 'Mercy',
  'State': 'LA'},
 {'_id': ObjectId('5eb51b899993d0175fb77da4'),
  'Gender': 'F',
  'Name': 'Mailey',
  'State': 'CA'},
 {'_id': ObjectId('5eb51b899993d0175fb77e00'),
  'Gender': 'M',
  'Name': 'Monty',
  'State': 'SC'},
 {'_id': ObjectId('5eb51b899993d0175fb77e90'),
  'Gender': 'M',
  'Name': 'Malachy',
  'State': 'CA'},
 {'_id': ObjectId('5eb51b899993d0175fb77f5d'),
  'Gender': 'F',
  'Name': 'Micky',
  'State': 'CA'},
 {'_id': ObjectId('5eb51b899993d0175fb77fc9'),
  'Gender': 'M',
  'Name': 'Monty',
  'State': 'IA'},
 {'_id': ObjectId('5eb51b899993d0175fb78066'),
  'Gender': 'M',
  'Name': 'Maury',
  'State': 'IL'},
 {'_id': ObjectId('5eb51b899993d0175fb78088'),
  'Gender': 'F',
  'Name': 'Melody',
  'State': 'DE'},
 {'_id': ObjectId('5eb51b899993d0175fb780fa'),
  'Gender': 'M',
  'Name': 'Murphy',
  'State': 'NC'},
 {'_id': ObjectId('5eb51b899993d0175fb78107'),
  'Gender': 'F',
  'Name': 'Marley',
  'State': 'CA'},
 {'_id': ObjectId('5eb51b899993d0175fb78148'),
  'Gender': 'F',
  'Name': 'Mandy',
  'State': 'DE'},
 {'_id': ObjectId('5eb51b899993d0175fb782a9'),
  'Gender': 'F',
  'Name': 'Missy',
  'State': 'SC'},
 {'_id': ObjectId('5eb51b899993d0175fb7836b'),
  'Gender': 'F',
  'Name': 'Mindy',
  'State': 'NC'},
 {'_id': ObjectId('5eb51b899993d0175fb783e7'),
  'Gender': 'M',
  'Name': 'Maury',
  'State': 'FL'},
 {'_id': ObjectId('5eb51b899993d0175fb7848e'),
  'Gender': 'F',
  'Name': 'Mary',
  'State': 'WA'},
 {'_id': ObjectId('5eb51b899993d0175fb785ea'),
  'Gender': 'F',
  'Name': 'Mandy',
  'State': 'AR'},
 {'_id': ObjectId('5eb51b899993d0175fb78606'),
  'Gender': 'F',
  'Name': 'Macy',
  'State': 'LA'},
 {'_id': ObjectId('5eb51b899993d0175fb786eb'),
  'Gender': 'F',
  'Name': 'Missy',
  'State': 'IA'},
 {'_id': ObjectId('5eb51b899993d0175fb786fa'),
  'Gender': 'F',
  'Name': 'Marcy',
  'State': 'AR'},
 {'_id': ObjectId('5eb51b899993d0175fb78741'),
  'Gender': 'F',
  'Name': 'May',
  'State': 'AR'},
 {'_id': ObjectId('5eb51b899993d0175fb7874e'),
  'Gender': 'F',
  'Name': 'Mercy',
  'State': 'AR'},
 {'_id': ObjectId('5eb51b899993d0175fb787f1'),
  'Gender': 'M',
  'Name': 'Murry',
  'State': 'FL'},
 {'_id': ObjectId('5eb51b899993d0175fb78820'),
  'Gender': 'F',
  'Name': 'Mandy',
  'State': 'CA'},
 {'_id': ObjectId('5eb51b899993d0175fb7883a'),
  'Gender': 'F',
  'Name': 'Miley',
  'State': 'NJ'},
 {'_id': ObjectId('5eb51b899993d0175fb788ab'),
  'Gender': 'F',
  'Name': 'May',
  'State': 'WA'},
 {'_id': ObjectId('5eb51b899993d0175fb7892c'),
  'Gender': 'F',
  'Name': 'Marly',
  'State': 'FL'},
 {'_id': ObjectId('5eb51b899993d0175fb78b25'),
  'Gender': 'F',
  'Name': 'Marry',
  'State': 'AR'},
 {'_id': ObjectId('5eb51b899993d0175fb78cc0'),
  'Gender': 'F',
  'Name': 'Marry',
  'State': 'CA'},
 {'_id': ObjectId('5eb51b899993d0175fb78d67'),
  'Gender': 'F',
  'Name': 'Mckinley',
  'State': 'NV'},
 {'_id': ObjectId('5eb51b899993d0175fb78e77'),
  'Gender': 'M',
  'Name': 'Marley',
  'State': 'WA'},
 {'_id': ObjectId('5eb51b899993d0175fb78f45'),
  'Gender': 'F',
  'Name': 'Makinley',
  'State': 'FL'},
 {'_id': ObjectId('5eb51b899993d0175fb78f5d'),
  'Gender': 'M',
  'Name': 'Mikey',
  'State': 'CA'},
 {'_id': ObjectId('5eb51b899993d0175fb78fe1'),
  'Gender': 'M',
  'Name': 'Mickey',
  'State': 'SC'},
 {'_id': ObjectId('5eb51b899993d0175fb79023'),
  'Gender': 'F',
  'Name': 'Mahogany',
  'State': 'IL'},
 {'_id': ObjectId('5eb51b899993d0175fb793d4'),
  'Gender': 'F',
  'Name': 'Macey',
  'State': 'NC'},
 {'_id': ObjectId('5eb51b899993d0175fb79446'),
  'Gender': 'M',
  'Name': 'Murray',
  'State': 'FL'},
 {'_id': ObjectId('5eb51b899993d0175fb7947c'),
  'Gender': 'M',
  'Name': 'Montgomery',
  'State': 'NC'},
 {'_id': ObjectId('5eb51b899993d0175fb794ef'),
  'Gender': 'M',
  'Name': 'Micky',
  'State': 'NC'},
 {'_id': ObjectId('5eb51b899993d0175fb79607'),
  'Gender': 'M',
  'Name': 'Mckinley',
  'State': 'NC'},
 {'_id': ObjectId('5eb51b899993d0175fb79702'),
  'Gender': 'F',
  'Name': 'Miley',
  'State': 'OK'},
 {'_id': ObjectId('5eb51b899993d0175fb7978e'),
  'Gender': 'F',
  'Name': 'Molly',
  'State': 'SC'},
 {'_id': ObjectId('5eb51b899993d0175fb797b6'),
  'Gender': 'F',
  'Name': 'Mackenzy',
  'State': 'FL'},
 {'_id': ObjectId('5eb51b899993d0175fb7999a'),
  'Gender': 'F',
  'Name': 'Margery',
  'State': 'SC'},
 {'_id': ObjectId('5eb51b899993d0175fb79a50'),
  'Gender': 'F',
  'Name': 'Misty',
  'State': 'LA'},
 {'_id': ObjectId('5eb51b899993d0175fb79e60'),
  'Gender': 'F',
  'Name': 'Macy',
  'State': 'NV'},
 {'_id': ObjectId('5eb51b899993d0175fb79ee3'),
  'Gender': 'F',
  'Name': 'Mily',
  'State': 'CA'},
 {'_id': ObjectId('5eb51b899993d0175fb79f2f'),
  'Gender': 'M',
  'Name': 'Manny',
  'State': 'IL'},
 {'_id': ObjectId('5eb51b899993d0175fb79f34'),
  'Gender': 'M',
  'Name': 'Murry',
  'State': 'SC'},
 {'_id': ObjectId('5eb51b899993d0175fb7a036'),
  'Gender': 'F',
  'Name': 'Missy',
  'State': 'OK'},
 {'_id': ObjectId('5eb51b899993d0175fb7a0bd'),
  'Gender': 'F',
  'Name': 'Marty',
  'State': 'AR'},
 {'_id': ObjectId('5eb51b899993d0175fb7a113'),
  'Gender': 'M',
  'Name': 'Murray',
  'State': 'NJ'},
 {'_id': ObjectId('5eb51b899993d0175fb7a2d4'),
  'Gender': 'F',
  'Name': 'Misty',
  'State': 'NH'},
 {'_id': ObjectId('5eb51b899993d0175fb7a48b'),
  'Gender': 'F',
  'Name': 'Melany',
  'State': 'NJ'},
 {'_id': ObjectId('5eb51b899993d0175fb7a4bf'),
  'Gender': 'F',
  'Name': 'Macey',
  'State': 'NV'},
 {'_id': ObjectId('5eb51b899993d0175fb7a525'),
  'Gender': 'F',
  'Name': 'Marley',
  'State': 'NV'},
 {'_id': ObjectId('5eb51b899993d0175fb7a5be'),
  'Gender': 'F',
  'Name': 'Mandy',
  'State': 'NV'},
 {'_id': ObjectId('5eb51b899993d0175fb7a6a8'),
  'Gender': 'F',
  'Name': 'Malky',
  'State': 'NJ'},
 {'_id': ObjectId('5eb51b899993d0175fb7a756'),
  'Gender': 'F',
  'Name': 'Melany',
  'State': 'MD'},
 {'_id': ObjectId('5eb51b899993d0175fb7a7fb'),
  'Gender': 'M',
  'Name': 'Marty',
  'State': 'LA'},
 {'_id': ObjectId('5eb51b899993d0175fb7a9ac'),
  'Gender': 'F',
  'Name': 'Marley',
  'State': 'DE'},
 {'_id': ObjectId('5eb51b899993d0175fb7a9be'),
  'Gender': 'F',
  'Name': 'Miley',
  'State': 'IA'},
 {'_id': ObjectId('5eb51b899993d0175fb7a9ce'),
  'Gender': 'M',
  'Name': 'Markanthony',
  'State': 'NJ'},
 {'_id': ObjectId('5eb51b899993d0175fb7aaa0'),
  'Gender': 'M',
  'Name': 'Morrissey',
  'State': 'CA'},
 {'_id': ObjectId('5eb51b899993d0175fb7aab7'),
  'Gender': 'F',
  'Name': 'Mercy',
  'State': 'CA'},
 {'_id': ObjectId('5eb51b899993d0175fb7aaf5'),
  'Gender': 'F',
  'Name': 'Mary',
  'State': 'NC'},
 {'_id': ObjectId('5eb51b899993d0175fb7ad96'),
  'Gender': 'F',
  'Name': 'Mickey',
  'State': 'FL'},
 {'_id': ObjectId('5eb51b899993d0175fb7af5d'),
  'Gender': 'F',
  'Name': 'Melony',
  'State': 'LA'},
 {'_id': ObjectId('5eb51b899993d0175fb7af7a'),
  'Gender': 'M',
  'Name': 'Monty',
  'State': 'FL'},
 {'_id': ObjectId('5eb51b899993d0175fb7af9a'),
  'Gender': 'F',
  'Name': 'Melody',
  'State': 'AR'},
 {'_id': ObjectId('5eb51b899993d0175fb7afe7'),
  'Gender': 'M',
  'Name': 'Monty',
  'State': 'OK'},
 {'_id': ObjectId('5eb51b899993d0175fb7b142'),
  'Gender': 'M',
  'Name': 'Marcanthony',
  'State': 'NJ'},
 {'_id': ObjectId('5eb51b899993d0175fb7b297'),
  'Gender': 'F',
  'Name': 'Mckinley',
  'State': 'DE'},
 {'_id': ObjectId('5eb51b899993d0175fb7b340'),
  'Gender': 'F',
  'Name': 'Marty',
  'State': 'WA'},
 {'_id': ObjectId('5eb51b899993d0175fb7b412'),
  'Gender': 'F',
  'Name': 'Maddy',
  'State': 'CA'},
 {'_id': ObjectId('5eb51b899993d0175fb7b433'),
  'Gender': 'F',
  'Name': 'Mendy',
  'State': 'FL'},
 {'_id': ObjectId('5eb51b899993d0175fb7b46f'),
  'Gender': 'F',
  'Name': 'Mindy',
  'State': 'DE'},
 {'_id': ObjectId('5eb51b899993d0175fb7b4bc'),
  'Gender': 'F',
  'Name': 'Mendy',
  'State': 'OK'},
 {'_id': ObjectId('5eb51b899993d0175fb7b4ec'),
  'Gender': 'F',
  'Name': 'Melody',
  'State': 'NC'},
 {'_id': ObjectId('5eb51b899993d0175fb7b51f'),
  'Gender': 'M',
  'Name': 'Molly',
  'State': 'CA'},
 {'_id': ObjectId('5eb51b899993d0175fb7b5aa'),
  'Gender': 'F',
  'Name': 'Marjory',
  'State': 'IA'},
 {'_id': ObjectId('5eb51b899993d0175fb7b5c6'),
  'Gender': 'M',
  'Name': 'Macaulay',
  'State': 'IA'},
 {'_id': ObjectId('5eb51b899993d0175fb7b63e'),
  'Gender': 'F',
  'Name': 'Molly',
  'State': 'NJ'},
 {'_id': ObjectId('5eb51b899993d0175fb7b658'),
  'Gender': 'F',
  'Name': 'Margery',
  'State': 'OK'},
 {'_id': ObjectId('5eb51b899993d0175fb7b6f4'),
  'Gender': 'F',
  'Name': 'Margery',
  'State': 'IL'},
 {'_id': ObjectId('5eb51b899993d0175fb7b827'),
  'Gender': 'M',
  'Name': 'Mary',
  'State': 'MD'},
 {'_id': ObjectId('5eb51b899993d0175fb7b834'),
  'Gender': 'M',
  'Name': 'Markanthony',
  'State': 'IL'},
 {'_id': ObjectId('5eb51b899993d0175fb7b92f'),
  'Gender': 'F',
  'Name': 'Marshay',
  'State': 'IL'},
 {'_id': ObjectId('5eb51b899993d0175fb7b952'),
  'Gender': 'M',
  'Name': 'Markanthony',
  'State': 'FL'},
 {'_id': ObjectId('5eb51b899993d0175fb7b9a7'),
  'Gender': 'F',
  'Name': 'Mitzy',
  'State': 'CA'},
 {'_id': ObjectId('5eb51b899993d0175fb7b9e8'),
  'Gender': 'F',
  'Name': 'Mickey',
  'State': 'IN'},
 {'_id': ObjectId('5eb51b899993d0175fb7bb76'),
  'Gender': 'M',
  'Name': 'Monty',
  'State': 'IN'},
 {'_id': ObjectId('5eb51b899993d0175fb7bbd2'),
  'Gender': 'F',
  'Name': 'Margery',
  'State': 'MO'},
 {'_id': ObjectId('5eb51b899993d0175fb7bcb1'),
  'Gender': 'F',
  'Name': 'Molly',
  'State': 'VA'},
 {'_id': ObjectId('5eb51b899993d0175fb7be90'),
  'Gender': 'F',
  'Name': 'Melody',
  'State': 'SD'},
 {'_id': ObjectId('5eb51b899993d0175fb7c08d'),
  'Gender': 'M',
  'Name': 'Murray',
  'State': 'CO'},
 {'_id': ObjectId('5eb51b899993d0175fb7c142'),
  'Gender': 'M',
  'Name': 'Mickey',
  'State': 'GA'},
 {'_id': ObjectId('5eb51b899993d0175fb7c19a'),
  'Gender': 'F',
  'Name': 'Marjory',
  'State': 'PA'},
 {'_id': ObjectId('5eb51b899993d0175fb7c244'),
  'Gender': 'M',
  'Name': 'Mckay',
  'State': 'AZ'},
 {'_id': ObjectId('5eb51b899993d0175fb7c344'),
  'Gender': 'M',
  'Name': 'Mickey',
  'State': 'VA'},
 {'_id': ObjectId('5eb51b899993d0175fb7c4fb'),
  'Gender': 'F',
  'Name': 'Molly',
  'State': 'GA'},
 {'_id': ObjectId('5eb51b899993d0175fb7c5a6'),
  'Gender': 'F',
  'Name': 'Mallory',
  'State': 'PA'},
 {'_id': ObjectId('5eb51b899993d0175fb7c5ce'),
  'Gender': 'F',
  'Name': 'Molly',
  'State': 'MA'},
 {'_id': ObjectId('5eb51b899993d0175fb7c633'),
  'Gender': 'M',
  'Name': 'Mary',
  'State': 'AZ'},
 {'_id': ObjectId('5eb51b899993d0175fb7c6e1'),
  'Gender': 'F',
  'Name': 'Margery',
  'State': 'GA'},
 {'_id': ObjectId('5eb51b899993d0175fb7caa6'),
  'Gender': 'M',
  'Name': 'Mickey',
  'State': 'PA'},
 {'_id': ObjectId('5eb51b899993d0175fb7cb41'),
  'Gender': 'F',
  'Name': 'Margery',
  'State': 'VA'},
 {'_id': ObjectId('5eb51b899993d0175fb7cbe1'),
  'Gender': 'M',
  'Name': 'Monty',
  'State': 'PA'},
 {'_id': ObjectId('5eb51b899993d0175fb7cd10'),
  'Gender': 'M',
  'Name': 'Manny',
  'State': 'AZ'},
 {'_id': ObjectId('5eb51b899993d0175fb7cece'),
  'Gender': 'F',
  'Name': 'Maisy',
  'State': 'CO'},
 {'_id': ObjectId('5eb51b899993d0175fb7cfae'),
  'Gender': 'M',
  'Name': 'Monty',
  'State': 'AZ'},
 {'_id': ObjectId('5eb51b899993d0175fb7cfbe'),
  'Gender': 'F',
  'Name': 'Melody',
  'State': 'HI'},
 {'_id': ObjectId('5eb51b899993d0175fb7cfd7'),
  'Gender': 'F',
  'Name': 'Margery',
  'State': 'PA'},
 {'_id': ObjectId('5eb51b899993d0175fb7d1a6'),
  'Gender': 'M',
  'Name': 'Mickey',
  'State': 'MA'},
 {'_id': ObjectId('5eb51b899993d0175fb7d279'),
  'Gender': 'F',
  'Name': 'Margery',
  'State': 'CO'},
 {'_id': ObjectId('5eb51b899993d0175fb7d2a4'),
  'Gender': 'F',
  'Name': 'Molly',
  'State': 'IN'},
 {'_id': ObjectId('5eb51b899993d0175fb7d349'),
  'Gender': 'F',
  'Name': 'Mindy',
  'State': 'WV'},
 {'_id': ObjectId('5eb51b899993d0175fb7d408'),
  'Gender': 'F',
  'Name': 'Mary',
  'State': 'VT'},
 {'_id': ObjectId('5eb51b899993d0175fb7d4b7'),
  'Gender': 'F',
  'Name': 'Miley',
  'State': 'GA'},
 {'_id': ObjectId('5eb51b899993d0175fb7d4e7'),
  'Gender': 'F',
  'Name': 'Merrily',
  'State': 'PA'},
 {'_id': ObjectId('5eb51b899993d0175fb7d578'),
  'Gender': 'F',
  'Name': 'Mickey',
  'State': 'AZ'},
 {'_id': ObjectId('5eb51b899993d0175fb7d641'),
  'Gender': 'F',
  'Name': 'Mallory',
  'State': 'MO'},
 {'_id': ObjectId('5eb51b899993d0175fb7d66f'),
  'Gender': 'F',
  'Name': 'Mckinley',
  'State': 'SD'},
 {'_id': ObjectId('5eb51b899993d0175fb7d723'),
  'Gender': 'F',
  'Name': 'Miley',
  'State': 'AZ'},
 {'_id': ObjectId('5eb51b899993d0175fb7d7e6'),
  'Gender': 'F',
  'Name': 'Miley',
  'State': 'DC'},
 {'_id': ObjectId('5eb51b899993d0175fb7d952'),
  'Gender': 'F',
  'Name': 'Macey',
  'State': 'WV'},
 {'_id': ObjectId('5eb51b899993d0175fb7da2d'),
  'Gender': 'M',
  'Name': 'Murray',
  'State': 'ND'},
 {'_id': ObjectId('5eb51b899993d0175fb7dd54'),
  'Gender': 'F',
  'Name': 'May',
  'State': 'WV'},
 {'_id': ObjectId('5eb51b899993d0175fb7e15c'),
  'Gender': 'F',
  'Name': 'Molly',
  'State': 'AK'},
 {'_id': ObjectId('5eb51b899993d0175fb7e278'),
  'Gender': 'F',
  'Name': 'Miley',
  'State': 'MA'},
 {'_id': ObjectId('5eb51b899993d0175fb7e29b'),
  'Gender': 'F',
  'Name': 'Malory',
  'State': 'PA'},
 {'_id': ObjectId('5eb51b899993d0175fb7e48c'),
  'Gender': 'F',
  'Name': 'Mendy',
  'State': 'IN'},
 {'_id': ObjectId('5eb51b899993d0175fb7e579'),
  'Gender': 'M',
  'Name': 'Marty',
  'State': 'SD'},
 {'_id': ObjectId('5eb51b899993d0175fb7e58b'),
  'Gender': 'F',
  'Name': 'Mallory',
  'State': 'DC'},
 {'_id': ObjectId('5eb51b899993d0175fb7e5b1'),
  'Gender': 'F',
  'Name': 'Margery',
  'State': 'IN'},
 {'_id': ObjectId('5eb51b899993d0175fb7e636'),
  'Gender': 'F',
  'Name': 'Mallory',
  'State': 'VA'},
 {'_id': ObjectId('5eb51b899993d0175fb7e66f'),
  'Gender': 'F',
  'Name': 'Maisy',
  'State': 'MO'},
 {'_id': ObjectId('5eb51b899993d0175fb7e6ae'),
  'Gender': 'F',
  'Name': 'Merrily',
  'State': 'MO'},
 {'_id': ObjectId('5eb51b899993d0175fb7e6c3'),
  'Gender': 'M',
  'Name': 'Marty',
  'State': 'WV'},
 {'_id': ObjectId('5eb51b899993d0175fb7e6cb'),
  'Gender': 'F',
  'Name': 'Margaretmary',
  'State': 'PA'},
 {'_id': ObjectId('5eb51b899993d0175fb7e6de'),
  'Gender': 'F',
  'Name': 'Molly',
  'State': 'ND'},
 {'_id': ObjectId('5eb51b899993d0175fb7e734'),
  'Gender': 'M',
  'Name': 'Mickey',
  'State': 'ND'},
 {'_id': ObjectId('5eb51b899993d0175fb7e760'),
  'Gender': 'F',
  'Name': 'Macey',
  'State': 'SD'},
 {'_id': ObjectId('5eb51b899993d0175fb7e967'),
  'Gender': 'M',
  'Name': 'Murray',
  'State': 'PA'},
 {'_id': ObjectId('5eb51b899993d0175fb7ea23'),
  'Gender': 'F',
  'Name': 'Mandy',
  'State': 'SD'},
 {'_id': ObjectId('5eb51b899993d0175fb7ea8c'),
  'Gender': 'F',
  'Name': 'Marjory',
  'State': 'CO'},
 {'_id': ObjectId('5eb51b899993d0175fb7eaec'),
  'Gender': 'F',
  'Name': 'Melany',
  'State': 'IN'},
 {'_id': ObjectId('5eb51b899993d0175fb7eb39'),
  'Gender': 'F',
  'Name': 'Miley',
  'State': 'PA'},
 {'_id': ObjectId('5eb51b899993d0175fb7eba4'),
  'Gender': 'F',
  'Name': 'Molly',
  'State': 'MO'},
 {'_id': ObjectId('5eb51b899993d0175fb7ecb1'),
  'Gender': 'F',
  'Name': 'Margy',
  'State': 'TX'},
 {'_id': ObjectId('5eb51b899993d0175fb7eea9'),
  'Gender': 'F',
  'Name': 'Makenzy',
  'State': 'MS'},
 {'_id': ObjectId('5eb51b899993d0175fb7ef04'),
  'Gender': 'F',
  'Name': 'Marykay',
  'State': 'OH'},
 {'_id': ObjectId('5eb51b899993d0175fb7ef2f'),
  'Gender': 'M',
  'Name': 'Mccoy',
  'State': 'WI'},
 {'_id': ObjectId('5eb51b899993d0175fb7ef7a'),
  'Gender': 'F',
  'Name': 'Malky',
  'State': 'NY'},
 {'_id': ObjectId('5eb51b899993d0175fb7eff7'),
  'Gender': 'F',
  'Name': 'Marshay',
  'State': 'OH'},
 {'_id': ObjectId('5eb51b899993d0175fb7f042'),
  'Gender': 'M',
  'Name': 'Marley',
  'State': 'NM'},
 {'_id': ObjectId('5eb51b899993d0175fb7f21c'),
  'Gender': 'F',
  'Name': 'May',
  'State': 'NM'},
 {'_id': ObjectId('5eb51b899993d0175fb7f246'),
  'Gender': 'F',
  'Name': 'Marykay',
  'State': 'NY'},
 {'_id': ObjectId('5eb51b899993d0175fb7f440'),
  'Gender': 'F',
  'Name': 'Mckinley',
  'State': 'TX'},
 {'_id': ObjectId('5eb51b899993d0175fb7f479'),
  'Gender': 'M',
  'Name': 'Murphy',
  'State': 'TX'},
 {'_id': ObjectId('5eb51b899993d0175fb7f4d4'),
  'Gender': 'F',
  'Name': 'Mahogany',
  'State': 'NY'},
 {'_id': ObjectId('5eb51b899993d0175fb7f4db'),
  'Gender': 'F',
  'Name': 'Makinley',
  'State': 'AL'},
 {'_id': ObjectId('5eb51b899993d0175fb7f524'),
  'Gender': 'F',
  'Name': 'Marty',
  'State': 'MS'},
 {'_id': ObjectId('5eb51b899993d0175fb7f5c7'),
  'Gender': 'M',
  'Name': 'Morty',
  'State': 'NY'},
 {'_id': ObjectId('5eb51b899993d0175fb7f5f1'),
  'Gender': 'F',
  'Name': 'Miley',
  'State': 'AL'},
 {'_id': ObjectId('5eb51b899993d0175fb7f622'),
  'Gender': 'F',
  'Name': 'May',
  'State': 'NY'},
 {'_id': ObjectId('5eb51b899993d0175fb7f86b'),
  'Gender': 'F',
  'Name': 'Mallory',
  'State': 'TX'},
 {'_id': ObjectId('5eb51b899993d0175fb7f965'),
  'Gender': 'F',
  'Name': 'Melody',
  'State': 'NM'},
 {'_id': ObjectId('5eb51b899993d0175fb7f97c'),
  'Gender': 'F',
  'Name': 'Macy',
  'State': 'OH'},
 {'_id': ObjectId('5eb51b899993d0175fb7f98c'),
  'Gender': 'F',
  'Name': 'Merry',
  'State': 'TX'},
 {'_id': ObjectId('5eb51b899993d0175fb7fcba'),
  'Gender': 'M',
  'Name': 'Murray',
  'State': 'AL'},
 {'_id': ObjectId('5eb51b899993d0175fb7fcd1'),
  'Gender': 'F',
  'Name': 'Marty',
  'State': 'KY'},
 {'_id': ObjectId('5eb51b899993d0175fb7fd0e'),
  'Gender': 'F',
  'Name': 'Miley',
  'State': 'KS'},
 {'_id': ObjectId('5eb51b899993d0175fb7fd4d'),
  'Gender': 'M',
  'Name': 'Micky',
  'State': 'MS'},
 {'_id': ObjectId('5eb51b899993d0175fb7fed7'),
  'Gender': 'F',
  'Name': 'Molly',
  'State': 'TX'},
 {'_id': ObjectId('5eb51b899993d0175fb7ff24'),
  'Gender': 'F',
  'Name': 'Marty',
  'State': 'TN'},
 {'_id': ObjectId('5eb51b899993d0175fb7ffef'),
  'Gender': 'F',
  'Name': 'Mindy',
  'State': 'MS'},
 {'_id': ObjectId('5eb51b899993d0175fb80017'),
  'Gender': 'F',
  'Name': 'Mercy',
  'State': 'KY'},
 {'_id': ObjectId('5eb51b899993d0175fb8004e'),
  'Gender': 'F',
  'Name': 'Makinley',
  'State': 'TN'},
 {'_id': ObjectId('5eb51b899993d0175fb80136'),
  'Gender': 'F',
  'Name': 'Missy',
  'State': 'NY'},
 {'_id': ObjectId('5eb51b899993d0175fb803a5'),
  'Gender': 'F',
  'Name': 'Marty',
  'State': 'KS'},
 {'_id': ObjectId('5eb51b899993d0175fb803be'),
  'Gender': 'F',
  'Name': 'Meleny',
  'State': 'TX'},
 {'_id': ObjectId('5eb51b899993d0175fb804a5'),
  'Gender': 'F',
  'Name': 'Mendy',
  'State': 'TX'},
 {'_id': ObjectId('5eb51b899993d0175fb8051c'),
  'Gender': 'M',
  'Name': 'Murray',
  'State': 'TN'},
 {'_id': ObjectId('5eb51b899993d0175fb805b3'),
  'Gender': 'F',
  'Name': 'Marley',
  'State': 'MS'},
 {'_id': ObjectId('5eb51b899993d0175fb80693'),
  'Gender': 'F',
  'Name': 'May',
  'State': 'TX'},
 {'_id': ObjectId('5eb51b899993d0175fb8069e'),
  'Gender': 'M',
  'Name': 'Marty',
  'State': 'WI'},
 {'_id': ObjectId('5eb51b899993d0175fb806ab'),
  'Gender': 'F',
  'Name': 'Mabry',
  'State': 'MS'},
 {'_id': ObjectId('5eb51b899993d0175fb806cf'),
  'Gender': 'F',
  'Name': 'Melody',
  'State': 'ME'},
 {'_id': ObjectId('5eb51b899993d0175fb8070b'),
  'Gender': 'M',
  'Name': 'Montgomery',
  'State': 'TX'},
 {'_id': ObjectId('5eb51b899993d0175fb8070e'),
  'Gender': 'M',
  'Name': 'Maury',
  'State': 'NY'},
 {'_id': ObjectId('5eb51b899993d0175fb80813'),
  'Gender': 'F',
  'Name': 'Mckinsey',
  'State': 'OH'},
 {'_id': ObjectId('5eb51b899993d0175fb8084e'),
  'Gender': 'F',
  'Name': 'Mallory',
  'State': 'NY'},
 {'_id': ObjectId('5eb51b899993d0175fb80a61'),
  'Gender': 'F',
  'Name': 'Molly',
  'State': 'NM'},
 {'_id': ObjectId('5eb51b899993d0175fb80ab0'),
  'Gender': 'F',
  'Name': 'Macy',
  'State': 'ME'},
 {'_id': ObjectId('5eb51b899993d0175fb80b5e'),
  'Gender': 'M',
  'Name': 'Mickey',
  'State': 'AL'},
 {'_id': ObjectId('5eb51b899993d0175fb80b70'),
  'Gender': 'F',
  'Name': 'Mercy',
  'State': 'MS'},
 {'_id': ObjectId('5eb51b899993d0175fb80be2'),
  'Gender': 'F',
  'Name': 'Mindy',
  'State': 'KS'},
 {'_id': ObjectId('5eb51b899993d0175fb80c95'),
  'Gender': 'F',
  'Name': 'Maisy',
  'State': 'UT'},
 {'_id': ObjectId('5eb51b899993d0175fb80cd1'),
  'Gender': 'F',
  'Name': 'Milady',
  'State': 'TX'},
 {'_id': ObjectId('5eb51b899993d0175fb80d29'),
  'Gender': 'F',
  'Name': 'Merrily',
  'State': 'TX'},
 {'_id': ObjectId('5eb51b899993d0175fb80e10'),
  'Gender': 'F',
  'Name': 'Makenzy',
  'State': 'TN'},
 {'_id': ObjectId('5eb51b899993d0175fb80e46'),
  'Gender': 'F',
  'Name': 'Macey',
  'State': 'ME'},
 {'_id': ObjectId('5eb51b899993d0175fb80f30'),
  'Gender': 'M',
  'Name': 'Mary',
  'State': 'KS'},
 {'_id': ObjectId('5eb51b899993d0175fb80f7c'),
  'Gender': 'F',
  'Name': 'Mileidy',
  'State': 'TX'},
 {'_id': ObjectId('5eb51b899993d0175fb80fad'),
  'Gender': 'M',
  'Name': 'Murray',
  'State': 'KS'},
 {'_id': ObjectId('5eb51b899993d0175fb81060'),
  'Gender': 'F',
  'Name': 'Molly',
  'State': 'NY'},
 {'_id': ObjectId('5eb51b899993d0175fb81085'),
  'Gender': 'F',
  'Name': 'Miley',
  'State': 'MS'},
 {'_id': ObjectId('5eb51b899993d0175fb81148'),
  'Gender': 'F',
  'Name': 'Mckenzy',
  'State': 'TX'},
 {'_id': ObjectId('5eb51b899993d0175fb81371'),
  'Gender': 'F',
  'Name': 'Marely',
  'State': 'NM'},
 {'_id': ObjectId('5eb51b899993d0175fb81636'),
  'Gender': 'F',
  'Name': 'Misty',
  'State': 'OH'},
 {'_id': ObjectId('5eb51b899993d0175fb81646'),
  'Gender': 'F',
  'Name': 'Mindy',
  'State': 'TN'},
 {'_id': ObjectId('5eb51b899993d0175fb816f1'),
  'Gender': 'F',
  'Name': 'Marly',
  'State': 'NY'},
 {'_id': ObjectId('5eb51b899993d0175fb81861'),
  'Gender': 'F',
  'Name': 'Marcy',
  'State': 'NY'},
 {'_id': ObjectId('5eb51b899993d0175fb81890'),
  'Gender': 'M',
  'Name': 'Mckinley',
  'State': 'TN'},
 {'_id': ObjectId('5eb51b899993d0175fb81896'),
  'Gender': 'F',
  'Name': 'Mercy',
  'State': 'WI'},
 {'_id': ObjectId('5eb51b899993d0175fb818d6'),
  'Gender': 'M',
  'Name': 'Murry',
  'State': 'TX'},
 {'_id': ObjectId('5eb51b899993d0175fb81959'),
  'Gender': 'M',
  'Name': 'Mckay',
  'State': 'ID'},
 {'_id': ObjectId('5eb51b899993d0175fb819fb'),
  'Gender': 'F',
  'Name': 'Macey',
  'State': 'OH'},
 {'_id': ObjectId('5eb51b899993d0175fb81a44'),
  'Gender': 'F',
  'Name': 'Molly',
  'State': 'ME'},
 {'_id': ObjectId('5eb51b8a9993d0175fb81b78'),
  'Gender': 'M',
  'Name': 'Marty',
  'State': 'TN'},
 {'_id': ObjectId('5eb51b8a9993d0175fb81bbe'),
  'Gender': 'F',
  'Name': 'Marcy',
  'State': 'OH'},
 {'_id': ObjectId('5eb51b8a9993d0175fb81c5c'),
  'Gender': 'F',
  'Name': 'Mickey',
  'State': 'ID'},
 {'_id': ObjectId('5eb51b8a9993d0175fb81cd7'),
  'Gender': 'F',
  'Name': 'Mendy',
  'State': 'OH'},
 {'_id': ObjectId('5eb51b8a9993d0175fb81d29'),
  'Gender': 'F',
  'Name': 'Mabry',
  'State': 'TN'},
 {'_id': ObjectId('5eb51b8a9993d0175fb81dc9'),
  'Gender': 'F',
  'Name': 'Margery',
  'State': 'OH'},
 {'_id': ObjectId('5eb51b8a9993d0175fb81f75'),
  'Gender': 'F',
  'Name': 'Maisy',
  'State': 'KS'},
 {'_id': ObjectId('5eb51b8a9993d0175fb81fca'),
  'Gender': 'M',
  'Name': 'Mary',
  'State': 'TN'},
 {'_id': ObjectId('5eb51b8a9993d0175fb81fe6'),
  'Gender': 'M',
  'Name': 'Marty',
  'State': 'KS'},
 {'_id': ObjectId('5eb51b8a9993d0175fb82008'),
  'Gender': 'F',
  'Name': 'Mckinley',
  'State': 'NY'},
 {'_id': ObjectId('5eb51b8a9993d0175fb8203d'),
  'Gender': 'F',
  'Name': 'Mahogany',
  'State': 'TX'},
 {'_id': ObjectId('5eb51b8a9993d0175fb820b1'),
  'Gender': 'F',
  'Name': 'Mitzy',
  'State': 'TX'},
 {'_id': ObjectId('5eb51b8a9993d0175fb82141'),
  'Gender': 'F',
  'Name': 'Margy',
  'State': 'OH'},
 {'_id': ObjectId('5eb51b8a9993d0175fb8225b'),
  'Gender': 'F',
  'Name': 'Magaly',
  'State': 'NY'},
 {'_id': ObjectId('5eb51b8a9993d0175fb82286'),
  'Gender': 'M',
  'Name': 'Micky',
  'State': 'TN'},
 {'_id': ObjectId('5eb51b8a9993d0175fb82463'),
  'Gender': 'F',
  'Name': 'Marty',
  'State': 'WI'},
 {'_id': ObjectId('5eb51b8a9993d0175fb82482'),
  'Gender': 'F',
  'Name': 'Mckinley',
  'State': 'OH'},
 {'_id': ObjectId('5eb51b8a9993d0175fb8251c'),
  'Gender': 'M',
  'Name': 'Mccoy',
  'State': 'ID'},
 {'_id': ObjectId('5eb51b8a9993d0175fb82540'),
  'Gender': 'F',
  'Name': 'Marley',
  'State': 'TN'},
 {'_id': ObjectId('5eb51b8a9993d0175fb825d7'),
  'Gender': 'M',
  'Name': 'Malachy',
  'State': 'NY'},
 {'_id': ObjectId('5eb51b8a9993d0175fb826ef'),
  'Gender': 'F',
  'Name': 'Melany',
  'State': 'NM'},
 {'_id': ObjectId('5eb51b8a9993d0175fb827bc'),
  'Gender': 'F',
  'Name': 'Mindy',
  'State': 'CO'},
 {'_id': ObjectId('5eb51b8a9993d0175fb827eb'),
  'Gender': 'F',
  'Name': 'Melony',
  'State': 'MO'},
 {'_id': ObjectId('5eb51b8a9993d0175fb828f7'),
  'Gender': 'F',
  'Name': 'Mary',
  'State': 'OR'},
 {'_id': ObjectId('5eb51b8a9993d0175fb829b8'),
  'Gender': 'F',
  'Name': 'Macy',
  'State': 'MI'},
 {'_id': ObjectId('5eb51b8a9993d0175fb829c9'),
  'Gender': 'M',
  'Name': 'Mckinley',
  'State': 'MO'},
 {'_id': ObjectId('5eb51b8a9993d0175fb829fa'),
  'Gender': 'F',
  'Name': 'Marley',
  'State': 'MA'},
 {'_id': ObjectId('5eb51b8a9993d0175fb82a69'),
  'Gender': 'F',
  'Name': 'Mercy',
  'State': 'IN'},
 {'_id': ObjectId('5eb51b8a9993d0175fb82a72'),
  'Gender': 'F',
  'Name': 'Marcy',
  'State': 'CT'},
 {'_id': ObjectId('5eb51b8a9993d0175fb82b70'),
  'Gender': 'F',
  'Name': 'Macey',
  'State': 'MI'},
 {'_id': ObjectId('5eb51b8a9993d0175fb82b91'),
  'Gender': 'M',
  'Name': 'Marley',
  'State': 'OR'},
 {'_id': ObjectId('5eb51b8a9993d0175fb82b95'),
  'Gender': 'M',
  'Name': 'Mallory',
  'State': 'GA'},
 {'_id': ObjectId('5eb51b8a9993d0175fb82ba6'),
  'Gender': 'M',
  'Name': 'Mckinley',
  'State': 'VA'},
 {'_id': ObjectId('5eb51b8a9993d0175fb82c5e'),
  'Gender': 'F',
  'Name': 'Merikay',
  'State': 'MI'},
 {'_id': ObjectId('5eb51b8a9993d0175fb82cdb'),
  'Gender': 'F',
  'Name': 'Macy',
  'State': 'OR'},
 {'_id': ObjectId('5eb51b8a9993d0175fb82d20'),
  'Gender': 'F',
  'Name': 'Makenzy',
  'State': 'PA'},
 {'_id': ObjectId('5eb51b8a9993d0175fb82d40'),
  'Gender': 'F',
  'Name': 'May',
  'State': 'OR'},
 {'_id': ObjectId('5eb51b8a9993d0175fb82d75'),
  'Gender': 'F',
  'Name': 'Mandy',
  'State': 'MI'},
 {'_id': ObjectId('5eb51b8a9993d0175fb82d8b'),
  'Gender': 'M',
  'Name': 'Marty',
  'State': 'VA'},
 {'_id': ObjectId('5eb51b8a9993d0175fb82edd'),
  'Gender': 'F',
  'Name': 'Misty',
  'State': 'NE'},
 {'_id': ObjectId('5eb51b8a9993d0175fb82f2e'),
  'Gender': 'F',
  'Name': 'Macey',
  'State': 'OR'},
 {'_id': ObjectId('5eb51b8a9993d0175fb82f77'),
  'Gender': 'F',
  'Name': 'Marley',
  'State': 'CO'},
 {'_id': ObjectId('5eb51b8a9993d0175fb82fa9'),
  'Gender': 'F',
  'Name': 'Melony',
  'State': 'GA'},
 {'_id': ObjectId('5eb51b8a9993d0175fb82ff7'),
  'Gender': 'F',
  'Name': 'Melanny',
  'State': 'VA'},
 {'_id': ObjectId('5eb51b8a9993d0175fb830fa'),
  'Gender': 'F',
  'Name': 'Mindy',
  'State': 'WY'},
 {'_id': ObjectId('5eb51b8a9993d0175fb831a2'),
  'Gender': 'M',
  'Name': 'Mckinley',
  'State': 'GA'},
 {'_id': ObjectId('5eb51b8a9993d0175fb832fc'),
  'Gender': 'F',
  'Name': 'Mindy',
  'State': 'AZ'},
 {'_id': ObjectId('5eb51b8a9993d0175fb83313'),
  'Gender': 'F',
  'Name': 'Marykay',
  'State': 'MN'},
 {'_id': ObjectId('5eb51b8a9993d0175fb8332c'),
  'Gender': 'F',
  'Name': 'Margery',
  'State': 'RI'},
 {'_id': ObjectId('5eb51b8a9993d0175fb833af'),
  'Gender': 'F',
  'Name': 'Macey',
  'State': 'NE'},
 {'_id': ObjectId('5eb51b8a9993d0175fb833b4'),
  'Gender': 'M',
  'Name': 'Mckinley',
  'State': 'DC'},
 {'_id': ObjectId('5eb51b8a9993d0175fb83530'),
  'Gender': 'F',
  'Name': 'Marley',
  'State': 'VA'},
 {'_id': ObjectId('5eb51b8a9993d0175fb8359a'),
  'Gender': 'F',
  'Name': 'Macy',
  'State': 'MN'},
 {'_id': ObjectId('5eb51b8a9993d0175fb8373d'),
  'Gender': 'F',
  'Name': 'Merry',
  'State': 'OR'},
 {'_id': ObjectId('5eb51b8a9993d0175fb8376e'),
  'Gender': 'F',
  'Name': 'Marley',
  'State': 'GA'},
 {'_id': ObjectId('5eb51b8a9993d0175fb83818'),
  'Gender': 'M',
  'Name': 'Mckinley',
  'State': 'PA'},
 {'_id': ObjectId('5eb51b8a9993d0175fb8386a'),
  'Gender': 'F',
  'Name': 'Mercy',
  'State': 'VA'},
 {'_id': ObjectId('5eb51b8a9993d0175fb838a6'),
  'Gender': 'F',
  'Name': 'Marcy',
  'State': 'MN'},
 {'_id': ObjectId('5eb51b8a9993d0175fb838f8'),
  'Gender': 'F',
  'Name': 'Mandy',
  'State': 'NE'},
 {'_id': ObjectId('5eb51b8a9993d0175fb83961'),
  'Gender': 'F',
  'Name': 'Mindy',
  'State': 'MO'},
 {'_id': ObjectId('5eb51b8a9993d0175fb839b4'),
  'Gender': 'M',
  'Name': 'Montgomery',
  'State': 'MI'},
 {'_id': ObjectId('5eb51b8a9993d0175fb839bc'),
  'Gender': 'F',
  'Name': 'Molly',
  'State': 'MT'},
 {'_id': ObjectId('5eb51b8a9993d0175fb839c2'),
  'Gender': 'F',
  'Name': 'Makinley',
  'State': 'WV'},
 {'_id': ObjectId('5eb51b8a9993d0175fb83a18'),
  'Gender': 'F',
  'Name': 'Marely',
  'State': 'MI'},
 {'_id': ObjectId('5eb51b8a9993d0175fb83ae9'),
  'Gender': 'F',
  'Name': 'Magaly',
  'State': 'MI'},
 {'_id': ObjectId('5eb51b8a9993d0175fb83bbb'),
  'Gender': 'F',
  'Name': 'Marty',
  'State': 'CO'},
 {'_id': ObjectId('5eb51b8a9993d0175fb83c5f'),
  'Gender': 'F',
  'Name': 'Margy',
  'State': 'MN'},
 {'_id': ObjectId('5eb51b8a9993d0175fb83cdb'),
  'Gender': 'F',
  'Name': 'Mckinley',
  'State': 'MN'},
 {'_id': ObjectId('5eb51b8a9993d0175fb83d7b'),
  'Gender': 'F',
  'Name': 'Marley',
  'State': 'MO'},
 {'_id': ObjectId('5eb51b8a9993d0175fb83da3'),
  'Gender': 'F',
  'Name': 'Mabry',
  'State': 'MO'},
 {'_id': ObjectId('5eb51b8a9993d0175fb83e42'),
  'Gender': 'F',
  'Name': 'Marley',
  'State': 'AK'},
 {'_id': ObjectId('5eb51b8a9993d0175fb83e4d'),
  'Gender': 'M',
  'Name': 'Monty',
  'State': 'WV'},
 {'_id': ObjectId('5eb51b8a9993d0175fb83ed2'),
  'Gender': 'F',
  'Name': 'Mercy',
  'State': 'MO'},
 {'_id': ObjectId('5eb51b8a9993d0175fb83fff'),
  'Gender': 'F',
  'Name': 'Miley',
  'State': 'VT'},
 {'_id': ObjectId('5eb51b8a9993d0175fb841d4'),
  'Gender': 'F',
  'Name': 'Marty',
  'State': 'PA'},
 {'_id': ObjectId('5eb51b8a9993d0175fb8438e'),
  'Gender': 'F',
  'Name': 'Margery',
  'State': 'MT'},
 {'_id': ObjectId('5eb51b8a9993d0175fb843f9'),
  'Gender': 'F',
  'Name': 'Marty',
  'State': 'GA'},
 {'_id': ObjectId('5eb51b8a9993d0175fb84458'),
  'Gender': 'F',
  'Name': 'Mckinley',
  'State': 'MI'},
 {'_id': ObjectId('5eb51b8a9993d0175fb844f6'),
  'Gender': 'F',
  'Name': 'Melody',
  'State': 'CT'},
 {'_id': ObjectId('5eb51b8a9993d0175fb844fe'),
  'Gender': 'M',
  'Name': 'Marty',
  'State': 'MO'},
 {'_id': ObjectId('5eb51b8a9993d0175fb8458c'),
  'Gender': 'F',
  'Name': 'Magaly',
  'State': 'OR'},
 {'_id': ObjectId('5eb51b8a9993d0175fb84691'),
  'Gender': 'F',
  'Name': 'My',
  'State': 'MN'},
 {'_id': ObjectId('5eb51b8a9993d0175fb846de'),
  'Gender': 'F',
  'Name': 'Macy',
  'State': 'CT'},
 {'_id': ObjectId('5eb51b8a9993d0175fb846f1'),
  'Gender': 'F',
  'Name': 'Marely',
  'State': 'MN'},
 {'_id': ObjectId('5eb51b8a9993d0175fb84970'),
  'Gender': 'M',
  'Name': 'Manley',
  'State': 'CA'},
 {'_id': ObjectId('5eb51b8a9993d0175fb84a27'),
  'Gender': 'F',
  'Name': 'Magaly',
  'State': 'NJ'},
 {'_id': ObjectId('5eb51b8a9993d0175fb84a97'),
  'Gender': 'M',
  'Name': 'Mary',
  'State': 'CA'},
 {'_id': ObjectId('5eb51b8a9993d0175fb84bf4'),
  'Gender': 'F',
  'Name': 'Mindy',
  'State': 'IA'},
 {'_id': ObjectId('5eb51b8a9993d0175fb84c12'),
  'Gender': 'F',
  'Name': 'Milly',
  'State': 'SC'},
 {'_id': ObjectId('5eb51b8a9993d0175fb84e28'),
  'Gender': 'F',
  'Name': 'Merry',
  'State': 'SC'},
 {'_id': ObjectId('5eb51b8a9993d0175fb84e3c'),
  'Gender': 'F',
  'Name': 'Miley',
  'State': 'LA'},
 {'_id': ObjectId('5eb51b8a9993d0175fb84e8d'),
  'Gender': 'M',
  'Name': 'Murray',
  'State': 'WA'},
 {'_id': ObjectId('5eb51b8a9993d0175fb84e9b'),
  'Gender': 'F',
  'Name': 'Marely',
  'State': 'NJ'},
 {'_id': ObjectId('5eb51b8a9993d0175fb84f8c'),
  'Gender': 'M',
  'Name': 'Murray',
  'State': 'CA'},
 {'_id': ObjectId('5eb51b8a9993d0175fb84f97'),
  'Gender': 'M',
  'Name': 'Mckinley',
  'State': 'NJ'},
 {'_id': ObjectId('5eb51b8a9993d0175fb8517a'),
  'Gender': 'F',
  'Name': 'Melany',
  'State': 'AR'},
 {'_id': ObjectId('5eb51b8a9993d0175fb851c5'),
  'Gender': 'F',
  'Name': 'Mallory',
  'State': 'WA'},
 {'_id': ObjectId('5eb51b8a9993d0175fb85201'),
  'Gender': 'M',
  'Name': 'Mary',
  'State': 'AR'},
 {'_id': ObjectId('5eb51b8a9993d0175fb852eb'),
  'Gender': 'F',
  'Name': 'Magaly',
  'State': 'IL'},
 {'_id': ObjectId('5eb51b8a9993d0175fb85300'),
  'Gender': 'F',
  'Name': 'Mary',
  'State': 'SC'},
 {'_id': ObjectId('5eb51b8a9993d0175fb85349'),
  'Gender': 'F',
  'Name': 'Mickey',
  'State': 'AR'},
 {'_id': ObjectId('5eb51b8a9993d0175fb853cf'),
  'Gender': 'F',
  'Name': 'Missy',
  'State': 'WA'},
 {'_id': ObjectId('5eb51b8a9993d0175fb85612'),
  'Gender': 'F',
  'Name': 'Magally',
  'State': 'CA'},
 {'_id': ObjectId('5eb51b8a9993d0175fb8564f'),
  'Gender': 'F',
  'Name': 'Mary',
  'State': 'IA'},
 {'_id': ObjectId('5eb51b8a9993d0175fb856a8'),
  'Gender': 'M',
  'Name': 'Murray',
  'State': 'AR'},
 {'_id': ObjectId('5eb51b8a9993d0175fb85749'),
  'Gender': 'F',
  'Name': 'Miley',
  'State': 'AR'},
 {'_id': ObjectId('5eb51b8a9993d0175fb8581d'),
  'Gender': 'F',
  'Name': 'Malillany',
  'State': 'CA'},
 {'_id': ObjectId('5eb51b8a9993d0175fb8599e'),
  'Gender': 'F',
  'Name': 'Merary',
  'State': 'CA'},
 {'_id': ObjectId('5eb51b8a9993d0175fb859ef'),
  'Gender': 'F',
  'Name': 'Mayerly',
  'State': 'CA'},
 {'_id': ObjectId('5eb51b8a9993d0175fb85a01'),
  'Gender': 'F',
  'Name': 'Melany',
  'State': 'WA'},
 {'_id': ObjectId('5eb51b8a9993d0175fb85a2b'),
  'Gender': 'F',
  'Name': 'Marleny',
  'State': 'CA'},
 {'_id': ObjectId('5eb51b8a9993d0175fb85c1b'),
  'Gender': 'M',
  'Name': 'Monty',
  'State': 'CA'},
 {'_id': ObjectId('5eb51b8a9993d0175fb85cd7'),
  'Gender': 'M',
  'Name': 'Maury',
  'State': 'LA'},
 {'_id': ObjectId('5eb51b8a9993d0175fb85e3c'),
  'Gender': 'F',
  'Name': 'Merry',
  'State': 'IL'},
 {'_id': ObjectId('5eb51b8a9993d0175fb85e6a'),
  'Gender': 'F',
  'Name': 'Mary',
  'State': 'OK'},
 {'_id': ObjectId('5eb51b8a9993d0175fb85e89'),
  'Gender': 'F',
  'Name': 'Molly',
  'State': 'CA'},
 {'_id': ObjectId('5eb51b8a9993d0175fb85e9e'),
  'Gender': 'F',
  'Name': 'Melody',
  'State': 'NJ'},
 {'_id': ObjectId('5eb51b8a9993d0175fb85ee9'),
  'Gender': 'F',
  'Name': 'Mandy',
  'State': 'IA'},
 {'_id': ObjectId('5eb51b8a9993d0175fb85ef6'),
  'Gender': 'F',
  'Name': 'Molly',
  'State': 'WA'},
 {'_id': ObjectId('5eb51b8a9993d0175fb85f05'),
  'Gender': 'F',
  'Name': 'Mary',
  'State': 'FL'},
 {'_id': ObjectId('5eb51b8a9993d0175fb85f6c'),
  'Gender': 'M',
  'Name': 'Marley',
  'State': 'SC'},
 {'_id': ObjectId('5eb51b8a9993d0175fb861bc'),
  'Gender': 'F',
  'Name': 'Mary',
  'State': 'IL'},
 {'_id': ObjectId('5eb51b8a9993d0175fb861fd'),
  'Gender': 'F',
  'Name': 'Mickey',
  'State': 'NC'},
 {'_id': ObjectId('5eb51b8a9993d0175fb86220'),
  'Gender': 'F',
  'Name': 'Marcey',
  'State': 'IL'},
 {'_id': ObjectId('5eb51b8a9993d0175fb862bc'),
  'Gender': 'F',
  'Name': 'Mercy',
  'State': 'MD'},
 {'_id': ObjectId('5eb51b8a9993d0175fb8648f'),
  'Gender': 'F',
  'Name': 'Macy',
  'State': 'IL'},
 {'_id': ObjectId('5eb51b8a9993d0175fb864e6'),
  'Gender': 'M',
  'Name': 'Mckay',
  'State': 'CA'},
 {'_id': ObjectId('5eb51b8a9993d0175fb865a2'),
  'Gender': 'F',
  'Name': 'Macy',
  'State': 'FL'},
 {'_id': ObjectId('5eb51b8a9993d0175fb8669c'),
  'Gender': 'M',
  'Name': 'Marcanthony',
  'State': 'CA'},
 {'_id': ObjectId('5eb51b8a9993d0175fb866af'),
  'Gender': 'F',
  'Name': 'Mercy',
  'State': 'FL'},
 {'_id': ObjectId('5eb51b8a9993d0175fb86715'),
  'Gender': 'F',
  'Name': 'Merrily',
  'State': 'WA'},
 {'_id': ObjectId('5eb51b8a9993d0175fb86889'),
  'Gender': 'F',
  'Name': 'Margery',
  'State': 'NC'},
 {'_id': ObjectId('5eb51b8a9993d0175fb869ab'),
  'Gender': 'M',
  'Name': 'Mickey',
  'State': 'CA'},
 {'_id': ObjectId('5eb51b8a9993d0175fb869ec'),
  'Gender': 'F',
  'Name': 'Mallory',
  'State': 'DE'},
 {'_id': ObjectId('5eb51b8a9993d0175fb86a73'),
  'Gender': 'F',
  'Name': 'Macy',
  'State': 'MD'},
 {'_id': ObjectId('5eb51b8a9993d0175fb86bcd'),
  'Gender': 'M',
  'Name': 'Mosby',
  'State': 'NC'},
 {'_id': ObjectId('5eb51b8a9993d0175fb86d74'),
  'Gender': 'M',
  'Name': 'Matvey',
  'State': 'FL'},
 {'_id': ObjectId('5eb51b8a9993d0175fb86d90'),
  'Gender': 'F',
  'Name': 'Margy',
  'State': 'IL'},
 {'_id': ObjectId('5eb51b8a9993d0175fb86ed8'),
  'Gender': 'F',
  'Name': 'Melanny',
  'State': 'NJ'},
 {'_id': ObjectId('5eb51b8a9993d0175fb86eff'),
  'Gender': 'M',
  'Name': 'Maury',
  'State': 'WA'},
 {'_id': ObjectId('5eb51b8a9993d0175fb86fb8'),
  'Gender': 'F',
  'Name': 'Marcy',
  'State': 'FL'},
 {'_id': ObjectId('5eb51b8a9993d0175fb86fc9'),
  'Gender': 'F',
  'Name': 'Mallory',
  'State': 'NV'},
 {'_id': ObjectId('5eb51b8a9993d0175fb87006'),
  'Gender': 'F',
  'Name': 'Marley',
  'State': 'OK'},
 {'_id': ObjectId('5eb51b8a9993d0175fb8709e'),
  'Gender': 'F',
  'Name': 'Macey',
  'State': 'MD'},
 {'_id': ObjectId('5eb51b8a9993d0175fb87239'),
  'Gender': 'F',
  'Name': 'Molly',
  'State': 'DE'},
 {'_id': ObjectId('5eb51b8a9993d0175fb8727d'),
  'Gender': 'F',
  'Name': 'Margery',
  'State': 'CA'},
 {'_id': ObjectId('5eb51b8a9993d0175fb87501'),
  'Gender': 'F',
  'Name': 'Melany',
  'State': 'NV'},
 {'_id': ObjectId('5eb51b8a9993d0175fb87558'),
  'Gender': 'M',
  'Name': 'Mickey',
  'State': 'AR'},
 {'_id': ObjectId('5eb51b8a9993d0175fb875c4'),
  'Gender': 'F',
  'Name': 'Maizy',
  'State': 'OK'},
 {'_id': ObjectId('5eb51b8a9993d0175fb87607'),
  'Gender': 'F',
  'Name': 'Merry',
  'State': 'NJ'},
 {'_id': ObjectId('5eb51b8a9993d0175fb87630'),
  'Gender': 'F',
  'Name': 'Maisy',
  'State': 'WA'},
 {'_id': ObjectId('5eb51b8a9993d0175fb876a7'),
  'Gender': 'F',
  'Name': 'Marley',
  'State': 'NJ'},
 {'_id': ObjectId('5eb51b8a9993d0175fb877e0'),
  'Gender': 'M',
  'Name': 'Murphy',
  'State': 'FL'},
 {'_id': ObjectId('5eb51b8a9993d0175fb8782e'),
  'Gender': 'F',
  'Name': 'Makenzy',
  'State': 'FL'},
 {'_id': ObjectId('5eb51b8a9993d0175fb8783d'),
  'Gender': 'F',
  'Name': 'Merry',
  'State': 'MD'},
 {'_id': ObjectId('5eb51b8a9993d0175fb87845'),
  'Gender': 'F',
  'Name': 'Mckinley',
  'State': 'OK'},
 {'_id': ObjectId('5eb51b8a9993d0175fb8789a'),
  'Gender': 'F',
  'Name': 'Makenzy',
  'State': 'OK'},
 ...]
In [25]:
# Search in nested documents using dot notation.
# Returns every record (name, gender, state) with more than 4,000 babies born in 1990.
query = {"YearsCountDict.1990": {"$gt": 4000}}
list(collection.find(query, {"YearsCountDict": 0}))
Out[25]:
[{'_id': ObjectId('5eb51b889993d0175fb5fbeb'),
  'Gender': 'M',
  'Name': 'David',
  'State': 'CA'},
 {'_id': ObjectId('5eb51b899993d0175fb66eb7'),
  'Gender': 'M',
  'Name': 'Michael',
  'State': 'TX'},
 {'_id': ObjectId('5eb51b899993d0175fb67e7f'),
  'Gender': 'M',
  'Name': 'Michael',
  'State': 'NY'},
 {'_id': ObjectId('5eb51b899993d0175fb6bfd9'),
  'Gender': 'M',
  'Name': 'Anthony',
  'State': 'CA'},
 {'_id': ObjectId('5eb51b899993d0175fb6c2b7'),
  'Gender': 'F',
  'Name': 'Ashley',
  'State': 'CA'},
 {'_id': ObjectId('5eb51b899993d0175fb6da47'),
  'Gender': 'M',
  'Name': 'Michael',
  'State': 'CA'},
 {'_id': ObjectId('5eb51b899993d0175fb6e54b'),
  'Gender': 'M',
  'Name': 'Andrew',
  'State': 'CA'},
 {'_id': ObjectId('5eb51b899993d0175fb78eaa'),
  'Gender': 'F',
  'Name': 'Jessica',
  'State': 'CA'},
 {'_id': ObjectId('5eb51b8a9993d0175fb8522b'),
  'Gender': 'M',
  'Name': 'Matthew',
  'State': 'CA'},
 {'_id': ObjectId('5eb51b8a9993d0175fb8607a'),
  'Gender': 'M',
  'Name': 'Jose',
  'State': 'CA'},
 {'_id': ObjectId('5eb51b8a9993d0175fb8d57b'),
  'Gender': 'M',
  'Name': 'Christopher',
  'State': 'NY'},
 {'_id': ObjectId('5eb51b8a9993d0175fb93e75'),
  'Gender': 'M',
  'Name': 'Daniel',
  'State': 'CA'},
 {'_id': ObjectId('5eb51b8b9993d0175fb9e653'),
  'Gender': 'M',
  'Name': 'Jonathan',
  'State': 'CA'},
 {'_id': ObjectId('5eb51b8b9993d0175fba07e0'),
  'Gender': 'M',
  'Name': 'Joshua',
  'State': 'CA'},
 {'_id': ObjectId('5eb51b8b9993d0175fba13d9'),
  'Gender': 'M',
  'Name': 'Christopher',
  'State': 'CA'}]
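
To make the dot notation in the query above more concrete, here is a minimal pure-Python sketch of what a filter such as `{"YearsCountDict.1990": {"$gt": 4000}}` checks on each document. The helper names `get_by_path` and `matches_gt` and the sample counts are hypothetical and only for illustration. MongoDB's real matching has additional rules (for example for arrays and mixed types) that this sketch ignores.

```python
def get_by_path(doc, dotted_path):
    """Walk a dotted path such as 'YearsCountDict.1990' through nested dicts."""
    current = doc
    for key in dotted_path.split("."):
        if not isinstance(current, dict) or key not in current:
            return None  # the path does not exist in this document
        current = current[key]
    return current


def matches_gt(doc, dotted_path, threshold):
    """Simplified stand-in for {dotted_path: {"$gt": threshold}}."""
    value = get_by_path(doc, dotted_path)
    return value is not None and value > threshold


# Hypothetical documents shaped like the ones in the collection (counts are made up)
docs = [
    {"Name": "Michael", "State": "CA", "YearsCountDict": {"1990": 5000, "1991": 4800}},
    {"Name": "Mabry", "State": "TN", "YearsCountDict": {"2005": 7}},
]

hits = [d["Name"] for d in docs if matches_gt(d, "YearsCountDict.1990", 4000)]
print(hits)  # ['Michael']
```

A document with no `1990` key inside `YearsCountDict` simply does not match, which is why the second sample document is skipped rather than raising an error.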

We can also sort the returned results or limit the number of results:
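
The cell that follows only sorts, so here is a minimal sketch of how sorting and limiting could be chained on a pymongo cursor. The helper names `year_query` and `first_names_sorted` are hypothetical. The sketch assumes `collection` is the pymongo collection used throughout this lecture, and it relies only on the cursor methods `sort` and `limit`.

```python
def year_query(year, min_count):
    """Build the filter and projection passed to collection.find()."""
    query = {f"YearsCountDict.{year}": {"$gt": min_count}}
    projection = {"YearsCountDict": 0}  # leave the large nested dict out of the results
    return query, projection


def first_names_sorted(collection, year, min_count, n):
    """Return the first n matching documents, sorted by Name in ascending order.

    `collection` is assumed to be a pymongo Collection; 1 means ascending.
    """
    query, projection = year_query(year, min_count)
    cursor = collection.find(query, projection).sort("Name", 1).limit(n)
    return list(cursor)


query, projection = year_query(2000, 1000)
print(query)  # {'YearsCountDict.2000': {'$gt': 1000}}
```

With a live connection, `first_names_sorted(collection, 2000, 1000, 5)` would return at most five documents instead of the full list.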

In [31]:
# Records with more than 1,000 babies born in 2000, sorted alphabetically by name
query = {"YearsCountDict.2000": {"$gt": 1000}}
list(collection.find(query, {"YearsCountDict": 0}).sort("Name"))
Out[31]:
[{'_id': ObjectId('5eb51b899993d0175fb78593'),
  'Gender': 'M',
  'Name': 'Aaron',
  'State': 'CA'},
 {'_id': ObjectId('5eb51b8a9993d0175fb8bd8c'),
  'Gender': 'F',
  'Name': 'Abigail',
  'State': 'TX'},
 {'_id': ObjectId('5eb51b8a9993d0175fb87b78'),
  'Gender': 'M',
  'Name': 'Adrian',
  'State': 'CA'},
 {'_id': ObjectId('5eb51b8a9993d0175fb88b2d'),
  'Gender': 'M',
  'Name': 'Alejandro',
  'State': 'CA'},
 {'_id': ObjectId('5eb51b899993d0175fb7fb10'),
  'Gender': 'M',
  'Name': 'Alexander',
  'State': 'TX'},
 {'_id': ObjectId('5eb51b8b9993d0175fb99010'),
  'Gender': 'M',
  'Name': 'Alexander',
  'State': 'NY'},
 {'_id': ObjectId('5eb51b8b9993d0175fb9eee7'),
  'Gender': 'M',
  'Name': 'Alexander',
  'State': 'CA'},
 {'_id': ObjectId('5eb51b899993d0175fb79496'),
  'Gender': 'F',
  'Name': 'Alexandra',
  'State': 'CA'},
 {'_id': ObjectId('5eb51b899993d0175fb79893'),
  'Gender': 'F',
  'Name': 'Alexis',
  'State': 'CA'},
 {'_id': ObjectId('5eb51b8a9993d0175fb8c926'),
  'Gender': 'F',
  'Name': 'Alexis',
  'State': 'TX'},
 {'_id': ObjectId('5eb51b8a9993d0175fb9224b'),
  'Gender': 'F',
  'Name': 'Alyssa',
  'State': 'CA'},
 {'_id': ObjectId('5eb51b8b9993d0175fba531e'),
  'Gender': 'F',
  'Name': 'Alyssa',
  'State': 'TX'},
 {'_id': ObjectId('5eb51b8a9993d0175fb8588f'),
  'Gender': 'F',
  'Name': 'Amanda',
  'State': 'CA'},
 {'_id': ObjectId('5eb51b889993d0175fb5eb84'),
  'Gender': 'F',
  'Name': 'Andrea',
  'State': 'CA'},
 {'_id': ObjectId('5eb51b899993d0175fb67f92'),
  'Gender': 'M',
  'Name': 'Andrew',
  'State': 'TX'},
 {'_id': ObjectId('5eb51b899993d0175fb69171'),
  'Gender': 'M',
  'Name': 'Andrew',
  'State': 'OH'},
 {'_id': ObjectId('5eb51b899993d0175fb6e54b'),
  'Gender': 'M',
  'Name': 'Andrew',
  'State': 'CA'},
 {'_id': ObjectId('5eb51b8b9993d0175fb9a759'),
  'Gender': 'M',
  'Name': 'Andrew',
  'State': 'NY'},
 {'_id': ObjectId('5eb51b899993d0175fb68f9a'),
  'Gender': 'M',
  'Name': 'Angel',
  'State': 'TX'},
 {'_id': ObjectId('5eb51b8a9993d0175fb939c0'),
  'Gender': 'M',
  'Name': 'Angel',
  'State': 'CA'},
 {'_id': ObjectId('5eb51b889993d0175fb628b2'),
  'Gender': 'M',
  'Name': 'Anthony',
  'State': 'IL'},
 {'_id': ObjectId('5eb51b899993d0175fb6bfd9'),
  'Gender': 'M',
  'Name': 'Anthony',
  'State': 'CA'},
 {'_id': ObjectId('5eb51b899993d0175fb7f775'),
  'Gender': 'M',
  'Name': 'Anthony',
  'State': 'NY'},
 {'_id': ObjectId('5eb51b8a9993d0175fb940a7'),
  'Gender': 'M',
  'Name': 'Anthony',
  'State': 'FL'},
 {'_id': ObjectId('5eb51b8b9993d0175fb99479'),
  'Gender': 'M',
  'Name': 'Anthony',
  'State': 'TX'},
 {'_id': ObjectId('5eb51b889993d0175fb60eea'),
  'Gender': 'F',
  'Name': 'Ashley',
  'State': 'FL'},
 {'_id': ObjectId('5eb51b899993d0175fb66f07'),
  'Gender': 'F',
  'Name': 'Ashley',
  'State': 'NY'},
 {'_id': ObjectId('5eb51b899993d0175fb6c2b7'),
  'Gender': 'F',
  'Name': 'Ashley',
  'State': 'CA'},
 {'_id': ObjectId('5eb51b899993d0175fb7eea0'),
  'Gender': 'F',
  'Name': 'Ashley',
  'State': 'TX'},
 {'_id': ObjectId('5eb51b899993d0175fb66ce3'),
  'Gender': 'M',
  'Name': 'Austin',
  'State': 'TX'},
 {'_id': ObjectId('5eb51b8a9993d0175fb85260'),
  'Gender': 'M',
  'Name': 'Austin',
  'State': 'CA'},
 {'_id': ObjectId('5eb51b899993d0175fb78b9e'),
  'Gender': 'M',
  'Name': 'Benjamin',
  'State': 'CA'},
 {'_id': ObjectId('5eb51b899993d0175fb7f535'),
  'Gender': 'M',
  'Name': 'Brandon',
  'State': 'NY'},
 {'_id': ObjectId('5eb51b8a9993d0175fb88554'),
  'Gender': 'M',
  'Name': 'Brandon',
  'State': 'FL'},
 {'_id': ObjectId('5eb51b8a9993d0175fb924e4'),
  'Gender': 'M',
  'Name': 'Brandon',
  'State': 'CA'},
 {'_id': ObjectId('5eb51b8b9993d0175fb990cf'),
  'Gender': 'M',
  'Name': 'Brandon',
  'State': 'TX'},
 {'_id': ObjectId('5eb51b8b9993d0175fb9e000'),
  'Gender': 'M',
  'Name': 'Brian',
  'State': 'CA'},
 {'_id': ObjectId('5eb51b899993d0175fb80993'),
  'Gender': 'F',
  'Name': 'Brianna',
  'State': 'NY'},
 {'_id': ObjectId('5eb51b8a9993d0175fb860df'),
  'Gender': 'F',
  'Name': 'Brianna',
  'State': 'CA'},
 {'_id': ObjectId('5eb51b8b9993d0175fb98696'),
  'Gender': 'F',
  'Name': 'Brianna',
  'State': 'TX'},
 {'_id': ObjectId('5eb51b899993d0175fb7ac9c'),
  'Gender': 'M',
  'Name': 'Bryan',
  'State': 'CA'},
 {'_id': ObjectId('5eb51b899993d0175fb6e612'),
  'Gender': 'M',
  'Name': 'Cameron',
  'State': 'CA'},
 {'_id': ObjectId('5eb51b8a9993d0175fb88732'),
  'Gender': 'M',
  'Name': 'Carlos',
  'State': 'CA'},
 {'_id': ObjectId('5eb51b8b9993d0175fb9a506'),
  'Gender': 'M',
  'Name': 'Carlos',
  'State': 'TX'},
 {'_id': ObjectId('5eb51b889993d0175fb66419'),
  'Gender': 'M',
  'Name': 'Christian',
  'State': 'TX'},
 {'_id': ObjectId('5eb51b899993d0175fb6c805'),
  'Gender': 'M',
  'Name': 'Christian',
  'State': 'CA'},
 {'_id': ObjectId('5eb51b8a9993d0175fb92d0e'),
  'Gender': 'M',
  'Name': 'Christian',
  'State': 'FL'},
 {'_id': ObjectId('5eb51b889993d0175fb5f030'),
  'Gender': 'M',
  'Name': 'Christopher',
  'State': 'FL'},
 {'_id': ObjectId('5eb51b899993d0175fb73a20'),
  'Gender': 'M',
  'Name': 'Christopher',
  'State': 'TX'},
 {'_id': ObjectId('5eb51b899993d0175fb797a0'),
  'Gender': 'M',
  'Name': 'Christopher',
  'State': 'NJ'},
 {'_id': ObjectId('5eb51b899993d0175fb7dd5b'),
  'Gender': 'M',
  'Name': 'Christopher',
  'State': 'GA'},
 {'_id': ObjectId('5eb51b8a9993d0175fb8d57b'),
  'Gender': 'M',
  'Name': 'Christopher',
  'State': 'NY'},
 {'_id': ObjectId('5eb51b8b9993d0175fba13d9'),
  'Gender': 'M',
  'Name': 'Christopher',
  'State': 'CA'},
 {'_id': ObjectId('5eb51b899993d0175fb6de60'),
  'Gender': 'M',
  'Name': 'Daniel',
  'State': 'FL'},
 {'_id': ObjectId('5eb51b8a9993d0175fb8dcbe'),
  'Gender': 'M',
  'Name': 'Daniel',
  'State': 'TX'},
 {'_id': ObjectId('5eb51b8a9993d0175fb93e75'),
  'Gender': 'M',
  'Name': 'Daniel',
  'State': 'CA'},
 {'_id': ObjectId('5eb51b8b9993d0175fb9eb0b'),
  'Gender': 'M',
  'Name': 'Daniel',
  'State': 'IL'},
 {'_id': ObjectId('5eb51b8c9993d0175fba5ce6'),
  'Gender': 'M',
  'Name': 'Daniel',
  'State': 'NY'},
 {'_id': ObjectId('5eb51b889993d0175fb5fbeb'),
  'Gender': 'M',
  'Name': 'David',
  'State': 'CA'},
 {'_id': ObjectId('5eb51b899993d0175fb6f9f2'),
  'Gender': 'M',
  'Name': 'David',
  'State': 'FL'},
 {'_id': ObjectId('5eb51b899993d0175fb721b6'),
  'Gender': 'M',
  'Name': 'David',
  'State': 'NY'},
 {'_id': ObjectId('5eb51b899993d0175fb72f48'),
  'Gender': 'M',
  'Name': 'David',
  'State': 'TX'},
 {'_id': ObjectId('5eb51b8a9993d0175fb9066a'),
  'Gender': 'F',
  'Name': 'Destiny',
  'State': 'CA'},
 {'_id': ObjectId('5eb51b889993d0175fb5f480'),
  'Gender': 'M',
  'Name': 'Dylan',
  'State': 'CA'},
 {'_id': ObjectId('5eb51b889993d0175fb66203'),
  'Gender': 'M',
  'Name': 'Dylan',
  'State': 'TX'},
 {'_id': ObjectId('5eb51b889993d0175fb5f236'),
  'Gender': 'M',
  'Name': 'Eduardo',
  'State': 'CA'},
 {'_id': ObjectId('5eb51b8b9993d0175fba13d5'),
  'Gender': 'F',
  'Name': 'Elizabeth',
  'State': 'CA'},
 {'_id': ObjectId('5eb51b8c9993d0175fba6b09'),
  'Gender': 'F',
  'Name': 'Elizabeth',
  'State': 'TX'},
 {'_id': ObjectId('5eb51b899993d0175fb6fca2'),
  'Gender': 'F',
  'Name': 'Emily',
  'State': 'IL'},
 {'_id': ObjectId('5eb51b899993d0175fb73214'),
  'Gender': 'F',
  'Name': 'Emily',
  'State': 'TX'},
 {'_id': ObjectId('5eb51b899993d0175fb777d5'),
  'Gender': 'F',
  'Name': 'Emily',
  'State': 'CA'},
 {'_id': ObjectId('5eb51b8a9993d0175fb835e2'),
  'Gender': 'F',
  'Name': 'Emily',
  'State': 'PA'},
 {'_id': ObjectId('5eb51b8a9993d0175fb8877c'),
  'Gender': 'F',
  'Name': 'Emily',
  'State': 'FL'},
 {'_id': ObjectId('5eb51b8b9993d0175fba3d1d'),
  'Gender': 'F',
  'Name': 'Emily',
  'State': 'NY'},
 {'_id': ObjectId('5eb51b8c9993d0175fba5f2e'),
  'Gender': 'F',
  'Name': 'Emily',
  'State': 'OH'},
 {'_id': ObjectId('5eb51b8b9993d0175fb9e04c'),
  'Gender': 'F',
  'Name': 'Emma',
  'State': 'CA'},
 {'_id': ObjectId('5eb51b8a9993d0175fb84f3d'),
  'Gender': 'M',
  'Name': 'Eric',
  'State': 'CA'},
 {'_id': ObjectId('5eb51b889993d0175fb5f910'),
  'Gender': 'M',
  'Name': 'Ethan',
  'State': 'CA'},
 {'_id': ObjectId('5eb51b8b9993d0175fba3c71'),
  'Gender': 'M',
  'Name': 'Ethan',
  'State': 'TX'},
 {'_id': ObjectId('5eb51b899993d0175fb6cf8b'),
  'Gender': 'M',
  'Name': 'Gabriel',
  'State': 'CA'},
 {'_id': ObjectId('5eb51b889993d0175fb6226b'),
  'Gender': 'F',
  'Name': 'Grace',
  'State': 'CA'},
 {'_id': ObjectId('5eb51b899993d0175fb700f5'),
  'Gender': 'F',
  'Name': 'Hannah',
  'State': 'CA'},
 {'_id': ObjectId('5eb51b899993d0175fb73fd6'),
  'Gender': 'F',
  'Name': 'Hannah',
  'State': 'OH'},
 {'_id': ObjectId('5eb51b899993d0175fb74419'),
  'Gender': 'F',
  'Name': 'Hannah',
  'State': 'TX'},
 {'_id': ObjectId('5eb51b899993d0175fb78e67'),
  'Gender': 'F',
  'Name': 'Hannah',
  'State': 'IL'},
 {'_id': ObjectId('5eb51b8a9993d0175fb91a60'),
  'Gender': 'F',
  'Name': 'Hannah',
  'State': 'FL'},
 {'_id': ObjectId('5eb51b8a9993d0175fb84e9a'),
  'Gender': 'M',
  'Name': 'Isaac',
  'State': 'CA'},
 {'_id': ObjectId('5eb51b899993d0175fb6e0b6'),
  'Gender': 'F',
  'Name': 'Isabella',
  'State': 'CA'},
 {'_id': ObjectId('5eb51b8a9993d0175fb909db'),
  'Gender': 'M',
  'Name': 'Isaiah',
  'State': 'CA'},
 {'_id': ObjectId('5eb51b899993d0175fb69e00'),
  'Gender': 'M',
  'Name': 'Jacob',
  'State': 'IN'},
 {'_id': ObjectId('5eb51b899993d0175fb6a72e'),
  'Gender': 'M',
  'Name': 'Jacob',
  'State': 'PA'},
 {'_id': ObjectId('5eb51b899993d0175fb71add'),
  'Gender': 'M',
  'Name': 'Jacob',
  'State': 'OH'},
 {'_id': ObjectId('5eb51b899993d0175fb790ad'),
  'Gender': 'M',
  'Name': 'Jacob',
  'State': 'NC'},
 {'_id': ObjectId('5eb51b8a9993d0175fb8867b'),
  'Gender': 'M',
  'Name': 'Jacob',
  'State': 'FL'},
 {'_id': ObjectId('5eb51b8a9993d0175fb8e589'),
  'Gender': 'M',
  'Name': 'Jacob',
  'State': 'MI'},
 {'_id': ObjectId('5eb51b8a9993d0175fb914d9'),
  'Gender': 'M',
  'Name': 'Jacob',
  'State': 'CA'},
 {'_id': ObjectId('5eb51b8b9993d0175fba1412'),
  'Gender': 'M',
  'Name': 'Jacob',
  'State': 'IL'},
 {'_id': ObjectId('5eb51b8b9993d0175fba3ca9'),
  'Gender': 'M',
  'Name': 'Jacob',
  'State': 'NY'},
 {'_id': ObjectId('5eb51b8b9993d0175fba4aff'),
  'Gender': 'M',
  'Name': 'Jacob',
  'State': 'TX'},
 {'_id': ObjectId('5eb51b8a9993d0175fb94303'),
  'Gender': 'F',
  'Name': 'Jacqueline',
  'State': 'CA'},
 {'_id': ObjectId('5eb51b899993d0175fb7322e'),
  'Gender': 'M',
  'Name': 'James',
  'State': 'TX'},
 {'_id': ObjectId('5eb51b8a9993d0175fb87540'),
  'Gender': 'M',
  'Name': 'James',
  'State': 'CA'},
 {'_id': ObjectId('5eb51b8a9993d0175fb8ccff'),
  'Gender': 'M',
  'Name': 'James',
  'State': 'NY'},
 {'_id': ObjectId('5eb51b899993d0175fb6c017'),
  'Gender': 'F',
  'Name': 'Jasmine',
  'State': 'CA'},
 {'_id': ObjectId('5eb51b899993d0175fb79871'),
  'Gender': 'M',
  'Name': 'Jason',
  'State': 'CA'},
 {'_id': ObjectId('5eb51b889993d0175fb5fc1f'),
  'Gender': 'F',
  'Name': 'Jennifer',
  'State': 'CA'},
 {'_id': ObjectId('5eb51b8b9993d0175fba4ab0'),
  'Gender': 'F',
  'Name': 'Jennifer',
  'State': 'TX'},
 {'_id': ObjectId('5eb51b899993d0175fb67729'),
  'Gender': 'F',
  'Name': 'Jessica',
  'State': 'TX'},
 {'_id': ObjectId('5eb51b899993d0175fb78eaa'),
  'Gender': 'F',
  'Name': 'Jessica',
  'State': 'CA'},
 {'_id': ObjectId('5eb51b899993d0175fb7f48f'),
  'Gender': 'F',
  'Name': 'Jessica',
  'State': 'NY'},
 {'_id': ObjectId('5eb51b889993d0175fb607dc'),
  'Gender': 'M',
  'Name': 'Jesus',
  'State': 'CA'},
 {'_id': ObjectId('5eb51b8a9993d0175fb8bcc0'),
  'Gender': 'M',
  'Name': 'Jesus',
  'State': 'TX'},
 {'_id': ObjectId('5eb51b889993d0175fb63f84'),
  'Gender': 'M',
  'Name': 'John',
  'State': 'PA'},
 {'_id': ObjectId('5eb51b889993d0175fb668d4'),
  'Gender': 'M',
  'Name': 'John',
  'State': 'NY'},
 {'_id': ObjectId('5eb51b899993d0175fb803b2'),
  'Gender': 'M',
  'Name': 'John',
  'State': 'TX'},
 {'_id': ObjectId('5eb51b8b9993d0175fb9fa17'),
  'Gender': 'M',
  'Name': 'John',
  'State': 'CA'},
 {'_id': ObjectId('5eb51b899993d0175fb69442'),
  'Gender': 'M',
  'Name': 'Jonathan',
  'State': 'NY'},
 {'_id': ObjectId('5eb51b8a9993d0175fb931b0'),
  'Gender': 'M',
  'Name': 'Jonathan',
  'State': 'FL'},
 {'_id': ObjectId('5eb51b8b9993d0175fb982cc'),
  'Gender': 'M',
  'Name': 'Jonathan',
  'State': 'TX'},
 {'_id': ObjectId('5eb51b8b9993d0175fb9e653'),
  'Gender': 'M',
  'Name': 'Jonathan',
  'State': 'CA'},
 {'_id': ObjectId('5eb51b899993d0175fb6deeb'),
  'Gender': 'M',
  'Name': 'Jordan',
  'State': 'CA'},
 {'_id': ObjectId('5eb51b8a9993d0175fb8827b'),
  'Gender': 'M',
  'Name': 'Jorge',
  'State': 'CA'},
 {'_id': ObjectId('5eb51b899993d0175fb67387'),
  'Gender': 'M',
  'Name': 'Jose',
  'State': 'TX'},
 {'_id': ObjectId('5eb51b8a9993d0175fb8607a'),
  'Gender': 'M',
  'Name': 'Jose',
  'State': 'CA'},
 {'_id': ObjectId('5eb51b889993d0175fb610b3'),
  'Gender': 'M',
  'Name': 'Joseph',
  'State': 'CA'},
 {'_id': ObjectId('5eb51b899993d0175fb694c3'),
  'Gender': 'M',
  'Name': 'Joseph',
  'State': 'TX'},
 {'_id': ObjectId('5eb51b899993d0175fb6dbaf'),
  'Gender': 'M',
  'Name': 'Joseph',
  'State': 'IL'},
 {'_id': ObjectId('5eb51b899993d0175fb771d5'),
  'Gender': 'M',
  'Name': 'Joseph',
  'State': 'PA'},
 {'_id': ObjectId('5eb51b8b9993d0175fb98d5b'),
  'Gender': 'M',
  'Name': 'Joseph',
  'State': 'OH'},
 {'_id': ObjectId('5eb51b8b9993d0175fb99f8a'),
  'Gender': 'M',
  'Name': 'Joseph',
  'State': 'NY'},
 {'_id': ObjectId('5eb51b8b9993d0175fb9f4ff'),
  'Gender': 'M',
  'Name': 'Joseph',
  'State': 'FL'},
 {'_id': ObjectId('5eb51b889993d0175fb656e0'),
  'Gender': 'M',
  'Name': 'Joshua',
  'State': 'PA'},
 {'_id': ObjectId('5eb51b899993d0175fb68364'),
  'Gender': 'M',
  'Name': 'Joshua',
  'State': 'NY'},
 {'_id': ObjectId('5eb51b899993d0175fb690af'),
  'Gender': 'M',
  'Name': 'Joshua',
  'State': 'TX'},
 {'_id': ObjectId('5eb51b899993d0175fb70cd6'),
  'Gender': 'M',
  'Name': 'Joshua',
  'State': 'MI'},
 {'_id': ObjectId('5eb51b899993d0175fb78d60'),
  'Gender': 'M',
  'Name': 'Joshua',
  'State': 'FL'},
 {'_id': ObjectId('5eb51b899993d0175fb805d9'),
  'Gender': 'M',
  'Name': 'Joshua',
  'State': 'OH'},
 {'_id': ObjectId('5eb51b8a9993d0175fb882c0'),
  'Gender': 'M',
  'Name': 'Joshua',
  'State': 'NC'},
 {'_id': ObjectId('5eb51b8a9993d0175fb9190a'),
  'Gender': 'M',
  'Name': 'Joshua',
  'State': 'IL'},
 {'_id': ObjectId('5eb51b8b9993d0175fba07e0'),
  'Gender': 'M',
  'Name': 'Joshua',
  'State': 'CA'},
 {'_id': ObjectId('5eb51b899993d0175fb6d36d'),
  'Gender': 'M',
  'Name': 'Juan',
  'State': 'CA'},
 {'_id': ObjectId('5eb51b8b9993d0175fba4565'),
  'Gender': 'M',
  'Name': 'Juan',
  'State': 'TX'},
 {'_id': ObjectId('5eb51b899993d0175fb78396'),
  'Gender': 'F',
  'Name': 'Julia',
  'State': 'CA'},
 {'_id': ObjectId('5eb51b899993d0175fb6fdec'),
  'Gender': 'M',
  'Name': 'Julian',
  'State': 'CA'},
 {'_id': ObjectId('5eb51b889993d0175fb61bd7'),
  'Gender': 'M',
  'Name': 'Justin',
  'State': 'CA'},
 {'_id': ObjectId('5eb51b8b9993d0175fba4ae9'),
  'Gender': 'M',
  'Name': 'Justin',
  'State': 'NY'},
 {'_id': ObjectId('5eb51b8b9993d0175fba57f4'),
  'Gender': 'M',
  'Name': 'Justin',
  'State': 'TX'},
 {'_id': ObjectId('5eb51b899993d0175fb6e633'),
  'Gender': 'F',
  'Name': 'Kayla',
  'State': 'CA'},
 {'_id': ObjectId('5eb51b8b9993d0175fba5837'),
  'Gender': 'F',
  'Name': 'Kayla',
  'State': 'NY'},
 {'_id': ObjectId('5eb51b899993d0175fb6e524'),
  'Gender': 'M',
  'Name': 'Kevin',
  'State': 'CA'},
 {'_id': ObjectId('5eb51b899993d0175fb7334e'),
  'Gender': 'M',
  'Name': 'Kevin',
  'State': 'NY'},
 {'_id': ObjectId('5eb51b899993d0175fb7420a'),
  'Gender': 'M',
  'Name': 'Kevin',
  'State': 'TX'},
 {'_id': ObjectId('5eb51b889993d0175fb5fc20'),
  'Gender': 'F',
  'Name': 'Kimberly',
  'State': 'CA'},
 {'_id': ObjectId('5eb51b8b9993d0175fb9e51a'),
  'Gender': 'M',
  'Name': 'Kyle',
  'State': 'CA'},
 {'_id': ObjectId('5eb51b8a9993d0175fb8738a'),
  'Gender': 'F',
  'Name': 'Lauren',
  'State': 'CA'},
 {'_id': ObjectId('5eb51b8a9993d0175fb8db27'),
  'Gender': 'F',
  'Name': 'Lauren',
  'State': 'TX'},
 {'_id': ObjectId('5eb51b899993d0175fb6cbde'),
  'Gender': 'F',
  'Name': 'Leslie',
  'State': 'CA'},
 {'_id': ObjectId('5eb51b899993d0175fb6d054'),
  'Gender': 'M',
  'Name': 'Luis',
  'State': 'CA'},
 {'_id': ObjectId('5eb51b899993d0175fb7f553'),
  'Gender': 'M',
  'Name': 'Luis',
  'State': 'TX'},
 {'_id': ObjectId('5eb51b899993d0175fb67e4c'),
  'Gender': 'F',
  'Name': 'Madison',
  'State': 'OH'},
 {'_id': ObjectId('5eb51b899993d0175fb68fbd'),
  'Gender': 'F',
  'Name': 'Madison',
  'State': 'TX'},
 {'_id': ObjectId('5eb51b8a9993d0175fb93a3a'),
  'Gender': 'F',
  'Name': 'Madison',
  'State': 'CA'},
 {'_id': ObjectId('5eb51b899993d0175fb80676'),
  'Gender': 'F',
  'Name': 'Maria',
  'State': 'TX'},
 {'_id': ObjectId('5eb51b8b9993d0175fb9d7d2'),
  'Gender': 'F',
  'Name': 'Maria',
  'State': 'CA'},
 {'_id': ObjectId('5eb51b889993d0175fb62bed'),
  'Gender': 'M',
  'Name': 'Matthew',
  'State': 'FL'},
 {'_id': ObjectId('5eb51b899993d0175fb7330e'),
  'Gender': 'M',
  'Name': 'Matthew',
  'State': 'TX'},
 {'_id': ObjectId('5eb51b899993d0175fb746a4'),
  'Gender': 'M',
  'Name': 'Matthew',
  'State': 'OH'},
 {'_id': ObjectId('5eb51b899993d0175fb7a2a1'),
  'Gender': 'M',
  'Name': 'Matthew',
  'State': 'NJ'},
 {'_id': ObjectId('5eb51b899993d0175fb7be9f'),
  'Gender': 'M',
  'Name': 'Matthew',
  'State': 'MA'},
 {'_id': ObjectId('5eb51b8a9993d0175fb8522b'),
  'Gender': 'M',
  'Name': 'Matthew',
  'State': 'CA'},
 {'_id': ObjectId('5eb51b8a9993d0175fb8d039'),
  'Gender': 'M',
  'Name': 'Matthew',
  'State': 'NY'},
 {'_id': ObjectId('5eb51b8a9993d0175fb94681'),
  'Gender': 'M',
  'Name': 'Matthew',
  'State': 'IL'},
 {'_id': ObjectId('5eb51b8a9993d0175fb95eff'),
  'Gender': 'M',
  'Name': 'Matthew',
  'State': 'PA'},
 {'_id': ObjectId('5eb51b8a9993d0175fb9296e'),
  'Gender': 'F',
  'Name': 'Megan',
  'State': 'CA'},
 {'_id': ObjectId('5eb51b8b9993d0175fb9efe5'),
  'Gender': 'F',
  'Name': 'Melissa',
  'State': 'CA'},
 {'_id': ObjectId('5eb51b889993d0175fb625c6'),
  'Gender': 'M',
  'Name': 'Michael',
  'State': 'NJ'},
 {'_id': ObjectId('5eb51b899993d0175fb66eb7'),
  'Gender': 'M',
  'Name': 'Michael',
  'State': 'TX'},
 {'_id': ObjectId('5eb51b899993d0175fb67e7f'),
  'Gender': 'M',
  'Name': 'Michael',
  'State': 'NY'},
 {'_id': ObjectId('5eb51b899993d0175fb6888c'),
  'Gender': 'M',
  'Name': 'Michael',
  'State': 'OH'},
 {'_id': ObjectId('5eb51b899993d0175fb6da47'),
  'Gender': 'M',
  'Name': 'Michael',
  'State': 'CA'},
 {'_id': ObjectId('5eb51b899993d0175fb7a96c'),
  'Gender': 'M',
  'Name': 'Michael',
  'State': 'FL'},
 {'_id': ObjectId('5eb51b899993d0175fb7ac87'),
  'Gender': 'M',
  'Name': 'Michael',
  'State': 'IL'},
 {'_id': ObjectId('5eb51b899993d0175fb7c849'),
  'Gender': 'M',
  'Name': 'Michael',
  'State': 'PA'},
 {'_id': ObjectId('5eb51b899993d0175fb7d025'),
  'Gender': 'M',
  'Name': 'Michael',
  'State': 'MA'},
 {'_id': ObjectId('5eb51b8b9993d0175fba0223'),
  'Gender': 'F',
  'Name': 'Michelle',
  'State': 'CA'},
 {'_id': ObjectId('5eb51b899993d0175fb6c10d'),
  'Gender': 'M',
  'Name': 'Miguel',
  'State': 'CA'},
 {'_id': ObjectId('5eb51b889993d0175fb60fce'),
  'Gender': 'F',
  'Name': 'Natalie',
  'State': 'CA'},
 {'_id': ObjectId('5eb51b8a9993d0175fb8addd'),
  'Gender': 'M',
  'Name': 'Nathan',
  'State': 'TX'},
 {'_id': ObjectId('5eb51b8b9993d0175fb9de39'),
  'Gender': 'M',
  'Name': 'Nathan',
  'State': 'CA'},
 {'_id': ObjectId('5eb51b889993d0175fb60df6'),
  'Gender': 'M',
  'Name': 'Nicholas',
  'State': 'FL'},
 {'_id': ObjectId('5eb51b889993d0175fb617b6'),
  'Gender': 'M',
  'Name': 'Nicholas',
  'State': 'NJ'},
 {'_id': ObjectId('5eb51b899993d0175fb6c177'),
  'Gender': 'M',
  'Name': 'Nicholas',
  'State': 'CA'},
 {'_id': ObjectId('5eb51b899993d0175fb7946d'),
  'Gender': 'M',
  'Name': 'Nicholas',
  'State': 'IL'},
 {'_id': ObjectId('5eb51b899993d0175fb7e98f'),
  'Gender': 'M',
  'Name': 'Nicholas',
  'State': 'PA'},
 {'_id': ObjectId('5eb51b899993d0175fb7ede2'),
  'Gender': 'M',
  'Name': 'Nicholas',
  'State': 'TX'},
 {'_id': ObjectId('5eb51b899993d0175fb801aa'),
  'Gender': 'M',
  'Name': 'Nicholas',
  'State': 'OH'},
 {'_id': ObjectId('5eb51b8b9993d0175fb98b11'),
  'Gender': 'M',
  'Name': 'Nicholas',
  'State': 'NY'},
 {'_id': ObjectId('5eb51b8a9993d0175fb87444'),
  'Gender': 'F',
  'Name': 'Nicole',
  'State': 'CA'},
 {'_id': ObjectId('5eb51b8a9993d0175fb925c0'),
  'Gender': 'M',
  'Name': 'Noah',
  'State': 'CA'},
 {'_id': ObjectId('5eb51b899993d0175fb6fcbe'),
  'Gender': 'M',
  'Name': 'Oscar',
  'State': 'CA'},
 {'_id': ObjectId('5eb51b899993d0175fb7adcd'),
  'Gender': 'F',
  'Name': 'Rachel',
  'State': 'CA'},
 {'_id': ObjectId('5eb51b8b9993d0175fb9fe1e'),
  'Gender': 'M',
  'Name': 'Richard',
  'State': 'CA'},
 {'_id': ObjectId('5eb51b8a9993d0175fb855b4'),
  'Gender': 'M',
  'Name': 'Robert',
  'State': 'CA'},
 {'_id': ObjectId('5eb51b8a9993d0175fb8a2ba'),
  'Gender': 'M',
  'Name': 'Robert',
  'State': 'TX'},
 {'_id': ObjectId('5eb51b899993d0175fb67ce8'),
  'Gender': 'M',
  'Name': 'Ryan',
  'State': 'NY'},
 {'_id': ObjectId('5eb51b899993d0175fb79e90'),
  'Gender': 'M',
  'Name': 'Ryan',
  'State': 'CA'},
 {'_id': ObjectId('5eb51b8b9993d0175fb9a1a4'),
  'Gender': 'M',
  'Name': 'Ryan',
  'State': 'TX'},
 {'_id': ObjectId('5eb51b8c9993d0175fba8473'),
  'Gender': 'M',
  'Name': 'Ryan',
  'State': 'PA'},
 {'_id': ObjectId('5eb51b899993d0175fb67056'),
  'Gender': 'F',
  'Name': 'Samantha',
  'State': 'NY'},
 {'_id': ObjectId('5eb51b899993d0175fb81709'),
  'Gender': 'F',
  'Name': 'Samantha',
  'State': 'TX'},
 {'_id': ObjectId('5eb51b8a9993d0175fb92a32'),
  'Gender': 'F',
  'Name': 'Samantha',
  'State': 'CA'},
 {'_id': ObjectId('5eb51b899993d0175fb66f35'),
  'Gender': 'M',
  'Name': 'Samuel',
  'State': 'TX'},
 {'_id': ObjectId('5eb51b8b9993d0175fb9f55c'),
  'Gender': 'M',
  'Name': 'Samuel',
  'State': 'CA'},
 {'_id': ObjectId('5eb51b889993d0175fb60e3a'),
  'Gender': 'F',
  'Name': 'Sarah',
  'State': 'CA'},
 {'_id': ObjectId('5eb51b899993d0175fb67a6f'),
  'Gender': 'F',
  'Name': 'Sarah',
  'State': 'NY'},
 {'_id': ObjectId('5eb51b899993d0175fb815e5'),
  'Gender': 'F',
  'Name': 'Sarah',
  'State': 'TX'},
 {'_id': ObjectId('5eb51b8a9993d0175fb87a24'),
  'Gender': 'M',
  'Name': 'Sebastian',
  'State': 'CA'},
 {'_id': ObjectId('5eb51b889993d0175fb5fc0f'),
  'Gender': 'F',
  'Name': 'Sophia',
  'State': 'CA'},
 {'_id': ObjectId('5eb51b8a9993d0175fb85971'),
  'Gender': 'F',
  'Name': 'Stephanie',
  'State': 'CA'},
 {'_id': ObjectId('5eb51b899993d0175fb7b882'),
  'Gender': 'M',
  'Name': 'Steven',
  'State': 'CA'},
 {'_id': ObjectId('5eb51b899993d0175fb71a5f'),
  'Gender': 'F',
  'Name': 'Taylor',
  'State': 'TX'},
 {'_id': ObjectId('5eb51b8a9993d0175fb9127d'),
  'Gender': 'F',
  'Name': 'Taylor',
  'State': 'CA'},
 {'_id': ObjectId('5eb51b899993d0175fb68e07'),
  'Gender': 'M',
  'Name': 'Thomas',
  'State': 'NY'},
 {'_id': ObjectId('5eb51b8a9993d0175fb9064c'),
  'Gender': 'M',
  'Name': 'Thomas',
  'State': 'CA'},
 {'_id': ObjectId('5eb51b899993d0175fb6cb88'),
  'Gender': 'M',
  'Name': 'Tyler',
  'State': 'CA'},
 {'_id': ObjectId('5eb51b899993d0175fb75106'),
  'Gender': 'M',
  'Name': 'Tyler',
  'State': 'OH'},
 {'_id': ObjectId('5eb51b899993d0175fb7ca6a'),
  'Gender': 'M',
  'Name': 'Tyler',
  'State': 'PA'},
 {'_id': ObjectId('5eb51b8a9993d0175fb8b3db'),
  'Gender': 'M',
  'Name': 'Tyler',
  'State': 'TX'},
 {'_id': ObjectId('5eb51b8a9993d0175fb93872'),
  'Gender': 'M',
  'Name': 'Tyler',
  'State': 'FL'},
 {'_id': ObjectId('5eb51b8b9993d0175fba3356'),
  'Gender': 'M',
  'Name': 'Tyler',
  'State': 'NY'},
 {'_id': ObjectId('5eb51b899993d0175fb6c5f8'),
  'Gender': 'F',
  'Name': 'Vanessa',
  'State': 'CA'},
 {'_id': ObjectId('5eb51b8a9993d0175fb88d95'),
  'Gender': 'M',
  'Name': 'Victor',
  'State': 'CA'},
 {'_id': ObjectId('5eb51b889993d0175fb5f697'),
  'Gender': 'F',
  'Name': 'Victoria',
  'State': 'CA'},
 {'_id': ObjectId('5eb51b899993d0175fb7f4bd'),
  'Gender': 'F',
  'Name': 'Victoria',
  'State': 'TX'},
 {'_id': ObjectId('5eb51b889993d0175fb65cc9'),
  'Gender': 'M',
  'Name': 'William',
  'State': 'GA'},
 {'_id': ObjectId('5eb51b899993d0175fb6c672'),
  'Gender': 'M',
  'Name': 'William',
  'State': 'CA'},
 {'_id': ObjectId('5eb51b899993d0175fb729fd'),
  'Gender': 'M',
  'Name': 'William',
  'State': 'NY'},
 {'_id': ObjectId('5eb51b8a9993d0175fb8a72f'),
  'Gender': 'M',
  'Name': 'William',
  'State': 'TX'},
 {'_id': ObjectId('5eb51b8b9993d0175fb9dec1'),
  'Gender': 'M',
  'Name': 'William',
  'State': 'NC'},
 {'_id': ObjectId('5eb51b889993d0175fb5ef4c'),
  'Gender': 'M',
  'Name': 'Zachary',
  'State': 'FL'},
 {'_id': ObjectId('5eb51b899993d0175fb672cc'),
  'Gender': 'M',
  'Name': 'Zachary',
  'State': 'NY'},
 {'_id': ObjectId('5eb51b899993d0175fb683c1'),
  'Gender': 'M',
  'Name': 'Zachary',
  'State': 'TX'},
 {'_id': ObjectId('5eb51b8b9993d0175fb982df'),
  'Gender': 'M',
  'Name': 'Zachary',
  'State': 'OH'},
 {'_id': ObjectId('5eb51b8b9993d0175fb9f34a'),
  'Gender': 'M',
  'Name': 'Zachary',
  'State': 'CA'}]
In [32]:
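# Match documents whose count for the year 2000 is greater than 1,000,
# and exclude the YearsCountDict field from the returned documents
query = {"YearsCountDict.2000": {"$gt": 1000}}
list(collection.find(query, {'YearsCountDict': 0}).sort("Name").limit(2))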
Out[32]:
[{'_id': ObjectId('5eb51b899993d0175fb78593'),
  'Gender': 'M',
  'Name': 'Aaron',
  'State': 'CA'},
 {'_id': ObjectId('5eb51b8a9993d0175fb8bd8c'),
  'Gender': 'F',
  'Name': 'Abigail',
  'State': 'TX'}]
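
To make the semantics of this kind of query concrete, here is a minimal pure-Python sketch of the same four steps: filter on a nested field, project out a field, sort, and limit. It does not use MongoDB at all. The `docs` list contains made-up sample documents, and `get_path` and `find_sorted` are hypothetical helpers written for this illustration, not pymongo functions.

```python
# Made-up sample documents shaped like the ones in the collection (values are illustrative).
docs = [
    {"Name": "Abigail", "Gender": "F", "State": "TX", "YearsCountDict": {"2000": 1200}},
    {"Name": "Zed", "Gender": "M", "State": "CA", "YearsCountDict": {"2000": 15}},
    {"Name": "Aaron", "Gender": "M", "State": "CA", "YearsCountDict": {"2000": 1500, "2001": 1400}},
    {"Name": "Bob", "Gender": "M", "State": "NY", "YearsCountDict": {"1999": 5000}},
]

def get_path(doc, dotted_path):
    """Follow a dotted path such as 'YearsCountDict.2000'; return None if any key is missing."""
    value = doc
    for key in dotted_path.split("."):
        if not isinstance(value, dict) or key not in value:
            return None
        value = value[key]
    return value

def find_sorted(docs, path, threshold, exclude, sort_key, limit):
    # Filter: keep documents whose nested value exists and is greater than the threshold ($gt).
    matched = []
    for d in docs:
        value = get_path(d, path)
        if value is not None and value > threshold:
            matched.append(d)
    # Projection: drop the excluded fields (the {'YearsCountDict': 0} part).
    projected = [{k: v for k, v in d.items() if k not in exclude} for d in matched]
    # Sort ascending by sort_key, then keep the first `limit` documents.
    return sorted(projected, key=lambda d: d[sort_key])[:limit]

result = find_sorted(docs, "YearsCountDict.2000", 1000, {"YearsCountDict"}, "Name", 2)
print(result)
```

Documents that lack the "2000" key (like "Bob" here) simply do not match, which mirrors how a `$gt` condition behaves on a missing field.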

2. Visualizing Geographic Data

2.1 Working with Cartopy

In this section, we use Cartopy, a Python package for drawing maps on top of Matplotlib. Cartopy makes it easy to plot a variety of map projections:

In [35]:
import cartopy.crs as ccrs
import matplotlib.pyplot as plt
import warnings
warnings.filterwarnings('ignore')
%matplotlib inline

plt.figure(figsize=(8, 8))
ax = plt.axes(projection=ccrs.PlateCarree())
print(f"ax type: {type(ax)}")
ax.coastlines()
ax type: <class 'cartopy.mpl.geoaxes.GeoAxesSubplot'>
Out[35]:
<cartopy.mpl.feature_artist.FeatureArtist at 0xa1a3d9c50>
In [36]:
plt.figure(figsize=(8, 8))
ax = plt.axes(projection=ccrs.InterruptedGoodeHomolosine())
print(f"ax type: {type(ax)}")
ax.coastlines()
ax.stock_img() # Add a standard image to the map -> add some colors :)
ax type: <class 'cartopy.mpl.geoaxes.GeoAxesSubplot'>
Out[36]:
<matplotlib.image.AxesImage at 0xa1c663940>
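
A projection is simply a rule that maps longitude and latitude to x and y coordinates on a flat plane. The sketch below illustrates this with two simplified spherical formulas: the equirectangular mapping used by PlateCarree, and the classic Mercator formula for contrast. These are textbook formulas written for illustration, not Cartopy's actual implementation, and the function names are hypothetical.

```python
import math

def plate_carree(lon_deg, lat_deg, central_longitude=0.0):
    """Equirectangular mapping: x and y are just degrees of longitude and latitude."""
    x = (lon_deg - central_longitude + 180.0) % 360.0 - 180.0  # wrap into [-180, 180)
    return x, lat_deg

def spherical_mercator(lon_deg, lat_deg, radius=1.0):
    """Mercator on a sphere: latitude is stretched more and more toward the poles."""
    x = radius * math.radians(lon_deg)
    y = radius * math.log(math.tan(math.pi / 4 + math.radians(lat_deg) / 2))
    return x, y

# Compare how the two projections space out the same latitudes.
for lat in (0, 30, 60, 80):
    _, y_pc = plate_carree(0, lat)
    _, y_merc = spherical_mercator(0, lat)
    print(f"lat={lat:>2}  PlateCarree y={y_pc:>5.1f}  Mercator y={y_merc:.3f}")
```

In PlateCarree the latitudes stay evenly spaced, whereas in Mercator the spacing grows toward the poles. This is why the same coastline looks different under each projection.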

Let's add some data to the map:

In [37]:
plt.figure(figsize=(20, 20))
ax = plt.axes(projection=ccrs.PlateCarree(central_longitude=0))
ax.set_extent([-20, 20, 40, 60])  # Select a specific part of the map
ax.coastlines(resolution='10m', color='black', linewidth=1) # draw with better coastline resolution
ax.stock_img() # add colors 

london_lon, london_lat = -0.1278, 51.5074  # London lies slightly west of the prime meridian
plt.plot(london_lon, london_lat,
         color='blue', marker='o', markersize=8
         )
ax.text(london_lon, 52, 'London', fontsize=14)
Out[37]:
Text(-0.1278, 52, 'London')
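
Note that the extent passed to set_extent is ordered as [lon_min, lon_max, lat_min, lat_max], i.e., the x limits first and then the y limits. As a small sketch of what this means, the hypothetical helper below checks which points would fall inside such an extent. The city coordinates are approximate values chosen for illustration.

```python
def in_extent(lon, lat, extent):
    """Return True if (lon, lat) lies inside [lon_min, lon_max, lat_min, lat_max]."""
    lon_min, lon_max, lat_min, lat_max = extent
    return lon_min <= lon <= lon_max and lat_min <= lat <= lat_max

extent = [-20, 20, 40, 60]  # the same extent used in the cell above

# Approximate (longitude, latitude) pairs; illustrative values only.
cities = {
    "Paris": (2.3522, 48.8566),
    "Rome": (12.4964, 41.9028),
    "Athens": (23.7275, 37.9838),
}
visible = {name: in_extent(lon, lat, extent) for name, (lon, lat) in cities.items()}
print(visible)
```

This simple check assumes the extent does not cross the antimeridian (longitude ±180).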

Let's plot a map with all the capital cities' names:

In [38]:
import turicreate as tc


!wget -O ./datasets/country-capitals.csv http://techslides.com/demos/country-capitals.csv
sf = tc.SFrame.read_csv("./datasets/country-capitals.csv", error_bad_lines=False)
sf
--2020-05-08 11:58:21--  http://techslides.com/demos/country-capitals.csv
Resolving techslides.com (techslides.com)... 107.170.15.66
Connecting to techslides.com (techslides.com)|107.170.15.66|:80... connected.
HTTP request sent, awaiting response... 200 OK
Length: 13643 (13K) [application/octet-stream]
Saving to: ‘./datasets/country-capitals.csv’

./datasets/country- 100%[===================>]  13.32K  --.-KB/s    in 0s      

2020-05-08 11:58:22 (129 MB/s) - ‘./datasets/country-capitals.csv’ saved [13643/13643]

Unexpected characters after last column. "Central America"
Parse failed at token ending at: 
	United States,Washington, D.C.,38.883333,-77.000000,US,Central America^
Successfully parsed 6 tokens: 
	0: United States
	1: Washington
	2: D.C.
	3: 38.8833
	4: -77
	5: US
Unexpected characters after last column. "Australia"
Parse failed at token ending at: 
	US Minor Outlying Islands,Washington, D.C.,38.883333,-77.000000,UM,Australia^
Successfully parsed 6 tokens: 
	0: US Minor O ... ng Islands
	1: Washington
	2: D.C.
	3: 38.8833
	4: -77
	5: UM
------------------------------------------------------
Inferred types from first 100 line(s) of file as 
column_type_hints=[str,str,float,float,str,str]
If parsing fails due to incorrect types, you can correct
the inferred type list above and pass it to read_csv in
the column_type_hints argument
------------------------------------------------------
2 lines failed to parse correctly
Finished parsing file /Users/michael/Dropbox (BGU)/massive data mining/ 2020/notebooks/datasets/country-capitals.csv
Parsing completed. Parsed 100 lines in 0.027236 secs.
Unable to interpret "D.C." as a integer
Parse failed at token ending at: 
	United States,Washington, D.C.,^38.883333,-77.000000,US,Central America
Successfully parsed 2 tokens: 
	0: United States
	1: Washington
Unable to interpret "D.C." as a integer
Parse failed at token ending at: 
	US Minor Outlying Islands,Washington, D.C.,^38.883333,-77.000000,UM,Australia
Successfully parsed 2 tokens: 
	0: US Minor O ... ng Islands
	1: Washington
2 lines failed to parse correctly
Finished parsing file /Users/michael/Dropbox (BGU)/massive data mining/ 2020/notebooks/datasets/country-capitals.csv
Parsing completed. Parsed 243 lines in 0.006348 secs.
Out[38]:
CountryName                                   CapitalName        CapitalLatitude     CapitalLongitude  CountryCode  ContinentName
Somaliland                                    Hargeisa           9.55                44.05             NULL         Africa
South Georgia and South Sandwich Islands ...  King Edward Point  -54.283333          -36.5             GS           Antarctica
French Southern and Antarctic Lands ...       Port-aux-Français  -49.35              70.216667         TF           Antarctica
Palestine                                     Jerusalem          31.766666666666666  35.233333         PS           Asia
Aland Islands                                 Mariehamn          60.116667           19.9              AX           Europe
Nauru                                         Yaren              -0.5477             166.920867        NR           Australia
Saint Martin                                  Marigot            18.0731             -63.0822          MF           North America
Tokelau                                       Atafu              -9.166667           -171.833333       TK           Australia
Western Sahara                                El-Aaiún           27.153611           -13.203333        EH           Africa
Afghanistan                                   Kabul              34.516666666666666  69.183333         AF           Asia
[243 rows x 6 columns]
Note: Only the head of the SFrame is printed.
You can use print_rows(num_rows=m, num_columns=n) to print more rows and columns.
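
The two lines that failed to parse both contain the capital name "Washington, D.C.", whose unquoted comma produces seven fields instead of six. One possible way to keep such rows, instead of dropping them, is to repair them before loading. The following is a minimal sketch using only the standard library. The `repair_rows` helper is hypothetical, the sample lines are modeled on the log above, and the sketch assumes that any surplus comma always belongs to the capital name.

```python
import csv
import io

# Sample lines: the header, one well-formed row, and one broken row from the parser log.
raw = (
    "CountryName,CapitalName,CapitalLatitude,CapitalLongitude,CountryCode,ContinentName\n"
    "Afghanistan,Kabul,34.516666666666666,69.183333,AF,Asia\n"
    "United States,Washington, D.C.,38.883333,-77.000000,US,Central America\n"
)

def repair_rows(text, n_columns=6):
    """Merge surplus fields back into the capital name (column index 1)."""
    rows = []
    for fields in csv.reader(io.StringIO(text)):
        extra = len(fields) - n_columns
        if extra > 0:
            # Re-join the pieces of the capital name that were split on commas.
            capital = ",".join(fields[1:2 + extra])
            fields = [fields[0], capital] + fields[2 + extra:]
        rows.append(fields)
    return rows

rows = repair_rows(raw)
print(rows[2])
```

The repaired rows could then be written back to a CSV file with csv.writer, which quotes fields that contain commas, so that the file parses cleanly.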
In [39]:
def draw_map(w_size=30, h_size=30):
    plt.figure(figsize=(w_size, h_size))
    ax = plt.axes(projection=ccrs.PlateCarree(central_longitude=0))
    ax.coastlines(resolution='10m', color='black', linewidth=1) # draw with better coastline resolution
    return ax
ax = draw_map()
for r in sf:
    lon, lat, name = r['CapitalLongitude'], r['CapitalLatitude'], r['CapitalName']
    plt.plot(lon, lat,
         color='black', marker='o', markersize=4,transform=ccrs.PlateCarree(),
         )
    ax.text(lon, lat+0.2, name, fontsize=8, color="blue", transform=ccrs.PlateCarree(),)

ax.stock_img() # add colors
Out[39]:
<matplotlib.image.AxesImage at 0xa1e1215c0>
In [40]:
r = sf[sf['CapitalName'] == 'Canberra'][0]
canb_long, canb_lat = r['CapitalLongitude'], r['CapitalLatitude']
r = sf[sf['CapitalName'] == 'London'][0]
lon_long, lon_lat = r['CapitalLongitude'], r['CapitalLatitude']

ax = draw_map(20,40)

plt.plot([lon_long, canb_long], [lon_lat, canb_lat],
         color='blue', linewidth=2, marker='o',
         )
ax.text(canb_long, canb_lat+0.5, "Canberra", fontsize=16, color="red", transform=ccrs.PlateCarree())
ax.text(lon_long, lon_lat+0.5, "London", fontsize=16, color="red", transform=ccrs.PlateCarree())
Out[40]:
Text(-0.083333, 52.0, 'London')

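The line above is straight in the map's PlateCarree coordinates. Let's instead plot the line connecting London and Canberra along the shortest path on the globe, by passing transform=ccrs.Geodetic():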

In [41]:
r = sf[sf['CapitalName'] == 'Canberra'][0]
canb_long, canb_lat = r['CapitalLongitude'], r['CapitalLatitude']
r = sf[sf['CapitalName'] == 'London'][0]
lon_long, lon_lat = r['CapitalLongitude'], r['CapitalLatitude']

ax = draw_map(20,40)

plt.plot([lon_long, canb_long], [lon_lat, canb_lat],
         color='blue', linewidth=2, marker='o',
         transform=ccrs.Geodetic(),
         )
ax.text(canb_long, canb_lat+0.5, "Canberra", fontsize=16, color="red", transform=ccrs.PlateCarree())
ax.text(lon_long, lon_lat+0.5, "London", fontsize=16, color="red", transform=ccrs.PlateCarree())
Out[41]:
Text(-0.083333, 52.0, 'London')
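
The geodesic line follows the shortest path on the globe (a great circle). As a rough sketch of how long that path is, we can use the haversine formula on a spherical Earth. London's coordinates are taken from the output above; Canberra's coordinates and the Earth radius of 6,371 km are approximate values assumed for illustration.

```python
import math

def haversine_km(lon1, lat1, lon2, lat2, radius_km=6371.0):
    """Great-circle distance between two (lon, lat) points in degrees, on a sphere."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * radius_km * math.asin(math.sqrt(a))

london = (-0.083333, 51.5)        # (lon, lat), as in the dataset output above
canberra = (149.1333, -35.2667)   # (lon, lat), approximate values assumed here

distance = haversine_km(*london, *canberra)
print(f"London-Canberra great-circle distance: {distance:,.0f} km")
```

The result is roughly 17,000 km. Because the Earth is not a perfect sphere, this is only an approximation.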

Let's draw both lines on the same map, the geodesic path (solid blue) and the straight line in PlateCarree coordinates (dashed gray):

In [42]:
ax = draw_map(20,40)

plt.plot([lon_long, canb_long], [lon_lat, canb_lat],
         color='blue', linewidth=2, marker='o',
         transform=ccrs.Geodetic(),
         )
ax.text(canb_long, canb_lat+0.5, "Canberra", fontsize=16, color="red", transform=ccrs.PlateCarree())
ax.text(lon_long, lon_lat+0.5, "London", fontsize=16, color="red", transform=ccrs.PlateCarree())

plt.plot([lon_long, canb_long], [lon_lat, canb_lat],
         color='gray', linestyle='--',
         transform=ccrs.PlateCarree(),
         )
Out[42]:
[<matplotlib.lines.Line2D at 0xa22f8bc18>]
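
To see why the two lines differ, we can compare their midpoints. The sketch below computes the midpoint of the great-circle path with the standard spherical midpoint formula and compares it with the simple average of the coordinates, which is the midpoint of the straight PlateCarree line. The function name is hypothetical, and Canberra's coordinates are approximate values assumed for illustration.

```python
import math

def great_circle_midpoint(lon1, lat1, lon2, lat2):
    """Midpoint of the shortest path between two points on a sphere (degrees in and out)."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dlam = math.radians(lon2 - lon1)
    bx = math.cos(phi2) * math.cos(dlam)
    by = math.cos(phi2) * math.sin(dlam)
    phi_m = math.atan2(math.sin(phi1) + math.sin(phi2),
                       math.hypot(math.cos(phi1) + bx, by))
    lam_m = math.radians(lon1) + math.atan2(by, math.cos(phi1) + bx)
    lon_m = (math.degrees(lam_m) + 180.0) % 360.0 - 180.0  # wrap into [-180, 180)
    return lon_m, math.degrees(phi_m)

london = (-0.083333, 51.5)        # (lon, lat)
canberra = (149.1333, -35.2667)   # (lon, lat), approximate values assumed here

geo_mid = great_circle_midpoint(*london, *canberra)
flat_mid = ((london[0] + canberra[0]) / 2, (london[1] + canberra[1]) / 2)
print(f"Great-circle midpoint (lon, lat): ({geo_mid[0]:.1f}, {geo_mid[1]:.1f})")
print(f"PlateCarree midpoint  (lon, lat): ({flat_mid[0]:.1f}, {flat_mid[1]:.1f})")
```

The great-circle midpoint lies much farther north and east than the averaged coordinates, which matches the way the blue line bends away from the dashed gray line on the map.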

We can add additional features to each map:

In [43]:
import cartopy.feature as cfeature

fig = plt.figure(figsize=(30,30))
ax = fig.add_subplot(1, 1, 1, projection=ccrs.PlateCarree())
ax.stock_img()

# Create a feature for States from Natural Earth. See https://www.naturalearthdata.com/features/
states_provinces = cfeature.NaturalEarthFeature(
    category='cultural',
    name='admin_1_states_provinces_lines',
    scale='10m',
    )
# It is possible to add land, river, lake, border, and coastline features
ax.add_feature(cfeature.LAND)
ax.add_feature(cfeature.COASTLINE)
ax.add_feature(cfeature.RIVERS)
ax.add_feature(states_provinces, edgecolor='gray')
Out[43]:
<cartopy.mpl.feature_artist.FeatureArtist at 0xa23279128>

2.1.1 The US Elections

Let's move on to a map of the US states:

In [52]:
import cartopy.io.shapereader as shpreader

fig = plt.figure(figsize=(30,30))
ax = fig.add_subplot(1, 1, 1, projection=ccrs.LambertConformal())
ax.set_extent([-125, -66.5, 20, 50], crs= ccrs.Geodetic())

ax.background_patch.set_visible(False)
ax.outline_patch.set_visible(False)

shapename = 'admin_1_states_provinces_lakes_shp'
states_shp = shpreader.natural_earth(resolution='110m',category='cultural', name=shapename)
ax.add_geometries(shpreader.Reader(states_shp).geometries(), ccrs.PlateCarree(), edgecolor='#FFFFFF')
Out[52]:
<cartopy.mpl.feature_artist.FeatureArtist at 0xa29f7b630>

Each shape record contains attributes and bounds:

In [53]:
k = list(shpreader.Reader(states_shp).records())[0]
print(f"Bounds: {k.bounds}")
k.attributes
Bounds: (-97.22894344764502, 43.500187486335385, -89.5997116839191, 49.38928538674973)
Out[53]:
{'scalerank': 2,
 'featurecla': 'Admin-1 scale rank',
 'adm1_code': 'USA-3514',
 'diss_me': 3514,
 'adm1_cod_1': 'USA-3514',
 'iso_3166_2': 'US-MN',
 'wikipedia': 'http://en.wikipedia.org/wiki/Minnesota',
 'sr_sov_a3': 'US1',
 'sr_adm0_a3': 'USA',
 'iso_a2': 'US',
 'adm0_sr': 1,
 'admin0_lab': 2,
 'name': 'Minnesota',
 'name_alt': 'MN|Minn.',
 'name_local': '',
 'type': 'State',
 'type_en': 'State',
 'code_local': 'US32',
 'code_hasc': 'US.MN',
 'note': '',
 'hasc_maybe': '',
 'region': 'Midwest',
 'region_cod': '',
 'region_big': 'West North Central',
 'big_code': '',
 'provnum_ne': 0,
 'gadm_level': 1,
 'check_me': 10,
 'scaleran_1': 2,
 'datarank': 1,
 'abbrev': 'Minn.',
 'postal': 'MN',
 'area_sqkm': 0.0,
 'sameascity': -99,
 'labelrank': 0,
 'featurec_1': 'Admin-1 scale rank',
 'admin': 'United States of America',
 'name_len': 9,
 'mapcolor9': 1,
 'mapcolor13': 1}
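
The bounds tuple is ordered as (min_lon, min_lat, max_lon, max_lat). As a small sketch of how it can be used, the hypothetical helpers below compute the center of Minnesota's bounding box (handy as a rough position for a label) and test whether a point falls inside the box. Keep in mind that the box center is not the true centroid of the state, and the Minneapolis coordinates are approximate values assumed for illustration.

```python
# Bounds of the first record (Minnesota), copied from the output above.
bounds = (-97.22894344764502, 43.500187486335385, -89.5997116839191, 49.38928538674973)

def bbox_center(bounds):
    """Center of a (min_lon, min_lat, max_lon, max_lat) bounding box."""
    min_lon, min_lat, max_lon, max_lat = bounds
    return (min_lon + max_lon) / 2, (min_lat + max_lat) / 2

def bbox_contains(bounds, lon, lat):
    """True if the point lies inside the bounding box (not necessarily inside the state)."""
    min_lon, min_lat, max_lon, max_lat = bounds
    return min_lon <= lon <= max_lon and min_lat <= lat <= max_lat

center_lon, center_lat = bbox_center(bounds)
print(f"Bounding-box center: lon={center_lon:.3f}, lat={center_lat:.3f}")
print(bbox_contains(bounds, -93.265, 44.978))  # approximate location of Minneapolis
```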

Let's color the states according to their votes in the U.S. presidential elections:

In [58]:
import turicreate as tc
import turicreate.aggregate as agg
dataset_path = "./datasets/1976-2016-president.csv"
sf = tc.SFrame.read_csv(dataset_path)
sf = sf["year", "state",'state_po', "party", "candidatevotes"]
# Normalize party-name variants (e.g., Minnesota's Democratic–Farmer–Labor party) to "democrat"
sf["party"] = sf["party"].apply(lambda s: "democrat" if "democrat" in s else s)
d_sf = sf[sf["party"] == "democrat"]
r_sf = sf[sf["party"] == "republican"]
v_sf =  d_sf.join(r_sf,on={"state":"state", "year":"year"})
v_sf = v_sf.rename({'candidatevotes': 'democrat_votes', 'candidatevotes.1': 'republican_votes' })
v_sf['result'] = v_sf.apply(lambda r: 'democrat' if r['democrat_votes'] > r['republican_votes'] else "republican")
v_sf
Finished parsing file /Users/michael/Dropbox (BGU)/massive data mining/ 2020/notebooks/datasets/1976-2016-president.csv
Parsing completed. Parsed 100 lines in 0.032195 secs.
------------------------------------------------------
Inferred types from first 100 line(s) of file as 
column_type_hints=[int,str,str,int,int,int,str,str,str,str,int,int,int,str]
If parsing fails due to incorrect types, you can correct
the inferred type list above and pass it to read_csv in
the column_type_hints argument
------------------------------------------------------
Finished parsing file /Users/michael/Dropbox (BGU)/massive data mining/ 2020/notebooks/datasets/1976-2016-president.csv
Parsing completed. Parsed 3740 lines in 0.01495 secs.
Out[58]:
year state state_po party democrat_votes state_po.1 party.1 republican_votes
1976 Alabama AL democrat 659170 AL republican 504070
1976 Alaska AK democrat 44058 AK republican 71555
1976 Arizona AZ democrat 295602 AZ republican 418642
1976 Arkansas AR democrat 498604 AR republican 267903
1976 California CA democrat 3742284 CA republican 3882244
1976 Colorado CO democrat 460801 CO republican 584278
1976 Connecticut CT democrat 647895 CT republican 719261
1976 Delaware DE democrat 122461 DE republican 109780
1976 District of Columbia DC democrat 137818 DC republican 27873
1976 Florida FL democrat 1636000 FL republican 1469531
result
democrat
republican
republican
democrat
republican
republican
republican
democrat
democrat
democrat
[567 rows x 9 columns]
Note: Only the head of the SFrame is printed.
You can use print_rows(num_rows=m, num_columns=n) to print more rows and columns.
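
To make the join-and-compare step concrete, here is a minimal pure-Python sketch of the same idea: collect the Democratic and Republican vote counts per (year, state) and keep the larger one. The function name `winners` is hypothetical, and the four rows are copied from the table above. Note that the join above returned 567 rows, a few more than the 51 × 11 = 561 one might expect from 51 entries (the states plus the District of Columbia) over 11 elections, which suggests that some state and year pairs have more than one row for a party. This sketch keeps the largest count per party in that case, which is an assumption and not what the SFrame join does.

```python
# Rows copied from the first lines of the table above.
rows = [
    {"year": 1976, "state": "Alabama", "party": "democrat", "candidatevotes": 659170},
    {"year": 1976, "state": "Alabama", "party": "republican", "candidatevotes": 504070},
    {"year": 1976, "state": "Alaska", "party": "democrat", "candidatevotes": 44058},
    {"year": 1976, "state": "Alaska", "party": "republican", "candidatevotes": 71555},
]


def winners(rows):
    """Map (year, state) to the party with more votes (hypothetical helper)."""
    votes = {}
    for r in rows:
        party = r["party"]
        if party not in ("democrat", "republican"):
            continue  # ignore other parties, as the notebook does
        per_state = votes.setdefault((r["year"], r["state"]), {})
        # Assumption: if a party appears more than once, keep its largest count.
        per_state[party] = max(per_state.get(party, 0), r["candidatevotes"])
    # Like an inner join, keep only pairs that have both parties.
    return {
        key: ("democrat" if v["democrat"] > v["republican"] else "republican")
        for key, v in votes.items()
        if len(v) == 2
    }


print(winners(rows))
# {(1976, 'Alabama'): 'democrat', (1976, 'Alaska'): 'republican'}
```
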
In [61]:
import matplotlib.patches as mpatches

def draw_us_map():
    fig = plt.figure(figsize=(30,30))
    ax = fig.add_subplot(1, 1, 1, projection=ccrs.LambertConformal())
    ax.set_extent([-125, -66.5, 20, 50], crs= ccrs.Geodetic())

    ax.background_patch.set_visible(False)
    ax.outline_patch.set_visible(False)

    shapename = 'admin_1_states_provinces_lakes_shp'
    states_shp = shpreader.natural_earth(resolution='110m',category='cultural', name=shapename)
    ax.add_geometries(shpreader.Reader(states_shp).geometries(), ccrs.PlateCarree())
    return ax

def create_election_result_by_year(sf, year):
    sf =  sf[sf["year"] == year]
    results_dict = {}
    for r in sf:
        results_dict[r['state']] = r['result']
        results_dict[r['state_po']] = r['result'] # adding additional name options for each state

    ax = draw_us_map()
    
    for state_record in shpreader.Reader(states_shp).records():
        edgecolor = 'black'
        if 'postal' not in state_record.attributes:
            continue
        
        name = state_record.attributes['postal']
        # green marks states with no recorded result for this year
        facecolor = 'green'
        if results_dict.get(name) == 'democrat':
            facecolor = 'blue'
        elif results_dict.get(name) == 'republican':
            facecolor = 'red'
        ax.add_geometries([state_record.geometry], ccrs.PlateCarree(),
                          facecolor=facecolor, edgecolor=edgecolor)
    # add a title and a legend
    ax.set_title(f'{year} United States Presidential Election', fontsize=42)
    republican = mpatches.Rectangle((0, 0), 1, 1, facecolor="red")
    democrat = mpatches.Rectangle((0, 0), 1, 1, facecolor="blue")
    labels = ['Democrat won', 'Republican won']
    ax.legend([democrat, republican], labels,
              loc='lower left', bbox_to_anchor=(0.025, 0.05), fancybox=True,
              prop={'size': 32})
    return ax

create_election_result_by_year(v_sf, 2016)
Out[61]:
<cartopy.mpl.geoaxes.GeoAxesSubplot at 0xa2afd9a90>
In [67]:
from tqdm import tqdm
import imageio  # requires the imageio package (pip install imageio)
!mkdir ./images
!mkdir ./images/elections
# Create one image per election year
years = list(v_sf['year'].unique().sort())
images_path = "./images/elections"  # no trailing slash; the f-strings below add one
images_list = []
for y in tqdm(years):
    ax = create_election_result_by_year(v_sf, y)
    img_path = f"{images_path}/{y}_elections.png"
    plt.savefig(img_path)
    images_list.append(img_path)
    plt.clf()

# Combine the saved frames into an animated GIF
images = []
for filename in images_list:
    images.append(imageio.imread(filename))
imageio.mimsave(f"{images_path}/all_elections.gif", images, duration=1 )
mkdir: ./images: File exists
mkdir: ./images/elections: File exists
100%|██████████| 11/11 [00:09<00:00,  1.15it/s]
<Figure size 2160x2160 with 0 Axes>

[Animated GIF: all_elections.gif, one frame per election year]

2.1.2 Exploring Flights

In this section, we will explore flight routes. Let's start by loading the flights dataset, which contains over 5.8 million flights, into an SFrame object:

In [70]:
#!mkdir ./datasets
!mkdir ./datasets/flights

# download the dataset from Kaggle and unzip it
!kaggle datasets download freddejn/flights  -p ./datasets/flights

!unzip ./datasets/flights/*.zip  -d ./datasets/flights/
Archive:  ./datasets/flights/flights.zip
  inflating: ./datasets/flights/L_AIRPORT.csv  
  inflating: ./datasets/flights/L_AIRPORT_ID.csv  
  inflating: ./datasets/flights/cleaned_and_sampled_flights_v2.csv  
  inflating: ./datasets/flights/flights.csv  
In [95]:
import turicreate as tc
import turicreate.aggregate as agg

dataset_path = "./datasets/flights"
sf = tc.SFrame.read_csv(f"{dataset_path}/flights.csv")
sf
Finished parsing file /Users/michael/Dropbox (BGU)/massive data mining/ 2020/notebooks/datasets/flights/flights.csv
Parsing completed. Parsed 100 lines in 0.828921 secs.
------------------------------------------------------
Inferred types from first 100 line(s) of file as 
column_type_hints=[int,int,int,int,str,int,str,str,str,int,int,int,int,int,int,int,int,int,int,int,int,int,int,int,int,str,int,int,int,int,int]
If parsing fails due to incorrect types, you can correct
the inferred type list above and pass it to read_csv in
the column_type_hints argument
------------------------------------------------------
Read 520345 lines. Lines per second: 328598
Read 3627333 lines. Lines per second: 498810
Finished parsing file /Users/michael/Dropbox (BGU)/massive data mining/ 2020/notebooks/datasets/flights/flights.csv
Parsing completed. Parsed 5819079 lines in 11.2635 secs.
Out[95]:
YEAR MONTH DAY DAY_OF_WEEK AIRLINE FLIGHT_NUMBER TAIL_NUMBER ORIGIN_AIRPORT DESTINATION_AIRPORT SCHEDULED_DEPARTURE
2015 1 1 4 AS 98 N407AS ANC SEA 5
2015 1 1 4 AA 2336 N3KUAA LAX PBI 10
2015 1 1 4 US 840 N171US SFO CLT 20
2015 1 1 4 AA 258 N3HYAA LAX MIA 20
2015 1 1 4 AS 135 N527AS SEA ANC 25
2015 1 1 4 DL 806 N3730B SFO MSP 25
2015 1 1 4 NK 612 N635NK LAS MSP 25
2015 1 1 4 US 2013 N584UW LAX CLT 30
2015 1 1 4 AA 1112 N3LAAA SFO DFW 30
2015 1 1 4 DL 1173 N826DN LAS ATL 30
DEPARTURE_TIME DEPARTURE_DELAY TAXI_OUT WHEELS_OFF SCHEDULED_TIME ELAPSED_TIME AIR_TIME DISTANCE WHEELS_ON
2354 -11 21 15 205 194 169 1448 404
2 -8 12 14 280 279 263 2330 737
18 -2 16 34 286 293 266 2296 800
15 -5 15 30 285 281 258 2342 748
24 -1 11 35 235 215 199 1448 254
20 -5 18 38 217 230 206 1589 604
19 -6 11 30 181 170 154 1299 504
44 14 13 57 273 249 228 2125 745
19 -11 17 36 195 193 173 1464 529
33 3 12 45 221 203 186 1747 651
TAXI_IN SCHEDULED_ARRIVAL ARRIVAL_TIME ARRIVAL_DELAY DIVERTED CANCELLED CANCELLATION_REASON AIR_SYSTEM_DELAY
4 430 408 -22 0 0 None
4 750 741 -9 0 0 None
11 806 811 5 0 0 None
8 805 756 -9 0 0 None
5 320 259 -21 0 0 None
6 602 610 8 0 0 None
5 526 509 -17 0 0 None
8 803 753 -10 0 0 None
3 545 532 -13 0 0 None
5 711 656 -15 0 0 None
SECURITY_DELAY AIRLINE_DELAY LATE_AIRCRAFT_DELAY WEATHER_DELAY
None None None None
None None None None
None None None None
None None None None
None None None None
None None None None
None None None None
None None None None
None None None None
None None None None
[5819079 rows x 31 columns]
Note: Only the head of the SFrame is printed.
You can use print_rows(num_rows=m, num_columns=n) to print more rows and columns.

Let's calculate how many flights took place on each route:

In [96]:
g = sf.groupby(['ORIGIN_AIRPORT', 'DESTINATION_AIRPORT'], {'total_flights': agg.COUNT()})
g.sort('total_flights', ascending=False)
Out[96]:
DESTINATION_AIRPORT ORIGIN_AIRPORT total_flights
LAX SFO 13744
SFO LAX 13457
LAX JFK 12016
JFK LAX 12015
LAX LAS 9715
ORD LGA 9639
LAS LAX 9594
LGA ORD 9575
JFK SFO 8440
SFO JFK 8437
[8609 rows x 3 columns]
Note: Only the head of the SFrame is printed.
You can use print_rows(num_rows=m, num_columns=n) to print more rows and columns.
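
As a rough illustration of what the `groupby` with `agg.COUNT()` computes, the same tally can be sketched in plain Python with `collections.Counter`. The four flights below are a made-up toy sample, not rows from the dataset.

```python
from collections import Counter

# Toy (origin, destination) pairs; hypothetical sample data.
flights = [("SFO", "LAX"), ("LAX", "SFO"), ("SFO", "LAX"), ("JFK", "LAX")]

# Each distinct (origin, destination) pair is one route.
route_counts = Counter(flights)

# most_common() returns routes sorted by descending count,
# similar to sorting the SFrame on total_flights.
for (origin, destination), total in route_counts.most_common():
    print(origin, destination, total)
```

Because the key is an ordered pair, SFO to LAX and LAX to SFO are counted as two different routes, which is why both directions appear separately in the table above.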

Let's create a flight network and visualize it using Cytoscape:

In [97]:
import networkx as nx

ng = nx.DiGraph()
for r in g:
    ng.add_edge(r['ORIGIN_AIRPORT'], r['DESTINATION_AIRPORT'], weight=r['total_flights'])
nx.write_gml(ng, f"{dataset_path}/flights_network.gml")
nx.info(ng)
Out[97]:
'Name: \nType: DiGraph\nNumber of nodes: 629\nNumber of edges: 8609\nAverage in degree:  13.6868\nAverage out degree:  13.6868'

[Figure: the flight network visualized in Cytoscape]

We can see that the network consists of two main components. Additionally, each component has a few central nodes (according to the vertices' betweenness centrality). Let's select only one of the components and draw its routes on a map using the airport location data from the OpenFlights website:

In [98]:
cc = nx.weakly_connected_components(ng)
l = list(cc)
l
Out[98]:
[{'10135',
  '10136',
  '10140',
  '10141',
  '10146',
  '10154',
  '10155',
  '10157',
  '10158',
  '10165',
  '10170',
  '10185',
  '10208',
  '10257',
  '10268',
  '10279',
  '10299',
  '10333',
  '10372',
  '10397',
  '10408',
  '10423',
  '10431',
  '10434',
  '10469',
  '10529',
  '10551',
  '10561',
  '10577',
  '10581',
  '10599',
  '10620',
  '10627',
  '10631',
  '10666',
  '10685',
  '10693',
  '10713',
  '10721',
  '10728',
  '10731',
  '10732',
  '10739',
  '10747',
  '10754',
  '10779',
  '10781',
  '10785',
  '10792',
  '10800',
  '10821',
  '10849',
  '10868',
  '10874',
  '10918',
  '10926',
  '10980',
  '10990',
  '10994',
  '11003',
  '11013',
  '11042',
  '11049',
  '11057',
  '11066',
  '11067',
  '11076',
  '11097',
  '11109',
  '11111',
  '11122',
  '11140',
  '11146',
  '11150',
  '11193',
  '11203',
  '11252',
  '11259',
  '11267',
  '11274',
  '11278',
  '11292',
  '11298',
  '11308',
  '11315',
  '11337',
  '11413',
  '11423',
  '11433',
  '11447',
  '11471',
  '11481',
  '11503',
  '11525',
  '11537',
  '11540',
  '11577',
  '11587',
  '11603',
  '11612',
  '11617',
  '11618',
  '11624',
  '11630',
  '11637',
  '11638',
  '11641',
  '11648',
  '11695',
  '11697',
  '11721',
  '11775',
  '11778',
  '11823',
  '11865',
  '11867',
  '11884',
  '11898',
  '11905',
  '11921',
  '11953',
  '11973',
  '11977',
  '11980',
  '11982',
  '11986',
  '11995',
  '11996',
  '12003',
  '12007',
  '12016',
  '12094',
  '12129',
  '12156',
  '12173',
  '12177',
  '12191',
  '12197',
  '12206',
  '12217',
  '12255',
  '12264',
  '12265',
  '12266',
  '12278',
  '12280',
  '12323',
  '12335',
  '12339',
  '12343',
  '12389',
  '12391',
  '12402',
  '12441',
  '12448',
  '12451',
  '12478',
  '12511',
  '12519',
  '12523',
  '12758',
  '12819',
  '12884',
  '12888',
  '12889',
  '12891',
  '12892',
  '12896',
  '12898',
  '12915',
  '12945',
  '12951',
  '12953',
  '12954',
  '12982',
  '12992',
  '13029',
  '13061',
  '13076',
  '13127',
  '13158',
  '13184',
  '13198',
  '13204',
  '13230',
  '13232',
  '13241',
  '13244',
  '13256',
  '13264',
  '13277',
  '13290',
  '13296',
  '13303',
  '13342',
  '13344',
  '13360',
  '13367',
  '13377',
  '13422',
  '13433',
  '13459',
  '13476',
  '13485',
  '13486',
  '13487',
  '13495',
  '13502',
  '13541',
  '13577',
  '13795',
  '13796',
  '13830',
  '13851',
  '13871',
  '13873',
  '13891',
  '13930',
  '13931',
  '13933',
  '13964',
  '13970',
  '14006',
  '14025',
  '14027',
  '14057',
  '14098',
  '14100',
  '14107',
  '14108',
  '14109',
  '14113',
  '14122',
  '14150',
  '14193',
  '14222',
  '14252',
  '14254',
  '14256',
  '14262',
  '14307',
  '14321',
  '14457',
  '14487',
  '14489',
  '14492',
  '14520',
  '14524',
  '14543',
  '14570',
  '14574',
  '14576',
  '14588',
  '14633',
  '14635',
  '14674',
  '14679',
  '14683',
  '14685',
  '14689',
  '14696',
  '14698',
  '14709',
  '14711',
  '14730',
  '14747',
  '14771',
  '14783',
  '14794',
  '14814',
  '14828',
  '14831',
  '14842',
  '14843',
  '14869',
  '14893',
  '14905',
  '14908',
  '14952',
  '14960',
  '14986',
  '15016',
  '15024',
  '15027',
  '15041',
  '15048',
  '15070',
  '15096',
  '15249',
  '15295',
  '15304',
  '15323',
  '15356',
  '15370',
  '15376',
  '15380',
  '15389',
  '15401',
  '15411',
  '15412',
  '15497',
  '15607',
  '15624',
  '15841',
  '15919',
  '15991',
  '16218'},
 {'ABE',
  'ABI',
  'ABQ',
  'ABR',
  'ABY',
  'ACK',
  'ACT',
  'ACV',
  'ACY',
  'ADK',
  'ADQ',
  'AEX',
  'AGS',
  'AKN',
  'ALB',
  'ALO',
  'AMA',
  'ANC',
  'APN',
  'ASE',
  'ATL',
  'ATW',
  'AUS',
  'AVL',
  'AVP',
  'AZO',
  'BDL',
  'BET',
  'BFL',
  'BGM',
  'BGR',
  'BHM',
  'BIL',
  'BIS',
  'BJI',
  'BLI',
  'BMI',
  'BNA',
  'BOI',
  'BOS',
  'BPT',
  'BQK',
  'BQN',
  'BRD',
  'BRO',
  'BRW',
  'BTM',
  'BTR',
  'BTV',
  'BUF',
  'BUR',
  'BWI',
  'BZN',
  'CAE',
  'CAK',
  'CDC',
  'CDV',
  'CEC',
  'CHA',
  'CHO',
  'CHS',
  'CID',
  'CIU',
  'CLD',
  'CLE',
  'CLL',
  'CLT',
  'CMH',
  'CMI',
  'CMX',
  'CNY',
  'COD',
  'COS',
  'COU',
  'CPR',
  'CRP',
  'CRW',
  'CSG',
  'CVG',
  'CWA',
  'DAB',
  'DAL',
  'DAY',
  'DBQ',
  'DCA',
  'DEN',
  'DFW',
  'DHN',
  'DIK',
  'DLG',
  'DLH',
  'DRO',
  'DSM',
  'DTW',
  'DVL',
  'EAU',
  'ECP',
  'EGE',
  'EKO',
  'ELM',
  'ELP',
  'ERI',
  'ESC',
  'EUG',
  'EVV',
  'EWN',
  'EWR',
  'EYW',
  'FAI',
  'FAR',
  'FAT',
  'FAY',
  'FCA',
  'FLG',
  'FLL',
  'FNT',
  'FSD',
  'FSM',
  'FWA',
  'GCC',
  'GCK',
  'GEG',
  'GFK',
  'GGG',
  'GJT',
  'GNV',
  'GPT',
  'GRB',
  'GRI',
  'GRK',
  'GRR',
  'GSO',
  'GSP',
  'GST',
  'GTF',
  'GTR',
  'GUC',
  'GUM',
  'HDN',
  'HIB',
  'HLN',
  'HNL',
  'HOB',
  'HOU',
  'HPN',
  'HRL',
  'HSV',
  'HYA',
  'HYS',
  'IAD',
  'IAG',
  'IAH',
  'ICT',
  'IDA',
  'ILG',
  'ILM',
  'IMT',
  'IND',
  'INL',
  'ISN',
  'ISP',
  'ITH',
  'ITO',
  'JAC',
  'JAN',
  'JAX',
  'JFK',
  'JLN',
  'JMS',
  'JNU',
  'KOA',
  'KTN',
  'LAN',
  'LAR',
  'LAS',
  'LAW',
  'LAX',
  'LBB',
  'LBE',
  'LCH',
  'LEX',
  'LFT',
  'LGA',
  'LGB',
  'LIH',
  'LIT',
  'LNK',
  'LRD',
  'LSE',
  'LWS',
  'MAF',
  'MBS',
  'MCI',
  'MCO',
  'MDT',
  'MDW',
  'MEI',
  'MEM',
  'MFE',
  'MFR',
  'MGM',
  'MHK',
  'MHT',
  'MIA',
  'MKE',
  'MKG',
  'MLB',
  'MLI',
  'MLU',
  'MMH',
  'MOB',
  'MOT',
  'MQT',
  'MRY',
  'MSN',
  'MSO',
  'MSP',
  'MSY',
  'MTJ',
  'MVY',
  'MYR',
  'OAJ',
  'OAK',
  'OGG',
  'OKC',
  'OMA',
  'OME',
  'ONT',
  'ORD',
  'ORF',
  'ORH',
  'OTH',
  'OTZ',
  'PAH',
  'PBG',
  'PBI',
  'PDX',
  'PHF',
  'PHL',
  'PHX',
  'PIA',
  'PIB',
  'PIH',
  'PIT',
  'PLN',
  'PNS',
  'PPG',
  'PSC',
  'PSE',
  'PSG',
  'PSP',
  'PUB',
  'PVD',
  'PWM',
  'RAP',
  'RDD',
  'RDM',
  'RDU',
  'RHI',
  'RIC',
  'RKS',
  'RNO',
  'ROA',
  'ROC',
  'ROW',
  'RST',
  'RSW',
  'SAF',
  'SAN',
  'SAT',
  'SAV',
  'SBA',
  'SBN',
  'SBP',
  'SCC',
  'SCE',
  'SDF',
  'SEA',
  'SFO',
  'SGF',
  'SGU',
  'SHV',
  'SIT',
  'SJC',
  'SJT',
  'SJU',
  'SLC',
  'SMF',
  'SMX',
  'SNA',
  'SPI',
  'SPS',
  'SRQ',
  'STC',
  'STL',
  'STT',
  'STX',
  'SUN',
  'SUX',
  'SWF',
  'SYR',
  'TLH',
  'TOL',
  'TPA',
  'TRI',
  'TTN',
  'TUL',
  'TUS',
  'TVC',
  'TWF',
  'TXK',
  'TYR',
  'TYS',
  'UST',
  'VEL',
  'VLD',
  'VPS',
  'WRG',
  'WYS',
  'XNA',
  'YAK',
  'YUM'}]
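
The output shows why there are two components: one set contains only five-digit numeric airport IDs, and the other only three-letter airport codes, so no edge connects the two. As a minimal sketch of what a weakly connected component is, the snippet below groups nodes while ignoring edge direction, using only the standard library. The function and the toy edges are illustrative assumptions; the notebook itself relies on `nx.weakly_connected_components`.

```python
from collections import defaultdict, deque


def weakly_connected_components(edges):
    """Group nodes into components, ignoring edge direction (illustrative sketch)."""
    neighbors = defaultdict(set)
    for origin, destination in edges:
        neighbors[origin].add(destination)
        neighbors[destination].add(origin)  # treat every edge as undirected

    seen, components = set(), []
    for start in neighbors:
        if start in seen:
            continue
        # Breadth-first search collects everything reachable from `start`.
        component, queue = {start}, deque([start])
        seen.add(start)
        while queue:
            node = queue.popleft()
            for nxt in neighbors[node]:
                if nxt not in seen:
                    seen.add(nxt)
                    component.add(nxt)
                    queue.append(nxt)
        components.append(component)
    return components


# Toy edges: routes between airport codes and routes between numeric IDs never meet.
edges = [("SFO", "LAX"), ("LAX", "JFK"), ("10135", "10136"), ("10136", "10140")]
components = weakly_connected_components(edges)

# Select the component by a known member rather than by its list position.
code_component = next(c for c in components if "LAX" in c)
print(sorted(code_component))  # ['JFK', 'LAX', 'SFO']
```

Selecting the component by a known airport such as `'LAX'` may be more robust than indexing with `l[1]`, since the position of a component in the list depends on the order in which nodes were added to the graph.
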
In [101]:
g = g[g.apply(lambda r: r['ORIGIN_AIRPORT'] in l[1] and r['DESTINATION_AIRPORT'] in l[1] )]
g.materialize()
g
Out[101]:
DESTINATION_AIRPORT ORIGIN_AIRPORT total_flights
PSG JNU 332
HOU BNA 1225
KOA OGG 814
SLC SEA 3463
DEN PDX 3423
ORD IAD 1745
DSM DFW 857
LAS SNA 2412
LAS SEA 5009
ROC DTW 304
[4693 rows x 3 columns]
Note: Only the head of the SFrame is printed.
You can use print_rows(num_rows=m, num_columns=n) to print more rows and columns.

Let's use the Locations of Airports dataset to visualize the flights network on a map:

In [104]:
# download the dataset from Kaggle and unzip it
!kaggle datasets download flashgordon/locations-of-airports  -p ./datasets/flights

!unzip ./datasets/flights/locations-of-airports.zip  -d ./datasets/flights/
Downloading locations-of-airports.zip to ./datasets/flights
  0%|                                               | 0.00/23.6k [00:00<?, ?B/s]
100%|███████████████████████████████████████| 23.6k/23.6k [00:00<00:00, 322kB/s]
Archive:  ./datasets/flights/locations-of-airports.zip
  inflating: ./datasets/flights/Locations.csv  
In [112]:
sf = tc.SFrame.read_csv('./datasets/flights/Locations.csv')
sf.materialize()
sf
Finished parsing file /Users/michael/Dropbox (BGU)/massive data mining/ 2020/notebooks/datasets/flights/Locations.csv
Parsing completed. Parsed 100 lines in 0.025282 secs.
------------------------------------------------------
Inferred types from first 100 line(s) of file as 
column_type_hints=[str,float,float]
If parsing fails due to incorrect types, you can correct
the inferred type list above and pass it to read_csv in
the column_type_hints argument
------------------------------------------------------
Finished parsing file /Users/michael/Dropbox (BGU)/massive data mining/ 2020/notebooks/datasets/flights/Locations.csv
Parsing completed. Parsed 1435 lines in 0.006379 secs.
Out[112]:
Address Latitude Longitude
BTI 70.1340026855 -143.582000732
LUR 68.87509918 -166.1100006
PIZ 69.73290253 -163.0050049
ITO 19.721399307251 -155.048004150391
ORL 28.545499801636 -81.332901000977
BTT 66.91390228 -151.529007
Z84 64.301201 -149.119995
UTO 65.99279785 -153.7039948
FYU 66.5715026855469 -145.25
SVW 61.09740067 -155.5740051
[1435 rows x 3 columns]
Note: Only the head of the SFrame is printed.
You can use print_rows(num_rows=m, num_columns=n) to print more rows and columns.
In [114]:
airports_set = set(g['DESTINATION_AIRPORT']) | set(g['ORIGIN_AIRPORT'])
sf = sf[sf['Address'].apply(lambda a: a in airports_set)]
sf.materialize()
sf
Out[114]:
Address Latitude Longitude
ITO 19.721399307251 -155.048004150391
FSM 35.3366012573242 -94.3674011230469
GFK 47.949299 -97.176102
TTN 40.2766990661621 -74.8134994506836
BOS 42.36429977 -71.00520325
OAK 37.7212982177734 -122.221000671387
OMA 41.3031997680664 -95.8940963745117
OGG 20.8985996246338 -156.429992675781
ICT 37.6498985290527 -97.4330978393555
MCI 39.2976 -94.713898
[315 rows x 3 columns]
Note: Only the head of the SFrame is printed.
You can use print_rows(num_rows=m, num_columns=n) to print more rows and columns.
In [115]:
import cartopy.crs as ccrs
import matplotlib.pyplot as plt
%matplotlib inline

def draw_map(w_size=30, h_size=30):
    plt.figure(figsize=(w_size, h_size))
    ax = plt.axes(projection=ccrs.PlateCarree(central_longitude=0))
    ax.coastlines(resolution='10m', color='black', linewidth=1) # draw with better coastline resolution
    return ax

ax = draw_map(20,40)

for r in sf:
    lon = r['Longitude']
    lat = r['Latitude']
    plt.plot(lon, lat, color='black', marker='o', markersize=4,
             transform=ccrs.PlateCarree())

We can observe that all the airports are in the US. Let's use a US map:

In [117]:
import matplotlib.patches as mpatches
import cartopy.io.shapereader as shpreader
import cartopy.feature as cfeature
import operator

def draw_aiports(top_airports_num=20):
    fig = plt.figure(figsize=(40,40))
    ax = plt.axes(projection=ccrs.LambertConformal())
    ax.set_extent([-125, -66.5, 20, 50], crs= ccrs.Geodetic())
    ax.add_feature(cfeature.LAND)
    ax.add_feature(cfeature.COASTLINE)
    ax.add_feature(cfeature.RIVERS)
    ax.add_feature(cfeature.LAKES)


    top_airports = sorted(dict(h.degree()).items(), key=operator.itemgetter(1), reverse=True)[:top_airports_num]
    top_set = set([a[0] for a in top_airports])
    for r in sf:
        lon = r['Longitude']
        lat = r['Latitude']
        plt.plot(lon, lat,
             color='black', marker='o', markersize=6,transform=ccrs.PlateCarree(),
             )
        
        if r["Address"] not in top_set:
            continue
        ax.text(lon, lat+0.5, r["Address"], fontsize=16, color="blue", transform=ccrs.PlateCarree())
    return ax
h = ng.subgraph(l[1])
draw_aiports()
Out[117]:
<cartopy.mpl.geoaxes.GeoAxesSubplot at 0xa39dd9278>
In [118]:
g = g.join(sf, on={"ORIGIN_AIRPORT":"Address"})
g = g.rename({'Latitude':'OLatitude', 'Longitude': 'OLongitude' })
g = g.join(sf, on={"DESTINATION_AIRPORT":"Address"})
g = g.rename({'Latitude':'DLatitude', 'Longitude': 'DLongitude' })
g
Out[118]:
DESTINATION_AIRPORT ORIGIN_AIRPORT total_flights OLatitude OLongitude DLatitude
PSG JNU 332 58.3549995422363 -134.57600402832 56.80170059
HOU BNA 1225 36.1245002746582 -86.6781997680664 29.64539909
KOA OGG 814 20.8985996246338 -156.429992675781 19.7388000488281
SLC SEA 3463 47.4490013122559 -122.30899810791 40.7883987426758
DEN PDX 3423 45.58869934 -122.5979996 39.861698150635
ORD IAD 1745 38.94449997 -77.45580292 41.97859955
DSM DFW 857 32.896800994873 -97.0380020141602 41.5340003967285
LAS SNA 2412 33.67570114 -117.8679962 36.08010101
LAS SEA 5009 47.4490013122559 -122.30899810791 36.08010101
ROC DTW 304 42.2123985290527 -83.353401184082 43.1189002990723
DLongitude
-132.9450073
-95.27890015
-156.046005249023
-111.977996826172
-104.672996521
-87.90480042
-93.6631011962891
-115.1520004
-115.1520004
-77.6724014282227
[4603 rows x 7 columns]
Note: Only the head of the SFrame is printed.
You can use print_rows(num_rows=m, num_columns=n) to print more rows and columns.
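
Note that the route table shrank from 4,693 to 4,603 rows. `SFrame.join` performs an inner join by default, so any route whose origin or destination has no entry in the locations table is dropped. The sketch below shows, with made-up toy data, one way to list which routes and airports would be lost; the helper `split_by_coverage`, the code `ZZZ`, and the rounded coordinates are all hypothetical.

```python
# Toy route table (origin, destination, total_flights) and a toy locations table.
# "ZZZ" is a made-up airport code that has no coordinates.
routes = [("SFO", "LAX", 100), ("LAX", "ZZZ", 5), ("ZZZ", "SFO", 7)]
locations = {"SFO": (37.62, -122.38), "LAX": (33.94, -118.41)}  # approximate (lat, lon)


def split_by_coverage(routes, locations):
    """Separate routes with coordinates for both endpoints from the rest."""
    kept, dropped = [], []
    for origin, destination, total in routes:
        if origin in locations and destination in locations:
            kept.append((origin, destination, total))
        else:
            dropped.append((origin, destination, total))
    return kept, dropped


kept, dropped = split_by_coverage(routes, locations)

# Airports that caused routes to be dropped.
missing = {a for o, d, _ in dropped for a in (o, d) if a not in locations}

print(kept)     # [('SFO', 'LAX', 100)]
print(missing)  # {'ZZZ'}
```
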
In [119]:
ax = draw_aiports()

for r in g.sort('total_flights', ascending=False)[:400]:
    plt.plot([r['OLongitude'], r['DLongitude']], [r['OLatitude'], r['DLatitude']],
         color='gray', linewidth=1, marker='o',
         transform=ccrs.PlateCarree(),
         )
                 

2.2 Working with GeoPandas

In this section, we are going to use GeoPandas to work with geographic datasets. GeoPandas is an open-source project that makes working with geospatial data in Python easier: it extends the data types used by pandas to allow spatial operations on geometric types.

In [1]:
# These examples are inspired by http://geopandas.org/mapping.html
import geopandas
import matplotlib.pyplot as plt
%matplotlib inline

world = geopandas.read_file(geopandas.datasets.get_path('naturalearth_lowres'))
world.head(10)
Out[1]:
pop_est continent name iso_a3 gdp_md_est geometry
0 920938 Oceania Fiji FJI 8374.0 MULTIPOLYGON (((180.00000 -16.06713, 180.00000...
1 53950935 Africa Tanzania TZA 150600.0 POLYGON ((33.90371 -0.95000, 34.07262 -1.05982...
2 603253 Africa W. Sahara ESH 906.5 POLYGON ((-8.66559 27.65643, -8.66512 27.58948...
3 35623680 North America Canada CAN 1674000.0 MULTIPOLYGON (((-122.84000 49.00000, -122.9742...
4 326625791 North America United States of America USA 18560000.0 MULTIPOLYGON (((-122.84000 49.00000, -120.0000...
5 18556698 Asia Kazakhstan KAZ 460700.0 POLYGON ((87.35997 49.21498, 86.59878 48.54918...
6 29748859 Asia Uzbekistan UZB 202300.0 POLYGON ((55.96819 41.30864, 55.92892 44.99586...
7 6909701 Oceania Papua New Guinea PNG 28020.0 MULTIPOLYGON (((141.00021 -2.60015, 142.73525 ...
8 260580739 Asia Indonesia IDN 3028000.0 MULTIPOLYGON (((141.00021 -2.60015, 141.01706 ...
9 44293293 South America Argentina ARG 879400.0 MULTIPOLYGON (((-68.63401 -52.63637, -68.25000...
In [2]:
print(type(world.geometry[0]))
world.geometry[9]
<class 'shapely.geometry.multipolygon.MultiPolygon'>
Out[2]:
In [3]:
world.plot(figsize=(40,40))
Out[3]:
<matplotlib.axes._subplots.AxesSubplot at 0x11ad5ff28>
In [4]:
cities = geopandas.read_file(geopandas.datasets.get_path('naturalearth_cities'))
cities.head(10)
Out[4]:
name geometry
0 Vatican City POINT (12.45339 41.90328)
1 San Marino POINT (12.44177 43.93610)
2 Vaduz POINT (9.51667 47.13372)
3 Luxembourg POINT (6.13000 49.61166)
4 Palikir POINT (158.14997 6.91664)
5 Majuro POINT (171.38000 7.10300)
6 Funafuti POINT (179.21665 -8.51665)
7 Melekeok POINT (134.62655 7.48740)
8 Monaco POINT (7.40691 43.73965)
9 Tarawa POINT (173.01757 1.33819)

Let's put the cities on the world-map:

In [5]:
ax = world.plot(color='lightgreen', edgecolor='gray',figsize=(40,40))
cities.plot(ax=ax, marker='o', color='red', markersize=6);
# adding labels (the text is passed positionally, since the `s=` keyword was renamed in newer matplotlib versions)
for idx, row in cities.iterrows():
    pt = row['geometry']
    plt.annotate(row['name'], xy=(pt.x,pt.y),
                 horizontalalignment='center', fontsize=8,color="blue")

Let's color the map according to each country's population size (on a log scale):

In [6]:
import math
fig, ax = plt.subplots()

ax.set_aspect('equal')

world['pop_est_log'] = world['pop_est'].apply(lambda i: math.log(i) if i >0 else 0)
world.plot(ax=ax, column="pop_est_log",  cmap='OrRd', legend=True  )
Out[6]:
<matplotlib.axes._subplots.AxesSubplot at 0x11dff9278>
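
The cell above colors countries by the logarithm of `pop_est` rather than by the raw value. A short sketch of why this helps, using a few `pop_est` values copied from the `world` table shown earlier: the raw values span several orders of magnitude, while their logs fall in a narrow range that a color map can resolve. The helper name `safe_log` is an assumption for illustration.

```python
import math

# pop_est values copied from the first rows of the `world` table above
pop_est = {
    "Fiji": 920938,
    "Tanzania": 53950935,
    "W. Sahara": 603253,
    "Canada": 35623680,
    "United States of America": 326625791,
}


def safe_log(value):
    """Natural log for positive values, 0 otherwise (same rule as the lambda in the cell above)."""
    return math.log(value) if value > 0 else 0


logs = {name: safe_log(v) for name, v in pop_est.items()}

# How far apart are the largest and smallest values, before and after the transform?
raw_ratio = max(pop_est.values()) / min(pop_est.values())
log_ratio = max(logs.values()) / min(logs.values())
print(f"raw max/min ratio: {raw_ratio:.1f}")
print(f"log max/min ratio: {log_ratio:.2f}")
```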

Let's plot the US states using GeoPandas and a shapefile from Natural Earth:

In [9]:
#!mkdir ./datasets/ne_50m_admin_1_states_provinces/
#!wget -O ./datasets/ne_50m_admin_1_states_provinces/ne_50m_admin_1_states_provinces.zip https://www.naturalearthdata.com/http//www.naturalearthdata.com/download/50m/cultural/ne_50m_admin_1_states_provinces.zip
#!unzip ./datasets/ne_50m_admin_1_states_provinces/ne_50m_admin_1_states_provinces.zip  -d ./datasets/ne_50m_admin_1_states_provinces/

fig, ax = plt.subplots(figsize=(40,40))
shp_path = "./datasets/ne_50m_admin_1_states_provinces/ne_50m_admin_1_states_provinces.shp"

#reading states data from shape file
gdf = geopandas.read_file(shp_path)
gdf
Out[9]:
featurecla scalerank adm1_code diss_me iso_3166_2 wikipedia iso_a2 adm0_sr name name_alt ... name_nl name_pl name_pt name_ru name_sv name_tr name_vi name_zh ne_id geometry
0 Admin-1 scale rank 2 AUS-2651 2651 AU-WA None AU 6 Western Australia None ... West-Australië Australia Zachodnia Austrália Ocidental Западная Австралия Western Australia Batı Avustralya Tây Úc 西澳大利亚州 1159315805 MULTIPOLYGON (((113.13181 -25.95199, 113.14823...
1 Admin-1 scale rank 2 AUS-2650 2650 AU-NT None AU 6 Northern Territory None ... Noordelijk Territorium Terytorium Północne Território do Norte Северная территория Northern Territory Kuzey Toprakları Lãnh thổ Bắc Úc 北領地 1159315809 MULTIPOLYGON (((129.00196 -25.99901, 129.00196...
2 Admin-1 scale rank 2 AUS-2655 2655 AU-SA None AU 3 South Australia None ... Zuid-Australië Australia Południowa Austrália Meridional Южная Австралия South Australia Güney Avustralya Nam Úc 南澳大利亚州 1159313267 MULTIPOLYGON (((129.00196 -31.69266, 129.00196...
3 Admin-1 scale rank 2 AUS-2657 2657 AU-QLD None AU 5 Queensland None ... Queensland Queensland Queensland Квинсленд Queensland Queensland Queensland 昆士蘭州 1159315807 MULTIPOLYGON (((138.00196 -25.99901, 138.00174...
4 Admin-1 scale rank 2 AUS-2660 2660 AU-TAS None AU 5 Tasmania None ... Tasmanië Tasmania Tasmânia Тасмания Tasmanien Tasmanya Tasmania 塔斯馬尼亞州 1159313261 MULTIPOLYGON (((147.31246 -43.28038, 147.34238...
... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ...
95 Admin-1 scale rank 2 USA-3540 3540 US-VT http://en.wikipedia.org/wiki/Vermont US 1 Vermont VT ... Vermont Vermont Vermont Вермонт Vermont Vermont Vermont 佛蒙特州 1159315305 POLYGON ((-73.35218 45.00542, -73.18201 45.005...
96 Admin-1 scale rank 2 USA-3519 3519 US-WA http://en.wikipedia.org/wiki/Washington_(state) US 6 Washington WA|Wash. ... Washington Waszyngton Washington Вашингтон Washington Vaşington Washington 华盛顿州 1159309547 MULTIPOLYGON (((-122.78878 48.99303, -122.6863...
97 Admin-1 scale rank 2 USA-3553 3553 US-WI http://en.wikipedia.org/wiki/Wisconsin US 1 Wisconsin WI|Wis. ... Wisconsin Wisconsin Wisconsin Висконсин Wisconsin Wisconsin Wisconsin 威斯康辛州 1159315321 POLYGON ((-90.65058 42.51298, -90.65733 42.520...
98 Admin-1 scale rank 2 USA-3554 3554 US-WV http://en.wikipedia.org/wiki/West_Virginia US 1 West Virginia WV|W.Va. ... West Virginia Wirginia Zachodnia Virgínia Ocidental Западная Виргиния West Virginia Batı Virginia Tây Virginia 西維吉尼亞州 1159315323 POLYGON ((-81.96528 37.53973, -82.10346 37.570...
99 Admin-1 scale rank 2 USA-3527 3527 US-WY http://en.wikipedia.org/wiki/Wyoming US 1 Wyoming WY|Wyo. ... Wyoming Wyoming Wyoming Вайоминг Wyoming Wyoming Wyoming 怀俄明州 1159315351 POLYGON ((-104.02166 41.00086, -104.33572 41.0...

100 rows × 84 columns

In [10]:
fig, ax = plt.subplots(figsize=(40,40))
shp_path = "./datasets/ne_50m_admin_1_states_provinces/ne_50m_admin_1_states_provinces.shp"

#reading states data from shape file
gdf = geopandas.read_file(shp_path)
gdf = gdf[gdf['iso_a2'] == 'US'] # selecting only US states
gdf = gdf[gdf['name'].apply(lambda n: n not in {'Alaska', 'Hawaii'})] # dropping Alaska & Hawaii
# Let's add the state names; see also https://stackoverflow.com/questions/38899190/geopandas-label-polygons
gdf['repres_points'] = gdf['geometry'].apply(lambda x: x.representative_point())

for idx, row in gdf.iterrows():
    pt = row['repres_points']
    name = row['iso_3166_2'].replace("US-", "")
    plt.annotate(name, xy=(pt.x,pt.y),
                 horizontalalignment='center', fontsize=20,color="black")

gdf.plot(ax=ax, color='lightgreen', edgecolor='gray')
Out[10]:
<matplotlib.axes._subplots.AxesSubplot at 0x11ed84320>

Now, let's analyze Life on the Mississippi by Mark Twain and extract locations which appear in his work:

In [14]:
!wget -O ./datasets/mark_twain.txt http://www.gutenberg.org/files/245/245-0.txt
!python -m spacy download en_core_web_lg # remember to restart the runtime 
In [12]:
import spacy
import operator

nlp = spacy.load('en_core_web_lg')


def get_locations_from_text(text):
    locations_dict= {}

    #using spaCy to get entities
    doc = nlp(text)

    for entity in doc.ents:
        label = entity.label_
        if label not in {'LOC', 'GPE'}:
            continue
        loc = entity.text.lower().strip()
        if len(loc) < 2:
            continue
        if loc not in locations_dict:
            locations_dict[loc] = 0
        locations_dict[loc] += 1
    return locations_dict

twain_full_work_path = "./datasets/mark_twain.txt"
txt = open(twain_full_work_path).read()
locations_dict = get_locations_from_text(txt)
locations_dict = {k:v for k,v in  locations_dict.items() if v>3}
print(sorted(locations_dict.items(), key=operator.itemgetter(1), reverse=True)[:20])
print(f"Number of locations {len(locations_dict.keys())}")
[('mississippi', 119), ('new orleans', 105), ('st. louis', 80), ('cairo', 31), ('vicksburg', 28), ('memphis', 26), ('missouri', 25), ('the\nriver', 23), ('arkansas', 22), ('south', 22), ('natchez', 20), ('earth', 20), ('the united states', 20), ('st. paul', 18), ('cincinnati', 17), ('ohio', 15), ('illinois', 15), ('new\norleans', 14), ('texas', 13), ('louisiana', 12)]
Number of locations 70
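
Note that the output above contains both 'new orleans' and 'new\norleans', because line breaks inside an entity are kept in the dictionary key. One possible refinement, sketched here under the assumption that such variants should be counted together, is to collapse all whitespace before counting. The helper `normalize_location` is hypothetical and is not part of the original cell; the sample counts are copied from the printed output.

```python
from collections import Counter

# A few (location, count) pairs copied from the output above
raw_counts = [
    ("new orleans", 105),
    ("new\norleans", 14),
    ("the\nriver", 23),
    ("st. louis", 80),
]


def normalize_location(name):
    """Lower-case the name and collapse any run of whitespace (including newlines) into one space."""
    return " ".join(name.lower().split())


merged = Counter()
for name, count in raw_counts:
    merged[normalize_location(name)] += count

print(merged.most_common())
```

Inside `get_locations_from_text`, the same normalization could replace `entity.text.lower().strip()`.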

Using the Geopy package, let's convert the locations mentioned in the book to coordinates and draw them on the map:

In [16]:
!pip install geopy
Collecting geopy
  Downloading https://files.pythonhosted.org/packages/53/fc/3d1b47e8e82ea12c25203929efb1b964918a77067a874b2c7631e2ec35ec/geopy-1.21.0-py2.py3-none-any.whl (104kB)
     |████████████████████████████████| 112kB 508kB/s eta 0:00:01
Collecting geographiclib<2,>=1.49 (from geopy)
  Downloading https://files.pythonhosted.org/packages/8b/62/26ec95a98ba64299163199e95ad1b0e34ad3f4e176e221c40245f211e425/geographiclib-1.50-py3-none-any.whl
Installing collected packages: geographiclib, geopy
Successfully installed geographiclib-1.50 geopy-1.21.0
In [17]:
from geopy.geocoders import Nominatim
geolocator = Nominatim(user_agent="Data Science Education App") #  Using OpenStreetMap Nominatim
location = geolocator.geocode("the missouri river")
print(location.address)
print((location.latitude, location.longitude))
print(location.raw)
Missouri River, Sully County, South Dakota, 64072, United States of America
(44.6042103, -100.6355825)
{'place_id': 235665012, 'licence': 'Data © OpenStreetMap contributors, ODbL 1.0. https://osm.org/copyright', 'osm_type': 'relation', 'osm_id': 1756890, 'boundingbox': ['38.5348721', '48.1497052', '-112.0153374', '-90.117707'], 'lat': '44.6042103', 'lon': '-100.6355825', 'display_name': 'Missouri River, Sully County, South Dakota, 64072, United States of America', 'class': 'waterway', 'type': 'river', 'importance': 0.7126739728744267}
In [20]:
from functools import lru_cache
from scipy.interpolate import interp1d # for mapping mention counts to marker sizes
import time

@lru_cache(maxsize=256)
def get_location(loc):
    time.sleep(1)
    return geolocator.geocode(loc)
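
The `get_location` wrapper above combines two ideas: a one-second pause before each request, and `lru_cache`, so that a repeated query is answered from memory without contacting the service or sleeping again. The sketch below shows the caching part in isolation. `fake_geocode` is a hypothetical stand-in for `geolocator.geocode` that returns made-up coordinates, so the example runs without network access.

```python
from functools import lru_cache

calls = []  # records every time the (fake) service is actually contacted


def fake_geocode(query):
    """Hypothetical stand-in for geolocator.geocode; returns made-up coordinates."""
    calls.append(query)
    return (0.0, 0.0)


@lru_cache(maxsize=256)
def cached_geocode(query):
    # Only runs on a cache miss; repeated queries never reach this line
    return fake_geocode(query)


for q in ["memphis", "cairo", "memphis", "memphis"]:
    cached_geocode(q)

print(f"service calls: {len(calls)}")
print(cached_geocode.cache_info())
```

The cache lives in memory only, so it is lost when the kernel restarts. It also stores `None` results, which means a failed lookup will not be retried within the same session.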
 
In [21]:
fig, ax = plt.subplots(figsize=(40,40))
gdf.plot(ax=ax, color='lightgreen', edgecolor='gray')

m = interp1d([4,max(locations_dict.values())],[8,40])

for loc, v in locations_dict.items():
    location = get_location(loc)
    if location is None:
        continue
    if not (-120 < location.longitude < -65) or not (57>location.latitude  > 25):
        print(f"Skipping plotting {location}: {(location.latitude, location.longitude)} ")
        continue

    plt.plot(location.longitude,location.latitude, marker='o', color='red', markersize=m(v))
    plt.annotate(loc, xy=(location.longitude,location.latitude),
                 horizontalalignment='center', fontsize=20,color="black")
Skipping plotting Россия: (64.6863136, 97.7453061) 
Skipping plotting Europa: (51.0, 10.0) 
Skipping plotting Deutschland: (51.0834196, 10.4234469) 
Skipping plotting France: (46.603354, 1.8883335) 
Skipping plotting Italia: (42.6384261, 12.674297) 
Skipping plotting England, United Kingdom: (52.7954791, -0.5402402866174321) 
Skipping plotting America, Horst aan de Maas, Limburg, Nederland: (51.44770365, 5.966069282055592) 
Skipping plotting Canada: (61.0666922, -107.9917071) 
Skipping plotting Рівненська область, Україна: (51.2074112, 26.5208033) 
Skipping plotting 中国: (35.000074, 104.999927) 
Skipping plotting القاهرة, محافظة القاهرة, Egypt / مصر: (30.048819, 31.243666) 
Skipping plotting Hat Island, Kitikmeot Region, Nunavut, Canada: (68.33154189999999, -100.09088423305138) 
Skipping plotting water, Bern, Verwaltungskreis Bern-Mittelland, Verwaltungsregion Bern-Mittelland, Bern/Berne, 3005, Switzerland: (46.9341389, 7.4472821) 
Skipping plotting London, Greater London, England, SW1A 2DX, United Kingdom: (51.5073219, -0.1276474) 
Skipping plotting Western, West Kenya, Kenya: (0.5090396, 34.5731341) 
Skipping plotting 대한민국: (35.7724185, 127.79654346305617) 
Skipping plotting Murel, Saint-Chamant, Tulle, Corrèze, Nouvelle-Aquitaine, France métropolitaine, 19380, France: (45.1393385, 1.8670033) 
Skipping plotting Alps, Bellagio, Como, Lombardia, Italia: (45.953168500000004, 9.237907459757649) 
Skipping plotting Norge, Namsos, Trøndelag, Norge: (64.5731537, 11.52803643954819) 
Skipping plotting Manchester, Greater Manchester, North West England, England, United Kingdom: (53.4794892, -2.2451148) 
Skipping plotting Troya'nın Arkeolojik Alanı, 17-56, Tevfikiye, Çanakkale merkez, Çanakkale, Marmara Bölgesi, Türkiye: (39.957373950000004, 26.238017461011644) 
Skipping plotting Ihamo, Rauma, Rauman seutukunta, Satakunta, Manner-Suomi, Suomi: (61.187668, 21.419709103913284) 

Let's add the course of the river to the map:

In [23]:
[k for k in locations_dict.keys() if "river" in k]
Out[23]:
['mississippi river',
 'the\nriver',
 'the mississippi river',
 'black river',
 'little river']
In [27]:
!wget -O ./datasets/ne_10m_rivers_lake_centerlines.zip https://www.naturalearthdata.com/http//www.naturalearthdata.com/download/10m/physical/ne_10m_rivers_lake_centerlines.zip
!unzip     ./datasets/ne_10m_rivers_lake_centerlines.zip -d ./datasets/ne_10m_rivers_lake_centerlines
--2020-05-08 15:12:35--  https://www.naturalearthdata.com/http//www.naturalearthdata.com/download/10m/physical/ne_10m_rivers_lake_centerlines.zip
Resolving www.naturalearthdata.com (www.naturalearthdata.com)... 66.147.242.194
Connecting to www.naturalearthdata.com (www.naturalearthdata.com)|66.147.242.194|:443... connected.
HTTP request sent, awaiting response... 302 Moved Temporarily
Location: http://naciscdn.org/naturalearth/10m/physical/ne_10m_rivers_lake_centerlines.zip [following]
--2020-05-08 15:12:37--  http://naciscdn.org/naturalearth/10m/physical/ne_10m_rivers_lake_centerlines.zip
Resolving naciscdn.org (naciscdn.org)... 146.201.97.163
Connecting to naciscdn.org (naciscdn.org)|146.201.97.163|:80... connected.
HTTP request sent, awaiting response... 301 Moved Permanently
Location: https://naciscdn.org/naturalearth/10m/physical/ne_10m_rivers_lake_centerlines.zip [following]
--2020-05-08 15:12:38--  https://naciscdn.org/naturalearth/10m/physical/ne_10m_rivers_lake_centerlines.zip
Connecting to naciscdn.org (naciscdn.org)|146.201.97.163|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 1817597 (1.7M) [application/zip]
Saving to: ‘./datasets/ne_10m_rivers_lake_centerlines.zip’

./datasets/ne_10m_r 100%[===================>]   1.73M   713KB/s    in 2.5s    

2020-05-08 15:12:41 (713 KB/s) - ‘./datasets/ne_10m_rivers_lake_centerlines.zip’ saved [1817597/1817597]

Archive:  ./datasets/ne_10m_rivers_lake_centerlines.zip
  inflating: ./datasets/ne_10m_rivers_lake_centerlines/ne_10m_rivers_lake_centerlines.README.html  
 extracting: ./datasets/ne_10m_rivers_lake_centerlines/ne_10m_rivers_lake_centerlines.VERSION.txt  
 extracting: ./datasets/ne_10m_rivers_lake_centerlines/ne_10m_rivers_lake_centerlines.cpg  
  inflating: ./datasets/ne_10m_rivers_lake_centerlines/ne_10m_rivers_lake_centerlines.dbf  
  inflating: ./datasets/ne_10m_rivers_lake_centerlines/ne_10m_rivers_lake_centerlines.prj  
  inflating: ./datasets/ne_10m_rivers_lake_centerlines/ne_10m_rivers_lake_centerlines.shp  
  inflating: ./datasets/ne_10m_rivers_lake_centerlines/ne_10m_rivers_lake_centerlines.shx  
In [33]:
fig, ax = plt.subplots(figsize=(40,40))
ax = gdf.plot(ax=ax, color='lightgreen', edgecolor='gray')

m = interp1d([4,max(locations_dict.values())],[8,40])

for loc, v in locations_dict.items():
    location = get_location(loc)
    if location is None:
        continue
    if not (-100 < location.longitude < -65) or not (57>location.latitude  > 25):
        continue
    
    
    plt.plot(location.longitude,location.latitude, marker='o', color='red', markersize=m(v))
    plt.annotate(loc, xy=(location.longitude,location.latitude),
                 horizontalalignment='center', fontsize=20,color="black")

#adding the Mississippi river
# data from Natural Earth https://www.naturalearthdata.com/downloads/10m-physical-vectors/10m-rivers-lake-centerlines/
river_shp_path = "./datasets/ne_10m_rivers_lake_centerlines/ne_10m_rivers_lake_centerlines.shp"
# reading the rivers data from the shape file
r_gdf = geopandas.read_file(river_shp_path)

r_gdf = r_gdf[r_gdf['name'] == 'Mississippi']
r_gdf.plot(ax=ax)
Out[33]:
<matplotlib.axes._subplots.AxesSubplot at 0x150129a58>
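
In the plotting cells above, `interp1d([4, max(locations_dict.values())], [8, 40])` builds a linear map from mention counts (at least 4 after filtering) to marker sizes between 8 and 40. The sketch below reproduces that mapping in plain Python. The function `linear_map` is hypothetical, and unlike `interp1d` with its default settings, which raises an error for out-of-range inputs, this version clamps them to the nearest end.

```python
def linear_map(value, src_min, src_max, dst_min, dst_max):
    """Map value linearly from [src_min, src_max] to [dst_min, dst_max], clamping at the ends."""
    if src_max == src_min:
        return dst_min
    value = min(max(value, src_min), src_max)
    fraction = (value - src_min) / (src_max - src_min)
    return dst_min + fraction * (dst_max - dst_min)


max_count = 119  # highest count in the Twain output above ('mississippi')
for count in (4, 31, 119):
    size = linear_map(count, 4, max_count, 8, 40)
    print(f"{count} mentions -> marker size {size:.1f}")
```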
In [34]:
#Focusing only on states which are relevant to the entities
from shapely.geometry import Point
location_points = []
def is_poly_contains_point(poly, l):
    for pt in l:
        if pt.within(poly):
            return True
    return False

for loc, v in locations_dict.items():
    location = get_location(loc)
    if location is None:
        continue
    if not (-100 < location.longitude < -65) or not (57>location.latitude  > 25):
        continue
    location_points.append(Point(location.longitude,location.latitude))
gdf['is_relevant'] = gdf['geometry'].apply(lambda p: is_poly_contains_point(p,location_points))
gdf2 = gdf[gdf['is_relevant'] == True]
gdf2.plot()
Out[34]:
<matplotlib.axes._subplots.AxesSubplot at 0x14fefed68>
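
The cell above keeps a state only if at least one geocoded point lies within its polygon, relying on shapely's `within`. To illustrate what such a test does, here is a minimal ray-casting sketch for a simple polygon without holes. It is not shapely's implementation, and the rectangular `box` and the two sample points are made-up values in (longitude, latitude) order.

```python
def point_in_polygon(x, y, polygon):
    """Ray-casting test: True if (x, y) lies inside the polygon given as a list of (x, y) vertices."""
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Does this edge straddle the horizontal line through y?
        if (y1 > y) != (y2 > y):
            # x coordinate where the edge crosses that horizontal line
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside  # each crossing to the right flips the state
    return inside


# Hypothetical rectangular "state" and two sample points, in (longitude, latitude) order
box = [(-95.0, 30.0), (-89.0, 30.0), (-89.0, 36.0), (-95.0, 36.0)]
points = [(-90.05, 35.15), (-74.0, 40.7)]

# Same idea as is_poly_contains_point: is any point inside the polygon?
relevant = any(point_in_polygon(px, py, box) for px, py in points)
print(relevant)
```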
In [35]:
fig, ax = plt.subplots(figsize=(40,40))
ax = gdf2.plot(ax=ax, color='lightgreen', edgecolor='gray')
m = interp1d([4,max(locations_dict.values())],[8,40])

for loc, v in locations_dict.items():
    location = get_location(loc)
    if location is None:
        continue
    if not (-97 < location.longitude < -70) or not (57>location.latitude  > 25):
        continue
    
    
    plt.plot(location.longitude,location.latitude, marker='o', color='red', markersize=m(v))
    plt.annotate(loc, xy=(location.longitude,location.latitude),
                 horizontalalignment='center', fontsize=20,color="black")

#adding the mississippi river
# data from Natural Earth https://www.naturalearthdata.com/downloads/10m-physical-vectors/10m-rivers-lake-centerlines/
river_shp_path = "./datasets/ne_10m_rivers_lake_centerlines/ne_10m_rivers_lake_centerlines.shp"
# reading the rivers data from the shape file
r_gdf = geopandas.read_file(river_shp_path)

r_gdf = r_gdf[r_gdf['name'] == 'Mississippi']
r_gdf.plot(ax=ax)
Out[35]:
<matplotlib.axes._subplots.AxesSubplot at 0x15029ac50>
In [36]:
gdf2
Out[36]:
featurecla scalerank adm1_code diss_me iso_3166_2 wikipedia iso_a2 adm0_sr name name_alt ... name_pt name_ru name_sv name_tr name_vi name_zh ne_id geometry repres_points is_relevant
51 Admin-1 scale rank 2 USA-3528 3528 US-AR http://en.wikipedia.org/wiki/Arkansas US 1 Arkansas AR|Ark. ... Arkansas Арканзас Arkansas Arkansas Arkansas 阿肯色州 1159315355 POLYGON ((-89.70477 36.00157, -89.70932 35.983... POINT (-92.47082 34.74907) True
58 Admin-1 scale rank 2 USA-3542 3542 US-FL http://en.wikipedia.org/wiki/Florida US 5 Florida FL|Fla. ... Flórida Флорида Florida Florida Florida 佛罗里达州 1159315207 MULTIPOLYGON (((-87.48951 30.37768, -87.48011 ... POINT (-81.69118 28.02564) True
61 Admin-1 scale rank 2 USA-3529 3529 US-IA http://en.wikipedia.org/wiki/Iowa US 1 Iowa IA|Iowa ... Iowa Айова Iowa Iowa Iowa 艾奥瓦州 1159315357 POLYGON ((-91.44195 40.37945, -91.52993 40.432... POINT (-93.15799 41.94286) True
63 Admin-1 scale rank 2 USA-3546 3546 US-IL http://en.wikipedia.org/wiki/Illinois US 1 Illinois IL|Ill. ... Illinois Иллинойс Illinois Illinois Illinois 伊利诺伊州 1159315309 POLYGON ((-91.44195 40.37945, -91.39078 40.397... POINT (-89.46648 39.78811) True
66 Admin-1 scale rank 2 USA-3548 3548 US-KY http://en.wikipedia.org/wiki/Kentucky US 1 Kentucky Commonwealth of Kentucky|KY ... Kentucky Кентукки Kentucky Kentucky Kentucky 肯塔基州 1159315313 MULTIPOLYGON (((-89.15431 36.99211, -89.16507 ... POINT (-84.75146 37.81147) True
67 Admin-1 scale rank 2 USA-3535 3535 US-LA http://en.wikipedia.org/wiki/Louisiana US 5 Louisiana LA ... Luisiana Луизиана Louisiana Louisiana Louisiana 路易斯安那州 1159315221 MULTIPOLYGON (((-94.04131 33.01200, -93.86163 ... POINT (-91.64884 30.97444) True
68 Admin-1 scale rank 2 USA-3513 3513 US-MA http://en.wikipedia.org/wiki/Massachusetts US 6 Massachusetts Commonwealth of Massachusetts|MA|Mass. ... Massachusetts Массачусетс Massachusetts Massachusetts Massachusetts 麻薩諸塞州 1159312157 MULTIPOLYGON (((-71.80084 42.01196, -71.80164 ... POINT (-72.09041 42.19646) True
71 Admin-1 scale rank 2 USA-3562 3562 US-MI http://en.wikipedia.org/wiki/Michigan US 1 Michigan MI|Mich. ... Michigan Мичиган Michigan Michigan Michigan 密歇根州 1159314665 POLYGON ((-89.49838 47.99790, -89.45565 47.996... POINT (-84.52730 44.97850) True
72 Admin-1 scale rank 2 USA-3514 3514 US-MN http://en.wikipedia.org/wiki/Minnesota US 1 Minnesota MN|Minn. ... Minnesota Миннесота Minnesota Minnesota Minnesota 明尼蘇達州 1159315297 POLYGON ((-97.22574 48.99318, -97.10345 48.993... POINT (-94.49195 46.42266) True
73 Admin-1 scale rank 2 USA-3531 3531 US-MO http://en.wikipedia.org/wiki/Missouri US 1 Missouri MO ... Missouri Миссури Missouri Missouri Missouri 密蘇里州 1159315361 POLYGON ((-89.70477 36.00157, -89.88818 35.999... POINT (-92.49784 38.30694) True
74 Admin-1 scale rank 2 USA-3544 3544 US-MS http://en.wikipedia.org/wiki/Mississippi US 5 Mississippi MS|Miss. ... Mississippi Миссисипи Mississippi Mississippi Mississippi 密西西比州 1159315231 MULTIPOLYGON (((-88.17327 34.99901, -88.08477 ... POINT (-89.76229 32.60562) True
78 Admin-1 scale rank 2 USA-3532 3532 US-NE http://en.wikipedia.org/wiki/Nebraska US 1 Nebraska NE|Nebr. ... Nebraska Небраска Nebraska Nebraska Nebraska 內布拉斯加州 1159315363 POLYGON ((-102.02449 40.00112, -102.02454 40.1... POINT (-100.00838 41.51073) True
80 Admin-1 scale rank 2 USA-3558 3558 US-NJ http://en.wikipedia.org/wiki/New_Jersey US 5 New Jersey NJ|N.J. ... Nova Jérsia Нью-Джерси New Jersey New Jersey New Jersey 新泽西州 1159315267 MULTIPOLYGON (((-75.07417 39.98348, -75.02353 ... POINT (-74.41977 40.11702) True
83 Admin-1 scale rank 2 USA-3559 3559 US-NY http://en.wikipedia.org/wiki/New_York US 3 New York NY|N.Y. ... Nova Iorque Нью-Йорк New York New York New York 纽约州 1159312155 MULTIPOLYGON (((-79.76202 42.53898, -79.44623 ... POINT (-76.09640 42.88696) True
84 Admin-1 scale rank 2 USA-3550 3550 US-OH http://en.wikipedia.org/wiki/Ohio US 1 Ohio OH|Ohio ... Ohio Огайо Ohio Ohio Ohio 俄亥俄州 1159315315 POLYGON ((-83.12167 41.95000, -83.02996 41.832... POINT (-82.70617 40.38192) True
85 Admin-1 scale rank 2 USA-3533 3533 US-OK http://en.wikipedia.org/wiki/Oklahoma US 1 Oklahoma OK|Okla. ... Oklahoma Оклахома Oklahoma Oklahoma Oklahoma 奧克拉荷馬州 1159315365 POLYGON ((-94.61838 36.50087, -94.59598 36.361... POINT (-97.22039 35.34591) True
87 Admin-1 scale rank 2 USA-3560 3560 US-PA http://en.wikipedia.org/wiki/Pennsylvania US 1 Pennsylvania Commonwealth of Pennsylvania|PA ... Pensilvânia Пенсильвания Pennsylvania Pensilvanya Pennsylvania 宾夕法尼亚州 1159315331 POLYGON ((-80.52076 42.32439, -80.24758 42.366... POINT (-77.74166 41.11123) True
89 Admin-1 scale rank 2 USA-3545 3545 US-SC http://en.wikipedia.org/wiki/South_Carolina US 1 South Carolina SC|S.C. ... Carolina do Sul Южная Каролина South Carolina Güney Karolina Nam Carolina 南卡罗来纳州 1159315307 POLYGON ((-80.87235 32.02957, -81.07481 32.109... POINT (-80.54117 33.59079) True
91 Admin-1 scale rank 2 USA-3551 3551 US-TN http://en.wikipedia.org/wiki/Tennessee US 1 Tennessee TN|Tenn. ... Tennessee Теннесси Tennessee Tennessee Tennessee 田纳西州 1159315319 POLYGON ((-85.62360 35.00086, -85.78404 35.002... POINT (-86.35632 35.83989) True
92 Admin-1 scale rank 2 USA-3536 3536 US-TX http://en.wikipedia.org/wiki/Texas US 4 Texas TX|Tex. ... Texas Техас Texas Teksas Texas 得克萨斯州 1159315211 MULTIPOLYGON (((-94.48415 33.64843, -94.43201 ... POINT (-99.66867 31.19509) True
94 Admin-1 scale rank 2 USA-3552 3552 US-VA http://en.wikipedia.org/wiki/Virginia US 6 Virginia VA ... Virgínia Виргиния Virginia Virjinya Virginia 弗吉尼亚州 1159315259 MULTIPOLYGON (((-77.12205 38.94354, -77.10183 ... POINT (-78.45841 37.98719) True

21 rows × 86 columns

2.3 Interactive Maps with Folium

In this section, we are going to create interactive maps using the Folium package.

Note: A very helpful Folium tutorial can be found at the following link

In [38]:
import folium
import matplotlib.pyplot as plt

m = folium.Map(location=[40.712776, -74.005974]) #  Latitude and Longitude (Northing, Easting)
m
Out[38]:
Make this Notebook Trusted to load map: File -> Trust Notebook

Let's change the tiles and put in a marker:

In [39]:
tiles = 'Stamen Terrain'
m = folium.Map(location=[40.712776, -74.005974], 
               zoom_start=9,
               tiles = tiles)

folium.Marker(
    location=[40.712776, -74.005974], # coordinates for the marker (New York City)
    popup='Here is New York', # pop-up label for the marker
    icon=folium.Icon()
).add_to(m)
m  
Out[39]:
Make this Notebook Trusted to load map: File -> Trust Notebook

2.3.1 Visualizing Significant Earthquakes

Let's use Folium to visualize earthquakes, using the Significant Earthquakes dataset:

In [40]:
#!mkdir ./datasets
!mkdir ./datasets/earthquakes

# download the dataset from Kaggle and unzip it
!kaggle datasets download usgs/earthquake-database  -p ./datasets/earthquakes

!unzip ./datasets/earthquakes/*.zip  -d ./datasets/earthquakes/
Downloading earthquake-database.zip to ./datasets/earthquakes
100%|████████████████████████████████████████| 590k/590k [00:00<00:00, 1.34MB/s]
100%|████████████████████████████████████████| 590k/590k [00:00<00:00, 1.34MB/s]
Archive:  ./datasets/earthquakes/earthquake-database.zip
  inflating: ./datasets/earthquakes/database.csv  
In [41]:
import turicreate as tc
sf = tc.SFrame.read_csv("./datasets/earthquakes/database.csv")
sf
Finished parsing file /Users/michael/Dropbox (BGU)/massive data mining/ 2020/notebooks/datasets/earthquakes/database.csv
Parsing completed. Parsed 100 lines in 0.075065 secs.
------------------------------------------------------
Inferred types from first 100 line(s) of file as 
column_type_hints=[str,str,float,float,str,float,str,str,float,str,str,str,str,str,str,str,str,str,str,str,str]
If parsing fails due to incorrect types, you can correct
the inferred type list above and pass it to read_csv in
the column_type_hints argument
------------------------------------------------------
Finished parsing file /Users/michael/Dropbox (BGU)/massive data mining/ 2020/notebooks/datasets/earthquakes/database.csv
Parsing completed. Parsed 23412 lines in 0.059568 secs.
Out[41]:
Date Time Latitude Longitude Type Depth Depth Error Depth Seismic Stations Magnitude
01/02/1965 13:44:18 19.246 145.616 Earthquake 131.6 6.0
01/04/1965 11:29:49 1.863 127.352 Earthquake 80.0 5.8
01/05/1965 18:05:58 -20.579 -173.972 Earthquake 20.0 6.2
01/08/1965 18:49:43 -59.076 -23.557 Earthquake 15.0 5.8
01/09/1965 13:32:50 11.938 126.427 Earthquake 15.0 5.8
01/10/1965 13:36:32 -13.405 166.629 Earthquake 35.0 6.7
01/12/1965 13:32:25 27.357 87.867 Earthquake 20.0 5.9
01/15/1965 23:17:42 -13.309 166.212 Earthquake 35.0 6.0
01/16/1965 11:32:37 -56.452 -27.043 Earthquake 95.0 6.0
01/17/1965 10:43:17 -24.563 178.487 Earthquake 565.0 5.8
Magnitude Type Magnitude Error Magnitude Seismic Stations ... Azimuthal Gap Horizontal Distance Horizontal Error Root Mean Square
MW
MW
MW
MW
MW
MW
MW
MW
MW
MW
ID Source Location Source Magnitude Source Status
ISCGEM860706 ISCGEM ISCGEM ISCGEM Automatic
ISCGEM860737 ISCGEM ISCGEM ISCGEM Automatic
ISCGEM860762 ISCGEM ISCGEM ISCGEM Automatic
ISCGEM860856 ISCGEM ISCGEM ISCGEM Automatic
ISCGEM860890 ISCGEM ISCGEM ISCGEM Automatic
ISCGEM860922 ISCGEM ISCGEM ISCGEM Automatic
ISCGEM861007 ISCGEM ISCGEM ISCGEM Automatic
ISCGEM861111 ISCGEM ISCGEM ISCGEM Automatic
ISCGEMSUP861125 ISCGEMSUP ISCGEM ISCGEM Automatic
ISCGEM861148 ISCGEM ISCGEM ISCGEM Automatic
[23412 rows x 21 columns]
Note: Only the head of the SFrame is printed.
You can use print_rows(num_rows=m, num_columns=n) to print more rows and columns.

Let's plot the distribution of earthquake magnitudes:

In [42]:
import seaborn as sns
%matplotlib inline 
sns.set()

sns.distplot(sf['Magnitude'])
Out[42]:
<matplotlib.axes._subplots.AxesSubplot at 0x147412d30>
In [43]:
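import dateutil.parser  # import the submodule explicitly; a bare `import dateutil` does not guarantee that dateutil.parser is loaded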
import turicreate.aggregate as agg

sf['Year'] = sf['Date'].apply(lambda dt: dateutil.parser.parse(dt).year)
g = sf.groupby('Year', {'Earthquakes Number':agg.COUNT()})
df = g.to_dataframe()

fig, ax = plt.subplots(figsize=(6,15))
sns.barplot(y='Year', x='Earthquakes Number',orient="h", data=df, ax=ax)
Out[43]:
<matplotlib.axes._subplots.AxesSubplot at 0x158295470>
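
The cell above extracts the year from each `Date` string and then counts rows per year. The same two steps can be sketched with the standard library alone. The sample strings below follow the MM/DD/YYYY format seen in the table, and the helper `year_of` is hypothetical. Note that `datetime.strptime` accepts only the one format it is given, whereas `dateutil.parser.parse` in the original cell is more tolerant.

```python
from collections import Counter
from datetime import datetime

# Sample date strings in the dataset's MM/DD/YYYY format
dates = ["01/02/1965", "01/04/1965", "12/26/2004", "03/11/2011", "02/27/2010"]


def year_of(date_str):
    """Parse an MM/DD/YYYY string and return its year as an int."""
    return datetime.strptime(date_str, "%m/%d/%Y").year


# Equivalent of groupby('Year') with a COUNT aggregate
per_year = Counter(year_of(d) for d in dates)
for year, n in sorted(per_year.items()):
    print(year, n)
```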
In [45]:
sf = sf['Date','Time','Latitude', 'Longitude', 'Magnitude']
b_sf = sf[sf['Magnitude'] > 8]
b_sf.sort('Magnitude', ascending=False)
Out[45]:
Date Time Latitude Longitude Magnitude
12/26/2004 00:58:53 3.295 95.982 9.1
03/11/2011 05:46:24 38.297 142.373 9.1
02/27/2010 06:34:12 -36.122 -72.898 8.8
02/04/1965 05:01:22 51.251 178.715 8.7
04/11/2012 08:38:37 2.327 93.063 8.6
03/28/2005 16:09:37 2.085 97.108 8.6
09/12/2007 11:10:27 -4.438 101.367 8.4
06/23/2001 20:33:14 -16.265 -73.641 8.4
09/16/2015 22:54:33 -31.5729 -71.6744 8.3
05/24/2013 05:44:49 54.892 153.221 8.3
[27 rows x 5 columns]
Note: Only the head of the SFrame is printed.
You can use print_rows(num_rows=m, num_columns=n) to print more rows and columns.
In [46]:
tiles = 'Stamen Terrain'
m = folium.Map(location=[0,0], zoom_start=2,
               tiles = tiles)

for r in b_sf:
    tooltip = f"{r['Date']} {r['Time']} - Magnitude: {r['Magnitude']} - Location ({(r['Latitude'],r['Longitude'])})" 
    folium.Marker(
        location=[r['Latitude'],r['Longitude']], # coordinates of the earthquake
        popup= tooltip,
        icon=folium.Icon(color='red', icon='info-sign')
    ).add_to(m)
m  
Out[46]:
Make this Notebook Trusted to load map: File -> Trust Notebook

Let's create heatmaps of the areas with many earthquakes, first from the strongest earthquakes only and then from the full dataset, weighting each point by its magnitude:

In [47]:
import folium
from folium.plugins import HeatMap

tiles = 'Stamen Terrain'
m = folium.Map(location=[0,0], zoom_start=2,
               tiles = tiles)

data = [(r['Latitude'],r['Longitude']) for r in b_sf]
HeatMap(data, radius = 20).add_to(m)
m
Out[47]:
Make this Notebook Trusted to load map: File -> Trust Notebook
In [48]:
import folium
from folium.plugins import HeatMap

tiles = 'Stamen Terrain'
m = folium.Map(location=[0,0], zoom_start=2,
               tiles = tiles)

data = [(r['Latitude'],r['Longitude'],r['Magnitude']) for r in sf]
HeatMap(data, radius = 20).add_to(m)
m
Out[48]:
Make this Notebook Trusted to load map: File -> Trust Notebook
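
In the last heatmap, each point is a (latitude, longitude, magnitude) tuple, and HeatMap treats the third value as the point's weight. The magnitudes in the table of strong earthquakes above differ by less than one unit, so it can help to rescale them before plotting. The following is a minimal sketch of such a rescaling, not part of the original notebook; the helper name to_weighted_points is hypothetical, and the sample rows are copied from the table above:

```python
def to_weighted_points(rows):
    """Return (lat, lon, weight) tuples with weights min-max scaled to [0, 1]."""
    mags = [r['Magnitude'] for r in rows]
    lo, hi = min(mags), max(mags)
    span = (hi - lo) or 1.0  # avoid division by zero when all magnitudes are equal
    return [(r['Latitude'], r['Longitude'], (r['Magnitude'] - lo) / span)
            for r in rows]

# Three rows copied from the table of strong earthquakes
sample_rows = [
    {'Latitude': 3.295, 'Longitude': 95.982, 'Magnitude': 9.1},
    {'Latitude': -36.122, 'Longitude': -72.898, 'Magnitude': 8.8},
    {'Latitude': 54.892, 'Longitude': 153.221, 'Magnitude': 8.3},
]
points = to_weighted_points(sample_rows)
```

The resulting list can be passed to HeatMap in the same way as the data list in the cells above.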

2.3.2 Visualizing UFO Sightings

In this section, we will use Folium to analyze the UFO Sightings dataset. First, let's explore the data:

In [49]:
from IPython.lib.display import YouTubeVideo
YouTubeVideo('hAAlDoAtV7Y')
Out[49]:
In [50]:
#!mkdir ./datasets
!mkdir ./datasets/ufo

# download the dataset from Kaggle and unzip it
!kaggle datasets download NUFORC/ufo-sightings  -p ./datasets/ufo

!unzip ./datasets/ufo/*.zip  -d ./datasets/ufo/
Downloading ufo-sightings.zip to ./datasets/ufo
 98%|█████████████████████████████████████▏| 10.0M/10.2M [00:01<00:00, 5.84MB/s]
100%|██████████████████████████████████████| 10.2M/10.2M [00:01<00:00, 5.51MB/s]
Archive:  ./datasets/ufo/ufo-sightings.zip
  inflating: ./datasets/ufo/complete.csv  
  inflating: ./datasets/ufo/scrubbed.csv  
In [54]:
import turicreate as tc
import turicreate.aggregate as agg
import dateutil.parser
import matplotlib.pyplot as plt
import seaborn as sns
%matplotlib inline
sns.set()

sf = tc.SFrame.read_csv('./datasets/ufo/complete.csv')
sf
Unexpected characters after last column. "0"
Parse failed at token ending at: 
	5/5/2002 13:00,,,,,0,,,"Characteristics: Not aura or haze, but more like emanation of an energy field.",5/24/2005,0,0^
Successfully parsed 11 tokens: 
	0: 5/5/2002 13:00
	1: 
	2: 
	3: 
	4: 
	5: 0
	6: 
	7: 
	8: Characteri ... rgy field.
	9: 5/24/2005
	10: 0
Unexpected characters after last column. "0"
Parse failed at token ending at: 
	3/30/2007 04:30,,ms,,,0,sphere,5 minutes,"White luminous Sphere",4/27/2007,0,0^
Successfully parsed 11 tokens: 
	0: 3/30/2007 04:30
	1: 
	2: ms
	3: 
	4: 
	5: 0
	6: sphere
	7: 5 minutes
	8: White luminous Sphere
	9: 4/27/2007
	10: 0
Unexpected characters after last column. "0"
Parse failed at token ending at: 
	3/31/1966 23:50,,,,,0,,,"Please refer to my report I reported to you 6/7/2001 and you posted on 8/5/2001.  Anyway, do you know of any doctor that would remove t",10/30/2006,0,0^
Successfully parsed 11 tokens: 
	0: 3/31/1966 23:50
	1: 
	2: 
	3: 
	4: 
	5: 0
	6: 
	7: 
	8: Please ref ... d remove t
	9: 10/30/2006
	10: 0
Unexpected characters after last column. "0"
Parse failed at token ending at: 
	7/15/2013 21:00,,ms,,,0,formation,5 minutes,"My two children and I saw strange lights move around the sky. Antigravity type motion,silent,some suddenly vanished",8/30/2013,0,0^
Successfully parsed 11 tokens: 
	0: 7/15/2013 21:00
	1: 
	2: ms
	3: 
	4: 
	5: 0
	6: formation
	7: 5 minutes
	8: My two chi ... y vanished
	9: 8/30/2013
	10: 0
Unexpected characters after last column. "0"
Parse failed at token ending at: 
	1/26/2009 24:00,,,,,0,,,"I followed an FAA regulation, and you went back on your word for me to remain anonymous. Now you've been removed from the 7110.65. HA!",3/19/2009,0,0^
Successfully parsed 11 tokens: 
	0: 1/26/2009 24:00
	1: 
	2: 
	3: 
	4: 
	5: 0
	6: 
	7: 
	8: I followed ... 65. HA!
	9: 3/19/2009
	10: 0
Unexpected characters after last column. "0"
Parse failed at token ending at: 
	3/31/2010 01:30,,nv,,,0,teardrop,,"unusual  flying objects taken by flash earth satalite  over nevada highway",11/21/2010,0,0^
Successfully parsed 11 tokens: 
	0: 3/31/2010 01:30
	1: 
	2: nv
	3: 
	4: 
	5: 0
	6: teardrop
	7: 
	8: unusual  f ... da highway
	9: 11/21/2010
	10: 0
Unexpected characters after last column. "0"
Parse failed at token ending at: 
	12/7/2002 14:00,,,,,0,circle,,"it was round made nosies i saw it uabove the sky.",12/23/2002,0,0^
Successfully parsed 11 tokens: 
	0: 12/7/2002 14:00
	1: 
	2: 
	3: 
	4: 
	5: 0
	6: circle
	7: 
	8: it was rou ... e the sky.
	9: 12/23/2002
	10: 0
Unexpected characters after last column. "0"
Parse failed at token ending at: 
	5/6/2008 02:00,,,,,0,circle,30 seconds,"it was early morning i was reading a book for some reason and i saw 5 circle shaped UFO'S in the sky that were moving rapidly through t",6/12/2008,0,0^
Successfully parsed 11 tokens: 
	0: 5/6/2008 02:00
	1: 
	2: 
	3: 
	4: 
	5: 0
	6: circle
	7: 30 seconds
	8: it was ear ...  through t
	9: 6/12/2008
	10: 0
Unexpected characters after last column. "0"
Parse failed at token ending at: 
	12/7/2008 12:07,,,,,0,,,"I was looking at a friends house and tilted the view up(north) and noticed what appears to be a classic shape of a "space ship".",1/10/2009,0,0^
Successfully parsed 11 tokens: 
	0: 12/7/2008 12:07
	1: 
	2: 
	3: 
	4: 
	5: 0
	6: 
	7: 
	8: I was look ... hip".
	9: 1/10/2009
	10: 0
Unexpected characters after last column. "0"
Parse failed at token ending at: 
	11/20/2008 22:00,,tx,,,0,light,30 min,"10:00 pm= I saw 2 pair of lights. moments later one pair disapeared. another moment later I saw the other pair disapear & I saw a air p",1/10/2009,0,0^
Successfully parsed 11 tokens: 
	0: 11/20/2008 22:00
	1: 
	2: tx
	3: 
	4: 
	5: 0
	6: light
	7: 30 min
	8: 10:00 pm=  ... aw a air p
	9: 1/10/2009
	10: 0
196 lines failed to parse correctly
Finished parsing file /Users/michael/Dropbox (BGU)/massive data mining/ 2020/notebooks/datasets/ufo/complete.csv
Parsing completed. Parsed 100 lines in 0.258661 secs.
------------------------------------------------------
Inferred types from first 100 line(s) of file as 
column_type_hints=[str,str,str,str,str,int,str,str,str,float,float]
If parsing fails due to incorrect types, you can correct
the inferred type list above and pass it to read_csv in
the column_type_hints argument
------------------------------------------------------
Unexpected characters after last column. "0"
Parse failed at token ending at: 
	5/5/2002 13:00,,,,,0,,,"Characteristics: Not aura or haze, but more like emanation of an energy field.",5/24/2005,0,0^
Successfully parsed 11 tokens: 
	0: 5/5/2002 13:00
	1: 
	2: 
	3: 
	4: 
	5: 0
	6: 
	7: 
	8: Characteri ... rgy field.
	9: 5
	10: 0
Unexpected characters after last column. "0"
Parse failed at token ending at: 
	3/30/2007 04:30,,ms,,,0,sphere,5 minutes,"White luminous Sphere",4/27/2007,0,0^
Successfully parsed 11 tokens: 
	0: 3/30/2007 04:30
	1: 
	2: ms
	3: 
	4: 
	5: 0
	6: sphere
	7: 5 minutes
	8: White luminous Sphere
	9: 4
	10: 0
Unexpected characters after last column. "0"
Parse failed at token ending at: 
	3/31/1966 23:50,,,,,0,,,"Please refer to my report I reported to you 6/7/2001 and you posted on 8/5/2001.  Anyway, do you know of any doctor that would remove t",10/30/2006,0,0^
Successfully parsed 11 tokens: 
	0: 3/31/1966 23:50
	1: 
	2: 
	3: 
	4: 
	5: 0
	6: 
	7: 
	8: Please ref ... d remove t
	9: 10
	10: 0
Unexpected characters after last column. "0"
Parse failed at token ending at: 
	7/15/2013 21:00,,ms,,,0,formation,5 minutes,"My two children and I saw strange lights move around the sky. Antigravity type motion,silent,some suddenly vanished",8/30/2013,0,0^
Successfully parsed 11 tokens: 
	0: 7/15/2013 21:00
	1: 
	2: ms
	3: 
	4: 
	5: 0
	6: formation
	7: 5 minutes
	8: My two chi ... y vanished
	9: 8
	10: 0
Unexpected characters after last column. "0"
Parse failed at token ending at: 
	3/31/2010 01:30,,nv,,,0,teardrop,,"unusual  flying objects taken by flash earth satalite  over nevada highway",11/21/2010,0,0^
Successfully parsed 11 tokens: 
	0: 3/31/2010 01:30
	1: 
	2: nv
	3: 
	4: 
	5: 0
	6: teardrop
	7: 
	8: unusual  f ... da highway
	9: 11
	10: 0
Unexpected characters after last column. "0"
Parse failed at token ending at: 
	1/26/2009 24:00,,,,,0,,,"I followed an FAA regulation, and you went back on your word for me to remain anonymous. Now you've been removed from the 7110.65. HA!",3/19/2009,0,0^
Successfully parsed 11 tokens: 
	0: 1/26/2009 24:00
	1: 
	2: 
	3: 
	4: 
	5: 0
	6: 
	7: 
	8: I followed ... 65. HA!
	9: 3
	10: 0
Unexpected characters after last column. "0"
Parse failed at token ending at: 
	11/20/2008 22:00,,tx,,,0,light,30 min,"10:00 pm= I saw 2 pair of lights. moments later one pair disapeared. another moment later I saw the other pair disapear & I saw a air p",1/10/2009,0,0^
Successfully parsed 11 tokens: 
	0: 11/20/2008 22:00
	1: 
	2: tx
	3: 
	4: 
	5: 0
	6: light
	7: 30 min
	8: 10:00 pm=  ... aw a air p
	9: 1
	10: 0
Unexpected characters after last column. "0"
Parse failed at token ending at: 
	5/6/2008 02:00,,,,,0,circle,30 seconds,"it was early morning i was reading a book for some reason and i saw 5 circle shaped UFO'S in the sky that were moving rapidly through t",6/12/2008,0,0^
Successfully parsed 11 tokens: 
	0: 5/6/2008 02:00
	1: 
	2: 
	3: 
	4: 
	5: 0
	6: circle
	7: 30 seconds
	8: it was ear ...  through t
	9: 6
	10: 0
Unexpected characters after last column. "0"
Parse failed at token ending at: 
	12/7/2002 14:00,,,,,0,circle,,"it was round made nosies i saw it uabove the sky.",12/23/2002,0,0^
Successfully parsed 11 tokens: 
	0: 12/7/2002 14:00
	1: 
	2: 
	3: 
	4: 
	5: 0
	6: circle
	7: 
	8: it was rou ... e the sky.
	9: 12
	10: 0
Unexpected characters after last column. "0"
Parse failed at token ending at: 
	12/7/2008 12:07,,,,,0,,,"I was looking at a friends house and tilted the view up(north) and noticed what appears to be a classic shape of a "space ship".",1/10/2009,0,0^
Successfully parsed 11 tokens: 
	0: 12/7/2008 12:07
	1: 
	2: 
	3: 
	4: 
	5: 0
	6: 
	7: 
	8: I was look ... hip".
	9: 1
	10: 0
196 lines failed to parse correctly
Finished parsing file /Users/michael/Dropbox (BGU)/massive data mining/ 2020/notebooks/datasets/ufo/complete.csv
Parsing completed. Parsed 88679 lines in 0.303616 secs.
Out[54]:
datetime city state country shape duration (seconds) duration (hours/min)
10/10/1949 20:30 san marcos tx us cylinder 2700 45 minutes
10/10/1949 21:00 lackland afb tx light 7200 1-2 hrs
10/10/1955 17:00 chester (uk/england) gb circle 20 20 seconds
10/10/1956 21:00 edna tx us circle 20 1/2 hour
10/10/1960 20:00 kaneohe hi us light 900 15 minutes
10/10/1961 19:00 bristol tn us sphere 300 5 minutes
10/10/1965 21:00 penarth (uk/wales) gb circle 180 about 3 mins
10/10/1965 23:45 norwalk ct us disk 1200 20 minutes
10/10/1966 20:00 pell city al us disk 180 3 minutes
10/10/1966 21:00 live oak fl us disk 120 several minutes
comments date posted latitude longitude
This event took place in
early fall around ...
4/27/2004 29.8830556 -97.9411111
1949 Lackland AFB&#44 TX.
Lights racing across the ...
12/16/2005 29.38421 -98.581082
Green/Orange circular
disc over Chester&#44 ...
1/21/2008 53.2 -2.916667
My older brother and twin
sister were leaving the ...
1/17/2004 28.9783333 -96.6458333
AS a Marine 1st Lt.
flying an FJ4B ...
1/22/2004 21.4180556 -157.8036111
My father is now 89 my
brother 52 the girl with ...
4/27/2007 36.595 -82.1888889
penarth uk circle 3mins
stayed 30ft above me for ...
2/14/2006 51.434722 -3.18
A bright orange color
changing to reddish c ...
10/2/1999 41.1175 -73.4083333
Strobe Lighted disk shape
object observed close ...
3/19/2009 33.5861111 -86.2861111
Saucer zaps energy from
powerline as my pregnant ...
5/11/2005 30.2947222 -82.9841667
[88679 rows x 11 columns]
Note: Only the head of the SFrame is printed.
You can use print_rows(num_rows=m, num_columns=n) to print more rows and columns.
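
The log above reports that 196 lines failed to parse because they contain an extra value after the last column. Before deciding whether to drop or repair such lines, it can be useful to look at them. The following is a minimal sketch that uses only Python's csv module to separate rows by their number of fields; the function name split_by_field_count is hypothetical, and the two sample lines are taken from the output above:

```python
import csv
import io

def split_by_field_count(lines, expected):
    """Split parsed CSV rows into (good, bad) by whether they have `expected` fields."""
    good, bad = [], []
    for row in csv.reader(lines):
        (good if len(row) == expected else bad).append(row)
    return good, bad

# One well-formed line and one line from the parse log with a trailing extra value
sample = io.StringIO(
    '10/10/1949 20:30,san marcos,tx,us,cylinder,2700,45 minutes,'
    '"This event took place in early fall around ...",4/27/2004,29.8830556,-97.9411111\n'
    '3/30/2007 04:30,,ms,,,0,sphere,5 minutes,"White luminous Sphere",4/27/2007,0,0\n'
)
good, bad = split_by_field_count(sample, expected=11)
```

For the real file, an open file handle for complete.csv could be passed in place of the StringIO object, assuming the file uses the default comma delimiter and double-quote quoting.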
In [55]:
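def get_datetime(dt_str):
    # Return None for values that cannot be parsed as a date
    try:
        return dateutil.parser.parse(dt_str)
    except (ValueError, OverflowError, TypeError):
        return None


sf['datetime'] = sf['datetime'].apply(get_datetime)
sf = sf.dropna()  # drops rows with a missing value in any column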
sf['Hour'] = sf['datetime'].apply(lambda dt: dt.hour)
sf['Month'] = sf['datetime'].apply(lambda dt: dt.month)
sf['Year'] = sf['datetime'].apply(lambda dt: dt.year)
sf['Decade'] = sf['Year'].apply(lambda y: y - y%10)
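
Note that get_datetime returns None for any string it cannot parse, and those rows are then dropped. One suspicious value that appears in the parse log above is 1/26/2009 24:00, which is not a valid time for Python's datetime. The following is a minimal sketch of a stricter parser built on the standard library; the function name parse_sighting_time is hypothetical, the M/D/YYYY HH:MM format is inferred from the sample rows, and treating 24:00 as midnight of the next day is an assumption about what the reporters meant:

```python
from datetime import datetime, timedelta

def parse_sighting_time(dt_str):
    """Parse 'M/D/YYYY HH:MM'; treat '24:00' as midnight of the next day.

    Returns None when the string does not match the expected format.
    """
    try:
        date_part, time_part = dt_str.strip().split(' ')
        if time_part == '24:00':
            day = datetime.strptime(date_part, '%m/%d/%Y')
            return day + timedelta(days=1)
        return datetime.strptime(f'{date_part} {time_part}', '%m/%d/%Y %H:%M')
    except (ValueError, AttributeError):
        return None

dt = parse_sighting_time('10/10/1949 20:30')
decade = dt.year - dt.year % 10  # same decade formula as in the cell above
```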
In [56]:
sf2 = sf[sf['Year'] >= 1950]
g = sf2.groupby('Decade', {'Sightings Number': agg.COUNT()})
df = g.to_dataframe()
fig, ax = plt.subplots(figsize=(10,6))
sns.barplot(x='Decade', y='Sightings Number', data=df, ax=ax)
Out[56]:
<matplotlib.axes._subplots.AxesSubplot at 0x15efb0be0>
In [57]:
g = sf.groupby('Hour', {'Sightings Number': agg.COUNT()})
df = g.to_dataframe()
fig, ax = plt.subplots(figsize=(10,6))
sns.barplot(x='Hour', y='Sightings Number', data=df, ax=ax)
Out[57]:
<matplotlib.axes._subplots.AxesSubplot at 0x1609e7e10>
In [58]:
g = sf.groupby('shape', {'Sightings Number': agg.COUNT()})
g = g.sort('Sightings Number')
g = g[g['Sightings Number'] > 100]
df = g.to_dataframe()

fig, ax = plt.subplots(figsize=(10,6))

sns.barplot(x='shape', y='Sightings Number', data=df, ax=ax)
plt.xticks(rotation=45)
Out[58]:
(array([ 0,  1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11, 12, 13, 14, 15, 16,
        17, 18, 19, 20, 21]), <a list of 22 Text xticklabel objects>)
In [59]:
import folium
from folium.plugins import HeatMap

tiles = 'Stamen Terrain'
m = folium.Map(location=[0,0], zoom_start=2,
               tiles = tiles)

data = [(r['latitude'],r['longitude']) for r in sf]
HeatMap(data, radius = 20).add_to(m)
m
Out[59]:
Make this Notebook Trusted to load map: File -> Trust Notebook

Let's visualize the sighting locations in the 1980s:

In [60]:
sf_80 = sf[sf['Decade'] == 1980]
len(sf_80)
Out[60]:
2273
In [61]:
from folium.plugins import MarkerCluster
def pop_text(r):
    txt = f"<b>{r['datetime']}</b><br> {r['comments'][:500]}"
    return txt

m = folium.Map(zoom_start=8, tiles='CartoDB dark_matter')
mc = MarkerCluster()
for r in sf_80[:200]:
    mc.add_child(folium.CircleMarker(location=[r['latitude'],r['longitude']],
                        radius=5,color="#007849", popup=pop_text(r), parse_html=True))
m.add_child(mc)
m
Out[61]:
Make this Notebook Trusted to load map: File -> Trust Notebook
In [62]:
def get_color(shape):
    # Use the function argument instead of the global loop variable r
    if shape == 'circle':
        return 'red'
    if shape == 'triangle':
        return 'green'
    else:
        return 'blue'

s_sf = sf[sf['shape'].apply(lambda s: s in ['circle', 'triangle'])]
random_sample_sf, x = s_sf.random_split(0.1)

m = folium.Map(zoom_start=8, tiles='CartoDB dark_matter')
for r in random_sample_sf:
    m.add_child(folium.CircleMarker(location=[r['latitude'],r['longitude']],
                        radius=5,color=get_color(r['shape']), popup=pop_text(r), parse_html=True))
m
Out[62]:
Make this Notebook Trusted to load map: File -> Trust Notebook
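
The shape-to-color mapping used for the markers above can also be kept in a dictionary, which makes it easier to add more shapes later. This is only a small sketch of that alternative; the names SHAPE_COLORS and color_for_shape are hypothetical:

```python
# Colors for the two shapes plotted above; any other shape falls back to the default
SHAPE_COLORS = {'circle': 'red', 'triangle': 'green'}

def color_for_shape(shape, default='blue'):
    """Return the marker color for a sighting shape."""
    return SHAPE_COLORS.get(shape, default)

colors = [color_for_shape(s) for s in ['circle', 'triangle', 'disk']]
```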

2.3.3 Working with TopoJSON

In this example, we are going to work with Folium and TopoJSON. Namely, we are going to draw an interactive choropleth map of the population in Washington state by county. First, let's download the population data of Washington state's counties and a TopoJSON file with the counties' geographic data. Then, let's draw the county boundaries on a map:

In [65]:
#!mkdir ./datasets
!mkdir ./datasets/WA
!wget -O ./datasets/WA/population.csv https://data.wa.gov/api/views/2hia-rqet/rows.csv?accessType=DOWNLOAD
!wget -O ./datasets/WA/WA-53-washington-counties.json https://raw.githubusercontent.com/deldersveld/topojson/master/countries/us-states/WA-53-washington-counties.json
--2020-05-08 16:04:25--  https://data.wa.gov/api/views/2hia-rqet/rows.csv?accessType=DOWNLOAD
Resolving data.wa.gov (data.wa.gov)... 52.206.68.26, 52.206.140.205, 52.206.140.199
Connecting to data.wa.gov (data.wa.gov)|52.206.68.26|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: unspecified [text/csv]
Saving to: ‘./datasets/WA/population.csv’

./datasets/WA/popul     [ <=>                ]  73.63K   449KB/s    in 0.2s    

2020-05-08 16:04:26 (449 KB/s) - ‘./datasets/WA/population.csv’ saved [75396]

--2020-05-08 16:04:26--  https://raw.githubusercontent.com/deldersveld/topojson/master/countries/us-states/WA-53-washington-counties.json
Resolving raw.githubusercontent.com (raw.githubusercontent.com)... 151.101.112.133
Connecting to raw.githubusercontent.com (raw.githubusercontent.com)|151.101.112.133|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 15542 (15K) [text/plain]
Saving to: ‘./datasets/WA/WA-53-washington-counties.json’

./datasets/WA/WA-53 100%[===================>]  15.18K  --.-KB/s    in 0.07s   

2020-05-08 16:04:27 (232 KB/s) - ‘./datasets/WA/WA-53-washington-counties.json’ saved [15542/15542]

In [70]:
import folium

datasets_path = "./datasets/WA"
tiles = 'Mapbox Bright'
m = folium.Map(
    location=[47.6117, -122.332],
    tiles=tiles,
    zoom_start=7
)
folium.Marker(
    location=[47.611700, -122.332000], # coordinates for the marker (Seattle)
    popup='Here is Washington State', # pop-up label for the marker
    icon=folium.Icon()
).add_to(m)

# Create a layer of Washington State counties
topoJSONpath = f"{datasets_path}/WA-53-washington-counties.json"
folium.TopoJson(
    open(topoJSONpath),
    object_path='objects.cb_2015_washington_county_20m',

).add_to(m)
m
Out[70]:
Make this Notebook Trusted to load map: File -> Trust Notebook
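
The object_path argument above has to name an entry under the "objects" key of the TopoJSON file. If you work with a different file, you need to find that name first. The following is a minimal sketch for listing the candidate paths; the function name list_object_paths is hypothetical, and the tiny topology below is a stand-in that only mimics the structure of the real file:

```python
import json

def list_object_paths(topology):
    """Return 'objects.<name>' paths for every object in a TopoJSON dict."""
    if topology.get('type') != 'Topology':
        raise ValueError('not a TopoJSON topology')
    return [f'objects.{name}' for name in topology.get('objects', {})]

# A miniature stand-in for the downloaded counties file
tiny = json.loads('''
{"type": "Topology",
 "arcs": [],
 "objects": {"cb_2015_washington_county_20m":
             {"type": "GeometryCollection", "geometries": []}}}
''')
paths = list_object_paths(tiny)
```

For the real file, the dictionary returned by json.load on the opened TopoJSON file can be passed to the same function.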

Now, let's use the counties' population data to create a choropleth map:

In [68]:
import geopandas as gpd
gdf = gpd.read_file(topoJSONpath)

gdf.plot()
Out[68]:
<matplotlib.axes._subplots.AxesSubplot at 0x16105cbe0>
In [69]:
gdf.head()
Out[69]:
id STATEFP COUNTYFP COUNTYNS AFFGEOID GEOID NAME LSAD ALAND AWATER geometry
0 None 53 063 01529225 0500000US53063 53063 Spokane 06 4568197031 43789502 POLYGON ((-117.82144 47.82584, -117.69950 47.8...
1 None 53 041 01531927 0500000US53041 53041 Lewis 06 6223223859 86636988 POLYGON ((-123.37212 46.79151, -123.20424 46.7...
2 None 53 025 01531924 0500000US53025 53025 Grant 06 6939890129 289550821 POLYGON ((-120.00743 47.22020, -120.00566 47.3...
3 None 53 051 01529157 0500000US53051 53051 Pend Oreille 06 3626035232 65404066 POLYGON ((-117.42912 48.99974, -117.26831 48.9...
4 None 53 023 01533500 0500000US53023 53023 Garfield 06 1840672367 19490299 POLYGON ((-117.85325 46.62453, -117.75075 46.6...
In [73]:
import pandas as pd
wa_pop_path = f"{datasets_path}/population.csv"
df = pd.read_csv(wa_pop_path)
df 
Out[73]:
SEQUENCE FILTER COUNTY JURISDICTION POP_1990 POP_1991 POP_1992 POP_1993 POP_1994 POP_1995 ... POP_2010 POP_2011 POP_2012 POP_2013 POP_2014 POP_2015 POP_2016 POP_2017 POP_2018 POP_2019
0 1 1 Adams Adams County 13603.0 13823.0 14063.0 14335.0 14679.0 15030.0 ... 18728 18950 19050 19200 19400 19410 19510 19870 20020 20150
1 2 2 Adams Unincorporated Adams County 6466.0 6698.0 6776.0 7009.0 7162.0 7303.0 ... 8818 8960 8980 9040 9135 9085 9105 9165 9220 9270
2 3 3 Adams Incorporated Adams County 7137.0 7125.0 7287.0 7326.0 7517.0 7727.0 ... 9910 9990 10070 10160 10265 10325 10405 10705 10800 10880
3 4 4 Adams Hatton 71.0 80.0 81.0 82.0 83.0 84.0 ... 101 100 105 110 110 110 110 110 110 115
4 5 4 Adams Lind 472.0 400.0 523.0 435.0 452.0 451.0 ... 564 560 565 570 565 560 550 550 550 550
... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ...
404 405 4 Yakima Yakima 54843.0 58925.0 60455.0 61693.0 62387.0 63930.0 ... 91196 91630 91930 92620 93080 93220 93410 93900 94190 94440
405 406 4 Yakima Zillah 1911.0 1922.0 1938.0 1991.0 2062.0 2096.0 ... 2964 3000 3035 3115 3140 3140 3145 3150 3165 3185
406 407 1 Washington State Total 4866659.0 5000353.0 5091138.0 5188009.0 5291577.0 5396569.0 ... 6724540 6767900 6817770 6882400 6968170 7061410 7183700 7310300 7427570 7546410
407 408 2 Washington Unincorporated State Total 2341365.0 2394824.0 2438904.0 2435178.0 2475442.0 2507091.0 ... 2478323 2454633 2438547 2449701 2470761 2497039 2516902 2557466 2591085 2635501
408 409 3 Washington Incorporated State Total 2525294.0 2605529.0 2652234.0 2752831.0 2816135.0 2889478.0 ... 4246217 4313267 4379223 4432699 4497409 4564371 4666798 4752834 4836485 4910909

409 rows × 34 columns

In [75]:
import turicreate as tc
import turicreate.aggregate as agg
import pandas as pd


sf = tc.SFrame.read_csv(wa_pop_path)
sf = sf[sf['JURISDICTION'].apply(lambda j: "corporated" not in j)]
sf
Finished parsing file /Users/michael/Dropbox (BGU)/massive data mining/ 2020/notebooks/datasets/WA/population.csv
Parsing completed. Parsed 100 lines in 0.00683 secs.
------------------------------------------------------
Inferred types from first 100 line(s) of file as 
column_type_hints=[int,int,str,str,int,int,int,int,int,int,int,int,int,int,int,int,int,int,int,int,int,int,int,int,int,int,int,int,int,int,int,int,int,int]
If parsing fails due to incorrect types, you can correct
the inferred type list above and pass it to read_csv in
the column_type_hints argument
------------------------------------------------------
Finished parsing file /Users/michael/Dropbox (BGU)/massive data mining/ 2020/notebooks/datasets/WA/population.csv
Parsing completed. Parsed 409 lines in 0.006773 secs.
Out[75]:
SEQUENCE FILTER COUNTY JURISDICTION POP_1990 POP_1991 POP_1992 POP_1993 POP_1994 POP_1995 POP_1996
1 1 Adams Adams County 13603 13823 14063 14335 14679 15030 15323
4 4 Adams Hatton 71 80 81 82 83 84 93
5 4 Adams Lind 472 400 523 435 452 451 484
6 4 Adams Othello 4638 4692 4735 4868 5033 5240 5265
7 4 Adams Ritzville 1725 1728 1730 1729 1730 1733 1733
8 4 Adams Washtucna 231 225 218 212 219 219 218
9 1 Asotin Asotin County 17605 17677 17866 18124 18666 18937 19622
12 4 Asotin Asotin 981 1039 1046 1002 1017 1072 1086
13 4 Asotin Clarkston 6753 6632 6762 6771 7128 6748 6982
14 1 Benton Benton County 112560 114439 116503 119374 123457 128359 132590
POP_1997 POP_1998 POP_1999 POP_2000 POP_2001 POP_2002 POP_2003 POP_2004 POP_2005 POP_2006 POP_2007
15698 15879 16151 16428 16699 16911 17081 17489 17643 17690 17959
94 96 97 98 119 97 97 97 97 96 96
517 535 567 582 582 576 574 561 556 556 550
5508 5614 5681 5847 5961 6062 6129 6434 6551 6523 6714
1731 1733 1733 1736 1745 1735 1716 1707 1695 1685 1678
250 254 258 260 255 248 241 239 237 243 239
19943 20202 20442 20551 20650 20652 20709 20779 20939 21176 21413
1083 1094 1081 1095 1106 1114 1120 1128 1133 1174 1189
7168 7369 7565 7337 7371 7300 7275 7265 7270 7258 7273
135620 137717 139498 142475 145267 148290 151933 155874 159286 162255 165096
POP_2008 POP_2009 POP_2010 POP_2011 POP_2012 POP_2013 POP_2014 POP_2015 POP_2016 POP_2017 POP_2018
18214 18421 18728 18950 19050 19200 19400 19410 19510 19870 20020
96 98 101 100 105 110 110 110 110 110 110
550 550 564 560 565 570 565 560 550 550 550
6931 7089 7364 7420 7495 7565 7695 7780 7875 8175 8270
1681 1675 1673 1705 1695 1700 1680 1670 1660 1660 1660
214 210 208 205 210 215 215 205 210 210 210
21522 21593 21623 21650 21700 21800 21950 22010 22150 22290 22420
1224 1244 1251 1255 1255 1265 1265 1260 1270 1275 1275
7226 7228 7229 7200 7205 7210 7225 7235 7260 7250 7205
167598 171402 175177 177900 180000 183400 186500 188590 190500 193500 197420
POP_2019
20150
115
550
8345
1660
210
22520
1280
7205
201800
[? rows x 34 columns]
Note: Only the head of the SFrame is printed. This SFrame is lazily evaluated.
You can use sf.materialize() to force materialization.
In [82]:
sf = sf[sf["FILTER"] == 1]
sf
Out[82]:
SEQUENCE FILTER COUNTY JURISDICTION POP_1990 POP_1991 POP_1992 POP_1993 POP_1994 POP_1995
1 1 Adams Adams County 13603 13823 14063 14335 14679 15030
9 1 Asotin Asotin County 17605 17677 17866 18124 18666 18937
14 1 Benton Benton County 112560 114439 116503 119374 123457 128359
22 1 Chelan Chelan County 52250 53436 54965 56423 58319 60079
30 1 Clallam Clallam County 56210 57626 58275 59155 59919 60548
36 1 Clark Clark County 238053 248417 255915 264548 274423 286804
47 1 Columbia Columbia County 4024 4044 4041 4047 4053 4051
52 1 Cowlitz Cowlitz County 82119 82862 83802 84813 85921 87232
60 1 Douglas Douglas County 26195 27648 27423 28143 28692 29312
69 1 Ferry Ferry County 6295 6366 6465 6561 6648 6812
POP_1996 POP_1997 POP_1998 POP_1999 POP_2000 POP_2001 POP_2002 POP_2003 POP_2004 POP_2005 POP_2006
15323 15698 15879 16151 16428 16699 16911 17081 17489 17643 17690
19622 19943 20202 20442 20551 20650 20652 20709 20779 20939 21176
132590 135620 137717 139498 142475 145267 148290 151933 155874 159286 162255
61240 62895 64199 65575 66616 66896 67400 67507 68013 68963 69895
61469 62037 62933 63425 64179 64717 65398 65928 66725 67672 68948
298364 310512 323892 334641 345238 352715 364855 374091 385370 394600 404737
4051 4056 4058 4062 4064 4114 4115 4111 4122 4135 4128
88531 89568 90600 91744 92948 94081 94854 95849 96593 97673 99095
29967 30548 31427 32035 32603 32817 32871 33545 33944 34466 35505
6950 7058 7135 7184 7260 7340 7378 7397 7367 7405 7462
POP_2007 POP_2008 POP_2009 POP_2010 POP_2011 POP_2012 POP_2013 POP_2014 POP_2015 POP_2016 POP_2017
17959 18214 18421 18728 18950 19050 19200 19400 19410 19510 19870
21413 21522 21593 21623 21650 21700 21800 21950 22010 22150 22290
165096 167598 171402 175177 177900 180000 183400 186500 188590 190500 193500
70773 71799 72185 72453 72700 73200 73600 74300 75030 75910 76830
69847 70629 71027 71404 71600 72000 72350 72500 72650 73410 74240
412692 419091 423775 425363 428000 431250 435500 442800 451820 461010 471000
4095 4099 4097 4078 4100 4100 4100 4080 4090 4050 4100
100377 101542 102175 102410 102700 103050 103300 103700 104280 104850 105900
36340 37238 38036 38431 38650 38900 39280 39700 39990 40720 41420
7484 7529 7563 7551 7600 7650 7650 7660 7710 7700 7740
POP_2018 POP_2019
20020 20150
22420 22520
197420 201800
77800 78420
75130 76010
479500 488500
4150 4160
107310 108950
42120 42820
7780 7830
[? rows x 34 columns]
Note: Only the head of the SFrame is printed. This SFrame is lazily evaluated.
You can use sf.materialize() to force materialization.
In [83]:
total_wa = sf['POP_2019'].sum()
sf['Pop Percentage'] = sf['POP_2019'].apply(lambda p: p/float(total_wa))
sf = sf.rename({"COUNTY":"NAME"})
gsf = tc.SFrame(gdf[['GEOID', 'NAME']]) 
g = sf.join(gsf, on="NAME")
g.sort('Pop Percentage', ascending=False)
Out[83]:
SEQUENCE FILTER NAME JURISDICTION POP_1990 POP_1991 POP_1992 POP_1993 POP_1994 POP_1995
124 1 King King County 1507305 1549991 1570997 1590603 1609529 1625241
245 1 Pierce Pierce County 586203 598065 610619 623697 636802 649284
292 1 Snohomish Snohomish County 465628 488075 496461 507336 519960 531704
315 1 Spokane Spokane County 361333 365887 371147 377020 384035 391318
36 1 Clark Clark County 238053 248417 255915 264548 274423 286804
340 1 Thurston Thurston County 161238 167663 172425 177058 181715 186419
166 1 Kitsap Kitsap County 189731 196926 202113 207976 212429 218308
390 1 Yakima Yakima County 188823 191490 194939 198225 202044 206046
361 1 Whatcom Whatcom County 127780 132669 137791 141270 146056 149942
14 1 Benton Benton County 112560 114439 116503 119374 123457 128359
POP_1996 POP_1997 POP_1998 POP_1999 POP_2000 POP_2001 POP_2002 POP_2003 POP_2004 POP_2005 POP_2006
1640249 1659106 1686266 1712122 1737046 1755487 1777514 1788082 1800783 1814999 1845209
653212 664070 675651 688884 700818 709288 721124 731969 743701 756919 774050
542738 554585 570896 589266 606024 617864 629287 639942 648778 661346 676126
397508 403954 408740 413665 417939 423127 428755 428889 431905 438249 446751
298364 310512 323892 334641 345238 352715 364855 374091 385370 394600 404737
190409 194440 198435 203167 207355 210102 214139 218264 223065 229286 234083
221849 225251 227179 229569 231969 233918 236656 239443 240777 239819 244049
209381 212375 215587 219483 222581 224229 224790 227956 230002 231902 234408
154122 157071 160667 163774 166826 170980 174238 175984 180245 184965 190088
132590 135620 137717 139498 142475 145267 148290 151933 155874 159286 162255
POP_2007 POP_2008 POP_2009 POP_2010 POP_2011 POP_2012 POP_2013 POP_2014 POP_2015 POP_2016 POP_2017
1871098 1891125 1909205 1931249 1942600 1957000 1981900 2017250 2052800 2105100 2153700
786911 794330 796900 795225 802150 808200 814500 821300 830120 844490 859400
689314 699330 705894 713335 717000 722900 730500 741000 757600 772860 789400
454034 460303 466426 471221 472650 475600 480000 484500 488310 492530 499800
412692 419091 423775 425363 428000 431250 435500 442800 451820 461010 471000
239570 244853 249336 252264 254100 256800 260100 264000 267410 272690 276900
247476 249905 251249 251133 253900 254500 254000 255900 258200 262590 264300
236923 239524 241708 243231 244700 246000 247250 248800 249970 250900 253000
195298 197675 199736 201140 202100 203500 205800 207600 209790 212540 216300
165096 167598 171402 175177 177900 180000 183400 186500 188590 190500 193500
POP_2018 POP_2019 Pop Percentage GEOID
2190200 2226300 0.14750722528990606 53033
872220 888300 0.05885580030769598 53053
805120 818700 0.05424433604853168 53061
507950 515250 0.034138749418597715 53063
479500 488500 0.032366383485657416 53011
281700 285800 0.018936156397545322 53067
267120 270100 0.017895926672417746 53035
254500 255950 0.01695839478639512 53077
220350 225300 0.014927627838932684 53073
197420 201800 0.013370596084760834 53005
[39 rows x 36 columns]
Note: Only the head of the SFrame is printed.
You can use print_rows(num_rows=m, num_columns=n) to print more rows and columns.
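
One thing worth checking here: the rows with FILTER == 1 include the "Washington State Total" row (row 406 in the pandas output above), so total_wa appears to count the state's population twice. King County's share comes out as about 0.1475, whereas 2,226,300 / 7,546,410 is about 0.295. The relative shading of the counties should not change, but the values in the legend do. The following is a minimal sketch of the check in plain Python; the names county_shares and STATE_TOTAL_NAME are hypothetical, and the numbers are copied from the tables above:

```python
STATE_TOTAL_NAME = 'Washington'  # COUNTY value of the "Washington State Total" row

def county_shares(rows, pop_col='POP_2019'):
    """Return {county: share of the summed county populations}, skipping the state-total row."""
    counties = [r for r in rows if r['COUNTY'] != STATE_TOTAL_NAME]
    total = sum(r[pop_col] for r in counties)
    return {r['COUNTY']: r[pop_col] / total for r in counties}

# A few POP_2019 values copied from the tables above
rows = [
    {'COUNTY': 'King', 'POP_2019': 2226300},
    {'COUNTY': 'Pierce', 'POP_2019': 888300},
    {'COUNTY': 'Washington', 'POP_2019': 7546410},
]
shares = county_shares(rows)  # shares among the sample counties only

# King County's share of the state, with and without counting the state total twice
share_of_state = 2226300 / 7546410
share_double_counted = 2226300 / (2 * 7546410)
```

With the SFrame used above, the same idea would be to filter out the row whose COUNTY is "Washington" before computing total_wa.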
In [84]:
tiles = 'Mapbox Bright'
m = folium.Map(
    location=[47.6117, -122.332],
    tiles=tiles,
    zoom_start=7
)

# Create a layer of Washington State counties
topoJSONpath = f"{datasets_path}/WA-53-washington-counties.json"
folium.Choropleth(
    geo_data=open(topoJSONpath,"r"),
    topojson='objects.cb_2015_washington_county_20m',
    data=g.to_dataframe(),
    columns=['GEOID', 'Pop Percentage'],
    key_on='properties.GEOID',
    fill_color='BuPu',
    fill_opacity=0.7,
    line_opacity=0.2,
    legend_name='Population Percentage',
    reset=True
).add_to(m)
folium.LayerControl().add_to(m)

m
Out[84]:
Make this Notebook Trusted to load map: File -> Trust Notebook

2.4 Working with Plotly-Express

In this section, we are going to use Plotly-Express to create an interactive map from WikiTree data. Our goal is to create a map of the most common birth locations over time. Let's start by loading the data into an SFrame object:

In [41]:
import turicreate as tc
import turicreate.aggregate as agg

sf = tc.SFrame.read_csv('./datasets/wikitree_birth_locations.csv', verbose=True)
# sf['Birth Location'] = sf['Birth Location'].apply(lambda s: s.strip().lower() if len(s.lower()) > 1 else None)
sf
Finished parsing file /Users/michael/Dropbox (BGU)/massive data mining/datasets/wikitree_birth_locations.csv
Parsing completed. Parsed 100 lines in 0.418882 secs.
------------------------------------------------------
Inferred types from first 100 line(s) of file as 
column_type_hints=[int,str]
If parsing fails due to incorrect types, you can correct
the inferred type list above and pass it to read_csv in
the column_type_hints argument
------------------------------------------------------
Read 2252637 lines. Lines per second: 2.81638e+06
Finished parsing file /Users/michael/Dropbox (BGU)/massive data mining/datasets/wikitree_birth_locations.csv
Parsing completed. Parsed 16420973 lines in 4.60602 secs.
Out[41]:
Birth Date Birth Location
0
1940
1920
1920
1940
1910
1920
1940
1940
18870921 Canning, Kings, Nova
Scotia, Canada ...
[16420973 rows x 2 columns]
Note: Only the head of the SFrame is printed.
You can use print_rows(num_rows=m, num_columns=n) to print more rows and columns.
In [42]:
def get_year(s):
    if len(s) < 4:
        return None
    if len(s) == 4:
        return int(s)
    return int(s[:4])
    
sf['Year'] = sf['Birth Date'].apply(lambda s: get_year(str(s)))
sf = sf[sf['Year'] < 2019]
sf = sf[sf['Year'] >= 0]
sf = sf.dropna()
print("Unique locations number %s" % len(sf['Birth Location'].unique()))
Unique locations number 1691018

We have too many locations (at least for this example). Let's filter out places with 50 or fewer births:

In [43]:
g = sf.groupby('Birth Location', {'count': agg.COUNT()})
g = g[g['count'] > 50]
len(g)
Out[43]:
23420
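
For readers less familiar with `groupby`, the same count-then-filter step can be sketched with the standard library on a tiny in-memory list. The sample locations and the threshold of 2 are made up for illustration.

```python
from collections import Counter

# Toy stand-in for the 'Birth Location' column (illustrative values only)
locations = ["Boston, MA", "Italy", "Boston, MA", "Italy", "Boston, MA", "Paris"]

min_count = 2  # hypothetical threshold; the lecture keeps locations with count > 50
counts = Counter(locations)
# Keep only the locations that occur strictly more than min_count times
frequent = {loc: c for loc, c in counts.items() if c > min_count}
print(frequent)
```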

We are left with over 23,400 locations. Next, we need to resolve each location to its latitude and longitude. Let's first try the Bing geocoder from Geopy on a single query:

In [44]:
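from geopy import Bing
# Read the Bing Maps API key; strip() removes a trailing newline if the file has one
with open("../../env/.keys") as f:
    BING_MAPS_API_KEY = f.read().strip()
b = Bing(BING_MAPS_API_KEY)
r = b.geocode("New York")
r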
Out[44]:
Location(New York, NY, United States, (40.71455001831055, -74.00714111328125, 0.0))
In [45]:
r.raw
Out[45]:
{'__type': 'Location:http://schemas.microsoft.com/search/local/ws/rest/v1',
 'bbox': [40.363765716552734,
  -74.74592590332031,
  41.0565299987793,
  -73.26721954345703],
 'name': 'New York, NY',
 'point': {'type': 'Point',
  'coordinates': [40.71455001831055, -74.00714111328125]},
 'address': {'adminDistrict': 'NY',
  'countryRegion': 'United States',
  'formattedAddress': 'New York, NY',
  'locality': 'New York'},
 'confidence': 'High',
 'entityType': 'PopulatedPlace',
 'geocodePoints': [{'type': 'Point',
   'coordinates': [40.71455001831055, -74.00714111328125],
   'calculationMethod': 'Rooftop',
   'usageTypes': ['Display']}],
 'matchCodes': ['Good']}
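
The raw result is a nested dictionary, so the fields we need later (coordinates, country, state) have to be dug out of it. The sketch below shows one way to do that defensively; `summarize_bing_result` is a hypothetical helper, and `sample_raw` is a trimmed copy of the output above. Note that the `coordinates` list here is ordered latitude first, unlike GeoJSON, which puts longitude first.

```python
def summarize_bing_result(raw):
    """Pull latitude, longitude, country, and state out of a Bing-style raw result."""
    point = raw.get("point", {})
    coords = point.get("coordinates") or [None, None]
    address = raw.get("address", {})
    return {
        "latitude": coords[0],   # first element is the latitude in this response
        "longitude": coords[1],
        "country": address.get("countryRegion"),
        "state": address.get("adminDistrict"),
        "confidence": raw.get("confidence"),
    }

# Trimmed copy of the raw output shown above
sample_raw = {
    "name": "New York, NY",
    "point": {"type": "Point", "coordinates": [40.71455001831055, -74.00714111328125]},
    "address": {"adminDistrict": "NY", "countryRegion": "United States"},
    "confidence": "High",
}
print(summarize_bing_result(sample_raw))
```

Using `dict.get` with defaults means a result that lacks an address or a point yields `None` fields instead of raising an exception.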

Let's resolve the most frequently mentioned locations and store each result in a MongoDB collection, so that we never have to geocode the same query twice. We start by creating a collection and inserting the data of a single location into it:

In [46]:
import pymongo
client = pymongo.MongoClient("mongodb://localhost:27017/")
db = client['locations'] # Created a new DB named locations
collection = db['wikitree_locatios'] 

j = {'raw': r.raw, 'result': str(r), "query": "New York"}
collection.insert_one(j)
# Create an index on 'query' so that the exact-match lookups below are fast
# (a TEXT index only serves $text searches, not equality matches)
collection.create_index([('query', pymongo.ASCENDING)], name='location_query_index')
Out[46]:
'location_query_index'
In [47]:
from geopy.exc import GeopyError
import time

def find_location(collection, query, search_bing=True):
    # Look for a cached result first; call Bing only on a cache miss
    result = collection.find_one({'query': query})
    if result is None and search_bing:
        try:
            result = b.geocode(query)
            result = add_location(collection, query, result)
        except GeopyError:
            # Geocoding failed (e.g., a timeout); pause and leave this query unresolved
            result = None
            time.sleep(5)

    return result

def add_location(collection, query, result):
    if result is not None:
        j = {'raw': result.raw, 'result': str(result), "query": query}
    else:
        j = {'raw': {}, 'result': None, "query": query}
    collection.insert_one(j)
    return j

find_location(collection, "Italy")
Out[47]:
{'_id': ObjectId('5ca26c697a00fc082e44119f'),
 'raw': {'__type': 'Location:http://schemas.microsoft.com/search/local/ws/rest/v1',
  'bbox': [36.67594528198242,
   1.228389024734497,
   47.06735610961914,
   23.807432174682617],
  'name': 'Italy',
  'point': {'type': 'Point',
   'coordinates': [43.529029846191406, 12.16218376159668]},
  'address': {'countryRegion': 'Italy', 'formattedAddress': 'Italy'},
  'confidence': 'High',
  'entityType': 'CountryRegion',
  'geocodePoints': [{'type': 'Point',
    'coordinates': [43.529029846191406, 12.16218376159668],
    'calculationMethod': 'Rooftop',
    'usageTypes': ['Display']}],
  'matchCodes': ['Good']},
 'result': 'Italy',
 'query': 'Italy'}
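
`find_location` follows a cache-aside pattern: check the store first, call the remote service only on a miss, and save whatever comes back. The sketch below isolates that idea with a plain dictionary in place of MongoDB and a hypothetical `fake_geocode` function in place of Bing, so that it runs without any external service.

```python
def make_cached_lookup(cache, geocode):
    """Return a lookup function that consults `cache` before calling `geocode`."""
    def lookup(query):
        if query in cache:
            return cache[query]
        result = geocode(query)
        # Store misses as well (result may be None) so they are not retried
        cache[query] = result
        return result
    return lookup

calls = []

def fake_geocode(query):
    # Hypothetical stand-in for a remote geocoder; records every call it receives
    calls.append(query)
    return {"query": query, "result": query.title()}

lookup = make_cached_lookup({}, fake_geocode)
lookup("italy")
lookup("italy")
print(len(calls))  # the second lookup is served from the cache
```

Storing empty results too mirrors `add_location`, which inserts a document even when the geocoder returns nothing.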
In [48]:
from tqdm import tqdm_notebook as tqdm
n = 5000
g = g.sort('count', ascending=False)
l = list(g['Birth Location'])[:n]
for loc in tqdm(l):
    find_location(collection, loc)
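
The loop above sends thousands of requests to a remote service, so transient failures are likely. One common way to handle them is to retry with growing pauses instead of giving up after the first error. The sketch below is a generic illustration and not part of the lecture code: `call_with_retries` and `flaky_geocode` are hypothetical names, and the delays are arbitrary. Geopy also ships a `RateLimiter` helper in `geopy.extra.rate_limiter` that serves a similar purpose.

```python
import time

def call_with_retries(func, arg, max_attempts=3, base_delay=1.0, sleep=time.sleep):
    """Call func(arg), pausing with exponentially growing delays between failures."""
    for attempt in range(max_attempts):
        try:
            return func(arg)
        except Exception:
            if attempt == max_attempts - 1:
                return None  # give up after the last attempt
            sleep(base_delay * 2 ** attempt)

# Demonstration with a hypothetical geocoder that fails twice and then succeeds
attempts = {"n": 0}

def flaky_geocode(query):
    attempts["n"] += 1
    if attempts["n"] < 3:
        raise TimeoutError("simulated timeout")
    return query.upper()

delays = []  # collect the requested pauses instead of actually sleeping
print(call_with_retries(flaky_geocode, "italy", sleep=delays.append))
print(delays)
```

Passing `sleep` as a parameter keeps the demonstration instantaneous; in real use the default `time.sleep` applies.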

In [49]:
from functools import lru_cache

@lru_cache(maxsize=None)
def get_country(query):
    r = find_location(collection, query)
    try:
        return r['raw']['address']['countryRegion']
    except (TypeError, KeyError):
        # r is None (lookup failed) or the result has no country field
        return None

locations_set = set(l)
sf2 = sf[sf['Birth Location'].apply(lambda loc: loc in locations_set)]
countries = []
for loc in tqdm(sf2['Birth Location']):
    countries.append(get_country( loc ))
sf2['Country'] = countries 
sf2

Out[49]:
Birth Date Birth Location Year Country
1940 1940 None
1920 1920 None
1920 1920 None
1940 1940 None
1910 1910 None
1920 1920 None
1940 1940 None
1940 1940 None
1910 1910 None
1920 1920 None
[8389420 rows x 4 columns]
Note: Only the head of the SFrame is printed.
You can use print_rows(num_rows=m, num_columns=n) to print more rows and columns.
In [53]:
import plotly_express as px
import pycountry

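def get_country_to_alpha3(name):
    # Country names returned by Bing that pycountry does not match by name
    if name == 'Czech Republic':
        return 'CZE'
    if name == 'Russia':
        return 'RUS'

    try:
        return pycountry.countries.get(name=name).alpha_3
    except (KeyError, AttributeError):
        # No match: depending on the pycountry version, get() raises KeyError or returns None
        return None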
    
sf2 = sf2.dropna()
g2 = sf2.groupby(['Year','Country'], {'Count': agg.COUNT()})
g3 = g2[g2['Year'] >= 1400]
g3['alpha3'] = g3['Country'].apply(lambda c: get_country_to_alpha3(c))
g3 = g3.dropna()
g3 = g3[g3['Year'] >= 1750]
g3 = g3[g3['Year'] <= 1950]
px.scatter_geo(g3.to_dataframe(), locations="alpha3", hover_name="Country", size="Count",
               animation_frame="Year", projection="natural earth")
In [54]:
us_sf = sf2[sf2['Country'] == 'United States']
us_sf.materialize()
us_sf
Out[54]:
Birth Date  Birth Location                              Year  Country
19200714    Massachusetts                               1920  United States
18791130    Massachusetts                               1879  United States
19230926    Boston, MA                                  1923  United States
19010524    Boston, Massachusetts, United States ...    1901  United States
19030118    Cambridge, Massachusetts                    1903  United States
19120000    Massachusetts                               1912  United States
18900828    New York City, New York, United States ...  1890  United States
18660200    New York, NY                                1866  United States
18461100    Massachusetts                               1846  United States
19280422    Worcester, Massachusetts                    1928  United States
[2779918 rows x 4 columns]
Note: Only the head of the SFrame is printed.
You can use print_rows(num_rows=m, num_columns=n) to print more rows and columns.
In [55]:
us_g = us_sf.groupby(['Year','Birth Location'], {"Count": agg.COUNT()})
us_g
Out[55]:
Birth Location                                      Year  Count
Smith County, Tennessee, USA ...                    1887  1
Perry, Ohio, United States ...                      1891  1
Giles, Tennessee, USA                               1824  3
Jersey City, Hudson, New Jersey ...                 1918  3
Cabell County, West Virginia ...                    1837  4
Cabarrus County, North Carolina, United States ...  1873  3
Harwich, Barnstable, Massachusetts ...              1716  3
Cumberland, Pennsylvania                            1712  2
St. Louis, Missouri                                 1844  4
Germantown, Philadelphia, Pennsylvania ...          1769  2
[324284 rows x 3 columns]
Note: Only the head of the SFrame is printed.
You can use print_rows(num_rows=m, num_columns=n) to print more rows and columns.
In [56]:
@lru_cache(maxsize=None)
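def get_long_lat(query):
    r = find_location(collection, query)
    try:
        # Note: Bing returns the point as [latitude, longitude]
        return r['raw']['point']['coordinates']
    except (TypeError, KeyError):
        # r is None (lookup failed) or the result has no point
        return None
    
@lru_cache(maxsize=None)
def get_state(query):
    r = find_location(collection, query)
    try:
        return r['raw']['address']['adminDistrict']
    except (TypeError, KeyError):
        return None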
    
state_l = []
cor_l = []
for loc in tqdm(us_g['Birth Location']):
    cor_l.append(get_long_lat(loc))
    state_l.append(get_state(loc))
us_g['Cooridnates'] = cor_l
us_g['State'] = state_l
us_g = us_g.dropna()
us_g

Out[56]:
Birth Location                                      Year  Count  Cooridnates                                   State
Smith County, Tennessee, USA ...                    1887  1      [36.250572204589844, -85.95669555664062] ...  TN
Perry, Ohio, United States ...                      1891  1      [41.771671295166016, -81.14666748046875] ...  OH
Giles, Tennessee, USA                               1824  3      [35.202117919921875, -87.03473663330078] ...  TN
Jersey City, Hudson, New Jersey ...                 1918  3      [40.71747970581055, -74.04385375976562] ...   NJ
Cabell County, West Virginia ...                    1837  4      [38.42015075683594, -82.24178314208984] ...   WV
Cabarrus County, North Carolina, United States ...  1873  3      [35.3868408203125, -80.55193328857422] ...    NC
Harwich, Barnstable, Massachusetts ...              1716  3      [41.68627166748047, -70.07341766357422] ...   MA
Cumberland, Pennsylvania                            1712  2      [39.76728057861328, -77.27111053466797] ...   PA
St. Louis, Missouri                                 1844  4      [38.627750396728516, -90.1995620727539] ...   MO
Germantown, Philadelphia, Pennsylvania ...          1769  2      [40.029541015625, -75.17510986328125] ...     PA
[? rows x 5 columns]
Note: Only the head of the SFrame is printed. This SFrame is lazily evaluated.
You can use sf.materialize() to force materialization.
In [57]:
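# Sum the per-location counts so that each state contributes a single value
df = us_g[us_g['Year'] == 1900].groupby('State', {'Count': agg.SUM('Count')}).to_dataframe()
px.choropleth(df, locations='State', locationmode='USA-states', color='Count')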
In [58]:
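# Sum the per-location counts into one value per (Year, State) before animating
state_year_g = us_g.groupby(['Year', 'State'], {'Count': agg.SUM('Count')}).sort('Year')
df = state_year_g.to_dataframe()
px.choropleth(df, locations='State', locationmode='USA-states', color='Count', animation_frame='Year')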
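
Because `us_g` holds one row per year and birth location, a state can appear in many rows for the same year, while a choropleth shows a single value per state. The roll-up to one total per (year, state) pair can be sketched in plain Python; the rows below are made up for illustration.

```python
from collections import defaultdict

# Toy rows of (year, state, count), mirroring the columns of us_g (illustrative values)
rows = [
    (1900, "MA", 3),
    (1900, "MA", 2),
    (1900, "NY", 4),
    (1901, "MA", 1),
]

# Accumulate one total per (year, state) pair
totals = defaultdict(int)
for year, state, count in rows:
    totals[(year, state)] += count

for (year, state), total in sorted(totals.items()):
    print(year, state, total)
```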

2.5 Using Kepler.gl

Let's load the 2016 Parties in New York dataset into an SFrame object:

In [85]:
# -p also creates ./datasets if it does not exist yet
!mkdir -p ./datasets/partynyc

# download the dataset from Kaggle and unzip it
!kaggle datasets download somesnm/partynyc  -p ./datasets/partynyc

!unzip ./datasets/partynyc/*.zip  -d ./datasets/partynyc/
Downloading partynyc.zip to ./datasets/partynyc
 97%|████████████████████████████████████▉ | 14.0M/14.4M [00:04<00:00, 5.57MB/s]
100%|██████████████████████████████████████| 14.4M/14.4M [00:04<00:00, 3.45MB/s]
Archive:  ./datasets/partynyc/partynyc.zip
  inflating: ./datasets/partynyc/bar_locations.csv  
  inflating: ./datasets/partynyc/party_in_nyc.csv  
  inflating: ./datasets/partynyc/test_parties.csv  
  inflating: ./datasets/partynyc/train_parties.csv  
In [86]:
import turicreate as tc
import turicreate.aggregate as agg

sf = tc.SFrame.read_csv("./datasets/partynyc/party_in_nyc.csv")
sf
Finished parsing file /Users/michael/Dropbox (BGU)/massive data mining/ 2020/notebooks/datasets/partynyc/party_in_nyc.csv
Parsing completed. Parsed 100 lines in 0.367555 secs.
------------------------------------------------------
Inferred types from first 100 line(s) of file as 
column_type_hints=[str,str,str,float,str,str,float,float]
If parsing fails due to incorrect types, you can correct
the inferred type list above and pass it to read_csv in
the column_type_hints argument
------------------------------------------------------
Finished parsing file /Users/michael/Dropbox (BGU)/massive data mining/ 2020/notebooks/datasets/partynyc/party_in_nyc.csv
Parsing completed. Parsed 225414 lines in 0.368045 secs.
Out[86]:
Created Date         Closed Date          Location Type                   Incident Zip  City             Borough
2015-12-31 00:01:15  2015-12-31 03:48:04  Store/Commercial                10034.0       NEW YORK         MANHATTAN
2015-12-31 00:02:48  2015-12-31 04:36:13  Store/Commercial                10040.0       NEW YORK         MANHATTAN
2015-12-31 00:03:25  2015-12-31 00:40:15  Residential Building/House ...  10026.0       NEW YORK         MANHATTAN
2015-12-31 00:03:26  2015-12-31 01:53:38  Residential Building/House ...  11231.0       BROOKLYN         BROOKLYN
2015-12-31 00:05:10  2015-12-31 03:49:10  Residential Building/House ...  10033.0       NEW YORK         MANHATTAN
2015-12-31 00:08:05  2015-12-31 01:59:12  Residential Building/House ...  10467.0       BRONX            BRONX
2015-12-31 00:11:40  2015-12-31 06:24:00  Residential Building/House ...  11230.0       BROOKLYN         BROOKLYN
2015-12-31 00:12:13  2015-12-31 00:38:09  Residential Building/House ...  11215.0       BROOKLYN         BROOKLYN
2015-12-31 00:12:37  2015-12-31 05:03:39  Residential Building/House ...  10463.0       BRONX            BRONX
2015-12-31 00:14:13  2015-12-31 06:25:40  Store/Commercial                11372.0       JACKSON HEIGHTS  QUEENS
Latitude Longitude
40.86618344001468 -73.91893042945345
40.85932419390543 -73.93123733660876
40.79941540978025 -73.95337116858667
40.6782851094981 -73.99466779426595
40.85030372032608 -73.93851562699031
40.8587476839271 -73.86562454420242
40.61700535900229 -73.95692046165364
40.66505114462701 -73.98127790267175
40.875894942376384 -73.91247127084895
40.75558360239671 -73.88520104800678
[225414 rows x 8 columns]
Note: Only the head of the SFrame is printed.
You can use print_rows(num_rows=m, num_columns=n) to print more rows and columns.
In [87]:
sf['Latitude'] = sf['Latitude'].apply(lambda i: round(i,4))
sf['Longitude'] = sf['Longitude'].apply(lambda i: round(i,4))
sf = sf.dropna()
g = sf.groupby(['Latitude','Longitude'], {'Count': agg.COUNT()})
# sort() returns a new SFrame; it does not reorder g in place
sorted_g = g.sort('Count', ascending=False)
sorted_g.export_csv("./parties_location_count.csv")
sorted_g[sorted_g['Count'] >= 50].export_csv("./parties_location_count_50.csv")
In [88]:
g
Out[88]:
Latitude Longitude Count
40.6693 -73.8848 1
40.8707 -73.8901 6
40.8793 -73.8599 28
40.6862 -73.872 4
40.688 -73.8317 2
40.869 -73.9028 16
40.6657 -73.772 1
40.8558 -73.9056 2
40.645 -73.946 2
40.8487 -73.9059 19
[56894 rows x 3 columns]
Note: Only the head of the SFrame is printed.
You can use print_rows(num_rows=m, num_columns=n) to print more rows and columns.
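
Rounding both coordinates to four decimals snaps every point to a grid of 0.0001 degrees, which is roughly 11 meters in the latitude direction, so nearby complaints fall into the same cell and can be counted together. The sketch below reproduces the idea on a handful of illustrative points; `grid_cell` is a hypothetical helper name.

```python
from collections import Counter

# Illustrative (latitude, longitude) pairs; the first two differ only past the 4th decimal
points = [
    (40.86618344, -73.91893043),
    (40.86621000, -73.91890000),
    (40.85932419, -73.93123734),
]

def grid_cell(lat, lon, ndigits=4):
    """Snap a coordinate pair to a grid cell by rounding both values."""
    return (round(lat, ndigits), round(lon, ndigits))

# Count how many points fall into each grid cell
cell_counts = Counter(grid_cell(lat, lon) for lat, lon in points)
print(cell_counts.most_common(1))
```

A smaller `ndigits` gives coarser cells and higher counts per cell; a larger value keeps more spatial detail but leaves most cells with a single point.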

Let's use Kepler.gl to visualize the data:

In [43]:
from IPython.display import Image
Image("./kepler.gl_party.png")
Out[43]:

2.6 Working with Basemap

In this section, we are going to draw maps with Basemap, a Matplotlib toolkit. Let's first install it using Conda:

!conda install basemap
In [11]:
# If importing Basemap fails because of PROJ_LIB, install proj4 (conda install proj4)
# and set PROJ_LIB to its share/proj directory:
# import os
# os.environ["PROJ_LIB"] = "/anaconda3/pkgs/proj4-<complete path>/share/proj"
import matplotlib.pyplot as plt
from mpl_toolkits.basemap import Basemap
%matplotlib inline

plt.figure(figsize=(8, 8))
m = Basemap(projection='ortho', resolution=None, lat_0=0, lon_0=0)
m.bluemarble(scale=0.5);
Clipping input data to the valid range for imshow with RGB data ([0..1] for floats or [0..255] for integers).
In [5]:
plt.figure(figsize=(8, 8))
m = Basemap(projection='cea', resolution=None)
m.bluemarble(scale=1);
In [6]:
# Inspired by https://jakevdp.github.io/PythonDataScienceHandbook/04.13-geographic-data-with-basemap.html
fig = plt.figure(figsize=(8, 8))
m = Basemap(projection='lcc', resolution='h',
            width=2E6, height=2E6, 
            lat_0=51, lon_0=0.12,)
m.etopo(scale=0.51, alpha=0.5)

# Map (long, lat) to (x, y) for plotting; London lies at 0.1278° W, so its longitude is negative
x, y = m(-0.1278, 51.5074)
plt.plot(x, y, 'ok', markersize=6)
plt.text(x, y, 'London', fontsize=16);

Using Basemap, we can plot maps with different resolutions and colors:

In [7]:
fig = plt.figure(figsize=(8, 8))
m = Basemap(projection='gall', resolution='c')
m.etopo(scale=0.51, alpha=0.5)
m.fillcontinents(color='white',lake_color='aqua')
# Map (long, lat) to (x, y) for plotting; London lies at 0.1278° W, so its longitude is negative
x, y = m(-0.1278, 51.5074)
plt.plot(x, y, 'ok', markersize=2)
plt.text(x, y, 'London', fontsize=6);

Each map projection has its own advantages and disadvantages. Basemap supports 24 different map projections.
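
To make the trade-off concrete, the sketch below compares two cylindrical projections on a unit sphere: Mercator, which preserves angles but inflates areas by a factor of sec²(latitude), and the Lambert cylindrical equal-area projection, which preserves areas but squashes shapes near the poles. These are the textbook spherical formulas with the standard parallel at the equator, written here for illustration; the function names are hypothetical, and the values are not the x, y coordinates that a Basemap instance returns, since those also depend on its ellipsoid and map origin.

```python
import math

def mercator_xy(lon_deg, lat_deg):
    """Spherical Mercator on a unit sphere (conformal, not equal-area)."""
    lam, phi = math.radians(lon_deg), math.radians(lat_deg)
    return lam, math.log(math.tan(math.pi / 4 + phi / 2))

def equal_area_xy(lon_deg, lat_deg):
    """Lambert cylindrical equal-area on a unit sphere (standard parallel at the equator)."""
    lam, phi = math.radians(lon_deg), math.radians(lat_deg)
    return lam, math.sin(phi)

def mercator_area_scale(lat_deg):
    """Factor by which Mercator inflates areas at a given latitude: sec(lat)^2."""
    return 1.0 / math.cos(math.radians(lat_deg)) ** 2

# Print latitude, Mercator area inflation, and the equal-area y coordinate
for lat in (0, 30, 60):
    print(lat, round(mercator_area_scale(lat), 2), round(equal_area_xy(0, lat)[1], 3))
```

At 60 degrees of latitude, Mercator draws areas about four times larger than at the equator, which is why high-latitude regions look oversized on Mercator maps.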

In [20]:
fig = plt.figure(figsize=(8, 8))
m = Basemap(projection='gnom', resolution='c',
            width=3E6, height=3E6, 
            lat_0=47.608013, lon_0=-122.335167)
m.etopo(scale=0.51, alpha=0.5)
Clipping input data to the valid range for imshow with RGB data ([0..1] for floats or [0..255] for integers).
Out[20]:
<matplotlib.image.AxesImage at 0xb27391eb8>
In [ ]: