For this lecture, we are going to use the TuriCreate, spaCy, Cartopy, imageio, pymongo, GeoPandas, descartes, geopy, and Folium packages. Let's set them up:
!pip install turicreate
!pip install spaCy
!pip install pymongo
!pip install geopandas
!pip install descartes
!pip install geopy
!pip install folium
!conda install -y -c conda-forge imageio
!conda install -y -c conda-forge cartopy
Let's install and set up the Kaggle package:
# Installing the Kaggle package
import json
!pip install kaggle
# Important: fill in your own credentials. After running this cell for the first time, remember to **remove** your API key from the notebook.
api_token = {"username":"<Insert Your Kaggle User Name>","key":"<Insert Your Kaggle API key>"}
# creating kaggle.json file with the personal API-Key details
# You can also put this file on your Google Drive
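import os
kaggle_dir = os.path.expanduser('~/.kaggle')  # open() does not expand '~' by itself
os.makedirs(kaggle_dir, exist_ok=True)  # make sure the directory exists
with open(os.path.join(kaggle_dir, 'kaggle.json'), 'w') as file:
    json.dump(api_token, file)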
!chmod 600 ~/.kaggle/kaggle.json
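The cell above writes the API token to disk and then restricts its permissions with `chmod 600`. As a minimal sketch, the same two steps can be done from Python alone. The helper name `write_kaggle_token` is hypothetical, and the demo writes placeholder values into a temporary directory rather than `~/.kaggle`, so no real credentials are touched:

```python
import json
import os
import stat
import tempfile


def write_kaggle_token(directory, username, key):
    """Write kaggle.json into `directory` with owner-only permissions."""
    os.makedirs(directory, exist_ok=True)
    path = os.path.join(directory, "kaggle.json")
    with open(path, "w") as fh:
        json.dump({"username": username, "key": key}, fh)
    # Equivalent to `chmod 600`: read/write for the owner only.
    os.chmod(path, stat.S_IRUSR | stat.S_IWUSR)
    return path


# Demo in a throwaway directory with placeholder credentials.
demo_dir = tempfile.mkdtemp()
token_path = write_kaggle_token(demo_dir, "<user>", "<key>")
with open(token_path) as fh:
    loaded = json.load(fh)
```

In the notebook itself, you would pass `os.path.expanduser("~/.kaggle")` as the directory.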
To work with MongoDB, we first need to download and install it. Because this notebook talks to a MongoDB server, I prefer to run it locally on my laptop. Another option is to use MongoDB Atlas.
Now let's run MongoDB and test the connection to it:
import pymongo
client = pymongo.MongoClient("mongodb://localhost:27017/")
client.list_database_names()
['admin', 'config', 'local', 'locations']
Now let's create a new collection and load the US baby names dataset into it:
# create the target directory (and ./datasets, if it does not exist yet)
!mkdir -p ./datasets/us-baby-name
# download the dataset from Kaggle and unzip it
!kaggle datasets download kaggle/us-baby-names -f StateNames.csv -p ./datasets/
!unzip ./datasets/StateNames.csv.zip -d ./datasets/us-baby-name/
Downloading StateNames.csv.zip to ./datasets 98%|█████████████████████████████████████▍| 30.0M/30.5M [00:04<00:00, 6.79MB/s] 100%|██████████████████████████████████████| 30.5M/30.5M [00:04<00:00, 6.42MB/s] Archive: ./datasets/StateNames.csv.zip inflating: ./datasets/us-baby-name/StateNames.csv
import turicreate as tc
import turicreate.aggregate as agg
sf = tc.SFrame.read_csv("./datasets/us-baby-name/StateNames.csv")
sf
Finished parsing file /Users/michael/Dropbox (BGU)/massive data mining/ 2020/notebooks/datasets/us-baby-name/StateNames.csv
Parsing completed. Parsed 100 lines in 1.31749 secs.
------------------------------------------------------ Inferred types from first 100 line(s) of file as column_type_hints=[int,str,int,str,str,int] If parsing fails due to incorrect types, you can correct the inferred type list above and pass it to read_csv in the column_type_hints argument ------------------------------------------------------
Read 1940415 lines. Lines per second: 1.30741e+06
Finished parsing file /Users/michael/Dropbox (BGU)/massive data mining/ 2020/notebooks/datasets/us-baby-name/StateNames.csv
Parsing completed. Parsed 5647426 lines in 2.93838 secs.
Id | Name | Year | Gender | State | Count |
---|---|---|---|---|---|
1 | Mary | 1910 | F | AK | 14 |
2 | Annie | 1910 | F | AK | 12 |
3 | Anna | 1910 | F | AK | 10 |
4 | Margaret | 1910 | F | AK | 8 |
5 | Helen | 1910 | F | AK | 7 |
6 | Elsie | 1910 | F | AK | 6 |
7 | Lucy | 1910 | F | AK | 6 |
8 | Dorothy | 1910 | F | AK | 5 |
9 | Mary | 1911 | F | AK | 12 |
10 | Margaret | 1911 | F | AK | 7 |
sf['Year_Count'] = sf.apply(lambda r: (r['Year'], r['Count']))
g = sf.groupby(["State", "Name", "Gender"], {"years_list" : agg.CONCAT("Year_Count")})
g
Gender | Name | State | years_list |
---|---|---|---|
M | Ferman | LA | [[1926.0, 5.0]] |
M | Holton | NC | [[2013.0, 5.0]] |
F | Faiga | NJ | [[1995.0, 5.0], [1999.0, 5.0], [2007.0, 17.0], ... |
F | Carlie | MD | [[1990.0, 6.0], [1997.0, 7.0], [1998.0, 10.0], ... |
M | Beau | SC | [[2014.0, 16.0], [1981.0, 6.0], [2012.0, 7.0], ... |
F | Kaitlynn | WA | [[1995.0, 19.0], [1996.0, 26.0], [2003.0, 15.0], ... |
F | Tiny | NC | [[1917.0, 5.0], [1936.0, 5.0], [1920.0, 5.0], ... |
F | Cayla | NJ | [[1993.0, 6.0], [1998.0, 12.0], [1999.0, 14.0], ... |
M | Renardo | NC | [[1983.0, 6.0]] |
M | Auther | SC | [[1936.0, 5.0], [1937.0, 5.0], [1920.0, 6.0], ... |
g['YearsCountDict'] = g['years_list'].apply(lambda l: {str(int(y)): int(c) for y,c in l})
g = g.remove_column("years_list")
g
Gender | Name | State | YearsCountDict |
---|---|---|---|
M | Ferman | LA | {'1926': 5} |
M | Holton | NC | {'2013': 5} |
F | Faiga | NJ | {'1995': 5, '1999': 5, '2007': 17, '2008': 8, ... |
F | Carlie | MD | {'1990': 6, '1997': 7, '1998': 10, '1992': 12, ... |
M | Beau | SC | {'2014': 16, '1981': 6, '2012': 7, '2008': 7, ... |
F | Kaitlynn | WA | {'1995': 19, '1996': 26, '2003': 15, '2008': 18, ... |
F | Tiny | NC | {'1917': 5, '1936': 5, '1920': 5, '1932': 5, ... |
F | Cayla | NJ | {'1993': 6, '1998': 12, '1999': 14, '2003': 12, ... |
M | Renardo | NC | {'1983': 6} |
M | Auther | SC | {'1936': 5, '1937': 5, '1920': 6, '1939': 5, ... |
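The reshaping above groups the rows by state, name, and gender, and collapses each group's (year, count) pairs into one dictionary. The following is a plain-Python sketch of the same idea, using a few rows copied from the `StateNames.csv` preview; it is an illustration of the transformation, not what TuriCreate does internally. The years are stored as strings because MongoDB field names must be strings.

```python
from collections import defaultdict

# A few rows copied from the StateNames.csv preview:
# (Name, Year, Gender, State, Count)
rows = [
    ("Mary", 1910, "F", "AK", 14),
    ("Annie", 1910, "F", "AK", 12),
    ("Margaret", 1910, "F", "AK", 8),
    ("Mary", 1911, "F", "AK", 12),
    ("Margaret", 1911, "F", "AK", 7),
]

# Group by (State, Name, Gender) and collect {year: count} per group.
grouped = defaultdict(dict)
for name, year, gender, state, count in rows:
    # Keys are strings because MongoDB (BSON) field names must be strings.
    grouped[(state, name, gender)][str(year)] = count

# One document per group, in the same shape as the SFrame rows.
documents = [
    {"State": s, "Name": n, "Gender": g_, "YearsCountDict": years}
    for (s, n, g_), years in grouped.items()
]
```

Each element of `documents` has the shape of one row of `g`, so it could be passed to `insert_one` as-is.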
Let's insert data from the SFrame object into a MongoDB collection:
db = client['baby_names'] # Created a new DB named baby_names
collection = db['names'] # Created a new collection named names
# insert each row as a document
for r in g[:100]:
    collection.insert_one(r)
print(f"Total documents {collection.count_documents({})}")
Total documents 100
Let's delete all the documents in the collection, and then insert the full dataset with a single `insert_many` call:
collection.delete_many({}) # delete all documents
print(f"Total documents {collection.count_documents({})}")
Total documents 0
collection.insert_many(g)
print(f"Total documents {collection.count_documents({})}")
Total documents 304918
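Calling `insert_one` in a loop issues one request per document, while `insert_many` accepts many documents in one call. If you want explicit control over how many documents are sent at a time (for example, to report progress), a chunked variant is one option. This is only a sketch: `insert_in_batches` and `FakeCollection` are hypothetical names, and the fake collection merely records batch sizes so the example runs without a MongoDB server.

```python
from itertools import islice


def insert_in_batches(collection, rows, batch_size=1000):
    """Send `rows` to `collection.insert_many` in lists of at most `batch_size`."""
    iterator = iter(rows)
    total = 0
    while True:
        batch = list(islice(iterator, batch_size))
        if not batch:
            break
        collection.insert_many(batch)
        total += len(batch)
    return total


class FakeCollection:
    """Hypothetical stand-in that records batch sizes instead of talking to MongoDB."""

    def __init__(self):
        self.batch_sizes = []

    def insert_many(self, documents):
        self.batch_sizes.append(len(documents))


fake = FakeCollection()
inserted = insert_in_batches(fake, ({"n": i} for i in range(2500)), batch_size=1000)
```

With the real collection from this notebook, the call would look like `insert_in_batches(collection, g)`.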
Let's search for documents:
collection.find_one({ "Name": "Mary", "Gender": "F" })
{'_id': ObjectId('5eb51b889993d0175fb5f810'), 'Gender': 'F', 'Name': 'Mary', 'State': 'CA', 'YearsCountDict': {'1910': 295, '1915': 998, '1918': 1252, '1928': 1787, '1930': 1851, '1964': 2709, '1971': 1056, '1983': 751, '2004': 292, '1912': 534, '1916': 1091, '1921': 1697, '1929': 1713, '1945': 3019, '1947': 3460, '1953': 3400, '1980': 831, '1984': 756, '1985': 706, '1988': 707, '1991': 723, '1997': 446, '1999': 438, '2003': 304, '2005': 282, '2014': 171, '1944': 2975, '1946': 3147, '1950': 3134, '1963': 2608, '1965': 2240, '1966': 1852, '1974': 742, '1979': 806, '1996': 500, '1998': 428, '2010': 177, '2011': 201, '2012': 166, '1917': 1149, '1925': 1890, '1931': 1626, '1939': 1699, '1941': 1951, '1949': 3217, '1954': 3718, '1961': 2716, '1967': 1621, '1968': 1447, '1970': 1278, '1975': 729, '1977': 710, '1987': 649, '1990': 678, '1992': 706, '1993': 639, '1995': 550, '2009': 201, '1911': 390, '1913': 584, '1914': 773, '1932': 1498, '1935': 1484, '1940': 1819, '1956': 3414, '1959': 3192, '1960': 3105, '2008': 220, '1923': 1829, '1934': 1590, '1936': 1541, '1942': 2441, '1952': 3422, '1958': 3158, '1962': 2755, '1972': 928, '1976': 722, '2001': 381, '1922': 1732, '1924': 1958, '1933': 1470, '1948': 3426, '1981': 845, '1982': 837, '1994': 567, '2000': 416, '2007': 239, '2013': 190, '1919': 1204, '1920': 1554, '1926': 1719, '1927': 1817, '1937': 1712, '1938': 1876, '1943': 2929, '1951': 3184, '1955': 3389, '1957': 3461, '1969': 1346, '1973': 795, '1978': 700, '1986': 671, '1989': 682, '2002': 339, '2006': 270}}
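The years inside `YearsCountDict` come back in no particular chronological order, as the output above shows. A small sketch of how such a document could be post-processed in Python follows. The helper names `counts_by_year` and `peak_year` are hypothetical, and `doc` is an abbreviated copy of the document above with only four of its years kept:

```python
# Abbreviated copy of the document returned by find_one (four years only).
doc = {
    "Gender": "F",
    "Name": "Mary",
    "State": "CA",
    "YearsCountDict": {"1954": 3718, "1910": 295, "2014": 171, "1947": 3460},
}


def counts_by_year(document):
    """Return (year, count) pairs sorted chronologically."""
    return sorted((int(y), c) for y, c in document["YearsCountDict"].items())


def peak_year(document):
    """Return the (year, count) pair with the largest count."""
    return max(counts_by_year(document), key=lambda pair: pair[1])


timeline = counts_by_year(doc)
best = peak_year(doc)
```

The same functions would work unchanged on the full document returned by `collection.find_one`.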
c = collection.find({ "Name": "Mary", "Gender": "F" })  # returns a cursor
list(c)
[{'_id': ObjectId('5eb51b889993d0175fb5f810'), 'Gender': 'F', 'Name': 'Mary', 'State': 'CA', 'YearsCountDict': {'1910': 295, '1915': 998, '1918': 1252, '1928': 1787, '1930': 1851, '1964': 2709, '1971': 1056, '1983': 751, '2004': 292, '1912': 534, '1916': 1091, '1921': 1697, '1929': 1713, '1945': 3019, '1947': 3460, '1953': 3400, '1980': 831, '1984': 756, '1985': 706, '1988': 707, '1991': 723, '1997': 446, '1999': 438, '2003': 304, '2005': 282, '2014': 171, '1944': 2975, '1946': 3147, '1950': 3134, '1963': 2608, '1965': 2240, '1966': 1852, '1974': 742, '1979': 806, '1996': 500, '1998': 428, '2010': 177, '2011': 201, '2012': 166, '1917': 1149, '1925': 1890, '1931': 1626, '1939': 1699, '1941': 1951, '1949': 3217, '1954': 3718, '1961': 2716, '1967': 1621, '1968': 1447, '1970': 1278, '1975': 729, '1977': 710, '1987': 649, '1990': 678, '1992': 706, '1993': 639, '1995': 550, '2009': 201, '1911': 390, '1913': 584, '1914': 773, '1932': 1498, '1935': 1484, '1940': 1819, '1956': 3414, '1959': 3192, '1960': 3105, '2008': 220, '1923': 1829, '1934': 1590, '1936': 1541, '1942': 2441, '1952': 3422, '1958': 3158, '1962': 2755, '1972': 928, '1976': 722, '2001': 381, '1922': 1732, '1924': 1958, '1933': 1470, '1948': 3426, '1981': 845, '1982': 837, '1994': 567, '2000': 416, '2007': 239, '2013': 190, '1919': 1204, '1920': 1554, '1926': 1719, '1927': 1817, '1937': 1712, '1938': 1876, '1943': 2929, '1951': 3184, '1955': 3389, '1957': 3461, '1969': 1346, '1973': 795, '1978': 700, '1986': 671, '1989': 682, '2002': 339, '2006': 270}}, {'_id': ObjectId('5eb51b889993d0175fb60b00'), 'Gender': 'F', 'Name': 'Mary', 'State': 'LA', 'YearsCountDict': {'1993': 154, '1996': 139, '2003': 113, '1927': 1379, '1944': 1639, '1959': 1128, '1980': 250, '1983': 227, '1999': 119, '2000': 110, '2002': 114, '2006': 79, '2010': 61, '2013': 57, '1912': 658, '1919': 1164, '1926': 1340, '1934': 1243, '1940': 1412, '1942': 1623, '1943': 1603, '1963': 872, '1965': 683, '1982': 251, '2004': 88, '1913': 634, '1921': 
1195, '1925': 1304, '1929': 1306, '1949': 1557, '1969': 411, '1970': 448, '1976': 236, '1988': 169, '1995': 146, '2008': 75, '2011': 57, '1910': 586, '1914': 747, '1931': 1228, '1941': 1555, '1952': 1510, '1953': 1468, '1954': 1564, '1960': 1087, '1967': 535, '1978': 228, '1984': 236, '1985': 219, '1986': 195, '1989': 153, '1990': 170, '1916': 881, '1918': 1111, '1920': 1301, '1922': 1269, '1924': 1288, '1928': 1265, '1932': 1312, '1938': 1327, '1939': 1485, '1950': 1582, '1957': 1318, '1958': 1214, '1971': 355, '1991': 175, '1992': 161, '1997': 123, '1998': 144, '2001': 104, '1911': 502, '1915': 817, '1923': 1263, '1930': 1198, '1935': 1213, '1936': 1250, '1948': 1485, '1955': 1343, '1956': 1348, '1964': 936, '1966': 622, '1975': 240, '1977': 225, '1994': 184, '2007': 62, '2009': 64, '2012': 63, '1962': 963, '1968': 441, '1973': 262, '1979': 230, '1987': 164, '2005': 97, '2014': 65, '1917': 957, '1933': 1141, '1937': 1299, '1945': 1467, '1946': 1602, '1947': 1639, '1951': 1511, '1961': 1012, '1972': 282, '1974': 221, '1981': 230}}, {'_id': ObjectId('5eb51b889993d0175fb63fc5'), 'Gender': 'F', 'Name': 'Mary', 'State': 'SD', 'YearsCountDict': {'1912': 108, '1918': 229, '1920': 229, '1924': 250, '1928': 224, '1947': 310, '1952': 299, '1953': 313, '1955': 315, '1960': 217, '1973': 25, '1986': 24, '1988': 23, '1998': 11, '2000': 16, '2003': 13, '2006': 10, '1914': 128, '1923': 269, '1937': 228, '1946': 300, '1962': 171, '1964': 177, '1965': 115, '1968': 85, '1976': 27, '1979': 30, '1985': 17, '1992': 25, '1997': 13, '2002': 15, '2009': 6, '2012': 7, '1925': 265, '1926': 227, '1938': 225, '1950': 338, '1956': 269, '1959': 225, '1980': 35, '1981': 34, '1982': 38, '1990': 16, '1996': 9, '2010': 11, '2011': 5, '2013': 5, '1922': 249, '1929': 232, '1935': 249, '1949': 357, '1958': 229, '1967': 99, '1970': 55, '1974': 37, '1989': 26, '1993': 21, '2001': 13, '2007': 11, '1917': 228, '1919': 208, '1933': 244, '1934': 242, '1940': 209, '1943': 271, '1957': 239, '1966': 115, 
'1969': 66, '1971': 52, '1978': 26, '1991': 18, '1994': 19, '1999': 12, '1910': 58, '1913': 111, '1927': 238, '1930': 238, '1932': 274, '1963': 190, '1983': 16, '2005': 7, '1915': 177, '1921': 273, '1931': 229, '1941': 216, '1948': 317, '1951': 307, '1954': 337, '1961': 187, '1972': 36, '1975': 26, '1977': 27, '1995': 21, '1911': 68, '1916': 192, '1936': 226, '1939': 215, '1942': 225, '1944': 273, '1945': 225, '1984': 23, '1987': 22, '2014': 6}}, {'_id': ObjectId('5eb51b889993d0175fb64d97'), 'Gender': 'F', 'Name': 'Mary', 'State': 'WV', 'YearsCountDict': {'1916': 1120, '1923': 1522, '1924': 1647, '1936': 1085, '1944': 888, '1953': 765, '1963': 374, '1964': 436, '1978': 140, '1994': 44, '1998': 57, '2000': 35, '2006': 19, '1912': 519, '1921': 1494, '1925': 1549, '1930': 1252, '1931': 1185, '1942': 1042, '1943': 1022, '1945': 840, '1952': 825, '1977': 159, '1979': 154, '1980': 147, '1982': 106, '1991': 55, '2003': 33, '2004': 31, '2009': 16, '1918': 1220, '1919': 1225, '1920': 1379, '1927': 1467, '1937': 1129, '1947': 1082, '1956': 646, '1958': 491, '1995': 41, '1996': 53, '1997': 35, '1926': 1436, '1934': 1163, '1949': 957, '1976': 151, '1985': 103, '1993': 60, '2002': 26, '2011': 16, '1911': 441, '1922': 1541, '1932': 1184, '1933': 1111, '1941': 991, '1946': 1016, '1954': 715, '1959': 492, '1965': 357, '1975': 143, '1984': 98, '1987': 75, '1988': 72, '1992': 61, '2001': 21, '2005': 15, '2008': 14, '2013': 14, '1938': 1087, '1951': 843, '1955': 626, '1957': 559, '1962': 422, '1966': 285, '1967': 228, '1968': 248, '1969': 229, '1970': 224, '1981': 127, '1986': 78, '1914': 790, '1928': 1412, '1929': 1286, '1935': 1149, '1940': 1045, '1948': 1015, '1960': 449, '1974': 157, '1989': 80, '1990': 62, '2007': 21, '2010': 13, '2014': 10, '1910': 380, '1913': 641, '1915': 1054, '1917': 1230, '1939': 1026, '1950': 889, '1961': 468, '1971': 218, '1972': 189, '1973': 173, '1983': 115, '1999': 37}}, {'_id': ObjectId('5eb51b889993d0175fb64ff9'), 'Gender': 'F', 'Name': 'Mary', 
'State': 'HI', 'YearsCountDict': {'1938': 63, '1949': 87, '1965': 60, '1967': 57, '1982': 19, '1984': 16, '1987': 14, '1990': 15, '2004': 5, '1915': 92, '1916': 86, '1930': 122, '1937': 69, '1962': 82, '1976': 26, '1979': 22, '1981': 22, '1992': 12, '1998': 12, '1925': 100, '1926': 113, '1957': 85, '1958': 114, '1960': 82, '1969': 53, '1970': 43, '1975': 25, '1977': 22, '1991': 17, '2013': 9, '1922': 101, '1923': 103, '1924': 119, '1928': 111, '1929': 108, '1933': 92, '1934': 71, '1940': 66, '1944': 82, '1945': 89, '1950': 96, '1953': 81, '1959': 89, '1963': 92, '1983': 21, '1917': 88, '1935': 81, '1941': 52, '1946': 76, '1951': 85, '1961': 102, '1973': 31, '1974': 30, '1986': 18, '1993': 16, '2000': 6, '2008': 7, '2010': 8, '1910': 47, '1911': 56, '1912': 62, '1913': 60, '1920': 102, '1927': 95, '1931': 114, '1932': 102, '1943': 73, '1947': 96, '1954': 85, '1966': 63, '1968': 39, '1978': 31, '1980': 29, '1988': 15, '1989': 15, '1995': 16, '2001': 12, '1918': 103, '1921': 121, '1936': 59, '1939': 90, '1952': 80, '1955': 69, '1971': 36, '1972': 37, '1985': 13, '1914': 79, '1919': 100, '1942': 69, '1948': 98, '1956': 95, '1964': 99, '1994': 14, '1996': 9, '1999': 9, '2005': 7, '2012': 5, '2014': 7}}, {'_id': ObjectId('5eb51b889993d0175fb6625a'), 'Gender': 'F', 'Name': 'Mary', 'State': 'NY', 'YearsCountDict': {'1911': 2322, '1917': 5502, '1933': 3932, '1937': 3805, '1951': 4732, '1970': 1469, '1980': 596, '1984': 463, '1988': 434, '1991': 468, '2002': 246, '2013': 103, '1919': 5061, '1926': 4779, '1940': 3662, '1948': 4720, '1959': 4869, '1967': 2205, '1976': 669, '1982': 553, '1983': 537, '1996': 305, '1997': 334, '1998': 291, '1999': 276, '1916': 5496, '1923': 5160, '1942': 4169, '1943': 4449, '1946': 4797, '1947': 4966, '1949': 4597, '1950': 4734, '1963': 3782, '1964': 3623, '2010': 146, '2011': 92, '1918': 5526, '1920': 5296, '1921': 5413, '1922': 5303, '1925': 4916, '1929': 4765, '1935': 3766, '1938': 3622, '1941': 3879, '1961': 4426, '1972': 951, '1974': 797, 
'1981': 585, '2001': 247, '2006': 153, '1912': 2909, '1915': 5342, '1927': 5094, '1928': 4905, '1930': 4854, '1934': 3701, '1952': 4940, '1953': 4920, '1957': 5265, '1958': 5010, '1962': 3823, '1975': 689, '1989': 461, '1993': 444, '1994': 380, '2007': 149, '2008': 141, '2009': 128, '2012': 100, '1910': 1923, '1914': 4244, '1936': 3689, '1939': 3570, '1954': 5467, '1955': 5114, '1956': 5063, '1968': 1796, '1971': 1141, '1979': 570, '1986': 418, '1995': 373, '2004': 214, '2014': 121, '1931': 4591, '1932': 4257, '1945': 3907, '1965': 3076, '1985': 455, '2000': 293, '2003': 206, '1913': 3267, '1924': 5144, '1944': 4100, '1960': 4630, '1966': 2546, '1969': 1555, '1973': 820, '1977': 665, '1978': 539, '1987': 453, '1990': 465, '1992': 464, '2005': 207}}, {'_id': ObjectId('5eb51b899993d0175fb67cd1'), 'Gender': 'F', 'Name': 'Mary', 'State': 'NM', 'YearsCountDict': {'1912': 155, '1919': 270, '1921': 416, '1922': 377, '1924': 400, '1926': 342, '1942': 519, '1971': 86, '2008': 18, '1928': 380, '1931': 390, '1932': 402, '1943': 528, '1946': 557, '1950': 512, '1953': 488, '1956': 425, '1966': 170, '1967': 144, '1974': 73, '1976': 60, '1977': 72, '1983': 48, '1989': 51, '1992': 25, '1999': 13, '2006': 16, '2012': 8, '2013': 6, '1915': 202, '1916': 250, '1927': 394, '1935': 475, '1941': 505, '1949': 522, '1964': 257, '1969': 97, '1975': 60, '1997': 14, '2002': 21, '1917': 239, '1923': 382, '1933': 370, '1936': 432, '1937': 481, '1944': 509, '1945': 469, '1947': 589, '1951': 516, '1952': 508, '1963': 264, '1973': 72, '1993': 29, '2001': 22, '2014': 12, '1910': 98, '1911': 111, '1914': 140, '1918': 319, '1929': 394, '1938': 453, '1954': 509, '1955': 455, '1961': 289, '1968': 120, '1972': 77, '1984': 45, '1987': 38, '1990': 43, '2003': 15, '2004': 16, '1925': 392, '1948': 547, '1959': 366, '1994': 26, '1998': 28, '2010': 14, '1913': 127, '1920': 307, '1934': 433, '1939': 504, '1958': 386, '1960': 342, '1979': 72, '1981': 58, '1982': 69, '1988': 34, '1995': 22, '2005': 15, '2007': 
17, '2009': 11, '1930': 389, '1940': 481, '1957': 440, '1962': 285, '1965': 208, '1970': 111, '1978': 61, '1980': 57, '1985': 46, '1986': 46, '1991': 33, '1996': 34, '2000': 29}}, {'_id': ObjectId('5eb51b899993d0175fb68908'), 'Gender': 'F', 'Name': 'Mary', 'State': 'ME', 'YearsCountDict': {'1910': 92, '1916': 261, '1941': 211, '1942': 265, '1943': 245, '1947': 282, '1953': 233, '1957': 275, '1968': 112, '1969': 89, '1970': 74, '1976': 40, '1979': 40, '1985': 32, '1990': 34, '1997': 28, '1915': 237, '1917': 249, '1927': 261, '1938': 208, '1952': 219, '1961': 217, '1963': 160, '1977': 42, '1986': 35, '1988': 33, '1992': 34, '2002': 16, '2005': 11, '1934': 219, '1936': 223, '1940': 223, '1950': 264, '1954': 249, '1971': 69, '1973': 38, '1980': 47, '1987': 32, '1989': 24, '2001': 19, '1911': 83, '1914': 182, '1918': 262, '1920': 311, '1945': 226, '1967': 105, '1978': 37, '1983': 38, '1993': 26, '2010': 6, '2012': 7, '1913': 140, '1922': 296, '1924': 289, '1931': 256, '1939': 211, '1946': 263, '1994': 28, '1998': 20, '2000': 14, '1925': 286, '1928': 217, '1935': 230, '1948': 262, '1955': 256, '1964': 191, '1965': 148, '1982': 49, '2004': 12, '2014': 6, '1912': 131, '1926': 260, '1929': 220, '1930': 213, '1932': 248, '1933': 224, '1937': 236, '1944': 223, '1956': 239, '1958': 227, '1966': 133, '1972': 63, '1974': 46, '1975': 47, '1981': 41, '1984': 31, '1999': 15, '2003': 18, '2006': 9, '2007': 11, '2008': 8, '2013': 9, '1919': 281, '1921': 293, '1923': 257, '1949': 275, '1951': 225, '1959': 214, '1960': 210, '1962': 201, '1991': 26, '1995': 21, '1996': 16, '2009': 9}}, {'_id': ObjectId('5eb51b899993d0175fb69f9c'), 'Gender': 'F', 'Name': 'Mary', 'State': 'MI', 'YearsCountDict': {'1915': 1447, '1917': 1774, '1938': 2118, '1962': 2106, '1972': 556, '1978': 345, '1990': 303, '1994': 217, '1996': 196, '2000': 151, '1933': 1935, '1935': 1847, '1944': 2492, '1953': 3056, '1955': 3143, '1965': 1584, '1970': 779, '1971': 659, '1983': 348, '1984': 321, '1986': 285, '1992': 249, 
'1922': 1772, '1929': 2208, '1936': 2066, '1943': 2734, '1948': 2933, '1949': 2982, '1950': 2936, '1954': 3435, '1957': 3197, '1959': 2772, '1960': 2518, '1964': 1983, '1976': 367, '1991': 265, '1993': 222, '1999': 191, '2014': 56, '1912': 628, '1926': 2114, '1932': 2002, '1941': 2289, '1942': 2455, '1946': 2811, '1951': 3088, '1967': 1127, '1973': 507, '1981': 406, '1997': 195, '2001': 171, '2003': 137, '2006': 94, '1923': 1962, '1927': 2156, '1939': 1999, '1966': 1384, '1968': 949, '1977': 389, '1979': 379, '1980': 431, '1995': 222, '2008': 87, '1911': 456, '1916': 1587, '1919': 1554, '1920': 1809, '1924': 1946, '1925': 1932, '1930': 2291, '1934': 2034, '1963': 2028, '1975': 407, '1985': 300, '1988': 284, '1989': 295, '2002': 130, '2005': 104, '2012': 60, '1910': 349, '1913': 744, '1918': 1787, '1921': 1958, '1928': 2250, '1947': 2961, '1952': 2963, '1956': 3152, '1958': 2744, '1969': 843, '1982': 369, '1987': 287, '2007': 105, '1914': 885, '1931': 2092, '1937': 1958, '1940': 2017, '1945': 2503, '1961': 2344, '1974': 414, '1998': 193, '2004': 145, '2009': 80, '2010': 70, '2011': 60, '2013': 74}}, {'_id': ObjectId('5eb51b899993d0175fb6a2e9'), 'Gender': 'F', 'Name': 'Mary', 'State': 'MN', 'YearsCountDict': {'1912': 379, '1923': 893, '1928': 889, '1937': 962, '1950': 1812, '1972': 206, '1977': 155, '1978': 150, '1981': 173, '1989': 125, '1998': 85, '2009': 43, '1910': 216, '1914': 501, '1920': 825, '1925': 805, '1933': 895, '1952': 1755, '1953': 1680, '1954': 1920, '1962': 1242, '1975': 141, '1985': 144, '1995': 105, '1917': 756, '1921': 872, '1936': 1024, '1943': 1400, '1947': 1753, '1949': 1641, '1960': 1468, '1973': 183, '1986': 160, '2005': 65, '1911': 267, '1916': 810, '1919': 733, '1926': 942, '1930': 930, '1931': 944, '1944': 1316, '1946': 1692, '1948': 1768, '1955': 1720, '1961': 1327, '1965': 821, '1970': 342, '1971': 242, '1974': 182, '1991': 124, '1993': 141, '2000': 86, '2001': 113, '2003': 82, '2008': 56, '2010': 52, '1913': 430, '1918': 754, '1922': 
850, '1924': 855, '1927': 913, '1934': 931, '1940': 1135, '1945': 1348, '1951': 1855, '1956': 1725, '1957': 1731, '1959': 1634, '1964': 1113, '1979': 175, '1980': 177, '1987': 116, '1994': 106, '1997': 107, '2007': 40, '2011': 35, '2013': 34, '1929': 942, '1938': 1009, '1963': 1141, '1967': 533, '1976': 157, '1982': 224, '1983': 140, '1988': 129, '1999': 92, '2012': 44, '1932': 980, '1939': 1052, '1941': 1148, '1969': 336, '1992': 138, '1996': 116, '2002': 82, '1915': 731, '1935': 981, '1942': 1293, '1958': 1595, '1966': 669, '1968': 443, '1984': 149, '1990': 134, '2004': 86, '2006': 73, '2014': 42}}, {'_id': ObjectId('5eb51b899993d0175fb6b480'), 'Gender': 'F', 'Name': 'Mary', 'State': 'CT', 'YearsCountDict': {'1916': 1078, '1922': 836, '1923': 863, '1928': 643, '1947': 665, '1949': 586, '1950': 611, '1953': 665, '1955': 772, '1958': 702, '1961': 661, '1971': 198, '1981': 105, '1985': 92, '1986': 81, '1994': 100, '2001': 49, '2004': 50, '1924': 760, '1926': 751, '1931': 552, '1935': 428, '1940': 450, '1942': 581, '1954': 749, '1957': 762, '1964': 592, '1967': 336, '1969': 256, '1970': 236, '1992': 84, '2008': 27, '1920': 970, '1930': 625, '1956': 752, '1963': 590, '1966': 381, '1974': 89, '1982': 98, '1984': 73, '2003': 50, '2005': 44, '2006': 30, '2010': 29, '2012': 20, '2014': 22, '1913': 606, '1914': 851, '1921': 1010, '1933': 493, '1934': 459, '1939': 419, '1946': 643, '1965': 519, '1976': 103, '1999': 68, '2007': 38, '2011': 22, '1918': 1119, '1936': 422, '1937': 438, '1948': 619, '1959': 763, '1972': 136, '1979': 80, '1995': 80, '1996': 73, '1911': 382, '1938': 390, '1952': 632, '1960': 682, '1968': 277, '1973': 124, '1997': 65, '2002': 59, '2009': 27, '1915': 989, '1919': 969, '1925': 729, '1932': 493, '1941': 509, '1944': 520, '1945': 532, '1951': 623, '1987': 76, '1991': 109, '1993': 94, '1998': 68, '2000': 66, '2013': 24, '1910': 304, '1912': 471, '1917': 1138, '1927': 668, '1929': 624, '1943': 589, '1962': 576, '1975': 89, '1977': 83, '1978': 95, '1980': 
120, '1983': 99, '1988': 94, '1989': 93, '1990': 96}}, {'_id': ObjectId('5eb51b899993d0175fb7174c'), 'Gender': 'F', 'Name': 'Mary', 'State': 'AL', 'YearsCountDict': {'1913': 1125, '1919': 2223, '1925': 2694, '1933': 2106, '1936': 2017, '1952': 1720, '1957': 1223, '1960': 982, '1962': 821, '1967': 590, '1980': 321, '1981': 298, '1985': 274, '2001': 234, '2011': 138, '1912': 1041, '1916': 1722, '1923': 2420, '1939': 2073, '1955': 1354, '1968': 496, '1982': 314, '1987': 246, '1989': 257, '2004': 204, '1922': 2384, '1927': 2610, '1928': 2501, '1944': 2079, '1958': 1044, '1965': 715, '1991': 266, '1998': 243, '2002': 265, '2008': 183, '1917': 1825, '1924': 2596, '1930': 2397, '1943': 2308, '1959': 1048, '1963': 781, '1969': 465, '1971': 444, '1986': 240, '1990': 294, '2005': 221, '2007': 191, '2010': 147, '1910': 875, '1911': 804, '1920': 2357, '1921': 2318, '1947': 2117, '1951': 1704, '1954': 1447, '1964': 800, '1966': 645, '1973': 320, '1979': 253, '1997': 264, '2009': 144, '1914': 1429, '1932': 2339, '1935': 2214, '1937': 2154, '1938': 2193, '1942': 2351, '1945': 2081, '1956': 1247, '1972': 361, '1975': 254, '1976': 290, '1977': 268, '1983': 283, '2012': 123, '2013': 126, '1926': 2535, '1929': 2339, '1934': 2287, '1941': 2099, '1946': 2107, '1948': 2074, '1949': 1976, '1961': 930, '1970': 455, '1978': 291, '1984': 266, '1988': 275, '1992': 259, '1994': 284, '1995': 282, '1999': 269, '2000': 261, '2003': 204, '2006': 150, '2014': 112, '1915': 1552, '1918': 1914, '1931': 2274, '1940': 2074, '1950': 1839, '1953': 1503, '1974': 339, '1993': 243, '1996': 276}}, {'_id': ObjectId('5eb51b899993d0175fb72926'), 'Gender': 'F', 'Name': 'Mary', 'State': 'TN', 'YearsCountDict': {'1938': 1827, '1942': 1865, '1957': 1104, '1965': 636, '1967': 559, '1968': 504, '1969': 525, '1970': 520, '1995': 270, '2005': 193, '1916': 1723, '1923': 2371, '1927': 2329, '1930': 2102, '1932': 2023, '1936': 1792, '1943': 1927, '1953': 1463, '1963': 767, '1973': 408, '1975': 342, '1985': 320, '1990': 
281, '1993': 325, '2002': 231, '2004': 178, '1913': 1029, '1917': 1744, '1948': 1778, '1956': 1207, '1959': 973, '1972': 413, '1974': 349, '1994': 302, '1996': 290, '2000': 258, '2001': 248, '2003': 215, '2010': 127, '2011': 103, '2014': 101, '1921': 2254, '1926': 2151, '1928': 2115, '1941': 1926, '1946': 1876, '1947': 1928, '1954': 1321, '1960': 907, '1971': 469, '1977': 327, '1986': 287, '1998': 257, '1911': 721, '1922': 2245, '1924': 2474, '1925': 2442, '1934': 1957, '1937': 1742, '1939': 1734, '1958': 1037, '1962': 797, '1964': 755, '1976': 337, '1979': 350, '1983': 344, '1987': 320, '1988': 313, '1989': 302, '1992': 308, '1999': 229, '2006': 162, '1914': 1158, '1920': 2076, '1944': 1810, '1945': 1646, '1949': 1670, '1950': 1555, '1952': 1488, '1961': 874, '1978': 352, '1991': 297, '2013': 102, '1910': 735, '1915': 1515, '1929': 2028, '1931': 2042, '1933': 1844, '1935': 1952, '1951': 1547, '1955': 1284, '1966': 563, '1981': 365, '2007': 173, '2008': 169, '2012': 111, '1912': 869, '1918': 1909, '1919': 2001, '1940': 1780, '1980': 363, '1982': 364, '1984': 314, '1997': 268, '2009': 165}}, {'_id': ObjectId('5eb51b899993d0175fb74f71'), 'Gender': 'F', 'Name': 'Mary', 'State': 'MS', 'YearsCountDict': {'1913': 899, '1959': 1010, '1977': 232, '1985': 171, '2000': 153, '2004': 114, '2013': 78, '1910': 762, '1914': 970, '1922': 1719, '1943': 1715, '1947': 1742, '1950': 1639, '1952': 1536, '1955': 1312, '1961': 833, '1962': 817, '1964': 721, '1967': 512, '1970': 406, '1973': 284, '1980': 231, '2001': 148, '2002': 124, '2007': 83, '1912': 806, '1932': 1568, '1933': 1497, '1948': 1714, '1956': 1225, '1960': 941, '1998': 141, '2003': 113, '1917': 1215, '1919': 1520, '1923': 1617, '1931': 1463, '1939': 1565, '1942': 1652, '1989': 145, '1990': 161, '1999': 144, '2010': 74, '2012': 89, '1916': 1263, '1918': 1348, '1928': 1680, '1929': 1624, '1945': 1500, '1951': 1587, '1954': 1377, '1958': 1002, '1971': 353, '1975': 237, '1978': 192, '1988': 179, '1995': 194, '1996': 145, 
'2008': 104, '1915': 1141, '1920': 1556, '1924': 1734, '1927': 1840, '1940': 1559, '1944': 1545, '1965': 617, '1974': 254, '1976': 213, '1987': 160, '1992': 153, '1994': 159, '2006': 88, '1911': 606, '1921': 1526, '1930': 1662, '1935': 1504, '1937': 1498, '1946': 1621, '1949': 1765, '1953': 1496, '1963': 764, '1972': 314, '1979': 222, '1982': 261, '1986': 160, '1993': 147, '2005': 121, '2009': 90, '2014': 82, '1925': 1778, '1926': 1713, '1934': 1491, '1936': 1549, '1938': 1538, '1941': 1628, '1957': 1164, '1966': 565, '1968': 424, '1969': 374, '1981': 207, '1983': 178, '1984': 193, '1991': 162, '1997': 145, '2011': 96}}, {'_id': ObjectId('5eb51b899993d0175fb769d2'), 'Gender': 'F', 'Name': 'Mary', 'State': 'AK', 'YearsCountDict': {'1930': 35, '1931': 41, '1932': 26, '1933': 25, '1934': 31, '1982': 29, '1984': 31, '2011': 15, '1954': 93, '1976': 23, '1978': 29, '1981': 32, '1985': 42, '1992': 21, '1994': 21, '1995': 31, '2000': 18, '2005': 18, '1951': 77, '1974': 35, '2006': 9, '2007': 11, '1910': 14, '1911': 12, '1912': 9, '1913': 21, '1914': 22, '1915': 23, '1916': 18, '1917': 21, '1918': 27, '1919': 22, '1952': 75, '1962': 86, '1973': 29, '1979': 25, '1991': 31, '1996': 14, '2002': 16, '1939': 28, '1940': 43, '1941': 41, '1947': 63, '1948': 71, '1957': 99, '1959': 61, '1960': 78, '1961': 80, '1971': 46, '1977': 26, '1987': 32, '1997': 25, '2008': 12, '1942': 43, '1943': 47, '1944': 42, '1956': 96, '1963': 78, '1966': 67, '1967': 59, '1968': 36, '1969': 51, '1970': 47, '1980': 38, '1983': 36, '1989': 29, '1993': 26, '1999': 19, '2014': 6, '1925': 24, '1926': 39, '1927': 30, '1928': 27, '1929': 25, '1935': 29, '1936': 33, '1937': 41, '1938': 37, '1945': 47, '1946': 47, '1949': 79, '1950': 71, '1953': 80, '1975': 29, '1988': 28, '2003': 9, '2004': 11, '1920': 38, '1921': 36, '1922': 29, '1923': 26, '1924': 41, '1955': 91, '1958': 84, '1964': 84, '1965': 68, '1972': 31, '1986': 27, '1990': 27, '1998': 19, '2001': 16, '2009': 6, '2010': 15, '2012': 15, '2013': 14}}, 
{'_id': ObjectId('5eb51b899993d0175fb76eab'), 'Gender': 'F', 'Name': 'Mary', 'State': 'PA', 'YearsCountDict': {'1921': 7771, '1932': 4536, '1934': 4088, '1935': 3929, '1937': 4037, '1938': 4059, '1940': 3942, '1941': 4085, '1953': 4025, '1956': 3872, '1970': 1051, '1972': 739, '1982': 497, '2002': 264, '1911': 3188, '1944': 3952, '1963': 2577, '1964': 2531, '1969': 1145, '1977': 526, '1985': 398, '1995': 367, '1997': 303, '2012': 130, '1926': 6004, '1939': 3878, '1946': 4455, '1951': 4208, '1958': 3697, '1959': 3578, '1967': 1418, '1987': 365, '1991': 415, '1992': 430, '1910': 2913, '1912': 4106, '1923': 7034, '1924': 7200, '1925': 6565, '1930': 5186, '1949': 4350, '1974': 593, '1980': 531, '1998': 286, '2007': 191, '1914': 5981, '1920': 7651, '1931': 4916, '1936': 3932, '1943': 4471, '1950': 4074, '1965': 2049, '1966': 1695, '1973': 611, '1976': 476, '1981': 502, '1984': 430, '1999': 319, '2001': 299, '2008': 165, '2014': 124, '1915': 7970, '1922': 7303, '1928': 5739, '1955': 4071, '1961': 3035, '1968': 1275, '1983': 455, '1993': 437, '2004': 223, '2005': 199, '2006': 183, '2010': 158, '1913': 4738, '1927': 6228, '1942': 4474, '1947': 4845, '1952': 4171, '1960': 3295, '1962': 2635, '1978': 468, '1979': 500, '1986': 388, '2003': 225, '2011': 128, '1916': 7730, '1917': 7987, '1918': 8184, '1919': 7428, '1929': 5303, '1933': 4122, '1945': 3658, '1948': 4440, '1954': 4394, '1957': 3968, '1971': 807, '1975': 580, '1988': 400, '1989': 447, '1990': 414, '1994': 430, '1996': 358, '2000': 310, '2009': 160, '2013': 123}}, {'_id': ObjectId('5eb51b899993d0175fb7848e'), 'Gender': 'F', 'Name': 'Mary', 'State': 'WA', 'YearsCountDict': {'1914': 281, '1915': 368, '1918': 479, '1930': 401, '1938': 382, '1944': 684, '1955': 735, '1967': 326, '1988': 112, '1923': 475, '1931': 377, '1952': 828, '1954': 791, '1957': 684, '1969': 231, '1970': 228, '1972': 130, '1977': 133, '1990': 129, '1999': 78, '2013': 39, '2014': 32, '1927': 463, '1929': 432, '1935': 375, '1950': 818, '1958': 621, 
'1961': 495, '1965': 353, '1980': 193, '1985': 123, '1992': 114, '1994': 108, '2001': 90, '2005': 62, '2007': 60, '2011': 38, '1913': 235, '1940': 366, '1951': 736, '1960': 548, '1964': 443, '1973': 166, '1986': 122, '1996': 108, '1910': 112, '1912': 207, '1925': 436, '1928': 400, '1937': 378, '1956': 702, '1968': 226, '1971': 177, '1974': 122, '1975': 134, '1978': 123, '1995': 103, '2012': 35, '1917': 408, '1919': 419, '1922': 531, '1926': 434, '1934': 361, '1941': 423, '1945': 655, '1953': 757, '1959': 683, '1979': 161, '1987': 134, '2002': 74, '2003': 80, '2004': 71, '2009': 47, '1920': 544, '1932': 398, '1942': 571, '1943': 600, '1946': 753, '1949': 796, '1962': 514, '1976': 119, '1981': 179, '1982': 149, '1984': 145, '1989': 140, '1993': 83, '1997': 93, '2006': 61, '2010': 40, '1911': 131, '1916': 364, '1921': 531, '1924': 464, '1933': 388, '1936': 351, '1939': 349, '1947': 790, '1948': 805, '1963': 465, '1966': 325, '1983': 147, '1991': 117, '1998': 85, '2000': 73, '2008': 50}}, {'_id': ObjectId('5eb51b899993d0175fb7aaf5'), 'Gender': 'F', 'Name': 'Mary', 'State': 'NC', 'YearsCountDict': {'1914': 1581, '1921': 2811, '1945': 2224, '1948': 2402, '1954': 1700, '1957': 1505, '1959': 1256, '1978': 358, '1979': 381, '1997': 292, '2008': 161, '1917': 2129, '1918': 2344, '1925': 2915, '1931': 2441, '1932': 2674, '1940': 2281, '1962': 1084, '1964': 999, '1965': 893, '1966': 764, '1967': 705, '1980': 423, '1984': 337, '1986': 333, '2010': 141, '1913': 1180, '1916': 2056, '1919': 2577, '1920': 2675, '1924': 3126, '1929': 2665, '1933': 2387, '1934': 2442, '1937': 2305, '1947': 2571, '1974': 415, '1989': 355, '1992': 355, '2001': 265, '2006': 214, '2013': 115, '1912': 1086, '1915': 1946, '1936': 2254, '1951': 2115, '1953': 1779, '1968': 650, '1969': 635, '1976': 369, '1987': 369, '1991': 386, '1994': 339, '2011': 130, '1910': 837, '1911': 838, '1922': 2852, '1927': 2967, '1949': 2236, '1955': 1681, '1963': 1062, '2000': 280, '2005': 221, '1935': 2410, '1946': 2413, '1971': 
555, '1973': 434, '1977': 384, '1982': 377, '1983': 357, '1998': 285, '2002': 228, '2007': 181, '2012': 110, '1928': 2898, '1942': 2413, '1943': 2447, '1950': 2032, '1952': 1960, '1956': 1519, '1958': 1353, '1970': 647, '1988': 336, '2004': 262, '1923': 2891, '1926': 2912, '1930': 2483, '1938': 2347, '1939': 2287, '1941': 2250, '1944': 2312, '1960': 1165, '1961': 1176, '1972': 471, '1975': 383, '1981': 417, '1985': 369, '1990': 358, '1993': 346, '1995': 285, '1996': 291, '1999': 312, '2003': 244, '2009': 145, '2014': 114}}, {'_id': ObjectId('5eb51b899993d0175fb7d408'), 'Gender': 'F', 'Name': 'Mary', 'State': 'VT', 'YearsCountDict': {'1918': 139, '1921': 154, '1935': 99, '1939': 106, '1949': 173, '1968': 42, '1973': 18, '1988': 14, '2008': 8, '1920': 132, '1926': 114, '1929': 141, '1936': 112, '1953': 173, '1954': 187, '1957': 156, '1969': 39, '1974': 20, '1975': 20, '1978': 20, '1980': 32, '1986': 19, '1999': 13, '1910': 45, '1911': 62, '1912': 95, '1914': 106, '1922': 129, '1927': 114, '1930': 126, '1937': 104, '1940': 123, '1947': 183, '1951': 147, '1959': 126, '1960': 101, '1961': 98, '1962': 81, '1964': 96, '1976': 23, '1977': 22, '1982': 13, '2002': 11, '1915': 120, '1916': 144, '1928': 123, '1952': 155, '1965': 73, '1967': 56, '1970': 35, '1983': 18, '1990': 22, '1991': 13, '1917': 126, '1919': 121, '1931': 114, '1945': 120, '1946': 132, '1950': 155, '1956': 123, '1963': 84, '1971': 44, '1972': 31, '1987': 9, '1995': 9, '2000': 13, '2001': 10, '1913': 80, '1923': 119, '1932': 127, '1943': 121, '1944': 118, '1948': 146, '1955': 144, '1966': 61, '1989': 13, '1992': 17, '1997': 6, '1998': 10, '2003': 15, '1925': 141, '1933': 117, '1938': 105, '1942': 128, '1958': 117, '1981': 23, '1984': 18, '1985': 20, '1996': 15, '1924': 124, '1934': 114, '1941': 111, '1979': 27, '1993': 11, '1994': 14, '2004': 5}}, {'_id': ObjectId('5eb51b8a9993d0175fb828f7'), 'Gender': 'F', 'Name': 'Mary', 'State': 'OR', 'YearsCountDict': {'1910': 54, '1931': 217, '1950': 429, '1951': 450, 
'1954': 459, '1957': 383, '1961': 305, '1962': 233, '1969': 132, '1972': 98, '1984': 87, '1992': 63, '1997': 62, '1999': 66, '2004': 41, '2006': 44, '1912': 129, '1919': 256, '1928': 252, '1932': 198, '1933': 188, '1978': 97, '1985': 82, '1987': 63, '1998': 54, '1918': 230, '1921': 326, '1936': 210, '1939': 235, '1944': 368, '1947': 504, '1949': 435, '1952': 439, '1955': 417, '1971': 112, '1974': 83, '1976': 94, '1981': 109, '1993': 74, '2001': 39, '2014': 20, '1914': 149, '1915': 248, '1922': 271, '1935': 190, '1938': 253, '1942': 351, '1946': 421, '1948': 442, '1956': 373, '1966': 160, '1979': 99, '2003': 44, '2010': 22, '1911': 73, '1913': 136, '1923': 321, '1925': 274, '1926': 238, '1958': 362, '1968': 144, '1975': 94, '1983': 80, '1988': 68, '1996': 60, '1916': 198, '1917': 249, '1927': 259, '1943': 395, '1960': 323, '1967': 165, '1973': 90, '1980': 118, '1989': 69, '1991': 80, '2009': 32, '2013': 20, '1920': 271, '1924': 291, '1929': 244, '1930': 268, '1937': 238, '1953': 486, '1959': 295, '1963': 222, '1965': 229, '1970': 122, '1982': 121, '1990': 85, '1995': 49, '2000': 60, '2005': 27, '2007': 31, '2011': 20, '1934': 206, '1940': 250, '1941': 255, '1945': 371, '1964': 252, '1977': 84, '1986': 68, '1994': 56, '2002': 39, '2008': 40, '2012': 14}}, {'_id': ObjectId('5eb51b8a9993d0175fb85300'), 'Gender': 'F', 'Name': 'Mary', 'State': 'SC', 'YearsCountDict': {'1911': 508, '1912': 711, '1918': 1412, '1923': 1646, '1939': 1483, '1946': 1570, '1954': 1300, '1977': 237, '1982': 225, '1986': 162, '1999': 162, '2001': 174, '2002': 159, '1920': 1615, '1921': 1687, '1924': 1742, '1976': 240, '1979': 243, '1994': 168, '1996': 197, '1914': 908, '1941': 1626, '1949': 1595, '1953': 1357, '1957': 995, '1978': 228, '1990': 188, '1991': 207, '1992': 209, '2009': 95, '1925': 1670, '1929': 1419, '1934': 1528, '1935': 1463, '1947': 1710, '1948': 1639, '1951': 1458, '1952': 1402, '1981': 225, '1987': 178, '1995': 188, '1997': 171, '2005': 141, '2006': 123, '2008': 98, '2014': 92, 
'1915': 1098, '1919': 1616, '1922': 1634, '1926': 1628, '1933': 1465, '1937': 1548, '1938': 1478, '1944': 1603, '1956': 1097, '1959': 877, '1966': 469, '1967': 475, '2003': 142, '2007': 103, '1916': 1243, '1936': 1435, '1942': 1635, '1960': 805, '1963': 644, '1969': 365, '1972': 277, '1985': 214, '2000': 167, '2013': 93, '1928': 1539, '1930': 1450, '1932': 1565, '1940': 1570, '1958': 897, '1962': 715, '1964': 617, '1968': 422, '1971': 338, '1973': 273, '1974': 245, '1975': 233, '1988': 189, '1989': 196, '1910': 602, '1913': 725, '1917': 1263, '1927': 1638, '1931': 1365, '1943': 1699, '1945': 1548, '1950': 1455, '1955': 1144, '1961': 750, '1965': 531, '1970': 364, '1980': 235, '1983': 203, '1984': 202, '1993': 179, '1998': 175, '2004': 124, '2010': 72, '2011': 97, '2012': 79}}, {'_id': ObjectId('5eb51b8a9993d0175fb8564f'), 'Gender': 'F', 'Name': 'Mary', 'State': 'IA', 'YearsCountDict': {'1987': 58, '2009': 27, '2012': 21, '1957': 966, '1962': 729, '2001': 48, '2002': 43, '2008': 31, '2013': 27, '2014': 16, '1936': 1072, '1945': 1079, '1953': 1216, '1968': 309, '1996': 59, '2000': 54, '2006': 36, '1911': 279, '1913': 480, '1918': 1070, '1927': 1170, '1929': 1025, '1933': 1059, '1937': 1100, '1943': 1242, '1967': 344, '1969': 270, '1991': 80, '1998': 53, '1999': 52, '2005': 34, '2010': 26, '1910': 239, '1914': 624, '1915': 867, '1920': 1213, '1921': 1215, '1926': 1051, '1934': 1091, '1935': 1025, '1940': 1098, '1944': 1173, '1952': 1220, '1956': 1082, '1960': 789, '1977': 112, '1980': 134, '1982': 117, '1984': 94, '1990': 62, '2004': 40, '1928': 1100, '1939': 1028, '1941': 1091, '1949': 1290, '1954': 1275, '1955': 1153, '1963': 685, '1979': 112, '1995': 65, '1997': 54, '2003': 31, '2011': 21, '1922': 1144, '1923': 1204, '1950': 1312, '1958': 831, '1961': 780, '1965': 495, '1973': 121, '1974': 119, '1976': 116, '1983': 110, '1989': 64, '1992': 64, '2007': 22, '1917': 1077, '1919': 1076, '1931': 1118, '1932': 1077, '1938': 1087, '1942': 1192, '1947': 1397, '1948': 1354, 
'1951': 1362, '1964': 615, '1966': 449, '1972': 156, '1975': 129, '1981': 118, '1986': 72, '1988': 73, '1993': 73, '1994': 53, '1912': 403, '1916': 982, '1924': 1158, '1925': 1123, '1930': 1140, '1946': 1296, '1959': 832, '1970': 246, '1971': 217, '1978': 133, '1985': 91}}, {'_id': ObjectId('5eb51b8a9993d0175fb85e6a'), 'Gender': 'F', 'Name': 'Mary', 'State': 'OK', 'YearsCountDict': {'1911': 348, '1925': 1452, '1941': 1056, '2000': 78, '2002': 57, '2004': 55, '1910': 326, '1921': 1338, '1922': 1358, '1950': 814, '1955': 607, '1956': 546, '1957': 541, '1963': 358, '1977': 155, '1982': 168, '1990': 97, '1994': 83, '2001': 63, '2003': 51, '2006': 42, '2014': 32, '1920': 1313, '1923': 1446, '1933': 1254, '1937': 1094, '1939': 1082, '1945': 851, '1948': 819, '1949': 809, '1951': 746, '1953': 660, '1962': 364, '1965': 298, '1967': 227, '1972': 171, '1978': 126, '1980': 167, '1986': 127, '1989': 94, '2005': 51, '2012': 22, '1917': 931, '1919': 1051, '1931': 1237, '1932': 1224, '1934': 1223, '1935': 1154, '1943': 1047, '1947': 914, '1966': 254, '1968': 210, '1976': 126, '1988': 94, '1913': 525, '1918': 1036, '1926': 1410, '1928': 1341, '1930': 1318, '1946': 877, '1954': 669, '1961': 413, '1981': 166, '1983': 169, '1992': 102, '1915': 763, '1916': 815, '1940': 1042, '1942': 1067, '1944': 997, '1952': 710, '1959': 460, '1969': 204, '1971': 192, '1975': 126, '1984': 144, '1987': 97, '1996': 70, '1998': 70, '2007': 39, '2009': 39, '2010': 29, '2011': 29, '1912': 431, '1914': 572, '1924': 1461, '1929': 1350, '1960': 470, '1974': 138, '1979': 162, '1985': 136, '1991': 92, '1993': 91, '1995': 82, '1997': 65, '2008': 46, '2013': 29, '1927': 1502, '1936': 1097, '1938': 1115, '1958': 468, '1964': 347, '1970': 193, '1973': 129, '1999': 58}}, {'_id': ObjectId('5eb51b8a9993d0175fb85f05'), 'Gender': 'F', 'Name': 'Mary', 'State': 'FL', 'YearsCountDict': {'1910': 239, '1911': 223, '1922': 706, '1929': 823, '1932': 837, '1935': 781, '1942': 1066, '1947': 1237, '1955': 1256, '1962': 1003, 
'1977': 348, '1979': 341, '1994': 327, '1995': 291, '1999': 234, '2001': 219, '1934': 788, '1954': 1244, '1965': 849, '1971': 546, '1973': 445, '2002': 195, '2006': 146, '2011': 75, '1920': 626, '1923': 794, '1931': 782, '1933': 679, '1938': 803, '1948': 1224, '1953': 1222, '1956': 1348, '1963': 966, '1964': 1052, '1968': 640, '1970': 577, '1984': 321, '1988': 340, '1992': 328, '2007': 153, '2013': 84, '1913': 326, '1914': 427, '1918': 585, '1930': 801, '1950': 1157, '1958': 1256, '1967': 704, '1978': 363, '1982': 418, '1985': 356, '1993': 322, '1998': 232, '2003': 175, '2009': 93, '2010': 100, '2012': 80, '2014': 83, '1912': 290, '1921': 717, '1927': 1042, '1936': 783, '1943': 1215, '1949': 1107, '1951': 1222, '1959': 1195, '1961': 1133, '1972': 455, '1976': 386, '1987': 349, '2000': 223, '2005': 166, '2008': 114, '1916': 537, '1917': 581, '1919': 620, '1926': 1068, '1928': 916, '1937': 816, '1946': 1123, '1952': 1229, '1969': 549, '1981': 422, '1991': 371, '1997': 221, '1915': 482, '1924': 846, '1940': 902, '1941': 907, '1944': 1175, '1960': 1201, '1966': 779, '1974': 417, '1975': 379, '1980': 407, '1989': 361, '1990': 357, '2004': 200, '1925': 972, '1939': 888, '1945': 1216, '1957': 1239, '1983': 360, '1986': 326, '1996': 253}}, {'_id': ObjectId('5eb51b8a9993d0175fb861bc'), 'Gender': 'F', 'Name': 'Mary', 'State': 'IL', 'YearsCountDict': {'1910': 1076, '1918': 3381, '1919': 3396, '1921': 3674, '1926': 3336, '1931': 3118, '1933': 2753, '1935': 2570, '1936': 2680, '1946': 3492, '1963': 2658, '1969': 1169, '1975': 577, '1978': 491, '1980': 621, '1991': 439, '1999': 278, '2006': 172, '2007': 157, '1916': 3287, '1942': 3205, '1949': 3728, '1951': 3778, '1959': 3700, '1982': 519, '1983': 523, '1987': 457, '2000': 293, '2001': 246, '2005': 196, '1914': 2332, '1924': 3553, '1927': 3575, '1955': 4100, '1966': 1706, '1970': 1046, '1971': 884, '1974': 629, '1985': 448, '1988': 392, '1989': 415, '1998': 292, '2002': 261, '1923': 3607, '1950': 3810, '1952': 3871, '1954': 
4341, '1960': 3571, '1964': 2556, '1965': 2102, '1967': 1439, '1973': 637, '1976': 504, '1981': 598, '1984': 426, '1995': 356, '1996': 314, '2008': 127, '2010': 96, '1911': 1207, '1930': 3444, '1945': 2931, '1962': 2780, '1968': 1319, '1972': 692, '1992': 395, '2003': 238, '1912': 1594, '1928': 3415, '1943': 3352, '1958': 3659, '1961': 3157, '1979': 536, '1986': 394, '1994': 347, '2004': 205, '2009': 106, '2013': 105, '2014': 85, '1922': 3519, '1929': 3274, '1938': 2733, '1940': 2770, '1944': 3109, '1948': 3646, '1956': 4008, '1977': 546, '1990': 393, '1993': 420, '1913': 1956, '1915': 3043, '1917': 3474, '1920': 3472, '1925': 3498, '1932': 2903, '1934': 2670, '1937': 2772, '1939': 2634, '1941': 2797, '1947': 3938, '1953': 3806, '1957': 3993, '1997': 320, '2011': 92, '2012': 84}}, {'_id': ObjectId('5eb51b8a9993d0175fb87b99'), 'Gender': 'F', 'Name': 'Mary', 'State': 'MD', 'YearsCountDict': {'1912': 562, '1915': 963, '1923': 1226, '1926': 1146, '1942': 1064, '1950': 1028, '1961': 849, '1963': 827, '1968': 441, '1972': 267, '1979': 169, '1928': 1132, '1938': 890, '1953': 1125, '1965': 713, '1966': 540, '1980': 193, '1986': 180, '1990': 215, '1996': 146, '2013': 41, '1919': 1069, '1922': 1189, '1927': 1200, '1937': 877, '1946': 1184, '1958': 1015, '1960': 920, '1974': 218, '1982': 199, '1985': 166, '1993': 149, '2002': 92, '1913': 617, '1930': 1043, '1931': 931, '1932': 995, '1943': 1191, '1949': 1023, '1951': 1063, '1952': 1087, '1988': 167, '1997': 149, '2004': 85, '2010': 54, '2012': 55, '1914': 718, '1916': 961, '1929': 1050, '1933': 903, '1941': 988, '1945': 965, '1956': 1108, '1962': 815, '1970': 371, '1975': 166, '1976': 181, '1978': 151, '1983': 168, '1984': 159, '1989': 168, '1994': 185, '2006': 76, '2008': 82, '1918': 1076, '1921': 1302, '1924': 1248, '1940': 906, '1955': 1103, '1957': 1013, '1959': 1001, '1969': 385, '1973': 223, '1992': 190, '1995': 149, '2000': 141, '2001': 123, '2003': 96, '2011': 52, '2014': 49, '1910': 393, '1911': 425, '1917': 1031, 
'1934': 886, '1939': 846, '1964': 729, '1967': 528, '1971': 317, '1977': 177, '1981': 191, '1987': 144, '1991': 202, '1998': 149, '1920': 1208, '1925': 1181, '1935': 882, '1936': 875, '1944': 1083, '1947': 1255, '1948': 1042, '1954': 1207, '1999': 97, '2005': 90, '2007': 61, '2009': 62}}, {'_id': ObjectId('5eb51b8a9993d0175fb8a0e4'), 'Gender': 'F', 'Name': 'Mary', 'State': 'RI', 'YearsCountDict': {'1916': 414, '1919': 410, '1923': 426, '1924': 414, '1929': 269, '1931': 216, '1937': 144, '1945': 215, '1946': 253, '1948': 243, '1960': 241, '1971': 66, '1972': 37, '1974': 29, '1979': 31, '1992': 29, '1996': 14, '2000': 16, '1914': 350, '1922': 461, '1925': 382, '1930': 278, '1936': 167, '1941': 171, '1956': 292, '1957': 267, '2002': 14, '2007': 7, '2014': 6, '1921': 485, '1935': 172, '1938': 163, '1939': 180, '1949': 226, '1952': 249, '1953': 259, '1958': 268, '1964': 188, '1976': 29, '1984': 30, '1991': 21, '1993': 24, '1995': 19, '1999': 12, '2009': 6, '2011': 8, '1928': 319, '1951': 256, '1954': 300, '1961': 227, '1962': 235, '1968': 79, '1982': 21, '1985': 23, '1986': 27, '1989': 24, '2001': 11, '2003': 17, '1911': 168, '1913': 269, '1926': 309, '1942': 173, '1944': 195, '1950': 226, '1980': 33, '1981': 26, '1988': 23, '1997': 12, '2010': 5, '1918': 458, '1932': 225, '1940': 177, '1943': 234, '1955': 278, '1959': 258, '1963': 203, '1966': 116, '1967': 101, '1969': 66, '1983': 31, '1987': 25, '2004': 13, '2006': 12, '2008': 8, '1910': 141, '1915': 415, '1970': 77, '1973': 34, '1975': 23, '2013': 8, '1912': 266, '1917': 442, '1920': 452, '1927': 338, '1933': 187, '1934': 164, '1947': 281, '1965': 150, '1977': 31, '1978': 24, '1990': 31, '1994': 21, '1998': 20, '2005': 14}}, {'_id': ObjectId('5eb51b8a9993d0175fb8ae1e'), 'Gender': 'F', 'Name': 'Mary', 'State': 'WI', 'YearsCountDict': {'1921': 950, '1933': 1042, '1945': 1780, '1946': 1972, '1947': 2255, '1952': 2246, '1955': 2198, '1958': 1864, '1975': 174, '1983': 172, '1992': 113, '1994': 132, '2004': 64, '2009': 36, 
'1914': 541, '1928': 1130, '1931': 1089, '1957': 2128, '1965': 1065, '1969': 450, '1991': 135, '1995': 108, '1996': 100, '1997': 91, '2005': 57, '1910': 260, '1918': 873, '1924': 975, '1925': 1010, '1934': 1178, '1950': 2061, '1981': 186, '1982': 150, '1984': 141, '2007': 53, '1917': 838, '1920': 920, '1922': 970, '1929': 1130, '1932': 1118, '1942': 1583, '1943': 1723, '1960': 1791, '1966': 877, '1970': 399, '1972': 278, '1976': 172, '1985': 136, '1988': 113, '1999': 79, '2006': 57, '1923': 928, '1938': 1308, '1951': 2247, '1954': 2399, '1959': 1775, '1961': 1669, '1968': 571, '1973': 213, '1989': 130, '1990': 129, '1998': 80, '2000': 68, '2002': 63, '2003': 70, '1913': 458, '1926': 960, '1930': 1128, '1935': 1115, '1949': 2069, '1953': 2126, '1967': 683, '1974': 217, '1980': 195, '1912': 394, '1915': 759, '1927': 1120, '1936': 1189, '1937': 1274, '1940': 1461, '1962': 1464, '1963': 1363, '1971': 333, '1978': 154, '1979': 178, '2001': 78, '2013': 36, '2014': 36, '1911': 278, '1916': 838, '1919': 771, '1939': 1288, '1941': 1313, '1944': 1699, '1948': 2170, '1956': 2013, '1964': 1309, '1977': 183, '1986': 136, '1987': 110, '1993': 123, '2008': 52, '2010': 41, '2011': 34, '2012': 28}}, {'_id': ObjectId('5eb51b8a9993d0175fb8affe'), 'Gender': 'F', 'Name': 'Mary', 'State': 'UT', 'YearsCountDict': {'1921': 235, '1942': 189, '1950': 240, '1954': 230, '1955': 181, '1973': 83, '1978': 101, '2008': 54, '2013': 25, '1910': 57, '1911': 71, '1925': 189, '1929': 177, '1939': 179, '1941': 168, '1945': 195, '1946': 220, '1952': 230, '1967': 89, '1970': 97, '1991': 71, '1992': 64, '1994': 53, '2011': 40, '1931': 187, '1932': 164, '1944': 218, '1947': 240, '1949': 233, '1953': 230, '1956': 205, '1993': 62, '1997': 60, '1999': 62, '2009': 47, '1914': 123, '1915': 183, '1938': 165, '1961': 145, '1964': 129, '1965': 125, '1966': 110, '1971': 77, '1981': 112, '1987': 56, '1990': 68, '1995': 56, '2003': 64, '2005': 51, '2006': 52, '1913': 82, '1927': 200, '1936': 185, '1960': 148, '1963': 
137, '1972': 77, '1977': 98, '1986': 53, '2001': 57, '2007': 48, '1912': 86, '1923': 219, '1924': 187, '1926': 202, '1930': 161, '1934': 158, '1958': 183, '1962': 171, '1979': 113, '1982': 104, '2002': 60, '1916': 172, '1917': 203, '1918': 206, '1943': 240, '1957': 207, '1968': 81, '1969': 79, '1974': 70, '1983': 79, '1985': 83, '1998': 86, '2000': 58, '2004': 50, '2012': 36, '1919': 195, '1920': 222, '1922': 223, '1928': 179, '1933': 164, '1935': 160, '1937': 161, '1940': 168, '1948': 256, '1951': 234, '1959': 164, '1975': 89, '1976': 88, '1980': 109, '1984': 84, '1988': 61, '1989': 60, '1996': 65, '2010': 30, '2014': 32}}, {'_id': ObjectId('5eb51b8a9993d0175fb8e1e2'), 'Gender': 'F', 'Name': 'Mary', 'State': 'CO', 'YearsCountDict': {'1920': 576, '1929': 522, '1948': 583, '1954': 557, '1960': 438, '1985': 117, '1997': 89, '1998': 79, '2001': 70, '2003': 74, '2004': 82, '2005': 49, '1910': 193, '1919': 472, '1934': 509, '1938': 458, '1951': 575, '1952': 562, '1957': 474, '1959': 448, '1967': 219, '1970': 173, '1971': 156, '1977': 96, '1980': 106, '1992': 76, '2000': 90, '1941': 474, '1943': 534, '1947': 618, '1961': 418, '1983': 114, '1994': 76, '2009': 44, '1912': 234, '1916': 457, '1917': 462, '1928': 539, '1936': 487, '1939': 476, '1953': 574, '1956': 567, '1964': 355, '1965': 292, '1976': 122, '1984': 119, '1990': 75, '1991': 91, '1995': 79, '1996': 92, '2007': 39, '2011': 34, '1915': 387, '1918': 539, '1922': 612, '1937': 528, '1942': 500, '1944': 522, '1945': 476, '1950': 549, '1963': 347, '1969': 171, '1974': 111, '1982': 127, '1989': 102, '2002': 57, '1911': 169, '1925': 533, '1926': 542, '1927': 512, '1932': 496, '1946': 553, '1949': 559, '1968': 161, '1972': 119, '1978': 110, '1986': 98, '1999': 87, '2008': 50, '2010': 30, '2013': 40, '1913': 258, '1921': 627, '1923': 626, '1924': 551, '1930': 515, '1935': 472, '1966': 241, '1973': 105, '1979': 113, '1981': 125, '1987': 111, '1988': 97, '1993': 75, '2006': 61, '2014': 38, '1914': 338, '1931': 511, '1933': 
469, '1940': 486, '1955': 498, '1958': 462, '1962': 343, '1975': 112, '2012': 32}}, {'_id': ObjectId('5eb51b8a9993d0175fb8ef3d'), 'Gender': 'F', 'Name': 'Mary', 'State': 'IN', 'YearsCountDict': {'1917': 2231, '1935': 1406, '1939': 1278, '1963': 975, '1969': 504, '1978': 300, '1983': 250, '1986': 198, '1992': 185, '1995': 177, '2003': 125, '2005': 115, '1911': 612, '1912': 935, '1922': 2376, '1925': 2220, '1936': 1370, '1949': 1559, '1954': 1601, '1964': 956, '1974': 313, '1977': 300, '1980': 288, '1984': 208, '1996': 195, '2010': 69, '1915': 1977, '1943': 1586, '1947': 1611, '1948': 1664, '1960': 1257, '1962': 1016, '1966': 684, '1973': 320, '1979': 282, '2000': 157, '2002': 127, '1913': 1105, '1921': 2633, '1928': 2050, '1929': 1795, '1932': 1616, '1933': 1497, '1937': 1437, '1944': 1408, '1946': 1610, '1950': 1517, '1968': 536, '1989': 206, '1993': 181, '1997': 165, '1998': 160, '2001': 137, '2009': 75, '2011': 81, '1910': 619, '1926': 2063, '1938': 1459, '1941': 1360, '1961': 1151, '1967': 647, '1976': 273, '1919': 2209, '1920': 2394, '1924': 2400, '1942': 1547, '1953': 1525, '1957': 1476, '1965': 844, '1970': 516, '1971': 422, '1981': 277, '1987': 186, '1991': 204, '1999': 154, '2014': 72, '1914': 1349, '1916': 2223, '1923': 2397, '1930': 1884, '1931': 1581, '1951': 1621, '1952': 1540, '1956': 1489, '1959': 1284, '1975': 284, '1982': 291, '1988': 193, '1994': 200, '2006': 102, '2012': 63, '2013': 59, '1918': 2316, '1927': 2186, '1934': 1466, '1940': 1286, '1945': 1331, '1955': 1471, '1958': 1354, '1972': 358, '1985': 210, '1990': 215, '2004': 145, '2007': 92, '2008': 90}}, {'_id': ObjectId('5eb51b8a9993d0175fb8f475'), 'Gender': 'F', 'Name': 'Mary', 'State': 'MO', 'YearsCountDict': {'1911': 626, '1917': 1882, '1918': 1950, '1923': 2115, '1924': 2261, '1925': 2154, '1927': 2315, '1945': 1682, '1947': 2062, '1956': 1724, '1963': 1060, '1968': 492, '1971': 440, '1981': 269, '1996': 164, '1910': 611, '1931': 1884, '1972': 357, '1975': 295, '1986': 215, '2010': 69, 
'1932': 1972, '1946': 1946, '1951': 1783, '1954': 1814, '1965': 913, '1970': 468, '1976': 277, '1982': 259, '1989': 224, '1993': 203, '2004': 98, '2012': 62, '1912': 834, '1920': 2022, '1921': 2122, '1930': 1990, '1935': 1668, '1937': 1680, '1943': 1850, '1953': 1666, '1958': 1536, '1959': 1409, '1974': 291, '1978': 243, '1980': 278, '1987': 195, '2001': 138, '2002': 114, '2003': 130, '2007': 94, '2011': 74, '1914': 1126, '1929': 1872, '1936': 1681, '1940': 1707, '1950': 1816, '1952': 1794, '1957': 1609, '1967': 628, '1991': 233, '1992': 232, '2006': 117, '1916': 1735, '1933': 1724, '1941': 1743, '1948': 1933, '1960': 1350, '1964': 1043, '1969': 518, '1985': 239, '2000': 163, '1913': 971, '1915': 1562, '1919': 1817, '1928': 2088, '1934': 1797, '1938': 1688, '1942': 1853, '1944': 1794, '1955': 1692, '1962': 1133, '1988': 227, '1994': 199, '1995': 208, '1999': 138, '1922': 2107, '1926': 2097, '1939': 1618, '1949': 1862, '1961': 1278, '1966': 710, '1973': 296, '1977': 290, '1979': 246, '1983': 282, '1984': 238, '1990': 218, '1997': 173, '1998': 164, '2005': 107, '2008': 77, '2009': 103, '2013': 70, '2014': 73}}, {'_id': ObjectId('5eb51b8a9993d0175fb8fd77'), 'Gender': 'F', 'Name': 'Mary', 'State': 'GA', 'YearsCountDict': {'1933': 2113, '1943': 2382, '1946': 2253, '1950': 2062, '1955': 1625, '1964': 978, '1983': 342, '1999': 377, '2001': 315, '1915': 1683, '1918': 2203, '1938': 2183, '1940': 2173, '1949': 2251, '1956': 1507, '1960': 1169, '1962': 1020, '1970': 609, '1978': 342, '1982': 372, '1984': 364, '2004': 297, '2007': 195, '2010': 175, '1911': 893, '1912': 1170, '1913': 1223, '1928': 2213, '1945': 2100, '1967': 690, '1972': 452, '1974': 427, '1980': 404, '1989': 365, '1990': 358, '1997': 348, '2002': 348, '2013': 142, '1919': 2386, '1929': 2153, '1930': 2171, '1939': 2197, '1941': 2175, '1965': 843, '2006': 243, '1921': 2474, '1923': 2401, '1924': 2502, '1931': 2133, '1947': 2350, '1952': 1969, '1953': 1823, '1963': 897, '1996': 341, '2000': 364, '2003': 296, 
'2005': 244, '2011': 148, '1917': 2025, '1932': 2212, '1936': 1923, '1942': 2314, '1957': 1456, '1958': 1319, '1977': 383, '1979': 367, '1985': 369, '2008': 199, '1916': 1947, '1920': 2538, '1922': 2614, '1926': 2377, '1934': 2219, '1937': 2096, '1951': 1988, '1954': 1777, '1959': 1277, '1973': 399, '1975': 341, '1976': 363, '1991': 378, '1992': 358, '1993': 366, '1994': 352, '2014': 147, '1910': 841, '1914': 1471, '1925': 2558, '1927': 2354, '1935': 2086, '1944': 2314, '1948': 2285, '1961': 1103, '1966': 791, '1968': 591, '1969': 585, '1971': 583, '1981': 380, '1986': 366, '1987': 395, '1988': 346, '1995': 344, '1998': 351, '2009': 194, '2012': 133}}, {'_id': ObjectId('5eb51b8a9993d0175fb9046e'), 'Gender': 'F', 'Name': 'Mary', 'State': 'DE', 'YearsCountDict': {'1917': 161, '1918': 178, '1925': 101, '1928': 116, '1945': 139, '1948': 145, '1958': 158, '1960': 131, '1963': 130, '1971': 46, '1979': 24, '1980': 32, '1983': 27, '1985': 18, '1987': 39, '1989': 32, '2001': 24, '2003': 14, '2006': 5, '1922': 162, '1923': 149, '1933': 71, '1934': 93, '1935': 92, '1939': 98, '1961': 142, '1962': 139, '1965': 98, '1968': 62, '1986': 31, '1992': 21, '1919': 181, '1920': 165, '1929': 115, '1930': 105, '1931': 100, '1936': 102, '1937': 98, '1944': 135, '1946': 124, '1956': 165, '1974': 44, '1978': 34, '1981': 30, '1990': 34, '1997': 21, '2000': 24, '1921': 141, '1924': 132, '1940': 115, '1941': 111, '1949': 135, '1952': 135, '1977': 23, '1984': 24, '1999': 11, '2002': 18, '2004': 13, '2008': 10, '1926': 115, '1927': 120, '1932': 97, '1938': 96, '1951': 132, '1953': 162, '1954': 166, '1959': 166, '1988': 24, '1993': 28, '1994': 35, '1995': 21, '1912': 77, '1913': 80, '1914': 101, '1942': 123, '1947': 139, '1955': 189, '1957': 166, '1966': 93, '1972': 55, '1975': 42, '1976': 39, '1982': 28, '1991': 33, '1996': 16, '2005': 8, '2009': 5, '1915': 141, '1916': 114, '1950': 149, '1964': 128, '1970': 55, '1973': 30, '2010': 6, '2011': 7, '1910': 59, '1911': 49, '1943': 133, '1967': 66, 
'1969': 56, '1998': 25, '2007': 5, '2014': 5}}, {'_id': ObjectId('5eb51b8a9993d0175fb90946'), 'Gender': 'F', 'Name': 'Mary', 'State': 'AR', 'YearsCountDict': {'1911': 386, '1927': 1340, '1944': 1114, '1954': 718, '1965': 339, '1991': 119, '1996': 69, '1998': 86, '1999': 87, '2012': 34, '2014': 34, '1913': 527, '1921': 1190, '1930': 1322, '1938': 1159, '1952': 842, '1953': 762, '1955': 682, '1956': 642, '1961': 441, '1972': 183, '1973': 139, '1992': 89, '1993': 92, '1995': 96, '1997': 97, '2003': 79, '1916': 904, '1924': 1363, '1932': 1249, '1943': 1227, '1945': 971, '1948': 1131, '1949': 1017, '1958': 551, '1959': 502, '1968': 210, '1976': 120, '1977': 145, '1989': 112, '1990': 95, '2005': 68, '2007': 54, '1912': 491, '1915': 837, '1920': 1206, '1928': 1172, '1931': 1172, '1935': 1227, '1939': 1111, '1940': 1189, '1942': 1208, '1966': 295, '1974': 165, '1975': 149, '1979': 120, '1982': 123, '1983': 130, '1987': 97, '1917': 957, '1922': 1283, '1923': 1257, '1951': 893, '1964': 379, '1969': 195, '1970': 228, '1981': 133, '1984': 121, '1986': 119, '2000': 85, '2013': 46, '1914': 645, '1937': 1148, '1950': 975, '1957': 669, '1962': 438, '1963': 389, '1978': 132, '2002': 100, '1918': 1048, '1919': 1101, '1933': 1149, '1934': 1211, '1941': 1223, '1946': 1074, '1947': 1188, '1971': 177, '1980': 139, '1988': 128, '1994': 95, '2001': 88, '2010': 45, '1910': 408, '1925': 1403, '1926': 1321, '1929': 1240, '1936': 1088, '1960': 478, '1967': 255, '1985': 131, '2004': 86, '2006': 57, '2008': 39, '2009': 44, '2011': 33}}, {'_id': ObjectId('5eb51b8a9993d0175fb92004'), 'Gender': 'F', 'Name': 'Mary', 'State': 'NH', 'YearsCountDict': {'1913': 98, '1915': 186, '1919': 181, '1922': 171, '1924': 167, '1926': 137, '1937': 101, '1953': 128, '1970': 56, '1977': 27, '1978': 25, '1981': 37, '1991': 27, '1998': 23, '1999': 27, '2002': 27, '2010': 10, '2014': 16, '1932': 124, '1936': 90, '1943': 117, '1949': 146, '1951': 136, '1954': 121, '1956': 135, '1963': 103, '1982': 37, '1988': 25, 
'1917': 139, '1928': 151, '1933': 113, '1941': 121, '1948': 158, '1962': 102, '1964': 104, '1980': 39, '1983': 22, '1987': 31, '1989': 37, '1993': 26, '1995': 32, '2004': 15, '1910': 50, '1925': 154, '1931': 110, '1935': 96, '1939': 104, '1947': 160, '1950': 135, '1992': 31, '1916': 157, '1927': 136, '1930': 124, '1934': 100, '1940': 112, '1957': 153, '1959': 127, '1967': 77, '1969': 35, '1976': 24, '2000': 16, '2001': 25, '2008': 5, '2012': 5, '1942': 119, '1944': 113, '1952': 110, '1965': 99, '1966': 63, '1972': 35, '1973': 33, '1985': 32, '1990': 31, '1994': 29, '2003': 22, '2006': 14, '2007': 7, '1911': 68, '1912': 95, '1920': 156, '1923': 148, '1938': 107, '1945': 125, '1946': 146, '1955': 131, '1958': 120, '1961': 107, '1971': 37, '1975': 34, '1979': 34, '1986': 30, '1996': 12, '2009': 10, '1914': 138, '1918': 158, '1921': 182, '1929': 107, '1960': 119, '1968': 54, '1974': 44, '1984': 27, '1997': 25, '2005': 26}}, {'_id': ObjectId('5eb51b8a9993d0175fb92d0d'), 'Gender': 'F', 'Name': 'Mary', 'State': 'NV', 'YearsCountDict': {'1959': 46, '1962': 51, '1983': 28, '1997': 22, '2008': 11, '2014': 9, '1910': 10, '1911': 6, '1912': 5, '1913': 21, '1914': 12, '1915': 27, '1925': 36, '1926': 33, '1927': 34, '1928': 25, '1934': 23, '1935': 29, '1936': 32, '1937': 35, '1950': 49, '1952': 60, '1953': 61, '1956': 59, '1967': 52, '1973': 24, '1982': 29, '1992': 25, '2004': 15, '2010': 13, '1920': 28, '1921': 33, '1922': 30, '1923': 29, '1924': 29, '1947': 61, '1954': 65, '1957': 69, '1965': 69, '1976': 20, '1977': 19, '1978': 19, '1980': 28, '1985': 21, '1987': 22, '1990': 24, '2002': 24, '2012': 17, '2013': 19, '1938': 14, '1939': 30, '1940': 22, '1993': 24, '1995': 33, '1999': 29, '2000': 18, '2003': 17, '2007': 16, '1929': 22, '1930': 25, '1931': 21, '1932': 23, '1933': 21, '1945': 55, '1946': 45, '1963': 53, '1964': 58, '1966': 40, '1968': 32, '1970': 32, '1971': 36, '1943': 50, '1944': 50, '1951': 54, '1975': 25, '1981': 28, '1994': 29, '1998': 21, '2001': 22, '1955': 
67, '1984': 27, '1991': 33, '1996': 27, '2005': 20, '2006': 23, '2009': 13, '1916': 40, '1917': 31, '1918': 31, '1919': 37, '1941': 23, '1942': 48, '1948': 48, '1949': 57, '1958': 40, '1960': 50, '1961': 46, '1969': 41, '1972': 36, '1974': 16, '1979': 28, '1986': 21, '1988': 25, '1989': 26, '2011': 10}}, {'_id': ObjectId('5eb51b8b9993d0175fb99714'), 'Gender': 'F', 'Name': 'Mary', 'State': 'OH', 'YearsCountDict': {'1918': 4202, '1927': 4026, '1929': 3554, '1931': 3265, '1952': 3561, '1957': 3459, '1962': 2434, '1987': 377, '1991': 388, '1993': 358, '1996': 327, '2007': 153, '1911': 1277, '1933': 2917, '1946': 3333, '1949': 3377, '1966': 1638, '1992': 406, '2000': 237, '2001': 234, '2013': 96, '2014': 99, '1912': 1868, '1916': 3736, '1941': 2722, '1944': 2885, '1958': 3088, '1964': 2247, '1980': 565, '1995': 348, '1999': 319, '2010': 113, '1914': 2731, '1922': 4384, '1935': 2753, '1939': 2721, '1940': 2692, '1945': 2680, '1953': 3374, '1959': 3053, '1961': 2568, '1969': 1126, '1973': 679, '1977': 559, '1979': 525, '1982': 504, '2005': 186, '1913': 2181, '1924': 4200, '1936': 2767, '1951': 3522, '1963': 2285, '1967': 1403, '1978': 553, '1994': 349, '2004': 174, '2006': 180, '1919': 3814, '1921': 4386, '1928': 4003, '1942': 3035, '1950': 3318, '1985': 428, '2012': 112, '1917': 3948, '1925': 3812, '1930': 3585, '1932': 3090, '1934': 2901, '1955': 3557, '1970': 1097, '1972': 758, '1975': 627, '1986': 399, '1989': 371, '1990': 388, '1998': 294, '2003': 204, '2009': 119, '1910': 1099, '1915': 3653, '1920': 4232, '1923': 4128, '1926': 3806, '1937': 2759, '1938': 2788, '1943': 3166, '1947': 3563, '1948': 3463, '1954': 3754, '1956': 3467, '1960': 2904, '1965': 1881, '1968': 1177, '1971': 963, '1974': 701, '1976': 547, '1981': 591, '1983': 487, '1984': 445, '1988': 373, '1997': 323, '2002': 225, '2008': 114, '2011': 125}}, {'_id': ObjectId('5eb51b8b9993d0175fb9a6fc'), 'Gender': 'F', 'Name': 'Mary', 'State': 'TX', 'YearsCountDict': {'1911': 906, '1929': 3421, '1932': 3181, 
'1935': 3180, '1938': 3382, '1941': 3619, '1945': 3778, '1947': 4211, '1958': 3261, '1959': 3078, '1962': 2435, '1964': 2333, '1969': 1352, '1980': 905, '1993': 551, '1915': 2051, '1923': 3247, '1924': 3529, '1933': 2996, '1942': 3772, '1954': 4081, '1960': 2769, '1961': 2614, '1971': 1257, '1992': 615, '1998': 425, '2002': 381, '2009': 248, '1925': 3535, '1928': 3422, '1934': 3254, '1937': 3122, '1955': 3726, '1972': 968, '1975': 875, '1995': 557, '1910': 895, '1916': 2328, '1920': 3192, '1931': 3302, '2011': 211, '1927': 3607, '1949': 4141, '1950': 4061, '1952': 4153, '1957': 3617, '1963': 2342, '1968': 1423, '1977': 794, '1979': 867, '1984': 748, '1985': 803, '1986': 728, '1988': 659, '1991': 595, '2006': 315, '2013': 203, '1912': 1179, '1917': 2504, '1919': 2740, '1936': 3110, '1939': 3254, '1940': 3375, '1946': 4203, '1951': 3992, '1956': 3621, '1973': 973, '1974': 915, '1976': 770, '1978': 755, '1997': 445, '1999': 477, '2001': 370, '2004': 332, '2007': 270, '2012': 204, '1918': 2589, '1922': 3370, '1926': 3442, '1930': 3395, '1944': 4139, '1948': 4177, '1953': 4154, '1966': 1720, '1967': 1562, '1970': 1352, '1981': 820, '1987': 650, '1996': 463, '2005': 314, '2008': 263, '1913': 1425, '1914': 1707, '1921': 3413, '1943': 4065, '1965': 1891, '1982': 902, '1983': 783, '1989': 665, '1990': 663, '1994': 526, '2000': 428, '2003': 346, '2010': 196, '2014': 220}}, {'_id': ObjectId('5eb51b8b9993d0175fb9d386'), 'Gender': 'F', 'Name': 'Mary', 'State': 'NE', 'YearsCountDict': {'1911': 142, '1915': 371, '1925': 485, '1928': 513, '1933': 476, '1940': 460, '1963': 399, '1973': 87, '1981': 74, '1999': 40, '2008': 21, '1920': 516, '1932': 514, '1934': 561, '1948': 658, '1949': 644, '1952': 639, '1985': 56, '1988': 52, '1991': 51, '2003': 38, '1910': 161, '1926': 483, '1931': 546, '1936': 503, '1958': 478, '1962': 326, '1968': 152, '1998': 50, '2006': 37, '1913': 255, '1938': 510, '1946': 563, '1953': 630, '1972': 108, '1975': 64, '1994': 43, '2001': 33, '2004': 31, '2011': 
16, '2014': 13, '1914': 288, '1947': 624, '1950': 665, '1957': 512, '1966': 197, '1967': 177, '1982': 81, '1987': 67, '1990': 46, '1992': 44, '1993': 58, '2000': 38, '2007': 16, '2012': 22, '1921': 575, '1924': 580, '1941': 421, '1942': 454, '1955': 552, '1956': 618, '1969': 133, '1979': 64, '1980': 78, '1986': 47, '1989': 54, '1995': 47, '1996': 40, '1997': 47, '1916': 413, '1919': 435, '1922': 531, '1923': 533, '1929': 545, '1935': 480, '1939': 458, '1960': 452, '1964': 395, '1965': 284, '1970': 145, '1971': 120, '1976': 72, '1977': 64, '1984': 53, '2005': 22, '2010': 12, '1912': 215, '1917': 439, '1918': 486, '1927': 577, '1930': 535, '1937': 466, '1943': 506, '1944': 530, '1945': 498, '1951': 619, '1954': 620, '1959': 493, '1961': 433, '1974': 78, '1978': 56, '1983': 68, '2002': 28, '2009': 26, '2013': 19}}, {'_id': ObjectId('5eb51b8b9993d0175fba06b3'), 'Gender': 'F', 'Name': 'Mary', 'State': 'NJ', 'YearsCountDict': {'1910': 593, '1919': 1869, '1923': 1786, '1929': 1399, '1933': 1141, '1936': 964, '1938': 1001, '1949': 1324, '1953': 1557, '1965': 1073, '2012': 42, '2013': 44, '1911': 733, '1922': 1801, '1928': 1521, '1937': 1057, '1966': 868, '1983': 200, '1984': 202, '2004': 103, '1915': 1923, '1921': 1922, '1927': 1567, '1931': 1272, '1932': 1265, '1947': 1448, '1960': 1620, '1964': 1242, '1975': 262, '1988': 189, '1999': 148, '2002': 127, '2005': 82, '2007': 70, '1914': 1449, '1917': 2014, '1920': 1885, '1939': 1011, '1943': 1324, '1944': 1240, '1951': 1407, '1955': 1709, '1958': 1725, '1961': 1462, '1970': 491, '1976': 229, '1978': 210, '1987': 193, '1996': 163, '1997': 156, '1998': 135, '2006': 81, '1916': 1882, '1935': 1038, '1942': 1331, '1959': 1567, '1962': 1273, '1968': 608, '1971': 473, '1973': 301, '1974': 263, '1981': 210, '1992': 214, '1994': 169, '2000': 120, '2009': 44, '2011': 47, '2014': 44, '1918': 1990, '1930': 1388, '1940': 1013, '1941': 1128, '1945': 1203, '1950': 1331, '1956': 1772, '1972': 367, '1977': 272, '1985': 199, '1986': 187, 
'2001': 118, '1912': 1024, '1925': 1617, '1934': 1062, '1946': 1420, '1980': 225, '1989': 203, '1990': 202, '2008': 63, '2010': 49, '1913': 1078, '1924': 1771, '1926': 1466, '1948': 1398, '1952': 1467, '1954': 1823, '1957': 1779, '1963': 1300, '1967': 776, '1969': 566, '1979': 225, '1982': 221, '1991': 233, '1993': 190, '1995': 134, '2003': 112}}, {'_id': ObjectId('5eb51b8b9993d0175fba2674'), 'Gender': 'F', 'Name': 'Mary', 'State': 'MT', 'YearsCountDict': {'1912': 108, '1919': 287, '1927': 201, '1931': 184, '1944': 192, '1950': 265, '1952': 252, '1953': 237, '1955': 252, '1959': 211, '1966': 79, '1970': 49, '1976': 33, '1997': 11, '1998': 12, '2010': 8, '1916': 257, '1932': 192, '1943': 204, '1949': 248, '1961': 170, '1962': 135, '1989': 26, '1991': 22, '1917': 289, '1918': 342, '1930': 208, '1942': 209, '1958': 187, '1964': 107, '1969': 48, '1978': 33, '1979': 40, '1984': 35, '1995': 16, '2000': 21, '2005': 10, '2012': 9, '2013': 6, '1929': 211, '1935': 184, '1940': 185, '1945': 169, '1947': 221, '1963': 126, '1981': 41, '1999': 16, '2003': 15, '2009': 7, '1913': 140, '1920': 311, '1922': 281, '1923': 256, '1926': 216, '1928': 200, '1936': 184, '1941': 186, '1975': 30, '1977': 29, '1987': 19, '1990': 24, '1996': 17, '2001': 11, '2014': 8, '1910': 81, '1911': 72, '1921': 326, '1924': 261, '1933': 170, '1948': 228, '1965': 121, '1973': 38, '1974': 31, '1994': 23, '2002': 7, '2004': 5, '2007': 10, '1925': 213, '1934': 232, '1937': 177, '1938': 177, '1946': 240, '1951': 238, '1954': 254, '1957': 239, '1967': 62, '1968': 68, '1980': 31, '1983': 29, '1988': 27, '2008': 10, '1914': 187, '1915': 244, '1939': 164, '1956': 235, '1960': 191, '1971': 51, '1972': 35, '1982': 32, '1985': 16, '1986': 27, '1992': 23, '1993': 18, '2006': 9, '2011': 5}}, {'_id': ObjectId('5eb51b8b9993d0175fba4219'), 'Gender': 'F', 'Name': 'Mary', 'State': 'KY', 'YearsCountDict': {'1911': 827, '1914': 1322, '1918': 2139, '1920': 2338, '1934': 1881, '1945': 1434, '1947': 1814, '1979': 310, '1980': 
352, '1985': 221, '1990': 188, '1991': 187, '1996': 151, '1917': 1913, '1922': 2442, '1923': 2345, '1925': 2341, '1932': 1950, '1948': 1673, '1959': 1131, '1964': 862, '1966': 596, '1972': 369, '1973': 338, '1999': 132, '2002': 119, '2008': 71, '2014': 40, '1913': 1189, '1930': 2001, '1939': 1731, '1954': 1464, '1968': 511, '1970': 462, '1981': 302, '1983': 247, '2007': 85, '2009': 72, '1924': 2537, '1926': 2312, '1933': 1852, '1937': 1725, '1940': 1787, '1944': 1599, '1949': 1593, '1952': 1385, '1953': 1409, '1958': 1127, '1974': 344, '1975': 339, '1978': 305, '1986': 212, '2000': 143, '2011': 62, '1916': 1834, '1921': 2511, '1936': 1690, '1941': 1697, '1946': 1734, '1957': 1198, '1965': 742, '1967': 586, '1971': 461, '1976': 310, '1977': 318, '1992': 185, '2003': 100, '2010': 50, '1928': 2165, '1956': 1245, '1961': 966, '1962': 945, '1969': 446, '1984': 261, '1987': 202, '1997': 160, '1998': 149, '2001': 120, '1915': 1807, '1931': 1875, '1963': 841, '1988': 174, '1989': 211, '1993': 182, '1994': 179, '1995': 162, '2004': 80, '2006': 105, '1910': 793, '1912': 984, '1919': 2161, '1927': 2352, '1929': 1994, '1935': 1811, '1938': 1783, '1942': 1747, '1943': 1678, '1950': 1560, '1951': 1464, '1955': 1311, '1960': 997, '1982': 256, '2005': 97, '2012': 59, '2013': 71}}, {'_id': ObjectId('5eb51b8b9993d0175fba45e4'), 'Gender': 'F', 'Name': 'Mary', 'State': 'ID', 'YearsCountDict': {'1924': 180, '1927': 142, '1970': 46, '1974': 43, '1979': 49, '1980': 56, '1995': 30, '1996': 22, '1915': 149, '1918': 214, '1928': 147, '1939': 144, '1940': 144, '1947': 188, '1952': 186, '1958': 154, '1972': 42, '1982': 45, '1987': 21, '2001': 18, '2003': 18, '1916': 171, '1933': 135, '1937': 158, '1941': 126, '1944': 174, '1945': 169, '1946': 180, '1951': 170, '1959': 134, '1965': 98, '1973': 34, '1975': 51, '1992': 24, '1999': 24, '2000': 28, '2013': 9, '1912': 77, '1913': 91, '1938': 156, '1943': 162, '1953': 184, '1955': 167, '1961': 110, '1962': 94, '1964': 101, '1967': 54, '1990': 29, 
'1991': 21, '1993': 33, '2002': 22, '2008': 11, '1910': 53, '1911': 50, '1922': 223, '1926': 135, '1954': 159, '1957': 149, '1963': 99, '1981': 47, '1994': 23, '1997': 16, '1998': 15, '2005': 17, '1919': 185, '1920': 209, '1921': 207, '1929': 151, '1932': 145, '1948': 177, '1950': 186, '1960': 121, '1976': 37, '1978': 42, '1983': 36, '1985': 37, '1989': 23, '2009': 12, '2010': 7, '2011': 10, '1917': 199, '1931': 159, '1934': 120, '1935': 136, '1936': 145, '1942': 150, '1984': 35, '1988': 32, '2006': 17, '2007': 13, '2012': 11, '2014': 17, '1914': 93, '1923': 195, '1925': 169, '1930': 125, '1949': 198, '1956': 156, '1966': 66, '1968': 56, '1969': 44, '1971': 49, '1977': 54, '1986': 24, '2004': 25}}, {'_id': ObjectId('5eb51b8b9993d0175fba5735'), 'Gender': 'F', 'Name': 'Mary', 'State': 'KS', 'YearsCountDict': {'1925': 926, '1926': 905, '1928': 860, '1936': 711, '1948': 727, '1950': 681, '1959': 537, '1970': 153, '1971': 138, '1975': 107, '1976': 83, '1982': 119, '1983': 86, '1991': 64, '1994': 81, '2001': 49, '1920': 890, '1943': 746, '1974': 90, '2003': 56, '2006': 29, '2013': 32, '1912': 369, '1927': 987, '1929': 845, '1931': 838, '1934': 814, '1937': 649, '1940': 662, '1952': 724, '1954': 655, '1961': 427, '1962': 425, '1964': 358, '1973': 110, '1989': 85, '1999': 50, '2000': 67, '2005': 39, '2014': 33, '1911': 278, '1922': 979, '1933': 771, '1935': 673, '1942': 688, '1946': 739, '1951': 744, '1955': 718, '1958': 552, '1977': 93, '1981': 118, '1990': 62, '1993': 84, '2008': 45, '2009': 27, '1910': 251, '1916': 757, '1919': 828, '1921': 1058, '1923': 982, '1968': 198, '1972': 105, '1986': 83, '1992': 63, '1914': 527, '1917': 856, '1932': 792, '1939': 712, '1944': 745, '1957': 617, '1960': 514, '1965': 273, '1967': 212, '1969': 188, '1978': 110, '1987': 64, '1995': 53, '1996': 73, '1998': 60, '2004': 51, '2010': 25, '1915': 671, '1918': 884, '1924': 952, '1930': 860, '1941': 642, '1945': 644, '1947': 812, '1953': 733, '1963': 396, '1966': 235, '1984': 98, '1985': 87, 
'1988': 73, '2011': 23, '1913': 437, '1938': 737, '1949': 796, '1956': 631, '1979': 89, '1980': 98, '1997': 66, '2002': 44, '2007': 31, '2012': 34}}, {'_id': ObjectId('5eb51b8c9993d0175fba7430'), 'Gender': 'F', 'Name': 'Mary', 'State': 'MA', 'YearsCountDict': {'1916': 2754, '1925': 2505, '1938': 1275, '1939': 1303, '1941': 1439, '1972': 305, '1973': 232, '1974': 214, '1975': 208, '1981': 179, '1983': 193, '1995': 188, '1996': 164, '1999': 152, '2012': 49, '1914': 2497, '1927': 2260, '1945': 1418, '1949': 1732, '1969': 508, '1982': 196, '1987': 176, '1997': 147, '1998': 146, '2001': 130, '2004': 107, '2010': 56, '2013': 45, '2014': 52, '1920': 2971, '1931': 1797, '1977': 209, '1979': 207, '1988': 194, '1989': 193, '1994': 161, '2000': 117, '2003': 104, '2011': 50, '1913': 1874, '1923': 2778, '1926': 2329, '1928': 2152, '1933': 1513, '1934': 1506, '1935': 1424, '1937': 1365, '1940': 1322, '1942': 1623, '1947': 1859, '1950': 1710, '1953': 1751, '1955': 1881, '1957': 1990, '1959': 1754, '1968': 575, '1971': 346, '1986': 134, '1990': 194, '1992': 208, '2009': 61, '1917': 2818, '1919': 2769, '1922': 2847, '1944': 1520, '1961': 1555, '1962': 1373, '1964': 1251, '1970': 437, '1985': 178, '1991': 203, '2002': 135, '1910': 989, '1915': 2794, '1921': 3035, '1930': 1959, '1936': 1429, '1946': 1665, '1948': 1817, '1954': 2133, '1956': 1970, '1958': 1835, '1960': 1693, '1963': 1309, '1965': 1120, '1967': 759, '1976': 168, '1929': 1938, '1932': 1727, '1952': 1774, '1980': 198, '1984': 167, '2006': 70, '2007': 66, '1911': 1248, '1912': 1636, '1918': 3006, '1924': 2747, '1943': 1703, '1951': 1704, '1966': 820, '1978': 206, '1993': 162, '2005': 83, '2008': 54}}, {'_id': ObjectId('5eb51b8c9993d0175fba82d9'), 'Gender': 'F', 'Name': 'Mary', 'State': 'ND', 'YearsCountDict': {'1910': 85, '1915': 196, '1937': 182, '1947': 298, '1953': 284, '1987': 22, '1990': 12, '1919': 220, '1922': 226, '1926': 215, '1939': 199, '1940': 230, '1943': 214, '1946': 295, '1948': 315, '1964': 133, '1965': 
117, '1968': 71, '1969': 66, '1972': 43, '1981': 20, '1995': 20, '1998': 10, '2005': 6, '1913': 114, '1927': 238, '1932': 200, '1945': 227, '1954': 269, '1958': 217, '1963': 143, '1971': 43, '1973': 24, '1977': 34, '1988': 21, '2001': 6, '2006': 6, '1911': 96, '1916': 271, '1920': 230, '1921': 241, '1929': 213, '1952': 269, '1957': 247, '1970': 43, '1975': 30, '1979': 23, '1984': 21, '1997': 8, '2004': 7, '2014': 5, '1914': 142, '1917': 234, '1933': 194, '1934': 202, '1935': 189, '1950': 290, '1956': 289, '1960': 202, '1980': 27, '1992': 11, '1996': 8, '2002': 9, '1918': 199, '1923': 220, '1924': 231, '1928': 216, '1941': 222, '1951': 250, '1959': 231, '1961': 182, '1966': 94, '1982': 24, '2012': 9, '1912': 100, '1930': 210, '1936': 236, '1938': 204, '1942': 228, '1944': 256, '1955': 246, '1962': 169, '1967': 64, '1983': 24, '1985': 23, '1991': 9, '1993': 15, '1994': 8, '2000': 6, '2003': 6, '2007': 7, '1925': 214, '1931': 198, '1949': 269, '1974': 32, '1976': 37, '1978': 30, '1986': 17, '1989': 13, '1999': 7, '2009': 8}}, {'_id': ObjectId('5eb51b8c9993d0175fba8ac8'), 'Gender': 'F', 'Name': 'Mary', 'State': 'WY', 'YearsCountDict': {'1937': 121, '1938': 105, '1952': 116, '1957': 108, '1961': 59, '1966': 48, '1972': 19, '1989': 13, '1998': 11, '2014': 5, '1910': 27, '1911': 30, '1923': 127, '1926': 116, '1945': 102, '1949': 114, '1955': 97, '1960': 84, '1963': 68, '1979': 31, '1980': 24, '2007': 7, '1928': 108, '1929': 106, '1941': 109, '1942': 96, '1956': 96, '1965': 47, '1975': 13, '1982': 28, '1990': 12, '1997': 6, '1917': 89, '1925': 132, '1967': 41, '1976': 16, '1978': 24, '1995': 5, '1996': 8, '1999': 5, '1919': 104, '1922': 130, '1924': 142, '1932': 100, '1933': 76, '1934': 105, '1935': 97, '1962': 52, '1968': 28, '1977': 17, '1981': 36, '1987': 13, '1988': 9, '1991': 13, '2000': 12, '2001': 5, '2002': 9, '2004': 5, '1915': 57, '1916': 95, '1930': 109, '1943': 103, '1948': 113, '1959': 83, '1973': 14, '1983': 24, '1985': 14, '2008': 6, '1920': 112, '1921': 
149, '1944': 104, '1946': 119, '1947': 127, '1950': 137, '1953': 127, '1964': 66, '1974': 31, '1984': 17, '1912': 44, '1913': 50, '1914': 55, '1918': 106, '1927': 124, '1931': 96, '1936': 107, '1939': 95, '1940': 94, '1951': 114, '1954': 135, '1958': 93, '1969': 28, '1970': 31, '1971': 25, '1986': 18, '1992': 11, '2009': 7, '2010': 8}}, {'_id': ObjectId('5eb51b8c9993d0175fba8cbf'), 'Gender': 'F', 'Name': 'Mary', 'State': 'AZ', 'YearsCountDict': {'1914': 133, '1924': 281, '1940': 319, '1946': 385, '1948': 441, '1950': 414, '1961': 378, '1962': 360, '1971': 141, '1987': 95, '2004': 54, '2013': 32, '2014': 30, '1910': 74, '1918': 272, '1921': 266, '1922': 252, '1927': 310, '1931': 298, '1932': 288, '1937': 354, '1959': 420, '1967': 220, '1983': 103, '1990': 89, '1992': 99, '1935': 310, '1944': 383, '1953': 493, '1957': 475, '1964': 368, '1966': 246, '1989': 86, '1994': 76, '2001': 56, '2005': 40, '1919': 290, '1928': 283, '1930': 286, '1938': 334, '1951': 438, '1955': 492, '1958': 458, '1965': 279, '1974': 117, '1976': 107, '1978': 115, '1984': 81, '1985': 90, '1997': 84, '1999': 78, '2002': 64, '2003': 55, '2007': 44, '2010': 43, '1913': 103, '1915': 158, '1947': 426, '1963': 328, '1972': 121, '1975': 106, '1980': 121, '2011': 26, '1936': 297, '1942': 353, '1945': 366, '1949': 411, '1952': 472, '1956': 449, '1960': 409, '1981': 112, '1986': 89, '1988': 113, '1998': 83, '1911': 67, '1912': 100, '1916': 191, '1920': 280, '1923': 272, '1926': 301, '1934': 287, '1968': 195, '1969': 181, '1970': 169, '1973': 151, '1977': 111, '1979': 95, '1982': 108, '1991': 98, '1995': 101, '2009': 35, '1917': 218, '1925': 293, '1929': 303, '1933': 265, '1939': 347, '1941': 349, '1943': 431, '1954': 465, '1993': 106, '1996': 90, '2000': 63, '2006': 66, '2008': 54, '2012': 32}}, {'_id': ObjectId('5eb51b8c9993d0175fba8cd5'), 'Gender': 'F', 'Name': 'Mary', 'State': 'VA', 'YearsCountDict': {'1916': 1652, '1918': 1918, '1919': 1940, '1922': 2010, '1928': 1698, '1932': 1580, '1942': 1621, 
'1945': 1517, '1950': 1447, '1957': 1195, '1963': 909, '1983': 320, '1984': 310, '1986': 284, '2002': 191, '2009': 98, '1925': 1940, '1926': 1844, '1933': 1449, '1939': 1349, '1943': 1627, '1956': 1290, '1977': 327, '1985': 317, '1994': 274, '2006': 139, '2007': 115, '2012': 94, '1947': 1598, '1949': 1600, '1958': 1072, '1966': 774, '1988': 318, '1991': 308, '1997': 233, '2010': 90, '1911': 747, '1923': 2020, '1930': 1658, '1936': 1377, '1941': 1500, '1946': 1561, '1955': 1260, '1959': 1079, '1962': 945, '1976': 286, '1981': 346, '1982': 359, '1990': 327, '1993': 327, '2004': 165, '2005': 176, '2008': 122, '1910': 848, '1931': 1558, '1940': 1426, '1951': 1402, '1964': 896, '1974': 350, '1980': 365, '1987': 267, '1996': 249, '2003': 166, '1912': 998, '1921': 2060, '1934': 1538, '1937': 1452, '1938': 1387, '1948': 1487, '1954': 1368, '1967': 619, '1969': 508, '1992': 313, '1998': 228, '1999': 217, '2011': 99, '2013': 82, '1913': 1060, '1920': 2046, '1924': 2076, '1927': 1789, '1935': 1464, '1944': 1547, '1960': 1098, '1961': 985, '1965': 824, '1968': 552, '1970': 537, '1971': 543, '1978': 297, '1979': 300, '2014': 91, '1914': 1265, '1915': 1576, '1917': 1783, '1929': 1600, '1952': 1389, '1953': 1361, '1972': 395, '1973': 396, '1975': 340, '1989': 298, '1995': 251, '2000': 219, '2001': 209}}, {'_id': ObjectId('5eb51b8c9993d0175fba8d64'), 'Gender': 'F', 'Name': 'Mary', 'State': 'DC', 'YearsCountDict': {'1929': 286, '1954': 520, '1956': 433, '1959': 407, '1963': 307, '1967': 175, '1971': 102, '1975': 65, '1979': 54, '1983': 45, '1988': 38, '1992': 46, '2009': 11, '2010': 17, '1921': 362, '1930': 293, '1931': 258, '1934': 288, '1940': 365, '1941': 400, '1942': 515, '1945': 493, '1948': 519, '1951': 537, '1960': 367, '1985': 54, '1990': 51, '2003': 22, '2004': 26, '2007': 19, '2011': 16, '1918': 247, '1922': 347, '1926': 295, '1937': 330, '1962': 319, '1965': 251, '1969': 137, '1981': 60, '1984': 54, '1987': 55, '1991': 48, '1998': 28, '2012': 18, '1932': 309, '1933': 
294, '1943': 546, '1947': 533, '1976': 49, '1982': 50, '1989': 23, '1993': 46, '1995': 30, '2001': 26, '2002': 27, '2013': 14, '1911': 88, '1912': 121, '1913': 136, '1914': 176, '1916': 200, '1917': 234, '1925': 346, '1949': 491, '1952': 513, '1953': 510, '1955': 526, '1970': 112, '1972': 73, '1974': 50, '2005': 16, '1910': 80, '1920': 255, '1935': 325, '1939': 367, '1946': 559, '1957': 472, '1958': 435, '1973': 50, '1980': 50, '2008': 13, '1915': 195, '1924': 345, '1928': 335, '1944': 520, '1961': 374, '1964': 287, '1966': 221, '1978': 53, '1994': 48, '1997': 19, '2006': 17, '2014': 17, '1919': 274, '1923': 350, '1927': 348, '1936': 320, '1938': 347, '1950': 520, '1968': 144, '1977': 59, '1986': 44, '1996': 26, '1999': 31, '2000': 20}}]
We can also return only part of each document by passing a projection as the second argument to find:
c = list(collection.find({"Name": "Mary", "Gender": "F"}, {'YearsCountDict': 0}))  # Projection: exclude the 'YearsCountDict' field
c[1]
{'_id': ObjectId('5eb51b889993d0175fb60b00'), 'Gender': 'F', 'Name': 'Mary', 'State': 'LA'}
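To make the projection argument more concrete, here is a minimal sketch that emulates its basic semantics on a plain Python dict, so it runs without a MongoDB server. The helper `project` is hypothetical (it is not part of pymongo), the `_id` is a plain string standing in for an ObjectId, and the `YearsCountDict` is trimmed to two entries from the DC record shown earlier. It only covers flat field names, not nested paths or projection operators.

```python
# Plain-Python emulation of simple MongoDB projection rules (illustration only).
# `project` is a hypothetical helper, not a pymongo function.
def project(doc, projection):
    """Return a copy of `doc` filtered the way a simple find() projection would."""
    # `_id` is handled separately: it is returned unless explicitly set to 0.
    flags = {k: v for k, v in projection.items() if k != "_id"}
    keep_id = bool(projection.get("_id", 1))

    if flags and all(flags.values()):
        # Inclusion mode, e.g. {'Name': 1}: keep only the listed fields.
        out = {k: doc[k] for k in flags if k in doc}
    elif not any(flags.values()):
        # Exclusion mode, e.g. {'YearsCountDict': 0}: drop the listed fields.
        out = {k: v for k, v in doc.items() if k not in flags and k != "_id"}
    else:
        # Mixing 0 and 1 is not allowed, except for `_id`.
        raise ValueError("cannot mix inclusion and exclusion (except for _id)")

    if keep_id and "_id" in doc:
        out = {"_id": doc["_id"], **out}
    return out


# Trimmed stand-in for one stored document (string instead of ObjectId).
doc = {
    "_id": "5eb51b8c9993d0175fba8d64",
    "Gender": "F",
    "Name": "Mary",
    "State": "DC",
    "YearsCountDict": {"1910": 80, "1911": 88},
}

without_years = project(doc, {"YearsCountDict": 0})            # exclusion
name_and_state = project(doc, {"Name": 1, "State": 1, "_id": 0})  # inclusion
print(without_years)
print(name_and_state)
```

With pymongo, the inclusion form would be written as `collection.find({"Name": "Mary"}, {"Name": 1, "State": 1, "_id": 0})`, which returns only the `Name` and `State` fields of each matching document.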
# Regex query: return only names that start with 'M' and end with 'y'
query = { "Name": { "$regex": "^M.*y$" } }
list(collection.find(query, {'YearsCountDict':0}))
[{'_id': ObjectId('5eb51b889993d0175fb5e8a9'), 'Gender': 'F', 'Name': 'Macy', 'State': 'CA'}, {'_id': ObjectId('5eb51b889993d0175fb5e9ea'), 'Gender': 'F', 'Name': 'Melody', 'State': 'NH'}, {'_id': ObjectId('5eb51b889993d0175fb5ea0d'), 'Gender': 'F', 'Name': 'Marely', 'State': 'NV'}, {'_id': ObjectId('5eb51b889993d0175fb5eb44'), 'Gender': 'F', 'Name': 'Margery', 'State': 'NJ'}, {'_id': ObjectId('5eb51b889993d0175fb5ebd2'), 'Gender': 'F', 'Name': 'Mahogany', 'State': 'SC'}, {'_id': ObjectId('5eb51b889993d0175fb5ebd9'), 'Gender': 'F', 'Name': 'Marley', 'State': 'NC'}, {'_id': ObjectId('5eb51b889993d0175fb5ed4d'), 'Gender': 'F', 'Name': 'Mazzy', 'State': 'CA'}, {'_id': ObjectId('5eb51b889993d0175fb5ed56'), 'Gender': 'F', 'Name': 'Mallory', 'State': 'SC'}, {'_id': ObjectId('5eb51b889993d0175fb5edf7'), 'Gender': 'F', 'Name': 'Mindy', 'State': 'WA'}, {'_id': ObjectId('5eb51b889993d0175fb5efaf'), 'Gender': 'F', 'Name': 'Mandy', 'State': 'LA'}, {'_id': ObjectId('5eb51b889993d0175fb5f0de'), 'Gender': 'F', 'Name': 'May', 'State': 'LA'}, {'_id': ObjectId('5eb51b889993d0175fb5f127'), 'Gender': 'F', 'Name': 'Milly', 'State': 'WA'}, {'_id': ObjectId('5eb51b889993d0175fb5f195'), 'Gender': 'M', 'Name': 'Mackey', 'State': 'SC'}, {'_id': ObjectId('5eb51b889993d0175fb5f1b5'), 'Gender': 'F', 'Name': 'Merry', 'State': 'CA'}, {'_id': ObjectId('5eb51b889993d0175fb5f277'), 'Gender': 'F', 'Name': 'Maizy', 'State': 'CA'}, {'_id': ObjectId('5eb51b889993d0175fb5f287'), 'Gender': 'F', 'Name': 'Melany', 'State': 'SC'}, {'_id': ObjectId('5eb51b889993d0175fb5f4bd'), 'Gender': 'F', 'Name': 'Maudry', 'State': 'LA'}, {'_id': ObjectId('5eb51b889993d0175fb5f4cc'), 'Gender': 'F', 'Name': 'Mabry', 'State': 'AR'}, {'_id': ObjectId('5eb51b889993d0175fb5f4e2'), 'Gender': 'F', 'Name': 'Mckinley', 'State': 'NC'}, {'_id': ObjectId('5eb51b889993d0175fb5f556'), 'Gender': 'F', 'Name': 'Makenzy', 'State': 'NC'}, {'_id': ObjectId('5eb51b889993d0175fb5f59d'), 'Gender': 'F', 'Name': 'Margy', 'State': 'CA'}, {'_id': 
ObjectId('5eb51b889993d0175fb5f5b1'), 'Gender': 'F', 'Name': 'Macey', 'State': 'AR'}, {'_id': ObjectId('5eb51b889993d0175fb5f5b7'), 'Gender': 'F', 'Name': 'Marley', 'State': 'WA'}, {'_id': ObjectId('5eb51b889993d0175fb5f5f8'), 'Gender': 'F', 'Name': 'May', 'State': 'DE'}, {'_id': ObjectId('5eb51b889993d0175fb5f61e'), 'Gender': 'F', 'Name': 'Melany', 'State': 'IA'}, {'_id': ObjectId('5eb51b889993d0175fb5f735'), 'Gender': 'F', 'Name': 'Mickey', 'State': 'IA'}, {'_id': ObjectId('5eb51b889993d0175fb5f7ae'), 'Gender': 'F', 'Name': 'Marley', 'State': 'AR'}, {'_id': ObjectId('5eb51b889993d0175fb5f810'), 'Gender': 'F', 'Name': 'Mary', 'State': 'CA'}, {'_id': ObjectId('5eb51b889993d0175fb5f967'), 'Gender': 'F', 'Name': 'Mickey', 'State': 'SC'}, {'_id': ObjectId('5eb51b889993d0175fb5fa11'), 'Gender': 'F', 'Name': 'Maisy', 'State': 'FL'}, {'_id': ObjectId('5eb51b889993d0175fb5fb11'), 'Gender': 'F', 'Name': 'Maisy', 'State': 'IL'}, {'_id': ObjectId('5eb51b889993d0175fb5fbfd'), 'Gender': 'F', 'Name': 'Mandy', 'State': 'WA'}, {'_id': ObjectId('5eb51b889993d0175fb5fccc'), 'Gender': 'F', 'Name': 'May', 'State': 'CA'}, {'_id': ObjectId('5eb51b889993d0175fb5fd9c'), 'Gender': 'F', 'Name': 'Macy', 'State': 'NH'}, {'_id': ObjectId('5eb51b889993d0175fb5ffa5'), 'Gender': 'F', 'Name': 'Malory', 'State': 'IL'}, {'_id': ObjectId('5eb51b889993d0175fb6004f'), 'Gender': 'F', 'Name': 'Marjory', 'State': 'FL'}, {'_id': ObjectId('5eb51b889993d0175fb600bb'), 'Gender': 'M', 'Name': 'Marty', 'State': 'NC'}, {'_id': ObjectId('5eb51b889993d0175fb6018d'), 'Gender': 'M', 'Name': 'Marley', 'State': 'CA'}, {'_id': ObjectId('5eb51b889993d0175fb602e4'), 'Gender': 'F', 'Name': 'Mabry', 'State': 'NC'}, {'_id': ObjectId('5eb51b889993d0175fb60364'), 'Gender': 'F', 'Name': 'Mandy', 'State': 'NC'}, {'_id': ObjectId('5eb51b889993d0175fb603e7'), 'Gender': 'F', 'Name': 'Marely', 'State': 'CA'}, {'_id': ObjectId('5eb51b889993d0175fb60496'), 'Gender': 'F', 'Name': 'Mahogany', 'State': 'FL'}, {'_id': 
ObjectId('5eb51b889993d0175fb605fa'), 'Gender': 'F', 'Name': 'Misty', 'State': 'NV'}, {'_id': ObjectId('5eb51b889993d0175fb6067a'), 'Gender': 'M', 'Name': 'Mickey', 'State': 'IA'}, {'_id': ObjectId('5eb51b889993d0175fb606a1'), 'Gender': 'F', 'Name': 'Madeley', 'State': 'CA'}, {'_id': ObjectId('5eb51b889993d0175fb60710'), 'Gender': 'F', 'Name': 'Miley', 'State': 'MD'}, {'_id': ObjectId('5eb51b889993d0175fb60718'), 'Gender': 'F', 'Name': 'Melaney', 'State': 'CA'}, {'_id': ObjectId('5eb51b889993d0175fb60773'), 'Gender': 'F', 'Name': 'Marjory', 'State': 'NJ'}, {'_id': ObjectId('5eb51b889993d0175fb607f5'), 'Gender': 'F', 'Name': 'Molly', 'State': 'IA'}, {'_id': ObjectId('5eb51b889993d0175fb60812'), 'Gender': 'F', 'Name': 'Mackenzy', 'State': 'IL'}, {'_id': ObjectId('5eb51b889993d0175fb60912'), 'Gender': 'M', 'Name': 'Murphy', 'State': 'AR'}, {'_id': ObjectId('5eb51b889993d0175fb60a47'), 'Gender': 'F', 'Name': 'Miley', 'State': 'IL'}, {'_id': ObjectId('5eb51b889993d0175fb60b00'), 'Gender': 'F', 'Name': 'Mary', 'State': 'LA'}, {'_id': ObjectId('5eb51b889993d0175fb60b56'), 'Gender': 'M', 'Name': 'Marley', 'State': 'NC'}, {'_id': ObjectId('5eb51b889993d0175fb60b7e'), 'Gender': 'F', 'Name': 'Marely', 'State': 'NC'}, {'_id': ObjectId('5eb51b889993d0175fb60c21'), 'Gender': 'F', 'Name': 'Mindy', 'State': 'NV'}, {'_id': ObjectId('5eb51b889993d0175fb60c50'), 'Gender': 'F', 'Name': 'Magaly', 'State': 'WA'}, {'_id': ObjectId('5eb51b889993d0175fb60ca1'), 'Gender': 'M', 'Name': 'Murray', 'State': 'MD'}, {'_id': ObjectId('5eb51b889993d0175fb60d36'), 'Gender': 'M', 'Name': 'Mckinley', 'State': 'AR'}, {'_id': ObjectId('5eb51b889993d0175fb60e00'), 'Gender': 'F', 'Name': 'Mendy', 'State': 'IA'}, {'_id': ObjectId('5eb51b889993d0175fb60e0a'), 'Gender': 'M', 'Name': 'Mccoy', 'State': 'FL'}, {'_id': ObjectId('5eb51b889993d0175fb60eb3'), 'Gender': 'F', 'Name': 'Mahogany', 'State': 'NJ'}, {'_id': ObjectId('5eb51b889993d0175fb60fe6'), 'Gender': 'M', 'Name': 'Mary', 'State': 'NJ'}, {'_id': 
ObjectId('5eb51b889993d0175fb610ad'), 'Gender': 'F', 'Name': 'Melony', 'State': 'CA'}, {'_id': ObjectId('5eb51b889993d0175fb611ef'), 'Gender': 'M', 'Name': 'Manny', 'State': 'FL'}, {'_id': ObjectId('5eb51b889993d0175fb61246'), 'Gender': 'F', 'Name': 'Mallary', 'State': 'CA'}, {'_id': ObjectId('5eb51b889993d0175fb61323'), 'Gender': 'F', 'Name': 'Misty', 'State': 'DE'}, {'_id': ObjectId('5eb51b889993d0175fb613c1'), 'Gender': 'M', 'Name': 'Montgomery', 'State': 'WA'}, {'_id': ObjectId('5eb51b889993d0175fb6144e'), 'Gender': 'M', 'Name': 'Marty', 'State': 'AR'}, {'_id': ObjectId('5eb51b889993d0175fb61494'), 'Gender': 'F', 'Name': 'Marely', 'State': 'WA'}, {'_id': ObjectId('5eb51b889993d0175fb614d9'), 'Gender': 'M', 'Name': 'My', 'State': 'CA'}, {'_id': ObjectId('5eb51b889993d0175fb6167d'), 'Gender': 'M', 'Name': 'Monty', 'State': 'NJ'}, {'_id': ObjectId('5eb51b889993d0175fb61682'), 'Gender': 'F', 'Name': 'My', 'State': 'LA'}, {'_id': ObjectId('5eb51b889993d0175fb61718'), 'Gender': 'F', 'Name': 'Marjory', 'State': 'IL'}, {'_id': ObjectId('5eb51b889993d0175fb6172a'), 'Gender': 'F', 'Name': 'Mariely', 'State': 'FL'}, {'_id': ObjectId('5eb51b889993d0175fb61a56'), 'Gender': 'M', 'Name': 'Manny', 'State': 'NJ'}, {'_id': ObjectId('5eb51b889993d0175fb61adf'), 'Gender': 'F', 'Name': 'Mallory', 'State': 'OK'}, {'_id': ObjectId('5eb51b889993d0175fb61b65'), 'Gender': 'F', 'Name': 'Mallory', 'State': 'FL'}, {'_id': ObjectId('5eb51b889993d0175fb61c6f'), 'Gender': 'F', 'Name': 'Meily', 'State': 'CA'}, {'_id': ObjectId('5eb51b889993d0175fb61dde'), 'Gender': 'F', 'Name': 'Merry', 'State': 'NC'}, {'_id': ObjectId('5eb51b889993d0175fb61e8f'), 'Gender': 'F', 'Name': 'Molly', 'State': 'MD'}, {'_id': ObjectId('5eb51b889993d0175fb61f42'), 'Gender': 'M', 'Name': 'Mickey', 'State': 'FL'}, {'_id': ObjectId('5eb51b889993d0175fb61f43'), 'Gender': 'F', 'Name': 'Melany', 'State': 'OK'}, {'_id': ObjectId('5eb51b889993d0175fb61f65'), 'Gender': 'M', 'Name': 'Monty', 'State': 'MD'}, {'_id': 
ObjectId('5eb51b889993d0175fb62116'), 'Gender': 'F', 'Name': 'Makenzy', 'State': 'CA'}, {'_id': ObjectId('5eb51b889993d0175fb62207'), 'Gender': 'F', 'Name': 'Molly', 'State': 'OK'}, {'_id': ObjectId('5eb51b889993d0175fb622a4'), 'Gender': 'F', 'Name': 'Melany', 'State': 'IL'}, {'_id': ObjectId('5eb51b889993d0175fb622c3'), 'Gender': 'F', 'Name': 'Mickey', 'State': 'OK'}, {'_id': ObjectId('5eb51b889993d0175fb62446'), 'Gender': 'F', 'Name': 'Misty', 'State': 'NC'}, {'_id': ObjectId('5eb51b889993d0175fb62469'), 'Gender': 'M', 'Name': 'Monty', 'State': 'IL'}, {'_id': ObjectId('5eb51b889993d0175fb624fa'), 'Gender': 'F', 'Name': 'Mindy', 'State': 'LA'}, {'_id': ObjectId('5eb51b889993d0175fb62504'), 'Gender': 'F', 'Name': 'Marcy', 'State': 'LA'}, {'_id': ObjectId('5eb51b889993d0175fb62559'), 'Gender': 'F', 'Name': 'Margery', 'State': 'MD'}, {'_id': ObjectId('5eb51b889993d0175fb6255c'), 'Gender': 'F', 'Name': 'Marshay', 'State': 'FL'}, {'_id': ObjectId('5eb51b889993d0175fb62617'), 'Gender': 'M', 'Name': 'Mary', 'State': 'OK'}, {'_id': ObjectId('5eb51b889993d0175fb62684'), 'Gender': 'F', 'Name': 'Macy', 'State': 'NC'}, {'_id': ObjectId('5eb51b889993d0175fb62749'), 'Gender': 'F', 'Name': 'Maily', 'State': 'FL'}, {'_id': ObjectId('5eb51b889993d0175fb6279a'), 'Gender': 'M', 'Name': 'Mary', 'State': 'FL'}, {'_id': ObjectId('5eb51b889993d0175fb62917'), 'Gender': 'F', 'Name': 'Mendy', 'State': 'IL'}, {'_id': ObjectId('5eb51b889993d0175fb62955'), 'Gender': 'F', 'Name': 'Melody', 'State': 'WA'}, {'_id': ObjectId('5eb51b889993d0175fb62983'), 'Gender': 'F', 'Name': 'Misty', 'State': 'CA'}, {'_id': ObjectId('5eb51b889993d0175fb629c5'), 'Gender': 'F', 'Name': 'Melody', 'State': 'CA'}, {'_id': ObjectId('5eb51b889993d0175fb62a9d'), 'Gender': 'F', 'Name': 'Mckinley', 'State': 'LA'}, {'_id': ObjectId('5eb51b889993d0175fb62b1a'), 'Gender': 'F', 'Name': 'Makenzy', 'State': 'LA'}, {'_id': ObjectId('5eb51b889993d0175fb62b4b'), 'Gender': 'F', 'Name': 'Marty', 'State': 'CA'}, {'_id': 
ObjectId('5eb51b889993d0175fb62bc4'), 'Gender': 'M', 'Name': 'Mickey', 'State': 'MD'}, {'_id': ObjectId('5eb51b889993d0175fb62d61'), 'Gender': 'F', 'Name': 'Misty', 'State': 'WA'}, {'_id': ObjectId('5eb51b889993d0175fb62d73'), 'Gender': 'F', 'Name': 'Mindy', 'State': 'AR'}, {'_id': ObjectId('5eb51b889993d0175fb62e27'), 'Gender': 'F', 'Name': 'Mendy', 'State': 'MO'}, {'_id': ObjectId('5eb51b889993d0175fb62e3c'), 'Gender': 'F', 'Name': 'Margery', 'State': 'ND'}, {'_id': ObjectId('5eb51b889993d0175fb62f2e'), 'Gender': 'F', 'Name': 'Mallory', 'State': 'CO'}, {'_id': ObjectId('5eb51b889993d0175fb630a8'), 'Gender': 'M', 'Name': 'Mckinley', 'State': 'WV'}, {'_id': ObjectId('5eb51b889993d0175fb630b3'), 'Gender': 'F', 'Name': 'Molly', 'State': 'AZ'}, {'_id': ObjectId('5eb51b889993d0175fb63133'), 'Gender': 'F', 'Name': 'Miley', 'State': 'IN'}, {'_id': ObjectId('5eb51b889993d0175fb632c2'), 'Gender': 'F', 'Name': 'Melany', 'State': 'CO'}, {'_id': ObjectId('5eb51b889993d0175fb63317'), 'Gender': 'M', 'Name': 'Mary', 'State': 'CO'}, {'_id': ObjectId('5eb51b889993d0175fb63332'), 'Gender': 'M', 'Name': 'Maury', 'State': 'MO'}, {'_id': ObjectId('5eb51b889993d0175fb63360'), 'Gender': 'F', 'Name': 'Misty', 'State': 'VT'}, {'_id': ObjectId('5eb51b889993d0175fb6342c'), 'Gender': 'F', 'Name': 'Malory', 'State': 'GA'}, {'_id': ObjectId('5eb51b889993d0175fb634fa'), 'Gender': 'F', 'Name': 'Mickey', 'State': 'CO'}, {'_id': ObjectId('5eb51b889993d0175fb63575'), 'Gender': 'F', 'Name': 'Molly', 'State': 'PA'}, {'_id': ObjectId('5eb51b889993d0175fb63676'), 'Gender': 'F', 'Name': 'Missy', 'State': 'MA'}, {'_id': ObjectId('5eb51b889993d0175fb63810'), 'Gender': 'M', 'Name': 'Mary', 'State': 'IN'}, {'_id': ObjectId('5eb51b889993d0175fb63827'), 'Gender': 'F', 'Name': 'Marjory', 'State': 'GA'}, {'_id': ObjectId('5eb51b889993d0175fb63979'), 'Gender': 'F', 'Name': 'Melody', 'State': 'WV'}, {'_id': ObjectId('5eb51b889993d0175fb63985'), 'Gender': 'M', 'Name': 'Mary', 'State': 'VA'}, {'_id': 
ObjectId('5eb51b889993d0175fb63a85'), 'Gender': 'M', 'Name': 'Mary', 'State': 'DC'}, {'_id': ObjectId('5eb51b889993d0175fb63aab'), 'Gender': 'F', 'Name': 'Mercy', 'State': 'HI'}, {'_id': ObjectId('5eb51b889993d0175fb63b06'), 'Gender': 'F', 'Name': 'Melany', 'State': 'PA'}, {'_id': ObjectId('5eb51b889993d0175fb63b94'), 'Gender': 'F', 'Name': 'Missy', 'State': 'CO'}, {'_id': ObjectId('5eb51b889993d0175fb63c06'), 'Gender': 'F', 'Name': 'Merry', 'State': 'SD'}, {'_id': ObjectId('5eb51b889993d0175fb63c82'), 'Gender': 'F', 'Name': 'Marley', 'State': 'VT'}, {'_id': ObjectId('5eb51b889993d0175fb63cbf'), 'Gender': 'M', 'Name': 'Mccoy', 'State': 'AZ'}, {'_id': ObjectId('5eb51b889993d0175fb63cdc'), 'Gender': 'F', 'Name': 'Marty', 'State': 'SD'}, {'_id': ObjectId('5eb51b889993d0175fb63d95'), 'Gender': 'F', 'Name': 'Mickey', 'State': 'PA'}, {'_id': ObjectId('5eb51b889993d0175fb63e12'), 'Gender': 'M', 'Name': 'Mary', 'State': 'GA'}, {'_id': ObjectId('5eb51b889993d0175fb63ed4'), 'Gender': 'F', 'Name': 'Miley', 'State': 'AK'}, {'_id': ObjectId('5eb51b889993d0175fb63f90'), 'Gender': 'F', 'Name': 'Mckinley', 'State': 'WV'}, {'_id': ObjectId('5eb51b889993d0175fb63fc5'), 'Gender': 'F', 'Name': 'Mary', 'State': 'SD'}, {'_id': ObjectId('5eb51b889993d0175fb6400b'), 'Gender': 'F', 'Name': 'Misty', 'State': 'WV'}, {'_id': ObjectId('5eb51b889993d0175fb64037'), 'Gender': 'M', 'Name': 'Monty', 'State': 'WY'}, {'_id': ObjectId('5eb51b889993d0175fb640d1'), 'Gender': 'F', 'Name': 'Molly', 'State': 'CO'}, {'_id': ObjectId('5eb51b889993d0175fb6410b'), 'Gender': 'M', 'Name': 'Mccoy', 'State': 'GA'}, {'_id': ObjectId('5eb51b889993d0175fb6423f'), 'Gender': 'M', 'Name': 'Monty', 'State': 'GA'}, {'_id': ObjectId('5eb51b889993d0175fb64317'), 'Gender': 'F', 'Name': 'Melany', 'State': 'AZ'}, {'_id': ObjectId('5eb51b889993d0175fb6438f'), 'Gender': 'F', 'Name': 'Melany', 'State': 'VA'}, {'_id': ObjectId('5eb51b889993d0175fb64429'), 'Gender': 'F', 'Name': 'Melany', 'State': 'DC'}, {'_id': 
ObjectId('5eb51b889993d0175fb6450e'), 'Gender': 'F', 'Name': 'Marjory', 'State': 'ND'}, {'_id': ObjectId('5eb51b889993d0175fb64604'), 'Gender': 'F', 'Name': 'Mendy', 'State': 'GA'}, {'_id': ObjectId('5eb51b889993d0175fb64680'), 'Gender': 'F', 'Name': 'Macy', 'State': 'HI'}, {'_id': ObjectId('5eb51b889993d0175fb64765'), 'Gender': 'F', 'Name': 'Marjory', 'State': 'IN'}, {'_id': ObjectId('5eb51b889993d0175fb647df'), 'Gender': 'F', 'Name': 'Maisy', 'State': 'MA'}, {'_id': ObjectId('5eb51b889993d0175fb648e2'), 'Gender': 'F', 'Name': 'Mahogany', 'State': 'MO'}, {'_id': ObjectId('5eb51b889993d0175fb649de'), 'Gender': 'F', 'Name': 'Macy', 'State': 'WV'}, {'_id': ObjectId('5eb51b889993d0175fb64a55'), 'Gender': 'F', 'Name': 'Miley', 'State': 'VA'}, {'_id': ObjectId('5eb51b889993d0175fb64a6f'), 'Gender': 'F', 'Name': 'Marcy', 'State': 'HI'}, {'_id': ObjectId('5eb51b889993d0175fb64a85'), 'Gender': 'F', 'Name': 'Mindy', 'State': 'HI'}, {'_id': ObjectId('5eb51b889993d0175fb64c50'), 'Gender': 'M', 'Name': 'Mccoy', 'State': 'MO'}, {'_id': ObjectId('5eb51b889993d0175fb64ccd'), 'Gender': 'F', 'Name': 'Melany', 'State': 'MO'}, {'_id': ObjectId('5eb51b889993d0175fb64cd4'), 'Gender': 'M', 'Name': 'Monty', 'State': 'MO'}, {'_id': ObjectId('5eb51b889993d0175fb64d97'), 'Gender': 'F', 'Name': 'Mary', 'State': 'WV'}, {'_id': ObjectId('5eb51b889993d0175fb64d9e'), 'Gender': 'M', 'Name': 'Mccoy', 'State': 'ND'}, {'_id': ObjectId('5eb51b889993d0175fb64daf'), 'Gender': 'F', 'Name': 'Mackenzy', 'State': 'MO'}, {'_id': ObjectId('5eb51b889993d0175fb64db0'), 'Gender': 'F', 'Name': 'Macy', 'State': 'VT'}, {'_id': ObjectId('5eb51b889993d0175fb64e6e'), 'Gender': 'F', 'Name': 'Marley', 'State': 'HI'}, {'_id': ObjectId('5eb51b889993d0175fb64fef'), 'Gender': 'F', 'Name': 'Mandy', 'State': 'WV'}, {'_id': ObjectId('5eb51b889993d0175fb64ff9'), 'Gender': 'F', 'Name': 'Mary', 'State': 'HI'}, {'_id': ObjectId('5eb51b889993d0175fb650d3'), 'Gender': 'F', 'Name': 'Makinley', 'State': 'VA'}, {'_id': 
ObjectId('5eb51b889993d0175fb65160'), 'Gender': 'F', 'Name': 'Mindy', 'State': 'SD'}, {'_id': ObjectId('5eb51b889993d0175fb653ad'), 'Gender': 'F', 'Name': 'Marjory', 'State': 'MA'}, {'_id': ObjectId('5eb51b889993d0175fb653e0'), 'Gender': 'F', 'Name': 'May', 'State': 'HI'}, {'_id': ObjectId('5eb51b889993d0175fb65501'), 'Gender': 'M', 'Name': 'Murray', 'State': 'VA'}, {'_id': ObjectId('5eb51b889993d0175fb6550b'), 'Gender': 'F', 'Name': 'Miley', 'State': 'CO'}, {'_id': ObjectId('5eb51b889993d0175fb6553b'), 'Gender': 'M', 'Name': 'Maury', 'State': 'VA'}, {'_id': ObjectId('5eb51b889993d0175fb6566a'), 'Gender': 'F', 'Name': 'Maisy', 'State': 'VA'}, {'_id': ObjectId('5eb51b889993d0175fb65976'), 'Gender': 'F', 'Name': 'Mallory', 'State': 'AZ'}, {'_id': ObjectId('5eb51b889993d0175fb65986'), 'Gender': 'M', 'Name': 'Mickey', 'State': 'MO'}, {'_id': ObjectId('5eb51b889993d0175fb65988'), 'Gender': 'F', 'Name': 'Missy', 'State': 'VA'}, {'_id': ObjectId('5eb51b889993d0175fb659b3'), 'Gender': 'F', 'Name': 'Missy', 'State': 'AZ'}, {'_id': ObjectId('5eb51b889993d0175fb65c5a'), 'Gender': 'F', 'Name': 'Marcy', 'State': 'SD'}, {'_id': ObjectId('5eb51b889993d0175fb65ca9'), 'Gender': 'F', 'Name': 'Mckinsey', 'State': 'VA'}, {'_id': ObjectId('5eb51b889993d0175fb65d56'), 'Gender': 'F', 'Name': 'Missy', 'State': 'GA'}, {'_id': ObjectId('5eb51b889993d0175fb65f3b'), 'Gender': 'F', 'Name': 'Melany', 'State': 'NY'}, {'_id': ObjectId('5eb51b889993d0175fb65f9a'), 'Gender': 'F', 'Name': 'Mabry', 'State': 'AL'}, {'_id': ObjectId('5eb51b889993d0175fb65fb9'), 'Gender': 'M', 'Name': 'Mary', 'State': 'WI'}, {'_id': ObjectId('5eb51b889993d0175fb6616d'), 'Gender': 'M', 'Name': 'Murray', 'State': 'WI'}, {'_id': ObjectId('5eb51b889993d0175fb6625a'), 'Gender': 'F', 'Name': 'Mary', 'State': 'NY'}, {'_id': ObjectId('5eb51b889993d0175fb663bb'), 'Gender': 'F', 'Name': 'Marley', 'State': 'WI'}, {'_id': ObjectId('5eb51b889993d0175fb663e3'), 'Gender': 'M', 'Name': 'Marley', 'State': 'OH'}, {'_id': 
ObjectId('5eb51b889993d0175fb66413'), 'Gender': 'M', 'Name': 'Mccoy', 'State': 'UT'}, {'_id': ObjectId('5eb51b889993d0175fb664ba'), 'Gender': 'F', 'Name': 'Mindy', 'State': 'WI'}, {'_id': ObjectId('5eb51b889993d0175fb664cf'), 'Gender': 'F', 'Name': 'May', 'State': 'OH'}, {'_id': ObjectId('5eb51b889993d0175fb66544'), 'Gender': 'F', 'Name': 'Murphy', 'State': 'TX'}, {'_id': ObjectId('5eb51b889993d0175fb6658d'), 'Gender': 'F', 'Name': 'Missy', 'State': 'TX'}, {'_id': ObjectId('5eb51b889993d0175fb665d3'), 'Gender': 'F', 'Name': 'Mindy', 'State': 'UT'}, {'_id': ObjectId('5eb51b889993d0175fb667c5'), 'Gender': 'F', 'Name': 'Melody', 'State': 'NY'}, {'_id': ObjectId('5eb51b889993d0175fb66815'), 'Gender': 'F', 'Name': 'Maizy', 'State': 'UT'}, {'_id': ObjectId('5eb51b889993d0175fb6687a'), 'Gender': 'F', 'Name': 'Mckinsey', 'State': 'TX'}, {'_id': ObjectId('5eb51b889993d0175fb668bc'), 'Gender': 'F', 'Name': 'Mandy', 'State': 'NY'}, {'_id': ObjectId('5eb51b899993d0175fb66a3c'), 'Gender': 'F', 'Name': 'Malillany', 'State': 'TX'}, {'_id': ObjectId('5eb51b899993d0175fb66b95'), 'Gender': 'M', 'Name': 'Monty', 'State': 'WI'}, {'_id': ObjectId('5eb51b899993d0175fb66c6c'), 'Gender': 'M', 'Name': 'Mickey', 'State': 'MS'}, {'_id': ObjectId('5eb51b899993d0175fb66caa'), 'Gender': 'M', 'Name': 'Mary', 'State': 'AL'}, {'_id': ObjectId('5eb51b899993d0175fb66cca'), 'Gender': 'F', 'Name': 'Macey', 'State': 'NY'}, {'_id': ObjectId('5eb51b899993d0175fb66ccf'), 'Gender': 'M', 'Name': 'Micky', 'State': 'AL'}, {'_id': ObjectId('5eb51b899993d0175fb66ef5'), 'Gender': 'F', 'Name': 'Mickey', 'State': 'UT'}, {'_id': ObjectId('5eb51b899993d0175fb66f24'), 'Gender': 'F', 'Name': 'Merrily', 'State': 'NY'}, {'_id': ObjectId('5eb51b899993d0175fb66fc1'), 'Gender': 'M', 'Name': 'Mary', 'State': 'MS'}, {'_id': ObjectId('5eb51b899993d0175fb67095'), 'Gender': 'F', 'Name': 'Mackenzy', 'State': 'OH'}, {'_id': ObjectId('5eb51b899993d0175fb670b1'), 'Gender': 'F', 'Name': 'Mckay', 'State': 'UT'}, {'_id': 
ObjectId('5eb51b899993d0175fb670c9'), 'Gender': 'M', 'Name': 'Monty', 'State': 'UT'}, {'_id': ObjectId('5eb51b899993d0175fb670e0'), 'Gender': 'F', 'Name': 'Mallary', 'State': 'OH'}, {'_id': ObjectId('5eb51b899993d0175fb67168'), 'Gender': 'F', 'Name': 'Melany', 'State': 'TX'}, {'_id': ObjectId('5eb51b899993d0175fb6718d'), 'Gender': 'F', 'Name': 'Mickey', 'State': 'WI'}, {'_id': ObjectId('5eb51b899993d0175fb6719d'), 'Gender': 'M', 'Name': 'Motty', 'State': 'NY'}, {'_id': ObjectId('5eb51b899993d0175fb671c9'), 'Gender': 'F', 'Name': 'Melony', 'State': 'MS'}, {'_id': ObjectId('5eb51b899993d0175fb67280'), 'Gender': 'F', 'Name': 'Merrily', 'State': 'OH'}, {'_id': ObjectId('5eb51b899993d0175fb672fb'), 'Gender': 'M', 'Name': 'Murphy', 'State': 'OH'}, {'_id': ObjectId('5eb51b899993d0175fb67330'), 'Gender': 'F', 'Name': 'Malory', 'State': 'TN'}, {'_id': ObjectId('5eb51b899993d0175fb67597'), 'Gender': 'M', 'Name': 'Mckinley', 'State': 'MS'}, {'_id': ObjectId('5eb51b899993d0175fb67645'), 'Gender': 'F', 'Name': 'Mercy', 'State': 'TN'}, {'_id': ObjectId('5eb51b899993d0175fb676a8'), 'Gender': 'M', 'Name': 'Murphy', 'State': 'NY'}, {'_id': ObjectId('5eb51b899993d0175fb67819'), 'Gender': 'F', 'Name': 'Makinsey', 'State': 'TX'}, {'_id': ObjectId('5eb51b899993d0175fb6784f'), 'Gender': 'M', 'Name': 'Mckinley', 'State': 'UT'}, {'_id': ObjectId('5eb51b899993d0175fb678b8'), 'Gender': 'M', 'Name': 'Mckay', 'State': 'UT'}, {'_id': ObjectId('5eb51b899993d0175fb67954'), 'Gender': 'F', 'Name': 'Merry', 'State': 'OH'}, {'_id': ObjectId('5eb51b899993d0175fb679f5'), 'Gender': 'F', 'Name': 'Margery', 'State': 'TX'}, {'_id': ObjectId('5eb51b899993d0175fb67b99'), 'Gender': 'F', 'Name': 'Miley', 'State': 'TN'}, {'_id': ObjectId('5eb51b899993d0175fb67c0c'), 'Gender': 'M', 'Name': 'Mickey', 'State': 'UT'}, {'_id': ObjectId('5eb51b899993d0175fb67c48'), 'Gender': 'F', 'Name': 'Marely', 'State': 'TX'}, {'_id': ObjectId('5eb51b899993d0175fb67cd1'), 'Gender': 'F', 'Name': 'Mary', 'State': 'NM'}, {'_id': 
ObjectId('5eb51b899993d0175fb67cf7'), 'Gender': 'F', 'Name': 'Miley', 'State': 'ID'}, {'_id': ObjectId('5eb51b899993d0175fb67d70'), 'Gender': 'F', 'Name': 'Misty', 'State': 'ME'}, {'_id': ObjectId('5eb51b899993d0175fb67e28'), 'Gender': 'F', 'Name': 'Marjory', 'State': 'OH'}, {'_id': ObjectId('5eb51b899993d0175fb67eba'), 'Gender': 'M', 'Name': 'Michaelanthony', 'State': 'NY'}, {'_id': ObjectId('5eb51b899993d0175fb67ef8'), 'Gender': 'F', 'Name': 'Misty', 'State': 'NY'}, {'_id': ObjectId('5eb51b899993d0175fb67f34'), 'Gender': 'M', 'Name': 'Murray', 'State': 'KY'}, {'_id': ObjectId('5eb51b899993d0175fb67f3f'), 'Gender': 'M', 'Name': 'Manley', 'State': 'ME'}, {'_id': ObjectId('5eb51b899993d0175fb6801a'), 'Gender': 'F', 'Name': 'Molly', 'State': 'OH'}, {'_id': ObjectId('5eb51b899993d0175fb68290'), 'Gender': 'M', 'Name': 'Micky', 'State': 'KS'}, {'_id': ObjectId('5eb51b899993d0175fb682c3'), 'Gender': 'F', 'Name': 'Marley', 'State': 'KY'}, {'_id': ObjectId('5eb51b899993d0175fb683d6'), 'Gender': 'F', 'Name': 'Mallory', 'State': 'ME'}, {'_id': ObjectId('5eb51b899993d0175fb684d8'), 'Gender': 'F', 'Name': 'Mindy', 'State': 'KY'}, {'_id': ObjectId('5eb51b899993d0175fb68526'), 'Gender': 'F', 'Name': 'Marely', 'State': 'OH'}, {'_id': ObjectId('5eb51b899993d0175fb68586'), 'Gender': 'F', 'Name': 'Magaly', 'State': 'TX'}, {'_id': ObjectId('5eb51b899993d0175fb686c5'), 'Gender': 'M', 'Name': 'Montgomery', 'State': 'NY'}, {'_id': ObjectId('5eb51b899993d0175fb6873f'), 'Gender': 'M', 'Name': 'Monty', 'State': 'KS'}, {'_id': ObjectId('5eb51b899993d0175fb688e4'), 'Gender': 'F', 'Name': 'Marely', 'State': 'NY'}, {'_id': ObjectId('5eb51b899993d0175fb68908'), 'Gender': 'F', 'Name': 'Mary', 'State': 'ME'}, {'_id': ObjectId('5eb51b899993d0175fb689dc'), 'Gender': 'F', 'Name': 'Makinley', 'State': 'MS'}, {'_id': ObjectId('5eb51b899993d0175fb689f1'), 'Gender': 'F', 'Name': 'Melaney', 'State': 'NY'}, {'_id': ObjectId('5eb51b899993d0175fb68a6e'), 'Gender': 'M', 'Name': 'Moishy', 'State': 'NY'}, 
{'_id': ObjectId('5eb51b899993d0175fb68a89'), 'Gender': 'M', 'Name': 'Mikey', 'State': 'NY'}, {'_id': ObjectId('5eb51b899993d0175fb68b2f'), 'Gender': 'F', 'Name': 'Milady', 'State': 'NY'}, {'_id': ObjectId('5eb51b899993d0175fb68ba4'), 'Gender': 'M', 'Name': 'Mallory', 'State': 'KY'}, {'_id': ObjectId('5eb51b899993d0175fb68bdf'), 'Gender': 'F', 'Name': 'Majesty', 'State': 'NY'}, {'_id': ObjectId('5eb51b899993d0175fb68c18'), 'Gender': 'F', 'Name': 'Marly', 'State': 'TX'}, {'_id': ObjectId('5eb51b899993d0175fb68c2e'), 'Gender': 'M', 'Name': 'Marty', 'State': 'KY'}, {'_id': ObjectId('5eb51b899993d0175fb68c69'), 'Gender': 'M', 'Name': 'Mickey', 'State': 'KY'}, {'_id': ObjectId('5eb51b899993d0175fb68c86'), 'Gender': 'M', 'Name': 'Matty', 'State': 'NY'}, {'_id': ObjectId('5eb51b899993d0175fb68ccb'), 'Gender': 'M', 'Name': 'Mackey', 'State': 'TX'}, {'_id': ObjectId('5eb51b899993d0175fb68d44'), 'Gender': 'F', 'Name': 'Mckinzy', 'State': 'TX'}, {'_id': ObjectId('5eb51b899993d0175fb68d50'), 'Gender': 'F', 'Name': 'Macey', 'State': 'NM'}, {'_id': ObjectId('5eb51b899993d0175fb68ea7'), 'Gender': 'M', 'Name': 'Monty', 'State': 'TN'}, {'_id': ObjectId('5eb51b899993d0175fb68f48'), 'Gender': 'M', 'Name': 'Murray', 'State': 'MS'}, {'_id': ObjectId('5eb51b899993d0175fb68fd4'), 'Gender': 'M', 'Name': 'Marty', 'State': 'ID'}, {'_id': ObjectId('5eb51b899993d0175fb690a6'), 'Gender': 'F', 'Name': 'Magaby', 'State': 'NY'}, {'_id': ObjectId('5eb51b899993d0175fb690c5'), 'Gender': 'F', 'Name': 'Mickey', 'State': 'KY'}, {'_id': ObjectId('5eb51b899993d0175fb690d8'), 'Gender': 'F', 'Name': 'Misty', 'State': 'TX'}, {'_id': ObjectId('5eb51b899993d0175fb690fb'), 'Gender': 'F', 'Name': 'Marley', 'State': 'AL'}, {'_id': ObjectId('5eb51b899993d0175fb69142'), 'Gender': 'M', 'Name': 'Monty', 'State': 'ID'}, {'_id': ObjectId('5eb51b899993d0175fb69220'), 'Gender': 'F', 'Name': 'Melony', 'State': 'KY'}, {'_id': ObjectId('5eb51b899993d0175fb69325'), 'Gender': 'M', 'Name': 'Mickey', 'State': 'KS'}, {'_id': 
ObjectId('5eb51b899993d0175fb6945b'), 'Gender': 'F', 'Name': 'Modesty', 'State': 'TX'}, {'_id': ObjectId('5eb51b899993d0175fb69615'), 'Gender': 'F', 'Name': 'Miley', 'State': 'UT'}, {'_id': ObjectId('5eb51b899993d0175fb6961f'), 'Gender': 'M', 'Name': 'Mckinley', 'State': 'KY'}, {'_id': ObjectId('5eb51b899993d0175fb6965c'), 'Gender': 'F', 'Name': 'Meloney', 'State': 'TX'}, {'_id': ObjectId('5eb51b899993d0175fb696ad'), 'Gender': 'F', 'Name': 'Melany', 'State': 'OH'}, {'_id': ObjectId('5eb51b899993d0175fb6983f'), 'Gender': 'M', 'Name': 'Mary', 'State': 'KY'}, {'_id': ObjectId('5eb51b899993d0175fb69b4d'), 'Gender': 'F', 'Name': 'Mckinley', 'State': 'NE'}, {'_id': ObjectId('5eb51b899993d0175fb69c37'), 'Gender': 'F', 'Name': 'Marley', 'State': 'PA'}, {'_id': ObjectId('5eb51b899993d0175fb69c8c'), 'Gender': 'F', 'Name': 'Molly', 'State': 'RI'}, {'_id': ObjectId('5eb51b899993d0175fb69e57'), 'Gender': 'F', 'Name': 'Makenzy', 'State': 'GA'}, {'_id': ObjectId('5eb51b899993d0175fb69ee8'), 'Gender': 'M', 'Name': 'Marty', 'State': 'WY'}, {'_id': ObjectId('5eb51b899993d0175fb69f9c'), 'Gender': 'F', 'Name': 'Mary', 'State': 'MI'}, {'_id': ObjectId('5eb51b899993d0175fb69fff'), 'Gender': 'F', 'Name': 'Melody', 'State': 'OR'}, {'_id': ObjectId('5eb51b899993d0175fb6a162'), 'Gender': 'F', 'Name': 'Marty', 'State': 'VA'}, {'_id': ObjectId('5eb51b899993d0175fb6a16d'), 'Gender': 'M', 'Name': 'Marty', 'State': 'DC'}, {'_id': ObjectId('5eb51b899993d0175fb6a170'), 'Gender': 'F', 'Name': 'May', 'State': 'MI'}, {'_id': ObjectId('5eb51b899993d0175fb6a195'), 'Gender': 'F', 'Name': 'Merry', 'State': 'CT'}, {'_id': ObjectId('5eb51b899993d0175fb6a21c'), 'Gender': 'F', 'Name': 'Mercy', 'State': 'CO'}, {'_id': ObjectId('5eb51b899993d0175fb6a2e9'), 'Gender': 'F', 'Name': 'Mary', 'State': 'MN'}, {'_id': ObjectId('5eb51b899993d0175fb6a346'), 'Gender': 'F', 'Name': 'Misty', 'State': 'MN'}, {'_id': ObjectId('5eb51b899993d0175fb6a453'), 'Gender': 'F', 'Name': 'Miley', 'State': 'WV'}, {'_id': 
ObjectId('5eb51b899993d0175fb6a456'), 'Gender': 'F', 'Name': 'Murphy', 'State': 'MI'}, {'_id': ObjectId('5eb51b899993d0175fb6a4db'), 'Gender': 'F', 'Name': 'Milly', 'State': 'GA'}, {'_id': ObjectId('5eb51b899993d0175fb6a5b4'), 'Gender': 'F', 'Name': 'Melony', 'State': 'VA'}, {'_id': ObjectId('5eb51b899993d0175fb6a602'), 'Gender': 'F', 'Name': 'Marley', 'State': 'AZ'}, {'_id': ObjectId('5eb51b899993d0175fb6a628'), 'Gender': 'F', 'Name': 'Miley', 'State': 'HI'}, {'_id': ObjectId('5eb51b899993d0175fb6a640'), 'Gender': 'F', 'Name': 'Mindy', 'State': 'VA'}, {'_id': ObjectId('5eb51b899993d0175fb6a67b'), 'Gender': 'F', 'Name': 'Marely', 'State': 'NE'}, {'_id': ObjectId('5eb51b899993d0175fb6a681'), 'Gender': 'M', 'Name': 'Micky', 'State': 'GA'}, {'_id': ObjectId('5eb51b899993d0175fb6a7b9'), 'Gender': 'F', 'Name': 'Marty', 'State': 'MO'}, {'_id': ObjectId('5eb51b899993d0175fb6a7dc'), 'Gender': 'F', 'Name': 'Melody', 'State': 'MN'}, {'_id': ObjectId('5eb51b899993d0175fb6a8e4'), 'Gender': 'F', 'Name': 'Marcy', 'State': 'NE'}, {'_id': ObjectId('5eb51b899993d0175fb6a9c4'), 'Gender': 'F', 'Name': 'Miley', 'State': 'SD'}, {'_id': ObjectId('5eb51b899993d0175fb6aa91'), 'Gender': 'M', 'Name': 'Marty', 'State': 'MA'}, {'_id': ObjectId('5eb51b899993d0175fb6abf1'), 'Gender': 'F', 'Name': 'Mindy', 'State': 'ND'}, {'_id': ObjectId('5eb51b899993d0175fb6ac09'), 'Gender': 'F', 'Name': 'Macey', 'State': 'MN'}, {'_id': ObjectId('5eb51b899993d0175fb6ad58'), 'Gender': 'F', 'Name': 'May', 'State': 'NE'}, {'_id': ObjectId('5eb51b899993d0175fb6ae18'), 'Gender': 'F', 'Name': 'Mindy', 'State': 'AK'}, {'_id': ObjectId('5eb51b899993d0175fb6aea0'), 'Gender': 'F', 'Name': 'Misty', 'State': 'OR'}, {'_id': ObjectId('5eb51b899993d0175fb6aede'), 'Gender': 'F', 'Name': 'Melony', 'State': 'MA'}, {'_id': ObjectId('5eb51b899993d0175fb6afc8'), 'Gender': 'M', 'Name': 'Murphy', 'State': 'MN'}, {'_id': ObjectId('5eb51b899993d0175fb6b2e4'), 'Gender': 'F', 'Name': 'Marely', 'State': 'OR'}, {'_id': 
ObjectId('5eb51b899993d0175fb6b30d'), 'Gender': 'F', 'Name': 'Melanny', 'State': 'AZ'}, {'_id': ObjectId('5eb51b899993d0175fb6b318'), 'Gender': 'F', 'Name': 'Marley', 'State': 'ND'}, {'_id': ObjectId('5eb51b899993d0175fb6b477'), 'Gender': 'M', 'Name': 'Micky', 'State': 'IN'}, {'_id': ObjectId('5eb51b899993d0175fb6b480'), 'Gender': 'F', 'Name': 'Mary', 'State': 'CT'}, {'_id': ObjectId('5eb51b899993d0175fb6b501'), 'Gender': 'F', 'Name': 'Macy', 'State': 'NE'}, {'_id': ObjectId('5eb51b899993d0175fb6b51f'), 'Gender': 'F', 'Name': 'Mazzy', 'State': 'OR'}, {'_id': ObjectId('5eb51b899993d0175fb6b5b4'), 'Gender': 'F', 'Name': 'Misty', 'State': 'CT'}, {'_id': ObjectId('5eb51b899993d0175fb6b6fb'), 'Gender': 'F', 'Name': 'Marcey', 'State': 'PA'}, {'_id': ObjectId('5eb51b899993d0175fb6b746'), 'Gender': 'F', 'Name': 'Merry', 'State': 'NE'}, {'_id': ObjectId('5eb51b899993d0175fb6b7d4'), 'Gender': 'F', 'Name': 'Mindy', 'State': 'IN'}, {'_id': ObjectId('5eb51b899993d0175fb6b7f4'), 'Gender': 'M', 'Name': 'Mickey', 'State': 'SD'}, {'_id': ObjectId('5eb51b899993d0175fb6b810'), 'Gender': 'F', 'Name': 'Mercy', 'State': 'MA'}, {'_id': ObjectId('5eb51b899993d0175fb6b8a6'), 'Gender': 'F', 'Name': 'Mercy', 'State': 'PA'}, {'_id': ObjectId('5eb51b899993d0175fb6b8e4'), 'Gender': 'M', 'Name': 'Mickey', 'State': 'WV'}, {'_id': ObjectId('5eb51b899993d0175fb6b92e'), 'Gender': 'M', 'Name': 'Marley', 'State': 'MN'}, {'_id': ObjectId('5eb51b899993d0175fb6b9e0'), 'Gender': 'F', 'Name': 'Misty', 'State': 'MI'}, {'_id': ObjectId('5eb51b899993d0175fb6bac1'), 'Gender': 'F', 'Name': 'Mercy', 'State': 'GA'}, {'_id': ObjectId('5eb51b899993d0175fb6bb1f'), 'Gender': 'F', 'Name': 'Makenzy', 'State': 'IN'}, {'_id': ObjectId('5eb51b899993d0175fb6bd62'), 'Gender': 'M', 'Name': 'Mary', 'State': 'WA'}, {'_id': ObjectId('5eb51b899993d0175fb6bd65'), 'Gender': 'F', 'Name': 'Miley', 'State': 'NH'}, {'_id': ObjectId('5eb51b899993d0175fb6be7b'), 'Gender': 'F', 'Name': 'Mindy', 'State': 'SC'}, {'_id': 
ObjectId('5eb51b899993d0175fb6bf18'), 'Gender': 'F', 'Name': 'Macy', 'State': 'IA'}, {'_id': ObjectId('5eb51b899993d0175fb6bfd6'), 'Gender': 'F', 'Name': 'Mey', 'State': 'CA'}, {'_id': ObjectId('5eb51b899993d0175fb6c081'), 'Gender': 'F', 'Name': 'Makinley', 'State': 'AR'}, {'_id': ObjectId('5eb51b899993d0175fb6c1b0'), 'Gender': 'M', 'Name': 'Mccoy', 'State': 'CA'}, {'_id': ObjectId('5eb51b899993d0175fb6c253'), 'Gender': 'F', 'Name': 'Mackenzy', 'State': 'CA'}, {'_id': ObjectId('5eb51b899993d0175fb6c439'), 'Gender': 'F', 'Name': 'Missy', 'State': 'NC'}, {'_id': ObjectId('5eb51b899993d0175fb6c61d'), 'Gender': 'F', 'Name': 'Magaly', 'State': 'FL'}, {'_id': ObjectId('5eb51b899993d0175fb6c67f'), 'Gender': 'F', 'Name': 'Marley', 'State': 'IA'}, {'_id': ObjectId('5eb51b899993d0175fb6c708'), 'Gender': 'F', 'Name': 'Marcy', 'State': 'SC'}, {'_id': ObjectId('5eb51b899993d0175fb6c735'), 'Gender': 'F', 'Name': 'Macey', 'State': 'SC'}, {'_id': ObjectId('5eb51b899993d0175fb6c84d'), 'Gender': 'F', 'Name': 'Maleny', 'State': 'CA'}, {'_id': ObjectId('5eb51b899993d0175fb6cae1'), 'Gender': 'F', 'Name': 'May', 'State': 'SC'}, {'_id': ObjectId('5eb51b899993d0175fb6cc36'), 'Gender': 'F', 'Name': 'Mickey', 'State': 'WA'}, {'_id': ObjectId('5eb51b899993d0175fb6cdf8'), 'Gender': 'F', 'Name': 'Mahogany', 'State': 'LA'}, {'_id': ObjectId('5eb51b899993d0175fb6d1a0'), 'Gender': 'F', 'Name': 'Marty', 'State': 'IL'}, {'_id': ObjectId('5eb51b899993d0175fb6d235'), 'Gender': 'F', 'Name': 'Marty', 'State': 'OK'}, {'_id': ObjectId('5eb51b899993d0175fb6d26b'), 'Gender': 'F', 'Name': 'Mickey', 'State': 'CA'}, {'_id': ObjectId('5eb51b899993d0175fb6d2b6'), 'Gender': 'F', 'Name': 'Misty', 'State': 'MD'}, {'_id': ObjectId('5eb51b899993d0175fb6d2cb'), 'Gender': 'F', 'Name': 'Marty', 'State': 'FL'}, {'_id': ObjectId('5eb51b899993d0175fb6d33a'), 'Gender': 'F', 'Name': 'May', 'State': 'IA'}, {'_id': ObjectId('5eb51b899993d0175fb6d378'), 'Gender': 'M', 'Name': 'Murray', 'State': 'LA'}, {'_id': 
ObjectId('5eb51b899993d0175fb6d3e2'), 'Gender': 'F', 'Name': 'Misty', 'State': 'IL'}, {'_id': ObjectId('5eb51b899993d0175fb6d470'), 'Gender': 'F', 'Name': 'Marykay', 'State': 'IL'}, {'_id': ObjectId('5eb51b899993d0175fb6d60a'), 'Gender': 'F', 'Name': 'Marshay', 'State': 'CA'}, {'_id': ObjectId('5eb51b899993d0175fb6d661'), 'Gender': 'M', 'Name': 'Markanthony', 'State': 'CA'}, {'_id': ObjectId('5eb51b899993d0175fb6d669'), 'Gender': 'M', 'Name': 'Monty', 'State': 'NC'}, {'_id': ObjectId('5eb51b899993d0175fb6d786'), 'Gender': 'F', 'Name': 'Molly', 'State': 'AR'}, {'_id': ObjectId('5eb51b899993d0175fb6d8cb'), 'Gender': 'F', 'Name': 'Mercy', 'State': 'IL'}, {'_id': ObjectId('5eb51b899993d0175fb6d8f3'), 'Gender': 'F', 'Name': 'Macy', 'State': 'OK'}, {'_id': ObjectId('5eb51b899993d0175fb6da59'), 'Gender': 'M', 'Name': 'Mary', 'State': 'NC'}, {'_id': ObjectId('5eb51b899993d0175fb6da73'), 'Gender': 'F', 'Name': 'Mercy', 'State': 'OK'}, {'_id': ObjectId('5eb51b899993d0175fb6da7d'), 'Gender': 'F', 'Name': 'Melody', 'State': 'FL'}, {'_id': ObjectId('5eb51b899993d0175fb6da95'), 'Gender': 'F', 'Name': 'Mendy', 'State': 'NC'}, {'_id': ObjectId('5eb51b899993d0175fb6dbae'), 'Gender': 'F', 'Name': 'Mily', 'State': 'FL'}, {'_id': ObjectId('5eb51b899993d0175fb6dcd8'), 'Gender': 'F', 'Name': 'Mily', 'State': 'IL'}, {'_id': ObjectId('5eb51b899993d0175fb6df5c'), 'Gender': 'F', 'Name': 'Maggy', 'State': 'CA'}, {'_id': ObjectId('5eb51b899993d0175fb6e04a'), 'Gender': 'F', 'Name': 'Magaby', 'State': 'NC'}, {'_id': ObjectId('5eb51b899993d0175fb6e0d2'), 'Gender': 'F', 'Name': 'Molly', 'State': 'LA'}, {'_id': ObjectId('5eb51b899993d0175fb6e3d8'), 'Gender': 'F', 'Name': 'Marley', 'State': 'FL'}, {'_id': ObjectId('5eb51b899993d0175fb6e527'), 'Gender': 'F', 'Name': 'Margery', 'State': 'WA'}, {'_id': ObjectId('5eb51b899993d0175fb6e555'), 'Gender': 'M', 'Name': 'Murry', 'State': 'AR'}, {'_id': ObjectId('5eb51b899993d0175fb6e572'), 'Gender': 'M', 'Name': 'Mckinley', 'State': 'SC'}, {'_id': 
ObjectId('5eb51b899993d0175fb6e5d2'), 'Gender': 'F', 'Name': 'Marely', 'State': 'IA'}, {'_id': ObjectId('5eb51b899993d0175fb6e61c'), 'Gender': 'F', 'Name': 'Mellany', 'State': 'CA'}, {'_id': ObjectId('5eb51b899993d0175fb6e6d7'), 'Gender': 'F', 'Name': 'Majesty', 'State': 'CA'}, {'_id': ObjectId('5eb51b899993d0175fb6e7ea'), 'Gender': 'F', 'Name': 'Marelly', 'State': 'CA'}, {'_id': ObjectId('5eb51b899993d0175fb6e83f'), 'Gender': 'F', 'Name': 'Marcy', 'State': 'MD'}, {'_id': ObjectId('5eb51b899993d0175fb6e8b9'), 'Gender': 'F', 'Name': 'Mallary', 'State': 'IA'}, {'_id': ObjectId('5eb51b899993d0175fb6e92e'), 'Gender': 'F', 'Name': 'Marley', 'State': 'MD'}, {'_id': ObjectId('5eb51b899993d0175fb6eb0b'), 'Gender': 'F', 'Name': 'Mckinley', 'State': 'FL'}, {'_id': ObjectId('5eb51b899993d0175fb6ec32'), 'Gender': 'F', 'Name': 'Margery', 'State': 'DE'}, {'_id': ObjectId('5eb51b899993d0175fb6ec3d'), 'Gender': 'M', 'Name': 'Murphy', 'State': 'IL'}, {'_id': ObjectId('5eb51b899993d0175fb6ec6a'), 'Gender': 'F', 'Name': 'Makenzy', 'State': 'IL'}, {'_id': ObjectId('5eb51b899993d0175fb6ece6'), 'Gender': 'F', 'Name': 'Marcy', 'State': 'NJ'}, {'_id': ObjectId('5eb51b899993d0175fb6ed09'), 'Gender': 'F', 'Name': 'Mindy', 'State': 'FL'}, {'_id': ObjectId('5eb51b899993d0175fb6ef00'), 'Gender': 'F', 'Name': 'Milly', 'State': 'OK'}, {'_id': ObjectId('5eb51b899993d0175fb6ef95'), 'Gender': 'F', 'Name': 'Milly', 'State': 'FL'}, {'_id': ObjectId('5eb51b899993d0175fb6f0f3'), 'Gender': 'F', 'Name': 'Mckinley', 'State': 'SC'}, {'_id': ObjectId('5eb51b899993d0175fb6f18f'), 'Gender': 'F', 'Name': 'Marigny', 'State': 'LA'}, {'_id': ObjectId('5eb51b899993d0175fb6f190'), 'Gender': 'M', 'Name': 'Malachy', 'State': 'IL'}, {'_id': ObjectId('5eb51b899993d0175fb6f1e9'), 'Gender': 'F', 'Name': 'Melody', 'State': 'SC'}, {'_id': ObjectId('5eb51b899993d0175fb6f214'), 'Gender': 'M', 'Name': 'Marley', 'State': 'NJ'}, {'_id': ObjectId('5eb51b899993d0175fb6f2ab'), 'Gender': 'F', 'Name': 'Mariely', 'State': 'CA'}, 
{'_id': ObjectId('5eb51b899993d0175fb6f318'), 'Gender': 'F', 'Name': 'Melody', 'State': 'IA'}, {'_id': ObjectId('5eb51b899993d0175fb6f54b'), 'Gender': 'M', 'Name': 'Moody', 'State': 'NC'}, {'_id': ObjectId('5eb51b899993d0175fb6f602'), 'Gender': 'M', 'Name': 'Marty', 'State': 'IL'}, {'_id': ObjectId('5eb51b899993d0175fb6f7f7'), 'Gender': 'F', 'Name': 'Margery', 'State': 'LA'}, {'_id': ObjectId('5eb51b899993d0175fb6f951'), 'Gender': 'F', 'Name': 'Mahogany', 'State': 'NC'}, {'_id': ObjectId('5eb51b899993d0175fb6fa25'), 'Gender': 'F', 'Name': 'Melanny', 'State': 'FL'}, {'_id': ObjectId('5eb51b899993d0175fb6faa6'), 'Gender': 'F', 'Name': 'Molly', 'State': 'NV'}, {'_id': ObjectId('5eb51b899993d0175fb6faed'), 'Gender': 'F', 'Name': 'Marely', 'State': 'MD'}, {'_id': ObjectId('5eb51b899993d0175fb6fb54'), 'Gender': 'F', 'Name': 'May', 'State': 'FL'}, {'_id': ObjectId('5eb51b899993d0175fb6fbd1'), 'Gender': 'F', 'Name': 'Makinsey', 'State': 'FL'}, {'_id': ObjectId('5eb51b899993d0175fb6fc17'), 'Gender': 'M', 'Name': 'Marty', 'State': 'NJ'}, {'_id': ObjectId('5eb51b899993d0175fb6fc3d'), 'Gender': 'F', 'Name': 'Melony', 'State': 'FL'}, {'_id': ObjectId('5eb51b899993d0175fb6fceb'), 'Gender': 'F', 'Name': 'Marty', 'State': 'IA'}, {'_id': ObjectId('5eb51b899993d0175fb6fe0d'), 'Gender': 'F', 'Name': 'Macey', 'State': 'IL'}, {'_id': ObjectId('5eb51b899993d0175fb6ff4b'), 'Gender': 'F', 'Name': 'Misty', 'State': 'IA'}, {'_id': ObjectId('5eb51b899993d0175fb6ff51'), 'Gender': 'M', 'Name': 'Mckinley', 'State': 'OK'}, {'_id': ObjectId('5eb51b899993d0175fb6ff61'), 'Gender': 'F', 'Name': 'Marely', 'State': 'FL'}, {'_id': ObjectId('5eb51b899993d0175fb6ffe3'), 'Gender': 'M', 'Name': 'Mckinley', 'State': 'FL'}, {'_id': ObjectId('5eb51b899993d0175fb70047'), 'Gender': 'M', 'Name': 'Macy', 'State': 'NC'}, {'_id': ObjectId('5eb51b899993d0175fb7006d'), 'Gender': 'F', 'Name': 'My', 'State': 'OK'}, {'_id': ObjectId('5eb51b899993d0175fb700b4'), 'Gender': 'F', 'Name': 'Marely', 'State': 'IL'}, {'_id': 
ObjectId('5eb51b899993d0175fb700cf'), 'Gender': 'F', 'Name': 'Mahogany', 'State': 'CA'}, {'_id': ObjectId('5eb51b899993d0175fb70115'), 'Gender': 'M', 'Name': 'Mckinley', 'State': 'IL'}, {'_id': ObjectId('5eb51b899993d0175fb70148'), 'Gender': 'M', 'Name': 'Marley', 'State': 'OK'}, {'_id': ObjectId('5eb51b899993d0175fb7014a'), 'Gender': 'F', 'Name': 'Marely', 'State': 'OK'}, {'_id': ObjectId('5eb51b899993d0175fb70152'), 'Gender': 'M', 'Name': 'Macaulay', 'State': 'CA'}, {'_id': ObjectId('5eb51b899993d0175fb70173'), 'Gender': 'M', 'Name': 'Marcanthony', 'State': 'NV'}, {'_id': ObjectId('5eb51b899993d0175fb701eb'), 'Gender': 'F', 'Name': 'Mandy', 'State': 'MT'}, {'_id': ObjectId('5eb51b899993d0175fb701f5'), 'Gender': 'M', 'Name': 'Murray', 'State': 'CT'}, {'_id': ObjectId('5eb51b899993d0175fb70203'), 'Gender': 'F', 'Name': 'Maisy', 'State': 'MN'}, {'_id': ObjectId('5eb51b899993d0175fb70235'), 'Gender': 'F', 'Name': 'May', 'State': 'MT'}, {'_id': ObjectId('5eb51b899993d0175fb7035c'), 'Gender': 'M', 'Name': 'Mary', 'State': 'MI'}, {'_id': ObjectId('5eb51b899993d0175fb7039a'), 'Gender': 'F', 'Name': 'Melany', 'State': 'MI'}, {'_id': ObjectId('5eb51b899993d0175fb704c0'), 'Gender': 'F', 'Name': 'Miley', 'State': 'OR'}, {'_id': ObjectId('5eb51b899993d0175fb70831'), 'Gender': 'M', 'Name': 'Mickey', 'State': 'CT'}, {'_id': ObjectId('5eb51b899993d0175fb7084e'), 'Gender': 'M', 'Name': 'Mary', 'State': 'NE'}, {'_id': ObjectId('5eb51b899993d0175fb70915'), 'Gender': 'F', 'Name': 'Marcy', 'State': 'RI'}, {'_id': ObjectId('5eb51b899993d0175fb7096f'), 'Gender': 'F', 'Name': 'Mindy', 'State': 'RI'}, {'_id': ObjectId('5eb51b899993d0175fb709b2'), 'Gender': 'F', 'Name': 'Marjory', 'State': 'MN'}, {'_id': ObjectId('5eb51b899993d0175fb709d0'), 'Gender': 'F', 'Name': 'Missy', 'State': 'OR'}, {'_id': ObjectId('5eb51b899993d0175fb70a07'), 'Gender': 'F', 'Name': 'Margery', 'State': 'CT'}, {'_id': ObjectId('5eb51b899993d0175fb70a77'), 'Gender': 'F', 'Name': 'Merry', 'State': 'MT'}, {'_id': 
ObjectId('5eb51b899993d0175fb70afb'), 'Gender': 'M', 'Name': 'Manley', 'State': 'MN'}, {'_id': ObjectId('5eb51b899993d0175fb70b07'), 'Gender': 'F', 'Name': 'Melany', 'State': 'NE'}, {'_id': ObjectId('5eb51b899993d0175fb70b18'), 'Gender': 'F', 'Name': 'Maisy', 'State': 'MI'}, {'_id': ObjectId('5eb51b899993d0175fb70b80'), 'Gender': 'F', 'Name': 'Mickey', 'State': 'NE'}, {'_id': ObjectId('5eb51b899993d0175fb70b9e'), 'Gender': 'M', 'Name': 'Mary', 'State': 'CT'}, {'_id': ObjectId('5eb51b899993d0175fb70bdc'), 'Gender': 'F', 'Name': 'Maly', 'State': 'MN'}, {'_id': ObjectId('5eb51b899993d0175fb70bf4'), 'Gender': 'F', 'Name': 'Molly', 'State': 'MI'}, {'_id': ObjectId('5eb51b899993d0175fb70c05'), 'Gender': 'M', 'Name': 'Murray', 'State': 'MN'}, {'_id': ObjectId('5eb51b899993d0175fb70de3'), 'Gender': 'F', 'Name': 'Missy', 'State': 'MN'}, {'_id': ObjectId('5eb51b899993d0175fb70e1b'), 'Gender': 'F', 'Name': 'Marney', 'State': 'MI'}, {'_id': ObjectId('5eb51b899993d0175fb710ca'), 'Gender': 'F', 'Name': 'Macy', 'State': 'RI'}, {'_id': ObjectId('5eb51b899993d0175fb7112c'), 'Gender': 'M', 'Name': 'Mickey', 'State': 'MN'}, {'_id': ObjectId('5eb51b899993d0175fb711df'), 'Gender': 'M', 'Name': 'Manny', 'State': 'MI'}, {'_id': ObjectId('5eb51b899993d0175fb7122c'), 'Gender': 'F', 'Name': 'Miley', 'State': 'MI'}, {'_id': ObjectId('5eb51b899993d0175fb7129c'), 'Gender': 'F', 'Name': 'Mindy', 'State': 'MT'}, {'_id': ObjectId('5eb51b899993d0175fb712a9'), 'Gender': 'F', 'Name': 'Mallory', 'State': 'NE'}, {'_id': ObjectId('5eb51b899993d0175fb712cf'), 'Gender': 'F', 'Name': 'Marcy', 'State': 'MT'}, {'_id': ObjectId('5eb51b899993d0175fb712de'), 'Gender': 'F', 'Name': 'Marjory', 'State': 'CT'}, {'_id': ObjectId('5eb51b899993d0175fb71444'), 'Gender': 'F', 'Name': 'Margery', 'State': 'MN'}, {'_id': ObjectId('5eb51b899993d0175fb715f3'), 'Gender': 'F', 'Name': 'Missy', 'State': 'NE'}, {'_id': ObjectId('5eb51b899993d0175fb715f9'), 'Gender': 'F', 'Name': 'Marley', 'State': 'TX'}, {'_id': 
ObjectId('5eb51b899993d0175fb71649'), 'Gender': 'F', 'Name': 'Mendy', 'State': 'KS'}, {'_id': ObjectId('5eb51b899993d0175fb716f5'), 'Gender': 'M', 'Name': 'Monty', 'State': 'NY'}, {'_id': ObjectId('5eb51b899993d0175fb7174c'), 'Gender': 'F', 'Name': 'Mary', 'State': 'AL'}, {'_id': ObjectId('5eb51b899993d0175fb7182a'), 'Gender': 'M', 'Name': 'Mccoy', 'State': 'TX'}, {'_id': ObjectId('5eb51b899993d0175fb71866'), 'Gender': 'F', 'Name': 'Mckinley', 'State': 'MS'}, {'_id': ObjectId('5eb51b899993d0175fb719d3'), 'Gender': 'F', 'Name': 'Miley', 'State': 'NM'}, {'_id': ObjectId('5eb51b899993d0175fb71b7a'), 'Gender': 'F', 'Name': 'Marcy', 'State': 'UT'}, {'_id': ObjectId('5eb51b899993d0175fb71bbd'), 'Gender': 'F', 'Name': 'Mickey', 'State': 'NY'}, {'_id': ObjectId('5eb51b899993d0175fb71d35'), 'Gender': 'M', 'Name': 'Macaulay', 'State': 'OH'}, {'_id': ObjectId('5eb51b899993d0175fb71d37'), 'Gender': 'F', 'Name': 'Merry', 'State': 'WI'}, {'_id': ObjectId('5eb51b899993d0175fb71fdf'), 'Gender': 'F', 'Name': 'Mallory', 'State': 'UT'}, {'_id': ObjectId('5eb51b899993d0175fb72059'), 'Gender': 'M', 'Name': 'Maury', 'State': 'TN'}, {'_id': ObjectId('5eb51b899993d0175fb7207a'), 'Gender': 'M', 'Name': 'Murry', 'State': 'MS'}, {'_id': ObjectId('5eb51b899993d0175fb7208c'), 'Gender': 'M', 'Name': 'Mary', 'State': 'NY'}, {'_id': ObjectId('5eb51b899993d0175fb72123'), 'Gender': 'M', 'Name': 'Majesty', 'State': 'NY'}, {'_id': ObjectId('5eb51b899993d0175fb72144'), 'Gender': 'F', 'Name': 'Mindy', 'State': 'TX'}, {'_id': ObjectId('5eb51b899993d0175fb72252'), 'Gender': 'F', 'Name': 'Makinley', 'State': 'OH'}, {'_id': ObjectId('5eb51b899993d0175fb724b3'), 'Gender': 'M', 'Name': 'Mary', 'State': 'OH'}, {'_id': ObjectId('5eb51b899993d0175fb725a0'), 'Gender': 'F', 'Name': 'Misty', 'State': 'KY'}, {'_id': ObjectId('5eb51b899993d0175fb72604'), 'Gender': 'F', 'Name': 'Melody', 'State': 'KS'}, {'_id': ObjectId('5eb51b899993d0175fb72734'), 'Gender': 'F', 'Name': 'Maisy', 'State': 'NY'}, {'_id': 
ObjectId('5eb51b899993d0175fb72756'), 'Gender': 'F', 'Name': 'Molly', 'State': 'UT'}, {'_id': ObjectId('5eb51b899993d0175fb727c2'), 'Gender': 'F', 'Name': 'Marily', 'State': 'TX'}, {'_id': ObjectId('5eb51b899993d0175fb72873'), 'Gender': 'F', 'Name': 'Mendy', 'State': 'MS'}, {'_id': ObjectId('5eb51b899993d0175fb72875'), 'Gender': 'M', 'Name': 'Murray', 'State': 'NM'}, {'_id': ObjectId('5eb51b899993d0175fb728e1'), 'Gender': 'M', 'Name': 'Marley', 'State': 'UT'}, {'_id': ObjectId('5eb51b899993d0175fb72926'), 'Gender': 'F', 'Name': 'Mary', 'State': 'TN'}, {'_id': ObjectId('5eb51b899993d0175fb72a18'), 'Gender': 'F', 'Name': 'Murphy', 'State': 'AL'}, {'_id': ObjectId('5eb51b899993d0175fb72a51'), 'Gender': 'F', 'Name': 'Maisy', 'State': 'OH'}, {'_id': ObjectId('5eb51b899993d0175fb72a63'), 'Gender': 'F', 'Name': 'Missy', 'State': 'AL'}, {'_id': ObjectId('5eb51b899993d0175fb72b1c'), 'Gender': 'F', 'Name': 'Marjory', 'State': 'KY'}, {'_id': ObjectId('5eb51b899993d0175fb72b35'), 'Gender': 'F', 'Name': 'May', 'State': 'UT'}, {'_id': ObjectId('5eb51b899993d0175fb72be5'), 'Gender': 'F', 'Name': 'Mabry', 'State': 'TX'}, {'_id': ObjectId('5eb51b899993d0175fb72c78'), 'Gender': 'F', 'Name': 'Macy', 'State': 'TN'}, {'_id': ObjectId('5eb51b899993d0175fb72ca8'), 'Gender': 'F', 'Name': 'Misty', 'State': 'ID'}, {'_id': ObjectId('5eb51b899993d0175fb72cb2'), 'Gender': 'M', 'Name': 'Markanthony', 'State': 'TX'}, {'_id': ObjectId('5eb51b899993d0175fb72cbc'), 'Gender': 'F', 'Name': 'Marely', 'State': 'WI'}, {'_id': ObjectId('5eb51b899993d0175fb72d49'), 'Gender': 'F', 'Name': 'Misty', 'State': 'KS'}, {'_id': ObjectId('5eb51b899993d0175fb72ea9'), 'Gender': 'M', 'Name': 'Mary', 'State': 'TX'}, {'_id': ObjectId('5eb51b899993d0175fb72f5c'), 'Gender': 'F', 'Name': 'Melony', 'State': 'TX'}, {'_id': ObjectId('5eb51b899993d0175fb730ec'), 'Gender': 'F', 'Name': 'Maily', 'State': 'TX'}, {'_id': ObjectId('5eb51b899993d0175fb7313f'), 'Gender': 'F', 'Name': 'Merry', 'State': 'AL'}, {'_id': 
ObjectId('5eb51b899993d0175fb7314c'), 'Gender': 'M', 'Name': 'Micky', 'State': 'TX'}, {'_id': ObjectId('5eb51b899993d0175fb732a5'), 'Gender': 'M', 'Name': 'Macy', 'State': 'KY'}, {'_id': ObjectId('5eb51b899993d0175fb7334c'), 'Gender': 'M', 'Name': 'Mckay', 'State': 'TX'}, {'_id': ObjectId('5eb51b899993d0175fb73466'), 'Gender': 'M', 'Name': 'Murry', 'State': 'KY'}, {'_id': ObjectId('5eb51b899993d0175fb73530'), 'Gender': 'F', 'Name': 'Marty', 'State': 'NY'}, {'_id': ObjectId('5eb51b899993d0175fb7354b'), 'Gender': 'F', 'Name': 'Misty', 'State': 'AL'}, {'_id': ObjectId('5eb51b899993d0175fb73569'), 'Gender': 'F', 'Name': 'Missy', 'State': 'TN'}, {'_id': ObjectId('5eb51b899993d0175fb737f7'), 'Gender': 'M', 'Name': 'Marty', 'State': 'NY'}, {'_id': ObjectId('5eb51b899993d0175fb7385f'), 'Gender': 'F', 'Name': 'Maebry', 'State': 'TX'}, {'_id': ObjectId('5eb51b899993d0175fb73918'), 'Gender': 'F', 'Name': 'Marny', 'State': 'NY'}, {'_id': ObjectId('5eb51b899993d0175fb73931'), 'Gender': 'M', 'Name': 'Murray', 'State': 'ME'}, {'_id': ObjectId('5eb51b899993d0175fb7398b'), 'Gender': 'F', 'Name': 'Missy', 'State': 'KY'}, {'_id': ObjectId('5eb51b899993d0175fb739cd'), 'Gender': 'F', 'Name': 'Marry', 'State': 'MS'}, {'_id': ObjectId('5eb51b899993d0175fb739f1'), 'Gender': 'M', 'Name': 'Mallory', 'State': 'NY'}, {'_id': ObjectId('5eb51b899993d0175fb73abf'), 'Gender': 'M', 'Name': 'Marley', 'State': 'MS'}, {'_id': ObjectId('5eb51b899993d0175fb73cf4'), 'Gender': 'F', 'Name': 'Melany', 'State': 'KS'}, {'_id': ObjectId('5eb51b899993d0175fb73e3d'), 'Gender': 'F', 'Name': 'Mckinley', 'State': 'ID'}, {'_id': ObjectId('5eb51b899993d0175fb73f4a'), 'Gender': 'F', 'Name': 'Marty', 'State': 'OH'}, {'_id': ObjectId('5eb51b899993d0175fb73f65'), 'Gender': 'M', 'Name': 'Murphy', 'State': 'TN'}, {'_id': ObjectId('5eb51b899993d0175fb74214'), 'Gender': 'M', 'Name': 'Mickey', 'State': 'NM'}, {'_id': ObjectId('5eb51b899993d0175fb7422a'), 'Gender': 'F', 'Name': 'Margy', 'State': 'KS'}, {'_id': 
ObjectId('5eb51b899993d0175fb742a9'), 'Gender': 'F', 'Name': 'Marely', 'State': 'TN'}, {'_id': ObjectId('5eb51b899993d0175fb74335'), 'Gender': 'F', 'Name': 'Mahogany', 'State': 'MS'}, {'_id': ObjectId('5eb51b899993d0175fb7439d'), 'Gender': 'F', 'Name': 'Marley', 'State': 'NY'}, {'_id': ObjectId('5eb51b899993d0175fb743f9'), 'Gender': 'F', 'Name': 'Marcy', 'State': 'AL'}, {'_id': ObjectId('5eb51b899993d0175fb7451c'), 'Gender': 'F', 'Name': 'Mandy', 'State': 'KS'}, {'_id': ObjectId('5eb51b899993d0175fb7451d'), 'Gender': 'F', 'Name': 'Marjory', 'State': 'WI'}, {'_id': ObjectId('5eb51b899993d0175fb74571'), 'Gender': 'F', 'Name': 'Marty', 'State': 'TX'}, {'_id': ObjectId('5eb51b899993d0175fb7476b'), 'Gender': 'F', 'Name': 'Mahaley', 'State': 'TN'}, {'_id': ObjectId('5eb51b899993d0175fb747ed'), 'Gender': 'F', 'Name': 'Magaly', 'State': 'TN'}, {'_id': ObjectId('5eb51b899993d0175fb7487b'), 'Gender': 'F', 'Name': 'Marcey', 'State': 'TX'}, {'_id': ObjectId('5eb51b899993d0175fb748e4'), 'Gender': 'F', 'Name': 'Mindy', 'State': 'OH'}, {'_id': ObjectId('5eb51b899993d0175fb7490d'), 'Gender': 'M', 'Name': 'Mccoy', 'State': 'OH'}, {'_id': ObjectId('5eb51b899993d0175fb7498b'), 'Gender': 'F', 'Name': 'Milly', 'State': 'OH'}, {'_id': ObjectId('5eb51b899993d0175fb74a8f'), 'Gender': 'F', 'Name': 'Makenzy', 'State': 'NY'}, {'_id': ObjectId('5eb51b899993d0175fb74bc4'), 'Gender': 'F', 'Name': 'Marly', 'State': 'WI'}, {'_id': ObjectId('5eb51b899993d0175fb74ceb'), 'Gender': 'F', 'Name': 'Mariely', 'State': 'NY'}, {'_id': ObjectId('5eb51b899993d0175fb74e40'), 'Gender': 'F', 'Name': 'Milly', 'State': 'NY'}, {'_id': ObjectId('5eb51b899993d0175fb74ee7'), 'Gender': 'F', 'Name': 'Macy', 'State': 'UT'}, {'_id': ObjectId('5eb51b899993d0175fb74f3d'), 'Gender': 'M', 'Name': 'Marty', 'State': 'ME'}, {'_id': ObjectId('5eb51b899993d0175fb74f4a'), 'Gender': 'F', 'Name': 'Matty', 'State': 'NY'}, {'_id': ObjectId('5eb51b899993d0175fb74f71'), 'Gender': 'F', 'Name': 'Mary', 'State': 'MS'}, {'_id': 
ObjectId('5eb51b899993d0175fb74f98'), 'Gender': 'M', 'Name': 'Mikey', 'State': 'KY'}, {'_id': ObjectId('5eb51b899993d0175fb74ffc'), 'Gender': 'M', 'Name': 'Marley', 'State': 'TN'}, {'_id': ObjectId('5eb51b899993d0175fb75107'), 'Gender': 'F', 'Name': 'Melody', 'State': 'UT'}, {'_id': ObjectId('5eb51b899993d0175fb75109'), 'Gender': 'F', 'Name': 'Marely', 'State': 'ID'}, {'_id': ObjectId('5eb51b899993d0175fb7516c'), 'Gender': 'M', 'Name': 'Montgomery', 'State': 'TN'}, {'_id': ObjectId('5eb51b899993d0175fb751a1'), 'Gender': 'F', 'Name': 'Mckinley', 'State': 'WY'}, {'_id': ObjectId('5eb51b899993d0175fb75208'), 'Gender': 'F', 'Name': 'Margy', 'State': 'PA'}, {'_id': ObjectId('5eb51b899993d0175fb7526e'), 'Gender': 'F', 'Name': 'Mindy', 'State': 'MI'}, {'_id': ObjectId('5eb51b899993d0175fb752a6'), 'Gender': 'F', 'Name': 'Mckenzy', 'State': 'MO'}, {'_id': ObjectId('5eb51b899993d0175fb752c5'), 'Gender': 'F', 'Name': 'Marcy', 'State': 'PA'}, {'_id': ObjectId('5eb51b899993d0175fb753eb'), 'Gender': 'M', 'Name': 'Montgomery', 'State': 'MO'}, {'_id': ObjectId('5eb51b899993d0175fb7547b'), 'Gender': 'F', 'Name': 'Molly', 'State': 'VT'}, {'_id': ObjectId('5eb51b899993d0175fb7547c'), 'Gender': 'F', 'Name': 'Merry', 'State': 'CO'}, {'_id': ObjectId('5eb51b899993d0175fb75569'), 'Gender': 'F', 'Name': 'Miley', 'State': 'MT'}, {'_id': ObjectId('5eb51b899993d0175fb755a2'), 'Gender': 'F', 'Name': 'Mckinley', 'State': 'GA'}, {'_id': ObjectId('5eb51b899993d0175fb75602'), 'Gender': 'F', 'Name': 'Marely', 'State': 'AZ'}, {'_id': ObjectId('5eb51b899993d0175fb75609'), 'Gender': 'F', 'Name': 'Misty', 'State': 'GA'}, {'_id': ObjectId('5eb51b899993d0175fb75673'), 'Gender': 'F', 'Name': 'Misty', 'State': 'WY'}, {'_id': ObjectId('5eb51b899993d0175fb757b6'), 'Gender': 'M', 'Name': 'Murphy', 'State': 'PA'}, {'_id': ObjectId('5eb51b899993d0175fb7580a'), 'Gender': 'F', 'Name': 'Marty', 'State': 'NE'}, {'_id': ObjectId('5eb51b899993d0175fb75866'), 'Gender': 'F', 'Name': 'Mandy', 'State': 'CO'}, {'_id': 
ObjectId('5eb51b899993d0175fb758d3'), 'Gender': 'F', 'Name': 'Misty', 'State': 'AZ'}, {'_id': ObjectId('5eb51b899993d0175fb75b7e'), 'Gender': 'F', 'Name': 'Mallory', 'State': 'SD'}, {'_id': ObjectId('5eb51b899993d0175fb75c62'), 'Gender': 'F', 'Name': 'Maizy', 'State': 'MI'}, {'_id': ObjectId('5eb51b899993d0175fb75c7a'), 'Gender': 'F', 'Name': 'Marjory', 'State': 'HI'}, {'_id': ObjectId('5eb51b899993d0175fb75c8f'), 'Gender': 'F', 'Name': 'Mindy', 'State': 'NE'}, {'_id': ObjectId('5eb51b899993d0175fb75d13'), 'Gender': 'F', 'Name': 'Mercy', 'State': 'MN'}, {'_id': ObjectId('5eb51b899993d0175fb75d27'), 'Gender': 'M', 'Name': 'Marley', 'State': 'MA'}, {'_id': ObjectId('5eb51b899993d0175fb75d90'), 'Gender': 'F', 'Name': 'Mandy', 'State': 'MA'}, {'_id': ObjectId('5eb51b899993d0175fb75dcc'), 'Gender': 'F', 'Name': 'Macey', 'State': 'AZ'}, {'_id': ObjectId('5eb51b899993d0175fb75dfc'), 'Gender': 'F', 'Name': 'Macey', 'State': 'VA'}, {'_id': ObjectId('5eb51b899993d0175fb75e60'), 'Gender': 'F', 'Name': 'Mckinley', 'State': 'ND'}, {'_id': ObjectId('5eb51b899993d0175fb75ec0'), 'Gender': 'F', 'Name': 'May', 'State': 'MA'}, {'_id': ObjectId('5eb51b899993d0175fb75ef3'), 'Gender': 'F', 'Name': 'Marley', 'State': 'NE'}, {'_id': ObjectId('5eb51b899993d0175fb75f56'), 'Gender': 'F', 'Name': 'Macey', 'State': 'GA'}, {'_id': ObjectId('5eb51b899993d0175fb75f71'), 'Gender': 'F', 'Name': 'Marcy', 'State': 'WY'}, {'_id': ObjectId('5eb51b899993d0175fb75f7f'), 'Gender': 'F', 'Name': 'May', 'State': 'PA'}, {'_id': ObjectId('5eb51b899993d0175fb760db'), 'Gender': 'F', 'Name': 'Mckinley', 'State': 'VA'}, {'_id': ObjectId('5eb51b899993d0175fb761ff'), 'Gender': 'F', 'Name': 'Marjory', 'State': 'WV'}, {'_id': ObjectId('5eb51b899993d0175fb762dd'), 'Gender': 'F', 'Name': 'Melody', 'State': 'GA'}, {'_id': ObjectId('5eb51b899993d0175fb7649b'), 'Gender': 'F', 'Name': 'Mandy', 'State': 'WY'}, {'_id': ObjectId('5eb51b899993d0175fb764a5'), 'Gender': 'F', 'Name': 'Mellody', 'State': 'IN'}, {'_id': 
ObjectId('5eb51b899993d0175fb764b1'), 'Gender': 'M', 'Name': 'Marley', 'State': 'GA'}, {'_id': ObjectId('5eb51b899993d0175fb764b5'), 'Gender': 'F', 'Name': 'Macy', 'State': 'MO'}, {'_id': ObjectId('5eb51b899993d0175fb764fd'), 'Gender': 'F', 'Name': 'Marely', 'State': 'CO'}, {'_id': ObjectId('5eb51b899993d0175fb764ff'), 'Gender': 'M', 'Name': 'Mckinley', 'State': 'MI'}, {'_id': ObjectId('5eb51b899993d0175fb76510'), 'Gender': 'F', 'Name': 'Marty', 'State': 'OR'}, {'_id': ObjectId('5eb51b899993d0175fb76541'), 'Gender': 'F', 'Name': 'Macy', 'State': 'ND'}, {'_id': ObjectId('5eb51b899993d0175fb7662d'), 'Gender': 'M', 'Name': 'Marley', 'State': 'VA'}, {'_id': ObjectId('5eb51b899993d0175fb76769'), 'Gender': 'F', 'Name': 'Mallary', 'State': 'GA'}, {'_id': ObjectId('5eb51b899993d0175fb76786'), 'Gender': 'F', 'Name': 'Marcy', 'State': 'MO'}, {'_id': ObjectId('5eb51b899993d0175fb768af'), 'Gender': 'F', 'Name': 'Molly', 'State': 'SD'}, {'_id': ObjectId('5eb51b899993d0175fb768f9'), 'Gender': 'F', 'Name': 'Mandy', 'State': 'ND'}, {'_id': ObjectId('5eb51b899993d0175fb769c4'), 'Gender': 'F', 'Name': 'Melody', 'State': 'AZ'}, {'_id': ObjectId('5eb51b899993d0175fb769d2'), 'Gender': 'F', 'Name': 'Mary', 'State': 'AK'}, {'_id': ObjectId('5eb51b899993d0175fb76b47'), 'Gender': 'F', 'Name': 'Melony', 'State': 'MI'}, {'_id': ObjectId('5eb51b899993d0175fb76c3b'), 'Gender': 'F', 'Name': 'Merry', 'State': 'PA'}, {'_id': ObjectId('5eb51b899993d0175fb76d86'), 'Gender': 'F', 'Name': 'Mandy', 'State': 'MO'}, {'_id': ObjectId('5eb51b899993d0175fb76da8'), 'Gender': 'F', 'Name': 'Macy', 'State': 'GA'}, {'_id': ObjectId('5eb51b899993d0175fb76eab'), 'Gender': 'F', 'Name': 'Mary', 'State': 'PA'}, {'_id': ObjectId('5eb51b899993d0175fb76eb2'), 'Gender': 'F', 'Name': 'Merry', 'State': 'GA'}, {'_id': ObjectId('5eb51b899993d0175fb76ec1'), 'Gender': 'F', 'Name': 'Marykay', 'State': 'PA'}, {'_id': ObjectId('5eb51b899993d0175fb76fa2'), 'Gender': 'F', 'Name': 'Melody', 'State': 'CO'}, {'_id': 
ObjectId('5eb51b899993d0175fb77034'), 'Gender': 'F', 'Name': 'Merry', 'State': 'VA'}, {'_id': ObjectId('5eb51b899993d0175fb77194'), 'Gender': 'F', 'Name': 'Misty', 'State': 'CO'}, {'_id': ObjectId('5eb51b899993d0175fb772eb'), 'Gender': 'F', 'Name': 'Mckinley', 'State': 'IN'}, {'_id': ObjectId('5eb51b899993d0175fb773b5'), 'Gender': 'M', 'Name': 'Monty', 'State': 'MT'}, {'_id': ObjectId('5eb51b899993d0175fb77403'), 'Gender': 'F', 'Name': 'Macy', 'State': 'WA'}, {'_id': ObjectId('5eb51b899993d0175fb7773d'), 'Gender': 'F', 'Name': 'Majesty', 'State': 'FL'}, {'_id': ObjectId('5eb51b899993d0175fb77758'), 'Gender': 'M', 'Name': 'Marcanthony', 'State': 'FL'}, {'_id': ObjectId('5eb51b899993d0175fb77781'), 'Gender': 'F', 'Name': 'May', 'State': 'NH'}, {'_id': ObjectId('5eb51b899993d0175fb77802'), 'Gender': 'M', 'Name': 'Mary', 'State': 'IA'}, {'_id': ObjectId('5eb51b899993d0175fb77990'), 'Gender': 'F', 'Name': 'Macey', 'State': 'WA'}, {'_id': ObjectId('5eb51b899993d0175fb779b1'), 'Gender': 'M', 'Name': 'Manley', 'State': 'SC'}, {'_id': ObjectId('5eb51b899993d0175fb77a6b'), 'Gender': 'M', 'Name': 'Marley', 'State': 'LA'}, {'_id': ObjectId('5eb51b899993d0175fb77aab'), 'Gender': 'F', 'Name': 'Marcy', 'State': 'CA'}, {'_id': ObjectId('5eb51b899993d0175fb77b3f'), 'Gender': 'F', 'Name': 'Maisy', 'State': 'NJ'}, {'_id': ObjectId('5eb51b899993d0175fb77ba4'), 'Gender': 'M', 'Name': 'Marty', 'State': 'NV'}, {'_id': ObjectId('5eb51b899993d0175fb77d3a'), 'Gender': 'F', 'Name': 'Mercy', 'State': 'LA'}, {'_id': ObjectId('5eb51b899993d0175fb77da4'), 'Gender': 'F', 'Name': 'Mailey', 'State': 'CA'}, {'_id': ObjectId('5eb51b899993d0175fb77e00'), 'Gender': 'M', 'Name': 'Monty', 'State': 'SC'}, {'_id': ObjectId('5eb51b899993d0175fb77e90'), 'Gender': 'M', 'Name': 'Malachy', 'State': 'CA'}, {'_id': ObjectId('5eb51b899993d0175fb77f5d'), 'Gender': 'F', 'Name': 'Micky', 'State': 'CA'}, {'_id': ObjectId('5eb51b899993d0175fb77fc9'), 'Gender': 'M', 'Name': 'Monty', 'State': 'IA'}, {'_id': 
ObjectId('5eb51b899993d0175fb78066'), 'Gender': 'M', 'Name': 'Maury', 'State': 'IL'}, {'_id': ObjectId('5eb51b899993d0175fb78088'), 'Gender': 'F', 'Name': 'Melody', 'State': 'DE'}, {'_id': ObjectId('5eb51b899993d0175fb780fa'), 'Gender': 'M', 'Name': 'Murphy', 'State': 'NC'}, {'_id': ObjectId('5eb51b899993d0175fb78107'), 'Gender': 'F', 'Name': 'Marley', 'State': 'CA'}, {'_id': ObjectId('5eb51b899993d0175fb78148'), 'Gender': 'F', 'Name': 'Mandy', 'State': 'DE'}, {'_id': ObjectId('5eb51b899993d0175fb782a9'), 'Gender': 'F', 'Name': 'Missy', 'State': 'SC'}, {'_id': ObjectId('5eb51b899993d0175fb7836b'), 'Gender': 'F', 'Name': 'Mindy', 'State': 'NC'}, {'_id': ObjectId('5eb51b899993d0175fb783e7'), 'Gender': 'M', 'Name': 'Maury', 'State': 'FL'}, {'_id': ObjectId('5eb51b899993d0175fb7848e'), 'Gender': 'F', 'Name': 'Mary', 'State': 'WA'}, {'_id': ObjectId('5eb51b899993d0175fb785ea'), 'Gender': 'F', 'Name': 'Mandy', 'State': 'AR'}, {'_id': ObjectId('5eb51b899993d0175fb78606'), 'Gender': 'F', 'Name': 'Macy', 'State': 'LA'}, {'_id': ObjectId('5eb51b899993d0175fb786eb'), 'Gender': 'F', 'Name': 'Missy', 'State': 'IA'}, {'_id': ObjectId('5eb51b899993d0175fb786fa'), 'Gender': 'F', 'Name': 'Marcy', 'State': 'AR'}, {'_id': ObjectId('5eb51b899993d0175fb78741'), 'Gender': 'F', 'Name': 'May', 'State': 'AR'}, {'_id': ObjectId('5eb51b899993d0175fb7874e'), 'Gender': 'F', 'Name': 'Mercy', 'State': 'AR'}, {'_id': ObjectId('5eb51b899993d0175fb787f1'), 'Gender': 'M', 'Name': 'Murry', 'State': 'FL'}, {'_id': ObjectId('5eb51b899993d0175fb78820'), 'Gender': 'F', 'Name': 'Mandy', 'State': 'CA'}, {'_id': ObjectId('5eb51b899993d0175fb7883a'), 'Gender': 'F', 'Name': 'Miley', 'State': 'NJ'}, {'_id': ObjectId('5eb51b899993d0175fb788ab'), 'Gender': 'F', 'Name': 'May', 'State': 'WA'}, {'_id': ObjectId('5eb51b899993d0175fb7892c'), 'Gender': 'F', 'Name': 'Marly', 'State': 'FL'}, {'_id': ObjectId('5eb51b899993d0175fb78b25'), 'Gender': 'F', 'Name': 'Marry', 'State': 'AR'}, {'_id': 
ObjectId('5eb51b899993d0175fb78cc0'), 'Gender': 'F', 'Name': 'Marry', 'State': 'CA'}, {'_id': ObjectId('5eb51b899993d0175fb78d67'), 'Gender': 'F', 'Name': 'Mckinley', 'State': 'NV'}, {'_id': ObjectId('5eb51b899993d0175fb78e77'), 'Gender': 'M', 'Name': 'Marley', 'State': 'WA'}, {'_id': ObjectId('5eb51b899993d0175fb78f45'), 'Gender': 'F', 'Name': 'Makinley', 'State': 'FL'}, {'_id': ObjectId('5eb51b899993d0175fb78f5d'), 'Gender': 'M', 'Name': 'Mikey', 'State': 'CA'}, {'_id': ObjectId('5eb51b899993d0175fb78fe1'), 'Gender': 'M', 'Name': 'Mickey', 'State': 'SC'}, {'_id': ObjectId('5eb51b899993d0175fb79023'), 'Gender': 'F', 'Name': 'Mahogany', 'State': 'IL'}, {'_id': ObjectId('5eb51b899993d0175fb793d4'), 'Gender': 'F', 'Name': 'Macey', 'State': 'NC'}, {'_id': ObjectId('5eb51b899993d0175fb79446'), 'Gender': 'M', 'Name': 'Murray', 'State': 'FL'}, {'_id': ObjectId('5eb51b899993d0175fb7947c'), 'Gender': 'M', 'Name': 'Montgomery', 'State': 'NC'}, {'_id': ObjectId('5eb51b899993d0175fb794ef'), 'Gender': 'M', 'Name': 'Micky', 'State': 'NC'}, {'_id': ObjectId('5eb51b899993d0175fb79607'), 'Gender': 'M', 'Name': 'Mckinley', 'State': 'NC'}, {'_id': ObjectId('5eb51b899993d0175fb79702'), 'Gender': 'F', 'Name': 'Miley', 'State': 'OK'}, {'_id': ObjectId('5eb51b899993d0175fb7978e'), 'Gender': 'F', 'Name': 'Molly', 'State': 'SC'}, {'_id': ObjectId('5eb51b899993d0175fb797b6'), 'Gender': 'F', 'Name': 'Mackenzy', 'State': 'FL'}, {'_id': ObjectId('5eb51b899993d0175fb7999a'), 'Gender': 'F', 'Name': 'Margery', 'State': 'SC'}, {'_id': ObjectId('5eb51b899993d0175fb79a50'), 'Gender': 'F', 'Name': 'Misty', 'State': 'LA'}, {'_id': ObjectId('5eb51b899993d0175fb79e60'), 'Gender': 'F', 'Name': 'Macy', 'State': 'NV'}, {'_id': ObjectId('5eb51b899993d0175fb79ee3'), 'Gender': 'F', 'Name': 'Mily', 'State': 'CA'}, {'_id': ObjectId('5eb51b899993d0175fb79f2f'), 'Gender': 'M', 'Name': 'Manny', 'State': 'IL'}, {'_id': ObjectId('5eb51b899993d0175fb79f34'), 'Gender': 'M', 'Name': 'Murry', 'State': 'SC'}, {'_id': 
ObjectId('5eb51b899993d0175fb7a036'), 'Gender': 'F', 'Name': 'Missy', 'State': 'OK'}, {'_id': ObjectId('5eb51b899993d0175fb7a0bd'), 'Gender': 'F', 'Name': 'Marty', 'State': 'AR'}, {'_id': ObjectId('5eb51b899993d0175fb7a113'), 'Gender': 'M', 'Name': 'Murray', 'State': 'NJ'}, {'_id': ObjectId('5eb51b899993d0175fb7a2d4'), 'Gender': 'F', 'Name': 'Misty', 'State': 'NH'}, {'_id': ObjectId('5eb51b899993d0175fb7a48b'), 'Gender': 'F', 'Name': 'Melany', 'State': 'NJ'}, {'_id': ObjectId('5eb51b899993d0175fb7a4bf'), 'Gender': 'F', 'Name': 'Macey', 'State': 'NV'}, {'_id': ObjectId('5eb51b899993d0175fb7a525'), 'Gender': 'F', 'Name': 'Marley', 'State': 'NV'}, {'_id': ObjectId('5eb51b899993d0175fb7a5be'), 'Gender': 'F', 'Name': 'Mandy', 'State': 'NV'}, {'_id': ObjectId('5eb51b899993d0175fb7a6a8'), 'Gender': 'F', 'Name': 'Malky', 'State': 'NJ'}, {'_id': ObjectId('5eb51b899993d0175fb7a756'), 'Gender': 'F', 'Name': 'Melany', 'State': 'MD'}, {'_id': ObjectId('5eb51b899993d0175fb7a7fb'), 'Gender': 'M', 'Name': 'Marty', 'State': 'LA'}, {'_id': ObjectId('5eb51b899993d0175fb7a9ac'), 'Gender': 'F', 'Name': 'Marley', 'State': 'DE'}, {'_id': ObjectId('5eb51b899993d0175fb7a9be'), 'Gender': 'F', 'Name': 'Miley', 'State': 'IA'}, {'_id': ObjectId('5eb51b899993d0175fb7a9ce'), 'Gender': 'M', 'Name': 'Markanthony', 'State': 'NJ'}, {'_id': ObjectId('5eb51b899993d0175fb7aaa0'), 'Gender': 'M', 'Name': 'Morrissey', 'State': 'CA'}, {'_id': ObjectId('5eb51b899993d0175fb7aab7'), 'Gender': 'F', 'Name': 'Mercy', 'State': 'CA'}, {'_id': ObjectId('5eb51b899993d0175fb7aaf5'), 'Gender': 'F', 'Name': 'Mary', 'State': 'NC'}, {'_id': ObjectId('5eb51b899993d0175fb7ad96'), 'Gender': 'F', 'Name': 'Mickey', 'State': 'FL'}, {'_id': ObjectId('5eb51b899993d0175fb7af5d'), 'Gender': 'F', 'Name': 'Melony', 'State': 'LA'}, {'_id': ObjectId('5eb51b899993d0175fb7af7a'), 'Gender': 'M', 'Name': 'Monty', 'State': 'FL'}, {'_id': ObjectId('5eb51b899993d0175fb7af9a'), 'Gender': 'F', 'Name': 'Melody', 'State': 'AR'}, {'_id': 
ObjectId('5eb51b899993d0175fb7afe7'), 'Gender': 'M', 'Name': 'Monty', 'State': 'OK'}, {'_id': ObjectId('5eb51b899993d0175fb7b142'), 'Gender': 'M', 'Name': 'Marcanthony', 'State': 'NJ'}, {'_id': ObjectId('5eb51b899993d0175fb7b297'), 'Gender': 'F', 'Name': 'Mckinley', 'State': 'DE'}, {'_id': ObjectId('5eb51b899993d0175fb7b340'), 'Gender': 'F', 'Name': 'Marty', 'State': 'WA'}, {'_id': ObjectId('5eb51b899993d0175fb7b412'), 'Gender': 'F', 'Name': 'Maddy', 'State': 'CA'}, {'_id': ObjectId('5eb51b899993d0175fb7b433'), 'Gender': 'F', 'Name': 'Mendy', 'State': 'FL'}, {'_id': ObjectId('5eb51b899993d0175fb7b46f'), 'Gender': 'F', 'Name': 'Mindy', 'State': 'DE'}, {'_id': ObjectId('5eb51b899993d0175fb7b4bc'), 'Gender': 'F', 'Name': 'Mendy', 'State': 'OK'}, {'_id': ObjectId('5eb51b899993d0175fb7b4ec'), 'Gender': 'F', 'Name': 'Melody', 'State': 'NC'}, {'_id': ObjectId('5eb51b899993d0175fb7b51f'), 'Gender': 'M', 'Name': 'Molly', 'State': 'CA'}, {'_id': ObjectId('5eb51b899993d0175fb7b5aa'), 'Gender': 'F', 'Name': 'Marjory', 'State': 'IA'}, {'_id': ObjectId('5eb51b899993d0175fb7b5c6'), 'Gender': 'M', 'Name': 'Macaulay', 'State': 'IA'}, {'_id': ObjectId('5eb51b899993d0175fb7b63e'), 'Gender': 'F', 'Name': 'Molly', 'State': 'NJ'}, {'_id': ObjectId('5eb51b899993d0175fb7b658'), 'Gender': 'F', 'Name': 'Margery', 'State': 'OK'}, {'_id': ObjectId('5eb51b899993d0175fb7b6f4'), 'Gender': 'F', 'Name': 'Margery', 'State': 'IL'}, {'_id': ObjectId('5eb51b899993d0175fb7b827'), 'Gender': 'M', 'Name': 'Mary', 'State': 'MD'}, {'_id': ObjectId('5eb51b899993d0175fb7b834'), 'Gender': 'M', 'Name': 'Markanthony', 'State': 'IL'}, {'_id': ObjectId('5eb51b899993d0175fb7b92f'), 'Gender': 'F', 'Name': 'Marshay', 'State': 'IL'}, {'_id': ObjectId('5eb51b899993d0175fb7b952'), 'Gender': 'M', 'Name': 'Markanthony', 'State': 'FL'}, {'_id': ObjectId('5eb51b899993d0175fb7b9a7'), 'Gender': 'F', 'Name': 'Mitzy', 'State': 'CA'}, {'_id': ObjectId('5eb51b899993d0175fb7b9e8'), 'Gender': 'F', 'Name': 'Mickey', 'State': 'IN'}, 
{'_id': ObjectId('5eb51b899993d0175fb7bb76'), 'Gender': 'M', 'Name': 'Monty', 'State': 'IN'}, {'_id': ObjectId('5eb51b899993d0175fb7bbd2'), 'Gender': 'F', 'Name': 'Margery', 'State': 'MO'}, {'_id': ObjectId('5eb51b899993d0175fb7bcb1'), 'Gender': 'F', 'Name': 'Molly', 'State': 'VA'}, {'_id': ObjectId('5eb51b899993d0175fb7be90'), 'Gender': 'F', 'Name': 'Melody', 'State': 'SD'}, {'_id': ObjectId('5eb51b899993d0175fb7c08d'), 'Gender': 'M', 'Name': 'Murray', 'State': 'CO'}, {'_id': ObjectId('5eb51b899993d0175fb7c142'), 'Gender': 'M', 'Name': 'Mickey', 'State': 'GA'}, {'_id': ObjectId('5eb51b899993d0175fb7c19a'), 'Gender': 'F', 'Name': 'Marjory', 'State': 'PA'}, {'_id': ObjectId('5eb51b899993d0175fb7c244'), 'Gender': 'M', 'Name': 'Mckay', 'State': 'AZ'}, {'_id': ObjectId('5eb51b899993d0175fb7c344'), 'Gender': 'M', 'Name': 'Mickey', 'State': 'VA'}, {'_id': ObjectId('5eb51b899993d0175fb7c4fb'), 'Gender': 'F', 'Name': 'Molly', 'State': 'GA'}, {'_id': ObjectId('5eb51b899993d0175fb7c5a6'), 'Gender': 'F', 'Name': 'Mallory', 'State': 'PA'}, {'_id': ObjectId('5eb51b899993d0175fb7c5ce'), 'Gender': 'F', 'Name': 'Molly', 'State': 'MA'}, {'_id': ObjectId('5eb51b899993d0175fb7c633'), 'Gender': 'M', 'Name': 'Mary', 'State': 'AZ'}, {'_id': ObjectId('5eb51b899993d0175fb7c6e1'), 'Gender': 'F', 'Name': 'Margery', 'State': 'GA'}, {'_id': ObjectId('5eb51b899993d0175fb7caa6'), 'Gender': 'M', 'Name': 'Mickey', 'State': 'PA'}, {'_id': ObjectId('5eb51b899993d0175fb7cb41'), 'Gender': 'F', 'Name': 'Margery', 'State': 'VA'}, {'_id': ObjectId('5eb51b899993d0175fb7cbe1'), 'Gender': 'M', 'Name': 'Monty', 'State': 'PA'}, {'_id': ObjectId('5eb51b899993d0175fb7cd10'), 'Gender': 'M', 'Name': 'Manny', 'State': 'AZ'}, {'_id': ObjectId('5eb51b899993d0175fb7cece'), 'Gender': 'F', 'Name': 'Maisy', 'State': 'CO'}, {'_id': ObjectId('5eb51b899993d0175fb7cfae'), 'Gender': 'M', 'Name': 'Monty', 'State': 'AZ'}, {'_id': ObjectId('5eb51b899993d0175fb7cfbe'), 'Gender': 'F', 'Name': 'Melody', 'State': 'HI'}, {'_id': 
ObjectId('5eb51b899993d0175fb7cfd7'), 'Gender': 'F', 'Name': 'Margery', 'State': 'PA'}, {'_id': ObjectId('5eb51b899993d0175fb7d1a6'), 'Gender': 'M', 'Name': 'Mickey', 'State': 'MA'}, {'_id': ObjectId('5eb51b899993d0175fb7d279'), 'Gender': 'F', 'Name': 'Margery', 'State': 'CO'}, {'_id': ObjectId('5eb51b899993d0175fb7d2a4'), 'Gender': 'F', 'Name': 'Molly', 'State': 'IN'}, {'_id': ObjectId('5eb51b899993d0175fb7d349'), 'Gender': 'F', 'Name': 'Mindy', 'State': 'WV'}, {'_id': ObjectId('5eb51b899993d0175fb7d408'), 'Gender': 'F', 'Name': 'Mary', 'State': 'VT'}, {'_id': ObjectId('5eb51b899993d0175fb7d4b7'), 'Gender': 'F', 'Name': 'Miley', 'State': 'GA'}, {'_id': ObjectId('5eb51b899993d0175fb7d4e7'), 'Gender': 'F', 'Name': 'Merrily', 'State': 'PA'}, {'_id': ObjectId('5eb51b899993d0175fb7d578'), 'Gender': 'F', 'Name': 'Mickey', 'State': 'AZ'}, {'_id': ObjectId('5eb51b899993d0175fb7d641'), 'Gender': 'F', 'Name': 'Mallory', 'State': 'MO'}, {'_id': ObjectId('5eb51b899993d0175fb7d66f'), 'Gender': 'F', 'Name': 'Mckinley', 'State': 'SD'}, {'_id': ObjectId('5eb51b899993d0175fb7d723'), 'Gender': 'F', 'Name': 'Miley', 'State': 'AZ'}, {'_id': ObjectId('5eb51b899993d0175fb7d7e6'), 'Gender': 'F', 'Name': 'Miley', 'State': 'DC'}, {'_id': ObjectId('5eb51b899993d0175fb7d952'), 'Gender': 'F', 'Name': 'Macey', 'State': 'WV'}, {'_id': ObjectId('5eb51b899993d0175fb7da2d'), 'Gender': 'M', 'Name': 'Murray', 'State': 'ND'}, {'_id': ObjectId('5eb51b899993d0175fb7dd54'), 'Gender': 'F', 'Name': 'May', 'State': 'WV'}, {'_id': ObjectId('5eb51b899993d0175fb7e15c'), 'Gender': 'F', 'Name': 'Molly', 'State': 'AK'}, {'_id': ObjectId('5eb51b899993d0175fb7e278'), 'Gender': 'F', 'Name': 'Miley', 'State': 'MA'}, {'_id': ObjectId('5eb51b899993d0175fb7e29b'), 'Gender': 'F', 'Name': 'Malory', 'State': 'PA'}, {'_id': ObjectId('5eb51b899993d0175fb7e48c'), 'Gender': 'F', 'Name': 'Mendy', 'State': 'IN'}, {'_id': ObjectId('5eb51b899993d0175fb7e579'), 'Gender': 'M', 'Name': 'Marty', 'State': 'SD'}, {'_id': 
ObjectId('5eb51b899993d0175fb7e58b'), 'Gender': 'F', 'Name': 'Mallory', 'State': 'DC'}, {'_id': ObjectId('5eb51b899993d0175fb7e5b1'), 'Gender': 'F', 'Name': 'Margery', 'State': 'IN'}, {'_id': ObjectId('5eb51b899993d0175fb7e636'), 'Gender': 'F', 'Name': 'Mallory', 'State': 'VA'}, {'_id': ObjectId('5eb51b899993d0175fb7e66f'), 'Gender': 'F', 'Name': 'Maisy', 'State': 'MO'}, {'_id': ObjectId('5eb51b899993d0175fb7e6ae'), 'Gender': 'F', 'Name': 'Merrily', 'State': 'MO'}, {'_id': ObjectId('5eb51b899993d0175fb7e6c3'), 'Gender': 'M', 'Name': 'Marty', 'State': 'WV'}, {'_id': ObjectId('5eb51b899993d0175fb7e6cb'), 'Gender': 'F', 'Name': 'Margaretmary', 'State': 'PA'}, {'_id': ObjectId('5eb51b899993d0175fb7e6de'), 'Gender': 'F', 'Name': 'Molly', 'State': 'ND'}, {'_id': ObjectId('5eb51b899993d0175fb7e734'), 'Gender': 'M', 'Name': 'Mickey', 'State': 'ND'}, {'_id': ObjectId('5eb51b899993d0175fb7e760'), 'Gender': 'F', 'Name': 'Macey', 'State': 'SD'}, {'_id': ObjectId('5eb51b899993d0175fb7e967'), 'Gender': 'M', 'Name': 'Murray', 'State': 'PA'}, {'_id': ObjectId('5eb51b899993d0175fb7ea23'), 'Gender': 'F', 'Name': 'Mandy', 'State': 'SD'}, {'_id': ObjectId('5eb51b899993d0175fb7ea8c'), 'Gender': 'F', 'Name': 'Marjory', 'State': 'CO'}, {'_id': ObjectId('5eb51b899993d0175fb7eaec'), 'Gender': 'F', 'Name': 'Melany', 'State': 'IN'}, {'_id': ObjectId('5eb51b899993d0175fb7eb39'), 'Gender': 'F', 'Name': 'Miley', 'State': 'PA'}, {'_id': ObjectId('5eb51b899993d0175fb7eba4'), 'Gender': 'F', 'Name': 'Molly', 'State': 'MO'}, {'_id': ObjectId('5eb51b899993d0175fb7ecb1'), 'Gender': 'F', 'Name': 'Margy', 'State': 'TX'}, {'_id': ObjectId('5eb51b899993d0175fb7eea9'), 'Gender': 'F', 'Name': 'Makenzy', 'State': 'MS'}, {'_id': ObjectId('5eb51b899993d0175fb7ef04'), 'Gender': 'F', 'Name': 'Marykay', 'State': 'OH'}, {'_id': ObjectId('5eb51b899993d0175fb7ef2f'), 'Gender': 'M', 'Name': 'Mccoy', 'State': 'WI'}, {'_id': ObjectId('5eb51b899993d0175fb7ef7a'), 'Gender': 'F', 'Name': 'Malky', 'State': 'NY'}, {'_id': 
ObjectId('5eb51b899993d0175fb7eff7'), 'Gender': 'F', 'Name': 'Marshay', 'State': 'OH'}, {'_id': ObjectId('5eb51b899993d0175fb7f042'), 'Gender': 'M', 'Name': 'Marley', 'State': 'NM'}, {'_id': ObjectId('5eb51b899993d0175fb7f21c'), 'Gender': 'F', 'Name': 'May', 'State': 'NM'}, {'_id': ObjectId('5eb51b899993d0175fb7f246'), 'Gender': 'F', 'Name': 'Marykay', 'State': 'NY'}, {'_id': ObjectId('5eb51b899993d0175fb7f440'), 'Gender': 'F', 'Name': 'Mckinley', 'State': 'TX'}, {'_id': ObjectId('5eb51b899993d0175fb7f479'), 'Gender': 'M', 'Name': 'Murphy', 'State': 'TX'}, {'_id': ObjectId('5eb51b899993d0175fb7f4d4'), 'Gender': 'F', 'Name': 'Mahogany', 'State': 'NY'}, {'_id': ObjectId('5eb51b899993d0175fb7f4db'), 'Gender': 'F', 'Name': 'Makinley', 'State': 'AL'}, {'_id': ObjectId('5eb51b899993d0175fb7f524'), 'Gender': 'F', 'Name': 'Marty', 'State': 'MS'}, {'_id': ObjectId('5eb51b899993d0175fb7f5c7'), 'Gender': 'M', 'Name': 'Morty', 'State': 'NY'}, {'_id': ObjectId('5eb51b899993d0175fb7f5f1'), 'Gender': 'F', 'Name': 'Miley', 'State': 'AL'}, {'_id': ObjectId('5eb51b899993d0175fb7f622'), 'Gender': 'F', 'Name': 'May', 'State': 'NY'}, {'_id': ObjectId('5eb51b899993d0175fb7f86b'), 'Gender': 'F', 'Name': 'Mallory', 'State': 'TX'}, {'_id': ObjectId('5eb51b899993d0175fb7f965'), 'Gender': 'F', 'Name': 'Melody', 'State': 'NM'}, {'_id': ObjectId('5eb51b899993d0175fb7f97c'), 'Gender': 'F', 'Name': 'Macy', 'State': 'OH'}, {'_id': ObjectId('5eb51b899993d0175fb7f98c'), 'Gender': 'F', 'Name': 'Merry', 'State': 'TX'}, {'_id': ObjectId('5eb51b899993d0175fb7fcba'), 'Gender': 'M', 'Name': 'Murray', 'State': 'AL'}, {'_id': ObjectId('5eb51b899993d0175fb7fcd1'), 'Gender': 'F', 'Name': 'Marty', 'State': 'KY'}, {'_id': ObjectId('5eb51b899993d0175fb7fd0e'), 'Gender': 'F', 'Name': 'Miley', 'State': 'KS'}, {'_id': ObjectId('5eb51b899993d0175fb7fd4d'), 'Gender': 'M', 'Name': 'Micky', 'State': 'MS'}, {'_id': ObjectId('5eb51b899993d0175fb7fed7'), 'Gender': 'F', 'Name': 'Molly', 'State': 'TX'}, {'_id': 
ObjectId('5eb51b899993d0175fb7ff24'), 'Gender': 'F', 'Name': 'Marty', 'State': 'TN'}, {'_id': ObjectId('5eb51b899993d0175fb7ffef'), 'Gender': 'F', 'Name': 'Mindy', 'State': 'MS'}, {'_id': ObjectId('5eb51b899993d0175fb80017'), 'Gender': 'F', 'Name': 'Mercy', 'State': 'KY'}, {'_id': ObjectId('5eb51b899993d0175fb8004e'), 'Gender': 'F', 'Name': 'Makinley', 'State': 'TN'}, {'_id': ObjectId('5eb51b899993d0175fb80136'), 'Gender': 'F', 'Name': 'Missy', 'State': 'NY'}, {'_id': ObjectId('5eb51b899993d0175fb803a5'), 'Gender': 'F', 'Name': 'Marty', 'State': 'KS'}, {'_id': ObjectId('5eb51b899993d0175fb803be'), 'Gender': 'F', 'Name': 'Meleny', 'State': 'TX'}, {'_id': ObjectId('5eb51b899993d0175fb804a5'), 'Gender': 'F', 'Name': 'Mendy', 'State': 'TX'}, {'_id': ObjectId('5eb51b899993d0175fb8051c'), 'Gender': 'M', 'Name': 'Murray', 'State': 'TN'}, {'_id': ObjectId('5eb51b899993d0175fb805b3'), 'Gender': 'F', 'Name': 'Marley', 'State': 'MS'}, {'_id': ObjectId('5eb51b899993d0175fb80693'), 'Gender': 'F', 'Name': 'May', 'State': 'TX'}, {'_id': ObjectId('5eb51b899993d0175fb8069e'), 'Gender': 'M', 'Name': 'Marty', 'State': 'WI'}, {'_id': ObjectId('5eb51b899993d0175fb806ab'), 'Gender': 'F', 'Name': 'Mabry', 'State': 'MS'}, {'_id': ObjectId('5eb51b899993d0175fb806cf'), 'Gender': 'F', 'Name': 'Melody', 'State': 'ME'}, {'_id': ObjectId('5eb51b899993d0175fb8070b'), 'Gender': 'M', 'Name': 'Montgomery', 'State': 'TX'}, {'_id': ObjectId('5eb51b899993d0175fb8070e'), 'Gender': 'M', 'Name': 'Maury', 'State': 'NY'}, {'_id': ObjectId('5eb51b899993d0175fb80813'), 'Gender': 'F', 'Name': 'Mckinsey', 'State': 'OH'}, {'_id': ObjectId('5eb51b899993d0175fb8084e'), 'Gender': 'F', 'Name': 'Mallory', 'State': 'NY'}, {'_id': ObjectId('5eb51b899993d0175fb80a61'), 'Gender': 'F', 'Name': 'Molly', 'State': 'NM'}, {'_id': ObjectId('5eb51b899993d0175fb80ab0'), 'Gender': 'F', 'Name': 'Macy', 'State': 'ME'}, {'_id': ObjectId('5eb51b899993d0175fb80b5e'), 'Gender': 'M', 'Name': 'Mickey', 'State': 'AL'}, {'_id': 
ObjectId('5eb51b899993d0175fb80b70'), 'Gender': 'F', 'Name': 'Mercy', 'State': 'MS'}, {'_id': ObjectId('5eb51b899993d0175fb80be2'), 'Gender': 'F', 'Name': 'Mindy', 'State': 'KS'}, {'_id': ObjectId('5eb51b899993d0175fb80c95'), 'Gender': 'F', 'Name': 'Maisy', 'State': 'UT'}, {'_id': ObjectId('5eb51b899993d0175fb80cd1'), 'Gender': 'F', 'Name': 'Milady', 'State': 'TX'}, {'_id': ObjectId('5eb51b899993d0175fb80d29'), 'Gender': 'F', 'Name': 'Merrily', 'State': 'TX'}, {'_id': ObjectId('5eb51b899993d0175fb80e10'), 'Gender': 'F', 'Name': 'Makenzy', 'State': 'TN'}, {'_id': ObjectId('5eb51b899993d0175fb80e46'), 'Gender': 'F', 'Name': 'Macey', 'State': 'ME'}, {'_id': ObjectId('5eb51b899993d0175fb80f30'), 'Gender': 'M', 'Name': 'Mary', 'State': 'KS'}, {'_id': ObjectId('5eb51b899993d0175fb80f7c'), 'Gender': 'F', 'Name': 'Mileidy', 'State': 'TX'}, {'_id': ObjectId('5eb51b899993d0175fb80fad'), 'Gender': 'M', 'Name': 'Murray', 'State': 'KS'}, {'_id': ObjectId('5eb51b899993d0175fb81060'), 'Gender': 'F', 'Name': 'Molly', 'State': 'NY'}, {'_id': ObjectId('5eb51b899993d0175fb81085'), 'Gender': 'F', 'Name': 'Miley', 'State': 'MS'}, {'_id': ObjectId('5eb51b899993d0175fb81148'), 'Gender': 'F', 'Name': 'Mckenzy', 'State': 'TX'}, {'_id': ObjectId('5eb51b899993d0175fb81371'), 'Gender': 'F', 'Name': 'Marely', 'State': 'NM'}, {'_id': ObjectId('5eb51b899993d0175fb81636'), 'Gender': 'F', 'Name': 'Misty', 'State': 'OH'}, {'_id': ObjectId('5eb51b899993d0175fb81646'), 'Gender': 'F', 'Name': 'Mindy', 'State': 'TN'}, {'_id': ObjectId('5eb51b899993d0175fb816f1'), 'Gender': 'F', 'Name': 'Marly', 'State': 'NY'}, {'_id': ObjectId('5eb51b899993d0175fb81861'), 'Gender': 'F', 'Name': 'Marcy', 'State': 'NY'}, {'_id': ObjectId('5eb51b899993d0175fb81890'), 'Gender': 'M', 'Name': 'Mckinley', 'State': 'TN'}, {'_id': ObjectId('5eb51b899993d0175fb81896'), 'Gender': 'F', 'Name': 'Mercy', 'State': 'WI'}, {'_id': ObjectId('5eb51b899993d0175fb818d6'), 'Gender': 'M', 'Name': 'Murry', 'State': 'TX'}, {'_id': 
ObjectId('5eb51b899993d0175fb81959'), 'Gender': 'M', 'Name': 'Mckay', 'State': 'ID'}, {'_id': ObjectId('5eb51b899993d0175fb819fb'), 'Gender': 'F', 'Name': 'Macey', 'State': 'OH'}, {'_id': ObjectId('5eb51b899993d0175fb81a44'), 'Gender': 'F', 'Name': 'Molly', 'State': 'ME'}, {'_id': ObjectId('5eb51b8a9993d0175fb81b78'), 'Gender': 'M', 'Name': 'Marty', 'State': 'TN'}, {'_id': ObjectId('5eb51b8a9993d0175fb81bbe'), 'Gender': 'F', 'Name': 'Marcy', 'State': 'OH'}, {'_id': ObjectId('5eb51b8a9993d0175fb81c5c'), 'Gender': 'F', 'Name': 'Mickey', 'State': 'ID'}, {'_id': ObjectId('5eb51b8a9993d0175fb81cd7'), 'Gender': 'F', 'Name': 'Mendy', 'State': 'OH'}, {'_id': ObjectId('5eb51b8a9993d0175fb81d29'), 'Gender': 'F', 'Name': 'Mabry', 'State': 'TN'}, {'_id': ObjectId('5eb51b8a9993d0175fb81dc9'), 'Gender': 'F', 'Name': 'Margery', 'State': 'OH'}, {'_id': ObjectId('5eb51b8a9993d0175fb81f75'), 'Gender': 'F', 'Name': 'Maisy', 'State': 'KS'}, {'_id': ObjectId('5eb51b8a9993d0175fb81fca'), 'Gender': 'M', 'Name': 'Mary', 'State': 'TN'}, {'_id': ObjectId('5eb51b8a9993d0175fb81fe6'), 'Gender': 'M', 'Name': 'Marty', 'State': 'KS'}, {'_id': ObjectId('5eb51b8a9993d0175fb82008'), 'Gender': 'F', 'Name': 'Mckinley', 'State': 'NY'}, {'_id': ObjectId('5eb51b8a9993d0175fb8203d'), 'Gender': 'F', 'Name': 'Mahogany', 'State': 'TX'}, {'_id': ObjectId('5eb51b8a9993d0175fb820b1'), 'Gender': 'F', 'Name': 'Mitzy', 'State': 'TX'}, {'_id': ObjectId('5eb51b8a9993d0175fb82141'), 'Gender': 'F', 'Name': 'Margy', 'State': 'OH'}, {'_id': ObjectId('5eb51b8a9993d0175fb8225b'), 'Gender': 'F', 'Name': 'Magaly', 'State': 'NY'}, {'_id': ObjectId('5eb51b8a9993d0175fb82286'), 'Gender': 'M', 'Name': 'Micky', 'State': 'TN'}, {'_id': ObjectId('5eb51b8a9993d0175fb82463'), 'Gender': 'F', 'Name': 'Marty', 'State': 'WI'}, {'_id': ObjectId('5eb51b8a9993d0175fb82482'), 'Gender': 'F', 'Name': 'Mckinley', 'State': 'OH'}, {'_id': ObjectId('5eb51b8a9993d0175fb8251c'), 'Gender': 'M', 'Name': 'Mccoy', 'State': 'ID'}, {'_id': 
ObjectId('5eb51b8a9993d0175fb82540'), 'Gender': 'F', 'Name': 'Marley', 'State': 'TN'}, {'_id': ObjectId('5eb51b8a9993d0175fb825d7'), 'Gender': 'M', 'Name': 'Malachy', 'State': 'NY'}, {'_id': ObjectId('5eb51b8a9993d0175fb826ef'), 'Gender': 'F', 'Name': 'Melany', 'State': 'NM'}, {'_id': ObjectId('5eb51b8a9993d0175fb827bc'), 'Gender': 'F', 'Name': 'Mindy', 'State': 'CO'}, {'_id': ObjectId('5eb51b8a9993d0175fb827eb'), 'Gender': 'F', 'Name': 'Melony', 'State': 'MO'}, {'_id': ObjectId('5eb51b8a9993d0175fb828f7'), 'Gender': 'F', 'Name': 'Mary', 'State': 'OR'}, {'_id': ObjectId('5eb51b8a9993d0175fb829b8'), 'Gender': 'F', 'Name': 'Macy', 'State': 'MI'}, {'_id': ObjectId('5eb51b8a9993d0175fb829c9'), 'Gender': 'M', 'Name': 'Mckinley', 'State': 'MO'}, {'_id': ObjectId('5eb51b8a9993d0175fb829fa'), 'Gender': 'F', 'Name': 'Marley', 'State': 'MA'}, {'_id': ObjectId('5eb51b8a9993d0175fb82a69'), 'Gender': 'F', 'Name': 'Mercy', 'State': 'IN'}, {'_id': ObjectId('5eb51b8a9993d0175fb82a72'), 'Gender': 'F', 'Name': 'Marcy', 'State': 'CT'}, {'_id': ObjectId('5eb51b8a9993d0175fb82b70'), 'Gender': 'F', 'Name': 'Macey', 'State': 'MI'}, {'_id': ObjectId('5eb51b8a9993d0175fb82b91'), 'Gender': 'M', 'Name': 'Marley', 'State': 'OR'}, {'_id': ObjectId('5eb51b8a9993d0175fb82b95'), 'Gender': 'M', 'Name': 'Mallory', 'State': 'GA'}, {'_id': ObjectId('5eb51b8a9993d0175fb82ba6'), 'Gender': 'M', 'Name': 'Mckinley', 'State': 'VA'}, {'_id': ObjectId('5eb51b8a9993d0175fb82c5e'), 'Gender': 'F', 'Name': 'Merikay', 'State': 'MI'}, {'_id': ObjectId('5eb51b8a9993d0175fb82cdb'), 'Gender': 'F', 'Name': 'Macy', 'State': 'OR'}, {'_id': ObjectId('5eb51b8a9993d0175fb82d20'), 'Gender': 'F', 'Name': 'Makenzy', 'State': 'PA'}, {'_id': ObjectId('5eb51b8a9993d0175fb82d40'), 'Gender': 'F', 'Name': 'May', 'State': 'OR'}, {'_id': ObjectId('5eb51b8a9993d0175fb82d75'), 'Gender': 'F', 'Name': 'Mandy', 'State': 'MI'}, {'_id': ObjectId('5eb51b8a9993d0175fb82d8b'), 'Gender': 'M', 'Name': 'Marty', 'State': 'VA'}, {'_id': 
ObjectId('5eb51b8a9993d0175fb82edd'), 'Gender': 'F', 'Name': 'Misty', 'State': 'NE'}, {'_id': ObjectId('5eb51b8a9993d0175fb82f2e'), 'Gender': 'F', 'Name': 'Macey', 'State': 'OR'}, {'_id': ObjectId('5eb51b8a9993d0175fb82f77'), 'Gender': 'F', 'Name': 'Marley', 'State': 'CO'}, {'_id': ObjectId('5eb51b8a9993d0175fb82fa9'), 'Gender': 'F', 'Name': 'Melony', 'State': 'GA'}, {'_id': ObjectId('5eb51b8a9993d0175fb82ff7'), 'Gender': 'F', 'Name': 'Melanny', 'State': 'VA'}, {'_id': ObjectId('5eb51b8a9993d0175fb830fa'), 'Gender': 'F', 'Name': 'Mindy', 'State': 'WY'}, {'_id': ObjectId('5eb51b8a9993d0175fb831a2'), 'Gender': 'M', 'Name': 'Mckinley', 'State': 'GA'}, {'_id': ObjectId('5eb51b8a9993d0175fb832fc'), 'Gender': 'F', 'Name': 'Mindy', 'State': 'AZ'}, {'_id': ObjectId('5eb51b8a9993d0175fb83313'), 'Gender': 'F', 'Name': 'Marykay', 'State': 'MN'}, {'_id': ObjectId('5eb51b8a9993d0175fb8332c'), 'Gender': 'F', 'Name': 'Margery', 'State': 'RI'}, {'_id': ObjectId('5eb51b8a9993d0175fb833af'), 'Gender': 'F', 'Name': 'Macey', 'State': 'NE'}, {'_id': ObjectId('5eb51b8a9993d0175fb833b4'), 'Gender': 'M', 'Name': 'Mckinley', 'State': 'DC'}, {'_id': ObjectId('5eb51b8a9993d0175fb83530'), 'Gender': 'F', 'Name': 'Marley', 'State': 'VA'}, {'_id': ObjectId('5eb51b8a9993d0175fb8359a'), 'Gender': 'F', 'Name': 'Macy', 'State': 'MN'}, {'_id': ObjectId('5eb51b8a9993d0175fb8373d'), 'Gender': 'F', 'Name': 'Merry', 'State': 'OR'}, {'_id': ObjectId('5eb51b8a9993d0175fb8376e'), 'Gender': 'F', 'Name': 'Marley', 'State': 'GA'}, {'_id': ObjectId('5eb51b8a9993d0175fb83818'), 'Gender': 'M', 'Name': 'Mckinley', 'State': 'PA'}, {'_id': ObjectId('5eb51b8a9993d0175fb8386a'), 'Gender': 'F', 'Name': 'Mercy', 'State': 'VA'}, {'_id': ObjectId('5eb51b8a9993d0175fb838a6'), 'Gender': 'F', 'Name': 'Marcy', 'State': 'MN'}, {'_id': ObjectId('5eb51b8a9993d0175fb838f8'), 'Gender': 'F', 'Name': 'Mandy', 'State': 'NE'}, {'_id': ObjectId('5eb51b8a9993d0175fb83961'), 'Gender': 'F', 'Name': 'Mindy', 'State': 'MO'}, {'_id': 
ObjectId('5eb51b8a9993d0175fb839b4'), 'Gender': 'M', 'Name': 'Montgomery', 'State': 'MI'}, {'_id': ObjectId('5eb51b8a9993d0175fb839bc'), 'Gender': 'F', 'Name': 'Molly', 'State': 'MT'}, {'_id': ObjectId('5eb51b8a9993d0175fb839c2'), 'Gender': 'F', 'Name': 'Makinley', 'State': 'WV'}, {'_id': ObjectId('5eb51b8a9993d0175fb83a18'), 'Gender': 'F', 'Name': 'Marely', 'State': 'MI'}, {'_id': ObjectId('5eb51b8a9993d0175fb83ae9'), 'Gender': 'F', 'Name': 'Magaly', 'State': 'MI'}, {'_id': ObjectId('5eb51b8a9993d0175fb83bbb'), 'Gender': 'F', 'Name': 'Marty', 'State': 'CO'}, {'_id': ObjectId('5eb51b8a9993d0175fb83c5f'), 'Gender': 'F', 'Name': 'Margy', 'State': 'MN'}, {'_id': ObjectId('5eb51b8a9993d0175fb83cdb'), 'Gender': 'F', 'Name': 'Mckinley', 'State': 'MN'}, {'_id': ObjectId('5eb51b8a9993d0175fb83d7b'), 'Gender': 'F', 'Name': 'Marley', 'State': 'MO'}, {'_id': ObjectId('5eb51b8a9993d0175fb83da3'), 'Gender': 'F', 'Name': 'Mabry', 'State': 'MO'}, {'_id': ObjectId('5eb51b8a9993d0175fb83e42'), 'Gender': 'F', 'Name': 'Marley', 'State': 'AK'}, {'_id': ObjectId('5eb51b8a9993d0175fb83e4d'), 'Gender': 'M', 'Name': 'Monty', 'State': 'WV'}, {'_id': ObjectId('5eb51b8a9993d0175fb83ed2'), 'Gender': 'F', 'Name': 'Mercy', 'State': 'MO'}, {'_id': ObjectId('5eb51b8a9993d0175fb83fff'), 'Gender': 'F', 'Name': 'Miley', 'State': 'VT'}, {'_id': ObjectId('5eb51b8a9993d0175fb841d4'), 'Gender': 'F', 'Name': 'Marty', 'State': 'PA'}, {'_id': ObjectId('5eb51b8a9993d0175fb8438e'), 'Gender': 'F', 'Name': 'Margery', 'State': 'MT'}, {'_id': ObjectId('5eb51b8a9993d0175fb843f9'), 'Gender': 'F', 'Name': 'Marty', 'State': 'GA'}, {'_id': ObjectId('5eb51b8a9993d0175fb84458'), 'Gender': 'F', 'Name': 'Mckinley', 'State': 'MI'}, {'_id': ObjectId('5eb51b8a9993d0175fb844f6'), 'Gender': 'F', 'Name': 'Melody', 'State': 'CT'}, {'_id': ObjectId('5eb51b8a9993d0175fb844fe'), 'Gender': 'M', 'Name': 'Marty', 'State': 'MO'}, {'_id': ObjectId('5eb51b8a9993d0175fb8458c'), 'Gender': 'F', 'Name': 'Magaly', 'State': 'OR'}, {'_id': 
ObjectId('5eb51b8a9993d0175fb84691'), 'Gender': 'F', 'Name': 'My', 'State': 'MN'}, {'_id': ObjectId('5eb51b8a9993d0175fb846de'), 'Gender': 'F', 'Name': 'Macy', 'State': 'CT'}, {'_id': ObjectId('5eb51b8a9993d0175fb846f1'), 'Gender': 'F', 'Name': 'Marely', 'State': 'MN'}, {'_id': ObjectId('5eb51b8a9993d0175fb84970'), 'Gender': 'M', 'Name': 'Manley', 'State': 'CA'}, {'_id': ObjectId('5eb51b8a9993d0175fb84a27'), 'Gender': 'F', 'Name': 'Magaly', 'State': 'NJ'}, {'_id': ObjectId('5eb51b8a9993d0175fb84a97'), 'Gender': 'M', 'Name': 'Mary', 'State': 'CA'}, {'_id': ObjectId('5eb51b8a9993d0175fb84bf4'), 'Gender': 'F', 'Name': 'Mindy', 'State': 'IA'}, {'_id': ObjectId('5eb51b8a9993d0175fb84c12'), 'Gender': 'F', 'Name': 'Milly', 'State': 'SC'}, {'_id': ObjectId('5eb51b8a9993d0175fb84e28'), 'Gender': 'F', 'Name': 'Merry', 'State': 'SC'}, {'_id': ObjectId('5eb51b8a9993d0175fb84e3c'), 'Gender': 'F', 'Name': 'Miley', 'State': 'LA'}, {'_id': ObjectId('5eb51b8a9993d0175fb84e8d'), 'Gender': 'M', 'Name': 'Murray', 'State': 'WA'}, {'_id': ObjectId('5eb51b8a9993d0175fb84e9b'), 'Gender': 'F', 'Name': 'Marely', 'State': 'NJ'}, {'_id': ObjectId('5eb51b8a9993d0175fb84f8c'), 'Gender': 'M', 'Name': 'Murray', 'State': 'CA'}, {'_id': ObjectId('5eb51b8a9993d0175fb84f97'), 'Gender': 'M', 'Name': 'Mckinley', 'State': 'NJ'}, {'_id': ObjectId('5eb51b8a9993d0175fb8517a'), 'Gender': 'F', 'Name': 'Melany', 'State': 'AR'}, {'_id': ObjectId('5eb51b8a9993d0175fb851c5'), 'Gender': 'F', 'Name': 'Mallory', 'State': 'WA'}, {'_id': ObjectId('5eb51b8a9993d0175fb85201'), 'Gender': 'M', 'Name': 'Mary', 'State': 'AR'}, {'_id': ObjectId('5eb51b8a9993d0175fb852eb'), 'Gender': 'F', 'Name': 'Magaly', 'State': 'IL'}, {'_id': ObjectId('5eb51b8a9993d0175fb85300'), 'Gender': 'F', 'Name': 'Mary', 'State': 'SC'}, {'_id': ObjectId('5eb51b8a9993d0175fb85349'), 'Gender': 'F', 'Name': 'Mickey', 'State': 'AR'}, {'_id': ObjectId('5eb51b8a9993d0175fb853cf'), 'Gender': 'F', 'Name': 'Missy', 'State': 'WA'}, {'_id': 
ObjectId('5eb51b8a9993d0175fb85612'), 'Gender': 'F', 'Name': 'Magally', 'State': 'CA'}, {'_id': ObjectId('5eb51b8a9993d0175fb8564f'), 'Gender': 'F', 'Name': 'Mary', 'State': 'IA'}, {'_id': ObjectId('5eb51b8a9993d0175fb856a8'), 'Gender': 'M', 'Name': 'Murray', 'State': 'AR'}, {'_id': ObjectId('5eb51b8a9993d0175fb85749'), 'Gender': 'F', 'Name': 'Miley', 'State': 'AR'}, {'_id': ObjectId('5eb51b8a9993d0175fb8581d'), 'Gender': 'F', 'Name': 'Malillany', 'State': 'CA'}, {'_id': ObjectId('5eb51b8a9993d0175fb8599e'), 'Gender': 'F', 'Name': 'Merary', 'State': 'CA'}, {'_id': ObjectId('5eb51b8a9993d0175fb859ef'), 'Gender': 'F', 'Name': 'Mayerly', 'State': 'CA'}, {'_id': ObjectId('5eb51b8a9993d0175fb85a01'), 'Gender': 'F', 'Name': 'Melany', 'State': 'WA'}, {'_id': ObjectId('5eb51b8a9993d0175fb85a2b'), 'Gender': 'F', 'Name': 'Marleny', 'State': 'CA'}, {'_id': ObjectId('5eb51b8a9993d0175fb85c1b'), 'Gender': 'M', 'Name': 'Monty', 'State': 'CA'}, {'_id': ObjectId('5eb51b8a9993d0175fb85cd7'), 'Gender': 'M', 'Name': 'Maury', 'State': 'LA'}, {'_id': ObjectId('5eb51b8a9993d0175fb85e3c'), 'Gender': 'F', 'Name': 'Merry', 'State': 'IL'}, {'_id': ObjectId('5eb51b8a9993d0175fb85e6a'), 'Gender': 'F', 'Name': 'Mary', 'State': 'OK'}, {'_id': ObjectId('5eb51b8a9993d0175fb85e89'), 'Gender': 'F', 'Name': 'Molly', 'State': 'CA'}, {'_id': ObjectId('5eb51b8a9993d0175fb85e9e'), 'Gender': 'F', 'Name': 'Melody', 'State': 'NJ'}, {'_id': ObjectId('5eb51b8a9993d0175fb85ee9'), 'Gender': 'F', 'Name': 'Mandy', 'State': 'IA'}, {'_id': ObjectId('5eb51b8a9993d0175fb85ef6'), 'Gender': 'F', 'Name': 'Molly', 'State': 'WA'}, {'_id': ObjectId('5eb51b8a9993d0175fb85f05'), 'Gender': 'F', 'Name': 'Mary', 'State': 'FL'}, {'_id': ObjectId('5eb51b8a9993d0175fb85f6c'), 'Gender': 'M', 'Name': 'Marley', 'State': 'SC'}, {'_id': ObjectId('5eb51b8a9993d0175fb861bc'), 'Gender': 'F', 'Name': 'Mary', 'State': 'IL'}, {'_id': ObjectId('5eb51b8a9993d0175fb861fd'), 'Gender': 'F', 'Name': 'Mickey', 'State': 'NC'}, {'_id': 
ObjectId('5eb51b8a9993d0175fb86220'), 'Gender': 'F', 'Name': 'Marcey', 'State': 'IL'}, {'_id': ObjectId('5eb51b8a9993d0175fb862bc'), 'Gender': 'F', 'Name': 'Mercy', 'State': 'MD'}, {'_id': ObjectId('5eb51b8a9993d0175fb8648f'), 'Gender': 'F', 'Name': 'Macy', 'State': 'IL'}, {'_id': ObjectId('5eb51b8a9993d0175fb864e6'), 'Gender': 'M', 'Name': 'Mckay', 'State': 'CA'}, {'_id': ObjectId('5eb51b8a9993d0175fb865a2'), 'Gender': 'F', 'Name': 'Macy', 'State': 'FL'}, {'_id': ObjectId('5eb51b8a9993d0175fb8669c'), 'Gender': 'M', 'Name': 'Marcanthony', 'State': 'CA'}, {'_id': ObjectId('5eb51b8a9993d0175fb866af'), 'Gender': 'F', 'Name': 'Mercy', 'State': 'FL'}, {'_id': ObjectId('5eb51b8a9993d0175fb86715'), 'Gender': 'F', 'Name': 'Merrily', 'State': 'WA'}, {'_id': ObjectId('5eb51b8a9993d0175fb86889'), 'Gender': 'F', 'Name': 'Margery', 'State': 'NC'}, {'_id': ObjectId('5eb51b8a9993d0175fb869ab'), 'Gender': 'M', 'Name': 'Mickey', 'State': 'CA'}, {'_id': ObjectId('5eb51b8a9993d0175fb869ec'), 'Gender': 'F', 'Name': 'Mallory', 'State': 'DE'}, {'_id': ObjectId('5eb51b8a9993d0175fb86a73'), 'Gender': 'F', 'Name': 'Macy', 'State': 'MD'}, {'_id': ObjectId('5eb51b8a9993d0175fb86bcd'), 'Gender': 'M', 'Name': 'Mosby', 'State': 'NC'}, {'_id': ObjectId('5eb51b8a9993d0175fb86d74'), 'Gender': 'M', 'Name': 'Matvey', 'State': 'FL'}, {'_id': ObjectId('5eb51b8a9993d0175fb86d90'), 'Gender': 'F', 'Name': 'Margy', 'State': 'IL'}, {'_id': ObjectId('5eb51b8a9993d0175fb86ed8'), 'Gender': 'F', 'Name': 'Melanny', 'State': 'NJ'}, {'_id': ObjectId('5eb51b8a9993d0175fb86eff'), 'Gender': 'M', 'Name': 'Maury', 'State': 'WA'}, {'_id': ObjectId('5eb51b8a9993d0175fb86fb8'), 'Gender': 'F', 'Name': 'Marcy', 'State': 'FL'}, {'_id': ObjectId('5eb51b8a9993d0175fb86fc9'), 'Gender': 'F', 'Name': 'Mallory', 'State': 'NV'}, {'_id': ObjectId('5eb51b8a9993d0175fb87006'), 'Gender': 'F', 'Name': 'Marley', 'State': 'OK'}, {'_id': ObjectId('5eb51b8a9993d0175fb8709e'), 'Gender': 'F', 'Name': 'Macey', 'State': 'MD'}, {'_id': 
ObjectId('5eb51b8a9993d0175fb87239'), 'Gender': 'F', 'Name': 'Molly', 'State': 'DE'}, {'_id': ObjectId('5eb51b8a9993d0175fb8727d'), 'Gender': 'F', 'Name': 'Margery', 'State': 'CA'}, {'_id': ObjectId('5eb51b8a9993d0175fb87501'), 'Gender': 'F', 'Name': 'Melany', 'State': 'NV'}, {'_id': ObjectId('5eb51b8a9993d0175fb87558'), 'Gender': 'M', 'Name': 'Mickey', 'State': 'AR'}, {'_id': ObjectId('5eb51b8a9993d0175fb875c4'), 'Gender': 'F', 'Name': 'Maizy', 'State': 'OK'}, {'_id': ObjectId('5eb51b8a9993d0175fb87607'), 'Gender': 'F', 'Name': 'Merry', 'State': 'NJ'}, {'_id': ObjectId('5eb51b8a9993d0175fb87630'), 'Gender': 'F', 'Name': 'Maisy', 'State': 'WA'}, {'_id': ObjectId('5eb51b8a9993d0175fb876a7'), 'Gender': 'F', 'Name': 'Marley', 'State': 'NJ'}, {'_id': ObjectId('5eb51b8a9993d0175fb877e0'), 'Gender': 'M', 'Name': 'Murphy', 'State': 'FL'}, {'_id': ObjectId('5eb51b8a9993d0175fb8782e'), 'Gender': 'F', 'Name': 'Makenzy', 'State': 'FL'}, {'_id': ObjectId('5eb51b8a9993d0175fb8783d'), 'Gender': 'F', 'Name': 'Merry', 'State': 'MD'}, {'_id': ObjectId('5eb51b8a9993d0175fb87845'), 'Gender': 'F', 'Name': 'Mckinley', 'State': 'OK'}, {'_id': ObjectId('5eb51b8a9993d0175fb8789a'), 'Gender': 'F', 'Name': 'Makenzy', 'State': 'OK'}, ...]
# search in nested documents
# returns every (name, gender, state) record with more than 4000 babies born in 1990
query = { "YearsCountDict.1990": {"$gt": 4000}}
list(collection.find(query, {'YearsCountDict':0}))
[{'_id': ObjectId('5eb51b889993d0175fb5fbeb'), 'Gender': 'M', 'Name': 'David', 'State': 'CA'}, {'_id': ObjectId('5eb51b899993d0175fb66eb7'), 'Gender': 'M', 'Name': 'Michael', 'State': 'TX'}, {'_id': ObjectId('5eb51b899993d0175fb67e7f'), 'Gender': 'M', 'Name': 'Michael', 'State': 'NY'}, {'_id': ObjectId('5eb51b899993d0175fb6bfd9'), 'Gender': 'M', 'Name': 'Anthony', 'State': 'CA'}, {'_id': ObjectId('5eb51b899993d0175fb6c2b7'), 'Gender': 'F', 'Name': 'Ashley', 'State': 'CA'}, {'_id': ObjectId('5eb51b899993d0175fb6da47'), 'Gender': 'M', 'Name': 'Michael', 'State': 'CA'}, {'_id': ObjectId('5eb51b899993d0175fb6e54b'), 'Gender': 'M', 'Name': 'Andrew', 'State': 'CA'}, {'_id': ObjectId('5eb51b899993d0175fb78eaa'), 'Gender': 'F', 'Name': 'Jessica', 'State': 'CA'}, {'_id': ObjectId('5eb51b8a9993d0175fb8522b'), 'Gender': 'M', 'Name': 'Matthew', 'State': 'CA'}, {'_id': ObjectId('5eb51b8a9993d0175fb8607a'), 'Gender': 'M', 'Name': 'Jose', 'State': 'CA'}, {'_id': ObjectId('5eb51b8a9993d0175fb8d57b'), 'Gender': 'M', 'Name': 'Christopher', 'State': 'NY'}, {'_id': ObjectId('5eb51b8a9993d0175fb93e75'), 'Gender': 'M', 'Name': 'Daniel', 'State': 'CA'}, {'_id': ObjectId('5eb51b8b9993d0175fb9e653'), 'Gender': 'M', 'Name': 'Jonathan', 'State': 'CA'}, {'_id': ObjectId('5eb51b8b9993d0175fba07e0'), 'Gender': 'M', 'Name': 'Joshua', 'State': 'CA'}, {'_id': ObjectId('5eb51b8b9993d0175fba13d9'), 'Gender': 'M', 'Name': 'Christopher', 'State': 'CA'}]
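To make the dot notation in the query above more concrete, here is a minimal pure-Python sketch of what a path such as "YearsCountDict.1990" means: follow each key in turn through the nested dictionaries, then compare the value found at the end. The helpers get_path and matches_gt are hypothetical names written for this illustration and are not part of pymongo or MongoDB. The sample records and their counts are made up, and the sketch assumes the year keys are stored as strings. It also ignores arrays, which real MongoDB dot notation handles as well.

```python
def get_path(document, dotted_path):
    """Follow a dotted path such as 'YearsCountDict.1990' through nested dicts.

    Returns None when any key along the path is missing.
    """
    current = document
    for key in dotted_path.split("."):
        if not isinstance(current, dict) or key not in current:
            return None
        current = current[key]
    return current


def matches_gt(document, dotted_path, threshold):
    """Rough stand-in for {dotted_path: {"$gt": threshold}} on a single document."""
    value = get_path(document, dotted_path)
    # A document without the field does not match a $gt condition.
    return value is not None and value > threshold


# Made-up records shaped like the documents in this lecture (illustrative counts only).
docs = [
    {"Name": "Alpha", "Gender": "F", "State": "CA",
     "YearsCountDict": {"1990": 4500, "1991": 4200}},
    {"Name": "Beta", "Gender": "M", "State": "TX",
     "YearsCountDict": {"1990": 120}},
    {"Name": "Gamma", "Gender": "F", "State": "NY",
     "YearsCountDict": {"1985": 9000}},
]

hits = [d["Name"] for d in docs if matches_gt(d, "YearsCountDict.1990", 4000)]
print(hits)  # ['Alpha']
```

Note that the third record is skipped even though it has a large count, because it has no entry for 1990 at all. The same thing happens in the real query: documents without a "1990" key inside YearsCountDict are simply not returned.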
We can also sort the returned results, or limit how many are returned (by chaining limit() onto the cursor). Here we sort by name:
query = { "YearsCountDict.2000": {"$gt": 1000}}
list(collection.find(query, {'YearsCountDict':0}).sort("Name"))
[{'_id': ObjectId('5eb51b899993d0175fb78593'), 'Gender': 'M', 'Name': 'Aaron', 'State': 'CA'}, {'_id': ObjectId('5eb51b8a9993d0175fb8bd8c'), 'Gender': 'F', 'Name': 'Abigail', 'State': 'TX'}, {'_id': ObjectId('5eb51b8a9993d0175fb87b78'), 'Gender': 'M', 'Name': 'Adrian', 'State': 'CA'}, {'_id': ObjectId('5eb51b8a9993d0175fb88b2d'), 'Gender': 'M', 'Name': 'Alejandro', 'State': 'CA'}, {'_id': ObjectId('5eb51b899993d0175fb7fb10'), 'Gender': 'M', 'Name': 'Alexander', 'State': 'TX'}, {'_id': ObjectId('5eb51b8b9993d0175fb99010'), 'Gender': 'M', 'Name': 'Alexander', 'State': 'NY'}, {'_id': ObjectId('5eb51b8b9993d0175fb9eee7'), 'Gender': 'M', 'Name': 'Alexander', 'State': 'CA'}, {'_id': ObjectId('5eb51b899993d0175fb79496'), 'Gender': 'F', 'Name': 'Alexandra', 'State': 'CA'}, {'_id': ObjectId('5eb51b899993d0175fb79893'), 'Gender': 'F', 'Name': 'Alexis', 'State': 'CA'}, {'_id': ObjectId('5eb51b8a9993d0175fb8c926'), 'Gender': 'F', 'Name': 'Alexis', 'State': 'TX'}, {'_id': ObjectId('5eb51b8a9993d0175fb9224b'), 'Gender': 'F', 'Name': 'Alyssa', 'State': 'CA'}, {'_id': ObjectId('5eb51b8b9993d0175fba531e'), 'Gender': 'F', 'Name': 'Alyssa', 'State': 'TX'}, {'_id': ObjectId('5eb51b8a9993d0175fb8588f'), 'Gender': 'F', 'Name': 'Amanda', 'State': 'CA'}, {'_id': ObjectId('5eb51b889993d0175fb5eb84'), 'Gender': 'F', 'Name': 'Andrea', 'State': 'CA'}, {'_id': ObjectId('5eb51b899993d0175fb67f92'), 'Gender': 'M', 'Name': 'Andrew', 'State': 'TX'}, {'_id': ObjectId('5eb51b899993d0175fb69171'), 'Gender': 'M', 'Name': 'Andrew', 'State': 'OH'}, {'_id': ObjectId('5eb51b899993d0175fb6e54b'), 'Gender': 'M', 'Name': 'Andrew', 'State': 'CA'}, {'_id': ObjectId('5eb51b8b9993d0175fb9a759'), 'Gender': 'M', 'Name': 'Andrew', 'State': 'NY'}, {'_id': ObjectId('5eb51b899993d0175fb68f9a'), 'Gender': 'M', 'Name': 'Angel', 'State': 'TX'}, {'_id': ObjectId('5eb51b8a9993d0175fb939c0'), 'Gender': 'M', 'Name': 'Angel', 'State': 'CA'}, {'_id': ObjectId('5eb51b889993d0175fb628b2'), 'Gender': 'M', 'Name': 'Anthony', 
'State': 'IL'}, {'_id': ObjectId('5eb51b899993d0175fb6bfd9'), 'Gender': 'M', 'Name': 'Anthony', 'State': 'CA'}, {'_id': ObjectId('5eb51b899993d0175fb7f775'), 'Gender': 'M', 'Name': 'Anthony', 'State': 'NY'}, {'_id': ObjectId('5eb51b8a9993d0175fb940a7'), 'Gender': 'M', 'Name': 'Anthony', 'State': 'FL'}, {'_id': ObjectId('5eb51b8b9993d0175fb99479'), 'Gender': 'M', 'Name': 'Anthony', 'State': 'TX'}, {'_id': ObjectId('5eb51b889993d0175fb60eea'), 'Gender': 'F', 'Name': 'Ashley', 'State': 'FL'}, {'_id': ObjectId('5eb51b899993d0175fb66f07'), 'Gender': 'F', 'Name': 'Ashley', 'State': 'NY'}, {'_id': ObjectId('5eb51b899993d0175fb6c2b7'), 'Gender': 'F', 'Name': 'Ashley', 'State': 'CA'}, {'_id': ObjectId('5eb51b899993d0175fb7eea0'), 'Gender': 'F', 'Name': 'Ashley', 'State': 'TX'}, {'_id': ObjectId('5eb51b899993d0175fb66ce3'), 'Gender': 'M', 'Name': 'Austin', 'State': 'TX'}, {'_id': ObjectId('5eb51b8a9993d0175fb85260'), 'Gender': 'M', 'Name': 'Austin', 'State': 'CA'}, {'_id': ObjectId('5eb51b899993d0175fb78b9e'), 'Gender': 'M', 'Name': 'Benjamin', 'State': 'CA'}, {'_id': ObjectId('5eb51b899993d0175fb7f535'), 'Gender': 'M', 'Name': 'Brandon', 'State': 'NY'}, {'_id': ObjectId('5eb51b8a9993d0175fb88554'), 'Gender': 'M', 'Name': 'Brandon', 'State': 'FL'}, {'_id': ObjectId('5eb51b8a9993d0175fb924e4'), 'Gender': 'M', 'Name': 'Brandon', 'State': 'CA'}, {'_id': ObjectId('5eb51b8b9993d0175fb990cf'), 'Gender': 'M', 'Name': 'Brandon', 'State': 'TX'}, {'_id': ObjectId('5eb51b8b9993d0175fb9e000'), 'Gender': 'M', 'Name': 'Brian', 'State': 'CA'}, {'_id': ObjectId('5eb51b899993d0175fb80993'), 'Gender': 'F', 'Name': 'Brianna', 'State': 'NY'}, {'_id': ObjectId('5eb51b8a9993d0175fb860df'), 'Gender': 'F', 'Name': 'Brianna', 'State': 'CA'}, {'_id': ObjectId('5eb51b8b9993d0175fb98696'), 'Gender': 'F', 'Name': 'Brianna', 'State': 'TX'}, {'_id': ObjectId('5eb51b899993d0175fb7ac9c'), 'Gender': 'M', 'Name': 'Bryan', 'State': 'CA'}, {'_id': ObjectId('5eb51b899993d0175fb6e612'), 'Gender': 'M', 'Name': 
'Cameron', 'State': 'CA'}, {'_id': ObjectId('5eb51b8a9993d0175fb88732'), 'Gender': 'M', 'Name': 'Carlos', 'State': 'CA'}, {'_id': ObjectId('5eb51b8b9993d0175fb9a506'), 'Gender': 'M', 'Name': 'Carlos', 'State': 'TX'}, {'_id': ObjectId('5eb51b889993d0175fb66419'), 'Gender': 'M', 'Name': 'Christian', 'State': 'TX'}, {'_id': ObjectId('5eb51b899993d0175fb6c805'), 'Gender': 'M', 'Name': 'Christian', 'State': 'CA'}, {'_id': ObjectId('5eb51b8a9993d0175fb92d0e'), 'Gender': 'M', 'Name': 'Christian', 'State': 'FL'}, {'_id': ObjectId('5eb51b889993d0175fb5f030'), 'Gender': 'M', 'Name': 'Christopher', 'State': 'FL'}, {'_id': ObjectId('5eb51b899993d0175fb73a20'), 'Gender': 'M', 'Name': 'Christopher', 'State': 'TX'}, {'_id': ObjectId('5eb51b899993d0175fb797a0'), 'Gender': 'M', 'Name': 'Christopher', 'State': 'NJ'}, {'_id': ObjectId('5eb51b899993d0175fb7dd5b'), 'Gender': 'M', 'Name': 'Christopher', 'State': 'GA'}, {'_id': ObjectId('5eb51b8a9993d0175fb8d57b'), 'Gender': 'M', 'Name': 'Christopher', 'State': 'NY'}, {'_id': ObjectId('5eb51b8b9993d0175fba13d9'), 'Gender': 'M', 'Name': 'Christopher', 'State': 'CA'}, {'_id': ObjectId('5eb51b899993d0175fb6de60'), 'Gender': 'M', 'Name': 'Daniel', 'State': 'FL'}, {'_id': ObjectId('5eb51b8a9993d0175fb8dcbe'), 'Gender': 'M', 'Name': 'Daniel', 'State': 'TX'}, {'_id': ObjectId('5eb51b8a9993d0175fb93e75'), 'Gender': 'M', 'Name': 'Daniel', 'State': 'CA'}, {'_id': ObjectId('5eb51b8b9993d0175fb9eb0b'), 'Gender': 'M', 'Name': 'Daniel', 'State': 'IL'}, {'_id': ObjectId('5eb51b8c9993d0175fba5ce6'), 'Gender': 'M', 'Name': 'Daniel', 'State': 'NY'}, {'_id': ObjectId('5eb51b889993d0175fb5fbeb'), 'Gender': 'M', 'Name': 'David', 'State': 'CA'}, {'_id': ObjectId('5eb51b899993d0175fb6f9f2'), 'Gender': 'M', 'Name': 'David', 'State': 'FL'}, {'_id': ObjectId('5eb51b899993d0175fb721b6'), 'Gender': 'M', 'Name': 'David', 'State': 'NY'}, {'_id': ObjectId('5eb51b899993d0175fb72f48'), 'Gender': 'M', 'Name': 'David', 'State': 'TX'}, {'_id': 
ObjectId('5eb51b8a9993d0175fb9066a'), 'Gender': 'F', 'Name': 'Destiny', 'State': 'CA'}, {'_id': ObjectId('5eb51b889993d0175fb5f480'), 'Gender': 'M', 'Name': 'Dylan', 'State': 'CA'}, {'_id': ObjectId('5eb51b889993d0175fb66203'), 'Gender': 'M', 'Name': 'Dylan', 'State': 'TX'}, {'_id': ObjectId('5eb51b889993d0175fb5f236'), 'Gender': 'M', 'Name': 'Eduardo', 'State': 'CA'}, {'_id': ObjectId('5eb51b8b9993d0175fba13d5'), 'Gender': 'F', 'Name': 'Elizabeth', 'State': 'CA'}, {'_id': ObjectId('5eb51b8c9993d0175fba6b09'), 'Gender': 'F', 'Name': 'Elizabeth', 'State': 'TX'}, {'_id': ObjectId('5eb51b899993d0175fb6fca2'), 'Gender': 'F', 'Name': 'Emily', 'State': 'IL'}, {'_id': ObjectId('5eb51b899993d0175fb73214'), 'Gender': 'F', 'Name': 'Emily', 'State': 'TX'}, {'_id': ObjectId('5eb51b899993d0175fb777d5'), 'Gender': 'F', 'Name': 'Emily', 'State': 'CA'}, {'_id': ObjectId('5eb51b8a9993d0175fb835e2'), 'Gender': 'F', 'Name': 'Emily', 'State': 'PA'}, {'_id': ObjectId('5eb51b8a9993d0175fb8877c'), 'Gender': 'F', 'Name': 'Emily', 'State': 'FL'}, {'_id': ObjectId('5eb51b8b9993d0175fba3d1d'), 'Gender': 'F', 'Name': 'Emily', 'State': 'NY'}, {'_id': ObjectId('5eb51b8c9993d0175fba5f2e'), 'Gender': 'F', 'Name': 'Emily', 'State': 'OH'}, {'_id': ObjectId('5eb51b8b9993d0175fb9e04c'), 'Gender': 'F', 'Name': 'Emma', 'State': 'CA'}, {'_id': ObjectId('5eb51b8a9993d0175fb84f3d'), 'Gender': 'M', 'Name': 'Eric', 'State': 'CA'}, {'_id': ObjectId('5eb51b889993d0175fb5f910'), 'Gender': 'M', 'Name': 'Ethan', 'State': 'CA'}, {'_id': ObjectId('5eb51b8b9993d0175fba3c71'), 'Gender': 'M', 'Name': 'Ethan', 'State': 'TX'}, {'_id': ObjectId('5eb51b899993d0175fb6cf8b'), 'Gender': 'M', 'Name': 'Gabriel', 'State': 'CA'}, {'_id': ObjectId('5eb51b889993d0175fb6226b'), 'Gender': 'F', 'Name': 'Grace', 'State': 'CA'}, {'_id': ObjectId('5eb51b899993d0175fb700f5'), 'Gender': 'F', 'Name': 'Hannah', 'State': 'CA'}, {'_id': ObjectId('5eb51b899993d0175fb73fd6'), 'Gender': 'F', 'Name': 'Hannah', 'State': 'OH'}, {'_id': 
ObjectId('5eb51b899993d0175fb74419'), 'Gender': 'F', 'Name': 'Hannah', 'State': 'TX'}, {'_id': ObjectId('5eb51b899993d0175fb78e67'), 'Gender': 'F', 'Name': 'Hannah', 'State': 'IL'}, {'_id': ObjectId('5eb51b8a9993d0175fb91a60'), 'Gender': 'F', 'Name': 'Hannah', 'State': 'FL'}, {'_id': ObjectId('5eb51b8a9993d0175fb84e9a'), 'Gender': 'M', 'Name': 'Isaac', 'State': 'CA'}, {'_id': ObjectId('5eb51b899993d0175fb6e0b6'), 'Gender': 'F', 'Name': 'Isabella', 'State': 'CA'}, {'_id': ObjectId('5eb51b8a9993d0175fb909db'), 'Gender': 'M', 'Name': 'Isaiah', 'State': 'CA'}, {'_id': ObjectId('5eb51b899993d0175fb69e00'), 'Gender': 'M', 'Name': 'Jacob', 'State': 'IN'}, {'_id': ObjectId('5eb51b899993d0175fb6a72e'), 'Gender': 'M', 'Name': 'Jacob', 'State': 'PA'}, {'_id': ObjectId('5eb51b899993d0175fb71add'), 'Gender': 'M', 'Name': 'Jacob', 'State': 'OH'}, {'_id': ObjectId('5eb51b899993d0175fb790ad'), 'Gender': 'M', 'Name': 'Jacob', 'State': 'NC'}, {'_id': ObjectId('5eb51b8a9993d0175fb8867b'), 'Gender': 'M', 'Name': 'Jacob', 'State': 'FL'}, {'_id': ObjectId('5eb51b8a9993d0175fb8e589'), 'Gender': 'M', 'Name': 'Jacob', 'State': 'MI'}, {'_id': ObjectId('5eb51b8a9993d0175fb914d9'), 'Gender': 'M', 'Name': 'Jacob', 'State': 'CA'}, {'_id': ObjectId('5eb51b8b9993d0175fba1412'), 'Gender': 'M', 'Name': 'Jacob', 'State': 'IL'}, {'_id': ObjectId('5eb51b8b9993d0175fba3ca9'), 'Gender': 'M', 'Name': 'Jacob', 'State': 'NY'}, {'_id': ObjectId('5eb51b8b9993d0175fba4aff'), 'Gender': 'M', 'Name': 'Jacob', 'State': 'TX'}, {'_id': ObjectId('5eb51b8a9993d0175fb94303'), 'Gender': 'F', 'Name': 'Jacqueline', 'State': 'CA'}, {'_id': ObjectId('5eb51b899993d0175fb7322e'), 'Gender': 'M', 'Name': 'James', 'State': 'TX'}, {'_id': ObjectId('5eb51b8a9993d0175fb87540'), 'Gender': 'M', 'Name': 'James', 'State': 'CA'}, {'_id': ObjectId('5eb51b8a9993d0175fb8ccff'), 'Gender': 'M', 'Name': 'James', 'State': 'NY'}, {'_id': ObjectId('5eb51b899993d0175fb6c017'), 'Gender': 'F', 'Name': 'Jasmine', 'State': 'CA'}, {'_id': 
ObjectId('5eb51b899993d0175fb79871'), 'Gender': 'M', 'Name': 'Jason', 'State': 'CA'}, {'_id': ObjectId('5eb51b889993d0175fb5fc1f'), 'Gender': 'F', 'Name': 'Jennifer', 'State': 'CA'}, {'_id': ObjectId('5eb51b8b9993d0175fba4ab0'), 'Gender': 'F', 'Name': 'Jennifer', 'State': 'TX'}, {'_id': ObjectId('5eb51b899993d0175fb67729'), 'Gender': 'F', 'Name': 'Jessica', 'State': 'TX'}, {'_id': ObjectId('5eb51b899993d0175fb78eaa'), 'Gender': 'F', 'Name': 'Jessica', 'State': 'CA'}, {'_id': ObjectId('5eb51b899993d0175fb7f48f'), 'Gender': 'F', 'Name': 'Jessica', 'State': 'NY'}, {'_id': ObjectId('5eb51b889993d0175fb607dc'), 'Gender': 'M', 'Name': 'Jesus', 'State': 'CA'}, {'_id': ObjectId('5eb51b8a9993d0175fb8bcc0'), 'Gender': 'M', 'Name': 'Jesus', 'State': 'TX'}, {'_id': ObjectId('5eb51b889993d0175fb63f84'), 'Gender': 'M', 'Name': 'John', 'State': 'PA'}, {'_id': ObjectId('5eb51b889993d0175fb668d4'), 'Gender': 'M', 'Name': 'John', 'State': 'NY'}, {'_id': ObjectId('5eb51b899993d0175fb803b2'), 'Gender': 'M', 'Name': 'John', 'State': 'TX'}, {'_id': ObjectId('5eb51b8b9993d0175fb9fa17'), 'Gender': 'M', 'Name': 'John', 'State': 'CA'}, {'_id': ObjectId('5eb51b899993d0175fb69442'), 'Gender': 'M', 'Name': 'Jonathan', 'State': 'NY'}, {'_id': ObjectId('5eb51b8a9993d0175fb931b0'), 'Gender': 'M', 'Name': 'Jonathan', 'State': 'FL'}, {'_id': ObjectId('5eb51b8b9993d0175fb982cc'), 'Gender': 'M', 'Name': 'Jonathan', 'State': 'TX'}, {'_id': ObjectId('5eb51b8b9993d0175fb9e653'), 'Gender': 'M', 'Name': 'Jonathan', 'State': 'CA'}, {'_id': ObjectId('5eb51b899993d0175fb6deeb'), 'Gender': 'M', 'Name': 'Jordan', 'State': 'CA'}, {'_id': ObjectId('5eb51b8a9993d0175fb8827b'), 'Gender': 'M', 'Name': 'Jorge', 'State': 'CA'}, {'_id': ObjectId('5eb51b899993d0175fb67387'), 'Gender': 'M', 'Name': 'Jose', 'State': 'TX'}, {'_id': ObjectId('5eb51b8a9993d0175fb8607a'), 'Gender': 'M', 'Name': 'Jose', 'State': 'CA'}, {'_id': ObjectId('5eb51b889993d0175fb610b3'), 'Gender': 'M', 'Name': 'Joseph', 'State': 'CA'}, {'_id': 
ObjectId('5eb51b899993d0175fb694c3'), 'Gender': 'M', 'Name': 'Joseph', 'State': 'TX'}, {'_id': ObjectId('5eb51b899993d0175fb6dbaf'), 'Gender': 'M', 'Name': 'Joseph', 'State': 'IL'}, {'_id': ObjectId('5eb51b899993d0175fb771d5'), 'Gender': 'M', 'Name': 'Joseph', 'State': 'PA'}, {'_id': ObjectId('5eb51b8b9993d0175fb98d5b'), 'Gender': 'M', 'Name': 'Joseph', 'State': 'OH'}, {'_id': ObjectId('5eb51b8b9993d0175fb99f8a'), 'Gender': 'M', 'Name': 'Joseph', 'State': 'NY'}, {'_id': ObjectId('5eb51b8b9993d0175fb9f4ff'), 'Gender': 'M', 'Name': 'Joseph', 'State': 'FL'}, {'_id': ObjectId('5eb51b889993d0175fb656e0'), 'Gender': 'M', 'Name': 'Joshua', 'State': 'PA'}, {'_id': ObjectId('5eb51b899993d0175fb68364'), 'Gender': 'M', 'Name': 'Joshua', 'State': 'NY'}, {'_id': ObjectId('5eb51b899993d0175fb690af'), 'Gender': 'M', 'Name': 'Joshua', 'State': 'TX'}, {'_id': ObjectId('5eb51b899993d0175fb70cd6'), 'Gender': 'M', 'Name': 'Joshua', 'State': 'MI'}, {'_id': ObjectId('5eb51b899993d0175fb78d60'), 'Gender': 'M', 'Name': 'Joshua', 'State': 'FL'}, {'_id': ObjectId('5eb51b899993d0175fb805d9'), 'Gender': 'M', 'Name': 'Joshua', 'State': 'OH'}, {'_id': ObjectId('5eb51b8a9993d0175fb882c0'), 'Gender': 'M', 'Name': 'Joshua', 'State': 'NC'}, {'_id': ObjectId('5eb51b8a9993d0175fb9190a'), 'Gender': 'M', 'Name': 'Joshua', 'State': 'IL'}, {'_id': ObjectId('5eb51b8b9993d0175fba07e0'), 'Gender': 'M', 'Name': 'Joshua', 'State': 'CA'}, {'_id': ObjectId('5eb51b899993d0175fb6d36d'), 'Gender': 'M', 'Name': 'Juan', 'State': 'CA'}, {'_id': ObjectId('5eb51b8b9993d0175fba4565'), 'Gender': 'M', 'Name': 'Juan', 'State': 'TX'}, {'_id': ObjectId('5eb51b899993d0175fb78396'), 'Gender': 'F', 'Name': 'Julia', 'State': 'CA'}, {'_id': ObjectId('5eb51b899993d0175fb6fdec'), 'Gender': 'M', 'Name': 'Julian', 'State': 'CA'}, {'_id': ObjectId('5eb51b889993d0175fb61bd7'), 'Gender': 'M', 'Name': 'Justin', 'State': 'CA'}, {'_id': ObjectId('5eb51b8b9993d0175fba4ae9'), 'Gender': 'M', 'Name': 'Justin', 'State': 'NY'}, {'_id': 
ObjectId('5eb51b8b9993d0175fba57f4'), 'Gender': 'M', 'Name': 'Justin', 'State': 'TX'}, {'_id': ObjectId('5eb51b899993d0175fb6e633'), 'Gender': 'F', 'Name': 'Kayla', 'State': 'CA'}, {'_id': ObjectId('5eb51b8b9993d0175fba5837'), 'Gender': 'F', 'Name': 'Kayla', 'State': 'NY'}, {'_id': ObjectId('5eb51b899993d0175fb6e524'), 'Gender': 'M', 'Name': 'Kevin', 'State': 'CA'}, {'_id': ObjectId('5eb51b899993d0175fb7334e'), 'Gender': 'M', 'Name': 'Kevin', 'State': 'NY'}, {'_id': ObjectId('5eb51b899993d0175fb7420a'), 'Gender': 'M', 'Name': 'Kevin', 'State': 'TX'}, {'_id': ObjectId('5eb51b889993d0175fb5fc20'), 'Gender': 'F', 'Name': 'Kimberly', 'State': 'CA'}, {'_id': ObjectId('5eb51b8b9993d0175fb9e51a'), 'Gender': 'M', 'Name': 'Kyle', 'State': 'CA'}, {'_id': ObjectId('5eb51b8a9993d0175fb8738a'), 'Gender': 'F', 'Name': 'Lauren', 'State': 'CA'}, {'_id': ObjectId('5eb51b8a9993d0175fb8db27'), 'Gender': 'F', 'Name': 'Lauren', 'State': 'TX'}, {'_id': ObjectId('5eb51b899993d0175fb6cbde'), 'Gender': 'F', 'Name': 'Leslie', 'State': 'CA'}, {'_id': ObjectId('5eb51b899993d0175fb6d054'), 'Gender': 'M', 'Name': 'Luis', 'State': 'CA'}, {'_id': ObjectId('5eb51b899993d0175fb7f553'), 'Gender': 'M', 'Name': 'Luis', 'State': 'TX'}, {'_id': ObjectId('5eb51b899993d0175fb67e4c'), 'Gender': 'F', 'Name': 'Madison', 'State': 'OH'}, {'_id': ObjectId('5eb51b899993d0175fb68fbd'), 'Gender': 'F', 'Name': 'Madison', 'State': 'TX'}, {'_id': ObjectId('5eb51b8a9993d0175fb93a3a'), 'Gender': 'F', 'Name': 'Madison', 'State': 'CA'}, {'_id': ObjectId('5eb51b899993d0175fb80676'), 'Gender': 'F', 'Name': 'Maria', 'State': 'TX'}, {'_id': ObjectId('5eb51b8b9993d0175fb9d7d2'), 'Gender': 'F', 'Name': 'Maria', 'State': 'CA'}, {'_id': ObjectId('5eb51b889993d0175fb62bed'), 'Gender': 'M', 'Name': 'Matthew', 'State': 'FL'}, {'_id': ObjectId('5eb51b899993d0175fb7330e'), 'Gender': 'M', 'Name': 'Matthew', 'State': 'TX'}, {'_id': ObjectId('5eb51b899993d0175fb746a4'), 'Gender': 'M', 'Name': 'Matthew', 'State': 'OH'}, {'_id': 
ObjectId('5eb51b899993d0175fb7a2a1'), 'Gender': 'M', 'Name': 'Matthew', 'State': 'NJ'}, {'_id': ObjectId('5eb51b899993d0175fb7be9f'), 'Gender': 'M', 'Name': 'Matthew', 'State': 'MA'}, {'_id': ObjectId('5eb51b8a9993d0175fb8522b'), 'Gender': 'M', 'Name': 'Matthew', 'State': 'CA'}, {'_id': ObjectId('5eb51b8a9993d0175fb8d039'), 'Gender': 'M', 'Name': 'Matthew', 'State': 'NY'}, {'_id': ObjectId('5eb51b8a9993d0175fb94681'), 'Gender': 'M', 'Name': 'Matthew', 'State': 'IL'}, {'_id': ObjectId('5eb51b8a9993d0175fb95eff'), 'Gender': 'M', 'Name': 'Matthew', 'State': 'PA'}, {'_id': ObjectId('5eb51b8a9993d0175fb9296e'), 'Gender': 'F', 'Name': 'Megan', 'State': 'CA'}, {'_id': ObjectId('5eb51b8b9993d0175fb9efe5'), 'Gender': 'F', 'Name': 'Melissa', 'State': 'CA'}, {'_id': ObjectId('5eb51b889993d0175fb625c6'), 'Gender': 'M', 'Name': 'Michael', 'State': 'NJ'}, {'_id': ObjectId('5eb51b899993d0175fb66eb7'), 'Gender': 'M', 'Name': 'Michael', 'State': 'TX'}, {'_id': ObjectId('5eb51b899993d0175fb67e7f'), 'Gender': 'M', 'Name': 'Michael', 'State': 'NY'}, {'_id': ObjectId('5eb51b899993d0175fb6888c'), 'Gender': 'M', 'Name': 'Michael', 'State': 'OH'}, {'_id': ObjectId('5eb51b899993d0175fb6da47'), 'Gender': 'M', 'Name': 'Michael', 'State': 'CA'}, {'_id': ObjectId('5eb51b899993d0175fb7a96c'), 'Gender': 'M', 'Name': 'Michael', 'State': 'FL'}, {'_id': ObjectId('5eb51b899993d0175fb7ac87'), 'Gender': 'M', 'Name': 'Michael', 'State': 'IL'}, {'_id': ObjectId('5eb51b899993d0175fb7c849'), 'Gender': 'M', 'Name': 'Michael', 'State': 'PA'}, {'_id': ObjectId('5eb51b899993d0175fb7d025'), 'Gender': 'M', 'Name': 'Michael', 'State': 'MA'}, {'_id': ObjectId('5eb51b8b9993d0175fba0223'), 'Gender': 'F', 'Name': 'Michelle', 'State': 'CA'}, {'_id': ObjectId('5eb51b899993d0175fb6c10d'), 'Gender': 'M', 'Name': 'Miguel', 'State': 'CA'}, {'_id': ObjectId('5eb51b889993d0175fb60fce'), 'Gender': 'F', 'Name': 'Natalie', 'State': 'CA'}, {'_id': ObjectId('5eb51b8a9993d0175fb8addd'), 'Gender': 'M', 'Name': 'Nathan', 'State': 
'TX'}, {'_id': ObjectId('5eb51b8b9993d0175fb9de39'), 'Gender': 'M', 'Name': 'Nathan', 'State': 'CA'}, {'_id': ObjectId('5eb51b889993d0175fb60df6'), 'Gender': 'M', 'Name': 'Nicholas', 'State': 'FL'}, {'_id': ObjectId('5eb51b889993d0175fb617b6'), 'Gender': 'M', 'Name': 'Nicholas', 'State': 'NJ'}, {'_id': ObjectId('5eb51b899993d0175fb6c177'), 'Gender': 'M', 'Name': 'Nicholas', 'State': 'CA'}, {'_id': ObjectId('5eb51b899993d0175fb7946d'), 'Gender': 'M', 'Name': 'Nicholas', 'State': 'IL'}, {'_id': ObjectId('5eb51b899993d0175fb7e98f'), 'Gender': 'M', 'Name': 'Nicholas', 'State': 'PA'}, {'_id': ObjectId('5eb51b899993d0175fb7ede2'), 'Gender': 'M', 'Name': 'Nicholas', 'State': 'TX'}, {'_id': ObjectId('5eb51b899993d0175fb801aa'), 'Gender': 'M', 'Name': 'Nicholas', 'State': 'OH'}, {'_id': ObjectId('5eb51b8b9993d0175fb98b11'), 'Gender': 'M', 'Name': 'Nicholas', 'State': 'NY'}, {'_id': ObjectId('5eb51b8a9993d0175fb87444'), 'Gender': 'F', 'Name': 'Nicole', 'State': 'CA'}, {'_id': ObjectId('5eb51b8a9993d0175fb925c0'), 'Gender': 'M', 'Name': 'Noah', 'State': 'CA'}, {'_id': ObjectId('5eb51b899993d0175fb6fcbe'), 'Gender': 'M', 'Name': 'Oscar', 'State': 'CA'}, {'_id': ObjectId('5eb51b899993d0175fb7adcd'), 'Gender': 'F', 'Name': 'Rachel', 'State': 'CA'}, {'_id': ObjectId('5eb51b8b9993d0175fb9fe1e'), 'Gender': 'M', 'Name': 'Richard', 'State': 'CA'}, {'_id': ObjectId('5eb51b8a9993d0175fb855b4'), 'Gender': 'M', 'Name': 'Robert', 'State': 'CA'}, {'_id': ObjectId('5eb51b8a9993d0175fb8a2ba'), 'Gender': 'M', 'Name': 'Robert', 'State': 'TX'}, {'_id': ObjectId('5eb51b899993d0175fb67ce8'), 'Gender': 'M', 'Name': 'Ryan', 'State': 'NY'}, {'_id': ObjectId('5eb51b899993d0175fb79e90'), 'Gender': 'M', 'Name': 'Ryan', 'State': 'CA'}, {'_id': ObjectId('5eb51b8b9993d0175fb9a1a4'), 'Gender': 'M', 'Name': 'Ryan', 'State': 'TX'}, {'_id': ObjectId('5eb51b8c9993d0175fba8473'), 'Gender': 'M', 'Name': 'Ryan', 'State': 'PA'}, {'_id': ObjectId('5eb51b899993d0175fb67056'), 'Gender': 'F', 'Name': 'Samantha', 
'State': 'NY'}, {'_id': ObjectId('5eb51b899993d0175fb81709'), 'Gender': 'F', 'Name': 'Samantha', 'State': 'TX'}, {'_id': ObjectId('5eb51b8a9993d0175fb92a32'), 'Gender': 'F', 'Name': 'Samantha', 'State': 'CA'}, {'_id': ObjectId('5eb51b899993d0175fb66f35'), 'Gender': 'M', 'Name': 'Samuel', 'State': 'TX'}, {'_id': ObjectId('5eb51b8b9993d0175fb9f55c'), 'Gender': 'M', 'Name': 'Samuel', 'State': 'CA'}, {'_id': ObjectId('5eb51b889993d0175fb60e3a'), 'Gender': 'F', 'Name': 'Sarah', 'State': 'CA'}, {'_id': ObjectId('5eb51b899993d0175fb67a6f'), 'Gender': 'F', 'Name': 'Sarah', 'State': 'NY'}, {'_id': ObjectId('5eb51b899993d0175fb815e5'), 'Gender': 'F', 'Name': 'Sarah', 'State': 'TX'}, {'_id': ObjectId('5eb51b8a9993d0175fb87a24'), 'Gender': 'M', 'Name': 'Sebastian', 'State': 'CA'}, {'_id': ObjectId('5eb51b889993d0175fb5fc0f'), 'Gender': 'F', 'Name': 'Sophia', 'State': 'CA'}, {'_id': ObjectId('5eb51b8a9993d0175fb85971'), 'Gender': 'F', 'Name': 'Stephanie', 'State': 'CA'}, {'_id': ObjectId('5eb51b899993d0175fb7b882'), 'Gender': 'M', 'Name': 'Steven', 'State': 'CA'}, {'_id': ObjectId('5eb51b899993d0175fb71a5f'), 'Gender': 'F', 'Name': 'Taylor', 'State': 'TX'}, {'_id': ObjectId('5eb51b8a9993d0175fb9127d'), 'Gender': 'F', 'Name': 'Taylor', 'State': 'CA'}, {'_id': ObjectId('5eb51b899993d0175fb68e07'), 'Gender': 'M', 'Name': 'Thomas', 'State': 'NY'}, {'_id': ObjectId('5eb51b8a9993d0175fb9064c'), 'Gender': 'M', 'Name': 'Thomas', 'State': 'CA'}, {'_id': ObjectId('5eb51b899993d0175fb6cb88'), 'Gender': 'M', 'Name': 'Tyler', 'State': 'CA'}, {'_id': ObjectId('5eb51b899993d0175fb75106'), 'Gender': 'M', 'Name': 'Tyler', 'State': 'OH'}, {'_id': ObjectId('5eb51b899993d0175fb7ca6a'), 'Gender': 'M', 'Name': 'Tyler', 'State': 'PA'}, {'_id': ObjectId('5eb51b8a9993d0175fb8b3db'), 'Gender': 'M', 'Name': 'Tyler', 'State': 'TX'}, {'_id': ObjectId('5eb51b8a9993d0175fb93872'), 'Gender': 'M', 'Name': 'Tyler', 'State': 'FL'}, {'_id': ObjectId('5eb51b8b9993d0175fba3356'), 'Gender': 'M', 'Name': 'Tyler', 
'State': 'NY'}, {'_id': ObjectId('5eb51b899993d0175fb6c5f8'), 'Gender': 'F', 'Name': 'Vanessa', 'State': 'CA'}, {'_id': ObjectId('5eb51b8a9993d0175fb88d95'), 'Gender': 'M', 'Name': 'Victor', 'State': 'CA'}, {'_id': ObjectId('5eb51b889993d0175fb5f697'), 'Gender': 'F', 'Name': 'Victoria', 'State': 'CA'}, {'_id': ObjectId('5eb51b899993d0175fb7f4bd'), 'Gender': 'F', 'Name': 'Victoria', 'State': 'TX'}, {'_id': ObjectId('5eb51b889993d0175fb65cc9'), 'Gender': 'M', 'Name': 'William', 'State': 'GA'}, {'_id': ObjectId('5eb51b899993d0175fb6c672'), 'Gender': 'M', 'Name': 'William', 'State': 'CA'}, {'_id': ObjectId('5eb51b899993d0175fb729fd'), 'Gender': 'M', 'Name': 'William', 'State': 'NY'}, {'_id': ObjectId('5eb51b8a9993d0175fb8a72f'), 'Gender': 'M', 'Name': 'William', 'State': 'TX'}, {'_id': ObjectId('5eb51b8b9993d0175fb9dec1'), 'Gender': 'M', 'Name': 'William', 'State': 'NC'}, {'_id': ObjectId('5eb51b889993d0175fb5ef4c'), 'Gender': 'M', 'Name': 'Zachary', 'State': 'FL'}, {'_id': ObjectId('5eb51b899993d0175fb672cc'), 'Gender': 'M', 'Name': 'Zachary', 'State': 'NY'}, {'_id': ObjectId('5eb51b899993d0175fb683c1'), 'Gender': 'M', 'Name': 'Zachary', 'State': 'TX'}, {'_id': ObjectId('5eb51b8b9993d0175fb982df'), 'Gender': 'M', 'Name': 'Zachary', 'State': 'OH'}, {'_id': ObjectId('5eb51b8b9993d0175fb9f34a'), 'Gender': 'M', 'Name': 'Zachary', 'State': 'CA'}]
query = { "YearsCountDict.2000": {"$gt": 1000}}
list(collection.find(query, {'YearsCountDict':0}).sort("Name").limit(2))
[{'_id': ObjectId('5eb51b899993d0175fb78593'), 'Gender': 'M', 'Name': 'Aaron', 'State': 'CA'}, {'_id': ObjectId('5eb51b8a9993d0175fb8bd8c'), 'Gender': 'F', 'Name': 'Abigail', 'State': 'TX'}]
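To make the dotted-key query above more concrete, here is a minimal pure-Python sketch of what `find(...).sort("Name").limit(2)` with a `$gt` filter and a `{'YearsCountDict': 0}` projection roughly does. The helper names (`get_path`, `find_gt`) and the sample counts are hypothetical and invented for illustration; this emulates the behaviour on in-memory dicts and is not how MongoDB or pymongo implement it.

```python
# Hypothetical in-memory documents shaped like the ones in the collection;
# the counts are invented for illustration only.
docs = [
    {"Name": "Zoe", "State": "CA", "YearsCountDict": {"2000": 1500}},
    {"Name": "Aaron", "State": "CA", "YearsCountDict": {"2000": 2100}},
    {"Name": "Liam", "State": "TX", "YearsCountDict": {"2000": 1800}},
    {"Name": "Abe", "State": "TX", "YearsCountDict": {"2000": 12}},
    {"Name": "Mia", "State": "NY", "YearsCountDict": {"1999": 5000}},
]


def get_path(doc, dotted_key):
    """Follow a dotted key such as 'YearsCountDict.2000' into nested dicts."""
    value = doc
    for part in dotted_key.split("."):
        if not isinstance(value, dict) or part not in value:
            return None  # a missing field never satisfies $gt
        value = value[part]
    return value


def find_gt(docs, dotted_key, threshold, hidden_field, sort_key, limit):
    """Filter on key > threshold, sort ascending, keep `limit` docs, hide one field."""
    matched = []
    for doc in docs:
        value = get_path(doc, dotted_key)
        if value is not None and value > threshold:
            matched.append(doc)
    matched.sort(key=lambda d: d[sort_key])
    return [
        {k: v for k, v in d.items() if k != hidden_field}
        for d in matched[:limit]
    ]


result = find_gt(docs, "YearsCountDict.2000", 1000, "YearsCountDict", "Name", 2)
print(result)
```

Note that the sort happens before the limit, which is why the two returned names are the alphabetically first matches rather than the first two documents stored.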
In this section, we are going to use Cartopy:
We can easily use Cartopy to plot various map projections:
import cartopy.crs as ccrs
import matplotlib.pyplot as plt
import warnings
warnings.filterwarnings('ignore')
%matplotlib inline
plt.figure(figsize=(8, 8))
ax = plt.axes(projection=ccrs.PlateCarree())
print(f"ax type: {type(ax)}")
ax.coastlines()
ax type: <class 'cartopy.mpl.geoaxes.GeoAxesSubplot'>
<cartopy.mpl.feature_artist.FeatureArtist at 0xa1a3d9c50>
plt.figure(figsize=(8, 8))
ax = plt.axes(projection=ccrs.InterruptedGoodeHomolosine())
print(f"ax type: {type(ax)}")
ax.coastlines()
ax.stock_img() # Add a standard image to the map -> add some colors :)
ax type: <class 'cartopy.mpl.geoaxes.GeoAxesSubplot'>
<matplotlib.image.AxesImage at 0xa1c663940>
Let's add some data to the map:
plt.figure(figsize=(20, 20))
ax = plt.axes(projection=ccrs.PlateCarree(central_longitude=0))
ax.set_extent([-20, 20, 40, 60]) # Select a specific part of the map
ax.coastlines(resolution='10m', color='black', linewidth=1) # draw coastlines at a higher resolution
ax.stock_img() # add colors
london_lon, london_lat = -0.1278, 51.5074 # London lies slightly west of the prime meridian, so its longitude is negative
plt.plot(london_lon, london_lat,
         color='blue', marker='o', markersize=8
         )
ax.text(london_lon, 52, 'London', fontsize=14)
Text(-0.1278, 52, 'London')
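PlateCarree is the equirectangular projection: map x is simply longitude and map y is latitude, both in degrees. The short sketch below, with a hypothetical helper `fraction_in_extent`, shows where a point falls inside a window given in the same `[lon_min, lon_max, lat_min, lat_max]` order that `set_extent` takes. It uses London's approximate coordinates (about 0.1278° W, 51.5074° N) and is only an illustration of the arithmetic, not part of Cartopy.

```python
def fraction_in_extent(lon, lat, extent):
    """Return (fx, fy), the relative position of a point inside the extent.

    extent = [lon_min, lon_max, lat_min, lat_max]; 0 is the left/bottom edge
    and 1 is the right/top edge of the plotted window.
    """
    lon_min, lon_max, lat_min, lat_max = extent
    fx = (lon - lon_min) / (lon_max - lon_min)
    fy = (lat - lat_min) / (lat_max - lat_min)
    return fx, fy


extent = [-20, 20, 40, 60]  # same window as in the cell above
fx, fy = fraction_in_extent(-0.1278, 51.5074, extent)
print(f"London sits {fx:.1%} across and {fy:.1%} up the plotted window")
```

Because the mapping is linear, a point just west of the prime meridian ends up just left of the centre of this window.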
Let's plot a map with all the capital cities' names:
import turicreate as tc
!wget -O ./datasets/country-capitals.csv http://techslides.com/demos/country-capitals.csv
sf = tc.SFrame.read_csv("./datasets/country-capitals.csv", error_bad_lines=False)
sf
--2020-05-08 11:58:21-- http://techslides.com/demos/country-capitals.csv Resolving techslides.com (techslides.com)... 107.170.15.66 Connecting to techslides.com (techslides.com)|107.170.15.66|:80... connected. HTTP request sent, awaiting response... 200 OK Length: 13643 (13K) [application/octet-stream] Saving to: ‘./datasets/country-capitals.csv’ ./datasets/country- 100%[===================>] 13.32K --.-KB/s in 0s 2020-05-08 11:58:22 (129 MB/s) - ‘./datasets/country-capitals.csv’ saved [13643/13643]
Unexpected characters after last column. "Central America" Parse failed at token ending at: United States,Washington, D.C.,38.883333,-77.000000,US,Central America^ Successfully parsed 6 tokens: 0: United States 1: Washington 2: D.C. 3: 38.8833 4: -77 5: US
Unexpected characters after last column. "Australia" Parse failed at token ending at: US Minor Outlying Islands,Washington, D.C.,38.883333,-77.000000,UM,Australia^ Successfully parsed 6 tokens: 0: US Minor O ... ng Islands 1: Washington 2: D.C. 3: 38.8833 4: -77 5: UM
------------------------------------------------------ Inferred types from first 100 line(s) of file as column_type_hints=[str,str,float,float,str,str] If parsing fails due to incorrect types, you can correct the inferred type list above and pass it to read_csv in the column_type_hints argument ------------------------------------------------------
2 lines failed to parse correctly
Finished parsing file /Users/michael/Dropbox (BGU)/massive data mining/ 2020/notebooks/datasets/country-capitals.csv
Parsing completed. Parsed 100 lines in 0.027236 secs.
Unable to interpret "D.C." as a integer Parse failed at token ending at: United States,Washington, D.C.,^38.883333,-77.000000,US,Central America Successfully parsed 2 tokens: 0: United States 1: Washington
Unable to interpret "D.C." as a integer Parse failed at token ending at: US Minor Outlying Islands,Washington, D.C.,^38.883333,-77.000000,UM,Australia Successfully parsed 2 tokens: 0: US Minor O ... ng Islands 1: Washington
2 lines failed to parse correctly
Finished parsing file /Users/michael/Dropbox (BGU)/massive data mining/ 2020/notebooks/datasets/country-capitals.csv
Parsing completed. Parsed 243 lines in 0.006348 secs.
CountryName | CapitalName | CapitalLatitude | CapitalLongitude | CountryCode | ContinentName |
---|---|---|---|---|---|
Somaliland | Hargeisa | 9.55 | 44.05 | NULL | Africa |
South Georgia and South Sandwich Islands ... | King Edward Point | -54.283333 | -36.5 | GS | Antarctica |
French Southern and Antarctic Lands ... | Port-aux-Français | -49.35 | 70.216667 | TF | Antarctica |
Palestine | Jerusalem | 31.766666666666666 | 35.233333 | PS | Asia |
Aland Islands | Mariehamn | 60.116667 | 19.9 | AX | Europe |
Nauru | Yaren | -0.5477 | 166.920867 | NR | Australia |
Saint Martin | Marigot | 18.0731 | -63.0822 | MF | North America |
Tokelau | Atafu | -9.166667 | -171.833333 | TK | Australia |
Western Sahara | El-Aaiún | 27.153611 | -13.203333 | EH | Africa |
Afghanistan | Kabul | 34.516666666666666 | 69.183333 | AF | Asia |
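The two rows that failed to parse both contain the capital name "Washington, D.C.", whose unquoted comma yields seven fields instead of six. One possible repair, sketched below with the standard `csv` module, is to merge the surplus field back into the capital-name column and write the row out with proper quoting. The function name `repair_rows` is hypothetical, and the sample text is reconstructed from the log and table above rather than copied from the file.

```python
import csv
import io

# Header and rows reconstructed from the log and table above (not the full file).
raw = (
    "CountryName,CapitalName,CapitalLatitude,CapitalLongitude,CountryCode,ContinentName\n"
    "United States,Washington, D.C.,38.883333,-77.000000,US,Central America\n"
    "Afghanistan,Kabul,34.516666666666666,69.183333,AF,Asia\n"
)


def repair_rows(text, expected_fields=6, name_index=1):
    """Merge surplus fields back into the column at name_index.

    Assumes that any extra commas belong to the capital name, which holds for
    the two failing rows in the log but is not guaranteed in general.
    """
    rows = []
    for row in csv.reader(io.StringIO(text)):
        extra = len(row) - expected_fields
        if extra > 0:
            merged = ",".join(row[name_index:name_index + extra + 1])
            row = row[:name_index] + [merged] + row[name_index + extra + 1:]
        rows.append(row)
    return rows


rows = repair_rows(raw)

# csv.writer quotes any field that contains a comma, so the output is parseable.
out = io.StringIO()
csv.writer(out).writerows(rows)
print(out.getvalue())
```

Saving the repaired text to a new file and reading that file instead should let the parser keep those rows, so `error_bad_lines=False` would no longer silently drop them.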
def draw_map(w_size=30, h_size=30):
    plt.figure(figsize=(w_size, h_size))
    ax = plt.axes(projection=ccrs.PlateCarree(central_longitude=0))
    ax.coastlines(resolution='10m', color='black', linewidth=1) # draw coastlines at a higher resolution
    return ax
ax = draw_map()
for r in sf:
    lon, lat, name = r['CapitalLongitude'], r['CapitalLatitude'], r['CapitalName']
    plt.plot(lon, lat,
             color='black', marker='o', markersize=4, transform=ccrs.PlateCarree(),
             )
    ax.text(lon, lat+0.2, name, fontsize=8, color="blue", transform=ccrs.PlateCarree())
ax.stock_img() # add colors
<matplotlib.image.AxesImage at 0xa1e1215c0>
r = sf[sf['CapitalName'] == 'Canberra'][0]
canb_long, canb_lat = r['CapitalLongitude'], r['CapitalLatitude']
r = sf[sf['CapitalName'] == 'London'][0]
lon_long, lon_lat = r['CapitalLongitude'], r['CapitalLatitude']
ax = draw_map(20,40)
plt.plot([lon_long, canb_long], [lon_lat, canb_lat],
color='blue', linewidth=2, marker='o',
)
ax.text(canb_long, canb_lat+0.5, "Canberra", fontsize=16, color="red", transform=ccrs.PlateCarree())
ax.text(lon_long, lon_lat+0.5, "London", fontsize=16, color="red", transform=ccrs.PlateCarree())
Text(-0.083333, 52.0, 'London')
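Let's plot the line connecting London and Canberra again, this time as a geodesic (the shortest path on the globe) by passing transform=ccrs.Geodetic():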
r = sf[sf['CapitalName'] == 'Canberra'][0]
canb_long, canb_lat = r['CapitalLongitude'], r['CapitalLatitude']
r = sf[sf['CapitalName'] == 'London'][0]
lon_long, lon_lat = r['CapitalLongitude'], r['CapitalLatitude']
ax = draw_map(20,40)
plt.plot([lon_long, canb_long], [lon_lat, canb_lat],
color='blue', linewidth=2, marker='o',
transform=ccrs.Geodetic(),
)
ax.text(canb_long, canb_lat+0.5, "Canberra", fontsize=16, color="red", transform=ccrs.PlateCarree())
ax.text(lon_long, lon_lat+0.5, "London", fontsize=16, color="red", transform=ccrs.PlateCarree())
Text(-0.083333, 52.0, 'London')
Let's draw both lines on the same map, the geodesic path in blue and the straight PlateCarree line in dashed gray:
ax = draw_map(20,40)
plt.plot([lon_long, canb_long], [lon_lat, canb_lat],
color='blue', linewidth=2, marker='o',
transform=ccrs.Geodetic(),
)
ax.text(canb_long, canb_lat+0.5, "Canberra", fontsize=16, color="red", transform=ccrs.PlateCarree())
ax.text(lon_long, lon_lat+0.5, "London", fontsize=16, color="red", transform=ccrs.PlateCarree())
plt.plot([lon_long, canb_long], [lon_lat, canb_lat],
color='gray', linestyle='--',
transform=ccrs.PlateCarree(),
)
[<matplotlib.lines.Line2D at 0xa22f8bc18>]
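The blue and gray lines differ because a geodesic follows the great circle between the two cities, while the PlateCarree line interpolates longitude and latitude linearly. The sketch below computes a few points along the great circle in plain Python so the difference can be seen numerically. London's coordinates are the ones shown in the output above; Canberra's (149.13, -35.28) are approximate values assumed here rather than read from the dataset, and the function names are hypothetical.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius, a common approximation


def to_unit_vector(lon, lat):
    """Convert (lon, lat) in degrees to a 3D unit vector on the sphere."""
    lon_r, lat_r = math.radians(lon), math.radians(lat)
    return (math.cos(lat_r) * math.cos(lon_r),
            math.cos(lat_r) * math.sin(lon_r),
            math.sin(lat_r))


def central_angle(a, b):
    """Angle in radians between two unit vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    return math.acos(max(-1.0, min(1.0, dot)))


def great_circle_points(start, end, n_segments=4):
    """Return n_segments + 1 (lon, lat) points along the great circle.

    Uses spherical linear interpolation; antipodal endpoints are not handled.
    """
    a, b = to_unit_vector(*start), to_unit_vector(*end)
    omega = central_angle(a, b)
    if omega == 0:
        return [start] * (n_segments + 1)
    points = []
    for i in range(n_segments + 1):
        t = i / n_segments
        wa = math.sin((1 - t) * omega) / math.sin(omega)
        wb = math.sin(t * omega) / math.sin(omega)
        x, y, z = (wa * p + wb * q for p, q in zip(a, b))
        lon = math.degrees(math.atan2(y, x))
        lat = math.degrees(math.asin(max(-1.0, min(1.0, z))))
        points.append((lon, lat))
    return points


london = (-0.083333, 51.5)    # (lon, lat) as in the dataset output above
canberra = (149.13, -35.28)   # approximate, assumed for this sketch

path = great_circle_points(london, canberra, n_segments=4)
distance_km = EARTH_RADIUS_KM * central_angle(
    to_unit_vector(*london), to_unit_vector(*canberra))
linear_mid = ((london[0] + canberra[0]) / 2, (london[1] + canberra[1]) / 2)

print(f"great-circle distance: about {distance_km:.0f} km")
print(f"great-circle midpoint: {path[2]}")
print(f"linear (PlateCarree) midpoint: {linear_mid}")
```

The two midpoints land far apart, which is why the geodesic looks curved when drawn on a PlateCarree map.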
We can add additional features to each map:
import cartopy.feature as cfeature
fig = plt.figure(figsize=(30,30))
ax = fig.add_subplot(1, 1, 1, projection=ccrs.PlateCarree())
ax.stock_img()
# Create a feature for States from Natural Earth. See https://www.naturalearthdata.com/features/
states_provinces = cfeature.NaturalEarthFeature(
    category='cultural',
    name='admin_1_states_provinces_lines',
    scale='10m',
)
# it is possible to add land, rivers, lakes, borders, and coastline features
ax.add_feature(cfeature.LAND)
ax.add_feature(cfeature.COASTLINE)
ax.add_feature(cfeature.RIVERS)
ax.add_feature(states_provinces, edgecolor='gray')
<cartopy.mpl.feature_artist.FeatureArtist at 0xa23279128>
Let's move on to a map of the US states:
import cartopy.io.shapereader as shpreader
fig = plt.figure(figsize=(30,30))
ax = fig.add_subplot(1, 1, 1, projection=ccrs.LambertConformal())
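ax.set_extent([-125, -66.5, 20, 50], crs=ccrs.Geodetic())
# background_patch and outline_patch were deprecated in Cartopy 0.18; newer releases use ax.patch and ax.spines['geo'] instead
ax.background_patch.set_visible(False)
ax.outline_patch.set_visible(False)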
shapename = 'admin_1_states_provinces_lakes_shp'
states_shp = shpreader.natural_earth(resolution='110m',category='cultural', name=shapename)
ax.add_geometries(shpreader.Reader(states_shp).geometries(), ccrs.PlateCarree(), edgecolor='#FFFFFF')
<cartopy.mpl.feature_artist.FeatureArtist at 0xa29f7b630>
Each shape record contains attributes and bounds:
k = list(shpreader.Reader(states_shp).records())[0]
print(f"Bounds: {k.bounds}")
k.attributes
Bounds: (-97.22894344764502, 43.500187486335385, -89.5997116839191, 49.38928538674973)
{'scalerank': 2, 'featurecla': 'Admin-1 scale rank', 'adm1_code': 'USA-3514', 'diss_me': 3514, 'adm1_cod_1': 'USA-3514', 'iso_3166_2': 'US-MN', 'wikipedia': 'http://en.wikipedia.org/wiki/Minnesota', 'sr_sov_a3': 'US1', 'sr_adm0_a3': 'USA', 'iso_a2': 'US', 'adm0_sr': 1, 'admin0_lab': 2, 'name': 'Minnesota', 'name_alt': 'MN|Minn.', 'name_local': '', 'type': 'State', 'type_en': 'State', 'code_local': 'US32', 'code_hasc': 'US.MN', 'note': '', 'hasc_maybe': '', 'region': 'Midwest', 'region_cod': '', 'region_big': 'West North Central', 'big_code': '', 'provnum_ne': 0, 'gadm_level': 1, 'check_me': 10, 'scaleran_1': 2, 'datarank': 1, 'abbrev': 'Minn.', 'postal': 'MN', 'area_sqkm': 0.0, 'sameascity': -99, 'labelrank': 0, 'featurec_1': 'Admin-1 scale rank', 'admin': 'United States of America', 'name_len': 9, 'mapcolor9': 1, 'mapcolor13': 1}
Let's color the states according to their U.S. presidential election results:
import turicreate as tc
import turicreate.aggregate as agg
dataset_path = "./datasets/1976-2016-president.csv"
sf = tc.SFrame.read_csv(dataset_path)
sf = sf[["year", "state", "state_po", "party", "candidatevotes"]]
sf["party"] = sf["party"].apply(lambda s: "democrat" if "democrat" in s else s) # e.g., Minnesota's Democratic–Farmer–Labor party is counted as democrat
d_sf = sf[sf["party"] == "democrat"]
r_sf = sf[sf["party"] == "republican"]
v_sf = d_sf.join(r_sf,on={"state":"state", "year":"year"})
v_sf = v_sf.rename({'candidatevotes': 'democrat_votes', 'candidatevotes.1': 'republican_votes' })
v_sf['result'] = v_sf.apply(lambda r: 'democrat' if r['democrat_votes'] > r['republican_votes'] else "republican")
v_sf
Finished parsing file /Users/michael/Dropbox (BGU)/massive data mining/ 2020/notebooks/datasets/1976-2016-president.csv
Parsing completed. Parsed 100 lines in 0.032195 secs.
------------------------------------------------------ Inferred types from first 100 line(s) of file as column_type_hints=[int,str,str,int,int,int,str,str,str,str,int,int,int,str] If parsing fails due to incorrect types, you can correct the inferred type list above and pass it to read_csv in the column_type_hints argument ------------------------------------------------------
Finished parsing file /Users/michael/Dropbox (BGU)/massive data mining/ 2020/notebooks/datasets/1976-2016-president.csv
Parsing completed. Parsed 3740 lines in 0.01495 secs.
year | state | state_po | party | democrat_votes | state_po.1 | party.1 | republican_votes | result |
---|---|---|---|---|---|---|---|---|
1976 | Alabama | AL | democrat | 659170 | AL | republican | 504070 | democrat |
1976 | Alaska | AK | democrat | 44058 | AK | republican | 71555 | republican |
1976 | Arizona | AZ | democrat | 295602 | AZ | republican | 418642 | republican |
1976 | Arkansas | AR | democrat | 498604 | AR | republican | 267903 | democrat |
1976 | California | CA | democrat | 3742284 | CA | republican | 3882244 | republican |
1976 | Colorado | CO | democrat | 460801 | CO | republican | 584278 | republican |
1976 | Connecticut | CT | democrat | 647895 | CT | republican | 719261 | republican |
1976 | Delaware | DE | democrat | 122461 | DE | republican | 109780 | democrat |
1976 | District of Columbia | DC | democrat | 137818 | DC | republican | 27873 | democrat |
1976 | Florida | FL | democrat | 1636000 | FL | republican | 1469531 | democrat |
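The join-and-compare step above can also be sketched without TuriCreate. This is only a minimal illustration in plain Python, assuming a few rows copied from the 1976 output; the helper name `state_winners` is hypothetical, and unlike the notebook it matches party names exactly instead of normalizing them first.

```python
# Toy rows (year, state, party, votes) copied from the 1976 output above.
rows = [
    (1976, "Alabama", "democrat", 659170),
    (1976, "Alabama", "republican", 504070),
    (1976, "Alaska", "democrat", 44058),
    (1976, "Alaska", "republican", 71555),
]

def state_winners(rows):
    """Return {(year, state): winning party}, comparing only the two major parties."""
    votes = {}
    for year, state, party, count in rows:
        if party in ("democrat", "republican"):
            votes.setdefault((year, state), {})[party] = count
    winners = {}
    for key, by_party in votes.items():
        # Mirror the inner join: keep only states that have rows for both parties.
        if len(by_party) == 2:
            winners[key] = ("democrat"
                            if by_party["democrat"] > by_party["republican"]
                            else "republican")
    return winners

winners = state_winners(rows)
print(winners)
```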
import matplotlib.patches as mpatches

def draw_us_map():
    fig = plt.figure(figsize=(30,30))
    ax = fig.add_subplot(1, 1, 1, projection=ccrs.LambertConformal())
    ax.set_extent([-125, -66.5, 20, 50], crs=ccrs.Geodetic())
    # Note: Cartopy 0.18 deprecated these two attributes in favor of ax.patch and ax.spines['geo']
    ax.background_patch.set_visible(False)
    ax.outline_patch.set_visible(False)
    shapename = 'admin_1_states_provinces_lakes_shp'
    states_shp = shpreader.natural_earth(resolution='110m', category='cultural', name=shapename)
    ax.add_geometries(shpreader.Reader(states_shp).geometries(), ccrs.PlateCarree())
    return ax

def create_election_result_by_year(sf, year):
    sf = sf[sf["year"] == year]
    results_dict = {}
    for r in sf:
        results_dict[r['state']] = r['result']
        results_dict[r['state_po']] = r['result'] # adding additional name options for each state
    ax = draw_us_map()
    # states_shp is assumed to be defined at module level in an earlier cell
    for state_record in shpreader.Reader(states_shp).records():
        edgecolor = 'black'
        if 'postal' not in state_record.attributes:
            continue
        name = state_record.attributes['postal']
        facecolor = 'green' # default for states without a result
        if name in results_dict:
            if results_dict[name] == 'democrat':
                facecolor = 'blue'
            elif results_dict[name] == 'republican':
                facecolor = 'red'
        ax.add_geometries([state_record.geometry], ccrs.PlateCarree(),
                          facecolor=facecolor, edgecolor=edgecolor)
    # let's add a title and a legend
    ax.set_title(f'{year} United States Presidential Election', fontsize=42)
    republican = mpatches.Rectangle((0, 0), 1, 1, facecolor="red")
    democrat = mpatches.Rectangle((0, 0), 1, 1, facecolor="blue")
    labels = ['Democrat won','Republican won']
    ax.legend([democrat, republican], labels,
              loc='lower left', bbox_to_anchor=(0.025, 0.05), fancybox=True,
              prop={'size': 32})
    return ax
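The color choice inside the loop can be read as a small lookup with a fallback. The sketch below is one possible way to express it, not the notebook's own code; `PARTY_COLORS` and `face_color` are hypothetical names, and the two sample results come from the 1976 rows shown earlier.

```python
# Hypothetical lookup table: winning party -> fill color.
PARTY_COLORS = {"democrat": "blue", "republican": "red"}

def face_color(postal, results, default="green"):
    """Return the fill color for a state, or the default if no result is known."""
    # results.get(postal) is None for unknown states, which is not a key
    # in PARTY_COLORS, so the default color is returned.
    return PARTY_COLORS.get(results.get(postal), default)

results = {"AL": "democrat", "AK": "republican"}
colors = [face_color(code, results) for code in ("AL", "AK", "PR")]
print(colors)
```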
create_election_result_by_year(v_sf, 2016)
<cartopy.mpl.geoaxes.GeoAxesSubplot at 0xa2afd9a90>
from tqdm import tqdm
import imageio # requires the imageio package
!mkdir ./images
!mkdir ./images/elections
# Create an image for the election in each year
years = list(v_sf['year'].unique().sort())
images_path = "./images/elections/"
images_list = []
for y in tqdm(years):
    ax = create_election_result_by_year(v_sf, y)
    img_path = f"{images_path}{y}_elections.png"
    plt.savefig(img_path)
    images_list.append(img_path)
    plt.clf()
# Combine the per-year images into a single animated GIF
images = []
for filename in images_list:
    images.append(imageio.imread(filename))
imageio.mimsave(f"{images_path}all_elections.gif", images, duration=1)
mkdir: ./images: File exists mkdir: ./images/elections: File exists
100%|██████████| 11/11 [00:09<00:00, 1.15it/s]
<Figure size 2160x2160 with 0 Axes>
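The `mkdir: ... File exists` messages above appear because the directories were already present. A minimal sketch of an idempotent alternative, assuming the same directory layout, uses `os.makedirs` with `exist_ok=True` and builds frame paths with `os.path.join`; the `frame_path` helper is hypothetical, and a temporary base directory stands in for `./images/elections`.

```python
import os
import tempfile

def frame_path(images_dir, year):
    """Build the PNG path for one election year (hypothetical helper)."""
    return os.path.join(images_dir, f"{year}_elections.png")

# A temporary base directory keeps the sketch self-contained;
# the notebook writes to ./images/elections instead.
base = tempfile.mkdtemp()
images_dir = os.path.join(base, "images", "elections")
os.makedirs(images_dir, exist_ok=True)
os.makedirs(images_dir, exist_ok=True)  # second call raises no error

paths = [frame_path(images_dir, y) for y in (1976, 1980, 1984)]
print(paths)
```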
In this section, we will explore flight routes. Let's start by loading the flights dataset, which contains over 5.8 million flights, into an SFrame object:
#!mkdir ./datasets
!mkdir ./datasets/flights
# download the dataset from Kaggle and unzip it
!kaggle datasets download freddejn/flights -p ./datasets/flights
!unzip ./datasets/flights/*.zip -d ./datasets/flights/
Archive: ./datasets/flights/flights.zip inflating: ./datasets/flights/L_AIRPORT.csv inflating: ./datasets/flights/L_AIRPORT_ID.csv inflating: ./datasets/flights/cleaned_and_sampled_flights_v2.csv inflating: ./datasets/flights/flights.csv
import turicreate as tc
import turicreate.aggregate as agg
dataset_path = "./datasets/flights"
sf = tc.SFrame.read_csv(f"{dataset_path}/flights.csv")
sf
Finished parsing file /Users/michael/Dropbox (BGU)/massive data mining/ 2020/notebooks/datasets/flights/flights.csv
Parsing completed. Parsed 100 lines in 0.828921 secs.
------------------------------------------------------ Inferred types from first 100 line(s) of file as column_type_hints=[int,int,int,int,str,int,str,str,str,int,int,int,int,int,int,int,int,int,int,int,int,int,int,int,int,str,int,int,int,int,int] If parsing fails due to incorrect types, you can correct the inferred type list above and pass it to read_csv in the column_type_hints argument ------------------------------------------------------
Read 520345 lines. Lines per second: 328598
Read 3627333 lines. Lines per second: 498810
Finished parsing file /Users/michael/Dropbox (BGU)/massive data mining/ 2020/notebooks/datasets/flights/flights.csv
Parsing completed. Parsed 5819079 lines in 11.2635 secs.
YEAR | MONTH | DAY | DAY_OF_WEEK | AIRLINE | FLIGHT_NUMBER | TAIL_NUMBER | ORIGIN_AIRPORT | DESTINATION_AIRPORT | SCHEDULED_DEPARTURE |
---|---|---|---|---|---|---|---|---|---|
2015 | 1 | 1 | 4 | AS | 98 | N407AS | ANC | SEA | 5 |
2015 | 1 | 1 | 4 | AA | 2336 | N3KUAA | LAX | PBI | 10 |
2015 | 1 | 1 | 4 | US | 840 | N171US | SFO | CLT | 20 |
2015 | 1 | 1 | 4 | AA | 258 | N3HYAA | LAX | MIA | 20 |
2015 | 1 | 1 | 4 | AS | 135 | N527AS | SEA | ANC | 25 |
2015 | 1 | 1 | 4 | DL | 806 | N3730B | SFO | MSP | 25 |
2015 | 1 | 1 | 4 | NK | 612 | N635NK | LAS | MSP | 25 |
2015 | 1 | 1 | 4 | US | 2013 | N584UW | LAX | CLT | 30 |
2015 | 1 | 1 | 4 | AA | 1112 | N3LAAA | SFO | DFW | 30 |
2015 | 1 | 1 | 4 | DL | 1173 | N826DN | LAS | ATL | 30 |
DEPARTURE_TIME | DEPARTURE_DELAY | TAXI_OUT | WHEELS_OFF | SCHEDULED_TIME | ELAPSED_TIME | AIR_TIME | DISTANCE | WHEELS_ON |
---|---|---|---|---|---|---|---|---|
2354 | -11 | 21 | 15 | 205 | 194 | 169 | 1448 | 404 |
2 | -8 | 12 | 14 | 280 | 279 | 263 | 2330 | 737 |
18 | -2 | 16 | 34 | 286 | 293 | 266 | 2296 | 800 |
15 | -5 | 15 | 30 | 285 | 281 | 258 | 2342 | 748 |
24 | -1 | 11 | 35 | 235 | 215 | 199 | 1448 | 254 |
20 | -5 | 18 | 38 | 217 | 230 | 206 | 1589 | 604 |
19 | -6 | 11 | 30 | 181 | 170 | 154 | 1299 | 504 |
44 | 14 | 13 | 57 | 273 | 249 | 228 | 2125 | 745 |
19 | -11 | 17 | 36 | 195 | 193 | 173 | 1464 | 529 |
33 | 3 | 12 | 45 | 221 | 203 | 186 | 1747 | 651 |
TAXI_IN | SCHEDULED_ARRIVAL | ARRIVAL_TIME | ARRIVAL_DELAY | DIVERTED | CANCELLED | CANCELLATION_REASON | AIR_SYSTEM_DELAY |
---|---|---|---|---|---|---|---|
4 | 430 | 408 | -22 | 0 | 0 | None | |
4 | 750 | 741 | -9 | 0 | 0 | None | |
11 | 806 | 811 | 5 | 0 | 0 | None | |
8 | 805 | 756 | -9 | 0 | 0 | None | |
5 | 320 | 259 | -21 | 0 | 0 | None | |
6 | 602 | 610 | 8 | 0 | 0 | None | |
5 | 526 | 509 | -17 | 0 | 0 | None | |
8 | 803 | 753 | -10 | 0 | 0 | None | |
3 | 545 | 532 | -13 | 0 | 0 | None | |
5 | 711 | 656 | -15 | 0 | 0 | None |
SECURITY_DELAY | AIRLINE_DELAY | LATE_AIRCRAFT_DELAY | WEATHER_DELAY |
---|---|---|---|
None | None | None | None |
None | None | None | None |
None | None | None | None |
None | None | None | None |
None | None | None | None |
None | None | None | None |
None | None | None | None |
None | None | None | None |
None | None | None | None |
None | None | None | None |
Let's calculate how many flights took place on each route:
g = sf.groupby(['ORIGIN_AIRPORT', 'DESTINATION_AIRPORT'], {'total_flights': agg.COUNT()})
g.sort('total_flights', ascending=False)
DESTINATION_AIRPORT | ORIGIN_AIRPORT | total_flights |
---|---|---|
LAX | SFO | 13744 |
SFO | LAX | 13457 |
LAX | JFK | 12016 |
JFK | LAX | 12015 |
LAX | LAS | 9715 |
ORD | LGA | 9639 |
LAS | LAX | 9594 |
LGA | ORD | 9575 |
JFK | SFO | 8440 |
SFO | JFK | 8437 |
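The group-by count above amounts to counting identical (origin, destination) pairs. As a rough plain-Python illustration of the same idea, assuming a small made-up list of flights rather than the real dataset:

```python
from collections import Counter

# Made-up flight records as (origin, destination) pairs.
flights = [
    ("SFO", "LAX"), ("LAX", "SFO"), ("SFO", "LAX"),
    ("JFK", "LAX"), ("LAX", "SFO"), ("SFO", "LAX"),
]

# Each distinct pair is one directed route; its count is the number of flights.
route_counts = Counter(flights)

# Equivalent of sorting by total_flights in descending order and taking the top 2.
top_routes = route_counts.most_common(2)
print(top_routes)
```

Note that SFO to LAX and LAX to SFO are counted as different routes, which matches the directed graph built in the next step.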
Let's create a flight network and visualize it using Cytoscape:
import networkx as nx
ng = nx.DiGraph()
for r in g:
    ng.add_edge(r['ORIGIN_AIRPORT'], r['DESTINATION_AIRPORT'], weight=r['total_flights'])
nx.write_gml(ng, f"{dataset_path}/flights_network.gml")
nx.info(ng)
'Name: \nType: DiGraph\nNumber of nodes: 629\nNumber of edges: 8609\nAverage in degree: 13.6868\nAverage out degree: 13.6868'
We can see that the network consists of two main components. Additionally, in each component, there are a few central nodes (according to the vertices' betweenness measure). As the output below shows, one component contains airports identified by five-digit numeric IDs, and the other contains airports identified by three-letter codes. Let's select only one of the components and draw the routes on a map using airport location data:
cc = nx.weakly_connected_components(ng)
l = list(cc)
l
[{'10135', '10136', '10140', '10141', '10146', '10154', '10155', '10157', '10158', '10165', '10170', '10185', '10208', '10257', '10268', '10279', '10299', '10333', '10372', '10397', '10408', '10423', '10431', '10434', '10469', '10529', '10551', '10561', '10577', '10581', '10599', '10620', '10627', '10631', '10666', '10685', '10693', '10713', '10721', '10728', '10731', '10732', '10739', '10747', '10754', '10779', '10781', '10785', '10792', '10800', '10821', '10849', '10868', '10874', '10918', '10926', '10980', '10990', '10994', '11003', '11013', '11042', '11049', '11057', '11066', '11067', '11076', '11097', '11109', '11111', '11122', '11140', '11146', '11150', '11193', '11203', '11252', '11259', '11267', '11274', '11278', '11292', '11298', '11308', '11315', '11337', '11413', '11423', '11433', '11447', '11471', '11481', '11503', '11525', '11537', '11540', '11577', '11587', '11603', '11612', '11617', '11618', '11624', '11630', '11637', '11638', '11641', '11648', '11695', '11697', '11721', '11775', '11778', '11823', '11865', '11867', '11884', '11898', '11905', '11921', '11953', '11973', '11977', '11980', '11982', '11986', '11995', '11996', '12003', '12007', '12016', '12094', '12129', '12156', '12173', '12177', '12191', '12197', '12206', '12217', '12255', '12264', '12265', '12266', '12278', '12280', '12323', '12335', '12339', '12343', '12389', '12391', '12402', '12441', '12448', '12451', '12478', '12511', '12519', '12523', '12758', '12819', '12884', '12888', '12889', '12891', '12892', '12896', '12898', '12915', '12945', '12951', '12953', '12954', '12982', '12992', '13029', '13061', '13076', '13127', '13158', '13184', '13198', '13204', '13230', '13232', '13241', '13244', '13256', '13264', '13277', '13290', '13296', '13303', '13342', '13344', '13360', '13367', '13377', '13422', '13433', '13459', '13476', '13485', '13486', '13487', '13495', '13502', '13541', '13577', '13795', '13796', '13830', '13851', '13871', '13873', '13891', '13930', '13931', '13933', '13964', '13970', 
'14006', '14025', '14027', '14057', '14098', '14100', '14107', '14108', '14109', '14113', '14122', '14150', '14193', '14222', '14252', '14254', '14256', '14262', '14307', '14321', '14457', '14487', '14489', '14492', '14520', '14524', '14543', '14570', '14574', '14576', '14588', '14633', '14635', '14674', '14679', '14683', '14685', '14689', '14696', '14698', '14709', '14711', '14730', '14747', '14771', '14783', '14794', '14814', '14828', '14831', '14842', '14843', '14869', '14893', '14905', '14908', '14952', '14960', '14986', '15016', '15024', '15027', '15041', '15048', '15070', '15096', '15249', '15295', '15304', '15323', '15356', '15370', '15376', '15380', '15389', '15401', '15411', '15412', '15497', '15607', '15624', '15841', '15919', '15991', '16218'}, {'ABE', 'ABI', 'ABQ', 'ABR', 'ABY', 'ACK', 'ACT', 'ACV', 'ACY', 'ADK', 'ADQ', 'AEX', 'AGS', 'AKN', 'ALB', 'ALO', 'AMA', 'ANC', 'APN', 'ASE', 'ATL', 'ATW', 'AUS', 'AVL', 'AVP', 'AZO', 'BDL', 'BET', 'BFL', 'BGM', 'BGR', 'BHM', 'BIL', 'BIS', 'BJI', 'BLI', 'BMI', 'BNA', 'BOI', 'BOS', 'BPT', 'BQK', 'BQN', 'BRD', 'BRO', 'BRW', 'BTM', 'BTR', 'BTV', 'BUF', 'BUR', 'BWI', 'BZN', 'CAE', 'CAK', 'CDC', 'CDV', 'CEC', 'CHA', 'CHO', 'CHS', 'CID', 'CIU', 'CLD', 'CLE', 'CLL', 'CLT', 'CMH', 'CMI', 'CMX', 'CNY', 'COD', 'COS', 'COU', 'CPR', 'CRP', 'CRW', 'CSG', 'CVG', 'CWA', 'DAB', 'DAL', 'DAY', 'DBQ', 'DCA', 'DEN', 'DFW', 'DHN', 'DIK', 'DLG', 'DLH', 'DRO', 'DSM', 'DTW', 'DVL', 'EAU', 'ECP', 'EGE', 'EKO', 'ELM', 'ELP', 'ERI', 'ESC', 'EUG', 'EVV', 'EWN', 'EWR', 'EYW', 'FAI', 'FAR', 'FAT', 'FAY', 'FCA', 'FLG', 'FLL', 'FNT', 'FSD', 'FSM', 'FWA', 'GCC', 'GCK', 'GEG', 'GFK', 'GGG', 'GJT', 'GNV', 'GPT', 'GRB', 'GRI', 'GRK', 'GRR', 'GSO', 'GSP', 'GST', 'GTF', 'GTR', 'GUC', 'GUM', 'HDN', 'HIB', 'HLN', 'HNL', 'HOB', 'HOU', 'HPN', 'HRL', 'HSV', 'HYA', 'HYS', 'IAD', 'IAG', 'IAH', 'ICT', 'IDA', 'ILG', 'ILM', 'IMT', 'IND', 'INL', 'ISN', 'ISP', 'ITH', 'ITO', 'JAC', 'JAN', 'JAX', 'JFK', 'JLN', 'JMS', 'JNU', 'KOA', 'KTN', 'LAN', 'LAR', 'LAS', 'LAW', 
'LAX', 'LBB', 'LBE', 'LCH', 'LEX', 'LFT', 'LGA', 'LGB', 'LIH', 'LIT', 'LNK', 'LRD', 'LSE', 'LWS', 'MAF', 'MBS', 'MCI', 'MCO', 'MDT', 'MDW', 'MEI', 'MEM', 'MFE', 'MFR', 'MGM', 'MHK', 'MHT', 'MIA', 'MKE', 'MKG', 'MLB', 'MLI', 'MLU', 'MMH', 'MOB', 'MOT', 'MQT', 'MRY', 'MSN', 'MSO', 'MSP', 'MSY', 'MTJ', 'MVY', 'MYR', 'OAJ', 'OAK', 'OGG', 'OKC', 'OMA', 'OME', 'ONT', 'ORD', 'ORF', 'ORH', 'OTH', 'OTZ', 'PAH', 'PBG', 'PBI', 'PDX', 'PHF', 'PHL', 'PHX', 'PIA', 'PIB', 'PIH', 'PIT', 'PLN', 'PNS', 'PPG', 'PSC', 'PSE', 'PSG', 'PSP', 'PUB', 'PVD', 'PWM', 'RAP', 'RDD', 'RDM', 'RDU', 'RHI', 'RIC', 'RKS', 'RNO', 'ROA', 'ROC', 'ROW', 'RST', 'RSW', 'SAF', 'SAN', 'SAT', 'SAV', 'SBA', 'SBN', 'SBP', 'SCC', 'SCE', 'SDF', 'SEA', 'SFO', 'SGF', 'SGU', 'SHV', 'SIT', 'SJC', 'SJT', 'SJU', 'SLC', 'SMF', 'SMX', 'SNA', 'SPI', 'SPS', 'SRQ', 'STC', 'STL', 'STT', 'STX', 'SUN', 'SUX', 'SWF', 'SYR', 'TLH', 'TOL', 'TPA', 'TRI', 'TTN', 'TUL', 'TUS', 'TVC', 'TWF', 'TXK', 'TYR', 'TYS', 'UST', 'VEL', 'VLD', 'VPS', 'WRG', 'WYS', 'XNA', 'YAK', 'YUM'}]
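To make the notion of a weakly connected component concrete, here is a small plain-Python sketch that ignores edge direction and collects reachable nodes with a breadth-first search. It is an illustration only, not how NetworkX implements it; the function name and the toy edge list are assumptions, with node labels borrowed from the two components above.

```python
from collections import defaultdict, deque

def weak_components(edges):
    """Return a list of node sets, one per weakly connected component."""
    neighbors = defaultdict(set)
    for u, v in edges:
        neighbors[u].add(v)
        neighbors[v].add(u)  # ignore direction: treat every edge as two-way
    seen = set()
    components = []
    for start in list(neighbors):
        if start in seen:
            continue
        component = set()
        queue = deque([start])
        seen.add(start)
        while queue:
            node = queue.popleft()
            component.add(node)
            for nxt in neighbors[node]:
                if nxt not in seen:
                    seen.add(nxt)
                    queue.append(nxt)
        components.append(component)
    return components

# Toy directed edges: code-labeled and ID-labeled airports never connect.
edges = [("SFO", "LAX"), ("LAX", "JFK"), ("10135", "10397"), ("10397", "11292")]
components = weak_components(edges)
print(components)
```

Because no flight record links a numeric ID to a three-letter code, the two labeling schemes can never end up in the same component.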
g = g[g.apply(lambda r: r['ORIGIN_AIRPORT'] in l[1] and r['DESTINATION_AIRPORT'] in l[1] )]
g.materialize()
g
DESTINATION_AIRPORT | ORIGIN_AIRPORT | total_flights |
---|---|---|
PSG | JNU | 332 |
HOU | BNA | 1225 |
KOA | OGG | 814 |
SLC | SEA | 3463 |
DEN | PDX | 3423 |
ORD | IAD | 1745 |
DSM | DFW | 857 |
LAS | SNA | 2412 |
LAS | SEA | 5009 |
ROC | DTW | 304 |
Let's use the Locations of Airports dataset to visualize the flights network on a map:
# download the dataset from Kaggle and unzip it
!kaggle datasets download flashgordon/locations-of-airports -p ./datasets/flights
!unzip ./datasets/flights/locations-of-airports.zip -d ./datasets/flights/
Downloading locations-of-airports.zip to ./datasets/flights 0%| | 0.00/23.6k [00:00<?, ?B/s] 100%|███████████████████████████████████████| 23.6k/23.6k [00:00<00:00, 322kB/s] Archive: ./datasets/flights/locations-of-airports.zip inflating: ./datasets/flights/Locations.csv
sf = tc.SFrame.read_csv('./datasets/flights/Locations.csv')
sf.materialize()
sf
Finished parsing file /Users/michael/Dropbox (BGU)/massive data mining/ 2020/notebooks/datasets/flights/Locations.csv
Parsing completed. Parsed 100 lines in 0.025282 secs.
------------------------------------------------------ Inferred types from first 100 line(s) of file as column_type_hints=[str,float,float] If parsing fails due to incorrect types, you can correct the inferred type list above and pass it to read_csv in the column_type_hints argument ------------------------------------------------------
Finished parsing file /Users/michael/Dropbox (BGU)/massive data mining/ 2020/notebooks/datasets/flights/Locations.csv
Parsing completed. Parsed 1435 lines in 0.006379 secs.
Address | Latitude | Longitude |
---|---|---|
BTI | 70.1340026855 | -143.582000732 |
LUR | 68.87509918 | -166.1100006 |
PIZ | 69.73290253 | -163.0050049 |
ITO | 19.721399307251 | -155.048004150391 |
ORL | 28.545499801636 | -81.332901000977 |
BTT | 66.91390228 | -151.529007 |
Z84 | 64.301201 | -149.119995 |
UTO | 65.99279785 | -153.7039948 |
FYU | 66.5715026855469 | -145.25 |
SVW | 61.09740067 | -155.5740051 |
airports_set = set(g['DESTINATION_AIRPORT']) | set(g['ORIGIN_AIRPORT'])
sf = sf[sf['Address'].apply(lambda a: a in airports_set)]
sf.materialize()
sf
Address | Latitude | Longitude |
---|---|---|
ITO | 19.721399307251 | -155.048004150391 |
FSM | 35.3366012573242 | -94.3674011230469 |
GFK | 47.949299 | -97.176102 |
TTN | 40.2766990661621 | -74.8134994506836 |
BOS | 42.36429977 | -71.00520325 |
OAK | 37.7212982177734 | -122.221000671387 |
OMA | 41.3031997680664 | -95.8940963745117 |
OGG | 20.8985996246338 | -156.429992675781 |
ICT | 37.6498985290527 | -97.4330978393555 |
MCI | 39.2976 | -94.713898 |
import cartopy.crs as ccrs
import matplotlib.pyplot as plt
%matplotlib inline
def draw_map(w_size=30, h_size=30):
    plt.figure(figsize=(w_size, h_size))
    ax = plt.axes(projection=ccrs.PlateCarree(central_longitude=0))
    ax.coastlines(resolution='10m', color='black', linewidth=1) # draw with a better coastline resolution
    return ax

ax = draw_map(20,40)
for r in sf:
    lon = r['Longitude']
    lat = r['Latitude']
    plt.plot(lon, lat,
             color='black', marker='o', markersize=4, transform=ccrs.PlateCarree(),
             )
We can observe that all the airports are in the US. Let's use a US map:
import matplotlib.patches as mpatches
import cartopy.io.shapereader as shpreader
import cartopy.feature as cfeature
import operator
def draw_aiports(top_airports_num=20):
    fig = plt.figure(figsize=(40,40))
    ax = plt.axes(projection=ccrs.LambertConformal())
    ax.set_extent([-125, -66.5, 20, 50], crs=ccrs.Geodetic())
    ax.add_feature(cfeature.LAND)
    ax.add_feature(cfeature.COASTLINE)
    ax.add_feature(cfeature.RIVERS)
    ax.add_feature(cfeature.LAKES)
    # label only the airports with the highest degree in the subgraph h
    top_airports = sorted(dict(h.degree()).items(), key=operator.itemgetter(1), reverse=True)[:top_airports_num]
    top_set = set([a[0] for a in top_airports])
    for r in sf:
        lon = r['Longitude']
        lat = r['Latitude']
        plt.plot(lon, lat,
                 color='black', marker='o', markersize=6, transform=ccrs.PlateCarree(),
                 )
        if r["Address"] not in top_set:
            continue
        ax.text(lon, lat+0.5, r["Address"], fontsize=16, color="blue", transform=ccrs.PlateCarree())
    return ax
h = ng.subgraph(l[1])
draw_aiports()
<cartopy.mpl.geoaxes.GeoAxesSubplot at 0xa39dd9278>
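The airports that receive a label are the ones with the highest degree in the directed graph, where a node's degree is its in-degree plus its out-degree. A minimal plain-Python sketch of that ranking, assuming a made-up edge list and a hypothetical helper name `top_by_degree`:

```python
from collections import Counter

def top_by_degree(edges, k):
    """Return the k nodes with the highest total (in + out) degree."""
    degree = Counter()
    for origin, destination in edges:
        degree[origin] += 1       # contributes to the origin's out-degree
        degree[destination] += 1  # contributes to the destination's in-degree
    return [airport for airport, _ in degree.most_common(k)]

# Made-up directed routes for illustration only.
edges = [
    ("ATL", "ORD"), ("ORD", "ATL"), ("ATL", "DFW"),
    ("DFW", "ATL"), ("ORD", "LAX"), ("ATL", "LAX"),
]
top_two = top_by_degree(edges, 2)
print(top_two)
```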
g = g.join(sf, on={"ORIGIN_AIRPORT":"Address"})
g = g.rename({'Latitude':'OLatitude', 'Longitude': 'OLongitude' })
g = g.join(sf, on={"DESTINATION_AIRPORT":"Address"})
g = g.rename({'Latitude':'DLatitude', 'Longitude': 'DLongitude' })
g
DESTINATION_AIRPORT | ORIGIN_AIRPORT | total_flights | OLatitude | OLongitude | DLatitude | DLongitude |
---|---|---|---|---|---|---|
PSG | JNU | 332 | 58.3549995422363 | -134.57600402832 | 56.80170059 | -132.9450073 |
HOU | BNA | 1225 | 36.1245002746582 | -86.6781997680664 | 29.64539909 | -95.27890015 |
KOA | OGG | 814 | 20.8985996246338 | -156.429992675781 | 19.7388000488281 | -156.046005249023 |
SLC | SEA | 3463 | 47.4490013122559 | -122.30899810791 | 40.7883987426758 | -111.977996826172 |
DEN | PDX | 3423 | 45.58869934 | -122.5979996 | 39.861698150635 | -104.672996521 |
ORD | IAD | 1745 | 38.94449997 | -77.45580292 | 41.97859955 | -87.90480042 |
DSM | DFW | 857 | 32.896800994873 | -97.0380020141602 | 41.5340003967285 | -93.6631011962891 |
LAS | SNA | 2412 | 33.67570114 | -117.8679962 | 36.08010101 | -115.1520004 |
LAS | SEA | 5009 | 47.4490013122559 | -122.30899810791 | 36.08010101 | -115.1520004 |
ROC | DTW | 304 | 42.2123985290527 | -83.353401184082 | 43.1189002990723 | -77.6724014282227 |
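The two joins above attach origin and destination coordinates to every route. The same idea can be sketched with a dictionary lookup in plain Python; this is an illustration under the assumption of a few values copied from the tables above, plus one made-up route (`SEA` to `XXX`) to show that routes with an unknown airport are dropped, as in an inner join. The helper name `attach_coordinates` is hypothetical.

```python
# Airport code -> (latitude, longitude), copied from the output above.
locations = {
    "SEA": (47.4490013122559, -122.30899810791),
    "LAS": (36.08010101, -115.1520004),
    "SNA": (33.67570114, -117.8679962),
}

# (origin, destination, total_flights); the last route is made up.
routes = [("SNA", "LAS", 2412), ("SEA", "LAS", 5009), ("SEA", "XXX", 1)]

def attach_coordinates(routes, locations):
    """Add origin/destination coordinates, skipping routes with unknown airports."""
    joined = []
    for origin, destination, total in routes:
        if origin not in locations or destination not in locations:
            continue  # inner-join semantics: both endpoints must be known
        o_lat, o_lon = locations[origin]
        d_lat, d_lon = locations[destination]
        joined.append({
            "ORIGIN_AIRPORT": origin, "DESTINATION_AIRPORT": destination,
            "total_flights": total,
            "OLatitude": o_lat, "OLongitude": o_lon,
            "DLatitude": d_lat, "DLongitude": d_lon,
        })
    return joined

joined = attach_coordinates(routes, locations)
print(joined)
```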
ax = draw_aiports()
# draw the 400 busiest routes as lines between origin and destination
for r in g.sort('total_flights', ascending=False)[:400]:
    plt.plot([r['OLongitude'], r['DLongitude']], [r['OLatitude'], r['DLatitude']],
             color='gray', linewidth=1, marker='o',
             transform=ccrs.PlateCarree(),
             )
In this section, we are going to use GeoPandas to work with geographic datasets. GeoPandas is an open-source project that makes working with geospatial data in Python easier by extending the data types used by pandas to allow spatial operations on geometric types.
# These examples are inspired by http://geopandas.org/mapping.html
import geopandas
import matplotlib.pyplot as plt
%matplotlib inline
world = geopandas.read_file(geopandas.datasets.get_path('naturalearth_lowres'))
world.head(10)
 | pop_est | continent | name | iso_a3 | gdp_md_est | geometry |
---|---|---|---|---|---|---|
0 | 920938 | Oceania | Fiji | FJI | 8374.0 | MULTIPOLYGON (((180.00000 -16.06713, 180.00000... |
1 | 53950935 | Africa | Tanzania | TZA | 150600.0 | POLYGON ((33.90371 -0.95000, 34.07262 -1.05982... |
2 | 603253 | Africa | W. Sahara | ESH | 906.5 | POLYGON ((-8.66559 27.65643, -8.66512 27.58948... |
3 | 35623680 | North America | Canada | CAN | 1674000.0 | MULTIPOLYGON (((-122.84000 49.00000, -122.9742... |
4 | 326625791 | North America | United States of America | USA | 18560000.0 | MULTIPOLYGON (((-122.84000 49.00000, -120.0000... |
5 | 18556698 | Asia | Kazakhstan | KAZ | 460700.0 | POLYGON ((87.35997 49.21498, 86.59878 48.54918... |
6 | 29748859 | Asia | Uzbekistan | UZB | 202300.0 | POLYGON ((55.96819 41.30864, 55.92892 44.99586... |
7 | 6909701 | Oceania | Papua New Guinea | PNG | 28020.0 | MULTIPOLYGON (((141.00021 -2.60015, 142.73525 ... |
8 | 260580739 | Asia | Indonesia | IDN | 3028000.0 | MULTIPOLYGON (((141.00021 -2.60015, 141.01706 ... |
9 | 44293293 | South America | Argentina | ARG | 879400.0 | MULTIPOLYGON (((-68.63401 -52.63637, -68.25000... |
print(type(world.geometry[0]))
world.geometry[9]
<class 'shapely.geometry.multipolygon.MultiPolygon'>
world.plot(figsize=(40,40))
<matplotlib.axes._subplots.AxesSubplot at 0x11ad5ff28>
cities = geopandas.read_file(geopandas.datasets.get_path('naturalearth_cities'))
cities.head(10)
 | name | geometry |
---|---|---|
0 | Vatican City | POINT (12.45339 41.90328) |
1 | San Marino | POINT (12.44177 43.93610) |
2 | Vaduz | POINT (9.51667 47.13372) |
3 | Luxembourg | POINT (6.13000 49.61166) |
4 | Palikir | POINT (158.14997 6.91664) |
5 | Majuro | POINT (171.38000 7.10300) |
6 | Funafuti | POINT (179.21665 -8.51665) |
7 | Melekeok | POINT (134.62655 7.48740) |
8 | Monaco | POINT (7.40691 43.73965) |
9 | Tarawa | POINT (173.01757 1.33819) |
Let's put the cities on the world-map:
ax = world.plot(color='lightgreen', edgecolor='gray',figsize=(40,40))
cities.plot(ax=ax, marker='o', color='red', markersize=6);
# adding labels
for idx, row in cities.iterrows():
    pt = row['geometry']
    # pass the label positionally so the call works across matplotlib versions
    plt.annotate(row['name'], xy=(pt.x, pt.y),
                 horizontalalignment='center', fontsize=8, color="blue")
Let's color the map according to each country's population size (on a logarithmic scale):
import math
fig, ax = plt.subplots()
ax.set_aspect('equal')
world['pop_est_log'] = world['pop_est'].apply(lambda i: math.log(i) if i >0 else 0)
world.plot(ax=ax, column="pop_est_log", cmap='OrRd', legend=True )
<matplotlib.axes._subplots.AxesSubplot at 0x11dff9278>
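The `pop_est_log` column applies a natural logarithm with a guard for non-positive values, since `math.log` raises an error for zero or negative input. A small stand-alone sketch of that transform, using the hypothetical helper name `safe_log` and two population values taken from the table above:

```python
import math

def safe_log(value):
    """Natural log for positive values; 0 for zero or negative values."""
    return math.log(value) if value > 0 else 0

# Populations of Fiji and the United States, copied from the world table above.
populations = [920938, 326625791]
logged = [safe_log(p) for p in populations]
print(logged)
```

On the log scale the two values differ by only a few units, which is why the color map stays readable even though the raw populations differ by several orders of magnitude.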
Let's plot the US states using a shapefile from Natural Earth and GeoPandas:
#!mkdir ./datasets/ne_50m_admin_1_states_provinces/
#!wget -O ./datasets/ne_50m_admin_1_states_provinces/ne_50m_admin_1_states_provinces.zip https://www.naturalearthdata.com/http//www.naturalearthdata.com/download/50m/cultural/ne_50m_admin_1_states_provinces.zip
#!unzip ./datasets/ne_50m_admin_1_states_provinces/ne_50m_admin_1_states_provinces.zip -d ./datasets/ne_50m_admin_1_states_provinces/
fig, ax = plt.subplots(figsize=(40,40))
shp_path = "./datasets/ne_50m_admin_1_states_provinces/ne_50m_admin_1_states_provinces.shp"
#reading states data from shape file
gdf = geopandas.read_file(shp_path)
gdf
 | featurecla | scalerank | adm1_code | diss_me | iso_3166_2 | wikipedia | iso_a2 | adm0_sr | name | name_alt | ... | name_nl | name_pl | name_pt | name_ru | name_sv | name_tr | name_vi | name_zh | ne_id | geometry |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
0 | Admin-1 scale rank | 2 | AUS-2651 | 2651 | AU-WA | None | AU | 6 | Western Australia | None | ... | West-Australië | Australia Zachodnia | Austrália Ocidental | Западная Австралия | Western Australia | Batı Avustralya | Tây Úc | 西澳大利亚州 | 1159315805 | MULTIPOLYGON (((113.13181 -25.95199, 113.14823... |
1 | Admin-1 scale rank | 2 | AUS-2650 | 2650 | AU-NT | None | AU | 6 | Northern Territory | None | ... | Noordelijk Territorium | Terytorium Północne | Território do Norte | Северная территория | Northern Territory | Kuzey Toprakları | Lãnh thổ Bắc Úc | 北領地 | 1159315809 | MULTIPOLYGON (((129.00196 -25.99901, 129.00196... |
2 | Admin-1 scale rank | 2 | AUS-2655 | 2655 | AU-SA | None | AU | 3 | South Australia | None | ... | Zuid-Australië | Australia Południowa | Austrália Meridional | Южная Австралия | South Australia | Güney Avustralya | Nam Úc | 南澳大利亚州 | 1159313267 | MULTIPOLYGON (((129.00196 -31.69266, 129.00196... |
3 | Admin-1 scale rank | 2 | AUS-2657 | 2657 | AU-QLD | None | AU | 5 | Queensland | None | ... | Queensland | Queensland | Queensland | Квинсленд | Queensland | Queensland | Queensland | 昆士蘭州 | 1159315807 | MULTIPOLYGON (((138.00196 -25.99901, 138.00174... |
4 | Admin-1 scale rank | 2 | AUS-2660 | 2660 | AU-TAS | None | AU | 5 | Tasmania | None | ... | Tasmanië | Tasmania | Tasmânia | Тасмания | Tasmanien | Tasmanya | Tasmania | 塔斯馬尼亞州 | 1159313261 | MULTIPOLYGON (((147.31246 -43.28038, 147.34238... |
... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
95 | Admin-1 scale rank | 2 | USA-3540 | 3540 | US-VT | http://en.wikipedia.org/wiki/Vermont | US | 1 | Vermont | VT | ... | Vermont | Vermont | Vermont | Вермонт | Vermont | Vermont | Vermont | 佛蒙特州 | 1159315305 | POLYGON ((-73.35218 45.00542, -73.18201 45.005... |
96 | Admin-1 scale rank | 2 | USA-3519 | 3519 | US-WA | http://en.wikipedia.org/wiki/Washington_(state) | US | 6 | Washington | WA|Wash. | ... | Washington | Waszyngton | Washington | Вашингтон | Washington | Vaşington | Washington | 华盛顿州 | 1159309547 | MULTIPOLYGON (((-122.78878 48.99303, -122.6863... |
97 | Admin-1 scale rank | 2 | USA-3553 | 3553 | US-WI | http://en.wikipedia.org/wiki/Wisconsin | US | 1 | Wisconsin | WI|Wis. | ... | Wisconsin | Wisconsin | Wisconsin | Висконсин | Wisconsin | Wisconsin | Wisconsin | 威斯康辛州 | 1159315321 | POLYGON ((-90.65058 42.51298, -90.65733 42.520... |
98 | Admin-1 scale rank | 2 | USA-3554 | 3554 | US-WV | http://en.wikipedia.org/wiki/West_Virginia | US | 1 | West Virginia | WV|W.Va. | ... | West Virginia | Wirginia Zachodnia | Virgínia Ocidental | Западная Виргиния | West Virginia | Batı Virginia | Tây Virginia | 西維吉尼亞州 | 1159315323 | POLYGON ((-81.96528 37.53973, -82.10346 37.570... |
99 | Admin-1 scale rank | 2 | USA-3527 | 3527 | US-WY | http://en.wikipedia.org/wiki/Wyoming | US | 1 | Wyoming | WY|Wyo. | ... | Wyoming | Wyoming | Wyoming | Вайоминг | Wyoming | Wyoming | Wyoming | 怀俄明州 | 1159315351 | POLYGON ((-104.02166 41.00086, -104.33572 41.0... |
100 rows × 84 columns
fig, ax = plt.subplots(figsize=(40,40))
shp_path = "./datasets/ne_50m_admin_1_states_provinces/ne_50m_admin_1_states_provinces.shp"
#reading states data from shape file
gdf = geopandas.read_file(shp_path)
gdf = gdf[gdf['iso_a2'] == 'US'] # selecting only US states
gdf = gdf[gdf['name'].apply(lambda n: n not in {'Alaska', 'Hawaii'})] # without Alaska & Hawaii
# Let's add the state names; see also https://stackoverflow.com/questions/38899190/geopandas-label-polygons
gdf['repres_points'] = gdf['geometry'].apply(lambda x: x.representative_point())
for idx, row in gdf.iterrows():
    pt = row['repres_points']
    name = row['iso_3166_2'].replace("US-", "")
    # the label is passed positionally because newer Matplotlib versions renamed the `s` keyword to `text`
    plt.annotate(name, xy=(pt.x, pt.y),
                 horizontalalignment='center', fontsize=20, color="black")
gdf.plot(ax=ax, color='lightgreen', edgecolor='gray')
<matplotlib.axes._subplots.AxesSubplot at 0x11ed84320>
Now, let's analyze Life on the Mississippi by Mark Twain and extract locations which appear in his work:
!wget -O ./datasets/mark_twain.txt http://www.gutenberg.org/files/245/245-0.txt
!python -m spacy download en_core_web_lg # remember to restart the runtime
import spacy
import operator
nlp = spacy.load('en_core_web_lg')
def get_locations_from_text(text):
    locations_dict = {}
    # using spaCy to get entities
    doc = nlp(text)
    for entity in doc.ents:
        label = entity.label_
        if label not in {'LOC', 'GPE'}:
            continue
        loc = entity.text.lower().strip()
        if len(loc) < 2:
            continue
        if loc not in locations_dict:
            locations_dict[loc] = 0
        locations_dict[loc] += 1
    return locations_dict
twain_full_work_path = "./datasets/mark_twain.txt"
txt = open(twain_full_work_path).read()
locations_dict = get_locations_from_text(txt)
locations_dict = {k:v for k,v in locations_dict.items() if v>3}
print(sorted(locations_dict.items(), key=operator.itemgetter(1), reverse=True)[:20])
print(f"Number of locations {len(locations_dict.keys())}")
[('mississippi', 119), ('new orleans', 105), ('st. louis', 80), ('cairo', 31), ('vicksburg', 28), ('memphis', 26), ('missouri', 25), ('the\nriver', 23), ('arkansas', 22), ('south', 22), ('natchez', 20), ('earth', 20), ('the united states', 20), ('st. paul', 18), ('cincinnati', 17), ('ohio', 15), ('illinois', 15), ('new\norleans', 14), ('texas', 13), ('louisiana', 12)] Number of locations 70
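The output above lists both `'new orleans'` and `'new\norleans'` because line breaks inside an entity are kept as part of its text. One possible refinement is to collapse whitespace before counting. The sketch below assumes entities arrive as `(text, label)` pairs, a stand-in for spaCy's `doc.ents` so that the example runs without a language model; `count_locations` and the sample entities are hypothetical.

```python
from collections import Counter

def count_locations(entities, min_len=2):
    """Count LOC/GPE entities after lowercasing and collapsing whitespace."""
    counts = Counter()
    for text, label in entities:
        if label not in {"LOC", "GPE"}:
            continue
        # " ".join(text.split()) turns newlines and repeated spaces into one space
        name = " ".join(text.lower().split())
        if len(name) < min_len:
            continue
        counts[name] += 1
    return counts

# Hypothetical (text, label) pairs standing in for spaCy entities.
sample_entities = [
    ("New Orleans", "GPE"),
    ("New\nOrleans", "GPE"),
    ("Mark Twain", "PERSON"),
    ("the\nriver", "LOC"),
    ("Cairo", "GPE"),
]
counts = count_locations(sample_entities)
print(counts.most_common())
```

With this normalization the two spellings of New Orleans are merged into a single key, so their counts add up instead of competing in the ranking.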
Using the geopy package, let's convert the locations mentioned in the book to coordinates and draw them on the map:
!pip install geopy
Collecting geopy Downloading https://files.pythonhosted.org/packages/53/fc/3d1b47e8e82ea12c25203929efb1b964918a77067a874b2c7631e2ec35ec/geopy-1.21.0-py2.py3-none-any.whl (104kB) |████████████████████████████████| 112kB 508kB/s eta 0:00:01 Collecting geographiclib<2,>=1.49 (from geopy) Downloading https://files.pythonhosted.org/packages/8b/62/26ec95a98ba64299163199e95ad1b0e34ad3f4e176e221c40245f211e425/geographiclib-1.50-py3-none-any.whl Installing collected packages: geographiclib, geopy Successfully installed geographiclib-1.50 geopy-1.21.0
from geopy.geocoders import Nominatim
geolocator = Nominatim(user_agent="Data Science Education App") # Using OpenStreetMap Nominatim
location = geolocator.geocode("the missouri river")
print(location.address)
print((location.latitude, location.longitude))
print(location.raw)
Missouri River, Sully County, South Dakota, 64072, United States of America (44.6042103, -100.6355825) {'place_id': 235665012, 'licence': 'Data © OpenStreetMap contributors, ODbL 1.0. https://osm.org/copyright', 'osm_type': 'relation', 'osm_id': 1756890, 'boundingbox': ['38.5348721', '48.1497052', '-112.0153374', '-90.117707'], 'lat': '44.6042103', 'lon': '-100.6355825', 'display_name': 'Missouri River, Sully County, South Dakota, 64072, United States of America', 'class': 'waterway', 'type': 'river', 'importance': 0.7126739728744267}
from functools import lru_cache
from scipy.interpolate import interp1d # for mapping mention counts to marker sizes
import time
@lru_cache(maxsize=256)
def get_location(loc):
    time.sleep(1) # pause between requests to stay within the geocoding service's rate limit
    return geolocator.geocode(loc)
fig, ax = plt.subplots(figsize=(40,40))
gdf.plot(ax=ax, color='lightgreen', edgecolor='gray')
m = interp1d([4,max(locations_dict.values())],[8,40])
for loc, v in locations_dict.items():
    location = get_location(loc)
    if location is None:
        continue
    # keep only locations inside a rough bounding box around the contiguous US
    if not (-120 < location.longitude < -65) or not (57>location.latitude > 25):
        print(f"Skipping plottin {location}: {(location.latitude, location.longitude)} ")
        continue
    plt.plot(location.longitude,location.latitude, marker='o', color='red', markersize=m(v))
    # the label is passed positionally because newer Matplotlib versions renamed the `s` keyword to `text`
    plt.annotate(loc, xy=(location.longitude,location.latitude),
                 horizontalalignment='center', fontsize=20,color="black")
Skipping plottin Россия: (64.6863136, 97.7453061) Skipping plottin Europa: (51.0, 10.0) Skipping plottin Deutschland: (51.0834196, 10.4234469) Skipping plottin France: (46.603354, 1.8883335) Skipping plottin Italia: (42.6384261, 12.674297) Skipping plottin England, United Kingdom: (52.7954791, -0.5402402866174321) Skipping plottin America, Horst aan de Maas, Limburg, Nederland: (51.44770365, 5.966069282055592) Skipping plottin Canada: (61.0666922, -107.9917071) Skipping plottin Рівненська область, Україна: (51.2074112, 26.5208033) Skipping plottin 中国: (35.000074, 104.999927) Skipping plottin القاهرة, محافظة القاهرة, Egypt / مصر: (30.048819, 31.243666) Skipping plottin Hat Island, Kitikmeot Region, Nunavut, Canada: (68.33154189999999, -100.09088423305138) Skipping plottin water, Bern, Verwaltungskreis Bern-Mittelland, Verwaltungsregion Bern-Mittelland, Bern/Berne, 3005, Switzerland: (46.9341389, 7.4472821) Skipping plottin London, Greater London, England, SW1A 2DX, United Kingdom: (51.5073219, -0.1276474) Skipping plottin Western, West Kenya, Kenya: (0.5090396, 34.5731341) Skipping plottin 대한민국: (35.7724185, 127.79654346305617) Skipping plottin Murel, Saint-Chamant, Tulle, Corrèze, Nouvelle-Aquitaine, France métropolitaine, 19380, France: (45.1393385, 1.8670033) Skipping plottin Alps, Bellagio, Como, Lombardia, Italia: (45.953168500000004, 9.237907459757649) Skipping plottin Norge, Namsos, Trøndelag, Norge: (64.5731537, 11.52803643954819) Skipping plottin Manchester, Greater Manchester, North West England, England, United Kingdom: (53.4794892, -2.2451148) Skipping plottin Troya'nın Arkeolojik Alanı, 17-56, Tevfikiye, Çanakkale merkez, Çanakkale, Marmara Bölgesi, Türkiye: (39.957373950000004, 26.238017461011644) Skipping plottin Ihamo, Rauma, Rauman seutukunta, Satakunta, Manner-Suomi, Suomi: (61.187668, 21.419709103913284)
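The places skipped above fall outside a rectangular longitude/latitude window. As a minimal sketch, the same test can be written as a small reusable function; `in_bounding_box` and its default ranges are names introduced here for illustration, with the numbers copied from the cell above.

```python
def in_bounding_box(lat, lon, lat_range=(25, 57), lon_range=(-120, -65)):
    """Return True when (lat, lon) lies strictly inside the rectangular window."""
    lat_ok = lat_range[0] < lat < lat_range[1]
    lon_ok = lon_range[0] < lon < lon_range[1]
    return lat_ok and lon_ok

# Coordinates taken from the geocoding outputs shown in this lecture.
missouri_river = (44.6042103, -100.6355825)
london = (51.5073219, -0.1276474)

print(in_bounding_box(*missouri_river))
print(in_bounding_box(*london))
```

Note that such a filter hides geocoding ambiguity rather than resolving it. In the output above, "cairo" was resolved to Cairo, Egypt and dropped, although the book most likely refers to Cairo, Illinois.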
Let's add the course of the river to the map:
[k for k in locations_dict.keys() if "river" in k]
['mississippi river', 'the\nriver', 'the mississippi river', 'black river', 'little river']
!wget -O ./datasets/ne_10m_rivers_lake_centerlines.zip https://www.naturalearthdata.com/http//www.naturalearthdata.com/download/10m/physical/ne_10m_rivers_lake_centerlines.zip
!unzip ./datasets/ne_10m_rivers_lake_centerlines.zip -d ./datasets/ne_10m_rivers_lake_centerlines
--2020-05-08 15:12:35-- https://www.naturalearthdata.com/http//www.naturalearthdata.com/download/10m/physical/ne_10m_rivers_lake_centerlines.zip Resolving www.naturalearthdata.com (www.naturalearthdata.com)... 66.147.242.194 Connecting to www.naturalearthdata.com (www.naturalearthdata.com)|66.147.242.194|:443... connected. HTTP request sent, awaiting response... 302 Moved Temporarily Location: http://naciscdn.org/naturalearth/10m/physical/ne_10m_rivers_lake_centerlines.zip [following] --2020-05-08 15:12:37-- http://naciscdn.org/naturalearth/10m/physical/ne_10m_rivers_lake_centerlines.zip Resolving naciscdn.org (naciscdn.org)... 146.201.97.163 Connecting to naciscdn.org (naciscdn.org)|146.201.97.163|:80... connected. HTTP request sent, awaiting response... 301 Moved Permanently Location: https://naciscdn.org/naturalearth/10m/physical/ne_10m_rivers_lake_centerlines.zip [following] --2020-05-08 15:12:38-- https://naciscdn.org/naturalearth/10m/physical/ne_10m_rivers_lake_centerlines.zip Connecting to naciscdn.org (naciscdn.org)|146.201.97.163|:443... connected. HTTP request sent, awaiting response... 
200 OK Length: 1817597 (1.7M) [application/zip] Saving to: ‘./datasets/ne_10m_rivers_lake_centerlines.zip’ ./datasets/ne_10m_r 100%[===================>] 1.73M 713KB/s in 2.5s 2020-05-08 15:12:41 (713 KB/s) - ‘./datasets/ne_10m_rivers_lake_centerlines.zip’ saved [1817597/1817597] Archive: ./datasets/ne_10m_rivers_lake_centerlines.zip inflating: ./datasets/ne_10m_rivers_lake_centerlines/ne_10m_rivers_lake_centerlines.README.html extracting: ./datasets/ne_10m_rivers_lake_centerlines/ne_10m_rivers_lake_centerlines.VERSION.txt extracting: ./datasets/ne_10m_rivers_lake_centerlines/ne_10m_rivers_lake_centerlines.cpg inflating: ./datasets/ne_10m_rivers_lake_centerlines/ne_10m_rivers_lake_centerlines.dbf inflating: ./datasets/ne_10m_rivers_lake_centerlines/ne_10m_rivers_lake_centerlines.prj inflating: ./datasets/ne_10m_rivers_lake_centerlines/ne_10m_rivers_lake_centerlines.shp inflating: ./datasets/ne_10m_rivers_lake_centerlines/ne_10m_rivers_lake_centerlines.shx
fig, ax = plt.subplots(figsize=(40,40))
ax = gdf.plot(ax=ax, color='lightgreen', edgecolor='gray')
m = interp1d([4,max(locations_dict.values())],[8,40])
for loc, v in locations_dict.items():
    location = get_location(loc)
    if location is None:
        continue
    if not (-100 < location.longitude < -65) or not (57>location.latitude > 25):
        continue
    plt.plot(location.longitude,location.latitude, marker='o', color='red', markersize=m(v))
    plt.annotate(loc, xy=(location.longitude,location.latitude),
                 horizontalalignment='center', fontsize=20,color="black")
# adding the Mississippi river
# data from Natural Earth https://www.naturalearthdata.com/downloads/10m-physical-vectors/10m-rivers-lake-centerlines/
river_shp_path = "./datasets/ne_10m_rivers_lake_centerlines/ne_10m_rivers_lake_centerlines.shp"
# reading rivers data from shape file
r_gdf = geopandas.read_file(river_shp_path)
r_gdf = r_gdf[r_gdf['name'] == 'Mississippi']
r_gdf.plot(ax=ax)
<matplotlib.axes._subplots.AxesSubplot at 0x150129a58>
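The cell above calls `get_location` again for every location, yet it does not repeat the one-second pause per lookup, because `lru_cache` returns stored results for arguments it has already seen. The sketch below illustrates this with a fake geocoder; `fake_geocode` and its made-up coordinates are assumptions that replace the real network call.

```python
from functools import lru_cache

call_log = []  # records every lookup that actually reaches the (fake) service

@lru_cache(maxsize=256)
def fake_geocode(name):
    """Stand-in for geolocator.geocode; returns made-up coordinates."""
    call_log.append(name)
    return (float(len(name)), -float(len(name)))

# Repeated names are answered from the cache and never reach the function body.
for name in ["memphis", "natchez", "memphis", "memphis"]:
    fake_geocode(name)

print(call_log)
print(fake_geocode.cache_info())
```

Only the first request for each name runs the function body, so in the lecture's code both the `time.sleep(1)` and the network request are skipped on repeated calls. The cache lives in memory only and is lost when the kernel restarts.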
# Focusing only on states which are relevant to the entities
from shapely.geometry import Point
location_points = []
def is_poly_contains_point(poly, l):
    # True if at least one of the points in l lies within the polygon
    for pt in l:
        if pt.within(poly):
            return True
    return False
for loc, v in locations_dict.items():
    location = get_location(loc)
    if location is None:
        continue
    if not (-100 < location.longitude < -65) or not (57>location.latitude > 25):
        continue
    location_points.append(Point(location.longitude,location.latitude))
gdf['is_relevant'] = gdf['geometry'].apply(lambda p: is_poly_contains_point(p,location_points))
gdf2 = gdf[gdf['is_relevant']]
gdf2.plot()
<matplotlib.axes._subplots.AxesSubplot at 0x14fefed68>
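The call `pt.within(poly)` above is Shapely's containment test. To give an intuition for what such a test computes, here is a sketch of the classic ray-casting algorithm in plain Python. It is not Shapely's implementation, only a simplified illustration for simple polygons without holes, and the rectangular "state" is hypothetical.

```python
def point_in_polygon(x, y, polygon):
    """Ray casting: count the polygon edges crossed by a horizontal ray from (x, y)."""
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Does this edge straddle the horizontal line through y?
        if (y1 > y) != (y2 > y):
            # x coordinate where the edge meets that horizontal line
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside  # each crossing to the right flips the state
    return inside

# Hypothetical rectangular "state" given in (longitude, latitude) order.
square_state = [(-95.0, 30.0), (-90.0, 30.0), (-90.0, 35.0), (-95.0, 35.0)]
points = [(-92.0, 32.0), (-80.0, 32.0)]
results = [point_in_polygon(x, y, square_state) for x, y in points]
print(results)
```

An odd number of crossings means the point is inside. For real state borders, which can be multi-part and have thousands of vertices, Shapely's `within` remains the appropriate tool.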
fig, ax = plt.subplots(figsize=(40,40))
ax = gdf2.plot(ax=ax, color='lightgreen', edgecolor='gray')
m = interp1d([4,max(locations_dict.values())],[8,40])
for loc, v in locations_dict.items():
    location = get_location(loc)
    if location is None:
        continue
    if not (-97 < location.longitude < -70) or not (57>location.latitude > 25):
        continue
    plt.plot(location.longitude,location.latitude, marker='o', color='red', markersize=m(v))
    plt.annotate(loc, xy=(location.longitude,location.latitude),
                 horizontalalignment='center', fontsize=20,color="black")
# adding the Mississippi river
# data from Natural Earth https://www.naturalearthdata.com/downloads/10m-physical-vectors/10m-rivers-lake-centerlines/
river_shp_path = "./datasets/ne_10m_rivers_lake_centerlines/ne_10m_rivers_lake_centerlines.shp" # same folder the archive was unzipped into
# reading rivers data from shape file
r_gdf = geopandas.read_file(river_shp_path)
r_gdf = r_gdf[r_gdf['name'] == 'Mississippi']
r_gdf.plot(ax=ax)
<matplotlib.axes._subplots.AxesSubplot at 0x15029ac50>
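The `interp1d([4, max_count], [8, 40])` call used in these plots maps a mention count linearly to a marker size between 8 and 40. A plain-Python sketch of the same mapping is shown below; `linear_map` is a name introduced here, and the upper bound 119 is the count of 'mississippi' from the earlier output.

```python
def linear_map(value, src=(4, 119), dst=(8, 40)):
    """Map value linearly from the src interval onto the dst interval."""
    (s0, s1), (d0, d1) = src, dst
    if not s0 <= value <= s1:
        # interp1d with default settings also rejects values outside its range
        raise ValueError(f"{value} is outside [{s0}, {s1}]")
    return d0 + (value - s0) * (d1 - d0) / (s1 - s0)

for count in (4, 61.5, 119):
    print(count, linear_map(count))
```

The lower bound 4 matches the earlier filter that keeps only locations mentioned more than 3 times, so every remaining count falls inside the source interval.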
gdf2
featurecla | scalerank | adm1_code | diss_me | iso_3166_2 | wikipedia | iso_a2 | adm0_sr | name | name_alt | ... | name_pt | name_ru | name_sv | name_tr | name_vi | name_zh | ne_id | geometry | repres_points | is_relevant | |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
51 | Admin-1 scale rank | 2 | USA-3528 | 3528 | US-AR | http://en.wikipedia.org/wiki/Arkansas | US | 1 | Arkansas | AR|Ark. | ... | Arkansas | Арканзас | Arkansas | Arkansas | Arkansas | 阿肯色州 | 1159315355 | POLYGON ((-89.70477 36.00157, -89.70932 35.983... | POINT (-92.47082 34.74907) | True |
58 | Admin-1 scale rank | 2 | USA-3542 | 3542 | US-FL | http://en.wikipedia.org/wiki/Florida | US | 5 | Florida | FL|Fla. | ... | Flórida | Флорида | Florida | Florida | Florida | 佛罗里达州 | 1159315207 | MULTIPOLYGON (((-87.48951 30.37768, -87.48011 ... | POINT (-81.69118 28.02564) | True |
61 | Admin-1 scale rank | 2 | USA-3529 | 3529 | US-IA | http://en.wikipedia.org/wiki/Iowa | US | 1 | Iowa | IA|Iowa | ... | Iowa | Айова | Iowa | Iowa | Iowa | 艾奥瓦州 | 1159315357 | POLYGON ((-91.44195 40.37945, -91.52993 40.432... | POINT (-93.15799 41.94286) | True |
63 | Admin-1 scale rank | 2 | USA-3546 | 3546 | US-IL | http://en.wikipedia.org/wiki/Illinois | US | 1 | Illinois | IL|Ill. | ... | Illinois | Иллинойс | Illinois | Illinois | Illinois | 伊利诺伊州 | 1159315309 | POLYGON ((-91.44195 40.37945, -91.39078 40.397... | POINT (-89.46648 39.78811) | True |
66 | Admin-1 scale rank | 2 | USA-3548 | 3548 | US-KY | http://en.wikipedia.org/wiki/Kentucky | US | 1 | Kentucky | Commonwealth of Kentucky|KY | ... | Kentucky | Кентукки | Kentucky | Kentucky | Kentucky | 肯塔基州 | 1159315313 | MULTIPOLYGON (((-89.15431 36.99211, -89.16507 ... | POINT (-84.75146 37.81147) | True |
67 | Admin-1 scale rank | 2 | USA-3535 | 3535 | US-LA | http://en.wikipedia.org/wiki/Louisiana | US | 5 | Louisiana | LA | ... | Luisiana | Луизиана | Louisiana | Louisiana | Louisiana | 路易斯安那州 | 1159315221 | MULTIPOLYGON (((-94.04131 33.01200, -93.86163 ... | POINT (-91.64884 30.97444) | True |
68 | Admin-1 scale rank | 2 | USA-3513 | 3513 | US-MA | http://en.wikipedia.org/wiki/Massachusetts | US | 6 | Massachusetts | Commonwealth of Massachusetts|MA|Mass. | ... | Massachusetts | Массачусетс | Massachusetts | Massachusetts | Massachusetts | 麻薩諸塞州 | 1159312157 | MULTIPOLYGON (((-71.80084 42.01196, -71.80164 ... | POINT (-72.09041 42.19646) | True |
71 | Admin-1 scale rank | 2 | USA-3562 | 3562 | US-MI | http://en.wikipedia.org/wiki/Michigan | US | 1 | Michigan | MI|Mich. | ... | Michigan | Мичиган | Michigan | Michigan | Michigan | 密歇根州 | 1159314665 | POLYGON ((-89.49838 47.99790, -89.45565 47.996... | POINT (-84.52730 44.97850) | True |
72 | Admin-1 scale rank | 2 | USA-3514 | 3514 | US-MN | http://en.wikipedia.org/wiki/Minnesota | US | 1 | Minnesota | MN|Minn. | ... | Minnesota | Миннесота | Minnesota | Minnesota | Minnesota | 明尼蘇達州 | 1159315297 | POLYGON ((-97.22574 48.99318, -97.10345 48.993... | POINT (-94.49195 46.42266) | True |
73 | Admin-1 scale rank | 2 | USA-3531 | 3531 | US-MO | http://en.wikipedia.org/wiki/Missouri | US | 1 | Missouri | MO | ... | Missouri | Миссури | Missouri | Missouri | Missouri | 密蘇里州 | 1159315361 | POLYGON ((-89.70477 36.00157, -89.88818 35.999... | POINT (-92.49784 38.30694) | True |
74 | Admin-1 scale rank | 2 | USA-3544 | 3544 | US-MS | http://en.wikipedia.org/wiki/Mississippi | US | 5 | Mississippi | MS|Miss. | ... | Mississippi | Миссисипи | Mississippi | Mississippi | Mississippi | 密西西比州 | 1159315231 | MULTIPOLYGON (((-88.17327 34.99901, -88.08477 ... | POINT (-89.76229 32.60562) | True |
78 | Admin-1 scale rank | 2 | USA-3532 | 3532 | US-NE | http://en.wikipedia.org/wiki/Nebraska | US | 1 | Nebraska | NE|Nebr. | ... | Nebraska | Небраска | Nebraska | Nebraska | Nebraska | 內布拉斯加州 | 1159315363 | POLYGON ((-102.02449 40.00112, -102.02454 40.1... | POINT (-100.00838 41.51073) | True |
80 | Admin-1 scale rank | 2 | USA-3558 | 3558 | US-NJ | http://en.wikipedia.org/wiki/New_Jersey | US | 5 | New Jersey | NJ|N.J. | ... | Nova Jérsia | Нью-Джерси | New Jersey | New Jersey | New Jersey | 新泽西州 | 1159315267 | MULTIPOLYGON (((-75.07417 39.98348, -75.02353 ... | POINT (-74.41977 40.11702) | True |
83 | Admin-1 scale rank | 2 | USA-3559 | 3559 | US-NY | http://en.wikipedia.org/wiki/New_York | US | 3 | New York | NY|N.Y. | ... | Nova Iorque | Нью-Йорк | New York | New York | New York | 纽约州 | 1159312155 | MULTIPOLYGON (((-79.76202 42.53898, -79.44623 ... | POINT (-76.09640 42.88696) | True |
84 | Admin-1 scale rank | 2 | USA-3550 | 3550 | US-OH | http://en.wikipedia.org/wiki/Ohio | US | 1 | Ohio | OH|Ohio | ... | Ohio | Огайо | Ohio | Ohio | Ohio | 俄亥俄州 | 1159315315 | POLYGON ((-83.12167 41.95000, -83.02996 41.832... | POINT (-82.70617 40.38192) | True |
85 | Admin-1 scale rank | 2 | USA-3533 | 3533 | US-OK | http://en.wikipedia.org/wiki/Oklahoma | US | 1 | Oklahoma | OK|Okla. | ... | Oklahoma | Оклахома | Oklahoma | Oklahoma | Oklahoma | 奧克拉荷馬州 | 1159315365 | POLYGON ((-94.61838 36.50087, -94.59598 36.361... | POINT (-97.22039 35.34591) | True |
87 | Admin-1 scale rank | 2 | USA-3560 | 3560 | US-PA | http://en.wikipedia.org/wiki/Pennsylvania | US | 1 | Pennsylvania | Commonwealth of Pennsylvania|PA | ... | Pensilvânia | Пенсильвания | Pennsylvania | Pensilvanya | Pennsylvania | 宾夕法尼亚州 | 1159315331 | POLYGON ((-80.52076 42.32439, -80.24758 42.366... | POINT (-77.74166 41.11123) | True |
89 | Admin-1 scale rank | 2 | USA-3545 | 3545 | US-SC | http://en.wikipedia.org/wiki/South_Carolina | US | 1 | South Carolina | SC|S.C. | ... | Carolina do Sul | Южная Каролина | South Carolina | Güney Karolina | Nam Carolina | 南卡罗来纳州 | 1159315307 | POLYGON ((-80.87235 32.02957, -81.07481 32.109... | POINT (-80.54117 33.59079) | True |
91 | Admin-1 scale rank | 2 | USA-3551 | 3551 | US-TN | http://en.wikipedia.org/wiki/Tennessee | US | 1 | Tennessee | TN|Tenn. | ... | Tennessee | Теннесси | Tennessee | Tennessee | Tennessee | 田纳西州 | 1159315319 | POLYGON ((-85.62360 35.00086, -85.78404 35.002... | POINT (-86.35632 35.83989) | True |
92 | Admin-1 scale rank | 2 | USA-3536 | 3536 | US-TX | http://en.wikipedia.org/wiki/Texas | US | 4 | Texas | TX|Tex. | ... | Texas | Техас | Texas | Teksas | Texas | 得克萨斯州 | 1159315211 | MULTIPOLYGON (((-94.48415 33.64843, -94.43201 ... | POINT (-99.66867 31.19509) | True |
94 | Admin-1 scale rank | 2 | USA-3552 | 3552 | US-VA | http://en.wikipedia.org/wiki/Virginia | US | 6 | Virginia | VA | ... | Virgínia | Виргиния | Virginia | Virjinya | Virginia | 弗吉尼亚州 | 1159315259 | MULTIPOLYGON (((-77.12205 38.94354, -77.10183 ... | POINT (-78.45841 37.98719) | True |
21 rows × 86 columns
In this section, we are going to create interactive maps using the Folium package.
Note: A very helpful Folium tutorial can be found at the following link
import folium
import matplotlib.pyplot as plt
m = folium.Map(location=[40.712776, -74.005974]) # Latitude and Longitude (Northing, Easting)
m
Let's change the tiles and put in a marker:
tiles = 'Stamen Terrain'
m = folium.Map(location=[40.712776, -74.005974],
zoom_start=9,
tiles = tiles)
folium.Marker(
location=[40.712776, -74.005974], # coordinates for the marker (New York City)
popup='Here is New York', # pop-up label for the marker
icon=folium.Icon()
).add_to(m)
m
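Folium expects locations as `[latitude, longitude]`, whereas the Shapely points created earlier in this lecture use `(longitude, latitude)`, that is, x before y. Mixing the two orders is an easy mistake, so here is a small sketch of a conversion helper; `to_folium_location` is a hypothetical name, not part of Folium.

```python
def to_folium_location(lon, lat):
    """Convert (lon, lat), the x/y order used for Shapely points, to Folium's [lat, lon]."""
    if not -90 <= lat <= 90:
        raise ValueError(f"latitude {lat} is outside [-90, 90]")
    if not -180 <= lon <= 180:
        raise ValueError(f"longitude {lon} is outside [-180, 180]")
    return [lat, lon]

new_york_xy = (-74.005974, 40.712776)  # (lon, lat)
print(to_folium_location(*new_york_xy))
```

The range checks catch only some swapped pairs, namely those where the value passed as latitude exceeds 90 in absolute value, so they complement rather than replace a visual check of the map.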
Let's use Folium to visualize earthquakes, using the Significant Earthquakes dataset:
#!mkdir ./datasets
!mkdir ./datasets/earthquakes
# download the dataset from Kaggle and unzip it
!kaggle datasets download usgs/earthquake-database -p ./datasets/earthquakes
!unzip ./datasets/earthquakes/*.zip -d ./datasets/earthquakes/
Downloading earthquake-database.zip to ./datasets/earthquakes 100%|████████████████████████████████████████| 590k/590k [00:00<00:00, 1.34MB/s] 100%|████████████████████████████████████████| 590k/590k [00:00<00:00, 1.34MB/s] Archive: ./datasets/earthquakes/earthquake-database.zip inflating: ./datasets/earthquakes/database.csv
import turicreate as tc
sf = tc.SFrame.read_csv("./datasets/earthquakes/database.csv")
sf
Finished parsing file /Users/michael/Dropbox (BGU)/massive data mining/ 2020/notebooks/datasets/earthquakes/database.csv
Parsing completed. Parsed 100 lines in 0.075065 secs.
------------------------------------------------------ Inferred types from first 100 line(s) of file as column_type_hints=[str,str,float,float,str,float,str,str,float,str,str,str,str,str,str,str,str,str,str,str,str] If parsing fails due to incorrect types, you can correct the inferred type list above and pass it to read_csv in the column_type_hints argument ------------------------------------------------------
Finished parsing file /Users/michael/Dropbox (BGU)/massive data mining/ 2020/notebooks/datasets/earthquakes/database.csv
Parsing completed. Parsed 23412 lines in 0.059568 secs.
Date | Time | Latitude | Longitude | Type | Depth | Depth Error | Depth Seismic Stations | Magnitude |
---|---|---|---|---|---|---|---|---|
01/02/1965 | 13:44:18 | 19.246 | 145.616 | Earthquake | 131.6 | | | 6.0 |
01/04/1965 | 11:29:49 | 1.863 | 127.352 | Earthquake | 80.0 | | | 5.8 |
01/05/1965 | 18:05:58 | -20.579 | -173.972 | Earthquake | 20.0 | | | 6.2 |
01/08/1965 | 18:49:43 | -59.076 | -23.557 | Earthquake | 15.0 | | | 5.8 |
01/09/1965 | 13:32:50 | 11.938 | 126.427 | Earthquake | 15.0 | | | 5.8 |
01/10/1965 | 13:36:32 | -13.405 | 166.629 | Earthquake | 35.0 | | | 6.7 |
01/12/1965 | 13:32:25 | 27.357 | 87.867 | Earthquake | 20.0 | | | 5.9 |
01/15/1965 | 23:17:42 | -13.309 | 166.212 | Earthquake | 35.0 | | | 6.0 |
01/16/1965 | 11:32:37 | -56.452 | -27.043 | Earthquake | 95.0 | | | 6.0 |
01/17/1965 | 10:43:17 | -24.563 | 178.487 | Earthquake | 565.0 | | | 5.8 |
Magnitude Type | Magnitude Error | Magnitude Seismic Stations ... | Azimuthal Gap | Horizontal Distance | Horizontal Error | Root Mean Square |
---|---|---|---|---|---|---|
MW | | | | | | |
MW | | | | | | |
MW | | | | | | |
MW | | | | | | |
MW | | | | | | |
MW | | | | | | |
MW | | | | | | |
MW | | | | | | |
MW | | | | | | |
MW | | | | | | |
ID | Source | Location Source | Magnitude Source | Status |
---|---|---|---|---|
ISCGEM860706 | ISCGEM | ISCGEM | ISCGEM | Automatic |
ISCGEM860737 | ISCGEM | ISCGEM | ISCGEM | Automatic |
ISCGEM860762 | ISCGEM | ISCGEM | ISCGEM | Automatic |
ISCGEM860856 | ISCGEM | ISCGEM | ISCGEM | Automatic |
ISCGEM860890 | ISCGEM | ISCGEM | ISCGEM | Automatic |
ISCGEM860922 | ISCGEM | ISCGEM | ISCGEM | Automatic |
ISCGEM861007 | ISCGEM | ISCGEM | ISCGEM | Automatic |
ISCGEM861111 | ISCGEM | ISCGEM | ISCGEM | Automatic |
ISCGEMSUP861125 | ISCGEMSUP | ISCGEM | ISCGEM | Automatic |
ISCGEM861148 | ISCGEM | ISCGEM | ISCGEM | Automatic |
Let's plot the distribution of earthquake magnitudes:
import seaborn as sns
%matplotlib inline
sns.set()
sns.distplot(sf['Magnitude'])
<matplotlib.axes._subplots.AxesSubplot at 0x147412d30>
import dateutil.parser # import the submodule explicitly so that dateutil.parser.parse is available
import turicreate.aggregate as agg
sf['Year'] = sf['Date'].apply(lambda dt: dateutil.parser.parse(dt).year)
g = sf.groupby('Year', {'Earthquakes Number':agg.COUNT()})
df = g.to_dataframe()
fig, ax = plt.subplots(figsize=(6,15))
sns.barplot(y='Year', x='Earthquakes Number',orient="h", data=df, ax=ax)
<matplotlib.axes._subplots.AxesSubplot at 0x158295470>
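The year-by-year count above relies on SFrame's `groupby`. As a minimal sketch of the same idea in plain Python, the example below parses a few dates copied from the tables in this section and counts them per year. It uses `datetime.strptime` with a fixed `MM/DD/YYYY` format, which is a stricter choice than the `dateutil` parser used in the lecture; `count_by_year` is a hypothetical helper name.

```python
from collections import Counter
from datetime import datetime

def count_by_year(dates, fmt="%m/%d/%Y"):
    """Count how many date strings fall in each calendar year."""
    return Counter(datetime.strptime(d, fmt).year for d in dates)

# A few dates copied from the earthquake tables in this section.
sample_dates = ["01/02/1965", "01/04/1965", "12/26/2004", "03/11/2011"]
yearly = count_by_year(sample_dates)
for year in sorted(yearly):
    print(year, yearly[year])
```

A fixed format raises an error on any row that deviates from it, which can be useful for spotting irregular records, whereas `dateutil.parser.parse` accepts many formats silently.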
sf = sf['Date','Time','Latitude', 'Longitude', 'Magnitude']
b_sf = sf[sf['Magnitude'] > 8]
b_sf.sort('Magnitude', ascending=False)
Date | Time | Latitude | Longitude | Magnitude |
---|---|---|---|---|
12/26/2004 | 00:58:53 | 3.295 | 95.982 | 9.1 |
03/11/2011 | 05:46:24 | 38.297 | 142.373 | 9.1 |
02/27/2010 | 06:34:12 | -36.122 | -72.898 | 8.8 |
02/04/1965 | 05:01:22 | 51.251 | 178.715 | 8.7 |
04/11/2012 | 08:38:37 | 2.327 | 93.063 | 8.6 |
03/28/2005 | 16:09:37 | 2.085 | 97.108 | 8.6 |
09/12/2007 | 11:10:27 | -4.438 | 101.367 | 8.4 |
06/23/2001 | 20:33:14 | -16.265 | -73.641 | 8.4 |
09/16/2015 | 22:54:33 | -31.5729 | -71.6744 | 8.3 |
05/24/2013 | 05:44:49 | 54.892 | 153.221 | 8.3 |
tiles = 'Stamen Terrain'
m = folium.Map(location=[0,0], zoom_start=2,
tiles = tiles)
for r in b_sf:
    tooltip = f"{r['Date']} {r['Time']} - Magnitude: {r['Magnitude']} - Location ({(r['Latitude'],r['Longitude'])})"
    folium.Marker(
        location=[r['Latitude'],r['Longitude']], # coordinates of the earthquake
        popup=tooltip, # text shown when the marker is clicked
        icon=folium.Icon(color='red', icon='info-sign')
    ).add_to(m)
m
Let's create heatmaps of areas with many earthquakes, first for the strongest earthquakes only and then for the full dataset, weighted by magnitude:
import folium
from folium.plugins import HeatMap
tiles = 'Stamen Terrain'
m = folium.Map(location=[0,0], zoom_start=2,
tiles = tiles)
data = [(r['Latitude'],r['Longitude']) for r in b_sf]
HeatMap(data, radius = 20).add_to(m)
m
import folium
from folium.plugins import HeatMap
tiles = 'Stamen Terrain'
m = folium.Map(location=[0,0], zoom_start=2,
tiles = tiles)
data = [(r['Latitude'],r['Longitude'],r['Magnitude']) for r in sf]
HeatMap(data, radius = 20).add_to(m)
m
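The second heatmap passes `(latitude, longitude, magnitude)` triples, so each point carries a weight. The sketch below shows one way to prepare such triples in plain Python: it drops rows with missing values and scales the magnitudes to the 0..1 range. The helper `heatmap_points` is hypothetical, the first three rows are copied from the table of strongest earthquakes, and the last row is made up to show the handling of a missing value. Whether scaling improves the picture depends on the data and on the heatmap settings.

```python
def heatmap_points(rows):
    """Build [lat, lon, weight] triples, scaling magnitudes so the maximum is 1.0."""
    valid = [
        r for r in rows
        if None not in (r.get("Latitude"), r.get("Longitude"), r.get("Magnitude"))
    ]
    if not valid:
        return []
    top = max(r["Magnitude"] for r in valid)
    return [[r["Latitude"], r["Longitude"], r["Magnitude"] / top] for r in valid]

rows = [
    {"Latitude": 3.295, "Longitude": 95.982, "Magnitude": 9.1},
    {"Latitude": 38.297, "Longitude": 142.373, "Magnitude": 9.1},
    {"Latitude": -36.122, "Longitude": -72.898, "Magnitude": 8.8},
    {"Latitude": 10.0, "Longitude": 20.0, "Magnitude": None},  # made-up incomplete row
]
points = heatmap_points(rows)
print(points)
```

The resulting list has the same shape as the `data` list built in the cell above, so it could be passed to `HeatMap` in its place.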
In this section, we will use Folium to analyze the UFO Sightings dataset. First let's explore the data:
from IPython.lib.display import YouTubeVideo
YouTubeVideo('hAAlDoAtV7Y')
#!mkdir ./datasets
!mkdir ./datasets/ufo
# download the dataset from Kaggle and unzip it
!kaggle datasets download NUFORC/ufo-sightings -p ./datasets/ufo
!unzip ./datasets/ufo/*.zip -d ./datasets/ufo/
Downloading ufo-sightings.zip to ./datasets/ufo 98%|█████████████████████████████████████▏| 10.0M/10.2M [00:01<00:00, 5.84MB/s] 100%|██████████████████████████████████████| 10.2M/10.2M [00:01<00:00, 5.51MB/s] Archive: ./datasets/ufo/ufo-sightings.zip inflating: ./datasets/ufo/complete.csv inflating: ./datasets/ufo/scrubbed.csv
import turicreate as tc
import turicreate.aggregate as agg
import dateutil
import matplotlib.pyplot as plt
import seaborn as sns
%matplotlib inline
sns.set()
sf = tc.SFrame.read_csv('./datasets/ufo/complete.csv')
sf
Parsing completed. Parsed 100 lines in 0.258661 secs.
------------------------------------------------------ Inferred types from first 100 line(s) of file as column_type_hints=[str,str,str,str,str,int,str,str,str,float,float] If parsing fails due to incorrect types, you can correct the inferred type list above and pass it to read_csv in the column_type_hints argument ------------------------------------------------------
Unexpected characters after last column. "0" Parse failed at token ending at: 5/5/2002 13:00,,,,,0,,,"Characteristics: Not aura or haze, but more like emanation of an energy field.",5/24/2005,0,0 Successfully parsed 11 tokens: 0: 5/5/2002 13:00 1: 2: 3: 4: 5: 0 6: 7: 8: Characteri ... rgy field. 9: 5 10: 0
Unexpected characters after last column. "0" Parse failed at token ending at: 3/30/2007 04:30,,ms,,,0,sphere,5 minutes,"White luminous Sphere",4/27/2007,0,0 Successfully parsed 11 tokens: 0: 3/30/2007 04:30 1: 2: ms 3: 4: 5: 0 6: sphere 7: 5 minutes 8: White luminous Sphere 9: 4 10: 0
Unexpected characters after last column. "0" Parse failed at token ending at: 3/31/1966 23:50,,,,,0,,,"Please refer to my report I reported to you 6/7/2001 and you posted on 8/5/2001. Anyway, do you know of any doctor that would remove t",10/30/2006,0,0 Successfully parsed 11 tokens: 0: 3/31/1966 23:50 1: 2: 3: 4: 5: 0 6: 7: 8: Please ref ... d remove t 9: 10 10: 0
Unexpected characters after last column. "0" Parse failed at token ending at: 7/15/2013 21:00,,ms,,,0,formation,5 minutes,"My two children and I saw strange lights move around the sky. Antigravity type motion,silent,some suddenly vanished",8/30/2013,0,0 Successfully parsed 11 tokens: 0: 7/15/2013 21:00 1: 2: ms 3: 4: 5: 0 6: formation 7: 5 minutes 8: My two chi ... y vanished 9: 8 10: 0
Unexpected characters after last column. "0" Parse failed at token ending at: 3/31/2010 01:30,,nv,,,0,teardrop,,"unusual flying objects taken by flash earth satalite over nevada highway",11/21/2010,0,0 Successfully parsed 11 tokens: 0: 3/31/2010 01:30 1: 2: nv 3: 4: 5: 0 6: teardrop 7: 8: unusual f ... da highway 9: 11 10: 0
Unexpected characters after last column. "0" Parse failed at token ending at: 1/26/2009 24:00,,,,,0,,,"I followed an FAA regulation, and you went back on your word for me to remain anonymous. Now you've been removed from the 7110.65. HA!",3/19/2009,0,0 Successfully parsed 11 tokens: 0: 1/26/2009 24:00 1: 2: 3: 4: 5: 0 6: 7: 8: I followed ... 65. HA! 9: 3 10: 0
Unexpected characters after last column. "0" Parse failed at token ending at: 11/20/2008 22:00,,tx,,,0,light,30 min,"10:00 pm= I saw 2 pair of lights. moments later one pair disapeared. another moment later I saw the other pair disapear & I saw a air p",1/10/2009,0,0 Successfully parsed 11 tokens: 0: 11/20/2008 22:00 1: 2: tx 3: 4: 5: 0 6: light 7: 30 min 8: 10:00 pm= ... aw a air p 9: 1 10: 0
Unexpected characters after last column. "0" Parse failed at token ending at: 5/6/2008 02:00,,,,,0,circle,30 seconds,"it was early morning i was reading a book for some reason and i saw 5 circle shaped UFO'S in the sky that were moving rapidly through t",6/12/2008,0,0 Successfully parsed 11 tokens: 0: 5/6/2008 02:00 1: 2: 3: 4: 5: 0 6: circle 7: 30 seconds 8: it was ear ... through t 9: 6 10: 0
Unexpected characters after last column. "0" Parse failed at token ending at: 12/7/2002 14:00,,,,,0,circle,,"it was round made nosies i saw it uabove the sky.",12/23/2002,0,0 Successfully parsed 11 tokens: 0: 12/7/2002 14:00 1: 2: 3: 4: 5: 0 6: circle 7: 8: it was rou ... e the sky. 9: 12 10: 0
Unexpected characters after last column. "0" Parse failed at token ending at: 12/7/2008 12:07,,,,,0,,,"I was looking at a friends house and tilted the view up(north) and noticed what appears to be a classic shape of a "space ship".",1/10/2009,0,0 Successfully parsed 11 tokens: 0: 12/7/2008 12:07 1: 2: 3: 4: 5: 0 6: 7: 8: I was look ... hip". 9: 1 10: 0
196 lines failed to parse correctly
Finished parsing file /Users/michael/Dropbox (BGU)/massive data mining/ 2020/notebooks/datasets/ufo/complete.csv
Parsing completed. Parsed 88679 lines in 0.303616 secs.
datetime | city | state | country | shape | duration (seconds) | duration (hours/min) |
---|---|---|---|---|---|---|
10/10/1949 20:30 | san marcos | tx | us | cylinder | 2700 | 45 minutes |
10/10/1949 21:00 | lackland afb | tx | | light | 7200 | 1-2 hrs |
10/10/1955 17:00 | chester (uk/england) | | gb | circle | 20 | 20 seconds |
10/10/1956 21:00 | edna | tx | us | circle | 20 | 1/2 hour |
10/10/1960 20:00 | kaneohe | hi | us | light | 900 | 15 minutes |
10/10/1961 19:00 | bristol | tn | us | sphere | 300 | 5 minutes |
10/10/1965 21:00 | penarth (uk/wales) | | gb | circle | 180 | about 3 mins |
10/10/1965 23:45 | norwalk | ct | us | disk | 1200 | 20 minutes |
10/10/1966 20:00 | pell city | al | us | disk | 180 | 3 minutes |
10/10/1966 21:00 | live oak | fl | us | disk | 120 | several minutes |
comments | date posted | latitude | longitude |
---|---|---|---|
This event took place in early fall around ... | 4/27/2004 | 29.8830556 | -97.9411111 |
1949 Lackland AFB, TX. Lights racing across the ... | 12/16/2005 | 29.38421 | -98.581082 |
Green/Orange circular disc over Chester, ... | 1/21/2008 | 53.2 | -2.916667 |
My older brother and twin sister were leaving the ... | 1/17/2004 | 28.9783333 | -96.6458333 |
AS a Marine 1st Lt. flying an FJ4B ... | 1/22/2004 | 21.4180556 | -157.8036111 |
My father is now 89 my brother 52 the girl with ... | 4/27/2007 | 36.595 | -82.1888889 |
penarth uk circle 3mins stayed 30ft above me for ... | 2/14/2006 | 51.434722 | -3.18 |
A bright orange color changing to reddish c ... | 10/2/1999 | 41.1175 | -73.4083333 |
Strobe Lighted disk shape object observed close ... | 3/19/2009 | 33.5861111 | -86.2861111 |
Saucer zaps energy from powerline as my pregnant ... | 5/11/2005 | 30.2947222 | -82.9841667 |
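The parse log above reports lines with "Unexpected characters after last column", i.e. rows with more fields than the 11 columns in the header. Before deciding whether it is safe to ignore them, it can help to list those rows. Here is a minimal sketch using only the standard library; the helper name `find_bad_rows` and the two sample rows are illustrative assumptions, not part of the original notebook:

```python
import csv
import io

def find_bad_rows(lines, expected_fields):
    """Return (row_number, field_count) for rows whose field count is unexpected."""
    bad = []
    for row_no, row in enumerate(csv.reader(lines), start=1):
        if len(row) != expected_fields:
            bad.append((row_no, len(row)))
    return bad

# Two illustrative rows modeled on the log above: the second one has 12 fields.
sample = io.StringIO(
    '10/10/1949 20:30,san marcos,tx,us,cylinder,2700,45 minutes,"text",4/27/2004,29.88,-97.94\n'
    '5/5/2002 13:00,,,,,0,,,"text",5/24/2005,0,0\n'
)
print(find_bad_rows(sample, expected_fields=11))
```

For the real file, one could pass an open file handle for `./datasets/ufo/complete.csv` instead of the `StringIO` sample and skip the header row.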
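def get_datetime(dt_str):
    # Return None for values that cannot be parsed (e.g. an hour of 24:00)
    try:
        return dateutil.parser.parse(dt_str)
    except (ValueError, OverflowError, TypeError):
        return None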
sf['datetime'] = sf['datetime'].apply(lambda dt_str: get_datetime(dt_str))
sf = sf.dropna()
sf['Hour'] = sf['datetime'].apply(lambda dt: dt.hour)
sf['Month'] = sf['datetime'].apply(lambda dt: dt.month)
sf['Year'] = sf['datetime'].apply(lambda dt: dt.year)
sf['Decade'] = sf['Year'].apply(lambda y: y - y%10)
sf2 = sf[sf['Year'] >= 1950]
g = sf2.groupby('Decade', {'Sightings Number': agg.COUNT()})
df = g.to_dataframe()
fig, ax = plt.subplots(figsize=(10,6))
sns.barplot(x='Decade', y='Sightings Number', data=df, ax=ax)
g = sf.groupby('Hour', {'Sightings Number': agg.COUNT()})
df = g.to_dataframe()
fig, ax = plt.subplots(figsize=(10,6))
sns.barplot(x='Hour', y='Sightings Number', data=df, ax=ax)
g = sf.groupby('shape', {'Sightings Number': agg.COUNT()})
g = g.sort('Sightings Number')
g = g[g['Sightings Number'] > 100]
df = g.to_dataframe()
fig, ax = plt.subplots(figsize=(10,6))
sns.barplot(x='shape', y='Sightings Number', data=df, ax=ax)
plt.xticks(rotation=45)
import folium
from folium.plugins import HeatMap
tiles = 'Stamen Terrain'
m = folium.Map(location=[0,0], zoom_start=2,
tiles = tiles)
data = [(r['latitude'],r['longitude']) for r in sf]
HeatMap(data, radius = 20).add_to(m)
m
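A heat map is only as good as the points fed into it, so it can be worth dropping missing or out-of-range coordinates before building the `data` list. The following is a small sketch of such a filter; `valid_points` is a hypothetical helper (not part of Folium or TuriCreate), and it assumes each row behaves like a dict with `latitude` and `longitude` keys, as the rows of the SFrame above do:

```python
import math

def valid_points(rows):
    """Keep (lat, lon) pairs that are numeric, finite, and within valid ranges."""
    points = []
    for r in rows:
        lat, lon = r.get('latitude'), r.get('longitude')
        if not isinstance(lat, (int, float)) or not isinstance(lon, (int, float)):
            continue  # missing or non-numeric value
        if math.isnan(lat) or math.isnan(lon):
            continue
        if -90 <= lat <= 90 and -180 <= lon <= 180:
            points.append((lat, lon))
    return points

rows = [
    {'latitude': 29.8830556, 'longitude': -97.9411111},  # from the table above
    {'latitude': None, 'longitude': -98.581082},          # hypothetical bad row
]
print(valid_points(rows))
```

The resulting list can be passed to `HeatMap` in place of `data`.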
Let's visualize the first 200 sighting locations from the 1980s:
sf_80 = sf[sf['Decade'] == 1980]
len(sf_80)
2273
from folium.plugins import MarkerCluster
def pop_text(r):
    # Popup text: the sighting time in bold, then up to 500 characters of the comment
    txt = f"<b>{r['datetime']}</b><br> {r['comments'][:500]}"
    return txt
m = folium.Map(zoom_start=8, tiles='CartoDB dark_matter')
mc = MarkerCluster()
for r in sf_80[:200]:
    mc.add_child(folium.CircleMarker(location=[r['latitude'], r['longitude']],
                                     radius=5, color="#007849", popup=pop_text(r), parse_html=True))
m.add_child(mc)
m
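The popup text above is built from free-text comments, which end up inside HTML. If a comment contains characters such as `<` or `&`, the popup may render incorrectly. One possible way to guard against this (a sketch, not the notebook's original approach) is to escape the text with the standard library's `html.escape` before formatting it; `safe_pop_text` is a hypothetical variant of `pop_text`:

```python
import html

def safe_pop_text(r, max_len=500):
    """Build popup HTML, escaping the free-text fields first."""
    when = html.escape(str(r['datetime']))
    comment = html.escape(r['comments'][:max_len])
    return f"<b>{when}</b><br> {comment}"

# Hypothetical row used only for illustration
row = {'datetime': '1980-06-01 21:00', 'comments': 'bright <light> & noise'}
print(safe_pop_text(row))
```

Only the markup added by the function itself (`<b>`, `<br>`) is left unescaped.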
def get_color(shape):
    # Map a sighting shape to a marker color
    if shape == 'circle':
        return 'red'
    if shape == 'triangle':
        return 'green'
    return 'blue'
s_sf = sf[sf['shape'].apply(lambda s: s in ['circle', 'triangle'])]
random_sample_sf, x = s_sf.random_split(0.1)
m = folium.Map(zoom_start=8, tiles='CartoDB dark_matter')
for r in random_sample_sf:
    m.add_child(folium.CircleMarker(location=[r['latitude'], r['longitude']],
                                    radius=5, color=get_color(r['shape']), popup=pop_text(r), parse_html=True))
m
In this example, we are going to work with Folium and TopoJSON. Namely, we are going to draw an interactive choropleth map of the population of Washington State by county. First, let's download the counties' population data and a TopoJSON file with the counties' geographic boundaries. We will start by drawing the county boundaries on a map:
# create the target directory (and ./datasets if it does not exist yet)
!mkdir -p ./datasets/WA
!wget -O ./datasets/WA/population.csv "https://data.wa.gov/api/views/2hia-rqet/rows.csv?accessType=DOWNLOAD"
!wget -O ./datasets/WA/WA-53-washington-counties.json https://raw.githubusercontent.com/deldersveld/topojson/master/countries/us-states/WA-53-washington-counties.json
--2020-05-08 16:04:25-- https://data.wa.gov/api/views/2hia-rqet/rows.csv?accessType=DOWNLOAD Resolving data.wa.gov (data.wa.gov)... 52.206.68.26, 52.206.140.205, 52.206.140.199 Connecting to data.wa.gov (data.wa.gov)|52.206.68.26|:443... connected. HTTP request sent, awaiting response... 200 OK Length: unspecified [text/csv] Saving to: ‘./datasets/WA/population.csv’ ./datasets/WA/popul [ <=> ] 73.63K 449KB/s in 0.2s 2020-05-08 16:04:26 (449 KB/s) - ‘./datasets/WA/population.csv’ saved [75396] --2020-05-08 16:04:26-- https://raw.githubusercontent.com/deldersveld/topojson/master/countries/us-states/WA-53-washington-counties.json Resolving raw.githubusercontent.com (raw.githubusercontent.com)... 151.101.112.133 Connecting to raw.githubusercontent.com (raw.githubusercontent.com)|151.101.112.133|:443... connected. HTTP request sent, awaiting response... 200 OK Length: 15542 (15K) [text/plain] Saving to: ‘./datasets/WA/WA-53-washington-counties.json’ ./datasets/WA/WA-53 100%[===================>] 15.18K --.-KB/s in 0.07s 2020-05-08 16:04:27 (232 KB/s) - ‘./datasets/WA/WA-53-washington-counties.json’ saved [15542/15542]
import folium
datasets_path = "./datasets/WA"
tiles = 'Mapbox Bright'
m = folium.Map(
location=[47.6117, -122.332],
tiles=tiles,
zoom_start=7
)
folium.Marker(
    location=[47.611700, -122.332000],  # coordinates for the marker (Seattle)
    popup='Here is Washington State',   # pop-up label for the marker
    icon=folium.Icon()
).add_to(m)
# Create a layer of Washington State counties
topoJSONpath = f"{datasets_path}/WA-53-washington-counties.json"
folium.TopoJson(
open(topoJSONpath),
object_path='objects.cb_2015_washington_county_20m',
).add_to(m)
m
Now, let's use the counties' population data to create a choropleth map:
import geopandas as gpd
gdf = gpd.read_file(topoJSONpath)
gdf.plot()
gdf.head()
id | STATEFP | COUNTYFP | COUNTYNS | AFFGEOID | GEOID | NAME | LSAD | ALAND | AWATER | geometry | |
---|---|---|---|---|---|---|---|---|---|---|---|
0 | None | 53 | 063 | 01529225 | 0500000US53063 | 53063 | Spokane | 06 | 4568197031 | 43789502 | POLYGON ((-117.82144 47.82584, -117.69950 47.8... |
1 | None | 53 | 041 | 01531927 | 0500000US53041 | 53041 | Lewis | 06 | 6223223859 | 86636988 | POLYGON ((-123.37212 46.79151, -123.20424 46.7... |
2 | None | 53 | 025 | 01531924 | 0500000US53025 | 53025 | Grant | 06 | 6939890129 | 289550821 | POLYGON ((-120.00743 47.22020, -120.00566 47.3... |
3 | None | 53 | 051 | 01529157 | 0500000US53051 | 53051 | Pend Oreille | 06 | 3626035232 | 65404066 | POLYGON ((-117.42912 48.99974, -117.26831 48.9... |
4 | None | 53 | 023 | 01533500 | 0500000US53023 | 53023 | Garfield | 06 | 1840672367 | 19490299 | POLYGON ((-117.85325 46.62453, -117.75075 46.6... |
import pandas as pd
wa_pop_path = f"{datasets_path}/population.csv"
df = pd.read_csv(wa_pop_path)
df
SEQUENCE | FILTER | COUNTY | JURISDICTION | POP_1990 | POP_1991 | POP_1992 | POP_1993 | POP_1994 | POP_1995 | ... | POP_2010 | POP_2011 | POP_2012 | POP_2013 | POP_2014 | POP_2015 | POP_2016 | POP_2017 | POP_2018 | POP_2019 | |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
0 | 1 | 1 | Adams | Adams County | 13603.0 | 13823.0 | 14063.0 | 14335.0 | 14679.0 | 15030.0 | ... | 18728 | 18950 | 19050 | 19200 | 19400 | 19410 | 19510 | 19870 | 20020 | 20150 |
1 | 2 | 2 | Adams | Unincorporated Adams County | 6466.0 | 6698.0 | 6776.0 | 7009.0 | 7162.0 | 7303.0 | ... | 8818 | 8960 | 8980 | 9040 | 9135 | 9085 | 9105 | 9165 | 9220 | 9270 |
2 | 3 | 3 | Adams | Incorporated Adams County | 7137.0 | 7125.0 | 7287.0 | 7326.0 | 7517.0 | 7727.0 | ... | 9910 | 9990 | 10070 | 10160 | 10265 | 10325 | 10405 | 10705 | 10800 | 10880 |
3 | 4 | 4 | Adams | Hatton | 71.0 | 80.0 | 81.0 | 82.0 | 83.0 | 84.0 | ... | 101 | 100 | 105 | 110 | 110 | 110 | 110 | 110 | 110 | 115 |
4 | 5 | 4 | Adams | Lind | 472.0 | 400.0 | 523.0 | 435.0 | 452.0 | 451.0 | ... | 564 | 560 | 565 | 570 | 565 | 560 | 550 | 550 | 550 | 550 |
... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
404 | 405 | 4 | Yakima | Yakima | 54843.0 | 58925.0 | 60455.0 | 61693.0 | 62387.0 | 63930.0 | ... | 91196 | 91630 | 91930 | 92620 | 93080 | 93220 | 93410 | 93900 | 94190 | 94440 |
405 | 406 | 4 | Yakima | Zillah | 1911.0 | 1922.0 | 1938.0 | 1991.0 | 2062.0 | 2096.0 | ... | 2964 | 3000 | 3035 | 3115 | 3140 | 3140 | 3145 | 3150 | 3165 | 3185 |
406 | 407 | 1 | Washington | State Total | 4866659.0 | 5000353.0 | 5091138.0 | 5188009.0 | 5291577.0 | 5396569.0 | ... | 6724540 | 6767900 | 6817770 | 6882400 | 6968170 | 7061410 | 7183700 | 7310300 | 7427570 | 7546410 |
407 | 408 | 2 | Washington | Unincorporated State Total | 2341365.0 | 2394824.0 | 2438904.0 | 2435178.0 | 2475442.0 | 2507091.0 | ... | 2478323 | 2454633 | 2438547 | 2449701 | 2470761 | 2497039 | 2516902 | 2557466 | 2591085 | 2635501 |
408 | 409 | 3 | Washington | Incorporated State Total | 2525294.0 | 2605529.0 | 2652234.0 | 2752831.0 | 2816135.0 | 2889478.0 | ... | 4246217 | 4313267 | 4379223 | 4432699 | 4497409 | 4564371 | 4666798 | 4752834 | 4836485 | 4910909 |
409 rows × 34 columns
import turicreate as tc
import turicreate.aggregate as agg
import pandas as pd
sf = tc.SFrame.read_csv(wa_pop_path)
sf = sf[sf['JURISDICTION'].apply(lambda j: "corporated" not in j)]
sf
Finished parsing file /Users/michael/Dropbox (BGU)/massive data mining/ 2020/notebooks/datasets/WA/population.csv
Parsing completed. Parsed 100 lines in 0.00683 secs.
------------------------------------------------------ Inferred types from first 100 line(s) of file as column_type_hints=[int,int,str,str,int,int,int,int,int,int,int,int,int,int,int,int,int,int,int,int,int,int,int,int,int,int,int,int,int,int,int,int,int,int] If parsing fails due to incorrect types, you can correct the inferred type list above and pass it to read_csv in the column_type_hints argument ------------------------------------------------------
Finished parsing file /Users/michael/Dropbox (BGU)/massive data mining/ 2020/notebooks/datasets/WA/population.csv
Parsing completed. Parsed 409 lines in 0.006773 secs.
SEQUENCE | FILTER | COUNTY | JURISDICTION | POP_1990 | POP_1991 | POP_1992 | POP_1993 | POP_1994 | POP_1995 | POP_1996 |
---|---|---|---|---|---|---|---|---|---|---|
1 | 1 | Adams | Adams County | 13603 | 13823 | 14063 | 14335 | 14679 | 15030 | 15323 |
4 | 4 | Adams | Hatton | 71 | 80 | 81 | 82 | 83 | 84 | 93 |
5 | 4 | Adams | Lind | 472 | 400 | 523 | 435 | 452 | 451 | 484 |
6 | 4 | Adams | Othello | 4638 | 4692 | 4735 | 4868 | 5033 | 5240 | 5265 |
7 | 4 | Adams | Ritzville | 1725 | 1728 | 1730 | 1729 | 1730 | 1733 | 1733 |
8 | 4 | Adams | Washtucna | 231 | 225 | 218 | 212 | 219 | 219 | 218 |
9 | 1 | Asotin | Asotin County | 17605 | 17677 | 17866 | 18124 | 18666 | 18937 | 19622 |
12 | 4 | Asotin | Asotin | 981 | 1039 | 1046 | 1002 | 1017 | 1072 | 1086 |
13 | 4 | Asotin | Clarkston | 6753 | 6632 | 6762 | 6771 | 7128 | 6748 | 6982 |
14 | 1 | Benton | Benton County | 112560 | 114439 | 116503 | 119374 | 123457 | 128359 | 132590 |
POP_1997 | POP_1998 | POP_1999 | POP_2000 | POP_2001 | POP_2002 | POP_2003 | POP_2004 | POP_2005 | POP_2006 | POP_2007 |
---|---|---|---|---|---|---|---|---|---|---|
15698 | 15879 | 16151 | 16428 | 16699 | 16911 | 17081 | 17489 | 17643 | 17690 | 17959 |
94 | 96 | 97 | 98 | 119 | 97 | 97 | 97 | 97 | 96 | 96 |
517 | 535 | 567 | 582 | 582 | 576 | 574 | 561 | 556 | 556 | 550 |
5508 | 5614 | 5681 | 5847 | 5961 | 6062 | 6129 | 6434 | 6551 | 6523 | 6714 |
1731 | 1733 | 1733 | 1736 | 1745 | 1735 | 1716 | 1707 | 1695 | 1685 | 1678 |
250 | 254 | 258 | 260 | 255 | 248 | 241 | 239 | 237 | 243 | 239 |
19943 | 20202 | 20442 | 20551 | 20650 | 20652 | 20709 | 20779 | 20939 | 21176 | 21413 |
1083 | 1094 | 1081 | 1095 | 1106 | 1114 | 1120 | 1128 | 1133 | 1174 | 1189 |
7168 | 7369 | 7565 | 7337 | 7371 | 7300 | 7275 | 7265 | 7270 | 7258 | 7273 |
135620 | 137717 | 139498 | 142475 | 145267 | 148290 | 151933 | 155874 | 159286 | 162255 | 165096 |
POP_2008 | POP_2009 | POP_2010 | POP_2011 | POP_2012 | POP_2013 | POP_2014 | POP_2015 | POP_2016 | POP_2017 | POP_2018 |
---|---|---|---|---|---|---|---|---|---|---|
18214 | 18421 | 18728 | 18950 | 19050 | 19200 | 19400 | 19410 | 19510 | 19870 | 20020 |
96 | 98 | 101 | 100 | 105 | 110 | 110 | 110 | 110 | 110 | 110 |
550 | 550 | 564 | 560 | 565 | 570 | 565 | 560 | 550 | 550 | 550 |
6931 | 7089 | 7364 | 7420 | 7495 | 7565 | 7695 | 7780 | 7875 | 8175 | 8270 |
1681 | 1675 | 1673 | 1705 | 1695 | 1700 | 1680 | 1670 | 1660 | 1660 | 1660 |
214 | 210 | 208 | 205 | 210 | 215 | 215 | 205 | 210 | 210 | 210 |
21522 | 21593 | 21623 | 21650 | 21700 | 21800 | 21950 | 22010 | 22150 | 22290 | 22420 |
1224 | 1244 | 1251 | 1255 | 1255 | 1265 | 1265 | 1260 | 1270 | 1275 | 1275 |
7226 | 7228 | 7229 | 7200 | 7205 | 7210 | 7225 | 7235 | 7260 | 7250 | 7205 |
167598 | 171402 | 175177 | 177900 | 180000 | 183400 | 186500 | 188590 | 190500 | 193500 | 197420 |
POP_2019 |
---|
20150 |
115 |
550 |
8345 |
1660 |
210 |
22520 |
1280 |
7205 |
201800 |
sf = sf[sf["FILTER"] == 1]
sf
SEQUENCE | FILTER | COUNTY | JURISDICTION | POP_1990 | POP_1991 | POP_1992 | POP_1993 | POP_1994 | POP_1995 |
---|---|---|---|---|---|---|---|---|---|
1 | 1 | Adams | Adams County | 13603 | 13823 | 14063 | 14335 | 14679 | 15030 |
9 | 1 | Asotin | Asotin County | 17605 | 17677 | 17866 | 18124 | 18666 | 18937 |
14 | 1 | Benton | Benton County | 112560 | 114439 | 116503 | 119374 | 123457 | 128359 |
22 | 1 | Chelan | Chelan County | 52250 | 53436 | 54965 | 56423 | 58319 | 60079 |
30 | 1 | Clallam | Clallam County | 56210 | 57626 | 58275 | 59155 | 59919 | 60548 |
36 | 1 | Clark | Clark County | 238053 | 248417 | 255915 | 264548 | 274423 | 286804 |
47 | 1 | Columbia | Columbia County | 4024 | 4044 | 4041 | 4047 | 4053 | 4051 |
52 | 1 | Cowlitz | Cowlitz County | 82119 | 82862 | 83802 | 84813 | 85921 | 87232 |
60 | 1 | Douglas | Douglas County | 26195 | 27648 | 27423 | 28143 | 28692 | 29312 |
69 | 1 | Ferry | Ferry County | 6295 | 6366 | 6465 | 6561 | 6648 | 6812 |
POP_1996 | POP_1997 | POP_1998 | POP_1999 | POP_2000 | POP_2001 | POP_2002 | POP_2003 | POP_2004 | POP_2005 | POP_2006 |
---|---|---|---|---|---|---|---|---|---|---|
15323 | 15698 | 15879 | 16151 | 16428 | 16699 | 16911 | 17081 | 17489 | 17643 | 17690 |
19622 | 19943 | 20202 | 20442 | 20551 | 20650 | 20652 | 20709 | 20779 | 20939 | 21176 |
132590 | 135620 | 137717 | 139498 | 142475 | 145267 | 148290 | 151933 | 155874 | 159286 | 162255 |
61240 | 62895 | 64199 | 65575 | 66616 | 66896 | 67400 | 67507 | 68013 | 68963 | 69895 |
61469 | 62037 | 62933 | 63425 | 64179 | 64717 | 65398 | 65928 | 66725 | 67672 | 68948 |
298364 | 310512 | 323892 | 334641 | 345238 | 352715 | 364855 | 374091 | 385370 | 394600 | 404737 |
4051 | 4056 | 4058 | 4062 | 4064 | 4114 | 4115 | 4111 | 4122 | 4135 | 4128 |
88531 | 89568 | 90600 | 91744 | 92948 | 94081 | 94854 | 95849 | 96593 | 97673 | 99095 |
29967 | 30548 | 31427 | 32035 | 32603 | 32817 | 32871 | 33545 | 33944 | 34466 | 35505 |
6950 | 7058 | 7135 | 7184 | 7260 | 7340 | 7378 | 7397 | 7367 | 7405 | 7462 |
POP_2007 | POP_2008 | POP_2009 | POP_2010 | POP_2011 | POP_2012 | POP_2013 | POP_2014 | POP_2015 | POP_2016 | POP_2017 |
---|---|---|---|---|---|---|---|---|---|---|
17959 | 18214 | 18421 | 18728 | 18950 | 19050 | 19200 | 19400 | 19410 | 19510 | 19870 |
21413 | 21522 | 21593 | 21623 | 21650 | 21700 | 21800 | 21950 | 22010 | 22150 | 22290 |
165096 | 167598 | 171402 | 175177 | 177900 | 180000 | 183400 | 186500 | 188590 | 190500 | 193500 |
70773 | 71799 | 72185 | 72453 | 72700 | 73200 | 73600 | 74300 | 75030 | 75910 | 76830 |
69847 | 70629 | 71027 | 71404 | 71600 | 72000 | 72350 | 72500 | 72650 | 73410 | 74240 |
412692 | 419091 | 423775 | 425363 | 428000 | 431250 | 435500 | 442800 | 451820 | 461010 | 471000 |
4095 | 4099 | 4097 | 4078 | 4100 | 4100 | 4100 | 4080 | 4090 | 4050 | 4100 |
100377 | 101542 | 102175 | 102410 | 102700 | 103050 | 103300 | 103700 | 104280 | 104850 | 105900 |
36340 | 37238 | 38036 | 38431 | 38650 | 38900 | 39280 | 39700 | 39990 | 40720 | 41420 |
7484 | 7529 | 7563 | 7551 | 7600 | 7650 | 7650 | 7660 | 7710 | 7700 | 7740 |
POP_2018 | POP_2019 |
---|---|
20020 | 20150 |
22420 | 22520 |
197420 | 201800 |
77800 | 78420 |
75130 | 76010 |
479500 | 488500 |
4150 | 4160 |
107310 | 108950 |
42120 | 42820 |
7780 | 7830 |
total_wa = sf['POP_2019'].sum()
sf['Pop Percentage'] = sf['POP_2019'].apply(lambda p: p/float(total_wa))
sf = sf.rename({"COUNTY":"NAME"})
gsf = tc.SFrame(gdf[['GEOID', 'NAME']])
g = sf.join(gsf, on="NAME")
g.sort('Pop Percentage', ascending=False)
SEQUENCE | FILTER | NAME | JURISDICTION | POP_1990 | POP_1991 | POP_1992 | POP_1993 | POP_1994 | POP_1995 |
---|---|---|---|---|---|---|---|---|---|
124 | 1 | King | King County | 1507305 | 1549991 | 1570997 | 1590603 | 1609529 | 1625241 |
245 | 1 | Pierce | Pierce County | 586203 | 598065 | 610619 | 623697 | 636802 | 649284 |
292 | 1 | Snohomish | Snohomish County | 465628 | 488075 | 496461 | 507336 | 519960 | 531704 |
315 | 1 | Spokane | Spokane County | 361333 | 365887 | 371147 | 377020 | 384035 | 391318 |
36 | 1 | Clark | Clark County | 238053 | 248417 | 255915 | 264548 | 274423 | 286804 |
340 | 1 | Thurston | Thurston County | 161238 | 167663 | 172425 | 177058 | 181715 | 186419 |
166 | 1 | Kitsap | Kitsap County | 189731 | 196926 | 202113 | 207976 | 212429 | 218308 |
390 | 1 | Yakima | Yakima County | 188823 | 191490 | 194939 | 198225 | 202044 | 206046 |
361 | 1 | Whatcom | Whatcom County | 127780 | 132669 | 137791 | 141270 | 146056 | 149942 |
14 | 1 | Benton | Benton County | 112560 | 114439 | 116503 | 119374 | 123457 | 128359 |
POP_1996 | POP_1997 | POP_1998 | POP_1999 | POP_2000 | POP_2001 | POP_2002 | POP_2003 | POP_2004 | POP_2005 | POP_2006 |
---|---|---|---|---|---|---|---|---|---|---|
1640249 | 1659106 | 1686266 | 1712122 | 1737046 | 1755487 | 1777514 | 1788082 | 1800783 | 1814999 | 1845209 |
653212 | 664070 | 675651 | 688884 | 700818 | 709288 | 721124 | 731969 | 743701 | 756919 | 774050 |
542738 | 554585 | 570896 | 589266 | 606024 | 617864 | 629287 | 639942 | 648778 | 661346 | 676126 |
397508 | 403954 | 408740 | 413665 | 417939 | 423127 | 428755 | 428889 | 431905 | 438249 | 446751 |
298364 | 310512 | 323892 | 334641 | 345238 | 352715 | 364855 | 374091 | 385370 | 394600 | 404737 |
190409 | 194440 | 198435 | 203167 | 207355 | 210102 | 214139 | 218264 | 223065 | 229286 | 234083 |
221849 | 225251 | 227179 | 229569 | 231969 | 233918 | 236656 | 239443 | 240777 | 239819 | 244049 |
209381 | 212375 | 215587 | 219483 | 222581 | 224229 | 224790 | 227956 | 230002 | 231902 | 234408 |
154122 | 157071 | 160667 | 163774 | 166826 | 170980 | 174238 | 175984 | 180245 | 184965 | 190088 |
132590 | 135620 | 137717 | 139498 | 142475 | 145267 | 148290 | 151933 | 155874 | 159286 | 162255 |
POP_2007 | POP_2008 | POP_2009 | POP_2010 | POP_2011 | POP_2012 | POP_2013 | POP_2014 | POP_2015 | POP_2016 | POP_2017 |
---|---|---|---|---|---|---|---|---|---|---|
1871098 | 1891125 | 1909205 | 1931249 | 1942600 | 1957000 | 1981900 | 2017250 | 2052800 | 2105100 | 2153700 |
786911 | 794330 | 796900 | 795225 | 802150 | 808200 | 814500 | 821300 | 830120 | 844490 | 859400 |
689314 | 699330 | 705894 | 713335 | 717000 | 722900 | 730500 | 741000 | 757600 | 772860 | 789400 |
454034 | 460303 | 466426 | 471221 | 472650 | 475600 | 480000 | 484500 | 488310 | 492530 | 499800 |
412692 | 419091 | 423775 | 425363 | 428000 | 431250 | 435500 | 442800 | 451820 | 461010 | 471000 |
239570 | 244853 | 249336 | 252264 | 254100 | 256800 | 260100 | 264000 | 267410 | 272690 | 276900 |
247476 | 249905 | 251249 | 251133 | 253900 | 254500 | 254000 | 255900 | 258200 | 262590 | 264300 |
236923 | 239524 | 241708 | 243231 | 244700 | 246000 | 247250 | 248800 | 249970 | 250900 | 253000 |
195298 | 197675 | 199736 | 201140 | 202100 | 203500 | 205800 | 207600 | 209790 | 212540 | 216300 |
165096 | 167598 | 171402 | 175177 | 177900 | 180000 | 183400 | 186500 | 188590 | 190500 | 193500 |
POP_2018 | POP_2019 | Pop Percentage | GEOID |
---|---|---|---|
2190200 | 2226300 | 0.14750722528990606 | 53033 |
872220 | 888300 | 0.05885580030769598 | 53053 |
805120 | 818700 | 0.05424433604853168 | 53061 |
507950 | 515250 | 0.034138749418597715 | 53063 |
479500 | 488500 | 0.032366383485657416 | 53011 |
281700 | 285800 | 0.018936156397545322 | 53067 |
267120 | 270100 | 0.017895926672417746 | 53035 |
254500 | 255950 | 0.01695839478639512 | 53077 |
220350 | 225300 | 0.014927627838932684 | 53073 |
197420 | 201800 | 0.013370596084760834 | 53005 |
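A quick arithmetic check on the table above is worthwhile. In the pandas output, the "State Total" row is listed with FILTER equal to 1 and COUNTY equal to "Washington", so it appears to survive both filters and to be included in `total_wa`. If so, the denominator counts the state twice and each share is half of the county's real share. The sketch below uses only numbers copied from the outputs above:

```python
# Values copied from the POP_2019 and Pop Percentage columns shown above.
king_2019 = 2226300
shown_share = 0.14750722528990606
state_total_2019 = 7546410  # "State Total" row (COUNTY == "Washington", FILTER == 1)

# The denominator implied by the displayed share
implied_total = king_2019 / shown_share
print(implied_total / state_total_2019)  # close to 2

# Share of the state population when the state total is the denominator
corrected_share = king_2019 / state_total_2019
print(round(corrected_share, 4))
```

One possible fix (an assumption, not the original code) is to exclude the row whose COUNTY is "Washington" before computing the sum. The colors of the choropleth are relative, so the map itself would look the same; only the legend values would change.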
tiles = 'Mapbox Bright'
m = folium.Map(
location=[47.6117, -122.332],
tiles=tiles,
zoom_start=7
)
# Create a choropleth layer of Washington State counties
topoJSONpath = f"{datasets_path}/WA-53-washington-counties.json"
folium.Choropleth(
geo_data=open(topoJSONpath,"r"),
topojson='objects.cb_2015_washington_county_20m',
data=g.to_dataframe(),
columns=['GEOID', 'Pop Percentage'],
key_on='properties.GEOID',
fill_color='BuPu',
fill_opacity=0.7,
line_opacity=0.2,
legend_name='Population Percentage',
reset=True
).add_to(m)
folium.LayerControl().add_to(m)
m
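The choropleth depends on the earlier join between the population data and the county geometries on the NAME column. `SFrame.join` performs an inner join by default, so rows without a matching name are dropped silently. A small sanity check on the join keys can reveal such rows. This is a sketch with a hypothetical helper, `unmatched_keys`, and an illustrative subset of names taken from the tables above:

```python
def unmatched_keys(left_names, right_names):
    """Return the names that appear on only one side of a join."""
    left, right = set(left_names), set(right_names)
    return sorted(left - right), sorted(right - left)

# Illustrative subsets, not the full columns
population_names = ["King", "Spokane", "Washington"]  # "Washington" is the state total row
geometry_names = ["King", "Spokane", "Pend Oreille"]

only_in_population, only_in_geometry = unmatched_keys(population_names, geometry_names)
print(only_in_population)
print(only_in_geometry)
```

With the real data, one could pass `sf['NAME']` and `gsf['NAME']` to see which counties, if any, would be missing from the map.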
In this section, we are going to create an interactive map with Plotly Express, using data from WikiTree. Our goal is to create a map of the most common birth locations over time. Let's start by loading the data into an SFrame object:
import turicreate as tc
import turicreate.aggregate as agg
sf = tc.SFrame.read_csv('./datasets/wikitree_birth_locations.csv', verbose=True)
# sf['Birth Location'] = sf['Birth Location'].apply(lambda s: s.strip().lower() if len(s.lower()) > 1 else None)
sf
Finished parsing file /Users/michael/Dropbox (BGU)/massive data mining/datasets/wikitree_birth_locations.csv
Parsing completed. Parsed 100 lines in 0.418882 secs.
------------------------------------------------------ Inferred types from first 100 line(s) of file as column_type_hints=[int,str] If parsing fails due to incorrect types, you can correct the inferred type list above and pass it to read_csv in the column_type_hints argument ------------------------------------------------------
Read 2252637 lines. Lines per second: 2.81638e+06
Finished parsing file /Users/michael/Dropbox (BGU)/massive data mining/datasets/wikitree_birth_locations.csv
Parsing completed. Parsed 16420973 lines in 4.60602 secs.
Birth Date | Birth Location |
---|---|
0 | |
1940 | |
1920 | |
1920 | |
1940 | |
1910 | |
1920 | |
1940 | |
1940 | |
18870921 | Canning, Kings, Nova Scotia, Canada ... |
def get_year(s):
    # The year is the first four digits of the date string (e.g. "18870921" -> 1887)
    if len(s) < 4:
        return None
    if len(s) == 4:
        return int(s)
    return int(s[:4])
sf['Year'] = sf['Birth Date'].apply(lambda s: get_year(str(s)))
sf = sf[sf['Year'] < 2019]
sf = sf[sf['Year'] >= 0]
sf = sf.dropna()
print("Unique locations number %s" % len(sf['Birth Location'].unique()))
Unique locations number 1691018
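To see what the year extraction does with the values shown in the table above, here is a short worked example. It restates the same logic in compact form (the four-character case is covered by the slice) and runs it on three values from the `Birth Date` column:

```python
def get_year(s):
    """Return the first four digits of a date string as an int, or None if too short."""
    if len(s) < 4:
        return None
    return int(s[:4])

for raw in (18870921, 1940, 0):
    print(raw, '->', get_year(str(raw)))
```

Note that this relies on the year always occupying the first four digits. A hypothetical seven-digit value such as 9870921 would be read as the year 9870 and then removed by the `Year < 2019` filter.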
We have too many locations (at least for this example). Let's keep only the places with more than 50 births:
g = sf.groupby('Birth Location', {'count': agg.COUNT()})
g = g[g['count'] > 50]
len(g)
23420
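The group-and-filter step above counts how often each location appears and keeps the frequent ones. For readers less familiar with SFrame aggregations, the same idea can be sketched in plain Python with `collections.Counter`; the function name and the sample values here are illustrative only:

```python
from collections import Counter

def frequent_values(values, min_count):
    """Return {value: count} for values that appear more than min_count times."""
    counts = Counter(values)
    return {v: c for v, c in counts.items() if c > min_count}

# Tiny illustrative sample with a threshold of 1 instead of 50
sample = ["Italy", "Italy", "New York", "Italy", "New York", "Paris"]
print(frequent_values(sample, min_count=1))
```

The SFrame version is preferable for the full dataset, since it does not need to hold all 16 million rows in a Python list.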
We have over 23,000 locations, so let's resolve each location to find its latitude and longitude:
from geopy import Bing
BING_MAPS_API_KEY = open("../../env/.keys").read().strip()  # strip the trailing newline, if any
b = Bing(BING_MAPS_API_KEY)
r = b.geocode("New York")
r
Location(New York, NY, United States, (40.71455001831055, -74.00714111328125, 0.0))
r.raw
{'__type': 'Location:http://schemas.microsoft.com/search/local/ws/rest/v1', 'bbox': [40.363765716552734, -74.74592590332031, 41.0565299987793, -73.26721954345703], 'name': 'New York, NY', 'point': {'type': 'Point', 'coordinates': [40.71455001831055, -74.00714111328125]}, 'address': {'adminDistrict': 'NY', 'countryRegion': 'United States', 'formattedAddress': 'New York, NY', 'locality': 'New York'}, 'confidence': 'High', 'entityType': 'PopulatedPlace', 'geocodePoints': [{'type': 'Point', 'coordinates': [40.71455001831055, -74.00714111328125], 'calculationMethod': 'Rooftop', 'usageTypes': ['Display']}], 'matchCodes': ['Good']}
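Judging by the response above, the raw Bing result stores the coordinates under `point` as `[latitude, longitude]`. The following sketch pulls them out of such a dictionary defensively; `extract_lat_lon` is a hypothetical helper, and the sample dictionary is a trimmed copy of the output above:

```python
def extract_lat_lon(raw):
    """Return (lat, lon) from a Bing-style raw response dict, or None if absent."""
    try:
        lat, lon = raw['point']['coordinates']
    except (KeyError, TypeError, ValueError):
        return None
    return lat, lon

raw = {'name': 'New York, NY',
       'point': {'type': 'Point',
                 'coordinates': [40.71455001831055, -74.00714111328125]}}
print(extract_lat_lon(raw))
print(extract_lat_lon({}))  # an empty record yields None
```

When the geopy `Location` object itself is at hand, its `latitude` and `longitude` attributes give the same values directly.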
Let's resolve the most frequently mentioned locations and store each result in a MongoDB collection, so that every location is geocoded only once. Let's create a collection and insert the data of a single location into it:
import pymongo
client = pymongo.MongoClient("mongodb://localhost:27017/")
db = client['locations'] # Created a new DB named locations
collection = db['wikitree_locatios']
j = {'raw': r.raw, 'result': str(r), "query": "New York"}
collection.insert_one(j)
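# Create an index on the query field for faster lookups. The lookups below match the
# exact query string, which a regular (ascending) index serves; a TEXT index only serves $text searches
collection.create_index([('query', pymongo.ASCENDING)], name='location_query_index')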
'location_query_index'
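from geopy.exc import GeocoderTimedOut
import time
def find_location(collection, query, search_bing=True):
    # Look in the MongoDB collection first; query Bing only if the location is not stored yet
    result = collection.find_one({'query': query})
    if result is None and search_bing:
        try:
            result = b.geocode(query)
            result = add_location(collection, query, result)
        except Exception:
            # Geocoding failed (e.g., GeocoderTimedOut): wait a little and give up on this query
            time.sleep(5)
            result = None
    return result
def add_location(collection, query, result):
    # Store unresolved queries too (with an empty raw dict) so they are not sent to Bing again
    if result is not None:
        j = {'raw': result.raw, 'result': str(result), "query": query}
    else:
        j = {'raw': {}, 'result': None, "query": query}
    collection.insert_one(j)
    return j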
find_location(collection, "Italy")
{'_id': ObjectId('5ca26c697a00fc082e44119f'), 'raw': {'__type': 'Location:http://schemas.microsoft.com/search/local/ws/rest/v1', 'bbox': [36.67594528198242, 1.228389024734497, 47.06735610961914, 23.807432174682617], 'name': 'Italy', 'point': {'type': 'Point', 'coordinates': [43.529029846191406, 12.16218376159668]}, 'address': {'countryRegion': 'Italy', 'formattedAddress': 'Italy'}, 'confidence': 'High', 'entityType': 'CountryRegion', 'geocodePoints': [{'type': 'Point', 'coordinates': [43.529029846191406, 12.16218376159668], 'calculationMethod': 'Rooftop', 'usageTypes': ['Display']}], 'matchCodes': ['Good']}, 'result': 'Italy', 'query': 'Italy'}
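The pattern used by `find_location` is "check the store first, call the geocoder only on a miss". Here is a self-contained sketch of the same idea with a few retries added. It uses a plain dictionary in place of MongoDB and a fake geocoder in place of Bing, so `cached_geocode`, `flaky_geocode`, and the `retries`/`delay` parameters are all illustrative assumptions rather than the notebook's code:

```python
import time


def cached_geocode(cache, geocode, query, retries=3, delay=0.0):
    """Return a cached answer if present; otherwise call geocode() with retries."""
    if query in cache:
        return cache[query]
    for attempt in range(retries):
        try:
            result = geocode(query)
        except Exception:
            # Transient failure (e.g., a timeout): wait longer each time, then retry
            time.sleep(delay * (2 ** attempt))
            continue
        # Store unresolved queries as well (result may be None) so they are not re-queried
        cache[query] = result
        return result
    return None  # all attempts failed; nothing is cached


# Stand-in for b.geocode that fails once and then succeeds
calls = {'n': 0}

def flaky_geocode(query):
    calls['n'] += 1
    if calls['n'] == 1:
        raise TimeoutError("simulated timeout")
    return {'query': query, 'coordinates': [43.529, 12.162]}


cache = {}
print(cached_geocode(cache, flaky_geocode, "Italy"))  # one failure, then a successful call
print(cached_geocode(cache, flaky_geocode, "Italy"))  # served from the cache
print(calls['n'])
```

Retrying inside the helper avoids silently skipping a location after a single timeout, at the cost of a longer wait when the service is really down.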
from tqdm.notebook import tqdm
# Geocode only the n most frequent birth locations
n = 5000
g = g.sort('count', ascending=False)
l = list(g['Birth Location'])[:n]
for loc in tqdm(l):
    find_location(collection, loc)
from functools import lru_cache
@lru_cache(maxsize=None)
def get_country(query):
    r = find_location(collection, query)
    try:
        return r['raw']['address']['countryRegion']
    except (KeyError, TypeError):
        # Location was not resolved, or the result has no country
        return None
locations_set = set(l)
sf2 = sf[sf['Birth Location'].apply(lambda loc: loc in locations_set)]
countries = []
for loc in tqdm(sf2['Birth Location']):
    countries.append(get_country(loc))
sf2['Country'] = countries
sf2
Birth Date | Birth Location | Year | Country |
---|---|---|---|
1940 | | 1940 | None |
1920 | | 1920 | None |
1920 | | 1920 | None |
1940 | | 1940 | None |
1910 | | 1910 | None |
1920 | | 1920 | None |
1940 | | 1940 | None |
1940 | | 1940 | None |
1910 | | 1910 | None |
1920 | | 1920 | None |
import plotly_express as px
import pycountry
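def get_country_to_alpha3(name):
    # Handle country names that differ from the official names used by pycountry
    if name == 'Czech Republic':
        return 'CZE'
    if name == 'Russia':
        return 'RUS'
    try:
        return pycountry.countries.get(name=name).alpha_3
    except (KeyError, AttributeError):
        # Unknown country name
        return None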
sf2 = sf2.dropna()
g2 = sf2.groupby(['Year','Country'], {'Count': agg.COUNT()})
g3 = g2[g2['Year'] >= 1400]
g3['alpha3'] = g3['Country'].apply(lambda c: get_country_to_alpha3(c))
g3 = g3.dropna()
g3 = g3[g3['Year'] >= 1750]
g3 = g3[g3['Year'] <= 1950]
# Sort by year so that the animation frames play in chronological order
g3 = g3.sort('Year')
px.scatter_geo(g3.to_dataframe(), locations="alpha3", hover_name="Country", size="Count",
               animation_frame="Year", projection="natural earth")
us_sf = sf2[sf2['Country'] == 'United States']
us_sf.materialize()
us_sf
Birth Date | Birth Location | Year | Country |
---|---|---|---|
19200714 | Massachusetts | 1920 | United States |
18791130 | Massachusetts | 1879 | United States |
19230926 | Boston, MA | 1923 | United States |
19010524 | Boston, Massachusetts, United States ... | 1901 | United States |
19030118 | Cambridge, Massachusetts | 1903 | United States |
19120000 | Massachusetts | 1912 | United States |
18900828 | New York City, New York, United States ... | 1890 | United States |
18660200 | New York, NY | 1866 | United States |
18461100 | Massachusetts | 1846 | United States |
19280422 | Worcester, Massachusetts | 1928 | United States |
us_g = us_sf.groupby(['Year','Birth Location'], {"Count": agg.COUNT()})
us_g
Birth Location | Year | Count |
---|---|---|
Smith County, Tennessee, USA ... | 1887 | 1 |
Perry, Ohio, United States ... | 1891 | 1 |
Giles, Tennessee, USA | 1824 | 3 |
Jersey City, Hudson, New Jersey ... | 1918 | 3 |
Cabell County, West Virginia ... | 1837 | 4 |
Cabarrus County, North Carolina, United States ... | 1873 | 3 |
Harwich, Barnstable, Massachusetts ... | 1716 | 3 |
Cumberland, Pennsylvania | 1712 | 2 |
St. Louis, Missouri | 1844 | 4 |
Germantown, Philadelphia, Pennsylvania ... | 1769 | 2 |
@lru_cache(maxsize=None)
def get_long_lat(query):
    # Note: Bing returns the point as [latitude, longitude]
    r = find_location(collection, query)
    try:
        return r['raw']['point']['coordinates']
    except (KeyError, TypeError):
        return None
@lru_cache(maxsize=None)
def get_state(query):
    r = find_location(collection, query)
    try:
        return r['raw']['address']['adminDistrict']
    except (KeyError, TypeError):
        return None
state_l = []
cor_l = []
for loc in tqdm(us_g['Birth Location']):
    cor_l.append(get_long_lat(loc))
    state_l.append(get_state(loc))
us_g['Cooridnates'] = cor_l
us_g['State'] = state_l
us_g = us_g.dropna()
us_g
Birth Location | Year | Count | Cooridnates | State |
---|---|---|---|---|
Smith County, Tennessee, USA ... | 1887 | 1 | [36.250572204589844, -85.95669555664062] ... | TN |
Perry, Ohio, United States ... | 1891 | 1 | [41.771671295166016, -81.14666748046875] ... | OH |
Giles, Tennessee, USA | 1824 | 3 | [35.202117919921875, -87.03473663330078] ... | TN |
Jersey City, Hudson, New Jersey ... | 1918 | 3 | [40.71747970581055, -74.04385375976562] ... | NJ |
Cabell County, West Virginia ... | 1837 | 4 | [38.42015075683594, -82.24178314208984] ... | WV |
Cabarrus County, North Carolina, United States ... | 1873 | 3 | [35.3868408203125, -80.55193328857422] ... | NC |
Harwich, Barnstable, Massachusetts ... | 1716 | 3 | [41.68627166748047, -70.07341766357422] ... | MA |
Cumberland, Pennsylvania | 1712 | 2 | [39.76728057861328, -77.27111053466797] ... | PA |
St. Louis, Missouri | 1844 | 4 | [38.627750396728516, -90.1995620727539] ... | MO |
Germantown, Philadelphia, Pennsylvania ... | 1769 | 2 | [40.029541015625, -75.17510986328125] ... | PA |
df = us_g[us_g['Year'] == 1900].to_dataframe()
px.choropleth(df, locations='State', locationmode='USA-states', color='Count')
us_g = us_g.sort('Year')
df = us_g.to_dataframe()
px.choropleth(df, locations='State', locationmode='USA-states', color='Count', animation_frame='Year')
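One thing to keep in mind: `us_g` has one row per (year, birth location), so a state can appear many times in the same year. Since Plotly does not sum those rows for us, it is safer to total the counts per state and year before plotting. Here is a minimal sketch in plain Python with made-up rows. With the DataFrame above, the equivalent would be `df.groupby(['Year', 'State'], as_index=False)['Count'].sum()`:

```python
from collections import Counter

# Made-up rows in the shape of us_g: (Year, State, Count)
rows = [
    (1900, 'MA', 3),
    (1900, 'MA', 2),
    (1900, 'NY', 4),
    (1901, 'MA', 1),
]

# Sum the counts of all birth locations that fall in the same state and year
totals = Counter()
for year, state, count in rows:
    totals[(year, state)] += count

state_year_counts = [
    {'Year': y, 'State': s, 'Count': c} for (y, s), c in sorted(totals.items())
]
print(state_year_counts)
```

The resulting records have exactly one value per state and year, which is the shape a choropleth with `animation_frame='Year'` expects.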
Let's load the 2016 Parties in New York dataset into an SFrame object:
# -p also creates ./datasets if it does not exist yet
!mkdir -p ./datasets/partynyc
# download the dataset from Kaggle and unzip it
!kaggle datasets download somesnm/partynyc -p ./datasets/partynyc
!unzip ./datasets/partynyc/*.zip -d ./datasets/partynyc/
Downloading partynyc.zip to ./datasets/partynyc
100%|██████████████████████████████████████| 14.4M/14.4M [00:04<00:00, 3.45MB/s]
Archive: ./datasets/partynyc/partynyc.zip
inflating: ./datasets/partynyc/bar_locations.csv
inflating: ./datasets/partynyc/party_in_nyc.csv
inflating: ./datasets/partynyc/test_parties.csv
inflating: ./datasets/partynyc/train_parties.csv
import turicreate as tc
import turicreate.aggregate as agg
sf = tc.SFrame.read_csv("./datasets/partynyc/party_in_nyc.csv")
sf
Finished parsing file /Users/michael/Dropbox (BGU)/massive data mining/ 2020/notebooks/datasets/partynyc/party_in_nyc.csv
Parsing completed. Parsed 100 lines in 0.367555 secs.
------------------------------------------------------ Inferred types from first 100 line(s) of file as column_type_hints=[str,str,str,float,str,str,float,float] If parsing fails due to incorrect types, you can correct the inferred type list above and pass it to read_csv in the column_type_hints argument ------------------------------------------------------
Finished parsing file /Users/michael/Dropbox (BGU)/massive data mining/ 2020/notebooks/datasets/partynyc/party_in_nyc.csv
Parsing completed. Parsed 225414 lines in 0.368045 secs.
Created Date | Closed Date | Location Type | Incident Zip | City | Borough |
---|---|---|---|---|---|
2015-12-31 00:01:15 | 2015-12-31 03:48:04 | Store/Commercial | 10034.0 | NEW YORK | MANHATTAN |
2015-12-31 00:02:48 | 2015-12-31 04:36:13 | Store/Commercial | 10040.0 | NEW YORK | MANHATTAN |
2015-12-31 00:03:25 | 2015-12-31 00:40:15 | Residential Building/House ... | 10026.0 | NEW YORK | MANHATTAN |
2015-12-31 00:03:26 | 2015-12-31 01:53:38 | Residential Building/House ... | 11231.0 | BROOKLYN | BROOKLYN |
2015-12-31 00:05:10 | 2015-12-31 03:49:10 | Residential Building/House ... | 10033.0 | NEW YORK | MANHATTAN |
2015-12-31 00:08:05 | 2015-12-31 01:59:12 | Residential Building/House ... | 10467.0 | BRONX | BRONX |
2015-12-31 00:11:40 | 2015-12-31 06:24:00 | Residential Building/House ... | 11230.0 | BROOKLYN | BROOKLYN |
2015-12-31 00:12:13 | 2015-12-31 00:38:09 | Residential Building/House ... | 11215.0 | BROOKLYN | BROOKLYN |
2015-12-31 00:12:37 | 2015-12-31 05:03:39 | Residential Building/House ... | 10463.0 | BRONX | BRONX |
2015-12-31 00:14:13 | 2015-12-31 06:25:40 | Store/Commercial | 11372.0 | JACKSON HEIGHTS | QUEENS |
Latitude | Longitude |
---|---|
40.86618344001468 | -73.91893042945345 |
40.85932419390543 | -73.93123733660876 |
40.79941540978025 | -73.95337116858667 |
40.6782851094981 | -73.99466779426595 |
40.85030372032608 | -73.93851562699031 |
40.8587476839271 | -73.86562454420242 |
40.61700535900229 | -73.95692046165364 |
40.66505114462701 | -73.98127790267175 |
40.875894942376384 | -73.91247127084895 |
40.75558360239671 | -73.88520104800678 |
sf['Latitude'] = sf['Latitude'].apply(lambda i: round(i,4))
sf['Longitude'] = sf['Longitude'].apply(lambda i: round(i,4))
sf = sf.dropna()
g = sf.groupby(['Latitude','Longitude'], {'Count': agg.COUNT()})
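# Note: sort() returns a new SFrame, so this result is discarded and g stays unsorted;
# use g = g.sort('Count', ascending=False) to keep the ordering
g.sort('Count', ascending=False)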
g.export_csv("./parties_location_count.csv")
g[g['Count'] >= 50].export_csv("./parties_location_count_50.csv")
g
Latitude | Longitude | Count |
---|---|---|
40.6693 | -73.8848 | 1 |
40.8707 | -73.8901 | 6 |
40.8793 | -73.8599 | 28 |
40.6862 | -73.872 | 4 |
40.688 | -73.8317 | 2 |
40.869 | -73.9028 | 16 |
40.6657 | -73.772 | 1 |
40.8558 | -73.9056 | 2 |
40.645 | -73.946 | 2 |
40.8487 | -73.9059 | 19 |
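Rounding the coordinates to four decimal places groups nearby complaints into small grid cells. Here is a rough estimate of the cell size, assuming a spherical Earth with a mean radius of 6,371 km (an approximation we introduce here; `cell_size_m` is a hypothetical helper):

```python
import math

EARTH_RADIUS_M = 6_371_000  # mean Earth radius; spherical approximation


def cell_size_m(lat_deg, decimals=4):
    """Approximate (north-south, east-west) size in meters of a rounding cell."""
    step_rad = math.radians(10 ** -decimals)
    north_south = EARTH_RADIUS_M * step_rad
    # Meridians converge toward the poles, so east-west distance shrinks with cos(latitude)
    east_west = north_south * math.cos(math.radians(lat_deg))
    return north_south, east_west


ns, ew = cell_size_m(40.75)  # roughly the latitude of New York City
print(f"{ns:.1f} m x {ew:.1f} m")
```

At New York's latitude, each cell is on the order of 11 m by 8 m, so a count in the table above is close to a per-building count.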
Let's use Kepler.gl to visualize the data:
from IPython.display import Image
Image("./kepler.gl_party.png")
In this section, we are going to use Basemap. Let's first install Basemap using Conda:
!conda install basemap
# If you run into a problem with PROJ_LIB, install proj4 (conda install proj4)
# and set PROJ_LIB to its share/proj directory:
# import os
# os.environ["PROJ_LIB"] = "/anaconda3/pkgs/proj4-<complete path>/share/proj"
import matplotlib.pyplot as plt
from mpl_toolkits.basemap import Basemap
%matplotlib inline
plt.figure(figsize=(8, 8))
m = Basemap(projection='ortho', resolution=None, lat_0=0, lon_0=0)
m.bluemarble(scale=0.5);
Clipping input data to the valid range for imshow with RGB data ([0..1] for floats or [0..255] for integers).
plt.figure(figsize=(8, 8))
m = Basemap(projection='cea', resolution=None)
m.bluemarble(scale=1);
# Inspired by https://jakevdp.github.io/PythonDataScienceHandbook/04.13-geographic-data-with-basemap.html
fig = plt.figure(figsize=(8, 8))
m = Basemap(projection='lcc', resolution='h',
            width=2E6, height=2E6,
            lat_0=51, lon_0=0.12,)
m.etopo(scale=0.51, alpha=0.5)
# Map (long, lat) to (x, y) for plotting
x, y = m(0.1278, 51.5074)
plt.plot(x, y, 'ok', markersize=6)
plt.text(x, y, 'London', fontsize=16);
Using Basemap, we can plot maps with different resolutions and colors:
fig = plt.figure(figsize=(8, 8))
m = Basemap(projection='gall', resolution='c')
m.etopo(scale=0.51, alpha=0.5)
m.fillcontinents(color='white',lake_color='aqua')
# Map (long, lat) to (x, y) for plotting
x, y = m(0.1278, 51.5074)
plt.plot(x, y, 'ok', markersize=2)
plt.text(x, y, 'London', fontsize=6);
Each map projection has its own advantages and disadvantages. In Basemap there are 24 different map projections.
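One way to see the trade-off is the Mercator projection (`projection='merc'` in Basemap): it preserves angles, but its scale grows as 1/cos(latitude), so areas far from the equator are inflated. Here is a small sketch of that standard formula on a unit sphere. It does not use Basemap, and the function names are our own:

```python
import math


def mercator_y(lat_deg):
    """Mercator northing on a unit sphere."""
    phi = math.radians(lat_deg)
    return math.log(math.tan(math.pi / 4 + phi / 2))


def mercator_scale(lat_deg):
    """Local linear scale factor of the Mercator projection."""
    return 1 / math.cos(math.radians(lat_deg))


for lat in (0, 30, 60):
    # Area inflation is the square of the linear scale factor
    print(lat, round(mercator_y(lat), 3), round(mercator_scale(lat) ** 2, 2))
```

At 60 degrees latitude the area inflation factor is about 4, which is why high-latitude regions look much larger on a Mercator map than on an equal-area projection such as `'cea'`, used earlier in this section.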
fig = plt.figure(figsize=(8, 8))
m = Basemap(projection='gnom', resolution='c',
            width=3E6, height=3E6,
            lat_0=47.608013, lon_0=-122.335167)
m.etopo(scale=0.51, alpha=0.5)
Clipping input data to the valid range for imshow with RGB data ([0..1] for floats or [0..255] for integers).
<matplotlib.image.AxesImage at 0xb27391eb8>