Assignment 4¶

In this assignment, you'll combine the assignment 3 data set with nutrition data from the USDA Food Composition Databases. The CSV file fresh.csv contains the fresh fruits and vegetables data you extracted in assignment 3.

The USDA Food Composition Databases have a documented web API that returns data in JSON format . You need a key in order to use the API. Only 1000 requests are allowed per hour, so it would be a good idea to use caching.

Sign up for an API key here. The key will work with any Data.gov API. You may need the key again later in the quarter, so make sure you save it.

These modules may be useful:

Exercise 1.1. Read the search request documentation, then write a function called ndb_search() that makes a search request. The function should accept the search term as an argument. The function should return the search result items as a list (for 0 items, return an empty list).

Note that the search url is: https://api.nal.usda.gov/ndb/search

As an example, a search for "quail eggs" should return this list:

[{u'ds': u'BL',
  u'group': u'Branded Food Products Database',
  u'name': u'CHAOKOH, QUAIL EGG IN BRINE, UPC: 044738074186',
  u'ndbno': u'45094707',
  u'offset': 0},
 {u'ds': u'BL',
  u'group': u'Branded Food Products Database',
  u'name': u'L&W, QUAIL EGGS, UPC: 024072000256',
  u'ndbno': u'45094890',
  u'offset': 1},
 {u'ds': u'BL',
  u'group': u'Branded Food Products Database',
  u'name': u'BUDDHA, QUAIL EGGS IN BRINE, UPC: 761934535098',
  u'ndbno': u'45099560',
  u'offset': 2},
 {u'ds': u'BL',
  u'group': u'Branded Food Products Database',
  u'name': u'GRAN SABANA, QUAIL EGGS, UPC: 819140010103',
  u'ndbno': u'45169279',
  u'offset': 3},
 {u'ds': u'BL',
  u'group': u'Branded Food Products Database',
  u'name': u"D'ARTAGNAN, QUAIL EGGS, UPC: 736622102630",
  u'ndbno': u'45178254',
  u'offset': 4},
 {u'ds': u'SR',
  u'group': u'Dairy and Egg Products',
  u'name': u'Egg, quail, whole, fresh, raw',
  u'ndbno': u'01140',
  u'offset': 5}]

As usual, make sure you document and test your function.

from urllib2 import Request, urlopen
from urlparse import urlparse, urlunparse
import requests
import requests_cache
import pandas as pd
import json
import os
from matplotlib import pyplot as plt
plt.style.use('ggplot')
%matplotlib inline
requests_cache.install_cache('usda_cache')

key = 'O6dynzQFxqlFUFzc9XoqGxr0froELHWUtxhW3aVR'

def ndb_search(search_term):
    """
    This queries the USDA Food composite database to make a search request
    Input: search_term (string of terms to search for)
    Output: json object (of search results)
    """
    urlbase = "https://api.nal.usda.gov/ndb/search"
    try: 
        usdareq = requests.get(urlbase + "?format=json" + "&q={}".format(search_term) + "&api_key={}".format(key))
        usda_json = usdareq.json()
        usda_json = usda_json['list']['item']
        return usda_json
    except (KeyError): 
        pass

ndb_search('potatoes raw')

[{u'ds': u'SR',
  u'group': u'Vegetables and Vegetable Products',
  u'name': u'Potatoes, raw, skin',
  u'ndbno': u'11362',
  u'offset': 0},
 {u'ds': u'SR',
  u'group': u'Vegetables and Vegetable Products',
  u'name': u'Sweet potato, raw, unprepared',
  u'ndbno': u'11507',
  u'offset': 1},
 {u'ds': u'SR',
  u'group': u'Vegetables and Vegetable Products',
  u'name': u'Sweet potato leaves, raw',
  u'ndbno': u'11505',
  u'offset': 2},
 {u'ds': u'BL',
  u'group': u'Branded Food Products Database',
  u'name': u"BRAD'S RAW CHIPS, RAW CHIPS, SWEET POTATO, UPC: 854615002023",
  u'ndbno': u'45017998',
  u'offset': 3},
 {u'ds': u'SR',
  u'group': u'Vegetables and Vegetable Products',
  u'name': u'Potatoes, flesh and skin, raw',
  u'ndbno': u'11352',
  u'offset': 4},
 {u'ds': u'BL',
  u'group': u'Branded Food Products Database',
  u'name': u'RAW SWEET POTATO FRIES, UPC: 087738172100',
  u'ndbno': u'45184166',
  u'offset': 5},
 {u'ds': u'SR',
  u'group': u'Vegetables and Vegetable Products',
  u'name': u'Potatoes, russet, flesh and skin, raw',
  u'ndbno': u'11353',
  u'offset': 6},
 {u'ds': u'SR',
  u'group': u'Vegetables and Vegetable Products',
  u'name': u'Potatoes, white, flesh and skin, raw',
  u'ndbno': u'11354',
  u'offset': 7},
 {u'ds': u'SR',
  u'group': u'Vegetables and Vegetable Products',
  u'name': u'Potatoes, red, flesh and skin, raw',
  u'ndbno': u'11355',
  u'offset': 8},
 {u'ds': u'BL',
  u'group': u'Branded Food Products Database',
  u'name': u'RAW STUFFED CHICKEN BREASTS LOADED BAKED POTATO, UPC: 073461000056',
  u'ndbno': u'45218813',
  u'offset': 9}]

Exercise 1.2. Use your search function to get NDB numbers for the foods in the fresh.csv file. It's okay if you don't get an NDB number for every food, but try to come up with a strategy that gets most of them. Discuss your strategy in a short paragraph.

Hints:

The foods are all raw and unbranded.
You can test search terms with the online search page.
You can convert the output of ndb_search() to a data frame with pd.DataFrame().
The string methods for Python and Pandas are useful here. It's okay if you use simple regular expressions in the Pandas methods, although this exercise can be solved without them.
You can merge data frames that have a column in common with pd.merge().

path = os.path.join(os.getcwd(), 'fresh.csv')
fresh = pd.read_csv(path)
fresh['food'] = fresh['food'].str.replace("_", " ")
fresh_df = pd.DataFrame()
for i in fresh['food']:
    search = i + ' raw'
    j = i.replace(" ", "|")
    try: 
        results = pd.DataFrame(ndb_search(search))
        results = results.loc[results.ds.str.startswith("SR"), :]
        results = results.loc[results.group.str.startswith(("Fruits", "Vegetables")), :]
        if results.shape == (1,5):
            results["offset"] = i
            fresh_df = fresh_df.append(results)
        else:
            results = results[results['name'].str.contains(j, case = False)]
            try: 
                results["offset"] = i
                fresh_df = fresh_df.append(results.iloc[0])
            except (IndexError):
                pass

    except (AttributeError):
        pass

fresh_df = fresh_df.drop_duplicates()
fresh_df = fresh_df.rename(columns = {'offset':'food'})
fresh_df = fresh_df.merge(fresh)
fresh_df

ndbno = [i for i in fresh_df['ndbno']]
nd = ['09279']

My strategy for tackling this problem started with replacing the '_' in all food names with spaces then adding ' raw' in order to use the ndb_search function to search for that term. Then I further narrowed the results DataFrame down by eliminating all rows without 'SR', which meant that no branded foods would be included. Then, I narrowed the results to those that were in the Fruits or Vegetables categories, since I was getting some results in other categories. Finally, I further narrowed the results to keep all results that contained the key words of the food I was searching for. In order to do this well, I replaced all spaces in the food types with the '|' (or) symbol. Because, I still had 300+ results, I simply just took the first entry of each search and appended that into the fresh_df DataFrame. In the end, I ended up with most of the correct rows that I wanted. The cucumber rows were a sad consequence, since the keyword given here was "cucumbers" and the keyword in the results DataFrame was "cucumber". Besides that, the search for "kiwi raw" did not give any results, since in the USDA database it is under "Kiwifruit, raw". Finally, I ended up with the wrong cabbage, artichoke, and potato types. Besides those 5 mistakes, I ended up with the correct fruits and vegetables for each food type.

Exercise 1.3. Read the food reports V2 documentation, then write a function called ndb_report() that requests a basic food report. The function should accept the NDB number as an argument and return the list of nutrients for the food.

Note that the report url is: https://api.nal.usda.gov/ndb/V2/reports

For example, for "09279" (raw plums) the first element of the returned list should be:

{u'group': u'Proximates',
 u'measures': [{u'eqv': 165.0,
   u'eunit': u'g',
   u'label': u'cup, sliced',
   u'qty': 1.0,
   u'value': u'143.93'},
  {u'eqv': 66.0,
   u'eunit': u'g',
   u'label': u'fruit (2-1/8" dia)',
   u'qty': 1.0,
   u'value': u'57.57'},
  {u'eqv': 151.0,
   u'eunit': u'g',
   u'label': u'NLEA serving',
   u'qty': 1.0,
   u'value': u'131.72'}],
 u'name': u'Water',
 u'nutrient_id': u'255',
 u'unit': u'g',
 u'value': u'87.23'}

Be sure to document and test your function.

def ndb_report(ndbno):
    """
    This queries the USDA Food composite database to make a search request for reports based on an NDB number
    Input: ndbno(takes in one number)
    Output: json object(of nutrient values of the NDB numbers provided)
    """
    urlbase = "https://api.nal.usda.gov/ndb/V2/reports?"
    try: 
        usdareq = requests.get(urlbase + 'ndbno={}&'.format("".join(ndbno)) + "type=b&"+ "format=json&" + "api_key={}".format(key))
        usda_json = usdareq.json()
        usda_json = usda_json['foods'][0]['food']['nutrients']
        return usda_json
    except (KeyError): 
        pass

ndb_report(nd)

[{u'group': u'Proximates',
  u'measures': [{u'eqv': 165.0,
    u'eunit': u'g',
    u'label': u'cup, sliced',
    u'qty': 1.0,
    u'value': u'143.93'},
   {u'eqv': 66.0,
    u'eunit': u'g',
    u'label': u'fruit (2-1/8" dia)',
    u'qty': 1.0,
    u'value': u'57.57'},
   {u'eqv': 151.0,
    u'eunit': u'g',
    u'label': u'NLEA serving',
    u'qty': 1.0,
    u'value': u'131.72'}],
  u'name': u'Water',
  u'nutrient_id': u'255',
  u'unit': u'g',
  u'value': u'87.23'},
 {u'group': u'Proximates',
  u'measures': [{u'eqv': 165.0,
    u'eunit': u'g',
    u'label': u'cup, sliced',
    u'qty': 1.0,
    u'value': u'76'},
   {u'eqv': 66.0,
    u'eunit': u'g',
    u'label': u'fruit (2-1/8" dia)',
    u'qty': 1.0,
    u'value': u'30'},
   {u'eqv': 151.0,
    u'eunit': u'g',
    u'label': u'NLEA serving',
    u'qty': 1.0,
    u'value': u'69'}],
  u'name': u'Energy',
  u'nutrient_id': u'208',
  u'unit': u'kcal',
  u'value': u'46'},
 {u'group': u'Proximates',
  u'measures': [{u'eqv': 165.0,
    u'eunit': u'g',
    u'label': u'cup, sliced',
    u'qty': 1.0,
    u'value': u'1.15'},
   {u'eqv': 66.0,
    u'eunit': u'g',
    u'label': u'fruit (2-1/8" dia)',
    u'qty': 1.0,
    u'value': u'0.46'},
   {u'eqv': 151.0,
    u'eunit': u'g',
    u'label': u'NLEA serving',
    u'qty': 1.0,
    u'value': u'1.06'}],
  u'name': u'Protein',
  u'nutrient_id': u'203',
  u'unit': u'g',
  u'value': u'0.70'},
 {u'group': u'Proximates',
  u'measures': [{u'eqv': 165.0,
    u'eunit': u'g',
    u'label': u'cup, sliced',
    u'qty': 1.0,
    u'value': u'0.46'},
   {u'eqv': 66.0,
    u'eunit': u'g',
    u'label': u'fruit (2-1/8" dia)',
    u'qty': 1.0,
    u'value': u'0.18'},
   {u'eqv': 151.0,
    u'eunit': u'g',
    u'label': u'NLEA serving',
    u'qty': 1.0,
    u'value': u'0.42'}],
  u'name': u'Total lipid (fat)',
  u'nutrient_id': u'204',
  u'unit': u'g',
  u'value': u'0.28'},
 {u'group': u'Proximates',
  u'measures': [{u'eqv': 165.0,
    u'eunit': u'g',
    u'label': u'cup, sliced',
    u'qty': 1.0,
    u'value': u'18.84'},
   {u'eqv': 66.0,
    u'eunit': u'g',
    u'label': u'fruit (2-1/8" dia)',
    u'qty': 1.0,
    u'value': u'7.54'},
   {u'eqv': 151.0,
    u'eunit': u'g',
    u'label': u'NLEA serving',
    u'qty': 1.0,
    u'value': u'17.24'}],
  u'name': u'Carbohydrate, by difference',
  u'nutrient_id': u'205',
  u'unit': u'g',
  u'value': u'11.42'},
 {u'group': u'Proximates',
  u'measures': [{u'eqv': 165.0,
    u'eunit': u'g',
    u'label': u'cup, sliced',
    u'qty': 1.0,
    u'value': u'2.3'},
   {u'eqv': 66.0,
    u'eunit': u'g',
    u'label': u'fruit (2-1/8" dia)',
    u'qty': 1.0,
    u'value': u'0.9'},
   {u'eqv': 151.0,
    u'eunit': u'g',
    u'label': u'NLEA serving',
    u'qty': 1.0,
    u'value': u'2.1'}],
  u'name': u'Fiber, total dietary',
  u'nutrient_id': u'291',
  u'unit': u'g',
  u'value': u'1.4'},
 {u'group': u'Proximates',
  u'measures': [{u'eqv': 165.0,
    u'eunit': u'g',
    u'label': u'cup, sliced',
    u'qty': 1.0,
    u'value': u'16.37'},
   {u'eqv': 66.0,
    u'eunit': u'g',
    u'label': u'fruit (2-1/8" dia)',
    u'qty': 1.0,
    u'value': u'6.55'},
   {u'eqv': 151.0,
    u'eunit': u'g',
    u'label': u'NLEA serving',
    u'qty': 1.0,
    u'value': u'14.98'}],
  u'name': u'Sugars, total',
  u'nutrient_id': u'269',
  u'unit': u'g',
  u'value': u'9.92'},
 {u'group': u'Minerals',
  u'measures': [{u'eqv': 165.0,
    u'eunit': u'g',
    u'label': u'cup, sliced',
    u'qty': 1.0,
    u'value': u'10'},
   {u'eqv': 66.0,
    u'eunit': u'g',
    u'label': u'fruit (2-1/8" dia)',
    u'qty': 1.0,
    u'value': u'4'},
   {u'eqv': 151.0,
    u'eunit': u'g',
    u'label': u'NLEA serving',
    u'qty': 1.0,
    u'value': u'9'}],
  u'name': u'Calcium, Ca',
  u'nutrient_id': u'301',
  u'unit': u'mg',
  u'value': u'6'},
 {u'group': u'Minerals',
  u'measures': [{u'eqv': 165.0,
    u'eunit': u'g',
    u'label': u'cup, sliced',
    u'qty': 1.0,
    u'value': u'0.28'},
   {u'eqv': 66.0,
    u'eunit': u'g',
    u'label': u'fruit (2-1/8" dia)',
    u'qty': 1.0,
    u'value': u'0.11'},
   {u'eqv': 151.0,
    u'eunit': u'g',
    u'label': u'NLEA serving',
    u'qty': 1.0,
    u'value': u'0.26'}],
  u'name': u'Iron, Fe',
  u'nutrient_id': u'303',
  u'unit': u'mg',
  u'value': u'0.17'},
 {u'group': u'Minerals',
  u'measures': [{u'eqv': 165.0,
    u'eunit': u'g',
    u'label': u'cup, sliced',
    u'qty': 1.0,
    u'value': u'12'},
   {u'eqv': 66.0,
    u'eunit': u'g',
    u'label': u'fruit (2-1/8" dia)',
    u'qty': 1.0,
    u'value': u'5'},
   {u'eqv': 151.0,
    u'eunit': u'g',
    u'label': u'NLEA serving',
    u'qty': 1.0,
    u'value': u'11'}],
  u'name': u'Magnesium, Mg',
  u'nutrient_id': u'304',
  u'unit': u'mg',
  u'value': u'7'},
 {u'group': u'Minerals',
  u'measures': [{u'eqv': 165.0,
    u'eunit': u'g',
    u'label': u'cup, sliced',
    u'qty': 1.0,
    u'value': u'26'},
   {u'eqv': 66.0,
    u'eunit': u'g',
    u'label': u'fruit (2-1/8" dia)',
    u'qty': 1.0,
    u'value': u'11'},
   {u'eqv': 151.0,
    u'eunit': u'g',
    u'label': u'NLEA serving',
    u'qty': 1.0,
    u'value': u'24'}],
  u'name': u'Phosphorus, P',
  u'nutrient_id': u'305',
  u'unit': u'mg',
  u'value': u'16'},
 {u'group': u'Minerals',
  u'measures': [{u'eqv': 165.0,
    u'eunit': u'g',
    u'label': u'cup, sliced',
    u'qty': 1.0,
    u'value': u'259'},
   {u'eqv': 66.0,
    u'eunit': u'g',
    u'label': u'fruit (2-1/8" dia)',
    u'qty': 1.0,
    u'value': u'104'},
   {u'eqv': 151.0,
    u'eunit': u'g',
    u'label': u'NLEA serving',
    u'qty': 1.0,
    u'value': u'237'}],
  u'name': u'Potassium, K',
  u'nutrient_id': u'306',
  u'unit': u'mg',
  u'value': u'157'},
 {u'group': u'Minerals',
  u'measures': [{u'eqv': 165.0,
    u'eunit': u'g',
    u'label': u'cup, sliced',
    u'qty': 1.0,
    u'value': u'0'},
   {u'eqv': 66.0,
    u'eunit': u'g',
    u'label': u'fruit (2-1/8" dia)',
    u'qty': 1.0,
    u'value': u'0'},
   {u'eqv': 151.0,
    u'eunit': u'g',
    u'label': u'NLEA serving',
    u'qty': 1.0,
    u'value': u'0'}],
  u'name': u'Sodium, Na',
  u'nutrient_id': u'307',
  u'unit': u'mg',
  u'value': u'0'},
 {u'group': u'Minerals',
  u'measures': [{u'eqv': 165.0,
    u'eunit': u'g',
    u'label': u'cup, sliced',
    u'qty': 1.0,
    u'value': u'0.17'},
   {u'eqv': 66.0,
    u'eunit': u'g',
    u'label': u'fruit (2-1/8" dia)',
    u'qty': 1.0,
    u'value': u'0.07'},
   {u'eqv': 151.0,
    u'eunit': u'g',
    u'label': u'NLEA serving',
    u'qty': 1.0,
    u'value': u'0.15'}],
  u'name': u'Zinc, Zn',
  u'nutrient_id': u'309',
  u'unit': u'mg',
  u'value': u'0.10'},
 {u'group': u'Vitamins',
  u'measures': [{u'eqv': 165.0,
    u'eunit': u'g',
    u'label': u'cup, sliced',
    u'qty': 1.0,
    u'value': u'15.7'},
   {u'eqv': 66.0,
    u'eunit': u'g',
    u'label': u'fruit (2-1/8" dia)',
    u'qty': 1.0,
    u'value': u'6.3'},
   {u'eqv': 151.0,
    u'eunit': u'g',
    u'label': u'NLEA serving',
    u'qty': 1.0,
    u'value': u'14.3'}],
  u'name': u'Vitamin C, total ascorbic acid',
  u'nutrient_id': u'401',
  u'unit': u'mg',
  u'value': u'9.5'},
 {u'group': u'Vitamins',
  u'measures': [{u'eqv': 165.0,
    u'eunit': u'g',
    u'label': u'cup, sliced',
    u'qty': 1.0,
    u'value': u'0.046'},
   {u'eqv': 66.0,
    u'eunit': u'g',
    u'label': u'fruit (2-1/8" dia)',
    u'qty': 1.0,
    u'value': u'0.018'},
   {u'eqv': 151.0,
    u'eunit': u'g',
    u'label': u'NLEA serving',
    u'qty': 1.0,
    u'value': u'0.042'}],
  u'name': u'Thiamin',
  u'nutrient_id': u'404',
  u'unit': u'mg',
  u'value': u'0.028'},
 {u'group': u'Vitamins',
  u'measures': [{u'eqv': 165.0,
    u'eunit': u'g',
    u'label': u'cup, sliced',
    u'qty': 1.0,
    u'value': u'0.043'},
   {u'eqv': 66.0,
    u'eunit': u'g',
    u'label': u'fruit (2-1/8" dia)',
    u'qty': 1.0,
    u'value': u'0.017'},
   {u'eqv': 151.0,
    u'eunit': u'g',
    u'label': u'NLEA serving',
    u'qty': 1.0,
    u'value': u'0.039'}],
  u'name': u'Riboflavin',
  u'nutrient_id': u'405',
  u'unit': u'mg',
  u'value': u'0.026'},
 {u'group': u'Vitamins',
  u'measures': [{u'eqv': 165.0,
    u'eunit': u'g',
    u'label': u'cup, sliced',
    u'qty': 1.0,
    u'value': u'0.688'},
   {u'eqv': 66.0,
    u'eunit': u'g',
    u'label': u'fruit (2-1/8" dia)',
    u'qty': 1.0,
    u'value': u'0.275'},
   {u'eqv': 151.0,
    u'eunit': u'g',
    u'label': u'NLEA serving',
    u'qty': 1.0,
    u'value': u'0.630'}],
  u'name': u'Niacin',
  u'nutrient_id': u'406',
  u'unit': u'mg',
  u'value': u'0.417'},
 {u'group': u'Vitamins',
  u'measures': [{u'eqv': 165.0,
    u'eunit': u'g',
    u'label': u'cup, sliced',
    u'qty': 1.0,
    u'value': u'0.048'},
   {u'eqv': 66.0,
    u'eunit': u'g',
    u'label': u'fruit (2-1/8" dia)',
    u'qty': 1.0,
    u'value': u'0.019'},
   {u'eqv': 151.0,
    u'eunit': u'g',
    u'label': u'NLEA serving',
    u'qty': 1.0,
    u'value': u'0.044'}],
  u'name': u'Vitamin B-6',
  u'nutrient_id': u'415',
  u'unit': u'mg',
  u'value': u'0.029'},
 {u'group': u'Vitamins',
  u'measures': [{u'eqv': 165.0,
    u'eunit': u'g',
    u'label': u'cup, sliced',
    u'qty': 1.0,
    u'value': u'8'},
   {u'eqv': 66.0,
    u'eunit': u'g',
    u'label': u'fruit (2-1/8" dia)',
    u'qty': 1.0,
    u'value': u'3'},
   {u'eqv': 151.0,
    u'eunit': u'g',
    u'label': u'NLEA serving',
    u'qty': 1.0,
    u'value': u'8'}],
  u'name': u'Folate, DFE',
  u'nutrient_id': u'435',
  u'unit': u'\xb5g',
  u'value': u'5'},
 {u'group': u'Vitamins',
  u'measures': [{u'eqv': 165.0,
    u'eunit': u'g',
    u'label': u'cup, sliced',
    u'qty': 1.0,
    u'value': u'0.00'},
   {u'eqv': 66.0,
    u'eunit': u'g',
    u'label': u'fruit (2-1/8" dia)',
    u'qty': 1.0,
    u'value': u'0.00'},
   {u'eqv': 151.0,
    u'eunit': u'g',
    u'label': u'NLEA serving',
    u'qty': 1.0,
    u'value': u'0.00'}],
  u'name': u'Vitamin B-12',
  u'nutrient_id': u'418',
  u'unit': u'\xb5g',
  u'value': u'0.00'},
 {u'group': u'Vitamins',
  u'measures': [{u'eqv': 165.0,
    u'eunit': u'g',
    u'label': u'cup, sliced',
    u'qty': 1.0,
    u'value': u'28'},
   {u'eqv': 66.0,
    u'eunit': u'g',
    u'label': u'fruit (2-1/8" dia)',
    u'qty': 1.0,
    u'value': u'11'},
   {u'eqv': 151.0,
    u'eunit': u'g',
    u'label': u'NLEA serving',
    u'qty': 1.0,
    u'value': u'26'}],
  u'name': u'Vitamin A, RAE',
  u'nutrient_id': u'320',
  u'unit': u'\xb5g',
  u'value': u'17'},
 {u'group': u'Vitamins',
  u'measures': [{u'eqv': 165.0,
    u'eunit': u'g',
    u'label': u'cup, sliced',
    u'qty': 1.0,
    u'value': u'569'},
   {u'eqv': 66.0,
    u'eunit': u'g',
    u'label': u'fruit (2-1/8" dia)',
    u'qty': 1.0,
    u'value': u'228'},
   {u'eqv': 151.0,
    u'eunit': u'g',
    u'label': u'NLEA serving',
    u'qty': 1.0,
    u'value': u'521'}],
  u'name': u'Vitamin A, IU',
  u'nutrient_id': u'318',
  u'unit': u'IU',
  u'value': u'345'},
 {u'group': u'Vitamins',
  u'measures': [{u'eqv': 165.0,
    u'eunit': u'g',
    u'label': u'cup, sliced',
    u'qty': 1.0,
    u'value': u'0.43'},
   {u'eqv': 66.0,
    u'eunit': u'g',
    u'label': u'fruit (2-1/8" dia)',
    u'qty': 1.0,
    u'value': u'0.17'},
   {u'eqv': 151.0,
    u'eunit': u'g',
    u'label': u'NLEA serving',
    u'qty': 1.0,
    u'value': u'0.39'}],
  u'name': u'Vitamin E (alpha-tocopherol)',
  u'nutrient_id': u'323',
  u'unit': u'mg',
  u'value': u'0.26'},
 {u'group': u'Vitamins',
  u'measures': [{u'eqv': 165.0,
    u'eunit': u'g',
    u'label': u'cup, sliced',
    u'qty': 1.0,
    u'value': u'0.0'},
   {u'eqv': 66.0,
    u'eunit': u'g',
    u'label': u'fruit (2-1/8" dia)',
    u'qty': 1.0,
    u'value': u'0.0'},
   {u'eqv': 151.0,
    u'eunit': u'g',
    u'label': u'NLEA serving',
    u'qty': 1.0,
    u'value': u'0.0'}],
  u'name': u'Vitamin D (D2 + D3)',
  u'nutrient_id': u'328',
  u'unit': u'\xb5g',
  u'value': u'0.0'},
 {u'group': u'Vitamins',
  u'measures': [{u'eqv': 165.0,
    u'eunit': u'g',
    u'label': u'cup, sliced',
    u'qty': 1.0,
    u'value': u'0'},
   {u'eqv': 66.0,
    u'eunit': u'g',
    u'label': u'fruit (2-1/8" dia)',
    u'qty': 1.0,
    u'value': u'0'},
   {u'eqv': 151.0,
    u'eunit': u'g',
    u'label': u'NLEA serving',
    u'qty': 1.0,
    u'value': u'0'}],
  u'name': u'Vitamin D',
  u'nutrient_id': u'324',
  u'unit': u'IU',
  u'value': u'0'},
 {u'group': u'Vitamins',
  u'measures': [{u'eqv': 165.0,
    u'eunit': u'g',
    u'label': u'cup, sliced',
    u'qty': 1.0,
    u'value': u'10.6'},
   {u'eqv': 66.0,
    u'eunit': u'g',
    u'label': u'fruit (2-1/8" dia)',
    u'qty': 1.0,
    u'value': u'4.2'},
   {u'eqv': 151.0,
    u'eunit': u'g',
    u'label': u'NLEA serving',
    u'qty': 1.0,
    u'value': u'9.7'}],
  u'name': u'Vitamin K (phylloquinone)',
  u'nutrient_id': u'430',
  u'unit': u'\xb5g',
  u'value': u'6.4'},
 {u'group': u'Lipids',
  u'measures': [{u'eqv': 165.0,
    u'eunit': u'g',
    u'label': u'cup, sliced',
    u'qty': 1.0,
    u'value': u'0.028'},
   {u'eqv': 66.0,
    u'eunit': u'g',
    u'label': u'fruit (2-1/8" dia)',
    u'qty': 1.0,
    u'value': u'0.011'},
   {u'eqv': 151.0,
    u'eunit': u'g',
    u'label': u'NLEA serving',
    u'qty': 1.0,
    u'value': u'0.026'}],
  u'name': u'Fatty acids, total saturated',
  u'nutrient_id': u'606',
  u'unit': u'g',
  u'value': u'0.017'},
 {u'group': u'Lipids',
  u'measures': [{u'eqv': 165.0,
    u'eunit': u'g',
    u'label': u'cup, sliced',
    u'qty': 1.0,
    u'value': u'0.221'},
   {u'eqv': 66.0,
    u'eunit': u'g',
    u'label': u'fruit (2-1/8" dia)',
    u'qty': 1.0,
    u'value': u'0.088'},
   {u'eqv': 151.0,
    u'eunit': u'g',
    u'label': u'NLEA serving',
    u'qty': 1.0,
    u'value': u'0.202'}],
  u'name': u'Fatty acids, total monounsaturated',
  u'nutrient_id': u'645',
  u'unit': u'g',
  u'value': u'0.134'},
 {u'group': u'Lipids',
  u'measures': [{u'eqv': 165.0,
    u'eunit': u'g',
    u'label': u'cup, sliced',
    u'qty': 1.0,
    u'value': u'0.073'},
   {u'eqv': 66.0,
    u'eunit': u'g',
    u'label': u'fruit (2-1/8" dia)',
    u'qty': 1.0,
    u'value': u'0.029'},
   {u'eqv': 151.0,
    u'eunit': u'g',
    u'label': u'NLEA serving',
    u'qty': 1.0,
    u'value': u'0.066'}],
  u'name': u'Fatty acids, total polyunsaturated',
  u'nutrient_id': u'646',
  u'unit': u'g',
  u'value': u'0.044'},
 {u'group': u'Lipids',
  u'measures': [{u'eqv': 165.0,
    u'eunit': u'g',
    u'label': u'cup, sliced',
    u'qty': 1.0,
    u'value': u'0.000'},
   {u'eqv': 66.0,
    u'eunit': u'g',
    u'label': u'fruit (2-1/8" dia)',
    u'qty': 1.0,
    u'value': u'0.000'},
   {u'eqv': 151.0,
    u'eunit': u'g',
    u'label': u'NLEA serving',
    u'qty': 1.0,
    u'value': u'0.000'}],
  u'name': u'Fatty acids, total trans',
  u'nutrient_id': u'605',
  u'unit': u'g',
  u'value': u'0.000'},
 {u'group': u'Lipids',
  u'measures': [{u'eqv': 165.0,
    u'eunit': u'g',
    u'label': u'cup, sliced',
    u'qty': 1.0,
    u'value': u'0'},
   {u'eqv': 66.0,
    u'eunit': u'g',
    u'label': u'fruit (2-1/8" dia)',
    u'qty': 1.0,
    u'value': u'0'},
   {u'eqv': 151.0,
    u'eunit': u'g',
    u'label': u'NLEA serving',
    u'qty': 1.0,
    u'value': u'0'}],
  u'name': u'Cholesterol',
  u'nutrient_id': u'601',
  u'unit': u'mg',
  u'value': u'0'},
 {u'group': u'Other',
  u'measures': [{u'eqv': 165.0,
    u'eunit': u'g',
    u'label': u'cup, sliced',
    u'qty': 1.0,
    u'value': u'0'},
   {u'eqv': 66.0,
    u'eunit': u'g',
    u'label': u'fruit (2-1/8" dia)',
    u'qty': 1.0,
    u'value': u'0'},
   {u'eqv': 151.0,
    u'eunit': u'g',
    u'label': u'NLEA serving',
    u'qty': 1.0,
    u'value': u'0'}],
  u'name': u'Caffeine',
  u'nutrient_id': u'262',
  u'unit': u'mg',
  u'value': u'0'}]

#list of all foods
usda_list = [ndb_report(i) for i in ndbno]

kcal_list = [i[1]['value'] for i in usda_list]
kcal_list = pd.Series(map(int, kcal_list))
fresh_df['nutrition'] = kcal_list.values
usda_list[37][1]

{u'group': u'Proximates',
 u'measures': [{u'eqv': 116.0,
   u'eunit': u'g',
   u'label': u'cup slices',
   u'qty': 1.0,
   u'value': u'19'},
  {u'eqv': 9.0,
   u'eunit': u'g',
   u'label': u'large (1" to 1-1/4" dia)',
   u'qty': 1.0,
   u'value': u'1'},
  {u'eqv': 4.5,
   u'eunit': u'g',
   u'label': u'medium (3/4" to 1" dia)',
   u'qty': 1.0,
   u'value': u'1'},
  {u'eqv': 1.0,
   u'eunit': u'g',
   u'label': u'slice',
   u'qty': 1.0,
   u'value': u'0'},
  {u'eqv': 2.0,
   u'eunit': u'g',
   u'label': u'small',
   u'qty': 1.0,
   u'value': u'0'},
  {u'eqv': 58.0,
   u'eunit': u'g',
   u'label': u'cup slices',
   u'qty': 0.5,
   u'value': u'9'}],
 u'name': u'Energy',
 u'nutrient_id': u'208',
 u'unit': u'kcal',
 u'value': u'16'}

Exercise 1.4. Which foods provide the best combination of price, yield, and nutrition? You can use kilocalories as a measure of "nutrition" here, but more a detailed analysis is better. Use plots to support your analysis.

import seaborn as sns

sns.regplot('yield', 'price_per_lb', data = fresh_df, fit_reg = False)
plt.title('Yield vs. Price per Pound')
plt.xlabel('Yield')
plt.ylabel('Price per Pound')

<matplotlib.text.Text at 0x120d85b0>

sns.regplot('nutrition', 'price_per_lb', data = fresh_df, fit_reg = False)
plt.title('Nutrition vs. Price per Pound')
plt.xlabel('Nutrition in KCal')
plt.ylabel('Price per Pound')

<matplotlib.text.Text at 0x11ec9470>

It seems as though, the best combination of Nutrition, Price per Pound, and Yield can be calculated by this equation: (Nutrition * Yield)/(Price per Lb)

fresh_df['score'] = fresh_df['nutrition']*fresh_df['yield']/fresh_df['price_per_lb']
best = fresh_df.ix[fresh_df['score'] > 50]
best

For this analysis I used the amount of KCals in 100 grams of each food (nutrition), the Price per Pound, and the yield percentage. By multiplying the yield and nutrition, we get the amount of KCals that survived the growing and picking process of the fruit or vegetables per 100 grams. This eliminates many low yield and low energy foods. Next by dividing by the price per pound, instead of price per cup, we make sure that we still using weight instead of volume as our comparison tool. By diving by pricer per pound, we allow lower priced foods with high energy per yield to stand out and get higher scores.

In the end, excluding the wrong potato type, I ended up with bananas being the best combination of nutrition, price, and yield, with sweet potatoes and avocados coming in 2nd and 3rd place.

	ds	group	name	ndbno	food	form	price_per_lb	yield	lb_per_cup	price_per_cup	type
0	SR	Fruits and Fruit Juices	Watermelon, raw	09326	watermelon	Fresh1	0.333412	0.520000	0.330693	0.212033	fruit
1	SR	Fruits and Fruit Juices	Melons, cantaloupe, raw	09181	cantaloupe	Fresh1	0.535874	0.510000	0.374786	0.393800	fruit
2	SR	Fruits and Fruit Juices	Tangerines, (mandarin oranges), raw	09218	tangerines	Fresh1	1.377962	0.740000	0.407855	0.759471	fruit
3	SR	Fruits and Fruit Juices	Strawberries, raw	09316	strawberries	Fresh1	2.358808	0.940000	0.319670	0.802171	fruit
4	SR	Fruits and Fruit Juices	Plums, raw	09279	plums	Fresh1	1.827416	0.940000	0.363763	0.707176	fruit
5	SR	Fruits and Fruit Juices	Oranges, raw, California, valencias	09201	oranges	Fresh1	1.035173	0.730000	0.407855	0.578357	fruit
6	SR	Fruits and Fruit Juices	Raspberries, raw	09302	raspberries	Fresh1	6.975811	0.960000	0.319670	2.322874	fruit
7	SR	Fruits and Fruit Juices	Pomegranates, raw	09286	pomegranate	Fresh1	2.173590	0.560000	0.341717	1.326342	fruit
8	SR	Fruits and Fruit Juices	Pineapple, raw, all varieties	09266	pineapple	Fresh1	0.627662	0.510000	0.363763	0.447686	fruit
9	SR	Fruits and Fruit Juices	Apricots, raw	09021	apricots	Fresh1	3.040072	0.930000	0.363763	1.189102	fruit
10	SR	Fruits and Fruit Juices	Melons, honeydew, raw	09184	honeydew	Fresh1	0.796656	0.460000	0.374786	0.649077	fruit
11	SR	Fruits and Fruit Juices	Papayas, raw	09226	papaya	Fresh1	1.298012	0.620000	0.308647	0.646174	fruit
12	SR	Fruits and Fruit Juices	Cherries, sweet, raw	09070	cherries	Fresh1	3.592990	0.920000	0.341717	1.334548	fruit
13	SR	Fruits and Fruit Juices	Bananas, raw	09040	bananas	Fresh1	0.566983	0.640000	0.330693	0.292965	fruit
14	SR	Fruits and Fruit Juices	Apples, raw, with skin	09003	apples	Fresh1	1.567515	0.900000	0.242508	0.422373	fruit
15	SR	Fruits and Fruit Juices	Peaches, yellow, raw	09236	peaches	Fresh1	1.591187	0.960000	0.341717	0.566390	fruit
16	SR	Fruits and Fruit Juices	Nectarines, raw	09191	nectarines	Fresh1	1.761148	0.910000	0.319670	0.618667	fruit
17	SR	Fruits and Fruit Juices	Pears, raw	09252	pears	Fresh1	1.461575	0.900000	0.363763	0.590740	fruit
18	SR	Fruits and Fruit Juices	Grapefruit, raw, white, California	09117	grapefruit	Fresh1	0.897802	0.490000	0.462971	0.848278	fruit
19	SR	Fruits and Fruit Juices	Blackberries, raw	09042	blackberries	Fresh1	5.774708	0.960000	0.319670	1.922919	fruit
20	SR	Fruits and Fruit Juices	Grapes, muscadine, raw	09129	grapes	Fresh1	2.093827	0.960000	0.330693	0.721266	fruit
21	SR	Fruits and Fruit Juices	Blueberries, raw	09050	blueberries	Fresh1	4.734622	0.950000	0.319670	1.593177	fruit
22	SR	Fruits and Fruit Juices	Mangos, raw	09176	mangoes	Fresh1	1.377563	0.710000	0.363763	0.705783	fruit
23	SR	Vegetables and Vegetable Products	Asparagus, raw	11011	asparagus	Fresh1	3.213494	0.493835	0.396832	2.582272	vegetables
24	SR	Vegetables and Vegetable Products	Lettuce, iceberg (includes crisphead types), raw	11252	lettuce iceberg	Fresh1	1.213039	0.950000	0.242508	0.309655	vegetables
25	SR	Vegetables and Vegetable Products	Onions, raw	11282	onions	Fresh1	1.038107	0.900000	0.352740	0.406868	vegetables
26	SR	Vegetables and Vegetable Products	Turnip greens, raw	11568	turnip greens	Fresh1	2.471749	0.750000	0.319670	1.053526	vegetables
27	SR	Vegetables and Vegetable Products	Mustard greens, raw	11270	mustard greens	Fresh1	2.569235	0.840000	0.308647	0.944032	vegetables
28	SR	Vegetables and Vegetable Products	Potatoes, raw, skin	11362	potatoes	Fresh1	0.564320	0.811301	0.264555	0.184017	vegetables
29	SR	Vegetables and Vegetable Products	Collards, raw	11161	collard greens	Fresh1	2.630838	1.160000	0.286601	0.650001	vegetables
30	SR	Vegetables and Vegetable Products	Beans, snap, green, raw	11052	green beans	Fresh1	2.139972	0.846575	0.275578	0.696606	vegetables
31	SR	Vegetables and Vegetable Products	Squash, winter, acorn, raw	11482	acorn squash	Fresh1	1.172248	0.458554	0.451948	1.155360	vegetables
32	SR	Vegetables and Vegetable Products	Peppers, sweet, red, raw	11821	red peppers	Fresh1	2.277940	0.820000	0.264555	0.734926	vegetables
33	SR	Vegetables and Vegetable Products	Swamp cabbage, (skunk cabbage), raw	11503	cabbage	Fresh green cabbage1	0.579208	0.778797	0.330693	0.245944	vegetables
34	SR	Vegetables and Vegetable Products	Swamp cabbage, (skunk cabbage), raw	11503	cabbage	Fresh red cabbage1	1.056450	0.779107	0.330693	0.448412	vegetables
35	SR	Vegetables and Vegetable Products	Sweet potato, raw, unprepared	11507	sweet potatoes	Fresh1	0.918897	0.811301	0.440925	0.499400	vegetables
36	SR	Vegetables and Vegetable Products	Squash, summer, scallop, raw	11475	summer squash	Fresh1	1.639477	0.769500	0.396832	0.845480	vegetables
37	SR	Vegetables and Vegetable Products	Radishes, raw	11429	radish	Fresh1	1.311629	0.900000	0.275578	0.401618	vegetables
38	SR	Vegetables and Vegetable Products	Squash, winter, butternut, raw	11485	butternut squash	Fresh1	1.244737	0.714000	0.451948	0.787893	vegetables
39	SR	Fruits and Fruit Juices	Avocados, raw, California	09038	avocados	Fresh1	2.235874	0.740753	0.319670	0.964886	vegetables
40	SR	Vegetables and Vegetable Products	Kale, raw	11233	kale	Fresh1	2.807302	1.050000	0.286601	0.766262	vegetables
41	SR	Vegetables and Vegetable Products	Jerusalem-artichokes, raw	11226	artichoke	Fresh1	2.213050	0.375309	0.385809	2.274967	vegetables
42	SR	Vegetables and Vegetable Products	Okra, raw	11278	okra	Fresh1	3.213552	0.769474	0.352740	1.473146	vegetables
43	SR	Vegetables and Vegetable Products	Peppers, sweet, green, raw	11333	green peppers	Fresh1	1.410363	0.820000	0.264555	0.455022	vegetables
44	SR	Vegetables and Vegetable Products	Brussels sprouts, raw	11098	brussels sprouts	Fresh1	2.763553	1.060000	0.341717	0.890898	vegetables
45	SR	Vegetables and Vegetable Products	Corn, sweet, white, raw	11900	corn sweet	Fresh1	2.690623	0.540000	0.363763	1.812497	vegetables