Monday, December 15, 2014

Most popular songs containing the most decade-specific words in Billboard's popular music charts

This post is an adjunct to my dataviz on prooffreader.com, "Most decade-specific words in Billboard popular song titles, 1890-2014".

Here's the viz:

The code I used to download and process it is nicely formatted with NBViewer as an IPython notebook on my GitHub. The data comes not from Billboard itself, but from www.bullfrogspond.com; I don't know much about the data source, but it certainly looks thorough, painstaking and up to date.

Here are some observations about the methodology, written with both the non-data-expert and the cognoscenti in mind.

What's "keyness" all about? Keyness is a common approach when comparing word frequencies between two sets; it's particularly useful when the two sets are of unequal size (in this case, I was comparing, for example, all of the words from the 2010s to every word in the entire dataset). I used the log-likelihood method, which returns a measure of the statistical significance of finding that word in that decade. A keyness of about 11 means there's only a 0.1% chance that you would get the same or a higher result picking words from the entire collection at random instead of restricting yourself to the words in that subset (decade), and about 14 corresponds to a 0.01% chance. There's a good, intermediate-technical description of log-likelihood here.
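For the curious, here's a minimal sketch of the standard Dunning log-likelihood (G2) calculation; it's not lifted verbatim from my notebook, but it's the same statistic, with made-up counts in the example:

from __future__ import division
import math

def keyness(count_sub, total_sub, count_ref, total_ref):
    # Dunning log-likelihood (G2) of a word appearing count_sub times
    # in a subcorpus of total_sub words, vs. count_ref times in a
    # reference corpus of total_ref words.
    expected_sub = total_sub * (count_sub + count_ref) / (total_sub + total_ref)
    expected_ref = total_ref * (count_sub + count_ref) / (total_sub + total_ref)
    g2 = 0.0
    if count_sub:
        g2 += count_sub * math.log(count_sub / expected_sub)
    if count_ref:
        g2 += count_ref * math.log(count_ref / expected_ref)
    return 2 * g2

# A word used 50 times in a 10,000-word decade vs. 200 times in a
# 1,000,000-word collection scores far above the ~11 threshold:
print(keyness(50, 10000, 200, 1000000))  # about 215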

The contour charts: I made these with Excel sparklines since it was so easy. The absolute scale of the y-axis is different for each one; otherwise the less popular words would be invisible. If you look at some of the contours, you'll see that there are actually higher bars in decades other than the highlighted one; that's because keyness considers factors other than just the heights of the bars (such as the relative number of words in the decade).

Why is "f**k" censored? It was like that in the dataset; I'm no prude. Fuck fuck fuck fuck fuck.

The binning effect: It's natural to bin years into decades, but it's completely arbitrary in a statistical sense. The popular songs of the 1960s far, far more resembled those of the 1950s than they did what one thinks of as "sixties music". What this means for this dataset is that words which happen to respect these artificial boundaries will be overrepresented. For example, a word that was particularly popular between 1972 and 1979 will have a higher keyness than one that was popular across a decade break, from 1976 to 1983. That's just a choice that has to be made for an easy-to-grasp analysis; if I were analyzing something more important than song titles, I'd be more rigorous in this regard.

Why the top five from each decade instead of the top keyness overall? Basically because it was more interesting this way. The database goes back to 1890, and there are fewer songs overall back then, with more uncommon words; that gives those words higher keyness, and any such list would be full of songs nobody's heard of. The top word of the 2010s, "we", is number 63 overall, and over half of the words above it come from the first three decades.

I considered leaving out the earliest decades entirely, but the number of songs in them, while small, was not negligible, as you can see from the following chart (which I made in about 14 seconds with Chartbuilder):


In the end, I went with the more interesting approach. Data visualization is narrative by nature; don't let anyone tell you otherwise.

Finally, here's a table with the most popular song for each decade containing the top-five word in question. "Most popular" is decided by a metric particular to the dataset source, but it seems thorough and defensible. In the table, "Ky." is the word's keyness score and "Max." is its peak frequency as a percentage of all title words.

Dec.  Word      Ky. Max.  Most popular song
2010s We         22 1.4%  Rihanna, "We Found Love" (2011)
      Yeah       18 0.2%  Austin Mahone, "Mmm Yeah" (2014)
      Hell       18 0.3%  Avril Lavigne, "What The Hell" (2011)
      F**k       15 0.1%  Cee Lo Green, "F**K You (Forget You)" (2011)
      Die        14 0.2%  Ke$ha, "Die Young" (2012)
2000s U          71 1.1%  Usher, "U Got It Bad" (2001)
      Like       28 1.1%  T.I., "Whatever You Like" (2008)
      Breathe    25 0.2%  Faith Hill, "Breathe" (2000)
      It         24 2.4%  Usher, "U Got It Bad" (2001)
      Ya         19 0.7%  OutKast, "Hey Ya!" (2003)
1990s U          49 1.1%  Sinead O'Connor, "Nothing Compares 2 U" (1990)
      You        28 5.1%  Stevie B, "Because I Love You (The Postman Song)" (1990)
      Up         21 1.0%  Brandy, "Sittin' Up In My Room" (1996)
      Get        20 1.0%  En Vogue, "My Lovin' (You're Never Gonna Get It)" (1992)
      Thang      18 0.2%  Dr. Dre, "Nuthin' But A "G" Thang" (1993)
1980s Love       48 3.8%  Joan Jett & The Blackhearts, "I Love Rock 'N Roll" (1982)
      Fire       24 0.5%  Billy Joel, "We Didn't Start The Fire" (1989)
      Don't      20 1.6%  Human League, The, "Don't You Want Me" (1982)
      Rock       14 0.7%  Joan Jett & The Blackhearts, "I Love Rock 'N Roll" (1982)
      On         14 3.2%  Bon Jovi, "Livin' On A Prayer" (1987)
1970s Woman      33 0.6%  The Guess Who, "American Woman" (1970)
      Disco      31 0.4%  Johnnie Taylor, "Disco Lady" (1976)
      Rock       24 0.7%  Elton John, "Crocodile Rock" (1973)
      Music      24 0.6%  Wild Cherry, "Play That Funky Music" (1976)
      Dancin'    20 0.5%  Leif Garrett, "I Was Made For Dancin'" (1979)
1960s Baby       51 1.9%  Supremes, The, "Baby Love" (1964)
      Twist      24 0.7%  Joey Dee & the Starliters, "Peppermint Twist - Part 1" (1962)
      Little     16 4.0%  Steve Lawrence, "Go Away Little Girl" (1963)
      Twistin'   15 0.4%  Chubby Checker, "Slow Twistin'" (1962)
      Lonely     14 0.5%  Bobby Vinton, "Mr. Lonely" (1964)
1950s Christmas  31 0.8%  Art Mooney & Orch., "(I'm Getting) Nuttin' For Christmas" (1955)
      Penny      18 0.4%  Dinah Shore & Tony Martin, "A Penny A Kiss" (1951)
      Mambo      15 0.5%  Perry Como, "Papa Loves Mambo" (1954)
      Rednosed   15 0.3%  Gene Autry, "Rudolph, the Red-Nosed Reindeer" (1950)
      Three      15 0.5%  Browns, The, "The Three Bells" (1959)
1940s Polka      50 0.4%  Kay Kyser & Orch., "Strip Polka" (1942)
      Serenade   35 0.7%  Andrews Sisters, "Ferry Boat Serenade" (1940)
      Boogie     28 0.6%  Will Bradley & Orch., "Scrub Me, Mama, With a Boogie Beat" (1941)
      Blue       26 1.6%  Tommy Dorsey & Frank Sinatra, "In The Blue Of Evening" (1943)
      Christmas  22 0.8%  Bing Crosby, "White Christmas" (1942)
1930s Moon       79 1.4%  Glenn Miller & Orch., "Moon Love" (1939)
      In         38 6.5%  Ted Lewis & His Band, "In A Shanty In Old Shanty Town" (1932)
      Swing      34 0.5%  Ray Noble & Orch., "Let's Swing It" (1935)
      Sing       34 1.4%  Benny Goodman & Martha Tilton, "And the Angels Sing" (1939)
      A          30 5.8%  Ted Lewis & His Band, "In A Shanty In Old Shanty Town" (1932)
1920s Blues     153 3.1%  Paul Whiteman & Orch., "Wang Wang Blues" (1921)
      Pal        42 0.9%  Al Jolson, "Little Pal" (1929)
      Sweetheart 27 0.9%  Isham Jones & Orch., "Nobody's Sweetheart" (1924)
      Rose       25 1.4%  Ted Lewis & His Band, "Second Hand Rose" (1921)
      Mammy      23 1.0%  Paul Whiteman & Orch., "My Mammy" (1921)
1910s Gems       70 1.1%  Victor Light Opera Co., "Gems from 'Naughty Marietta'" (1912)
      Rag        52 1.2%  Original Dixieland Jazz Band, "Tiger Rag" (1918)
      Home       43 2.9%  Henry Burr, "When You're a Long, Long Way from Home" (1914)
      Land       41 0.6%  Al Jolson, "Hello Central, Give Me No Man's Land" (1918)
      Old        38 3.7%  Harry Macdonough, "Down by the Old Mill Stream" (1912)
1900s Uncle      58 4.5%  Cal Stewart, "Uncle Josh's Huskin' Bee Dance" (1901)
      Old        58 3.7%  Haydn Quartet, "In the Good Old Summer Time" (1903)
      Josh       44 3.7%  Cal Stewart, "Uncle Josh On an Automobile" (1903)
      Reuben     38 1.4%  S. H. Dudley, "When Reuben Comes to Town" (1901)
      When       33 3.8%  George J. Gaskin, "When You Were Sweet Sixteen" (1900)
1890s Uncle      59 4.5%  Cal Stewart, "Uncle Josh's Arrival in New York" (1898)
      Casey      54 3.3%  Russell Hunting, "Michael Casey Taking the Census" (1892)
      Josh       53 3.7%  Cal Stewart, "Uncle Josh at the Opera" (1898)
      Old        26 3.7%  Dan Quinn, "A Hot Time in the Old Town" (1896)
      Michael    24 2.7%  Russell Hunting, "Michael Casey Taking the Census" (1892)


• • •

Thursday, November 27, 2014

How to quickly turn an IPython notebook into a blog post

IPython notebooks are great for many things, but they're a little awkward to embed in blog platforms like Blogger, Wordpress, etc. When nbconvert was a standalone command-line tool, there was a blog export template, but that seems to have disappeared now that nbconvert has been folded into IPython.

Out of the box, nbconvert has just two HTML export options:
  • --html
    which includes a lot of CSS that interferes with a blog's CSS, and:
  • --html --template basic
    which has no CSS and so pretty much negates the benefit of using an IPython notebook. However, it does leave CSS classes in the markup.

My solution was to whip up a quick CSS stylesheet that can be included in the blog post. It seems to work pretty well; you can judge from the code-heavy posts on this blog.

Note that, for aesthetic reasons, I removed all the "In [1]"-style prompt tags because of the narrow columns on this blog. Your mileage may vary.


1. Convert .ipynb notebook to HTML


In the terminal, navigate to the folder containing the .ipynb file and type:

ipython nbconvert --to html --template basic filename.ipynb

2. Paste HTML in blog


Note: if you're using the Blogger platform, never switch back to the Compose interface after you use the HTML interface; it changes all your tags.

3. Add CSS to blog HTML


This seems to reproduce the native syntax highlighting of IPython.

<style type="text/css">
.highlight{background: #f8f8f8; overflow:auto;width:auto;border:solid gray;border-width:.1em .1em .1em .1em;padding:0em .5em;border-radius: 4px;}
.k{color: #338822; font-weight: bold;}
.kn{color: #338822; font-weight: bold;}
.mi{color: #000000;}
.o{color: #000000;}
.ow{color: #BA22FF;  font-weight: bold;}
.nb{color: #338822;}
.n{color: #000000;}
.s{color: #cc2222;}
.se{color: #cc2222; font-weight: bold;}
.si{color: #C06688; font-weight: bold;}
.nn{color: #4D00FF; font-weight: bold;}
</style>
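If you do this often, the whole pipeline is easily scripted. Here's a minimal sketch, assuming you've saved the stylesheet above to a file called blog_ipynb.css (a name I just made up):

import subprocess

def notebook_to_blog_html(ipynb_path, css_path='blog_ipynb.css'):
    # Run nbconvert with the basic template, then prepend the blog CSS;
    # the result is ready to paste into the blog's HTML editor.
    subprocess.check_call(['ipython', 'nbconvert', '--to', 'html',
                           '--template', 'basic', ipynb_path])
    html_path = ipynb_path.replace('.ipynb', '.html')
    with open(css_path) as f:
        css = f.read()
    with open(html_path) as f:
        body = f.read()
    return css + '\n' + body

print(notebook_to_blog_html('filename.ipynb'))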
• • •

Wednesday, November 26, 2014

Top 10 Python idioms I wish I'd learned earlier



I've been programming all my life, but I've never been a programmer. Most of my work was done in Visual Basic because it's what I was most comfortable with, plus a smattering of other languages (R, C, JavaScript, etc., and, if you go back far enough, Pascal, AppleScript, HyperTalk and BASIC, which I learned in 1979). A couple of years ago, I decided to use Python exclusively so that I could improve my coding. I ended up re-inventing many wheels -- which I didn't mind too much, because I enjoy solving puzzles. But sometimes it's good to have a more efficient, Pythonesque approach, and time after time I found myself having "aha!" moments, realizing I'd been doing things the hard, excessively verbose way for no reason. Here is a list of ten Python idioms that would have made my life much easier if I'd thought to search for them early on.

Missing from this list are some idioms such as list comprehensions and lambda functions, which are very Pythonesque and very efficient and very cool, but also very difficult to miss because they're mentioned in every other StackOverflow answer! Also missing are ternary x if y else z constructions, decorators and generators, because I don't use them very often.

There's also an IPython notebook nbviewer version of this document if you prefer.



1. Python 3-style printing in Python 2


One of the things that kept me from concentrating on Python was the whole version 2 vs. version 3 debacle. I finally went with Python 2 because not all the libraries I wanted were 3-compatible, and I figured that, if I needed to, I would laboriously adjust my code later.

But really, the biggest differences in everyday programming are printing and division, and now I just import those from __future__. Now that almost all the libraries I use heavily are v3-compliant, I'll make the switch soon.
mynumber = 5

print "Python 2:"
print "The number is %d" % (mynumber)
print mynumber / 2,
print mynumber // 2

# In the notebook these were separate cells; in a single script,
# __future__ imports must come before any other statements.
from __future__ import print_function
from __future__ import division

print('\nPython 3:')
print("The number is {}".format(mynumber))
print(mynumber / 2, end=' ')
print(mynumber // 2)
Python 2:
The number is 5
2 2

Python 3:
The number is 5
2.5 2

Oh, and here's an easter egg for C programmers:
from __future__ import braces
  File "<ipython-input-3-2aebb3fc8ecf>", line 1
    from __future__ import braces
SyntaxError: not a chance

2. enumerate(list)


It might seem obvious that you should be able to iterate over a list and its index at the same time, but I used counter variables or slices for an embarrassingly long time.
mylist = ["It's", 'only', 'a', 'model']

for index, item in enumerate(mylist):
    print(index, item)
0 It's
1 only
2 a
3 model

3. Chained comparison operators


Because I was so used to statically typed languages where this idiom would be ambiguous, it never occurred to me to put two operators in the same expression. In many languages, 4 > 3 > 2 would evaluate to False, because (4 > 3) would first be evaluated as a boolean, and then True > 2 would be evaluated as False. EDIT: I'm informed that, in Python, this construction falls under the category of "syntactic sugar".
mynumber = 3

if 4 > mynumber > 2:
    print("Chained comparison operators work! \n" * 3)
Chained comparison operators work! 
Chained comparison operators work! 
Chained comparison operators work! 


4. collections.Counter


The collections library is, like, the best thing ever. StackOverflow turned me on to ordered dicts early on, but I kept using a snippet to create dicts that count occurrences of results in my code. One of these days, I'll figure out a use for collections.deque.
from collections import Counter
from random import randrange

mycounter = Counter()
for i in range(100):
    random_number = randrange(10)
    mycounter[random_number] += 1
for i in range(10):
    print(i, mycounter[i])
0 10
1 10
2 13
3 6
4 6
5 11
6 10
7 14
8 12
9 8
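A bonus I learned later: Counter can consume an iterable directly, and its .most_common() method replaces the second loop entirely:

from collections import Counter

letter_counts = Counter('spam spam spam eggs')
print(letter_counts.most_common(3))  # the three most frequent characters
# [('s', 4), ('p', 3), ('a', 3)] -- ties may come out in any order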

5. Dict comprehensions


A rite of passage for a Python programmer is understanding list comprehensions, but eventually I realized dict comprehensions are just as useful -- especially for reversing dicts.
my_phrase = ["No", "one", "expects", "the", "Spanish", "Inquisition"]
my_dict = {key: value for value, key in enumerate(my_phrase)}
print(my_dict)
reversed_dict = {value: key for key, value in my_dict.items()}
print(reversed_dict)
{'Inquisition': 5, 'No': 0, 'expects': 2, 'one': 1, 'Spanish': 4, 'the': 3}
{0: 'No', 1: 'one', 2: 'expects', 3: 'the', 4: 'Spanish', 5: 'Inquisition'}

6. Executing shell commands with subprocess


I used to use the os library exclusively to manipulate files; now I can even programmatically call complex command-line tools like ffmpeg for video editing.

(And yes, I use Windows, so do all of my clients. But I have the good grace to be embarrassed about it!)

Note that the particular command I picked would be much better done with the os library; I just wanted one everyone would be familiar with. And in general, shell=True is a VERY bad idea; it's needed here only because dir is a Windows shell built-in rather than a standalone program. Don't try this at home, kids!
import subprocess
output = subprocess.check_output('dir', shell=True)
print(output)
 Volume in drive C is OS
 Volume Serial Number is [REDACTED]
 Directory of C:\Users\David\Documents\[REDACTED]

2014-11-26  06:04 AM    <DIR>          .
2014-11-26  06:04 AM    <DIR>          ..
2014-11-23  11:47 AM    <DIR>          .git
2014-11-26  06:06 AM    <DIR>          .ipynb_checkpoints
2014-11-23  08:59 AM    <DIR>          CCCma
2014-09-03  06:58 AM            19,450 colorbrewdict.py
2014-09-03  06:58 AM            92,175 imagecompare.ipynb
2014-11-23  08:41 AM    <DIR>          Japan_Earthquakes
2014-09-03  06:58 AM             1,100 LICENSE
2014-09-03  06:58 AM             5,263 monty_monte.ipynb
2014-09-03  06:58 AM            31,082 pocket_tumblr_reddit_api.ipynb
2014-11-26  06:04 AM             3,211 README.md
2014-11-26  06:14 AM            19,898 top_10_python_idioms.ipynb
2014-09-03  06:58 AM             5,813 tree_convert_mega_to_gexf.ipynb
2014-09-03  06:58 AM             5,453 tree_convert_mega_to_json.ipynb
2014-09-03  06:58 AM             1,211 tree_convert_newick_to_json.py
2014-09-03  06:58 AM            55,970 weather_ML.ipynb
              11 File(s)        240,626 bytes
               6 Dir(s)  180,880,490,496 bytes free
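For the record, the safer form passes the arguments as a list, with no shell in between; a quick sketch (assuming ffmpeg is on your PATH):

import subprocess

# No shell involved: the arguments go straight to the program.
output = subprocess.check_output(['ffmpeg', '-version'])
print(output.splitlines()[0])  # just the version banner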

7. dict .get() and .iteritems() methods


Having a default value when a key does not exist has all kinds of uses, and just like enumerate() for lists, you can iterate over (key, value) tuples in dicts.
my_dict = {'name': 'Lancelot', 'quest': 'Holy Grail', 'favourite_color': 'blue'}

print(my_dict.get('airspeed velocity of an unladen swallow', 'African or European?\n'))

for key, value in my_dict.iteritems():  # use .items() in Python 3
    print(key, value, sep=": ")
African or European?

quest: Holy Grail
name: Lancelot
favourite_color: blue

8. Tuple unpacking for switching variables


Do you know how many times I had to use a third, temporary dummy variable in VB? (c = a; a = b; b = c)
a = 'Spam'
b = 'Eggs'

print(a, b)

a, b = b, a

print(a, b)
Spam Eggs
Eggs Spam

9. Introspection tools


I was aware of dir(), but I had assumed help() would do the same thing as IPython's ? magic command. It does way more. (This post has been updated after some great advice from reddit's /r/python which, indeed, I wish I'd known about before!)
my_dict = {'That': 'an ex-parrot!'}

help(my_dict)
Help on dict object:

class dict(object)
 |  dict() -> new empty dictionary
 |  dict(mapping) -> new dictionary initialized from a mapping object's
 |      (key, value) pairs
 |  dict(iterable) -> new dictionary initialized as if via:
 |      d = {}
 |      for k, v in iterable:
 |          d[k] = v
 |  dict(**kwargs) -> new dictionary initialized with the name=value pairs
 |      in the keyword argument list.  For example:  dict(one=1, two=2)
 |  
 |  Methods defined here:
 |  
 |  __cmp__(...)
 |      x.__cmp__(y) <==> cmp(x,y)
 |  
 |  __contains__(...)
 |      D.__contains__(k) -> True if D has a key k, else False
 |  
 |  __delitem__(...)
 |      x.__delitem__(y) <==> del x[y]
 |  
 |  __eq__(...)
 |      x.__eq__(y) <==> x==y
 |  
 
[TRUNCATED FOR SPACE]

 |  
 |  update(...)
 |      D.update([E, ]**F) -> None.  Update D from dict/iterable E and F.
 |      If E present and has a .keys() method, does:     for k in E: D[k] = E[k]
 |      If E present and lacks .keys() method, does:     for (k, v) in E: D[k] = v
 |      In either case, this is followed by: for k in F: D[k] = F[k]
 |  
 |  values(...)
 |      D.values() -> list of D's values
 |  
 |  viewitems(...)
 |      D.viewitems() -> a set-like object providing a view on D's items
 |  
 |  viewkeys(...)
 |      D.viewkeys() -> a set-like object providing a view on D's keys
 |  
 |  viewvalues(...)
 |      D.viewvalues() -> an object providing a view on D's values
 |  
 |  ----------------------------------------------------------------------
 |  Data and other attributes defined here:
 |  
 |  __hash__ = None
 |  
 |  __new__ = <built-in method __new__ of type object>
 |      T.__new__(S, ...) -> a new object with type S, a subtype of T

10. PEP-8 compliant string chaining


PEP8 is the style guide for Python code. Among other things, it directs that lines be limited to 79 characters and that indentation be consistent across line breaks.

This can be accomplished with a combination of backslashes (\), parentheses with commas, and addition operators (+), but every one of these solutions is awkward for multiline strings. There is a multiline string signifier, the triple quote, but it does not allow consistent indentation across line breaks.

There is a solution: parentheses without commas. It works because Python concatenates adjacent string literals at compile time, and an open parenthesis allows implicit line continuation.
my_long_text = ("We are no longer the knights who say Ni! "
                "We are now the knights who say ekki-ekki-"
                "ekki-p'tang-zoom-boing-z'nourrwringmm!")
print(my_long_text)
We are no longer the knights who say Ni! We are now the knights who say ekki-ekki-ekki-p'tang-zoom-boing-z'nourrwringmm!
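The flip side of this idiom is worth knowing: forget a comma between two strings in a list, and Python silently concatenates them:

knights = ['Galahad',
           'Lancelot'  # missing comma: the next string merges into this one
           'Robin']
print(knights)  # ['Galahad', 'LancelotRobin'] -- only two elements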

• • •

Monday, November 24, 2014

Python code to plot CCCma NetCDF files in Basemap

On my other blog, prooffreader.com, I posted an animated GIF of projected White Christmases between now and the year 2100, based on NetCDF files from the Canadian Centre for Climate Modelling and Analysis (CCCma). There are plenty of examples of NetCDF with Basemap out there, but the CCCma's format was a little different, so I thought I'd post this here in case it's helpful to anyone. A minimal sketch of the general shape follows the links below.

Links:
  • Github repo with this script and the one I used to make the animated GIF with the nice background
  • nbviewer version of this script in nice IPython format.
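Here's the promised sketch; the file name and variable names below are placeholders I invented, so see the repo for the real ones:

from netCDF4 import Dataset
from mpl_toolkits.basemap import Basemap, shiftgrid
import matplotlib.pyplot as plt
import numpy as np

nc = Dataset('cccma_snow.nc')         # hypothetical file name
lats = nc.variables['lat'][:]
lons = nc.variables['lon'][:]
snow = nc.variables['snw'][0, :, :]   # hypothetical variable; first time step

# Files like these use 0-360 longitudes; shift to -180..180 for Basemap.
snow, lons = shiftgrid(180., snow, lons, start=False)

m = Basemap(projection='robin', lon_0=0)
x, y = m(*np.meshgrid(lons, lats))
m.drawcoastlines()
m.contourf(x, y, snow)
plt.savefig('snow_map.png')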

• • •

Monday, October 27, 2014

Fan commentary for 2013 film 'Coherence'

I was quite taken with the film Coherence, which a friend suggested I download from iTunes. Since I'm in Canada, iTunes wouldn't let me, but I figured out a way around it; contact me at prooffreader at gmail dot com if you want the deets.

Briefly, Coherence is a microbudget ($50 000) film that takes place at a dinner party while a comet passing overhead seems to be changing the very nature of reality itself. Emily Foxler, aka Emily Baldoni, gives a terrific performance, alongside Nicholas Brendon (Xander from Buffy the Vampire Slayer), Elizabeth Gracen (from Highlander: The Series) and others in an ensemble cast.

I enjoyed the film so much I made this fan commentary. The film is not widely released, so my competition was low. Hopefully that will change!

I track (no spoilers) all the Hughs and Amirs, the glowsticks, the objects in the box, the houses, I try to determine what people know when and what their motivations are when it's a little opaque, and I try to figure out what exactly is going on with the person in the bathroom at the end. And I wax philosophical, humorous (I think; I hope!) and analytical. Or perhaps I wane, who knows?

Link to Audio File at yourlisten.com:


Link to Video File on YouTube:
(Basically an audio file with a screenshot of the film every 30 seconds on average)



Link to the film's Wikipedia page

Link to the film's IMDB page

• • •

Wednesday, September 17, 2014

Python scripts to shorten column names, or to fetch Google Ngrams data

I've made a couple new GitHub repos:

google_ngram_py, which allows you to look up one- to five-word phrases in the Google Ngrams Viewer (which shows frequency by year) from Python, and returns the data as pandas dataframes, separated into parent and children for case-insensitive searches (e.g. the parent is 'the (All)', the children are 'the', 'The', 'THE').

shorten_column_names, which allows you to find the most common words in a list of phrases and abbreviate them. I used it for shortening the sometimes 100+-character column names from World Bank data (e.g. population -> pop), but you could use it on any list of strings.
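To give the flavor of the latter, here's a minimal re-sketch of the idea; this is not the repo's actual code, and the abbreviation rule (truncate recurring words) is invented for illustration:

from collections import Counter

def shorten(columns, min_count=2, keep=4):
    # Abbreviate words that recur across column names to their first `keep` letters.
    counts = Counter(w.lower() for col in columns for w in col.split())
    common = set(w for w, n in counts.items() if n >= min_count and len(w) > keep)
    return [' '.join(w[:keep] if w.lower() in common else w for w in col.split())
            for col in columns]

cols = ['Population growth (annual %)', 'Urban population (% of total)']
print(shorten(cols))
# ['Popu growth (annual %)', 'Urban popu (% of total)']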
• • •

Monday, July 14, 2014

The 100 trendiest baby names (determined by an analytical chemistry technique)

With the technique described at prooffreader.com, here are the 100 trendiest names from the U.S. Social Security Administration database:

Name Sex Trendiness Peak height (%) Peak width (years) Peak range
Linda F 0.183 5.67 31 1938-1969
Dewey M 0.151 0.91 6 1897-1903
Brittany F 0.121 2.05 17 1983-2000
Debra F 0.118 2.59 22 1950-1972
Shirley F 0.112 4.04 36 1921-1957
Ashley F 0.105 3.16 30 1980-2010
Jennifer F 0.105 4.3 41 1961-2002
Deborah F 0.104 2.82 27 1947-1974
Lisa F 0.1 3.41 34 1955-1989
Jessica F 0.092 3.22 35 1971-2006
Jason M 0.085 3.48 41 1968-2009
Betty F 0.069 3.4 49 1910-1959
Barbara F 0.067 3.56 53 1918-1971
Judith F 0.063 1.96 31 1934-1965
Grover M 0.059 0.71 12 1883-1895
Amy F 0.058 2.21 38 1958-1996
Carol F 0.055 2.32 42 1928-1970
Patricia F 0.053 3.13 59 1922-1981
Heather F 0.052 1.67 32 1966-1998
Mark M 0.052 2.75 53 1946-1999
Melissa F 0.052 2.12 41 1960-2001
Dorothy F 0.051 3.24 63 1893-1956
Joan F 0.049 1.97 40 1924-1964
Tammy F 0.049 1.22 25 1957-1982
Debbie F 0.048 0.97 20 1952-1972
Jaime F 0.045 0.53 12 1975-1987
Lori F 0.043 1.24 29 1954-1983
Karen F 0.042 1.99 47 1937-1984
Sandra F 0.042 2.02 48 1934-1982
Woodrow M 0.041 0.46 11 1911-1922
Sharon F 0.041 1.77 43 1935-1978
Judy F 0.039 1.3 33 1935-1968
Michelle F 0.039 2.03 52 1954-2006
Gary M 0.039 2.03 52 1933-1985
Brian M 0.039 2.29 59 1950-2009
Cynthia F 0.038 1.92 50 1942-1992
Kathy F 0.038 1.19 31 1943-1974
Tracy F 0.038 1.06 28 1959-1987
Larry M 0.037 1.91 51 1931-1982
Scott M 0.037 1.75 47 1950-1997
Chelsea F 0.037 0.88 24 1982-2006
Kimberly F 0.036 2.01 55 1956-2011
Donald M 0.036 2.95 81 1902-1983
Donna F 0.035 1.8 51 1925-1976
Pamela F 0.033 1.41 43 1942-1985
Ronald M 0.033 2.05 63 1926-1989
Cindy F 0.032 0.99 31 1951-1982
Cheryl F 0.032 1.2 38 1943-1981
Megan F 0.031 1.16 37 1973-2010
Angela F 0.031 1.6 51 1953-2004
Dolores F 0.031 1.14 37 1914-1951
Jeffrey M 0.03 1.69 56 1944-2000
Cody M 0.029 1 34 1978-2012
Crystal F 0.029 1.13 39 1963-2002
Tiffany F 0.029 1.03 36 1968-2004
Carolyn F 0.028 1.48 52 1923-1975
Diane F 0.028 1.19 43 1932-1975
Tina F 0.027 0.9 33 1955-1988
Dawn F 0.027 0.9 33 1952-1985
Joyce F 0.027 1.3 48 1921-1969
Steven M 0.027 1.84 68 1941-2009
Amber F 0.027 0.99 37 1971-2008
Whitney F 0.026 0.56 21 1979-2000
Kelly F 0.026 1.19 46 1958-2004
Chad M 0.026 0.83 32 1966-1998
Courtney F 0.025 0.81 32 1974-2006
Kim F 0.025 0.62 25 1952-1977
Todd M 0.024 0.84 35 1956-1991
Doris F 0.024 1.48 62 1897-1959
Timothy M 0.024 1.64 69 1942-2011
Mildred F 0.024 1.59 67 1881-1948
Marilyn F 0.024 1.06 45 1922-1967
Shannon F 0.023 0.91 39 1963-2002
Danielle F 0.023 0.98 42 1967-2009
Dennis M 0.023 1.32 57 1932-1989
Stephanie F 0.022 1.37 61 1949-2010
Kelsey F 0.022 0.64 29 1983-2012
Brittney F 0.021 0.42 20 1982-2002
Brenda F 0.021 1.21 58 1939-1997
Erin F 0.021 0.89 43 1966-2009
Carole F 0.02 0.62 31 1931-1962
Robin F 0.02 0.78 39 1950-1989
Gladys F 0.019 1.12 58 1887-1945
Julie F 0.019 1.05 55 1942-1997
Rhonda F 0.019 0.62 33 1949-1982
Christina F 0.019 1.07 57 1949-2006
Jamie F 0.019 0.86 46 1959-2005
Kathleen F 0.018 1.54 84 1908-1992
Gregory M 0.018 1.06 58 1945-2003
Cathy F 0.018 0.54 31 1945-1976
Janice F 0.018 0.89 51 1924-1975
Beverly F 0.017 0.91 52 1921-1973
Jerry M 0.017 1.37 79 1904-1983
Misty F 0.017 0.41 24 1967-1991
Tonya F 0.017 0.49 29 1959-1988
Lindsay F 0.017 0.52 31 1976-2007
Kristin F 0.016 0.58 36 1963-1999
Lindsey F 0.016 0.54 33 1976-2009
Denise F 0.016 0.78 48 1946-1994
Brandy F 0.016 0.44 27 1972-1999
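As a sanity check, the trendiness score in the table works out to peak height divided by peak width (small discrepancies come from rounding in the displayed columns):

def trendiness(peak_height_pct, peak_width_years):
    # trendiness = peak height (%) / peak width (years)
    return peak_height_pct / peak_width_years

print(round(trendiness(5.67, 31), 3))  # Linda    -> 0.183
print(round(trendiness(2.05, 17), 3))  # Brittany -> 0.121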

Here are links to my Baby Name GitHub Repo, and to an IPython notebook for this analysis.
• • •