06 April 2021

Family Tree Fun for Computer Geeks

I started a project in 2019 and said I would share it with you soon. It was harder than I thought, so I put it aside until now. The results are interesting to see.

The geeky background is this:

  • In Family Tree Maker, I exported my latest GEDCOM file
  • In Family Tree Analyzer (free), I imported the GEDCOM and exported a spreadsheet of all facts
  • In Power BI Desktop (free), I imported the spreadsheet and built different views of my data

A few minutes into reviving this project, I noticed a friend's blog post on a similar topic. He showed all the pie charts he generated from his family tree on MyHeritage. I have only the most basic tree on that website, so I got back to work in Power BI Desktop.

In Power BI Desktop, I created graphs showing:

1. Last name occurrences in my family tree from most to least. Most of the last names in my tree (currently with 27,900 people) come from one town. That's no surprise. Years ago I pieced together my Grandpa Leone's town using 1809–1860 vital records. I added 15,000 people to my family tree.

I also see big numbers for last names from Grandpa Iamarino's town. The name Pozzuto is in the #1 position by far. That's because I made an effort to fit every last Pozzuto from the vital records into my tree. My maiden name is in 10th place because I've spent time pushing to find my closest Iamarino relatives.

Do you know what the most common names in your family tree are? This tool can tell you.

2. First name occurrences in my family tree from most to least. My family is 100% from southern Italy, from the region called Campania. I'll bet the most common first names in my family tree are almost the same as other Campania family trees.

The most common first names in my family tree are:

  • Giuseppe
  • Angelamaria
  • Giovanni
  • Antonio
  • Francesco
  • Domenico
  • Pasquale
  • Maria
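If you'd like to check rankings like these without Power BI, here is a minimal Python sketch that tallies names straight from a GEDCOM file. It assumes the common GEDCOM convention of a level-1 `NAME` line with the surname between slashes (for example `1 NAME Antonio /Iamarino/`). The function name `count_names` and the sample lines are made up for illustration. This is not how Family Tree Analyzer or Power BI does it, and a person with alternate `NAME` lines would be counted more than once.

```python
import re
from collections import Counter

# Matches a level-1 NAME line such as "1 NAME Giuseppe /Iamarino/".
NAME_LINE = re.compile(r"^1 NAME (?P<given>[^/]*)/(?P<surname>[^/]*)/")

def count_names(gedcom_lines):
    """Return (surname_counts, given_name_counts) from an iterable of GEDCOM lines."""
    surnames = Counter()
    given_names = Counter()
    for line in gedcom_lines:
        match = NAME_LINE.match(line.strip())
        if not match:
            continue  # not a NAME line, or no surname between slashes
        surname = match.group("surname").strip()
        given = match.group("given").strip()
        if surname:
            surnames[surname] += 1
        if given:
            given_names[given] += 1
    return surnames, given_names

# Tiny invented sample, not real records from the tree.
sample = [
    "0 @I1@ INDI",
    "1 NAME Antonio /Iamarino/",
    "0 @I2@ INDI",
    "1 NAME Giuseppe /Pozzuto/",
    "0 @I3@ INDI",
    "1 NAME Maria /Pozzuto/",
]
surname_counts, given_counts = count_names(sample)
print(surname_counts.most_common())  # most frequent surname first
```

To run it on a real export, you could pass an open file instead of the sample list, since a file object yields its lines one at a time.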

3. Birth locations plotted on a world map. The Power BI software plotted every birth location from my family tree on a map. I love zooming into southern Italy to see how centralized my Italian ancestors were. Draw a straight line from the Bay of Naples to the spur of the Italian boot, and that's where my DNA comes from.

Almost any type of family tree data can be plotted to give you the big picture.

4. Ahnentafel numbers from 1 to 2,691. I created a chart using a custom field in my GEDCOM called Ahnentafel. (Each of my direct ancestors has their Ahnentafel number in this field in Family Tree Maker.) I put the numbers on both the X and Y axes of a scatter plot for an interesting visualization of the gaps.

I know almost all my direct ancestors up to Ahnentafel number 748. Then there's a sprinkling from 999 to 1,392. Finally, I have a gigantic gap with two stragglers at 2,136 and 2,691. It's exciting to see my progress this way.
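The gaps are easier to reason about once you remember how Ahnentafel numbering works: person number n has a father numbered 2n and a mother numbered 2n + 1, so each generation holds twice as many slots as the one before it. Here's a rough Python sketch that reports how many ancestors are filled in per generation. The function names and the list of numbers are hypothetical, and I'm counting the root person (number 1) as generation 1.

```python
from collections import defaultdict

def generation(ahnentafel):
    """Generation of an Ahnentafel number, with the root person (1) as generation 1."""
    if ahnentafel < 1:
        raise ValueError("Ahnentafel numbers start at 1")
    # Numbers 2**(g-1) through 2**g - 1 all have a bit length of g.
    return ahnentafel.bit_length()

def coverage_by_generation(known_numbers):
    """Map generation -> (ancestors found, ancestors possible)."""
    found = defaultdict(int)
    for n in set(known_numbers):
        found[generation(n)] += 1
    return {g: (count, 2 ** (g - 1)) for g, count in sorted(found.items())}

# Invented numbers for illustration only.
known = [1, 2, 3, 4, 5, 6, 12, 13]
for gen, (have, possible) in coverage_by_generation(known).items():
    print(f"generation {gen}: {have} of {possible}")
```

Under this convention, number 748 falls in generation 10 and number 2,691 falls in generation 12.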

5. Number of children per marriage. I made a pie chart for the number of children in every marriage in my family tree. More than a third of the marriages in my tree have only one child. I'll bet I'm missing a ton of kids. That sounds like something to work on. About a quarter of the marriages have between 4 and 14 kids!

What do you think is the average number of children per family in your family tree?
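For anyone curious how a count like this could be pulled from the raw file, here's a small Python sketch. It assumes the usual GEDCOM layout, where each family is a level-0 `FAM` record and each child is a level-1 `CHIL` line inside it. The function name and sample records are invented. Whether the pie chart above includes childless marriages is something I can't tell from the chart alone, so this sketch simply counts every family record, including those with zero children.

```python
from collections import Counter

def children_per_family(gedcom_lines):
    """Return {family_id: number of CHIL lines} for each FAM record."""
    counts = {}
    current_family = None
    for raw in gedcom_lines:
        parts = raw.strip().split(" ", 2)
        if len(parts) < 2:
            continue
        level, tag_or_id = parts[0], parts[1]
        if level == "0":
            # A level-0 line starts a new record; track it only if it is a family.
            is_family = len(parts) == 3 and parts[2].strip() == "FAM"
            current_family = tag_or_id if is_family else None
            if is_family:
                counts[current_family] = 0
        elif level == "1" and tag_or_id == "CHIL" and current_family is not None:
            counts[current_family] += 1
    return counts

# Invented sample records, not real families from the tree.
sample = [
    "0 HEAD",
    "0 @F1@ FAM", "1 HUSB @I1@", "1 WIFE @I2@", "1 CHIL @I3@",
    "0 @F2@ FAM", "1 HUSB @I3@", "1 WIFE @I4@",
    "1 CHIL @I5@", "1 CHIL @I6@", "1 CHIL @I7@",
    "0 @F3@ FAM", "1 HUSB @I5@", "1 WIFE @I8@",
    "0 TRLR",
]
per_family = children_per_family(sample)
distribution = Counter(per_family.values())  # child count -> number of families
average = sum(per_family.values()) / len(per_family)
print(distribution, average)
```

The `distribution` counter is the data behind a pie chart like mine, and `average` answers the question in the caption.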

6. Drill-through by type of data. I'm familiar with this type of chart, but I never thought of using it for genealogy. I started with every individual in my family tree. Then I broke them down by last name. Then I broke each last name down by first name. I followed that with birth location, birth date, marriage date, and death date.

It may not be the most useful tool, but it is cool. I can choose any last name in my tree, then a first name and a birth place. I can click each one to see which facts I have in my tree.

This drill-through chart lets you follow anyone in your family tree through a series of events.

For instance, I can click my maiden name, Iamarino, and then the most common first name, Antonio. Now I see all the locations where an Antonio Iamarino was born. Next I click the town name (Colle Sannita), where I have 7 Antonio Iamarinos. Next come the birth dates of the 7 men. I clicked each one until I found an Antonio for whom I have all the basic facts: birth, marriage, and death dates.
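Under the hood, a drill-through like this is just nested grouping: split everyone by the first field, then split each group by the next field, and so on. Here's a minimal Python sketch of the idea. It is not Power BI's implementation, and the function name, field names, and sample people (including their dates) are all invented for illustration.

```python
from collections import defaultdict

def build_drill_tree(people, levels):
    """Nest people by each key in `levels`, in order; the leaves are lists of people."""
    if not levels:
        return list(people)
    groups = defaultdict(list)
    for person in people:
        # A missing field is grouped under the empty string.
        groups[person.get(levels[0], "")].append(person)
    return {key: build_drill_tree(members, levels[1:])
            for key, members in groups.items()}

# Invented records with placeholder dates, not real people from the tree.
people = [
    {"surname": "Iamarino", "given": "Antonio", "birth_place": "Colle Sannita", "birth_date": "1800-01-01"},
    {"surname": "Iamarino", "given": "Antonio", "birth_place": "Colle Sannita", "birth_date": "1830-01-01"},
    {"surname": "Iamarino", "given": "Giuseppe", "birth_place": "Colle Sannita", "birth_date": "1810-01-01"},
    {"surname": "Pozzuto", "given": "Maria", "birth_place": "Colle Sannita", "birth_date": "1820-01-01"},
]
tree = build_drill_tree(people, ["surname", "given", "birth_place"])

# "Clicking" Iamarino, then Antonio, then Colle Sannita:
matches = tree["Iamarino"]["Antonio"]["Colle Sannita"]
print([p["birth_date"] for p in matches])
```

Each dictionary lookup plays the role of one click in the chart, and the list at the bottom holds the people who match every choice so far.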

If you'd like to see statistics for your family tree, you can:

6 comments:

  1. You kill me! No wonder your family is alive to you...you play with them. It's amazing! You've found a way to take what could be boring statistics and make them come alive. Bravo!

    1. I really got a kick out of your comment, Denise. Thanks!

  2. Working on learning PowerBI, and found your article after it occurred to me to see what I can do with the Family Tree Maker data I am 'cleansing' (place names). Thanks! If I run into any obstacles I may just contact you.

    1. I'm happy to find someone else who likes this idea.

    2. Likewise. Also have SQL Server installed ... could be a chance to hone some "pipeline" skills I'm interested in as well. I work with data in my job, sadly have too many other hats/distractions to devote to or build skills in PBI/ETL as I'd like.

    3. I was recently playing with SQL again to help someone access an Italian records database. I am every kind of self-taught hack.
