How to see deep ethnicity (an alternative to the pie chart)

Replies: 183

Posted: 21 Mar 2013 5:55PM GMT
Classification: Query
Edited: 22 Mar 2013 7:01PM GMT
Once GEDmatch is ready for the onslaught of Ancestry DNA files, it will be easy to look at several detailed analyses of your "admix," aka the ethnicity pie chart. Until then, I have a kludgy/inefficient/Windows-centric method to share for those who didn't get the gene that appreciates delayed gratification. :)

The instructions below will tell you how to see your "biogeographical analysis" using Eurogenes K35. Here's a post with some background about the test: http://bga101.blogspot.com.au/2013/03/eurogenes-k36-at-gedma... (I know it says K36, but I only know where to get K35 files.)


First, convert your Ancestry DNA file to 23andMe format.

Edited 3/22: WAIT! Garry B. tells me we can actually skip this step. Woot! I'll leave the conversion instructions, but set them off in tildes below for easy skipping.

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
(Why 23andMe? Because my knowledge of how the programs below work would barely provide shelter for a gnat, so I can only go with what I know. I hope others will post better, easier, more efficient ways to do everything that follows.)

I don't have Excel, and Google Docs crawled off to die when I tried to load my Ancestry file, so I used a text editor. (I like Textpad.) You probably want to make a copy of the file you downloaded and work off of that.

1. Remove all the lines starting with hash marks (#) at the top of your raw data file. (Shouldn't really matter, but I removed them anyway.)

2. Replace (although, again, shouldn't really matter)
rsid chromosome position allele1 allele2
with
# rsid chromosome position genotype

3. Edited to add: You may have to delete all the lines at the bottom with "24" or "25" in the second position on the line. (I'm not sure if this is necessary, but I had to do it when I made a FTDNA file for Promethease, which I then converted to 23andMe format to try K35, so FYI.)

4. Close the gap between the two letters at the end of each line.

In Textpad, I would do this by searching for "T\tT" and replacing with "TT". (\t tells Textpad to look for a tab.) I would then do it for all possible pairs (AG, AC, CC, etc.). In other editors/word processors that don't recognize \t, just highlight and copy the letters and tab into the search box.

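Thanks to APTurner, I can add that in a spreadsheet like Excel, you can combine two columns with this formula: =CONCATENATE(A1,B1), where A1 and B1 stand for the cells holding the two letters. (Don't put " " between them, or you'll just trade the tab for a space.)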
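
For anyone who would rather script steps 1 through 4 than do them by hand, here is a minimal sketch in R (the same program installed further down). It is not from the original instructions: the function name ancestry_to_23andme and the file names in the usage line are made up for illustration, and it assumes the Ancestry file is tab-separated with the five columns shown in step 2.

```r
# Sketch only: convert an AncestryDNA raw data file to a 23andMe-style layout.
# ancestry_to_23andme is a hypothetical name, not part of any package.
ancestry_to_23andme <- function(infile, outfile, drop_chr = c("24", "25")) {
  lines <- readLines(infile, warn = FALSE)

  # Step 1: drop the comment lines that start with "#", plus any blank lines.
  lines <- lines[!grepl("^#", lines) & nzchar(lines)]

  # Step 2: drop the original column header; a new one is written at the end.
  lines <- lines[!grepl("^rsid[[:space:]]", lines)]

  # Split on tabs: rsid, chromosome, position, allele1, allele2.
  fields <- strsplit(lines, "\t", fixed = TRUE)
  fields <- fields[lengths(fields) == 5]

  # Step 3: drop rows with "24" or "25" in the chromosome column.
  keep <- !vapply(fields, function(f) f[2] %in% drop_chr, logical(1))
  fields <- fields[keep]

  # Step 4: join allele1 and allele2 with nothing between them.
  rows <- vapply(
    fields,
    function(f) paste(f[1], f[2], f[3], paste0(f[4], f[5]), sep = "\t"),
    character(1)
  )

  writeLines(c("# rsid\tchromosome\tposition\tgenotype", rows), outfile)
  invisible(length(rows))  # number of data rows written
}
```

Usage would look something like ancestry_to_23andme('AncestryDNA.txt', 'converted.txt'), with both names swapped for your own. As above, work off a copy of the file you downloaded.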

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Now you have to download a few things, and from here on out I'm drawing *heavily* on a post on 23andMe's forums by Peradam. (Those forums are closed to non-customers, plus a link here would be frowned upon. The links below are all to free tools, not to Ancestry's competitors, so we should be good.)


Download the DIYDodecad software (Click "File" then "Download" then extract to a folder):
https://docs.google.com/file/d/0B7AJcY18g2GaZGU4OWQ5OWItMzY2...

Download the Eurogenes K35 files into the same folder:
https://skydrive.live.com/?cid=5223cc821fdfeb45&id=5223C...

Place your converted "23andMe-style" raw data file into that same folder.

Download and install R from http://www.r-project.org/.


Everything is now in place. Time to start the party!


Open the R program and change the directory (File > Change Dir) to the folder where you placed your DIY, K35, and converted raw data files.
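
Before running the commands below, it can save a failed run to confirm that the folder really holds what they expect. This is a minimal sketch, not part of the original instructions: missing_k35_files is a hypothetical helper, and the path and raw data file name in the example are placeholders. The names standardize.r and admix.par come from the commands in this post. (Typing setwd('C:/path/to/folder') at the R prompt does the same thing as File > Change Dir.)

```r
# Hypothetical helper: report which of the files used by the commands in this
# post are missing from a folder. Returns character(0) when nothing is missing.
missing_k35_files <- function(folder, raw_file) {
  needed <- c("standardize.r", "admix.par", raw_file)
  needed[!file.exists(file.path(folder, needed))]
}

# Example (placeholder path and file name):
# missing_k35_files("C:/DNA/K35", "my_raw_data.txt")
```

An empty result means all three files were found. Anything it lists still needs to be copied into the folder.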

Enter this command:
source('standardize.r')

Enter this command, changing the name of the text file to whatever you've called yours:
standardize('RENAMETHISTOYOURRAWDATAFILE.txt', company='23andMe')

(I always have a lag after this step.)

Enter this command:
system('DIYDodecadWin admix.par')


And off it goes. (For around 50 minutes, in my case, during which time you may think the program has hung up, but odds are it's all fine.) What you'll end up with is a list of populations and percentages. Ta-da!

(Okay, so it's no more useful than the pie chart, but at least it's quite different and gives you something to think about, if you like to think about deep ancestry.)

It seems there's a way to "paint" each pair of chromosomes to show which population is on which part of which chromosome. This can be useful when comparing with matches (not that we can do that yet). Alas, trying to figure out the command to do this makes my head swim, so I figure I'll just wait for GEDmatch. (If anyone knows how to do this, please holler!)