Thursday, January 14, 2010

Cross referencing items in a spreadsheet using VLOOKUP()

I'm in the mist of a typical cheminformatics problem: how to convert plate-well data in to something more useful.  Normally our founding programmer would handle things like this, but he's stretched thin, and therefore it may take weeks to see a solution on his end.

The first step is to aggregate ones data;  a mapping of all chemical structures to their respective plate-well locations, a list of interesting plate-well locations.  Once aquired, one must choose a way to cross-reference the data.  My background is in Java, which isn't quite as nice as python, or pear to preform database manuplations on.  Therefore, staying in a spreasheet is like Calc (openoffice) or excel is my best option.



When cross referencing data in a spread sheet one uses the commands LOOKUP(), HLOOKUP, or VLOOKUP.  As my data happen to have to generally reference row-to-row, I will focus on the VLOOKUP() command.  VLOOKUP has the following properties:

VLOOKUP(Search Criterion, Array, Index, Sort Order)

"Search Criterion" - is the data (likely in the current row of the current spreadsheet) that you want to index against.

"Array" is the block of data containing the Search Criterion as the first column, and the data you want to import in a later column. 
  • For instance, if the feature I'm referencing spans column B from rows 2-51 the and the data I want to pull in is in the same rows of column D, the array would be structured:  
B2:D51
  • Most likely this data is contained in another sheet, or entirely different file.  For this case one must prefix the above block with the exact path to the workbook, the # symbok, and the name of the sheet of interest followed by a period.  For instance, if the data is contained in a sheet called "sheet1" in a separate file, called "referenceFile.ods",  in a folder called Documents in my home directory, one would put the following before the cell block specified above.
'file:///home/myusername/Documents/referenceFile.ods'#sheet1.
  • Finally, as we probably don't want the range of the block specified in the first part of this example to move, we need to add a '$' to each part of the range.  Thus the entire "Array" argument would look like:  
'file:///home/myusername/Documents/referenceFile.ods'#$sheet1.$B$2:$D$51

The "Index" argument is simply the number of columns to the right of my initial column specified in the "Array" argument.
  • In the example above, column D (the data I want to import) is 2 columns to the right of column B (my referencing column), thus the index is 2.

No comments:

Post a Comment