Code here written by Erica Krimmel. Please see Use Case: Find tissue samples for context.

# Load core libraries; install these packages if you have not already
library(ridigbio)
library(tidyverse)

# Load library for making nice HTML output
library(kableExtra)

We need to start with a data frame returned by the idig_search_records function that includes the recordset field (it is included by default). For simplicity's sake, you can name your own data frame records so that you can reuse the code in this example with minimal changes.

# Get data frame to use as example
records <- idig_search_records(rq = list(family = "veneridae", 
                                         county = "los angeles county"))

Our example records data frame looks like this:

| uuid | occurrenceid | catalognumber | family | genus | scientificname | country | stateprovince | geopoint.lon | geopoint.lat | datecollected | collector | recordset |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 00ea8cd3-68ee-48f3-b0e4-fa556bccd576 | urn:catalog:ucmp:i:237778 | 237778 | veneridae | saxidomus | saxidomus nuttalli | united states | california | NA | NA | NA | NA | 5ab348ab-439a-4697-925c-d6abe0c09b92 |
| 01f20e87-ba23-4edd-8a98-1c5bd47146e6 | urn:catalog:ucmp:i:233957 | 233957 | veneridae | amiantis | amiantis callosa | united states | california | NA | NA | NA | NA | 5ab348ab-439a-4697-925c-d6abe0c09b92 |
| 02210ee1-adb2-4657-b665-5e1120d5344c | urn:catalog:ucmp:i:237218 | 237218 | veneridae | globivenus | globivenus fordii | united states | california | NA | NA | NA | NA | 5ab348ab-439a-4697-925c-d6abe0c09b92 |
| 027eb9f9-80c2-4c53-9438-ae52487ebbbc | urn:catalog:ucmp:i:245128 | 245128 | veneridae | saxidomus | saxidomus nuttalli | united states | california | NA | NA | NA | NA | 5ab348ab-439a-4697-925c-d6abe0c09b92 |
| 02877ef3-7948-48f7-b579-1acaaaab38d5 | urn:catalog:ucmp:i:231776 | 231776 | veneridae | amiantis | amiantis callosa | united states | california | NA | NA | NA | NA | 5ab348ab-439a-4697-925c-d6abe0c09b92 |
| 03e33d03-6170-4093-8c8f-22c13c232048 | http://arctos.database.museum/guid/dmns:inv:14809?seid=2048855 | dmns:inv:14809 | veneridae | leukoma | leukoma laciniata | united states | california | -118.1336 | 33.77104 | NA | collector(s): james e. steadman | 1e86442f-35a5-4e7b-9a38-4599e4d3b510 |

We will use attributes attached to our records data frame to find contact information for each of the recordsets providing data here. For background reading on what we mean by attributes, see Hadley Wickham’s explanation in Advanced R. We can use attributes here because the ridigbio package structures the results of the idig_search_records function in a specific way. The code below will not work as expected with a data frame that did not originate from the idig_search_records function.
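
To make the idea of attributes concrete, here is a minimal base R sketch on toy data. The data frame toy_records and the simplified contents of its "attribution" attribute are hypothetical stand-ins; the real attribute returned by idig_search_records holds more fields per recordset.

```r
# Toy stand-in for a data frame returned by idig_search_records();
# the values and the simplified "attribution" structure are hypothetical
toy_records <- data.frame(
  uuid = c("a1", "a2", "a3"),
  recordset = c("rs-1", "rs-1", "rs-2"),
  stringsAsFactors = FALSE
)

# Attach metadata to the data frame as an attribute (a list of nested lists)
attr(toy_records, "attribution") <- list(
  list(uuid = "rs-1", name = "Example Collection A"),
  list(uuid = "rs-2", name = "Example Collection B")
)

# List the names of all attributes attached to the data frame
attribute_names <- names(attributes(toy_records))

# Retrieve one attribute by name
attribution <- attr(toy_records, "attribution")

# Pull a single field out of each nested list
recordset_names <- vapply(attribution, function(x) x$name, character(1))
```

The attribute travels with the data frame without appearing as a column, which is why the metadata is easy to overlook.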

# Count how many records in the data were contributed by each recordset
recordtally <- records %>% 
  group_by(recordset) %>% 
  tally()

# Get metadata from the attributes of the `records` data frame
collections <- tibble(collection = attr(records, "attribution")) %>% 
  # Expand information captured in nested lists
  hoist(collection, 
        recordset_uuid = "uuid",
        recordset_name = "name",
        recordset_url = "url",
        contacts = "contacts") %>% 
  # Get rid of extraneous attribution metadata
  select(-collection) %>% 
  # Give each contact its own row
  unnest_longer(contacts) %>% 
  # Spread the fields of each contact (name, role, email) into columns
  unnest_wider(contacts) %>% 
  # Remove any contacts without an email address listed
  filter(!is.na(email)) %>% 
  # Get rid of duplicate contacts within the same recordset
  distinct() %>% 
  # Rename some columns
  rename(contact_role = role, contact_email = email) %>% 
  # Group first and last names together in the same column
  unite(col = "contact_name", 
        first_name, last_name, 
        sep = " ", 
        na.rm = TRUE) %>% 
  # Restructure data frame so that there is one row per recordset
  group_by(recordset_uuid) %>% 
  mutate(contact_index = row_number()) %>% 
  pivot_wider(names_from = contact_index,
              values_from = c(contact_name, contact_role, contact_email)) %>%
  # Include how many records in the data were contributed by each recordset
  left_join(recordtally, by = c("recordset_uuid"="recordset")) %>% 
  # Rearrange columns so that contact information is grouped by person
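  # (`ends_with("_1")` matches only contact 1, whereas `contains("1")` would
  # also match contact 10 and above)
  select(starts_with("recordset"),
         "recordset_recordtally" = n,
         ends_with("_1"),
         ends_with("_2"),
         ends_with("_3"),
         ends_with("_4"),
         ends_with("_5"),
         ends_with("_6"),
         ends_with("_7"),
         ends_with("_8"),
         ends_with("_9"),
         ends_with("_10"),
         everything()) %>% 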
  # Get rid of any rows which don't actually contribute data to `records`;
  # necessary because the attribute metadata by default includes all recordsets
  # in iDigBio that match the `idig_search_records` query, even if you filter
  # or limit those results in your own code
  filter(recordset_uuid %in% records$recordset)
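
The nesting steps in the pipeline above can be hard to picture, so here is a small sketch of hoist, unnest_longer, and unnest_wider on toy data. The list toy_attribution, its field values, and the extra description field are hypothetical; they only imitate the general shape of the attribution metadata.

```r
library(dplyr)
library(tidyr)
library(tibble)

# Hypothetical nested list imitating the shape of the attribution metadata
toy_attribution <- list(
  list(uuid = "rs-1", name = "Example Collection A",
       description = "Not needed here",
       contacts = list(
         list(first_name = "Ada", last_name = "Alpha",
              role = "Curator", email = "ada@example.org"),
         list(first_name = "Ben", last_name = "Beta",
              role = "Programmer", email = "ben@example.org"))),
  list(uuid = "rs-2", name = "Example Collection B",
       description = "Not needed here",
       contacts = list(
         list(first_name = "Cy", last_name = "Gamma",
              role = "Collection Manager", email = "cy@example.org")))
)

toy_contacts <- tibble(collection = toy_attribution) %>%
  # Pull selected fields out of each nested list into their own columns
  hoist(collection,
        recordset_uuid = "uuid",
        recordset_name = "name",
        contacts = "contacts") %>%
  # Drop what remains of the nested list (here, only `description`)
  select(-collection) %>%
  # One row per contact
  unnest_longer(contacts) %>%
  # One column per contact field
  unnest_wider(contacts)
```

The result has one row per contact, with the recordset identifier and name repeated on each row. The later steps of the pipeline then reshape this to one row per recordset.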

Our newly constructed collections data frame contains contact information for each of the collections (i.e. recordsets) providing data, and looks like this:

recordset_uuid recordset_name recordset_url recordset_recordtally contact_name_1 contact_role_1 contact_email_1 contact_name_2 contact_role_2 contact_email_2 contact_name_3 contact_role_3 contact_email_3 contact_name_4 contact_role_4 contact_email_4 contact_name_5 contact_role_5 contact_email_5 contact_name_6 contact_role_6 contact_email_6 contact_name_7 contact_role_7 contact_email_7 contact_name_8 contact_role_8 contact_email_8
5ab348ab-439a-4697-925c-d6abe0c09b92 University of California Museum of Paleontology 625 Joyce Gross Programmer Patricia Holroyd Museum Scientist Diane Erwin Senior Museum Scientist for Paleobotany Erica Clites Museum Scientist for Invertebrate Paleontology NA NA NA NA NA NA NA NA NA NA NA NA
5082e6c8-8f5b-4bf6-a930-e3e6de7bf6fb LACM Invertebrate Paleontology https://nhm.org/site/research-collections/invertebrate-paleontology 63 Austin Hendy Collection Manager William Mertz Database Manager Kevin Love NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA
6bb853ab-e8ea-43b1-bd83-47318fc4c345 UF Invertebrate Zoology 54 Gustav Paulay Curator of Invertebrate Zoology Warren Brown IT Director NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA
bd61c458-b865-4b05-9f1f-735c49066e55 CAS Invertebrate Zoology (IZ) http://www.calacademy.org/scientists/izg-collections 26 Stanley Blum Research Information Manager Jon Fong Programmer Christina Piotrowski IZ Collections Manager, Department of Invertebrate Zoology & Geology NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA
41b119de-f745-482d-be42-a0155bc76e5d CMC Cincinnati Museum Center Invertebrate Paleontology 16 Brenda Hunda Curator of Invertebrate Paleontology Anne Kling Manager, Collection Databases and Websites NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA
e8a10a16-86af-42b2-be40-9d6a1b21859a CHAS Malacology Collection (Arctos) http://www.naturemuseum.org/the-museum/collections/invertebrates 13 Dawn Roberts Director of Collections Erica Krimmel Assistant Collections Manager David Bloom Coordinator John Wieczorek Information Architect NA NA NA NA NA NA NA NA NA NA NA NA
774a153b-e556-47f6-95d1-bab49e61cc58 ANSP Malacology 7 Collections Management Biodiversity Informatics Manager Biodiversity Informatics Manager NA Collection Management Biodiversity Informatics Manager NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA
1ba0bbad-28a7-4c50-8992-a028f79d1dc5 University of Florida Invertebrate Paleontology 6 Roger Portell Collection Manager Office of Museum Technology OMT OMT NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA
1e86442f-35a5-4e7b-9a38-4599e4d3b510 DMNS Marine Invertebrate Collection (Arctos) http://www.dmns.org/science/collections/dmns-zoology-collections 5 Paula Cushing Curator of Invertebrate Zoology Laura Russell VertNet Programmer David Bloom VertNet Coordinator John Wieczorek Information Architect Dusty McDonald Arctos Database Programmer Phyllis Sharp Departmental Associate Bryan Johnson Departmental Associate NA NA NA
137ed4cd-5172-45a5-acdb-8e1de9a64e32 Invertebrate Paleontology Division, Yale Peabody Museum 3 Larry Gall Head, Computer Systems Office Susan Butts Division of Invertebrate Paleontology NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA
97058091-eb35-401b-b286-18465761f832 Delaware Museum of Natural History – Mollusks http://www.delmnh.org/mollusks/ 1 NA Elizabeth Shea NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA
a6eee223-cf3b-4079-8bb2-b77dad8cae9d NMNH Extant Specimen Records http://collections.nmnh.si.edu 1 Thomas Orrell NMNH Informatics Chris Tuccinardi Information Management Karen Reed Data Manager Jessica Bird Collections Information Manager Jeff Williams Collection Manager Kenneth Tighe Database Coordinator Brian Schmidt Museum Specialist Ingrid Rochon Scientific Data Manager

We can contact each collection by looking for the most appropriate person listed in each row, often someone with the role of “collection manager” or “curator.” Because each collection publishes this kind of metadata separately, sometimes the contacts listed also include people who are not directly responsible for managing physical specimens, and who may not be able to help you. These people often have roles such as “information architect,” “programmer,” or “database manager.” All contacts listed per recordset have been included here, and it is up to you to decide who to reach out to.
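
As a rough aid for that decision, the sketch below flags contacts whose role suggests responsibility for physical specimens. It uses base R on a hypothetical long-format table (one row per contact), and the pattern in specimen_roles is an assumption that you would likely need to adjust, since role names are free text and vary between collections.

```r
# Hypothetical contacts in long format: one row per contact
contacts_long <- data.frame(
  recordset_uuid = c("rs-1", "rs-1", "rs-2", "rs-2", "rs-3"),
  contact_name = c("Ada Alpha", "Ben Beta", "Cy Gamma", "Di Delta", "Ed Epsilon"),
  contact_role = c("Curator of Invertebrate Zoology", "Programmer",
                   "Collection Manager", "Information Architect",
                   "IZ Collections Manager"),
  stringsAsFactors = FALSE
)

# Assumed pattern for roles that usually involve physical specimens;
# matches "curator", "collection manager", and "collections manager"
specimen_roles <- "curator|collections? manager"

# Keep only contacts whose role matches the pattern, ignoring case
likely_contacts <- contacts_long[
  grepl(specimen_roles, contacts_long$contact_role, ignore.case = TRUE),
]
```

A filter like this is only a starting point; it is still worth reading the full list of contacts for any recordset where nothing matches.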

It is frequently helpful to provide your collection contact with a spreadsheet listing the specimen records you are interested in. We can generate these spreadsheets automatically, as shown in the code below.

# Generate a spreadsheet for each recordset containing only the rows provided by
# that recordset, and named according to the recordset uuid.
for (i in seq_along(collections$recordset_uuid)) {
  
  # Build the file name for the current recordset only
  filename <- paste0("records_", collections$recordset_uuid[i], ".csv")
  
  # Keep only the rows contributed by the current recordset
  records_subset <- records %>% 
    filter(recordset == collections$recordset_uuid[i])
  
  # Save files to your working directory
  write_csv(records_subset, filename)
}
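
If you would rather keep these files out of your working directory, one possible variation is to split the data by recordset and write into a dedicated folder. The sketch below uses only base R and toy data; toy_records and the folder name records_by_recordset are hypothetical, and tempdir() is used here only so that the example leaves nothing behind after the R session ends.

```r
# Toy stand-in for the `records` data frame
toy_records <- data.frame(
  uuid = c("a1", "a2", "a3"),
  recordset = c("rs-1", "rs-1", "rs-2"),
  stringsAsFactors = FALSE
)

# Create an output folder (hypothetical name) inside the session's temp directory
out_dir <- file.path(tempdir(), "records_by_recordset")
dir.create(out_dir, showWarnings = FALSE)

# Split the records into a named list with one data frame per recordset
by_recordset <- split(toy_records, toy_records$recordset)

# Write one CSV per recordset and remember the paths that were written
written <- character(0)
for (rs in names(by_recordset)) {
  path <- file.path(out_dir, paste0("records_", rs, ".csv"))
  write.csv(by_recordset[[rs]], path, row.names = FALSE)
  written <- c(written, path)
}
```

For real use you would replace the temporary folder with a path of your choosing.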

For specific research requests, there are many ways you could modify the code demonstrated here to be more helpful, e.g. by including additional fields available through idig_search_records. See also the ridigbio function idig_build_attrib, which summarizes the recordsets used by records in the data frame but does not include contact information.