By Sarah L. Ketchley, Senior Digital Humanities Specialist
This month’s blog post will discuss how to start the work of sourcing research documents in Gale Primary Sources (GPS) archives, before transitioning seamlessly to Gale Digital Scholar Lab to create content sets, clean OCR text data, and conduct analyses of this material to answer research questions. With this methodology, researchers are able to use the rich contextual detail and varied navigation options to begin compiling their corpus of text data outside of the Lab, which can be an attractive option if the user has an existing working knowledge of specific GPS archives, such as The Times Digital Archive, Women’s Studies Archive, or Nineteenth Century Collections Online.
There is a standardised user experience across GPS and the Lab, making the transition from one to the other familiar and streamlined. However, there are options to view documents in GPS that aren’t yet available in the Lab, which may make combining both access points useful so that no document slips through the cracks!
What’s in the Archive?
To kick off your research, a recommended point of access is via the Gale Product Menu, which will list all the archives your institution has purchased, along with a brief description of what’s in each research product. You can click into specific archives from this page, many of which have useful Learning Centers with contextual information, sample search queries, and guided research questions.
Further information about archival content is available at support.gale.com in the Product Title Lists, which are downloadable as .txt or .xls files. This granular list provides a bird’s-eye view of the material available for research and can suggest search strategies as the researcher transitions into the GPS platforms.
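For researchers comfortable with a little scripting, a downloaded title list can also be skimmed programmatically. The following is only a minimal sketch: it assumes a tab-delimited .txt file with `Title` and `Archive` columns, which may not match the real files, and the sample rows and function names (`load_title_list`, `filter_titles`) are invented for illustration.

```python
import csv
import io

# Invented sample standing in for a downloaded title list. The real files
# may use different column names and a different delimiter (assumption).
SAMPLE_TITLE_LIST = (
    "Title\tArchive\n"
    "Example Daily Gazette\tArchive A\n"
    "Sample Women's Journal\tArchive B\n"
    "Sample Excavation Reports\tArchive C\n"
)


def load_title_list(text, delimiter="\t"):
    """Parse delimited title-list text into a list of dictionaries."""
    return list(csv.DictReader(io.StringIO(text), delimiter=delimiter))


def filter_titles(rows, keyword, column="Title"):
    """Return rows whose chosen column contains the keyword (case-insensitive)."""
    needle = keyword.lower()
    return [row for row in rows if needle in (row.get(column) or "").lower()]


rows = load_title_list(SAMPLE_TITLE_LIST)
for row in filter_titles(rows, "excavation"):
    print(row["Title"], "-", row["Archive"])
```

To use it with a real download, read the file's contents into a string and pass that to `load_title_list`, adjusting the column name and delimiter to whatever the file actually contains.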
Expanding Searches from Archive to GPS
Let’s consider how we can search within individual archives, and then broaden the search by shifting to Gale Primary Sources, which is a cross-search platform integrating all the archives available at your institution. The image above shows the landing page of The Times Digital Archive, which offers various options to orient the researcher, including details about how the archive was created, and a full Learning Center.
You can search within the archive using a basic or advanced search if you know what you’re looking for. But you can also browse the entire archive by date and click into individual issues to explore on a page-by-page basis, to have a more immersive experience within the print publications of the day.
The sidebar in each document gives additional ways to engage with the material: searching within the document or navigating through the table of contents on a page-by-page basis. While this work may be slow, data curation is an essential component of creating relevant content sets in the Lab, so taking the time to look closely at documents is worthwhile, and amply supported by a variety of ways to explore within individual archives.
Note one important difference between search results in the Lab and in GPS: GPS has subject expansion built into its search algorithm, while the Lab does not. As a result, GPS searches list more documents than those carried out in the Lab.
The image below shows a sample set of search results. Once the researcher is happy with the search string they’ve created, and has assessed the returned results, there are two options for continuing investigation. The first is to expand the search beyond whichever archive is being used (in this case, it’s The Times Digital Archive) by clicking the ‘Gale Primary Sources’ button in the ‘Broaden Your Search’ section. This carries the search string across to GPS, which will return results from all archives owned by your institution.
The second option is to select ‘Digital Scholar Lab’, and the search string will be carried across to the Lab, where a researcher can choose to add up to 10,000 documents from this search into a new or existing content set.
Starting with Gale Primary Sources
Another option to kick off research is to begin directly in Gale Primary Sources. This is a cross-search platform/discovery engine which will generate results from all the archives your institution has purchased.
A basic search looks for keywords across all digital archives, matching against the document title, author, date, publication title, subject indexing, and roughly the first hundred words of each document. For more granularity, it’s better to start with the Advanced Search option, which allows for precise searching based on parameters you define, such as the date, an author name, or a publication title.
Trends in GPS Search Results
GPS offers a couple of analysis methods for digging into trends and topics in a list of search results. These insights can be helpful in building out a list of themes to explore or flesh out with continued searching. Topic Finder and Term Frequency are both accessed from the right panel in the search results page.
Term Frequency gives a bird’s-eye view of the search terms over time. The visualisation is dynamic: clicking on an individual point in the display shows how many documents contain the search term in that year. A researcher can then click on this point and be taken to a list of those documents.
This results list, for a given year, could then be expanded on by following the link to broaden the search in GPS, or it could be taken into the Lab for further analysis, OCR text cleaning, and so on.
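The idea behind a term frequency view can be illustrated with a few lines of Python. This is a rough sketch of the general technique (counting, per year, the documents that mention a term), not a description of how GPS computes its chart. The sample records and the function name `documents_per_year` are invented for illustration.

```python
from collections import Counter

# Invented sample records: (publication year, document text).
SAMPLE_DOCS = [
    (1922, "Discovery of a royal tomb in the Valley of the Kings"),
    (1922, "Parliament debates the budget"),
    (1923, "The tomb is opened; crowds gather at the tomb entrance"),
    (1923, "Further finds reported from the tomb"),
    (1924, "Shipping news"),
]


def documents_per_year(docs, term):
    """Count, for each year, the documents whose text contains the term."""
    needle = term.lower()
    counts = Counter()
    for year, text in docs:
        # Each document counts once, however often the term appears in it.
        if needle in text.lower():
            counts[year] += 1
    return dict(sorted(counts.items()))


tomb_counts = documents_per_year(SAMPLE_DOCS, "tomb")
print(tomb_counts)
```

Each point on a term frequency chart corresponds to one year-and-count pair of this kind, which is why clicking a point can lead back to the list of matching documents for that year.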
Topic Finder provides an overview of the main topics found in the initial list of search results. In the example below, you can see that I’ve clicked into the ‘Tomb’ topic, which opens a more detailed window giving a graphical view of the topics with ‘tomb’ in them, along with a list of related documents that can then be explored individually. Again, this process can be incorporated into the researcher’s data curation efforts to dig into archival material and identify the most relevant documents for inclusion in content sets in the Lab.
Kickstarting Research
This post highlights the potential for multiple approaches to building out collections of research data using Gale Primary Sources, individual archives, and Gale Digital Scholar Lab. It’s often difficult to know where to start when accessing millions of pages of primary source material, but Gale offers solutions to kickstart research through product Learning Centers, expanded keyword results, and various viewing options to dig deep into the records of the past.
If you enjoyed reading this blog post, check out others in the ‘Notes from our DH Correspondent’ series, which include: