Search – A Basic Guide

There are several types of search on archive.org:

  • General Metadata Search
  • Full-Text Search
  • Bookreader Text Search
  • TV News Captions
  • Wayback Machine Search

You may also be looking for:

  • Where is advanced search?
  • What search APIs are available?
  • Can I search by Creative Commons license?
  • How do I sort search results?
  • How do I search just within a collection?
  • How can I view search results as a list instead of picture tiles?
  • What is indexed in the search engine?
  • What are the lists of metadata on the left side of search results pages?

General Metadata Search

This search uses the metadata on item pages, such as title, creator, description, and subjects. It returns results from pages on the site.

Boolean operators (AND, OR, AND NOT) and ranges using TO work for this search.
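As an illustration of how the operators combine, here is a minimal Python sketch that assembles a query string. The helper names are hypothetical, and the field names (creator, title, date) and the bracketed form of the TO range are assumptions based on common Lucene-style syntax, not something this guide specifies.

```python
def field(name: str, value: str) -> str:
    """Build a fielded clause, quoting values that contain spaces."""
    if " " in value:
        value = f'"{value}"'
    return f"{name}:{value}"


def value_range(name: str, start: str, end: str) -> str:
    """Build a range clause using TO (bracket syntax is an assumption)."""
    return f"{name}:[{start} TO {end}]"


def all_of(*clauses: str) -> str:
    """Require every clause (AND must be upper case)."""
    return " AND ".join(clauses)


def any_of(*clauses: str) -> str:
    """Accept any clause (OR), grouped in parentheses."""
    return "(" + " OR ".join(clauses) + ")"


def exclude(query: str, clause: str) -> str:
    """Drop results matching the clause (AND NOT)."""
    return f"{query} AND NOT {clause}"


query = exclude(
    all_of(
        field("creator", "Mark Twain"),
        value_range("date", "1880-01-01", "1889-12-31"),
    ),
    field("title", "sawyer"),
)
print(query)
# creator:"Mark Twain" AND date:[1880-01-01 TO 1889-12-31] AND NOT title:sawyer
```

The resulting string can be pasted into the search box with Search metadata selected.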

To use this on the site, select Search metadata in the drop-down that appears when you type a query in the input text field.

When you enter text into the main search bar on the homepage, you can also choose Search metadata.

For more complex queries, click on the blue Advanced Search link below the input text field.

You will be taken to the Advanced Search page.

The Advanced Search also uses item metadata on item pages.

It lets you use a form to build more complex search queries and to output the results in several formats: JSON, XML, HTML, CSV, and RSS.

You will find more information and help below the form.

To do an advanced search, click on the following link:

Advanced search 
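For readers who want the JSON or CSV output from a script rather than from the form, the sketch below builds an Advanced Search URL in Python. The endpoint path and the parameter names (q, fl[], rows, output) are assumptions that do not come from this guide, so check them against the help text below the form before relying on them. The function only builds the URL and does not send a request.

```python
from urllib.parse import urlencode

# Assumed endpoint for the Advanced Search form; verify before use.
ADVANCED_SEARCH_URL = "https://archive.org/advancedsearch.php"


def advanced_search_url(query, fields=("identifier", "title"), rows=50, output="json"):
    """Return a URL that asks for `query` results in the given output format."""
    params = [("q", query)]
    # One fl[] entry per metadata field to include in the results.
    params += [("fl[]", name) for name in fields]
    params += [("rows", rows), ("output", output)]
    return f"{ADVANCED_SEARCH_URL}?{urlencode(params)}"


url = advanced_search_url("title:dog AND creator:smith", output="csv")
print(url)
```

Building the URL separately from fetching it makes it easy to paste the result into a browser and compare it with what the form produces.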

Full-Text Search (FTS)

This allows you to search inside books and other text items.

It uses the OCR text derived from processing uploaded text-format files, such as PDFs or zips of scanned images.

It will search all texts on the site.

To use this on the site, select Search text contents in the drop-down that appears when you type a query in the input text field.

Bookreader Text Search

This allows you to search the contents of a single item when using the bookreader. The search field is in the upper right of the bookreader.

Enter the search term into the search bar that says Search Inside. The system will search inside the book, and results will appear as blue icons.

Mouse over the blue icon.

Click on the preferred search result to be taken to the correct place in the book.

TV News Captions

This uses the search engine for the TV News Archive. It searches the closed-caption files of captured TV news items.

To use this on the site, select Search TV news captions in the drop-down that appears when you type a query in the input text field.

Alternatively, click on the Video icon.

Select the TV News category.

Enter your criteria into the search bar.

Wayback Machine Search

To search for websites on the Wayback Machine, enter the specific URL of the website into the search bar and select Search archived websites in the drop-down.

Alternatively, click on the Web icon on the upper left side of the black bar.

Enter the correct URL into the Wayback Machine search bar.

Select the desired year.

Select the date and the time.

You will be taken to a snapshot of the website.

For more information on using search on the Wayback Machine, visit this link: Wayback Machine search.
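The year, date, and time you select in the steps above end up as a timestamp in the snapshot's address. As a rough sketch, assuming the commonly seen address pattern of web.archive.org/web/ followed by a 14-digit YYYYMMDDhhmmss timestamp and the original URL (a pattern this guide does not document), a snapshot address could be built like this in Python. The function name is hypothetical.

```python
from datetime import datetime

# Assumed prefix for Wayback Machine snapshot addresses.
WAYBACK_PREFIX = "https://web.archive.org/web"


def snapshot_url(url: str, when: datetime) -> str:
    """Return the assumed snapshot address for `url` at the moment `when`."""
    timestamp = when.strftime("%Y%m%d%H%M%S")  # e.g. 20150601120000
    return f"{WAYBACK_PREFIX}/{timestamp}/{url}"


print(snapshot_url("http://example.com/", datetime(2015, 6, 1, 12, 0, 0)))
# https://web.archive.org/web/20150601120000/http://example.com/
```

A capture may not exist at the exact moment requested, so treat the timestamp as a starting point rather than a guarantee.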

Where is advanced search?

You will find Advanced Search at the bottom of the drop-down or below the search input field.

What search APIs are available?

Information about how to use the various search APIs can be found at aboutsearch.

Can I search by Creative Commons license?  

Yes, you can. But it’s a little complicated.

1. Find the license types on the Creative Commons website. You can do so at the following link: Creative Commons.

2. Scroll down the page on the Creative Commons website to find the abbreviations.

3. To find all of the items that an uploader has assigned a certain license, plug the abbreviation for that license into this search query:

licenseurl:http*abbreviation*

So if you’re looking for Attribution Non-commercial No Derivatives (by-nc-nd), you’d put this in the search box:

licenseurl:http*by-nc-nd*

If you want to use this in combination with other queries, like “I want by-nc-nd items about dogs” you’d do this:

licenseurl:http*by-nc-nd* AND dog

The AND tells the search engine all the items returned should have that license AND they should contain the word dog. AND has to be in all caps.
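The pattern above is easy to get wrong by hand (a missing asterisk or space changes the query), so here is a small Python sketch that builds it. The function name and parameters are hypothetical, and the sketch only assembles the query string shown in this guide.

```python
def license_query(abbreviation, extra_terms=None):
    """Build a licenseurl query for a Creative Commons abbreviation.

    `abbreviation` is the license short name, e.g. "by-nc-nd".
    `extra_terms` is an optional additional query joined with AND.
    """
    query = f"licenseurl:http*{abbreviation}*"
    if extra_terms:
        # AND must be upper case and surrounded by spaces.
        query = f"{query} AND {extra_terms}"
    return query


print(license_query("by-nc-nd"))         # licenseurl:http*by-nc-nd*
print(license_query("by-nc-nd", "dog"))  # licenseurl:http*by-nc-nd* AND dog
```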

Just to make it easier, here are the basic searches:

How do I sort search results?

The Sort by bar has options that let you control which results are displayed, in what order, and in what view.

How do I search just within a collection?

On a collection page, there will be a Search this Collection input field on the left side of the page.

Enter a term there and press the Return/Enter key.

The results will include only items in that collection.

For advanced Boolean search, you can add AND collection:[IDENTIFIER] to your query, where [IDENTIFIER] is the identifier of the collection.
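A minimal Python sketch of that pattern follows. The function name is hypothetical and the collection identifier in the example is a placeholder, not a real collection. Wrapping the original query in parentheses is a design choice so that an OR inside it cannot escape the collection restriction.

```python
def within_collection(query: str, collection_id: str) -> str:
    """Restrict `query` to items in the collection with this identifier."""
    return f"({query}) AND collection:{collection_id}"


# "example_collection" is a placeholder identifier.
print(within_collection("dog OR cat", "example_collection"))
# (dog OR cat) AND collection:example_collection
```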

How can I view search results as a list instead of picture tiles?

For most search results pages, you can choose the view in the Sort by bar: Tile view (the icon with three rectangles) or List view (the icon with multiple lines). Tile view is the default. Click the view you prefer, and your choice will “stick” as you navigate the site until you switch back to the other view.

What is indexed in the search engine?

Only the metadata in an item page is indexed.

The search engine does not have the text of books, individual file metadata, or embedded metadata.

What are the lists of metadata on the left side of search results pages?

Those are facets: categories of metadata that, when selected, narrow your results.

The number to the right of each facet value indicates how many results match that selection.

You can select multiple facets to narrow down your search.

Clicking More will show additional facets to choose from. 

Once you have finished selecting your filters, click Apply your filters.