How to upload scanned images to make a book

Internet Archive has millions of books available for free download, but we still don’t have them all. If you would like to add a book to the library, we encourage you to scan and upload it if you believe you have the appropriate rights to do so.

NOTE: If you are starting with a single PDF file (as opposed to images of book pages), the upload process is simpler. In that case, review the information below starting from step 4, keeping in mind that the MARC record is optional. You can simply upload the PDF and enter the metadata by hand as you would with any other type of media.

Basic steps to upload images for a book:

  1. Scan the book
  2. Create a .zip or .tar file of images with the pages in sequential order
  3. Name the .zip/.tar file correctly (e.g. identifier_images.zip)
  4. (Optional) Prepare a correctly named MARC metadata record (e.g. identifier_marc.xml)
  5. Upload the files to archive.org
  6. Verify the metadata is correct and the book formats were derived properly

Details for Each Step

1. Scan the book

There are many ways to create images of book pages. Librarians may have special equipment for scanning books, while home users may use a combination scanner/printer to capture images. These are the key things to remember when creating page images:

  • Scan all pages, including the front and back covers.
  • Scan at the highest possible quality level (we recommend 300 dpi minimum).
  • Image files can end with .jp2, .jpg, .jpeg, .tif, .tiff, .gif, .bmp or .png. Any combination of these can be used.
  • Check the quality of the images before uploading:
    • No pages are skipped.
    • All images are crisp.
  • If possible, crop your scanned images to show just the page.
  • If possible, make sure the lines of text in the images are exactly horizontal (or vertical for some languages).
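
As a quick pre-upload check, the list of accepted file endings above can be turned into a small script. This is only a sketch: the function name `find_unsupported_files` is made up for illustration, and it checks file endings only, not scan resolution or sharpness.

```python
from pathlib import Path

# File endings listed in this guide as accepted for page images.
ACCEPTED_EXTENSIONS = {".jp2", ".jpg", ".jpeg", ".tif", ".tiff", ".gif", ".bmp", ".png"}


def find_unsupported_files(folder):
    """Return the names of files in `folder` whose ending is not on the accepted list."""
    return sorted(
        path.name
        for path in Path(folder).iterdir()
        if path.is_file() and path.suffix.lower() not in ACCEPTED_EXTENSIONS
    )
```

An empty result means every file in the folder has one of the accepted endings. Stray files such as notes or hidden system files will show up in the returned list so you can remove them before compressing the folder.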

2. Create a .zip or .tar file of the images in sequential order

  • Make sure the file names for the page images are in sequential order. The front cover should come first, followed by all the pages of the book in order, ending with the back cover.
  • Put the files in a folder (they may be in one already).
  • Turn the folder into a .zip or .tar file. There are many methods for doing this, but these are the most common:
    • On a Mac, control-click on the folder and choose Compress from the menu that appears.
    • In Windows, right-click on the folder and choose Send to, then Compressed (zipped) folder.
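
If you prefer to script this step, the following sketch builds the .zip with Python's standard `zipfile` module. The function name `zip_pages` and the zero-padded naming scheme are illustrative choices, not something archive.org requires, and the sketch places the images at the top level of the archive rather than inside a folder.

```python
import zipfile
from pathlib import Path


def zip_pages(page_files, zip_path):
    """Write page images to a .zip, renamed so that name order matches page order.

    `page_files` must already be in reading order: front cover first,
    back cover last.
    """
    # Page images are usually compressed already, so they are stored as-is.
    with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_STORED) as archive:
        for number, page in enumerate(page_files, start=1):
            page = Path(page)
            # Zero-padded names (0001.jpg, 0002.jpg, ...) sort the same way
            # alphabetically and numerically.
            archive.write(page, arcname=f"{number:04d}{page.suffix.lower()}")
    return zip_path
```

Zero-padding avoids the common problem where a name like page10.jpg sorts before page2.jpg.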

3. Name the .zip/.tar file correctly (e.g. identifier_images.zip)

It is critical to name your files correctly for this process to work smoothly. On archive.org your uploaded book will have a unique identifier that is part of the URL.

Example: https://archive.org/details/leatherwork00wils

In this example, leatherwork00wils is the unique identifier for the book. You may choose your own unique identifier (as long as it is not already in use).

You will use this identifier to name your .zip/.tar file (and any MARC metadata file you choose to upload). Your file name should begin with your unique identifier and end in either _images.zip or _images.tar.

Example: leatherwork00wils_images.zip
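
The naming rule can be expressed in a few lines of code. In this sketch the helper names `identifier_from_details_url` and `images_archive_name` are hypothetical; they only restate the convention described above and do not check whether an identifier is already in use.

```python
from urllib.parse import urlparse


def identifier_from_details_url(url):
    """Extract the identifier from a URL such as https://archive.org/details/<identifier>."""
    parts = urlparse(url).path.strip("/").split("/")
    if len(parts) < 2 or parts[0] != "details":
        raise ValueError(f"not a details URL: {url}")
    return parts[1]


def images_archive_name(identifier, archive_type="zip"):
    """Build the expected name of the image archive for an identifier."""
    if archive_type not in ("zip", "tar"):
        raise ValueError("archive_type must be 'zip' or 'tar'")
    return f"{identifier}_images.{archive_type}"
```

For the example above, `images_archive_name(identifier_from_details_url("https://archive.org/details/leatherwork00wils"))` gives `leatherwork00wils_images.zip`.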

4. (Optional) Prepare a correctly named MARC metadata record (e.g. identifier_marc.xml)

A MARC record is a MAchine-Readable Cataloging record. It contains metadata such as title, author, description, and subjects. It is the digital equivalent of a catalog card.

You are not required to upload a MARC record with your book. You can type in all of the appropriate metadata by hand in the next step.

Advantages of uploading a MARC record:

  • Less typing! We will pull the metadata out of the MARC record for you.
  • Libraries love them! Having a MARC record means libraries can more easily use or refer to the book you upload.

You can find MARC records online at the Library of Congress. Search their catalog for your book; if they have it, you can download either a MARCXML record or a MARC21 record.

If you have a MARCXML record, name your file with your identifier and the ending _marc.xml:

Example: leatherwork00wils_marc.xml

If you have a MARC21 record, name your file with your identifier and the ending _meta.mrc:

Example: leatherwork00wils_meta.mrc
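
If you are not sure which kind of record you downloaded, a rough check is possible. The sketch below assumes that a MARCXML file starts with a `<` character (after any whitespace or byte-order mark), and treats anything else as MARC21. The function name `marc_file_name` is hypothetical, and this is a heuristic, not a validator.

```python
def marc_file_name(identifier, record_bytes):
    """Pick the file name for a MARC record based on a simple content check."""
    # Skip a UTF-8 byte-order mark and leading whitespace, then look at the first byte.
    first_byte = record_bytes.lstrip(b"\xef\xbb\xbf \t\r\n")[:1]
    # Heuristic: XML documents begin with "<"; anything else is treated as MARC21.
    suffix = "_marc.xml" if first_byte == b"<" else "_meta.mrc"
    return identifier + suffix
```

To use it, read the downloaded file in binary mode and pass its contents along with your identifier.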

5. Upload the files to archive.org

You must have a free archive.org account to upload. Sign in to the site and click the upload icon in the navigation bar.

  • Press the green Upload Files button.
  • Drag and drop your .zip/.tar file (and optional MARC file) into the grey area on the upload page, or use the Choose files to upload button to find the files.
  • Make sure the Page URL ends in the unique identifier that you chose for your files.
  • Fill out the Title, Description, Subject, Creator and Date fields:
    • If you did NOT upload a MARC record, please fill out all of these fields in as much detail as possible. This is how people can find your book! More is better!
    • If you uploaded a MARC record, you can enter minimal metadata for the description and subject. After we process your book, we will automatically pull all of the metadata out of the MARC record to populate these fields.
  • Select a collection from the drop-down menu (Community Texts, unless you have your own collection).
  • Select a language. This step is critical to make sure we OCR your images correctly. OCR (optical character recognition) is the process of turning your images into text.
  • Select a license if you know what it should be (or leave this blank).
  • Click Submit.
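
This guide describes the web upload form, but the same fields can be prepared in a script. The sketch below is an alternative route, not the method described above. It assumes the third-party `internetarchive` Python package is installed and that your account credentials have been set up with its `ia configure` command. It also assumes that `opensource` is the collection name behind Community Texts and that a language code such as `eng` is acceptable. The helper names `build_metadata` and `upload_book` are hypothetical.

```python
def build_metadata(title, creator, date, description, subjects, language,
                   collection="opensource", license_url=None):
    """Collect the upload form fields into a metadata dictionary."""
    metadata = {
        "mediatype": "texts",      # books are text items
        "collection": collection,  # assumed name for Community Texts
        "title": title,
        "creator": creator,
        "date": date,
        "description": description,
        "subject": list(subjects),
        "language": language,      # used to choose the OCR language
    }
    if license_url:
        metadata["licenseurl"] = license_url
    return metadata


def upload_book(identifier, files, metadata):
    """Upload the image archive (and optional MARC file) under `identifier`."""
    # Imported here so build_metadata works without the third-party package.
    from internetarchive import upload
    return upload(identifier, files=files, metadata=metadata)
```

A call might look like `upload_book("leatherwork00wils", ["leatherwork00wils_images.zip"], metadata)`, where `metadata` comes from `build_metadata`. As with the web form, the language field matters most for getting good OCR results.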

Depending on the speed of your internet connection, the size of your files, and how busy our website is, it may take several minutes (or longer!) to complete your upload and create a page for your book.

When the process is complete, you will be automatically redirected to your item page. 

6. Verify the metadata is correct and the book formats were derived properly

After your upload is complete, it will take anywhere from minutes to hours for us to process (derive) your images into all of our book formats.

In the meantime, your item page will show a message indicating that the book is still being processed.

When this process is complete, you can simply refresh the page and you will see a preview of your book at the top of the page and all of the derived file formats in the download options section on the right.

If you uploaded a MARC record, you will also see the full metadata that was pulled from the record and put into the page.

Please take a moment to verify all of your metadata.
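
Verification can also be done from a script. The sketch below assumes the item's record is available as JSON at `https://archive.org/metadata/<identifier>`, and that each entry in its `files` list carries `source` and `format` fields, with derived files marked as `derivative`. The function names are hypothetical; treat the field names as assumptions to confirm against a real response.

```python
import json
from urllib.request import urlopen


def fetch_item_record(identifier):
    """Download the JSON record for an item (requires network access)."""
    with urlopen(f"https://archive.org/metadata/{identifier}") as response:
        return json.load(response)


def derived_formats(item_record):
    """List the formats of files marked as derivatives in an item record."""
    return sorted({
        entry["format"]
        for entry in item_record.get("files", [])
        if entry.get("source") == "derivative" and "format" in entry
    })
```

If `derived_formats(fetch_item_record("leatherwork00wils"))` returns an empty list for your own item, processing has probably not finished yet.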

If you entered temporary placeholder text during the upload process, you should remove it now by clicking the Edit link near the title.

At this point, you should have a beautiful ebook! You can download it in various formats, view it full screen in the book reader, share it on social media, or embed it on another website.

Thanks for adding a book to the library!
