How to download sample files

Google Settings

If you want to download random documents, such as the Microsoft Word files used in the article “What document properties are actually used in .doc/.docx files”, go to Google and click on Advanced Search.

Unfortunately, the location of Advanced Search changes according to which version of Google you are using. One way to reach it is to run any search, click on the settings button (the wheel icon), and then click on Advanced Search.

Google Advanced Search

In the Advanced Search window, go to File Type and select the type of file you want. The options are:

  • Adobe Acrobat PDF (.pdf)
  • Adobe Postscript (.ps)
  • Autodesk DWG (.dwg)
  • Google Earth KML or KMZ (.kml / .kmz)
  • Microsoft Excel (.xls)
  • Microsoft PowerPoint (.ppt)
  • Microsoft Word (.doc)
  • Rich Text Format (.rtf)
  • Shockwave Flash (.swf)

You may also want to change the Language of the files. In the article “What document properties are actually used in Microsoft Excel files”, files were downloaded in English, French, Spanish, Portuguese, German and Russian to obtain a wide selection.

Then click Advanced Search.

Results of Google Search

Note that the list above does not include “.docx” (i.e. Microsoft Word 2007 documents) or the equivalents for Excel (“.xlsx”) and PowerPoint (“.pptx”). This is easily remedied: once the results are shown, add the letter “x” after “filetype:doc” in the search bar so that it reads “filetype:docx”, and search again.
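The same search can also be written as a URL rather than built through the Advanced Search form. The following is only a minimal sketch: the function name `build_search_url` is hypothetical, and the `lr=lang_xx` language parameter is an assumption about Google's URL format that may not behave the same way in every version of Google. Only the `filetype:` operator comes from the steps above.

```python
from urllib.parse import urlencode


def build_search_url(terms, filetype, language=None):
    """Return a Google search URL restricted to one file type.

    `terms` is the free-text part of the query, `filetype` is an
    extension such as "doc" or "docx" (without the dot).
    """
    # The "filetype:" operator is what Advanced Search adds to the query.
    query = f"{terms} filetype:{filetype}".strip()
    params = {"q": query}
    if language:
        # Assumption: "lr=lang_xx" restricts results to one language.
        params["lr"] = f"lang_{language}"
    return "https://www.google.com/search?" + urlencode(params)


if __name__ == "__main__":
    # One URL per file type that is missing from the Advanced Search list.
    for ext in ("docx", "xlsx", "pptx"):
        print(build_search_url("report", ext))
```

Building the query this way makes it easy to repeat the same search for several extensions or languages without revisiting the form each time.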

By default, only 10 results are returned per page. This can be increased to 100 by:

  • Clicking on the settings icon,
  • Changing “Google Instant predictions” to “Never show Instant results”,
  • Changing “Results per page” to 100 results, and
  • Clicking “Save”.

To download all the results, use Download Master for Chrome, which can download every file with a specified extension (such as Word documents) that is linked from a webpage. Some downloads will fail, especially when the files are large, as PowerPoint presentations often are. If this happens, open the second page of Google results and download those files as well, so that you still obtain a sufficiently large sample.
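For readers who prefer a script, the idea of collecting every link with a given extension can be sketched in a few lines. This is not how Download Master works internally; it is a rough illustration using only the Python standard library, and the names `LinkCollector` and `links_with_extension` are hypothetical. It assumes you have already saved the HTML of a page and that its links point directly at the files, which may not be true of a Google results page.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse


class LinkCollector(HTMLParser):
    """Collect the href value of every <a> tag in a page."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)


def links_with_extension(html, base_url, extension):
    """Return absolute URLs linked from `html` whose path ends in `extension`."""
    parser = LinkCollector()
    parser.feed(html)
    suffix = "." + extension.lower().lstrip(".")
    matches = []
    for href in parser.hrefs:
        # Resolve relative links against the address of the page.
        absolute = urljoin(base_url, href)
        # Compare the path only, so a query string does not hide the extension.
        if urlparse(absolute).path.lower().endswith(suffix):
            matches.append(absolute)
    return matches


if __name__ == "__main__":
    page = '<a href="/files/a.docx">A</a> <a href="b.doc">B</a>'
    for url in links_with_extension(page, "http://example.com/dir/", "docx"):
        print(url)
```

Each returned URL could then be fetched with `urllib.request.urlretrieve`; files that fail to download can be skipped and replaced with results from the next page, as described above.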
