What are “worker” (downloader) threads?
Apart from the tracker threads, you can specify additional threads that take charge of downloading URLs, so that downloads do not consume the time of the fetcher threads. Like the tracker threads, these worker threads are launched a priori, before the crawl starts. By default, HarvestMan launches 10 worker threads, which are managed by a thread pool object. The fetcher threads delegate the actual job of downloading to the workers; if the worker threads are disabled, the fetchers perform the downloads themselves. The worker threads, too, die only at the end of a HarvestMan crawl.
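
The delegation described above can be illustrated with a minimal sketch. This is not HarvestMan's own code: the names download, Fetcher and crawl are hypothetical, the download is simulated rather than performed over the network, and the pool is the Python standard library's ThreadPoolExecutor, which starts its threads on demand rather than all up front, so it only approximates the a priori launch.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


def download(url):
    """Stand-in for a real download (hypothetical): report which thread handled the URL."""
    return url, threading.current_thread().name


class Fetcher:
    """Hypothetical fetcher that delegates downloads to a worker pool when one exists."""

    def __init__(self, pool=None):
        self.pool = pool  # None means the worker threads are disabled

    def fetch(self, url):
        if self.pool is None:
            # Workers disabled: the fetcher does the download in its own thread.
            return download(url)
        # Workers enabled: hand the job to the pool and wait for its result.
        return self.pool.submit(download, url).result()


def crawl(urls, num_workers=10):
    """Fetch every URL; num_workers=0 disables the worker pool."""
    if num_workers == 0:
        fetcher = Fetcher()
        return [fetcher.fetch(u) for u in urls]
    # The pool lives for the whole crawl and is shut down only when it ends.
    with ThreadPoolExecutor(max_workers=num_workers,
                            thread_name_prefix="worker") as pool:
        fetcher = Fetcher(pool)
        return [fetcher.fetch(u) for u in urls]
```

Calling crawl(urls) returns each URL paired with the name of a pool thread, while crawl(urls, num_workers=0) returns the name of the calling thread, which mirrors the fallback in which the fetchers download by themselves.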