How I scanned New Masses
by Marty Goodman, Brooklyn NY, November 2017
[Martin H. Goodman MD, Director, Riazanov Library digital archive projects]
Introduction:
The combination of this text and the pdf file of my archival scan of the May 07 1935 (Volume 15, Number 6) issue of New Masses is intended not just to explain how I scanned that one issue, or even somewhat more generally how I made the digital archive of New Masses 1926 - 1935. It's intended most broadly to explain my approach to this kind of digital archiving, so that others can consider using it, or some aspects or variation of it, when they approach making similar digital archives.
The approach presented here is fundamentally the same as that which I've used for other "art-preserving" archiving of socialist and communist periodicals, such as The Masses 1911 - 1917 and The Liberator 1918 - 1924.
There are some minor differences between how I scanned the May 07 1935 issue of New Masses and the scanning techniques in projects I did earlier. These are the result of my gradually learning more about scanning over the years I've been archiving such material, of better scanning equipment becoming more affordable and thus available to me over the years, of some changes in what is considered an appropriate limit on file size for archival material posted on Marxists.org and elsewhere, etc.
For example, until recently I did not have available to me the means to make scans with optical (true) resolutions of greater than 600 dpi. And it wasn't until a few years ago that I decided 600 dpi was a minimum appropriate resolution for rendering text in single bit BW format. Prior to that I'd used 300 and 400 dpi for various projects where all I was doing was rendering text. For earlier projects 600 dpi tended to be the highest resolution I used for rendering art. Presently, when I wish to use single bit BW mode to render art, I scan at 1200 dpi most of the time, and occasionally (if the work of art is physically small... 1/2 the size of a letter size page or less) I will scan at 2400 dpi in that mode. For now, for me, the problem with using 2400 dpi is that my equipment takes too long to make the scan for any except occasional use, and the software I have for it is limited to allowing 2400 dpi only on relatively small surface areas.
The exact numbers for dots per inch resolution in projects like this will likely continue to increase as better, faster, higher resolution scanning equipment becomes available, and as faster data rates become available between the Internet and users, and as storage continues to expand in size and gets ever less and less expensive.
But the principles of how I approached making this archive are, as noted above, fundamentally the same as those I applied for my earlier archival projects of left periodicals.
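For a sense of scale, here is a minimal sketch of the raw, uncompressed data volume implied by the resolutions and bit depths mentioned in this essay. The function name and the choice of an 8.5 x 11 inch letter page are illustrative assumptions, and the numbers are for uncompressed pixel data only. Files actually posted are compressed and are usually much smaller.

```python
import math


def raw_scan_bytes(width_in, height_in, dpi, bits_per_pixel):
    """Uncompressed size in bytes of a scan of a page of the given size.

    Illustrative helper (not part of any scanner software): it simply
    multiplies pixel count by bit depth and ignores file headers,
    row padding, and compression.
    """
    pixels = round(width_in * dpi) * round(height_in * dpi)
    return math.ceil(pixels * bits_per_pixel / 8)


# Modes discussed in the essay, applied to a letter size page.
modes = [
    ("600 dpi single bit BW", 600, 1),
    ("1200 dpi single bit BW", 1200, 1),
    ("600 dpi 8 bit gray scale", 600, 8),
    ("600 dpi 24 bit color", 600, 24),
]

for label, dpi, bits in modes:
    megabytes = raw_scan_bytes(8.5, 11, dpi, bits) / 1_000_000
    print(f"{label}: about {megabytes:.1f} MB uncompressed")
```

Doubling the dpi quadruples the pixel count, which is why resolution choices interact so strongly with file size limits and storage cost.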
Basics: The tools and source material preparation:
The scanner:
For the scans of this issue of New Masses, I used exclusively a tabloid (11 x 17 inch) flat bed Epson 10000 XL "photo grade" scanner, rated at up to 2400 dpi resolution, though actually capable with its included EpsonScan software of scanning letter size and larger pages at no more than 1200 dpi.
In years past, I used the extremely similar Epson GT 20000 "document grade" scanner, which is nearly identical, but limited to a maximum resolution of 600 dpi, and which also does not (unlike the photo-grade scanner) offer the options of 48 bit color depth or 16 bit gray scale depth (those last two being photography-oriented options which I have not used for archiving left periodicals).
The key thing here is that one must use a flat bed scanner, or a comparable device (overhead scanner with a glass plate one can press on the work to flatten the page) that ensures the page is absolutely flat when scanned to get professional quality archival images. Failure to do this often results in visible distortion of the image of the page... and thus often ugly and incompetent scans.
Of all strategies to ensure the page being scanned is flat, the flat bed scanner costs less (by a factor of 10 or more) than other approaches that provide the same resolution and color or gray scale depth and in general overall quality of the imaging.
Perhaps the most ill-considered approach is that favored by academics and libraries of using an ultra high resolution digital camera and high end copy stand to capture the images. True, a digital camera will capture an image much faster than a moving bar / 1-dimensional eye-chip flat bed scanner can. And in fact eventually this type of approach... using a camera with a 2 dimensional eye-chip to capture the image... will dominate "scanning" / digital image acquisition equipment. This approach IS used today in some systems, notably the outstandingly-engineered high end book scanners designed by Archive.org. The problem today is that to get resolutions greater than 300 or 400 dpi for imaging larger format pages (especially tabloid and larger), you need more megapixels in the camera sensor than even $100,000 cameras can provide.
[To calculate the dpi resolution one can get from imaging a page with a camera of a given number of megapixels in its sensor, start by noting that, as a rough approximation, a letter size page has a surface area of 100 square inches. A tabloid size is roughly 200 square inches, and a broadsheet size is roughly 400 square inches. You take the number of pixels in the sensor, divide that by the number of square inches of surface area on your page, then take the square root of that number. The result is your first approximation of the resolution in dots per inch that you are getting. This is, however, only a first approximation, and is accurate only if the piece of paper has exactly the same aspect ratio (ratio of length to width) as the sensor in the camera, and only if that page is imaged so that it precisely and fully and exactly fills the frame of the camera. This, of course, never is the case! Assuming the page is very approximately a rectangle somewhat like that of the sensor of the camera, I'd suggest decreasing the dpi number you get from your first approximation by about 15 to 20% to get a more realistic number for dpi achieved with the camera in imaging the page. This fudge factor will of course vary with the exact situation, depending on degree of mismatch of shape of paper vs camera sensor, and how much margin there is between the edge of the paper and the edge of the camera's frame.]
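The calculation described above can be sketched in a few lines. The function name and the 50 megapixel sensor used in the example are hypothetical, chosen only for illustration. The page areas are the rough approximations given in the text, and the fudge factor is the suggested 15 to 20% reduction.

```python
import math

# Rough page areas in square inches, as approximated in the text.
PAGE_AREA_SQ_IN = {"letter": 100, "tabloid": 200, "broadsheet": 400}


def estimate_camera_dpi(megapixels, page_area_sq_in, fudge=0.0):
    """Estimate dpi achieved when a camera images one whole page.

    megapixels:      sensor pixel count in millions
    page_area_sq_in: surface area of the page in square inches
    fudge:           fractional reduction (for example 0.15 to 0.20) to
                     allow for aspect ratio mismatch and margins
    """
    pixels = megapixels * 1_000_000
    first_approximation = math.sqrt(pixels / page_area_sq_in)
    return first_approximation * (1.0 - fudge)


# Example with a hypothetical 50 megapixel sensor.
for name, area in PAGE_AREA_SQ_IN.items():
    ideal = estimate_camera_dpi(50, area)
    low = estimate_camera_dpi(50, area, fudge=0.20)
    high = estimate_camera_dpi(50, area, fudge=0.15)
    print(f"{name}: first approximation {ideal:.0f} dpi, "
          f"more realistically {low:.0f} to {high:.0f} dpi")
```

For the tabloid case the first approximation works out to 500 dpi, and 400 to 425 dpi once the fudge factor is applied.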
Also an inferior choice of scanning equipment if the goal is high quality archival scanning is another device favored by "library science" types: the Overhead Scanner. These devices, like high end copy stands, cost many times more than a flat bed scanner for the same available paper size and dot resolution capability. Few offer the option of a glass plate to make the paper flat, and usually such are shunned by library types given their (at times sincerely misguided ... and at times with deliberate malignant intent to thwart the making of quality copies) ideas about what can and cannot be done with paper if one wishes to best "preserve" it.
Preparing the pages:
In the case of the 1935 (and the 1926, 1934, and 1936) issues of New Masses, I acquired original issues in bound volumes. These were bound by the original publisher, near the time of publication. The binding was elaborate... highly professional... involving thread, drilled holes in the issues for the threads, and copious amounts of glue on the spine which also seeped thru holes drilled in the issues. Over the last ten years, in the course of unbinding over 100 broadsheet size volumes of newspapers and many other bound volumes, and in consultation with a professional book binder who had a lifetime of experience, I developed assorted approaches to unbinding volumes in general, and in particular for volumes containing somewhat fragile, brown, and even brittle pages.
This is not for the faint of heart, but is absolutely positively required if one wants to make professional and quality digital images of this sort of material. For as I noted above, to make proper images in general, the page must be laid absolutely pressed flat. In particular, it will be totally impossible to image the two page wide center art that crosses the gutter of the magazine if one has not rendered the publication into separate sheets.
This is not the place for me to go into a dissertation on the fine details of my unbinding techniques, which vary somewhat with the details of each volume I unbind. Suffice it to say that the New Masses bound volumes for 1934, 1935, and 1936 were among the hardest for me to unbind of any I've ever worked on. It took about 20 man-hours of work to unbind each six month bound volume of New Masses. [Use "person-hours" if you like... it happens I am a man, but of course it would take a woman (phenotypic or transgender) or one who does not identify within the range of binary assignment of gender the same amount of time to do this work.] My approach involved use of X-ACTO knives, of course, but mostly of brute force with my arms, use of fingernails, and most critically very experienced and judicious application of particular grinding bits on a Dremel tool to remove the glue from the spine of the bound issues yet do absolutely minimal harm to the paper. Half of the time spent "unbinding" was spent meticulously separating sheets of paper in each issue. These were sheets with four pages printed on them... two facing pages on each side of the sheet of paper, which is what these publications are made of. I cleaned away the last remnants of glue with my fingernails. Fortunately, past experience with far more fragile paper in bound volumes of The Liberator and The Masses stood me in good stead here, and though the work was very tedious and took a long time, it went relatively smoothly.
In those cases where I was scanning issues acquired as loose separate issues, preparation was far simpler. It consisted merely of removing the two metal staples holding the issue together. For this, given how fragile some of the paper was, the approach I used was to apply a small (4 inch) high quality (Xcelite brand, in this case) diagonal cutting pliers to two points on the staple in a way that for the most part did not impinge on the paper, then gently using fingernails and tips of fingers... occasionally assisted by a fine pliers... work out the now severed pieces of the staple. This prevented damage (for the most part) to the covers and to the center art.
Scanning the pages:
1. Try to get it right at the time of the scan.
In what I present below, one general principle of scanning strategy comes up again and again. It is a rule that I figured out for myself, but that was also advised to me, in the first months I was learning to do digital archiving, by the team of digital imaging specialists at Columbia University. This is: whenever reasonably possible, try to achieve the image you want to get at the time of the making of the scan, by exposure setting and paper handling (such as pressing the page flat, scanning two page wide art as a single two page wide scan, etc.). DO NOT, in so far as reasonably possible, seek to use post processing of images to clean them up, restore them, etc.
This is not to say I reject post processing. Not at all. I've had to adjust exposure after the fact, fix flaws by adjustment of color, contrast, brightness, editing out shadows and tears and stains, etc. for certain presentations of radical political art. Often when fixing mediocre scanning jobs done by others, though sometimes with work I've scanned myself. But the fact is that doing this is very time consuming, and so wherever I know of means of avoiding need to do post processing by taking certain steps at the time the scan is made, I employ such means. Wherever I can bring to bear a large format scanner to avoid having to stitch together two smaller scans, I do so. Sadly, my own very large format scanner (18 x 24 inch broadsheet size flatbed) I bought a long time ago. It is a now discontinued older model, whose resolution is limited to 400 dpi true optical. So these days when I need to render a two page wide piece of art whose total dimensions exceed 11 x 17 inches (the size of many high quality tabloid flat bed scanners) at greater than 400 dpi, I do have to resort to two separate scans that are later stitched together. But wherever possible within the mission goal of the project (as in rendering the many tens of thousands of pages of The Militant, Labor Action, and Daily Worker) I use a single scan of a large format scanner to render large format material.
2. Specific scanning strategies used to scan New Masses May 07, 1935:
This project of scanning New Masses had, as was the case for assorted previous similar projects, the overall mission goal of providing relatively high quality digital preservation of the art content of the periodical. High enough so that if one printed out the images on a 1200 dpi ink jet or laser printer of the sort universally owned, one would get an image that would usually be as good as... often visibly better than... the original in the pages scanned. "Better" in that it would be restored to vibrant dark black ink on pure white paper, as it appeared when originally printed, not consist of unevenly browned, at times stained and torn paper and faded black ink. The goal was for such print outs to be suitable, in the opinion of many, to frame and hang as art.
Different scanning strategies were used for different types of content in the 1935 New Masses.
These reflected different needs.
Text was not going to be printed out and framed as art... it only needed to be good enough to allow as high quality OCR as the original print quality supported. Text also often is optimally exposed more darkly than is line drawn art, often posing problems if one wishes to scan a page that contains both text and line drawn (or other) art. Sometimes I would just find a good compromise exposure that rendered both acceptably well. But more often I'd make one scan exposing specifically optimally for text, then a second scan... usually of a crop of the art... exposing optimally for the graphic image. This reflects limitations of dynamic range of the image acquisition system (scanner), which can be severe when using single bit BW mode, but also affect scans one makes using 8 bit gray scale mode.
Where the graphic image was line drawn, such as many Mackey and Art Young cartoons, and a fair number of Redfield's works, I'd just scan at high resolution (mostly at 1200 dpi) single bit BW (bi-tonal). This optimally renders such material.
But where the image was a photograph or a charcoal sketch that rendered the illusion of gray by "half tone" means, I more often than not would make two exposures of the same image. One at 1200 dpi single bit BW (usually with the exposure set very light... that tended to in most cases produce optimal results) and one using 8 bit gray scale.
For the gray scale renditions, I usually adjust contrast and brightness as follows: First I crank up the contrast until what should be white background, at its darkest points (the edges and corners: the most yellowed or browned areas on most aged paper), turns white (or very near white) in preview. Then I adjust brightness. More often than not I'll crank brightness down just a little. But with some very dark works I'll boost the brightness just a bit at this stage.
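The two-step adjustment just described is made by eye in the scan software's preview, but the idea can be sketched numerically. Everything in this sketch is an assumption made for illustration: the scan software's actual contrast and brightness controls are not documented here, so contrast is modeled as a linear stretch around mid-gray and the brightness tweak as a gamma curve, which darkens the middle tones while leaving pure white and pure black alone.

```python
def clamp(value):
    """Round and limit a value to the 8 bit range 0 (black) to 255 (white)."""
    return max(0, min(255, int(round(value))))


def contrast_gain_to_whiten(darkest_background, pivot=128):
    """Smallest contrast gain that pushes the darkest background to white.

    darkest_background is a gray value sampled from the most browned
    corner or edge of the page. Hypothetical helper, not a scanner setting.
    """
    if darkest_background <= pivot:
        raise ValueError("background sample must be lighter than the pivot")
    return (255 - pivot) / (darkest_background - pivot)


def adjust(pixel, gain, gamma=1.0, pivot=128):
    """Apply the contrast stretch, then the brightness (gamma) tweak.

    gamma above 1.0 darkens middle tones, gamma below 1.0 lightens them.
    """
    stretched = clamp((pixel - pivot) * gain + pivot)
    return clamp(255 * (stretched / 255) ** gamma)


# Hypothetical values: browned corner at 200, a middle gray at 150, ink at 40.
gain = contrast_gain_to_whiten(200)
for sample in (200, 150, 40):
    print(sample, "->", adjust(sample, gain), "then", adjust(sample, gain, gamma=1.2))
```

In this model the browned corner stays white after the brightness step, while the middle grays get darker, which is the direction of the small downward brightness adjustment described above.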
At this point in my scanning experience, I find I need to go back and adjust the exposure less than one time in ten. Perhaps one time in twenty, when scanning material like this. That wasn't the case when I first began doing this kind of scanning. On The Liberator project, it was common for me to make two or three differently exposed attempts before I was satisfied with the results.
When making both a 1200 dpi single bit BW and a 600 dpi 8 bit grayscale exposure, I believe an indicator of my having exposed both optimally... or at least in a fashion I find pleasing and well done... is that when I flick back and forth between the two exposures on the screen, they appear to differ little. When that happens, I feel I've done both correctly. Still, despite the similarities, there are arguments for doing as I did and including both types of imaging of the graphic in the final presentation result, as I do a lot of the time.
Gray scale scans can have their exposure tweaked and adjusted after they are saved. This is largely not true for single bit scans. But single bit scans done well tend to be crisper, and have more presence and punch. Many judge them (at least with some subject material) as looking better when viewing them side by side or one after another compared to a good gray scale rendition of the same image, whether or not they actually have preserved as much of the detail of the original as the gray scale scan. A gray scale or color scan can be converted later to single bit BW, but for that conversion to yield a good result much higher dpi resolution usually is required. For to render art as clearly as one can with 600 dpi color or gray scale, one usually has to use 2400 or 4800 dpi single bit BW. That of course being in cases where the subject matter is in the first place amenable to good rendition by single bit BW technique.
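The later conversion of a gray scale scan to single bit BW mentioned above is, in its simplest form, a threshold cut. Here is a minimal sketch, assuming 8 bit gray values where 0 is black and 255 is white, and assuming that pixels darker than the threshold become black. That convention matches the page notes at the end of this essay, where a lower threshold number means a lighter exposure. The function names and sample values are hypothetical.

```python
def to_bitonal(gray_rows, threshold):
    """Convert rows of 8 bit gray values to single bit (1 = black, 0 = white).

    Assumption: a pixel darker than the threshold becomes black,
    everything else becomes white.
    """
    return [[1 if value < threshold else 0 for value in row] for row in gray_rows]


def black_pixel_count(bitonal_rows):
    """Count the black pixels in a single bit image."""
    return sum(sum(row) for row in bitonal_rows)


# Hypothetical gray values running from solid ink, through faded ink
# and stained paper, to clean paper.
sample = [
    [20, 50, 75, 90, 180, 240],
    [25, 60, 80, 100, 200, 250],
]

lighter = to_bitonal(sample, threshold=55)
darker = to_bitonal(sample, threshold=95)
print("black pixels at threshold 55:", black_pixel_count(lighter))
print("black pixels at threshold 95:", black_pixel_count(darker))
```

Once the cut is made the in-between grays are gone, which is why the exposure of a single bit scan largely cannot be adjusted after it is saved.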
3. Use of "red dropout":
Epson flat bed scanners come with scan software called "Epson-scan". There are many alternative software applications one can use to acquire images with these scanners, some of which offer outstanding flexibility and options, especially to those dealing with photographs and negatives. But the simple included for-free Epson-scan software has a feature that makes it especially good for scanning archival documents: its "color dropout / enhance" function, especially when that is set for "red-dropout".
With this function, when scanning in single bit BW or in gray scale, red light from the subject being scanned is ignored, and does not contribute to the image produced. Now, paper with extra dark yellow or brown at the edges has a lot of red in those discolorations. With red-dropout turned on, one can expose the paper more darkly without getting black or gray toning of the corners and edges of the scan, making for cleaner, better exposures. The key thing to appreciate here is that this is a fix / restoration / tweak of the image that can be done automatically at the time of scanning, thus costing nothing whatsoever in time per scanned page. It's merely a setting of the scan software, much like the threshold, contrast, brightness, etc. setting.
Sure, one can accomplish the same effect, or better, by making a very high resolution color scan, then processing that scan with a sophisticated photo editor, such as Photoshop. But doing this increases by massive amounts (usually factors of 10 or more) the time it takes to produce a scanned page! And yes, what we really want is not simple red, green, or blue dropout or enhance, but the ability to sample a color and use that particular color, and on a decreasing curve colors close to that color, as the color to be ignored. But the approximation of using red dropout to permit better, darker exposures of browned and yellowed pages works very respectably well in a substantial number of situations archiving unevenly browned and/or stained pages.
In this situation, a quantitative change in the time required for making this sort of restoration becomes unequivocally a qualitative change in its utility.
4. Pressing paper flat:
I cannot emphasize too strongly how critically important it is to use a flat bed scanner or comparable technique (overhead scanner with glass plate for pressing paper flat) to get quality professional archival scans from many paper sources. The slightest creases can show up as shadow lines in the scan.
To be sure, the most modern of scanners available at the time of this writing have design aspects that address this: The current Contex "IQ flex" and the competing Widetek "25-200" broadsheet (A2 paper size) flat bed scanners (priced around $7000 to $1100) both offer dual rows of LED lighting that illuminates paper from more than one angle, and effectively can remove much of the shadowing in scans caused by small creases and other irregularities in the "flat" subject being scanned. Widetek even offers a tabloid size (A3) scanner with this feature, though the price tag is still high compared to other tabloid size flat bed scanners. When (if) this dual row / multiple angle illumination feature becomes more commonly available and in lower priced equipment, mashing the page absolutely flat may become less critically important. For now, many of us can afford the $400 to $800 used or $1200 to $2300 new for a high quality Epson or other tabloid flat bed scanner, but cannot put out $5000 to $10000 for the latest broadsheet dual LED illumination flat bed scanners.
Pressing the page flat is especially important when doing high quality archival scanning of two page wide center art from such periodicals as New Masses, The Liberator, and The Masses. There one has first extracted the two page wide sheet. But this sheet has been folded over for as much as a century or more in some cases, with a pronounced crease. In some cases (where the work had been in a bound volume prior to the digital archivist unbinding it) the crease is especially pronounced.
At times when I have to scan such material, I put added padding over the crease and press as hard as I can over that crease without pressing so hard as to break the glass on the flatbed scanner, or distort it so that it causes the scan bar to hang up at the site of pressure! This has to be combined with judicious adjustment of exposure of the scan. With this, at times I have been able to either totally eliminate the image of the crease from the scan of such pages, or at least reduce the effect massively, so that the amount of post processing required is very greatly decreased.
5. More on two page wide center (and other) art:
As noted above, if I want to (with the equipment available to me) make images of two page wide center art that is more than 11 x 17 inches in dimensions, at greater than 400 dpi resolution, I end up making two scans on my tabloid scanner and stitching the two together. Often the production of such a scan also involves other post processing, such as restoring small holes in the image caused by holes in the paper where threads and glue used in binding the issue into a bound volume mar portions of a line that runs vertically up and down the center of image. Staple holes, stains from rusted staples combined with high humidity storage over the decades, etc. all require a bit of image editing / restoration. This typically can take from 30 minutes to as much as 5 hours of work to get the final images I present.
There are alternatives, if one is willing to experiment with novel tricks, or if cost is absolutely positively no object at all:
At the time of this writing there are offered broadsheet flat bed scanners capable of yielding at least 600 dpi, and I think now 1200 dpi. There are flat bed scanners bigger than broadsheet that, if one has a spare $100,000 or so to spend, can scan pages TWICE broadsheet size (A1 format, or 36 x 24 inches). For a bit more, one can even buy a flat bed scanner that handles bigger format than that.
Normally digital archivists will reject the most economical and most widely available of large format scanners, the feed thru scanners, because for some of the material we need to image feeding the page into one of those is not much different from feeding it into a paper shredder. However, some of the material we scan IS robust enough to do fine in such feed thru scanners, and for as little as $4500 or so one can buy a feed thru scanner (such as the Contex IQ Quattro 2490) that can scan 24 inch wide material at 1200 dpi.
Speculation: The specifications for this feed thru scanner say it can handle flexible material fed into it that is up to 2 mm thick. I think it likely that if one sheaths somewhat vulnerable paper in flexible polypropylene and then runs it thru such a scanner, one might be able to use this device for making relatively high quality scans of up to A2 size graphic art printed on such paper. But this is an experiment I've yet to try, as I don't own or have easy access to one of these scanners at this time. This approach DOES take time, sheathing and unsheathing the paper, and may also have some issues relating to lack of an available easy preview followed by scan exposure adjustment, but it MAY be viable for some projects involving making high quality archival records of material printed on larger than 11 x 17 paper, even for modestly (though not massively highly) decrepit, aged paper.
The bottom line is that to do quality digital archiving of material like this, one must consider all possibilities, and question existing dictums of what one can and cannot do, within the bounds of common sense and careful experimentation with non-critical material, and with acceptance of the evidence of one's experiments whether that evidence supports or contradicts what one WANTS to find.
6. Difference in print density of the art from issue to issue:
An obvious approach to checking the quality and accuracy of a digital scan ... to checking how well it preserves and presents the full reality of the original work of art... is that of visually comparing first a screen display and then, more critically, a print out at high resolution, of the scan to the original page scanned, side by side. There is, however, a very important thing to be aware of when one does this: Nearly always, when art is mass reproduced via a printing process, the results vary from one impression to another. Sometimes quite a lot. Variations in how heavily the printing plates were inked prior to impressions being struck can make very significant differences in how a given page of one copy of the same periodical looks compared to another. Unlike the case where there is a single original... a unique painting or sketch... or even a small run of lithographs meticulously produced and struck under the same printing conditions... mass printed impressions of art in The Masses, The Liberator, New Masses, etc. will vary among copies of the same issue.
This was dramatically made apparent to me years ago when I was working on the (3 year long) project of making a high quality digital archive of The Liberator. I was fortunate to be working with 3 different near complete collections, able to choose page by page which page was in best condition for scanning. Some of those collections themselves had duplicate issues, so it was not uncommon for me to have in front of me 4 or even 5 copies of the same issue of The Liberator when working on scanning a given issue. At one point, trying to make sure my scan matched the properties of the original, I decided it would be convenient to have the page bearing a given image from one copy of The Liberator in front of me while I had the page from another face down on the flat bed scanner. I figured it would save handling and wear and tear on this often brown, brittle, fragile paper if I just compared the scan I got with a secondary copy, without having to remove and replace the page between adjustments of the scan exposure. Immediately I ran into a problem. Try as I might, I just could not make the scan I got look quite like the sample page I was comparing it to. Initially I suspected some defect in my scanner, or that I had overlooked some major aspect of setting the exposure. Finally I opened the scanner and removed the page I was scanning, and put it beside the page from the other copy of that issue that I was using as a comparison. Wow! The two looked quite different. One was MUCH more heavily inked than the other!
Bottom line is, in the absence of having the actual original drawing the artist submitted to the editor for reproduction, it's not clear exactly how a given graphic was intended to appear, and it is thus appropriate for the digital archivist to have some leeway and take some (modest, to be sure) liberties in rendering that image from a given mass produced example in a publication.
But what IS absolutely clear is that when these publications were printed, they were printed using black ink on relatively pure white (sometimes creamy white) paper. NOT using faded black ink on yellowed or browned or stained paper! Thus using high contrast gray scale and single bit BW is appropriate in making a truly accurate and archival image of these pages. And the still taught in many circles practice of meticulously reproducing every aspect of how the paper aged and how it was damaged over the last 100 years by making 24 bit color scans of such pages, which involves applying about 90 to 95% of the file size to conveying this information about damage and aging, and only 5 to 10% of the file size to actually conveying what was originally printed, is at best massively misguided. What is worse is that such mindless, slavish creation of a digital museum of examples of the results of paper aging helps us nearly not at all in understanding the physics and chemistry of paper aging, because nearly never is it recorded along with the image the exact conditions under which the paper had been stored for the last century: humidity, temperature variations, chemicals in the air, etc. Nor is recorded, essentially ever, a chemical analysis of the paper that might shed light on what in that particular batch of paper may be associated with the given degree of observed aging, or lack thereof.
To be sure, with a very few digital archives, recording of every detail, including paper aging and flaws, is appropriate: The dead sea scrolls. The Magna Carta. But for 99.99% of digital archiving such methods are at best inappropriate, and more often just plain stupid, actually making for poorer access to that aspect of the thing being recorded that is our concern: its art and text content. Because of all the irrelevant and distracting "information" of the paper color variations, faded ink variations, etc.
7. Specific run downs of technique related to actual pages of May 07 1935 issue scan:
OK! If you've made it this far into my presentation of my approach to scanning, and even read some of the diatribes I've included in which I vent concerning my frustration with misguided approaches to making archival scans, I wish to thank you.
Here, as the final part of this essay, is a quick guide to where in the May 07 1935 issue of New Masses that I recently scanned you can find examples of the techniques I discussed above.
Please note that only in the case of the two 2-page-wide digital renditions was any post processing used at all. For all other (all single) pages, such restoration and "post processing" as I applied was done entirely by choice of exposure settings, keeping paper pressed flat, etc. at the time of the making of the scan.
=======================================
Page 1 (cover) scanned at 1200 dpi single bit BW with a relatively light exposure. Where I might use a setting of 85 to 103 threshold on the EpsonScan program for text, here I used a setting in the range of 45 to 61. This is more like the setting I'd use for some graphics. It helps make for crisp rendition of the white text on black background portions of covers like this one, and also helps wash out imperfections in browned and at times stained originally white background, to produce the crisp, fresh "restored" image you see here.
Page 2 (inside front cover) It was a bit of a tossup whether I would scan this page at 600 or 1200 dpi. In the end I used 1200 dpi because of the appearance of a little bit of art, and because I'd already had my scanner set for 1200 when I scanned the previous page, the cover. Scanning at 1200 dpi with the EpsonScan software doesn't take all that long for the scan, but it DOES pose problems due to the fact that the auto page de-skewing function of the EpsonScan software does not work above 600 dpi, forcing me to be extra careful about positioning the page, and sometimes requiring a little bit of re-positioning (rotation) and re-checking the preview before scanning. It's this last, more than anything else, that makes such a 1200 dpi scan take significantly longer than a 600 dpi scan, and this is a function of a failing of the scan software used, not some intrinsic aspect of the process.
Page 3 There are three scans of this page. The first is of the whole page at 600 dpi single bit BW, with the exposure optimized for rendering the text. The next two are crops of the graphic image on that page, oriented entirely to best rendering the graphics. The image here is a charcoal sketch, which in many ways is similar to a half tone photograph, in that gray is rendered by random patterns of modulated black dots on white background. With this kind of image, I nearly always provide at least one rendition using 8 bit gray scale, though often (as in this case) I also attempt to make the best possible single bit BW scan as well. Here there is first a scan at 1200 dpi single bit BW (with a threshold substantially lower in number... meaning lighter exposure... than that which I used to render the text), followed by a 600 dpi 8 bit gray scale exposure. For the gray scale exposure I tweaked contrast and brightness to increase contrast and move the image somewhat toward the look of a single bit BW exposure. This I did by increasing contrast in the preview until the gray-ish corners and edges of the image became white. Then I cranked back brightness just a bit, to increase darkness of the dark areas. [Sometimes I will increase brightness at this point when working with a particularly dark image.]
Page 4 Text plus Mackey line drawn graphic. I used 1200 dpi single bit BW to scan this page. In this particular case I found what I felt was the optimal exposure for the Mackey line drawn graphic to be close enough to the optimal or at least very acceptable exposure for the text that I just did a single scan of the page. In other instances of a line drawn graphic combined with text, this approach sometimes works fine, but at other times I end up deciding to do a separate scan of a crop of the graphic at a different (usually lighter) exposure. [Different artists who produce line-drawn art, I've found, tend to produce graphics that are more or less likely, depending on the artist, to be acceptably scanned in a way that also renders surrounding text well. Mackey's work I most often find able to be rendered in a single scan with surrounding text.]
Page 5 all text, but very small point text in the Contents / Editorial Staff box. Because of the use of very small point text I decided to scan this page at 1200 dpi.
Page 6 and 7 all text of reasonable point size. Scanned at 600 dpi single bit BW.
Page 8 Whole Page William Gropper charcoal sketch. Scanned both at 1200 dpi single bit BW and at 600 dpi 8 bit gray scale. See page 3 for details of scanning rationale and technique.
Pages 9, 10, and 11 All text. 600 dpi single bit BW. Same as pages 6 & 7.
Pages 12 and 13: This is an unusual situation, which occurs very rarely in issues of New Masses. Here we have two facing pages that are NOT a single sheet of paper, and are NOT the two center pages of the publication. But it's clear that they were meant to be viewed together, side by side. Because they cover the full width of the two pages, they were together just a little bit too wide for my Epson 10000 XL tabloid scanner, which supports at a maximum 12.2 x 17.2 inches of scan area. So I scanned the pages separately, with overlap (at 600 dpi 8 bit gray scale), then stitched the two pages together using Photoshop. No other post processing than the stitching was used. Robin, who did the stitching, suggested we add in some sort of line to show the crease at the center of the two pages. I considered this, but ended up deciding not to do it. Once again, a decision that could have gone either way. A matter of taste and style, really. I've encountered similar instances of art or text that asked for a two page wide display even though it was not center art fewer than a half dozen times in scanning New Masses between 1926 and 1935.
Pages 14 and 15 All text. 600 dpi single bit BW. Same as pages 6, 7, 9, 10 and 11.
Pages 16 and 17. Two page wide center art. In this case line drawn caricatures by Gropper, Wolfe, and Limbach. Because these were line drawings, they were scanned at 1200 dpi single bit BW. Because the images went out to the edges of the two page wide sheet of paper, the sheet would not fit on my tabloid size scanner, and I had to scan this in two parts, then stitch them together. In the case of this image, a very small amount of added post-processing (editing / restoration) was employed in addition to simply stitching the images. The source material for these scans were issues unbound from a tightly and professionally bound volume. Despite my spending literally days of work making the most clean possible meticulous unbind and separation of the pages, there were issues in the center of this two page wide art due to holes drilled in the paper to permit the threads that bound the volume together, and also some minor damage restricted to a very narrow part of that same center column of the two pages due to ravages of the glue used to hold the spine to the covers of the volume. Fortunately the amount of damage to the art was extremely minor, and very easy to fix by just extending a few lines across where they had obviously been interrupted by the (small) holes in the paper. While I was at it, I removed just a few speckles here and there of added noise from the image. Not all, but many of them. I spent perhaps 1/2 to 1 hour doing these restorations.
Pages 18 thru 30 all text 600 dpi. Much like previous all text pages. (There were a few cases here where one could argue the point size of some of the text was a bit on the small side, thus making a case for 1200 dpi scans. But 600 dpi rendered the text quite adequately, and I had to make some compromises to keep the time required to do this project within reasonable limits, in order to ensure it actually DID get completed.)
Pages 31 and 32 (inside back cover and back cover) Here I used 1200 dpi single bit BW for the scans. Partly because these were "special" because they were the back cover (inside and outside). Partly to render the small point text on page 31. Partly to better render the shaded border box on page 32, the back cover.
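The page-by-page choices above can also be written down as a small lookup table. What follows is only a sketch: the page ranges, resolutions and modes are taken from the notes above, while the names `SCANS`, `settings_for` and `pages_per_setting` are invented for illustration. Where the notes give a resolution but do not state the mode (pages 2, 5 and 18 thru 30), single bit BW is assumed.

```python
from collections import Counter

# (first page, last page, dpi, mode), transcribed from the notes above.
# Pages with more than one rendition (3 and 8) appear once per rendition.
# Assumption: pages 2, 5 and 18-30 are single bit BW (mode not stated).
SCANS = [
    (1, 2, 1200, "1-bit BW"),
    (3, 3, 600, "1-bit BW"),     # whole page, exposure set for the text
    (3, 3, 1200, "1-bit BW"),    # crop of the charcoal sketch
    (3, 3, 600, "8-bit gray"),   # crop of the charcoal sketch
    (4, 5, 1200, "1-bit BW"),
    (6, 7, 600, "1-bit BW"),
    (8, 8, 1200, "1-bit BW"),
    (8, 8, 600, "8-bit gray"),
    (9, 11, 600, "1-bit BW"),
    (12, 13, 600, "8-bit gray"), # facing pages, scanned separately and stitched
    (14, 15, 600, "1-bit BW"),
    (16, 17, 1200, "1-bit BW"),  # center art, scanned in two parts and stitched
    (18, 30, 600, "1-bit BW"),
    (31, 32, 1200, "1-bit BW"),
]


def settings_for(page):
    """Return every (dpi, mode) rendition recorded for one page number."""
    return [(dpi, mode) for first, last, dpi, mode in SCANS
            if first <= page <= last]


def pages_per_setting():
    """Count how many page renditions used each (dpi, mode) combination."""
    counts = Counter()
    for first, last, dpi, mode in SCANS:
        counts[(dpi, mode)] += last - first + 1
    return counts


if __name__ == "__main__":
    print(settings_for(3))
    for setting, count in sorted(pages_per_setting().items()):
        print(setting, count)
```

A table like this makes it easy to see at a glance how much of an issue went through the slower 1200 dpi routine as opposed to the 600 dpi text routine.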
So there you have it: A running explanation of much of what went into my decisions about choosing scanning approaches for making the scan of the May 07 1935 issue of New Masses. And what those decisions were.
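For readers who have not worked with exposure settings, the threshold numbers given for page 1 and the contrast and brightness adjustment described for page 3 can be illustrated with a few lines of code. This is a minimal sketch under stated assumptions, not a description of what the EpsonScan software does internally: it assumes gray values run from 0 (black) to 255 (white), that a pixel darker than the threshold becomes black, and that contrast is a simple linear scaling about mid-gray. The pixel values are hypothetical, not measured from the actual scans.

```python
# Assumption: gray values run from 0 (black) to 255 (white).
BLACK, WHITE = 0, 1


def to_single_bit(gray_row, threshold):
    """Single bit BW: a pixel darker than the threshold becomes black."""
    return [BLACK if value < threshold else WHITE for value in gray_row]


def adjust(gray_row, contrast=1.0, brightness=0):
    """Scale each value about mid-gray (128), shift it, and clamp to 0..255.

    This linear formula is a common textbook one, used here as a stand-in
    for whatever the scanner software actually applies.
    """
    out = []
    for value in gray_row:
        new_value = (value - 128) * contrast + 128 + brightness
        out.append(max(0, min(255, round(new_value))))
    return out


# Hypothetical row: solid ink, a stain, browned paper, clean paper.
sample_row = [30, 80, 170, 230]
text_setting = to_single_bit(sample_row, threshold=95)   # within 85 to 103
cover_setting = to_single_bit(sample_row, threshold=55)  # within 45 to 61

# Hypothetical row from a sketch: dark stroke, mid-tone, gray-ish margin.
sketch_row = [40, 120, 210]
contrast_only = adjust(sketch_row, contrast=1.6)
contrast_darker = adjust(sketch_row, contrast=1.6, brightness=-4)

if __name__ == "__main__":
    print(text_setting)     # the stain is rendered black
    print(cover_setting)    # the stain washes out to white
    print(contrast_only)    # the gray-ish margin has become white
    print(contrast_darker)  # margin still white, mid-tone a little darker
```

The lower threshold leaves the ink black while dropping the stain, which is the "washing out" effect described for the cover. In the second pair, raising contrast pushes the margin to white, and a small reduction in brightness then darkens the mid-tones without bringing the margin back.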
7. Epilogue / Update: scanning of Fairy Tales for Workers' Children ups resolutions used.
A little bit after I produced the scan of the May 07 1935 issue of New Masses, I made a scan of an original edition of the (c) 1925 (copyright now expired) book "Fairy Tales for Workers' Children", by Hermynia Zur Mühlen, translated by Ida Dailes, with cover illustrations and with 4 stunning color graphic plates inside by Lydia Gibson.
I mention this because in scanning this book I began experimenting with routine scans at a notch higher resolution than I have ever used previously in rendering a publication. For this book I rendered the 4 color graphic plates at 1200 dpi 24 bit color.
[I considered using 48 bit color (an option available on the Epson 10000 XL scanner), but that would have been silly and pointless, given these color plates employed JUST ONE color to enhance what clearly had been a line drawn black on white graphic. The main advantage of 48 bits of color depth, of course, is to very subtly render slightly different shades of color. There was no need for that here.]
I also switched from 600 dpi single bit BW for text and 1200 dpi single bit BW for line drawn graphics, the standard for the New Masses scans, to using 1200 dpi single bit BW for text only pages, and 2400 dpi single bit BW for rendering the line drawn graphics (which were entirely presented as only a fraction of the surface area of the pages where they appeared, and thus amenable to the limitations of the EpsonScan software).
The problem here is that with a moving bar flat bed scanner of this sort, the time to make a scan for 1200 dpi color or 2400 dpi single bit BW is very long. The scans of each color plate took around 6 minutes to make. That's excluding time spent rejecting an exposure and re-doing the exposure, of course... time critically required when one starts out in order to fine tune the rendering of the image and get best results. Some of the bigger 2400 dpi line drawn art scans took nearly as long. Because there were only 4 color plates, and because most of the line drawn art was small enough to limit somewhat the time required to scan it, and because this was a single project (and an experiment in new techniques), and because the book was only a total of 66 pages of which only half or fewer had on them graphics, the project was do-able in a (full) day's work.
In the end, while pleased with the results, it became clear that this added level of resolution is suitable... or at least feasible... only for a very few projects of relatively few pages each with the equipment I currently possess.
It is absolutely certain that equipment CAN be made that can acquire very high resolution images of relatively large pages nearly instantly, using digital cameras. Unfortunately, at this time, to get true optical resolutions of 1200 dpi from a tabloid size page requires a camera sensor of roughly 288 megapixels (about 200 square inches at 1,440,000 pixels per square inch). Rendering a letter size page at 2400 dpi requires a camera sensor of roughly 576 megapixels (about 100 square inches at 5,760,000 pixels per square inch). No off the shelf cameras are currently available that can do this. You can cover this area with several cameras with overlapping fields of view, and automatically stitch the images together. But to get into such high resolution rendering in this fashion, we're talking custom made equipment costing $200,000 or more, compared to $2,000 for a 2400 dpi capable flat bed scanner.
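The megapixel figures above are simple arithmetic: pixels across times pixels down. The sketch below works them out for the nominal paper sizes (tabloid 11 x 17 inches, letter 8.5 x 11 inches). The figures of 288 and 576 megapixels match what one gets by rounding the page areas up to 200 and 100 square inches; that reading is an assumption, not something stated in the text. The function names are made up for illustration.

```python
def megapixels(width_in, height_in, dpi):
    """Sensor pixels (in millions) needed to cover a page at a given dpi."""
    return (width_in * dpi) * (height_in * dpi) / 1_000_000


def megapixels_for_area(area_sq_in, dpi):
    """Same calculation, starting from a page area in square inches."""
    return area_sq_in * dpi * dpi / 1_000_000


# Nominal paper sizes.
tabloid_1200 = megapixels(11, 17, 1200)
letter_2400 = megapixels(8.5, 11, 2400)

# Rounded-up areas (assumed to be how 288 and 576 were reached).
rounded_tabloid = megapixels_for_area(200, 1200)
rounded_letter = megapixels_for_area(100, 2400)

if __name__ == "__main__":
    print(f"tabloid at 1200 dpi: {tabloid_1200:.2f} MP (rounded area: {rounded_tabloid:.0f} MP)")
    print(f"letter at 2400 dpi:  {letter_2400:.2f} MP (rounded area: {rounded_letter:.0f} MP)")
```

Either way the conclusion is the same: the sensor would need several hundred megapixels, and doubling the resolution quadruples the requirement.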
Eventually, as camera sensor resolution cost continues to drop, we may see first letter size then bigger paper size scanners that provide both instant capture and relatively high optical resolution at affordable prices. If there is a market for it. But this is not likely to be available in the immediate future. Note that instant capture capability not only means one can acquire the final scan image very rapidly, but also that one can far more rapidly and precisely adjust exposure to taste in a much better, more effective, more functional preview mode that instant full resolution capture can support with proper software for the scanner.
In Conclusion:
As one friend of mine liked to say "There are more ways to break your leg than getting run over by a truck". The approaches to scanning that I use, outlined here, are not the one and only way... or even the one and only best way... to do proper archival scanning. But they are the result of 8 years of refining my techniques in the course of scanning over 100,000 pages of archival socialist and communist periodicals and pamphlets. These techniques work well for me, and result in what many consider to be very nice looking scans. It's my hope that by elaborating on how I do my scanning, others who are teaching themselves this delicate combination of science and art can find ideas they can incorporate into the techniques for scanning that they develop for their own use. Different individuals' aesthetics, changes in available scanning equipment, and differing mission goals in making archival scans will all contribute to others developing their own approaches to making archival scans.
I urge all who embark on doing archival scanning to think for themselves, and question existing routines and norms. Not reject them out of hand, but also not accept as right a given dictum or approach just because it is revered by (for example) librarians, or academicians. Apply common sense. Apply good logic. Modify your approach based on clear evidence, and on the advice of others who review your work. As you scan more and more material, go back and examine old work you've done, comparing it to more recent work, to constantly monitor your own approaches, to best select that which works best, and abandon that which no longer seems to work as well as you require. But be careful: As is the case with guessing at multiple choice test answers, at times the first guess you make at a given approach to scanning may wind up being the best you can do! Always consider that as possible when you review your own work, and do not change techniques JUST for the sake of changing techniques, but rather for good cause.
Best regards to all who embark on scanning of archival material.
Especially to those who endeavor to scan socialist and communist archival material... the history of the workers movement... and make it freely available to scholars, researchers, and above all members of and leaders of the present and future workers movement.
Martin H. Goodman MD
November 21 2017
Brooklyn, NY
Director, Riazanov Library digital archive project
Board of Directors Holt Labor Library
associated (informally) with Marxists Internet Archive (marxists.org)