One of the projects I’m working on a little bit at a time is an expanded book version of my articles about Vermont’s soldiers at the Battle of Gettysburg, and part of that book will include the text of an 1864 Vermont soldier’s memoir entitled The Second Brigade.
I’m about two-thirds of the way through the slow process of formatting that manuscript. My source for the text is a New York City Public Library scan of the original 1864 book, a set of JPG files that I ran through an online text conversion (OCR) program.
Here is a scan of the first page of chapter one:
When I ran this page through the photo-to-text conversion program, here’s what I got:
Each valley, each sequestered glen,
Mustered its little horde of men,
That met as torrents from the height
In highland dales their streams unite,
[BLANK LINE]
Tivilorethnt’fon,tirt.irlinotnsfrong,
Till at the rendezvous they stood
By hundreds. prompt for blowszdtr.CoTT.
Amazingly, I have seen many, many online documents like this, and have even bought several Kindle e-books in which the text was apparently converted from a JPG scan using an OCR program and NOT edited or proofread.
But, you know, as blowszdtr.CoTT said, Tivilorethnt’fon,tirt.irlinotnsfrong.