We have scans (PDF) of a number of reports documenting early geometric models in the COMGEOM format (a now obsolete format, but the models are interesting nonetheless). These reports contain the actual geometry defining the model as pages and pages of numbers and letters. Unfortunately, the quality is sufficiently poor that optical character recognition (OCR) has a very high rate of error.
This task is to attempt the manual transcription of a portion of the Black Hawk Helicopter model described in the report ''Computer Description of Black Hawk Helicopter'' (see the References list below for the link that will let you download the PDF). One possible approach is to use Acrobat Reader or some other PDF reader to select and copy the OCR text, paste it into a text file as a starting point, and then manually correct it. There may also be patterns that allow for semi-automated processing (for example, if five zeros in a row commonly come out as the letter ''O'' instead of the digit 0, a search and replace is in order). However you approach it is fine, but remember that the goal is not just to extract the OCR text but to produce an accurate transcription of the file. The OCR text can be used as a starting point, but it will NOT be accurate.
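As an illustration of the search-and-replace idea above, here is a minimal Python sketch. The lookalike map (O, o, l, I, S) and the rule for deciding which tokens are "numeric-looking" are assumptions made for this example, not something taken from the report; inspect the real OCR output and adjust both before relying on it. Every substitution still needs to be checked by eye against the scan.

```python
import re

# Hypothetical map of letters that OCR often produces in place of digits.
# Adjust it after looking at the actual OCR text.
LOOKALIKES = str.maketrans({"O": "0", "o": "0", "l": "1", "I": "1", "S": "5"})

# Assumed rule: a token is treated as a damaged number if it is either a run
# of two or more capital O's, or is built only from digits, a sign, decimal
# points and lookalike letters AND contains at least one real digit.
NUMERIC_LIKE = re.compile(r"(?:O{2,}|[+-]?[0-9OolIS.]*[0-9][0-9OolIS.]*)")


def fix_token(token):
    """Replace lookalike letters with digits in numeric-looking tokens only."""
    if NUMERIC_LIKE.fullmatch(token):
        return token.translate(LOOKALIKES)
    return token


def fix_line(line):
    """Fix each whitespace-separated token, leaving the spacing untouched."""
    return re.sub(r"\S+", lambda m: fix_token(m.group()), line)


if __name__ == "__main__":
    print(fix_line("RPP  1O.5 -2O0 OOOOO"))
```

Tokens that contain any other letter (such as a solid type name) are left alone, so the sketch only touches fields that already look like numbers.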
The preferred format to provide the pages in is a comma-separated value ASCII text file, which is suitable for post-processing.
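A minimal sketch of producing such a file with Python's standard `csv` module follows. It assumes that each transcribed table row is one line of text with whitespace-separated fields; the actual column layout of the tables in the report is not specified here, so treat the splitting rule and the function name `lines_to_csv` as hypothetical.

```python
import csv
import io


def lines_to_csv(lines):
    """Turn whitespace-separated text lines into comma-separated ASCII text.

    Blank lines are skipped. A ValueError is raised for any line containing a
    non-ASCII character, since the target is a plain ASCII file.
    """
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    for number, line in enumerate(lines, start=1):
        if not line.isascii():
            raise ValueError(f"non-ASCII character on line {number}: {line!r}")
        fields = line.split()
        if fields:
            writer.writerow(fields)
    return out.getvalue()


if __name__ == "__main__":
    # Made-up sample rows, not values from the report.
    print(lines_to_csv(["1 RPP 0.0 10.5", "", "2 RCC -3.25 4"]), end="")
```

Rejecting non-ASCII characters early is a deliberate choice: stray typographic characters pasted from a PDF reader are easy to miss by eye.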
The eventual goal is to have a file that can be fed to BRL-CAD's comgeom-g importer to generate an accurate .g file. The description of this target is a couple hundred pages of text (which will take much longer than a single GCI task if you're doing correctness checking!), so there will be multiple tasks for pieces of the file. For this task, please submit a CSV file with the content of the tables on pages 102-133.
References:
Please discuss your progress with the developers.
Additional information on comgeom
- Documentation
- Source at: src/conv/comgeom
File name/URL | File size | Date submitted |
---|---|---|
stuff.txt | 111.5 KB | January 18 2015 15:58 UTC |
I would like to work on this task.
This task has been assigned to Arnav. You have 100 hours to complete this task, good luck!
Sir,
Please give an example of what to do in this task.
I am somewhat unable to understand the main work of this task.
Regards
Arnav
The claim on this task has been removed, someone else can claim it now.
I would like to work on this task.
The claim on this task has been removed, someone else can claim it now.
I would like to work on this task.
This task has been assigned to xirow. You have 100 hours to complete this task, good luck!
Melange has detected that the final deadline has passed and it has reopened the task.
I would like to work on this task.
This task has been assigned to Phoebe. You have 100 hours to complete this task, good luck!
Melange has detected that the final deadline has passed and it has reopened the task.
I would like to work on this task.
The claim on this task has been removed, someone else can claim it now.
I would like to work on this task.
This task has been assigned to Jacob L. You have 100 hours to complete this task, good luck!
The claim on this task has been removed, someone else can claim it now.
I would like to work on this task.
This task has been assigned to Mou Yan Qiao. You have 81 hours to complete this task, good luck!
The claim on this task has been removed, someone else can claim it now.
I would like to work on this task.
This task has been assigned to Vladimir Kuznetsov. You have 51 hours to complete this task, good luck!
So that's a beginner task, huh?
I'll try to do my best, but there are some serious problems:
1) The reader marks some of the pages not by lines, but by columns. This gives me a lot of lines in the file and takes a lot of time to sort.
2) Some digits are so bad that even I can't recognize them. If a computer does, it's a miracle.
3) Inaccuracy. The computer may misrecognize some digits (6 instead of 8, for example). I guess when you import this into MGED you'll get a heli-shaped figure, but not the helicopter itself.
This will take a lot more than 50 hours but, as I said above, I'll try to do my best.
The work on this task is ready to be reviewed.
Hi Vladimir,
It is indeed considered a beginner task because it's not difficult. It's VERY tedious, but not at all "hard" in the sense that you don't really need to know much to complete the task. You just need to type what you see.
If we do these tasks again, we'll definitely make them into smaller page sets (maybe 10-20 pages per task). We have hundreds of old models like this one that are only available as terrible scans like this one. It is pretty much impossible for a computer to scan these due to the errors, which is exactly why we need a human to do it (and even then, sometimes it'll be impossible without guessing). It's the best we can do. ;)
Congratulations, this task has been completed successfully.