Difference between revisions of "SJ1"

From BRL-CAD

Revision as of 17:57, 26 March 2012

Personal details

  • Name: Suryajith Chillara
  • irc handle: Stattrav
  • email id: suryajith1987@gmail.com
  • background: <At the end of the page.>

Sources

ftp

Users could just transfer/submit their logs to the public ftp server.

scp

Developers could transfer the logs to a specific folder on the server.

Mail

Users mail the benchmark file to "benchmark@brlcad.org".

http API

An HTTP API receives the file as a POST message, which could be sent using curl or an equivalent tool.

Diagram

(Preliminary) Stattrav process diagram.jpg

Mechanism

Web API

An HTTP API has to be built that accepts the benchmark file as a POST message or through a file upload mechanism. The HTTP call is embedded either in the benchmark shell script or in a separate script, so that the user can just run "benchmark-post" or some such command. The file upload could be automated using urllib2 (Python) or an equivalent library for other languages. If this mechanism is included in the benchmark shell script, it could be enabled through an extra argument such as "--push-result-to-web=true".
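A minimal sketch of the client side of such an upload is given here, using only the Python 3 standard library (urllib.request is the Python 3 counterpart of urllib2). The endpoint URL, the form field name "benchmark" and the function names are assumptions made for illustration; the proposal does not define them yet.

```python
import os
import urllib.request
import uuid


def build_upload_request(url, filename, payload):
    """Build a multipart/form-data POST request carrying one benchmark log.

    The form field name "benchmark" is hypothetical; the server-side API
    has not been specified yet.
    """
    boundary = uuid.uuid4().hex
    head = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="benchmark"; filename="{filename}"\r\n'
        "Content-Type: text/plain\r\n\r\n"
    ).encode("utf-8")
    tail = f"\r\n--{boundary}--\r\n".encode("utf-8")
    body = head + payload + tail

    request = urllib.request.Request(url, data=body, method="POST")
    request.add_header("Content-Type", f"multipart/form-data; boundary={boundary}")
    request.add_header("Content-Length", str(len(body)))
    return request


def post_benchmark(url, path):
    """Read the log file at `path`, send it, and return the HTTP status code."""
    with open(path, "rb") as handle:
        payload = handle.read()
    request = build_upload_request(url, os.path.basename(path), payload)
    with urllib.request.urlopen(request, timeout=30) as response:
        return response.status
```

A "benchmark-post" command could then be a thin wrapper that calls post_benchmark with the server URL and the path of the log produced by the benchmark run.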

FTP sync

Similar to the HTTP push, an FTP sync can be implemented at the user end. The files are submitted to a queue folder, where a polling script checks for new files and introduces them into the db and the file storage folder. The polling script (say, at a frequency of 5 minutes) finds the files created after the last poll and checks whether they have already been introduced into the db by comparing their md5sum (which could be stored in a separate table). The script writes a log to a file, which could be used to check that it has worked properly. A separate cleaning script could check these logs, push any log files that have not been moved into the db and the file storage folder, and send an email if there are any discrepancies. The log files in the file storage folder could be stored as .gz.
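One pass of such a polling script could look roughly like the sketch below. It is written under several assumptions that are not part of the proposal: sqlite3 stands in for the actual db, the table name seen_logs and the function names are hypothetical, stored files are named after their checksum, and every file is removed from the queue once it has been looked at (instead of filtering by creation time since the last poll).

```python
import gzip
import hashlib
import os
import sqlite3


def md5_of(path):
    """Return the hex md5 digest of a file, read in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def poll_queue(queue_dir, storage_dir, conn):
    """Ingest new files from queue_dir and return the names that were stored."""
    # Hypothetical table holding the checksums of logs already ingested.
    conn.execute(
        "CREATE TABLE IF NOT EXISTS seen_logs (md5 TEXT PRIMARY KEY, name TEXT)"
    )
    os.makedirs(storage_dir, exist_ok=True)
    ingested = []
    for name in sorted(os.listdir(queue_dir)):
        source = os.path.join(queue_dir, name)
        if not os.path.isfile(source):
            continue
        checksum = md5_of(source)
        known = conn.execute(
            "SELECT 1 FROM seen_logs WHERE md5 = ?", (checksum,)
        ).fetchone()
        if known is None:
            # Store the log compressed, named by checksum to avoid collisions.
            target = os.path.join(storage_dir, checksum + ".gz")
            with open(source, "rb") as raw, gzip.open(target, "wb") as packed:
                packed.write(raw.read())
            conn.execute("INSERT INTO seen_logs VALUES (?, ?)", (checksum, name))
            conn.commit()
            ingested.append(name)
        # Assumption: new and duplicate files alike leave the queue here.
        os.remove(source)
    return ingested
```

Run from cron every 5 minutes, poll_queue would cover the sync step; the separate cleaning and verification script described above is not part of this sketch.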

scp sync

Similar to the FTP sync: files transferred over scp land in the queue folder and are picked up from there by the same polling mechanism.

Mail server

Similar to the FTP/scp sync, a polling script could be written to check the IMAP server and bring in the attachments.
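A rough sketch of the mail polling step, using the standard imaplib and email modules, follows. The function names, the use of the INBOX folder and the choice to fetch only UNSEEN messages are assumptions for illustration; host and credentials are placeholders to be supplied by the caller.

```python
import email
import imaplib


def extract_attachments(raw_message):
    """Return (filename, content) pairs for each attachment in a raw message."""
    message = email.message_from_bytes(raw_message)
    found = []
    for part in message.walk():
        filename = part.get_filename()
        if filename:
            # decode=True undoes the base64/quoted-printable transfer encoding.
            found.append((filename, part.get_payload(decode=True)))
    return found


def fetch_unseen_attachments(host, user, password):
    """Log in over IMAP/SSL and collect attachments from unread mail."""
    conn = imaplib.IMAP4_SSL(host)
    try:
        conn.login(user, password)
        conn.select("INBOX")
        _, data = conn.search(None, "UNSEEN")
        attachments = []
        for number in data[0].split():
            # Fetching the full message also marks it as seen on the server.
            _, parts = conn.fetch(number, "(RFC822)")
            attachments.extend(extract_attachments(parts[0][1]))
        return attachments
    finally:
        conn.logout()
```

The attachments returned here could then be written into the same queue folder used by the FTP/scp sync, so that one ingestion path handles all sources.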

Data extraction

TODO


Storage

Values to be maintained in the db

TODO

flat file on the disk

TODO

Db Schema

TODO

Backup

TODO


User-end code changes

User-end tools

TODO

Other data needed for analysis

TODO

Cross platform solutions for the data

TODO

Core scripts to be written

FTP/scp sync

TODO

FTP/scp verifier

TODO

IMAP sync

TODO

IMAP verifier

TODO

files to db

TODO

http-push

TODO

ftp-push

TODO

Frontend

TODO

Analysis

TODO

Background

I am a first-year graduate student at Chennai Mathematical Institute studying computer science. I was a Google Summer of Code scholar with the Sahana Software Foundation (code: https://bitbucket.org/suryajith/gsoc-2010; reference: lifeeth@gmail.com, who was my mentor), where I developed a handwritten character recognition module implemented as a wrapper over tesseract, an automated training module for the character recognition module, and an entirely independent module for generating OCR/HCR-able forms.

After the summer of 2010, I joined a web startup (www.capillary.co.in) as a software developer and worked on the company's custom MVC, which was refactored and mostly rewritten during my time there. I played an important role in developing many essential web APIs, the controller and model code of various modules, and the scaling and performance tuning of the LAMP stack. Apart from this, I have been working on a few independent modules involving machine learning and text parsing techniques, implemented in Python. I am familiar with web paradigms, web framework design and backend scaling. Though I have predominantly worked with Python and PHP with MySQL so far, I can pick up and work with any other language or tool required fairly quickly.

I've been lurking and following brlcad for around 3-4 years now, and I shall continue to be around and to work on it.