Indic Wikisource Community Consultation 2018
- Access to Knowledge
Jayanta Nath
8 December 2018
Optical Character Recognition (OCR) was a long-standing need in Indic language computing. No OCR of adequate quality was available for Indic languages before 2015. Most of the Indic Wikisource subdomains were created between 2007 and 2011, but because no OCR was available, the Indic Wikisource community used to type out whole books or import Unicode text from other, unreliable sources. In 2015, the release of Google Drive OCR freed the Indic community from the typing era.
Later, Shrinivasan T developed the OCR4wikisource script, which uses Google Drive OCR as a bot. Since the OCR was put into use, Indic Wikisource has made a lot of progress. But we realized that there should be a common platform where we can share our knowledge. After one month of planning, we organized the Indic Wikisource Community Consultation 2018 in Kolkata. This was the first such consultation at this scale, and it was convened by the CIS-A2K team.
The meeting had representation from volunteers of the Assamese, Bengali, English, Gujarati, Hindi, Kannada, Malayalam, Marathi, Odia, Punjabi, Telugu, and Sanskrit language Wikisource communities: Ananth Subray (Kannada), Bodhisattwa (Bengali), Hrishikes Sen (English), Gurlal Maan (Punjabi), Gitartha Bordoloi (Assamese), Pooja Jadhav (Marathi), Pankajmala Sarangi (Odia), Shubha (Sanskrit), Sushant Savla (Gujarati), Ranjith siji (Malayalam), Ajit Kumar Tiwari (Hindi), Ramesam54 (Telugu), Jayprakash (Indic Tech team), and Chinmayee Mishra (Odia). Also present were Tito Dutta, Tanveer Hasan, Subodh Kulkarni, and Jayanta Nath, four members of the Access to Knowledge Programme of the Centre for Internet and Society (CIS-A2K).
The objectives of the consultation were:
- Share views and preferences on the most effective ways to pursue our shared vision of creating and sharing free knowledge in India and in the Indian languages (including English) around the world through the Indic Wikisource Project.
- Attempt to come to an agreement on a roadmap for a future where our resources are better utilized, our volunteers are better served, and progress on our mission is more steadily attained.
We started our discussion on day zero with the main aims of the consultation and what the participants wanted from the programme. The discussion started at 6 PM and ended at 10 PM. Afterwards, we summarized the points raised and set the agenda for the two days, so the agenda came from the participants themselves. The CIS-A2K team arranged travel and accommodation for all participants, including overnight stays from day zero through day two, to ensure that the programme started on time.
Day one started with my introduction to Wikisource, which covered the Wikisource workflow: adding text, finding the source, basic copyright checking, creating Index pages, running OCR on the pages, proofreading, layout and typography, validation, transclusion, and finishing touches. Later, Hrishikes Sen demonstrated each step in detail. Bodhisattwa (Bengali) demonstrated Wikisource tools such as IA-UPLOAD, Vicuna Uploader, URL2COMMONS, and the Fill index Gadget, and all participants tried them hands-on. Bodhisattwa also showed the Bengali Wikisource promotional videos.
Day two started with a session on using Google Drive OCR without a bot, a solution developed by Jayprakash (Indic Tech team). This was followed by presentations on the OTRS process by Jayanta Nath, the Wikisource roadmap by Tanveer Hasan, institutional partnerships by Subodh Kulkarni, and transclusion in Wikisource by Sushant Savla. The biggest achievement of the meeting came on the second day, when Jayprakash, working with me, took on the task of clearing the Wikisource technical backlog.
Several ideas also came out of Tanveer's session, covering awareness, outreach, follow-ups, and evaluation. A report about this meeting was published in Asomiya Pratidin. Some feedback from the participants can be found here.