Worked With Open States
Google Summer of Code 2017


I’ve always been fascinated by the open-source software community, and I wanted to contribute my skills to society and do something impactful. With that aim in mind, I started using GitHub for my personal projects, learning a lot about version control. I also started fixing bugs in open-source projects, which taught me practical Git skills like branching and pull requests, and how best to contribute to these larger development efforts.

When Google announced the list of 201 mentoring organizations, I decided to focus on applying to Open States; it had a welcoming community and would help me hone my existing skills with Python and Django. In preparation, I started tackling outstanding GitHub issues in the Open States scrapers and converting the scrapers to pupa.

After nights of hard work, I was excited to find out on May 4 that my application had been accepted! Especially since I was new to the open-source world (and had never before applied to GSoC), I feel very proud and would like to thank my mentors Miles sir and James sir for this opportunity and their guidance.

My deliverables were: designing and building new admin tools for the updated OCD structure, converting scrapers from Billy to pupa, and fixing bugs and updating scrapers for different states.


Open States was in the process of updating its infrastructure. I helped the organization by converting existing scrapers to pupa ("Pupa-ization"), fixing bugs, and updating scrapers for different states.

Open Civic Data Admin Tools

The new admin tools identify data quality issues in the database, show Open States admins the current quality status of the scrapers, and let them manually fix the issues or mark them as exceptions. The tools also let admins review requests posted by users to report any wrong or missing information on

These tools support the Open Civic Data backend with important features such as:

  1. User Feedback Tool
  2. Merge Tool
  3. Name Resolution Tool
  4. Retirement Tool
  5. Data Quality Issues & Exceptions Tool
  6. Common Status Page for all states containing current status of different parameters
  7. Sub pages for the status of specific sessions/states etc.
  8. Advanced filters to sort data accordingly
  9. Fast search queries using the Django ORM
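
To give a rough feel for the idea behind the Data Quality Issues & Exceptions Tool, here is a minimal sketch in plain Python. It is not the actual Open States code: the `Person` and `Issue` classes, the `find_issues` and `status_summary` functions, and the two example checks are all hypothetical names chosen for illustration, and the real tools work against the database through Django rather than on in-memory objects.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass
class Person:
    # Hypothetical, simplified stand-in for a scraped record.
    id: str
    name: str
    image: str = ""
    email: str = ""


@dataclass(frozen=True)
class Issue:
    # Frozen so that issues are hashable and can be stored in a set.
    object_id: str
    kind: str


# Hypothetical checks: each maps an issue kind to a test on a record.
CHECKS = {
    "missing-image": lambda p: not p.image,
    "missing-email": lambda p: not p.email,
}


def find_issues(people, exceptions=frozenset()):
    """Return one Issue per failed check, skipping issues marked as exceptions."""
    issues = []
    for person in people:
        for kind, is_bad in CHECKS.items():
            issue = Issue(person.id, kind)
            if is_bad(person) and issue not in exceptions:
                issues.append(issue)
    return issues


def status_summary(issues):
    """Count issues per kind, as a status page might display them."""
    return dict(Counter(issue.kind for issue in issues))


people = [
    Person("p1", "Ann Lee", image="ann.jpg"),
    Person("p2", "Bo Cruz"),
]
# An admin has marked Bo's missing email as an accepted exception.
exceptions = {Issue("p2", "missing-email")}

issues = find_issues(people, exceptions)
print(status_summary(issues))
```

Keeping exceptions as a separate set means a check can keep running on every import without re-reporting something an admin has already reviewed.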

Blog Posts

  1. Google Summer of Code - Improved Data Tools
  2. Progress on the OCD Data Quality Tools
  3. Google Summer of Code - Data Quality Tools Update
  4. Google Summer of Code 2017 Final Update

Technology Used

Python, Django, PostgreSQL, lxml, scrapelib, pupa, Git, Travis CI, JavaScript, MDBootstrap


While working on my project I learned a lot, such as how to write quality code and test it effectively. More importantly, I learned the importance of continuous integration and unit testing. GSoC really helped me grow on my road to becoming a super cool programmer.

Related Links: