Hi there!
I’m Josh Archibald,
a software engineer who uses big data insights to make the world a better place.
I’ve developed pipelines and platforms across domains,
ranging from civic and climate tech to education and travel.
Most recently, I worked at Mobi.AI, building AWS infrastructure, engineering large datasets, and developing better ways to access that data, including a REST API. All the while, I was collaborating across teams and disciplines, writing extensive documentation, and presenting to company-wide fora about infrastructure, developer experience, and, of course, our global content store.
In my college summers, I worked across a variety of industries, focusing on building technology for good:
- In summer 2022, I worked as a software engineering intern at H2Ok Innovations, a climate tech startup that is trying to reduce water waste in industrial manufacturing. In this role, I worked on several full-stack development projects, including both internal and client-facing apps.
- In summer 2021, I served as a software engineering intern through the Civic Digital Fellowship, a technology internship program that gives innovative students the opportunity to solve pressing problems in federal agencies. My internship team developed data cleaning and classification techniques to assist with the quantification of environmental impacts on aircraft.
- In summer 2020, I interned at CS50, helping to design software tools to improve the student experience for Harvard’s introductory computer science course. I’ve also taught the course in both its undergraduate and law school iterations.
I previously served as a research assistant at the Creative Computing Lab at the Harvard Graduate School of Education, where I worked on tools to analyze Scratch projects and helped develop a website for the Getting Unstuck professional development program. I also spent a few summers in the IT department of a public school district.
Contact Me
- You can reach me by email at josh@josharchibald.com.
Experience
Software Engineer II, Mobi Systems
Somerville, MA • Nov 2023 − May 2025
- Led the development of a low-latency Django REST API to filter and serve 18M points of interest, plus a PySpark pipeline that turns heavy Parquet records into rows in a relational database, indexes them, and hot-swaps the versioned tables in production using views.
- Designed software to orchestrate tens of millions of requests to external APIs, respecting rate limits. Developed this system design into a reusable template and docs used by teammates for other projects.
- Built a system to assign a single logical ID to multiple records of places from different sources.
- Created a dashboard with React, MapLibre, and Visx to visualize the state of the company’s datasets.
- Spearheaded updates to the POI schema and a transition to Parquet format, reducing S3 usage by 30%.
- Implemented schema checks and golden-data tests to detect errors and unexpected changes in data, similar to enforcing a data contract.
- Rewrote a Glue job used in most pipelines to run 10x faster, reducing costs and removing a bottleneck.
- Implemented a distributed system, using Kafka and AWS Batch, to concatenate millions of S3 objects in parallel.
- Refactored team’s infrastructure-as-code to support the addition of a parallel production AWS account.
- Presented multiple times in company fora, to an average audience of 50 staff including engineers and other stakeholders, on infrastructure, developer experience, and global content.
Software Engineer I, Mobi Systems
Somerville, MA • Jan 2023 − Oct 2023
- Designed data pipelines leveraging AWS Glue to ingest 18 million points of interest, a 3-fold increase over the previous dataset, for use in the company’s flagship travel planning application.
- Built and maintained infrastructure for data engineering workloads, including Airflow and RDS, in Terraform and CloudFormation. Wrote GitHub Actions to deploy application code to AWS.
- Developed Docker containers and Python Click CLIs to standardize the data engineering experience.
Teaching Fellow, CS50, Harvard School of Engineering and Applied Sciences
Cambridge, MA • Sep 2019 − Mar 2024
- Led 23-person weekly recitation and created original slides and practice exercises (23 students in fall 2019, 15 in fall 2020, 16 in fall 2022, if we’re crunching numbers). Led a two-hour seminar on Flask web development.
- Reviewed, tested, and graded students’ final projects. Assisted students in planning their projects.
- Helped students at office hours for CS50, CS50 for JDs, and the Harvard Business Analytics Program (the latter two are essentially spinoff courses designed for law and online business certificate students, respectively). Led a section and weekly office hours for CS50 for Teachers, a version of the course for K-12 teachers based in Indonesia.
- Received a Certificate of Distinction in Teaching from the Office of Undergraduate Education in fall 2020; the certificate is awarded to instructors with an overall rating of 4.50 or higher (out of 5) in course evaluations.
Software Engineering Intern, H2Ok Innovations
Somerville, MA • May − Aug 2022
- Architected and developed backend Flask APIs and React frontend for internal web app of climate tech startup.
- Designed React components and API routes for a full-stack redesign of company’s client-facing app.
- Automated web app releases, API testing, and configuration propagation.
- Participated in discussions about architectural choices, including database design, API structure, and infrastructure decisions.
Software Engineering Fellow, Civic Digital Fellowship, U.S. Special Operations Command
Remote • Jun − Aug 2021
- Developed data cleaning pipelines for an aircraft fault record classification model; helped develop model.
- Analyzed aircraft fault records using classification model results; compiled and presented report on findings.
- Selected from a competitive applicant pool of 1,700+ students with an acceptance rate of 6%.
Software Engineering Intern, CS50
Cambridge, MA • Jun − Aug 2020
- Overhauled SQL library, used by thousands of students each year, to handle both manual and auto transactions robustly.
- Collaborated on a React app to compare similar student work side-by-side; added PDF export and a D3 graph component.
- Devised new unit tests for 6 libraries and software tools used by students and teaching staff.
Research Assistant, Harvard Graduate School of Education
Cambridge, MA • Feb − Aug 2020
- Developed a Python package to scrape and aggregate data about Scratch projects, studios, and users. (Most of Scratch’s own software is written in Node.js, but we decided on a Python-based stack.) Tested with Pytest and Travis CI; deployed API documentation using Sphinx.
- Worked with team to design a website that allows users to reflect on their learning progress in Scratch through specific teachable examples from other people’s projects and guided reflection questions. (Alas, the app is no longer live, but I do have a video of the time selection and emotional state feedback inputs I designed.)
[Image: the project summary page of the Clowder software.]
- Contributed heavily to site’s backend, including MongoDB cache backend; Celery worker app; and Scratch code analysis.
- Collaborated with team to make design decisions based on technical and pedagogical concerns alike. Contributed to paper on findings (publication pending) by combining and organizing data from Google Analytics and the site’s database.
- Software used in a two-week professional development program by over 300 teachers to analyze over 2,300 projects.
Information Technology Intern, Montville Public Schools
Oakdale, CT • Jun 2016 − Aug 2019 (4 summers)
- Designed Python scripts to parse and clean student data, easing transition between Student Information Systems.
- Developed Python tools to manage the return of old teacher laptops and the distribution of new ones during a lease transition, tracking asset and serial information and using that data to find missing computers via the Meraki network dashboard.
- Redesigned sections of district website (I even wrote an Electron app so school admin could generate the HTML and JS for a landing page slideshow, which I had to write manually because of the CMS used); helped plan and execute a file server migration.
Skills
Below is a full listing of the programming languages with which I feel comfortable, and a brief listing of some of the libraries and tools with which I have experience. This is not an exhaustive list of every language and library I have ever used.
Each skill is followed by the year in which I first used it.
Programming languages
- C (2018)
- HTML + CSS (2013)
- Java (2016)
- JavaScript/TypeScript (2013/2022)
- MongoDB (2020)
- PHP (2014)
- Python (2015)
- SQL (PostgreSQL, MySQL, SQLite) (2014)
Libraries
- Beautiful Soup (2020)
- Django REST Framework (2024)
- Flask (2018)
- Pandas (2021)
- Plotly (Python, JS) (2021)
- PySpark (2023)
- React (2020)
- Selenium (2020)
- Vue.js (2018)
Tools and devops things
- Apache Airflow (2023)
- Apache Kafka (2023)
- Auth0 (2022)
- AWS - API Gateway, AppConfig, Athena, Batch, CloudFormation, CloudFront, CloudWatch, CodeArtifact, CodePipeline, DMS, ECR, ECS (Fargate), Elastic Beanstalk, EventBridge, Glue, IAM, Lambda, MSK, MWAA, RDS, S3, Secrets Manager, SNS, Step Functions (2022)
- Docker (2020)
- Git (2015)
- GitHub Actions (2020)
- Jupyter/Google Colab (2019)
- Netlify (2020)
- Pytest (2020)
- Python Package Index (as a publisher) (2020)
- Sphinx Documentation (2020)
- Terraform (2023)
- Travis CI (2020)
Projects
Vericlass
Jul 2020-Sep 2023
Getting Unstuck Web
Mar-Aug 2020
Scratch Tools
Feb-Jul 2020
Room Designer
Apr-May 2020
Postcards
Jan 2020
Exhibitor
Nov-Dec 2018
Cookie Analysis
Dec 2019
Education
Harvard College
A.B. Computer Science
Cambridge, MA • Aug 2018 − Dec 2022
- Majored in computer science; minored in history. I wanted a balanced liberal arts education, and that’s part of why I chose to attend Harvard. Learning the nuances of history as a discipline taught me a different way of thinking and provided a richer perspective than if I had limited myself to CS.
- Cross-registered for CS classes at MIT.
- Volunteered in Boston Public Schools, teaching American government and computer science.
- Formal degree conferral in March 2023.