James C. Caldwell

Bio

I'm a computational researcher and software developer based in London, Ontario. I specialize in machine learning, generative AI, and building data pipelines that turn messy, large-scale datasets into usable research and business intelligence. I work as an independent consultant and am also open to team-based roles in industry.

My recent work includes a research collaboration with Environment Canada and Western University, where I designed and built a vision-language model pipeline to extract handwritten qualitative observations from over 571,000 digitized historical weather forms spanning 1840 to 1960. The project involved organizing 6 TB of scanned documents, benchmarking multiple AI models across different hardware configurations, and delivering detailed cost/benefit analyses to guide the production run. The pipeline is now ready for deployment, and a co-authored methodology paper is in progress.

I also built the Modular Digital Methodologies Toolkit (MDMT), a desktop application that integrates OCR, audio transcription, named entity recognition, AI-powered document analysis, and other tools into a single interface for researchers working with document collections. MDMT is open-source and available on GitHub.

Earlier work includes a computational discourse analysis of over 80 years of Canadian government records, conducted in collaboration with Dr. Janice Forsyth, examining how sport and physical activity were used by residential school administrators in Canada. That project required building custom NLP pipelines to analyze language patterns across a large archival corpus.

I hold an MA in History (computational focus) from Western University, where my thesis applied bibliometric methods and machine learning to trace the emergence of aquatic antibiotic pollution as a global research field across approximately 45,000 scientific publications. I also hold a BA in Psychology from Western, where I focused on the biological bases of behavior. That coursework grounded me in experimental design, statistical analysis, and the neuroscience of cognition and perception.

I'm looking for projects where computational methods can unlock value in complex data, whether that's in academia, government, or the private sector. If you have a problem that involves large amounts of data, unstructured text, document analysis, or AI, I'd like to hear about it.

You can contact me at James.Caldwell.000@gmail.com

ORCID

Curriculum Vitae