Hi, I'm Riccardo Ruspoli

Data Engineer building scalable and reliable data platforms. Certified in Snowflake and AWS, with additional hands-on experience in Databricks, Spark, Kafka, Airflow, and dbt. Currently exploring Large Language Models to bridge data and AI.

About

I'm Riccardo, a Data Engineer based in Novara with experience in consulting, enterprise, and startup environments. I specialize in designing and optimizing cloud data warehouses, real-time pipelines, and data platforms at scale. My expertise spans Snowflake, Databricks, AWS, Python, Java, Scala, and their main frameworks. I enjoy working in cross-functional teams, where I combine technical depth with business impact, and I'm currently expanding my skills in Large Language Models to connect data engineering with intelligent systems.

Interests

  • 💻 Programming
  • 🛰️ Space
  • 📷 Photography
  • 🏹 Archery

Skills

Apache Airflow, Apache Kafka, Apache Spark, AWS, Azure, CSS, Databricks, DuckDB, dbt, Git, HTML, Java, JavaScript, LLM, PostgreSQL, Python, Salesforce, Scala, Snowflake, Spring, SQL, Talend, Terraform

Experience

Data Engineer — Agile Lab

February 2025 — Present

Contributing to a Scala/Spark ingestion framework and Databricks workflows for near real-time pipelines in a medallion architecture; integrating Kafka, Snowflake, dbt, and Airflow in the insurance domain.

Advanced Data Engineer — NTT Data

November 2022 — February 2025

Built a 16+ TB Snowflake data warehouse and reduced daily processing time by ~30%. Optimized Scala/Kafka streaming and Java APIs for IoT and onboarded new real-time analytics use cases.

Data Engineer — Expandi Group

October 2021 — November 2022

Built a 1+ TB data warehouse and Python/Talend/Airflow pipelines processing 50M+ records/month; migrated infrastructure to AWS to reduce costs and led a team of 5.

Full-Stack Java Developer — Expandi Group

March 2019 — October 2021

Developed a B2B Data-as-a-Service application using Java/Spring and Salesforce; integrated CRM tools and improved ETL with Talend.

Certifications

  • Databricks Certified Associate Developer for Apache Spark

    Databricks · 2026

  • SnowPro® Advanced: Data Engineer

    Snowflake · 2025

  • SnowPro® Core

    Snowflake · 2023

  • AWS Certified Cloud Practitioner

    Amazon Web Services · 2022

  • Oracle Certified Associate, Java SE 8 Programmer

    Oracle · 2019

Projects

Wikipedia redlink explorer: a data pipeline and website built from Wikimedia SQL dumps.

Python DuckDB AWS

exitlight

A privacy-oriented CLI tool to discover legitimate contact methods for data access, deletion, or modification requests.

Python Playwright

reko

A modern, local-first CLI tool to extract transcripts from YouTube and transform them into concise summaries and key points.

Python LLM

StarTracker

An Arduino “barn door” star tracker implementation for long-exposure astrophotography.

C++ Arduino

Contact

Let's connect — always happy to talk data platforms, engineering, and AI. 🤝