Niranjan Fartare

Data Engineer with 3+ years’ experience in AWS, PySpark, and Spark. Built scalable ETL pipelines for the finance and telecom domains using EMR, Glue, and Redshift. Skilled in big data processing and cloud platforms.

Experience

Data Engineer

ERP Consulting, Pune

July 2023 – August 2025
  • Built scalable ETL pipelines using PySpark and Spark to process thousands of daily credit card transactions from AWS S3
  • Ingested and enriched raw transaction data from S3 using distributed Spark jobs on EMR
  • Deployed Spark applications on AWS EMR for high-performance processing of large datasets
  • Extracted reference data from RDS to enhance transaction datasets
  • Implemented data validation and quality checks to ensure the accuracy of processed transactions
  • Optimized Spark jobs to cut processing time and costs
  • Automated workflows with Apache Airflow for scheduling and monitoring
  • Managed version control with Git (GitHub/Bitbucket)
  • Skills: Airflow, EMR, Git, Hive, PySpark, Python, RDS, Spark, S3

Data Engineer Intern

ERP Consulting, Pune

June 2022 – June 2023
  • Built centralized data warehouse in AWS Redshift for telecom customer data
  • Processed usage, billing, and service data from AWS S3
  • Developed automated ETL pipelines with AWS Glue
  • Managed schemas with AWS Glue Data Catalog
  • Queried data with AWS Athena for analytics
  • Set up CloudWatch monitoring and alerts
  • Optimized Redshift table design and query performance
  • Skills: Athena, CloudWatch, Glue, RDS, Redshift, S3

Skills

Big Data Processing: Apache Spark, Databricks, Hadoop, HDFS, Hive, PySpark, Snowflake

Cloud Services: Athena, EC2, EMR, Glue, QuickSight, RDS, Redshift, S3

Programming Languages: Bash, HQL, Java, Python, SQL

Project Management Tools: Confluence, Jira

Soft Skills: Adaptability, Attention to Detail, Problem Solving, Quick Learner

System & Infrastructure: Cloud infrastructure management, Distributed computing environments, Linux administration, Shell Scripting

Tools: Jupyter Notebooks, MySQL Workbench, Power BI, PyCharm, PuTTY, VS Code

Version Control & Code Management: Bitbucket, Git, GitHub

Workflow & Orchestration: Apache Airflow, AWS Glue, Databricks Lakeflow Jobs

Latest Posts

October 22, 2024

Two Sum in Java

July 1, 2024

Hello World!