Niranjan Fartare
Data Engineer with 3+ years’ experience in AWS, PySpark & Spark. Built scalable ETL pipelines for finance & telecom domains using EMR, Glue, and Redshift. Skilled in big data processing and cloud platforms.
Experience
Data Engineer
SourceFuse Inc, Remote
- Monitoring and optimizing ETL pipelines for data ingestion and transformation
- Managing workflows in Amazon EMR to process large datasets efficiently
- Managing Apache Ranger for fine-grained data access control and security policies
- Maintaining data pipelines using Apache Airflow for job orchestration and automation
- Managing version control with Git and Bitbucket
- Skills: Airflow, Bitbucket, EMR, Hive, PySpark, Python, Ranger, RDS, S3
Data Engineer
ERP Consulting, Remote
- Built scalable ETL pipelines using PySpark and Spark to process thousands of daily credit card transactions from AWS S3
- Ingested and enriched raw transaction data from S3 using distributed Spark jobs on EMR
- Deployed Spark applications on AWS EMR for high-performance processing of large datasets
- Extracted reference data from RDS to enhance transaction datasets
- Implemented data validation and quality checks to ensure accuracy
- Optimized Spark jobs to cut processing time and costs
- Automated workflows with Apache Airflow for scheduling and monitoring
- Managed version control with Git (GitHub/Bitbucket)
- Skills: Airflow, EMR, Git, Hive, PySpark, Python, RDS, Spark, S3
Data Engineer Intern
ERP Consulting, Remote
- Built centralized data warehouse in AWS Redshift for telecom customer data
- Processed usage, billing, and service data from AWS S3
- Developed automated ETL pipelines with AWS Glue
- Managed schemas with AWS Glue Data Catalog
- Queried data with AWS Athena for analytics
- Set up CloudWatch monitoring and alerts
- Optimized Redshift table design and query performance
- Skills: Athena, CloudWatch, Glue, RDS, Redshift, S3
Skills
Big Data Processing: Apache Spark, Databricks, Hadoop, HDFS, Hive, PySpark
Cloud Services: Athena, EC2, EMR, Glue, RDS, Redshift, S3
Programming Languages: Bash, Java, Python, SQL
Project Management Tools: Confluence, Jira, SharePoint
Soft Skills: Adaptability, Attention to Detail, Problem Solving, Quick Learner
System & Infrastructure: Cloud infrastructure management, Distributed computing environments, Linux administration, Shell Scripting
Tools: DBeaver, Jupyter Notebooks, MySQL Workbench, PyCharm, PuTTY, VS Code
Version Control & Code Management: Bitbucket, Git, GitHub
Workflow & Orchestration: Apache Airflow
Latest Posts
August 14, 2025: List of my Public Mirrors
July 7, 2025: Free RIPE Atlas Credits
October 22, 2024: Two Sum in Java
July 1, 2024: Hello World!