Vishnu
vardh3282@gmail.com
469-312-6820
Dallas, TX 75398
Big Data Engineer
11 years experience W2
Summary

  • 7 years of experience with Big Data stacks
  • AWS cloud, PySpark, Python, SQL, MPP Databases and Dell Boomi
  • Designed, developed and deployed multiple high-throughput, scalable and complex big data ETL pipelines in health care, e-commerce and finance domains.
  • Ran Spark jobs using AWS services such as EC2, S3, EMR, Lambda, Step Functions, Redshift, AWS Glue, DataSync and ECS, along with Dell Boomi.
  • Experienced in creating IAM roles and policies for different AWS services to keep access secure while tasks run.
  • Familiar with creating and managing the infrastructure stack using AWS CloudFormation.
  • Developed data pipelines involving both relational and non-relational databases to perform ingestion into AWS S3.
  • Experienced in production support, including performance tuning, job optimization, job monitoring and job automation.
  • Experience in developing generic frameworks for data ingestion, data curation, data migration and analytics using Spark with Scala and Python.
  • Skilled in handling structured and semi-structured data formats such as CSV, Parquet, Avro, XML and JSON.
  • IDEs used: Databricks, IntelliJ, PyCharm, Sublime Text, Jupyter Notebook, RStudio, Atom
  • Strong knowledge of the Spark ecosystem, including Spark Core, Datasets/DataFrames, Spark SQL and Spark Streaming libraries, and of writing UDFs.
  • Experience in consuming data from various sources such as Kafka, S3 and SFTP servers, and from multiple data stores such as HBase, Hive, Athena and DynamoDB.
  • Good knowledge of using Jenkins and GitHub for Continuous Integration and Deployment (CI/CD).
  • Experience in performance tuning of Spark applications.
  • Extensive experience in developing Spark Streaming jobs, with good knowledge of Kafka.
  • Good understanding of different data stores such as MySQL, PostgreSQL, MongoDB, Cassandra, HBase and HDFS.
  • Experienced in data ingestion, data curation and data manipulation using Scala and Python.
  • Capable of performing data analysis and other numerical computations using Python libraries such as Pandas and NumPy.
  • Experienced in running optimized Spark jobs, bulk loading into and extracting from Redshift tables, writing UDFs and submitting Spark jobs to EMR in Scala; able to do the same in Python.

Experience
Education
not provided
Skills
Data Engineering      2020  4
Apache                2020  3
AWS                   2020  3
AWS S3                2020  3
Big Data              2020  3
ETL                   2020  3
MongoDB               2020  3
Pipeline              2020  3
Spark                 2020  3
SQL                   2020  3
AWS Redshift          2020  2
Data Migration        2020  2
Data Warehousing      2020  2
MySQL                 2018  2
Python                2020  2
AWS EC2               2018  1
Cassandra             2018  1
Data Analysis         2020  1
Data Cleansing        2016  1
Data Science          2018  1
HBase                 2018  1
JSON                  2018  1
Machine Learning      2018  1
Microsoft Excel       2016  1
MS Azure              2018  1
PostgreSQL            2018  1
Snowflake             2018  1
SQL Server            2018  1
Teradata              2018  1
AWS CloudFormation    2020  1
Business Analysis     2020  1
Cloud Infrastructure  2020  1
Data Integration      2016  1
Hive                  2020  1
Jenkins               0     1
Kafka                 0     1
MapReduce             2016  1
Performance Tuning    0     1
Production Support    2020  1
PySpark               2020  1
Shell Scripts         2020  1
XML                   0     1