Sai
abhi.work69@gmail.com
703-996-9429
Arlington, VA 22246
Hadoop/Spark Developer
10 years of experience (W2)
Summary

8+ years of professional experience in software development, with 5+ years in big data technologies including Hadoop and Spark.
• Professional Java developer with strong expertise in data engineering and big data technologies.
• Extensively worked on Spark, Hive, Pig, MapReduce, Sqoop, Kafka, Oozie, HBase, Impala, and YARN.
• Hands-on experience programming in Java, Python, Scala, and SQL.
• Sound knowledge of distributed systems architecture and parallel processing frameworks.
• Designed and implemented end-to-end data pipelines to process and analyze massive amounts of data.
• Experienced with Hadoop distributions both on-prem (CDH, HDP) and in the cloud (AWS).
• Good experience with data analytics and big data services in AWS, such as EMR, Redshift, S3, Athena, and Glue.
• Experienced in developing production-ready Spark applications using the Spark RDD API, DataFrames, Spark SQL, and the Spark Streaming API.
• Worked extensively on fine-tuning Spark applications to improve performance and on troubleshooting Spark application failures.
• Strong experience with Spark Streaming, Spark SQL, and other Spark features such as accumulators, broadcast variables, the different caching levels, and job optimization techniques (a broadcast-join sketch follows this summary).
• Proficient in importing/exporting data between RDBMS and HDFS using Sqoop.
• Used Hive extensively to perform the data analytics required by business teams.
• Solid experience working with data formats such as Parquet, ORC, Avro, and JSON.
• Experience automating end-to-end data pipelines with strong resilience and recoverability.
• Strong knowledge of NoSQL databases; worked with HBase, Cassandra, and MongoDB.
• Extensively used IDEs such as IntelliJ IDEA, NetBeans, and Eclipse.
• Expert in SQL; extensively worked with RDBMSs such as Oracle, SQL Server, DB2, MySQL, and Teradata.
• Worked with Apache NiFi to ingest data into HDFS from a variety of sources.
• Proficient with Git, Jenkins, and Maven.
• Good understanding of and experience with the Agile and Waterfall methodologies of the Software Development Life Cycle (SDLC).
• Highly motivated self-learner with a positive attitude, willing to learn new concepts and take on challenges.
Big Data Ecosystems: HDFS, MapReduce, YARN, Hive, Sqoop, Pig, Spark, HBase, Oozie.
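For illustration, a minimal Spark/Scala sketch of the broadcast-join pattern referenced above. The S3 paths and the user_id join key are hypothetical, not taken from any project listed here:

import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.broadcast

object EnrichClickstream {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder().appName("clickstream-enrichment").getOrCreate()

    // Hypothetical inputs: a large clickstream fact set and a small profile dimension.
    val clicks   = spark.read.parquet("s3://example-bucket/clickstream/")
    val profiles = spark.read.parquet("s3://example-bucket/user_profiles/")

    // Broadcasting the small side ships it whole to every executor,
    // so the large clickstream side never has to shuffle for the join.
    val enriched = clicks.join(broadcast(profiles), Seq("user_id"), "left")

    enriched.cache() // reused by multiple downstream aggregations
    enriched.write.mode("overwrite").parquet("s3://example-bucket/enriched/")
    spark.stop()
  }
}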

Experience
Hadoop/Spark Developer
Information Technology
Mar 2018 - present
Responsibilities:
• Ingested gigabytes of clickstream data daily from external sources such as FTP servers and S3 buckets using custom input adapters.
• Created Sqoop scripts to import/export user profile data between RDBMS and the S3 data lake.
• Developed Spark applications in Scala to enrich user behavioral (clickstream) data merged with user profile data.
• Involved in data cleansing, event enrichment, data aggregation, denormalization, and data preparation needed for downstream model training and reporting.
• Utilized the Spark Scala API to implement batch processing jobs.
• Troubleshot Spark applications to improve error tolerance.
• Fine-tuned Spark applications/jobs to improve efficiency and overall pipeline processing time.
• Built Kafka producers to send live-stream data to various Kafka topics (a producer sketch follows this role).
• Developed Spark Streaming applications to consume data from Kafka topics and write the processed streams to HBase (a consumer sketch also follows this role).
• Utilized Spark's in-memory capabilities to handle large datasets.
• Used broadcast variables, efficient joins, transformations, and other Spark capabilities for data processing.
• Worked with EMR clusters and S3 in the AWS cloud.
• Created Hive tables and loaded and analyzed data using Hive scripts; implemented partitioning, dynamic partitions, and buckets in Hive.
• Involved in continuous integration of the application using Jenkins.
• Interacted with the infrastructure, network, database, application, and BA teams to ensure data quality and availability.
Environment: AWS EMR, Spark, Hive, HDFS, Sqoop, Kafka, Oozie, HBase, Scala, MapReduce.
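For illustration, a minimal Kafka producer in Scala of the kind described in this role. The broker address, topic name, and sample record are assumptions:

import java.util.Properties
import org.apache.kafka.clients.producer.{KafkaProducer, ProducerRecord}

object ClickstreamProducer {
  def main(args: Array[String]): Unit = {
    val props = new Properties()
    props.put("bootstrap.servers", "broker1:9092") // hypothetical broker
    props.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer")
    props.put("value.serializer", "org.apache.kafka.common.serialization.StringSerializer")

    val producer = new KafkaProducer[String, String](props)
    try {
      // One record per click event, keyed by user id; topic name is an assumption.
      producer.send(new ProducerRecord[String, String](
        "clickstream-events", "user-123", """{"page":"/home"}"""))
    } finally {
      producer.close()
    }
  }
}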
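And a sketch of the Kafka-to-HBase streaming path, using the spark-streaming-kafka-0-10 integration and the HBase client API; the topic, table, and column family names are likewise assumptions:

import org.apache.hadoop.hbase.{HBaseConfiguration, TableName}
import org.apache.hadoop.hbase.client.{ConnectionFactory, Put}
import org.apache.hadoop.hbase.util.Bytes
import org.apache.kafka.common.serialization.StringDeserializer
import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Seconds, StreamingContext}
import org.apache.spark.streaming.kafka010.KafkaUtils
import org.apache.spark.streaming.kafka010.LocationStrategies.PreferConsistent
import org.apache.spark.streaming.kafka010.ConsumerStrategies.Subscribe

object ClickstreamToHBase {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setAppName("clickstream-to-hbase")
    val ssc  = new StreamingContext(conf, Seconds(10)) // 10s micro-batches

    val kafkaParams = Map[String, Object](
      "bootstrap.servers"  -> "broker1:9092", // hypothetical
      "key.deserializer"   -> classOf[StringDeserializer],
      "value.deserializer" -> classOf[StringDeserializer],
      "group.id"           -> "clickstream-consumers",
      "auto.offset.reset"  -> "latest")

    val stream = KafkaUtils.createDirectStream[String, String](
      ssc, PreferConsistent, Subscribe[String, String](Seq("clickstream-events"), kafkaParams))

    stream.foreachRDD { rdd =>
      rdd.foreachPartition { records =>
        // One HBase connection per partition, not per record.
        val conn  = ConnectionFactory.createConnection(HBaseConfiguration.create())
        val table = conn.getTable(TableName.valueOf("clicks")) // hypothetical table
        records.foreach { rec =>
          // Falls back to a fixed row key if the producer sent no key.
          val put = new Put(Bytes.toBytes(Option(rec.key()).getOrElse("unknown")))
          put.addColumn(Bytes.toBytes("e"), Bytes.toBytes("json"), Bytes.toBytes(rec.value()))
          table.put(put)
        }
        table.close()
        conn.close()
      }
    }

    ssc.start()
    ssc.awaitTermination()
  }
}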
Spark Developer
Information Technology
Mar 2016 - Jan 2018
Responsibilities:
• Extensively worked on migrating data from traditional RDBMS to HDFS.
• Ingested data into HDFS from Teradata and MySQL using Sqoop.
• Developed Spark applications to perform ELT-style operations on the data.
• Migrated existing MapReduce jobs to Spark transformations and actions using Spark RDDs, DataFrames, and the Spark SQL API in Scala.
• Used Hive partitioning and bucketing, and performed various kinds of joins on Hive tables (a partitioning sketch follows this role).
• Created Hive external tables to perform ETL on data produced daily.
• Validated data ingested into Hive for further filtering and cleansing.
• Developed Sqoop jobs for incremental loads from RDBMS into HDFS, then applied Spark transformations.
• Loaded data into Hive tables from Spark using the Parquet columnar format.
• Created Oozie workflows to automate and productionize the data pipelines.
• Collected and aggregated large amounts of log data using Apache Flume, staging it in HDFS for further analysis.
• Designed and documented operational problems following standards and procedures using JIRA.
Environment: HDP (Hortonworks Data Platform), Spark, Scala, Sqoop, Oozie, Hive, CentOS, MySQL, Oracle DB, Flume.
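For illustration, a minimal Spark/Scala sketch of the Hive partitioning and Parquet loading pattern from this role. The database, table, and path names are hypothetical, and the raw data is assumed to carry a load_date column:

import org.apache.spark.sql.SparkSession

object DailyHiveLoad {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("daily-hive-load")
      .enableHiveSupport() // required to create and write Hive tables
      .getOrCreate()

    // External table over a daily landing area (hypothetical names).
    spark.sql(
      """CREATE EXTERNAL TABLE IF NOT EXISTS staging.events (
        |  user_id STRING, event_type STRING, payload STRING)
        |PARTITIONED BY (load_date STRING)
        |STORED AS PARQUET
        |LOCATION 'hdfs:///data/staging/events'""".stripMargin)

    // Dynamic partitioning lets one write fan out to many date partitions.
    spark.sql("SET hive.exec.dynamic.partition.mode=nonstrict")

    spark.read.parquet("hdfs:///data/raw/events")
      .write
      .partitionBy("load_date")
      .mode("append")
      .saveAsTable("warehouse.events_curated") // curated Parquet table

    spark.stop()
  }
}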
Java Developer
Information Technology
Jul 2011 - Aug 2012
Responsibilities:
• Involved in developing the application on the Java/J2EE platform; implemented the Model-View-Controller (MVC) pattern using Struts.
• Enhanced the portal UI using HTML, JavaScript, XML, JSP, Java, and CSS per requirements, providing client-side JavaScript validation and server-side Bean Validation (JSR 303).
• Used Spring Core annotations for dependency injection.
• Used Hibernate as the persistence framework, mapping ORM objects to tables with Hibernate annotations.
• Wrote the various service classes and utility APIs used across the framework.
• Used Axis to implement web services for integrating different systems.
• Developed web service components using XML, WSDL, and SOAP with a DOM parser to transfer and transform data between applications.
• Exposed various capabilities as web services using SOAP/WSDL.
• Used SoapUI to test the web services by sending SOAP requests.
• Used an AJAX framework for server communication and a seamless user experience.
• Created a test framework on Selenium and executed web testing in Chrome, IE, and Firefox through WebDriver.
• Used client-side JavaScript (jQuery) to build tabs and dialog boxes.
• Created UNIX shell scripts to automate the build process and perform routine jobs such as file transfers between hosts.
• Used Log4j to log output to files.
• Used JUnit with Eclipse for unit testing of the various modules.
• Involved in production support: monitoring server and error logs, foreseeing potential issues, and escalating to higher levels.
Environment: Java, J2EE, JSP, Servlets, Spring, Custom Tags, JavaBeans, JMS, Hibernate, IBM MQ Series, AJAX, JUnit, Log4j, JNDI, Oracle, XML, SAX, Rational Rose, UML.
Education
Computer Science
Acharya Nagarjuna University