Sheetala
sneelagiri99@gmail.com
747-206-3770
Dallas, TX 75354
DATA ENGINEER
8 years experience
W2
Recommendations: 0
Profile views: 258
Summary
5 years of IT experience across a variety of industries, including hands-on experience with Hadoop, Hive, Spark, and Sqoop, as well as data quality, data governance, master data management, and metadata management.
Developed workflows in Oozie to automate loading data into HDFS and pre-processing it with Pig. Developed data ingestion frameworks, real-time processing solutions, and data processing/transformation frameworks.
Used Python scripts to build a workflow in Autosys that automates tasks across three zones in the cluster. Expertise in coding with Python and shell scripting.
Experienced in working with business teams to gather and fully understand business requirements.
Designed and created data extracts supporting reporting applications in Power BI, Tableau, and other visualization tools.
Extensive experience with the Big Data ecosystem and its components, such as Spark, MapReduce, HDFS, Hive, Pig, Sqoop, ZooKeeper, Oozie, and Flume.
Well versed in Amazon Web Services (AWS) cloud services such as EC2, S3, EMR, and Redshift. Experience working with the MapReduce framework and the Spark execution model.
Proficient with file formats such as JSON, ORC, and Parquet. Developed Pig UDFs to pre-process data for analysis.
Hands-on experience programming with Resilient Distributed Datasets (RDDs), DataFrames, and the Dataset API. Experience loading Essbase metadata with Oracle Data Integrator.
Optimized performance for data access requirements by choosing appropriate Hadoop file formats (Avro, Parquet, ORC, etc.) and compression codecs.
Expertise in Data Extraction, Transformation, Loading, Data Analysis, Data Profiling, and SQL Tuning.
Experienced in partitioning and bucketing in Hive; designed both managed and external Hive tables to optimize performance.
Experienced in writing custom Hive UDFs to incorporate business logic into Hive queries.
Good working knowledge of Hadoop architecture and its components, such as HDFS, JobTracker, TaskTracker, NameNode, DataNode, Secondary NameNode, and MapReduce.
Experience in data governance, promoting vision and strategy to support business lines by driving the development and implementation of data governance initiatives.
Experience handling different file formats, such as text files, sequence files, and Avro data files, using different SerDes in Hive.
Experience in process improvement, normalization/denormalization, data extraction, data cleansing, and data manipulation in Hive.
Experience writing Sqoop commands to import data from relational databases into HDFS. Experience with SQL Server and Oracle Database, including writing queries.
Explored Spark to improve the performance and optimization of existing Hadoop algorithms using SparkContext, Spark SQL, DataFrames, pair RDDs, and Spark on YARN.
Developed Pig Latin scripts to extract data from web server output files and load it into HDFS.
Experience
Education
Electrical Engineering
California State University, Northridge 2018
Record has not been verified.
Electrical Engineering
JNTU
Record has not been verified.
Skills
Agile Methodology | 2020 | 1
AWS | 2020 | 1
AWS EC2 | 0 | 1
AWS S3 | 2020 | 1
Big Data | 2020 | 1
Data Cleansing | 0 | 1
Data Engineering | 2020 | 1
Data Governance | 2020 | 1
Data Management | 0 | 1
Data Profiling | 0 | 1
Data Warehousing | 2020 | 1
DB2 | 2020 | 1
ETL | 2020 | 1
Flume | 0 | 1
Hadoop | 2020 | 1
HDFS | 2018 | 1
Hive | 2018 | 1
JSON | 0 | 1
MapReduce | 2018 | 1
Metadata | 2020 | 1
MS Power BI | 0 | 1
Oozie | 2018 | 1
Oracle | 2020 | 1
Performance Tuning | 2018 | 1
Pig | 0 | 1
Postman | 2020 | 1
PySpark | 2018 | 1
Python | 2018 | 1
Requirements Gathering | 0 | 1
Scripting | 2018 | 1
Scrum | 2020 | 1
Shell Scripts | 2018 | 1
Snowflake | 2020 | 1
Spark | 2018 | 1
SQL | 2020 | 1
Sqoop | 2018 | 1
Tableau | 0 | 1
WebServices | 0 | 1