Majeed
abdulmajeed99313@gmail.com
724-278-3107
Hadoop Administrator
11 years experience
Experience with CDH 4.X and CDH 5.X in a multi-cluster environment
Summary

• 4+ years of overall IT experience, including 3+ years of relevant experience in Big Data Hadoop administration; currently working in this role.
• Experience in administration of Spark.
• Experience performing transformations using Spark RDDs (a minimal sketch follows this summary).
• Experienced in setting up pre-requisites on servers for Hadoop clusters.
• Expertise in Installation, configuration and management of Cloudera Hadoop Enterprise version.
• Experienced in cluster planning, architecting, installation, configuration, and deployment of enterprise Hadoop clusters.
• Specialist in delivering an Enterprise Data Hub in the cloud (on AWS).
• Deployment of Enterprise Hadoop in public, private and hybrid cloud Environments
• Experience in deploying EMR PaaS clusters on AWS.
• Monitored and acted on long-running jobs on integration and production Hadoop clusters, analyzing delayed jobs affecting the cluster.
• As an admin, involved in cluster maintenance, troubleshooting, and monitoring, and followed proper backup and recovery strategies.
• Data governance on the Hadoop cluster: deployed authentication using Kerberos, authorization using Sentry and extended ACLs, and encryption using KMS.
• Performance optimization of Hadoop clusters; troubleshooting, diagnosing, tuning, and solving Hadoop issues.
• Experience in Deploying, Configuring and Maintaining Kafka Cluster in Production as a part of Cloudera
Enterprise.
• Generated reports on running nodes using various benchmarking operations.
• Experienced with various Hadoop ecosystem tools such as Hive, ZooKeeper, Sqoop, Flume, Oozie, Hue, Spark, and Kafka.
• Expertise in commissioning and decommissioning nodes.
• Upgrading Hadoop Cluster in production environment
• Delivered a production-ready Hive interface by configuring a remote metastore deployment.
• Experienced in configuration of MIT Kerberos as well as AD Kerberos in Cloudera Manager
• Experience designing, configuring, and managing backup and disaster recovery for Hadoop data.
• Experience importing and exporting data between HDFS and relational databases such as MySQL.
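As a minimal, hedged illustration of the Spark RDD transformation work described above (the input path and record layout are hypothetical, not taken from the resume):

```python
# Minimal PySpark RDD sketch (hypothetical input path and record layout).
from pyspark import SparkContext

sc = SparkContext(appName="rdd-transform-sketch")

# Parse tab-separated log lines, keep well-formed records, count by status code.
lines = sc.textFile("hdfs:///data/logs/sample.tsv")  # hypothetical path
records = lines.map(lambda line: line.split("\t")).filter(lambda f: len(f) >= 3)
status_counts = (records
                 .map(lambda f: (f[2], 1))          # assume field 3 holds a status code
                 .reduceByKey(lambda a, b: a + b))

for status, count in status_counts.collect():
    print(status, count)

sc.stop()
```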

Email: hdfscloudera@gmail.com

Experience
Hadoop Administrator
Feb 2019 - present
Sunnyvale, CA
Responsibilities:
• Installed the Druid service on the HDP cluster, loaded batch jobs into Druid, and loaded streaming data from Kafka into Druid.
• Connected Druid to Hive and Hive to Tableau.
• Upgraded the cluster from HDP 2.6.0 to HDP 2.6.5.
• Worked on increasing HDFS IO efficiency by adding new disks and directories to HDFS data nodes. Tested HDFS performance by DFSIO before and after adding data directories.
• Worked on HBase performance tuning by following Apache HBase recommendations and changed row key accordingly.
• Creation of key performance metrics, measuring the utilization, performance and overall health of the cluster.
• Capacity planning and implementation of new/upgraded hardware and software releases as well as for storage infrastructure.
• Research and recommend innovative, and where possible, automated approaches for system administration tasks.
• Collaborated closely with product managers and lead engineers.
• Provide guidance in the creation and modification of standards and procedures
• Proactively monitored and set up alerting mechanisms for the Kafka cluster and supporting hardware to ensure system health and maximum availability.
• Wrote Lambda functions in Python that invoke scripts to perform various transformations and analytics on large data sets in EMR clusters (a hedged sketch follows this list).
• As lead of the Data Services team, built a Hadoop cluster on the Azure HDInsight platform and deployed data analytics solutions using Spark and BI reporting tools.
• Experience with Azure components and APIs.
• Thorough knowledge of the Azure IaaS and PaaS platforms.
• Managed an Azure-based SaaS environment.
• Worked with Azure Data Lake and Data Factory.
• Responsible for daily monitoring of 6 clusters on cloud (GCP) across 3 environments (Dev, Stg, and Prod), for a total of 18 clusters.
• Supported the developer team on issues such as Hive query job failures and Zeppelin problems.
• Responsible for setting rack awareness on all clusters.
• Responsible for DDL deployments as per requirement and validated DDLs among different environments.
• Responsible for routine admin activities such as granting users access to edge nodes and raising tickets/requests for CyberArk account creation, RSA, and AD user creation for different services. Environment: Big Data, HDFS, YARN, Hive, Sqoop, Zookeeper, HBase, Oozie, Kerberos, Ranger, Knox, Spark, Red Hat Linux.
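A hedged sketch of the Lambda-to-EMR pattern mentioned above, assuming an AWS Lambda handler that submits a Spark step via boto3; the cluster ID, bucket paths, and event shape are hypothetical:

```python
# Hedged sketch: an AWS-style Lambda handler that adds a Spark step to an
# existing EMR cluster via boto3. Cluster ID, script path, and event shape
# are hypothetical, not taken from the resume.
import boto3

emr = boto3.client("emr")

def handler(event, context):
    step = {
        "Name": "transform-large-dataset",
        "ActionOnFailure": "CONTINUE",
        "HadoopJarStep": {
            "Jar": "command-runner.jar",  # standard EMR mechanism for spark-submit
            "Args": ["spark-submit",
                     "s3://my-bucket/jobs/transform.py",         # hypothetical script
                     event.get("input", "s3://my-bucket/raw/")],  # hypothetical input
        },
    }
    resp = emr.add_job_flow_steps(JobFlowId="j-XXXXXXXXXXXXX",  # hypothetical cluster ID
                                  Steps=[step])
    return {"step_ids": resp["StepIds"]}
```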
Hadoop Administrator
Aug 2018 - Jan 2019
Chicago, IL
Responsibilities:
• Installed and configured Hadoop MapReduce and HDFS; developed multiple MapReduce jobs in Java for data cleaning and preprocessing.
• Worked on Installing and configuring the HDP Hortonworks 2.x Clusters in Dev and Production Environments.
• Worked on Capacity planning for the Production Cluster.
• Installed HUE Browser.
• Involved in loading data from the UNIX file system to HDFS, creating Hive tables, loading them with data, and writing Hive queries that run internally as MapReduce jobs.
• Experience in MapR, Cloudera, & EMR Hadoop distributions.
• Worked on installing Hortonworks HDP 2.1 on AWS Linux servers and configuring Oozie jobs.
• Created a complete processing engine based on the Hortonworks distribution, tuned for performance.
• Performed cluster upgrades from HDP 2.1 to HDP 2.3.
• Configured queues in the Capacity Scheduler and took snapshot backups of HBase tables.
• Worked on fixing the cluster issues and Configuring High Availability for Name Node in HDP 2.1.
• Involved in Cluster Monitoring backup, restore and troubleshooting activities.
• Involved in the MapR to Hortonworks migration.
• Administration and management of Atlassian tool suites (installation, deployment, configuration, migration, upgrade, patching, provisioning, server management etc.).
• Integrate JIRA with Apteligent for creating two ways linking between the crash reports and JIRA issues.
• Audited our existing plug-ins and uninstalled few unused plugins to save costs and manage the tool efficiently. Automated the transition of issues based on the status when work is logged on the issues.
• Automated issue creation from Office 365 email through a mail handler. Configured logging to reduce unnecessary warning and info messages.
• Worked as Hadoop administrator on the MapR Hadoop distribution across 5 clusters, ranging from POC to production, containing more than 1,000 nodes.
• Implemented manifest files in Puppet for automated orchestration of Hadoop and Cassandra clusters.
• Worked on installing cluster, commissioning & decommissioning of Data Nodes, Name Node recovery, capacity planning, Cassandra and slots configuration.
• Responsible for implementation and ongoing administration of Hadoop infrastructure
• Managed and reviewed Hadoop log files.
• Administration of Hbase, Hive, Sqoop, HDFS, and MapR.
• Imported and exported data between HDFS/HBase and relational databases such as MySQL using Sqoop (a hedged sketch follows this list).
• Worked on Configuring Kerberos Authentication in the cluster
• Experience using the MapR File System, Ambari, and Cloudera Manager for installation and management of Hadoop clusters.
• Very good experience with the Hadoop ecosystem in UNIX environments.
• Experience with UNIX administration.
• Worked on installing and configuring Solr 5.2.1 in Hadoop cluster.
• Hands on experience in installation, configuration, management and development of big data solutions using Hortonworks distributions.
• Worked on indexing HBase tables and indexing JSON and nested data.
• Hands-on experience installing and configuring Spark and Impala.
• Successfully installed and configured queues in the Capacity Scheduler and the Oozie scheduler.
• Worked on configuring queues and on performance optimization of Hive queries, tuning at the cluster level and adding users to the clusters.
• Responsible for Cluster maintenance, Monitoring, commissioning and decommissioning Data nodes, Troubleshooting, Manage and review data backups, Manage & review log files.
• Day-to-day responsibilities included solving developer issues, deploying and moving code from one environment to another, providing access to new users, providing rapid solutions to reduce impact, and documenting issues to prevent recurrence.
• Adding/installation of new components and removal of them through Ambari.
• Collaborating with application teams to install operating system and Hadoop updates, patches, version upgrades.
• Monitored workload, job performance and capacity planning
• Involved in analyzing system failures, identifying root causes, and recommending courses of action.
• Designed and deployed a corresponding SolrCloud collection.
• Created collections and configurations, and registered a Lily HBase Indexer configuration with the Lily HBase Indexer Service.
• Created and managed cron jobs. Environment: Hadoop, MapReduce, Yarn, Hive, HDFS, Pig, Sqoop, Solr, Oozie, Impala, Spark, Hortonworks, Flume, HBase, Zookeeper, Unix/Linux, Hue (Beeswax), AWS.
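A hedged sketch of the Sqoop import work described above, invoked from Python; the host, database, table, and credential paths are hypothetical:

```python
# Hedged sketch: invoking a Sqoop import (MySQL -> HDFS) from Python.
# Host, database, table, and credential file are hypothetical.
import subprocess

cmd = [
    "sqoop", "import",
    "--connect", "jdbc:mysql://db-host/sales",       # hypothetical source database
    "--username", "etl_user",                        # hypothetical user
    "--password-file", "/user/etl/.mysql.password",  # hypothetical HDFS path
    "--table", "orders",                             # hypothetical table
    "--target-dir", "/data/raw/orders",              # hypothetical HDFS target
    "--num-mappers", "4",
]
subprocess.run(cmd, check=True)  # raises CalledProcessError on non-zero exit
```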
Information Technology
Jan 2018 - Jul 2018
Chicago, IL
HCSC Responsibilities:
• Deployed Hadoop cluster of Cloudera Distribution and installed ecosystem components: HDFS, Yarn, Zookeeper, HBase, Hive, MapReduce, Pig, Kafka, Confluent Kafka, Storm and Spark in Linux servers.
• Responsible for maintaining 24x7 production CDH Hadoop clusters running Spark, HBase, Hive, and MapReduce, with multiple petabytes of data storage, on a daily basis.
• Configured Capacity Scheduler on the Resource Manager to provide a way to share large cluster resources.
• Deployed Name Node high availability for major production cluster.
• Experienced in writing the automatic scripts for monitoring the file systems, key MapR services.
• Configured Oozie for workflow automation and coordination.
• Troubleshoot production level issues in the cluster and its functionality.
• Backed up data on a regular basis to a remote cluster using DistCp (a hedged sketch follows this list).
• Set up the cluster and installed all ecosystem components through MapR, and manually through the command line in the lab cluster.
• Implemented High Availability and automatic failover infrastructure to overcome single point of failure for Name node utilizing Zookeeper services.
• Used Sqoop to connect to Oracle, MySQL, and Teradata and move data into Hive/HBase tables.
• Worked on Hadoop operations on the ETL infrastructure with other BI teams such as TD and Tableau.
• Involved in installing and configuring Confluent Kafka in the R&D line, and validated the installation with the HDFS and Hive connectors.
• Performed disk space management for the users and groups in the cluster.
• Used Storm and Kafka Services to push data to HBase and Hive tables.
• Documented slides & Presentations on Confluence Page.
• Added Nodes to the cluster and Decommissioned nodes from the cluster whenever required.
• Used Sqoop, Distcp utilities for data copying and for data migration.
• Worked on end-to-end data flow management from sources to a NoSQL (MongoDB) database using Oozie.
• Installed Kafka cluster with separate nodes for brokers.
• Worked with the Continuous Integration team to set up GitHub for scheduling automatic deployments of new and existing code in production.
• Monitored multiple Hadoop cluster environments using Nagios. Monitored workload, job performance, and capacity planning using MapR Control System.
• Worked effectively in an Agile methodology and provided production on-call support.
• Regular Ad-Hoc execution of Hive and Pig queries depending upon the use cases.
• Regular Commissioning and Decommissioning of nodes depending upon the amount of data.
• Monitor Hadoop cluster connectivity and security.
• Manage and review Hadoop log files.
• File system management and monitoring.
• Monitored Hadoop Jobs and Reviewed Logs of the failed jobs to debug the issues based on the errors.
• Diagnose and resolve performance issues and scheduling of jobs using Cron & Control-M.
• Used the Avro SerDe packaged with Hive for serialization and deserialization to parse the contents of streamed log data. Environment: CDH 5.8.3, HBase, Hive, Pig, Sqoop, Yarn, Apache Oozie workflow scheduler, Kafka, Flume, Zookeeper.
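A hedged sketch of the DistCp backup routine referenced above; the NameNode addresses and paths are hypothetical:

```python
# Hedged sketch: periodic DistCp backup to a remote cluster.
# NameNode addresses and paths are hypothetical.
import subprocess

SRC = "hdfs://prod-nn:8020/data/warehouse"    # hypothetical source cluster
DST = "hdfs://dr-nn:8020/backups/warehouse"   # hypothetical remote cluster

subprocess.run(
    ["hadoop", "distcp",
     "-update",   # copy only files that changed since the last run
     "-m", "20",  # cap the number of map tasks
     SRC, DST],
    check=True,
)
```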
Hadoop Administrator
Jun 2017 - Dec 2017
Detroit, MI
Responsibilities:
• Worked on Distributed/Cloud Computing (MapReduce/ Hadoop, Hive, Pig, HBase, Sqoop, Flume, Spark, Zookeeper, etc.), Hortonworks (HDP 2.4.0)
• Deploying, managing, and configuring HDP using Apache Ambari 2.4.2.
• Installed and worked on Hadoop clusters for different teams; supported 50+ users on the Hadoop platform, resolved their tickets and issues, and provided training and best-practice guidance to make Hadoop easy to use.
• Installed/Configured/Maintained Apache Hadoop clusters for application development and Hadoop tools like Hive, Pig, HBase, Zookeeper and Sqoop.
• Configuring YARN capacity scheduler with Apache Ambari.
• Configuring predefined alerts and automating cluster operations using Apache Ambari.
• Managed files on HDFS via the CLI/Ambari Files View. Ensured the cluster was healthy and available with monitoring tools.
• Developed Hive user-defined functions in Python and wrote Hadoop MapReduce programs in Python.
• Improved mapper and reducer code using Python iterators and generators (a hedged streaming sketch follows this list).
• Built high availability for major production cluster and designed automatic failover control using Zookeeper Failover Controller (ZKFC) and Quorum Journal nodes.
• Converted Hive/SQL queries into Spark transformations using Spark RDDs, Python, and Scala.
• Implemented Flume, Spark, and Spark Stream framework for real time data processing.
• Involved in implementing security on the Hortonworks Hadoop cluster with Kerberos, working with the operations team to move from a non-secured to a secured cluster.
• Responsible for upgrading Hortonworks Hadoop HDP 2.2.0 and MapReduce 2.0 with YARN in a multi-clustered node environment.
• Responsible for handling service and component failures and solving issues by analyzing and troubleshooting the Hadoop cluster.
• Manage and review Hadoop log files. Monitor the data streaming between web sources and HDFS.
• Worked with Oracle XQuery for Hadoop on Oracle Java HotSpot virtual machines.
• Managing Ambari administration, and setting up user alerts.
• Handled importing of data from various data sources, performed transformations using Hive, MapReduce, Spark and loaded data into HDFS.
• Solved Hive Thrift issues and HBase problems after upgrading to HDP 2.4.0.
• Involved extensively in Hive, Spark, Pig, and Sqoop projects throughout the development lifecycle until the projects went into production.
• Managed cluster resources by implementing the Capacity Scheduler and creating queues.
• Integrated Kafka with Flume in a sandbox environment using a Kafka source and Kafka sink.
• Worked with Puppet, Kibana, Elasticsearch, Tableau, and Red Hat infrastructure for data ingestion, processing, and storage.
• Monitored multiple Hadoop clusters environments using Ganglia and Nagios. Monitored workload, job performance and capacity planning using Ambari.
• Implemented a Spark solution to enable real-time reports from Hadoop data. Also actively involved in designing column families for various Hadoop clusters. Environment: HDP 2.4.0, Ambari 2.4.2, Oracle 11g/10g, Oracle Big Data Appliance, MySQL, Sqoop, Hive, Oozie, Spark, Zookeeper, Oracle Big Data SQL, MapReduce, Pig, Kerberos, RedHat 6.5.
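A hedged sketch of the Python MapReduce work described above: a Hadoop Streaming word count that uses iterators and generators. The script name and invocation are illustrative, not from the resume.

```python
# Hedged sketch: a Hadoop Streaming word count in Python, run as
#   mapper:  python3 streaming_wc.py map
#   reducer: python3 streaming_wc.py reduce
# Hadoop Streaming sorts mapper output by key before the reducer sees it.
import sys
from itertools import groupby

def mapper(stream):
    # Emit (word, 1) pairs as tab-separated lines.
    for line in stream:
        for word in line.split():
            print(f"{word}\t1")

def reducer(stream):
    # Sum counts per word using generators plus itertools.groupby.
    pairs = (line.rstrip("\n").partition("\t") for line in stream)
    kv = ((key, int(value)) for key, _, value in pairs)
    for key, group in groupby(kv, key=lambda item: item[0]):
        print(f"{key}\t{sum(v for _, v in group)}")

if __name__ == "__main__":
    (mapper if sys.argv[1] == "map" else reducer)(sys.stdin)
```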
Hadoop Administrator
Dec 2016 - May 2017
Bowie, MD
Responsibilities:
• Involved in installation of CDH 5.5 with Cloudera Manager 5.6 in a CentOS Linux environment.
• Involved in installation and configuration of Kerberos security on the CDH 5.5 cluster.
• Involved in installation and configuration of an LDAP server integrated with Kerberos on the cluster.
• Worked with Sentry configuration to provide centralized security for Hadoop services.
• Monitor critical services and provide on call support to the production team on various issues.
• Assisted in installing and configuring Hive, Pig, Sqoop, Flume, Oozie, and HBase on the Hadoop cluster with the latest patches.
• Involved in performance tuning of various Hadoop ecosystem components such as YARN and MRv2.
• Implemented Kerberos security on the CDH cluster at both the user and service level to provide strong security (a hedged verification sketch follows this list).
• Troubleshooting, diagnosing, tuning, and solving Hadoop issues.
• Maintained good cluster health.
• Continuous monitoring and managing the Hadoop cluster using Cloudera Manager.
• Commissioned and decommissioned nodes across the cluster. Environment: Hortonworks (HDP 2.2), Ambari, MapReduce 2.0 (YARN), HDFS, Hive, Hbase, Pig, Oozie, Sqoop, Spark, Flume, Kerberos, Zookeeper, DB2, SQL Server 2014, CentOS, RHEL 6.x.
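A hedged sketch of how a Kerberized cluster of this kind might be smoke-tested after setup; the keytab path and principal are hypothetical:

```python
# Hedged sketch: a small smoke test for Kerberos-secured HDFS access.
# Keytab path and principal are hypothetical.
import subprocess

KEYTAB = "/etc/security/keytabs/hdfs.service.keytab"  # hypothetical
PRINCIPAL = "hdfs/node1.example.com@EXAMPLE.COM"      # hypothetical

# Obtain a ticket, then verify HDFS responds to an authenticated call.
subprocess.run(["kinit", "-kt", KEYTAB, PRINCIPAL], check=True)
subprocess.run(["hdfs", "dfs", "-ls", "/"], check=True)
subprocess.run(["klist"], check=True)  # show the ticket cache for confirmation
```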
System/Linux Administrator
Jun 2012 - Oct 2014
Responsibilities:
• Performed installation, configuration, and maintenance of Red Hat Linux 4.x, 5.x, 6.x.
• Provided 24x7 System Administration support for Red Hat Linux 4.x, 5.x, 6.x servers and resolved trouble tickets on shift rotation basis.
• Installation of Red Hat Linux 4.x, 5.x, 6.x using Kickstart installation.
• Wrote Bash scripts to gather information about various Red Hat Linux servers (a Python variant is sketched after this list).
• Worked with DM-Multipath to map SCSI disks to their corresponding LUNs.
• Created LVMs on SAN storage using Red Hat Linux utilities.
• Experienced in server consolidation and virtualization using VMware vSphere Client and Citrix Xen.
• Worked on migrating servers from one host to another in VMware and Xen.
• Worked on migrating servers from one datacenter to another in VMware vSphere Client. Working knowledge of Hyper-V virtualization on the Microsoft Windows 2008 platform.
• Monitored overall system performance, performed user management, system updates and disk & storage management.
• Performed Patching in the Red Hat Linux servers and worked on installing, upgrading the packages in the Linux systems.
• Analyzing Business requirements/user problems to determine feasibility of application or design within time and cost constraints. Formulated scope and objectives through fact-finding to develop or modify complex software programming applications or information systems.
• Designed and wrote scripts in Shell, Bash scripting, Installations and Configurations of different versions and Editions of Linux servers.
• Developed automation for Kubernetes clusters with Ansible, writing playbooks.
• Troubleshot performance and network issues and monitored the RHEL Linux servers on a day-to-day basis.
• Experience working with LVMs, including adding/expanding/configuring disks and disk partitioning with fdisk/parted.
• Experience with NFS file/directory sharing, with security considerations.
• Designing and developing new Ansible Playbooks.
• Performed installations, updates of system packages using RPM, YUM.
• Performed Patching activity of RHEL servers using Red Hat Satellite server.
• Implemented Virtualization using VMware in Linux on HP-DL585.
• Performed Red Hat Linux kernel, memory, and swap-area upgrades; Red Hat Linux Kickstart installation and Sun Solaris JumpStart installation. Configured DNS, DHCP, NIS, NFS, and other network services on Sun Solaris 8/9.
• Performed Red Hat Linux kernel and memory upgrades and undertook Red Hat Linux Kickstart installations.
• Created users, managed user permissions, and maintained user and file-system quotas on Red Hat Linux.
• Performed troubleshooting, tuning, security, backup, recovery and upgrades of Red Hat Linux based systems.
• Set up full networking services and protocols on UNIX, including NIS/NFS, DNS, SSH, DHCP, NIDS, TCP/IP, ARP, applications, and print servers to ensure optimal networking, application, and printing functionality.
• Installed and configured Sudo for users to access root privileges. Environment: Red Hat Enterprise Linux 4.x/5.x, Oracle 9i, Logical Volume Manager for Linux, VMware ESX Server 2.x, Apache 2.0, ILO, RAID, VMware vSphere Client, Citrix Xen, Microsoft Windows 2008/2012.
Linux Administrator
Jun 2010 - May 2012
Responsibilities:
• Provided 24x7 on-call support, debugging and fixing issues related to Linux, Solaris, and HP-UX hardware/software installation and maintenance in production, development, and test environments as an integral part of the Unix/Linux (RHEL/SUSE/Solaris/HP-UX/AIX) support team.
• Installed Red Hat Enterprise Linux Server 5/6 on Dell and HP x86 hardware.
• Planned and implemented backup and restore procedures using ufsdump, ufsrestore, tar, and cpio.
• Installed and configured Red Hat Linux 5.1 on HP DL585 servers using Kickstart.
• Monitoring day-to-day administration and maintenance operations of the company network and systems working on Linux and Solaris Systems.
• Responsible for deployment, patching and upgrade of Linux servers in a large datacenter environment.
• Design, Build and configuration of RHEL.
• Responsible for providing 24x7 production support for Linux.
• Automated Kickstart images installation, patching and configuration of 500+ Enterprise Linux servers.
• Built kickstart server for automated Linux server builds.
• Installed Ubuntu servers for migration.
• Created Shell, Bash scripts to automate a variety of tasks.
• NFS and SAN filesystem management - Veritas VxVM, LVM
• Maintained user accounts; sudo was used for management and faceless (service) accounts, while other accounts were managed in LDAP.
• Datacenter operations, migration of Linux servers
• Configured NIS, NIS+, and DNS on Red Hat Linux 5.1, updated NIS maps, and organized RHN Satellite Servers in combination with RHN Proxy Server.
• Set up OpenLDAP server and clients and PAM authentication on Red Hat Linux 6.5/7.1.
• Installed, configured, troubleshot, and maintained Linux servers and the Apache web server; handled security configuration, scheduled backups, and submitted various types of cron jobs (a hedged backup sketch follows this list).
• Installed HP OpenView monitoring on servers and worked with monitoring tools such as Nagios and HP OpenView.
• Installed and configured the RPM packages using the YUM Software manager.
• Involved in developing custom scripts using Shell (bash, ksh) to automate jobs.
• Defined and developed plans for Change, Problem, and Incident Management processes based on ITIL.
• Worked with networking protocols such as TCP/IP, Telnet, FTP, NDM, SSH, and rlogin.
• Deploying Veritas Clusters and Oracle test databases to implement disaster recovery strategies, ensuring uninterrupted availability of the global systems.
• Coordinated with storage and networking teams. Environment: Red Hat Enterprise Linux 4.x/5.x, Logical Volume Manager for Linux, VMware ESX Server 2.x, Hyper-V Manager, VMware vSphere Client, RHEL, Citrix Xen.
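A hedged sketch of a dated tar backup of the kind a nightly cron entry would run, per the backup-scheduling bullet above; the paths are hypothetical:

```python
# Hedged sketch: a dated tar backup suitable for a nightly cron entry.
# Source and destination paths are hypothetical.
import tarfile
import time
from pathlib import Path

SOURCE = Path("/etc")          # hypothetical directory to back up
DEST = Path("/var/backups")    # hypothetical backup target

DEST.mkdir(parents=True, exist_ok=True)
archive = DEST / f"etc-{time.strftime('%Y%m%d')}.tar.gz"

with tarfile.open(archive, "w:gz") as tar:
    tar.add(SOURCE, arcname=SOURCE.name)

print(f"wrote {archive}")
# Example crontab line (hypothetical):
# 30 2 * * * /usr/bin/python3 /opt/scripts/backup_etc.py
```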
Education

Skills
Each entry is listed as skill (year last used, years of experience).

Linux (2018, 15), UNIX (2019, 15), Apache (2021, 14), Oracle (2017, 14), RedHat (2021, 14), BaSH (2014, 12), Citrix (2014, 12), NFS (2014, 12), NIS (2014, 12), Solaris (2014, 12), VMWare (2014, 12), vSphere (2014, 12), Hbase (2021, 11), Database Backups (2019, 10), System Administration (2021, 9), DNS (2014, 8), Kickstart (2014, 8), OpenShift (2017, 8), RHEL (2014, 8), SSH (2014, 8), TCP/IP (2014, 8), Kerberos (2021, 7), Capacity Planning (2021, 6), Database Upgrades (2019, 6), DHCP (2014, 6), Disaster Recovery (2012, 6), Hadoop (2021, 6), HDFS (2021, 6), Hive (2021, 6), Oozie (2021, 6), Performance Tuning (2021, 6), RAID (2014, 6), Scripting (2014, 6), Spark (2021, 6), Sqoop (2021, 6), SuSE (2012, 6), Ubuntu (2012, 6), Veritas (2012, 6), Windows (2014, 6), MySQL (2019, 5), Ambari (2019, 4), Ansible (2014, 4), Big Data (2021, 4), Cron (2019, 4), Data Center (2012, 4), Dell (2012, 4), Flume (2019, 4), HP (2012, 4), Hyper-V (2012, 4), MapReduce (2019, 4), Nagios (2017, 4), Pig (2019, 4), Proxy Server (2012, 4), Python (2021, 4), SAN (2012, 4), Server Builds (2012, 4), Veritas Cluster Management (2012, 4), Virtualization (2014, 4), Xen (2014, 4), DB2 (2017, 3), Apache SOLR (2019, 2), AWS (2019, 2), Cassandra (2019, 2), CentOS (2017, 2), CyberArk (2021, 2), Data Cleansing (2019, 2), Data Services (2021, 2), Database Design (2014, 2), Database Maintenance (2014, 2), FTP (2012, 2), HP-UX (2012, 2), IaaS (2021, 2), impala (2019, 2), JIRA (2019, 2), JSON (2019, 2), Kafka (2021, 2), LDAP (2017, 2), MS Azure (2021, 2), NIS+ (2012, 2), node.js (2019, 2), Office 365 (2019, 2), Project Management (2021, 2), Puppet (2019, 2), RDBMS (2019, 2), RPM (2014, 2), SaaS (2021, 2), SQL Server (2017, 2), Sun (2014, 2), Agile Methodology (2018, 1), Analytics (2021, 1), Approach (2021, 1), CASE (2021, 1), Cloud Computing (2017, 1), Configuration Management (2019, 1), Data Migration (2018, 1), EMR (2021, 1), Provisioning (2019, 1), RSA (2021, 1), Tableau (2021, 1), Team Build (2021, 1), Teradata (2018, 1), Yarn (2021, 1), Zookeeper (2021, 1), Apache Webserver (2012, 7), Linux (2012, 7), Winstall (2014, 7), Oracle (2014, 6), Hbase (2019, 5), Problem Solving (2014, 5), RedHat (2014, 5), UNIX (2014, 5), Apache (2019, 4), BaSH (2014, 4), Capacity Planning (2019, 4), Citrix (2014, 4), Kerberos (2019, 4), NFS (2014, 4), NIS (2014, 4), RHadoop (2019, 4), Shell Scripts (2014, 4), Solaris (2014, 4), VMWare (2014, 4), vSphere (2014, 4), MySQL (2019, 3), SAP Detailed Scheduling (2018, 3), System Administration (2019, 3), AIX (2012, 2), Business Requirements (2014, 2), C (2017, 2), Cron (2019, 2), Dell (2012, 2), DHCP (2014, 2), Disaster Recovery (2012, 2), Fiddler (2012, 2), HP (2012, 2), iSCSI (2014, 2), ITIL (2012, 2), iWeb (2012, 2), Java (2019, 2), Oracle 11i (2014, 2), Performance Tuning (2019, 2), Proxy Server (2012, 2), Python (2019, 2), RAID (2014, 2), Scripting (2014, 2), Sendmail (2012, 2), Server Builds (2012, 2), SuSE (2012, 2), Ubuntu (2012, 2), Veritas (2012, 2), Virtualization (2014, 2), Web Weaver (2012, 2), Windows (2014, 2), Windows 2000 (2014, 2), Apache Tomcat (year not listed, 1), Application Development (2017, 1), DB2 (2017, 1), Email Campaign (2019, 1), JBOSS (year not listed, 1), JIRA (2019, 1), Microsoft SMS Server (2017, 1), Perl (year not listed, 1), RDBMS (2019, 1), BMC Control-M (2018, 1), Continuous Integration (2018, 1), Data Governance (year not listed, 1), Elasticsearch (2017, 1), ETL (2018, 1), Ganglia (2017, 1), Hadoop Admin (year not listed, 1), Hybrid Cloud (year not listed, 1), Kibana (2017, 1), MongoDB (2018, 1), PaaS (year not listed, 1), Sandbox (2017, 1), Scala (2017, 1)