Alekhya
469-312-6844
US IT Sales Executive at Tek Ninjas
11 years experience W2
Summary

Data Scientist with 8+ years of experience across the Telecom and Retail industries, holding a master's degree in Computer Science and Technology.

  • Extensive working experience in data science projects and tools such as R, Python, SQL, Tableau, and Power BI.
  • Experienced in Agile methodology, with the ability to manage all phases of the SDLC, from requirement analysis and design through development, testing, and deployment.
  • Proficient in developing storyboards and advanced visualizations using Tableau, Python, and Power BI.
  • Played a key role in team management by gathering requirements from clients, discussing them with the team, and delivering the output with zero defects.
  • Practical knowledge of the data analysis process in Python, including importing datasets, data wrangling, and exploratory data analysis.
  • Able to participate in extended calls to understand and gather business requirements.
  • Worked with the Adobe Campaign tool to create and run campaign flows and to gather the requirements for creating segments within them.
  • Proficient in creating the segments and suppressions needed to run campaign flows, and in uploading them back in production mode after file encryption.
  • Performed data mining, handling the tasks of building, deploying, and maintaining metadata inventories and data support tools. Responsible for researching and evaluating emerging data warehouse tools and techniques.
  • Analyzed a pre-existing predictive model, built by the Advanced Analytics team, for predicting the conversion rate of customers from retail to mail, and rebuilt it using machine learning algorithms that considered factors with greater influence on the conversion rate. Increasing the conversion rate benefits both customers and the company.
  • Designed and developed Big Data analytics platform for processing customer viewing preferences and social media comments using Java, Hadoop, Hive and Pig.
  • Developed high-quality software using Agile techniques such as Test-Driven Development and pair programming.
  • Skilled in implementing machine learning techniques such as regression, classification, clustering, and recommender systems, including random forests, decision trees, Support Vector Machines, and K-means clustering, using Python packages and RStudio. Performed migration from R to Python.
  • Strong experience in ETL data warehousing and implementing all phases of the SDLC, including requirement gap analysis, design, data warehouse implementation, development, testing, deployment, and production support and maintenance.
  • Proficient with RDBMSs such as Oracle and SQL Developer. Hands-on experience creating UNIX shell scripts to control ETL flows and implementing complex ETL logic.
  • Created IBM DataStage parallel ETL jobs to extract and reformat source data so it could be loaded into the new data warehouse schema.
  • Migrated 9 microservices from Skava to Google Cloud Platform, with one more big release of 4 additional microservices planned.
  • Extensive knowledge and hands-on experience implementing PaaS, IaaS, and SaaS delivery models inside the enterprise (data center) and in public clouds such as AWS, Google Cloud, and Kubernetes.
  • Set up alerting and monitoring using Stackdriver in GCP.
  • Good Experience in data lineage mapping and data mining.
  • Worked with NoSQL databases including HBase, MongoDB and Cassandra. Used Cassandra CQL to retrieve the data from Cassandra tables.
  • Worked with the IBM Watson NLP library to develop a grammar-correcting mailbot that fixes the mail body depending on the type of mail (official / personal / informal).
  • Skilled in pulling large datasets to run manual flows, refreshing dashboards, and setting up connections to different databases.
  • Able to multitask across projects with zero defects.
  • Experience in working with customers to determine their needs, gather, analyze and document requirements, communicate with customers throughout the development cycle, manage customer expectations, resolve issues and provide project status.
  • Good communication, interpersonal and quick learning skills with proven ability to adapt to different project environments.
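Several of the bullets above mention classification techniques such as decision trees and random forests in Python. A minimal sketch of that kind of model fit with scikit-learn; the data here is synthetic and purely illustrative, not from any project described in this profile:

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

# Illustrative synthetic data: 200 samples, 4 numeric features,
# binary target (e.g. churn yes/no). Not real project data.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 4))
y = (X[:, 0] + 0.5 * X[:, 1] > 0).astype(int)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

# A random forest is an ensemble of decision trees, both named above.
model = RandomForestClassifier(n_estimators=100, random_state=0)
model.fit(X_train, y_train)
accuracy = model.score(X_test, y_test)
```

The same `fit`/`score` pattern applies to the other scikit-learn classifiers mentioned (logistic regression, SVM).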

Experience
Data Scientist
Telecommunications
Sep 2017 - present
Philadelphia, PA

Comcast Corporation is the largest cable TV company and largest home Internet service provider in the United States, and the nation's third-largest home telephone service provider. Worked with marketing and strategic clients to find better opportunities and to understand the performance of segments under different circumstances. Responsibilities:

  • Played a key role in managing a team for a growing client relationship through zero-defect, aggressive delivery.
  • Responsible for Project Management, Coordination with offshore team and project delivery.
  • Developed solutions to enhance the DW/BI capabilities aligning with the steadily changing business requirements.
  • Identified, analyzed, and executed new and potential products, services, markets.
  • Collected and analyzed data on established and prospective customers, competitors, and marketing channels and sources.
  • Prepared reports that interpret consumer behavior, market opportunities and conditions, marketing results, trends and investment levels.
  • NLP / deep NLP and text mining: tagging (based on a trigram HMM), syntactic parsing (based on a PCFG), feature engineering and dimensionality reduction, multi-label classification, word sense disambiguation, Twitter hashtag decomposition, relevance engines, topic modeling, sentiment analysis, and contextual text mining.
  • Experience with open-source NLP toolkits such as CoreNLP, OpenNLP, NLTK, gensim, LingPipe, and Mallet.
  • Developed offer calendar to track advertised offers across geography and marketing channels. The dashboard includes calendar view of advertised offers and performance of the offers in individual marketing channel across time.
  • Provided performance tuning and physical and logical database design support in projects for Teradata systems and managed user level access rights through the roles.
  • Worked on gathering all the requirements from UNICA and created the flowcharts for the transition.
  • Designed and developed a campaign management tool as an additional feature of the Segment Lab suite of dashboards. Worked on segmentation and opportunity finding.
  • Participated in the project transition and turned around client requests with zero defects.
  • Worked on pilots such as the Strawman PPT and the KEB Express & DM dashboards: requirements gathering, development, maintenance, and deployment.
  • Used K-Means cluster analysis to identify the opportunity for upgrade and lower the churn rate.
  • Excellent experience in Hadoop architecture and its components, such as HDFS, JobTracker, TaskTracker, NameNode, DataNode, and the MapReduce programming paradigm.
  • Implemented Spark using Scala, utilizing the Spark Core, Spark Streaming, and Spark SQL APIs for faster data processing than MapReduce in Java.
  • Used Spark SQL to load JSON data, create schema RDDs, and load them into Hive tables, and handled structured data with Spark SQL.
  • Transferred data from Oracle database as well as MS Excel into SAS for analysis and used filters based on the analysis.
  • Used SAS Import/Export Wizard as well as SAS Programming techniques to extract data from Excel.
  • Participate in the Adobe campaign development calls on creation of campaign flows and help clients understand the usage of tool.
  • Performed parameter tuning procedures to achieve optimal performance of the model.
  • Worked with machine learning algorithms such as logistic regression, decision trees, Support Vector Machines, and random forests to achieve the best accuracy for the propensity model.
  • Extensive working experience with RStudio packages and Python libraries such as scikit-learn, improving model accuracy from 65% to 86%.
  • Strong practical experience with Python libraries such as Pandas and NumPy (one- and two-dimensional arrays).
  • Developed data visualizations in Tableau to display day to day accuracy of the model with newly incoming data.
  • Identified factors to be considered for phase 2 development of the project and documented those findings with clear explanations
  • Implemented a complete data science project involving data acquisition, data wrangling, exploratory data analysis, model development, and model evaluation.
Environment: Teradata, Advanced SQL, RStudio (ggplot2, choroplethr, dplyr, caret), Python (Pandas, NumPy), Machine Learning (Logistic Regression, Decision trees, SVM, Random forest), PyTorch, Keras, Knime Analytics tool, Tableau, Excel, SharePoint, Unix, Scala, Teradata SQL Assistant
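One bullet above mentions K-means cluster analysis to identify upgrade opportunities and lower churn. A minimal sketch of that style of customer segmentation with scikit-learn; the feature names, values, and cluster count are hypothetical:

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.preprocessing import StandardScaler

# Hypothetical customer features: monthly spend and usage hours.
rng = np.random.default_rng(42)
low_usage = rng.normal([30, 5], [5, 2], size=(50, 2))
high_usage = rng.normal([90, 40], [10, 5], size=(50, 2))
customers = np.vstack([low_usage, high_usage])

# Scale features so both contribute equally to the distance metric.
scaled = StandardScaler().fit_transform(customers)

# Two segments: e.g. candidates for an upgrade offer vs. retention focus.
kmeans = KMeans(n_clusters=2, n_init=10, random_state=0).fit(scaled)
labels = kmeans.labels_
```

In practice the cluster count would be chosen with a diagnostic such as the elbow method rather than fixed at two.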
Data Scientist
Information Technology
May 2016 - Aug 2017
Plano, TX

Responsibilities:

  • Understood the business context and strategic plans and developed a data-driven business plan to support the attainment of business goals.
  • Designed and developed Power BI graphical and visualization solutions with business requirement documents and plans for creating interactive dashboards.
  • Gathered usage reports for Microsoft applications (Word, Excel, SharePoint Online, Teams) for all employees from the Microsoft Office portal in the form of Excel sheets.
  • Considered one month at a time and analyzed usage report data with the RStudio package ggplot2 to identify usage patterns and trends.
  • Extensively used PyTorch and Keras to build and train deep learning models.
  • Worked with the data science team to build and deploy machine learning based models to predict customer churn and optimize customer acquisition using Teradata, Oracle, SQL, BTEQ, and UNIX.
  • Created storyboards in Tableau and Power BI for each application usage report, categorized by country, region, and state.
  • Tuned Teradata SQL statements using EXPLAIN, analyzing data distribution among AMPs and index usage; collected statistics, defined indexes, and revised correlated subqueries.
  • Created macros to generate reports on a daily and monthly basis and to move files from Test to Production.
  • Hands-on experience installing, configuring, and using Hadoop ecosystem components such as Spark/Scala.
  • Analyzed feedback from employees regarding their day-to-day use of Microsoft applications and built predictive models using machine learning algorithms to understand the main issues hindering employees' use of these apps.
  • Documented results obtained and supplied Digital fluency reports of each individual team to their respective team leads.
  • Suggested individual teams' better practices of using these apps to improve their overall efficiency.
  • Developed applications that process images taken from smartphones, using image processing tools in OpenCV to extract relevant medical data.
  • Responsible for working with stakeholders to troubleshoot issues and communicating findings to team members, leadership, and stakeholders to ensure models are well understood and optimized.
Environment: SQL*Plus, RStudio, Python (NumPy, Pandas), Machine learning algorithms (Logistic Regression, Decision trees), SharePoint Online, PyTorch, Tableau, PowerBI, Excel.
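The monthly usage-report analysis described above can be sketched with pandas. The column names and figures below are invented for illustration; a real export from the Microsoft Office portal would have a different schema:

```python
import pandas as pd

# Hypothetical one-month usage export: one row per employee per app.
usage = pd.DataFrame({
    "employee": ["a", "a", "b", "b", "c"],
    "app": ["Teams", "Excel", "Teams", "SharePoint", "Teams"],
    "sessions": [40, 12, 25, 3, 0],
})

# Aggregate sessions per application to spot usage patterns.
per_app = usage.groupby("app")["sessions"].sum().sort_values(ascending=False)

# Flag employees with no recorded activity in an app they have access to,
# a candidate signal for the "hindered usage" analysis above.
inactive = usage[usage["sessions"] == 0]
```

The `per_app` series feeds directly into a Tableau or Power BI extract for the storyboards mentioned above.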
Data Analyst /Engineer
Information Technology
Jan 2012 - Dec 2014
Responsibilities:
  • Responsible for reporting findings, using gathered metrics to infer and draw logical conclusions about past and future behavior.
  • Implemented and managed several ETL projects using Informatica PowerCenter by loading data from a variety of Data sources like flat files, JSON, XML files to Oracle for reporting. Source and target data were synced using Informatica and finally transformed data was stored in staging tables.
  • Performed multinomial logistic regression, random forest, decision tree, and SVM modeling to classify whether a package would be delivered on time on the new route.
  • Created reporting tables for comparing source and target data and report data discrepancies (mismatch, missing scenarios) found in the data.
  • Performed validations not covered in the requirements document received from the customer and learned the SQL queries that helped in attending defect triage calls.
  • Displayed the results obtained from report mappings using MicroStrategy, which provides a better user interface.
  • Implemented rule-based expertise system from the results of exploratory analysis and information gathered from the people from different departments.
  • Created test plans for conducting unit testing of developed code.
  • Created deployment groups in QA environment and deployed workflows from DEV to QA environment.
  • Performed debugging of the code as per inputs given by IST (Integrated System Testing) team and deployed code into PROD environment after receiving approval from IST team.
  • Created and maintained complete documentation of project from beginning till end.
  • Performed internal enhancements of the jobs running in PROD environment.
  • Successfully maintained and managed all the jobs running in production environment by offering production support.
  • Extensive hands-on experience with the HP Quality Center tool used for production support activities.
Environment: Informatica PowerCenter 9.1 (Repository Manager, Designer, Workflow Monitor, Workflow Manager), Oracle 11g, Toad for Oracle, SQL, UNIX, Shell scripting, SQL*Plus, MS Visio, Erwin Data Modeler, MicroStrategy.
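The source-versus-target comparison described above (reporting mismatched and missing rows) is commonly expressed as an outer join with an indicator column. A minimal pandas sketch; the table and column names are hypothetical, and in the project itself this logic lived in Informatica mappings and reporting tables:

```python
import pandas as pd

# Hypothetical source and target extracts keyed by order_id.
source = pd.DataFrame({"order_id": [1, 2, 3], "amount": [10.0, 20.0, 30.0]})
target = pd.DataFrame({"order_id": [1, 2, 4], "amount": [10.0, 25.0, 40.0]})

# Outer merge with an indicator exposes rows missing on either side.
merged = source.merge(target, on="order_id", how="outer",
                      suffixes=("_src", "_tgt"), indicator=True)

# "Missing" scenarios: rows present in only one system.
missing = merged[merged["_merge"] != "both"]

# "Mismatch" scenarios: keys present in both, values disagreeing.
mismatch = merged[(merged["_merge"] == "both")
                  & (merged["amount_src"] != merged["amount_tgt"])]
```

The same join-and-compare pattern translates directly to SQL (FULL OUTER JOIN with NULL checks) for database-side reconciliation.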
Data Analyst
Jul 2010 - Dec 2011
Responsibilities:
  • Communicated and coordinated with other departments to gather business requirements.
  • Gathered all the required data from multiple data sources and created the datasets used in analysis.
  • Performed exploratory data analysis and data visualization using R and Python.
  • In the preprocessing phase, used Pandas and scikit-learn to remove or impute missing values, detect outliers, scale features, and apply feature selection (filtering) to eliminate irrelevant features.
  • Conducted exploratory data analysis using Python's Matplotlib and Seaborn to identify underlying patterns and correlations between features.
  • Used Python (NumPy, SciPy, Pandas, scikit-learn, Seaborn) to develop a variety of models and algorithms for analytic purposes.
  • Set up storage and data analysis tools on Amazon Web Services cloud computing infrastructure.
Education
Computer Science and Technology, Texas A&M University
Computer Science, Holy Mary Institute of Technology & Science, JNTU
Skills
Skill (years of experience, year last used):

Oracle: 11 yrs (2021)
UNIX: 10 yrs (2021)
Data Analysis: 9 yrs (2021)
Python: 9 yrs (2021)
Database Design: 7 yrs (2021)
Project Management: 7 yrs (2021)
Tableau: 7 yrs (2021)
Teradata: 7 yrs (2021)
Microsoft Excel: 6 yrs (2021)
Performance Tuning: 6 yrs (2021)
Requirements Gathering: 6 yrs (2021)
SQL: 6 yrs (2021)
JSON: 5 yrs (2021)
Logistic Regression: 5 yrs (2021)
SAS: 5 yrs (2021)
Business Requirements: 4 yrs (2011)
Data Science: 4 yrs (2021)
Data Visualization: 4 yrs (2021)
Hadoop: 4 yrs (2021)
HP: 4 yrs (2014)
Informatica: 4 yrs (2014)
Informatica PowerCenter: 4 yrs (2014)
Machine Learning: 4 yrs (2021)
MicroStrategy: 4 yrs (2014)
MS SharePoint: 4 yrs (2021)
Oracle 11i: 4 yrs (2014)
Scala: 4 yrs (2021)
Scripting: 4 yrs (2014)
Shell Scripts: 4 yrs (2014)
Spark: 4 yrs (2021)
System Testing: 4 yrs (2014)
TOAD: 4 yrs (2014)
Unit Testing: 4 yrs (2014)
Adobe Campaign: 3 yrs (2021)
Analysis: 3 yrs (2021)
Analytics: 3 yrs (2021)
Cabling: 3 yrs (2021)
HDFS: 3 yrs (2021)
Hive: 3 yrs (2021)
MapReduce: 3 yrs (2021)
MS Visio: 3 yrs (2017)
node.js: 3 yrs (2021)
Regression Testing: 3 yrs (2017)
Spark Core: 3 yrs (2021)
Spark SQL: 3 yrs (2021)
Spark Streaming: 3 yrs (2021)
Text Mining: 3 yrs (2021)
Cloud Computing: 2 yrs (2011)
Data Modeling: 2 yrs (2014)
Erwin Data Modeler: 2 yrs (2014)
ETL: 2 yrs (2014)
HP QC: 2 yrs (2014)
Microsoft Office: 2 yrs (2017)
Production Support: 2 yrs (2014)
Test Planning: 2 yrs (2014)
XML: 2 yrs (2014)
BTEQ: 1 yr (2017)
Employ: 1 yr (2017)
IMAGE: 1 yr (2017)
Image Processing: 1 yr (2017)
MS Power BI: 1 yr (2017)
Stakeholder Engagement: 1 yr (2017)
WebServices: 1 yr (2011)
C: 5 yrs
Oracle SQL*Plus: 3 yrs (2017)
RHadoop: 3 yrs (2019)
Documentation: 2 yrs (2014)
Fiddler: 2 yrs (2014)
Java: 2 yrs (2019)
Quality Assurance: 2 yrs (2014)
Agile Methodology: 1 yr
AML: 1 yr
Apache: 1 yr
Biostatistics: 1 yr (2017)
Bitlocker Encryption: 1 yr
Business Intelligence: 1 yr (2017)
Clustering: 1 yr
Data Mining: 1 yr
Metadata: 1 yr
ORMS: 1 yr
Predictive Analytics: 1 yr (2017)
RDBMS: 1 yr
SDLC: 1 yr
Social Media: 1 yr
SQL Developer: 1 yr
AWS: 1 yr
Big Data: 1 yr
Cassandra: 1 yr
Data Warehousing: 1 yr
Database Maintenance: 1 yr
DataStage: 1 yr
Gap Analysis: 1 yr
Hbase: 1 yr
IaaS: 1 yr
Microservices: 1 yr
MongoDB: 1 yr
PaaS: 1 yr
Public Cloud: 1 yr
SaaS: 1 yr
Sales: 1 yr
Test-Driven Development: 1 yr