Over 6 years of experience in Data Science / Data Analysis, ETL Development, and Project Management.
- Experience in all phases of diverse technology projects, specializing in Data Science and Machine Learning.
- Proven expertise in Supervised and Unsupervised learning techniques (Clustering, Classification, PCA, Decision Trees, KNN, SVM), Predictive Analytics, Optimization Methods, Natural Language Processing (NLP), and Time Series Analysis.
- Experienced in Machine Learning Regression Algorithms such as Simple, Multiple, and Polynomial Regression, SVR (Support Vector Regression), Decision Tree Regression, and Random Forest Regression.
- Experienced in advanced statistical analysis and predictive modelling in structured and unstructured data environments.
- Expertise in Hadoop ecosystem components HDFS, MapReduce, YARN, HBase, Pig, Sqoop, Spark, Spark SQL, Spark Streaming, and Hive for scalability, distributed computing, and high-performance computing.
- Strong knowledge of NoSQL databases such as HBase, Cassandra, MongoDB, and MarkLogic, and their integration with the Hadoop cluster.
- Strong expertise in Business and Data Analysis, Data Profiling, Data Migration, Data Conversion, Data Quality, Data Governance, Data Lineage, Data Integration, Master Data Management (MDM), Metadata Management Services, and Reference Data Management (RDM).
- Hands-on experience with Python Data Science libraries such as pandas, NumPy, SciPy, scikit-learn, Matplotlib, Seaborn, Beautiful Soup, Orange, rpy2, LIBSVM, Neurolab, and NLTK.
- Solid understanding of AWS (Amazon Web Services) S3, EC2, RDS, and IAM, as well as Azure ML, Apache Spark, and Scala processes and concepts.
- Good understanding of Artificial Neural Networks and Deep Learning models built with the Theano and TensorFlow packages in Python.
- Experienced in Machine Learning Classification Algorithms such as Logistic Regression, K-NN, SVM, Kernel SVM, Naive Bayes, Decision Tree, and Random Forest classification.
- Experience in various phases of the Software Development Life Cycle (Analysis, Requirements Gathering, Design), with expertise in writing and documenting Technical Design Documents (TDD), Functional Specification Documents (FSD), Test Plans, Gap Analysis, and Source-to-Target mapping documents.
- Strong understanding of project life cycle and SDLC methodologies including RUP, RAD, Waterfall, and Agile.
- Very good knowledge and understanding of Microsoft SQL Server, Oracle, Teradata, Hadoop/Hive.
- Strong expertise in ETL, Data Warehousing, Operational Data Store (ODS), Data Marts, and OLAP and OLTP technologies.
- Analytical, performance-focused, and detail-oriented professional with in-depth knowledge of data analysis and statistics and experience using complex SQL queries for data manipulation.
- Expertise in Linear and Logistic Regression, Classification Modelling, Decision Trees, Principal Component Analysis (PCA), and Cluster and Segmentation analyses; authored and co-authored several scholarly articles applying these techniques.
- Assist in determining the full domain of the MVP, create and implement its data model for the App, and work with App developers to integrate the MVP into the App and any backend domains.
- Ensure that the REST-based API, including all CRUD operations, integrates with the App and other service domains.
- Install and configure additional services on appropriate AWS EC2, RDS, S3, and/or other AWS service instances.
- Integrate these services with each other, ensuring user access to data, data storage, and communication between the various services.
- Excellent team player and self-starter with good communication skills.