Lead Data Scientist – Remote work (within India)
Role: Lead Data Scientist
Chargebee is a recurring billing and subscription management tool that helps SaaS and SaaS-like businesses streamline Revenue Operations.
At Chargebee, we rely on insightful data to power our systems and solutions. We’re seeking experienced data scientists to deliver those insights to us on a daily basis. Our ideal team member will have the mathematical and statistical expertise you’d expect, along with natural curiosity and a creative mind that’s not so easy to find. As you mine, interpret, and clean the data, we will rely on you to ask questions, connect the dots, and uncover hidden opportunities, with the ultimate goal of realizing the data’s full potential. You are expected to bring strong experience in using a variety of data mining methods and tools to build models and run simulations. You must have a proven ability to drive business results with data-based insights and, more importantly, you should be comfortable working with a wide range of stakeholders and functional teams. You will be instrumental in helping the business continue its evolution into an analytical and data-driven culture.
Roles & Responsibilities
• Work with stakeholders throughout the organization to identify opportunities for leveraging company data to drive business solutions.
• Develop a use case roadmap for a problem area or capability for the business. Frame the business problem into a Data Science or modelling problem.
• Extract data from multiple sources. Mine and analyse data from company databases to drive optimization and improvement of products.
• Work as the data strategist, identifying and integrating new datasets that can be leveraged through our product capabilities, and work closely with the engineering team to strategize and execute the development of data products.
• Enhance data collection procedures to include information that is relevant for building analytic systems. Process, cleanse, and verify the integrity of data used for analysis. Undertake preprocessing of structured and unstructured data.
• Run data exploration to understand relationships and patterns within the data. Develop data visualisations that represent and demonstrate the relationships identified during data exploration.
• Mine data using state-of-the-art methods. Select features, and build and optimize classifiers using machine learning techniques.
• Refine and deepen understanding of the algorithmic and inferential aspects of statistical analysis. Evaluate new algorithms from latest research and develop intuition about the problems for which they are likely to improve the state of the practice.
• Build training pipelines for the production environment. Develop and execute on a plan for continuous iteration and refinement of a new model.
• Provide inputs on design and quality assurance parameters, and support implementation of the model in an online environment.
• Provide inputs on, and help determine, infrastructure requirements and infrastructure management for model deployment.
• Lead debugging of data pipelines and model behaviour in the production environment. Develop dashboards to enable easy tracking and communication of model impact.
Desired Skills & Experience
• We’re looking for someone with 5+ years of experience manipulating data sets and building statistical models, with a Bachelor’s/Master’s/PhD degree in Statistics, Mathematics, Computer Science or another quantitative field, from any of the top-tier colleges.
• Data-oriented personality. Strong problem solving skills with an emphasis on product development.
• Excellent written and verbal communication skills for coordinating across teams.
• Good applied statistics skills, covering topics such as distributions, statistical testing, and regression.
• Good scripting and programming skills. Experience using statistical programming languages such as Python, PySpark, R, and SQL to manipulate data and draw insights from large data sets.
• Excellent understanding of machine learning techniques and algorithms, such as k-NN, Naive Bayes, SVM, Decision Forests, and artificial neural networks, and of their real-world advantages and drawbacks. Knowledge of deep learning techniques is a plus.
• Experience with common data science toolkits such as R, NumPy, Pandas, Scikit-learn, TensorFlow, and Keras.
• Experience with data visualisation tools such as D3.js and ggplot.
• Proficiency in using query languages such as SQL.
• Experience with NoSQL databases such as MongoDB, Cassandra, and HBase is desired.
• Experience with distributed data/computing tools like MapReduce, Hadoop, Hive, and Spark is a big plus.