Isn’t it good to think about our career path once every few months? It is a way of applying what life has taught us: be proactive and anticipate what is going to come in the next 5–10 years. I may be overconfident that I can sustain my career on digital marketing alone, but it is not going to bring exponential growth from X to 10X. I have long believed in the saying “Data is the new oil”. Thankfully, I have moved from being a traditional marketer to a data-driven specialist, both from a performance-based marketing perspective and in bringing in tools for scalable execution and analysis.
But the New Problem!
Until now, the question has been how a human can use the data; in a few years it will be how we feed the machine so that it can use the data. This is an observation from the industry trend: job titles related to data are getting sexier and more unique. My forecast is that with innovations such as automated digital ads, the amount of work marketers need to put into creating ads will keep decreasing. So what we need to do as marketers is going to be drastically different from what we used to do.
Some cases where AI is already used extensively in marketing:
- Personalization of product merchandising in ecommerce (Evergage.com). Merchandisers can use this understanding to merchandise the catalogue better instead of relying on their own assumptions.
- App predictions of user abandonment, loyalty, returns and profits (Localytics app marketing predictions). App marketers need to understand how this works and make better use of automated push notifications and lifecycle marketing.
- Attribution modelling (Windsor.ai). The analyst just needs to make sure all the touch points are tracked and let the system predict where to increase and decrease spend.
- Using tools like Boomtrain, brands can send out customized email newsletters based on previous interactions recipients have had with content.
- Advertising, the core field. With the innovation of automated dynamic ads (Facebook product feed), automated bidding (Facebook, Google, DoubleClick) and automated budget optimisation (DoubleClick, Smartly), the amount of work marketers need to do will be reduced. Marketers need to be ready for this and learn how these automations work in order to drive them better.
Some more use cases: https://www.linkedin.com/pulse/15-applications-artificial-intelligence-marketing-robert-allen/
And the Idea!
So I need to learn and master one of the things that are going to drastically change how businesses execute. We could even use it to improve other sectors of society. Before, data work was just data analytics; now the scope has expanded, or been repackaged, as data scientist, machine learning specialist/developer, and AI specialist/engineer/developer. So I thought of collecting what companies share in their job descriptions, just to spot some patterns (the consolidated info is at the end of the article).
Compared with the idea, the execution is tough, because the topic is much broader and more complex than I thought. A silly thing I did was to collect the descriptions from many job listings and create a word cloud to see which words are repeated most in those descriptions (image below).
It may not be exactly correct, but it gave me somewhere to start and words to ask questions about. Next, I grouped those words on sticky notes and asked questions about each group.
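The counting behind a word cloud can be sketched in a few lines of Python. This is only an illustration of the idea, not the tool I actually used: the sample text, the stop-word list and the function name `word_frequencies` are all made up for the example.

```python
import re
from collections import Counter

# Hypothetical sample text; in practice this would be the pasted job descriptions.
JOB_TEXT = """
Experience using Python and/or R.
Experience using machine learning libraries such as scikit-learn.
Practical knowledge of Python and text/data processing packages.
"""

# A tiny, hand-picked stop-word list (an assumption, not a standard list).
STOP_WORDS = {"and", "or", "of", "as", "such", "using", "the", "with"}


def word_frequencies(text):
    """Lower-case the text, split it into words and count the non-stop-words."""
    words = re.findall(r"[a-z][a-z\-]*", text.lower())
    return Counter(w for w in words if w not in STOP_WORDS)


freq = word_frequencies(JOB_TEXT)
for word, count in freq.most_common(5):
    print(word, count)
```

A word cloud simply draws the most frequent words larger, so the counts above are the raw material for one.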
Data is a big topic, but an infographic from NUS explained well enough which part is mine. It shows where developers, database engineers, advanced marketers and data analysts (or data scientists) play their roles.
Knowing which part is our role helps us eliminate the clutter and focus better.
According to it, the next generation of advanced marketers or data engineers sits between data extraction and data visualisation. So the stages are:
- Use scripting languages like Python or R to extract the raw data.
- Use libraries/packages to clean, process and transform the data.
- Use algorithms to identify patterns, basically answering your business questions, whether with insights, recommendations, etc.
- Finally, send the data to visualisation tools, or use the basic visualisation available in Python and R.
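The four stages above can be sketched end to end with nothing but the Python standard library. Everything here is an assumption made for illustration: the sample CSV, the column names and the choice of "cost per conversion" as the pattern to look for. A real project would more likely use pandas and a proper charting tool.

```python
import csv
import io
from collections import defaultdict

# Hypothetical raw export; a real one would come from a file or an ad platform.
RAW_CSV = """channel,spend,conversions
facebook,100,10
google,200,30
facebook,50,
google,,5
email,20,4
"""


def extract(raw):
    """Stage 1: read the raw rows."""
    return list(csv.DictReader(io.StringIO(raw)))


def clean(rows):
    """Stage 2: drop rows with missing values and convert text to numbers."""
    cleaned = []
    for row in rows:
        if row["spend"] and row["conversions"]:
            cleaned.append({
                "channel": row["channel"],
                "spend": float(row["spend"]),
                "conversions": int(row["conversions"]),
            })
    return cleaned


def cost_per_conversion(rows):
    """Stage 3: a very simple 'pattern': cost per conversion by channel."""
    spend = defaultdict(float)
    conversions = defaultdict(int)
    for row in rows:
        spend[row["channel"]] += row["spend"]
        conversions[row["channel"]] += row["conversions"]
    return {ch: spend[ch] / conversions[ch] for ch in spend}


def show(results):
    """Stage 4: a text bar chart standing in for a visualisation tool."""
    for channel, cost in sorted(results.items(), key=lambda kv: kv[1]):
        print(f"{channel:<10}{'#' * round(cost)} {cost:.2f}")


results = cost_per_conversion(clean(extract(RAW_CSV)))
show(results)
```

Keeping each stage in its own function mirrors the split of roles in the infographic: each step can be swapped for a better tool without touching the others.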
So what knowledge do we need?
Table of Contents
1. Extraction:
- Scripting languages: R or Python (these two are the most used), or Spark
- Scripting environments: R or Python. Matlab has become dated
- API calls to extract data from platforms like Facebook, Google and other ad networks
- Knowledge of connecting different systems
- Know what to extract.
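To give a feel for what "API calls" means in the list above, here is a minimal sketch that builds a request URL and parses a JSON reply. The endpoint `api.example.com`, the parameter names and the sample response are all hypothetical; every real platform (Facebook, Google, etc.) has its own URLs, authentication and field names, so check its documentation. No network call is made here.

```python
import json
from urllib.parse import urlencode

# Hypothetical endpoint and parameter names, for illustration only.
BASE_URL = "https://api.example.com/v1/insights"


def build_request_url(account_id, fields, since, until):
    """Compose the URL that would ask the platform for selected metrics."""
    params = {
        "account_id": account_id,
        "fields": ",".join(fields),
        "since": since,
        "until": until,
    }
    return f"{BASE_URL}?{urlencode(params)}"


def parse_response(body):
    """Turn the JSON text returned by the platform into a list of row dicts."""
    payload = json.loads(body)
    return payload.get("data", [])


url = build_request_url("123", ["spend", "clicks"], "2018-01-01", "2018-01-31")
print(url)

# A made-up response body standing in for what a real call would return.
SAMPLE_BODY = '{"data": [{"date": "2018-01-01", "spend": 12.5, "clicks": 40}]}'
rows = parse_response(SAMPLE_BODY)
print(rows[0]["clicks"])
```

The `fields` argument is where "know what to extract" comes in: you ask only for the metrics that answer your business question.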
2. Cleaning, Processing, And Transforming
Environment or Tools for processing :
*tools to process and analyse very complex data.
3. Frameworks: These are the Python packages named in the consolidated job info.
- TensorFlow
- scikit-learn
- Singa
- Caffe
- Pandas
- NumPy
- NLTK
MLlib is the machine learning library of Spark.
4. Algorithms:
- Linear and Logistic regression
- Neural nets
- Deep Learning
- Hidden Markov models
- Naive Bayes
- Game theory
- Natural language processing (NLP)
*NLP plays an important role in identifying the density or size of each text, and it can be applied to use cases like sentiment analysis, trending topics, social monitoring and the text recommendations we get while messaging.
NLP toolkits: CoreNLP, OpenNLP, NLTK, Gensim, LingPipe, Mallet, etc.
*Text pre-processing and normalization techniques, such as tokenization, POS tagging and parsing, and how they work at a low level.
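To make the first item in the algorithms list concrete, here is a minimal sketch of simple linear regression (one input, ordinary least squares) in plain Python. The spend and sales numbers are invented for the example, and in practice a library such as scikit-learn would do this for you.

```python
def fit_line(xs, ys):
    """Ordinary least squares for one input: returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    variance = sum((x - mean_x) ** 2 for x in xs)
    slope = covariance / variance
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical data: daily ad spend and the sales observed on the same day.
spend = [10, 20, 30, 40, 50]
sales = [25, 45, 65, 85, 105]

slope, intercept = fit_line(spend, sales)
predicted = slope * 60 + intercept  # forecast for a spend of 60
print(slope, intercept, predicted)
```

The slope answers a typical marketing question: roughly how much extra sales one more unit of spend is associated with in this data.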
5. Statistical Modelling (modelling uses algorithms):
- Bayesian Networks
- Chi-square
- Z-test / T-test
- ANOVA
- Time Series
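As an example of where a z-test shows up in marketing, the sketch below compares the conversion rates of two ad variants with a two-proportion z-test. The campaign numbers and the function name are made up, and the test assumes samples large enough for the normal approximation to be reasonable.

```python
import math


def two_proportion_z_test(conv_a, n_a, conv_b, n_b):
    """Return (z, two-sided p-value) for the difference in conversion rates."""
    rate_a = conv_a / n_a
    rate_b = conv_b / n_b
    pooled = (conv_a + conv_b) / (n_a + n_b)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (rate_b - rate_a) / std_err
    # Standard normal CDF written with the error function.
    cdf = 0.5 * (1 + math.erf(abs(z) / math.sqrt(2)))
    p_value = 2 * (1 - cdf)
    return z, p_value


# Hypothetical campaign numbers: visitors and conversions for two ad variants.
z, p = two_proportion_z_test(conv_a=100, n_a=1000, conv_b=130, n_b=1000)
print(round(z, 2), round(p, 3))
```

A small p-value suggests the difference between the two variants is unlikely to be pure chance; how small is "small enough" is a business decision made before looking at the data.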
6. Presenting that data:
- Tableau
- PPT
- Excel
- Google Sheet
- Spark
In addition: the processed or raw data can be stored in Hadoop.
7. My Action Plan
- Data science course by Microsoft: https://www.edx.org/microsoft-professional-program-data-science
- Google machine learning course: https://cloud.google.com/training/data-ml
8. Some links
- https://www.python.org/downloads/mac-osx/
- TensorFlow Packages https://www.tensorflow.org/install/install_mac
- https://virtualenv.pypa.io/en/stable/
- Finocracy : http://finocracy.com/
Back to the story again….
The two courses in my action plan interested me more than the others. The latter focuses its teaching on marketing use cases, which makes the concepts easy for me to understand (as I am a marketer). The former appeals because its course material starts with Excel and some add-ons that can be used for predictions and modelling, and only then do the deeper lessons move on to R or Python.
The intention of sharing this article is to get feedback on the understanding I have so far, and to hear any other ideas you have on how we can move forward with data science and machine learning technologies. What are your thoughts?
Initially published on Medium and LinkedIn
— — — — — — — — — — — — — — — — — — —
The Job Descriptions collected:
TIA:
- databases such as PostgreSQL and MySQL
- distributed systems such as Hadoop, Spark, Redshift is a plus
- Experience with experimenting on different modelling techniques (supervised and unsupervised learning) and develop data visualisation for data stories
- Experience with natural language processing(NLP), graph theory and machine learning algorithms available on libraries such as scikit-learn and Spark MLlib
- Experience working with open-source software; experience with workflow management tools such as Airflow/Luigi is desired
- Experience with deployment of production-grade analytics pipelines will be a plus
- Experience building web applications, microservices and cloud services will be a plus
- Good knowledge in statistical modelling
- Experience in machine learning and data visualization.
- Ability to communicate complex ideas to average people.
- Ability to find the linkage between business, science and data
In a Singapore government job listing:
- Technical expertise with data analysis techniques and tools, data models, database design development, data mining and segmentation techniques
Tookitaki:
- You have familiarity with Python/R
- You are familiar with SQL and other data manipulation tools and packages
- You have solved various Kaggle problems and are one among the top performers
A hospital:
- SQL-based queries (PHP, MySQL or other), flat-file handling through programmatic code, ETL Tools
AI developer
- Experience using at least one ML framework (TensorFlow, Scikit-Learn, Singa, Caffe, etc.)
- Prior experience in scraping data from websites (i.e. using Pattern, Beautiful Soup, etc.)
- Prior experience using social networking APIs, or mining social media data.
- Prior experience working on mobile apps.
Silent eight:
- Practical knowledge of Python and text/data processing packages, e.g. pandas, numpy, scikit-learn, nltk
- Experience with machine learning applications and algorithms, e.g. Linear/Logistic Regression, Neural Nets, Deep Learning, Hidden Markov Models, Naive Bayes, Game Theory.
What will gain extra points:
- Knowledge of Spark ML/MLLib
- Practical skills in Natural Language Processing
- Knowledge of graph databases such as Neo4J
Others :
- predictive analytics/forecasting space
- what to predict, how to build DV, what value addition he is bringing to the client among others
- Understand and analyze large, complex, multi-dimensional datasets and build feature matrix relevant for business
- Understand the math behind algorithms and is able to choose one over another
- Understand approaches like stacking, ensemble and apply them rightly to increase accuracy
- Research on ML algorithms related to operational and credit risk models and write code in python
- Computational linguistics, semantics, information retrieval, summarization, question answering
- Statistical machine learning and inference, mathematical statistics, probabilistic programming , Bayesian Networks
- Recommender or decision support systems, human computer interaction, end user explanation
- Experience in solving real data science problems (working experience, Kaggle, or similar competition experience is a plus).
Perx: Senior Data Scientist Lead
- Requirements: Ph.D. or Master’s Degree in CS, Statistics, operations research, applied statistics, data mining, machine learning, physics or a related quantitative discipline.
- A deep understanding of statistical and predictive modeling concepts, machine-learning approaches, clustering and classification techniques, and recommendation and optimization algorithms.
- More than three years of industry experience in predictive modeling and analysis, predictive software development. Experience in mentoring junior team members, and guiding them on machine learning and data modeling applications, would be a plus.
- Strong problem-solving ability. Good programming language skills.
- Experience using Python and/or R
- Experience using machine learning libraries eg scikit-learn, caret, mlr, mllib
- Strong communication and data presentation skills.
- Experience handling large, complex, or challenging datasets
- Experience working with distributed systems and/or grid computing
- Publications or presentations in recognized Machine Learning and Data Mining journals/conferences would be a plus.
From Twitter JD:
- Extracting and transforming data from systems like Hadoop and SQL, using tools such as Pig, Scalding, Hive, Presto
- Some experience with one or more object oriented languages like Java, Scala, C++
- Some experience with scripting languages like Python or Ruby etc.
- Some experience with statistical programming environments like R or Matlab
Bonus points:
- Experience with machine learning
- Experience with large datasets and Map Reduce architectures like Hadoop and open source data mining and machine learning projects
UBER:
What you’ll do
- Build Machine Learning models to help optimize our operations globally
- Dig through our extensive datasets to find actionable insights for city teams
What you’ll need
- Extensive experience with common analysis tools — SQL, R, Python, Julia or similar. Demonstrable familiarity with programming concepts
- 2+ years experience in quantitative analytical roles
FROM ALTITUDE LABS:
- Participate in cutting edge research in the application of artificial intelligence. Designs experiments, test hypotheses, and build models.
- Conducts data analysis and designs moderately complex algorithms.
- Develop predictive models and frames business scenarios that are meaningful and which impact on critical business processes and/or decisions.
Job Requirements
- MS or PhD in Computer Science, Mathematics, Statistics, Physics or related technical field (or equivalent practical knowledge).
- Experience with one or more general purpose programming languages including but not limited to: Python, C/C++, Java, Scala or Go.
- Good knowledge in statistical modelling.
- Experience in machine learning and data visualization.
- Ability to communicate complex ideas to average people.
- Ability to find the linkage between business, science and data