My Data Guest — An Interview with Andrea De Mauro
It was my pleasure to recently interview Andrea De Mauro as part of the My Data Guest interview series. He busted myths about data science superheroes, settled the IT vs data science question, and spoke about the importance of getting your hands dirty with data and algorithms.
Andrea De Mauro has more than 15 years of international experience in managing data analytics and data science teams at various organizations. Currently, he is the Head of Business Intelligence at Vodafone in Italy. Prior to that, he served as Director of Business Analytics at Procter & Gamble. He is a professor of Marketing Analytics and Applied Machine Learning at various universities including the International University of Geneva (Switzerland) and the Universities of Bari and Florence (Italy). He is also the author of popular data science books and of research papers in international journals.
Rosaria: How many different professional profiles do you see in the data science space?
Andrea: The traditional myth of a data scientist as a superhero, who takes care of entire end-to-end processes or the full landscape of complexities around analytics is far from the reality. Today, there are plenty of roles available in the amazing world of data analytics. I normally use three main role families to explain them:
- Data analysts or business analysts, who have deep business expertise in a specific domain and “translate” needs between other data practitioners and the business teams.
- Data scientists, who focus more on the algorithms and on the scaling of the analytics capabilities.
- Data engineers, who are involved with the implementation and maintenance of the full technology stack.
Rosaria: Which professional category is the hardest to recruit?
Andrea: They are all tough to recruit for these days! But I think the Business Analyst role is the hardest one: it is hard to find people who have what it takes to address business complexities with the right algorithms. The role is also difficult to explain to recent graduates.
Rosaria: Do the different profiles need different data education in your opinion?
Andrea: I think all of these profiles require primarily one thing: a growth mindset – the willingness to keep learning.
None of these professionals can survive without being open to learning continuously: this is particularly true for business analysts and data scientists. Data engineers also require some vertical technical expertise on one or more big data platforms, like GCP, AWS, or Azure. In general, as a data engineer, you also want to have a certification. The good news is there are plenty of opportunities to get certified.
I also think there is a very rich MOOC offering to learn data science online on platforms like Coursera, edX, and other providers. They offer certification paths for aspiring data scientists or analysts. I normally recommend that people who do not have an educational background dedicated to data science make use of these learning platforms.
Rosaria: What tools do you think a data professional should learn?
Andrea: It's really important for an aspiring data professional to have the right set of tools and options, and to know how and when to leverage them. You don’t need to learn all the tools, but you should have a good mix of products that complement each other as part of a versatile toolbox.
The toolkit I would recommend includes a business intelligence product, focused on enabling scaled dashboards and data visualization capabilities, and a versatile analytics platform. KNIME is a great example of a low-code analytics platform. I would also include more traditional code-based analytics tools, which can be integrated perfectly with a low-code platform like KNIME.
Rosaria: Now let’s talk to the teacher: do you use KNIME to teach your data science courses?
Andrea: Of course! I have used KNIME for a long time now, both at the universities and at work with Vodafone and P&G. It’s an amazing tool to teach data science for multiple reasons. KNIME makes the process of coding convenient so you can focus on the core analytical tasks.
Those who would like to start using analytics often feel discouraged by their lack of coding skills. KNIME offers a solution to this. Visual tools like KNIME let you “see” and track what’s going on at each step. You can easily identify where the problem is if you are stuck at a certain point. This really supports the educational experience when teaching data science courses. In short, it increases the efficiency of the learning process for the students.
Students appreciate it as well. KNIME makes the learning more accessible and sometimes also more fun. The experience of building a workflow step by step is somewhat enjoyable for them. The usage of KNIME nodes makes the learning modular and progressive. A node makes you “see” what is going on with your data in the flow very easily. The Joiner node, for example, combines the two input tables into one single output table. Or the Loop nodes apply iteration to a sequence of steps between Start and End. By making it “visual”, you understand it better and reduce the chance of making mistakes.
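For readers who come from a coding background, the two node types mentioned above can be approximated in a few lines of plain Python. This is only a rough sketch of the concepts, not a description of how KNIME implements them: the sample tables, the column names, and the `inner_join` helper are all hypothetical, and an inner join is assumed as the join mode.

```python
# Hypothetical sample tables; names and values are illustrative only.
customers = [
    {"customer_id": 1, "name": "Ada"},
    {"customer_id": 2, "name": "Bruno"},
    {"customer_id": 3, "name": "Chiara"},
]
orders = [
    {"customer_id": 1, "amount": 40.0},
    {"customer_id": 1, "amount": 15.5},
    {"customer_id": 3, "amount": 99.0},
]


def inner_join(left, right, key):
    """Combine two tables into one, keeping rows whose key appears in both."""
    # Index the right table by key so each left row can find its matches.
    index = {}
    for row in right:
        index.setdefault(row[key], []).append(row)
    # Emit one merged row per matching (left, right) pair.
    return [
        {**left_row, **right_row}
        for left_row in left
        for right_row in index.get(left_row[key], [])
    ]


# "Joiner": two input tables become one output table.
joined = inner_join(customers, orders, "customer_id")

# "Loop": apply the same step to every row and accumulate a result.
totals = {}
for row in joined:
    totals[row["name"]] = totals.get(row["name"], 0.0) + row["amount"]

print(joined)
print(totals)
```

In a visual workflow, each of these steps is a node whose output table can be inspected directly, which is the "seeing" that the answer above refers to.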
Rosaria: So let's talk now to the technical professional. Do you see yourself as a data analyst, a data engineer, or a data scientist?
Andrea: I see myself more as a data analyst, right at the intersection between the business requirements, the business needs, data and algorithms. Fortunately, I’ve had the opportunity to see the full picture and practice bits and pieces of all three roles. If I had to choose, I would see myself more as a Data Analyst.
Rosaria: How significant is the impact of implementing data analytics in a business? Does it help, or is it just an academic exercise?
Andrea: It’s a rhetorical question, of course. Data analytics is making and will make a huge difference. It's a game changer for the business. It changes the operating model that defines how the company works. Eventually, it changes the way business is done. It's not just a technicality or an IT complexity to manage, but a novel way of running an organization. Emotionally, it also brings great excitement. Think about prototyping an analytics solution that fits your business case well or finding deep insights you never expected. Experiencing the seeming “magic” of data analytics is a potential morale booster for everyone.
Rosaria: As a manager how do you build your data science team? Where do you start from?
Andrea: Let’s start where you don't start from! You don’t start by hiring dozens of generic data professionals without first looking at the talents that you already have in your family. You can definitely grow data scientists from the current talents you already have in your company by upskilling those who have curiosity, passion, and willingness to learn.
Following this course of action, in my opinion, has two major benefits, which I feel are worth mentioning:
- Only the person with knowledge and understanding about how the company operates and how data flows through it can really understand the real opportunities and build some meaningful data analytics.
- It’s refreshing for the professionals, no matter what their background is, to boost their career path and development by getting serious on data analytics. This opportunity should not be restricted to those who have a technical or an IT background.
Rosaria: Talking about the IT and data science teams, this is a question I’ve considered for a long time: Where should the deployment of data analytics applications reside? People argue about whether it should stay with the data science team or be an IT responsibility.
Andrea: It is difficult to give a general answer, as it really depends on each case and how the company is organized, but one thing is for sure: there is a strong need for collaboration between the IT and data science teams. Whether these teams are separated or together depends on where the company is in terms of maturity and on many other factors, sometimes power-driven and political.
If they are separated, it is necessary that each team understands the data lifecycle process. Otherwise it's unsustainable. So, in short, it's inherently a collaboration.
The ownership of the capabilities and their utilization, however, should reside in the business - neither in the data science team nor in the IT one.
Rosaria: Now let’s play a bit of myth busters in the field of data analytics. What are the myths around data analytics prevalent in the tech market and do you think they are unjustified?
Andrea: Data analytics was born with a strong sense of euphoria around it. This has led to the creation of many myths:
First, as we mentioned earlier, some people were led to think that it is enough to have a good team of strong data scientists to cope with the full set of needs. This is a myth. You need a multifaceted team of data professionals, working hand in hand with engaged business partners. You also need a toolkit made of multiple tools to enable that team to perform at its peak.
Second, some claim that the process of data science will be fully automated. We hear a lot about AutoML, which is an important direction in machine learning and artificial intelligence. However, it creates the myth that in the future humans will not be needed and machines will take over the process, which isn't true. The AI we are working with right now is not meant to be generic. Rather, it is able to solve specific issues and complexities, and only with human guidance. So, it will always be a collaboration between humans and machines.
Rosaria: Let’s talk a bit about the recent book you wrote, “Data Analytics Made Easy”. Who is this book for? Is the book enough to understand data analytics and to even apply it?
Andrea: I would say it's for everyone who works, or would like to work, with data. The intent of the book is “to make data analytics easy.” It gives practical insights on how to perform specific tasks such as creating a report, automating a sequence of steps that today you run in Excel, or building a compelling presentation that tells a story with data. It’s for knowledge workers and for their managers. It’s important for the latter to be aware of what data analytics is all about and to learn first-hand what they are asking of their team when it comes to data. It’s also for those students and graduates who would like to start a career in data analytics.
Everyone who reads the book will be able to put data analytics into practice. The tutorials are based on examples that can be reapplied to many different business cases. You don't have to be in a specific role or have a certain background to understand the content of this book. Ultimately, I think the book can give you reassurance that you “can” learn analytics little by little and become autonomous enough to put data analytics into practice at work.
Rosaria: Finally, I would really like to know what career advice you would give to an aspiring data professional?
Andrea: Do not wait for your first job to get the experience you need. You can start getting your hands dirty with data and algorithms well before your first interviews!
My advice will always be to go out and look for the right opportunities around you to apply analytics first-hand. Look for local charities in need of help with their data. Check out websites like Kaggle to find free online competitions and gain experience. This will give you opportunities to build your own first portfolio of models and successful data analytics applications.