From Nicolas Aidoud
Generative AI has the potential to transform the role of data analysts and make their work more efficient and productive. In this article, we will explore the impact of generative AI on data analysis and how it can benefit data scientists.
Today’s Landscape:
Generative AI models like ChatGPT, Bard, and Bing Chat can already write SQL, Python, and R code, and their code generation keeps improving. OpenAI even released a ChatGPT plugin called Code Interpreter, which lets users upload data files and perform data analysis tasks without writing any code. However, there are limitations to consider.
The Limitations:
While generative AI has impressive capabilities, it faces challenges that keep it from completely replacing data analysts. For example, Code Interpreter currently supports uploading only one table, which limits its usability for complex analyses that combine several tables. Additionally, pushing sensitive company data outside the firewall raises data security concerns. These limitations highlight the continued need for domain knowledge and the ability to ask the right questions when analyzing data.
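To make the one-table limitation concrete, here is a minimal sketch of a question that needs two tables: revenue per region, where the region sits in one table and the order amounts in another. The table names, columns, and figures are invented for illustration and use an in-memory SQLite database from Python's standard library, so nothing here reflects a real company data set.

```python
import sqlite3

# Hypothetical two-table data set; names and values are illustrative only.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (customer_id INTEGER PRIMARY KEY, region TEXT);
    CREATE TABLE orders (order_id INTEGER PRIMARY KEY, customer_id INTEGER, amount REAL);
    INSERT INTO customers VALUES (1, 'North'), (2, 'South'), (3, 'North');
    INSERT INTO orders VALUES (10, 1, 120.0), (11, 2, 80.0), (12, 3, 50.0), (13, 1, 30.0);
""")


def revenue_by_region(connection):
    """Join orders to customers and total the order amounts per region."""
    query = """
        SELECT c.region, SUM(o.amount) AS revenue
        FROM orders AS o
        JOIN customers AS c ON c.customer_id = o.customer_id
        GROUP BY c.region
        ORDER BY c.region
    """
    return connection.execute(query).fetchall()


print(revenue_by_region(conn))  # [('North', 200.0), ('South', 80.0)]
```

Neither table can answer the question alone, which is why a tool restricted to a single uploaded table falls short for this kind of analysis.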
The Value of Domain Knowledge:
The greatest value of data analysis work lies in its ability to answer ad hoc, complex questions that require immediate attention. Generative AI models rely on existing data sets to generate answers, but they struggle with answering situational questions that have never been asked before. Data analysts possess domain knowledge that allows them to interpret findings and provide insights in unpredictable scenarios, making them irreplaceable in these situations.
Current Uses and Future Potential:
Currently, the best use of generative AI in data analysis is writing code and explaining the code it generates. This can greatly help data analysts learn and write code more efficiently. Furthermore, tools like GitHub Copilot provide real-time coding suggestions and improvements. Recent developments, such as Dolly, Databricks’ open-source generative AI model, show the rapid progress in this field and its potential impact on data analysis workflows.
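As an illustration of the kind of small, explained snippet an analyst might ask an assistant for, here is a hand-written sketch of a routine calculation: month-over-month percent change. It is not the output of any particular model, and the function name and sample figures are hypothetical.

```python
def month_over_month_change(monthly_totals):
    """Return the percent change between each consecutive pair of monthly totals."""
    changes = []
    # Pair each month with the one that follows it: (m1, m2), (m2, m3), ...
    for previous, current in zip(monthly_totals, monthly_totals[1:]):
        if previous == 0:
            # Percent change from zero is undefined, so record a gap instead.
            changes.append(None)
        else:
            changes.append(round((current - previous) / previous * 100, 1))
    return changes


sales = [100.0, 110.0, 99.0]  # hypothetical monthly sales totals
print(month_over_month_change(sales))  # [10.0, -10.0]
```

The comments carry the explanation alongside the code, which is the part that helps an analyst learn the technique and not only reuse the result.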
The Future Data Analyst:
Generative AI will undoubtedly reshape data analysis workflows. Repetitive tasks and routine analyses will likely be performed by generative AI, while data analysts will focus on leveraging their domain knowledge and incorporating generative AI tools to enhance their efficiency. Data analysts with business line-level expertise combined with generative AI skills will be the prototypical data scientists of the future.
In conclusion, while generative AI will bring significant changes to data analysis, it won’t fully replace data scientists. Instead, it will augment their capabilities, making them more effective in their work. The collaboration between data analysts and generative AI has the potential to revolutionize the field and drive greater insights from data.