Maturing AI in AI-Naive Enterprises via the Generative AI Hype
Don’t get me wrong: I would prefer we mature AI with tabular data use cases, but LLMs have us focused on NLP and computer vision. Let’s ride this wave to operationalize AI with tabular data use cases.
Many AI-naive organizations have bought into the Generative AI hype brought about by the latest Large Language Models, which have been globally accessible to the public, notably to laypersons. When BERT and ELMo came out in 2018, these were mainly NLP tools for data scientists, but ChatGPT, released in late 2022, made this technology available to everyone.
We can thank ChatGPT for teaching executives about the value of AI for natural language processing and computer vision use cases such as question answering, text generation, and image generation. The hype also instilled fear and FOMO in executives: failing to adopt this new Generative AI technology could set their organizations back and cost them the race against competitors who successfully implement this disruptive technology.
Ideally, enterprises should have started with tabular data use cases for AI/ML, but implementing tabular use cases in production, at enterprise scale, has been an elusive endeavour for many. Attempts to build complete production AI systems that are safe and responsible for tabular use cases have stalled due to complexities at the enterprise level with regard to security, governance, and the sensitivity of corporate/proprietary data, to name a few issues. Yet when analyzing corporate data, the majority of it is tabular, stored in and retrieved from enterprise data warehouses.
But when there is an executive push to adopt Generative AI for NLP and CV, there is also an opportunity to push for building complete AI/ML systems with tabular use cases, because the backlog items needed to convert an AI proof of concept from the prototype/exploration phase into the development and production phases are similar across all three use case types: tabular, NLP, and CV.
One reason AI-naive enterprises fail to operationalize their AI prototypes is that tabular data at many organizations is complex, poorly defined, and difficult to access for research. By contrast, LLMs have easy, open access to text and images on the internet and are able to train on huge corpora. Hence, much research has made its way to NLP and CV.
Tabular data is not readily accessible for research, and much of it is sensitive and proprietary. Even employees within the same company have difficulty accessing and sharing data across teams. As a result, researchers are left with the same small open datasets, such as the iris dataset many of us data scientists used when learning data science in university.
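To illustrate the contrast: the tabular modeling most data scientists learn on is trivially reproducible on open data like iris, which is exactly why it is a poor stand-in for the messy, access-restricted tables inside an enterprise. A minimal sketch (scikit-learn and the random forest are illustrative choices, not a prescription):

```python
# Minimal tabular ML baseline on the small, open iris dataset.
# In an enterprise, the hard part is getting governed access to data
# like this, not the modeling code below.
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

# Load 150 rows of tabular data: 4 numeric features, 3 classes.
X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=42, stratify=y
)

model = RandomForestClassifier(n_estimators=100, random_state=42)
model.fit(X_train, y_train)

acc = accuracy_score(y_test, model.predict(X_test))
print(f"Held-out accuracy: {acc:.2f}")
```

The modeling itself takes minutes; the enterprise backlog items (security review, data access, governance) are what separate this prototype from a production system.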
So, with the hype of Generative AI, we can use this push from executives to get what we really want: complete, safe, and responsible AI/ML systems for tabular data, which represents most of the use cases at organizations and will unlock the most business insight and value.
But we will get there by riding the Generative AI wave for NLP and CV, even though these use cases will be a minority of AI/ML use cases at most organizations. If Generative AI is the way for AI-naive enterprises to build complete AI systems that are safe and responsible, then by all means, let’s get to work and make this happen. A future article will cover this journey to building complete Generative AI systems, with a Generative AI roadmap and Generative AI governance.
If you're looking for support from me, here are a few options:
Enterprise Data Science Consultancy: With my consulting team, comprising a Senior Data Scientist, Senior ML Engineer, Senior Data Engineer, and Senior Cloud Engineer, we will help you architect and build your Enterprise Data Science platform, and transfer knowledge to your IT team to maintain and optimize it. We will also overlay an MLOps framework to manage the AI solutions you build on this platform. If you don’t have an MLOps team, we will help you build one. Please get in touch about this consultancy here
Coaching and Mentorship: I offer coaching and mentorship; book a coaching session here
AI for tabular data is where the value will explode, especially in Pharmaceutical and Biotechnology R&D, but we need to get the foundations right first. Many organizations lack a coherent data management strategy, along with appropriate metadata and standards, making it very complicated to assemble large datasets. Almost everyone jumped on the LLM train because it did not seem too difficult to use existing unstructured data, and it let them claim they had joined the hype without much delay, although the value of some of those projects may be questionable.