SQL continues to democratize data analytics

Stephen Redmond
2 min read · Nov 20, 2020

Facebook famously were early adopters of Hadoop Big Data technologies to manage their vast and growing amounts of data. But then they faced another problem: with early Hadoop, you needed Java developers to write MapReduce code to get answers from your data. This created a bottleneck that Facebook solved by creating Hive, whose SQL-like language is automatically translated into MapReduce code so that a Data Analyst, with no Java programming skills, could query huge amounts of data. It is safe to say that Hive has been a huge part of the Big Data revolution, making data accessible to users.
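To make the contrast concrete, here is a minimal sketch in Python. It is not Hive or Hadoop: the page-view data is made up, the map/shuffle/reduce steps are plain Python standing in for hand-written MapReduce code, and the standard library's `sqlite3` module stands in for a SQL engine. The point is only how much less the analyst has to write when a single SQL statement does the job.

```python
import sqlite3
from collections import defaultdict

# Hypothetical page-view log: (user, page) pairs.
PAGE_VIEWS = [
    ("alice", "home"),
    ("bob", "home"),
    ("alice", "profile"),
    ("carol", "home"),
    ("bob", "profile"),
    ("alice", "home"),
]


def count_views_mapreduce(rows):
    """Count views per page the hand-written way: map, shuffle, reduce."""
    # Map: emit a (page, 1) pair for every row.
    mapped = [(page, 1) for _user, page in rows]
    # Shuffle: group the emitted values by key.
    grouped = defaultdict(list)
    for page, one in mapped:
        grouped[page].append(one)
    # Reduce: sum the values collected for each key.
    return {page: sum(ones) for page, ones in grouped.items()}


def count_views_sql(rows):
    """Count views per page with one declarative SQL statement."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute("CREATE TABLE page_views (user_name TEXT, page TEXT)")
        conn.executemany("INSERT INTO page_views VALUES (?, ?)", rows)
        cursor = conn.execute(
            "SELECT page, COUNT(*) FROM page_views GROUP BY page"
        )
        return dict(cursor.fetchall())
    finally:
        conn.close()


if __name__ == "__main__":
    print(count_views_mapreduce(PAGE_VIEWS))
    print(count_views_sql(PAGE_VIEWS))
```

Both functions return the same counts. The second one asks the engine what is wanted and leaves the how to the engine, which is the trade Hive offered analysts.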

One area that has been gaining traction recently is machine learning, part of the AI set of techniques. However, as with data manipulation on early Hadoop, people who want to get involved with machine learning generally need to learn a programming language, with Python being the most popular choice. True, there is a plethora of tools that allow users to build and execute models, but they all work differently, so a user who knows one tool still has to learn the next one before becoming productive.

That is, until BigQuery ML came along!

BigQuery is Google’s Hive: a data warehouse product that sits across extremely large datasets, and one that users work with using SQL syntax. Extending it to include machine learning functionality means that any user who understands SQL (and that means a whole swath of us data analytics folks) can build and execute ML models using familiar syntax.
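As a rough illustration of what that familiar syntax looks like, the sketch below uses Python only to assemble and print two BigQuery ML statements: a `CREATE MODEL` that trains a logistic regression and an `ML.PREDICT` query that scores new rows. The helper functions and every dataset, table, and column name (`analytics.churn_model`, `analytics.customers`, `churned` and so on) are hypothetical, invented for this example. Nothing here talks to BigQuery.

```python
def create_model_sql(model, source_table, label_column, feature_columns):
    """Build a BigQuery ML CREATE MODEL statement for logistic regression."""
    columns = ", ".join(list(feature_columns) + [label_column])
    return (
        f"CREATE OR REPLACE MODEL `{model}`\n"
        f"OPTIONS (model_type = 'logistic_reg', "
        f"input_label_cols = ['{label_column}']) AS\n"
        f"SELECT {columns}\n"
        f"FROM `{source_table}`"
    )


def predict_sql(model, scoring_table, feature_columns):
    """Build an ML.PREDICT query that scores rows with a trained model."""
    columns = ", ".join(feature_columns)
    return (
        f"SELECT *\n"
        f"FROM ML.PREDICT(MODEL `{model}`,\n"
        f"  (SELECT {columns} FROM `{scoring_table}`))"
    )


if __name__ == "__main__":
    # All names below are made up for illustration.
    features = ["tenure_months", "monthly_spend", "support_tickets"]
    print(create_model_sql(
        "analytics.churn_model", "analytics.customers", "churned", features
    ))
    print()
    print(predict_sql(
        "analytics.churn_model", "analytics.new_customers", features
    ))
```

The printed statements are ordinary SQL with a few ML keywords added. An analyst who already writes `SELECT ... FROM ...` has most of what is needed to read them.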

Of course, it is not perfect. What it is, though, is an indication that Google recognize that SQL and SQL users are not going anywhere anytime soon. Google have lots of tools that allow you to build ML, from Notebooks to AutoML to their pre-built models, so they didn’t have to do this. The fact that they did has opened machine learning to the “ordinary” data analyst, and that democratizes it.

SQL is not going away any time soon!

Written by Stephen Redmond

Stephen Redmond, Big Data, AI & Data Viz Professional. MSc in Data Analytics. Qlik Luminary. Author and blogger. All opinions my own.
