With another year nearly behind us, it’s time to sit back and reflect on what we’ve just been through. It’s been another active year in the big data space, with plenty of news for the intrepid big data reader.
We’ve had an eventful year here at Datanami, which will soon complete the transition to BigDATAwire (keep your eyes out for that change in January). With that in mind, it’s worth taking a look at the top stories in each of the past 12 months. The rankings are based on pageviews.
January: All Eyes on Snowflake and Databricks in 2022
The new year kicked off with a lot of anticipation for what Databricks and Snowflake would do. The two companies didn’t disappoint, with a number of new capabilities and continued strong growth (although the much-anticipated Databricks IPO never materialized). These two data giants will be interesting to watch in 2023 too, although it will be tough to cover their respective user conferences in June, which take place on the same days (with Databricks in San Francisco and Snowflake in Las Vegas).
February: Snowflake, AWS Warm Up to Apache Iceberg
Apache Iceberg, the new open table format that solves a lot of consistency problems in big data lakehouses, came on strong in late 2021, and its usage grew through 2022. We named Ryan Blue, the co-creator of Iceberg, as one of our people to watch. Databricks, for what it’s worth, announced support for Iceberg later in the year (it also open sourced its Delta table format, providing competition to Iceberg, along with Apache Hudi).
March: Home Depot Finds DIY Success with Vector Search
Vector search was one of the most compelling new technologies to find traction in 2022. We got an inside view of how the technology (often deployed using vector databases) helped home improvement giant Home Depot supercharge its customers’ Web and mobile searches by using neural networks to infer what they’re looking for instead of maintaining a massive dictionary of commonly misspelled words.
April: The Modernization of Data Engineering at Capital One
Democratization of data science and data analysis may be the goal, but data engineering is often the path to get there. The folks at Capital One realize this, which is why the company has poured resources into data engineering to streamline access to data. Its internal data marketplace combines a data catalog, an automated data pipeline development tool, data governance, and data quality, and it’s held together with a data mesh.
May: Anaconda Unveils PyScript, the ‘Minecraft for Software Development’
Python has become the lingua franca for data science. That’s not news. But with Anaconda’s new PyScript, which CEO Peter Wang unveiled at the PyCon 2022 conference, the company helped to lower the barrier to developing data science applications in the comfort of a Web browser.
June: EMR Serverless Now Available from AWS
Apache Hadoop has long ceased being the center of gravity of the big data world. But Hadoop’s legacy lives on, including at AWS, where its Amazon EMR offering continues to be a smash hit among customers using Apache Spark, Apache Flink, Apache Hive, Presto, and even MapReduce code. And with its new serverless option, Amazon EMR (which used to stand for Elastic MapReduce but doesn’t officially anymore) helped to eliminate one of the big usability hurdles of that old elephant, Hadoop.
July: Mathematica Helps Crack Zodiac Killer’s Code
Sometimes, stories languish on Datanami for months before readers finally realize what they’ve been missing. Such was the case with this January 2022 story, which described how a trio of men from Virginia, Australia, and Belgium used the Mathematica statistical package from Wolfram to crack the Zodiac Killer’s code. Discover Magazine gets credit for first reporting this story. Unfortunately, the identity of the Zodiac Killer, the serial killer who terrorized Northern California more than half a century ago, remains unresolved.
August: Datanami People to Watch 2022
We first announced the 12 Datanami People to Watch back in February, and ran interviews with the group over the course of the year. It’s a great group of leaders, including Yu Xu (TigerGraph), Lauren Woodman (DataKind), Venkat Venkataramani (Rockset), Adam Selipsky (AWS), Matthew Scullion (Matillion), Satyen Sangani (Alation), Andrew Ng (LandingAI), Tristan Handy (dbt Labs), Susan Gregurick (NIH), Zhamak Dehghani (Thoughtworks), Joy Buolamwini (MIT Media Lab), and Ryan Blue (Tabular). Keep an eye out in early 2023 for the next batch.
September: Walmart Gives Data and Analytics Monetization A Try
As the world’s largest retailer, Walmart knows a thing or two about selling. With the launch of its new Walmart Data Ventures arm earlier this year, the company introduced new offerings in its Walmart Luminate line, such as Shopper Behavior, Channel Performance, and Customer Perception. The retail giant is not only selling its partners data about its store sales (2 billion market baskets per quarter, the company says), but selling them prepackaged analytics insights, too.
October: Data Mesh Vs. Data Fabric: Understanding the Differences
There’s no denying it: Data fabrics and data meshes are hot. There’s also no denying that there’s a lot of confusion around these two concepts, which share some similarities but also have important differences. This article, which was published in October 2021, took a year to bubble up and become the most-viewed story for a month, showing just how much demand there is for information on data meshes and data fabrics. Expect more interest in data meshes and data fabrics in the new year.
November: What Does Data and Analytics Need for 2023? Forrester Shares Predictions
Up to this point, Datanami had one ironclad rule: No new year predictions stories before Thanksgiving. (It was the only way to keep the PR folks at bay.) For whatever reason, we broke the rule this year when we interviewed Forrester analyst Kim Herrington and published her analyst group’s predictions for 2023, and the result was the top grossing story for the month. Go figure.
December: UC Berkeley Launches SkyPilot to Help Navigate Soaring Cloud Costs
One of the biggest emerging trends in 2022 was the rising cost of cloud computing. The folks running the computer science program at UC Berkeley realized this, which is why they created Sky Computing as the follow-on to RISELab (which succeeded AMPLab). One of Sky Computing’s first creations is SkyPilot, which lets users run batch machine learning workloads on any cloud. There’s no telling whether it will be as highly successful as Ray, which came out of RISELab, or Spark, which came out of AMPLab. But considering the attention staff writer Jaime Hampton’s story received, we’re not betting against it.
That’s it from us this year at Datanami. Happy holidays, and we’ll see you back here in 2023.