Radar

Data & AI Technology Radar

Introducing the Unit8 Technology Radar: a comprehensive guide that helps businesses navigate the ever-evolving technology landscape. The radar serves as a strategic compass, providing insights into emerging technology trends, platforms, tools, languages, and frameworks. Join us as we explore the Unit8 Technology Radar and discover the latest innovations that can propel your organization to new heights.


What’s the radar about?

Technology Radar is a comprehensive tool, inspired by the pioneering efforts of our colleagues at Thoughtworks, that showcases the latest trends and developments in the Data, Advanced Analytics, and AI space. This tool is a culmination of the collective experience of our engineering team, drawing from hundreds of projects and collaborations with our customers each year.


Methodology


The radar categorizes key technological trends and tools into four main quadrants: Infrastructure & xOps, Data & Analytics Platforms, ML & Data Science, and GenAI. Additionally, we have ranked these trends and tools by how confidently we would recommend them to our customers.

Adopt: We believe the industry should embrace these technology trends and tools. We incorporate them into our projects and consider them suitable for most enterprises.

Trial: Worth pursuing. Most mature enterprises should be poised to adopt these trends, even though many best practices, whether around architecture or the target operating model, have not yet been firmly established.

Assess: Consider testing the technology to evaluate its maturity and experiment with its potential effects on your enterprise in the future.

Hold: Proceed with caution. Evaluate carefully whether your organization is internally prepared (talent, skills, infrastructure & data readiness) to embrace the tech trend.

2. Reaching production

While low-hanging-fruit GenAI applications—such as document chats and private versions of ChatGPT—are already widely adopted, the current challenge for organizations is transitioning into full-scale production environments.

Stakeholder buy-in hinges on trust in the reliability of outputs. The maturing GenAI stack now offers robust tools to build that trust, including guardrails, evaluation methods, and live feedback mechanisms. Although these might introduce excess complexity for PoCs, leveraging them in production settings is often a wise choice, as their increasing ease of integration offers a favorable effort-to-benefit ratio.
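To make this concrete, below is a minimal, deliberately framework-free sketch of two output guardrails: a pattern blocklist and a crude groundedness check. The `llm_complete` helper, the banned patterns, and the overlap threshold are hypothetical placeholders for your model client and policy.

```python
# A minimal output-guardrail sketch in plain Python. `llm_complete`, the
# banned patterns, and the 0.2 threshold are hypothetical placeholders.
import re

BANNED_PATTERNS = [r"\bsocial security number\b", r"\bpassword\b"]

def llm_complete(prompt: str) -> str:
    """Placeholder for a call to your LLM provider of choice."""
    raise NotImplementedError

def guarded_answer(question: str, context: str) -> str:
    answer = llm_complete(
        f"Answer using only this context:\n{context}\n\nQ: {question}"
    )
    # Guardrail 1: block answers that leak patterns we never want to emit.
    if any(re.search(p, answer, re.IGNORECASE) for p in BANNED_PATTERNS):
        return "I can't share that information."
    # Guardrail 2: a crude groundedness check. Answers whose words barely
    # overlap with the retrieved context are a cheap hallucination signal.
    overlap = len(set(answer.lower().split()) & set(context.lower().split()))
    if overlap / max(len(answer.split()), 1) < 0.2:
        return "I couldn't find a reliable answer in the available documents."
    return answer
```

Production systems typically replace both checks with dedicated guardrail and evaluation tooling, but the shape of the logic (validate before returning, fail safe) stays the same.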

However, the key roadblock to production-grade GenAI solutions is data quality and governance. Organizations have traditionally focused on structured data, given the limited business value previously derived from unstructured data. That has changed rapidly with the rise of LLMs, and data teams now face a new challenge: ensuring the quality and availability of unstructured data keeps pace with this appetite. Notably, metadata plays a key role in many use cases and must be taken seriously.
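As an illustration of why metadata matters, here is a small sketch of metadata-aware retrieval using Chroma (one vector store among many); the collection name, fields, and filter values are illustrative assumptions.

```python
# Sketch of metadata-filtered retrieval for RAG with Chroma; all names,
# fields, and documents are illustrative.
import chromadb

client = chromadb.Client()
docs = client.create_collection("contracts")

docs.add(
    ids=["c1", "c2"],
    documents=["2023 supplier agreement ...", "2024 supplier agreement ..."],
    metadatas=[
        {"year": 2023, "department": "procurement", "source": "contracts/2023.pdf"},
        {"year": 2024, "department": "procurement", "source": "contracts/2024.pdf"},
    ],
)

# The metadata filter narrows retrieval to current documents, so the LLM
# never sees an outdated clause in the first place.
hits = docs.query(
    query_texts=["What are the payment terms?"],
    n_results=3,
    where={"year": 2024},
)
print(hits["documents"], hits["metadatas"])
```

Without the `year` metadata, the model would happily answer from the 2023 contract; no amount of prompt engineering fixes data the retriever cannot distinguish.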

Cost is often a major consideration when moving from PoC to production. While costs per token are plummeting (GPT-4o tokens are now 9 times cheaper than GPT-4 tokens were at its release), modern implementations tend to be far more token-hungry than in the recent past, due to features like long-context RAG, agents, and chain-of-thought.
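A back-of-envelope model like the sketch below helps anticipate this before go-live; every price and volume here is an illustrative assumption, not a quoted rate.

```python
# Back-of-envelope cost model for a RAG assistant; all numbers are
# illustrative assumptions. Plug in your provider's current prices.
PRICE_PER_1M_INPUT = 2.50    # USD, assumed input-token price
PRICE_PER_1M_OUTPUT = 10.00  # USD, assumed output-token price

context_tokens = 8_000    # long-context RAG: retrieved chunks + chat history
output_tokens = 500
requests_per_day = 2_000

daily_cost = requests_per_day * (
    context_tokens * PRICE_PER_1M_INPUT / 1e6
    + output_tokens * PRICE_PER_1M_OUTPUT / 1e6
)
print(f"~${daily_cost:,.0f}/day, ~${daily_cost * 30:,.0f}/month")
```

With these assumed numbers the bill lands around $50 a day; swapping in your own volumes quickly shows whether a long-context or agentic design is affordable at your scale.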

Finally, the maturing GenAI stack is unlocking new classes of use cases that may offer greater business value than simple RAGs and "SafeGPTs". This expansion is likely to broaden the scope of production-level applications. Agents, in particular, provide powerful solutions for complex problems but require more substantial development efforts and robust guardrails. It's crucial to assess the business value of the use cases in which they are adopted to ensure a worthwhile return on investment.
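For intuition, here is a deliberately minimal agent loop with a hard step budget as its simplest guardrail; `llm_decide` and both tools are hypothetical stubs, not a real framework's API.

```python
# A deliberately minimal agent loop: the model picks a tool, we run it,
# feed the observation back, and cap iterations as the simplest guardrail.
# `llm_decide` and both tools are hypothetical stubs.

def llm_decide(history: list[dict]) -> dict:
    """Placeholder: ask the model for the next step, e.g.
    {"action": "search", "input": "..."} or
    {"action": "finish", "answer": "..."}."""
    raise NotImplementedError

TOOLS = {
    "search": lambda q: f"(stub) top results for {q!r}",
    "calculator": lambda expr: f"(stub) evaluated {expr!r}",
}

def run_agent(task: str, max_steps: int = 5) -> str:
    history = [{"role": "user", "content": task}]
    for _ in range(max_steps):  # hard step budget: cheap but essential
        step = llm_decide(history)
        if step["action"] == "finish":
            return step["answer"]
        observation = TOOLS[step["action"]](step["input"])
        history.append({"role": "tool", "content": observation})
    return "Stopped: step budget exhausted."
```

Each loop iteration consumes the full (growing) history, which is exactly why agents multiply token costs and why step budgets and guardrails belong in the design from day one.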

3. Zoo of LLM models

The landscape of LLMs provides developers with more choices than ever before. Open-source models are now performing competitively in benchmarks and have become easier to host and integrate, thanks to a maturing GenAI stack.

Experimenting with fine-tuning might be worthwhile if standard models do not meet specific use-case requirements, especially since it has become more accessible — not only for open-source, but also for proprietary models like OpenAI’s. While customizing models can yield significant benefits, it's important to note that the state-of-the-art changes quickly. A fine-tuned model might become obsolete if not regularly updated or soon outperformed by newer models that offer better accuracy, efficiency, or features.
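For proprietary models, kicking off a job can be as simple as the sketch below, which uses OpenAI's fine-tuning API (SDK surface as of this writing); the training file and model snapshot are illustrative.

```python
# Sketch: launching a fine-tuning job via OpenAI's API. The file name and
# model snapshot are illustrative; check which snapshots currently support
# fine-tuning.
from openai import OpenAI

client = OpenAI()

# train.jsonl holds chat-formatted examples, one JSON object per line:
# {"messages": [{"role": "user", ...}, {"role": "assistant", ...}]}
training_file = client.files.create(
    file=open("train.jsonl", "rb"),
    purpose="fine-tune",
)

job = client.fine_tuning.jobs.create(
    training_file=training_file.id,
    model="gpt-4o-mini-2024-07-18",  # illustrative snapshot name
)
print(job.id, job.status)
```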

However, fine-tuning doesn't always have to be done in-house. A plethora of specialized models fine-tuned by the community are available—such as those on Hugging Face for open-source models or OpenAI's GPT Store for proprietary options. Leveraging these can save time and resources while still providing tailored performance.
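Picking up such a community model is often a few lines with the transformers library; the model ID below is one example, so substitute whatever fits your task and licence.

```python
# Sketch: loading a community model from the Hugging Face Hub with
# transformers. The model ID and prompt are illustrative.
from transformers import pipeline

generator = pipeline(
    "text-generation",
    model="mistralai/Mistral-7B-Instruct-v0.2",  # example community model
    device_map="auto",  # place weights on available GPU(s) automatically
)

result = generator(
    "Summarize our Q3 sales call notes:",
    max_new_tokens=200,
)
print(result[0]["generated_text"])
```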

The selection of models varies not only in specialization but also in size. Smaller models (7–30 billion parameters) are becoming increasingly capable and popular. They are a cost-effective option for fine-tuning and open up new possibilities for edge computing, enabling on-device deployments instead of relying solely on servers.
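For example, a quantized small model can run fully on-device with llama-cpp-python; the GGUF file path below is a local artifact you supply, and the prompt is illustrative.

```python
# Sketch: running a small quantized model entirely on-device with
# llama-cpp-python. The GGUF path is a local file you provide.
from llama_cpp import Llama

llm = Llama(model_path="./models/small-7b-q4.gguf", n_ctx=4096)

out = llm(
    "Classify this support ticket as billing, technical, or other:\n"
    "'I was charged twice this month.'",
    max_tokens=16,
)
print(out["choices"][0]["text"])
```

No data leaves the machine, which makes this pattern attractive wherever latency, connectivity, or confidentiality rule out a hosted API.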

For enterprises, this means a wider range of options for deploying LLMs from various providers, enabling them to optimize performance, reduce costs, and potentially gain a competitive advantage. To avoid lock-in, they should adopt a flexible and modular AI strategy.
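One pragmatic way to keep that flexibility is a thin abstraction layer, as in this sketch; the adapter classes are skeletal placeholders for real vendor SDK wrappers.

```python
# Sketch of a provider-agnostic layer: application code depends on the
# Protocol, so swapping vendors is a config change. Adapters are skeletal
# placeholders for real SDK wrappers.
from typing import Protocol

class ChatModel(Protocol):
    def complete(self, prompt: str) -> str: ...

class OpenAIChat:
    def complete(self, prompt: str) -> str:
        raise NotImplementedError  # wrap the OpenAI SDK here

class LocalLlamaChat:
    def complete(self, prompt: str) -> str:
        raise NotImplementedError  # wrap a self-hosted model here

def make_model(provider: str) -> ChatModel:
    return {"openai": OpenAIChat, "local": LocalLlamaChat}[provider]()

# Business logic never imports a vendor SDK directly:
model = make_model("openai")
```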

4. The Rise of Interoperable Data Lakes

The era of restrictive data silos is coming to an end. Open data and table formats, such as Apache Iceberg, are breaking down barriers and fostering a new era of collaboration. Now, data teams can seamlessly access the same data with different processing engines (Databricks, Snowflake, Microsoft Fabric, and others) without needing to create redundant copies. This means you can use the best tool for the job, whether it's for exploratory analytics, BI dashboards, or AI model training, all while working with a single source of truth.

This interoperability is powered by a clever approach to metadata management. These formats abstract the underlying data structure, allowing different engines to understand and query the data in a consistent way. For instance, Databricks' UniForm feature allows Delta Lake to seamlessly interoperate with Iceberg and Hudi, while Apache XTable provides bi-directional conversions between various formats. Even Snowflake is embracing this trend, with its external tables functionality and commitment to open standards like Iceberg, further enhancing interoperability between Snowflake and other platforms like Microsoft Fabric.
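As a concrete example, the following PySpark sketch enables UniForm on a Delta table so that Iceberg-speaking engines can read the same files; the table and columns are illustrative, and the property names follow Databricks' documentation at the time of writing.

```python
# Sketch: enabling Databricks UniForm so a Delta table is also readable as
# Iceberg. Table name and columns are illustrative; assumes `spark` is an
# active SparkSession with Delta Lake configured.
spark.sql("""
    CREATE TABLE sales_events (
        event_id STRING,
        amount   DOUBLE,
        ts       TIMESTAMP
    )
    USING DELTA
    TBLPROPERTIES (
        'delta.enableIcebergCompatV2' = 'true',
        'delta.universalFormat.enabledFormats' = 'iceberg'
    )
""")
# Engines that speak Iceberg (Snowflake, Trino, ...) can now query the same
# underlying files through an Iceberg catalog, without creating a copy.
```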

This approach means that organizations can consolidate data in a central repository, such as a data lake on S3 or Azure ADLS, while allowing different teams to use the most suitable processing tool for a given task, irrespective of the initial table format. This can also be a powerful way to save costs: not only on cloud storage itself (as multiple copies of data are no longer necessary), but also on the effort of migrating and keeping data consistent between silos.

However, while interoperability solutions are bridging the gap, the choice of your primary table format still matters. Write operations can vary significantly between formats, and some optimization might be lost during metadata conversion. Therefore, it's crucial to select a format that aligns with your primary use case, whether it's high-volume batch processing, real-time streaming, or large-scale analytics.

In conclusion, interoperable data lakes are transforming the way organizations manage and access their data. By embracing open standards and leveraging the right tools, businesses can unlock new levels of efficiency, collaboration, and insight.

Contributors

The radar is prepared by Unit8's Tech Radar circle, a group of expert engineers specializing in emerging technologies.

Michal Rachtan (CTO)  •  Adam Zagrajek (Tech Radar Lead 2024 Edition)  •  Antoine Madrona  •  Arash Askari  •  Bernard Maccari  •  Dennis Bader  •  Emre Esendir  •  Gabor Kiss  •  Guillaume Raille  •  Kamil Wierciak  •  Khalil Elleuch  •  Marek Pasieka  •  Maxime Dumonal  •  Michel Gawron  •  Nathalie Wagner  •  Samuele Piazzetta  •  Spiros Apostolou  •  Yassir Benkhedda


Turning data into value.

Unit8 is a leading Swiss data services company with a mission to help non-digital-native companies turn data into value through a mix of data science, analytics, and AI. We operate at the intersection of technology and business, accompanying our customers at every step of their data & AI journey with end-to-end services. Based in Switzerland, operating across Europe.
