Often in BI reporting, business needs require you to analyze facts from two different tables in the same visualization. At Trianz, we focus on helping our customers understand the options available within Looker and designing a solution that not only offers the most valuable business solution but is also clean and efficient technically. Looker has two main options to analyzing different data sets: merging results and persistent derived tables.
1. Do you understand how the data from the underlying database tables should be combined?
Explores in Looker are often developed by LookML developers that have a deep understanding of the underlying database tables and their relationships. Almost always, using a single explore with pre-defined relationships is less risky.
Yes -> As long as you have a firm understanding of the underlying data and how primary and secondary data sources differ, your merge will be successful.
No -> Using a single explore based on a PDT developed by a LookML developer with database knowledge is almost always more favorable.
2. Is performance an issue?
Depending on the size of your data and the relationships between the underlying tables, performance may be a large consideration. One major difference between PDTs and normal Looker views is that PDTs are materialized views and therefore have different performance considerations than typical Looker generated queries.
Yes -> PDTs are persistent, meaning they run on a set update schedule. This allows for flexible updates and manual performance tweaking. Furthermore, if everything is coming from a single query, if that query is written efficiently, performance in Looker will often be faster.
No -> Merging is Ok, assuming each underlying query perform adequately.
3. Is it reusable?
In BI reporting, reusability is important from both efficiency and user clarity standpoints. Looker focuses on reusability and “single source of truth” but also offer options for fast and easy ad-hoc reporting. Ask yourself if this same combination of explores will be needed in multiple places. How often do these types of reports come up? Is this an ad-hoc request?
Yes -> If this combination of tables is commonly used, then you should make a PDT and separate explore. By creating an explore and standardizing the filters and fields in the LookML, you ensure everyone is using the same logic.
No -> If this is a one-off report or the combination of facts are only used in one visualization then merging queries from two different explores is fast and easy.
4. Do you need to filter on the results set?
Filtering is often important for performance and usability. Both merging and PDTs have different options for filtering. PDTs have many more options for filtering because the user is writing the query and explores manually. If you are using merges, you have more limited options for filtering.
Yes -> If you need to filter on the final results set as a whole, then merging is not possible. A PDT will give you full flexibility to filter at the query level, explore level and/or Look level.
No -> Merging allows you to filter each individual query separately. You have to apply filters at the query level just as you would with a Look or tile.
5. Are there dimensions to merge on?
In Looker, merging two queries requires what are called “merge conditions”. In order to use fields for your merge conditions, the same or similar fields need to be available in each query.
Yes -> Merging requires common dimensions (such as order date) to merge the two queries together. When exploring each query, make sure you have the common dimension(s) in the select.
No -> Merging is not possible without common dimensions to specify in the merge rules section. If the two queries are too different or don’t have any common dimensions selected, you won’t be able to merge the two.
When you need to see two facts from different underlying tables, whose relationships haven’t been defined by Looker developers or cannot be defined due to technical limitations, there are multiple options available. Persistent Derived Tables and merging results are two efficient ways of combining tables. Outlined here are the major considerations taken when deciding which way to design your Looker model. Trianz specializes in helping clients make the decision that drives the most value for their business.
Senior Consultant - Analytics Practice
Rachel Stewart is a senior consultant in the Analytics Practice at Trianz who helps clients with all visualization needs. Rachel is certified in a number of different visualization tools, including Tableau and Looker. The Analytics Practice works with enterprises to achieve significant competitive advantage via modern visualization technologies, helping clients to maximize their analytics deployments.
Contact Us Today
What Is an SQL Query Engine? SQL query engine architecture was designed to allow users to query a variety of data sources within a single query. While early SQL-based query engines such as Apache Hive allowed analysts to cut through the clutter of analytical data, they found running SQL analytics on multi-petabyte data warehouses to be a time-intensive process that was difficult to visualize and hard to scale.Explore
A Winning Base for Successful Digital Transformations When it comes to developing a successful digital strategy, it is not just corporations planning to maximize the benefits of data assets and technology-focused initiatives. The Government of Western Australia recently unveiled four key priorities for digital reform in its new Digital Strategy for 2021-2025.Explore
Engage Your Workforce with a Modern Employee Intranet Solution The employee intranet has changed significantly since it was first introduced in the early 1990s. What started as HTML-based static portals have now evolved into intuitive communication tools complete with search engines, user profiles, blogs, event planners, and more. Today, many organizations are taking a second look at employee intranets to bridge gaps between teams, build company culture, centralize information, increase productivity, and improve workflow.Explore
Adopting emerging cloud technologies, consolidating resources, and improving processes is the key. “IT no longer just supports corporate operations as it traditionally has but is fully participating in business value delivery. Not only does this shift IT from a back-office role to the front of business, but it also changes the source of funding from an overhead expense that is maintained, monitored, and sometimes cut, to the thing that drives revenue,” said John-David Lovelock, research vice president at Gartner.Explore
Deliver Powerful Insights Instantaneously with Federated Queries - No Matter Where Your Data Resides The concept of federated queries isn’t new. Facebook PrestoDB popularized the idea of distributed structured query language (SQL) query engines in 2013. Over the years, AWS, Google, Microsoft, and many others in the industry have accelerated the adoption of a distributed query engine model within their products. For example, AWS developed Amazon Athena on top of the Presto code base, while Google’s BigQuery is based on Cloud SQL.Explore
What is Unstructured Data? Almost 80% of the data that enterprises and organizations collect is unstructured - data without a set record format or structure. Unstructured data includes data such as emails, web pages, PDFs, documents, customer feedback, in-app reviews, social media, video files, audio files, and images.Explore