/
Advanced Data Source Queries

Advanced Data Source Queries

When you query Analytics tables (e.g., Datamart, Data Source), you’re using PA Query API. This PA Query API should be used in all scenarios, whenever possible.

Sometimes you encounter special situation where PA SQL Query API can be the only solution to deal with the problem which you cannot solve by PA Query API; for example:

  • You need to do various types of JOINs between tables, possibly with special JOIN conditions.

Remember, simple LEFT JOINS of two Data Sources can be solved using a Datamart, using the common way of building it, i.e., that 1st Data Source will be the primary one (the one which supplies also key columns), and the columns of the 2nd Data Source will be used/picked up to the same Datamart. See documentation about Datamart for more details.

  • Usage of more complex functions/expressions (e.g., LAG()).

  • When something would have to be simulated in the Groovy code (e.g., doing two separate lookups and then join data in the Groovy code), but can be done by PA SQL directly and with better performance.

  • You need special constructs like SELECT FROM SELECT.

Think twice before using PA SQL Query API. Always ensure that you’ve designed your tables properly, to be well suited for solution of the problem before you start doing heavy PA SQLs on them. No special query will help if your tables' design does not fit the solution of the problem.

PA SQL Query

Is a query written using ANSI SQL, which is built on top of one or more "views" (kind of abstraction of the underlying source of data). You can use them to solve tasks, which are not possible to do with PA Query API.

Concept of "Views"

Because Pricefx backend is multi-tenant and the backend must ensure that you cannot read data of another tenant, the PA SQL queries cannot let you to read directly from actual underlying DB tables but instead they let you read data from abstraction of the underlying source of data, which we call here "views" (this does NOT mean a database view).

The "view" can be constructed by creation of a Query via PA Query API.

Figure 1. Sample PA SQL query and it's 2 views

From technical perspective, the "view" is implemented as Common Table Expressions but this could possibly change in the future, without any need for modification of logics.

For the Tx views, you can use exactly the same source as for PA Queries, i.e.:

  • Data Feed

  • Data Source

  • Datamart

  • DMTable (used in PriceOptimizer)

PA SQL Example

In the following example you can see that we’re defining two queries - one to Datamart and one to Data Source - which are then in the SQL used as views and JOINed together.

This example is not possible to do by built-in capabilities of Datamart itself, because the JOIN is using a date-range.

Note: Wherever you see the dots "…​..", we removed some parts for easier readability.

Sample of PA SQL Query with 2 views JOINed together

def ctx = api.getDatamartContext() def dm1 = ctx.getDatamart("Transaction") def q1 = ctx.newQuery(dm1) //❶ .select("Currency") .select("InvoiceDate") ..... .where(Filter.equal("CountryCode", input.CountryCode)) .where(Filter.equal("ProductId", input.ProductId)) def dm2 = ctx.getDataSource("ProductCompetitionData") def q2 = ctx.newQuery(dm2) //❷ .select("ProductId") .select("CountryCode") ..... def sql = ''' //❸ SELECT T1.CountryCode AS "CountryCode", AVG( T1.InvoicePrice / T1.Quantity ) AS "InvoicePricePerUnit" , AVG(T2.Price) AS "CompetitorPricePerUnit", T2.Currency AS "CompetitorPriceCurrency" ..... FROM T1 LEFT JOIN T2 ON T1.ProductId = T2.ProductId AND T1.CountryCode = T2.CountryCode AND T2.ValidFrom <= T1.InvoiceDate AND T1.InvoiceDate < T2.ValidTo GROUP BY T1.CountryCode, T1.ProductId, ..... ORDER BY T1.CountryCode, T1.ProductId, ..... ''' def result = ctx.executeSqlQuery(sql, q1, q2) //❹ return result ? result.toResultMatrix() : null

❶ Query to a Datamart, which will be used in the SQL as T1.
❷ Query to a DataSource, which will be used in the SQL as T2.
❸ The SQL, which references the 2 views as T1 and T2.
❹ This is the function, which puts it all together.The PA Query supplied in 2nd parameter will be then available in the SQL query as T1, the query in 3rd parameter will be T2, etc.

Comparison with PA Query API

The following quick comparison should give you an idea, when you might need to use PA SQL instead of PA Query API. You can see that many standard situations can be solved by Query API.

Feature

PA Query API

PA SQL Query API

Feature

PA Query API

PA SQL Query API

SELECT

+

+

DISTINCT

+

+

FROM

+

+

WHERE

+

+

GROUP BY

+

+

HAVING

+

+

ORDER BY

+

+

JOIN

only via Datamart

+

SELECT FROM SELECT

-

+

UNION

-

+

WHERE __ IN (SELECT…​

-

+

some complex functions

-

+

Studio’s PA Console

PA Console is a tool in Studio, which simplifies design and development of Analytics queries using SQL.

You can use it to quickly prototype initial basic queries and then modify them and test them before copying the final code to the Logic.

Figure 2. Studio's PA Console tool in IntelliJ

References

Studio

Documentation

Groovy API

External

Related information:

Found an issue in documentation? Write to us.