0

I'm using scan query in druid. I'm looking for a way to sort data using some query. How I can do this?

Now I have:

DataSource: Data,
Intervals:  "1000"/"2000",
Limit:      10
Legacy:     true, 

I have column "values" and I want to sort data by this column (no by timestamp) and return each column from table but sorted by "values".

Something like:

SELECT __time, value, company, count
FROM shares
ORDER BY 1 ASC
WHERE value > 200

4 Answers 4

2

Tried a similar query with the wikipedia test data :

SELECT namespace, channel, cityName, sum_added
FROM "wikipedia_demo" r 
WHERE sum_added > 30
ORDER BY sum_added DESC

which results in an error:

Error: Unknown exception

Cannot build plan for query: SELECT namespace, channel, cityName, sum_added FROM "wikipedia_demo" r WHERE sum_added > 30 ORDER BY sum_added DESC

org.apache.druid.java.util.common.ISE

The reason is that ORDER BY is only allowed on GROUP BY columns, aggregate expressions or if no grouping is done, then only on __time. Take a look at the docs here: https://druid.apache.org/docs/latest/querying/sql.html#order-by

If you are not aggregating, you can still use GROUP BY selecting all the SELECT expressions and then ORDER BY any of them, as in:

SELECT namespace, channel, cityName, sum_added
FROM "wikipedia_demo" r 
WHERE sum_added > 30
GROUP BY 1,2,3,4
ORDER BY sum_added DESC

Caution: since it is time-series data, it is a good practice to include a condition on __time to avoid scanning the whole table.

Sign up to request clarification or add additional context in comments.

1 Comment

Your solution not working on my side, It's working in druid console, but when I will create in Golang similar solution it do nothing when I have limit: 500
1

SQL query working good for me, so I found "EXPLAIN PLAN FOR" label to add before SQL in druid console. Druid showed me how JSON for druid native should be look. I solved my problem in this way. Thanks

Comments

0

Please also see the following docs page on the options you have regarding order on scan queries. https://druid.apache.org/docs/latest/querying/scan-query.html#time-ordering

1 Comment

like in title: Order data by another column than timestamp, you link has: #time-ordering
0

Yes. That's a separate question but yes, you can submit SQL through the API instead of Native JSON query.

JSON file named "a_query.json":

{
  "query":
  "SELECT namespace, channel, cityName, sum_added
   FROM \"wikipedia_demo\" r 
   WHERE sum_added > 30
   GROUP BY 1,2,3,4
   ORDER BY sum_added DESC"
}

Example API call:

curl -X 'POST' -H 'Content-Type:application/json' -d @/<path_to_file>/a_query.json http://localhost:8888/druid/v2/sql | jq

Comments

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.